BaBar - DQM Shifter FAQ
Currently maintained by Zafar Yasin
Updated: Apr. 30, 2007
Updated: Mar. 16, 2006
When to Mark Any Particular Run?
Mark the QA check as soon as possible after the run ends. If you forget,
an alarm will latch on the pilot console thirty minutes after the end
of the run.
Do not mark a run as "OK" before it ends with the
intention of changing the flag later if needed. This
can lead to an erroneous QA check if a change is needed
and you forget to make it after the end of the run.
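As a rough illustration of the thirty-minute rule above, the sketch below computes when the alarm would latch for a given run end time. The function names and the idea of checking this in software are illustrative assumptions; the actual alarm is handled by the pilot console.

```python
from datetime import datetime, timedelta

# From the text above: the alarm latches 30 minutes after the run ends.
QA_GRACE_PERIOD = timedelta(minutes=30)


def qa_alarm_time(run_end: datetime) -> datetime:
    """Return the time at which the alarm would latch (hypothetical helper)."""
    return run_end + QA_GRACE_PERIOD


def qa_alarm_latched(run_end: datetime, now: datetime, qa_marked: bool) -> bool:
    """True if the QA check is still unmarked once the grace period has passed."""
    return (not qa_marked) and now >= qa_alarm_time(run_end)
```

For example, a run ending at 12:00 would have to be marked before 12:30 to avoid the alarm.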
Where to Mark QA Check?
In the elogbook, there is a QA check window for each
run, under the title Runs.
How to Mark QA Check?
- Use the information about detector conditions, together with your
JAS3 observations, to decide the flag for each run.
- If both the preceding and the following runs are bad (Unusable/Flawed) in the QA check, then
the run under consideration will be marked as Unusable/Flawed.
- Whenever you mark data as Flawed or Unusable,
put a note next to the relevant subsystem,
or in the Other section if the reason is not specific to
any particular subsystem. Such reasons include, but are not
limited to, high background and wrong beam energy.
- For "Colliding Beams" and "Cosmics", you can judge the data as:
- OK
- Flawed
- Unusable
NOTE: You can select "Detector Studies" if, for some reason, the data for a particular
run need to be taken out of the processing.
- Runs meant for machine studies need to be marked as "Detector Studies".
- If you are unsure about a particular QA check,
for example because of a problem in a subsystem that
may or may not cause the run to stop, you can consult
the relevant subsystem expert, if they are in IR2 or
reachable by phone.
- Immediately send an email to the Run Coordinators
and the DQM Manager about any QA check you
made and are not certain about. Please indicate the run number and give a brief description.
- If, for some reason, you cannot go over all the
histograms even once, declare the data as Flawed for the
affected subsystems. As per the DQM shift instructions at
DQM shifter Overview,
you are advised to look at the histograms once at the beginning of the run.
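The marking rules above can be sketched as a few small checks. This is only an illustration under stated assumptions: the function names are hypothetical, the subsystem names in the usage example are placeholders, and the elogbook itself is operated by hand, not through code like this.

```python
VALID_FLAGS = ("OK", "Flawed", "Unusable")
BAD_FLAGS = {"Flawed", "Unusable"}


def neighbours_bad(prev_flag: str, next_flag: str) -> bool:
    """True if both the preceding and the following run are Flawed/Unusable."""
    return prev_flag in BAD_FLAGS and next_flag in BAD_FLAGS


def subsystem_flags(observed: dict, histograms_checked: dict) -> dict:
    """Downgrade "OK" to "Flawed" for subsystems whose histograms were never viewed.

    observed: subsystem name -> flag chosen by the shifter
    histograms_checked: subsystem name -> True if viewed at least once
    """
    flags = {}
    for subsystem, flag in observed.items():
        if flag not in VALID_FLAGS:
            raise ValueError(f"unknown flag {flag!r} for {subsystem}")
        if flag == "OK" and not histograms_checked.get(subsystem, False):
            flag = "Flawed"
        flags[subsystem] = flag
    return flags


def subsystems_needing_note(flags: dict) -> list:
    """Subsystems marked Flawed or Unusable, each of which needs a note."""
    return sorted(name for name, flag in flags.items() if flag in BAD_FLAGS)
```

A possible use, with placeholder subsystem names:

```python
flags = subsystem_flags(
    {"SVT": "OK", "DCH": "OK", "EMC": "Unusable"},
    {"SVT": True, "DCH": False, "EMC": True},
)
print(flags)
print(subsystems_needing_note(flags))
```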
How do I reset/restart the reference server?
On the right-hand side of the Fast Monitoring Control GUI
there are two buttons: one for LiveFastMon, associated with subsystem fast monitoring, and the other for
IR2LiveL3, which is associated with the histograms for trigger (L1), Oep/Dataflow, and L3. These should be
in the _RUNNING_ state all the time. These show the OepManager-LiveFastMon GUI and OepManager-PepFastMon GUI.
In between runs, the reference servers should now go to READY, because of the switching between
cosmics and physics references implemented a few months ago.
The reference server should
be restarted only if:
- A detector subsystem expert informs you that they have changed their reference histograms.
- Reference plots appear as absent in the JAS histograms.
Just click on the button of the reference server that needs to be restarted.
You will get a _Restart Reference_ option in the drop-down menu; clicking on it will restart the server.
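The two restart conditions above amount to a simple either/or check. The sketch below only restates that rule; the function and parameter names are hypothetical, and the restart itself is always done by hand through the _Restart Reference_ menu option.

```python
def should_restart_reference_server(expert_changed_references: bool,
                                    reference_plots_missing: bool) -> bool:
    """Return True only when one of the two documented restart conditions holds."""
    return expert_changed_references or reference_plots_missing


# Neither condition holds, so the reference server is left alone.
print(should_restart_reference_server(False, False))
```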
How to Start LiveFastMon If Needed?
- According to the current settings, LiveFastMonitoring should restart automatically after each run. All the nodes in their respective streams turn blue in between two consecutive runs. If you see more than one "node" blue during the run, page the OEP
Manager immediately. Keep on taking data.
- If a problem or error occurs in LiveFastMonitoring, never try to restart it during the run without asking
the OEP Manager, unless it has stopped responding by itself. As backup options, you can ask the Run Coordinators, if in IR2, or
the DQM Manager.
- Fast monitoring will stop automatically at the end of the run. You don't need to do anything manually for this. If for
some reason you need to stop early, you can select the "STOP" action of LiveFastMon.
- The "Fast Monitoring Control" GUI has three streams on it:
- LiveFastMon
- PepFastMon
- IR2LiveL3-ref (this is related to Level3 in the pilot section, and the DQM shifter does not need to worry about it.)
- The Fast Monitoring Control GUI is mainly intended for use by experts, and you
should not have to touch it during normal operations. However, we provide a
brief description of its operation to assist you in case you are requested to do
some specific operation. The leftmost column of the display contains menu buttons, one
for each fast monitoring stream (such as LiveFastMon). The menu items on these buttons are
as follows:
- Start: This needs to be selected once, after the fast monitoring system is
restarted from scratch. It should also be selected after recovery from errors
and after manually stopping the stream (this is rare).
- Stop: Stops the fast monitoring stream, and collects histograms. The only
reason to do this yourself is if you are going to restart the fast monitoring
stream (or the entire subsystem) in the middle of a run, and you want to make
sure that the histograms are collected first. Do not use this unless advised
by an expert.
- Clear: Certain errors within the fast monitoring subsystem (notably
OepManager errors) need to be cleared. This will take you back to the READY
state so that you can issue "Start" to begin running again.
- Retry: If the histogram collector gets an error, this will retry the
collection operation. Unless something else has been done to correct the
underlying problem, chances are the retry will simply fail again, so you
should not use this until advised by an expert.
- Abandon Collect: If the histogram collector gets an error, and the error is
unrecoverable, this operation will skip the collection (thereby permanently
losing the histograms) and return the system to the READY state so that it
can be "Started" for a new run. Do not do this unless advised by an expert.
- Extra Collect: It is possible to collect histograms at any time for the
statistics collected up to that time. The full set of histograms will still
be collected at the end of the run. This feature is rarely used.
- Expert: This exposes some additional, more dangerous options for shutting down
and restarting the stream. These should be used only under the advice of an OEP Manager.
- The OepManager-PepFastMon GUI will not show anything on the JAS. It is being observed by
Chris O' Gradey's group (ODF), and you do not need to worry about it; just make sure that it is there when the FastMonitoring GUI is up.
- There is a common "design" bug for short runs. The Fast Monitoring Control window will show
"OEPMANAGER ERROR", and the OepManager window will show either "Early Exit" or "State Convergence Timeout". In this situation,
select "Clear" from the LiveFastMon button, the PepFastMon button, or both, which should allow the system to recover. There is no need
to page OEP in this case, unless the DQM shifter is not sure or the failure mode is anything else.
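The menu actions and the short-run bug described above can be pictured as a small state table. This is a sketch under assumptions, not the actual Fast Monitoring Control implementation: the state labels OEPMANAGER_ERROR and COLLECT_ERROR are hypothetical names for the error conditions, "Stop" is assumed to end in READY after a successful collection, and only transitions stated in the text are modelled.

```python
# (state, menu action) -> resulting state
TRANSITIONS = {
    ("READY", "Start"): "RUNNING",
    ("RUNNING", "Stop"): "READY",                   # assumes collection succeeds
    ("OEPMANAGER_ERROR", "Clear"): "READY",         # then "Start" can be issued
    ("COLLECT_ERROR", "Abandon Collect"): "READY",  # histograms are lost
}

# OepManager messages that the text associates with the short-run bug.
SHORT_RUN_MESSAGES = {"Early Exit", "State Convergence Timeout"}


def apply_action(state: str, action: str) -> str:
    """Return the next state, or raise if the pair is not modelled here."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(
            f"action {action!r} is not modelled for state {state!r}"
        ) from None


def short_run_response(control_message: str, oep_message: str) -> str:
    """Suggest "Clear" for the known short-run bug, otherwise paging OEP."""
    if control_message == "OEPMANAGER ERROR" and oep_message in SHORT_RUN_MESSAGES:
        return "Clear"
    return "Page OEP Manager"
```

For instance, after the short-run bug the sequence would be "Clear" followed by "Start":

```python
state = apply_action("OEPMANAGER_ERROR", "Clear")
state = apply_action(state, "Start")
print(state)
```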
Relevant Contacts:
For any general questions, problems or comments regarding this page, send email to
Zafar Yasin.
For any specific inquiries about LiveFastMonitoring, email Jim Hamilton.
For any specific inquiries about JAS plots, email Victor Serbo.