|
DIRC OPERATIONS MANUAL
Last significant update: Nov 2007 (Jose)
|
|
The aim of this page is to provide a
fairly detailed operations guide and to support first-aid
troubleshooting. As some of the issues addressed evolve quickly, all
experts and knowledgeable users are encouraged to update the manual
accordingly.
|
PDF versions of the main sections of this manual are available here (last update: 2007/03/27).
The manual covers the following DIRC operations issues:
Additional day-by-day information for the DIRC commissioner team can be
found at
|
Day-to-day operations duties (not complete)
|
Before reading the manual, make sure that you have fulfilled all the safety requirements for a DIRC system worker. The spokesperson Safety and the DIRC Safety and Control webpages should provide all the information you need.
When you have to use a section of the manual to work on the DIRC, do not
rely on any printed copy you may have or find: it may be outdated!
The instructions available on the web should always be followed.
You are the on-call expert
Be responsive to pages, always carry the pager and carry the
cell phone if you are not near another phone. If you have to
rely on the cell phone, make sure that the reception is
adequate in your area.
Some locations in the Bay Area (e.g. parts of Redwood City and San
Carlos) are effectively blacked out.
-
When you start being on call or backup expert, perform the following tasks
- Make yourself familiar with the DIRC system and the current run conditions:
Update all contact information
-
Attend the operations meeting in IR-2
- The daily BaBar operations meeting takes place
at 3:45 pm in the IR-2 Cornelius Conference room Monday through Friday.
A DIRC commissioner has to represent the DIRC there
every day.
If the on-call commissioner is unable to attend it, he/she should
make sure that another DIRC representative will be present to
report the DIRC current status.
To prepare for this meeting, the DIRC commissioner has to
make sure that all DIRC systems work nominally or that
any problems are being addressed by the proper experts.
The relevant DIRC systems are:
- Hardware
- Photomultipliers (PMTs), Front-End Electronics (FEE), High
Voltage (HV), Electronics Cooling Water Chiller, Water
Plant (including SOB Water Dump system),
Nitrogen Gas Circulation, LED Calibration System,
Data Acquisition System (ROMs).
Taking a daily tour of the water plant, gas rack and
electronics house to convince yourself that everything
is working well is mandatory.
- Detector Control
- • Check that the DIRC epics panels are all green and that the
main parameters (crate temperatures, water and N2 flows,
HV channel status etc.) are nominal.
Make sure that the front-end crate fans have been working
steadily in the last 24 hours. The logfile of the corresponding
cronjob is ~babardrc/CronJobs/plotFans.log.
Look for errors in it if you have any doubt about the
freshness of the plots displayed on the webpage. The cronjob can
be rerun by hand using the script ~babardrc/CronJobs/fansMonitoringCronJob.
• The chiller power SIAM and water level SIAM can be monitored here.
• The Wiener Crates can be monitored here.
• The dead PMTs can be monitored here.
• It is useful to have epics, the alarm handler and some scaler
stripcharts always running on your office (or home) computer.
- DAQ and Data Quality
- • The fastmonitoring plots
of all runs must be checked daily (weekends
included...) to make sure that
the DIRC status remains fine. Instead of checking all runs
taken during the last 24h in a row, it is better to look at a
handful of runs a few times per day. This should also allow
you to catch new problems more quickly. The corresponding
documentation is available here.
• The on-call person should check the odd-numbered runs and the backup person the even-numbered runs.
• Check the background levels in the last
24 hours as well. Minimal documentation is provided with the
plots. A README
describes the
DrcBackgroundMonitor package which is used to produce
the plots automatically. The logfiles of the corresponding
cronjobs are ~babardrc/CronJobs/bkgMonitoring_daily.log
and ~babardrc/CronJobs/bkgMonitoring_biWeekly.log
respectively. Look for errors in them if you have any doubt
about the freshness of the plots displayed on the webpage. The cronjobs
can be rerun by hand using the scripts ~babardrc/CronJobs/dailyBkgMonitorCronJob and ~babardrc/CronJobs/biWeeklyBkgMonitorCronJob.
• OPR QA postscript plots can be located using the
PR QA Output page.
The current OPR status can be checked in the
PR Run Progression webpage.
Checking the run quality after reconstruction
is not among the ops. crew tasks:
there is a monitoring expert checking the DIRC QA.
- Calibration
-
Global calibrations are taken once every 1-2 days by pilot shifters
when beams are lost and PEP does not reinject immediately. Each time
new DIRC calibration data are taken, an e-mail is sent to babardrc.
When you see this message, you should check the
quality of the calibration asap.
Bad calibrations can always be
overwritten in the database, but the first processing of mis-calibrated
runs will be bad. Fixing this requires reprocessing these data, which
will delay the use of these runs for physics analysis. In addition,
bad calibrations can also point to hardware problems with the calibration
systems which need to be addressed promptly.
If you notice during your checks that an automatic
script does not seem to be working, you should try to fix the problem asap.
If that is not possible, make sure that the proper expert is aware of the
situation. Having no monitoring data is as
worrisome as having bad data!
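As an illustration of the freshness check suggested above for the cron-job logfiles, here is a minimal Python sketch. It assumes that a logfile not modified for more than 24 hours is stale and that problems show up as lines containing the word "error"; both the threshold and the search string are assumptions, and log_report is a hypothetical helper, not an existing DIRC tool.

```python
import os
import time


def log_report(path, max_age_hours=24.0, now=None):
    """Return (is_fresh, error_lines) for one cron-job logfile.

    is_fresh    -- True if the file was modified within max_age_hours
                   (the 24 h default is an assumption).
    error_lines -- lines containing "error", case-insensitive
                   (the search string is an assumption as well).
    """
    now = time.time() if now is None else now
    age_hours = (now - os.path.getmtime(path)) / 3600.0
    with open(path, errors="replace") as handle:
        error_lines = [line.rstrip("\n") for line in handle
                       if "error" in line.lower()]
    return age_hours <= max_age_hours, error_lines


if __name__ == "__main__":
    # Logfile names as quoted in this manual.
    names = ["plotFans.log", "bkgMonitoring_daily.log"]
    for name in names:
        path = os.path.expanduser("~babardrc/CronJobs/" + name)
        if not os.path.exists(path):
            print(name, ": logfile not found")
            continue
        fresh, errors = log_report(path)
        print(name, ":", "fresh" if fresh else "STALE",
              "-", len(errors), "error line(s)")
```

A stale logfile or a non-empty list of error lines is only a hint that the corresponding cronjob should be rerun by hand and its output inspected.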
-
Maintain the Commissioner Logbook
- To help in keeping track of problems that may influence data
quality, the DIRC commissioner has to update the
DIRC
commissioner logbook every day.
A simple 'Nothing to report' (NTR) is much better
than no entry at all!
-
E-logbook
- Read the e-log at least once
per shift
and respond to any comment regarding the DIRC. Report any
work on the DIRC in IR-2 in the logbook: remember that
you get a proper work authorization by checking in with
the pilot before starting to work and
by making an entry in the e-logbook.
Start any entry in the e-logbook with [DIRC myName]
to make it easy to identify who wrote what.
If you're talking by phone with the pilot (e.g. after a page),
make sure that he/she understands what you're saying and that
a summary of the discussion appears in the logbook. If this is
not the case, call again or make the entry yourself.
You can also check the new
MCC e-log although it is likely to be too technical.
-
Maintain Shift Instructions
- In order to propagate the information on DIRC problems to the
BABAR shifters, the DIRC commissioner has to maintain several
short-term and long-term instruction pages:
Remember that shifters are taught to read long-term instructions once per block of shifts and short-term instructions once a day. So any new guidance should first appear in the short-term instructions (and in the long-term ones if needed).
If new instructions apply to both shifters, include them in both manuals. Experience shows that pilots don't read the DQM instructions (and vice versa) and that the quality of the communication between the shifters is not always perfect.
-
Check reference histos
- Periodically browse the JAS plots to make sure that the reference
histograms they contain are still up-to-date. Discuss any (valid)
DQM comment with your FastMonitoring expert.
-
Maintain Contact Info
- Maintain the
DIRC Operations web page.
Also, feel free to make changes to the ops. manual,
there is plenty of room for improvement!
Maintain the DIRC system contact info in the
-
Keep the other commissioners and the DIRC group informed
- In addition to updating the daily log, send emails about
questions, news or work done to
babardrc and post important problem reports or activities in the
DIRC Operations Hypernews forum.
Other Duties
-
Once
-
- Subscribe to the DIRC Operations and BaBar Detector Operations Hypernews forums.
- Add your e-mail address to the .forward file of the babardrc
account (~babardrc/.forward).
This e-mail address is used to exchange and discuss information related to the DIRC operations.
- Subscribe to the PEPIIBBR bfmail distribution (pepiibbr) to be
notified when BaBar/PEP meetings are scheduled. The subscription
procedure is explained here.
- Print a copy of the DIRC NIM paper which is THE reference for the
subdetector.
- DIRC commissioners are encouraged to take
responsibility for a detector analysis project in coordination
with the DIRC software experts and the ops. manager(s). Both
activities would benefit from each other!
-
Weekly
-
There is a weekly PEP/BaBar meeting, usually every Friday at 1:30 pm.
That's the best place to learn things about PEP status and plans. Each
subsystem should have at least one representative in the meeting,
normally the oncall expert. If you're oncall and can't go, make sure
someone will replace you.
BaBar/PEP meetings are announced via the bfmail distribution
PEPIIBBR to which you should subscribe.
Every Tuesday night, the on-call commissioner must prepare a short
weekly report
summarizing what happened in the last week. Just copy and rename the
report from the previous week and fill in the new information required.
Avoid being too verbose
(details should be put in the DIRC logbook, which is our main tool to
monitor daily operations) and don't spend much time hunting for lumi
numbers: rough precision is enough! On the other hand, post the summary
as early as possible (on Saturday at the latest). Then, send a Hypernews
posting to DIRC-Operations.
-
Bi-weekly to monthly
-
The commissioner should report on the activities since the
last DIRC meeting at the next DIRC meeting, which is held on demand, typically
once or twice per month, on Wednesday at 9 am in the Sierra room
(central lab. 1st floor).
If useful, prepare a small talk summarizing the main points to
be discussed -- otherwise just go through the last 2 weekly reports.
Post any presentation to the DIRC ops. hypernews before the meeting:
this will ensure that the provided information is easily available
and can be viewed by people outside SLAC.
The DIRC meeting allows for a phone conference. The current
phone number is (510) 665-5437 and the meeting ID 5418. If you're
not on site and want to connect to it, crosscheck these details
with the meeting announcement e-mail posted on the hypernews.
-
Monthly
-
One member of the DIRC ops. crew (commissioner or operation manager)
has to give the DIRC
monthly report. Such reports are given on Wednesday after
the IR2 3:45 pm meeting. The monthly report planning is usually written
on the white board in the Cornelius room. In any case, a subsystem knows
at least one week in advance that it has to give a monthly report.
-
Collaboration Meeting
-
At each BABAR collaboration meeting, there is a DIRC talk in
one of the detector plenary sessions. One member of the DIRC ops. crew
(commissioner or operation manager, alternately 'European' and
'American') gives a report about the activities since the last meeting
and presents the status and the future plans for the DIRC.
Finally, check the Useful links section of the manual
to make sure you know where important webpages are located and how to reach
them quickly. Bookmark them in your favorite web browser and feel free to add
new ones to the list.
There are several ways to page a system without going through the web, and therefore without creating an entry in the logbook:
- If there is a SLAC userID (for instance babardrc) associated with the pager number, you
can open a terminal and type:
telalert -i babardrc -m "Alejandro, what are you doing ?"
If the shell cannot find the telalert command, look for its full path:
whereis telalert
- If you only know the pager number, you can open a terminal and type:
echo "message" | mail 650XXXYYYY@myairmail.com
Again, if the shell cannot find the mail command, look for its full path:
whereis mail
- You can do the same as in the previous step with your favorite
mail client, for example pine:
in the To field: 650XXXYYYY@myairmail.com
in the Subject field:
in the message field: write the message
- By phone, just dial the phone number and type the extension number after the tone, followed by #.
|
Orientation in the IR2 hall
|
Schematic view of the DIRC mechanical elements
To follow IR2 meetings, it helps to have an idea of the BaBar hall orientation... The picture above aims at avoiding the various headaches occurring when one needs to decide which door must be opened for the DIRC!
- The DIRC is on the backward side of the detector.
- The positrons point to the North and the electrons to the South.
So, the backward east door is on the left when one looks at the DIRC. It protects the sectors #6, #7, #8, #9, #10 and #11. Conversely, the sectors #0, #1, #2, #3, #4 and #5 are behind the backward west door, which is on the right when one looks at the DIRC.
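The sector-to-door assignment above can be written down as a tiny lookup, which may be handy when a sector number comes out of a monitoring plot. This is only a sketch in Python; backward_door is a hypothetical helper name.

```python
def backward_door(sector):
    """Return which backward door ("west" or "east") covers a DIRC sector.

    Sectors 0-5 are behind the backward west door and sectors 6-11
    behind the backward east door, as stated in this section.
    """
    if not 0 <= sector <= 11:
        raise ValueError("DIRC sectors are numbered 0...11")
    return "west" if sector <= 5 else "east"


if __name__ == "__main__":
    for sector in range(12):
        print("sector", sector, "->", backward_door(sector), "door")
```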
|
How to react in case of a problem
|
This section summarizes some advice you may follow when facing a problem with the DIRC.
- Think before acting.
- Important: Whatever the problem is, make sure that it does not prevent PEP from injecting. This is likely to be the case if we have a hardware problem as all subdetectors need to be potentially runnable (i.e. global status OK and HV ramped down) to make BaBar injectable. To bypass the DIRC injectable flag status, one has to use the Experts State Machine.
Also try to let data taking continue -- after all, the problem may not be that serious! If the DIRC is not runnable, one should bypass its status in ORC.
Troubleshooting: Information about the DIRC injectable and runnable flags is available here. [Link should be working soon: remind me if you find it broken! (Nicolas 2006/06/14)]
- Look at the Alarm Handler and ask the pilot if he/she is seeing other alarms -- even alarms which appeared after the DIRC problem was identified can be interesting!
Always keep in mind that the DIRC monitoring is powerful enough to detect problems not induced by the DIRC!
- Check the status of the DIRC main systems (Front-End, HV, finisar, gas and water etc.) in epics.
Troubleshooting:
If you're getting annoyed by an epics variable you can't find in any of
the DIRC panels, it is likely to be one of Jerry's
background monitoring sensors which are kindly connected to our IOCs
although they have nothing to do with the DIRC. A list of Jerry's variables
(as of 2006/05/10) can be found here (the epics names
are in the third column from the left).
- If the data quality is likely to be compromised, ask the pilot to
- end the run
- reconfigure ORC by going to the standby mode
- restart a new run.
- Then look at the new fastmonitoring data to see if the problem is still visible. If it has disappeared, you can try to understand what happened under less pressure.
- If the problem is still there and if its origin remains unknown, try to correlate it with plots from other subsystems (trigger, dataflow etc.).
- Ask the pilot to page other experts if needed.
- Do not hesitate to call other DIRC experts if needed (the Care and Feeding manual should contain an up-to-date list displaying all their details).
Once the problem has been fixed:
- Double-check that everything is working well!
- All epics panels should be green.
- HV channels should be set to nominal -- check after HV-related work.
- Undo (unbypass, uncomment etc.) any special setup you may have defined
during the fix.
- Test the DIRC as much as possible: calibrations (from ORC or standalone) and data taking (standalone or cosmics).
- When data taking restarts after any significant down period, make sure some expert is looking at the first data.
|
How to recover from a power outage
|
Power outages occur at random in IR2, for instance after VESDA alarms or problems with the cooling
water. Depending on which area(s) of the BaBar hall get powered down, some
parts of the DIRC will go off. As soon as you're notified of a power outage
(usually all systems get paged quickly by the shifters), go to IR2 and start
investigating the DIRC status. Your goal is to bring it back up
safely and as quickly as
possible.
Use the following instructions as a generic guidance and adapt them if needed.
- Open epics and check the status of the different DIRC systems.
Note the main variables which are in alarm
(yellow or
red displays) or
disconnected
(white displays).
Alarms usually mean that systems are OFF but powered; disconnections are rather
the sign that some DIRC hardware is unpowered, possibly the IOCs.
- Check the alarm annunciator ('beta') panel (top left panel on the pilot
wall of consoles) to see the status of the DIRC chiller and of the SOB.
- Make a quick tour of the DIRC hardware: water plant, electronics house
(HV crates, drc-hv
and drc-mon IOCs,
ROMs), chiller and the N2 rack (on top of the E.H.).
If the HV crates are unpowered, turn them to the OFF position using the key.
In this way, they won't suffer more when the power gets restored. If the
chiller is OFF, see if you can turn it ON -- otherwise be prepared to switch
to the BCS.
-
If there is anything weird with the gas rack (red lights on the humidity
panel, N2 stopped,
drc-gas IOC unpowered
etc.) or with the DIRC water system,
immediately page Matt and Jerry and don't touch anything
before they arrive.
If you find drc-mon
unpowered, it is always safer to ask Matt to come in IR2 to have the dump
valve disabled while you restore the IOC. Hence, it is worth paging him
as recovering from a power outage usually takes hours:
take your time, don't work in a hurry!
- When the power is back in the E.H., turn the HV crates and the IOCs ON.
You'll likely have to reboot the IOCs once or twice
before seeing them communicate again with the hardware. If you have
trouble in this phase, some pieces of hardware may be
broken. Good luck...
The IOC logfiles should help you diagnose the problem -- look for any
error showing up during the reboot phase.
-
As soon as the DIRC and the DAQ are back, take a (global) calibration and a
test run (either in standalone or some cosmics) to make sure everything is
indeed OK.
-
If drc-mon was
powered OFF, remember to inhibit the BCS status SIAM channel (2nd
channel of the right-most SIAM board in the
drc-mon crate).
The DIRC Coordinate Systems
(more information concerning the DIRC Numerology):
The DIRC detector space (dTag) is defined by module, section, chip and
channel, like for all BaBar subsystems. The specific DIRC dimensions
are (counted as: 0...n-1):
- 6 Read-Out-Modules (ROMs) which perform the FEE data acquisition;
- 28 Sections (DIRC Front-End Boards, DFBs in short) per module;
- 4 TDC chips per section;
- 16 PMT channels per TDC chip, resulting in a total of
10752 = 6*28*4*16 channels (PMTs) for the whole DIRC. Upstream from
each TDC chip, there are 2 analog chips which get data from 8
PMTs each.
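The dTag dimensions listed above can be checked with a few lines of Python. The sketch below only multiplies out the numbers quoted in the list and tests whether a given (module, section, chip, channel) tuple is inside the allowed ranges; the function names are illustrative and not part of any DIRC software.

```python
# dTag dimensions as quoted above (all coordinates counted 0...n-1).
N_MODULES = 6     # Read-Out-Modules (ROMs)
N_SECTIONS = 28   # DFBs per module
N_CHIPS = 4       # TDC chips per section
N_CHANNELS = 16   # PMT channels per TDC chip


def total_channels():
    """Total number of DIRC channels (PMTs)."""
    return N_MODULES * N_SECTIONS * N_CHIPS * N_CHANNELS


def is_valid_dtag(module, section, chip, channel):
    """True if the four coordinates lie inside the dTag ranges."""
    return (0 <= module < N_MODULES
            and 0 <= section < N_SECTIONS
            and 0 <= chip < N_CHIPS
            and 0 <= channel < N_CHANNELS)


if __name__ == "__main__":
    print("total channels:", total_channels())
    print("last dTag valid:", is_valid_dtag(5, 27, 3, 15))
```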
The ROMs communicate with the FEE through p-address links. The module,
chip and channel coordinates are identical to the dTags and the only
difference concerns the sections: p-sections follow the actual slots in
the FEE ("Wiener") crates while the
dTag sections count in the inverse direction. There are 16 p-sections
(slots) in an FEE crate but only 14 of them are actually used for the
DIRC.
The lowest non-empty slot (slot 05) is occupied by the DIRC Controller
Card (DCC) which
- steers the input/output of the Fast Control
(FC) commands via the Control Link
(CLINK);
- sends the measured TDC and ADC data via the Data Link (DLINK);
- provides and feeds slow control CANBUS channels.
The 2 following slots (right after the DCC) are empty. The next
14 slots (slots 8 → 21) of the VME crate are filled with
DFBs. The last slot in the crate has the p-section bit (link) 15, while
it has the d-section (DFB, also called "harness") 0. The first slot occupied
by a DFB (after the DCC and the 2 empty slots) has the p-section bit 2
and the d-section address 13. This is an important difference. For
instance, the individual register configuration of the 8 analog chips
per DFB is addressed in p-space: the links go from 2^2=0x0004 to
2^15=0x8000. On the other hand, the electronics coordinates
(dTags) used in the Online Event
Processing (OEP) go from section 0 to 13.
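To make the p-space addressing more concrete, here is a small Python sketch that turns a crate slot number into its p-section bit and link mask. The offset of 6 between slot and p-section is inferred from the numbers above (slot 8 has bit 2, slot 21 has bit 15); the function names are hypothetical.

```python
FIRST_DFB_SLOT = 8
LAST_DFB_SLOT = 21
# Inferred from the text: slot 8 -> p-section 2, slot 21 -> p-section 15.
SLOT_TO_P_SECTION_OFFSET = 6


def p_section_from_slot(slot):
    """p-section bit (2...15) of the DFB sitting in a given crate slot."""
    if not FIRST_DFB_SLOT <= slot <= LAST_DFB_SLOT:
        raise ValueError("DFBs occupy slots 8...21 only")
    return slot - SLOT_TO_P_SECTION_OFFSET


def link_mask(p_section):
    """Link mask 2^p_section, e.g. 2 -> 0x0004 and 15 -> 0x8000."""
    return 1 << p_section


def all_dfb_links_mask():
    """OR of the link masks of the 14 DFB slots of one crate."""
    mask = 0
    for slot in range(FIRST_DFB_SLOT, LAST_DFB_SLOT + 1):
        mask |= link_mask(p_section_from_slot(slot))
    return mask


if __name__ == "__main__":
    for slot in (FIRST_DFB_SLOT, LAST_DFB_SLOT):
        p = p_section_from_slot(slot)
        print("slot %2d -> p-section %2d -> mask 0x%04X" % (slot, p, link_mask(p)))
    print("all DFB links: 0x%04X" % all_dfb_links_mask())
```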
For convenience, the chip and channel coordinates are sometimes
merged into one "DFB channel"
coordinate, going from 0 to 63.
The DIRC geometry space consists of (counted as: 0...n-1):
- 12 SOB sectors;
- 41 rows of PMTs;
- 30 columns (the number of columns in a row varies: the lower the
row number, the fewer columns the row has).
The sectors correspond to a combination of modules and sections in d-space.
The conversion formula reads:
Sector = 2 * Module + ((Section < 14) ? 0 : 1)
For instance, module 0 hosts the sectors 0 (sections 0 → 13) and 1
(sections 14 → 27).
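The conversion formula translates directly into Python. The sketch below implements it, together with the inverse lookup (which module and which range of sections belong to a sector) that follows from the same formula; both function names are illustrative.

```python
def sector_from_dtag(module, section):
    """Sector = 2 * Module + ((Section < 14) ? 0 : 1)."""
    if not 0 <= module <= 5 or not 0 <= section <= 27:
        raise ValueError("module must be 0...5 and section 0...27")
    return 2 * module + (0 if section < 14 else 1)


def dtag_range_for_sector(sector):
    """Return (module, range of sections) read out for a given sector."""
    if not 0 <= sector <= 11:
        raise ValueError("sector must be 0...11")
    module, upper_half = divmod(sector, 2)
    first = 14 * upper_half
    return module, range(first, first + 14)


if __name__ == "__main__":
    for sector in range(12):
        module, sections = dtag_range_for_sector(sector)
        print("sector %2d: module %d, sections %2d...%2d"
              % (sector, module, sections[0], sections[-1]))
```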
The conversion from "harnesses" (a harness is equivalent to a DFB and,
for section < 14, to a section) and DFB channels within a given sector
to columns and rows is non-trivial. A (supposedly) self-explanatory
FORTRAN
program, which is the basis of the DIRC geometry model used in the
reconstruction code, is available for this purpose. It may be called
directly from its location using the command (working on Sun machines only):
/afs/slac/u/ec/hoecker/tools/position.run
The conversion is particularly important when one wants to identify bad
tubes in a sector. It is more convenient to count columns and rows, but
the cables are labeled using harnesses and channels (the actual harness
numbering on the detector counts clockwise from 0 to 167 = 12*14-1, seen
from outside the SOB).
The DIRC HV channels are organized in groups of 16 PMTs, equally
distributed between 2 harnesses of the same sector. Thus, there are 56 HV
groups per sector and 672 for the whole DIRC. The individual HV settings
have been chosen to maximize the efficiency and purity of the PMT signals.
It is therefore crucial to make sure that the group settings visible
through EPICS panels correspond to the nominal values.
The sector plus HV group to HV setting conversion is also a
feature of the above "position.run" executable.
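The HV group counts quoted above follow directly from the channel numbers. The short Python sketch below just redoes the arithmetic; the constant and function names are only illustrative.

```python
N_PMTS = 10752          # total number of DIRC channels
N_SECTORS = 12          # SOB sectors
PMTS_PER_HV_GROUP = 16  # PMTs powered by one HV channel


def hv_group_counts():
    """Return (PMTs per sector, HV groups per sector, HV groups in total)."""
    pmts_per_sector, rest = divmod(N_PMTS, N_SECTORS)
    assert rest == 0, "PMTs must divide evenly among the sectors"
    groups_per_sector, rest = divmod(pmts_per_sector, PMTS_PER_HV_GROUP)
    assert rest == 0, "a sector must hold a whole number of HV groups"
    return pmts_per_sector, groups_per_sector, groups_per_sector * N_SECTORS


if __name__ == "__main__":
    per_sector, groups, total = hv_group_counts()
    print(per_sector, "PMTs per sector")
    print(groups, "HV groups per sector")
    print(total, "HV groups in the whole DIRC")
```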
Troubleshooting:
The DIRC coordinates are closely related to the actual cabling of the
crates, boards (harnesses), HV groups and signal channels. Miscabling
of a HV group will show up as a hot group of 16 channels and a
corresponding cold group in the same sector. Miscabled optical fibers
will interchange whole sectors. This shows up as interrupted Cherenkov
rings in the Event Display.
Interchanged harnesses or signal cables, however, require a
high-statistics single-tube Cherenkov angle analysis. Cable errors are
corrected via software during the dTag-to-gTag conversion.
The module IDs (0...5) are set by dip switches on the controller card
of the ROM. Wrong IDs show up as invalid or doubled module numbers printed
on the xyplex windows after a reboot.
PMTs:
The quality of the PMTs is primarily controlled by the periodical
calibrations and the fast monitoring data.
Troubleshooting:
The main reason for PMTs to become noisy is
a loss of airtightness around the base joints.
PMTs with a bad vacuum scatter light to adjacent tubes, generating
the so-called "Christmas tree" pattern. The funniest example of such a
spitting PMT is shown here;
FYI, the noisy PMT sits in column 1, row 1 of sector 4 and is in fact dead
for signal photons.
A Xmas tree usually trips the HV channel which powers it. So, 16
PMTs are (temporarily) lost when a Xmas tree shows up, as the channel
cannot be kept at its nominal voltage. Nothing can be done until we get
access to the backward side of BaBar.
As such an access is likely to be decided at the last minute because PEP
and/or BaBar need some urgent repair, do not forget to
put your access request on the white board in the control room
to make sure that the run coordinators are
aware of it and that the pilot will page the DIRC when the opportunistic
access is being set up.
During the access, the faulty PMT must be disconnected from the HV.
It is mandatory to identify the unplugged HV cable (sector, row, column)
and to explain why/when/by whom it was disconnected. Make sure that you
unplugged the correct PMT before the door gets closed again:
- check that the HV channel can reach its nominal voltage and that the
current is stable and not too high;
- take a quick standalone or cosmics run to check the hit occupancies
in the whole DIRC.
From time to time, PMTs can also become noisy.
Usually, the noise comes and
goes on a timescale of a few hours without any clear reason. Noisy channels can be
ignored during cosmics data taking as the PMT rate is much lower than during
colliding data taking. If a PMT gets too noisy, it can be masked (i.e.
removed from the Data Acquisition) by the DIRC online expert. In practice,
a PMT is too noisy if the number of hits it generates prevents the DQMs and
the DIRC crew from checking accurately the occupancies in the DIRC.
Tip:
There is no guaranteed recipe to locate a PMT responsible for a Xmas
tree: that just makes the game nicer! Here are a few tricks...
- Knowing which HV channel trips should leave you only 16 candidates.
- A Xmas tree is usually a quiet/dead PMT surrounded by PMTs with
very high rates.
- Compare Fast Monitoring plots produced during the last days/weeks
to see whether the efficiency of one PMT in the group changed
dramatically on a short timescale.
- If you have the opportunity, take a standalone run with the faulty
HV channel ON but at a voltage much lower than the nominal one. The voltage
should be low enough to prevent the current from tripping the channel but
high enough to allow the Xmas tree to light up.
Locating a noisy channel is simpler. The Fast Monitoring plots should tell
you the sector, the HV channel and possibly the harness which contain the
noisy PMT. Crosschecking this information should leave you with a few
candidates. A final look at the first page of the Fast Monitoring plots
taken during a run in which the PMT was noisy should allow you to remove
the ambiguity: the noisy PMT appears as a black dot on the hit map.
Scalers:
12 PMTs (1 per sector) are used to monitor the background in the DIRC.
They are called scalers and integrate hits for 0.5 seconds every
4 seconds -- the gate enabling/disabling the scalers comes from a
generator which, for historical and now obscure reasons, belongs to the
IFR. The number of collected hits is multiplied by 2 (to convert it to Hz) and
then sent to epics. The location of the scalers in the DIRC is explained
here.
Information from the scalers is the most important input for the DIRC
background survey.
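The factor 2 is simply the inverse of the 0.5 s gate. As a minimal Python sketch of the conversion described above (the names are illustrative, and the duty cycle is derived from the quoted gate and period rather than stated in the manual):

```python
GATE_SECONDS = 0.5    # integration time of a scaler
PERIOD_SECONDS = 4.0  # one gate every 4 seconds


def scaler_rate_hz(hits_in_gate):
    """Convert the hits counted during one gate into a rate in Hz."""
    return hits_in_gate / GATE_SECONDS  # same as multiplying by 2


def duty_cycle():
    """Fraction of the time during which a scaler actually counts."""
    return GATE_SECONDS / PERIOD_SECONDS


if __name__ == "__main__":
    print("100000 hits in one gate ->", scaler_rate_hz(100000), "Hz")
    print("duty cycle:", duty_cycle())
```

Since the scalers count for only a fraction of each period, short background bursts falling between two gates are not seen by them.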
Front-End Electronics:
The FEE boards (14 DFBs and 1 DCC per sector) are located within the
magnetic shielding which renders them inaccessible during normal BaBar
and PEP-II operation. The Fast Control global commands in one direction,
and the FEE data in the reverse direction, are sent through optical fibers (2
per FEE crate) which connect the 6 ROMs to the 12 DCCs.
Troubleshooting:
Possible DCC problems mainly concern the
distribution of the 15 MHz
clock to the DFBs. The symptoms might show up as one or more
completely failing DFBs, or even a complete module (2 sectors) which
does not show any data hits. A missing clock prevents the DFBs from
receiving commands and from sending data synchronized with FC. The data
received by the ROM are then not recognizable as such and are rejected,
with the DLINK damage flag being set at the same time.
Troubleshooting:
DCCs are also used by the trigger group. They have their own set of
boards which includes some spares. If for some reason one board is
loaned to one group by the other (DIRC → trigger or trigger
→ DIRC), one has to remember that the S1 switch (red switch at the
bottom of the board) may need to be moved to make the board work
for the other subsystem.
For instance, a DIRC spare board was given to the trigger folks
during the October'05 shutdown to replace temporarily a broken DCC
shipped back to France for repair. To make it work in a trigger
crate, the S1 switch had to be moved from the left to the right.
A DCC inventory (DIRC+trigger) can be found
here. The DIRC also has its own inventory, available from the DIRC operations web page.
DFB problems are more difficult to diagnose:
errors during the FEE
configuration (messages printed on the xyplex windows) could be a hint
of a malfunctioning DFB. Other hints are dataflow damages, such as
"out-of-order" (wrong trigger tag) or "out-of-synch" (wrong
time-stamp). These damages are listed as part of the default
Fast Monitoring histograms (page 3, top-right plot).
The easiest way to test whether a board
malfunctions is to mask it off and to see if the errors observed
disappear. Again, the replacement of a FEE board is only possible
during an access with open rear doors (magnetic shield). Spare DCCs as
well as DFBs are available in the DIRC area (located upstairs at
IR2) and as part of the DIRC IR2 teststand crate.
A non-working DFB is not by itself a good enough reason
to request an immediate door opening. As harnesses read PMTs along
DIRC radii, a dead DFB
would make the DIRC reconstruction miss at most 1-2 photons for
approximately 20-30% of tracks.
In case of a problem with a DCC, an e-mail must be sent to Denis Bernard and
Antoine Mathieu from the Ecole Polytechnique lab. To be repaired, the
board must be shipped back to this lab.
Troubleshooting: do not forget
to protect the 2 connectors into which the optical fibers (= Finisar links)
are normally plugged. Use tape if you cannot find any fiber
covers.
A few covers should be available in the drawer where
the keys of the DIRC cabinets are stored. Don't lose them! Put them
on the unused DCC finisars to protect them from dust.
The LAL is in charge of the DFB support;
therefore, in case of problems, an e-mail must be sent to Anne-Marie Lutz
(LAL P.I.) and Dominique Breton (LAL electronics engineer). DFBs are
usually repaired at SLAC by Dominique when he visits SLAC.
If the problem is just a damaged or unsoldered component, Lupe Salgado and
her team can fix it at SLAC (ask Antoine or Dominique first!).
An interrupted optical fiber path
due to damaged fibers or bad
contacts is identified by the error message "Error: Odf boot task 5"
printed on the xyplex windows after a reboot of the ROM. In addition,
the configuration of all DFBs in this crate will fail and the
datagrams sent will be empty. Finally, the "Ready" LED on the
corresponding ROM and fiber should glow red and, in case of a light
transmission failure, the corresponding Finisar EPICS panel should
indicate errors.
Front-End crate Power:
The 12 DIRC front-end crates take power from two power boxes located on the backward sides of BaBar: one on the west side (powering sectors #0 to #5) and one on the east side (powering sectors #6 to #11). Labels have been put on the power cords to allow an easy identification of the sector powered by a given cable. The picture below shows the way the 12 cables were plugged in during the October 2005 shutdown.
Front-end crate power cord status on 2005/10/19.
If some cords are moved in the future, it would be nice to update this picture: the xfig file in $BFROOT/www/Detector/DIRC/Operations/ should be used to overwrite the gif picture displayed here.
The two boxes are powered by a panel located on top of the detector. Between
this panel and the sector crates, there is a relay (also located on top
of the detector, about 2 meters away from the main box) which is controlled
by the DIRC chiller: chiller running ↔ relay closed; chiller not
running ↔ relay open. The control line of the relay is supplied
by 24 V provided by a power supply located in the electronics hut in a DCH
rack (B620B-10, the PS is actually hidden behind a front plate!). The
open/closed status of the 24 V line is given by the chiller SIAM output.
High Voltage System:
The DIRC HV power is supplied by 6 SY527 CAEN crates. Each crate
powers 2 consecutive SOB sectors. They are located in Racks #5
and #6 in the Electronics House. Each CAEN crate contains 8
independent HV boards, each of which can power 16 HV channels (called
groups). Their status and the HV settings
are monitored by
dedicated
EPICS panels.
An alternative but discouraged way to connect to the
crates is via the serial "xyplex" lines, available for each CAEN crate using
the command:
xyplex drc-caeni
where i=1...6 is the number of the CAEN crate.
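The crate layout described above implies 6 x 8 x 16 = 768 HV channels in total. The short Python sketch below only spells out that arithmetic and lists the six serial-line commands; the constant and function names are illustrative and are not part of any DIRC tool.

```python
# Illustrative arithmetic for the HV layout described in the text.
N_CRATES = 6             # SY527 CAEN crates
BOARDS_PER_CRATE = 8     # independent HV boards per crate
CHANNELS_PER_BOARD = 16  # HV channels ("groups") per board


def total_hv_channels() -> int:
    """Total number of HV channels supplied by all crates."""
    return N_CRATES * BOARDS_PER_CRATE * CHANNELS_PER_BOARD


def xyplex_commands() -> list:
    """Serial-line commands 'xyplex drc-caen<i>' for i = 1..6."""
    return [f"xyplex drc-caen{i}" for i in range(1, N_CRATES + 1)]


if __name__ == "__main__":
    print(total_hv_channels())
    for cmd in xyplex_commands():
        print(cmd)
```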
Troubleshooting: For security reasons, access to all HV power supply serial lines was restricted on August 12th 2005. These connections can now only be made from a limited number of machines which have AFS and regular home directories: bbr-reflin, bbr-reflin1 (Linux machines) or bbr-refsun and bbr-refsun1 (SUN machines). By default, users are not allowed to log in to these machines. If you think you need such permission for DIRC-related activities, please send an e-mail justifying your request to ir2-admin with the DIRC ops. manager(s) in cc. Otherwise you can connect to these machines using the DIRC generic account babardrc.
Once the xyplex session has been set up, pressing "D" displays
- the individual HV settings;
- the voltage and current 'absolute' limits (respectively 1600 V
and 2200 uA);
- and the monitored HV and current values.
The channel voltage is hardware-limited: a small screw on the back
plane of each HV board allows one to adjust this limit.
It is under all circumstances forbidden to
operate the HV when the SOB is open to light.
There are 6 closable glass windows in the SOB which allow
visual inspection of the inside. One should check that all of them are closed
before turning the HV ON. In addition, if one of the optical
fibers used to transport light from the DIRC
light generator crate to
the SOB is unplugged, some light enters inside as well.
In addition to turning the HV off with epics, the
following actions are mandatory before starting any work involving the
DIRC HV system.
- Read and sign the non-routine JHAM describing the work planned.
- Turn off the HV crate using the unique key which should be in
one of the crates. This hardware interlock switch guarantees that the
HV supply is interrupted.
Keep the key in your pocket while you're working on the DIRC HV.
If the key gets lost, don't panic: any IR2 CAEN crate key should fit!
- If you plan to work on an HV board, unplug the power cords at
the back of the three crates which are in the rack. Then, wait for the
red light to turn off to make sure that all capacitors have dissipated
their charges.
Faulty HV boards (or crate) must be brought to
Paul Stiles, the SLAC technician in charge of
them. Ask Ray Rodriguez for more information
if needed.
Electronics Cooling:
When the doors of the magnetic shield are closed, the FEE crates need
to be cooled to evacuate the power dissipated during the running of the
detector. Therefore, cooling radiators are attached to the inner side of
the shielding and come close to the crate fan trays when the doors are closed.
The cold water circulating in these radiators is supplied by the
DIRC
Water Chiller. The inspection of the chiller is part of the routine
BaBar shift tour.
Troubleshooting:
The DIRC chiller is connected via a SIAM (located in Rack #07 in the
Electronics House) to the two BaBar annunciator (VESDA) panels, located in
the Electronics House and in the control room. A SIAM trip caused by a
chiller malfunction opens a relay, located on top of the detector (and thus
inaccessible during normal running), which simultaneously interrupts the
power supplies of all 12 DIRC FEE crates. Such a malfunction could be
generated by a broken chiller fuse or an IR2 power cut. After removing
the cause of the failure, the SIAM needs to be reset and the FEE crates
can be turned on via the FEE EPICS panel.
Similarly, it may be necessary to inhibit the Backup Cooling System (BCS)
on the SIAM after a general power cut. If the BCS is not inhibited while
we rely on the DIRC chiller, the FEE crates cannot be turned on. If everything
else looks normal and the FEEs still cannot be powered ON, check the
corresponding LED on the SIAM. See the
chiller problem section for more details.
If the DIRC water chiller breaks, one must use
the Backup Cooling System (BCS)
to cool the front-end crates
and minimize the interruption of data taking. The BCS can only be used by
one subsystem at a time. So, in case we need it, we have to inform the other
potential users of the BCS (DCH, EMC) that it is currently in use.
Investigation of the DIRC chiller problem should start immediately by
notifying Site Engineering Maintenance (SEM) / Heating, Ventilation and
Air Conditioning (HVAC) with an urgent trouble call.
Instructions to switch from the DIRC chiller to the BCS can
be found in a dedicated section
of the Care and Feeding Manual. To make such a switch as
smooth and quick as possible, it is recommended to practice this operation
outside of data taking periods, possibly during ROD, more likely
during a shutdown. Walt Innes can help you set up this test and give
advice if needed. To avoid interfering with other work, the request must
be presented well in advance at an IR2 operations meeting.
To be able to turn ON the front-end crates while using the BCS as cooling
system, the DIRC chiller SIAM must be inhibited. Remember that there is no
additional protection of the crates if the BCS fails. So, one should be
cautious when operating the BCS: the temperatures of the crates must
be monitored with care.
Ensure that the chillers are never operated without
an open path from the outlet to the inlet! Otherwise, either the
chiller pump or the valves will break.
Remember that without any chiller the FEEs are no longer cooled. So be sure
you know how to proceed before switching the chiller off if the FEEs are ON!
When the BCS is working, tell the shift leader that the DIRC
is operational again. Make notes in the logbook and arrange the repair
of the DIRC chiller if needed -- don't forget to switch it off in this case.
Before leaving IR2, monitor the system (temperatures and values at the BCS,
temperatures of the FEEs) for a sufficient amount of time.
Water Recycling System:
The SOB water circulation system is located behind the Electronics
House in the IR2 entrance hall. The inspection of the water system is
part of the routine BaBar DQM tour, done once per shift (see the
DQM checklist for details).
An EPICS panel reports the
water level, temperature and pH values. A detailed
DIRC expert checklist
provides guidelines about how to fill the SOB and how to maintain the
cleaning system, and what to do to recover from a power outage.
The DIRC water plant is a sensitive part of the DIRC
system. Commissioners should not try to fix anything on it unless they
were given special instructions to do so. The safe procedure is to page
the water plant experts, Matt McCulloch and Jerry Va'Vra -- if needed.
Troubleshooting:
After a power outage, the circulating pump might not restart by itself. If
the SOB water level falls, you might want to check that the pump is running.
See specific instructions
here. If any of the water
plant pumps is dead, water plant experts should be notified.
Spare pumps
are available but an electrical tech. is required to replace a pump
safely.
SLAC pressurized air supply:
The valves in the Water Recycling System are controlled by pressurized
air.
Troubleshooting:
In case of an air pressure outage, a small DIRC reservoir will
provide pressurized air for about two hours. If the outage is expected
to last longer, contact a DIRC water expert.
Nitrogen Gas Circulation and SOB Water Dump System:
The DIRC
Gas System detects water leaks by measuring the humidity level of N2
within each barbox and looking for water inside the barbox slots. There
are 4 leak sensors per barbox in total, although some are shared between
barboxes. Signals on any two sensors on the same barbox will trigger
a SOB water dump. The sensor electronics are located in Rack 5,
on top of the detector,
and the majority logic electronics that create the dump signal are
located in Rack #38 on top of the Electronics House. The humidity level
and the N2 flow are read out via the DIRC Gas System Epics panel.
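The dump condition described above (signals on any two sensors of the same barbox) can be sketched in Python as a simple coincidence count. This is only an illustration of the logic: the real decision is taken by the hardware majority logic, and the sensor names and the sensor-to-barbox map below are hypothetical.

```python
from collections import defaultdict

# Hypothetical sensor map: sensor name -> barboxes it watches.
# A sensor listed for two barboxes models a sensor shared between them.
EXAMPLE_SENSOR_MAP = {
    "s0": [0],
    "s1": [0],
    "s2": [0, 1],  # shared between barbox 0 and barbox 1
    "s3": [1],
}


def barboxes_requesting_dump(sensor_map, fired_sensors, threshold=2):
    """Return the barboxes on which at least `threshold` sensors fired."""
    counts = defaultdict(int)
    for sensor in set(fired_sensors):  # ignore duplicate reports
        for barbox in sensor_map.get(sensor, ()):
            counts[barbox] += 1
    return sorted(b for b, n in counts.items() if n >= threshold)


if __name__ == "__main__":
    print(barboxes_requesting_dump(EXAMPLE_SENSOR_MAP, ["s0", "s2"]))
```

A single firing sensor never satisfies the condition, while a shared sensor counts towards every barbox it belongs to.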
Troubleshooting:
The Gas system is connected to an uninterruptible power supply (UPS), which means
that batteries located in IR2 take over the electricity supply in case
of a power outage. In addition, the dump module has an independent
battery which, however, will not last longer than 4-7 hours. In case of
a power outage it is important to check that the UPS works. A complete
power loss might cause the SOB dump valves to open!
Power Outages:
Relatively frequent power outages in the Electronics House due to
cooling problems or VESDA alarms require the following actions:
Troubleshooting:
- Once the power is back, the 2 DIRC IOCs located in the E.H.
(drc-mon
and drc-hv)
need to be rebooted.
- If the software reboot seems not to be working, go to Rack #07 in
the E.H., and check that everything looks OK: boards (VSAM etc.) may die
during a power outage. If there is no obvious problem, reset the IOCs by
pushing the... 'RESET' button (!) to trigger a reboot.
- Boot time depends on the IOC: ~5 min for
drc-mon and only
~2 min (!) for
drc-hv which is
now a PowerPC running at 1 GHz on RTEMS! With the old 66 MHz board
running on VxWorks, the boot time was rather ~15 minutes....
Once the boot is finished, the corresponding epics panels
(front-end and HV for instance) should show green (OK), yellow
(minor alarm) or red (major alarm) status colors instead of white which
means 'channel not connected'. HV sectors and front-end crates may need
to be turned on manually using the "ON" buttons on the corresponding panels.
- If the power outage extended to the DIRC Gas Rack #37, the
drc-gas
IOC needs to be rebooted in the same way as the other IOCs. Check that the
Nitrogen gas flows -- both rotameters (upstream flows) and
electronic flowmeters (downstream flows) -- show nominal values.
- If the power outage included the water system, check the status of the
dump electronics and the valves. Check the air pressure on the air
tank as well, reset the water pump and check that the vacuum pump
is running. Before doing so, look at the sections of the manual dedicated
to the water plant. Do not hesitate to call an
expert if you're not sure of what to do.
The DIRC has 3 independent Input-Output Controller (IOC) modules to which serial link connections can be established
from the proper IR2 machines:
- The MON IOC
- (serial link: xyplex drc-mon)
controls 12 PMT hit rates (scalers), low voltage, temperature and
status of the FEE boards and crates, the values of the magnetic
sensors and the light transmission intensity of the optical fibers
through 2 daisy-chained CANBUS systems, one on the Wiener FEE
crates and the other on the DCCs. In addition, the gas humidity
and flow readout as well as the SOB status are controlled here. The
corresponding setup files (database libraries) are found in the
directory:
/nfs/bbr-srv01/u1/babar/boot/apps/drc-mon/db/
- The HV IOC
- (serial link: xyplex drc-hv)
controls the HV configuration and status. The database setup files
are found in:
/nfs/bbr-srv01/u1/babar/boot/apps/drc-hv/db/
More information on the HV control system can be found here.
- The Gas SIAM IOC
- (serial link: xyplex drc-gas)
controls the SIAMs of the various gas humidity sensors.
Equivalently to the other IOCs, the database configuration files
reside in:
/nfs/bbr-srv01/u1/babar/boot/apps/drc-gas/db/
In fact,
drc-mon,
drc-hv and
drc-gas are
3 symbolic links which point to the latest versions of the packages currently
used in production. The BaBar packages containing the IOC codes are
epics-drc-mon,
epics-drc-hv and
epics-drc-gas.
Epics variables are normally initialized with upper and lower alarm
thresholds. These limits are set in the
db/ subdirectory
of the corresponding IOC package. The alarm status of variables is visible
on the epics panel through a color code (OK,
minor alarm,
major alarm
or not connected)
and
variables linked to the ALarm Handler (ALH) produce audible and visible alarms
on the pilot console when they go outside their nominal range or lose
connection.
The DIRC ALH configurations are stored in the
alh/ subdirectory
of the IOC packages. There are usually two files per ALH section: one for
the normal data taking conditions with all alarms enabled; one for the
shutdown periods with several alarms disabled. As parts of the DIRC are
usually turned off during shutdowns, disabling these alarms prevents the
ALH panels from being flooded with spurious alarms.
To learn more about an epics variable, middle-click it on the epics panel.
This should open a small grey popup window displaying the name of the
variable. Clicking 'Examine' then gives access to some important
parameters of the variable: its scanning frequency and its alarm thresholds.
Ambient database:
Several DIRC epics variables are archived in the ambient
database. These
data allow one to monitor the DIRC behavior over long periods or to
make detailed studies of the DIRC status at a given time in the past. In
particular, two important components of the DIRC monitoring (front-end
fan speed monitoring and background monitoring)
extract their data from this database.
Among the monitored variables, one can find:
- the 12 scalers;
- several parameters from the front-end crates;
- currents and voltages for all the HV channels;
- various environment data: N2 flows, B-field sensors,
neutron counter rates, water plant main parameters...
- background monitoring variables of various types.
There are two main ways to access ambient data.
- To use the ambient browser.
This is the simplest way to create quick plots. If you're working on
a linux SLAC workstation, a convenient way to
start it is to click the button 'Ambient Browser' on the right part of
the main BaBar epics panel. This should open a window similar to the picture
below.
Screen shot of the ambient browser showing some of the
DIRC epics variables archived in Ambient.
If you're working from a remote site, it is much simpler to install JAS on
your computer and to connect to the BaBar ambient server from there. This
will prevent you from importing the browser from SLAC in addition to the
monitoring data you're interested in! Otherwise, the browser will be very
slow and may crash quite often...
- To use OdcNtupleMaker.
This nice tool (managed by Andy Salnikov) uses a time range and a
list of epics variables as input. It produces an ntuple (PAW or ROOT format
depending on the user's request) which contains the time history of all
the selected variables over the chosen range. The ntuple can then be
analyzed offline using suitable macros.
The directory
$BFROOT/detector/drc/Odc/ shows several examples of input datacards
(files with the .inp extension). The conventions used in these text files
should be pretty straightforward. The file
analysis.kumac
contains PAW macros which can be used to analyze ntuples produced
from the .inp input files. The macros include:
- TIME_WALK_SCALERS_NEW: plots of the scalers, PEP-II luminosity
and LER/HER currents versus time;
- FREQ_VS_I: the scaler rates versus LER current or luminosity for
all sectors including fits of the linear and quadratic components,
if required;
- I_AND_HV, ALL_V: the current and HV stability;
- PLOT_TEMP: the FEE temperatures;
- TIME_WALK_HUM_FLOW: the gas flow and humidity versus time.
Warning: this directory has not been used
for years and so its contents are expected to be quite old-fashioned.
Therefore, use them with care: they are good examples of what one can do
with OdcNtupleMaker
but they may not be directly usable.
Adding new channels in Ambient is quite straightforward. Details
can be found in a dedicated
guidance webpage or in this
hypernews.
For information and pictures, see
here.
Data Acquisition System (ROMs):
The
DIRC
DAQ system has 6 "segment" level ROMs and 2 "fragment" level
(slot-one) ROMs divided up into 3 logical DAQ crates, located in the
same physical crate in the Electronics House. In the dataflow
conventions, these crates are 18 (0x40000), 19 (0x80000) and 20
(0x100000). The "segment" level ROMs
communicate over optical fibers through the DCCs with the DFBs, send
commands and receive the TDC and ADC as well as header and status data.
The datagrams are received with a clock frequency of 15MHz. Once
arrived in the single board computer in a ROM, they are
"feature extracted"
(FX), i.e. stripped of header and status words and
checked for structural consistency. The results of these checks are sent to the
event level as an
FX-status
word which is traced in the DIRC fast monitor. Status values above 20
are fatal errors for which FX sets the dedicated FX-damage bit.
Currently, all ADC data are truncated within FX in order to speed up
the dataflow within the ROMs and to prevent dead time. In the presence of
high machine background, the FX status may reach 9 or
higher, which means that TDC data are truncated too. This must be
watched for, as it affects the DIRC reconstruction (loss of data).
The most recent (August 2005) information on the FX status can be found
here.
The "fragment" level ROMs collect the data from their logical DAQ crates
and send them to the event level (UNIX).
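The hexadecimal values quoted above for crates 18, 19 and 20 are consistent with one bit per crate number (1 << 18 = 0x40000, and so on). The Python sketch below just reproduces that arithmetic; the function names are illustrative. The OR of the three bits is 0x1c0000, the value used as the DIRC crate mask in the fcgui command of the standalone run procedure.

```python
# One bit per crate number, as suggested by the values quoted in the text.
DIRC_DAQ_CRATES = (18, 19, 20)


def crate_bit(crate: int) -> int:
    """Bit mask for a single logical DAQ crate."""
    return 1 << crate


def crate_mask(crates) -> int:
    """Bitwise OR of the single-crate masks."""
    mask = 0
    for crate in crates:
        mask |= crate_bit(crate)
    return mask


if __name__ == "__main__":
    for crate in DIRC_DAQ_CRATES:
        print(crate, hex(crate_bit(crate)))
    print(hex(crate_mask(DIRC_DAQ_CRATES)))
```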
In order to debug ROM functions like library loading, configuration,
calibration or running, it is useful to open serial lines through the
xyplex board. The command
$BFROOT/detector/drc/bin/DrcStartXyplexes
opens 6 xyplex connections (xterms) for the 6 segment level ROMs. As
only one serial line can be open at a time, one may need to use
$BFROOT/detector/drc/bin/DrcKillXyplexes
first which closes all existing DIRC DAQ xyplex connections.
Troubleshooting:
- A malfunctioning ROM
will show corrupted plots
in the fast monitor and will fail calibration validations.
- FX status 9 or 11:
it is most likely due to high machine
background which produces huge datagrams which are then cut
within feature extraction in order to prevent dataflow dead time.
Check the scalers via epics and the DIRC background monitoring plots.
If the background is too high, make sure that the pilot is aware of
the situation and that he is discussing it with MCC.
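The two FX status thresholds mentioned in this section (9 or higher: TDC data truncated; above 20: fatal, FX-damage bit set) can be summarized in a small Python helper. This is a minimal sketch for reading fast monitor values: the category labels are made up here, and the meaning of the individual status values is not described in this manual.

```python
def classify_fx_status(status: int) -> str:
    """Coarse classification of an FX status word using the thresholds
    quoted in the text. The returned labels are illustrative only."""
    if status > 20:
        return "fatal"              # FX sets the FX-damage bit
    if status >= 9:
        return "tdc-truncated"      # TDC data are truncated as well
    return "no-tdc-truncation"      # below the TDC truncation threshold


if __name__ == "__main__":
    for value in (0, 9, 11, 21):
        print(value, classify_fx_status(value))
```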
FEE and Finite State Machine (FSM) Configuration:
Before starting a data taking run or a calibration, the individual registers
of the DFBs need to be configured. This is performed during the
so-called "Configure Transition". The system which drives the
configuration is "Reverse Dataflow" (RDF) which itself is driven by
"keys". Each key (top-key) is linked to a hierarchy of configuration
data for each subsystem involved. These data are usually read from
files (in xtc format) and the correspondence between a key and the
files in the configuration tree is fetched from the configuration
database during the configure transition. The DIRC has 2 types of
configuration files:
DrcRegisters
for the FEE register configuration, and
DrcCycles
for the FSM and FX or Calibration configuration. The corresponding
files are located in:
/nfs/bbr-srv02/dataflow/rdf/Drc/FE/
/nfs/bbr-srv02/dataflow/rdf/Drc/Cycles/
There are 6 different
DrcRegisters configuration files, one for each ROM. The
executable
TestDrcRegisters
(in the same repository) converts an xtc file into a readable
ASCII format. The structure of the FEE configuration is the following:
group register_address value linkA linkB
where the groups are data taking (0) and calibration (1). The links
determine which FEE element is addressed by a particular register
configuration.
LinkA talks to the even sector number served by a ROM and linkB addresses
the odd sector number. The only individual register settings define the
gains of the analog chips (8 ADCs per DFB, register addresses 0x0-0x7) which
are different for each ADC. The DCCs have 2 settings (calibration strobe
ON and the calibration strobe delay) which must be addressed through the
first DFB in the front-end crate which is
linkA/B = 0x1.
See also the section Masking noisy channels.
Troubleshooting: Configuration errors
show up during the
configuration procedure when the written values are read back from the
configured registers and compared to the settings. If differences
are found, they show up through an error mask printed by the ROMs (e.g. onto
the xyplex xterms) for each register that failed. The mask corresponds
to
(linkB << 16) + linkA
for the FEE elements which show a wrong configuration. If these errors
persist, one may try to "power-cycle" the corresponding FEE crate. If
this does not help, it is a
strong sign of a malfunctioning DFB (or
DCC, if the clock distribution failed).
If one or more harnesses (or even a whole ROM) fails, it
can be necessary to mask out bad FEEs. This is done with the help of
the executable:
/nfs/bbr-srv02/dataflow/rdf/Drc/FE/DrcChangeLinks -f
<DrcReg.xtc> <0xabcdefgh>
The inputs are the original register configuration file,
DrcReg.xtc,
of the ROM for which the problems occurred and the error mask as printed
by the ROM
0xabcdefgh
(you may cut-and-paste it). The output file
<DrcReg_offabcdefgh.xtc>
contains the new configuration for which the bad elements are masked off.
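Since the error mask is built as (linkB << 16) + linkA, it can be split back into its two halves before deciding what to mask. The Python sketch below assumes that linkA and linkB each fit in 16 bits; reading each set bit as one FEE element is an additional assumption, suggested by the statement that the first DFB corresponds to linkA/B = 0x1. The function names are illustrative.

```python
def split_error_mask(mask: int):
    """Split an error mask (linkB << 16) + linkA into (linkA, linkB).
    Assumes both link words fit in 16 bits."""
    link_a = mask & 0xFFFF
    link_b = (mask >> 16) & 0xFFFF
    return link_a, link_b


def set_bits(word: int):
    """Positions of the bits set in a 16-bit link word (bit 0 first).
    Treating each bit as one FEE element is an assumption."""
    return [bit for bit in range(16) if word & (1 << bit)]


if __name__ == "__main__":
    link_a, link_b = split_error_mask(0x00030001)
    print(hex(link_a), set_bits(link_a))
    print(hex(link_b), set_bits(link_b))
```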
FastMon:
For information about the Fast Monitoring, check the
dedicated webpage.
Calibration: Many of the final tools of the DIRC calibration
system are already in place, some are still in development. All DIRC
calibrations are started with the help of the calibration GUI (tk library):
DrcCalibrations
which is an alias defined, e.g., in the setup script of the DIRC account
babardrc. In the future this is supposed to be an expert tool for deeper studies
and debugging, while all default calibrations are performed by the BaBar
DAQ shifter using Run Control (ORC). All calibrations exist in a monitor
and a database mode. Both modes produce ntuples which can be analyzed with
the help of the various "Check Results"-type buttons. The GUI is a simple
command line executor. It starts tcl and csh scripts which then call the
respective actions. All relevant calibration scripts reside in the directory:
$BFROOT/detector/drc/DrcCalScripts/
Temporary files needed for the database validation (there is an absolute
validation of the calibration results with respect to some initial ranges
and a relative validation by comparison with the previously stored results),
various checks, the hbook ntuples and the log files of the different calibration
levels (event level, control level) are saved at:
$BFROOT/detector/drc/tmplog
The individual DIRC calibrations are:
- Online T0 calibration
- (using LED flasher): fits the individual T0's of all PMTs. The most
important contribution to the T0 variation between channels
originates from the different HV settings: dT0/dHV = (-15 +/- 1)
ps/V.
- ADC calibration
- (using LED flasher): fits the single photoelectron peak and the
threshold of the ADC spectrum
- Electronics calibration
- (using DFB's internal charge injection) -STILL IN PREPARATION-:
fit the gain (slope) and the pedestal (ADC offset at zero charge injection)
and the threshold (ADC signal rise) using the charge injection facility
of the DFBs.
- Occupancy calibrations
- -STILL IN PREPARATION-: there are 3 different types of occupancy
calibrations
- PMT noise calibration taken without beam and LED
- Background calibration taken in presence of beam
- LED occupancy calibration used in order to monitor the LED light intensity
and to take fast snapshots of the system for debugging purposes.
- DFB serial number calibration
- this "calibration" is used in order to attribute the correct gain settings
(lab measurements connected to the DFB serial number) to the FEE elements
used in the detector. It only needs to be performed when DFBs have been
exchanged.
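As a worked example of the HV dependence quoted for the online T0 calibration, dT0/dHV = (-15 +/- 1) ps/V, the Python sketch below estimates the T0 shift for a given HV change. It assumes the dependence is linear over the HV change considered; the function and constant names are illustrative.

```python
DT0_DHV_PS_PER_V = -15.0     # slope quoted in the text
DT0_DHV_ERR_PS_PER_V = 1.0   # uncertainty on the slope


def t0_shift_ps(delta_hv_volts: float):
    """Expected T0 shift (ps) and its uncertainty for an HV change (V),
    assuming a linear dependence on HV."""
    shift = DT0_DHV_PS_PER_V * delta_hv_volts
    error = abs(DT0_DHV_ERR_PS_PER_V * delta_hv_volts)
    return shift, error


if __name__ == "__main__":
    shift, error = t0_shift_ps(50.0)
    print(f"T0 shift for +50 V: {shift} +/- {error} ps")
```

Under this assumption, raising the HV of a channel by 50 V moves its T0 earlier by about 750 ps, with an uncertainty of about 50 ps.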
Troubleshooting: If a calibration fails its validation, this
may point to a hardware problem.
Check on the EPICS panels that all FEE boards are
functioning normally. Press the "Check Results" button and watch the
hitmaps and occupancy distributions for abnormal shapes.
If error messages appear on the event level and/or the shared memory
reader xterm(s), there may be a mismatch between the various scripts
and tcl files interacting during a calibration.
If the control level does not proceed through the different minor cycles, check
on the xyplex windows whether it counts the calibration pulses. If this is
not the case, there may be a problem with the internal dataflow
pulser.
Additional Features: The calibration
GUI contains two panels that are very useful shortcuts to frequently
used tools: a) the DRC scaler rate history and correlation with the LER
current and b) the time history of the calibration values. Both are
started by clicking on the appropriate panel and entering the time
interval into the pop-up window (comply with the format given).
- Plot Scaler Rates
- Clicking on this panel produces an ntuple using the above mentioned
OdcNtupleMaker with the standard ntuple variables: HER+LER currents,
luminosity, and scaler rates. The input file is defined in the shell script
$BFROOT/detector/drc/DrcCalScripts/runOdcNtupleMaker.csh
The output file is written to
$BFROOT/detector/drc/tmp. Then the macro
$BFROOT/detector/drc/DrcCalScripts/ntuplemaker.kumac is
run on the ntuple and produces a 2-page postscript file. Finally, a ghostview
process displays the PS file. Quitting ghostview returns you to the GUI.
- Plot calibration history
- This command uses the DrcCondNtupleMaker to plot a large number of
calibration constants as a function of time. It generates a 26-page postscript
file and launches a ghostview process to view the output.
Teststand:
Upstairs in IR2 (bldg 621, room 201) we have a teststand which is used to test
front-end components (power supplies, fans, boards etc.). Spare HV boards,
tools etc. are also stored there -- we also have 2 cabinets in this room; the
one closer to Ray's office contains the front-end spare boards and the tools
used by Orsay engineers to test and fix the DFBs.
Safety guidelines:
- Never leave a crate ON on the front workbench (the one facing the
entrance from outside) if you're not around.
This restriction does not apply to the other workbench.
- If you're burning in PS or fans, do not put front-end boards in, to
avoid wearing them out.
- Once you're done with the DCC located in the spare crate, put back the
covers on the finisar outputs to prevent dust from going in.
|
Taking a DIRC standalone run
|
- First and foremost, never
start this procedure unless you are in the BaBar
control room. Make sure that the pilot is aware of your
work and gives you the green light.
- Obviously, the DIRC HV needs to be set to its nominal voltage to take
meaningful data.
- Find in the logbook the last Physics run and write down the
current physics key (this is 0x7ba (=1978) as of January 22nd, 2003).
- Log in to bbr-farm02 (or any other not-overloaded dataflow machine
like e.g. bbr-farm04 or bbr-farm07) and open two more terminals.
(Note: do not use x-terms, because fcgui below may not work; if fcgui still does not work, try sourcing drcSetup.csh.)
- Find out what the current release is
bbr-farm02:~> which showPartitions
showPartitions: aliased to /afs/slac.stanford.edu/g/babar/dist/online_releases/8.14.1/bin/SunOS58/showPartitions
In this case it is 8.14.1
- In one terminal, type (replace 8.14.1 with the current online release)
cd /afs/slac.stanford.edu/g/babar/dist/online_releases/8.14.1/
srtpath
<RETURN>
<RETURN> ← WARNING: before pushing <RETURN> the second time, check that the default BFARCH is the same as the one included in the showPartitions path, normally SunOS58_sparc_WS6U1. If this is not the case, select the correct BFARCH.
cd $drc/bin
DrcOepLog
to start OEP (the event level). This will open a new window called
"OepLoggingApp". If this doesn't work, check the
BFARCH value (see above).
- Then go to the second terminal and type
cd $drc/bin
- Add the directory
$ODF_ROOT/package/shlib/$BFARCH to the
LD_LIBRARY_PATH:
setenv LD_LIBRARY_PATH "${LD_LIBRARY_PATH}:$ODF_ROOT/package/shlib/$BFARCH"
- In this terminal, type
fcgui -k 0x7ba -c 0x1c0000
where 0x7ba is to be replaced by the current key (preceded by the 0x!)
and 0x1c0000 is the DIRC crate mask (do not change it!).
This will open an "ODF Sequencer" window which looks like this:
- Click on the 'Allocate' button.
- Click on the 'Arguments' button. This will open the following window:
Check that the configuration key is correct; then close this window
using the 'Close' button.
- Click on the button
'MinorCycle'. The button label
will first go red and then green. In the message window at the bottom,
you will see the 'Last received transition' change.
The 'Last damage:' has to stay none. In case of a problem in the
partitioning or configuring phases, the key and/or crate mask is
likely not to be correct: crosscheck your inputs!
Another possible problem: the ROMs may need to be
rebooted. For instance, ROMs must be
rebooted each time the FEEs are switched on...
Try to avoid rebooting the ROMs if this is not mandatory.
- Click on the 'Calibration' button. This will open another
dedicated window:
Change the settings in the window to match the ones in the above picture:
- Cycle is Internal.
- Commands 1, 2 and 3 enabled (i.e. button pressed,
'Disable' is visible).
- Command 1 is set to L1Accept.
- Command 2 and 3 are set to NoOp with a delay of 64000 (ticks).
- Counts field is set to 0.
Once this is done, click on 'Apply' and then 'Close' the window.
- Click on the 'Enabled' transition in the main 'ODF Sequencer' window.
This will start the standalone run.
Check that data are indeed accumulating on disk in the new xtc file
created by the
fcgui
command. You can use the following command line:
ls -ltrh /nfs/bbr-srv02/calib/drc/conditions/drc/ | tail -3
Check the file size regularly to make sure that it is growing at a
reasonable rate.
- Typically, you want at least 150 kEvents
for a random trigger run. This corresponds to an xtc
file size just under 100MB.
Obviously, if some PMTs are noisy (e.g. because there is a Christmas tree ON)
events will be much bigger and so the number of recorded events should
be significantly reduced. Otherwise, analyzing the hbook will take ages
and may even fail!
- If you take a random trigger run to analyze dead and inefficient
PMTs you will have to collect more events.
For a typical analysis, you need to take about 5M events.
To avoid problems with disk space or hbook file size, please
split the 5M events in smaller runs of about 1M triggers each.
Remember to check the available disk space for the xtc files
on the output disk:
df /nfs/bbr-srv02/calib/drc/conditions/drc/
A run with 1M events will take 500-700MB for the xtc file.
It may therefore be necessary to move each xtc file to the
DIRC NFS disk
/nfs/farm/babar/drc/babardrc.
Don't fill up the
/nfs/bbr-srv02/calib/drc file system!
If this disk area gets 100% full, the calibration monitoring
job will fail next time a calibration is taken!
Please remember to clean up after yourself when the
xtc files are no longer needed.
If the /nfs/farm/babar/drc/babardrc file system is filling up and if
you need disk space there to move files from the
bbr-srv02
file system, please consider backing up the data from
/nfs/farm/babar/drc/babardrc to
mstore.
The path to the babardrc mstore repository is
/nfs/mstore/u/babardrc.
A typical sequence of mstore commands would be:
- The minimum file size for mstore is 100MB, the maximum 10GB.
For larger files use the astore command.
If the files are smaller, create a compressed tar archive of the
files you want to archive,
let's call that myruns.tgz (you could use the
$BFROOT/work/b/babardrc area for that purpose).
- Create an entry in mstore with the command
mstore create /nfs/mstore/u/babardrc/myruns.tgz
- Copy the file to the mstore entry with the command
cp myruns.tgz /nfs/mstore/u/babardrc/myruns.tgz
- Write the file to tape with the command
mstore put -p -v /nfs/mstore/u/babardrc/myruns.tgz
- Now you can safely delete the xtc files and the tar file on
/nfs/farm/babar/drc/babardrc. (You will be able to retrieve
the file from mstore with the
mstore get ... command.)
- To see what has been stored, just type
mstore ls /nfs/mstore/u/babardrc
Don't forget to check first if other users (not babardrc) are
responsible for filling up
/nfs/farm/babar/drc/babardrc.
If someone other than babardrc is using more than a few
hundred MB, ask them politely to delete the files asap
or have the SLAC Computing Service (SCS) delete the files for them...
-
Once you have accumulated enough data,
end the run by clicking on the button
labelled 'Halted'. This
button will end the program and release the partition. The ROMs should
all be ready to run again without rebooting.
Please note: if the run ends
abnormally, use the blue buttons labelled 'Reboot' and
'Dissolve'. Click the 'Reboot' button first if the ROMs need to
be rebooted. You can check in the xyplex windows (you can start
those with the
DrcStartXyplexes
command) that all the ROMs began to reboot, and
use ctrl-X to reboot any which didn't. Then click the 'Dissolve'
button. This will gracefully clean up the partition. Failure to
do this (e.g., killing the GUI with ctrl-C) may require manual
intervention in the master crate.
-
Tell the shift leader that you are done
and clean up the terminals.
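As a rough planning aid for the random trigger runs described above, the following sketch estimates the xtc disk footprint from the quoted 500-700 MB per 1M events and applies the mstore size limits (100 MB minimum, 10 GB maximum). The function names are purely illustrative and are not part of any DIRC tool; 10 GB is taken here as 10000 MB, which is an assumption.

```python
import math

MB_PER_MILLION_EVENTS = (500, 700)   # xtc size range quoted for a 1M-event run
MSTORE_MIN_MB = 100                  # minimum mstore file size
MSTORE_MAX_MB = 10_000               # maximum mstore file size (10 GB, assumed = 10000 MB)


def xtc_budget_mb(total_events, events_per_run=1_000_000):
    """Return (number of runs, low estimate in MB, high estimate in MB)."""
    n_runs = math.ceil(total_events / events_per_run)
    low = total_events / 1e6 * MB_PER_MILLION_EVENTS[0]
    high = total_events / 1e6 * MB_PER_MILLION_EVENTS[1]
    return n_runs, low, high


def archive_advice(size_mb):
    """Map a file size to the archiving route described in the text."""
    if size_mb < MSTORE_MIN_MB:
        return "tar with other files first"
    if size_mb > MSTORE_MAX_MB:
        return "astore"
    return "mstore"


print(xtc_budget_mb(5_000_000))   # budget for a typical 5M-event analysis
print(archive_advice(600))        # a single 1M-event xtc file
```

Compare the high estimate with the free space reported by df before starting the first run.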
Looking at the results of a DIRC standalone run
First process the run to make an ntuple - DrcOedPlayback is a
convenient way to do that.
- Log in to a fast SunOS IR-2 machine (e.g.
bbr-refsun)
as babardrc
cd $drc/bin
mkdir $BFROOT/work/b/babardrc
setenv LD_LIBRARY_PATH /afs/slac/package/amulet/SunOS5/lib
DrcOedPlayback -n <numberOfEvents> -f <filename> tcl/myDrcOed.tcl
Note: the babardrc work directory might already exist; in that case you
will see an error message that you can ignore. The
mkdir
command is here to make sure that the default output
directory for the hbook file produced by
DrcOedPlayback
exists -- this directory gets deleted
automatically if it remains idle for one week.
filename is the file created above
/nfs/bbr-srv02/calib/drc/conditions/drc/filename.xtc
The job will take some time (~2-3 mins per 100K events)
and produce an hbook file named
$BFROOT/work/b/babardrc/drcoed.hbook
On completion it will print the number of events processed. Note this
for the next step below if you did not run xtcCat before.
- Now look at it with
Paw:
exec random_run <#events> '<title>'
['<filename>'] ['<histoname>']
where #events is the number of events processed above.
Specify a title for the top of the PS output,
(for instance the date of the random trigger run) and the
filename of the postscript output file (the
default name is
random.ps
in the current directory).
You can also specify the full file name (including full path)
of the hbook file that you want to use as histoname
(it defaults to
$BFROOT/work/b/babardrc/drcoed.hbook).
Another useful plot can be made for sector <sector>
using:
n/pl 2.row%column (1./<#events>)/0.6E-3.and.sector=<sector>
option=colz
Note: the weighting factor converts the random triggers into a
noise level in kHz:
0.6E-3 = 600E-9 (width in seconds of the window in which
hits are accepted, i.e. 600 ns) * 1000 (Hz to kHz conversion)
To draw the sector boundaries on top of the previous plot, use the command
exec ~babardrc/kumac/drawSector.kumac
right after the previous one. The borders should appear in red. If it
doesn't work, type
filecase keep and try again -- by default paw doesn't differentiate between
capital and small letters whereas unix does...
Please remember to clean up the area when
postscript files are no longer needed.
- You may want to keep a copy of the hbook file for future reference.
The dedicated disk for that is
/nfs/farm/babar/drc/babardrc/histo
Please give the hbook file a name that identifies it for future use
(like date and number of events).
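The weighting factor used in the PAW noise plot above can be checked by hand. The sketch below reproduces the same arithmetic, assuming a hit acceptance window of 0.6E-6 s (the value implied by the 0.6E-3 factor once the Hz to kHz conversion is removed). The hit count in the example is hypothetical.

```python
WINDOW_S = 0.6e-6      # hit acceptance window in seconds, implied by the 0.6E-3 factor
HZ_PER_KHZ = 1000.0    # Hz to kHz conversion


def noise_rate_khz(n_hits, n_events):
    """Average noise rate in kHz for a PMT with n_hits over n_events random triggers."""
    # Same weight as in the PAW command: (1./<#events>)/0.6E-3
    weight = (1.0 / n_events) / (WINDOW_S * HZ_PER_KHZ)
    return n_hits * weight


# Hypothetical example: one PMT recorded 3000 hits in 1M random triggers
print(round(noise_rate_khz(3000, 1_000_000), 3))
```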
|
How to run DrcOedPlayback
|
For instructions with some pictures, see
here.
|
Taking a DIRC standalone calibration
|
- First and foremost, never
start this procedure unless you are in the BaBar
control room. Make sure that the pilot is aware of your
work and gives you the green light.
Also make sure the high voltage is set to V0.
- Log in to bbr-farm02 as babardrc:
ssh -X babardrc@bbr-farm02
- Setup the release and the working directory:
cd $drc/newDrcNfs/
source ir2_setup.src <current online release> (e.g. 17.1.0)
[The release number is written in the file DrcCommissioner/.online.]
cd DrcCommissioner
srtpath ← Choose SunOS58_sparc_WS6U1 as BFARCH (not the default so far)
ir2boot
cd workdir
- Launch the calibration GUI:
newExpert2.tcl
DrcCalibration GUI
- Click on "Scratch key run (are you sure??)" to launch the calibration. This will open 3 windows one after the other, and the whole process should take around 2-3 minutes.
- Currently (11/27/2007) the "Scratch key run (are you
sure??)" button is linked to a "long adc calibration", which
allows for checking the ADC distributions (this is necessary, for example, when
there is a HV update to correct for PMT aging).
- If the calibration fails, this may mean that the ROMs are
not in a good state. Rebooting them by clicking on "Reboot all Roms" and waiting about 20 seconds
may solve the problem; simply retry the calibration once the reboot is
done!
- You may also need to click on Cleanup (TCP/IP nodes and DB
locks).
- Finally, before you even launch newExpert2.tcl, you should
check that there are no other partitions. Enter the command
showPartitions
There should only be one partition, created by root.
- Do both of these things (cleanup and partition check) to get a clean start.
- A way to be sure that the calibration is running fine is to look at the ROM logs. Click on the button "Steal and open Xyplex windows" before starting the calibration: it will open 6 xterms (1 per ROM) to which the logfiles are redirected until they are closed.
Troubleshooting: While a xyplex window is open, the corresponding logfile is only written to the terminal screen. Therefore, the information it displays is lost when the window is closed! So be cautious when using xyplex windows.
- To look at the calibration results, click on "Check last scratch-key LED calibration":
this will open another window where you have to choose what file to check → calibration_scratch.hbook
- The result of the calibration is in the directory /nfs/bbr-srv02/calib/drc/ :
- the postscript is: ./tmp/results_calibration_scratch.ps
→ please rename it as ./tmp/results_calibration_tdc_YYYYMMDDHHMM.ps, otherwise it will get lost the next time a standalone calibration is taken
- for reference, the hbook is: ./hbook/calibration_scratch.hbook
→ please rename it as ./hbook/calibration_tdc_YYYYMMDDHHMM.hbook
- The same instructions apply for taking a TDC or ADC
calibration. The only difference is the name of the output hbook file,
which is ./hbook/calibration_(tdc/adc)_YYYYMMDDHHMM.hbook
These calibrations can be checked with the button Check
standalone LED calibration
|
Checking the last global calibration
|
Global calibrations are taken by the BaBar pilot through the ORC interface, usually once every 1-2 days when PEP cannot immediately reinject after a beam loss. The result of a global calibration does matter: if for some reason the calibration output gets corrupted, the initial processing of the data taken while those calibration parameters are valid will be bad. The data will then need to be reprocessed once the validity of a good calibration has been extended to cover the bad time range. This may take weeks, as runs are normally processed in the order they are taken. Fortunately, the DRC calibrations are stable over quite long periods, so extending the validity of a good calibration in this way is harmless.
Checking the last global calibration is very easy.
- Open a session as babardrc on a fast IR2 machine, for instance
bbr-dev102.
ssh -X babardrc@bbr-dev102
- Go in the proper directory:
cd $drc/newDrcNfs/
- Run the following automated script
checkLastGlobalCalibration
The output postscript file is written in the directory
/nfs/bbr-nfs03/det/drc/drc/tmp_central/.
Its generic name is
currentGlobalCalib_<date>_<time>.ps where the
date
(in the format YYYYMMDD) and
time (format HHMMSS)
labels allow one to identify which calibration was used to produce
which set of monitoring plots.
- Check the plots!
An example can be found here. The main things to know and to check are
the following.
- As of now (December '05), the postscript file does not display the
result of the last calibration but rather a comparison between the last
calibration and the previous one. Therefore, pages 1 to 5 show the
differences between the last two calibrations, PMT by PMT, with a color
code (green is OK: no difference!).
- Sector #5 is historically 'messy'.
- The most important page is the second one which shows the T0
difference between the calibrations. That plot should be mostly green.
- Having a few random white PMTs per sector is not a problem -- there always seem to be some
white PMTs.
Most of them should be dead or inefficient PMTs.
- On the other hand, a cluster of white PMTs may indicate a problem. PMTs which are temporarily inefficient
(HV channel turned off because of a Christmas tree, misbehaving DFB,
etc.) will appear white on the calibration plots. If white
PMTs appear to be 'alive' in the fastmonitoring plot, the problem
is likely to be somewhere in the calibration procedure. The light
generator located in the E.H. may be the culprit -- see
here for more information.
- From time to time, PMTs along a DIRC 'radius' show some
T0 variations: this means that there were some fluctuations in
the timing of a DFB. If possible, take a second calibration: the
fluctuations should be gone, meaning that the board reached a
steady state again.
- Page 6 is also worth checking: it shows the location of
the PMTs for which the T0 fit failed. Obviously, this map should
match the distribution of
white
PMTs in the first 5 pages.
- Email babardrc@slac.stanford.edu to report the status of the calibration.
Examples of unusual calibration plots (good or bad,
check them carefully!) can be found
here.
As the runs get 'PC-processed' a few hours after the data
were taken in IR2, it is important to check the quality of a calibration
as early as possible. Therefore, an automated e-mail is sent to
babardrc each time a new (DIRC) global calibration has been taken.
Additional note: The alias
lastGlobalCalibration
(available when logged on as babardrc) performs steps 2, 3 and part of step 4
in a row (it brings you to the output directory). One then just needs to
open the most recent .ps file and check its contents.
Troubleshooting:
When trying to validate a calibration, one sometimes gets a message saying:
Last global calibration
could not be validated. (...) Current DIRC offline-online release XX.X.X
doesn't match production offline YY.Y.Y
This is not a problem: it means that the release used for the online
production has been updated and that it doesn't match the DIRC online release
anymore. To have the calibration validation working again, one just needs to
update the DIRC framework.
Currently, the way to do this is by running the following script from
the $drc/newDrcNfs/ directory:
DrcCommissioner/DrcCalScripts/updateRelease.csh
This operation should be done by the operations manager, not by the commissioners.
|
Monitoring the time evolution of DIRC calibrations
|
In case of a problem with the DIRC data processing or after a long period without data taking during which some online-related work was done, monitoring the time evolution of DIRC calibration data over the last weeks is a good idea.
Indeed, outputs of DIRC calibrations are expected to be very stable, so any unexpected variation is likely to indicate a problem -- or to reflect some change in the hardware, for instance a DFB replacement. In the latter case, a new working point should have been reached: check this assumption by taking a few more calibrations. Their data should be stable again.
The recipe to make such time-evolution plots is quite simple.
- Log in to
bbr-dev20
as babardrc:
ssh -X babardrc@bbr-dev20
- Setup the release and go to the working directory:
cd $drc/newDrcNfs/DrcCommissioner
srtpath ← Choose SunOS58_sparc_WS6U1 as BFARCH (not the default so far)
ambientboot ← use ir2boot only if you want to include recent calibrations (less than one day old) in the survey.
cd workdir
Troubleshooting: DO NOT use ir2boot to run on long time ranges. While the job would proceed, the DIRC 'part' of the database would be locked which means for instance that no calibration data could be written.
- Produce the time evolution of the calibrations in the time range you're interested in.
For instance, to include all calibrations between June 15th and June 30th 2005, launch the command:
runTdcHistory.csh 2005-06-15_00:00:00 2005-07-01_00:00:00 0
Its last argument is the minimum number of hours between two consecutive calibrations taken into account in the history plot. By setting it at 0 one is sure not to miss any calibration.
In fact, the macro slightly adapts the range defined by the user by adding/removing a few hours. Therefore, to avoid getting annoyed by this feature, add one extra day before and after the range you are interested in.
The output of this command can be found here: the calibration from June 12th has been included in the history, an example of the feature described above.
Keep this feature in mind when you analyze the postscript output of the command: the plot origin is not the time you specified but the time of the previous calibration!
- The output of the command is a postscript file:
tmp/ntuplemaker_drct0type.ps
All plots should be pretty flat except in sector #5, which is usually a bit 'messy' (until the October 2005 shutdown, sector 10 was also messy, but this was due to a hardware problem which has been fixed).
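The advice to add one extra day before and after the range of interest can be sketched as follows. This is only an illustration of how the two timestamps could be built, assuming the YYYY-MM-DD_HH:MM:SS format shown in the example command; the helper name is hypothetical and the script itself is not called.

```python
from datetime import datetime, timedelta

STAMP = "%Y-%m-%d_%H:%M:%S"   # timestamp format used in the example command


def padded_history_command(first_day, last_day, min_hours=0):
    """Build the runTdcHistory.csh command line with one extra day on each side.

    first_day and last_day are 'YYYY-MM-DD' strings, both inclusive.
    """
    start = datetime.strptime(first_day, "%Y-%m-%d") - timedelta(days=1)
    # +1 day to reach the end of last_day, +1 day of padding
    stop = datetime.strptime(last_day, "%Y-%m-%d") + timedelta(days=2)
    return "runTdcHistory.csh {} {} {}".format(
        start.strftime(STAMP), stop.strftime(STAMP), min_hours)


# Calibrations between June 15th and June 30th 2005, padded by one day on each side
print(padded_history_command("2005-06-15", "2005-06-30"))
```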
|
Finding and Masking a noisy electronics channel
|
To find the location of a noisy channel:
Open paw from your home directory:
paw
Then inside paw use the following commands:
h/file 1 /nfs/bbr-nfs102/pubfastmon/Monitoring/OutputArchive/2007/03/LiveFastMon/LiveFastMon-0071363-20070321-115817.hbook 0
where LiveFastMon-0071363-20070321-115817.hbook corresponds to the RUN you want to use.
ldir
cd drcfm
h/l
which will list all the histograms
set ncol 50
palette 1
op logz
h/pl 45 colz
where 45 is the 2-D histogram corresponding to the appropriate sector. The noisy PMT will show up as a hot bin in this histogram, maybe 100 or more times hotter than the average PMT.
exec ~babardrc/kumac/drawSector_FM.kumac
locate
which allows you to get the coordinates of the noisy channel when you click on the channel bin on the plot
q
to exit.
To mask the noisy channel:
There are two new scripts that make the job much easier:
~hadig/babar/dirc/maskedchannels.pl key_date
to find the list of channels masked in the given key and
~hadig/babar/dirc/maskchannel.pl old_key_date sector dfb channel new_key_date
to mask an additional channel.
Things to know before following the instructions:
- Terminology of the DIRC geometry and basics of the readout.
- This includes the terms: ROM, sector, harness, channel, TDC, and DFB.
See
here
for details.
- Which channel is noisy (sector, row, column) ?
- To get this information, do a
standalone run
and look at the sector plane:
n/pl 2.row%column sector.eq.# option=colz
locate
with # replaced by the number of the noisy sector. Please make sure that
you get the row and column number in the format starting at 0. To check
plot:
n/pl 2.row%column sector.eq.#.and.(row.ne.#.or.column.ne.#) option=colz
with the #'s replaced by the corresponding numbers. The noisy PMT should
no longer be visible, but a white square should appear at its coordinates.
- Is it electronics noise or a noisy PMT ?
- If the neighboring PMTs see light from the noisy PMT, it is a Christmas tree
and not a noisy tube. Otherwise, switch off the HV for the corresponding HV
group, make another standalone run and check whether the PMT is still hot.
If it is a PMT-related problem, try to lower the voltages in order to
check whether you can save the PMT; otherwise, disconnect the HV cable from
the PMT (see other texts for more information).
Gathering the information
Note that hexadecimal numbers are prefixed with 0x in this text.
In the ASCII version of the xtc file that you will edit, all values are
entered in hexadecimal format WITHOUT the leading 0x.
We now have the sector, row, and column of the PMT. Next, we need the ROM,
link, harness, channel, section, TDC, and TDC channel information.
The following table converts sector to ROM and link:
| Sector | ROM | Link |
| 0 | 0 | A |
| 1 | 0 | B |
| 2 | 1 | A |
| 3 | 1 | B |
| 4 | 2 | A |
| 5 | 2 | B |
| 6 | 3 | A |
| 7 | 3 | B |
| 8 | 4 | A |
| 9 | 4 | B |
| 10 | 5 | A |
| 11 | 5 | B |
In order to convert the row and column to harness and channel, run the program
~hoecker/tools/position.run
Select option 2 and enter the column and row numbers. The DFB number is the
same as the harness number.
Get the section number using the following table:
| Harness | Section if Link==A | Section if Link==B |
| 0 | 13 | 27 |
| 1 | 12 | 26 |
| 2 | 11 | 25 |
| 3 | 10 | 24 |
| 4 | 9 | 23 |
| 5 | 8 | 22 |
| 6 | 7 | 21 |
| 7 | 6 | 20 |
| 8 | 5 | 19 |
| 9 | 4 | 18 |
| 10 | 3 | 17 |
| 11 | 2 | 16 |
| 12 | 1 | 15 |
| 13 | 0 | 14 |
We only need the section number modulo 14, i.e. use the number in the second column (Link A) whatever the link.
Get the TDC information by:
TDC number : channel / 16 (number from 0 to 3)
TDC channel : channel modulo 16 (number from 0 to 15)
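The two lookup tables and the TDC arithmetic above follow simple patterns, which the sketch below encodes for cross-checking values worked out by hand. The function names are illustrative only. The row/column to harness/channel conversion done by position.run is not reproduced here.

```python
def sector_to_rom_link(sector):
    """Sector (0-11) -> (ROM, link), following the sector table."""
    if not 0 <= sector <= 11:
        raise ValueError("sector must be in 0-11")
    return sector // 2, "A" if sector % 2 == 0 else "B"


def harness_to_section(harness):
    """Harness (0-13) -> section number modulo 14 (the Link A column)."""
    if not 0 <= harness <= 13:
        raise ValueError("harness must be in 0-13")
    return 13 - harness


def channel_to_tdc(channel):
    """DFB channel (0-63) -> (TDC number 0-3, TDC channel 0-15)."""
    if not 0 <= channel <= 63:
        raise ValueError("channel must be in 0-63")
    return channel // 16, channel % 16


# Hypothetical noisy channel: sector 7, harness 8, channel 39
print(sector_to_rom_link(7), harness_to_section(8), channel_to_tdc(39))
```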
Look for the current xtc file. You should ask the Configuration
specialist (Vasia) about the current xtc files. You are likely to find
those in:
/nfs/bbr-srv02/dataflow/rdf/Drc/FE
Please note that you should never change or delete those
files, as they are part of the current configuration: if
we lose them, the DIRC will go non-runnable.
Find the correct xtc file for the ROM you want to change; it should
be called Drc0#_DATE.xtc with # replaced by the ROM number (0-5).
Convert the file into a human readable format by running:
TestDrcRegisters filename.xtc > test.dat
Open the file
test.dat in your favorite editor. The format is line
oriented. The columns have the following meanings:
| Column | Description | Values |
| 0 | Group | 0 : data taking; 1 : calibration; 2 : internal test |
| 1 | Address | 0-7: ADC; 0x8-0xf: ANDer; 0x10 and up: TDC |
| 2 | Value | 0-0xffff |
| 3 | Link A Mask | 0-0xfffc |
| 4 | Link B Mask | 0-0xfffc |
More detail on all the fields, can be found
here.
We are only interested in the Group 0. The channel will be masked
in the ANDer (in theory we can also mask it in the TDC only but this
will leave the ADC for this channel switched on). The Address values
for masking are:
| TDC number | Address |
| 0 | 0x9 |
| 1 | 0xb |
| 2 | 0xd |
| 3 | 0xf |
Look for the lines in the text that are Group 0 and the Address value you
determined in the table above. A typical line (without any masked channels)
will look as follows:
0 b ffff fffc fffc
Please note that 0xfffc stands for the binary number %1111 1111 1111 1100,
i.e. the upper 14 bits are set, the lower 2 not.
Correspondingly, 0xffff stands for the binary number %1111 1111 1111 1111,
i.e. all 16 bits are set.
Remember also that Link A and Link B stand for one full Front-End-Crate
with 14 DFBs.
This line will cause the following configuration to be set:
In Link A, all upper 14 bits are set. Therefore, the initialization is sent to
all 14 DFBs. For Link B, all upper 14 bits are set as well, so the initialization
is valid for those 14 DFBs, too. The Address 0xb tells us that the Value
is set in the registers of the second TDC (number 1) on those boards.
The value itself is 0xffff, i.e. all 16 channels on that TDC are enabled.
Let's assume you want to switch off TDC channel 7 on TDC number 2 on
Link A section 5. This means that all TDC channels BUT TDC channel 7
have to be switched on for that TDC.
Let's assume the old line is:
0 d ffff fffc fffc
Thus, the value will be 0xffff - 2^7,
i.e. all bits set but the 8th: %1111 1111 0111 1111 = 0xff7f.
The Link B Mask should be 0, as none of the Link B DFB/TDCs should get this
value. The Link A Mask should only contain the bit for the TDC on section
5. Remember that the bits are shifted by two, i.e. 2^(5+2) = 0x80.
The same TDC on all other boards should still be set to 0xffff: for that line, the mask for Link B is unchanged
and the one for Link A is the old value minus the new Link A mask calculated in
the last step, i.e. 0xfffc - 0x80 = 0xff7c.
Now you can replace the old line by the following new lines:
0 d ffff ff7c fffc
0 d ff7f 0080 0
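The bit arithmetic of this worked example can be cross-checked with a short script. The sketch below assumes the old line has nothing masked yet (value ffff, both link masks fffc). The Link B branch simply mirrors the Link A case and is an assumption, as is the zero-padding of the masks, which is chosen only to match the example lines above.

```python
FULL_LINK_MASK = 0xFFFC                            # upper 14 bits set: all 14 DFBs of a link
ANDER_ADDRESS = {0: 0x9, 1: 0xB, 2: 0xD, 3: 0xF}   # TDC number -> Address (masking table)


def hx(value):
    """Hex without the leading 0x, formatted like the example lines."""
    return "0" if value == 0 else format(value, "04x")


def masking_lines(tdc, tdc_channel, section, link):
    """Return the two Group 0 lines that mask one TDC channel on one section.

    Assumes the old line was '0 <addr> ffff fffc fffc' (nothing masked yet).
    """
    if not (0 <= tdc_channel <= 15 and 0 <= section <= 13):
        raise ValueError("tdc_channel must be in 0-15 and section in 0-13")
    addr = ANDER_ADDRESS[tdc]
    value = 0xFFFF - (1 << tdc_channel)   # all channels on but the masked one
    board = 1 << (section + 2)            # section bits are shifted by two
    rest = FULL_LINK_MASK - board         # every other board on that link
    if link == "A":
        others, target = (rest, FULL_LINK_MASK), (board, 0)
    elif link == "B":                     # assumed symmetric to the Link A case
        others, target = (FULL_LINK_MASK, rest), (0, board)
    else:
        raise ValueError("link must be 'A' or 'B'")
    return [
        "0 {:x} ffff {} {}".format(addr, hx(others[0]), hx(others[1])),
        "0 {:x} {} {} {}".format(addr, hx(value), hx(target[0]), hx(target[1])),
    ]


# TDC channel 7 on TDC number 2, Link A, section 5
for line in masking_lines(2, 7, 5, "A"):
    print(line)
```

If a channel on the same TDC number is already masked in the file, the existing lines must be edited by hand instead; this sketch does not cover that case.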
Before converting the file back into xtc format, make sure to do the following:
- Remove all non-data lines in the header.
- Check that the last data line in the file has the full masks set in the Link A
and Link B field as it is used as run mask, i.e. something like
0 8 0 fffc fffc
has to be in the last line.
- Add two lines to the end:
FFFF
and the file name of the xtc file. You should name the xtc file according
to the rules described above; use a new, not yet used name in order
not to overwrite an old configuration!
Convert the ASCII file back to the new xtc file by running the command:
TestDrcRegisters < test.dat
It is best to check that the translation was successful by reconverting the
new xtc file into an ASCII file and making a diff of the two files.
In order to make the changes official, Vasia has to create a new DIRC key which
then has to be included in a new BaBar key. Once data taking starts with this
key, the masking is effective.
It is not recommended, but possible, to change the DIRC xtc file currently in
use. In this case, the change will get used as soon as the shift leader
goes to 'STANDBY' to allow a reconfiguration of the system.
If you have to do this, make a backup copy of the
original xtc file so that you can back out your changes in case of
unexpected problems with the new key.
|
How to mask DFBs in the ROMs
|
For instructions with some pictures, see
here.
Please don't forget that the ROMs should not be rebooted 'just for fun'
- Log in to a subnet 60 machine
ssh bbr-farm02
- Set some basics:
setenv ODF_PLATFORM 127
- Reboot the DIRC ROMs
/nfs/bbr-srv02/dataflow/ir-2/drc/release/bin/SunOS58_sparc_WS6U1/rebootPartition -c 0x1c0000
|
How to use the Macintosh in the test-stand
|
For instructions with some pictures, see
here.
|
How to recover a failed HV supply
|
For instructions with some pictures, see
here.
|
Replacing a CAEN HV mainframe
|
Detailed instructions with some pictures can be found in the
above section.
Make sure that the main frame has a problem. In order to do that, do
the following tests:
- Restart the CAEN HV crate
Turn the key to the OFF position. Remove
both cables from the CAEN serial line. Turn the key to the ON
position. After about 10 seconds, you should see the colored
LEDs blink and the LCD screen show the CAEN logo.
If this comes up, there is no problem.
- Check the fuses
Switch the power supply off; go to the back of the crate and
disconnect the power cable. Wait until the red light goes OFF.
Use a screwdriver to unscrew the black fuse covers. Check the
fuses with a DMM (to measure their resistance). Replace a broken
fuse by a spare (upstairs in the Lab area, close to
Ray's office). If Ray is around, you can ask him to help you.
- Try to remove load from the power supply
Switch the power supply off; go to the back of the crate and
disconnect the power cable. Wait until the red light goes OFF.
Pull 4 of the 8 boards out of the crate. Reconnect the power
cable and try again to turn the crate ON.
If the crate needs to be replaced, go to the
previous
section of the manual and follow carefully the instructions.
|
Replacing a CAEN HV board
|
Changing a HV board is easy provided that you follow the non-routine JHAM
guidance and that you're familiar with the HV hardware, which is described
in the previous section.
Troubleshooting: After each HV replacement,
check the nominal HV values of the corresponding channels and, if needed,
reload them.
Spare HV boards are stored in our teststand upstairs in IR2. Don't forget to
bring the faulty board(s) to Paul Stiles' workshop... and to bug him
afterwards to make sure the repair is quick!
The main steps of the HV board change procedure are recalled below.
- Turn off the HV crates located in the same rack via epics.
- Power down these crates by turning the key to the off (top) position
- Open the rack door and unplug the crate power cords.
- Wait for the red LEDs at the bottom left of the crates to switch off --
red light ON means charge remaining in the crate...
- Slowly pull out the board (which may or may not be screwed in); be very
careful not to cut any HV cables.
- Put the board on a flat surface (the blue stool on wheels which
is normally somewhere in the E.H. is appropriate for this operation).
- Bring the spare board close to the broken one and
move the HV cables one by one. Make sure that the board is
not upside-down and that you respect the plug order: each
group of 16 HV channels requires a different nominal HV!
- Do not forget to move the yellow 50 Ohm resistor from the old board
to the new one.
- Push the new HV board into the crate: make sure it follows the tracks and
that it is firmly plugged into the slot.
- Plug the power cord -- from now on, HV cables are
taboo!
- Close the rack.
- Unplug the serial line from the IOC controlling the HV crates to
hopefully avoid rebooting it.
- Turn the HV crates on one by one (turn key 90 degrees left). Make
sure they come back OK.
- Look at epics to check the nominal V0/V1 values -- the correct
values are located
here.
If they are not accurate, reload them.
Special case: failure of the
4th board in a sector.
In each sector, the 4th (from the left looking at the back of the
crates) board has a particular status. It is only powering 8 HV channels (hence
128 PMTs) instead of 16 for the other 3 boards. In addition, these PMTs are
located in the outermost part of the sector (see for example the
failure of
sector 4 4th board on 2006/06/30) where very few tracks have their
Cerenkov ring measured.
A 55-minute run taken with the sector 4 4th HV board dead on the swing
shift of 2006/06/30 has been DQG-ed OK, which means that changing that board
can be delayed (for instance until the next beam loss). On the other hand, a
failure of the first board in sector 1 in July 2005 led to data declared bad
by the PID group. Hence,
- in case of failure of one of the first 3 HV
boards (= powering 16 channels) in a sector, the board must be
replaced immediately at the price of ~15 minutes
downtime;
- if the 4th HV board fails, its
replacement can be delayed until the next
opportunity to minimize the downtime.
In the latter case, be very cautious: check quickly the FM plots for each run
to make sure that only the same 8 channels are dead; get in touch with the
DIRC DQ expert to check that the PC (ER) processing of these runs is OK; be
ready to change the board anytime -- 2 or 3 (better) people needed. Obviously,
if you have a doubt on the quality of the data, do not hesitate: just change
the board!
|
Setting the default voltages for the DIRC HV
|
After having changed a HV board, one needs to set the injectable (V1) and
runnable (V0) voltages of all its channels. The simplest way to do this is
to use the burtwb
tool. It reads a text file containing the voltage settings for all 672
DIRC HV channels and writes them to the 48 DIRC HV boards. The recipe to
use this tool is detailed below.
- Switch off all HV crates in EPICS, but leave them on in EH.
- Log in to a SUN machine which has write permission in the CAEN crates,
e.g. bbr-refsun.
ssh -l babardrc bbr-refsun
- Go to the directory where the HV setting files are stored.
cd /nfs/bbr-srv01/u1/babar/boot/apps/drc-hv/burt/
- Load the HV setting files
~vasileia/bin/burtwb -f nominalV0_20071120_final.snap
for V0 (runnable) HV values
~vasileia/bin/burtwb -f nominalV1_20071120_final.snap
for V1 (injectable) HV values
and load the trip thresholds
~vasileia/bin/burtwb -f nominalITrip_20071120_final.snap
These files contain the current default HV settings.
There should be no error message. Sometimes, however, the write command fails and
gives error messages. Simply retry the
burtwb
command until it works.
- Check in EPICS that the V0WR and V1WR fields are filled and have
sensible values: for all channels, V0WR - V1WR should be 300 V
(except for the channel in each sector which is the scaler) and the
default runnable voltages for all channels can be found
here.
The old values (before Nov. 2007) can be found here.
- Switch ON the HV. Check that the V0RD values are equal to the V0WR
settings. The physical voltages applied on the HV channels are given by the
VMON voltages. When the DIRC is runnable, these values should be very close
to the V0RD values (a difference of 1-2 volts in some channels is not a
problem).
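The consistency checks listed above (V0WR - V1WR = 300 V, V0RD equal to V0WR, VMON within 1-2 V of V0RD when runnable) can be written down as a small helper for values copied by hand from EPICS. This is only a sketch: it does not talk to EPICS, the function name is hypothetical, and the 2 V tolerance is taken from the "1-2 volts" remark.

```python
EXPECTED_DROP_V = 300.0   # V0WR - V1WR for every non-scaler channel
VMON_TOLERANCE_V = 2.0    # a difference of 1-2 volts is not a problem


def check_channel(v0wr, v1wr, v0rd, vmon, is_scaler=False):
    """Return a list of human-readable problems for one HV channel (runnable state)."""
    problems = []
    if not is_scaler and abs((v0wr - v1wr) - EXPECTED_DROP_V) > 1e-6:
        problems.append("V0WR - V1WR = %.1f V, expected 300 V" % (v0wr - v1wr))
    if v0rd != v0wr:
        problems.append("V0RD differs from V0WR")
    if abs(vmon - v0rd) > VMON_TOLERANCE_V:
        problems.append("VMON is %.1f V away from V0RD" % abs(vmon - v0rd))
    return problems


# Hypothetical channel values in volts
print(check_channel(1200.0, 900.0, 1200.0, 1199.0))
print(check_channel(1200.0, 950.0, 1200.0, 1190.0))
```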
Practically, the switch between the two voltage levels is
made through SIAM #2, which can either be 'on trip' (corresponding to the runnable voltage
level V0) or 'OK' (injectable voltage level: V1 = V0 - 300 V). Note that the right-most SIAM has been
moved to another slot: see here
for an up-to-date picture of the location of the SIAMs in the
drc-mon
crate. Hence, as stated on a post-it glued on the SIAM in the E.H., having this SIAM tripped during
data taking is normal.
SIAM 2 is used to set the DIRC HV to V0 (SIAM tripped) or V1 (SIAM not tripped). So, in normal operations, this SIAM appears to be in trip mode, but this is normal!
|
Investigating finisar problems
|
For instructions with some pictures, see
here.
|
Investigating and fixing chiller problems
|
For instructions with some pictures, see
here.
|
Investigating Wiener crate problems
|
For instructions with some pictures, see
here.
Remember that failure of a single DFB is not reason enough to open the backward doors!
Turn off all the DIRC HV and the power to the Wiener Crate where the DFB (DCC) is to be replaced before starting work.
Get a small flathead screwdriver (in teststand drawer) and a replacement DFB or DCC which has been recently tested on the teststand to ensure it is working well. A ladder is required to access the crates on the side of the DIRC and the bridge is required to access the upper crates. After unscrewing the DFB to be replaced, remove it very carefully - the small yellow capacitors on the back are easily knocked off. (DCCs also require care, but are not right next to any other boards.) Be sure the new DFB/DCC is correctly aligned in the tracks before pushing it in. DCCs require more force to push in at the end.
After successful replacement, it is necessary to take 2 calibrations to check that the replacement was a success. Inform anyone working near the DIRC that you plan to turn the HV on before doing so.
Please test the broken DFB (DCC) in the teststand if possible to check whether the failure is reproducible. After testing, send the board to the appropriate place (usually France) for repair.
After completing the replacement of a DFB or DCC there are 5 files to update:
Spare Status
SLAC IDs for DIRC Boards
Boards Currently in the Detector
Board History
Troubled Slot History at (/BFROOT/www/Detector/DIRC/Operations/troubled_slot_history.xls)
Finally one needs to update the gains on the DFBs. This is done by following the instructions in the README file which is located at:
/afs/slac.stanford.edu/u/ec/babardrc/RegistersScan/README
|
Replacing a Wiener crate fan tray
|
For instructions with some pictures, see
here.
|
Replacing a Wiener crate power supply
|
For instructions with some pictures, see
here.
|
A DIRC Water Plant Digest for Commissioners
|
For instructions with some pictures, see
here.
|
Procedure to be followed after a SOB dump
|
For instructions with some pictures, see
here.
|
Restarting the SOB circulating pump
|
For instructions with some pictures, see
here.
|
Restarting the SOB degassing pump
|
For instructions with some pictures, see
here.
In the following, the word 'IOC' is used to indicate
one of the three hardware IOCs the DIRC has been using since the
beginning of BaBar: drc-mon,
drc-hv and drc-gas. Since
summer 2005, the DIRC has had a fourth IOC, drc-soft, which is purely software.
Completely independent from the data taking,
drc-soft
is mainly used to monitor the DIRC background. Its reboot procedure is
different from the methods which are described in the core of this section.
Rebooting drc-soft: From a
con-01 terminal
(one needs to be logged in as babar to execute this reboot), go to the
directory /nfs/bbr-srv01/u1/babar/boot/apps/drc-soft/boot/. There, run the command
source restartCommand.csh
The reboot should take about 20 seconds, during which the ALH variables of the
DRC_BACKGROUND section are briefly disconnected.
Never reboot an IOC during a run unless it is
necessary. If the DAQ system is not able to reach the IOC for
2 minutes, a proxy will go non-runnable,
interrupting the data taking. This is
unfortunate because rebooting a DIRC IOC would otherwise be harmless.
However, this is a BaBar policy which we have not been able to change.
There are several ways to reboot an IOC. The preferred ones
are those which preserve the IOC logfile which can then be analyzed if the reboot fails. IOC logfiles can be found in the area
/nfs/bbr-nfs01/logfiles/Odc. The current logfiles can be found in the curr/ directory (one file per IOC, having the same name as the IOC). Logfiles of the 7 previous days are archived in the directories prev1/... prev7/. Older logfiles may be restored upon request.
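The logfile layout described above can be sketched as a small path helper. This is only an illustration: the function name ioc_logfile_path is hypothetical, and the mapping of prevN/ to "N days ago" is an assumption, since the manual only states that prev1/ to prev7/ hold the 7 previous days.

```python
from pathlib import PurePosixPath

# Root of the IOC logfile area, as quoted in the manual.
LOG_ROOT = PurePosixPath("/nfs/bbr-nfs01/logfiles/Odc")


def ioc_logfile_path(ioc: str, days_ago: int = 0) -> PurePosixPath:
    """Return the expected logfile path for an IOC (hypothetical helper).

    days_ago=0 selects curr/; 1..7 select prev1/..prev7/.
    ASSUMPTION: prevN/ holds the logfile from N days ago.
    """
    if not 0 <= days_ago <= 7:
        raise ValueError("only curr/ and prev1/..prev7/ are kept online")
    subdir = "curr" if days_ago == 0 else f"prev{days_ago}"
    # One file per IOC, named after the IOC itself.
    return LOG_ROOT / subdir / ioc
```

For example, ioc_logfile_path("drc-hv") points at the current drc-hv logfile, which is the one to watch during a reboot.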
IOC reboot through epics: (Recommended
procedure)
- From the main BaBar epics panel, click on
'CEN' → 'IOC'
('Detector Control Crates' section) →
'IOC' button of the IOC to be rebooted →
'Control'
(bottom left).
- The last click opens an expert panel called 'VME
Crate Controls<IOC name>'. To start the reboot of the IOC,
click on 'Enable' and then on
'Sys Reset' (this button is disabled by default).
- Then, check the logfile to see how the reboot proceeds; there is a
delay of a few tens of seconds in the update of the file. If the reboot
completes successfully, all epics panels and ALH variables which went white
when the process started should reconnect quickly.
IOC reboot using a serial line connection (xyplex):
Troubleshooting: For security reasons, access to all IOCs was restricted on August 12th, 2005.
Therefore, serial line connections (xyplex) can now only be made from a limited number of machines: bbr-reflin (Scientific Linux) and bbr-refsun (SUN). By default, users are not allowed to log in to these machines. If you think you need such permission for DIRC-related activities, please send an e-mail justifying your request to ir2-admin with the DIRC ops. manager(s) in cc. Otherwise, you can connect to these machines using the DIRC generic account babardrc.
- Connect to the IOC by using the following command:
xyplex -f drc-XXX
where XXX must be
replaced by hv,
mon, or
gas depending on
the IOC to be rebooted.
- The reboot command depends on the IOC.
- For the two oldest IOCs,
drc-mon and
drc-gas (66
MHz boards running on VxWorks),
CTRL-x will
trigger the reboot.
- During the October 2005 shutdown,
drc-hv was
upgraded to a 1 GHz PowerPC running RTEMS. Given the huge
number of HV channels, the faster board reduces the
boot time by a factor of at least 5. The reboot command is
send brk
- In both cases, it is recommended to close the xyplex connection
after the boot has started (use the command
CTRL-] to return
to a telnet session, which you can quit with the command
q).
Otherwise, the log messages will be sent to the terminal and not archived
in the logfile.
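As a quick reference, the per-IOC serial-line commands listed above can be collected in a lookup table. This is a minimal sketch: the names REBOOT_COMMAND and xyplex_cheatsheet are hypothetical, and the code only restates the commands quoted in this section without sending anything to an IOC.

```python
# Reboot command per hardware IOC, as quoted in this section.
REBOOT_COMMAND = {
    "drc-mon": "CTRL-x",    # 66 MHz board, VxWorks
    "drc-gas": "CTRL-x",    # 66 MHz board, VxWorks
    "drc-hv": "send brk",   # 1 GHz PowerPC, RTEMS
}


def xyplex_cheatsheet(ioc: str) -> dict:
    """Return the commands relevant to a serial-line reboot of one IOC."""
    if ioc not in REBOOT_COMMAND:
        # drc-soft, for instance, has its own restart procedure.
        raise ValueError(f"{ioc} is not rebooted through xyplex")
    return {
        "connect": f"xyplex -f {ioc}",
        "reboot": REBOOT_COMMAND[ioc],
        "detach": "CTRL-]",  # back to the telnet session
        "quit": "q",         # leave telnet so the log goes to the logfile
    }
```

Keeping drc-soft out of the table is deliberate, since its reboot goes through restartCommand.csh rather than a serial line.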
How to check if the reboot is proceeding well?
First, most boot errors generate messages which are obviously
errors, even if their precise meaning may be quite obscure. When a database
file (generic path db/<fileName>.db) is successfully loaded, the logfile should show the following three
lines:
(...)
load data
dbloadRecord "db/<fileName>.db"
value=0=0x0
(...)
Later in the logfile, you may also look for the following two lines
(...)
ecrit data 1 et valid=ffffff
value=0=0x0
(...)
If you have any doubt about the IOC behavior, contact the DIRC epics expert or
discuss it with someone from the IR2 computing group (Matthias or Steffen).
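The three-line check described above can be automated with a short scan of the logfile text. This is a sketch under stated assumptions: the function name loaded_db_files is hypothetical, and it assumes the three lines appear consecutively and exactly as in the excerpt above (blank lines and surrounding whitespace are ignored). Real logfiles may differ.

```python
import re
from typing import List

# Matches a line such as: dbloadRecord "db/<fileName>.db"
_DBLOAD = re.compile(r'dbloadRecord\s+"(db/[^"]+\.db)"')


def loaded_db_files(log_text: str) -> List[str]:
    """List database files whose load shows the expected three-line pattern."""
    lines = [ln.strip() for ln in log_text.splitlines() if ln.strip()]
    loaded = []
    for i, line in enumerate(lines):
        match = _DBLOAD.fullmatch(line)
        if match is None:
            continue
        preceded = i > 0 and lines[i - 1] == "load data"
        followed = i + 1 < len(lines) and lines[i + 1] == "value=0=0x0"
        if preceded and followed:
            loaded.append(match.group(1))
    return loaded
```

Comparing the returned list with the database files the IOC is expected to load gives a first hint of where a boot went wrong. It does not replace reading the logfile.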
|
DIRC N2 system in a nutshell
|
For information and pictures, see
here.
|
Investigating light generator crate problems
|
For information and pictures, see
here.
|
How to produce a Backgrounds plot
|
This kind of plot:
http://www.slac.stanford.edu/BFROOT/www/Detector/DIRC/BackgroundMonitoring/Standalone/standaloneScope_ParallelFeb07.gif
can be obtained this way:
cd /nfs/bbr-nfs03/det/drc/drc/BackgroundMonitor/DrcBackgroundMonitor
Then do
source standalonePlots.csh 200701230 200702282 ParallelFeb07 180 YES kFALSE kFALSE
where 200701230 is the initial date and 200702282 the final date, both in this format:
YYYYMMDDS
where S is the shift (0=owl, 1=day, 2=swing).
ParallelFeb07 is a title used in the plot file name.
The other parameters can be understood by looking at the perl script.
The script produces many files, stored in the directory
/nfs/bbr-nfs03/det/drc/drc/BackgroundMonitor/Standalone
Run ls -lrt in this directory to see which plots were produced.
The job can take several minutes.
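The YYYYMMDDS format is easy to get wrong by hand, so here is a minimal sketch of helpers that build and decode such tags. The names shift_tag and parse_shift_tag are hypothetical and not part of the DIRC scripts. Only the format and the shift codes come from the text above.

```python
from datetime import date, datetime
from typing import Tuple

# Shift codes as given above: 0=owl, 1=day, 2=swing.
SHIFT_CODE = {"owl": 0, "day": 1, "swing": 2}
SHIFT_NAME = {code: name for name, code in SHIFT_CODE.items()}


def shift_tag(day: date, shift: str) -> str:
    """Build a YYYYMMDDS tag from a date and a shift name."""
    return f"{day:%Y%m%d}{SHIFT_CODE[shift]}"


def parse_shift_tag(tag: str) -> Tuple[date, str]:
    """Decode a YYYYMMDDS tag into (date, shift name)."""
    if len(tag) != 9 or not tag.isdigit():
        raise ValueError("expected 9 digits: YYYYMMDDS")
    code = int(tag[8])
    if code not in SHIFT_NAME:
        raise ValueError("shift digit must be 0, 1 or 2")
    # strptime also rejects impossible dates such as 20070230.
    return datetime.strptime(tag[:8], "%Y%m%d").date(), SHIFT_NAME[code]
```

With these, the two arguments of the example command decode to the owl shift of 2007-01-23 and the swing shift of 2007-02-28.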
|
How to update the ROM files for noisy channels and new DFB Gains
|
Just cut and paste these commands, being careful to change everything that needs to be updated (e.g. release number, etc.):
Create an updated release:
ssh benitezj@bbr-dev100
newrel -s $BFROOT/build/b/benitezj -t -o 17.2.0 JoseNew_17.2.0
srtpath
echo "FD_NUMBER = 7594" > .bbobjy
gmake all
addpkg workdir
gmake workdir.setup
Copy old ROM files into new release:
cp -R OldRelease/workdir/REGISTERS JoseNew_17.2.0/workdir/.
cp ROM0_20071026.dat ROM0_20080110.dat
..
Check which channels are currently masked:
ssh benitezj@bbr-dev20
cd /nfs/bbr-srv02/dataflow/rdf/Drc/FE
/nfs/bbr-nfs03/detector/drc/test_JK_16.4.0/DIRCscripts/maskedchannels.pl 10262007
exit
Edit the rom files:
cd /nfs/bbr-nfs03/detector/drc/JoseNew_17.2.0/workdir/REGISTERS
emacs ROM*.dat
...
Create .xtc files:
ssh benitezj@bbr-dev20
cd /nfs/bbr-nfs03/detector/drc/JoseNew_17.2.0/workdir/REGISTERS
/nfs/bbr-srv02/dataflow/rdf/Drc/FE/TestDrcRegisters < ROM0_20080110.dat
...
Copy the .xtc files into the production area:
su babardrc
cd /nfs/bbr-nfs03/detector/drc/JoseNew_17.2.0/workdir/REGISTERS
cp Drc00_01102008.xtc /nfs/bbr-srv02/dataflow/rdf/Drc/FE/.
...
exit
Create file identifiers for each file:
cd /nfs/bbr-nfs03/detector/drc/JoseNew_17.2.0
srtpath
source /nfs/bbr-nfs03/det/drc/drc/ir2_setup.src 17.2.0
ir2boot
cd /nfs/bbr-srv02/dataflow/rdf
createFileIdentifier -c -a /3/0/0/0/0 -d "ROM0-10Jan08 masks" Drc DrcRegisters FE/Drc00_01102008.xtc
(In this last line, the numbers highlighted in red must be changed for each ROM.)
cd Drc
su babardrc
cp new_rom.alias new_rom.alias_JB01102008
emacs new_rom.alias
to update the numbers.
exit
exit
Create New Keys:
ssh benitezj@bbr-dev100
cd /nfs/bbr-nfs03/detector/drc/JoseNew_17.2.0
srtpath
source /nfs/bbr-nfs03/det/drc/drc/ir2_setup.src 17.2.0
ir2boot
cd /nfs/bbr-srv02/dataflow/rdf/Drc
CfgUtils -v EditAlias -f new_rom.alias drc
CfgUtils -v UpdateTree -d "new register files" drc
Be careful with this last step: it may lock a database, so try to do it when colliding-beam data are not being taken.
(I'm not sure of this last statement.)
----Call Pilot to end the run, refresh run type list and start a new run----
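Note that the commands above use two different date conventions: the ROM .dat files carry YYYYMMDD (ROM0_20080110.dat), while the .xtc files and the maskedchannels.pl argument carry MMDDYYYY (Drc00_01102008.xtc, 10262007). The sketch below derives both tags from a single date to avoid mixing them up. The function name rom_date_tags is hypothetical, and the conventions are inferred only from the example file names in this section.

```python
from datetime import date
from typing import Dict


def rom_date_tags(day: date) -> Dict[str, str]:
    """Return the two date tags seen in the ROM update commands.

    'dat' follows ROM0_20080110.dat (YYYYMMDD);
    'xtc' follows Drc00_01102008.xtc (MMDDYYYY).
    ASSUMPTION: both conventions are inferred from the example names only.
    """
    return {"dat": day.strftime("%Y%m%d"), "xtc": day.strftime("%m%d%Y")}


# Example: rebuild the two ROM0 file names used above for 10 Jan 2008.
tags = rom_date_tags(date(2008, 1, 10))
dat_name = f"ROM0_{tags['dat']}.dat"
xtc_name = f"Drc00_{tags['xtc']}.xtc"
```

The names for the other ROMs are not spelled out in this section, so they should be checked against the existing files in the REGISTERS directory rather than generated.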
DIRC offline software
DIRC Reconstruction
Created 11-27-99, by Andreas
Hoecker,
modifications by: Joe
Major modifications: December 2005, Nicolas
Last modified: Thu Apr 26 08:20:59 PDT 2007
(Nicolas)