BaBar Data Distribution System
Dominique Boutigny May 26, 1999
The requirements for the BaBar distribution system
have been established by the CCG and are summarized in the following document.
The latest estimate of the data volume to transfer
is available in the following Excel
spreadsheet.
Event Selection :
The BaBar data will be written in four non-overlapping
output streams (from Bob Jacobsen):
-
Random Triggers -
This selection will be based on the Level 1 random trigger bits. It has
been requested by the background group that we write this as a separate
stream as soon as possible.
-
Calibration 2 Prongs -
mainly Bhabha + di-muons - This is a mixed sample of mu pairs and Bhabhas
selected using Level 3 results (e.g. no offline track finding, no PID).
Note that this might contain some events of interest to physics, particularly
taus.
-
Multi-hadrons
- This is the primary stream for the physics group. Nando has decided that
it will be defined by the "isMultiHadron" bit written by EventTagTools/TagEventSelector.
This tag bit is generated from the output of the full reconstruction during
prompt reco. More info on this selection is available at: /BFROOT/www/Organization/CollabMtgs/softwareMtgs/April_1999/rahatlou2.pdf
-
Bulk - everything
_not_ selected by one of the above. This will be used for trigger
studies. It may contain some physics events of interest, particularly for
tau and two-photon physics.
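The non-overlapping streaming decision above can be sketched as a first-match classifier. This is only an illustration, not the production selection code: the three boolean arguments are hypothetical stand-ins for the Level 1 random trigger bits, the Level 3 two-prong selection and the "isMultiHadron" tag bit, and the order in which they are tested is an assumption, since the text only requires that the streams do not overlap.

```python
def assign_stream(l1_random_trigger: bool,
                  l3_two_prong: bool,
                  is_multi_hadron: bool) -> str:
    """Return exactly one stream name, so that streams never overlap.

    All three arguments are hypothetical flags standing in for the real
    trigger and tag information.
    """
    # The test order below is an assumption; the source does not specify it.
    if l1_random_trigger:
        return "Random Triggers"
    if l3_two_prong:
        return "Calibration 2 Prongs"
    if is_multi_hadron:
        return "Multi-hadrons"
    # Bulk: everything not selected by one of the above.
    return "Bulk"
```

Because the function returns at the first match, an event that satisfies several selections still lands in a single stream, and an event that satisfies none goes to Bulk.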
How to select only "detector
is OK" data is a separate discussion.
During June, we will not be including detector status and quality bits
in the streaming decision as we do not expect that they will be truly representative
then. Eventually we would like to include these in the streaming decision.
Further streams can be added, with the constraint
that very large files will be written and transferred only when
full. A stream containing rare events would not fill its file fast enough
to be exported in a timely manner.
Each site will subscribe to the streams it wants
to receive.
If n files are available in a given stream, it
will be possible for a site to request only a fraction of the n files. Only complete
files will be distributed. Copying only part of the available files may result
in some of the events received being incomplete.
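The "fraction of n complete files" rule can be illustrated with a short sketch. The function name, its arguments and the choice to round down are all assumptions made for the example; the text only states that a site may ask for a fraction of the available files and that files are never split.

```python
import math


def files_to_ship(complete_files, fraction):
    """Pick a whole number of complete files for one subscribed stream.

    complete_files: list of names of files that are already full.
    fraction: share of the available files the site asks for, 0 < fraction <= 1.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # Rounding down is an assumption: the site never receives more than it
    # asked for, and only whole files are counted.
    count = math.floor(len(complete_files) * fraction)
    return complete_files[:count]
```

For example, a site asking for half of four complete files would receive the first two, each of them whole.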
Remark : At the beginning, only one
stream will be written. The multi-stream output will be introduced by mid-June.
Level of reconstruction :
The transfer units will be : RAW, REC, ESD, AOD and
TAG.
The data distribution system will produce Master
Files containing :
-
University : (TAG
+ AOD)
-
Regional Centers : (TAG
+ AOD) + ESD - ESD written separately
-
Mirror Site : (TAG
+ AOD) + (ESD) + REC + RAW - REC + RAW written
separately
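The master file contents listed above can be written down as a small lookup table. This is a sketch only: the dictionary layout and the helper name are hypothetical, and grouping REC and RAW into one separately written set is one reading of "REC + RAW written separately"; they could equally be two separate sets.

```python
# Master file groups per site category, as listed above.
# Each inner tuple is one set of data written together; different tuples
# are written separately.
MASTER_FILES = {
    "University": [("TAG", "AOD")],
    "Regional Center": [("TAG", "AOD"), ("ESD",)],
    # Assumption: REC and RAW are treated here as a single separate set.
    "Mirror Site": [("TAG", "AOD"), ("ESD",), ("REC", "RAW")],
}


def levels_for(site_category):
    """Return the flat list of reconstruction levels a site category receives."""
    return [level for group in MASTER_FILES[site_category] for level in group]
```

With this table, `levels_for("Regional Center")` gives TAG, AOD and ESD, while a mirror site receives every level.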
Distribution Strategy :
Whenever a stream file is complete, but not more
often than once a week, the master files will be copied according to the
site requirements :
-
Selected stream
-
Level of reconstruction
-
Media type (DLT 7000 or REDWOOD; Eagle cartridges
will be added later)
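The copy trigger described above (a complete stream file, at most one copy per week) can be sketched as follows. The function and argument names are hypothetical, and treating a site that has never been served as immediately eligible is an assumption.

```python
from datetime import datetime, timedelta

# "Not more often than once a week"
MIN_INTERVAL = timedelta(weeks=1)


def copy_due(has_complete_file, last_copy, now):
    """Decide whether the master files should be copied for a site.

    has_complete_file: True if at least one stream file is full.
    last_copy: datetime of the previous copy, or None if there was none.
    now: current datetime.
    """
    if not has_complete_file:
        return False
    if last_copy is None:
        # Assumption: the first copy is not delayed.
        return True
    return now - last_copy >= MIN_INTERVAL
```

A complete file that appears three days after the previous copy would therefore wait until the week has elapsed.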
The copies will be handled by a dedicated person (to
be hired ?). This person will also be in charge of extracting the cartridges
from the silo and shipping them to their destinations.
The central BaBar at SLAC will not be responsible
for handling the tapes at the remote sites.
The whole condition and configuration databases
corresponding to the transferred data will be distributed at the same time.
Later, it will be possible to transfer only the necessary part of the condition
and configuration databases.
The condition and configuration databases will
also be made available separately on a regular basis.
A database to keep track of cartridge movements
will be necessary, but does not exist yet.
The export system will run on dedicated machines
(probably datamove3).
Coordination :
Each remote site will designate a contact person
with the following responsibilities:
-
Supervision of the tape copies for his site
-
Responsibility for providing and maintaining a sufficient
stock of cartridges at SLAC
-
Responsibility for handling the cartridges at the remote
site
The cost of the tapes will be borne by the remote
sites.
Small event samples :
Small event samples will be distributed via anonymous
ftp to ftp-babar.slac.stanford.edu. This server will have access to event
samples located on datamove3.
The condition database will be available from
the same ftp server and will be updated on a regular basis.
Remote site requirements :
RAL :
-
Streams : Random
Triggers - Calibration 2-prong - Multihadrons
-
Reco level :
REC
-
Media :
DLT
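A site requirement such as the RAL entry above could be recorded and checked against the streams, reconstruction levels and media named in this note. The record layout, the key names and the `check_requirement` helper are hypothetical; only the allowed values come from the text.

```python
# Values taken from the text above.
STREAMS = {"Random Triggers", "Calibration 2 Prongs", "Multi-hadrons", "Bulk"}
LEVELS = {"RAW", "REC", "ESD", "AOD", "TAG"}
MEDIA = {"DLT", "REDWOOD"}


def check_requirement(req):
    """Return a list of problems found in a site requirement (empty if valid)."""
    problems = []
    for stream in req["streams"]:
        if stream not in STREAMS:
            problems.append(f"unknown stream: {stream}")
    if req["reco_level"] not in LEVELS:
        problems.append(f"unknown reco level: {req['reco_level']}")
    if req["media"] not in MEDIA:
        problems.append(f"unknown media: {req['media']}")
    return problems


# The RAL requirement as listed above.
RAL = {
    "site": "RAL",
    "streams": ["Random Triggers", "Calibration 2 Prongs", "Multi-hadrons"],
    "reco_level": "REC",
    "media": "DLT",
}
```

Calling `check_requirement(RAL)` returns an empty list, since every value in the RAL entry is one of those defined in this note.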
Contact Persons in Remote Sites :