
Data Management in the BaBar Event Store

To avoid having data disappear off disk when the staging system gets busy, we can manually load both complete runs and specific collections to disk.  The following sections describe that process.

The amount of disk space used by these collections is normally reported every other week to the Physics/Reconstruction/Simulation Forum.

Complete Runs


We explicitly manage the raw and reco information for complete runs that is kept on disk.  The aim is to provide stability for people who need large samples of raw and reco to study, while still being able to accommodate new specific requests.

Several web pages maintain the lists of on-disk data. At the moment this service is not available for physboot2 and newer federations.

For physboot, please see:

  http://www.slac.stanford.edu/babar-internal/keptdata/ana3/requests.html

For analboot2, please see:

  http://www.slac.stanford.edu/babar-internal/keptdata/ana2/requests.html

For data in sp3analboot, please see:

  http://www.slac.stanford.edu/babar-internal/keptdata/sp3/requests.html

For data in simuboot (sp4analboot), please see:

  http://www.slac.stanford.edu/babar-internal/keptdata/sp4/requests.html

If you discover that one of these collections does not have all the listed data on disk, that's an error; please let us know.

We do the actual loading, etc., during the scheduled event store outages. Please make requests at least 12 hours before the outage starts so we have time to handle them.

Collections of Specific Digis

The above method stores large numbers of contiguous events on disk. This is only efficient if you want to look at a large fraction of those events. For certain purposes, you want to take a comparatively sparse collection of events and reprocess them from digis. We provide "digi copies of collections" to make this more efficient. Once you have a collection of events, even if quite sparse, we can load a copy of just the digis from just these events onto disk, where they will stay until you delete them. Note that we are currently only copying the raw data (digis); you can then run Bear on these collections as often as needed to reprocess the data for your own use.

Note that we are currently only doing this for the analboot2 federation.

Paul Raines has provided a web-based system for making and checking on requests. It is available at http://www.slac.stanford.edu/babar-internal/collreq/requests.html and requires a BaBar account name and password.

To check on a particular run or collection, enter the run range and press "show". The status codes are:

  • REQUESTED - in the queue
  • PENDING - actually being processed
  • DONE - finished and thought to be OK
  • FAILED - failed in processing, will be retried
  • ON HOLD - failed twice, now pending some intervention
  • CANCELLED - has been removed from the queue
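
If you check request status from a script, the codes above can be grouped by what they imply for the requester. The following is only a sketch: the function name request_next_step and the grouping into "wait", "ready", "contact" and "removed" are illustrative assumptions and not part of the request system.

```shell
# Hypothetical helper, not part of the BaBar request system.
# Prints a one-word hint for a status code from the list above:
#   wait    - still in the queue, being processed, or due for a retry
#   ready   - finished and thought to be OK
#   contact - failed twice and pending some intervention
#   removed - taken out of the queue
request_next_step() {
    case "$1" in
        REQUESTED|PENDING|FAILED) echo wait ;;    # FAILED will be retried
        DONE)                     echo ready ;;
        "ON HOLD")                echo contact ;;
        CANCELLED)                echo removed ;;
        *)
            echo "unknown status code: $1" >&2
            return 1
            ;;
    esac
}
```

Remember to quote "ON HOLD" when passing it, since the code contains a space.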

To add a collection to the list to be copied, use the "Make request" button. It will open a panel in which to enter the input and output collections, and the run number of the start of the data. We use that run number to optimize the order in which we process tapes. Generally, the output digi collections should have "digis" in their collection names. For example, copy from "/groups/GroupName/collection" to "/groups/GroupName/digis/collection"; see also the web page above for examples.
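
The naming convention in the example above can be applied mechanically when preparing many requests. This is a minimal sketch, assuming the input lives under /groups/<GroupName>/; the function name digi_collection_name is hypothetical, and the convention itself is a recommendation rather than a requirement.

```shell
# Hypothetical helper, not a BaBar tool: derive the suggested output
# collection name by inserting "digis" after the group directory.
digi_collection_name() {
    case "$1" in
        /groups/*/*)
            rest=${1#/groups/}     # GroupName/collection
            group=${rest%%/*}      # GroupName
            tail=${rest#*/}        # collection (may contain further slashes)
            printf '/groups/%s/digis/%s\n' "$group" "$tail"
            ;;
        *)
            echo "expected /groups/<GroupName>/<collection>, got: $1" >&2
            return 1
            ;;
    esac
}
```

Calling digi_collection_name /groups/GroupName/collection prints /groups/GroupName/digis/collection, matching the example in the text.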

There are also command-line tools for making large numbers of requests. If you'd like to use these, please contact me directly.

Temporarily staging data to disk

The "collstagein" command can temporarily load data to disk. The "-help" option describes usage and options; additional information is available in an initial Hypernews post, and Hypernews can be searched for more recent updates. A typical command to stage the raw data for collection /users/me/myevents would be:
  collstagein -wait -include raw /users/me/myevents
Note that you need to have OO_FD_BOOT set (e.g. via the analboot2 command) and to have done a srtpath to release 8.6.2d or a more recent one.
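
To stage several collections in one go, the command above can be wrapped in a small loop. The sketch below is an illustration only: the function name stage_raw and the DRY_RUN variable are assumptions introduced here. It uses only the collstagein options shown above, checks that OO_FD_BOOT is set, and does not check which release srtpath selected.

```shell
# Hypothetical wrapper around collstagein; not a BaBar tool.
# Stages the raw data for every collection named on the command line.
# Set DRY_RUN=1 to print the commands instead of running them.
stage_raw() {
    if [ -z "${OO_FD_BOOT:-}" ]; then
        echo "OO_FD_BOOT is not set; select a federation first (e.g. analboot2)" >&2
        return 1
    fi
    for coll in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo collstagein -wait -include raw "$coll"
        else
            # Stop at the first collection that fails to stage.
            collstagein -wait -include raw "$coll" || return 1
        fi
    done
}
```

A dry run such as DRY_RUN=1 with stage_raw /users/me/myevents lets you inspect the commands before anything is staged.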

Disk Usage

There is a rough report available that contains information on the disk space used by a particular user or group, but you have to dig for it. Look for your group name. The "aio" files are the "all-in-one" data produced by the digi-copying process and similar.

You can also use the "lscoll" script to get summaries. For example,

   lscoll /groups/.../digis/...
will list all the groups that have copied digis using the standard names (not all have, though), along with the total number of events and collections.

Page maintained by Bob Jacobsen, Bob_Jacobsen@lbl.gov
  Last updated July 5, 2001