Event Store Data Management
Normally, only micro- and tag-level data are kept on disk permanently. Raw, rec, and esd (mini) data are kept on tape and must be staged to disk before a job that needs them is run.
There are three ways to get data staged to disk. Which one to use depends on your needs.
Temporarily staging data to disk
The collstagein command temporarily loads data to disk. It is the right choice when you only want to run a job on a specific collection once.
The -help option describes usage and options. Additional information is available in the Selected Application Section (see collstagein).
A typical command to stage the raw data for collection /users/me/myevents is
collstagein -wait -include raw /users/me/myevents
When the data is staged, you will be notified by e-mail, with one message for each host on which the data is staged.
To use this command in a script that you submit to a batch queue, use the -wait option, which keeps the command from exiting until all the files appear on disk. Note that this option does not time out: if the data cannot be staged, the script keeps waiting.
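Because -wait does not time out, a batch script may want to impose its own deadline. The following is a minimal sketch, assuming the GNU coreutils timeout utility is available on the batch host. The function name stage_with_deadline and the six-hour limit in the usage comment are illustrative and are not part of collstagein; only the -wait and -include raw options shown above are used.

```shell
#!/bin/sh
# Sketch: stage raw data for a collection, but give up after a deadline.
# Assumes GNU coreutils `timeout` is installed; stage_with_deadline is a
# hypothetical helper name, not part of collstagein.
stage_with_deadline() {
    collection=$1
    max_seconds=$2
    # timeout exits with status 124 if collstagein is still running
    # when the deadline expires.
    timeout "$max_seconds" collstagein -wait -include raw "$collection"
}

# Example use inside a batch script (commented out so that sourcing this
# file does not start staging):
# stage_with_deadline /users/me/myevents 21600 || exit 1
# ...run the analysis job here...
```

An exit status of 124 indicates that the deadline passed before staging finished; any other non-zero status comes from collstagein itself.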
The data you staged using the collstagein command becomes eligible for purging from disk 12 hours after the last access to it. The actual purging time depends on overall disk usage and other staging activity. The rule is that, when room is needed for new data, the oldest data eligible for purging is purged first.
Therefore one should not assume that the data is on disk the day after the collstagein command was run. The best way of ensuring that the data is on disk at job run time is to use collstagein with the -wait option in a script for batch submission.
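The purging rule described above can be illustrated with a short sketch. It assumes a hypothetical input format (one line per staged file, giving a name and a last-access time in seconds since the epoch) and takes "oldest" to mean least recently accessed. The real staging system may track this differently, so treat it as an illustration of the ordering only.

```shell
#!/bin/sh
# Sketch of the purge ordering only; this is not the real purger.
# Input (stdin): "<name> <last_access_epoch_seconds>" per line
#                (hypothetical format, assumed for illustration).
# Argument: current time in epoch seconds.
# Output: names eligible for purging, least recently accessed first.
purge_order() {
    now=$1
    grace=$((12 * 60 * 60))   # eligible 12 hours after the last access
    awk -v now="$now" -v grace="$grace" '$2 + grace <= now { print $2, $1 }' |
        sort -n |
        awk '{ print $2 }'
}
```

Files accessed within the last 12 hours never appear in the output, which is why a job that touches its data regularly is less likely to lose it.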
Long Term Kept Data
You can use the KeptData service when you need a collection to be available on disk for a relatively long time (weeks or months), or when you need to access the data several times during a shorter period (a few days). To use the KeptData service, either to submit a request or to check what is already on disk, go to the following URLs (a command-line version is not available yet):
For physboot1 see:
For physboot2 see:
For analboot2 see:
For sp3analboot see:
For simuboot (SLAC data only) see:
Data from newer physics federations and MC data produced by remote sites are not yet covered by this server. Support for them is being worked on.
If you discover that one of the collections listed on KeptData pages does not have all the listed data on disk, that's an error. Please let us know.
Collections of Specific Digis
The above method stores large numbers of contiguous events on disk, which is only efficient if you want to look at a large fraction of those events. For some purposes, however, you want to take a comparatively sparse collection of events and reprocess them from digis. We provide "digi copies of collections" to make this more efficient. Once you have a collection of events, even a quite sparse one, we can load a copy of just the digis from just those events onto disk, where they will stay until you delete them. This is also useful when you want to export such skim collections to your institution. Note that we currently copy only the raw data (digis); you can then run Bear on these collections as often as needed to reprocess the data for your own use.
Note that we are currently only doing this for the analboot2 federation.
Paul Raines has provided a web-based system for making and checking the requests. It's available from http://www.slac.stanford.edu/babar-internal/collreq/requests.html. You need a BaBar account name and password to use it.
To check a particular run or collection, enter the run range and press show. The status codes are:
- REQUESTED - in the queue
- PENDING - actually being processed
- DONE - finished and thought to be OK
- FAILED - failed in processing, will be retried
- ON HOLD - failed twice, now pending some intervention
- CANCELLED - has been removed from the queue
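As a reading aid, the status codes above can be grouped by what they mean for the requester. The sketch below is only an illustration: the function name next_step and its four outcome words are made up here, and the web page remains the authoritative source for the status of a request.

```shell
#!/bin/sh
# Map a request status (as listed above) to what it means for the requester.
# next_step and its output words are hypothetical, for illustration only.
next_step() {
    case $1 in
        REQUESTED|PENDING|FAILED) echo wait ;;     # queued, running, or will be retried
        DONE)                     echo ready ;;    # finished and thought to be OK
        "ON HOLD")                echo blocked ;;  # failed twice, needs intervention
        CANCELLED)                echo removed ;;  # no longer in the queue
        *)                        echo unknown; return 1 ;;
    esac
}
```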
To add a collection to the list to be copied, use the Make request button. It will open a panel in which to enter the input and output collections, and the run number of the start of the data. We use that run number to optimize the order in which we process tapes. Generally, the output digi collections should have digis in their collection names. For example, copy from /groups/GroupName/collection to /groups/GroupName/digis/collection. See also the information above for examples.
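The naming convention in the example above (insert digis before the last component of the collection name) can be written as a small helper. This is a sketch only; digi_collection_name is a hypothetical name, and the request form does not require that names be generated this way.

```shell
#!/bin/sh
# Derive the suggested output collection name by inserting "digis"
# before the last path component. digi_collection_name is a
# hypothetical helper, not an existing BaBar tool.
digi_collection_name() {
    printf '%s/digis/%s\n' "$(dirname "$1")" "$(basename "$1")"
}
```

For example, digi_collection_name /groups/GroupName/collection prints /groups/GroupName/digis/collection.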
There are also command-line tools for making large numbers of requests. If you'd like to use these, please contact Bob Jacobsen directly.
Page Owner: Jacek Becla
Last Update: June 13, 2002