BABAR DATABASE WORKSHOP
LBNL 8th-9th July 1997

Draft: 15th July 1997
This is a report on the workshop. It is incomplete and will be updated as more information is received from people who took notes during several of the sessions. I'm also hoping that we'll be able to scan in the talks that are not already in electronic form.
Goal:
The principal goal was to determine a persistence model which we will use to store and retrieve our event data: raw, simulated, reconstructed and partially analyzed.
Agenda:
Tuesday 8th July
- 9:15-10:15 The BaBar Reconstruction Environment & Data Model (Bob Jacobsen), available as PostScript.
- 10:15-10:30 Schedule Implications (Frank Porter)
- 11:00-12:15 The RD45 Approach to Persistence (Dirk Duellmann)
- 13:30-13:45 Geant4 Persistence (Gabriele Cosmo)
- 13:45-14:00 Prototype BaBar Persistent Data Model (Pavel Binko)
- 14:00-14:30 Objectivity Collection Classes (David Quarrie), in PDF format (1-Up or 2-Up) or in PostScript 2-Up format.
- 14:30-15:30 Proposed Strategies (David Quarrie), in PDF format (1-Up or 2-Up) or in PostScript 2-Up format.
- 15:45-16:30 Confronting strategies with use cases (Bob Jacobsen)
Two different sets of use cases were presented by Bob and taken as topics for discussion on Wednesday. The two sets are available on the BABAR HyperNews Forums. In addition, Bob stressed that the following is not a required use case, and that having to explicitly preload data was acceptable:
Trk* t;
t->gimeHot()->gimeHit()->gimeDigi()->gimeGHit();
Wednesday 9th July
- 9:00-10:30 Working Session (Online Event Processing & Physics Analysis Issues)
The following notes were taken by David Quarrie and are rather brief. Please let me know if there are important points that were missed.
OEP
Whilst current plans for OEP require it to have access to the conditions database, this is not on a per-event basis during normal triggering operation and is not expected to cause a bandwidth or lock interaction rate problem. The baseline design has OEP writing sequential files to a spool disk for subsequent input to prompt reconstruction (PR). Output to the event store is therefore downstream of PR. Much of the discussion centered around the problems that would be posed by a fully-persistent solution if reconstruction classes were to be used within the OEP environment. One solution discussed was to create all the persistent objects in a single container and then just delete the container at the end of the processing, perhaps amortizing this across several events so as to minimize interaction with the lock manager. Concerns were expressed about the size of the client cache and lock overhead at the required 2 kHz event processing rate. A transient solution would be preferable.
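The amortization idea can be illustrated with a small counting sketch. It is not taken from the workshop material: the class name AmortizedScratch and its methods are hypothetical, and it assumes that each container deletion costs one interaction with the lock manager. It only shows how the deletion count falls when one scratch container is shared across a batch of events.

```cpp
#include <cstddef>

// Hypothetical stand-in for a scratch container that holds the persistent
// objects created while processing events. No real database is involved.
class AmortizedScratch {
public:
    explicit AmortizedScratch(std::size_t eventsPerContainer)
        : eventsPerContainer_(eventsPerContainer),
          eventsInCurrent_(0),
          deletions_(0) {}

    // Call once at the end of each event. The container is deleted only
    // after eventsPerContainer events have been processed in it.
    void endEvent() {
        ++eventsInCurrent_;
        if (eventsInCurrent_ >= eventsPerContainer_) {
            ++deletions_;         // stand-in for deleting the container
            eventsInCurrent_ = 0; // a fresh container starts here
        }
    }

    // Number of container deletions so far (assumed: one lock-manager
    // interaction each).
    std::size_t deletions() const { return deletions_; }

private:
    std::size_t eventsPerContainer_;
    std::size_t eventsInCurrent_;
    std::size_t deletions_;
};
```

Under these assumptions, one second of running at 2 kHz with 100 events per container gives 20 deletions rather than 2000. The client cache concern is not addressed by this sketch, since a larger batch keeps more objects alive at once.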
Physics Analysis
My notes on this are too sketchy to be useful. Most of the discussion took place during the Use Cases session.
- 11:00-12:15 Working Session (Use Cases)
Maria Grazia Pia added another set of use cases to those already presented by Bob. Notes for the discussions on each of the 3 sets were taken by Gregory Dubois-Felsmann, David Quarrie & David Aston.
- Bob's initial use cases. Gregory's notes.
- Maria Grazia Pia's use cases. David Quarrie's notes. These use cases again highlighted the need for a mechanism to share data samples between disjoint federated databases, and to perform set operations (e.g. union) on event samples in such a way that duplicate events (identified by some event Id - event number, run number, processing release number?) are excluded from the resultant samples. [draft notes - will try to write more.]
- Bob's 2nd set of use cases. David Aston's notes.
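As an illustration of the set operation mentioned in the notes on Maria Grazia Pia's use cases, here is a minimal sketch of a union of two event samples that drops duplicates. The EventId struct and the unionSamples function are hypothetical; the fields of the event Id were still an open question at the workshop, so the three used here are an assumption.

```cpp
#include <set>
#include <tuple>
#include <vector>

// Hypothetical event identifier. The choice of fields is an assumption.
struct EventId {
    unsigned run;
    unsigned event;
    unsigned release;  // processing release number
};

// Order by (run, event, release) so that EventId can be stored in a std::set.
inline bool operator<(const EventId& a, const EventId& b) {
    return std::tie(a.run, a.event, a.release) <
           std::tie(b.run, b.event, b.release);
}

// Union of two event samples. An event that appears in both inputs (or
// twice in one input) appears exactly once in the result.
std::vector<EventId> unionSamples(const std::vector<EventId>& a,
                                  const std::vector<EventId>& b) {
    std::set<EventId> seen(a.begin(), a.end());
    seen.insert(b.begin(), b.end());
    return std::vector<EventId>(seen.begin(), seen.end());
}
```

The result comes back sorted by event Id rather than in input order. Note that with the release number in the Id, the same event processed under two releases counts as two distinct entries; whether that is desired depends on how the Id is finally defined.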
- 13:30-15:30 Working Session (Schedule & Tools)
The following tools were identified as being necessary, with dates where a schedule was set. The list is not prioritized.
- Fast backup & restore of development databases (9/9/97)
- Leak Checkers (mark & sweep)
- Event Dump with Schema Name
- Load database from .xdr files (9/9/97)
- Browsing Tools
- Performance Tools (monitoring, statistics, reclustering)
- Event Intelligent Deep Copy for data exchange between disjoint federations. Collapse into a single container & re-expand into optimal placement. Selection of subtrees within the event & cutoff points.
- Conditions database deep copy.
- Database preallocation tools
- Schema preallocation tools
- Support for Event ID (event number, run number, processing release number?)
- Support for partial reprocessing (versions?) with a release tagging compatible versions.
- Cloning of event samples (uses Event Intelligent Deep Copy).
- Selections - write out subset as new collection (9/9/97)
- Selections - multiple input collections with correct handling of duplicates etc.
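The "Event Intelligent Deep Copy" item in the list above mentions selection of subtrees and cutoff points. A minimal sketch of that idea follows, on a plain in-memory tree. The Node struct, the deepCopy function and the use of node names as cutoff points are all assumptions made for illustration; nothing here reflects a real database API or the eventual tool design.

```cpp
#include <memory>
#include <set>
#include <string>
#include <vector>

// Hypothetical node of an event tree, e.g. "event" -> "tracks" -> "hits".
struct Node {
    std::string name;
    std::vector<std::unique_ptr<Node>> children;
    explicit Node(const std::string& n) : name(n) {}
};

// Deep-copy the tree rooted at src. Any child whose name is listed in
// cutoffs is skipped, together with everything below it.
std::unique_ptr<Node> deepCopy(const Node& src,
                               const std::set<std::string>& cutoffs) {
    std::unique_ptr<Node> copy(new Node(src.name));
    for (const auto& child : src.children) {
        if (cutoffs.count(child->name) == 0) {
            copy->children.push_back(deepCopy(*child, cutoffs));
        }
    }
    return copy;
}
```

Calling deepCopy on a "tracks" node rather than on the event root selects that subtree only. Collapsing the copy into a single container and re-expanding it into optimal placement is a separate step that this sketch does not attempt.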
The major topic of the schedule was the date when physics analysis users would have "production" access to database-based event samples. This is currently foreseen as sometime in Dec 1997 and much of the discussion centered around whether this was early enough for new algorithms to be used in creation of data for "The Book".
- 15:45-16:30 Closing Summaries
Additional Conclusions:
A major outcome of the workshop was not discussed much during the working sessions themselves, but immediately thereafter in further informal discussions. That was the decision on the event API. The proposal is that the processing of an event within the reconstruction environment be broken into 7-12 chunks (e.g. tracking) with their associated input & output data. A transient strategy will be adopted, based on conventional C++ classes & pointers, by requiring that client software explicitly request the data for each such chunk. The corresponding C++ pointers will be NULL if the data is not explicitly requested. At the end of processing, the transient event will be "walked" and the corresponding persistent data associated with the event.
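To make the proposal concrete, here is a minimal sketch of what such a transient event might look like. All names (TransientEvent, TrackingData, CalorimeterData, requestTracking, walk) are hypothetical, only two of the 7-12 chunks are shown, and the "request" simply creates an empty object where a real implementation would read from the event store. It is one possible reading of the proposal, not the agreed API.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical transient data for two of the proposed chunks.
struct TrackingData { std::vector<double> trackMomenta; };
struct CalorimeterData { std::vector<double> clusterEnergies; };

// Hypothetical transient event. Clients see conventional C++ pointers,
// which are null until the corresponding chunk is explicitly requested.
class TransientEvent {
public:
    // Explicit requests: stand-ins for loading a chunk from the event store.
    void requestTracking() {
        if (!tracking_) tracking_.reset(new TrackingData);
    }
    void requestCalorimeter() {
        if (!calorimeter_) calorimeter_.reset(new CalorimeterData);
    }

    // Plain pointers for client code; null if the chunk was not requested.
    TrackingData* tracking() const { return tracking_.get(); }
    CalorimeterData* calorimeter() const { return calorimeter_.get(); }

    // "Walk" the event at the end of processing and report which chunks
    // exist, i.e. which would be associated with persistent data.
    std::vector<std::string> walk() const {
        std::vector<std::string> present;
        if (tracking_) present.push_back("tracking");
        if (calorimeter_) present.push_back("calorimeter");
        return present;
    }

private:
    std::unique_ptr<TrackingData> tracking_;
    std::unique_ptr<CalorimeterData> calorimeter_;
};
```

A client that calls only requestTracking() gets a non-null tracking() pointer and a null calorimeter() pointer, so it must check a pointer before use unless it has requested that chunk itself.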


DRQuarrie@LBL.Gov