
BABAR DATABASE WORKSHOP

LBNL 8th-9th July 1997

Draft: 15th July 1997

This is a report on the workshop. It is incomplete and will be updated as more information is received from people who took notes during several of the sessions. I'm also hoping that we'll be able to scan in the talks that are not already in electronic form.

Goal:

The principal goal was to determine the persistence model that we will use to store and retrieve our event data, whether raw, simulated, reconstructed or partially analyzed.


Agenda:

Tuesday 8th July

  • 9:15-10:15 The BaBar Reconstruction Environment & Data Model (Bob Jacobsen), available as PostScript.
  • 10:15-10:30 Schedule Implications (Frank Porter)
  • 11:00-12:15 The RD45 Approach to Persistence (Dirk Duellmann)
  • 13:30-13:45 Geant4 Persistence (Gabriele Cosmo)
  • 13:45-14:00 Prototype BaBar Persistent Data Model (Pavel Binko)
  • 14:00-14:30 Objectivity Collection Classes (David Quarrie), in PDF format (1-Up or 2-Up) or in PostScript 2-Up format.
  • 14:30-15:30 Proposed Strategies (David Quarrie), in PDF format (1-Up or 2-Up) or in PostScript 2-Up format.
  • 15:45-16:30 Confronting strategies with use cases (Bob Jacobsen)

Bob presented two different sets of use cases, which were taken as topics for discussion on Wednesday. Both sets are available on the BABAR HyperNews Forums. In addition, Bob stressed that the following chained navigation is not a required use case, and that having to explicitly preload data is acceptable:

Trk* t;
    t->gimeHot()->gimeHit()->gimeDigi()->gimeGHit();

 

Wednesday 9th July

  • 9:00-10:30 Working Session (Online Event Processing & Physics Analysis Issues)

The following notes were taken by David Quarrie and are rather brief. Please let me know if there are important points that were missed.

OEP

Whilst current plans for OEP require it to have access to the conditions database, this is not on a per-event basis during normal triggering operation and is not expected to cause a bandwidth or lock interaction rate problem. The baseline design has OEP writing sequential files to a spool disk for subsequent input to prompt reconstruction (PR). Output to the event store is therefore downstream of PR. Much of the discussion centered on the problems that a fully persistent solution would pose if reconstruction classes were to be used within the OEP environment. One solution discussed was to create all the persistent objects in a single container and then simply delete the container at the end of processing, perhaps amortizing this across several events so as to minimize interaction with the lock manager. Concerns were expressed about the size of the client cache and the lock overhead at the required 2 kHz event processing rate. A transient solution would be preferable.
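The idea of amortizing container deletion across several events can be sketched roughly as follows. This is only an illustration in plain C++: ScratchObject, ScratchContainer and ScratchPool are hypothetical names, not classes from the BaBar software or the database product, and a real container would live in the database rather than in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a persistent object created during reconstruction.
struct ScratchObject {
    int eventNumber;
};

// Hypothetical stand-in for a database container: it owns every object
// created in it, so dropping the container drops them all at once.
class ScratchContainer {
public:
    ScratchObject* create(int eventNumber) {
        objects_.push_back(
            std::make_unique<ScratchObject>(ScratchObject{eventNumber}));
        return objects_.back().get();
    }
    std::size_t size() const { return objects_.size(); }

private:
    std::vector<std::unique_ptr<ScratchObject>> objects_;
};

// Replaces the scratch container once every `eventsPerContainer` events, so
// the (assumed expensive) delete happens once per batch, not once per event.
class ScratchPool {
public:
    explicit ScratchPool(std::size_t eventsPerContainer)
        : eventsPerContainer_(eventsPerContainer),
          container_(std::make_unique<ScratchContainer>()) {
        assert(eventsPerContainer > 0);
    }

    ScratchContainer& current() { return *container_; }

    // Call at the end of each event.
    void endEvent() {
        if (++eventsInContainer_ == eventsPerContainer_) {
            // Assigning a fresh container destroys the old one and its objects.
            container_ = std::make_unique<ScratchContainer>();
            eventsInContainer_ = 0;
            ++containerDeletes_;
        }
    }

    std::size_t containerDeletes() const { return containerDeletes_; }

private:
    std::size_t eventsPerContainer_;
    std::size_t eventsInContainer_ = 0;
    std::size_t containerDeletes_ = 0;
    std::unique_ptr<ScratchContainer> container_;
};
```

With a batch size of 4, for example, ten events would trigger two container deletions instead of ten. The trade-off is that objects from up to a full batch of events stay alive at once, which bears on the client cache size concern raised above.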

Physics Analysis

My notes on this are too sketchy to be useful. Most of the discussion took place during the Use Cases session.

  • 11:00-12:15 Working Session (Use Cases)

Maria Grazia Pia added another set of use cases to those already presented by Bob. Notes for the discussions on each of the 3 sets were taken by Gregory Dubois-Felsmann, David Quarrie & David Aston.

  1. Bob's initial use cases. Gregory's notes.
  2. Maria Grazia Pia's use cases. David Quarrie's notes. These use cases again highlighted the need for a mechanism to share data samples between disjoint federated databases, and to perform set operations (e.g. union) on event samples in such a way that duplicate events (identified by some event ID - event number, run number, processing release number?) are excluded from the resulting samples. [Draft notes; will try to write more.]
  3. Bob's 2nd set of use cases. David Aston's notes.
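As a rough illustration of the set operations mentioned in item 2, the sketch below forms the union of two event samples while excluding duplicates. The EventId fields follow the tentative list in the notes (run number, event number, processing release number); the struct, the EventSample alias and the unionOfSamples function are hypothetical names chosen for this example, and a real implementation would operate on database collections rather than in-memory vectors.

```cpp
#include <cassert>
#include <initializer_list>
#include <set>
#include <tuple>
#include <vector>

// Hypothetical event identifier; the notes leave the exact fields open.
struct EventId {
    unsigned run;
    unsigned event;
    unsigned release;  // processing release number
};

// Strict ordering so that EventId can be used as a std::set key.
inline bool operator<(const EventId& a, const EventId& b) {
    return std::tie(a.run, a.event, a.release) <
           std::tie(b.run, b.event, b.release);
}

using EventSample = std::vector<EventId>;

// Union of two samples: every event from either input appears exactly once,
// in order of first appearance.
EventSample unionOfSamples(const EventSample& a, const EventSample& b) {
    std::set<EventId> seen;
    EventSample result;
    for (const EventSample* sample : {&a, &b}) {
        for (const EventId& id : *sample) {
            if (seen.insert(id).second) {  // true only the first time id is seen
                result.push_back(id);
            }
        }
    }
    return result;
}
```

Whether the processing release number belongs in the key is exactly the open question in the notes: with it included, as here, the same event reprocessed under a different release counts as a distinct entry.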
  • 13:30-15:30 Working Session (Schedule & Tools)

The following tools were identified as necessary, with scheduled dates where given. The list is not prioritized.

  • Fast backup & restore of development databases (9/9/97)
  • Leak Checkers (mark & sweep)
  • Event Dump with Schema Name
  • Load database from .xdr files (9/9/97)
  • Browsing Tools
  • Performance Tools (monitoring, statistics, reclustering)
  • Event Intelligent Deep Copy for data exchange between disjoint federations. Collapse into a single container & re-expand into optimal placement. Selection of subtrees within the event & cutoff points.
  • Conditions database deep copy.
  • Database preallocation tools
  • Schema preallocation tools
  • Support for Event ID (event number, run number, processing release number?)
  • Support for partial reprocessing (versions?) with a release tag identifying compatible versions.
  • Cloning of event samples (uses Event Intelligent Deep Copy).
  • Selections - write out subset as new collection (9/9/97)
  • Selections - multiple input collections with correct handling of duplicates etc.

The major schedule topic was the date when physics analysis users would have "production" access to database-based event samples. This is currently foreseen as sometime in Dec 1997, and much of the discussion centered on whether this was early enough for new algorithms to be used in the creation of data for "The Book".

  • 15:45-16:30 Closing Summaries

Additional Conclusions:

A major outcome of the workshop, the decision on the event API, was discussed little during the working sessions themselves but emerged immediately afterwards in further informal discussions. The proposal is that the processing of an event within the reconstruction environment be broken into 7-12 chunks (e.g. tracking), each with its associated input & output data. A transient strategy will be adopted, based on conventional C++ classes & pointers, requiring client software to explicitly request the data for each such chunk. The corresponding C++ pointers will be NULL if the data has not been explicitly requested. At the end of processing, the transient event will be "walked" and the corresponding persistent data associated with the event.
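A minimal sketch of what such an event API might look like is given below. It is an assumption-laden illustration, not the agreed interface: TransientEvent, TrackingData, CalorimeterData and all method names are hypothetical, only two of the proposed 7-12 chunks are shown, and the request methods create empty placeholder data where a real implementation would load it from the event store.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical chunk payloads; the notes give "tracking" as one example chunk.
struct TrackingData { int nTracks = 0; };
struct CalorimeterData { int nClusters = 0; };

// Transient event built from conventional C++ classes and pointers.
// A chunk pointer stays null until the client explicitly requests the chunk.
class TransientEvent {
public:
    // Explicit requests. Here they only create placeholder data; a real
    // version would read the chunk from the persistent store.
    void requestTracking() {
        if (!tracking_) tracking_ = std::make_unique<TrackingData>();
    }
    void requestCalorimeter() {
        if (!calorimeter_) calorimeter_ = std::make_unique<CalorimeterData>();
    }

    // Plain pointers for client code; null if the chunk was never requested.
    TrackingData* tracking() const { return tracking_.get(); }
    CalorimeterData* calorimeter() const { return calorimeter_.get(); }

    // "Walk" the event at the end of processing and report which chunks are
    // present and would therefore be written to the persistent store.
    std::vector<std::string> walk() const {
        std::vector<std::string> toPersist;
        if (tracking_) toPersist.push_back("tracking");
        if (calorimeter_) toPersist.push_back("calorimeter");
        return toPersist;
    }

private:
    std::unique_ptr<TrackingData> tracking_;
    std::unique_ptr<CalorimeterData> calorimeter_;
};
```

Client code would call requestTracking() before dereferencing tracking(), and must check for a null pointer if it cannot be sure the chunk was requested. This is the price of avoiding implicit, chained navigation into the database.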

 


e-mail DRQuarrie@LBL.Gov