
Extended collection names

Collections for Bdb/Objy, classic Kanga and the new CM2 Kanga typically have a form like:

/xxx/yyy/zzz

For CM2 Kanga, a concrete example is:

/store/PR/R14/AllEvents/0004/58/14.3.2/AllEvents_00045810_14.3.2V00

This collection was produced by PR and contains the events output from a particular processing of run 45810. (If you are interested, there is an RFC which describes all of the details of how these collection names are constructed in CM2.)

For skim collections, we skim many different (PR or SP) collections and then merge the output. An example skim collection is:

/store/PRskims/R12/14.3.2h/Jpsitoll/02/Jpsitoll_0213

In practice this means that any given skim collection may contain skimmed events from, for example, many different data runs. It is occasionally useful to be able to exclude specific runs, however, as runs may be declared bad, have specific detector problems, etc.

In order to deal with this type of situation and other similar ones, we have created an "extended" collection syntax for CM2. An example of the new syntax (which deals with the problem mentioned above) is:

/store/PRskims/R12/14.3.2h/Jpsitoll/02/Jpsitoll_0213%rejectRun=35654

This will read the indicated skim collection, but all events from run 35654 will be filtered out and won't be passed to the event loop of a Framework application.

In general the collection name is "extended" simply by appending something of the form:

%<keyword>=<value>
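
As an illustration of this general form, here is a minimal Python sketch that splits an extended collection name into the plain collection name and its list of extensions. It is not part of the BaBar software; the function name split_extended_name is hypothetical, and the sketch assumes that "%" never appears inside the plain collection name or inside a value.

```python
def split_extended_name(name):
    """Split 'base%key=value%key=value' into (base, [(key, value), ...]).

    Assumption: '%' only ever introduces an extension, so the text before
    the first '%' is the plain collection name.
    """
    base, *parts = name.split("%")
    extensions = []
    for part in parts:
        keyword, sep, value = part.partition("=")
        if not sep or not keyword or not value:
            raise ValueError(f"malformed extension: %{part}")
        extensions.append((keyword, value))
    return base, extensions


base, extensions = split_extended_name(
    "/store/PRskims/R12/14.3.2h/Jpsitoll/02/Jpsitoll_0213%rejectRun=35654"
)
print(base)        # the plain skim collection name
print(extensions)  # the (keyword, value) pairs appended to it
```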

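Currently there are 10 keywords which can be used as part of the extended collection syntax; the ones used in the examples on this page are rejectRun, selectRun and selectEventSequence. "Values" can be constructed as comma-separated lists, ranges or combinations of both; for example, the list 8,10 and the range 1-1000 used below are both valid values.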
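
To make the value syntax concrete, the following Python sketch parses a numeric value into inclusive intervals and tests whether a number is covered. It is only an illustration: the helper names parse_values and contains are hypothetical, the combined value "8,10-12" is an invented example of mixing a list with a range, and ranges are assumed to be inclusive at both ends (consistent with 1-1000 meaning the first 1000 events). It does not apply to the non-numeric CondAlias and EventId extensions.

```python
def parse_values(spec):
    """Turn a value such as '8,10-12' into inclusive (low, high) pairs."""
    intervals = []
    for item in spec.split(","):
        low, sep, high = item.partition("-")
        # A single number N is treated as the one-element range N-N.
        interval = (int(low), int(high)) if sep else (int(low), int(low))
        if interval[0] > interval[1]:
            raise ValueError(f"empty range: {item}")
        intervals.append(interval)
    return intervals


def contains(intervals, number):
    """True if `number` falls inside any of the inclusive intervals."""
    return any(low <= number <= high for low, high in intervals)


print(parse_values("8,10-12"))
print(contains(parse_values("1-1000"), 1000))
```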

The "CondAlias" extension is a special case: unlike the others, it doesn't take simple numeric values but instead the same CONDALIAS strings you use to configure Moose.

The "EventId" extension is another special case: only individual values are accepted (no ranges). The event IDs are given in the ASCII form of the EidEventTriplet, i.e. the same form that EvtCounter writes to the log files (for example 7f:4fff7fff:2d268d/7d318347:K).

In fact you can even combine these such that multiple extensions are added:

/xxx/yyy/zzz%rejectRun=35654%selectEventSequence=1-1000

will pass events which are among the first 1000 events in the collection and which are not from run 35654.

As you can see, each "%<keyword>=<value>" is effectively a filter on the input events. One subtle thing to note is that each time you append a "%<keyword>=<value>" to a collection name you are creating a new, separate filter. This means that:

/xxx/yyy/zzz%selectRun=8%selectRun=10

will select no events (as no single event can pass both filters). What you probably wanted to do was:

/xxx/yyy/zzz%selectRun=8,10

to select all events from both run 8 and run 10.
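
The difference between the two forms can be reproduced with a toy model in Python. This is a sketch of the filter semantics described above, not the Framework implementation: the function names value_set and passes are hypothetical, only the selectRun and rejectRun keywords are modelled, and an event is represented simply by its run number.

```python
def value_set(spec):
    """Expand a numeric value such as '8,10' or '1-3' into a set of ints."""
    numbers = set()
    for item in spec.split(","):
        low, sep, high = item.partition("-")
        numbers.update(range(int(low), int(high if sep else low) + 1))
    return numbers


def passes(run, extended_name):
    """True if an event from `run` survives every run filter in the name.

    Each '%keyword=value' is a separate filter and an event must pass
    all of them. Keywords other than selectRun/rejectRun are ignored
    in this toy model.
    """
    _base, *extensions = extended_name.split("%")
    for extension in extensions:
        keyword, _, spec = extension.partition("=")
        runs = value_set(spec)
        if keyword == "selectRun" and run not in runs:
            return False
        if keyword == "rejectRun" and run in runs:
            return False
    return True


runs = [8, 9, 10]
print([r for r in runs if passes(r, "/xxx/yyy/zzz%selectRun=8%selectRun=10")])  # []
print([r for r in runs if passes(r, "/xxx/yyy/zzz%selectRun=8,10")])  # [8, 10]
```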

The dataset bookkeeping will generate (extended) collections using in particular the "rejectRun" keyword. In addition, the old method of generating tcl to allow individual jobs to run on a limited number of events (using the "first set xxx" of the input module and "ev begin -nev yyyy") will be replaced by collection names generated using the "selectEventSequence" keyword.
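
As a rough idea of what such generated names could look like, the Python sketch below splits one collection into per-job extended names. It is not the bookkeeping code: the function chunked_collections and its parameters are hypothetical, the total number of events is assumed to be known in advance, and event sequence numbers are assumed to start at 1 with inclusive ranges (as suggested by 1-1000 selecting the first 1000 events).

```python
def chunked_collections(collection, total_events, events_per_job):
    """Return one extended collection name per job.

    Each name restricts the job to a consecutive block of events via
    the selectEventSequence keyword.
    """
    if events_per_job < 1:
        raise ValueError("events_per_job must be at least 1")
    names = []
    for first in range(1, total_events + 1, events_per_job):
        last = min(first + events_per_job - 1, total_events)
        names.append(f"{collection}%selectEventSequence={first}-{last}")
    return names


for name in chunked_collections("/xxx/yyy/zzz", 2500, 1000):
    print(name)
```

With 2500 events and 1000 events per job, this yields three names, the last of which covers only the remaining 500 events.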

One last thing to note is that this extended collection syntax is only available when running on the CM2 Kanga data and only when running Framework applications (KanCollUtil/KanCopyUtil and interactive Kanga do not support it at the moment). As the extensions are basically filters, they can also only be used on input collections.


Last modified 30-Jul-2004, Peter Elmer