My working assumption is that the stage headers are as small as possible apart from containing references to their data objects, and that the stage tags are typically 1-10% of the size of the corresponding stage information.

Collections
Event collections contain references to events within the event store. Multiple types of collection are supported, and they all exhibit the same abstract interface. The different types are designed to accommodate both small (fewer than 10^6 events) and large (up to 10^12 events) collections of events, as well as collections that organize events according to the run number as derived from the online system. The shared abstract interface is the mechanism by which new events can be added to an existing collection, or by which all the events within a collection may be accessed. One restriction is that these collections do not yet support the concept of iterators, and therefore only a single iteration over a particular collection is supported within a process. This limitation might be removed following evaluation of the prototype.

Event collections are persistent-capable and care should be taken in creating them. A standard mechanism is provided for creating transient collections that can later be migrated to being persistent. However, such collections are restricted in their size. The prototype does not enforce a size restriction.

Dictionaries
Dictionaries are a mechanism by which a persistent event collection may be assigned a name and may then later be accessed by that name. Multiple dictionaries can exist, corresponding to the different context levels described above, and to the various groups and users. Thus there is a single System level dictionary that can be updated only by applications running at that context level, but may be accessed by any other database application.
A single dictionary may also be defined for each group and each user, and can be updated only by that group or user. In order to facilitate database management activities, all persistent event collections will automatically be assigned an internally generated name and will be associated with the corresponding dictionary depending on the context of the application. An application-specified name replaces this automatically generated one. This approach ensures that all persistent event collections can always be located by scanning through all available dictionaries. Each dictionary has a separate name space and thus collection name clashes can only occur within the same dictionary. An event collection may appear in several different dictionaries under the same or different names. Typical processing modes for an event store application are the following:
Registry
A registry of all dictionaries is maintained for management purposes. This enables the current set of dictionaries to be located, and thence all the persistent collections.

Applications
All applications using the BABAR event store access information via a single BdbEventStore class. This provides access to the registry and dictionaries, which in turn provide access to named collections of events. In addition it manages transactions and error reporting and gathers statistics. The BdbEventStore class is implemented as a singleton and thus provides a single instance through which all management must be performed. Applications can simultaneously access any of the BDMS domains using the appropriate domain-specific application classes.

Clustering Strategy
The goal of the clustering strategy is to optimize access to event data. This is particularly important in a situation where much of the data resides on tape rather than disk. This will be true for BABAR, although the use of a migrating file-system (as provided by HPSS) will automatically bring accessed files into the disk cache if they are not already present. There are several components to the strategy:
The implications of this strategy are:
Since the goal is to allow simple queries to operate purely on the event tag, it might make sense to give the tag an interface that allows access to the remainder of the event, bypassing the event header as a portal to the tag. This interface would be implemented by delegation to the event header, there being a bidirectional association between them. The advantage is that a collection of event tags would be the most compressed representation of events and would avoid the space and time overhead of the event header.

Directory & File Organization
Directory Tree
$BDBROOT/ -+- events/ -+- system/ -+- dic/ -+- dictionary.bdb
                       |           |
                       |           +- raw/ -+- raw000000/ -+- raw000000.bdb
                       |           |        |              |
                       |           |        |              +- [...]
                       |           |        |              |
                       |           |        |              +- raw0000FF.bdb
                       |           |        |              |
                       |           |        |              +- rawhdr000001.bdb
                       |           |        |              |
                       |           |        |              +- [...]
                       |           |        |
                       |           |        +- raw000100/ -+- raw000100.bdb
                       |           |        |              |
                       |           |        |              +- [...]
                       |           |        |
                       |           |        +- [...]
                       |           |
                       |           +- rec/ -+- rec000000/ -+- rec000000.bdb
                       |           |        |              |
                       |           |        |              +- [...]
                       |           |        |
                       |           |        +- rec000100/ -+- rec000100.bdb
                       |           |        |              |
                       |           |        |              +- [...]
                       |           |        |
                       |           |        +- [...]
                       |           |
                       |           +- sim/ -+- sim000000/ -+- sim000000.bdb
                       |           |        |              |
                       |           |        |              +- [...]
                       |           |        |
                       |           |        +- [...]
                       |           |
                       |           +- esd/ -+- esd000000/ -+- esd000000.bdb
                       |           |        |              |
                       |           |        |              +- [...]
                       |           |        |
                       |           |        +- [...]
                       |           |
                       |           +- aod/ -+- aod000000/ -+- aod000000.bdb
                       |           |        |              |
                       |           |        |              +- [...]
                       |           |        |
                       |           |        +- [...]
                       |           |
                       |           +- tag/ -+- tag000000/ -+- tag000000.bdb
                       |           |        |              |
                       |           |        |              +- [...]
                       |           |        |
                       |           |        +- [...]
                       |           |
                       |           +- hdr/ -+- hdr000000/ -+- hdr000000.bdb
                       |           |        |              |
                       |           |        |              +- [...]
                       |           |        |
                       |           |        +- [...]
                       |           |
                       |           +- col/ -+- col000000/ -+- col000000.bdb
                       |           |        |
                       |           |        +- [...]
                       |           |
                       |           +- [...]
                       |
                       +- groups/ -+- group[i]/ -+- dic/ -+- dictionary.bdb
                       |           |             |
                       |           |             +- raw/ -+- raw000000/ -+- raw000000.bdb
                       |           |             |        |              |
                       |           |             |        |              +- [...]
                       |           |             |        |
                       |           |             |        +- [...]
                       |           |             |
                       |           |             +- [...]
                       |           |
                       |           +- group[j]/ -+- dic/ -+- dictionary.bdb
                       |           |             |
                       |           |             +- [...]
                       |           |
                       |           +- [...]
                       |
                       +- users/ --+- user[i]/ -+- dic/ -+- dictionary.bdb
                                   |            |
                                   |            +- raw/ -+- raw000000/ -+- raw000000.bdb
                                   |            |        |              |
                                   |            |        |              +- [...]
                                   |            |        |
                                   |            |        +- [...]
                                   |            |
                                   |            +- [...]
                                   |
                                   +- user[j]/ -+- dic/ -+- dictionary.bdb
                                   |            |
                                   |            +- [...]
                                   |
                                   +- [...]
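The tree suggests that each family directory groups consecutive database files into subdirectories named after the first file they hold (raw000000/ holds raw000000.bdb through raw0000FF.bdb, and raw000100/ starts at raw000100.bdb). The following is a minimal sketch of that naming rule, not part of the event store itself. It assumes a grouping of 256 files per subdirectory and six upper-case hex digits, both inferred from the tree; the function name `database_path` and its parameters are hypothetical.

```python
def database_path(root, context, family, sequence, files_per_dir=256):
    """Return the assumed location of database file number `sequence`.

    `context` is a path fragment such as "system"; `family` is one of the
    family names in the tree (raw, rec, sim, ...). The 256-file grouping
    is an assumption read off the tree, not a documented constant.
    """
    if not 0 <= sequence <= 0xFFFFFF:
        raise ValueError("sequence must fit in six hex digits")
    # The subdirectory is named after the first sequence number it holds.
    first_in_dir = sequence - (sequence % files_per_dir)
    directory = f"{family}{first_in_dir:06X}"
    filename = f"{family}{sequence:06X}.bdb"
    return "/".join([root, "events", context, family, directory, filename])


if __name__ == "__main__":
    print(database_path("$BDBROOT", "system", "raw", 0xFF))
    print(database_path("$BDBROOT", "system", "raw", 0x100))
```

Under these assumptions, sequence numbers 0x00 to 0xFF land in raw000000/ and 0x100 opens raw000100/, matching the two subdirectories shown for the raw family.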
Database File Details
Subdirectories
raw   Raw data
      Name:   rawNNNNNN.bdb
      Length: Fixed Length [~20GB]
              Broken into fixed length containers of ~1GB
      Raw Headers & Tags
      Name:   rawhdrNNNNNN.bdb
              rawtagNNNNNN.bdb
rec   Reconstructed data
      Name:   recNNNNNN.bdb
      Length: Fixed Length [~20GB]
              Broken into fixed length containers of ~1GB
      Rec Headers & Tags
      Name:   rechdrNNNNNN.bdb
              rectagNNNNNN.bdb
sim   Simulated Truth data
      Name:   simNNNNNN.bdb
      Length: Fixed Length [~20GB]
              Broken into fixed length containers of ~1GB
      Sim Headers & Tags
      Name:   simhdrNNNNNN.bdb
              simtagNNNNNN.bdb
esd   Event Summary Data
      Name:   esdNNNNNN.bdb
      Length: Fixed Length [~1GB]
              Broken into fixed length containers of ~1GB
              What's the data organisation - by event or by class?
      ESD Headers & Tags
      Name:   esdhdrNNNNNN.bdb
              esdtagNNNNNN.bdb
aod   Analysis Object Data
      Name:   aodNNNNNN.bdb
      Length: Not yet specified
              Data organisation by class (first guess)
tag   Tag data
      Name:   tagNNNNNN.bdb
      Length: Not yet specified
              Data organisation by event (single object per event tag)
dic   Dictionary
      Contains appropriate dictionaries and any other book-keeping
      databases
      Name:   dictionary.bdb
              [others?]
col   Event Collections
      Contains event collections
      Name:   colNNNNNN.bdb
      Length: Not yet specified
hdr   Event Headers
      The Event Header is the object that has references to the other
      processed objects - tag, aod, esd, rec, raw, sim (if appropriate).
      Name:   hdrNNNNNN.bdb
      Length: Not yet specified
In the above, NNNNNN denotes the sequence number in hexadecimal.
Notes/Issues
Deriving the creation location
The following derivation tree is implied for the creation clustering hint:
                                BasicHint
                                    |
         +--------------------------+---------------------+
         |                          |                     |
  ConditionsHint               EventsHint            OnlineHint
         |                          |
       [tla]                    [Context]
         |                          |
  +------+------+        +----------+------------+
  |      |      |        |          |            |
tla[i] tla[j] tla[k]  System      Group        User
  |                      |
[Sequence]           [Family]
                         |
    +------+------+------+------+------+------+-----+-----+
    |      |      |      |      |      |      |     |     |
   raw    rec    sim    esd    aod    tag    col   hdr   dic
    |      |      |      |      |      |      |     |     |
[Sequence]    [Sequence]    [Sequence]    [Sequence]      |
    |                                                     |
    +--------+                                     dictionary.bdb
    |        |
raw000000 raw000100
    |
[Sequence]
Federated Database Management
The FDDB contains both the schema (class definitions) and the database catalog (locations etc.). It is imperative that this be well managed and protected against accidental corruption or deletion. In general, once a significant amount of data is loaded into an FDDB, it should be "hard" to change the schema. Objectivity supports several strategies for schema migration and conversion:
A further consideration is that two developers attempting to run the DDL compiler against the same federated database may come into conflict, such that one of their compilations fails because it cannot lock the FDDB for exclusive use during the course of the compilation.

Another factor is the concept of multiple partitions, where one site is designated as the primary partition and contains the primary FDDB and data, and other sites (secondary partitions) may have local copies of only some of the databases within the FDDB, and may make extensions to the primary schema without affecting the primary FDDB. The secondary partitions act as write-back caches: if requested data is available locally it is accessed directly, otherwise it is fetched from the primary partition. Similarly, if data is updated then it will automatically be propagated back to the primary partition (and to other secondary partitions) unless it is purely local to the secondary partition.

The present (very preliminary) strategy for managing the FDDB and allowing developer access is the following:
The present design supports, within the Production FDDB, both work-group and user-based databases and event collections in addition to the global system ones. This concept is not yet integrated with the developer view outlined above.

Access to data from Reconstruction Applications
The basic access model is that there will be named collections of events which are accessed through one of several dictionaries. There is one System-wide dictionary, several Group-wide dictionaries and many User-specific dictionaries. Thus there can be named collections at the System level representing particular physics data samples (and obviously the complete event sample), similar collections at the Group level (e.g. for each physics working group) and at the User level. In general these collections need not contain any data; they just contain references to the event headers of the selected events. However, it may be advantageous to duplicate part of the information for the events in order to improve repeated access to that data.

Components of the strategy
Data Distribution Strategy
[This section missing]

Data Logging Strategy
[This section missing]

Useful Quantities
Federated database capacity vs. database file size
A maximum of 65535 database files is allowed in an Objectivity/DB Federated Database. The following table assumes that 50% of these database files are small and therefore make little contribution to the total capacity. Thus the following table corresponds to 32767 database files of the specified size.
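The capacity arithmetic described here can be sketched as below. This is only an illustration: the 65535-file limit and the 50% assumption come from the text, while the function name `federation_capacity_tb`, the list of file sizes (borrowed from the fill-time figures later in this section) and the convention 1 TB = 1000 GB are assumptions made for the example.

```python
MAX_DATABASE_FILES = 65535  # Objectivity/DB limit quoted in the text


def federation_capacity_tb(file_size_gb, large_fraction=0.5):
    """Approximate federation capacity in TB for a given large-file size.

    Only the `large_fraction` of files that are large is counted; the
    remaining small files are assumed to contribute nothing.
    Assumes 1 TB = 1000 GB.
    """
    large_files = int(MAX_DATABASE_FILES * large_fraction)  # 32767 at 50%
    return large_files * file_size_gb / 1000.0


if __name__ == "__main__":
    for size_gb in (1, 2, 10, 20, 50):
        capacity = federation_capacity_tb(size_gb)
        print(f"{size_gb:>3} GB files -> {capacity:8.1f} TB")
```

With 20 GB database files, for example, this gives roughly 655 TB of total capacity under the stated assumptions.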
Number of files for a nominal year's worth of data (10^7 secs)
Number of files for a week's worth of data (604800 secs)
Time taken to fill a fixed-length database file at various writing rates
0.25 MB/sec writing rate
1GB = 4000secs = 66mins 40secs = 1hr 6mins 40secs
2GB = 8000secs = 133mins 20secs = 2hrs 13mins 20secs
10GB = 40000secs = 666mins 40secs = 11hrs 6mins 40secs
20GB = 80000secs = 1333mins 20secs = 22hrs 13mins 20secs
50GB = 200000secs = 3333mins 20secs = 55hrs 33mins 20secs
1 MB/sec
1GB = 1000secs = 16mins 40secs
2GB = 2000secs = 33mins 20secs
10GB = 10000secs = 166mins 40secs = 2hrs 46mins 40secs
20GB = 20000secs = 333mins 20secs = 5hrs 33mins 20secs
50GB = 50000secs = 833mins 20secs = 13hrs 53mins 20secs
2.5 MB/sec
1GB = 400secs = 6mins 40secs
2GB = 800secs = 13mins 20secs
10GB = 4000secs = 66mins 40secs = 1hr 6mins 40secs
20GB = 8000secs = 133mins 20secs = 2hrs 13mins 20secs
50GB = 20000secs = 333mins 20secs = 5hrs 33mins 20secs
10 MB/sec
1GB = 100secs = 1mins 40secs
2GB = 200secs = 3mins 20secs
10GB = 1000secs = 16mins 40secs
20GB = 2000secs = 33mins 20secs
50GB = 5000secs = 83mins 20secs = 1hr 23mins 20secs
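The fill times follow from dividing the file size by the writing rate, and the file counts for a year or a week follow from the same quantities. A minimal sketch of both calculations is given below, assuming 1 GB = 1000 MB (the convention the figures above imply); the function names `fill_time` and `files_needed` are hypothetical.

```python
import math


def fill_time(file_size_gb, rate_mb_per_sec):
    """Return (total_seconds, hours, minutes, seconds) to fill one file.

    Assumes 1 GB = 1000 MB.
    """
    total = round(file_size_gb * 1000 / rate_mb_per_sec)
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return total, hours, minutes, seconds


def files_needed(period_secs, rate_mb_per_sec, file_size_gb):
    """Number of fixed-length files written in `period_secs` at a steady rate."""
    total_mb = period_secs * rate_mb_per_sec
    return math.ceil(total_mb / (file_size_gb * 1000))


if __name__ == "__main__":
    for rate in (0.25, 1, 2.5, 10):
        print(f"{rate} MB/sec")
        for size_gb in (1, 2, 10, 20, 50):
            total, h, m, s = fill_time(size_gb, rate)
            print(f"  {size_gb}GB = {total}secs = {h}hrs {m}mins {s}secs")
    # Nominal year (10^7 secs) and one week (604800 secs) with 20GB files.
    print(files_needed(10**7, 2.5, 20), files_needed(604800, 2.5, 20))
```

The same two functions can be used to regenerate the per-year and per-week file counts for any combination of writing rate and file size.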