Data Distribution for ROOT implementation of the Configuration Database
Data distribution for the configuration database is implemented using the standard BaBar bookkeeping and data distribution tools. The Configuration database will be part of special datasets that include both conditions and configuration data. As of December 2005 there are no conditions datasets yet, but one dataset is already defined which includes the Configuration database only. The name of this dataset is "Nonevent-CfgDB".
The ROOT implementation of the Configuration database uses the KanAccess mechanism to find the physical location where the ROOT files are stored, and also an additional configuration file which gives the name of the most recent snapshot. The sections below describe how to modify KanAccess.cfg and how to set up the additional configuration files.
Importing the Configuration database involves two stages:
- Copying the ROOT files which are part of the configuration or conditions dataset. This is done with the standard data distribution tools.
- Updating the local rules file with the name of the most recent snapshot. This is done by calling a dedicated script and giving it the file name of the snapshot that has been imported.
The import procedure depends on whether you are using the remote bookkeeping database or a local copy of it. Below we describe two typical use cases, for sites without and with a local bookkeeping database.
The typical size of the ROOT file with the Configuration database is currently around 7 MB, so it does not need much space by itself. But if you copy the Configuration database you will probably need Conditions too, and the Conditions data are much bigger.
If you do not already have one at your site, create a configuration file for Kanga data access which defines the translation rules from a "logical file name" (LFN) to a "physical file name" (PFN). The name of the file is $BFROOT/kanga/config/KanAccess.cfg. For a full description of the KanAccess.cfg file format, check the BaBar web pages. For the Configuration database, all that is needed is a translation rule for the "/store/cfg/*" path. If your site uses xrootd, as SLAC does, the rules can look like:
read /store/cfg/* xrootd kanolb-a:1094/
write /store/cfg/* error
If you prefer to access the Configuration database using the ordinary filesystem, then it might look like:
read /store/cfg/* file /work/cfgdb-root
write /store/cfg/* error
To check that it works as expected run the following command:
KanAccess /store/cfg/CfgDB.root
Its output should print the path to the intended storage space, either xrootd or a filesystem directory.
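Before running KanAccess, it can be handy to confirm that the rule is actually present in the file. The following is a minimal Bourne-shell sketch, not part of the BaBar tools: has_cfg_read_rule is a hypothetical helper, and it assumes only the "read <pattern> <method> <target>" line layout shown in the examples above.

```sh
#!/bin/sh
# Sketch: report whether a KanAccess-style config file contains a read rule
# for /store/cfg/*. has_cfg_read_rule is a hypothetical helper name; the
# line layout is assumed from the examples in this document.
has_cfg_read_rule() {
    cfg_file=$1
    # Matches lines such as: read /store/cfg/* xrootd kanolb-a:1094/
    grep -Eq '^[[:space:]]*read[[:space:]]+/store/cfg/\*[[:space:]]' "$cfg_file"
}

# Example use against the default location (skipped if the file is absent).
cfg="${BFROOT:-/nonexistent}/kanga/config/KanAccess.cfg"
if [ -f "$cfg" ]; then
    if has_cfg_read_rule "$cfg"; then
        echo "read rule for /store/cfg/* found in $cfg"
    else
        echo "no read rule for /store/cfg/* in $cfg"
    fi
fi
```

This only checks that a rule exists; KanAccess itself remains the real test of where the path resolves.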
The ROOT implementation of the configuration database uses its own rules to translate the generic name of the ROOT file into the specific snapshot name. For this you need to create the file $BFROOT/kanga/config/cfgdb/CfgDBNameRules.cfg with the following content (or copy it from SLAC):
# we don't allow writing to a generic DB
write /cfg/CfgDB\.root error
# generic name translation for the most recent snapshot
-include kanga/config/cfgdb/CfgDBNameRules-latest.cfg
As you can see, this file has an include statement for another file (CfgDBNameRules-latest.cfg), which does not exist yet but will be created once you have imported the snapshot and run the update script CfgRootUpdateCfg (see details below). Note that the directory where the rules file lives must be writable by whoever runs the update script.
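As an illustration, the rules file can be created from a small Bourne-shell function. This is only a sketch: write_name_rules is a hypothetical helper, and the file content is copied verbatim from the listing above. It also checks that the directory is writable, since the update script has to create files there.

```sh
#!/bin/sh
# Sketch: create CfgDBNameRules.cfg in the given directory with the content
# shown in this document. write_name_rules is a hypothetical helper name.
write_name_rules() {
    rules_dir=$1
    mkdir -p "$rules_dir" || return 1
    # The update script must be able to create files in this directory.
    if [ ! -w "$rules_dir" ]; then
        echo "directory $rules_dir is not writable" >&2
        return 1
    fi
    # The quoted 'EOF' keeps the backslash in CfgDB\.root literal.
    cat > "$rules_dir/CfgDBNameRules.cfg" <<'EOF'
# we don't allow writing to a generic DB
write /cfg/CfgDB\.root error
# generic name translation for the most recent snapshot
-include kanga/config/cfgdb/CfgDBNameRules-latest.cfg
EOF
}

# Typical call: write_name_rules "$BFROOT/kanga/config/cfgdb"
```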
If you do not have a local bookkeeping database, you can use the database at a remote site, typically SLAC. Here is an example command for copying the Configuration database to a local disk:
BbkImport --dbsite=slac --noupdate-sql --ftp=bbftp --remote='0*' \
--dataset=Nonevent-CfgDB \
${USER}@bbr-xfer01.slac.stanford.edu \
/work/cfgdb-root
Explanation of the options:
--dbsite=slac
- instructs BbkImport to use the remote bookkeeping database at SLAC
--noupdate-sql
- instructs BbkImport not to update the bookkeeping database
--ftp=bbftp
- specifies the transfer method; other methods include bbcp and scp (the default)
--remote='0*'
- selects all files, not just those marked for import
--dataset=Nonevent-CfgDB
- specifies the name of the dataset; "Nonevent-CfgDB" is the configuration-only dataset, and there could also be conditions datasets which include the configuration database
${USER}@bbr-xfer01.slac.stanford.edu
- ssh will connect to the bbr-xfer01 host as the current user
/work/cfgdb-root
- name of the directory where the files will be copied; if you do not use xrootd, this directory should probably be the same as the one in the KanAccess.cfg file
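To keep the target directory consistent with KanAccess.cfg when using plain filesystem access, the directory can be read back from the rule rather than typed twice. The sketch below assumes the "read /store/cfg/* file <directory>" layout shown earlier; cfg_file_target is a hypothetical helper, not a BaBar tool.

```sh
#!/bin/sh
# Sketch: print the directory named in a "read /store/cfg/* file <dir>" rule.
# cfg_file_target is a hypothetical helper name; prints nothing when the
# file has no such rule (for example, when the site uses xrootd).
cfg_file_target() {
    awk '$1 == "read" && $2 == "/store/cfg/*" && $3 == "file" { print $4; exit }' "$1"
}

# Typical use:
#   target=$(cfg_file_target "$BFROOT/kanga/config/KanAccess.cfg")
#   then pass "$target" as the last argument of the BbkImport command above.
```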
THIS IS NOT TESTED, USE AT YOUR OWN RISK. When you use your local bookkeeping database for data transfers from SLAC, you would typically first use the bookkeeping tools to mark the collection in the "Nonevent-CfgDB" dataset for import (after synchronizing your database with the SLAC copy):
BbkFiles --dbsite=local --dbuser=bbrora --dataset=Nonevent-CfgDB \
--remote=2 --setremote=1C
Explanation of the options:
--dbsite=local
- instructs BbkFiles to use the local bookkeeping database
--dbuser=bbrora
- connects as the user with write privileges
--dataset=Nonevent-CfgDB
- specifies the name of the dataset
--remote=2
- selects just those files that haven't already had their status defined
--setremote=1C
- new status flag, statuses starting with 1 are marked "to import"
After that you can import marked files:
BbkImport --dbsite=local --dbuser=bbrora --ftp=bbftp \
bbrdist@bbr-xfer01.slac.stanford.edu
Options to this command have the same meaning as in the above commands.
The last step is to update the rules file. In some cases you may need to wait until the data that you have copied become available, for example through xrootd. There should be some means of determining when that happens, but it depends on the particular setup at your site. If you use filesystem access to the data files, waiting is probably not needed at all.
To update the rules you need to know the name of the file that has been copied. This name can be determined from several sources:
- When BbkImport copies the file it prints a transfer summary, and from that you can determine the file name. For example from this output:
05/12/07 11:39:55 get bbr-xfer07.slac.stanford.edu:/kanga/store/cfg/2005/11/CfgDB-20051108T024220.root ...
one can see that the file that has been copied is /store/cfg/2005/11/CfgDB-20051108T024220.root. If the file already exists locally, BbkImport reports something like:
05/12/07 14:21:33 /store/cfg/2005/11/CfgDB-20051108T024220.root (6870234 bytes, 6.6 MB) is already here
from which it's also easy to see the file name.
- You can extract the name of the file from your local bookkeeping database:
% BbkUser -q --dataset=Nonevent-CfgDB file
/store/cfg/2005/11/CfgDB-20051108T024220.root
- Or you can extract the name of the file from the remote bookkeeping database at SLAC:
% BbkUser -q --dbsite=slac --dataset=Nonevent-CfgDB file
/store/cfg/2005/11/CfgDB-20051108T024220.root
but this method may not be very reliable because you do not control when the remote database is updated.
Once you have the file name of the configuration database snapshot, you can update the rules file with the following command (use the file name that you obtained earlier, do not copy-and-paste from here):
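For the first source, the logical file name can be pulled out of the BbkImport output mechanically. The following Bourne-shell sketch assumes only the two output formats quoted above; extract_cfgdb_lfn is a hypothetical helper, and other BbkImport messages may need a different pattern.

```sh
#!/bin/sh
# Sketch: read BbkImport output on stdin and print the first
# /store/cfg/...root path found in it.
# extract_cfgdb_lfn is a hypothetical helper name.
extract_cfgdb_lfn() {
    # Keep only the /store/cfg/... part of matching lines, so any prefix
    # such as "host:/kanga" is dropped.
    sed -n 's|.*\(/store/cfg/[^ ]*\.root\).*|\1|p' | head -n 1
}

# Typical use: extract_cfgdb_lfn < import.log
# where import.log is a saved copy of the BbkImport output.
```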
% CfgRootUpdateCfg /store/cfg/2005/11/CfgDB-20051108T024220.root
This should create the file CfgDBNameRules-20051108T024220.cfg in the directory $BFROOT/kanga/config/cfgdb and the symlink CfgDBNameRules-latest.cfg in the same directory pointing to that file.
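The result can be checked with a short Bourne-shell sketch. It relies only on the naming pattern described above (CfgDB-<timestamp>.root maps to CfgDBNameRules-<timestamp>.cfg). Both function names are hypothetical. Since it is not stated whether the symlink target is relative or absolute, only its base name is compared.

```sh
#!/bin/sh
# Sketch: derive the expected rules file name from a snapshot file name.
# rules_file_for_snapshot and check_latest_link are hypothetical helpers.
rules_file_for_snapshot() {
    # /store/cfg/2005/11/CfgDB-20051108T024220.root -> CfgDB-20051108T024220
    base=$(basename "$1" .root)
    # Drop the "CfgDB-" prefix to keep only the timestamp part.
    stamp=${base#CfgDB-}
    printf 'CfgDBNameRules-%s.cfg\n' "$stamp"
}

# Succeeds if <rules_dir>/CfgDBNameRules-latest.cfg is a symlink whose
# target has the expected base name for the given snapshot.
check_latest_link() {
    rules_dir=$1
    expected=$(rules_file_for_snapshot "$2")
    link="$rules_dir/CfgDBNameRules-latest.cfg"
    [ -L "$link" ] && [ "$(basename "$(readlink "$link")")" = "$expected" ]
}

# Typical use:
#   check_latest_link "$BFROOT/kanga/config/cfgdb" \
#       /store/cfg/2005/11/CfgDB-20051108T024220.root && echo "rules are up to date"
```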
The default implementation of the configuration database is still Objectivity. To use the ROOT implementation instead of Objectivity, you have to define the following environment variable (assuming (t)csh):
% setenv CFG_DEFAULT_IMPL ROOT
You can even put this into the same boot script which defines the boot file for regular Objectivity database.
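For users of Bourne-type shells (sh, bash, zsh), the equivalent of the setenv line is shown below. Only the shell syntax differs. The variable name and value are the ones given above.

```sh
# Bourne-shell equivalent of "setenv CFG_DEFAULT_IMPL ROOT".
CFG_DEFAULT_IMPL=ROOT
export CFG_DEFAULT_IMPL
```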
All of the above should work in recent releases, which probably means anything after 18.3.0. For earlier releases all this is irrelevant but harmless, so you can still experiment with the alternative implementation without disturbing older clients.
We are still working on automating the updates to the central bookkeeping database, so you should expect delays in getting the most recent information from the bookkeeping.