Setting up local Bookkeeping Databases
This page describes the procedure for installing a
MySQL database and setting up a local copy of the
BaBar bookkeeping databases and, if needed, a skimming database.
It replaces this old
documentation.
Summary
- Install the MySQL server
- Create the BaBar databases
- Install MySQL client software
- Setup database connection information
- Import the bookkeeping release
- Load the database schema
- Load the bookkeeping information
- Keeping the local database up-to-date
- Ongoing maintenance
The local bookkeeping databases can be used to manage
the import of BaBar data to your site.
The skimming database can be used to
control production skimming.
The MySQL server can run on any machine network-accessible to the
user login machines (where people run BbkDatasetTcl etc). If you want
your database accessible from other sites (eg. SLAC), which is often
convenient, then the server should be outside your site's firewall or
else the MySQL port (3306) should be open (it is also possible to route
requests via a proxy, but that is not described here). Access to the
MySQL server is not needed from the batch worker nodes.
Log in as root and:-
- Install the following RPMs, if not already done.
mysql
mysql-server
You can install the versions that are included in the RedHat or
Scientific Linux distribution (the BaBar bookkeeping requires MySQL
3.23 or later).
- Start the MySQL server:
/etc/init.d/mysql start
- Set the password for the database's "
root" username:
mysqladmin -u root password ROOT_PASSWORD
where you should pick your own ROOT_PASSWORD
(NB. this should
probably be different from the Unix root account password).
Log onto the MySQL server machine. You don't need to be logged in
as Unix root, but the normal setup requires that access to the MySQL
"root"
username has to be done from the local host.
Connect to the MySQL database using the root database username
and
password (probably not the same as the Unix root account).
shell> mysql -u root -p
Enter password: *************
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 96992 to server version: 3.23.58
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
You can now set up the databases that you need, probably one or
more of "bbkr14" (bookkeeping for R14/R16),
"bbkr18" (bookkeeping for R18/R22),
"bbkr24" (bookkeeping for R24),
and "stm1" (TM2 skim database).
You can use different database names
and different usernames if you like (eg. to conform with local
conventions). The following commands (from the mysql
command's prompt)
are for the bbkr18 database, but should be repeated for the other
databases as required.
mysql> CREATE DATABASE bbkr18;
Query OK, 1 row affected (0.03 sec)
mysql> GRANT SELECT ON bbkr18.* TO anyuser;
Query OK, 0 rows affected (0.01 sec)
mysql> GRANT ALL ON bbkr18.* TO bfactory;
Query OK, 0 rows affected (0.01 sec)
That defines usernames "anyuser"
(unprivileged, read-only access) and "bfactory"
(privileged access). If this is the first database you
have set up with those usernames, then you must set a password for each
of them.
mysql> SET PASSWORD FOR anyuser=PASSWORD('query');
Query OK, 0 rows affected (0.00 sec)
mysql> SET PASSWORD FOR bfactory=PASSWORD('BFACTORY_PASSWORD');
Query OK, 0 rows affected (0.00 sec)
where you should pick your own BFACTORY_PASSWORD.
For the skim database (if you create this), it is also useful to
create a production username, eg. "bbruser":
mysql> GRANT SELECT,INSERT,UPDATE,DELETE ON stm1.* TO bbruser;
Query OK, 0 rows affected (0.01 sec)
mysql> SET PASSWORD FOR bbruser=PASSWORD('BBRUSER_PASSWORD');
Query OK, 0 rows affected (0.00 sec)
The default MySQL installation includes a couple
of "anonymous accounts".
These can interfere
with access to the BaBar accounts from the MySQL server machine itself, so it is probably
simplest to remove them.
mysql> DELETE FROM mysql.user WHERE User='';
Query OK, 2 rows affected (0.00 sec)
mysql> DELETE FROM mysql.db WHERE User='';
Query OK, 2 rows affected (0.00 sec)
mysql> FLUSH PRIVILEGES;
Query OK, 0 rows affected (0.02 sec)
Exit from the mysql command ("quit" command)
and check that you can connect to the database:
shell> mysql -u anyuser -p bbkr18 -e quit
shell> mysql -u bfactory -p bbkr18 -e quit
(and similarly for the other username/database combinations).
You could load the database schema here, but it is often more
convenient to do this from a user login machine (where you can check out
the schema files from CVS), as described below.
The MySQL client code and Perl are needed on all user login
machines
(where people run BbkDatasetTcl etc).
You can install the Perl RPM that is included in the RedHat or
Scientific Linux distribution. The BaBar bookkeeping code requires Perl
5.6.1 or later (SL
3.0.X
has perl 5.8.0).
Log in as root
and install the following RPMs:
mysql
perl-DBI
perl-DBD-MySQL
perl-PlRPC
For the first three, you can install the RPMs that are included
in
the RedHat or Scientific Linux distribution. perl-PlRPC
(which is required to access the SLAC database, eg. to get updates)
is available from the DAG RPM Repository
(eg. for Scientific Linux 3), or
can be installed from source with
perl -MCPAN -e 'install RPC::PlClient'
Alternatively, you could install all the Perl modules this way
using
perl -MCPAN -e 'install Bundle::DBI'
but that can take a long time.
You need to set up the connection parameters and passwords in the bookkeeping
connection manager. Check whether your site is defined in
$BFROOT/internal/Tools/bookkeeping
in SLAC AFS (should be a directory
with the same name as your $BFSITE). If
it isn't, ask Douglas Smith
(douglas@slac.stanford.edu)
to create the directory.
In this directory, create a file "connect"
containing:
type=MySQL
connection=Network
server=MYSQL_SERVER_HOST
database=bbkr18
driver=dbi:mysql:bbkr18:MYSQL_SERVER_HOST
replacing MYSQL_SERVER_HOST (twice)
with the full host name of your MySQL
server name (eg. babar-sql.gridpp.rl.ac.uk for RAL).
The database name ("bbkr18" in this example) can specify any of
the databases you created.
Then create "anyuser",
"admin",
and "production"
subdirectories
and
one file for each database username containing the password, eg. create
files
anyuser/anyuser
admin/bfactory
production/bbruser
where the directory name defines the function (can be any name you
like) and the file name defines the username. So for site "MY_SITE":
yakut04> cd $BFROOT/internal/Tools/bookkeeping/MY_SITE
yakut04> mkdir anyuser
yakut04> echo query > anyuser/anyuser
and similarly for the other files.
The files are in separate directories so access to the password
files can be controlled (if the same people manage the database and run production,
then you can put both bfactory and bbruser files in the same
directory, but keeping them separate gives greater flexibility).
You should set the AFS protections on the "admin" and
"production"
directories so that only database administrators and people doing the
skimming have access, eg.
yakut04> fs listacl admin
Access list for admin is
Normal rights:
g-babar:associate rl
g-babar:community rl
g-babar:member rl
g-babar rlidwka
system:slac l
system:administrators rlidwka
yakut04> fs setacl admin g-babar:associate none \
g-babar:community none \
g-babar:member none
yakut04> fs setacl admin adye rl
You can find examples for other sites in the
$BFROOT/internal/Tools/bookkeeping
directory.
Check this works from your local user login machine (logged in
as
one of the allowed users). This will reload the cache (may prompt for a
password and take a few seconds the first time).
shell> BbkTestConnect --dbsite=MY_SITE --dbname=bbkr18 --dbuser=bfactory
Testing site MY_SITE :
Testing allowed user bfactory:
Received DB connection.
If you don't yet have a site defined in
$BFROOT/internal/Tools/bookkeeping,
you can still try it out by
creating the admin/bfactory file under your ~/.bbk/sites/local
directory and use --dbsite=local.
This step is not strictly necessary if you will only use the
local
database for skim production, but it is still useful.
To ensure that the bookkeeping commands and database scheme stay
in
synch, the BbkDatasetTcl, BbkLumi, BbkUser, etc commands are run from a
well-defined release (to run those commands from another release, you
can use relBbkDatasetTcl,
but this is discouraged). The bookkeeping
release will already be the default at SLAC, so you can easily see
which release it is by logging into SLAC and running
yakut04> BbkDatasetTcl --version
BbkDatasetTcl V2.2 (Id: relBbkDatasetTcl,v 1.17 2006/06/22 17:43:31 adye Exp )
Script /afs/slac.stanford.edu/g/babar/dist/releases/22.1.1b/bin/Linux24SL3_i386_gcc323/relBbkDatasetTcl uses /afs/slac.stanford.edu/g/babar/reco/vol17/22.1.1b/include/BbkTools
You can see the release number (22.1.1b in
this case) in the
script
path.
Import this release (and, for a lettered release, its base
release)
to your local site:
importrel -p -a 22.1.1
importarch -p 22.1.1 Linux24SL3_i386_gcc323
importrel -p -a 22.1.1b
importarch -p 22.1.1b Linux24SL3_i386_gcc323
cd $BFDIST/releases/22.1.1
gmake siteinstall
cd $BFDIST/releases/22.1.1b
gmake siteinstall
The details of importing releases may well vary between sites, but
you probably already worked that out.
Now update $BFROOT/bin
to the latest tag, or just the HEAD:
cd $BFROOT/bin
cvs update -A
If you didn't already have $BFROOT/bin,
then check out the "bin"
package from SLAC CVS into $BFROOT, and
make sure that $BFROOT/bin
is
in every user's $PATH.
You can check the setup by running BbkDatasetTcl --version on
your
machine. It should show the same release number as it did at SLAC. You
can also try connecting to the SLAC database, eg.
shell> BbkDatasetTcl --site=slac A0-Run6-OnPeak-R22d
BbkDatasetTcl: wrote A0-Run6-OnPeak-R22d.tcl
Selected 145 collections, 21610321/774158477 events, ~49590.7/pb, from bbkr18 at slac
The default database is SLAC. If you set up the database
connection
information for your site, then you can make your site the default.
Create a file $BFOVERRIDE/SoftRelTools/bbk.conf
containing just
dbsite=MY_SITE
with your database site name instead of MY_SITE. You
should then
find that BbkTestConnect will connect to your site, even if you leave
out the --dbsite=MY_SITE.
The bookkeeping database schema for bbkr14, bbkr18, and bbkr24 are in
BbkTools. These can be loaded straight from the bookkeeping release
(see above to check the correct release number to use instead of
22.1.1b here):
mysql -h MYSQL_SERVER_HOST -u bfactory -p bbkr18 < $BFDIST/releases/22.1.1b/BbkTools/bbk_table_schema.mysql
mysql -h MYSQL_SERVER_HOST -u bfactory -p bbkr18 < $BFDIST/releases/22.1.1b/BbkTools/data_table_schema.mysql
or else check out the bookkeeping release's version of BbkTools
and use that (find with: statusrel -mp 22.1.1b BbkTools).
Repeat those two commands for bbkr14 and/or bbkr24 if necessary.
Once the schema are loaded, the normal bookkeeping commands should
run, though they won't (yet) return anything. Eg.
shell> BbkDatasetTcl
BbkDatasetTcl: 0 datasets found in bbkr18 at MY_SITE.
The skim database schema should be loaded from the
BbkTaskManager package version that
will be used for skimming.
mysql -h MYSQL_SERVER_HOST -u bfactory -p stm1 < BbkTaskManager/stm1_table_schema.mysql
The mirror procedure, described below, will keep the bookkeeping
database up-to-date. To speed up the initial loading, it is much faster
to download a recent "snapshot" and load that.
- Download the snapshot files
from http://hepunx.rl.ac.uk/BaBar/mysqldump/bbkr18-ral.sql.gz.
The snapshot is updated twice a week from the RAL bookkeeping
database (we don't make the snapshot at SLAC, because Oracle can't make
a file to be loaded into MySQL).
The following examples are for the bbkr18 database. You can
repeat them for bbkr14 and/or bbkr24 (if you made those databases), using the snapshot files
http://hepunx.rl.ac.uk/BaBar/mysqldump/bbkr14-ral.sql.gz and/or
http://hepunx.rl.ac.uk/BaBar/mysqldump/bbkr24-ral.sql.gz.
- Load the snapshot into your database:
gunzip -c bbkr18-ral.sql.gz | mysql -h MYSQL_SERVER_HOST -u bfactory -p bbkr18
- The snapshot misses out information about which files (etc)
are
local. You need to fill the initial settings (this is normally done
automatically by the mirror, but not for the snapshot).
mysql -h MYSQL_SERVER_HOST -u bfactory -p bbkr18 < $BFDIST/releases/22.1.1b/BbkTools/data_table_update.mysql
(replacing 22.1.1b with the current bookkeeping release, as
above).
- The collection-local flag has to be initialised separately.
BbkModifyDseFlags --site=MY_SITE --dbuser=bfactory --dbname=bbkr18 --all --autosetislocal
When this is complete, you should be able to use the bookkeeping
commands to query the local database, as long as you don't restrict the
query to locally-imported datasets (since we haven't imported anything
yet). Eg.
shell> BbkDatasetTcl --nolocal A0-Run6-OnPeak-R22d
Dataset A0-Run6-OnPeak-R22d is not available in bbkr18 at ral
Generating tcl file anyway, but it won't work at ral.
BbkDatasetTcl: wrote A0-Run6-OnPeak-R22d.tcl
Selected 145 collections, 21610321/774158477 events, ~49590.7/pb, from bbkr18 at ral
Dataset is not available in bbkr18 at ral
tcl file includes all collections in the dataset, so won't work at ral.
The mirror procedure updates your local bookkeeping database
from
the master at SLAC.
BbkDbMirror -sn bbkr18 -ts MY_SITE -tu bfactory -tn bbkr18 --update --fix
The -ts,
-tu,
and -tn
options specify the target (ie. your
local)
site, database user, and database name. -ss,
-su,
and -sn
specify the
same for the source - the default is -ss slac -su anyuser, so we don't
need to specify those. The command can be run in a nightly cron job.
The --update
option includes changes, not just additions. You
probably always want that. The --fix
option specifies a more careful check, fixing any discrepancies found
(the check can be run without fixing, with --check
instead of --fix).
You can leave off this option for
faster running, with less database load. This is important if you run
the mirror too often (eg. more than once a day). As long as the full
check (and fix if necessary) is done at least once a week, there should
be no problem.
See the BbkDbMirror help
for more information.
The mirror procedure can run in a nightly cron job.
You should keep an eye on the cron job logfiles, since there may
be occasional problems. Many of these can be fixed automatically,
but if the problem remains on repeated runs, you may need to
make some intervention (eg. updating the schema).
Subscribe to the Site Contacts and Data Distribution HyperNews fora.
Updates to $BFROOT/bin
- in particular new bookkeeping releases - are announced in Site Contacts.
Schema
updates are announced in Data Distribution.
If you have problems with any of these procedures, you can
post your question to
the BookkeepingSOS
HyperNews forum.
http://www.slac.stanford.edu/BFROOT/www/Computing/Offline/DataDist/bbk-setup.html last modified 21st January 2008 by
Tim Adye,
<T.J.Adye@rl.ac.uk>
|