Data Distribution with the SLAC Database
You can import files using the SLAC (or any other) database.
To see what’s available, use the BbkFiles command:
% BbkFiles --dbsite=slac --dataset=AllEventsRun3
On disk (status=0)
Skim Release Stream    Components Files    Events GBytes
============ ========= ========== ===== ========= ======
12.4.0g      AllEvents E            860 120897863  491.8
12.4.0g      AllEvents HCBA         860 120897863  312.0
12.4.0h      AllEvents E            681  87246705  360.0
12.4.0h      AllEvents HCBA         681  87246705  225.6
12.4.0j      AllEvents E           2145 265955780 1152.2
12.4.0j      AllEvents HCBA        2144 265406972  695.9
============ ========= ========== ===== ========= ======
Totals                             7371 947651888 3237.6
database: 7371 (3476346311340 bytes, 3.2 TB)
--dbsite=slac
- connects to the SLAC database (this is actually the default)
--dataset=AllEventsRun3
- selects the run 3 AllEvents collections
See BbkFiles -h for a summary of the main options. BbkFiles --help lists them all.
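Before importing, it can help to estimate how much disk space a single component set needs. The following is a minimal sketch, not part of the bookkeeping tools: `bbk_component_total` is a hypothetical helper, and it assumes the BbkFiles listing has been saved to a file (here called `bbkfiles.txt`, also hypothetical) in the whitespace-separated layout shown above.

```shell
# bbk_component_total COMPONENTS: sum the Files and GBytes columns of a
# BbkFiles listing (read from stdin) for rows whose Components column
# equals COMPONENTS. Hypothetical helper; assumes the layout shown above.
bbk_component_total() {
    awk -v comp="$1" '
        # Data rows have six fields:
        # release, stream, components, files, events, GBytes
        NF == 6 && $3 == comp { files += $4; gbytes += $6 }
        END { printf "%d files, %.1f GB\n", files, gbytes }
    '
}
```

For the listing above, `bbk_component_total HCBA < bbkfiles.txt` would report 3685 files and 1233.5 GB, i.e. the three HCBA rows added together.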
Before trying to import for the first time, set up an identity file so you can connect to bbr-xfer01.
% ssh bbrdist@bbr-xfer01.slac.stanford.edu
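The same check can be made non-interactively, which is convenient in scripts. This is a sketch under assumptions: `check_noprompt` is a hypothetical helper name, the `SSH` variable is only there so another client can be substituted, and `BatchMode` and `ConnectTimeout` are standard OpenSSH client options. With `BatchMode=yes`, ssh fails instead of asking for a password.

```shell
# check_noprompt HOST: succeed only if ssh can log in to HOST without
# prompting. Hypothetical helper; set SSH to use a different ssh client.
check_noprompt() {
    # BatchMode=yes disables password prompts, so a missing or rejected
    # identity file shows up as a non-zero exit status.
    "${SSH:-ssh}" -o BatchMode=yes -o ConnectTimeout=10 "$1" true
}
```

A possible use is `check_noprompt bbrdist@bbr-xfer01.slac.stanford.edu && echo "identity file OK"`.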
If that works without prompting you for a password, then you can try the import:
% BbkImport --dbsite=slac --dataset=AllEventsRun3 \
--remote='0*' --noupdate-sql \
--ftp=bbftp --components=A --goodruns=38629 \
bbrdist@bbr-xfer01.slac.stanford.edu
04/02/17 00:52:26 1 files (10078332 bytes, 9.6 MB) selected for transfer
04/02/17 00:52:26 checking stop-files BbkImport.stop.8017 BbkImport.pause.8017 ...
04/02/17 00:52:42 get bbr-xfer04.slac.stanford.edu:/kanga/store/PR/R12/AllEvents/0003/86/12.4.0j/AllEvents_00038629_12.4.0jV01_C14.2.0bV01.01.root
(10078332 bytes, 9.6 MB onto /stage/bdata-top1)
04/02/17 00:53:18 copied 10078332 bytes (9.6 MB), 35.57 s, 2.16 Mbits/s
(51.23 s, 1.50 Mbits/s with overhead) (OK)
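The rates in the log can be reproduced from the byte count and the elapsed times. The figures are consistent with 1 Mbit being taken as 2^20 bits (just as 9.6 MB corresponds to 2^20-byte megabytes). The sketch below only repeats that arithmetic; `rate_mbits` is a hypothetical helper name.

```shell
# rate_mbits BYTES SECONDS: transfer rate in Mbits/s, with 1 Mbit = 2^20 bits.
rate_mbits() {
    awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.2f\n", bytes * 8 / secs / 1048576 }'
}

rate_mbits 10078332 35.57   # transfer only: prints 2.16
rate_mbits 10078332 51.23   # including overhead: prints 1.50
```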
--remote='0*'
- select all files on disk at SLAC. By default, BbkImport selects files marked "to import" (status=1), but since we are using the SLAC database, that isn't appropriate.
--noupdate-sql
- prevents updating the SLAC database with your local status
--ftp=bbftp
- use bbftp for the transfer. bbcp or scp (the default) can also be used.
--components=A
- select only files with A in the components list, i.e. files containing the aod component (use E for files containing esd). If --components is not specified, all components are selected.
--goodruns=38629
- import just this run (otherwise the entire dataset is imported)
bbrdist@bbr-xfer01.slac.stanford.edu
- connect via ssh as user bbrdist to bbr-xfer01. The connection is then redirected to the server that has the files.
See BbkImport -h for a summary of the main options. BbkImport --help lists them all.
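Because the selection has to be spelled out on the command line each time, importing several runs means repeating the command. A minimal sketch of a wrapper follows. `import_runs` is a hypothetical helper, it reuses only the options shown above, and it assumes (without confirmation from the BbkImport documentation) that BbkImport returns a non-zero exit status on failure.

```shell
# import_runs RUN...: call BbkImport once per run number, stopping at the
# first failure. Hypothetical wrapper around the command shown above.
import_runs() {
    for run in "$@"; do
        BbkImport --dbsite=slac --dataset=AllEventsRun3 \
            --remote='0*' --noupdate-sql \
            --ftp=bbftp --components=A --goodruns="$run" \
            bbrdist@bbr-xfer01.slac.stanford.edu || return 1
    done
}
```

For example, `import_runs 38629` repeats the single-run import above; further run numbers can be appended as extra arguments, and the order of the arguments is the order in which the runs are fetched.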
Using the SLAC database has several disadvantages:
- A network connection is required for all bookkeeping commands (e.g. BbkDatasetTcl). (It is not required for analysis jobs to run.) This is mainly an issue for laptops.
- All import file selection must be done on the BbkImport command line; there is no memory of what data is required.
- It is cumbersome to prioritise the imports.
- There is no record of which files are available locally.
- If BbkImport is rerun (e.g. after a crash, or with an expanded dataset), it will skip files that are already on disk and look OK (e.g. have the right size). However, this will take a long time if many files have already been imported.
- Local data management becomes more complex.
- There is no record of which collections are available locally.
- BbkDatasetTcl will include collections that are not yet imported, and the jobs will crash. (At the moment it does this anyway, but this will be fixed soon.)
The solution is to set up a local database.
/BFROOT/www/Computing/Offline/DataDist/bbk-slacdb.html last modified on 10th May 2004 by
Tim Adye, <T.J.Adye@rl.ac.uk>