Task Management System
The prototype version of this is being used to run the skim
production on the converted dataset.
To use the new Task Management system you should check out
BbkTaskManager sjg20031129a.
Initial Setup
You need to set up your MySQL data as described on the Data Distribution page, unless you are
doing this at a site (like SLAC) that is already set up.
You then need to initialise the tables for the Task Management system;
[antonia] ~/reldirs/tst14.1.3b > mysql -u bbrora < BbkTaskManager/stm_tableSchema.mysql
Task Setup
Before running a task you need to set it up. There are a few
ingredients that need to be ready first.
- Release Configuration
- Setup
- Task (you also need to add datasets to the task)
You should create an entry for each release you will be using. I'll
set up 14.1.3b first. Each release should be given a precedence
(which defines which release is "newer"). This is generally done by
giving each element of a release two digits and turning the letter
into the last two digits (b becomes 02). So I would do (at SLAC you
don't need the -u bbrora option);
[antonia] ~/reldirs/tst14.1.3b > BbkCreateRelease -u bbrora 14.1.3b 14010302
name : 14.1.3b
created at : 2003/11/30 05:12:37
precedence : 14010302
id : 1
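
The precedence rule described above can be sketched in a few lines of Python. This is only an illustration and is not part of BbkTaskManager: the function name release_precedence is hypothetical, and mapping a release with no trailing letter to 00 is an assumption.

```python
import re


def release_precedence(release: str) -> int:
    """Turn a release name such as '14.1.3b' into a numeric precedence."""
    match = re.fullmatch(r"(\d{1,2})\.(\d{1,2})\.(\d{1,2})([a-z]?)", release)
    if match is None:
        raise ValueError(f"unrecognised release name: {release!r}")
    major, minor, patch, letter = match.groups()
    # a -> 01, b -> 02, ...; a missing letter maps to 00 (assumption).
    letter_code = ord(letter) - ord("a") + 1 if letter else 0
    # Two digits per element, concatenated in order.
    digits = f"{int(major):02d}{int(minor):02d}{int(patch):02d}{letter_code:02d}"
    return int(digits)


print(release_precedence("14.1.3b"))  # 14010302
```

Because every element is zero-padded to the same width, comparing two precedences as integers orders the releases the same way as comparing them element by element.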
You should do this for all releases you plan to use (it only needs
doing once per release). Now we should configure a setup for the task
we want to carry out (as I discover later on, you need to give the
full path to the executable, and to the tcl file if it won't be in
the directory you run BbkSubmitJobs from);
[antonia] ~/reldirs/tst14.1.3b > BbkCreateSetup -u bbrora miniAnal BetaMiniApp MyMiniAnalysis.tcl
successfully created new configuration
name : miniAnal
maintainer : gowdy
created at : 2003/11/30 05:15:01
id : 1
associated with task : -
task id : 0
release : 14.1.3b
runs : BetaMiniApp MyMiniAnalysis.tcl
job wrapper : RELEASE/bin/<BFARCH>/jobWrapper
default queue : bfobjy
logfile : RELEASE/log/<TASKNAME>/<RUNNUMBER>/<PASS>.LOG
output : RELEASE/results/<TASKNAME>/<RUNNUMBER>/<PASS>/
Kan-configuration file : -
tcl template : RELEASE/BbkTaskManager/tclTemplate
user2 : -
ready : no
BbkCreateSetup checks that the release is defined (a fairly good
idea!). If you don't use the --release option it uses BFCURRENT,
another good idea.
Now we should set up a task;
[antonia] ~/reldirs/tst14.1.3b > BbkCreateTask -u bbrora -c miniAnal -i sjg,bbrora,bbrora -o sjg,bbrora,bbrora miniAnal
successfully created new task
name : miniAnal
maintainer : gowdy
created at : 2003/11/30 15:06:11
id : 1
current configuration : miniAnal
input datasets : -
input database : sjg,bbrora,bbrora
output stream(s) : -
output database : sjg,bbrora,bbrora
ready : no
active : no
You may want to use different input and output databases; I'm only
testing this on my laptop.
I now need to define the dataset to use as input. From the earlier
import there are three datasets defined;
[antonia] ~/reldirs/tst14.1.3b > BbkDatasetHistory --dbsite sjg --dbuser bbrora
BbkDatasetHistory: 3 datasets found:-
AllEventsRun3Conv
GridKa.32955-36172
Padova.36173-39320
I'll set up this task to process the last one of these;
[antonia] ~/reldirs/tst14.1.3b > BbkAddDataset -h sjg -u bbrora miniAnal Padova.36173-39320
successfully added dataset(s)
name : miniAnal
maintainer : gowdy
created at : 2003/11/30 15:06:11
id : 1
current configuration : miniAnal
input datasets : Padova.36173-39320
input database : sjg,bbrora,bbrora
output stream(s) : -
output database : sjg,bbrora,bbrora
ready : yes
active : no
Before submitting the job you need to set up a TCL snippet to
configure it correctly, as each application looks for different
TCL variables to be set before its own tcl is invoked. The default
one (as seen in the Setup configuration) is
BbkTaskManager/tclTemplate. See the reference below for
allowed special tokens. If you make your own you should configure the
Setup with BbkEditSetup --tcl <yourFile>. I'll now
create the jobs to run;
[antonia] ~/reldirs/tst14.1.3b > BbkCreateJobs -h sjg -u bbrora miniAnal
Argument "stream|s=s" isn't numeric in numeric gt (>) at ./bin/Linux24RH72_i386_gcc2953/BbkCreateJobs line 24.
Checking local run list and creating lookup table...
11/30 15:34:21
...done... Retrieving dse list for dataset Padova.36173-39320 from the input database...
11/30 15:34:21
..done.. checking list for dublicates...
11/30 15:34:21
..done.. creating jobs...
11/30 15:34:21
..done.
created 0 jobs.
Created 0 nJobs job(s).
successfully created new Jobs
So it figured out that I didn't have any of the files for this
Dataset in my laptop's database. I should change the Task to also use
the Karlsruhe Dataset (with BbkAddDataset -h sjg -u bbrora
miniAnal GridKa.32955-36172). Retrying the BbkCreateJobs command
above, it set up 1115 jobs;
...
1111 pending - bfobjy 50354
1112 pending - bfobjy 62507
1113 pending - bfobjy 22503
1114 pending - bfobjy 8888
1115 pending - bfobjy 6839
I can check the status thus;
[antonia] ~/reldirs/tst14.1.3b > BbkShowJobs -u bbrora --summary miniAnal
sh: qstat: command not found
No jobs in any queues... Can that be right? Try again later.
Taskname prepared submitted done ok failed superceded last update
------------------------------------------------------------------------------
miniAnal 1115 0 0 0 0 0 2003/11/30 15:39:20
Now it is time to try running a job. This will fail though as I
don't have a batch system on my laptop... I'll try submitting a
job;
[antonia] ~/reldirs/tst14.1.3b/workdir > BbkSubmitJobs -u bbrora --jobid 150 miniAnal
11/30 19:25:17 BbkTaskManager::BbkTMJob::jobFiles::1231: Error! Can't execute /home/gowdy/reldirs/tst14.1.3b/workdir/BetaMiniApp. Aborting.
11/30 19:25:17 BbkTaskManager::BbkTMJob::submit::857: Error! Can't prepare output files. Aborting.
11/30 19:25:17 BbkTaskManager::BbkTMTask::submitJobs::1537: Error! Can't submit job.
Submitted 0 job(s).
No jobs submitted.
So it looks like when you create the Setup for a Task you should give
the full path to the executable and tcl file; otherwise they are
looked for in the directory you run BbkSubmitJobs from.
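
A quick check before submitting can catch this kind of failure early. The Python sketch below is not part of BbkTaskManager; the function name check_job_files is hypothetical, and it assumes (as the error message suggests) that relative names are resolved against the directory BbkSubmitJobs is run from.

```python
import os


def check_job_files(workdir, executable, tcl_file):
    """Return a list of problems found with the executable and tcl file.

    Relative names are resolved against workdir; absolute paths are
    used as given (os.path.join keeps an absolute second argument).
    """
    problems = []
    exe_path = os.path.join(workdir, executable)
    tcl_path = os.path.join(workdir, tcl_file)
    # The executable must exist and have execute permission.
    if not (os.path.isfile(exe_path) and os.access(exe_path, os.X_OK)):
        problems.append("can't execute " + exe_path)
    # The tcl file only needs to exist and be readable.
    if not (os.path.isfile(tcl_path) and os.access(tcl_path, os.R_OK)):
        problems.append("can't read " + tcl_path)
    return problems


# Check the names given to BbkCreateSetup against the current directory.
for problem in check_job_files(os.getcwd(), "BetaMiniApp", "MyMiniAnalysis.tcl"):
    print("Error!", problem)
```

An empty list means both files were found, which is the situation that full paths in the Setup, or symlinks in the working directory, are meant to produce.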
To run locally without a batch system you can use the --local option to BbkSubmitJobs. So, this would look like;
[antonia] ~/reldirs/tst14.1.3b > BbkSubmitJobs -u bbrora --local --jobid 150 miniAnal
12/01 05:50:51 BbkTaskManager::BbkTMJob::jobFiles::1231: Error! Can't execute /home/gowdy/reldirs/tst14.1.3b/BetaMiniApp. Aborting.
12/01 05:50:51 BbkTaskManager::BbkTMJob::submit::857: Error! Can't prepare output files. Aborting.
12/01 05:50:51 BbkTaskManager::BbkTMTask::submitJobs::1537: Error! Can't submit job.
Submitted 0 job(s).
No jobs submitted.
No surprise that there is the same error. I'll make symlinks so it
can find the executable and tcl file. (I lost the output for job 150,
so I'll use job 151 below.) Here goes; I've broken up the output
so this page doesn't get too wide;
[antonia] ~/reldirs/tst14.1.3b/workdir > BbkSubmitJobs -u bbrora --local --jobid 151 miniAnal
/home/gowdy/reldirs/tst14.1.3b/workdir/RELEASE/bin/Linux24RH72_i386_gcc2953/jobWrapper \
--exec /home/gowdy/reldirs/tst14.1.3b/workdir/BetaMiniApp \
--tcl /home/gowdy/reldirs/tst14.1.3b/workdir/RELEASE/log/miniAnal/<RUNNUMBER>/job_151.tcl \
--stats /home/gowdy/reldirs/tst14.1.3b/workdir/RELEASE/log/miniAnal/<RUNNUMBER>/stats_151.dat \
--logs /home/gowdy/reldirs/tst14.1.3b/workdir/RELEASE/log/miniAnal/<RUNNUMBER>/00.LOG \
--out /home/gowdy/reldirs/tst14.1.3b/workdir/RELEASE/results/miniAnal/<RUNNUMBER>/00 \
--jobId 151 --debug
Submitted 1 job(s).
The <RUNNUMBER> element hasn't been substituted here for some
reason. That causes the job to fail straight away... so close...
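
The cause of the missing substitution isn't known, but the symptom is easy to reproduce and to test for. The Python sketch below is purely illustrative and is not how BbkTaskManager does its substitution: fill_tokens and leftover_tokens are hypothetical helpers, and the token pattern (upper-case letters between angle brackets) is an assumption based on the tokens shown in the Setup output.

```python
import re

# Tokens look like <TASKNAME>, <RUNNUMBER>, <PASS> (assumed pattern).
TOKEN = re.compile(r"<[A-Z]+>")


def fill_tokens(template, values):
    """Replace each <NAME> token that has a value; leave the others as-is."""
    def replace(match):
        name = match.group(0)[1:-1]  # strip the angle brackets
        return str(values.get(name, match.group(0)))
    return TOKEN.sub(replace, template)


def leftover_tokens(path):
    """Return any tokens that are still unsubstituted in path."""
    return TOKEN.findall(path)


template = "RELEASE/log/<TASKNAME>/<RUNNUMBER>/<PASS>.LOG"
# No value is supplied for RUNNUMBER, mimicking the output above.
path = fill_tokens(template, {"TASKNAME": "miniAnal", "PASS": "00"})
print(path)                   # RELEASE/log/miniAnal/<RUNNUMBER>/00.LOG
print(leftover_tokens(path))  # ['<RUNNUMBER>']
```

A check like leftover_tokens on the generated paths would flag the problem before the job is handed to the wrapper, rather than letting it fail straight away.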
References
There is a Quick Start guide you should probably use for now.
Stephen J. Gowdy
Last Update: 29th November 2003