"Task Management"
Status demo - Setting up tasks.
Introduction:
For an outline of the 'Task Management' see http://www.slac.stanford.edu/~roethel/bookkeeping/TaskManagement/.
This demo is meant to demonstrate how to set up (configure) a task. The steps
to configure a task include:
- setting up your job environment (executable, tcl-file, output and logfile
locations etc.).
-> defining a task setup.
- if necessary, registering the software release with the database (if this release
is not registered already).
-> registering a software release.
- defining the input datasets and output streams.
-> defining a task.
Glossary:
- Task: A task is used here to define the main objective
of the processing job. 'Oct. 2003 SP skims' could be an example of a task.
A task then keeps information on what (i.e. which datasets) and how (i.e.
the common configuration) the data should be processed. Very roughly a task
can be compared to a skim job in SkimTools, with the extension that a task
can change the configuration (but can only have one configuration at any time).
- Dataset: This is easy - it's any dataset in the sense
of (and as defined in) the new bookkeeping model.
- Configuration/Setup: The configuration stores all the information
to process the data, i.e. the executable, the logfile path, the output path,
the software release etc.
- Jobs: A job is loosely defined in the sense of a process,
i.e. it resembles the actual job that will be submitted to the batch queue.
This is different from the SkimTools definition of a skim job, but I think
it's more coherent with the general use of the term job. A job stores the
specific information that is required to run a process (in contrast to the
configuration that stores the common settings that apply to all jobs), e.g. the input
collection, the output collections, the processing queue (if different from
the default) etc.
- TaskManager: The interface between user and the skim task
management. This additional layer was added to be able to extend the application
with a graphical user interface.
Running the Demo (BbkTaskManager V00-00-00):
runDemo is a shell script that contains the example listed here.
Set up the environment:
- > cd ~bbrskim/12.5.3c
- > srtpath <cr> <cr>
It is recommended to reset the demo first by running resetDemo.pl. This
will erase any entries that might still exist from a previous run of
runDemo. To run the demo from the shell script instead of cutting and pasting
the commands, type runDemo. After finishing the demo, please run
resetDemo.pl again so that others can try it.
Registering and listing database entries for the software release used:
- Listing existing entries:
roethel@noric05> listRelease.pl --user aforti --db aforti --host slac
id  name     created              precedence
--------------------------------------------------------------------------
1   12.5.3c  2003/10/27 17:32:20  12050303
- Look at details (well, there aren't any yet, but later we might add
a 'description' column where users can add comments to the releases they created.
This might be valuable since general analyses use different sets of tags
to override the default tags).
roethel@noric05> listRelease.pl --user aforti --db aforti --host slac
id  name     created              precedence
--------------------------------------------------------------------------
1   12.5.3c  2003/10/27 17:32:20  12050303
roethel@noric05> listRelease.pl --user aforti --db aforti --host slac 12.5.3c
name : 12.5.3c
created at : 2003/10/27 17:32:20
precedence : 12050303
id : 1
- Create a new entry
roethel@noric05> createRelease.pl --user aforti --db aforti --host slac 12.6.0 12060000
name : 12.6.0
created at : 2003/10/28 12:32:26
precedence : 12060000
id : 2
- ... is it there...
roethel@noric05> listRelease.pl --user aforti --db aforti --host slac
id  name     created              precedence
--------------------------------------------------------------------------
1   12.5.3c  2003/10/27 17:32:20  12050303
2   12.6.0   2003/10/28 12:32:26  12060000
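The precedence values in the listing look like a packed form of the release name (12.5.3c gives 12050303, 12.6.0 gives 12060000). The following is a minimal sketch of one encoding that is consistent with these two examples. It is an assumption inferred from the listing, not necessarily the rule the tools use; createRelease.pl takes the precedence as an explicit argument. The function name release_precedence is hypothetical.

```python
import re


def release_precedence(name):
    """Pack a release name such as '12.5.3c' into an integer.

    Assumed layout (inferred from the two examples above only):
    major number first, then two decimal digits each for the minor
    number, the patch number and the letter suffix (a=1, b=2, ...;
    no suffix = 0).
    """
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)([a-z]?)", name)
    if match is None:
        raise ValueError("unexpected release name: %r" % name)
    major, minor, patch = (int(match.group(i)) for i in (1, 2, 3))
    suffix = match.group(4)
    letter = ord(suffix) - ord("a") + 1 if suffix else 0
    return ((major * 100 + minor) * 100 + patch) * 100 + letter


print(release_precedence("12.5.3c"))
print(release_precedence("12.6.0"))
```

With an encoding like this, comparing two precedence values orders the releases the same way as comparing their version numbers, which is presumably why the column exists.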
Defining a setup:
Setup parameters can be changed as long as a setup is not registered to a task.
Any setup can only be registered to one task.
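The two rules above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the class Setup and its methods edit and register are hypothetical names, and the real tools store this state in the database.

```python
class Setup:
    """Hypothetical stand-in for a setup record."""

    def __init__(self, name, release):
        self.name = name
        self.release = release
        self.task = None  # name of the task this setup is registered to

    def edit(self, release):
        # Rule 1: parameters may only change while the setup is unregistered.
        if self.task is not None:
            raise RuntimeError(
                "setup %s is registered to task %s and cannot be edited"
                % (self.name, self.task)
            )
        self.release = release

    def register(self, task_name):
        # Rule 2: a setup can be registered to one task only.
        if self.task is not None:
            raise RuntimeError(
                "setup %s is already registered to task %s"
                % (self.name, self.task)
            )
        self.task = task_name


setup = Setup("MyTestSetup-1", "12.5.3c")
setup.edit(release="12.6.0")   # allowed: not yet registered
setup.register("MyFirstTask")  # from now on the setup is frozen
```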
- list existing setups:
roethel@noric05> listSetup.pl --user aforti --db aforti --host slac
id  name         created              release  task
--------------------------------------------------------------------------
1   TestSetup-1  2003/10/27 17:41:57  12.5.3c
3   TestSetup-2  2003/10/27 17:47:57  12.5.3c
4   TestSetup-3  2003/10/27 17:49:40  12.5.3c
- create a new setup:
roethel@noric05> createSetup.pl --user aforti --db aforti --host slac MyTestSetup-1 myExec run.tcl
successfully created new configuration
name : MyTestSetup-1
maintainer : roethel
created at : 2003/10/28 14:00:08
id : 7
associated with task : -
task id : 0
release : 12.5.3c
runs : myExec run.tcl
default queue : bfobjy
logfile path : /u/br/roethel/devel/12.5.3c/log/<TASKNAME>/<RUNNUMBER>/<PASS>.LOG
output path : /u/br/roethel/devel/12.5.3c/results/<TASKNAME>/<RUNNUMBER>/<PASS>/<STREAM>-micro.root
optional configuration file : /u/br/roethel/devel/12.5.3c/workdir/.bbkTMConfig
ready : no
Placeholders <TASKNAME>, <RUNNUMBER> etc. will be replaced by
the actual values when jobs are created/submitted. Keeping the task name and
the run number in the logfile and output file paths guarantees that the files are unique.
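The placeholder substitution can be sketched as follows. This is a minimal illustration, not the Task-Manager's actual code: the function expand_placeholders is a hypothetical name, and the run number and pass values are made up for the example.

```python
import re


def expand_placeholders(template, values):
    """Replace each <NAME> token in template with values['NAME']."""

    def substitute(match):
        key = match.group(1)
        if key not in values:
            # Fail loudly rather than leave a literal <NAME> in a path.
            raise KeyError("no value for placeholder <%s>" % key)
        return str(values[key])

    return re.sub(r"<([A-Z]+)>", substitute, template)


log_template = "/u/br/roethel/devel/12.5.3c/log/<TASKNAME>/<RUNNUMBER>/<PASS>.LOG"
# Hypothetical job values, chosen only for illustration.
job_values = {"TASKNAME": "MyFirstTask", "RUNNUMBER": 12345, "PASS": 1}
log_path = expand_placeholders(log_template, job_values)
print(log_path)
```

Because the task name and run number end up in the path, two jobs that differ in either value cannot write to the same file.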
But, oops: in my example the setup was created with the default (current)
release 12.5.3c. I want to use 12.6.0, so I ...
- ... edit a setup (editSetup.pl --help gives the full list of options):
roethel@noric05> editSetup.pl --user aforti --db aforti --host slac --release 12.6.0 MyTestSetup-1
successfully edited setup.
name : MyTestSetup-1
maintainer : roethel
created at : 2003/10/28 14:44:33
id : 7
associated with task : -
task id : 0
release : 12.6.0
runs : myExec run.tcl
default queue : bfobjy
logfile path : /u/br/roethel/devel/12.5.3c/log/<TASKNAME>/<RUNNUMBER>/<PASS>.LOG
output path : /u/br/roethel/devel/12.5.3c/results/<TASKNAME>/<RUNNUMBER>/<PASS>/<STREAM>-micro.root
optional configuration file : /u/br/roethel/devel/12.5.3c/workdir/.bbkTMConfig
ready : no
Defining a task:
The easiest way to define a task is to associate a setup with a list of output
streams.
- Create a task:
roethel@noric05> createTask.pl --user aforti --db aforti --host slac -c MyTestSetup-1 -s "AllEvents,Jpsitoll" MyFirstTask
successfully created new task
name : MyFirstTask
maintainer : roethel
created at : 2003/10/28 15:19:14
id : 3
current configuration : MyTestSetup-1
input datasets : -
output stream(s) : AllEvents,Jpsitoll
ready : no
active : no
- Now we need to add the input datasets. This has to be done separately from
creating a task since it is not guaranteed that the Task-Manager and the (public)
run-database will be in the same physical database. Therefore we need to provide
connection information for the database the dataset resides in (yes, I know:
using the per-site defined 'public' database as the default will simplify this):
roethel@noric05> addDataset.pl --user aforti --db aforti --host slac MyFirstTask run1-dataset
successfully added dataset(s)
name : MyFirstTask
maintainer : roethel
created at : 2003/10/28 15:19:14
id : 3
current configuration : MyTestSetup-1
input datasets : run1-dataset
output stream(s) : AllEvents,Jpsitoll
ready : yes <--- our task is ready now!
active : no
- Let's see what we've got so far:
roethel@noric05> listTask.pl --user aforti --db aforti --host slac
name         created              setup          description
--------------------------------------------------------------------------
MyFirstTask  2003/10/28 15:19:14  MyTestSetup-1
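In the transcripts above, the task's 'ready' flag changes from no to yes once an input dataset has been added. A minimal sketch of a rule that would produce this behaviour is given here. It assumes that a task is ready when it has a setup, at least one input dataset and at least one output stream; the real criterion may include more checks. The class Task and its fields are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical stand-in for a task record."""

    name: str
    setup: str = ""
    input_datasets: list = field(default_factory=list)
    output_streams: list = field(default_factory=list)

    @property
    def ready(self):
        # Assumed rule: all three ingredients must be present.
        return bool(self.setup and self.input_datasets and self.output_streams)


task = Task(
    "MyFirstTask",
    setup="MyTestSetup-1",
    output_streams=["AllEvents", "Jpsitoll"],
)
ready_before = task.ready            # no input datasets yet
task.input_datasets.append("run1-dataset")
ready_after = task.ready             # all three ingredients present
print(ready_before, ready_after)
```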
Next things to do...
Task-Management Framework:
- 'Synchronize' the Task-Manager with the offline. The offline now has better
methods to pass 'on-the-fly' information to the executable.
- Create the (simple) interface to the batch system. Straightforward, since
a similar system is already in use for skimming.
- Testing and refining (adding options).
User Interface:
Using the Task-Manager from the command line is possible, but is not meant
to be the default way of using it. In particular, the standard way to create
and edit a task should be a menu-driven interface (using a terminal
or a graphical user interface) with built-in help functions etc.
- Adding and testing further commands (in particular catching 'nonsense' input).
- Adding a terminal driven menu (started).
- Adding a graphical user interface.