Status of KanGA(Roo)
This is the status of KanGA(Roo) as of Thursday, January 27, 2000.
Introduction
What's New since Last Report?
KanGA code
Data Format
Root Conditions Database
Micro-Truth
Event Size
Versions of ROOT
Making an Analysis Executable
Communication & Documentation
Offline Production
OPR Production
Export of KanGA files offsite
How Well are we Doing on our Initial Requirements?
Manpower
Action Items for KanGA
Introduction
KanGA(Roo) is an acronym for Kind and Gentle Analysis (without Relying on Objectivity).
KanGA (formerly known as NOTMA, for Non Objectivity Tag & Micro Analysis) is the result of a BaBar task force that was formed just after the August Computing Review. The original list of requirements can be found at
http://www.slac.stanford.edu/~gautier/Requirements.ps
KanGA data persistency is based on ROOT.
What's New since Last Report?
The last report (January 13, 2000) can be found at:
http://www.slac.stanford.edu/~gautier/KanGA_status_011300.html
- The decision has been taken (by the Computing Management) NOT to run KanGA in OPR production. KanGA production will be done offline from the analboot federation. This of course has a strong impact on the project; in particular, the 8.6.0 milestone has to be redefined.
- The Conditions Database problems related to access to the beam spot parameters after 8.2.5 have been fixed; see http://babar-hn.slac.stanford.edu:5090/HyperNews/get/kanga/43.html
- A new script to browse the Root Conditions Database has been developed; see http://babar-hn.slac.stanford.edu:5090/HyperNews/get/kanga/41.html
- A script to produce KanGA files for SP2 collections has been released.
- There is a huge memory leak when writing KanGA files, which is not understood at the moment. It is not related to Beta, and appears to come from the RooOutputModule itself. This is under investigation.
KanGA code
The KanGA core code can be found in packages with the "Roo" TLA:
RooUtils
RooModules
RooScribes
RooSequences
RooCond
and a few others
and in packages containing KanGA persistent classes :
BtaDataR
TagDataR
StdHepDataR
EidDataR
As much as possible, the architecture of BdbConverters
has been reused, although a great deal of code had
to be re-written and adapted for ROOT.
For more information about the architecture, see :
http://www.slac.stanford.edu/~salnikov/Root/ScribesAndModules.html
Data Format
The data is written in split mode (for EID, TAG, AOD and TRU) with ROOT compression level 2.
Split mode was chosen for two main reasons:
- performance when not reading ALL the quantities
- possibility of reading files in interactive mode in ROOT
For more information on the tag & micro-dst format,
see :
/BFROOT/www/doc/workbook/nanomicro/v8.3/
The data in KanGA files is not packed as in BtaDataP/Objy.
Data compression is built into ROOT.
The TAG is fully split and expandable.
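The performance argument for split mode can be illustrated with a toy model. The sketch below is not ROOT I/O and does not reflect the actual KanGA branch layout; the field names (`px`, `py`, `pz`, `energy`) and the helper functions are hypothetical. It only compares how many bytes must be read to extract one quantity when events are stored as whole records versus one buffer per quantity.

```python
import struct

# Hypothetical per-event quantities; the real KanGA branches differ.
FIELDS = ("px", "py", "pz", "energy")


def write_unsplit(events):
    """Store each event as one record holding all quantities."""
    return [struct.pack("<4f", *(ev[f] for f in FIELDS)) for ev in events]


def write_split(events):
    """Store each quantity in its own buffer (one 'branch' per quantity)."""
    n = len(events)
    return {f: struct.pack(f"<{n}f", *(ev[f] for ev in events)) for f in FIELDS}


def read_energy_unsplit(records):
    """Every full record has to be read to get at a single quantity."""
    index = FIELDS.index("energy")
    values = [struct.unpack("<4f", rec)[index] for rec in records]
    return values, sum(len(rec) for rec in records)


def read_energy_split(branches):
    """Only the buffer of the requested quantity is read."""
    buf = branches["energy"]
    values = list(struct.unpack(f"<{len(buf) // 4}f", buf))
    return values, len(buf)


events = [
    {"px": 0.1 * i, "py": 0.2 * i, "pz": 0.3 * i, "energy": 1.0 + i}
    for i in range(1000)
]
unsplit_values, unsplit_bytes = read_energy_unsplit(write_unsplit(events))
split_values, split_bytes = read_energy_split(write_split(events))
print(unsplit_bytes, split_bytes)
```

In this toy layout, reading one of four quantities touches a quarter of the bytes; the real gain depends on the number of quantities and on compression.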
Root Conditions Database
A Root Conditions Database was developed.
It currently contains the beam conditions and the Drift Chamber dE/dx Bethe-Bloch parametrization.
Adding new types and rebuilding the Root Conditions Database is fairly easy.
Documentation can be found at
http://www.slac.stanford.edu/~davidk/RooCond/
Micro-Truth
There is full access to the truth in a KanGA MC analysis. The matching is not persistent and therefore a chi2-based associator must be used.
Work is ongoing to adapt the micro-truth format and make it available in Objectivity as well. The code has been released, but tests are still needed before introducing it in SP2 production (manpower needed).
Event Size
The size of an isPhysics event in data with KanGA 8.3.2 was 2.3 kBytes on average, including the TAG.
The size of a multi-hadron event is of order 4 kBytes; the size of a mu+mu- event is 0.8 kByte.
For Monte-Carlo data, micro-truth adds 4 to 5 kBytes to hadronic events.
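These averages allow a rough estimate of the disk space needed for a given sample. The sketch below is only arithmetic on the numbers quoted above; the function name, the dictionary keys, and the decimal convention (1 GByte = 10^6 kBytes) are assumptions made for illustration. The example uses the roughly 8,000,000 isPhysics events quoted for the 8.3.2 production.

```python
# Average event sizes in kBytes, as quoted above for KanGA 8.3.2.
AVG_SIZE_KB = {
    "isPhysics": 2.3,      # average data event, including the TAG
    "multi-hadron": 4.0,   # "of order 4 kBytes"
    "mu+mu-": 0.8,
}

KB_PER_GB = 1.0e6  # assumption: decimal units


def estimate_gbytes(n_events, kind="isPhysics"):
    """Return a rough sample size in GBytes for n_events of the given kind."""
    return n_events * AVG_SIZE_KB[kind] / KB_PER_GB


# Example: about 8,000,000 isPhysics events.
total_gb = estimate_gbytes(8_000_000)
print(f"about {total_gb:.1f} GBytes")
```

This is an order-of-magnitude estimate only; actual file sizes also depend on the event mix and on file overhead.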
Versions of ROOT
The version of ROOT we currently use (which fixes a bug at reading time and allows full splitting of the AOD) is 2.23-09. This is the last development version before the next released version.
ROOT 2.23-09 has been installed at SLAC.
We have had problems with ROOT, but the fact that the code is accessible makes it easy to find and fix the bugs, and the ROOT team is in general responsive to bug reports and reacts promptly.
In the future, we will try to stick to the same version of ROOT as long as a new version is not needed for technical reasons, in order to avoid the burden of installing new versions of ROOT at remote sites. We will avoid using development versions as much as possible.
Making an Analysis Executable
gmake BetaUser.all BETAOPTION=BetaNotmaRoot
Beta and the Beta Tools at the micro level (general tools, vertexing, Pid, composition, tagging) are fully available with KanGA data, and switching to KanGA is transparent to the user. The physics sequences have been successfully tested with KanGA data, including ALL the Composite Selectors, the Good Track selectors and the Pid Selectors at the micro level, as well as other Beta Tools (Cornelius++ etc.). Only the Tag modules and some PidSelectors acting on full BtaCandidates are not usable.
Different recipes are available from the Physics How-To
Page maintained by Ray Cowan :
/BFROOT/www/Physics/Analysis/Howto/index.html
The recipes will be updated regularly for each new production
release.
Communication & Documentation
A HyperNews forum has been set up:
http://babar-hn.slac.stanford.edu:5090/HyperNews/get/kanga.html
A tutorial was given at the October Collaboration Meeting, based on release 8.2.11.notma:
http://persil.lbl.gov/~skluth/notma_tutorial.html
Offline Production
KanGA production in the short term consists of offline jobs which read an Objy collection and write a corresponding file of events in KanGA(Roo) format. See
http://www.slac.stanford.edu/~davidk/Kanga/OfflineProd.html
Production with 8.2.11.notma
We have done tests of production with the 8.2.11.notma version of the Bdb to KanGA application.
The KanGA data generated by these tests consists of about 1M events spanning runs 9441-9516 (end of September), originally processed with OPR version 8.2.5f.
There is a list of the KanGA event files in:
$BFROOT/kanga/Production/log/master.log
Each line in this file gives a KanGA file name and the number of events it contains. The file names are relative to:
$BFROOT/kanga/EventStore/
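A list in this form is easy to process with a short script. The sketch below assumes that each line holds the file name and the event count separated by whitespace, which is not confirmed here; the sample file names are hypothetical, and the `$BFROOT` prefix is kept as a literal string rather than expanded.

```python
import posixpath

EVENT_STORE = "$BFROOT/kanga/EventStore/"


def parse_master_log(lines, event_store=EVENT_STORE):
    """Return (full_path, n_events) pairs from master.log-style lines.

    Assumed line format: "<relative file name> <number of events>".
    """
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, count = line.rsplit(None, 1)
        entries.append((posixpath.join(event_store, name), int(count)))
    return entries


# Hypothetical sample lines; real file names will differ.
sample = [
    "hypothetical/run9441.root 12000",
    "hypothetical/run9442.root 8500",
    "",
]
entries = parse_master_log(sample)
total_events = sum(n for _, n in entries)
for path, n in entries:
    print(path, n)
print("total events:", total_events)
```

On a real installation the lines would be read from `master.log` itself, for example with `open(...)`, once the actual field layout has been checked.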
There are also KanGA files, not produced with the official production scripts, corresponding to the Padova runs and 8.2.5 hadron skims. These amount to about 700,000 hadronic events, which can be found in
$BFROOT/kanga/data/
See
http://babar-hn.slac.stanford.edu:5090/HyperNews/get/kanga/3.html
The files produced by the special release 8.2.11.notma will not be readable with the 8.3.x releases, but can still be read within 8.2.11.notma. They use the old micro-dst format, of course.
Production with 8.3.2/ROOT 2.23-09
The production is based on an executable produced in release 8.3.2 plus the latest tags for Roo packages.
About 8,000,000 isPhysicsEvents produced with OPR 8.2.5c,d,f,i and available in the analboot federation have been converted to KanGA (runs 9000-10599), as well as 210,000 isPhysicsEvents produced with OPR 8.3.1a (runs 10398-10531). Warning: some of these KanGA ROOT trees have been found to be corrupted. We are investigating.
A script called kangaruns, written by Tim Adye, is available. It produces tcl files with the list of KanGA collections for a given range of run numbers. Documentation can be found on the Physics Analysis Script page:
/BFROOT/www/Physics/Tools/Scripts/intro.html
KanGA files have also been produced for certain skims (e.g. D*) and for SP2 collections.
News related to production is posted on the KanGA HyperNews forum.
OPR Production
- The interface code and infrastructure for running KanGA in OPR production have been developed and tested (S. Gowdy).
- The (semi-)automatic merging of KanGA files has been deployed. It re-uses perl scripts used for the merging of hbook files.
- However, KanGA is not considered robust enough to be deployed in 8.4.1 OPR. Manpower is needed to solve the technical problems.
Export of KanGA Data Offsite
The KanGA 8.3.2 data at SLAC (25 GBytes) have been successfully exported to RAL.
For network exports, the easiest approach is to start with existing tools for mirroring directory trees that handle incremental copying. Tim Adye has been using "rsync" (over ssh) for copying the data from SLAC. For more details on data export issues, see:
/BFROOT/www/Computing/Offline/DataDist/kanga_export.html
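As an illustration of the rsync-over-ssh approach, the sketch below assembles an rsync command line for mirroring a remote directory tree. It only builds and prints the command; it does not run it. The host name and directory paths are hypothetical placeholders, and the exact options used for the RAL export are not documented here, so the flags shown (`-a` for archive mode, `-z` for compression, `-e ssh` for transport, `--dry-run` for a trial run) are just a reasonable starting set.

```python
import shlex


def build_rsync_command(remote_host, remote_dir, local_dir, dry_run=False):
    """Build an rsync-over-ssh command that mirrors remote_dir into local_dir."""
    cmd = ["rsync", "-a", "-z", "-e", "ssh"]
    if dry_run:
        cmd.append("--dry-run")  # list what would be copied without copying
    # A trailing slash on the source copies the directory contents, so that
    # repeated runs update local_dir incrementally.
    source = f"{remote_host}:{remote_dir.rstrip('/')}/"
    cmd += [source, local_dir]
    return cmd


def format_command(cmd):
    """Return the command as a shell-quoted string, for logging or review."""
    return " ".join(shlex.quote(part) for part in cmd)


# Hypothetical host and paths, for illustration only.
command = build_rsync_command(
    "remote.example.org", "/path/to/kanga/EventStore", "./EventStore/", dry_run=True
)
print(format_command(command))
```

Running with `dry_run=True` first is a cheap way to check which files an incremental copy would transfer.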
How Well are we Doing on our Initial Requirements?
| Access to EventID, TAG, AOD as for Objy | Yes |
| Support (including documentation) | Yes (can always be better) |
| Production of KanGA files in OPR | No, not deployed |
| Easy means for data distribution | Yes, tools exist |
| micro analysis without Objy server | Yes |
| compatibility with BaBar framework | Yes |
| run on BaBar supported platforms | Yes |
| version control thru CVS/SRT | Yes |
| reasonable compile/link time | Yes |
| avoid duplication of code | Yes |
| support for ROOT optional at remote site | Yes |
| easy access to RAW/REC for selected evts | No, tools needed |
| no link to Objectivity libraries | Yes |
| no link to subsystem libraries | Yes |
| filter mode, read TAG then AOD | Yes |
| reasonable init time (<6s) | Yes |
| ability to write-out selected events | Yes |
| ... with additional data | No (?) |
| access to micro version of truth (MC) | Yes |
| access to limited set of Cond Data | Yes |
| no dependency with subsystem specifics | Yes |
| tools for offline production of KanGA | Yes |
Manpower
The KanGA effort is VERY SHORT of manpower, and without a prompt reinforcement of the task force, the success of the project is in jeopardy.
Since KanGA files are going to be created as part of OPR production, there is a need for a person (institution?) to take responsibility for KanGA/BaBar code and support of the ROOT I/O system in BaBar executables (this may include debugging of ROOT code and interaction with the ROOT team). As requested at the ATB, a preliminary charge for this job has been written (Bob). This person would probably have to be full time on the job in the first weeks, and preferably at SLAC.
Recently, two of our key developers have left:
Andrei Salnikov, who did a terrific job in developing the ROOT scribes and Converters, is back in Siberia to finish his thesis.
Stefan Kluth, who is the prime developer of the Root persistent packages and the KanGA modules & sequences, has found a position at the MPI in Munich.
Andrei and Stefan have not been replaced yet.
David Kirkby, of Stanford U, has been extremely active in many aspects of KanGA; in particular he developed the Root Conditions Database and the scripts for the KanGA production. He is still working for the task force, but will be switching more and more to physics analysis as co-convener of a B-mixing group.
Stephen J. Gowdy, of LBNL, is of great help, in particular in setting up SoftRelTools for ROOT, the schema evolution, the sequences, his general expertise in BaBar Computing, etc. He is now working on one of the key aspects of the project: integration of KanGA into OPR production.
Urs Langenegger, of SLAC, is in charge of installing the versions of ROOT at SLAC.
Marcel Kunze, of the Bochum Group, is managing the PAF aspect of the project. In particular he is working on the KanGA+PAF+Beta design for interactive analysis, known as Option 7. Marcel is also working on problems of merging of ROOT files in OPR.
Matthias Steinke is also developing a non-Objectivity non-Fortran representation of the magnetic field.
Roland Waldi, Bernard Spaan, Leif Wilden, Thorsten Brandt & Klaus Schubert, of the Dresden Group, are working on a data format common to KanGA and PAF, as well as on the so-called Option 7. The aim is to have a KanGA+PAF+Beta+BetaTools fully compatible solution by January 15th.
Rolf Dubitzky, of Dresden, is working on making the matching with MC truth persistent (KanGA and Objy).
Jens Brose, of Dresden, has agreed to take over the offline production of KanGA files.
Tim Adye, Ulrike Egede & Paul Dauncey, of RAL and Imperial College, are working on data export tools.
Gautier Hamel de Monchenault is coordinating the effort; he is working on aspects of the project related to Beta and its design, on micro-truth, and is helping in offline production.
Action Items for KanGA
Short Term (in progress)
- schema evolution to follow recent micro-dst changes
- find and fix memory leak problems related to the RooInputModule
- prepare/update user-friendly recipes for running on KanGA data and MC files
- prepare an introductory Web page on KanGA for new users
- update offline production tools to work with the latest release, with SP2 and with Objy skims as input
- develop the offline production strategy
- survey remote sites to understand their expectations and readiness with respect to KanGA
- standardize methods to validate KanGA data and benchmark KanGA performance
- develop/update user tools to browse available KanGA data and prepare tcl files
Longer Term
- start routine benchmark and data validation
- find manpower (with ROOT expertise)
- find package coordinators to replace Andrei Salnikov and Stefan Kluth
- assign responsibilities/plan for maintaining recipes & docs
- implement offline production plan
- assess status of KanGA in OPR and estimate remaining work required
This page is maintained by Gautier Hamel de Monchenault