Workbook for Offline Users: How to
generate a set of
CM2 ntuples using SimpleComposition and BtaTupleMaker
The tutorial in this section was written by Chris Hearty,
15-Nov-2004; updated to analysis-23 on 8-Feb-2005. (The earlier
analysis-22 version is here). It
is intended to be
run with a CM2 analysis job - see CM2
documentation for further details of CM2 (the rest of the workbook
deals with pre-CM2 analysis).
This tutorial will describe how to generate a set of analysis ntuples
using SimpleComposition and BtaTupleMaker.
These were written by Chris Roat and Chih-hsiang Cheng respectively as
a way of simplifying physics analysis on BaBar. They can be used for
many publication-quality BaBar analyses.
To do your own analysis, you will need to start from an
existing skim,
or produce your own one. There are something like 120 skims available,
so you may find that an existing one is suitable. Ask your convener,
or others in your AWG. You should get your AWG and its conveners
involved at the earliest possible stage, and make regular reports
throughout your analysis. My observation is that analyses that don't
interact regularly with their AWGs never get finished.
In general, the ntuples you make should be stored on
your own disk
resources - your laptop, desktop, or university, for example. Keep two
copies - this type of disk space is cheap. (I just bought a 120 GB
external HD suitable for backups for $70 after rebate). On the other
hand, disk space at SLAC is extremely expensive (~$2k for the same
disk), and limited. (It is also way better disk.)
It is easy to produce large sets of ntuples with these
tools. If you
do this, you will be frustrated by how long it takes to make each and
every plot when analyzing them on your laptop. Work a bit with signal
and background MC to tighten your selection before making all your
ntuples. For an exclusive B decay, I would aim for a few tens of GB
at most (including both data and MC); my last couple have been a few
GB.
In this tutorial, we will start with a simple case, B+
-->
Jpsi K+, with Jpsi --> mu+ mu-. This way we can
go
through the full process without getting too bogged down in the
complexities of Composition. We will then go back and do a full B
--> Jpsi pi+ pi- K analysis, including
charged and
neutral modes, and various intermediate resonances, in the second part
of this tutorial.
Before doing this workbook section, you should work
through the
introductory sections. I would suggest Logging
In, A
Quick Tour of the BaBar Offline World, Information
Resources, Unix
for BaBar, Event
Store: BaBar Database Summary, Frameworks
- the Enviroment for Running, and Batch
Processing. There are other sections on Paw (1, 2)
and Root(1, 2, 3),
which may be useful.
This tutorial is written assuming you are running on a yakut
machine at SLAC. It is often faster to work at a
different Tier A site, but you will need to use different commands for
the batch system and for scratch space. It is based on analysis-23
and release 14. (Analysis-22 and analysis-23 are built to
run on Red hat 7.2 and Scientific Linux respectively; otherwise, they
are identical). Commands that the user might enter are written in bold, blue to make them
easier to find
on later browsing.
Note you should ssh yakut.slac.stanford.edu.
Don't
ssh to a specific yakut - different ones have different operating
systems. If you get complaints from ssh about changing IP addresses,
you need to install the full list of SLAC unix sites into your .ssh/known_hosts
file; see http://www.slac.stanford.edu/comp/unix/ssh.html.
If you get complaints about "keys of a different kind are already known
for this host", try ssh -1 yakut.slac.stanford.edu.
Here are a few useful links:
OK, let's begin!
Build your executable in analysis-23.
You will need about 450 MB of disk space for your
release,
including libraries and executable. If you don't have that much, put
in the request before you begin to AFS-req.
Now type:
newrel -t analysis-23 Rel23
cd Rel23
srtpath
<enter> <enter>
cond14boot
addpkg workdir
gmake workdir.setup
This will put the library and executable in your own disk
space. You should definitely do this when you are producing
ntuples for a real analysis. If you are just working through a
tutorial for a few days, you could put them in an AFS build scratch
area (e.g. /afs/slac/g/babar/build/h/hearty)
newrel -s my-build-scratch-dir -t analysis-23 Rel23
If you don't have such an area, run the script bbrCreateBuildArea
to create it. You cannot
use this area
for ntuples or log files or such. Conversely, you probably don't want
to use your normal NFS scratch space (e.g. $BFROOT/work/h/hearty)
for your release
because of poor
performance if someone else is heavily using the same disk space by,
for example, writing ntuples from 100 batch jobs simultaneously.
If you use a scratch area, your executable will vanish
six
days after you create it, and your jobs will all crash, and
you won't know why. This is probably OK, if you are just working
through the tutorial.
We are now ready to add tags. To find out which ones you
should add, check the Extra
Tags web site. At the time this tutorial is being written, the
relevant tags are:
(Tue 13:45) hearty@yakut05 $ showtag
BetaCoreTools V00-02-12-02
BetaMini eac-041216
BetaMiniData eac-041216
BetaMiniModules V00-05-00
BetaMiniSequences V00-11-04
BetaPid V00-01-89-01
BtaMiniDataK eac-041216
BtaTupleMaker V00-01-06
EmcDataK V01-00-05
KanBase V01-00-09-02
SimpleComposition V00-00-30
UsrDataK V01-01-00
VtxCascade V02-03-04
VtxFitter V00-11-06
VtxTreeFitter V00-02-07
workdir V00-04-16
Add these tags with
addpkg BetaCoreTools V00-02-12-02
addpkg BetaMini eac-041216
and so forth. Some of these are the tag in the release, but they need
to be
recompiled. Why didn't we add FilterTools, which is also listed?
Because the comment refers to problems with SkimMiniApp,
which is not the executable we are using. Remember - don't just copy
the tags listed here - check the web page for the latest ones.
Before compiling, check your disk space: fs listquota. You
will need 450 MB.
bsub -q bldrecoq -o lib.log gmake lib
or
gmake lib
Afterwards, check that you actually got library files for each of
these, and that the dates and sizes are reasonable. If anything is
funny, check for error messages at the end of the compilation for that
package. If you compile a second time, be sure to use a different name
than lib.log, or else delete the file first.
(Tue 14:33) hearty@yakut05 $ ls -l lib/$BFARCH
total 277947
-rw-r--r-- 1 hearty ec 14169764 Feb 7 16:41 libBetaCoreTools.a
-rw-r--r-- 1 hearty ec 9098874 Feb 7 16:42 libBetaMini.a
-rw-r--r-- 1 hearty ec 6892516 Feb 7 16:42 libBetaMiniData.a
-rw-r--r-- 1 hearty ec 11061312 Feb 7 16:43 libBetaMiniModules.a
-rw-r--r-- 1 hearty ec 4574296 Feb 7 16:43 libBetaMiniSequences.a
-rw-r--r-- 1 hearty ec 49641116 Feb 7 16:47 libBetaPid.a
-rw-r--r-- 1 hearty ec 24112518 Feb 7 16:49 libBtaMiniDataK.a
-rw-r--r-- 1 hearty ec 23619954 Feb 7 16:50 libBtaTupleMaker.a
-rw-r--r-- 1 hearty ec 12823812 Feb 7 16:51 libEmcDataK.a
-rw-r--r-- 1 hearty ec 7432732 Feb 7 16:52 libKanBase.a
-rw-r--r-- 1 hearty ec 31828710 Feb 7 16:54 libSimpleComposition.a
-rw-r--r-- 1 hearty ec 17612228 Feb 7 16:55 libUsrDataK.a
-rw-r--r-- 1 hearty ec 46760094 Feb 7 17:02 libVtxCascade.a
-rw-r--r-- 1 hearty ec 14308576 Feb 7 17:03 libVtxFitter.a
-rw-r--r-- 1 hearty ec 10670234 Feb 7 17:04 libVtxTreeFitter.a
drwxr-xr-x 19 hearty ec 2048 Feb 7 16:34 templates/
Before linking, check that you have at least 170 MB disk space. A typical problem is that
your disk fills up and your binary doesn't get created so that when
you run, you get the binary of the same name in the release, which
does not have the tags we just added. Beware also of core files which
can be created from an unsuccessful compilation or failure of a job to
run - these can take up vast quantities of space and may need to be
deleted to carry on. BtaTupleMaker is our analysis
executable (it is the code that creates and fills the ntuple).
bsub -q bldrecoq -o bin.log gmake BtaTupleMaker.bin
or
gmake BtaTupleMaker.bin
Afterwards, check the size and date of the binary:
(Tue 14:33) hearty@yakut05 $ ls -l bin/$BFARCH
total 160307
-rwxr-xr-x 1 hearty ec 164153744 Feb 7 17:09 BtaTupleApp*
That is it - unless new problems are found, and new tags created, you
will not have to compile/link code again to analyze Runs 1-4 data.
Recall that every time you log on, you need to run srtpath
and cond14boot from your release directory (i.e., Rel23).
Analysis Code - General
Our core analysis code (the part used for every job we run)
consists of a single tcl file. There is an additional small
tcl file (a "snippet") that specializes the code for each
job, setting file names and so forth. There are four
components to this core code: a general part that is the
same for everyone, a tag-filter specific to this analysis,
the SimpleComposition section, and the ntuple-dumping
(BtaTupleMaker) part. These will be the topics of following
sections. Here is the code we will have at the end of the initial
section (note the non-standard file name and extension, Analysis-Simple.tcl.txt.
copy this file to your workdir and rename it Analysis.tcl.
Let's go through the first part of Analysis.tcl.
Quotes from this
file are
shown in green to distinguish them from
sample output, commands that you might enter, etc.
#..Analysis.tcl
# # Main tcl file for B+ --> Jpsi K+ [Jpsi --> mu+ mu-] workbook tutorial
#..General setup needed in all jobs
sourceFoundFile ErrLogger/ErrLog.tcl
sourceFoundFile FrameScripts/FwkCfgVar.tcl
sourceFoundFile FrameScripts/talkto.tcl
Lines starting with # are comments. sourceFoundFile
executes
the tcl in the specified package or gives an error if it
can't find it. If you have checked out the package (i.e.,
you did addpkg xxx) it will use that version, otherwise
the
version in the release. The path for package xxx is workdir/PARENT/xxx.
These three files turn on error logging, and define
Framework
Configuration Variables (FwkCfgVar), and "talkto", which
we will use to communicate with code modules.
The following are the Framework Configuration Variables
for our
analysis. They allow you to adjust parameters from one job to
another. For example, you will want to change the ntuple file name,
the number of events to run, and whether or not the input is data or
MC. The first couple (objectivity vs Kanga, and what type of data to
use) are not generally adjusted, but are here if you did need to. We
will discuss the specific meaning of each as they come up in the
tutorial.
The format is: FwkCfgVar variable_name
default_value
i.e., the default value is set in Analysis.tcl. You can
override the default in the tcl snippet for a
particular job:
set variable_name value_for_this_job
With FwkCfgVars, you will not need to use
environmental
variables, which were an on-going source of problems and
confusion.
#-------------- FwkCfgVars needed to control this job ---------------------
set ProdTclOnly true
#..allowed values of BetaMiniReadPersistence are "Kan", "Bdb"
FwkCfgVar BetaMiniReadPersistence Kan
#..allowed values of levelOfDetail are "micro", "cache", "extend" or "refit"
FwkCfgVar levelOfDetail "cache"
#..allowed values of ConfigPatch are "Run2" or "MC".
FwkCfgVar ConfigPatch "MC"
#..Filter on tag bits by default
FwkCfgVar FilterOnTag "true"
#..Print Frequency
FwkCfgVar PrintFreq 1000
#..Ntuple type and name
FwkCfgVar BetaMiniTuple "root"
FwkCfgVar histFileName "Tutorial.root"
#..Number of Events defaults to 0 (run on full tcl file)
FwkCfgVar NEvents 0
There is a standard set of physics code you will need, which
includes particle ID and standard track definitions. It
includes a couple of things we don't need, which we might
disable later. btaMiniPhysics.tcl is the standard set to
use, but you can get more (btaMiniPhysProdSequence.tcl) or
less (btaMini.tcl) if it suits you better.
#--------------------------------------------------------------------------
#..General physics sequences
#..set to 'trace' to get info on a configuration problem
ErrLoggingLevel warning
#..btaMiniPhysics is the basic physics sequences;
sourceFoundFile BetaMiniUser/btaMiniPhysics.tcl
At the end of the Analysis.tcl file, we use some of the
FwkCfgVars to control the actual running of the job. printFreq prints
a line into your log file at the specified interval. Useful to track
progress in batch jobs, but you don't want to bury important log file
information in millions of such lines.
If $NEvents is 0, all events in the collection are run.
If
it is negative, the "ev beg" command is not executed, so get
just get a framework prompt after Analysis.tcl is finished.
This is useful if you want to "mod talk" to a module before
running.
I generally include a "path list" so I can check the log
file to see what was actually run in the job.
#-----------------------------------------------------------------------
#..Run time options
mod talk EvtCounter
printFreq set $PrintFreq
exit
path list
if { $NEvents>=0 } {
ev beg -nev $NEvents
exit
}
This general part of Analysis.tcl was derived from MyMiniAnalysis.tcl
in package BetaMiniUser.
If you are switching to a new release, or want extra information, take
a look.
Analysis Code - Tag Filter (and a review of data
processing)
It is useful to briefly review data processing to put the
idea of a Tag Filter into context. Data is processed in PR
("prompt reconstruction"), which creates the AllEvents
collections from the XTC files. For new data, this is done
in Padova; for reprocessing, it can be done at SLAC or
another Tier A site as well.
At a later stage, the skim executable is run. It
includes a
block of physics code which sets a large number (>100) of
booleans (tag variables) to be true or false. Based on these
tag variables and other criteria, such as trigger or
BGFilter booleans, events are placed into various
collections called skims. Every event, whether selected by a
skim or not, is placed into the AllEventsSkim collection. It
differs from the original AllEvents collection because it
contains the tag variables set by the skim executable.
For "deep copy" skims, the event is copied into the new
collection. We can end up with multiple copies of the event
on disk, so we try to not to do this for skims that select a
significant fraction of all events. For these large skims,
we create "pointer collections", which consist of pointers
to the corresponding events, which are physically located in
the AllEventsSkim collection.
Several versions of the skim executable can be run on
the
same data set, as people get new ideas for skims. For
example, there is the original Release 14 skim of runs 1 -
4, identified by "R14" in the file name of the collection. A
small number of skims were then rerun on Runs 1 - 3, again
in Release 14, and are identified by "R14a" in the file
name. We are about to reskim runs 1 - 4 in Release 16 ("R16a"),
which is also be used for Run 5.
The block of physics code used in the skim executable is
also run
as part of Moose (the executable used to produce MC), so the resulting
AllEvents collections contain the corresponding booleans. Note that
the full skim executable is not run - there are no skims
created. These tag bits, of course, are the ones for the release used
to generate the MC in the first place. For example, SP6 corresponds to
the "R14a" set in data, while SP5 corresponds to Release 12. (See the skims
page for the details of each release).
OK, we are ready to tackle Tag Filters. A Tag Filter
allows
you to check the values of the tag variables and read the
full event from disk only if the desired ones are true. This can easily
save an order of magnitude in wall-clock
time compared to running directly on the AllEventsSkim
collection in data or the AllEvents collection in SP. (Of
course, it isn't useful if the tag variable doesn't exit in
the collection, such as in the data AllEvents collection, or
in SP created before the bit was defined). Tag filters may
not help at all if you are running on a skim, or they may
help a bit. For example, the Jpsitoll skim includes both
Jpsi to l+l- and Psi2s to l+l- tag bits. If you are only
interested in Psi2s decays, you can gain with a tag filter.
The Tag Filter in Analysis.tcl checks for
the two Jpsi
to l+l- tag bits. The option andList set xxxx also exists. I
have set an option to crash the job if the tag bits don't exist in the
event, so that I will know there is something wrong with the
collection I am using. Note that the Tag Filter is appended to the
sequence (i.e., actually executed) only if the FwkCfgVar FilterOnTag
is equal to "true". From above,
you can see
this is the default.
The JpsiELoose and JpsiMuLoose tag bits
have
been present since Release 8, so we can use this filter on SP4 or
later MC.
#----------------------------------------------------------------------------
#..Use $FilterOnTag to disable tag filtering if desired.
module clone TagFilterByName TagJpsill
module talk TagJpsill
orList set JpsiELoose
orList set JpsiMuLoose
assertIfMissing set true
exit
if { $FilterOnTag=="true" } {
sequence append BetaMiniReadSequence -a KanEventUpdateTag TagJpsill
}
The nano also contains some integers (e.g. numbers of tracks) and
floats (e.g. R2All). You can filter on these as well, although you
will need to add a couple of tags, recompile and relink. See http://babar-hn.slac.stanford.edu:5090/HyperNews/get/AnalTools/750.html.
Analysis Code - Candidate composition using
SimpleComposition
It might be useful to check the SimpleComposition
web site while working on this section. For now, we will do only B+
--> Jpsi K+, with Jpsi --> mu+
mu-. Once
everything is working, we will expand to the full analysis.
The first step is easy - we create a sequence called AnalysisSequence
and add it to the path, which
is called Everything.
#--------------------------------------------------------------------------
#..Use SimpleComposition to make the candidates of interest
#..Create Analysis sequence and append it to the path.
# All the composition modules get added to this sequence
sequence create AnalysisSequence
path append Everything AnalysisSequence
Now let's make a Jpsi from mu+ mu-.
We use the neural-net muon selector, which is
the
recommended one. There are actually 8 levels available, with names
like muNNxxx or muNNxxxFakeRate. (The
latter
aims for lower fake rate at the cost of lower efficiency). For plots
of performance, see the release
14 selectors webpage.
The list of recommended selectors is at selector_changes.html.
#------------------ Jpsi Lists --------------------------------------
#..Basic Jpsi to mu mu list, no mass constraint (for ntuple).
# We may want to add a cut on the chisquare of the fit as well.
mod clone SmpMakerDefiner MyJpsiToMuMu
seq append AnalysisSequence MyJpsiToMuMu
talkto MyJpsiToMuMu {
decayMode set "J/psi -> mu+ mu-"
daughterListNames set muNNVeryLoose
daughterListNames set muNNVeryLoose
fittingAlgorithm set "Cascade"
fitConstraints set "Geo"
preFitSelectors set "Mass 2.8:3.4"
postFitSelectors set "Mass 2.9:3.3"
}
The names in the decayMode must match PDT syntax; see the file
pdg.table in the package PDT (workdir/PARENT/PDT/pdg.table). We
are using the Cascade fitter, and applying a constraint ("Geo") to
force the two daughters to come from a common vertex. We are not
applying a mass constraint. Cascade is a good choice for standard
fitting of charged tracks. If you are working with something that has
a non-negligible lifetime, TreeFitter may be more appropriate. See the fittersAndConstraints
webpage. The question of what fitter to use under what
circumstances can be tricky, and if you are not sure, I would suggest
posting a question to the Vertexing
and Composition HN.
We will now constrain the mass of the Jpsi candidates to
the
PDG value. In general, to get better resolution on the B
candidate, you should always mass-constrain daughters with
width narrower than detector resolution: pi0, eta, eta', Ks,
D, D*, Ds, Ds*, Jpsi, chi_c1, chi_c2, psi2s. (I may have
missed some here.)
We could have this in the step above by including the
line fitConstraints set "Mass". We don't, because
we want to
calculate the unconstrained mass and store it in the ntuple
for use in distinguishing signal from background.
#..Now add the mass constraint
mod clone SmpRefitterDefiner MyJpsiToMuMuMass
seq append AnalysisSequence MyJpsiToMuMuMass
talkto MyJpsiToMuMuMass {
unrefinedListName set MyJpsiToMuMu
fittingAlgorithm set "Cascade"
fitConstraints set "Geo"
fitConstraints set "Mass"
}
There are pitfalls when creating self-conjugate states or
decays to CP eigenstates, which we will discuss when we
re-visit SimpleComposition below.
We make the B+ candidates by combining a Jpsi
list and
a K+ list:
#------------------------ B+ ---------------------------------------
#..B+ --> Jpsi K+
mod clone SmpMakerDefiner BchtoJpsiKch
seq append AnalysisSequence BchtoJpsiKch
talkto BchtoJpsiKch {
decayMode set "B+ -> J/psi K+"
daughterListNames set "MyJpsiToMuMuMass"
daughterListNames set "KLHVeryLoose"
fittingAlgorithm set "Cascade"
fitConstraints set "Geo"
preFitSelectors set "DeltaE -0.20:0.20"
preFitSelectors set "Mes 5.19:5.30"
postFitSelectors set "ProbChiSq 0.001:"
postFitSelectors set "DeltaE -0.12:0.12"
postFitSelectors set "Mes 5.20:5.30"
postFitSelectors set "CmsCosTheta"
postFitSelectors set "Mmiss"
createUsrData set true
}
To save some CPU time, we make loose cuts on DeltaE and Mes
before we do the fit.
The last line stores all Selector quantities as user
data associated with the B candidate. If we were writing a skim
(or sub skim), we could write out the list of B candidates
and the associated user data in the selected events. In our
case, we will store the quantities in the ntuple. This is
why we list postFitSelectors such as CmsCosTheta - we want
it calculated and stored, even if we don't want to cut on it
right now.
Mmiss is a kinematic variable that partners with the
candidate mass, in the same way Mes and DeltaE go together.
It may be a better choice if your final state contains
exactly one poorly measured particle.
Analysis Code - Ntuple Structure
Take a look at the BtaTupleMaker
web page above to
understand the options and syntax below, which I will not
explain.
First, add the ntuple-dumping module to the path:
#--------------------------------------------------------------------
#..Use BtuTupleMaker to write out ntuples for SimpleComposition job
path append Everything BtuTupleMaker
We determine the quantities to be stored in the ntuple by
setting tcl parameters that control the behavior of
BtuTupleMaker. (Note that while the executable made is called
BtaTupleApp, and it is made using a gmake BtaTupleMaker
command, a module called BtuTupleMaker is actually called and appended
to your path.) I'll intersperse comments with the code
below.
talkto BtuTupleMaker {
These first block contains information per event (not per
candidate). I keep the center-of-mass four-momentum, the
beam spot, primary vertex, and a couple of other items from
the Tag (number of charged tracks, and R2 calculated using
both tracks and neutral clusters).
By default, every event that BtuTupleMaker sees (every
event
passing the TagFilter, if you have one) is dumped to the
ntuple. This can increase the size by a substantial factor,
so set this option to false, unless you have a really good
reason. (For example, I sometimes keep every signal MC event
so that I can study the MC truth distributions).
#..Event information to dump
eventBlockContents set "EventID CMp4 BeamSpot"
eventTagsInt set "nTracks"
eventTagsFloat set "R2All xPrimaryVtx yPrimaryVtx zPrimaryVtx"
writeEveryEvent set false
You definitely want to keep the MC truth. You don't need to
do anything differently when running on data - the block
will still be created, so that your code structure can be
the same - but the number of entries in the block will
always be 0.
#..MC truth info
fillMC set true
mcBlockContents set "Mass CMMomentum Momentum Vertex"
Now we are at the heart of the code. BtuTupleMaker starts
from a specified list, in our case, of B candidates. If you
were doing a recoil analysis, where you fully reconstruct
both B's in the event, you could form Upsilon(4S) candidates
from the two and store that list. If your analysis were of
inclusive Jpsi, that would be the list.
BtuTupleMaker stores information for particles on the
list
and for all its daughters (and granddaughters, and so
forth), and includes links between the ntuple blocks for
each particle type. Every particle type in the decay chain
must be assigned to an ntuple block. You can have more than
one particle type per block, if you prefer. For example, I
prefer to put both B+ and B0 candidates into a single "B"
block; others put them in separate blocks.
Don't worry if you forget the ntpBlockConfigs command
for a
particular particle type - the code will just abort and give
you an error message; it won't do anything wrong. Note that
ntuples are packed, so that increasing the maximum size of
the block increases the ntuple size only for those events
with extra candidates.
The ntpBlockContents command specifies the items to
store.
UsrData(BchtoJpsiKch) stores the User data for our B list,
which is created by the "createUsrData set true" command in
the SimpleComposition block.
#..Particle blocks to store
listToDump set BchtoJpsiKch
ntpBlockConfigs set "B- B 2 50"
ntpBlockContents set "B: MCIdx Mass Momentum CMMomentum Vertex VtxChi2
UsrData(BchtoJpsiKch)"
(where the last line above was split for formatting purposes and
should be entered as a single line). We don't have any Usr data for
the Jpsi list, but we do want to store the unconstrained Jpsi
mass. This is an unusual item, in the sense that it is not obtained
directly from the Jpsi list used to make the B candidates
(MyJpsitoMuMuMass), but rather from MyJpsitoMuMu. However, the
ntuAuxListContents command exists for this purpose. BtuTupleMaker
will search through the list you specify and find the corresponding
Jpsi, and store the requested quantities, mass in this case. The
"Unc" is just a text string to allow you to distinguish ntuple
quantities from this auxilary list.
ntpBlockConfigs set "J/psi Jpsi 2 50"
ntpBlockContents set "Jpsi: MCIdx Mass Momentum Vertex VtxChi2"
ntpAuxListContents set "Jpsi: MyJpsiToMuMu : Unc : Mass"
The interesting thing in the K+ and mu+ blocks are the
PIDWeights that are stored for each MC candidate. (Weights
are 1 for data). The PID weight is the ratio of the
efficiency for this track to satisfy the specified selector
in real data to that for simulated data. It is a function of
the track momentum, theta, and phi. When filling histograms
with MC data, weight the event with the product of the PID
weights for the involved tracks (and the corresponding
tracking effiency weights) to correct for MC/data
differences.
You get not only the weight in your ntuple but also its
uncertainty and a status integer. You should check the
status is 1 (OK) before using the weight. And you should ask
around to make sure there really are PID weights in the
conditions database if you are running on new data. (The
weight requires that the PID group has analyzed both data
and MC for a run period).
There are a variety of related techniques for correcting
for
PID data/MC differences. You should use only one of these -
in other words, don't apply PID tweaking at a sub skim
level, then make an ntuple set storing PID weights. See pidonmc.html.
Post questions to the Particle
ID
tools HN.
ntpBlockConfigs set "K+ K 0 50"
ntpBlockContents set "K: MCIdx Momentum PIDWeight(KLHVeryLoose,KLHLoose,KLHTight)"
ntpBlockConfigs set "mu+ mu 0 50"
ntpBlockContents set "mu: MCIdx Momentum PIDWeight(muNNVeryLoose,muNNLoose)"
You need to tell the selectors to operate in "weight" mode (as
opposed to "tweak" or "kill", for example). Somewhere in your Analysis.tcl, place:
pidCfg_mode weight *
I find it useful to store every track and gamma in the
event, in addition to those used in forming a B candidate.
It can allow you to recover from a variety of mistakes. But
you should test how much this option increases the size of
your ntuples.
#..Want to save all CalorNeutrals in the gamma block
ntpBlockConfigs set "gamma gamma 0 60"
ntpBlockContents set "gamma: MCIdx Momentum"
gamExtraContents set EMC
fillAllCandsInList set "gamma CalorNeutral"
#..TRK block. Save all of them as well.
fillAllCandsInList set "TRK ChargedTracks"
#..remember to change this back to K pi mu e
ntpBlockToTrk set "K mu"
ntpBlockContents set "TRK: Doca DocaXY"
trkExtraContents set "BitMap:pSelectorsMap,KSelectorsMap,piSelectorsMap,
muSelectorsMap,eSelectorsMap,TracksMap"
trkExtraContents set HOTS
trkExtraContents set Eff:ave
}
(where in the above the first trkExtraContents line was only split for
formatting purposes, and should be entered as a single line).
In the track block you can see the tracking efficiency
weight (Eff:ave), which corrects for the MC/data difference
in the efficiency for this track to be reconstructed as a
GoodTracksLoose in MC and data. A GoodTracksLoose track is
required to come from the near the interaction point and to
have a minimum number of drift chamber hits, so that there
is a good momentum measurement. In most cases, you should
require your tracks to be GoodTracksLoose. If your tracks
have low pt --- slow pions from D* decays, for example ---
use GoodTracksVeryLoose instead. If they don't come from
the origin --- pions from Ks decays, for example --- then
just use ChargedTracks.
If you aren't using GoodTracksLoose, consider checking
the
pt distribution of your candidates that don't satisfy
GoodTracksLoose. Any track with pt more than ~120 MeV ought
to have drift chamber hits if it is measured correctly.
You select the track type in SimpleComposition by
sublisting, which we will discuss later, or by using the
TracksMap bitmap. The bits set to 1 in this word indicate
which Track requirements are satisfied by the candidate. The
safest way to get the correspondence between the bits and
the track types is to talk to module that creates this work,
TrkMicroDispatch, and type "show". (More on this later).
You can like-wise find out which muon ID selectors
selected
this using the muSelectorsMap bitmap. To get the correspondence
between these bits and the selectors, "mod talk" to MuonMicroDispatch.
(We will do this later). The bitmap is useful
if you want to tighten the criteria you used to make your B
candidates without having to rerun your jobs.
We will next run some sample jobs on MC samples. We will
use book keeping tools to locate the collections.
Some subdirectories
I find it convenient to create a sub directory of workdir
called maketcl, which I use for creating the tcl files and
other book keeping activities. I then create another
subdirectory of workdir called tcl, which contains soft
links to the tcl files that are actually located in maketcl.
I do this so that I can then delete the links in "tcl" as
the corresponding job successfully finishes, without losing
the orginal file itself. But feel free to do whatever you
like best.
mkdir maketcl
mkdir tcl
We will also need directories to store the log files and
ntuples you create. You should probably put these in your
scratch (work) area and make soft links to workdir - you
won't have enough space in your own area. Recall that files
in this area vanish after 6 days. Be sure to move both the
ntuples and log files to permanent storage within that time.
mkdir $BFROOT/work/h/hearty/Rel23/
mkdir $BFROOT/work/h/hearty/Rel23/log
mkdir $BFROOT/work/h/hearty/Rel23/hbookdir
ln -s $BFROOT/work/h/hearty/Rel23/log log
ln -s $BFROOT/work/h/hearty/Rel23/hbookdir hbookdir
I will be disappointed with you if you actually create your
directories in /h/hearty.
I find it useful to have a couple of other directories
called "failed" and "successful": the batch system writes to
"log" and "hbookdir", then I manually move the files to
"failed" or "successful" as appropriate.
mkdir $BFROOT/work/h/hearty/Rel23/failed
mkdir $BFROOT/work/h/hearty/Rel23/successful
ln -s $BFROOT/work/h/hearty/Rel23/failed failed
ln -s $BFROOT/work/h/hearty/Rel23/successful succesful
Other people keep all of their log files and ntuples in
their local area. To do this, you will want to gzip
everything as it is created, and keep a careful tab on your
disk space. You can use commands like zgrep, zcat, zless and
so forth to work with gzipped files.
While we are at it, let's create a subdirectory for tcl
snippets, which we will need shortly
mkdir snippets
Will Roethel has writtn a prototype "analysis task
management" system available that is undoubtedly a way
better way to work. See SJMMain.htm
I haven't tried it yet.
Since many of the book keeping commands used are obscure
and
difficult to remember, I recommend keeping a record of them.
I use a file called commands.txt in maketcl.
Finding the available MC modes
You will need to know the mode number of the MC sample you intend to
use. You can get some information by browsing the old SP5
inventory web site:
(Recall that SP5 corresponds to runs 1 - 3, SP6 is run
4,
and SP7 is run 5. SP8 will cover the full data set.)
You should also ask your conveners and other colleagues
in
your AWG.
More generally, you can use BbkSPModes to get lists of
different types of MC. You can search the descriptions (also
called "runtypes") of all modes by:
BbkSPModes --runtype "J/psi"
BbkSPModes --runtype "Jpsi"
To search in the titles of the decay (.dec) files:
BbkSPModes --decfile "Jpsi"
The resulting output can be quite large. It is perhaps
easier to write this to a file so you can look at it in your
editor:
BbkSPModes --decfile "Jpsi" > Jpsi-2.txt
BbkSPModes --decfile "jpsi" >> Jpsi-2.txt
Browsing through this, we find mode 1121, which is Inclusive
Jpsi (generic B events that contain a Jpsi --> e+e- or
mu+mu- decay), and 989, which is the exclusive decay B- -->
Jpsi K-. Mode 4817, B+ --> X3872 K+ might be interesting as
well.
You will undoubtedly want generic MC as well. Here are
the
mode numbers and the cross sections, which you will need for
calculating the equivalent luminosity.
B+B- 1235 540 pb
B0B0 1237 540 pb
cc 1005 1300 pb
uds 998 2090 pb
tau tau kk2f 3429 890 pb
mu mu kk2f 3981 1150 pb
Making tcl files for the selected modes
Generic MC is skimmed, while non-generic is generally not
skimmed. If you really need to have another mode skimmed,
talk to your convener.
It is very much faster to run on skimmed generic MC than
on
the AllEventsSkim collection - you can save an order of
magnitude in time. To make tcl files for the Jpsitoll skim
of generic mode 1235 (B+B-) in Run 1:
(Wed 15:06) hearty@noric03 $ BbkDatasetTcl -t -ds SP-1235-Jpsitoll-Run1-R14
BbkDatasetTcl: wrote SP-1235-Jpsitoll-Run1-R14.tcl (1279553 events)
Selected 8 collections, 1279553/24246000 events, ~0.0/pb
This command produced the file SP-1235-Jpsitoll-Run1-R14.tcl.
The Jpsitoll
skim contains
1279553 events, selected from an original 24246000 B+B- events in Run
1. (So the skim rate is 5.3%; the equivalent luminosity, assuming a
B+B- cross section of 0.54 nb, is 44.9 fb-1, 2.3x the luminosity for
Run 1.)
Make sure you keep track of these numbers - you will
need them when
you want to scale your MC to the data luminosity. Write the numbers
in your log book, or put the output in your commands.txt
file. In any case, be sure to keep your tcl files.
The tcl file SP-1235-Jpsitoll-Run1-R14.tcl is not a
practical way to analyze data - your job will certainly run
out of CPU time before it can go through the full file. BbkDatasetTcl
will split it up for you if you prefer:
(Wed 15:13) hearty@noric03 $ BbkDatasetTcl -t 100k -ds SP-1235-Jpsitoll-Run1-R14
BbkDatasetTcl: wrote SP-1235-Jpsitoll-Run1-R14-1.tcl (133566 events)
BbkDatasetTcl: wrote SP-1235-Jpsitoll-Run1-R14-2.tcl (237398 events)
BbkDatasetTcl: wrote SP-1235-Jpsitoll-Run1-R14-3.tcl (172173 events)
BbkDatasetTcl: wrote SP-1235-Jpsitoll-Run1-R14-4.tcl (165859 events)
BbkDatasetTcl: wrote SP-1235-Jpsitoll-Run1-R14-5.tcl (176085 events)
BbkDatasetTcl: wrote SP-1235-Jpsitoll-Run1-R14-6.tcl (286136 events)
BbkDatasetTcl: wrote SP-1235-Jpsitoll-Run1-R14-7.tcl (33260 events)
BbkDatasetTcl: wrote SP-1235-Jpsitoll-Run1-R14-8.tcl (75076 events)
Selected 8 collections, 1279553/24246000 events, ~0.0/pb
This version of the command will not split collections, so
the resulting number of events in each file can vary quite a
bit. No matter how small your request, the files can still
be larger than what you require. To get around this,
request the collections to be broken into smaller pieces:
(Wed 15:20) hearty@noric03 $ BbkDatasetTcl -t 100k --splitruns -ds SP-1235-Jpsitoll-Run1-R14
BbkDatasetTcl: wrote SP-1235-Jpsitoll-Run1-R14-1.tcl (100000 events)
BbkDatasetTcl: wrote SP-1235-Jpsitoll-Run1-R14-2.tcl (100000 events)
BbkDatasetTcl: wrote SP-1235-Jpsitoll-Run1-R14-3.tcl (100000 events)
BbkDatasetTcl: wrote SP-1235-Jpsitoll-Run1-R14-4.tcl (100000 events)
BbkDatasetTcl: wrote SP-1235-Jpsitoll-Run1-R14-5.tcl (100000 events)
BbkDatasetTcl: wrote SP-1235-Jpsitoll-Run1-R14-6.tcl (100000 events)
BbkDatasetTcl: wrote SP-1235-Jpsitoll-Run1-R14-7.tcl (100000 events)
BbkDatasetTcl: wrote SP-1235-Jpsitoll-Run1-R14-8.tcl (100000 events)
BbkDatasetTcl: wrote SP-1235-Jpsitoll-Run1-R14-9.tcl (100000 events)
BbkDatasetTcl: wrote SP-1235-Jpsitoll-Run1-R14-10.tcl (100000 events)
BbkDatasetTcl: wrote SP-1235-Jpsitoll-Run1-R14-11.tcl (100000 events)
BbkDatasetTcl: wrote SP-1235-Jpsitoll-Run1-R14-12.tcl (100000 events)
BbkDatasetTcl: wrote SP-1235-Jpsitoll-Run1-R14-13.tcl (79553 events)
Selected 8 collections, 1279553/24246000 events, ~0.0/pb
If you prefer a different file name, use the option --basename
blahblah.
How do you decide how many events to put in a tcl
file? There are a few criteria. If you are writing hbook ntuples,
you should keep them less than 30 MB per job, or you risk corrupting
them. (Root can handle bigger files, I believe). Your job can fail
due to CPU time limits on the batch queue, or wall-clock limits. Try a
couple of your longest jobs before you submit a whole bunch.
The CPU limit for the kanga queue, which you should normally use,
is 720 "slac minutes" (type bqueues -l kanga).
However, 1 CPU min on the batch machine generally counts as more than 1
"slac minute". To get the scale factor CPUF, for batch machine
barb0309 (for example), type bhosts -l barb0309.
So for this machine, the actual CPU limit is 720*60/2.11 = 20,474
sec. You can use other queues - short, long, or xlong, for
examples - but you will be sharing the resources with Glast and
ILC and so forth.
Exclusive decay mode MC is generally not skimmed, so
there
is no skim name in the collections. You will probably need
to put fewer events per job as well. Let's make some B+ -->
Jpsi K+ signal MC tcl files:
(Wed 17:13) hearty@noric03 $ BbkDatasetTcl -t 10k -ds SP-989-Run1
BbkDatasetTcl: wrote SP-989-Run1-1.tcl (9000 events)
BbkDatasetTcl: wrote SP-989-Run1-2.tcl (3000 events)
Selected 4 collections, 12000/12000 events, ~0.0/pb
Remember to copy and paste these commands into your
"commands.txt" file. And if you are using the maketcl and
tcl structure that I use, make soft links to the tcl files
you made:
cd tcl
ln -s ../maketcl/SP-989*.tcl .
tcl files for data; luminosity and B counting
The commands for data are essentially the same. Check the Data
Quality page for information on the latest datasets.
To make tcl files, use BbkDatasetTcl as
before. The
format of the dataset names is slightly different - in particular,
data can be on peak or off peak, unlike MC.
Wed 17:36) hearty@noric03 $ BbkDatasetTcl -ds Jpsitoll-Run1-OnPeak-R14 -t 100k
BbkDatasetTcl: wrote Jpsitoll-Run1-OnPeak-R14-1.tcl (246468 events)
BbkDatasetTcl: wrote Jpsitoll-Run1-OnPeak-R14-2.tcl (284454 events)
BbkDatasetTcl: wrote Jpsitoll-Run1-OnPeak-R14-3.tcl (915342 events)
BbkDatasetTcl: wrote Jpsitoll-Run1-OnPeak-R14-4.tcl (996990 events)
BbkDatasetTcl: wrote Jpsitoll-Run1-OnPeak-R14-5.tcl (995903 events)
BbkDatasetTcl: wrote Jpsitoll-Run1-OnPeak-R14-6.tcl (1011621 events)
BbkDatasetTcl: wrote Jpsitoll-Run1-OnPeak-R14-7.tcl (985835 events)
BbkDatasetTcl: wrote Jpsitoll-Run1-OnPeak-R14-8.tcl (980917 events)
BbkDatasetTcl: wrote Jpsitoll-Run1-OnPeak-R14-9.tcl (971637 events)
BbkDatasetTcl: wrote Jpsitoll-Run1-OnPeak-R14-10.tcl (797326 events)
BbkDatasetTcl: wrote Jpsitoll-Run1-OnPeak-R14-11.tcl (1023349 events)
BbkDatasetTcl: wrote Jpsitoll-Run1-OnPeak-R14-12.tcl (795212 events)
BbkDatasetTcl: wrote Jpsitoll-Run1-OnPeak-R14-13.tcl (346808 events)
BbkDatasetTcl: wrote Jpsitoll-Run1-OnPeak-R14-14.tcl (558708 events)
BbkDatasetTcl: wrote Jpsitoll-Run1-OnPeak-R14-15.tcl (520786 events)
BbkDatasetTcl: wrote Jpsitoll-Run1-OnPeak-R14-16.tcl (442998 events)
BbkDatasetTcl: wrote Jpsitoll-Run1-OnPeak-R14-17.tcl (395107 events)
BbkDatasetTcl: wrote Jpsitoll-Run1-OnPeak-R14-18.tcl (275509 events)
BbkDatasetTcl: wrote Jpsitoll-Run1-OnPeak-R14-19.tcl (368243 events)
Selected 19 collections, 12913213/268587495 events, ~19502.5/pb
Depending on which skim you are using, yours might be -R14a,
or -R16a instead.
To get the B counting (and exact luminosity) for your
sample, you can use the "lumi" script with your tcl files.
Actually, it is easiest to rerun BbkDatasetTcl to make a
single file, then to use that:
(Wed 17:38) hearty@noric03 $ BbkDatasetTcl -ds Jpsitoll-Run1-OnPeak-R14 --basename lumi-Run1
BbkDatasetTcl: wrote lumi-Run1.tcl (12913213 events)
Selected 19 collections, 12913213/268587495 events, ~19502.5/pb
(Wed 17:38) hearty@noric03 $ lumi -t lumi-Run1.tcl
Using B Counting default release 14
===> lumi-Run1.tcl
Using collections:
/store/PRskims/R12/14.3.2/Jpsitoll/00/Jpsitoll_0031
/store/PRskims/R12/14.3.2/Jpsitoll/00/Jpsitoll_0032
/store/PRskims/R12/14.3.2/Jpsitoll/00/Jpsitoll_0034%rejectRun=12227
/store/PRskims/R12/14.3.2/Jpsitoll/00/Jpsitoll_0037
/store/PRskims/R12/14.3.2/Jpsitoll/00/Jpsitoll_0039
/store/PRskims/R12/14.3.2/Jpsitoll/00/Jpsitoll_0040
/store/PRskims/R12/14.3.2/Jpsitoll/00/Jpsitoll_0041
/store/PRskims/R12/14.3.2/Jpsitoll/00/Jpsitoll_0042
/store/PRskims/R12/14.3.2/Jpsitoll/00/Jpsitoll_0043
/store/PRskims/R12/14.3.2/Jpsitoll/00/Jpsitoll_0045
/store/PRskims/R12/14.3.2/Jpsitoll/00/Jpsitoll_0048
/store/PRskims/R12/14.3.2/Jpsitoll/00/Jpsitoll_0057
/store/PRskims/R12/14.3.2f/Jpsitoll/00/Jpsitoll_0080
/store/PRskims/R12/14.3.2f/Jpsitoll/00/Jpsitoll_0081
/store/PRskims/R12/14.3.2f/Jpsitoll/00/Jpsitoll_0084
/store/PRskims/R12/14.3.2f/Jpsitoll/00/Jpsitoll_0085
/store/PRskims/R12/14.3.2f/Jpsitoll/00/Jpsitoll_0089
/store/PRskims/R12/14.4.0c/Jpsitoll/01/Jpsitoll_0111
/store/PRskims/R12/14.4.0c/Jpsitoll/02/Jpsitoll_0279
==============================================
Run by hearty at Wed Nov 10 17:38:35 2004
First run = 9932 : Last Run 17106
== Your Run Selection Summary =============
***** NOTE only runs in B-counting release 14 considered *****
***** Use --OPR or --L3 options to see runs without B-counting *****
Number of Data Runs 3222
Number of Contributing Runs 3222
-------------------------------------------
Y(4s) Resonance ON OFF
Number Recorded 3222 0
== Your Luminosity (pb-1) Summary =========
Y(4s) Resonance ON OFF
Lumi Processed 19458.963 0.000
The lumi script knows to remove runs that are declared bad
in the tcl file. Be sure to check that all the runs have B
counting information - you will see a warning message if any
are missing.
Updating your tcl files
Some time after you start your analysis, it may be that more
data or MC have been skimmed, and you would like to make tcl
files from just these collections. To do this, check the
time stamp in last line of the last tcl file of the previous
set. For example, in SP-1235-Jpsitoll-Run1-R14-13.tcl it is
## Last dataset update: 2004/09/13 12:19:14
Then request only collections created after this time, and
start the numbering of these new tcl files at 14 (the last
set ended at 13).
BbkDatasetTcl -t 100k,14 --splitruns -ds SP-1235-Jpsitoll-Run1-R14 -s 2004/09/13-12:19:15
Note the extra hyphen in the time stamp, and add a second to
be sure you don't get the same collections back.
Making tcl snippets
For each job you run, you need not just the tcl file we just made, but
a tcl "snippet" to set the FwkCfgVar flags to the values appropriate
for this job. This could be tedious to do manually, so with Enrico
Robutti's help, I wrote a perl script called make_snippet.
Copy this to your workdir.
To make snippets for the 13 B+B- (1235) Jpsitoll skim
files
we just created:
make_snippet -MC -tcldir snippets -logdir log -ntpdir hbookdir tcl/SP-1235*.tcl
If you skip the flags specifying the directories, everything
will end up in workdir. Type make_snippet -h if you can't
remember the command options.
For each filename.tcl, it creates a file run_filename.tcl.
Let's look at one:
(Thu 5:36) hearty@yakut11 $ cat snippets/run_SP-1235-Jpsitoll-Run1-R14-1.tcl
#..See Analysis.tcl for description of FwkCfgVars.
sourceFoundFile tcl/SP-1235-Jpsitoll-Run1-R14-1.tcl
set ConfigPatch "MC"
set FilterOnTag "false"
set BetaMiniTuple "hbook"
set histFileName hbookdir/SP-1235-Jpsitoll-Run1-R14-1.hbook
set NEvents 0
sourceFoundFile Analysis.tcl
You should edit make_snippet to match your own
taste. root vs hbook, for example, or FilterOnTag
"true". The selection between MC and data is done via an option
flag of make_snippet - you don't need to edit this.
make_snippet also creates a script to submit all 13 jobs
to
the batch queue, called sub_SP-1235-Jpsitoll-Run1-R14-1 in
this case. We will come back to it when we talk about the
batch system.
Before running an actual job, it is a good
idea to look at your
path. You might notice things are missing, or extra things you don't
need. We will use run_SP-1235-Jpsitoll-Run1-R14-1.tcl. By
default, the NEvents flag is set to run on all events ("0"). To set up
your job, but not execute an "ev beg" command, edit the file and
change NEvents to -1.
To run the job (from workdir), type
BtaTupleApp snippets/run_SP-1235-Jpsitoll-Run1-R14-1.tcl
Here is the resulting output, ending at a
framework prompt: path.txt.
The first part lists the values of various FwkCfgVars.
Notice, for example, that PrintFreq is not set in the
snippet, so we just get the default value of 1000.
Most of the output is the path list, which starts with
Everything and ends with BtuTupleMaker. Read through the
one-line descriptions to get an idea of what is actually
happening in your job. Note your AnalysisSequence is a
pretty small part of the whole operation.
While we are here, let's check the meaning of the muon
ID bit
map.
mod talk MuonMicroDispatch
MuonMicroDispatch> show
Current value of item(s) in the "MuonMicroDispatch" module:
Value of verbose for module MuonMicroDispatch: f
Value of production for module MuonMicroDispatch: f
Value of enableFrames for module MuonMicroDispatch: f
Value of outputMap for module MuonMicroDispatch: IfdStrKey(muSelectorsMap)
Value of inputList for module MuonMicroDispatch: IfdStrKey(ChargedTracks)
Value of inputMaps for module MuonMicroDispatch:
inputMaps[0]=muMicroMinimumIonizing
inputMaps[1]=muMicroVeryLoose
inputMaps[2]=muMicroLoose
inputMaps[3]=muMicroTight
inputMaps[4]=muMicroVeryTight
inputMaps[5]=muNNVeryLoose
inputMaps[6]=muNNLoose
inputMaps[7]=muNNTight
inputMaps[8]=muNNVeryTight
inputMaps[9]=muNNVeryLooseFakeRate
inputMaps[10]=muNNLooseFakeRate
inputMaps[11]=muNNTightFakeRate
inputMaps[12]=muNNVeryTightFakeRate
inputMaps[13]=muLikeVeryLoose
inputMaps[14]=muLikeLoose
inputMaps[15]=muLikeTight
MuonMicroDispatch> exit
>
I print out this sort of thing and put it in my log book.
(How did I know this was the correct module to check? From
the description in the path list.) You can do the same thing
for the other four charged particles, and for
TrkMicroDispatch, which fills the track bit map.
There are a lot of sequences listed under
SimpleComposition,
which you might think you want to disable to save time. In
fact, a key feature of SimpleComposition is that it doesn't
make the lists unless they are requested somewhere, so there
is essentially no overhead.
After you are finished browsing, type exit to end this
job.
Run an interactive job
Edit run_SP-1235-Jpsitoll-Run1-R14-1.tcl and set the
number of events to 1000. It is easiest to run the job in background
mode so that you get a log file.
BtaTupleApp snippets/run_SP-1235-Jpsitoll-Run1-R14-1.tcl >&
log/SP-1235-Jpsitoll-Run1-R14-1a.log &
(where the above command should be executed as a single line).
Here is the resulting file: SP-1235-Jpsitoll-Run1-R14-1a.log
There are quite few comments in here that look alarming,
but
in fact, this job completed successfully.
You can get the total number of events processed and the
CPU
time:
EvtCounter:EvtCounter: total number of events=1000
total number of events processed=1000
total number of events skipped=0
EvtCounter:Total CPU usage: 121 User: 118 System: 3
Also a report from the Tag Filter, which is pretty boring in
this case, since we didn't turn it on:
TagJpsill:TagJpsill: endJob summary:
Events processed: 0
Events passed : 0
Events prescaled: 0
Submit a batch job
Edit run_SP-989-Run1-1.tcl to run on 1000 events, then
submit it to the kanga queue, which you should be using for analysis
at SLAC. (Other Tier A sites will have other commands.)
bsub -q kanga -o log/SP-989-Run1-1.log ../bin/$BFARCH/BtaTupleApp
snippets/run_SP-989-Run1-1.tcl
(where the above command should be entered as a single line).
Check on the status of your job:
(Tue 14:59) hearty@yakut05 $ bjobs
JOBID USER STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
710693 hearty RUN kanga yakut05 barb0309 *un4-1.tcl Feb 8 14:58
Or check the log file
bpeek 710693 | tail
(or bpeek -f 710693>)
Here is the resulting log file: SP-989-Run1-1.log
There is extra information at the end with respect to
the
interactive job. "Successfully completed" is what you like
to see at the end.
Dealing with many batch jobs
The script created by make_snippet, sub_SP-1235-Jpsitoll-Run1-R14-1
for example, will submit all the jobs for you at once. You should
probably not put more than a couple of hundred jobs in the queue at
one time.
source sub_SP-1235-Jpsitoll-Run1-R14-1
As the jobs finish, move the success ones into the
successful directory (grep for "Successfully completed") and
the failures (grep for "exit code" or "Abort") into the
failed directory. If you have a few mysterious failures, you
might try just resubmitting them.
You might find it useful to make some small, temporary,
scripts to do this sort of thing. For example,
(Thu 18:19) hearty@yakut14 $ grep "Successfully completed" log/*.log > move-2
(Thu 18:19) hearty@yakut14 $ cat move-2
log/SP-989-Run1-1.log:Successfully completed.
Then edit move-2 to change lines like
log/SP-989-Run1-1.log:Successfully completed.
into
mv log/SP-989-Run1-1.log successfull/
mv hbookdir/SP-989-Run1-1.hbook successfull/
/bin/rm tcl/SP-989-Run1-1.tcl
/bin/rm snippets/run_SP-989-Run1-1.tcl
gzip maketcl/SP-989-Run1-1.tcl
To use the script, just say source move-2
As I mentioned before, there might be a real task
manager to
do this sort of thing for you.
Be sure to keep track of all your failures so that you
can
correct the luminosity weighting of your MC sample.
If you are using hbook files (vs root), you will want to
gzip them before you transfer them to your local disks. Either way, you
will definitely want to gzip the log files.
Ntuple Structure
Take a look at the resulting ntuple (now located in the
"successful" directory).
paw
PAW > hist/file 1 successful/SP-989-Run1-1.hbook
PAW > nt/print 1
Here is the printout of the ntuple structure: ntuple-structure.txt
You will want to navigate between the different blocks
and
different candidates.
BLund tells you what type of B this
candidate is
Bd1Lund is the particle type of the first
daughter
Bd1Idx is the index of that daughter in the
relevant block
(Jpsi in this case). Note that the indices all use C++
numbering, so that Bd1Idx = 0 refers to the first entry in
the Jpsi block. If you are actually going to use paw, you
will need to add 1 to all the indices. You will probably get
this wrong numerous times.
You can similarly navigate from the Jpsi block to the mu
block (or e block, once you add Jpsi --> e e decays), and
from the mu block to the TRK block.
When using electrons with bremsstrahlung recovery (in
which
radiated photons are added back into the reconstructed
electron), the navigation is a bit tricky. The
brem-recovered electron is a composite, consisting of an
electron plus one, two, or three photons. So it does not
point to the TRK block (the index is -1); you need to
navigate to the daughter electron, which does point to the
TRK block. This is the only case where a daughter of a
particle is the same type of particle.
The Root
III chapter of the workbook gives an example of an analysis on a
similar type of ntuple. Note that the ntuples in that chapter were
produced using the old CompositionUtil package, and have the opposite
indexing problem of the current ntuples (i.e., in Root, you need to
subtract 1 from all indices).
The Paw
II chapter discusses macros a bit. When doing analyses in paw, I
use nt/loop.
PAW > hist/file 1 successful/SP-989-Run1-1.hbook
PAW > nt/loop 1 smallcode.f(0)
PAW > h/pl 1000
Here is smallcode.f.
On some systems, you can say nt/loop
1
smallcode.f77(0) to compile the code before it is
run.
Note on MC truth Matching
The MC block contains the full MC truth for the event at the
four-vector level, and is very useful for categorizing different types
of backgrounds, for example.
The names such as JpsiMCIdx connect the reconstructed
candidate to the MC truth block. However, the MC matching is
a bit dicey for any state that can radiate. (And we may be
using Photos not just for leptons, but also for hadrons in
some of our MC). For example, if you plot JpsiMCIdx, you
will see that it is -1 in many cases (i.e., no MC truth
match was found). But if you plot the lund ID of the
MC-truth matched partners of the muon daughters of the Jpsi,
you see that they are actually muons:
nt/plot 1.mclund(muMcidx(1)+1)
nt/plot 1.mclund(muMcidx(2)+1)
The problem is that in the MC truth, the Jpsi actually
decayed to mu+ mu- gamma, while the reconstructed Jpsi
contains only the muons. In principal, this gamma could be
1 MeV, and of no real consequence. The lesson is that you
cannot blindly use the MC truth match for composites. Actually, it is
probably better to avoid MC truth matching
entirely if you can. (In this case, if I did need to match,
I would say the Jpsi is matched if the two muons daughter
match the MC-truth daughters of the Jpsi).
Next steps
Here are a few things to try before we go on to the more
complex SimpleComposition case.
Make a decent Jpsi mass plot (with binning suitable for
the
detector resolution), and observe how it changes if you
require muNNLoose on both legs instead of VeryLoose. It
might be interesting to check the scatterplot of momentum vs
theta for your muons to see where they fall in the detector.
What happens if you require GoodTracksLoose for both
muon
legs and the Kaon? You can do this by checking the bitmap,
but you should also try sublisting in SimpleComposition. Another
approach (suitable for your own analysis code, not
for skimming), would be to change the input lists to the
muon and kaon selectors. Look through the path to figure out
where to do this.
Perhaps it would be worth adding a cut on the quality of
the
vertex of the Jpsi. Note that we don't want to make a cut on
the chisquare of the mass-constrained Jpsi - this would be
like making a cut on the Jpsi mass. You will need to add
this quantity to your ntuple, and will need to compare
signal MC to data to see if there is anything to gain.
Before expanding our analysis, we should discuss the issues
of self-conjugate modes and clones, which can be quite
confusing if you don't know the underlying issues. HN is
full of questions, including some from me.
Self-Conjugate modes
If you build the decay A --> B C in SimpleComposition, it
will, in most cases, automatically build A-bar --> B-bar
C-bar. If B = B-bar, or C = C-bar (Jpsi, for example, or
Ks), it knows to use the B or C list instead of the B-bar or
C-bar. So if you ask for B+ --> Jpsi K+, you will also get
B- --> Jpsi K-.
You will not get the A-bar list, however, if A = A-bar.
There can be consequences, if you are not careful. Consider,
for example, creating a Jpsi from one muon required to
satisfy muNNLoose, and the other muNNVeryLoose
mod clone SmpMakerDefiner BadJpsiToMuMu
seq append AnalysisSequence BadJpsiToMuMu
talkto BadJpsiToMuMu {
decayMode set "J/psi -> mu+ mu-"
daughterListNames set muNNLoose
daughterListNames set muNNVeryLoose
fittingAlgorithm set "Cascade"
fitConstraints set "Geo"
preFitSelectors set "Mass 2.8:3.4"
postFitSelectors set "Mass 2.9:3.3"
}
The problem is that SimpleComposition will do what you asked
- require the mu+ to be a muNNLoose and the mu- to be a
muNNVeryLoose. It does not do the charge-conjugate list,
J/psi --> mu- mu+. If you want this case, you need to create
it yourself and merge the two. Personally, I find it easier
just to use the same list when a particle appears twice in
the final state.
Clones
A related topic is that of clones. When SimpleComposition
merges two lists --- the A and A-bar lists, for example ---
it checks for clones and, by default, removes them. Two
candidates are considered to be clones if the final state
contains the same candidates (in any order) with the same
mass hypotheses. Note that it does not check whether or not
the initial state is the same.
As an example, you make a D0 --> pi+ pi- list.
SimpleComposition will also make a D0-bar --> pi- pi+ lists,
but will discover upon merging the two lists that it has the
same pair of tracks on both lists. It will then delete all
of the D0-bar candidates. So when you reconstruct B+ --> D0
K+ decays, you will end up only with B+ candidates, and no
B- candidates. Note that if you use different lists for the
pi- and pi+, it will delete only some of the D0-bar
candidates, so you will end up with wrong answers, but
perhaps not obviously wrong answers.
To override this default behaviour, add to your module
the
line
disableCloneCheck set true
Note that if you were merging D0 --> pi+ pi- (both
GoodTracksLoose) and D0 --> K+ K- (also both
GoodTracksLoose), you will keep both candidates, even if you
have used the same tracks. This is because the candidates
have a different mass hypothesis.
Author: Chris Hearty
Last modified: 17 March 2005
Last significant update: 7 December 2004
|