Reskimming is one of the key functionalities of the Analysis Model. Users
are expected to run over production skims, make a preselection
suitable for their analysis, and write the selected events to their
AWG or private output collections.
The output collection can borrow the Aod ( the "micro" ) and the
Esd ( the rest of the "mini" ) components
from the original collection ( pointer skim ), thereby saving disk space, or
it can clone the components into the output collection ( deep-copy skim ),
which is needed in order to copy the collections to other sites later.
Whether the skim is a pointer or a deep copy, the output collections can be
augmented with additional custom information: composite BtaCandidates
created by the user analysis modules, and user-defined data
attached to the whole event or to the composite BtaCandidates. Further
possible customizations will be discussed later.
Schematically, a reskimming job is set up to
Because of its nature and purpose, reskimming is not managed
centrally, unlike skim production.
Instead, Analysis Working Groups and final users have to configure
their packages properly to do so.
Reskimming is the Analysis Model replacement for the massive AWG
N-tuple productions, which should be abandoned.
To run a reskimming job, the users execute the user package application, the equivalent of BetaMiniApp in the user package, on a job configuration tcl file ( snippet ). The single snippet
For example, the following snippet
#-------- job configuration file -------
set stream BToSomething
set ConfigPatch MC
set MCTruth true
set inputList input_collection
set outputCollection /work/users/user_id/output_collection
lappend outputBtaCandidates MyList1 MyList2 ... MyListN
lappend outputCndUsrBlocks MyUserBlock1 MyUserBlock2 ... MyUserBlockN
set outputEvtUsrBlocks MyEventUserBlock
sourceFoundFile UserPackage/UserPackageProduction.tcl
#----------------------------------------
completely defines the configuration needed to run on a Monte Carlo
collection and produce an output collection.
Notes
At the end of the snippet, the production script <UserPackage>/<UserPackage>Production.tcl is sourced, where UserPackage is the actual package being used. The purpose of the production file consists in
#-------- UserPackage Production file -------
sourceFoundFile ErrLogger/ErrLog.tcl
sourceFoundFile FrameScripts/FwkCfgVar.tcl
sourceFoundFile FrameScripts/talkto.tcl
sourceFoundFile FrameScripts/setProduction.tcl
Notes
FwkCfgListRequire inputList         ;# input collection list
FwkCfgVar stream                    ;# stream ( i.e. analysis ) to run
FwkCfgVar outputCollection ""       ;# output collection name
                                    ;# if the (empty) default value is overridden,
                                    ;# the MiniWriteSequence is added by runSkim
FwkCfgVar outputBtaCandidates ""    ;# composite BtaCandidate lists to persist
FwkCfgVar outputCndUsrBlocks ""     ;# Candidate User Data Blocks to persist
FwkCfgVar outputEvtUsrBlocks ""     ;# Event User Data Blocks to persist
FwkCfgVar components "deepCopyMicro" ;# components to write in the event store
sourceFoundFile BetaMiniSequences/BetaMiniSequence.tcl
runSkim $stream UserPackage $outputCollection $components $outputBtaCandidates \
        $outputCndUsrBlocks $outputEvtUsrBlocks
#--------------------------------------------
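As an illustration of how the job configuration snippet and the production file interact, the following sketch shows a snippet that asks for a pointer skim instead of the "deepCopyMicro" default. It assumes that a variable set in the snippet before the production file is sourced takes precedence over the default given to FwkCfgVar; the collection names are the placeholders used earlier on this page.

```tcl
#-------- job configuration file (pointer skim sketch) -------
set stream BToSomething
set inputList input_collection
set outputCollection /work/users/user_id/output_collection
# Assumption: setting 'components' here overrides the "deepCopyMicro"
# default declared with FwkCfgVar in the production file.
set components pointer
sourceFoundFile UserPackage/UserPackageProduction.tcl
#--------------------------------------------------------------
```

A pointer skim of this kind saves disk space but, as noted above, is not suitable if the output collection has to be copied to other sites later.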
The physics configuration file should look like
# ------ BToSomething Physics configuration file ------
FwkCfgVar extPar defaultval ;# set in the job config. snippet
sourceFoundFile CompositionSequences/WhateverINeed.tcl
sequence append BToSomethingPhysics WhateverINeed
module enable myModule ;# it can be the Filter that selects the events
talkto myModule {
    par1 set val1
    par2 set $extPar ;# the value can be set in the job config. snippet
}
sequence append BToSomethingPhysics myModule
# -----------------------------------------------------
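The comments in the physics configuration file say that extPar can be set in the job configuration snippet. A minimal sketch of what that could look like is given below; extPar is the placeholder parameter name used above, myValue is a hypothetical value, and the sketch assumes that FwkCfgVar keeps a value that is already set when the physics file is read.

```tcl
#-------- job configuration file -------
set stream BToSomething
set inputList input_collection
# 'extPar' is the placeholder name declared in the physics configuration
# file; 'myValue' is a hypothetical value replacing 'defaultval'.
set extPar myValue
sourceFoundFile UserPackage/UserPackageProduction.tcl
#----------------------------------------
```

Keeping the value in the snippet means the physics configuration file itself does not need to be edited from one job to the next.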
Notes
runSkim <stream> <package>
[outputColl] [components]
[cndLists] [cndUsrData] [eventUsrData]
[cndConfigOptions]
<stream>
Identifies the analysis. runSkim creates a path
named <stream>Path and appends all the needed sequences to it.
<package>
Identifies the package in which the <stream>Physics.tcl file is
searched for. If <package>/<stream>Physics.tcl is
found, a sequence named <stream>Physics is automatically created
and added to the path.
[outputColl]
Sets the output collection name. If present, a BetaMiniWriteSequence<stream>
is configured and added to the path. By default there is no
output collection and, consequently, no write sequence, which allows the
user to run test jobs without writing an output collection.
[components]
Components to write in the output collection.
Available options are
| pointer | borrow all mini data ( the mini includes the micro ) |
| deepCopyMicro | write micro data and borrow the rest of the mini |
| deepCopyMini | write all the components |
| tagOnly | write just the tag and borrow all the rest of the mini |
| explicit list of components | explicit list of components to write ( except usr, see Notes below ) |
[cndLists]
tcl list to set the names of the composite BtaCandidate lists to write
[cndUsrData] [eventUsrData]
tcl lists to set the names of BtaCandidate-level and
Event-level User Data blocks to write
[cndConfigOptions]
tcl list specifying which vertexing and four-momentum information is
cached for BtaCandidates, and configuring the branch structure for
different particle types. The options are given as keyword=value pairs.
Keyword: cndStoreOpt, allowed options are:
| RecoP4 | cache 4-momentum for all reconstructed BtaCandidates |
| CompositeP4 | cache 4-momentum for all persisted composite BtaCandidates |
| CompositeVtx | cache vertex position for all persisted composite BtaCandidates |
| CompositeDaughters | store cached information also for all the daughters of persisted composites |
| <Name> <types in PDT> <cache variables> | To store candidates by particle type (see example below) |
Keyword: trkFitType, allowed options are:
| All | Store track fits for all stable particle mass hypotheses |
| any of Electron, Muon, Pion, Kaon, Proton | Store the track fits for any of the specified particle types |
| BtaCandidate | Store track fits specified by the type of the track-based BtaCandidates being stored |
Keyword: trkFitStorage, allowed options are:
| ZAxis | Store track fits at the point of closest approach to the z axis |
| CandPoint | Store track fits at the point of closest approach to the BtaCandidate production vertex |
| None | Don't store track fits (usable only when storing the full mini and reading back in refit mode) |
Keyword: cluster, allowed options are:
| any of Esd, Aod, Tru, Tag, Cnd | Cluster the specified components into their own file, e.g., cluster={Esd, Aod}. By default, all components are clustered into the same file |
The default is to cache nothing. All the options above are
cumulative: any combination of them can be passed as a tcl list.
For example, one can do
lappend cndOptions "cndStoreOpt=CompositeVtx"
lappend cndOptions "cndStoreOpt=RecoP4"
lappend cndOptions "cndStoreOpt=CompositeP4"
lappend cndOptions "cndStoreOpt=Pion pi+ pi- p4"
lappend cndOptions "cndStoreOpt=Kaon K+ K- p4"
lappend cndOptions "cndStoreOpt=Proton p+ anti-p- p4"
lappend cndOptions "cndStoreOpt=B0 B0 anti-B0 p4 vtx"
lappend cndOptions "cndStoreOpt=D0 D0 anti-D0 p4 vtx"
lappend cndOptions "trkFitType=All"
lappend cndOptions "trkFitStorage=CandPoint"
... ...
runSkim <stream> <package>
[outputColl] [components]
[cndLists] [cndUsrData] [eventUsrData]
$cndOptions
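For reference, a fully spelled-out call might look like the sketch below. It only combines argument values already described on this page (the placeholder collection, list and user block names come from the job configuration snippet); the particular choice of options is illustrative, not a recommendation.

```tcl
# Illustrative option list built from keywords documented above.
set cndOptions {}
lappend cndOptions "cndStoreOpt=CompositeP4"
lappend cndOptions "cndStoreOpt=CompositeVtx"
lappend cndOptions "trkFitType=BtaCandidate"
lappend cndOptions "trkFitStorage=ZAxis"

# Placeholder names taken from the job configuration snippet.
runSkim BToSomething UserPackage \
    /work/users/user_id/output_collection deepCopyMicro \
    {MyList1 MyList2} {MyUserBlock1} {MyEventUserBlock} \
    $cndOptions
```

Since all the bracketed arguments are optional, a test job that writes no output collection would reduce to `runSkim BToSomething UserPackage`, as described under [outputColl].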
Q: Why should I use two different files, one for general production configuration and another one for my Physics Analysis configuration? I want to put everything in a single file!
A: It separates the configuration aspects common to all the analyses in
the area of interest of the package from your analysis-specific configuration.
For example, you may have more than one analysis using the same
package, and they may share some tools and scripts. Or you may have different
configurations for the same analysis, and you can switch from one to
another without modifying the existing configurations.
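As a sketch of what such a switch could look like: since runSkim looks for <package>/<stream>Physics.tcl, selecting a different physics configuration in the same package may be as simple as changing the stream name in the job configuration snippet. The second stream name below, BToSomethingLoose, is hypothetical and would require a corresponding UserPackage/BToSomethingLoosePhysics.tcl file.

```tcl
#-------- job configuration file -------
# Selects UserPackage/BToSomethingPhysics.tcl. To switch to another
# physics configuration kept in the same package, change only this line,
# e.g. to the hypothetical stream name BToSomethingLoose.
set stream BToSomething
set inputList input_collection
sourceFoundFile UserPackage/UserPackageProduction.tcl
#----------------------------------------
```

The production file is left untouched in both cases, which is the point of keeping the two files separate.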