Kanga++: Statement of Purpose
Goals and Basic Design Issues
These are the basic goals and design issues in Kanga++.
- To make it possible to use the full set of ROOT functionality (mainly plotting or running macros directly) on the Kanga data. Achieving this required substantial reworking of the structure of Kanga.
- It is intended that users will be able to replace large-scale ntuple production with Kanga++ functionality in the skim micro (or mini) output.
- Since ROOT provides mechanisms to call functions of stored objects and to rebuild the stored objects themselves, a user can plot the return value of any member function of any stored object, or directly access that stored object in an analysis macro.
- Kanga (and Kanga++) files are (in principle at least) a "complete" representation of the data in the sense that they contain enough information to do any analysis. This means that there is never a need to re-run the production because some variables are missing from the ntuples.
- Since the data in Kanga files are stored as objects, it is possible to reuse existing BaBar software to manipulate the data, rather than having to rewrite analysis functions for ntuple analysis.
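The idea of plotting the return value of a member function of a stored object can be sketched in plain C++ with no ROOT dependency. This is only an illustration of the concept, not Kanga++ or ROOT code: `Candidate`, `Histogram1D` and `fillFrom` are hypothetical names invented here, and in practice ROOT's own tree-drawing and histogram classes would do this work.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stored object, standing in for a candidate class kept in a file.
class Candidate {
public:
    Candidate(double px, double py, double mass) : px_(px), py_(py), mass_(mass) {}
    double mass() const { return mass_; }
    double pt2() const { return px_ * px_ + py_ * py_; }  // squared transverse momentum
private:
    double px_, py_, mass_;
};

// Minimal fixed-range histogram (hypothetical; not ROOT's histogram class).
class Histogram1D {
public:
    Histogram1D(std::size_t nbins, double lo, double hi)
        : lo_(lo), hi_(hi), counts_(nbins, 0) {
        assert(nbins > 0 && hi > lo);
    }
    void fill(double x) {
        if (x < lo_ || x >= hi_) return;  // values outside [lo, hi) are ignored
        std::size_t bin =
            static_cast<std::size_t>((x - lo_) / (hi_ - lo_) * counts_.size());
        if (bin >= counts_.size()) bin = counts_.size() - 1;  // guard against rounding
        ++counts_[bin];
    }
    std::size_t count(std::size_t bin) const { return counts_.at(bin); }
    std::size_t entries() const {
        std::size_t n = 0;
        for (std::size_t c : counts_) n += c;
        return n;
    }
private:
    double lo_, hi_;
    std::vector<std::size_t> counts_;
};

// Fill a histogram from any const member function returning double,
// so no per-variable ntuple column has to be produced in advance.
template <typename T>
void fillFrom(Histogram1D& h, const std::vector<T>& objects,
              double (T::*getter)() const) {
    for (const T& obj : objects) h.fill((obj.*getter)());
}
```

Calling `fillFrom(h, candidates, &Candidate::mass)` and then `fillFrom(h2, candidates, &Candidate::pt2)` plots two different quantities from the same stored objects, which is the point of keeping objects rather than a fixed set of ntuple variables.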
Analysis Model:
A potential analysis model would look something like this.
- A group of people (AWG?) would produce full Kanga files for the events that pass their skims. I'll call this a "Skim" file.
- Then the various members of the AWG could run their own analyses on the "Skim" data, each producing output files which could reference candidates and data back in the original "Skim" Kanga. I'll call these the "Re-Skim" files.
- This is no different from the existing functionality in Objectivity. However, the obvious advantage here is that users could access data directly from the Kanga files, without the overhead of the BaBar framework.
- Each of the "Re-Skim" files of a particular "Skim" file is independent of all the other "Re-Skim" files. However, they each depend on the "Skim" file to the extent that they refer to data in the "Skim" file, and will return null values for those data if the parent file is no longer present.
- In principle, there is no limit to the amount of nesting. That is to say, a user could run on their "Re-Skim" data and use it as a base to make "Summary" data files, and so on.
- "Summary" data could be, for example, just the variables to be passed to the fitter.
- Then for simple analyses (plotting a few variables, for instance) users can run directly from ROOT. For more complicated analyses (involving rebuilding all the candidates and redoing the combinatorics) they can re-run a framework job from Kanga.
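The dependency between "Re-Skim" and "Skim" files described above can be sketched as follows. This is a conceptual model in plain C++, not the Kanga++ implementation: `SkimFile`, `FileCatalog`, `ReSkimEntry` and the file name used in the usage note are all hypothetical, and the catalog is only a stand-in for whatever files are actually reachable on disk.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical candidate stored in full in a "Skim" file.
struct SkimCandidate {
    double mass;
};

// Hypothetical "Skim" file: owns the complete candidate data.
struct SkimFile {
    std::vector<SkimCandidate> candidates;
};

// Stand-in for the set of "Skim" files that are currently available.
class FileCatalog {
public:
    void add(const std::string& name, const SkimFile& file) {
        assert(!name.empty());
        files_[name] = file;
    }
    void remove(const std::string& name) { files_.erase(name); }
    const SkimFile* find(const std::string& name) const {
        auto it = files_.find(name);
        return it == files_.end() ? nullptr : &it->second;
    }
private:
    std::map<std::string, SkimFile> files_;
};

// Hypothetical "Re-Skim" entry: keeps its own derived value and only a
// reference (file name plus index) back to the parent "Skim" data.
struct ReSkimEntry {
    std::string parentFile;
    std::size_t candidateIndex;
    double fitVariable;  // stored locally, so it survives loss of the parent

    // Returns null when the parent file is missing or the index is invalid.
    const SkimCandidate* resolve(const FileCatalog& catalog) const {
        const SkimFile* parent = catalog.find(parentFile);
        if (parent == nullptr || candidateIndex >= parent->candidates.size()) {
            return nullptr;
        }
        return &parent->candidates[candidateIndex];
    }
};
```

With a parent registered under a name such as "skim.root", `resolve` returns the referenced candidate; after `catalog.remove("skim.root")` it returns a null pointer, while `fitVariable` remains readable because it lives in the "Re-Skim" entry itself. A "Summary" level could be modelled the same way, holding only locally stored values plus a reference to its "Re-Skim" parent.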
Eric Charles
Last modified: Wed Nov 24 11:50:20 PST 2004
eac-041122: Modified to use terms "Skim" and "Re-Skim" to match with adopted usage.