============================================
A proposal to modify BaBar SRT & Framework
(CDB migration related)
============================================

Igor Gaponenko, Akbar Mokhtarani
May 22, 2006

__________________
1. What we suggest

The general idea is to extend the static package dependency scheme we have
in SRT to allow automated dependency generation for Framework-based
applications that are (read-only) clients of CDB. A small modification to
the BaBar Framework infrastructure will also be needed.

The proposed extension will help us accomplish the complex task of migrating
CDB clients from Objectivity/DB to ROOT I/O and MySQL (though we're not
saying that it will automatically solve all related problems). Its specific
importance in this context is that automated dependency generation opens
the most efficient migration path for Framework applications. We call it
"optimal" for the following reasons:

* it will be _generally_ non-intrusive to Framework applications (some
  cleanup in static dependency files will be needed)

* it will work the same way for Framework applications that are part of
  official releases and for those that are in CVS only (private analysis
  applications of users)

* it won't affect other applications

* very few changes to TCL files are expected; the overall structure of TCL
  sequences and modules won't change

* code modifications will be confined to persistent packages and
  technology-neutral proxies

* incremental development and deployment of (CDB) migrated code will be
  possible; changes will be integrated into applications as soon as they
  are released; step-by-step testing of new changes will be possible

The rest of this document gives further justification for the proposal,
followed by a concrete suggestion of the changes to be made to SRT and
Framework.

  ____________________________________________________________________
  NOTE: It may also be worth noting that the proposed scheme is going
  to be transparent to top-level application developers. So, later on,
  the scheme can be replaced with the present static dependency model,
  should this be needed.

___________________________________
2. Why we request this modification

____________________________________
2.1. A bit of history of the problem

First of all, everything discussed in this document is related to the
so-called "payload" migration of CDB client applications. The core part of
CDB based on ROOT I/O and MySQL has already been done (though some extra
work is still needed there).

A major obstacle to the payload migration is that BaBar applications were
never designed to be fully neutral with respect to persistent technology.
As a result, the very complex code of Framework-based applications (rich in
physics algorithms and logic) has come to depend, either directly or
indirectly, on low-level packages dealing with the Objectivity API. These
are mostly the packages containing CDB "proxies".

This problem was partially solved by Fall 2002, when the new CDB was
deployed in BaBar. Unfortunately, that solution only covered the "metadata"
part of CDB. Interfaces to the "payload" classes (the user-defined DDL ones)
weren't changed at that time.

A graphical illustration of the problem can be found, for example, on the
left diagram of the first slide:

  http://www.slac.stanford.edu/~gapon/CDB/Objy2Root/Doc/AnalyzeDeps_BaBarSoft.ppt

The second slide is a concrete example of what we have in BaBar code for
just one selected application and one selected "payload" package.

One year ago (Spring 2005) a second step was made to further decouple the
persistent context from the transient one.
Here is the relevant document:

  http://www.slac.stanford.edu/BFROOT/www/Public/Computing/Databases/experts/docTechNeutralProxiesAndPayloadTranslators.html

The idea is illustrated on the right diagram of the first slide mentioned
above.

____________________________________________________________________
2.2. A straightforward attempt to accomplish the "payload" migration

A straightforward approach to the ROOT "payload" migration (the one taken
when migrating the BetaMiniUser application) would require the following
tasks to be performed over BaBar code:

1. Migrating DDL classes to RDL ones (we have ~220 of them).

2. Migrating proxies to the technology-neutral model (~200). Some proxies
   need to be moved into newly created (tech-neutral only) packages to stay
   separated from the relevant persistent classes.

3. Implementing persistent-to-transient translators for persistent classes
   (one per persistent class). For most persistent classes it's just one
   line of C++ code to instantiate a special template.

4. Creating a set of new modules and sequences (parallel to the existing,
   tech-specific ones) instantiating the proxies.

5. Performing a thorough dependency analysis to find all Framework
   applications which may be affected by this change, and modifying their
   AppUserBuild.cc to instantiate the relevant CDB sequences (and modules).
   There are 317 such files in release 20.1.0 alone. We know that there are
   a number of user analysis applications in private CVS packages as well.

6. Performing a dependency analysis of TCL files and modifying them to
   install the new CDB modules.

Obvious issues with that plan are related to its steps 4, 5 and 6. In
particular:

* Changes to lower-level packages aren't possible w/o affecting (breaking)
  hundreds of Framework applications. Each of these applications has to be
  migrated simultaneously.

  _______________________________________________________________
  NOTE: Exactly the same problem can happen to the migrated
  applications in the future if, for example, a new persistent package
  is added. Then all top-level applications which need that condition
  will stop functioning w/o an explicit change in their static
  dependency lists.

* An incremental migration and deployment of the applications doesn't seem
  to be possible. Modifications to low-level packages would have to be
  followed by an immediate change to all relevant applications (including
  those which aren't part of official releases).

In fact, the only scenario in which that plan might work is one where all
code development in BaBar is frozen for a substantial period of time
(6+ months). We can't afford this.

  ______________________________________________________________________
  NOTE: The CDB "payload" migration is much more complex and much less
  "mechanical" than the one made in BaBar when migrating to the C++
  Standard Library.

_______________________
3. What exactly we need

In the previously mentioned "technology-neutral" model of proxies there are
two types of packages. Let's call them:

  CONSUMERS: the ones which have technology-neutral proxies. These are
             purely technology-neutral packages. Their proxies need a
             (persistent-to-transient) translation service for the relevant
             transient classes in order to build their transient products.

  PROVIDERS: a group of technology-specific packages which have persistent
             (DDL or RDL) classes and the corresponding
             (persistent-to-transient) translators.

These two groups of packages "communicate" with each other by means of
transient types whose objects are needed by CONSUMERS and provided by
PROVIDERS.
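
The relationship between the two groups can be sketched as a lookup keyed by
transient class name. This is only an illustration written in modern C++
(C++11 or later); the `Package` structure and the `findProviders` function
are hypothetical names introduced here and are not part of SRT or Framework.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical description of a package: the transient classes it needs
// (CONSUMER role) and the transient classes it can translate (PROVIDER role).
struct Package {
    std::string name;
    std::set<std::string> imports;  // transient classes required by proxies
    std::set<std::string> exports;  // transient classes served by translators
};

// Return the names of all packages exporting at least one transient class
// that is imported by some package in the list. Imports for which no
// provider exists are collected in 'missing' so the caller can report them.
std::set<std::string> findProviders(const std::vector<Package>& packages,
                                    std::set<std::string>& missing)
{
    // Index: transient class name -> names of packages exporting it
    std::map<std::string, std::set<std::string>> providersOf;
    for (const Package& p : packages)
        for (const std::string& cls : p.exports)
            providersOf[cls].insert(p.name);

    std::set<std::string> result;
    for (const Package& p : packages) {
        for (const std::string& cls : p.imports) {
            auto it = providersOf.find(cls);
            if (it == providersOf.end()) {
                missing.insert(cls);  // no provider known for this class
                continue;
            }
            result.insert(it->second.begin(), it->second.end());
        }
    }
    return result;
}
```

Given a consumer importing a transient class and a provider exporting the
same class, the function returns the provider's name; a class nobody exports
ends up in `missing`. Keying on the transient class name alone is what keeps
the consumer side unaware of which persistent technology serves it.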

  __________________________________________________________________________
  AND HERE IS A VERY IMPORTANT THING: an attempt to migrate the BetaMiniUser
  analysis to ROOT I/O was undertaken based on this CONSUMER/PROVIDER model
  and the _STATIC_ dependency approach. This was done on a copy of a subset
  of existing (Objy-specific and relevant) packages and TCL
  sequences/modules. As a result, a sort of "branch" in BaBar code has been
  created. Apart from the question of whether this approach is applicable
  to the rest of the applications (perhaps ~300), there are serious
  concerns about the maintainability of the duplicated code.

Here are two complementary things we're proposing:

1. "Refactor" the existing Objectivity/DB based BaBar code first by
   splitting it into the technology-neutral and technology-specific model
   of CONSUMERS and PROVIDERS. This will make the code ready for the final
   step of replacing Objectivity/DB based providers with ROOT based ones.

2. Deploy an automated mechanism for matching CONSUMERS and PROVIDERS at
   the level of SRT. That mechanism would find the match and instantiate
   (automatically and transparently for Framework application developers)
   the providers needed to satisfy the consumers of a given application.

Once the refactoring stage is done for all relevant packages, the switch
from Objectivity/DB specific PROVIDERS to the ROOT based ones can be done
rather quickly (provided that the work on migrating DDL classes to RDL ones
is also accomplished by that time).

Major benefits we expect from the proposed scheme:

* The refactoring (and further migration) can be confined to the low-level
  packages only, and perhaps one or more intermediate packages
  instantiating technology-neutral proxies. Once it's done, the code will
  be automatically available to all relevant applications w/o making big
  changes (if any) to the applications.

* Changes won't break existing applications, as they'll be properly built
  and configured.

* We'll be able to see the effect of changes instantly. This means rapid
  testing of the modifications w/o waiting through a long cycle until all
  relevant code gets migrated.

* The partially refactored applications will be fully functional, as they
  were before the refactoring. For us this means that we'll be able to keep
  working on the refactoring within the main stream of (even production)
  releases.

* In theory, we may even consider partially _migrated_ applications, in
  which some conditions are loaded from the ROOT based CDB! That would be
  very helpful for application-level testing.

A general implementation of the idea is based on explicitly specified
"export" (from PROVIDERS) and "import" (into CONSUMERS) lists. Transient
class names will be the keys in the lists.

______________________________
4. How this can be implemented

In our test releases, we've already tested one way to implement the
explained ideas. Let's consider as an example these three packages from
release 20.1.0:

  "BgsApp"      - builds the binary BgsApp

  "PepSimGeomP" - a persistent (DDL) package having one persistent class,
                  "PepSimPLayout", and an installer function for the
                  translator, "PepSimGeomPTranslators". That's our PROVIDER
                  package for the transient class "PepInertMaterial".

  "PepSimGeom"  - a CONSUMER package

together with the two infrastructure packages that had to be modified:

  "SoftRelTools"
  "Framework"

Changes can be seen in this test release:

  SLAC AFS: ~gapon/vol2/releases.20.1.0_DynDepends/

An explanation of the changes found in that release follows for each of the
mentioned packages:

______
BgsApp

No changes for this package except getting rid of direct static
dependencies on the relevant persistent packages. This move itself is
optional, as the application will still link properly.

___________
PepSimGeomP

Added two new files.
The first one represents an "export" list of translators this package
provides:

  // File: PepSimGeomP_Bdb.mk

  # Export list for classes provided by the current package
  ifneq (,$(CDBIMPORT))
    CDBEXPORT = PepInertMaterial
    ifneq (,$(filter $(CDBIMPORT), $(CDBEXPORT)))
      override LINK_PepSimGeomP += $(PACKAGE)GNUmakefile
      override CDBEXPORTPACKAGES += PepSimGeomP
    endif
  endif

Note that this file is included into "SoftRelTools/standard.mk" in the
context of building a binary target of a Framework-based application for
which this provider will be needed.

The code of that new file can be further simplified by moving everything
but the assignment of the 'CDBEXPORT' variable into
"SoftRelTools/standard.mk". So the file may look as simple as:

  // File: PepSimGeomP_Bdb.mk

  CDBEXPORT = PepInertMaterial

Also note that having "_Bdb" in the file name implies that this package is
a provider of translators from Objectivity/DB specific DDL classes. For
ROOT RDL we'll have a similar package with a file ending with "_Roo.mk".
This naming convention will be used by the automated dynamic dependency
construction procedure.

Then we have the second group of files, which declare and define a global
function instantiating persistent-to-transient translators for the DDL
classes of the package:

  // File: PepSimGeomPTranslators.hh

  #ifndef PEPSIMGEOMPTRANSLATORS_HH
  #define PEPSIMGEOMPTRANSLATORS_HH
  extern void PepSimGeomPTranslators();
  #endif

  // File: PepSimGeomPTranslators.cc
  ..
  void PepSimGeomPTranslators()
  {
    registerTranslator(new PepSimPLayout_Translator());
  }

__________
PepSimGeom

Added one line with a list of required (to be "imported") transient classes
to the existing file. For these classes the corresponding PROVIDER match
will be identified by the automated procedure:

  // File: link_PepSimGeom.mk
  ..
  override CDBIMPORT += PepInertMaterial

____________
SoftRelTools

Added a Perl script to generate a function required by the extended
Framework (see the next topic after this section):

  // File: CdbGetProviders.pl

For the CDB technology "Bdb" (the CDB codename for Objectivity/DB) and the
BgsApp binary, the script will generate this global function:

  #include "BdbCondModules/BdbCondInitSequence.hh"
  #include "PepSimGeomP/PepSimGeomPTranslators.hh"
  ..
  void AppUserBuildExt( AppUserBuild* forWhom )
  {
    BdbCondInitSequence( forWhom );
    PepSimGeomPTranslators();
  }

The output file is placed into:

  tmp/$BFARCH/BgsApp/BgsApp_AppUserBuildExt_Bdb.cc

Note that the generated file will also have the proper initialization for
all relevant "infrastructure" sequences & modules, like the "CdbBdbInit"
module in the Objectivity/DB CDB and "CdbRooInit" for CdbRoo.

The "standard.mk" has also been modified around the existing line where
"PackageList/link_all_reco.mk" gets included, by adding this:

  ..
  PACKAGELIST_SAVED     := $(PACKAGELIST)
  LINKLISTDEPENDS_SAVED := $(LINKLISTDEPENDS)
  LOADLIBES_SAVED       := $(LOADLIBES)

  include PackageList/link_all_reco.mk

  ############################################################################
  ## Begin CDB specific section which is meant to generate dynamic
  ## dependencies of a binary target on persistent packages for the requested
  ## technology.
  ############################################################################
  #
  ifneq (,$(BINNAME)$(filter $(MAKEGOAL),$(BINARIES)))
  ifneq (,$(CDBIMPORT))

  -include $(foreach PKG,$(PACKAGELIST), $(PKG)/$(PKG)_Bdb.mk)

  override PACKAGELIST     := $(PACKAGELIST_SAVED)
  override LINKLISTDEPENDS := $(LINKLISTDEPENDS_SAVED)
  override LOADLIBES       := $(LOADLIBES_SAVED)

  ## Save the list of imports to prevent its expansion during the second
  ## pass of the "link_all_reco.mk"
  ##
  CDBIMPORT_SAVED := $(CDBIMPORT)
  include PackageList/link_all_reco.mk
  override CDBIMPORT := $(CDBIMPORT_SAVED)

  ifeq (,$(CDBTECH))
  CDBTECH := Bdb
  endif

  ifneq (,$(BINNAME))
  CDBFILE := $(TOPDIR)/tmp/$(BFARCH)/$(PACKAGE)/$(BINNAME)_AppUserBuildExt_$(CDBTECH)
  else
  CDBFILE := $(TOPDIR)/tmp/$(BFARCH)/$(PACKAGE)/$(MAKEGOAL)_AppUserBuildExt_$(CDBTECH)
  endif

  CDBIMPORT_OLD = $(shell if [ -f $(CDBFILE).import ]; then cat $(CDBFILE).import; else echo ""; fi)
  ifneq ($(CDBIMPORT),$(CDBIMPORT_OLD))
  $(shell echo $(CDBIMPORT) > $(CDBFILE).import)
  endif

  ifneq (,$(BINNAME))
  $(bindir)$(BINNAME): $(CDBFILE).o
  else
  $(bindir)$(MAKEGOAL): $(CDBFILE).o
  endif

  $(CDBFILE).cc: $(CDBFILE).import
      @echo "Generating $@ [CDB dynamic dependencies]"; \
      if [ -x $(TOPDIR)/SoftRelTools/CdbGetProviders.pl ]; then \
          $(TOPDIR)/SoftRelTools/CdbGetProviders.pl $(CDBTECH) $(CDBEXPORTPACKAGES) > $@ ; \
      else \
          $(BFDIST)/releases/$(BFCURRENT)/SoftRelTools/CdbGetProviders.pl $(CDBTECH) $(CDBEXPORTPACKAGES) > $@ ; \
      fi

  endif
  endif
  #
  ###################################
  ## The end of CDB specific section
  ####################################

_________
Framework

Added this header file with a declaration of the previously mentioned
function, which is generated automatically when building binaries:

  // File: AppUserBuildExt.hh

  #ifndef APPUSERBUILD_EXT_HH
  #define APPUSERBUILD_EXT_HH
  class AppUserBuild;
  extern void AppUserBuildExt( AppUserBuild* );
  #endif

The "AppMain.cc" has been modified to call the extended post-initialization
for applications:

  #include "Framework/AppUserBuildExt.hh"
  ..
  AppUserBuild* build = new AppUserBuild( AppTheFramework );
  AppUserBuildExt( build );
  ..

___________________
4.1. Open questions

An obvious side effect of the modifications to the
"SoftRelTools/standard.mk" file shown above is an increased build time for
applications. It comes from two sources:

a. scanning all packages existing in a release in search of PROVIDER
   packages (the ones having _Bdb.mk or _Roo.mk files).

b. including "PackageList/link_all_reco.mk" twice.

While the second (b) seems to be unavoidable, the first (a) may be
simplified by having a release-based list of "known" PROVIDER packages.
There should be far fewer than 100 of them (out of 1000 packages in
releases). That's where we presently have DDL "payload" classes. This
should get rid of the delay introduced at link time.

Of course, this optimization has to be implemented correctly for linking in
the "lettered" and "test" release modes, to take into account local
versions of the relevant CONSUMER and/or PROVIDER packages. Implementing
this shouldn't be a big step.
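
One possible shape of optimization (a) is sketched here in C++ (C++11 or
later) purely to make the logic concrete; in practice it would live in
"standard.mk" or in a helper script. The function names, the idea of probing
every locally checked-out package, and the inputs are assumptions made for
this illustration, not something that exists in SRT.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical helper: decide which packages to probe for export-list files.
//   knownProviders - release-wide list of PROVIDER packages (assumed to be
//                    maintained per release)
//   localPackages  - packages present locally in a test release; any of them
//                    may have gained or lost an export list, so all of them
//                    are probed regardless of the release-wide list
std::set<std::string> packagesToScan(const std::set<std::string>& knownProviders,
                                     const std::set<std::string>& localPackages)
{
    std::set<std::string> result(knownProviders);
    result.insert(localPackages.begin(), localPackages.end());
    return result;
}

// Name of the export-list file for a package and a CDB technology, following
// the "<package>/<package>_<tech>.mk" convention used in this proposal.
std::string exportListFile(const std::string& package, const std::string& tech)
{
    return package + "/" + package + "_" + tech + ".mk";
}
```

With this approach the number of files probed is bounded by the size of the
provider list plus the number of local packages, rather than by the total
number of packages in a release.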