SLAC PEP-II
BABAR
SLAC<->RAL
Babar logo
HEPIC E,S & H Databases PDG HEP preprints
Organization Detector Computing Physics Documentation
Personnel Glossary Sitemap Search Hypernews
Unwrap page!
Comp. Search
Who's who?
Meetings
FAQ Homepage
Archive
Environment
Administration
New User Info.
Web Info/Tools
Monitoring
Training
Tools & Utils
Programming
C++ Standard
SRT, AFS, CVS
QA and QC
Remedy
Histogramming
Operations
PromptReco
Simulation Production
Online SW
Dataflow
Detector Control
Evt Processing
Run Control
Calibration
Databases
Offline
Workbook
Coding Standards
Simulation
Reconstruction
Prompt Reco.
BaBar Grid
Data Distribution
Beta & BetaTools
Kanga & Root
Analysis Tools
RooFit Toolkit
Data Management
Data Quality
Event display
Event Browser
Code releases
Databases
Check this page for HTML 4.01 Transitional compliance with the
W3C Validator
(More checks...)

Offline Programming Recommendations: OO/C++

On this page: [Kinds of Classes] [Design Issues] [Documentation Issues] [Implementation Issues]
Other pages: [Background Info]


Recent Changes

  • Jan 27, 1999 -- Created by merging older pages into new web form (Bob)

Kinds of Classes

A class should be of at most one of the following types:
  • Modules (subclasses of AppModule): These are the unit of execution and processing. A module gets objects from the event, maniplates them, and puts new objects back to the event. Its essentially pure algorithm, with only enough internal state to handle configuration and control
  • Event data classes (subclasses of AbsEvtObj): All transient objects that can be in the event; the stuff that is manipulated by modules. This includes tracks, clusters, etc.
  • Classes for event persistance: The persistant classes that store representations of (2), plus the various Convertor classes that manipulate them
  • Conditions data classes (subclasses of AbsEnvObj): All transient objects that represent "conditions" data, such as calibrations, alignments, etc.
  • Classes for conditions data persistance: The persistant classes that store representations of (4), plus the various proxies, Cal* classes, etc that animate them.

Of course, simple classes for internal use may not fit in any of the above categories.

General

  1. Include files in C++ have the extension ".hh". Similarly, include files in C have the extension ".h".
  2. Implementation files in C++ have the extension ".cc". In C they have the extension ".c".
  3. Dont use inline definition files (e.g. files with the extension ".icc" Such files contain inline definitions for simple function members) (e.g. data member accessors) and are themselves included into the class header (.hh) file. We have tools for navigating C++ source that work best when everything is found in .cc and .hh files. Remember that the inline keyword is only a hint to the compiler and there is no guarantee that a complex function will actually be expanded inline.
  4. Every include file should contain a mechanism to prevent multiple inclusions. For the file XxxMyFile.hh, the BaBar convention is:
    	
    	#ifndef XXXMYFILE_HH
    	#define XXXMYFILE_HH
    	    [....]
    	#endif
    
    This should be put at the top of the file, before any opening comments, to allow the compiler to be smarter about how much of the .h file to read.
  5. Use a separate .cc file, and corresponding .hh file, for each C++ class. The filename should be identical to the class name.
  6. Do not create class names (and therefore filenames) that differ only by case.
  7. Include directives should not contain any directory path names apart from those corresponding to BABAR packages. In this case, the package name should be used as a directory prefix:
    	#include "package/filename.hh"
    
  8. The identifier of every globally visible class, function or variable should begin with the package "TLA" (Three Letter Acronym) prefix from the package to which it belongs. This implies that the implementation (.cc) and interface (.hh) files for C++ classes should also begin with the same prefix.
  9. The first line of code in each .cc file (i.e. following the header comments), should be the following:
    	#include "BaBar/BaBar.hh"
    
  10. Encapsulate global variables and constants, enumerated types, and typedefs in a class. This greatly reduces the probability of your names colliding with those from some other package.
    	class XxxMyGlobals
    	{
    	public:
    	    static void myFunc( );
    	    static int myData( );
    	    static void setMyData( int value );
    	private:
    	    int _myData;
    	};
    	[....]
    	MyGlobals::myFunc( );
    
  11. Remember to add comments to both the interface (.hh) file as well as the implementation (.cc) file for each class. People will often read just the .hh file to figure out how to use a class. Interface comments should stress what the class does whereas implementation comments should stress how. Using KDOC format for the comments is greatly preferred, as it allows us to automatically make a package reference using the LIGHT system.
  12. Avoid overloading functions and operators unless there is a clear improvement in the clarity of the resulting code. For read and write access to data members, use:
    	int myData( ) const;
    	void setMyData( const int value );
    
    rather than
    	int myData( ) const;
    	void myData( const int value );
    
    The former makes the intention clear whereas the latter is an example of using a facility because it exists, not because it has any advantages.
  13. Data members should have names that begin with a leading underscore ("_") character in order to avoid name conflicts with any accessor functions.
  14. As part of defensive coding, use the assert macro to check that specified conditions are true.
    	#include <assert.h>
    	[....]
    	void
    	MyClass::myFunction( char* name )
    	{
    	    assert ( 0 != name ); // Check that the name is not NULL
    
    The assertion can be removed from the code, once it has been tested, by compiling with the symbol NDEBUG defined.
  15. Don't implicitly compare pointers to nonzero (i.e. do not treat them as having a boolean value). Use
    	if ( 0 != ptr ) ...
    
    instead of
    	if ( ptr ) ...
    
  16. If you are doing an assignment in a comparison expression, make the comparison explicit:
    	while ( 0 != (ptr=iterator() ) ) ...
    
    prevents the next person from coming along, seeing
    	while ( ptr=iterator() ) ...
    
    and saying "That idiot doesn't know the difference between = and ==, I'll just fix this..." and messing up your code.
  17. Unfortunately, the compilers we are currently using don't support namespaces. Don't use them.

Format of Comments

Use KDOC and LIGHT!

When using C++, the preferred form for short comments is:

	// This is a one-line comment.

i.e. use the "//" comment format. If the comment extends over multiple lines, each line must have // at the beginning:

	// This is a long and boring comment. 
	// I need to put // at the start of each line. 
	// Note that the comment starts at the // and 
	// extends to the end of line. These comments
	// can therefore appear on the same line as code, 
	// following on from the code.

Class and Module Design Issues

  1. All .cc files should have
    	#include "BaBar/BaBar.hh"
    	
    as the first non-comment line, before any other include. This gives us a tool for introducting typedefs, #define's and other ways of handling compiler differences and class migrations.
  2. It often useful to choose whether a given pointer "owns" the object its pointing to or not. If it does, the object containing the pointer is responsible for deleting the pointed-to owned object when the owning object has finished using the owned object.
  3. A class that contains "owned" pointers to other objects must have a non-trivial destructor to delete them.
  4. If a class contains a non-trivial destructor, it must be virtual, unless positive steps have been taken to prevent inheritance from the class. This is to prevent the non-trivial destructor from being omitted if an object of a subclass is being destructed.
  5. Arguments passed by reference should never be retained. Use a pointer argument if its possible that the address will be retained for later use. Reason: To clue the user of the routine that an address is being used.
  6. Member functions which merely return a value (calculated or from a member variable) should be named for the value, e.g. theta(), x(), length(). Member functions which store a value into internal state should start with "set", e.g. setTheta(double ), setX(double ), setLength(int ).
  7. Classes should only provide constructors that build objects that can do what the class is supposed to do. It sounds obvious, but it means classes must provide only constructors that make sense for creating fully functional objects, and should not provide constructors that do not receive enough information to build a fully functional object. Document how these should be used, including what the null constructor produces. A null constructor and a copy constructor are desirable iff they are consistent with what the class is intended to do. If this means that the null constructor is not appropriate, that should be documented in the header file.
  8. We have a convention for print methods. "print(ostream & )" prints a single line summary of an object, without a trailing endl. "printAll(ostream & )" prints all the internal state of an object. See AbsEvtObj.hh for more details. If any print methods are defined, both of these must also be.
  9. Always consider the use of "standard objects" within a new class. If not appropriate for design reasons, document this with a short statement in the header comments. If not appropriate because of missing, though consistent, functionality in the standard object, everybody will gain if you take the time to add it.
  10. Static variables and public data members in module classes are forbidden because of too many chances for unexpected side-effects. There should in any case be little need for them, as there is rarely more than one object of a module class, and they should not communicate with each other. If inter-module communication is needed, put it in AbsEvent or AbsEnv instead.
  11. No "random numbers" may be used in reconstruction code. If a sequence of numbers are needed (e.g. for simulated annealling or doing an integral), they must be
    1. contained within and used by a single module, i.e. not in a independent class,
    2. have their seed reset on every event, and
    3. based on the standard G4Random classes.
  12. The size guarantees for various datatypes should be kept in mind. For integer types, these are:
    1. short: 8 bits (-127 to +127)
    2. int: 16 bits (-32767 to +32767)
    3. long: 32 bits
    In particular, use of "short" and "int" should be rare and well documented.

    The floating point representations are harder to quantify, but the net effect is that only "double" should be used in classes that are likely to be reused or written out. Optimization to "float", where needed and acceptable, can be done later.

  13. Functions should return their value in their return type instead of via arguments. If more than one value must be returned, document in the header file why a new type (class) should not be defined to return the values as a whole.
  14. Provide assignment operators and copy ctor for any class with owned pointers. These can either do the necessary work, or just be declared "private" and call abort(). They are needed because the compiler provides them unless they are defined, and the default ones do not handle owned pointers well. Be careful of self-assignment; you need to carefully consider what special things need to happen if somebody writes "a = a;" or the equivalent.
  15. Modules should only change their behavior based on parameters and Framework commands (including custom ones, if needed). Use of environment variables directly must be replaced by reading them from parameters (which can be set from environment variables in the .tcl scripts) Putting values of cuts, controls, etc, in "configuration files" is only acceptable as a temporary measure until they can be loaded from the database; this should be minimized.
  16. A class should provide arithmetical operators only if they will work very much like the underlying operators from high-school math. Its important that users get what their prior experience leads them to expect when they use operator + or similar.
  17. Because several of the C++ compilers we depend on cannot reliably handle them, do not use "template specialization" in your designs. This is when you provide more than one definition of the code for a template, with different ones for general use and for use with certain templated types.
  18. C++ classes that "wrap" FORTRAN (e.g. provide a C++ interface) must take into account that FORTRAN code is generally not reentrant, and may not correctly save state across multiple "wrappings". Acceptable solutions include:
    • Keeping the state in the C++ objects, and passing it as needed to the FORTRAN code (HepTuple works effectively this way)
    • Using a singleton pattern to make sure there is really only one C++ object
    • Using C++ wrappers around FORTRAN code is not an acceptable way to write a module. If significant FORTRAN code is called from a module, the module should inherit from AppNoCloneModule instead of AppModule.
    In any case, make sure the restrictions on multiple uses are clearly documented.

Documentation Issues

  1. Classes that logically cannot have more than one instantiation (singleton pattern, or use of static members for major behavior) must document why its assumed there will be only one unique instantiation. This must be an argument about how the class is to be used when doing physics, not a discussion of how its implemented. In particular, explain why multiple copies will not be needed for "before and after" comparisons, "MC vs. data" checks, etc.
  2. Ownership of the object that a pointer references should be clearly thought out and documented. Every pointer in a class definition must be labeled as either "owned" or "unowned". Every list should be labeled similarly.

C++ Implementation Issues

  1. Never overlay a fortran name with C++ code. This includes using f2c to merely translate an existing routine. Reason: The linker can pick either the new C++ or original F77, and its impossible to guarantee that the two are equivalent.
  2. Routines that do not have a void return type should always return something (i.e. no "return;" statements or running off the end of the procedure body). Null bodies (in member functions that have not yet been written, for example) often trigger this problem.
  3. Preconditions and postconditions are checked with assert( "condition" ). These may later be compiled out. Whereever possible, these should be at the beginning and end of a routine, respectively, and commented.
  4. If checks for internal logic errors are outside an assert, they should call abort(), not assert(0) when they fail. This way the action and check will have the same behavior in test and production versions.
  5. Be careful about self-assignment, i.e. you will probably have to have special purpose code to handle the case "aObject = aObject;", as in that case memory management has to be careful not to delete things that its going to use later.
  6. Use of "inline" requires judgement.

    "inline" makes accessor functions a very good idea. In particular, only accessing member data via accessor functions, even within the class member functions, is required for all data and computational classes to ease migration and maintenance. Performance improvements may not be used as a reason to avoid this; all our compilers will inline accessors at a high enough optimization level, and most accessors are not a performance problem in any case.

    On the other hand, long, complex in-line member functions should be avoided. They reduce clarity, make it take longer to compile and harder to debug, and the performance gain from making them inline is small for large routines. It is bad design to make a function inline if that requires additional header files to be referenced by #include. In particular, one or more compilers have restrictions that make it unacceptable to inline:

    • functions with multiple returns (except the simplest 'if...return x; else...return y;', and the 'else' *must* be present).
    • functions with 'while' or 'for' statements
    • functions with aggregate initialization ('array = {a1,a2,...an};')
    • value returning functions which call non-value- returning functions.
    To avoid future problems, do not inline:
    • functions calling member functions of other classes
    • functions calling virtual member functions of this class
    • functions calling templated functions
    In effect, don't inline anything that calls something else, or involves significant control structures.
  7. Only the *GHit and other Geant3 classes should currently be based on dbio.
  8. Use "bool", not "int", for true/false data items.
  9. Use enums for small named constants within classes. Do not use #define for this because it can result in name collisions in other people code. Unfortunately, use of "const int nlay=15;" inside a class is not supported by some compiler(s), so don't use this either. This is the recommended form:
              enum { nlay=15 };
    
  10. cpp macros that generate code are generally not acceptable. One exception is "assert" and similar system-provided macros.
  11. Use size_t as the type for array indices instead of int or unsigned int. (See the prior item about enums for small constants, including array dimensions).
  12. Use classes, not arrays when passing arguments and return values. You need a really good reason, better than "performance", to pass an array of objects instead of a list. You have to have extremely good reasons to pass "double v[3]" instead of a Hep3Vector object.
  13. Use "const" whenever possible. In particular, it should be always used with input arguments to functions that are passed as pointers or via references. This allows the compiler to optimize, and clues the user that there will be no side-effects. Also, if you nest calls without "const" on the arguments:
       A&  inside ();
       void outside(A& a);
       outside(inside());
    
    the compiler will be forced to create an extra temporary variable so it can pass a reference to a copy of the object returned from "inside".
  14. Never use #include when a simple forward declaration ("class foo;") will do. Most header files should not need to include many others. You only need the full declaration if creating or destroying an object of that class (e.g. using as a member variable), inheriting from the class, or actually invoking a member function. Pointers, references and use as arguments in declarations do not require the header file be present. If header files from another package must be #included to make a member function "inline", do not make the function inline.
  15. Declaring a member function both "inline" and "virtual" is problematic; there are very few places where a compiler can actually do the inlining.
  16. Due to compiler limitations, we must be careful with the scoping of variable declarations in for statements. In particular,
    	for (int i=0; ... ) { ... }
    	...
    	for (int i=0; ... ) { ... }
    
    must be rewritten as
    	int i;
    	for (i=0; ... ) { ... }
    	...
    	for (i=0; ... ) { ... }
    
    Note that the conditional expression (the thing in parentheses) in a while statement may not include declarations, so we expect to keep the restriction on while statements that code like
     
    	while (Foo* p = bar() ) { ... }
    
    is not permitted, and must be instead written as
    	Foo* p;
    	while (p=bar() ) { ... }
    
  17. There should be no architecture-specific if's in the GNUmakefiles. Include an arch_spec_*.mk file, or if need be create a new one.
  18. Class static variables of classes that have non-default constructors and/or destructors must not be used. These cause too many problems at link time. Static ints, doubles, and pointers are OK. Modules should instead use member variables that point to objects that are allocated and destroyed in the begin() and end() member functions. Other classes can use a static member function that wraps a file-static variable, where the member function definition is in a .cc file:
        class Foo {
          static  Bar& refBar();
          }
          
        Foo::refBar () {
        	static Bar b;
        	return b;
        	}
    
  19. If a list of some type of objects must be in a particular order, for example hits on a track which might have to be in ascending order by pathlength, the documentation for the list should clearly specify this, and the member function used to add hits to the list should check it as a precondition. Also consider making access to the list "const" to prevent it from being reordered or changed.
  20. Indexes (offsets) within an AbsEvent list must never be stored in an object. Use a pointer instead. dbio I/O is an exception to this for historical reasons. Reason: the lists within an AbsEvent are not guaranteed to stay in the same order, particularly when written out and read back in.
  21. Reconstruction objects in memory should not carry index numbers of objects within lists internally instead of pointers. They must not contain index numbers for lists that they have not created locally. Reason: There's no way to force these to stay consistent with the actual object's index in the list. Further, the conditions database and event store systems will only ensure the target objects are available in memory when they are referenced via pointers. Note that various objects moved from GEANT3 via dbio do carry index numbers; these are a historical exception.
  22. Some compilers generate unavoidable warnings about "extraneous semicolons". They seem particular common after function bodies, where they are not needed. Remember that a semicolon in C and C++ is a statement separator, not the end of a statement. You rarely need them after a closing brace except at the end of a class definition.
  23. On 64-bit machines, you get a warning when you convert from pointer to int. In particular, this will happen when iPtr is a pointer in
    	assert(iPtr);
    
    which is better written as
    	assert(iPtr != 0);
    
  24. Some systems do not have a header file called "ostream.h", so its better to '#include <iostream.h>' regardless of what types of streams you are referencing.
  25. Current vendor compilers do not support the declaration of a data member to be 'mutable', which allows it to change even inside of a 'const' function. The 'mutable' declaration is very useful when the data member represents a a volatile cache of information which might be updated inside a 'const' function without changing the state of the core object. Another use is when RogueWave template classes are used to build object data members. Since many of the RW functions are not const (including the iterator constructor), access to RW contained objects not declared 'mutable' in 'const' member functions is severely restricted.

    To temporarily get around this lack, use const inline functions for every data member which would have been declared mutable, that 'casts off the const' for that variable. Const functions which need to alter the cache variables then access through this inline. This approach is somewhat safer than simply casting off the const of the entire object (*this) at the begining of these functions. An example is as follows:

    
    class MyClass{
    
            void doSomethingNice() const;
    
    private:
            double _cachevalue; // should be declared mutable
            double& cacheValue() const { return (double&) _cachevalue;}
            RWTPtrSList<double> _doublelist; // should be declared mutable
            RWTPtrSList<double>& doubleList() const { return
                                            (RWTPtrSList<double>&) _doublelist; }
    };
    
    MyClass::doSomethingNice() const {
    
    //  loop over the list
       RWTPtrSListIterator<double> iter(doubleList());
    ...
    //  update the cache
       cacheValue() = newvalue;
    }
    
  26. When including a non-system header file, its important that it be written as

    #include "packagename/filename.hh"
    If you omit the packagename part, there will be trouble with template instantiations.

  27. If you use "assert", you need to define it by
      #include "assert.h"
    
  28. The HP iostreams header files are in terrible shape. This implies the order of their inclusion is important. In particular, iomanip.h must have iostream.h included _before_ it. Luckily, we're not now using HP, but this is still a good idea.
  29. Both the Sun and OSF compilers will not allow you to change the return type of a virtual function when redefining it in a subclass. This is very annoying, because it prevents you from returning (for example) a subclass of what was being returned in the original function. But its a limitation we have to live with.
  30. An "accessor" member function returns access (e.g. a reference or pointer) to a contained object. If the contained object is always guaranteed to be present, i.e. the function can always return access to a real object, then a reference should be returned. If not, i.e. the function must sometimes indicate that it can't return access to a real object, it should return a pointer value, and the value of 0 (zero) should be used to indicate that no object is available. This is a standard C and C++ idiom.

    To avoid accidentally dereferencing a zero pointer, all pointer values that are returned from functions should be checked for zero values. If the program cannot cope with an non-existant value, an assert() can be used. Otherwise, use an explicit comparison to zero in an if statement.

  31. C programmers often use "NULL" in place of "0" (zero) when testing a pointer value to see if no object is pointed to. This can be problematic in C++. Please don't use NULL, but instead use the literal 0. (See this Hypernews thread for discussion)
    Return to BABAR reconstruction software page, BABAR Computing Home Page, or BABAR Home Page.


Maintained by Bob Jacobsen,
Bob_Jacobsen@lbl.gov 510-486-7355