Guidelines for Writing or Modifying BABAR Software
David R. Quarrie
for the BABAR Computing Group
Lawrence Berkeley National Laboratory, MS 50B-3238
DRQuarrie@LBL.Gov
Version Information
Draft: 21 August, 1996
This document is still under development. If you have any questions or comments, please address them to the author.
These guidelines are intended to improve programmer productivity and to promote the development of software that is easy to maintain. The overall goal is to develop reliable software that can easily be updated to accommodate unforeseen changes throughout
the lifetime of the experiment. An important component in achieving that is that the software should be clear and understandable, not only to its original author, but also to other maintainers. Such good programming practices can significantly reduce the
debugging phase of program development and ensure easy integration into the overall system.
These guidelines therefore detail both common pitfalls and methods of avoiding them, together with references to some templates and example programs that can ease the development of new software and result in a more uniform coding style than would
otherwise be possible. This uniformity can considerably ease the burden on the person who has to maintain code since they will know where to find components of the code modules and in which sequence major subsections will appear. A uniform and consistent
naming convention should prevent name conflicts when large and complex programs are created that use code from many different libraries. Such a naming convention can also ease the burden on the maintainer by allowing differentiation between local and
external variables etc.
The overall philosophy for software development within the BABAR Collaboration is to balance the number of rules and regulations by promoting good software practices through early adoption of the code
templates and sample programs in conjunction with these guidelines. However, there will be some monitoring of the software quality, especially in areas where there is the potential for irretrievable loss of data (e.g. the on-line triggering and
run-control). When you are writing code, be aware that someone other than yourself will probably look at it, and that extra comments and adherence to the recommended style guidelines will help even you when you look at the code again
in several months' time. A tiered approach to software quality monitoring will be taken, and software that is deemed adequate at one level (e.g. your own physics analysis) might well be considered unacceptable at a higher level (e.g. bulk
production) if you do not adhere to these guidelines.
Many aspects of these guidelines are based on other documents that discuss rules and recommendations in much more detail. These primary documents are listed in the Bibliography Section of
this document and you are strongly recommended to read them. We have decided to conform to them, with very few exceptions, since they cover most aspects of software development and the common pitfalls. This document touches only briefly on a few of the
suggestions and recommendations, and the primary documents should be consulted before embarking on a major software development project. The few exceptions are detailed in the Section "Differences between the BABAR Guidelines and
the Primary Guideline References".
Note that the Bibliography Section contains several references to other useful books, both the language reference manuals and books on good coding practices.
As a result of a recommendation by the Computing Group to the collaboration, the C++ language has been selected as the primary programming language for the development of BABAR software. Note that ANSI
C is allowed as a subset of C++ in areas where, for example, a C++ compiler is unavailable (e.g. Front-end boards), or there are other limiting factors.
FORTRAN-77, with a few extensions, is allowed within the context of the GEANT-3 based Monte-Carlo simulation program. This is the only context where new FORTRAN code is allowed. A plan to migrate to the C++ based GEANT-4 simulation framework is in
place.
Existing FORTRAN code can be utilized within BABAR software by "wrapping" it with C++.
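A minimal sketch of what such a wrapper might look like is given below. The routine TRKSCL, the class TrkScaler and the lower-case, trailing-underscore symbol name are all hypothetical: the symbol name in particular depends on the FORTRAN compiler in use. A C++ stand-in for the FORTRAN routine is defined so that the sketch compiles on its own; in real use that definition would be replaced by a declaration only, with the symbol supplied by the FORTRAN object file.

```cpp
#include <assert.h>

// Stand-in for a FORTRAN-77 routine so that this sketch links on its own.
// In real use only the declaration would remain and the symbol would come
// from the compiled FORTRAN code:
//       SUBROUTINE TRKSCL( VALUES, N, FACTOR )
// ASSUMPTION: the FORTRAN compiler exports the routine as "trkscl_".
extern "C" void trkscl_( float* values, int* n, float* factor )
{
    for ( int i = 0; i < *n; i++ ) {
        values[i] = values[i] * (*factor);
    }
}

// C++ wrapper (hypothetical): hides the pass-by-reference FORTRAN calling
// convention behind an ordinary C++ interface.
class TrkScaler
{
public:
    explicit TrkScaler( float factor ) : _factor( factor ) { }

    // Scale "length" values in place.
    void scale( float* values, int length )
    {
        assert ( 0 != values );   // Check that the array is not NULL
        assert ( length >= 0 );   // A negative length makes no sense
        trkscl_( values, &length, &_factor );
    }

private:
    float _factor;
};
```

Callers then see only TrkScaler::scale( ) and never deal with the FORTRAN argument conventions directly.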
The following guidelines are language-independent:
- Provide plenty of comments. The standard templates (described in the Section "Code Templates") provide a header and major subdivisions that organize the overall flow of your code. In general, every function or function
member, unless trivial, should have its own comment that describes the purpose of the function, the input and output arguments and any side-effects that the function produces. Within a function, separate major blocks of code and identify their purpose
with a leading comment. Don't bother to comment individual source lines unless there is an underlying subtlety that needs to be explained or you are documenting the arguments to a function or procedure.
- Avoid tricks unless absolutely necessary and explain why a particular approach has been taken. Terse code might seem clever when you write it, but it probably won't when you look at it again in several months' time, and it almost certainly won't be
understandable to the person who takes over from you later on.
- Code for clarity rather than efficiency. This is obviously a generalization, but you should be certain that there is an efficiency problem (e.g. by profiling your application) before you start re-arranging (and possibly obscuring) code for
efficiency reasons.
- Attempt to write linear code - i.e. code that starts at the first executable statement and flows down to a final "return" or end of block statement.
- Code defensively. Check the return values for functions and don't assume that the call succeeded. If your function only operates on a limited range of input values, check at the beginning of the function and don't assume that the caller has made the
check. For C++ and C, the assert macro can be used for this (see later).
- Use meaningful names. Names of functions in modules should include the module prefix (see later). Generally, the more visible the name, the more "English-like" it should be.
Identifiers that refer to data types, like module and class names, typedefs and enums, should have initial capital letters (LittleObject), while functions and variables should have initial lower case letters (setValue, nextChoice).
Capitalize each word after the first, and don't separate words with underscores. Avoid abbreviations (e.g. use nameLength instead of nLn) and non-standard acronyms. Using the occasional article (aBMesonCandidate) or additional
information (nextBMesonCandidate) can help where an identifier looks a bit odd (bMesonCandidate).
Multiple word names let the reader know the intended use of an item, saving the hassle of inferring from context and avoiding ambiguities. Does "empty" mean a test for emptiness or setting the object to empty? Whilst the signature might give a
clue, it is better to use names that unambiguously convey the intent of the function. In the former case, "isEmpty" would be better; in the latter case, use "setEmpty" instead. It is often possible to name functions with a verb phrase
(setValue, fitTrack, complainBitterly) and variables with a noun phrase (loopIndex, nextWorkItem).
Variables with small, temporary scope can be given small names (temp, next), although a little effort can make their content more understandable and save the reader the effort of figuring out their meaning from how they are filled
(trackLoopIndex instead of temp, nextDaughter instead of next). Generally avoid single character names except for local loop and array indices.
Don't bother putting things like "Pointer", "Class" or "Struct" into the name: it is almost always clear from the context.
- Indent each nested block in order to highlight the start and end of each of them. Use trailing comments on the block terminators if this can guide the eye. An indent of 4 spaces is the recommended BABAR standard. It is compatible with the emacs default and the indent code beautifier. Do not redefine the tab character to be 4 characters in your editor since other people are unlikely to use this definition.
- Avoid implicit data-type conversions and mixed mode arithmetic. Use explicit operators (e.g. (float)...). The use of explicit conversions will have no effect on code efficiency and might avoid any inadvertent rounding or truncation errors and
will highlight the data type conversions to anyone attempting to follow the code.
- Avoid complicated "if" constructs. It is better to use several simpler nested "if" constructs rather than a complicated compound one. Avoid implicit precedence rules and use parentheses to identify the components of such constructs. Even if you can
remember all the precedence rules it is likely that someone else won't and it will make their job easier.
- These guidelines are broad enough that a wide variety of different coding styles are still possible. When modifying someone else's code, adapt to their style rather than imposing your own.
- Avoid "go to"s.
- Minimize the use of globals. In C++ encapsulate globals into a class or classes.
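As an illustration only, the short function below tries to apply several of the guidelines above at once: a descriptive comment, meaningful names, defensive checks on the input range, explicit type conversions, a parenthesized condition and linear flow down to a single return. The function name and its arguments are hypothetical and are not taken from any BABAR package.

```cpp
#include <assert.h>

// Compute the mean number of hits per track.
// Input:   totalHitCount, trackCount (both must be non-negative)
// Output:  meanHits, filled only if trackCount is greater than zero
// Returns: true if meanHits was filled, false otherwise
// (Hypothetical helper, for illustration only.)
bool
computeMeanHitsPerTrack( int    totalHitCount,
                         int    trackCount,
                         float& meanHits )
{
    // Code defensively: reject inputs outside the supported range.
    assert ( totalHitCount >= 0 );
    assert ( trackCount >= 0 );

    // Linear flow: one result variable, one return at the end.
    bool isValid = false;
    if ( trackCount > 0 ) {
        // Explicit conversions avoid an inadvertent integer division.
        meanHits = (float)totalHitCount / (float)trackCount;
        isValid = true;
    }
    return isValid;
}
```

A caller would be expected to test the returned value before using meanHits rather than assuming that the call succeeded.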
The following guidelines are specific to the C++ and C languages. This is only a selected subset of the full recommendations. Please refer to the primary reference [1] for the complete set. Note, however, that some minor
differences are detailed in the Section "Differences between the BABAR Guidelines and the Primary Guideline References".
- Include files in C++ have the extension ".hh". Similarly, include files in C have the extension ".h".
- Implementation files in C++ have the extension ".cc". In C they have the extension ".c".
- Inline definition files have the extension ".icc". Such files contain inline definitions for simple function members (e.g. data member accessors) and are themselves included into the class header (.hh) file (see the templates for an example of
this). Remember that the inline keyword is only a hint to the compiler and there is no guarantee that a complex function will actually be expanded inline.
- Every include file should contain a mechanism to prevent multiple inclusions. For the file MyFile.hh, the BaBar convention is:
#ifndef MYFILE_HH
#define MYFILE_HH
[....]
#endif
- Use a separate .cc file, and corresponding .hh file, for each C++ class. The filename should be identical to the class name.
- Do not create class names (and therefore filenames) that differ only by case.
- In general, include directives should not contain any directory path names apart from those corresponding to BABAR packages. In this case, the package name should be used as a directory prefix:
#include "package/filename.hh"
- The identifier of every globally visible class, function or variable should begin with a prefix that identifies the class or module library to which it belongs. This implies that the implementation (.cc) and interface (.hh) files for C++ classes
should also begin with the same prefix.
- The first line of code in each .cc file (i.e. following the header comments), should be the following:
#include "BaBar/BaBar.hh"
- Encapsulate global variables and constants, enumerated types, and typedefs in a class.
class MyGlobals
{
public:
    static void myFunc( );
    static int myData( );
    static void setMyData( int value );
private:
    static int _myData;      // Static, so that the static accessors can reach it
    virtual void dum() = 0;  // Trick to make the class abstract
};
[....]
MyGlobals::myFunc( );
- Remember to add comments to both the interface (.hh) file as well as the implementation (.cc) file for each class. Interface comments should stress what the class does whereas implementation comments should stress how.
- Avoid overloading functions and operators unless there is a clear improvement in the clarity of the resulting code. For read and write access to data members, use:
int myData( ) const;
void setMyData( const int value );
rather than
int myData( ) const;
void myData( const int value );
The former makes the intention clear whereas the latter is an example of using a facility because it exists, not because it has any advantages.
- Data members should have names that begin with a leading underscore ("_") character in order to avoid name conflicts with any accessor functions.
- As part of defensive coding, use the assert macro to check that specified conditions are true.
#include <assert.h>
[....]
void
MyClass::myFunction( char* name )
{
    assert ( 0 != name ); // Check that the name is not NULL
    [....]
}
The assertion can be removed from the code, once it has been tested, by compiling with the symbol NDEBUG defined.
- Never implicitly compare pointers to nonzero (i.e. do not treat them as having a boolean value). Use
if ( 0 != ptr ) ...
instead of
if ( ptr ) ...
- If you are doing an assignment in a comparison expression, make the comparison explicit:
while ( 0 != (ptr=iterator() ) ) ...
is much better than
while ( ptr=iterator() ) ...
- The following are equally acceptable recommended styles for function declarations:
int
myFunction( int      intValue,
            char*    charPointerValue,
            int*     intPointerValue,
            MyClass* myClassPointerValue );

int
myFunction(
    int      intValue,
    char*    charPointerValue,
    int*     intPointerValue,
    MyClass* myClassPointerValue
);
Both styles leave space for a comment describing each argument.
- The following are equally acceptable recommended styles for locating braces enclosing a block:
while ( ... )
{
    ...
}

while ( ... ) {
    ...
}
- Watch out for the following:
if ( value = 0 ) ...
Where what you really meant was:
if ( value == 0 ) ...
One method of avoiding this pitfall is to specify the const item first:
if ( 0 == value ) ...
since this will give rise to a compile-time error if an assignment is specified by mistake.
- Use
while (...) {}
instead of
while (...) ;
for null loops. They happen remarkably often in C++, and the semantics of the semi-colon is a horrible trap for beginners.
- Take care when testing floating point values for equality. It is better to use:
#include <math.h>
if ( fabs(value1 - value2) < 0.001 ) ...
than:
if ( value1 == value2 ) ...
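A fixed tolerance such as 0.001 is only appropriate when the magnitude of the values is known in advance. One possible alternative, sketched below, scales the tolerance by the larger of the two magnitudes. The helper name isNearlyEqual and the default tolerance of 1.0e-6 are assumptions made for this sketch, not part of any BABAR library.

```cpp
#include <assert.h>
#include <math.h>

// Return true if the two values agree to within the given relative
// tolerance. For values smaller than 1 in magnitude the tolerance is
// treated as absolute, so that comparisons near zero still behave.
// (Hypothetical helper; the default tolerance is an assumption.)
bool
isNearlyEqual( double value1,
               double value2,
               double relativeTolerance = 1.0e-6 )
{
    assert ( relativeTolerance > 0.0 );

    double difference = fabs( value1 - value2 );

    // Scale by the larger magnitude, but never by less than 1.
    double scale = 1.0;
    if ( fabs( value1 ) > scale ) {
        scale = fabs( value1 );
    }
    if ( fabs( value2 ) > scale ) {
        scale = fabs( value2 );
    }

    return ( difference <= (relativeTolerance * scale) );
}
```

The appropriate tolerance still depends on the precision of the quantities being compared, so the default should be overridden where that precision is known.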
The following guidelines are based on experience using several C++ compilers.
- The following is not allowed:
[In file MyClass.hh]
class MyClass {
    static const int myInt = 14;
    ...
};
You have to do the following:
[In file MyClass.hh]
class MyClass {
    static const int myInt;
    ...
};
[In file MyClass.cc]
const int MyClass::myInt = 14;
This should be allowed by the latest language definition, but is a recent change and the DEC compiler doesn't yet support it.
- The following is not allowed in DEC C++:
while ( int i = ... ) {...}
Use the following instead:
int i;
while ( 0 != (i = ...) ) {...}
Note that declaring the variable in the initializer of a "for" loop is accepted.
- Omit the trailing ";" after function and function member definitions. Although this only causes a compiler warning, it's extremely messy and can in extreme cases confuse the compiler. Remember, however, that a ";" is necessary at the end of a class {
... }; declaration.
- The following fails:
class MyClass {
    const int nlay=15;
    ...
};
Use an enum instead:
    enum { nlay=15 };
Apparently the compiler thinks that nlay is meant to be a pure virtual function member and that therefore =0 is the only valid "assignment".
- Don't use "ostream.h". Use "iostream.h" instead.
- Watch out for using the name of a function member where a data member was intended. The following is rejected by the DEC C++ compiler but accepted by g++:
if ( value == 5 ) {...}
where
int value( ) { return _value; }
i.e. value is a function member. What was intended is
if ( _value == 5 ) {...}
The DEC compiler flags this as an error. However "value" is a valid function pointer and therefore this should not generate an error.
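The two workarounds for class constants described above can be combined in one class, as in the following sketch. The class DchLayout, its members and the numerical values are hypothetical placeholders chosen only to make the example complete.

```cpp
#include <assert.h>

// Hypothetical layout description, used only for illustration.
class DchLayout
{
public:
    // Enum constant: usable as an array bound inside the class.
    enum { numberOfLayers = 15 };

    // Static constant: declared here, defined once outside the class.
    static const int cellsPerLayer;

    DchLayout( );
    int  hitsInLayer( int layer ) const;
    void addHit( int layer );

private:
    int _hitsInLayer[numberOfLayers];
};

// This definition would normally live in DchLayout.cc.
const int DchLayout::cellsPerLayer = 96;  // Arbitrary placeholder value

DchLayout::DchLayout( )
{
    for ( int layer = 0; layer < numberOfLayers; layer++ ) {
        _hitsInLayer[layer] = 0;
    }
}

int
DchLayout::hitsInLayer( int layer ) const
{
    assert ( (layer >= 0) && (layer < numberOfLayers) );
    return _hitsInLayer[layer];
}

void
DchLayout::addHit( int layer )
{
    assert ( (layer >= 0) && (layer < numberOfLayers) );
    _hitsInLayer[layer]++;
}
```

The enum form is needed wherever the constant must be known at compile time inside the class itself, for example as an array dimension.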
When using C++, the preferred form for comments is:
// This is a one-line comment.
i.e. use the "//" comment format. If the comment extends over multiple lines, each line must have // at the beginning:
// This is a long and boring comment.
// I need to put // at the start of each line.
// Note that the comment starts at the // and
// extends to the end of line. These comments
// can therefore appear on the same line as code,
// following on from the code.
Unfortunately this construct is not part of ANSI C, where the form of a comment is
/* This is a C comment */
In order to maintain a uniform appearance, we recommend the following form for C comments:
/*
// This is a long and boring C comment.
// In order to make it look like a C++ comment I will
// add // at the start of each line.
*/
The code templates are located in the following directory:
$BFROOT/doc/Programming/Templates/C++
The approach taken in developing these templates is to make them more complete than you might need, with the intention being that you will remove redundant sections as appropriate. However, their completeness is designed to act as an aid to remembering
what sections should be present and in which sequence.
Sample programs that illustrate the recommended coding style are located in:
$BFROOT/doc/Programming/???? (C++)
[These sample programs do not yet exist.]
Differences between the BABAR Guidelines and the Primary Guideline References
C++
Although most of the BABAR C++ recommendations are taken directly from the primary Ellemtel reference document [1], there are some minor differences. These are listed below with
the Ellemtel Rule or Recommendation number in parentheses:
- All Ellemtel rules are initially to be treated as recommendations until the BaBar policy is finalized.
- Inline functions may be directly defined in the .hh file. If a separate file is created for their definition (this is the recommended practice), it should have the file name extension ".icc". (Rule 3).
- Generally each class should consist of a single implementation (.cc) file, a single interface (.hh) file and an optional inline definition file (.icc). In exceptional circumstances a very complex class can be split into multiple implementation files,
although the preferred approach is to introduce an abstract parent class that implements part of the required functionality (Rec. 4).
- A single-level directory specifier should appear in include directives (Rule 10). This specifier identifies the particular BABAR package:
#include "TRACK/filename.hh"
- A leading underscore ("_") character may be used when naming data members in order to differentiate them from any accessor functions (Rule 16).
- Two styles of function declarations are equally acceptable (Rec. 21).
- Two styles of brace ("{}") location are equally acceptable (Rec. 24).
- [etc.]
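The following single-file sketch shows how several of the BABAR conventions listed above might fit together: the include guard, the leading underscore on data members, the accessor naming and the inline definitions that would normally be placed in a separate .icc file. The class name TrkHitCounter and its "Trk" prefix are hypothetical.

```cpp
#ifndef TRKHITCOUNTER_HH
#define TRKHITCOUNTER_HH

#include <assert.h>

// Interface comment: counts the hits assigned to one track.
// (TrkHitCounter and the "Trk" prefix are hypothetical.)
class TrkHitCounter
{
public:
    TrkHitCounter( ) : _hitCount( 0 ) { }

    // Read and write access use distinct names, not overloads.
    int  hitCount( ) const;
    void setHitCount( const int value );

private:
    int _hitCount;  // Leading underscore avoids a clash with hitCount( )
};

// The definitions below would normally live in TrkHitCounter.icc, which
// the header would include just before the closing #endif.
inline int
TrkHitCounter::hitCount( ) const
{
    return _hitCount;
}

inline void
TrkHitCounter::setHitCount( const int value )
{
    assert ( value >= 0 );  // A hit count cannot be negative
    _hitCount = value;
}

#endif
```

Keeping the inline definitions out of the class body leaves the interface readable even when the number of accessors grows.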
Guidelines and Tutorials
- Ellemtel Telecommunication Systems Laboratories, Programming in C++: Rules and Recommendations
- Stanley B. Lippman, C++ Primer, (2nd edition), Addison-Wesley, 1991, ISBN 0-
- John J. Barton, Lee R. Nackman, Scientific and Engineering C++, Addison-Wesley, 1994, ISBN 0-201-53393-6
Reference Books
- Margaret A. Ellis, Bjarne Stroustrup, The Annotated C++ Reference Manual, Addison-Wesley, 1990 (1992 corrections), ISBN 0-201-51459-1
Object Oriented Methodology
- Robert Martin, Designing OO C++ Applications using the Booch Method, Prentice Hall, 1995, ISBN 0-13-203837-4
- Erich Gamma, et al., Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley, 1995, ISBN 0-201-63361-2
- Iseult White, Using the Booch method: a Rational approach, Benjamin Cummings, 1994, ISBN 0-8053-0614-5
- Grady Booch, Object-oriented Analysis and Design with Applications (2nd edition), Benjamin Cummings, 1994, ISBN 0-8053-5340-2