
Introduction to C++


This section is intended to teach the user just enough about C++ to be able to make minor changes to existing code.

Choose a text editor

You can edit C++ and Tcl code in any text editor. Popular choices are xemacs and emacs.

emacs "knows" about different types of code, so if you name your file "foo.cc" or "foo.hh" then it will "know" that it is C++, and if you call it "foo.tcl" then it will "know" that it is Tcl. emacs will then adopt a mode that applies colour coding to different parts of the file, a process called "fontifying". The mode also helps you spot errors such as unclosed brackets, and it indents code appropriately (as in Fortran mode). For example, in my emacs editor, the first part of the QuickTour's QExample implementation file looks like this:

// in general, a module constructor should not do much.  The beginJob
// begin(run) members are better places to put initialization

QExample::QExample( const char* const theName, 
			  const char* const theDescription )
  : AppModule( theName, theDescription )

Other text editors you could try include kwrite, nedit, or kate.

C++ Syntax

Basic Syntax

C++ code is case sensitive: two variables named Test and test are distinct. However, using variable or function names that differ only by case is strongly discouraged. Variable and function names may consist of letters (upper and lower case), digits, and the underscore character '_', and must begin with either a letter or an underscore. Most of the C++ language itself consists of lower-case keywords and punctuation.

Comments are indicated by a double slash '//'. All text from the double slash to the end of the line is a comment and is ignored by the compiler. An alternate syntax uses '/*' to begin a comment and '*/' to end a comment. This style of commenting is supported to be compatible with C code. The double slash is a preferred convention in BaBar C++ code.

A declaration is a statement that introduces a name into a program. It consists of four parts, two optional and two mandatory. The general structure is:

[specifier] <base type> <declarator> [initializer];
When the optional initializer is included, the statement both declares the variable and gives it an initial value (what this section calls defining it). With the exception of function definitions, declarations end with a semicolon.
int x;	//declaration 
int y = 5; //declaration and definition
The '= 5' in the last statement is the initializer. Outside a declaration, the equals sign '=' is the assignment operator.

A function declaration has the format:

<return type> ClassName::FunctionName(<type> <name>, <type> <name>, ...);

The type and name pairs in parentheses are the arguments of the function. When a function has more than one argument, they are separated by commas. If a function has no arguments, the parentheses are still included but with nothing in between. When a function does not return any value or object, the return type is void. This declaration syntax may also be referred to as the function prototype.

The definition of a function typically consists of a series of statements. The code that comprises the function's definition is demarcated by curly braces. There is no semicolon at the end of a function's definition. For example:

int ExampleFunction(int i){ 
   int j; //declaration of j
   j = i + 1; //statement one: a calculation using i
   j = j * 2; //statement two: a calculation using j
   return j;
}
ExampleFunction takes one argument, an integer. Within the body of the function an integer named j is declared. Some calculations using the argument integer i and the function's integer j are performed. The final value of j is returned by ExampleFunction.

Typically a statement is one line of code, in either the body of a program or the definition of a function. Statements are a base unit of code and in C++, as in C, they must be terminated by a semicolon. A statement may fulfill one of many roles: declaration, definition, call to a function, allocation of memory, assignment, calculation, and so forth.

The main way to use your new function, assuming ExampleFunction is a member function of a class called ClassName, is to make an object (an "instantiation") of the class:

ClassName myClassObject;
Then call the function through your new instance of the class, using the '.' operator:
int returnedinteger = myClassObject.ExampleFunction(3);
where in this example, the integer 3 is passed to the example function, and the new integer "returnedinteger" is assigned the value the function returns. The '->' operator is used in place of '.' only when you hold a pointer to the object rather than the object itself.

Data Types, Pointers, and Arrays

When a declaration is made, memory is set aside for the declared variable. The compiler needs a type associated with all memory so that it can properly interpret the stored data. The most commonly used built-in types of C++ are: int (integer), double (floating point), bool (boolean), and char (character).

In addition to being one of these types, a variable can be declared as a pointer to a type. A variable declared as a pointer is interpreted as an address in memory. The syntax of this declaration is to append an asterisk '*' to the type. For example:

int* x;
The variable x can have the address of an integer stored in its associated memory. It is important to recognize that this declaration sets aside (allocates) memory only for the pointer itself. Memory for the integer pointed to must be allocated separately, and x must be made to point to it before x is used. To obtain the address of a variable, use the ampersand '&'. To access the value a pointer points to, dereference the pointer with the asterisk '*'. For example:
int z; //declare an int named z
x = &z; //store the address of z in x
*x = 5; //assign 5 to the memory x points to (z is now 5)

int y; //declare an int named y
y = *x; //assign y the value x points to (both ints)

int* w; //declare another pointer to int
w = x; //assign w the value stored in x, an address
A built in data structure of C++ is the array. An array is a block of memory set aside for the contiguous storage of like elements. The declaration of an array must include the type and number of elements to be stored. An array is indicated by appending a set of square brackets '[ ]' to the variable name. For example:
int myarray[10]; //declare array of 10 integers
To define any element of the array, one may use the assignment operator and place the element index between the square brackets. Array indexing begins with zero, 0. Thus, for an array of ten integers valid indices are 0 through 9.
myarray[4] = 6; //assign 6 to the 5th element
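
As a minimal sketch of how '&', '*' and array indexing fit together, consider the two functions below. The names sumArray and pointerDemo, and the values used, are hypothetical and chosen only for illustration.

```cpp
// Hypothetical helper: adds up the first n elements of an integer array.
// 'values' is a pointer to the first element of the array.
int sumArray(const int* values, int n)
{
   int total = 0;
   int i = 0;
   while (i < n) {
      total = total + values[i];   // values[i] means the same as *(values + i)
      i = i + 1;
   }
   return total;
}

// Hypothetical demonstration of '&' (address of) and '*' (dereference).
int pointerDemo()
{
   int z = 3;        // an ordinary integer
   int* x = &z;      // x holds the address of z
   *x = 5;           // write 5 into the memory x points to, so z is now 5

   int myarray[3];   // valid indices are 0, 1 and 2
   myarray[0] = z;   // 5
   myarray[1] = *x;  // 5
   myarray[2] = 6;
   return sumArray(myarray, 3);   // an array name can be passed where a pointer is expected
}
```

Note that the pointer x is only dereferenced after it has been given the address of a real integer; dereferencing an uninitialized pointer is an error.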

Class Syntax

User-defined data types are central to the design of C++. The most common unit of user-defined data in C++, and in BaBar code specifically, is the class.

To create a class a user must define it to the compiler. A class is identified by the name determined in the declaration and consists of data members and member functions (also called methods). The syntax of this declaration is:

class ClassName{
   data members and member functions
};
This defines a new type and is thus referred to as a class definition. However, historically and analogous to general declarations, it is also called a class declaration. To add to the linguistic confusion, the implementation of the class (the code that defines each member function), is also called the class definition. For consistency I will refer to this initial step as the declaration and the implementation as the definition.

The data members and member functions of a class can be placed in one of three categories:

public: accessible to all code
protected: accessible only to members and friends of this class, and to classes (and friends of classes) that inherit from this one
private: accessible only to members and friends of this class

Typically, data members that store the state of the object should be private. This protects the implementation of the class: although users can read the declaration of a class, client code is not allowed to access its private data, so the data is protected from being altered. The functions that make the class a useful entity for applications should be made public.

Data members of a class are declared with the same syntax as variables. Similarly, member functions are declared with the same syntax as general functions. However, outside the class declaration the full name of a member function is ClassName::FunctionName. The '::' symbol is called the scope resolution operator. It indicates that FunctionName is a function of the ClassName class, so multiple classes with the same function names do not give rise to conflicts or ambiguity. When a member function is called on an object, it is accessed with the member access operator '.' (or '->' through a pointer to the object), not with the scope resolution operator. For example:

ClassName example;         //declare an object named example, type ClassName
example.FunctionName();    // call the function FunctionName associated
                           // with ClassName to act on example
Two special member functions are the constructor, which has the same name as the class, and the destructor, whose name is the class name prepended with a tilde '~' character. The constructor is called whenever an object of the class type is created, for example when a variable of that type is declared. Similarly, the destructor is called whenever an object of the class type is destroyed, for example when it goes out of scope.

Within the class declaration, the member functions and data members are themselves only declared. The body of the class declaration is enclosed in curly braces and is terminated with a semicolon. The class definition then provides each member function's definition. The definition of each member function is contained within curly braces and is not terminated with a semicolon.

Here is an example class declaration and class definition.
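
A minimal sketch of a class declaration followed by its class definition is given below. The class Counter and the function useCounter are hypothetical names invented for this illustration; they are not part of any BaBar package.

```cpp
// ---- class declaration (would normally live in a header file) ----
class Counter {
public:
   Counter();          // constructor: same name as the class
   ~Counter();         // destructor: class name with a tilde in front
   void increment();   // member function that changes the state
   int value();        // member function that reads the state
private:
   int _count;         // data member, hidden from client code
};                     // the class declaration ends with a semicolon

// ---- class definition (would normally live in an implementation file) ----
Counter::Counter()
   : _count(0)         // start counting from zero
{
}

Counter::~Counter()
{
}                      // nothing to clean up; no semicolon after the brace

void Counter::increment()
{
   _count = _count + 1;
}

int Counter::value()
{
   return _count;
}

// ---- hypothetical client code ----
int useCounter()
{
   Counter example;          // the constructor runs here
   example.increment();      // member functions are called with '.'
   example.increment();
   return example.value();   // the destructor runs as the function returns
}
```

Client code can call increment() and value(), but a statement such as example._count = 7; would be rejected by the compiler because _count is private.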

Loops and Conditional Statements

C++ offers a set of characters for performing comparison. If the relationship is satisfied then a boolean (bool) true is returned, else a bool false is returned.

== equal to
!= not equal
> greater than
>= greater or equal
< less than
<= less or equal
Some built-in C++ statements take boolean values as their argument. If the conditional argument is true, the rest of the statement is executed; if it is false, it is not. When the body to be executed consists of only one statement, that statement is terminated with a semicolon. When there are multiple statements to be executed, they are enclosed in curly braces. Each statement in the statement body is terminated by a semicolon, whereas the body itself is not (no semicolon after the closing curly brace). Perhaps the most common conditional statements for analysis are the if and the while statements.

If statements are useful for executing a block of code once after a test has been satisfied. For example, analysis should continue only if a particle is within a mass range. The conditional test is performed and, if it evaluates to true, the statement body is executed. In an if/else block, if the condition evaluates to false then the else body of statements is executed instead.
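
The mass-range test described above can be sketched as follows. The function name inMassWindow and its arguments are hypothetical; the units are whatever the caller uses consistently.

```cpp
// Hypothetical selection: returns true only if 'mass' lies between
// 'low' and 'high' (inclusive).
bool inMassWindow(double mass, double low, double high)
{
   bool accepted;
   if (mass < low) {
      accepted = false;                    // below the window
   } else {
      // each branch here is a single statement, so braces are optional
      if (mass > high) accepted = false;   // above the window
      else accepted = true;                // inside the window
   }
   return accepted;
}
```

A caller would then write something like if (inMassWindow(m, 1.80, 1.90)) { ... } around the analysis code that should run only for accepted candidates.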

While statements are used when a block of code should be executed as long as the test/conditional argument is true. If the condition evaluates to true then the statements in the body are executed. Execution then returns to the condition and evaluates it again. This sequence continues until the condition evaluates to false. While loops are very useful for running quick checks and making plots. A common source of bugs is accidentally writing the assignment operator '=' where the 'is equal' comparison '==' was intended, or vice versa.
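
A minimal sketch of a while loop, and of the difference between '==' and '=', is given below. The function names sumUpTo and countMatches are hypothetical.

```cpp
// Adds the integers 1, 2, ..., n using a while loop.
int sumUpTo(int n)
{
   int total = 0;
   int i = 1;
   while (i <= n) {          // the condition is re-evaluated before every pass
      total = total + i;
      i = i + 1;             // without this line the condition would never become false
   }
   return total;
}

// Counts how many of the values 0, 1, 2, 3, 4 are equal to 'target'.
int countMatches(int target)
{
   int matches = 0;
   int i = 0;
   while (i < 5) {
      if (i == target) {     // '==' compares; writing 'i = target' here would
         matches = matches + 1;   // overwrite i instead of testing it
      }
      i = i + 1;
   }
   return matches;
}
```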

General Code Structure

While loops and if statements can be used on their own or as part of a function, and can be nested within themselves and each other. The body of code associated with each statement is marked by curly braces. Each open curly brace must be matched with a closed curly brace. When there is nesting, the entire nested statement must be within the body of the outer statement. In a multiply nested sequence the closed braces are associated in reverse order from the open braces. That is, the first open curly brace is matched by the last closed curly brace.

When a variable is declared, memory is allocated for it (in an area of memory called the stack). The scope of a variable begins at its declaration and persists until the body of code in which it was declared has finished executing.

For example, a variable declared within a member function is allocated at the time of its declaration. When execution of the code moves past the closing curly brace of the function, the variable is said to have gone out of scope. When this happens the memory that was allocated to the variable is no longer reserved. That memory can now be reused by the program, and the variable that has gone out of scope should not be accessed.

It is important to keep track of the body delimiters of these statements for compile purposes, for illuminating the scope of variables, and for making code readable. A standard convention, also used in BaBar code, is the use of indentation when nesting occurs. All code in the body of a statement should be indented systematically. The amount of indentation should correspond to the level of nesting. The curly brace closing the body of a statement should be placed on its own line at the same level of indentation as the opening part of the statement.

Look here for an example.
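
A minimal sketch of nested statements, indented according to the convention above, is given below. The function countPairsWithSum is a hypothetical example; the comments mark which brace closes which statement and where each variable is in scope.

```cpp
// Hypothetical example: counts the pairs (i, j), with i < j < n,
// whose sum equals 'target'.
int countPairsWithSum(int n, int target)
{
   int pairs = 0;                  // in scope until the function's closing brace
   int i = 0;
   while (i < n) {
      int j = i + 1;               // j is declared inside the outer loop body
      while (j < n) {
         if (i + j == target) {
            pairs = pairs + 1;
         }                         // closes the if
         j = j + 1;
      }                            // closes the inner while
      i = i + 1;
   }                               // closes the outer while; j is out of scope from here on
   return pairs;
}
```

For n = 4 and target = 3 the pairs found are (0, 3) and (1, 2).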


Iterators

Sequential storage of data is a common occurrence. Often general packages of code or libraries make available typical structures such as arrays, lists, and vectors. For these containers to be useful, a user must be able to traverse the elements, often in a systematic or all-inclusive manner. At the same time the encapsulation of implementation details must be preserved. The notion of an iterator satisfies both of these requirements.

An iterator is an abstraction of a pointer to an element. It is typically implemented as a class or function associated with a given container class. The iterator points to one element in the sequence and has access to the information needed to move to the next element of the sequence. It also has access to information that will determine the end of the data sequence. Concepts supported by a general iterator are the idea of the current element, next element/incrementation, and equality/comparison.

In BaBar each reconstructed event contains many lists of like data, for example lists of pions, charged tracks, and so forth. Iterators used in conjunction with loops facilitate execution of a segment of analysis code on each element in a list. For example, an iterator is used to access a charged track in an event's list, and then the momentum is plotted in a histogram. This sequence continues to loop until each track of the list has been plotted.
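
A minimal sketch of looping over a list with an iterator is given below. It uses std::vector from the C++ standard library as a stand-in for a list of track momenta; this is an assumption made purely for illustration, since BaBar lists and their iterators are provided by their own classes. The function names sumMomenta and countAbove are hypothetical.

```cpp
#include <vector>

// Adds up every momentum value in the list.
double sumMomenta(const std::vector<double>& momenta)
{
   double total = 0.0;
   std::vector<double>::const_iterator iter = momenta.begin();   // first element
   while (iter != momenta.end()) {   // comparison: stop at the end of the sequence
      total = total + *iter;         // '*iter' is the current element
      ++iter;                        // move to the next element
   }
   return total;
}

// Counts the entries whose momentum is greater than 'threshold'.
int countAbove(const std::vector<double>& momenta, double threshold)
{
   int count = 0;
   std::vector<double>::const_iterator iter = momenta.begin();
   while (iter != momenta.end()) {
      if (*iter > threshold) {
         count = count + 1;
      }
      ++iter;
   }
   return count;
}
```

The loop body never needs to know how the container stores its elements; it relies only on the three iterator concepts named above: current element, incrementation, and comparison.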

C++ Structure

Program Organization

For large programs it is not reasonable for all of the code to exist in one file. This is due to readability, maintenance, and primarily compile time. If all of the code were in one unit, even the smallest change would require re-compilation of all code. To avoid this very costly dependence, code is partitioned into a set of coherent modules. The physical structure, the system of code files, is likely to reflect the logical structure of the program.

The many units of source code in a large program must be mutually consistent. For one, types in declarations must be uniform throughout all units of code. A primary method of accomplishing this is to gather all declarations and interface information into one place, a header file, while placing the definition code into an implementation file.

Header File

Header files contain the declarations that an implementation file wants to make available to other units of code. A header file should typically contain type definitions, function declarations, and name declarations. By BaBar convention, header files have names with the suffix '.hh'.

Units of code, files, access the code declared in a header file by using a preprocessor include command. The syntax is:

#include "<header file name>"

Before code is compiled, the preprocessor replaces each include command with a copy of the named header file. Because header files often include other header files, the same header file can end up being included more than once in a single code file, which would repeat its declarations and cause compile errors. To make sure the contents of a header file are processed only once, the following macro syntax is used.

#ifndef <definevalue>
#define <definevalue>

...header file contents

#endif


The first time the preprocessor sees the header file, definevalue is not yet defined, so the contents are processed and definevalue becomes defined. When the same header file is met again, definevalue is already defined, so everything between the ifndef and the endif is ignored. BaBar convention sets definevalue to the name of the header file in all capital letters, ending in _HH. For example, the QExample.hh file uses QEXAMPLE_HH.
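
The effect of this macro syntax can be sketched in a single file by writing out the contents of a header twice, as the preprocessor would if the header were included twice. The header name DemoPoint.hh, the class DemoPoint, and the function demoPointSum are hypothetical.

```cpp
// ---- first inclusion of the hypothetical DemoPoint.hh ----
#ifndef DEMOPOINT_HH
#define DEMOPOINT_HH

class DemoPoint {
public:
   double x;
   double y;
};

#endif

// ---- second inclusion of the same header contents ----
#ifndef DEMOPOINT_HH   // DEMOPOINT_HH is already defined, so this test fails
#define DEMOPOINT_HH

class DemoPoint {      // skipped by the preprocessor; without the guard this
public:                // would be a "redefinition of class" compile error
   double x;
   double y;
};

#endif

// Code that uses the class sees exactly one declaration of DemoPoint.
double demoPointSum()
{
   DemoPoint p;
   p.x = 1.0;
   p.y = 2.0;
   return p.x + p.y;
}
```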

Implementation File

All of the source code for the implementation and definition of a header file's declarations is placed in an implementation file. Complete function definitions should be placed in the implementation file. By BaBar convention implementation files have names with the suffix '.cc'.

The implementation must have access to the declarations and types that it defines, so it must include its own header file.

Standard Libraries

Standard libraries are included with the C++ language to provide commonly used and needed functions and types. Accessing the code of a standard library is analogous to using user-written source code. Any code making use of a standard library must include it. The include syntax is the same as for including header files except the standard library name is enclosed by angle brackets instead of double quotes.

#include <<library name>>
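
As a minimal sketch, the function below includes the standard library header cmath and uses its square root function, std::sqrt. The function name momentumMagnitude is hypothetical; the 'std::' prefix indicates that sqrt comes from the standard library.

```cpp
#include <cmath>   // standard library header, so angle brackets are used

// Hypothetical helper: magnitude of a three-momentum from its components.
double momentumMagnitude(double px, double py, double pz)
{
   return std::sqrt(px * px + py * py + pz * pz);
}
```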
