Valgrind for BaBar

Summary

Valgrind is an open-source tool for finding memory management problems (leaks and corruption) on linux/x86 systems. Full documentation is available on the main valgrind website. It is simple to use and requires no recompilation of your program.

In short, to obtain information on your program all you have to do is run:

valgrind <valgrind-options> <your-executable> <your-exe-arguments>

By default the 'valgrind' output is written to stderr, so on a terminal it will appear mixed with your program output. See 'valgrind -h' for a number of useful options.
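
As a rough illustration, the command line above can be assembled and run from a short Python script. This is only a sketch: the function names and the log file path are hypothetical and not part of any BaBar tool. It merges the job's stdout and stderr into one log file so that the valgrind report and the program output are kept together.

```python
import subprocess


def build_command(valgrind_options, executable, exe_arguments):
    """Return the list form of:
    valgrind <valgrind-options> <your-executable> <your-exe-arguments>
    """
    return ["valgrind"] + list(valgrind_options) + [executable] + list(exe_arguments)


def run_logged(command, log_path):
    """Run the command, writing its stdout and stderr together into log_path.

    Returns the exit code of the command.
    """
    with open(log_path, "w") as log:
        result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode
```

For example, run_logged(build_command(["--num-callers=15", "--error-limit=no"], "<your-executable>", []), "valgrind.log") would leave the whole session in valgrind.log, which is convenient for batch jobs.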

Notes and suggestions on how to use valgrind

  • When you run your application with valgrind you will find that it:
    • takes more memory than it would when run normally
    • runs significantly more slowly
    In practice this means that you should run over a limited number of events (e.g. in a Framework-based application). Fortunately, most problems can be found by running over a small number of events.

  • If you are looking at a crash which happens on the Nth event in some event collection (where N is large), usually you can just skip ahead and start processing just before that event when you run with valgrind.

  • The simplest way to run valgrind just checks for memory corruption or misuse, but not leaks. Useful valgrind options to use in this case have been found to be:

    --num-callers=15 --error-limit=no

    From version 2.2.0 onwards you can also specify the tool to use with something like:
    --tool=memcheck

  • In addition valgrind can be used to look for memory leaks. (This slows it down a bit and requires somewhat more memory.) Useful valgrind options to use in this case have been found to be:

    --leak-check=full --show-reachable=yes --num-callers=15 --error-limit=no

  • When chasing memory leaks, you will have to differentiate between once-per-job leaks and leaks which occur in the event loop. (The latter typically cause more problems.) When running over a small number of events it can be difficult to tell the two apart. Many of us find it useful to run two valgrind jobs when looking at leaks, one on M events and one on 2*M events. Once-per-job leaks are the same size in both reports, while per-event leaks roughly double, so looking at the difference identifies the per-event leaks more readily.

  • When examining a valgrind leak report, it is usually useful to begin by focusing on those flagged as "definitely lost", then look at those flagged as "possibly lost" and then finally at the "still reachable" category. Sometimes things in the later categories are simply knock-on effects of those in the former (i.e. in particular "definitely lost").
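
The comparison of an M-event and a 2*M-event leak report can be partly automated. The following Python sketch is one possible approach, not an official BaBar tool. It assumes that each leak record starts with a line such as "==1234== 40 bytes in 1 blocks are definitely lost in loss record 3 of 10" followed by "at"/"by" stack frame lines; the exact wording may differ between valgrind versions. The function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed header of a leak record, e.g.
# "==1234== 40 bytes in 1 blocks are definitely lost in loss record 3 of 10"
HEADER = re.compile(
    r"^==\d+==\s+([\d,]+)(?: \([^)]*\))? bytes in ([\d,]+) blocks? "
    r"(?:is|are) (definitely lost|possibly lost|still reachable|indirectly lost)"
)
# Assumed stack frame line, e.g.
# "==1234==    by 0x400544: Foo::event() (Foo.cc:10)"
FRAME = re.compile(r"^==\d+==\s+(?:at|by) 0x[0-9A-Fa-f]+: (.*)$")


def parse_leaks(lines):
    """Map (category, call stack) -> total bytes for one valgrind report."""
    leaks = defaultdict(int)
    kind, nbytes, frames = None, 0, []
    for line in list(lines) + [""]:  # the empty sentinel flushes the last record
        header = HEADER.match(line)
        frame = FRAME.match(line)
        if frame and kind is not None:
            frames.append(frame.group(1))  # keep function and location, drop address
            continue
        if kind is not None:  # the current record has ended: store it
            leaks[(kind, tuple(frames))] += nbytes
            kind, frames = None, []
        if header:
            kind = header.group(3)
            nbytes = int(header.group(1).replace(",", ""))
    return dict(leaks)


def growing_leaks(short_run, long_run):
    """Return (extra bytes, record key) for records that grew, largest first."""
    growth = []
    for key, nbytes in long_run.items():
        delta = nbytes - short_run.get(key, 0)
        if delta > 0:
            growth.append((delta, key))
    return sorted(growth, key=lambda item: item[0], reverse=True)
```

Given the two saved reports, growing_leaks(parse_leaks(open("run_M.log")), parse_leaks(open("run_2M.log"))) lists the call stacks whose lost bytes increased with the number of events, which are the candidates for per-event leaks. The log file names here are placeholders.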

Installation and use at SLAC

John Bartelt has kindly installed version 3.1.0 of 'valgrind' in /usr/local/bin on the linux machines, so you should find it automatically in your path.

At SLAC a special batch queue (the "valgrindq") is provided, with a small number of machines set up such that jobs can use up to 3GB of real memory. Please use this queue for running your application with valgrind rather than running it interactively, and run only valgrind jobs in the valgrindq.

Installation and use away from SLAC

Download the valgrind source from the valgrind website and follow the build instructions in the "INSTALL" file included with the source. It is very straightforward and requires nothing particular in terms of other installed software (except for the compiler and other things you would have on any standard linux system).

At SLAC we noticed that it was necessary to compile valgrind explicitly for different RH versions (e.g. RH7.2 and RH9) because they use different glibc versions. It was not possible to run a valgrind binary compiled on RH7.2 on an RH9 machine. (AFAIK, this has not been checked for RHEL3.)

As we note for SLAC above, valgrind is very memory intensive, so you will need to run valgrind on a machine with the appropriate amount of memory for the application you are testing and arrange things such that one user running valgrind on a machine doesn't cause problems for other users on the same machine.

Last modified 27-Jan-2006, Peter.Elmer@cern.ch