Using Purify with BaBar software
This page includes some basic information about using Rational Software
Corporation's run-time error checking tool "Purify" with BaBar software. Most
of my experience has been in using it with Elf, but most of what I say below
should be true for any BaBar application.
This page is under development. In addition to my own experience, it
contains information and text from Jacek Becla and Ed Frank, amongst others.
The errors and inaccuracies are entirely mine.
What is Purify?
Purify is a run-time error detection tool. To quote the user's guide:
"Purify checks every memory access operation, pinpointing where
errors occur and providing detailed diagnostic information to help
you analyze why the errors occur. Among the many errors that
Purify helps you locate and understand are:
- Reading or writing beyond the bounds of an array
- Using uninitialized memory
- Reading or writing freed memory
- Reading or writing beyond the stack pointer
- Reading or writing through null pointers
- Leaking memory and file descriptors"
You get the picture.
When you run a program which has been instrumented with Purify, a browser
window will appear, containing reports of errors as Purify finds them. You
can use the browser to study errors there and then, you can print off a summary,
or you can save the reports to a file and study them with the browser later.
Where is it and who can use it?
Purify can be found at /afs/slac.stanford.edu/g/babar/package/purify/purify
BaBar have 4 "simple" licenses. This allows up to 4 people to use Purify
concurrently on a single host. For SLAC, this host is the logical host
shire. Any BaBar user logged on to shire may use
Purify, subject to the 4-user limit (at present the licensing software tracks
usage, but seems neither to enforce the limit on the number of users nor
to restrict usage to shire).
Linking with Purify
You should first add Purify to your path, e.g. for tcsh:
set path=($path /afs/slac.stanford.edu/g/babar/package/purify)
rehash
The first stage is then to link the program using Purify. Purify will
instrument your object code and libraries by adding
checking instructions before every memory operation. There is no
need to recompile.
The most convenient
way of linking is probably to define an alias, e.g. (for csh-like shells):
alias gmp "gmake 'CXX=purify -cache_dir=X CC' 'CC=purify -cache_dir=X CC'"
where "X" is the cache directory where you want Purify to cache its
instrumented versions of libraries and object files (it does this to speed
future linking, but it does use a lot of space).
You then use this alias in place of "gmake" to build an instrumented copy of
your executable:
gmp Package.bin
Note that there is no need to recompile using Purify.
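The alias above is csh syntax. For Bourne-style shells (sh, bash, ksh), a
shell function can play the same role. This is only a sketch: the variable
name PURIFY_CACHE and its default directory are assumptions made for
illustration, standing in for the "X" above, and the compiler settings are
copied unchanged from the alias.

```sh
# Hypothetical cache location; point this at the directory where Purify
# should cache its instrumented libraries and object files.
PURIFY_CACHE=${PURIFY_CACHE:-$HOME/purify_cache}

# Bourne-shell counterpart of the "gmp" alias: run gmake with the compiler
# commands wrapped by purify, passing any further arguments straight through.
gmp() {
    gmake "CXX=purify -cache_dir=$PURIFY_CACHE CC" \
          "CC=purify -cache_dir=$PURIFY_CACHE CC" \
          "$@"
}
```

It is then used in the same way as the alias, e.g. "gmp Package.bin".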
Warning: linking with Purify uses a lot
of memory. Until recently you were forced to link interactively on shire
(out-of-hours unless it is a real emergency!). Even so, for many BaBar
applications the build would exceed the 2 GB datasize limit
(!!), and so would fail unless you used the command "unlimit
datasize" first (though in some cases you can also remove unnecessary
libraries from the link if - unlike me - you know what you are doing).
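"unlimit datasize" is a csh/tcsh command. In Bourne-style shells the
corresponding builtin is "ulimit". A minimal sketch, assuming a shell such
as bash in which "ulimit -d" controls the data segment size (the fallback
message is purely illustrative):

```sh
# Ask for an unlimited data segment; this fails if the hard limit is lower.
ulimit -d unlimited 2>/dev/null || echo "could not raise the datasize limit"

# Show the limit now in effect (a size in kilobytes, or "unlimited").
ulimit -d
```

The limit applies to the current shell and anything started from it, so it
needs to be set in the same session as the link.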
Currently building in bldrecoq seems to be working, which makes using
Purify quite a bit easier. Make sure that you place your cache directory
somewhere which will be visible to any of the build machines.
Running with Purify
Having linked your executable using Purify, you just run it
as normal (except that you must run it on shire!).
If you are running interactively, a browser window will appear when you
start the program. Error reports appear in the browser as they are detected
during program execution. The browser will remain active after the program
terminates, until you close it yourself. You may also save the error reports
to a file, and use the browser to study them at a later time.
The browser is pretty straightforward to use. For each error detected it
prints a summary line, telling you what type the error was and how many times
it occurred. You can select a message and expand it so that it shows the
sequence of calls leading to the error, the relevant lines of source code
for each function in the calling sequence, and even pull up the source file
in an editor window.
I'll put some figures in here when I get the chance
Alternatively, you can suppress the browser and make Purify write the
error reports directly into a file, which you can then examine later using the
browser. To do this, you should set the environment variable PURIFYOPTIONS to
include the string "-view-file=file.name", e.g.
setenv PURIFYOPTIONS "-view-file=/tmp/atw_elf_8.2.5.pv $PURIFYOPTIONS"
This would be used if you were running your purified application in batch
(if a suitable queue is available on shire01), or if you find it simply a more
convenient way of working.
Note that tcsh will refuse the setenv line above with an "Undefined
variable" error if PURIFYOPTIONS is not already set. In that case, leave off
the trailing $PURIFYOPTIONS.
To view a previously-stored file using the Purify browser, you type the
command:
purify -view file.name
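The setenv line shown earlier is tcsh syntax. For Bourne-style shells, the
same setting might be sketched as follows. The file name is hypothetical,
and only the "-view-file" option quoted above is used.

```sh
# Hypothetical view-file path; choose one that suits your job.
VIEW_FILE=/tmp/myjob.pv

# Prepend the option, keeping any options that are already set.
# ${PURIFYOPTIONS:-} expands to nothing when the variable is unset.
PURIFYOPTIONS="-view-file=$VIEW_FILE ${PURIFYOPTIONS:-}"
export PURIFYOPTIONS

# Show what the instrumented program will see.
echo "$PURIFYOPTIONS"
```

After the job has run, the stored file is opened with "purify -view" as
described above.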
Note that, at least for the BaBar applications I've looked
at so far, the view files can be quite large (the ones I currently have range
from 0.6 to 2.2 MB). The fewer problems, the smaller these will be.
How do I interpret the output?
What Purify reports depends on the nature of the problem. As a "typical"
example, it would report for a memory access error (e.g. reading from memory
which has already been freed, or FMR in Purify nomenclature) both where
the read occurred and where the memory in question was freed. For both, it
lists a sequence of calls leading to the error (with the routine directly
causing the error listed at the top, and those calling it listed below). In
most cases it can show you the line in the source code where the error
occurred, and will allow you to pop up an editor window containing the source
file so that you can investigate the context further. In some cases this is
enough; in others the "traceback" does not go far enough to be as helpful as
you might like.
The traceback listings also show the calling arguments for the functions,
at least in some cases,
which can be helpful in understanding what has happened.
Memory leak reports are less helpful, in that Purify only reports where the
leaked memory was allocated (line in source, not time sequence in execution),
and not
where the pointers to that memory were lost. It also lists memory leaks by
total size (size per occurrence * number of occurrences in complete job), so you
have to work out for yourself where in the job they occurred (job start, job
end, once per event, just on a subset of events, or whatever). See "Hints and
Tricks" below.
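One rough way to guess where in the job a leak occurs is to compare the
number of occurrences with the number of events processed. The sketch below
only does that arithmetic. The function name and the numbers are invented
for illustration and are not taken from a real Purify report; you would read
the total size and the number of occurrences off the report yourself.

```sh
# Usage: leak_pattern TOTAL_BYTES OCCURRENCES EVENTS
# Prints the size of each occurrence and, when the occurrence count is a
# whole multiple of the event count, the number of occurrences per event.
leak_pattern() {
    total_bytes=$1
    occurrences=$2
    events=$3
    if [ "$occurrences" -le 0 ] || [ "$events" -le 0 ]; then
        echo "occurrences and events must be positive"
        return 1
    fi
    echo "bytes per occurrence: $((total_bytes / occurrences))"
    if [ $((occurrences % events)) -eq 0 ]; then
        echo "occurrences per event: $((occurrences / events))"
    else
        echo "occurrences are not a whole multiple of the event count"
    fi
}

# Invented example: 48000 bytes leaked in 1000 occurrences over 500 events.
leak_pattern 48000 1000 500
```

For these made-up numbers the result is 48 bytes per occurrence and 2
occurrences per event, which would suggest a leak inside the event loop. A
single occurrence would point instead to job start or job end. This is only
a hint, since a leak on a subset of events can give any count.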
As you might expect, some error reports are much easier to make use of than
others. My experience is that even a novice, such as myself, can often
spot exactly how an Array Bounds Read error occurs in completely unfamiliar
code using this tool. Conversely, for some errors all I can say is that
"Purify reports such-and-such an error on memory allocated at this point in
this routine", and how often it happens. I'm still learning how best to use
the information Purify provides, and will update this page as I learn more.
Things to Watch
- Jobs that have been instrumented with Purify run a lot slower than
normal, and use a lot more memory. Only run small numbers of events,
particularly at first when you don't know how long it will take. Warn Charlie
Young if you need to make a particularly long run.
- If you find that the job seems to "stick", be patient. Purify is doing a
lot of checking "behind the scenes". As an example, instrumented Elf jobs
can appear to "hang" for more than an hour during the initialization of the job. It's
best to have some other work to do....
- Printing a summary of the error report uses a lot of paper, as it prints
all of the errors, but does not (for me) provide enough detail
about individual errors to actually track down the problems. The browser is
more useful.
Hints and Tricks
- Use the menus in the browser to suppress categories of error which you are
not interested in (e.g. because you do not consider them serious, or simply
because you are looking for a particular type of problem). You can suppress
a message just for the current session, or permanently. You can also opt to
see suppressed messages - they are still recorded, just not displayed by
default.
- Purify only checks for memory leaks at the end of the job, unless you
use the browser to tell it to perform a leak test. Hence, if you are interested
in what it says about memory leaks, you may wish to run with the browser active
and explicitly test for leaks at different points in the job (e.g. before
event loop, within event loop, at end of event loop).
- A corollary of this is that if the job does not terminate normally (i.e.
if there is a fatal error), you will get no information about
memory leaks unless you have manually requested it to check for
leaks previously (in which case any leak reports you have obtained prior to
the crash will be available, though you will have no information about leaks
since your last leak check).
- Many errors are "correlated", e.g. Free Memory Read, Free Memory Write
and Freeing Unallocated Memory often have a common cause. Look out for this
(at the least, it will reduce the number of individual reports you submit to
Remedy!).
Page author(s):
Alan Watson
Last significant update: Dec-10-1999
Expiry date: Mar-01-2000