
Guidelines for writing automated regression tests for the BaBar software environment


version: $Id: RegrTest-guidelines.html,v 1.1 1998/08/06 18:16:43 samuel Exp $



Introduction and goals

This document describes guidelines for implementing automated regression testing of components in the BaBar software environment.

The goal of the automated regression testing system is to provide a way of scripting common tasks that exercise the functionality of packages in the software environment and verify that they perform as expected. For the system to be widely used and successful, it must satisfy these requirements:

  • The system should be flexible enough to accommodate any automated testing procedures that do not require direct operator intervention.
  • The system should not impose significant overhead on the developer beyond what work is necessary to design and implement the tests themselves.
  • The system should be integrated with the automated build system, so that regression tests may be run following an automated build, with the results of regression testing presented along with the build report.
  • Output files from test scripts must be kept in separate directories from the test source, just as build products are located apart from source files. The system must allow for testing to be performed separately on separate architectures.

Test files and the testit program

Regression tests are implemented via test scripts. Test scripts contain UNIX commands to run to exercise software components, along with the output that is expected from them. Typically package test scripts will invoke executables that are part of that package, either standard executables or test executables specially created for regression testing.

Test file format

Test files are plain text files containing test commands. Each test file contains one or more test sections. Each test section includes commands to run, any input necessary for these commands, and the expected output.

Each test section in a test script has four required parts, separated by labelled dividers. These parts must follow the order described, and must be separated by dividers as shown in the example below. Individual parts may be empty (but a test section with an empty run part will not do anything).

The parts of a test section are these:

  • run: Contains the commands to be run. The contents of this section are copied to a temporary file and executed in a shell process. By default, /bin/sh is used. Another shell may be specified on the first line in the usual way for shell scripts, for instance #!/bin/csh. A new shell process is spawned for each test section, and the entire contents of the run section are executed in that shell process. Note that this means environment variables etc. are local to each test section.
  • stdin: Contains text to be fed into standard input for the test process. Input that would be typed into running programs (e.g. TCL commands for analysis programs) should be placed here. An end-of-file (^D) character is assumed at the end.
  • stdout: Contains the expected standard output from the test process, if any.
  • stderr: Contains the expected standard error output from the test process, if any.

The stdout and stderr parts should give the expected test output exactly, including whitespace.
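
The isolation between test sections can be illustrated with plain shell commands. The following is a minimal sketch, not testit itself: it assumes that each run part behaves like a separate /bin/sh invocation, and the variable name MY_SETTING is purely illustrative.

```shell
#!/bin/sh
# Sketch only: mimics two test sections by starting two separate shell
# processes, in the way each run part is given its own shell.
# MY_SETTING is a hypothetical variable name used for illustration.

# Make sure the variable is not inherited from the calling environment.
unset MY_SETTING

# "Section 1": sets and exports a variable, then its shell exits.
section_one() {
    /bin/sh -c 'MY_SETTING=enabled; export MY_SETTING; echo "section 1: $MY_SETTING"'
}

# "Section 2": a fresh shell process that never saw the assignment.
section_two() {
    /bin/sh -c 'echo "section 2: ${MY_SETTING:-unset}"'
}

section_one
section_two
```

The second call reports the variable as unset, which is why any setup a section needs (environment variables, working files) has to be repeated or written to disk rather than carried over in the shell environment.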

Text before run or after stderr is assumed to be comments and ignored, except for special directives described below:

  • @timeout interval
    Specifies a timeout interval, in seconds, for all subsequent sections. If a section has not completed within this time, it is assumed to have failed, and is killed. The default timeout interval is 60 seconds. To disable the timeout, specify a timeout interval of 0.
Test files have the extension ".t".

Take a look at an example test file to see how test files are laid out.

The testit program

The testit program is the test file driver. See the man page. It parses test files and runs the tests contained in them. The result is an output file identical to the input file except that the contents of "stdout" parts are replaced by the actual standard output of the tests, and contents of the "stderr" parts are replaced by the actual standard error output of the tests. The output from testit is typically given the extension ".T".

The calling syntax of testit is

testit inputfile outputfile

Once testit has been run, it is a simple matter to check if the test has succeeded (i.e. whether the actual output of the test program matches the expected output). Simply diff the ".t" file (test file) and the resulting ".T" file.
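
The run-then-diff step can be wrapped in a small shell function. This is a minimal sketch under stated assumptions: the function name run_test and the TESTIT override variable are hypothetical and not part of testit; only the calling syntax "testit inputfile outputfile" is taken from this document.

```shell
#!/bin/sh
# Sketch only: run testit on a test file and compare the result with the
# original. TESTIT is a hypothetical override for the location of the
# driver; it is not part of the documented interface.
TESTIT=${TESTIT:-testit}

# Usage: run_test script.t
# Writes script.T next to the test file. Returns 0 when the actual output
# matches the expected output, and non-zero otherwise.
run_test() {
    t_file=$1
    T_file=${t_file%.t}.T          # script.t -> script.T

    # Stop early if the driver itself could not be run.
    "$TESTIT" "$t_file" "$T_file" || return 2

    # diff prints the differences and returns non-zero if there are any.
    diff "$t_file" "$T_file"
}
```

A call such as run_test mytest.t (a hypothetical file name) prints nothing when the test passes and prints the differing lines when it fails.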

If changes to the package software cause the expected output of a test script to change, an easy way of endorsing the new output is to rename the script.T file (testit output) to script.t. Of course, the new output should be carefully validated before doing this.

Integration with the build system

The standard makefiles in the SoftRelTools package support automated regression tests. Test files must be in the form described above and should have the extension ".t". Multiple test sections may be placed in one test script, and multiple test scripts may be included in each package. Regression tests to be run by SRT are specified by the $(TESTSCRIPTS) variable in the package GNUmakefile.

The phony make target test is used to run tests. This target has bin as a dependency, so that test binaries are built up-to-date before any tests are run. From the main release directory, gmake test will run all tests in all included packages. This will run testit on the test files, as described above, and then invoke diff to compare the test output to the expected output and determine whether the test passes. Similar to other phony targets, the test target may be invoked for a single package with


gmake package.test

Unless gmake is invoked with the -k option, it will abort if a test fails (since diff returns non-zero when differences are found).

Output from tests is placed in a separate directory tree, indicated by the gmake variable $(testdir). By default, the output for the tests in a package will be placed in the subdirectory ./test/$BFARCH/package/ underneath the main release directory. These directories are automatically created by the gmake installdirs target along with the tmp, bin, and lib hierarchies.

In addition to the script.T output file from testit for each test file, an additional file will be created with either of the extensions ".fail" or ".pass", depending on whether or not diff finds differences between the test file and test output. If the test fails, the script.fail file will contain the output from diff. If the test passes, the script.pass file will be created, with a length of zero.
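
The pass/fail bookkeeping can be pictured with the following shell function. It is only an illustration of the behaviour described above, not the actual SoftRelTools makefile rule; the function name mark_result is hypothetical.

```shell
#!/bin/sh
# Sketch only: reproduces the pass/fail bookkeeping for a single test.
# This is an illustration, not the rule used by the standard makefiles.

# Usage: mark_result script.t script.T
# Creates script.pass (zero length) or script.fail (diff output) in the
# directory that holds script.T.
mark_result() {
    expected=$1
    actual=$2
    base=${actual%.T}

    # Remove markers left over from an earlier run.
    rm -f "$base.pass" "$base.fail"

    if diff "$expected" "$actual" > "$base.fail"; then
        rm -f "$base.fail"
        : > "$base.pass"           # zero-length marker file
        return 0
    fi
    return 1                       # $base.fail now holds the differences
}
```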

Regression tests are run with the working directory set to the test directory specified above. All temporary test files should be placed in this directory. Several variables are inherited from the make environment and may be used in test scripts:

  • $testdir: the directory in which tests are run (CWD)
  • $bindir: the directory containing binaries for this release and architecture; binaries should be run using this variable
  • $srcdir: the package source directory; this directory should be treated as read-only in regression tests
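
As a minimal sketch of how the commands in a run part might use these variables, the helper below always resolves a binary through $bindir rather than through the search path. The function name run_test_binary and the names in the commented example (myTestApp, reference.dat) are hypothetical.

```shell
#!/bin/sh
# Sketch only: run a test binary through $bindir, as recommended above.
# run_test_binary, myTestApp and reference.dat are hypothetical names.

# Usage: run_test_binary name [args...]
# The ${bindir:?...} expansion aborts with a message if bindir is unset,
# which guards against running outside the make environment.
run_test_binary() {
    name=$1
    shift
    "${bindir:?bindir is not set}/$name" "$@"
}

# Example call (commented out because the names are illustrative):
# read fixed input from $srcdir, write scratch files only under $testdir.
# run_test_binary myTestApp "$srcdir/reference.dat" > "$testdir/myTestApp.out"
```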

Typically, gmake -k test will be run as part of an automated build system. An easy way to find which tests failed in an entire release is by executing


find test/ -name \*.fail -print

from the release directory. If gmake test is invoked again, only tests that previously failed will be re-run (and the script.T file recreated), unless dependencies (binaries or test script source) change.
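
The same find command can be extended into a short summary for a build report. This is a sketch only; the function name summarize_tests and the output wording are assumptions, and the only facts taken from this document are the ".pass" and ".fail" extensions and the test/ directory tree.

```shell
#!/bin/sh
# Sketch only: summarise regression test results under an output tree.

# Usage: summarize_tests [dir]     (dir defaults to ./test)
# Prints the pass and fail counts, then the path of every .fail file.
# Returns 0 only if no test failed.
summarize_tests() {
    dir=${1:-test}

    # tr strips the padding that some wc implementations add.
    passed=$(find "$dir" -name '*.pass' -print | wc -l | tr -d ' ')
    failed=$(find "$dir" -name '*.fail' -print | wc -l | tr -d ' ')

    echo "passed: $passed failed: $failed"
    find "$dir" -name '*.fail' -print

    [ "$failed" -eq 0 ]
}
```

Counting marker files assumes one test script per file name and no newlines in paths, which holds for ordinary package test directories.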

Guidelines for writing regression test scripts

Just as with program source code, it is important to consider legibility and maintainability when designing test scripts. Changes to source code will affect program execution, so regression tests must be maintained along with the programs they test.
  • Limit dependencies of test programs on code in other packages. As much as possible, test the functionality in your package without relying on proper functioning of code in other packages. This minimizes the chance of your tests failing when code elsewhere in the system breaks, and makes it easier to pinpoint the source of a problem.
  • Use one regression test script for each area of functionality in your package. This makes it easier to re-run, validate, and maintain tests when you make changes.
  • Split scripts into multiple test sections to improve clarity. For instance, placing several commands that write to stdout in one test section makes it difficult to identify which portion of the output comes from which command.
  • Within a script, use separate test sections for setup and cleanup. This is especially important in test scripts that may time out. If the timeout interval expires for a test section, the test subprocess is killed and all remaining commands in that section are not executed. However, if cleanup commands are in a separate section, they will still be executed.
  • Make test scripts as self-contained as possible. Avoid writing test scripts that depend on external files. This makes for a single point of maintenance for each regression test. Also, clarity is improved, since someone else can understand the entire test script by reading a single file. Also don't write test scripts that depend on the results or actions of other test scripts.
  • When possible, design test programs to read their input from stdin. This makes it easier to follow the previous guideline. It also makes it easier to perform further diagnostics by running the test program manually.
  • Write test output to stdout. This is the complement of the previous guideline. This permits others to view all the results of your regression test by reading the script.T file. For failed tests, details of the problems will be clearly visible in the script.fail file, which is generated by diff'ing script.T with the original script.t script file.
  • If your test executables require additional files to run, construct them in the test script where possible. For instance, for reasonable-size ASCII input files (such as TCL files), create a test section with run command
    cat > my_input.txt
    and the desired input file in stdin. Place another test section at the end of the script to clean up files you have created.
  • If your test program must produce output to a file, add a test section to cat the output file. Add the expected output as stdout for this test section. Do not diff the output file against an expected output file yourself in the test script -- this produces test results that are difficult to read. If the output is a binary file and not too long, consider dumping it to stdout using a hex dump program. While this is not ideally legible, at least the output is recorded in the script.T file where an expert can examine it.
  • Make sure your test programs do not produce output that contains running context information. Be careful to avoid printing
    • times and dates
    • user names
    • machine names and IP addresses
    • fully-qualified paths
    • system-assigned resource handles, such as socket descriptors and port numbers
    If these kinds of output are unavoidable, use sed to replace them with an invariant text string. For instance, filter all dates and replace them with [-date-].
  • Use CodeTemplates/template.t when writing your package test scripts.
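
The guidelines on filtering running context information and on dumping binary files can be sketched as two small shell helpers. The date and time patterns (YYYY/MM/DD and HH:MM:SS), the [-time-] marker, and the function names are assumptions made for illustration; only the [-date-] marker comes from the guideline above.

```shell
#!/bin/sh
# Sketch only: helpers for keeping test output reproducible and readable.
# The date and time patterns are assumptions about what a program prints
# (YYYY/MM/DD and HH:MM:SS); adapt them to the real output format.

# Reads stdin and writes stdout with dates and times replaced by
# invariant markers, so that the output is identical on every run.
filter_context() {
    sed -e 's|[0-9]\{4\}/[0-9]\{2\}/[0-9]\{2\}|[-date-]|g' \
        -e 's|[0-9]\{2\}:[0-9]\{2\}:[0-9]\{2\}|[-time-]|g'
}

# Usage: dump_binary file
# Writes a hex dump of a small binary file to stdout using POSIX od,
# without the address column so the dump does not depend on offsets.
dump_binary() {
    od -A n -t x1 "$1"
}
```

In a run part, a program's output would be piped through filter_context before it reaches stdout, and a binary output file would be passed to dump_binary in a later test section.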

Testhist

A utility exists for statistical comparison of histograms, useful for regression testing. It can be used to compare histograms in a file against each other, or to compare generated histograms to reference histograms in another file. See the man page of the program testhist in the package RegrTest.