Guidelines for writing automated regression tests for the BaBar software environment
version: $Id: RegrTest-guidelines.html,v 1.1 1998/08/06 18:16:43 samuel Exp $
Introduction and goals
This document describes guidelines for implementing automated regression testing of components in the BaBar software environment.
The goal of the automated regression testing system is to provide a way of scripting common tasks that may be used to exercise the functionality of various packages in the software environment, and verify that they perform as expected. For the system
to be widely used and successful, it must satisfy these requirements:
- The system should be flexible enough to accommodate any automated testing procedures that do not require direct operator intervention.
- The system should not impose significant overhead on the developer beyond the work necessary to design and implement the tests themselves.
- The system should be integrated with the automated build system, so that regression tests may be run following an automated build, with the results of regression testing presented along with the build report.
- Output files from test scripts must be kept in separate directories from the test source, just as build products are located apart from source files. The system must allow for testing to be performed separately on separate architectures.
Test files and the testit program
Regression tests are implemented via test scripts. Test scripts contain UNIX commands to run to exercise software components, along with the output that is expected from them. Typically package test scripts will invoke executables that are part
of that package, either standard executables or test executables specially created for regression testing.
Test file format
Test files are plain text files containing test commands. Each test file contains one or more test sections. Each test section includes commands to run, any input necessary for these commands, and the expected output.
Each test section in a test script has four required parts, separated by labelled dividers. These parts must follow the order described, and must be separated by dividers as shown in the example below. Parts may be empty (but a test section with an empty
run part will not do anything).
The parts of a test section are these:
- run: Contains the commands to be run. The contents of this section are copied to a temporary file and executed in a shell process. By default, /bin/sh is used. Another shell may be specified on the first line in the usual way for
shell scripts, for instance #!/bin/csh. A new shell process is spawned for each test section, and the entire contents of the run section are executed in that shell process. Note that this means environment variables etc. are local to each test
section.
- stdin: Contains text to be fed into standard input for the test process. Input that would be typed into running programs (e.g. TCL commands for analysis programs) should be placed here. An end-of-file (^D) character is assumed at the end.
- stdout: Contains the expected standard output from the test process, if any.
- stderr: Contains the expected standard error output from the test process, if any.
The stdout and stderr parts should give the expected output exactly, including whitespace.
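The execution model described for the run part (copy the text to a temporary file, then run it in a fresh shell) can be sketched in plain sh. This is an illustration of the described behaviour only, not testit's actual code; the helper name run_section and the variable GREETING are hypothetical, and the sketch always uses /bin/sh rather than honouring a #! line.

```shell
#!/bin/sh
# Sketch only: mimics the described behaviour, not testit's actual code.

# run_section TEXT: hypothetical helper. Writes TEXT to a temporary file
# and executes that file in a fresh /bin/sh process.
run_section() {
    section_file=$(mktemp) || return 1
    printf '%s\n' "$1" > "$section_file"
    /bin/sh "$section_file"
    section_status=$?
    rm -f "$section_file"
    return "$section_status"
}

# Make sure the demonstration variable is not inherited from the caller.
unset GREETING

# First run part: sets and uses a variable.
run_section 'GREETING=hello; export GREETING; echo "first: $GREETING"'

# Second run part: a new shell process, so the variable is no longer set.
run_section 'echo "second: ${GREETING:-unset}"'
```

The second call prints "second: unset", which is why any setup a test section relies on has to be repeated inside that section.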
Text before the run part or after the stderr part is treated as a comment and ignored, except for the special directives described below:
- @timeout interval
Specifies a timeout interval, in seconds, for all subsequent sections. If a section has not completed within this time, it is assumed to have failed, and is killed. The default timeout interval is 60 seconds. To disable the timeout, specify a timeout
interval of 0.
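The timeout behaviour described above (kill a section that has not finished within the interval, with 0 meaning no limit) can be approximated in plain sh with a background watchdog. This is one possible approach under stated assumptions, not how testit itself is implemented; the helper name run_with_timeout is hypothetical.

```shell
#!/bin/sh
# Sketch only: one way to enforce a time limit in plain sh; not testit's code.

# run_with_timeout SECONDS COMMAND [ARGS...]: hypothetical helper.
# A limit of 0 disables the timeout, as described for @timeout.
run_with_timeout() {
    limit=$1
    shift
    if [ "$limit" -eq 0 ]; then
        "$@"
        return $?
    fi
    # Start the command in the background and remember its process id.
    "$@" &
    cmd_pid=$!
    # Watchdog: kill the command if it is still running after the limit.
    ( sleep "$limit"; kill "$cmd_pid" 2>/dev/null ) >/dev/null 2>&1 &
    watchdog_pid=$!
    wait "$cmd_pid"
    cmd_status=$?
    # The command has ended one way or another; stop the watchdog.
    kill "$watchdog_pid" 2>/dev/null
    return "$cmd_status"
}

if run_with_timeout 1 sleep 5; then
    echo "completed"
else
    echo "timed out or failed"
fi
```

A killed command reports a non-zero status, so a timed-out section is indistinguishable here from one that failed on its own; a fuller version would record which case occurred.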
Test files have the extension ".t".
Take a look at an example test file to see how test files are laid out.
The testit program
The testit program is the test file driver (see its man page). It parses test files and runs the tests contained in them. The result is an output file identical to the
input file except that the contents of the "stdout" parts are replaced by the actual standard output of the tests, and the contents of the "stderr" parts are replaced by the actual standard error output of the tests. The output from testit is typically
given the extension ".T".
The calling syntax of testit is
testit inputfile outputfile
Once testit has been run, it is a simple matter to check if the test has succeeded (i.e. whether the actual output of the test program matches the expected output). Simply diff the ".t" file (test file) and the resulting
".T" file.
If changes to the package software cause the expected output of a test script to change, an easy way of endorsing the new output is to rename the script.T file (testit output) to script.t. Of course, the new
output should be carefully validated before doing this.
Integration with the build system
The standard makefiles in the SoftRelTools package support automated regression tests. Test files must be in the form described above and should have the extension ".t". Multiple test sections may be placed in one test script, and multiple
test scripts may be included in each package. Regression tests to be run by SRT are specified by the $(TESTSCRIPTS) variable in the package GNUmakefile.
The phony make target test is used to run tests. This target has bin as a dependency, so that test binaries are built up-to-date before any tests are run. From the main release directory, gmake test will run all tests in all
included packages. This will run testit on the test files, as described above, and then invoke diff to compare the test output to the expected output and determine whether the test passes. Similar to other phony targets, the test target
may be invoked for a single package with
gmake package.test
Unless gmake is invoked with the -k option, it will abort if a test fails (since diff returns non-zero when differences are found).
Output from tests is placed in a separate directory tree, indicated by the gmake variable $(testdir). By default, the output for the tests in a package will be placed in the subdirectory ./test/$BFARCH/package/
underneath the main release directory. These directories are automatically created by the gmake installdirs target along with the tmp, bin, and lib hierarchies.
In addition to the script.T output file from testit for each test file, an additional file will be created with either of the extensions ".fail" or ".pass", depending on whether or not diff finds differences
between the test file and test output. If the test fails, the script.fail file will contain the output from diff. If the test passes, the script.pass file will be created, with a length of zero.
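The pass/fail bookkeeping described above can be sketched as a small sh function. This only mirrors the described convention and is not the actual SoftRelTools makefile rule; the helper name record_result and the file names are hypothetical stand-ins.

```shell
#!/bin/sh
# Sketch only: mirrors the described .pass/.fail convention.

# record_result EXPECTED ACTUAL: hypothetical helper.
# EXPECTED is the test file (script.t); ACTUAL is the testit output (script.T).
# The marker file is written next to ACTUAL.
record_result() {
    expected=$1
    actual=$2
    base=${actual%.T}
    rm -f "$base.pass" "$base.fail"
    if diff "$expected" "$actual" > "$base.fail"; then
        # No differences: replace the empty diff with a zero-length .pass file.
        rm -f "$base.fail"
        : > "$base.pass"
        return 0
    fi
    # Differences found: the .fail file keeps the diff output.
    return 1
}

# Demonstration with stand-in files in a scratch directory.
workdir=$(mktemp -d) || exit 1
printf 'hello\n' > "$workdir/good.t"
printf 'hello\n' > "$workdir/good.T"
printf 'hello\n' > "$workdir/bad.t"
printf 'goodbye\n' > "$workdir/bad.T"

record_result "$workdir/good.t" "$workdir/good.T" && echo "good: pass"
record_result "$workdir/bad.t" "$workdir/bad.T" || echo "bad: fail"
rm -rf "$workdir"
```

Because diff returns non-zero when the files differ, the same exit status that creates the .fail file is what makes gmake stop unless -k is given.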
Regression tests are run with the working directory set to the test directory specified above. All temporary test files should be placed in this directory. Several variables are inherited from the make environment and may be used in test scripts:
- $testdir: the directory in which tests are run (CWD)
- $bindir: the directory containing binaries for this release and architecture; binaries should be run using this variable
- $srcdir: the package source directory; this directory should be treated as read-only in regression tests
Typically, gmake -k test will be run as part of an automated build system. An easy way to find which tests failed in an entire release is by executing
find test/ -name \*.fail -print
from the release directory. If gmake test is invoked again, only tests that previously failed will be re-run (and the script.T file recreated), unless dependencies (binaries or test script source) change.
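Building on the find command above, a short summary of a whole release can be produced by counting the marker files. The sketch below is an illustration only; the helper name count_results and the architecture and package names in the demonstration are made up.

```shell
#!/bin/sh
# Sketch only: summarises the .pass/.fail marker files described above.

# count_results ROOT: hypothetical helper. Counts marker files under ROOT.
count_results() {
    root=$1
    # tr strips the padding that some wc implementations add.
    passed=$(find "$root" -name '*.pass' -print | wc -l | tr -d ' ')
    failed=$(find "$root" -name '*.fail' -print | wc -l | tr -d ' ')
    echo "passed: $passed failed: $failed"
}

# Demonstration on a scratch tree; architecture and package names are made up.
scratch=$(mktemp -d) || exit 1
mkdir -p "$scratch/test/SomeArch/PackageA" "$scratch/test/SomeArch/PackageB"
: > "$scratch/test/SomeArch/PackageA/basic.pass"
: > "$scratch/test/SomeArch/PackageA/io.pass"
echo "stand-in diff output" > "$scratch/test/SomeArch/PackageB/parse.fail"

count_results "$scratch/test"
rm -rf "$scratch"
```

In a real release the function would be called as count_results test from the release directory, after gmake -k test has finished.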
Guidelines for writing regression test scripts
Just as with program source code, it is important to consider legibility and maintainability when designing test scripts. Changes to source code will affect program execution, so regression tests must be maintained along with the programs they test.
- Limit dependencies of test programs on code in other packages. As much as possible, test the functionality in your package without relying on proper functioning of code in other packages. This minimizes the chance of your tests failing when
code elsewhere in the system breaks, and makes it easier to pinpoint the source of a problem.
- Use one regression test script for each area of functionality in your package. This makes it easier to re-run, validate, and maintain tests when you make changes.
- Split scripts into multiple test sections to improve clarity. For instance, placing several commands that write to stdout in one test section makes it difficult to identify which portion of the output comes from which command.
- Within a script, use separate test sections for setup and cleanup. This is especially important in test scripts that may time out. If the timeout interval expires for a test section, the test subprocess is killed and all remaining commands in
that section are not executed. However, if cleanup commands are in a separate section, they will still be executed.
- Make test scripts as self-contained as possible. Avoid writing test scripts that depend on external files. This makes for a single point of maintenance for each regression test. Also, clarity is improved, since someone else can understand the
entire test script by reading a single file. Also don't write test scripts that depend on the results or actions of other test scripts.
- When possible, design test programs to read their input from stdin. This makes it easier to follow the previous guideline. It also makes it easier to perform further diagnostics by running the test program manually.
- Write test output to stdout. This is the complement of the previous guideline. It permits others to view all the results of your regression test by reading the script.T file. For failed tests, details of the problems will be
clearly visible in the script.fail file, which is generated by diff'ing script.T with the original script.t script file.
- If your test executables require additional files to run, construct them in the test script where possible. For instance, for reasonable-size ASCII input files (such as TCL files), create a test section with run command
cat > my_input.txt
and the desired input file in stdin. Place another test section at the end of the script to clean up files you have created.
- If your test program must produce output to a file, add a test section to cat the output file. Add the expected output as stdout for this test section. Do not diff the output file against an expected output file yourself in
the test script -- this produces test results that are difficult to read. If the output is a binary file and not too long, consider dumping it to stdout using a hex dump program. While this is not ideally legible, at least the output is recorded in the
script.T file where an expert can examine it.
- Make sure your test programs do not produce output that contains running context information. Be careful to avoid printing
- times and dates
- user names
- machine names and IP addresses
- fully-qualified paths
- system-assigned resource handles, such as socket descriptors and port numbers
If these kinds of output are unavoidable, use sed to replace them with an invariant text string. For instance, filter all dates and replace them with [-date-].
- Use CodeTemplates/template.t when writing your package test scripts.
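Two of the guidelines above, masking context-dependent output with sed and dumping a binary output file to stdout, can be sketched in plain sh as follows. The helper names mask_dates and hex_dump are hypothetical, and the date pattern assumes dates of the form YYYY-MM-DD; other formats would need their own expressions.

```shell
#!/bin/sh
# Sketch only: the date format and helper names are assumptions.

# mask_dates: hypothetical filter. Replaces dates of the form YYYY-MM-DD
# read from standard input with the invariant string [-date-].
mask_dates() {
    sed 's/[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}/[-date-]/g'
}

# hex_dump FILE: hypothetical helper. Writes FILE to stdout as hex bytes,
# without offsets, so the content can be recorded as expected output.
hex_dump() {
    od -An -tx1 "$1"
}

# A line containing a date becomes reproducible after filtering.
echo "run finished on 1998-08-06 with 0 errors" | mask_dates

# Build a small stand-in binary file, dump it, then clean up.
dump_file=$(mktemp) || exit 1
printf '\001\002\377' > "$dump_file"
hex_dump "$dump_file"
rm -f "$dump_file"
```

The exact spacing of od output differs between systems, so expected output recorded on one architecture may need checking on another.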
Testhist
A utility exists for statistical comparison of histograms, useful for regression testing. It can be used to compare histograms in a file against each other, or to compare generated histograms to reference histograms in another file. See the man page of the program testhist in the package RegrTest.