Functional Requirements
The following is a preliminary list of functional requirements for
extending Dhp comparisons to make tests based on fits to histograms:
- We wish to extend Dhp comparisons to be based on a fit parameter
and its error, after a fit has been performed to a live histogram.
- The reference will frequently be a parameter of a fit to a
reference histogram, although it may be a specified function.
- Comparisons should be at the level of individual parameters, so
that responses can be tailored to the deviation observed. One may also
want to have a comparison on a global fit characteristic (CL,
Chi2/dof) to check for a meaningful fit.
- Several comparisons may stem from the same fit, i.e. on several
parameters of the fit. The fit should not be repeated if already
executed.
- Dhp fits should be a general enhancement of Dhp functionality and not
purely of DhpComparisonRecord, i.e. there should be a public interface
directly to the fits, so that they can be performed by user code. This
same interface can then be used by fits configured through DhpToken and
executed through DhpComparisonRecord. In some cases, it is sensible to
have the fits made _before_ the comparisons are established, and make
the comparisons to histograms of the fit results. This is particularly
applicable to iterated fits, e.g. slice-by-slice fits of a 2D
histogram, where it is sensible to store individual fit parameters as
a histogram vs an axis of slice co-ordinate. Comparisons can then be
performed on the histograms of fit results, rather than requiring the
comparisons to establish such complicated fit procedures.
- Specification of the comparisons/fits should be with minimal
overhead in the CRI file, to avoid large-scale rewriting of the
DhpToken parser.
- Would also like to avoid excessive replication of information in
the CRI file concerning specification of closely-related comparisons,
i.e. comparisons of different fit parameters for the same
histogram. DhpComparisonRecordToken specification is already >50% by
volume concerned with the result of the individual comparison, rather
than the (common) source of the comparison.
- Want to offer a range of pre-defined fit functions, where the user
has to do little more than name the fit parameters and give optional
best-guess starting values. A provisional list of necessary functions
is:
  - Gaussian (3 pars)
  - Order-N polynomial, up to a maximum N of 5 or 6 (N+1 pars)
  - Gaussian + order-N polynomial (3+N+1 pars)
  - Double Gaussian (6 pars)
  - Double Gaussian, common mean (5 pars)
  - Exponential (2 pars)
  - Double Exponential (4 pars)
- May want to support 2D fit functions.
- Probably also want to allow user more flexibility in introducing a
user-defined fit function.
- Want to be able to limit fits to certain bin ranges of the input
histograms [in fact, want to limit all Dhp comparisons to
CRI-defined ranges].
- Want to fit slices of 2D histograms as individual 1D histograms,
keeping track of parameters for each slice [this is probably best
implemented "upstream", resulting in a histogram of fit parameter
values vs slice co-ordinates, which can then be subjected to a comparison].
- Parameters should be named, and users should be able to add
optional initial values, bounds and step sizes.
- User should have optional control over convergence limits,
iteration limits, fit minimization strategies etc., but suitable
defaults must be provided.
- Fit parameter values and errors should be reported in the messages
generated by a failed comparison concerning those parameters.
- Ultimately need to record fit parameters to databases, so that
history can be examined (longer-term objective).
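As a rough illustration of the pre-defined function catalogue and the bin-range requirement above, the following Python sketch evaluates a few of the listed functions, reports the parameter counts from the provisional list, and computes a chi-squared restricted to a range of bins. It is not Dhp code: every name in it (gaussian, polynomial, n_params, chi2, the argument layout) is an assumption made for illustration only.

```python
import math

def gaussian(x, p):
    """p = [amplitude, mean, sigma] (3 parameters)."""
    amplitude, mean, sigma = p
    return amplitude * math.exp(-0.5 * ((x - mean) / sigma) ** 2)

def polynomial(x, p):
    """p = [c0, c1, ..., cN] (N+1 parameters), evaluated by Horner's rule."""
    y = 0.0
    for c in reversed(p):
        y = y * x + c
    return y

def gaussian_plus_polynomial(x, p):
    """First 3 parameters belong to the Gaussian, the rest to the polynomial."""
    return gaussian(x, p[:3]) + polynomial(x, p[3:])

def n_params(name, order=0):
    """Parameter count for each function in the provisional list."""
    counts = {
        "gaussian": 3,
        "polynomial": order + 1,
        "gaussian+polynomial": 3 + order + 1,
        "double_gaussian": 6,
        "double_gaussian_common_mean": 5,
        "exponential": 2,
        "double_exponential": 4,
    }
    return counts[name]

def chi2(func, p, centres, contents, errors, first_bin=0, last_bin=None):
    """Chi-squared of func against the bin contents, restricted to the
    inclusive bin range [first_bin, last_bin] (hypothetical convention)."""
    if last_bin is None:
        last_bin = len(centres) - 1
    total = 0.0
    for i in range(first_bin, last_bin + 1):
        if errors[i] > 0.0:  # skip bins with no usable error
            pull = (contents[i] - func(centres[i], p)) / errors[i]
            total += pull * pull
    return total

# Usage: a straight line describes the first three bins exactly; the
# fourth bin is an outlier that the bin range can exclude.
centres = [0.5, 1.5, 2.5, 3.5]
contents = [1.0, 3.0, 5.0, 100.0]
errors = [1.0, 1.0, 1.0, 1.0]
line = [0.0, 2.0]  # c0 + c1*x
chi2_restricted = chi2(polynomial, line, centres, contents, errors, 0, 2)
chi2_full = chi2(polynomial, line, centres, contents, errors)
```

A real implementation would hand a statistic like this to the chosen fit engine for minimization; only the evaluation side is sketched here.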
Implementation points
A few points relating to the potential implementation:
- Fits to be implemented with minimum of re-invention of fit
apparatus ... make maximum re-use of existing fit packages.
- Two contenders for "fit engines" are MINUIT/minuit++ and PrmFit.
- Would be "nice" if we can separate abstraction from implementation,
so that we could do both of the above if we really had to.
- Using MINUIT commands in the CRI file and the MINUIT parser to set
up the fit would help to meet the specification for minimal changes to
CRI file formats, but obviously constrains us to a MINUIT-based
implementation.
- Maximum flexibility in fit function yet simplicity in
specification may require all fits to be based on a Fortran fit
function, where a "library" of standard functions is provided for
pre-defined fits, and the user provides a user-defined function where
required.
- Fit statistic should be pre-defined, with Chi-squared as the default.
- Bit of conflict between DhpComparisonRecord expecting one
comparison per record, and idea of a single fit, from which multiple
comparisons can be performed. Would like to make fit a subcomponent of
DhpCR, and yet one fit could be common to several DhpCRs.
- Useful to distinguish between
  - a test, as the evaluation of a statistic against user-defined
    limits, and the generation of messages as a result, and
  - a comparison, which includes the evaluation of the test
    statistics, with a live histogram compared against a reference,
  such that a comparison may include a fit, and a comparison may
  include several tests, some of which may be based on the fit.
- Fits to references will be implemented through a sub-class of
DhpAbsHistComparator.
- May as well re-fit the references each time, and then the
specification of the fit can be changed without invalidating a stored
reference value.
- Fits may be simplified by providing iterators over bins and slices.
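The points above on separating abstraction from implementation, sharing one fit between several comparisons, and distinguishing tests from comparisons can be sketched as follows. This is a minimal Python illustration under stated assumptions, not a proposed Dhp design: the class names (FitEngine, ConstantFitEngine, CachedFit, ParameterTest, Comparison) are hypothetical, and the toy engine fits only a constant by weighted least squares, standing in for a MINUIT- or PrmFit-backed engine.

```python
import math
from abc import ABC, abstractmethod

class FitEngine(ABC):
    """Abstract engine; a MINUIT or PrmFit wrapper would subclass this."""
    @abstractmethod
    def fit(self, contents, errors):
        """Return ({name: (value, error)}, chi2, ndof)."""

class ConstantFitEngine(FitEngine):
    """Toy engine: closed-form weighted least-squares fit of a constant."""
    def fit(self, contents, errors):
        weights = [1.0 / (e * e) for e in errors]
        wsum = sum(weights)
        value = sum(w * c for w, c in zip(weights, contents)) / wsum
        chi2 = sum(w * (c - value) ** 2 for w, c in zip(weights, contents))
        return {"level": (value, 1.0 / math.sqrt(wsum))}, chi2, len(contents) - 1

class CachedFit:
    """Runs the engine at most once, however many tests use the result."""
    def __init__(self, engine, contents, errors):
        self.engine, self.contents, self.errors = engine, contents, errors
        self.n_calls = 0
        self._result = None
    def result(self):
        if self._result is None:
            self.n_calls += 1
            self._result = self.engine.fit(self.contents, self.errors)
        return self._result

class ParameterTest:
    """Checks one live parameter against the reference, in units of sigma."""
    def __init__(self, name, max_sigma):
        self.name, self.max_sigma = name, max_sigma
    def evaluate(self, live, reference):
        value, err = live.result()[0][self.name]
        ref_value, ref_err = reference.result()[0][self.name]
        deviation = (value - ref_value) / math.sqrt(err * err + ref_err * ref_err)
        message = "%s = %.3f +- %.3f (reference %.3f +- %.3f), deviation %.2f sigma" % (
            self.name, value, err, ref_value, ref_err, deviation)
        return abs(deviation) <= self.max_sigma, message

class Comparison:
    """Aggregates one live fit, one reference fit and several tests."""
    def __init__(self, live, reference, tests):
        self.live, self.reference, self.tests = live, reference, tests
    def run(self):
        return [t.evaluate(self.live, self.reference) for t in self.tests]

# Usage: two tests with different limits share the same pair of fits.
engine = ConstantFitEngine()
live_fit = CachedFit(engine, [10.0, 12.0, 11.0, 13.0], [1.0] * 4)
reference_fit = CachedFit(engine, [10.0, 10.0, 10.0, 10.0], [1.0] * 4)
comparison = Comparison(live_fit, reference_fit,
                        [ParameterTest("level", 3.0), ParameterTest("level", 2.0)])
results = comparison.run()
```

In this arrangement the reference is re-fitted with the same engine as the live histogram, and the message carries the parameter values and errors, in line with the requirement on failed-comparison messages. A test on a global fit characteristic such as Chi2/dof could read the second and third elements of the cached result in the same way.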
Questions
Dunno ...
- How to map named parameters to specific parameters of specific fit
function?
- How to associate tests, fits and comparisons? Probably aggregation
of one fit and several tests within comparison.
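One possible answer to the first question, offered only as a sketch: each fit function publishes an ordered list of slot names, and the user's named parameters are bound to those slots, with defaults for anything left unspecified. The function bind_parameters, the slot names and the specification keys ("slot", "start", "bounds", "step") are all hypothetical, as are the default values.

```python
def bind_parameters(slots, user_spec):
    """Map user-named parameters onto the ordered slots of a fit function.

    slots:     ordered slot names, e.g. ["amplitude", "mean", "sigma"]
    user_spec: {user_name: {"slot": ..., "start": ..., "bounds": ..., "step": ...}}
    Returns one settings dict per slot, in slot order.
    """
    bound = {}
    for user_name, spec in user_spec.items():
        slot = spec["slot"]
        if slot not in slots:
            raise ValueError("unknown slot %r for parameter %r" % (slot, user_name))
        if slot in bound:
            raise ValueError("slot %r bound twice" % slot)
        bound[slot] = {
            "name": user_name,
            "start": spec.get("start", 0.0),  # assumed default start value
            "bounds": spec.get("bounds"),     # None means unbounded
            "step": spec.get("step"),         # None means engine default
        }
    ordered = []
    for slot in slots:
        # Slots the user did not mention keep their own name and defaults.
        default = {"name": slot, "start": 0.0, "bounds": None, "step": None}
        ordered.append(bound.get(slot, default))
    return ordered

# Usage with made-up names and values:
GAUSSIAN_SLOTS = ["amplitude", "mean", "sigma"]
params = bind_parameters(GAUSSIAN_SLOTS, {
    "PeakPosition": {"slot": "mean", "start": 3.1, "bounds": (3.0, 3.2)},
    "PeakWidth": {"slot": "sigma", "start": 0.015},
})
names = [p["name"] for p in params]
```

Tests would then refer to parameters by the user's names, while the fit engine sees them only by position.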
Information prepared by Paul Bright-Thomas
Last modified Tue Jul 27 14:00:00 BST 1999