Connie Logg and Les Cottrell, SLAC, May 2004.
The
principle behind this algorithm is evolved from work by NLANR [1]. It involves
buffering the time sequence ABwE data into two buffers: a history buffer (h) for base-lining, and when a datum
meets specific requirements, into a trigger buffer (t). Each buffer has a maximum number of entries parameter l and t respectively. t determines how long the
bandwidth change must exist before an analysis of its data is performed to see
if we have encountered an alert condition or event. Note that if the ABwE data
is taken once a minute, t is roughly the number of minutes (assuming no data is lost) that a
drop must exist before an event may be deemed to have occurred. The history
buffer is initialized at start up with l data points
Once h is
initialized, we enter the data processing loop. The mean (mh) and standard deviation (oh) of h are
then calculated.
Two other critical parameters besides the buffer
lengths are used in the analysis and event detection:
·
The sensitivity (b) is the number of standard deviations (oh) beyond mh that a datum must lie to
be considered a trigger value. The default at this time is 2, however we are
evaluating how to dynamically set this as a function of mh and oh.
·
Threshold (d) – is the difference between the buffer means mh and mt, in units of standard deviations, that must be
exceeded for an event to be detected. Once we are in an event detected state,
this threshold must again be met before another event is detected. We are in an
event detected state when an event has been detected and we have only seen
trigger data.
Currently b and d are set manually. Reasonable
values for b are in the range 1.5 to 3, and by
default it is set to 2. By default we set d
to 40%. Note although the actual algorithm
is symmetric for bandwidth rises as well as drops, for simplicity we only
consider drops in the following description.
Setting
the buffer lengths will probably depend on user requirements. For example, how long must a change be
sustained (t) before it is considered significant depends on how long the user
wants a drop to be sustained before she / he is notified.
Pseudo perl [2] code[1] for the detection algorithm
External parameters:
b = sensitivity (default = 2);
d = threshold (default 20%)
l = history buffer length (default = 600)
t = trigger buffer length (default = 60)
Code variables:
@y, y = list of & current bandwidth estimates
mh, oh = history buffer mean & standard deviation
mt, ot = trigger buffer mean & standard deviation
me, oe = event buffer mean & standard deviation
@h history buffer, current length = scalar(@h)
@t trigger buffer, current length = scalar(@t)
mt = me = ot = oe = 0;
foreach y (@y) {
if (y > (mh - b * oh)) {#then NOT a trigger
a=0;
if(scalar(@h)>l){a=shift(@h);}#remove oldest
me = 0;
if (y > (mh + 2 * b * oh)) {#outlier?[2]
if(scalar(@t) >0){a = shift(@t); next;}
}
if (a < 0 && abs(y - mh) / mh < 0.2) {y = -y}[3]
push(@h, y); #push y into history buffer
if (y > 0) {
(mh, oh) = calcstats(@h);
#calcstats returns mean & stdev for
#positive non-zero values in array
if (scalar(@t)> 0) {a = shift(@t);}
}
}
else {#trigger data
push (@t, y); #add value to trigger buffer
if (scalar(@t) < t) {next;}#enough triggers?
(mt, ot) = calcstats(@t);#yes, so see if event
if ( (mh - mt) / mh > d) {#event?
unless (me == 0) {#already in event state?
if((me - mt) / me) >= d) {
me = mt; oe = ot;
foreach t (@t) {push (@h,t);}
while (scalar(@h) > l) {a=shift(@h);}
(mh, oh) = calcstats(@h);
@t=(); #empty trigger buffer
}
else {a = shift(@t);}
}
else {#not in event state
me = mt; oe = ot;
foreach t (@t) {push (@h,t);}
while (scalar(@h) > l) {a=shift(@h);}
(mh, oh) = calcstats(@h);
@t = ();#empty trigger buffer
}
}
else {a=shift(@t);} #no event
}
}
Figure 1 illustrates how the algorithm works. It shows the time series of the EWMA smoothed (as a continuous line) and raw (before EWMA smoothing) bandwidth estimates of Cap with an event marked by a circle. Also marked are the current size of the trigger buffer (t), mh and mh - b * oh. In this case the analysis was performed on the raw Cap since the long term effects of smoothing may cause the algorithm to miss short changes in bandwidth. For example for the data in Figure 1, mh - mt) / mh is 20% for the EWMA Cap but 48% for the raw Cap, so with a d of 40% we would miss the event when analyzing the EWMA Cap data.

Figure 1: Bandwidth estimates from SLAC to BINP Novosibirsk, June 2, 2004
[1] In perl: a variable name with an @ prefix is an array; the scalar function applied to an array gives the length of the array; the shift function shifts the first value of the array off and returns it; push treats the array as a stack and pushes the second argument onto the end of the array given in the first argument
[2] This is to guard against large outliers often seen in the ABwE data.
[3] Following the ideas of [8], only data that differs from the mean by > 20% is used in the statistics calculations. This is to prevent long periods of low bandwidth variation (such as may be observed for the smoothed data) from preventing the means and standard deviations tracking the most recent trends.
[1] Automated Event Detection for Active Measurement
Systems, A. J. McGregor and H-W. Braun, Passive and
Active Measurements 2001. http://byerley.cs.waikato.ac.nz/~tonym/papers/event.pdf
[2] Programming Perl, L. Wall, T. Christiansen, J. Orwant, O’Reilly
& Associates Inc., 2002