Authors: Les Cottrell and Warren Matthews. Created: February 23, 2000; last updated on March 14, 2000
Both the Surveyor and RIPE monitoring projects rely on a dedicated PC running Unix placed at each monitoring site. Each PC in turn relies on a Global Positioning System (GPS) device to obtain accurate time and to keep the monitors synchronized with one another. The monitors send packets to each other at Poisson-randomized time intervals and use these packets to gather one-way end-to-end delay and loss measurements. They also make concurrent traceroutes, which provide route history information. The community for Surveyor is Internet 2, though there are monitors at non-Internet 2 sites, in particular at three High Energy Physics (HEP) sites (CERN, FNAL and SLAC) that are also PingER monitor sites. The community for RIPE is European Internet Service Providers (ISPs), though again there are RIPE machines at CERN and SLAC.
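As an illustration of what sending at Poisson-randomized intervals means, the sketch below draws the gaps between probes from an exponential distribution, which is the defining property of a Poisson process. It is not the code used by Surveyor or RIPE; the function name, the mean interval and the duration are assumptions chosen only for the example.

```python
import random

def poisson_send_times(mean_interval_s, duration_s, seed=None):
    """Return probe send times (in seconds) for a Poisson process.

    Gaps between consecutive probes are drawn from an exponential
    distribution whose mean is mean_interval_s.
    """
    rng = random.Random(seed)
    times = []
    t = rng.expovariate(1.0 / mean_interval_s)
    while t < duration_s:
        times.append(t)
        t += rng.expovariate(1.0 / mean_interval_s)
    return times

# Hypothetical parameters: 0.5 s mean spacing over one minute.
schedule = poisson_send_times(mean_interval_s=0.5, duration_s=60.0, seed=1)
print(len(schedule), "probes scheduled")
```

Because the gaps are random, the probes do not lock on to periodic behavior in the network, which is the usual reason for choosing this kind of sampling.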
More general information comparing Surveyor and RIPE and other active measurement projects can be found in Comparison of some Internet Active End-to-end Performance Measurement projects.
In the figure, the RIPE data has been normalized so that the first peak, around 86 msec., has the same height as the equivalent peak in the Surveyor distribution. In addition, 0.2 msec. has been subtracted from the RIPE delays to improve the agreement; this is discussed in further detail below.
It can be seen that there is reasonably good agreement. Looking in more detail, it is noteworthy that the second peak, at around 95 msec., has more counts for the Surveyor data. There is also more RIPE data at lower delays, in the region below the first peak. These differences are due to the two experiments not both being up during the entire interval and therefore not measuring exactly the same behavior. In particular, the SLAC Surveyor machine didn't take data during the period when the SLAC RIPE machine was reporting one-way delays of less than 90 msec.
We then scatter plotted the two distributions against one another, i.e. each point on the scatter plot is for a given delay, with an x-value equal to the normalized RIPE frequency for that delay and a y-value equal to the Surveyor frequency for the same delay. The plot of this data can be seen to the left. The black line is a linear least squares fit of y to a straight line in x, and the red line is a linear least squares fit to a power series.
It can be seen that there is a strong correlation between the points with R² > 0.9 (i.e. the proportion of variation in the Surveyor distribution attributable to the RIPE distribution is over 90%, see "The New Statistical Analysis of Data", by T. W. Anderson and Jeremy D. Finn, published by Springer, 1996).
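To make the quantity concrete, the following sketch fits a straight line by least squares and reports the coefficient of determination as one minus the ratio of the residual to the total sum of squares. The helper name and the frequencies are invented for illustration and are not the measured RIPE or Surveyor histograms.

```python
def linear_fit_r2(xs, ys):
    """Least-squares fit y = a + b*x; return (a, b, R^2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return a, b, 1.0 - ss_res / ss_tot

# Made-up frequencies per delay bin (not the measured data).
ripe_freq = [2, 10, 40, 90, 35, 12, 4]
surveyor_freq = [3, 12, 38, 88, 40, 10, 5]
a, b, r2 = linear_fit_r2(ripe_freq, surveyor_freq)
print(f"intercept={a:.2f} slope={b:.2f} R^2={r2:.3f}")
```

An R² close to 1 means that almost all of the bin-to-bin variation in one histogram is reproduced by the other.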
Then we plotted the R² of the scatter plot data as a function of the "correction" (drift) made to the Surveyor delay. The plot of this is seen to the right.
It can be seen that the best correlation (highest value of R²) is obtained when the Surveyor delay times are increased by 0.2 msec. Some of this difference may be attributable to the different packet sizes of the probes used: Surveyor uses 40-byte packets whereas RIPE uses 100-byte packets, and the delay will be higher for larger packets. Henk Uijterwaal of RIPE-NCC reports: "I've measured this effect for the path Advanced-RIPE and back last April 1999, and found differences of 0.4..0.6 ms (depending on the time of day and thus network load). One can calculate the effective throughput from these numbers. Back in April, I ended up with something in the 80 to 120 kb/s range, which is consistent with what one sees when transferring large files between the two sites. Of course, the numbers may be different for the SLAC-CERN connection, but delays for the RIPE box should be a little higher than for Surveyor."
The frequency histograms for the RIPE and Surveyor Jan-4 data set are shown to the right, together with the relevant values for the median, average, standard deviation (Stdev), interquartile range (IQR) and the number of samples. The frequency is plotted on a log scale to illustrate the heavy-tailed behavior. The RIPE frequencies have been normalized to have the same number of samples as Surveyor's, so the plots can be overlaid. It can be seen that the outliers in the Surveyor data result in a much higher standard deviation. The peaks have similar widths as measured by the IQR, but the medians (and averages) for the RIPE and Surveyor measurements differ by 0.1 to 0.26 msec.
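The contrast between the standard deviation and the IQR can be reproduced with a few lines of Python. This is only a sketch: the delay values are invented, and the quartiles come from the standard library's default (exclusive) method, which may differ slightly from whatever tool produced the figures quoted here.

```python
import statistics

def summarize(delays_ms):
    """Median, average, standard deviation, IQR and sample count."""
    q1, _, q3 = statistics.quantiles(delays_ms, n=4)
    return {
        "median": statistics.median(delays_ms),
        "average": statistics.mean(delays_ms),
        "stdev": statistics.stdev(delays_ms),
        "iqr": q3 - q1,
        "samples": len(delays_ms),
    }

# Invented delays (msec): a narrow peak, then the same peak plus two outliers.
clean = [86.0 + 0.1 * (i % 8) for i in range(40)]
with_outliers = clean + [250.0, 400.0]

clean_stats = summarize(clean)
outlier_stats = summarize(with_outliers)
print(clean_stats)
print(outlier_stats)
```

Two outliers out of 42 samples are enough to inflate the standard deviation by orders of magnitude while leaving the IQR almost unchanged, which is why the IQR is the better measure of peak width for heavy-tailed delay data.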
To pinpoint the difference between the median delays of the RIPE and Surveyor measurements, we adjusted the RIPE delays over the range -0.5 msec. to +0.2 msec. in steps of 0.1 msec. For each adjustment we compared the two frequency histograms point by point (one point per delay value) by calculating the square of the correlation coefficient (R²), and plotted it as a function of the adjustment made to the RIPE delays. It can be seen in the chart to the left that the optimum fit (largest value of R²) is obtained when between 0.2 and 0.3 msec. is subtracted from the RIPE delays.
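The scan over adjustments can be sketched as follows. The bin width, the function names and the data are assumptions made for the example; in particular the synthetic RIPE delays are simply the synthetic Surveyor delays plus 0.2 msec., so the scan should recover that offset. The normalization of the RIPE counts is omitted because scaling one histogram by a constant does not change the correlation coefficient.

```python
from collections import Counter

BIN_MS = 0.1  # histogram bin width in msec (an assumption)

def histogram(delays_ms, shift_ms=0.0):
    """Count delays per bin after adding shift_ms to every delay."""
    return Counter(round((d + shift_ms) / BIN_MS) for d in delays_ms)

def r_squared(xs, ys):
    """Square of the Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)

def scan_shifts(ripe_ms, surveyor_ms, shifts_ms):
    """Return {shift: R^2} comparing the shifted RIPE histogram to Surveyor's."""
    surveyor_hist = histogram(surveyor_ms)
    scores = {}
    for shift in shifts_ms:
        ripe_hist = histogram(ripe_ms, shift)
        bins = sorted(set(ripe_hist) | set(surveyor_hist))
        scores[shift] = r_squared([ripe_hist[b] for b in bins],
                                  [surveyor_hist[b] for b in bins])
    return scores

# Synthetic data: a single peak, with RIPE offset by +0.2 msec.
profile = [2, 10, 40, 90, 35, 12, 4]
surveyor = [86.02 + 0.1 * k for k, count in enumerate(profile) for _ in range(count)]
ripe = [d + 0.2 for d in surveyor]

shifts = [round(-0.5 + 0.1 * k, 1) for k in range(8)]  # -0.5 .. +0.2 msec
scores = scan_shifts(ripe, surveyor, shifts)
best = max(scores, key=scores.get)
print("best adjustment:", best, "R^2:", round(scores[best], 3))
```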
To compare the individual measurements made by RIPE and Surveyor, for each of the ~2160 RIPE measurements we took the Surveyor measurement immediately preceding the RIPE measurement. Then we took the difference in the timestamps, i.e. dTi = TSi - TRi for each pair (with index i), where TSi is the timestamp of the i-th Surveyor delay measurement and TRi is the timestamp of the corresponding RIPE delay measurement. Then we sorted by dT and kept those pairs that had a dT of < 2 secs. Then we scatter plotted these points and calculated the R². We also looked at the dT distribution. The results are shown to the right. They suggest that for this data, even for measurements close together in time (< 2 seconds apart, with a median separation of 0.545 seconds), there is little correlation between the points. This is so even though there are points with long delays; the problem is that the long delays in one measurement are not reflected in the other measurement. Presumably this means that whatever caused the long delay in one measurement has changed its effect by the time (< 2 secs later) the other measurement is made. To obtain a stronger correlation one probably needs more data with a persistent structure (for example, as would be caused by a routing change or congestion). An example of data with more structure can be seen in the results reported in Comparison of Surveyor and PingER.
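The pairing step described above can be sketched with a binary search over the Surveyor timestamps. The record layout (timestamp in seconds, delay in msec.), the function name and the sample values are assumptions. The sketch also reads the "dT of < 2 secs" cut as a cut on the magnitude of dT, since dT = TS - TR is zero or negative when the Surveyor sample comes first.

```python
from bisect import bisect_right

def pair_preceding(ripe, surveyor, max_gap_s=2.0):
    """Pair each RIPE sample with the latest Surveyor sample at or before it.

    ripe, surveyor: lists of (timestamp_s, delay_ms), sorted by timestamp.
    Returns a list of (dT, ripe_delay_ms, surveyor_delay_ms) with |dT| < max_gap_s.
    """
    surveyor_times = [t for t, _ in surveyor]
    pairs = []
    for t_r, delay_r in ripe:
        i = bisect_right(surveyor_times, t_r) - 1
        if i < 0:
            continue  # no Surveyor sample precedes this RIPE sample
        t_s, delay_s = surveyor[i]
        dt = t_s - t_r  # dT = TS - TR, as defined in the text
        if abs(dt) < max_gap_s:
            pairs.append((dt, delay_r, delay_s))
    return pairs

# Invented samples: (timestamp_s, delay_ms)
surveyor = [(0.0, 86.1), (1.0, 86.3), (2.0, 95.0), (10.0, 86.2)]
ripe = [(1.5, 86.4), (9.0, 86.5), (10.5, 86.6)]
pairs = pair_preceding(ripe, surveyor)
print(pairs)
```

In this example the RIPE sample at 9.0 s is dropped because the preceding Surveyor sample is 7 seconds older. The surviving delay pairs are what would then be scatter plotted and correlated.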
To see whether there was any point in trying to correlate individual delay measurements made at similar time stamps, we plotted the auto-correlation values for the RIPE and Surveyor data. They are seen to the right. It can be seen that at these time scales, and for the Jan-4 data set, the auto-correlation is very small even for small values of the lag (the minimal lag, or average separation between adjacent samples, is about 40 seconds for RIPE and about 1 second for Surveyor).
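For reference, a sample auto-correlation at a given lag can be computed as in the sketch below. The lag is counted in samples, so with the sampling rates quoted above a lag of 1 corresponds to roughly 40 seconds for RIPE and roughly 1 second for Surveyor. The estimator shown (lagged covariance divided by the total sum of squares) is one common choice and is not necessarily the one used for the plots; the series are invented.

```python
def autocorrelation(series, lag):
    """Sample auto-correlation of series at the given lag (in samples)."""
    n = len(series)
    mean = sum(series) / n
    total = sum((x - mean) ** 2 for x in series)
    if total == 0 or lag >= n:
        return 0.0
    lagged = sum((series[i] - mean) * (series[i + lag] - mean)
                 for i in range(n - lag))
    return lagged / total

# A series with persistent structure (a step, as a route change might cause)
step = [86.0] * 50 + [95.0] * 50
# A series that flips on every sample, with no persistence
alternating = [1.0, -1.0] * 50

print(autocorrelation(step, 1))
print(autocorrelation(alternating, 1))
```

The step series gives a value close to 1 at lag 1, whereas data without persistent structure gives values near zero or below, which is the situation reported here for the Jan-4 data set.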