International Committee for Future Accelerators (ICFA)
Standing Committee on Inter-Regional Connectivity (SCIC)
Chairperson: Professor Harvey Newman, Caltech
ICFA SCIC Network Monitoring Report
Prepared by the ICFA SCIC Monitoring Working Group
On behalf of the Working Group: Les Cottrell cottrell@slac.stanford.edu

January 2006 Report of the ICFA-SCIC Monitoring Working Group
Edited by R. Les Cottrell and Aziz A. Rehmatullah on behalf of the ICFA-SCIC Monitoring WG
Created January 18, 2006. Last Update February 2, 2006
This report is available from http://www.slac.stanford.edu/xorg/icfa/icfa-net-paper-jan06/
Contents:
Executive Overview | Introduction | Goals | Methodology | PingER Results | Comparison between Africa and South Asia | A View From Africa | A Case Study on Pakistan | IEPM Results | Comparison with HEP Needs | New Monitoring and Diagnostic Efforts in HEP | Comparisons with Economic Indicators | Accomplishments since Last Report | Efforts for Better PingER Management | Recommendations | Appendix: Countries in PingER Database | References
Internet performance is improving each year, with throughput typically improving by 40-50% per year and losses by 25%-45% per year, and for some regions such as S. E. Europe, even more. Geosynchronous satellite connections are still important to countries with poor telecommunications infrastructure. However, the number of countries with fiber connectivity has increased and continues to increase, and in most cases satellite links are used as backup or redundant links. In general for HEP countries satellite links are being replaced with land-line links with improved performance (in particular for RTT). On the other side of the coin, Internet usage is increasing (see http://www.internetworldstats.com/stats.htm), the application demands (see for example [bcr]) are growing and the expected reliability is increasing, so we cannot be complacent.
In general, throughput measured from within a region is
much higher than when measured from outside. Links between the more developed
regions including Anglo America,
Africa and
For modern HEP collaborations and Grids there is an increasing need for high-performance monitoring to set expectations, provide planning and trouble-shooting information, and to provide steering for applications.
There is a positive correlation between the various economic and development indices. Besides being useful in their own right these indices are an excellent way to illustrate anomalies and for pointing out measurement/analysis problems. The large variations between sites within a given country illustrate the need for careful checking of the results and the need for multiple sites/country to identify anomalies.
To quantify and help bridge the Digital Divide, enable
world-wide collaborations, and reach out to scientists world-wide, it is
imperative to continue and extend the PingER monitoring coverage to all
countries with HEP programs and significant scientific enterprises. On the
other hand, more support is required from these countries to enable adding more
sites to PingER. Based on a few weeks of data, we have tried to highlight the
problems in
The formation of this working group was requested at the ICFA/SCIC meeting at CERN in March 2002 [icfa-mar02]. The mission is to: Provide a quantitative/technical view of inter-regional network performance to enable understanding the current situation and making recommendations for improved inter-regional connectivity.
The lead person for the monitoring working group was identified as Les Cottrell. The lead person was requested to gather a team of people to assist in preparing the report and to prepare the current ICFA report for the end of 2002. The team membership consists of:
Table 1: Members of the ICFA/SCIC Network Monitoring team
| Les Cottrell | SLAC | US | cottrell@slac.stanford.edu |
| Sergei Berezhnev | RUHEP, | | sfb@radio-msu.net |
| Sergio F. Novaes | FNAL | | novaes@fnal.gov |
| Fukuko Yuasa | KEK | | Fukuko.Yuasa@kek.jp |
| Sylvain Ravot | Caltech | CMS | Sylvain.Ravot@cern.ch |
| Shawn McKee | | I2 HEP Net Mon WG | |
This report may be regarded as a follow-on to the May 1998 Report of the ICFA-NTF Monitoring Working Group [icfa-98], the January 2003 Report of the ICFA-SCIC Monitoring Working Group [icfa-03], the January 2004 Report of the ICFA-SCIC Monitoring Working Group [icfa-04] and the January 2005 Report of the ICFA-SCIC Monitoring Working Group [icfa-05]. The current report updates the January 2005 report, but is complete in its own right in that it includes the tutorial information from the previous reports.
There are two complementary types of Internet monitoring reported on in this report.
The PingER data and results extend back to the start of 1995. They thus provide a valuable history of Internet performance. PingER has 34 monitoring nodes in 14 regions that monitor 1037 remote nodes at over 750 sites in around 120 countries (see PingER Deployment [pinger-deploy]). These countries contain over 90% of the world's population (see Table 2) and over 99% of the online users of the Internet. Most of the hosts monitored are at educational or research sites. We try to get at least 2 hosts per country to help identify and avoid anomalies at a single host, although we are making efforts to increase the number of monitoring hosts to as many as we can. The requirements for the remote host can be found in [host-req]. Fig. 1 below shows the locations of the monitoring and remote (monitored) sites.

Figure 1: Locations of PingER
monitoring and remote sites as of Jan 2006.
There are around thirty seven hundred monitoring/monitored-remote-host pairs, so it is important to provide aggregation of data by hosts from a variety of "affinity groups". PingER provides aggregation by affinity groups such as HEP experiment collaborator sites, Top Level Domain (TLD), Internet Service Provider (ISP), or by world region etc. The world regions, as defined for PingER, and countries monitored are shown below in Fig. 2. The regions are chosen starting from the U.N. definitions [un]. We modify the region definitions to take into account which countries have HEP interests and to try and ensure the countries in a region have similar performance.

Figure 2: Major regions of the world for PingER aggregation by regions
More details on the regions are provided in Table 2 that highlights the number of countries monitored in each of these regions, and the distribution of population in these regions.
Table 2: Countries and populations by region
| Regions | # of Countries | % of World Population | % of Monitored Population |
| | 31 | 11.5 | 12.8 |
| Balkans | 9 | 1.0 | 1.1 |
| | 4 | 1.0 | 1.1 |
| | 25 | 7.9 | 8.8 |
| | 18 | 8.2 | 9.1 |
| | 2 | 5.1 | 5.7 |
| | 4 | 23.2 | 25.9 |
| | 6 | 6.2 | 6.9 |
| | 5 | 22.5 | 25.0 |
| | 5 | 2.8 | 3.1 |
| | 5 | 0.5 | 0.5 |
| | 1 | 2.2 | 2.5 |
To assist in interpreting the results in terms of their impact on well-known applications, we categorize the losses into quality ranges. These are shown below in Table 3.
Table 3: Quality ranges used for loss
| | Excellent | Good | Acceptable | Poor | Very Poor | Bad |
| Loss | <0.1% | >=0.1% & <1% | >=1% & <2.5% | >=2.5% & <5% | >=5% & <12% | >=12% |
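The quality ranges in Table 3 can be expressed as a small lookup. The following is a minimal sketch, assuming the loss is given as a percentage and that each category starts at the lower bound listed in the table; the function and constant names are illustrative and are not part of PingER.

```python
# Lower bound (in percent) of each loss quality range from Table 3,
# ordered from worst to best so the first match wins.
QUALITY_LOWER_BOUNDS = [
    (12.0, "Bad"),
    (5.0, "Very Poor"),
    (2.5, "Poor"),
    (1.0, "Acceptable"),
    (0.1, "Good"),
]


def loss_quality(loss_percent: float) -> str:
    """Map a packet loss percentage to its Table 3 quality label."""
    if loss_percent < 0:
        raise ValueError("loss percentage cannot be negative")
    for lower_bound, label in QUALITY_LOWER_BOUNDS:
        if loss_percent >= lower_bound:
            return label
    # Anything below 0.1% falls in the best category.
    return "Excellent"
```

For example, `loss_quality(0.05)` falls in the Excellent range and `loss_quality(3.0)` in the Poor range.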
More on the effects of packet loss and RTT can be found in the Tutorial on Internet Monitoring & PingER at SLAC [tutorial], briefly:
It must be understood that these quality designations apply to normal Internet use. For high performance, and thus access to data samples and effective partnership in distributed data analysis, much lower packet loss rates may be required.
Loss
Of the two metrics, loss and RTT, loss is more critical,
since the loss of a packet will typically cause timeouts that can last for
several seconds. Moreover, RTT increases with the distance between any
two nodes and with the number of hops, whereas loss is
less distance dependent. For instance RTT between a node at SLAC and somewhere
in
Figure 3: December 2005 packet loss snapshot seen from
Fig. 3 shows a snapshot of the losses for December 05. We
observe that very few countries have bad connectivity. Most of N. America, Europe,
Oceania and
Another way of looking at the losses is to see how many hosts fall in the various loss quality categories defined above as a function of time. An example of such a plot is seen in Fig 4.
Figure 4: Number of hosts measured from SLAC for each quality category from February 1998 through December 2005.
It can be seen that recently most sites fall in the good quality category. The numbers at the bottom indicate the percentage of total sites that see good or better packet loss at the start of the year. The fraction of sites with good quality has increased from about 55% to about 75% in the 9 years displayed. The plot also shows the increase in the total number of sites monitored from SLAC over the years. The improvements are particularly encouraging since most of the new sites are added in developing regions.
Towards the end of 2001 the number of sites monitored started
dropping as sites blocked pings due to
The increase in monitored sites towards the end of 2002
and early 2003 was due to help from the Abdus Salam International Centre for Theoretical
Physics (ICTP). The ICTP held a Round Table meeting on Developing
Country Access to On-Line Scientific Publishing: Sustainable Alternatives
[ictp] in
The increase towards the end of 2003 was spurred by
preparations for the second Open Round Table on Developing
Countries Access to Scientific Knowledge: Quantifying the Digital Divide
23-24 November Trieste, Italy and the WSIS conference and associated activities in
The increases in 2004 were due to adding new sites
especially in Africa, S. America,
In 2005, the Pakistan Ministry Of Science and Technology
(MOST) and the US State Department funded SLAC and the National University of
Sciences and Technology’s (NUST) Institute of Information Technology (NIIT) to
collaborate on a project to improve and extend PingER. As part of this project
and the increased interest from Internet2 in “Hard to Reach Network Places”
many new sites in the South Asia and
Fig. 5 below shows the long term trends for the various regions as seen from Anglo America.

Figure 5: Packet loss trends from Anglo America to various regions of the world.
The following general observations can be made for the losses:
Fig. 6 shows the world's connected population fractions obtained by dividing countries up by loss quality seen from the US, and adding the connected populations for the countries (we obtained the population/country figures from "How many Online" [nua] for 2001 and from CIA World Factbook for 2005 [cia-pop-figures]).

Figure 6:
Fraction of the world's connected population in countries with measured loss
performance in 2001 and Dec 2005
It can be seen that in 2001, <20% of the population lived in countries with acceptable or better packet loss. By December 2005 this had risen to 79%. The coverage of PingER has also increased from about 70 countries at the start of 2003 to over 120 in December 2005. This in turn reduced the fraction of the connected population for which PingER has no measurements. The results are even more encouraging when one bears in mind that the newer countries being added typically are from regions that have traditionally poorer connectivity.
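The aggregation behind Fig. 6 (grouping countries by loss quality and adding up their connected populations) can be sketched as follows. This is an illustrative outline only: the input records are made up, and the way unmeasured countries are labelled is an assumption rather than the procedure actually used for the figure.

```python
from collections import defaultdict


def population_fraction_by_quality(countries):
    """Return the fraction of the connected population in each quality class.

    `countries` is an iterable of (name, quality_label, connected_population)
    tuples; countries without measurements can carry a label such as
    "Not measured" (an assumed convention).
    """
    totals = defaultdict(float)
    for _name, quality, connected_population in countries:
        totals[quality] += connected_population
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("total connected population is zero")
    return {quality: pop / grand_total for quality, pop in totals.items()}


# Hypothetical input, in millions of connected users (not real data).
example = [
    ("country_a", "Good", 60.0),
    ("country_b", "Good", 20.0),
    ("country_c", "Poor", 20.0),
]
fractions = population_fraction_by_quality(example)
```

With the hypothetical input above, 80% of the connected population falls in the Good class and 20% in the Poor class.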
It is interesting to compare the packet losses seen by various regions with those seen by residential DSL customers in the San Francisco Bay Area. This is shown in Fig. 7 below.

Figure 7:
Losses from SLAC to various world regions compared with that for residential
customers in the San Francisco Bay Area.
RTTs
There are limits to the minimum RTT due to the speed of
light in fibers or electricity in copper. Typically today, the minimum RTTs for
terrestrial circuits are about 2 * distance / (0.6 * c), or roughly
1 ms of RTT per 100 km of distance, where c is the velocity of light, the factor of
2 accounts for the round-trip, and 0.6*c is roughly the speed of light in
fiber. For geostationary satellite links, the minima are between 500 and
600ms. For
Fig. 8 below shows the trends of the min-RTT measured from ESnet sites in Anglo America to the various regions of the world. The straight lines are exponential fits to the data (straight lines on a log-linear plot), and the wiggly lines are moving averages for the last 6 months.

Figure 8:
Minimum RTT measured from ESnet sites in the
As is seen by comparing the exponential
fits with the moving averages, the trends here are less clear. Europe, the
Balkans and, to a lesser extent, Russia have been fairly stable since the links
were upgraded from, say, 45 Mbps to 155, 622, 2400 or 10,000 Mbps. This implies
that for high speed links the actual link speed has only a small effect on the
minimum RTT, the main effect being the distance.
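As a worked illustration of why distance dominates the minimum RTT, the sketch below evaluates the terrestrial estimate 2 * distance / (0.6 * c) from the start of this section. It also estimates a geostationary lower bound, assuming the standard geostationary altitude of about 35,786 km, a satellite directly overhead and no processing delay; those assumptions and all function names are ours, not part of the PingER analysis.

```python
# Speed of light in vacuum, expressed in km per millisecond.
C_KM_PER_MS = 299_792.458 / 1000.0

# Approximate altitude of a geostationary orbit above the equator, in km.
GEO_ALTITUDE_KM = 35_786.0


def min_terrestrial_rtt_ms(distance_km: float, fiber_factor: float = 0.6) -> float:
    """Minimum RTT over a terrestrial path: 2 * distance / (0.6 * c)."""
    return 2.0 * distance_km / (fiber_factor * C_KM_PER_MS)


def min_geostationary_rtt_ms() -> float:
    """Lower bound for one satellite hop in each direction.

    The request goes up and down once and so does the reply, giving four
    traversals of the altitude at the speed of light in vacuum.
    """
    return 4.0 * GEO_ALTITUDE_KM / C_KM_PER_MS


if __name__ == "__main__":
    # A 9000 km path gives about 100 ms, i.e. roughly 1 ms per 90 km.
    print(round(min_terrestrial_rtt_ms(9000.0), 1))
    # About 477 ms, consistent with observed minima of 500 to 600 ms.
    print(round(min_geostationary_rtt_ms(), 1))
```

Neither estimate depends on the link speed, which matches the observation that upgrading link capacity has little effect on the minimum RTT.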
Fig. 9 shows the RTT from the

Figure 9: December 05 comparison of Minimum RTT with 2003 and 2000 results
It is seen that
the number of countries with satellite links (> 600ms RTT or dark red) has
decreased markedly in the 6 years shown. Today satellite links are used in
places where it is hard or unprofitable to install terrestrial lines (typically
fibers). Barring a few countries in Central and Eastern Africa,
Two interesting examples stand out in this data:
Throughput
We also combine the loss and RTT measurements using throughput = 1460 Bytes [Maximum Segment Size] / (RTT * sqrt(loss)) from [mathis]. The results are shown in Fig. 10. The orange line shows a ~40% improvement/year, or about a factor of 10 in < 7 years.
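The derived throughput formula and the growth-rate arithmetic can be checked with a few lines of code. This is a minimal sketch, assuming RTT is supplied in milliseconds, loss as a fraction between 0 and 1, and the result is wanted in kbits/s; the function names are illustrative only.

```python
import math

# The 1460-byte constant used in the formula quoted from [mathis].
SEGMENT_BYTES = 1460


def derived_throughput_kbps(rtt_ms: float, loss_fraction: float) -> float:
    """Throughput = 1460 bytes / (RTT * sqrt(loss)), returned in kbits/s."""
    if rtt_ms <= 0 or loss_fraction <= 0:
        # The formula diverges for zero loss or zero RTT.
        raise ValueError("rtt_ms and loss_fraction must be positive")
    rtt_s = rtt_ms / 1000.0
    bytes_per_second = SEGMENT_BYTES / (rtt_s * math.sqrt(loss_fraction))
    return bytes_per_second * 8.0 / 1000.0


def years_to_improve_by(factor: float, annual_improvement: float) -> float:
    """Years needed to improve by `factor` at a compound annual rate."""
    return math.log(factor) / math.log(1.0 + annual_improvement)


if __name__ == "__main__":
    # 200 ms RTT with 1% loss gives 584 kbits/s.
    print(derived_throughput_kbps(200.0, 0.01))
    # 40% per year reaches a factor of 10 in about 6.8 years.
    print(round(years_to_improve_by(10.0, 0.40), 1))
```

Because the formula is undefined at zero loss, the sketch rejects that case rather than guessing how such measurements should be handled.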
Figure 10: Derived throughput as a function of time seen from ESnet sites to various regions of the world. The numbers in parentheses are the number of monitoring/remote host pairs contributing to the data. The lines are exponential fits to the data.
The data for several of the developing countries
extends back only about five years, so some care must be taken in interpreting
the long term trends. With this caveat, it can be seen that links between the
more developed regions including Anglo America,
View from
To assist in developing a less N. American view of the Digital
Divide, we added many more hosts in developing countries to the list of hosts
monitored from CERN in
Figure 11: Derived throughputs to various regions as seen from CERN
The slow increase for
Variability of performance between and within regions
The throughput results, so far presented in this report,
have been measured from Anglo America or to a lesser extent from
Table 4: Derived throughputs in kbits/s from
monitoring hosts to monitored hosts by region of the world for December 2005

As expected it can be seen that within regions (the circled
cells) performance is generally better than between regions. Also performance
is better between closely located regions such as Europe and S. E. Europe,
To provide further insight into the variability in performance for various regions of the world seen from SLAC, Fig. 12 shows various statistical measures of the losses and derived throughputs. The regions are sorted by the median of the measurement type displayed. Note the throughput graph uses a log y-scale to enable one to see the regions with poor throughput. The countries comprising a region can be seen in Fig. 2.

Figure 12: 25th percentile, median and 75th percentile derived throughputs and losses for
various regions measured from SLAC for Oct-Dec '05
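The per-region statistics shown in Fig. 12 can be sketched as below: compute the 25th percentile, median and 75th percentile for each region and sort the regions by median. The report does not state which percentile interpolation is used, so the "inclusive" method of Python's `statistics.quantiles` is an assumption, and the region names and values are placeholders rather than measured data.

```python
from statistics import quantiles


def quartile_summary(values):
    """Return the 25th percentile, median and 75th percentile of values.

    Requires at least two data points.
    """
    p25, median, p75 = quantiles(values, n=4, method="inclusive")
    return {"p25": p25, "median": median, "p75": p75}


def summarize_regions(samples_by_region):
    """Summarize each region and sort the regions by their median."""
    summaries = {
        region: quartile_summary(values)
        for region, values in samples_by_region.items()
    }
    return sorted(summaries.items(), key=lambda item: item[1]["median"])


# Placeholder throughput samples in kbits/s (not real measurements).
example = {
    "region_a": [100, 200, 300, 400, 500],
    "region_b": [10, 20, 30, 40, 50],
}
ranked = summarize_regions(example)
```

Sorting by the median rather than the mean keeps a few very fast or very slow sites from dominating a region's position.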
The difference in throughput for N. America and Europe is
an artifact of the measurements being made from N. America (SLAC), which has a
much shorter RTT to N. American than to European sites (shorter by roughly a
factor of 2 to 20, or close to 3 to 4 for the average sites). Since the derived throughput
goes as 1/RTT this favors N.
Case Study on NIIT,
With NIIT being an important collaborator with SLAC, Caltech and CERN, we
prepared a small case study with 3 PingER monitoring sites in
The Pakistan Education and Research Network (PERN) “is a nationwide educational intranet connecting premiere educational and research institutions of the country. PERN focuses on collaborative research, knowledge sharing, resource sharing, and distance learning by connecting people through the use of Intranet and Internet resources”.
PERN uses the services of NTC
(National Telecommunication Corporation), which is the national
telecommunication carrier for all official/government services in
Table 5: Remote sites in
| | Remote Node | University Location | Service Provider | Traffic Enters the Country Via | End host location |
1 |
PK.QAU.EDU.N1 |
|
NTC |
|