International Committee for Future Accelerators (ICFA)

Standing Committee on Inter-Regional Connectivity (SCIC)

Chairperson: Professor Harvey Newman, Caltech








ICFA SCIC Network Monitoring Report









Prepared by the ICFA SCIC Monitoring Working Group

On behalf of the Working Group:
Les Cottrell






February 7, 2003

January 2003 Report of the ICFA-SCIC Monitoring Working Group

Edited by R. Les Cottrell with assistance from Richard Mount, on behalf of the ICFA-SCIC Monitoring WG

Created November 23, 2002. Last Update February 7, 2003


This report is available from


Executive Overview

Internet performance is improving each year with packet losses typically improving by 40-50% per year and Round Trip Times (RTTs) by 10-20% and, for some countries such as China, even more. Geosynchronous satellite connections are still important to countries with poor telecommunications infrastructure. In general for HEP countries satellite links are being replaced with land-line links with improved performance (in particular for RTT).

Links between the more developed regions including Anglo America, Japan and Europe are much better than elsewhere (5 - 10 times more throughput achievable). Regions such as China, S.E. Europe, Latin America and Russia are improving at a similar rate to the more developed regions. However, China, Russia, S.E. Europe and Latin America are a few years behind in performance compared to the more developed regions, and do not appear to be catching up. Countries/regions with particularly bad connections include the Caucasus, India, and Africa. Poor connections are also prevalent to S.E. Europe.

For modern HENP collaborations and Grids there is an increasing need for high-performance monitoring to set expectations, provide planning and trouble-shooting information, and to provide steering for applications.

To quantify and help bridge the Digital Divide, enable world-wide collaborations, and reach-out to scientists world-wide, it is imperative to continue and extend the monitoring coverage to all countries with HENP programs and significant scientific enterprises. This in turn will require help from ICFA to identify sites to monitor and contacts for those sites, plus identifying sources of on-going funding support to continue and extend the monitoring.


The formation of this working group was requested at the ICFA/SCIC meeting at CERN in March 2002 [icfa-mar02]. The mission is to: Provide a quantitative/technical view of inter-regional network performance to enable understanding the current situation and making recommendations for improved inter-regional connectivity.

The lead person for the monitoring working group was identified as Les Cottrell. The lead person was requested to gather a team of people to assist in preparing the report and to prepare the current ICFA report for the end of 2002. The team membership consists of:

Les Cottrell           SLAC                        US
Richard Hughes-Jones   University of Manchester    UK
Sergei Berezhnev       RUHEP, Moscow State Univ.   Russia
Sergio F. Novaes       FNAL                        S. America
Fukuko Yuasa           KEK                         Japan and E. Asia
Sylvain Ravot          Caltech                     CMS
Daniel Davids          CERN                        CERN, Europe, LHC
Shawn McKee            Michigan                    I2 HENP Net Mon WG

Goals of the Working Group

This report may be regarded as a follow on to the May 1998 Report of the ICFA-NTF Monitoring Working Group [icfa-98]. Besides providing information for the last 4.5 years since the previous report, this report also adds information on high throughput performance monitoring.


There are two complementary types of Internet monitoring reported on in this report.
  1. The first uses PingER [pinger], which relies on the ubiquitous "ping" utility available as standard on most modern hosts. Details of the PingER methodology can be found in the May 1998 Report of the ICFA-NTF Monitoring Working Group [icfa-98]. PingER provides low-intrusiveness (~100 bits/s per host pair monitored) measurements of Round Trip Time (RTT), loss and reachability (if a host does not respond to a set of 21 pings it is presumed to be non-reachable). The low intrusiveness makes the method very effective for measuring regions and hosts with poor connectivity. Since the ping server is pre-installed on all remote hosts of interest, minimal support is needed from the remote host (no software to install, no account needed etc.).
  2. The second method (IEPM-BW [iepm]) is for measuring high network and application throughput between hosts with excellent connections. Examples of such hosts are to be found at HENP accelerator sites and tier 1 and 2 sites, as well as major academic and research sites in Anglo America (see footnote 1), Japan and Europe. The method is quite intrusive (for each remote host being monitored from a monitoring host, it can utilize hundreds of Mbits/s for ten seconds or more each hour). It also requires more support from the remote host. In particular an account is required, software (servers) must be installed, disk space, compute cycles etc. are consumed, and there are security issues. The method provides expectations of throughput achievable at the network and application levels, as well as information on how to achieve it, and trouble-shooting information.



The PingER data and results extend back to the start of 1995. They thus provide a valuable history of Internet performance. There are about 38 monitoring hosts in 12 countries, monitoring over 791 remote hosts at 482 sites in 70 countries (see PingER Deployment [pinger-deploy]). These countries contain over 78% of the world's population and over 99% of the  online users of the Internet. Most of the hosts monitored are at educational or research sites.

Since there are over 3500 monitoring/monitored host pairs, it is important to provide aggregation of data by hosts from a variety of "affinity groups". PingER provides aggregation  by affinity groups such as HENP experiment collaborator sites, Top Level Domain (TLD),  Internet Service Provider (ISP), or by world region etc. The world regions are shown below in Table 1. They are chosen starting from the U.N. definitions [un]. We modify the region definitions to take into account which countries have HENP interests and to try and ensure the countries in a region have similar performance. The regions are shown in Fig. 1, together with the names of some of the countries monitored.

Figure 1: Major regions of the world for PingER aggregation  by regions

The major regions are: Anglo America, Latin America, Europe, S.E. Europe, Africa, Mid East, Caucasus, Former Soviet Union, S. Asia, E. Asia and Australasia. We also subdivide regions at times to provide better granularity. The major sub-regions are obtained by separating: Central America and the Caribbean from S. America; Israel from the Mid-East; the Baltic States from Europe; Russia from the Former Soviet Union; China, Japan & Korea from E. Asia.

To assist in interpreting the results in terms of their impact on well-known applications, we categorize the RTTs, losses etc. into quality ranges.  These are shown below in Table 2.
Table 2: Quality ranges used for loss and RTT
      Excellent  Good              Acceptable            Poor                 Very Poor            Bad
Loss  < 0.1%     >= 0.1% & < 1%    >= 1% & < 2.5%        >= 2.5% & < 5%       >= 5% & < 12%        > 12%
RTT              < 62.5ms          >= 62.5ms & < 125ms   >= 125ms & < 250ms   >= 250ms & < 500ms   >= 500ms
More on the effects of packet loss and RTT can be found in the Tutorial on Internet Monitoring & PingER at SLAC [tutorial].

It must be understood that these quality designations apply to normal Internet use. For high performance, and thus access to data samples and effective partnership in distributed data analysis, much lower packet loss rates may be required.
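
The quality ranges of Table 2 can be sketched as a small lookup. This is only an illustration, not the PingER implementation: the function names are hypothetical, the RTT boundary at 250ms and the "Bad" RTT range from 500ms upward are inferred from the adjacent ranges, and a loss of exactly 12% (which the table leaves unassigned) is treated here as "Bad".

```python
# Upper bounds (exclusive) for each quality label, following Table 2.
# Loss is in percent, RTT in milliseconds.
LOSS_BOUNDS = [
    (0.1, "Excellent"),
    (1.0, "Good"),
    (2.5, "Acceptable"),
    (5.0, "Poor"),
    (12.0, "Very Poor"),
]
# Table 2 lists no "Excellent" range for RTT; the 250 ms and 500 ms
# boundaries are assumptions inferred from the neighbouring ranges.
RTT_BOUNDS = [
    (62.5, "Good"),
    (125.0, "Acceptable"),
    (250.0, "Poor"),
    (500.0, "Very Poor"),
]


def _classify(value, bounds):
    """Return the first label whose upper bound exceeds value, else 'Bad'."""
    for upper, label in bounds:
        if value < upper:
            return label
    return "Bad"


def loss_quality(loss_percent):
    """Quality label for a packet loss given in percent."""
    return _classify(loss_percent, LOSS_BOUNDS)


def rtt_quality(rtt_ms):
    """Quality label for a round trip time given in milliseconds."""
    return _classify(rtt_ms, RTT_BOUNDS)


if __name__ == "__main__":
    for loss in (0.05, 3.0, 50.0):
        print(f"loss {loss}% -> {loss_quality(loss)}")
    for rtt in (70.0, 600.0):
        print(f"RTT {rtt}ms -> {rtt_quality(rtt)}")
```

With these thresholds a 3% loss is classified as poor and a 50% loss as bad.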


Of the two metrics, loss & RTT, loss is more critical since the loss of a packet will typically cause timeouts that can last for several seconds. Fig. 2 shows the loss measured from ESnet sites (ANL, BNL, DoE, ESnet NOC, FNAL, LANL, & SLAC) to various regions of the world.

Figure 2: Monthly packet loss as a function of time, seen from ESnet sites to various regions of the world. The numbers in parentheses indicate the number of monitor site / remote site pairs contributing to the medians. The orange dots show 50% improvement/year.

The following observations can be made:

Another way of looking at the losses is to see how many hosts fall in the various loss quality categories defined above as a function of time. An example of such a plot is seen in Fig. 3.

Figure 3: Number of hosts measured from SLAC for each quality category from Jan 2000 through Mar 2002.

It can be seen that recently most sites fall in the good quality category. Also the number of sites with good quality has increased by about 50% in the 2 years displayed.

The loss results presented so far in this report have been measured from Anglo America, partly because more data is available, over a longer period, for the Anglo American monitoring hosts. Table 3 shows the losses seen between monitoring and monitored hosts in the major regions of the world. Each column is for monitoring hosts in a given region, each row is for monitored hosts in a given region. The cells are colored according to the median quality for the monitoring region/monitored region pair: white for < 1% loss (good), green for >= 1% & < 2.5% (acceptable), orange for >= 2.5% & < 5% (poor), and magenta for >= 5% (very poor to bad). The table is ordered by increasing average loss seen from Anglo American, W. European and Japanese hosts. The numbers in parentheses are the number of monitoring hosts in the named region. The monitoring countries are identified by their 2 character TLD. We have broken apart monitoring from Moscow (ITEP) and Novosibirsk since they are very different. The .ORG site is JLab. The .NET sites are APAN in Japan and the ESnet NOC at LBNL. The .GOV sites are ESnet sites (ANL, BNL, DOE-HQ, FNAL, and LANL). FSU- refers to the Former Soviet Union excluding Russia (we are monitoring Belorussia, Kazakhstan, Mongolia, Moldova, Ukraine, and Uzbekistan). S. Asia is the Indian sub-continent; S.E. Asia is composed of measurements to Indonesia, Malaysia, Singapore, Thailand and Vietnam.
Table 3: Percentage packet loss from monitoring hosts to monitored hosts by region of the world for November 2002.


There are a couple of anomalies: the Mid East measurements are almost entirely composed of measurements to Israel; the Caucasus measurements are to 2 countries with very different performance: Azerbaijan (50% loss) and Georgia (3% loss). It can be seen that in general performance is good to acceptable. Regions with very poor to bad performance include Africa (we monitor S. Africa, Uganda & Egypt), S. Asia (India) and the Caucasus region. Performance to S.E. Asia is generally poor.

Fig. 4 shows the fraction of the world's population living in countries in each loss quality category, as seen from the US. For each loss quality we added the populations of the countries in that category (we obtained the population/country figures from "How many Online" [nua]) and divided by the total world population.

Figure 4: Fraction of the world's population in countries with measured loss performance in 2001.

It can be seen that in 2001, <20% of the population lived in countries with good to acceptable packet loss.

The evolution of loss with path upgrades is also worthy of attention. Typically the losses drop each time the bottleneck link in a path is upgraded, and then increase again until the next upgrade. Over time, however, the overall trend is down. This is illustrated for the Trans-Atlantic link in Fig. 5.
Figure 5: Evolution of losses with time on path from ESnet to the UK, indicating the effects of various upgrades in the Trans-Atlantic link.


Unfortunately there are limits to the minimum RTT due to the speed of light in fibers or electricity in copper. Typically today, the minimum RTT for terrestrial circuits is about 2 * distance / (0.6 * (0.6 * c)), where c is the velocity of light, the factor of 2 accounts for the round-trip, 0.6*c is roughly the speed of light in fiber and the extra 0.6 allows roughly for the delays in network equipment such as routers. For geostationary satellite links the minima are between 500 and 600ms. For U.S. cross-country links (e.g. from SLAC to BNL) the typical minimum RTT (i.e. when a packet sees no queuing delays) is about 70 msec.
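
As a rough check of the rule of thumb above, the minimum terrestrial RTT can be evaluated directly. The sketch below is illustrative only: the function name is hypothetical, and the 4000 km path length is an assumed round figure for a U.S. cross-country route, not a measured SLAC to BNL fiber distance.

```python
C_KM_PER_S = 299_792.458  # speed of light in vacuum, km/s


def min_rtt_ms(distance_km, fiber_factor=0.6, equipment_factor=0.6):
    """Minimum RTT in ms: 2 * distance / (0.6 * (0.6 * c)).

    fiber_factor approximates the speed of light in fiber relative to c;
    equipment_factor roughly allows for delays in routers and similar gear.
    """
    effective_speed = equipment_factor * fiber_factor * C_KM_PER_S
    return 2.0 * distance_km / effective_speed * 1000.0


if __name__ == "__main__":
    # Assumed 4000 km one-way path length (hypothetical value).
    print(f"{min_rtt_ms(4000.0):.0f} ms")
```

For the assumed 4000 km path this gives a value in the low 70s of milliseconds, in line with the roughly 70 msec quoted for cross-country links.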

The RTTs seen from ESnet monitoring sites  to hosts in various regions are seen in Fig. 6.

Figure 6: RTTs seen from ESnet sites to various regions of the world. Edu indicates hosts in the .edu Top Level Domain (TLD), i.e. mainly U.S. universities. The straight lines are fits to exponential functions to help guide the eye. The orange lines are for 10 & 20% improvements (reductions) in RTT/year.

It is seen that Anglo American and European sites have been improving by between 10 & 20% per year. Japan and Russia are improving at a slower rate and S.E. Europe & mainland China at a faster rate. The improvements are due to faster links (less time clocking the bits in and out of the network equipment), faster routers and improved routes.

Fig. 7 shows the RTT from the U.S. to the world in January 2000 and August 2002. It also indicates which countries of the world contain sites that are monitored (countries in green are not monitored).


Figure 7: Average monthly RTT measured from U.S to various countries of the world for January 2000 and August 2002. Countries shaded green are not measured.

It is seen that the number of countries with satellite links (> 600ms RTT or dark red) has decreased markedly in the 2.5 years shown. Today satellite links are used in places where it is hard or unprofitable to install terrestrial lines (typically fibers).


We also combine the loss and RTT measurements using throughput = 1460Bytes/(RTT * sqrt(loss)) from [mathis]. The results are shown in Fig. 8. The orange line shows an 80% improvement/year, or about a factor of 10 in 4 years.
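
The derived throughput and the compounding of an annual improvement rate can be sketched as follows. The function names and the example RTT and loss values are illustrative assumptions, and the loss is taken as a fraction (not a percentage).

```python
import math

MSS_BYTES = 1460  # packet payload size used in the formula above


def derived_throughput_bps(rtt_s, loss_fraction, mss_bytes=MSS_BYTES):
    """Throughput in bits/s from throughput = MSS / (RTT * sqrt(loss))."""
    if rtt_s <= 0 or loss_fraction <= 0:
        raise ValueError("RTT and loss must both be positive")
    return mss_bytes * 8 / (rtt_s * math.sqrt(loss_fraction))


def compound_factor(annual_improvement, years):
    """Overall factor after compounding an annual improvement rate."""
    return (1.0 + annual_improvement) ** years


if __name__ == "__main__":
    # Hypothetical path: 200 ms RTT with 1% packet loss.
    bps = derived_throughput_bps(0.200, 0.01)
    print(f"derived throughput: {bps / 1e6:.2f} Mbits/s")
    # 80% improvement per year compounded over 4 years.
    print(f"factor after 4 years: {compound_factor(0.80, 4):.1f}")
```

An 80% annual improvement compounds to a factor of about 10.5 over 4 years, which is the "factor of 10" quoted for the orange line.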

Figure 8: Derived throughput as a function of time seen from ESnet sites to various regions of the world.
It is important to note that although the performances to regions such as S.E. Europe, Russia, Latin America and China are improving at a similar rate to the more developed countries, they do not appear to be catching up, and in the case of Russia may even be falling further behind. Currently we do not have sufficient information to pinpoint how other, even less developed regions, such as Africa, are managing (see Connectivity Mapping in Africa [ictp-jensen] and African Internet Connectivity [africa]).


The PingER method of measuring throughput breaks down for high speed networks due to the different nature of packet loss for ping compared to TCP, and also since PingER only measures about 14,400 pings of a given size/month between a given monitoring host/monitored host pair. Thus if the link has a loss rate lower than 1/14400 the loss measurements will be inaccurate. This is partially the reason the ESnet and Abilene throughputs appear to have leveled out since 2000 in Fig. 8. For example, if the loss probability is < 1/14400 then we take the loss as being 0.5 packet to avoid a division by zero, so if the average RTT for ESnet is 50msec then the maximum throughput we can use PingER data to predict is ~ 1460Bytes*8bits/(0.050sec*sqrt(0.5/14400)) or ~ 40Mbits/s.
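
The arithmetic behind this ceiling can be reproduced with a few lines. This is a sketch using the numbers quoted above (14,400 pings per month, a floor of 0.5 lost packets, 1460 byte packets); the function name is hypothetical.

```python
import math


def pinger_throughput_ceiling_bps(rtt_s, pings_per_month=14400,
                                  floor_lost_packets=0.5, mss_bytes=1460):
    """Largest throughput (bits/s) predictable from one month of pings.

    When no loss is observed, the loss is taken as floor_lost_packets out
    of pings_per_month, which caps the derived throughput.
    """
    min_loss_fraction = floor_lost_packets / pings_per_month
    return mss_bytes * 8 / (rtt_s * math.sqrt(min_loss_fraction))


if __name__ == "__main__":
    ceiling = pinger_throughput_ceiling_bps(0.050)  # 50 msec RTT
    print(f"ceiling: {ceiling / 1e6:.1f} Mbits/s")
```

For a 50 msec RTT this evaluates to just under 40 Mbits/s; halving the RTT doubles the ceiling, so the limit is lower for longer paths.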

To address this challenge and to understand and provide monitoring of high performance throughput between major sites of interest to HEP and the Grid, we developed the IEPM-BW monitoring infrastructure and toolkit. There are now (December 2002) about 10 monitoring hosts and about 50 monitored hosts in 9 countries (CA, CH, CZ, FR, IT, JP, NL, UK, US). Both application (file copy and file transfer) and TCP throughputs are measured.

These measurements indicate that throughputs of several hundreds of Mbits/s are achievable on today's production academic and research networks, using common off the shelf hardware with standard network drivers, TCP stacks, packet sizes etc. Achieving these levels of throughput requires care in choosing the right configuration parameters. These include large TCP buffers and windows, multiple parallel streams, sufficiently powerful CPUs (typically better than 1 MHz/Mbit/s), fast enough interfaces and buses, and a fast enough link (> 100Mbits/s) to the Internet. In addition, for file operations one needs well designed/configured disk and file sub-systems.
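
To illustrate why large TCP buffers and windows are needed, the standard bandwidth-delay product (window = bandwidth * RTT) gives the amount of data that must be in flight to fill a path. The sketch below is not taken from the IEPM-BW toolkit: the function names and the 500 Mbits/s, 150 msec example are assumptions, and dividing the window evenly across parallel streams is a simplification.

```python
def bandwidth_delay_product_bytes(bandwidth_bps, rtt_s):
    """Bytes that must be in flight to keep a path of this speed and RTT full."""
    return bandwidth_bps * rtt_s / 8.0


def per_stream_window_bytes(bandwidth_bps, rtt_s, streams):
    """Window needed per stream if the load is split evenly (a simplification)."""
    if streams < 1:
        raise ValueError("streams must be at least 1")
    return bandwidth_delay_product_bytes(bandwidth_bps, rtt_s) / streams


if __name__ == "__main__":
    # Hypothetical trans-oceanic path: 500 Mbits/s with a 150 msec RTT.
    bdp = bandwidth_delay_product_bytes(500e6, 0.150)
    print(f"single stream window: {bdp / 1e6:.1f} MBytes")
    print(f"per stream with 8 streams: "
          f"{per_stream_window_bytes(500e6, 0.150, 8) / 1e6:.2f} MBytes")
```

For the assumed path a single stream would need a window of over 9 MBytes, far above typical default buffer sizes, which is one reason multiple parallel streams with smaller windows are attractive.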

Though not strictly monitoring, there is currently much activity in understanding and improving the TCP stacks ([floyd], [low], [ravot]). In particular with high speed links (> 500Mbits/s) and long RTTs (e.g. trans-continental or trans-oceanic) today's standard TCP stacks respond poorly to congestion (they back off too quickly and recover too slowly). To partially overcome this one can use multiple streams or, in a few special cases, large (>> 1500Bytes) packets. In addition new applications ([bbcp], [bbftp], [gridftp]) are being developed to allow use of larger windows and multiple streams as well as non-TCP strategies ([tsunami]). There is also work to understand how to improve operating system configurations [os] to improve throughput performance. As it becomes increasingly possible to utilize more of the available bandwidth, more attention will need to be paid to fairness and the impact on other users. Besides ensuring the fairness of TCP itself, we may need to deploy and use quality of service techniques such as QBSS [qbss]. These subjects will be covered in more detail in the companion ICFA-SCIC Advanced Technologies Report. We note here that monitoring infrastructures such as IEPM-BW can be effectively used to measure and compare the performance of TCP stacks, measurement tools, applications and subcomponents such as disk and file systems and operating systems in a real world environment.

Comparison with HEP Needs

Recent studies of HEP needs, for example the TAN Report, have focused on communications between developed regions such as Europe and Anglo America. In such reports packet loss less than 1%, vital for unimpeded interactive log-in, is assumed and attention is focused on bandwidth needs and the impact of low, but non-zero, packet loss on the ability to exploit high-bandwidth links. The PingER results show clearly that much of the world suffers packet loss impeding even very basic participation in HEP experiments and point to the need for urgent action.

The PingER throughput predictions based on the Mathis formula assume that throughput is not limited by bandwidth, only by packet loss. It is therefore somewhat fortuitous that the 80% per year growth curve in Fig. 8 is a close match to the 79% per year growth in future needs that can be inferred from the tables in the TAN Report. True throughput measurements have not been in place for long enough to measure a growth trend. Nevertheless, the throughput measurements, and the trends in predicted throughput, indicate that current attention to HEP needs between developed regions could result in needs being met. In contrast, the measurements indicate that the throughput to less developed regions is likely to continue to be well below that needed for full participation in future experiments.


Internet performance is improving each year with losses typically improving by 40-50% per year and RTTs by 10-20% and, for some countries such as China, even more. Geosynchronous satellite connections are still important to countries with poor telecommunications infrastructure. In general for HEP countries satellite links are being replaced with land-line links with improved performance (in particular for RTT).

Links between the more developed regions including Anglo America, Japan and Europe are much better than elsewhere (5 - 10 times more throughput achievable). Regions such as China, S.E. Europe, Latin America and Russia are improving at a similar rate to the more developed regions. However, China, Russia, S.E. Europe and Latin America are a few years behind in performance compared to the more developed regions, and do not appear to be catching up. Countries/regions with particularly bad connections include the Caucasus, India, and Africa. Poor connections are also prevalent to S.E. Europe.

The ICFA-SCIC "Digital Divide" report will dwell in more detail on many of the issues of the performance differences for the developed and less well-developed countries. The Abdus Salam International Center for Theoretical Physics (ICTP) recently held a Round Table meeting on Developing Country Access to On-Line Scientific Publishing: Sustainable Alternatives [ictp] in Trieste that included a Proposal for Real time monitoring in Africa [africa-rtm]. Following the meeting a formal declaration was made on Recommendations of the Round Table held in Trieste to help bridge the digital divide [ictp-rec]. The PingER project is collaborating with the ICTP to develop a monitoring project aimed at better understanding and quantifying the Digital Divide. On December 4th the ICTP electronic Journal Distribution Service (eJDS) sent an email entitled Internet Monitoring of Universities and Research Centers in Developing Countries [ejds-email] to their collaborators informing them of the launch of the monitoring project and requesting participation. By January 14th 2003, with the help of ICTP, we had added about 23 hosts in about 17 countries including: Bangladesh, Brazil, China, Colombia, Ghana, Guatemala, India (Hyderabad and Kerala), Indonesia, Iran, Jordan, Korea, Mexico, Moldova, Nigeria, Pakistan, Slovakia and Ukraine.


The monitoring covers most countries with formal HEP programs. However, there are a few countries appearing in the "Pocket Diary for Physicists" that are not monitored. The monitoring should be extended to include hosts in such countries including Pakistan, and Armenia. There may also be interest from ICFA to extend the monitoring further to countries with no formal HEP programs, but where there are needs to understand the Internet connectivity performance in order to aid the development of science. Africa is a region with many such countries.

The collaboration with the ICTP appears to be very fruitful in bringing in contacts from developing nations with scientific interests. This collaboration and project should be fostered, the coverage of developing nations should be extended, and the results made available and publicized to the HEP community (e.g. by a paper at CHEP03). Typically HENP leads other sciences in its needs and in developing an understanding of the problems and their solutions. The outreach from HENP to other sciences is to be encouraged. The results should be publicized more widely; the World Summit of the Information Society may be a suitable venue. We recommend working with the ICTP to further publicize the project by creating a web site and an A3 color poster for distribution world-wide, and by informing international organizations such as the UN, UNESCO and the Soros foundation of the eJDS/PingER monitoring project.

We need assistance from ICFA and others to find sites to monitor and contacts in the following countries:

Although not a recommendation per se, it would be disingenuous to finish without noting the following. SLAC & FNAL are the leaders in the PingER project. The funding for the PingER effort has come from the DoE MICS office since 1997; however, it will finish at the end of the coming year. To continue the effort would probably require central funding at a level of about 50% of a Full Time Equivalent (FTE) person, plus travel. The DoE needs to be approached to investigate ongoing funding, and the ICTP may be able to seek support from the EU and other foundations.

Appendix: Countries in PingER Database

The following table lists the 79 countries currently in the PingER database. Such countries contain one or more sites that are being or have been monitored by PingER. A site is currently monitored in almost all the countries apart from those mentioned in the previous section.
Costa Rica
Croatia Cuba
Czech Republic
New Zealand
Puerto Rico
South Africa
United Kingdom
Viet Nam


[africa] Mike Jensen, "African Internet Connectivity". Available
[africa-rtm] Enrique Canessa, "Real time network monitoring in Africa - A proposal - (Quantifying the Digital Divide)". Available
[bbcp] Andrew Hanushevsky, Artem Trunov, and Les Cottrell, "P2P Data Copy Program bbcp", CHEP01, Beijing 2001. Available at
[bbftp] "Bbftp". Available
[ejds-email] Hilda Cerdeira and the eJDS Team, ICTP/TWAS Donation Programme, "Internet Monitoring of Universities and Research Centers in Developing Countries". Available
[floyd] S. Floyd, "HighSpeed TCP for Large Congestion Windows", Internet draft draft-floyd-tcp-highspeed-01.txt, work in progress, 2002. Available
[gridftp] "The GridFTP Protocol and Software". Available
[icfa-98] "May 1998 Report of the ICFA NTF Monitoring Working Group". Available
[icfa-mar02] "ICFA/SCIC meeting at CERN in March 2002". Available
[iepm] "Internet End-to-end Performance Monitoring - Bandwidth to the World Project". Available
[ictp] Developing Country Access to On-Line Scientific Publishing: Sustainable Alternatives, Round Table meeting held at ICTP Trieste, Oct 2002. Available
[ictp-jensen] Mike Jensen, "Connectivity Mapping in Africa", presentation at the ICTP Round Table on Developing Country Access to On-Line Scientific Publishing: Sustainable Alternatives at ICTP, Trieste, October 2002. Available
[ictp-rec] Recommendations of the Round Table held in Trieste to help bridge the digital divide. Available
[low] S. Low, "Duality model of TCP/AQM + Stabilized Vegas". Available
[mathis] M. Mathis, J. Semke, J. Mahdavi, T. Ott, "The Macroscopic Behavior of the TCP Congestion Avoidance Algorithm", Computer Communication Review, volume 27, number 3, pp. 67-82, July 1997.
[nua] NUA Internet Surveys, "How many Online". Available
[os] "TCP Tuning Guide". Available
[pinger] "PingER". Available; W. Matthews and R. L. Cottrell, "The PingER Project: Active Internet Performance Monitoring for the HENP Community", IEEE Communications Magazine Vol. 38 No. 5 pp 130-136, May 2000.
[pinger-deploy] "PingER Deployment". Available
[qbss] "SLAC QBSS Measurements". Available
[ravot] J. P. Martin-Flatin and S. Ravot, "TCP Congestion Control in Fast Long-Distance Networks", Technical Report CALT-68-2398, California Institute of Technology, July 2002. Available
[tsunami] "Tsunami". Available
[tutorial] R. L. Cottrell, "Tutorial on Internet Monitoring & PingER at SLAC". Available
[un] "United Nations Population Division World Population Prospects Population database". Available

1. Since North America officially includes Mexico, we follow the Encyclopedia Britannica recommendation and use the terminology Anglo America (US + Canada) and Latin America. Unfortunately many of the figures use the term N. America for what should be Anglo America.
h. These countries appear in the Particle Data Group diary and so would appear to have HENP programs.
*. These countries are no longer monitored, usually the host no longer exists, or pings are blocked.