SLAC's HEP Internet Monitoring
Les Cottrell
Last Update: March 31, 1997
Report prepared for CHEP97 Mini Workshop on HEP & the Internet, Berlin, April 5, 1997
Connections
SLAC has an ATM/T3 and two T1 connections (Caltech & LBNL) to the
ESnet Backbone.
In addition there is a 10 Mbps microwave link to Stanford University.
ESnet has international connections directly to Japan/China/Novosibirsk, Brazil, Italy and Germany.
Table 1: ESnet International Connections
Country | Bandwidth Today (Jan-Mar '97) | Plans |
Brazil | 128 kbps to FNAL | |
Canada | 1.54Mbps to PPPL | 1.54Mbps into DC POP |
Germany (DFN) | 1.54Mbps to DC POP | Add part of 2*T3 capacity |
Italy | 1.54Mbps to PPPL | Looking at moving to DC POP. Possibly add capacity via DFN |
Japan | KEK (HEP) to FixW 512kbps; JAERI (Fusion) 256kbps; NIFS (Fusion) 64kbps | KEK 1.54Mbps relocate to LBNL (~6 mos); JAERI 758kbps FR (~3 mos); NIFS 128kbps FR |
SLAC's Monitoring of Internet Connections
SLAC monitors hosts at about
65 remote Internet sites. These are broken down into
about 12 ESnet sites, about 12 sites in western N. America, about 25
sites in eastern N. America, and 13 International sites in about 10 countries.
The monitoring is mainly based on using ping to measure response time,
packet loss, unreachability and unpredictability (see
Tutorial on WAN Monitoring at
SLAC for more information). The hosts are chosen as being at SLAC collaborator sites
or other HEP sites of interest to SLAC. There are some
Requirements for WAN Hosts being Monitored. Over time host names
to be monitored at a site are likely to change, so we report the information
by site rather than by host.
Rating the Connections
For the data gathered between Jan 1995 and March 1996, a typical distribution of
average monthly ping packet loss for N. American and International sites monitored from SLAC
is seen in Figure 1. Table 2 shows the current quality indicators versus packet loss percentages.
Table 2: Quality indicators versus packet loss
Ping Packet Loss | Quality |
0-1% | is good |
1-5% | is acceptable |
5-12% | is poor |
12%-25% | is bad |
> 25% | is unacceptable |
We are refining the above definitions and welcome input on them.
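The thresholds in Table 2 can be applied mechanically. The following is a minimal Python sketch, not code used at SLAC. The function name `loss_quality` is hypothetical, and because adjacent ranges in Table 2 share their end points, the sketch assumes that a loss exactly on a boundary falls into the better category.

```python
from bisect import bisect_left

# Upper bounds (percent ping packet loss) and labels taken from Table 2.
# Assumption: a loss exactly on a boundary belongs to the better category,
# since the table does not say which side the boundary value falls on.
_UPPER_BOUNDS = [1.0, 5.0, 12.0, 25.0]
_LABELS = ["good", "acceptable", "poor", "bad", "unacceptable"]


def loss_quality(loss_percent: float) -> str:
    """Map a ping packet loss percentage to its Table 2 quality label."""
    if not 0.0 <= loss_percent <= 100.0:
        raise ValueError("loss_percent must be between 0 and 100")
    # bisect_left returns the index of the first bound >= loss_percent,
    # which is also the index of the matching label.
    return _LABELS[bisect_left(_UPPER_BOUNDS, loss_percent)]
```

For example, `loss_quality(0.79)` gives "good" and `loss_quality(7.7)` gives "poor".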
Table 3 shows the percentages in the quality categories for
each group of sites for this period.
Table 3: Percentages of host-months in the ping loss quality categories for the various site groups from Jan-95 thru Feb-97
Group of Hosts | % Good | % Accept | % Poor | % Bad | Host-months |
ESnet | 55.6% | 40.1% | 3.2% | 0% | 187 |
N. America W | 9.5% | 37.1% | 37.6% | 15.4% | 221 |
N. America E | 11.5% | 33.7% | 37.9% | 16.5% | 261 |
International | 10.5% | 41.4% | 26.8% | 14.1% | 220 |
The response times, packet loss, unreachability and unpredictability measurements for
the first 11 days of March 1997 are seen below:
In the above plot, the loss and response time are measured during SLAC prime
time (7am - 7pm, weekdays); the other measures cover all hours. The hosts
are grouped according to how they are connected to SLAC and the geographical
location of the site. Within the groups the hosts are ranked by packet loss.
- The loss rates are plotted as a bar graph above the y=0 axis and
are for 100 byte payload ping packets. Horizontal lines
are drawn at packet losses of 1%, 5% and 12%, the boundaries of the
connection qualities defined above.
- The response time is plotted as a blue line
on a log axis, labelled to the right, and is the round trip time for 1000 byte
payload ping packets.
- The host unreachability is plotted as a bar graph extending negatively from
the y=0 axis. A host is deemed unreachable at a 30 minute interval if it did
not respond to any of the 21 pings made at that interval.
- The host unpredictability is plotted in green, here as a negative value.
It can range from 0 (totally unpredictable) to 1 (highly
predictable) and is a measure of the variability of the ping response time
and loss during each 24 hour day. It is defined in more detail in
Ping Unpredictability.
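The prime time window and the unreachability rule described above can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not SLAC's monitoring code. The function names and the `(sent, received)` sample layout are hypothetical, timestamps are assumed to be in SLAC local time, and 7pm itself is treated as outside prime time.

```python
from datetime import datetime


def is_prime_time(when: datetime) -> bool:
    """True on weekdays from 7am up to (not including) 7pm.

    Assumption: 'when' is already expressed in SLAC local time.
    """
    return when.weekday() < 5 and 7 <= when.hour < 19


def loss_percent(samples):
    """Overall packet loss in percent.

    'samples' is a list of (sent, received) ping counts, one pair per
    30 minute interval (a hypothetical layout chosen for this sketch).
    """
    sent = sum(s for s, _ in samples)
    if sent == 0:
        raise ValueError("no pings were sent")
    received = sum(r for _, r in samples)
    return 100.0 * (sent - received) / sent


def unreachable_fraction(samples):
    """Fraction of intervals in which none of the pings were answered."""
    if not samples:
        raise ValueError("no intervals to evaluate")
    unreachable = sum(1 for sent, received in samples if sent > 0 and received == 0)
    return unreachable / len(samples)
```

With four intervals of 21 pings each, `[(21, 21), (21, 0), (21, 18), (21, 21)]`, one interval in four counts as unreachable, so `unreachable_fraction` returns 0.25.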
The following observations are also relevant:
- ESnet hosts in general have good packet loss (median 0.79%). The average
packet losses for the other groups vary from about 4.5% (N. America East)
to 7.7% (International).
Typically 25%-35% of the hosts in the non-ESnet groups are in the poor to bad
range.
- The response time for ESnet hosts averages about 50ms, for N. America West
it is about 80ms, for N. America East about 150ms and for International hosts
around 200ms.
- Most of the unreachability problems are limited to a few hosts, mainly in the
International group (Dresden, Novosibirsk, Florence).
- The unpredictability is most marked for a few International hosts and
roughly tracks the packet loss.
Other Measurements
We do not regularly measure FTP rates. We have made measurements in the past to
study Correlations between FTP & Ping.
We have also developed tools such as:
- a Reverse Traceroute Server to allow
a Web server to provide a traceroute to a Web client.
- Pingroute to trace the route to a host
and ping each host in the path for response time and packet loss.
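The idea behind Pingroute (trace the route, then ping each hop) can be sketched in Python as follows. This is not the SLAC Pingroute tool. It assumes a Unix-like system with the standard `traceroute` and `ping` commands, and all function names are hypothetical. The output format of `ping` varies between systems, so the sketch extracts only the packet loss figure and leaves response time parsing out.

```python
import re
import subprocess
from typing import List, Optional, Tuple

# A hop line in numeric traceroute output: hop number, then an IPv4 address.
HOP_RE = re.compile(r"^\s*\d+\s+(\d{1,3}(?:\.\d{1,3}){3})\b")
# The summary line printed by common ping implementations.
LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% packet loss")


def parse_hops(traceroute_output: str) -> List[str]:
    """Return the IPv4 address of each hop that answered."""
    hops = []
    for line in traceroute_output.splitlines():
        match = HOP_RE.match(line)
        if match:  # header lines and '* * *' hops do not match
            hops.append(match.group(1))
    return hops


def parse_loss(ping_output: str) -> Optional[float]:
    """Return the packet loss percentage, or None if it was not found."""
    match = LOSS_RE.search(ping_output)
    return float(match.group(1)) if match else None


def pingroute(host: str, count: int = 5) -> List[Tuple[str, Optional[float]]]:
    """Trace the route to 'host' and ping every hop along the way."""
    trace = subprocess.run(
        ["traceroute", "-n", host],  # -n: print numeric addresses only
        capture_output=True, text=True, check=False,
    ).stdout
    results = []
    for hop in parse_hops(trace):
        ping = subprocess.run(
            ["ping", "-c", str(count), hop],  # -c: number of pings to send
            capture_output=True, text=True, check=False,
        ).stdout
        results.append((hop, parse_loss(ping)))
    return results
```

Calling `pingroute("ns1.slac.stanford.edu")` would return one `(address, loss)` pair per responding hop; hops that do not answer traceroute are skipped.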
Summary of off-site/international networking changes in next 12 months
See Table 1 for a summary of the ESnet plans. It is also noteworthy that:
- ESnet is exploring providing access between CERN and KEK through ESnet,
probably by establishing a Permanent Virtual Circuit. There are appropriate
use issues that need to be worked out.
- CERN and Caltech have decided to move the US-CERN link to terminate in
the Washington DC POP that ESnet is located in.
- With DFN, Canada and Italy also moving or considering moving to the DC POP,
this should simplify interconnections with ESnet and improve connectivity.
ESnet will soon (by May 1997) have a West Coast US connection to vBNS at the SDSC.
vBNS will probably be the major backbone for Internet II. Many universities
of interest to HEP are part of Internet II.
ESnet is putting a T3 to the Chicago NAP. vBNS is also located at the Chicago
NAP. Recently ESnet established peering with the UC campuses via UC Berkeley.
Resources
SLAC has roughly 30% of a full time equivalent (FTE) person working on Internet
connectivity, including monitoring.
If you wish to ping a host at SLAC then I recommend
a lightly loaded name server (ns1.slac.stanford.edu). The contacts for Internet
monitoring at SLAC are:
Our reverse traceroute server can be found at:
http://www.slac.stanford.edu/cgi-bin/nph-traceroute.pl.