
Network connectivity between SLAC and Platform Computing January 2001

Les Cottrell. Page created: January 12, 2001.


Introduction

On January 12, 2001, we received email from Ed Russell of SLAC, saying:

For the last two days [at least] slow response to Platform Computing
seems to be hung on this Teleglobe.net stuff.  Do we have any way to
improve the performance there or bypass it?

Route

The traceroute from SLAC to ftp.platform.com shows that there is a problem after hop 13. There also appears to be a long delay between Scarborough.net and the next site (see hop 12). Scarborough.Teleglobe.net is in Quebec, Canada (see http://cello.cs.uiuc.edu/cgi-bin/slamm/ip2ll/). Note that two traceroutes are shown: the first is from a Windows NT machine (the Windows NT tracert sends ICMP echo requests as its probes), the second from a Sun Solaris machine (whose traceroute sends UDP probes by default). The Windows NT traceroute can detect the host greyowl.platform.com whereas the Solaris traceroute cannot. Pings also showed that there was about 20% loss from SLAC to Platform (16% in the sample below).
----greyowl.platform.com PING Statistics----
202 packets transmitted, 168 packets received, 16% packet loss
round-trip (ms)  min/avg/max = 122/172/284
We also tried traceroutes from Stanford University, which showed a different route (as would be expected). Again there are two traceroutes: the first was done from a Windows NT machine, whose tracert uses ICMP echo requests rather than UDP for its probes; the second was done from a Sun Solaris machine. Again the Solaris traceroute does not detect the ftp.platform.com machine. Since pings (ICMP echo) do reach the host, it is possible that ftp.platform.com is blocked from receiving the UDP probes. Also, the packet loss from Stanford University to Platform was lower (< 10%) than from SLAC.
---- greyowl.platform.com (192.219.104.253) PING Statistics ----
202 packets transmitted, 186 packets received, 7.9% packet loss
round-trip (ms) min/avg/max = 105/164/331 (std = 39.3)
The path characterization from SLAC indicates congestion in the Teleglobe.net network.
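
The loss percentages in the ping summaries on this page can be recomputed from the transmitted and received counts. The following is a minimal Python sketch, not part of the original measurement tools; the function names and the regular expression are our own assumptions about the summary-line format shown here.

```python
import re

# Matches the summary line format shown on this page, e.g.
# "202 packets transmitted, 168 packets received, 16% packet loss"
SUMMARY = re.compile(r"(\d+) packets transmitted, (\d+) packets received")


def loss_percent(transmitted, received):
    """Return packet loss as a percentage of packets transmitted."""
    if transmitted <= 0:
        raise ValueError("transmitted must be positive")
    return 100.0 * (transmitted - received) / transmitted


def loss_from_summary(line):
    """Extract the counts from a ping summary line and compute the loss."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError("not a ping summary line: %r" % line)
    return loss_percent(int(match.group(1)), int(match.group(2)))


if __name__ == "__main__":
    # The two summaries quoted in the Route section (SLAC, then Stanford).
    for line in (
        "202 packets transmitted, 168 packets received, 16% packet loss",
        "202 packets transmitted, 186 packets received, 7.9% packet loss",
    ):
        print("%.1f%% loss" % loss_from_summary(line))
```

Note that ping appears to truncate rather than round in the first case: 34 lost out of 202 is 16.8%, reported as 16%.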

We used pingroute to see if we could find where the pings start to be lost. It indicates that from SLAC the losses start either at the border with ESnet or in the Teleglobe network. However, the pingroute from Stanford does not show much loss in the Teleglobe network. From this we may deduce that the problem may be at the ESnet-Teleglobe border.
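
The idea of locating where losses start can be sketched as follows. This is not the pingroute code itself; it is a minimal illustration that assumes we already have a loss percentage for each hop along the route. The function name, the 5% threshold, and the example numbers are all hypothetical. A hop is only taken as the start of the problem if every later hop also shows loss, since an isolated lossy hop may simply be a router that gives low priority to answering pings.

```python
def first_persistent_loss_hop(loss_by_hop, threshold=5.0):
    """Return the 1-based hop number where loss starts and persists.

    loss_by_hop: loss percentages, one per hop, in route order.
    threshold:   assumed minimum loss (percent) that counts as "lossy".
    Returns None if the final hop itself is not lossy.
    """
    start = None
    # Walk backwards from the destination while hops remain lossy.
    for hop in range(len(loss_by_hop), 0, -1):
        if loss_by_hop[hop - 1] >= threshold:
            start = hop
        else:
            break
    return start


if __name__ == "__main__":
    # Illustrative numbers only, not the measured pingroute data.
    example = [0.0, 0.0, 1.0, 0.0, 12.0, 15.0, 14.0]
    print(first_persistent_loss_hop(example))
```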

We also used synack to see whether there was any ICMP rate limiting between SLAC and ftp.platform.com, but both syn/acks and pings were being lost at roughly the same rate (see Ping vs. Synack for plots of the data and ICMP Rate Limiting for a description of the methodology).
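
One way to make "roughly the same rate" concrete is a two-proportion z-test on the two loss counts. The sketch below is our own illustration, not the methodology described in ICMP Rate Limiting; the function names, the 1.96 cutoff, and the example counts are assumptions. If ping loss is significantly higher than syn/ack loss, ICMP rate limiting would be suspected.

```python
import math


def two_proportion_z(lost_a, sent_a, lost_b, sent_b):
    """z statistic for the difference between two loss proportions."""
    p_a = lost_a / sent_a
    p_b = lost_b / sent_b
    pooled = (lost_a + lost_b) / (sent_a + sent_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / sent_a + 1.0 / sent_b))
    if se == 0.0:
        # No loss at all (or total loss) in both samples: no difference.
        return 0.0
    return (p_a - p_b) / se


def icmp_rate_limiting_suspected(ping_lost, ping_sent, syn_lost, syn_sent,
                                 z_crit=1.96):
    """True if ping loss is significantly higher than syn/ack loss.

    z_crit is an assumed cutoff; choose it to suit the sample sizes.
    """
    return two_proportion_z(ping_lost, ping_sent, syn_lost, syn_sent) > z_crit


if __name__ == "__main__":
    # Hypothetical counts: similar loss rates, so no rate limiting suspected.
    print(icmp_rate_limiting_suspected(34, 202, 30, 200))
```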

Historical performance

Unfortunately, performance between SLAC and Platform Computing was not measured by either IEPM/PingER or Surveyor, so we added ftp.platform.com to the PingER measurements. The PingER loss and RTT graph indicates the RTT is fairly steady at about 125 ms, but the loss varies from 0 to 40%.

Reporting problem

We used the list of NOCs to identify whom to report the problem to at Teleglobe, and reported it at 10:32am PST on 1/12/01. We also reported the problem to ESnet. See the email concerning the ticket for the follow-up.

Change in problem

On 1/20/01 Teleglobe observed that the route from SLAC to Platform was no longer via Teleglobe but via UUnet/Alternet. To follow up on this, on 1/21/01 around 8:00am PST we measured the ping packet loss from flora.slac.stanford.edu to ftp.platform.com and noted the loss rate was still over 20%:
----greyowl.platform.com PING Statistics----
360 packets transmitted, 266 packets received, 26% packet loss
round-trip (ms)  min/avg/max = 136/143/306
We confirmed that the route from SLAC to Platform was now via Alternet. We also noted that the route from Stanford University to Platform was still via Teleglobe and had low (<1%) packet loss. We ran pingroute again from flora.slac.stanford.edu to ftp.platform.com to see if we could observe where the losses on this route started. Losses appear to start at hop 9. Note that hop 8 at 192.41.177.249 is reported by nslookup to be part of Alternet (br2.tco1.alter.net) and located near Washington according to the Hostname to Lat/Long service. The packet loss gets much worse between hops 14 and 15. We were unable to find any information on 157.130.159.54 or 216.18.63.94.

At the same time the route from Platform to SLAC was still using Teleglobe, thus the routing had become asymmetric.
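
A simple check for this kind of asymmetry is to compare the provider domains seen in the forward and reverse traceroutes. The following Python sketch is illustrative only: the function names are hypothetical, the hop lists are short excerpts built from hostnames mentioned on this page rather than complete traceroutes, and taking the last two labels of a hostname as the "provider domain" is a simplifying assumption.

```python
import ipaddress


def transit_domains(hops):
    """Return the set of two-label domains (e.g. 'alter.net') in a hop list.

    Hops that are bare IP addresses (no reverse DNS) are skipped.
    """
    domains = set()
    for hop in hops:
        try:
            ipaddress.ip_address(hop)
            continue  # unresolved hop, no domain information
        except ValueError:
            pass
        labels = hop.lower().rstrip(".").split(".")
        if len(labels) >= 2:
            domains.add(".".join(labels[-2:]))
    return domains


def is_asymmetric(forward_hops, reverse_hops, endpoint_domains=frozenset()):
    """True if the two directions cross different sets of transit domains."""
    forward = transit_domains(forward_hops) - set(endpoint_domains)
    reverse = transit_domains(reverse_hops) - set(endpoint_domains)
    return forward != reverse


if __name__ == "__main__":
    # Excerpts only: SLAC -> Platform via Alternet, Platform -> SLAC via Teleglobe.
    forward = ["br2.tco1.alter.net", "157.130.159.54"]
    reverse = ["Scarborough.Teleglobe.net"]
    print(is_asymmetric(forward, reverse))
```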

We therefore used Sting to see whether the one-way packet losses were also asymmetric. The results indicate that the losses are much higher (by almost a factor of 10) on the route from Platform to SLAC.
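
To relate one-way losses to the round-trip loss seen by ping, one can assume that losses in the two directions are independent, in which case a probe survives the round trip only if it survives both legs. The sketch below encodes that relation; the function name and the example loss rates are hypothetical and are not the Sting results.

```python
def round_trip_loss(forward_loss, reverse_loss):
    """Round-trip loss fraction from one-way loss fractions.

    Assumes losses in the two directions are independent, so the
    round-trip survival probability is the product of the one-way
    survival probabilities.
    """
    for value in (forward_loss, reverse_loss):
        if not 0.0 <= value <= 1.0:
            raise ValueError("loss fractions must be between 0 and 1")
    return 1.0 - (1.0 - forward_loss) * (1.0 - reverse_loss)


if __name__ == "__main__":
    # Illustrative values: reverse loss about ten times the forward loss.
    print("%.3f" % round_trip_loss(0.025, 0.24))
```

With a strongly asymmetric split such as this one, the round-trip loss is dominated by the lossier direction, which is why ping alone cannot tell us which direction is at fault.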


Page owner: Les Cottrell