Problems with Stanford Connectivity, September 2005
Les Cottrell. Page created: September 24, 2005
The alert indicated that there had been a drop of about 50% (from about 600 Mbits/s to about 300 Mbits/s) in the achievable throughput measured by the thrulay probe from SLAC to Caltech at 11:43am on 9/10/2005. The email also provided information on the routes before and after the change, and time series measurements from the various probes.
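As a rough illustration of the size of the drop reported in the alert, the following minimal Python sketch compares the mean throughput in a window before the change with the mean in a window after it. This is not the detection algorithm that generated the alert; the function names, the 30% threshold, and the sample values (chosen only to mirror the roughly 600 to 300 Mbits/s change) are all assumptions made for illustration.

```python
from statistics import mean


def relative_drop(history, recent):
    """Fractional drop of the recent mean relative to the history mean."""
    baseline = mean(history)
    if baseline <= 0:
        raise ValueError("baseline throughput must be positive")
    return (baseline - mean(recent)) / baseline


def is_step_down(history, recent, threshold=0.3):
    """True if the recent window is at least `threshold` below the baseline.

    The 0.3 default is a hypothetical value, not one taken from the alert.
    """
    return relative_drop(history, recent) >= threshold


# Illustrative values only (Mbits/s), not measured data.
before = [610.0, 590.0, 600.0, 605.0, 595.0]
after = [310.0, 290.0, 300.0, 295.0, 305.0]

drop = relative_drop(before, after)
print(f"drop = {drop:.0%}, alert = {is_step_down(before, after)}")
```

Comparing window means rather than single samples keeps one noisy measurement from triggering an alert, at the cost of a delay of one window length before a real step is reported.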
Looking at the topology map from the above nodes for this day, the routes can be seen to have switched from CENIC (green)/Abilene (blue) to ESnet (red). The ESnet maintenance calendar at http://calendar.es.net/cgi-bin/pmcalendar.pl showed no scheduled maintenance between these times. Drilling down from http://cricket.cenic.org/grapher.cgi to utilization > CENIC backbone hpr-routers > hpr-svl-1ge-summary (multiple targets) Octets > weekly, one could see that the Stanford router lost all its traffic, while another Sunnyvale router picked up a lot of extra traffic.
The thrulay time series showed the step down (and later step up) in performance. As expected, we saw similar effects in the iperf probe time series, though the effect on the multi-stream iperf was less than on the single-stream iperf (presumably because the multi-stream iperf is less friendly and pushes aside other traffic on the more congested backup link). The effect was not noticeable in the pathchirp probe time series. Looking at the ABwE and ping RTT results, there was no visible effect on the ABwE dynamic bandwidth capacity. There was a slight effect visible on the available bandwidth, though we did not automatically detect it. On the other hand, the minimum RTT showed a dramatic effect. This was also visible to the other affected sites, such as U Florida.
Later we found the official Cisco Field Notice about the generic fan problems.