ESnet outage because of repair work between Sunnyvale and San Diego
Adnan Iqbal and Les Cottrell. Page created: Jan 23, 2006
[mailto:firstname.lastname@example.org] On Behalf Of operator
Sent: Monday, January 23, 2006 6:00 AM
Subject: [esnet-status] TTS#14413 SNV2-SDN1 <-10GE-> SDSC-SDN1 (NLR-SAND-SUNN-10GE-45) UP, 1/23
Sunnyvale connectivity to San Diego via the 10GE was down this morning at 02:06PT (1/23) due to a maintenance by Level(3), at Tustin, CA.
The Circuit (NLR-SAND-SUNN-10GE-45) was restored at 03:34PT.
ESnet - The Energy Science Network 1 800-33-ESnet (1 800-333-7638)
Network Operations & Management Center +1 510-486-7600 (Outside of USA)
To report problems via Email: email@example.com
To request information via Email: firstname.lastname@example.org
To view an open trouble ticket: finger <ticket-number>@ticket.es.net
ESnet - Connecting people, information and resources. http://www.es.net
Given the above information, we wanted to see how this ~90 minute outage affected our network measurements. Only one of the paths we were monitoring, SLAC to SDSC, showed a route change. In the traceroutes, which are measured at 10 minute intervals, the route changed between 2:06:18 and 2:16:35 and reverted between 3:36:52 and 3:46:50. This agrees well with the times of 2:06 and 3:34 given in the ESnet email report.
We examined the data from our measurements to SDSC to look for any impact and to correlate it with any drop in performance. As expected, the route changed soon after the outage started: traffic that had been flowing through 22.214.171.124 (Sunnyvale) started flowing through 126.96.36.199 (Oakland) instead. Traffic stayed on the new path until the end of the outage, and the original path returned quickly afterwards, i.e. within about 10 minutes.
Comparing the data from the different tools, the effect is visible in pathchirp and ping but is barely discernible in the thrulay data. The pathchirp estimate dropped by roughly an order of magnitude, from about 1 Gbits/s to about 120 Mbits/s. Ping showed a decrease in both the minimum and the maximum round trip time during this period, with the minimum dropping by about 1 msec. In other words, the new route has lower latency but also lower available bandwidth. What change thrulay did show appeared somewhat later than in pathchirp and ping.
None of the tools showed a persistent (> 6 hour) change, so no alert was produced.
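The comparison between the traceroute times and the ESnet report can be sketched as follows. This is only an illustration, not the code used for the measurements: the helper names (`at`, `change_windows`, `lag_bounds`) are hypothetical, each route is reduced to the name of its first differing hop, the samples taken between 2:16:35 and 3:36:52 are omitted, and the traceroute timestamps are assumed to be in the same time zone (PT) as the email.

```python
from datetime import datetime, timedelta

def at(hms):
    """Parse an HH:MM or HH:MM:SS time of day on 2006-01-23."""
    fmt = "%H:%M:%S" if hms.count(":") == 2 else "%H:%M"
    return datetime.strptime("2006-01-23 " + hms, "%Y-%m-%d " + fmt)

def change_windows(samples):
    """Return (last sample on old route, first sample on new route) pairs."""
    windows = []
    for (t_prev, r_prev), (t_next, r_next) in zip(samples, samples[1:]):
        if r_prev != r_next:
            windows.append((t_prev, t_next))
    return windows

def lag_bounds(reported, window):
    """Earliest and latest possible delay of a change after a reported time."""
    return window[0] - reported, window[1] - reported

# Each route is summarised by its first hop beyond i2-gateway.stanford.edu.
VIA_SUNNYVALE = "hpr-svl-hpr--stan-ge.cenic.net"
VIA_OAKLAND = "hpr-oak-hpr--stan-ge.cenic.net"

# Only the four samples that bracket the two changes are listed here.
samples = [
    (at("02:06:18"), VIA_SUNNYVALE),
    (at("02:16:35"), VIA_OAKLAND),
    (at("03:36:52"), VIA_OAKLAND),
    (at("03:46:50"), VIA_SUNNYVALE),
]

windows = change_windows(samples)
onset_lag = lag_bounds(at("02:06"), windows[0])   # circuit reported down
revert_lag = lag_bounds(at("03:34"), windows[1])  # circuit reported restored

for label, (lo, hi) in (("route change", onset_lag),
                        ("route reversion", revert_lag)):
    print(f"{label}: {lo.total_seconds():.0f} s to "
          f"{hi.total_seconds():.0f} s after the reported time")
```

With the times quoted above, the route change falls between 18 s and 10 min 35 s after the reported start of the outage, and the reversion between 2 min 52 s and 12 min 50 s after the reported restoration. The 10 minute sampling interval sets the width of each window, so the traceroutes cannot pin the change down more tightly than that.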
In summary, the event was most clearly detectable with the traceroute and ping tools. It was also clearly visible with the pathchirp available bandwidth tool, but not with the thrulay achievable throughput tool. The event was too short for our event analysis toolkit to detect. Graphs of the data and a table describing the route change are presented below.
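Why the event raised no alert can be illustrated with a simple persistence check. This is a simplified stand-in and not the actual logic of the event analysis toolkit, which is not described here; the function name `is_persistent` is hypothetical, and the only values taken from the text are the 6 hour persistence requirement and the outage times from the ESnet email.

```python
from datetime import datetime, timedelta

# A change has to last longer than this before an alert is produced.
PERSISTENCE_THRESHOLD = timedelta(hours=6)

def is_persistent(start, end, threshold=PERSISTENCE_THRESHOLD):
    """True if a change lasting from start to end exceeds the threshold."""
    return (end - start) > threshold

# Times from the ESnet status email (PT).
down = datetime(2006, 1, 23, 2, 6)
restored = datetime(2006, 1, 23, 3, 34)

outage = restored - down
print("outage lasted", outage)
print("alert produced:", is_persistent(down, restored))
```

The outage lasted 88 minutes, well under the 6 hour requirement, so a check of this kind does not fire.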
|Route Before Outage||Route During Outage||Route After Outage|
|rtr-gsr-test 188.8.131.52 0.276 ms||rtr-gsr-test 184.108.40.206 0.268 ms||rtr-gsr-test 220.127.116.11 0.276 ms|
|rtr-core1-p2p-test 18.104.22.168 0.243 ms||rtr-core1-p2p-test 22.214.171.124 0.256 ms||rtr-core1-p2p-test 126.96.36.199 0.243 ms|
|rtr-dmz1-ger 188.8.131.52 0.232 ms||rtr-dmz1-ger 184.108.40.206 0.209 ms||rtr-dmz1-ger 220.127.116.11 0.232 ms|
|i2-gateway.stanford.edu 18.104.22.168 0.268 ms||i2-gateway.stanford.edu 22.214.171.124 0.281 ms||i2-gateway.stanford.edu 126.96.36.199 0.268 ms|
|hpr-svl-hpr--stan-ge.cenic.net 188.8.131.52 0.765 ms||hpr-oak-hpr--stan-ge.cenic.net 184.108.40.206 1.586 ms||hpr-svl-hpr--stan-ge.cenic.net 220.127.116.11 0.765 ms|
|lax-hpr--svl-hpr-10ge.cenic.net 18.104.22.168 42.775 ms||sac-hpr--oak-hpr-10ge.cenic.net 22.214.171.124 3.152 ms||lax-hpr--svl-hpr-10ge.cenic.net 126.96.36.199 42.775 ms|
|riv-hpr--lax-hpr-10ge.cenic.net 188.8.131.52 14.534 ms||riv-hpr--sac-hpr-10ge.cenic.net 184.108.40.206 11.847 ms||riv-hpr--lax-hpr-10ge.cenic.net 220.127.116.11 14.534 ms|
|hpr-sdsc-sdsc2--riv-hpr-ge.cenic.net 18.104.22.168 14.162 ms||hpr-sdsc-sdsc2--riv-hpr-ge.cenic.net 22.214.171.124 12.957 ms||hpr-sdsc-sdsc2--riv-hpr-ge.cenic.net 126.96.36.199 14.162 ms|
|lightning.sdsc.edu 188.8.131.52 14.285 ms||lightning.sdsc.edu 184.108.40.206 12.986 ms||lightning.sdsc.edu 220.127.116.11 14.285 ms|
|node1.sdsc.edu 132.249.xxx.xxx 14.185 ms||node1.sdsc.edu 132.249.xxx.xxx 12.948 ms||node1.sdsc.edu 132.249.xxx.xxx 14.185 ms|
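The hops that changed can be picked out of the table mechanically. The sketch below is illustrative only: it copies the host names and round trip times from the "before" and "during" columns, leaves out the IP addresses, and uses a hypothetical helper `differing_hops`.

```python
# (host name, RTT in ms) per hop, copied from the table above.
BEFORE = [
    ("rtr-gsr-test", 0.276),
    ("rtr-core1-p2p-test", 0.243),
    ("rtr-dmz1-ger", 0.232),
    ("i2-gateway.stanford.edu", 0.268),
    ("hpr-svl-hpr--stan-ge.cenic.net", 0.765),
    ("lax-hpr--svl-hpr-10ge.cenic.net", 42.775),
    ("riv-hpr--lax-hpr-10ge.cenic.net", 14.534),
    ("hpr-sdsc-sdsc2--riv-hpr-ge.cenic.net", 14.162),
    ("lightning.sdsc.edu", 14.285),
    ("node1.sdsc.edu", 14.185),
]
DURING = [
    ("rtr-gsr-test", 0.268),
    ("rtr-core1-p2p-test", 0.256),
    ("rtr-dmz1-ger", 0.209),
    ("i2-gateway.stanford.edu", 0.281),
    ("hpr-oak-hpr--stan-ge.cenic.net", 1.586),
    ("sac-hpr--oak-hpr-10ge.cenic.net", 3.152),
    ("riv-hpr--sac-hpr-10ge.cenic.net", 11.847),
    ("hpr-sdsc-sdsc2--riv-hpr-ge.cenic.net", 12.957),
    ("lightning.sdsc.edu", 12.986),
    ("node1.sdsc.edu", 12.948),
]

def differing_hops(route_a, route_b):
    """Return (hop number, host in a, host in b) where the host names differ."""
    return [
        (hop, host_a, host_b)
        for hop, ((host_a, _), (host_b, _))
        in enumerate(zip(route_a, route_b), start=1)
        if host_a != host_b
    ]

changed = differing_hops(BEFORE, DURING)
for hop, old, new in changed:
    print(f"hop {hop}: {old} -> {new}")

# End-to-end RTT difference at the final hop (node1.sdsc.edu).
rtt_drop_ms = round(BEFORE[-1][1] - DURING[-1][1], 3)
print("final-hop RTT drop:", rtt_drop_ms, "ms")
```

Only hops 5 to 7 differ: the Sunnyvale and Los Angeles routers are replaced by Oakland and Sacramento ones. The final-hop round trip time in these single traceroute samples is 1.237 ms lower during the outage, in line with the drop of about 1 msec seen in the minimum ping.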