Throughput Performance between SLAC and ILAN (Israel)
Les Cottrell. Page created: September 3, 2000.
Tracing the route to WWWSLUG.SLAC.Stanford.EDU (134.79.18.131)
  1 gp1-mag.ilan.net.il (128.139.198.80) 4 msec 4 msec 4 msec
  2 tau-gp2-fe-i2.ilan.net.il (192.114.99.33) 4 msec 4 msec 4 msec
  3 chi-gp3-0.ilan.net.il (192.114.99.65) 560 msec 556 msec 556 msec
  4 chi-gp4-fe-i2.ilan.net.il (192.114.101.34) 560 msec 556 msec 556 msec
  5 ESnet-ILAN.ilan.net.il (192.114.98.33) 372 msec 372 msec 372 msec
  6 slac1-atms.es.net (134.55.24.13) 420 msec 424 msec 556 msec
  7 RTR-DMZ.SLAC.Stanford.EDU (192.68.191.17) 424 msec 420 msec 420 msec

The route from SLAC (with the AS of each hop), as seen by the SLAC reverse traceroute server, is shown below.
  4 ESNET-A-GATEWAY.SLAC.Stanford.EDU (192.68.191.18) [AS32 - Stanford Linear Accelerator Center] 2.38 ms (ttl=252)
  5 pppl4-atms.es.net (134.55.24.10) [AS293 - Energy Sciences Network (ESnet)] 60.6 ms (ttl=251)
  6 60hudson-pppl.es.net (134.55.43.2) [AS293 - Energy Sciences Network (ESnet)] 64.1 ms (ttl=249!)
  7 esnet.ny.dante.net (212.1.200.217) [AS9010 - TEN-155/TEN-US Backbone] 64.2 ms (ttl=249)
  8 ny2-ny3.ny2.ny.dante.net (212.1.200.110) [AS9010 - TEN-155/TEN-US Backbone] 64.8 ms (ttl=248)
  9 il-us.il.dante.net (212.1.200.70) [AS9010 - TEN-155/TEN-US Backbone] 417 ms (ttl=248!)
 10 tau-gp1-fe-i1.ilan.net.il (192.114.99.50) [AS701 - Architectual & Computer Aids] 417 ms (ttl=247!)
 11 buproxy.iucc.ac.il (128.139.197.25) [AS378 - ILAN-AND-HUJI] 418 ms (ttl=246!)

The pingroute from SLAC to buproxy.ac.il on September 3 is shown below. It appears to indicate a problem for large (1400 byte) packets between hops 8 and 9 (probably in DANTE): loss for 1400 byte packets goes from 0% at hop 8 to 22% at hop 9, while loss for 100 byte packets is only 1% at hop 9. This happens even though the link is lightly loaded (see http://noc.ilan.net.il/stats/TAU-GIGAPOP/il-us.il.dante.net.html). The loss pattern (low for small packets, high for large packets) may hint at an ATM issue that results in cell loss.
6cottrell@flora01:~>bin/pingroute.pl -c 100 buproxy.ac.il
Sun Sep 3 10:18:45 2000
Architecture=SUN5, commands=traceroute -q 1 and ping -s node 1400 100, pingroute.pl version=1.4, 5/16/00, debug=1
pingroute.pl version 1.4, 5/16/00 using traceroute to get nodes in route from flora01 to buproxy.ac.il
traceroute: Warning: ckecksums disabled
traceroute to buproxy.iucc.ac.il (128.139.197.25), 30 hops max, 40 byte packets
pingroute.pl version 1.4, 5/16/00 found 11 hops in route from flora01 to buproxy.ac.il
  4 ESNET-A-GATEWAY.SLAC.Stanford.EDU (192.68.191.18) 0.974 ms
  5 pppl4-atms.es.net (134.55.24.10) 60.462 ms
  6 60hudson-pppl.es.net (134.55.43.2) 62.903 ms
  7 esnet.ny.dante.net (212.1.200.217) 63.676 ms
  8 ny2-ny3.ny2.ny.dante.net (212.1.200.110) 64.363 ms
  9 il-us.il.dante.net (212.1.200.70) 416.979 ms
 10 tau-gp1-fe-i1.ilan.net.il (192.114.99.50) 416.958 ms
 11 buproxy.iucc.ac.il (128.139.197.25) 417.550 ms
Wrote 11 addresses to /tmp/pingaddr, now ping each address 100 times from flora01
pings/node=100                                 100 byte packets             1400 byte packets
NODE                                           %loss    min    max    avg  %loss    min    max    avg  from flora01
192.68.191.18   ESNET-A-GATEWAY.SLAC.STANFORD.    0%    0.0  144.0    3.0     0%    2.0  207.0    5.0  Sun Sep 3 10:28:41 PDT 2000
134.55.24.10    PPPL4-ATMS.ES.NET                 0%   60.0  257.0   63.0     0%   62.0  173.0   65.0  Sun Sep 3 10:31:59 PDT 2000
134.55.43.2     60HUDSON-PPPL.ES.NET              0%   62.0   66.0   62.0     0%   66.0   67.0   66.0  Sun Sep 3 10:35:17 PDT 2000
212.1.200.217   ESNET.NY.DANTE.NET                0%   63.0  297.0   67.0     0%   66.0  213.0   70.0  Sun Sep 3 10:38:35 PDT 2000
212.1.200.110   NY2-NY3.NY2.NY.DANTE.NET          0%   64.0  278.0   70.0     0%   68.0  253.0   71.0  Sun Sep 3 10:41:53 PDT 2000
212.1.200.70    IL-US.IL.DANTE.NET                1%  416.0  463.0  417.0    22%  422.0  424.0  422.0  Sun Sep 3 10:45:12 PDT 2000
192.114.99.50   TAU-GP1-FE-I1.ILAN.NET.IL         0%  417.0  421.0  417.0     5%  423.0  425.0  423.0  Sun Sep 3 10:48:32 PDT 2000
128.139.197.25  BUPROXY.IUCC.AC.IL                0%  417.0  430.0  418.0    21%  423.0  440.0  424.0  Sun Sep 3 10:51:51 PDT 2000
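The per-hop figures above can be scanned mechanically to find where the large-packet loss begins. What follows is a minimal Python sketch and is not part of pingroute.pl. The loss percentages are copied from the pingroute output; the function name first_lossy_hop and the 1% threshold are illustrative assumptions.

```python
# Per-hop loss copied from the pingroute output above:
# (node, % loss for 100 byte packets, % loss for 1400 byte packets)
HOPS = [
    ("ESNET-A-GATEWAY.SLAC.STANFORD.", 0, 0),
    ("PPPL4-ATMS.ES.NET", 0, 0),
    ("60HUDSON-PPPL.ES.NET", 0, 0),
    ("ESNET.NY.DANTE.NET", 0, 0),
    ("NY2-NY3.NY2.NY.DANTE.NET", 0, 0),
    ("IL-US.IL.DANTE.NET", 1, 22),
    ("TAU-GP1-FE-I1.ILAN.NET.IL", 0, 5),
    ("BUPROXY.IUCC.AC.IL", 0, 21),
]


def first_lossy_hop(rows, threshold=1.0):
    """Return (previous_node, node) for the first hop whose large-packet
    loss exceeds `threshold` percent, or None if no hop does.

    The threshold is a hypothetical cut-off chosen for illustration.
    """
    for i, (node, _small_loss, large_loss) in enumerate(rows):
        if large_loss > threshold:
            previous = rows[i - 1][0] if i > 0 else None
            return previous, node
    return None


segment = first_lossy_hop(HOPS)
print("Large-packet loss first appears between:", segment)
```

With the data above, the first lossy segment is the one between NY2-NY3.NY2.NY.DANTE.NET and IL-US.IL.DANTE.NET, which is hops 8 and 9 of the route.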
3cottrell@flora02:~>pathchar buproxy.ac.il
pathchar to buproxy.iucc.ac.il (128.139.197.25)
 mtu limitted to 1500 bytes at FLORA02.SLAC.Stanford.EDU (134.79.16.57)
 doing 32 probes at each of 64 to 1500 by 44
 0 Host
 |    25 Mb/s,   211 us (0.90 ms)
 1 router1
 |    93 Mb/s,   175 us (1.38 ms)
 2 router2
 |    91 Mb/s,   55 us (1.62 ms)
 3 border router
    -> 134.79.111.4 (22920)
 |   111 Mb/s,   -111 us (1.51 ms)
 4?ESNET-A-GATEWAY.SLAC.Stanford.EDU (192.68.191.18)
    -> 192.68.191.18 (1)
 |    27 Mb/s,   29.8 ms (61.5 ms)
 5?pppl4-atms.es.net (134.55.24.10)
    -> 134.55.24.10 (1)
 |    27 Mb/s,   1.08 ms (64.1 ms)
 6?60hudson-pppl.es.net (134.55.43.2)
    -> 134.55.43.2 (1)
 |    43 Mb/s,   408 us (65.2 ms)
 7?esnet.ny.dante.net (212.1.200.217)
    -> 212.1.200.217 (1)
 |    53 Mb/s,   260 us (66.0 ms)
 8?ny2-ny3.ny2.ny.dante.net (212.1.200.110)
 |   8.3 Mb/s,   176 ms (420 ms),  7% dropped
 9 il-us.il.dante.net (212.1.200.70)
    -> 212.1.200.70 (4)
 |    36 Mb/s,   84 us (421 ms),  8% dropped
10?tau-gp1-fe-i1.ilan.net.il (192.114.99.50)
    -> 192.114.99.50 (4)
 |    27 Mb/s,   2 us (421 ms),  8% dropped
11?buproxy.iucc.ac.il (128.139.197.25)
11 hops, rtt 417 ms (421 ms), bottleneck 8.3 Mb/s, pipe 436669 bytes
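The "pipe" figure in the last line of the pathchar output is close to the bandwidth-delay product of the path, that is, the bottleneck bandwidth multiplied by the round-trip time. The short Python sketch below redoes that arithmetic from the rounded values pathchar printed (8.3 Mb/s and 421 ms). The function name is an illustrative assumption, and the small difference from 436669 bytes presumably comes from pathchar using unrounded internal values.

```python
def bandwidth_delay_product_bytes(bandwidth_bits_per_s, rtt_seconds):
    """Bytes that must be in flight to keep a path of the given
    bottleneck bandwidth and round-trip time full."""
    return bandwidth_bits_per_s * rtt_seconds / 8.0


# Rounded values taken from the pathchar summary line above.
BOTTLENECK_BPS = 8.3e6   # "bottleneck 8.3 Mb/s"
RTT_SECONDS = 0.421      # "(421 ms)"
PATHCHAR_PIPE = 436669   # "pipe 436669 bytes"

bdp = bandwidth_delay_product_bytes(BOTTLENECK_BPS, RTT_SECONDS)
print(f"bandwidth-delay product: about {bdp / 1000:.0f} kbytes")
print(f"difference from pathchar pipe: {abs(bdp - PATHCHAR_PIPE):.0f} bytes")
```

The result is roughly 437 kbytes, within a fraction of a percent of the value pathchar reports.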
Late on September 3, an email from Joe Burrescia of the ESnet NOC said:

I believe this trouble can be traced to the ESnet upgrade of our router connecting to DANTE from a cisco to a Juniper. The bgp syntax is just different enough that we missed matching a local-pref and the route through DANTE was preferred. I've corrected the problem and the route to is now once again preferring our peering at Chicago. I apologize for all the trouble this has caused.

The route from SLAC to ILAN on September 4th is shown below. At this time the route appeared to go via the Chicago STAR-TAP.
  4 ESNET-A-GATEWAY.SLAC.Stanford.EDU (192.68.191.18) [AS32 - Stanford Linear Accelerator Center] 1.33 ms (ttl=252)
  5 chicago1-atms.es.net (134.55.24.17) [AS293 - Energy Sciences Network (ESnet)] 57.6 ms (ttl=251)
  6 ILAN-ESnet.ilan.net.il (192.114.98.34) [AS701 - Cimatron CAD/CAM Systems] 58.9 ms (ttl=250)
  7 chi-gp3-fe-i2.ilan.net.il (192.114.101.33) [AS4617 - Amdocs, Inc.] 59.1 ms (ttl=249)
  8 tau-gp2-s0.ilan.net.il (192.114.99.66) [AS701 - Architectual & Computer Aids] 611 ms (ttl=248)
  9 tau-gp1-fe-i2.ilan.net.il (192.114.99.34) [AS701 - Architectual & Computer Aids] 612 ms (ttl=247)
 10 buproxy.iucc.ac.il (128.139.197.25) [AS378 - ILAN-AND-HUJI] 611 ms (ttl=246)

Also on September 4th, an email from Oded Comay said:

Fixing ESnet routing scheme probably solved the problem we had with SLAC. However, we seem to have a much larger problem with DANTE, which is about to become our only link for few weeks now. The problem is that although the link is not heavily loaded (see http://noc.ilan.net.il/stats/TAU-GIGAPOP/il-us.il.dante.net.html), we suffer high loss rate. The loss pattern (low for small packets, high for large packets) may hint at an ATM issue, which results in cell loss. I suggest we take a look at our DANTE peer router and verify a correct setting.
A further email from Rafi Sadowsky on September 4th said: P.S. the drops from DANTE-NY to ILAN are probably due to an ATM policing problem which seems to be kicking in before the guaranteed PVC capacity.
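The connection between packet size and loss suggested in these emails can be illustrated with a simple model: a packet carried over ATM is split into cells, and losing any one cell loses the whole packet. The Python sketch below is only an illustration under stated assumptions that the source does not confirm: AAL5 framing (8 byte trailer, 48 byte cell payload) with no LLC/SNAP header, 28 bytes of IP and ICMP header added to the ping payload, and independent cell loss with a made-up per-cell probability. Policing usually discards cells in bursts, so the numbers are not a fit to the measurements above.

```python
import math

AAL5_TRAILER_BYTES = 8      # AAL5 CPCS trailer
CELL_PAYLOAD_BYTES = 48     # payload bytes per ATM cell
IP_ICMP_HEADER_BYTES = 28   # assumed: 20 byte IPv4 header + 8 byte ICMP header


def cells_per_packet(ping_payload_bytes):
    """Number of ATM cells needed for one ping packet under AAL5,
    assuming no LLC/SNAP encapsulation header."""
    pdu = ping_payload_bytes + IP_ICMP_HEADER_BYTES + AAL5_TRAILER_BYTES
    return math.ceil(pdu / CELL_PAYLOAD_BYTES)


def packet_loss_probability(n_cells, cell_loss):
    """Probability that at least one of n_cells is lost, assuming each
    cell is lost independently with probability cell_loss."""
    return 1.0 - (1.0 - cell_loss) ** n_cells


CELL_LOSS = 0.005  # hypothetical per-cell loss probability, for illustration only

for payload in (100, 1400):
    n = cells_per_packet(payload)
    p = packet_loss_probability(n, CELL_LOSS)
    print(f"{payload} byte ping: {n} cells, packet loss about {100 * p:.1f}%")
```

Under these assumptions a 1400 byte ping occupies ten times as many cells as a 100 byte ping (30 versus 3), so the same cell loss rate produces a much higher packet loss rate for the large packets.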
An email from Yaron Zabary on September 4th said: After I spoke with Marek over the phone, it turned out he thought that the degraded performance was because the sat T3 line was down. After explaining that the line is up but without the Mentat SkyX boxes, we agreed that the bandwidth (~30Kb/s) was reasonable for these conditions.