Network connectivity between SLAC and UTDallas December 2000 & February 2001

Les Cottrell. Page created: December 19, 2000, last update November 11, 2001.

Introduction

On December 18, 2000, we received email from Joe Izen of UT Dallas, saying:

>     Is it understood why Xinchou only sees 144KB/s on large transfers 
> between UTD and SLAC?  I would have expected much better of 
> Internet2.  Thanks!  -Joe

On February 6th 2001, Joe sent another email:

I just (Tue Feb  6 08:36:10 CST 2001) observed about a 2minute freeze
of the Internet2 connection between UTD and SLAC.  It interupted
xterm and emacs.  Actually, I am emacs'ing locally a file in an
afs-exported volume at slac.  emacs loops, sucking in %0% of
morticia's cpu when this happens.

I've noticed this a few times/day over the past week.  Is the cause understood?

In general, the network connection has not been as snappy lately as
it has in the past few months when doing interactive sessions.

-Joe

Route

The path characterization from SLAC to UT Dallas shows a round trip time (RTT) of about 50 msec between the two sites. It also shows that the route is mainly via Internet2, and that the bottleneck (excluding the initial measuring host, flora05) is about 45Mbps, possibly a T3 link to the UT Dallas campus.
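
As a rough cross-check on these figures, the bandwidth-delay product (bottleneck rate multiplied by RTT) gives the amount of data that must be in flight to keep the path full. The sketch below uses the approximate 45Mbps and 50 msec values quoted above; the function name is illustrative only.

```python
def bandwidth_delay_product_bytes(bottleneck_bits_per_s, rtt_s):
    """Bytes that must be in flight to fill a path of the given rate and RTT."""
    return bottleneck_bits_per_s * rtt_s / 8.0


# Approximate values from the path characterization above.
bdp_bytes = bandwidth_delay_product_bytes(45e6, 0.050)
print(f"Bandwidth-delay product: about {bdp_bytes / 1000:.0f} kBytes")
```

With these inputs the product is roughly 280 kBytes, so a single TCP stream would need a window of about that size to fill the link in the absence of loss.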

The traceroute from UTDallas to SLAC indicates that the main delay is added between 208.44.44.254 and losa-hstn.abilene.ucaid.edu, which is probably the long haul from the Bay Area to Houston, Texas, and so is to be expected. Hops 10 and 11 do not respond because the internal SLAC routers are configured not to respond, for security reasons. Comparing with the traceroute from SLAC to UTDallas shows that the routes are fairly symmetric.

Losses

The pingroute from SLAC to UTDallas indicates that the losses start between 208.44.44.242 and shot.utdallas.edu. The pingroute from UTDallas to SLAC indicates that the packet loss starts (for small 100 Byte packets) between the 1st and 2nd hop, i.e. between utdgw16.utdallas.edu and 208.44.44.254.

Historical performance

Unfortunately, performance between SLAC and UT Dallas had not previously been measured by either IEPM/PingER or Surveyor. We started monitoring with PingER in December 2000. The PingER RTT and loss graph indicates that the packet loss started around January 16th, 2001.

Throughput

We installed iperf at UTDallas and ran a server there. We then ran the client at the SLAC end using the methodology outlined in Bulk Throughput Measurements. With iperf we achieved a best throughput of 7.3Mbits/s (0.9MBytes/s) and an average throughput of 2.5Mbits/s (308kBytes/s). The measured throughput was very variable: measurements with similar numbers of streams and flows, made within minutes of each other, differed by factors of 10. The standard deviation of the throughput measurements was 2Mbits/s and the interquartile range (IQR) was 3Mbits/s. This variability is presumed to be due to variability in the cross-traffic. With only one flow we were only able to achieve a throughput of about 50kbits/s. The loss rate was about 4.5% for the loaded case, and about 1.4% for the unloaded case.
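
For a rough sense of how much the measured loss alone limits a single TCP stream, the widely used Mathis et al. approximation, throughput ≈ (MSS / RTT) × C / sqrt(loss), can be evaluated with the loss rates above. This is a rule of thumb, not the method used for the measurements on this page; the 1460 byte MSS and the constant C = sqrt(3/2) are assumptions, and the 50 msec RTT is taken from the Route section.

```python
import math


def mathis_throughput_bits_per_s(mss_bytes, rtt_s, loss_fraction):
    """Approximate steady-state TCP throughput for one stream under random loss."""
    c = math.sqrt(3.0 / 2.0)  # assumed constant for the simple periodic-loss model
    return (mss_bytes * 8.0 / rtt_s) * c / math.sqrt(loss_fraction)


MSS_BYTES = 1460   # assumption: typical Ethernet-path MSS, not stated on this page
RTT_S = 0.050      # about 50 msec, from the path characterization

for label, loss in (("unloaded, 1.4% loss", 0.014), ("loaded, 4.5% loss", 0.045)):
    estimate = mathis_throughput_bits_per_s(MSS_BYTES, RTT_S, loss)
    print(f"{label}: about {estimate / 1e6:.1f} Mbits/s per stream")
```

Estimates of this kind are per stream, which is one reason several parallel streams can raise aggregate throughput on a lossy path.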

Using bbftp (an FTP application that supports multiple streams and large windows), Itaru Kitayama of UTDallas was able to get ~500kBytes/s with 3 streams and 256kByte windows.

Using the ns-2 network simulator we are able to predict throughput/goodput. For this link we set:

The predicted goodput was 927kbits/s (909kbits/sec) or 116kBytes/s (114kBytes/s). Setting the window size to 64kBytes increased the goodput to 6.7Mbits/s (4.2Mbits/s) or 836kBytes/s (528kBytes/s). With the number of flows increased to 10 and the window size kept at 64kBytes, the predicted goodput is 40Mbits/s (42Mbits/s) or 5MBytes/s (5.3MBytes/s).
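
The effect of window size on these predictions can be sanity-checked with the standard window limit for TCP: one stream cannot exceed window / RTT, and the aggregate cannot exceed the bottleneck. The sketch below is a simple upper bound, not the ns-2 model; it assumes the roughly 50 msec RTT and 45Mbps bottleneck from the Route section and takes 1 kByte as 1024 bytes.

```python
def window_limited_bits_per_s(window_bytes, rtt_s, flows=1, bottleneck_bits_per_s=None):
    """Upper bound on aggregate TCP throughput set by window size and RTT."""
    limit = flows * window_bytes * 8.0 / rtt_s
    if bottleneck_bits_per_s is not None:
        # The aggregate can never exceed the slowest link on the path.
        limit = min(limit, bottleneck_bits_per_s)
    return limit


RTT_S = 0.050          # about 50 msec
BOTTLENECK = 45e6      # about 45Mbps
WINDOW = 64 * 1024     # 64 kBytes, assuming 1 kByte = 1024 bytes

one_flow = window_limited_bits_per_s(WINDOW, RTT_S, 1, BOTTLENECK)
ten_flows = window_limited_bits_per_s(WINDOW, RTT_S, 10, BOTTLENECK)
print(f"1 flow:   at most {one_flow / 1e6:.1f} Mbits/s")
print(f"10 flows: at most {ten_flows / 1e6:.1f} Mbits/s")
```

Under these assumptions the ceilings are about 10.5Mbits/s for one flow and the 45Mbps bottleneck for ten flows, so the predicted goodputs of 6.7Mbits/s and 40Mbits/s both fall below their respective limits.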

For more information on achieving high throughput/goodput see: Bulk throughput measurements, and to see how well the simulator agrees with observed measurements, see Bulk throughput simulation.

Resolution

We noted on February 9, 2001 that the loss appeared to have decreased. Ping indicated that the losses were now about 1-2%:
----WWW4.SLAC.Stanford.EDU PING Statistics----
285 packets transmitted, 277 packets received, 2% packet loss
round-trip (ms)  min/avg/max = 49/76/273

----morticia.utdallas.edu PING Statistics----
135 packets transmitted, 133 packets received, 1% packet loss
round-trip (ms)  min/avg/max = 49/68/164
The improvement appeared to happen at about 11:00 GMT on February 7, 2001. Since then the RTT appears to be more variable, which typically indicates more congestion. The routes in both directions, measured at 9:15am on February 9, 2001, were identical to those measured when the heavy losses were occurring.
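
The integer percentages in the ping summaries above hide some precision. A minimal sketch that recomputes the loss from the transmitted and received counts follows; the regular expression matches the summary wording shown on this page, and other ping implementations print different text.

```python
import re

SUMMARY_RE = re.compile(
    r"(\d+) packets transmitted, (\d+) packets received, (\d+)% packet loss"
)


def loss_percent(summary_line):
    """Return the unrounded loss percentage from a ping summary line."""
    match = SUMMARY_RE.search(summary_line)
    if match is None:
        raise ValueError(f"not a ping summary line: {summary_line!r}")
    sent, received = int(match.group(1)), int(match.group(2))
    return 100.0 * (sent - received) / sent


# Summary lines copied from the ping output above.
summaries = {
    "WWW4.SLAC.Stanford.EDU": "285 packets transmitted, 277 packets received, 2% packet loss",
    "morticia.utdallas.edu": "135 packets transmitted, 133 packets received, 1% packet loss",
}
for host, line in summaries.items():
    print(f"{host}: {loss_percent(line):.2f}% loss")
```

The counts give about 2.8% and 1.5%, so the reported 2% and 1% appear to be truncated rather than rounded.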

March '01

On March 1, 2001 Joe Izen enquired whether the 2% packet loss problem had been tracked down. Les Cottrell ran pings separated by 1 second for an hour (3600 pings) from SLAC to morticia.utdallas.edu across lunchtime on 3/1/01 and observed no losses. The PingER RTT and Loss plot for Dec 2000 through Feb 2001 indicates the problem went away on February 11, 2001.
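
Observing zero losses in 3600 pings does not prove the loss rate is zero, but it does bound it. A minimal sketch, assuming each ping is lost independently with the same probability (an idealization, since real losses tend to be bursty):

```python
def zero_loss_upper_bound(n_probes, confidence=0.95):
    """Largest loss probability still consistent with seeing 0 losses in n_probes."""
    # Solve (1 - p) ** n_probes = 1 - confidence for p.
    return 1.0 - (1.0 - confidence) ** (1.0 / n_probes)


bound = zero_loss_upper_bound(3600)
print(f"95% upper bound on loss rate: {100 * bound:.3f}%")
```

With 3600 probes the bound is under 0.1%, well below the 1-2% loss seen in February.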

September '01

On September 23, 2001, we measured the iperf TCP throughput from pharlap.slac.stanford.edu to morticia.utdallas.edu. The top 10% of throughputs are above 41Mbits/s. The pipechar output indicates that the bottleneck is probably a T3 link (43Mbps).

November '01

On November 10, '01, iperf performance from SLAC to Dallas appeared very variable, with 2 plateaus at ~40Mbits/s and 10 to 20Mbits/s.
Page owner: Les Cottrell