The Rutherford Lab (RL) computer center,
near Oxford, England, is planning to become a
BaBar tier 1
remote computing site. This means it will hold a copy of all the BaBar
data (currently 1 TByte a day and expected to increase by a factor of 3 in the next year),
and will be a major site where BaBar computing is performed.
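
To put the data volume in network terms, the following is a rough sketch of the sustained rate needed just to keep up with it. It assumes 1 TByte = 10^12 bytes and that the data is spread evenly over the day; both are simplifications of mine, and the function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_mbits_per_s(tbytes_per_day: float) -> float:
    """Average rate in Mbits/s needed to move a daily volume.

    Assumes 1 TByte = 10**12 bytes and an evenly spread transfer.
    """
    bits_per_day = tbytes_per_day * 1e12 * 8
    return bits_per_day / SECONDS_PER_DAY / 1e6


if __name__ == "__main__":
    # Current volume and the projected factor-of-3 increase.
    for volume in (1.0, 3.0):
        rate = sustained_mbits_per_s(volume)
        print(f"{volume:.0f} TByte/day -> {rate:.1f} Mbits/s sustained")
```

Under these assumptions 1 TByte a day corresponds to roughly 93 Mbits/s around the clock, and three times that volume to roughly 278 Mbits/s.
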
As such, it needs excellent connectivity/performance to
SLAC and other
collaborator sites. While making throughput measurements
from SLAC (pharlap.slac.stanford.edu) to RL
(dev04.hepgrid.clrc.ac.uk) in preparation for the
challenge, it was discovered that the maximum throughput achievable
was about 40 Mbits/s. The
pipechar from SLAC to RL indicates that
the connection is via ESnet and JAnet and the bottleneck is somewhere
in JAnet. The pipechar from
RL to SLAC appears to show that the route is OC48 all the way.
The link from SLAC to ESnet at this time was OC3. This discrepancy
(OC48 vs OC3 for the SLAC - ESnet link) is probably due to
on more distant links.
The traceroute from SLAC to RL shows that there is about
70 msec round trip time (RTT) between SLAC and ESnet and 139 msec from
SLAC to RL. It also clearly indicates the carriers as being ESnet and JAnet.
The traceroute from RL to SLAC
indicates the route is fairly symmetric.
I remeasured the pipechar from SLAC to RL,
but the ~40 Mbits/s bottleneck still appeared in JAnet.
We also made some more in-depth measurements of the
throughput as a function of window size and number of streams. These measurements were
made with the iperf client on pharlap.slac.stanford.edu (a Sun E4500 with
6*336 MHz CPUs running Solaris 5.8 with a GE interface and a 4 MByte TCP buffer)
and dev04.hepgrid.clrc.ac.uk (a 604 MHz
Linux 2.2.16-3 host with a GE interface and an 8 MByte TCP buffer).
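
A scan over window sizes and streams of this kind can be sketched as below. This is only an illustration: the window sizes, stream counts and test duration are placeholder values I chose, not the ones used for the measurements, and the helper only builds the iperf client command lines (using the standard -c, -w, -P, -t and -f options) without running them.

```python
from itertools import product


def iperf_commands(host, windows_kbytes, streams, seconds=10):
    """Build one iperf client command per (window, streams) combination.

    The values passed in are the caller's choice; nothing here is taken
    from the original measurement scripts.
    """
    commands = []
    for window, count in product(windows_kbytes, streams):
        commands.append([
            "iperf",
            "-c", host,            # run as client against this server
            "-w", f"{window}K",    # requested TCP window size in KBytes
            "-P", str(count),      # number of parallel streams
            "-t", str(seconds),    # duration of each test in seconds
            "-f", "m",             # report throughput in Mbits/s
        ])
    return commands


if __name__ == "__main__":
    # Hypothetical sweep; the real window and stream values are not recorded here.
    for cmd in iperf_commands("dev04.hepgrid.clrc.ac.uk", [64, 256, 1024], [1, 4, 8]):
        print(" ".join(cmd))
```

Each printed line could be run by hand or passed to subprocess.run, with an iperf server listening on the remote host.
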
The results are shown below, and indicate that the maxima (the 10% of the
measurements that give the greatest throughput) are over 38 Mbits/s.
This is well below the expected maximum of about 100 Mbps on an OC3 bottleneck.
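
For clarity, the "maxima" figure can be computed as in the following sketch. The sample values in the usage section are made up for illustration and are not the measured data; the function name is mine.

```python
import math


def top_decile(samples):
    """Return the top 10% of the samples (at least one), largest first."""
    if not samples:
        raise ValueError("no samples given")
    ranked = sorted(samples, reverse=True)
    count = max(1, math.ceil(len(ranked) / 10))
    return ranked[:count]


if __name__ == "__main__":
    # Made-up throughputs in Mbits/s, one per (window, streams) setting.
    throughputs = [5.2, 12.0, 18.5, 22.1, 25.7, 30.3, 33.9, 36.4, 38.2, 39.0]
    best = top_decile(throughputs)
    print(f"top 10% of measurements are at or above {min(best)} Mbits/s")
```

The smallest value in the returned list is the threshold that the top 10% of measurements exceed or equal.
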
On Jan 14 '02 I received the following email from Chris Selig at RAL:
Would it be possible for you to re-run your test between SLAC and
RAL. We have identified (and fixed) a routing problem with the network to
which the dev04 is connected. This problem affected ONLY the network on
which the test machine lived, which explains why none of our measurements
show up the problem.
I reran the test and got a maximum throughput of 37 Mbits/s, with the top 10%
of measurements being over 29 Mbits/s. I also noticed that the maximum
window size was set to 65 KBytes, and so requested that it be increased.
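
The reason the window limit matters can be seen from the usual window/RTT bound on TCP throughput. The sketch below is a simplified estimate, assuming the 139 msec SLAC to RL round trip time from the traceroute, 1 KByte = 1024 bytes, and no packet loss or slow-start effects; the function names are illustrative.

```python
def window_limited_mbits_per_s(window_bytes, rtt_s, streams=1):
    """Upper bound on throughput when each stream can have at most
    one window of data in flight per round trip."""
    return streams * window_bytes * 8 / rtt_s / 1e6


def window_needed_bytes(target_mbits_per_s, rtt_s, streams=1):
    """Per-stream window needed to fill a path at the target rate."""
    return target_mbits_per_s * 1e6 * rtt_s / 8 / streams


if __name__ == "__main__":
    rtt = 0.139            # seconds, SLAC to RL round trip time
    window = 65 * 1024     # bytes, the maximum window found on the RL host

    single = window_limited_mbits_per_s(window, rtt)
    print(f"one stream, 65 KByte window: at most {single:.2f} Mbits/s")

    needed = window_needed_bytes(100, rtt)
    print(f"window for 100 Mbits/s on one stream: {needed / 1e6:.2f} MBytes")
```

With these numbers a single stream is capped at a little under 4 Mbits/s, so reaching tens of Mbits/s required many parallel streams, and a window of roughly 1.7 MBytes would be needed for one stream to reach 100 Mbits/s.
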
On Friday March 1st Tim Folkes of RL reported:
OK, Have made the change, let me know if it improves things.
I then remeasured the throughputs using varying window sizes and streams.
I was able to achieve over 100Mbits/s with various streams/window combinations.
Joe Metzger of RL reported, Mar 3 '02:
Currently we are not policing any of the circuits between
SLAC and our JANET connection in New York at rates lower than
Page owner: Les Cottrell