
Qbone Scavenger Service (QBSS) at SLAC

Les Cottrell. Page created: May 3, 2001, last update May 12, 2001.


Introduction

Some people in Internet 2 are starting to experiment with some form of scavenger QoS service. We (SLAC) believed we could be of some assistance with this, as we have critical needs to transmit large amounts of data between BaBar sites. These include IN2P3 in Lyon, France; CERN; RAL in the UK; INFN Rome, Italy; and Caltech, as well as LLNL, LBNL and Colorado. We can utilize most of the available bandwidth for quite long periods (days) by using large windows and multiple streams (see for example Bulk throughput measurements and Bulk throughput: streams vs. Windows). One possibility for reducing the impact of our traffic on others is to try a scavenger QoS service.

The basic idea of QBSS, according to Stanislav Shalunov of Internet 2, is:
The gist is that you'd mark your bulk flows (which already have congestion control) with a special DSCP value (001000), and this value would be passed through by all networks involved. Some networks may choose to ignore the marking and treat this traffic just like the default best-effort class. Other networks (particularly those that experience congestion) would use a variation of a weighted round-robin queuing discipline (or whatever their router vendor calls it) to give QBSS a very small percentage of link share on the head ends of congested links. This way, QBSS would use whatever is left over from the default best-effort class. Within QBSS, there would be normal competition of the same exact sort we find in BE.
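
As a concrete illustration (not part of the original page), here is a minimal Python sketch of how a sender could mark a TCP socket's traffic with the QBSS DSCP value 001000; the host and port are placeholders:

    import socket

    # The DSCP occupies the upper six bits of the IP TOS byte, so the
    # QBSS value 001000 (decimal 8) becomes a TOS byte of 8 << 2 = 0x20.
    QBSS_DSCP = 0b001000
    TOS_VALUE = QBSS_DSCP << 2

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, TOS_VALUE)
    sock.connect(("remote-host.example", 5001))  # placeholder destination
    # From here on, packets sent on this socket carry the QBSS marking;
    # routers that implement QBSS on congested links give them only a
    # small share, while other networks may treat them as ordinary
    # best-effort traffic.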

Routes

One critical point is to understand the routes from SLAC (or Stanford) to the candidate sites mentioned above, to see whether they pass through Internet 2. A second requirement is that there be some point along the path that experiences congestion. The traceroutes below indicate which paths use Internet 2/Abilene, and the pipechars give some idea of where there may be congestion and what the bottleneck bandwidth is. The standard deviation (sd) of the RTT also gives some idea of the congestion (larger is more congested).
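
To illustrate the RTT-based indicator used in the tables below, here is a minimal sketch (an assumption, not the tooling actually used for the measurements) of extracting the minimum RTT and its standard deviation from a series of pings:

    import re
    import statistics
    import subprocess

    HOST = "remote-host.example"  # placeholder for one of the sites below

    def rtt_min_and_sd(host, count=20):
        """Ping `host` and return (min RTT, standard deviation) in msec."""
        out = subprocess.run(["ping", "-c", str(count), host],
                             capture_output=True, text=True).stdout
        # Pull the per-packet "time=XX.X" values from the ping output.
        rtts = [float(m) for m in re.findall(r"time=([\d.]+)", out)]
        return min(rtts), statistics.stdev(rtts)

    # A larger standard deviation suggests more queueing (congestion)
    # somewhere along the path.
    print(rtt_min_and_sd(HOST))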
Path characteristics seen from SLAC

Site (click for traceroute)       | Bottleneck (click for pipechar) | Min RTT (sd)     | Map | Sustained iperf throughput | AS's
Caltech, Pasadena, CA             | 75 Mbps                         | 25 (0.13) msec   | No  | 40 Mbps (Dec-00)           | CalREN2
CERN, Geneva, Switzerland         | 69 Mbps                         | 160 (0.23) msec  | Yes | 56 Mbps (Mar-01)           | ESnet
Colorado, Boulder, CO             | 94 Mbps                         | 27.4 (0.91) msec | No  | 20 Mbps (May-01)           | CalREN2, Abilene
Daresbury Lab, Liverpool, England | 17 Mbps                         | 151 (3.42) msec  | Yes | 40 Mbps (Mar-01)           | ESnet, JANet
GSFC/NASA                         | 80 Mbps                         | 64 (22) msec     | No  | -                          | CalREN2, Abilene, NASA
IN2P3, Lyon, France               | 27 Mbps                         | 147 (0.31) msec  | Yes | 30 Mbps (Mar-01)           | ESnet, PHYnet
INFN, Rome, Italy                 | 31 Mbps                         | 180 (0.26) msec  | Yes | 26 Mbps (Mar-01)           | ESnet, TEN-155, GARR
UTDallas, TX                      | 23 Mbps                         | 84.6 (43) msec   | No  | 12 Mbps (Feb-01)           | CalREN2, Abilene

Path characteristics seen from Stanford

Site (click for traceroute)       | Bottleneck (click for pipechar) | Min RTT (sd)     | Map | Sustained iperf throughput | AS's
Daresbury Lab, Liverpool, England | 16 Mbps                         | 141 (4.1) msec   | No  | 32 Mbps (May-01)           | CalREN2, Abilene, JAnet
From the above tables it appears that good candidates for QBSS testing (i.e. the path passes through Abilene, and congestion is identified in the pipechar measurements and the standard deviation of the RTT) are UTDallas, GSFC and Daresbury Lab. Unfortunately, according to Joe Izen of UT Dallas: "My network group explained that the router with plenty of backplane bandwidth is running out of CPU cycles filtering more packets in addition to its routing duties." And as stated by Stanislav Shalunov: "In cases where apparent congestion is caused by scarcity of computing cycles in a router rather than scarcity of link capacity, one would expect that any QoS technique would only hurt instead of helping."

GSFC

With the help of Andy Germain of GSFC we were able to get an iperf server set up on 198.10.49.61 (pawn.eos.gsfc.nasa.gov), running FreeBSD 3.5-RELEASE #0 on a 200 MHz Intel chip. This machine has a Gbps Ethernet interface. The pipechar from SLAC to this machine indicates a bottleneck of about 150 Mbps. The iperf server only allowed up to 9 parallel flows. We ran the TCP iperf client from tersk07.slac.stanford.edu, a Sun Netra t 1400/1405 with 4 x 440 MHz CPUs and 4 GB of memory running Solaris 5.7, for 1, 5, 8, 4, 3, 7, 9, 2 and 6 flows, with window sizes of 8, 1024, 16, 512, 32, 256, 128 and 64 kBytes (see Bulk Throughput for the methodology). The first graph below shows the average TCP throughput (averaged over all flow settings). The second graph shows the throughput by flow, averaged over all the window sizes. It can be seen that there is much more variation with window size than with the number of flows; in fact increasing the number of flows may have a detrimental effect on the throughput. This is different from the behavior seen for many other links (see Bulk Throughput and Bulk throughput: Windows vs. Streams), where the use of more streams is more effective than using large windows in achieving high throughput. It is also seen that there appears to be an optimum window size around 500 kBytes. This is less than would be predicted using the RTT * bottleneck bandwidth product, which predicts about 1 MByte.
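
As a rough check of that prediction, and to sketch the kind of window/flow sweep described above, here is a minimal Python example (an illustration only; the exact iperf options and run lengths used for the real measurements follow the Bulk Throughput methodology and are assumed here):

    import itertools
    import subprocess

    # Bandwidth*delay product for the SLAC -> GSFC path: ~150 Mbps pipechar
    # bottleneck and ~64 ms minimum RTT give roughly 1.2 MBytes, i.e. the
    # "about 1 MByte" window quoted above.
    bottleneck_bps = 150e6
    min_rtt_s = 0.064
    bdp_bytes = bottleneck_bps * min_rtt_s / 8
    print(f"predicted optimum window ~ {bdp_bytes / 1e6:.1f} MBytes")

    # Sketch of the window/flow sweep against the GSFC iperf server.
    windows_kb = [8, 1024, 16, 512, 32, 256, 128, 64]
    flows = [1, 5, 8, 4, 3, 7, 9, 2, 6]
    for w, p in itertools.product(windows_kb, flows):
        subprocess.run(["iperf", "-c", "pawn.eos.gsfc.nasa.gov",
                        "-w", f"{w}K", "-P", str(p), "-t", "10", "-f", "m"])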
[Graphs: average throughput from SLAC to GSFC, by window size (averaged over flows) and by number of flows (averaged over window sizes)]
The following graphs show the throughput broken down by flows and windows for 3 different sets of measurements at different times (1st: May 11, 2001 22:58 - 23:22 PDT; 2nd: May 12, 2001 9:49 am - 10:10 am; 3rd: May 12, 2001 10:33 - 11:03 PDT). It can be seen that there is considerable variation from plot (time) to plot (time). Looking at the EOS network graphs (destination: SLAC) of TCP iperf throughput from destruction.gsfc.nasa.gov (128.183.166.156), an SGI running IRIX 6.5, to SLAC, it can be seen that there are large (factors of 8) variations in throughput from hour to hour. There is also considerable variation in the maximum throughput that can be achieved, varying by over a factor of 3 from about 25 Mbits/s to over 90 Mbits/s. This may be caused by variation in the competing load (cross-traffic). However, the pipechar indicates that the bottleneck is not within Abilene.
[Graphs: throughput by window & flow from SLAC to GSFC, for the three measurement periods]

Daresbury

The folks at Daresbury have a UKERNA-funded joint project with SLAC and I2 to investigate the effectiveness of QoS techniques. Measurements of iperf throughput from SLAC to DL and from Stanford to DL are available. The host that we have access to at DL is a Linux host which has not been configured to allow windows of > 64 kBytes, so measurements from DL are limited in their applicability. We have installed the bbcp network file copy program at DL and made some measurements between SLAC and DL. Using bbcp to set the QBSS bit at DL and send the packets to SLAC, and then snoop on a Solaris host at SLAC, we have verified that the bit is still set when the packets reach SLAC.
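
For reference, a minimal sketch of how such a check could be done on the receiving host with the Python scapy library (an assumption for illustration, not the snoop procedure actually used; the DL host name is a placeholder):

    from scapy.all import IP, sniff  # capturing packets requires root privileges

    QBSS_DSCP = 0b001000          # the DSCP value 001000 described above
    DL_HOST = "dl-host.example"   # placeholder for the Daresbury sending host

    def report_dscp(pkt):
        # The DSCP is the upper six bits of the IP TOS byte.
        dscp = pkt[IP].tos >> 2
        marked = "QBSS" if dscp == QBSS_DSCP else f"DSCP {dscp:06b}"
        print(f"{pkt[IP].src} -> {pkt[IP].dst}: {marked}")

    # Inspect a handful of packets arriving from the DL host.
    sniff(filter=f"ip and src host {DL_HOST}", prn=report_dscp, count=20)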
Page owner: Les Cottrell