Evaluation of Advanced TCP stacks on Fast Long-Distance production Networks
Prepared by Les Cottrell & Hadrien Bullot, SLAC & EPFL, for the BaBar/SCS meeting
November 11, 2003
www.slac.stanford.edu/grp/scs/net/talk03/babar-long-nov03.html

Project goals
Test new advanced TCP stacks, see how they perform on short and long-distance real production WAN links
Compare & contrast: ease of configuration, throughput, convergence, fairness, stability etc.
For different RTTs, windows, txqueuelen
Recommend “optimum” stacks for data intensive science (BaBar) transfers using bbftp, bbcp, GridFTP
Validate simulator & emulator findings & provide feedback

Protocol selection
TCP only
No rate-based transport protocols (e.g. SABUL, UDT, RBUDP) at the moment
No iSCSI or FC over IP
Sender mods only: the HENP model is a few big senders and lots of smaller receivers
Simplifies deployment, only a few hosts at a few sending sites
No DRS
Runs on production nets
No router mods (XCP/ECN), no jumbo frames

Protocols Evaluated
Linux 2.4 New Reno with SACK: single and parallel streams (P-TCP)
Scalable TCP (S-TCP)
Fast TCP
HighSpeed TCP (HS-TCP)
HighSpeed TCP Low Priority (HSTCP-LP)
Binary Increase Control TCP (Bic-TCP)
Hamilton TCP (H-TCP)

Reno single stream
Low performance on fast long distance paths
AIMD (add a=1 pkt to cwnd / RTT, decrease cwnd by factor b=0.5 in congestion)
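The AIMD rule above can be sketched as follows. This is a minimal illustration and not the Linux 2.4 code: the function name and the one-update-per-RTT granularity are assumptions, while a and b are the values given on the slide.

```python
def reno_aimd_step(cwnd, loss, a=1.0, b=0.5):
    """Return cwnd (in packets) after one RTT of congestion avoidance.

    Hypothetical helper: a and b are the slide's AIMD parameters.
    """
    if loss:
        # Multiplicative decrease: shrink cwnd by the factor b, never below 1 packet.
        return max(1.0, cwnd * (1.0 - b))
    # Additive increase: a packets per RTT.
    return cwnd + a


cwnd = 100.0
for _ in range(10):          # ten loss-free RTTs
    cwnd = reno_aimd_step(cwnd, loss=False)
print(cwnd)                  # 110.0
print(reno_aimd_step(cwnd, loss=True))  # 55.0
```

Growing by one packet per RTT is what makes a single Reno stream slow to refill a large window after a loss on a long-RTT path.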

P-TCP
TCP Reno with 16 streams
Parallel streams heavily used in HENP & elsewhere to achieve needed performance, so it is today’s de facto baseline
However, hard to optimize both the window size AND number of streams since optimal values can vary due to network capacity, routes or utilization changes

S-TCP
Uses exponential increase everywhere (in slow start and congestion avoidance)
Multiplicative decrease factor b = 0.125
Introduced by Tom Kelly of Cambridge
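A rough sketch of the S-TCP update, for comparison with Reno. The decrease factor b = 0.125 is from the slide; the per-ACK increase a = 0.01 is an assumption taken from Kelly's published proposal and is not stated here. The function names are hypothetical.

```python
import math


def stcp_on_ack(cwnd, a=0.01):
    # Fixed increment per ACK; with about cwnd ACKs per RTT this is exponential growth.
    return cwnd + a


def stcp_on_loss(cwnd, b=0.125):
    # Multiplicative decrease by the slide's factor b.
    return cwnd * (1.0 - b)


def rtts_to_recover(a=0.01, b=0.125):
    # cwnd grows by roughly (1 + a) per RTT, so recovery time does not depend on cwnd.
    return math.log(1.0 / (1.0 - b)) / math.log(1.0 + a)


print(stcp_on_loss(1000.0))          # 875.0
print(round(rtts_to_recover(), 1))   # 13.4
```

Under these assumptions the time to regain the pre-loss window is a fixed number of RTTs regardless of window size, which is the sense in which the stack is "scalable".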

Fast TCP
Based on TCP Vegas
Uses both queuing delay and packet losses as congestion measures
Developed at Caltech by Steven Low and collaborators
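The delay-based part of Fast TCP can be illustrated with the periodic window update from the published FAST description. This is only a sketch: the parameter values for alpha and gamma are assumptions chosen for illustration, the function name is hypothetical, and the loss response is omitted.

```python
def fast_window_update(w, base_rtt, rtt, alpha=200.0, gamma=0.5):
    """One periodic window update (in packets).

    base_rtt: smallest RTT seen (propagation delay estimate)
    rtt:      current average RTT, including queuing delay
    alpha:    target number of packets queued in the path (assumed value)
    gamma:    smoothing factor in (0, 1] (assumed value)
    """
    target = (base_rtt / rtt) * w + alpha
    # Never more than double the window in a single update.
    return min(2.0 * w, (1.0 - gamma) * w + gamma * target)


print(fast_window_update(1000.0, 0.170, 0.170))  # no queuing delay: window grows
print(fast_window_update(1000.0, 0.170, 0.340))  # RTT doubled by queuing: window shrinks
```

Because the update reacts to measured RTT, anything that inflates or distorts RTT samples influences the window directly.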

HS-TCP
Behaves like Reno for small values of cwnd
Above a chosen value of cwnd (default 38) a more aggressive function is used
Uses a table to indicate by how much to increase cwnd when an ACK is received
Introduced by Sally Floyd
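The table mentioned above can be generated from the response function in RFC 3649. The sketch below assumes the RFC's default parameters (High_Window = 83000, High_Decrease = 0.1); only the threshold of 38 comes from the slide, and the function names are hypothetical.

```python
import math

LOW_WINDOW = 38        # from the slide: Reno behaviour at or below this cwnd
HIGH_WINDOW = 83000    # assumed: RFC 3649 default
HIGH_DECREASE = 0.1    # assumed: RFC 3649 default


def hstcp_b(w):
    """Decrease factor b(w) applied to cwnd on a loss."""
    if w <= LOW_WINDOW:
        return 0.5
    frac = (math.log(w) - math.log(LOW_WINDOW)) / (
        math.log(HIGH_WINDOW) - math.log(LOW_WINDOW))
    return (HIGH_DECREASE - 0.5) * frac + 0.5


def hstcp_a(w):
    """Increase a(w) in packets per RTT; the per-ACK increase is a(w) / w."""
    if w <= LOW_WINDOW:
        return 1.0
    p = 0.078 / w ** 1.2                      # loss rate that sustains window w
    b = hstcp_b(w)
    return w * w * p * 2.0 * b / (2.0 - b)


for w in (38, 1000, 10000, 83000):
    print(w, round(hstcp_a(w), 1), round(hstcp_b(w), 2))
```

An implementation would typically precompute these values into a lookup table indexed by cwnd rather than evaluate logarithms per ACK.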

HSTCP-LP
Mixture of HS-TCP with TCP-LP (Low Priority)
Backs off early in face of congestion by looking at RTT
Idea is to provide a scavenger service without router modifications
From Rice University

Bic-TCP
Combine:
An additive increase used for large cwnd
A binary search increase used for small cwnd
Developed by Injong Rhee at NC State University
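A simplified sketch of how the two increase modes can be combined. It assumes, following the published BIC description rather than anything stated on this slide, that the mode is chosen by the distance between cwnd and the window at the last loss (w_max). The step limits s_max and s_min are illustrative values, the function name is hypothetical, and max probing above w_max is omitted.

```python
def bic_increase(cwnd, w_max, s_max=32.0, s_min=0.01):
    """Return cwnd after one loss-free RTT of growth toward w_max."""
    midpoint = (cwnd + w_max) / 2.0
    step = midpoint - cwnd
    # Far from the target: the jump is capped, giving a linear (additive) ramp.
    # Close to the target: jump to the midpoint, i.e. a binary search step.
    step = min(max(step, s_min), s_max)
    return cwnd + step


cwnd = 500.0
for rtt in range(20):
    cwnd = bic_increase(cwnd, w_max=1000.0)
print(round(cwnd, 2))
```

The window climbs quickly at first and then flattens as it approaches the previous loss point.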

H-TCP
Similar to HS-TCP in switching to aggressive mode after threshold
Uses a heterogeneous AIMD algorithm
Developed at the Hamilton Institute, Ireland

Measurements
20 minute tests, long enough to see stable patterns
Iperf reports incremental and cumulative throughputs at 5 second intervals
Ping interval about 100ms
At sender: use 1 machine for iperf/TCP, a 2nd for cross-traffic (UDP or TCP), a 3rd for ping
At receiver: use 1 machine for ping (echo) and TCP, 2nd for cross-traffic

Networks
3 main network paths
Short distance: SLAC-Caltech (RTT~10ms)
Middle distance: UFL and DataTAG Chicago (RTT~70ms)
Long distance: CERN and University of Manchester (RTT ~ 170ms)
Tests during nights and weekends to avoid unacceptable impacts on production traffic

Windows
Set large maximum windows (typically 32MB) on all hosts
Used 3 different windows with iperf:
Small window size, factor 2-4 below optimal
Roughly optimal window size (~BDP)
Oversized window
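As a worked example of the "roughly optimal" window, the bandwidth-delay product (BDP) can be computed for the three RTTs listed under Networks. The 1 Gbit/s bottleneck used here is purely an assumption for illustration; the actual path capacities are not given in this text.

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-delay product in bytes: the data in flight needed to fill the path."""
    return bandwidth_bps * rtt_s / 8.0


ASSUMED_BOTTLENECK_BPS = 1e9  # hypothetical 1 Gbit/s bottleneck

for name, rtt_ms in [("SLAC-Caltech", 10),
                     ("UFL / DataTAG Chicago", 70),
                     ("CERN / Manchester", 170)]:
    mb = bdp_bytes(ASSUMED_BOTTLENECK_BPS, rtt_ms / 1000.0) / 1e6
    print(f"{name}: BDP ~ {mb:.2f} MB")
```

Under that assumption even the longest path needs a window below the 32MB maximum configured on the hosts.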

RTT
Only P-TCP appears to dramatically affect the RTT
E.g. increases the RTT by 200ms (factor 20 for short distances)

txqueuelen
Regulates the size of the queue between the IP layer and the Ethernet layer
May increase the throughput if we find optimal values
But may increase duplicate ACKs (Y. T. Li)

Throughput (Mbps)

Throughput

Stability
Definition: standard deviation normalized by the average throughput
At short RTT (10ms) stability is usually good (<=12%)
At medium RTT (70ms) P-TCP, S-TCP & Bic-TCP appear more stable than the other protocols
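The stability definition above is a coefficient of variation and can be computed directly from the interval throughputs that iperf reports. In this sketch the sample values are made up, and the use of the population standard deviation (rather than the sample one) is an assumption.

```python
import statistics


def stability_index(throughputs_mbps):
    """Standard deviation of the throughput samples divided by their mean."""
    mean = statistics.mean(throughputs_mbps)
    return statistics.pstdev(throughputs_mbps) / mean


# Hypothetical 5-second interval throughputs in Mbps.
samples = [400, 420, 380, 410, 390]
print(f"{stability_index(samples):.1%}")  # 3.5%
```

Lower values mean a steadier transfer, so the "<=12%" figure quoted for short RTT is an upper bound on this ratio.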

Sinusoidal UDP
UDP does not back off in face of congestion, it has a “stiff” behavior
We modified iperf to allow it to create UDP traffic with a sinusoidal time behavior, following an idea from Tom Hacker
See how TCP responds to varying cross-traffic
Used 2 periods of 30 and 60 seconds and amplitude varying from 20 to 80 Mbps
Sent from 2nd sending host to 2nd receiving host while sending TCP from 1st sending host to 1st receiving host
As long as the window size was large enough, all protocols converged quickly and maintained a roughly constant aggregate throughput
Especially for P-TCP & Bic-TCP
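The sinusoidal cross-traffic can be pictured as a target sending rate that a modified sender would follow over time. This sketch is not the modified iperf code: it assumes the rate swings between 20 and 80 Mbps, which is one reading of the amplitude statement above, and the function name is hypothetical.

```python
import math


def udp_rate_mbps(t_s, period_s=60.0, low_mbps=20.0, high_mbps=80.0):
    """Target UDP sending rate at time t_s for a sinusoidal load."""
    mean = (low_mbps + high_mbps) / 2.0
    amplitude = (high_mbps - low_mbps) / 2.0
    return mean + amplitude * math.sin(2.0 * math.pi * t_s / period_s)


for t in (0, 15, 30, 45):
    print(t, round(udp_rate_mbps(t), 1))
```

Setting period_s to 30.0 gives the second of the two periods used in the tests.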

TCP Convergence against UDP
Stability better at short distances
P-TCP & Bic more stable

Cross TCP Traffic
Important to understand how fair a protocol is
For one protocol competing against the same protocol (intra-protocol) we define the fairness for a single bottleneck as:
All protocols have good intra-protocol Fairness (F>0.98)
Except HS-TCP (F<0.94) when the window size > optimal
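The fairness formula itself is not reproduced in this text. A widely used single-bottleneck definition is Jain's index, F = (sum of x_i)^2 / (n * sum of x_i^2), where x_i is the average throughput of flow i and n the number of flows. Whether this is exactly the definition used for these results is an assumption; the sketch below simply shows how such an index behaves.

```python
def fairness(throughputs):
    """Jain's fairness index: 1.0 for equal shares, down to 1/n when one flow takes all."""
    n = len(throughputs)
    total = sum(throughputs)
    return total * total / (n * sum(x * x for x in throughputs))


print(fairness([300.0, 300.0]))  # 1.0
print(fairness([400.0, 200.0]))  # 0.9
```

With two flows, a value above 0.98 corresponds to the flows' throughputs differing by only a modest fraction.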

Fairness (F)
Most have good intra-protocol fairness (diagonal elements), except HS-TCP
Inter-protocol: Bic-TCP & H-TCP appear fairer against the others
Worst fairness: HSTCP-LP, P-TCP, S-TCP, Fast
But cannot tell who is aggressive and who is timid

Inter protocol Fairness
For inter-protocol fairness we introduce the asymmetry between the two throughputs:
Where x1 and x2 are the throughput averages of TCP stack 1 competing with TCP stack 2
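The asymmetry formula is missing from this text. A natural form consistent with the description is A = (x1 - x2) / (x1 + x2); this is an assumption and may not match the original slide exactly. Unlike the fairness index, such a signed measure shows which stack takes the larger share.

```python
def asymmetry(x1, x2):
    """Signed throughput asymmetry between stack 1 and stack 2, in [-1, 1]."""
    return (x1 - x2) / (x1 + x2)


print(asymmetry(300.0, 300.0))  # 0.0: equal shares
print(asymmetry(400.0, 100.0))  # positive: stack 1 is the more aggressive
print(asymmetry(100.0, 400.0))  # negative: stack 1 is the more timid
```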

Inter Fairness – UFl (A)

Reverse Traffic
Cause queuing on reverse path by using P-TCP 16 streams
ACKs are lost or come back in bursts (compressed ACKs)
Fast TCP throughput is 4 to 8 times less than the other TCPs.

Future work
Finish measurements to Manchester/CERN
More analysis
Work with Caltech to correlate with simulation
Compare with other people’s measurements
Test Westwood+
Tests with different RTTs on the same link
Try on 10Gbps links
More tests with multiple streams
Look at performance of rate based protocols
Use with production applications

Preliminary Conclusions
Advanced stacks behave like single-stream TCP Reno on short distances for paths up to Gbit/s speeds, especially if the window size is limited
TCP Reno single stream has low performance and is unstable on long distances
P-TCP is very aggressive and impacts the RTT badly
HSTCP-LP is too gentle: by design it backs off quickly, which can be important for providing a scavenger service without router modifications. Otherwise it performs well
Fast TCP is very handicapped by reverse traffic
S-TCP is very aggressive on long distances
HS-TCP is very gentle and, like H-TCP, has lower throughput than the other protocols
Bic-TCP performs very well in almost all cases

More Information
TCP Stacks Evaluation:
www-iepm.slac.stanford.edu/bw/tcp-eval/

Extra Slides

Throughput
With optimal window all stacks within ~20% of one another, except Reno 1 stream on medium and long distances

Inter Fair Caltech

Stability