1
|
- Prepared by Les Cottrell & Hadrien Bullot, SLAC & EPFL, for the
- BaBar/SCS meeting
- November 11, 2003
- www.slac.stanford.edu/grp/scs/net/talk03/babar-long-nov03.html
|
2
|
- Test new advanced TCP stacks, see how they perform on short and
long-distance real production WAN links
- Compare & contrast: ease of configuration, throughput, convergence,
fairness, stability etc.
- For different RTTs, windows, txqueuelen
- Recommend “optimum” stacks for data intensive science (BaBar) transfers
using bbftp, bbcp, GridFTP
- Validate simulator & emulator findings & provide feedback
|
3
|
- TCP only
- No Rate based transport protocols (e.g. SABUL, UDT, RBUDP) at the
moment
- No iSCSI or FC over IP
- Sender mods only; the HENP model is a few big senders and many smaller
receivers
- Simplifies deployment, only a few hosts at a few sending sites
- No DRS (Dynamic Right-Sizing)
- Runs on production nets
- No router mods (XCP/ECN), no jumbo frames
|
4
|
- Linux 2.4 New Reno with SACK: single and parallel streams (P-TCP)
- Scalable TCP (S-TCP)
- Fast TCP
- HighSpeed TCP (HS-TCP)
- HighSpeed TCP Low Priority (HSTCP-LP)
- Binary Increase Control TCP (Bic-TCP)
- Hamilton TCP (H-TCP)
|
5
|
- Low performance on fast long-distance paths
- AIMD (add a=1 packet to cwnd per RTT; decrease cwnd by factor b=0.5 on
congestion)
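The AIMD rule above can be sketched as a per-RTT window update (an illustrative sketch using the standard Reno constants, not code from any kernel):

```python
# Sketch of Reno's AIMD congestion-avoidance update, in packets.
# a = 1 packet added per RTT; b = 0.5 multiplicative decrease on loss.
def aimd_update(cwnd, loss, a=1.0, b=0.5):
    """Return the new congestion window after one RTT."""
    if loss:
        return max(1.0, cwnd * (1 - b))  # halve the window on congestion
    return cwnd + a                      # grow by one packet per RTT
```

At high bandwidth-delay products the single packet added per RTT is why recovery after a loss takes so long, motivating the stacks below.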
|
6
|
- TCP Reno with 16 streams
- Parallel streams heavily used in HENP & elsewhere to achieve needed
performance, so it is today’s de facto baseline
- However, hard to optimize both the window size AND number of streams
since optimal values can vary due to network capacity, routes or
utilization changes
|
7
|
- Uses exponential increase everywhere (in slow start and congestion
avoidance)
- Multiplicative decrease factor b = 0.125
- Introduced by Tom Kelly of Cambridge
|
8
|
- Based on TCP Vegas
- Uses both queuing delay and packet losses as congestion measures
- Developed at Caltech by Steven Low and collaborators
|
9
|
- Behaves like Reno for small values of cwnd
- Above a chosen value of cwnd (default 38) a more aggressive function is
used
- Uses a table to indicate by how much to increase cwnd when an ACK is
received
- Introduced by Sally Floyd
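The table-driven increase can be sketched as a lookup (the entries shown are the first few from the RFC 3649 response table; treat the exact values as indicative):

```python
# Sketch of HS-TCP's table lookup: below cwnd = 38 packets behave like
# Reno (increase a = 1); above it use a larger table-driven increase.
# First few (cwnd, a) entries, after the RFC 3649 response function.
HSTCP_TABLE = [(38, 1), (118, 2), (221, 3), (347, 4)]

def hstcp_increase(cwnd):
    """Packets to add to cwnd per RTT at this window size."""
    a = 1
    for threshold, inc in HSTCP_TABLE:
        if cwnd >= threshold:
            a = inc
    return a
```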
|
10
|
- Mixture of HS-TCP with TCP-LP (Low Priority)
- Backs off early in the face of congestion by looking at RTT
- Idea is to give scavengers service without router modifications
- From Rice University
|
11
|
- Combine:
- An additive increase used for large cwnd
- A binary search increase used for small cwnd
- Developed by Injong Rhee at NC State University
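The binary-search part can be sketched as follows (w_max is the window at the last loss; S_MAX caps the per-RTT jump and is an illustrative value, not Bic-TCP's tuned constant):

```python
# Sketch of Bic-TCP's binary-search increase: each RTT the window jumps
# toward the midpoint between the current window and the pre-loss window
# w_max, with the jump capped at S_MAX (additive increase for large gaps).
S_MAX = 32  # illustrative cap on the per-RTT increment, in packets

def bic_increase(cwnd, w_max):
    midpoint = (cwnd + w_max) / 2
    return cwnd + min(midpoint - cwnd, S_MAX)
```

Far below w_max the cap makes growth additive; near w_max the midpoint steps shrink, probing gently for the old operating point.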
|
12
|
- Similar to HS-TCP in switching to a more aggressive mode above a threshold
- Uses a heterogeneous AIMD algorithm
- Developed at the Hamilton Institute, Ireland
|
13
|
- 20 minute tests, long enough to see stable patterns
- Iperf reports incremental and cumulative throughputs at 5 second
intervals
- Ping interval about 100ms
- At sender: use 1st machine for iperf/TCP, 2nd for cross-traffic (UDP or
TCP), 3rd for ping
- At receiver: use 1st machine for ping (echo) and TCP, 2nd for
cross-traffic
|
14
|
- 3 main network paths
- Short distance: SLAC-Caltech (RTT~10ms)
- Middle distance: UFL and DataTAG Chicago (RTT~70ms)
- Long distance: CERN and University of Manchester (RTT ~ 170ms)
- Tests during nights and weekends to avoid unacceptable impacts on
production traffic
|
15
|
- Set large maximum windows (typically 32MB) on all hosts
- Used 3 different windows with iperf:
- Small window size, factor 2-4 below optimal
- Roughly optimal window size (~BDP)
- Oversized window
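The "roughly optimal" window is the bandwidth-delay product; a quick sketch of the arithmetic (the 622 Mbit/s figure is an assumed example path speed, not a measurement from these tests):

```python
# Bandwidth-delay product: the window (in bytes) needed to keep a path
# of the given bandwidth and RTT full.
def bdp_bytes(bandwidth_bps, rtt_s):
    return bandwidth_bps * rtt_s / 8

# e.g. an assumed 622 Mbit/s path at the long-distance RTT of 170 ms:
# about 13.2 MB, comfortably under the 32 MB maximum window.
window = bdp_bytes(622e6, 0.170)
```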
|
16
|
- Only P-TCP appears to dramatically affect the RTT
- E.g. it increases the RTT by 200ms (a factor of 20 on the short-distance
path)
|
17
|
- Regulates the size of the queue between the IP layer and the Ethernet
layer
- May increase the throughput if we find optimal values
- But may increase duplicate ACKs (Y.-T. Li)
|
18
|
|
19
|
|
20
|
- Definition: standard deviation normalized by the average throughput
- At short RTT (10ms) stability is usually good (<=12%)
- At medium RTT (70ms) P-TCP, Scalable & Bic-TCP appear more stable than
the other protocols
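The stability measure defined above (standard deviation normalized by the mean, i.e. the coefficient of variation) applied to iperf's 5-second throughput samples:

```python
import statistics

# Stability as defined above: standard deviation of the throughput
# samples normalised by their average (coefficient of variation).
def stability(samples):
    return statistics.stdev(samples) / statistics.mean(samples)
```

For example, samples of 90, 100 and 110 Mbits/s give a stability of 0.10, i.e. 10%, within the <=12% observed at short RTT.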
|
21
|
- UDP does not back off in the face of congestion; it has a “stiff” behavior
- We modified iperf to allow it to create UDP traffic with a sinusoidal
time behavior, following an idea from Tom Hacker
- See how TCP responds to varying cross-traffic
- Used 2 periods of 30 and 60 seconds and amplitude varying from 20 to 80
Mbps
- Sent from 2nd sending host to 2nd receiving host
while sending TCP from 1st sending host to 1st
receiving host
- As long as the window size was large enough, all protocols converged
quickly and maintained a roughly constant aggregate throughput
- Especially for P-TCP & Bic-TCP
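Our reading of the sinusoidal cross-traffic schedule, as a sketch (the mean rate of 50 Mbits/s is an assumption chosen so that a 30 Mbits/s amplitude spans the 20-80 Mbits/s range above; this is not the modified iperf code itself):

```python
import math

# Sketch of the sinusoidal UDP cross-traffic rate at time t (seconds):
# a mean rate with a sinusoidal swing of the given amplitude and period.
def udp_rate_mbps(t, mean=50.0, amplitude=30.0, period=60.0):
    return mean + amplitude * math.sin(2 * math.pi * t / period)
```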
|
22
|
- Stability better at short distances
- P-TCP & Bic more stable
|
23
|
|
24
|
- Important to understand how fair a protocol is
- For one protocol competing against the same protocol (intra-protocol) we
define a fairness index F for a single bottleneck
- All protocols have good intra-protocol Fairness (F>0.98)
- Except HS-TCP (F<0.94) when the window size > optimal
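The single-bottleneck fairness index is, we believe, Jain's index; as a sketch:

```python
# Presumed definition: Jain's fairness index for n competing flows.
# F = 1 when all flows get an equal share; F -> 1/n when one flow
# takes everything.
def jain_fairness(throughputs):
    n = len(throughputs)
    return sum(throughputs) ** 2 / (n * sum(x * x for x in throughputs))
```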
|
25
|
- Most have good intra-protocol fairness (diagonal elements), except
HS-TCP
- Inter-protocol: Bic-TCP & H-TCP appear fairer against the others
- Worst fairness is seen for HSTCP-LP, P-TCP, S-TCP & Fast
- But cannot tell who is aggressive and who is timid
|
26
|
- For inter-protocol fairness we introduce the asymmetry A between the two
throughputs
- Where x1 and x2 are the average throughputs of TCP stack 1 and TCP stack
2 competing with each other
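The asymmetry measure can be sketched as follows (assuming the usual normalized-difference form):

```python
# Asymmetry between the two average throughputs: A = (x1 - x2)/(x1 + x2).
# A = 0 means a fair split; |A| near 1 means one stack starves the other.
def asymmetry(x1, x2):
    return (x1 - x2) / (x1 + x2)
```

Unlike the fairness index F, the sign of A shows which of the two stacks is the aggressive one.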
|
27
|
|
28
|
- Cause queuing on reverse path by using P-TCP 16 streams
- ACKs are lost or come back in bursts (compressed ACKs)
- Fast TCP throughput is 4 to 8 times lower than that of the other TCPs
|
29
|
- Finish measurements to Manchester/CERN
- More analysis
- Work with Caltech to correlate with simulation
- Compare with other people’s measurements
- Test Westwood+
- Tests with different RTTs on the same link
- Try on 10Gbps links
- More tests with multiple streams
- Look at performance of rate based protocols
- Use with production applications
|
30
|
- Advanced stacks behave like single-stream TCP Reno on short distances on
paths of up to Gbit/s speeds, especially if the window size is limited
- TCP Reno single stream has low performance and is unstable on long
distances
- P-TCP is very aggressive and impacts the RTT badly
- HSTCP-LP is gentle by design, backing off quickly in the face of
congestion; this can be important for providing scavenger service without
router modifications. Otherwise it performs well
- Fast TCP is very handicapped by reverse traffic
- S-TCP is very aggressive on long distances
- HS-TCP is very gentle and, like H-TCP, has lower throughput than the
other protocols
- Bic-TCP performs very well in almost all cases
|
31
|
- TCP Stacks Evaluation:
- www-iepm.slac.stanford.edu/bw/tcp-eval/
|
32
|
|
33
|
- With an optimal window, all stacks are within ~20% of one another, except
single-stream Reno on medium and long distances
|
34
|
|
35
|
|