Evaluation of Advanced TCP Stacks on Fast Long-Distance Production Networks
Prepared by Les Cottrell & Hadrien Bullot, SLAC & EPFL, for the BaBar/SCS collaboration
November 2003
Test new advanced TCP stacks and see how they perform on short- and long-distance real production WAN links
Compare & contrast: ease of configuration, throughput, convergence, fairness, stability, etc.
For different RTTs, window sizes, and txqueuelen settings
Recommend “optimum” stacks for data-intensive science (BaBar) transfers using bbftp, bbcp, GridFTP
Validate simulator & emulator findings & provide feedback
TCP only
No rate-based transport protocols (e.g. SABUL, UDT, RBUDP) at the moment
No iSCSI or FC over IP
Sender mods only; the HENP model is a few big senders, lots of smaller receivers
Simplifies deployment: only a few hosts at a few sending sites
No DRS (Dynamic Right-Sizing)
Runs on production nets
No router mods (XCP/ECN), no jumbo frames
Linux 2.4 New Reno with SACK: single and parallel streams (P-TCP)
Scalable TCP (S-TCP)
FAST TCP
HighSpeed TCP (HS-TCP)
HighSpeed TCP Low Priority (HSTCP-LP)
Binary Increase Control TCP (Bic-TCP)
Hamilton TCP (H-TCP)
Low performance on fast long-distance paths
AIMD: additive increase adds a = 1 packet to cwnd per RTT; multiplicative decrease halves cwnd (b = 0.5) on congestion
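To make the slow recovery concrete, here is a minimal, idealized sketch of the Reno AIMD update (cwnd counted in segments; a simplification, not the kernel code):

```python
def aimd_update(cwnd: float, loss: bool, a: float = 1.0, b: float = 0.5) -> float:
    """Idealized TCP Reno AIMD step, applied once per RTT (cwnd in segments)."""
    if loss:
        return cwnd * b      # multiplicative decrease: halve the window
    return cwnd + a          # additive increase: one more segment per RTT

# On a 1 Gbit/s, 170 ms path the full window is ~14,000 x 1500-byte segments;
# after one loss, growing back at 1 segment/RTT takes ~7,000 RTTs (~20 minutes).
cwnd = 14000.0
cwnd = aimd_update(cwnd, loss=True)   # -> 7,000 segments
```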
TCP Reno with 16 streams
Parallel streams are heavily used in HENP & elsewhere to achieve the needed performance, so this is today’s de facto baseline
However, it is hard to optimize both the window size AND the number of streams, since the optimal values vary with changes in network capacity, routes, or utilization
20-minute tests, long enough to see stable patterns
Iperf reports incremental and cumulative throughputs at 5-second intervals
Ping interval ~100 ms
At sender: 1st machine for iperf/TCP, 2nd for cross-traffic (UDP or TCP), 3rd for ping
At receiver: 1 machine for ping (echo) and TCP, 2nd for cross-traffic
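A minimal sketch of how one such run could be scripted (a hypothetical wrapper: the host name is made up, an iperf server is assumed to be already listening on the receiver, and sub-200 ms ping intervals need root on Linux):

```python
import subprocess

RECEIVER = "receiver.example.org"  # hypothetical; not one of the actual test hosts

# 20-minute iperf run: 5-second reports, 32 MB window, 16 parallel streams (P-TCP baseline)
iperf = subprocess.Popen(
    ["iperf", "-c", RECEIVER, "-t", "1200", "-i", "5", "-w", "32M", "-P", "16"],
    stdout=open("iperf.log", "w"),
)

# Concurrent ping at ~100 ms intervals to track RTT under load
ping = subprocess.Popen(
    ["ping", "-i", "0.1", RECEIVER],
    stdout=open("ping.log", "w"),
)

iperf.wait()      # wait out the 20-minute transfer
ping.terminate()  # then stop the RTT probe
```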
3 main network paths
Short distance: SLAC-Caltech (RTT ~10 ms)
Middle distance: UFL and DataTAG Chicago (RTT ~70 ms)
Long distance: CERN and University of Manchester (RTT ~170 ms)
Tests run during nights and weekends to avoid unacceptable impact on production traffic
Set large maximum windows (typically 32 MB) on all hosts
Used 3 different windows with iperf:
Small window size, a factor of 2-4 below optimal
Roughly optimal window size (~BDP = bandwidth × RTT; see the sketch below)
Oversized window
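For reference, a small sketch of the BDP arithmetic behind these window choices, assuming nominal 1 Gbit/s paths and the three RTTs above (illustrative numbers only):

```python
def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to fill the pipe."""
    return bandwidth_bps / 8 * rtt_s

GBPS = 1e9
for path, rtt_ms in [("SLAC-Caltech", 10), ("Chicago", 70), ("CERN/Manchester", 170)]:
    mb = bdp_bytes(GBPS, rtt_ms / 1000) / 2**20
    print(f"{path:16s} RTT ~{rtt_ms:3d} ms -> BDP ~{mb:5.1f} MB")

# SLAC-Caltech     RTT ~ 10 ms -> BDP ~  1.2 MB
# Chicago          RTT ~ 70 ms -> BDP ~  8.3 MB
# CERN/Manchester  RTT ~170 ms -> BDP ~ 20.3 MB  (hence the 32 MB maximum windows)
```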
Only P-TCP appears to dramatically affect the RTT
E.g. it increases the RTT by ~200 ms (a factor of ~20 on the short, ~10 ms path)
Important to understand how fair a protocol is
For one protocol competing against the same protocol (intra-protocol), we define the fairness for a single bottleneck as:
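Assuming the standard Jain fairness index over the average throughputs x_i of the n competing flows, this would be:

```latex
F = \frac{\left(\sum_{i=1}^{n} x_i\right)^{2}}{n \sum_{i=1}^{n} x_i^{2}}
```

F = 1 when all flows see identical throughput, and falls toward 1/n as one flow dominates.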
All protocols have good intra-protocol fairness (F > 0.98)
Except HS-TCP (F < 0.94) when the window size > optimal
For inter-protocol fairness we introduce the asymmetry between the two throughputs:
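Assuming the usual symmetric form for such an asymmetry measure (consistent with the definition that follows), this would be:

```latex
A = \frac{x_1 - x_2}{x_1 + x_2}
```

A = 0 then indicates perfectly fair sharing, while |A| → 1 indicates one stack starving the other.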
where x1 and x2 are the throughput averages of TCP stack 1 competing with TCP stack 2
Advanced stacks behave like single-stream TCP Reno on short distances for up to Gbit/s paths, especially if the window size is limited
Single-stream TCP Reno has low performance and is unstable on long distances
P-TCP is very aggressive and impacts the RTT badly
The new TCP stacks work well with a single stream; most are fair and stable
Ready to put into use with production applications
TCP Stacks Evaluation:
www-iepm.slac.stanford.edu/bw/tcp-eval/