FAST TCP for Multi-Gbps WAN: Experiments and Applications

Les Cottrell & Fabrizio Coccetti, SLAC

Prepared for Internet2, Washington, April 2003

http://www.slac.stanford.edu/grp/scs/net/talk/fast-i2-apr03.html
Outline

- High throughput challenges
- New TCP stacks
- Tests on unloaded (testbed) links
  - Performance of multiple streams
  - Performance of various stacks
- Tests on production networks
  - Stack comparisons with single streams
  - Stack comparisons with multiple streams
  - Fairness
- Where do I find out more?
High Speed Challenges

- After a loss it can take over an hour for stock TCP (Reno) to recover to maximum throughput at 1 Gbits/s
- i.e. sustaining full rate needs a loss rate below 1 in ~2 Gpkts (3 Tbits), or a BER of 1 in 3.6×10^12
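As a rough sanity check on the recovery-time claim, here is a minimal back-of-the-envelope sketch. The 1500 B MTU and 180 ms RTT are illustrative assumptions, and the model is the textbook AIMD one (cwnd halves on loss, then grows by one packet per RTT):

```python
# Back-of-the-envelope AIMD recovery for stock TCP (Reno).
# Assumptions (illustrative): 1 Gbps link, 1500 B packets, 180 ms RTT,
# cwnd halves on loss and then grows by 1 packet per RTT.
rate_bps = 1e9
pkt_bits = 1500 * 8
rtt_s = 0.180

# Window (in packets) needed to fill the pipe: the bandwidth-delay product.
w = rate_bps * rtt_s / pkt_bits              # ~15,000 packets

# Recovery takes ~w/2 RTTs (twice that with delayed ACKs,
# which acknowledge only every other segment).
recovery_s = (w / 2) * rtt_s
print(f"window ~{w:,.0f} pkts; recovery ~{recovery_s/60:.0f} min "
      f"(~{recovery_s*2/60:.0f} min with delayed ACKs)")
```

This gives roughly 22 min (or 45 min with delayed ACKs) at 180 ms; at the ~230 ms SLAC to CERN RTT used later in the talk, the delayed-ACK estimate exceeds an hour.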
New TCP Stacks

- Reno (AIMD) based: loss indicates congestion
  - Back off less when congestion is seen
  - Recover more quickly after backing off
  - Scalable TCP: exponential recovery
    - Tom Kelly, "Scalable TCP: Improving Performance in Highspeed Wide Area Networks", submitted for publication, December 2002
  - High Speed TCP: same as Reno at low performance, then increases the window more and more aggressively as the window grows, using a table
- Vegas based: RTT indicates congestion
  - Caltech FAST TCP: quicker response to congestion, but … (the update rules are sketched below)
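For concreteness, a simplified sketch of the per-ACK / per-loss window-update rules behind these stacks. The Scalable TCP constants are from Kelly's paper and the FAST rule follows the published update; the HSTCP constants shown are roughly one high-window row of Floyd's table, so treat the exact values as illustrative:

```python
# Simplified congestion-window update rules (cwnd in packets).

def reno(cwnd, loss):
    # AIMD: +1 packet per RTT (1/cwnd per ACK), halve on loss.
    return cwnd * 0.5 if loss else cwnd + 1.0 / cwnd

def scalable(cwnd, loss):
    # Kelly's Scalable TCP: a = 0.01 per ACK, b = 1/8 on loss.
    # Growth is multiplicative, so recovery takes a constant number
    # of RTTs ("exponential recovery") instead of scaling with cwnd.
    return cwnd * (1 - 0.125) if loss else cwnd + 0.01

def hstcp(cwnd, loss, a=72, b=0.1):
    # High Speed TCP: a(w) and b(w) come from a table keyed on cwnd;
    # a=72, b=0.1 is roughly the ~83,000-packet end of the table.
    # At small cwnd the table reduces to Reno (a=1, b=0.5).
    return cwnd * (1 - b) if loss else cwnd + a / cwnd

def fast(w, base_rtt, rtt, alpha=200, gamma=0.5):
    # FAST TCP is delay-based: it reacts to queueing delay
    # (rtt - base_rtt) at every update interval, not to loss.
    return min(2 * w, (1 - gamma) * w + gamma * (base_rtt / rtt * w + alpha))
```

The Vegas branch differs in kind: FAST adjusts on every RTT measurement rather than waiting for a loss, which is why it can respond to congestion more quickly.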
Typical testbed
Testbed Collaborators and Sponsors

- Caltech: Harvey Newman, Steven Low, Sylvain Ravot, Cheng Jin, Xiaoling Wei, Suresh Singh, Julian Bunn
- SLAC: Les Cottrell, Gary Buhrmaster, Fabrizio Coccetti
- LANL: Wu-chun Feng, Eric Weigle, Gus Hurwitz, Adam Englehart
- NIKHEF/UvA: Cees DeLaat, Antony Antony
- CERN: Olivier Martin, Paolo Moroni
- ANL: Linda Winkler
- DataTAG, StarLight, TeraGrid, SURFnet, NetherLight, Deutsche Telekom, Information Society Technologies
- Cisco, Level(3), Intel
- DoE, European Commission, NSF
Windows and Streams

- It is well accepted that multiple streams (n) and/or big windows are important to achieve optimal throughput
- Multiple streams effectively reduce the impact of a loss by 1/n and improve recovery time by 1/n (see the arithmetic sketched below)
- The optimum windows & streams change as the path changes (e.g. with utilization), so n is hard to optimize
- Can be unfriendly to others
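A minimal sketch of the arithmetic behind the 1/n claim. The 1 Gbps rate, 180 ms RTT, and 1500 B MTU are illustrative assumptions, not measurements from the talk:

```python
# Why n parallel streams soften a single loss: each stream holds
# ~1/n of the aggregate window, so halving one stream's window
# removes only 1/(2n) of the aggregate, and that stream needs
# ~W/(2n) RTTs (not W/2) to climb back.
rate_bps, rtt_s, mtu_bits = 1e9, 0.180, 1500 * 8
W = rate_bps * rtt_s / mtu_bits        # aggregate window, in packets

for n in (1, 4, 16):
    drop = 1 / (2 * n)                 # fraction of aggregate rate lost
    recovery_s = (W / (2 * n)) * rtt_s # time for the hit stream to recover
    print(f"n={n:2d}: lose {drop:.1%} of rate, recover in ~{recovery_s:.0f} s")
```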
Even with big windows (1 MB), still need multiple streams with standard TCP

- Above the knee, performance still improves slowly, perhaps because the large number of streams squeezes out other traffic and takes more than a fair share
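One reason a 1 MB window alone is not enough: a single window-limited stream cannot exceed window/RTT. A quick illustrative computation (the 180 ms RTT is an assumption):

```python
# Throughput ceiling of one window-limited TCP stream: window / RTT.
window_bytes = 1 * 1024**2             # 1 MB window
rtt_s = 0.180                          # assumed WAN RTT
ceiling_mbps = window_bytes * 8 / rtt_s / 1e6
print(f"1 MB window @ {rtt_s*1e3:.0f} ms RTT -> ~{ceiling_mbps:.0f} Mbps per stream")
# ~47 Mbps per stream, so filling 1 Gbps needs on the order of 20 streams.
```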
Stock vs FAST TCP, MTU = 1500 B

- Need to measure all parameters to understand the effects of parameters and configurations:
  - Windows, streams, txqueuelen, TCP stack, MTU, NIC card
- A lot of variables (a sweep of this kind is sketched below)
- Examples of 2 TCP stacks
- FAST TCP no longer needs multiple streams; this is a major simplification (reduces the number of variables to tune by 1)
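A hedged sketch of how such a measurement sweep might be scripted around iperf (the server name and parameter grids below are placeholders; iperf's -w flag sets the window/buffer size and -P the number of parallel streams):

```python
# Sweep TCP window sizes and stream counts against an iperf server.
# "iperf-server.example.org" and the grids below are placeholders.
import itertools
import subprocess

HOST = "iperf-server.example.org"
windows = ["256K", "1M", "4M"]      # -w: socket buffer / window size
streams = [1, 2, 4, 8, 16]          # -P: parallel streams

for w, p in itertools.product(windows, streams):
    out = subprocess.run(
        ["iperf", "-c", HOST, "-w", w, "-P", str(p), "-t", "30", "-f", "m"],
        capture_output=True, text=True).stdout
    print(f"window={w} streams={p}\n{out}")
```

Each combination would typically be repeated and logged; the point is only that adding MTU, TCP stack, and txqueuelen as further axes makes the parameter space grow multiplicatively.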
TCP stacks with 1500 B MTU @ 1 Gbps
Jumbo frames, new TCP stacks at 1 Gbits/s
Production network tests
High Speed TCP vs Reno: 1 stream
Scalable vs multiple streams
FAST & Scalable vs. multi-stream Reno (SLAC > CERN, ~230 ms)
Scalable & FAST TCP with 1 stream vs Reno with n streams
Fairness: FAST vs Reno
Summary (very preliminary)

- With a single flow & an empty network:
  - Can saturate 2.5 Gbps with standard TCP & jumbo frames
  - Can saturate 1 Gbps with new stacks & 1500 B frames, or with standard TCP & jumbos
- On a production network:
  - FAST can take a while to get going
  - Once going, FAST TCP with one stream looks good compared to multi-stream Reno
  - FAST can back down early compared to Reno
  - More work is needed on fairness
- Scalable:
  - Does not look as good vs. multi-stream Reno
What's next?

- Go beyond 2.5 Gbits/s
- Disk-to-disk throughput & useful applications
  - Need faster CPUs (an extra 60% MHz per Mbits/s over plain TCP for disk-to-disk; see the estimate after this list), and need to understand how to use multi-processors
- Further evaluate new stacks with real-world links and other equipment
  - Other NICs
  - Response to congestion, pathologies
  - Fairness
- Deploy for some major (e.g. HENP/Grid) customer applications
- Understand how to make 10GE NICs work well with 1500 B MTUs
- Move from "hero" demonstrations to the commonplace
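A rough illustration of what that CPU figure implies. The 1 MHz-per-Mbps baseline for plain TCP is a common rule of thumb assumed here, not a number from the talk:

```python
# CPU needed for disk-to-disk at a given rate, assuming ~1 MHz/Mbps
# for plain TCP (rule-of-thumb assumption) plus the extra 60%
# quoted in the talk for the disk-to-disk path.
rate_mbps = 2500                      # target: 2.5 Gbps
tcp_mhz_per_mbps = 1.0                # assumed baseline
disk_overhead = 0.60                  # extra 60% from the talk
cpu_ghz = rate_mbps * tcp_mhz_per_mbps * (1 + disk_overhead) / 1000
print(f"~{cpu_ghz:.1f} GHz of CPU for {rate_mbps} Mbps disk-to-disk")
# => ~4.0 GHz, beyond a single 2003-era processor: hence the
# interest in multi-processor hosts.
```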
More Information

- 10GE tests
  - www-iepm.slac.stanford.edu/monitoring/bulk/10ge/
  - sravot.home.cern.ch/sravot/Networking/10GbE/10GbE_test.html
- TCP stacks
  - netlab.caltech.edu/FAST/
  - datatag.web.cern.ch/datatag/pfldnet2003/papers/kelly.pdf
  - www.icir.org/floyd/hstcp.html
- Stack comparisons
  - www-iepm.slac.stanford.edu/monitoring/bulk/fast/
  - www.csm.ornl.gov/~dunigan/net100/floyd.html
  - www-iepm.slac.stanford.edu/monitoring/bulk/tcpstacks/
Extras
FAST TCP vs. Reno: 1 stream
Scalable vs. Reno: 1 stream
Other high speed gotchas

- Large windows and a large number of streams can cause the last stream to take a long time to close
- Linux memory leak
- Linux TCP configuration caching
- What window size is actually used/reported?
- 32-bit counters in iperf and routers wrap; need the latest releases with 64-bit counters (see the sketch below)
- Effects of txqueuelen (the number of packets queued for the NIC)
- Routers that do not pass jumbo frames
- Performance differs between drivers and between NICs from different manufacturers
- May require tuning a lot of parameters
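To see how quickly a 32-bit counter wraps at these speeds, a small illustrative computation:

```python
# A 32-bit byte counter wraps at 2^32 bytes (~4.3 GB).
# At multi-Gbps rates that happens in well under a minute.
WRAP_BYTES = 2**32
for rate_gbps in (1, 2.5, 10):
    secs = WRAP_BYTES * 8 / (rate_gbps * 1e9)
    print(f"{rate_gbps:>4} Gbps: byte counter wraps every ~{secs:.0f} s")
# 1 Gbps -> ~34 s, 2.5 Gbps -> ~14 s, 10 Gbps -> ~3.4 s
```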