1
|
- Prepared by Les Cottrell & Hadrien Bullot, SLAC & EPFL, for the
- BaBar/SCS meeting
- November 11, 2003
- www.slac.stanford.edu/grp/scs/net/talk03/babar-long-nov03.html
|
2
|
- Test new advanced TCP stacks, see how they perform on short and
long-distance real production WAN links
- Compare & contrast: ease of configuration, throughput, convergence,
fairness, stability etc.
- For different RTTs, windows, txqueuelen
- Recommend “optimum” stacks for data intensive science (BaBar) transfers
using bbftp, bbcp, GridFTP
- Validate simulator & emulator findings & provide feedback
|
3
|
- TCP only
- No Rate based transport protocols (e.g. SABUL, UDT, RBUDP) at the
moment
- No iSCSI or FC over IP
- Sender mods only; the HENP model is a few big senders and many smaller
receivers
- Simplifies deployment, only a few hosts at a few sending sites
- No DRS (Dynamic Right-Sizing)
- Runs on production nets
- No router mods (XCP/ECN), no jumbo frames
|
4
|
- Linux 2.4 New Reno with SACK: single and parallel streams (P-TCP)
- Scalable TCP (S-TCP)
- Fast TCP
- HighSpeed TCP (HS-TCP)
- HighSpeed TCP Low Priority (HSTCP-LP)
- Binary Increase Control TCP (Bic-TCP)
- Hamilton TCP (H-TCP)
|
5
|
- Low performance on fast long-distance paths
- AIMD (add a=1 packet to cwnd per RTT; decrease cwnd by factor b=0.5 on
congestion)
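The AIMD rule above can be sketched as a per-RTT window update (an illustrative sketch using the standard Reno constants, not code from any kernel):

```python
# Sketch of Reno's AIMD congestion-avoidance update, in packets.
# a = 1 packet added per RTT; b = 0.5 multiplicative decrease on loss.
def aimd_update(cwnd, loss, a=1.0, b=0.5):
    """Return the new congestion window after one RTT."""
    if loss:
        return max(1.0, cwnd * (1 - b))  # halve the window on congestion
    return cwnd + a                      # grow by one packet per RTT
```

At high bandwidth-delay products the single packet added per RTT is why recovery after a loss takes so long, motivating the stacks below.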
|
6
|
- TCP Reno with 16 streams
- Parallel streams heavily used in HENP & elsewhere to achieve needed
performance, so it is today’s de facto baseline
- However, hard to optimize both the window size AND number of streams
since optimal values can vary due to network capacity, routes or
utilization changes
|
7
|
- Uses exponential increase everywhere (in slow start and congestion
avoidance)
- Multiplicative decrease factor b = 0.125
- Introduced by Tom Kelly of Cambridge
|
8
|
- Based on TCP Vegas
- Uses both queuing delay and packet losses as congestion measures
- Developed at Caltech by Steven Low and collaborators
|
9
|
- Behaves like Reno for small values of cwnd
- Above a chosen value of cwnd (default 38) a more aggressive function is
used
- Uses a table to indicate by how much to increase cwnd when an ACK is
received
- Introduced by Sally Floyd
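The table-driven increase can be sketched as a lookup (the entries shown are the first few from the RFC 3649 response table; treat the exact values as indicative):

```python
# Sketch of HS-TCP's table lookup: below cwnd = 38 packets behave like
# Reno (increase a = 1); above it use a larger table-driven increase.
# First few (cwnd, a) entries, after the RFC 3649 response function.
HSTCP_TABLE = [(38, 1), (118, 2), (221, 3), (347, 4)]

def hstcp_increase(cwnd):
    """Packets to add to cwnd per RTT at this window size."""
    a = 1
    for threshold, inc in HSTCP_TABLE:
        if cwnd >= threshold:
            a = inc
    return a
```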
|
10
|
- Mixture of HS-TCP with TCP-LP (Low Priority)
- Backs off early in the face of congestion by looking at RTT
- Idea is to give scavengers service without router modifications
- From Rice University
|
11
|
- Combine:
- An additive increase used for large cwnd
- A binary search increase used for small cwnd
- Developed by Injong Rhee at NC State University
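The binary-search part can be sketched as follows (w_max is the window at the last loss; S_MAX caps the per-RTT jump and is an illustrative value, not Bic-TCP's tuned constant):

```python
# Sketch of Bic-TCP's binary-search increase: each RTT the window jumps
# toward the midpoint between the current window and the pre-loss window
# w_max, with the jump capped at S_MAX (additive increase for large gaps).
S_MAX = 32  # illustrative cap on the per-RTT increment, in packets

def bic_increase(cwnd, w_max):
    midpoint = (cwnd + w_max) / 2
    return cwnd + min(midpoint - cwnd, S_MAX)
```

Far below w_max the cap makes growth additive; near w_max the midpoint steps shrink, probing gently for the old operating point.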
|
12
|
- Similar to HS-TCP in switching to a more aggressive mode above a threshold
- Uses a heterogeneous AIMD algorithm
- Developed at the Hamilton Institute, Ireland
|
13
|
- 20 minute tests, long enough to see stable patterns
- Iperf reports incremental and cumulative throughputs at 5 second
intervals
- Ping interval about 100ms
- At sender: use 1st machine for iperf/TCP, 2nd for cross-traffic (UDP or
TCP), 3rd for ping
- At receiver: use 1st machine for ping (echo) and TCP, 2nd for
cross-traffic
|
14
|
- 3 main network paths
- Short distance: SLAC-Caltech (RTT~10ms)
- Middle distance: UFL and DataTAG Chicago (RTT~70ms)
- Long distance: CERN and University of Manchester (RTT ~ 170ms)
- Tests during nights and weekends to avoid unacceptable impacts on
production traffic
|
15
|
- Set large maximum windows (typically 32MB) on all hosts
- Used 3 different windows with iperf:
- Small window size, factor 2-4 below optimal
- Roughly optimal window size (~BDP)
- Oversized window
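The "roughly optimal" window is the bandwidth-delay product; a quick sketch of the arithmetic (the 622 Mbit/s figure is an assumed example path speed, not a measurement from these tests):

```python
# Bandwidth-delay product: the window (in bytes) needed to keep a path
# of the given bandwidth and RTT full.
def bdp_bytes(bandwidth_bps, rtt_s):
    return bandwidth_bps * rtt_s / 8

# e.g. an assumed 622 Mbit/s path at the long-distance RTT of 170 ms:
# about 13.2 MB, comfortably under the 32 MB maximum window.
window = bdp_bytes(622e6, 0.170)
```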
|
16
|
- Only P-TCP appears to dramatically affect the RTT
- E.g. it increases the RTT by 200ms (a factor of 20 on the short-distance
path)
|
17
|
- Regulates the size of the queue between the IP layer and the Ethernet
layer
- May increase the throughput if we find optimal values
- But may increase duplicate ACKs (Y.-T. Li)
|
18
|
|
19
|
|
20
|
- Definition: standard deviation normalized by the average throughput
- At short RTT (10ms) stability is usually good (<=12%)
- At medium RTT (70ms) P-TCP, Scalable & Bic-TCP appear more stable than
the other protocols
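The stability measure defined above (standard deviation normalized by the mean, i.e. the coefficient of variation) applied to iperf's 5-second throughput samples:

```python
import statistics

# Stability as defined above: standard deviation of the throughput
# samples normalised by their average (coefficient of variation).
def stability(samples):
    return statistics.stdev(samples) / statistics.mean(samples)
```

For example, samples of 90, 100 and 110 Mbits/s give a stability of 0.10, i.e. 10%, within the <=12% observed at short RTT.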
|
21
|
- UDP does not back off in the face of congestion; it has a “stiff” behavior
- We modified iperf to allow it to create UDP traffic with a sinusoidal
time behavior, following an idea from Tom Hacker
- See how TCP responds to varying cross-traffic
- Used 2 periods of 30 and 60 seconds and amplitude varying from 20 to 80
Mbps
- Sent from 2nd sending host to 2nd receiving host
while sending TCP from 1st sending host to 1st
receiving host
- As long as the window size was large enough, all protocols converged
quickly and maintained a roughly constant aggregate throughput
- Especially for P-TCP & Bic-TCP
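Our reading of the sinusoidal cross-traffic schedule, as a sketch (the mean rate of 50 Mbits/s is an assumption chosen so that a 30 Mbits/s amplitude spans the 20-80 Mbits/s range above; this is not the modified iperf code itself):

```python
import math

# Sketch of the sinusoidal UDP cross-traffic rate at time t (seconds):
# a mean rate with a sinusoidal swing of the given amplitude and period.
def udp_rate_mbps(t, mean=50.0, amplitude=30.0, period=60.0):
    return mean + amplitude * math.sin(2 * math.pi * t / period)
```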
|
22
|
- Stability better at short distances
- P-TCP & Bic more stable
|
23
|
|
24
|
- Important to understand how fair a protocol is
- For one protocol competing against the same protocol (intra-protocol) we
define a fairness index F for a single bottleneck
- All protocols have good intra-protocol Fairness (F>0.98)
- Except HS-TCP (F<0.94) when the window size > optimal
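The single-bottleneck fairness index is, we believe, Jain's index; as a sketch:

```python
# Presumed definition: Jain's fairness index for n competing flows.
# F = 1 when all flows get an equal share; F -> 1/n when one flow
# takes everything.
def jain_fairness(throughputs):
    n = len(throughputs)
    return sum(throughputs) ** 2 / (n * sum(x * x for x in throughputs))
```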
|
25
|
- Most have good intra-protocol fairness (diagonal elements), except
HS-TCP
- Inter-protocol: Bic-TCP & H-TCP appear fairer against the others
- Worst fairness is seen for HSTCP-LP, P-TCP, S-TCP & Fast
- But cannot tell who is aggressive and who is timid
|
26
|
- For inter-protocol fairness we introduce the asymmetry A between the two
throughputs
- Where x1 and x2 are the average throughputs of TCP stack 1 and TCP stack
2 competing with each other
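The asymmetry measure can be sketched as follows (assuming the usual normalized-difference form):

```python
# Asymmetry between the two average throughputs: A = (x1 - x2)/(x1 + x2).
# A = 0 means a fair split; |A| near 1 means one stack starves the other.
def asymmetry(x1, x2):
    return (x1 - x2) / (x1 + x2)
```

Unlike the fairness index F, the sign of A shows which of the two stacks is the aggressive one.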
|
27
|
|
28
|
- Cause queuing on reverse path by using P-TCP 16 streams
- ACKs are lost or come back in bursts (compressed ACKs)
- Fast TCP throughput is 4 to 8 times lower than that of the other TCPs
|
29
|
- Finish measurements to Manchester/CERN
- More analysis
- Work with Caltech to correlate with simulation
- Compare with other people’s measurements
- Test Westwood+
- Tests with different RTTs on the same link
- Try on 10Gbps links
- More tests with multiple streams
- Look at performance of rate based protocols
- Use with production applications
|
30
|
- Advanced stacks behave like single-stream TCP Reno on short distances on
paths of up to Gbit/s speeds, especially if the window size is limited
- TCP Reno single stream has low performance and is unstable on long
distances
- P-TCP is very aggressive and impacts the RTT badly
- HSTCP-LP is gentle by design, backing off quickly in the face of
congestion; this can be important for providing scavenger service without
router modifications. Otherwise it performs well
- Fast TCP is very handicapped by reverse traffic
- S-TCP is very aggressive on long distances
- HS-TCP is very gentle and, like H-TCP, has lower throughput than the
other protocols
- Bic-TCP performs very well in almost all cases
|
31
|
- TCP Stacks Evaluation:
- www-iepm.slac.stanford.edu/bw/tcp-eval/
|
32
|
|
33
|
- With an optimal window, all stacks are within ~20% of one another, except
single-stream Reno on medium and long distances
|
34
|
|
35
|
|