Throughput versus loss

Les Cottrell. Last Update: February 16, 2000

### Introduction

The macroscopic behavior of the TCP congestion avoidance algorithm by Mathis, Semke, Mahdavi & Ott in Computer Communication Review, 27(3), July 1997, provides a short and useful formula for the upper bound on the transfer rate:
Rate <= (MSS/RTT)*(1 / sqrt{p})
where:
Rate: is the TCP transfer rate or throughput
MSS: is the maximum segment size (fixed for each Internet path, typically 1460 bytes)
RTT: is the round trip time (as measured by TCP)
p: is the packet loss rate.
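As a minimal sketch, the bound above can be evaluated directly. The function name `mathis_rate` and the choice of units (bytes, seconds, loss as a fraction) are assumptions made for illustration; they are not part of the original paper or this page.

```python
import math

def mathis_rate(mss_bytes: float, rtt_s: float, loss: float) -> float:
    """Upper bound on TCP throughput in bytes/s: (MSS/RTT) * (1/sqrt(p))."""
    if loss <= 0:
        # The bound diverges as the loss goes to zero.
        raise ValueError("the Mathis bound is undefined for zero loss")
    return (mss_bytes / rtt_s) / math.sqrt(loss)

# Example: MSS = 1460 bytes, RTT = 100 ms, 1% packet loss.
rate = mathis_rate(1460, 0.100, 0.01)
print(f"{rate:.0f} bytes/s = {rate * 8 / 1e6:.3f} Mbit/s")
```

For these inputs the bound is about 146000 bytes/s, i.e. roughly 1.17 Mbit/s.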

Note that the Mathis formula fails for 0 packet loss. Possible solutions are:

• Assume 0.5 packets were lost. E.g., if you send 10 packets every 30 minutes for 1 year, then 48 (30-minute intervals per day) * 10 packets * 365 days = 175200 pings, giving a loss rate of 0.5/175200 = 0.00000285.
• Take the Padhye estimate (see below) for 0 packet loss, i.e. Rate = Wmax / RTT. The default Linux Wmax is 128 kBytes (see Linux Tune Network Stack (Buffers Size) To Increase Networking Performance).
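The two workarounds can be sketched as follows. The helper names are hypothetical, and the 8192-byte window and 100 ms RTT in the example are illustrative inputs only (8192 bytes is the Solaris 2.6 default quoted elsewhere on this page).

```python
def zero_loss_floor(packets_sent: int, assumed_lost: float = 0.5) -> float:
    """Loss rate to substitute when no packets were observed lost."""
    return assumed_lost / packets_sent

def window_limited_rate(wmax_bytes: float, rtt_s: float) -> float:
    """Zero-loss estimate Rate = Wmax / RTT, in bytes/s."""
    return wmax_bytes / rtt_s

# 10 packets every 30 minutes for one year.
packets_per_year = 10 * 48 * 365
floor = zero_loss_floor(packets_per_year)
print(packets_per_year, f"{floor:.3g}")

# Illustrative only: an 8192-byte window over a 100 ms path.
print(window_limited_rate(8192, 0.100), "bytes/s")
```

The first workaround yields a loss floor of about 2.85e-6, matching the arithmetic in the list above.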

An improved form of the above formula, which takes into account the TCP initial retransmit timer and the maximum TCP window size and is generally more accurate for larger (> 2%) packet losses, can be found in: Modelling TCP throughput: A simple model and its empirical validation by J. Padhye, V. Firoiu, D. Towsley and J. Kurose, in Proc. SIGCOMM Symp. Communications Architectures and Protocols, Aug. 1998, pp. 304-314. The formula is given below (derived from eqn 31 of Padhye et al.):

if w{p} < wmax:

Rate = MSS * [((1-p)/p) + w{p} + Q{p,w{p}}/(1-p)] /
[RTT * (w{p}+1) + (Q{p,w{p}}*G{p}*T0)/(1-p)]

otherwise:

Rate = MSS * [((1-p)/p) + wmax + Q{p,wmax}/(1-p)] /
[RTT * (0.25*wmax + (1-p)/(p*wmax) + 2) + (Q{p,wmax}*G{p}*T0)/(1-p)]

Where:

We have assumed the number of packets acknowledged by a received ACK is 2 (this is b in the Padhye et al. formula 31)
wmax is the maximum congestion window size
w{p} = (2/3)*(1 + sqrt{3*((1-p)/p) + 1})
from eqn. 13 of Padhye et al. substituting b=2

Q{p,w} = min{1, [(1-(1-p)^3)*(1+(1-p)^3*(1-(1-p)^(w-3)))] /
[1-(1-p)^w]}
G{p} = 1+p+2*p^2+4*p^3+8*p^4+16*p^5+32*p^6
from eqn 28 of Padhye et al.
T0 = Initial retransmit timeout (typically this is suggested by RFCs 793 and 1123 to be 3 seconds).
Wmax = Maximum TCP window size (typical default for Solaris 2.6 is 8192 bytes)
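The definitions above can be put together in a short sketch. This is one possible reading, not a reference implementation: the function name `padhye_rate` is hypothetical, wmax is assumed to be expressed in segments (e.g. Wmax in bytes divided by MSS), and Q is grouped as in Padhye et al., with the (1-(1-p)^(w-3)) factor nested inside the second bracket.

```python
import math

def padhye_rate(mss_bytes, rtt_s, loss, wmax_segments, t0_s=3.0):
    """Throughput in bytes/s from the full model, with b = 2."""
    p = loss
    if not 0 < p < 1:
        raise ValueError("loss must be strictly between 0 and 1")

    def q(w):
        # Q{p,w}: probability that a loss is detected by a timeout.
        num = (1 - (1 - p) ** 3) * (1 + (1 - p) ** 3 * (1 - (1 - p) ** (w - 3)))
        return min(1.0, num / (1 - (1 - p) ** w))

    # G{p}: polynomial accounting for exponential backoff of the timer.
    g = 1 + p + 2 * p**2 + 4 * p**3 + 8 * p**4 + 16 * p**5 + 32 * p**6
    # w{p}: window size when not limited by wmax, for b = 2.
    w = (2 / 3) * (1 + math.sqrt(3 * (1 - p) / p + 1))

    if w < wmax_segments:
        num = (1 - p) / p + w + q(w) / (1 - p)
        den = rtt_s * (w + 1) + q(w) * g * t0_s / (1 - p)
    else:
        wm = wmax_segments
        num = (1 - p) / p + wm + q(wm) / (1 - p)
        den = (rtt_s * (0.25 * wm + (1 - p) / (p * wm) + 2)
               + q(wm) * g * t0_s / (1 - p))
    return mss_bytes * num / den

# Illustrative inputs: MSS 1460 bytes, RTT 100 ms, T0 = 3 s.
for loss in (0.01, 0.05):
    print(loss, round(padhye_rate(1460, 0.100, loss, wmax_segments=1000)))

# Window-limited case: an 8192-byte window is about 5.6 segments.
limited = padhye_rate(1460, 0.100, 0.001, wmax_segments=8192 / 1460)
print(round(limited))
```

With a large wmax the first branch applies and the rate falls as the loss rises; with the small window the second branch applies and the rate stays below Wmax / RTT.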

If you are tuning your hosts for best performance then also read Enabling High Performance Data Transfers on Hosts and TCP Tuning Guide for Distributed Application on Wide Area Networks. Also, The TCP-Friendly Website summarizes some recent work on congestion control for non-TCP based applications, in particular for congestion control schemes that keep the arrival rate to at most some constant over the square root of the packet loss rate.

A problem with both formulae at 0 packet loss is that the throughput then depends only on RTT (and Wmax), and RTT may change very little from year to year for a host pair with terrestrial wired links. Thus the formulae start to fail for long-term trends unless one knows Wmax as a function of time for such pairs of hosts. Unfortunately this is typically known only by the system administrators and may change with time.

### Measurement of MSS

On June 7-8, 1999, we measured the MSS between SLAC and about 50 Beacon sites by sending 2000-byte pings and sniffing on the wire to see the size of the response packets. Of the 48 Beacon site paths pinged, all except one responded with an MSS of 1460 bytes (packet size of 1514 bytes as reported by the sniffer), and the remaining one (nic.nordu.net) was unreachable.

### Validation of the formula

Between May 3 and May 15, 1999, Andy Germain of the Goddard Space Flight Center (GSFC) made measurements of TCP throughput between GSFC and Los Alamos National Laboratory (LANL). The measurements were made with a modified version of ttcp which, every hour, sent data from GSFC to LANL for 30 seconds and then measured the amount of data transmitted. From this a measured TCP throughput was obtained. At the same time Andy sent 100 pings from GSFC to LANL and measured the loss and RTT. These loss and RTT measurements were plugged into the formula above to provide a predicted throughput.

We then plotted the measured TCP throughput (ttcp) and the predicted (from ping) throughput, and the results are shown in the chart below. It can be seen from the variation between day and night that the link is congested. Visually there is reasonable agreement between the predicted and measured values, with the predicted values tracking the measured ones. Note that since only 100 pings were measured per sample set, the loss resolution was only 1%. For losses of < 1% we set the loss arbitrarily to 0.5%.

To further evaluate the agreement between the predicted and measured values, we scatter plotted the predicted versus the measured values, as shown in the chart below. It can be seen that the points exhibit a strong positive correlation with an R^2 of about 0.85. In this plot points with a ping loss of < 1% are omitted.

### How the Formula behaves

Using the above formula with an MSS of 1460 bytes, we can plot the throughput as a function of loss and RTT, as shown in the chart below, for RTTs ranging from 0.25 ms (typical LAN RTTs) to 650 ms (typical geostationary satellite RTT). If one takes the speed of light in fibre as roughly 0.6c, then 1 msec of RTT corresponds to alpha * 100 km, where empirically alpha ~ 0.4 accounts for non-direct paths, router delays etc. The RTTs corresponding to distances of 10, 50, 100, 250, 500, 1000, 2500, 5000, 10000, 25000 km are then 0.25, 1.25, 2.5, 6.25, 12.5, 25, 62.5, 125, 250, 625 msec. In the above chart the lines are colored according to the packet loss quality defined in Tutorial on Internet Monitoring. Given the difficulty of reducing the RTT, the importance of minimizing packet loss is apparent.
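The distance-to-RTT conversion quoted above can be checked with a few lines. This sketch assumes the relation is read as RTT (msec) = distance (km) / (alpha * 100), which reproduces the listed values; the function name is hypothetical.

```python
def rtt_ms_from_distance(distance_km: float, alpha: float = 0.4) -> float:
    """RTT in msec, assuming 1 msec of RTT covers alpha * 100 km."""
    return distance_km / (alpha * 100.0)

distances_km = [10, 50, 100, 250, 500, 1000, 2500, 5000, 10000, 25000]
rtts_ms = [rtt_ms_from_distance(d) for d in distances_km]
for d, r in zip(distances_km, rtts_ms):
    print(f"{d:>6} km -> {r:g} msec")
```

The output runs from 0.25 msec at 10 km to 625 msec at 25000 km, in agreement with the list in the text.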

If we consider how the packet loss is improving month-to-month (see Tutorial on Internet monitoring at SLAC), the loss appears to be improving by between 2% and 9% per month. Applying the above formula, fixing the RTT at 100 msec and starting with an initial loss of 2.5%, we get the throughputs shown in the following figure for various values from 0% to 10% improvement per month. To facilitate understanding, the table on the left of the chart shows the yearly improvement in loss and the loss at the end of three years.
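A rough sketch of this projection follows. It makes two assumptions that the text does not state: that "x% improvement per month" means the loss is multiplied by (1 - x) each month, and that the simple Mathis bound is an adequate stand-in for the formula used in the figure. The function names are hypothetical.

```python
import math

def mathis_rate(mss_bytes, rtt_s, loss):
    """Mathis upper bound on throughput, in bytes/s."""
    return (mss_bytes / rtt_s) / math.sqrt(loss)

def loss_after(initial_loss, monthly_improvement, months):
    """Loss after compounding a fractional improvement every month."""
    return initial_loss * (1.0 - monthly_improvement) ** months

INITIAL_LOSS = 0.025  # 2.5%
RTT_S = 0.100         # 100 msec
MSS = 1460            # bytes

for improvement in (0.0, 0.02, 0.05, 0.09, 0.10):
    yearly = 1.0 - (1.0 - improvement) ** 12
    loss_3y = loss_after(INITIAL_LOSS, improvement, 36)
    rate_3y = mathis_rate(MSS, RTT_S, loss_3y)
    print(f"{improvement:4.0%}/month  yearly {yearly:5.1%}  "
          f"loss after 3 years {loss_3y:.4%}  "
          f"rate {rate_3y * 8 / 1e6:.2f} Mbit/s")
```

Because the bound scales as 1/sqrt(p), halving the loss raises the predicted throughput by a factor of about 1.41 under these assumptions.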

### Using the formula with long term PingER data

Given the historical measurements from PingER of the packet loss and RTT, we can calculate the maximum TCP bandwidth for the last few years for various groups of sites, as shown in the figure below. The numbers in parentheses in the legend indicate the number of monitor-remote site pairs included in the group measurement. The percentages to the right show the improvement per month, and the straight lines are exponential fits to the data.

Another way of looking at the data is to show the Gbytes that can be transmitted per hour. This is shown in the chart below between ESnet and various collections of sites.

The figure below shows the Normalized Derived PingER Throughput measured from SLAC to countries of the world from January through September 2007 as a function of the packet loss. Normalization is of the form:
Normalized Derived TCP throughput (AKA Normalized Rate) = (Minimum_RTT(Remote country)/Minimum_RTT(Monitoring Country)) * Rate
The correlation is seen to be strong with R^2 ~ 0.89, and goes as 1526.6 / loss^0.66. Also shown is a rough fit for the 129 countries with observed data, minimizing X = Sum(((Theory - Observation)/Theory)^2), where Theory = beta/sqrt(loss), beta = 1875 and X = 12.43.

### Correlation of Derived Throughput vs Average RTT and Loss

The correlation of Derived Throughput is stronger versus Loss than versus Average RTT. This can be seen in the images below:
[Figures: Derived Throughput vs Average RTT; Derived Throughput vs Loss]