The jumbo frames effectively increase TCP's additive increase during the congestion avoidance phase by a factor of 6. Thus after a congestion event, which halves the window, one can recover 6 times as fast. This is very important on fast, large-RTT links, where the recovery rate (for TCP Reno) goes as MTU/RTT^2. This can be seen in some of the graphs at http://www-iepm.slac.stanford.edu/monitoring/bulk/fast/stacks.png or, more fully, at http://www-iepm.slac.stanford.edu/monitoring/bulk/fast/

We saw little congestion-related packet loss on the testbed. With big windows SACK becomes increasingly important, so one does not have to recover a large fraction of the window for a single lost packet. Once one gets onto networks where one is really sharing the bandwidth with others, performance drops off rapidly (see for example the measurements at http://www-iepm.slac.stanford.edu/monitoring/bulk/fast/#Measurements%20from%... and compare them with those at http://www-iepm.slac.stanford.edu/monitoring/bulk/fast/#TCP%20Stack%20Compar...).

One of the next things we want to look at is how the various new TCP stacks work on production Academic & Research Networks (e.g. from Internet2, ESnet, GEANT, ...) with lots of other competing traffic.
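As a rough illustration of the MTU/RTT^2 point, here is a back-of-the-envelope sketch; the 1 Gbit/s rate and 170 ms RTT are assumed for illustration only, and 1460/8960 bytes are the usual TCP payloads for 1500- and 9000-byte MTUs:

    # Back-of-the-envelope sketch (assumed numbers): time for TCP Reno to grow
    # the window back after a halving, adding one MSS per RTT.
    def recovery_time_s(rate_bps, rtt_s, mss_bytes):
        window_bytes = rate_bps / 8 * rtt_s        # bandwidth-delay product
        lost_ground = window_bytes / 2             # a congestion event halves the window
        rtts_needed = lost_ground / mss_bytes      # additive increase: one MSS per RTT
        return rtts_needed * rtt_s                 # seconds, so the rate goes as MSS/RTT^2

    for mss in (1460, 8960):                       # payloads for 1500- and 9000-byte MTUs
        print(mss, round(recovery_time_s(1e9, 0.170, mss)), "s")   # ~1237 s vs ~202 s

With those assumed numbers, recovery takes roughly 20 minutes with standard frames versus a few minutes with jumbo frames, about the factor of 6 above.

-----Original Message-----
From: Iljitsch van Beijnum [mailto:iljitsch@muada.com]
Sent: Saturday, March 08, 2003 1:49 PM
To: Cottrell, Les
Cc: 'nanog@nanog.org'
Subject: Re: 923Mbits/s across the ocean

On Sat, 8 Mar 2003, Cottrell, Les wrote: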
We used a stock TCP (Linux kernel TCP). We did, however, use jumbo frames (9000-byte MTUs).
What kind of difference did you see as opposed to standard 1500 byte packets? I did some testing once and things actually ran slightly faster with 1500 byte packets, completely contrary to my expectations... (This was UDP and just 0.003 km rather than 10,000, though.)
The remarks about window size and buffers are interesting also. It is true that large windows are needed: to approach 1 Gbit/s we require 40 MByte windows, and in practice approaching 2.5 Gbit/s requires 120 MByte windows. If this is going to be a problem, then we need to raise questions like this soon and figure out how to address them (add more memory, use other protocols, etc.).
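Those figures are roughly consistent with the bandwidth-delay product plus headroom; a quick check (the ~170 ms transatlantic RTT here is an assumption for illustration, not a figure from the thread):

    # Rough sanity check (assumed ~170 ms transatlantic RTT): the window needed
    # is about the bandwidth-delay product; the 40/120 MByte figures above are
    # roughly twice that, leaving headroom for recovery after a window halving.
    def bdp_mbytes(rate_gbps, rtt_s=0.170):
        return rate_gbps * 1e9 / 8 * rtt_s / 1e6   # bytes in flight at full rate, in MB

    print(round(bdp_mbytes(1.0), 1))               # ~21 MB for 1 Gbit/s
    print(round(bdp_mbytes(2.5), 1))               # ~53 MB for 2.5 Gbit/s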
So how much packet loss did you see? Even with a few packets in a million lost, this would bring your transfer way down and/or you'd need even bigger windows. However, bigger windows mean more congestion. When two of those boxes start pushing traffic at 1 Gbps with 40 MB windows, you'll see 20 MB worth of lost packets due to congestion in a single RTT. A test where one or several high-bandwidth sessions have to live side by side with other traffic would be very interesting. If this works well, it opens up the possibility of doing this type of application over real networks rather than (virtual) point-to-point links where congestion management isn't an issue.
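A quick sketch of that arithmetic (assuming ~170 ms RTT, a 1 Gbit/s bottleneck, and negligible router buffering; all three are assumptions for illustration):

    # Two senders each offering 1 Gbit/s into a link that drains only 1 Gbit/s
    # lose roughly one RTT's worth of traffic per round trip (assumed numbers).
    def mb_per_rtt(rate_gbps, rtt_s=0.170):                # assumed ~170 ms RTT
        return rate_gbps * 1e9 / 8 * rtt_s / 1e6           # MBytes per round trip

    offered = 2 * mb_per_rtt(1.0)                          # two 1 Gbit/s senders
    drained = mb_per_rtt(1.0)                              # 1 Gbit/s bottleneck link
    print(round(offered - drained), "MB dropped per RTT")  # ~21 MB, close to the 20 MB above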