On Sun, Nov 07, 2010 at 01:42:33AM -0700, George Bonser wrote:
I guess you didn't read the links earlier. It has nothing to do with stack tweaks. The moment you lose a single packet, you are toast. And
TCP SACK.
Certainly helps but still has limitations. If you have too many packets in flight, it can take too long to locate the SACKed packet in some implementations; this can cause a TCP timeout and reset the window to 1. It varies from one implementation to another. The above was for some implementations of Linux. The larger the window (high-speed, high-latency paths), the worse this problem is. In other words, sure, you can get great performance, but when you hit a lost packet, depending on which packet is lost, you can also take a huge performance hit depending on who is doing the talking or what they are talking to.
Common advice on stack tuning: "for very large BDP paths where the TCP window is > 20 MB, you are likely to hit the Linux SACK implementation problem. If Linux has too many packets in flight when it gets a SACK event, it takes too long to locate the SACKed packet, and you get a TCP timeout and CWND goes back to 1 packet. Restricting the TCP buffer size to about 12 MB seems to avoid this problem, but clearly limits your total throughput. Another solution is to disable SACK." Even if you don't have such a system, you might be talking to one.
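To put rough numbers on the window sizes in that advice, here is a minimal sketch of the bandwidth-delay product arithmetic. The function names, the choice of 1 MB = 10^6 bytes, and the 1460-byte MSS (a 1500-byte MTU minus 40 bytes of IPv4 and TCP headers, no options) are my assumptions for illustration, not anything taken from the quoted advice.

```python
import math

MB = 1_000_000  # assumption: decimal megabytes; the quoted advice does not say which


def bdp_bytes(bandwidth_bps: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to keep the path full."""
    return bandwidth_bps * rtt_seconds / 8


def packets_in_flight(window_bytes: float, mss_bytes: int = 1460) -> int:
    """Number of full-size segments needed to cover a window of the given size."""
    return math.ceil(window_bytes / mss_bytes)


if __name__ == "__main__":
    # Hypothetical paths, chosen only to illustrate the arithmetic.
    for label, bw, rtt in [("1 Gbit/s, 100 ms", 1e9, 0.100),
                           ("10 Gbit/s, 100 ms", 10e9, 0.100)]:
        window = bdp_bytes(bw, rtt)
        print(f"{label}: window {window / MB:.1f} MB, "
              f"about {packets_in_flight(window)} segments at MSS 1460")
```

On these assumptions, a 1 Gbit/s path with 100 ms RTT needs about 12.5 MB of window, right around the 12 MB cap mentioned above, while a 10 Gbit/s path at the same RTT needs about 125 MB, well past the 20 MB point where the quoted advice says the problem shows up.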
Do you know if any work is being done on resolving this problem? It seems that work in that area might be more fruitful than banging your head against increasing the MTU.
But anyway, I still think 1500 is a really dumb MTU value for modern interfaces and unnecessarily retards performance over long distances.
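For a rough feel of how segment size interacts with loss over long distances, here is a sketch using the well-known Mathis et al. approximation for steady-state TCP throughput, rate <= (MSS / RTT) * (C / sqrt(p)). The constant C = sqrt(3/2), the MSS values (1460 for a 1500-byte MTU, 8960 for a 9000-byte MTU, IPv4 without options), and the RTT and loss rate are assumptions for illustration only; real stacks and real paths will differ.

```python
import math


def mathis_throughput_bps(mss_bytes: int, rtt_seconds: float, loss_rate: float,
                          c: float = math.sqrt(3 / 2)) -> float:
    """Upper-bound estimate of TCP throughput in bits/s (Mathis approximation)."""
    return (mss_bytes * 8 / rtt_seconds) * (c / math.sqrt(loss_rate))


if __name__ == "__main__":
    rtt = 0.100    # hypothetical 100 ms round-trip time
    loss = 1e-4    # hypothetical loss rate of 1 packet in 10,000
    for mtu, mss in [(1500, 1460), (9000, 8960)]:
        rate = mathis_throughput_bps(mss, rtt, loss)
        print(f"MTU {mtu}: about {rate / 1e6:.1f} Mbit/s")
```

Under this model the estimate scales linearly with MSS, so at the same RTT and loss rate a 9000-byte MTU comes out roughly six times higher than 1500. It is a simplified model and says nothing about the SACK processing cost discussed above.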