Why the temptation for dial users to crank back rwin/mtu?
Hi Folks,

As we know [1], packet sizes on the network fall almost exclusively into one of 4 categories:

  Small (<50 byte) dataless packets.. tcp acks/syns/rsts..
  576 byte packets
  1500 byte packets
  runt packets less than sender MTU..

It's always been my belief that the 576 values were generated by hosts that didn't support Path MTU discovery and wanted to be conservative and avoid fragmentation. Fair enough.. but, as I'm sure everybody knows, there's a plethora of websites/utilities ([2],[3],[4],[5],[6],..) imploring Windows dial users to use any one of a number of tools/techniques to play with their mtu/rwin settings to 'make things faster'..

In typical Windows fashion nothing is quantified in any meaningful fashion and the only motivation provided is typically some garbage like "windows comes setup for LAN not internet use".. These tools tend to crank down both rwin and mtu an awful lot. Apparently there's some performance value in this (at least to the immediate user) because they keep doing it in droves. It's not obvious to me why the heck this would be. (warning: I am a protocol guy, but I'm not a dialup guy at all.. and even less of a Windows guy)

MTU - at least this makes a little bit of sense.. If they're doing HTTP/1.0 stuff with parallel connections then a smaller MTU is going to make that parallelization latency much more effective and perceived performance will go up some.. it doesn't impact full document retrieval time though (at least not positively!).. Are dial links really lossy enough that chopping the segment size to 1/3 is a big win in retransmit time, or are the win95/98 stacks really braindead enough that they don't do pmtud and so are just trying to dodge fragmentation? I found it really odd that [7], which I use all the time to track features in a myriad of shipped OS's, actually has a blank entry for pmtud on both of those (neither yes nor no..)

RWIN - this is the one that boggles my mind.. it gets set way way way down by the above mentioned tools.. I've seen it as low as 2500 bytes recently. Anyone have any insight into the value of pushing this all the way down? The web pages generally mumble about capping the amount of data that needs to be resent in case of a failure.. which is of course true in the extreme case, but I'd much rather have the congestion window providing the throttle than the hard limit of rwin that can just cap transfer rates on you.. About the only reason I can think of for small RWINs is to conserve the buffer space, but it sure seems worth a few K to me to be sure I can work with high latency links. You could argue that 3 or 4 K is sufficient for any reasonable latency that is bottlenecked by a modem's throughput.. and eventually I might give in (or maybe not ;)).. What I don't get is why this results in any kind of perceived performance increase on the part of the user under any condition.. It almost implies that TCP congestion control is too conservative, although almost all work on that indicates it's a little too aggressive (which would be the side to err on..)

Any thoughts?

-P

[1] http://www.caida.org/Papers/Inet98/
[2] http://www.cerberus-sys.com/~belleisl/tune_faq/tuning.htm
[3] http://www.trumpet.com.au/wsk/faq/config.htm
[4] http://www.mc-pro.com/hardware/windialup.shtml
[5] http://www0.delphi.com/pccompat/mtu.html
[6] http://www.pattersondesigns.com/tweakdun/index.html
[7] http://www.psc.edu/networking/perf_tune.html
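For a rough sense of scale on the RWIN question, here is a minimal sketch (not part of the original post). It computes the throughput ceiling a fixed receive window imposes (window divided by round-trip time) and the bandwidth-delay product of a modem link. The 33,600 bit/s modem rate and the RTT values are assumptions chosen for illustration; the function names are hypothetical.

```python
# Back-of-the-envelope arithmetic only. The modem rate and RTTs below are
# assumed values for illustration, not measurements from the thread.

def window_limited_throughput_bps(rwin_bytes: int, rtt_s: float) -> float:
    """Upper bound on TCP throughput when the receive window is the limit:
    at most one window of data can be delivered per round trip."""
    return rwin_bytes * 8 / rtt_s


def bandwidth_delay_product_bytes(link_bps: float, rtt_s: float) -> float:
    """Bytes that must be in flight to keep the link busy for one RTT."""
    return link_bps * rtt_s / 8


if __name__ == "__main__":
    MODEM_BPS = 33_600   # assumed modem line rate
    RWIN_BYTES = 2_500   # the low RWIN value mentioned in the post
    for rtt in (0.2, 0.5, 1.0):   # assumed round-trip times in seconds
        bdp = bandwidth_delay_product_bytes(MODEM_BPS, rtt)
        cap = window_limited_throughput_bps(RWIN_BYTES, rtt)
        limited_by = "rwin" if cap < MODEM_BPS else "modem"
        print(f"rtt={rtt:.1f}s  bdp={bdp:.0f} bytes  "
              f"rwin cap={cap:.0f} bit/s  bottleneck={limited_by}")
```

Under these assumptions a 2500-byte window only becomes the bottleneck once the RTT approaches a second, which is in line with the "3 or 4 K is sufficient" remark for a modem-limited path.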
On Sat, 6 Feb 1999 mcmanus@appliedtheory.com wrote:
apparently there's some performance value in this (at least to the immediate user) because they keep doing it in droves. It's not obvious to me why the heck this would be. (warning: I am a protocol guy, but I'm not a dialup guy at all.. and even less of a windows guy)
This has been generally beat to death on nanog in the past. If you weren't around back then, dig around in the archive. I remember one of the subjects being "PC Bozoworld strikes again" or something like that.

The short recap is that, for some unknown reason, the Microsoft TCP/IP stack is broken in some bizarre way such that turning down the MTU on a good chunk of the machines out there results in a dramatic speed increase. Why this occurs, I'm not sure anyone really knows. It would be really interesting to see a study of what the MS stack is doing and why it's faster.
MTU - at least this makes a little bit of sense.. If they're doing HTTP/1.0 stuff with parallel connections then a smaller MTU is going to make that parallelization latency much more effective and perceived performance will go up some.. it doesn't impact full document
Just for my information, does the MTU setting affect <received> packets in some way? My understanding was that a machine wouldn't send packets over the MTU size, but could receive anything up to whatever the TCP/IP stack writer included in the stack. Guess I'll have to go dig out the RFC's.

- Forrest W. Christian (forrestc@imach.com)
----------------------------------------------------------------------
iMach, Ltd., P.O. Box 5749, Helena, MT 59604      http://www.imach.com
Solutions for your high-tech problems.                  (406)-442-6648
----------------------------------------------------------------------
In a previous episode Forrest W. Christian said...
::
:: > MTU - at least this makes a little bit of sense.. If they're doing
:: > HTTP/1.0 stuff with parallel connections then a smaller MTU is going
:: > to make that parallelization latency much more effective and perceived
:: > performance will go up some.. it doesn't impact full document
::
:: Just for my information, does the MTU setting affect <received> packets
:: in some way? My understanding was that a machine wouldn't send packets
:: over the MTU size, but could receive anything up to whatever the TCP/IP
:: stack writer included in the stack. Guess I'll have to go dig out the

Windows uses the mss option in its SYN.. it is supposed to be set to the max mtu of all the box's interfaces - 40 (and dial users probably only have 1 interface)... the other end has to use that as a max starting point for pmtud.. thus lots of small packets headed towards the Windows box..

-P
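The "MTU - 40" rule above can be sketched as follows (an editorial illustration, not from the thread). It assumes 20-byte IPv4 and 20-byte TCP headers with no options. The helper names and the 10,000-byte example payload are hypothetical.

```python
import math

# Assumption: IPv4 and TCP headers without options, 20 bytes each,
# which is where the "- 40" in the message above comes from.
IP_HEADER_BYTES = 20
TCP_HEADER_BYTES = 20


def mss_for_mtu(mtu: int) -> int:
    """MSS a host would advertise in its SYN for a given interface MTU."""
    return mtu - IP_HEADER_BYTES - TCP_HEADER_BYTES


def segments_needed(payload_bytes: int, mss: int) -> int:
    """Number of full-or-partial segments needed to carry a payload."""
    return math.ceil(payload_bytes / mss)


def header_overhead_bytes(payload_bytes: int, mss: int) -> int:
    """Total IP+TCP header bytes spent carrying the payload."""
    per_segment = IP_HEADER_BYTES + TCP_HEADER_BYTES
    return segments_needed(payload_bytes, mss) * per_segment


if __name__ == "__main__":
    PAYLOAD = 10_000   # hypothetical object size in bytes
    for mtu in (1500, 576):
        mss = mss_for_mtu(mtu)
        print(f"mtu={mtu}  mss={mss}  "
              f"segments={segments_needed(PAYLOAD, mss)}  "
              f"header overhead={header_overhead_bytes(PAYLOAD, mss)} bytes")
```

Because the sender may not exceed the advertised MSS, lowering the MTU on the dial-up box shrinks the packets it receives as well, at the cost of more segments and more header bytes per object.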
mcmanus@appliedtheory.com wrote:
MTU - at least this makes a little bit of sense.. If they're doing HTTP/1.0 stuff with parallel connections then a smaller MTU is going to make that parallelization latency much more effective and perceived performance will go up some.. it doesn't impact full document retrieval time though (at least not positively!).. are dial links really lossy enough that chopping the segment size to 1/3 is a big win in retransmit time or are the win95/98 stacks really braindead enough that they don't do pmtud so are just trying to dodge fragmentation? I found it really odd that [7] which I use all the time to track features in a myriad of shipped OS's actually has a blank entry for pmtud on both of those (neither yes nor no..)
The perception of speed is likely to propagate the concept well. If it looks like it works, they'll pass it along to someone else. I do suspect that if memory is tight at any point along the way, a smaller packet has a better chance of not falling on the floor.
RWIN - this is the one that boggles my mind.. it gets set way way way down by the above mentioned tools.. I've seen it as low as 2500 bytes recently. Anyone have any insight into the value of pushing this all the way down? The web pages generally mumble about capping the amount of data that needs to be resent in case of a failure.. which is of course true in the extreme case, but I'd much rather have the congestion window providing the throttle than the hard-limit of rwin that can just cap transfer rates on you.. about the only reason I can think of for small RWINs is to conserve the buffer space, but it sure seems worth a few K to me to be sure I can work with high latency links. You could argue that 3 or 4 K is sufficient for any reasonable latency that is bottlenecked by a modem's throughput.. and eventually I might give in (or maybe not ;)).. what I don't get is why this results in any kind of perceived performance increase on the part of the user under any condition.. It almost implies that TCP congestion control is too conservative, although almost all work on that indicates it's a little too aggressive (which would be the side to err on..) Any thoughts?
When a given image hit connects, RWIN bytes will be sent and the sender will wait. If RWIN is big, one image loads more and others load less for the amount of time that RWIN bytes take to come across the modem. For a small RWIN, the parallelism increases and the perception of speed does as well.

When evaluating this, do keep in mind the large number of parallel connections for large numbers of images. Setting the connection limit low makes the perception of speed go down. With more connections, it appears to go up ... unless RWIN is high enough to still affect it on a per-connection basis.

I think a lot of the design of TCP simply never considered the cases of tens to maybe over a hundred parallel connections funneling through a thin pipe at one end. Back then, who would have thought of what we have today? They made progressive loading images for a reason.

--
Phil Howard KA9WGN
Inturnet, Inc. | Director of Internet Services
Business Internet Solutions | eng at intur.net, phil at intur.net
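The argument above can be put in rough numbers with a small sketch (an editorial illustration, not from the thread). It follows the message's simplification that each sender puts a full RWIN of data on the wire at once, ignoring slow start, and asks how long the modem needs to drain that burst. The 28,800 bit/s rate, the 8192-byte "large" window and the connection counts are assumed values; the function names are hypothetical.

```python
# Simplified model: every connection has a full receive window queued
# ahead of the modem. Real senders in slow start would send less at first,
# so treat the results as a worst case under that assumption.

def window_drain_time_s(rwin_bytes: int, link_bps: float) -> float:
    """Seconds the modem is busy carrying one connection's full window."""
    return rwin_bytes * 8 / link_bps


def worst_case_wait_s(n_connections: int, rwin_bytes: int,
                      link_bps: float) -> float:
    """Seconds to drain one full window for each of n connections,
    i.e. how long the last connection in line could wait for its data."""
    return n_connections * window_drain_time_s(rwin_bytes, link_bps)


if __name__ == "__main__":
    MODEM_BPS = 28_800                 # assumed modem line rate
    for rwin in (8_192, 2_500):        # assumed large vs. tuned-down window
        for conns in (1, 4, 16):       # assumed parallel image fetches
            wait = worst_case_wait_s(conns, rwin, MODEM_BPS)
            print(f"rwin={rwin:5d}  connections={conns:2d}  "
                  f"worst-case wait={wait:6.2f} s")
```

In this model the time one image can monopolize the modem scales linearly with RWIN, which is one way to read the claim that a smaller window makes parallel image loads look more evenly interleaved without making the whole page arrive sooner.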
participants (3): Forrest W. Christian, mcmanus@appliedtheory.com, Phil Howard