Re: MTU of the Internet?
Date: Wed, 4 Feb 1998 10:51:44 -0700 (MST)
From: "Forrest W. Christian" <forrestc@iMach.com>
To: "Perry E. Metzger" <perry@piermont.com>
cc: Peter Ford <peterf@microsoft.com>, "'nanog@merit.edu'" <nanog@merit.edu>
Subject: Re: MTU of the Internet?
Now it's been a while since I looked at latency vs transfer rates, so maybe someone who works on this on an everyday basis would like to comment on what ~200 more ms of latency on a 28.8 link would do to throughput end-to-end across the net (totals of something like 350 and 512 ms end-to-end).
We recommend that clients who care about interactive response use small MTUs, and clients who care about download speed use higher MTUs. It seems most of them agree that smaller MTUs improve their interactive response for things like telnet, IRC, MUD, etc., particularly if they are downloading/surfing at the same time.

Thx,
dennis
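The interactive-response effect described above comes largely from serialization delay: a keystroke packet queued behind a full-size download packet has to wait for that packet to be clocked onto the modem link. A rough sketch of the arithmetic follows, assuming a nominal 28.8 kbps link with no compression or framing overhead. The 1500, 1006, and 576 byte sizes come from this thread; 296 is only an illustrative small value, and the function name is made up for this example.

```python
def serialization_delay_ms(packet_bytes: int, link_bps: int) -> float:
    """Time to clock one packet onto the link, in milliseconds."""
    return packet_bytes * 8 * 1000 / link_bps


# Assumption: nominal modem rate, ignoring compression and framing overhead.
LINK_BPS = 28_800

for mtu in (1500, 1006, 576, 296):
    delay = serialization_delay_ms(mtu, LINK_BPS)
    print(f"MTU {mtu:>4}: {delay:6.1f} ms per full-size packet")
```

Under these assumptions a 1500-byte packet occupies the link for roughly 417 ms and a 576-byte packet for 160 ms, which is the kind of gap an interactive user would notice while a download is running.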
On Wed, Feb 04, 1998 at 01:18:56PM -0500, Dennis Simpson wrote:
We recommend that clients who care about interactive response use small MTUs, and clients who care about download speed use higher MTUs.
There's an extremely annoying potential gotcha in having clients set lower MTUs. At least one release of Netscape's web server set the Don't Fragment bit. In the few cases we've seen, if there was not a 1500 MTU pipe between server and client, the server could be reached, but no HTML would be downloaded. Usually it's easier to work around the problem on the client end than convince the server admins they might want to change things on their end.

--
Jeff Stehman, Senior Systems Administrator
stehman@southwind.net
SouthWind Internet Access, Inc.
voice: (316)263-7963
Wichita, KS
On Thu, 5 Feb 1998, Jeff Stehman wrote:
On Wed, Feb 04, 1998 at 01:18:56PM -0500, Dennis Simpson wrote:
We recommend that clients who care about interactive response use small MTUs, and clients who care about download speed use higher MTUs.
There's an extremely annoying potential gotcha in having clients set lower MTUs. At least one release of Netscape's web server set the Don't Fragment bit. In the few cases we've seen, if there was not a 1500 MTU pipe between server and client, the server could be reached, but no HTML would be downloaded. Usually it's easier to work around the problem on the client end than convince the server admins they might want to change things on their end.
Erm... I think you are getting a few things a bit confused here. First, I _really_ doubt the web server set the DF bit. It is almost certainly the OS, probably trying to do path MTU discovery.

If there is a smaller MTU in the middle, and someone is filtering ICMP "can't fragment" errors, then yes, you will have trouble with PMTU discovery. The fix is to not blindly filter ICMP. Increasing the client MTU won't fix this.

If the client lowers their MTU, there is no problem, because then they advertise an appropriate MSS and no stack should try sending packets that it knows will need to be fragmented given the client MSS.
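The MSS argument can be made concrete with a little arithmetic. This is a minimal sketch assuming IPv4 and TCP headers of 20 bytes each (no options); the function names are hypothetical and only illustrate the reasoning, not any particular stack's code.

```python
IP_HEADER = 20   # bytes, IPv4 header without options (assumption)
TCP_HEADER = 20  # bytes, TCP header without options (assumption)


def advertised_mss(interface_mtu: int) -> int:
    """MSS a host would announce in its SYN for a given interface MTU."""
    return interface_mtu - IP_HEADER - TCP_HEADER


def largest_packet_sent(sender_mtu: int, peer_mss: int) -> int:
    """Largest IP packet the sender emits toward a peer that advertised peer_mss."""
    segment = min(advertised_mss(sender_mtu), peer_mss)
    return segment + IP_HEADER + TCP_HEADER


client_mss = advertised_mss(576)               # client lowered its MTU to 576
print(client_mss)                              # 536
print(largest_packet_sent(1500, client_mss))   # 576
```

A server on a 1500-byte Ethernet therefore never sends this client anything larger than 576 bytes. The trouble case in the thread is different: a hop in the middle with a smaller MTU than either end advertised.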
Marc Slemko writes...
First, I _really_ doubt the web server set the DF bit. It is almost certainly the OS, probably trying to do path MTU discovery.
If there is a smaller MTU in the middle, and someone is filtering ICMP "can't fragment" errors, then yes, you will have trouble with PMTU discovery. The fix is to not blindly filter ICMP. Increasing the client MTU won't fix this.
Agreed that one should not blindly filter ICMP. However, not all filters support discriminating individual ICMP types. Ascend is one example.
If the client lowers their MTU, there is no problem because then they advertise an appropriate MSS and no stack should try sending packets that it knows will need to be fragmented given the client MSS.
That's one reason I went with lower MTU until I could replace SLIP with PPP. But then I discovered other things work better, such as interactive telnet during multiple concurrent downloads. So I leave MTU low until I get that OC-12 into my apartment.

--
Phil Howard (phil at milepost dot com)
At 12:18 PM 2/5/98 -0600, Phil Howard wrote:
Marc Slemko writes...
First, I _really_ doubt the web server set the DF bit. It is almost certainly the OS, probably trying to do path MTU discovery.
If there is a smaller MTU in the middle, and someone is filtering ICMP "can't fragment" errors, then yes, you will have trouble with PMTU discovery. The fix is to not blindly filter ICMP. Increasing the client MTU won't fix this.
Agreed that one should not blindly filter ICMP. However, not all filters support discriminating individual ICMP types. Ascend is one example.
Why not? You have the option of using IP filters, which are fairly limited, or generic filters, where you can specify more details of the filter.

Kevin
Jeff Stehman writes...
On Wed, Feb 04, 1998 at 01:18:56PM -0500, Dennis Simpson wrote:
We recommend that clients who care about interactive response use small MTUs, and clients who care about download speed use higher MTUs.
There's an extremely annoying potential gotcha in having clients set lower MTUs. At least one release of Netscape's web server set the Don't Fragment bit. In the few cases we've seen, if there was not a 1500 MTU pipe between server and client, the server could be reached, but no HTML would be downloaded. Usually it's easier to work around the problem on the client end than convince the server admins they might want to change things on their end.
If the client sets the MTU lower, that becomes the connection MTU and MTU discovery doesn't take place. The problem only occurs if there is a router in between (not the client) that has a lower MTU than what the connection is using, and hence the fragmentation (or discarding, if DF is on) occurs.

There's probably a whole thread due on whether DF should be set or not. But there really are links that are smaller than 1500, such as SLIP at 1006.

IMHO setting DF should not be allowed where the MTU is greater than 576, or whatever number today constitutes the "minimum reasonable requirement", which I would say isn't larger than 1006. Maybe in a few years we can kiss SLIP bye-bye and make sure everything is 1500.

Netscape's web server is broken. And in more ways than just the DF thing.

But then that begs the question, why 1500? Why not 4000? Or 32000?

--
Phil Howard (phil at milepost dot com)
On Thu, 5 Feb 1998, Phil Howard wrote:
IMHO setting DF should not be allowed where the MTU is greater than 576, or whatever number today constitutes the "minimum reasonable requirement" which I would say isn't larger than 1006. Maybe in a few years we can kiss SLIP bye-bye and make sure everything is 1500.
Erm... no. The whole point of setting the DF bit is to avoid fragmentation. Read up on path MTU discovery to see why it is a good thing. Just because braindead filters cause problems is no reason to suggest that PMTU discovery is bad. It becomes even more critical in IPv6, where routers don't fragment, period, so people had better get used to it.

Trying to force everyone to have the same MTU simply is not practical. You will always have systems with higher path MTUs that can get a gain from knowing it, and you will always have systems with lower MTUs for whatever reason.
Netscape's web server is broken. And in more ways than just the DF thing.
It has nothing to do with what web server you are running.
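For readers who have not looked at path MTU discovery, the mechanism under discussion can be sketched as a small simulation. This is a deliberately simplified model, not an implementation of any real stack: it assumes every router that drops a DF packet reports its next-hop MTU in the ICMP "fragmentation needed" message (type 3, code 4), and the function names and the `icmp_delivered` flag are invented for this example.

```python
from typing import List, Optional


def first_too_small_hop(packet_size: int, hop_mtus: List[int]) -> Optional[int]:
    """Return the MTU of the first hop that cannot forward the packet, else None."""
    for mtu in hop_mtus:
        if packet_size > mtu:
            return mtu
    return None


def discover_path_mtu(initial_mtu: int, hop_mtus: List[int],
                      icmp_delivered: bool = True) -> Optional[int]:
    """Simulate a sender probing with DF set.

    Returns the discovered path MTU, or None if the sender stalls because
    the ICMP errors never reach it (the filtered case from this thread).
    """
    size = initial_mtu
    while True:
        next_hop_mtu = first_too_small_hop(size, hop_mtus)
        if next_hop_mtu is None:
            return size      # packet fits every hop on the path
        if not icmp_delivered:
            return None      # dropped silently; retransmissions also fail
        size = next_hop_mtu  # shrink to the MTU reported in the ICMP error


print(discover_path_mtu(1500, [1500, 1006, 576]))                        # 576
print(discover_path_mtu(1500, [1500, 1006, 576], icmp_delivered=False))  # None
```

The second call models the symptom reported earlier in the thread: the connection opens, because small packets fit, but full-size data packets never arrive when the ICMP errors are filtered.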
Marc Slemko writes...
On Thu, 5 Feb 1998, Phil Howard wrote:
IMHO setting DF should not be allowed where the MTU is greater than 576, or whatever number today constitutes the "minimum reasonable requirement" which I would say isn't larger than 1006. Maybe in a few years we can kiss SLIP bye-bye and make sure everything is 1500.
Erm... no. The whole point of setting the DF bit is to avoid fragmentation. Read up on path MTU discovery to see why it is a good thing. Just because braindead filters cause problems is no reason to suggest that PMTU discovery is bad. It becomes even more critical in IPv6, where routers don't fragment, period, so people had better get used to it.
Trying to force everyone to have the same MTU simply is not practical. You will always have systems with higher path MTUs that can get a gain from knowing it and you will always have systems with lower MTUs for whatever reason.
I agree MTU needs to be flexible. But in a protocol where MTU discovery is based on ICMP, and where filters are often implemented for ICMP without detailing, then I think DF is just as uncivilized as other bad behaviour.

What's the mechanism for negotiating packet size in IPv6, and how does it deal with minimal routers in between? Does it have an MTU discovery? And is its MTU discovery poorly designed like in IPv4 (which looks to me like it's an afterthought)?

Of course MTU discovery is something that needs to take place across TCP, UDP, and whatever else is just above IP, but putting that over in ICMP is, IMHO, part of the problem. It would not violate the statelessness principle to allow a transit router to send back something entirely outside of the concept of ICMP to do MTU discovery, since that doesn't involve storing any state of any connections in transit routers.

--
Phil Howard (phil at milepost dot com)
Phil Howard writes:
I agree MTU needs to be flexible. But in a protocol where MTU discovery is based on ICMP, and where filters are often implemented for ICMP without detailing, then I think DF is just as uncivilized as other bad behaviour.
The great thing about the internet is that almost anyone can pretend to understand it.

Perry
Hot Diggety! On a dark and stormy night, Phil Howard was rumored to have said...
I agree MTU needs to be flexible. But in a protocol where MTU discovery is based on ICMP, and where filters are often implemented for ICMP without detailing, then I think DF is just as uncivilized as other bad behaviour.
It is? It would seem to me the problem here is idiots-for-network-admins, not the protocol per se. Although, yes, it would be optimal if it could work in an environment without regard to misconfigurations.
And is its MTU discovery poorly designed like in IPv4 (which looks to me like it's an afterthought). Of course MTU discovery is something that
Them be fighting words. :)

Allow me to quickly educate/introduce a different viewpoint, from years ago when the Internet landscape was *very* different (1989). The problem of the time was basically itty-bitty pipes, and setting IP options or other schemes that would have been 'nicer' design-wise just ended up eating valuable, limited bandwidth.

Below is a post from one of the RFC 1191 authors, the chair of the IETF WG who came up with PMTUD. He says it much better than I ever could regarding the context in which they operated. Aside from that, no other comments.

-Dan Foster

From: mogul@actitis.pa.dec.com (Jeffrey Mogul)
Newsgroups: comp.protocols.tcp-ip
Subject: Re: Lost http from sites - Will PAY for Help!
Date: 3 Feb 1998 01:30:51 GMT
Organization: DEC Western Research
Message-ID: <6b5s0b$bvk@usenet.pa.dec.com>

In article <6atrrf$m6p$2@ocean.cup.hp.com>, foo@bar.baz (Rick Jones) writes:
|> had the ietf gone the route for path mtu discovery based on an IP
|> option, people could have filtered all the ICMP they wanted, *and*
|> detecting a smaller mtu would not have required a packet
|> loss...instead, what was probably seen as a quicker, cheaper solution
|> (by the router vendors?) was selected, and not surprisingly, there are
|> a couple problems with it

As the chair of the IETF working group that came up with Path MTU Discovery, and the co-author of RFC1191, I suppose I should respond.

Some of us originally proposed using an IP option. However, at the time, most members of the working group (eventually including me) believed that it would be nearly impossible to get the installed base of routers, and the vendors of new routers, to upgrade in any reasonable time. My recollection of the process was that the router vendors were not in any way dominating the decision.
Also, some people objected to the IP-option mechanism on the grounds that this would inject extra packets into the Internet backbone on every connection, whereas RFC1191 doesn't inject any extra packets unless the initial MTU is too high. Remember that we started this in 1989, when "backbone" bandwidth was scarce (DS0? Certainly no higher than T1) and Van Jacobson's congestion-avoidance mechanisms had only been published the previous year.

Finally, the options-based mechanism could not detect changes in the path MTU (because of a routing topology change) without explicit polling. This seemed like a drawback.

In retrospect, we were probably too optimistic that vendors would not do silly things (like engineering FDDI-Ethernet bridges that obeyed the DF bit without sending an ICMP message), but generally I think we made the right decision.

By the way, one could presumably set up an ICMP filter so that it allows "destination unreachable" messages (as are sent as a result of RFC1191) without allowing other ICMP messages. I know that screend-based firewalls make this easy, but I'm not sure whether other firewalls are so flexible. I'm also willing to concede that this might leave people open to denial-of-service attacks; in 1989, we weren't quite as experienced with the scum of the Internet as we are today.

-Jeff

P.S.: If anyone is really interested in reading the mail exchanged during the design of RFC1191, I still have the archive of the working group's mailing list ... only 600Kbytes.

[Dan: He made a second post 20 minutes later saying this archive is accessible at: http://gatekeeper.dec.com/~mogul/ietf-mtudwg-archive.txt.Z ]
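The selective filter suggested in the forwarded post can be expressed as a simple predicate. This is only a sketch of the policy, not the syntax of screend or any other firewall: the function name and the `strict` flag are hypothetical, and blocking every other ICMP type is an assumption made purely for illustration.

```python
ICMP_DEST_UNREACHABLE = 3   # ICMP type used for "destination unreachable"
ICMP_CODE_FRAG_NEEDED = 4   # code for "fragmentation needed and DF set"


def allow_icmp(icmp_type: int, icmp_code: int, strict: bool = False) -> bool:
    """Decide whether an inbound ICMP message passes the filter.

    Default: pass every "destination unreachable" message.
    strict=True: pass only the code that path MTU discovery depends on.
    """
    if icmp_type != ICMP_DEST_UNREACHABLE:
        return False
    if strict:
        return icmp_code == ICMP_CODE_FRAG_NEEDED
    return True


print(allow_icmp(3, 4))               # True: PMTU discovery keeps working
print(allow_icmp(8, 0))               # False: echo request is dropped
print(allow_icmp(3, 1, strict=True))  # False: host unreachable is dropped in strict mode
```

A filter that cannot distinguish ICMP types, as discussed earlier in the thread, can only return the same answer for every message, which is what breaks discovery.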
participants (7)
- Dan Foster
- Dennis Simpson
- Jeff Stehman
- Kevin A. Smith
- Marc Slemko
- Perry E. Metzger
- Phil Howard