RE: PMTU-D: remember, your load balancer is broken
The 576 value of the MS PPP MTU is merely a default - it can be changed with a registry hack.
Yes, fragmentation has indeed become a Great Evil due to the large amounts of data we're pushing, and the time/resources required for fragmentation/defragmentation. Forcing excessive fragmentation/defragmentation is an effective DoS.
As far as increasing the MTU size on your LAN links, you need to exercise a lot of care when so doing. I personally have never tried to change the MTU size on an Ethernet segment of any type (Ethernet_II/1500 has worked admirably, and I'm unsure of the result if I tried it); on Token Ring, going up to 4096 has indeed been beneficial in the past when dealing with large database writes, etc. Of course, the protocol I was using at the time supported 4096-byte frame sizes on Token Ring.
I thought the frame-size limits for Gigabit Ethernet were 64-1518/1522 bytes? And isn't that the limit on most host IP stacks for Ethernet media? Or am I off in left field, here?
Finally, I would say that on any medium, <100% utilization in and of itself isn't grounds for fiddling with the MTU. There are lots of other things to look at, first.
---------------------------------------------------------------
Roland Dobbins <rdobbins@netmore.net> // 818.535.5024 voice
-----Original Message-----
From: Roeland Meyer (E-mail) [mailto:rmeyer@mhsc.com]
Sent: Wednesday, June 14, 2000 9:07 AM
To: Valdis.Kletnieks@vt.edu; 'Marc Slemko'
Cc: nanog@merit.edu
Subject: RE: PMTU-D: remember, your load balancer is broken
Valdis.Kletnieks@vt.edu: Wednesday, June 14, 2000 8:07 AM
On Tue, 13 Jun 2000 22:36:08 MDT, Marc Slemko said:
It is also a concern that, in my experience, many of the links with MTUs <1500 are also the links with greater packet loss, etc. so you really don't want fragmentation on them.
The worst part here is that I suspect that most of these links (just on sheer numbers of shipped product) are the aforementioned Win98 576-MTU.
I just set my dial PPP ports to MTU=512+40=552, is this wrong? Where does the MTU=576 number come from?
I seem to remember that the *original* motivation for slow-start and all that was Van Jacobson's observation that the most common cause of a TCP retransmit was that an *entire* packet had been silently dropped due to queueing congestion, and could thus be treated identically to an ICMP Source Quench.
Has this changed? Has "fragmentation" become a Great Evil, rather than an annoyance that some links have to deal with?
I'm having some trouble getting full throughput from a GigE pipe. Even in the 100baseTX/FDX down-stream, I'm not getting full link utilization (everything on switches, Cat6509 and 3512XLs). I'm considering increasing MTU sizes to MTU=4096+40, or even larger. Most of the data transmissions fall into the 5KB-50KB range. The site can be considered a large portal. What would be the effect on my upstream? Would it create problems? The only systems that see the Internet are the web-servers (dual NICs).
Sez <rdobbins@netmore.net>
The 576 value of the MS PPP MTU is merely a default - it can be changed with a registry hack.
Expecting the tens of millions of novice computer users to set their systems for a 1500 byte MTU is irrational. Those who are knowledgeable enough to do so are generally reducing it due to "speed up your modem" articles and programs which improve interactive performance at the expense of throughput.
Forcing excessive fragmentation/defragmentation is an effective DoS.
Effective, but a fairly difficult problem to exploit.
I thought the frame-size limits for Gigabit Ethernet were 64-1518/1522 bytes? And isn't that the limit on most host IP stacks for Ethernet media? Or am I off in left field, here?
IIRC, during development of the GigE spec, several vendors wished to increase the GigE MTU to ~9000 bytes. Due to the technical ramifications of bridging to low-MTU FastE segments, as well as inter-vendor politics, it didn't make it as part of the GigE spec but was later published as an optional extension. There are now several devices on the market that will do Jumbo frames on GigE. For instance, the Catalyst 6000 and GSR do:
http://www.cisco.com/univercd/cc/td/doc/product/lan/cat6000/sw_5_5/cmd_refr/set_po_r.htm#xtocid661812
I know there are several other vendors who support Jumbo frames as well.
Finally, I would say that on any medium, <100% utilization in and of itself isn't grounds for fiddling with the MTU. There are lots of other things to look at, first.
I hear consistent requests from server folks for higher MTUs; they claim the per-frame CPU hit is significant enough to warrant using Jumbo frames to increase throughput. The results clearly show that it helps.
S
Stephen Sprunk, K5SSS, CCIE #3723
Network Design Consultant, HCOE
14875 Landmark Blvd #400; Dallas, TX
Email: ssprunk@cisco.com
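To put a rough number on the per-frame argument, here is a minimal sketch of how many full-size frames per second a host has to handle to fill a 1 Gb/s link at a given MTU. It assumes standard untagged Ethernet framing (14-byte header, 4-byte FCS, 8-byte preamble/SFD, 12-byte inter-frame gap) and ignores everything above layer 2; the function name is made up for illustration.

```python
# Per-frame bytes on the wire beyond the MTU-sized payload, assuming
# untagged Ethernet: 14 header + 4 FCS + 8 preamble/SFD + 12 inter-frame gap.
ETH_OVERHEAD = 14 + 4 + 8 + 12


def frames_per_second(link_bps: float, mtu: int) -> float:
    """Full-size frames per second needed to saturate the link."""
    return link_bps / ((mtu + ETH_OVERHEAD) * 8)


if __name__ == "__main__":
    for mtu in (1500, 9000):
        rate = frames_per_second(1e9, mtu)
        print(f"MTU {mtu:5d}: {rate:9.0f} frames/s at 1 Gb/s")
```

Under these assumptions a 9000-byte MTU cuts the frame rate at line speed by a bit less than a factor of six, which is the saving the server folks are after if their cost really is per frame rather than per byte.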
The 576 value of the MS PPP MTU is merely a default - it can be changed with a registry hack.
Expecting the tens of millions of novice computer users to set their systems for a 1500 byte MTU is irrational. Those who are knowledgeable enough to do so are generally reducing it due to "speed up your modem" articles and programs which improve interactive performance at the expense of throughput.
i think you have two things confused. raising the mtu will "speed up the modem" since you get more data for less overhead, however *lowering* the mtu will increase interactivity, since each packet is smaller, the transmit time is shorter, so the next one can get in/out sooner. you stand a better chance of getting a "word" in edgewise if the other guy is using smaller "phrases".
--
|-----< "CODE WARRIOR" >-----|
codewarrior@daemon.org * "ah! i see you have the internet
twofsonet@graffiti.com (Andrew Brown) that goes *ping*!"
andrew@crossbar.com * "information is power -- share the wealth."
On Thu, 15 Jun 2000 00:14:43 EDT, Andrew Brown said:
i think you have two things confused. raising the mtu will "speed up the modem" since you get more data for less overhead, however *lowering* the mtu will increase interactivity, since each packet is smaller, the transmit time is shorter, so the next one can get in/out sooner. you stand a better chance of getting a "word" in edgewise if the other guy is using smaller "phrases".
No, he's got it right. The user *perceives* the modem is "speeded up" if he's doing interactive work and fighting with a file download. For instance, if you're using BIG packets on a slower modem (yes, there's still 14.4 and 33.6 users out there), the network latency on a telnet fighting with an FTP can be up to a major fraction of a second - if your MTU is 1/3 the size, then your character echo is 3 times as fast. The fact that if you lower the MTU from 1500 to 500 you're taking a 10% performance hit (2 extra IP/TCP headers amortized over 1500 bytes) pales in comparison...
Yes, it's actually 10% lower for throughput. But for the "boot it, reinstall AGAIN, call tech support and listen when they tell you to wave a dead chicken over the CPU while re-re-installing drivers" crowd, it feels 3 times faster... So it *is* 3 times faster.
Valdis Kletnieks
Operating Systems Analyst
Virginia Tech
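The trade-off can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes 40 bytes of IP/TCP header per packet (no options, no header compression), ignores PPP framing and modem compression, and takes a 33.6 kb/s line as the example; the function names are invented for illustration.

```python
HEADERS = 40  # assumed: 20-byte IP header + 20-byte TCP header, no options


def serialization_delay(mtu: int, bps: float) -> float:
    """Seconds an interactive packet may wait behind one full-size packet."""
    return mtu * 8 / bps


def payload_efficiency(mtu: int) -> float:
    """Fraction of each full-size packet that is user data."""
    return (mtu - HEADERS) / mtu


def throughput_penalty(small_mtu: int, large_mtu: int) -> float:
    """Relative bulk-throughput loss from using small_mtu instead of large_mtu."""
    return 1 - payload_efficiency(small_mtu) / payload_efficiency(large_mtu)


if __name__ == "__main__":
    bps = 33_600
    for mtu in (1500, 500):
        print(f"MTU {mtu}: up to {serialization_delay(mtu, bps):.3f} s queued behind one packet")
    print(f"throughput penalty 1500 -> 500: {throughput_penalty(500, 1500):.1%}")
```

With these assumptions the wait behind a single bulk packet drops by exactly a factor of three, while the throughput cost works out to somewhat under 10%, so the interactive gain outweighs the loss by an even wider margin than the figures above suggest.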
Yes, it's actually 10% lower for throughput. But for the "boot it, reinstall AGAIN, call tech support and listen when they tell you to wave a dead chicken over the CPU while re-re-installing drivers" crowd, it feels 3 times faster... So it *is* 3 times faster.
Is the chicken supposed to be dead? No wonder they keep calling me back... -brad (Rural CNE)
Stephen Sprunk: Wednesday, June 14, 2000 3:48 PM
Sez <rdobbins@netmore.net>
The 576 value of the MS PPP MTU is merely a default - it can be changed with a registry hack.
Expecting the tens of millions of novice computer users to set their systems for a 1500 byte MTU is irrational. Those who are knowledgeable enough to do so are generally reducing it due to "speed up your modem" articles and programs which improve interactive performance at the expense of throughput.
How does this affect the link when the server explicitly sets MTU=512+40, in the server-side of the PPP link? AFAICT, that overrides whatever the end-user may want to do.
I thought the frame-size limits for Gigabit Ethernet were 64-1518/1522 bytes? And isn't that the limit on most host IP stacks for Ethernet media?
There are now several devices on the market that will do Jumbo frames on GigE. For instance, the Catalyst 6000 and GSR do:
http://www.cisco.com/univercd/cc/td/doc/product/lan/cat6000/sw_5_5/cmd_refr/set_po_r.htm#xtocid661812
I know there are several other vendors who support Jumbo frames as well.
Finally, I would say that on any medium, <100% utilization in and of itself isn't grounds for fiddling with the MTU. There are lots of other things to look at, first.
Without doing performance analysis on the actual running code, there really isn't a lot else to look at. In my case, we are looking at the web-server<->RDBMS link, which I've already architected to be on a dedicated switch segment. We already know that we'll never get much less than 3KB data transfers on that bus and it could get as high as 50KB. Larger transfers will also happen, occasionally. Where it hits the WAN is the OPS pipe to the fail-over site. We're looking at that now.
I hear consistent requests from server folks for higher MTUs; they claim the per-frame CPU hit is significant enough to warrant using Jumbo frames to increase throughput. The results clearly show that it helps.
It isn't really the CPU hit, although that may be a factor; we have intentionally over-built CPU because it's relatively cheap. The problem is that we aren't getting enough CPU utilization, because the xfer rates are too slow. That's how we found the problem in the first place. We've identified the source as being in the ramping algorithms (which make no sense in a switched environment, IMHO). A way to get a faster ramp is to have larger packet sizes to begin with. Since our switch-gear can handle it (all Cisco), the real issue is how this affects the WAN. There are also limits to how much we can tinker with the TCP/IP stacks on the RDBMS machine.
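The "faster ramp" idea can be illustrated with an idealized slow-start model. This is a sketch under stated assumptions, not a description of any particular stack: the congestion window starts at one segment, doubles every round trip, and there is no loss, no delayed ACK, and no receiver-window limit. The function name and the segment sizes tried are chosen for illustration; 4096 corresponds to the MTU=4096+40 figure mentioned earlier in the thread.

```python
import math


def round_trips(transfer_bytes: int, mss: int, initial_cwnd: int = 1) -> int:
    """Round trips needed to send transfer_bytes under idealized slow start."""
    segments = math.ceil(transfer_bytes / mss)
    cwnd, sent, rtts = initial_cwnd, 0, 0
    while sent < segments:
        sent += cwnd   # send a full window this round trip
        cwnd *= 2      # window doubles once the window is acknowledged
        rtts += 1
    return rtts


if __name__ == "__main__":
    for mss in (1460, 4096, 8960):
        print(f"MSS {mss}: {round_trips(50 * 1024, mss)} round trips for a 50KB transfer")
```

In this model a 50KB transfer needs fewer round trips as the segment size grows, since each doubling of the window covers more bytes. How much wall-clock time that saves depends on the actual round-trip time of the segment, which on a dedicated switched link may be very small.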
participants (6)
- Andrew Brown
- rdobbins@netmore.net
- Roeland Meyer (E-mail)
- Rural CNE
- Stephen Sprunk
- Valdis.Kletnieks@vt.edu