Sorry, glanced at this and thought it was someone having problems with tunnel MTU without adjusting TCP MSS. Nice work, though my preference is to avoid tunnels at all costs :-)
On Mon, Oct 29, 2012 at 12:39 PM, Templin, Fred L <Fred.L.Templin@boeing.com> wrote:
Hi Ray,
MSS rewriting has been well known and broadly applied for a long time now, but only applies to TCP. The subject of MSS rewriting comes up all the time in the IETF wg discussions, but has failed to reach consensus as a long-term alternative.
Plus, MSS rewriting does no good for tunnels-within-tunnels. If the innermost tunnel rewrites MSS to a value that *it* thinks is safe there is no guarantee that the packets will fit within any outer tunnels that occur further down the line.
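To make the tunnels-within-tunnels point concrete, here is a rough Python sketch. The 1500-byte link MTU, the overhead figures, and the helper names are assumptions chosen for illustration; they do not come from any particular implementation.

```python
# Illustrative only: names and constants are hypothetical.
IPV4_HEADER = 20   # bytes, IPv4 header without options
TCP_HEADER = 20    # bytes, TCP header without options
LINK_MTU = 1500    # bytes, assumed Ethernet-style path MTU


def packet_size_for_mss(mss):
    """Size of a full-sized TCP/IPv4 packet carrying `mss` bytes of payload."""
    return mss + IPV4_HEADER + TCP_HEADER


def fits(mss, link_mtu, overheads):
    """True if a full-sized segment still fits after every encapsulation."""
    return packet_size_for_mss(mss) + sum(overheads) <= link_mtu


inner_only = [20]    # a single IP-in-IP tunnel (20-byte outer header)
nested = [20, 24]    # the same tunnel later carried inside GRE (20 + 4)

# The MSS the innermost tunnel would pick, knowing only its own overhead.
clamped_mss = LINK_MTU - 20 - IPV4_HEADER - TCP_HEADER

print(clamped_mss)                             # 1440
print(fits(clamped_mss, LINK_MTU, inner_only))  # True
print(fits(clamped_mss, LINK_MTU, nested))      # False
```

The clamped value is only safe for the encapsulations the rewriting router knows about; the outer GRE hop adds another 24 bytes that the inner tunnel never accounted for.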
What I want to get to is an indefinite tunnel MTU; i.e., admit any packet into the tunnel regardless of its size then make any necessary adaptations from within the tunnel. That is exactly what SEAL does:
https://datatracker.ietf.org/doc/draft-templin-intarea-seal/
Thanks - Fred fred.l.templin@boeing.com
-----Original Message-----
From: Ray Soucy [mailto:rps@maine.edu]
Sent: Monday, October 29, 2012 7:55 AM
To: Templin, Fred L
Cc: Dobbins, Roland; NANOG list
Subject: Re: IP tunnel MTU
The core issue here is TCP MSS. PMTUD is a dynamic process for adjusting MSS, but requires that ICMP be permitted to negotiate the connection. The realistic alternative, in a world that filters all ICMP traffic, is to manually rewrite the MSS. In IOS this can be achieved via "ip tcp adjust-mss"; on Linux-based systems, netfilter can be used to adjust the MSS.
Keep in mind that the MSS will be smaller than your MTU. Consider the following example:
ip mtu 1480
ip tcp adjust-mss 1440
tunnel mode ipip
The outer IP header adds 20 bytes of overhead, leaving 1480 bytes of a 1500-byte MTU for data. So for an IP-in-IP tunnel, you'd set the MTU of your tunnel interface to 1480. Subtract another 20 bytes for the tunneled IP header and 20 bytes (typical) for your TCP header and you're left with 1440 bytes for data in a TCP connection. So in this case we write the MSS as 1440.
I use IP-in-IP as an example because it's simple. GRE tunnels can be a little more complex. While the GRE header is typically 4 bytes, it can grow up to 16 bytes depending on options used.
So for a typical GRE tunnel (4 byte header), you would subtract 20 bytes for the IP header and 4 bytes for the GRE header from your base MTU of 1500. This would mean an MTU of 1476, and a TCP MSS of 1436.
Keep in mind that a TCP header can be up to 60 bytes in length, so you may want to allow for more than the typical 20 bytes of TCP header (that is, set a lower MSS) if you're seeing problems.
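The arithmetic above can be checked with a short Python sketch. The helper names and the 1500-byte base MTU are assumptions for illustration only; the header sizes are the ones quoted in this message.

```python
# Illustrative only: helper names are hypothetical.
IPV4_HEADER = 20   # bytes, IPv4 header without options
TCP_HEADER = 20    # bytes, typical TCP header without options
LINK_MTU = 1500    # bytes, assumed base MTU


def tunnel_mtu(link_mtu, encap_overhead):
    """MTU left for the passenger packet after the tunnel adds its headers."""
    return link_mtu - encap_overhead


def tcp_mss(mtu, tcp_header=TCP_HEADER):
    """Payload bytes left after the inner IPv4 and TCP headers."""
    return mtu - IPV4_HEADER - tcp_header


ipip_mtu = tunnel_mtu(LINK_MTU, IPV4_HEADER)          # outer IP header only
gre_mtu = tunnel_mtu(LINK_MTU, IPV4_HEADER + 4)       # outer IP + 4-byte GRE
gre_max_mtu = tunnel_mtu(LINK_MTU, IPV4_HEADER + 16)  # outer IP + 16-byte GRE

print(ipip_mtu, tcp_mss(ipip_mtu))         # 1480 1440
print(gre_mtu, tcp_mss(gre_mtu))           # 1476 1436
print(gre_max_mtu, tcp_mss(gre_max_mtu))   # 1464 1424
print(tcp_mss(ipip_mtu, tcp_header=60))    # 1400, worst-case TCP header
```

The last line shows the most conservative case for IP-in-IP: budgeting for a full 60-byte TCP header leaves 1400 bytes of payload.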
On Tue, Oct 23, 2012 at 10:07 AM, Templin, Fred L <Fred.L.Templin@boeing.com> wrote:
Hi Roland,
-----Original Message-----
From: Dobbins, Roland [mailto:rdobbins@arbor.net]
Sent: Monday, October 22, 2012 6:49 PM
To: NANOG list
Subject: Re: IP tunnel MTU
On Oct 23, 2012, at 5:24 AM, Templin, Fred L wrote:
Since tunnels always reduce the effective MTU seen by data packets due to the encapsulation overhead, the only two ways to accommodate the tunnel MTU are path MTU discovery and fragmentation and reassembly.
Actually, you can set your tunnel MTU manually.
For example, the typical MTU folks set for a GRE tunnel is 1476.
Yes; I was aware of this. But, what I want to get to is setting the tunnel MTU to infinity.
This isn't a new issue; it's been around ever since tunneling technologies have been around, and tons have been written on this topic. Look at your various router/switch vendor Web sites, archives of this list and others, etc.
Sure. I've written a fair amount about it too over the span of the last ten years. What is new is that there is now a solution near at hand.
So, it's been known about, dealt with, and documented for a long time. In terms of doing something about it, the answer there is a) to allow the requisite ICMP for PMTU-D to work to/through any networks within your span of administrative control and b)
That does you no good if there is some other network further beyond your span of administrative control that does not allow the ICMP PTBs through. And, studies have shown this to be the case in a non-trivial number of instances.
b) adjusting your own tunnel MTUs to appropriate values based upon experimentation.
Adjust it down to what? 1280? Then, if your tunnel with the adjusted MTU enters another tunnel with its own adjusted MTU there is an MTU underflow that might not get reported if the ICMP PTB messages are lost. An alternative is to use IP fragmentation, but recent studies have shown that more and more operators are unconditionally dropping IPv6 fragments and IPv4 fragmentation is not an option due to wrapping IDs at high data rates.
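For a sense of scale on the wrapping-ID problem: the IPv4 Identification field is 16 bits, so only 65536 distinct values exist per source/destination/protocol tuple. The Python sketch below estimates how quickly they are consumed. It assumes full-sized 1500-byte packets all sharing one tuple, which is a simplification; the function name is hypothetical.

```python
# Illustrative only: assumes every packet is 1500 bytes and shares one
# source/destination/protocol tuple, so they all draw from one ID space.
ID_SPACE = 2 ** 16   # the IPv4 Identification field is 16 bits wide


def id_wrap_seconds(bits_per_second, packet_bytes=1500):
    """Seconds until the 16-bit IPv4 ID space is exhausted and wraps."""
    packets_per_second = bits_per_second / (packet_bytes * 8)
    return ID_SPACE / packets_per_second


for rate in (100e6, 1e9, 10e9):
    print(f"{rate / 1e9:g} Gb/s: ID space wraps in {id_wrap_seconds(rate):.3f} s")
```

Under these assumptions the IDs wrap in well under a second at 1 Gb/s. Reassembly timers typically run for many seconds, so a late fragment can be matched against a newer packet that reuses the same ID.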
Nested tunnels-within-tunnels occur in operational scenarios more and more, and adjusting the MTU for only one tunnel in the nesting does you no good if there are other tunnels that adjust their own MTUs.
Enterprise endpoint networks are notorious for blocking *all* ICMP (as well as TCP/53 DNS) at their edges due to 'security' misinformation propagated by Confused Information Systems Security Professionals and their ilk. Be sure that your own network policies aren't part of the problem affecting your userbase, as well as anyone else with a need to communicate with properties on your network via tunnels.
Again, all an operator can control is that which is within their own administrative domain. That does no good for ICMPs that are lost beyond their administrative domain.
Thanks - Fred fred.l.templin@boeing.com
----------------------------------------------------------------------- Roland Dobbins <rdobbins@arbor.net> // <http://www.arbornetworks.com>
Luck is the residue of opportunity and design.
-- John Milton
-- Ray Patrick Soucy Network Engineer University of Maine System
T: 207-561-3526 F: 207-561-3531
MaineREN, Maine's Research and Education Network www.maineren.net