I want to like IPv6. I do. But I'm seriously considering turning off IPv6 support from our servers. First off, I'm using djbdns internally and it doesn't support AAAA records. So we really aren't using it internally. But today I noticed that we have a lot of traffic to our DNS cache, and started to investigate. Turns out that every DNS request would start with one for the AAAA record. Ah, no luck. Maybe you forgot the search domain? Let's retry that DNS request with that tacked on. Failed again? Meanwhile, let's simultaneously try for the A record then. Repeat. I'm _this_ close to turning IPv6 off entirely. Anyone want to talk me off this ledge? -- http://josephholsten.com
On Wed, 17 Oct 2012, Joseph Anthony Pasquale Holsten wrote:
But today I noticed that we have a lot of traffic to our DNS cache, and started to investigate. Turns out that every DNS request would start with one for the AAAA record. Ah, no luck. Maybe you forgot the search domain? Let's retry that DNS request with that tacked on. Failed again? Meanwhile, let's simultaneously try for the A record then. Repeat.
You're describing normal behaviour. Why do you feel the behaviour you're seeing is a problem? Because you're not running IPv6, you're seeing twice the DNS traffic in some cases. Doing multiple lookups for everything in the search domain is done for IPv4 as well. -- Mikael Abrahamsson email: swmike@swm.pp.se
Upgrade djbdns to support IPv6? Think there is a patch for it... -mike Sent from my iPhone On Oct 16, 2012, at 20:36, Joseph Anthony Pasquale Holsten <joseph@josephholsten.com> wrote:
I want to like IPv6. I do. But I'm seriously considering turning off IPv6 support from our servers.
First off, I'm using djbdns internally and it doesn't support AAAA records. So we really aren't using it internally.
But today I noticed that we have a lot of traffic to our DNS cache, and started to investigate. Turns out that every DNS request would start with one for the AAAA record. Ah, no luck. Maybe you forgot the search domain? Let's retry that DNS request with that tacked on. Failed again? Meanwhile, let's simultaneously try for the A record then. Repeat.
I'm _this_ close to turning IPv6 off entirely. Anyone want to talk me off this ledge? -- http://josephholsten.com
On 2012-10-16 21:35, Joseph Anthony Pasquale Holsten wrote:
I want to like IPv6. I do. But I'm seriously considering turning off IPv6 support from our servers.
First off, I'm using djbdns internally and it doesn't support AAAA records. So we really aren't using it internally.
It sounds like this is a djbdns problem, not an IPv6 problem. FWIW, DJB's public take on IPv6 can be found here: http://cr.yp.to/djbdns/ipv6mess.html . Judging by the lack of updates in the past 10 years (OK, 10 years next month), I'm not certain whether his position has changed. (Granted, some of the ten-year-old facts have, so who knows.) Personally, I didn't agree with his perspective at the time, and I feel it's only gotten less valid over time.
But today I noticed that we have a lot of traffic to our DNS cache, and started to investigate. Turns out that every DNS request would start with one for the AAAA record. Ah, no luck. Maybe you forgot the search domain? Let's retry that DNS request with that tacked on. Failed again? Meanwhile, let's simultaneously try for the A record then. Repeat.
Are 2x the queries -- in exchange for future-proofing the network -- coming that close to overloading your DNS cache? You may want to re-evaluate the scalability of your cache. Or replace your DNS cache with something maintained in the last decade (I thought I was exaggerating, but the last changelog in 1.05 is 20010211), and deploy all your internal assets on IPv6 -- thus reducing the query load AND getting your systems ready for the future.
I'm _this_ close to turning IPv6 off entirely. Anyone want to talk me off this ledge?
Go right ahead. But first, what company is this, so the rest of us can know to avoid doing business? ;-) Jima
On Tue, 2012-10-16 at 21:59 -0600, Jima wrote:
FWIW, DJB's public take on IPv6 can be found here: http://cr.yp.to/djbdns/ipv6mess.html .
After a quick read, it seems that that statement completely fails to consider dual stack as a transition mechanism. Regards, K. -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Karl Auer (kauer@biplane.com.au) http://www.biplane.com.au/kauer http://www.biplane.com.au/blog GPG fingerprint: AE1D 4868 6420 AD9A A698 5251 1699 7B78 4EEE 6017 Old fingerprint: DA41 51B1 1481 16E1 F7E2 B2E9 3007 14ED 5736 F687
On 10/16/12, Randy Bush <randy@psg.com> wrote:
First off, I'm using djbdns internally and it doesn't support AAAA records. So we really aren't using it internally. if the clutch in my car is broken, should i stop using vehicles? dump djbdns or get some diehard to tell you how to fix it.
Ah, but the clutch is not actually broken; it works perfectly, and it is a very robust clutch, not likely to break. It's just that the car was designed so that you need a wrench with you at all times while driving to actuate the clutch, and a screwdriver on hand as well to adjust gears. There is a raw record format that allows you to enter a raw record into your tinydns data file, containing anything, including AAAA data. However, djbdns also lacks support for DNSSEC validation. The stock package 1.05, when installed on a 64-bit OS, contained an unpatched security vulnerability. The car was also designed with no electric ignition switch, and no headlights. You want to start your car, you need a manual crank. It's "good enough"; but probably the time comes soon to retire it. Electronic ignitions and headlights became the 'standard' a long time ago, but the car design was never improved to include the features (not necessarily an easy feat) -- meanwhile, the person in charge of maintaining the design spent many hours writing essays about the problem of light pollution caused by headlights, insisting that road lights instead would be better, and raising issues about the extra weight and space required for batteries, the danger of batteries leaking or failing and leaving motorists stranded, etc., thus spending time not updating the design to incorporate beneficial new standards.
randy -- -JH
On Wed, Oct 17, 2012 at 09:45:09PM -0500, Jimmy Hess wrote:
On 10/16/12, Randy Bush <randy@psg.com> wrote:
First off, I'm using djbdns internally and it doesn't support AAAA records. So we really aren't using it internally. if the clutch in my car is broken, should i stop using vehicles? dump djbdns or get some diehard to tell you how to fix it.
Ah, but the clutch is not actually broken; it works perfectly, and it is a very robust clutch, not likely to break. It's just that the car was designed so that you need a wrench with you at all times while driving to actuate the clutch, and a screwdriver on hand as well to adjust gears. There is a raw record format that allows you to enter a raw record into your tinydns data file, containing anything, including AAAA data.
However, djbdns also lacks support for DNSSEC validation. The stock package 1.05, when installed on a 64-bit OS, contained an unpatched security vulnerability.
If Joseph really likes to use the TinyDNS database so much, there is an experimental PowerDNS backend, and supposedly there is even a DNSSEC patch somewhere. I can't find the patch right now, but it was mentioned in a presentation by the head developer at ICANN44: http://prague44.icann.org/node/31749 Here is the audio recording: http://audio.icann.org/meetings/prague2012/dnssec-workshop-27jun12-en.mp3 (135 MB) His presentation starts at: 3:32:18 He mentions it at: 3:46:53 And the PDF of his presentation is here: http://prague44.icann.org/meetings/prague2012/presentation-dnssec-power-dns-... I don't expect anyone is using the patch in production right now.
The car was also designed with no electric ignition switch, and no headlights. You want to start your car, you need a manual crank. It's "good enough"; but probably the time comes soon to retire it.
Electronic ignitions and headlights became the 'standard' a long time ago, but the car design was never improved to include the features (not necessarily an easy feat) -- meanwhile, the person in charge of maintaining the design spent many hours writing essays about the problem of light pollution caused by headlights, insisting that road lights instead would be better, and raising issues about the extra weight and space required for batteries, the danger of batteries leaking or failing and leaving motorists stranded, etc., thus spending time not updating the design to incorporate beneficial new standards.
randy -- -JH
Have a nice day, Leen.
In message <2801F5F8-B8E2-4A9F-9A89-02D7783CCDA7@josephholsten.com>, Joseph Anthony Pasquale Holsten writes:
I want to like IPv6. I do. But I'm seriously considering turning off IPv6 support from our servers.
First off, I'm using djbdns internally and it doesn't support AAAA records. So we really aren't using it internally.
djbdns doesn't support lots of things.
But today I noticed that we have a lot of traffic to our DNS cache, and started to investigate. Turns out that every DNS request would start with one for the AAAA record. Ah, no luck. Maybe you forgot the search domain? Let's retry that DNS request with that tacked on. Failed again? Meanwhile, let's simultaneously try for the A record then. Repeat.
It looks like your getaddrinfo implementation is searching for AAAA records and then searching for A records. With an A record for name2 you get a query path like this, e.g.:
name1 AAAA -> NXDOMAIN
name2 AAAA -> NODATA
name3 AAAA -> NXDOMAIN
name1 A -> NXDOMAIN
name2 A -> DATA
You could ask your vendor to implement an alternating search strategy, e.g.:
name1 AAAA -> NXDOMAIN
name1 A -> NXDOMAIN
name2 AAAA -> NODATA
name2 A -> DATA
Additionally you could get your vendor to skip the A lookup on NXDOMAIN from AAAA, e.g.:
name1 AAAA -> NXDOMAIN
name2 AAAA -> NODATA
name2 A -> DATA
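As an illustration only (not part of Mark's message), here is a minimal Python sketch of the three lookup orders he describes, with a made-up search list and zone, so the query counts can be compared:

# Editorial sketch: compare query counts for the three getaddrinfo
# search orders described above.  Search list and zone are hypothetical.

SEARCH = ["corp.example.com", "example.com"]            # resolver search list
ZONE = {("host.example.com", "A"): "192.0.2.1"}          # only an A record exists

def query(name, rtype):
    """Answer the way a cache would: DATA, NODATA or NXDOMAIN."""
    if (name, rtype) in ZONE:
        return ZONE[(name, rtype)], "DATA"
    if any(n == name for (n, _) in ZONE):
        return None, "NODATA"          # name exists, but not for this type
    return None, "NXDOMAIN"            # name does not exist at all

def candidates(label):
    return ["%s.%s" % (label, d) for d in SEARCH] + [label]

def sequential(label):
    """All AAAA lookups first, then all A lookups (the behaviour observed)."""
    sent = 0
    for rtype in ("AAAA", "A"):
        for name in candidates(label):
            sent += 1
            answer, _ = query(name, rtype)
            if answer:
                return answer, sent
    return None, sent

def alternating(label, skip_a_on_nxdomain=False):
    """AAAA then A per candidate name; optionally skip A when AAAA said NXDOMAIN."""
    sent = 0
    for name in candidates(label):
        sent += 1
        answer, rcode = query(name, "AAAA")
        if answer:
            return answer, sent
        if skip_a_on_nxdomain and rcode == "NXDOMAIN":
            continue
        sent += 1
        answer, _ = query(name, "A")
        if answer:
            return answer, sent
    return None, sent

print(sequential("host"))                            # ('192.0.2.1', 5)
print(alternating("host"))                           # ('192.0.2.1', 4)
print(alternating("host", skip_a_on_nxdomain=True))  # ('192.0.2.1', 3)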
I'm _this_ close to turning IPv6 off entirely. Anyone want to talk me off this ledge? -- http://josephholsten.com
-- Mark Andrews, ISC 1 Seymour St., Dundas Valley, NSW 2117, Australia PHONE: +61 2 9871 4742 INTERNET: marka@isc.org
On Tue, 16 Oct 2012 20:35:11 -0700, Joseph Anthony Pasquale Holsten <joseph@josephholsten.com> wrote:
I want to like IPv6. I do. But I'm seriously considering turning off IPv6 support from our servers.
First off, I'm using djbdns internally and it doesn't support AAAA records. So we really aren't using it internally.
But today I noticed that we have a lot of traffic to our DNS cache, and started to investigate. Turns out that every DNS request would start with one for the AAAA record. Ah, no luck. Maybe you forgot the search domain? Let's retry that DNS request with that tacked on. Failed again? Meanwhile, let's simultaneously try for the A record then. Repeat.
I'm _this_ close to turning IPv6 off entirely. Anyone want to talk me off this ledge?
You will eventually have to turn it on again, so you can either keep running from a problem that will catch up with you anyway or, better, start tackling it now, like fixing djbdns or replacing it with something that works. That's my way of seeing it. Good luck with it. -- Octavio. Twitter: @alvarezp2000 -- Identi.ca: @alvarezp
On Wed, 17 Oct 2012, Joseph Anthony Pasquale Holsten wrote:
First off, I'm using djbdns internally and it doesn't support AAAA records. So we really aren't using it internally.
Sounds like a self-inflicted wound. You have alternatives.
I'm _this_ close to turning IPv6 off entirely. Anyone want to talk me off this ledge? -- http://josephholsten.com
You can stay on the ledge if you like. A lot of folks have already decided to move on... Antonio Querubin e-mail: tony@lavanauts.org xmpp: antonioquerubin@gmail.com
On 17 Oct 2012, at 5:35 AM, Joseph Anthony Pasquale Holsten <joseph@josephholsten.com> wrote:
I want to like IPv6. I do. But I'm seriously considering turning off IPv6 support from our servers.
First off, I'm using djbdns internally and it doesn't support AAAA records. So we really aren't using it internally.
But today I noticed that we have a lot of traffic to our DNS cache, and started to investigate. Turns out that every DNS request would start with one for the AAAA record. Ah, no luck. Maybe you forgot the search domain? Let's retry that DNS request with that tacked on. Failed again? Meanwhile, let's simultaneously try for the A record then. Repeat.
++ on what everyone else has said about this being a problem with the way you run your DNS infrastructure, instead of an actual IPv6 problem. Without reasons listed for why you use djbdns, I can't really adequately comment, but: on our net we're using unbound as caching DNS servers with pretty good success, and pdns with dynamic backends (the backends are custom in-house stuff) as our authoritative DNS. Short of issues now and then with the backends, it works pretty well. -J
In article <2801F5F8-B8E2-4A9F-9A89-02D7783CCDA7@josephholsten.com> you write:
I want to like IPv6. I do. But I'm seriously considering turning off IPv6 support from our servers.
First off, I'm using djbdns internally and it doesn't support AAAA records. So we really aren't using it internally.
I'm a long time djbdns user. But about a year ago, I switched from using dnscache to unbound for my cache, because it does useful stuff that dnscache doesn't do. I had a bunch of wacky local stuff configured into dnscache, like querying local servers for local-only domains, and substituting a local reject-all for some nasty outside domains, and it took about an hour to figure out how to do it all with unbound. I run it under daemontools. My authoritative servers are still tinydns, even though I do support IPv6. Since tinydns-data compiles stuff from a text source file, I have a perl script that translates lines with AAAA records in a normal format into the escape codes that tinydns uses for arbitrary record types. It's gross, but it works. So anyway, use unbound for your cache, no need to change away from tinydns unless you want to use DNSSEC, which it'll never support. -- Regards, John Levine, johnl@iecc.com, Primary Perpetrator of "The Internet for Dummies", Please consider the environment before reading this e-mail. http://jl.ly
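John's perl script isn't included in the thread; as an illustration only, the same translation can be sketched in a few lines of Python, emitting tinydns-data's generic record syntax (type 28 for AAAA, rdata bytes octal-escaped). The hostname and address below are placeholders:

# Editorial sketch, not John's script.  Emit a tinydns-data generic record
# (":fqdn:n:rdata:ttl") carrying an AAAA (type 28) answer, with every rdata
# byte octal-escaped so raw bytes and colons can't break the data file.

import ipaddress

def tinydns_aaaa(fqdn, ipv6, ttl=86400):
    rdata = ipaddress.IPv6Address(ipv6).packed           # the 16 address bytes
    escaped = "".join("\\%03o" % b for b in rdata)       # e.g. \040\001\015\270...
    return ":%s:28:%s:%d" % (fqdn, escaped, ttl)

print(tinydns_aaaa("www.example.com", "2001:db8::1"))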
On Wed, Oct 17, 2012 at 03:35:11AM +0000, Joseph Anthony Pasquale Holsten wrote:
First off, I'm using djbdns internally and it doesn't support AAAA records. So we really aren't using it internally.
I assume you mean stock djbdns doesn't support ip6, because it does indeed support AAAA records. I use both dnscache and tinydns from djbdns and AAAA records work fine for me. Note: I'm not using Felix von Leitner's ip6 patch. $ dig aaaa chocolatine.org +short 2610:130:103:e00:201:2ff:fe45:8308 Resolver is dnscache, authoritative server is tinydns. No problem. I think the problem you're experiencing, if there is one, is not related to either djbdns or ip6. Nicolai
On 18/10/2012, at 7:44 AM, Nicolai <nicolai-nanog@chocolatine.org> wrote:
On Wed, Oct 17, 2012 at 03:35:11AM +0000, Joseph Anthony Pasquale Holsten wrote:
First off, I'm using djbdns internally and it doesn't support AAAA records. So we really aren't using it internally.
I assume you mean stock djbdns doesn't support ip6, because it does indeed support AAAA records. I use both dnscache and tinydns from djbdns and AAAA records work fine for me. Note: I'm not using Felix von Leitner's ip6 patch.
$ dig aaaa chocolatine.org +short 2610:130:103:e00:201:2ff:fe45:8308
Resolver is dnscache, authoritative server is tinydns. No problem.
I think the problem you're experiencing, if there is one, is not related to either djbdns or ip6.
Nicolai
Apologies for the empty reply, mobile typo machine at work :( On 18/10/2012, at 7:44 AM, Nicolai <nicolai-nanog@chocolatine.org> wrote:
On Wed, Oct 17, 2012 at 03:35:11AM +0000, Joseph Anthony Pasquale Holsten wrote:
First off, I'm using djbdns internally and it doesn't support AAAA records. So we really aren't using it internally.
I assume you mean stock djbdns doesn't support ip6, because it does indeed support AAAA records.
Actually, it doesn't, as you so kindly pointed out. It does WITH a patch.
I use both dnscache and tinydns from djbdns and AAAA records work fine for me. Note: I'm not using Felix von Leitner's ip6 patch.
Thanks for pointing that out, finally.
$ dig aaaa chocolatine.org +short 2610:130:103:e00:201:2ff:fe45:8308
Resolver is dnscache, authoritative server is tinydns. No problem.
I think the problem you're experiencing, if there is one, is not related to either djbdns or ip6.
For real? Go figure.
Nicolai
On Sun, Oct 21, 2012 at 10:09:24PM +1100, Jay Mitchell wrote:
On 18/10/2012, at 7:44 AM, Nicolai <nicolai-nanog@chocolatine.org> wrote:
I assume you mean stock djbdns doesn't support ip6, because it does indeed support AAAA records.
Actually, it doesn't, as you so kindly pointed out. It does WITH a patch.
No. djbdns 1.05 supports AAAA records as anyone can verify. To make sure myself I just downloaded stock djbdns from the cr.yp.to website, installed, and ran some aaaa queries. Works as it always has. $ dig aaaa he.net +short 2001:470:0:76::2 That's an unpatched, stock dnscache. John Levine already described in this thread how tinydns supports AAAA records, so there's no point going over it again. I only responded to this thread to correct misinformation. sigh As an aside, you may want to fix your DNS, as some mail receivers don't like this: $ dig -x 72.249.91.101 +short static.serversandhosting.com. $ dig a static.serversandhosting.com +short 72.249.3.27 Nicolai
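For reference, the test those mail receivers apply is forward-confirmed reverse DNS: the PTR for the connecting address must resolve back to that same address. A minimal, stdlib-only Python sketch (editorial, not from the thread):

# Editorial sketch of the forward-confirmed reverse DNS check picky mail
# receivers apply: PTR -> name, then the name's A/AAAA must include the
# original address.

import socket

def fcrdns_ok(ip):
    try:
        name, _, _ = socket.gethostbyaddr(ip)                        # PTR lookup
        forward = {info[4][0] for info in socket.getaddrinfo(name, None)}
        return ip in forward                                         # forward must match
    except socket.error:
        return False

# With the records shown in the dig output above, this would return False:
# the PTR points at a name whose A record is a different address.
print(fcrdns_ok("72.249.91.101"))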
As an aside, you may want to fix your DNS, as some mail receivers don't like this:
$ dig -x 72.249.91.101 +short static.serversandhosting.com. $ dig a static.serversandhosting.com +short 72.249.3.27
What is really meant to be said is that MTAs which require RFC compliance won't talk to you. Running an MTA which requires minimal RFC compliance (particularly in respect of DNS configuration) eliminates 98% of spam. --- () ascii ribbon campaign against html e-mail /\ www.asciiribbon.org
On Mon, Oct 22, 2012 at 9:19 AM, Keith Medcalf <kmedcalf@dessus.com> wrote:
What is really meant to be said is that MTA's which require RFC compliance won't talk to you. Running an MTA which requires minimal RFC compliance (particularly in respect of DNS configuration) eliminates 98% of spam.
I wish it were that easy. -- Suresh Ramasubramanian (ops.lists@gmail.com)
In message <83452cbbe5c3c5439212c8a56346b283@mail.dessus.com>, "Keith Medcalf" writes:
As an aside, you may want to fix your DNS, as some mail receivers don't like this:
$ dig -x 72.249.91.101 +short static.serversandhosting.com. $ dig a static.serversandhosting.com +short 72.249.3.27
What is really meant to be said is that MTAs which require RFC compliance won't talk to you. Running an MTA which requires minimal RFC compliance (particularly in respect of DNS configuration) eliminates 98% of spam.
Standards track RFC compliance REQUIRES that you ACCEPT email from that box. There is no standards track RFC that requires that PTR records exist. There is no standards track RFC that requires that PTR and address records are consistent. It is however good practice that these exist and are consistent. Mark -- Mark Andrews, ISC 1 Seymour St., Dundas Valley, NSW 2117, Australia PHONE: +61 2 9871 4742 INTERNET: marka@isc.org
On Mon, Oct 22, 2012 at 03:18:52PM +1100, Mark Andrews wrote:
records are consistent. It is however good practice that these exist and are consistent.
I will note that the IETF DNSOP WG was unable to agree even on that latter claim. A -- Andrew Sullivan Dyn Labs asullivan@dyn.com
Hello, Several months ago, there was discussion on the list regarding IP tunnel maximum transmission unit (MTU). Since that time, it has been brought to my attention by members of my company's network operations staff that tunnel MTU is a very real problem they need to cope with on a daily basis - especially with the growing need to depend on both tunnels and tunnels-within-tunnels to track mobile devices. Since tunnels always reduce the effective MTU seen by data packets due to the encapsulation overhead, the only two ways to accommodate the tunnel MTU are either through the use of path MTU discovery or through fragmentation and reassembly. Unfortunately, both are known to be problematic in a non-trivial number of cases. The discussions on NANOG from back in the June timeframe resulted in "Operational Issues with Tunnel Maximum Transmission Unit": https://datatracker.ietf.org/doc/draft-generic-v6ops-tunmtu/ I would like to ask this group to now give this document a look and post your comments/thoughts/experiences. For example, has the tunnel MTU problem crept into daily operational considerations to the point that we should now at least be documenting it and preferably trying to do something about it? From talking to our staff, I believe the answer is yes but it would be good to have confirmation from others. Thanks in advance for your thoughts, Fred fred.l.templin@boeing.com
On Oct 23, 2012, at 5:24 AM, Templin, Fred L wrote:
Since tunnels always reduce the effective MTU seen by data packets due to the encapsulation overhead, the only two ways to accommodate the tunnel MTU is either through the use of path MTU discovery or through fragmentation and reassembly.
Actually, you can set your tunnel MTU manually. For example, the typical MTU folks set for a GRE tunnel is 1476. This isn't a new issue; it's been around ever since tunneling technologies have been around, and tons have been written on this topic. Look at your various router/switch vendor Web sites, archives of this list and others, etc. So, it's been known about, dealt with, and documented for a long time. In terms of doing something about it, the answer there is a) to allow the requisite ICMP for PMTU-D to work to/through any networks within your span of administrative control and b) adjusting your own tunnel MTUs to appropriate values based upon experimentation. Enterprise endpoint networks are notorious for blocking *all* ICMP (as well as TCP/53 DNS) at their edges due to 'security' misinformation propagated by Confused Information Systems Security Professionals and their ilk. Be sure that your own network policies aren't part of the problem affecting your userbase, as well as anyone else with a need to communicate with properties on your network via tunnels. ----------------------------------------------------------------------- Roland Dobbins <rdobbins@arbor.net> // <http://www.arbornetworks.com> Luck is the residue of opportunity and design. -- John Milton
Hi Roland,
-----Original Message----- From: Dobbins, Roland [mailto:rdobbins@arbor.net] Sent: Monday, October 22, 2012 6:49 PM To: NANOG list Subject: Re: IP tunnel MTU
On Oct 23, 2012, at 5:24 AM, Templin, Fred L wrote:
Since tunnels always reduce the effective MTU seen by data packets due to the encapsulation overhead, the only two ways to accommodate the tunnel MTU is either through the use of path MTU discovery or through fragmentation and reassembly.
Actually, you can set your tunnel MTU manually.
For example, the typical MTU folks set for a GRE tunnel is 1476.
Yes; I was aware of this. But, what I want to get to is setting the tunnel MTU to infinity.
This isn't a new issue; it's been around ever since tunneling technologies have been around, and tons have been written on this topic. Look at your various router/switch vendor Web sites, archives of this list and others, etc.
Sure. I've written a fair amount about it too over the span of the last ten years. What is new is that there is now a solution near at hand.
So, it's been known about, dealt with, and documented for a long time. In terms of doing something about it, the answer there is a) to allow the requisite ICMP for PMTU-D to work to/through any networks within your span of administrative control and b)
That does you no good if there is some other network further beyond your span of administrative control that does not allow the ICMP PTBs through. And, studies have shown this to be the case in a non-trivial number of instances.
b) adjusting your own tunnel MTUs to appropriate values based upon experimentation.
Adjust it down to what? 1280? Then, if your tunnel with the adjusted MTU enters another tunnel with its own adjusted MTU there is an MTU underflow that might not get reported if the ICMP PTB messages are lost. An alternative is to use IP fragmentation, but recent studies have shown that more and more operators are unconditionally dropping IPv6 fragments and IPv4 fragmentation is not an option due to wrapping IDs at high data rates. Nested tunnels-within-tunnels occur in operational scenarios more and more, and adjusting the MTU for only one tunnel in the nesting does you no good if there are other tunnels that adjust their own MTUs.
Enterprise endpoint networks are notorious for blocking *all* ICMP (as well as TCP/53 DNS) at their edges due to 'security' misinformation propagated by Confused Information Systems Security Professionals and their ilk. Be sure that your own network policies aren't part of the problem affecting your userbase, as well as anyone else with a need to communicate with properties on your network via tunnels.
Again, all an operator can control is that which is within their own administrative domain. That does no good for ICMPs that are lost beyond their administrative domain. Thanks - Fred fred.l.templin@boeing.com
----------------------------------------------------------------------- Roland Dobbins <rdobbins@arbor.net> // <http://www.arbornetworks.com>
Luck is the residue of opportunity and design.
-- John Milton
The core issue here is TCP MSS. PMTUD is a dynamic process for adjusting MSS, but requires that ICMP be permitted to negotiate the connection. The realistic alternative, in a world that filters all ICMP traffic, is to manually rewrite the MSS. In IOS this can be achieved via "ip tcp adjust-mss" and on Linux-based systems, netfilter can be used to adjust MSS for example. Keep in mind that the MSS will be smaller than your MTU. Consider the following example: ip mtu 1480 ip tcp adjust-mss 1440 tunnel mode ipip IP packets have 20 bytes of overhead, leaving 1480 bytes for data. So for an IP-in-IP tunnel, you'd set your MTU of your tunnel interface to 1480. Subtract another 20 bytes for the tunneled IP header and 20 bytes (typical) for your TCP header and you're left with 1440 bytes for data in a TCP connection. So in this case we write the MSS as 1440. I use IP-in-IP as an example because it's simple. GRE tunnels can be a little more complex. While the GRE header is typically 4 bytes, it can grow up to 16 bytes depending on options used. So for a typical GRE tunnel (4 byte header), you would subtract 20 bytes for the IP header and 4 bytes for the GRE header from your base MTU of 1500. This would mean an MTU of 1476, and a TCP MSS of 1436. Keep in mind that a TCP header can be up to 60 bytes in length, so you may want to allow for more than the typical 20 bytes when computing your MSS if you're seeing problems. (A worked example of this arithmetic appears after this message.) On Tue, Oct 23, 2012 at 10:07 AM, Templin, Fred L <Fred.L.Templin@boeing.com> wrote:
Hi Roland,
-----Original Message----- From: Dobbins, Roland [mailto:rdobbins@arbor.net] Sent: Monday, October 22, 2012 6:49 PM To: NANOG list Subject: Re: IP tunnel MTU
On Oct 23, 2012, at 5:24 AM, Templin, Fred L wrote:
Since tunnels always reduce the effective MTU seen by data packets due to the encapsulation overhead, the only two ways to accommodate the tunnel MTU is either through the use of path MTU discovery or through fragmentation and reassembly.
Actually, you can set your tunnel MTU manually.
For example, the typical MTU folks set for a GRE tunnel is 1476.
Yes; I was aware of this. But, what I want to get to is setting the tunnel MTU to infinity.
This isn't a new issue; it's been around ever since tunneling technologies have been around, and tons have been written on this topic. Look at your various router/switch vendor Web sites, archives of this list and others, etc.
Sure. I've written a fair amount about it too over the span of the last ten years. What is new is that there is now a solution near at hand.
So, it's been known about, dealt with, and documented for a long time. In terms of doing something about it, the answer there is a) to allow the requisite ICMP for PMTU-D to work to/through any networks within your span of administrative control and b)
That does you no good if there is some other network further beyond your span of administrative control that does not allow the ICMP PTBs through. And, studies have shown this to be the case in a non-trivial number of instances.
b) adjusting your own tunnel MTUs to appropriate values based upon experimentation.
Adjust it down to what? 1280? Then, if your tunnel with the adjusted MTU enters another tunnel with its own adjusted MTU there is an MTU underflow that might not get reported if the ICMP PTB messages are lost. An alternative is to use IP fragmentation, but recent studies have shown that more and more operators are unconditionally dropping IPv6 fragments and IPv4 fragmentation is not an option due to wrapping IDs at high data rates.
Nested tunnels-within-tunnels occur in operational scenarios more and more, and adjusting the MTU for only one tunnel in the nesting does you no good if there are other tunnels that adjust their own MTUs.
Enterprise endpoint networks are notorious for blocking *all* ICMP (as well as TCP/53 DNS) at their edges due to 'security' misinformation propagated by Confused Information Systems Security Professionals and their ilk. Be sure that your own network policies aren't part of the problem affecting your userbase, as well as anyone else with a need to communicate with properties on your network via tunnels.
Again, all an operator can control is that which is within their own administrative domain. That does no good for ICMPs that are lost beyond their administrative domain.
Thanks - Fred fred.l.templin@boeing.com
----------------------------------------------------------------------- Roland Dobbins <rdobbins@arbor.net> // <http://www.arbornetworks.com>
Luck is the residue of opportunity and design.
-- John Milton
-- Ray Patrick Soucy Network Engineer University of Maine System T: 207-561-3526 F: 207-561-3531 MaineREN, Maine's Research and Education Network www.maineren.net
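A worked example of the arithmetic in Ray's message above, as a short editorial Python sketch; the header sizes are the typical ones he cites (20-byte outer and inner IPv4 headers, 20-byte TCP header, 4-byte GRE header):

# Editorial sketch: derive tunnel MTU and the MSS to clamp to from the
# physical MTU and the encapsulation overhead Ray describes.

BASE_MTU = 1500
OUTER_IP = 20        # outer IPv4 header added by the tunnel
INNER_IP = 20        # inner IPv4 header
TCP = 20             # TCP header without options

ENCAPS = {
    "ipip": OUTER_IP,         # IP-in-IP: just the outer IP header
    "gre":  OUTER_IP + 4,     # GRE: outer IP header plus a 4-byte GRE header
}

for mode, overhead in ENCAPS.items():
    tunnel_mtu = BASE_MTU - overhead
    mss = tunnel_mtu - INNER_IP - TCP
    print(f"{mode}: ip mtu {tunnel_mtu}, ip tcp adjust-mss {mss}")

# ipip: ip mtu 1480, ip tcp adjust-mss 1440
# gre:  ip mtu 1476, ip tcp adjust-mss 1436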
Hi Ray, MSS rewriting has been well known and broadly applied for a long time now, but only applies to TCP. The subject of MSS rewriting comes up all the time in the IETF wg discussions, but has failed to reach consensus as a long-term alternative. Plus, MSS rewriting does no good for tunnels-within-tunnels. If the innermost tunnel rewrites MSS to a value that *it* thinks is safe there is no guarantee that the packets will fit within any outer tunnels that occur further down the line. What I want to get to is an indefinite tunnel MTU; i.e., admit any packet into the tunnel regardless of its size then make any necessary adaptations from within the tunnel. That is exactly what SEAL does: https://datatracker.ietf.org/doc/draft-templin-intarea-seal/ Thanks - Fred fred.l.templin@boeing.com
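As a toy illustration of the idea Fred describes (this is not SEAL itself; the draft defines its own encapsulation and signalling), a tunnel ingress can split any inner packet to fit the path toward the egress, which reassembles it, so the original sender never needs to see a PTB:

# Toy editorial sketch only, NOT the SEAL wire format: the ingress accepts
# an inner packet of any size, splits it to fit what the path can carry,
# and the egress reassembles before delivery.

def ingress(packet, path_mtu, overhead=8):
    """Split one inner packet into tunnel segments that fit path_mtu."""
    chunk = path_mtu - overhead
    pieces = [packet[i:i + chunk] for i in range(0, len(packet), chunk)]
    # each segment carries (index, total) so the far end can reassemble
    return [(i, len(pieces), p) for i, p in enumerate(pieces)]

def egress(segments):
    """Reassemble tunnel segments into the original inner packet."""
    segments = sorted(segments, key=lambda s: s[0])
    assert len(segments) == segments[0][1], "lost a tunnel segment"
    return b"".join(p for _, _, p in segments)

inner = b"\x00" * 4000                    # inner packet larger than any link MTU
assert egress(ingress(inner, path_mtu=1400)) == inner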
-----Original Message----- From: Ray Soucy [mailto:rps@maine.edu] Sent: Monday, October 29, 2012 7:55 AM To: Templin, Fred L Cc: Dobbins, Roland; NANOG list Subject: Re: IP tunnel MTU
The core issue here is TCP MSS. PMTUD is a dynamic process for adjusting MSS, but requires that ICMP be permitted to negotiate the connection. The realistic alternative, in a world that filters all ICMP traffic, is to manually rewrite the MSS. In IOS this can be achieved via "ip tcp adjust-mss" and on Linux-based systems, netfilter can be used to adjust MSS for example.
Keep in mind that the MSS will be smaller than your MTU. Consider the following example:
ip mtu 1480 ip tcp adjust-mss 1440 tunnel mode ipip
IP packets have 20 bytes of overhead, leaving 1480 bytes for data. So for an IP-in-IP tunnel, you'd set your MTU of your tunnel interface to 1480. Subtract another 20 bytes for the tunneled IP header and 20 bytes (typical) for your TCP header and you're left with 1440 bytes for data in a TCP connection. So in this case we write the MSS as 1440.
I use IP-in-IP as an example because it's simple. GRE tunnels can be a little more complex. While the GRE header is typically 4 bytes, it can grow up to 16 bytes depending on options used.
So for a typical GRE tunnel (4 byte header), you would subtract 20 bytes for the IP header and 4 bytes for the GRE header from your base MTU of 1500. This would mean an MTU of 1476, and a TCP MSS of 1436.
Keep in mind that a TCP header can be up to 60 bytes in length, so you may want to go higher than the typical 20 bytes for your MSS if you're seeing problems.
On Tue, Oct 23, 2012 at 10:07 AM, Templin, Fred L <Fred.L.Templin@boeing.com> wrote:
Hi Roland,
-----Original Message----- From: Dobbins, Roland [mailto:rdobbins@arbor.net] Sent: Monday, October 22, 2012 6:49 PM To: NANOG list Subject: Re: IP tunnel MTU
On Oct 23, 2012, at 5:24 AM, Templin, Fred L wrote:
Since tunnels always reduce the effective MTU seen by data packets due to the encapsulation overhead, the only two ways to accommodate the tunnel MTU is either through the use of path MTU discovery or through fragmentation and reassembly.
Actually, you can set your tunnel MTU manually.
For example, the typical MTU folks set for a GRE tunnel is 1476.
Yes; I was aware of this. But, what I want to get to is setting the tunnel MTU to infinity.
This isn't a new issue; it's been around ever since tunneling technologies have been around, and tons have been written on this topic. Look at your various router/switch vendor Web sites, archives of this list and others, etc.
Sure. I've written a fair amount about it too over the span of the last ten years. What is new is that there is now a solution near at hand.
So, it's been known about, dealt with, and documented for a long time. In terms of doing something about it, the answer there is a) to allow the requisite ICMP for PMTU-D to work to/through any networks within your span of administrative control and b)
That does you no good if there is some other network further beyond your span of administrative control that does not allow the ICMP PTBs through. And, studies have shown this to be the case in a non-trivial number of instances.
b) adjusting your own tunnel MTUs to appropriate values based upon experimentation.
Adjust it down to what? 1280? Then, if your tunnel with the adjusted MTU enters another tunnel with its own adjusted MTU there is an MTU underflow that might not get reported if the ICMP PTB messages are lost. An alternative is to use IP fragmentation, but recent studies have shown that more and more operators are unconditionally dropping IPv6 fragments and IPv4 fragmentation is not an option due to wrapping IDs at high data rates.
Nested tunnels-within-tunnels occur in operational scenarios more and more, and adjusting the MTU for only one tunnel in the nesting does you no good if there are other tunnels that adjust their own MTUs.
Enterprise endpoint networks are notorious for blocking *all* ICMP (as well as TCP/53 DNS) at their edges due to 'security' misinformation propagated by Confused Information Systems Security Professionals and their ilk. Be sure that your own network policies aren't part of the problem affecting your userbase, as well as anyone else with a need to communicate with properties on your network via tunnels.
Again, all an operator can control is that which is within their own administrative domain. That does no good for ICMPs that are lost beyond their administrative domain.
Thanks - Fred fred.l.templin@boeing.com
----------------------------------------------------------------------- Roland Dobbins <rdobbins@arbor.net> // <http://www.arbornetworks.com>
Luck is the residue of opportunity and design.
-- John Milton
-- Ray Patrick Soucy Network Engineer University of Maine System
T: 207-561-3526 F: 207-561-3531
MaineREN, Maine's Research and Education Network www.maineren.net
Sorry, glanced at this and thought it was someone having problems with tunnel MTU without adjusting TCP MSS. Nice work, though my preference is to avoid tunnels at all costs :-) On Mon, Oct 29, 2012 at 12:39 PM, Templin, Fred L <Fred.L.Templin@boeing.com> wrote:
Hi Ray,
MSS rewriting has been well known and broadly applied for a long time now, but only applies to TCP. The subject of MSS rewriting comes up all the time in the IETF wg discussions, but has failed to reach consensus as a long-term alternative.
Plus, MSS rewriting does no good for tunnels-within-tunnels. If the innermost tunnel rewrites MSS to a value that *it* thinks is safe there is no guarantee that the packets will fit within any outer tunnels that occur further down the line.
What I want to get to is an indefinite tunnel MTU; i.e., admit any packet into the tunnel regardless of its size then make any necessary adaptations from within the tunnel. That is exactly what SEAL does:
https://datatracker.ietf.org/doc/draft-templin-intarea-seal/
Thanks - Fred fred.l.templin@boeing.com
-----Original Message----- From: Ray Soucy [mailto:rps@maine.edu] Sent: Monday, October 29, 2012 7:55 AM To: Templin, Fred L Cc: Dobbins, Roland; NANOG list Subject: Re: IP tunnel MTU
The core issue here is TCP MSS. PMTUD is a dynamic process for adjusting MSS, but requires that ICMP be permitted to negotiate the connection. The realistic alternative, in a world that filters all ICMP traffic, is to manually rewrite the MSS. In IOS this can be achieved via "ip tcp adjust-mss" and on Linux-based systems, netfilter can be used to adjust MSS for example.
Keep in mind that the MSS will be smaller than your MTU. Consider the following example:
ip mtu 1480 ip tcp adjust-mss 1440 tunnel mode ipip
IP packets have 20 bytes of overhead, leaving 1480 bytes for data. So for an IP-in-IP tunnel, you'd set your MTU of your tunnel interface to 1480. Subtract another 20 bytes for the tunneled IP header and 20 bytes (typical) for your TCP header and you're left with 1440 bytes for data in a TCP connection. So in this case we write the MSS as 1440.
I use IP-in-IP as an example because it's simple. GRE tunnels can be a little more complex. While the GRE header is typically 4 bytes, it can grow up to 16 bytes depending on options used.
So for a typical GRE tunnel (4 byte header), you would subtract 20 bytes for the IP header and 4 bytes for the GRE header from your base MTU of 1500. This would mean an MTU of 1476, and a TCP MSS of 1436.
Keep in mind that a TCP header can be up to 60 bytes in length, so you may want to go higher than the typical 20 bytes for your MSS if you're seeing problems.
On Tue, Oct 23, 2012 at 10:07 AM, Templin, Fred L <Fred.L.Templin@boeing.com> wrote:
Hi Roland,
-----Original Message----- From: Dobbins, Roland [mailto:rdobbins@arbor.net] Sent: Monday, October 22, 2012 6:49 PM To: NANOG list Subject: Re: IP tunnel MTU
On Oct 23, 2012, at 5:24 AM, Templin, Fred L wrote:
Since tunnels always reduce the effective MTU seen by data packets due to the encapsulation overhead, the only two ways to accommodate the tunnel MTU is either through the use of path MTU discovery or through fragmentation and reassembly.
Actually, you can set your tunnel MTU manually.
For example, the typical MTU folks set for a GRE tunnel is 1476.
Yes; I was aware of this. But, what I want to get to is setting the tunnel MTU to infinity.
This isn't a new issue; it's been around ever since tunneling technologies have been around, and tons have been written on this topic. Look at your various router/switch vendor Web sites, archives of this list and others, etc.
Sure. I've written a fair amount about it too over the span of the last ten years. What is new is that there is now a solution near at hand.
So, it's been known about, dealt with, and documented for a long time. In terms of doing something about it, the answer there is a) to allow the requisite ICMP for PMTU-D to work to/through any networks within your span of administrative control and b)
That does you no good if there is some other network further beyond your span of administrative control that does not allow the ICMP PTBs through. And, studies have shown this to be the case in a non-trivial number of instances.
b) adjusting your own tunnel MTUs to appropriate values based upon experimentation.
Adjust it down to what? 1280? Then, if your tunnel with the adjusted MTU enters another tunnel with its own adjusted MTU there is an MTU underflow that might not get reported if the ICMP PTB messages are lost. An alternative is to use IP fragmentation, but recent studies have shown that more and more operators are unconditionally dropping IPv6 fragments and IPv4 fragmentation is not an option due to wrapping IDs at high data rates.
Nested tunnels-within-tunnels occur in operational scenarios more and more, and adjusting the MTU for only one tunnel in the nesting does you no good if there are other tunnels that adjust their own MTUs.
Enterprise endpoint networks are notorious for blocking *all* ICMP (as well as TCP/53 DNS) at their edges due to 'security' misinformation propagated by Confused Information Systems Security Professionals and their ilk. Be sure that your own network policies aren't part of the problem affecting your userbase, as well as anyone else with a need to communicate with properties on your network via tunnels.
Again, all an operator can control is that which is within their own administrative domain. That does no good for ICMPs that are lost beyond their administrative domain.
Thanks - Fred fred.l.templin@boeing.com
----------------------------------------------------------------------- Roland Dobbins <rdobbins@arbor.net> // <http://www.arbornetworks.com>
Luck is the residue of opportunity and design.
-- John Milton
-- Ray Patrick Soucy Network Engineer University of Maine System
T: 207-561-3526 F: 207-561-3531
MaineREN, Maine's Research and Education Network www.maineren.net
-- Ray Patrick Soucy Network Engineer University of Maine System T: 207-561-3526 F: 207-561-3531 MaineREN, Maine's Research and Education Network www.maineren.net
Hi there, I have the same problem in my network. I have a GRE tunnel for transferring users' real internet traffic, and they have problems with browsing websites like yahoo.com or microsoft.com. I had to set ip mtu 1500 to solve it, and that causes fragmentation... Thanks On Mon, Oct 29, 2012 at 10:47 PM, Ray Soucy <rps@maine.edu> wrote:
Sorry, glanced at this and thought it was someone having problems with tunnel MTU without adjusting TCP MSS.
Nice work, though my preference is to avoid tunnels at all costs :-)
Hi Ray,
MSS rewriting has been well known and broadly applied for a long time now, but only applies to TCP. The subject of MSS rewriting comes up all the time in the IETF wg discussions, but has failed to reach consensus as a long-term alternative.
Plus, MSS rewriting does no good for tunnels-within-tunnels. If the innermost tunnel rewrites MSS to a value that *it* thinks is safe there is no guarantee that the packets will fit within any outer tunnels that occur further down the line.
What I want to get to is an indefinite tunnel MTU; i.e., admit any packet into the tunnel regardless of its size then make any necessary adaptations from within the tunnel. That is exactly what SEAL does:
https://datatracker.ietf.org/doc/draft-templin-intarea-seal/
Thanks - Fred fred.l.templin@boeing.com
-----Original Message----- From: Ray Soucy [mailto:rps@maine.edu] Sent: Monday, October 29, 2012 7:55 AM To: Templin, Fred L Cc: Dobbins, Roland; NANOG list Subject: Re: IP tunnel MTU
The core issue here is TCP MSS. PMTUD is a dynamic process for adjusting MSS, but requires that ICMP be permitted to negotiate the connection. The realistic alternative, in a world that filters all ICMP traffic, is to manually rewrite the MSS. In IOS this can be achieved via "ip tcp adjust-mss" and on Linux-based systems, netfilter can be used to adjust MSS for example.
Keep in mind that the MSS will be smaller than your MTU. Consider the following example:
ip mtu 1480 ip tcp adjust-mss 1440 tunnel mode ipip
IP packets have 20 bytes of overhead, leaving 1480 bytes for data. So for an IP-in-IP tunnel, you'd set your MTU of your tunnel interface to 1480. Subtract another 20 bytes for the tunneled IP header and 20 bytes (typical) for your TCP header and you're left with 1440 bytes for data in a TCP connection. So in this case we write the MSS as 1440.
I use IP-in-IP as an example because it's simple. GRE tunnels can be a little more complex. While the GRE header is typically 4 bytes, it can grow up to 16 bytes depending on options used.
So for a typical GRE tunnel (4 byte header), you would subtract 20 bytes for the IP header and 4 bytes for the GRE header from your base MTU of 1500. This would mean an MTU of 1476, and a TCP MSS of 1436.
Keep in mind that a TCP header can be up to 60 bytes in length, so you may want to go higher than the typical 20 bytes for your MSS if you're seeing problems.
On Tue, Oct 23, 2012 at 10:07 AM, Templin, Fred L <Fred.L.Templin@boeing.com> wrote:
Hi Roland,
-----Original Message----- From: Dobbins, Roland [mailto:rdobbins@arbor.net] Sent: Monday, October 22, 2012 6:49 PM To: NANOG list Subject: Re: IP tunnel MTU
On Oct 23, 2012, at 5:24 AM, Templin, Fred L wrote:
Since tunnels always reduce the effective MTU seen by data packets due to the encapsulation overhead, the only two ways to accommodate the tunnel MTU is either through the use of path MTU discovery or through fragmentation and reassembly.
Actually, you can set your tunnel MTU manually.
For example, the typical MTU folks set for a GRE tunnel is 1476.
Yes; I was aware of this. But, what I want to get to is setting the tunnel MTU to infinity.
This isn't a new issue; it's been around ever since tunneling technologies have been around, and tons have been written on this topic. Look at your various router/switch vendor Web sites, archives of this list and others, etc.
Sure. I've written a fair amount about it too over the span of the last ten years. What is new is that there is now a solution near at hand.
So, it's been known about, dealt with, and documented for a long time. In terms of doing something about it, the answer there is a) to allow the requisite ICMP for PMTU-D to work to/through any networks within your span of administrative control and b)
On Mon, Oct 29, 2012 at 12:39 PM, Templin, Fred L <Fred.L.Templin@boeing.com> wrote:
That does you no good if there is some other network further beyond your span of administrative control that does not allow the ICMP PTBs through. And, studies have shown this to be the case in a non-trivial number of instances.
b) adjusting your own tunnel MTUs to appropriate values based upon experimentation.
Adjust it down to what? 1280? Then, if your tunnel with the adjusted MTU enters another tunnel with its own adjusted MTU there is an MTU underflow that might not get reported if the ICMP PTB messages are lost. An alternative is to use IP fragmentation, but recent studies have shown that more and more operators are unconditionally dropping IPv6 fragments and IPv4 fragmentation is not an option due to wrapping IDs at high data rates.
Nested tunnels-within-tunnels occur in operational scenarios more and more, and adjusting the MTU for only one tunnel in the nesting does you no good if there are other tunnels that adjust their own MTUs.
Enterprise endpoint networks are notorious for blocking *all* ICMP (as well as TCP/53 DNS) at their edges due to 'security' misinformation propagated by Confused Information Systems Security Professionals and their ilk. Be sure that your own network policies aren't part of the problem affecting your userbase, as well as anyone else with a need to communicate with properties on your network via tunnels.
Again, all an operator can control is that which is within their own administrative domain. That does no good for ICMPs that are lost beyond their administrative domain.
Thanks - Fred fred.l.templin@boeing.com
Roland Dobbins <rdobbins@arbor.net> // <http://www.arbornetworks.com>
Luck is the residue of opportunity and design.
-- John Milton
-- Ray Patrick Soucy Network Engineer University of Maine System
T: 207-561-3526 F: 207-561-3531
MaineREN, Maine's Research and Education Network www.maineren.net
-- Ray Patrick Soucy Network Engineer University of Maine System
T: 207-561-3526 F: 207-561-3531
MaineREN, Maine's Research and Education Network www.maineren.net
-- Regards, Shahab Vahabzadeh, Network Engineer and System Administrator Cell Phone: +1 (415) 871 0742 PGP Key Fingerprint = 8E34 B335 D702 0CA7 5A81 C2EE 76A2 46C2 5367 BF90
On Mon, Oct 29, 2012 at 10:54 AM, Ray Soucy <rps@maine.edu> wrote:
The core issue here is TCP MSS. PMTUD is a dynamic process for adjusting MSS, but requires that ICMP be permitted to negotiate the connection. The realistic alternative, in a world that filters all ICMP traffic, is to manually rewrite the MSS. In IOS this can be achieved via "ip tcp adjust-mss" and on Linux-based systems, netfilter can be used to adjust MSS for example.
Longer term, the ideal solution would be a replacement algorithm that allows TCP to adjust its MSS with or without negative acknowledgement from intermediate routers. The ICMP-didn't-get-there problem is only going to get worse and things like private IPs on routers and encapsulation mechanisms where the intermediate router isn't dealing with an IP packet directly are as much at fault these days as foolish firewall admins. Perhaps my understanding of end-to-end is flawed, but I suspect it means that an endpoint shouldn't depend on direct communication with an intermediate system for its successful communication with another endpoint. Maybe something as simple as clearing the don't fragment flag and adding a TCP option to report receipt of a fragmented packet along with the fragment sizes back to the sender so he can adjust his mss to avoid fragmentation. Regards, Bill Herrin -- William D. Herrin ................ herrin@dirtside.com bill@herrin.us 3005 Crane Dr. ...................... Web: <http://bill.herrin.us/> Falls Church, VA 22042-3004
Hi Bill,
Maybe something as simple as clearing the don't fragment flag and adding a TCP option to report receipt of a fragmented packet along with the fragment sizes back to the sender so he can adjust his mss to avoid fragmentation.
That is in fact what SEAL is doing, but there is no guarantee that the size of the largest fragment is going to be an accurate reflection of the true path MTU. RFC1812 made sure of that when it more or less gave IPv4 routers permission to fragment packets pretty much any way they want. Thanks - Fred fred.l.templin@boeing.com
True, but it could be used as an alternative PMTUD algorithm - raise the segment size and wait for the "I got this as fragments" option to show up... Of course, this only works for IPv4. IPv6 users are SOL if something in the middle is dropping ICMPv6. -C On Oct 29, 2012, at 4:02 PM, Templin, Fred L wrote:
Hi Bill,
Maybe something as simple as clearing the don't fragment flag and adding a TCP option to report receipt of a fragmented packet along with the fragment sizes back to the sender so he can adjust his mss to avoid fragmentation.
That is in fact what SEAL is doing, but there is no guarantee that the size of the largest fragment is going to be an accurate reflection of the true path MTU. RFC1812 made sure of that when it more or less gave IPv4 routers permission to fragment packets pretty much any way they want.
Thanks - Fred fred.l.templin@boeing.com
Hi Chris,
-----Original Message----- From: Chris Woodfield [mailto:rekoil@semihuman.com] Sent: Monday, October 29, 2012 4:40 PM To: Templin, Fred L Cc: William Herrin; Ray Soucy; NANOG list Subject: Re: IP tunnel MTU
True, but it could be used as an alternative PMTUD algorithm - raise the segment size and wait for the "I got this as fragments" option to show up...
Yes; it is a very attractive option on the surface. Steve Deering called it "Report Fragmentation (RF)" when he first proposed it back in 1988, but it didn't gain sufficient traction and what we got instead was RFC1191. As I mentioned, SEAL does this already but in a "best effort" fashion. SEAL will work over paths that don't conform well to the RF model, but will derive some useful benefit from paths that do.
Of course, this only works for IPv4. IPv6 users are SOL if something in the middle is dropping ICMPv6.
Sad, but true. Thanks - Fred fred.l.templin@boeing.com
-C
On Oct 29, 2012, at 4:02 PM, Templin, Fred L wrote:
Hi Bill,
Maybe something as simple as clearing the don't fragment flag and adding a TCP option to report receipt of a fragmented packet along with the fragment sizes back to the sender so he can adjust his mss to avoid fragmentation.
That is in fact what SEAL is doing, but there is no guarantee that the size of the largest fragment is going to be an accurate reflection of the true path MTU. RFC1812 made sure of that when it more or less gave IPv4 routers permission to fragment packets pretty much any way they want.
Thanks - Fred fred.l.templin@boeing.com
Templin, Fred L wrote:
Yes; I was aware of this. But, what I want to get to is setting the tunnel MTU to infinity.
Essentially, it's time the network matured to the point where inter-networking actually works (again), seamlessly. I agree. Joe
On Oct 29, 2012, at 3:46 PM, Joe Maimon <jmaimon@ttec.com> wrote:
Templin, Fred L wrote:
Yes; I was aware of this. But, what I want to get to is setting the tunnel MTU to infinity.
Essentially, its time the network matured to the point where inter-networking actually works (again), seamlessly.
I agree.
Certainly fixing all the buggy host stacks, firewall and compliance devices to realize that ICMP isn't bad won't be hard. - Jared
On Mon, Oct 29, 2012 at 4:01 PM, Jared Mauch <jared@puck.nether.net> wrote:
On Oct 29, 2012, at 3:46 PM, Joe Maimon <jmaimon@ttec.com> wrote:
Templin, Fred L wrote:
Yes; I was aware of this. But, what I want to get to is setting the tunnel MTU to infinity.
Essentially, its time the network matured to the point where inter-networking actually works (again), seamlessly.
I agree.
Certainly fixing all the buggy host stacks, firewall and compliance devices to realize that ICMP isn't bad won't be hard.
- Jared
Wait till you get started on "fixing" the "security" consultants. -- Tim:>
Certainly fixing all the buggy host stacks, firewall and compliance devices to realize that ICMP isn't bad won't be hard.
Wait till you get started on "fixing" the "security" consultants.
Ack. I've yet to come across a *device* that doesn't deal properly with "packet too big". Lots (and lots and lots) of "security" people, one or two applications, but no devices. Regards, Tim.
Hi,
Certainly fixing all the buggy host stacks, firewall and compliance devices to realize that ICMP isn't bad won't be hard.
Wait till you get started on "fixing" the "security" consultants.
Ack. I've yet to come across a *device* that doesn't deal properly with "packet too big". Lots (and lots and lots) of "security" people, one or two applications, but no devices.
I know of one: Juniper SSG and SRX boxes used to block IPv6 ICMP errors when the screening option 'big ICMP packets' was enabled because it blocked all (v4 and v6) ICMP packets bigger than 1024 bytes and IPv6 ICMP errors are often 1280 bytes. I don't know if that has been fixed yet. - Sander
On 2012-10-30 11:19, Sander Steffann wrote:
Hi,
Certainly fixing all the buggy host stacks, firewall and compliance devices to realize that ICMP isn't bad won't be hard.
Wait till you get started on "fixing" the "security" consultants.
Ack. I've yet to come across a *device* that doesn't deal properly with "packet too big". Lots (and lots and lots) of "security" people, one or two applications, but no devices.
I know of one: Juniper SSG and SRX boxes used to block IPv6 ICMP errors when the screening option 'big ICMP packets' was enabled because it blocked all (v4 and v6) ICMP packets bigger than 1024 bytes and IPv6 ICMP errors are often 1280 bytes. I don't know if that has been fixed yet.
I do not see them "fixing" that either; if one misconfigures a host to filter big ICMP packets, you get exactly that: it will filter those packets. In the same way as folks misconfiguring hosts to drop ICMP in general etc. One cannot solve stupid people as they will do stupid things. Greets, Jeroen
Jared Mauch wrote:
On Oct 29, 2012, at 3:46 PM, Joe Maimon <jmaimon@ttec.com> wrote:
Templin, Fred L wrote:
Yes; I was aware of this. But, what I want to get to is setting the tunnel MTU to infinity.
Essentially, its time the network matured to the point where inter-networking actually works (again), seamlessly.
I agree.
Certainly fixing all the buggy host stacks, firewall and compliance devices to realize that ICMP isn't bad won't be hard.
- Jared
ICMP is just not the way it is ever going to work. Joe
On Oct 29, 2012, at 4:43 PM, Joe Maimon <jmaimon@ttec.com> wrote:
Jared Mauch wrote:
On Oct 29, 2012, at 3:46 PM, Joe Maimon <jmaimon@ttec.com> wrote:
Templin, Fred L wrote:
Yes; I was aware of this. But, what I want to get to is setting the tunnel MTU to infinity.
Essentially, it's time the network matured to the point where inter-networking actually works (again), seamlessly.
I agree.
Certainly fixing all the buggy host stacks, firewall and compliance devices to realize that ICMP isn't bad won't be hard.
- Jared
ICMP is just not the way it is ever going to work.
I wish you luck in getting your host IP stacks to work properly without ICMP, especially as you deploy IPv6. - Jared
I wish you luck in getting your host IP stacks to work properly without ICMP, especially as you deploy IPv6.
From what I've heard, ICMPv6 is already being filtered, including PTBs. I have also heard that IPv6 fragments are being dropped unconditionally along some paths. So, if neither ICMPv6 PTB nor IPv6 fragmentation works, then the tunnel endpoints have to take matters into their own hands. That's where SEAL comes in.
Thanks - Fred fred.l.templin@boeing.com
- Jared
Templin, Fred L wrote:
I wish you luck in getting your host IP stacks to work properly without ICMP, especially as you deploy IPv6.
From what I've heard, ICMPv6 is already being filtered, including PTBs.
Since IPv6 PTBs are specified to be generated even in response to multicast packets, it is no surprise that they are dropped to prevent ICMP implosions. But this is a very serious problem not only for tunnels but for IPv6 as a whole. That is, if PMTUD is unavailable, IPv6 hosts are prohibited from sending packets larger than 1280 bytes. Then, ignoring that prohibition, tunnel endpoints may send packets a little larger than 1280 bytes, which means a physical link MTU of 1500 bytes, or a little smaller, is enough for nested tunnels. Thus, no new tunneling protocol is necessary. The harder part of the job is disabling PMTUD on all the IPv6 implementations.
I have also heard that IPv6 fragments are also being dropped unconditionally along some paths.
Again, this is not a problem for tunnels only. If that is the operational reality, fragmentation must be dropped from the IPv6 specification. Masataka Ohta
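(A quick sanity check of the arithmetic above, as a sketch only: it assumes plain IPv6-in-IPv6 encapsulation at 40 bytes of overhead per nesting level and no extension headers, which is an assumption rather than anything stated in the thread.)

IPV6_MIN_MTU = 1280   # largest packet a host sends if PMTUD is unavailable
ENCAP_OVERHEAD = 40   # one extra IPv6 header per tunnel level (assumed, no extension headers)
LINK_MTU = 1500       # typical Ethernet physical MTU

# How many levels of nested IPv6-in-IPv6 tunnels fit without fragmentation?
levels = (LINK_MTU - IPV6_MIN_MTU) // ENCAP_OVERHEAD
print(levels)         # -> 5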
Jared Mauch wrote:
ICMP is just not the way it is ever going to work.
I wish you luck in getting your host IP stacks to work properly without ICMP, especially as you deploy IPv6.
- Jared
Precisely the state we are in. Looking for luck. Joe
On Mon, Oct 29, 2012 at 03:46:57PM -0400, Joe Maimon wrote:
Templin, Fred L wrote:
Yes; I was aware of this. But, what I want to get to is setting the tunnel MTU to infinity.
Essentially, it's time the network matured to the point where inter-networking actually works (again), seamlessly.
I agree.
Joe
you mean it's safe to turn off the VPNs? /bill
bmanning@vacation.karoshi.com wrote:
On Mon, Oct 29, 2012 at 03:46:57PM -0400, Joe Maimon wrote:
Templin, Fred L wrote:
Yes; I was aware of this. But, what I want to get to is setting the tunnel MTU to infinity.
Essentially, it's time the network matured to the point where inter-networking actually works (again), seamlessly.
I agree.
Joe
you mean it's safe to turn off the VPNs?
/bill
Quite the reverse. Joe
On Mon, Oct 29, 2012 at 04:44:40PM -0400, Joe Maimon wrote:
bmanning@vacation.karoshi.com wrote:
On Mon, Oct 29, 2012 at 03:46:57PM -0400, Joe Maimon wrote:
Templin, Fred L wrote:
Yes; I was aware of this. But, what I want to get to is setting the tunnel MTU to infinity.
Essentially, it's time the network matured to the point where inter-networking actually works (again), seamlessly.
I agree.
Joe
you mean it's safe to turn off the VPNs?
/bill
Quite the reverse.
Joe
so it's tunnels all the way down... maybe we should just go back to a circuit-oriented network, eh? /bill
On 2012-10-22, at 11:36, Andrew Sullivan <asullivan@dyn.com> wrote:
On Mon, Oct 22, 2012 at 03:18:52PM +1100, Mark Andrews wrote:
records are consistent. It is, however, good practice that these exist and are consistent.
I will note that the IETF DNSOP WG was unable to agree even on that latter claim.
I will further note that just because dnsop can't agree on something doesn't mean that it's not worth agreeing on. Joe
On 10/22/12, Joe Abley <jabley@hopcount.ca> wrote:
I will further note that just because dnsop can't agree on something doesn't mean that it's not worth agreeing on. [snip] Joe
Some of the IETF WGs' members wouldn't be able to agree on what color the sky appears to be on a clear sunny day. But it is common for MTAs to be configured to perform a check for Forward-Confirmed reverse DNS, similar to the iprev authentication mechanism mentioned in RFC5451, except that here it is mandatory and they refuse delivery. Many popular anti-spam solutions implement this out of the box, and common MTAs provide documentation recommending configurations that implement constraints such as these:
1. If a 'HELO' or 'EHLO' message is received and there is no argument, the SMTP server responds with a 5xx reject, even though it is technically allowed to send a HELO/EHLO without a hostname parameter.
2. If a 'HELO' or 'EHLO' message is received, the SMTP server begins a forward DNS lookup on the hostname presented in the HELO/EHLO and a reverse DNS lookup on the connecting IP; it may also initiate an outgoing connection to port 113 auth (Ident) on the connecting IP in order to ask for a username to insert in message headers.
   a. If the forward DNS check on the HELO name, or the PTR query on the connecting IP, fails to get a response, HELO fails with a 4xx reject.
   b. If either results in an NXDOMAIN response, HELO fails with a 5xx reject.
   c. If both succeed, a forward DNS lookup is started for the name found in the PTR response, with a 4xx reject upon lookup failure, or a 5xx reject upon an NXDOMAIN response or a forward lookup response not matching the IP address of the client.
   o The "SMTP reject" might instead trigger a tarpitting mechanism. Some implementations accept the HELO and by default delay the SMTP reject until a later stage, such as RCPT TO, and/or cache the reject decision, to reduce the impact of multiple connection attempts.
3. If a 'RCPT TO' message is received, a 5xx SMTP error is sent unless a 'MAIL FROM' message has already been received and accepted, and the mailbox is a known local mailbox.
4. If a 'MAIL FROM' message is received, a 5xx SMTP error is sent unless a 'HELO' or 'EHLO' message has already been received and accepted. If the address referenced is not <>, a forward DNS lookup is done for the domain in the MAIL FROM, along with an SPF query/policy test on the envelope-from address. On an SPF soft fail, a 4xx reject; on an SPF hard fail, or if the domain does not exist, a 5xx SMTP reject.
-- -JH
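(A minimal sketch of the forward-confirmed reverse DNS check described in step 2 above, using only the Python standard library. The function name is invented for illustration; this is not any particular MTA's implementation, and a real MTA would distinguish 4xx lookup failures from 5xx NXDOMAIN responses.)

import socket

def forward_confirmed_rdns(client_ip):
    # Illustrative only: True if the PTR name for client_ip resolves back to client_ip,
    # roughly steps 2a-2c above.
    try:
        ptr_name, _, _ = socket.gethostbyaddr(client_ip)    # reverse (PTR) lookup
    except (socket.herror, socket.gaierror):
        return False
    try:
        infos = socket.getaddrinfo(ptr_name, None)           # forward lookup of the PTR name
    except socket.gaierror:
        return False
    return client_ip in {info[4][0] for info in infos}       # must match the connecting IP

print(forward_confirmed_rdns("192.0.2.1"))   # documentation-range IP, shown only as a call example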
participants (33)
- Andrew Sullivan
- Antonio Querubin
- bmanning@vacation.karoshi.com
- Chris Woodfield
- Dobbins, Roland
- Jared Mauch
- Jay Mitchell
- Jeroen Massar
- Jima
- Jimmy Hess
- Joe Abley
- Joe Maimon
- John Levine
- Joseph Anthony Pasquale Holsten
- JP Viljoen
- Karl Auer
- Keith Medcalf
- Leen Besselink
- Mark Andrews
- Masataka Ohta
- Mikael Abrahamsson
- Mike Lyon
- Nicolai
- Octavio Alvarez
- Randy Bush
- Ray Soucy
- Sander Steffann
- Shahab Vahabzadeh
- Suresh Ramasubramanian
- Templin, Fred L
- Tim Durack
- Tim Franklin
- William Herrin