The IETF's tcp-impl (TCP implementation) working group has a draft document discussing problems with path MTU discovery:

http://www.ietf.org/internet-drafts/draft-ietf-tcpimpl-pmtud-02.txt

The main issue we're trying to decide is whether the draft should advocate "black hole detection". That is, when a TCP is doing PMTU discovery, but somewhere the necessary ICMPs are either not being generated or are being filtered out before the TCP receives them, the TCP notices that it's losing multiple packets of the same size, so it then tries sending smaller segments, even though it hasn't received a "Datagram Too Big" ICMP.

The plus of black hole detection is that it can work around a sometimes very hard-to-debug problem. The minus is that it masks problems that should instead be fixed.

To help resolve this issue, I'm wondering whether the ISP community has a clear preference for either yes-do-detection or no-we-want-the-problems-fixed. Comments appreciated.

Thanks,

Vern
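For readers who want the mechanism spelled out, here is a minimal sketch of the black hole detection behaviour described above: count retransmission timeouts for full-sized segments and, after a few in a row, step down to a smaller segment size even though no "Datagram Too Big" ICMP arrived. The class name, the loss threshold of 3, the fallback size list, and the 40-byte header overhead are all assumptions made for illustration; none of them comes from the draft.

```python
class BlackHoleDetector:
    """Hypothetical sketch of PMTU black hole detection (not from the draft)."""

    def __init__(self, mss, loss_threshold=3,
                 fallback_sizes=(1460, 1400, 1200, 536)):
        self.mss = mss                        # current segment size in bytes
        self.loss_threshold = loss_threshold  # assumed: timeouts before backing off
        # Assumed, illustrative list of sizes to try, largest first.
        self.fallback_sizes = sorted(fallback_sizes, reverse=True)
        self.losses_at_size = 0               # consecutive timeouts at current size

    def on_ack(self, segment_size):
        # A full-sized segment got through, so the path is not a black hole.
        if segment_size == self.mss:
            self.losses_at_size = 0

    def on_icmp_too_big(self, next_hop_mtu, header_overhead=40):
        # Normal PMTU discovery: the ICMP arrived, so use what it reports.
        # header_overhead assumes 20-byte IP and 20-byte TCP headers, no options.
        self.mss = min(self.mss, next_hop_mtu - header_overhead)
        self.losses_at_size = 0

    def on_retransmit_timeout(self, segment_size):
        # Only losses of segments at the current full size count as evidence.
        if segment_size != self.mss:
            return self.mss
        self.losses_at_size += 1
        if self.losses_at_size >= self.loss_threshold:
            smaller = [s for s in self.fallback_sizes if s < self.mss]
            if smaller:
                # No ICMP was seen, but repeated same-size loss suggests a
                # black hole, so guess the next smaller size.
                self.mss = smaller[0]
                self.losses_at_size = 0
        return self.mss
```

Each back-off in this sketch costs several retransmission timeouts before the sender tries the next size, and the cause of the loss is never reported to anyone.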
Unfortunately, the MTU problem can be caused by the client's network admin as well as by the ISP; it's very difficult to explain to these admins what's wrong, and MTU discovery is not part of the traditional IP approach. This means that black-hole detection should be implemented anyway, to prevent the loss of connectivity we sometimes see now when some MS-based server or client refuses to allow IP fragmentation.

On Thu, 18 Nov 1999, Vern Paxson wrote:
Date: Thu, 18 Nov 1999 14:40:01 PST From: Vern Paxson <vern@ee.lbl.gov> To: nanog@merit.edu Subject: should TCPs do MTU black hole detection?
The IETF's tcp-impl (TCP implementation) working group has a draft document discussing problems with path MTU discovery:
http://www.ietf.org/internet-drafts/draft-ietf-tcpimpl-pmtud-02.txt
The main issue we're trying to decide is whether the draft should advocate "black hole detection". That is, when a TCP is doing PMTU discovery, but somewhere the necessary ICMPs are either not being generated or are being filtered out before the TCP receives them, the TCP notices that it's losing multiple packets of the same size, so it then tries sending smaller segments, even though it hasn't received a "Datagram Too Big" ICMP.
The plus of black hole detection is that it can work around a sometimes very hard to debug problem. The minus is that it masks problems that should instead be fixed.
To help resolve this issue, I'm wondering whether the ISP community has a clear preference for either yes-do-detection or no-we-want-the-problems-fixed. Comments appreciated.
Thanks,
Vern
Aleksei Roudnev, Network Operations Center, Relcom, Moscow (+7 095) 194-19-95 (Network Operations Center Hot Line),(+7 095) 230-41-41, N 13729 (pager) (+7 095) 196-72-12 (Support), (+7 095) 194-33-28 (Fax)
On Thu, 18 Nov 1999, Vern Paxson wrote:
| To help resolve this issue, I'm wondering whether the ISP community has a
| clear preference for either yes-do-detection or no-we-want-the-problems-fixed.
| Comments appreciated.
|

I think that most ISP's would prefer that problems were fixed. However, we also know this doesn't happen very often, unless provoked (by customers, usually).

Can you provide more detail as to what problems would be masked or otherwise ignored if TCP implementations started to accommodate the lack of Path-MTU discovery?

---
Gates' Law: Every 18 months, the speed of software halves.
I'm wondering whether the ISP community has a clear preference for either yes-do-detection or no-we-want-the-problems-fixed. Comments appreciated. I think that most ISP's would prefer that problems were fixed.
the choice seems

o when it breaks, the noc gets the call, debugs it, and it gets fixed

o when it breaks, the software does successive guesswork back-offs until it makes it through. the performance sucks big-time, the customer thinks the isp is at fault, but the noc does not get called and the real problem never gets fixed.

randy
yes-do-detection or no-we-want-the-problems-fixed. Comments appreciated. I think that most ISP's would prefer that problems were fixed.
the choice seems
o when it breaks, the noc gets the call, debugs it, and it gets fixed

'it gets fixed' if the problem is inside the network controlled by the NOC. If not??
o when it breaks, the software does successive guesswork back-offs until it makes it through. the performance sucks big-time, the customer thinks the isp is at fault, but the noc does not get called and the real problem never gets fixed.
randy
Aleksei Roudnev, Network Operations Center, Relcom, Moscow (+7 095) 194-19-95 (Network Operations Center Hot Line),(+7 095) 230-41-41, N 13729 (pager) (+7 095) 196-72-12 (Support), (+7 095) 194-33-28 (Fax)
yes-do-detection or no-we-want-the-problems-fixed. Comments appreciated.
I think that most ISP's would prefer that problems were fixed.
the choice seems
o when it breaks, the noc gets the call, debugs it, and it gets fixed
'it gets fixed' if the problem is inside the network controlled by the NOC. If not??
noc tells user to lower mtu
Thus spake Randy Bush
the choice seems
o when it breaks, the noc gets the call, debugs it, and it gets fixed
                                                        ^^^^^^^^^^^^^
Optimist!
o when it breaks, the software does successive guesswork back-offs until it makes it through. the performance sucks big-time, the customer thinks the isp is at fault, but the noc does not get called and the real problem never gets fixed.
This one is pretty accurate though.
--
Jeff McAdams                            Email: jeffm@iglou.com
Head Network Administrator              Voice: (502) 966-3848
IgLou Internet Services                        (800) 436-4456
o when it breaks, the noc gets the call, debugs it, and it gets fixed
                                                        ^^^^^^^^^^^^^
Optimist!
but at least the user knows where the problem lies, as opposed to
o when it breaks, the software does successive guesswork back-offs until it makes it through. the performance sucks big-time, the customer thinks the isp is at fault, but the noc does not get called and the real problem never gets fixed.
This one is pretty accurate though.

where the user thinks the isp sucks.

and the latter will be forever increasing entropy. the net as experienced just gets worse and worse.

randy
[ On Thursday, November 18, 1999 at 22:03:21 (-0500), Randy Bush wrote: ]
Subject: Re: should TCPs do MTU black hole detection?
o when it breaks, the noc gets the call, debugs it, and it gets fixed
                                                        ^^^^^^^^^^^^^
Optimist!
but at least the user knows where the problem lies, as opposed to
I certainly agree with you here Randy!

However as a reasonably expert user there are times when I want the ability to have my equipment try to work around even ultra-stupid configurations elsewhere on the net, especially when those of us experiencing the problem are far more rare than those who do not.

Some time ago when I first began to personally experience this problem with path MTU discovery on my home network I discovered through analysis of the upstream packet traces that it was almost always possible to have the router sending the needs-frag reply to realize when its attempts to do so were futile and thus enforce fragmentation anyway. While this may make some protocols un-usable anyway it could at least allow me to limp along and to allow me to use that same network to communicate with the offending people at the other end using protocols affected by this problem such as SMTP, FTP, HTTP, etc.

Unfortunately I have not yet had time (or in this case the need -- I've since increased my link's MTU to 1500 :-), to implement this algorithm in my own upstream router (which thankfully I do have the root password for! ;-).

BTW, I think my algorithm provides a much more efficient work-around to the problem than black-hole discovery. Unfortunately it also makes it harder to convince the offending admins to fix their stupid filter definitions (or alternately at least turn off PMTUd in their servers).

What I'd really like to do is find some way to enhance my algorithm in such a way that it could send a tiny tactical nuke down the wire after my connection to/through the offending network safely closes. I.e. I want to cause grief to any ICMP filter that has caused my router to have to work around its stupidity! ;-)

--
Greg A. Woods

+1 416 218-0098 VE3TCP <gwoods@acm.org> <robohack!woods>
Planix, Inc. <woods@planix.com>; Secrets of the Weird <woods@weird.com>
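The router-side workaround described in the message above was never implemented by its author, so the following is only a rough sketch of how such a decision might look: the router answers oversized don't-fragment packets with needs-frag ICMPs, and once a flow has ignored several of them it gives up and fragments anyway. The class name, the per-flow counter, the returned action strings, and the threshold of 3 are all hypothetical.

```python
from collections import defaultdict


class FutileIcmpTracker:
    """Hypothetical sketch: fragment anyway once needs-frag ICMPs look futile."""

    def __init__(self, link_mtu, max_icmp_attempts=3):
        self.link_mtu = link_mtu
        self.max_icmp_attempts = max_icmp_attempts  # assumed threshold
        self.attempts = defaultdict(int)            # oversized DF packets per flow

    def handle(self, flow, packet_len, dont_fragment):
        """Return the action the router would take for one packet."""
        if packet_len <= self.link_mtu:
            return "forward"
        if not dont_fragment:
            return "fragment"
        # Oversized packet with DF set: normally answered with needs-frag.
        self.attempts[flow] += 1
        if self.attempts[flow] > self.max_icmp_attempts:
            # The sender keeps retrying at the same size, so the ICMPs are
            # presumably being filtered; override DF and fragment.
            return "fragment-anyway"
        return "send-needs-frag"


# Usage: the fourth oversized DF packet on the same flow gets fragmented.
tracker = FutileIcmpTracker(link_mtu=1006)
flow = ("10.0.0.1", "192.0.2.7", 80)  # hypothetical flow key
actions = [tracker.handle(flow, 1500, True) for _ in range(4)]
```

Unlike sender-side black hole detection, this keeps the guesswork at the one hop that knows the real MTU, which may be why the author expected it to be the more efficient work-around.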
Randy Bush wrote:
I'm wondering whether the ISP community has a clear preference for either yes-do-detection or no-we-want-the-problems-fixed. Comments appreciated. I think that most ISP's would prefer that problems were fixed.
the choice seems
o when it breaks, the noc gets the call, debugs it, and it gets fixed
Well Randy, ideally yes but reality sucks as we know. We had this crap as well and tried to get folks to fix it - important sites just don't care and yes, we are usually talking broken web stuff here. Hard to tell their ISP to get them to fix it. I am asking myself if we can nuke it elsewhere, nothing obvious right off ?
o when it breaks, the software does successive guesswork back-offs until it makes it through. the performance sucks big-time, the customer thinks the isp is at fault, but the noc does not get called and the real problem never gets fixed.
Agreed. Regardless of what I mention above, nuke em, i.e. fix.

Dave
randy
participants (7): Alex P. Rudnev, Chris Cappuccio, Dave Morton, Jeff Mcadams, Randy Bush, Vern Paxson, woods@most.weird.com