Hi, N00b here trying to understand why certain CDNs such as Cloudflare have issues when my MTU is low. For instance, if I am using PPTP and the MTU is at 1300 it won't work. If I increase it to 1478 it may or may not work. TIA.
On 1/8/18 2:55 PM, Dovid Bender wrote:

Hi,
N00b here trying to understand why certain CDNs such as Cloudflare have issues when my MTU is low. For instance, if I am using PPTP and the MTU is at 1300 it won't work. If I increase it to 1478 it may or may not work.
TIA.

PMTUD has a lot of trouble working reliably when the destination of the PTB is a stateless load-balancer. If your tunnel or host clamps the MSS to the appropriate value it can support, it is highly likely that connection attempts to the same destination will work fine.
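The MSS clamping joel describes is usually done on the tunnel endpoint or the router in front of it. A minimal sketch, assuming a Linux box forwarding the PPTP traffic (the rule and values are illustrative, not taken from the thread):

    # Rewrite the MSS in forwarded TCP SYNs down to what the outgoing route's MTU allows
    iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN \
        -j TCPMSS --clamp-mss-to-pmtu

With the 1300-byte MTU from the original question, an explicit value of 1260 (1300 minus 40 bytes of IPv4 and TCP headers) via --set-mss 1260 would have the same effect.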
On Mon, 8 Jan 2018, joel jaeggli wrote:
PMTUD has a lot of trouble working reliably when the destination of the PTB is a stateless load-balancer.
If your tunnel or host clamps the MSS to the appropriate value it can support, it is highly likely that connection attempts to the same destination will work fine.
This is understandable, but if this is also an operational practice we as the operational community want to condone (people using solutions where PMTUD doesn't work), then we also need to make sure that all applications do PLPMTUD (RFC 4821, Packetization Layer Path MTU Discovery). This is currently NOT the case, and from what I can tell, there isn't even an IETF document saying this is the best current practice. So, is this something we want to say? We should talk about that. -- Mikael Abrahamsson email: swmike@swm.pp.se
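For what enabling PLPMTUD looks like in practice, a sketch for a Linux host (the sysctls below are Linux-specific; the values are illustrative):

    # 1 = probe for a smaller MTU only after a black hole is suspected, 2 = always probe
    sysctl -w net.ipv4.tcp_mtu_probing=1
    # MSS the probing falls back to when it kicks in
    sysctl -w net.ipv4.tcp_base_mss=1024

Note this only covers TCP; applications running over UDP would have to implement RFC 4821-style probing themselves, which is exactly the gap being pointed out.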
❦ 8 January 2018 15:08 -0800, joel jaeggli <joelja@bogus.com>:
N00b here trying to understand why certain CDNs such as Cloudflare have issues when my MTU is low. For instance, if I am using PPTP and the MTU is at 1300 it won't work. If I increase it to 1478 it may or may not work. PMTUD has a lot of trouble working reliably when the destination of the PTB is a stateless load-balancer.
More explanations are available here: https://blog.cloudflare.com/path-mtu-discovery-in-practice/ -- Don't comment bad code - rewrite it. - The Elements of Programming Style (Kernighan & Plauger)
Vincent,

Thanks. That URL explained a lot.

On Tue, Jan 9, 2018 at 3:11 AM, Vincent Bernat <bernat@luffy.cx> wrote:
❦ 8 January 2018 15:08 -0800, joel jaeggli <joelja@bogus.com>:
N00b here trying to understand why certain CDNs such as Cloudflare have issues when my MTU is low. For instance, if I am using PPTP and the MTU is at 1300 it won't work. If I increase it to 1478 it may or may not work. PMTUD has a lot of trouble working reliably when the destination of the PTB is a stateless load-balancer.
More explanations are available here: https://blog.cloudflare.com/path-mtu-discovery-in-practice/ -- Don't comment bad code - rewrite it. - The Elements of Programming Style (Kernighan & Plauger)
If I was an ISP (I'm not) and a CDN came and said "we want to be inside you" (ewww), why wouldn't I say "sure: let's jumbo"?

Not even "asking for a friend": I genuinely don't understand why a CDN who colocates, and is not using a public exchange but is inside your transit boundary (which I am told is actually a big thing now), would not drive to the packet size which works in your switching gear.

I understand that CDN/DC praxis now drives to cheap dumb switches, but even dumb switches like bigger packets, don't they? Less forwarding decision cost, for more throughput?

On Fri, Jan 19, 2018 at 6:21 AM, Dovid Bender <dovid@telecurve.com> wrote:
Vincent,
Thanks. That URL explained a lot.
On Tue, Jan 9, 2018 at 3:11 AM, Vincent Bernat <bernat@luffy.cx> wrote:
❦ 8 January 2018 15:08 -0800, joel jaeggli <joelja@bogus.com>:
N00b here trying to understand why certain CDNs such as Cloudflare have issues when my MTU is low. For instance, if I am using PPTP and the MTU is at 1300 it won't work. If I increase it to 1478 it may or may not work. PMTUD has a lot of trouble working reliably when the destination of the PTB is a stateless load-balancer.
More explanations are available here: https://blog.cloudflare.com/path-mtu-discovery-in-practice/ -- Don't comment bad code - rewrite it. - The Elements of Programming Style (Kernighan & Plauger)
Because the CDN delivers to your customers, not you. It's your customers' link requirements that are the ones you need to worry about. If you support jumbo frames to all of your customers, and their gear also supports jumbo frames, then sure, go ahead and use jumbo frames; otherwise use the lowest common denominator MTU when transmitting. This is less than 1500 on today's Internet, and encapsulated traffic is reasonably common.

embedded CDN <--1500--> NAT64 <--14XX--> CLAT <--1500--> client
embedded CDN <--1500--> B4 <--14XX--> 6RD <--1500--> client

Now you can increase the first 1500 easily. The rest of the path, not so easily.
On 19 Jan 2018, at 9:53 am, George Michaelson <ggm@algebras.org> wrote:
If I was an ISP (I'm not) and a CDN came and said "we want to be inside you" (ewww), why wouldn't I say "sure: let's jumbo"?
Not even "asking for a friend": I genuinely don't understand why a CDN who colocates, and is not using a public exchange but is inside your transit boundary (which I am told is actually a big thing now), would not drive to the packet size which works in your switching gear.
I understand that CDN/DC praxis now drives to cheap dumb switches, but even dumb switches like bigger packets, don't they? Less forwarding decision cost, for more throughput?
On Fri, Jan 19, 2018 at 6:21 AM, Dovid Bender <dovid@telecurve.com> wrote:
Vincent,
Thanks. That URL explained a lot.
On Tue, Jan 9, 2018 at 3:11 AM, Vincent Bernat <bernat@luffy.cx> wrote:
❦ 8 January 2018 15:08 -0800, joel jaeggli <joelja@bogus.com>:
N00b here trying to understand why certain CDNs such as Cloudflare have issues when my MTU is low. For instance, if I am using PPTP and the MTU is at 1300 it won't work. If I increase it to 1478 it may or may not work. PMTUD has a lot of trouble working reliably when the destination of the PTB is a stateless load-balancer.
More explanations are available here: https://blog.cloudflare.com/path-mtu-discovery-in-practice/ -- Don't comment bad code - rewrite it. - The Elements of Programming Style (Kernighan & Plauger)
-- Mark Andrews, ISC 1 Seymour St., Dundas Valley, NSW 2117, Australia PHONE: +61 2 9871 4742 INTERNET: marka@isc.org
Thanks. Good answer. Low-risk answer. "It will work" answer. If it's a variant of the "the last mile is your problem" problem, I'm ok with that. If it's a consequence of the middleware deployment, I feel like it's more tangibly bad decision logic, but it's real. -G

On Fri, Jan 19, 2018 at 9:50 AM, Mark Andrews <marka@isc.org> wrote:
Because the CDN delivers to your customers, not you. It's your customers' link requirements that are the ones you need to worry about. If you support jumbo frames to all of your customers, and their gear also supports jumbo frames, then sure, go ahead and use jumbo frames; otherwise use the lowest common denominator MTU when transmitting. This is less than 1500 on today's Internet, and encapsulated traffic is reasonably common.

embedded CDN <--1500--> NAT64 <--14XX--> CLAT <--1500--> client
embedded CDN <--1500--> B4 <--14XX--> 6RD <--1500--> client

Now you can increase the first 1500 easily. The rest of the path, not so easily.
On 19 Jan 2018, at 9:53 am, George Michaelson <ggm@algebras.org> wrote:
If I was an ISP (I'm not) and a CDN came and said "we want to be inside you" (ewww), why wouldn't I say "sure: let's jumbo"?
Not even "asking for a friend": I genuinely don't understand why a CDN who colocates, and is not using a public exchange but is inside your transit boundary (which I am told is actually a big thing now), would not drive to the packet size which works in your switching gear.
I understand that CDN/DC praxis now drives to cheap dumb switches, but even dumb switches like bigger packets, don't they? Less forwarding decision cost, for more throughput?
On Fri, Jan 19, 2018 at 6:21 AM, Dovid Bender <dovid@telecurve.com> wrote:
Vincent,
Thanks. That URL explained a lot.
On Tue, Jan 9, 2018 at 3:11 AM, Vincent Bernat <bernat@luffy.cx> wrote:
❦ 8 January 2018 15:08 -0800, joel jaeggli <joelja@bogus.com>:
N00b here trying to understand why certain CDNs such as Cloudflare have issues when my MTU is low. For instance, if I am using PPTP and the MTU is at 1300 it won't work. If I increase it to 1478 it may or may not work. PMTUD has a lot of trouble working reliably when the destination of the PTB is a stateless load-balancer.
More explanations are available here: https://blog.cloudflare.com/path-mtu-discovery-in-practice/ -- Don't comment bad code - rewrite it. - The Elements of Programming Style (Kernighan & Plauger)
-- Mark Andrews, ISC 1 Seymour St., Dundas Valley, NSW 2117, Australia PHONE: +61 2 9871 4742 INTERNET: marka@isc.org
On Jan 18, 2018, at 5:53 PM, George Michaelson <ggm@algebras.org> wrote:
If I was an ISP (I'm not) and a CDN came and said "we want to be inside you" (ewww), why wouldn't I say "sure: let's jumbo"?
Not even "asking for a friend": I genuinely don't understand why a CDN who colocates, and is not using a public exchange but is inside your transit boundary (which I am told is actually a big thing now), would not drive to the packet size which works in your switching gear.
I understand that CDN/DC praxis now drives to cheap dumb switches, but even dumb switches like bigger packets, don't they? Less forwarding decision cost, for more throughput?
The reason is most customers are at a lower MTU size. Let's say I can send you a 9K packet. If you receive that frame and realize you need to fragment, then it's your router's job to slice 9000 into 5 x 1500. I may have caused you to hit your exception path (which could be expensive) as well as made your PPS load 5x larger downstream. This doesn't even account for the fact that you may have a speed mismatch, whereby I am sending 100Gb+ and your outputs may be only 10G.

If you're then doing DSL + PPPoE and your customers really see an MTU of 1492 or less, then another device has to fragment 5x again.

For server to server, 9K makes a lot of sense; it reduces the packet processing and increases the throughput. If your consumer electronic wifi gear or switch can't handle >1500, and doesn't even have a setting for layer-2 > 1500, the cost is just too high. Much easier for me to send 5x packets in the first place and be more compatible.

Like many things, I'd love for this to be as simple and purist as you purport. I might even be willing to figure out if at $DayJob we could see a benefit from doing this, but from the servers to switches to routers then a partner interface.. it's a lot of things to make sure are just right.

Plus.. can your phone do > 1500 MTU on the Wifi? Where's that setting? (mumbling something about CSLIP and MRUs from back in the day)

- Jared
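To put rough numbers on the fragmentation cost (standard IPv4 arithmetic, assuming the DF bit is clear so the router may fragment at all): a 9000-byte packet carries 8980 bytes of payload after its 20-byte header, and each 1500-byte fragment can carry at most 1480 of those (fragment offsets are in 8-byte units, and 1480 divides evenly). That is six full 1500-byte fragments plus a seventh carrying the last 100 bytes, so one forwarding decision becomes roughly seven, typically taken on the router's slow path.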
On Thu, Jan 18, 2018 at 7:14 PM, Jared Mauch <jared@puck.nether.net> wrote:
Let's say I can send you a 9K packet. If you receive that frame and realize you need to fragment, then it's your router's job to slice 9000 into 5 x 1500.
In practice, no, because the packet you sent had the "don't fragment" bit set. That means my router is not allowed to fragment the packet. Instead, I must send the originating host an ICMP destination unreachable packet stating that the largest packet I can send further is 1500 bytes.

You might receive my ICMP message. You might not. After all, I am not the host you were looking for.

Good luck.

Regards,
Bill Herrin

P.S. This makes Linux servers happy:

iptables -t mangle --insert POSTROUTING --proto tcp \
    --tcp-flags SYN,RST,FIN SYN --match tcpmss --mss 1241:65535 \
    --jump TCPMSS --set-mss 1240

-- William Herrin ................ herrin@dirtside.com bill@herrin.us Dirtside Systems ......... Web: <http://www.dirtside.com/>
On Jan 18, 2018, at 4:32 PM, William Herrin <bill@herrin.us> wrote:
On Thu, Jan 18, 2018 at 7:14 PM, Jared Mauch <jared@puck.nether.net> wrote:
Let's say I can send you a 9K packet. If you receive that frame and realize you need to fragment, then it's your router's job to slice 9000 into 5 x 1500.
In practice, no, because the packet you sent had the "don't fragment" bit set. That means my router is not allowed to fragment the packet. Instead, I must send the originating host an ICMP destination unreachable packet stating that the largest packet I can send further is 1500 bytes.
You might receive my ICMP message. You might not. After all, I am not the host you were looking for.
This gets especially bad in cases such as anycast where the return path may be asymmetrical and could result in delivery of the ICMP PTB message to a different anycast instance or to a stateless load balancer that is incapable of determining which machine originated the packet being referenced. One of the many reasons I continue to question the wisdom of using anycast for multi-packet transactions. Owen
Good luck.
Regards, Bill Herrin
P.S. This makes Linux servers happy:
iptables -t mangle --insert POSTROUTING --proto tcp \
    --tcp-flags SYN,RST,FIN SYN --match tcpmss --mss 1241:65535 \
    --jump TCPMSS --set-mss 1240
-- William Herrin ................ herrin@dirtside.com bill@herrin.us Dirtside Systems ......... Web: <http://www.dirtside.com/>
On Jan 18, 2018, at 7:32 PM, William Herrin <bill@herrin.us> wrote:
On Thu, Jan 18, 2018 at 7:14 PM, Jared Mauch <jared@puck.nether.net> wrote:
Let's say I can send you a 9K packet. If you receive that frame and realize you need to fragment, then it's your router's job to slice 9000 into 5 x 1500.
In practice, no, because the packet you sent had the "don't fragment" bit set.
Which packet? Is there a specific CDN that does this? I’d be curious to see data vs speculation.
That means my router is not allowed to fragment the packet. Instead, I must send the originating host an ICMP destination unreachable packet stating that the largest packet I can send further is 1500 bytes.
You might receive my ICMP message. You might not. After all, I am not the host you were looking for.
:-) Nor is it likely the reply. - Jared
On Thu, Jan 18, 2018 at 7:41 PM, Jared Mauch <jared@puck.nether.net> wrote:
On Jan 18, 2018, at 7:32 PM, William Herrin <bill@herrin.us> wrote:
On Thu, Jan 18, 2018 at 7:14 PM, Jared Mauch <jared@puck.nether.net> wrote:
Let's say I can send you a 9K packet. If you receive that frame and realize you need to fragment, then it's your router's job to slice 9000 into 5 x 1500.
In practice, no, because the packet you sent had the "don't fragment" bit set.
Which packet? Is there a specific CDN that does this? I’d be curious to see data vs speculation.
Howdy,

Path MTU discovery (which sets the DF bit on TCP packets) is enabled by default on -every- operating system that's shipped for decades now. If you don't want it, you have to explicitly disable it. Disabling it for any significant quantity of traffic is considered antisocial since routers generally can't fragment in the hardware fast path.

Regards,
Bill

-- William Herrin ................ herrin@dirtside.com bill@herrin.us Dirtside Systems ......... Web: <http://www.dirtside.com/>
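On a Linux host this default is easy to confirm (the sysctl name is Linux-specific):

    sysctl net.ipv4.ip_no_pmtu_disc    # 0, the default, means PMTUD is on and DF is set on TCP

and the DF flag itself shows up as "flags [DF]" in any packet capture, as in the tcpdump output later in the thread.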
On Jan 18, 2018, at 8:44 PM, William Herrin <bill@herrin.us> wrote:
Which packet? Is there a specific CDN that does this? I’d be curious to see data vs speculation.
Howdy,
Path MTU discovery (which sets the DF bit on TCP packets) is enabled by default on -every- operating system that's shipped for decades now. If you don't want it, you have to explicitly disable it. Disabling it for any significant quantity of traffic is considered antisocial since routers generally can't fragment in the hardware fast path.
I'm not seeing this in a PCAP capture to at least one CDN, either from my host or from the CDN endpoint. I suspect you're mistaken.

- Jared

PCAP: https://puck.nether.net/~jared/akamai.pcap
Bah, never mind.. reading my PCAP wrong :-(
On Jan 19, 2018, at 8:58 AM, Jared Mauch <jared@puck.nether.net> wrote:
On Jan 18, 2018, at 8:44 PM, William Herrin <bill@herrin.us> wrote:
Which packet? Is there a specific CDN that does this? I’d be curious to see data vs speculation.
Howdy,
Path MTU discovery (which sets the DF bit on TCP packets) is enabled by default on -every- operating system that's shipped for decades now. If you don't want it, you have to explicitly disable it. Disabling it for any significant quantity of traffic is considered antisocial since routers generally can't fragment in the hardware fast path.
I’m not seeing this in a PCAP capture to at least one CDN, either from my host or from the CDN endpoint.
I suspect you’re mistaken.
- Jared
On Fri, Jan 19, 2018 at 8:58 AM, Jared Mauch <jared@puck.nether.net> wrote:
On Jan 18, 2018, at 8:44 PM, William Herrin <bill@herrin.us> wrote:
Which packet? Is there a specific CDN that does this? I’d be curious to see data vs speculation.
Path MTU discovery (which sets the DF bit on TCP packets) is enabled by default on -every- operating system that's shipped for decades now.
I’m not seeing this in a PCAP capture to at least one CDN, either from my host or from the CDN endpoint. PCAP: https://puck.nether.net/~jared/akamai.pcap
Hi Jared,

tcpdump -v -n -nn -r akamai.pcap | more
reading from file akamai.pcap, link-type EN10MB (Ethernet)
08:54:48.611321 IP (tos 0x0, ttl 64, id 12596, offset 0, flags [DF], proto TCP (6), length 60)
    204.42.254.5.60262 > 23.0.51.165.80: Flags [S], cksum 0x1504 (incorrect -> 0x5a14), seq 3315894416, win 29200, options [mss 1460,sackOK,TS val 3822930236 ecr 0,nop,wscale 7], length 0
08:54:48.633286 IP (tos 0x0, ttl 58, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    23.0.51.165.80 > 204.42.254.5.60262: Flags [S.], cksum 0x0972 (correct), seq 3383397658, ack 3315894417, win 28960, options [mss 1460,sackOK,TS val 2906475904 ecr 3822930236,nop,wscale 5], length 0

Note: "flags [DF]". That means the don't fragment bit is set.

Regards,
Bill Herrin

-- William Herrin ................ herrin@dirtside.com bill@herrin.us Dirtside Systems ......... Web: <http://www.dirtside.com/>
On Fri, Jan 19, 2018, at 01:14, Jared Mauch wrote:
If you’re then doing DSL + PPPoE and your customers really see a MTU of 1492 or less, then another device has to fragment 5x again.
In this part of the world we have even worse stuff around: PPP over L2TP over IP with 1500 MTU interconnection. Remove another 40 bytes. Add some more headers for various tunneling scenarios and you may get into a situation where even 1400 is too much. But it usually works with MSS clamping to the correct value. Some small ISPs don't even make the effort to check if the transport supports "more than 1500" in order to give the 1500 bytes to the customer - they just clamp down the MSS.
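The arithmetic behind those clamp values, using the usual header sizes rather than anything measured in the thread: PPPoE costs 8 bytes, so a 1500-byte access link leaves a 1492-byte MTU and a 1452-byte TCP MSS after the 40 bytes of IPv4 and TCP headers; PPP over L2TP over IP adds on the order of another 40 bytes of encapsulation, pushing the usable MSS down toward roughly 1410 before any further tunneling.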
Which doesn’t work with IPv6 as UDP doesn’t have the field to clamp. -- Mark Andrews
On 20 Jan 2018, at 03:35, Radu-Adrian Feurdean <nanog@radu-adrian.feurdean.net> wrote:
On Fri, Jan 19, 2018, at 01:14, Jared Mauch wrote: If you’re then doing DSL + PPPoE and your customers really see a MTU of 1492 or less, then another device has to fragment 5x again.
In this part of the world we have even worse stuff around: PPP over L2TP over IP with 1500 MTU interconnection. Remove another 40 bytes. Add some more headers for various tunneling scenarios and you may get into a situation where even 1400 is too much. But it usually works with MSS clamping to the correct value. Some small ISPs don't even make the effort to check if the transport supports "more than 1500" in order to give the 1500 bytes to the customer - they just clamp down the MSS.
On Sat, 20 Jan 2018, Mark Andrews wrote:
Which doesn’t work with IPv6 as UDP doesn’t have the field to clamp.
Well, not with UDP/IPv4 either. Actually, the only protocol I know out there that has this kind of clamping (and is in wide use for clamping) is TCP. Thus, my earlier comment about strongly advising that protocols use PLPMTUD. -- Mikael Abrahamsson email: swmike@swm.pp.se
❦ 19 January 2018 08:53 +1000, George Michaelson <ggm@algebras.org>:
If I was an ISP (I'm not) and a CDN came and said "we want to be inside you" (ewww), why wouldn't I say "sure: let's jumbo"?
Most traffic would be with clients limited to at most 1500 bytes. -- Its name is Public Opinion. It is held in reverence. It settles everything. Some think it is the voice of God. -- Mark Twain
On Mon, Jan 08, 2018 at 05:55:55PM -0500, Dovid Bender wrote:
Hi,
N00b here trying to understand why certain CDNs such as Cloudflare have issues when my MTU is low. For instance, if I am using PPTP and the MTU is at 1300 it won't work. If I increase it to 1478 it may or may not work.
I've done some measurements over the internet in the past year or so, and 1400-byte packets with the DF bit set seem to make it just fine. - Jared -- Jared Mauch | pgp key available via finger from jared@puck.nether.net clue++; | http://puck.nether.net/~jared/ My statements are only mine.
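That sort of measurement is easy to reproduce from a Linux host with iputils ping; the target below is a placeholder:

    # 1372 bytes of ICMP payload + 8 (ICMP) + 20 (IP) = 1400 bytes on the wire, DF set
    ping -M do -s 1372 www.example.com

If some hop cannot carry 1400 bytes you get "Frag needed and DF set" errors (or, when the PTB message is filtered, just silence) instead of echo replies.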
CDNs (or anyone using a load balancer in front of multiple server instances) need to assume that traffic may be encapsulated (4in6, 6in4, 464XLAT) and lower the interface MTUs so that all traffic generated can be encapsulated without fragmentation or PTBs being generated. This is only going to get worse as more and more eyeballs are being forced into IPv4-as-a-service scenarios.
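A sketch of what lowering the interface MTU looks like on a Linux server sitting behind such a load balancer (interface name and value are illustrative):

    # leave headroom for 4in6/6in4/464XLAT encapsulation somewhere along the path
    ip link set dev eth0 mtu 1400

which keeps every packet the server originates small enough to survive the encapsulations listed above without depending on PTB messages finding their way back.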
On 9 Jan 2018, at 11:54 am, Jared Mauch <jared@puck.Nether.net> wrote:
On Mon, Jan 08, 2018 at 05:55:55PM -0500, Dovid Bender wrote:
Hi,
N00b here trying to understand why certain CDNs such as Cloudflare have issues when my MTU is low. For instance, if I am using PPTP and the MTU is at 1300 it won't work. If I increase it to 1478 it may or may not work.
I've done some measurements over the internet in the past year or so and 1400 byte packets with DF bit seem to make it just fine.
- Jared
-- Jared Mauch | pgp key available via finger from jared@puck.nether.net clue++; | http://puck.nether.net/~jared/ My statements are only mine.
-- Mark Andrews, ISC 1 Seymour St., Dundas Valley, NSW 2117, Australia PHONE: +61 2 9871 4742 INTERNET: marka@isc.org
On Mon, 08 Jan 2018 17:55:55 -0500, Dovid Bender said:
Hi,
N00b here trying to understand why certain CDNs such as Cloudflare have issues when my MTU is low. For instance, if I am using PPTP and the MTU is at 1300 it won't work. If I increase it to 1478 it may or may not work.
Wait, what? MTU 1300 fails but 1478 sometimes works? Or was 1300 a typo and you meant 1500?
participants (12):
- Dovid Bender
- George Michaelson
- Jared Mauch
- Jared Mauch
- joel jaeggli
- Mark Andrews
- Mikael Abrahamsson
- Owen DeLong
- Radu-Adrian Feurdean
- valdis.kletnieks@vt.edu
- Vincent Bernat
- William Herrin