Large RTT or Why doesn't my ping traffic get discarded?
Here's a question I haven't bothered to ask until now. Can someone please help me understand why I receive a ping reply after almost 5 seconds? As I understand it, buffers in SP gear are generally 100ms. According to my math this round trip should have been discarded around the 1 second mark, even in a long path. Maybe I should buy a lottery ticket. I don't get it. What is happening here?

Jason

64 bytes from 4.2.2.2: icmp_seq=392 ttl=54 time=4834.737 ms
64 bytes from 4.2.2.2: icmp_seq=393 ttl=54 time=4301.243 ms
64 bytes from 4.2.2.2: icmp_seq=394 ttl=54 time=3300.328 ms
64 bytes from 4.2.2.2: icmp_seq=396 ttl=54 time=1289.723 ms
Request timeout for icmp_seq 400
Request timeout for icmp_seq 401
64 bytes from 4.2.2.2: icmp_seq=398 ttl=54 time=4915.096 ms
64 bytes from 4.2.2.2: icmp_seq=399 ttl=54 time=4310.575 ms
64 bytes from 4.2.2.2: icmp_seq=400 ttl=54 time=4196.075 ms
64 bytes from 4.2.2.2: icmp_seq=401 ttl=54 time=4287.048 ms
64 bytes from 4.2.2.2: icmp_seq=403 ttl=54 time=2280.466 ms
64 bytes from 4.2.2.2: icmp_seq=404 ttl=54 time=1279.348 ms
64 bytes from 4.2.2.2: icmp_seq=405 ttl=54 time=276.669 ms
This is often due to high CPU load on the target device: if the device is under heavy load, the ICMP echo process gets the lowest priority. With a well-known name server like 4.2.2.2, that seems unlikely. It could be an intermediate hop or a routing loop. Do a traceroute to get more detailed per-hop statistics.

-mel
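For the per-hop statistics Mel suggests, mtr is one common option; a minimal sketch, assuming it is run from the affected host:

  # report mode, wide output, 100 probes: per-hop loss and RTT in one table
  mtr -rw -c 100 4.2.2.2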
Keep in mind that ping reports round-trip time, so a device could be delaying the ping reply on the return trip. In these cases it helps to have a traceroute from both ends, to detect asymmetric routing and possibly return-path congestion invisible in a traceroute from your end.

-mel
On Wed, Dec 21, 2022 at 9:10 AM Jason Iannone <jason.iannone@gmail.com> wrote:
Here's a question I haven't bothered to ask until now. Can someone please help me understand why I receive a ping reply after almost 5 seconds?
64 bytes from 4.2.2.2: icmp_seq=398 ttl=54 time=4915.096 ms
64 bytes from 4.2.2.2: icmp_seq=399 ttl=54 time=4310.575 ms
64 bytes from 4.2.2.2: icmp_seq=400 ttl=54 time=4196.075 ms
64 bytes from 4.2.2.2: icmp_seq=401 ttl=54 time=4287.048 ms
64 bytes from 4.2.2.2: icmp_seq=403 ttl=54 time=2280.466 ms
64 bytes from 4.2.2.2: icmp_seq=404 ttl=54 time=1279.348 ms
64 bytes from 4.2.2.2: icmp_seq=405 ttl=54 time=276.669 ms
Hi Jason,

This usually means a problem on the Linux machine originating the packet. It has lost the ARP for the next hop or something similar, so the outbound ICMP packet is queued. The glitch repairs itself, briefly, releasing the queued packets. Then it comes right back.

Regards,
Bill Herrin

--
For hire. https://bill.herrin.us/resume/
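One way to check the ARP theory on a Linux source host is to watch neighbor-table state changes while the ping runs; a minimal sketch, assuming the affected host is Linux and 192.0.2.1 is a placeholder for your gateway:

  # in one terminal: stream neighbor-table events (REACHABLE/STALE/FAILED)
  ip monitor neigh
  # in another: confirm the gateway's current entry while pinging
  ip neigh show | grep 192.0.2.1
  ping 4.2.2.2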
There's this thing called bufferbloat...

On Wed, Dec 21, 2022 at 11:58 AM William Herrin <bill@herrin.us> wrote:
On Wed, Dec 21, 2022 at 9:10 AM Jason Iannone <jason.iannone@gmail.com> wrote:
Here's a question I haven't bothered to ask until now. Can someone please help me understand why I receive a ping reply after almost 5 seconds?
64 bytes from 4.2.2.2: icmp_seq=398 ttl=54 time=4915.096 ms
64 bytes from 4.2.2.2: icmp_seq=399 ttl=54 time=4310.575 ms
64 bytes from 4.2.2.2: icmp_seq=400 ttl=54 time=4196.075 ms
64 bytes from 4.2.2.2: icmp_seq=401 ttl=54 time=4287.048 ms
64 bytes from 4.2.2.2: icmp_seq=403 ttl=54 time=2280.466 ms
64 bytes from 4.2.2.2: icmp_seq=404 ttl=54 time=1279.348 ms
64 bytes from 4.2.2.2: icmp_seq=405 ttl=54 time=276.669 ms
Hi Jason,
This usually means a problem on the Linux machine originating the packet. It has lost the ARP for the next hop or something similar so the outbound ICMP packet is queued. The glitch repairs itself, briefly, releasing the queued packets. Then it comes right back.
Regards, Bill Herrin
-- For hire. https://bill.herrin.us/resume/
--
This song goes out to all the folk that thought Stadia would work:
https://www.linkedin.com/posts/dtaht_the-mushroom-song-activity-698136666560...

Dave Täht
CEO, TekLibre, LLC
On Wed, Dec 21, 2022 at 1:20 PM Dave Taht <dave.taht@gmail.com> wrote:
On Wed, Dec 21, 2022 at 11:58 AM William Herrin <bill@herrin.us> wrote:
On Wed, Dec 21, 2022 at 9:10 AM Jason Iannone <jason.iannone@gmail.com> wrote:
Here's a question I haven't bothered to ask until now. Can someone please help me understand why I receive a ping reply after almost 5 seconds?
64 bytes from 4.2.2.2: icmp_seq=398 ttl=54 time=4915.096 ms
64 bytes from 4.2.2.2: icmp_seq=399 ttl=54 time=4310.575 ms
64 bytes from 4.2.2.2: icmp_seq=400 ttl=54 time=4196.075 ms
64 bytes from 4.2.2.2: icmp_seq=401 ttl=54 time=4287.048 ms
64 bytes from 4.2.2.2: icmp_seq=403 ttl=54 time=2280.466 ms
64 bytes from 4.2.2.2: icmp_seq=404 ttl=54 time=1279.348 ms
64 bytes from 4.2.2.2: icmp_seq=405 ttl=54 time=276.669 ms
Hi Jason,
This usually means a problem on the Linux machine originating the packet. It has lost the ARP for the next hop or something similar so the outbound ICMP packet is queued. The glitch repairs itself, briefly, releasing the queued packets. Then it comes right back.
There's this thing called bufferbloat...
Hi Dave,

Yes, but I've seen this particular pattern before and it's generally not bufferbloat. With bufferbloat you usually see consistently long ping times: this ping is 3 seconds, the next ping is 2.9, the next is 3.2. This example has a descending pattern spread exactly the number of seconds apart that the ICMP messages were sent. The descending pattern indicates something went wrong with ARP, or a virtual machine was starved for CPU time and didn't run for a couple of seconds, or something like that.

Regards,
Bill Herrin

--
For hire. https://bill.herrin.us/resume/
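The spacing Bill describes is easy to check mechanically: in a queued-then-released burst, consecutive RTTs differ by roughly the send interval (about -1000 ms), while bufferbloat deltas hover near zero. A rough sketch using awk to parse standard ping output (line buffering in the pipe may delay output slightly):

  ping 4.2.2.2 | awk -F'time=' '/time=/ {
      split($2, a, " ")          # a[1] is the RTT in ms
      if (prev != "") printf "rtt=%s delta=%+.1f ms\n", a[1], a[1] - prev
      prev = a[1]
  }'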
Also, if this persists, you may consider disabling hardware rx/tx checksumming to see if it clears up your results; some NICs can get glitchy and cause this exact behavior. GL

--
J. Hellenthal

The fact that there's a highway to Hell but only a stairway to Heaven says a lot about anticipated traffic volume.
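For reference, on Linux the offload toggles live in ethtool; a minimal sketch, assuming interface eth0 (substitute your own):

  # show current checksum-offload settings
  ethtool -k eth0 | grep checksum
  # temporarily disable rx/tx checksum offload to rule out a flaky NIC
  ethtool -K eth0 rx off tx off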
On Dec 21, 2022, at 13:58, William Herrin <bill@herrin.us> wrote:
On Wed, Dec 21, 2022 at 9:10 AM Jason Iannone <jason.iannone@gmail.com> wrote:
Here's a question I haven't bothered to ask until now. Can someone please help me understand why I receive a ping reply after almost 5 seconds?
64 bytes from 4.2.2.2: icmp_seq=398 ttl=54 time=4915.096 ms
64 bytes from 4.2.2.2: icmp_seq=399 ttl=54 time=4310.575 ms
64 bytes from 4.2.2.2: icmp_seq=400 ttl=54 time=4196.075 ms
64 bytes from 4.2.2.2: icmp_seq=401 ttl=54 time=4287.048 ms
64 bytes from 4.2.2.2: icmp_seq=403 ttl=54 time=2280.466 ms
64 bytes from 4.2.2.2: icmp_seq=404 ttl=54 time=1279.348 ms
64 bytes from 4.2.2.2: icmp_seq=405 ttl=54 time=276.669 ms
Hi Jason,
This usually means a problem on the Linux machine originating the packet. It has lost the ARP for the next hop or something similar so the outbound ICMP packet is queued. The glitch repairs itself, briefly, releasing the queued packets. Then it comes right back.
Regards, Bill Herrin
-- For hire. https://bill.herrin.us/resume/
You didn't tell us anything about your path or your endpoint, or whether you see this just with Lumen's DNS servers or with other devices, so it is hard to guess what is going on here. That said, I know I've seen this kind of behavior both with bufferbloat on consumer devices (particularly in the uplink direction) and on wifi networks (which can have surprisingly deep buffers, with retransmissions occurring at layer 1.5/2). My guess is that there is a software routing/switching device somewhere in the path (wifi AP, home router, Linux or BSD router, etc.); a quick load test is sketched after this message.

On Wed, Dec 21, 2022 at 10:10 AM Jason Iannone <jason.iannone@gmail.com> wrote:
Here's a question I haven't bothered to ask until now. Can someone please help me understand why I receive a ping reply after almost 5 seconds? As I understand it, buffers in SP gear are generally 100ms. According to my math this round trip should have been discarded around the 1 second mark, even in a long path. Maybe I should buy a lottery ticket. I don't get it. What is happening here?
Jason
64 bytes from 4.2.2.2: icmp_seq=392 ttl=54 time=4834.737 ms
64 bytes from 4.2.2.2: icmp_seq=393 ttl=54 time=4301.243 ms
64 bytes from 4.2.2.2: icmp_seq=394 ttl=54 time=3300.328 ms
64 bytes from 4.2.2.2: icmp_seq=396 ttl=54 time=1289.723 ms
Request timeout for icmp_seq 400
Request timeout for icmp_seq 401
64 bytes from 4.2.2.2: icmp_seq=398 ttl=54 time=4915.096 ms
64 bytes from 4.2.2.2: icmp_seq=399 ttl=54 time=4310.575 ms
64 bytes from 4.2.2.2: icmp_seq=400 ttl=54 time=4196.075 ms
64 bytes from 4.2.2.2: icmp_seq=401 ttl=54 time=4287.048 ms
64 bytes from 4.2.2.2: icmp_seq=403 ttl=54 time=2280.466 ms
64 bytes from 4.2.2.2: icmp_seq=404 ttl=54 time=1279.348 ms
64 bytes from 4.2.2.2: icmp_seq=405 ttl=54 time=276.669 ms
--
Sincerely,
Ms. Joelle Maslak
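A quick way to test the bufferbloat theory is to ping while saturating the uplink and watch whether RTT climbs smoothly into the seconds; a sketch, assuming you control an iperf3 server (iperf.example.net is a placeholder name):

  # saturate the uplink for 30 seconds in the background
  iperf3 -c iperf.example.net -t 30 &
  # watch whether RTT ramps up and stays high while the transfer runs
  ping -c 30 4.2.2.2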
Because there is no standard for discarding "old" traffic; the only discard is for packets that hop too many times. There is, however, a standard for decrementing TTL by 1 if a packet sits on a device for more than 1000ms, and of course we all know what happens when TTL hits zero. Based on that, your packet could have floated around for another 53 seconds. Having said that, I'm not sure many devices actually do this (but it's not likely it would have had a significant impact on this traffic anyway).
There certainly aren't any temporal buffers in SP gear limiting the buffer to 100ms, nor are there any mechanisms to temporally decrease TTL or hop-limit. Some devices may expose temporal configuration in the UX, but that is just a multiplier for max_buffer_bytes: what is programmed is a fixed number of bytes, not a temporal limit as a function of observed traffic rate.

This is important because HW may support tens or even hundreds of thousands of queues, so that it can offer a large number of logical interfaces with HQoS and multiple queues each. If such a device is run with a single logical interface which is low speed, either physically or shaped, you can end up with very, very long temporal queues: not because people intend to queue that long, but because understanding all of this requires a lot of context and platform-specific information which isn't readily available, nor is it solved by 'just remove those buffers from the devices physically, it's bufferbloat'. (See the worked numbers after this message.)

Like others have pointed out, there is not much information to go on, and this could be many things. One of them is 'bufferbloat', as Taht pointed out; this might be true given the cyclical nature of the ping, the buffer getting filled and drained. I don't really think ARP/ND is a good candidate, as Herrin suggested, because the pattern is cyclical instead of exactly a single event, but it's not impossible.

We'd really need to see full mtr output, and whether or not this affects other destinations, whether it affects just ICMP or also DNS, and ideally a reverse traceroute as well. I can tell that I'm not observing the issue, nor did I expect to observe it, as I expect the problem to be close to your network, and therefore affecting a lot of destinations.

On Thu, 22 Dec 2022 at 07:35, Jerry Cloe <jerry@jtcloe.net> wrote:
Because there is no standard for discarding "old" traffic; the only discard is for packets that hop too many times. There is, however, a standard for decrementing TTL by 1 if a packet sits on a device for more than 1000ms, and of course we all know what happens when TTL hits zero. Based on that, your packet could have floated around for another 53 seconds. Having said that, I'm not sure many devices actually do this (but it's not likely it would have had a significant impact on this traffic anyway).
-----Original message-----
From: Jason Iannone <jason.iannone@gmail.com>
Sent: Wed 12-21-2022 11:11 am
Subject: Large RTT or Why doesn't my ping traffic get discarded?
To: North American Network Operators' Group <nanog@nanog.org>

Here's a question I haven't bothered to ask until now. Can someone please help me understand why I receive a ping reply after almost 5 seconds? As I understand it, buffers in SP gear are generally 100ms. According to my math this round trip should have been discarded around the 1 second mark, even in a long path. Maybe I should buy a lottery ticket. I don't get it. What is happening here?
Jason
64 bytes from 4.2.2.2: icmp_seq=392 ttl=54 time=4834.737 ms
64 bytes from 4.2.2.2: icmp_seq=393 ttl=54 time=4301.243 ms
64 bytes from 4.2.2.2: icmp_seq=394 ttl=54 time=3300.328 ms
64 bytes from 4.2.2.2: icmp_seq=396 ttl=54 time=1289.723 ms
Request timeout for icmp_seq 400
Request timeout for icmp_seq 401
64 bytes from 4.2.2.2: icmp_seq=398 ttl=54 time=4915.096 ms
64 bytes from 4.2.2.2: icmp_seq=399 ttl=54 time=4310.575 ms
64 bytes from 4.2.2.2: icmp_seq=400 ttl=54 time=4196.075 ms
64 bytes from 4.2.2.2: icmp_seq=401 ttl=54 time=4287.048 ms
64 bytes from 4.2.2.2: icmp_seq=403 ttl=54 time=2280.466 ms
64 bytes from 4.2.2.2: icmp_seq=404 ttl=54 time=1279.348 ms
64 bytes from 4.2.2.2: icmp_seq=405 ttl=54 time=276.669 ms
-- ++ytti
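A worked example of the byte-versus-time distinction above, with illustrative numbers only (shell arithmetic): a buffer sized for 100 ms at line rate becomes an enormous temporal queue when the same byte count drains at a much lower shaped rate.

  # 100 ms of buffer at 10 Gb/s, expressed in bytes:
  echo $(( 10000000000 / 8 / 10 ))       # 125000000 bytes (125 MB)
  # the same 125 MB draining at a 10 Mb/s shaped rate takes:
  echo $(( 125000000 * 8 / 10000000 ))   # 100 seconds of queue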
On Wed, Dec 21, 2022 at 10:07 PM Saku Ytti <saku@ytti.fi> wrote:
I don't really think ARP/ND is a good candidate, as Herrin suggested, because the pattern is cyclical instead of exactly a single event, but it's not impossible.
Suppose you have a loose network cable between your Linux server and a switch. Layer 1. That RJ45 just isn't quite solid. It's mostly working but not quite right. What does it look like at layer 2? One thing it can look like is a periodic carrier flash where the NIC thinks it has no carrier, then immediately thinks it has enough of a carrier to negotiate speed and duplex. How does layer 3 respond to that?

1s: send ping toward default router
1.1s: ping response from remote server
2s: send ping toward default router
2.1s: ping response from remote server
2.5s: carrier down
2.501s: carrier up
3s: queue ping, arp for default router, no response
4s: queue ping, arp for default router, no response
5s: queue ping, arp for default router, no response
6s: queue ping, arp for default router, no response
7s: queue ping, arp for default router
7.01s: arp response, send all 5 queued pings but note that the earliest is more than 4 seconds old.
7.1s: response from all 5 queued pings.

Cable still isn't right though, so in a few seconds or a few minutes you're going to get another carrier flash and the pattern will repeat.

I've also seen some cheap switches get stuck doing this even after the faulty cable connection is repaired, not clearing until a reboot.

Regards,
Bill Herrin

--
For hire. https://bill.herrin.us/resume/
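If you suspect the carrier-flash scenario above, Linux exposes a few ways to catch it; a minimal sketch, assuming interface eth0 (substitute your own):

  # stream link up/down events as they happen
  ip monitor link
  # or, after the fact: count carrier transitions and check the kernel log
  cat /sys/class/net/eth0/carrier_changes
  dmesg | grep -i 'eth0.*link'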
On Thu, 22 Dec 2022 at 08:41, William Herrin <bill@herrin.us> wrote:
Suppose you have a loose network cable between your Linux server and a switch. Layer 1. That RJ45 just isn't quite solid. It's mostly working but not quite right. What does it look like at layer 2? One thing it can look like is a periodic carrier flash where the NIC thinks it has no carrier, then immediately thinks it has enough of a carrier to negotiate speed and duplex. How does layer 3 respond to that?
Agreed. But then once the resolution happens and Linux floods the queued pings out, the responses would come back ~immediately, so the delta between the RTTs would remain the send interval, in this case 1s. Here we see the RTT decreasing as if the buffer is being purged, until it seems to be filled again, up until 5s or so. I don't exclude the rationale; I just think it's not likely based on the latencies observed. But at any rate, with so little data my confidence to include or exclude any specific explanation is low.
1s: send ping toward default router
1.1s: ping response from remote server
2s: send ping toward default router
2.1s: ping response from remote server
2.5s: carrier down
2.501s: carrier up
3s: queue ping, arp for default router, no response
4s: queue ping, arp for default router, no response
5s: queue ping, arp for default router, no response
6s: queue ping, arp for default router, no response
7s: queue ping, arp for default router
7.01s: arp response, send all 5 queued pings but note that the earliest is more than 4 seconds old.
7.1s: response from all 5 queued pings.
Cable still isn't right though, so in a few seconds or a few minutes you're going to get another carrier flash and the pattern will repeat.
I've also seen some cheap switches get stuck doing this even after the faulty cable connection is repaired, not clearing until a reboot.
Regards, Bill Herrin
-- For hire. https://bill.herrin.us/resume/
-- ++ytti
On Wed, Dec 21, 2022 at 11:03 PM Saku Ytti <saku@ytti.fi> wrote:
On Thu, 22 Dec 2022 at 08:41, William Herrin <bill@herrin.us> wrote:
Suppose you have a loose network cable between your Linux server and a switch. Layer 1. That RJ45 just isn't quite solid. It's mostly working but not quite right. What does it look like at layer 2? One thing it can look like is a periodic carrier flash where the NIC thinks it has no carrier, then immediately thinks it has enough of a carrier to negotiate speed and duplex. How does layer 3 respond to that?
Agreed. But then once the resolution happens and Linux floods the queued pings out, the responses would come back ~immediately, so the delta between the RTTs would remain the send interval, in this case 1s. Here we see the RTT decreasing as if the buffer is being purged, until it seems to be filled again, up until 5s or so.
Howdy,

Not quite. The ping origination time isn't set when layer 3 decides the packet can be delivered to layer 2; it's set when layer 7 drops the packet on the stack. In other words: when the ping app "sends" the packet, not when the NIC actually puts the packet on the wire or even when the OS sends the packet over to the NIC. The time the packet spends queued waiting for ARP to supply a next-hop MAC address counts against the round-trip time.

When you see this pattern of descending ping times exactly one second apart, where the responses all arrived at once, it's usually because something in the path didn't have the next-hop MAC address for a while, and then it did. And it's usually not something deep in the network, because something deep would exhaust its transmission queue long before it could queue several seconds' worth of pings.

If you want to prove this to yourself, set up a Linux box, install a filter to drop ARP replies (arptables or nftables), delete the ARP entry for your default router (arp -d), and then start pinging something. When you -remove- the ARP filter, you'll see the pattern in the ping responses that Jason posted.

You may get different results on other OSes. For example, Windows will lose its DHCP address with the carrier flash, so when ping tries to send the packet the network is unreachable. Because the stack considers the network unreachable, the ping packet isn't queued and the error is reported immediately to the application.

Regards,
Bill Herrin

--
For hire. https://bill.herrin.us/resume/
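Bill's experiment, sketched for a Linux box; assumes interface eth0 and gateway 192.0.2.1 (placeholders, substitute your own) and that arptables is installed:

  # drop all incoming ARP replies so resolution stalls (opcode 2 = reply)
  arptables -A INPUT --opcode 2 -j DROP
  # flush the gateway's neighbor entry
  ip neigh del 192.0.2.1 dev eth0
  # start pinging; requests queue while ARP retries
  ping 4.2.2.2
  # ...after a few seconds, in another terminal, remove the filter:
  arptables -D INPUT --opcode 2 -j DROP
  # the queued pings flush at once, with descending RTTs ~1 s apart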
Jerry Cloe wrote:
Because there is no standard for discarding "old" traffic; the only discard is for packets that hop too many times. There is, however, a standard for decrementing TTL by 1 if a packet sits on a device for more than 1000ms, and of course we all know what happens when TTL hits zero. Based on that, your packet could have floated around for another 53 seconds.
Totally wrong: the standard says TTL MUST be decremented by at least one on every hop, and any further decrement for holding time is only a MAY, as specified by the standard for IPv4 router requirements (RFC 1812):

When a router forwards a packet, it MUST reduce the TTL by at least one. If it holds a packet for more than one second, it MAY decrement the TTL by one for each second.

As for IPv6,

Unlike IPv4, IPv6 nodes are not required to enforce maximum packet lifetime. That is the reason the IPv4 "Time to Live" field was renamed "Hop Limit" in IPv6. In practice, very few, if any, IPv4 implementations conform to the requirement that they limit packet lifetime, so this is not a change in practice.

Masataka Ohta
Thanks for engaging with this. I was intentionally brief in my explanation. I have observed this behavior in congested networks for years and ignored it as an obvious symptom of the congestion. What has always piqued my curiosity, though, is just how long a ping can last. In my case yesterday, I was at the airport at peak holiday travel and free wifi usage time. I expect a bad experience. I don't expect a ping to return 5 seconds after originating it. I just imagine the network straining and groaning to get my ping back to me. It's okay, man. Let it go.

On Thu, Dec 22, 2022 at 5:22 AM Masataka Ohta <mohta@necom830.hpcl.titech.ac.jp> wrote:
Jerry Cloe wrote:
Because there is no standard for discarding "old" traffic; the only discard is for packets that hop too many times. There is, however, a standard for decrementing TTL by 1 if a packet sits on a device for more than 1000ms, and of course we all know what happens when TTL hits zero. Based on that, your packet could have floated around for another 53 seconds.
Totally wrong: the standard says TTL MUST be decremented by at least one on every hop, and any further decrement for holding time is only a MAY, as specified by the standard for IPv4 router requirements (RFC 1812):
When a router forwards a packet, it MUST reduce the TTL by at least one. If it holds a packet for more than one second, it MAY decrement the TTL by one for each second.
As for IPv6,
Unlike IPv4, IPv6 nodes are not required to enforce maximum packet lifetime. That is the reason the IPv4 "Time to Live" field was renamed "Hop Limit" in IPv6. In practice, very few, if any, IPv4 implementations conform to the requirement that they limit packet lifetime, so this is not a change in practice.
Masataka Ohta
participants (9)

- Dave Taht
- J. Hellenthal
- Jason Iannone
- Jerry Cloe
- Joelle Maslak
- Masataka Ohta
- Mel Beckman
- Saku Ytti
- William Herrin