How often do packets magically get duplicated within the network so that the target receives 2 copies? That seems like something somebody at NANOG might have studied and given a talk on. Any suggestions for other places to look?

Context is NTP. If a client gets an answer, should it keep the socket around for a short time so that any late responses or duplicates from the network don't turn into ICMP port unreachable back at the server? Nothing critical, just general clutter reduction.

I have packet captures from an NTP server. I'm trying to sort things out. There are a surprising (to me) number of duplicates that arrive back-to-back; sometimes the timestamp is the same microsecond. They could come from buggy clients, but that seems like an unlikely sort of bug.

--
These are my opinions. I hate spam.
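For sorting through a capture like the one described above, a minimal sketch of counting back-to-back duplicates might look like the following. The `Packet` record, the function name, and the 1 ms window are all assumptions for illustration; the capture is assumed to have already been reduced to (timestamp, source, payload) tuples by some other tool.

```python
from collections import namedtuple

# Hypothetical record type: one entry per captured request.
Packet = namedtuple("Packet", "timestamp src payload")


def count_back_to_back_duplicates(packets, window=0.001):
    """Count packets that repeat an identical (src, payload) within `window` seconds."""
    last_seen = {}  # (src, payload) -> timestamp of the most recent copy
    duplicates = 0
    for pkt in sorted(packets, key=lambda p: p.timestamp):
        key = (pkt.src, pkt.payload)
        previous = last_seen.get(key)
        if previous is not None and pkt.timestamp - previous <= window:
            duplicates += 1
        last_seen[key] = pkt.timestamp
    return duplicates


sample = [
    Packet(0.0, "192.0.2.1", b"req-a"),
    Packet(0.0000005, "192.0.2.1", b"req-a"),  # same bytes, half a microsecond later
    Packet(5.0, "192.0.2.1", b"req-a"),        # same bytes, but far outside the window
    Packet(5.0, "192.0.2.2", b"req-a"),        # different source
]
print(count_back_to_back_duplicates(sample))
```

Keying on the full payload rather than only the source address keeps ordinary repeated polls from one client from being counted as duplicates.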
Hey Hal,
How often do packets magically get duplicated within the network so that the target receives 2 copies? That seems like something somebody at NANOG might have studied and given a talk on.
I can't tell you how common it is, because that type of visibility is not easy to acquire, but I can explain at least one scenario when it occasionally happens.

1) Imagine a ring of L2 metro ethernet
2) Ring is connected to two PE routers, for redundancy
3) Customers are connected to ring ports and backhauled over VLAN to PE

If there is very little traffic from Network=>Customer, the L2 metro forgets the MAC of customer subinterfaces (or VRRP) on the PE routers. Then when the client sends a packet to the Internet, the L2 floods it to all eligible ports, and it'll arrive at both PE routers, which will continue to forward it to the Internet. This requires an unfortunate (but typical) combination of ARP timeout and MAC timeout, so that the sender still has the ARP cache entry, while the switch doesn't have the MAC cache entry.

In the opposite direction this same topology can cause loops, when PE routers still have a customer MAC in the ARP table, but the L2 switch doesn't have the MAC.

I wouldn't personally add code in applications to handle this case more gracefully.

--
++ytti
On Mon, Jun 22, 2020 at 9:43 PM Saku Ytti <saku@ytti.fi> wrote:
I can't tell you how common it is, because that type of visibility is not easy to acquire, but I can explain at least one scenario when it occasionally happens.
1) Imagine a ring of L2 metro ethernet
2) Ring is connected to two PE routers, for redundancy
3) Customers are connected to ring ports and backhauled over VLAN to PE
If there is very little traffic from Network=>Customer, the L2 metro forgets the MAC of customer subinterfaces (or VRRP) on the PE routers. Then when the client sends a packet to the Internet, the L2 floods it to all eligible ports, and it'll arrive to both PE routers, which will continue to forward it to the Internet.
Hi Saku,

That's what spanning tree and its compatriots are for. Otherwise, ordinary broadcast traffic (like those arp packets) would travel in a loop, flooding the network and it would just about instantly collapse when you first turned it on.

A slightly more likely scenario is a wifi link. 802.11 employs layer-2 retries across the wireless segment. When the packet is successfully transmitted but the ack is garbled, the packet may be sent a second time. Even then I wouldn't expect duplicated packets to be more than a very small fraction of a percent.

Hal, if you're seeing a non-trivial amount of identical packets, my best guess is that the client is sending identical packets for some reason. NTP you say? How does iburst work during initial sync up?

Regards,
Bill Herrin

--
William Herrin
bill@herrin.us
https://bill.herrin.us/
On Tue, 23 Jun 2020 at 08:12, William Herrin <bill@herrin.us> wrote:
Hey Bill,
That's what spanning tree and its compatriots are for. Otherwise, ordinary broadcast traffic (like those arp packets) would travel in a loop, flooding the network and it would just about instantly collapse when you first turned it on.
Metro: S1-S2-S3-S1
PE1: S1
PE2: S2
Customer: S3
STP blocking: ANY

S3 sends frame, it is unknown unicast flooded, S1+S2 both get it (regardless of which metro port blocks), which will send it via PE to Internet.

STP doesn't help, at all. Hope this helps.

--
++ytti
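The flooding behaviour described above can be illustrated with a toy model. This is only a sketch under loud simplifying assumptions: the whole S1-S2-S3 ring is collapsed into a single learning switch with three ports, the class and port names are hypothetical, and the 300 s aging value is the Catalyst default quoted elsewhere in this thread.

```python
class LearningSwitch:
    """Toy L2 switch: learns source MACs, ages them out, floods unknown unicast."""

    def __init__(self, ports, mac_age_seconds=300):
        self.ports = set(ports)
        self.mac_age_seconds = mac_age_seconds
        self.table = {}  # MAC -> (port, time last seen as a source)

    def forward(self, now, in_port, src_mac, dst_mac):
        """Learn src_mac, then return the set of ports the frame leaves on."""
        self.table[src_mac] = (in_port, now)
        entry = self.table.get(dst_mac)
        if entry is not None and now - entry[1] <= self.mac_age_seconds:
            return {entry[0]}  # known unicast: one port only
        return self.ports - {in_port}  # unknown unicast: flood everywhere else


# Assumption: the ring is modelled as one switch with a customer port and two PE ports.
ring = LearningSwitch(ports={"customer", "PE1", "PE2"})
GW_MAC, CUST_MAC = "gateway-mac", "customer-mac"  # placeholder identifiers

# A frame from the PE side teaches the ring where the gateway MAC lives.
ring.forward(now=0, in_port="PE1", src_mac=GW_MAC, dst_mac=CUST_MAC)

# 60 s later the entry is still fresh; 1024 s later it has aged out.
fresh = ring.forward(now=60, in_port="customer", src_mac=CUST_MAC, dst_mac=GW_MAC)
stale = ring.forward(now=1024, in_port="customer", src_mac=CUST_MAC, dst_mac=GW_MAC)
print(sorted(fresh))
print(sorted(stale))
```

In this model the customer's own frames only refresh the customer's MAC, never the gateway's, so without traffic from the PE side the gateway entry ages out and the next frame reaches both PE ports. Spanning tree is not modelled at all, since it decides which ring link blocks, not whether unknown unicast is flooded.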
On 23/Jun/20 07:21, Saku Ytti wrote:
Metro: S1-S2-S3-S1
PE1: S1
PE2: S2
Customer: S3
STP blocking: ANY
S3 sends frame, it is unknown unicast flooded, S1+S2 both get it (regardless of which metro port blocks), which will send it via PE to Internet.
STP doesn't help, at all. Hope this helps.
In the above, is S3 part of the Metro-E ring, or simply downstream of S1 and S2? Mark.
On Tue, 23 Jun 2020 at 08:36, Mark Tinka <mark.tinka@seacom.mu> wrote:
To be clear, is the customer's device S3, or is S3 the ISP's device that terminates the customer's service?
S1-S2-S3-S1 is operator L2 metro-ring, which connects customers and 2xPE routers. It VLAN backhauls customers to PE. -- ++ytti
On 23/Jun/20 07:52, Saku Ytti wrote:
S1-S2-S3-S1 is operator L2 metro-ring, which connects customers and 2xPE routers. It VLAN backhauls customers to PE.
Okay. In 2014, we hit a similar issue, although not in a ring. Our previous architecture was to interconnect edge routers via downstream, interconnected aggregation switches to which customers connected in order to support VRRP. Since customers do strange things, both edge routers received the same traffic, which caused pain. Since then, we don't support VRRP for customers any longer, nor do we interconnect aggregation switches that map to different edge routers. Your example scenario describes what we experienced back then. Mark.
----- On Jun 22, 2020, at 10:21 PM, Saku Ytti saku@ytti.fi wrote:
Hi,
Metro: S1-S2-S3-S1
PE1: S1
PE2: S2
Customer: S3
STP blocking: ANY
S3 sends frame, it is unknown unicast flooded, S1+S2 both get it (regardless of which metro port blocks), which will send it via PE to Internet.
STP doesn't help, at all. Hope this helps.
Yeah, except that unless you use static ARP entries, I can't come up with a plausible scenario in which this would happen for NTP. Assuming we're talking about a non-local NTP server, S3 will not send an NTP packet without first sending an ARP. Yes, your ARP will be flooded, but your NTP packet won't be transmitted until there is an ARP reply. By that time MACs have been learned, and the NTP packet will not be considered BUM traffic, right?

That said, I have seen packet duplication in L2-only networks that I've worked on myself, but that was because I disregarded a lot of rules from the imaginary networking handbook.

Thanks,
Sabri
On Tue, 23 Jun 2020 at 09:15, Sabri Berisha <sabri@cluecentral.net> wrote:
Yeah, except that unless you use static ARP entries, I can't come up with a plausible scenario in which this would happen for NTP. Assuming we're talking about a non-local NTP server, S3 will not send an NTP packet without first sending an ARP. Yes, your ARP will be flooded, but your NTP packet won't be transmitted until there is an ARP reply. By that time MACs have been learned, and the NTP packet will not be considered BUM traffic, right?
The plausible scenario is the one I explained. The crucial detail is the MAC timeout (Catalyst: 300s) being shorter than the ARP timeout (Cisco: 4h). So the device generating the packet knows the MAC address, the L2 does not.

Hope this helps!

--
++ytti
----- On Jun 22, 2020, at 11:21 PM, Saku Ytti saku@ytti.fi wrote:
Hi Saku,
On Tue, 23 Jun 2020 at 09:15, Sabri Berisha <sabri@cluecentral.net> wrote:
Yeah, except that unless you use static ARP entries, I can't come up with a plausible scenario in which this would happen for NTP. Assuming we're talking about a non-local NTP server, S3 will not send an NTP packet without first sending an ARP. Yes, your ARP will be flooded, but your NTP packet won't be transmitted until there is an ARP reply. By that time MACs have been learned, and the NTP packet will not be considered BUM traffic, right?
The plausible scenario is the one I explained. The crucial detail is MAC timeout (catalyst 300s) being shorter than ARP timeout (cisco 4h). So the device generating the packet knows the MAC address, the L2 does not.
Aaah yes, fair point! Thanks $deity for default timers that make no sense. Thanks, Sabri
On Tue, 23 Jun 2020 at 09:32, Sabri Berisha <sabri@cluecentral.net> wrote:
Aaah yes, fair point! Thanks $deity for default timers that make no sense.
Add a low-traffic connection and NTP's default 1024s maxPoll, and this duplication is guaranteed to happen for 97.9% of packets.

--
++ytti
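The thread does not say where the 97.9% comes from. One reading that reproduces the figure, offered only as a guess, is that the switch holds the MAC for 300 s out of every 4 h ARP lifetime and floods for the rest of it. The variable names below are illustrative.

```python
# Timer values as quoted in this thread; the formula itself is an assumed interpretation.
mac_timeout_s = 300        # switch MAC aging time
arp_timeout_s = 4 * 3600   # ARP cache lifetime

# Fraction of each ARP lifetime during which the switch has already forgotten the MAC.
flooded_fraction = 1 - mac_timeout_s / arp_timeout_s
print(f"{flooded_fraction:.1%}")  # 97.9%
```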
On Mon, Jun 22, 2020 at 10:21 PM Saku Ytti <saku@ytti.fi> wrote:
On Tue, 23 Jun 2020 at 08:12, William Herrin <bill@herrin.us> wrote:
That's what spanning tree and its compatriots are for. Otherwise, ordinary broadcast traffic (like those arp packets) would travel in a loop, flooding the network and it would just about instantly collapse when you first turned it on.
Metro: S1-S2-S3-S1
PE1: S1
PE2: S2
Customer: S3
STP blocking: ANY
S3 sends frame, it is unknown unicast flooded, S1+S2 both get it (regardless of which metro port blocks), which will send it via PE to Internet.
There's a link in the chain you haven't explained. The packet which entered at S3 has a unicast destination MAC address. That's what was in the arp table. If they're following the standards, only one of PE1 and PE2 will accept packets with that destination mac address. The other, recognizing that the packet is not addressed to it, drops it.

Recall that ethernet worked without duplicating packets back in the days of hubs when all stations received all packets. This is how.

That having been said, I've seen vendors creatively breach the boundary between L2 and L3 with some really peculiar results. AWS VPCs for example. But then this ring configuration doesn't exist in an AWS VPC and I've not particularly observed a lot of packet duplication out of Amazon.

Regards,
Bill Herrin

--
William Herrin
bill@herrin.us
https://bill.herrin.us/
On Tue, 23 Jun 2020 at 09:54, William Herrin <bill@herrin.us> wrote:
There's a link in the chain you haven't explained. The packet which entered at S3 has a unicast destination MAC address. That's what was in the arp table. If they're following the standards, only one of PE1 and PE2 will accept packets with that destination mac address. The other, recognizing that the packet is not addressed to it, drops it.
There are many reasons why practical devices (such as VXR) don't use MAC HW filters: the PHY runs out of HW filter slots, the HW does not support a per-VLAN HW filter, or one subinterface has EoMPLS or another type of L2 service configured, requiring reception of any DMAC. There are also many reasons why both routers have the DMAC in their HW filter, such as VRRP or HSRP.
for example. But then this ring configuration doesn't exist in an AWS VPC and I've not particularly observed a lot of packet duplication out of Amazon.
Amazon does nothing standard, it's all AMZN. Hope this helps! -- ++ytti
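The two cases in this exchange (both routers holding the same DMAC, or a router with no effective DMAC filter) can be sketched as a simple acceptance check. The function name, the set-based filter, and the PE MAC addresses are made up for illustration; `00:00:5e:00:01:01` is the standard VRRP IPv4 virtual MAC for VRID 1. Whether a VRRP backup goes on to forward an accepted frame is a separate question and is not modelled here.

```python
def accepts_frame(dst_mac, programmed_macs, hw_filter_enabled=True):
    """Return True if an interface would pass this unicast frame up for L3 processing."""
    if not hw_filter_enabled:
        return True  # no DMAC filtering: every frame on the wire is received
    return dst_mac.lower() in {mac.lower() for mac in programmed_macs}


VRRP_MAC = "00:00:5e:00:01:01"  # VRRP IPv4 virtual MAC, VRID 1
PE1_MAC = "02:00:00:00:00:01"   # made-up locally administered addresses
PE2_MAC = "02:00:00:00:00:02"

pe1_filter = {PE1_MAC, VRRP_MAC}
pe2_filter = {PE2_MAC, VRRP_MAC}

# A flooded frame addressed to the shared virtual MAC passes both filters.
print(accepts_frame(VRRP_MAC, pe1_filter), accepts_frame(VRRP_MAC, pe2_filter))

# A frame addressed to PE1's own MAC is dropped by PE2 only while PE2 filters on DMAC.
print(accepts_frame(PE1_MAC, pe2_filter))
print(accepts_frame(PE1_MAC, pe2_filter, hw_filter_enabled=False))
```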
On Monday, 22 June 2020, 23:53:44, William Herrin wrote:
On Mon, Jun 22, 2020 at 10:21 PM Saku Ytti <saku@ytti.fi> wrote:
On Tue, 23 Jun 2020 at 08:12, William Herrin <bill@herrin.us> wrote:
That's what spanning tree and its compatriots are for. Otherwise, ordinary broadcast traffic (like those arp packets) would travel in a loop, flooding the network and it would just about instantly collapse when you first turned it on.
Metro: S1-S2-S3-S1
PE1: S1
PE2: S2
Customer: S3
STP blocking: ANY
S3 sends frame, it is unknown unicast flooded, S1+S2 both get it (regardless of which metro port blocks), which will send it via PE to Internet.
There's a link in the chain you haven't explained. The packet which entered at S3 has a unicast destination MAC address. That's what was in the arp table. If they're following the standards, only one of PE1 and PE2 will accept packets with that destination mac address. The other, recognizing that the packet is not addressed to it, drops it.
Recall that ethernet worked without duplicating packets back in the days of hubs when all stations received all packets. This is how.
That having been said, I've seen vendors creatively breach the boundary between L2 and L3 with some really peculiar results. AWS VPCs for example. But then this ring configuration doesn't exist in an AWS VPC and I've not particularly observed a lot of packet duplication out of Amazon.
Regards, Bill Herrin
They don't have to break anything or get creative, just assume VRRP between the PE routers. Not sure how many vendors drop by default if they are not the active router.

Regards,
Karsten
On 23/Jun/20 06:41, Saku Ytti wrote:
I can't tell you how common it is, because that type of visibility is not easy to acquire, but I can explain at least one scenario when it occasionally happens.
1) Imagine a ring of L2 metro ethernet
2) Ring is connected to two PE routers, for redundancy
3) Customers are connected to ring ports and backhauled over VLAN to PE
If there is very little traffic from Network=>Customer, the L2 metro forgets the MAC of customer subinterfaces (or VRRP) on the PE routers. Then when the client sends a packet to the Internet, the L2 floods it to all eligible ports, and it'll arrive to both PE routers, which will continue to forward it to the Internet. This requires an unfortunate (but typical) combination of ARP timeout and MAC timeout, so that sender still has ARP cache, while switch doesn't have MAC cache.
In the opposite direction this same topology can cause loops, when PE routers still have a customer MAC in the ARP table, but L2 switch doesn't have the MAC.
I wouldn't personally add code in applications to handle this case more gracefully.
My understanding of Layer 2-based Metro-E networks is that multi-directional traffic would be prevented by way of Spanning Tree. Mark.
On Mon, Jun 22, 2020 at 5:30 PM Hal Murray <hgm+nanog@ip-64-139-1-69.sjc.megapath.net> wrote:
How often do packets magically get duplicated within the network so that the target receives 2 copies? That seems like something somebody at NANOG might have studied and given a talk on.
Any suggestions for other places to look?
Bugs like https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvn71311, where both the HW-forwarded packet and the punted packet are sent to the destination?
participants (7)
- Hal Murray
- Karsten Thomann
- Mark Tinka
- Sabri Berisha
- Saku Ytti
- William Herrin
- Yang Yu