EVPN ESI BUM Forwarding
My google-fu and my attempts to dig through all of the standards are failing me. I am trying to understand the mechanism that prevents an ESI designated forwarder from looping BUM traffic.

The scenario I am imagining is BUM traffic coming into the fabric on an ESI link on a non-designated member of the ESI. That member still needs to replicate the traffic to all of the VTEPs in the instance, including the other ESI members. Other non-designated members obviously don’t send the BUM traffic out the associated ESI ports, but how does the designated one know not to? It could look up the source Ethernet address, but MACs move, and if the address is new, there’s a race between the BGP NLRI arriving and being processed and the encapsulated traffic arriving.

We’ve been seeing what looks exactly like this not working. A switch with a port channel to two leaf switches complains about MAC addresses flapping between where they really are and the port channel. It doesn’t seem like some endpoint out in the fabric is doing it, because it happens across a whole bunch of VLANs, and it goes away if we shut down one of the port channel links, leaving only one up.

I want to understand how this should be working so I can maybe understand what’s not working here and how to fix it.
hey,
"The EVPN split-horizon procedure ensures that the BUM traffic originated by the multi-homed PE and sent from the non-DF to the DF, is not replicated back to the CE (echoed packets on the CE). To avoid these echoed packets, the non-DF (PE1) sends all the BUM packets to the DF (PE2) with an indication of the source Ethernet-Segment. That indication is the ESI Label (ESI2 in the example), previously signaled by PE2 in the AD per-ESI route for the Ethernet-Segment. When PE2 receives an EVPN packet (after the EVPN label lookup), the PE2 finds the ESI label that identifies its local Ethernet-Segment ESI2. The BUM packet is replicated to other local CEs but not to the ESI2 SAP."
https://datatracker.ietf.org/doc/html/draft-ietf-bess-evpn-mh-split-horizon
-- tarko
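As a rough illustration of the ESI-label procedure quoted above, here is a minimal Python sketch of the egress decision on PE2. It is a toy model under stated assumptions, not any vendor's implementation: the `EgressPE` class, the port names, and the label value 1002 are all invented for the example. It assumes PE2 has advertised one ESI label per local Ethernet Segment and that PE1 pushes that label onto BUM frames it received on the shared segment.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class EgressPE:
    """Toy model of a PE receiving BUM traffic from the EVPN core (hypothetical)."""
    name: str
    # ESI label this PE advertised -> local Ethernet Segment it identifies
    esi_label_to_es: Dict[int, str]
    # local port -> Ethernet Segment it belongs to (None = single-homed CE)
    ports: Dict[str, Optional[str]]
    # Ethernet Segments for which this PE is the designated forwarder
    df_for: Set[str]

    def flood_ports(self, esi_label: Optional[int]) -> List[str]:
        """Return the local ports a BUM frame from the core is flooded to."""
        source_es = None
        if esi_label is not None:
            source_es = self.esi_label_to_es.get(esi_label)
        out = []
        for port, es in self.ports.items():
            if es is not None and es not in self.df_for:
                continue  # a non-DF never floods toward a multihomed segment
            if es is not None and es == source_es:
                continue  # split horizon: the frame came from this segment
            out.append(port)
        return out


# PE2 is DF for ESI2 only; po2 sits on a segment where PE2 is non-DF.
pe2 = EgressPE(
    name="PE2",
    esi_label_to_es={1002: "ESI2"},
    ports={"po1": "ESI2", "eth5": None, "po2": "ESI3"},
    df_for={"ESI2"},
)
print(pe2.flood_ports(1002))  # frame entered on ESI2 at PE1
print(pe2.flood_ports(None))  # frame from a PE not attached to ESI2
```

The point of the model is that the DF's decision depends only on a label carried in the packet itself, so it does not race with MAC learning or MAC/IP route advertisement.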
Thanks for the response. It doesn't bear directly on my situation, but it does have references to what I need in RFC 8365. Now that I know the terminology for these features, "Split Horizon" and "Local Bias" (neither of which seems to fit very well to me), it's easier to find more info.

I understand the approach outlined in RFC 8365, and it makes it look more like something is not working right in our implementation. I did some definitive packet captures showing that BUM traffic is going into the non-designated member of the ESI and looping back out of the designated forwarder. Not all BUM traffic is looped, just a small fraction. Next I want to investigate whether the events have something to do with MAC address tables or some other timers.

The switches doing it are two Arista 7050SX3s with a single-instance VXLAN EVPN. It should be a pretty simple setup. I'm not aware of any knobs to modify any of this behavior, or of what we could be missing.

(Sent in reply to Tarko Tikan's message of Thu, Nov 17, 2022 at 1:30 PM, quoted above.)
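For the VXLAN case, where there is no ESI label, the "Local Bias" approach in RFC 8365 can be sketched roughly as follows. This is a simplified Python model under stated assumptions, not Arista's implementation: the function names, port names, and VTEP addresses are invented for illustration. The idea is that the ingress leaf floods to all of its own local ports regardless of DF state, and the egress leaf suppresses flooding onto any segment it shares with the source VTEP, which it identifies from the outer source IP.

```python
from typing import Dict, List, Optional, Set


def ingress_flood_ports(ports: Dict[str, Optional[str]], in_port: str) -> List[str]:
    """Local bias at ingress: flood to every other local port, DF or not."""
    return [p for p in ports if p != in_port]


def egress_flood_ports(
    ports: Dict[str, Optional[str]],   # local port -> segment (None = single-homed)
    df_for: Set[str],                  # segments this leaf is DF for
    es_peers: Dict[str, Set[str]],     # segment -> VTEP IPs of the other attached leafs
    src_vtep: str,                     # outer source IP of the VXLAN packet
) -> List[str]:
    """Decide where a BUM frame received over VXLAN is flooded locally."""
    out = []
    for port, es in ports.items():
        if es is None:
            out.append(port)           # single-homed ports always get the flood
            continue
        if es not in df_for:
            continue                   # non-DF never floods from the core
        if src_vtep in es_peers.get(es, set()):
            continue                   # local bias: the sender already served this segment
        out.append(port)
    return out


# Hypothetical leaf2: DF for both segments, shares ES1 with leaf1 (10.0.0.1).
ports = {"po1": "ES1", "po2": "ES2", "eth5": None}
peers = {"ES1": {"10.0.0.1"}, "ES2": {"10.0.0.3"}}
print(egress_flood_ports(ports, {"ES1", "ES2"}, peers, "10.0.0.1"))
print(egress_flood_ports(ports, {"ES1", "ES2"}, peers, "10.0.0.9"))
print(ingress_flood_ports(ports, "po1"))
```

In this model the filter keys on the source VTEP address and the DF's view of which VTEPs share each segment, not on the inner source MAC. A frame entering on the non-DF and leaving the DF on the same segment would mean, in terms of the sketch, that the `src_vtep in es_peers` check did not match for that packet.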
hey,
The switches doing it are two Arista 7050SX3 with a single instance VXLAN EVPN. It should be a pretty simple setup. Not aware of any knobs to modify any of this behavior or what we could be missing.
Hard to speak for Arista (we also have an EVPN-VXLAN implementation with 7050SX3 and A-A MH and don't see the issue you mention), but I wouldn't be surprised if this is an Arista bug, considering their EVPN story has been pretty rough in other similar areas (like A-S MH not blocking on the non-DF, etc.). Let us know if you find some correlation with other events that might explain this.
-- tarko
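One way to look for such a correlation is to line up the timestamps of looped frames from the packet captures against timestamps of candidate events (MAC moves, BGP updates, timer expiries). The following is a minimal Python sketch under stated assumptions: the function name `loops_near_events`, the sample timestamps, and the 0.5 second window are all hypothetical, and extracting the timestamps from captures and logs is left out.

```python
from bisect import bisect_left, bisect_right
from typing import List, Sequence


def loops_near_events(
    loop_times: Sequence[float],
    event_times: Sequence[float],
    window: float = 1.0,
) -> List[float]:
    """Return the loop timestamps that occur within `window` seconds after any event."""
    events = sorted(event_times)
    hits = []
    for t in loop_times:
        # Is there at least one event in the interval [t - window, t]?
        lo = bisect_left(events, t - window)
        hi = bisect_right(events, t)
        if hi > lo:
            hits.append(t)
    return hits


# Made-up timestamps in seconds since the start of the capture.
loops = [10.2, 55.0, 120.4]
events = [10.0, 120.0, 300.0]
print(loops_near_events(loops, events, window=0.5))
```

If most looped frames land shortly after one kind of event, that would point toward the event as a trigger; if they do not, the cause is more likely unrelated to that event type.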
participants (2): Crist Clark, Tarko Tikan