L2VPN/L2transport, Cumulus Linux & hardware suggestion
Dear folks,
has anyone already tried to run VXLAN/EVPN + Bridge Layer 2 Protocol Tunneling on Cumulus Linux as a replacement for classic MPLS L2VPN/VPWS (xconnect, l2circuit, VLL)?
I need to provide transparent Ethernet P2P virtual leased lines to my customers, and these have to support stuff like LLDP, STP, LACP, etc. The transport L2 network is not THAT big: max hops between VTEPs is 4.
Does anyone have suggestions for the hardware requirements below?
#) 1-3U L2/L3 box
#) 48x SFP28 / 1/10/25G
#) 6x QSFP28 / 100G
#) VXLAN/EVPN with L2 tunneling support or
#) MPLS VPWS/l2circuit
#) Dual PSU
thanks & best regards
Jürgen
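For reference, the L2CP frames named above (LLDP, STP, LACP) are recognisable by their reserved destination MAC addresses, and a standard bridge does not forward frames sent to that reserved block, which is why a transparent service needs explicit tunneling for them. A minimal Python sketch of that classification, assuming the standard IEEE 802.1 addresses; the table and function names are made up for illustration and are not any vendor's API:

```python
from typing import Optional

# Reserved IEEE 802.1 destination MACs for the protocols named in the request.
# (Illustrative table; LLDP can also use other scopes, only "nearest bridge" is listed.)
L2CP_DEST_MACS = {
    "01:80:c2:00:00:00": "STP/RSTP/MSTP BPDU",
    "01:80:c2:00:00:02": "Slow Protocols (includes LACP)",
    "01:80:c2:00:00:0e": "LLDP (nearest bridge)",
}


def normalize_mac(mac: str) -> str:
    """Lower-case the address and use ':' as the separator."""
    return mac.strip().lower().replace("-", ":")


def classify_l2cp(dst_mac: str) -> Optional[str]:
    """Return a label for reserved L2CP destinations, or None for ordinary frames."""
    mac = normalize_mac(dst_mac)
    if mac in L2CP_DEST_MACS:
        return L2CP_DEST_MACS[mac]
    # 01:80:C2:00:00:00 through 01:80:C2:00:00:0F is the reserved block
    # that a standard bridge consumes or drops instead of forwarding.
    if len(mac) == 17 and mac.startswith("01:80:c2:00:00:0"):
        return "other reserved bridge-group address"
    return None


if __name__ == "__main__":
    for dst in ("01-80-C2-00-00-0E", "01:80:c2:00:00:02", "00:11:22:33:44:55"):
        print(dst, "->", classify_l2cp(dst))
```

Whether a given box can tunnel each of these addresses end to end is exactly the platform question raised in this thread; the sketch only shows which frames are affected.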
Good luck with tunnelling LACP, no matter what boxes you have - LACP has (de facto) hard jitter requirements of under 1msec, or you'll be getting TCP resets coming out your ears due to mis-ordered packets.
For your requirements, although I hesitate to recommend them for enterprise/carrier use, MikroTik's EoIP protocol does a much better job of this than most "carrier-grade" implementations.
Otherwise, Juniper and Arista both come to mind: Juniper has the EX4650, which matches your h/w specs, and Arista has, oh, at least half a dozen boxes of various specs that comply, too. Not 100% sure the Juniper EX does 25G, now that I think of it.
Adam Thompson
Consultant, Infrastructure Services
MERLIN
100 - 135 Innovation Drive
Winnipeg, MB, R3T 6A8
(204) 977-6824 or 1-800-430-6404 (MB only)
athompson@merlin.mb.ca
www.merlin.mb.ca
-----Original Message----- From: NANOG <nanog-bounces+athompson=merlin.mb.ca@nanog.org> On Behalf Of Jürgen Jaritsch Sent: Tuesday, July 7, 2020 3:15 PM To: nanog@nanog.org Subject: L2VPN/L2transport, Cumulus Linux & hardware suggestion
Dear Adam, yeah, forget about LACP - the bigger problem is all the LLDP and STP stuff that gets interpreted at the UNI port. LACP is a bad example - but there are many other frames and protocols which must work. Could be that a customer wants to run MPLS+LDP on his VLL (for whatever reason ...).
For your requirements, although I hesitate to recommend them for enterprise/carrier use, MikroTik's EoIP protocol does a much better job of this than most "carrier-grade" implementations.
Not at wire speed ... and not without causing other issues (single-thread load, etc.).
Juniper has the EX4650 that matches your h/w specs,... Not 100% sure the Juniper EX does 25G, now that I think of it.
Yeah, the EX4650 does: 48x 1/10/25G + 6x 100G + MPLS.
It also supports Ethernet over MPLS (at least they say so here: https://www.juniper.net/documentation/en_US/junos/topics/topic-map/mpls-overview.html#id-mpls-feature-support-on-qfx-series-and-ex4600-switches), but elsewhere on their site they mention that MPLS-based CCC is not supported: https://www.juniper.net/documentation/en_US/junos/topics/topic-map/mpls-overview.html#jd0e2531
" ... MPLS-based circuit cross-connects (CCC) are not supported - only circuit-based pseudowires are supported. ..."
There is also the QFX5120-48Y: 48x 1/10/25G + 8x 100G + MPLS. In the past, QFX wasn't the best idea for MPLS topics ... has this changed?
and Arista has, oh, at least half a dozen boxes of various spec that comply, too.
Yeah, I already know them (I do have some older 7050S). They call it "VXLAN P2P Pseudowire", but there is absolutely nothing in their CLI documentation :(. It looks like the feature is only supported on the 7280 platform. Possible option: 7280SR2-48YC6.
Do you have any experience with what they call "VXLAN P2P Pseudowire"? I can't even find a config example on the net :(
thanks & best regards
Jürgen
I do run the 7280SR2-48YC6, but I don't do VPLS or pseudowires on them right now, so I can't help directly with that. Based on my experience with Arista so far, it'll be perfectly well documented, just for a different platform, and in a blog post instead of in the user manual. :-(
(Note to anyone from Arista lurking on the list: your User Manual sucks rocks because it's wildly incomplete. Please put some of the effort that goes into those EOS Central blog posts into the manual instead.)
As to the Juniper, I'm a client on a Juniper-based VPLS system, and the only thing it consistently intercepts is LLDP... which I'm actually OK with, mostly. Other BPDUs and other Ethernet protocols get passed through (that we've tested, so far). We have heard of some feature limitations on the EX4650; no CCC is unfortunate. I don't have any experience with the QFX series as an operator or customer, so I can't comment.
Adam Thompson
Hey Adam, On Wed, 8 Jul 2020 at 00:11, Adam Thompson <athompson@merlin.mb.ca> wrote:
Good luck with tunnelling LACP, no matter what boxes you have - LACP has (de facto) hard jitter requirements of under 1msec, or you'll be getting TCP resets coming out your ears due to mis-ordered packets.
Can you elaborate on this? Where is LACP jitter defined, and for what purpose? We push packets around the globe with sub-200us jitter on any given day, so 1000us isn't a particularly hard goal for us.
The only reason I could imagine someone caring about jitter here is if the protocol measured delay (LACP doesn't), relied on that delay remaining static, and then balanced per-packet, per-byte or otherwise between multiple links. However, we of course always put all packets from a given TCP session onto the same LACP member interface, so from the TCP session's POV each LACP bundle is exactly one interface.
Per-packet balancing on LACP is possible via a special configuration, but anyone who does it doesn't care about reordering, regardless of jitter, because even with very stable jitter the paths may be of unequal length and cause reordering.
LACP hellos are sent every 1s in fast mode, with a 3s timeout, which also isn't particularly tight. We do have customers running LACP over MPLS pseudowires over great distances.
-- ++ytti
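The per-flow behaviour described above can be sketched in a few lines of Python. This is only an illustration, assuming a 5-tuple key and CRC32 as a stand-in hash; real switches use their own hardware hash functions and configurable field sets, and the function name here is hypothetical:

```python
import zlib


def pick_lag_member(src_ip: str, dst_ip: str, proto: int,
                    src_port: int, dst_port: int, n_links: int) -> int:
    """Map a flow's 5-tuple to one LAG member index (0 .. n_links-1).

    CRC32 is an assumption used for illustration only; the point is that
    the choice depends on the flow key, never on the individual packet.
    """
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    return zlib.crc32(key) % n_links


if __name__ == "__main__":
    # Every packet of the same TCP session yields the same member link,
    # so that session never sees more than one physical path.
    flow = ("192.0.2.10", "198.51.100.20", 6, 49152, 443)
    members = {pick_lag_member(*flow, n_links=4) for _ in range(1000)}
    print("links used by one flow:", members)
```

Because the member is a pure function of the flow key, reordering inside a single session cannot come from the bundle itself as long as the hash inputs stay constant for that session.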
If jitter were defined anywhere vis-à-vis LACP, it would be _de jure_, not _de facto_ as I said.
Yes, if you have *guaranteed* that TCP sessions hash uniquely to a single link in your network, you might be able to successfully tunnel LACP (or EtherChannel, or any other L1 link-bonding technique). The last time I attempted to do this on my network, I discovered that guarantee wasn't nearly as ironclad as I expected. I don't remember the gory details, at this remove, sorry. Maybe it wasn't TCP? Maybe it wasn't the default hashing algorithm? Dunno.
-Adam
On Wed, 8 Jul 2020 at 14:56, Adam Thompson <athompson@merlin.mb.ca> wrote:
If jitter were defined anywhere vis-à-vis LACP, it would be _de jure_, not _de facto_ as I said.
I suspect the de facto domain you are thinking of has a modest population. Jitter would only matter where a protocol measures delay and artificially adds static delay to compensate. That is not the case for LACP (some balancing solutions do latency compensation), so jitter is immaterial.
Yes, if you have *guaranteed* that TCP sessions hash uniquely to a single link in your network, you might be able to successfully tunnel LACP (or EtherChannel, or any other L1 link-bonding technique). The last time I attempted to do this on my network, I discovered that guarantee wasn't nearly as ironclad as I expected. I don't remember the gory details, at this remove, sorry. Maybe it wasn't TCP? Maybe it wasn't the default hashing algorithm? Dunno.
A directly connected software device has an order of magnitude more jitter than an operator pseudowire across the globe, so whether or not a tunnel is added says nothing about the amount of jitter, which still is not a metric that LACP cares about.
The Internet works because hashing works. It's not perfect, but it's good enough that in the practical Internet most links you traverse rely on hashing to work, be it ECMP or LAG.
-- ++ytti
On 7/Jul/20 23:09, Adam Thompson wrote:
Good luck with tunnelling LACP, no matter what boxes you have - LACP has (de facto) hard jitter requirements of under 1msec, or you'll be getting TCP resets coming out your ears due to mis-ordered packets.
Hmmh - this is odd. We once provided a customer with an EoMPLS pseudowire between Johannesburg and London, which tunneled a number of L2CPs, including LACP. It worked well, and I'd say jitter varied but never exceeded 20ms.
Mark.
On Wed, Jul 8, 2020, at 00:09, Adam Thompson wrote:
Good luck with tunnelling LACP, no matter what boxes you have - LACP has (de facto) hard jitter requirements of under 1msec, or you'll be getting TCP resets coming out your ears due to mis-ordered packets.
Errr.... sorry, but at the latest news, TCP was supposed to handle out of order packets and reorder them before sending them to upper layer. Not to mention hashing that almost systematically makes that all packets of the same TCP stream will be sent on the same link in an LAG (also on most if not all ECMP implementations).
MikroTik "carrier-grade"
.....
On Wed, 8 Jul 2020 at 13:46, Radu-Adrian Feurdean <nanog@radu-adrian.feurdean.net> wrote:
Errr.... sorry, but at the latest news, TCP was supposed to handle out of order packets and reorder them before sending them to upper layer. Not to mention hashing that almost systematically makes that all packets of the same TCP stream will be sent on the same link in an LAG (also on most if not all ECMP implementations).
Yes, however NewReno and the like are tuned for the practical Internet. The practical Internet has a lot more packet loss than reordering, so the TCP algorithm considers any amount of reordering a packet loss, causing an immediate resend and destroying your performance.
However, as you state, TCP will only ever see a LACP bundle as a single-port interface.
-- ++ytti
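How reordering turns into a resend can be sketched with duplicate ACK counting. The Python below is a simplified model, not a TCP implementation: it assumes whole-segment numbering, a receiver that ACKs every arrival cumulatively, and the classic fast-retransmit threshold of three duplicate ACKs. Under that assumption a shallow swap goes unnoticed, while a segment overtaken by three later ones triggers a resend. All function names are made up for illustration:

```python
from typing import List


def cumulative_acks(arrival_order: List[int]) -> List[int]:
    """Return the ACK (next expected segment number) sent after each arrival."""
    received = set()
    next_expected = 0
    acks = []
    for seg in arrival_order:
        received.add(seg)
        # Advance past every segment that is now contiguous.
        while next_expected in received:
            next_expected += 1
        acks.append(next_expected)
    return acks


def max_duplicate_acks(acks: List[int]) -> int:
    """Longest run of repeated ACK values, not counting the first of each run."""
    longest = run = 0
    prev = None
    for ack in acks:
        run = run + 1 if ack == prev else 0
        prev = ack
        longest = max(longest, run)
    return longest


def triggers_fast_retransmit(arrival_order: List[int], threshold: int = 3) -> bool:
    """True if the sender would see at least `threshold` duplicate ACKs."""
    return max_duplicate_acks(cumulative_acks(arrival_order)) >= threshold


if __name__ == "__main__":
    for order in ([0, 1, 2, 3, 4], [0, 2, 1, 3, 4], [0, 2, 3, 4, 1]):
        print(order, "->", triggers_fast_retransmit(order))
```

Note that this models a resend only; nothing in it produces a RST.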
(re-adding Adam's text that didn't get quoted, but matters)
On Wed, Jul 8, 2020, at 00:09, Adam Thompson wrote:
Good luck with tunnelling LACP, no matter what boxes you have - LACP has (de facto) hard jitter requirements of under 1msec, or you'll be getting TCP resets coming out your ears due to mis-ordered packets.
On Wed, 8 Jul 2020 at 13:46, Radu-Adrian Feurdean <nanog@radu-adrian.feurdean.net> wrote:
Errr.... sorry, but at the latest news, TCP was supposed to handle out of order packets and reorder them before sending them to upper layer.
On Wed, 08 Jul 2020 13:49:56 +0300, Saku Ytti said:
Yes, however new reno and the like are tuned for practical Internet. Practical Internet has lot more packet loss than reordering, so TCP algorithm considers any amount of reordering a packet loss, causing an immediate resend, destroying your performance.
There's a difference between a TCP *resend*, and a *RESET*. Triggering a resend on a re-order is reasonably sane, sending an RST isn't....
On 09.07.2020 02.14, Valdis Klētnieks wrote:
There's a difference between a TCP *resend*, and a *RESET*. Triggering a resend on a re-order is reasonably sane, sending an RST isn't....
You get the RESETs from people that do anycast when your broken ECMP hashing splits the packets between multiple upstream providers. This might cause parts of your TCP stream to end up at entirely different destinations. Probably not going to happen with LACP, but these things are related and often use the same knobs in the configuration.
Regards,
Baldur
On 8/Jul/20 12:42, Radu-Adrian Feurdean wrote:
Errr.... sorry, but at the latest news, TCP was supposed to handle out of order packets and reorder them before sending them to upper layer. Not to mention hashing that almost systematically makes that all packets of the same TCP stream will be sent on the same link in an LAG (also on most if not all ECMP implementations).
True, but TCP is unaware of whether the interface is a LAG or a native port. It's just another tube.
We tested per-packet load balancing on the MX Trio line cards. The traffic spread is perfect, but the OoO experience is atrocious. Either settle for per-flow load balancing, move to a faster native port, or stick with ECMP at the IP layer.
For instance, this is why we don't do LACP for backbones anymore. It is far more reliable to have individual IP links, and let ECMP do its thing.
The only place we run LAGs in our network is 802.1Q trunks between router and switch. But the moment those get to 4x 10Gbps, we go native 100Gbps (which has the added benefit of making per-service policing on the router easier).
Mark.
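The out-of-order effect of per-packet balancing can be reproduced with a toy model. The Python sketch below assumes two member links with fixed but unequal one-way delays (100us and 135us, hypothetical numbers), packets sent every 10us, and round-robin spraying; it is not a model of the MX Trio or any real line card:

```python
from typing import List


def arrival_order(n_packets: int, link_delays_us: List[int],
                  gap_us: int, per_packet: bool) -> List[int]:
    """Return packet sequence numbers in the order they reach the far end.

    per_packet=True sprays packets round-robin over all links;
    per_packet=False keeps the whole flow on link 0 (per-flow balancing).
    """
    arrivals = []
    for seq in range(n_packets):
        link = seq % len(link_delays_us) if per_packet else 0
        send_time = seq * gap_us
        arrivals.append((send_time + link_delays_us[link], seq))
    arrivals.sort()
    return [seq for _, seq in arrivals]


def count_out_of_order(order: List[int]) -> int:
    """Count packets that arrive after a packet with a higher sequence number."""
    highest = -1
    late = 0
    for seq in order:
        if seq < highest:
            late += 1
        else:
            highest = seq
    return late


if __name__ == "__main__":
    delays = [100, 135]  # hypothetical one-way delays of two unequal paths
    sprayed = arrival_order(8, delays, gap_us=10, per_packet=True)
    pinned = arrival_order(8, delays, gap_us=10, per_packet=False)
    print("per-packet:", sprayed, "late:", count_out_of_order(sprayed))
    print("per-flow:  ", pinned, "late:", count_out_of_order(pinned))
```

The delays are constant in this model, so the reordering comes purely from the path difference and not from jitter, which matches the point made earlier in the thread about unequal path lengths.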
The EX4650 does indeed do 25G.
Chris
participants (8)
- Adam Thompson
- Baldur Norddahl
- Cummings, Chris
- Jürgen Jaritsch
- Mark Tinka
- Radu-Adrian Feurdean
- Saku Ytti
- Valdis Klētnieks