MPLS VPN design - RR in forwarding path?
Hi everyone,
I'm reading Randy Zhang's BGP Design and Implementation and I found the following guidelines about designing an RR-based MPLS VPN architecture:
- Partition RRs
- Move RRs out of the forwarding path
- Use a high-end processor with maximum memory
- Use peer groups
- Tune RR routers for improved performance.
Since the book is a bit outdated (2004) I'm curious if these rules still apply to modern SP networks. What would be the reasoning behind keeping RRs out of the forwarding path? Is it only a matter of performance and stability?
Thanks, Marcin
On Wednesday, December 31, 2014, Marcin Kurek <notify@marcinkurek.com> wrote:
Since the book is a bit outdated (2004) I'm curious if these rules still apply to modern SP networks. What would be the reasoning behind keeping RRs out of the forwarding path? Is it only a matter of performance and stability?
Correct, these ideas are MOSTLY rooted in old-school router limitations. YMMV. Look for facts in the replies you get, not unsubstantiated opinions.
There is no technical reason to keep a BGP RR out of the path on a hardware-based forwarding router that has sufficient control-plane capacity to run BGP.
CB
On 31/12/2014 12:08, Marcin Kurek wrote:
Since the book is a bit outdated (2004) I'm curious if these rules still apply to modern SP networks.
arguably more so now than ever, but you can always run RRs inline in the forwarding path if you want to. Taking RRs out of the forwarding plane means that you can keep your overall routing architecture simpler and more consistent, and adding/removing different forwarding hardware means that you don't really need to do much with the RR configuration.
The larger router vendors all have virtualised RR implementations these days (XRv/CSR1k, vRR, AlcaLu, etc), which means that you can get to run your RRs on standard x86 hardware platforms using normal hypervisors. This wasn't the case in 2004.
The pricing and licensing for virtual RR images from the normal vendors hasn't settled down into workable models yet but that's only a matter of time, particularly given that open source routing stacks are going to start seriously impinging on this market segment in the next couple of years.
Nick
On 12/31/14 4:08 AM, Marcin Kurek wrote:
I'm reading Randy's Zhang BGP Design and Implementation and I found following guidelines about designing RR-based MPLS VPN architecture:
- Partition RRs
- Move RRs out of the forwarding path
I'd find it odd if the RR were the next hop for any significant traffic; in recent deployments I've done there's no FIB to speak of, excepting IGP routes installed on the RR itself.
- Use a high-end processor with maximum memory
BGP add-path kicked up the memory requirements of the RR considerably when we deployed it.
- Use peer groups
- Tune RR routers for improved performance.
On Wed, Dec 31, 2014 at 01:08:15PM +0100, Marcin Kurek wrote:
Since the book is a bit outdated (2004) I'm curious if these rules still apply to modern SP networks. What would be the reasoning behind keeping RRs out of the forwarding path? Is it only a matter of performance and stability?
When they say "move RRs out of the forwarding path", they could mean "don't force all traffic through the RRs". These are two different things. Naive configurations could end up causing all VPN traffic to go through the RRs (e.g. setting next-hop-self on all reflected routes) whereas more correct configurations don't do that--but there may be some traffic that naturally flows through the same routers that are the RRs, via an MPLS LSP for example. The latter is fine in many cases, the former is not. E.g. I would argue that a P-router can be an RR if desired.
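The difference between the two cases can be sketched in a few lines of Python. This is only a toy model, not router configuration: the `VpnRoute` type, the `reflect` and `tunnel_endpoint` helpers and all addresses are made up for illustration. It assumes the ingress PE sends labelled traffic towards whatever BGP next hop the reflected route carries.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VpnRoute:
    prefix: str
    next_hop: str  # BGP next hop: where the ingress PE tunnels traffic to


def reflect(route: VpnRoute, rr_loopback: str, next_hop_self: bool) -> VpnRoute:
    """Return the route as a (hypothetical) RR would pass it on to its clients."""
    if next_hop_self:
        # Naive case: the RR rewrites the next hop to itself.
        return replace(route, next_hop=rr_loopback)
    # Plain reflection leaves the next hop untouched.
    return route


def tunnel_endpoint(route: VpnRoute) -> str:
    """The ingress PE builds its LSP towards the BGP next hop of the route."""
    return route.next_hop


PE2 = "10.0.0.2"    # egress PE that originated the route (illustrative address)
RR = "10.0.0.100"   # route reflector loopback (illustrative address)

learned = VpnRoute("65000:1:192.0.2.0/24", PE2)
plain = reflect(learned, RR, next_hop_self=False)
forced = reflect(learned, RR, next_hop_self=True)

print(tunnel_endpoint(plain))   # 10.0.0.2   -> traffic goes PE to PE
print(tunnel_endpoint(forced))  # 10.0.0.100 -> all VPN traffic is pulled through the RR
```

In the first case the RR may still sit on the LSP between the two PEs if the IGP happens to route that way, which is the "fine in many cases" situation; only the second case forces it.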
On (2014-12-31 12:05 -0500), Chuck Anderson wrote: Hey,
are the RRs, via an MPLS LSP for example. That latter is fine in many cases, the former is not. E.g. I would argue that a P-router can be an RR if desired.
There is no compelling advantage. No budget is too thin for 3 gray NPE-G1s; if it is, maybe Network Engineers Without Borders can help you.
There are some compelling disadvantages: my current and previous employers have both experienced a VPN AFI BGP UPDATE crashing the whole box (in fact a whole cluster of 3 VPN reflectors, at once).
Trying to achieve 0 outages is silly and impossible. Reducing outage impact is often simple and cheap, yet it is sometimes not done when the only failure modes considered are physical (HW, fibre, electricity...) failures rather than the more common modes (pilot error and software).
-- ++ytti
Hi,
Right, one case is when a router, besides forwarding packets, is also functioning as an RR; another is when the RR sets the next hop to itself and hence forces all the traffic to pass through the router in the fast path.
Keep in mind - some architectures, such as seamless MPLS, would require an RR to be in the fast path. There are some other cases where it could be a requirement.
I'd advise looking into the vRR space - price/performance looks quite good.
Wrt open source implementations - if you are looking for a relatively basic feature set (v4/v6 unicast/VPN), reliability is not the main concern and, of course, there are hands and brains to support it, it could be a viable approach. Should you be looking for a more complex feature set (EVPN, BGP-LS, FS, enhanced route refresh, etc.) or highly optimized code wrt update rate/number of peers supported, you'd most probably end up with a commercial implementation.
Hope this helps
Regards, Jeff
On Wed, Dec 31, 2014 at 11:14 AM, Jeff Tantsura <jeff.tantsura@ericsson.com> wrote:
Keep in mind - some architectures, such as seamless MPLS would require a RR to be in the fast path.
+1 Also think physical topologies like ethernet rings. Where's the RR go in this topology? -Dan
On Friday, January 02, 2015 11:03:21 PM Daniel Rohan wrote:
Also think physical topologies like ethernet rings. Where's the RR go in this topology?
In these topologies, I've been playing with having the RR's in the core (i.e., on the other end of the PE Aggregation routers terminating the ring) and running the iBGP sessions between the Metro-E Access switches and the RR's. As the RR's are MPLS-free (but the Metro-E Access switches [particularly Cisco] still assign MPLS labels to each IGP route), we cobble together a combination of hop-by-hop IP forwarding + MPLS forwarding fu to get traffic to where it needs to go without involving the PE Aggregation routers in the routing toward the Metro-E Access switches (i.e., 0/0 and ::/0 + a few more specifics come from the RR's). Been working with this routing topology for MPLS rings since 2009. It seems to hold up... Mark.
Hello all,
Thank you for the insightful answers. I was thinking mostly about the second scenario Chuck mentioned - where some traffic naturally flows through the routers that are the RRs because of an MPLS LSP. Setting next-hop-self on all reflected routes would be a misconfiguration IMHO.
I am also aware of products like vMX or CSR1000v/XRv, and the example given by Saku makes me more interested in licensing/pricing options.
Regards, Marcin
On Thursday, January 01, 2015 12:46:23 PM Marcin Kurek wrote:
I am also aware of products like vMX or CSR1000v/XRv and the example given by Saku makes me more interested in licensing/pricing options.
Our network spans Africa, South Asia and Europe. We have 2x RR's in each PoP running Cisco's CSR1000v on x86_64 hardware under VMware ESXi. Pricing is not too bad; we use the Premium license, which enables all features (BFD, etc.) that you don't get with the Standard license. Been running this configuration since July 2014 - very happy. Mark.
On 12/31/2014 6:08 AM, Marcin Kurek wrote:
Since the book is a bit outdated (2004) I'm curious if these rules still apply to modern SP networks. What would be the reasoning behind keeping RRs out of the forwarding path? Is it only a matter of performance and stability?
Overall, it depends on your design and scale. But I will comment on a few of your items...
We have RRs in the forwarding path but have a project to move them out in 2015. We feel it gives us more options as well as more flexibility when we move to the next phase of RR design (hierarchical).
Most vendors today have performance numbers (sometimes not published publicly) for routers acting as RRs. Ask your vendor and pick one that suits you. We generally buy the middle or maximum memory option and pick a reasonable processor. And then we monitor :)
As for peer groups, you should have a design that allows you to herd most of the config snippets together. Use the features that make your life easier and allow you to simplify your routing policies.
tv
Is there a good reason to use actual router hardware for the route reflector role? Even a cheap server has more CPU and memory. If it is not in the forwarding path, this is a computing task - not a move packets at line speed task. Is anyone using Bird, Quagga etc. for this? Regards, Baldur
On 01/01/2015 21:37, Baldur Norddahl wrote:
Is anyone using Bird, Quagga etc. for this?
there are patches for both code-bases and some preliminary support for vpnv4 in quagga, but other than that neither currently supports either ldp or the vpnv4/vpnv6 address families in the main-line code. Nick
You don't need LDP on the RR as long as the clients support a "not on LSP" flag (different implementations have different names for it). There are more and more reasons to run an RR on non-router HW; there are still many reasons to run a commercial code base, mostly feature set and resilience. Regards, Jeff
Running various functions on a couple small VM clusters makes a lot of sense.
----- Mike Hammett Intelligent Computing Solutions http://www.ics-il.com
On Thursday, January 1, 2015, Mike Hammett <nanog@ics-il.net> wrote:
Running various functions on a couple small VM clusters makes a lot of sense.
I agree, it makes some sense, especially if you are control-plane bound. But nearly all my routers run between 1% and 10% CPU.
YMMV. I have a feeling that running a BGP RR on a cheap / standard / commodity VM is pretty exotic from a support perspective.
So running a BGP RR on a VM may make sense in theory, but my network control planes are not too busy and VM BGP is a unique / exotic support model.
Your network is probably different.
On Friday, January 02, 2015 04:17:37 AM Ca By wrote:
Ymmv. I have feeling that running a bgp rr on cheap / standard / commodity vm is pretty exotic from a support perspective.
Not really. We've been doing it since July last year. The worst I've had was the HP server shutting down in a London data centre due to environmental overheating. Beyond that, the requirements are similar to those of a router, if you avoid the VM clustering goodness they all preach.
So running a bgp rr on a vm may make sense in theory, but my network control planes are not too busy and vm bgp is a unique/ exotic support model.
Amongst very many other things, running an RR on my core router means I need to touch my core router code if I want that exotic routing feature. I'd rather not, if my core router (in-path) is really just forwarding traffic between PoP's. But agree, our networks are probably quite different :-). Mark.
On 02/01/2015 02:17, Ca By wrote:
I agree, it makes some sense, especially if you are control plane bound. But, nearly all my routers run between 1% and 10% cpu.
1% and 10% on what will generally be older, slower cpus. Modern servers have significantly faster CPUs than any RE/RP on the market, so you will see a benefit in terms of convergence speed by using virtualisation.
Ymmv. I have feeling that running a bgp rr on cheap / standard / commodity vm is pretty exotic from a support perspective.
not really, no. Standalone hypervisors are stable, predictable and easy to manage. They're also commercially supported and most companies these days have a good deal of internal experience in dealing with them. As Mark commented separately, you would need your head examined if you plan to enable hypervisor clustering for this sort of thing. Nick
On Friday, January 02, 2015 03:57:41 AM Mike Hammett wrote:
Running various functions on a couple small VM clusters makes a lot of sense.
We treat our CSR1000v RR's as dedicated islands. No other functions run on them, nor do we cluster them. Don't want to find out what fun could arise :-)... Mark.
On 2 Jan 2015, at 01:54, Jeff Tantsura <jeff.tantsura@ericsson.com> wrote:
You don't need LDP on RR as long as clients support "not on lsp" flag (different implementation have different names for it) There are more and more reasons to run RR on a non router HW, there are many reasons to still run commercial code base, mostly feature set and resilience.
And test coverage. As Saku alluded to earlier in the thread, rr<->rr-client outages are painful. I’ve certainly seen a number of them caused by inter-op issues between implementations. Running at least one RR which matches the code-base of the client means that at least you’re likely to have fallen within the test-cases of that vendor’s implementation. r.
+100 Regards, Jeff
On Friday, January 02, 2015 03:54:32 AM Jeff Tantsura wrote:
You don't need LDP on RR as long as clients support "not on lsp" flag (different implementation have different names for it)
That's the hack needed when running a Junos-based RR in an MPLS network, to allow route reflection of l3vpn routes on an RR that is not running MPLS. For IOS and IOS XE (and IOS XR, I think), this wouldn't be needed since, unlike Juniper, Cisco doesn't treat MPLS signaling protocols as (pseudo) routing protocols.
There are more and more reasons to run RR on a non router HW, there are many reasons to still run commercial code base, mostly feature set and resilience.
+1. Mark.
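A simplified way to picture the "not on LSP" problem discussed above is the Python sketch below. It assumes that the implementation only treats a VPN route as usable, and therefore reflectable, when its BGP next hop resolves through a table that MPLS signalling populates. The workaround modelled here is a catch-all entry in that table. The function names, the table representation and the addresses are all hypothetical, and this is not Junos configuration.

```python
import ipaddress


def resolve(next_hop, labelled_table):
    """Longest-prefix match of next_hop in labelled_table; None if nothing matches."""
    addr = ipaddress.ip_address(next_hop)
    nets = [ipaddress.ip_network(p) for p in labelled_table]
    matches = [n for n in nets if addr in n]
    return max(matches, key=lambda n: n.prefixlen, default=None)


def reflectable(vpn_routes, labelled_table):
    """Keep only the VPN routes whose next hop resolves (assumed model)."""
    return [r for r in vpn_routes if resolve(r[1], labelled_table) is not None]


# (NLRI, BGP next hop) pairs learned from two PEs; values are illustrative.
vpn_routes = [
    ("10.0.0.1:1:192.0.2.0/24", "10.0.0.1"),
    ("10.0.0.2:1:198.51.100.0/24", "10.0.0.2"),
]

no_mpls = []                    # RR runs no LDP/RSVP: nothing to resolve against
with_catch_all = ["0.0.0.0/0"]  # workaround: a catch-all entry so every next hop resolves

print(len(reflectable(vpn_routes, no_mpls)))         # 0
print(len(reflectable(vpn_routes, with_catch_all)))  # 2
```

The catch-all never carries traffic on an out-of-path RR; it only exists so that the routes are considered valid and get reflected.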
On Friday, January 02, 2015 12:09:36 AM Nick Hilliard wrote:
there are patches for both code-bases and some preliminary support for vpnv4 in quagga, but other than that neither currently supports either ldp or the vpnv4/vpnv6 address families in the main-line code.
LDP support would not be necessary for an out-of-path RR, but the lack of VPNv4/VPNv6 is a hurdle. Mark.
On Thursday, January 01, 2015 11:37:25 PM Baldur Norddahl wrote:
Is there a good reason to use actual router hardware for the route reflector role?
Nope. It used to be code maturity - but major vendors are supporting service-grade code on VM's.
Even a cheap server has more CPU and memory. If it is not in the forwarding path, this is a computing task - not a move packets at line speed task.
Agree.
Is anyone using Bird, Quagga etc. for this?
Wish I could - to be honest, these don't give me enough comfort for a production network. We use Quagga on FreeBSD for Anycast-this-&-that - from that experience, I'd not use it for backbone routing. YMMV. Mark.
On 02/01/2015 18:24, Mark Tinka wrote:
Wish I could - to be honest, these don't give me enough comfort for a production network.
It's not even possible for a vpn enabled network right now. Having said that, I use bird in anger for ixp route server functionality (i.e. ebgp route reflector) and have nothing but good to say about it. Quagga can be, uh, more temperamental. Nick
On Thursday, January 01, 2015 11:25:24 PM Tony Varriale wrote:
Most vendors today have the performance numbers (sometimes they aren't published publicly) for routers acting as RRs. Ask your vendor and pick one that suits you. We generally buy the middle or most memory and pick a reasonable processor. And, then we monitor :)
With the major vendors now offering VM-based RR's, I'd discourage using routers as RR's just for pure long-term scale.
As for peer groups, you should have a design that allows you to herd most of the config snips together. Use the features that make your life easier and allow you to simplify your routing policies.
Suffice it to say that the Peer Group functionality in IOS and IOS XE has largely been replaced by Update Groups. We use peer and session templates, but really, as with Peer Groups in 2015, it's just to keep things neat and tidy. Junos, of course, has its way forever which still works nicely. Mark.
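The idea behind update groups can be sketched roughly as follows: peers whose outbound policy is identical are grouped automatically, so an update is built once per group rather than once per peer. This is a conceptual Python model only, not how IOS implements it; the peer names and policy tuples are invented for the example.

```python
from collections import defaultdict


def build_update_groups(peers):
    """Group peer names by their (hashable) outbound policy."""
    groups = defaultdict(list)
    for name, outbound_policy in peers.items():
        groups[outbound_policy].append(name)
    return dict(groups)


# Hypothetical peers: the policy is a tuple of outbound attributes.
peers = {
    "pe1": ("vpnv4", "rr-client", "send-extended-community"),
    "pe2": ("vpnv4", "rr-client", "send-extended-community"),
    "pe3": ("vpnv4", "rr-client", "send-extended-community", "route-map FILTER-OUT"),
}

groups = build_update_groups(peers)
for policy, members in groups.items():
    # One formatted update per group, replicated to every member.
    print(len(members), "peer(s) share:", policy)
```

Because the grouping is derived from the effective policy, templates or peer groups only serve to keep the configuration tidy, which matches the point made above.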
Given that you assign a unique RD per PE, an RR out of the forwarding path provides you with a neat trick for fast convergence (and debugging purposes) when a CE has redundant paths to different PEs. Routes to those CEs will be seen as different routes on the RR.
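Why unique RDs make both paths visible can be shown with a small Python sketch. It assumes a reflector that keeps one best path per NLRI, and relies on the fact that a VPNv4 NLRI is the RD plus the prefix. Best-path selection is reduced to "lowest next hop" purely for illustration, and the RDs and addresses are made up.

```python
from collections import defaultdict


def rr_best_paths(advertisements):
    """Return one best next hop per VPNv4 NLRI, where NLRI = (RD, prefix)."""
    by_nlri = defaultdict(list)
    for rd, prefix, next_hop in advertisements:
        by_nlri[(rd, prefix)].append(next_hop)
    # Stand-in for real best-path selection: pick the lowest next hop string.
    return {nlri: min(hops) for nlri, hops in by_nlri.items()}


# A dual-homed CE prefix advertised by two PEs.
shared_rd = [
    ("65000:1", "192.0.2.0/24", "10.0.0.1"),
    ("65000:1", "192.0.2.0/24", "10.0.0.2"),
]
unique_rd = [
    ("10.0.0.1:1", "192.0.2.0/24", "10.0.0.1"),
    ("10.0.0.2:1", "192.0.2.0/24", "10.0.0.2"),
]

print(len(rr_best_paths(shared_rd)))  # 1 -> clients only learn one exit
print(len(rr_best_paths(unique_rd)))  # 2 -> clients learn both exits up front
```

With both routes already present on the remote PEs, a failure of one egress PE does not require waiting for the RR to pick and advertise a new best path.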
On Friday, January 02, 2015 12:16:26 PM Andriy Bilous wrote:
Given that you assign unique RD per PE, RR out of the forwarding path provides you with a neat trick for fast convergence (and debugging purposes) when CE has redundant paths to different PEs. Routes to those CEs will be seen as different routes on RR.
Not having to dump routes into FIB *really* speeds up convergence. It's not even funny :-)... Mark.
participants (15)
- Andriy Bilous
- Baldur Norddahl
- Ca By
- Chuck Anderson
- Daniel Rohan
- Jeff Tantsura
- joel jaeggli
- Marcin Kurek
- Mark Tinka
- Mike Hammett
- Nick Hilliard
- Randy Bush
- Rob Shakir
- Saku Ytti
- Tony Varriale