Route Reflector architecture and how to get small customer blocks in to BGP?
Hey all,

I know this will start an argument about relevance, but think of it as one network operator trying to help another operator do good and have a network where the potential for stupid routing leaks is minimized...

I've got a friend (really, it's not me!) who works at a regional ISP. The previous network designer didn't seem to have much of a clue, and has tons of customer nets in OSPF. So he's trying to migrate them properly to BGP. I know opinions will vary widely, but that's exactly what we're looking for. What pros/cons haven't we thought of?

First, they've got a BGP full mesh of all their routers. They're considering moving towards route reflectors. There's 2 core routers per-POP. And anywhere between 5 and 15 edge/aggregation routers in a POP. The current thought is to move to a route-reflector full mesh between all the dual-core routers in each POP. The other alternative is to deploy just 2 route reflectors for the entire network. Can anyone point me towards real-world info on the pros and cons of each approach? There seems to be little public info on why people do what they do, it's more info on how to do it.

Next I'm looking for differing advice on getting small customer-assigned blocks in to BGP. Like /29s that are given to fractional T1 customers.

Assume that there are /29s assigned to customers. It would be a beautiful world if these /29s could be easily rolled up in to /24s, then have a static route /24 pointed towards the bitbucket, and advertise the /24 in to BGP. But alas the network is not that neat. /29s are scattered all over various customer aggregation routers. Sometimes it's just one /29 out of a /24 block that's on another router. Also, there's /22s to /19s per-pop in nice little aggregatable subnets, so at least that's good.

- Should he use a permanent static route to the /29, then use a network statement to bring it in to BGP?
- Should he use a permanent static route, and redistribute *very* carefully, so that he can do tagging with communities?
- Should he use a static route which would be withdrawn if the link went down? This would mean traffic to a down customer would be dropped quicker, but flaps cause more BGP churn. This would mean he'd have hundreds to thousands of /29 routes floating around internally. But the /29s would be portable between edge aggregation routers.
- Should he have a /24 routed to the bitbucket, and aggregate /29s where he can, and have some exception /29s floating around randomly? The problem here is what if the customer needs to move to another router, because he wants a bigger pipe and his circuit can't be connected to the existing router?
- Should he take the hard line approach and force customers to renumber? If they upgrade their service, and have to move routers, they get assigned a new /29 subnet. In this world of NAT, it should be easy! ;) That way he can have nice contiguous blocks to announce.
- Would confederations help? Seems like overkill, but he could aggregate at the POP level instead of the router level.

How do the rest of you assign out customer blocks? On a per-router basis, on a per-pop basis? How do you keep the number of routes down to a manageable level? How do you make it easy for installation / provisioning engineers to bring up a customer, but never get near the BGP config (assume they'll login to put a /30 on the link interface)?

Any pointers to sites on the net that show how real world ISPs setup their route/policy maps? I'm not talking BGP intro stuff like "make sure you don't announce a default to your upstreams" but examples of what bgp communities are used for, what real world filtering is done beyond basics and bogon filtering, etc. I can't even find a good nanog presentation on this. They're all about the basics.

Let the floodgates begin! If you want to tell me this is off-topic, please do us all a favor and email me directly. Or don't read it and simply trash it. Nobody wants to hear why you're the on-topic police and what you say goes.

-pete
On Sat, Jan 27, 2007 at 01:39:54PM -0500, Pete Crocker wrote: [snip]
First, they've got a BGP full mesh of all their routers. They're considering moving towards route reflectors. There's 2 core routers per-POP. And anywhere between 5 and 15 edge/aggregation routers in a POP. The current thought is to move to a route-reflector full mesh between all the dual-core routers in each POP. The other alternative is to deploy just 2 route reflectors for the entire network. Can anyone point me towards real-world info on the pros and cons of each approach? There seems to be little public info on why people do what they do, it's more info on how to do it.
Data missing above: how many sites are in this design overall? What is the fragility of the inter-site links? What are the growth plans?

If "few", "robust" and "none-to-low" are the answers, then yes, only a pair or quartet of network-wide RRs makes sense. I wouldn't want to have to maintain it, nor really recommend it.

For any kind of growth, failure condition coverage, or many POP sites, you'll want all the individual sites' core routers in the core iBGP mesh and a pair of RR trees per site, each rooted in a core router. I'll leave the whole confederation issue aside for now.
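To put rough numbers on the trade-off between the three layouts discussed, here is a minimal Python sketch that counts iBGP sessions. The POP count is not given in the thread, so the figure of 10 POPs with 10 edge routers each is purely hypothetical, as are the function names. It assumes every edge router is a client of both core routers in its own POP (per-POP design), or that two existing routers double as the network-wide reflectors (two-RR design).

```python
def full_mesh_sessions(routers: int) -> int:
    """Sessions needed when every router peers with every other router."""
    return routers * (routers - 1) // 2


def per_pop_rr_sessions(pops: int, edges_per_pop: int, cores_per_pop: int = 2) -> int:
    """Core routers of all POPs are fully meshed; each edge router is a
    client of every core router in its own POP (assumed layout)."""
    core_mesh = full_mesh_sessions(pops * cores_per_pop)
    client_sessions = pops * edges_per_pop * cores_per_pop
    return core_mesh + client_sessions


def network_wide_rr_sessions(pops: int, edges_per_pop: int,
                             cores_per_pop: int = 2, reflectors: int = 2) -> int:
    """A few reflectors serve the whole network; every other router is a
    client of each reflector (assumes the reflectors are existing routers)."""
    routers = pops * (edges_per_pop + cores_per_pop)
    clients = routers - reflectors
    return full_mesh_sessions(reflectors) + clients * reflectors


# Hypothetical network: 10 POPs, 10 edge routers and 2 core routers per POP.
POPS, EDGES = 10, 10
print("full mesh:      ", full_mesh_sessions(POPS * (EDGES + 2)))
print("per-POP RRs:    ", per_pop_rr_sessions(POPS, EDGES))
print("network-wide RR:", network_wide_rr_sessions(POPS, EDGES))
```

For that hypothetical size the counts come out to 7140, 390 and 237 sessions. The network-wide pair is cheapest in sessions, but the count says nothing about the fragility of inter-site links or growth, which is why the questions above matter more than the arithmetic.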
Assume that there are /29s assigned to customers. It would be a beautiful world if these /29s could be easily rolled up in to /24s, [snip] router. Also, there's /22s to /19s per-pop in nice little aggregatable subnets, so at least that's good. [snip] - Should he use a static route which would be withdrawn if the link went down? This would mean traffic to a down customer would be dropped quicker, but flaps cause more BGP churn.
Select the latter. Modifying network statements for move/add/changes invites trouble. Carefully constructed policies to redistribute your connected or static routes into iBGP and tagged appropriately are a win. At the very least, you can limit to subnets of "my network's prefixes"; if possible, leverage the nice aggregation and limit to "my network's local prefixes" and you scope potential future havoc.
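The scoping idea can be sketched in a few lines of Python: a static route is only redistributed if it is a subnet of one of the site's own aggregates, and it is tagged with a community on the way in. This is an illustration of the logic rather than router configuration; the prefixes are documentation ranges and the community value and function name are hypothetical.

```python
import ipaddress
from typing import Optional

# Hypothetical "my network's local prefixes" for one site.
LOCAL_SITE_PREFIXES = [ipaddress.ip_network("198.51.100.0/24")]
# Hypothetical community marking customer routes learned via redistribution.
CUSTOMER_COMMUNITY = "64500:100"


def redistribute(static_route: str) -> Optional[dict]:
    """Return the tagged iBGP route, or None if the static route falls
    outside the local aggregates and must not be redistributed."""
    route = ipaddress.ip_network(static_route)
    for aggregate in LOCAL_SITE_PREFIXES:
        # subnet_of() raises TypeError on mixed IPv4/IPv6, so compare versions first.
        if route.version == aggregate.version and route.subnet_of(aggregate):
            return {"prefix": str(route), "communities": {CUSTOMER_COMMUNITY}}
    return None  # default deny: a fat-fingered static never reaches BGP


print(redistribute("198.51.100.8/29"))   # inside the local aggregate, tagged
print(redistribute("203.0.113.16/29"))   # outside, dropped
print(redistribute("0.0.0.0/0"))         # a stray default static, dropped
```

With a filter of this shape, provisioning staff only add the static route; whether it enters BGP is decided by a policy they never touch.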
- Would confederations help? Seems like overkill, but he could aggregate at the POP level instead of the router level.
There is no need for small slices of the nice aggregatable local site prefixes to leave the site in either a confederation or an RR model. Think of what device owns the tie-down route for the site-level, and how it is hearing that route, and how it is redistributing to the rest of your network.

Cheers,
Joe
--
RSUC / GweepNet / Spunk / FnB / Usenix / SAGE
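As a rough illustration of that point, the following Python sketch decides which routes would leave a site: the site aggregate (the tie-down route) always does, and a more-specific only does if the aggregate does not already cover it. Treating the uncovered /29s as exceptions that must be carried network-wide is an assumption drawn from the original post's "one /29 on another router" case; the prefixes and the function name are hypothetical.

```python
import ipaddress
from typing import List


def routes_leaving_site(site_aggregate: str, site_routes: List[str]) -> List[str]:
    """Announce the site aggregate plus only those routes it does not cover."""
    aggregate = ipaddress.ip_network(site_aggregate)
    leaving = [aggregate]
    for text in site_routes:
        route = ipaddress.ip_network(text)
        covered = route.version == aggregate.version and route.subnet_of(aggregate)
        if not covered:
            # Exception /29 numbered out of some other site's block.
            leaving.append(route)
    return [str(route) for route in leaving]


print(routes_leaving_site(
    "198.51.100.0/24",
    ["198.51.100.8/29", "198.51.100.64/29", "203.0.113.16/29"],
))
```

Customer /29s inside the aggregate can flap or move between routers within the site without the rest of the network ever seeing the churn.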
On Jan 28, 2007, at 9:06 AM, Joe Provo wrote:
Select the latter. Modifying network statements for move/add/changes invites trouble. Carefully constructed policies to redistribute your connected or static routes into iBGP and tagged appropriately are a win. At the very least, you can limit to subnets of "my network's prefixes"; if possible, leverage the nice aggregation and limit to "my network's local prefixes" and you scope potential future havoc.
I'm not a big fan of redistribution as I've been bitten by it a few times. One of the biggest issues is that if a policy is being updated while some periodic redistribution process runs, the policy as it exists at that instant is applied, and things not in the policy at that snapshot are not applied (intuitive enough - now). For example, if you're redistributing routes into BGP and coloring with a community based on a route match policy, and some of those routes aren't in the policy snapshot, then they won't be "colored" with communities or the like and may be leaked, or otherwise not advertised. This is particularly ugly when you've employed "implicit permit" external advertisement policies where routes that aren't tagged with some value are passed by default.

Two lessons learned for me:

o If you're going to use redistribution - or not - ensure that all external advertisement policies require explicit match of advertise communities and default is to deny
o Don't unnecessarily touch policies or blindly overwrite them periodically, utilize incrementally updated prefix lists as much as possible

Given the two conditions above I'm not as wary of redistribution, and it may ease configuration management, as Joe suggests.

-danny
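The difference between the two export styles is easy to see in a small Python sketch. It models a route only by its set of communities; the community values and function names are hypothetical, and this illustrates the decision logic rather than any vendor's policy language.

```python
from typing import Iterable

# Hypothetical communities that explicitly mark a route as OK to advertise externally.
ADVERTISE_COMMUNITIES = {"64500:100", "64500:200"}
# Hypothetical community used by an implicit-permit policy to hold routes back.
INTERNAL_ONLY = "64500:999"


def explicit_permit_export(route_communities: Iterable[str]) -> bool:
    """Default deny: export only if an advertise community is present."""
    return bool(ADVERTISE_COMMUNITIES & set(route_communities))


def implicit_permit_export(route_communities: Iterable[str]) -> bool:
    """Default permit: export unless the route is marked internal-only."""
    return INTERNAL_ONLY not in set(route_communities)


# A route that missed its coloring because redistribution ran against a
# half-updated policy arrives with no communities at all.
uncolored: set = set()
print("implicit permit exports it:", implicit_permit_export(uncolored))
print("explicit permit exports it:", explicit_permit_export(uncolored))
```

Under default deny the failure mode of a missed tag is a customer route that is not announced, which is noticed and fixed; under implicit permit the same mistake is a leak.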
On 1/28/07, Danny McPherson <danny@tcb.net> wrote:
o If you're going to use redistribution - or not - ensure that all external advertisement policies require explicit match of advertise communities and default is to deny
I'll second that recommendation. I learned early in life that this can be a mess otherwise. We employed that technique at BBN/Genu and it kept us from leaking quite nicely. If a provisioning person forgot a customer inbound route-map or something, we didn't accidentally hose ourselves.

--
-Steve
On Sun, Jan 28, 2007 at 10:59:50AM -0700, Danny McPherson wrote: [snip]
o If you're going to use redistribution - or not - ensure that all external advertisement policies require explicit match of advertise communities and default is to deny
This should be just good security policy. I think of it as a network-level instance of "that which is not expressly permitted is denied" which everyone applies for services on their hosts, right :-)

Cheers,
Joe
--
RSUC / GweepNet / Spunk / FnB / Usenix / SAGE
participants (4)
- Danny McPherson
- Joe Provo
- Pete Crocker
- Steve Meuse