We have a WAN that is multihomed to 2 NSPs (as customers, not peers). Small routers with limited memory; full iBGP mesh; BGP routes not redistributed into IGP.

Our routing policy had been:

[] Point default to NSP #1
[] Take internal routes from NSP #2
[] Have a second, lower pref default to NSP #2

Recently, we have been bitten by failures on our 10mbit link to NSP #1 (default) where the ethernet-lookalike link between us became uni-directional (common netedges failure mode). This failure was not detected by the router hardware, so the router continued to send the bulk of our traffic out that interface and into a black hole.

To get around this failure mode, we got NSP #1 to also send us their internal routes, and we changed our candidate default-network to point to one of their well-known announcements instead of the directly attached interface.

So our new policy has been, for quite a while:

[] Take internal routes from NSP #1
[] Select an announcement as candidate default net from NSP #1
[] Take internal routes from NSP #2
[] Select an announcement as alternate candidate default net to NSP #2

The problem we have been experiencing now is that both NSP #1 and NSP #2 have been undergoing some major internal restructuring, causing daily change/loss of candidate net announcements, or change in aggregation boundaries, etc., making the selection of a proper candidate route a daily exercise.

Can anyone provide an alternate or better strategy on how to deal with this?

tia,
--curtis
Hi Curtis,
The problem we have been experiencing now is that both NSP #1 and NSP #2 have been undergoing some major internal restructuring, causing daily change/loss of candidate net announcements, or change in aggregation boundaries, etc., making the selection of a proper candidate route a daily exercise.
Can anyone provide an alternate or better strategy on how to deal with this?
It would seem that NSP A and B would have some relatively static networks that they could tell you about, perhaps their NOC LAN or web farm addressing. I realize this is a rather simple-ish fix, but you may wish to talk with some knowledgeable folk at the NSP to determine what these are. I would question whether there really are renumbering/announcement changes, or whether they are having "BGP configuration problems".

Another solution, with some, but fewer, negative ramifications, would be to select 2 or 3 networks within each NSP. Not a great fix, but a simple one to look at.

A final solution would be to buy a 4700 w/ 64M of RAM. You should be able to handle full tables from 2 peers along w/ all your IGP on this box. The port cost is not exceedingly high, and if you can afford the capital outlay, it sounds like it would be a rather good mediumish solution.

Obviously the primary recommendation would be to "fix" the netedge problem, but I assume you're pursuing that. The fact that your NSP is rather unstable is also a negative, but you can only do so much (before you leave them, hint hint).

$0.02
-alan
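To make the "select 2 or 3 networks within each NSP" idea concrete, here is a minimal sketch in Python (a model of the selection logic, not router configuration). The provider labels and prefixes are hypothetical placeholders taken from the documentation address ranges, not real NSP announcements, and `select_default` is an illustrative name only.

```python
import ipaddress

# Hypothetical candidate default networks, in order of preference.
# Several candidates per provider, so losing one announcement does not
# immediately push the default over to the other provider.
CANDIDATES = [
    ("NSP1", ["192.0.2.0/24", "198.51.100.0/24"]),
    ("NSP2", ["203.0.113.0/24"]),
]


def select_default(received, candidates=CANDIDATES):
    """Return (provider, prefix) for the first candidate still announced.

    `received` is an iterable of prefixes currently learned via BGP.
    Matching is exact, so a change in aggregation boundaries makes a
    candidate disappear, which is the failure described in this thread.
    Returns None when no candidate is present.
    """
    table = {ipaddress.ip_network(p) for p in received}
    for provider, prefixes in candidates:
        for prefix in prefixes:
            if ipaddress.ip_network(prefix) in table:
                return provider, prefix
    return None


if __name__ == "__main__":
    # The first NSP1 candidate has vanished; the second still carries the default.
    print(select_default(["198.51.100.0/24", "203.0.113.0/24"]))
```

With more than one candidate per provider, a single renumbering or re-aggregation event no longer forces a manual change, although the list still has to be maintained by hand.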
BGP done right requires a full duplex connection to be up. So this is ok.
[] Point default to NSP #1
[] Take internal routes from NSP #2
[] Have a second, lower pref default to NSP #2
Recently, we have been bitten by failures on our 10mbit link to NSP #1 (default) where the ethernet-lookalike link between us became uni-directional (common netedges failure mode). This failure was not
Of course it won't help you if your provider can't be trusted to send you a default only when he has a proper default himself.
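The reasoning here can be sketched roughly as follows: a BGP session only stays up while messages keep arriving from the neighbor, so a default learned over that session goes away when the link turns uni-directional. The Python below is a simplified model under stated assumptions, not a BGP implementation: the 90-second hold time, the `Session` class and `pick_default` are all illustrative, and only the "we stop hearing the neighbor" direction is modeled.

```python
HOLD_TIME = 90  # seconds; assumed value for illustration


class Session:
    """Toy model of one BGP session that may carry a default route."""

    def __init__(self, name, hold_time=HOLD_TIME):
        self.name = name
        self.hold_time = hold_time
        self.last_heard = 0.0          # time of last message from the neighbor
        self.default_received = False  # has the provider announced a default?

    def on_keepalive(self, now):
        # Any message received from the neighbor restarts the hold timer.
        self.last_heard = now

    def is_up(self, now):
        # The session is considered down once the hold timer has expired.
        return now - self.last_heard < self.hold_time

    def offers_default(self, now):
        # A default is usable only if the session is up AND the provider sent one.
        return self.is_up(now) and self.default_received


def pick_default(sessions, now):
    """Return the name of the first (most preferred) session offering a default."""
    for session in sessions:
        if session.offers_default(now):
            return session.name
    return None


if __name__ == "__main__":
    nsp1, nsp2 = Session("NSP1"), Session("NSP2")
    nsp1.default_received = nsp2.default_received = True
    nsp2.on_keepalive(60)  # NSP2 keeps talking; NSP1 goes silent after t=0
    print(pick_default([nsp1, nsp2], now=30), pick_default([nsp1, nsp2], now=100))
```

The `default_received` flag is where the caveat bites: if the provider keeps announcing a default it cannot actually serve, the session looks healthy and traffic is still black-holed further upstream.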
participants (3): alan@mindvision.com, Curtis Generous, jon@branch.com