Danny McPherson wrote:
On May 22, 2009, at 5:15 PM, Steve Bertrand wrote:
neighbor xxx.xx.xx.x timers 30 60
Make sure that this is communicated to your peer as well so that their timer settings match yours.
Thankfully at this point, we manage all CPE of any clients who peer with us, and so far, the clients advertise our own space back to us. I'll go back to looking at adequate timer settings for my environment.
Of course, given that the lowest BGP holdtime is selected when the session is being established, you don't really need to change the CPE side; all you need to do is make the change on the network side and reset the session. And it's typically a good idea to send keepalives more frequently when employing lower holdtimes, so that transient loss of keepalives (or of updates, which act as implicit keepalives) doesn't cause any unnecessary instability.
Also, there are usually global values you can set for all BGP neighbors in most implementations, as well as the per-peer configuration illustrated above. The former requires less configuration if you're comfortable with setting the values globally.
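For the archives, a minimal sketch of the global form next to the per-peer form, assuming Cisco IOS syntax. The AS numbers, the neighbor address and the 10/30 values are placeholders made up for illustration, not a recommendation:

```
router bgp 64512
 ! global: keepalive 10 s, holdtime 30 s for every neighbor
 ! that has no timers of its own
 timers bgp 10 30
 ! per-peer: overrides the global values for this neighbor only
 neighbor 192.0.2.1 remote-as 64513
 neighbor 192.0.2.1 timers 30 60
```

Either way, the new values only take effect once the session is re-established, since the holdtime is negotiated at session setup.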
I remember reading that the lowest value is implemented, but thanks for the reminder. In this case, since I *can* change it at the CPE, I may as well. That way, in the event that I move on (or get hit by a bus) and the next person moves the connection to a new router, the CPE will win. Also... the global setting is a great idea. Unfortunately, connected to this router that handles these fibre connections are a couple of local peers that I don't want to change the 'defaults' for. I can't remember if timers can be set at a peer-group level, so I'll look that up and go from there. That will be my best option given what is connected to this router.
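On Cisco IOS, at least, the timers keyword is accepted on a peer-group name as well as on a neighbor address. A sketch under that assumption follows; the group name FIBRE-CPE, the AS numbers, the addresses and the 10/30 values are all hypothetical:

```
router bgp 64512
 ! timers applied once to the group (keepalive 10 s, holdtime 30 s)
 neighbor FIBRE-CPE peer-group
 neighbor FIBRE-CPE timers 10 30
 ! fibre-connected CPE peers inherit the group timers
 neighbor 192.0.2.1 remote-as 64513
 neighbor 192.0.2.1 peer-group FIBRE-CPE
 neighbor 192.0.2.5 remote-as 64514
 neighbor 192.0.2.5 peer-group FIBRE-CPE
```

Neighbors that are not members of the group, such as the local peers, keep the router's default timers.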
If you want to converge a little faster than BGP holdtimes allow here and the fiber link is directly between the routers, you might look at something akin to Cisco's "bgp fast-external-fallover", which immediately resets the session if the link layer is reset or lost.
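A minimal sketch of that knob, assuming Cisco IOS (the AS number is a placeholder). On the IOS releases I am aware of it is enabled by default, so it may only need checking rather than configuring:

```
router bgp 64512
 ! reset directly connected eBGP sessions as soon as the
 ! interface toward the peer goes down
 bgp fast-external-fallover
```

It only helps when the local interface actually goes down; a fault beyond an intermediate device that leaves the local link up still has to be detected by the hold timer.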
Well, unfortunately, the local PUC owns the fibre, and they have a switch aggregating all of their fibre in a star pattern. They then trunk the VLANs to me across two redundant pairs. I'm in the process of persuading them to allow me to put my own gear in their location so I can manage it myself (no risk of port-monitor, no risk of their ops fscking up my clients etc). This way, they connect from their client-facing converter into whatever port in my switch I tell them. With that said, and as I said before, L3 and below rarely fails. I'll look into fast-external-fallover. It may be worth it here.
While I'm at it, I've got another couple of questions:
- whatever technique you might recommend to reduce the convergence throughout the network, can the same principles be applied to iBGP as well?
Depending on your definition of convergence, yes. If you're referring to update advertisements as opposed to session or router failures, though, MRAI tweaks and/or less iBGP hierarchy might be the way to go. Then again, there are lots of side effects with these as well.
I suppose I might not completely understand what I am asking.

- pe1 has iBGP peering with p1 and p2, and pe1 has p2 as its next hop in FIB for prefix X (both cores have prefix X in routing table through a different edge device)
- p2 suddenly falls off the network

Perhaps it's late enough on Friday night after a long day for me to not be thinking correctly, but I can't figure out exactly what the delay time would be for a client connected to pe1 to re-reach prefix X if p2 goes down hard.
- if I need to down core2, what is the quickest and easiest way to ensure that all gear connected to the cores will *quickly* switch to preferring core1?
Use your IGP mechanisms akin to IS-IS overload bit or OSPF stub router (max metric) advertisement.
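For reference, a sketch of both mechanisms in Cisco IOS syntax, to be applied on the router being taken out of service before the maintenance and removed afterwards. The OSPF process ID 1 is an assumption, and only the stanza for the IGP actually in use applies:

```
! OSPF stub router: advertise maximum metric in the router LSA
! so other routers stop using this box for transit traffic
router ospf 1
 max-metric router-lsa
!
! IS-IS equivalent: set the overload bit in this router's LSP
router isis
 set-overload-bit
```

In both cases the router stays reachable itself, but its neighbors prefer any alternative path through the other core once one exists.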
I will certainly look into your suggestions. I have only a backbone area in OSPF carrying loopbacks and infrastructure, but don't quite understand the entire OSPF protocol yet.

Thanks Danny,

Steve