Re: Multi-homed clients and BGP timers

23 May 2009

      Danny McPherson wrote:
...
On May 22, 2009, at 5:15 PM, Steve Bertrand wrote:
...
...
neighbor xxx.xx.xx.x timers 30 60
Make sure that this is communicated to your peer as well so that
their timer settings match.
Thankfully at this point, we manage all CPE of any clients who peer with
us, and so far, the clients advertise our own space back to us. I'll go
back to looking at adequate timer settings for my environment.

...
Of course, given that the lowest BGP holdtime is selected
when the session is being established, you don't really need
to change the CPE side, all you need to do is make the
change on the network side and reset the session.  And it's
typically a good idea to send keepalives more frequently
when employing lower holdtimes so that transient loss of
keepalives (or of updates, which act as implicit keepalives)
doesn't cause any unnecessary instability.
Also, there are usually global values you can set for all
BGP neighbors in most implementations, as well as the per-peer
configuration illustrated above.  The former requires fewer
configuration bits if you're comfortable with setting the
values globally.
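
As a rough illustration of the negotiation described above, here is a minimal Python sketch. It assumes the RFC 4271 behaviour: each side proposes a hold time in its OPEN message, the smaller value is used, and a reasonable keepalive interval is one third of the result. The function name and the example values (60 on the network side, 180 on the CPE) are hypothetical, and a real implementation may use a configured keepalive instead of the one-third ratio.

```python
from typing import Tuple


def negotiate_bgp_timers(local_hold: int, peer_hold: int) -> Tuple[int, int]:
    """Return (hold_time, keepalive_interval) in seconds for one session.

    Hypothetical helper for illustration only; not any vendor's code.
    """
    for value in (local_hold, peer_hold):
        # RFC 4271: a proposed hold time must be 0 or at least 3 seconds.
        if value != 0 and value < 3:
            raise ValueError("hold time must be 0 or >= 3 seconds")

    # The smaller of the two proposed hold times is used by both sides.
    hold = min(local_hold, peer_hold)

    # Assumption: keepalive is one third of the hold time, the ratio
    # suggested by RFC 4271. A hold time of 0 means no keepalives are sent.
    keepalive = hold // 3
    return hold, keepalive


if __name__ == "__main__":
    # Network side lowered to 60 s, CPE left at a 180 s hold time.
    print(negotiate_bgp_timers(60, 180))
```

In this sketch, lowering only the network side to 60 seconds yields a 60 second hold time and a 20 second keepalive for the session, which is why the CPE side does not strictly need to change.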
I remember reading that the lowest value is implemented, but thanks for
the reminder. In this case, since I *can* change it at the CPE, I may as
well. That way, in the event that I move on (or get hit by a bus) and
the next person moves the connection to a new router, the CPE will win.

Also... the global setting is a great idea. Unfortunately, connected to
this router that handles these fibre connections are a couple of local
peers that I don't want to change the 'defaults' for.

I can't remember if timers can be set at a peer-group level, so I'll
look that up and go from there. That will be my best option given what
is connected to this router.
...
If you want to converge a little faster than BGP holdtimes here
and the fiber link is directly between the routers, you might
look at something akin to Cisco's "bgp fast-external-fallover",
which immediately resets the session if the link layer is
reset or lost.
Well, unfortunately, the local PUC owns the fibre, and they have a
switch aggregating all of their fibre in a star pattern. They then trunk
the VLANs to me across two redundant pairs. I'm in the process of
persuading them to allow me to put my own gear in their location so I
can manage it myself (no risk of port-monitor, no risk of their ops
fscking up my clients etc). This way, they connect from their
client-facing converter into whatever port in my switch I tell them.

With that said, and as I said before, L3 and below rarely fails. I'll
look into fast-external-fallover. It may be worth it here.
...
...
While I'm at it, I've got another couple of questions:
- whatever technique you might recommend to reduce convergence time
throughout the network, can the same principles be applied to iBGP as
well?
Depending on your definition of convergence, yes.  If you're
referring to update advertisements as opposed to session or
router failures, though, MRAI tweaks and/or less iBGP hierarchy
might be the way to go.  Then again, there are lots of side
effects with these as well..
I suppose I might not completely understand what I am asking.

- pe1 has iBGP peering with p1 and p2, and pe1 has p2 as its next hop
in FIB for prefix X (both cores have prefix X in routing table through a
different edge device)
- p2 suddenly falls off the network

Perhaps it's late enough on Friday night after a long day for me to not
be thinking correctly, but I can't figure out exactly what the delay
time would be for a client connected to pe1 to re-reach prefix X if p2
goes down hard.
...
...
- if I need to down core2, what is the quickest and easiest way to
ensure that all gear connected to the cores will *quickly* switch to
preferring core1?
Use your IGP mechanisms akin to IS-IS overload bit or OSPF
stub router (max metric) advertisement.
I will certainly look into your suggestions. I have only a backbone area
in OSPF carrying loopbacks and infrastructure, but don't quite
understand the entire OSPF protocol yet.

Thanks Danny,

Steve