RE: BGP keepalive/holdtime at GigE exchange
I think the argument is one of stability. BGP is supposed to be stable for days/weeks on end normally. Making your internal network too sensitive to external changes destabilizes your network and those who connect to you.
If a BGP session with one peer resets once every three days, and you peer with them at a few places, at most you are talking about a service degradation for about 5-10 minutes as, say, 1/3 of your packets are resent or dropped (assuming you peer in three places, etc). 180 seconds is nothing for a router with many peering sessions and a reasonable traffic load. It's not exciting, but the other peer's customers are just as screwed.
If the whole fabric goes down, a good dampening policy at your internal->BR routers will keep the instability from influencing your core.
The bigger concern is IF a peer is dropping a session that often, *what* is wrong with their router? I am very afraid of routers that *randomly* timeout and re-peer with no good reason.
Most networks insert new routes at internal/CR/other routers that are automatically distributed to their borders; this way internal route changes do not require resetting of external peers to take effect.
So, maybe I am misunderstanding your concern: why micromanage BGP timers on your routers when a reasonably sized network may have more than 1000 external peering sessions, and each router on both sides has different loading characteristics that are not stable?
Inbound prefix limits are my personal interest in a lot of these per-neighbor configs, and even then, a big customer signing on or leaving a peer causes the prefix limits to get hit or be meaningless; I only recommend them for use with peers that have fat-finger engineers working at 4am. :)
Deepak Jain
AiNET
On Fri, 12 Jan 2001, Lane Patterson wrote:
Hmm, I know there are a lot of overburdened BR's out there, but since this is set on a per-neighbor basis, there should at least be room for some selective optimization. It seems a bit crazy to think that each time there's a BR maintenance/reboot at an IXP, peers will continue to send to the bit bucket in the sky for 180+ seconds.
-----Original Message-----
From: Deepak Jain [mailto:deepak@ai.net]
Sent: Friday, January 12, 2001 11:48 AM
To: Lane Patterson
Cc: 'nanog@merit.edu'
Subject: RE: BGP keepalive/holdtime at GigE exchange
The problem I have seen with setting BGP timeouts that low is when peering with overloaded or slow/old routers. Often they will "pause" their BGP activity while they are actively peering or repeering across their internal or external network. The low times will then cause more timeouts before the fabric has stabilized.
Deepak Jain AiNET
On Fri, 12 Jan 2001, Lane Patterson wrote:
Hmm, many folks didn't seem to understand the context here.
fast-external-fallover doesn't apply if a peer BR across a GigE exchange dies...you've still got link on your Gig port, so there is no link level indication of failure.
tweaking tcp timers is not the right approach...BGP explicitly has a keepalive for this exact purpose, when peering dies but your interface stays up.
the best non-radical suggestion so far is to simply tweak your keepalive to 10 and holdtime to 30 seconds, to bring this in line with the granularity of direct-connected peer interface or IGP metrics.
Do people do this? Do people have problems doing this?
Do any folks do less than this on their eBGP peers, and at what tradeoff expense?
This is the old issue of finding the right operationally sane timeouts, not too high, not too low. The defaults clearly seem too high, yet I haven't seen many cases where folks set these down :-)
Cheers, -Lane
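The tradeoff in the message above (faster failure detection versus tolerance for busy peers) can be sketched in a few lines of Python. This is a rough model under stated assumptions, not a description of any vendor's implementation: the function names are made up for illustration, the keepalive is assumed to be one third of the hold time, and a stalled peer is assumed to send a message as soon as it resumes. The one protocol fact relied on is that RFC 4271 has each side use the smaller of the two proposed hold times.

```python
def negotiated_hold_time(local_hold: int, peer_hold: int) -> int:
    """Hold time actually used on the session: the smaller of the two proposals.

    A hold time of 0 (timers disabled) is deliberately not modeled here.
    """
    if local_hold < 3 or peer_hold < 3:
        raise ValueError("this sketch only models hold times of 3 seconds or more")
    return min(local_hold, peer_hold)


def summarize(local_hold: int, peer_hold: int) -> dict:
    """Rough detection window and stall tolerance for one session."""
    hold = negotiated_hold_time(local_hold, peer_hold)
    keepalive = hold // 3  # assumption: keepalive is one third of hold time
    return {
        "hold": hold,
        "keepalive": keepalive,
        # Worst case: the peer dies right after we heard from it, so we keep
        # forwarding to it until the hold timer expires.
        "worst_case_blackhole": hold,
        # Worst case: the peer stalls just before its next keepalive was due,
        # so the silence we see is (keepalive + stall).
        "stall_tolerance": hold - keepalive,
    }


if __name__ == "__main__":
    for local, peer in ((180, 180), (30, 180), (30, 30)):
        print(local, peer, summarize(local, peer))
```

Under these assumptions, setting 30 seconds on only one side is enough to shrink the blackhole window to 30 seconds, but it also means a peer that stalls for more than about 20 seconds can lose the session, which is the concern raised about overloaded routers.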
-----Original Message-----
From: Lane Patterson [mailto:lpatterson@equinix.com]
Sent: Thursday, January 11, 2001 10:08 PM
To: 'nanog@merit.edu'
Subject: FW: BGP keepalive/holdtime at GigE exchange
I am looking for operational BCP feedback on common practice for tweaking down BGP holdtime/keepalive across GigE exchange points, since a peer could go down on the other side of the GigE switch without a corresponding adjacency change seen on your BR. The thought is to make down peers known as fast thru a GigE exchange as they would be over a POS private peer interface.
The current defaults are pretty gross, and much worse than the ISIS hello and interface keepalive defaults of 10 seconds.
IOS12.x:
neighbor [ip-address | peer-group-name] timers keepalive holdtime
holdtime: default 180 seconds
keepalive: default 60 seconds
http://cco.cisco.com/univercd/cc/td/doc/product/software/ios121/121cgcr/ip_r/iprprt2/1rdbgp.htm#xtocid8553
JunOS 4.2:
holdtime: default 90 seconds
keepalive: default one third of holdtime
https://www.juniper.net/techpubs/software/junos42/swconfig-routing42/html/bgp-summary13.html#1015669
Cheers, -Lane
Lane Patterson <lane@equinix.com> Equinix, Inc.
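For anyone generating per-neighbor timer config from a script, a minimal Python sketch follows. It only renders the IOS command form quoted above; the function name is hypothetical, 192.0.2.1 is a documentation address rather than a real peer, and the range checks follow RFC 4271 (hold time of at least 3 seconds) rather than any particular IOS release.

```python
def ios_timers_line(neighbor: str, keepalive: int, holdtime: int) -> str:
    """Render 'neighbor <peer> timers <keepalive> <holdtime>' after basic sanity checks.

    'neighbor' may be an IP address or a peer-group name; it is not validated here.
    """
    if holdtime < 3:
        # RFC 4271 allows 0 (timers disabled) or >= 3; 0 is not handled in this sketch.
        raise ValueError("holdtime must be at least 3 seconds")
    if not 1 <= keepalive < holdtime:
        raise ValueError("keepalive must be at least 1 and smaller than holdtime")
    return f"neighbor {neighbor} timers {keepalive} {holdtime}"


if __name__ == "__main__":
    # The 10/30 values are the ones suggested earlier in the thread.
    print(ios_timers_line("192.0.2.1", 10, 30))
```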
On Fri, Jan 12, 2001 at 03:23:51PM -0500, Deepak Jain wrote:
I think the argument is one of stability. BGP is supposed to be stable for days/weeks on end normally. Making your internal network too sensitive to external changes destabilizes your network and those who connect to you.
If a BGP session with one peer resets once every three days, and you peer with them at a few places, at most you are talking about a service degradation for about 5-10 minutes as say 1/3 of your packets are resent or dropped (assuming you peer in three places, etc). 180 seconds is nothing for a router with many peering sessions and a reasonable traffic load.
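The arithmetic in the quoted paragraph can be made explicit with a small Python sketch. It is a simplification under loud assumptions: every reset is taken to be a silent failure that is only noticed when the hold timer expires, reconvergence time after detection is ignored, and traffic is assumed to be split evenly across the peering points. The function names are illustrative only.

```python
SECONDS_PER_DAY = 86_400


def blackhole_seconds_per_day(hold_time_s: float, days_between_resets: float) -> float:
    """Average seconds per day spent forwarding to a dead peer at one peering point."""
    return hold_time_s / days_between_resets


def unreachable_fraction(hold_time_s: float, days_between_resets: float) -> float:
    """Same quantity expressed as a fraction of the day."""
    return blackhole_seconds_per_day(hold_time_s, days_between_resets) / SECONDS_PER_DAY


def share_of_traffic_affected(peering_points: int) -> float:
    """Assumption: traffic to the peer is split evenly across peering points."""
    return 1 / peering_points


if __name__ == "__main__":
    for hold in (180, 90, 30):
        per_day = blackhole_seconds_per_day(hold, 3)
        frac = unreachable_fraction(hold, 3)
        print(f"hold={hold}s: {per_day:.0f}s/day blackholed ({frac:.4%}), "
              f"affecting {share_of_traffic_affected(3):.0%} of traffic to that peer")
```

With one reset every three days, a 180 second hold time averages out to 60 seconds per day at the failing peering point, against 10 seconds per day with a 30 second hold time. Whether that difference is worth the added sensitivity is exactly what the thread is arguing about.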
With regard to your earlier comments about busy routers "pausing" BGP, perhaps this is something that can be investigated at a vendor software level. I would think keepalives (of any variety) should rank fairly high on the food chain in terms of CPU precedence. If this isn't the case already, why not? I don't know how true it is anymore, but I recall a few years back having to deal with some routers which got bogged down with OSPF updates to the point that they kept resetting perfectly stable links (or the other end did) due to keepalives not being processed in a timely manner. In the interest of stability, I would certainly want keepalives to be processed ahead of routing updates. After all, it's not as though they even represent a significant percentage of the total workload on the CPU, even when you reach a reasonably high number of links. And if your links keep resetting due to route churn, you've got a self-perpetuating problem.
The bigger concern is IF a peer is dropping a session that often, *what* is wrong with their router? I am very afraid of routers that *randomly* timeout and re-peer with no good reason.
In this case, I would expect a NOC with proper monitoring of peering sessions to take notice and initiate an investigation into the problem.
-c
On Fri, Jan 12, 2001 at 01:04:45PM -0800, Clayton Fiske wrote:
With regard to your earlier comments about busy routers "pausing" BGP, perhaps this is something that can be investigated at a vendor software level. [snip] In the interest of stability, I would certainly want keepalives to be processed ahead of routing updates.
BGP is TCP-based, so there is no (easy) way of ensuring that the keepalives go to the top of the queue without possibly corrupting the routing data itself.
--
Ryan O'Connell - <ryan@complicity.co.uk> - http://www.complicity.co.uk
I'm not losing my mind, no I'm not changing my lines, I'm just learning new things with the passage of time
On Fri, Jan 12, 2001 at 10:03:28PM +0000, Ryan O'Connell wrote:
BGP is TCP-based, so there is no (easy) way of ensuring that the keepalives go to the top of the queue without possibly corrupting the routing data itself.
But perhaps TCP connection handling (for existing connections anyway, so as not to make the router more susceptible to a SYN flood) could be bumped up, at which point you could hand off routing updates into one queue and keepalives into another. I know it's much easier for me to say than it would be to code, but it certainly seems doable and it sure could be a lifesaver during routing storms.
-c
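The two-queue idea above, together with the TCP ordering constraint raised in the previous message, can be illustrated with a short Python sketch. It is not how any router implements this: the function names are hypothetical and the code only relies on the RFC 4271 message header (16-byte marker, 2-byte length, 1-byte type, with type 2 for UPDATE and type 4 for KEEPALIVE). The point is that the byte stream has to be framed strictly in order, but once message boundaries are known, the cheap liveness messages can be separated from the expensive routing work.

```python
import struct
from collections import deque

MARKER = b"\xff" * 16   # RFC 4271 header marker
HEADER_LEN = 19         # marker (16) + length (2) + type (1)
UPDATE = 2
KEEPALIVE = 4


def make_message(msg_type: int, payload: bytes = b"") -> bytes:
    """Build a BGP-style message for the demo (payload contents are not validated)."""
    return MARKER + struct.pack("!HB", HEADER_LEN + len(payload), msg_type) + payload


def split_stream(buffer: bytes):
    """Frame a TCP byte stream in order; return (messages, leftover_bytes)."""
    messages = []
    offset = 0
    while len(buffer) - offset >= HEADER_LEN:
        length, msg_type = struct.unpack_from("!HB", buffer, offset + 16)
        if length < HEADER_LEN:
            raise ValueError("bad length field")
        if len(buffer) - offset < length:
            break  # partial message: wait for more bytes from the socket
        messages.append((msg_type, buffer[offset + HEADER_LEN:offset + length]))
        offset += length
    return messages, buffer[offset:]


def demultiplex(messages):
    """Put keepalives in one queue and everything else in another."""
    liveness, routing = deque(), deque()
    for msg_type, payload in messages:
        (liveness if msg_type == KEEPALIVE else routing).append((msg_type, payload))
    return liveness, routing


if __name__ == "__main__":
    stream = make_message(UPDATE, b"\x00" * 4) + make_message(KEEPALIVE)
    msgs, rest = split_stream(stream)
    liveness, routing = demultiplex(msgs)
    print(len(liveness), "keepalive(s),", len(routing), "update(s),", len(rest), "leftover bytes")
```

Note that a keepalive queued behind a large update in the stream still cannot be seen until the update has been read off the socket, so framing has to keep up even if route processing does not. RFC 4271 restarts the hold timer on either a KEEPALIVE or an UPDATE, so a receiver that frames promptly can refresh the timer before it processes the routes themselves.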
participants (4)
- Clayton Fiske
- Deepak Jain
- Lane Patterson
- Ryan O'Connell