No, I'm not thinking of gracefully closed sessions, only the behavior of non-gracefully closed sessions. I think the lack of argument here means it's by and large not an important issue, since folks shut down sessions gracefully before doing maintenance, right? But the answer that "it needs to be at 180 seconds holdtime to maintain routing stability" doesn't seem to hold water, since no one's claiming instability on private peering links and iBGP sessions due to 10-second interface keepalives ;-) I think the answer is that it just hasn't been important enough to ever change the defaults.
From: Sean Donelan [mailto:sean@donelan.com]
On Fri, 12 January 2001, Lane Patterson wrote:
Hmm, I know there are a lot of overburdened BRs out there, but since this is set on a per-neighbor basis, there should at least be room for some selective optimization. It seems a bit crazy to think that each time there's a BR maintenance/reboot at an IXP, peers will continue to send to the bit bucket in the sky for 180+ seconds.
What kind of failure modes are you protecting against? The most common reasons for ending a BGP session will usually include a TCP CLOSE. There is no problem with a "normal" shutdown because BGP will immediately withdraw the routes. The timers are for aborts, such as someone unplugging the neighbor.
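For the abort case, a rough sketch of how long a peer could keep forwarding into the void before its hold timer fires. This assumes the neighbor dies silently, that keepalives were arriving on schedule until then, and it ignores timer jitter; the function name and the 180/60 values are illustrative only.

```python
def detection_window(hold_s: float, keepalive_s: float) -> tuple[float, float]:
    """Return (best, worst) seconds between a silent peer failure and hold-timer expiry.

    Assumption: the hold timer restarts on every keepalive received, so the
    delay depends on where in the keepalive interval the failure happens.
    """
    if keepalive_s <= 0 or keepalive_s >= hold_s:
        raise ValueError("keepalive must be positive and shorter than the hold time")
    best = hold_s - keepalive_s   # peer died just before its next keepalive was due
    worst = hold_s                # peer died right after sending a keepalive
    return best, worst


if __name__ == "__main__":
    best, worst = detection_window(hold_s=180, keepalive_s=60)
    print(f"traffic may be black-holed for {best:.0f} to {worst:.0f} seconds")
```

A graceful close never enters this window at all, which is why only the unplugged-neighbor case matters for timer tuning.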
The reason I see for lowering the BGP timer is enabling an upper layer protocol to ride through the storm. Does Vern Paxson have any data on how long typical TCP streams survive during periods of routing instability?
If a TCP application lasts 180 seconds, and it takes 150 seconds to recompute the route table, wouldn't you set the timeout to 30 seconds? (all numbers fictional)
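The arithmetic in that last question can be sketched as below. The numbers are the same fictional ones, the function names are hypothetical, and the one-third keepalive ratio is the conventional suggestion from the BGP specification rather than anything measured.

```python
def max_hold_time(app_survival_s: float, reconverge_s: float) -> float:
    """Largest hold time that still lets routing recover before the application gives up."""
    budget = app_survival_s - reconverge_s
    if budget <= 0:
        raise ValueError("reconvergence alone takes longer than the application survives")
    return budget


def keepalive_for(hold_s: float) -> float:
    """Keepalive interval at the conventional one third of the hold time."""
    return hold_s / 3


if __name__ == "__main__":
    hold = max_hold_time(app_survival_s=180, reconverge_s=150)
    print(f"hold time <= {hold:.0f}s, keepalive about {keepalive_for(hold):.0f}s")
```

If the budget comes out at zero or below, no hold time helps and the tuning effort belongs on convergence time instead.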