Re: Time to revise RFC 1771
On Tue, 26 June 2001, Clayton Fiske wrote:
I don't object to the discussion of changing the RFC (whether I agree or not), and I accept that Vendor [everyone else except Cisco] having a knob for this would have prevented some routing disruptions for some networks. But then again, static routes would have prevented that too. It doesn't mean they're a good idea. What I object to is that people are using this particular case as justification for said discussion.
This is the third time a BGP protocol error has led to repeated BGP flapping across the Internet due to implementations aborting the BGP session, restarting it, and aborting it over and over. GATED, Bay Networks and now Vendor X have each had issues internetworking with Cisco. The particular cause varied, but the error handling in all the cases resulted in severe route flaps across a substantial portion of the net at the time. It wasn't Cisco's fault all the time. But what was common was implementations following RFC 1771's guidance that the proper behavior is to abort the BGP session. If not now, when? I would prefer we try to maintain the BGP sessions. But if you think your peer's protocol implementation is flawed, cycling the BGP session is unlikely to fix the software. It just makes things worse as you announce and withdraw sections of the route table repeatedly. Shut down the session (and keep it down) and wait for human intervention.
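To make the "shut down the session and keep it down" policy concrete, here is a minimal sketch in Python. It is not taken from RFC 1771 or from any vendor's implementation; the class, state, and method names (PeerSession, HELD_DOWN, operator_clear) are hypothetical and chosen only for illustration.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    ESTABLISHED = "established"
    HELD_DOWN = "held_down"  # latched until an operator clears it


class PeerSession:
    """Hypothetical model of one BGP peering; not a real BGP speaker."""

    def __init__(self):
        self.state = State.IDLE
        self.last_error = None

    def connect(self):
        # Refuse automatic restarts while the session is held down.
        if self.state is State.HELD_DOWN:
            return False
        self.state = State.ESTABLISHED
        return True

    def on_protocol_error(self, reason):
        # Instead of abort-and-restart, latch the session down and
        # remember why, so a human can inspect it.
        self.state = State.HELD_DOWN
        self.last_error = reason

    def operator_clear(self):
        # Only explicit human intervention re-enables the session.
        self.state = State.IDLE
        self.last_error = None
```

With this shape, a malformed message takes the session down once, and the routes are withdrawn once, rather than being announced and withdrawn on every restart cycle.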
I must be missing something. I thought the first duty of a routing protocol was to avoid loops, even above maintaining reachability. Are we really sure that accepting all but the noticeably bad routes from a berserk neighbor would not cause loops? Also, if we damp BGP routes, surely we should damp BGP sessions too? There's no need to retry instantly.

Barney Wolff
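The idea of damping sessions the way routes are damped could be sketched roughly as follows. This is an illustration only, assuming a penalty that grows with each session reset and decays exponentially over time; the class name SessionDamper and all thresholds and the half-life are hypothetical values, not taken from any RFC or implementation.

```python
class SessionDamper:
    """Hypothetical per-peer damper: penalize resets, decay over time."""

    def __init__(self, penalty_per_reset=1000.0, suppress_above=2000.0,
                 reuse_below=750.0, half_life_s=900.0):
        self.penalty_per_reset = penalty_per_reset
        self.suppress_above = suppress_above
        self.reuse_below = reuse_below
        self.half_life_s = half_life_s
        self.penalty = 0.0
        self.last_update = 0.0
        self.suppressed = False

    def _decay(self, now):
        # Halve the accumulated penalty every half_life_s seconds.
        elapsed = max(0.0, now - self.last_update)
        self.penalty *= 0.5 ** (elapsed / self.half_life_s)
        self.last_update = now

    def record_reset(self, now):
        # Called each time the session to this peer is torn down.
        self._decay(now)
        self.penalty += self.penalty_per_reset
        if self.penalty > self.suppress_above:
            self.suppressed = True

    def may_retry(self, now):
        # A suppressed session may reconnect only after the penalty
        # has decayed below the reuse threshold.
        self._decay(now)
        if self.suppressed and self.penalty < self.reuse_below:
            self.suppressed = False
        return not self.suppressed
```

Under these assumed numbers, a single reset still allows an immediate retry, while three resets within a few seconds suppress reconnection for roughly two half-lives.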
Participants (2): Barney Wolff, Sean Donelan