On Tue, 26 June 2001, Clayton Fiske wrote:
I don't object to the discussion of changing the RFC (whether I agree or not), and I accept that Vendor [everyone else except Cisco] having a knob for this would have prevented some routing disruptions for some networks. But then again, static routes would have prevented that too. It doesn't mean they're a good idea. What I object to is that people are using this particular case as justification for said discussion.
This is the third time a BGP protocol error has led to repeated BGP flapping across the Internet because implementations abort the BGP session, restart it, and abort it again, over and over. GATED, Bay Networks and now Vendor X have each had issues interoperating with Cisco. The particular cause varied, but in every case the error handling resulted in severe route flaps across a substantial portion of the net at the time. It wasn't always Cisco's fault. What the cases had in common was implementations following RFC 1771's guidance that the proper behavior is to abort the BGP session. If not now, when?

I would prefer we try to maintain the BGP sessions. But if you think your peer's protocol implementation is flawed, cycling the BGP session is unlikely to fix the software. It just makes things worse as you announce and withdraw sections of the route table repeatedly. Shut down the session (and keep it down) and wait for human intervention.
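
To put rough numbers on the difference, here is a toy sketch in Python. It is not a BGP implementation, and it does not model any vendor's code. The `hold_down_on_error` knob, the `PeerSession` class and the churn counter are all hypothetical names made up for illustration. The only thing taken from RFC 1771 is that a protocol error closes the session, which withdraws the routes learned over it.

```python
from dataclasses import dataclass


@dataclass
class PeerSession:
    """Toy model of one BGP peering (illustrative only)."""
    hold_down_on_error: bool   # hypothetical knob: stay down after an error
    routes_from_peer: int      # prefixes learned over this session
    established: bool = True
    admin_down: bool = False
    churn: int = 0             # announcements + withdrawals passed on to others

    def on_protocol_error(self) -> None:
        """The peer sent something we consider malformed."""
        if not self.established:
            return
        # RFC 1771 style handling: close the session, so every route
        # learned from this peer is withdrawn.
        self.established = False
        self.churn += self.routes_from_peer
        if self.hold_down_on_error:
            # Keep the session down until an operator clears it.
            self.admin_down = True

    def retry(self) -> None:
        """Automatic restart attempt."""
        if self.admin_down or self.established:
            return
        # Session comes back and the full table is announced again.
        self.established = True
        self.churn += self.routes_from_peer

    def operator_clear(self) -> None:
        """Human intervention: allow the session to come back up."""
        self.admin_down = False


def simulate(hold_down: bool, cycles: int, routes: int = 100_000) -> int:
    """Return total churn when the same bad message recurs `cycles` times."""
    session = PeerSession(hold_down_on_error=hold_down, routes_from_peer=routes)
    for _ in range(cycles):
        session.on_protocol_error()
        session.retry()
    return session.churn


if __name__ == "__main__":
    print("abort and restart:", simulate(hold_down=False, cycles=5))
    print("shut down and hold:", simulate(hold_down=True, cycles=5))
```

With automatic restarts, every cycle costs one full withdrawal plus one full re-announcement. With the hold-down, the table is withdrawn once and nothing more happens until someone calls `operator_clear()`. The sketch ignores dampening, timers and partial tables, so treat the numbers as a way to see the shape of the problem, not a measurement.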