lucifer@lightbearer.com wrote:
Brett Frankenberger wrote:
I have no data on Bay; my apologies if this wasn't clear. Bay was *only* being referenced as a historical point of note. No attempt at FUD, and my apologies if anyone read it that way.
And I wasn't attempting to defend them, either -- I'm just curious about the problem.
Anyway, someone had to be passing this advertisement around ... if the Ciscos were dropping the session in response to it, and <X>'s were crashing, who's left to pass the bad advertisement around? Ciscos with older code that propagated the advertisement upon receipt, instead of issuing a NOTIFY and tearing the session down?
I'm not entirely clear on this; the bug ID implies that iBGP peers may be treated differently than external peers (specifically, part of it appears to involve appending one's own ASN, possibly; again, I'm not entirely clear on it, even after reading the bug report).
Actually, upon reviewing the whole thread from the Bay routers incident, a large part of what caused that issue to spread is that the Ciscos were, in fact, sending a NOTIFY and closing the session - but only *after* they had already propagated the route.

If this hasn't changed, then last Saturday's issues could once again be characterized as "A Cisco bug caused bad routes to enter the routing table, and Cisco's handling of bad received routes caused them to propagate throughout the network rapidly, while some non-zero number of other vendors closed the session upon receiving the bad route, and did not propagate it." (I can verify the last part, at least, for our core; no router that did not have an external BGP feed was affected at all, except for its sessions resetting when some of the peers vanished.)

Can anyone verify whether Cisco still does BGP this way? (Propagate, then kill the originating session.) If so, it rather clearly answers the question about how this managed to make it throughout the network...

(For the record: I'm not trying to Cisco-bash here. All vendors have problems, and when you have a huge market share, your problems tend to show up much more obviously when they appear. However, Cisco does still have a huge market share, meaning this affected a whole lot of people, if true... so, I'm curious.)

-- 
***************************************************************************
Joel Baker
System Administrator - lightbearer.com
lucifer@lightbearer.com
http://www.lightbearer.com/~lucifer
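
P.S. To make the ordering question concrete, here is a toy sketch of the two behaviours being contrasted. It is not any vendor's actual implementation: the router names, the two policy labels, and the function flood_bad_route are all made up for illustration, and it assumes every router resets the session a bad advertisement arrived on, the only difference being whether it forwards the advertisement first.

```python
from collections import deque

# Hypothetical policy labels, invented for this sketch.
VALIDATE_FIRST = "validate_first"    # reset the session, never forward the bad route
PROPAGATE_FIRST = "propagate_first"  # forward to other peers, then reset the session


def flood_bad_route(links, policy, origin):
    """Simulate one bad advertisement entering the network at `origin`.

    links  -- iterable of (router, router) pairs, one per BGP session
    policy -- dict mapping router name to one of the labels above
    Returns (routers that received the bad route, sessions that were reset).
    """
    peers = {}
    for a, b in links:
        peers.setdefault(a, set()).add(b)
        peers.setdefault(b, set()).add(a)

    received = set()
    resets = set()
    queue = deque((origin, p) for p in sorted(peers[origin]))
    while queue:
        sender, router = queue.popleft()
        session = frozenset((sender, router))
        if router == origin or session in resets:
            continue  # back at the source, or the session is already down
        first_time = router not in received
        received.add(router)
        if policy[router] == PROPAGATE_FIRST and first_time:
            # The bad route is passed on before the session is torn down.
            for p in sorted(peers[router] - {sender}):
                queue.append((router, p))
        # In both policies the receiving router resets the offending session.
        resets.add(session)
    return received, resets


# A simple chain: origin - A - B - C - D
chain = [("origin", "A"), ("A", "B"), ("B", "C"), ("C", "D")]
mixed = {"A": PROPAGATE_FIRST, "B": PROPAGATE_FIRST,
         "C": VALIDATE_FIRST, "D": VALIDATE_FIRST}
strict = {name: VALIDATE_FIRST for name in "ABCD"}

print(flood_bad_route(chain, mixed, "origin"))
print(flood_bad_route(chain, strict, "origin"))
```

In the mixed case the bad route gets as far as C, the first router that checks before forwarding, and three sessions reset along the way; D never sees it. With every router checking first, only A sees the route and only the origin-A session resets. That is the whole difference I am asking about.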