Sean,

It can be dangerous to look at a problem from the extreme case. To illustrate that, look at it from the other extreme for a second. Imagine a router which gets 99,999 bad routes from a peer and 1 good one. Should it try to send a notify with the appropriate opcode for each of those 99,999 and keep the one good route? How likely is it that it can do that and still forward the packets it's getting on its other interfaces?

Now, let's go back to your assumption that you get one bad route; sure, you can send the notify on that, maintain state that this part of the announcement should be disregarded, and keep going. Somewhere between one bad route out of 100k and one good route out of 100k there is a threshold that says the sensible thing to do is to shut down the session, having told the BGP peer why.

What's the threshold on a Cisco GSR? How about on a Cisco 7200 series? A Juniper M5? Wait, that threshold should probably vary not just based on my hardware but on the peer's hardware (a 7200 is probably not going to be able to process all those notifies...). Hmm, now I need to signal both my capacity and current load on an ongoing basis so my peers know how many notifications I can handle before I fall over. Suddenly, I don't want to play any more.

One of the other posters noted that seeing bad data could be a bellwether of a router gone nuts. I'm not sure how often that is the case, but I agree that setting the threshold low makes sense. I might even agree that the simplest thing to do is set it to 1.

regards,
Ted

Sean writes much, which I have snipped to:
The receiver should liberally accept what it can, and only reject "bad" data.
I don't think the receiver should be changed to understand the bad data, just not to reject "good" data.
Under RFC 1771, the receiver is rejecting both "good" and "bad" data. It should be revised so when there are both "bad" routes and "good" routes, the receiver should accept the "good" routes and only reject the "bad" routes.
If a TELNET implementation doesn't understand an escape code, it shouldn't terminate the entire TELNET session.
There is a flaw in both the sender's implementation and RFC 1771's method of handling errors.
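
The threshold question raised in the reply can be made concrete with a small sketch. This is purely illustrative and not taken from any router implementation or from RFC 1771: the names `UpdateStats`, `handle_bad_routes`, and `max_bad` are hypothetical, and the policy (count bad routes per peer, close the session once the count reaches a limit) is an assumption about how such a threshold might be expressed.

```python
from dataclasses import dataclass


@dataclass
class UpdateStats:
    """Hypothetical per-peer counters for routes seen on one session."""
    good_routes: int
    bad_routes: int


def handle_bad_routes(stats: UpdateStats, max_bad: int = 1) -> str:
    """Decide what to do with a peer, given how many bad routes it sent.

    max_bad is the assumed threshold: once the number of bad routes
    reaches it, the session is closed (after telling the peer why).
    Below it, only the bad routes are discarded and the good ones kept.
    """
    if stats.bad_routes == 0:
        return "accept"
    if stats.bad_routes >= max_bad:
        return "close_session"
    return "discard_bad"


# Threshold of 1: a single bad route is enough to close the session.
print(handle_bad_routes(UpdateStats(good_routes=99_999, bad_routes=1), max_bad=1))

# A more tolerant threshold keeps the session up for one bad route...
print(handle_bad_routes(UpdateStats(good_routes=99_999, bad_routes=1), max_bad=100))

# ...but still gives up on a peer that sends almost nothing but garbage.
print(handle_bad_routes(UpdateStats(good_routes=1, bad_routes=99_999), max_bad=100))
```

With `max_bad=1` the sketch closes the session on the first bad route, which is the "set it to 1" position; any larger value accepts good routes alongside a limited number of bad ones, at the cost of having to pick a number that suits both ends of the session.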