In an attempt to return to an argument, rather than simple contradiction (ok, ok, it's far more polite and reasonable so far than that would imply,
but I couldn't miss the cheap shot; apologies hereby tendered), perhaps we
should consider *what* the RFC should say, if it should be changed? Going to the WG with a proposal in hand and a rationale to support it would seem
to be the best path.
So, a summary of my view on it at the moment:
Assumption #1) Resetting a BGP session is 'costly', in terms of the time it takes, the stability it removes, and the fact that it flaps all of your *outgoing* announcements as well as incoming ones.
Assumption #2) A router that sends a malformed route is clearly doing something which it Should Not Be Doing (tm) (ok, this might be axiomatic, but should still be laid out)
Assumption #3) The current practice has demonstrably increased the brittleness of the Internet, by causing severe flapping when someone only partially follows the RFC (in particular, propagating bad route data, whether or not the original source session is reset).
Assumption #4) Routing errors which are bad data, but *not* malformed routes, will not generally be caught by normal means in normal operation, until a case of human intervention to cross-check the data.
Assumption #5) Any router which breaks so badly as to start spewing large amounts of validly formed but erroneous data, and is *also* spewing badly
formed data, will spew noticeable amounts of said badly formed data. (This
one is key, and is only a conjecture; field evidence would be of great use
in validating it).
Hello;

Can "badly formed data" be reasonably clearly defined? What tests are there for "validly formed but erroneous data"?

There are several monitoring efforts (including the one done here) which compare sets of (m)bgp routing tables over time. It seems to me that such (m)bgp pollution should be detectable with a monitoring project.

BTW, what seems to be the clearest sign here of the recent flap was the dropping of 43 Autonomous Systems by UU.net for the Sat Jun 23 16:37:41 2001 status run. This is not a good enough metric to reliably detect such problems. There do seem to be a lot of weird changes in the routing table in that dump, but a simple test for this is not apparent to me at present.

Regards
Marshall Eubanks

Multicast Technologies, Inc.
10301 Democracy Lane, Suite 410
Fairfax, Virginia 22030
Phone : 703-293-9624 Fax : 703-293-9609
e-mail : tme@on-the-i.com http://www.on-the-i.com
Conclusion: the RFC, which currently says you MUST do a NOTIFY and ditch the
session, could be changed to state that you MUST handle the error in one
of two ways: do a NOTIFY and ditch the session (traditional), or send an ALERT and discard the badly formed route. Additionally, this alternative handling MUST NOT be enabled by default, and SHOULD have a threshold parameter at which the session will undergo a NOTIFY/reset, under the assumption that a host sending an appreciable amount of badly formed routes is, in fact, in danger of sending correctly formed but erroneous data as well.
Suitable threshold values are left as an exercise to local admins and BCP
documents; I would think this could be negotiated as a capability extension
to BGP4, with the fallback, of course, being to follow the traditional RFC
practice.
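
To make the proposal concrete, here is a minimal sketch of the decision
logic in Python. It is only an illustration of the idea above, not anything
from the RFC or an existing BGP implementation: the class name, the field
names, the returned action strings, and the default threshold of 10 are all
made up for the example, and ALERT is the proposed message, which does not
exist in BGP4 today.

```python
from dataclasses import dataclass

# Hypothetical action names; "ALERT" is the message proposed in this post.
NOTIFY_AND_RESET = "NOTIFY_AND_RESET"
ALERT_AND_DISCARD = "ALERT_AND_DISCARD"


@dataclass
class MalformedRoutePolicy:
    """Per-session handling of badly formed routes (illustrative only)."""

    # The alternative handling MUST NOT be enabled by default.
    tolerant: bool = False
    # Assumed placeholder; real values are left to local admins and BCPs.
    threshold: int = 10
    # Count of badly formed routes seen on this session so far.
    malformed_seen: int = 0

    def on_malformed_route(self) -> str:
        """Return the action to take for one badly formed route."""
        if not self.tolerant:
            # Traditional behaviour: NOTIFY and ditch the session.
            return NOTIFY_AND_RESET
        self.malformed_seen += 1
        if self.malformed_seen >= self.threshold:
            # Too much badly formed data: assume the peer is broken
            # enough to be sending well-formed garbage as well.
            return NOTIFY_AND_RESET
        # Below the threshold: drop just this route, keep the session.
        return ALERT_AND_DISCARD
```

With tolerant=True and threshold=3, the first two badly formed routes are
discarded with an ALERT and the third resets the session. Whether the
counter should decay over time is one of the things a BCP would need to
settle; the sketch simply counts for the life of the session.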
Thoughts?
--
***************************************************************************
Joel Baker System Administrator - lightbearer.com
lucifer@lightbearer.com http://www.lightbearer.com/~lucifer
Marshall Eubanks tme@21rst-century.com