Peter T. Whiting <pwhiting@fury.ittc.ukans.edu> wrote:
As I understand the current spec, a router, upon receiving a malformed as_path is supposed to respond with a notification message (3.11) and drop the BGP connection. Your suggestion to maintain the connection and drop the announcement is a practical one, but doesn't put as much pressure on vendors to fix the bug.
This is not only practical, but, in fact, the only sane way to do things. Dropping BGP session causes withdrawal of hundreds or thousands of acceptable routes. When the BGP session is reestablished, these routes will be acquired again, causing a wave of announcements. When the invalid route shows up, the cycle is repeated.
What a perfect way to kill the Internet :)
--vadim
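A rough way to put numbers on the cycle described above is sketched here. This is a toy model, not a measurement: the function name `update_churn`, its parameters, and the assumption that each session reset costs exactly one withdrawal plus one re-announcement per good route are all illustrative and do not come from the thread or from any BGP implementation.

```python
def update_churn(good_routes, bad_arrivals, reset_on_error):
    """Count withdrawals plus re-announcements of good routes (toy model)."""
    if not reset_on_error:
        # Only the malformed announcement is discarded; good routes stay put.
        return 0
    # Assumption: each reset withdraws every good route once and
    # re-learns it once when the session comes back up.
    return bad_arrivals * 2 * good_routes


print(update_churn(1000, 3, reset_on_error=True))   # 6000
print(update_churn(1000, 3, reset_on_error=False))  # 0
```

Under these assumptions the churn grows with both the size of the table and the number of times the bad route reappears, which is the feedback loop the post is worried about.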
On Tue, May 23, 2000 at 10:41:24AM -0700, Vadim Antonov wrote:
Peter T. Whiting <pwhiting@fury.ittc.ukans.edu> wrote:
As I understand the current spec, a router, upon receiving a malformed as_path is supposed to respond with a notification message (3.11) and drop the BGP connection. Your suggestion to maintain the connection and drop the announcement is a practical one, but doesn't put as much pressure on vendors to fix the bug.
This is not only practical, but, in fact, the only sane way to do things. Dropping BGP session causes withdrawal of hundreds or thousands of acceptable routes. When the BGP session is reestablished, these routes will be acquired again, causing a wave of announcements. When the invalid route shows up, the cycle is repeated.
What a perfect way to kill the Internet :)
Good point. In defense of the spec, here is a cut from RFC 1771:
If a BGP speaker detects an error, it shuts down the connection and changes its state to Idle. Getting out of the Idle state requires generation of the Start event. If such an event is generated automatically, then persistent BGP errors may result in persistent flapping of the speaker. To avoid such a condition it is recommended that Start events should not be generated immediately for a peer that was previously transitioned to Idle due to an error. For a peer that was previously transitioned to Idle due to an error, the time between consecutive generation of Start events, if such events are generated automatically, shall exponentially increase. The value of the initial timer shall be 60 seconds. The time shall be doubled for each consecutive retry.
pete
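The retry schedule in the quoted RFC text can be sketched as a small generator. The 60-second initial value and the doubling come from the quote; the name `start_event_delays` and the optional `cap` parameter are assumptions added for illustration (the quoted passage does not mention an upper bound).

```python
from itertools import islice

INITIAL_DELAY_S = 60  # initial timer value from the quoted RFC 1771 text


def start_event_delays(initial=INITIAL_DELAY_S, cap=None):
    """Yield the wait, in seconds, before each consecutive automatic Start event."""
    delay = initial
    while True:
        yield delay
        delay *= 2  # "The time shall be doubled for each consecutive retry."
        if cap is not None:
            # Hypothetical ceiling; not part of the quoted text.
            delay = min(delay, cap)


print(list(islice(start_event_delays(), 5)))  # [60, 120, 240, 480, 960]
```

With this schedule a peer that keeps failing is retried after 1, 2, 4, 8 and 16 minutes, so a persistent error produces progressively fewer resets per hour.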
On Tue, 23 May 2000, Vadim Antonov wrote:
This is not only practical, but, in fact, the only sane way to do things. Dropping BGP session causes withdrawal of hundreds or thousands of acceptable routes. When the BGP session is reestablished, these routes will be acquired again, causing a wave of announcements. When the invalid route shows up, the cycle is repeated.
What a perfect way to kill the Internet :)
This is what BGP dampening is for. Dropping a few prefixes here and there is a good way to make operational debugging of a network of any size hell. With a NOTIFY and a drop, there is definite positive feedback that something is majorly wrong, and it allows people to open top-level cases with vendors.
/vijay
I see that I have elicited some interesting responses <insert evil cackle here>. I agree with removing cruftery (thanks for pointing out something that is quite valid, Vijay). Let's talk about a couple of things.
1. How can everyone protect themselves RIGHT NOW?
2. A couple of vendors that I know of decide to either restart routing entirely or at the least restart more than one BGP session. This behaviour should be considered BAD. This behaviour generally cannot be corrected quickly because new code releases take time.
3. The vendor who starts the nastiness seems to be the only one who can't quite seem to grasp that the sort of behaviour that engenders corrupt AS paths is BAD.
So, in light of the above statements, would it be safe to say that safeguarding the Internet is our first duty and beating up on vendors is our second? Please note that I enjoy abusing vendors but they tend to get worn out <grin>.
Note: I will caveat all statements by saying that some vendors claim to have fixed this in later versions of code.
I concur regarding the route-servers. However, just about everything else is free game. Who besides a route-server would want to prepend an AS besides their own? Who wants to allow customers and perhaps even peers to send routes prepending an AS that is not their own?
I would side with Vijay on the withdrawal issue. Since the route update that was received was malformed, we should treat all announcements from the EBGP peer with extreme suspicion. Resetting the BGP session (perhaps tearing it down and leaving it down until a human intervenes) is probably the best idea.
A note of interest for the events I have seen is that you do not necessarily have the BGP session you expect torn down. Wouldn't you expect to tear down your EBGP session with the person who sent the weirdness? I can assure you that several vendors do not do things this way. In fact the vendors I am thinking of quite obviously propagate the bad route AND THEN decide to reset their BGP on a larger scale <grrrr>.
Just some additional thoughts...
Regards,
Blaine
On Tue, 23 May 2000, Vadim Antonov wrote:
Peter T. Whiting <pwhiting@fury.ittc.ukans.edu> wrote:
As I understand the current spec, a router, upon receiving a malformed as_path is supposed to respond with a notification message (3.11) and drop the BGP connection. Your suggestion to maintain the connection and drop the announcement is a practical one, but doesn't put as much pressure on vendors to fix the bug.
This is not only practical, but, in fact, the only sane way to do things. Dropping BGP session causes withdrawal of hundreds or thousands of acceptable routes. When the BGP session is reestablished, these routes will be acquired again, causing a wave of announcements. When the invalid route shows up, the cycle is repeated.
What a perfect way to kill the Internet :)
--vadim
On Tue, 23 May 2000, Blaine Christian wrote:
1. How can everyone protect themselves RIGHT NOW?
RIGHT NOW you can basically shut your routers off. Or a slightly less drastic method might be to trace down the session that originates the bad NLRI and turn that peering session down.
else is free game. Who besides a route-server would want to prepend an AS besides their own? Who wants to allow customers and perhaps even peers to send routes prepending an AS that is not their own?
Prepending an AS is not as inherently bad as REMOVING an AS. You can only prepend an AS to a route you send out (either you originate it or you transit it). If you own the object, BFD. People will notice that you are messing with their AS and various unpleasantness will occur. If you are messing with other people's objects that you are transiting, then they should get a better transit provider. Either way, it is a self-correcting problem which does not cause any catastrophic damage, like removing an AS would.
EBGP peer with extreme suspicion. Resetting the BGP session (perhaps tearing it down and leaving it down until a human intervenes) is probably the best idea. A note of interest for the events I have seen is that you
This is already accounted for in the spec. Exponential backoff on retry.
way. In fact the vendors I am thinking of quite obviously propagate the bad route AND THEN decide to reset their BGP on a larger scale<grrrr>.
Escalate the issue internally to net-eng and let Juzer deal with it.
/vijay
On Tue, 23 May 2000, Blaine Christian wrote:
else is free game. Who besides a route-server would want to prepend an AS besides their own? Who wants to allow customers and perhaps even peers to send routes prepending an AS that is not their own?
FWIW, route servers (at least RSng ones) either prepend their own AS or leave the path information alone. No sane BGP speaker would prepend anything other than its own AS, its peer's (proxy AS prepending) or internal AS numbers for confederation purposes.
This isn't to say that "routers" can't diddle with it all they want. If you have access to a BGP session and can muck with AS-paths in routing updates, you have access to a very effective denial of routing attack.
The only valid defense against such mucking that I can think of is verifying AS adjacencies against some registry and flagging unknown paths. This is not a cheap thing to do. This, however, is far saner than cryptographically signing all routing updates, which is one solution I've heard proposed. :-P
--
Jeffrey Haas - Merit RSng project - jeffhaas@merit.edu
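The adjacency check suggested above might look roughly like this. It is only a sketch under stated assumptions: the registry is modelled as an in-memory set of unordered AS pairs, the function name `unknown_adjacencies` is hypothetical, and the AS numbers are taken from the range reserved for documentation. How the registry is built and kept current is exactly the expensive part the post mentions and is not shown here.

```python
def unknown_adjacencies(as_path, registered):
    """Return adjacent AS pairs in as_path that are absent from the registry."""
    flagged = []
    for left, right in zip(as_path, as_path[1:]):
        if left == right:
            # A repeated AS is ordinary prepending, not a new adjacency.
            continue
        if frozenset((left, right)) not in registered:
            flagged.append((left, right))
    return flagged


# Hypothetical registry: 64500 peers with 64501, and 64501 with 64502.
registry = {frozenset(pair) for pair in [(64500, 64501), (64501, 64502)]}

print(unknown_adjacencies([64500, 64501, 64501, 64502], registry))  # []
print(unknown_adjacencies([64500, 64502], registry))  # [(64500, 64502)]
```

A path is flagged rather than rejected outright, which matches the "flagging unknown paths" wording; what to do with a flagged path is left to policy.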
participants (5)
- Blaine Christian
- Jeff Haas
- Peter T. Whiting
- Vadim Antonov
- Vijay Gill