Did your BGP crash today?
Haven't seen a thread on this one, so I thought I'd start one. RIPE tested a new attribute that crashed the Internet, is that true? Kim
I did see some attribute 99 stuff go around earlier today and have not yet researched it.
Unknown BGP attribute 99 (flags: 240)
Unknown BGP attribute 99 (flags: 240)
Unknown BGP attribute 99 (flags: 240)
Unknown BGP attribute 99 (flags: 240)
Unknown BGP attribute 99 (flags: 240)
- Jared
On Aug 27, 2010, at 1:27 PM, Kasper Adel wrote:
Haven't seen a thread on this one, so I thought I'd start one.
RIPE tested a new attribute that crashed the Internet, is that true?
Kim
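For anyone puzzling over the log lines, here is a minimal sketch of how that flags value breaks down, assuming "flags: 240" is the path attribute flags octet from RFC 4271 printed in decimal. The function name decode_attr_flags is made up for this illustration.

```python
# Path attribute flag bits as defined in RFC 4271, section 4.3.
FLAG_BITS = [
    (0x80, "optional"),
    (0x40, "transitive"),
    (0x20, "partial"),
    (0x10, "extended-length"),
]


def decode_attr_flags(flags):
    """Return the names of the flag bits set in a path attribute flags octet."""
    return [name for bit, name in FLAG_BITS if flags & bit]


# 240 decimal is 0xF0, i.e. all four of the defined high bits are set.
print(decode_attr_flags(240))
```

Under that assumption, the attribute was marked optional and transitive, had already been flagged partial by some router on the way, and used the two-octet length field.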
On Fri, Aug 27, 2010 at 01:29:15PM -0400, Jared Mauch wrote:
Unknown BGP attribute 99 (flags: 240) Unknown BGP attribute 99 (flags: 240) Unknown BGP attribute 99 (flags: 240) Unknown BGP attribute 99 (flags: 240) Unknown BGP attribute 99 (flags: 240)
Just out of curiosity, at what point will we as operators rise up against the ivory tower protocol designers at the IETF and demand that they add a mechanism to not bring down the entire BGP session because of a single malformed attribute? Did I miss the memo about the meeting? I'll bring the punch and pie. -- Richard A Steenbergen <ras@e-gerbil.net> http://www.e-gerbil.net/ras GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)
On 2010-08-27 21:13, Richard A Steenbergen wrote:
On Fri, Aug 27, 2010 at 01:29:15PM -0400, Jared Mauch wrote:
Unknown BGP attribute 99 (flags: 240) Unknown BGP attribute 99 (flags: 240) Unknown BGP attribute 99 (flags: 240) Unknown BGP attribute 99 (flags: 240) Unknown BGP attribute 99 (flags: 240)
Just out of curiosity, at what point will we as operators rise up against the ivory tower protocol designers at the IETF and demand that they add a mechanism to not bring down the entire BGP session because of a single malformed attribute? Did I miss the memo about the meeting? I'll bring the punch and pie.
Complain to your vendor; C & J especially have enough influence on the IETF to make such a change possible. I can agree with tearing the session down when one encounters an improperly formatted message, but an unknown attribute, while the rest of the message format is fine, is a silly thing to hang up on indeed. Greets, Jeroen
On Aug 27, 2010, at 3:17 PM, Jeroen Massar wrote:
On 2010-08-27 21:13, Richard A Steenbergen wrote:
On Fri, Aug 27, 2010 at 01:29:15PM -0400, Jared Mauch wrote:
Unknown BGP attribute 99 (flags: 240) Unknown BGP attribute 99 (flags: 240) Unknown BGP attribute 99 (flags: 240) Unknown BGP attribute 99 (flags: 240) Unknown BGP attribute 99 (flags: 240)
Just out of curiosity, at what point will we as operators rise up against the ivory tower protocol designers at the IETF and demand that they add a mechanism to not bring down the entire BGP session because of a single malformed attribute? Did I miss the memo about the meeting? I'll bring the punch and pie.
Complain to your vendor, especially C & J are having good enough influence on the IETF to make such a change possible.
I can agree with tearing the session down when one encounters an improperly formatted message, but an unknown attribute, while the rest of the format of message is fine, is a silly thing to hang up on indeed.
When you are processing something, it's sometimes hard to tell if something just was mis-parsed (as I think the case is here with the "missing-2-bytes") vs just getting garbage. Perhaps there should be some way to "re-sync" when you are having this problem, or a parallel "keepalive" path similar to MACA/MCAS/MIDCAS/TCAS between the devices to talk when something bad is happening. - Jared
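To illustrate why a mis-parse and garbage are hard to tell apart, here is a rough sketch of walking the path attribute block of an UPDATE. It is not a reconstruction of the actual bug; the attribute 99 payload and the "two octets dropped" scenario are assumptions chosen only to show how one short value desynchronises everything after it. The function name parse_path_attributes is hypothetical.

```python
EXTENDED_LENGTH = 0x10  # RFC 4271: the length field is two octets when set


def parse_path_attributes(buf):
    """Split a path attribute block into (flags, type, value) tuples.

    Raises ValueError when a header or value runs past the end of the buffer.
    """
    attrs, pos = [], 0
    while pos < len(buf):
        if pos + 2 > len(buf):
            raise ValueError("truncated attribute header")
        flags, type_code = buf[pos], buf[pos + 1]
        pos += 2
        len_size = 2 if flags & EXTENDED_LENGTH else 1
        if pos + len_size > len(buf):
            raise ValueError("truncated attribute length")
        length = int.from_bytes(buf[pos:pos + len_size], "big")
        pos += len_size
        if pos + length > len(buf):
            raise ValueError("attribute %d overruns buffer" % type_code)
        attrs.append((flags, type_code, buf[pos:pos + length]))
        pos += length
    return attrs


# ORIGIN (type 1, flags 0x40) with value IGP (0).
origin = bytes([0x40, 1, 1, 0])
# Illustrative attribute 99, flags 0xF0, two-octet length of 4, made-up value.
attr99 = bytes([0xF0, 99]) + (4).to_bytes(2, "big") + b"\xde\xad\xbe\xef"

good = attr99 + origin
# Same block with two value octets dropped but the length field left at 4.
bad = attr99[:-2] + origin

print(parse_path_attributes(good))
try:
    parse_path_attributes(bad)
except ValueError as exc:
    print("parse failed:", exc)
```

In the second case the parser happily swallows the start of ORIGIN as part of attribute 99's value and only notices trouble on the leftover octets, so from the receiver's side it looks like garbage rather than one short attribute.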
On 8/27/2010 3:22 PM, Jared Mauch wrote:
When you are processing something, it's sometimes hard to tell if something just was mis-parsed (as I think the case is here with the "missing-2-bytes") vs just getting garbage. Perhaps there should be some way to "re-sync" when you are having this problem, or a parallel "keepalive" path similar to MACA/MCAS/MIDCAS/TCAS between the devices to talk when something bad is happening.
I know it wasn't there originally, and isn't mandatory now, but there is an MD5 hash that can be added to the packet. If the TCP hash checks out, then you know the packet wasn't garbled, and just contained information you didn't grok. That seems like enough evidence to be able to shrug and toss the packet without dropping the session. -Dave
where's the change management process in all of this? basically, now we are going to start changing things that can potentially have an adverse effect on users without letting anyone know beforehand .... Interesting concept. On Aug 27, 2010, at 3:33 PM, Dave Israel wrote:
On 8/27/2010 3:22 PM, Jared Mauch wrote:
When you are processing something, it's sometimes hard to tell if something just was mis-parsed (as I think the case is here with the "missing-2-bytes") vs just getting garbage. Perhaps there should be some way to "re-sync" when you are having this problem, or a parallel "keepalive" path similar to MACA/MCAS/MIDCAS/TCAS between the devices to talk when something bad is happening.
I know it wasn't there originally, and isn't mandatory now, but there is an MD5 hash that can be added to the packet. If the TCP hash checks out, then you know the packet wasn't garbled, and just contained information you didn't grok. That seems like enough evidence to be able to shrug and toss the packet without dropping the session.
-Dave
=+=+=+=+=+=+=+=+=+=+=+=+= Mike Gatti ekim.ittag@gmail.com =+=+=+=+=+=+=+=+=+=+=+=+=
On Fri, Aug 27, 2010 at 4:07 PM, Mike Gatti <ekim.ittag@gmail.com> wrote:
where's the change management process in all of this? basically, now we are going to start changing things that can potentially have an adverse effect on users without letting anyone know beforehand .... Interesting concept.
you are running bgp, you are connected to the 'internet'... congrats you are part of the experiment. I suppose one view is that "at least it wasn't someone with ill intent, or a misconfigured mikrotik!" (you are asking your vendors to run full bit sweeps of each protocol in a regimented manner checking for all possible edge cases and properly handling them, right?) -chris
come on Chris, is the Internet an experiment or not? :) one would think that a responsible party would have made efforts to let others in the "playground" know they were going to try something different that could have ramifications on an unknown distribution of some code bases. I'm not asking my vendor or (in the case of OSS) me to run "full bit sweeps"... but a heads up to some of the known ops lists would have been not only welcome but expected. as usual, YMMV --bill On Fri, Aug 27, 2010 at 04:11:32PM -0400, Christopher Morrow wrote:
On Fri, Aug 27, 2010 at 4:07 PM, Mike Gatti <ekim.ittag@gmail.com> wrote:
where's the change management process in all of this? basically, now we are going to start changing things that can potentially have an adverse effect on users without letting anyone know beforehand .... Interesting concept.
you are running bgp, you are connected to the 'internet'... congrats you are part of the experiment.
I suppose one view is that "at least it wasn't someone with ill intent, or a misconfigured mikrotik!"
(you are asking your vendors to run full bit sweeps of each protocol in a regimented manner checking for all possible edge cases and properly handling them, right?)
-chris
On Aug 27, 2010, at 5:37 PM, bmanning@vacation.karoshi.com wrote:
come on Chris, is the Internet an experiment or not? :) one would think that a responsible party would have made efforts to let others in the "playground" know they were going to try something different that could have ramifications on an unknown distribution of some code bases.
I'm assuming that they weren't really expecting this to cause issues... Where does one draw the line? I'm planning on announcing x.y.z.0/20 later in the week -- x, y and z are all prime and the sum of all 3 is also a prime. There is a non-zero chance that something somewhere will go flooie, shall I send mail now or later? Also, I would prefer that this gets discovered and dealt with (in this case by stopping the announcement :-)) than having folk not willing to try things and ending up with a weaponized version... W
I'm not asking my vendor or (in the case of OSS) me to run "full bit sweeps"... but a heads up to some of the known ops lists would have been not only welcome but expected.
as usual, YMMV
--bill
-- What our ancestors would really be thinking, if they were alive today, is: "Why is it so dark in here?" -- (Terry Pratchett, Pyramids)
I'm assuming that they weren't really expecting this to cause issues... Where does one draw the line? I'm planning on announcing x.y.z.0/20 later in the week -- x, y and z are all prime and the sum of all 3 is also a prime. There is a non-zero chance that something somewhere will go flooie, shall I send mail now or later?
In this case the researchers sent their peer a new packet that no operational router would ever have generated before. They knew their packet would cause the router to run a less tested, or untested, code path in BGP. In their defence, the risk was low.
That said, when I wrote my own BGP injector I accidentally sent badly formed known messages (like UPDATE, etc.) with bad attributes (like transitive when the RFC says it MUST NOT be, and vice versa) to my routers. Juniper would kill the session at the validation stage and be quite verbose in the log, but Cisco - at least on the 7301 I tested last year with a then-recent IOS - would accept the packet as is. Yep, IOS does accept INVALID packets. I have no idea what happens afterwards, but if a Cisco router passes the packet on to a Juniper router it could have the same effect as what we saw, again, and tear down a session which is not the one that originated the badly formed packet. That said, I suspect that the message may not have been fully parsed, for performance reasons, with the outgoing packet partially generated following the RFC.
Quagga is even worse than Cisco when it comes to packet validation, but that should not surprise anyone :p
Now, should I research the described BGP behaviour (for a white hat conference, for example) and send my possibly risky packets to my peers without telling them? Hell no! I am pretty sure that if I did I would lose quite a few sessions afterwards. People trust me not to abuse my BGP connections, and to send only safe, known messages about my network and not some research stuff.
That said, vendors SHOULD research (and hopefully did) this kind of behaviour, but as yesterday showed, causing packet corruption through a router is bad for its stability :p
Also, I would prefer that this gets discovered and dealt with (in this case by stopping the announcement :-)) than having folk not willing to try things and ending up with a weaponized version...
No argument here. Thomas
imiho, researchers injecting data into the control plane are responsible to have tested it at least against major bgp speakers. and, considering its placement in the net (big core), i consider ios xr to be a major speaker. i suspect that these folk will test better next time. i sure hope so. randy
On 28 Aug 2010, at 08:56, Randy Bush wrote:
imiho, researchers injecting data into the control plane are responsible to have tested it at least against major bgp speakers. and, considering its placement in the net (big core), i consider ios xr to be a major speaker.
i suspect that these folk will test better next time. i sure hope so.
Not sure the researcher can afford to buy a ios xr and may not have access to one ! Thomas
On (2010-08-28 09:22 +0100), Thomas Mangin wrote:
i suspect that these folk will test better next time. i sure hope so.
Not sure the researcher can afford to buy a ios xr and may not have access to one !
Indeed. Also, testing is hard, especially when you essentially need to reinvent the wheel every time, which might not even fit your time schedule. Maybe we as a community could build a 'BGPSpec' testing suite, simply a python (or ruby yay!) script which has been taught at least to puke out UPDATEs that are known to have broken implementations before. Test cases would be unique files, for easy contribution. This BGPSpec could then be run by vendors, researchers and operators, and we could be sure that at least the same mistake is not made twice. With this suite in place, it would be easier for a researcher to write a new test case for the suite and then ask people to run it against their gear.
From a global network security/reliability point of view, BGP is pretty much the only important protocol and as such maybe should enjoy special status in collaborative quality assurance.
Considering this issue, the recent junos 32b ASN issue, the mikrotik long AS path, this http://www.cisco.com/en/US/products/products_security_advisory09186a0080094a... and probably many others, it seems we've been exceptionally lucky that someone hasn't been fuzzing Internet BGP with the aim of breaking as much of it as possible, as it really wouldn't have been that hard. -- ++ytti
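As a rough idea of what one such test case file could look like, here is a minimal sketch. The JSON layout, the field names and the functions build_update and load_case are all invented for illustration; only the UPDATE framing (16-octet marker, two-octet length, type 2, withdrawn routes length, total path attribute length, NLRI) follows RFC 4271. The addresses and AS number are documentation and private-use values.

```python
import json

MARKER = b"\xff" * 16   # RFC 4271 header marker, all ones
UPDATE = 2              # BGP message type code for UPDATE
HEADER_LEN = 19
MAX_MESSAGE_LEN = 4096


def build_update(path_attrs, nlri=b"", withdrawn=b""):
    """Frame raw path attributes and NLRI as a BGP UPDATE message."""
    body = (len(withdrawn).to_bytes(2, "big") + withdrawn
            + len(path_attrs).to_bytes(2, "big") + path_attrs
            + nlri)
    length = HEADER_LEN + len(body)
    if length > MAX_MESSAGE_LEN:
        raise ValueError("BGP message exceeds 4096 octets")
    return MARKER + length.to_bytes(2, "big") + bytes([UPDATE]) + body


def load_case(text):
    """Turn one hypothetical test case file (JSON with hex fields) into bytes."""
    case = json.loads(text)
    return build_update(bytes.fromhex(case["path_attrs"]),
                        bytes.fromhex(case["nlri"]))


# ORIGIN IGP, AS_PATH [64512], NEXT_HOP 192.0.2.1, then an optional
# transitive attribute 99 with a two-octet made-up value.
CASE = """
{
  "name": "unknown-optional-transitive-99",
  "path_attrs": "40010100 4002040201fc00 400304c0000201 c063020001",
  "nlri": "18cb0071"
}
"""

message = load_case(CASE)
print(len(message), message.hex())
```

One file per case keeps contributions small: a new breakage becomes a new file rather than a new tool.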
i suspect that these folk will test better next time. i sure hope so. Not sure the researcher can afford to buy a ios xr and may not have access to one ! Also testing is hard
so is cleaning up the mess when you screw up enough of the internet to make the international press.
Maybe we as community could build 'BGPSpec' testing suite, simply python (or ruby yay!) script which has been thought at least to puke out UPDATEs that have known to break implementations before. Test cases being unique files for easy contribution.
a bgp regression suite would not have caught this as it was not a repeat. but it sure would be useful to implementors. randy
On (2010-08-28 18:20 +0900), Randy Bush wrote:
a bgp regression suite would not have caught this as it was not a repeat. but it sure would be useful to implementors.
Naturally 'proving' that non-trivial software works is practically impossible. But stating what non-existing test-suite would or would not have covered is not a topic I'm particularly interested to engage. -- ++ytti
On 08/28/2010 11:39 AM, Saku Ytti wrote:
On (2010-08-28 18:20 +0900), Randy Bush wrote:
a bgp regression suite would not have caught this as it was not a repeat. but it sure would be useful to implementors.
Naturally 'proving' that non-trivial software works is practically impossible. But stating what non-existing test-suite would or would not have covered is not a topic I'm particularly interested to engage.
I suggest the test-tool has 2 bgp-sessions and tests if what it put in did or did not come out on the other side and in what shape or form. There are already at least 2 projects which have BGP-code which could probably be adapted: http://code.google.com/p/exabgp/ http://code.google.com/p/bgpsimple/ Can I suggest a fuzzer as well?
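The "what went in versus what came out" comparison could be sketched roughly as below. This assumes the device under test does not recognise the injected attribute, in which case RFC 4271 expects an optional transitive attribute to be passed along unchanged with the Partial bit set. The tuple layout (flags, type, value) and the name check_transit are assumptions made for this example, and the session handling itself is left out.

```python
OPTIONAL, TRANSITIVE, PARTIAL = 0x80, 0x40, 0x20


def check_transit(sent, received):
    """Compare an injected attribute with the copy seen on the second session.

    Both arguments are (flags, type_code, value) tuples; `received` is None
    when the attribute did not come out the other side at all.
    Returns a list of human-readable problems, empty if everything matches.
    """
    if received is None:
        return ["attribute was dropped in transit"]
    s_flags, s_type, s_value = sent
    r_flags, r_type, r_value = received
    problems = []
    if r_type != s_type:
        problems.append("type code changed in transit")
    if r_value != s_value:
        problems.append("value changed in transit")
    kept = OPTIONAL | TRANSITIVE
    if (r_flags & kept) != (s_flags & kept):
        problems.append("optional/transitive bits changed in transit")
    if not r_flags & PARTIAL:
        problems.append("partial bit not set by a router that did not recognise it")
    return problems


sent = (0xC0, 99, b"\x00\x01")
print(check_transit(sent, (0xE0, 99, b"\x00\x01")))   # clean pass-through
print(check_transit(sent, (0xE0, 99, b"\x00")))       # value corrupted
```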
Those tools are not suitable for regression testing (I know, I wrote exabgp), not saying they could not be adapted though. Fuzzing may return crashes or issues with the daemon but it is unlikely. You need predictable input for regression testing, and in our particular case how do you detect a corruption without knowing what the behaviour of the router should be on that particular input? If it was that simple vendors would have done it --- from my iPhone On 28 Aug 2010, at 13:09, Leen Besselink <leen@consolejunkie.net> wrote:
On 08/28/2010 11:39 AM, Saku Ytti wrote:
On (2010-08-28 18:20 +0900), Randy Bush wrote:
a bgp regression suite would not have caught this as it was not a repeat. but it sure would be useful to implementors.
Naturally 'proving' that non-trivial software works is practically impossible. But stating what non-existing test-suite would or would not have covered is not a topic I'm particularly interested to engage.
I suggest the test-tool has 2 bgp-sessions and tests if what it put in did or did not come out on the otherside and in what shape or form.
There are already atleast 2 projects which have BGP-code which could probably be adapted: http://code.google.com/p/exabgp/ http://code.google.com/p/bgpsimple/
Can I suggest a fuzzer as wel ?
On (2010-08-28 13:23 +0200), Thomas Mangin wrote:
Those tools are not suitable for regression testing ( I know I wrote exabgp ) not saying they could not be adapted though.
Fuzzing may return crashes or issues with the daemon but it is unlikely. You need predictable input for regression testing, and in our particular case how do you detect a corruption without knowing what the behaviour of the router should be on that particular input?
It doesn't actually matter how likely or unlikely one expects such a tool to be to find new issues. There is already value in that researchers, like RIPE in this case, could simply write a new test case instead of needing to build the whole infrastructure. -- ++ytti
My point was not about crafted bgp message to test border cases - this is what one would expect in a regression suite. It is about the use of a fuzzer to corrupt packet when you then do not know if the router is then behaving correctly or not. --- from my iPhone On 28 Aug 2010, at 13:36, Saku Ytti <saku@ytti.fi> wrote:
On (2010-08-28 13:23 +0200), Thomas Mangin wrote:
Those tools are not suitable for regression testing ( I know I wrote exabgp ) not saying they could not be adapted though.
Fuzzing may return crashes or issues with the daemon but it is unlikely. You need predictable input for regression testing, and in our particular case how do you detect a corruption without knowing what the behaviour of the router should be on that particular input?
It doesn't actually matter how likely or unlikely one expect such tool to be finding new issues. There is already value, that researchers like RIPE in this case, could simply write new test case, instead of needing to build whole infrastructure.
-- ++ytti
On 08/28/2010 01:52 PM, Thomas Mangin wrote:
My point was not about crafted bgp message to test border cases - this is what one would expect in a regression suite. It is about the use of a fuzzer to corrupt packet when you then do not know if the router is then behaving correctly or not.
I wasn't saying you should use both at the same time, but I thought it might be a good idea to add a fuzzer so that it could be run separately. Any bugs we can find before it is in production causing problems is useful. Although most code I've seen which deals with the BGP-protocol directly seemed to be pretty simple/smart about it.
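A separately run fuzzer can at least be made repeatable. The sketch below flips a few bits in a message body under a fixed seed, so any input that upsets a router can be regenerated exactly and filed as a regression case. The function name mutate and its parameters are invented for illustration; leaving the first 19 octets alone is a choice made here so that the message stays properly framed.

```python
import random


def mutate(message, seed, flips=3, protect=19):
    """Flip `flips` random bits in `message`, leaving the first `protect`
    octets (the BGP header) untouched. The same seed gives the same output."""
    rng = random.Random(seed)
    data = bytearray(message)
    if len(data) <= protect:
        return bytes(data)
    for _ in range(flips):
        pos = rng.randrange(protect, len(data))
        data[pos] ^= 1 << rng.randrange(8)
    return bytes(data)


# A made-up 27-octet message: header (marker, length 27, type 2) plus filler.
sample = b"\xff" * 16 + (27).to_bytes(2, "big") + bytes([2]) + bytes(8)
print(mutate(sample, seed=1).hex())
print(mutate(sample, seed=1) == mutate(sample, seed=1))
```

This only answers the "predictable input" half of the objection. Knowing what the router should have done with each mutated message still needs an oracle, which this sketch does not provide.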
On Sat, Aug 28, 2010 at 01:09:47PM +0200, Leen Besselink wrote:
On 08/28/2010 11:39 AM, Saku Ytti wrote:
On (2010-08-28 18:20 +0900), Randy Bush wrote:
a bgp regression suite would not have caught this as it was not a repeat. but it sure would be useful to implementors.
Naturally 'proving' that non-trivial software works is practically impossible. But stating what non-existing test-suite would or would not have covered is not a topic I'm particularly interested to engage.
I suggest the test-tool has 2 bgp-sessions and tests if what it put in did or did not come out on the otherside and in what shape or form.
There are already atleast 2 projects which have BGP-code which could probably be adapted: http://code.google.com/p/exabgp/ http://code.google.com/p/bgpsimple/
Can I suggest a fuzzer as wel ?
There was once cert-bgp-testcases-28may03-final.tar.gz which did some testing (including expected responses). I use it from time to time.
From the README: For more information see the NANOG 28 (http://www.nanog.org) presentation ...
"BGP Vulnerability Testing: Separating Fact from FUD" by Sean Convery (sean@cisco.com) and Matthew Franz (mfranz@cisco.com) But my quick googling failed to locate a link to it. -- :wq Claudio
On Sat, Aug 28, 2010 at 09:22:34AM +0100, Thomas Mangin wrote:
On 28 Aug 2010, at 08:56, Randy Bush wrote:
imiho, researchers injecting data into the control plane are responsible to have tested it at least against major bgp speakers. and, considering its placement in the net (big core), i consider ios xr to be a major speaker.
i suspect that these folk will test better next time. i sure hope so.
Not sure the researcher can afford to buy a ios xr and may not have access to one !
Thomas
while this is undoubtedly true for hobbyist researchers, there are pretty good relationships between vendors and some research facilities with a strong interest in ensuring there is external review of the code base(s). (I am personally aware of at least five such facilities...:) hence I am going to have to echo Randy's sentiments. This was just sloppy. --bill
while this is undoubtedly true for hobbyist researchers, there are pretty good relationships between vendors and some research facilities with a strong interest in ensuring there is external review of the code base(s).
(I am personally aware of at least five such facilities...:)
hence I am going to have to echo Randy's sentiments. This was just sloppy.
I am really surprised by these attitudes. Guys (and gals), these incidents simply go to reinforce that the software we depend on has not received sufficient testing and that we all have gigantic exposures due to things outside of our direct control (eg: cisco, juniper or other router software quality control). You can't just demand that people don't do things that wind up being destructive to you on your production network; that's asking the world to be responsible for you. There are lots of bugs in this stuff and the sooner that we find out about them, the sooner we can get updates to address them and hopefully shorten the window in which those issues could be painful to us and cause us grief.
I am really surprised by these attitudes. Guys (and gals), these incidents simply go to reinforce that the software we depend on, has not received sufficient testing and that we all have gigantic exposures due to things outside of our direct control
nice anti-vendor rant. but over the last decades we the ops have asked for a jillion features which creates massive code, and there is no hope of testing all the code paths rigorously. the vendors have large test labs, and do what they can. sure, they could do better. so could we all. but it is also coders' responsibility, whether vendors or researchers or hackers, to also test what they send. in this case, clearly that was not done sufficiently. if i am sloppy in my receiving code, the pain is mine. if you are sloppy in your sending code, the pain is not yours. randy
* Randy Bush:
imiho, researchers injecting data into the control plane are responsible to have tested it at least against major bgp speakers.
Practically, this boils down to "don't do that", which is certainly fine by me. To carry out such experiments responsibly, you have to conduct so much testing beforehand that the live test on the actual Internet will not yield new insights (assuming you did your pre-experiment testing properly).
I think that focusing on researchers (who we assume are good-intentioned) misses the point. Any connected BGP speaker can inject any form of ugliness. The routers that mishandled these updates were bounded by routers that were able to 'properly' handle corrupted updates. The question of aggressive teardown of BGP sessions after a speaker receives garbage has been well considered for a long time. Stop the problem at the edges. The only difference here is that the edge moved one hop closer to the core. /c Sent from my iPhone On Aug 28, 2010, at 7:31 AM, Florian Weimer <fw@deneb.enyo.de> wrote:
* Randy Bush:
imiho, researchers injecting data into the control plane are responsible to have tested it at least against major bgp speakers.
Practically, this boils down to "don't do that", which is certainly fine by me.
To carry out such experiments responsibly, you have to conduct so much testing beforehand that the live test on the actual Internet will not yield new insights (assuming you did your pre-experiment testing properly).
To carry out such experiments responsibly, you have to conduct so much testing beforehand that the live test on the actual Internet will not yield new insights (assuming you did your pre-experiment testing properly).
you seem to assume the purpose of the test was to see if routers crashed. i certainly think more highly of ripe lans folk than that. randy
Am I the only one on the list who saw the sentence in Cisco's Advisory "Before sending the unknown attribute to peers, the IOS XR corrupted it" which clearly states this was a bug?!
* Randy Bush:
To carry out such experiments responsibly, you have to conduct so much testing beforehand that the live test on the actual Internet will not yield new insights (assuming you did your pre-experiment testing properly).
you seem to assume the purpose of the test was to see if routers crashed. i certainly think more highly of ripe lans folk than that.
We don't yet know precisely what the point of the experiment was. But it is very likely that it intended to study propagation of such updates. Not propagating them is a protocol violation, so in order to observe anything beyond propagation times, they would have to intend to cause protocol violations, which is, in fact, awfully close to session resets (thanks to the BGP protocol design).
On Sat, Aug 28, 2010 at 04:56:05PM +0900, Randy Bush wrote:
imiho, researchers injecting data into the control plane are responsible to have tested it at least against major bgp speakers. and, considering its placement in the net (big core), i consider ios xr to be a major speaker.
i suspect that these folk will test better next time. i sure hope so.
I think you blame the wrong people. The vendor should make sure that their implementation does not violate the very basics of the BGP protocol. This bug in the IOS XR BGP implementation was a ticking time bomb and it was just a matter of when it would blow up. I suspect that Cisco will test better next time when they release a new version of their software. I sure hope so. -- :wq Claudio
* Claudio Jeker:
I think you blame the wrong people. The vendor should make sure that their implementation does not violate the very basics of the BGP protocol.
The curious thing here is that the peer that resets the session, as required by the spec, causes the actual damage (the session reset), and not the peer producing the wrong update. This whole thread is quite schizophrenic because the consensus appears to be that (a) a *researcher is not to blame* for sending out a BGP message which eventually leads to session resets, and (b) an *implementor is to blame* for sending out a BGP message which eventually leads to session resets. You really can't have it both ways. I'm fed up with this situation, and we will fix it this time. My take is that if you reset the session, you're part of the problem, and consequently deserve part of the blame. So if you receive a properly-framed BGP update message you cannot parse, you should just log it, but not take down the session.
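The policy proposed here could be sketched as follows: a framing error still resets the session, while a properly framed message whose contents cannot be parsed is only logged. This is an illustration of the proposal, not what the spec required at the time; the names Action, is_properly_framed and decide are hypothetical, and parse_body stands in for whatever UPDATE parser an implementation has.

```python
import logging
from enum import Enum

MARKER = b"\xff" * 16
HEADER_LEN = 19
MAX_MESSAGE_LEN = 4096


class Action(Enum):
    ACCEPT = "accept"
    LOG_AND_IGNORE = "log-and-ignore"
    RESET_SESSION = "reset-session"


def is_properly_framed(msg):
    """Check marker and length field against the octets actually received."""
    return (HEADER_LEN <= len(msg) <= MAX_MESSAGE_LEN
            and msg[:16] == MARKER
            and int.from_bytes(msg[16:18], "big") == len(msg))


def decide(msg, parse_body):
    """Pick an action for one received message.

    `parse_body` is a caller-supplied function that raises ValueError
    when it cannot make sense of the message body.
    """
    if not is_properly_framed(msg):
        # The byte stream itself can no longer be trusted.
        return Action.RESET_SESSION
    try:
        parse_body(msg[HEADER_LEN:])
    except ValueError as exc:
        logging.warning("unparseable update ignored: %s", exc)
        return Action.LOG_AND_IGNORE
    return Action.ACCEPT


def reject(body):
    raise ValueError("attribute overruns buffer")


framed = MARKER + (23).to_bytes(2, "big") + bytes([2]) + bytes(4)
print(decide(framed, lambda body: None))
print(decide(framed, reject))
print(decide(framed[:-1], reject))
```

Ignoring an update is not free either: the two peers may then disagree about which routes are in play, and this sketch does nothing about that.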
Hi!
I think you blame the wrong people. The vendor should make sure that their implementation does not violate the very basics of the BGP protocol.
The curious thing here is that the peer that resets the session, as required by the spec, causes the actual damage (the session reset), and not the peer producing the wrong update.
This whole thread is quite schizophrenic because the consensus appears to be that (a) a *researcher is not to blame* for sending out a BGP message which eventually leads to session resets, and (b) an *implementor is to blame* for sending out a BGP message which eventually leads to session resets. You really can't have it both ways.
I'm fed up with this situation, and we will fix it this time. My take is that if you reset the session, you're part of the problem, and consequently deserve part of the blame. So if you receive a properly-framed BGP update message you cannot parse, you should just log it, but not take down the session.
Not sure if the link was posted already ... http://www.cisco.com/en/US/products/products_security_advisory09186a0080b441... 'The vulnerability manifests itself when a BGP peer announces a prefix with a specific, valid but unrecognized transitive attribute. On receipt of this prefix, the Cisco IOS XR device will corrupt the attribute before sending it to the neighboring devices. Neighboring devices that receive this corrupted update may reset the BGP peering session.' Bye, Raymond.
* Raymond Dijkxhoorn:
Not sure if the link was posted already ...
http://www.cisco.com/en/US/products/products_security_advisory09186a0080b441...
Cisco posts their advisories to the NANOG list.
'The vulnerability manifests itself when a BGP peer announces a prefix with a specific, valid but unrecognized transitive attribute. On receipt of this prefix, the Cisco IOS XR device will corrupt the attribute before sending it to the neighboring devices. Neighboring devices that receive this corrupted update may reset the BGP peering session.'
I'm not sure what you intend to say by quoting this part of the advisory. If you think that it's an IOS XR bug which only needs fixing in IOS XR, you're showing the very attitude which has stopped us from making the network more resilient to these types of events.
Hi!
Cisco posts their advisories to the NANOG list.
'The vulnerability manifests itself when a BGP peer announces a prefix with a specific, valid but unrecognized transitive attribute. On receipt of this prefix, the Cisco IOS XR device will corrupt the attribute before sending it to the neighboring devices. Neighboring devices that receive this corrupted update may reset the BGP peering session.'
I'm not sure what you intend to say by quoting this part of the advisory. If you think that it's an IOS XR bug which only needs fixing in IOS XR, you're showing the very attitude which has stopped us from making the network more resilient to these types of events.
It's more a workaround than a bugfix ... Don't try to write down what I might think. I am perfectly capable of explaining this myself. The narrow-minded response you just gave tells more about you than about me. So much for the rant. I think I have been around long enough that you would not even consider thinking that I would say 'hey this is an IOS XR BUG. It's not.' I didn't say this at all. Did I? It obviously affected a large part of the traffic on the internet; it took down a couple of the larger networks. http://www.ams-ix.net/cgi-bin/stats/16all?log=totalall;png=daily You can clearly see the drop there also. Call it a 'fix', 'bugfix', 'workaround', whatever you want; I still think it's good they released it, and fast. A more structural approach is nice but won't help a lot of networks right now. I am sorry I tried to add something to the thread. Think about this, Florian. We are not the bad guys. Bye, Raymond.
We had ASN4, AS-PATH and this one. More or less we hit this session reset problem once a year, but nothing has been done yet to change the RFC. So I am to blame as much as every network engineer for not having pushed for a change, or at least for a comprehensive explanation of why the session teardown behaviour is like it is and should not be changed. It is only our fault for not having dealt with the problem correctly the first time, and it will be again next time if nothing is changed once more. I agree that a correctly framed invalid packet should be discarded without tearing the session down. --- from my iPhone On 28 Aug 2010, at 14:27, Florian Weimer <fw@deneb.enyo.de> wrote:
* Raymond Dijkxhoorn:
Not sure if the link was posted already ...
http://www.cisco.com/en/US/products/products_security_advisory09186a0080b441...
Cisco posts their advisories to the NANOG list.
'The vulnerability manifests itself when a BGP peer announces a prefix with a specific, valid but unrecognized transitive attribute. On receipt of this prefix, the Cisco IOS XR device will corrupt the attribute before sending it to the neighboring devices. Neighboring devices that receive this corrupted update may reset the BGP peering session.'
I'm not sure what you intend to say by quoting this part of the advisory. If you think that it's an IOS XR bug which only needs fixing in IOS XR, you're showing the very attitude which has stopped us from making the network more resilient to these types of events.
On Sat, Aug 28, 2010 at 02:51:17PM +0200, Thomas Mangin wrote:
We had ASN4, AS-PATH and this one. More or less we hit this session reset problem once a year but nothing was done yet to change the RFC.
You are mixing up three totally different problems. Sure, the result was the same (session drops). This time an IOS XR device was corrupting an attribute before sending it out. The corruption had to be in the header section of the attribute or the other side would not have detected it (since the neighbor did not know about this attribute either). Now if a system sends out corrupted BGP messages there is no way out: you need to close the session, because not doing so may result in much bigger mayhem. It was not mentioned what the corruption actually was: was the length wrong, or was the optional flag missing (making the attribute well known)? Unlike in the ASN4 issue, this time the session to the faulty system was dropped, and doing so stopped further issues.
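To make the "header section of the attribute" concrete, here is a minimal Python sketch of the flags/type/length header that precedes every path attribute (RFC 4271, section 4.3). It decodes the "attribute 99 (flags: 240)" value from the log lines quoted earlier in the thread. The function names are made up for illustration, and the sample value bytes are arbitrary; nothing here reflects what the corrupted update actually contained.

```python
# Flag bits of a BGP path attribute header (RFC 4271, section 4.3).
OPTIONAL = 0x80
TRANSITIVE = 0x40
PARTIAL = 0x20
EXTENDED_LENGTH = 0x10


def parse_attr_header(data: bytes):
    """Return (flags, type_code, length, header_size) for one path attribute.

    Hypothetical helper: raises ValueError when the header or the declared
    length does not fit in the supplied bytes.
    """
    if len(data) < 3:
        raise ValueError("truncated attribute header")
    flags, type_code = data[0], data[1]
    if flags & EXTENDED_LENGTH:
        # Extended Length set: the length field is two octets.
        if len(data) < 4:
            raise ValueError("truncated extended-length header")
        length = int.from_bytes(data[2:4], "big")
        header_size = 4
    else:
        length = data[2]
        header_size = 3
    if len(data) < header_size + length:
        raise ValueError("attribute length exceeds available data")
    return flags, type_code, length, header_size


def describe_flags(flags: int):
    """Translate the flag octet into readable names."""
    names = ["optional" if flags & OPTIONAL else "well-known"]
    if flags & TRANSITIVE:
        names.append("transitive")
    if flags & PARTIAL:
        names.append("partial")
    if flags & EXTENDED_LENGTH:
        names.append("extended-length")
    return names


# Attribute 99 with flags 240 (0xF0) and two arbitrary value bytes.
sample = bytes([240, 99, 0, 2, 0xAA, 0xBB])
header = parse_attr_header(sample)
flag_names = describe_flags(header[0])
```

Clearing the optional bit (0x80) in the same octet would make a receiver treat type 99 as a well-known attribute it does not recognize, which is one of the two corruption possibilities raised above.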
So I am to blame as much as every network engineer for not having pushed for a change, or at least for a comprehensive explanation of why the session teardown behaviour is like it is and should not be changed.
It is only our fault for not having dealt with the problem the first time correctly, and will be next time if nothing is changed once more.
I agree correctly framed invalid packet should be discarded without tearing the session down.
Great, corrupting your RIB and FIB and the RIB of every one of your peers. Thanks a lot for routing loops and wrong announcements. The only things you can drop without causing trouble are (transitive) optional attributes. This is covered by draft-ietf-idr-optional-transitive, and hopefully it will be adopted as an RFC and implemented by vendors. If a well-known attribute like AS-PATH is corrupted then there is no choice: the session needs to be reset. Which is bad when the AS-PATH validation code has a bug. -- :wq Claudio
On Sat, Aug 28, 2010 at 02:19:28PM +0200, Florian Weimer wrote:
* Claudio Jeker:
I think you blame the wrong people. The vendor should make sure that their implementation does not violate the very basics of the BGP protocol.
The curious thing here is that the peer that resets the session, as required by the spec, causes the actual damage (the session reset), and not the peer producing the wrong update.
This whole thread is quite schizophrenic because the consensus appears to be that (a) a *researcher is not to blame* for sending out a BGP message which eventually leads to session resets, and (b) an *implementor is to blame* for sending out BGP messages which eventually lead to session resets. You really can't have it both ways.
The researcher is not to blame because all the BGP messages he sent out were properly formed. The implementor is to blame because the code he wrote sent out BGP messages which were not properly formed.
I'm fed up with this situation, and we will fix it this time. My take is that if you reset the session, you're part of the problem, and consequently deserve part of the blame. So if you receive a properly-framed BGP update message you cannot parse, you should just log it, but not take down the session.
If you get your wish, and that gets implemented, in some number of years there will be a NANOG posting (perhaps from you, perhaps not) arguing that any malformed BGP message should result in the session being torn down. This will be after a router develops a failure that causes it to send many incorrect messages, but only some of them malformed. So the malformed ones will be discarded, and the remainder will be propagated throughout the Internet. If the ones that are incorrect but not malformed are, say, filled with more specifics for large portions of the Internet, someone will be asking: "How could all the other routers accept these advertisements from a router known to be broken ... it was sending malformed advertisements, but instead of tearing down the sessions, you decided to trust all the validly formed messages from this known-to-be-broken router". My point is: we can't always look at the most recent failure to decide what the correct policy is. We have good data on the cases where NOTIFY on any malformed packet has caused significant outages in the Internet. We don't have nearly as good data on the cases where NOTIFY-on-any-malformed-packet saved the Internet from a significant outage. I don't claim to know which is the bigger problem. But any serious argument to change the behavior needs to consider the risk from propagating information received from a router known to be broken, on the theory that the brokenness only causes malformed messages (which can be discarded) and does not also cause incorrect but correctly formed messages to be sent. -- Brett
On Sat, 28 Aug 2010, Brett Frankenberger wrote:
The implementor is to blame because the code he wrote sent out BGP messages which were not properly formed.
People talk about not dropping sessions but instead dropping malformed messages. This is not safe. We've seen ISIS (which is TLV based and *can* drop individual messages) been wrongly implemented and platforms drop the entire ISIS *packet* instead of the individual message when seeing something malformed (or rather in this case, ISIS multi topology which the implementation didn't understand), and this made the link state database go out of sync and miss information for things it actually should have understood. This was *silent* error/corruption. I'm not sure I prefer to have silent problems instead of tearing down the session which is definitely noticeable. -- Mikael Abrahamsson email: swmike@swm.pp.se
This was *silent* error/corruption. I'm not sure I prefer to have silent problems instead of tearing down the session which is definitely noticeable.
i call the silent fix "do-gooder software." it means to do good. when it works, nobody notices or says thanks. when it fails, there is hell to pay. randy
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Sun, Aug 29, 2010 at 12:23 AM, Mikael Abrahamsson <swmike@swm.pp.se> wrote:
On Sat, 28 Aug 2010, Brett Frankenberger wrote:
The implementor is to blame because the code he wrote sent out BGP messages which were not properly formed.
People talk about not dropping sessions but instead dropping malformed messages. This is not safe. We've seen ISIS (which is TLV based and *can* drop individual messages) been wrongly implemented and platforms drop the entire ISIS *packet* instead of the individual message when seeing something malformed (or rather in this case, ISIS multi topology which the implementation didn't understand), and this made the link state database go out of sync and miss information for things it actually should have understood.
This was *silent* error/corruption. I'm not sure I prefer to have silent problems instead of tearing down the session which is definitely noticeable.
It would seem to me that there should actually be a better option, e.g. recognizing the malformed update, and simply discarding it (and sending the originator an error message) instead of resetting the session. Resetting of BGP sessions should only be done in the most dire of circumstances, to avoid a widespread instability incident. - - ferg -----BEGIN PGP SIGNATURE----- Version: PGP Desktop 9.5.3 (Build 5003) wj8DBQFMegyGq1pz9mNUZTMRAr6tAKDHDZk2/Yk3bHNKTvCJeniTCEdPvwCg0zhk HX/E0XsFOIURWI8UlfpM2Ms= =PSz3 -----END PGP SIGNATURE----- -- "Fergie", a.k.a. Paul Ferguson Engineering Architecture for the Internet fergdawgster(at)gmail.com ferg's tech blog: http://fergdawg.blogspot.com/
On Aug 29, 2010, at 2:30 PM, Paul Ferguson wrote:
It would seem to me that there should actually be a better option, e.g. recognizing the malformed update, and simply discarding it (and sending the originator an error message) instead of resetting the session.
Generation of the error message should probably have a user toggle. ----------------------------------------------------------------------- Roland Dobbins <rdobbins@arbor.net> // <http://www.arbornetworks.com> Injustice is relatively easy to bear; what stings is justice. -- H.L. Mencken
On Sun, Aug 29, 2010 at 12:30:21AM -0700, Paul Ferguson wrote:
It would seem to me that there should actually be a better option, e.g. recognizing the malformed update, and simply discarding it (and sending the originator an error message) instead of resetting the session.
Resetting of BGP sessions should only be done in the most dire of circumstances, to avoid a widespread instability incident.
The only thing you know for sure when you receive a malformed update is that the router on the other end of the connection is broken (or that there's something in between the other router and you that is corrupting messages, but for the purposes of this, that's essentially the same thing). Accepting information received from a router known to be broken, and then passing that on to other routers, is a bad idea and something that could lead to a widespread instability incident. Of course, in theory, you discard the bad updates and only pass on the good updates, but doing that relies on the assumption that the known-to-be-broken router on the other end of the connection is broken in such a way that ensures that all the corrupted messages it sends will be recognizable as malformed and can be discarded. There's plenty of corruption that can't be detected on the receiving end. On top of that, there are problems with being out of sync with the router on the other end. For example, suppose a router developed a condition that caused it to malform all withdraw messages (or, more precisely, all UPDATE messages where the withdrawn routes length field is non-zero). If we implement what you suggest above, then we'll accept all the advertisements from that router, but ignore all the withdraws, and end up sending that router a bunch of traffic that it won't actually be able to handle. -- Brett
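The withdraw scenario above can be illustrated with a toy Python model. This is only a sketch under simplifying assumptions: updates are reduced to hypothetical (kind, prefix, well_formed) tuples, the prefixes are documentation ranges, and "well_formed" stands in for whatever validation a real speaker would do.

```python
def apply_updates(updates):
    """Apply updates to a toy RIB, silently discarding malformed ones.

    Each update is a (kind, prefix, well_formed) tuple, where kind is
    "announce" or "withdraw". Returns (rib, number_discarded).
    """
    rib = set()
    discarded = 0
    for kind, prefix, well_formed in updates:
        if not well_formed:
            # "Discard and carry on" policy: the session stays up.
            discarded += 1
            continue
        if kind == "announce":
            rib.add(prefix)
        elif kind == "withdraw":
            rib.discard(prefix)
    return rib, discarded


# The sender malforms every withdraw but announces correctly.
updates = [
    ("announce", "192.0.2.0/24", True),
    ("announce", "198.51.100.0/24", True),
    ("withdraw", "192.0.2.0/24", False),
]
rib, discarded = apply_updates(updates)
stale = "192.0.2.0/24" in rib  # the sender no longer has this route
```

The receiver keeps forwarding toward 192.0.2.0/24 even though the sender has withdrawn it; a session reset would have flushed the stale route along with everything else.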
It would seem to me that there should actually be a better option, e.g. recognizing the malformed update, and simply discarding it (and sending the originator an error message) instead of resetting the session.
Resetting of BGP sessions should only be done in the most dire of circumstances, to avoid a widespread instability incident.
I had the same thought before giving up on it. Negotiating a new error message could be a per-peer option. BGP has capabilities for this exact reason. However, to make sense you would need to find a resynchronisation point to exclude only the one faulty message. Initially I thought that the last received KEEPALIVE (for the receiver of the error message) could do, but you find yourself with race conditions, so perhaps two KEEPALIVEs back? Each TCP packet can contain multiple messages, so the messages would then have to be split and ACKed individually to find the faulty one. EOR could be used for that purpose. Still, it adds lots of complexity to the conversation: are we not going to introduce bugs in that little-used and little-tested code path as well? Unless you have a new "ACK" capability for each message, which is another idea, but those are clearly discussions for outside NANOG. Thomas
On Sun, Aug 29, 2010 at 3:12 PM, Thomas Mangin <thomas.mangin@exa-networks.co.uk> wrote:
However, to make sense you would need to find a resynchronisation point to exclude only the one faulty message. Initially I thought that the last received KEEPALIVE (for the receiver of the error message) could do, but you find yourself with race conditions, so perhaps two KEEPALIVEs back? Each TCP packet can contain multiple messages, so the messages would then have to be split and ACKed individually to find the faulty one. EOR could be used for that purpose.
Every BGP message header has a portion that starts with 16 all-bits-1 octets, for compatibility. This is distinctive enough an implementation can guess where the next message starts. However, suppose you have an attacker. If, for example, a buggy BGP speaker passes on too short a length value for an attribute, and the attacker knows what length will be sent instead of the right one, the attacker places an entry into the data portion that will appear to the other peer to be "the rest" of the malformed update. Result: the "malformed" update is received and appears to be perfectly valid. The next thing the attacker inserts into the data portion of the attribute is the 16 all-bits-1 octets, a BGP header, an update message type, and their malicious update. This will appear properly formed when the buggy BGP speaker sends it. As far as the buggy BGP speaker is concerned, it has propagated 1 route update. As far as the buggy BGP speaker's other peers are concerned, they have received 3 messages from the buggy speaker. * The update "completed" in the attribute data section. (This is "malformed", but intentionally not detectable as malformed.) * The maliciously injected route. (This isn't supposed to exist. The buggy speaker is unaware of its existence; there is a disagreement between peers about how the message is interpreted.) * A malformed message that does not make any sense. If the injection were perfect, nothing would be detectable as malformed. But alas, the attacker does not know exactly what other attributes or prepending the buggy router will add to the message before passing it on. They could work this out through trial and error; however, some admin will hopefully notice all the CEASEs before the attacker achieves complete success. In this case, by the time the other speakers detect something as malformed, the two preceding updates are already in the table, and possibly even propagated further. A "CEASE" rolls this back, by rolling back the entire session.
Peers could (perhaps) safely re-synchronize in this case if there were an extension to partially roll back some of the updates in a session and request that a portion of the messages be resent. Or if an extension such as authentication were used to make it impossible to inject BGP messages within the value of an attribute. Or through data quarantine: requiring all BGP speakers to disallow the all-bits-1 sequence in any attribute value. Or through peer-specific authentication mechanisms, or checksums and digital signatures, in the message header portion of each BGP message. -- -J
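A small Python sketch of why guessing message boundaries from the marker is fragile. It uses the real 19-octet header layout from RFC 4271 (16-octet marker, 2-octet length, 1-octet type), but the message bodies are placeholder bytes rather than valid UPDATE contents, and the helper names (build_message, split_by_length, naive_resync) are hypothetical.

```python
MARKER = b"\xff" * 16   # 16 all-ones octets at the start of every header
UPDATE = 2
KEEPALIVE = 4
HEADER_LEN = 19         # marker (16) + length (2) + type (1)


def build_message(msg_type: int, body: bytes) -> bytes:
    """Frame a body with a BGP message header (toy bodies, not real UPDATEs)."""
    length = HEADER_LEN + len(body)
    return MARKER + length.to_bytes(2, "big") + bytes([msg_type]) + body


def split_by_length(stream: bytes):
    """Split a byte stream into messages by trusting each length field."""
    messages, offset = [], 0
    while offset < len(stream):
        if stream[offset:offset + 16] != MARKER:
            raise ValueError("lost framing")
        length = int.from_bytes(stream[offset + 16:offset + 18], "big")
        if length < HEADER_LEN or offset + length > len(stream):
            raise ValueError("bad length")
        messages.append(stream[offset:offset + length])
        offset += length
    return messages


def naive_resync(stream: bytes, start: int) -> int:
    """Guess the next message boundary by scanning for the marker."""
    return stream.find(MARKER, start)


# An attacker hides a complete framed message inside attribute data.
injected = build_message(UPDATE, b"evil")
outer = build_message(UPDATE, b"attr-data" + injected)
stream = outer + build_message(KEEPALIVE, b"")

framed = split_by_length(stream)   # correct framing: two messages
guess = naive_resync(stream, 1)    # marker scan after a suspected error
```

With correct framing the stream holds two messages, but the marker scan lands inside the first one, at the start of the injected bytes, and would treat them as a genuine message.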
Every BGP message header has a portion that starts with 16 all-bits-1 octets, for compatibility. This is distinctive enough an implementation can guess where the next message starts.
i desperately feared reading this. i do not want to bet the internet on guessing where anything starts. randy
On Sun, Aug 29, 2010 at 10:12:35PM +0200, Thomas Mangin wrote:
It would seem to me that there should actually be a better option, e.g. recognizing the malformed update, and simply discarding it (and sending the originator an error message) instead of resetting the session.
Resetting of BGP sessions should only be done in the most dire of circumstances, to avoid a widespread instability incident.
I had the same thought before giving up on it.
Negotiating a new error message could be a per peer option. BGP has capabilities for this exact reason.
However, to make sense you would need to find a resynchronisation point to exclude only the one faulty message. Initially I thought that the last received KEEPALIVE (for the receiver of the error message) could do, but you find yourself with race conditions, so perhaps two KEEPALIVEs back?
Apart from one big vendor, most BGP speakers only send KEEPALIVEs when they need to. So on my full feeds I see sessions running for more than 1 month which have received fewer than 300 KEEPALIVE packets. -- :wq Claudio
Apart from one big vendor, most BGP speakers only send KEEPALIVEs when they need to. So on my full feeds I see sessions running for more than 1 month which have received fewer than 300 KEEPALIVE packets.
The negotiated holdtime is always the lower of the values presented by the two routers. The default HoldTime timer for Cisco is 180 seconds and for Juniper 90. So you should see a KEEPALIVE packet every minute from/to Cisco routers, and one every 30 seconds between Junipers. Should a BGP speaker not see any KEEPALIVE during $HOLDTIME, it will tear the session down. You are telling me that your effective holdtime is 2592000 seconds when the HOLDTIME field is 16 bits ... hum ... http://www.faqs.org/rfcs/rfc4271.html section 4.2 So unless you know something I don't, I believe you are totally mistaken :) Thomas
On Mon, 2010-08-30 at 10:58 +0200, Thomas Mangin wrote:
http://www.faqs.org/rfcs/rfc4271.html section 4.2
So unless you know something I don't, I believe you are totally mistaken :)
updates serve as implicit keepalives. in that same section: "Hold Time: The calculated value indicates the maximum number of seconds that may elapse between the receipt of successive KEEPALIVE and/or UPDATE messages from the sender." also check section 6.5: "If a system does not receive successive KEEPALIVE, UPDATE, and/or NOTIFICATION messages [...]" --Daniel
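The point being made here can be sketched in a few lines of Python. RFC 4271 has the session use the smaller of the two advertised Hold Times (which must be zero or at least three seconds) and suggests a KEEPALIVE interval of one third of it; the hold timer is restarted by KEEPALIVE and UPDATE messages alike. The class and function names below are illustrative only, and the timestamps are plain numbers rather than a real clock.

```python
def negotiated_hold_time(local: int, remote: int) -> int:
    """Smaller of the two advertised Hold Times (RFC 4271, section 4.2)."""
    hold = min(local, remote)
    if hold != 0 and hold < 3:
        raise ValueError("Hold Time must be zero or at least three seconds")
    return hold


def keepalive_interval(hold_time: int) -> int:
    """One third of the Hold Time; zero means no KEEPALIVEs are sent."""
    return hold_time // 3


class HoldTimer:
    """Toy hold timer: restarted by KEEPALIVE and by UPDATE messages."""

    def __init__(self, hold_time: int, now: float = 0.0):
        self.hold_time = hold_time
        self.last_heard = now

    def on_message(self, msg_type: str, now: float) -> None:
        if msg_type in ("KEEPALIVE", "UPDATE"):
            self.last_heard = now

    def expired(self, now: float) -> bool:
        # A Hold Time of zero disables the timer entirely.
        return self.hold_time != 0 and now - self.last_heard > self.hold_time


hold = negotiated_hold_time(180, 90)   # 90 wins over 180
timer = HoldTimer(hold)
for t in (60, 120, 180):
    timer.on_message("UPDATE", t)      # a busy full feed, no KEEPALIVEs at all
alive_at_200 = not timer.expired(200)
dead_at_300 = timer.expired(300)
```

On a busy full feed the steady stream of UPDATEs keeps the timer from expiring, so a speaker that only sends KEEPALIVEs "when it needs to" can go a long time with very few of them.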
On Mon, 2010-08-30 at 10:58 +0200, Thomas Mangin wrote:
http://www.faqs.org/rfcs/rfc4271.html section 4.2
So unless you know something I don't, I believe you are totally mistaken :)
updates serve as implicit keepalives.
Rule #1: do not post when you are not awake yet and quote the very text which tells you that you are wrong .. broken :p Thank you Claudio for showing me why it would not work. Thomas
Thomas, Wouldn't the confusion come from the fact that updates are considered as keepalives, so that Claudio sees so few type 4 messages because he receives updates ? Sec 4.2, Hold Time : "The calculated value indicates the maximum number of seconds that may elapse between the receipt of successive KEEPALIVE and/or UPDATE messages from the sender." Regards, Pierre. Thomas Mangin wrote:
Apart from one big vendor, most BGP speakers only send KEEPALIVEs when they need to. So on my full feeds I see sessions running for more than 1 month which have received fewer than 300 KEEPALIVE packets.
The negotiated holdtime is always the lower of the values presented by the two routers. The default HoldTime timer for Cisco is 180 seconds and for Juniper 90. So you should see a KEEPALIVE packet every minute from/to Cisco routers, and one every 30 seconds between Junipers.
Should a BGP speaker not see any KEEPALIVE during $HOLDTIME, it will tear the session down. You are telling me that your effective holdtime is 2592000 seconds when the HOLDTIME field is 16 bits ... hum ... http://www.faqs.org/rfcs/rfc4271.html section 4.2
So unless you know something I don't, I believe you are totally mistaken :)
Thomas
On 8/29/10 3:23 AM, Mikael Abrahamsson wrote:
On Sat, 28 Aug 2010, Brett Frankenberger wrote:
The implementor is to blame because the code he wrote sent out BGP messages which were not properly formed.
People talk about not dropping sessions but instead dropping malformed messages. This is not safe. We've seen ISIS (which is TLV based and *can* drop individual messages) been wrongly implemented and platforms drop the entire ISIS *packet* instead of the individual message when seeing something malformed (or rather in this case, ISIS multi topology which the implementation didn't understand), and this made the link state database go out of sync and miss information for things it actually should have understood.
Reminder: TCP itself has also "been wrongly implemented" with horrid consequences. Unknown TCP options are supposed to be silently discarded. Instead, some middlebox vendors simply copy them into the return packet. There are some circumstances where it makes sense to silently discard one TLV option, and others where it makes sense to discard the whole packet, and still others where it makes sense to drop the session. A problem is that many of the early designers (BGP is a fairly early design) used one-size-fits-all error handling. There's not much anybody can do about bad implementation (as in this case) that corrupts data. But a lot more thought needs to go into error recovery!
This was *silent* error/corruption. I'm not sure I prefer to have silent problems instead of tearing down the session which is definitely noticeable.
Personally, I've usually advocated returning an error message. Many of the protocols I've developed use this approach. (Please forgive RADIUS, which for some odd reason is my most frequently cited work according to Google. My original draft had a Reject, subsequent WG activity took it away. All I could do is throw up my hands and walk away.)
Florian Weimer wrote:
This whole thread is quite schizophrenic because the consensus appears to be that (a) a *researcher is not to blame* for sending out a BGP message which eventually leads to session resets, and (b) an *implementor is to blame* for sending out BGP messages which eventually lead to session resets. You really can't have it both ways.
As good a place to break in on the thread as any, I guess. Randy and others believe more testing should have been done. I'm not completely sure they didn't test against XR. They very likely could have tested in a 1 on 1 connection and everything looked fine. I don't know the full details, but at what point did the corruption appear, and was it visible? We know that it was corrupt on the output which caused peer resets, but was it necessarily visible in the router itself? Do we require a researcher to set up a chain of every vendor's BGP speaker in every possible configuration and order to verify a bug doesn't cause things to break? In this case, one very likely would need an XR receiving and transmitting updates to detect the failure, so no fewer than 3 routers with the XR in the middle. What about individual configurations? Perhaps the update is received and altered by one vendor due to specific configurations, sent to the next vendor, accepted and altered (due to the first alteration, whereas it wouldn't be altered if the original update had been received), which causes the next vendor to reset. Then we add to this that it may pass silently through several middle vendor routers without problems, and we realize the scope of such problems and why connecting to the Internet is so unpredictable. Jack
Date: Mon, 30 Aug 2010 10:55:03 -0500 From: Jack Bates <jbates@brightok.net>
Florian Weimer wrote:
This whole thread is quite schizophrenic because the consensus appears to be that (a) a *researcher is not to blame* for sending out a BGP message which eventually leads to session resets, and (b) an *implementor is to blame* for sending out BGP messages which eventually lead to session resets. You really can't have it both ways.
As good a place to break in on the thread as any, I guess. Randy and others believe more testing should have been done. I'm not completely sure they didn't test against XR. They very likely could have tested in a 1 on 1 connection and everything looked fine.
I don't know the full details, but at what point did the corruption appear, and was it visible? We know that it was corrupt on the output which caused peer resets, but was it necessarily visible in the router itself?
Do we require a researcher to set up a chain of every vendor's BGP speaker in every possible configuration and order to verify a bug doesn't cause things to break? In this case, one very likely would need an XR receiving and transmitting updates to detect the failure, so no fewer than 3 routers with the XR in the middle.
What about individual configurations? Perhaps the update is received and altered by one vendor due to specific configurations, sent to the next vendor, accepted and altered (due to the first alteration, whereas it wouldn't be altered if the original update had been received), which causes the next vendor to reset. Then we add to this that it may pass silently through several middle vendor routers without problems, and we realize the scope of such problems and why connecting to the Internet is so unpredictable.
The only way they could have caught this one was to have tested against a CRS which had another router to which it was announcing the attribute in a malformed packet. Worse, the resets should just keep happening, as the CRS would still have the route with the unknown attribute, which would just generate another malformed update to cause the session to reset again. While it may be possible to recover from something like this, it sure would not be easy. -- R. Kevin Oberman, Network Engineer Energy Sciences Network (ESnet) Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab) E-mail: oberman@es.net Phone: +1 510 486-8634 Key fingerprint:059B 2DDF 031C 9BA3 14A4 EADA 927D EBB3 987B 3751
At 12:40 PM 8/30/2010, Kevin Oberman wrote:
The only way they could have caught this one was to have tested against a CRS which had another router to which it was announcing the attribute in a malformed packet. Worse, the resets should just keep happening, as the CRS would still have the route with the unknown attribute, which would just generate another malformed update to cause the session to reset again.
While it may be possible to recover from something like this, it sure would not be easy.
We experienced something like this a year ago on a couple of quagga boxes. At least we had source code to go through and resources to make use of that source code to find the problem and implement a quick workaround. It's for situations like this that debug logging is ooooohhh so important. What did people do in this case to identify the issue? Did you just pass it off to your vendor, or did anyone try to diagnose it locally? If so, what did you do? ---Mike
-------------------------------------------------------------------- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, mike@sentex.net Providing Internet since 1994 www.sentex.net Cambridge, Ontario Canada www.sentex.net/mike
On Mon, Aug 30, 2010 at 15:55, Jack Bates <jbates@brightok.net> wrote:
...
As good a place to break in on the thread as any, I guess. Randy and others believe more testing should have been done. I'm not completely sure they didn't test against XR. They very likely could have tested in a 1 on 1 connection and everything looked fine.
I don't know the full details, but at what point did the corruption appear, and was it visible? We know that it was corrupt on the output which caused peer resets, but was it necessarily visible in the router itself?
Do we require a researcher to set up a chain of every vendor's BGP speaker in every possible configuration and order to verify a bug doesn't cause things to break? In this case, one very likely would need an XR receiving and transmitting updates to detect the failure, so no fewer than 3 routers with the XR in the middle.
What about individual configurations? Perhaps the update is received and altered by one vendor due to specific configurations, sent to the next vendor, accepted and altered (due to the first alteration, whereas it wouldn't be altered if the original update had been received), which causes the next vendor to reset. Then we add to this that it may pass silently through several middle vendor routers without problems, and we realize the scope of such problems and why connecting to the Internet is so unpredictable.
I am not aware that anyone has provided the complete details at this point, which would include any test plans that may have been performed. From what I have been able to discern, it does seem likely that a test plan that would have caught this almost had to know of the specific issue in advance. More testing would have been better, but there is just too much variability out there to assure you can do a complete test. I am also not aware that the introduction of the attribute was announced to the usual operational lists in advance "just in case" (OK, in this case, I mean NANOG). This, in my mind, is actually the bigger faux pas. An "Oh S***" moment has happened to most of us. It probably will happen again to many of us. But letting people know in advance of scheduled changes is the important thing. I would hope that in the future researchers will commit to test plans against (at least) all the major vendor BGP speakers (which, I admit, would likely not have caught this issue), and that before introducing such "new" attributes into the "Internet", they would announce it to the usual operational lists, again, "just in case". But my hopes are often dashed. Gary
Quagga is even worse than Cisco when it comes to packet validation but it should not surprise anyone :p
To substantiate my claim, my mercurial log tells me that for MPRNLRI and MPURNLRI having the flag set as Transitive instead of Optional did not cause Quagga to complain. It just took the IPv4/IPv6 route . Now it may have been fixed. I should really check and if not pass this to the quagga dev list. I am lazy. Thomas
* Christopher Morrow:
(you are asking your vendors to run full bit sweeps of each protocol in a regimented manner checking for all possible edge cases and properly handling them, right?)
The real issue is that both spec and current practice say you need to drop the session as soon as you encounter any unexpected data. That's just wrong---you can't really be sure that it's a temporary glitch caused by your peer. If it's not, you are unnecessarily taking yourself off the net. Of course, there is little you can do when the outer framing at the internal BGP layer is wrong (resyncing is way too risky). A tear-down might be in order if you receive an unrecognized message type, too. But a BGP update message which is malformed internally should just be ignored. From a theoretical point of view, it's no worse than the operator configuring a prefix-list that filters out all the NLRI entries.
On Sat, Aug 28, 2010 at 6:14 AM, Florian Weimer <fw@deneb.enyo.de> wrote:
* Christopher Morrow:
(you are asking your vendors to run full bit sweeps of each protocol in a regimented manner checking for all possible edge cases and properly handling them, right?)
The real issue is that both spec and current practice say you need to drop the session as soon as you encounter any unexpected data. That's
sorry, I conflated two things... or didn't mean to but did anyway.
1) users of gear that does BGP really need to ask loudly and longly (and then go test for themselves) that their BGP speakers do the 'right thing' when faced with oddball scenarios. If someone sends you a previously unknown attribute... don't corrupt it and pass it on, pass if transitive, drop if not.
2) some thought and writing and code-changes need to go into how the bgp-speakers of the world deal with bad-behaving bgp speakers. Is 'send notify and reset' the right answer? is there one 'right answer' ? Should some classes of fugly exchange end with a 'dropped that update, moved along' and some end with 'pull eject handle!' ?
it's doubtful that 2 can get solved here (nanog, though certainly some operational thought on the right thing would be great as guidance). i would hope that 1 can get some traction here (via folks going back to their vendors and asking: "Did you run the Mu-security/Oulu-univ/etc fuzzing test suites against this code? can I see the results? I hope they match the results I'm going to be getting from my folks in ~2wks... or we'll be having a much more structured/loud conversation...")
another poster had a great point about 'all the world can screw with you, you have no protections other than trust that the next guy won't screw you over (inadvertently)'. There are no protections available to you if someone sets (example) bit 77 in an ipv4 update message to 1 when it should by all accounts be 0. Or (apparently) if they send a previously unknown attribute on a route :( You can put in max-prefix limits, as-path limits (length and content), prefix-filters.. but internal-message-content you are stuck hoping the vendors all followed the same playbook.
With everyone saying together: "Please appropriately test your implementation for all boundary cases" maybe we can get to where these happen less often (or nearly never) - every 3 months is a little tedious.
-chris
On 8/27/10 1:07 PM, Mike Gatti wrote:
where's the change management process in all of this. basically now we are going to start changing things that can potentially have an adverse effect on users without letting anyone know beforehand .... Interesting concept.
BGP is transitive, change management is not. you have a change management process, your peer might integrate into that but have their own, your peer's peers almost certainly do not. Every time a wet-behind-the-ears network engineer connects a bgp speaker to the edge of the network it's an experiment in the stability of the Internet. This on the face of it seems like a quite reasonable experiment, which should have worked, except that it happened to tickle a bug...
On Aug 27, 2010, at 3:33 PM, Dave Israel wrote:
On 8/27/2010 3:22 PM, Jared Mauch wrote:
When you are processing something, it's sometimes hard to tell if something just was mis-parsed (as I think the case is here with the "missing-2-bytes") vs just getting garbage. Perhaps there should be some way to "re-sync" when you are having this problem, or a parallel "keepalive" path similar to MACA/MCAS/MIDCAS/TCAS between the devices to talk when something bad is happening.
I know it wasn't there originally, and isn't mandatory now, but there is an MD5 hash that can be added to the packet. If the TCP hash checks out, then you know the packet wasn't garbled, and just contained information you didn't grok. That seems like enough evidence to be able to shrug and toss the packet without dropping the session.
-Dave
=+=+=+=+=+=+=+=+=+=+=+=+= Mike Gatti ekim.ittag@gmail.com =+=+=+=+=+=+=+=+=+=+=+=+=
On Fri, Aug 27, 2010 at 2:33 PM, Dave Israel <davei@otd.com> wrote:
On 8/27/2010 3:22 PM, Jared Mauch wrote: [snip] an MD5 hash that can be added to the packet. If the TCP hash checks
Hello, layering violation. If the TCP MD5 option was used, the MD5 checksum was probably correct. Malformed BGP Protocol messages, not malformed TCP messages. The BGP protocol that lives on top of TCP is a non-packetized stream. Dropping the IP packets, would just mean that the IP packets containing the malformed BGP data need to get resent (still containing malformed BGP application protocol data, when resent).
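The point about BGP being a non-packetized stream on top of TCP can be sketched as follows. This is a minimal illustration, not any vendor's code: the function names `split_messages` and `keepalive` are made up for the example, and the only protocol facts assumed are the RFC 4271 message header (16-byte marker, 2-byte length, 1-byte type) and the 19 to 4096 byte message size limits.

```python
import struct

MARKER = b"\xff" * 16   # all-ones marker field from RFC 4271
HEADER_LEN = 19         # 16-byte marker + 2-byte length + 1-byte type
MAX_LEN = 4096          # maximum BGP message size in RFC 4271


def split_messages(stream: bytes):
    """Split a reassembled TCP byte stream into (type, body) BGP messages.

    Returns the complete messages plus any trailing bytes that belong to a
    message that has not fully arrived yet. TCP segment boundaries play no
    role here: only the BGP length field delimits messages.
    """
    messages = []
    offset = 0
    while len(stream) - offset >= HEADER_LEN:
        marker, length, msg_type = struct.unpack_from("!16sHB", stream, offset)
        if marker != MARKER or not HEADER_LEN <= length <= MAX_LEN:
            # Outer framing is broken: there is no safe way to resynchronise.
            raise ValueError("framing error at offset %d" % offset)
        if len(stream) - offset < length:
            break  # wait for the rest of this message
        messages.append((msg_type, stream[offset + HEADER_LEN:offset + length]))
        offset += length
    return messages, stream[offset:]


def keepalive() -> bytes:
    """Build a KEEPALIVE (type 4), which is just the 19-byte header."""
    return MARKER + struct.pack("!HB", HEADER_LEN, 4)


if __name__ == "__main__":
    data = keepalive() * 2 + keepalive()[:5]  # two messages and a fragment
    msgs, rest = split_messages(data)
    print(len(msgs), "complete messages,", len(rest), "bytes pending")
```

A valid TCP MD5 signature only says the bytes arrived as sent; whether those bytes form a sane BGP message is decided one layer up, in a parser like this one.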
out, then you know the packet wasn't garbled, and just contained information you didn't grok. That seems like enough evidence to be able to shrug and toss the packet without dropping the session.
If the attribute is malformed, and in particular, if the _length_ value is malformed, because more bits have been transmitted as part of an update than indicated in the length, how do you reliably determine exactly where the "junk" ends, and the next attribute starts, and resume the stream without loss of other critical data? Without a valid length of the attribute, you don't know which bit the next attribute starts at, or which bit the next update starts at. If the apparent length of the update is wrong, the rest of your session appears to be malformed. If your guess is wrong, you could wind up interpreting part of the attribute DATA portion as another route update, allowing an adversary to exploit the malformed bug to possibly inject new routes. A "recovery" mechanism could lead to worse problems, or lead to problems not being discovered.
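A small sketch of why a wrong attribute length is so awkward. It walks a path attribute block the way RFC 4271 lays it out (flags, type code, one- or two-octet length, value); the helper names `encode_attr` and `walk_attrs` and the byte strings are invented for the illustration. In the second case the attribute claims 5 bytes of data but carries 8, and the parser does not fail: it quietly reads the leftover data as another attribute.

```python
import struct

EXTENDED_LENGTH = 0x10  # flag bit: length field is two octets


def encode_attr(flags: int, type_code: int, value: bytes) -> bytes:
    """Encode one path attribute with a correct length field."""
    if flags & EXTENDED_LENGTH:
        return struct.pack("!BBH", flags, type_code, len(value)) + value
    return struct.pack("!BBB", flags, type_code, len(value)) + value


def walk_attrs(data: bytes):
    """Return [(type_code, value), ...], trusting each length field."""
    attrs = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < 3:
            raise ValueError("truncated attribute header")
        flags, type_code = data[offset], data[offset + 1]
        if flags & EXTENDED_LENGTH:
            if len(data) - offset < 4:
                raise ValueError("truncated extended-length header")
            length = int.from_bytes(data[offset + 2:offset + 4], "big")
            header = 4
        else:
            length = data[offset + 2]
            header = 3
        end = offset + header + length
        if end > len(data):
            raise ValueError("attribute overruns the attribute block")
        attrs.append((type_code, data[offset + header:end]))
        offset = end
    return attrs


if __name__ == "__main__":
    origin = encode_attr(0x40, 1, b"\x00")           # ORIGIN = IGP
    good = encode_attr(0xC0, 99, bytes(8)) + origin
    # Same bytes on the wire, but attribute 99 claims 5 bytes instead of 8.
    bad = struct.pack("!BBB", 0xC0, 99, 5) + bytes(8) + origin
    print(walk_attrs(good))
    print(walk_attrs(bad))  # a phantom type-0 attribute shows up
```

Here the misparse happens to be harmless. With attacker-chosen data instead of zeros, the "phantom" attribute could be anything, which is the injection risk described above.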
-Dave -- -J
On Aug 27, 2010, at 3:13 PM, Richard A Steenbergen wrote:
On Fri, Aug 27, 2010 at 01:29:15PM -0400, Jared Mauch wrote:
Unknown BGP attribute 99 (flags: 240) Unknown BGP attribute 99 (flags: 240) Unknown BGP attribute 99 (flags: 240) Unknown BGP attribute 99 (flags: 240) Unknown BGP attribute 99 (flags: 240)
Just out of curiosity, at what point will we as operators rise up against the ivory tower protocol designers at the IETF and demand that they add a mechanism to not bring down the entire BGP session because of a single malformed attribute? Did I miss the memo about the meeting? I'll bring the punch and pie.
I think it's actually an implementation problem where it got out-of-sync. You can't exactly blame the IETF for a vendor having poor code quality. (at least not in this case IMHO). I seem to recall there was something like this in the past that caused some significant problems with people also running XR/CRS-1. They quickly got a fix and cisco issued a PSIRT as a result: http://www.cisco.com/en/US/products/products_security_advisory09186a0080af15... I would hope these people updated their software for that impact as well. Without knowing what the defect impact was on those devices, and without talking to PSIRT today, I don't know if an advisory is pending. Perhaps it's a new defect and the bug is going to be triggered again soon for those that don't patch their devices. - jared
On Aug 27, 2010, at 12:13 PM, Richard A Steenbergen wrote:
Just out of curiosity, at what point will we as operators rise up against the ivory tower protocol designers at the IETF and demand that they add a mechanism to not bring down the entire BGP session because of a single malformed attribute? Did I miss the memo about the meeting? I'll bring the punch and pie.
About the same time vendors' BGP implementations start to work correctly? I agree such a knob would be useful, but seems to me that actually following the current standard would largely curb the issue by itself. I recall one of the previous times something like this happened (and with a much wider impact), I believe it was $C that was accepting a bad attribute and passing it along. The effect was that other vendors ($F in particular, I think) would drop the session (per RFC), which made it look like they were the broken ones. Instead of saying "why was this accepted from its source?" the community reaction seemed more to me to be "hey, BGP is breaking the internet!" If -everyone- dropped the session on a bad attribute, it likely wouldn't make it far enough into the wild to cause these problems in the first place. -c
On Fri, 27 Aug 2010 13:43:39 PDT, Clay Fiske said:
If -everyone- dropped the session on a bad attribute, it likely wouldn't make it far enough into the wild to cause these problems in the first place.
That works fine for malformed attributes. It blows chunks for legally formed but unknown attributes - how would you ever deploy a new attribute?
On Fri, Aug 27, 2010 at 04:57:17PM -0400, Valdis.Kletnieks@vt.edu wrote:
On Fri, 27 Aug 2010 13:43:39 PDT, Clay Fiske said:
If -everyone- dropped the session on a bad attribute, it likely wouldn't make it far enough into the wild to cause these problems in the first place.
That works fine for malformed attributes. It blows chunks for legally formed but unknown attributes - how would you ever deploy a new attribute?
This is covered by the RFC. Unknown attributes are either dropped or passed on depending on the attribute flags. The problem, as in AS4, was that there were illegally formed unknown attributes that got passed around and made RFC compliant routers, which already handled AS4, further down the chain fail. This problem was addressed in "Error Handling for Optional Transitive BGP Attributes" but for some reasons people think it is necessary to make something simple more and more complex so this draft is still pending. -- :wq Claudio
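For reference, the "flags: 240" in the log lines quoted earlier in the thread decodes as below. The bit values and the handling of unrecognized attributes follow RFC 4271; the function names are invented for this sketch.

```python
# Bits of the path attribute flags octet (RFC 4271, section 4.3).
OPTIONAL = 0x80
TRANSITIVE = 0x40
PARTIAL = 0x20
EXTENDED_LENGTH = 0x10


def decode_flags(flags: int) -> dict:
    """Name the four defined bits of the attribute flags octet."""
    return {
        "optional": bool(flags & OPTIONAL),
        "transitive": bool(flags & TRANSITIVE),
        "partial": bool(flags & PARTIAL),
        "extended_length": bool(flags & EXTENDED_LENGTH),
    }


def action_for_unrecognized(flags: int) -> str:
    """What RFC 4271 asks of a speaker that does not recognize a
    well-formed attribute."""
    if not flags & OPTIONAL:
        # Every speaker must know all well-known attributes.
        return "error"
    if flags & TRANSITIVE:
        # Accept it, set the Partial bit, and pass it along unchanged.
        return "forward-with-partial-bit"
    # Optional non-transitive: ignore quietly and do not pass it on.
    return "ignore"


if __name__ == "__main__":
    print(decode_flags(240))
    print(action_for_unrecognized(240))
```

240 is 0xF0, so all four bits are set: optional, transitive, partial (some speaker on the path did not recognize it), and extended length.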
On Aug 27, 2010, at 1:57 PM, Valdis.Kletnieks@vt.edu wrote:
On Fri, 27 Aug 2010 13:43:39 PDT, Clay Fiske said:
If -everyone- dropped the session on a bad attribute, it likely wouldn't make it far enough into the wild to cause these problems in the first place.
That works fine for malformed attributes. It blows chunks for legally formed but unknown attributes - how would you ever deploy a new attribute?
By making it optional. Seems to me that's pretty well covered by the Path Attributes section of the RFC. A bad attribute isn't simply unknown, it's malformed. My apologies for not wording that more precisely. I do see the wisdom of fine-grained control of this behavior. I'm just saying, it'd be nice if we could have correct behavior on the basics in the first place. :) -c
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Fri, Aug 27, 2010 at 5:02 PM, Clay Fiske <clay@bloomcounty.org> wrote:
On Aug 27, 2010, at 1:57 PM, Valdis.Kletnieks@vt.edu wrote:
That works fine for malformed attributes. It blows chunks for legally formed but unknown attributes - how would you ever deploy a new attribute?
By making it optional. Seems to me that's pretty well covered by the Path Attributes section of the RFC.
A bad attribute isn't simply unknown, it's malformed. My apologies for not wording that more precisely.
I do see the wisdom of fine-grained control of this behavior. I'm just saying, it'd be nice if we could have correct behavior on the basics in the first place. :)
As an aside, I see that Cisco has released a late Friday afternoon security advisory on this issue: http://www.cisco.com/warp/public/707/cisco-sa-20100827-bgp.shtml FYI, - - ferg -----BEGIN PGP SIGNATURE----- Version: PGP Desktop 9.5.3 (Build 5003) wj8DBQFMeFNZq1pz9mNUZTMRAkR9AJ9cTz71N5/RMaQFD6LsumKLhpfASACdHrBR 4uQ0+oes21gvTS5IVJZXMds= =5wqD -----END PGP SIGNATURE----- -- "Fergie", a.k.a. Paul Ferguson Engineering Architecture for the Internet fergdawgster(at)gmail.com ferg's tech blog: http://fergdawg.blogspot.com/
Once upon a time, Paul Ferguson <fergdawgster@gmail.com> said:
As an aside, I see that Cisco has released a late Friday afternoon security advisory on this issue:
Huh, I had an upstream (with Cisco gear on their end) do "URGENT maintenance" last night with less than 12 hours notice. I wonder if this is why... -- Chris Adams <cmadams@hiwaay.net> Systems and Network Administrator - HiWAAY Internet Services I don't speak for anybody but myself - that's enough trouble.
On Fri, Aug 27, 2010 at 01:43:39PM -0700, Clay Fiske wrote:
If -everyone- dropped the session on a bad attribute, it likely wouldn't make it far enough into the wild to cause these problems in the first place.
And if everyone filtered their BGP customers there would be no routing leaks, but we've seen how well that works. :) The "if anything bad happens, drop the session" method of protection is only effective if EVERY BGP implementation catches EVERY malformed update EVERY time, which just doesn't match up with reality. Not only that, but a healthy number of the bgp update issues over the years have actually been the result of implementations detecting perfectly valid things as invalid, which means by definition the implementations which get it right and don't drop the session act as carriers and spread the problem route globally. How long are we going to continue to act like this method of protection is actually working? Let's be reasonable, if your basic bgp message format is malformed you're going to need to drop the session. If the packet is corrupted or the size of the message doesn't match what's in the tlv, you're not going to be able to continue and you'll have to drop the session. But there are still a huge number of potential issues where it would be perfectly safe to drop the update you didn't like, and support for this could easily be negotiated and the sending side informed of the issue by a soft notification extension. I have yet to see a single argument against this which isn't political or philosophical in nature. -- Richard A Steenbergen <ras@e-gerbil.net> http://www.e-gerbil.net/ras GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)
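The split proposed above could be sketched roughly like this. It is only an illustration of the policy being argued for, not an existing implementation or negotiated capability; `UpdateCheck`, `Action`, and `choose_action` are hypothetical names, and the three boolean checks stand in for whatever validation a real speaker would perform.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    RESET_SESSION = "send NOTIFICATION and reset the session"
    DROP_UPDATE = "discard this UPDATE and keep the session"
    ACCEPT = "process the UPDATE normally"


@dataclass
class UpdateCheck:
    """Results of validating one received UPDATE (hypothetical fields)."""
    framing_ok: bool          # marker, message length and type are sane
    lengths_consistent: bool  # attribute lengths add up to the message size
    contents_valid: bool      # attribute values make sense to this speaker


def choose_action(check: UpdateCheck) -> Action:
    # If framing or lengths are off, the parser cannot know where the next
    # message starts, so continuing is unsafe.
    if not check.framing_ok or not check.lengths_consistent:
        return Action.RESET_SESSION
    # The message boundaries are trustworthy; only the content is disliked.
    if not check.contents_valid:
        return Action.DROP_UPDATE
    return Action.ACCEPT


if __name__ == "__main__":
    print(choose_action(UpdateCheck(True, True, False)).value)
    print(choose_action(UpdateCheck(True, False, True)).value)
```

The design question the thread keeps circling is which failures belong in the middle branch; a length mismatch inside a single attribute is the hard case, since it may or may not leave the rest of the message trustworthy.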
About the same time the operators get back into the IETF and become involved again. There was a time when operators played a large role in the development of things BGP (e.g. Tony Bates, Enke Chen, both at iMCI). No one is stopping us, the 'ivory tower' has no gate. jy On 28/08/2010, at 5:13 AM, Richard A Steenbergen <ras@e-gerbil.net> wrote:
On Fri, Aug 27, 2010 at 01:29:15PM -0400, Jared Mauch wrote:
Unknown BGP attribute 99 (flags: 240) Unknown BGP attribute 99 (flags: 240) Unknown BGP attribute 99 (flags: 240) Unknown BGP attribute 99 (flags: 240) Unknown BGP attribute 99 (flags: 240)
Just out of curiosity, at what point will we as operators rise up against the ivory tower protocol designers at the IETF and demand that they add a mechanism to not bring down the entire BGP session because of a single malformed attribute? Did I miss the memo about the meeting? I'll bring the punch and pie.
-- Richard A Steenbergen <ras@e-gerbil.net> http://www.e-gerbil.net/ras GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)
Just out of curiosity, at what point will we as operators rise up against the ivory tower protocol designers at the IETF and demand that they add a mechanism to not bring down the entire BGP session because of a single malformed attribute?
there is a problem underlying this. bgp is not tlv. so once a parser detects an error, it can not *rigorously* know where to take up again. randy
Richard A Steenbergen <ras@e-gerbil.net> writes:
Just out of curiosity, at what point will we as operators rise up against the ivory tower protocol designers at the IETF and demand that they add a mechanism to not bring down the entire BGP session because of a single malformed attribute? Did I miss the memo about the meeting?
I guess you did. http://tools.ietf.org/html/draft-ietf-idr-optional-transitive-02 Bjørn
On 8/29/10 9:31 AM, Bjørn Mork wrote:
Richard A Steenbergen <ras@e-gerbil.net> writes:
Just out of curiosity, at what point will we as operators rise up against the ivory tower protocol designers at the IETF and demand that they add a mechanism to not bring down the entire BGP session because of a single malformed attribute? Did I miss the memo about the meeting?
I guess you did.
http://tools.ietf.org/html/draft-ietf-idr-optional-transitive-02
rfc 4893 (4 octet as numbers) leverages the assumption that you can send the as4_path attribute and that even routers that don't understand it will forward it. given that 4 byte as numbers exist in the internet and many non-4byte aware routers exist, that seems like a reasonable assumption.
Bjørn
On Fri, 27 Aug 2010 19:27:06 +0200, Kasper Adel said:
Havent seen a thread on this one so thought i'd start one.
Ripe tested a new attribute that crashed the internet, is that true?
If it in fact "crashed the internet", as opposed to "gave a few buggy routers here and there indigestion", you wouldn't be posting to NANOG looking for confirmation. :)
On 27-08-10 19:31, Valdis.Kletnieks@vt.edu wrote:
On Fri, 27 Aug 2010 19:27:06 +0200, Kasper Adel said:
Havent seen a thread on this one so thought i'd start one.
Ripe tested a new attribute that crashed the internet, is that true?
If it in fact "crashed the internet", as opposed to "gave a few buggy routers here and there indigestion", you wouldn't be posting to NANOG looking for confirmation. :)
https://www.ams-ix.net/statistics/ Not whole internet, but a part. And the "few buggy routers here and there" were mostly Cisco CRS-1's which didn't understand the new attribute and sent a malformed message to all peers, causing them to close the BGP session. I think most of the impact was limited to Europe, especially Amsterdam area. -- Grzegorz Janoszka
On 27 Aug 2010, at 19:27, Grzegorz Janoszka wrote:
On 27-08-10 19:31, Valdis.Kletnieks@vt.edu wrote:
On Fri, 27 Aug 2010 19:27:06 +0200, Kasper Adel said:
Havent seen a thread on this one so thought i'd start one.
Ripe tested a new attribute that crashed the internet, is that true?
If it in fact "crashed the internet", as opposed to "gave a few buggy routers here and there indigestion", you wouldn't be posting to NANOG looking for confirmation. :)
https://www.ams-ix.net/statistics/
Not whole internet, but a part. And the "few buggy routers here and there" were mostly Cisco CRS-1's which didn't understand the new attribute and sent a malformed message to all peers, causing them to close the BGP session.
In a way it reminds me of the ASN4 bug .. Until a vendor fix is available I guess that the details are better left off public mailing lists. http://www.uknof.org.uk/uknof12/Davidson-4_byte_asn.pdf
I think most of the impact was limited to Europe, especially Amsterdam area.
Yes, it had an effect on ISPs which are connected to RIS. http://www.ripe.net/ris/ AFAIK this means ASes at LINX and AMS-IX. The LINX graph shows a similar (but smaller) dip of 50-60 GB. Thomas
On 27-08-10 20:41, Thomas Mangin wrote:
I think most of the impact was limited to Europe, especially Amsterdam area. Yes, It had an effect on ISPs which are connected to RIS. http://www.ripe.net/ris/ AFAIK this mean ASes at LINX and AMS-IX . The LINX graph shows a similar (but smaller) dip of 50-60 GB.
Not only. We don't peer with RIS, but about 8-10 of our peers announce RIS to us. The nasty update we got came from a completely different AS, not RIS. You may just check whether you see AS12654 - it is RIS. -- Grzegorz Janoszka
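One way to do that check against a text dump of AS paths, as a rough sketch. The input format (space-separated AS numbers, with AS sets in braces) and the sample paths are assumptions for the example; the other AS numbers are from the documentation range and do not describe real peers.

```python
RIS_ASN = 12654  # RIPE NCC RIS, as stated in the message above


def path_contains(as_path: str, asn: int = RIS_ASN) -> bool:
    """True if the AS number appears anywhere in a textual AS path.

    Braces and commas from AS_SET notation are treated as separators.
    """
    for ch in "{},":
        as_path = as_path.replace(ch, " ")
    return str(asn) in as_path.split()


if __name__ == "__main__":
    # Hypothetical AS paths, one per route.
    paths = [
        "64500 64501 12654",
        "64500 64502",
        "64503 {64504,12654}",
        "64500 112654",  # must not match on a substring
    ]
    for p in paths:
        print(p, "->", path_contains(p))
```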
On 27 Aug 2010, at 20:03, Grzegorz Janoszka wrote:
On 27-08-10 20:41, Thomas Mangin wrote:
I think most of the impact was limited to Europe, especially Amsterdam area. Yes, It had an effect on ISPs which are connected to RIS. http://www.ripe.net/ris/ AFAIK this mean ASes at LINX and AMS-IX . The LINX graph shows a similar (but smaller) dip of 50-60 GB.
Not only. We don't peer with RIS, but about 8-10 our peers announce to us RIS. The nasty update we got from completely different AS, not RIS. You may just check whether you see AS12654 - it is RIS.
Yes, the BGP message had a transitive attribute - sorry if I was not clear. That said, you may want to ask why you are getting RIS routes if you are not peering with them directly :p RIS is peering world wide ( http://www.ripe.net/ris/docs/peering.html ) but the mail was only sent to linx-ops and tech-l, so the announcement may have been limited to europe (for all I know). Thomas
FYI:
---------------------------------------------------------------------- Dear Colleagues,
On Friday 27 August, from 08:41 to 09:08 UTC, the RIPE NCC Routing Information Service (RIS) announced a route with an experimental BGP attribute. During this announcement, some Internet Service Providers reported problems with their networking infrastructure.
Investigation --------------
Immediately after discovering this, we stopped the announcement and started investigating the problem. Our investigation has shown that the problem was likely to have been caused by certain router types incorrectly modifying the experimental attribute and then further announcing the malformed route to their peers. The announcements sent out by the RIS were correct and complied to all standards.
The experimental attribute was part of an experiment conducted in collaboration with a group from Duke University. This involved announcing a large (3000 bytes) optional transitive attribute, using a modified version of Quagga. The attribute used type code 99. The data consisted of zeros. We used the prefix 93.175.144.0/24 for this and announced from AS 12654 on AMS-IX, NL-IX and GN-IX to all our peers.
Reports from affected ISPs showed that the length of the attribute in the attribute header, as seen by their routers, was not correct. The header stated 233 bytes and the actual data in their samples was 237 bytes. This caused some routers to drop the session with the peer that announced the route.
We have built a test set-up which is running identical software and configurations to the live set-up. From this set-up, and the BGP packet dumps as made by the RIS, we have determined that the length of the data in the attribute as sent out by the RIS was indeed 3000 bytes and that all lengths recorded in the headers of the BGP updates were correct.
Beyond the RIS systems, we can only do limited diagnosis. One possible explanation is that the affected routers did not correctly use the extended length flag on the attribute. This flag is set when the length of the attribute exceeds 255 bytes i.e. when two octets are needed to store the length.
It may be that the routers may not add the higher octet of the length to the total length, which would lead, in our test set-up, to a total packet length of 236 bytes. If, in addition, the routers also incorrectly trim the attribute length, the problem could occur as observed. It is worth noting that the difference between the reported 233 and 237 bytes is the size of the flags, type code and length in the attribute.
We will be further investigating this problem and will report any findings. We regret any inconvenience caused.
Kind regards,
Erik Romijn
Information Services RIPE NCC _______________________________________________ tech-l mailing list tech-l@ams-ix.net http://melix.ams-ix.net/mailman/listinfo/tech-l
- Lucy
On Fri, 27 Aug 2010, Grzegorz Janoszka wrote:
On 27-08-10 19:31, Valdis.Kletnieks@vt.edu wrote:
On Fri, 27 Aug 2010 19:27:06 +0200, Kasper Adel said:
Havent seen a thread on this one so thought i'd start one.
Ripe tested a new attribute that crashed the internet, is that true?
If it in fact "crashed the internet", as opposed to "gave a few buggy routers here and there indigestion", you wouldn't be posting to NANOG looking for confirmation. :)
https://www.ams-ix.net/statistics/
Not whole internet, but a part. And the "few buggy routers here and there" were mostly Cisco CRS-1's which didn't understand the new attribute and sent a malformed message to all peers, causing them to close the BGP session.
I think most of the impact was limited to Europe, especially Amsterdam area.
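The extended length flag mentioned in the RIPE NCC announcement works as in the sketch below, which encodes an attribute like the experimental one (type 99, optional transitive, 3000 zero bytes). The encoding follows RFC 4271; `encode_attr` is a made-up helper, and the "low octet only" line merely illustrates the kind of mistake the announcement hypothesizes. It is not a reconstruction of what the affected routers actually did, and it does not by itself reproduce the 233 and 237 byte figures.

```python
import struct

OPTIONAL = 0x80
TRANSITIVE = 0x40
EXTENDED_LENGTH = 0x10


def encode_attr(flags: int, type_code: int, value: bytes) -> bytes:
    """Encode one path attribute, using a two-octet length only when the
    value is longer than 255 bytes."""
    if len(value) > 255:
        flags |= EXTENDED_LENGTH
        header = struct.pack("!BBH", flags, type_code, len(value))
    else:
        flags &= ~EXTENDED_LENGTH
        header = struct.pack("!BBB", flags, type_code, len(value))
    return header + value


if __name__ == "__main__":
    attr = encode_attr(OPTIONAL | TRANSITIVE, 99, bytes(3000))
    print("header bytes:", attr[:4].hex())   # flags, type, 2-octet length
    print("total size:", len(attr))          # 4-byte header + 3000 data

    # A speaker that kept only the low octet of the length would believe
    # the value is far shorter than what follows on the wire.
    print("low octet only:", 3000 & 0xFF)
```

With extended length the attribute header is four octets (flags, type code, two length octets), which matches the announcement's remark that the gap between 233 and 237 is the size of the flags, type code and length.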
So much for "better left off public mailing lists" ! sigh ! Thomas
sorry - found via google... - Lucy
participants (39)
- Bjørn Mork
- bmanning@vacation.karoshi.com
- Brett Frankenberger
- Chris Adams
- Christian Martin
- Christopher Morrow
- Claudio Jeker
- Clay Fiske
- Daniel Verlouw
- Dave Israel
- Dobbins, Roland
- Florian Weimer
- Gary Buhrmaster
- Grzegorz Janoszka
- Jack Bates
- James Hess
- Jared Mauch
- Jeffrey S. Young
- Jeroen Massar
- Joel Jaeggli
- Kasper Adel
- Kevin Oberman
- Leen Besselink
- lorddoskias
- Lucy Lynch
- Mikael Abrahamsson
- Mike
- Mike Gatti
- Mike Tancsa
- Paul Ferguson
- Pierre Francois
- Randy Bush
- Raymond Dijkxhoorn
- Richard A Steenbergen
- Saku Ytti
- Thomas Mangin
- Valdis.Kletnieks@vt.edu
- Warren Kumari
- William Allen Simpson