sorry - found via google...
- Lucy

On Fri, 27 Aug 2010, Thomas Mangin wrote:
So much for "better left off public mailing lists" ! sigh !
Thomas
On 27 Aug 2010, at 19:42, Lucy Lynch wrote:
FYI:
----------------------------------------------------------------------
Dear Colleagues,
On Friday 27 August, from 08:41 to 09:08 UTC, the RIPE NCC Routing Information Service (RIS) announced a route with an experimental BGP attribute. During this announcement, some Internet Service Providers reported problems with their networking infrastructure.
Investigation
--------------
Immediately after discovering this, we stopped the announcement and started investigating the problem. Our investigation has shown that the problem was likely caused by certain router types incorrectly modifying the experimental attribute and then announcing the malformed route to their peers. The announcements sent out by the RIS were correct and complied with all standards.
The experimental attribute was part of an experiment conducted in collaboration with a group from Duke University. This involved announcing a large (3000 bytes) optional transitive attribute, using a modified version of Quagga. The attribute used type code 99. The data consisted of zeros. We used the prefix 93.175.144.0/24 for this and announced from AS 12654 on AMS-IX, NL-IX and GN-IX to all our peers.
Reports from affected ISPs showed that the length of the attribute in the attribute header, as seen by their routers, was not correct. The header stated 233 bytes and the actual data in their samples was 237 bytes. This caused some routers to drop the session with the peer that announced the route.
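[Editor's sketch] To illustrate why a wrong attribute length makes a receiver give up on the UPDATE, here is a minimal Python walker over a path attribute block. It is a sketch under assumptions, not any vendor's implementation: the function and exception names are hypothetical, and the "bad" buffer in the usage note is made up to mimic the reported 233 versus 237 mismatch (the real malformed packets are not reproduced in this thread). The error code and subcode are the RFC 4271 values for "UPDATE Message Error / Attribute Length Error".

```python
FLAG_EXTENDED_LENGTH = 0x10   # bit 3 of the attribute flags octet (RFC 4271)
UPDATE_MESSAGE_ERROR = 3      # NOTIFICATION error code
ATTRIBUTE_LENGTH_ERROR = 5    # NOTIFICATION error subcode


class AttributeLengthError(Exception):
    """Hypothetical exception carrying the NOTIFICATION code/subcode."""
    code = UPDATE_MESSAGE_ERROR
    subcode = ATTRIBUTE_LENGTH_ERROR


def walk_path_attributes(buf):
    """Return a list of (flags, type_code, value) parsed from buf.

    Raises AttributeLengthError when a declared length does not fit
    the bytes that are actually present.
    """
    attrs = []
    offset = 0
    while offset < len(buf):
        if len(buf) - offset < 2:
            raise AttributeLengthError("truncated flags/type")
        flags, type_code = buf[offset], buf[offset + 1]
        offset += 2
        # One length octet normally, two when the extended length flag is set.
        size = 2 if flags & FLAG_EXTENDED_LENGTH else 1
        if len(buf) - offset < size:
            raise AttributeLengthError("truncated length field")
        length = int.from_bytes(buf[offset:offset + size], "big")
        offset += size
        if len(buf) - offset < length:
            raise AttributeLengthError("declared length exceeds data")
        attrs.append((flags, type_code, buf[offset:offset + length]))
        offset += length
    return attrs
```

For example, `walk_path_attributes(bytes([0xD0, 99, 0x0B, 0xB8]) + bytes(3000))` returns a single attribute of 3000 zero bytes, while a made-up buffer such as `bytes([0xC0, 99, 233]) + bytes(237)` raises `AttributeLengthError`, because the four surplus bytes are misread as the start of another attribute.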
We have built a test set-up which is running identical software and configurations to the live set-up. From this set-up, and the BGP packet dumps as made by the RIS, we have determined that the length of the data in the attribute as sent out by the RIS was indeed 3000 bytes and that all lengths recorded in the headers of the BGP updates were correct.
Beyond the RIS systems, we can only do limited diagnosis. One possible explanation is that the affected routers did not correctly use the extended length flag on the attribute. This flag is set when the length of the attribute exceeds 255 bytes, i.e. when two octets are needed to store the length.
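[Editor's sketch] The extended length flag described above can be illustrated by encoding an attribute header by hand. This is only a sketch of the RFC 4271 wire format, not the modified Quagga code used in the experiment; `encode_attribute` is a hypothetical helper name. The flag values are the standard ones: optional 0x80, transitive 0x40, extended length 0x10.

```python
import struct

FLAG_OPTIONAL = 0x80
FLAG_TRANSITIVE = 0x40
FLAG_EXTENDED_LENGTH = 0x10


def encode_attribute(flags, type_code, value):
    """Encode one BGP path attribute (hypothetical helper).

    The extended length flag is set only when the value does not fit
    in a single length octet, i.e. when it is longer than 255 bytes.
    """
    if len(value) > 255:
        flags |= FLAG_EXTENDED_LENGTH
        # flags (1 octet), type code (1 octet), length (2 octets, big-endian)
        header = struct.pack("!BBH", flags, type_code, len(value))
    else:
        flags &= ~FLAG_EXTENDED_LENGTH
        # flags (1 octet), type code (1 octet), length (1 octet)
        header = struct.pack("!BBB", flags, type_code, len(value))
    return header + value


# The attribute from the announcement: type code 99, 3000 zero bytes,
# optional and transitive.
attr = encode_attribute(FLAG_OPTIONAL | FLAG_TRANSITIVE, 99, bytes(3000))
```

With these inputs the header is the four octets `d0 63 0b b8`: flags 0xD0 (optional, transitive, extended length), type code 99, and the length 3000 stored as 0x0BB8. A router that handles only one of the two length octets would end up with a length that no longer matches the data.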
It may be that the routers do not add the higher octet of the length to the total length, which would lead, in our test set-up, to a total packet length of 236 bytes. If, in addition, the routers also incorrectly trim the attribute length, the problem could occur as observed. It is worth noting that the difference between the reported 233 and 237 bytes is the size of the flags, type code and length in the attribute.
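[Editor's sketch] The closing remark about the 4-byte difference can be checked with simple arithmetic. The helper name below is hypothetical; the field sizes are those of the RFC 4271 attribute header.

```python
def attribute_header_size(extended_length):
    """Size in bytes of a path attribute header.

    One octet of flags, one octet of type code, and a length field of
    two octets when the extended length flag is set, otherwise one.
    """
    return 1 + 1 + (2 if extended_length else 1)


declared = 233  # length stated in the header, as reported by affected ISPs
observed = 237  # actual data length in their samples

# The gap equals the header size of an extended-length attribute.
assert observed - declared == attribute_header_size(True)
```

This only confirms that the numbers are consistent with the header being counted on one side and not on the other; it does not establish which step in the affected routers introduced the error.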
We will be further investigating this problem and will report any findings. We regret any inconvenience caused.
Kind regards,
Erik Romijn
Information Services
RIPE NCC
_______________________________________________
tech-l mailing list
tech-l@ams-ix.net
http://melix.ams-ix.net/mailman/listinfo/tech-l
- Lucy
On Fri, 27 Aug 2010, Grzegorz Janoszka wrote:
On 27-08-10 19:31, Valdis.Kletnieks@vt.edu wrote:
On Fri, 27 Aug 2010 19:27:06 +0200, Kasper Adel said:
Haven't seen a thread on this one so thought I'd start one. RIPE tested a new attribute that crashed the internet, is that true?

If it in fact "crashed the internet", as opposed to "gave a few buggy routers here and there indigestion", you wouldn't be posting to NANOG looking for confirmation. :)
https://www.ams-ix.net/statistics/
Not the whole internet, but a part of it. And the "few buggy routers here and there" were mostly Cisco CRS-1's which didn't understand the new attribute and sent a malformed message to all peers, causing them to close the BGP session.
I think most of the impact was limited to Europe, especially the Amsterdam area.