Incident with AMS-IX drops 7.5 Tbps of traffic

Eric Kuhnke

23 Nov 2023

2:57 p.m.

https://www.ams-ix.net/ams/documentation/total-stats

https://www.ams-ix.net/ams/outage-on-amsterdam-peering-platform


Yesterday evening, today again.

[quote]
"We have identified what seems to be the sequence of events. We are currently working on confirming the hypothesis in the lab, while the platform is stabilised, before we proceed to any further changes/actions.

We have confirmed that Juniper propagated LACP packets from a customer to the rest of the platform. Of course, this shouldn't happen, which points to a bug. This causes customer LACP LAGs to be torn down and, potentially, pseudowires to be destroyed and rebuilt. In consequence, this leads to full buffers and resource starvation, which leads to RSVP message timeouts/errors. Then, RSVP PathError messages are sent (aggressively) and trigger another Juniper bug, which sends PathError messages to both the head-end PEs and new RSVP Path messages to the tail-end PEs, without any back-off timeouts. As a cascade effect, this causes issues to SLXes as well."

Update 15:20: We have finished the call with our vendor and we can confirm that the root cause is indeed triggered by (wrong) propagation of LACP packets from Juniper PE equipment. As a mitigation solution, we will be deploying ACL entries to filter out LACP packets on all non-LACP interfaces on Juniper boxes. At the same time, we will reboot Juniper core-glo-205, to refresh its runtime state, and proceed with load balancing all traffic across both cores.

Please note that the platform is currently stable and we are working on installing the failsafes to avoid any reoccurrence.
[/quote]

On 23-11-2023 15:57, Eric Kuhnke wrote:

...

https://www.ams-ix.net/ams/documentation/total-stats

https://www.ams-ix.net/ams/outage-on-amsterdam-peering-platform
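
For readers wondering what "filter out LACP packets on all non-LACP interfaces" amounts to at the frame level, here is a rough Python sketch of the match logic. It is not AMS-IX's or Juniper's actual ACL, which would live in router configuration and hardware; the function names and the `port_is_lag_member` flag are made up for illustration. It assumes untagged Ethernet frames and relies only on the standard Slow Protocols identifiers (destination MAC 01:80:C2:00:00:02, EtherType 0x8809, subtype 0x01 for LACP).

```python
import struct

# Standard Slow Protocols identifiers used by LACP (IEEE 802.1AX).
SLOW_PROTOCOLS_MAC = bytes.fromhex("0180c2000002")
SLOW_PROTOCOLS_ETHERTYPE = 0x8809
LACP_SUBTYPE = 0x01


def is_lacp_frame(frame: bytes) -> bool:
    """Return True if an untagged Ethernet frame looks like an LACPDU."""
    # 6 bytes dst MAC + 6 bytes src MAC + 2 bytes EtherType + 1 byte subtype.
    if len(frame) < 15:
        return False
    dst_mac = frame[0:6]
    (ethertype,) = struct.unpack("!H", frame[12:14])
    subtype = frame[14]
    return (
        dst_mac == SLOW_PROTOCOLS_MAC
        and ethertype == SLOW_PROTOCOLS_ETHERTYPE
        and subtype == LACP_SUBTYPE
    )


def should_drop(frame: bytes, port_is_lag_member: bool) -> bool:
    """Hypothetical policy: drop LACP frames on ports not configured for LACP."""
    return is_lacp_frame(frame) and not port_is_lag_member
```

The point of the policy is that LACPDUs are link-local and should only ever be seen on a port that is itself part of a LAG; anywhere else they are dropped rather than forwarded.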


participants (2)

Eric Kuhnke
Erik van Westen
