
Hi.

I recently read an interesting research paper that I don't think has been shared here yet: https://www.usenix.org/system/files/usenixsecurity25-stoeger.pdf

I would throw out two additional thoughts related to "Partial Mitigations" (6.1):

1. Use of modern, multi-threaded routing daemons that are able to take advantage of multi-core control-plane CPUs.

2. Control-plane CPU usage monitoring, plus monitoring of per-neighbor sent and received UPDATE message rates. Junos, for example, supports this via SNMP, telemetry, or RPC.

In the "Safe BGP Communities" (6.2) paragraph, the paper advises networks to cease support of the "Lower Local Pref Below Peer" community. As the BGP vortex forms only when both the "Lower Local Pref Below Peer" and "Selective NOPEER" communities are present, perhaps an alternative approach would be to cease support of the two communities together. In other words, both communities would still be allowed separately, but not in combination. A brief analysis of RIPE RIS data for NTT, GTT, and Sparkle "Lower Local Pref Below Peer" and "Selective NOPEER" type communities suggests that prefixes carrying both types at once are rare.

Finally, I wonder if, or how much, the attack is amplified if the adversary ensures (for example, using communities) that each announced prefix has unique path attributes and thus there is one UPDATE message per prefix.

Martin
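As a sketch of the monitoring idea in point 2: a poller that samples a per-neighbor UPDATE counter (for example, BGP4-MIB's bgpPeerInUpdates) can turn two samples into a rate and alarm on a threshold. The counter fetching is out of scope here, and the threshold value is a made-up placeholder, not a recommendation from the paper or from any vendor.

```python
# Hypothetical sketch: derive a per-neighbor BGP UPDATE rate from two
# samples of a monotonically increasing counter such as BGP4-MIB's
# bgpPeerInUpdates. Any SNMP/telemetry/RPC poller could supply the samples.

def update_rate(t0, count0, t1, count1, counter_max=2**32):
    """Messages/sec between two samples, handling one 32-bit counter wrap."""
    if t1 <= t0:
        raise ValueError("samples must be strictly ordered in time")
    delta = count1 - count0
    if delta < 0:          # counter wrapped between the two samples
        delta += counter_max
    return delta / (t1 - t0)

def is_update_storm(rate, threshold=1000.0):
    """Flag rates above a site-specific (here: assumed) msgs/sec threshold."""
    return rate >= threshold
```

A poller loop would call `update_rate` once per neighbor per interval and feed `is_update_storm` hits into the NMS.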

On a quick first read, this seems very much like one of those things that is theoretically possible but highly implausible in the real world.

1. This would be a lot of money for an attacker to spend, connecting to 3 specific ASNs, just to slow down convergence.

2. p3619: "Then each new prefix will be propagated in parallel."

Not really. Even if you assume AS A sent a single UPDATE with 1 NLRI for each prefix, ASes B, C, and D are going to aggregate multiple NLRI changes into a single UPDATE message to each other. This isn't going to cause the amplification claimed.

3. p3620, 5.1 Experiment Infrastructure

Their virtualized test setup is many orders of magnitude less powerful than the actual hardware run by the ASNs that would theoretically be susceptible to this. The software run on that hardware is also WAY more optimized than FRR and BIRD are, especially at the massive BGP scale they run.

4. p3622, 5.3 BGP Vortices Delay Network Convergence, Methodology

This methodology is bad. "I waited X seconds to see" is meaningless. In a controlled environment, you can set things up to see exactly how long convergence takes. You don't need to handwave it.

The real DFZ sees almost constant update splashing and oscillations similar to this 24/7/365, none of it malicious. And it has for years.

On Mon, Oct 6, 2025 at 6:32 AM Martin Tonusoo via NANOG <nanog@lists.nanog.org> wrote:
Hi.
I recently read an interesting research paper that I don’t think has been shared here yet: https://www.usenix.org/system/files/usenixsecurity25-stoeger.pdf
I would throw out two additional thoughts related to "Partial Mitigations" (6.1):
1. Use of modern, multi-threaded routing daemons that are able to take advantage of multi-core control-plane CPUs.
2. Control-plane CPU usage monitoring, plus monitoring of per-neighbor sent and received UPDATE message rates. Junos, for example, supports this via SNMP, telemetry, or RPC.
In the "Safe BGP Communities" (6.2) paragraph, the paper advises networks to cease support of the "Lower Local Pref Below Peer" community. As the BGP vortex forms only when both the "Lower Local Pref Below Peer" and "Selective NOPEER" communities are present, perhaps an alternative approach would be to cease support of the two communities together. In other words, both communities would still be allowed separately, but not in combination. A brief analysis of RIPE RIS data for NTT, GTT, and Sparkle "Lower Local Pref Below Peer" and "Selective NOPEER" type communities suggests that prefixes carrying both types at once are rare.
Finally, I wonder if, or how much, the attack is amplified if the adversary ensures (for example, using communities) that each announced prefix has unique path attributes and thus there is one UPDATE message per prefix.
Martin

On Mon, Oct 6, 2025 at 10:14 AM Tom Beecher via NANOG <nanog@lists.nanog.org> wrote:
On a quick first read, this seems very much like one of those things that is theoretically possible but highly implausible in the real world.
1. This would be a lot of money for an attacker to spend, connecting to 3 specific ASNs, just to slow down convergence.
To be fair, in Appendix A the authors point out that the same effect can be achieved through downstream connections, so long as the upstream network isn't filtering BGP communities. That is, you can get the same effect by buying a single BGP connection to a fourth, tier-2 network, so long as the upstream you've chosen a) doesn't strip BGP communities inbound from customers, b) doesn't strip BGP communities before propagating routes upstream, and c) connects to a trio of ASNs that are mutual peers of each other. So I could trigger this via a simple downstream BGP adjacency through Cogent, for example, for relatively little money.
3. p3620, 5.1 Experiment Infrastructure
Their virtualized test setup is many orders of magnitude less powerful than the actual hardware run by the ASNs that would theoretically be susceptible to this. The software run on that hardware is also WAY more optimized than FRR and BIRD are, especially at the massive BGP scale they run.
4. p3622, 5.3 BGP Vortices Delay Network Convergence, Methodology
This methodology is bad. "I waited X seconds to see" is meaningless. In a controlled environment, you can set things up to see exactly how long convergence takes. You don't need to handwave it.
The real DFZ sees almost constant update splashing and oscillations similar to this 24/7/365, none of it malicious. And it has for years.
I had to chuckle at this part, from p3620, Discussion:

"To put the results above in perspective, a recent report [28] shows that, in 2024, the APNIC R&D Center AS (AS 131072) received around 200000 BGP updates per day, or 2.3 per second. Thus, the fact that a single BGP Vortex attack, based only on 21 ASes, can induce tens of thousands of updates per period highlights the potential impact a BGP Vortex attack can have on the global routing system. Clearly, then, the practical impact of the abstract results described above depends on many factors, but most importantly:"

Yes, on a typical boring day on the Internet, that's about right. However, taking that rate as though it's indicative of what core routers can *handle* is laughable. Flap a transit adjacency, and your router is going to be processing 1M+ BGP update messages, hopefully in a small number of minutes. If my core routers can't deal with at least 200,000 BGP updates a minute, I'm going to be in a world of hurt every time an upstream neighbor session drops and re-establishes.

Likewise, on page 3625, the paper says:

"Rexford et al. [43] and Labovitz et al. [31] showed that while routes to popular destinations tend to be stable over time, network changes can trigger convergence delays lasting tens of minutes."

The two studies cited were performed in 2000 and 2002, a quarter of a century ago. I will confess, I'm still using network hardware from that era...in my home network. Any network connecting to the BGP core of the internet that's running hardware from that era...may ${deity} have mercy on your CPU cores. ^_^;

While this is an interesting demonstration of something we've all had a gut-level understanding of, something that probably takes place all the time due to inconsistent policies and unintentionally overlooked implementation details between peers, there are simpler ways to attack the DFZ core with more devastating impact. The amount of sleep I'd be losing worrying about this is negligible.
Of course, that needs to be understood in the context of just how little sleep I tend to get in general. ^_^;

Thanks!
Matt
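The rate comparison above checks out with trivial arithmetic. The sketch below just restates the figures from the message; the "1M updates over 5 minutes" flap numbers are rough assumptions, not measurements.

```python
# Back-of-the-envelope check of the rates discussed in the thread
# (all figures are illustrative, not measurements).

SECONDS_PER_DAY = 86_400

# ~200k updates/day at AS131072, per the report cited in the paper:
baseline_rate = 200_000 / SECONDS_PER_DAY        # ~2.3 msgs/sec

# Flapping a full-table transit session: assume ~1M updates replayed
# over roughly 5 minutes (both numbers are rough assumptions):
flap_rate = 1_000_000 / (5 * 60)                 # ~3,333 msgs/sec

# The routine flap sits three orders of magnitude above the daily baseline:
amplification = flap_rate / baseline_rate        # = 1440 exactly, given the inputs
```

Which is the point: a rate that is three orders of magnitude above the quiet-day baseline is something DFZ routers already absorb routinely.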

Hi.
2. p3619 : "Then each new prefix will be propagated in parallel."
Not really. Even if you assume AS A sent a single UPDATE with 1 NLRI for each prefix, ASes B, C, and D are going to aggregate multiple NLRI changes into a single UPDATE message to each other. This isn't going to cause the amplification claimed.
Perhaps the authors meant that each UPDATE message sent by AS A has unique path attributes, thus ensuring that ASes B, C, and D cannot aggregate multiple NLRIs into a single UPDATE message.

I tried to replicate the "BGP Vortices Delay Network Convergence" test demonstrated in paragraph 5.3. The setup (drawing: https://gist.github.com/tonusoo/1cced39aa6ae53143d12623a05f02331) is very similar to figure 4b on page 3621, but all my routers are running BIRD 3 (single-thread mode). Router "rY" (ingress) injects a real BGP feed into the lab setup, router "rX" (upstream) periodically advertises and withdraws 50 routes, and router "rK" injects 5k prefixes for the BGP vortex. Running a packet capture on the Linux bridge connecting, for example, the "rN" and "rM" routers confirms that the BGP vortex is ongoing and I'm seeing well over 10k UPDATE messages per second. However, while I might be doing something wrong, I don't see the delays shown in figure 5a on page 3622. That is, the 50 routes advertised or withdrawn by "rX" are propagated to "rZ" within a few hundred milliseconds and are not delayed for 10+ seconds.

Martin

Matthew Petach via NANOG wrote on 06/10/2025 22:23:
While this is an interesting demonstration of something we've all had a gut-level understanding of, something that probably takes place all the time due to inconsistent policies and unintentionally overlooked implementation details between peers, there are simpler ways to attack the DFZ core with more devastating impact.
More to the point: the moment you implement both filtering and propagation of subnets in a routing protocol that allows next-hop resolution, you can no longer deterministically defend against this entire class of threats. This is well known, e.g. connecting up a GRE VPN and inserting the NH of the endpoint into the tunnel. Or being careless with prefix redistribution between routing protocols and finding out that it's very easy to shoot yourself in the foot.

Hopefully most network engineers have done this accidentally at some point, so that they learn to be aware of it as something that can happen. Obviously the singe marks on my fingers are from... other things, and definitely none of the above (looks at floor awkwardly).

Anyway, the principle of all these things is the same: oscillatory invalidation of the next-hop IP address.
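That oscillation class can be shown in miniature. The toy model below (no real routing stack; every name is invented) installs a tunnel route whose own installation makes the tunnel endpoint unreachable, so each convergence pass undoes the previous one and the RIB never settles:

```python
# Toy model of oscillatory next-hop invalidation: a tunnel route that,
# once installed, covers its own endpoint and so invalidates itself.
# Purely illustrative; nothing here models a real protocol.

def endpoint_reachable(rib):
    # The endpoint resolves only while the tunnel route is absent,
    # because the tunnel route would hijack the path to the endpoint.
    return "tunnel-default" not in rib

def converge_step(rib):
    """One pass of the broken policy: install if resolvable, else withdraw."""
    if endpoint_reachable(rib):
        rib.add("tunnel-default")       # next hop resolves -> install
    else:
        rib.discard("tunnel-default")   # next hop gone -> withdraw
    return frozenset(rib)               # snapshot of the RIB after this pass

def run(steps):
    rib, history = set(), []
    for _ in range(steps):
        history.append(converge_step(rib))
    return history

# The RIB alternates between two states forever; it never converges.
states = run(6)
```

Real implementations dampen or break such loops heuristically, which is exactly why "deterministically defend" is the operative phrase above.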
The amount of sleep I'd be losing worrying about this is negligible.
Indeed.

On a separate issue, it's frustrating when people feel the need to brand their discoveries with breathless names and often (not in this case) cutesy logos. What makes a vulnerability relevant is the product of its exploitability and its impact. I'd rate this one as being worth having router high-CPU triggers on the NMS, and possibly churn counters, but not much more.

Nick

On Tue, Oct 7, 2025 at 9:55 AM Martin Tonusoo via NANOG <nanog@lists.nanog.org> wrote:
Hi.
2. p3619 : "Then each new prefix will be propagated in parallel."
Not really. Even if you assume AS A sent a single UPDATE with 1 NLRI for each prefix, ASes B, C, and D are going to aggregate multiple NLRI changes into a single UPDATE message to each other. This isn't going to cause the amplification claimed.
Perhaps the authors meant that each UPDATE message sent by AS A has unique path attributes, thus ensuring that ASes B, C, and D cannot aggregate multiple NLRIs into a single UPDATE message.
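That reading can be illustrated with a toy packing model (purely illustrative; the prefixes and attribute tuples are made up): NLRIs can share an UPDATE only when their path attributes are identical, so per-prefix-unique attributes force one UPDATE per prefix.

```python
# Sketch of why unique path attributes defeat NLRI packing: a speaker
# may pack many NLRIs into one UPDATE only when they carry the exact
# same attribute set. Toy grouping logic, not a BGP implementation.

from collections import defaultdict

def pack_updates(routes):
    """Group (prefix, attrs) routes into one UPDATE per distinct attribute set."""
    updates = defaultdict(list)
    for prefix, attrs in routes:
        updates[attrs].append(prefix)   # attrs must be hashable, e.g. a tuple
    return list(updates.values())

# 50 prefixes with identical attributes -> packable into a single UPDATE:
same = [(f"10.0.{i}.0/24", ("AS65001", "community:none")) for i in range(50)]

# 50 prefixes, each tagged with a unique community -> 50 separate UPDATEs:
unique = [(f"10.0.{i}.0/24", ("AS65001", f"community:{i}")) for i in range(50)]
```

Under this model, an adversary attaching a distinct community per prefix turns one message into fifty, which is the amplification question Martin raises.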
I tried to replicate the "BGP Vortices Delay Network Convergence" test demonstrated in paragraph 5.3. The setup (drawing: https://gist.github.com/tonusoo/1cced39aa6ae53143d12623a05f02331) is very similar to figure 4b on page 3621, but all my routers are running BIRD 3 (single-thread mode). Router "rY" (ingress) injects a real BGP feed into the lab setup, router "rX" (upstream) periodically advertises and withdraws 50 routes, and router "rK" injects 5k prefixes for the BGP vortex. Running a packet capture on the Linux bridge connecting, for example, the "rN" and "rM" routers confirms that the BGP vortex is ongoing and I'm seeing well over 10k UPDATE messages per second. However, while I might be doing something wrong, I don't see the delays shown in figure 5a on page 3622. That is, the 50 routes advertised or withdrawn by "rX" are propagated to "rZ" within a few hundred milliseconds and are not delayed for 10+ seconds.
Looking at figure 6, it appears that the larger component is the time between when the BGP update message arrived at the bystander-AS and when FRR finished logging the update message in its logs. As the methodology states:

"By subtracting the time a route advertisement arrived at the bystander-AS from when it was logged in the FRR's BGP log, we computed the processing time on the bystander-AS."

As someone who has dealt with logging of debugging output from programs that need to be as real-time as possible: logging functions are generally written to be asynchronous and separate from the main processing path, so that delays in the logging subsystem don't hold up the real work the program is doing. Using the appearance of a log message as an indicator of the precise timing of a RIB update is handwavy at best, and flat-out wrong at worst. The timestamp at which the zlog subsystem of FRR got the BGP update log message is unlikely to be the same timestamp at which the RIB itself was updated. Indeed, when researching FRR logging timestamps, the guidance notes:

"Performance impact: Debug-level logging can significantly increase the load on the system and may not capture precise, real-time updates without impacting performance, especially for frequent RIB updates."

So you end up with a double whammy: turning on debug logging to see the logs for the routing updates significantly increases the load on the box running FRR, which in turn slows down the rate at which it can process incoming update messages. I think we've all known for years the perils of turning on extensive debug messages on routers. How many of us have had the awkward moment of a partner shaking us awake in bed, saying "What happened? You were shouting 'undebug all! undebug all!' in your sleep. Were you having a nightmare?"

I suspect if you turn on verbose debug logging on "rZ", you might find that route updates to the RIB suddenly slow down noticeably.
This has less to do with the actions of a route vortex, and much more to do with hitting the CPU of your router over the head repeatedly with the blunt hammer of sprintf. ^_^;;

Matt
Participants (4):
- Martin Tonusoo
- Matthew Petach
- Nick Hilliard
- Tom Beecher