
On Mon, Aug 11, 2025 at 6:16 PM Matthew Petach <mpetach@netflight.com> wrote:
> Oh--I wasn't talking about the CPU having issues. I was talking about DDoSing your own site, with all the inbound worldwide traffic focusing in on the last remaining site, hammering the network links to the point of absolute congestion. At that point, trying to send update messages to depref the anycast routes for the site generally fails, leading to an extended outage as all the traffic gets stuck trying to reach that last site.
Howdy. Why wouldn't the server itself be originating the announcement so that the high-pref route goes away when the routing session collapses?
> It's helpful to set a minimum number of anycast sites in your topology automation systems, such that sites will no longer remove themselves from rotation/distribution if doing so would reduce the count of active sites below the minimum required site count.
Treading on dangerous territory, since the participants can't necessarily know the difference between a site that's down and a site that's inaccessible to them (but not to other people). It might be safer for the system's components to intentionally collapse to the neutral routing preference at that point rather than waiting for the failure cascade to push the system there.

Regards,
Bill Herrin

-- 
William Herrin
bill@herrin.us
https://bill.herrin.us/
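
For illustration only, here is a minimal Python sketch of the two ideas discussed above: a minimum-site-count guard on self-removal, and a deliberate fallback to a neutral preference once the visible site count reaches that floor. Every name and value in it (Site, may_withdraw, choose_preference, the preference numbers, the floor of 2) is a hypothetical chosen for the example, not something taken from a real automation system. The "reachable" field stands in for the observer's own view of a site, which is exactly the part that cannot be trusted: a site marked unreachable here may be perfectly reachable by everyone else.

```python
from dataclasses import dataclass

# Hypothetical values, chosen only for this sketch.
NEUTRAL_PREF = 100     # preference every site announces when the guard trips
HIGH_PREF = 200        # preference a healthy site announces in normal operation
MIN_ACTIVE_SITES = 2   # floor below which sites must not remove themselves


@dataclass
class Site:
    name: str
    healthy: bool      # result of the site's own health check
    reachable: bool    # whether THIS observer can reach the site (may be wrong)


def visible_active(sites):
    """Count the sites this observer believes are up and reachable."""
    return sum(1 for s in sites if s.healthy and s.reachable)


def may_withdraw(site_name, sites, min_active=MIN_ACTIVE_SITES):
    """Minimum-site-count guard: allow self-removal only if enough sites remain."""
    remaining = [s for s in sites if s.name != site_name]
    return visible_active(remaining) >= min_active


def choose_preference(sites, min_active=MIN_ACTIVE_SITES):
    """Collapse to the neutral preference once the visible count hits the floor."""
    if visible_active(sites) <= min_active:
        return NEUTRAL_PREF
    return HIGH_PREF


if __name__ == "__main__":
    sites = [
        Site("ams", healthy=True, reachable=True),
        Site("sjc", healthy=True, reachable=True),
        Site("nrt", healthy=True, reachable=False),  # down, or just unreachable from here?
    ]
    print(may_withdraw("ams", sites))   # guard decision for "ams"
    print(choose_preference(sites))     # preference to announce
```

Note that both functions depend on the observer's view of the other sites, so the sketch inherits the ambiguity raised above rather than resolving it.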