On 11/14/24 12:08 PM, Christopher Morrow wrote:
On Wed, Nov 13, 2024 at 7:02 PM Matt Corallo <nanog@as397444.net> wrote:
Thanks for raising this topic. In all the rush to deploy RPKI I fear these issues are not talked about enough.
you missed ~8yrs of hand wringing and such... so sad.
Fair enough. I suppose I wasn't reading the right corner of the internet to find it. Either way, there are basically no mitigations in place for these issues today, so hand wringing or not, nothing came of it.
To address this, one approach is for autonomous networks within a region to establish two trusted RPKI CA servers: one from the major RIRs and another locally managed. The locally managed CA would take precedence, allowing autonomous networks to submit their IP resources to the RPKI server of their peers (and potentially backed by a national mandate to trust this CA). This setup could prevent a scenario where an entire country’s IP resources are revoked, leading to all IPs being marked as invalid.
A variant of this could make some sense. The issue is that a local RPKI anchor that you and your local community look to doesn't do you a whole lot of good if the global internet community isn't looking at it. Sure, your IPs are routable to a few of your friends, but they can't reach Google... oops.
This is SLURM, actually... but sure. There's even a federated version of SLURM being discussed. You might like that conversation over in sidrops@ietf.org
How is this SLURM? SLURM lets you allow some nets to have a local view of RPKI, which is great as long as there is some covering route in the global DFZ to reach the nets with a local view. The OP didn't mention anything about a covering route, but instead talked about where the RIR that manages the resources from the global internet's PoV decides to AS0-ROA the IPs and make them unroutable.
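For anyone following along, here is a minimal sketch of what a SLURM-style local view does to a validator's output, assuming VRPs are modelled as plain `(prefix, max_length, origin_asn)` tuples. The JSON field names follow RFC 8416; the `apply_slurm` helper, the prefixes, and the ASNs are hypothetical and use documentation ranges only. It is an illustration, not how any particular RP implements it.

```python
import ipaddress

# Hypothetical global VRPs: (prefix, max_length, origin_asn).
# Origin AS0 means "no AS is authorized to originate this prefix".
global_vrps = {
    ("192.0.2.0/24", 24, 0),
    ("198.51.100.0/24", 24, 64496),
}

# SLURM-shaped local exceptions (field names as in RFC 8416).
slurm = {
    "slurmVersion": 1,
    "validationOutputFilters": {
        "prefixFilters": [
            {"prefix": "192.0.2.0/24", "comment": "ignore the AS0 ROA locally"}
        ],
        "bgpsecFilters": [],
    },
    "locallyAddedAssertions": {
        "prefixAssertions": [
            {"asn": 64511, "prefix": "192.0.2.0/24", "maxPrefixLength": 24}
        ],
        "bgpsecAssertions": [],
    },
}


def apply_slurm(vrps, slurm):
    """Return the local VRP set: drop filtered VRPs, then add local assertions."""

    def is_filtered(vrp):
        prefix, _max_len, asn = vrp
        net = ipaddress.ip_network(prefix)
        for f in slurm["validationOutputFilters"]["prefixFilters"]:
            asn_ok = "asn" not in f or f["asn"] == asn
            prefix_ok = True
            if "prefix" in f:
                fnet = ipaddress.ip_network(f["prefix"])
                # A filter matches VRPs equal to or covered by its prefix.
                prefix_ok = net.version == fnet.version and net.subnet_of(fnet)
            if asn_ok and prefix_ok:
                return True
        return False

    local = {v for v in vrps if not is_filtered(v)}
    for a in slurm["locallyAddedAssertions"]["prefixAssertions"]:
        net = ipaddress.ip_network(a["prefix"])
        local.add((a["prefix"], a.get("maxPrefixLength", net.prefixlen), a["asn"]))
    return local


local_vrps = apply_slurm(global_vrps, slurm)
```

The point of the sketch is that `local_vrps` only changes what this one validator tells its own routers. Everyone else still sees the AS0 entry from the global set.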
Another variant I've suggested before relies on timeouts for removal - for networks that have RPKI anchors deployed, if their RIR wants to remove their anchors the RIR must publish an intent to
'anchor' is not an RPKI word; maybe you mean something else? Please try a correct version of the word you mean. (If you mean, effectively, ROA, then basically all ROAs have an expiry, so yay, we already have the thing you want?) Perhaps you actually mean the 'trust-anchor', of which (today) there is one per RIR?
Apologies, I had written the email somewhat in haste. Indeed, ROA is what I'd meant. Sadly, just the existence of an expiry doesn't address the issue unless (a) all RPKI RPs take full advantage of the expiry to cache entries (and MUST do so), which as far as I understand they do not (and which generally isn't practical given the full-rsync+validate approach most take), and (b) RIRs always maintain ROAs with expiry times at least a week (or some N) in the future (I assume most do? But I'm unaware of the exact policy).
remove the anchor a week (or some other N) prior to the removal, with validators ignoring immediate removal. This takes the issue from "I woke up one morning and my IPs weren't routable" to "I spent a week arguing on *NOG and the internet community added a new temporary workaround to avoid my ISP losing all its resources due to a runaway RIR".
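The validator half of that proposal ("ignore immediate removal") could look roughly like the sketch below. Everything in it is hypothetical: the `GraceCache` class, the 7-day window, and the tuple representation of a ROA are assumptions for illustration, and it deliberately ignores the ROA's own validity period and the RIR-side "publish an intent" step.

```python
from datetime import datetime, timedelta

GRACE = timedelta(days=7)  # the "week (or some other N)" from the proposal


class GraceCache:
    """Keep serving a ROA for `grace` after it first vanishes from the repository."""

    def __init__(self, grace=GRACE):
        self.grace = grace
        self.active = set()        # ROAs currently handed to routers
        self.missing_since = {}    # roa -> time it was first seen missing

    def update(self, fetched, now):
        fetched = set(fetched)
        # A ROA that reappears is no longer pending removal.
        for roa in fetched:
            self.missing_since.pop(roa, None)
        # Start the clock for ROAs that just disappeared.
        for roa in self.active - fetched:
            self.missing_since.setdefault(roa, now)
        # Only drop ROAs whose grace window has fully elapsed.
        expired = {r for r, t in self.missing_since.items() if now - t >= self.grace}
        for roa in expired:
            del self.missing_since[roa]
        self.active = (self.active | fetched) - expired
        return self.active


t0 = datetime(2024, 11, 14)
roa = ("192.0.2.0/24", 24, 64496)  # documentation prefix and ASN
cache = GraceCache()
cache.update({roa}, t0)                               # ROA published
still_there = roa in cache.update(set(), t0 + timedelta(days=1))  # ROA vanished
gone = roa not in cache.update(set(), t0 + timedelta(days=8))     # 7 days after vanishing
```

One obvious cost of this design is that a holder's own deliberate ROA removal is delayed by the same window, so a real scheme would need a way to tell the two cases apart.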
Removal of a trust-anchor would have relatively high impact on the RPKI system and possibly the routing system, depending on what decisions were made in BGP policy. I think over time we've tried to make the whole of the system a bit more resilient, though... a missing trust-anchor (or broken one) SHOULD just end up with a bunch of 'not found' or 'unknown' routes... which probably you aren't tossing in the bucket (because ~40% of the internet is still unsigned/unknown).
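To make the difference between the two failure modes concrete, here is a rough sketch of origin validation along the lines of RFC 6811, assuming VRPs are `(prefix, max_length, origin_asn)` tuples. The `validate` function and the example data are hypothetical. With no VRPs at all (the missing trust-anchor case) a route comes out "not-found", while a covering AS0 entry makes the same route "invalid".

```python
import ipaddress


def validate(route_prefix, origin_asn, vrps):
    """Classify a route as 'valid', 'invalid', or 'not-found' against a VRP set."""
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for prefix, max_len, asn in vrps:
        vrp_net = ipaddress.ip_network(prefix)
        # A VRP covers the route if the route is equal to or inside its prefix.
        if route.version != vrp_net.version or not route.subnet_of(vrp_net):
            continue
        covered = True
        # AS0 never matches any origin, so it can only make routes invalid.
        if asn != 0 and asn == origin_asn and route.prefixlen <= max_len:
            return "valid"
    return "invalid" if covered else "not-found"


no_anchor = validate("192.0.2.0/24", 64496, [])
as0_roa = validate("192.0.2.0/24", 64496, [("192.0.2.0/24", 24, 0)])
own_roa = validate("192.0.2.0/24", 64496, [("192.0.2.0/24", 24, 64496)])
```

Under a typical "drop invalid, accept not-found" policy, only the AS0 case takes the prefix off the air.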
Apologies, again I meant ROAs, the email was written in some haste prior to a flight :) Matt