RPKI over RSYNC vs RRDP (Was: plea for comcast/sprint handoff debug help)

31 Oct 2020

On Fri, Oct 30, 2020 at 12:47:44PM +0100, Alex Band wrote:
...
...
> On 30 Oct 2020, at 01:10, Randy Bush <randy@psg.com> wrote:
> > i'll see your blog post and raise you a peer reviewed academic paper
> > and two rfcs :)
>
> For the readers wondering what is going on here: there is a reason
> there is only a vague mention to two RFCs instead of the specific
> paragraph where it says that Relying Party software must fall back to
> rsync immediately if RRDP is temporarily unavailable. That is because
> this section doesn’t exist.

*skeptical face* Alex, you got it backwards: the section that does not
exist is the one that says *not* to fall back to rsync. On the other
hand, there are ample RFC sections which establish that rsync is the
mandatory-to-implement protocol. It starts at RFC 6481 Section 3: "The
publication repository MUST be available using rsync".

Even the RRDP RFC itself (RFC 8182) describes RSYNC and RRDP as
*co-existing*. I think this co-existence was factored into both the
design of RPKIoverRSYNC and subsequently RPKIoverRRDP. An rsync
publication point does not become invalid because of the demise of a
once-upon-a-time valid RRDP publication point.

Only a few weeks ago a large NIR (IDNIC) disabled their RRDP service
because somehow the RSYNC and RRDP repositories were out-of-sync with
each other. The RRDP service remained disabled for a number of days
until they repaired their RPKI Certificate Authority service.

I suppose that during this time, Routinator was unable to receive any
updates related to the IDNIC CA (it was pinned to RRDP because of a
successful RRDP fetch before the partial IDNIC RPKI outage). This in
turn deprived the IDNIC subordinate Resource Holders of the ability to
update their Route Origin Authorization attestations (from Routinator's
perspective).

Given that RRDP is an *optional* protocol in the RPKI stack, it doesn't
make sense to me to strictly pin fetching operations to RRDP: Over time
(months, years), a CA could enable / disable / enable / disable RRDP
service, while listing the RRDP URI as a valid SIA, amongst other valid
SIAs.

An analogy to DNS: A website operator may add AAAA records to indicate
IPv6 reachability, but over time may also remove the AAAA record if
there (temporarily) is some kind of issue with the IPv6 service. The
Internet operations community of course encourages everyone to add AAAA
records, and for a long time the IPv6 Happy Eyeballs concept even
*favored* IPv6 over IPv4 to help improve IPv6 adoption, but a dual-stack
browser will always try to take advantage of the redundancy that exists
through the two address families.

RSYNC and RRDP should be viewed in a similar context as v4 vs v6, but
unlike with IPv4 and IPv6, I am convinced that RSYNC can be deprecated
in the span of 3 or 4 years. The draft-sidrops-bruijnzeels-deprecate-rsync
document is helping towards that goal!
...
> Be that as it may, operators can rest assured that if consensus goes
> against our logic, we will change our design.

Please change the implementation a little bit (0.8.1). I think it is too
soon for the internet wide 'rsync to RRDP' migration project to be
declared complete and successful, and this actually hampers the
transition to RRDP.

Pinning to RRDP *forever* violates the principle-of-least-astonishment
in a world where draft-sidrops-bruijnzeels-deprecate-rsync-00 was
published only as recently as November 2019. That draft is now a working
group document, and it will probably take another 1 or 2 years before it
is published as an RFC.

Section 5 of 'draft-deprecate-rsync' says RRDP *SHOULD* be used when it
is available. Thus it logically follows, when it is not available, the
lowest common denominator is to be used: rsync. After all, the Issuing
CA put an RSYNC URI in the 'Subject Information Access' (SIA). Who knows
better than the CA?
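
To illustrate the "RRDP first, rsync as the lowest common denominator"
ordering, here is a minimal Python sketch. It assumes the SIA has already
been parsed into (access method OID, URI) pairs; the function name and the
example hostnames are hypothetical and not taken from any validator.

```python
# SIA access method OIDs: id-ad-caRepository (RFC 6487) and
# id-ad-rpkiNotify (RFC 8182).
ID_AD_CA_REPOSITORY = "1.3.6.1.5.5.7.48.5"   # rsync publication point
ID_AD_RPKI_NOTIFY = "1.3.6.1.5.5.7.48.13"    # RRDP notification file


def candidate_uris(sia):
    """Return fetch URIs in preference order: RRDP first, then rsync.

    `sia` is a list of (access_method_oid, uri) pairs. Parsing them out
    of the certificate is out of scope for this sketch.
    """
    rrdp = [uri for oid, uri in sia
            if oid == ID_AD_RPKI_NOTIFY and uri.startswith("https://")]
    rsync = [uri for oid, uri in sia
             if oid == ID_AD_CA_REPOSITORY and uri.startswith("rsync://")]
    # Both lists are entry points the CA itself signed off on; RRDP is
    # merely tried first.
    return rrdp + rsync


# Hypothetical SIA contents for a CA that lists both transports.
example_sia = [
    (ID_AD_CA_REPOSITORY, "rsync://rpki.example.net/repo/ca/"),
    (ID_AD_RPKI_NOTIFY, "https://rrdp.example.net/notification.xml"),
]
print(candidate_uris(example_sia))
```

A CA that lists only an rsync URI simply yields a one-element list, so
the same code path covers repositories that never offered RRDP.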

The ability to publish routing intentions, and for others to honor the
intentions of the CA is what RPKI is all about. When the CA says
delegated RPKI data is available at both an RSYNC URI and an RRDP URI,
both are valid network entrypoints to the publication point. The
resource holder's X.509 signature even is on those 'reference to there'
directions (URIs)! :-)

If I can make a small suggestion: make 0.8.1 fall back to rsync after
waiting an hour or so (meanwhile polling to see if the RRDP service
is restored). This way the network operator takes advantage of both
transport protocols, whichever is available, with a clear preference to
try RRDP first, then eventually rsync.
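
The suggestion above could be sketched roughly as follows. This is an
illustration only, not Routinator's code: the one-hour grace period is
the "hour or so" from the paragraph above, and the names RepoState and
choose_transport are made up for this example.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: "an hour or so" of RRDP unavailability before using rsync.
RRDP_GRACE_SECONDS = 3600


@dataclass
class RepoState:
    # Time of the first failed RRDP attempt in the current outage,
    # or None while RRDP is healthy.
    rrdp_failing_since: Optional[float] = None


def choose_transport(state: RepoState, rrdp_ok: bool, now: float) -> str:
    """Decide what to use for this validation run.

    RRDP is attempted on every run (that is the polling). `rrdp_ok` is
    the outcome of that attempt, `now` a timestamp in seconds.
    """
    if rrdp_ok:
        state.rrdp_failing_since = None   # outage over, prefer RRDP again
        return "rrdp"
    if state.rrdp_failing_since is None:
        state.rrdp_failing_since = now    # outage starts now
    if now - state.rrdp_failing_since >= RRDP_GRACE_SECONDS:
        return "rsync"                    # grace period expired
    return "cached"                       # keep the last good data for now


state = RepoState()
for t, ok in [(0, True), (100, False), (1900, False), (3700, False), (3800, True)]:
    print(t, choose_transport(state, ok, t))
```

Because a successful RRDP fetch clears the timer, a repository that
flaps between working and broken RRDP keeps preferring RRDP and only
reaches rsync after a sustained outage.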

RPKI was designed in such a way that it can be transported even over
printed paper, usb stick, bluetooth, vinyl, rsync, and also https (as
rrdp). Because RPKI data is signed using the X.509 framework, the
transportation method really is irrelevant. IP holders can publish RPKI
data via horse + cart, and still make productive use of it!

Routinator's behavior is not RFC compliant, and has tangible effects in
the default-free zone.

Regards,

Job

Job Snijders