Paul,

On May 29, 2012, at 8:44 PM, Paul Vixie wrote:
On 2012-05-29 5:37 PM, Richard Barnes wrote:
I agree with the person higher up the thread that ROVER seems like just another distribution mechanism for what is essentially RPKI data.
noting that the up-thread person also said "i haven't studied this in detail so i'm probably wrong."
But does that distribution method easily allow you to get the full set of available data? From what little I know, it seems to me that ROVER is optimized for point queries, rather than bulk data access. Which is the opposite of making it easy to get full data :)
that's close to the problem but it is not the problem.
RPKI is a catalogue. it's possible to fetch all of the data you could need, before starting what's basically the "batch job" of computing the filters you will use at BGP-reception-time to either accept or ignore an incoming route. if your "fetch and recompute" steps don't work, then you'll have to continue filtering using stale data. if that data becomes too stale you're likely to have to turn off the filtering until you can resynchronize.
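The "batch job" model described above can be sketched in a few lines. This is purely illustrative (the class, field names, and staleness policy are assumptions, not a real RPKI validator API): fetch a catalogue of ROA-like entries, precompute a filter, decide routes at reception time from local data only, and track when the data becomes too stale to trust.

```python
# Illustrative sketch only -- not a real RPKI API. Entries are
# (prefix, max_length, origin_as) tuples, as in a ROA.
import ipaddress
import time

MAX_STALENESS = 24 * 3600  # assumed policy: tolerate one day of stale data

class RouteFilter:
    def __init__(self):
        self.roas = []          # list of (network, max_len, origin_as)
        self.fetched_at = None  # when the catalogue was last refreshed

    def refresh(self, catalogue):
        """The periodic 'fetch and recompute' step."""
        self.roas = [(ipaddress.ip_network(p), m, a) for p, m, a in catalogue]
        self.fetched_at = time.time()

    def usable(self):
        """Past the staleness window, filtering would have to be turned off."""
        return (self.fetched_at is not None
                and time.time() - self.fetched_at < MAX_STALENESS)

    def accept(self, prefix, origin_as):
        """Decide at BGP-reception time using only precomputed data."""
        net = ipaddress.ip_network(prefix)
        covered = False
        for roa_net, max_len, asn in self.roas:
            if net.subnet_of(roa_net) and net.prefixlen <= max_len:
                covered = True
                if asn == origin_as:
                    return True   # a covering entry authorizes this origin
        return not covered        # uncovered ("unknown") -> accept, by policy

f = RouteFilter()
f.refresh([("192.0.2.0/24", 24, 64496)])
print(f.accept("192.0.2.0/24", 64496))      # True: authorized origin
print(f.accept("192.0.2.0/24", 64511))      # False: covered, wrong AS
print(f.accept("198.51.100.0/24", 64500))   # True: no covering entry
```

The key property is that `accept()` never performs network I/O: every route decision is made against data fetched earlier, so a fetch outage degrades into staleness rather than an inability to decide.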
ROVER is not a catalogue. it's impossible to know what data you could need to precompute any route filters, and it's impossible to know what 'all possible rover data' is -- in fact that would be a non sequitur. you could, i suppose, query for every possible netblock (of every possible size), but that's an awful lot of queries, and you'd have to do it every day in order to see new stuff or to know when to forget old stuff.
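To put a number on "an awful lot of queries": counting just the IPv4 prefixes with lengths /8 through /24 (a commonly routed range -- the bounds here are an assumption for illustration, not anything ROVER specifies) already gives tens of millions of distinct point queries per sweep.

```python
# Number of distinct IPv4 prefixes of length /8 through /24:
# there are 2**l prefixes of length l, so sum them.
queries = sum(2 ** l for l in range(8, 25))
print(queries)  # 33554176 -- over 33 million queries per full sweep
```

And that sweep would have to be repeated on every refresh interval just to notice newly published or withdrawn records, which is exactly the enumeration problem a catalogue avoids.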
the problem is in time-domain bounding of data validity and data reachability. ROVER expects you to be able to query for the information about a route at the time you receive that route. that's point-in-time validity and reachability, which you might not have depending on where the DNS servers are and what order you're receiving routes in. RPKI+ROA expects you to have periodic but complete access to a catalogue, and then your future use of the data you fetched carries only the risk of staleness or invalidity, but never a reachability risk.
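The contrast can be made concrete with a toy comparison (entirely illustrative; neither function reflects a real protocol API): the point-query model cannot answer at all when its server is unreachable at route-arrival time, while the catalogue model answers from earlier-fetched data and risks only staleness.

```python
# Toy contrast of the two validation models under an outage.
def validate_rover(prefix, origin_as, dns_reachable):
    """Point-in-time model: must reach a DNS server when the route arrives."""
    if not dns_reachable:
        return "unknown"   # lookup fails at exactly the moment it's needed
    return "valid"         # (the real answer would come from the DNS data)

def validate_rpki(prefix, origin_as, cache, cache_age, max_age=86400):
    """Catalogue model: consults data fetched earlier; only risk is staleness."""
    if cache_age > max_age:
        return "stale"
    return "valid" if cache.get(prefix) == origin_as else "invalid"

cache = {"192.0.2.0/24": 64496}
print(validate_rover("192.0.2.0/24", 64496, dns_reachable=False))  # unknown
print(validate_rpki("192.0.2.0/24", 64496, cache, cache_age=600))  # valid
```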
as others have stated, there is no reference collection of bad ideas. otherwise we would have written this one up in 1996 when a couple of dns people looked at the routing system and said 'hey what about something like [ROVER]?' and the routing people explained in detail why it wouldn't work.
Just one correction to the above. As pointed out in Section 4 of draft-gersch-grow-revdns-bgp-00, "near-real-time route origin verification" is merely one instantiation of the ROVER concept. Please refer to that section for other potential uses of such published data.

I would also ask people to expand their thinking beyond "it must have a (near-)real-time mechanism directly coupled to the control plane," for a variety of reasons. Such a tight coupling of /any/ two systems will inevitably, and unfortunately, fail at scale in ways that likely would never have been predicted a priori [1] -- you only learn this lesson afterward. The question is: how quickly can you detect and react to unwind the specific problem without having to turn the system off completely (i.e., "shields down")? Alternatively, is it more prudent to engineer some sanity safeguards -- that is, back away from the "real-time" aspect -- to avoid this from happening at all, or at least make it extremely rare?

-shane

[1] FWIW, Dave Meyer has been doing some thinking about "complexity" for a while now, drawing analogies from outside of networking/computing, and has some fascinating insight. I'm sure if there's enough interest, he'd be willing to discuss it. Who knows, we may even learn something. :-)