Re: Follow up to previous post regarding SAAVIS

13 Aug 2009

On Wed, Aug 12, 2009 at 10:06:49PM -0400, Joe Provo wrote:
...
On Wed, Aug 12, 2009 at 08:16:38PM -0500, Richard A Steenbergen wrote:
[snip]
...
Unfortunately the distributed nature of the databases is one of the 
biggest problems with the IRR system. Anyone can run an irrd, there is
You misspelled "largest strength".  Folks get to choose which registries
to believe in what order, not required to have a single [politicized]
entity.
Well, actually, no. I'm not aware of any mechanism under which you can
effectively choose who to believe and in what order, nor do I think that
it would make any real difference in the long run even if you could.

IRR database mirroring is like being a tier 1, you have to peer with
every other database out there in order to obtain a full view. That
means there are two ways you can get access to all the data you need, 
you can either query someone else who maintains all those mirrors, or 
you can run your own irr db and do all the mirroring yourself. RADB is 
the de facto "main db" in the IRR system, it not only originates the 
vast majority of the routes but it maintains the most comprehensive 
mirroring of every other active IRR db. RADB currently tracks a total of 
32 databases:

http://www.radb.net/mirrorlist.html

At the end of the day the results are the same, whether the data is in
your local irrd or someone else's db (like RADB). You query them via the
same mechanism, which has extremely limited source selection control. 
Basically the only thing you have control over are the list of IRR
databases which are searched, but the results which are returned are a
superset of all databases which you selected to search. You don't get to
say "only listen to results from LEVEL3's db for this object unless they
don't have results there, in which case you listen to SAVVIS" or
anything like that. You could query the complete data for every
individual route yourself, but this would be a massively difficult
undertaking compared to the normal query operation. The normal query
operation is by no stretch of the imagination easy either, querying a
large ISP can take hours or more depending on the transaction latency
between yourself and the server you're querying. In fact this is one of
the reasons why querying data from RIPE is such a pain, their query
language lacks a recursive server-side expansion mechanism, so the
transaction latency turns querying a large AS-SET into a multi-hour or
even day-long operation. Their whois daemon also has an obnoxious "feature"
of forcefully closing the socket after 3 minutes, even if it's in the
middle of returning an answer. This is the only real advantage of
running your own local irrd, reducing the transaction latency, but it's
still a lot of work to maintain the mirror, verify that all the data
from all the sources is always importing correctly, etc. And even after
you do all this, what does being able to pick a data source order buy
you anyways? Do you think you win something by preferring say RADB over
LEVEL3 over SAVVIS over ARIN over RIPE over...? You have no real idea
where your customers are keeping their records, or their customers,
etc. Where do you draw the line on whose data you look at, and why do we 
need yet another system where people are left to make a judgement call 
over whose data they should trust?
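
To make the superset-versus-preference distinction concrete, here is a
minimal Python sketch. It is not something irrd or any whois server
offers: the per-source route sets and both function names are made up
for illustration. superset_lookup() mimics what a normal query gives you
(the union of every selected source), while preferred_lookup() is the
hypothetical "LEVEL3 first, SAVVIS only if LEVEL3 has nothing" behaviour.

```python
from typing import Dict, List, Set

# Made-up per-source data for one object; real IRR contents will differ.
RECORDS: Dict[str, Set[str]] = {
    "LEVEL3": {"192.0.2.0/24"},
    "SAVVIS": {"192.0.2.0/24", "198.51.100.0/24"},
    "RADB": {"203.0.113.0/24"},
}


def superset_lookup(per_source: Dict[str, Set[str]], sources: List[str]) -> Set[str]:
    """Union of the results from every selected source (today's behaviour)."""
    result: Set[str] = set()
    for src in sources:
        result |= per_source.get(src, set())
    return result


def preferred_lookup(per_source: Dict[str, Set[str]], preference: List[str]) -> Set[str]:
    """Hypothetical: the first source in preference order with any data wins."""
    for src in preference:
        routes = per_source.get(src)
        if routes:
            return set(routes)
    return set()


if __name__ == "__main__":
    print(sorted(superset_lookup(RECORDS, ["LEVEL3", "SAVVIS", "RADB"])))
    print(sorted(preferred_lookup(RECORDS, ["LEVEL3", "SAVVIS"])))
```

With the sample data the union contains all three prefixes, while the
preference-ordered lookup stops at LEVEL3 and never sees the extra
SAVVIS route. Whether that is an improvement is exactly the judgement
call questioned above.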

Personally I'm of the belief that every ISP running their own IRR db is
a very bad idea, which is why I have chosen not to run one myself. To
quote Vijay, it doesn't scale. The last thing the already very broken
IRR system needs is more crap data by more random people spread out over
more databases, and the majority of the current db's probably need to be
shut down too. There is no reason that this process needs to be
politicized, or cost anyone any money to use. Again, we've made a
horrible system here.

One reasonable solution is to have the server side run the complete 
query off its local database, and pass the complete results for a 
prefix-list back to the querier in a single transaction. This is how 
filtergen.level3.com works, though I personally find their system to be 
excessively slow. In IRRPT 2.0 development I'm writing a similar type of 
remote filter generator, which I hope will be useful to some people.
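
A rough Python sketch of why running the expansion next to the data
helps. This is not how filtergen.level3.com or IRRPT is implemented; the
in-memory tables, the AS numbers and the function names are all
assumptions for illustration. expand() walks an as-set recursively
against local data and counts the lookups it performs, which is the
number of round trips a client would pay if it had to issue each lookup
over the network itself.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical stand-in for a local IRR database (documentation ASNs/prefixes).
AS_SETS: Dict[str, List[str]] = {
    "AS-EXAMPLE": ["AS64496", "AS-CUSTOMERS"],
    "AS-CUSTOMERS": ["AS64497", "AS64498", "AS-EXAMPLE"],  # deliberate loop
}
ROUTES: Dict[str, List[str]] = {
    "AS64496": ["192.0.2.0/24"],
    "AS64497": ["198.51.100.0/24"],
    "AS64498": ["203.0.113.0/24"],
}


def expand(as_set: str) -> Tuple[List[str], int]:
    """Return (sorted prefixes, number of lookups) for a recursive expansion.

    Simplification: anything starting with "AS-" is treated as an as-set;
    hierarchical names such as AS64496:AS-FOO are not handled here.
    """
    seen: Set[str] = set()
    prefixes: Set[str] = set()
    lookups = 0
    stack = [as_set]
    while stack:
        name = stack.pop()
        if name in seen:  # skip repeats, which also guards against loops
            continue
        seen.add(name)
        lookups += 1
        if name.startswith("AS-"):
            stack.extend(AS_SETS.get(name, []))
        else:
            prefixes.update(ROUTES.get(name, []))
    return sorted(prefixes), lookups


def client_side_cost_ms(lookups: int, rtt_ms: int) -> int:
    """Latency paid if every lookup is its own round trip to a remote server."""
    return lookups * rtt_ms


if __name__ == "__main__":
    prefix_list, n = expand("AS-EXAMPLE")
    print(prefix_list)
    print(n, "lookups;", client_side_cost_ms(n, 150), "ms at 150 ms RTT")
```

Done server-side, the whole walk costs the querier one round trip no
matter how many lookups it takes; done client-side, the cost grows with
every member of every nested set.
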
...
If folks mistakenly believe there is a 1:1 correspondence between IRR
data and BGP tables, they will lose.  The IRR data is more of a
"flight plan", a set of what-is-possible per the originator of the
data.
[snip]
...
people query and boom you're in the system. What tends to happen is 
someone puts a route into a database and then completely forgets about 
it, so there are a huge number of completely bogus routes out there 
which are never going to get cleaned up.
Lots of folks set up systems for provisioning without deprovisioning. 
This is not an IRR problem, but a sloppy-human problem.  Folks that 
get stuck with provisioning generally aren't incented to remove billable 
resources.  CF good processes and management with backbone.
There is plenty of motivation to add data to IRR to make your
announcements work, but no motivation at all to remove data when it is
no longer needed. Nobody sees a problem with this until you step back 
and realize that a lot of networks have IRR records so sloppy that they 
list nearly every route on the Internet. Why bother filtering at all 
then?

I think if it was as simple as seeing a list of your routes (or
customers in your as-set, etc) and having a checkbox to delete old data,
people would be more reasonable about maintaining it. RPSL is scary and
confusing to a lot of people (and it should probably be scary to
everyone at any rate :P), there is no reason it needs to be like this.
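
As a sketch of what such a "list my routes and let me tick the old ones"
view might be built on, the Python below picks out route objects whose
newest "changed:" date is older than a cutoff. The sample objects, the
maintainer name and the function names are invented for illustration,
and age alone does not prove a route is bogus; this only produces
candidates for a human to review.

```python
from datetime import date
from typing import List, Optional, Tuple

# Invented RPSL route objects; real data would come from a registry dump.
SAMPLE = """\
route:   192.0.2.0/24
origin:  AS64496
mnt-by:  MAINT-EXAMPLE
changed: noc@example.net 20040317
source:  RADB

route:   198.51.100.0/24
origin:  AS64496
mnt-by:  MAINT-EXAMPLE
changed: noc@example.net 20090601
source:  RADB
"""


def last_changed(obj: str) -> Optional[Tuple[str, date]]:
    """Return (route, newest changed date) for one object, or None if missing."""
    route: Optional[str] = None
    newest: Optional[date] = None
    for line in obj.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip().lower(), value.strip()
        if key == "route":
            route = value
        elif key == "changed" and value:
            stamp = value.split()[-1]  # expects a trailing YYYYMMDD date
            if len(stamp) == 8 and stamp.isdigit():
                when = date(int(stamp[:4]), int(stamp[4:6]), int(stamp[6:]))
                if newest is None or when > newest:
                    newest = when
    if route is None or newest is None:
        return None
    return route, newest


def review_candidates(text: str, cutoff: date) -> List[str]:
    """Routes whose most recent change predates the cutoff."""
    stale = []
    for obj in text.strip().split("\n\n"):
        parsed = last_changed(obj)
        if parsed is not None and parsed[1] < cutoff:
            stale.append(parsed[0])
    return stale


if __name__ == "__main__":
    print(review_candidates(SAMPLE, date(2008, 1, 1)))
```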

-- 
Richard A Steenbergen <ras@e-gerbil.net>       http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)