Re: Science vs. bullshit

19 Oct 2009

      Randy's right that it can be somewhat difficult to agree on a single
methodology for generating accurate assessments of how many transit
providers a particular network uses at a particular moment in time.
There are at least two knobs to turn: how long you integrate updates (we
like to use at least 24 hours of continuous time in order to flush out
backup routes, but it's sensible to look for weeks or longer to get the real
rarities to show their heads), and how much peer diversity you require in
order to call a provider relationship 'globally visible transit' for a given
prefix (I used 50+ peers as a rough rule of thumb, but you can pick
lower/higher numbers and get arguably meaningful answers).   It's like
asking, "how big is the global routing table .. REALLY?"   Depends on how
you count.
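
To make the two knobs concrete, here is a minimal sketch in Python. It is an illustration only and is not the code Renesys runs. The record layout (timestamp, peer, prefix, AS path), the function name `transit_providers`, and the parameter names are all assumptions made up for this example. It also treats the AS adjacent to the origin as a candidate provider, which is a simplification: that neighbor could just as well be a peer.

```python
from collections import defaultdict

def transit_providers(updates, window_start, window_end, min_peers=50):
    """Return {prefix: set of upstream ASNs} for upstreams that at least
    min_peers distinct peers reported inside [window_start, window_end).

    updates is an iterable of (timestamp, peer, prefix, as_path) tuples;
    this layout is hypothetical. as_path is a list of ASNs, origin last.
    """
    peers_seen = defaultdict(set)  # (prefix, upstream ASN) -> peers that saw it
    for ts, peer, prefix, as_path in updates:
        # Knob 1: the integration window.
        if not (window_start <= ts < window_end):
            continue
        # Collapse AS-path prepending so repeated ASNs count once.
        path = [asn for i, asn in enumerate(as_path)
                if i == 0 or asn != as_path[i - 1]]
        if len(path) < 2:
            continue  # no upstream visible in this path
        upstream = path[-2]  # AS adjacent to the origin (assumed provider)
        peers_seen[(prefix, upstream)].add(peer)

    # Knob 2: the peer-diversity threshold.
    result = defaultdict(set)
    for (prefix, upstream), peers in peers_seen.items():
        if len(peers) >= min_peers:
            result[prefix].add(upstream)
    return dict(result)
```

Widening the window or lowering `min_peers` can only add providers to a prefix's set, never remove them, which is one way to see why the answer depends on how you count.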

The thing about the data I presented, however, is that it is _differential_
... it says "set your knobs, look at four days over four years, and let's
see if the migration among populations seems consistent."   In fact, the
recurrence is pretty stable -- the same percentage of people in "diversity
class X" tend to end up in "diversity class Y" twelve months later, over
multiple years, with small changes that we can identify as trends.  This
gives confidence that the knobs are set in such a way that they are
achieving some meaningful classification of the prefix population.
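
The differential comparison can be sketched the same way. This is a hedged illustration, not the actual analysis: the function name `transition_rates` and the input layout (a dict from prefix to diversity class for each snapshot) are assumptions. It computes, for each starting class, the fraction of prefixes that land in each class in the later snapshot.

```python
from collections import Counter, defaultdict

def transition_rates(before, after):
    """Return {class_x: {class_y: fraction}} for prefixes present in both
    snapshots. before and after map prefix -> diversity class
    (a hypothetical layout, e.g. the number of visible providers).
    """
    counts = defaultdict(Counter)
    for prefix, cls_before in before.items():
        if prefix in after:  # skip prefixes missing from the later snapshot
            counts[cls_before][after[prefix]] += 1

    rates = {}
    for cls_before, row in counts.items():
        total = sum(row.values())
        rates[cls_before] = {cls_after: n / total
                             for cls_after, n in row.items()}
    return rates
```

Running this on consecutive yearly snapshots and comparing the resulting tables is the stability check described above: if the rows barely move from year to year, the knob settings are at least classifying consistently.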

To Patrick's point, the shape of the curve tells us useful things, even if
the precise boundaries among diversity classes can be drawn in subtly
different ways.

And that's exactly why we look for techniques that can give information
about trends (for example, my point that some dual-homed ASNs appear to be
postponing their decision to attain higher degrees of multihoming) even in
the presence of some classification uncertainty at the single-prefix level.

I'm glad to have sparked so much excitement with a 10-minute talk.   Imagine
if I had dragged it out to 30 minutes!

cheers, ---jim

On Mon, Oct 19, 2009 at 4:05 PM, Patrick W. Gilmore <patrick@ianai.net> wrote:
...
Lightning talk followup because I want to make sure there was not a
miscommunication.  A two sentence comment at the mic while 400+ of your
not-so-close friends are watching does not a rational discussion make.
The talk in question:
<
http://nanog.org/meetings/nanog47/presentations/Lightning/Cowie_Recession_li...
...
The disagreement is whether Renesys can reliably find out how many transit
providers an AS has.  Remember, we are discussing transit providers here,
not peers.
My point is if an AS has _transit_, then it must be visible in the global
table (assuming a reasonably large set of vantage points), or it would not
be transit.  Of course, this is not perfect, but it is a pretty close
approximation for fitting curves over 10s of 1000s of ASes.  So cases like
"I have two transit providers, and one buys transit from the other" are
few in number and not relevant to fitting curves.  (It also means you are an
idiot, or in a corner of the Internet where you should probably be
considered as having only one provider.)
Majdi has pointed out other corner cases where transit is not viewable
through systems like Renesys.  For instance, announcing prefixes to Provider
2 with a community to local-pref the announcement below peer routes.  That
means only one transit is visible in BGP data.
There were several reasons some of us did not think edge cases like this
were important.  For instance, Renesys keeps -every- update ever, so if
Provider 1 ever flaps, Renesys will see Provider 2.  Also, when looking for
the number of providers, a "backup path" may not be relevant since no
packets take that path.
More importantly, I thought the point of the talk was to show that the
table was growing during the recession and people were still getting more
providers.  The result is a curve, not a hard-and-fast number.  Corner cases
like the one above are barely noise, so the curve is still valid.
It is true that finding peering edges with things like route-views is
problematic at best, so finding ASes with one transit plus peering might be
problematic.  But since I do not think that was the point of the talk, I do
not consider that problem.
If anyone who still thinks the problems with finding transit edges somehow
make the talk 'bullshit' could clarify their position, I would be grateful.
--
TTFN,
patrick

Jim Cowie