Maciej, I work at one of the research institutions connected to Abilene and have some understanding of how various institutions handle their I2 connections.
Of course, in order to perform this kind of study I need a way to distinguish between these two worlds. I've learnt that Abilene does not provide commercial connectivity. This means that BGP prefixes and AS paths announced by Abilene BGP routers should lead only to research and academic destinations.
I think someone already pointed this out, but I'll just underscore the fact that _MOST_ of the ASes connected to Abilene are research and academic institutions. However, a lot of others are connected for various reasons: for example, Akamai and Microsoft, government facilities, a few non-profits, and some companies that provide support and equipment for the Abilene project.
I have extracted (from the BGP tables at http://abilene.internet2.edu/observatory) a list of all such destinations and obtained 1333 ASes (for data from July 2006). The number looks reasonable, but I would like to be sure that I am not making a mistake. Therefore I would be grateful if you could answer the following questions:
I know that we provide a BGP feed over to Renesys which includes our I2 routing tables, so they may also be able to help you understand how Abilene and the commodity Internet map together.
1) Is this approach to obtaining a list of research and academic ISPs correct?
See my comments above, but I would suggest that pretty much all research and academic locations are connected to Abilene, while the converse isn't true: not everything connected to Abilene is a research/academic entity.
2) Do you maybe know of such lists compiled before?
3) If I keep not only the destination ASes, but also all ASes on the AS paths towards these destinations, I obtain a list of about 1400 ASes. How should I understand this? Does it mean that some research and academic destinations are reachable from Abilene only by traversing the commercial Internet?
Abilene is a very flat hierarchy from the AS perspective. Institutions tend either to connect directly to it or to connect into aggregation points/GigaPops, which themselves connect directly to Abilene. I'm guessing that you only added 90 ASes when you expanded your list to all ASes along the Abilene paths because those are the few aggregation points and similar items. For example, in the northeastern US, most of the institutions aggregate their traffic into a consortium called the Northern Crossroads (AS 10571), which peers directly with Abilene. Some of the members are themselves consortiums (e.g. OSHEAN - http://www.oshean.org), so that adds in another AS. In both cases, the ASes are used mostly to share links, bandwidth, and support costs and are not really ISPs in the way that you are probably thinking about.
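If it helps, here is a rough sketch of how you could pull out exactly those extra ASes (the ones that appear on a path but never as the destination) and eyeball whether they are aggregation points. It assumes you have already reduced the BGP table to one space-separated AS path string per route; the function name and the sample paths are made up for illustration (apart from AS 10571, the numbers come from the range reserved for documentation), so treat it as a starting point rather than a finished tool.

```python
def split_origin_and_transit(as_paths):
    """Return (origin ASes, ASes seen on paths but never as an origin).

    Each path is a space-separated string of AS numbers, with the
    origin (destination) AS last. Non-numeric tokens such as AS_SET
    braces are skipped in this simplified sketch.
    """
    origins = set()
    on_path = set()
    for path in as_paths:
        hops = [int(tok) for tok in path.split() if tok.isdigit()]
        if not hops:
            continue  # empty or unparsable path
        origins.add(hops[-1])   # last hop originates the prefix
        on_path.update(hops)    # every AS seen anywhere on the path
    return origins, on_path - origins


# Hypothetical sample data, not taken from the real Abilene tables.
SAMPLE_PATHS = [
    "10571 64496",
    "10571 64497 64498",
    "64499",
    "10571 64496 64496",  # prepending does not change the result
]

origins, transit_only = split_origin_and_transit(SAMPLE_PATHS)
print(sorted(origins))
print(sorted(transit_only))
```

The second set is the one to look at: if nearly everything in it is a GigaPop or regional consortium, that would support the flat-hierarchy explanation.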
4) Of course, research and academic ASes are often well connected to the commercial Internet. My guess is that in most cases their peering relationship is customer-provider, where commercial ASes are providers. Is it possible that an academic AS is a provider for some commercial ASes? If so, does it happen often?
I think that the short answer is yes, but I'm not really sure how prevalent that is. I know that NoX has peering relationships with a few local ISPs and uses these to provide transit for consortium members. These are, however, peering rather than transit relationships. With that said, some of the NoX members are consortiums themselves and have both Abilene and commodity connections for their member institutions.

I hope this helps. If not, or if you have some more questions, drop me a note.

Eric :)