Dear All,
Thank you very much for the many quick replies
to my email. I must say that the NANOG list is highly responsive.
I needed some time to digest your comments and try
some new ideas. I am sharing the preliminary results with you now and would
welcome further comments.
The problem was (and still is) to find a good
heuristic to distinguish between commercial (COM) and educational/research/academic
(EDU) ASes.
*EDU_Abilene*
My first approach (see my original email) was to
extract a list of all destinations announced by
*EDU_description*
Some of you suggested looking at the names and
descriptions of ASes. I used the AS list available at:
http://www.multicasttech.com/status/asn_expand.txt
and searched the last column
("Organization") for the following strings:
"Universit|Univerz|Universida|research|education|science|scientif|academic|college|institut|laborator|school|ecole|edu|R&D|library|academy|Etudes"
This approach finds 1796 "educational"
ASes; call this set "EDU_description".
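A minimal sketch of this kind of string search, in Python. The input format is an assumption: `records` is a hypothetical iterable of (AS number, organization text) pairs, since the layout of asn_expand.txt is not described here. Case-insensitive matching is also an assumption. The AS numbers in the sample come from the range reserved for documentation.

```python
import re

# Pattern copied from the strings listed above. Case-insensitive matching
# is an assumption; the email does not say how case was handled.
EDU_PATTERN = re.compile(
    r"Universit|Univerz|Universida|research|education|science|scientif|"
    r"academic|college|institut|laborator|school|ecole|edu|R&D|library|"
    r"academy|Etudes",
    re.IGNORECASE,
)


def edu_by_description(records):
    """Return the set of AS numbers whose organization text matches.

    `records` is a hypothetical iterable of (asn, organization) pairs.
    """
    return {asn for asn, org in records if EDU_PATTERN.search(org)}


# Made-up sample records, for illustration only.
sample = [
    (64496, "Example University"),
    (64497, "Example Telecom Inc."),
    (64498, "Reduced Rates Hosting"),  # contains the substring "edu"
]
print(sorted(edu_by_description(sample)))
```

The third sample record shows one way a short string such as "edu" can match an unrelated name, which is relevant when judging whether the search strings are appropriate.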
Of course, these two lists overlap, but less than I
expected. In particular:
len(EDU_Abilene)=1333
len(EDU_description)=1796
union(EDU_Abilene, EDU_description)=2269
intersection(EDU_Abilene, EDU_description)=860
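The four numbers above are consistent with each other: 1333 + 1796 - 860 = 2269. A small Python sketch of how such overlap figures could be computed from two AS lists follows. The function name and the "jaccard" ratio (intersection size divided by union size) are illustrative additions, not something used in the analysis above.

```python
def overlap_stats(list_a, list_b):
    """Return sizes of two AS lists, their union and their intersection.

    The 'jaccard' entry (intersection / union) is an added illustrative
    measure of how much the two lists agree.
    """
    a, b = set(list_a), set(list_b)
    union = a | b
    inter = a & b
    return {
        "len_a": len(a),
        "len_b": len(b),
        "union": len(union),
        "intersection": len(inter),
        "jaccard": len(inter) / len(union) if union else 0.0,
    }


# Sanity check of the reported figures via inclusion-exclusion.
assert 1333 + 1796 - 860 == 2269

# Made-up example with documentation-range AS numbers.
print(overlap_stats([64496, 64497, 64498], [64498, 64499]))
```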
For many reasons, these lists are far from
precise. For instance, EDU_Abilene contains AS 7132 (AT&T) and AS 8075 (Microsoft).
Therefore I need further data sets or a filtering methodology. This raises some
questions:
1) What other EDU networks (preferably with BGP
tables available on the web) can I take as examples of ASes that (generally) do
not announce commercial prefixes? Based on them I could construct lists similar
in spirit to EDU_Abilene. I guess the more, the better.
2) Do you know of other lists similar to http://www.multicasttech.com/status/asn_expand.txt
? Maybe a longer description or a website related to each AS would help the
method I use to create EDU_description. Do you think the strings I use in my
search are appropriate?
*AS relationships*
Another approach is to exploit the AS relationships. Most
of you agree that EDU ASes are usually not providers for COM customers. This
suggests a way to detect false positives in EDU_Abilene and EDU_description (or
in their union): for every EDU node, count how many COM customers it has, i.e.,
the number of EDU provider --- COM customer relationships. I used the AS graphs
with inferred relationships provided by CAIDA (http://as-rank.caida.org/data/2006/).
This method works well for finding good candidates for false positives, but they
should not be accepted blindly. For instance, AS 7132 (AT&T) has the highest
number of COM customers (615) and should obviously belong to COM (it is a
member of EDU_Abilene). In contrast, a big component of the EDU backbone, AS
11537 (
3) What other “automatic” or “manual”
approaches would you suggest? Or improvements to the ones just described?
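The customer-counting check from the AS relationships section can be sketched in Python as follows. The input format is an assumption: `provider_customer_pairs` is a hypothetical iterable of (provider AS, customer AS) pairs already extracted from the inferred-relationship data, and the file parsing is deliberately left out. Every AS outside the candidate EDU set is treated as COM, which is a simplification.

```python
from collections import Counter


def com_customer_counts(edu_ases, provider_customer_pairs):
    """Rank candidate EDU ASes by their number of non-EDU (COM) customers.

    `provider_customer_pairs` is a hypothetical iterable of
    (provider_asn, customer_asn) pairs. Any AS not in `edu_ases`
    is treated as COM (a simplifying assumption).
    """
    edu = set(edu_ases)
    counts = Counter()
    for provider, customer in provider_customer_pairs:
        if provider in edu and customer not in edu:
            counts[provider] += 1
    # Highest counts first: these are candidates for false positives,
    # to be reviewed by hand rather than removed automatically.
    return counts.most_common()


# Made-up example with documentation-range AS numbers.
edu_candidates = {64496, 64497}
pairs = [
    (64496, 64500),  # EDU provider, COM customer
    (64496, 64501),  # EDU provider, COM customer
    (64497, 64496),  # EDU provider, EDU customer: not counted
    (64510, 64497),  # COM provider: not counted
]
print(com_customer_counts(edu_candidates, pairs))
```

Only ASes with at least one COM customer appear in the result, so candidate EDU ASes that are absent from the ranking have none in the given data.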
I will appreciate even the briefest comments and
suggestions,
Maciej Kurant
From: Maciej Kurant [mailto:maciej.kurant@epfl.ch]
Sent: Wednesday, 15 November 2006 18:46
To: '
Subject: How to get a list of research and academic ISP ?
Dear all,
I am a PhD student at EPFL,
Of course, in order to perform this kind of study I
need a way to distinguish between these two worlds. I’ve learnt that
1) Is this approach to obtaining a list of research and academic ISPs correct?
2) Do you know of any such lists compiled before?
3) If I keep not only the destination ASes, but also all ASes on the AS paths towards
these destinations, I obtain a list of about 1400 ASes. How should I understand
this? Does it mean that some research and academic destinations are reachable
from
4) Of course, research and academic ASes are often well connected to the commercial
Internet. My guess is that in most cases their peering relationship is
“customer-provider”, where commercial ASes are providers. Is it
possible that an academic AS is a provider for some commercial ASes? If so,
does it happen often?
Thank you in advance for your comments.
Maciej Kurant
=============================================
EPFL IC ISC
Maciej Kurant
PhD Student
CH-1015 Lausanne,
Switzerland
web site: http://lcawww.epfl.ch/kurant
=============================================