Thread: How to get a list of research and academic ISP ?

2006-11-20 20:59:29
Hello;
On Nov 20, 2006, at 3:13 PM, Maciej Kurant wrote:
> Dear All,
>
> Thank you very much for the numerous and quick replies to my email. I
> must say that the nanog list is really highly responsive.
>
> I needed some time to digest your comments and try some new ideas.
> I share the preliminary results with you now, begging for further
> comments.
>
> The problem was (and still is) to find a good heuristic to
> distinguish between commercial (COM) and educational/research/
> academic (EDU) ASes.
>
I would suggest you need to think a little about what exactly you want:

- a list of _all_ academic ASNs? (that will be tough, and you will
  have to deal with corner cases, and you will not fully automate it)
- a list of _some_ academic ASNs? (you have that now - so are you
  worried about completeness or size or ...?)
- a list of _no_ academic ASNs? (again, this will be tough)

or something else?

Note, too, that these lists will change with time.

> *EDU_Abilene*
>
> My first approach (see my original email) was to extract a list of
> all destinations announced by Abilene. (The assumption is that
> Abilene generally does not announce commercial prefixes.) This
> results in a list, call it “EDU_Abilene”, of 1333 ASes.
>
> *EDU_description*
>
> Some of you suggested looking at the names and descriptions of
> ASes. I used the AS list available at:
>
> http://www.multicasttech.com/status/asn_expand.txt
>
> and searched the last column ("Organization") for the following
> strings:
>
> "Universit|Univerz|Universida|research|education|scienc|scientif|
> academic|college|institut|laborator|school|ecole|
> edu|R&D|library|academy|Etudes"
>
> This approach finds 1796 "educational" ASes, call this set
> “EDU_description”.
>
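The string search described above can be sketched in Python roughly as follows. This is only an illustration: the case-insensitive matching, the function name and the sample organization strings are assumptions, not details taken from the message or from asn_expand.txt. The last sample shows how a short substring such as "edu" can produce a false positive.

```python
import re

# The alternation quoted in the message, compiled as one pattern.
# Case-insensitive matching is an assumption; the message does not say.
EDU_PATTERN = re.compile(
    r"Universit|Univerz|Universida|research|education|scienc|scientif|"
    r"academic|college|institut|laborator|school|ecole|"
    r"edu|R&D|library|academy|Etudes",
    re.IGNORECASE,
)


def looks_educational(organization):
    """Return True if the organization string contains any EDU substring."""
    return EDU_PATTERN.search(organization) is not None


# Hypothetical entries keyed by documentation-range AS numbers.
samples = {
    64496: "Example State University",
    64497: "Example Telecom Inc.",
    64498: "Example Scheduling Software Ltd",  # matches "edu" inside "Scheduling"
}

edu_description = {asn for asn, org in samples.items() if looks_educational(org)}
print(sorted(edu_description))
```

Anchoring the shorter substrings on word boundaries (for example `\bedu\b`) would likely reduce such accidental matches, at the cost of missing some names.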
>
> Of course, these two lists overlap, but less than I
expected. In
> particular:
>
> len(EDU_Abilene)=1333
>
> len(EDU_description)=1796
>
> union(EDU_Abilene, EDU_description)=2269
>
> intersection(EDU_Abilene, EDU_description)=860
>
>
>
>
>
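As a quick consistency check on the four figures above, the union size should equal the sum of the two list sizes minus the intersection. A minimal sketch (the function names are illustrative only):

```python
def union_size(size_a, size_b, size_intersection):
    """Inclusion-exclusion: |A ∪ B| = |A| + |B| - |A ∩ B|."""
    return size_a + size_b - size_intersection


def overlap_fraction(size_list, size_intersection):
    """Fraction of one list that also appears in the other."""
    return size_intersection / size_list


# Figures quoted in the message.
len_abilene = 1333
len_description = 1796
len_intersection = 860

print(union_size(len_abilene, len_description, len_intersection))  # 2269
print(round(overlap_fraction(len_abilene, len_intersection), 2))
print(round(overlap_fraction(len_description, len_intersection), 2))
```

The reported union of 2269 agrees with this arithmetic, so the counts are at least internally consistent.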
> For many reasons, these lists are far from precise. For
> instance, EDU_Abilene contains AS 7132 (AT&T) and AS 8075
> (Microsoft). Therefore I need further data sets or a filtering
> methodology. This raises some questions:
>
> 1) What other EDU networks (preferably with BGP tables available on
> the web) can I take as examples of ASes that (generally) do not
> announce commercial prefixes? Based on them I could construct lists
> similar in spirit to EDU_Abilene. I guess, the more the better.
>
There are lots - look at the ones that Abilene peers with:

http://international.internet2.edu/partners/
http://abilene.internet2.edu/peernetworks/international.html

> 2) Do you know of other lists, similar to
> http://www.multicasttech.com/status/asn_expand.txt ? Maybe a longer
> description or a web site related to an AS would help the method I use
> to create EDU_description. Do you think the strings I use in my
> search are appropriate?
>
Try

http://bgp.potaroo.net/as1221/asnames.txt

Note that there are errors all over the place here; these lists will
not agree perfectly. My lists come from the rwhois data, but I correct
for obvious errors (some of which I have sent back to the list
maintainers). There are others, I am sure, that I have not caught, and
my corrections are undoubtedly not perfect. I am sure that the other
maintainers of such lists could tell similar tales.

You could start polling rwhois yourself, and I would in doubtful cases.

> *AS relationships*
>
> Another approach is to exploit the AS relationships. Most of you
> agree that usually EDU ASes are not providers for COM customers.
> This suggests a way to detect false positives in EDU_Abilene and
> EDU_description (or in their union). For every EDU node, check how
> many COM customers it has, i.e., count its EDU provider --- COM
> customer relationships. I used the AS graphs with inferred
> relationships provided by CAIDA (http://as-rank.caida.org/data/2006/).
> This method works well to find good candidates for false positives,
> but they should not be blindly accepted. For instance, AS 7132 (AT&T)
> has the highest number of COM customers (615) and should obviously
> belong to COM (it is a member of EDU_Abilene). In contrast, a big
> component of the EDU backbone, AS 11537 (Abilene), has 66 COM
> customers! In general, there are about 50 EDU nodes with more than
> 10 COM customers each.
>
Not a bad approach.
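
The customer-counting check described in the quoted text might look roughly like this in Python. It is a sketch under stated assumptions: the relationships are taken as already-parsed (provider, customer) pairs rather than read from the CAIDA files, every AS outside the EDU set is treated as COM, and the function names and sample AS numbers are hypothetical.

```python
from collections import Counter


def com_customer_counts(provider_customer_edges, edu_asns):
    """Count, for each EDU provider, its customers that are not in the EDU set.

    Assumption: any AS not in edu_asns is treated as commercial (COM).
    """
    counts = Counter()
    for provider, customer in set(provider_customer_edges):
        if provider in edu_asns and customer not in edu_asns:
            counts[provider] += 1
    return counts


def false_positive_candidates(counts, threshold=10):
    """EDU ASes with more than `threshold` COM customers, for manual review."""
    return sorted(asn for asn, n in counts.items() if n > threshold)


# Hypothetical toy data using documentation-range AS numbers.
edu = {64496, 64497}
edges = [(64496, 64500), (64496, 64501), (64497, 64496)]

counts = com_customer_counts(edges, edu)
print(false_positive_candidates(counts, threshold=1))
```

As the Abilene example in the quoted text shows, a high count only marks a candidate; the result still needs a manual look before an AS is moved from EDU to COM.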

> 3) What other “automatic” or “manual” approaches would you suggest?
> Or improvements to the ones just described?

Again, I don't know what you are trying to do. What I have found
useful is what you are doing - make lots of lists, cross-reference
them, and see what passes multiple tests.
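
One way to picture the cross-referencing idea is a simple vote across lists: an AS is kept only if enough independent lists contain it. The sketch below assumes each list is already available as a collection of AS numbers; the list names, the `min_votes` parameter and the sample values are hypothetical.

```python
from collections import Counter


def votes_per_asn(candidate_lists):
    """Count how many of the named lists contain each AS number."""
    votes = Counter()
    for asns in candidate_lists.values():
        votes.update(set(asns))  # set() so duplicates in one list count once
    return votes


def passing(candidate_lists, min_votes=2):
    """AS numbers that appear in at least `min_votes` of the lists."""
    votes = votes_per_asn(candidate_lists)
    return sorted(asn for asn, n in votes.items() if n >= min_votes)


# Hypothetical toy lists using documentation-range AS numbers.
lists = {
    "EDU_Abilene": [64496, 64497, 64498],
    "EDU_description": [64497, 64498, 64499],
    "EDU_other": [64498, 64500],
}

print(passing(lists, min_votes=2))
```

Raising `min_votes` trades completeness for precision, which ties back to the question of whether the goal is a list of all academic ASNs or only a reliable subset.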

> I will appreciate even the briefest comments and suggestions,
>
> Maciej Kurant

Hope this helps.

Regards
Marshall

> From: Maciej Kurant [mailto:maciej.kurant epfl.ch]
> Sent: Wednesday, 15 November 2006 18:46
> To: 'nanog merit.edu'
> Subject: How to get a list of research and academic ISP ?
>
> Dear all,
>
> I am a PhD student at EPFL, Switzerland. My recent research
> interest is in large-scale differences between the commercial and
> academic parts of the Internet.
>
> Of course, in order to perform this kind of study I need a way to
> distinguish between these two worlds. I’ve learnt that Abilene does
> not provide commercial connectivity. This means that BGP prefixes
> and AS paths announced by Abilene BGP routers should lead only to
> research and academic destinations. I have extracted (from the BGP
> tables at http://abilene.internet2.edu/observatory) a list of all
> such destinations and obtained 1333 ASes (for data from July 2006).
> The number looks reasonable, but I would like to be sure that I am
> not making a mistake. Therefore I would be grateful if you could
> answer the following questions:
>
> 1) Is this approach to obtaining a list of research and academic
> ISPs correct?
>
> 2) Do you know of any such lists compiled before?
>
> 3) If I keep not only the destination ASes, but also all ASes
> on the AS paths towards these destinations, I obtain a list of about
> 1400 ASes. How should I understand this? Does it mean that some
> research and academic destinations are reachable from Abilene only
> by traversing the commercial Internet?
>
> 4) Of course, research and academic ASes are often well
> connected to the commercial Internet. My guess is that in most
> cases their peering relationship is “customer-provider”, where
> commercial ASes are providers. Is it possible that an academic AS
> is a provider for some commercial ASes? If so, does it happen often?
>
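The difference between the two counts in question 3 (destination ASes versus all ASes seen on the paths) can be illustrated with a small Python sketch. It assumes each AS path is available as a space-separated string of AS numbers with the origin AS last; the function name and the sample paths are hypothetical, and AS_SET tokens such as "{64500,64501}" are simply skipped.

```python
def origin_and_transit(as_paths):
    """Split the ASes seen on a set of AS paths into origins and transit-only.

    Assumption: the last numeric token of each path is the origin
    (destination) AS; non-numeric tokens such as AS_SETs are ignored.
    """
    origins = set()
    on_path = set()
    for path in as_paths:
        hops = [int(token) for token in path.split() if token.isdigit()]
        if not hops:
            continue
        origins.add(hops[-1])
        on_path.update(hops)
    # ASes that appear on some path but never originate a prefix.
    return origins, on_path - origins


# Hypothetical paths using documentation-range AS numbers.
paths = [
    "64496 64497",
    "64496 64498 64499",
    "64496 64498 64498 64497",  # prepending repeats an AS on the path
]

origins, transit_only = origin_and_transit(paths)
print(sorted(origins), sorted(transit_only))
```

The transit-only set corresponds to the extra ASes that appear when whole paths are kept, which is the group worth inspecting by hand to see whether any of them are commercial.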
> Thank you in advance for your comments.
>
> Maciej Kurant
>
> =============================================
> EPFL IC ISC LCA3
> Maciej Kurant
> PhD Student
> CH-1015 Lausanne, Switzerland
> web site: http://lcawww.epfl.ch/kurant
> =============================================

2006-11-21 04:33:25
You might have a look at:

http://www.caida.org/publications/papers/2006/revealingas/revealingas.pdf

The algorithm produces a lot of false negatives for non-English-speaking
countries that don't use .edu uniformly, but is otherwise an
excellent place to start...

TV