I would like to restrict access from certain countries to content on my network (for security and legal reasons). So far the best algorithm I've been able to come up with is a combination of reverse DNS and APNIC/ARIN/RIPE whois queries. I've written a Perl CGI that checks reverse DNS first and, if the reverse mapping has no country-code TLD (ccTLD), does a whois query and parses the response for the address.

The problem I have is that the country of the company that owns the IP block is sometimes not the country the IP block is used in. For example:

sungold22.de.ibm.com 194.196.100.86

Whois parsing indicates a country of UK, but from the reverse DNS a person can see that it is Germany. I've built the pattern of cc.ibm.com into my CGI, but I'm sure there are other blocks that I'm incorrectly identifying.

I've looked at RADB entries, as well as origin AS for various IP blocks, and neither source looks any better than whois. Is there a more accurate method to determine the country of origin for an IP than the methods I've described above?

-Ralph
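The two-step lookup described in this post (reverse DNS hint first, whois country as the fallback) might be sketched roughly as follows. This is not the poster's Perl CGI: the `CC_TLDS` set, the `ORG_PATTERNS` list and all function names are hypothetical, and the sketch works on already-fetched strings rather than doing live DNS or whois queries.

```python
import re

# Illustrative subset of two-letter country-code TLDs (assumption: a real
# deployment would load the full list from the IANA root zone database).
CC_TLDS = {"de", "uk", "ca", "fr", "jp", "au", "in", "sg", "nl", "us"}

# Hypothetical per-organisation patterns of the form <host>.<cc>.ibm.com,
# mirroring the special case mentioned in the post.
ORG_PATTERNS = [re.compile(r"\.([a-z]{2})\.ibm\.com$")]


def country_from_hostname(hostname):
    """Return a lowercase country code guessed from a PTR name, or None."""
    name = hostname.lower().rstrip(".")
    for pattern in ORG_PATTERNS:
        match = pattern.search(name)
        if match and match.group(1) in CC_TLDS:
            return match.group(1)
    tld = name.rsplit(".", 1)[-1]
    return tld if tld in CC_TLDS else None


def country_from_whois(whois_text):
    """Return the first non-empty 'country:' value in whois output, or None."""
    for line in whois_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "country" and value.strip():
            return value.strip().lower()
    return None


def guess_country(hostname, whois_text):
    """Prefer the reverse-DNS hint; fall back to the whois country field."""
    if hostname:
        from_dns = country_from_hostname(hostname)
        if from_dns:
            return from_dns
    return country_from_whois(whois_text or "")
```

With the example from the post, `guess_country("sungold22.de.ibm.com", "country: GB")` returns `"de"`, because the hostname pattern wins over the registry record. Every further organisation with its own naming scheme would need another entry in `ORG_PATTERNS`, which is exactly the maintenance problem the post describes.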
On Wednesday, Oct 2, 2002, at 23:21 Canada/Eastern, Ralph Doncaster wrote:
I would like to restrict access from certain countries to content on my network (for security and legal reasons).
So far the best algorithm I've been able to come up with is a combination of reverse DNS and APNIC/ARIN/RIPE whois queries. I've written a Perl CGI that checks reverse DNS first and, if the reverse mapping has no country-code TLD (ccTLD), does a whois query and parses the response for the address.
If you're in the market for a commercial solution, Ixia do one:

http://www.ixiacom.com/products/paa/netops/IxMapping.php

I don't know where they get their data from, how accurate it is, or what it costs, but I thought I'd mention that there is at least a way to make the problem someone else's by the simple application of money :)

Joe
Is there a more accurate method to determine the country of origin for an IP than the methods I've described above?
Physical geography and DNS do not match. Some of the most popular web sites in India under the .in domain are physically in the US and owned by US companies. Having a web site under the .in domain is a means to reach a market.

Physical geography and IP addresses do not match. Once the RIR allocates to the LIR, the LIR can sub-allocate anywhere. So an LIR (ISP) in Singapore with a regional business could allocate its address block to customers in Singapore, Hong Kong, China, India, and any other place where it offers services.

DNS LOC records might be helpful. But, as noted in one CAIDA paper ...

"Both the whois-based and hostname-based mapping rely on the assumption that educated guesses are required in the absence of explicit location information. While RFC 1876 [RFC1876] did define a DNS extension to provide a LOC resource record type that allows administrators to associate latitude and longitude information with entries, it turns out to be sub-optimally useful. First, the RFC specifies only the format and interpretation of the new field, without establishing where or at what granularity to use it. Because of this, finding the appropriate LOC resource record may require multiple DNS queries. More importantly, people just do not use it. NetGeo currently does not use DNS LOC queries by default because their low success rate does not justify the expense of the three or more DNS lookups typically needed to rule out the existence of a valid DNS LOC record."

---> http://www.caida.org/outreach/papers/2000/inet_netgeo/inet_netgeo.html#dnsloc

There are tools that CAIDA has worked on like NetGeo (now something sold by Ixia) http://www.caida.org/tools/utilities/netgeo/. Might be something to check out along with all the other Internet mapping projects.
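For readers who have not seen a LOC record, the sketch below shows what one carries. It converts the text form of LOC data (degrees, optional minutes and seconds, a hemisphere letter, then altitude in metres, as laid out in RFC 1876) into decimal coordinates. The function name `parse_loc` and the sample value are illustrative only; the sketch ignores the optional size and precision fields and does not perform the DNS query itself.

```python
def parse_loc(rdata):
    """Parse LOC text data into (latitude, longitude, altitude_m).

    Expected shape: d [m [s]] N|S  d [m [s]] E|W  alt[m] [size hp vp]
    The trailing size/precision fields are ignored in this sketch.
    """
    tokens = rdata.split()
    pos = 0

    def read_angle(hemispheres):
        nonlocal pos
        parts = []
        # Collect up to three numeric fields before the hemisphere letter.
        while pos < len(tokens) and tokens[pos].upper() not in hemispheres:
            parts.append(float(tokens[pos]))
            pos += 1
        if pos >= len(tokens) or not 1 <= len(parts) <= 3:
            raise ValueError("malformed LOC data: %r" % rdata)
        hemisphere = tokens[pos].upper()
        pos += 1
        parts += [0.0] * (3 - len(parts))
        degrees = parts[0] + parts[1] / 60 + parts[2] / 3600
        # South and west are negative in decimal notation.
        return -degrees if hemisphere in ("S", "W") else degrees

    latitude = read_angle(("N", "S"))
    longitude = read_angle(("E", "W"))
    if pos >= len(tokens):
        raise ValueError("missing altitude in LOC data: %r" % rdata)
    altitude = float(tokens[pos].rstrip("mM"))
    return latitude, longitude, altitude


# Illustrative value only, not looked up from any real zone.
lat, lon, alt = parse_loc("42 21 54 N 71 06 18 W -24m 30m")
```

Even with a parser this simple, the CAIDA observation quoted above still applies: the hard part is that few administrators publish the record, not that it is difficult to read.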
On Thu, 3 Oct 2002, alex@yuriev.com wrote:
Is there a more accurate method to determine the country of origin for an IP than the methods I've described above?
Yes, at least three companies have databases of pretty much all /24s and above mapped up to a zip code.
So far I've been referred to 3 commercial services, and all (including NetGeo/Ixia) fail on the example I gave (194.196.100.86). -Ralph
On Thu, 3 Oct 2002, alex@yuriev.com wrote:
Is there a more accurate method to determine the country of origin for an IP than the methods I've described above?
Yes, at least three companies have databases of pretty much all /24s and above mapped up to a zip code.
So far I've been referred to 3 commercial services, and all (including NetGeo/Ixia) fail on the example I gave (194.196.100.86).
Maybe I missed those posts, sorry. I am not aware of any commercial service that has /32s in its database. Neither am I aware of any of the companies that have the data offering a 'look up the location' service. The data is incorporated into the other services they provide and is used for internal purposes.
On Thu, Oct 03, 2002 at 11:10:45AM -0400, alex@yuriev.com wrote:
On Thu, 3 Oct 2002, alex@yuriev.com wrote:
Is there a more accurate method to determine the country of origin for an IP than the methods I've described above?
Yes, at least three companies have databases of pretty much all /24s and above mapped up to a zip code.
So far I've been referred to 3 commercial services, and all (including NetGeo/Ixia) fail on the example I gave (194.196.100.86).
The Akamai EdgeScape service is correct for 194.196.100.86.
Maybe I missed those posts, sorry.
I am not aware of any commercial service that has /32s in its database. Neither am I aware of any of the companies that have the data offering a 'look up the location' service. The data is incorporated into the other services they provide and is used for internal purposes.
I'm not sure how far Akamai goes in its database. I do know for a fact that there are entries more specific than /24s in its database.
Yo Alex! On Thu, 3 Oct 2002 alex@yuriev.com wrote:
Is there a more accurate method to determine the country of origin for an IP than the methods I've described above?
Yes, at least three companies have databases of pretty much all /24s and above mapped up to a zip code.
These DBs are a joke. I have /19s that are SWIPed to the billing office but used in remote POPs. No one is ever gonna figure out where they really are. Except for the IPs I set RFC 1876 LOC records on.

I see load-balancing by geo-code do way more harm than good.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 20340 Empire Blvd, Suite E-3, Bend, OR 97701
gem@rellim.com Tel:+1(541)382-8588 Fax: +1(541)382-8676
On Thu, 3 Oct 2002 alex@yuriev.com wrote:
Is there a more accurate method to determine the country of origin for an IP than the methods I've described above?
Yes, at least three companies have databases of pretty much all /24s and above mapped up to a zip code.
These DBs are a joke. I have /19's that are SWIPed to the billing office but used in remote POPs. No-one is ever gonna figure out where they really are.
Wrong answer. Just because free public DBs don't have that info does not mean that it does not exist.

Alex
"alex" == alex <alex@yuriev.com> writes:
alex> Just because free public dbs dont have that info does not
alex> mean that it does not exist.

i guess the question is, "how to ascertain the accuracy of the data?"

if you have a collection of n known address to location mappings, evenly distributed over the address space, you'd want to approach one of the private db vendors and say, "do lookups on these n addresses and tell me the answers." if there's a good correlation between the known data and the answers, then it might make sense to purchase data [*] from those people.

but. it is probably necessary to construct the set of control data by hand, which might be a big job. what is a sufficiently large n? for n sufficiently large, are the vendors likely to answer the question?

i suspect that, in real life, it will come down to trusting the vendors' assertion that their data is accurate...

-w

[*] purchase data!?!? doesn't information want to be free? or is that passé? oh well...

--
William Waites <ww@styx.org>
Idiosyntactix Research Laboratories http://www.irl.styx.org
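The control-set check proposed in this message could be scored along these lines. This is only a sketch under assumptions: the control data and vendor answers are plain dictionaries keyed by address, a "match" means an exact string match on the location label, and the Wilson score interval is one common way to express how much a small n can be trusted. None of the names come from the thread.

```python
import math


def agreement_rate(known, vendor):
    """Fraction of control addresses where the vendor's answer matches."""
    if not known:
        raise ValueError("control set is empty")
    hits = sum(1 for ip, loc in known.items() if vendor.get(ip) == loc)
    return hits / len(known)


def wilson_interval(rate, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion over n samples."""
    denom = 1 + z * z / n
    centre = (rate + z * z / (2 * n)) / denom
    half = z * math.sqrt(rate * (1 - rate) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Hypothetical control data using documentation-only address ranges.
known = {
    "192.0.2.1": "de",
    "192.0.2.200": "de",
    "198.51.100.7": "ca",
    "203.0.113.9": "sg",
}
vendor = {
    "192.0.2.1": "de",
    "192.0.2.200": "gb",
    "198.51.100.7": "ca",
    "203.0.113.9": "sg",
}

rate = agreement_rate(known, vendor)
low, high = wilson_interval(rate, len(known))
```

With only four control addresses the interval is very wide, which puts a number on the "what is a sufficiently large n?" question: the interval narrows roughly with the square root of n, so a hand-built control set has to be fairly large before the score says much.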
Thus spake <alex@yuriev.com>
On Thu, 3 Oct 2002 alex@yuriev.com wrote:
Yes, at least three companies have databases of pretty much all /24s and above mapped up to a zip code.
These DBs are a joke. I have /19's that are SWIPed to the billing office but used in remote POPs. No-one is ever gonna figure out where they really are.
Wrong answer.
Just because free public DBs don't have that info does not mean that it does not exist.
Say I have about 10 /16's reachable through firewalls in SJC, RDU, SYD, and AMS. No traceroutes or pings can make it past these firewalls, nor do the hostnames indicate any particular location. How exactly do you plan on mapping these to a zip code, when I can tell you those addresses are fairly randomly spread, in /24 increments, to sites all over the world? The neat thing about selling databases like that is nobody can ever prove how incredibly inaccurate they are. Just come up with a reasonable-sounding collection methodology and claim any counterexamples are just flukes, then collect money from the saps who believe you... S
Wrong answer.
Just because free public DBs don't have that info does not mean that it does not exist.
Say I have about 10 /16's reachable through firewalls in SJC, RDU, SYD, and AMS. No traceroutes or pings can make it past these firewalls, nor do the hostnames indicate any particular location. How exactly do you plan on mapping these to a zip code, when I can tell you those addresses are fairly randomly spread, in /24 increments, to sites all over the world?
It is very easy. Anyone would care about it only when users from those addresses interact with whatever software ends up creating those databases. If those users never buy stuff from Amazon.com, Amazon.com does not care where they are. But the moment they do, somewhere someone is crunching the data that says "Of 10 sites that I saw this IP address access and provide a clearing for the credit card transaction, 9 ended up being within a 3-mile radius of ZZZZ. Let's put a tag on that."
The neat thing about selling databases like that is nobody can ever prove how incredibly inaccurate they are. Just come up with a reasonable-sounding collection methodology and claim any counterexamples are just flukes, then collect money from the saps who believe you...
The really neat thing about talking to computer geeks is that they all operate with lots of absolutes. They will explain to you why it does not work in a specific case and forget that those specific cases are usually exceptions.

Alex

P.S. So, ever bought stuff from Amazon from one of those IP addresses and sent it to some non-related location *just* to confuse the mapping systems?
Yo Alex! On Thu, 3 Oct 2002 alex@yuriev.com wrote:
crunching the data that says "Of 10 sites that I saw this IP address access and provide a clearing for the credit card transaction, 9 ended up being within a 3-mile radius of ZZZZ. Let's put a tag on that."
I would be REALLY interested to know how you measure mileage with IP.

I tried 6 IPs with one of these locator services and one was off by over 2,000 miles, one by 150 miles and 2 by 10 miles.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 20340 Empire Blvd, Suite E-3, Bend, OR 97701
gem@rellim.com Tel:+1(541)382-8588 Fax: +1(541)382-8676
Yo Bradley! On Thu, 3 Oct 2002, Bradley Dunn wrote:
I would be REALLY interested to know how you measure mileage with IP.
Latency triangulation.
Oh really? So you can figure out how plugged the pipe is, how backed up the router is, and then measure the speed of light?

Triangulate this: 204.245.220.1

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 20340 Empire Blvd, Suite E-3, Bend, OR 97701
gem@rellim.com Tel:+1(541)382-8588 Fax: +1(541)382-8676
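The objection here can be made concrete with a little arithmetic. A round-trip time only bounds how far away a host can be; it cannot say how much of the delay was queueing rather than distance. The sketch below computes that upper bound. The two-thirds velocity factor for signals in fibre is a rule-of-thumb assumption, and nothing in the thread says any vendor actually works this way.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum, km/s
FIBRE_FACTOR = 2 / 3  # assumed rule of thumb for propagation in fibre


def max_distance_km(rtt_ms, velocity_factor=FIBRE_FACTOR):
    """Upper bound on one-way distance implied by a round-trip time.

    Queueing, serialisation and indirect routing only add delay, so the
    real distance can be anywhere between zero and this bound.
    """
    if rtt_ms < 0:
        raise ValueError("round-trip time cannot be negative")
    one_way_seconds = rtt_ms / 1000 / 2
    return one_way_seconds * SPEED_OF_LIGHT_KM_S * velocity_factor


# A 10 ms round trip bounds the host to within roughly 1,000 km of the probe.
bound = max_distance_km(10)
```

A bound of about 1,000 km from a single probe is far too loose for zip-code accuracy, and a congested path inflates the measured RTT further, so several probes with low, stable RTTs would be needed before the intersection of their bounds meant much.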
On Thu, 3 Oct 2002 alex@yuriev.com wrote:
crunching the data that says "Of 10 sites that I saw this IP address access and provide a clearing for the credit card transaction, 9 ended up being within a 3-mile radius of ZZZZ. Let's put a tag on that."
I would be REALLY interested to know how you measure mileage with IP.
I tried 6 IPs with one of these locator services and one was off by over 2,000 miles, one by 150 miles and 2 by 10 miles.
Again, the majority of companies that have that data will not provide it to you for free. In the case of someone like Amazon, they probably won't measure mileage. Rather, they would flag transactions that make no geographic sense and pull them for separate processing.

Alex
Thus spake <alex@yuriev.com>
Say I have about 10 /16's reachable through firewalls in SJC, RDU, SYD, and AMS. No traceroutes or pings can make it past these firewalls, nor do the hostnames indicate any particular location. How exactly do you plan on mapping these to a zip code, when I can tell you those addresses are fairly randomly spread, in /24 increments, to sites all over the world?
It is very easy. Anyone would care about it only when users from those addresses interact with whatever software ends up creating those databases. If those users never buy stuff from Amazon.com, Amazon.com does not care where they are. But the moment they do, somewhere someone is crunching the data that says "Of 10 sites that I saw this IP address access and provide a clearing for the credit card transaction, 9 ended up being within a 3-mile radius of ZZZZ. Let's put a tag on that."
But Amazon already knows where I live, so why do they need an IP-to-address database? My physical location is irrelevant for load-balancing purposes -- topological location is what matters. If they want to sell me "local" products, they can do that by looking at the zip code on file for my shipping address.
The neat thing about selling databases like that is nobody can ever prove how incredibly inaccurate they are. Just come up with a reasonable-sounding collection methodology and claim any counterexamples are just flukes, then collect money from the saps who believe you...
The really neat thing about talking to computer geeks is that they all operate with lots of absolutes. They will explain to you why it does not work in a specific case and forget that those specific cases are usually exceptions.
That's because we've dealt with too many business types who hype how well the general case works but ignore the exception cases that crash or corrupt your systems.
P.S. So, ever bought stuff from Amazon from one of those IP addresses and sent it to some non-related location *just* to confuse the mapping systems?
Not intentionally, but I work from a dozen different IPs, including ones from a pool "located" in a different state that is shared by 30k VPN users worldwide. I've also ordered stuff from IPs all over the world and shipped to various locations inside the US. I wonder where Amazon thinks I actually live, if they care. S
databases. If those users never buy stuff from Amazon.com, Amazon.com does not care where they are. But the moment they do, somewhere someone is crunching the data that says "Of 10 sites that I saw this IP address access and provide a clearing for the credit card transaction, 9 ended up being within a 3-mile radius of ZZZZ. Let's put a tag on that."
But Amazon already knows where I live, so why do they need an IP-to-address database? My physical location is irrelevant for load-balancing purposes -- topological location is what matters. If they want to sell me "local" products, they can do that by looking at the zip code on file for my shipping address.
Right, that's the point! Amazon, Double-Click and others that care about where the *user* is have the ability to correlate IP addresses with the user's location rather closely, even when the user is not at *that* point interacting with a system where he or she is forced to give up an address. If over a period of 3 years Amazon determined that the majority of the orders placed from 207.106.66.0/24 were shipped somewhere in Philadelphia, and no one shipped anything to San Francisco, it can deduce that *geographically* 207.106.66.0/24 is likely to be in Philadelphia and not in San Francisco, even if the hop before it resolves into .sfo.

Does it mean that such a database would be useful for load-balancing purposes? I personally think it would not, since geographical location is not linked to location IP-wise: IP does not really rely on geography.
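The kind of correlation described above could look something like the following sketch: group observed (address, shipping city) pairs by /24 and tag a prefix only when one city clearly dominates. The thresholds (`min_samples`, `min_share`), the function name and the sample data are all assumptions made for illustration, not a description of any real retailer's system.

```python
import ipaddress
from collections import Counter, defaultdict


def tag_prefixes(observations, min_samples=10, min_share=0.9):
    """Map each /24 to its dominant city when the evidence is strong enough.

    observations: iterable of (ip_string, city) pairs, e.g. the source
    address of an order and the city it was shipped to.
    """
    by_prefix = defaultdict(Counter)
    for ip, city in observations:
        # strict=False lets a host address stand in for its enclosing /24.
        prefix = ipaddress.ip_network(ip + "/24", strict=False)
        by_prefix[prefix][city] += 1

    tags = {}
    for prefix, counts in by_prefix.items():
        total = sum(counts.values())
        city, hits = counts.most_common(1)[0]
        if total >= min_samples and hits / total >= min_share:
            tags[str(prefix)] = city
    return tags


# Hypothetical order log: nine Philadelphia shipments and one outlier.
orders = [("207.106.66.%d" % host, "Philadelphia") for host in range(1, 10)]
orders.append(("207.106.66.77", "San Francisco"))
orders += [("192.0.2.%d" % host, "Sydney") for host in range(1, 4)]

tags = tag_prefixes(orders)
```

Here only `207.106.66.0/24` gets a tag; the `192.0.2.0/24` prefix has too few samples. The shared VPN pool mentioned elsewhere in the thread would fail the `min_share` test and simply stay untagged, which is the "exceptions" behaviour being argued about.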
The neat thing about selling databases like that is nobody can ever prove how incredibly inaccurate they are. Just come up with a reasonable-sounding collection methodology and claim any counterexamples are just flukes, then collect money from the saps who believe you...
The really neat thing about talking to computer geeks is that they all operate with lots of absolutes. They will explain to you why it does not work in a specific case and forget that those specific cases are usually exceptions.
That's because we've dealt with too many business types who hype how well the general case works but ignore the exception cases that crash or corrupt your systems.
I totally agree with you. However, it seems that the majority of the businesses that could be interested in such data right now would not really have a business case for needing a guarantee of data accuracy.
P.S. So, ever bought stuff from Amazon from one of those IP addresses and sent it to some non-related location *just* to confuse the mapping systems?
Not intentionally, but I work from a dozen different IPs, including ones from a pool "located" in a different state that is shared by 30k VPN users worldwide. I've also ordered stuff from IPs all over the world and shipped to various locations inside the US. I wonder where Amazon thinks I actually live, if they care.
Actually, they do. They get charged less to clear a credit card transaction that looks squeaky clean compared to the one which is somewhat clean. Thanks, Alex
On Thu, Oct 03, 2002 at 04:22:30PM -0500, Stephen Sprunk wrote:
Say I have about 10 /16's reachable through firewalls in SJC, RDU, SYD, and AMS. No traceroutes or pings can make it past these firewalls, nor do the hostnames indicate any particular location. How exactly do you plan on mapping these to a zip code, when I can tell you those addresses are fairly randomly spread, in /24 increments, to sites all over the world?
edge intercept? there are probably a few other ways as well. -dre
I believe Akamai offers an IP address to location database for sale. I'm unsure of the accuracy, but Akamai folks claim it to be quite high. YMMV.

- Daniel Golding

On Thu, 3 Oct 2002, Barry Raveendran Greene wrote:
Is there a more accurate method to determine the country of origin for an IP than the methods I've described above?
Physical geography and DNS do not match. Some of the most popular web sites in India under the .in domain are physically in the US and owned by US companies. Having a web site under the .in domain is a means to reach a market.
Physical geography and IP addresses do not match. Once the RIR allocates to the LIR, the LIR can sub-allocate anywhere. So a LIR (ISP) in Singapore with a regional business could allocate their address block to customers in Singapore, Hong Kong, China, India, and any other place where they offer services.
DNS LOC records might be helpful. But, as noted in one CAIDA paper ...
"Both the whois-based and hostname-based mapping rely on the assumption that educated guesses are required in the absence of explicit location information. While RFC 1876 [RFC1876] did define a DNS extension to provide a LOC resource record type that allows administrators to associate latitude and longitude information with entries, it turns out to be sub-optimally useful. First, the RFC specifies only the format and interpretation of the new field, without establishing where or at what granularity to use it. Because of this, finding the appropriate LOC resource record may require multiple DNS queries. More importantly, people just do not use it. NetGeo currently does not use DNS LOC queries by default because their low success rate does not justify the expense of the three or more DNS lookups typically needed to rule out the existence of a valid DNS LOC record."

---> http://www.caida.org/outreach/papers/2000/inet_netgeo/inet_netgeo.html#dnsloc
There are tools that CAIDA has worked on like NetGeo (now something sold by Ixia) http://www.caida.org/tools/utilities/netgeo/. Might be something to check out along with all the other Internet mapping projects.
On Wed, Oct 02, 2002 at 11:21:04PM -0400, Ralph Doncaster wrote:
Is there a more accurate method to determine the country of origin for an IP than the methods I've described above?
http://www.nicolas-guillard.com/cybergeography-fr/mapping.html -dre
Andre, I fail to see where a pointer to the French version of Dodge's UCL-based cybergeography pages responds to Ralph's queries. Peter
Ok sorry, I did a quick search on the Internet, how's this?

http://www.cybergeography.org/mapping.html

On Thu, Oct 03, 2002 at 05:21:50PM -0500, Peter Salus wrote:
Andre, I fail to see where a pointer to the French version of Dodge's UCL-based cybergeography pages responds to Ralph's queries.
Peter
participants (12)
- alex@yuriev.com
- Barry Raveendran Greene
- Bradley Dunn
- dgold
- dre
- Gary E. Miller
- Joe Abley
- John Payne
- Peter Salus
- Ralph Doncaster
- Stephen Sprunk
- William Waites