Re: Using RIR info to determine geographic location...
At 08:44 PM 19-12-07 -0500, Drew Weaver wrote (quoted in full below):

I too would be interested to know how others feel about the various geo-location services available to speed things along. Three that come to mind are Akamai, Neustar/UltraDNS and the "roll your own" Cisco GSS 4492R. How do they stack up? How good are the various MaxMind files?

Thanks,
Hank
Is this becoming a more common or less common practice as we slide into the last week of 2007? The reason I ask is that we have recently noticed some 'issues' where correct info in the RIR causes very inefficient and sometimes annoying interaction with some of the world's largest online applications (such as Google). Let's say, for example, that a customer in India purchases dedicated server or co-location hosting at an HSP in the United States [very common]. The RIR shows that the customer is in India, so when the customer interacts with any Google application, Google automatically directs the traffic to google.in (or the India version of whichever app).
More unfortunate still, it appears that service and application providers such as Google cache RIR data for an unknown amount of time. This means that if a service provider SWIPs an allocation (say a /24) to a customer (to use the same example, again in India), and that customer subsequently returns the allocation and the service provider re-allocates it in smaller blocks (say /29s, /28s, et cetera) to different customers, the caching compounds the problem: 30 customers are affected instead of one.
Obviously, providing RIR information is the responsibility of service providers (it is even ARIN's policy). Has anyone else in the community run into issues such as this and found solutions or workarounds?
Happy holidays to all on NANOG :D
Thanks, -Drew
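To make the caching problem above concrete, here is a minimal Python sketch. It assumes a consumer that does longest-prefix lookups against a table of cached registry records. The prefixes (RFC 5737 documentation space), the country codes, and the `lookup` and `misrouted` helpers are all hypothetical; nothing here reflects how Google or any other provider actually stores or refreshes RIR data.

```python
import ipaddress

# Hypothetical stale cache entry: the /24 as it was once SWIPed to a customer in India.
stale_cache = {ipaddress.ip_network("192.0.2.0/24"): "IN"}

# Hypothetical current registrations after the /24 was returned and re-assigned.
current_swip = {
    ipaddress.ip_network("192.0.2.0/29"): "US",
    ipaddress.ip_network("192.0.2.8/29"): "US",
    ipaddress.ip_network("192.0.2.16/28"): "CA",
}


def lookup(table, addr):
    """Return the country of the longest matching prefix in table, or None."""
    ip = ipaddress.ip_address(addr)
    matches = [net for net in table if ip in net]
    if not matches:
        return None
    return table[max(matches, key=lambda net: net.prefixlen)]


def misrouted(stale, current):
    """List (prefix, cached country, registered country) where the two disagree."""
    rows = []
    for net in sorted(current):
        cached = lookup(stale, str(net.network_address))
        if cached != current[net]:
            rows.append((str(net), cached, current[net]))
    return rows


affected = misrouted(stale_cache, current_swip)

# Number of /29 blocks that fit in one /24.
slash29_count = len(list(ipaddress.ip_network("192.0.2.0/24").subnets(new_prefix=29)))

for prefix, cached, registered in affected:
    print(f"{prefix}: cache says {cached}, registry says {registered}")
```

While the stale /24 record is the only thing in the cache, every address in the re-assigned sub-blocks still resolves to IN. A /24 holds 32 /29s, which is in line with the "30 customers" figure above.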
Personally, I have trouble accepting some of the claims the geotargeting companies have made, such as Quova's 99.9% to the country level, and 95% to the US state level. ( More info at http://www.quova.com/page.php?id=132 ) Perhaps I'm just part of the outlying data; using the "three top search engines" I rarely see them get the city correct (i.e., where *I* am physically located, as opposed to where the registration data says the block is located), and have seen some glaring errors for the country in some cases.

Geotargeting has turned into quite a business, and I'm concerned that people who rely on these services do not fully understand the risks.

--gregbo

On Thu, Dec 20, 2007 at 08:48:44AM +0200, Hank Nussbacher wrote:
At 08:44 PM 19-12-07 -0500, Drew Weaver wrote:
I too would be interested to know how others feel about the various geo-location services available to speed things along. Three that come to mind are Akamai, Neustar/Ultradns and the "roll your own" Cisco GSS 4492R. How do they stack up? How good are the various Maxmind files?
Thanks, Hank
Is this becoming a more common or less common practice as we slide into the last week of 2007? The reason I ask is that we have recently noticed some 'issues' where correct info in the RIR causes very inefficient and sometimes annoying interaction with some of the world's largest online applications (such as Google). Let's say, for example, that a customer in India purchases dedicated server or co-location hosting at an HSP in the United States [very common]. The RIR shows that the customer is in India, so when the customer interacts with any Google application, Google automatically directs the traffic to google.in (or the India version of whichever app).
More unfortunate still, it appears that service and application providers such as Google cache RIR data for an unknown amount of time. This means that if a service provider SWIPs an allocation (say a /24) to a customer (to use the same example, again in India), and that customer subsequently returns the allocation and the service provider re-allocates it in smaller blocks (say /29s, /28s, et cetera) to different customers, the caching compounds the problem: 30 customers are affected instead of one.
Obviously, providing RIR information is the responsibility of service providers (it is even ARIN's policy). Has anyone else in the community run into issues such as this and found solutions or workarounds?
Happy holidays to all on NANOG :D
Thanks, -Drew
On Fri, 21 Dec 2007 02:13:17 +0000 Greg Skinner <gds@best.com> wrote:
Personally, I have trouble accepting some of the claims the geotargeting companies have made, such as Quova's 99.9% to the country level, and 95% to the US state level. ( More info at http://www.quova.com/page.php?id=132 ) Perhaps I'm just part of the outlying data; using the "three top search engines" I rarely see them get the city correct (ie. where *I* am physically located, as opposed to where the registration data says the block is located), and have seen some glaring errors for the country in some cases.
Geotargeting has turned into quite a business, and I'm concerned that people who rely on these services do not fully understand the risks.
Some folks are relying on it for serious purposes. Many Internet gambling sites use it to avoid serving US customers, for example. Their risk is criminal liability for the executives -- they have a strong incentive to get reliable data...

Some sports media sites use it to enforce local area blackouts; though that doesn't need to be perfect, if it's too imperfect they risk breach of contract and expensive lawsuits.

For the advertisers, best effort is probably good enough...

--Steve Bellovin, http://www.cs.columbia.edu/~smb
On Thu, Dec 20, 2007 at 10:17:36PM -0500, Steven M. Bellovin wrote:
On Fri, 21 Dec 2007 02:13:17 +0000, Greg Skinner <gds@best.com> wrote:
Personally, I have trouble accepting some of the claims the geotargeting companies have made, such as Quova's 99.9% to the country level, and 95% to the US state level. ( More info at http://www.quova.com/page.php?id=132 ) Perhaps I'm just part of the outlying data; using the "three top search engines" I rarely see them get the city correct (ie. where *I* am physically located, as opposed to where the registration data says the block is located), and have seen some glaring errors for the country in some cases.
Geotargeting has turned into quite a business, and I'm concerned that people who rely on these services do not fully understand the risks.
Some folks are relying on it for serious purposes. Many Internet gambling sites use it to avoid serving US customers, for example. Their risk is criminal liability for the executives -- they have a strong incentive to get reliable data... Some sports media sites use it to enforce local area blackouts; though that doesn't need to be perfect, if it's too imperfect they risk breach of contract and expensive lawsuits.
For the advertisers, best effort is probably good enough...
--Steve Bellovin, http://www.cs.columbia.edu/~smb
Funny you should mention sports media sites. Not too long ago, someone asked on Usenet how to foil geotargeting in order to watch a sportscast that was being blocked. The answer was posted not long after the question.

It doesn't surprise me that "the word is out" on how to foil geotargeting, but it disturbs me that this aspect of geotargeting is not discussed more. I would prefer it if there were more openness and transparency about such things (without necessarily divulging the exact means by which geotargeting can be foiled). The Carleton paper ( http://www.scs.carleton.ca/~jamuir/papers/TR-06-05.pdf ) goes into some detail on the practical limits of geotargeting, but it has been difficult to raise this type of awareness among consumers of geotargeting services.

WRT advertisers, opinions are mixed on whether best effort is good enough, fraud aside. Some feel any discrepancies are just a cost of doing business on the Internet; hopefully they have factored discrepancies into their ad spend. Others are more skeptical. Some of you may find ( http://blog.merjis.com/2007/10/19/adwords-geotargeting-myths/ ) interesting.

--gregbo
On Dec 20, 2007 8:13 PM, Greg Skinner <gds@best.com> wrote:
Personally, I have trouble accepting some of the claims the geotargeting companies have made, such as Quova's 99.9% to the country level, and 95% to the US state level. ( More info at http://www.quova.com/page.php?id=132 ) Perhaps I'm just part of the
The trouble with a claim of "95%" accuracy is that the method of determining the accuracy has not been indicated, and there are _many_ IPs out there. With no method given for obtaining the statistic, I saw no evidence that the 99%/95% figures weren't just numbers made up to aggressively market a product.

I agree it is not very believable that a geolocation service properly locates 95% of all IP addresses to within a state/city.

Due to the existence of various types of proxies and anonymizer services, the visible IP often does not reveal the original requestor's details.

RIR records give contact information for an organization utilizing IP space; that's not the same as the physical location of nodes, which makes RIR data an unreliable source of information for that usage.

This information is not necessarily up to date in the first place. Nodes in the very same RIR allocation may be geographically distant.

RIR data is no more reliable than performing traceroutes to the destination IP, reverse resolving the hops, and pattern matching for possible city, state, or country names in the reverse DNS mappings of the hops nearest the target (since providers sometimes include state and/or city names in router rDNS hostnames).

On the other hand, it's perhaps the best geolocators can _try_ to do, short of manually calling ISPs and asking, or making deals with major ISPs to procure lists of geographic regions and the IPs assigned in those regions.

I suppose that, in theory, proper geolocation of close to 95% of IPs for page access requests would occur then (provided 95% of page access requests came from providers they had that type of direct information from).

-- -J
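As a rough illustration of the traceroute-plus-rDNS heuristic mentioned above, here is a small Python sketch. It takes hop hostnames as input rather than running a traceroute, and the `CITY_HINTS` table, the `guess_location` helper, and the example hostnames are all hypothetical. Real router naming schemes are far less regular, so treat this as a sketch of the idea and not a working geolocator.

```python
import re

# Hypothetical hint table. Operators use many inconsistent naming schemes,
# so a real table would be much larger and would still miss plenty.
CITY_HINTS = {
    "nyc": "New York, US",
    "chi": "Chicago, US",
    "lon": "London, GB",
    "fra": "Frankfurt, DE",
}


def guess_location(hop_hostnames):
    """Return (hint, city) from the hop nearest the target that carries a hint.

    hop_hostnames is ordered from source to target, as traceroute prints it.
    Returns None when no hop name contains a known hint.
    """
    for hostname in reversed(hop_hostnames):
        labels = hostname.lower().split(".")
        # Ignore the last two labels (the registered domain in these examples).
        for label in labels[:-2]:
            # Split a label such as "chi2" or "xe-0-0-0" into alphabetic tokens.
            for token in re.findall(r"[a-z]+", label):
                if token in CITY_HINTS:
                    return token, CITY_HINTS[token]
    return None


# Hypothetical reverse-DNS names for three hops, nearest the target last.
hops = [
    "ae-1.cr1.nyc.example.net",
    "xe-0-0-0.edge3.chi2.example.net",
    "host-203-0-113-7.example.org",
]
print(guess_location(hops))
```

The last hop carries no hint, so the guess falls back to the router before it. That is exactly the weakness of the method: it locates the nearest named router, which may be far from the end host.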
[nicely off-ops, pre-X-mas 101] James Hess wrote: [..]
On the other hand, it's perhaps the best geolocators can _try_ to do...
And they really couldn't care less about that data. What they do is very simple, and it may explain why, for instance, the big G earns loads of money: by aggregating data and selling it.

When you go to Amazon.com and buy something, from your IP, you fill in your full address details. Then your DHCP lease expires, or for some other reason your IP changes, and another person gets that same IP; they go to some other site and fill in their address details. Repeat this for a couple of hundred rounds. Most likely the same /24 (or some other size) will be re-used in the same geolocation, be that country, US state (which actually is just country size compared to Europe), city, etc.

After a while you will have a lot of people (they couldn't care less about your name, though) who have said that when they came from IP a.b.c.d, their address was X. As such, for IP a.b.c.d you have x% of people saying it was city Y and y% saying it was city Z. Presto, your accuracy for that IP. It even works when people fill in fake data or use a business/home address (that is why they want you to tag it as such, as it would screw up their stats :)

Now combine a couple of thousand merchant sites to increase the data you get, buy/sell it at country/state/city/street level, and you have the best data ever. The 0.1% of the data that is 'inaccurate' is then the data of proxies and other multi-city/state/city IP addresses.

Now you know how to earn money by selling something you collected from a simple site. You probably also understand why data from "Social Sites" is so valuable: all those people fill in all their data, in a structured way, together with a nice link to 'friends' who also do that, from which you can do statistical stuff about how accurate it is, as most of them live together/close by or talk about it.

They don't really break any 'privacy' with that; they generally couldn't care less about who you are. They just care about the statistics, as that is the data they will sell.

Can't help you any further though if you are still wondering how the two G men bought their own Boeings ;)

Greets,
Jeroen
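The aggregation described above can be sketched in a few lines of Python. This assumes the simplest possible scheme: group self-reported cities by the covering /24 and report each city's share of the votes. The `city_shares` function, the observation list, and the prefix length are hypothetical, and no claim is made that any geolocation vendor works this way.

```python
import ipaddress
from collections import Counter, defaultdict


def city_shares(observations, prefix_len=24):
    """Map each covering prefix to {city: share of reports seen from that prefix}.

    observations is an iterable of (ip_string, self_reported_city) pairs.
    """
    counts = defaultdict(Counter)
    for addr, city in observations:
        # strict=False masks the host bits, giving the covering network.
        net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        counts[net][city] += 1
    return {
        net: {city: n / sum(tally.values()) for city, n in tally.items()}
        for net, tally in counts.items()
    }


# Hypothetical checkout-form data: (client IP, city typed into the address form).
observations = [
    ("198.51.100.10", "Amsterdam"),
    ("198.51.100.77", "Amsterdam"),
    ("198.51.100.200", "Amsterdam"),
    ("198.51.100.5", "Utrecht"),
    ("203.0.113.9", "Zurich"),
]

shares = city_shares(observations)
for net, by_city in shares.items():
    print(net, by_city)
```

The share for the leading city is the "x% said city Y" figure from the message. With only a handful of reports per prefix the number means little, which is why the scheme depends on pooling data from many sites.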
On Dec 24, 2007, at 1:23 AM, James Hess wrote:
On Dec 20, 2007 8:13 PM, Greg Skinner <gds@best.com> wrote:
Personally, I have trouble accepting some of the claims the geotargeting companies have made, such as Quova's 99.9% to the country level, and 95% to the US state level. ( More info at http://www.quova.com/page.php?id=132 ) Perhaps I'm just part of the
The trouble with a claim of "95%" accuracy is the method of determining the accuracy of the measurement has not been indicated, and there are _many_ IPs out there. With no method of obtaining the statistic indicated: there is no evidence I saw that 99%/95%, weren't possibly just made up numbers for the purpose of aggressively marketing a product.
Well, I use a geolocation service, and as I travel around to many fine hotels and meetings, I try to check frequently to see if it knows where I am, so I have a few dozen test probes scattered about the globe.

The results are mixed. On US corporate (enterprise) networks, it is typically unreliable. In-room hotel networks generally get mapped to the right city. Wireless hot spots are erratic - sometimes mapped to better than a km, sometimes wildly off. People's home networks: generally the right country, frequently the right city.

I would say, overall
- mapping to the right country, probably better than 95% accuracy, maybe 99%.
- mapping to the right city, at least 75% of the time, for sure not 99%, even if you discount enterprise networks.

Of course, your probable error may vary...

Regards
Marshall
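One way to put error bars on an estimate drawn from a few dozen probes is a Wilson score interval for a binomial proportion. The Python sketch below is a generic statistical illustration, not something taken from the thread: the `wilson_interval` helper and the counts (27 correct city guesses out of 36 probes) are hypothetical, chosen only to match the "at least 75%" and "few dozen" figures above.

```python
import math


def wilson_interval(hits, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if trials <= 0:
        raise ValueError("need at least one probe")
    p = hits / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return centre - half, centre + half


# Hypothetical tally: 27 of 36 probes mapped to the right city.
low, high = wilson_interval(27, 36)
print(f"city-level accuracy: {27 / 36:.0%}, 95% interval {low:.2f} to {high:.2f}")
```

For these made-up counts the interval runs from roughly 0.59 to 0.86, so a sample of this size cannot distinguish a 75% service from an 85% one, let alone verify a vendor's 95% claim.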
On Mon, 24 Dec 2007 00:23:00 -0600 "James Hess" <mysidia@gmail.com> wrote:
On the other hand, it's perhaps the best geolocators can _try_ to do...
Short of geolocation services manually calling ISPs and asking.../ making deals with major ISPs to procure lists of geographic regions and assigned IPs in those regions.
I suppose that in theory proper geolocation close to 95% of IPs for page access requests would occur then (provided 95% of page access requests came from providers they had that type of direct information from)
See US patent 6,947,978 ( http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.htm&r=1&f=G&l=50&s1=6,947,978.PN.&OS=PN/6,947,978&RS=PN/6,947,978 ) for another approach.

--Steve Bellovin, http://www.cs.columbia.edu/~smb
participants (6)
- Greg Skinner
- Hank Nussbacher
- James Hess
- Jeroen Massar
- Marshall Eubanks
- Steven M. Bellovin