most accurate geo-IP source to build country-based access lists
Hi,

let's say that I need to build an ACL where I block all the IPv4 traffic from Sweden. I considered the following solutions:

1) RIR statistics files (format described in ftp://ftp.ripe.net/ripe/stats/RIR-Statistics-Exchange-Format.txt), accessible for example at ftp://ftp.apnic.net/pub/stats/. However, those files contain only the allocations and assignments made by the registry producing the file, not any sub-assignments made by other agencies (for example NIRs or LIRs). This means that the information is not very accurate. Another problem I found is that when an inetnum object has several country attributes, only the first one is used. In addition, even the RIR statistics exchange format document says that:

    cc = ISO 3166 2-letter country code, and the enumerated variances of {AP,EU,UK}. These values are not defined in ISO 3166 but are widely used. The cc value identifies the country. However, it is not specified if this is the country where the addresses are used. There are no rules defined for this value. It therefore cannot be used in any reliable way to map IP addresses to countries.

2) MaxMind products. As far as I know, these rely on user-supplied data (for example, MaxMind purchases user data from ISPs or content providers) and, based on personal experience, default to RIR data if no more accurate source is available. If anyone has something to add here, please do so.

3) The iptables geoip module, but it turned out that it uses the MaxMind database:

    root@VM-host:~# grep -Hsi maxmind $(dpkg -L xtables-addons-common)
    /usr/lib/xtables-addons/xt_geoip_build:# Converter for MaxMind CSV database to binary, for xt_geoip
    /usr/lib/xtables-addons/xt_geoip_dl: http://geolite.maxmind.com/download/geoip/database/GeoIPv6.csv.gz \
    /usr/lib/xtables-addons/xt_geoip_dl: http://geolite.maxmind.com/download/geoip/database/GeoIPCountryCSV.zip;
    root@VM-host:~#

4) In theory geofeeds (http://tools.ietf.org/html/draft-google-self-published-geofeeds-02) would be a nice solution, but as I understand the draft, it would work for my example only if all the IP address users provided their geofeed and there were a centralized database to query.

5) Use the prefix AS path. However, there seems to be no reliable way to determine the source country based on information in BGP routing tables.

Are there any other possibilities to geolocate IPv4 addresses with higher accuracy?

regards,
Martin
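For anyone who wants to try option 1 despite its caveats, here is a minimal Python sketch of turning delegated-file records into CIDR prefixes for one country. It assumes the pipe-separated layout `registry|cc|type|start|value|date|status`, where `value` is an address count for IPv4 records. The function name `country_prefixes` and the sample records are made up for illustration and are not real RIPE data.

```python
import ipaddress

def country_prefixes(lines, country):
    """Yield IPv4 networks for records whose cc field equals `country`."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("|")
        # Summary lines have fewer fields; the version line has no "ipv4" type.
        if len(fields) < 7 or fields[2] != "ipv4" or fields[1] != country:
            continue
        if fields[6] not in ("allocated", "assigned"):
            continue
        first = ipaddress.IPv4Address(fields[3])
        # For IPv4 the value field is a count of addresses, which is not
        # always a power of two, so one record may need several prefixes.
        last = first + int(fields[4]) - 1
        yield from ipaddress.summarize_address_range(first, last)

# Hypothetical sample records, not taken from a real delegated file.
SAMPLE = """\
2|ripencc|1433800799|4|19830705|20150608|+0200
ripencc|*|ipv4|*|3|summary
ripencc|SE|ipv4|192.0.2.0|256|20100101|allocated
ripencc|SE|ipv4|198.51.100.0|384|20100101|assigned
ripencc|NO|ipv4|203.0.113.0|256|20100101|allocated
ripencc|SE|ipv6|2001:db8::|32|20100101|allocated
"""

if __name__ == "__main__":
    for net in country_prefixes(SAMPLE.splitlines(), "SE"):
        print(net)
```

The result is only as good as the cc field, so everything said above about sub-assignments and the first country attribute still applies.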
On 8 Jun 2015, at 21:11, Martin T wrote:
Are there any other possibilities to geolocate IPv4 addresses with higher accuracy?
There is no direct relationship between logical network topology and geopolitical boundaries. ----------------------------------- Roland Dobbins <rdobbins@arbor.net>
Roland Dobbins wrote on 6/8/2015 9:14 AM:
On 8 Jun 2015, at 21:11, Martin T wrote:
Are there any other possibilities to geolocate IPv4 addresses with higher accuracy?
There is no direct relationship between logical network topology and geopolitical boundaries.
----------------------------------- Roland Dobbins <rdobbins@arbor.net>
Have you thought about application-layer tests, e.g. is the client's character set/language set to Swedish? Has the user identified himself/herself/henself as living in or being from Sweden? --Blake
Hi,
Have you thought about application-layer tests, e.g. is the client's character set/language set to Swedish? Has the user identified himself/herself/henself as living in or being from Sweden?
...just waiting for someone to suggest checking their web cookies to see what area they've got defined in adultfriendfinder or whatever... ;-) alan
Tinder would be more accurate since it uses the phone's GPS. You could also cross-check which subreddits they are subscribed to.
John,
At a brute force country level it is possible to use the Delegated ranges lists but that runs into the problem where IP ranges are subnetted and allocated to other countries.
Yeah. In addition, to illustrate the point in my initial post, sometimes inetnum objects contain more than one "country" attribute and only the first country code is inserted into the RIR delegated list. For example:

    $ for deleg in $(wget -qO - ftp://ftp.ripe.net/ripe/stats/delegated-ripencc-latest | grep ipv4 | cut -d '|' -f 4 | tail -10000); do
        [[ $(whois -rh whois.ripe.net -T inetnum "$deleg") = *country:*country:* ]] && echo "$deleg"
      done
    193.104.217.0
    193.110.48.0
    193.111.228.0
    193.218.114.0
    194.33.109.0
    194.34.64.0
    194.42.56.0
    194.150.168.0
    194.153.74.0
    195.14.23.0
    195.39.208.0
    195.85.254.0
    195.95.150.0
    195.158.230.0
    $
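The same check can be done on whois output that has already been fetched, which also shows which codes are present rather than just that there is more than one. This is a rough Python sketch; `country_codes` and the sample object are hypothetical, and it assumes the usual `attribute: value` layout of RPSL objects.

```python
def country_codes(whois_text):
    """Return every value of a "country:" attribute, in order of appearance."""
    codes = []
    for line in whois_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "country":
            # Drop any trailing "# comment" and normalise the case.
            codes.append(value.split("#")[0].strip().upper())
    return codes

# Hypothetical inetnum object using a documentation range.
SAMPLE_OBJECT = """\
inetnum:        192.0.2.0 - 192.0.2.255
netname:        EXAMPLE-NET
country:        SE
country:        NO
status:         ASSIGNED PA
"""

if __name__ == "__main__":
    codes = country_codes(SAMPLE_OBJECT)
    if len(codes) > 1:
        print("multiple country attributes:", ", ".join(codes))
```

With an object like the sample, the delegated list would carry only the first code (SE here), so a block list built from it would silently treat the whole range as Swedish.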
Blake,
Have you thought about application-layer tests, e.g. is the client's character set/language set to Swedish? Has the user identified himself/herself/henself as living in or being from Sweden?
Unfortunately I need this at the network layer, i.e. it should work for traffic other than HTTP/HTTPS as well. Anyway, thanks for all the replies! Martin
On 9 Jun 2015, at 5:11, Martin T wrote:
At a brute force country level it is possible to use the Delegated ranges lists but that runs into the problem where IP ranges are subnetted and allocated to other countries.
Yeah.
I would say that a perfectly accurate mapping of address to anything geographical (with more accuracy than "it's within the observed universe, somewhere") is unlikely ever to exist, except by accident and for short periods of time. Accuracy and lack of authoritative sources of data is one reason, constant uncoordinated reconfiguration is another. You need to decide how accurate your mapping needs to be (and figure out how to measure that, if accuracy is important).

Another part of the problem is framing the question in a useful way: a universal solution seems intractable when the following questions are answered differently (but accurately) by different people who have different needs.

Is a device in Uganda connected via satphone to a router in France in Uganda, or France?

Is a network in Fiji that can't talk to any other networks in Fiji without leaving the island but is one layer-3 hop away from Australia in Fiji, or Australia?

Does the source address of a packet always identify the device that sent the packet?

If I'm in region A and you're in region A, and you route within region to me but my replies leave the region on the way back, are we in the same region from my perspective? How about yours?

Even: if I'm in region A but I'm using a DNS resolver in region B, am I in region A or region B?

Joe
Years ago, when meeting with the lawyers to talk about the need to block access to a list of websites, I was coming from the technical side and talking about how all of our possible solutions were incomplete and easily circumvented by our users. The lawyers' response was to explain the concept of good faith effort. The main point was that we needed to "do something." We'd be in pretty good shape liability-wise as long as we made an attempt.

Getting back to the point of the question: I'd find the cheapest/easiest way to implement a somewhat effective GeoIP block, and say that you've done something.
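As one possible reading of "cheapest/easiest", the sketch below takes whatever country prefix list you have settled on and emits lines in the format that `ipset restore` reads, so the block becomes one set plus one iptables rule. The set name `se_block`, the function name, and the sample prefixes are assumptions for illustration only.

```python
import ipaddress

def ipset_restore_lines(set_name, cidrs):
    """Build `ipset restore` input for a hash:net set from CIDR strings."""
    # collapse_addresses merges adjacent and overlapping prefixes,
    # which keeps the set as small as possible.
    nets = ipaddress.collapse_addresses(ipaddress.ip_network(c) for c in cidrs)
    lines = [f"create {set_name} hash:net family inet"]
    lines += [f"add {set_name} {net}" for net in nets]
    return lines

if __name__ == "__main__":
    # Hypothetical prefixes; in practice these come from your geo-IP source.
    sample = ["192.0.2.0/25", "192.0.2.128/25", "198.51.100.0/24"]
    print("\n".join(ipset_restore_lines("se_block", sample)))
    # The output can be piped into `ipset restore` and matched with e.g.
    #   iptables -I INPUT -m set --match-set se_block src -j DROP
```

This says nothing about how accurate the list is; it only keeps the enforcement side simple enough to regenerate whenever the source data is refreshed.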
Subject: most accurate geo-IP source to build country-based access lists Date: Mon, Jun 08, 2015 at 05:11:15PM +0300 Quoting Martin T (m4rtntns@gmail.com):
Are there any other possibilities to geolocate IPv4 addresses with higher accuracy?
There are three levels of untruth: (in increasing order of falseness) 1. No, mom, I did not eat the pie. 2. "There are no Russian soldiers in Crimea" 3. IP Geolocation -- Måns Nilsson primary/secondary/besserwisser/machina MN-1334-RIPE +46 705 989668 GOOD-NIGHT, everybody ... Now I have to go administer FIRST-AID to my pet LEISURE SUIT!!
4. There are no Russian IPs in Crimea? David Hofstee Deliverability Management MailPlus B.V. Netherlands (ESP)
On 08/06/2015 15:11, Martin T wrote:
Hi,
let's say that I need to build an ACL where I block all the IPv4 traffic from Sweden. I considered following solutions:
1) RIR statistics files(ftp://ftp.ripe.net/ripe/stats/RIR-Statistics-Exchange-Format.txt) accessible for example at ftp://ftp.apnic.net/pub/stats/. However, those files contain allocations and assignment made by the registry producing the file and not any sub-assignments by other agencies(for example NIR, LIR). This means that this information is not very accurate. Another problem which I found out is that in case of inetnum object has many country fields, the first one is used. In addition, even the RIR statistics exchange format document says that:
It is a very difficult problem because IP ranges change and are split or redelegated. This means that even a reasonably current database will contain some data that is out of date. I mapped all websites in com/net/org/biz/info/mobi and the new gTLDs last year. While these are simply websites, the rise of VPN services and Tor has made blocking at a country level somewhat problematic. You may get many of the IPs associated with the country but you will not get them all.

At a brute force country level it is possible to use the Delegated ranges lists but that runs into the problem where IP ranges are subnetted and allocated to other countries. This happens more with hosting service providers than with ISPs. There is also the Adjacent Markets effect, where a provider operates in geographically close markets and the provider's largest IP range encompasses all the country-level allocations. This problem typically recurs every time a large transnational cable TV/ISP acquires a new range of IPs and online services such as Netflix are waiting for the IP range lists to update. In the meantime, the cable ISP's users generally appear, to the online services, as being in another country.
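To make the sub-allocation problem concrete, here is a small Python sketch of a longest-prefix lookup over a country table. With only the covering allocation in the table every address in it looks Swedish; once a more specific entry is known, it wins for the addresses it covers. The table contents and the name `lookup_country` are invented for the example.

```python
import ipaddress

def lookup_country(address, table):
    """Return the country of the most specific prefix containing `address`."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, cc in table.items():
        net = ipaddress.ip_network(prefix)
        # Prefer the longest (most specific) matching prefix.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, cc)
    return best[1] if best else None

# Hypothetical data: a /22 registered to SE with a /24 used in NO.
DELEGATED_ONLY = {"198.51.100.0/22": "SE"}
WITH_SUBASSIGNMENT = {"198.51.100.0/22": "SE", "198.51.102.0/24": "NO"}

if __name__ == "__main__":
    for ip in ("198.51.100.7", "198.51.102.7"):
        print(ip,
              lookup_country(ip, DELEGATED_ONLY),
              lookup_country(ip, WITH_SUBASSIGNMENT))
```

The linear scan is fine for a demonstration; a real table with hundreds of thousands of prefixes would want a radix tree or similar structure.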
4) In theory geofeeds(http://tools.ietf.org/html/draft-google-self-published-geofeeds-02) would be a nice solution, but as I understand the RFC, it would work for my example only in case all the IP address users would provide their geofeed and there is a centralized database to query.
The idea of all IP address users submitting their data is nice in theory but it runs into much the same problem as submission-based web directories. Most users are either unaware of the existence of such projects or have no interest in participating.
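For the networks that do publish a geofeed, consuming it is straightforward. The sketch below assumes the CSV layout described in the self-published geofeeds draft (prefix, country code, region, city, postal code, with `#` comment lines); the function names and sample feed are hypothetical, and it makes no attempt to validate who is allowed to publish data for a prefix.

```python
import csv
import ipaddress

def parse_geofeed(lines):
    """Map each valid prefix in a geofeed to its country code."""
    entries = {}
    data = (l for l in lines if l.strip() and not l.lstrip().startswith("#"))
    for row in csv.reader(data):
        try:
            net = ipaddress.ip_network(row[0].strip())
        except ValueError:
            continue  # skip rows whose first field is not a prefix
        entries[net] = row[1].strip().upper() if len(row) > 1 else ""
    return entries

def prefixes_for(entries, country):
    """Return the prefixes that the feed places in `country`, sorted."""
    return sorted(str(net) for net, cc in entries.items() if cc == country)

# Hypothetical feed contents using documentation prefixes.
SAMPLE_FEED = """\
# prefix,country,region,city,postal
192.0.2.0/24,SE,SE-AB,Stockholm,
198.51.100.0/24,NO,,,
not-a-prefix,SE,,,
"""

if __name__ == "__main__":
    feed = parse_geofeed(SAMPLE_FEED.splitlines())
    print(prefixes_for(feed, "SE"))
```

Coverage remains the limiting factor: a feed only describes the prefixes its publisher chose to list.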
Are there any other possibilities to geolocate IPv4 addresses with higher accuracy?
There is, but it is seriously labour and resource intensive, as it would require a working model of a country's network infrastructure. Basically it uses a combination of IP data and IP mapping using route tracing. There were some US patents published on it a few years ago (I think that Google may have been one of the patentees).

Regards...jmcc
--
**********************************************************
John McCormac  *  e-mail: jmcc@hosterstats.com
MC2            *  web: http://www.hosterstats.com/
22 Viewmount   *  Domain Registrations Statistics
Waterford      *  And Historical DNS Database.
Ireland        *  Over 396 Million Domains Tracked.
IE             *  web: http://newgtldnews.com
**********************************************************
participants (10)
- A.L.M.Buxey@lboro.ac.uk
- Bacon Zombie
- Blake Hudson
- Dave Sparro
- David Hofstee
- Joe Abley
- John McCormac
- Martin T
- Måns Nilsson
- Roland Dobbins