Hello everyone!
I have a small question and was wondering if someone could help me with it.
The question is: why do companies like Google and Amazon use partial anycasting in their CDN setups? For example, take a random hostname from the URL of a Picasa picture, lh3.googleusercontent.com. It is a CNAME, and at the end of the chain you will find different A records when checked from different locations.
When checked from my local system (in India):
    ;; QUESTION SECTION:
    ;lh3.googleusercontent.com. IN A
    ;; ANSWER SECTION:
    lh3.googleusercontent.com. 86276 IN CNAME googlehosted.l.googleusercontent.com.
    googlehosted.l.googleusercontent.com. 176 IN A 209.85.175.132
Next, a lookup from a server in Europe:
    ;; QUESTION SECTION:
    ;lh3.googleusercontent.com. IN A
    ;; ANSWER SECTION:
    lh3.googleusercontent.com. 86400 IN CNAME googlehosted.l.googleusercontent.com.
    googlehosted.l.googleusercontent.com. 300 IN A 209.85.148.132
So the IPs differ between the two cases. I understand that Google is anycasting its core DNS servers, so we always hit the nearest DNS server, and the DNS servers are more or less independent and carry different A records for the CDN names, which point to local cache server IP addresses. Here is the confirmation:
    anurag@laptop:~$ dig googleusercontent.com. ns +short
    ns2.google.com.
    ns3.google.com.
    ns4.google.com.
    ns1.google.com.
Picking ns1.google.com. and asking it for the IP of googlehosted.l.googleusercontent.com. from different locations:
    anurag@laptop:~$ dig @ns1.google.com googlehosted.l.googleusercontent.com. a +short
    209.85.175.132
    anurag@server7:~$ dig @ns1.google.com googlehosted.l.googleusercontent.com. a +short
    209.85.148.132
As expected, the same server name (which appears to be one server but is not) gives different values, so I am actually hitting different servers in the two cases.
Now my question is: why this setup, rather than simply having an A record for googlehosted.l.googleusercontent.com. that comes from anycasted IP address space? Why not anycast at the CDN itself rather than only at the DNS layer?
Can someone explain?
Thanks!
--
Anurag Bhatia
anuragbhatia.com
or simply - http://[2001:470:26:78f::5] if you are on IPv6 connected network!
Twitter: @anurag_bhatia <https://twitter.com/#!/anurag_bhatia>
Linkedin: http://linkedin.anuragbhatia.com
On Wed, Feb 1, 2012 at 3:25 PM, Anurag Bhatia <me@anuragbhatia.com> wrote:
Hello everyone!
I have a small question and was wondering if someone could help me with that.
Question is - why do companies like Google and Amazon use partial anycasting in CDN setups? E.g. if we pick a random hostname from the URL of a Picasa picture - lh3.googleusercontent.com - it is a CNAME, and at the end of the chain you will find different A records when checked from different locations.
The simple answer for this is, Google cannot be expected to have a local cache of every image supplied to them globally on every server. So they use unicast servers behind a DNS-based geo load balancer configuration. As for DNS, every anycasted node is expected to be able to resolve any DNS request that is made. It's all a matter of disk and acceptable delay in providing the data from the "closest" disk.
charles
On Feb 1, 2012, at 3:25 PM, Anurag Bhatia wrote:
I have a small question and was wondering if someone could help me with that.
Question is - why do companies like Google and Amazon use partial anycasting in CDN setups? E.g. if we pick a random hostname from the URL of a Picasa picture - lh3.googleusercontent.com - it is a CNAME, and at the end of the chain you will find different A records when checked from different locations.
The real answer to this is highly variable, based on criteria that are unknown to most people outside of the operators of these networks.
What is fairly well known:
1) Anycast can be used to provide low latency queries for stateless (UDP) and stateful (TCP) protocols.
2) Query responses will vary based on the node hit and/or the source IP address the query comes from. Source address is used to attempt traffic localization. This can be defeated by using another resolver on purpose, or inadvertently (e.g. a corporate VPN may cause you to use a CDN node that is non-local by using corp DNS).
3) CDNs vary the response based upon uptime/load and other unknown policy criteria. They don't want to send you to a server that is down, nor one that is overloaded. The secret is in the sauce here and is complex enough that it's not easy to perfect.
Also, be careful equating Anycast w/ CDN. They are not the same thing but sometimes are related (e.g. cousins).
- Jared
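Point 3 above can be sketched roughly as follows. This is only an illustration of the idea (skip servers that are down, avoid overloaded ones, then prefer the closest), not how any particular CDN does it. The CacheNode fields, the 0.8 load threshold and the node list are all made-up assumptions; the addresses are documentation prefixes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CacheNode:
    name: str           # hypothetical label for the node
    address: str        # unicast address that would go in the A record
    distance_ms: float  # assumed latency estimate towards the client
    load: float         # assumed utilisation, 0.0 (idle) to 1.0 (full)
    up: bool            # assumed health-check result


def pick_node(nodes: List[CacheNode], max_load: float = 0.8) -> Optional[CacheNode]:
    """Return the closest node that is up and below max_load.

    Falls back to the closest node that is up if every one is overloaded,
    and returns None if nothing is up at all.
    """
    healthy = [n for n in nodes if n.up]
    if not healthy:
        return None
    with_headroom = [n for n in healthy if n.load < max_load]
    candidates = with_headroom or healthy
    return min(candidates, key=lambda n: n.distance_ms)


# Made-up example data: the closest healthy node is nearly full.
NODES = [
    CacheNode("closest", "192.0.2.80", 10.0, 0.95, True),
    CacheNode("second", "198.51.100.80", 25.0, 0.40, True),
    CacheNode("far", "203.0.113.80", 120.0, 0.10, True),
    CacheNode("down", "203.0.113.81", 5.0, 0.0, False),
]

if __name__ == "__main__":
    chosen = pick_node(NODES)
    print(chosen.name, chosen.address)
```

With this data the node that is down is never considered, and the "second" node is chosen over "closest" because "closest" is above the load threshold. A real policy would have many more inputs than these three.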
On 1 February 2012 20:25, Anurag Bhatia <me@anuragbhatia.com> wrote: <snip>
Now my question here is - why this setup, rather than simply having an A record for googlehosted.l.googleusercontent.com. that comes from anycasted IP address space? Why not anycast at the CDN itself rather than only at the DNS layer?
You are confusing anycasting with offering different results.
I can have an anycast DNS setup where all my servers give the same response (example: most DNS providers). I can also have a single DNS server give 192.0.2.80 out to queries sourced from a US IP address, 198.51.100.80 for queries sourced from a German IP address and 203.0.113.80 to queries sourced from a Chinese address (djbdns has a module for this, for example).
I would guess that Google probably has a highly customised algorithm which uses a combination of the source IP and the node that your query arrived at as part of the process for deciding what answer to give you, along with dozens of other internal factors.
Although I do sometimes wonder why they use CNAME chains in cases where the same servers are authoritative for the target name anyway.
If you were wondering why they direct you to the unicast addresses for the local datacentre instead of just giving an anycast address which your nearest datacentre would answer: their algorithm might decide that it wants to serve you content from the second closest datacentre because the closest one is near capacity. Anycast can't do that.
- Mike
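As a rough sketch of the "single DNS server giving different answers by source address" case: the three answers below are the ones from the example above, but the client prefixes and their country labels are invented placeholders, and the default answer is an arbitrary choice. This is not the djbdns module, just the general idea in Python.

```python
import ipaddress

# Hypothetical mapping of client prefixes to countries (placeholder data).
PREFIX_TO_COUNTRY = {
    ipaddress.ip_network("10.1.0.0/16"): "US",
    ipaddress.ip_network("10.2.0.0/16"): "DE",
    ipaddress.ip_network("10.3.0.0/16"): "CN",
}

# Answers taken from the example in the message above.
COUNTRY_TO_ANSWER = {
    "US": "192.0.2.80",
    "DE": "198.51.100.80",
    "CN": "203.0.113.80",
}

# Assumption: clients that match no prefix get the US answer.
DEFAULT_ANSWER = "192.0.2.80"


def answer_for(source_ip: str) -> str:
    """Return the A record to hand out for a query from source_ip."""
    addr = ipaddress.ip_address(source_ip)
    matches = [net for net in PREFIX_TO_COUNTRY if addr in net]
    if not matches:
        return DEFAULT_ANSWER
    # Prefer the most specific (longest) matching prefix.
    best = max(matches, key=lambda net: net.prefixlen)
    return COUNTRY_TO_ANSWER[PREFIX_TO_COUNTRY[best]]


if __name__ == "__main__":
    for src in ("10.1.2.3", "10.2.5.9", "10.3.0.1", "172.16.0.1"):
        print(src, "->", answer_for(src))
```

Note that the "source" a DNS server sees is normally the resolver, not the end user, which is why using a distant resolver can get you a non-local answer.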
Nice explanation! Thanks Mike. Appreciate it.
Mike
I can also have a single DNS server give 192.0.2.80 out to queries sourced from a US IP Address, 198.51.100.80 for queries sourced from a German IP Address and 203.0.113.80 to queries sourced from a Chinese address (djbdns has a module for this for example).
I have never done such a setup, but I assume it works as you say. I wonder how it finds a US-based system from an IP address quickly (since it's a DNS server)?
Thanks.
On Feb 8, 2012, at 11:58 AM, Anurag Bhatia wrote:
Mike
I can also have a single DNS server give 192.0.2.80 out to queries sourced from a US IP Address, 198.51.100.80 for queries sourced from a German IP Address and 203.0.113.80 to queries sourced from a Chinese address (djbdns has a module for this for example).
I have never done such a setup, but I assume it works as you say. I wonder how it finds a US-based system from an IP address quickly (since it's a DNS server)?
Here is *one* method if you obtain a feed of geo-ip data from someone like Maxmind:
http://phix.me/geodns/
Several DNS providers have different methods and different geo-ip data vendors.
-b
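On the "quickly" part of the question: one common way to make a geo-ip lookup cheap is to keep the address ranges sorted and binary-search them, so each query costs O(log n) comparisons. The sketch below assumes a feed that boils down to non-overlapping (first address, last address, country) rows. The rows are invented placeholders, not real geo-ip data, and this is not the format or method of any specific vendor or of the page linked above.

```python
import bisect
import ipaddress
from typing import Optional

# Hypothetical feed rows: (first address, last address, country code).
FEED = [
    ("10.1.0.0", "10.1.255.255", "US"),
    ("10.2.0.0", "10.2.255.255", "DE"),
    ("10.3.0.0", "10.3.255.255", "CN"),
]

# Convert to integers once at load time and sort by range start.
RANGES = sorted(
    (int(ipaddress.ip_address(first)), int(ipaddress.ip_address(last)), cc)
    for first, last, cc in FEED
)
STARTS = [start for start, _last, _cc in RANGES]


def country_of(ip: str) -> Optional[str]:
    """Return the country code for ip, or None if no range covers it.

    Assumes the ranges do not overlap.
    """
    value = int(ipaddress.ip_address(ip))
    # Index of the last range whose start is <= value.
    i = bisect.bisect_right(STARTS, value) - 1
    if i >= 0 and value <= RANGES[i][1]:
        return RANGES[i][2]
    return None


if __name__ == "__main__":
    for src in ("10.1.255.255", "10.2.0.1", "10.4.0.0"):
        print(src, "->", country_of(src))
```

Even with a few million ranges the binary search is only a couple of dozen comparisons per query, so the lookup itself is not the slow part of answering a DNS request.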
On Thu, Feb 09, 2012 at 01:28:07AM +0530, Anurag Bhatia wrote: [snip]
I have never done such a setup, but I assume it works as you say. I wonder how it finds a US-based system from an IP address quickly (since it's a DNS server)?
Drop "ip geolocation" or "internet geolocation" into Your Favorite Search Engine. Short answer is some folks just refer to databases published/generated by others, some folks use DNS guesses, and some folks measure packet arrival. And most often, there is a combination of methods used.
--
RSUC / GweepNet / Spunk / FnB / Usenix / SAGE / NewNOG
Great explanation. Thanks everyone.
participants (6)
- Anurag Bhatia
- Brett Watson
- Charles Gucker
- Jared Mauch
- Joe Provo
- Mike Jones