I'm sure there is research out there, but I can't find it, so does anyone know of any research showing how good/bad using DNS anycast is as a kludgey traffic optimiser? (i.e. having multiple datacenters, all anycasting the authoritative name server for a domain, but each datacenter's DNS server resolving the domain name to an IP local to that datacenter, under the assumption that if the end user hit that DNS server first, there is "some" relationship between that datacenter and good performance for that user.) The question is, what is that "some" relationship? 80% as good as Akamai? Terrible? TIA
On Wed, 1 Sep 2004, Steve Francis wrote:
> I'm sure there is research out there...
Why? :-)
> ...how good/bad using DNS anycast is as a kludgey traffic optimiser?
I'd hardly call it a kludge. It's been standard best-practice for over a decade.
> The question is, what is that "some" relationship? 80% as good as
> Akamai? Terrible?
Should be much higher than Akamai, since that's not what they're optimizing for. If you want nearest server, anycast will give you that essentially 100% of the time. Akamai tries to get queries to servers that have enough available capacity to handle the load, since they're handling bursty, high-bandwidth applications rather than DNS.
-Bill
Bill Woodcock wrote:
On Wed, 1 Sep 2004, Steve Francis wrote:
I'm sure there is research out there...
Why? :-)
Usual - if I build it myself, will it work well enough, or should I pony up for a CDN?
...how good/bad using DNS anycast is as a kludgey traffic optimiser?
I'd hardly call it a kludge. It's been standard best-practice for over a decade.
I thought it was standard best practice for availability, like for root name servers. I thought it was not a good "closest server" selection mechanism, as you'll be going to the closest server as determined by BGP - which may have little relationship to the server with lowest RTT. It'd be nice to see some metrics either way....
The question is, what is that "some" relationship? 80% as good as Akamai? Terrible?
Should be much higher than Akamai, since that's not what they're optimizing for. If you want nearest server, anycast will give you that essentially 100% of the time. Akamai tries to get queries to servers that have enough available capacity to handle the load, since they're handling bursty, high-bandwidth applications rather than DNS.
-Bill
(Caution: Chris is a chemical engineer, not an anycast engineer)
On Wed, 1 Sep 2004, Steve Francis wrote:
Bill Woodcock wrote:
...how good/bad using DNS anycast is as a kludgey traffic optimiser?
I'd hardly call it a kludge. It's been standard best-practice for over a decade.
If I read your original request correctly you were planning on:
1) having presence in multiple datacenters (assume multiple providers as well)
2) having an 'authoritative' DNS server in each facility (or 2/3/4 whatever per center)
3) returning datacenter-1-host-1 from datacenter-1-authserver-1, datacenter-2-host-2 from datacenter-2-authserver-1, and so forth.
This isn't really 'anycast' so much as 'different A records depending on server which was asked'. So, you'd be dependent on:
1) order of DNS requests made to AUTH NS servers for your domain/host
2) speed of network(s) between requestor and responder
3) effects of using caching DNS servers along the route
You are not, now, making your decision on 'network closeness' so much as 'application swiftness'. I suspect you'd really also introduce some major troubleshooting headaches with this setup, not just for you, but for your users as well. I think in the end you probably want to obtain PI space from ARIN and use that as the 'home' for your DNS and Application servers, or at least the application servers. There was some mention, and research I believe(?), about the value of having a partial Anycast deployment, so 3/4ths of your capacity on Anycast servers and 1/4th on 'normal' hosts to guard against route flaps and dampening of prefixes... I'm sure that some of the existing anycast users could provide much more relevant real-world experiences though.
-chris
Christopher L. Morrow wrote:
If I read your original request correctly you were planning on: 1) having presence in multiple datacenters (assume multiple providers as well) 2) having a 'authoritative' DNS server in each facility (or 2/3/4 whatever per center) 3) return datacenter-1-host-1 from datacenter-1-authserver-1, datacenter-2-host-2 from datacenter-2-authserver-1, and so forth.
This isn't really 'anycast' so much as 'different A records depending on server which was asked'
Well, there'd be one NS record returned for the zone in question. That NS record would be an IP address that is anycasted from all the datacenters. So end users (or their DNS servers) would all query the same IP address as the NS for that zone, but would end up at different datacenters depending on the whims of the anycasted BGP space. Once they reached a name server, then yes, it changes to 'different A records depending on server which was asked'
So, you'd be dependent on: 1) order of DNS requests made to AUTH NS servers for your domain/host
As there'd only be one NS server address returned, that negates this point.
2) speed of network(s) between requestor and responder
Or the closeness (in a BGP sense) b/w the requester and the anycasted DNS server.
3) effects of using caching DNS servers along the route
True. But I'm not trying to cope with instantly changing dynamic conditions.
I suspect you'd really also introduce some major troubleshooting headaches with this setup, not just for you, but for your users as well.
I don't doubt that. :-)
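For concreteness, here is a minimal sketch of the arrangement being discussed, with hypothetical datacenter names and RFC 5737 documentation addresses standing in for real ones: every anycast instance of the authoritative server answers the same query with the A record of its own local web farm.

    # Toy model only; not any particular nameserver's configuration.
    # Datacenter names and addresses are made up for illustration.
    LOCAL_WEB_FARM = {
        "dc-sjc": "192.0.2.10",     # web VIP colocated in datacenter 1
        "dc-iad": "198.51.100.10",  # web VIP colocated in datacenter 2
        "dc-ams": "203.0.113.10",   # web VIP colocated in datacenter 3
    }

    def answer_a_query(qname, this_datacenter):
        """Each instance listens on the same anycasted NS address but
        hands back the A record for its own co-located web farm."""
        if qname != "www.example.com":
            raise KeyError("not authoritative for %s" % qname)
        return LOCAL_WEB_FARM[this_datacenter]

    # BGP decides which instance a given resolver reaches; the answer
    # then differs by instance:
    print(answer_a_query("www.example.com", "dc-sjc"))  # 192.0.2.10
    print(answer_a_query("www.example.com", "dc-ams"))  # 203.0.113.10

The rest of the thread is essentially about whether "the instance BGP picks" correlates with "the web farm this user would be fastest to".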
On Wed, 1 Sep 2004, Steve Francis wrote:
Christopher L. Morrow wrote:
If I read your original request correctly you were planning on: 1) having presence in multiple datacenters (assume multiple providers as well) 2) having a 'authoritative' DNS server in each facility (or 2/3/4 whatever per center) 3) return datacenter-1-host-1 from datacenter-1-authserver-1, datacenter-2-host-2 from datacenter-2-authserver-1, and so forth.
This isn't really 'anycast' so much as 'different A records depending on server which was asked'
Well, there'd be one NS record returned for the zone in question. That NS record would be an IP address that is anycasted from all the datacenters. So end users (or their DNS servers) would all query the same IP address as the NS for that zone, but would end up at different datacenters depending on the whims of the anycasted BGP space.
Hmm, why not anycast the service/application ips? Having inconsistent DNS info seems like a problem waiting to bite your behind.
I suspect you'd really also introduce some major troubleshooting headaches with this setup, not just for you, but for your users as well.
I don't doubt that. :-)
which I'd think you'd want to minimize as much as possible, right?
On Wed, Sep 01, 2004 at 08:00:53PM +0000, Christopher L. Morrow wrote:
On Wed, 1 Sep 2004, Steve Francis wrote:
Christopher L. Morrow wrote:
If I read your original request correctly you were planning on: 1) having presence in multiple datacenters (assume multiple providers as well) 2) having a 'authoritative' DNS server in each facility (or 2/3/4 whatever per center) 3) return datacenter-1-host-1 from datacenter-1-authserver-1, datacenter-2-host-2 from datacenter-2-authserver-1, and so forth.
This isn't really 'anycast' so much as 'different A records depending on server which was asked'
Well, there'd be one NS record returned for the zone in question. That NS record would be an IP address that is anycasted from all the datacenters. So end users (or their DNS servers) would all query the same IP address as the NS for that zone, but would end up at different datacenters depending on the whims of the anycasted BGP space.
Hmm, why not anycast the service/application ips? Having inconsistent DNS info seems like a problem waiting to bite your behind.
Which begs the question.. is anyone doing this right now? I've been wondering about the potential issues wrt anycasting tcp applications.. TCP sessions would be affected negatively during a route change..
-J
--
James Jun                          TowardEX Technologies, Inc.
Technical Lead                     Network Design, Consulting, IT Outsourcing
james@towardex.com                 Boston-based Colocation & Bandwidth Services
cell: 1(978)-394-2867              web: http://www.towardex.com , noc: www.twdx.net
On Wed, 1 Sep 2004, James wrote:
On Wed, Sep 01, 2004 at 08:00:53PM +0000, Christopher L. Morrow wrote:
On Wed, 1 Sep 2004, Steve Francis wrote:
Christopher L. Morrow wrote:
Hmm, why not anycast the service/application ips? Having inconsistent DNS info seems like a problem waiting to bite your behind.
Which begs the question.. is anyone doing this right now? I've been wondering about the potential issues wrt anycasting tcp applications.. TCP sessions would be affected negatively during a route change..
short-lived tcp is probably ok though (like static webpages or something of that sort). You'll also have to watch out for maintaining state for distributed application servers (I suppose). TCP anycast has many more complicated implications than UDP/DNS things, or so it seems to my untrained/educated eye.
On Wed, 1 Sep 2004, James wrote:
>> Hmm, why not anycast the service/application ips? Having
>> inconsistent DNS info seems like a problem waiting to bite your
>> behind.
> Which begs the question.. is anyone doing this right now?
Yes, lots of people. Akamai is the largest provider of services based on inconsistent DNS that I know of, and they've been doing it for quite a while. They were by no means a pioneer; many others did it before them, they might just be the one you've heard of.
> I've been wondering about the potential issues wrt anycasting tcp
> applications. TCP sessions would be affected negatively during a
> route change.
Yup, which happens about one hundredth as often as TCP sessions being dropped for other reasons, so it's not worth worrying about. You'll never measure it, unless your network is already too unstable to carry TCP flows anyway. This is also ancient history. I, and I assume plenty of other people, were doing this with long-lived FTP sessions prior to the advent of the World Wide Web. This is the objection clever people who don't actually bother to try it normally come up with, after they've thought about it for a few (but fewer than, say, ten) minutes.
-Bill
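A back-of-the-envelope way to frame the TCP worry above: the chance that a session straddles an anycast route change grows with its duration. The change rate below is an invented, illustrative number, not a measurement.

    import math

    # Assumed, for illustration only: one route change affecting this
    # anycast prefix per day on average.
    changes_per_second = 1.0 / 86400

    def p_session_broken(duration_seconds):
        """Probability a session of the given length sees at least one
        route change, modelling changes as a Poisson process."""
        return 1 - math.exp(-changes_per_second * duration_seconds)

    for label, secs in [("1 s static page fetch", 1),
                        ("30 s page plus objects", 30),
                        ("2 h FTP transfer", 7200)]:
        print("%-24s %.4f%%" % (label, 100 * p_session_broken(secs)))

Under that assumption the short fetches come out around a thousandth of a percent, while the two-hour transfer is nearer eight percent, which is roughly the shape of the argument: negligible for short-lived web traffic, worth thinking about for very long flows.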
This isn't really 'anycast' so much as 'different A records depending on server which was asked'
right.
Well, there'd be one NS record returned for the zone in question. That NS record would be an IP address that is anycasted from all the datacenters. So end users (or their DNS servers) would all query the same IP address as the NS for that zone, but would end up at different datacenters depending on the whims of the anycasted BGP space.
that's generic dns anycast. it's safe if your routing team is very strong.
Once they reached a name server, then yes, it changes to 'different A records depending on server which was asked'
that's incoherent dns. when i first began castigating people in public for this, i coined the term "stupid dns tricks" to describe this behaviour. cisco now has products that will do this for you. many web hosting companies offer this incoherence as though it were some kind of feature. akamai at one time depended on it, speedera at one time did not, i don't know what's happening currently, perhaps they've flipflopped. dns is not a redirection service, and incoherence is bad. when you make a query you're asking for a mapping of <name,class,type,time> to an rrset. offering back a different rrset based on criteria like source ip address, bgp path length, ping rtt, or the phase of the moon, is a protocol violation, and you shouldn't do it. the only way to make this not be a protocol violation is to use zero TTLs to prohibit caching/reuse, which is also bad but for a different reason.
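One way to see the caching point in the paragraph above: once a shared recursive server caches whichever "local" answer it happened to receive, every later client behind that cache reuses it until the TTL runs out, wherever those clients actually sit. A toy model, with invented names and documentation addresses:

    import time

    TTL = 300  # seconds; a zero TTL avoids the reuse but defeats caching

    class ToyCache:
        """Minimal stand-in for a shared recursive resolver's cache."""
        def __init__(self):
            self.store = {}  # qname -> (answer, expiry)

        def lookup(self, qname, ask_authority):
            rec = self.store.get(qname)
            now = time.time()
            if rec and rec[1] > now:
                return rec[0]                  # cached answer, reused as-is
            answer = ask_authority(qname)      # whichever anycast instance BGP picked
            self.store[qname] = (answer, now + TTL)
            return answer

    cache = ToyCache()
    # Suppose the cache's first query happened to land on the Asian instance:
    print(cache.lookup("www.example.com", lambda q: "203.0.113.10"))
    # For the next five minutes every client of this cache gets that same
    # answer, even if a different instance would have said something else:
    print(cache.lookup("www.example.com", lambda q: "192.0.2.10"))  # still 203.0.113.10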
I suspect you'd really also introduce some major troubleshooting headaches with this setup, not just for you, but for your users as well.
I don't doubt that. :-)
not only is it bad dns, it's bad web service. the fact that a current routing table gives a client's query to a particular anycasted DNS server does not mean that the web services mirror co-located with that DNS server is the one that would give you the best performance. for one thing, the client's dns forwarding/caching resolver might have a different position in the connectivity graph than the web client. for another thing, as-path length doesn't tell you anything about current congestion or bandwidth -- BGP is not IGRP (and thank goodness!). if you want a web client to get its web data from the best possible web services host/mirror out of a distributed cluster, then you will have to do something a hell of a lot smarter than incoherent dns. there are open source packages to help you do this. they involve sending back an HTTP redirect to clients who would be best served by some other member of the distributed mirror cluster. -- Paul Vixie
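A minimal sketch of the redirect approach mentioned above (not any particular open-source package): an HTTP front end that 302s each client toward a mirror chosen from the client's own source address. The prefix-to-mirror table and hostnames are hypothetical.

    import ipaddress
    from http.server import BaseHTTPRequestHandler, HTTPServer

    # Hypothetical mapping of client prefixes to mirror URLs.
    MIRRORS = [
        (ipaddress.ip_network("192.0.2.0/24"),    "http://us-mirror.example.com"),
        (ipaddress.ip_network("198.51.100.0/24"), "http://eu-mirror.example.com"),
    ]
    DEFAULT = "http://www.example.com"

    class Redirector(BaseHTTPRequestHandler):
        def do_GET(self):
            client = ipaddress.ip_address(self.client_address[0])
            target = next((url for net, url in MIRRORS if client in net), DEFAULT)
            self.send_response(302)
            self.send_header("Location", target + self.path)
            self.end_headers()

    if __name__ == "__main__":
        HTTPServer(("", 8080), Redirector).serve_forever()

The decision is made on the web client's address rather than its resolver's, and nothing incoherent ends up in anyone's DNS cache.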
Paul Vixie wrote:
not only is it bad dns, it's bad web service. the fact that a current routing table gives a client's query to a particular anycasted DNS server does not mean that the web services mirror co-located with that DNS server is the one that would give you the best performance. for one thing, the client's dns forwarding/caching resolver might have a different position in the connectivity graph than the web client. for another thing, as-path length doesn't tell you anything about current congestion or bandwidth -- BGP is not IGRP (and thank goodness!).
I'm aware that web clients are not colocated with the client's name server, and that BGP does not attempt to optimise performance. However, I suspect that in most cases, the client is close enough to the name server, and the BGP best path is close enough to the best path if it were based on latency, that most clients would be happy with the result most of the time. I'm not aiming for 100%, just Good Enough. I'd be interested in seeing any data refuting either of those points, but it looks like I may have to do it, see what I find, and go write my own research paper. :-) (I have found data showing that clients' name servers are incorrect indicators of the RTT between two web locations and clients 21% of the time, but not how incorrect... http://www.ieee-infocom.org/2001/paper/806.pdf)
On Wed, 1 Sep 2004, Steve Francis wrote:
Paul Vixie wrote:
not only is it bad dns, it's bad web service. the fact that a current routing table gives a client's query to a particular anycasted DNS server does not mean that the web services mirror co-located with that DNS server is the one that would give you the best performance. for one thing, the client's dns forwarding/caching resolver might have a different position in the connectivity graph than the web client. for another thing, as-path length doesn't tell you anything about current congestion or bandwidth -- BGP is not IGRP (and thank goodness!).
I'm aware that web clients are not colocated with the client's name server, and that BGP does not attempt to optimise performance.
However, I suspect that in most cases, the client is close enough to the name server, and the BGP best path is close enough to the best path if it were based on latency, that most clients would be happy with the result most of the time. I'm not aiming for 100%, just Good Enough.
This is not always a good assumption:
1) dial clients sometimes get their DNS info from their radius profile (I believe); sometimes that dns server isn't on the same ASN as the dialup link.
2) many people have hardcoded DNS servers over the years, ones that have drifted from 'close' to 'far'.
3) corporations with multiple exit points and larger internal networks might have DNS servers that exit in one country but are queried internally from other countries/states/locations.
I think Paul's partly pointing out that you are using DNS for the wrong thing here, and partly pointing out that you are going to increase your troubleshooting overhead/complexity... Users on network X that you expect to use datacenter Y are really accessing datacenter Z because their dns cache server is located on network U :( I'm glad to see Joe/Paul/Bill jump in though... they do know quite a bit more about the practice of anycasting services on large networks.
On Sep 1, 2004, at 2:17 PM, Steve Francis wrote:
...how good/bad using DNS anycast is as a kludgey traffic optimiser?
I'd hardly call it a kludge. It's been standard best-practice for over a decade.
I thought it was standard best practice for availability, like for root name servers. I thought it was not a good "closest server" selection mechanism, as you'll be going to the closest server as determined by BGP - which may have little relationship to the server with lowest RTT. It'd be nice to see some metrics either way....
I don't know of any papers, but I have seen real world examples where a well peered network was adjacent to 5 or more anycasted servers: 3 in the US, one in Europe, and one in Asia. The network was going to the Asian server, because that router had the lowest Router ID. Not exactly sure how that makes it "much higher than Akamai", but that's what I've seen.
--
TTFN,
patrick
The question is, what is that "some" relationship? 80% as good as Akamai? Terrible?
Should be much higher than Akamai, since that's not what they're optimizing for. If you want nearest server, anycast will give you that essentially 100% of the time. Akamai tries to get queries to servers that have enough available capacity to handle the load, since they're handling bursty, high-bandwidth applications rather than DNS.
-Bill
On Wed, 1 Sep 2004, Steve Francis wrote:
>>> I'm sure there is research out there...
>> Why? :-)
> Usual - if I build it myself, will it work well enough, or should I pony
> up for a CDN?
Uh, what about that makes you sure that there's research out there?
> I thought it was standard best practice for availability, like for root
> name servers. I thought it was not a good "closest server" selection
> mechanism, as you'll be going to the closest server as determined by BGP
> - which may have little relationship to the server with lowest RTT.
And the lowest RTT doesn't necessarily have much to do with what's closest. If you want lowest RTT, that's what the DNS client already does for you, so you don't need to do anything at all.
-Bill
Bill Woodcock wrote:
On Wed, 1 Sep 2004, Steve Francis wrote:
I'm sure there is research out there...
Why? :-)
Usual - if I build it myself, will it work well enough, or should I pony up for a CDN?
Uh, what about that makes you sure that there's research out there?
Oops, sorry, misread the question. I should have said "I expect there is research..." I was answering why I wanted to know, not why I expect there is research...
I thought it was standard best practice for availability, like for root name servers. I thought it was not a good "closest server" selection mechanism, as you'll be going to the closest server as determined by BGP - which may have little relationship to the server with lowest RTT.
And the lowest RTT doesn't necessarily have much to do with what's closest. If you want lowest RTT, that's what the DNS client already does for you, so you don't need to do anything at all.
Excellent point, thanks. So there is no need to anycast the DNS servers and rely on BGP topology for selection. Instead, use bind's behaviour so that each resolving nameserver will be querying the authoritative nameserver that responds the fastest. If I have inconsistent replies from each authoritative name server, where each replies with the virtual IP of a cluster colocated with it, I will have reasonably optimised the RTT between the client's nameserver and the web farm. Whether that is good for the client remains to be seen, but it seems to be all that (most) commercial CDNs do. That just makes it too easy.... Am I missing something else, or is it really that simple to replicate a simple CDN?
-Bill
So there is no need to anycast the DNS servers and rely on BGP topology for selection. Instead use bind's behaviour so that each resolving nameserver will be querying the authoritative nameserver that responds the fastest.
However, note that only BIND does this. djbdns always selects nameservers randomly and the Windows selection algorithm is somewhat of a mystery. See http://www.nanog.org/mtg-0310/wessels.html
Duane W.
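A toy comparison of the two resolver behaviours just described: one keeps a smoothed RTT per authoritative server and prefers the lowest, roughly in the spirit of BIND's server selection; the other picks uniformly at random, as described for djbdns. Server names and RTTs are invented.

    import random

    # Invented per-server RTTs in milliseconds, one authority per datacenter.
    RTT_MS = {"ns-sjc": 12.0, "ns-iad": 78.0, "ns-ams": 160.0}
    ALPHA = 0.3  # smoothing factor, an arbitrary choice for this toy

    def srtt_preferring(queries=1000):
        """Prefer the server with the lowest smoothed RTT seen so far."""
        srtt = {ns: 0.0 for ns in RTT_MS}   # zero start means each gets tried once
        counts = {ns: 0 for ns in RTT_MS}
        for _ in range(queries):
            ns = min(srtt, key=srtt.get)
            sample = max(1.0, random.gauss(RTT_MS[ns], 5.0))  # jittered RTT sample
            srtt[ns] = (1 - ALPHA) * srtt[ns] + ALPHA * sample
            counts[ns] += 1
        return counts

    def random_picking(queries=1000):
        """Pick an authoritative server uniformly at random each time."""
        counts = {ns: 0 for ns in RTT_MS}
        for _ in range(queries):
            counts[random.choice(list(RTT_MS))] += 1
        return counts

    print("srtt-preferring:", srtt_preferring())
    print("random-picking: ", random_picking())

The first converges on the fastest-responding authority, which is what makes "skip the anycast and let resolvers pick" plausible for BIND clients; the second spreads queries evenly, so the same trick buys nothing for resolvers that behave that way.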
On Wed, 1 Sep 2004, Steve Francis wrote:
I thought it was standard best practice for availability, like for root name servers. I thought it was not a good "closest server" selection mechanism, as you'll be going to the closest server as determined by BGP - which may have little relationship to the server with lowest RTT. It'd be nice to see some metrics wither way....
For anycast within an organisation, it will be as determined by the IGP, not BGP. regards, -- Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A Fortune: Genius, n.: A chemist who discovers a laundry additive that rhymes with "bright."
On 2 Sep 2004, at 06:05, Bill Woodcock wrote:
If you want nearest server, anycast will give you that essentially 100% of the time.
Just to clarify this slightly, since I've known people to misinterpret this point: a clear, contextual understanding of the word "nearest" is important in understanding this sentence.

Here's an example: France Telecom was an early supporter of F-root's anycast deployment in Hong Kong. Due to the peering between OpenTransit and F at the HKIX, the nearest F-root server to OT customers in Paris was in Asia, despite the fact that there were other F-root nodes deployed in Europe. Those OT customers were indeed reaching the nearest F-root node, or maybe they weren't, depending on what you understand by the word "near".

Another one: where anycast nodes are deployed within the scope of an IGP, topological nearness does not necessarily indicate best performance (since not all circuits will have the same loading, in general, and maybe a short, congested hop is not as "near" as several uncongested hops).

For F, we don't worry too much about which flavour of "near" we achieve for every potential client: redundancy/diversity/reliability/availability is more important than minimising the time to do a lookup, and the fact that the "near" we achieve in many cases corresponds to what human users expect it to mean is really just a bonus. However, in the general case it's important to understand what kind of "near" you need, and to deploy accordingly.

Joe
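A small illustration of the two senses of "near" in play here: one choice looks only at AS-path length, the way BGP route selection often ends up deciding, while the other looks at measured RTT. The topology and numbers are invented to mirror the Paris/Hong Kong example.

    # Invented view from one client network in Paris: AS-path length to
    # each anycast instance versus measured round-trip time to it.
    INSTANCES = {
        #            (as_path_len, rtt_ms)
        "hongkong": (2, 190.0),  # short path thanks to a direct peering
        "london":   (4,  25.0),  # longer AS path, geographically close
        "paris":    (5,  12.0),
    }

    def nearest_by_routing(instances):
        """'Near' as the routing table sees it: fewest AS hops wins
        (ignoring the many real tie-breakers that follow)."""
        return min(instances, key=lambda name: instances[name][0])

    def nearest_by_rtt(instances):
        """'Near' as a stopwatch sees it: lowest round-trip time wins."""
        return min(instances, key=lambda name: instances[name][1])

    print("routing picks:", nearest_by_routing(INSTANCES))  # hongkong
    print("rtt picks:    ", nearest_by_rtt(INSTANCES))      # paris

Both answers are "the nearest instance"; they just use different rulers.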
On Wed, 01 Sep 2004, Steve Francis wrote:
I'm sure there is research out there, but I can't find it, so does anyone know of any research showing how good/bad using DNS anycast is as a kludgey traffic optimiser? (i.e. having multiple datacenters, all anycasting the authoritative name server for a domain, but each datacenters' DNS server resolving the domain name to an IP local to that datacenter, under the assumption that if the end user hit that DNS server first, there is "some" relationship between that datacenter and good performance for that user.)
I can give you one data point: VeriSign anycasts j.root-servers.net from all the same locations (minus one) where the com/net authoritative servers (i.e., *.gtld-servers.net) are located. An informal examination of query rates among all the J root instances (traffic distribution via BGP) vs. query rates among all the com/net servers (traffic distribution via iterative resolver algorithms, which means round trip time in the case of BIND and Microsoft) shows much more even distribution when the iterative resolvers get to pick vs. BGP. Note that we're not using the no-export community, so all J root routes are global.

When examining queries per second, there is a factor of ten separating the busiest J root instance from the least busy, whereas for com/net it's more like a factor of 2.5. Of course, I'm sure a lot of that has to do with server placement, especially in the BGP case.

For what it's worth,
Matt
--
Matt Larson <mlarson@verisign.com>
VeriSign Naming and Directory Services
On 01.09 15:18, Matt Larson wrote:
I can give you one data point: VeriSign anycasts j.root-servers.net from all the same locations (minus one) where the com/net authoritative servers (i.e., *.gtld-servers.net) are located. An informal examination of query rates among all the J root instances (traffic distribution via BGP) vs. query rates among all the com/net servers (traffic distribution via iterative resolver algorithms, which means round trip time in the case of BIND and Microsoft) shows much more even distribution when the iterative resolvers get to pick vs. BGP. ....
For what it's worth,
Thanks, Matt, for sharing this observation. Based on my own informal observations this has to be taken with a truckload of NaCl. The load characteristics of TLD servers and root servers are vastly different. The root servers usually get large amounts of (bogus) load from relatively few sources, whereas the sources of load for TLD servers are more evenly distributed to start with. Daniel
On Wed, Sep 01, 2004 at 11:06:16AM -0700, Steve Francis wrote:
I'm sure there is research out there, but I can't find it, so does anyone know of any research showing how good/bad using DNS anycast is as a kludgey traffic optimiser?
http://www.caida.org/outreach/papers/2002/Distance/
This paper would be somewhat on-topic, as you can infer the performance characteristics that anycast would have. No direct comparisons are made to Akamai, etc., but maybe you can infer those as well.
-dre
participants (12)
- Andre Gironda
- Bill Woodcock
- Christopher L. Morrow
- Daniel Karrenberg
- Duane Wessels
- James
- Joe Abley
- Matt Larson
- Patrick W Gilmore
- Paul Jakma
- Paul Vixie
- Steve Francis