Re: Anycast 101

17 Dec 2004

On Fri, 17 Dec 2004, Iljitsch van Beijnum wrote:
...
> I got some messages from people who weren't exactly clear on how
> anycast works and fails. So let me try to explain...
Nice try.
...
> Anycast is now deployed for a significant number of root and gtld
> servers. Before anycast, most of those servers were located in the US,
> and most of the rest of the world suffered significant latency in
> querying them. Due to limitations in the DNS protocol, it's not
> possible to increase the number of authoritative DNS servers for a zone
> beyond around 13. With anycast, a much larger part of the world now has
> regional access to the root and com and net zones, and probably many
> more that I don't know about.
Think of this also as a reliability measure.  If a region of the world has
poor connectivity to the so-called "Internet core" (Remember the Sri Lanka
international fiber outage a few months ago?), a loss of international
connectivity can mean a loss of DNS, which breaks even local connectivity.
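
As a rough check on the "around 13" figure quoted above: the constraint usually cited is that the answer to an NS query for the zone has to fit in the 512-byte limit for DNS over UDP without EDNS0. The sketch below counts the bytes for the root zone, assuming server names of the form x.root-servers.net, ordinary name compression, and one A record per server in the additional section (no AAAA). The function names are made up for illustration.

```python
# Byte count of an NS response for the root zone, under the assumptions
# stated above (compressed names, one A record per server, no EDNS0).
HEADER = 12                 # fixed DNS header
QUESTION = 1 + 2 + 2        # root name "." + QTYPE + QCLASS
RR_FIXED = 2 + 2 + 4 + 2    # TYPE + CLASS + TTL + RDLENGTH
UDP_LIMIT = 512             # classic DNS-over-UDP payload limit


def wire_len(name: str) -> int:
    """Uncompressed wire length: one length byte per label plus the root byte."""
    labels = name.rstrip(".").split(".")
    return sum(1 + len(label) for label in labels) + 1


def ns_response_size(n_servers: int) -> int:
    """Size in bytes of a response listing n_servers (n_servers >= 1)."""
    first_rdata = wire_len("a.root-servers.net.")   # spelled out once
    later_rdata = (1 + 1) + 2                       # one-letter label + pointer
    owner_root = 1                                  # owner name "."
    ns_section = (owner_root + RR_FIXED + first_rdata) \
        + (n_servers - 1) * (owner_root + RR_FIXED + later_rdata)
    # Each A record: 2-byte pointer to the name, fixed fields, 4-byte address.
    additional = n_servers * (2 + RR_FIXED + 4)
    return HEADER + QUESTION + ns_section + additional


def max_servers(limit: int = UDP_LIMIT) -> int:
    """Largest server count whose response still fits in `limit` bytes."""
    n = 1
    while ns_response_size(n + 1) <= limit:
        n += 1
    return n


if __name__ == "__main__":
    print(ns_response_size(13), max_servers())
```

Under these assumptions 13 servers take 436 bytes and the hard ceiling is only a couple of servers higher, so "around 13" is about right once some headroom is left. Anycast sidesteps the limit because it adds instances without adding addresses.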
...
> However, there are some issues. The first one is that different packets
> can end up at different anycast instances. This can happen when BGP
> reconverges after some network event (or after an anycast instance goes
> offline and stops announcing the anycast prefix), but under some very
> specific circumstances it can also happen with per-packet load
> balancing. Most DNS traffic consists of single packets, but the DNS
> also uses TCP for queries sometimes, and when intermediate MTUs are
> small there may be fragmentation.
You're misunderstanding how per-packet load balancing is generally used.

Per-packet load balancing works very well when you've got two identical
circuits between the same two routers, and you want to make sure neither
circuit fills up while the other has spare capacity.

Using per-packet load balancing on non-identical paths (in your example,
out different peering or transit connections) doesn't work.  Even when
connecting to a unicast host, the packets would arrive out of order,
leading to some really nasty performance problems.  If anybody is using
per-packet load balancing in that sort of situation, anycast DNS is the
least of their problems.
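
To make the reordering point concrete, here is a small simulation, offered as a sketch rather than a model of any real router. It sends packets round-robin over a set of paths, each with a fixed one-way latency, and reports the order in which they arrive. The latencies, the send interval, and the function names are all assumptions chosen for illustration.

```python
# Round-robin (per-packet) load balancing over paths with fixed latencies.
# All numbers are illustrative; real paths also have jitter and queueing.

def arrival_order(n_packets, send_interval_ms, path_latencies_ms):
    """Return packet sequence numbers in the order they reach the far end."""
    def arrival_time(seq):
        path = seq % len(path_latencies_ms)          # round-robin path choice
        return seq * send_interval_ms + path_latencies_ms[path]
    # Ties are broken by sequence number so the result is deterministic.
    return sorted(range(n_packets), key=lambda seq: (arrival_time(seq), seq))


def count_late(order):
    """Count packets that arrive after a packet with a higher sequence number."""
    late, max_seen = 0, -1
    for seq in order:
        if seq < max_seen:
            late += 1
        else:
            max_seen = seq
    return late


if __name__ == "__main__":
    identical = arrival_order(6, 1, [10, 10])   # two identical circuits
    diverse = arrival_order(6, 1, [10, 30])     # e.g. two different transits
    print(identical, count_late(identical))     # [0, 1, 2, 3, 4, 5] 0
    print(diverse, count_late(diverse))         # [0, 2, 4, 1, 3, 5] 2
```

With identical circuits nothing is reordered. With a 20 ms difference between the paths, every packet on the slower path arrives behind later packets from the faster one, which is what hurts a TCP session to a unicast host just as much as one to an anycast address.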
...
> Another issue is the increased risk of fate sharing. In the old root
> setup, it was very unlikely for a non-single-homed network to see all
> the root DNS servers behind the same next hop address. With anycast,
> this is much more likely to happen. The pathological case is one where
> a small network connects to one or more transit networks and has
> local/regional peering, and then sees an anycast instance for all root
> servers over peering. If then something bad happens to the peering
> connection (peering router melts down, a peer pulls an AS7007, peering
> fabric goes down, or worse, starts flapping), all the anycasted
> addresses become unreachable at the same time.
You appear to be assuming that every anycast server in the world announces
routes for every anycasted address.

The general anycast rule is that for however many anycasted IP addresses
you have serving a zone, you have that many separate sets of anycast
nodes.  So, if you have a zone served by anyns1, anyns2, and anyns3, there
will be a set of nodes that is anyns1, a set of nodes that is anyns2, and
a set of nodes that is anyns3.  Different servers, different routers,
and probably different physical locations.  Are there scenarios where an
outage would lead to a loss of all of the anycast clouds?  Of course, but
those scenarios would apply to unicast servers as well.
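
The layout described above can be written down as a small data structure. This is only a sketch: anyns1, anyns2, and anyns3 are the example names from the paragraph, while the site codes and function names are invented for illustration. Each name server address is its own anycast cloud with its own set of nodes, and the zone stays resolvable as long as at least one cloud still has a node up.

```python
# Hypothetical layout: one anycast cloud per name server address, with no
# node shared between clouds. Site codes are invented for the example.
CLOUDS = {
    "anyns1": {"ams", "sjc", "hkg"},
    "anyns2": {"lhr", "iad", "nrt"},
    "anyns3": {"fra", "ord", "syd"},
}


def reachable_servers(clouds, failed_sites):
    """Name server addresses that still have at least one node up."""
    failed = set(failed_sites)
    return sorted(name for name, sites in clouds.items() if sites - failed)


def zone_resolvable(clouds, failed_sites):
    """The zone resolves if any one of its name server addresses answers."""
    return bool(reachable_servers(clouds, failed_sites))


if __name__ == "__main__":
    # Losing one node leaves all three addresses reachable.
    print(reachable_servers(CLOUDS, {"ams"}))
    # Losing a whole cloud still leaves the other two.
    print(reachable_servers(CLOUDS, {"ams", "sjc", "hkg"}))
```

Only a failure that takes out every site in every cloud makes the zone unresolvable in this model, and a failure of that size would take out unicast servers at the same sites too.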

The potentially valid point you've made is about switching servers during
BGP convergence.  As such, anycast might well be inappropriate for
long-term stateful connections.  However, BGP reconvergence should be
relatively rare, DNS queries finish quickly, and DNS is good about failing
over to another DNS server IP address when a query fails.  If your example
is a network whose entire routing table is reconverging, and they're
changing their routes to all the name servers for a particular zone, their
network performance is going to be pretty bad until convergence finishes
anyway.
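
The failover behaviour mentioned above can be sketched as a loop over the zone's name server addresses. This is a simplification under stated assumptions: the 2-second timeout and 50 ms round trip are illustrative values, the reachability check is a stand-in for actually sending a query, and real resolvers also remember which servers were slow or dead and avoid them on later queries.

```python
# Sequential failover across a zone's name server addresses.
# Timeout and RTT values are assumptions, not taken from any resolver.

def resolve_with_failover(servers, is_reachable, timeout_s=2.0, rtt_s=0.05):
    """Try each address in turn.

    Returns (answering_server, seconds_spent); the server is None if
    every address timed out.
    """
    elapsed = 0.0
    for addr in servers:
        if is_reachable(addr):
            return addr, elapsed + rtt_s    # got an answer
        elapsed += timeout_s                # waited out a full timeout
    return None, elapsed


if __name__ == "__main__":
    servers = [f"ns{i}" for i in range(1, 14)]      # 13 hypothetical addresses
    down = {"ns1"}                                   # one address unreachable
    server, seconds = resolve_with_failover(servers, lambda a: a not in down)
    print(server, round(seconds, 2))
```

Each unreachable address tried before a working one costs one full timeout, so a single lost path adds a noticeable but bounded delay to the first query, and nothing after that if the resolver remembers the failure.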
...
> Obviously this won't happen to the degree of unreachability in practice
> (well, unless there are only two addresses that are both anycast for a
> certain TLD, then your mileage may vary), but even if 5 or 8 or 12
> addresses become unreachable the timeouts get bad enough for users to
> notice.
Right, but if you're losing 5 or 8 or 12 diverse routes at the same time,
your problem probably has very little to do with anycast.

-Steve

Steve Gibbard