On Wed, Oct 6, 2021 at 10:45 AM Michael Thomas <mike@mtcc.com> wrote:
So if I understand their post correctly, their DNS servers have the
ability to withdraw routes if they determine they are sub-optimal (fsvo). I
can certainly understand for the DNS servers to not give answers they
think are unreachable but there is always the problem that they may be
partitioned and not the routes themselves. At a minimum, I would think
they'd need some consensus protocol that says that it's broken across
multiple servers.
But I just don't understand why this is a good idea at all. Network
topology is not DNS's bailiwick so using it as a trigger to withdraw
routes seems really strange and fraught with unintended consequences.
Why is it a good idea to withdraw the route if it doesn't seem reachable
from the DNS server? Give answers that are reachable, sure, but to
actually make a topology decision? Yikes. And what happens to the cached
answers that still point to the supposedly dead route? They're going to
fail until the TTL expires anyway so why is it preferable to withdraw the
route too?
My guess is that their post, while clearer than most, doesn't go into
enough detail, but is it me or does it seem like this is a really weird
thing to do?
Mike
Hi Mike,
You're kinda thinking about this from the wrong angle.
It's not that the route is withdrawn if it doesn't seem reachable
from the DNS server.
It's that your DNS server is geolocating requests to the nearest
content delivery cluster, where the CDN cluster is likely fetching
content from a core datacenter elsewhere. You don't want that
remote/edge CDN node to give back A records for a CDN node
that is isolated from the rest of the network and can't reach the
datacenter to fetch the necessary content; otherwise, you'll have
clients that reach the page, can load the static elements on the
page, but all the dynamic elements hang, waiting for a fetch to
complete from the origin which won't ever complete. Not a very
good end user experience.
So, the idea is that if the edge CDN node loses connectivity to
the core datacenters, the DNS servers should stop answering
queries for A records with the local CDN node's address, and
let a different site respond back to the client's DNS request.
In particular, you really don't want the client to even send the
request to the edge CDN node that's been isolated; you want
to allow anycast to find the next-best edge site; so, once the
DNS servers fail the "can-I-reach-my-datacenter" health check,
they stop announcing the Anycast service address to the local
routers; that way, they drop out of the Anycast pool, and normal
Internet routing will ensure the client DNS requests are now sent
to the next-nearest edge CDN cluster for resolution and retrieving
data.
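
To make that concrete, here is a rough sketch in Python of the decision
each edge DNS server makes. Every name in it (EdgeDnsNode,
can_reach_datacenter, run_health_check, the probe callables) is made up
for illustration, and flipping a boolean stands in for whatever actually
tells the local routers to start or stop announcing the anycast address.
Treat it as a sketch of the idea, not anyone's production code.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class EdgeDnsNode:
    """Hypothetical edge DNS server that originates the anycast service address."""
    site: str
    announcing: bool = True  # True while the anycast address is advertised locally


def can_reach_datacenter(probes: Sequence[Callable[[], bool]], min_ok: int = 1) -> bool:
    """Run each probe once; healthy if at least min_ok of them succeed."""
    return sum(1 for probe in probes if probe()) >= min_ok


def run_health_check(node: EdgeDnsNode, probes: Sequence[Callable[[], bool]]) -> str:
    """Return the action to take and record the new state on the node."""
    healthy = can_reach_datacenter(probes)
    if healthy and not node.announcing:
        node.announcing = True   # rejoin the anycast pool
        return "announce"
    if not healthy and node.announcing:
        node.announcing = False  # drop out of the anycast pool
        return "withdraw"
    return "no-change"
```

In this sketch each probe is just a callable returning True or False, so
one probe per core datacenter would be the obvious way to wire it up. An
isolated site returns "withdraw" once and then "no-change" until
connectivity comes back.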
This works fine for ensuring that one or two edge sites that get
isolated due to fiber cuts don't end up pulling client requests into
them, and subsequently leaving the users hanging, waiting for
data that will never arrive.
However, it fails big-time if *all* sites fail their "can-I-reach-the-datacenter"
check simultaneously. When I was involved in the decision making
on a design like this, a choice was made to have a set of "really core"
sites in the middle of the network always announce the anycast prefixes,
as a fallback, so even if the routing wasn't optimal to reach them, the
users would still get *some* level of reply back.
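
A minimal sketch of that fallback rule, again in Python with invented
names (should_announce, announcing_sites, and the site labels are all
hypothetical): a site announces the anycast prefix if it is one of the
"really core" sites, or if its datacenter check passes.

```python
from typing import Dict, List, Tuple


def should_announce(is_core: bool, datacenter_reachable: bool) -> bool:
    """Core sites always announce; edge sites announce only while healthy."""
    return is_core or datacenter_reachable


def announcing_sites(sites: Dict[str, Tuple[bool, bool]]) -> List[str]:
    """sites maps a site name to (is_core, datacenter_reachable)."""
    return sorted(
        name
        for name, (is_core, reachable) in sites.items()
        if should_announce(is_core, reachable)
    )


# Worst case: every site fails its check at the same moment.
all_failed = {
    "edge-a": (False, False),
    "edge-b": (False, False),
    "core-1": (True, False),
}
print(announcing_sites(all_failed))
```

With every check failing, only the core site is left in the list, so the
anycast prefix never disappears from the routing table entirely.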
In this situation, that would have ensured that at least some DNS
servers were reachable; but it wouldn't have fixed the "oh crap we
pushed 'no router bgp' out to all the routers at the same time" type
problem. But that isn't really the core of your question, so we'll
just quietly push that aside for now. ^_^;
Point being--it's useful and normal for edge sites that may become
isolated from the rest of the network to be configured to stop announcing
the Anycast service address for DNS out to local peers and transit
providers at that site during the period in which they are isolated, to
prevent users from being directed to CDN servers which can't fetch
content from the origin servers in the datacenter. It's just generally
assumed that not every site will become "isolated" at the same time
like that. :)
I hope this helps clear up the confusion.
Thanks!
Matt