Re: Recommended DNS server for a medium 20-30k users isp

21 Aug 2025

Responding to pings to 8.8.8.8 is not something we've ever advertised as a
service, we have on multiple occasions told people _not_ to rely on it, and
it comes with no SLA -- it's just a best^Wworst effort "service".  Seeing
people rely on this is concerning, since it's not something we're committed
to supporting.  And, the best thing to do when you suspect there might be
unknown or undocumented reliance on something is to expose it by triggering
a brief (and reversible!) disruption.  That gives people time to identify
the dependency before an unintentional long-term outage causes significant
problems, when you may not have the option to revert to the previous
configuration.

Lest anyone still think I'm being "flippant" about this, I've actually done
it before... in 2016 we speculated there might be machines talking directly
to one of our authoritative nameservers (ns[1-4].google.com) rather than
going through a recursive resolver (which would know how to fail over if an
authoritative were unreachable for any reason).  To flush out any such
dependencies, I deployed an ACL that dropped all traffic to ns4.google.com,
for about an hour.  Graphs showed traffic immediately shifted to ns[1-3],
we saw no discussion of the test here or on dns-operations@, and no Google
service owners reported disruptions (had they, we could have ended the test
early).  At the end of the test we removed the ACL and traffic shifted
back to a roughly equal balance over the next 10 minutes.
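
To illustrate the failover behavior mentioned above (and not as a
description of how any particular resolver is implemented), here is a
minimal Python sketch.  The function name query_with_failover and the
send_query callable are hypothetical; real recursive resolvers typically
also weigh servers by response time, which this sketch ignores.

```python
from typing import Callable, Optional, Sequence, Tuple

# The four authoritative names mentioned in the text.
AUTHORITATIVES = [f"ns{i}.google.com" for i in range(1, 5)]


def query_with_failover(
    servers: Sequence[str],
    send_query: Callable[[str], Optional[str]],
) -> Optional[Tuple[str, str]]:
    """Try each server in order; return (server, answer) for the first reply.

    `send_query` is a hypothetical callable that returns an answer, returns
    None, or raises OSError (for example TimeoutError) when a server is
    unreachable.  Returns None if no server answers.
    """
    for server in servers:
        try:
            answer = send_query(server)
        except OSError:
            # Unreachable or timed out: move on to the next server.
            continue
        if answer is not None:
            return server, answer
    return None
```

A client hard-coded to a single nameserver is the degenerate case of a
one-element server list: when that server is dropped by an ACL, the
function returns None and there is nothing to fail over to.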

Back to today's discussion, I'm only threatening to drop (or more severely
throttle) ICMP to 8.8.8.8, _not_ DNS resolution, since DNS resolution _is_
a service we offer.  That said, we do reserve the right to drop abuse to
that service (including UDP amplification attacks and DNS cache-busting
attacks) to protect ourselves and others.

Damian

On Tue, Aug 12, 2025 at 9:26 AM John Todd via NANOG <nanog@lists.nanog.org>
wrote:
...
You would be surprised as to what percentage of DNS recursive resolution
traffic is "a.root-servers.net" and "www.example.com" and other more
specific names like "connectivitycheck.gstatic.com" (which I know has
different purposes.)
Related: there is a draft at the IETF about probing for "reachability"
using the DNS, rather than picking random names (which tends to skew data
or impose unnecessary costs in various ways) or using ICMP echo. Since
query-based status checking seems to be a thing that people do anyway,
maybe it should be formalized so everyone can use/expect the same
methods.
https://datatracker.ietf.org/doc/draft-sst-dnsop-probe-name/
Despite the flippant comment below about "april 1st experiment" with the
largest global resolver, there is a significant risk associated with the
concentration of measurements on systems with unintentional shared fate
issues. I expect there is a large community of services which expect
correct DNS resolution and ICMP echo response from "a.root-servers.net"
and "www.google.com" as indicators of general network accessibility. If
(for example) the services in .com/.net/.org were to be offline, this
would probably create a much larger impact than the localized outage
itself, since both of those services would be offline, which would
trigger undetermined failure behaviors in many network
monitoring/automation or application software stacks.
Using IP addresses for service check destinations is slightly better but
as noted, ICMP is rarely a service with an SLA, and ICMP echo is
frequently blocked or heavily rate-limited. I will comment with my Quad9
hat on that there is no risk of us doing an April 1st experiment of
turning off ICMP echo packets to 9.9.9.9. There are however real risks
of ICMP having increased failure rates in DDoS conditions in any
network, either locally or at the receiving end. As another DNS-oriented
friend of mine has in his .sig: "The Prudent Mariner never relies solely
on any single aid to navigation."
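
One way to act on the shared-fate concern above is to tag each probe
target with the dependency it shares with others, and to report the link
as "down" only when every independent group fails. The following Python
sketch assumes that approach; the classify function and the group labels
used in the example are hypothetical, and it is not taken from any
existing monitoring tool.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def classify(results: Iterable[Tuple[str, bool]]) -> str:
    """Summarize probe results given as (shared_fate_group, succeeded) pairs.

    Returns "up" if every group has at least one success, "degraded" if
    only some groups do, and "down" if no group has any success.
    """
    by_group: Dict[str, List[bool]] = defaultdict(list)
    for group, ok in results:
        by_group[group].append(ok)
    if not by_group:
        raise ValueError("no probe results supplied")

    healthy = [group for group, oks in by_group.items() if any(oks)]
    if len(healthy) == len(by_group):
        return "up"
    if healthy:
        # At least one independent group still answers, so the failure is
        # more likely at the target than on the local link.
        return "degraded"
    return "down"
```

For example, classify([("google-icmp", False), ("quad9-dns", True)])
returns "degraded" rather than "down": losing ICMP echo to one operator
is treated as a problem with that aid to navigation, not with the network.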
JT
On 12 Aug 2025, at 7:15, David Prall via NANOG wrote:
...
This here has always been my biggest concern with external monitoring.
If the chosen site decides to deny ping one day then your monitoring
tool is broken.
You can do a quick DNS lookup via a DNS server, since they shouldn't turn
that off. But what happens when they notice the same site doing the
same lookup(s) every x minutes?
In the past I've utilized the root DNS servers as a good measurement
tool. The majority are anycast. All are dual-stack so I get both IPv4 and
IPv6 verification. If 60% of them are responding we should be good.
Again, this is load they aren't expecting, though I assume they know it
is happening. I can rotate through doing a DNS lookup for .com, .net,
.org, .gov, etc. so that I'm not doing the same thing over and over
and I'm utilizing something they are designed and prepared to handle.
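
The check described above could be sketched roughly as follows in
Python. This is a minimal illustration under stated assumptions, not the
poster's actual tooling: the function names (build_ns_query, udp_probe,
quorum_ok) are hypothetical, the root servers are addressed by hostname
for brevity (a real monitor would more likely use addresses from a root
hints file so the probe does not itself depend on local resolution), and
the 60% threshold is taken from the text.

```python
import itertools
import random
import socket
import struct

ROOT_SERVERS = [f"{letter}.root-servers.net" for letter in "abcdefghijklm"]
TLDS = ["com", "net", "org", "gov"]


def build_ns_query(name: str) -> bytes:
    """Build a minimal DNS query for NS records of `name` (recursion off)."""
    query_id = random.randint(0, 0xFFFF)
    # Header: ID, flags (0 = standard query, RD clear), QDCOUNT=1, rest 0.
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.strip(".").split(".")
    )
    # Root label terminator, then QTYPE=NS (2) and QCLASS=IN (1).
    return header + labels + b"\x00" + struct.pack("!HH", 2, 1)


def udp_probe(server: str, name: str, timeout: float = 2.0) -> bool:
    """Return True if `server` sends a DNS response matching our query ID."""
    query = build_ns_query(name)
    try:
        family, socktype, proto, _, addr = socket.getaddrinfo(
            server, 53, type=socket.SOCK_DGRAM
        )[0]
        with socket.socket(family, socktype, proto) as sock:
            sock.settimeout(timeout)
            sock.sendto(query, addr)
            reply, _ = sock.recvfrom(4096)
    except OSError:  # covers timeouts and resolution failures
        return False
    # Same ID as the query, and the QR bit set (it is a response).
    return len(reply) >= 12 and reply[:2] == query[:2] and bool(reply[2] & 0x80)


def quorum_ok(servers, names, probe=udp_probe, threshold=0.6) -> bool:
    """Probe each server with the next name from `names`; compare to threshold.

    `names` is an iterator (e.g. itertools.cycle(TLDS)) so the rotation
    continues across successive calls instead of restarting each time.
    """
    answered = sum(1 for server in servers if probe(server, next(names)))
    return answered / len(servers) >= threshold
```

A caller would create names = itertools.cycle(TLDS) once and then call
quorum_ok(ROOT_SERVERS, names) on each monitoring interval. Passing a
different probe function makes the quorum logic testable without sending
any traffic.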
David
--
https://dprall.net
On 8/11/2025 8:08 PM, Damian Menscher via NANOG wrote:
...
On Mon, Aug 11, 2025 at 3:08 PM Matthew Petach via NANOG <
nanog@lists.nanog.org> wrote:
...
Having been bitten by this in the past...never base your determination of
"healthy" or "working" on a single external data reference.
It can be tempting to just assume 8.8.8.8 will always be "up" and
"pingable" to verify your internet connectivity is good...right up to the
point where Google has a routing snafu
...
No need for a routing snafu... 8.8.8.8 is currently getting a steady-state
27Mpps (million packets/second) of ICMP ECHO_REQUEST.  Internet
connectivity checking is not a service we offer, and there is no SLA for
it, therefore it may go away at any time.  There is a very real risk of me
running an April 1st experiment of "what would happen if I just ACL off all
the pings?".  I might have guessed I'd light up a couple dozen pagers and
start a nanog@ flamewar... but if anyone is basing routing decisions on
that, it will be a "fun" day indeed!
Damian
_______________________________________________
NANOG mailing list
https://lists.nanog.org/archives/list/nanog@lists.nanog.org/message/YIM6ZS3Z...