
Responding to pings to 8.8.8.8 is not something we've ever advertised as a service, we have on multiple occasions told people to _not_ rely on, and it comes with no SLA -- it's just a best^Wworst effort "service". Seeing people rely on this is concerning, since it's not something we're committed to supporting. And, the best thing to do when you suspect there might be unknown or undocumented reliance on something is to expose it by triggering a brief (and reversible!) disruption. That gives people time to identify the dependency before an unintentional long-term outage causes significant problems, when you may not have the option to revert to the previous configuration.

Lest anyone still think I'm being "flippant" about this, I've actually done it before... in 2016 we speculated there might be machines talking directly to one of our authoritative nameservers (ns[1-4].google.com) rather than going through a recursive resolver (which would know how to fail over if an authoritative were unreachable for any reason). To flush out any such dependencies, I deployed an ACL that dropped all traffic to ns4.google.com, for about an hour. Graphs showed traffic immediately shifted to ns[1-3], we saw no discussion of the test here or on dns-operations@, and no Google service owners reported disruptions, or we could have ended the test earlier. At the end of the test we removed the ACL and traffic shifted back to a roughly equal balance over the next 10 minutes.

Back to today's discussion, I'm only threatening to drop (or more severely throttle) ICMP to 8.8.8.8, _not_ DNS resolution, since DNS resolution _is_ a service we offer. That said, we do reserve the right to drop abuse to that service (including UDP amplification attacks and DNS cache-busting attacks) to protect ourselves and others.

Damian

On Tue, Aug 12, 2025 at 9:26 AM John Todd via NANOG <nanog@lists.nanog.org> wrote:
You would be surprised as to what percentage of DNS recursive resolution traffic is "a.root-servers.net" and "www.example.com" and other more specific names like "connectivitycheck.gstatic.com" (which I know has different purposes.)
Related: there is a draft at IETF about probing for "reachability" using the DNS rather than picking random names, which tends to skew data or present unnecessary costs in various ways, or using ICMP echo. Since query-based status checking seems to be a thing that people do anyway, maybe it should be formalized so everyone can use/expect the same methods.
https://datatracker.ietf.org/doc/draft-sst-dnsop-probe-name/
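For anyone wondering what a DNS-based reachability probe looks like on the wire, here is a minimal sketch using only the Python standard library. It is not taken from the draft: the function names (build_query, dns_probe) are hypothetical, the probe is IPv4/UDP only, and whatever name you pass in is your own choice, not the name the draft may define. It treats any well-formed reply with a matching ID as "reachable", regardless of the answer content.

```python
import secrets
import socket
import struct
from typing import Optional


def build_query(name: str, qtype: int = 1, query_id: Optional[int] = None) -> bytes:
    """Build a minimal DNS query (class IN, RD flag set) per RFC 1035."""
    if query_id is None:
        query_id = secrets.randbelow(1 << 16)
    # Header: ID, flags (0x0100 = recursion desired), QDCOUNT=1, others 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: length-prefixed labels terminated by a zero byte.
    labels = [label for label in name.rstrip(".").split(".") if label]
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def dns_probe(server_ip: str, name: str, timeout: float = 2.0) -> bool:
    """Return True if server_ip sends back any DNS response to our query.

    Assumption: "reachable" here means a reply arrived with our query ID
    and the QR (response) bit set; the RCODE and answers are ignored.
    """
    query = build_query(name)
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            sock.sendto(query, (server_ip, 53))
            reply, _ = sock.recvfrom(512)
    except OSError:  # includes timeouts and unreachable errors
        return False
    return len(reply) >= 12 and reply[:2] == query[:2] and bool(reply[2] & 0x80)


# Example (performs real network I/O, so it is left commented out):
# print(dns_probe("9.9.9.9", "www.example.com"))
```

A single probe like this is still a single data point, so it only makes sense as one input among several.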
Despite the flippant comment below about "april 1st experiment" with the largest global resolver, there is a significant risk associated with the concentration of measurements on systems with unintentional shared fate issues. I expect there is a large community of services which expect correct DNS resolution and ICMP echo response from "a.root-servers.net" and "www.google.com" as indicators of general network accessibility. If (for example) the services in .com/.net/.org were to be offline, this would probably create a much larger impact than their localized outage, since both those services would be offline, which would trigger undetermined failure behaviors in many network monitoring/automation or application software stacks.
Using IP addresses for service check destinations is slightly better but as noted, ICMP is rarely a service with an SLA, and ICMP echo is frequently blocked or heavily rate-limited. I will comment with my Quad9 hat on that there is no risk of us doing an April 1st experiment of turning off ICMP echo packets to 9.9.9.9. There are however real risks of ICMP having increased failure rates in DDOS conditions in any network, either locally or at the receiving end. As another DNS-oriented friend of mine has in his .sig: "The Prudent Mariner never relies solely on any single aid to navigation."
JT
On 12 Aug 2025, at 7:15, David Prall via NANOG wrote:
This here has always been my biggest concern with external monitoring. If the chosen site decides to deny ping one day then your monitoring tool is broken.
Can do a quick DNS lookup via a DNS server, since they shouldn't turn that off. But what happens when they notice the same site doing the same lookup(s) every x minutes?
In the past I've utilized the root DNS servers as a good measurement tool. Majority are anycast. All are dual-stack so I get both IPv4 and IPv6 verification. If 60% of them are responding we should be good. But again this is load they aren't expecting, but I assume they know is happening. I can rotate through doing a DNS lookup for .com, .net, .org, .gov, etc. so that I'm not doing the same thing over and over and I'm utilizing something they are designed and prepared to handle.
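The approach above (probe several independent targets, call it healthy if roughly 60% respond, and rotate the name being looked up) can be sketched roughly as follows. This is only an illustration under assumptions: the names quorum_ok and RotatingCheck are hypothetical, and the probe is passed in as a callable so the actual lookup mechanism (a DNS library, dig, etc.) is left to the reader.

```python
from itertools import cycle
from typing import Callable, Iterable, List


def quorum_ok(results: Iterable[bool], threshold: float = 0.6) -> bool:
    """True if at least `threshold` of the probe results succeeded."""
    results = list(results)
    if not results:
        return False  # no data is treated as "not healthy"
    return sum(results) / len(results) >= threshold


class RotatingCheck:
    """Probe every target once per round, using a different name each round."""

    def __init__(
        self,
        targets: Iterable[str],
        names: Iterable[str],
        probe: Callable[[str, str], bool],
        threshold: float = 0.6,
    ) -> None:
        self.targets: List[str] = list(targets)
        self._names = cycle(list(names))  # e.g. com., net., org., gov.
        self.probe = probe                # probe(target, name) -> bool
        self.threshold = threshold

    def run_once(self) -> bool:
        name = next(self._names)
        results = [self.probe(target, name) for target in self.targets]
        return quorum_ok(results, self.threshold)


# Usage sketch with a stand-in probe (replace with a real DNS lookup):
if __name__ == "__main__":
    def fake_probe(target: str, name: str) -> bool:
        return target != "c.root-servers.net"  # pretend one target is down

    check = RotatingCheck(
        ["a.root-servers.net", "b.root-servers.net", "c.root-servers.net"],
        ["com.", "net.", "org.", "gov."],
        fake_probe,
    )
    print(check.run_once())
```

Keeping the probe injectable also makes it easy to mix IPv4 and IPv6 targets, or to swap in a different check later without touching the quorum logic.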
David
On 8/11/2025 8:08 PM, Damian Menscher via NANOG wrote:
On Mon, Aug 11, 2025 at 3:08 PM Matthew Petach via NANOG <nanog@lists.nanog.org> wrote:
Having been bitten by this in the past...never base your determination of "healthy" or "working" on a single external data reference. It can be tempting to just assume 8.8.8.8 will always be "up" and "pingable" to verify your internet connectivity is good...right up to the point where Google has a routing snafu
...
No need for a routing snafu... 8.8.8.8 is currently getting a steady-state 27Mpps (million packets/second) of ICMP ECHO_REQUEST. Internet connectivity checking is not a service we offer, and there is no SLA for it, therefore it may go away at any time. There is a very real risk of me running an April 1st experiment of "what would happen if I just ACL off all the pings?". I might have guessed I'd light up a couple dozen pagers and start a nanog@ flamewar... but if anyone is basing routing decisions on that, it will be a "fun" day indeed!
Damian
_______________________________________________
NANOG mailing list
https://lists.nanog.org/archives/list/nanog@lists.nanog.org/message/YIM6ZS3Z...
_______________________________________________
NANOG mailing list
https://lists.nanog.org/archives/list/nanog@lists.nanog.org/message/FGMPHVNA...