
You would be surprised at what percentage of DNS recursive resolution traffic is for "a.root-servers.net", "www.example.com", and other more specific names like "connectivitycheck.gstatic.com" (which I know serves different purposes).

Related: there is a draft at the IETF about probing for "reachability" using the DNS, rather than picking random names (which tends to skew data or impose unnecessary costs in various ways) or using ICMP echo. Query-based status checking seems to be something people do anyway, so maybe it should be formalized so everyone can use and expect the same methods: https://datatracker.ietf.org/doc/draft-sst-dnsop-probe-name/

Despite the flippant comment below about an "April 1st experiment" with the largest global resolver, there is a significant risk associated with the concentration of measurements on systems with unintentional shared-fate issues. I expect there is a large community of services which treat correct DNS resolution and ICMP echo responses from "a.root-servers.net" and "www.google.com" as indicators of general network accessibility. If (for example) the .com/.net/.org services were offline, the impact would probably be much larger than that localized outage alone, since both of those indicator names would fail to resolve at once, triggering undetermined failure behaviors in many network monitoring/automation and application software stacks.

Using IP addresses as service-check destinations is slightly better, but as noted, ICMP is rarely a service with an SLA, and ICMP echo is frequently blocked or heavily rate-limited.

I will comment, with my Quad9 hat on, that there is no risk of us doing an April 1st experiment of turning off ICMP echo packets to 9.9.9.9. There are, however, real risks of ICMP seeing increased failure rates under DDoS conditions in any network, either locally or at the receiving end.

As another DNS-oriented friend of mine has in his .sig: "The Prudent Mariner never relies solely on any single aid to navigation."

JT
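For illustration, a minimal sketch of the kind of query-based check described above (not taken from the draft): ask the same question of several independently operated resolvers and only treat the network as "up" when a majority answer. The probe name, resolver list, and threshold are placeholders, and it assumes dnspython (2.x) is installed.

    # Sketch only: probe name, resolver list and threshold are illustrative.
    import dns.resolver  # dnspython >= 2.0

    RESOLVERS = ["9.9.9.9", "1.1.1.1", "8.8.8.8"]  # independently operated
    PROBE_NAME = "example.com"  # placeholder; see draft-sst-dnsop-probe-name

    def resolver_answers(server, name=PROBE_NAME):
        """True if this one resolver answers an A query within 2 seconds."""
        r = dns.resolver.Resolver(configure=False)
        r.nameservers = [server]
        r.lifetime = 2.0
        try:
            r.resolve(name, "A")
            return True
        except Exception:
            return False

    def network_looks_up():
        """Declare connectivity good only if a majority of resolvers answer."""
        answered = sum(resolver_answers(s) for s in RESOLVERS)
        return answered * 2 > len(RESOLVERS)

    if __name__ == "__main__":
        print("reachable" if network_looks_up() else "degraded or down")

Requiring agreement from more than one operator is the point: any single address in that list can disappear or be rate-limited without the check declaring the whole network down.

On 12 Aug 2025, at 7:15, David Prall via NANOG wrote: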
This has always been my biggest concern with external monitoring. If the chosen site decides to deny ping one day, then your monitoring tool is broken.
You can do a quick DNS lookup against a DNS server, since they shouldn't turn that off. But what happens when they notice the same site doing the same lookup(s) every x minutes?
In the past I've used the root DNS servers as a good measurement tool. The majority are anycast. All are dual-stack, so I get both IPv4 and IPv6 verification. If 60% of them are responding, we should be good. Again, this is load they aren't expecting, though I assume they know it's happening. I can rotate through DNS lookups for .com, .net, .org, .gov, etc., so that I'm not doing the same thing over and over, and I'm using something they are designed and prepared to handle.
David
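A rough sketch of the rotation David describes above: the 60% threshold and the .com/.net/.org/.gov rotation come from his message, while the helper names, timeouts, and the use of dnspython are assumptions. Note that looking up a..m.root-servers.net via the local resolver to find the targets is itself a DNS dependency; a production check would take the addresses from a root hints file instead.

    # Sketch only: assumes dnspython; addresses are bootstrapped via the
    # local resolver here, but should really come from a root hints file.
    import itertools
    import socket
    import string

    import dns.message
    import dns.query

    TLDS = itertools.cycle(["com.", "net.", "org.", "gov."])  # vary the question

    def root_server_addresses():
        """Collect current IPv4 and IPv6 addresses of a..m.root-servers.net."""
        addrs = []
        for letter in string.ascii_lowercase[:13]:
            host = letter + ".root-servers.net"
            for family in (socket.AF_INET, socket.AF_INET6):
                try:
                    info = socket.getaddrinfo(host, 53, family, socket.SOCK_DGRAM)
                    addrs.append(info[0][4][0])
                except socket.gaierror:
                    pass
        return addrs

    def roots_healthy(threshold=0.60):
        """True if at least `threshold` of the root servers answer an NS query."""
        servers = root_server_addresses()
        if not servers:
            return False
        answered = 0
        for server in servers:
            query = dns.message.make_query(next(TLDS), "NS")
            try:
                dns.query.udp(query, server, timeout=2.0)
                answered += 1
            except Exception:
                pass
        return answered / len(servers) >= threshold

    if __name__ == "__main__":
        print("looks good" if roots_healthy() else "investigate")

Rotating the question per query keeps it from being the identical lookup every x minutes, which was David's point.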
On 8/11/2025 8:08 PM, Damian Menscher via NANOG wrote:
On Mon, Aug 11, 2025 at 3:08 PM Matthew Petach via NANOG <nanog@lists.nanog.org> wrote:
Having been bitten by this in the past... never base your determination of "healthy" or "working" on a single external data reference. It can be tempting to just assume 8.8.8.8 will always be "up" and "pingable" to verify your internet connectivity is good... right up to the point where Google has a routing snafu
...
No need for a routing snafu... 8.8.8.8 is currently getting a steady-state 27Mpps (million packets/second) of ICMP ECHO_REQUEST. Internet connectivity checking is not a service we offer, and there is no SLA for it, so it may go away at any time. There is a very real risk of me running an April 1st experiment of "what would happen if I just ACL off all the pings?". I might have guessed I'd light up a couple dozen pagers and start a nanog@ flamewar... but if anyone is basing routing decisions on that, it will be a "fun" day indeed!
Damian