I think it would be fair to say that ICMP echo to easy-to-remember internet resources is tolerated, but not encouraged, and is probably not a good idea unless one knows and very well understands the implications of failure (or success!) modes that don’t match the conditions that are expected. Terrible monitoring is easy; good monitoring is quite difficult.
It is reasonable to expect operators of systems designed for one type of service to quickly rate-limit or entirely filter non-critical alternate capabilities in the event of resource exhaustion or other type of risk to the primary service - this applies to web servers, DNS servers, NTP servers, etc. Also, choosing as an indicator a response from a protocol such as ICMP echo/reply, which has a history of flooding attacks and which may be subject to rapid clamping of traffic, seems to be another large check mark in the “do not use” column. ICMP echo stands a real risk of not providing expected results for reasons that are known only to the target operator, and which do not take your non-obvious intentions into consideration.
More central to the issue: “The Prudent Mariner never relies solely on any single aid to navigation” (hi, Ken!) is an applicable quote here. Nothing is immune from interruption of service, especially as it becomes more distant from your administrative control. I see all too often people using ICMP to a nameserver, or a query to a nameserver, or a socket request to port 80 of some well-known name as the only method for determining if a larger set of systems is available. This is not typically a good idea. I shudder to think what would happen if certain well-known domains were to be unavailable due to one of a dozen different potential failure cases. There are far too many poorly-written stacks that assume some singular conditions are “impossible” unless as a result of local failure, and that always ends in sadness and late nights spent writing root-cause analysis reports.
Further adding to this complexity is the benefit or detraction of anycast for many of these larger public services. What is “up” and what is “down”? What is the signal generated or inferred by presence or absence of this monitoring sample? The question typically generates lively debate within a network or monitoring team. I am pretty sure that “But I could ping x.x.x.x” is not typically a statement that has much weight when considering overall reachability. I do admit it is a hint for many network conditions, but it is not the answer, and no system should treat that result by itself as canonical for anything other than that exact result.
If one is going to use responses of exterior (not within the same organizational control) services as an indicator of reachability, then a broad spectrum of tests is probably the only way to have anything approaching certainty or knowledge upon which action could be based, and even that will always have a shadow of a doubt. In that mix, ICMP echo/reply to public nameservers is probably not the best indicator to add in a monitoring suite, though it may appear to be perfectly OK… until it isn’t. DNS queries to DNS servers seem to be the most reasonable thing to use as test material, rather than ICMP, if one were building a rickety monitoring house out of the resources at hand.
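To make the "broad spectrum of tests" idea a little more concrete, here is a minimal sketch in Python of what I mean: several independent probes (a real DNS query rather than an ICMP echo, plus a TCP connect) folded into one hedged verdict. Everything in it is an assumption made for illustration: the function names (`build_dns_query`, `dns_probe`, `tcp_probe`, `assess`), the `example.com` query name, the timeouts, and the 50% threshold are all hypothetical and not taken from any particular monitoring product.

```python
import socket
import struct
from typing import Dict


def build_dns_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record, recursion desired."""
    # Header: ID, flags (RD bit set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: length-prefixed labels terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def dns_probe(server: str, name: str = "example.com", timeout: float = 2.0) -> bool:
    """Send one UDP DNS query; True if any matching DNS response comes back."""
    query = build_dns_query(name)
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(512)
    except OSError:  # timeouts and unreachable errors are OSError subclasses
        return False
    # Accept only a response (QR bit set) carrying the same query ID.
    return len(reply) >= 12 and reply[:2] == query[:2] and bool(reply[2] & 0x80)


def tcp_probe(host: str, port: int = 443, timeout: float = 2.0) -> bool:
    """True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def assess(results: Dict[str, bool], min_fraction: float = 0.5) -> str:
    """Combine independent probe results; no single probe decides alone."""
    if not results:
        return "unknown"
    ok = sum(results.values())
    if ok == 0:
        return "down"
    if ok / len(results) >= min_fraction:
        return "up"
    return "degraded"  # some signal, but not enough to act on with confidence
```

One might call `dns_probe` against several resolvers run by different operators, add `tcp_probe` against a couple of unrelated hosts, collect the booleans into a dict keyed by probe name, and pass that to `assess`. Which targets to use, and whether their operators are fine with it, is left to the reader. Even a unanimous result is still only a hint.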
Additionally: The suggestions of building some new ICMP-responding service may end up being counter to the goals of the people using external tests, so be careful what is wished for. Witness everyone installing various “speed testing” servers in their own networks, which may not truly provide accurate measurements of anything other than local loop speeds, which now sort of defeats the purpose of the speed test for anything other than the most local set of results.
JT
--
John Todd - jtodd@quad9.net
General Manager - Quad9 Recursive Resolver
On 8 Feb 2022, at 9:56, Mike Hammett wrote:
Yes, pinging public DNS servers is bad. Googling didn't help me find anything. Are there any authoritative resources from said organizations saying you shouldn't use their servers for your persistent ping destinations?