As a community, it is clear that there is a need for this, and if
8.8.8.8 stops being an anchor for liveliness detection, users will find
something else to replace it with. And we can bet all our Kwacha that it
won't have been designed for that purpose, either.
I respectfully strongly disagree on 'need'.
Let's perform a thought experiment. Assert that 8.8.8.8 was expressly codified by Google to be a designated ICMP endpoint, and that for 100% of the ICMP echo requests they receive, they guarantee an echo-reply will be sent. Even with that (unreasonable) assumption of 100% uptime of the endpoint, there are countless reasons that echo-requests may not reach them, or that echo-replies may be sent but not reach the originating source. Extend this idea even further. Assert that it is now not just Google running it, and that the largest networks in the world all agree to anycast it from their networks. Assert that it is still guaranteed to respond to 100% of all echo requests it receives, wherever it receives them. (An even more unreasonable assertion than before!) There are STILL countless reasons why an endpoint may, at times, have that simple ICMP check fail.
The predicate assumption that 'pinging one destination is a valid check that my internet works' is INCORRECT. There is no magical unicorn that could be built that could make that true, and 'they're gonna do it anyways' is a poor excuse to even consider it.
This is a mistake many of us have made. I'll openly admit I made it 20 years ago. As I think someone on the outages list mentioned, I had built a couple of SLA checks that triggered routing changes based on their status, and I thought I was super hot shit. Until I had to drive an hour through a blizzard to bring my routers back up after my incorrect assumptions knocked my entire company (an ISP) offline. Sometimes these are lessons people need to learn, but it's also helpful to point out to others why what they are trying to do is a bad idea so they can (if they choose to) learn from our prior mistakes.
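
A minimal sketch of why a single-target check is fragile, assuming a hypothetical `probe` callable that returns True when a target answers. The function name, the `quorum` parameter, and the two RFC 5737 documentation addresses are illustrative, not anything from a real tool. Requiring several independent targets to fail only reduces false alarms; it still does not prove that "the internet works".

```python
from typing import Callable, Sequence


def upstream_looks_down(
    targets: Sequence[str],
    probe: Callable[[str], bool],  # hypothetical: True if the target replied
    quorum: float = 1.0,           # fraction of targets that must fail
) -> bool:
    """Return True only if at least `quorum` of the targets fail the probe."""
    if not targets:
        raise ValueError("need at least one target")
    failures = sum(1 for t in targets if not probe(t))
    return failures / len(targets) >= quorum


# Fake probe results: only 8.8.8.8 stops answering echo requests.
answers = {"8.8.8.8": False, "192.0.2.1": True, "198.51.100.1": True}

# A single-target check declares the upstream down and would fire a routing change.
single = upstream_looks_down(["8.8.8.8"], answers.__getitem__)

# The same event seen across several targets does not.
several = upstream_looks_down(list(answers), answers.__getitem__)
```

Here `single` flags an outage while `several` does not, even though nothing changed on the local network. Anything that acts on such a check (like the routing changes described above) inherits every false positive it produces.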
On Fri, Feb 11, 2022 at 3:29 AM Mark Tinka <mark@tinka.africa> wrote:
On 2/10/22 20:27, Tom Beecher wrote:
>
> I guess it depends on what the actual problem trying to be solved is.
>
> If I understand it correctly, the OG issue was someone (who was not
> Google) building some monitoring around the assumption of the idea
> that ICMP echo-request/reply to 8.8.8.8 would always be available.
> Google decided to make a change so that assumption was now false.
>
> The actual problem here has nothing to do with how Google handles (or
> doesn't handle) ICMP towards their servers. The issue is that people
> have made poor assumptions about how they structured monitoring, and
> learned some lessons about that. Suggesting that Party B should do
> something because Party A made poor decisions is questionable, even if
> it is 75% of what we do in this world.
100% - and this is the crux of the issue.
As a community, it is clear that there is a need for this, and if
8.8.8.8 stops being an anchor for liveliness detection, users will find
something else to replace it with. And we can bet all our Kwacha that it
won't have been designed for that purpose, either.
Mark.