On Nov 28, 2021, at 02:42, Mark Tinka <mark@tinka.africa> wrote:
On 11/28/21 06:43, Masataka Ohta wrote:
Here in NANOG, we are talking about network operations, a considerable part of which cannot rely on DNS.
And yet Facebook were unable to access their kit to fix their recent outage because of it (or, lack of it).
I’d argue that failing to put the correct documentation in place for coping with a DNS outage was a bigger issue than the DNS failure itself in the Facebook outage… So would a number of the engineers I know at Facebook.
There was a time when knowing the IP(v4) address of every interface of every router in your network was cool. I have never had to care about that in close to 15 years. Right up there with losing interest in making software modems work in Linux, when it was a thing :-).
There was a time when a moderately large network had fewer than 50 routers in total. Those days are gone. Today, it’s impossible to build large-scale networks without depending on certain tools. That means that the failure of those tools can be catastrophic if one is not properly prepared. This is simply the modern reality. Proper preparation is harder than it used to be, but for any such network, there should be online and offline copies of sufficient, adequately maintained documentation to restore service of any such underlying facility quickly in the event of a failure.

Owen