That brings back memories... I had a similar experience. First month on the job, a large Sun RAID array storing ~5k mailboxes dies in the middle of the afternoon. So I start troubleshooting and determine it's most likely a bad disk. The CEO walked into the server room right about the time I had 20 disks laid out on a table. He had a fit and called the desktop support guy to come and 'show me how to fix a PC'.
Never mind the fact that we had a 90% ready-to-go replacement box sitting at another site, and just needed to either go get it or bring the disks to it... So we sat there until the desktop guy, who was 30 minutes away, got there. He took one look at it and said 'never touched that thing before, looks like he knows what he's doing' and pointed to me. 4 hours later we were driving the new server to the data center, strapped down in the back of a pickup. Fun times.
-----Original Message-----
From: "Justin Streiner" <streinerj@gmail.com>
Sent: Tuesday, February 23, 2021 5:11pm
To: "John Kristoff" <jtk@dataplane.org>
Cc: "NANOG" <nanog@nanog.org>
Subject: Re: Famous operational issues
Friends,
I'd like to start a thread about the most famous and widespread Internet
operational issues, outages or implementation incompatibilities you
have seen.
Which examples would make up your top three?
To get things started, I'd suggest the AS 7007 event is perhaps the
most notorious and likely to top many lists including mine. So if
that is one for you I'm asking for just two more.
I'm particularly interested in this as the first step in developing a
future NANOG session. I'd be particularly interested in any issues
that also identify key individuals that might still be around and
interested in participating in a retrospective. I already have someone
that is willing to talk about AS 7007, which shouldn't be hard to guess
who.
Thanks in advance for your suggestions,
John