On Sat, 20 Oct 2001, Frank Coluccio wrote:
Lest we forget, there were considerable problems bringing certain IXs in the affected zone back on line for over a week after the day of the attacks, due to problems associated with on-site power generation. But ignoring those exceptions...
I remember them, and included them in my NANOG presentation. But if you compare the NYSE restoration (with full faith, credit, and a Presidential order behind it) with the restoration of no-name carrier hotels, things aren't straightforward. Even with the diversion of Verizon's resources to the NYSE, the carrier hotels got back on-line darn quick. I mapped out all the CLEC facilities I know about in Lower Manhattan. Some are very close to the WTC complex, others are far away. Some were damaged, some weren't. That's the point of diversity. A couple of CLECs suffered damage even worse than Verizon. One CLEC's POP is gone. It was pretty much luck of the draw what was damaged. The damage isn't a reflection on the quality of any provider. But if I'm a network manager, my concern isn't assessing blame but getting my network back online as quickly as possible. What is the best assurance that I can get my network back online (assuming my own building wasn't destroyed)? The best bet seems to be buildings with multiple carriers.
Has anyone assessed the level of risk that exists to the 'Net due to the high levels of traffic concentration at 60 Hudson? Or at 1 Wilshire in LA, for that matter? Curious.
As Yogi Berra said, "Nobody goes there anymore; it's too crowded." Honestly, the data doesn't exist, so if anyone claims to have it, it's just a wild guess. Given that, here is a wild guess. Historically, based upon previous accidents and failures of major exchange points, failures of "national" exchange points cause the least problems. The loss of one or two national exchange points (there has been a dual failure in the past) caused lots of traffic on NANOG. But the traffic kept flowing. National exchange points tend to be bright, shiny beacons. Everyone is (or should be) aware of the risk and engineers their network with that knowledge. Operator error and software bugs still cause the worst problems. The worst physical failures have been in locations where no one realized the risk existed.