Hell, we used to not have to bother notifying customers of anything; we just fixed the problem. Reminds me of a story I've probably shared in the past. 1995, IETF in Dallas. The "big ISP" I worked for at the time got tripped up on a 24-day IS-IS timer bug (maybe all of them at the time did, I don't recall) where all adjacencies reset at once. That's like, entire network down. Working with our engineering team, in the *terminal* lab mind you, and Ravi Chandra (then at Cisco), we reloaded the entire network of routers with new code from Cisco once they'd fixed the bug. I seem to remember this being my first exposure to Tony Li's infamous line, "... Confidence Level: boots in the lab." Good times.
-b
On Feb 6, 2013, at 5:41 PM, Brandt, Ralph wrote:
David. I am on an evening shift and am just now reading this thread.
I was almost tempted to write an explanation that would have been identical in content to yours, based simply on Level3 doing something and keeping the information close.
Responsible vendors do not try to hide what is being done unless it is an Op Sec issue, and I have never seen Level3 act with less than responsibility, so it had to be Op Sec.
When it is that, it is best if the remainder of us sit quietly on the sidelines.
Ralph Brandt
-----Original Message-----
From: Siegel, David [mailto:David.Siegel@Level3.com]
Sent: Wednesday, February 06, 2013 12:01 PM
To: 'Ray Wong'; nanog@nanog.org
Subject: RE: Level3 worldwide emergency upgrade?
Hi Ray,
This topic reminds me of yesterday's discussion in the conference around getting some BCOPs drafted. It would be useful to confirm my own view of the BCOP around communicating security issues. My understanding of the best practice is to limit knowledge distribution of security-related problems both before and after the patches are deployed. You limit knowledge before the patch is deployed to prevent yourself from being exploited, but you also limit knowledge afterwards in order to limit potential damage to others (customers, competitors...the Internet at large). You also do not want to announce that you will be deploying a security patch until you have a fix in hand and know when you will deploy it (typically, next available maintenance window unless the cat is out of the bag and danger is real and imminent).
As a service provider, you should stay on top of security alerts from your vendors so that you can make your own decision about what action is required. I would not recommend relying on service provider maintenance bulletins or public operations mailing lists for obtaining this type of information. There is some information that can cause more harm than good if it is distributed in the wrong way and information relating to security vulnerabilities definitely falls into that category.
Dave
-----Original Message-----
From: Ray Wong [mailto:rayw@rayw.net]
Sent: Wednesday, February 06, 2013 9:16 AM
To: nanog@nanog.org
Subject: Re: Level3 worldwide emergency upgrade?
OK, having had that first cup of coffee, I can say perhaps the main reason I was wondering is I've gotten used to Level3 always being on top of things (and admittedly, rarely communicating). They've reached the top by often being a black box of reliability, so it's (perhaps unrealistically) surprising to see them caught by surprise. Anything that pushes them into scramble mode causes me to lose a little sleep anyway. The alternative to what they did seems likely for at least a few providers who'll NOT manage to fix things in time, so I may well be looking at longer outages from other providers, and need to issue guidance to others on what to do if/when other links go down for periods long enough that all the cost-bounding monitoring alarms start to scream even louder.
I was also grumpy at myself for having not noticed advance communication, which I still don't seem to have, though since I outsourced my email to bigG, I've noticed I'm more likely to miss things. Perhaps giving up maintaining that massive set of procmail rules has cost me a bit more edge.
Related, of course, just because you design/run your network to tolerate some issues doesn't mean you can also budget to be under a support contract as well. :) Knowing more about the exploit/fix might mean trying to find a way to get free upgrades to some kit to prevent more localized attacks on other types of gear as well, though in this case it's all about Juniper PR839412 then, so vendor specific, it seems?
There are probably more reasons to wish for more info, too. There's still more of them (exploiters/attackers) than there are those of us trying to keep things running smoothly and transparently, so anything that smells of "OMG new exploit found!" also triggers my desire to share information. The network bad guys share information far more quickly and effectively than we do, it often seems.
-R>