Is it time for a disruption analysis working group for the Internet?
Have we reached a critical mass of multi-provider disruptions to make it possible to do something yet?

Most networked industries have some group which collects and analyzes information about disruptions. What's interesting is how often similar disruptions had precursor events across multiple different service providers. For example, there have been several cases in the last few months of root and gtld servers failing to transfer zone files. And there have been several cases of routers not withdrawing routes after an erroneous announcement. Only after the major disruption occurs does the information get shared, usually via the public news media.

Once upon a time, the IETF had a group called 'netstat,' and NANOG had presentations about the 'State of the Internet.' For a variety of reasons, neither has appeared on the agenda of those organizations recently.

If there were a process for providers to submit initial and final reports about significant service disruptions, and a group to organize a regular report of common root causes across multiple providers (not a report card on any single provider), would any provider voluntarily participate? I'm not thinking about a real-time shared trouble ticket system, but something on the same scale as other industry outage reports to industry working groups.

I suspect I know the answer to that question.

Craig, Randy, Jhawk stop reading here-----------------------------------

On the other hand, suppose instead of being very hard to reach I suddenly started returning reporters' phone calls promptly and telling them about this great idea I have to improve the reliability of the Internet. Eventually one will write a story about it, and maybe even get some decent coverage. How high up do I have to shoot in order to get your CEO's attention? Does it have to be the front page of the New York Times?

Would that change the answer to the question above?

--
Sean Donelan, Data Research Associates, Inc, St. Louis, MO
Affiliation given for identification not representation
On Thu, 12 Nov 1998, Sean Donelan wrote:
Have we reached a critical mass of multi-provider disruptions to make it possible to do something yet?
Most networked industries have some group which collects and analyzes information about disruptions. What's interesting is how often similar disruptions had precursor events across multiple different service providers. For example, there have been several cases in the last few [...]
How is this handled in other networked industries? I'm sure that the same issues of proprietary information and public humiliation exist there; how do they deal with it?

Pete.
At 2:04 AM -0700 11/12/98, Pete Kruckenberg wrote:
On Thu, 12 Nov 1998, Sean Donelan wrote:
Have we reached a critical mass of multi-provider disruptions to make it possible to do something yet?
Most networked industries have some group which collects and analyzes information about disruptions. What's interesting is how often similar disruptions had precursor events across multiple different service providers. For example, there have been several cases in the last few [...]
How is this handled in other networked industries? I'm sure that the same issues of proprietary information and public humiliation exist there; how do they deal with it?
Pete.
Not precisely the networking industry, but the airline industry has recently been revising its procedures for crash notification, not just on its own but under governmental pressure. I suspect we could get information from the Air Transport Association or possibly the US National Transportation Safety Board.

A point from aviation -- incidents such as near-misses can be reported without fear of liability, because the consensus is that it's more important to recognize potential safety problems than it is to create opportunities for acting against individuals or for lawsuits.

In other industries, the Electric Power Research Institute would be a good starting point, since they have responsibility for data network architecture in the electrical power industry. Anyone from EPRI reading NANOG?

Medicine, unfortunately, isn't the best area in general for seeing examples of how to do things in the open. There are examples, though, in the specialty of public health. There is an email newsletter called Pro-Med to which I subscribe. Pro-Med came out of the Federation of American Scientists, has a rather star-studded advisory board of public health experts, and is quite respected. I suspect their staff and board would be open to serving as a model, if the model fits.

Howard
Seems like you probably want an ongoing group, akin to the developing network of CERT teams, that focuses on operations anomalies rather than security incidents. A third party that is funded by the industry but separate from any particular provider.

d/

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Dave Crocker                                   Tel: +60 (19) 3299 445
<mailto:dcrocker@brandenburg.com>              Post Office Box 296, U.P.M.
                                               Serdang, Selangor 43400 MALAYSIA
Brandenburg Consulting
<http://www.brandenburg.com>                   Tel: +1 (408) 246 8253
Fax: +1(408)273 6464                           675 Spruce Dr., Sunnyvale, CA 94086 USA
Sean,

Do we actually need the cooperation of the organizations in question to effect this? For large enough failures, the results are obvious and the data is fairly clear.

Perhaps a first stage of a Disruption Analysis Working Group would simply be for a coordinated group to gather the facts, sort through the impact, analyze the failure and report recommendations in a public forum.

A sponsoring organization that could provide a legal liability shield would be desirable, as anyone not cooperating may make the non-cooperation more active than passive.

Regards,

Eric Carroll
Tekton Internet Associates
On Thu, Nov 12, 1998 at 02:48:58AM -0600, Sean Donelan wrote:
Craig, Randy, Jhawk stop reading here-----------------------------------
<chuckle>
On the other hand, suppose instead of being very hard to reach I suddenly started returning reporters' phone calls promptly and telling them about this great idea I have to improve the reliability of the Internet. Eventually one will write a story about it, and maybe even get some decent coverage. How high up do I have to shoot in order to get your CEO's attention? Does it have to be the front page of the New York Times?
Would that change the answer to the question above?
Yup.

Battle by press is a dicey game at best; you have to make _certain_ your press contact is knowledgeable enough to make the right points the right way.

We _do_, however, have several participants here who seem to have a clue, who also have ink. I'm thinking of one in particular who writes a column for InternetWorld (week?). Although it's not "mainstream" business press, those folks _do_ read the trades, too...

We're becoming a utility, folks; it's time to act that way.

It seems to me that there's a niche market here for anyone with the capital and inclination. _All_ of the net doesn't have to be high-availability, as long as the HA section replicates the right things.

Decentralization doesn't require anyone's permission.

And if we can't survive a backhoe, how in _hell_ will we survive a pissed off Saddam tossing nuclear SCUDs?

Cheers,
-- jra
--
Jay R. Ashworth                                        jra@baylink.com
Member of the Technical Staff     Buy copies of The New Hackers Dictionary.
The Suncoast Freenet                    Give them to all your friends.
Tampa Bay, Florida     http://www.ccil.org/jargon/     +1 813 790 7592
participants (6)
- Dave Crocker
- Eric M. Carroll
- Howard C. Berkowitz
- Jay R. Ashworth
- Pete Kruckenberg
- Sean Donelan