On Sat, 13 Jan 1996, Sean Donelan wrote:
I hesitate to suggest a presentation, because I don't have any solutions only problems. A couple of months ago this list had a thread on communicating
Generally problems are what start people thinking about solutions.. ;-)
While individual network providers track their own network reliability, is there a need to report and track some data on an Internet-wide basis similar to the reliability reporting done in other industries (telephone, airline, etc)?
Definitely. I see the critical need not as publishing generic uptime statistics, but as publishing network health information. That not only tells users why they can't reach X web site, but also helps consumers choose their providers, among other benefits.
While anyone could track network outages on their own through massive invasive testing, it usually doesn't reveal the cause of the outage. What is the biggest threat to network reliability? A farmer with a backhoe, or a network engineer at a console?
This sort of testing would also lead to false conclusions, IMHO. Lately the backhoe appears to be taking the lead. :-)
Is there a neutral third-party which could blind and summarize the data? I'm not an academic type, so I don't know what would be involved in getting funding at one of the national labs for such a project. Or do we wait for the FCC to mandate something?
I can't suggest a third-party at this point, but as my cohort Alan Hannan pointed out, following the discussion last month we have begun working on proposal material. Unfortunately it really hasn't received the attention it should on my priority list. The key in obtaining the data is for the individual providers themselves to contribute it. There should be no reason why an organization needs to maintain a staff simply to monitor another's backbone and publish the results.
is the NOC communication lines. I haven't seen a network provider with sufficient staff to answer all the calls, and repair their network at the same time when it goes down. Either calls go unanswered, or the network doesn't get repaired, or sometimes both.
The 1-800 problem reporting method isn't scaling well. Alternatives?
Unfortunately the telephone is still one of the best methods for reliable communication, IMHO. One of the bits that amazed me back in the day ('bout 2yrs ago) at BARRNet (not a pot shot) -- is that when the network had an outage, an Email was sent to an outage list. Well, if your network link is down -- how can you get the Email? One particular customer mentioned to me that when I called him during an outage, it was the first time he had been contacted during an outage. Normally he had to wait and digest the Email after his service was restored.
chance of wedging a "user information" field into the IPng ICMP destination unreachable message? It would be nice to tell the user in the ICMP message: "Beep BOOP BEEP, We're sorry your packet could not be delivered as addressed due to a ...." Instead of waiting for the users to call the NOC which probably is already snowed under with calls.
Nice idea, but much more difficult to implement. You're talking about convincing quite a few people to implement it. Part of a NOC's mission (or, in certain cases, a Customer Service Center's) is to deal with incoming calls.
Since the 'net as a whole doesn't fail that often, but pieces of the 'net fail frequently, in-band notification isn't as crazy an idea as it seems.
Definitely; the model we're currently toying with would be open enough to be accessed both by providers' NOC staff and by individual consumers on the Internet. The access method would be Web based and offer an Email interface for those desiring automated status reporting or simply a different view. Providers would be responsible for submitting incident reports and keeping them current (e.g. ticket updates), and a user could browse at his/her leisure.
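To make the model a bit more concrete, here is a rough sketch of how provider-submitted incident reports with ticket updates might be stored and summarized. Everything in it is an assumption made up for illustration: the names `IncidentReport` and `IncidentBoard`, the fields, the "ExampleNet" provider, and the in-memory store standing in for the Web/Email service. Nothing here has been agreed on by anyone.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class IncidentReport:
    """One provider-submitted incident (all field names are hypothetical)."""
    ticket_id: str
    provider: str
    summary: str
    opened: datetime
    status: str = "open"                         # "open" or "closed"
    updates: list = field(default_factory=list)  # (timestamp, note) pairs

    def add_update(self, when, note, close=False):
        """Append a ticket update; optionally mark the incident resolved."""
        self.updates.append((when, note))
        if close:
            self.status = "closed"


class IncidentBoard:
    """In-memory stand-in for the shared status service (an assumption)."""

    def __init__(self):
        self._reports = {}

    def submit(self, report):
        """Store a report, keyed by provider and ticket number."""
        self._reports[(report.provider, report.ticket_id)] = report

    def open_incidents(self, provider=None):
        """Return open incidents, optionally for one provider, oldest first."""
        found = [r for r in self._reports.values()
                 if r.status == "open"
                 and (provider is None or r.provider == provider)]
        return sorted(found, key=lambda r: r.opened)

    def text_digest(self):
        """Plain-text summary, the kind an Email interface might send out."""
        lines = []
        for r in self.open_incidents():
            last = r.updates[-1][1] if r.updates else "no updates yet"
            lines.append(f"[{r.provider} #{r.ticket_id}] {r.summary} ({last})")
        return "\n".join(lines) or "No open incidents."


# Example: a provider files a report, then keeps the ticket current.
board = IncidentBoard()
t0 = datetime(1996, 1, 13, 9, 0, tzinfo=timezone.utc)
cut = IncidentReport("1001", "ExampleNet", "Fiber cut on backbone link", t0)
board.submit(cut)
cut.add_update(t0, "Splice crew dispatched")
print(board.text_digest())
```

The same store could feed both a Web view and the Email digest, so NOC staff and consumers would see identical information; closing a ticket simply drops it from the open list.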
Any thoughts how to turn this into a presentation topic?
Probably lots more effort than has been put in thus far. :) I would imagine a short presentation could be prepared explaining the model and who has "agreed" to support it by NANOG. However, a worthwhile presentation should include stats on who has used the service and if it's worthwhile, etc.

Jonathan Heiliger
MFS Global Network Services, Inc.
Data Services Network Engineering
E-Mail: loco@mfst.com