>Hello Alan - Here's a PARTIAL list of presentations we've planned so far:
I hesitate to suggest a presentation, because I don't have any solutions, only problems. A couple of months ago this list had a thread on communicating network outage information, network usability, and generally keeping network operators and users informed about the current state of the net.

- Overall network reliability tracking, better or worse?

  While individual network providers track their own network reliability, is there a need to report and track some data on an Internet-wide basis similar to the reliability reporting done in other industries (telephone, airline, etc)? While anyone could track network outages on their own through massive invasive testing, it usually doesn't reveal the cause of the outage. What is the biggest threat to network reliability? A farmer with a backhoe, or a network engineer at a console? Is there a neutral third-party which could blind and summarize the data? I'm not an academic type, so I don't know what would be involved in getting funding at one of the national labs for such a project. Or do we wait for the FCC to mandate something?

- No one is perfect

  Everyone should plan for the disaster which will hit their network at some point. Whenever a network melts down, the next thing to go is the NOC communication lines. I haven't seen a network provider with sufficient staff to answer all the calls, and repair their network at the same time when it goes down. Either calls go unanswered, or the network doesn't get repaired, or sometimes both. The 1-800 problem reporting method isn't scaling well. Alternatives?

- Everything hasn't failed at once [for a long time]

  I don't think there has been an Internet-wide ('net-wide) failure since BBN made Butterfly gateways and one lost its mind. This means, even though one network provider is wiped out, other networks could pass along reports about the current state of the network. How can this reporting function be decentralized?

- Finally, keep network users informed

  Since we have a hard time tracking who is using what (if we even wanted to track users), out-of-band notification isn't great for notifying users. Ideally the network itself could be used to inform just those users affected why things aren't working. Any chance of wedging a "user information" field into the IPng ICMP destination unreachable message? It would be nice to tell the user in the ICMP message: "Beep BOOP BEEP, We're sorry your packet could not be delivered as addressed due to a ...." Instead of waiting for the users to call the NOC which probably is already snowed under with calls. Since the 'net as a whole doesn't fail that often, but pieces of the 'net fail frequently, in-band notification isn't as crazy an idea as it seems.

Any thoughts how to turn this into a presentation topic?

--
Sean Donelan, Data Research Associates, Inc, St. Louis, MO
  Affiliation given for identification not representation
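To make the "user information" idea concrete, here is a minimal Python sketch of an unreachable message that carries a short text reason. The layout is an assumption made up for illustration (an 8-byte ICMP-style header followed by a one-byte length and ASCII text); it is not the IPng ICMP format, and the function names are hypothetical. Only the checksum follows the usual Internet one's-complement arithmetic.

```python
import struct
from typing import Tuple

DEST_UNREACHABLE = 3  # ICMPv4 type number, borrowed here only for illustration


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum over the data (padded to even length)."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_unreachable(code: int, user_info: str) -> bytes:
    """Pack the hypothetical message: header, text length, then the text."""
    text = user_info.encode("ascii")[:255]  # assumed limit: one length byte
    body = struct.pack("!B", len(text)) + text
    header = struct.pack("!BBHI", DEST_UNREACHABLE, code, 0, 0)
    checksum = internet_checksum(header + body)
    header = struct.pack("!BBHI", DEST_UNREACHABLE, code, checksum, 0)
    return header + body


def parse_unreachable(packet: bytes) -> Tuple[int, str]:
    """Return (code, user_info); raise ValueError if the checksum fails."""
    if internet_checksum(packet) != 0:
        raise ValueError("bad checksum")
    _type, code, _cksum, _unused = struct.unpack("!BBHI", packet[:8])
    length = packet[8]
    return code, packet[9:9 + length].decode("ascii")
```

A receiving stack could then hand the decoded text to the application, for example `parse_unreachable(build_unreachable(1, "link down: fiber cut"))`, instead of leaving the user to call the NOC.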
......... Sean Donelan is rumored to have said:
]
] >Hello Alan - Here's a PARTIAL list of presentations we've planned so far:
]
] inform just those users affected why things aren't working. Any
] chance of wedging a "user information" field into the IPng ICMP
] destination unreachable message? It would be nice to tell the
] user in the ICMP message: "Beep BOOP BEEP, We're sorry your
] packet could not be delivered as addressed due to a ...." Instead

Cool! I could really dig that. Pop the tcp stack to jive on a message, and the hubs and routers could determine the cause... Very nice. My telnet throws back a message via icmp that the site is unreachable because they don't know how to subnet and flap too much.. ;)

Still doesn't change the fact that IPv6 will never solve any real problems.

] Since the 'net as a whole doesn't fail that often, but pieces
] of the 'net fail frequently, in-band notification isn't as crazy
] an idea as it seems.

Hmm, how about this, any 'decent' ISP engages in a 'subscription' broadcast system, where each site's web site was decorated with a dynamic list of all significant outages? And if we, Global Internet/MIDnet go south, a nifty little modem dials up to DRA or MCI and spouts out our problem description, and this propagates to the group.

I don't know the best way to make it happen, but I do think with some discussion and brainstorming we could make it work. Jonathan Heiliger (MFS) and I have been working on a draft of such a thing, and would appreciate some input.

] Any thoughts how to turn this into a presentation topic?

This is a good way. Let's see if others are interested. I'll see you in San Diego, and if nothing else, let's sit down and talk.

-alan
On Sat, 13 Jan 1996, Sean Donelan wrote:
> I hesitate to suggest a presentation, because I don't have any solutions only problems. A couple of months ago this list had a thread on communicating
Generally problems are what start people thinking about solutions.. ;-)
> While individual network providers track their own network reliability, is there a need to report and track some data on an Internet-wide basis similar to the reliability reporting done in other industries (telephone, airline, etc)?
Definitely. I see the critical need not as publishing generic uptime statistics, but as publishing network health information. That not only tells users why they can't reach X web site, but also helps consumers choose their providers, among other benefits.
> While anyone could track network outages on their own through massive invasive testing, it usually doesn't reveal the cause of the outage. What is the biggest threat to network reliability? A farmer with a backhoe, or a network engineer at a console?
This sort of testing would also lead to false conclusions, IMHO. Lately the backhoe appears to be taking the lead. :-)
> Is there a neutral third-party which could blind and summarize the data? I'm not an academic type, so I don't know what would be involved in getting funding at one of the national labs for such a project. Or do we wait for the FCC to mandate something?
I can't suggest a third-party at this point, but as my cohort Alan Hannan pointed out, following the discussion last month we have begun working on proposal material. Unfortunately it really hasn't received the attention it should on my priority list. The key in obtaining the data is for the individual providers themselves to contribute it. There should be no reason why an organization needs to maintain a staff simply to monitor another's backbone and publish the results.
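As a rough sketch of what "blind and summarize" could look like once providers contribute their own data, the Python below replaces provider names with salted pseudonyms and totals outages by cause. The report shape (provider, cause, minutes) and the function names are assumptions for illustration, not part of any proposal in this thread, and a salted hash is only one possible way to blind the names.

```python
import hashlib
from collections import defaultdict


def blind(provider: str, salt: bytes) -> str:
    """Return a short pseudonym; the salt stays with the neutral third party."""
    return hashlib.sha256(salt + provider.encode("utf-8")).hexdigest()[:8]


def summarize(reports, salt: bytes) -> dict:
    """Total outages and minutes per cause without exposing provider names.

    `reports` is an iterable of (provider, cause, minutes) tuples, a
    hypothetical format chosen for this sketch.
    """
    by_cause = defaultdict(lambda: {"outages": 0, "minutes": 0, "providers": set()})
    for provider, cause, minutes in reports:
        entry = by_cause[cause]
        entry["outages"] += 1
        entry["minutes"] += minutes
        entry["providers"].add(blind(provider, salt))
    # Publish only counts; the pseudonyms themselves are dropped.
    return {
        cause: {
            "outages": entry["outages"],
            "minutes": entry["minutes"],
            "providers": len(entry["providers"]),
        }
        for cause, entry in by_cause.items()
    }
```

Fed with made-up reports such as `[("provider-a", "fiber cut", 120), ("provider-b", "fiber cut", 45)]`, the summary answers the backhoe-versus-console question by cause without naming anyone.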
> is the NOC communication lines. I haven't seen a network provider with sufficient staff to answer all the calls, and repair their network at the same time when it goes down. Either calls go unanswered, or the network doesn't get repaired, or sometimes both.
> The 1-800 problem reporting method isn't scaling well. Alternatives?
Unfortunately the telephone is still one of the best methods for reliable communication, IMHO. One of the bits that amazed me back in the day ('bout 2yrs ago) at BARRNet (not a pot shot) -- is that when the network had an outage, an Email was sent to an outage list. Well, if your network link is down -- how can you get the Email? One particular customer mentioned to me that when I called him during an outage, it was the first time he had been contacted during an outage. Normally he had to wait and digest the Email after his service was restored.
> chance of wedging a "user information" field into the IPng ICMP destination unreachable message? It would be nice to tell the user in the ICMP message: "Beep BOOP BEEP, We're sorry your packet could not be delivered as addressed due to a ...." Instead of waiting for the users to call the NOC which probably is already snowed under with calls.
Nice idea, but much more difficult to implement. You're talking about convincing quite a few people to implement it. Part of a NOC's mission (or, in certain cases, a Customer Service Center's) is to deal with incoming calls.
> Since the 'net as a whole doesn't fail that often, but pieces of the 'net fail frequently, in-band notification isn't as crazy an idea as it seems.
Definitely; the model we're currently toying with would be open enough to be accessed both by providers' NOC staff and by individual consumers on the Internet. The access method would be Web based and offer an Email interface for those desiring automated status reporting or simply a different view. Providers would be responsible for submitting incident reports and keeping them current (e.g. ticket updates), and a user could browse at his/her leisure.
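The model described above could be sketched in Python roughly as follows: providers submit incident reports and keep them current, and anyone can browse the open ones or pull a plain-text digest suitable for an Email interface. Every name here (`Incident`, `IncidentBoard`, the ticket and status fields) is hypothetical; the draft itself is not shown in this thread, so this is only one guess at its shape.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Incident:
    ticket: str
    provider: str
    summary: str
    status: str = "open"
    updates: List[str] = field(default_factory=list)


class IncidentBoard:
    """In-memory stand-in for the shared incident list."""

    def __init__(self) -> None:
        self._incidents: Dict[str, Incident] = {}

    def submit(self, ticket: str, provider: str, summary: str) -> Incident:
        """A provider files a new incident report."""
        incident = Incident(ticket, provider, summary)
        self._incidents[ticket] = incident
        return incident

    def update(self, ticket: str, note: str, status: Optional[str] = None) -> None:
        """A provider keeps its report current; unknown tickets raise KeyError."""
        incident = self._incidents[ticket]
        incident.updates.append(note)
        if status is not None:
            incident.status = status

    def open_incidents(self, provider: Optional[str] = None) -> List[Incident]:
        """What a browsing user sees, optionally filtered to one provider."""
        return [
            inc for inc in self._incidents.values()
            if inc.status == "open" and provider in (None, inc.provider)
        ]

    def as_email(self) -> str:
        """Plain-text digest of open incidents for the Email interface."""
        lines = []
        for inc in self.open_incidents():
            lines.append(f"[{inc.ticket}] {inc.provider}: {inc.summary}")
            lines.extend(f"    update: {note}" for note in inc.updates)
        return "\n".join(lines) or "No open incidents."
```

Keeping the board as a simple keyed store means the Web view and the Email view are just two renderings of the same provider-maintained records.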
> Any thoughts how to turn this into a presentation topic?
Probably lots more effort than has been put in thus far. :) I would imagine a short presentation could be prepared explaining the model and who has "agreed" to support it by NANOG. However, a worthwhile presentation should include stats on who has used the service and if it's worthwhile, etc.

 \|/ _____ \|/     Jonathan Heiliger
 @~/ . . \~@       MFS Global Network Services, Inc.
________________________/_( \___/ )_\______________________________________
 \__U__/           E-Mail: loco@mfst.com    Data Services Network Engineering
participants (3)
- Alan Hannan
- Jonathan Heiliger
- Sean Donelan