Re: outages, quality monitoring, trouble tickets, etc
From: alan@gi.NET (Alan Hannan)
Hmm, I wonder if the Trib' would be interested in knowing when the DS3 from Pennsauken is down.....
Sometimes the Trib would be interested, most of the time they wouldn't. Who cares if a central office in Hinsdale, IL burns down? Some people thought it was front page news.
At a cursory glance I would agree with you. However, let us delve into this a bit deeper. Donning my idiot hat, may I point out that the _most_ important thing is network reliability -Period-.
That would be great; please give me the name of a network provider that provides perfect network reliability. In the absence of perfection, please tell me what went wrong when I can't get an expected level of usability out of the network. You can greatly reduce your customers' stress levels simply by keeping them informed.
Give me the TCP/IP equivalent of "*beep* *boop* *BEEP* We're sorry, your TCP/IP connection cannot be completed due to an earthquake (software glitch, route table overload, nuclear detonation) in the area. Please hang up and try your call later." With an accurate RA database, and a little magic, the route servers could redirect connections to an intercept message. That should send a shiver up the spine of your network security folks.
It would be nice if the problem were also fixed quickly, but I realize that is asking for a lot. In the meantime, keep the customer informed. As the size of the Internet has grown, keeping the customer informed has become a bigger job. Relying on a 1-800 number doesn't work when a large NSP's backbone melts down and all the NSP's customers call the NOC at the same time.
Your page at DRA is quite good; however, the consensus among upper management (not just at our site) is "Why should other people know when we're broke?" And the sad thing is, I am tempted to agree with them.
Thanks for the compliment. I would point out to your upper management that other people already know when your network is broken. If they didn't notice, it wouldn't be a problem. If a network falls over in the woods, and there is no one to hear it, does it make a sound? Tell your upper managers that the only time people don't know when your network is broken is when your network is irrelevant to their work. I don't know about you, but if I were managing an irrelevant network, I would be working on my resume. Maybe that's why so many people in this business keep switching employers? :-)
Do you really want outage and downtime on public record, or do you want easier access to clueful folx?
As a network user (operator, manager):
- Ideally I want a useable network.
- When I can't use the network, I want an explanation.
- I want the problem fixed so I can use the network again.
How you meet those needs, I don't care. If you fix the problem before I'm affected by it, then I don't care about the intermediate steps either (tree falling in the forest). If I can get the explanation from an automated server, then I don't have to bother your clueless or clueful folk. If your clueless folk can take a log and get it resolved, I don't have to bother your clueful folk.
When I need to dig out my magic cache of business cards and start e-mailing/calling the secret members of the "backbone cabal" to get a problem fixed, I consider it a failure of the process. That is what I meant when I said NOC-to-NOC communications has been a long-term Internet problem. It relies on personal contacts, rather than a reliable process.
I try very hard not to handle problems by directly contacting people I know at other NOCs. I prefer handling problems through the normal NOC channels. If you do happen to receive a phone call directly from me about a network problem, something has gone very wrong.
--
Sean Donelan, Data Research Associates, Inc, St. Louis, MO
Affiliation given for identification not representation