outages, quality monitoring, trouble tickets, etc
A bit of a rambling note, as I catch up with the unusually busy lists....
From: Scott Huddle <huddle@mci.net>
I consider this list a place for ISPs to discuss general policy and planning issues that affect all of us. It is a very inappropriate place to discuss problems with a specific provider.
I firmly DISAGREE! None of us are particularly interested in hearing every jot and tittle about every network flap, but _BIG_ ones and their resolution are important to bring to this list! How else to get a handle on what the real problems are? How else to help each other avoid repeating the problem in the future?

----

As to the current state of the 'net, I have to agree whole-heartedly with Hans Werner (something I rarely did when he was around here....) The mindset in NANOG is pretty useless. It would be nice if folks stopped beating around the bush, worrying about "competitive" issues, and started cooperating! Leave the competitive posturing to the marketing departments.

Fixing problems usually means focusing on a particular case. If analysis of the problems of a particular ISP/NSP sheds some light on resolving a problem of bigger scope, then a little embarrassment is a small price to pay; it's not fatal -- failing to fix the problem is fatal!

----

As to the earlier discussion about Frame Relay instead of direct links, my experience is that F-R within a LATA between a few routers is working reasonably well, but inter-LATA and wider is working poorly, and more than 5-6 routers is a disaster.

Most of my recent link problems that can be pinpointed well enough to open a trouble ticket have been directly due to F-R, primarily at PSI. A couple of weeks ago, they lost the entire Great Lakes area, and didn't notice for over 4.5 hours. And took another 6 hours to fix. They never did tell me the final solution.

So, based on experience, I don't recommend F-R for long haul links. It's just not good enough!

----

One of the reasons that ISPs are flapping is the lack of Link Quality Monitoring. With LQM, you can easily tell when the link is degrading, with very accurate reports on a packet or byte basis. This is particularly important for F-R links, as the switches don't seem to tell each other when the link is down.
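To illustrate the packet and byte accounting mentioned above, here is a rough sketch in Python. It is not the PPP LQM state machine, only the general idea of comparing what the peer says it sent with what actually arrived between two successive reports. The `LinkSample` structure, its field names, the 32-bit counter width, and the 5% threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumption: counters are 32 bits wide and wrap around to zero.
COUNTER_MODULUS = 2 ** 32


@dataclass(frozen=True)
class LinkSample:
    """One snapshot of link counters (hypothetical structure)."""
    peer_sent_packets: int       # packets the peer reports having sent
    peer_sent_octets: int        # octets the peer reports having sent
    local_received_packets: int  # packets counted as received locally
    local_received_octets: int   # octets counted as received locally


def counter_delta(new: int, old: int) -> int:
    """Difference between two counter readings, tolerating wraparound."""
    return (new - old) % COUNTER_MODULUS


def loss_fraction(sent: int, received: int) -> float:
    """Fraction of 'sent' that never arrived; 0.0 when nothing was sent."""
    if sent == 0:
        return 0.0
    return max(sent - received, 0) / sent


def inbound_loss(previous: LinkSample, current: LinkSample) -> tuple[float, float]:
    """Return (packet_loss, octet_loss) for the interval between two samples."""
    sent_packets = counter_delta(current.peer_sent_packets,
                                 previous.peer_sent_packets)
    recv_packets = counter_delta(current.local_received_packets,
                                 previous.local_received_packets)
    sent_octets = counter_delta(current.peer_sent_octets,
                                previous.peer_sent_octets)
    recv_octets = counter_delta(current.local_received_octets,
                                previous.local_received_octets)
    return (loss_fraction(sent_packets, recv_packets),
            loss_fraction(sent_octets, recv_octets))


def link_is_degraded(previous: LinkSample, current: LinkSample,
                     threshold: float = 0.05) -> bool:
    """True when packet or octet loss exceeds the (assumed) threshold."""
    return max(inbound_loss(previous, current)) > threshold
```

A monitoring loop would keep the previous sample, call `link_is_degraded` on each new report, and raise an alarm after some number of bad intervals, rather than waiting for the link to drop entirely.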
I was surprised to learn that some folks weren't using PPP LQM on high speed HDLC links. That's why we originally designed it! PPP also runs over F-R links, even if all you use it for is LQM.

After 4 years, we are finally getting around to advancing PPP LQM to Draft Standard, but it is pretty widely implemented.... Insist on LQM from your router vendors!

----

Has anybody else noticed how hard it is to get trouble tickets these days?

Once upon a time, I just called the NSF NOC, and got a report to them in real time, so the problem could be fixed quickly.

Nowadays, NOCs seem to want you to send email with 24 or 48 hour turnaround, or go through 2 layers of service representatives. Pretty hard to send email to them when their link is down, or go through "regular" support in the middle of the night!

We really need more folks like MCI with an 800 number. I've found them very responsive. But then, I've also found that they have fewer problems than other ISPs I've dealt with lately. Maybe that's because they get faster problem reports?

(See, I can give compliments, too.)

Bill.Simpson@um.cc.umich.edu
Key fingerprint = 2E 07 23 03 C5 62 70 D3 59 B1 4F 5E 1D C2 C1 A2
William Allen Simpson