If the network is always down, how come I can post a message
Discussing network reliability for the last week, I've been beating my head against the "phone system is 99.999% reliable, but the Internet crashes all the time."

One anecdotal data point, I've been reporting about Internet problems for the last five years or so. Over the last 5 years no Internet network event has been so severe it prevented me from reporting about the problem on the net. In a strange way, my postings about the problems on the net are proof of the reliability of the same network.

In the same time period, I've lost my telephone service several times.

1. The Ladue, MO central office was struck by lightning, and the generator failed
2. The Rochelle Park, NJ central office was flooded by a hurricane the week I started work in northern New Jersey
3. I moved to Pacific Bell country, enough said

I've lost my pager service multiple times

1. Galaxy IV died
2. Pagenet network problems

Even the Associated Press has gone down in the last five years.

So I'm sick and tired about hearing the telephone network is 99.999% reliable and the Internet isn't.
On 31 Oct 2000, Sean Donelan wrote:
One anecdotal data point, I've been reporting about Internet problems for the last five years or so. Over the last 5 years no Internet network event has been so severe it prevented me from reporting about the problem on the net. In a strange way, my postings about the problems on the net are proof of the reliability of the same network.
In the same time period, I've lost my telephone service several times.
I've lost my pager service multiple times
Even the Associated Press has gone down in the last five years.
So I'm sick and tired about hearing the telephone network is 99.999% reliable and the Internet isn't.
Me, too. I'll put some finer points on the topic, though.

The services you cite (phone, paging, AP) have essentially one application from a user point of view. The Internet has thousands of different applications at the user level, and they handle this or that "outage" differently. Email is particularly robust in its store-and-forward behavior, as a bounce is the only failure mode that readily comes to mind (i.e., any mail that's eventually delivered is successful).

Other applications may behave differently, but adding complexity to the analysis is the fact that the number of end nodes means the matrix of possible src/dest pairs quickly climbs into the billions, with port and protocol multiplexing on top of that.

Thinking of things this way, it seems clear that there's no way to measure the "up-ness" of any part of the Internet that's not been isolated by some outage at a local entry point. That is, unless its Ethernet cable is unplugged or the WAN link out of the building goes down, a given machine seems to be "up", as does the larger network, but we can be sure there's something somewhere it can't get to and some application that would be affected.

Given this, measurements based on 9's seem particularly ill-suited, and any metric that's not *extremely* narrowly defined seems incalculable.

How to explain this to customers, though? One possible approach would be to remind them that the way most users approach Internet applications approximates, "Oh, it's not working now, time for a coffee break."

Nothing seems to stop people from building inappropriate applications on top of IP, though. (While I might consider it folly to depend on a web page to send sell orders for my stocks when the market's crashing, can I be sure that whatever other approach might be in the back of my mind would work in a "bad time"?)

I'd like to see more discussion in this forum of new ways to think about risk and communicate it to those outside the technical IP community (e.g., managers and customers).

Tony
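For a rough sense of the numbers behind the two points above, here is a minimal Python sketch. It assumes "five 9's" means 99.999% availability measured over a 365.25-day year, and it uses a host count of 100,000 purely as an illustrative figure (it does not come from the thread). The function names are hypothetical.

```python
# Back-of-the-envelope arithmetic only; not a measurement method.

MINUTES_PER_YEAR = 365.25 * 24 * 60  # assumption: average year length


def allowed_downtime_minutes(availability: float) -> float:
    """Minutes per year a service may be down at the given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def ordered_pairs(hosts: int) -> int:
    """Number of distinct (src, dest) pairs among `hosts` end nodes."""
    return hosts * (hosts - 1)


# Five 9's leaves a bit over five minutes of downtime per year.
print(f"{allowed_downtime_minutes(0.99999):.2f} minutes/year")

# Illustrative host count (an assumption): 100,000 end nodes already
# give about ten billion src/dest pairs, before ports and protocols.
print(ordered_pairs(100_000))
```

The pair count grows roughly with the square of the number of end nodes, which is why a single "up" percentage for the whole matrix is hard to define, let alone measure.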
Sean,

This is because the phone system is considered down when the subscriber can dial no numbers. The Internet is considered down when you can't reach any particular web site (or collection thereof). If people can't reach a given company by phone they say 'their phone line is down'.

I have given up counting the number of times I have tried to dial the US from the UK and got fast busy (or worse).

--
Alex Bligh
VP Core Network, XO Communications - http://www.xo.com/
(formerly Nextlink Inc, Concentric Network Corporation GX Networks, Xara Networks)
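The difference between the two definitions of "down" can be made concrete with a small Python sketch. The data here is invented for illustration: each sample is a hypothetical reachability check from one subscriber to a few destinations, and the function names are assumptions, not any standard metric.

```python
from typing import Sequence


def availability_phone_style(samples: Sequence[Sequence[bool]]) -> float:
    """'Down' only when no destination at all is reachable."""
    up = sum(1 for reachable in samples if any(reachable))
    return up / len(samples)


def availability_web_style(samples: Sequence[Sequence[bool]]) -> float:
    """'Down' whenever any single destination is unreachable."""
    up = sum(1 for reachable in samples if all(reachable))
    return up / len(samples)


# Hypothetical data: four points in time, three destinations each.
samples = [
    [True, True, True],     # everything reachable
    [True, False, True],    # one destination unreachable
    [False, False, False],  # local outage, nothing reachable
    [True, True, False],    # a different destination unreachable
]

print(availability_phone_style(samples))  # 0.75
print(availability_web_style(samples))    # 0.25
```

The same observations yield very different availability figures depending on which definition is applied, so comparing a phone-style number against an Internet-style one says little about the underlying networks.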
participants (3): Alex Bligh, Sean Donelan, Tony Tauber