I think how reliable the internet needs to be depends on what you want to use it for: if you want to call an ambulance you DON'T use the internet; if you want to transfer money from one account to another you DO use the internet. In other words, right now it's good for things that are important but not critical from an immediate-action standpoint. If it can wait until tomorrow, use the internet; otherwise pick up the phone and dial.
I can count on one hand the number of times I've had problems with my landline in my entire life, but I can count on two hands the number of problems I've had with my internet connection in one year. If we ever want the internet to grow from being a handy medium for exchanging data into the converged, all-encompassing communications medium, then it needs to go from "Mom, the internet's down again!" to "Dude, my internet connection went down yesterday, that ever happen to you before?". For that to happen there has to be more accountability in the industry.
-GP
-----Original Message-----
From: Steve Gibbard [mailto:scg@gibbard.org]
Sent: 26 February 2004 00:30
To: nanog@merit.edu
Subject: How relable does the Internet need to be? (Was: Re: Converged Network Threat)
Having woken up this morning and realized it was raining in my bedroom (last night was the biggest storm the Bay Area has had since my house got its new roof last summer), and then having moved from cleaning up that mess to vacuuming water out of the basement after the city's storm sewer overflowed (which seems to happen to everybody in my neighborhood a couple of times a year), I've spent lots of time today thinking about general expectations of reliability. In the telecommunications industry, where we tend to treat reliability as very important and any outage as a disaster, hopefully the questions I've been coming up with aren't career ending. ;)
With that in mind, how much in the way of reliability problems is it reasonable to expect our users to accept?
If the Internet is a utility, or more generally infrastructure our society depends on, it seems there are a bunch of different systems to compare it to.
In general, if I pick up my landline phone, I expect to get a dialtone, and I expect to be able to make a call. If somebody calls my landline, I expect the phone to ring, and if I'm near the phone I expect to be able to answer. Yet, if I want somebody to actually get through to me reliably, I'll probably give them my cell phone number instead. If it rings, I'm far more likely to be able to answer it easily than I am my landline, since the landline phone is in a fixed location. Yet some significant portion of calls to or from my cell phone come in when I'm in areas with bad reception, and the conversation becomes barely understandable. In many cases, the signal is too weak to make a call at all, and those who call me get sent straight to voicemail. Most of us put up with this, because we judge mobility to be more important than reliability.
I don't think I've ever had a natural gas outage that I've noticed, but most of my gas appliances won't work without electric power. I seem to lose electric power at home for a few hours once a year or so, and after the interruption life tends to resume as it was before. When power outages were significantly more frequent, and due to rationing rather than to accidents, it caused major political problems for the California government. There must be some threshold for what people are willing to accept in terms of residential power outages that's somewhere above 2-3 hours per year.
In Ann Arbor, Michigan, where I grew up, the whole town tended to pretty much grind to a halt two or three days a year, when more snow fell than the city had the resources to deal with. The quantity of snow necessary to cause that was probably four or five inches.
My understanding is that Minneapolis and Washington DC both grind to a halt due to snow with somewhat similar frequency, but the amount of snow required is significantly more in Minneapolis and significantly less in DC. Again, there must be some threshold of interruptions due to exceptionally bad weather that are tolerated, which nobody wants to do worse than and nobody wants to spend the money to do better than.
So, it appears that among general infrastructure we depend on, there are probably the following reliability thresholds:
Employees not being able to get to work due to snow: two to three days per year.
Berkeley storm sewers: overflow two to three days per year.
Residential Electricity: out two to three hours per year.
Cell phone service: Somewhat better than nine fives of reliability ;)
Landline phone service: I haven't noticed an outage on my home lines in a few years.
Natural gas: I've never noticed an outage.
How Internet service fits into that of course depends on how you're accessing the Net. The T-Mobile GPRS card I got recently seems significantly less reliable than my cell phone. My SBC DSL line is almost to the reliability level of my landline phone or natural gas service, except that the DSL router in my basement doesn't work when electric power is out. I'm probably poorly qualified to talk about the end-user experience on the networks I actually work on, even if I had permission to.
Like pretty much everybody else here, I'm always interested in doing better on reliability. And, like many of my neighbors, I'd like to be able to store stuff on my basement floor. In comparison to a lot of other infrastructure we depend on, it seems to me the Internet is already doing pretty well.
-Steve
On Wed, 25 Feb 2004, Jared Mauch wrote:
Ok.
I can't sit by here while people speculate about the possible problems of a network outage.
I think that most everyone here reading NANOG realizes that the Internet is becoming more and more central to daily life even for those that are not connected to the internet.
From where I'm sitting, I see a number of potentially dangerous trends that could result in some quite catastrophic failures of networks. No, I'm not predicting that the internet will end in 8^H7 days or anything like that. I think the Level3 outage as seen from the outside is a clear case that single providers will continue to have their own network failures for some time to come. (I just hope daily it's not my employer's network ;-) )
So, we're sitting here at the crossroads, where VoIP is "coming of age". Vonage, 8x8 and others are blazing a path that the rest of the providers are now beginning to gun for. We've already read in press releases and articles in the past year how providers in Canada and the US are moving to VoIP transport within their long-distance networks.
I keep hearing of Frame-Relay and ATM signaling that is going to happen in large providers' MPLS cores. That's right, your "safe" TDM based services will be transported over someone's IP backbone first. This means if they don't protect their IP network, the TDM services could fail. These types of CES services are not just limited to Frame and ATM. (Did anyone with frame/atm/vpn services from Level3 experience the same outage?)
Now the question of Emergency Services is being posed here but also in parallel by a number of other people at the FCC. We've seen the E911 recommendation come out regarding VoIP calls. How long until a simple power failure results in the inability to place calls?
Now, I'm not trying to pick on Level3 at all. The trend I outline here is very real. The reliance on the Internet for critical communications is a trend that continues. Look at how it was used on 9/11 for communications when cell and land based telephony networks were crippled.
The internet has become a very critical part of all of our lives (some more than others) with banks using VPNs to link their ATMs back into their corporate network as well as the number of people that use it for just plain "just in time" bill payment and other things. I can literally cancel my home phone line, cell phone and communicate solely with my internet connection, performing all my bill payments without any paperwork. I can even file my taxes online.
We're at (or already past) the dangerous point of network convergence. While I suspect that nobody directly died as a result of the recent outage, the trend to link together hospitals, doctors and other agencies via the Internet and a series of VPN clients continues to grow. (I say this knowing how important the internet is to the medical community, reading x-rays and other data scans at home for the oncall is quite common).
While my friends that are local VFD do still have the traditional pager service with towers, etc... how long until the T1's that are used for dial-in or speaking to the towers are moved to some sort of IP based system? The global economy seems to be going this direction with varying degrees of caution.
I'm concerned, but not worried.. the network will survive..
- Jared
On Wed, Feb 25, 2004 at 09:17:30AM -0600, Pete Templin wrote:
If an IP-based system lets you see the status of the 23 hospitals in San Antonio graphically, perhaps overlaid with near-real-time traffic conditions, I'd rather use it as primary and telephone as secondary.
Counting on it? No. Gaining usability from it? You betcha.
Brian Knoblauch wrote:
If you're counting on IP (a "best attempt" protocol) for critical data, you've got a serious design flaw in your system...
-----Original Message-----
From: owner-nanog@merit.edu [mailto:owner-nanog@merit.edu] On Behalf Of Pete Templin
Sent: Wednesday, February 25, 2004 9:10
To: Colin Neeson
Cc: nanog@merit.edu
Subject: Re: Level 3 statement concerning 2/23 events (nothing to see, move along)
Are you sure no one died as a result? My hobby is volunteering as a firefighter and EMT. If Level3's network sits between a dispatch center or mobile data terminal and a key resource, it could be a factor (hospital status website, hazardous materials action guide, VoIP link that didn't reroute because the control plane was happy but the forwarding plane was sad, etc.).
And if the problem could happen to another network tomorrow but could be prevented or patched, wouldn't inquiring minds want to know? Your life might be more interesting when the fit hits the shan if you have the same vulnerability.
Colin Neeson wrote:
Because, in the grand scheme of things, it's really not that important.
No one died because of it, the normal, everyday events of the world went on, unaffected by a Level 3 outage...
Might be nice to know what happened, but my life will certainly not be less interesting by not having that knowledge...
--
Jared Mauch | pgp key available via finger from jared@puck.nether.net
clue++; | http://puck.nether.net/~jared/ My statements are only mine.
--------------------------------------------------------------------------------
Steve Gibbard scg@gibbard.org
+1 415 717-7842 (cell) http://www.gibbard.org/~scg
+1 510 528-1035 (home)