I think the power grid outage is a good example of the "robust yet fragile" phenomenon found in many complex networks. The power grid is robust to even large variations in demand, but is extremely sensitive to the loss of particular power lines or generators (see any of the work done by Massoud Amin). The reason is that the vast majority of generators have only a few interconnections, but a minority have a large number and are critical to keeping the system going. If one of the generators in that minority fails, it can rapidly result in a cascading failure, since most of the adjacent generators do not have the capacity to handle the rerouted power.

The Internet at the AS level is similar to the US power grid from a structural connectivity perspective, except that there is an even smaller minority of nodes with an even larger percentage of connections. Theoretically going from a few large transit providers to more mid-size or smaller providers would increase the resilie...

The big 1996 power failure out west turned out to result from a power line in Oregon that was downed by a falling tree branch. The rerouted capacity resulted in a cascading failure that spread to Denver and California.

----- Original Message -----
From: Iljitsch van Beijnum <iljitsch@muada.com>
Date: Friday, August 15, 2003 1:53 pm
Subject: Re: East Coast outage?
On vrijdag, aug 15, 2003, at 17:55 Europe/Amsterdam, Michael.Dillon@radianz.com wrote:
>> Perhaps the lesson to learn is that very large networks don't always lead to very high stability. A much larger number of smaller, more autonomous generation and transmission facilities might have much more reasonable interconnection requirements, and hence less wide-ranging failure modes.
> And if we extrapolate that lesson to IP networks it implies that any medium to large sized organization should do their own BGP peering and multihome to 3 or more upstream network providers.
While this certainly has its advantages, I don't think it follows from Joe's remarks. What would follow is having many smaller transit networks rather than a few big ones. But I think in this regard IP is well ahead of the electricity people.
Still, I don't think it's this simple, as the problem with power is that supply and demand must be the same at all times. So if a decent chunk of the network that connects the two goes down, the supply side gets into trouble because they're suddenly generating too much. If the difference is big enough it's probably impossible to arrive at a new equilibrium above 0 fast enough. If you connect everything together you can absorb bigger imbalances but then when you get one you can't absorb, the impact is larger of course.
Fortunately in our business we have queues to smooth the spikes in network use and when we drop packets there are no sparks.
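To make the cascade mechanism discussed in this thread concrete, here is a minimal toy sketch. It is not a model of any real grid: the rule that a failed node's load is split equally among its surviving neighbors, and all of the names and numbers (simulate_cascade, the star topology, the loads and capacities), are assumptions made up for illustration.

```python
from collections import deque


def simulate_cascade(graph, load, capacity, first_failure):
    """Toy load-redistribution model (an illustrative assumption, not real grid physics).

    When a node fails, its load is split equally among its neighbors that
    have not failed yet. Any neighbor pushed above its capacity fails too.
    Returns the set of failed nodes.
    """
    load = dict(load)  # work on a copy so the caller's data is untouched
    failed = set()
    queue = deque([first_failure])
    while queue:
        node = queue.popleft()
        if node in failed:
            continue
        failed.add(node)
        live = [n for n in graph[node] if n not in failed]
        if not live:
            continue  # nobody left to take the load; it is simply shed
        share = load[node] / len(live)
        load[node] = 0.0
        for n in live:
            load[n] += share
            if load[n] > capacity[n]:
                queue.append(n)  # overloaded neighbor fails next
    return failed


# Hypothetical star topology: one heavily connected hub, four small spokes.
graph = {"H": ["A", "B", "C", "D"], "A": ["H"], "B": ["H"], "C": ["H"], "D": ["H"]}
load = {"H": 80.0, "A": 10.0, "B": 10.0, "C": 10.0, "D": 10.0}
capacity = {"H": 100.0, "A": 20.0, "B": 20.0, "C": 20.0, "D": 20.0}

hub_result = simulate_cascade(graph, load, capacity, "H")
spoke_result = simulate_cascade(graph, load, capacity, "A")
print(sorted(hub_result), sorted(spoke_result))
```

In this made-up example, losing a spoke is absorbed by the hub, while losing the hub overloads every spoke. That is the "robust yet fragile" pattern in miniature.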
> Perhaps we should start working on a hierarchical routing system in which the concept of a "global routing table" cannot develop. Perhaps announcements and withdrawals should have a TTL so that they never propagate very far from their source AS?
Have a look at the work going on in the IETF multihoming in IPv6 (multi6) working group and the IRTF Routing Research Group.
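For what it's worth, the TTL idea quoted above can be sketched as a hop-limited flood over an AS-level graph. This is purely hypothetical: BGP updates carry no such TTL today, and the function name, the chain topology, and the private-use AS numbers below are all invented for illustration.

```python
from collections import deque


def propagate(as_graph, origin, ttl):
    """Hypothetical hop-limited flood of a routing announcement.

    Returns {asn: hops_from_origin} for every AS that hears the
    announcement when it may travel at most `ttl` AS hops.
    """
    heard = {origin: 0}
    queue = deque([origin])
    while queue:
        asn = queue.popleft()
        if heard[asn] == ttl:
            continue  # TTL exhausted: do not pass the announcement on
        for peer in as_graph[asn]:
            if peer not in heard:
                heard[peer] = heard[asn] + 1
                queue.append(peer)
    return heard


# Made-up chain of five ASes using private-use AS numbers.
as_graph = {
    64512: [64513],
    64513: [64512, 64514],
    64514: [64513, 64515],
    64515: [64514, 64516],
    64516: [64515],
}

reach = propagate(as_graph, 64512, ttl=2)
print(reach)
```

With a TTL of 2, only the two nearest ASes hear the announcement, so more distant networks would need some aggregate or default route to reach the origin. That reachability question is the part this sketch does not address.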