On Mon, Jan 26, 2004 at 08:47:54AM -0700, Wayne E. Bouchard wrote:
Although in principle I agree with what you say here, I will point out that "significant" network outages (excluding things like the recent power failure in LAX) have become far less frequent than they were 5 or 6 years ago. Part of this is due to attitudes about the 'net maturing, part to the increased experience of the average engineer, and part to things such as MPLS fast reroute.
I am going to have to call bullshit on the MPLS fast reroute thing there, Wayne. The canonical counterexample is Sprint. Excellent engineering and ops folks top the list, followed closely by sufficient capacity and by no longer pushing the envelope (we are now at a scale of growth where things like running out of pps don't happen any more). Add to that the fact that we are now growing in an organic fashion, so the people at the bleeding edge are sufficiently clued up that the vendors' products are in shape for the people following. Major protocol implementations have been beaten into shape, and now it is (mostly) a matter of bigger bandwidth and routers, not any fundamental architectural change. /vijay
I would also point out that, although there remain single points of interconnect, MPLS has meant that the path packets take intra-network doesn't have to be a single route between two boxes. BGP picks the exit point, and engineers have configured MPLS to spread the traffic over 3 or 4 tunnels to get there, thereby reducing the impact of a single failure.
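To make that concrete, here is a toy Python sketch of the per-flow hashing idea that typically underlies this kind of spreading: hash each flow's 5-tuple and map it onto one of N parallel tunnels toward the exit. The tunnel names and field layout are made up for illustration; real routers do this in forwarding hardware, not in a script.

    import hashlib

    # Hypothetical set of parallel LSPs toward one BGP exit point.
    TUNNELS = ["tunnel-1", "tunnel-2", "tunnel-3", "tunnel-4"]

    def pick_tunnel(src_ip, dst_ip, src_port, dst_port, proto):
        """Map a flow onto one of the parallel tunnels by hashing its 5-tuple.

        Per-flow (rather than per-packet) hashing keeps each flow's packets
        in order while spreading distinct flows across all tunnels, so a
        single LSP failure hits only ~1/len(TUNNELS) of the traffic.
        """
        key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
        digest = hashlib.md5(key).digest()
        index = int.from_bytes(digest[:4], "big") % len(TUNNELS)
        return TUNNELS[index]

    # Two different flows to the same exit may ride different LSPs:
    print(pick_tunnel("10.0.0.1", "192.0.2.9", 12345, 80, "tcp"))
    print(pick_tunnel("10.0.0.2", "192.0.2.9", 23456, 80, "tcp"))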
But as you say, this really gets into the realm of overbuilt backbones which, of course, not everyone has. BGP isn't the best. I think many people have recognized that for some years now. However, when properly managed, it suits current needs.
Perhaps it's time for the next generation of BGP to come into being; something that can use up to 4 paths through a network for any single destination rather than simply leaving alternate paths inactive until something changes. Heaven knows there are many instances where there are two or more "good" (and even equal) paths through a network that are not chosen simply because we're only allowed one.
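For what it's worth, the change being asked for is small in spirit: instead of collapsing the decision process down to a single winner, keep every path that ties on the usual tie-breakers, capped at 4. A toy Python sketch of that selection step follows; the fields and ranking are a deliberate simplification of the real BGP decision process, not a faithful copy of any implementation.

    from dataclasses import dataclass

    @dataclass
    class Path:
        next_hop: str
        local_pref: int
        as_path_len: int
        med: int

    def best_paths(paths, maxpaths=4):
        """Return up to `maxpaths` paths that tie on the ranking criteria,
        rather than the single winner classic BGP would install."""
        if not paths:
            return []
        # Higher local-pref wins, then shorter AS path, then lower MED.
        rank = lambda p: (-p.local_pref, p.as_path_len, p.med)
        paths = sorted(paths, key=rank)
        best = rank(paths[0])
        return [p for p in paths if rank(p) == best][:maxpaths]

    rib = [
        Path("10.1.1.1", 100, 3, 0),
        Path("10.2.2.2", 100, 3, 0),  # equally good; classic BGP drops it
        Path("10.3.3.3", 90, 2, 0),
    ]
    print([p.next_hop for p in best_paths(rib)])  # keeps both equal paths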
On Mon, Jan 26, 2004 at 10:35:38AM +0000, Michael.Dillon@radianz.com wrote:
BGP is relatively good at determining the best path when you are a major carrier with connectivity to "everyone" (i.e. when traffic flows "naturally"), in many locations, and you engineer your network so that you have sufficient capacity to support the traffic flows.
In other words, BGP really only works well when most networks are overbuilt, so that there is a single uncongested best path through each network from every ingress to every egress, and the paths within any given network's core are roughly similar in capacity.
Nowadays there is a lot more variability both within networks and between different networks. How can a simple protocol provide optimal behavior between an MPLS network, an IP over ATM network, a network that is half GRE tunnels, and a network that has core links ranging from DS3 to OC48? I think BGP is another example where something that is "good enough" has risen to prominence in spite of the fact that it is not optimal.
And another thing. How do we know this problem can ever be solved when we continue to use routing protocols which choose the *BEST* path? The best path is always a single path and, by definition, a single path is a single point of failure. How can we ever have a diverse and reliable network when its core routing paradigm is a single point of failure?
Note that people have built IP networks that provide two diverse paths at all times using multicast (see http://www.simc-inc.org/archive9899/Jun01-1999/bach2/Default.htm), and such things may also be possible with MPLS. But are any of the researchers seriously looking at how to provide a network in which all packets flow through two diverse paths for better reliability?
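As a sketch of what "two diverse paths" means computationally: find one path, strip its edges, and find a second. The greedy Python version below is only illustrative; it can miss disjoint pairs that Suurballe's or Bhandari's algorithm would find, and the toy topology is invented for the example.

    from collections import deque

    def shortest_path(graph, src, dst, banned=frozenset()):
        """BFS shortest path avoiding any edge in `banned` ((u, v) pairs)."""
        prev = {src: None}
        q = deque([src])
        while q:
            u = q.popleft()
            if u == dst:
                path, node = [], dst
                while node is not None:
                    path.append(node)
                    node = prev[node]
                return path[::-1]
            for v in graph.get(u, ()):
                if v not in prev and (u, v) not in banned and (v, u) not in banned:
                    prev[v] = u
                    q.append(v)
        return None

    def two_diverse_paths(graph, src, dst):
        """Greedy sketch: take one path, remove its edges, find a second.
        (Production code would use Suurballe/Bhandari, which can find
        disjoint pairs this greedy version misses.)"""
        first = shortest_path(graph, src, dst)
        if first is None:
            return None
        used = {(a, b) for a, b in zip(first, first[1:])}
        second = shortest_path(graph, src, dst, banned=used)
        return (first, second) if second else None

    # Toy topology with two edge-disjoint routes between A and D.
    topo = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
    print(two_diverse_paths(topo, "A", "D"))  # (['A','B','D'], ['A','C','D'])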
--Michael Dillon
---
Wayne Bouchard
web@typo.org
Network Dude
http://www.typo.org/~web/