Re: What frame relay switch is causing MCI/Worldcom such grief?
The traffic-engineering reason for L2 "routing" is only valid for complex-topology networks. In simple topologies, the penalty for suboptimal paths effectively cancels the gains from spreading traffic around. Physical fiber plants do have rather simple topologies (the rich topologies are usually "optical illusions" created by the SONET layer).
From a customer's point of view, the performance of the network is _not_ measured as available bandwidth, but rather as the performance of his TCP streams, which depends heavily on latency and loss. Increasing latency while there's a lossy component in the path (which is increasingly found not in the backbone but at ingress tail circuits, outside of the ISP's control) degrades performance approximately in inverse proportion to the latency.
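A back-of-the-envelope sketch of that relationship, using the well-known approximation (Mathis et al.) that steady-state TCP throughput scales as MSS / (RTT * sqrt(loss)); the loss rate and RTTs below are made up for illustration:

    from math import sqrt

    MSS  = 1460 * 8   # typical segment size, in bits
    loss = 0.01       # assumed 1% loss at the tail circuit

    # Mathis et al.: per-flow throughput ~ MSS / (RTT * sqrt(loss)).
    # With the tail-circuit loss rate fixed, doubling the path latency
    # roughly halves the customer's TCP throughput.
    for rtt_ms in (30, 60, 120):
        rtt = rtt_ms / 1000.0
        bps = MSS / (rtt * sqrt(loss))
        print("RTT %3d ms -> ~%.2f Mbps per flow" % (rtt_ms, bps / 1e6))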
In other words: excluding grossly overloaded circuits, you want the path with the least latency! This is because your performance is limited by the tail-circuit (or exchange-point) loss _and_ the backbone latency. MPLS does nothing to help avoid these lossy places (avoiding IXP loss would require propagation of interior routing information into peer backbones). Additionally, suboptimal paths as a rule involve more hops, which increase latency variance proportionally.

Now, no matter how one jumps, most congestions only last seconds. Expecting any traffic engineering mechanism to take care of these is unrealistic. A useful time scale for traffic engineering is therefore at least days - which can be perfectly accommodated by capacity planning in a fixed topology. At these time scales traffic matrices do not change rapidly. In fact, as long as there are more than three backbones, one can safely assume that most traffic goes from customers (in proportion to the size of their pipes) to the nearest exchange point, and from exchange points randomly to all customers (again in proportion to their access pipe sizes).

Backbones which neglect capacity planning because they can "reroute" traffic at the L2 level simply cheat their customers. If they _do not_ neglect capacity planning, they do not particularly need the L2 traffic engineering facilities.

Anyway, the simplest solution (having enough capacity, and a physical topology matching the L3 topology) appears to be the sanest way to build a stable and manageable network. Raw capacity is getting cheap fast; engineers aren't. And there is no magic recipe for writing complex _and_ reliable software. The simpler it is, the better it works.
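A toy sketch of that traffic-matrix assumption (a gravity-style model; the access pipe sizes are invented for illustration):

    customers = {"a": 45, "b": 155, "c": 622}   # access pipes in Mbps (made up)
    total = float(sum(customers.values()))

    # Each customer offers traffic toward the nearest exchange point in
    # proportion to its pipe; traffic coming back from the exchange point
    # is spread over all customers, again weighted by pipe size.
    for src, src_pipe in customers.items():
        for dst, dst_pipe in customers.items():
            if src == dst:
                continue
            demand = src_pipe * (dst_pipe / total)  # expected Mbps, src -> dst
            print("%s -> %s: ~%5.1f Mbps" % (src, dst, demand))

--vadim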
On Mon, 9 Aug 1999, Vadim Antonov wrote:
> Now, no matter how one jumps, most congestions only last seconds.
This isn't the case when you have half your bandwidth to any particular point down. Excess capacity in other portions of the network may then be used to carry a portion of the offered load via a suboptimal path.
> Expecting any traffic engineering mechanism to take care of these is unrealistic. A useful time scale for traffic engineering is therefore
Expecting most congestions to last only seconds is also unrealistic. In most cases there is no congestion and everything takes the shortest path; then there is a loss of capacity, and we have a problem. Expecting the physical circuit to never go down thanks to SONET protection and diverse routing is also a bit optimistic, as regrooming may eventually reduce your "diverse" routing to a single path. This does not fly with customers, who want traffic moved, not excuses that the physically diverse pathing was regroomed by a telco to be non-diverse. The MTTR for backhoe-fade-induced loss is long enough that TE techniques have proven to be an effective mechanism for bypassing the outage without operator intervention.
> at least days - which can be perfectly accommodated by capacity planning in a fixed topology. At these time scales traffic matrices do not change
This assumes a decent fixed topology. Historically, the market has moved faster than predictions.
> Backbones which neglect capacity planning because they can "reroute" traffic at the L2 level simply cheat their customers. If they _do not_ neglect capacity planning, they do not particularly need the L2 traffic engineering facilities.
Promising local ISPs are not neglecting capacity planning. The problem is the effectively _random_ delivery of capacity, often at points which are less than optimal.
> Anyway, the simplest solution (having enough capacity, and a physical topology matching the L3 topology) appears to be the sanest way to build a stable and manageable network. Raw capacity is getting cheap fast; engineers aren't. And there is no magic recipe for writing complex _and_ reliable software. The simpler it is, the better it works.
There is _no_ disagreement on this topic. This paragraph is correct as it stands. With the exception of partial capacity loss, this is completely in line with most people's thinking. No one actually sits down and thinks "let's effectively route our traffic so that the sum (bit*mile) is the highest possible." That is just plain wrong. However, given real-world constraints on capacity and delivery, TE is a useful tool today.

/vijay
> > Now, no matter how one jumps, most congestions only last seconds.
> This isn't the case when you have half your bandwidth to any particular point down. Excess capacity in other portions of the network may then be used to carry a portion of the offered load via a suboptimal path.
i believe he is talking of congestion under 'normal' circumstances, in which case the assertion is correct. chronic congestion occurs when you oversubscribe the link and constantly fill/overflow the queue (and you see drops inversely related to interarrival times). chronic congestion (not related to link failure) should (can? seems to be a question for some) be accounted for in windows longer than the normal TE hack delta. you can also engineer an ip network to accommodate circuit failures (with no APS); you just have to understand the problem. helps to be a consolidated CLEC/IXC too.
> > Expecting any traffic engineering mechanism to take care of these is unrealistic. A useful time scale for traffic engineering is therefore
> Expecting most congestions to last only seconds is also unrealistic. In most cases there is no congestion and everything takes the shortest path; then there is a loss of capacity, and we have a problem.
in a network where you have oversold core capacity at the edges, it is certainly normal to experience congestion. remember that the data most are exposed to are 5-minute averages. examining link utilization/drops at a shorter delta is most interesting.
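a toy illustration of how much a 5-minute average hides (synthetic numbers; assume a 155 Mbps link sampled once a second):

    link_mbps = 155.0
    samples = []
    for t in range(300):            # 300 one-second samples = 5 minutes
        busy = t < 30               # a 30-second burst at line rate,
        samples.append(link_mbps if busy else 20.0)   # lightly loaded otherwise

    avg  = sum(samples) / len(samples)
    peak = max(samples)
    # the burst overflows the queue (and drops) the whole time it lasts,
    # yet the 5-minute average barely looks loaded
    print("5-min average: %5.1f Mbps (%3.0f%% of line rate)" % (avg, 100 * avg / link_mbps))
    print("1-sec peak:    %5.1f Mbps (%3.0f%% of line rate)" % (peak, 100 * peak / link_mbps))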
> Expecting the physical circuit to never go down thanks to SONET protection and diverse routing is also a bit optimistic, as regrooming may eventually reduce your "diverse" routing to a single path.
if you are the telco, then you are shooting yourself. if you are not a telco and you buy diverse-path APS in your contract but don't inquire about 'regrooming ticket 12341', then you are just as stupid.
> > at least days - which can be perfectly accommodated by capacity planning in a fixed topology. At these time scales traffic matrices do not change
> This assumes a decent fixed topology. Historically, the market has moved faster than predictions.
so no matter what, it is impossible to provision for O(f(n)) growth ('O', not theta)? i find that hard to believe.
> highest possible." That is just plain wrong. However, given real-world constraints on capacity and delivery, TE is a useful tool today.
s/tool/hack/g

BR