David Meyer wrote:
No doubt. However, the problem is: What constitutes "unnecessary system complexity"? A designed system's robustness comes in part from its complexity. So it's not that complexity is inherently bad; rather, if you push a system's complexity past some point, you wind up with extreme sensitivity to outlying events, exhibited as catastrophic cascading failures. These are the so-called "robust yet fragile" systems (think NE power outage).
I think you hit the nail on the head. I view complexity as a diminishing-returns play: each increase in complexity benefits a shrinking percentage of the users.

One way to manage complexity is to split large systems into smaller pieces and try to make the pieces independent enough to survive the failure of a neighboring piece. This approach exists at least in the marketing materials of many telecommunications equipment vendors. The question then becomes, "what good is a backbone router without a BGP process?" So far I haven't seen a router with a disposable entity on a per-interface or per-peer basis, so that if the BGP speaker for 10.1.1.1 crashed, the system could still maintain its relationship with 10.2.2.2.

Obviously the point of single-device availability becomes moot if we can figure out a way to route/switch around the failed device quickly enough. Today we don't even have a generic IP-layer liveness protocol, so by default packets will be blackholed for a fixed period until a routing protocol starts to miss its hello packets. (I'm aware of work towards this goal.)

In summary, I feel systems should be designed to run independently in all failure modes. If you lose 1-n neighbors, the system should be self-sufficient: it should figure out the situation near-immediately and continue working while negotiating with its neighbors about the overall picture.

Pete
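[Editor's note: as a rough illustration of the per-peer isolation Pete is asking for, here is a minimal Go sketch. It is not taken from any actual router implementation; goroutines with local panic recovery stand in for separate per-peer processes, and the peer addresses and the injected fault are purely illustrative.]

package main

import (
	"fmt"
	"sync"
	"time"
)

// handlePeer runs one peering session in its own goroutine. A panic inside
// one handler is recovered locally, so a crash in the "speaker" for 10.1.1.1
// does not take down the session to 10.2.2.2.
func handlePeer(peer string, wg *sync.WaitGroup) {
	defer wg.Done()
	defer func() {
		if r := recover(); r != nil {
			fmt.Printf("session to %s crashed (%v); other sessions keep running\n", peer, r)
		}
	}()

	for i := 0; i < 3; i++ {
		// Simulate a fault in exactly one per-peer handler.
		if peer == "10.1.1.1" && i == 1 {
			panic("parser bug in UPDATE handling")
		}
		fmt.Printf("keepalive exchanged with %s\n", peer)
		time.Sleep(100 * time.Millisecond)
	}
}

func main() {
	peers := []string{"10.1.1.1", "10.2.2.2"}
	var wg sync.WaitGroup
	for _, p := range peers {
		wg.Add(1)
		go handlePeer(p, &wg)
	}
	wg.Wait()
}

In a real router the same idea would presumably mean separate, supervised per-peer processes plus graceful handling of the restarted session's state, which is far more work than this toy, but the isolation principle being argued for is the same.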