Just to point out: IOS 12.0(6) supports a restart delay after the interface goes down. It doesn't really detect the line problem fast, except possibly at the CSU level, since HDLC links still depend on the default 10-second keepalive, although you could tune that as well. Restart delay doesn't appear to be documented in any of the usual places, but it does seem to work.

In message <Pine.BSI.4.05L.9910261246180.20140-100000@sh.lh.vix.com>, Jerry Scharf writes:
While I fully agree with Tony's comments on damped control systems, we also have to acknowledge the desire for fast failure recovery. When a person talks about recovery times measured in the same units as link RTT, it is very difficult to create well-damped systems. Many systems that work fine in simulation and in "normal operation" die a horrible death as circuits start to fail. Those people pointing to theoretical complexity tend to be proven correct, in my experience.
I will also point to an idea that Vadim has championed for a long time: fast lines should go down quickly and come up slowly, rather than up quickly and down slowly as they do now. It doesn't solve the damping problem, but it does allow you to be more aggressive in taking failed lines down without destabilizing current algorithms. Someday YFRV may offer this as an option; perhaps some of them do already.
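
As a rough illustration of the down-quickly, up-slowly idea, here is a minimal Python sketch of an asymmetric hold timer placed between the raw line signal and the state reported to routing. The class name, the delay values and the polling-style `observe` interface are all hypothetical, chosen only for illustration; this is not how any particular vendor implements it.

```python
from typing import Optional


class AsymmetricLinkFilter:
    """Report 'down' almost immediately, but 'up' only after a stable period.

    Hypothetical sketch: names and default delays are assumptions.
    """

    def __init__(self, down_delay: float = 0.0, up_delay: float = 10.0) -> None:
        self.down_delay = down_delay  # seconds raw signal must stay down
        self.up_delay = up_delay      # seconds raw signal must stay up
        self.state_up = True          # state currently reported to routing
        self._pending_since: Optional[float] = None

    def observe(self, raw_up: bool, now: float) -> bool:
        """Feed one raw sample taken at time `now`; return the reported state."""
        if raw_up == self.state_up:
            # Raw signal agrees with the reported state: cancel any pending change.
            self._pending_since = None
            return self.state_up
        if self._pending_since is None:
            # First sample that disagrees: start the hold timer.
            self._pending_since = now
        delay = self.up_delay if raw_up else self.down_delay
        if now - self._pending_since >= delay:
            # Signal has disagreed for long enough: accept the transition.
            self.state_up = raw_up
            self._pending_since = None
        return self.state_up
```

With `down_delay=0.0` a failure is reported on the first bad sample, while a flapping line has to stay up for the full `up_delay` before routing sees it again; any dip in between restarts the timer.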
I do believe that some of the TE folks are punting on the general-case solution by only attempting to do TE and protection on a small portion of their traffic. If you have a glut of web traffic that acts as a sponge, you can get away with nonoptimal management of the subset without causing meltdowns.
jerry
--- jerry@fc.net Freeside/ Insync Internet, Inc.| 512-458-9810 | http://www.fc.net #include <sys/machine/wit/fortune.h>