On 5/11/2010 11:35, Jay Nakamura wrote:
So, we have two upstreams, both coming in on Ethernet. One of our switches crashed and rebooted itself. Although we have other paths to egress the network, the router's Ethernet interface didn't go down, so our router's BGP didn't realize the neighbor was down until the default BGP timeout was reached. Our upstream connectivity was out for a couple of minutes.
I am looking for ways to detect a neighbor being down faster so traffic can be re-routed sooner. I can do BFD internally, but the issue is how the upstream is going to detect the outage and stop routing our traffic to that downed link. I have asked both of my upstreams: one said they don't do anything like that, and I am still waiting on an answer from the second.
My question is, do other carriers do BFD, or use any other means, to detect the neighbor being down faster than normal BGP will allow? (Both upstreams are major telcos [AT&T and Qwest], so I think they are less flexible than some others.)
Or, has anyone succeeded in getting something done with those two carriers?
In my experience this is a pretty common problem with carrier Ethernet links where the interface is always "up" unless the directly connected switch/mux fails. Even then, it may still keep the port up through reboots. I like how Ethernet is cheap, but I hate how it lacks simple things like "link is down if any segment of the L1 or L2 between endpoints faults" that you get without silly tricks on a DSx or OC-x. (Then again, I suppose you're paying for that capability if it's important enough.)

~Seth
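For a rough sense of the gap being discussed, here is a small back-of-the-envelope sketch in Python. It is not anyone's actual configuration: the 180 s hold / 60 s keepalive values are common vendor defaults assumed for illustration (the thread only says "default BGP timeout"), and the 300 ms x 3 BFD values are likewise just an example. The function names are made up for this sketch. It assumes the link fails silently, i.e. the Ethernet port stays up, so only timers can notice.

```python
def bgp_detection_window_s(hold_time_s, keepalive_s):
    """Approximate (best, worst) seconds until BGP's hold timer expires.

    The hold timer restarts on each received KEEPALIVE/UPDATE, so a
    failure right after a keepalive takes the full hold time to notice,
    while a failure just before the next keepalive takes roughly
    hold_time - keepalive. Timer jitter is ignored here.
    """
    return (hold_time_s - keepalive_s, hold_time_s)


def bfd_detection_time_ms(interval_ms, multiplier):
    """BFD detection time: negotiated interval times the detect multiplier."""
    return interval_ms * multiplier


if __name__ == "__main__":
    # Assumed values, not taken from the thread.
    best, worst = bgp_detection_window_s(hold_time_s=180, keepalive_s=60)
    print(f"BGP hold timer: roughly {best}-{worst} s to declare the neighbor down")

    bfd_ms = bfd_detection_time_ms(interval_ms=300, multiplier=3)
    print(f"BFD: about {bfd_ms} ms to declare the neighbor down")
```

With those assumed numbers the BGP-only case lands in the "couple of minutes" range described above, while BFD would be under a second. The catch raised in the original post still applies: both ends have to run it, so the upstream has to agree.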