Re: BIRD vs Quagga

17 Feb 2010

On Tue, 2010-02-16 at 21:50 -0800, Joe Abley wrote:
...
On 2010-02-16, at 19:53, Tomas L. Byrnes wrote:
...
There's significant theoretical work, backed up with lots of practical
experience connecting a lot more nodes in real time in a lot more places
than the Internet currently does, that posits that the control and
forwarding plane should actually ALWAYS be separate, and control higher
priority, so that state management converges faster than the dataflows.
I'd like to see the countervailing, peer-reviewed references.

I have no shortage of anecdotes where a non-trivial layer-2 topology
at an exchange point has left my router and provider X's router both
able to talk to a route server, but unable to talk to each other
directly. Since the NEXT_HOP on routes we each learnt from the route
server pointed at an address we couldn't talk to, the result was a
black hole.

I have similar anecdotes... and I was on the side of running the
route-servers.  This gets to be a tough nut to crack especially if you
happen to have multiple RSes on opposite ends of a layer-2 failure (a
case where intended redundancy resulted in unintended new failure
modes).

The best solution we came up with at the time was to add some control
knobs to rsd in order to allow us to quickly take down the BGP session
to the peer on the falsely advertising RS.  Figuring out which
third-party negotiated "pairwise peering" was being affected during a
switch fabric breakage was done manually at the time, and it was neither
all that accurate nor, of course, expedient.  We attempted to automate
that part without much success.
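
For what it's worth, the check we were trying to automate can be
sketched roughly as below. This is only an illustration, not what rsd
actually did: the function name, the data structures, and the idea that
you can get a trustworthy layer-2 reachability set at all are
assumptions on my part (obtaining that set reliably is the hard bit).
Given which peers each RS still has a session with, which pairwise
peerings were negotiated, and which peer pairs can still reach each
other directly, it lists the peerings each RS is falsely advertising.

```python
from itertools import combinations


def affected_peerings(reachable, rs_sessions, peerings):
    """List, per route server, the negotiated peerings that black-hole.

    reachable   -- set of frozenset({a, b}): peer pairs that can still
                   talk to each other directly at layer 2 (assumed input)
    rs_sessions -- dict: route server name -> set of peers whose BGP
                   session to that route server is still established
    peerings    -- set of frozenset({a, b}): negotiated pairwise peerings
    """
    result = {}
    for rs, peers in rs_sessions.items():
        broken = []
        # Both peers still hear routes from this RS, but the NEXT_HOP
        # each learns from it points at an address it cannot reach.
        for a, b in combinations(sorted(peers), 2):
            pair = frozenset((a, b))
            if pair in peerings and pair not in reachable:
                broken.append((a, b))
        if broken:
            result[rs] = broken
    return result


if __name__ == "__main__":
    # Hypothetical exchange: A and B can reach each other, C is cut off
    # from both but still has a session to rs1.
    reachable = {frozenset(("A", "B"))}
    rs_sessions = {"rs1": {"A", "B", "C"}, "rs2": {"C"}}
    peerings = {frozenset(("A", "B")), frozenset(("A", "C"))}
    print(affected_peerings(reachable, rs_sessions, peerings))
```

With that made-up input, rs1 is the one advertising a broken A/C
peering, so that is the RS whose sessions you would want to take down
or filter first.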

-- 
/*=================[ Jake Khuon <khuon@NEEBU.Net> ]=================+
 | Packet Plumber, Network Engineers     /| / [~ [~ |) | | -------- |
 | for Effective Bandwidth Utilisation  / |/  [_ [_ |) |_| NETWORKS |   
 +==================================================================*/