On Wed, 2 Sep 2020 at 16:16, Baldur Norddahl <baldur.norddahl@gmail.com> wrote:
> I am not buying it. No normal implementation of BGP stays online, replying to heartbeats and accepting updates from eBGP peers, yet still fails to process withdrawals from customers after 5 hours.
I can imagine writing a BGP implementation like this:

  a) a separate queue for keepalives, which I always serve first, in full
  b) a separate queue for updates, which I serve second
  c) a separate queue for withdraws, which I serve last

Why might I think this makes sense? Perhaps I have just received from RR2 the prefix I'm pulling from RR1. If I don't handle all my updates first, I cause an outage that should not happen, because I had already received the update telling me I don't need to withdraw it.

Is this the right way to do it? Maybe not, but it's easy to imagine why it might seem like a good idea. How well BGP works in common cases and how it works in pathologically scaled and busy cases are very different things.

I know that even in stable states, commonly run vendors on commonly run hardware can take more than 2 hours to finish converging iBGP on initial turn-up.

-- 
  ++ytti
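
A minimal sketch of the three-queue idea above, purely for illustration. This is a toy model, not any vendor's code: the class name ToyBgpScheduler, its methods, and the prefixes are all made up for the example. It shows both sides of the trade-off: an update from RR2 is applied before the withdraw from RR1, so the prefix never goes away, but as long as updates keep arriving the withdraw queue is never served.

```python
from collections import deque


class ToyBgpScheduler:
    """Hypothetical strict-priority message scheduler (illustration only)."""

    # Queues are served strictly in this order.
    PRIORITY = ("keepalive", "update", "withdraw")

    def __init__(self):
        self.queues = {kind: deque() for kind in self.PRIORITY}
        self.rib = {}  # prefix -> set of peers currently advertising it

    def receive(self, kind, peer, prefix=None):
        """Enqueue a message; nothing is applied until process_one()."""
        self.queues[kind].append((peer, prefix))

    def process_one(self):
        """Handle one message from the highest-priority non-empty queue."""
        for kind in self.PRIORITY:
            if not self.queues[kind]:
                continue
            peer, prefix = self.queues[kind].popleft()
            if kind == "update":
                self.rib.setdefault(prefix, set()).add(peer)
            elif kind == "withdraw":
                peers = self.rib.get(prefix, set())
                peers.discard(peer)
                if not peers:
                    self.rib.pop(prefix, None)
            return kind
        return None  # all queues empty

    def reachable(self, prefix):
        return prefix in self.rib


# Case 1: the withdraw from RR1 arrives before the update from RR2, but the
# update is processed first, so the prefix never disappears.
s = ToyBgpScheduler()
s.rib = {"192.0.2.0/24": {"RR1"}}
s.receive("withdraw", "RR1", "192.0.2.0/24")
s.receive("update", "RR2", "192.0.2.0/24")
order = [s.process_one(), s.process_one()]
still_reachable = s.reachable("192.0.2.0/24")

# Case 2: one update arrives per processing slot, so the withdraw queue
# is never reached and the stale route stays installed.
busy = ToyBgpScheduler()
busy.rib = {"198.51.100.0/24": {"customer"}}
busy.receive("withdraw", "customer", "198.51.100.0/24")
for _ in range(1000):
    busy.receive("update", "peer", "203.0.113.0/24")
    busy.process_one()
stale_route_present = busy.reachable("198.51.100.0/24")
pending_withdraws = len(busy.queues["withdraw"])
```

Whether a real implementation behaves like case 2 depends on how it schedules its queues; the sketch only shows that strict priority with no fairness bound allows it.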