On Tue, Feb 11, 2020 at 12:33 AM Lukas Tribus <lists@ltri.eu> wrote:
Therefore, if being down for several minutes is not OK, you should invest in dual links to your transits, and connect those to two different routers. If possible, get a guarantee that the transits use two routers at their end, that divergent fiber paths are used, etc.
That is not my experience *at all*. I have always seen my prefixes converge in a couple of seconds upstream (vs. two different Tier 1s).
This is a bit old but probably still true:

https://labs.ripe.net/Members/vastur/the-shape-of-a-bgp-update

Quote: "To conclude, we observe that BGP route updates tend to converge globally in just a few minutes. The propagation of newly announced prefixes happens almost instantaneously, reaching 50% visibility in just under 10 seconds, revealing a highly responsive global system. Prefix withdrawals take longer to converge and generate nearly 4 times more BGP traffic, with the visibility dropping below 10% only after approximately 2 minutes".

Unfortunately they did not test the case of a withdrawal from one router while the prefix is still active at another.
When I saw *minutes* of brownouts in connectivity, it was always because of ingress prefix convergence (or the lack thereof, due to slow FIB programming, then temporary internal routing loops, nasty things like that), but never external.
That is also a significant problem. In the case of a single transit connection per router, with two routers and two providers, there will be a lot of internal convergence between your two routers when a link fails. That is also avoided by giving both routers connections to the same providers. That way a router may still have to invalidate many routes, but there will be no loops, and the router already has loop-free alternatives loaded into memory (to the other provider). Plus you can use the simple trick of having a default route as a fallback.

Regards

Baldur
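
To illustrate the default-route fallback mentioned above, here is a minimal sketch in Python. It is not router code: the table is a plain dict, the next-hop names "transit-A" and "transit-B" are made up, and the prefixes are documentation ranges. It only shows why a longest-prefix-match lookup keeps forwarding via the default route once the more specific routes have been invalidated.

```python
import ipaddress


def lookup(fib, dst):
    """Longest-prefix match: return the next hop of the most specific route covering dst."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, next_hop in fib.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None


# Hypothetical forwarding table: specifics learned via transit A,
# plus a static default route pointing at transit B.
fib = {
    "0.0.0.0/0": "transit-B",
    "192.0.2.0/24": "transit-A",
    "198.51.100.0/24": "transit-A",
}

before = lookup(fib, "192.0.2.10")

# The link to transit A fails: every route learned over it is invalidated.
fib = {prefix: nh for prefix, nh in fib.items() if nh != "transit-A"}

after = lookup(fib, "192.0.2.10")
print(before, "->", after)
```

In this toy model the destination is still reachable right after the specifics are removed, because the lookup falls through to 0.0.0.0/0. On a real router the same idea would be a static or provider-originated default, and how quickly the specifics are removed still depends on FIB programming speed.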