Hi,
I’d like to take a poll: do you folks tend to treat your transit/peering connections (BGP sessions in particular) as pets or as cattle?
And I appreciate the answer could differ for transit vs peering connections.
However, I’d like to ask this question through the lens of redundant vs non-redundant Internet edge devices.
To explain,
- The “pet” case:
Would you rather try to reduce the failure rate of your transit/peering connections by using a redundant control plane (dual REs/RSPs/RPs), or even by building these connections as link bundles across separate cards and optical modules?
Is this on the basis that, no matter how hard you try on your end (e.g. distributing your traffic across a multitude of transit and peering connections, or using BFD or even BGP PIC Edge to shuffle things around fast), any disruption to the eBGP session itself will still hurt you in some way (i.e. at least a partial outage for some proportion of the traffic for a not insignificant period of time) until things converge in the direction from the Internet back to you?
- The “cattle” case:
Or would you instead rely on small-ish non-redundant HW at your Internet edge rather than trying to enhance MTBF with a big chassis full of redundant HW?
Is this because the MTBF figure for a particular transit/peering eBGP session eventually boils down to the MTBF of the single card, or even the single optical module, hosting the link (and as for creating bundles over separate cards, well, you can never be quite sure what the setup looks like on the other end of that connection)?
Or is it because the effect of a smaller/non-resilient border edge device failing is not that bad in your particular (maybe horizontally scaled) setup?
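To make the "boils down to the single card or optic" point concrete, here is a rough back-of-the-envelope sketch. Every MTBF/MTTR figure in it is a made-up assumption for illustration, not vendor data. It also assumes independent failures and a perfect RE switchover, neither of which holds exactly in practice.

```python
# Rough availability arithmetic for one eBGP session.
# All MTBF/MTTR figures are hypothetical placeholders, not vendor data.

def availability(mtbf_hours, mttr_hours):
    """Steady-state availability of a single component."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

def series(*avail):
    """Every component must work (the path one session takes through a box)."""
    result = 1.0
    for a in avail:
        result *= a
    return result

def parallel(*avail):
    """At least one component must work (e.g. redundant REs).
    Assumes independent failures and a perfect switchover."""
    all_fail = 1.0
    for a in avail:
        all_fail *= (1.0 - a)
    return 1.0 - all_fail

MTTR = 4.0  # hours to replace a failed part (assumed)
re_card = availability(200_000, MTTR)    # routing engine (assumed MTBF)
line_card = availability(150_000, MTTR)  # line card (assumed MTBF)
optic = availability(500_000, MTTR)      # optical module (assumed MTBF)

single_re = series(re_card, line_card, optic)
dual_re = series(parallel(re_card, re_card), line_card, optic)
floor = series(line_card, optic)  # best case, however redundant the control plane

for name, a in [("single RE", single_re), ("dual RE", dual_re), ("card+optic floor", floor)]:
    print(f"{name:18s} {a:.7f}  (~{(1 - a) * 8760 * 60:.1f} min/year down)")
```

The dual-RE figure can approach, but never exceed, the line card plus optic floor, which is the crux of the "cattle" argument as far as a single session is concerned.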
I would appreciate any pointers.
Thank you
adam