On Sat, Jun 15, 2019 at 09:31:03AM -0400, Jon Lewis wrote:
On Sat, 15 Jun 2019, Job Snijders wrote:
There is no signal from the remote ASN (the one that receives the route announcement) to the originator ASN about the remote ASN's loop detection policies. Therefore, you can't know ahead of time what the remote side will do. The only recourse left at that point is active probing (trial & error). Trial and error, where the 'error' state may be a hard outage, means that the method is unreliable.
How does as-path poisoning failing (i.e. the AS you wanted to ignore a route accepts it) cause a hard outage?
Formatting warning, what follows is an ASCII art diagram:

      |  "the rest"   |
      +---------------+
       |              |
   +------+        +------+
   |      |        |      |
+--+ 2914 +--------+ 7018 |
|  |      |        |      |
|  +--+---+        +---+--+
|     |                |
|     |    +-----+     |
|     |    |     |     |
|     +----+ NSP +-----+
|          |     |
|          +---+-+
|              |
|          +---+---+
|          |       |
+----------+ ISP A |
           |       |
           +-------+

In the above the ISP called "ISP A" is multihomed to "NTT" (2914) and an entity called "NSP"; the NSP is multihomed to both NTT and AT&T (7018). I attempted to make this a realistic scenario.

In the above situation the ISP A entity might want to force certain traffic over the NTT link, and instead of using BGP communities they use BGP AS_PATH poisoning. The moment they mangle the AS_PATH on their announcement and insert 2914 in their announcement towards NSP, the following can happen:

When ISP A wants to poison the path, ISP A may expect the following paths to be visible from the AT&T and NTT routers:

    AS_PATH                        | footnotes
    7018_NSP_ISPA_2914_ISPA        | 1
    2914_7018_NSP_ISPA_2914_ISPA   | 1
    7018_2914_NSP_ISPA_2914_ISPA   | 2
    2914_NSP_ISPA_2914_ISPA        | 2
    NSP_ISPA_2914_ISPA             | 3
    7018_2914_ISPA                 | 4
    2914_ISPA                      | 4

footnotes:
1) rejected on AT&T routers due to peerlock (2914 is seen in the AS_PATH)
2) rejected by NTT routers due to as-path loop detection, thus never propagated to AT&T. Neither NTT nor AT&T will ever use this path.
3) potentially rejected by NSP due to presence of an upstream ASN in AS_PATH, thus neither NTT nor AT&T will ever use this path.
4) accepted by both AT&T and NTT. Note that this effectively is ISP A single homing.

In both scenarios it was ISP A's goal to receive less traffic over the NSP-ISP A link, and the moment they deployed this policy, they'll think it was successful, because traffic comes in via the NTT-ISP A link.

Now imagine (weeks after doing the AS_PATH poisoning), the link between 2914-ISP A is taken down (maintenance, outage, or whatever) - at that moment ISP A will discover that their AS_PATH mangling resulted in a hard outage.
There was no switching over to the paths via NSP. In fact, the NSP may not even have accepted the routes in the first place, because many NSPs reject their upstream's ASNs when seen in routes received from their downstreams.

In this thread, there are some hints of anecdata about when this trick works 'as intended', but what I'm trying to point out is that there is no shortage of examples where it leads to a problematic situation.

In this thread we seem to have some unclarity about what 'reliable' or 'unreliable' means. AT&T will never proactively notify ISP A about changes to their AS_PATH filters, so what works today may be entirely broken tomorrow.

I'm not disputing that AS_PATH poisoning can be used to accomplish traffic engineering objectives, but it is similar to relying on Linux operating system 0-days to obtain root access on a server. Sure, sometimes it may work, but I hope we can agree that for business purposes exploits are not a reliable or recommended way to achieve your goals.
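To make the failure mode above concrete, here is a minimal Python sketch of the example topology. It is a toy model under stated assumptions, not how any real router behaves: every AS re-advertises every path it accepts (real BGP only advertises its best path), there are no customer/peer/transit export policies, and the `peerlock_7018` filter is a hypothetical, much simplified stand-in for what AT&T might do. All function and variable names are made up for illustration.

```python
from collections import deque

ISPA_NTT = ("ISPA", "2914")
# Links from the diagram; "the rest" is left out of this toy model.
LINKS = {ISPA_NTT, ("ISPA", "NSP"), ("NSP", "2914"),
         ("NSP", "7018"), ("2914", "7018")}

# What ISP A sends to each neighbor: plain to NTT, poisoned to NSP.
POISONED = {"2914": ("ISPA",), "NSP": ("ISPA", "2914", "ISPA")}
PLAIN = {"2914": ("ISPA",), "NSP": ("ISPA",)}


def neighbors(asn, links):
    """Yield every AS directly connected to `asn`."""
    for a, b in links:
        if a == asn:
            yield b
        elif b == asn:
            yield a


def peerlock_7018(receiver, path):
    """Assumed, simplified peerlock: 7018 accepts a path containing
    2914 only when it is received directly from 2914."""
    if receiver == "7018" and "2914" in path:
        return path[0] == "2914"
    return True


def accepts(receiver, path, filters):
    """Default loop detection first, then any extra filters."""
    if receiver in path:
        return False  # own ASN in AS_PATH: rejected, not deprioritized
    return all(f(receiver, path) for f in filters)


def propagate(links, announcements, filters=()):
    """Return {asn: set of accepted AS_PATHs towards ISP A}."""
    accepted = {}
    queue = deque((nbr, path) for nbr, path in announcements.items()
                  if nbr in set(neighbors("ISPA", links)))
    while queue:
        receiver, path = queue.popleft()
        if not accepts(receiver, path, filters):
            continue
        if path in accepted.setdefault(receiver, set()):
            continue  # already known, do not re-advertise
        accepted[receiver].add(path)
        for nbr in neighbors(receiver, links):
            queue.append((nbr, (receiver,) + path))
    return accepted


if __name__ == "__main__":
    link_down = LINKS - {ISPA_NTT}  # the 2914-ISP A link fails
    for name, ann in (("poisoned", POISONED), ("plain", PLAIN)):
        routes = propagate(link_down, ann, filters=(peerlock_7018,))
        print(name, "-> ASes with a route to ISP A:", sorted(routes))
```

In this model, with the 2914-ISP A link down and the poisoned announcement in place, only NSP itself still holds a route to ISP A; 2914 drops the path through loop detection and 7018 drops it through the assumed peerlock filter. With a plain announcement all three networks keep a route via NSP. If NSP additionally filtered its upstreams' ASNs from downstream routes (footnote 3), nobody would hold a route at all.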
When used for TE, a failure just means a route/path you wanted some remote network to ignore is not ignored and might be used, i.e. your TE may not work as desired, but the packets will still get to you, just not necessarily via the path you wanted them to take.
It depends! Keep in mind that traffic engineering based on AS_PATH mangling is exploiting a property of the default as-path loop detection behaviour in BGP implementations. This means we're relying on a second order effect.

The loop detection exists to stop the propagation of loops. This means that such paths won't even be considered at a lower priority; they are simply rejected. The moment paths are rejected at any point in the BGP graph, you risk unreachability.

I hope this helped clarify a bit.

Kind regards,

Job