On Fri, Jun 12, 2020 at 10:22 PM David Sinn <dsinn@dsinn.com> wrote:
Except that is actually the problem if you look at it in hardware. And to be very specific, I'm talking about commodity hardware, not flexible pipelines like you find in the MX and a number of the ASRs. I'm also talking about the more recent approach of using Clos in PoPs instead of "big iron" or chassis-based systems.

TE gives you the most powerful traffic engineering tool kit available. Naturally it has a bit more weight than just a single screwdriver. It lets you build nearly any kind of multipath transport, while that Clos thing is just one architecture hunting for the cheapest implementation of IP/LDP-style ECMP.

On those boxes, it's actually better to not do shared labels, as this pushes the ECMP decision to the ingress node. That does mean you have to enumerate every possible path (or some approximation) through the network; however, the action on the commodity gear is greatly reduced. It's a pure label swap, so you don't run into any egress next-hop problems. You definitely do on the ingress nodes. Very, very badly actually.
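
To put a rough number on "enumerate every possible path", here is a minimal sketch in Python. It assumes a full mesh between adjacent tiers, which a real Clos fabric may not have, and the function name and node names are made up for illustration.

```python
from itertools import product

def enumerate_paths(ingress, stages, egress):
    """Return every explicit path: ingress -> one node per stage -> egress.

    Assumption: every node in a stage can reach every node in the next
    stage (full mesh between tiers). Real fabrics may be more constrained.
    """
    return [(ingress, *hops, egress) for hops in product(*stages)]

# Hypothetical fabric: 4 spines in the first tier, 2 nodes in the second.
stages = [["s1", "s2", "s3", "s4"], ["c1", "c2"]]
paths = enumerate_paths("leaf-a", stages, "leaf-b")

# The ingress node needs one entry per path, so its state grows as the
# product of the tier widths (here 4 * 2), per destination.
print(len(paths))
```

The transit boxes only ever see a swap per label, but the ingress node carries the whole product per destination, which is where the scaling pain described above shows up.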

Actually shared links are not a swap but just a pop, similar to SR. But indeed this would just shift your ECMP issue to the headend. So for your ECMP scaling there would still be an option left: use an implementation which offers you a merge-point with a single label to all upstreams for a certain equal-cost multipath downstream. This does exist, so it would certainly fix your ECMP scaling problem. But advanced control-plane code is certainly not cheap, so in the end, as was already said, if a simple and cheap platform can solve all your needs then it might be the better one. Let's see what problems we need to solve in five years again.

What I'm getting at is that IP allows re-write sharing, in that what needs to change on two IP frames taking the same path but ultimately reaching different destinations is re-written (e.g. DMAC, egress-port) identically. And, at least with IPIP, you are able to look at the inner frame for ECMP calculations. Depending on your MPLS design, that may not be the case. If your label stack is too deep (3-5 labels, depending on the ASIC), you can't look at the payload and you end up with polarization.
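
The polarization effect can be sketched with a toy hash in Python. The parse-depth limit of 3, the use of CRC32, and the function name are all assumptions for illustration; actual ASICs use their own hash functions and field selection.

```python
import zlib

def ecmp_member(labels, inner_flow, n_links, max_parse_depth=3):
    """Pick an ECMP member index for one packet.

    Assumption: the parser can read at most max_parse_depth labels. If the
    stack is deeper than that, it never reaches the payload, so the inner
    flow fields contribute no entropy to the hash.
    """
    fields = list(labels[:max_parse_depth])
    if len(labels) <= max_parse_depth:
        fields.append(inner_flow)  # payload is visible to the parser
    key = "|".join(map(str, fields)).encode()
    return zlib.crc32(key) % n_links

# 64 hypothetical flows that differ only in the inner 5-tuple.
flows = [("10.0.0.%d" % i, "10.1.0.1", 6, 1024 + i, 443) for i in range(64)]

shallow = {ecmp_member([100, 200], f, 4) for f in flows}
deep = {ecmp_member([100, 200, 300, 400], f, 4) for f in flows}

# With the deep stack every flow hashes on the same three labels, so all
# of them land on a single member link.
print(len(deep))
```

With the two-label stack the inner flow feeds the hash and the flows can spread across the members; with the four-label stack they all collapse onto one link.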

Not really, as you are still forced to rewrite on imposition for the simplest form of tunneling, and for TE as often as you need to go against your SPT as well; it's just happening on IP (and IP rewrites are more expensive than MPLS rewrites / forwarding operations).