On 9/9/23 22:29, Dave Cohen wrote:
At a previous $dayjob at a Tier 1, we would only support LAG for a customer L2/3 service if the ports were on the same card. The response we gave if customers pushed back was "we don't consider LAG a form of circuit protection, so we're not going to consider physical resiliency in the design", which was true, because we didn't, but it was beside the point. The real reason was that getting our switching/routing platform to actually run traffic symmetrically across a LAG, which most end users considered expected behavior in a LAG, required a reconfiguration of the default hash, which effectively meant that [switching/routing vendor]'s TAC wouldn't help when something invariably went wrong. So it wasn't that it wouldn't work (my recollection at least is that everything ran fine in lab environments) but we didn't trust the hardware vendor support.
We've had the odd bug here and there with LAGs for things like VRRP, BFD, etc. But we have not run into that specific issue before on ASR1000s, ASR9000s, CRS-Xs and MX. 98% of our network is Juniper nowadays, but even when we ran Cisco and had LAGs across multiple line cards, we didn't see this problem.

The only hashing issue we had with LAGs was when we tried to carry Layer 2 traffic across them in the core. But this was just a limitation of the CRS-X, and it also happened on member links of a LAG that shared the same line card.

Mark.