
Hey LJ,
* the packet itself, often parsed into at least:
  * IP source / dest
  * layer4 source / dest
  * TOS
  * Entropy / flow labels
* ALL of the metadata that might be used to make ANY of the hashing decisions...
* at least some systems use the input interface as an input into the hash.
* some use TOS, some don’t.
* some have different hash generator algorithms, so you have to know which one
* usually there’s some additional hash seed for more entropy (such as the router-ID) – you have to know if this is in play and if so what it is.
* you need to know exactly which NPU(s) are in the forwarding path, because there’s no guarantee that they use the same algorithms.
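To make the list above concrete, here is a minimal sketch of how the choice of hash inputs and seed feeds into an ECMP next-hop decision. Everything in it is an assumption for illustration: `FlowKey`, `ecmp_pick`, the `use_tos` / `use_in_if` switches and the use of CRC32 are hypothetical and do not correspond to any vendor's actual algorithm.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowKey:
    """Hypothetical set of fields a forwarding engine might hash on."""
    src_ip: str
    dst_ip: str
    proto: int
    src_port: int
    dst_port: int
    tos: int = 0
    in_ifindex: int = 0


def ecmp_pick(key: FlowKey, n_paths: int, seed: int = 0,
              use_tos: bool = False, use_in_if: bool = False) -> int:
    """Return the index of the chosen path among n_paths.

    CRC32 is only a stand-in here; real hardware may use a different
    function, different field ordering, and a per-box seed.
    """
    fields = [key.src_ip, key.dst_ip, key.proto, key.src_port, key.dst_port]
    if use_tos:
        fields.append(key.tos)
    if use_in_if:
        fields.append(key.in_ifindex)
    data = "|".join(str(f) for f in fields).encode()
    # The seed is used as the CRC starting value (e.g. derived from a router-ID).
    return zlib.crc32(data, seed) % n_paths


if __name__ == "__main__":
    k = FlowKey("192.0.2.1", "198.51.100.7", 6, 49152, 443, tos=184, in_ifindex=3)
    for seed in (0, 1, 2):
        print(seed,
              ecmp_pick(k, 4, seed=seed),
              ecmp_pick(k, 4, seed=seed, use_tos=True, use_in_if=True))
```

The point of the sketch is only that a prediction made with the wrong seed, the wrong field set, or the wrong function still returns a perfectly plausible path index.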
Ingress interface is also a common hash key. Also, for tunneling (MPLS, GRE, IPIP, GTP) you may look at the bottom (innermost) headers as well. And in ICMP error packets, like those used by PMTUD etc., you should actually hash on the embedded packet, not the actual headers. This is rarely if ever implemented (despite being relatively simple to implement), which breaks PMTUD in ECMP cases and causes customers to implement weird workarounds (https://blog.cloudflare.com/path-mtu-discovery-in-practice/).

Anyhow, if you have to know which NPU you are using, you misunderstood the assignment. Such an implementation will work once, when it gets written, and over time it will go wrong, because different people maintain the EZchip/LS and the RE hash-code command; it is guaranteed to feed bad information to the user. This is basically where Cisco is today: there is code (cef exact-route), but it doesn't talk to HW, and it gives results which people use but which are not correct.

I know that Juniper MX (not PTX) injects the packet into the HW lookup engine, runs the normal ucode and yoinks the answer. So it will be correct, and no one has to maintain it. I understood from other contributors that this is how the Arista implementation works too, but it also appears to have platform gaps.

Of course, even if this is implemented correctly, for the points you make there is an extremely large risk that users simply do not give the right set of keys; they'll still get results and again end up confidently working with bad data. For these reasons RFC5837 is so much better: the far-end system simply tells you where it received the frame, removing all guesswork and fragility. So it might be best that the standard case is that users use RFC5837 to glean this information, and the 'exact-route' solution on the NOS is the exception case, for when you simply do not have the ability to generate those packets for real right now.
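As a rough illustration of the ICMP point above, the sketch below derives the hash key for an ICMP error from the packet embedded in it rather than from the outer header. It is IPv4-only, does no validation, and all names (`hash_key`, `build_ipv4`, `ICMP_ERROR_TYPES`) are hypothetical. Swapping source and destination of the embedded packet is an assumption of this sketch: the embedded packet is the one the local host sent, so reversing it yields the same key as the flow's inbound direction.

```python
import socket
import struct

# ICMP error types carrying an embedded packet that this sketch handles:
# 3 = destination unreachable (incl. frag needed), 11 = time exceeded,
# 12 = parameter problem. The choice of set is illustrative.
ICMP_ERROR_TYPES = {3, 11, 12}


def build_ipv4(src: str, dst: str, proto: int, payload: bytes) -> bytes:
    """Build a minimal IPv4 packet (no options, checksum left at 0)."""
    header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + len(payload), 0, 0,
                         64, proto, 0,
                         socket.inet_aton(src), socket.inet_aton(dst))
    return header + payload


def _parse_ipv4(pkt: bytes):
    """Return (src, dst, proto, l4_bytes); no validation is done."""
    ihl = (pkt[0] & 0x0F) * 4
    src = socket.inet_ntoa(pkt[12:16])
    dst = socket.inet_ntoa(pkt[16:20])
    return src, dst, pkt[9], pkt[ihl:]


def _ports(proto: int, l4: bytes):
    """Return (sport, dport) for TCP/UDP, else (0, 0)."""
    if proto in (6, 17) and len(l4) >= 4:
        return struct.unpack("!HH", l4[:4])
    return (0, 0)


def hash_key(pkt: bytes):
    """Return the 5-tuple to hash on.

    For ICMP errors, use the embedded packet with src/dst swapped, so the
    error follows the same ECMP path as the flow it refers to.
    """
    src, dst, proto, l4 = _parse_ipv4(pkt)
    if proto == 1 and len(l4) >= 8 + 20 and l4[0] in ICMP_ERROR_TYPES:
        e_src, e_dst, e_proto, e_l4 = _parse_ipv4(l4[8:])
        e_sport, e_dport = _ports(e_proto, e_l4)
        return (e_dst, e_src, e_proto, e_dport, e_sport)
    sport, dport = _ports(proto, l4)
    return (src, dst, proto, sport, dport)


if __name__ == "__main__":
    client, server = "203.0.113.5", "192.0.2.10"
    # Client -> server TCP segment.
    fwd = build_ipv4(client, server, 6, struct.pack("!HH", 40000, 443) + b"\x00" * 16)
    # Server -> client packet that was too big, as embedded in the ICMP error.
    orig = build_ipv4(server, client, 6, struct.pack("!HHI", 443, 40000, 0))
    # ICMP type 3 code 4 (fragmentation needed), next-hop MTU 1400, from a transit router.
    icmp = build_ipv4("198.51.100.1", server, 1,
                      struct.pack("!BBHHH", 3, 4, 0, 0, 1400) + orig)
    print(hash_key(fwd))
    print(hash_key(icmp))
```

With this key, the ICMP error from the transit router and the client's own packets produce the same tuple, so any hash over it picks the same ECMP member for both.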
So. Anyway. In my newfound role as head apologist for people who build big systems... the main reason that these commands don’t exist on most systems is not because we don’t know how to implement them, and not because we don’t see the value in implementing them. It’s because the cost to implement (and maintain!!!) them is actually really high, and people have decided (with their wallets) that they want other things more than they want this.
If this were true, you would have implemented RFC5837. The real reason why things are not implemented is that no one dangled a fat RFQ gated on the request. This is how features get implemented, even when they are absolutely stupid features which should not be implemented, and customers should instead be educated about why what they ask for introduces fragility that cannot be justified because superior options already exist. But doing things the right way and having a good business case do not always go hand in hand. These absolutely stupid features increase technical debt and cause fragility for all users, but of course they help win that RFQ, so they get implemented.

--
  ++ytti