On Sep 29, 2023, at 13:43, William Herrin <bill@herrin.us> wrote:
On Thu, Sep 28, 2023 at 10:29 PM Saku Ytti <saku@ytti.fi> wrote:
On Fri, 29 Sept 2023 at 08:24, William Herrin <bill@herrin.us> wrote:
Maybe. That's where my comment about CPU cache starvation comes into play. I haven't delved into the Juniper line cards recently so I could easily be wrong, but if the number of routes being actively used pushes past the CPU data cache, the cache miss rate will go way up and it'll start thrashing main memory. The net result is that the achievable PPS drops by at least an order of magnitude.
When you say you've not delved into the Juniper line cards recently, to which specific Juniper line card does your comment apply?
Howdy,
My understanding of Juniper's approach to the problem is that instead of employing TCAMs for next-hop lookup, they use general purpose CPUs operating on a radix tree, exactly as you would for an all-software router. This makes each lookup much slower than a TCAM can achieve. However, that doesn't matter much: the lookup delays are much shorter than the transmission delays so it's not noticeable to the user. To achieve an -aggregate- lookup speed comparable to a TCAM, they implement a bunch of these lookup engines as dedicated parallel subprocessors rather than using the router's primary compute engine.
In their lower-end hardware, yes. The MX uses ASICs traversing the tree, if I understood the explanation correctly, but there’s essentially a copy of the compiled/condensed tree on every line card and many of the line cards have more than one PFE per line card.
A TCAM lookup is approximately O(1) while a radix tree lookup is approximately O(log n). (Neither description is strictly correct but it's close enough to understand the running time.) Log n is pretty small so it doesn't take much parallelism for the practical run time to catch up to the TCAM.
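For readers who want to see the tree walk concretely, here is a minimal sketch of longest-prefix match on a plain, uncompressed binary trie. It is illustrative only: it does not represent Juniper's actual data structure or any vendor's implementation, it handles IPv4 only, and the names `PrefixTrie`, `insert`, `lookup`, and the sample routes are hypothetical. Each lookup visits at most one node per address bit, so the work is bounded by the address length rather than by the number of routes, which is the sense in which the per-lookup cost stays small.

```python
import ipaddress


class Node:
    """One trie node: two children (bit 0 / bit 1) and an optional next hop."""
    __slots__ = ("children", "next_hop")

    def __init__(self):
        self.children = [None, None]
        self.next_hop = None


class PrefixTrie:
    """Uncompressed binary trie for IPv4 longest-prefix match (illustrative)."""

    def __init__(self):
        self.root = Node()

    def insert(self, prefix, next_hop):
        net = ipaddress.ip_network(prefix)
        bits = int(net.network_address)
        node = self.root
        # Walk (and create) one node per prefix bit, most significant bit first.
        for i in range(net.prefixlen):
            bit = (bits >> (31 - i)) & 1
            if node.children[bit] is None:
                node.children[bit] = Node()
            node = node.children[bit]
        node.next_hop = next_hop

    def lookup(self, address):
        """Return (next_hop, nodes_visited) for the longest matching prefix."""
        bits = int(ipaddress.ip_address(address))
        node = self.root
        best = node.next_hop  # a default route, if one was inserted
        steps = 0
        for i in range(32):
            node = node.children[(bits >> (31 - i)) & 1]
            if node is None:
                break
            steps += 1
            if node.next_hop is not None:
                best = node.next_hop  # remember the deepest match so far
        return best, steps


# Hypothetical routing table for demonstration.
fib = PrefixTrie()
fib.insert("0.0.0.0/0", "default")
fib.insert("10.0.0.0/8", "A")
fib.insert("10.1.0.0/16", "B")

for addr in ("10.1.2.3", "10.2.0.0", "192.0.2.1"):
    print(addr, fib.lookup(addr))
```

A production radix tree would compress single-child chains and typically consume several bits per step, which shortens the walk further. Every step is still a dependent memory read, which is why where the tree lives (CPU cache, main memory, or RAM attached to an ASIC) matters for the achievable packet rate.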
The only difference, I believe, is that it’s ASICs against RAM rather than CPUs against CPU Cache, but otherwise, yes, you’ve got the correct general idea. However, since you brought CPU Cache into the discussion, that difference seemed worthy of addressing.

Owen