Once upon a time, James Bensley <jwbensley+nanog@gmail.com> said:
> The obvious answer is that it's not magic and my understanding is fundamentally flawed, so please enlighten me.
So I can't answer your specific question, but I just wanted to say that your CPU analysis is simplistic and doesn't really match how modern CPUs work. Something can be "line rate" but not push the first packet through in the shortest time.

CPUs break operations down into a series of very small steps and then run those steps through a pipeline, with different parts of the CPU working on the micro-operations of different overall operations at the same time. The first result out of the pipeline (the packet destination calculation, in this case) may take more time, but after that you keep getting a result every cycle or every few cycles. For example, it might take 4 times as long to process the first packet, but as long as the hardware can hold 4 packets in a queue, you'll get a packet result every cycle after that without dropping anything. So maybe the first result takes 12 cycles, but then you keep getting a result every 3 cycles as long as the pipeline is kept full.

This type of pipelined+superscalar processing was a big deal with Cray supercomputers, but made its way down to PC-level hardware with the Pentium Pro. It has issues (see all the Spectre and Retbleed CPU flaws around branch prediction, for example), but in general it lets a CPU handle a chain of operations faster than it could handle each operation individually.

-- 
Chris Adams <cma@cmadams.net>
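As a rough sketch of that arithmetic, here is a toy C model using the illustrative 12-cycle fill / 3-cycle issue numbers from above (they are not figures from any real CPU, and the constant names are made up for illustration): the first lookup pays the full pipeline latency, and every later one only the issue interval, so the per-packet cost converges on the issue interval rather than the full latency.

/*
 * Toy model: pipelined vs. unpipelined packet lookups.
 * The 12-cycle fill latency and 3-cycle issue interval are the
 * illustrative numbers from the post, not real hardware values.
 */
#include <stdio.h>

#define PIPELINE_FILL_CYCLES 12UL /* latency until the first result */
#define ISSUE_INTERVAL        3UL /* one result every N cycles once full */

int main(void)
{
    unsigned long packets = 1000000UL;

    /* Unpipelined: every packet pays the full latency. */
    unsigned long serial_cycles = packets * PIPELINE_FILL_CYCLES;

    /* Pipelined: the first packet pays the fill latency, then the
     * pipeline delivers one result per issue interval as long as
     * it is kept full. */
    unsigned long pipelined_cycles =
        PIPELINE_FILL_CYCLES + (packets - 1) * ISSUE_INTERVAL;

    printf("serial:    %lu cycles (%.2f cycles/packet)\n",
           serial_cycles, (double)serial_cycles / packets);
    printf("pipelined: %lu cycles (%.2f cycles/packet)\n",
           pipelined_cycles, (double)pipelined_cycles / packets);
    return 0;
}

With a million packets the pipelined model works out to just over 3 cycles per packet, versus the full 12 cycles per packet if each lookup had to wait for the previous one to finish.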