The problem with asking whether this can be done "at line rate" on a specific switch platform is that it ignores these critical measurements:

- What's the expected packet rate for the NAT flows?

- Will the control plane add a forwarding-plane rule for every new session? If so, how quickly can that rule be pushed to the ASIC, and how many can be pushed per second? I think you'll quickly find that the new-session rate limits throughput well before the bandwidth of the involved ports does. An architecture that supports a high session setup rate would be far more expensive. Maybe this was the case on the platforms Joel noted, and I believe it is on modern "hardware-based" firewalls like the higher-end SRX and some Fortinet boxes.

- If not with the architecture above, then every packet needs to be punted to the CPU. What's the bandwidth between the ASIC and the CPU? Remember that the CPU is making the decisions per flow, and the control plane usually has only 1G to the ASIC; 10G is sometimes seen and probably increasingly common. (A back-of-envelope sketch follows below.)

For these reasons I doubt the 7150s in the original email can dynamically NAT at line rate.

PZ
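To make the punt-path concern concrete, here's a back-of-envelope sketch in Python; the link speeds and the 500-byte average packet size are illustrative assumptions, not measurements of any particular platform:

    # Punt-path ceiling: how many packets/s fit on the ASIC-to-CPU link,
    # versus what a single front-panel port can deliver.
    PUNT_LINK_BPS = 1e9        # assumed 1G control-plane link to the ASIC
    PORT_BPS = 10e9            # one 10G front-panel port
    AVG_PKT_BITS = 500 * 8     # assumed 500-byte average packet

    punt_pps = PUNT_LINK_BPS / AVG_PKT_BITS   # ~250,000 packets/s
    port_pps = PORT_BPS / AVG_PKT_BITS        # ~2,500,000 packets/s
    print(f"punt path covers {punt_pps / port_pps:.0%} of one 10G port")

In other words, under those assumptions a 1G punt path saturates at roughly a tenth of a single 10G port, before the CPU has done any flow processing at all.

On Tue, Oct 16, 2018 at 9:25 AM joel jaeggli <joelja@bogus.com> wrote: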
On 10/16/18 08:55, Brandon Martin wrote:
On 10/16/18 10:05 AM, James Bensley wrote:
NAT/PAT is an N:1 swapping (mapping) though, so a state/translation table is required to correctly "swap" back the return traffic. MPLS, for example, is a 1:1 mapping/action. NAT/PAT state tables tend to fill quickly, so to aid with this we also have timers to time out translations and free up space in the table, and we also track e.g. TCP RST or TCP FIN to remove entries, so it's not "just swapping".
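As a rough illustration of that bookkeeping, here's a minimal Python sketch of a translation table with idle timers and TCP FIN/RST teardown; the structure, names, and the 300-second timeout are illustrative assumptions, not any vendor's implementation:

    import time

    IDLE_TIMEOUT = 300  # assumed idle timeout in seconds

    class TranslationTable:
        def __init__(self):
            # (inside_ip, inside_port, proto) -> (outside_port, last_seen)
            self.entries = {}

        def add(self, key, outside_port):
            self.entries[key] = (outside_port, time.monotonic())

        def lookup(self, key):
            # a hit refreshes the idle timer; a miss returns None
            if key not in self.entries:
                return None
            outside_port, _ = self.entries[key]
            self.entries[key] = (outside_port, time.monotonic())
            return outside_port

        def on_tcp_fin_or_rst(self, key):
            # session ended: free the entry immediately
            self.entries.pop(key, None)

        def expire_idle(self):
            # periodic sweep frees entries that have timed out
            now = time.monotonic()
            for key in [k for k, (_, seen) in self.entries.items()
                        if now - seen > IDLE_TIMEOUT]:
                del self.entries[key]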
I do wonder, though, if these popular switching ASICs are flexible enough in terms of their header matching and manipulation capabilities to handle packet mangling and forwarding in hardware for a given NAT state entry, while punting anything that requires a state change to a CPU for inspection and state update.
You'd need a somewhat more powerful CPU than your typical L3 switch might have, but it seems like you'd still be able to offload the vast majority of the actual packet processing to hardware.
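A minimal sketch of that split, with a dict standing in for the ASIC's exact-match table and all names hypothetical:

    # Fast path: packets matching an installed translation are rewritten
    # without touching the CPU. Slow path: state-changing packets
    # (SYN/FIN/RST) go to the CPU, which installs or removes hardware state.
    STATE_CHANGING = {"SYN", "FIN", "RST"}

    def handle_packet(pkt, hw_entries):
        """pkt is a dict with a 'flow' tuple and a 'flags' set."""
        entry = hw_entries.get(pkt["flow"])
        if entry is not None and not (pkt["flags"] & STATE_CHANGING):
            return ("rewrite-in-hardware", entry)        # fast path
        return cpu_slow_path(pkt, hw_entries)            # punt

    def cpu_slow_path(pkt, hw_entries):
        if "SYN" in pkt["flags"]:
            hw_entries[pkt["flow"]] = "new-translation"  # push entry to ASIC
            return ("install", pkt["flow"])
        if pkt["flags"] & {"FIN", "RST"}:
            hw_entries.pop(pkt["flow"], None)            # tear entry down
            return ("remove", pkt["flow"])
        return ("drop", None)

The limiting factor is then exactly what the reply at the top raises: how fast the slow path can run and push its results down to the ASIC.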
This is a flow-cached router, fundamentally. They exist. In that design you burn your FIB on flow entries rather than on next-hop routes. They tend to explode at forwarding rates far lower than a typical Ethernet switch when their ability to accumulate new state is exercised. The Riverstone RS circa 1999-2004 and various Cisco products (Sup1A Cat6k?) followed that model.
State table size (on a typical "switching" ASIC) might be an issue before you could actually fill a 10Gbps+ link with typical SP multi-user traffic flows, I guess. And given that a moderate-spec PC can keep up with 10Gbps without much issue these days, maybe it's a non-starter.
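For scale, a quick sketch of that sizing, where the ~50 kb/s average per active flow is an illustrative assumption for multi-user SP traffic:

    LINK_BPS = 10e9          # 10 Gb/s link
    AVG_FLOW_BPS = 50e3      # assumed average rate per active flow

    flows = LINK_BPS / AVG_FLOW_BPS
    print(f"~{flows:,.0f} concurrent flows to fill the link")  # ~200,000

Two hundred thousand entries is already at or beyond the exact-match table capacity of many switching ASICs, before counting the short-lived flows that come and go between timer sweeps.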