NAT on a Trident/Qumran(/or other?) equipped whitebox?
Has anyone played around with this? Curious if the BCM (or whatever other chip) can do this, and if not, if any of the box vendors have tried to find a way to get these things to do a bunch of NAT - say some flavour of NAT, line-rate @ 10G. If so, anyone know of a NOS that has support for it? OcNOS, Cumulus Linux, PicOS and Switch Light OS seem to have none, but not sure if there are others out there. Thanks!
Not sure if you count Arista as whitebox given their use of merchant silicon but running their own NOS, however they were touting the 7170 series as being able to do NAT recently. That's a Barefoot Tofino chip under the hood.
I've no idea how well it can do NAT or what the limitations are, mind you, but it was a specific selling point that they were pushing ...
Edward Dore
Freethought Internet
The older Fulcrum/Intel FM6000 in the Arista 7150 can do NAT.
-- Tim
Indeed, however there are some other features currently missing from the Arista stack that sort of take it off the table (granted, those features have been promised early-ish next year).
On Oct 9, 2018, at 11:52 AM, Edward Dore <edward.dore@freethought-internet.co.uk> wrote:
Not sure if you count Arista as whitebox given their use of merchant silicon but running their own NOS, however they were touting the 7170 series as being able to do NAT recently. That's a Barefoot Tofino chip under the hood.
I've no idea how well it can do NAT or what the limitations are mind you, but it was a specific selling point that they were pushing ...
On 10/9/18 10:35 AM, Jason Lixfeld wrote:
Has anyone played around with this? Curious if the BCM (or whatever other chip) can do this, and if not, if any of the box vendors have tried to find a way to get these things to do a bunch of NAT - say some flavour of NAT, line-rate @ 10G. If so, anyone know of a NOS that has support for it? OcNOS, Cumulus Linux, PicOS and Switch Light OS seem to have none, but not sure if there are others out there.
For 10G I would use software NAT like a firewall or CGN virtual appliance. Switch ASICs generally don't support NAT well; Tofino and maybe Jericho II can probably do it but at high cost and as you discovered the market isn't trying very hard to provide "routing" or "firewalling" functionality on "switching" ASICs.
The key to answering the question of NAT support on a Broadcom switch forwarding chip is... another question: what /flavour of NAT/ are you looking for?
Generally Trident (1,2,3), Tomahawk (1,2) and I believe Jericho all support varying degrees of swapping parts of an IP or Eth header for other parts, e.g. TTL of 249 in, TTL of 248 out; MPLS tag 500 in, MPLS tag 513 out. And, to your benefit, SRC IP of 10.1.1.1 in, SRC IP of 10.2.2.2 out. That can be handled at line rate (yes, 10G); how many of those rules you get depends on the chip. So that's perfectly fine for static NAT.
The problem is that static NAT (i.e. 1:1) isn't what I suspect most of us are looking for. PAT, or "nat overload" (i.e. your internal 10.x or 192.168.x networks reaching the internet through one or a few public IPv4 addresses), requires stateful tracking, which is not what any of those chips do. So you're dependent on whatever route engine and software is in use to supply stateful NAT / PAT, and because the requirements are higher there, you'll generally need a firewall or router (which, btw, might actually be using one of the aforementioned Broadcom switch chips for the forwarding plane!).
To achieve line rate for stateful NAT / PAT there's more than the switch chip and software in the equation, and any of those other pieces can be the limiting factor to achieving "line rate" for a set of 10G ports.
PZ
On Wed, Oct 10, 2018 at 12:20 PM Wes Felter <wmf@felter.org> wrote:
For 10G I would use software NAT like a firewall or CGN virtual appliance. Switch ASICs generally don't support NAT well; Tofino and maybe Jericho II can probably do it but at high cost and as you discovered the market isn't trying very hard to provide "routing" or "firewalling" functionality on "switching" ASICs.
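The static-versus-overload distinction Paul draws above can be made concrete with a small Python sketch. This is only an illustration under stated assumptions, not how any Broadcom chip or NOS implements it: the `Packet`, `static_nat` and `Pat` names, the documentation-range addresses and the starting port 32000 are all made up for the example, and the PAT table keys on the inside address and port only, with no timeouts or port-exhaustion handling.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


# Static 1:1 NAT: a fixed rewrite rule, so no per-flow memory is needed.
STATIC_MAP = {"10.1.1.1": "10.2.2.2"}


def static_nat(pkt: Packet) -> Packet:
    """Stateless rewrite: the same input always gives the same output."""
    return replace(pkt, src_ip=STATIC_MAP.get(pkt.src_ip, pkt.src_ip))


class Pat:
    """N:1 overload: many inside hosts share one public address.

    The two dictionaries are the state that a pure header-swap rule lacks.
    """

    def __init__(self, public_ip: str, first_port: int = 32000):
        self.public_ip = public_ip
        self.next_port = first_port  # hypothetical allocation scheme
        self.out = {}   # (inside ip, inside port) -> public port
        self.back = {}  # public port -> (inside ip, inside port)

    def outbound(self, pkt: Packet) -> Packet:
        key = (pkt.src_ip, pkt.src_port)
        if key not in self.out:
            # First packet of a flow: allocate a port and remember it.
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return replace(pkt, src_ip=self.public_ip, src_port=self.out[key])

    def inbound(self, pkt: Packet):
        # Return traffic can only be un-translated if state exists.
        inside = self.back.get(pkt.dst_port)
        if inside is None:
            return None
        return replace(pkt, dst_ip=inside[0], dst_port=inside[1])
```

Two inside hosts that both use source port 123 end up on different public ports, and the reverse lookup in `inbound` is what lets the return traffic find its way back. That lookup table is the part a fixed rewrite rule cannot provide.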
Interesting, but isn’t stateful tracking once again just swapping, but in this case port 123 in port 32123 out?
So none of the chips you named below support swapping parts of L4 header and that part is actually done with SW assistance please?
So for example the following:
https://eos.arista.com/7150s-nat-practical-guide-source-nat-dynamic/#2Dynami...
- wouldn’t be at line-rate please?
Thank you
adam
netconsultings.com
::carrier-class solutions for the telecommunications industry::
From: NANOG [mailto:nanog-bounces@nanog.org] On Behalf Of Paul Zugnoni
Sent: Thursday, October 11, 2018 6:04 AM
To: wmf@felter.org
Cc: nanog@nanog.org
Subject: Re: NAT on a Trident/Qumran(/or other?) equipped whitebox?
The key to answering the question of NAT support on a Broadcom switch forwarding chip, is... another question: What /flavour of NAT/ you're looking for.
Generally Trident (1,2,3), Tomahawk(1,2) and I believe Jericho all support varying degrees of swapping parts of an IP or Eth header for other parts - i.e. TTL of 249 in, TTL of 248 out, MPLS tag 500 in, MPLS tag 513 out. And, to your benefit, SRC IP of 10.1.1.1 in, SRC IP of 10.2.2.2 out. That can be handled at line rate (yes 10G); how many of those rules depends on the chip. So that's perfectly fine for static NAT.
Problem with static NAT (i.e. 1:1) isn't what I suspect most of us are looking for. PAT, or "nat overload" - i.e. your internal 10.x or 192.168.x networks to the internet using one or a few public IPv4's - requires stateful tracking, which is not what any of those chips do. So you're dependent on what route engine and software is in use to supply stateful NAT / PAT, and the requirement being higher there generally means you'll need a firewall or router (which, btw, might actually be using one of the aforementioned Broadcom switch chips for the forwarding plane!).
To achieve line rate for stateful NAT / PAT there's more than the switch chip and software in the equation, and can be the limiting factor to achieving "line rate" for a set of 10G ports.
PZ
On Mon, 15 Oct 2018 at 10:07, <adamv0025@netconsultings.com> wrote:
Interesting, but isn’t stateful tracking once again just swapping, but in this case port 123 in port 32123 out?
So none of the chips you named below support swapping parts of L4 header and that part is actually done with SW assistance please?
So for example the following:
https://eos.arista.com/7150s-nat-practical-guide-source-nat-dynamic/#2Dynami...
- wouldn’t be at line-rate please?
Hi Adam,
NAT/PAT is an N:1 swapping (map) though, so a state/translation table is required to correctly "swap" back the return traffic. MPLS, for example, is a 1:1 mapping/action. NAT/PAT state tables tend to fill quickly, so to aid with this we also have timers to time out the translations and free up space in the translation table, and we also track e.g. TCP RST or TCP FIN to remove entries from the table, so it's not "just swapping".
Cheers,
James.
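The bookkeeping James describes (a bounded table, idle timers, and removal on TCP FIN or RST) might look roughly like the Python sketch below. It is a simplification under assumptions, not any vendor's implementation: the `TranslationTable` class, its parameters and the string flow keys are hypothetical, and a real NAT would normally hold a closing TCP flow for a short time rather than delete it on the first FIN.

```python
class TranslationTable:
    """Bounded NAT/PAT state table with idle timeout and FIN/RST cleanup."""

    def __init__(self, idle_timeout: float = 60.0, max_entries: int = 4):
        self.idle_timeout = idle_timeout  # seconds without traffic before expiry
        self.max_entries = max_entries    # stands in for finite table memory
        self.entries = {}                 # flow key -> last-seen timestamp

    def expire(self, now: float) -> None:
        """Free entries that have been idle longer than the timeout."""
        stale = [key for key, seen in self.entries.items()
                 if now - seen > self.idle_timeout]
        for key in stale:
            del self.entries[key]

    def packet(self, key, now: float, tcp_flags: frozenset = frozenset()) -> bool:
        """Process one packet; return False if no entry could be created."""
        self.expire(now)
        if key not in self.entries and len(self.entries) >= self.max_entries:
            return False  # table full: the new flow cannot be translated
        self.entries[key] = now  # create the entry or refresh its timer
        if tcp_flags & {"FIN", "RST"}:
            # Simplification: drop the entry as soon as a teardown flag is seen.
            del self.entries[key]
        return True
```

Every packet touches the table, either to refresh a timer or to create or delete an entry, which is the difference from a fixed 1:1 rewrite rule that never changes once installed.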
On 10/16/18 10:05 AM, James Bensley wrote:
NAT/PAT is an N:1 swapping (map) though so a state/translation table is required to correctly "swap" back the return traffic. MPLS for example is 1:1 mapping/action. NAT/PAT state tables tend to fill quickly so to aid with this we also have timers to time out the translations and free up space in the translation table, and also track e.g. TCP RST or TCP FIN to remove entries from the table, so it's not "just swapping".
I do wonder, though, if these popular switching ASICs are flexible enough in terms of their header matching and manipulation capabilities to handle packet mangling and forwarding in hardware for a given NAT state entry while punting anything that requires a state change to a CPU for inspection and state update.
You'd need a somewhat more powerful CPU than your typical L3 switch might have, but it seems like you'd still be able to offload the vast majority of the actual packet processing to hardware.
State table size (on a typical "switching" ASIC) might be an issue before you could actually fill up a 10Gbps+ link with typical SP multi-user traffic flows, I guess, and given that a moderate-spec PC can keep up with 10Gbps without much issue these days, maybe it's a non-starter.
--
Brandon Martin
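The split Brandon is wondering about, with established flows rewritten in hardware and anything needing a state change punted to the CPU, can be modelled in a few lines of Python. This is a toy model only: `OffloadNat`, `hw_capacity`, the public address and the port numbering are invented for illustration, and no claim is made that any particular ASIC exposes such an interface.

```python
PUBLIC_IP = "203.0.113.1"  # documentation-range address, assumed for the example


class OffloadNat:
    """Toy model: CPU owns all NAT state, hardware caches a bounded subset."""

    def __init__(self, hw_capacity: int):
        self.hw_capacity = hw_capacity  # number of rewrite entries the ASIC can hold
        self.hw_table = {}              # flow -> mapping installed in "hardware"
        self.cpu_table = {}             # flow -> mapping, the authoritative state
        self.next_port = 32000
        self.fast_path = 0              # packets handled without CPU involvement
        self.punted = 0                 # packets sent to the CPU

    def forward(self, flow):
        hit = self.hw_table.get(flow)
        if hit is not None:
            self.fast_path += 1
            return hit
        # Hardware miss: the CPU looks up or creates the state.
        self.punted += 1
        mapping = self.cpu_table.get(flow)
        if mapping is None:
            mapping = (PUBLIC_IP, self.next_port)
            self.next_port += 1
            self.cpu_table[flow] = mapping
        # Install into hardware only if there is room left.
        if len(self.hw_table) < self.hw_capacity:
            self.hw_table[flow] = mapping
        return mapping
```

With room in the hardware table, only the first packet of each flow is punted. Once the table is full, every packet of every additional flow goes through the CPU, which is where the state-table-size concern in the message above comes from.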
On 10/16/18 08:55, Brandon Martin wrote:
I do wonder, though, if these popular switching ASICs are flexible enough in terms of their header matching and manipulation capabilities to handle packet mangling and forwarding in hardware for a given NAT state entry while punting anything that requires a state change to a CPU for inspection and state update.
You'd need a somewhat more powerful CPU than your typical L3 switch might have, but it seems like you'd still be able to offload the vast majority of the actual packet processing to hardware.
This is fundamentally a flow-cached router. They exist. In that design you burn your FIB on flow entries rather than on next-hop routes. They tend to explode at forwarding rates far lower than a typical Ethernet switch when their ability to accumulate new state is exercised. Riverstone RS circa 1999-2004 and various Cisco products (Sup 1A Cat6k?) did follow that model.
State table size (on a typical "switching" ASIC) might be an issue before you could actually fill up a 10Gbps+ link with typical SP multi-user traffic flows, I guess, and given that a moderate-spec PC can keep up with 10Gbps without much issue these days, maybe it's a non-starter.
The problem with asking whether this can be done "at line rate" in a specific switch platform is that it ignores these critical measurements:
- What's the packet rate expected for the NAT flows?
- Will the control plane add a forwarding plane rule for every new session? If so, how quickly can that rule be pushed to the ASIC? How many per second can be done? I think you'll quickly find that new session rate will be a more limiting factor on throughput than the bandwidth of the involved ports. An architecture to support that would be far more expensive. Maybe this was the case on the platforms Joel noted, and I believe on modern "hardware based" firewalls like higher-end SRX and some Fortinet.
- If not with the architecture above, then every packet needs to be punted to the CPU. What's the bandwidth between ASIC and CPU? Consider that the CPU is doing the decision making based on flows; the control plane usually has only 1G to the ASIC, though sometimes, and probably increasingly commonly, 10G.
For these reasons I doubt the 7150s in the original email can dynamically NAT at line rate.
PZ
On Tue, Oct 16, 2018 at 9:25 AM joel jaeggli <joelja@bogus.com> wrote:
This is fundamentally a flow-cached router. They exist. In that design you burn your FIB on flow entries rather than on next-hop routes. They tend to explode at forwarding rates far lower than a typical Ethernet switch when their ability to accumulate new state is exercised. Riverstone RS circa 1999-2004 and various Cisco products (Sup 1A Cat6k?) did follow that model.
State table size (on a typical "switching" ASIC) might be an issue before you could actually fill up a 10Gbps+ link with typical SP multi-user traffic flows, I guess, and given that a moderate-spec PC can keep up with 10Gbps without much issue these days, maybe it's a non-starter.
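Paul's questions about packet rate and new-session rate lend themselves to a back-of-the-envelope calculation, sketched here in Python. The 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter and inter-frame gap) is standard Ethernet; the function names and the install rate and average flow size used in the usage note are purely hypothetical inputs, not measurements of any platform.

```python
# Per-frame bytes on the wire that are not part of the frame itself:
# 7 preamble + 1 start-of-frame delimiter + 12 inter-frame gap.
ETH_OVERHEAD = 20


def line_rate_pps(link_bps: float, frame_bytes: int) -> float:
    """Packets per second needed to fill a link with frames of one size."""
    return link_bps / ((frame_bytes + ETH_OVERHEAD) * 8)


def session_limited_bps(installs_per_sec: float, avg_flow_bytes: float) -> float:
    """Throughput sustainable if every new flow needs one rule install."""
    return installs_per_sec * avg_flow_bytes * 8


def limiting_factor(link_bps: float, installs_per_sec: float, avg_flow_bytes: float):
    """Report whether rule installs or raw bandwidth caps the throughput."""
    session_bps = session_limited_bps(installs_per_sec, avg_flow_bytes)
    if session_bps < link_bps:
        return ("new-session rate", session_bps)
    return ("link bandwidth", link_bps)
```

For example, `line_rate_pps(10e9, 64)` gives roughly 14.88 million packets per second for minimum-size frames on 10G. With assumed inputs of 10,000 rule installs per second and 50 kB per average flow, `limiting_factor(10e9, 10_000, 50_000)` reports the new-session rate as the cap, at 4 Gbps, well below the port speed.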
participants (9)
- adamv0025@netconsultings.com
- Brandon Martin
- Edward Dore
- James Bensley
- Jason Lixfeld
- joel jaeggli
- Paul Zugnoni
- Tim Jackson
- Wes Felter