scaling linux-based router hardware recommendations
Hi,

I know that specially programmed ASICs on dedicated hardware from Cisco, Juniper, etc. are always going to outperform a general-purpose server running GNU/Linux, *BSD... but I find the idea of trying to use proprietary, NSA-backdoored devices difficult to accept, especially when I don't have the budget for them.

I've noticed that even with a relatively modern system (a Supermicro with a 4-core 1265LV2 CPU with 9MB of cache, Intel E1G44HTBLK server adapters, and 16GB of RAM), you still tend to get a high percentage of time spent working on softirqs on all the CPUs when the pps reaches somewhere around 60-70k and the traffic approaches 600-900Mbit/sec (during a DDoS, such hardware typically cannot cope).

It seems like finding hardware more optimized for very high packet-per-second counts would be a good thing to do. I just have no idea what is out there that could meet these goals. I'm unsure whether the real problem is faster CPUs, more CPUs, the network cards, or just plain old-fashioned tuning.

Any ideas or suggestions would be welcome!

micah
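For what it's worth, a quick way to see whether receive processing is actually spread across all the cores, and whether the box is running out of softirq budget rather than bandwidth, is to read /proc/net/softnet_stat. A minimal sketch in C, assuming the common layout of one row per CPU with the first three hexadecimal fields being packets processed, packets dropped, and time_squeeze events:

    /* Sketch: summarize /proc/net/softnet_stat per CPU.
     * Assumes the common field layout: col 1 = packets processed,
     * col 2 = packets dropped (backlog full), col 3 = time_squeeze
     * (softirq ran out of budget).  Fields are hexadecimal. */
    #include <stdio.h>

    int main(void)
    {
        FILE *f = fopen("/proc/net/softnet_stat", "r");
        char line[512];
        int cpu = 0;

        if (!f) {
            perror("fopen");
            return 1;
        }
        while (fgets(line, sizeof line, f)) {
            unsigned int processed, dropped, squeezed;
            if (sscanf(line, "%x %x %x", &processed, &dropped, &squeezed) == 3)
                printf("cpu%-3d processed=%u dropped=%u time_squeeze=%u\n",
                       cpu, processed, dropped, squeezed);
            cpu++;
        }
        fclose(f);
        return 0;
    }

If only one or two rows show real counts, the RSS queues/interrupts are pinned to a couple of cores and IRQ affinity is the first thing to look at; if the dropped or time_squeeze columns climb during an attack, the limit really is packets per second, not bits per second.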
Hi Micah,

There is a segment on the hardware side of the industry that produces "network appliances" (folks such as Axiomtek, Lanner Electronics, Caswell Networks, Portwell, etc.). These appliances are commonly used as a commercial (OEM) platform for a variety of uses: routers, firewalls, specialized network applications, and so on.

Our internal (informal) testing matches up with the pps handling commonly quoted by the different product vendors who incorporate these appliances in their network product offerings. In our testing we found pps, not the amount of traffic being moved, to be the limiting factor:

- i3/i5/i7 (x86) based network appliances will forward traffic as long as pps does not exceed 1.4 million (they will easily handle 6G to 10G of traffic).
- Core 2 Duo (x86) based network appliances will forward traffic as long as pps does not exceed 600,000 (they will easily handle 1.5G to 2G of traffic).
- Atom-based (x86) network appliances will forward traffic as long as pps does not exceed 250,000.

Of course, if you start to bog down the router with lots of NAT/ACL/bridge rules (i.e. the CPU has to get involved in traffic management), then your actual performance will be degraded.

Regards,

Faisal Imtiaz
Snappy Internet & Telecom
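Those tiers line up with simple arithmetic: a pps ceiling translates into very different bit rates depending on frame size, which is why a box that handles "6G to 10G" of ordinary traffic can fall over under a small-packet flood. A back-of-the-envelope sketch (counting only frame bytes, ignoring preamble and inter-frame gap):

    /* Back-of-the-envelope: throughput implied by a given pps ceiling. */
    #include <stdio.h>

    int main(void)
    {
        double pps_limits[] = { 250e3, 600e3, 1.4e6 };  /* Atom, Core 2 Duo, i3/i5/i7 */
        int frame_sizes[]   = { 64, 512, 1500 };        /* bytes per frame */

        for (int i = 0; i < 3; i++) {
            for (int j = 0; j < 3; j++) {
                double gbps = pps_limits[i] * frame_sizes[j] * 8 / 1e9;
                printf("%8.0f pps @ %4d-byte frames = %6.2f Gbit/s\n",
                       pps_limits[i], frame_sizes[j], gbps);
            }
        }
        return 0;
    }

At the 1.4 Mpps ceiling, 64-byte frames work out to roughly 0.7 Gbit/s while 1500-byte frames work out to roughly 17 Gbit/s, which is consistent with quoting both "1.4 million pps" and "6G to 10G" for the same hardware.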
From: "micah anderson" <micah@riseup.net> To: nanog@nanog.org Sent: Monday, January 26, 2015 5:53:54 PM Subject: scaling linux-based router hardware recommendations
Hi,
I know that specially programmed ASICs on dedicated hardware like Cisco, Juniper, etc. are going to always outperform a general purpose server running gnu/linux, *bsd... but I find the idea of trying to use proprietary, NSA-backdoored devices difficult to accept, especially when I don't have the budget for it.
I've noticed that even with a relatively modern system (supermicro with a 4 core 1265LV2 CPU, with a 9MB cache, Intel E1G44HTBLK Server adapters, and 16gig of ram, you still tend to get high percentage of time working on softirqs on all the CPUs when pps reaches somewhere around 60-70k, and the traffic approaching 600-900mbit/sec (during a DDoS, such hardware cannot typically cope).
It seems like finding hardware more optimized for very high packet per second counts would be a good thing to do. I just have no idea what is out there that could meet these goals. I'm unsure if faster CPUs, or more CPUs is really the problem, or networking cards, or just plain old fashioned tuning.
Any ideas or suggestions would be welcome! micah
Has anyone tested these setups with something more beefy, like dual Xeons of Sandy Bridge or later vintage? Waiting to hear back from one NIC vendor (HotLava) on what they think can be done on larger hardware setups. Put in two big Xeons and you're looking at 24 cores to work with, as opposed to the <8 on the desktop versions. The newer ones would also have PCIe 3, which would overcome bus speed limitations in PCIe 2. Is it realistic to put 6x - 12x 10GigEs into a server with that much beef and expect it to perform well?

What vintage of Core iX do you run, Faisal?

-----
Mike Hammett
Intelligent Computing Solutions
http://www.ics-il.com
From: "micah anderson" <micah@riseup.net> To: nanog@nanog.org Sent: Monday, January 26, 2015 5:53:54 PM Subject: scaling linux-based router hardware recommendations
Hi,
I know that specially programmed ASICs on dedicated hardware like Cisco, Juniper, etc. are going to always outperform a general purpose server running gnu/linux, *bsd... but I find the idea of trying to use proprietary, NSA-backdoored devices difficult to accept, especially when I don't have the budget for it.
I've noticed that even with a relatively modern system (supermicro with a 4 core 1265LV2 CPU, with a 9MB cache, Intel E1G44HTBLK Server adapters, and 16gig of ram, you still tend to get high percentage of time working on softirqs on all the CPUs when pps reaches somewhere around 60-70k, and the traffic approaching 600-900mbit/sec (during a DDoS, such hardware cannot typically cope).
It seems like finding hardware more optimized for very high packet per second counts would be a good thing to do. I just have no idea what is out there that could meet these goals. I'm unsure if faster CPUs, or more CPUs is really the problem, or networking cards, or just plain old fashioned tuning.
Any ideas or suggestions would be welcome! micah
I get more than that with Realtek NICs on x86; the problem is high interrupt rates even with MSI-X. Intel fixes some of those, and Chelsio makes it all go away... Just saying :)
On 1/26/15 14:53, micah anderson wrote:
Any ideas or suggestions would be welcome!
DPDK is your friend here. -Scott
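The reason DPDK changes the picture is that it takes the NIC away from the kernel: userspace poll-mode drivers, no per-packet interrupts, and no skb allocation. A rough sketch of the kind of forwarding loop involved, loosely modeled on DPDK's l2fwd sample (error handling, lcore pinning, and multi-queue setup omitted; treat the exact API details as approximate):

    /* Rough sketch of a DPDK-style poll-mode forwarding loop between two
     * ports, loosely modeled on the l2fwd sample application. */
    #include <rte_eal.h>
    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    #define BURST_SIZE 32

    int main(int argc, char **argv)
    {
        struct rte_eth_conf port_conf = { 0 };
        struct rte_mempool *pool;
        uint16_t port;

        rte_eal_init(argc, argv);                     /* hugepages, PCI probe */
        pool = rte_pktmbuf_pool_create("mbufs", 8192, 256, 0,
                                       RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());

        for (port = 0; port < 2; port++) {            /* assume two bound ports */
            rte_eth_dev_configure(port, 1, 1, &port_conf);
            rte_eth_rx_queue_setup(port, 0, 512,
                                   rte_eth_dev_socket_id(port), NULL, pool);
            rte_eth_tx_queue_setup(port, 0, 512,
                                   rte_eth_dev_socket_id(port), NULL);
            rte_eth_dev_start(port);
            rte_eth_promiscuous_enable(port);
        }

        for (;;) {                                    /* busy-poll on one core */
            struct rte_mbuf *bufs[BURST_SIZE];
            uint16_t nb   = rte_eth_rx_burst(0, 0, bufs, BURST_SIZE);
            uint16_t sent = rte_eth_tx_burst(1, 0, bufs, nb);
            while (sent < nb)                         /* free what didn't fit */
                rte_pktmbuf_free(bufs[sent++]);
        }
        return 0;
    }

The catch, of course, is that everything the kernel normally provides for free (the routing table, ARP, the BGP/OSPF control plane) has to be rebuilt on top of a loop like this.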
Cumulus Networks has some stuff: http://www.bigswitch.com/sites/default/files/presentations/onug-baremetal-20...

Pretty decent presentation with more detail, if you like.

Mehmet
Aren't most of the new whitebox/open source platforms based on switching and not routing? I'd assume that the "cloud-scale" data centers deploying this stuff still have more traditional big iron at their cores.

The small/medium-sized ISP usually is left behind. They're not big enough to afford the big new hardware, but all of their users' Netflix and porn and whatever else they do is chewing up bandwidth. For example, the small/medium ISPs are at the Nx10GigE stage now. The new hardware is expensive, and the old hardware (besides being old) is likely in a huge chassis if you can get any sort of port density at all.

48-port GigE switches with a couple of 10GigE ports can be had for $100. A minimum of 24-port 10GigE switches (except for the occasional IBM switch) is 30x to 40x that. Routers (BGP, MPLS, etc.) with more than just a couple of 10GigEs are even more money, I'd assume.

I thought vMX was going to save the day, but its pricing for 10 gigs of traffic (licensed by throughput and standard/advanced licenses) is really about 5x - 10x what I'd be willing to pay for it. Haven't gotten a quote from AlcaLu yet. Vyatta (last I checked, which was admittedly some time ago) doesn't have MPLS. The FreeBSD world can bring zero software cost and a stable platform, but no MPLS.

Mikrotik brings most (though not all) of the features one would want... a good enough feature set, let's say... but is a non-stop flow of bugs. I don't think a week or two goes by where one of my friends doesn't submit some sort of reproducible bug to Mikrotik. They've also been "looking into" DPDK for 2.5 years now; it hasn't shown up yet. I've used MT for 10 years and I'm always left wanting just a little more, but it may be the best balance between the features and performance I want and the ability to pay for it.

-----
Mike Hammett
Intelligent Computing Solutions
http://www.ics-il.com
And the solution to this issue is - http://routerboard.com/ or http://www.mikrotik.com/software# on x86 hardware, plus any basic layer 2 switch. Don't scoff until you have tried it; the price/performance is pretty staggering if you are in the sub-20-gig space.
Must not have read my whole e-mail. ;-) There aren't very many people outside of my group that know more about Mikrotik: trainers, MUM presenters, direct-line-to-Janis guys, etc. Still can't make those Latvians produce what we want.

-----
Mike Hammett
Intelligent Computing Solutions
http://www.ics-il.com
Like Mike mentioned, the feature list in RouterOS is nothing short of impressive -- the problem is that pretty much everything in there is inherently buggy. That, and one hell of a painful syntax/schema to work with too.
Different (configuration) strokes for different folks. I look at a Cisco interface now and say, "Who the hell would use this?" despite my decade-old Cisco training.

I was corrected off-list that Vyatta does do MPLS now... but I can't find anything on it doing VPLS, so I guess that's still out. The 5600's license (according to their SDNCentral performance report) appears to be near $7k, whereas with MT you can get a license for $80.

-----
Mike Hammett
Intelligent Computing Solutions
http://www.ics-il.com
On Tue, 27 Jan 2015 11:10:54 +0900, "Paul S." said:
Like Mike mentioned, the feature list in RouterOS is nothing short of impressive -- problem is that pretty much everything in there is inherently buggy.
That and one hell of a painful syntax-schema to work with too.
Latvian grammar is.. somewhat unusual. Just be glad the development team wasn't Finnish. :) (Sorry, I couldn't resist. :)
How's convergence time on these Mikrotik/Ubiquiti/etc. units for a full table?

/kc
--
Ken Chase - math@sizone.org Toronto
Depends on the hardware. 30 - 45 seconds for the higher-end stuff? I'm not sure how long it is on an RB750 (list price of like $40). ;-)

-----
Mike Hammett
Intelligent Computing Solutions
http://www.ics-il.com
A Maxxwave Routermxx MW-RM1300-i7 (x86 Mikrotik router) pulls full tables from two peers and converges in about 40 seconds.
-- Adair Winter VP, Network Operations / Owner Amarillo Wireless | 806.316.5071 C: 806.231.7180 http://www.amarillowireless.net
On 27/01/2015, at 4:29 pm, Ken Chase <math@sizone.org> wrote:
Hows convergence time on these mikrotik/ubiquity/etc units for a full table?
For the CCR1036-12G-4S with one full table, one domestic table (NZ - ~26k entries), some peering, and iBGP, full convergence took about three minutes forty seconds the last time I timed it from cold. I may do some new timing, as they have been working hard to improve the multi-core support (currently BGP is still single-core only; however, they have been doing some work on efficient allocation of other tasks to cores).
On Tue, Jan 27, 2015 at 04:59:12PM +1300, Alexander Neilson said:
For the CCR1036-12G-4S with one full table, one domestic table (NZ - ~26k entries) some peering and iBGP full convergence took about three minutes forty seconds last time I timed it from cold.
That's terrible. I don't know what model that is or its appropriate deployments, but I think a couple of my peers use these and report similar times for their models for 500k+ routes. Still too slow. I think the single-threaded nature of the routing table manipulation is at fault with the 36-but-slow cores (Mikrotik). I'm not sure how you get around this without drastically rewriting the kernel, which puts you out on your own developing new fundamental tech. I'd be more comfortable with full-CPU models (Xeon-based, for example).
From: Faisal Imtiaz <faisal@snappytelecom.net> Under 30sec (more like 15 to 20) on an i7 based Mikrotik for full BGP Tables.
Ya, that. /kc -- Ken Chase - math@sizone.org Toronto
Under 30 sec (more like 15 to 20) on an i7-based Mikrotik for full BGP tables.

Faisal Imtiaz
From: "Ken Chase" <math@sizone.org> To: nanog@nanog.org Sent: Monday, January 26, 2015 10:29:28 PM Subject: Re: scaling linux-based router hardware recommendations
Hows convergence time on these mikrotik/ubiquity/etc units for a full table?
/kc -- Ken Chase - math@sizone.org Toronto
On 1/26/15 5:43 PM, Mike Hammett wrote:
Aren't most of the new whitebox\open source platforms based on switching and not routing? I'd assume that the "cloud-scale" data centers deploying this stuff still have more traditional big iron at their cores.
An L3 Ethernet switch and a "router" are effectively indistinguishable; the actual feature set you need drives which platforms are appropriate. A significant push for DCs, particularly those with Clos architectures, is away from modular chassis-based switches towards dense but fixed-configuration switches. This drives the complexity, and a significant chunk of the cost, out of these switches.
The small\medium sized ISP usually is left behind. They're not big enough to afford the big new hardware, but all of their user's NetFlix and porn and whatever else they do is chewing up bandwidth.
Everyone in the industry is under margin pressure. Done well, every subsequent generation of your infrastructure is less costly per bit delivered while also being faster.
For example, the small\medium ISPs are at the Nx10GigE stage now. The new hardware is expensive, the old hardware (besides being old) is likely in a huge chassis if you can get any sort of port density at all.
If you're a small consumer-based ISP, how many routers do you actually need that have a full table? (The customer access network doesn't need it.)
48 port GigE switches with a couple 10GigE can be had for $100.
I'm not aware of that being the case. With respect to merchant silicon, there are a limited number of common L3 switch ASIC building blocks which all switch/router vendors can avail themselves of: Broadcom Trident+, Trident 2 and Arad, Intel FM6000, Marvell Prestera, etc.
A minimum of 24 port 10GigE switches (except for the occasional IBM switch ) is 30x to 40x times that. Routers (BGP, MPLS, etc.) with that more than just a couple 10GigEs are even more money, I'd assume.
A 64-port 10 or mixed 10/40Gb/s switch can forward more than half a Tb/s worth of 64-byte packets, do so with cut-through forwarding, and do it in a thermal envelope of 150 watts. Devices like that retail for ~$20k, and in reality you need more than one. The equivalent gigabit product is 15 or 20% of the price. You mention MPLS support, so that dictates appropriate support, which is available in some platforms and ASICs.
I thought vMX was going to save the day, but it's pricing for 10 gigs of traffic (licensed by throughput and standard\advanced licenses) is really about 5x - 10x what I'd be willing to pay for it.
The servers capable of relatively high-end forwarding feats aren't free either, nor are the equivalents.
Haven't gotten a quote from AlcaLu yet.
Vyatta (last I checked, which was admittedly some time ago) doesn't have MPLS.
The FreeBSD world can bring zero software cost and a stable platform, but no MPLS.
MPLS implementations have abundant IPR, which among other things prevents practical merging with the Linux kernel.
Easy to make a switch when the only thing you're actually doing is telling the ASIC what to do (Cumulus, Ubiquiti, ... every other Broadcom vendor out there...). Better yet - Atheros have finally come out with a 24x1GE + 2x10GE switch ASIC - only a matter of time before they challenge Broadcom et al.
10-15 years ago, we were seeing early Pentium 4 boxes capable of moving 100Kpps+ on FreeBSD. See for example http://info.iet.unipi.it/~luigi/polling/

Luigi moved on to netmap, which looks promising for this sort of thing: https://www.usenix.org/system/files/conference/atc12/atc12-final186.pdf I was under the impression that some people have been using this for 10G routing.

Also, I'll note that Ubiquiti has some remarkable low-power gear capable of 1Mpps+.

... JG
--
Joe Greco - sol.net Network Services - Milwaukee, WI - http://www.sol.net
"We call it the 'one bite at the apple' rule. Give me one chance [and] then I won't contact you again." - Direct Marketing Ass'n position on e-mail spam (CNN)
With 24 million small businesses in the US alone, that's way too many apples.
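For a flavor of what netmap looks like from userspace: the kernel maps the NIC rings into the process and you sync them with poll(), so the per-packet cost is a ring-slot lookup rather than a syscall plus an skb. A minimal receive-loop sketch using the nm_* helpers from netmap_user.h (untested; details approximate):

    /* Minimal netmap receive loop sketch: open an interface in netmap mode,
     * poll for batches, and walk the packets zero-copy from the NIC rings. */
    #define NETMAP_WITH_LIBS
    #include <net/netmap_user.h>
    #include <poll.h>
    #include <stdio.h>

    int main(void)
    {
        struct nm_desc *d = nm_open("netmap:eth0", NULL, 0, NULL);
        struct nm_pkthdr hdr;
        const unsigned char *pkt;

        if (d == NULL) {
            fprintf(stderr, "nm_open failed (is the netmap module loaded?)\n");
            return 1;
        }
        struct pollfd pfd = { .fd = NETMAP_FD(d), .events = POLLIN };

        for (;;) {
            poll(&pfd, 1, -1);                    /* one syscall per batch */
            while ((pkt = nm_nextpkt(d, &hdr)) != NULL) {
                /* hdr.len bytes at pkt; filtering/forwarding logic goes here */
            }
        }
        nm_close(d);
        return 0;
    }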
One thing to note about Ubiquiti's EdgeMax products is that they are not Intel-based. They use Cavium Octeons (at least that's what my EdgeRouter Lite has in it).

Oliver
-------------------------------------
Oliver Garraux
Check out my blog: blog.garraux.net
Follow me on Twitter: twitter.com/olivergarraux
Kind of unsurprisingly, the traditional network vendors are somewhat at the forefront of pushing what an x86 server can do as well. Brocade (Vyatta), Juniper, and Alcatel-Lucent all have virtualized routers using Intel's DPDK pushing 5M+ PPS at this point. They are all also tweaking what Intel is providing, and they are the ones with lots of software developers with a lot of hardware and network programming experience.

ALU claims to be able to get 160Gbps full duplex through a 2RU server with 16x10G interfaces and two 10-core latest-gen Xeon processors. Of course that's probably at 9000-byte packet sizes, but at IMIX-type traffic it's probably still pushing 60-70Gbps. They have a demo of lots of them in a single rack managed as a single router pushing Tbps.

For a commercial offering you are going to pay for that kind of performance and the control-plane software. Over time, though, you'll see the DPDK-type enhancements make it into standard OS stacks. Other options include servers with integrated network processors or NPs on a PCI card; there is a whole rash of those types of devices out there now and coming out.

Phil
I'm also in the research stage of building our own router. I'm interested in reading more if you can post links to some of this research and/or testing. David Sent from my iPad
It really depends on the application that you are interested in beyond forwarding, but not knowing that: to scale forwarding "at a reasonable price", things have to come off the CPU and become more customized for forwarding, especially for low-latency forwarding. The optimization comes in minimizing packet tuple copies, offload to co-processors and network co-processors (some of which can be in NICs), and parallel processing with some semblance of shared memory across them, all of which takes customization beyond the CPU and kernel, which itself needs to be stripped down bare and embedded.

Ultimately that's what appliance vendors do, with different levels of hardware/firmware customization depending on the ROI of features, speeds, and price. A generic open-source-compatible OEM product with multi-gig ports will generally be at least a half to a fifth of the price of a high-end, latest-architecture server product able to support 10-gig interfaces in the same forwarding performance range (such servers target a different scale problem in compute and net I/O, and exist at a price point that makes them exorbitant as a way to solve forwarding speed).

Cheers,
Sudeep Khuraijam
Hello! Looks like somebody wants to build a Linux soft router!) Nice idea for routing 10-30 Gbps. I route about 5+ Gbps on a Xeon E5-2620v2 with four 10GE Intel 82599 cards and Debian Wheezy with kernel 3.2 (but it's a really terrible kernel; everyone should use modern kernels from 3.16 on because of the buggy Linux route cache). My current processor load on that server is about 15%, so I could route roughly 15 GE on this Linux server. You should also deploy a backup server in case the master server fails.
-- Sincerely yours, Pavel Odintsov
Anyone aware of any DPDK-enabled solutions in the software routing space that don't cost an arm and a leg? vMX certainly does.
Hello! You could try to build a simple router with DPDK yourself. It's very straightforward and has good examples for simple routing. I have done some tests with PF_RING ZC (a very similar technology to DPDK, without the specialization toward building network devices) while testing my DDoS monitoring solution, and it worked perfectly. I can achieve 8 million packets per second (10GE with 120-byte packets) on a fairly slow Intel Xeon E5-2420. You could look at these tests from the PF_RING developers: http://www.ntop.org/pf_ring/pf_ring-dna-rfc-2544-benchmark/ But building a router on top of PF_RING or DPDK is a very challenging task, because everyone wants very different things (BGP, OSPF, RIP... etc.).
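To make the DPDK route concrete, here is a minimal burst-forwarding loop sketched along the lines of DPDK's basic forwarding ("skeleton") example; the port numbers, ring sizes and pool sizes are illustrative rather than taken from any setup described in this thread:

```c
/*
 * Minimal DPDK burst-forwarding loop, modeled on DPDK's basic forwarding
 * ("skeleton") example.  Port numbers, ring and pool sizes are illustrative.
 * Build against an installed DPDK (e.g. pkg-config --cflags --libs libdpdk).
 */
#include <stdio.h>
#include <stdint.h>
#include <rte_eal.h>
#include <rte_ethdev.h>
#include <rte_lcore.h>
#include <rte_mbuf.h>

#define RX_RING_SIZE 1024
#define TX_RING_SIZE 1024
#define NUM_MBUFS    8191
#define MBUF_CACHE   250
#define BURST_SIZE   32

static void port_init(uint16_t port, struct rte_mempool *pool)
{
    struct rte_eth_conf conf = {0};            /* default port configuration */

    rte_eth_dev_configure(port, 1, 1, &conf);  /* one RX and one TX queue */
    rte_eth_rx_queue_setup(port, 0, RX_RING_SIZE,
                           rte_eth_dev_socket_id(port), NULL, pool);
    rte_eth_tx_queue_setup(port, 0, TX_RING_SIZE,
                           rte_eth_dev_socket_id(port), NULL);
    rte_eth_dev_start(port);
    rte_eth_promiscuous_enable(port);
}

int main(int argc, char **argv)
{
    if (rte_eal_init(argc, argv) < 0) {
        fprintf(stderr, "EAL init failed\n");
        return 1;
    }

    struct rte_mempool *pool = rte_pktmbuf_pool_create("MBUF_POOL",
        NUM_MBUFS, MBUF_CACHE, 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
    if (pool == NULL) {
        fprintf(stderr, "mbuf pool creation failed\n");
        return 1;
    }

    port_init(0, pool);                 /* forward between ports 0 and 1 */
    port_init(1, pool);

    for (;;) {
        for (uint16_t port = 0; port < 2; port++) {
            struct rte_mbuf *bufs[BURST_SIZE];

            /* Poll a burst of packets from this port... */
            uint16_t nb_rx = rte_eth_rx_burst(port, 0, bufs, BURST_SIZE);
            if (nb_rx == 0)
                continue;

            /* ...and push them straight out the other port. */
            uint16_t nb_tx = rte_eth_tx_burst(port ^ 1, 0, bufs, nb_rx);
            while (nb_tx < nb_rx)                 /* free what didn't fit */
                rte_pktmbuf_free(bufs[nb_tx++]);
        }
    }
}
```

A loop like this is only the starting point; the hard part, as noted above, is layering route lookup and a real control plane (BGP, OSPF, etc.) on top of it.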
-- Sincerely yours, Pavel Odintsov
I propose a hybrid solution:

A device such as the ZTE 5960e with 24x 10G and 2x 40G will set you back about USD 6000. This thing can do MPLS and L3 equal-cost multipath routing. With that you can load balance across as many software routers as you need. It also speaks BGP and can accept about 10k routes, so maybe you should consider whether the full table is really worth it.

It would be possible to have your software router speak BGP with the neighbors and use next hop to direct the traffic directly to the switch. Or use proxy ARP if the peer does not want to let you specify a different next hop than the BGP speaker. This way your software router is only moving outgoing packets. Inbound packets will never go through the computer, but will instead be delivered directly to the correct destination by hardware switching.

If you are an ISP, you will often have more inbound traffic, so this is very useful. Also, the weak point of the software router is denial of service attacks with small packets; those attacks are likely from outside your network, so your software router will not need to route them.

We need someone to code a BGP daemon that will export the 5k most used routes to the switch (see the sketch below). This way you can have the switch deliver the majority of the traffic directly to your peers.

If you are a service provider, much of your traffic is outbound. Put your servers or multiple routers/firewalls on the same VLAN as your transit, then add static host routes for the next hop on all servers. This way you can have as many servers as you need delivering traffic directly. You can run iBGP on all the servers, so every server knows how to route outbound by itself. MPLS would also be useful for this instead of a VLAN, but there is no good MPLS implementation for Linux.

Regards,

Baldur
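As a rough illustration of the "export the most-used routes to the switch" idea, a selection pass could look like the sketch below; the prefix/byte-counter input and the program_hw_fib() hook are hypothetical placeholders standing in for NetFlow/sFlow counters and the switch's actual programming interface (BGP, an API, etc.):

```c
/*
 * Sketch of the "top-N routes into hardware" idea described above.
 * The route_stat input and program_hw_fib() are hypothetical placeholders;
 * a real implementation would feed counters from NetFlow/sFlow and push
 * routes to the switch via BGP or a vendor API.
 */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

#define HW_FIB_SLOTS 5000          /* routes the hardware switch can hold */

struct route_stat {
    char     prefix[48];           /* e.g. "203.0.113.0/24" */
    uint64_t bytes;                /* traffic seen toward this prefix */
};

/* Sort most-used prefixes first. */
static int by_bytes_desc(const void *a, const void *b)
{
    const struct route_stat *ra = a, *rb = b;
    if (ra->bytes == rb->bytes) return 0;
    return ra->bytes < rb->bytes ? 1 : -1;
}

/* Hypothetical hook: install one route into the switch's FIB. */
static void program_hw_fib(const char *prefix)
{
    printf("install %s\n", prefix);
}

void export_top_routes(struct route_stat *stats, size_t n)
{
    qsort(stats, n, sizeof(*stats), by_bytes_desc);

    size_t limit = n < HW_FIB_SLOTS ? n : HW_FIB_SLOTS;
    for (size_t i = 0; i < limit; i++)
        program_hw_fib(stats[i].prefix);

    /* Everything else follows the default route toward the software
     * router, which still carries the full table in its RIB. */
}
```

The point of the sketch is only the split: the hardware FIB holds the heavy hitters, and the software router mops up the long tail.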
There are some interesting ideas there. There are tricks to getting 128K LPM routes into a Trident2 device like the one you mentioned. You can get the same type of device from Cisco/Juniper for not a whole lot more; what you are really paying for is the mature control plane.

https://github.com/dbarrosop/sir is a project from David Barroso at Spotify. The BGP daemon on your Internet-connected device may store all the routes in its RIB but usually doesn't need everything in the FIB. A downstream BGP controller running on an x86 server would analyze the routes more closely, use something like NetFlow/sFlow (or really any criteria) to identify the routes you really care about, install those into the FIB on the device, and just use the default for everything else. Metaswitch, a commercial control-plane company, had a similar idea using OpenFlow, where the actual Internet-connected device just proxied BGP connections to a controller, which then went back and programmed the upstream OpenFlow switches. They call them "lean transit switches."

The major network vendors all have the ability to run their control planes in software on x86 these days just fine. On many newer platforms they run the software as a VM anyway. Pricing for bringing your own server is going to be cheaper, but not free.

As for open source type stuff, Contrail from Juniper was made open source and has a BGP implementation, an MPLS (MPLSoUDP) implementation, and a Linux kernel module to do fairly high speed packet forwarding, what they call the vRouter in Contrail. Wind River is another vendor that has incorporated the Intel DPDK stuff into a Linux distribution, but it is commercial as well.

Phil
There is also some work in progress to improve network performance in the Linux kernel: https://lwn.net/Articles/629155/ Preliminary, but encouraging that work is under way. -- Hugo
Can it be FreeBSD-based? http://info.iet.unipi.it/~luigi/netmap/
-- Eduardo Schoedler
I looked into the promise and limits of this approach pretty intensively a few years back before abandoning the effort abruptly due to other constraints.

Underscoring what others have said: it's all about pps, not aggregate throughput. Modern NICs can inject packets at line rate into the kernel and distribute them across per-processor queues, etc. Payloads end up getting DMA-ed from NIC to RAM to NIC. There's really no reason you shouldn't be able to push 80 Gb/s of traffic, or more, through these boxes. As for routing protocol performance (BGP convergence time, ability to handle multiple full tables, etc.): that's just CPU and RAM.

The part that's hard (as in "can't be fixed without rethinking this approach") is the per-packet routing overhead: the cost of reading the packet header, looking up the destination in the routing table, decrementing the TTL, and enqueueing the packet on the correct outbound interface. At the time, I was able to convince myself that being able to do this in 4 us, average, in the Linux kernel, was within reach. That's not really very much time: you start asking things like "will the entire routing table fit into the L2 cache?"

4 us to "think about" each packet comes out to 250Kpps per processor; with 24 processors, it's 6Mpps (assuming zero concurrency/locking overhead, which might be a little bit of an ... assumption). With 1500-byte packets, 6Mpps is 72 Gb/s of throughput -- not too shabby. But with 40-byte packets, it's less than 2 Gb/s. Which means that your Xeon E5-2620v2 will not cope well with a DDoS of 40-byte packets. That's not necessarily a reason not to use this approach, depending on your situation; but it's something to be aware of.

I ended up convincing myself that OpenFlow was the right general idea: marry fast, dumb, and cheap switching hardware with a fast, smart, and cheap generic CPU for the complicated stuff.

My expertise, such as it ever was, is a bit stale at this point, and my figures might be a little off. But I think the general principle applies: think about the minimum number of x86 instructions, and the minimum number of main memory accesses, needed to inspect a packet header, do a routing table lookup, and enqueue the packet on an outbound interface. I can't see that ever getting reduced to the point where a generic server can handle 40-byte packets at line rate (for that matter, "line rate" is increasing a lot faster than "speed of generic server" these days).

Jim
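The back-of-the-envelope numbers above are easy to re-derive; the short program below simply restates the arithmetic under the same stated assumptions (4 us of per-packet work, 24 cores, ideal zero-contention scaling):

```c
/* Back-of-the-envelope check of the per-packet budget described above:
 * 4 us of CPU time per packet, 24 cores, and ideal (zero-contention)
 * scaling -- the same assumptions used in the post. */
#include <stdio.h>

int main(void)
{
    double us_per_pkt = 4.0;                     /* per-packet "think" time */
    int    cores      = 24;

    double pps_per_core = 1e6 / us_per_pkt;      /* 250,000 pps per core   */
    double pps_total    = pps_per_core * cores;  /* 6,000,000 pps in total */

    /* At a fixed pps budget, throughput depends entirely on packet size. */
    double gbps_1500 = pps_total * 1500 * 8 / 1e9;  /* ~72 Gb/s  */
    double gbps_40   = pps_total *   40 * 8 / 1e9;  /* ~1.9 Gb/s */

    printf("per core: %.0f pps, total: %.0f pps\n", pps_per_core, pps_total);
    printf("1500-byte packets: %.1f Gb/s\n", gbps_1500);
    printf("40-byte packets:   %.1f Gb/s\n", gbps_40);
    return 0;
}
```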
Using DPDK it’s possible to do everything stated and achieve 10Gbps line rate at 64byte packets on multiple interfaces simultaneously. Add ACLs to the test setup and you can reach significant portions of 10Gbps at 64byte packets and full line rate at 128bytes. Check out Venky Venkatesan’s presentation at the last DPDK Summit for interesting information on pps/CPU cycles and some of the things that can be done to optimize forwarding in a generic processor environment. http://www.slideshare.net/jstleger/6-dpdk-summit-2014-intel-presentation-ven...
That's the problem though. Everyone has presentations, for the most part; very few actual tools exist that end users can just use.
I was trying not to pitch my company on list, but the performance numbers I quoted are for the Vyatta/Brocade vRouter, which is commercially available. Other vendors also have publicly available performance numbers that are interesting.
There is no free lunch. If you want "tools that end users can just use" then buy Cisco. Otherwise you need to roll up your sleeves and put the pieces together, or hire people like me to do it for you. It isn't overly complicated in my opinion, and you'll find plenty of reasonably priced Linux or BSD integration engineers across the globe who are used to doing this sort of thing.

Now, once you move beyond basic forwarding / high-PPS processing (which seems mostly commodity now) and get into, say, 80 Gbps (40 Gbps full duplex) IPS, IP reputation, data loss prevention, SSL MITM, AV... well, that requires some very beefy hardware. Can that be done on x86? I doubt it. Tilera seems the way to go here. Newer FPGA boards can implement various CPU architectures on the fly. You also have CUDA. I hadn't seen Chelsio; I'm very excited about that, and I'll have one in my grubby little hands soon enough.

Transceivers are still horribly expensive. They are a major portion of the BOM cost on any build, no matter what software stack is putting packets onto them.

It isn't so simple once you move beyond the 1 Gbps range and want a full feature set, and not in one box I think. Look at https://www.bro.org/ for interesting multi-box scaling.
A QNX OS based router works well with PowerPC, and could be pushed to a far higher load than Intel-based chips. Colin
How is that a problem? --- Theory is when you know everything but nothing works. Practice is when everything works but no one knows why. Sometimes theory and practice are combined: nothing works and no one knows why.
-----Original Message----- From: NANOG [mailto:nanog-bounces@nanog.org] On Behalf Of Mark Tinka Sent: Wednesday, 28 January, 2015 07:55 To: nanog@nanog.org Subject: Re: scaling linux-based router hardware recommendations
On 28/1/15 16:45, Colin Johnston wrote:
qnx os based router works well with powerpc, could be pushed far higher load than intel based chips
The problem being that QNX is a 32-bit kernel.
Mark.
That may be so, but how many people out there know how to push QNX that hard compared to FreeBSD/Linux on amd64-compatible hardware? And how many people know how to configure up a Juniper MX or Cisco ASR9k, compared to the number who can tune a freely available Unix? As someone pointed out elsewhere, there's no such thing as a free lunch. If you want to economise on hardware, you should expect to pay for the expertise to do it. Nick
10G transceivers are not overly expensive if you buy compatible modules. An SFP+ direct attach cable is $16, an SFP+ multimode module is $18, and an SFP+ singlemode LR module is $48. That is nothing compared to what vendors are asking for a "real" router.

I believe there are many startups that are going for 2x 10G transit with full tables. We are one of them for sure. And then you need a cheap way to handle up to 20G of bidirectional traffic, because as a startup it is not a good idea to fork over what equals a whole year of salary to Cisco or Juniper. Even if you have that kind of money, you would want to spend it on something that will get you revenue.

The obvious solution is a server (or two for redundancy) running Linux or BSD. You will be getting the Intel NIC with two SFP+ slots, so you can connect a transit connection directly to each server. This works well enough; we used a setup just like that for a year before we upgraded to a hardware router. The weak point is that it will likely have trouble if you get hit by a really big DDoS with small packets.

But back to the cost of things. If I use my own company as an example, we are an FTTH provider. We use PON switches with 2x 10G ports on each switch. You can get many PON switches for the price of one router with at least 4x 10G ports (equivalent to the Linux routers). The PON switches will earn you revenue; they are what you connect your customers to. Better to get a bigger network than to spend the money on a router. The cost of SFP+/XFP and GPON C+ modules on the PON switch is only about 10% of the cost of the switch itself (again using compatible modules). A switch with 24x 1G and 4x 10G can be bought for $3000, and you can fill it completely with optics for $300 - again about 10%.

My point is that if you are in an environment where every dollar counts, you do not need to spend a majority of your funds on optics. And neither do you need that expensive router until later in the game.

Regards,

Baldur
Hello! This is a very interesting yet obscure and not widely discussed subject, and industry generally does not like the discussion to come up on public lists like this one. If you happen to reach line-rate PPS throughput on x86, for filtering or forwarding, how will they keep that high profit rate on their products and keep investors happy?

With that said, I am a very happy user of two hardware vendors not widely known, and of a technology very well known but still barely discussed. I run FreeBSD, the so-called "silent workhorse", as a BGP router and also FreeBSD (or pfSense) as a border firewall. For hardware vendors, I am a very happy customer of:

- iXSystems (www.ixsystems.com)
- ServerU Inc. (www.serveru.us)

They are both BSD/Linux-driven hardware specialists, and they are both very good consultants and technology engineers. I run a number of BGP and firewall boxes in GA, NY, FL and some other locations on the east coast, as well as Belize, BVI, the Bahamas and LATAM. pfSense is my number-one system of choice, but sometimes I run vanilla FreeBSD, especially in my core locations.

In one central location I have the following setup:

- 1x ServerU Netmap L800 box in bridge mode for core firewall protection
- 2x ServerU Netmap L800 boxes as BGP routers (redundant)
- Several Netmap L800, L100 and iXSystems servers (iXS for everything else, since ServerU are only networking-centric, not high-storage, high-processing Xeon servers)

In this setup I am running yet another not well known but very promising technology, called netmap.

A netmap firewall (called netmap-ipfw) was supplied by the ServerU vendor; it's a slightly modified version of what you can download from Luigi Rizzo's (the netmap author's) public repository, with multithread capabilities based on the number of queues available in the ServerU igb(4) networking card. What it does is, IMHO, amazing for x86 hardware: line-rate firewalling on a 1GbE port (1.3-1.4Mpps) and line-rate firewalling on a 10GbE port (12-14Mpps) on a system with an 8-core 2.4GHz Intel Rangeley CPU.

It's not Linux DNA. It's not PF_RING. It's not Intel DPDK. It's netmap, and it's there, available, in the FreeBSD base system, with a number of utilities and reference code in Rizzo's repositories. It's there, it's available and it's amazing. This firewall has saved my sleep several times since November, dropping up to 9Mpps of amplified UDP/NTP traffic at peak DDoS attack rates.

For the BGP box, I needed trunking, Q-in-Q and VLANs, and sadly right now that is not available in a netmap implementation. It means I had to keep my BGP router in the kernel path. It's funny to say this, but netmap usually skips the kernel path completely and does its job directly on the NIC, reaching backplane and bus limits directly.

ServerU people recommended I use Chelsio Terminator 5 40G ports. OK, I only needed 10G, but they convinced me not to look at the bits-per-second numbers but at the packets-per-second numbers. Honestly, I don't know how the Chelsio T5 does it; even though the ServerU 1GbE ports perform very well on interrupt CPU usage (probably to the credit of Intel's igb(4)/ix(4) drivers), for everything I route from one 40GbE port to the other port on the same L-800 expansion card I see very, very, very LOW interrupt rates. Sometimes I have no interrupts at all! I peaked at routing 6Mpps on the ServerU L-800 and still had CPU available. I am not sure where proper credit is due: to the ServerU hardware, to the FreeBSD OS, to netmap or to Chelsio.
But I am sure about what matters to my VP and my CFO: $$$. While a T5 card costs around USD 1,000 and a ServerU L-800 router costs another USD 1,200, I have a USD 2.2k overall cost of ownership for a box that gives me PPS rates that would otherwise cost from USD 9,000 to USD 12,000 in an industry product.

I have followed a good discussion in a LinkedIn group (anyone googling for it will find it) comparing Netmap to DPDK from the developer perspective. The Netmap developer raised some good considerations while an Intel engineer pointed out some other perspectives. Overall, DPDK and Netmap sound, from my end-user/non-developer/non-software-engineer point of view, very similar in terms of results, while different in the inner gory details, with some flexibility/generalist advantages for Netmap and some hardware-specific advantages for DPDK when running Intel hardware (of course), since it's like CUDA is for Nvidia... vendor specific.

I honestly hope a fraction of the million dollars donated to the FreeBSD Foundation by the WhatsApp founder goes to research and enhancements for Netmap technology. It's the most promising networking technology I have seen in recent years, and it goes straight to what FreeBSD does best: networking performance. It's not a coincidence that since the beginning of the Internet, top Internet traffic servers, from Yahoo! to WhatsApp and Netflix, have run FreeBSD. I don't know how decisions will be made concerning adding to the Netmap stack a superset of full forwarding capability along with lagg(4), vlan(4), Q-in-Q, maybe carp(4) and other lightweight but still very kernel-path-choppy features. But I hope FreeBSD engineers take good decisions on those issues, and assign time, funds and goals to Netmap.

For now, however, if you really want a relatively new and innovative technology with actual code to use and run, ready and available, this is my suggestion: FreeBSD + Netmap. And for hardware vendors, iXSystems + ServerU. It gets out of the speculation field, since Netmap reference code for serious stuff, including a whole firewall, is available and ready to test, compare, enhance and use. Suricata IDP has Netmap support, so yes, you can inspect packets at close to line rate in IDS (not IPS) mode with Suricata. For everything else -- DPDK, DNA, PF_RING -- you have a framework in place. Some are experimental, some are more mature, but you will have to code and prove it by yourself. FreeBSD/Netmap, meanwhile, is a flavor ready to be tasted. This is my 5 cents opinion on such a great topic!

Concerning BGP convergence time: come on, are you serious? You deal with platforms that take 1 minute, up to 3 minutes, for full convergence of a couple of full BGP sessions? What hardware is that? An 8-bit Nintendo? LOL! ;-) Seriously and literally, a Sega Dreamcast videogame running NetBSD + BIRD will have better convergence time!!

Now, serious again and no more ironic statements. While Cisco and Juniper have great ASIC chips and such, it's amazing to see that nowadays the Juniper MX Series still runs weak Cavium Octeon CPUs for the stuff their Trio 3D chip won't do. The same goes for Cisco, with amazing ASICs but weak CPU power that does indeed need to be protected from DDoS attacks for the things that won't run on the ASICs. Convergence times above 30 seconds nowadays, IMHO, should not be accepted in any new BGP environment; only legacy hardware should take that long. With OpenBGP I have <30s convergence time for several full sessions on x86 hardware like the boxes mentioned above.
With BIRD, convergence times are even lower. If convergence takes longer on OpenBGP or BIRD, it's mostly related to how long the UPDATE messages take to arrive, not how long they take to be processed. -- Eddie
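For readers who haven't touched the netmap API described above, here is a minimal receive loop sketched with the nm_open()/nm_nextpkt() helper wrappers from FreeBSD's <net/netmap_user.h>; the interface name is illustrative, and a real netmap-ipfw style firewall is considerably more involved:

```c
/*
 * Minimal netmap receive loop using the helper wrappers from
 * <net/netmap_user.h> (nm_open()/nm_nextpkt()), as shipped with FreeBSD.
 * The interface name is illustrative; teardown is nm_close(d).
 */
#define NETMAP_WITH_LIBS
#include <net/netmap_user.h>
#include <poll.h>
#include <stdio.h>

int main(void)
{
    /* Attach to the NIC in netmap mode; the kernel stack stops seeing it. */
    struct nm_desc *d = nm_open("netmap:ix0", NULL, 0, NULL);
    if (d == NULL) {
        perror("nm_open");
        return 1;
    }

    struct pollfd pfd = { .fd = NETMAP_FD(d), .events = POLLIN };
    unsigned long pkts = 0, bytes = 0;

    for (;;) {
        if (poll(&pfd, 1, 1000) == 0) {       /* 1s timeout: report stats */
            printf("%lu pkts, %lu bytes\n", pkts, bytes);
            continue;
        }

        struct nm_pkthdr h;
        const unsigned char *buf;
        /* Drain every packet currently sitting in the RX rings. */
        while ((buf = nm_nextpkt(d, &h)) != NULL) {
            pkts++;
            bytes += h.len;
            (void)buf;   /* inspect, filter or forward the frame here */
        }
    }
}
```

Dropping or forwarding would happen where the frame is inspected; netmap-ipfw layers the ipfw rule engine on top of exactly this kind of ring-draining loop.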
I am also a user of FreeBSD netmap-ipfw, running kipfw with fwd rules to, say, "fwd" HTTP traffic to a PeerApp appliance. My numbers are not line rate, I peak at 900Kpps, but I still have CPU idle. I had a hard time figuring out how to use netmap-ipfw, due to the lack of updated documentation, but once I got it running and set up, everything was very straightforward with the default code, no modifications, just as available. I agree FreeBSD/netmap seems more ready, with tools, toolchains and code available when compared to DPDK or Linux DNA. I am also hoping for further evolution of netmap in the base system. The numbers are impressive indeed. -- =========== Eduardo Meyer personal: dudu.meyer@gmail.com professional: ddm.farmaciap@saude.gov.br
I recently built a pair of Linux-based "routers" to handle full BGP tables from 3 upstream providers (10gig links). I had penguincomputing.com build me two reasonably powerful (dual Xeon hex-core processor) servers with SolarFlare <http://solarflare.com/1040GbE-Flareon-Server-IO-Adapters> NICs. (I didn't get a chance to play with open-onload before moving on to a new opportunity.)

Rudimentary testing with iperf showed I could saturate a 10gig link with minimal system load. With real-world traffic, the limits came when we started pushing packets in the several-hundred-thousand range. However, this was due to the fact that these "routers" were also doing firewall / NAT duty (iptables), load balancing (haproxy), VPN endpoints (openvpn), plus eBGP routing (quagga), and internally propagating OSPF routes as well (quagga). Interrupt handling / system load became a problem only when our hadoop cluster (200+ nodes) started crazy AWS S3 communications; otherwise things ran pretty well.

The systems, configurations and software were pretty much just hacked together by me. Ideally we would have bought Juniper / Cisco gear, but my budget of $50K wouldn't even buy half a router after my vendors were done quoting me the real stuff. I ended up spending ~$15K to build this solution. I'm not a networking person though, just a Linux hack, but I was able to get this solution working reliably.

-Philip
To inject science into the discussion: http://bsdrp.net/documentation/examples/forwarding_performance_lab_of_an_ibm... And its author maintains a test setup to check for performance regressions: http://bsdrp.net/documentation/examples/freebsd_performance_regression_lab

Now, this is using the in-kernel stack, not netmap/pfring/etc with all the batching-y, stack-shallow-y implementations that the kernel currently doesn't have. But there are people out there doing science on it and trying very hard to kick things along. The nice thing about what has come out of the DPDK-related stuff is, well, the bar is set very high now. Now it's up to the open source groups to stop messing around and do something about it.

If you're interested in more of this stuff, go poke Jim at pfsense/netgate.

-adrian (This and the RSS work is plainly in my "stuff I do for fun" category, btw.)
"For us, open source isn't just a business model; it's smart engineering practice." -- Bruce Schneier I hope I'm not the only one, but I think the NSA (and other state actors) intentionally introducing systemic weaknesses or backdoors into critical infrastructure is pretty ... reckless. I really can't figure out if it's arrogance or just plain naivety on their part, but they seem pretty confident that the information won't ever fall into the wrong hands and keep pushing forward. So for me, this is an area I've very interested in seeing some progress. I think most people don't realize that if you only care about 1G performance levels, commodity hardware can be more than fine. Linux netfilter makes a really great firewall, and it's the most peer-reviewed in the world. On Wed, Jan 28, 2015 at 6:18 PM, Adrian Chadd <adrian@creative.net.au> wrote:
-- Ray Patrick Soucy Network Engineer University of Maine System T: 207-561-3526 F: 207-561-3531 MaineREN, Maine's Research and Education Network www.maineren.net
participants (34)
- Adair Winter
- Adrian Chadd
- Alexander Neilson
- Baldur Norddahl
- Charles N Wyble
- Colin Johnston
- David bass
- Eddie Tardist
- Eduardo Meyer
- Eduardo Schoedler
- Faisal Imtiaz
- Hugo Slabbert
- Jim Shankland
- Joe Greco
- Joe Holden
- joel jaeggli
- Keith Medcalf
- Ken Chase
- Mark Tinka
- Mehmet Akcin
- micah anderson
- Mike Hammett
- Nick Hilliard
- Oliver Garraux
- Paul S.
- Pavel Odintsov
- Phil Bedard
- Philip
- Ray Soucy
- Robert Bays
- Scott Whyte
- Sudeep Khuraijam
- Tony Wicks
- Valdis.Kletnieks@vt.edu