On 17 July 2018 at 09:54, Saku Ytti <saku@ytti.fi> wrote:
On Tue, 17 Jul 2018 at 10:53, James Bensley <jwbensley@gmail.com> wrote:
Virtually any modern-day laptop with a 1G NIC will saturate a 1G link using UDP traffic in iPerf with ease. A crummy i3 netbook with a 1G NIC can do it on one core/thread.
I guess if you use large packets this might be true. But personally, if I'm testing a network, I'm interested in latency, jitter, packet loss, and pps goals as well, not just a bps goal.
Hi Saku, Yeah, I fully agree with what you are saying; however, the OP's question sounds like he "only" needed to prove bandwidth. With 1500 byte frames I've run it up to nearly 10Gbps before (it was between VMs in two different DCs that were having slow transfers, and the hypervisors had 10G NICs, so I dare say that on bare metal with large frames it will do 10Gbps).
And I've never seen clean 1Gbps on iPerf with small packets. It just cannot be done; even if iPerf were written half decently and used recvmmsg, it still wouldn't be anywhere near. Clean 1Gbps with small packets in user space is actually very much doable today, but you can't use a UDP socket: you must use AF_PACKET on Linux or BPF on OSX, and with those you can write a portable 1Gbps UDP sender/receiver. I'm very surprised we don't have an iPerf-like program for netengs which does this and reports latency, jitter, and packet loss, with a binary search for the highest lossless pps/bps rates.
I absolutely agree there is a gap in the open source market for this exact application: a tool that sends traffic between Tx and Rx (or bidirectionally) at a specified frame size and frame rate, and which can max out 10Gbps at 64 byte frames if required (I say 10Gbps instead of 1Gbps because 10Gbps as an access circuit speed is becoming increasingly common). Throughout the test it should report RTT, one-way latency, jitter, packet loss etc., and then output the results in a format that is easy to parse. It should also have a JSON API and be able to run in a "daemon" mode, like an iPerf server that is always on, ready for people to test to/from.
I started to write one with Anton Aksola in Rust (using libpnet[0]), and implemented quite a flexible protocol (server/client; over a JSON-based protocol the client can ask the server exactly what kind of packet to construct/expect and what rate to send/receive), so you could also use it to DDoS your router's control plane in a lab etc. And we actually got it working on OSX+Linux at ~wire rate (it still needs a higher-end laptop to do 1.5Mpps on a single core, and we didn't implement multicore support). But as both of us are trash in Rust (and every other applicable language in this domain), we kind of dropped the project once we had a sufficient POC running on our laptops. Someone who can actually code could easily implement such a program in a weekend. I'm happy to share the trash we've done if someone intends to check this box in the open source world; use it for inspiration, or just straight up add polish and enough CLI to make it usable as-is.
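A control message in such a JSON-based protocol might look something like the following. This is purely illustrative (the field names are invented, not the actual protocol from the Rust POC), but it shows the shape of "client tells server what to construct/expect and at what rate":

```json
{
  "action": "start_flow",
  "frame_size": 64,
  "rate_pps": 1488095,
  "duration_sec": 30,
  "report": ["latency", "jitter", "loss"]
}
```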
I went through a similar process. AF_PACKET is definitely what you need if you want to stay in user space on Linux (I don't know about macOS, I only use Linux). I wrote a basic multi-threaded load generator and load sinker (Tx and Rx) in C, using various kernel methods (send(), sendmsg(), sendmmsg(), and PACKET_MMAP) with AF_PACKET, to compare them all: https://github.com/jwbensley/EtherateMT

The problem is that C is a great language for writing high performance stuff, but a shit language for creating a JSON API. I have two back-to-back lab servers at work with 10G links between them, low end 2.1GHz Xeons; I get 1Mpps per core, and 8 cores minus 1 for the OS means I max out at 7Mpps :(

I know that XDP is coming to Linux user space, so we'll see where that goes, as it promises the magic performance levels we want. Also, TPACKETv4 is coming for AF_PACKET in Linux, which should also get us to that magic level of performance in user land (it is effectively kernel bypass). I'll add it to EtherateMT when I get some time, to check its performance: https://lwn.net/Articles/737947/

So EtherateMT works OK as a proof of concept, but nothing more. It requires 100% CPU utilisation to send/receive at such high pps rates; there is no CPU time left for stats collection or fancy RTT/latency/jitter reporting. That can only be done (right now) with something like DPDK, because then we only need one or two cores for Tx/Rx and have free cores left for stats collection/generation etc. I looked into MoonGen, which creates Lua bindings for DPDK, meaning you can rapidly develop DPDK-based tools without knowing much about DPDK.
It had some RFC2544 Lua scripts for DPDK, and I started to re-write them as they were old and didn't work with the latest version of MoonGen: https://github.com/jwbensley/MoonGen-Scripts The throughput script works OK-ish (10Gbps on one core, no problem): https://github.com/jwbensley/MoonGen-Scripts/blob/master/throughput.lua Lua would allow one to easily provide parseable output and to more easily implement a JSON API. However, since MoonGen uses DPDK, we can only use the NICs that DPDK supports and not "any Ethernet NIC supported by Linux", which is what I really want, hence AF_PACKET + TPACKETv4.
I think a very important quality is being multiplatform with static binaries, because an important use case is that you can ask a modestly informed customer to copy-paste one line to download the server and copy-paste another line to have it running. If the use case is that both ends have arbitrarily clued people, then there are plenty of good solutions, like Cisco's trex[1]. But what I need is an iPerf-like program which actually a) performs and b) reports the correct things.
Yeah, agreed, so DPDK is out the window for me for this specific requirement; it's Linux only (ignoring the minor level of BSD support) and NIC specific too. Python *yuk* is multi-OS, it has JSON libraries, and it has some support for AF_PACKET: https://stackoverflow.com/questions/1117958/how-do-i-use-raw-socket-in-pytho... I don't know enough about it, but it might be that TPACKETv4 could be leveraged through Python. That still only covers Linux though, as Windows and macOS have very different network stacks (but then again I do only care about Linux sooooo...). I'm keen to have another go at this problem now that I've got a better understanding of it, having written EtherateMT and played with DPDK etc. Not sure where to go though, so just waiting on TPACKETv4 right now. Cheers, James.