Changing the topic...

On Fri, May 12, 2023 at 7:11 AM Mark Tinka <mark@tinka.africa> wrote:
On 5/12/23 15:03, Dave Taht wrote:
LibreQoS is free software. Working as a bridge, it can be plugged in between any two points on your network, and on cheap (350 bucks off of eBay) Xeon Gold hardware it easily cracks 25Gbit while shaping, with a goal of cracking 100Gbit one day soon.
This is fantastic!
:blush: We have done a couple of podcasts about it, like this one: https://packetpushers.net/podcast/heavy-networking-666-improving-quality-of-... and have perhaps made a mistake by doing development and support, too invisibly, in Matrix chat rather than on a web forum, but it has been a highly entertaining way to get a better picture of the real problems caring ISPs have. I see you are in Africa? We have a few ISPs playing with this in Kenya...
I also found your post about it here:
https://www.reddit.com/r/HomeNetworking/comments/11pmc9a/a_latency_on_the_in...
If you could throw more hardware at it, could it do several hundred Gbps?
We do not know. Presently our work is supported by Equinix's open source program, with four servers in their Dallas DC, and they have 25Gbit ports. Putting together enough dough to get to 100Gbit, or finding someone willing to send traffic through more bare metal at that data center or elsewhere, is on my mind. In other words, we can easily spin up the ability to L2 route some traffic through a box in their DCs, if only we knew where to find it. :)

If you assume linearity to cores (which is a lousy assumption, ok?), 64 Xeon cores could do about 200Gbit, running flat out. I am certain it will not scale linearly and we will hit multiple bottlenecks on the way to that goal.

Limits we know about:

A) Trying to drive 10s of Gbits of realistic traffic through this requires more test clients and servers than we have, or someone with daring and that kind of real traffic in the first place. For example, one of our most gung-ho clients has 100Gbit ports, but not anywhere near that amount of inbound traffic. (They are crazy enough to pull git head, try it for a few minutes in production, and then roll back or leave it up.)

B) In a brief test, a 64 core AMD + Nvidia ethernet was severely outperformed by our current choice of a 20 core Xeon Gold + Intel 710 or 810 card. It is far more the ethernet card that is the dominating factor. I would kill if I could find one that did a LPM -> CPU mapping... (e.g. instead of a LPM -> route mapping, LPM to what CPU to interrupt). We also tried an 80 core ARM early on, to inconclusive results. Tests of the latest Ubuntu release are ongoing. I am not prepared to bless that or release any results yet.

C) A single cake instance on one of the more high end Xeons can *almost* push 10Gbit/sec while eating a core.

D) Our model is one cake instance per subscriber + the ability to establish trees emulating links further down the chain. One ISP is modeling 10 mmwave hops. Another is just putting in multiple boxes closer to the towers.
So in other words, 100s of Gbits is achievable today if you throw boxes at it, and it is more cost effective to do it that way. We will, of course, keep striving to crack 100Gbit native on a single box with multiple cards. It is a nice goal to have.

E) In our present target markets, 10k typical residential subscribers only eat 11Gbit/sec at peak. A LOT of the smaller ISPs and networks fit into that space, so of late we have been focusing more on analytics and polish than on pushing more traffic. Some of our new R/T analytics break down at 10k cake instances (that is 40 million fq_codel queues, ok?), and we cannot sample at 10ms rates, falling back to (presently) 1s, conservatively.

We are nearing putting out a v1.4-rc7, which is just features and polish. You can get a .deb of v1.4-rc6 here: https://github.com/LibreQoE/LibreQoS/releases/tag/v1.4-rc6

There is an optional, anonymized reporting facility built into that. In the last two months, 44404 cake shaped devices shaping .19Tbits that we know of have come online. Aside from that we have no idea how many ISPs have picked it up! A best guess would be well over 100k subs at this point.

Putting in LibreQoS is massively cheaper than upgrading all the CPE to good queue management (it takes about 8 minutes to get it going in monitor mode, but exporting shaping data into it requires glue, and time), but better CPE remains desirable, especially that the uplink component of the CPE also do sane shaping natively.

"And dang it, ISPs of the world, please ship decent wifi!?", because we can see the wifi going south in many cases from this vantage point now. In the past year Mikrotik in particular has done a nice update to fq_codel and cake in RouterOS, eero 6s have got quite good, much of openwifi/openwrt, evenroute is good...

It feels good, after 14 years of trying to fix the internet, to be seeing such progress on fixing bufferbloat, and in understanding and explaining the internet better. joooooiiiiiiiin us..
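For anyone who wants to check the numbers in E) themselves, here is a rough back-of-the-envelope sketch. It assumes each cake instance runs in diffserv4 mode (4 tins) with cake's 1024 flow queues per tin, which is not stated above, and the function names are made up for illustration. The last figure simply divides the two reported totals; what ".19Tbits" measures exactly is not spelled out in this mail.

```python
# Back-of-the-envelope checks for the figures quoted in this mail.
# ASSUMPTION (not stated in the thread): every cake instance uses
# diffserv4 (4 tins) and cake's 1024 flow queues per tin.
TINS_PER_CAKE = 4
QUEUES_PER_TIN = 1024


def total_queues(cake_instances: int) -> int:
    """Flow queues across all cake instances under the assumption above."""
    return cake_instances * TINS_PER_CAKE * QUEUES_PER_TIN


def mean_rate_mbit(total_bits_per_sec: float, count: int) -> float:
    """Average rate per subscriber or device, in Mbit/s."""
    return total_bits_per_sec / count / 1e6


if __name__ == "__main__":
    # 10k cake instances -> roughly the "40 million queues" mentioned above
    print(total_queues(10_000))                        # 40960000
    # 10k residential subscribers eating 11 Gbit/s at peak
    print(round(mean_rate_mbit(11e9, 10_000), 2))      # 1.1
    # 44404 reported devices shaping 0.19 Tbit/s in total
    print(round(mean_rate_mbit(0.19e12, 44_404), 2))   # 4.28
```

So the "40 million" is plausibly 10,000 x 4 x 1024, and the peak figure works out to about 1.1 Mbit/s per residential subscriber.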
Also, when you say "bridge", if the server dies, does it become a wire, or would that require specialized hardware builds?
What we do now is put it inline with OSPF/OLSR/BGP at a low cost, with a plain wire at a higher cost in parallel that takes over if it fails. You have to watch out for breaking spanning tree in that case. Things have stabilized a lot in the last few months; the last crash I can remember was in January. (In Rust we trust!) The most common install bug is someone flipping the inbound and outbound interfaces in the setup.

Among other things, we replaced the Linux native bridge code with about 600 lines of eBPF C. The enormous speedup from that is getting us closer to what DPDK could do, but DPDK cannot queue worth a darn, just forward willy-nilly.

I hope, in particular, that far, far more folk start leveraging variants of doing in-band measurements with pping. The standalone code for that is here: https://github.com/thebracket/cpumap-pping
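The low-cost/high-cost failover arrangement described at the top of this reply can be pictured with a toy model. This is only a sketch of the idea, not router configuration and not anything LibreQoS ships: the path names, the costs, and the `best_path` helper are all hypothetical, and in practice the routing protocol itself makes this choice.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Path:
    name: str   # hypothetical label, for illustration only
    cost: int   # routing metric; lower is preferred
    up: bool    # whether the routing protocol still sees the path as alive


def best_path(paths: List[Path]) -> Optional[Path]:
    """Return the lowest-cost path that is still up, or None if all are down."""
    live = [p for p in paths if p.up]
    return min(live, key=lambda p: p.cost, default=None)


if __name__ == "__main__":
    # Normal operation: the shaper path has the lower cost and wins.
    paths = [Path("via-shaper", 10, True), Path("bypass-wire", 100, True)]
    print(best_path(paths).name)   # via-shaper

    # The shaper box dies: traffic falls back to the higher-cost wire.
    paths = [Path("via-shaper", 10, False), Path("bypass-wire", 100, True)]
    print(best_path(paths).name)   # bypass-wire
```

The point is only that the bypass never carries traffic while the shaper path is healthy, so no specialized bypass hardware is needed in this arrangement.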
Mark.
--
Podcast: https://www.linkedin.com/feed/update/urn:li:activity:7058793910227111937/
Dave Täht CSO, LibreQos