On Sat, Dec 2, 2023 at 2:30 AM Stephen Satchell <list@satchell.net> wrote:
On 12/1/23 5:27 PM, Mike Hammett wrote:
It would be better to keep the government out of it altogether, but that has little chance of happening.
I agree. But I do have a question: is there a Best Practices RFC in the existing corpus for setting buffer sizes? The Internet community has been pretty good at setting reasonable standards and recommendations, so a pointer to a BCP or RFC would go much further toward solving the bufferbloat problem; I believe administrators would prefer such a "suggestion" to ham-handed regulation.
I too! The IETF recommends universal AQM: see RFC 7567. However, as a consensus document, it was impossible to get the working group to agree that fair (flow) queuing should also be deployed universally, although it is discussed extensively there. I feel that that technology - breaking up bursts, letting all flows to a destination multiplex better, and isolating problematic flows - is more important than AQM. A lot of FQ is implicit: packet pacing from the hosts does it end to end, and switches naturally multiplex traffic arriving from multiple ports. But anywhere it is not implicit, having it explicitly helps when downshifting from one rate to another; a pure AQM otherwise tends to react late to bursts. "Flow queuing" is a real advance over conventional fair queueing, IMHO.

PIE has a delay target of 16ms, CoDel 5ms - but these are targets that take a while to hit: both absorb bursts, then drain the queue gradually to a steady, smaller state. I highly recommend VJ's talk on this: https://www.bufferbloat.net/projects/cerowrt/wiki/Bloat-videos/#van-jacobson... (he has also been heavily involved in BBR and so many other things.)

If all you have is a FIFO, I personally would recommend no more than 30ms of buffering, in a byte FIFO if available (see the sketch below for the arithmetic). A packet FIFO limited that way might have trouble swallowing enough ACKs. Either way - with the advent of packet pacing from the hosts, some pretty good experimental evidence that 100ms is too much (I can supply more links), and the rise of unclassified interactive traffic like WebRTC with tighter delay constraints - I still lean strongly towards 30ms as the outside figure in most cases.

Aside from incast traffic, you only need even this much buffering when a link is oversubscribed. Try not to do that, but test. We got back a lot of good data from the Level 3 outage showing that a lot of the core seemed to have 250ms of buffering, or more. I can dig up that research.

For more RFCs from the now-closed IETF AQM working group, see: https://datatracker.ietf.org/group/aqm/documents/
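To make the 30ms figure concrete, here is a back-of-the-envelope sketch in Python. The function name and the 1 Gbit/s example rate are my own illustration, not from any RFC:

def fifo_limit_bytes(link_rate_bps: float, max_delay_s: float = 0.030) -> int:
    # Deepest FIFO (in bytes) that keeps worst-case queuing delay
    # at or below max_delay_s on a link running at link_rate_bps.
    return int(link_rate_bps * max_delay_s / 8)

# A 1 Gbit/s port capped at 30ms of delay holds at most ~3.75 MB:
print(fifo_limit_bytes(1e9))          # 3750000 bytes
# The ~250ms we inferred in parts of the core implies ~31 MB:
print(fifo_limit_bytes(1e9, 0.250))   # 31250000 bytes

The arithmetic scales linearly: halve the link rate and you halve the byte limit for the same delay bound.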
But that's just me. I do know there has been academic research on the subject, but don't recall seeing the results published as a practical operational RFC.
I too would like a practical operational RFC, and one is becoming increasingly practical, but the big vendors are lagging behind on support for advanced FQ and AQM techniques. There has been some decent work in P4. In my case, on problematic links I just slap CAKE in front of them on a whitebox (a sketch of what I mean follows below). An example of the generally pleasing results is here: https://blog.cerowrt.org/post/juniper/ (that post also references the debate over sizing buffers relative to the BDP and the number of flows). It also shows the degradation in TCP performance once buffers crack 250ms: a sea of retransmits with ever-decreasing goodput.

Cisco has AFD (approximate fair drop), but I have zero reports from the field from anyone configuring it. I hear good things about Juniper's RED implementation, but I have not torn it apart, and few people I know of configure it.

I would love it if more people slapped LibreQos on an oversubscribed link (it is good to well past 25Gbit and pushes CAKE to about 10Gbit per core, per destination) and observed what happened to user satisfaction, packet drops, RFC 3168 ECN marking, flow collisions in the 8-way set-associative hash, and so on. We have produced some good tools for this, notably TCP RTT sampling, as well as nearly-live "mice and elephants" plots after the 2002 paper on the subject.
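For the "slap CAKE in front of it" step, a minimal sketch of what that looks like on a Linux whitebox follows. The interface name, the 1 Gbit/s rate, and the shave-5%-off-the-rate rule of thumb are illustrative assumptions on my part, not a recipe from the blog post:

import subprocess

def shape_with_cake(dev: str, link_mbit: int) -> None:
    # Install CAKE at the root of dev, shaped slightly below the
    # physical rate so the queue forms here, where CAKE manages it,
    # rather than in the bufferbloated device downstream. Needs root
    # and the sch_cake qdisc (in mainline Linux since 4.19).
    shaped = int(link_mbit * 0.95)  # shave ~5%; tune to your link
    subprocess.run(
        ["tc", "qdisc", "replace", "dev", dev,
         "root", "cake", "bandwidth", f"{shaped}mbit"],
        check=True,
    )

# Hypothetical example: whitebox port eth0 feeding a 1 Gbit/s circuit.
shape_with_cake("eth0", 1000)

Running "tc -s qdisc show dev eth0" afterwards reports drops, ECN marks, and per-tin delay, which is where the interesting data starts.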
(And this is very much on-topic for NANOG, as it is about encouraging our peers to implement effective operation in their networks, and in their connections with others.)
I too encourage everyone.

--
:( My old R&D campus is up for sale: https://tinyurl.com/yurtlab
Dave Täht
CSO, LibreQos