On 3/5/23 7:00 PM, Matt Corallo wrote:
On 3/5/23 12:34 PM, Dave Taht wrote:
I rather enjoyed doing this podcast a few weeks ago (and enjoy this podcast a lot, generally), and it speaks to what I've been up to for the past year or so on fixing bufferbloat for ISPs.
https://packetpushers.net/podcast/heavy-networking-666-improving-quality-of-...
I am kind of curious as to how much XDP and eBPF now exist in the NANOG universe, and what other applications y'all are finding for them?
A while back I had to make fragments passing through Linux boxes kinda sorta reliable-ish. Sadly, Linux's fragment reassembly code has a wonderful timeout of 30 *seconds* for holding onto potential packets for reassembly, and if it has stuff it's holding on to it can throw away new stuff [1].
Because frag handling happens pretty early in receive, and usually on the NIC in send, rewriting packets in normal firewall rules can be annoying. Turns out this was easier to do by just slapping a "rewrite the packet to make it pretend it's not a frag and turn it back on on the other side" eBPF program on tc on the outbound side and XDP on the inbound edge.
It's all a bit low-level (gotta write your own packet reading), but incredibly powerful when you need to do something dirty (or more performant, in the Cloudflare case; dunno if they're doing the in-hardware XDP stuff or not).
Matt
[1] I tried to change it; the constant literally predates linux-in-git, but, hey, apparently sat links are more important to support: https://patchwork.kernel.org/project/netdevbpf/patch/fdcac2a0-5036-f1c8-a926...
To clarify, since it was pointed out sat links *are* important to support, I really should have said earth-mars links here. 30 seconds isn't a realistic sat link latency either these days, let alone jitter :). Honestly I'm not really sure where you could even find 30 seconds of jitter anywhere, but, hey the constant literally came from RFC 791 and we can't change that now, apparently.
Matt