On 19 Feb 2020, at 15:47, Daniel Sterling <sterling.daniel@gmail.com> wrote:
On Tue, Feb 18, 2020 at 8:05 PM Michael Brown <michael@supermathie.net> wrote:
Blocking a (for you) undesirable option when an established fallback exists is a much better end-user experience than introducing breakage into that option.
Or: I no longer use my ISP's IPv6 access (via 6rd) since it would cause terrible slowdowns due to packet loss when it broke.
All the +1s.
I have one goal for my home internet: it should work better than my cell phone!
In our house, "screen time" is "all the time". Everyone is on their phone non-stop. So you'd think getting fiber, installing UBNT APs and swapping out AT&T's CPE for a Core i5 Linux box would provide a better internet experience than a tree-obstructed cell tower a mile or two down the road.
But you'd be wrong.
Everyone in the house was, on a daily basis, turning off wifi in favor of AT&T LTE!
Why was everyone switching off wifi? I couldn't blame them -- I was toggling it on and off myself to get the occasional website or IM conversation to load. Why was my network broken?? How was it possible that a fairly high-latency mobile connection could provide a better experience than 802.11ac to an AP in the same room backed by a gigabit PON?
I banged against this for *years*. I punted on using my own router and tried just AT&T's CPE (reset to factory defaults). That does work decently, but there are some maddening quirks, not least of which is insanely high jitter.
I tried SOHO devices running vendor stock firmware driving hardware NAT. Those also work well -- until they inevitably crash.
I tried DD-WRT and OpenWrt. I tried AQM; I tried QoS. I tried NUCs running upstream kernels; downstream kernels; I tried custom patches. I tried HFSC and CoDel; I compiled iproute2 so I could have some tc_cake.
On the link-layer (wifi) side I tried one AP; two APs; three APs. I tested any number of combinations of SSID name, channel frequency and width. I tried with IPv6; no IPv6; I put all 2.4 GHz devices on their own AP; in desperation I even tried dedicating one AP and an entire 5 GHz frequency range to just one phone.
But nothing mattered until I finally figured it out:
It was DNS. Of course it was DNS. It's always DNS.
When DNS was solid and fast, everything else fell into place. Toggling wifi worked because it was the same as re-querying DNS! And the DNS service on mobile works well -- better than the average CPE.
*** I cannot stress this enough. No manner of tuning or tweaking to my home network stopped users from fleeing it, until I had solid DNS. ***
For fast DNS, you of course need fast UDP. And, as we've empirically discovered, well-behaved UDP is by no means guaranteed on residential connections.
It turns out dnsmasq has a couple of tunables that can make a huge difference to home internet DNS performance. First, you need to be querying the DNS servers AT&T hands you via DHCP. They're the least likely to be throttled, unfortunately. But I've found even that alone isn't enough.
You need to set dnsmasq's "query-port" option. By default the source port is random, to protect against CVE-2008-1447 (the Kaminsky cache-poisoning attack), but apparently sending a ton of random-source-port UDP traffic does not impress the AT&T network flow control systems, and your DNS traffic becomes unbearably slow (or is simply dropped entirely).
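For anyone who wants to see what a pinned source port means at the socket level, here is a minimal Python sketch. It is not how dnsmasq implements "query-port"; the helper names build_query and open_fixed_port_socket are made up for illustration, and the port you pass in is your own choice. Note that once the port is fixed, the random 16-bit query ID is about the only entropy left against spoofed answers.

```python
import secrets
import socket
import struct


def build_query(name, txid=None):
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    if txid is None:
        # With a fixed source port, the 16-bit ID is the main remaining entropy.
        txid = secrets.randbits(16)
    # Header: ID, flags (0x0100 = recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def open_fixed_port_socket(port, host="0.0.0.0"):
    """Return a UDP socket whose source port is pinned to `port` (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock
```

Every query sent on that socket with sock.sendto(build_query("example.com"), (server, 53)) then leaves from the same source port, so the upstream network sees one UDP flow per server rather than a new one per query.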
If dnsmasq supports DNS COOKIE, turn it on. DNS COOKIE provides protection against CVE-2008-1447, provided the other end supports DNS COOKIE, without having to play games with ports.
AT&T gives you two DNS servers via DHCP. You can query more -- 8.8.8.8, 4.2.2.2, 2606:4700:4700::1111 -- but if you do, you'll want to enable dnsmasq's "all-servers" option. Packets are cheap -- send a query to every server on your list and use whatever comes back first. If you've angered the UDP flow restrictor, no matter -- with luck at least one of your packets is going to a server that's up and not throttled or overloaded!
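The "send to everyone, take the first answer" idea can be sketched in a few lines of Python. This is only an illustration of the race, not dnsmasq's code: race_resolvers is a hypothetical helper, and fast, slow and broken are stand-ins for real upstream servers (the addresses are documentation-range placeholders).

```python
import concurrent.futures as cf
import time


def race_resolvers(resolvers, name, timeout=2.0):
    """Ask every resolver at once; return the first successful answer."""
    pool = cf.ThreadPoolExecutor(max_workers=len(resolvers))
    futures = [pool.submit(resolve, name) for resolve in resolvers]
    try:
        # as_completed yields futures in the order they finish.
        for fut in cf.as_completed(futures, timeout=timeout):
            if fut.exception() is None:
                return fut.result()  # first good answer wins
        raise RuntimeError("all resolvers failed")
    finally:
        pool.shutdown(wait=False)  # don't wait for the stragglers


# Hypothetical stand-ins for upstream servers.
def fast(name):
    time.sleep(0.01)
    return "192.0.2.1"


def slow(name):
    time.sleep(0.5)  # e.g. a throttled or overloaded server
    return "192.0.2.2"


def broken(name):
    raise OSError("query dropped")
```

Calling race_resolvers([slow, broken, fast], "example.com") skips the failed resolver and returns as soon as the quickest one answers, which is the behaviour you want when one of your upstream paths is being throttled.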
Of course DNS is just the beginning -- you still need a proper gateway device with a good NAT stack and/or firewall; you still need a strong wifi signal; you still need tc_cake so everyone can watch Netflix at the same time.
But DNS is the *core*. Nothing works well until DNS works well. That means nothing works well unless UDP works well. And if I have learned anything about AS7018, it's that UDP -- especially its v4 UDP -- Does. Not. Work. Well.
Enter QUIC. It may be the perfect transport-layer protocol; but by putting it on top of UDP it's hobbled. It breaks extant v4 internet in a way that nothing else we've yet seen does -- it takes what would be your TCP traffic and gives it inconsistent and intermittently poor performance. Maybe it's sometimes fast. Maybe it is. But I can tell you, it sometimes Is Very Much Not So.
As much as I would on principle rather not stick to a legacy, TCP-only home network -- I can say that right now, my home internet, blocking UDP 443, and making tons of insecure DNS queries -- is the most stable, fastest, most usable and enjoyable home internet I've ever had. And my users agree -- they no longer turn off wifi.
May I naively ask if Google staff have considered scrapping the use of UDP and instead proposing a new, first-class transport protocol that OSes can implement on top of IP? UDP certainly helped speed testing and iteration for QUIC in real-world scenarios, but I fear UDP is too brittle a foundation upon which to build the next generation of internet transport. Committing to UDP now with HTTP/3 may be a mistake.
And if that doesn't convince you, consider that even I was smart enough to figure out how to block it :)
-- Dan
--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: marka@isc.org