Hey Adam,

I’m looking for a practical guide – i.e. specifically NOT an academic paper, thanks anyway – to predicting the effect of increased (or decreased) latency on my user’s applications.


This makes answering difficult, because you may be looking for a simpler answer than is available.

Specifically, I want to estimate how much improvement there will be in {bandwidth, application XYZ responsiveness, protocol ABC goodput, whatever} if I decrease the RTT between the user and the server by 10msec, or by 20msec, or by 40msec.


Single-flow TCP throughput is bounded by TcpWindow / RTT. All modern stacks support TCP window scaling (which does not mean every device has it), allowing the window to grow beyond the default 0xffff bytes. However, most TCP implementations burst the window growth, and the window grows exponentially. This means that if there is a speed step-down in transit (100G => 10G or similar, i.e. below the sender's rate), the transit device needs to be able to absorb the window-growth amount of bytes; otherwise there is packet loss, the window cannot grow, and a single flow cannot attain max_rate. So with lower RTT you need to buffer less, and devices with tiny buffers can be a lot cheaper than devices with large buffers.
Even though most OSes support window scaling, there is often a limit on how large the window is allowed to grow, because it is an attack vector for a DRAM-exhaustion DoS.

Example1, no window scaling, 100ms:
0xffff bytes / 100ms == 5.2Mbps

Example2, no window scaling, 60ms:
0xffff bytes / 60ms == 8.7Mbps

Example3, no window scaling, 5ms:
0xffff bytes / 5ms == 104.8Mbps

I believe you can multiply these numbers by 8x to get the performance your modern Windows users experience, as I believe Windows restricts window size by limiting the allowable scaling factor to 8x.
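
The arithmetic in Example1 through Example3 can be reproduced with a short Python sketch. The function name, the `scale` parameter and the 100ms baseline in the second loop are illustrative assumptions, and the 8x multiplier is only the belief stated above, not a verified Windows limit. The result is an upper bound for a single flow with no loss, not a prediction of real application behaviour.

```python
WINDOW_BYTES = 0xffff  # default TCP window without scaling


def max_throughput_mbps(window_bytes, rtt_ms, scale=1):
    """Upper bound on single-flow TCP throughput: window / RTT, in Mbit/s.

    `scale` is a plain multiplier on the window (hypothetical helper
    parameter, e.g. 8 for the 8x figure mentioned in the text).
    """
    window_bits = window_bytes * scale * 8
    rtt_seconds = rtt_ms / 1000
    return window_bits / rtt_seconds / 1e6


# Examples 1-3: same window, different RTTs.
for rtt_ms in (100, 60, 5):
    plain = max_throughput_mbps(WINDOW_BYTES, rtt_ms)
    scaled = max_throughput_mbps(WINDOW_BYTES, rtt_ms, scale=8)
    print(f"{rtt_ms:>3} ms: {plain:7.1f} Mbps unscaled, {scaled:7.1f} Mbps at 8x")

# Effect of shaving 10, 20 or 40 ms off an assumed 100 ms baseline.
BASE_RTT_MS = 100
for cut_ms in (10, 20, 40):
    before = max_throughput_mbps(WINDOW_BYTES, BASE_RTT_MS)
    after = max_throughput_mbps(WINDOW_BYTES, BASE_RTT_MS - cut_ms)
    print(f"-{cut_ms} ms: {before:.1f} -> {after:.1f} Mbps ({after / before:.2f}x)")
```

Because the bound is window / RTT, the relative gain from an RTT cut depends only on the ratio of old to new RTT, so the same 10ms saving matters far more on a 20ms path than on a 100ms one.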

Example4, arbitrary window scaling, 100ms, 10Gbps receiver, 100Gbps sender:
100ms * 10Gbps == 125MB. You need the TCP window to scale to 125MB to achieve 10Gbps on this path, and you need the 100G=>10G step-down device to have a 62.5MB buffer while the window grows from 62.5MB to 125MB and the sender bursts that data @ 100Gbps.
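
The Example4 numbers follow from the bandwidth-delay product. Below is a minimal Python sketch; the function names are hypothetical, and the buffer estimate assumes the last window doubling is bursted in full and ignores whatever the 10G side drains during the burst, so it should be read as a rough upper bound.

```python
def bdp_bytes(bottleneck_bps, rtt_ms):
    """Bandwidth-delay product: bytes in flight needed to fill the path."""
    return bottleneck_bps * (rtt_ms / 1000) / 8


def stepdown_buffer_bytes(bottleneck_bps, rtt_ms):
    """Rough buffer needed at the step-down device.

    Assumption: the final window doubling (from BDP/2 to BDP) arrives
    as one burst, and draining during the burst is ignored.
    """
    return bdp_bytes(bottleneck_bps, rtt_ms) / 2


RECEIVER_BPS = 10e9  # 10Gbps bottleneck, as in Example4

for rtt_ms in (100, 60, 5):
    window_mb = bdp_bytes(RECEIVER_BPS, rtt_ms) / 1e6
    buffer_mb = stepdown_buffer_bytes(RECEIVER_BPS, rtt_ms) / 1e6
    print(f"{rtt_ms:>3} ms: window {window_mb:7.2f} MB, buffer {buffer_mb:7.2f} MB")
```

Both the required window and the required buffer scale linearly with RTT, which is why a lower RTT lets you get away with smaller buffers in transit.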

Example4 is easy to fix: instead of bursting the window growth, do bandwidth estimation and send the window growth at the estimated receiver rate, which removes almost all need for transit buffering. However, if we were to migrate to such congestion control, classical congestion control like Reno would out-compete it during congestion. Well-behaved TCP streams would get less and less capacity as their bandwidth estimate kept going down, and Reno would win more and more capacity.

Ultimately, this goes into MY calculator – we have the usual north-american duopoly on last-mile consumer internet here; I’m connected directly to only one of the two.  There’s a cost $X to improve connectivity so I’m peered with both, how do I tell if it will be worthwhile?


Based on your questions, I'd estimate you do not have a business case for optimizing latency if it carries a premium. Those who would benefit most are probably competitive gamers, but they may on average be bad users who are subsidized by light users, so it's not entirely clear that you want to invest to attract them.


-- 
 ++ytti