You've also got fast retransmit, New Reno, BIC/CUBIC, as well as host parameter caching to limit the effect of packet loss on recovery time. I don't doubt that someone else could do a better job than I did in this field, but I'd be really curious to know how much of an effect an intermediary router can have on a TCP flow with SACK without introducing more packet loss than anyone would put up with for interactive sessions.
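For what it's worth, here's a minimal sketch of picking the congestion control algorithm per socket, assuming a Linux host and Python 3.6+ (TCP_CONGESTION is a Linux-specific option; SACK itself is negotiated at connection setup and toggled system-wide via net.ipv4.tcp_sack, not per socket):

    import socket

    # Linux exposes its pluggable congestion control modules per socket.
    # "cubic" is the usual default; "reno" falls back to NewReno behavior.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, b"cubic")

    # Read it back; the kernel returns a NUL-padded algorithm name.
    print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16))
    s.close()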
my takeaway from the web site was that one of the ways p2p is bad is that it tends to start several parallel tcp sessions from the same client (think of bittorrent, where you're fetching pieces of a file from several peers at once). since each session has its own state machine, each will independently probe the end-to-end bandwidth-delay product. thus, on headroom-free links, each will get 1/Nth of that link's bandwidth, so a client running M>1 sessions gets M/Nths in aggregate (rough numbers below), and apparently this is unfair to the other users depending on that link. i guess i can see the point, if i squint just right. nobody wants to get blown off the channel because someone else gamed the fairness mechanisms.

(on the other hand, some tcp stacks are deliberately overaggressive in ways that don't require M>1 connections to grab M/Nths of a link's bandwidth. on the internet, generally speaking, if someone else says fairness be damned, then fairness will be damned.)

however, i'm not sure that all tcp sessions having one endpoint in common, or even all those having both endpoints in common, ought to share fate. one of those endpoints might be a NAT box with M>1 users behind it, for example.

in answer to your question about SACK: it looks like they simulate a slower link speed for all tcp sessions that they guess are in the same flow-bundle. thus, all sessions in that flow-bundle see a single shared contributed bandwidth-delay product from any link served by one of their boxes (a toy model of that below, too).
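to put rough numbers on the 1/Nth point (mine, not the site's): suppose a bottleneck link carries one connection each from some well-behaved users plus M parallel connections from one p2p client. loss-based tcp tends toward an equal share per *connection*, not per user, so:

    # back-of-envelope fairness arithmetic; all numbers are made up.
    def shares(single_conn_users, m_parallel):
        total_conns = single_conn_users + m_parallel
        per_conn = 1.0 / total_conns            # rough per-connection share
        return per_conn, m_parallel * per_conn  # (single user, p2p client)

    single, p2p = shares(9, 10)  # 9 one-connection users, 1 client with M=10
    print(f"each single-connection user: {single:.1%}")  # ~5.3%
    print(f"the M=10 client:             {p2p:.1%}")     # ~52.6%

so instead of a tenth of the link the multi-connection client hauls in over half of it, which i take to be the complaint.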
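and on the SACK question, my guess at what "simulate a slower link speed" amounts to is a single rate limiter shared by every session the box has sorted into the same flow-bundle. a toy token-bucket version, with names and numbers i made up, nothing from the vendor:

    import time

    # toy model: one token bucket shared by every tcp session guessed
    # into the same flow-bundle, so the whole bundle sees one rate.
    class BundleShaper:
        def __init__(self, rate_bytes_per_sec, burst_bytes):
            self.rate = rate_bytes_per_sec
            self.capacity = burst_bytes
            self.tokens = burst_bytes
            self.last = time.monotonic()

        def allow(self, packet_len):
            now = time.monotonic()
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= packet_len:
                self.tokens -= packet_len
                return True   # forward the packet
            return False      # drop/delay it; every session in the bundle feels it

    # ~1 Mbit/s for the whole bundle, regardless of how many sessions it holds
    bundle = BundleShaper(rate_bytes_per_sec=125_000, burst_bytes=32_000)

since every session draws from the same bucket, opening more connections doesn't buy the bundle any more bandwidth through that hop, which would make SACK's faster loss recovery mostly beside the point there.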