TCP congestion control and large router buffers
https://gettys.wordpress.com/2010/12/06/whose-house-is-of-glasse-must-not-th...

I wonder why this hasn't made the rounds here. From what I see, a change in this part (e.g. lower buffers in customer routers, or a change (yet another) to the congestion control algorithms) would do miracles for end-user perceived performance and should help in some way with the net neutrality dispute. I also understand that a lot of the people here operate routers which are a bit far from the end-users and don't have a lot to do with this issue, but the rest should have something to do with choosing/configuring these end-user devices, so this should be relevant.

-- Regards, Vasil Kolev
On Thu, 9 Dec 2010, Vasil Kolev wrote:
I wonder why this hasn't made the rounds here. From what I see, a change in this part (e.g. lower buffers in customer routers, or a change (yet another) to the congestion control algorithms) would do miracles for end-user perceived performance and should help in some way with the net neutrality dispute.
I'd say this is common knowledge and has been for a long time. In the world of CPEs, lowest price and simplicity is what counts, so nobody cares about buffer depth and AQM, that's why you get ADSL CPEs with 200+ ms of upstream FIFO buffer (no AQM) in most devices. Personally I have MQC configured on my interface which has assured bw for small packets and ssh packets, and I also run fair-queue to make tcp sessions get a fair share. I don't know of any non-cisco devices that do this.

-- Mikael Abrahamsson email: swmike@swm.pp.se
On 12/9/10 7:20 AM, Mikael Abrahamsson wrote:
On Thu, 9 Dec 2010, Vasil Kolev wrote:
I wonder why this hasn't made the rounds here. From what I see, a change in this part (e.g. lower buffers in customer routers, or a change (yet another) to the congestion control algorithms) would do miracles for end-user perceived performance and should help in some way with the net neutrality dispute.
I'd say this is common knowledge and has been for a long time.
In the world of CPEs, lowest price and simplicity is what counts, so nobody cares about buffer depth and AQM, that's why you get ADSL CPEs with 200+ ms of upstream FIFO buffer (no AQM) in most devices.
You're going to see more of it; at a minimum, CPE are going to have to be able to drain a gig-e into a port that may be only 100Mb/s. The QoS options available in a ~$100 CPE router are adequate for the basic purpose. The D-Link DIR-825 or 665 are examples of such devices.
Personally I have MQC configured on my interface which has assured bw for small packets and ssh packets, and I also run fair-queue to make tcp sessions get a fair share. I don't know of any non-cisco devices that do this.
The consumer CPE that care seem to be mostly oriented around keeping gaming and VoIP from being interfered with by P2P and file transfers.
On 12/19/2010 02:16 PM, Joel Jaeggli wrote:
On 12/9/10 7:20 AM, Mikael Abrahamsson wrote:
On Thu, 9 Dec 2010, Vasil Kolev wrote:
I wonder why this hasn't made the rounds here. From what I see, a change in this part (e.g. lower buffers in customer routers, or a change (yet another) to the congestion control algorithms) would do miracles for end-user perceived performance and should help in some way with the net neutrality dispute.
It's really hard to replace all the home users' hardware. Trying to "fix" the problem by fixing all of that is much more painful (and expensive) than fixing the network to not have the buffers.
I'd say this is common knowledge and has been for a long time.
Common knowledge among whom? I'm hardly a naive Internet user. And the statement is wrong: the large router buffers have effectively destroyed TCP's congestion avoidance altogether.
In the world of CPEs, lowest price and simplicity is what counts, so nobody cares about buffer depth and AQM, that's why you get ADSL CPEs with 200+ ms of upstream FIFO buffer (no AQM) in most devices.
200ms is good; but it is often up to multiple *seconds*. Resulting latencies on broadband gear are often horrific: see the netalyzr plots that I posted in my blog. See: http://gettys.wordpress.com/2010/12/06/whose-house-is-of-glasse-must-not-thr...

Dave Clark first discovered bufferbloat on his DSLAM: he used the 6 second latency he saw to DDOS his son's excessive WOW playing.

All broadband technologies are affected, as are, it turns out, all operating systems and likely all home routers as well (see other posts I've made recently). DSL, cable and FIOS all have problems.

How many retail ISPs' service calls have been due to this terrible performance? I know I was harassing Comcast with multiple service calls over a year ago over what I now think was bufferbloat, and periodically for a number of years before that (roughly since DOCSIS 2 deployed, I would guess). "The Internet is slow today, Daddy" was usually Daddy saturating the home link, and bufferbloat the cause. Every time they would complain, I'd stop what I was doing, and the problem would vanish. A really nice will-o'-the-wisp...
You're going to see more of it; at a minimum, CPE are going to have to be able to drain a gig-e into a port that may be only 100Mb/s. The QoS options available in a ~$100 CPE router are adequate for the basic purpose.
But the port may only be 1 Mb/second; 802.11g is 20Mbps tops, but drops to 1Mbps in extremis. So the real dynamic range is at least a factor of 1000 to 1.
The D-Link DIR-825 or 665 are examples of such devices.
Yes, and E3000s and others. Some are half measures, with a single knob for shaping both uplink and downlink bandwidth. The QoS features in home routers can help, but do not solve all problems, in part because as broadband bandwidth increases, the bottleneck often shifts to the links between the home router and edge devices, and there are similar (or even worse) bufferbloat problems in both the home routers and the operating systems.
Personally I have MQC configured on my interface which has assured bw for small packets and ssh packets, and I also run fair-queue to make tcp sessions get a fair share. I don't know of any non-cisco devices that do this.
The consumer CPE that care seem to be mostly oriented around keeping gaming and VoIP from being interfered with by P2P and file transfers.
An unappreciated issue is that these buffers have destroyed congestion avoidance for TCP (and all other congestion-avoiding protocols).

Secondly, any modern operating system (anything other than Windows XP) implements window scaling, and will, within about 10 seconds, *fill* the buffers with a single TCP connection, and they stay full until traffic drops enough to allow them to empty (which may take seconds). Since congestion avoidance has been defeated, you get nasty behaviour out of TCP. Congestion avoidance depends on *timely* notification of congestion to the end points: these buffers have destroyed that timeliness, which is a fundamental presumption of internet protocol design.

If you think that simultaneously:
1) destroying congestion avoidance,
2) destroying slow-start, as many major web sites are doing by increasing their initial window,
3) having browsers use many TCP connections simultaneously,
4) shifting TCP traffic to window scaling, so that even a single TCP connection can fill these buffers, and
5) increasing the number of large uploads/downloads (not just bittorrent: HD movie delivery to disk, backup, crash dump uploads, etc.)
is a good idea, you aren't old enough to have experienced the NSFnet collapse during the 1980's (as I did). I have post-traumatic stress disorder from that experience; I'm worried about the confluence of these changes, folks.

And there are network neutrality aspects to bufferbloat: since carriers have been provisioning their telephony service separately from their internet service, *and* there are these bloated buffers, *and* there is no classification that end users can perform over their broadband connections, you can't do as well as a carrier, even with fancy home routers, for any low-latency service such as voip. See: http://gettys.wordpress.com/2010/12/07/bufferbloat-and-network-neutrality-ba...

Personally, I don't think this was done with malice aforethought, but it's not a good situation.

The best you can do is what Ooma has done; bandwidth shaping along with being closest to the broadband connection (or by fancy home routers with classification and bandwidth shaping). That won't help the downstream direction, where a single other user (or yourself) can inject large packet bursts routinely by browsing web sites like YouTube or Google images (unless some miracle occurs, and the broadband head ends are classifying traffic in the downstream direction over those links). - Jim
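Jim's "within about 10 seconds a single connection fills the buffers" point is easy to see in a toy model. The following is purely an illustrative sketch, not anything from the thread: a single window-scaled TCP flow over a bottleneck with a large FIFO tail-drop buffer, advanced one step per RTT; the 3 Mbps rate, 50 ms base RTT, 256 KB buffer and initial window are all assumed numbers chosen only to show the shape of the behaviour.

# Toy fluid model: one window-scaled TCP flow, FIFO tail-drop bottleneck.
# All constants are assumptions for illustration, not measurements.

MSS = 1460             # bytes
RATE = 3e6 / 8         # bottleneck rate in bytes/sec (assumed 3 Mbps uplink)
BASE_RTT = 0.05        # path RTT with an empty queue (assumed)
BUFFER = 256 * 1024    # bytes of FIFO buffer at the bottleneck (assumed)

def run(seconds=30):
    cwnd = 10 * MSS        # bytes in flight
    ssthresh = None        # slow start until the first loss
    t = 0.0
    while t < seconds:
        pipe = RATE * BASE_RTT                  # bytes the path itself holds
        queue = min(max(cwnd - pipe, 0.0), BUFFER)
        rtt = BASE_RTT + queue / RATE           # queueing delay inflates RTT
        print(f"t={t:6.2f}s  cwnd={cwnd / 1024:7.1f} KB  "
              f"queue delay={queue / RATE * 1000:5.0f} ms")
        if cwnd - pipe > BUFFER:                # buffer overflows: tail drop
            ssthresh = cwnd / 2
            cwnd = ssthresh                     # (fast-recovery details ignored)
        elif ssthresh is None or cwnd < ssthresh:
            cwnd *= 2                           # slow start: double per RTT
        else:
            cwnd += MSS                         # congestion avoidance: +1 MSS/RTT
        t += rtt

run()

With these made-up numbers the queueing delay climbs past half a second within the first few RTTs and then stays there for tens of seconds between tail drops; the sender only hears about congestion long after the queue filled, which is exactly the loss of *timely* notification described above.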
On Mon, 20 Dec 2010, Jim Gettys wrote:
Common knowledge among whom? I'm hardly a naive Internet user.
Anyone actually looking into the matter. The Cisco "fair-queue" command was introduced in IOS 11.0, according to <http://www.cisco.com/en/US/docs/ios/12_2/qos/command/reference/qrfcmd1.html#wp1098249>, to somewhat handle the problem. I have no idea when that was, but I'd guess the early '90s?
And the statement is wrong: the large router buffers have effectively destroyed TCP's congestion avoidance altogether.
Routers have had large buffers since way before residential broadband even came around; a basic premise of TCP is that routers have buffers, and quite a lot of buffering.
200ms is good; but it is often up to multiple *seconds*. Resulting latencies on broadband gear are often horrific: see the netalyzr plots that I posted in my blog. See:
I know of the problem, it's no news to me. You don't have to convince me. I've been using Cisco routers as a CPE because of this for a long time.
Dave Clark first discovered bufferbloat on his DSLAM: he used the 6 second latency he saw to DDOS his son's excessive WOW playing.
When I procured a DSLAM around 2003 or so, it had 40ms of buffering at 24-meg ADSL2+ speed; when the speed went down, the buffer length in bytes stayed constant, so the buffering time also went up. It didn't do any AQM either, but at least it did 802.1p prioritization and had 4 buffers, so there was a little possibility of doing things upstream of it.
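That constant-byte buffer is worth putting numbers on. A quick arithmetic sketch (the sync rates other than 24 Mbps are illustrative assumptions): 40 ms of buffering at 24 Mbps is a fixed 120,000 bytes, and the same bytes take proportionally longer to drain as the sync rate drops.

# 40 ms of buffering at 24 Mbps ADSL2+ is a fixed byte count (no AQM, FIFO);
# at lower sync rates the same bytes take proportionally longer to drain.
buffer_bytes = 0.040 * 24e6 / 8              # = 120,000 bytes

for rate_mbps in (24, 12, 6, 2, 1):          # illustrative sync rates
    delay_ms = buffer_bytes * 8 / (rate_mbps * 1e6) * 1000
    print(f"{rate_mbps:>4} Mbps sync -> {delay_ms:4.0f} ms of queueing when full")
# 24 Mbps -> 40 ms, 6 Mbps -> 160 ms, 1 Mbps -> 960 ms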
All broadband technologies are affected, as are, it turns out, all operating systems and likely all home routers as well (see other posts I've made recently). DSL, cable and FIOS all have problems.
Yes.
How many of retail ISP's service calls have been due to this terrible performance?
A lot, I'm sure.
Secondly, any modern operating system (anything other than Windows XP), implements window scaling, and will, within about 10 seconds, *fill* the buffers with a single TCP connection, and they stay full until traffic drops enough to allow them to empty (which may take seconds). Since congestion avoidance has been defeated, you get nasty behaviour out of TCP.
That is exactly what TCP was designed to do: use as much bandwidth as it can. Congestion is detected by two means: latency goes up and/or there is packet loss. TCP was designed with router buffers in mind. Anyhow, one thing that might help would be ECN in conjunction with WRED, but already there you're way over most CPE manufacturers' heads.
is a good idea, you aren't old enough to have experienced the NSFnet collapse during the 1980's (as I did). I have post-traumatic stress disorder from that experience; I'm worried about the confluence of these changes, folks.
I'm happy you were there, I was under the impression that routers had large buffers back then as well?
The best you can do is what Ooma has done; bandwidth shaping along with being closest to the broadband connection (or by fancy home routers with classification and bandwidth shaping). That won't help the downstream direction where a single other user (or yourself), can inject large packet bursts routinely by browsing web sites like YouTube or Google images (unless some miracle occurs, and the broadband head ends are classifying traffic in the downstream direction over those links).
There is definitely a lot of improvement to be had. For FTTH, if you use an L2 switch with a few ms of buffering as the ISP handoff device, you don't get this problem. There are even TCP algorithms to handle this case, where you have small buffers and just tail-drop.

But yes, I agree that we'd all be much helped if manufacturers of both ends of all links had the common decency of introducing a WRED (with ECN marking) AQM that had 0% drop probability at 40ms and 100% drop probability at 200ms (and linear increase between).

-- Mikael Abrahamsson email: swmike@swm.pp.se
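That 40 ms / 200 ms linear ramp is simple enough to write down. A minimal sketch of the per-packet decision, assuming the thresholds are expressed as queueing delay; note that real (W)RED operates on an averaged queue occupancy (and usually on byte or packet counts) rather than the instantaneous delay used here for brevity.

import random

MIN_DELAY = 0.040   # seconds of queueing delay: below this, never mark/drop
MAX_DELAY = 0.200   # seconds of queueing delay: above this, always mark/drop

def wred_action(queue_delay_s, ecn_capable):
    """Return 'forward', 'mark' (ECN CE) or 'drop' for one arriving packet."""
    if queue_delay_s <= MIN_DELAY:
        p = 0.0
    elif queue_delay_s >= MAX_DELAY:
        p = 1.0
    else:                       # linear increase between the thresholds
        p = (queue_delay_s - MIN_DELAY) / (MAX_DELAY - MIN_DELAY)
    if random.random() < p:
        return "mark" if ecn_capable else "drop"
    return "forward"

# e.g. at 120 ms of queue the probability is (120-40)/(200-40) = 50%
print(wred_action(0.120, ecn_capable=True))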
On 21 Dec 2010, at 07:18, Mikael Abrahamsson <swmike@swm.pp.se> wrote:

On Mon, 20 Dec 2010, Jim Gettys wrote:

Common knowledge among whom? I'm hardly a naive Internet user.

Anyone actually looking into the matter. The Cisco "fair-queue" command was introduced in IOS 11.0, according to <http://www.cisco.com/en/US/docs/ios/12_2/qos/command/reference/qrfcmd1.html#wp1098249>, to somewhat handle the problem. I have no idea when that was, but I'd guess the early '90s?

200ms is good; but it is often up to multiple *seconds*. Resulting latencies on broadband gear are often horrific: see the netalyzr plots that I posted in my blog. See:

I know of the problem, it's no news to me. You don't have to convince me. I've been using Cisco routers as a CPE because of this for a long time.

Interestingly, I've just tried to enable WRED on a Cisco 877 (advsecurity 15.1) and the random-detect commands are missing. Cisco's feature navigator says it's supported, though. Weird.

Also, there doesn't appear to be a way to enable fair-queue on the wireless interface. Is fair-queue seen as a bad strategy for wireless and its varying throughput/goodput rates?

And finally, it doesn't support inbound shaping, so I can't experiment with trying to build the queues on it rather than the DSLAM. I'm a little nonplussed, to be honest.

However, I did notice the output queue on the dialler interface defaults to 1000 packets. (Perhaps that's a hangover from when it had to queue packets whilst dialling? I've come too late to networking to know.) Reducing that number to 10 (~60ms @ 1500 bytes @ 8Mbps) has noticeably improved the latency response and fairness of the connection under load.

Sam
On Dec 20, 2010, at 11:18 PM, Mikael Abrahamsson wrote:
On Mon, 20 Dec 2010, Jim Gettys wrote:
Common knowledge among whom? I'm hardly a naive Internet user.
Anyone actually looking into the matter. The Cisco "fair-queue" command was introduced in IOS 11.0, according to <http://www.cisco.com/en/US/docs/ios/12_2/qos/command/reference/qrfcmd1.html#wp1098249>, to somewhat handle the problem. I have no idea when that was, but I'd guess the early '90s?
1995. I know the guy that wrote the code. Meet me in a bar and we can share war stories. The technology actually helps with problems like RFC 6057 addresses pretty effectively.
is a good idea, you aren't old enough to have experienced the NSFnet collapse during the 1980's (as I did). I have post-traumatic stress disorder from that experience; I'm worried about the confluence of these changes, folks.
I'm happy you were there, I was under the impression that routers had large buffers back then as well?
Not really. Yup, several of us were there. The common routers on the NSFNET and related networks were fuzzballs, which had 8 (count them, 8) 576 byte buffers, Cisco AGS/AGS+, and Proteon routers. The Cisco routers of the day generally had 40 buffers on each interface by default, and might have had configuration changes; I can't comment on the Proteon routers.

For a 56 KBPS line, given 1504 bytes per message (1500 bytes IP+data, and four bytes of HDLC overhead), that's theoretically 8.5 seconds. But given that messages were in fact usually 576 bytes of IP data (cf "fuzzballs" and unix behavior for off-LAN communications) and interspersed with TCP control messages (Acks, SYNs, FINs, RST), real queue depths were more like two seconds at a bottleneck router. The question would be the impact of a sequence of routers all acting as bottlenecks.

IMHO, AQM (RED or whatever) is your friend. The question is what to set min-threshold to. Kathy Nichols (Van's wife) did a lot of simulations. I don't know that the paper was ever published, but as I recall she wound up recommending something like this:

  line rate (Mbps)    RED min-threshold (ms of queue depth)
       2                     32
      10                     16
     155                      8
     622                      4
   2,500                      2
  10,000                      1
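Fred's queue-depth figures are easy to re-derive; a quick arithmetic check (nothing here beyond the numbers already given above):

# 40 interface buffers draining onto a 56 kbps line, as described above.
line_bps = 56_000

full_sized = 40 * 1504 * 8 / line_bps        # 1500 B IP+data + 4 B HDLC
typical    = 40 * (576 + 4) * 8 / line_bps   # 576-byte datagrams + HDLC

print(f"40 x 1504-byte packets: {full_sized:.1f} s of queue")  # ~8.6 s
print(f"40 x  580-byte packets: {typical:.1f} s of queue")     # ~3.3 s
# Interspersed ACKs/SYNs/FINs/RSTs pull the average packet size, and hence
# the queue time, down further, towards the ~2 seconds Fred mentions.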
But yes, I agree that we'd all be much helped if manufacturers of both ends of all links had the common decency of introducing a WRED (with ECN marking) AQM that had 0% drop probability at 40ms and 100% drop probability at 200ms (and linear increase between).
so, min-threshold=40 ms and max-threshold=200 ms. That's good on low speed links; it will actually control queue depths to an average of O(min-threshold) at whatever value you set it to. The problem with 40 ms is that it interacts poorly with some applications, notably voice and video.

It also doesn't match well to published studies like http://www.pittsburgh.intel-research.net/~kpapagia/papers/p2pdelay-analysis..... In that study, a min-threshold of 40 ms would have cut in only on six a-few-second events in the course of a five hour sample. If 40 ms is on the order of magnitude of a typical RTT, it suggests that you could still have multiple retransmissions from the same session in the same queue.

A good photo of buffer bloat is at
ftp://ftpeng.cisco.com/fred/RTT/Pages/4.html
ftp://ftpeng.cisco.com/fred/RTT/Pages/5.html

The first is a trace I took overnight in a hotel I stayed in. Never mind the name of the hotel, it's not important. The second is the delay distribution, which is highly unusual - you expect to see delay distributions more like
ftp://ftpeng.cisco.com/fred/RTT/Pages/8.html
(which actually shows two distributions - the blue one is fairly normal, and the green one is a link that spends much of the day chock-a-block).

My conjecture re 5.html is that the link *never* drops, and at times has as many as nine retransmissions of the same packet in it. The spikes in the graph are about a TCP RTO timeout apart. That's a truly worst case. For N-1 of the N retransmissions, it's a waste of storage space and a waste of bandwidth.

AQM is your friend. Your buffer should be able to temporarily buffer as much as an RTT of traffic, which is to say that it should be large enough to ensure that if you get a big burst followed by a silent period you should be able to use the entire capacity of the link to ride it out. Your min-threshold should be at a value that makes your median queue depth relatively shallow. The numbers above are a reasonable guide, but as in all things, YMMV.
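Fred's closing guidance (physical buffer of about one RTT at line rate, min-threshold set so the median queue stays shallow) can be turned into a small rule of thumb. The 100 ms RTT default and the nearest-entry lookup into Kathy Nichols' table below are editorial assumptions, not Fred's numbers:

import bisect

# (line rate in Mbps, RED min-threshold in ms) from the table quoted above
NICHOLS_TABLE = [(2, 32), (10, 16), (155, 8), (622, 4), (2500, 2), (10000, 1)]

def suggest(line_rate_mbps, rtt_s=0.100):
    """Buffer ~ one RTT at line rate; min-threshold from the nearest table
    entry at or below the given rate (no interpolation attempted)."""
    buffer_bytes = line_rate_mbps * 1e6 / 8 * rtt_s
    rates = [rate for rate, _ in NICHOLS_TABLE]
    i = max(bisect.bisect_right(rates, line_rate_mbps) - 1, 0)
    return buffer_bytes, NICHOLS_TABLE[i][1]

buf, min_th = suggest(100)    # e.g. a 100 Mbps link
print(f"buffer ~ {buf / 1e6:.2f} MB, RED min-threshold ~ {min_th} ms")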
On 12/21/2010 04:24 PM, Fred Baker wrote:
On Dec 20, 2010, at 11:18 PM, Mikael Abrahamsson wrote:
On Mon, 20 Dec 2010, Jim Gettys wrote:
Common knowledge among whom? I'm hardly a naive Internet user.
Anyone actually looking into the matter. The Cisco "fair-queue" command was introduced in IOS 11.0, according to <http://www.cisco.com/en/US/docs/ios/12_2/qos/command/reference/qrfcmd1.html#wp1098249>, to somewhat handle the problem. I have no idea when that was, but I'd guess the early '90s?
1995. I know the guy that wrote the code. Meet me in a bar and we can share war stories. The technology actually helps with problems like RFC 6057 addresses pretty effectively.
is a good idea, you aren't old enough to have experienced the NSFnet collapse during the 1980's (as I did). I have post-traumatic stress disorder from that experience; I'm worried about the confluence of these changes, folks.
I'm happy you were there, I was under the impression that routers had large buffers back then as well?
Not really. Yup, several of us were there. The common routers on the NSFNET and related networks were fuzzballs, which had 8 (count them, 8) 576 byte buffers, Cisco AGS/AGS+, and Proteon routers. The Cisco routers of the day generally had 40 buffers on each interface by default, and might have had configuration changes; I can't comment on the Proteon routers. For a 56 KBPS line, given 1504 bytes per message (1500 bytes IP+data, and four bytes of HDLC overhead), that's theoretically 8.5 seconds. But given that messages were in fact usually 576 bytes of IP data (cf "fuzzballs" and unix behavior for off-LAN communications) and interspersed with TCP control messages (Acks, SYNs, FINs, RST), real queue depths were more like two seconds at a bottleneck router. The question would be the impact of a sequence of routers all acting as bottlenecks.
IMHO, AQM (RED or whatever) is your friend. The question is what to set min-threshold to. Kathy Nichols (Van's wife) did a lot of simulations. I don't know that the paper was ever published, but as I recall she wound up recommending something like this:
  line rate (Mbps)    RED min-threshold (ms of queue depth)
       2                     32
      10                     16
     155                      8
     622                      4
   2,500                      2
  10,000                      1
I don't know if you are referring to the "RED in a different light" paper: that was never published, though an early draft escaped and can be found on the net. "RED in a different light" identifies two bugs in the RED algorithm, and proposes a better algorithm that only depends on the link output bandwidth. That draft still has a bug. There is an (almost completed) version of the paper that never got published; Van has retrieved it from backup, and I'm trying to pry it out of Van's hands to get it converted to something we can read today (it's in FrameMaker).

In the meanwhile, turn on (W)RED! For routers run by most people on this list, it's always way better than nothing, even if Van doesn't think classic RED will solve the home router bufferbloat problem (where we have 2 orders of magnitude variation of wireless bandwidth along with highly variable workload). That's not true in the internet core.
But yes, I agree that we'd all be much helped if manufacturers of both ends of all links had the common decency of introducing a WRED (with ECN marking) AQM that had 0% drop probability at 40ms and 100% drop probability at 200ms (and linear increase between).
so, min-threshold=40 ms and max-threshold=200 ms. That's good on low speed links; it will actually control queue depths to an average of O(min-threshold) at whatever value you set it to. The problem with 40 ms is that it interacts poorly with some applications, notably voice and video.
It also doesn't match well to published studies like http://www.pittsburgh.intel-research.net/~kpapagia/papers/p2pdelay-analysis..... In that study, a min-threshold of 40 ms would have cut in only on six a-few-second events in the course of a five hour sample. If 40 ms is on the order of magnitude of a typical RTT, it suggests that you could still have multiple retransmissions from the same session in the same queue.
A good photo of buffer bloat is at ftp://ftpeng.cisco.com/fred/RTT/Pages/4.html ftp://ftpeng.cisco.com/fred/RTT/Pages/5.html
The first is a trace I took overnight in a hotel I stayed in. Never mind the name of the hotel, it's not important. The second is the delay distribution, which is highly unusual - you expect to see delay distributions more like
ftp://ftpeng.cisco.com/fred/RTT/Pages/8.html
Thanks, Fred! Can I use these in the general bufferbloat talk I'm working on with attribution? It's a far better example/presentation in a graphic form than I currently have for the internet core case (where I don't even have anything other than memory of probing the hotel's ISP's network).
(which actually shows two distributions - the blue one is fairly normal, and the green one is a link that spends much of the day chock-a-block).
My conjecture re 5.html is that the link *never* drops, and at times has as many as nine retransmissions of the same packet in it. The spikes in the graph are about a TCP RTO timeout apart. That's a truly worst case. For N-1 of the N retransmissions, it's a waste of storage space and a waste of bandwidth.
AQM is your friend. Your buffer should be able to temporarily buffer as much as an RTT of traffic, which is to say that it should be large enough to ensure that if you get a big burst followed by a silent period you should be able to use the entire capacity of the link to ride it out. Your min-threshold should be at a value that makes your median queue depth relatively shallow. The numbers above are a reasonable guide, but as in all things, YMMV.
Yup. AQM is our friend. And we need it in many places we hadn't realised we did (like our OS's). - Jim
On Dec 22, 2010, at 8:48 AM, Jim Gettys wrote:
I don't know if you are referring to the "RED in a different light" paper: that was never published, though an early draft escaped and can be found on the net.
Precisely.
"RED in a different light" identifies two bugs in the RED algorithm, and proposes a better algorithm that only depends on the link output bandwidth. That draft still has a bug.
There is an (almost completed) version of the paper that never got published; Van has retrieved it from backup, and I'm trying to pry it out of Van's hands to get it converted to something we can read today (it's in FrameMaker).
In the meanwhile, turn on (W)RED! For routers run by most people on this list, it's always way better than nothing, even if Van doesn't think classic RED will solve the home router bufferbloat problem (where we have 2 orders of magnitude variation of wireless bandwidth along with highly variable workload). That's not true in the internet core.
But yes, I agree that we'd all be much helped if manufacturers of both ends of all links had the common decency of introducing a WRED (with ECN marking) AQM that had 0% drop probability at 40ms and 100% drop probability at 200ms (and linear increase between).
so, min-threshold=40 ms and max-threshold=200 ms. That's good on low speed links; it will actually control queue depths to an average of O(min-threshold) at whatever value you set it to. The problem with 40 ms is that it interacts poorly with some applications, notably voice and video.
It also doesn't match well to published studies like http://www.pittsburgh.intel-research.net/~kpapagia/papers/p2pdelay-analysis..... In that study, a min-threshold of 40 ms would have cut in only on six a-few-second events in the course of a five hour sample. If 40 ms is on the order of magnitude of a typical RTT, it suggests that you could still have multiple retransmissions from the same session in the same queue.
A good photo of buffer bloat is at ftp://ftpeng.cisco.com/fred/RTT/Pages/4.html ftp://ftpeng.cisco.com/fred/RTT/Pages/5.html
The first is a trace I took overnight in a hotel I stayed in. Never mind the name of the hotel, it's not important. The second is the delay distribution, which is highly unusual - you expect to see delay distributions more like
ftp://ftpeng.cisco.com/fred/RTT/Pages/8.html
Thanks, Fred! Can I use these in the general bufferbloat talk I'm working on with attribution? It's a far better example/presentation in a graphic form than I currently have for the internet core case (where I don't even have anything other than memory of probing the hotel's ISP's network).
Yes. Do me a favor and remove the name of the hotel. They don't need the bad press.
(which actually shows two distributions - the blue one is fairly normal, and the green one is a link that spends much of the day chock-a-block).
My conjecture re 5.html is that the link *never* drops, and at times has as many as nine retransmissions of the same packet in it. The spikes in the graph are about a TCP RTO timeout apart. That's a truly worst case. For N-1 of the N retransmissions, it's a waste of storage space and a waste of bandwidth.
AQM is your friend. Your buffer should be able to temporarily buffer as much as an RTT of traffic, which is to say that it should be large enough to ensure that if you get a big burst followed by a silent period you should be able to use the entire capacity of the link to ride it out. Your min-threshold should be at a value that makes your median queue depth relatively shallow. The numbers above are a reasonable guide, but as in all things, YMMV.
Yup. AQM is our friend.
And we need it in many places we hadn't realised we did (like our OS's). - Jim
I don't know if you are referring to the "RED in a different light" paper: that was never published, though an early draft escaped and can be found on the net.
"RED in a different light" identifies two bugs in the RED algorithm, and proposes a better algorithm that only depends on the link output bandwidth. That draft still has a bug.
I also noticed another paper, published later, that references "RED in a different light":

http://www.icir.org/floyd/adaptivered/
Adaptive RED: An Algorithm for Increasing the Robustness of RED's Active Queue Management. Sally Floyd, Ramakrishna Gummadi, and Scott Shenker. August 1, 2001.

And this one:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.98.1556&rep=rep1&type=pdf
Active Queue Management using Adaptive RED. Rahul Verma, Aravind Iyer and Abhay Karandikar. July 15, 2002.

But it doesn't look like Adaptive RED went anywhere.
Some more historical pointers: if you want to look at the early history of the latency discussion, look at Stuart Cheshire's famous rant "It's the Latency, Stupid" (http://rescomp.stanford.edu/~cheshire/rants/Latency.html). Then look at Matt Mathis's 1997 TCP equation (and the 1998 Padhye-Firoiu version of that): the throughput is proportional to the inverse square root of the packet loss and the inverse RTT -- so as the RTT starts growing due to increasing buffers, the packet loss must grow to keep equilibrium!

We started to understand that you have to drop packets in order to limit queueing pretty well in the late 1990s. E.g., RFC 3819 contains an explicit warning against keeping packets for too long (section 13).

But, as you notice, for faster networks the bufferbloat effect can be limited by intelligent window size management, but the dominating Windows XP was not intelligent, just limited in its widely used default configuration. So the first ones to fully see the effect were the ones with many TCP connections, i.e. Bittorrent. The modern window size "tuning" schemes in Windows 7 and Linux break a lot of things -- you are just describing the tip of the iceberg here.

The IETF working group LEDBAT (motivated by the Bittorrent observations) has been working on a scheme to run large transfers without triggering humungous buffer growth.

Regards, Carsten
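For reference, the Mathis et al. relation Carsten cites is, roughly, throughput ~ (MSS/RTT) * C/sqrt(p), with C ~ sqrt(3/2). A tiny illustrative calculation (the MSS, loss rate and RTT values are assumptions, chosen only to show the scaling):

from math import sqrt

MSS = 1460           # bytes (assumed)
C = sqrt(3 / 2)      # constant from the Mathis et al. derivation

def mathis_throughput(rtt_s, loss):
    """Steady-state TCP throughput in bits/s for a given RTT and loss rate."""
    return MSS * 8 * C / (rtt_s * sqrt(loss))

for rtt_ms in (20, 100, 500, 2000):       # 2 s RTT = a badly bloated buffer
    bps = mathis_throughput(rtt_ms / 1000, loss=1e-4)
    print(f"RTT {rtt_ms:>4} ms, loss 0.01% -> {bps / 1e6:6.2f} Mbit/s")
# For a fixed throughput the product RTT * sqrt(loss) must stay constant,
# which is the equilibrium Carsten refers to: grow the RTT with buffers and
# the loss rate the path settles at has to move to compensate.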
participants (8)
- Carsten Bormann
- Fred Baker
- George Bonser
- Jim Gettys
- Joel Jaeggli
- Mikael Abrahamsson
- Sam Stickland
- Vasil Kolev