Re: fixing TCP buffers (Re: packet reordering at exchange points)

10 Apr 2002

On Wed, Apr 10, 2002 at 12:22:57AM +0000, E.B. Dreger wrote:
...
> My static buffer presumed that one would regularly see line rate;
> that's probably an invalid assumption.

Indeed. But that's why it's not an actual allocation.
...
> Why bother advertising space remaining?  Simply take the total
> space -- which is tuned to line rate -- and divide equitably.
> Equal division is the primitive way.  Monitoring actual buffer
> use, a la PSC window-tuning code, is more efficient.

Because then you haven't accomplished your goal. If you have 32MB of buffer
memory available, and you open 32 connections and share it equally at
1MB each, you could have one connection that is moving no data and another
connection that wants to scale to more than 1MB of packets in flight. Then
you have to start scanning all your connections on a periodic basis,
adjusting the socket buffers to reflect the actual congestion window, a
la PSC.
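
A rough sketch of the difference, in Python for brevity. This is only an
illustration of the arithmetic, not PSC's actual code: the function names,
the 2x-cwnd headroom factor, and the 64KB floor are all assumptions made up
for the example.

```python
MB = 1024 * 1024

def equal_split(total_bytes, n_conns):
    """Static policy: every connection gets the same share."""
    return [total_bytes // n_conns] * n_conns

def rescan(total_bytes, cwnds, headroom=2, floor=64 * 1024):
    """Periodic policy: size each buffer from its congestion window.

    Each connection asks for headroom * cwnd (at least `floor`). If the
    requests exceed the pool, scale them all down proportionally.
    The headroom and floor values are illustrative assumptions.
    """
    wants = [max(floor, headroom * c) for c in cwnds]
    total_want = sum(wants)
    if total_want <= total_bytes:
        return wants
    return [w * total_bytes // total_want for w in wants]

# 32 connections sharing 32MB: 30 idle, one modest, one with 4MB in flight.
cwnds = [0] * 30 + [256 * 1024, 4 * MB]
static = equal_split(32 * MB, len(cwnds))  # 1MB each, too small for the last
tuned = rescan(32 * MB, cwnds)             # the last one gets 8MB
```

With the equal split the busy connection is stuck at 1MB while 30 idle ones
sit on memory they never touch. The rescan fixes that, at the cost of
walking every connection on a timer.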

My suggestion was to cut out all that nonsense by simply removing the
receive window limits altogether. Actually, you could accomplish this
goal by just advertising the maximum possible window size and relying on
packet drops to shrink the congestion window on the sending side as
necessary, but this would be slightly less efficient in the case of a
sender overrunning the receiver.
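
For a sense of scale on "maximum possible window": the TCP header carries a
16-bit window field, and the window scale option (RFC 1323) allows a left
shift of at most 14 bits. A small sketch of that arithmetic follows; the
helper name is hypothetical and only there for illustration.

```python
MAX_WINDOW_FIELD = 0xFFFF  # 16-bit window field in the TCP header
MAX_WSCALE_SHIFT = 14      # largest shift the window scale option allows

def max_advertised_window(shift=MAX_WSCALE_SHIFT):
    """Largest receive window expressible with the given scale shift."""
    if not 0 <= shift <= MAX_WSCALE_SHIFT:
        raise ValueError("window scale shift must be between 0 and 14")
    return MAX_WINDOW_FIELD << shift

unscaled = max_advertised_window(0)  # 65535 bytes
scaled = max_advertised_window()     # 1073725440 bytes, just under 1GB
```

So a receiver that always advertises the maximum is effectively telling the
sender that the window is not the limit, and the congestion window does all
the work.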

But alas we're both forgetting the sender side, which controls how quickly 
data moves from userland into the kernel. This part must be set by looking 
at the sending congestion window. And I thought of another problem as 
well. If you had a receiver which made a connection, requested as much 
data as possible, and then never did a read() on the socket buffer, all 
the data would pile up in the kernel and consume the total buffer space 
for the entire system.
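
A toy simulation of that failure mode, assuming a single shared pool with no
per-socket limit. The function name, the round-based model, and the 1460-byte
segment size are assumptions for the sake of the example, not how any real
kernel schedules things.

```python
MB = 1024 * 1024

def simulate(pool_bytes, segment_bytes, reads, max_rounds=1_000_000):
    """Fill a shared pool with no per-socket limit.

    reads[i] is True if connection i's application drains its socket.
    Each round every connection receives one segment. A reading
    connection consumes it right away; a stalled one leaves it queued.
    Returns (rounds completed, queued bytes per connection) at the point
    where the next segment no longer fits, or after max_rounds.
    """
    queued = [0] * len(reads)
    for rounds in range(max_rounds):
        for i, reading in enumerate(reads):
            if sum(queued) + segment_bytes > pool_bytes:
                return rounds, queued
            if not reading:
                queued[i] += segment_bytes
    return max_rounds, queued

# Two well-behaved connections and one that never calls read().
rounds, queued = simulate(32 * MB, 1460, [True, True, False])
```

The stalled connection ends up holding essentially the whole 32MB, and the
two connections that are actually reading have nowhere to put their next
packet.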
...
> To respect memory, sure, you could impose a global limit and
> alloc as needed.  But on a "busy enough" server/client, how much
> would that save?  Perhaps one could allocate 8MB chunks at a
> time... but fragmentation could prevent the ability to have a
> contiguous 32MB in the future.  (Yes, I'm assuming high memory
> usage and simplistic paging.  But I think that's plausible.)

You're missing the point: you don't allocate ANYTHING until you have a
packet to fill that buffer, and when you're done buffering it, it is
freed. The limits are just there to prevent you from running away with a
socket buffer.
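
In other words, the "buffer size" is an accounting limit, not a reservation.
A minimal sketch of that bookkeeping, assuming one global cap plus a
per-socket cap; the class and method names are hypothetical and this is not
any particular kernel's implementation.

```python
MB = 1024 * 1024

class BufferPool:
    """Charge memory only while packets are actually queued."""

    def __init__(self, global_limit, per_socket_limit):
        self.global_limit = global_limit
        self.per_socket_limit = per_socket_limit
        self.in_use = {}  # socket id -> bytes currently queued

    def total(self):
        return sum(self.in_use.values())

    def enqueue(self, sock, nbytes):
        """Charge nbytes for an arriving packet; False means drop it."""
        held = self.in_use.get(sock, 0)
        if held + nbytes > self.per_socket_limit:
            return False  # this socket is running away with its buffer
        if self.total() + nbytes > self.global_limit:
            return False  # the system-wide pool is exhausted
        self.in_use[sock] = held + nbytes
        return True

    def dequeue(self, sock, nbytes):
        """Release memory once the data has been read or acknowledged."""
        remaining = max(0, self.in_use.get(sock, 0) - nbytes)
        if remaining:
            self.in_use[sock] = remaining
        else:
            self.in_use.pop(sock, None)  # idle sockets cost nothing

pool = BufferPool(global_limit=32 * MB, per_socket_limit=4 * MB)
pool.enqueue("conn-1", 1460)  # memory is charged when the packet arrives
pool.dequeue("conn-1", 1460)  # and released when it has been consumed
```

An idle connection holds zero bytes no matter what its limit says, so there
is no contiguous 32MB to fragment in the first place.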

-- 
Richard A Steenbergen <ras@e-gerbil.net>       http://www.e-gerbil.net/ras
PGP Key ID: 0x138EA177  (67 29 D7 BC E8 18 3E DA  B2 46 B3 D8 14 36 FE B6)