There is, however, the spectre of there being so many SYNs flying around that they alone might cause congestion collapse. I dunno if I should be frightened of that or not, but I am not one of your origin server friends. --:)
i'm not worried about the syns so much as i am worried about the lack of interstream resource planning. in all of the popular desktop stacks, a new tcp stream does its own slow start (not paying any attention to the aggregate bandwidth*delay when several streams are open to the same origin server). this means every new tcp session has to sense the available bandwidth*delay, causing the other tcp sessions toward the same origin to have to back off and try to find the new equilibrium. and then, wonder of wonders, it's time to close all of those connections because the user has clicked "stop" after getting bored waiting for those GIFs to populate, and has clicked on something else, so let's start this whole stupid process over again with some other origin server.

persistent http helps this a little. aggregation through proxies -- even if no caching is done -- will help it a little more. t/tcp would help some. desktop tcp stack fixes to remember end-to-end bandwidth*delay between connections, and to treat end-to-end bandwidth*delay as an aggregate to be shared between simultaneous connections from/to the same place (or to just stop doing that stupid parallelism in favour of one http/1.1 persistent connection) would also help.
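to make the "remember and share bandwidth*delay" idea concrete, here is a toy sketch in python. it is not how any real tcp stack does it; the class name `SharedPathState`, its methods, and the equal-split policy are all assumptions made up for illustration. it only shows the bookkeeping: one remembered aggregate window per origin, divided among the streams currently open to that origin, and kept around after they close.

```python
class SharedPathState:
    """Toy per-origin memory of an aggregate window (hypothetical, in segments)."""

    def __init__(self, initial_window=1):
        self.initial_window = initial_window  # used only when nothing is remembered
        self.aggregate = {}  # origin -> last known aggregate window
        self.active = {}     # origin -> number of currently open streams

    def open_stream(self, origin):
        """Register a new stream and return its suggested starting window."""
        n = self.active.get(origin, 0) + 1
        self.active[origin] = n
        total = self.aggregate.get(origin)
        if total is None:
            # Nothing remembered yet: fall back to a slow-start style beginning.
            self.aggregate[origin] = self.initial_window
            return self.initial_window
        # Assumed policy: split the remembered aggregate equally among streams.
        return max(1, total // n)

    def close_stream(self, origin):
        """Unregister a stream; the aggregate is kept for later connections."""
        self.active[origin] = max(0, self.active.get(origin, 0) - 1)

    def update(self, origin, measured_aggregate):
        """Record a newly sensed aggregate window for this origin."""
        self.aggregate[origin] = measured_aggregate


state = SharedPathState()
first = state.open_stream("origin.example")   # nothing remembered yet
state.update("origin.example", 16)            # path sensed at 16 segments
second = state.open_stream("origin.example")  # shares the aggregate
```

with two streams open, the second one is handed half of the remembered aggregate instead of probing from scratch, and a stream opened after both have closed starts from the full remembered value.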
Finally, could you explain the "benchmark" comment a bit?
this was in specific reference to my product's connection quota for each origin server, and the fact that if we intercept too many simultaneous connections to a given origin server, we just delay the ones for which no open connection is available.
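a minimal sketch of that kind of per-origin quota, in python. this is not the product's code; `OriginQuota`, its method names, and the quota value are hypothetical. it assumes a fixed number of connection slots per origin server and makes a request wait (optionally with a timeout) when no slot is free.

```python
import threading


class OriginQuota:
    """Hypothetical per-origin connection quota backed by semaphores."""

    def __init__(self, max_per_origin):
        self.max_per_origin = max_per_origin
        self._lock = threading.Lock()
        self._slots = {}  # origin -> BoundedSemaphore with max_per_origin slots

    def _slot(self, origin):
        # Create the semaphore for an origin the first time it is seen.
        with self._lock:
            if origin not in self._slots:
                self._slots[origin] = threading.BoundedSemaphore(self.max_per_origin)
            return self._slots[origin]

    def acquire(self, origin, timeout=None):
        """Wait for a free slot; return False if the timeout expires first."""
        return self._slot(origin).acquire(timeout=timeout)

    def release(self, origin):
        """Give a slot back so a delayed request can proceed."""
        self._slot(origin).release()


quota = OriginQuota(max_per_origin=2)
got_first = quota.acquire("origin.example", timeout=0.01)
got_second = quota.acquire("origin.example", timeout=0.01)
got_third = quota.acquire("origin.example", timeout=0.01)  # quota exhausted
```

a caller that gets `False` back would simply keep the request queued and try again later; a different origin has its own independent set of slots.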