You could simply add another IP address to the server's source-address pool, which effectively gives you another 32K (or whatever value you have for the local port range) identifiers.
Owen
On Dec 5, 2012, at 7:59 AM, Ray Soucy <rps@maine.edu> wrote:
RFC 793 arbitrarily defines 2MSL (how long to hold a socket in TIME_WAIT state before cleaning up) as 4 min.
Linux is a little more reasonable in this and has it baked into the source as 60 seconds in "/usr/src/linux/include/net/tcp.h":
#define TCP_TIMEWAIT_LEN (60*HZ)
Since there is no way to change this through /proc (probably a good idea, to keep users from messing with it), I am considering rebuilding a kernel with a lower TCP_TIMEWAIT_LEN to deal with the following issue.
With a 60 second timeout on TIME_WAIT, local port identifiers are tied up and can't be used for new outgoing connections (in this case from a proxy server). The default local port range on Linux can easily be adjusted, but even when bumped up to a range of 32K ports, the 60 second timeout means you can only sustain about 500 new connections per second before you run out of ports.
There are two options to try to deal with this, tcp_tw_reuse and tcp_tw_recycle, but both seem to be less than ideal. tcp_tw_reuse doesn't appear to be effective in situations where you're sustaining 500+ new connections per second rather than a small burst. tcp_tw_recycle seems like too big of a hammer and has been reported to cause problems with NATed connections.
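For anyone wanting to check what a given box is currently set to, here is a minimal sketch that reads the relevant sysctls straight from /proc/sys. The helper names (read_sysctl, port_range_size) are made up for illustration, and the sketch assumes a Linux host. A missing file is reported as None, which may be the case for tcp_tw_recycle on newer kernels.

```python
from pathlib import Path


def read_sysctl(name, root="/proc/sys"):
    """Return the raw value of a sysctl such as 'net.ipv4.tcp_tw_reuse',
    or None if the corresponding file does not exist or is unreadable."""
    path = Path(root) / name.replace(".", "/")
    try:
        return path.read_text().strip()
    except OSError:
        return None


def port_range_size(root="/proc/sys"):
    """Number of local ports available for outgoing connections,
    derived from net.ipv4.ip_local_port_range (inclusive bounds)."""
    raw = read_sysctl("net.ipv4.ip_local_port_range", root)
    if raw is None:
        return None
    low, high = (int(field) for field in raw.split())
    return high - low + 1


# Report the settings discussed above; values print as None when absent.
for key in ("net.ipv4.ip_local_port_range",
            "net.ipv4.tcp_tw_reuse",
            "net.ipv4.tcp_tw_recycle"):
    print(f"{key} = {read_sysctl(key)}")
print(f"usable local ports = {port_range_size()}")
```

The root argument exists only so the helpers can be pointed at a copy of the tree for testing; on a live system the default is what you want.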
The best solution seems to be trying to keep TIME_WAIT in place, but being faster about it.
30 seconds would get you to 1000 connections a second, 15 seconds to 2000, and 10 seconds to about 3000.
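The arithmetic behind those figures can be sketched as follows, assuming (as this message does) that every connection sitting in TIME_WAIT ties up one local port, and taking "32K" to mean 32768 ports. The function name and the source_addresses parameter are illustrative only; the latter models additional source IP addresses, each of which brings its own pool of ports.

```python
def max_sustained_rate(port_count, time_wait_seconds, source_addresses=1):
    """Rough upper bound on new outgoing connections per second.

    Each closed connection holds its local port for time_wait_seconds,
    so at steady state rate * time_wait_seconds ports are unavailable.
    The pool is port_count ports per source address.
    """
    if time_wait_seconds <= 0:
        raise ValueError("time_wait_seconds must be positive")
    return source_addresses * port_count / time_wait_seconds


PORTS = 32 * 1024  # assumed size of the local port range ("32K")

for tw in (60, 30, 15, 10):
    rate = max_sustained_rate(PORTS, tw)
    print(f"TIME_WAIT {tw:2d}s: about {rate:.0f} new connections/s")
```

With those assumptions the results come out to roughly 546, 1092, 2185 and 3277 connections per second, which matches the rounded numbers above. Passing source_addresses=2 doubles each figure.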
A few questions:
Does anyone have any data on how typical it is for TIME_WAIT to be necessary beyond 10 seconds on a modern network?
Has anyone done some research on how low you can make TIME_WAIT safely?
Is this a terrible idea?
What alternatives are there?
Keep in mind this is a proxy server making outgoing connections as the source of the problem, so things like SO_REUSEADDR, which work for reusing sockets for incoming connections, don't seem to do much in this situation.
Anyone running large proxies or load balancers have this situation? If so what is your solution?
-- Ray Patrick Soucy Network Engineer University of Maine System
T: 207-561-3526 F: 207-561-3531
MaineREN, Maine's Research and Education Network www.maineren.net