On Tue, Mar 05, 2002 at 08:42:22AM +0200, Hank Nussbacher wrote:
> New 12.2(8)T feature in Cisco IOS called TCP Window Scaling: http://www.cisco.com/univercd/cc/td/doc/product/software/ios122/122newft/122...
> Specifically made for satellite networks: ip tcp window-size 750000
A) This is quite old; it's part of RFC 1323. From the command history:

   Release    Modification
   9.1        This command was introduced.
   12.2(8)T   Default window size and maximum window scaling
              factor were increased.

B) This only applies to connections to and from the router itself, not the traffic it routes.

You should also beware of turning up TCP window settings to whatever big number you feel like. I can only vouch for unix systems here, but the way the socket interface and kernel TCP work, a buffer big enough to hold all data in flight must be maintained in the kernel for every connection. That data cannot be released until it has been ACK'd (in case TCP needs to retransmit it), which is generally the limiting factor on TCP window sizes.

Say for example you turn up your socket buffers to 1MB and enable the RFC 1323 extensions (window scaling is the one you care about; it's basically just a multiplier so you can advertise windows bigger than 65535; a 1MB window needs a scale factor of at least 5, since 1,048,576 >> 5 = 32,768 fits in the 16-bit field). While this does let you keep a whole lot of data in flight, it also makes your system quite unstable. Consider the case where you are transferring a large file to a slow host: you will immediately fill the 1MB kernel buffer (the write() on the socket goes into that first, and the userland program has no way of knowing whether it is talking to a fast host or just a big kernel buffer, so it will misreport speed). Open a few more connections like that and you've exhausted your kernel memory and will most likely panic. If you used these settings on a web server, all it would take is a few dialups trying to download a big file before you go boom.

Barring malicious activity on the part of the remote host, it is generally safe to turn up your RECEIVE socket buffer to a REASONABLY beefy number. Your performance may ultimately be limited by the settings on the sending side, but you're still likely to improve performance with someone. If you're wondering what "reasonable" is, you can calculate what must be kept in flight from the bandwidth * delay product (a short C sketch of this calculation follows below):

   A slow ethernet pipe to a host that is fairly close by:
      10Mbit * 10ms = 13,107 bytes
   A fast ethernet pipe to a host that is fairly close by:
      100Mbit * 10ms = 131,072 bytes
   A fast ethernet pipe to a host on the opposite side of the US,
   via someone's drunken fiber path:
      100Mbit * 100ms = 1,310,720 bytes
   A satellite link:
      5Mbit * 800ms = 524,288 bytes

For best results, multiply that number by a minimum of 2 so TCP can do error recovery without destroying your windowing. Multiplying by 3 is the most you would want; more than that is unnecessary and wasteful.

If you're looking for something you can do as a server to improve responsiveness over long fat networks, the simplest and safest is to turn up the slowstart multiplier (under FreeBSD it's the sysctl net.inet.tcp.slowstart_flightsize). This skips slowstart ahead just a bit, optimizing for a certain target audience (say 56k modems), while people with 300 baud modems will just have to drop some packets and back off. Ramping slowstart up from 1 segment can be very painful if your delay is extreme.

While I'm on the subject, I'm not certain whether Cisco's BGP is linked to the "ip tcp" settings or tunes itself, but that is a potential win for a peer over an LFN if it does not. Anyone want to comment?

And just for theory's sake: the correct way to fix the whole socket buffer mess is to automatically tune the buffers based on feedback from the congestion window.
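To put some code behind the bandwidth * delay numbers above, here's a minimal sketch of sizing a receive buffer from them. This is hypothetical illustration, not from any particular stack; bdp_bytes is my own helper, and it uses the same 1 Mbit = 2^20 bits convention as the figures above:

    /* sketch: size SO_RCVBUF from the bandwidth * delay product */
    #include <stdio.h>
    #include <sys/socket.h>

    /* bytes in flight: bandwidth (Mbit/s, 1 Mbit = 2^20 bits) * delay (ms) */
    static long bdp_bytes(double mbit, double delay_ms)
    {
        return (long)(mbit * (1 << 20) * (delay_ms / 1000.0) / 8.0);
    }

    int main(void)
    {
        long bdp = bdp_bytes(5.0, 800.0);   /* satellite case: 524,288 bytes */
        int rcvbuf = (int)(2 * bdp);        /* x2 headroom for loss recovery */

        int s = socket(AF_INET, SOCK_STREAM, 0);
        if (s >= 0)
            /* set before connect()/listen() so a large enough window
               scale factor gets negotiated during the handshake */
            setsockopt(s, SOL_SOCKET, SO_RCVBUF, &rcvbuf, sizeof(rcvbuf));

        printf("BDP %ld bytes, SO_RCVBUF %d bytes\n", bdp, rcvbuf);
        return 0;
    }

Note this only bumps the buffer for one socket; what the kernel actually grants is still capped by the system-wide maximums (kern.ipc.maxsockbuf on FreeBSD, for instance).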
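And a rough sketch of that autotuning idea, purely illustrative (the struct, names, and doubling policy here are my own invention, not PSC's actual code):

    #include <stdio.h>

    #define SB_MAX (1024 * 1024)  /* hard ceiling so one peer can't eat the kernel */

    struct conn {
        unsigned int snd_cwnd;    /* congestion window, bytes */
        unsigned int sb_hiwat;    /* send buffer high-water mark, bytes */
    };

    /* if cwnd has grown to within 3/4 of the buffer cap, the buffer
       (not the network) is about to become the bottleneck: double it */
    static void autotune_sndbuf(struct conn *c)
    {
        if (c->snd_cwnd >= c->sb_hiwat / 4 * 3 && c->sb_hiwat < SB_MAX) {
            unsigned int next = c->sb_hiwat * 2;
            c->sb_hiwat = next > SB_MAX ? SB_MAX : next;
        }
    }

    int main(void)
    {
        struct conn c = { 16 * 1024, 32 * 1024 };
        int i;
        for (i = 0; i < 8; i++) {
            c.snd_cwnd *= 2;              /* pretend the path keeps opening up */
            if (c.snd_cwnd > c.sb_hiwat)
                c.snd_cwnd = c.sb_hiwat;  /* the buffer caps the usable window */
            autotune_sndbuf(&c);
            printf("cwnd=%u sb_hiwat=%u\n", c.snd_cwnd, c.sb_hiwat);
        }
        return 0;
    }

The point is just that the buffer grows only when the congestion window proves the path can actually use it, instead of a static worst-case guess for every connection.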
PSC (http://www.psc.edu/networking/auto.html) has an implementation for NetBSD, but it goes a tiny bit overboard, doing nutty things like scanning the entire TCB list twice a second to try to achieve "fairness". Minus that part, however, it is actually pretty darn simple to implement (at least on BSD, where the socket buffers aren't allocated buffers at all, simply numbers which cap the maximum that can be allocated when data comes in).

-- 
Richard A Steenbergen <ras@e-gerbil.net>       http://www.e-gerbil.net/ras
PGP Key ID: 0x138EA177  (67 29 D7 BC E8 18 3E DA  B2 46 B3 D8 14 36 FE B6)