To all,

I have an east coast and west coast data center connected with a DS3. I am running into issues with streaming data via TCP and was wondering, besides hardware acceleration, are there any options for increasing throughput and maximizing the bandwidth? How can I overcome the TCP stack limitations inherent in Windows (registry tweaks don't seem to work too well)?

Philip
On 27-Mar-2007, at 16:26, Philip Lavine wrote:
I have an east coast and west coast data center connected with a DS3. I am running into issues with streaming data via TCP and was wondering, besides hardware acceleration, are there any options for increasing throughput and maximizing the bandwidth? How can I overcome the TCP stack limitations inherent in Windows (registry tweaks don't seem to work too well)?
You might take a look through RFC 2488/BCP 28, if you haven't already. The circuit propagation delays in the scenarios painted by that document are far higher than yours, but the principles are the same. Joe
On Mar 27, 2007, at 1:26 PM, Philip Lavine wrote:
inherent in Windows (registry tweaks don't seem to work too well)?
You should certainly look at your MTU and MSS values to ensure there are no difficulties of that sort. Is there any other factor, such as slow DNS server (or other lookup-type service) responses, which could be contributing to the perceived slowness?

How about tuning the I/O buffers on the relevant routers? Can you tune the I/O buffers on the servers?

And what about your link utilization? Is the DS3 sufficient? Take a look at pps and bps, and take a look at your average packet sizes (NetFlow can help with this). Are your apps sending lots of smaller packets, or are you getting nice, large packet sizes?

Finally, if none of the above help, you could look at something like SCTP, if your servers and apps support it. But I'd go through the above exercise first.

-----------------------------------------------------------------------
Roland Dobbins <rdobbins@cisco.com> // 408.527.6376 voice

Words that come from a machine have no soul.

-- Duong Van Ngo
Philip,
I have an east coast and west coast data center connected with a DS3. I am running into issues with streaming data via TCP and was wondering, besides hardware acceleration, are there any options for increasing throughput and maximizing the bandwidth? How can I overcome the TCP stack limitations inherent in Windows (registry tweaks don't seem to work too well)?
I don't know the RTT, but you should have at least 300 kByte buffers on the end hosts for a 60 ms RTT path to reach 40 Mbps TCP throughput. (This requires window scaling as well.) Is this what you were trying to tune on your Windows hosts? Is your DS3 free of errors? Even a very low packet loss can degrade TCP performance badly. You'll find a lot of useful information about TCP performance in the GEANT2 PERT Knowledge Base at http://www.kb.pert.geant2.net/ Andras
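For reference, the arithmetic behind that figure is just bandwidth times RTT. A quick sanity check (illustrative Python; plug in your measured RTT and target rate):

    rate_bps = 40e6      # assumed target throughput, bits/s
    rtt_s = 0.060        # assumed round-trip time, seconds
    bdp_bytes = rate_bps * rtt_s / 8
    print("required buffer: %.0f kB" % (bdp_bytes / 1e3))   # -> 300 kB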
At 04:26 PM 3/27/2007, Philip Lavine wrote:
I have an east coast and west coast data center connected with a DS3. I am running into issues with streaming data via TCP and was wondering, besides hardware acceleration, are there any options for increasing throughput and maximizing the bandwidth? How can I overcome the TCP stack limitations inherent in Windows (registry tweaks don't seem to work too well)?
You will have problems obtaining anything more than 5-7 Mbit/s based on 1500-byte Ethernet packets and an RTT of 70-90 ms. You can increase your window size or use jumbo Ethernet frames; almost all GigE gear supports jumbo frames. I'm not sure of your application, but without OS tweaks, each stream is limited to 5-7 Mbit/s. You can open multiple streams between the same two hosts, or you can use multiple hosts to transfer your data. You can utilize the entire DS3, but not without OS TCP stack tweaks or a move to jumbo frames. You can also use UDP or another connectionless packet method to move the data between sites. Good luck.

-Robert

Tellurian Networks - Global Hosting Solutions Since 1995
http://www.tellurian.com | 888-TELLURIAN | 973-300-9211
"Well done is better than well said." - Benjamin Franklin
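That 5-7 Mbit/s figure is what plain window/RTT arithmetic predicts for an untuned stack; a rough check, assuming a typical default window of about 64 kB (illustrative Python, the window size is an assumption):

    window = 65535                     # bytes, a common untuned default
    for rtt in (0.070, 0.090):
        print("RTT %.0f ms -> at most %.1f Mbit/s per stream"
              % (rtt * 1e3, window * 8 / rtt / 1e6))
    # -> roughly 7.5 and 5.8 Mbit/s, independent of frame size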
I have an east coast and west coast data center connected with a DS3. I am running into issues with streaming data via TCP and was wondering, besides hardware acceleration, are there any options for increasing throughput and maximizing the bandwidth?
Use GigE cards on the servers with a jumbo MTU and only buy IP network access from a service provider who supports jumbo MTUs end-to-end through their network.
How can I overcome the TCP stack limitations inherent in Windows (registry tweaks don't seem to work too well)?
Install a pair of Linux servers and use them to send/receive the data over the WAN. Also, do some googling for Internet Speed Record and read pages like this one: http://www.internet2.edu/lsr/history.html And read up on scaling MTU size with bandwidth here: http://www.psc.edu/~mathis/MTU/arguments.html --Michael Dillon
<michael.dillon@bt.com> writes:
Use GigE cards on the servers with a jumbo MTU and only buy IP network access from a service provider who supports jumbo MTUs end-to-end through their network.
I'm not sure that I see how jumbo frames help (very much). The principal issue here is the relatively large bandwidth-delay product, right? So you need large TCP send buffers on the sending side, a large (scaled) receive window on the receiver side, and turn on selective acknowledgement (so that you don't have to resend the whole send buffer if a packet gets dropped). At 45 Mb/s and 120 ms RTT, you need to be able to have ca. 700 KBytes of data "in flight"; round up and call it a megabyte. Having said that, I too have tried to configure Windows to use a large send buffer, and failed. (In my case, it was Windows machines at a remote location sending to Linux machines.) I'm not a Windows person; maybe I didn't try hard enough. In the event, I threw up my hands and installed a Linux proxy server at the remote site, appropriately configured, and went home happy. Jim Shankland
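For what it's worth, on the application side "large send buffer / large receive window" comes down to the socket buffer sizes requested before connecting. A minimal sketch of that in Python (illustrative only, not any particular application mentioned in this thread):

    import socket

    BUF = 1 << 20  # ~1 MB, rounding up the ~700 kB in-flight figure above

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Request large buffers *before* connecting so window scaling can be
    # negotiated on the SYN.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, BUF)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, BUF)
    print("SO_SNDBUF granted:", s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
    print("SO_RCVBUF granted:", s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
    # The OS may silently clamp these to its configured per-socket maximums,
    # which is why stack-level tuning still matters.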
Use GigE cards on the servers with a jumbo MTU and only buy IP network access from a service provider who supports jumbo MTUs end-to-end through their network.
I'm not sure that I see how jumbo frames help (very much). The principal issue here is ...
The people who know what helps are the ones who have been setting the Internet land speed records. They typically use frames larger than 1500. As for the principal issue, well, if there are several factors that will contribute to solving the problem, I think that you get better results if you attack all of them in parallel. Then, if you learn that there really is one principal factor and you need some management approval to move on that issue, you will have laid the groundwork for making a business case because you've already done all the other things. --Michael Dillon
<michael.dillon@bt.com> writes:
[...] if there are several factors that will contribute to solving the problem, I think that you get better results if you attack all of them in parallel.
Well, I guess; except that "only buy IP network access from a service provider who supports jumbo MTUs end-to-end through their network" may be a much bigger task than tuning your TCP stack. Jumbo frames seem to help a lot when trying to max out a 10 GbE link, which is what the Internet land speed record guys have been doing. At 45 Mb/s, I'd be very surprised if it bought you more than 2-4% in additional throughput. It's worth a shot, I suppose, if the network infrastructure supports it. On a coast-to-coast DS-3, a TCP stack that's correctly tuned for a high bandwidth-delay product environment, on the other hand, is likely to outperform an untuned stack by a factor of 10 or so in bulk transport over a single TCP session. (Though, as somebody pointed out, tuning may have to occur all the way up the application stack; there are, e.g., ssh patches out there for high-BDP environments.) So I guess, sure, try anything you can; but I know what I'd try first :-). Jim Shankland
On 28 Mar 2007, at 00:28, Jim Shankland wrote:
Jumbo frames seem to help a lot when trying to max out a 10 GbE link, which is what the Internet land speed record guys have been doing. At 45 Mb/s, I'd be very surprised if it bought you more than 2-4% in additional throughput. It's worth a shot, I suppose, if the network infrastructure supports it.
The original poster was talking about a streaming application - increasing the frame size can mean it takes longer for data to fill a packet and hit the wire, increasing actual latency in your application. Probably doesn't matter when the stream is text, but as voice and video get pushed around via IP more and more, this will matter.
The original poster was talking about a streaming application - increasing the frame size can mean it takes longer for data to fill a packet and hit the wire, increasing actual latency in your application.
Probably doesn't matter when the stream is text, but as voice and video get pushed around via IP more and more, this will matter.
Increasing the MTU is not the same as increasing the frame size. MTU stands for Maximum Transmission Unit and is a ceiling on the frame size. Frames larger than the MTU must be fragmented. Clearly it is dumb for a voice application or a realtime video application to use large frames, but setting the MTU on a WAN interface to something higher than 1500 does not require the application to fill up its frames. Also, if a video application is not realtime, then use of large frames is more likely to do good than to do harm. --Michael Dillon
Thus spake "Andy Davidson" <andy@nosignal.org>
The original poster was talking about a streaming application - increasing the frame size can mean it takes longer for data to fill a packet and hit the wire, increasing actual latency in your application.
Probably doesn't matter when the stream is text, but as voice and video get pushed around via IP more and more, this will matter.
It's a serious issue for voice due to the (relatively) low bandwidth, which is why most voice products only put 10-30ms of data in each packet. Video, OTOH, requires sufficient bandwidth that packetization time is almost irrelevant. With a highly compressed 1Mbit/s stream you're looking at 12ms to fill a 1500B packet vs 82ms to fill a 10kB packet. It's longer, yes, but you need jitter buffers of 100-200ms to do real-time media across the Internet, so that and speed-of-light issues are the dominant factors in application latency. And, as bandwidth inevitably grows (e.g. ATSC 1080i or 720p take up to 19Mbit/s), packetization time quickly fades into the background noise. Now, if we were talking about greater-than-64kB jumbograms, that might be another story, but most folks today use "jumbo" to mean packets of 8kB to 10kB, and "baby jumbos" to mean 2kB to 3kB.

S

Stephen Sprunk, CCIE #3723, K5SSS
"Those people who think they know everything are a great annoyance to those of us who do." --Isaac Asimov
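The packetization-delay numbers above are easy to reproduce (illustrative Python; the stream rates and packet sizes are simply the examples from the paragraph):

    def fill_time_ms(rate_bps, pkt_bytes):
        return pkt_bytes * 8.0 / rate_bps * 1000

    for rate in (1e6, 19e6):           # 1 Mbit/s compressed stream vs. 19 Mbit/s ATSC
        for size in (1500, 10240):     # standard frame vs. ~10 kB jumbo
            print("%8.0f bit/s, %5d B -> %5.1f ms" % (rate, size, fill_time_ms(rate, size)))
    # 1 Mbit/s: 12 ms vs. 82 ms; 19 Mbit/s: under 1 ms vs. ~4 ms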
Use GigE cards on the servers with a jumbo MTU and only buy IP network access from a service provider who supports jumbo MTUs end-to-end through their network.
To check MTU on transit paths, try mturoute: http://www.elifulkerson.com/projects/mturoute.php As well as the MTU eye-chart: http://www.elifulkerson.com/projects/mtu-eyechart.php -Hank Nussbacher http://www.interall.co.il
Jim Shankland wrote:
<michael.dillon@bt.com> writes:
Use GigE cards on the servers with a jumbo MTU and only buy IP network access from a service provider who supports jumbo MTUs end-to-end through their network.
I'm not sure that I see how jumbo frames help (very much).
Jumbograms don't change your top speed, but they do mean you accelerate through slow start more quickly. If there is non-congestion-based packet loss on a link, you can end up with slow start being cut short and waiting for linear increase, which can mean it takes hours to reach steady state instead of minutes. Jumbograms reduce this by a factor of 6, which of course helps (60 minutes -> 10 minutes...). <snip other good advice>
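To put rough numbers on that factor of 6, assuming recovery happens in congestion avoidance (about one MSS of window growth per RTT) and borrowing the ~700 kB target window and 120 ms RTT quoted elsewhere in the thread (illustrative Python):

    target_window = 700000      # bytes to get in flight (BDP from elsewhere in the thread)
    rtt = 0.120                 # seconds
    for mss in (1460, 8960):    # standard frames vs. 9000-byte jumbos
        rtts = target_window / float(mss)   # ~1 MSS of window growth per RTT
        print("MSS %4d: ~%3.0f RTTs = %4.1f s of linear growth" % (mss, rtts, rtts * rtt))
    # -> roughly 58 s vs. 9 s; the ratio is simply 8960/1460, about 6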
At 45 Mb/s and 120 ms RTT, you need to be able to have ca. 700 KBytes of data "in flight"; round up and call it a megabyte.
I have written a calculator to help people explore these issues: http://wand.net.nz/~perry/max_download.php It also includes TFRC to show how non-congestion-related packet loss impacts your performance too (got a dodgy wireless hop there somewhere? Well expect everything to be glacially slow...)
Having said that, I too have tried to configure Windows to use a large send buffer, and failed. (In my case, it was Windows machines at a remote location sending to Linux machines.) I'm not a Windows person; maybe I didn't try hard enough. In the event, I threw up my hands and installed a Linux proxy server at the remote site, appropriately configured, and went home happy.
I've never really been a Windows guy either, and I've never had a Windows machine in a position where it needed to be tuned. Of course most of the tuning is just upping the rwin. If upgrading Windows is an option, Vista apparently has a larger default rwin, plus an optional "Compound TCP" congestion control system designed for use over high bandwidth-delay WAN links.
On Tue, 27 Mar 2007, Philip Lavine wrote:
I have an east coast and west coast data center connected with a DS3. I am running into issues with streaming data via TCP and was wondering, besides hardware acceleration, are there any options for increasing throughput and maximizing the bandwidth? How can I overcome the TCP stack limitations inherent in Windows (registry tweaks don't seem to work too well)?
You should talk to the vendor (Microsoft) and ask them how to tweak their product to work properly over the WAN. Don't let them get away with a substandard product when it comes to WAN optimization. If you can get Microsoft to clean up their act, you'll have done ISPs a great service, because then we can stop trying to convince customers that it's not the ISP's fault that they get bad speeds with their Windows PCs. -- Mikael Abrahamsson email: swmike@swm.pp.se
I have an east coast and west coast data center connected with a DS3. I am running into issues with streaming data via TCP and was wondering, besides hardware acceleration, are there any options for increasing throughput and maximizing the bandwidth? How can I overcome the TCP stack limitations inherent in Windows (registry tweaks don't seem to work too well)?
even on "default settings" on a modern TCP stack, getting close to path-line-rate on a 80msec RTT WAN @ DS3 speeds with a single TCP stream should not be that difficult. the Windows TCP stack as of Windows XP SP2 has some fairly decent defaults. it will do RFC1323 / large windows / SACK., but all of these can be tuned with registry settings if you wish. with a particular implementation of FCIP (Fibre Channel over IP) i worked on, we could pretty much sustain a single TCP stream from a single GbE port at wire-rate GbE with RTT up to 280msec with minimal enhancements to TCP. at that point it started to get difficult because you had close to 32MB of data in transit around at any given time, which is the current standard limit for how large you can grow a TCP window. i think the first thing you should do is ascertain there are no problems with your LAN or WAN. i.e. that there are no drops being recorded, no duplex mismatch anywhere, etc. i suggest you fire up "ttcp" on a host on each end and see what throughput you get. with both tcp & udp you should be able to get close to 5.5 to 5.6 MB/s. if you can't, i'd suggest looking into why & addressing the root cause. once you've done that, its then a case of ensuring the _applications_ you're using can actually "fill the pipe" and aren't latency-sensitive at that distance.
On 3/27/07, Lincoln Dale <ltd@interlink.com.au> wrote:
even on "default settings" on a modern TCP stack, getting close to path-line-rate on a 80msec RTT WAN @ DS3 speeds with a single TCP stream should not be that difficult.
The Windows TCP stack as of Windows XP SP2 has some fairly decent defaults: it will do RFC 1323 / large windows / SACK, but all of these can be tuned with registry settings if you wish.
I was under the impression that XP's default window size was 17,520 bytes, with the RFC 1323 options disabled. Assuming 80 ms and 45 Mb/s, I come up with a window size of 440 kBytes required to fill the pipe. At the Windows default I would only expect to see about 220 kB/s over that same path. I think even modern *nix OSs tend to have default window sizes in the 64 kB region, still not enough for that bandwidth-delay product. -- -Steve
You might want to look at this classic by Stanislav Shalunov: http://shlang.com/writing/tcp-perf.html

Marshall

On Mar 27, 2007, at 4:26 PM, Philip Lavine wrote:
To all,
I have an east coast and west coast data center connected with a DS3. I am running into issues with streaming data via TCP and was wondering, besides hardware acceleration, are there any options for increasing throughput and maximizing the bandwidth? How can I overcome the TCP stack limitations inherent in Windows (registry tweaks don't seem to work too well)?
Philip
Marshall Eubanks wrote:
You might want to look at this classic by Stanislav Shalunov
The description on this website is very good.

Disclaimer: I'm a FreeBSD TCP/IP network stack kernel hacker.

To quickly sum up the facts and to dispel some misinformation:

- TCP is limited by the delay-bandwidth product and the socket buffer sizes.
- For a T3 with 70 ms RTT, your socket buffers on both ends should be 450-512 KB.
- TCP is also limited by the round trip time (RTT).
- If your application is working in a request/reply model, no amount of bandwidth will make a difference. The performance is then entirely dominated by the RTT. The only solution would be to run multiple sessions in parallel to fill the available bandwidth.
- Jumbo frames have definitely zero impact on your case, as they don't change any of the limiting parameters and don't make TCP go faster. There are certain very high-speed and LAN (<5ms) cases where they may make a difference, but not here.
- Your problem is not machine or network speed, only tuning.

Change these settings on both ends and reboot once to get better throughput:

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]
"SackOpts"=dword:0x1 (enable SACK)
"TcpWindowSize"=dword:0x7D000 (512000 Bytes)
"Tcp1323Opts"=dword:0x3 (enable window scaling and timestamps)
"GlobalMaxTcpWindowSize"=dword:0x7D000 (512000 Bytes)

http://www.microsoft.com/technet/network/deploy/depovg/tcpip2k.mspx

-- Andre
Marshall
On Mar 27, 2007, at 4:26 PM, Philip Lavine wrote:
To all,
I have an east coast and west coast data center connected with a DS3. I am running into issues with streaming data via TCP and was wondering, besides hardware acceleration, are there any options for increasing throughput and maximizing the bandwidth? How can I overcome the TCP stack limitations inherent in Windows (registry tweaks don't seem to work too well)?
Philip
On Mar 28, 2007, at 5:12 AM, Andre Oppermann wrote:
Marshall Eubanks wrote:
You might want to look at this classic by Stanislav Shalunov http://shlang.com/writing/tcp-perf.html
The description on this website is very good.
Disclaimer: I'm a FreeBSD TCP/IP network stack kernel hacker.
To quickly sum up the facts and to dispel some misinformation:
- TCP is limited by the delay-bandwidth product and the socket buffer sizes.
- For a T3 with 70 ms RTT, your socket buffers on both ends should be 450-512 KB.
- TCP is also limited by the round trip time (RTT).
- If your application is working in a request/reply model, no amount of bandwidth will make a difference. The performance is then entirely dominated by the RTT. The only solution would be to run multiple sessions in parallel to fill the available bandwidth.
- Jumbo frames have definitely zero impact on your case, as they don't change any of the limiting parameters and don't make TCP go faster. There are certain very high-speed and LAN (<5ms) cases where they may make a difference, but not here.
- Your problem is not machine or network speed, only tuning.
Change these settings on both ends and reboot once to get better throughput:
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]
"SackOpts"=dword:0x1 (enable SACK)
"TcpWindowSize"=dword:0x7D000 (512000 Bytes)
"Tcp1323Opts"=dword:0x3 (enable window scaling and timestamps)
"GlobalMaxTcpWindowSize"=dword:0x7D000 (512000 Bytes)
http://www.microsoft.com/technet/network/deploy/depovg/tcpip2k.mspx
And, of course, if you have Ethernet duplex or other mismatch issues anywhere along the path, performance will be bad.

Regards,
Marshall
-- Andre
Marshall On Mar 27, 2007, at 4:26 PM, Philip Lavine wrote:
To all,
I have an east coast and west coast data center connected with a DS3. I am running into issues with streaming data via TCP and was wondering, besides hardware acceleration, are there any options for increasing throughput and maximizing the bandwidth? How can I overcome the TCP stack limitations inherent in Windows (registry tweaks don't seem to work too well)?
Philip
Andre Oppermann gave the best advice so far IMHO. I'll add a few points.
To quickly sum up the facts and to dispel some misinformation:
- TCP is limited by the delay-bandwidth product and the socket buffer sizes.
Hm... what about: The TCP socket buffer size limits the achievable throughput-RTT product? :-)
- For a T3 with 70 ms RTT, your socket buffers on both ends should be 450-512 KB.
Right. (Victor Reijs' "goodput calculator" says 378kB.)
- TCP is also limited by the round trip time (RTT).
This was stated before, wasn't it?
- if your application is working in a request/reply model no amount of bandwidth will make a difference. The performance is then entirely dominated by the RTT. The only solution would be to run multiple sessions in parallel to fill the available bandwidth.
Very good point. Also, some applications have internal window limitations. Notably SSH, which has become quite popular as a bulk data transfer method. See http://kb.pert.geant2.net/PERTKB/SecureShell
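To put a number on the request/reply point: with synchronous request/reply, you get at most about one message per RTT no matter how fat the pipe is. A quick illustration (Python; the 8 kB message size is an assumption):

    rtt = 0.070            # seconds, coast to coast (assumed)
    msg = 8 * 1024         # bytes per synchronous request/reply (assumed)
    print("at most %.0f transactions/s" % (1 / rtt))                      # ~14
    print("effective throughput ~%.2f Mbit/s" % (msg * 8 / rtt / 1e6))    # ~0.94 Mbit/s, on a 45 Mbit/s link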
- Jumbo frames have definitely zero impact on your case, as they don't change any of the limiting parameters and don't make TCP go faster.
Right. Jumbo frames have these potential benefits for bulk transfer:

(1) They reduce the forwarding/interrupt overhead in routers and hosts by reducing the number of packets. But in your situation it is quite unlikely that the packet rate is a bottleneck. Modern routers typically forward even small packets at line rate, and modern hosts/OSes/Ethernet adapters have mechanisms such as "interrupt coalescence" and "large send offload" that make the packet size largely irrelevant. But even without these mechanisms and with 1500-byte packets, 45 Mb/s shouldn't be a problem for hosts built in the last ten years, provided they aren't (very) busy with other processing.

(2) As Perry Lorier pointed out, jumbo frames accelerate the "additive increase" phases of TCP, so you reach full speed faster both at startup and when recovering from congestion. This may be noticeable when there is competition on the path, or when you have many smaller transfers such that ramp-up time is an issue.

(3) Large frames reduce header overhead somewhat. But the improvement going from 1500-byte to 9000-byte packets is only 2-3%, from ~97% efficiency to ~99.5%. No orders of magnitude here.
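Point (3) in numbers (illustrative Python, assuming 40 bytes of IP+TCP header per packet):

    headers = 40                      # assumed IP + TCP header bytes per packet
    for mtu in (1500, 9000):
        eff = (mtu - headers) / float(mtu)
        print("MTU %4d: %.1f%% payload efficiency" % (mtu, eff * 100))
    # -> 97.3% vs. 99.6%, a difference of a couple of percent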
There are certain very high-speed and LAN (<5ms) cases where it may make a difference, but not here.
Cases where jumbo frames might make a difference: When the network path or the hosts are pps-limited (in the >Gb/s range with modern hosts); when you compete with other traffic. I don't see a relation with RTTs - why do you think this is more important on <5ms LANs?
- Your problem is not machine or network speed, only tuning.
Probably yes, but it's not clear what is actually happening. As often happens, the problem is described with very little detail, so experts (and "experts" :-) have a lot of room to speculate.

This was the original problem description from Philip Lavine:

  I have an east coast and west coast data center connected with a DS3. I am running into issues with streaming data via TCP

In the meantime, Philip gave more information about the throughput he is seeing (no mention of how this is measured, whether it is total load on the DS3, throughput for an application/transaction, or whatever):

  This is the exact issue. I can only get between 5-7 Mbps.

And about the protocols he is using:

  I have 2 data transmission scenarios: 1. Microsoft MSMQ data using TCP 2. "Streaming" market data stock quotes transmitted via TCP sockets

It seems quite likely that these applications have their own performance limits in high-RTT situations.

Philip, you could try a memory-to-memory test first, to check whether TCP is really the limiting factor. You could use the TCP tests of iperf, ttcp or netperf, or simply FTP a large-but-not-too-large file to /dev/null multiple times (so that it is cached and you don't measure the speed of your disks).

If you find that this, too, gives you only 5-7 Mb/s, then you should look at tuning TCP according to Andre's excellent suggestions quoted below, and check for duplex mismatches and other sources of transmission errors. If you find that the TCP memory-to-memory test gives you close to DS3 throughput (modulo overhead), then maybe your applications limit throughput over long-RTT paths, and you have to look for tuning opportunities at that level.
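If iperf/ttcp/netperf aren't convenient on the Windows hosts, even a crude memory-to-memory test gives a useful first answer. A minimal sketch (illustrative Python, not a replacement for the tools above; run the receiver on one coast, point the sender at it from the other, and adjust the port and buffer sizes to taste):

    import socket, sys, time

    PORT = 5001                  # arbitrary test port (assumption)
    BUF = 1 << 20                # ~1 MB socket buffers, per the tuning advice above
    CHUNK = 64 * 1024

    def receiver():
        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # Set on the listening socket so the accepted connection inherits it
        # on most stacks; the OS may clamp it to its configured maximum.
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, BUF)
        srv.bind(("", PORT))
        srv.listen(1)
        conn, _ = srv.accept()
        total, start = 0, time.time()
        while True:
            data = conn.recv(CHUNK)
            if not data:
                break
            total += len(data)           # discard: memory-to-memory only
        secs = time.time() - start
        print("received %.1f MB in %.1f s = %.2f Mbit/s"
              % (total / 1e6, secs, total * 8 / secs / 1e6))

    def sender(host, seconds=20):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, BUF)
        s.connect((host, PORT))
        payload = b"\0" * CHUNK
        end = time.time() + seconds
        while time.time() < end:
            s.sendall(payload)
        s.close()

    if __name__ == "__main__":
        receiver() if len(sys.argv) == 1 else sender(sys.argv[1])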
Change these settings on both ends and reboot once to get better throughput:
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]
"SackOpts"=dword:0x1 (enable SACK)
"TcpWindowSize"=dword:0x7D000 (512000 Bytes)
"Tcp1323Opts"=dword:0x3 (enable window scaling and timestamps)
"GlobalMaxTcpWindowSize"=dword:0x7D000 (512000 Bytes)
http://www.microsoft.com/technet/network/deploy/depovg/tcpip2k.mspx -- Simon.
participants (17)

- Andre Oppermann
- Andy Davidson
- Hank Nussbacher
- JAKO Andras
- Jim Shankland
- Joe Abley
- Lincoln Dale
- Marshall Eubanks
- michael.dillon@bt.com
- Mikael Abrahamsson
- Perry Lorier
- Philip Lavine
- Robert Boyle
- Roland Dobbins
- Simon Leinen
- Stephen Sprunk
- Steve Meuse