I am not normally on this list, but someone kindly gave me copies of some of the email concerning the Internet2 Land Speed Record, so I have joined the list. As one of the PIs of the record, I thought it might be useful to comment on a few interesting items I have seen, and no, I am not trying to flame anybody:
"Give em a million dollars, plus fiber from here to anywhere and let me muck with the TCP algorithm, and I can move a GigE worth of traffic too - Dave"
You are modest in your budgetary request. Just the Cisco router (GSR 12406) we had on free loan listed at close to a million dollars, and the OC192 links just from Sunnyvale to Chicago would have cost what was left of the million per month. We used a stock TCP (Linux kernel TCP). We did, however, use jumbo frames (9000-byte MTUs).
In response to Richard A Steenbergen: we are not "now living in a tropical foreign country, with lots and lots of drugs and women", but then the weather in California is great today.
"What am I missing here, theres OC48=2.4Gb, OC192=10Gb ..."
We were running host to host (end-to-end) with a single stream on common off-the-shelf equipment. There are not too many (I think none) host NICs faster than 1GE available today that are in production (e.g. without signing a non-disclosure agreement).
"Production commercial networks ... Blow away these speeds on a regular basis".
See the above remark about end-to-end, application to application, single stream.
"So, you turn down/off all the parts of TCP that allow you to share bandwidth ..."
We did not mess with the TCP stack; it was stock, off the shelf.
"... Mention that "Internet speed records" are measured in terabit-meters/sec."
You are correct, this is important, but reporters want a sound bite and typically only focus on one thing at a time. I will make sure to emphasize this the next time I talk to a reporter.
Maybe we can get some mileage out of Petabmps (petabit metres per second).
"What kind of production environment needs a single TCP stream of data at 1Gbits/s over a 150ms latency link?"
Today High Energy Particle Physics needs hundreds of Mbits/s between California and Europe (Lyon, Padova and Oxford) to deliver data on a timely basis from an experiment site at SLAC to regional computer sites in Europe. Today on production academic networks (with sustainable rates of 100 to a few hundred Mbits/s) it takes about a day to transmit just over a Tbyte of data, which just about keeps up with the data rates. The data generation rates are doubling every year, so within 1-3 years we will need speeds like those in the record on a production basis. We needed to find out whether we can achieve the needed rates, whether we can do it with off-the-shelf hardware, how the hosts and OSes need configuring, how to tune the TCP stack or how newer stacks perform, what the requirements for jumbo frames are, etc. Besides High Energy Physics, other sciences are beginning to grapple with how to replicate large databases across the globe; such sciences include radio astronomy, human genome, global weather, seismic ...
The spud gun is interesting; given the distances, a 747 freighter packed with DST tapes or disks is probably a better idea. Assuming we fill the 747 with, say, 50 GByte tapes (disks would probably be better), then if it takes 10 hours to fly from San Francisco (BTW Sunnyvale is near San Francisco, not near LA as one person talking about retiring to better weather might lead one to believe), the bandwidth is about 2-4 Tbits/s. However, this ignores the reality of labelling, writing the tapes, removing them from the silo robot, packing, getting to the airport, loading, unloading, getting through customs etc. In reality the latency is closer to 2 weeks. Even worse, if there is an error (heads not aligned etc.) then the retry latency is long and the effort involved considerable.
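As a rough check of the 747 arithmetic above, here is a minimal sketch. It reads the tape figure in the message as a 50 GByte capacity and takes the 10 hour flight and 2 week door-to-door times from the message; the tape count `N_TAPES` is a hypothetical load chosen for illustration, not a number from the thread.

```python
def sneakernet_bandwidth_bps(n_tapes, tape_bytes, transit_seconds):
    """Effective bits/s of shipping n_tapes tapes, each holding tape_bytes bytes."""
    return n_tapes * tape_bytes * 8 / transit_seconds


TAPE_BYTES = 50e9              # 50 GByte per tape (tape size read as a capacity)
FLIGHT_S = 10 * 3600           # 10 hour flight, as stated in the message
DOOR_TO_DOOR_S = 14 * 86400    # about 2 weeks end to end, as stated in the message
N_TAPES = 200_000              # hypothetical load, NOT a figure from the thread

in_flight = sneakernet_bandwidth_bps(N_TAPES, TAPE_BYTES, FLIGHT_S)
door_to_door = sneakernet_bandwidth_bps(N_TAPES, TAPE_BYTES, DOOR_TO_DOOR_S)

print(f"in flight:    {in_flight / 1e12:.2f} Tbit/s")
print(f"door to door: {door_to_door / 1e9:.1f} Gbit/s")
```

Under these assumptions the in-flight figure falls inside the quoted 2-4 Tbits/s range, and stretching the same payload over two weeks cuts the effective rate by a factor of about 34 (336 hours instead of 10).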
Also the network solution lends itself much better to automation; in our case we saved a couple of full-time-equivalent people at the sending site who would otherwise distribute the data on a regular basis to our collaborator sites in France, the UK and Italy.
The remarks about window size and buffer are interesting also. It is true large windows are needed. To approach 1Gbits/s we require 40MByte windows. If this is going to be a problem, then we need to raise questions like this soon and figure out how to address them (add more memory, use other protocols etc.). In practice to approach 2.5Gbits/s requires 120MByte windows.
I am quite happy to concede that this does not need to be about some jocks beating a record. I do think it is important to catch the public's attention as to why high speeds are important and that they are achievable today application to application (it would also be useful to estimate when such speeds will be available to universities, large companies, small companies, the home etc.). For techies it is important to start to understand the challenges the high speeds raise (e.g. CPU and router memories; bugs in TCP, the OS and applications; new TCP stacks; new, possibly UDP-based, protocols such as tsunami; the need for 64-bit counters in monitoring; effects of the NIC; jumbo frame requirements) and what is needed to address them. It is also important to try to put it in meaningful terms (such as 2 full-length DVD movies in a minute, though that could also increase the "cease and desist" legal messages shipped ;-)).
Hope that helps, and thanks to you guys in NANOG for providing today's high speed networks.
LC> Date: Sat, 08 Mar 2003 10:04:20 -0800
LC> From: "Cottrell, Les"
LC> The remarks about window size and buffer are interesting
LC> also. It is true large windows are needed. To approach
LC> 1Gbits/s we require 40MByte windows. If this is going to be
LC> a problem, then we need to raise question like this soon and
LC> figure out how to address (add more memory, use other
LC> protocols etc.). In practice to approach 2.5Gbits/s requires
LC> 120MByte windows.
Yup. About 2x to 2.5x the bandwidth*delay product.
I'm still curious about insane SACK or maybe NACK. Spray TCP packets hoping they arrive (good odds), and wait to hear what made it or didn't make it. Let the receiving end have the large buffers... sending machines generally must handle a greater number of sessions.
ECN also would be a nice way of telling a sender to back off, [hopefully] proactively avoiding packet loss. It certainly seems a shame to require big sending buffers and slow down entire streams just in case a small bit gets lost.
Eddy
--
Brotsman & Dreger, Inc. - EverQuick Internet Division
Bandwidth, consulting, e-commerce, hosting, and network building
Phone: +1 (785) 865-5885 Lawrence and [inter]national
Phone: +1 (316) 794-8922 Wichita
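The "2x to 2.5x the bandwidth*delay product" remark can be checked with a few lines of arithmetic. This is only a sketch: it assumes a round-trip time of 150 ms (the latency figure quoted earlier in the thread, taken here as the RTT) and treats 1 MByte as 10^6 bytes.

```python
def bdp_bytes(rate_bps, rtt_s):
    """Bandwidth*delay product: bytes in flight needed to keep the path full."""
    return rate_bps * rtt_s / 8


RTT_S = 0.150  # assumption: the 150 ms figure quoted in the thread, used as the RTT

# (target rate in bits/s, window size in MBytes quoted in the thread)
quoted = [(1e9, 40), (2.5e9, 120)]

ratios = []
for rate_bps, window_mbytes in quoted:
    bdp = bdp_bytes(rate_bps, RTT_S)
    ratio = window_mbytes * 1e6 / bdp
    ratios.append(ratio)
    print(f"{rate_bps / 1e9:.1f} Gbit/s: BDP = {bdp / 1e6:.2f} MByte, "
          f"quoted window = {window_mbytes} MByte, ratio = {ratio:.2f}x")
```

With that RTT the quoted windows come out at a little over 2x and about 2.5x the bandwidth*delay product, which is in line with the estimate in the reply.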
You are modest in your budgetary request. Just the Cisco router (GSR 12406) we had on free loan listed at close to a million dollars, and the OC192 links just from Sunnyvale to Chicago would have cost what was left of the million per month.
No, your budget folks have no clue, which they clearly demonstrate. Anyone here who buys Cisco at list prices works for a company that for some reason wants to waste money. We pay about 10 cents on the dollar. Anyone leasing OC-192 at that price, as opposed to lighting it up, is smoking.
"What am I missing here, theres OC48=2.4Gb, OC192=10Gb ..."
We were running host to host (end-to-end) with a single stream with common off the shelf equipment, there are not too many (I think none) > 1GE host NICs available today that are in production (e.g. without signing a non-disclosure agreement).
Again, if all of this is available today, what have you guys done that is so new, apart from blowing tons of money?
The remarks about window size and buffer are interesting also. It is true large windows are needed. To approach 1Gbits/s we require 40MByte windows. If this is going to be a problem, then we need to raise questions like this soon and figure out how to address them (add more memory, use other protocols etc.). In practice to approach 2.5Gbits/s requires 120MByte windows.
I am quite happy to concede that this does not need to be about some jocks beating a record. I do think it is important to catch the public's attention as to why high speeds are important and that they are achievable today application to application (it would also be useful to estimate when such speeds will be available to universities, large companies, small companies, the home etc.). For techies it is important to start to understand the challenges the high speeds raise (e.g. CPU and router memories; bugs in TCP, the OS and applications; new TCP stacks; new, possibly UDP-based, protocols such as tsunami; the need for 64-bit counters in monitoring; effects of the NIC; jumbo frame requirements) and what is needed to address them. It is also important to try to put it in meaningful terms (such as 2 full-length DVD movies in a minute, though that could also increase the "cease and desist" legal messages shipped ;-)).
High speeds are not important. High speeds at a *reasonable* cost are important. What you are describing is a high speed at an *unreasonable* cost.
Alex
On Sat, Mar 08, 2003 at 03:29:56PM -0500, alex@yuriev.com quacked:
High speeds are not important. High speeds at a *reasonable* cost are important. What you are describing is a high speed at an *unreasonable* cost.
To paraphrase many a California surfer, "dude, chill out."
The bleeding edge of performance in computers and networks is always stupidly expensive. But once you've achieved it, the things you did to get there start to percolate back into the consumer stream, and within a few years, the previous bleeding edge is available in the current O(cheap) hardware.
A Cisco 7000 used to provide the latest and greatest performance in its day, for a rather considerable cost. Today, you can get a box from Juniper for the same price you paid for your 7000 that provides a few orders of magnitude more performance.
But to get there, you have to be willing to see what happens when you push the envelope. That's the point of the LSR, and a lot of other research efforts.
-Dave
--
work: dga@lcs.mit.edu me: dga@pobox.com
MIT Laboratory for Computer Science http://www.angio.net/
I do not accept unsolicited commercial email. Do not spam me.
To paraphrase many a California surfer, "dude, chill out."
When none of my taxes go to silly projects, I will chill out. It has been stated by the people who participated in this research that:
(a) they bought hardware at prices that help Cisco make its quarters;
(b) they spent millions of dollars on OC-192 links when they did not need them;
(c) they did not come up with anything new apart from a "proof" that they achieved that speed.
The bleeding edge of performance in computers and networks is always stupidly expensive. But once you've achieved it, the things you did to get there start to percolate back into the consumer stream, and within a few years, the previous bleeding edge is available in the current O(cheap) hardware.
That is all great if they *actually* *developed* something. However, they did not. They bought off-the-shelf products at list prices, plugged them in, ran slightly tweaked kernels, helped Qwest/Global Crossing etc. prop up their quarters and announced "we did it".
A Cisco 7000 used to provide the latest and greatest performance in its day, for a rather considerable cost. Today, you can get a box from Juniper for the same price you paid for your 7000 that provides a few orders of magnitude more performance.
But to get there, you have to be willing to see what happens when you push the envelope. That's the point of the LSR, and a lot of other research efforts.
That's the argument that the Pentagon used to justify buying $40 lightbulbs. Does not work, sorry.
Alex
On Sat, 2003-03-08 at 15:58, alex@yuriev.com wrote:
That's the argument that the Pentagon used to justify buying $40 lightbulbs. Does not work, sorry.
That is not the argument used to justify buying $40 lightbulbs. They do not actually purchase $40 lightbulbs; the prices that you see in rag magazine reports have to do with how the budgets are handled. If you can budget a multi-billion dollar organization and put in reasonable price and performance controls, there are many schools that would hire you after you revolutionized public administration and the DoD...
--
Douglas F. Calvert <douglist@anize.org>
On Sat, 8 Mar 2003, Cottrell, Les wrote:
We used a stock TCP (Linux kernel TCP). We did however, use jumbo frames (9000Byte MTUs).
What kind of difference did you see as opposed to standard 1500 byte packets? I did some testing once and things actually ran slightly faster with 1500 byte packets, completely contrary to my expectations... (This was UDP and just 0.003 km rather than 10,000, though.)
The remarks about window size and buffer are interesting also. It is true large windows are needed. To approach 1Gbits/s we require 40MByte windows. If this is going to be a problem, then we need to raise questions like this soon and figure out how to address them (add more memory, use other protocols etc.). In practice to approach 2.5Gbits/s requires 120MByte windows.
So how much packet loss did you see? Even with a few packets in a million lost this would bring your transfer way down and/or you'd need even bigger windows. However, bigger windows mean more congestion. When two of those boxes start pushing traffic at 1 Gbps with a 40 MB window, you'll see 20 MB worth of lost packets due to congestion in a single RTT. A test where the high-bandwidth session or several high-bandwidth sessions have to live side by side with other traffic would be very interesting. If this works well it opens up possibilities of doing this type of application over real networks rather than (virtual) point-to-point links where congestion management isn't an issue.
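To put rough numbers on "a few packets in a million", here is a sketch using the well-known Mathis et al. steady-state approximation for Reno-style TCP, rate ≈ C * MSS / (RTT * sqrt(p)) with C = sqrt(3/2). This is a textbook model brought in for illustration, not anything measured in the record run. The 150 ms RTT, the loss rate of 3 per million, and the MSS values (MTU minus 40 bytes of IPv4 and TCP headers, no options) are all assumptions.

```python
import math

C = math.sqrt(3 / 2)  # constant from the Mathis et al. approximation


def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate):
    """Approximate steady-state TCP throughput in bits/s for a given loss rate."""
    return C * mss_bytes * 8 / (rtt_s * math.sqrt(loss_rate))


def max_loss_for_rate(mss_bytes, rtt_s, rate_bps):
    """Largest loss rate at which the same model still sustains rate_bps."""
    return (C * mss_bytes * 8 / (rtt_s * rate_bps)) ** 2


RTT_S = 0.150        # assumed RTT
LOSS = 3e-6          # assumed: "a few packets in a million"
MSS_STANDARD = 1460  # 1500 byte MTU minus 40 bytes of headers
MSS_JUMBO = 8960     # 9000 byte MTU minus 40 bytes of headers

standard = mathis_throughput_bps(MSS_STANDARD, RTT_S, LOSS)
jumbo = mathis_throughput_bps(MSS_JUMBO, RTT_S, LOSS)
needed = max_loss_for_rate(MSS_JUMBO, RTT_S, 1e9)

print(f"1500 byte MTU: {standard / 1e6:.0f} Mbit/s")
print(f"9000 byte MTU: {jumbo / 1e6:.0f} Mbit/s")
print(f"loss rate needed for 1 Gbit/s with jumbo frames: {needed:.1e}")
```

In this model throughput scales linearly with MSS, so jumbo frames buy roughly a factor of six at the same loss rate, and even then a single stream at 1 Gbit/s over this RTT tolerates well under one lost packet per million.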
participants (6)
- alex@yuriev.com
- Cottrell, Les
- David G. Andersen
- Douglas F. Calvert
- E.B. Dreger
- Iljitsch van Beijnum