Ken, I don't think it was Bob Metcalfe... Note this telling extract from USENET of just over a year back.
------- Start of forwarded message -------
Newsgroups: comp.dcom.lans.ethernet
Path: cronkite.cisco.com!decwrl!parc!wirish
From: wirish@parc.xerox.com (Wes Irish)
Subject: Performance problems on high utilization Ethernets
Message-ID: <wirish.750731783@misty>
Summary: High utilization Ethernet performance problems traced to controller implementation bugs
Keywords: Ethernet, communications, interframe gap, IFG, collisions, controller, interface, packet loss, data link
Sender: news@parc.xerox.com
Organization: Xerox PARC
Date: 16 Oct 93 00:36:23 GMT
Lines: 115
For the past year or so I have been investigating performance problems on the Ethernets here at PARC. This work has uncovered problems with a number of Ethernet controllers in common use today. These low-level controller problems can lead to serious performance problems for many of the systems involved. A full paper on this work, "Investigations into observed performance problems on high utilization Ethernet networks", will be released soon (initially as a PARC Blue & White report). But, since I have been giving talks on this work and news of it has begun to hit the Internet, I feel that I should post a preliminary report in order to reduce speculation and to make sure that the facts are correctly stated. Below is a short summary of some of the key facts and issues.
The Ethernet specifications talk about making sure that transmitters enforce a 9.6 microsecond gap (IFG) between frames (packets). This is straightforward in the case of a gap following a just completed good packet. But gaps following collision events are less straightforward. I do not want to debate the details of what is and is not "correct" in this case -- that is a discussion for another time and place. The reality of the situation is that there are a number of controllers in widespread use on networks today that do not interoperate very well in the face of collisions.
In general, the problems arise when the gap following a collision is too short for a particular implementation of a receiver. In addition to uncovering controllers that simply generate short IFGs, I have also uncovered a major implementation bug in a particular chip that injects short signal bursts onto the network. These bursts can damage the IFG "enforced" by other machines. Either way, the result is the same -- a short IFG preceding a packet, which can result in a missed packet.
It is important to note that when a controller misses a packet due to a short IFG THE FACT THAT THE PACKET WAS MISSED IS NOT DETECTED NOR REPORTED TO THE SYSTEM. System and driver statistics will claim no packets lost (unless some are lost for other reasons). Even most network analyzers are subject to the same undetected and therefore unreported packet loss. I have resorted to using a digital oscilloscope to capture and analyze these events.
Let me emphasize that these problems are almost exclusively related to dealing with collision events. On a lightly loaded network, where collisions are few and far between, these problems are virtually non-existent. But these problems do indeed come into play on moderate to heavily loaded networks. Based on my observations a VERY ROUGH network load dividing line is about 25% load (using 0.1 or 1.0 sec samples).
Here is an enumeration of some of the facts related to particular controllers that I have uncovered so far. There may be problems with other controllers but they may not appear on the networks that I have inspected.
Controller: Intel 82586
Commonly found in: SUN 3's and SUN 4's (ie interfaces), many other machines
Problem: Can generate a short IFG following a collision
Cause: Starts IFG timer on CS dropout

Controller: Intel 82596
Commonly found in: Network General Sniffer using Cogent interface card
Problem: Will not hear packet unless preceding IFG is 4.6 usec or larger

Controller: SEEQ 8003
Commonly found in: Cisco MEC and MCI interfaces, older SGI (Silicon Graphics) including 4D/35 and Indigo (but not Indigo2)
Problem: Can generate a short IFG following a collision
Cause: Starts 9.6 usec timer at end of its own jam and not end of collision
Problem: Generates 24 bit signal burst onto network following some collisions. This burst lands in the IFG following the collision and will often result in two short IFGs resulting in other controllers missing the packet. NB: this can happen even if the chip has nothing to transmit!

Controller: AMD 7990 "LANCE"
Commonly found in: SUN SPARCStation machines (SS-1, SS-1+, SS-2, SS-10, ...), many DEC machines, Cisco/SynOptics routers, Cisco IGS, many other machines
Problem: Will not hear packet unless preceding IFG is 4.1 usec or larger
Cause: Implementation state machine
Problem: Many other problems including lock-up, transmit gaps greater than 9.6 usec under load, etc.
Fix: A new version of the controller, the 79C90 CLANCE, fixes many of these problems but is not in common use like the LANCE.

Interface chip: AT&T T7213
Commonly found in: SUN SPARCStation 10 and other newer SUN machines
Problem: Will hold the collision (and kill data) sent to the controller chip across IFGs of roughly 1.0 usec or less. It will also do this if a "manchester coding violation" is detected in a packet -- a job that should be left to the controller.
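As a rough illustration of how the receiver thresholds listed above interact with a shortened gap, here is a minimal Python sketch. The two threshold values are taken from the list; the dictionary layout and the function name receivers_that_miss are hypothetical and only for illustration. The T7213 is left out because its failure mode (holding the collision signal across very short gaps) is different from simply not hearing the packet.

```python
# Minimum preceding IFG (in microseconds) that each receiver needs in order
# to hear a packet, as quoted in the controller list. The table structure
# itself is an illustrative assumption, not part of the original report.
MIN_RX_IFG_USEC = {
    "Intel 82596": 4.6,
    "AMD 7990 LANCE": 4.1,
}

NOMINAL_IFG_USEC = 9.6  # gap that transmitters are supposed to enforce


def receivers_that_miss(ifg_usec):
    """Return the listed controllers that would not hear a packet
    preceded by an inter-frame gap of ifg_usec microseconds."""
    return sorted(
        name for name, needed in MIN_RX_IFG_USEC.items() if ifg_usec < needed
    )


if __name__ == "__main__":
    for gap in (NOMINAL_IFG_USEC, 4.5, 3.0):
        print(f"IFG {gap} usec -> missed by: {receivers_that_miss(gap)}")
```

A gap between 4.1 and 4.6 usec would be missed by one controller but heard by the other, which is the kind of interoperability mismatch the report describes.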
The result of all of these implementation details is that it is very possible, even probable, to put together a network that results in "undetected" packet loss. Packet loss rates of even less than 1% can result in performance hits as high as 80%, depending on a multitude of factors including the protocols and implementations being used. I have clocked the potential packet drop rate at PARC due to these problems to be in the 1% - 5% range at times.
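One reason a small per-packet loss rate can hurt so much is that it compounds over a multi-packet exchange. The sketch below is only back-of-the-envelope arithmetic under a loudly stated assumption: losses are independent from packet to packet, which is not claimed in the report and may not hold for collision-related drops. It does not model protocol timeouts or retransmission, and the function name is hypothetical.

```python
def prob_all_delivered(loss_rate, n_packets):
    """Probability that n_packets consecutive packets all arrive,
    ASSUMING each packet is lost independently with probability loss_rate."""
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    if n_packets < 0:
        raise ValueError("n_packets must be non-negative")
    return (1.0 - loss_rate) ** n_packets


if __name__ == "__main__":
    # Loss rates from the 1% - 5% range quoted above; burst sizes are arbitrary.
    for loss_rate in (0.01, 0.05):
        for n_packets in (8, 64):
            p = prob_all_delivered(loss_rate, n_packets)
            print(f"loss {loss_rate:.0%}, {n_packets} packets: "
                  f"P(no loss) = {p:.3f}")
```

Any exchange that needs every packet of a burst to arrive, and that stalls on a timeout when one is missing, will see its throughput fall much faster than the raw loss rate suggests.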
I have been working with many of these vendors for a number of months now in an attempt to get these various bugs fixed so that different equipment interoperates properly. Most of the vendors have been very receptive to making things work now that they know there is a problem. Some have already identified solutions while others are still working on them.
Wesley Irish
Network Scientist
Xerox PARC
wirish@parc.xerox.com

[Please send any replies via e-mail as I do not normally read netnews]
------- End of forwarded message -------
------------------------------------------------------------------------
Curtis, I'm not sure I understand your use of the term "unproven."
In LAN circles we've been discussing this exact same phenomenon for the last 9 months (I raised it with Jessica as a potential explanation of some of the problems we were seeing in our early testing).
Bob Metcalfe (coinventor of ethernet) discovered that some ethernet chip sets were also violating the inter-packet gap spec. A particular problem was that many of the devices used for sniffing themselves had the same chip sets and simply couldn't see what was happening to the packet stream (silent discards without errors signalled at the receiving end).
He needed very expensive signal analysis hardware before the cause could be isolated.
Ken Latta, Merit Network, Inc.
NSFNET Project, Internet Engineering Group
1071 Beal, Ann Arbor, MI 48109-2103
313.936.2115 voice, 313.747.3745 fax
klatta@umich.edu, USERLFQF@umichum.bitnet
From: Curtis Villamizar <curtis@ans.net>
To: nanog@merit.edu
FYI-
For those that don't appreciate the consequences of using unproven technology. The good news on Mae-East is packet loss is down to 15% from 40%? :-(
Congratulations to Sprint for picking a technology that is known to work for the Sprint NAP. FDDI works. We'll see how the other NAPs do, though I'm not encouraged by test results so far.
Curtis
BTW - this is Mae-East (the MFS bridged ethernet), not Mae-East+ (the bridged FDDI).
------- Forwarded Message
From: Sean Doran <smd@sprint.net>
Reply-To: smd@sprint.net
To: mae-east@uunet.uu.net
Subject: Moderately urgent: getting rid of annoying packet losses
Date: Wed, 19 Oct 1994 02:07:06 -0400
Sender: smd@tiny.sprintlink.net
The Magnum boxes are *very* unhappy with inter-packet gaps of less than about 23 microseconds, and drop back-to-back packets like superheated rocks.
We have a kludge which will help until the MFS hardware gets fixed.
Those of you running one Cisco with EIP 10-0 microcode or better should set the transmitter-delay of your MAE-EAST interface to 96 (0x60). This will dramatically reduce the packet loss across MAE-EAST.
IMPORTANT: Those of you who have more than one box on your ethernet drop to MFS will need to a/ acquire EIP 171-1 from Cisco and load it in then b/ set the transmitter-delay of each of your MAE-EAST interfaces to 0x360 (864).
The new microcode has apparently been well tested, and is doing the right thing for icm-dc-1.icp.net and sl-dc-6.sprintlink.net (drops to most of you have fallen from 40% to much less than 15%). It works by assigning new meanings to the upper 8 bits of the transmitter-delay value; this particular setting will delay the transfer of the packet to the datalink controller when there is traffic on the wire, then require an additional quiet time of 30 usec, after which there will be the standard 9.6 usec IEEE 802.3 delay.
(The original intent apparently was to avoid drops when bursting ethernet traffic encounters collisions by backing off on handing the packet to the datalink layer; the application here is not quite exactly what was intended, but definitely helps us).
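To see how the two recommended settings relate, it may help to split each value into its upper and lower 8 bits. The following Python sketch does only that arithmetic. It assumes the transmitter-delay value is a 16-bit quantity (the message says only that the upper 8 bits were given new meanings), and the function name split_transmitter_delay is hypothetical. It makes no claim about what the individual upper bits mean to the microcode.

```python
def split_transmitter_delay(value):
    """Split a transmitter-delay setting into (upper 8 bits, lower 8 bits).

    ASSUMPTION: the setting is treated as a 16-bit value; the original
    message does not state the field width explicitly.
    """
    if not 0 <= value <= 0xFFFF:
        raise ValueError("expected a value that fits in 16 bits")
    upper = (value >> 8) & 0xFF
    lower = value & 0xFF
    return upper, lower


if __name__ == "__main__":
    # The single-router setting (96) and the multi-router setting (0x360).
    for setting in (0x60, 0x360):
        upper, lower = split_transmitter_delay(setting)
        print(f"{setting} (0x{setting:X}): "
              f"upper=0x{upper:02X}, lower={lower} (0x{lower:02X})")
```

Both settings share the same lower byte, 0x60 (96); the 0x360 setting differs only in having 0x03 in the upper byte, which is where the new microcode's extra behavior is selected.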
Each of your routers attached to MAE-EAST must run the new EIP 171-6 microcode and have the 0x360 transmitter-delay setting.
Thanks to Robert M. Broberg of Cisco for the code.
Those of you without Ciscos will have to come up with a similar hack somehow.
Sean.
P.S.: We are *very* keen on PSI, NETCOM, and MCI to implement the change, especially PSI. We aren't having problems with anyone else we exchange traffic with at MAE-EAST (other than Dante AS1133, but that's not a Cisco) but everyone would probably benefit from the upgrade anyway. Try pinging each of your peers in 192.41.177 a hundred or so times.
--
Sean Doran <smd@sprint.net>
SprintLink/ICM engineering
+1 703 904 2089
------- End of Forwarded Message
-- --bill