I bet myself you would respond the way you did before I finished processing my mail bag. I won. Sure, 100% packet loss is eminently acceptable if that loss rate occurs not more than 1% of the time. Maybe 10% packet loss
I suppose that is correct. 100% packet loss for 1% of my *connections*, I can live with that. We also lived with high delays (like the 0.5 second satellite round trip on USAN, I don't mean those occasional 75 seconds in some, ahem, other environments at that time), as long as it was predictable. What drives me nuts is when services are unpredictable: immediate packet service at the 40th percentile, a few seconds and occasional losses for another 40% of the packets, and the opportunity to keep up with Dave's technical field between keystrokes for the remaining 20%.

What does a consistent 10% packet loss mean? I think it has little to do with telco line qualities these days, and more with resource contention. What is contended? A T1 line (or whatever) is never, ever congested, as it is just a simple bit pipe. The contention is at the equipment, called routers, which aggregate more traffic on their inbound interfaces than they can dump onto their outbound interfaces (e.g., for outbound line capacity reasons, with buffers then filling up in the router). Historically that was often a router problem, as routers were too slow to deal with the onslaught of packets at a plain packets-per-second rate (remember, in 1987 the NSF solicitation asked for a then whopping 1000 packets per second per router, which was just barely achievable then). Today you can buy technology off the shelf that does not have a pps problem for typical situations.

So what is the problem, if it is not the router interconnection or the router technology? The answer is bad network engineering, little consideration for architectural requirements, and lack of understanding of the Internet workload profile. Intra-NSP, perhaps even more among NSPs. Or, in other words, it is people that kill the network, not the routers or phone lines, particularly people who are trying to make money off it, probably using their unique optimization function focused on profit and limiting expenses as much as they can, not understanding the fate sharing yet.

A constant 10% packet loss (or any constant packet loss) means that the network is underprovisioned. The *deployed* Internet depends heavily on massive aggregation of microscopic transactions (e.g., 10 packets or so at the 50th percentile of transactions, and tens of thousands of them perhaps in parallel). These aggregations result in some degree of steady state, but also burst behavior, which even in a well designed network can result in occasional losses due to overutilization of resources. But it should not happen consistently, if someone were to claim it to be a well designed and implemented network.

The conventional wisdom says to upgrade the capacity (e.g., more bandwidth to improve the outflow from the routers) to handle the additional load in cases of resource contention. That can be an expensive undertaking, and an administrative nightmare, especially in the international realm. Maybe a bandaid could be a prioritization of traffic, so that more deserving traffic gets better service. For example, 10% packet losses on my email I would typically not even know about, but most interactive connections (call it laziness, stupidity, whatever) create several packets per keystroke, with their demands for end-to-end echoes (hey, Dave, you did a bad job of technology transfer out of the Fuzzballs, as you got it right: line mode by default, going into character mode only if really necessary; i.e., proof that it is possible to do it right was available 10 years ago).
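A minimal sketch of the contention being described here, with all numbers (flow count, burst probability, transaction size, buffer depth, drain rate) invented purely for illustration: many mostly-idle flows each occasionally emit a ten-packet transaction toward one router output port with a finite buffer, and the loss rate depends on how the drain rate compares to the offered average.

import random

def simulate(drain_per_step, num_flows=500, steps=10000, buffer_pkts=200,
             burst_prob=0.008, burst_len=10, seed=1):
    """Toy discrete-time model of one router output port.

    Each flow is idle most of the time and now and then emits a short burst
    of packets (a "microscopic transaction").  The output line drains a
    fixed number of packets per step; arrivals that find the buffer full
    are dropped.  Returns the long-run loss fraction.  All parameters are
    made up for illustration, not measurements.
    """
    rng = random.Random(seed)
    remaining = [0] * num_flows          # packets left in each flow's current burst
    queue = offered = dropped = 0
    for _ in range(steps):
        for i in range(num_flows):
            if remaining[i] == 0 and rng.random() < burst_prob:
                remaining[i] = burst_len          # a new transaction starts
            if remaining[i] > 0:
                remaining[i] -= 1
                offered += 1
                if queue < buffer_pkts:
                    queue += 1
                else:
                    dropped += 1                  # buffer overflow: packet lost
        queue = max(0, queue - drain_per_step)    # the line itself just drains
    return dropped / offered

if __name__ == "__main__":
    # offered load is roughly num_flows * burst_len / (burst_len + 1/burst_prob),
    # i.e. about 37 packets per step with the defaults above
    for drain in (33, 37, 41):
        print(f"drain {drain} pkts/step: loss {simulate(drain):.1%}")

With the drain rate below the offered average the loss is constant (the underprovisioned case); with a little headroom it shrinks to the occasional burst-induced kind described above.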
Prioritization can be left either to the service provider (who may have to hide it; and it is also very hard to serve everyone right that way), or to the end-user. If the end-user specifies a service profile (be it IP Precedence or whatever), it will only work if there is a penalty for higher service qualities (e.g., quotas, or precedence-0 is free, while higher ones cost the end-user (or someone else in the path whose pain would be desirable here) something). You would still need to understand the workload profile eventually; simple utilization graphs won't suffice if you compare the common microscopic transactions with those exhibiting a high bandwidth * duration product (e.g., them new multimedia thingies). Anyway, no magic here either. These issues have been on the table for many years already, nothing new, though if the Internet is to survive, some service providers probably need to adjust their optimization models.
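A rough sketch of the kind of penalty described above, under made-up assumptions (a per-sender quota of high-precedence packets and strict-priority service between just two classes); it does not correspond to any particular router's actual feature set.

from collections import deque

class PrecedenceScheduler:
    """Two-class strict-priority queue: precedence 0 is free, higher
    precedence is honoured only while the sender still has quota left
    (the 'penalty'); once the quota is gone, the traffic is demoted to
    best effort.  Illustrative only."""

    def __init__(self, quota_pkts):
        self.high = deque()
        self.best_effort = deque()
        self.quota = dict(quota_pkts)        # sender -> remaining paid packets

    def enqueue(self, sender, packet, precedence=0):
        if precedence > 0 and self.quota.get(sender, 0) > 0:
            self.quota[sender] -= 1          # the sender pays for better service
            self.high.append(packet)
        else:
            self.best_effort.append(packet)  # the free tier

    def dequeue(self):
        if self.high:                        # interactive keystrokes go first
            return self.high.popleft()
        return self.best_effort.popleft() if self.best_effort else None

# invented sender name and quota, just to show the demotion behaviour
sched = PrecedenceScheduler({"abc-isp": 2})
sched.enqueue("abc-isp", "telnet keystroke", precedence=5)   # uses quota
sched.enqueue("abc-isp", "mail chunk")                       # free tier
sched.enqueue("abc-isp", "telnet keystroke", precedence=5)   # uses quota
sched.enqueue("abc-isp", "telnet keystroke", precedence=5)   # quota gone: demoted
print([sched.dequeue() for _ in range(4)])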
On Wed, 8 Nov 1995, Hans-Werner Braun wrote:
A constant 10% packet loss (or any constant packet loss) means that the network is underprovisioned.
Agreed
These aggregations result in some degree of steady state, but also burst behavior, which even in a well designed [...]

People who have analyzed Internet traffic would disagree with that statement. The traffic patterns on the Internet do not appear to converge on a steady state when multiple packet streams are aggregated.
Michael Dillon        Voice:  +1-604-546-8022
Memra Software Inc.   Fax:    +1-604-542-4130
http://www.memra.com  E-mail: michael@memra.com
On Wed, 8 Nov 1995, Hans-Werner Braun wrote:
A constant 10% packet loss (or any constant packet loss) means that the network is underprovisioned.
Agreed
These aggregations result in some degree of steady state, but also burst behavior, which even in a well designed [...]

People who have analyzed Internet traffic would disagree with that statement. The traffic patterns on the Internet do not appear to converge on a steady state when multiple packet streams are aggregated.
you're the greatest expert around here, obviously.
Michael Dillon        Voice:  +1-604-546-8022
Memra Software Inc.   Fax:    +1-604-542-4130
http://www.memra.com  E-mail: michael@memra.com
On Wed, 8 Nov 1995, Hans-Werner Braun wrote:
These aggregations result in some degree of steady state, but also burst behavior, which even in a well designed [...]

People who have analyzed Internet traffic would disagree with that statement. The traffic patterns on the Internet do not appear to converge on a steady state when multiple packet streams are aggregated.
you're the greatest expert around here, obviously.
Hardly... I just have a memory which is good at storing facts. Sometime in the last month on either nanog or com-priv, a person whose name escapes me posted an article about some statistical studies that had been done on the old NSFnet and recently redone. In both instances the traffic patterns did not smooth out after aggregation. There was a specific mathematical/statistical term used to describe the distribution, but I can't remember it (not a mathematician). So, no, I am not a great expert, but I listen carefully, try to understand intuitively what the real experts are saying, and think about the implications.

In any case, expertise is not an overall thing in this day and age. I am an expert in certain fields about which Curtis and Sean probably know little. They in turn are experts far beyond my level in their fields. One nice thing about Internet lists (and some parts of USENET) is that we can all rub shoulders and, from time to time, develop a synergy that helps us understand our own problems better and find solutions that we would not have found strictly within the confines of our own field of expertise.

Michael Dillon        Voice:  +1-604-546-8022
Memra Software Inc.   Fax:    +1-604-542-4130
http://www.memra.com  E-mail: michael@memra.com
On Wed, 8 Nov 1995, Hans-Werner Braun wrote:
These aggregations result in some degree of steady state, but also burst behavior, which even in a well designed [...]

People who have analyzed Internet traffic would disagree with that statement. The traffic patterns on the Internet do not appear to converge on a steady state when multiple packet streams are aggregated.
you're the greatest expert around here, obviously.
Hardly...
I just have a memory which is good at storing facts. Sometime in the last month on either nanog or com-priv a person whose name escapes me posted an article about some statistical studies that had been done on the old NSFnet and recently redone. In both instances the traffic patterns did [...]
Oh, man, this is getting so silly I can hardly bear to look in my mail box. If you'd wanted to name someone who'd been recently working at measuring and statistically characterizing Internet traffic, you could have just cut-and-pasted the name on the top line of your note, whom you contradicted on a related point of fact based on something you half-remember reading in com-priv. We're doomed, aren't we?

Dennis Ferguson
On Wed, 8 Nov 1995, Dennis Ferguson wrote:
I just have a memory which is good at storing facts. Sometime in the last month on either nanog or com-priv a person whose name escapes me posted
Oh, man, this is getting so silly I can hardly bear to look in my mail box. If you'd wanted to name someone who'd been recently working at measuring and statistically characterizing Internet traffic you could have just cut-and-pasted the name on the top line of your note,
Didn't I say I can't remember the name? Didn't I say I am not an expert in this field of endeavor?
who you contradicted on a related point of fact based on something you half-remember reading in com-priv.
Somebody else was kind enough to point me to a WWW site that covers some of the work that HWB and k claffy have done. In a paper which they posted on the web, I found this statement:

    For example, one characteristic of network workload is `burstiness', which reflects variance in traffic rate. Network behavioral patterns of burstiness are important for defining, evaluating, and verifying service specifications, but there is not yet agreement in the Internet community on the best metrics to define burstiness. Several researchers [21,22,6] have explored the failure of Poisson models to adequately characterize the burstiness of both local and wide area Internet traffic.

This is precisely the item that I am focussing on, because I believe it partially explains why service is not improving. I also believe that if Internet traffic continues to have this non-Poisson characteristic, then the efforts of NSPs to maintain high service levels will not be as effective as might be expected. Although I don't know whether telco POTS service exhibits Poisson characteristics, I suspect that it does, and that this plays a large part in helping them maintain near perfect service levels. At the time I first read of this discovery, I filed away the information because I felt it was useful in helping my customers understand why sometimes the Internet works fast for them and sometimes it is jerky and erratic.

So, is this right or wrong? Does this non-Poisson characteristic explain why network performance tends to be erratic? Or is it a minor factor not worth worrying about (operationally) at the present time?

Michael Dillon        Voice:  +1-604-546-8022
Memra Software Inc.   Fax:    +1-604-542-4130
http://www.memra.com  E-mail: michael@memra.com
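The aggregation question is easy to poke at numerically. The sketch below is a toy model, not taken from the claffy/Braun paper or the studies it cites: it aggregates many on/off sources and measures burstiness, as coefficient of variation, after averaging over longer and longer windows. With light-tailed ON periods the aggregate smooths out roughly as 1/sqrt(m); with heavy-tailed (Pareto) ON periods, the kind associated with self-similar traffic, it smooths out much more slowly. All parameters are invented.

import random
import statistics

def onoff_aggregate(steps, sources, heavy_tailed, seed=0):
    """Per-step count of active sources when many on/off sources are
    aggregated.  ON periods are Pareto (heavy-tailed) or roughly
    exponential (light-tailed); OFF periods are exponential either way.
    Parameters are illustrative only."""
    rng = random.Random(seed)
    if heavy_tailed:
        on_len = lambda: int(rng.paretovariate(1.5))        # infinite-variance tail
    else:
        on_len = lambda: int(rng.expovariate(1 / 3.0)) + 1   # light tail, similar mean
    off_len = lambda: int(rng.expovariate(1 / 12.0)) + 1
    state = [[False, off_len()] for _ in range(sources)]      # [is_on, steps left]
    series = []
    for _ in range(steps):
        active = 0
        for s in state:
            if s[0]:
                active += 1
            s[1] -= 1
            if s[1] <= 0:
                s[0] = not s[0]
                s[1] = on_len() if s[0] else off_len()
        series.append(active)
    return series

def cov(series, m):
    """Coefficient of variation after averaging the series over blocks of m steps."""
    blocks = [sum(series[i:i + m]) / m for i in range(0, len(series) - m + 1, m)]
    return statistics.pstdev(blocks) / statistics.mean(blocks)

for label, heavy in (("exponential ON", False), ("Pareto ON", True)):
    series = onoff_aggregate(steps=50000, sources=100, heavy_tailed=heavy)
    print(label, " ".join(f"m={m}:{cov(series, m):.3f}" for m in (1, 10, 100, 1000)))

Whether that slower decay is operationally minor or dominant is exactly the question being asked here; the sketch only shows that aggregation by itself is not guaranteed to smooth things out.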
Dennis, Geeze, another rare country heard from. I see you have moved again. From recent messages, it seems that the self-similar phenomenon has been rediscovered again. I would assume the providers have added that worry to their bag of troubles. Dave
Hans-Werner,
router). Historically that was often a router problem, as routers were too slow to deal with the onslaught of packets at a plain packets-per-second rate (remember, in 1987 the NSF solicitation asked for a then whopping 1000 packets per second per router, which was just barely achievable then). Today you can buy technology off the shelf that does not have a pps problem for typical situations. So what is the problem, if it is not the router interconnection or the router technology? The answer is bad network engineering, little consideration for architectural requirements, and lack of understanding of the Internet workload profile. Intra-NSP, perhaps even more among NSPs. Or, in other words, it is people that kill the network, not the routers or phone lines, particularly people who are trying to make money off it, probably using their unique optimization function focused on profit and limiting expenses as much as they can, not understanding the fate sharing yet.
I disagree with you about the adequacy of routers you can buy off the shelf, and in fact would reach exactly the opposite conclusion. I think we are reaching the end of the ability to support the core of the U.S. Internet (once the NSFnet, now the collection of high-end NSPs) with routers you can obtain now, in the fashion to which we've become accustomed. In fact I think we're fast approaching the state of the 56kbps network just before the deployment of the factor-of-8 increment in trunk bandwidth that the IBM RT network provided, only at a bandwidth level 2.5 orders of magnitude higher, and I think the sagging at the center of the Internet is taking the edges with it, due to the lack of push-me-pull-you incentive to keep the edges growing.

I am old enough to be able to assemble the following timeline for upgrades of the U.S. Internet core trunk bandwidth over the last 10 years (1986 through 1996), along with the corresponding increase in local interconnect bandwidth. Feel free to correct the dates, but I don't think I'm too far off.

Trunk bandwidth:     56kbps -> 1/3 T1 (x8) -> full T1 (x3.3) -> 1/2 T3 (x14) -> full T3 (x2) -> ? (now)
Local interconnect:  Ethernet -> FDDI (x10) -> ? (now)

Given that it is now almost 1996, and the growth rate of the Internet has shown no signs of having slowed, what would you extrapolate we should have been working on deploying about now? My best guess would be that we'd be due for another big increment, say a factor of 12 or so, both in backbone and in interconnect technology, to take us another couple of years, had we been following the historical rollout of new technology. Yet not only can you not buy OC-12 routers off the shelf, or anywhere else, you can't even buy honest OC-3 routers at this point (I will avoid progressing into a rant on how the bizillions invested in ATM development to produce very little of practical use so far might have been better spent...).

And I would suggest that if you were, say, a big phone company, and you actually understood, in your own inimitable big phone company way, that percentage packet loss rates in your infrastructure with anything other than zeros to the left of the decimal point were unacceptable, and you were willing, at least for now, to do whatever you could to build, maintain and grow a high quality Internet infrastructure even if you hadn't yet figured out how to make a profit from it, you would still find meeting your traffic growth projections with even the most creative arrangements of 5-slot T3 routers, and whatever else you could buy to help them along, to be a bleak prospect. There comes a point where you just run out of router bandwidth, and nothing but more router bandwidth is going to fix it, but the bigger bandwidth boxes are nowhere to be found.

So we've got routing problems front and center, here and there, with bandwidth problems creeping up behind. We've got some companies with relatively deep pockets, or which are flush with IPO money, which would very probably spend to fix it if they could, if only to avoid being featured on the 10 o'clock news when disasters occur, except there doesn't seem to be anything to spend the money on which is clearly going to fix anything. I don't think this is a happy state to be in, in fact it sucks, but I don't think it is correct to attribute this state to counter-productive profit motives.
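As a back-of-the-envelope check on the "factor of 12 or so" guess, the multipliers in the timeline compound to roughly 740x; the seven-year span divided by below is an assumption (roughly 56kbps in 1986 to full T3 in the early 1990s), since the exact year alignment above is approximate.

import math

multipliers = [8, 3.3, 14, 2]           # the trunk upgrades listed above
total = math.prod(multipliers)          # ~740x growth in trunk bandwidth
years = 7                               # assumed span of those upgrades
annual = total ** (1 / years)           # ~2.6x per year historically

for horizon in (2, 2.5, 3):
    print(f"{horizon} more years at that rate: about x{annual ** horizon:.0f}")
# Two to three more years at the historical pace is roughly x7 to x17,
# which is where a factor-of-12 step (on the order of T3 -> OC-12) falls.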
I think we're victims of having our own success creep up to and pass the technology when we weren't paying close enough attention, and the only thing left to do seems to be to try to play catch-up from a position of increasing disadvantage.

Dennis Ferguson
Hmm, Dennis, interesting message. No disagreement with anything you said. Not clear what to do next, though. I think you are saying that the Internet has been very successful, fairly constantly outrunning its own technology capabilities. As this success happened, people started to rely on it and assumed growth and wealth for generations to come, perhaps neglecting significant and necessary and fairly short term (1-3 (5?) year deployment horizon) development efforts at the right time, and rather shifting development to other areas that may pan out further in the future (e.g., your ATM comment). Grumble. I would agree that most activities I noticed were fairly tactical or longer-term research, not much in the middle time frame. What do you suggest we should do now?

Hans-Werner

PS: My "profit hungry and not much investing" comment was in response to getting a tad annoyed at 10% packet losses being considered just fine. You know that I know that there are people out there who take things more seriously than that.
On Wed, 8 Nov 1995, Dennis Ferguson wrote:
to be a bleak prospect. There comes a point where you just run out of router bandwidth, and nothing but more router bandwidth is going to fix it, but the bigger bandwidth boxes are nowhere to be found.
Are you sure that creative ways of using lots of smaller T3 bandwidth boxes couldn't solve the problem? If we assume that bandwidth on the lines is not a problem (no shortages), and that T3 routers with smaller routing tables could make effective use of the bandwidth, then is it possible to do the following?

In Hypothetica, PA are ABC ISP, who has a T1 to Sprint, and XYZ ISP, who has a line to MCI. Both have so-called portable addresses from the swamp and thus consume space in the core routing tables. This means that traffic from ABC to XYZ travels from Hypothetica to Pennsauken, thence to MCI and back to Hypothetica. However, suppose we clean up the swamp by simply removing it entirely from all the core routing tables. What then? Every provider puts a default route in each core router. This default route points to a special router whose job is to deal with the swamp routes and nothing else. In effect, we are partitioning the routing tables in two.

Under this regimen, packets from ABC to XYZ travel to Pennsauken, then follow the default to Fort Worth and thence to Chicago, where the swamp router lives. The swamp router uses a separate continental backbone to route the traffic back to Fort Worth, back to Pennsauken and thence to MCI, where the traffic takes a similarly circuitous route before reaching Hypothetica. Seems terribly wasteful of bandwidth, doesn't it? But if something like this can help prevent routers from flapping, and if bandwidth is available, perhaps it could work.

If the parallel lines carrying "swamp" traffic are of lower bandwidth than the main lines and suffer congestion, then I suppose ABC could simply renumber to be within Sprint's aggregate and be back on the mainline. In fact, if this really is a viable technical solution, perhaps the threat of deployment would cause a rush of renumbering and make it easier for NSPs to just say no to swamp addresses.
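A tiny sketch of the two-tier lookup described above, with made-up prefixes and next-hop names (192.0.2.0/24 is just a documentation prefix standing in for a swamp route): the core table carries only provider aggregates plus a default that punts everything else to the dedicated swamp router.

import ipaddress

CORE_TABLE = {
    ipaddress.ip_network("207.0.0.0/8"):   "sprint-aggregate",   # invented aggregate
    ipaddress.ip_network("204.70.0.0/15"): "mci-aggregate",      # invented aggregate
    ipaddress.ip_network("0.0.0.0/0"):     "swamp-router",       # default: punt it
}

# Lives on a separate box (and, in the proposal, a separate backbone).
SWAMP_TABLE = {
    ipaddress.ip_network("192.0.2.0/24"): "swamp-backbone-toward-abc-isp",
}

def lookup(table, addr):
    """Longest-prefix match over a small dictionary-based table."""
    addr = ipaddress.ip_address(addr)
    matches = [net for net in table if addr in net]
    return table[max(matches, key=lambda net: net.prefixlen)] if matches else None

def route(addr):
    next_hop = lookup(CORE_TABLE, addr)
    if next_hop == "swamp-router":
        # the circuitous second leg: the full swamp routes live elsewhere
        return lookup(SWAMP_TABLE, addr) or "unreachable"
    return next_hop

print(route("204.70.10.1"))   # stays in the core: mci-aggregate
print(route("192.0.2.5"))     # falls through the default to the swamp router

The routing-table savings are real, but so is the detour every swamp-addressed packet has to take.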
seem to be anything to spend the money on which is clearly going to fix anything. I don't think this is a happy state to be in, in fact it sucks,
If you are right, then yes, it sucks. Obviously the ATM and OC3 technologies are right where you have pegged them, but what about parallelism using existing DS3 technology? And if this is done, are there mux/demux boxes that can handle DS3s <-> OC3?
profit motives. I think we're victims of having our own success creep up to and pass the technology when we weren't paying close enough attention, and the only thing left to do seems to be to try to play catch-up from a position of increasing disadvantage.
One nice side effect is that this may force the video-on-demand folks off the Internet and into straight ATM instead. I rather like the future scenario where the globe is girdled by an IPng data network and a separate parallel video/ATM network.

Michael Dillon        Voice:  +1-604-546-8022
Memra Software Inc.   Fax:    +1-604-542-4130
http://www.memra.com  E-mail: michael@memra.com
participants (4)
- Dave Mills
- Dennis Ferguson
- hwb@upeksa.sdsc.edu
- Michael Dillon