In message <199511231608.IAA27230@upeksa.sdsc.edu>, Hans-Werner Braun writes:
Question: Which RFC should I consult to determine acceptable delay and packet loss?
RFCs are the result of IETF activities. The IETF is essentially a protocol standardization group, not an operations group. I don't think you perceive the IETF as "running" your network, do you? There may not be much of an alternative, though, which to a large extent is the issue at hand. Nobody is responsible (individually, as a consortium, or whatever) for this anarchically organized and largely uncoordinated (at a systemic level) global operational environment. While the IETF/RFCs could be utilized somehow, this is not really an issue of theirs. I sure would not blame the IETF for not delivering here, as this is not their mandate.
From other email I have seen, it seems the important issues are hard for some to understand. I (and I suspect several others) don't really care much about a specific tactical issue (be it an outage or whatever). The issue is how to make the system work with predictable performance and a fate-sharing attitude at a global level, in a commercial and competitive environment that is still extremely young at that, and that attempts to accommodate everything from mom'n'pop shops to multi-billion dollar industry. An environment that exhibits exponential growth in usage and ubiquity, without the resources to upgrade quickly enough to satisfy all the demands. With no control over in-flows, and major disparities across the applications. And with TCP flow control not working that well, since the aggregation of transactions is very heavy and the packet-per-transaction count is so low on average that TCP may not be all that much better to the network than UDP (in terms of adjusting to jitter in available resources). Not to mention the age-old problem with routing table sizes and routing table updates.
This belongs on the end2end-interest list or IPPM or elsewhere, but I'll save a lot of people going through the archives.

In order to get bandwidth X on a given TCP flow you need an average window size of X * RTT. Expressed in TCP segments, that is N = (X * RTT) / MSS (or, more correctly, the segment size in use rather than the MSS). To sustain an average window of N segments, you must ideally reach a steady state where you cut cwnd (the current window) in half on each drop, then grow it linearly, fluctuating between 2/3 and 4/3 of the target size. That means one drop every 2/3 N windows, or, in terms of time, one drop every 2/3 N * RTT. In one RTT, an average of X * RTT worth of data flows. In practice you rarely drop at the perfect time, so the constant 2/3 (call it K) can be raised to 1-2. Since N = (X * RTT) / MSS, DropRate = K * X * RTT * X * RTT / MSS. The units are b/s * sec * b/s * sec / b, or bits. The DropRate expressed in bits can be converted to seconds or packets (divide by X or by MSS). This type of analysis is courtesy of the good folks at PSC (Matt, Jamshid, et al.).

For example, to get 40 Mb/s at 70 msec RTT and a 4096-byte MSS, you can tolerate one error about every 6 seconds (K=1), or 1 in 7,300 packets. If you look at 56 Kb/s and a 512-byte MSS you get a very interesting result: you need at most one error every 66 msec, or 1 error in 0.9 packets. This gives a good incentive to increase delay. At 250 msec RTT, you get a result of one error in 11.7 packets (much better!).

Another interesting point to note is that you need 3 duplicate ACKs for TCP fast retransmit to work, so your window must be at least 4 segments (and should be more). If you have a very large number of TCP flows, where on average each flow gets less than 1200 baud or so, the delay you need to make TCP work well starts to exceed the magic 3 second boundary. This was discussed ad nauseam on end2end-interest.

An important result is that you need more queueing than the delay-bandwidth product for severely congested links. Another is that there is a limit to the number of active TCP flows that can be supported per unit of bandwidth. One suggestion to address the latter problem is to further drop the segment size if cwnd is less than 4 segments and/or when the estimated RTT gets into the seconds range.

This analysis of how much loss is acceptable to TCP may not be outside the bounds of an informational RFC, but so far none exists.

Curtis
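For anyone who wants to plug their own numbers into the arithmetic above, here is a minimal back-of-the-envelope sketch in Python. It assumes K=1 by default and an MSS given in bytes (which is what makes the worked examples come out); the function name tcp_loss_spacing and its structure are illustrative only, not from any standard tool.

    # Back-of-the-envelope calculator for how far apart drops must be for a
    # single TCP flow to sustain a given rate, following the window
    # arithmetic in the message above.

    def tcp_loss_spacing(rate_bps, rtt_sec, mss_bytes, k=1.0):
        """Return (bits, seconds, packets) between drops needed to sustain
        rate_bps on one TCP flow with the given RTT and segment size."""
        mss_bits = mss_bytes * 8
        # Average window, in segments, needed to fill the pipe: N = X*RTT/MSS
        n = (rate_bps * rtt_sec) / mss_bits
        # Data sent between drops: K * N windows of X*RTT bits each
        bits = k * n * rate_bps * rtt_sec
        return bits, bits / rate_bps, bits / mss_bits

    # 40 Mb/s, 70 ms RTT, 4096-byte MSS: about one drop every 6 seconds,
    # or roughly 1 in 7,300 packets.
    print(tcp_loss_spacing(40e6, 0.070, 4096))

    # 56 Kb/s, 512-byte MSS: one drop every ~66 ms (1 in 0.9 packets) at
    # 70 ms RTT, versus 1 in ~11.7 packets at 250 ms RTT.
    print(tcp_loss_spacing(56e3, 0.070, 512))
    print(tcp_loss_spacing(56e3, 0.250, 512))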