We are ready to perform end-to-end performance testing across the new NSF architecture. PSC has a whole battery of TCP and IP diagnostics available from a test network, which can be exercised without exposing any production users to untested infrastructure.

The network psc-hs-test (192.88.115), containing only the host voyager-115.psc.edu (192.88.115.29), is being announced to ANSnet (AS690), and from ANSnet to all peers at the NAPs. It is not announced by ANSnet to any other interconnects or providers. We encourage all providers to accept this route via the NAPs as soon as possible.

If an RSP announces a test network to their new NSP, and also has the route announced across the NAPs, we can establish end-to-end connectivity through the new infrastructure. We are in a position to exercise such a path at loads beyond those attainable by most users. Depending on the target host available at the RSP, we can run large window TCP, mping or Jamshid's instrumented traceroute.

We do need to be careful about maximum load because some of the segments are carrying production traffic. Since TCP "does the right thing" in the presence of congestion, TCP tests can be run at whatever bandwidth the network will support. We will limit IP performance tests to avoid sustained or high packet loss.

I cannot stress enough how important it is to do these tests as early as possible. We first used psc-hs-test in January of 1991, to maul the then brand new T3 NSFnet. We persisted in demonstrating that it didn't work, and Merit/ANS persisted in fixing it. At that time we had the undivided attention of the developers and they were able to do prime-time backbone software reloads and hardware upgrades. Once we agreed that it was working (it took about a month), we never had problems with it again. In hindsight, this saved us from all of the anguish suffered by NEARnet and others when they migrated to infrastructure which was beyond the capability of available diagnostic tools.
Please test the path to voyager-115.psc.edu, and please let us help you meet the needs of your customers. PSC will be much happier to do gratis testing before any traffic cutover than to address user complaints about infrastructure which is already in production. Drop a note to pscnet-eng, and we will arrange some tests.

Thanks,
--MM--

P.S. There is an issue of acceptance criteria: We are not parties to the provider contracts, and are not in a position to tell providers what they should deliver to their customers. Furthermore, it is known that much of the technology is being pushed into service faster than is comfortable. Nonetheless, we believe that there are some minimal requirements that should be met in all parts of the R&E Internet, if not the entire Internet.

Required on day one (Note the vague words):
- Mostly not congested.
- Low loss when not congested.
- Sufficient queuing to smooth some of the inherent burstiness.
- Sane congested behavior at likely choke points. Sane means that actual IP throughput should be constant or at worst fall gently with load rising beyond the onset of congestion.

It is my belief that there are a substantial number of users who will complain bitterly if the Internet fails to meet these "requirements". Note that these requirements are not sufficient to assure stable TCP operation.

In the longer term, additional things are needed: full D*B queues at all choke points and a strategic packet drop policy, such as RED, for early congestion notification. Also, router implementations that cannot do LSRR (source route) and ICMP at full rate greatly complicate sectionalizing problems on long paths. When the routers implement the entire protocol at full speed, it is trivial to localize performance problems to within one hop. However, in most commercial routers the less used parts of the protocol take a different, longer code path, which is almost always far more congested and lossy than any problematic primary path.
A workaround is to install fast workstations near key interconnects. The workstations can be used as diagnostic probes/targets to test long paths section by section.

--MM--
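The "sane congested behavior" criterion in the P.S. can be turned into a rough check on measured data. What follows is only a minimal sketch, assuming a set of (offered load, delivered throughput) samples taken at one choke point. The function name, the sample format and the 10% tolerance are assumptions made for illustration; none of them comes from a PSC tool, and the peak-relative threshold is a crude stand-in for "fall gently".

```python
def is_sane_under_congestion(samples, tolerance=0.10):
    """Crude test of the 'sane congested behavior' criterion.

    samples:   iterable of (offered_load, throughput) pairs, same units.
    tolerance: ASSUMED allowable fractional drop below the peak.

    The highest throughput seen so far is treated as the onset of
    congestion. Any sample at a higher offered load whose throughput
    falls more than `tolerance` below that peak counts as a failure.
    """
    peak = 0.0
    for _load, throughput in sorted(samples):
        if throughput > peak:
            peak = throughput
        elif throughput < peak * (1.0 - tolerance):
            return False
    return True


# Hypothetical measurements, in arbitrary units of load and throughput.
flat_after_onset = [(10, 10), (20, 20), (30, 29), (40, 30), (50, 30), (60, 29)]
collapsing = [(10, 10), (20, 20), (30, 29), (40, 30), (50, 18), (60, 5)]

if __name__ == "__main__":
    print(is_sane_under_congestion(flat_after_onset))
    print(is_sane_under_congestion(collapsing))
```

The first hypothetical data set levels off once the path saturates and passes; the second collapses as load keeps rising and fails. A real acceptance test would need a tolerance agreed between the parties, which is exactly the part the P.S. leaves vague.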
In message <9411012050.AA07250@pele.psc.edu>, "Matt Mathis" writes:
We are ready to perform end-to-end performance testing across the new NSF architecture. PSC has a whole battery of TCP and IP diagnostics available from a test network, which can be exercised without exposing any production users to untested infrastructure.
Matt,

Any testing must be well defined and brief, conducted off hours and announced in advance, since this is the Internet you're going to be pounding, not a test network. Please coordinate with the ANS NOC. If we send NSR messages and anyone objects, we may have to cancel or reschedule tests. This could get tedious.

ANS has an SGI Indy/SC in Maui at MHPCC that is there for testing purposes. Perhaps we can go off line and contact the folks at MHPCC and see if they would mind if we did some testing to establish a baseline. The delay (excluding queueing delay) between your ENSS and MHPCC is 117 msec. The congestion point will almost certainly be Cleveland or Chicago (both have our latest code - no RED). I'm not sure what the typical loads are in the middle of the night, but if I remember correctly it's under 5 mB/s at that time.

Perhaps SDSC would like to participate. SDSC is 70 msec from PSC. fyi - PSC is 60 msec from Hayward (approximately the PacBell NAP), MHPCC is 61 msec away, and SDSC 14 msec away. The Sprint NAP is 136 msec from MHPCC, 89 msec from SDSC and 27 msec from PSC.

I'm not sure how you are going to test the NAPs. Two party tests can't put a bottleneck at the NAP if ingress equals egress. The interesting tests must involve at least a small amount of traffic from a third party. (On ANSNET we probably have 100 or more third parties contributing their traffic to your tests - and maybe wondering why the Mosaic globe suddenly started spinning real slow). We may need a volunteer on MCINET and one on SprintLink (at least one with DS3 access? - that narrows the list, doesn't it). I think MCI and maybe Sprint have hosts at the PacBell NAP. That would work with both sending to you.
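For a sense of scale, the delays quoted above can be converted into the TCP window needed to keep each path full. This is a minimal sketch under two assumptions that the message does not state: that the quoted figures are round-trip times, and that the bottleneck is a nominal 45 Mb/s DS3. The names used are illustrative only.

```python
# Bandwidth-delay product (D*B) for paths from PSC.
# ASSUMPTIONS (not stated in the message): the delays are round-trip
# times, and the bottleneck is a nominal 45 Mb/s DS3.

DS3_BITS_PER_SEC = 45_000_000  # assumed nominal DS3 rate

# Delays in msec from PSC, as quoted in the message.
DELAY_MSEC_FROM_PSC = {
    "MHPCC": 117,
    "SDSC": 70,
    "Hayward": 60,
    "Sprint NAP": 27,
}

# Largest window expressible in TCP's 16-bit window field without scaling.
UNSCALED_WINDOW_LIMIT = 65535


def window_bytes(rate_bits_per_sec, delay_msec):
    """Bytes that must be in flight to fill a path of this rate and delay."""
    return rate_bits_per_sec * delay_msec // (8 * 1000)


def needs_large_window(rate_bits_per_sec, delay_msec):
    """True if the path needs more than an unscaled 16-bit TCP window."""
    return window_bytes(rate_bits_per_sec, delay_msec) > UNSCALED_WINDOW_LIMIT


if __name__ == "__main__":
    for site, msec in DELAY_MSEC_FROM_PSC.items():
        size = window_bytes(DS3_BITS_PER_SEC, msec)
        flag = "large window" if needs_large_window(DS3_BITS_PER_SEC, msec) else "ok"
        print(f"{site:10s} {msec:4d} msec {size:8d} bytes  {flag}")
```

Under those assumptions every one of these paths needs a window well beyond 64 KB, which is why the target host has to support large window TCP. The same product is the size of a full D*B queue at a choke point.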
The network psc-hs-test (192.88.115), containing only the host voyager-115.psc.edu (192.88.115.29) is being announced to ANSnet (AS690), and from ANSnet to all peers at the NAPs. It is not announced by ANSnet to any other interconnects or providers. We encourage all providers to accept this route via the NAPs as soon as possible.
If an RSP announces a test network to their new NSP, and also has the route announced across the NAPs, we can establish end-to-end connectivity through the new infrastructure. We are in a position to exercise such a path at loads beyond those attainable by most users.
We know. :-)
Depending on the target host available at the RSP, we can run large window TCP, mping or Jamshid's instrumented traceroute.
We do need to be careful about maximum load because some of the segments are carrying production traffic. Since TCP "does the right thing" in the presence of congestion, TCP tests can be run at what ever bandwidth the network will support. We will limit IP performance tests to avoid sustained or high packet loss.
An understatement: "We do need to be careful ...".
[ ... ]
There is an issue of acceptance criteria: We are not parties to the provider contracts, and are not in a position to tell providers what they should deliver to their customers. Furthermore, it is known that much of the technology is being pushed into service faster than comfortable. None the less, we believe that there are some minimal requirements that should be met in all parts of the R&E Internet, if not the entire Internet.
Required on day one (Note the vague words):
- Mostly not congested.
- Low loss when not congested.
- Sufficient queuing to smooth some of the inherent burstiness.
- Sane congested behavior at likely choke points. Sane means that actual IP throughput should be constant or at worst fall gently with load rising beyond the onset of congestion.
Well stated.

Other than to point out that we must be considerate of people actually trying to use this small part of the Internet thingie (ANSNET), I think your proposal to perform some testing as early as possible is a good idea. I would have preferred that this sort of testing happen in a lab or on a test network under realistic conditions first. Part of our testnet is now down, scavenged to fix a problem at a production POP (last resort supply of spares until people start to turn in their ENSSs), so our testing is temporarily down.

I think we should at least give the NAP operators the opportunity to state what further testing they plan to do, before testing on their behalf on the real network.

Curtis

PS - over and out - or off-line to be more accurate.
participants (2)
- Curtis Villamizar
- Matt Mathis