If you take a look at http://ext2.mfsdatanet.com/MAE/west.ames.overlay.html you find that the Ames FDDI ring is totally saturated. Now, that means that anyone who's trading traffic over on that side, or between Ames and San Jose, is getting really really lousy performance.
What I don't understand is why that has _stayed_ saturated... it seems to me that some of the big players would have rerouted their traffic by now to avoid subjecting it to this, which would also have the side effect of causing the problem to, at least for the short term, go away.
I can't do much about this myself, since my router isn't one of the ones generating huge amounts of traffic over on that ring, but surely someone on this list is in that position... and the major contributors to this traffic must be getting complaints from their own customers about how their packets are being routed via a lossy interchange point.
-matthew kaufman
matthew@scruz.net
Date: Thu, 21 Mar 1996 13:48:40 -0800 (PST) From: matthew@scruz.net (Matthew Kaufman)
> If you take a look at http://ext2.mfsdatanet.com/MAE/west.ames.overlay.html you find that the Ames FDDI ring is totally saturated. Now, that means that anyone who's trading traffic over on that side, or between Ames and San Jose, is getting really really lousy performance.
> What I don't understand is why that has _stayed_ saturated... it seems to me that some of the big players would have rerouted their traffic by now to avoid subjecting it to this, which would also have the side effect of causing the problem to, at least for the short term, go away.
We, too, have noticed the problem, although we mostly see it in packets bound for the MFD ring. Pings to other Ames peers seem to do fine.
This started abruptly on about March 7th and has been consistently bad since then. That is about the time BBN Planet started routing everything there via AS1, but I have no idea if this is a significant part of the problem. FWIW, MFS blames the load on the GIGAswitch, although I don't see this indicated in the plots. They didn't give us any estimate for improving the situation.
R. Kevin Oberman
Energy Sciences Network (ESnet)
National Energy Research Supercomputer Center (NERSC)
EMAIL: oberman@es.net Phone: +1 510 422-6955
> This started abruptly on about March 7th and has been consistently bad since then. That is about the time BBN Planet started routing everything there via AS1, but I have no idea if this is a significant part of the problem. FWIW, MFS blames the load on the GIGAswitch, although I don't see this indicated in the plots. They didn't give us any estimate for improving the situation.
The load on the Gigaswitch as plotted in MFS' graphs doesn't seem to have changed much over the interval before the FDDI started having problems and after. Who at MFS said the problem was the Gigaswitch?
Stephen
-----
Stephen Stuart stuart@pa.dec.com
Network Systems Laboratory
Digital Equipment Corporation
> What I don't understand is why that has _stayed_ saturated... it seems to me that some of the big players would have rerouted their traffic by now to avoid subjecting it to this, which would also have the side effect of causing the problem to, at least for the short term, go away.
I'm interested in knowing what happened between March 6 and March 7 on the Ames FDDI ring. That's when the utilization jumped from around 60% to 90%. Also of interest is that it appears MFS stopped collecting stats late in the afternoon, with utilization around 60%, on the 6th, and didn't start collecting again until around noon on the 7th, at which time utilization is running at 90%.
Anyone know what happened?
mb
--
Mark Boolootian <booloo@cats.ucsc.edu> (aka booloo@scruznet.com)
On Thu, 21 Mar 1996, Mark Boolootian wrote:
> I'm interested in knowing what happened between March 6 and March 7 on the Ames FDDI ring. That's when the utilization jumped from around 60% to 90%. Also of interest is that it appears MFS stopped collecting stats late in the afternoon, with utilization around 60%, on the 6th, and didn't start collecting again until around noon on the 7th, at which time utilization is running at 90%.
> Anyone know what happened?
Something along these lines: we noticed the NetEdges were no longer collecting proper statistics on the FDDI interfaces. Rather than measuring total traffic, they measured only traffic that was forwarded over another link. We switched the data collection around, which caused the hiccup in the graphical statistics.
On March 20th, Lance Tatman notified the MAE-West customers via the mailing list that NASA had procured funds to obtain a Gigaswitch and that he is expecting delivery "towards the end of March."
Once that Gigaswitch is received, we will solve two problems:
#1) Congestion on the FDDI at Ames.
#2) Load on the NetEdges. This is caused partly by #1, in that the NetEdge is having to "think" about discarding packets it cannot deliver onto the existing FDDI due to congestion. Secondly, we could *possibly* be running into a CPU limit on the NetEdges, where they are coming close to their PPS processing limit.
#2 will be rectified by removing the NetEdges from service and terminating the OC-3c on the DEC Gigaswitch itself. However, in order for this to happen, NASA has to turn up its Gigaswitch. ;-)
Between now and the time the Gigaswitch is turned up, several Ames-connected carriers are obtaining connections into the San Jose (MFS) Gigaswitch. This should provide them additional redundancy and reduce load on their backhaul links as well as the individual "sections" of the MAE.
-jh-
Matthew Kaufman writes about the Ames FDDI ring saturation:
> What I don't understand is why that has _stayed_ saturated... it seems to me that some of the big players would have rerouted their traffic by now to avoid subjecting it to this, which would also have the side effect of causing the problem to, at least for the short term, go away.
...or why MFS hasn't installed a Gigaswitch there, or whatever. We're seeing 20% - 30% packet loss through AGIS to MCI and Sprintlink during the day, and it's not fun.
> ... and the major contributors to this traffic must be getting complaints from their own customers about how their packets are being routed via a lossy interchange point.
As a customer, I've complained to my contributor. But I get the impression that everyone thinks it's everybody else's problem. Who's really responsible? And why did it have to get this bad?
Peter Kaminski, NanoSpace
kaminski@nanospace.com
On Thu, 21 Mar 1996, Peter Kaminski wrote:
> > What I don't understand is why that has _stayed_ saturated... it seems to me that some of the big players would have rerouted their traffic by now to avoid subjecting it to this, which would also have the side effect of causing the problem to, at least for the short term, go away.
> ...or why MFS hasn't installed a Gigaswitch there, or whatever. We're seeing 20% - 30% packet loss through AGIS to MCI and Sprintlink during the day, and it's not fun.
isn't agis on the mfs side and mci+sprint on the nasa side? (i know that net99 was/is on the mfs side, but i dunno if agis is using their cage or have a different one)
are the two 'sides' of mae-west still connected by only a ds3?
/nm
participants (7)
- booloo@cats.ucsc.edu
- Jonathan Heiliger
- Kevin Oberman
- matthew@scruz.net
- Nikos Mouat
- Peter Kaminski
- Stephen Stuart