Re: OUTAGE: MCI/Worldcom frame-relay network
http://cbs.marketwatch.com/archive/19990806/news/current/wcom.htx?source=blq/yhoo&dist=yhoo
It would seem that WCOM has indeed been communicating with the public, as early as 13:33 EST, August 6th.
This is my concern with the recent trend in using NANOG as a pulpit for bringing focus to certain network problems. The information displayed here is not always accurate, and often subjective. I do not mean to suggest that Sean is, only that it does happen, and the list goes for a while with no correction.
Usually I'd argue that any information is better than no information, but when the quality is of suspect value, I'm not sure the viewpoint stands.
I'd recommend that an alternate forum be used for outage notification, like Stan Barber's outage-discuss list.
-alan
Thus spake Sean Donelan (SEAN@SDG.DRA.COM) on or about Mon, Aug 09, 1999 at 01:12:47AM -0500:
SEAN@dra.COM (Sean Donelan) writes:
I'm trying to post no more than once in 12 hours.
It's been another twelve hours.
Although I don't like all the red on our network map, it is showing how resilient IP protocols are. My core network uses several different facility providers and technologies, and for those customers who have more than one connection: IP works. How else could I send this e-mail? However, for customers with only one connection, they don't really care how many other connections are working. They only care about their one connection.
MCI/Worldcom claims 75% of their frame-relay network is 'up.' This is down from their previous claim on Friday that 90% of the frame-relay network was 'up.' We still see a number of PVCs throughout the US and Canada either completely down, or dropping so many frames as to be unusable. I received similar reports from other MCI/Worldcom frame-relay users throughout the US and even as far away as Germany.
To try to figure out what is going on, MCI/Worldcom imposed a 'quiet time' on their network, telling their engineers to stop touching things. This helped somewhat. Portions of the network stabilized on their own, but there is still massive congestion. The quiet time is over. Now MCI/Worldcom is trying to reduce the congestion by disconnecting parts of the frame-relay network.
Remember this is Sunday, normally a low-traffic day. Monday is underway in Europe, and will be hitting the East Coast of the USA soon. Traffic levels are going to increase on the network, assuming any traffic can get through at all.
MCI/Worldcom's web site continues its long silence about this outage. I've got to hand it to MCI/Worldcom; its engineers may have trouble keeping a frame-relay network going, but its PR department has done a great job keeping a lid on the story. Even MCI/Worldcom customers can't find out what is happening.
--
Sean Donelan, Data Research Associates, Inc, St. Louis, MO
Affiliation given for identification not representation
On Mon, 9 Aug 1999, Alan Hannan wrote:
http://cbs.marketwatch.com/archive/19990806/news/current/wcom.htx?source=blq/yhoo&dist=yhoo
What frame relay switch is causing MCI/Worldcom such grief?
The above article contains this statement:
"From what we know about the disruptions in the U.S., our UUNet is not being affected because its network rides on a different platform," Wagner said.
From what I know of UUNet's frame relay network (which may be woefully outdated knowledge) they are using Cascade/Ascend/Lucent switches. This would tend to make one suspect that the problems are associated with non-Lucent switches rather than being some unusual effect within the frame relay protocol itself.
Is this another vendor related problem? If not, does it affect NNI customers of Worldcom?
With the growing profile of MPLS, I suspect that more networks are planning to roll out frame relay or ATM in their core in which case the technical facts behind these events should be of great interest to many of us on this list.
I am less interested in the length of time a network is down and more interested in why it was down and how other operators could avoid the same problem.
--
Michael Dillon - E-mail: michael@memra.com
Check the website for my Internet World articles - http://www.memra.com
On Mon, 9 Aug 1999, Michael Dillon wrote:
Is this another vendor related problem? If not, does it affect NNI customers of Worldcom?
Yes. We have one PVC using NNI via MCI/Worldcom (got it from LDDS years ago). It has been bouncing up and down for the past four days.
=========================================================================
Michael P. Lucking Michael@Lucking.COM
On Mon, 9 Aug 1999, Michael P. Lucking wrote:
On Mon, 9 Aug 1999, Michael Dillon wrote:
Is this another vendor related problem? If not, does it affect NNI customers of Worldcom?
Yes. We have one PVC using NNI via MCI/Worldcom (got it from LDDS years ago). It has been bouncing up and down for the past four days.
Actually, what I meant was whether this so-called packet storm has extended to frame relay networks with NNI to Worldcom. The PVC will bounce if Worldcom cannot maintain the PVC portion within their network or cannot pass traffic through their network. But the public description so far seems to imply that the problem propagates from one switch to another. Assuming this is actually what is happening, I wondered whether the problem was propagating into any other networks via NNI connections.
And if the answer is yes, then what vendor made the switches, etc. etc...?
--
Michael Dillon - E-mail: michael@memra.com
Check the website for my Internet World articles - http://www.memra.com
Does anyone have an idea as to what link-state algorithm the Cascade/Ascend 9000 switches run? I have been given conflicting reports: some say that it is OSPF, others say that it is OSPF "like", since John Moy worked for Cascade and developed a version of OSPF for the Cascade/Ascend switches. Does anyone have more info on this? Also, if so, what version is supported on the Amethyst, Jade v.1, v.2 releases? Apparently, this may have something to do with all the instability issues with WCOM.
CJ
Michael Dillon wrote:
On Mon, 9 Aug 1999, Michael P. Lucking wrote:
On Mon, 9 Aug 1999, Michael Dillon wrote:
Is this another vendor related problem? If not, does it affect NNI customers of Worldcom?
Yes. We have one PVC using NNI via MCI/Worldcom (got it from LDDS years ago). It has been bouncing up and down for the past four days.
Actually, what I meant was whether this so-called packet storm has extended to frame relay networks with NNI to Worldcom. The PVC will bounce if Worldcom cannot maintain the PVC portion within their network or cannot pass traffic through their network. But the public description so far seems to imply that the problem propagates from one switch to another. Assuming this is actually what is happening, I wondered whether the problem was propagating into any other networks via NNI connections.
And if the answer is yes, then what vendor made the switches, etc. etc...?
-- Michael Dillon - E-mail: michael@memra.com Check the website for my Internet World articles - http://www.memra.com
--
-------------------------------
Shawn "CJ" Washington
Senior Network Engineer
CCIE # 4683
GTE Internetworking
http://www.gte.com
(410) 309-8349
cj@bbn.com
-------------------------------
And if the answer is yes, then what vendor made the switches, etc. etc...?
My understanding is that a year or two ago the frame relay was on Cascade and the ATM on GDC. They moved the ATM bit onto Cisco/Stratacom BPX services (and we've seen no problems on this service for the duration to my knowledge), but the FR network is now independent of any other. I *believe* it still runs on the Cascades.
--
Alex Bligh
GX Networks (formerly Xara Networks)
And if the answer is yes, then what vendor made the switches, etc. etc...?
My understanding is that a year or two ago the frame relay was on Cascade and the ATM on GDC. They moved the ATM bit onto Cisco/Stratacom BPX services (and we've seen no problems on this service for the duration to my knowledge), but the FR network is now independent of any other. I *believe* it still runs on the Cascades.
Hmm... Didn't AT&T have an FR meltdown not so long ago?
Tony
On Mon, 9 Aug 1999, Michael Dillon wrote:
What frame relay switch is causing MCI/Worldcom such grief?
The last time we had a Bay Networks salesperson visit he stated that MCI (this was before the merger) was an all-Bay frame network. Whether that has changed or not, I can't say, but I can't see them ditching it all so quickly.
And yep, having one Bay router left doing frame here, I'd love to know what the problem is. In NYC recently, we've had tons of problems as Bell Atlantic migrates from frame to "frame emulation" on ATM switches.
Charles
The above article contains this statement:
"From what we know about the disruptions in the U.S., our UUNet is not being affected because its network rides on a different platform," Wagner said.
From what I know of UUNet's frame relay network (which may be woefully outdated knowledge) they are using Cascade/Ascend/Lucent switches. This would tend to make one suspect that the problems are associated with non-Lucent switches rather than being some unusual effect within the frame relay protocol itself.
Is this another vendor related problem? If not, does it affect NNI customers of Worldcom?
With the growing profile of MPLS, I suspect that more networks are planning to roll out frame relay or ATM in their core in which case the technical facts behind these events should be of great interest to many of us on this list.
I am less interested in the length of time a network is down and more interested in why it was down and how other operators could avoid the same problem.
-- Michael Dillon - E-mail: michael@memra.com Check the website for my Internet World articles - http://www.memra.com
On Mon, 9 Aug 1999, Charles Sprickman wrote:
On Mon, 9 Aug 1999, Michael Dillon wrote:
What frame relay switch is causing MCI/Worldcom such grief?
The last time we had a Bay Networks salesperson visit he stated that MCI (this was before the merger) was an all-Bay frame network. Whether that has changed or not, I can't say, but I can't see them ditching it all so quickly.
I think the Bay guy had a switch to sell you. They are now, and for a long time have been, using Cascade (now Ascend).
--
Check out the new CLEC mailing list at http://www.robotics.net/clec
<> Nathan Stratton Telecom & ISP Consulting http://www.robotics.net nathan@robotics.net
On Mon, 9 Aug 1999, Nathan Stratton wrote:
I think the Bay guy had a switch to sell you. They are now, and for a long time have been, using Cascade (now Ascend).
And now.... Lucent!
What a wonderful industry we work in when we can use the phrase "or whatever label they're sticking on the front this week" regularly.
- Forrest W. Christian (forrestc@imach.com) KD7EHZ
----------------------------------------------------------------------
iMach, Ltd., P.O. Box 5749, Helena, MT 59604 http://www.imach.com
Solutions for your high-tech problems. (406)-442-6648
----------------------------------------------------------------------
On Mon, Aug 09, 1999 at 11:20:50AM -0400, Nathan Stratton wrote:
On Mon, 9 Aug 1999, Charles Sprickman wrote:
On Mon, 9 Aug 1999, Michael Dillon wrote:
What frame relay switch is causing MCI/Worldcom such grief?
The last time we had a Bay Networks salesperson visit he stated that MCI (this was before the merger) was an all-Bay frame network. Whether that has changed or not, I can't say, but I can't see them ditching it all so quickly.
I think the Bay guy had a switch to sell you. They are now, and for a long time have been, using Cascade (now Ascend).
MCI (not Worldcom) used to use Bay BCNs in what used to be known as Hyperstream service for frame switching along with GDC switches for ATM.
-dorian
participants (11)
- Alan Hannan
- Alex Bligh
- Charles Sprickman
- Dorian Kim
- Forrest W. Christian
- Michael Dillon
- Michael P. Lucking
- Nathan Stratton
- Sean Donelan
- Shawn "CJ" Washington
- Tony Li