FYI, NANOG Community participants: if you have any questions about this
network event, please contact me.

Betty Burke
NANOG Project Manager
(734) 647-3743 office
(734) 395-1724 cell

------------ Forwarded Message ------------
Date: Friday, August 17, 2007 11:15 AM -0400
From: "Elwood J. Downing" <ejd@merit.edu>
To: mjts@merit.edu, netdirs@merit.edu, network-alerts@merit.edu,
    Merit NOC <trouble@merit.edu>
Subject: TT60009 -- Merit Network Backbone Service -- Update

We are pleased to provide an update on the recent backbone service outage.

As reported last night (9:29 PM EDT, Thursday, August 16, 2007), Merit's
backbone began to stabilize around 8:20 PM EDT on Thursday and has remained
stable, with no additional network alerts or major problems reported to us
by our Members.

We are continuing to work with our backbone equipment vendors to determine
what caused the problem and how to prevent it from happening in the future.
We expect to have enough information to share the "how, what, and why" of
this network problem with you.

The main cause of the problem appeared to be the management interface cards
on the Extreme Aspen 10G switches, which were crashing. We would reset a
card and the system would work for a while, then the card would crash again.
The same problem affected those cards that had 1G LAN ports serving our
Members and connecting to routers. Some cards did not come back after a
reset and required a hard reset (power cycle).

Because neither we nor Extreme's engineers knew the nature of the problem,
we also investigated a possible denial-of-service attack on our network. We
found that our 10G ring was changing up/down state rapidly, which led us to
think that the flapping was causing the hardware to stop working by
exhausting its resources.

We then determined that the root of the problem was here in Ann Arbor, where
a Cisco switch was connected to our 10G core Extreme Networks switch.
We disabled the port and the network began to stabilize. We have informed
Extreme Networks, and they are working to provide feedback on the problem
and its resolution.

It took additional time for all the routes to propagate because many of our
networks had been route-dampened as a result of the ongoing instability.

If you are still experiencing any network performance problems, please
contact Merit's Network Operations Center (NOC) immediately. We have
engineering, NOC, and support staff available to work with you on resolving
any issues.

We sincerely apologize for the inconvenience this outage has caused your
organization. We continuously strive to provide the highest level of service
to our Membership and regret this service issue.

Sincere Regards,

--Elwood

---------------------------------------------------------------------------
Elwood J. Downing                  e-mail: ejd@merit.edu
Merit Network                      Phone:  (734) 936-2040
Director of Member Services        Fax:    (734) 647-3185
Merit Network -- Connecting Organizations, Building Community

---------- End Forwarded Message ----------