Here is a summary of our experiences with the bug.

Last Thursday, a TNT with years of uptime rebooted. No cause was apparent, and nothing relevant appeared in the logs. On Friday, it happened to a different TNT. This occurred with increasing frequency over the weekend, and we didn't get a lot of sleep.

We tried using a filter on the TNT to block ports 135 and 4444, to no avail, and then tried a filter to block ICMP on the TNT, also to no avail. Next, we removed the TNT filters and tried rate-limiting ICMP to the TNTs. That didn't work. Next, we removed the rate-limit and applied the Cisco-supplied anti-Nachi route-map to the upstream interfaces facing the TNTs. This significantly reduced the problem, but we were still rebooting every 12 hours or so.

Disabling route-caching on the TNT stopped the rebooting problem, but we were seeing 40% packet loss on one of the TNTs. (Note: both TNTs have a DS-3 of PRIs and use the TNT-SL-E10-100 four-port Ethernet cards.) The packet loss was only affecting one TNT, and we discovered that it was running 9.0.6 while the unaffected box was running 9.0.9. Upgrading the box to 9.0.9 fixed the packet loss issue. We are currently up and haven't had any blips in 24 hours. (Knock on wood.)

-Ejay
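For reference: the anti-Nachi route-map that circulated among operators at the time keyed on the worm's signature 92-byte ICMP echo packets and policy-routed them to Null0. A minimal IOS sketch along those lines (the ACL number, route-map name, and interface are placeholders; this may differ in detail from the exact config applied here):

! Identify ICMP echo and echo-reply traffic.
access-list 199 permit icmp any any echo
access-list 199 permit icmp any any echo-reply
!
! Nachi's pings are a fixed 92 bytes at the IP layer; match that
! length and black-hole the matching packets via policy routing.
route-map nachi-drop permit 10
 match ip address 199
 match length 92 92
 set interface Null0
!
! Policy routing acts on packets arriving inbound on an interface,
! so apply it where the worm traffic enters (placeholder name).
interface FastEthernet0/0
 ip policy route-map nachi-drop

The earlier ICMP rate-limiting attempt would have been something like CAR on the same interface, e.g. "rate-limit input access-group 199 256000 8000 8000 conform-action transmit exceed-action drop". The length match above is more surgical, since it drops only worm-sized pings instead of throttling all ICMP.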
-----Original Message-----
From: Andy Walden [mailto:andy@tigerteam.net]
Sent: Wednesday, August 27, 2003 10:35 AM
To: Geo.
Cc: NANOG
Subject: Re: Max TNT ping thing

On Tue, 26 Aug 2003, Geo. wrote:

Someone on this list had mentioned a network card for the Max TNT that made it immune to the Nachi worm ping issue. Is that the four-port (3 Ethernet, 1 Fast Ethernet) card, or the single-port card with the dongle thing, or something else?
It turns out this was a bogus solution. Since the load was lower afterwards, my tech thought it had been fixed. We tried limiting the size of the route cache, as someone had recommended, as well as applying all of the filters, without relief. This morning I had them just disable the route cache to see what happens. I will post the results.

We did end up buying a support contract from Lucent after they said they had a fix and would tell us what it was after we paid them. They just supplied the filter. At this point, they have exactly zero clue as to what to do next.

andy

--
PGP Key Available at http://www.tigerteam.net/andy/pgp
On Wednesday, August 27, 2003, at 12:46 PM, Ejay Hire wrote:
Here is a summary of our experiences with the bug. [...]
We have a Lucent APX 8000, which is essentially a TNT on steroids. We have not experienced any of these issues. We are running TAOS 10.0.2.

-Matt