FW: TNT issues "workaround"
I seem to be having the same or similar problems with my Cisco boxes. They either reboot or the PRIs hang: users get busy signals, but no one is logged in at all. When I do a "show isdn status", it shows B channels in use but no one on. The only way to fix it is to reboot the box, and it seems to be timed, every day at 1400 and 2200 hours, since Monday. Has anybody heard of Ciscos acting funny this week?

John Lord (lord@allturbo.com)
IT Manager
AllTurbo Internet Services Inc
410-213-9388 Office
www.allturbo.com

-----Original Message-----
From: Brian Wallingford [mailto:brian@meganet.net]
Sent: Saturday, August 23, 2003 1:41 AM
To: nanog@merit.edu
Subject: TNT issues "workaround"

I haven't seen specific details posted here, so:

Like many others, we've had a few TNTs online for years without hiccups or reboots until this week. Beginning late Sunday, we saw seemingly random blade reboots and total system crashes. Errors ranged from memory leaks to infinite loops on the controller blade, but all blades were susceptible. HDLC2 blades seemed to be particularly vulnerable.

We saw boxes that had been rock-solid for very long periods suddenly rebooting at intervals ranging from 20 minutes to 4 hours, with no obvious cause (i.e., nothing more specific than the above). Border and core filtering of icmp echo * did little good.

On the suggestion of some folks on another list, and against my better judgment, we disabled route caching in order to free up additional memory (though memory did not appear fragmented). This stabilized all involved boxes and, surprisingly, did not result in significant degradation of end-user performance.

Granted, it's not a true fix, but it may get you a few extra Z's at night.

hth,
brian
On Saturday, Aug 23, 2003, at 18:31 Europe/Dublin, John Lord wrote:
I seem to be having the same or similar problems with my Cisco boxes. They either reboot or the PRIs hang: users get busy signals, but no one is logged in at all. When I do a "show isdn status", it shows B channels in use but no one on. The only way to fix it is to reboot the box, and it seems to be timed, every day at 1400 and 2200 hours, since Monday. Has anybody heard of Ciscos acting funny this week?

Perhaps your fast-switching route cache is filling up memory. If you're willing to risk it, enable CEF on all interfaces.

R.
On Sat, 23 Aug 2003, Ross Chandler wrote:
Perhaps your fast-switching route cache is filling up memory. If you're willing to risk it, enable CEF on all interfaces.
Some of the older Cisco access-servers don't even support CEF.

The Cisco failures seem to be memory starvation/fragmentation issues caused by out-of-control route-cache growth caused by the Nachi worm's attempt to ping so many different hosts so quickly while looking for systems to spread to. You can work around the issue by:

a) using policy routing to pass all dialup traffic through a route-map that sends 92-byte echo/echo-reply packets to null0.
b) blocking all echo/echo-reply coming in from dial-up users (i.e. apply an input acl to your virtual-template and/or group-async interfaces).
c) disabling route caching on the egress interface of the access server.

I'm doing a mix of a (on the access-servers that this works on) and b where a doesn't work...and tested c this morning and found it appears to work.

----------------------------------------------------------------------
 Jon Lewis *jlewis@lewis.org*|  I route
 System Administrator        |  therefore you are
 Atlantic Net                |
_________ http://www.lewis.org/~jlewis/pgp for PGP public key_________
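
For reference, the three workarounds (a), (b) and (c) above might look roughly like the following in IOS configuration. This is only a sketch and was not part of the original post: the ACL numbers (198, 199), the route-map name (nachi-drop), and the interface names (Virtual-Template1, Group-Async1, FastEthernet0/0) are assumptions chosen for illustration, and support for policy routing varies by platform and IOS release, as noted above.

```
! Sketch only. ACL numbers, route-map name and interface names are
! illustrative assumptions; adjust to your own access server.
!
! (a) Policy-route 92-byte ICMP echo/echo-reply from dial-up users to Null0
access-list 199 permit icmp any any echo
access-list 199 permit icmp any any echo-reply
!
route-map nachi-drop permit 10
 match ip address 199
 match length 92 92
 set interface Null0
!
interface Virtual-Template1
 ip policy route-map nachi-drop
!
! (b) Alternative: block all echo/echo-reply arriving from dial-up users
access-list 198 deny   icmp any any echo
access-list 198 deny   icmp any any echo-reply
access-list 198 permit ip any any
!
interface Group-Async1
 ip access-group 198 in
!
! (c) Disable route caching on the egress interface of the access server
interface FastEthernet0/0
 no ip route-cache
```

Option (a) drops only the 92-byte pings the worm appears to generate, so ordinary pings of other sizes still pass. Option (b) is blunter and blocks all pings to and from dial-up users. Option (c) trades some forwarding performance for memory by falling back to process switching on that interface.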