Cisco 6509 SUP32 SNMP Meltdown With CatOS
Anyone have experience with Cisco 6509E/SUP32 crashing under heavy SNMP polling load, causing high CPU utilization and 6509 lockup, requiring 6509 reboot? CatOS is deployed. Is the behavior any different with 6509 IOS?

David
On 02/11/2012 18:37, david peahi wrote:
> Anyone have experience with Cisco 6509E/SUP32 crashing under heavy SNMP polling load, causing high CPU utilization and 6509 lockup, requiring 6509 reboot? CatOS is deployed. Is the behavior any different with 6509 IOS?
You're being very coy about details here. I've not managed to actually crash a 6500 running IOS by excessive SNMP, but the more interesting question is: how on earth are you running so many SNMP queries that this is happening?

E.g. a fully loaded 6509 with 384 ports would take ~3000 queries every several minutes to perform full port diagnostic polling, and you'd want to be doing this every couple of seconds to cause serious CPU impact.

Are you doing something like full DFZ or MAC table polling? Or IP accounting over SNMP? If you are, there are probably better ways of achieving what you're trying to do.

Also, you may want to consider moving away from CatOS, as it's now basically abandonware (or at least will formally be in Jan 2013), and hasn't even seen maintenance updates in the last 4 years.

Nick
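Nick's estimate above can be checked with rough arithmetic. The sketch below is only a back-of-envelope model: the figure of 8 counters per port is an assumption (the thread does not say which objects are polled) chosen because it brings 384 ports close to the ~3000 queries quoted, and the 300 s and 2 s intervals are illustrative stand-ins for "every several minutes" and "every couple of seconds".

```python
# Back-of-envelope SNMP polling load for a fully populated chassis.
# OIDS_PER_PORT is an assumption chosen so that 384 ports lands near the
# ~3000 queries quoted in the thread; the intervals are illustrative only.

PORTS = 384
OIDS_PER_PORT = 8  # assumed: e.g. in/out octets, packets, errors, discards


def queries_per_cycle(ports: int, oids_per_port: int) -> int:
    """Number of individual OID fetches needed to poll every port once."""
    return ports * oids_per_port


def queries_per_second(ports: int, oids_per_port: int, interval_s: float) -> float:
    """Average query rate when one full cycle runs every interval_s seconds."""
    return queries_per_cycle(ports, oids_per_port) / interval_s


if __name__ == "__main__":
    total = queries_per_cycle(PORTS, OIDS_PER_PORT)
    print(f"queries per polling cycle: {total}")
    for interval in (300, 2):
        rate = queries_per_second(PORTS, OIDS_PER_PORT, interval)
        print(f"interval {interval:>3}s -> {rate:.1f} queries/s")
```

Under these assumptions a five-minute cycle averages roughly 10 queries per second, while a two-second cycle is about 1500 per second, which is the gap Nick is pointing at.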
On 11/02/2012 04:52 PM, Nick Hilliard wrote:
> E.g. a fully loaded 6509 with 384 ports would take ~3000 queries every several minutes to perform full port diagnostic polling, and you'd want to be doing this every couple of seconds to cause serious CPU impact. Are you doing something like full DFZ or MAC table polling?
I bet you're close toward the end there. My guess is he's carrying a large BGP feed and querying the ipRouteTable. The caveat below is for IOS 12.4(20)T but equivalent issues surely exist for CatOS:

http://www.cisco.com/en/US/docs/ios/12_4t/release/notes/124TCAVS3.html#wp205...

The killer in this case is not the SNMP traffic or anything resulting directly from it, but the CPU overhead from constantly re-sorting the ipRouteTable since that's generated from the FIB when CEF is enabled. Workaround is to disable CEF (heh) or configure a MIB view that excludes the ipRouteTable.

This one bites an OpenNMS support customer a few times a year -- happened again just today, in fact, at a shop that just enabled topology discovery.
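As a rough illustration of the MIB-view workaround mentioned above, the sketch below renders the IOS `snmp-server view` lines that include the whole tree and then exclude ipRouteTable (OID 1.3.6.1.2.1.4.21 in MIB-II). The view name `cutdown` and community string `example` are placeholders, not values from this thread, and the commands are IOS syntax only; nothing here establishes what the CatOS equivalent would be.

```python
# Render IOS "snmp-server view" lines that hide ipRouteTable from pollers.
# The view name and community string passed in are placeholders chosen for
# illustration; this is IOS syntax and is not claimed to apply to CatOS.

IP_ROUTE_TABLE_OID = "1.3.6.1.2.1.4.21"  # ipRouteTable in the MIB-II ip group


def build_view_config(view_name: str, community: str, excluded_oids: list[str]) -> list[str]:
    """Return config lines: include everything, then exclude each listed subtree."""
    lines = [f"snmp-server view {view_name} iso included"]
    lines += [f"snmp-server view {view_name} {oid} excluded" for oid in excluded_oids]
    # Bind the read-only community to the restricted view.
    lines.append(f"snmp-server community {community} view {view_name} RO")
    return lines


if __name__ == "__main__":
    for line in build_view_config("cutdown", "example", [IP_ROUTE_TABLE_OID]):
        print(line)
```

Other large tables could be added to the excluded list the same way; which ones matter depends on what the poller actually walks.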
> Also, you may want to consider moving away from CatOS, as it's now basically abandonware (or at least will formally be in Jan 2013), and hasn't even seen maintenance updates in the last 4 years.
What you said :)

-jeff
By any chance were you querying a Sup32 that had BGP full routes? That and other large tables can easily swamp the CPU on the Sup32. This technote is based on IOS, and I don't know if the same facilities exist in CatOS, but as Nick mentioned: run, don't walk, and convert to IOS. CatOS is dead.

http://www.cisco.com/en/US/tech/tk648/tk362/technologies_tech_note09186a0080...

-----Original Message-----
From: david peahi [mailto:davidpeahi@gmail.com]
Sent: Friday, November 02, 2012 2:37 PM
To: nanog@nanog.org
Subject: Cisco 6509 SUP32 SNMP Meltdown With CatOS

Anyone have experience with Cisco 6509E/SUP32 crashing under heavy SNMP polling load, causing high CPU utilization and 6509 lockup, requiring 6509 reboot? CatOS is deployed. Is the behavior any different with 6509 IOS?

David
participants (4)
- david peahi
- Jeff Gehlbach
- Matthew Huff
- Nick Hilliard