Extreme + Nachi = ipfdb overflow
After battling Nachi and its flood of ICMP traffic, I've discovered that it's not the Cisco gear that gets hit hard by it, it's the Extreme gear. Nachi generates enough 'random' traffic to flood and subsequently thrash the IP forwarding DB on the Summit 1i we were using, badly enough to drop it from gigabit-capable to barely eking out 6 Mb/sec.

Before I redeploy the switch, I need to find a way to keep the ipfdb from flooding while allowing it to be the primary carrier of traffic. ACLs blocking ICMP on the Extreme act too late; by the time the CPU sees the packet to drop it, it has already horned its way into the ipfdb. Does anyone have any suggestions on ways to allow the switch to participate as an L3 router while minimizing the chances of a worm taking it out so easily again?

Joshua Coombs
GWI Networking
On Mon, Aug 25, 2003 at 03:38:52PM -0400, Joshua Coombs wrote:
After battling Nachi and its flood of ICMP traffic, I've discovered that it's not the Cisco gear that gets hit hard by it, it's the Extreme gear. Nachi generates enough 'random' traffic to flood and subsequently thrash the IP forwarding DB on the Summit 1i we were using, badly enough to drop it from gigabit-capable to barely eking out 6 Mb/sec. Before I redeploy the switch, I need to find a way to keep the ipfdb from flooding while allowing it to be the primary carrier of traffic. ACLs blocking ICMP on the Extreme act too late; by the time the CPU sees the packet to drop it, it has already horned its way into the ipfdb. Does anyone have any suggestions on ways to allow the switch to participate as an L3 router while minimizing the chances of a worm taking it out so easily again?
This affects most layer 3 switches, including Extreme, Foundry, and anyone else who still can't figure out how to pre-generate a FIB instead of using a fast-cache style system. It amazes me that people still have not learned this lesson. How old is CEF now?

Then again, I suppose most of these boxes are being marketed to enterprises anyway. As long as there is a label that says "60Gbps", the box looks good, and it's relatively cheap, how many of their customers are really going to notice the first-packet performance of 6Mbps before they buy, right?

At least some of the other vendors have workarounds (lame as they might be *coughnetaggcough*), or newer supervisors with FIBs, but I'm not aware of anything you can do to make an L3 Barney Switch behave well under a random-destination flood.

--
Richard A Steenbergen <ras@e-gerbil.net> http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)
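[Editor's note: the cache-vs-FIB failure mode described above can be sketched in a few lines. This is a toy simulation, not any vendor's actual data path; the capacity, prefix table, and next-hop names are invented for illustration. The point it demonstrates: a demand-populated route cache punts nearly every packet to the slow path under a random-destination flood, while a pre-generated FIB answers every lookup at the same cost.]

```python
import ipaddress
import random

class FlowCache:
    """Demand-populated route cache (fast-cache style): the first packet
    to each destination takes the slow path; entries are evicted when full."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = {}            # dst IP -> next hop
        self.slow_path_punts = 0

    def forward(self, dst, lookup):
        if dst not in self.cache:
            self.slow_path_punts += 1                    # CPU punt: expensive
            if len(self.cache) >= self.capacity:
                self.cache.pop(next(iter(self.cache)))   # crude FIFO eviction
            self.cache[dst] = lookup(dst)
        return self.cache[dst]

# Pre-generated FIB: every prefix is programmed ahead of time, so a random
# destination costs the same longest-prefix match as any other packet.
FIB = {ipaddress.ip_network("10.0.0.0/8"): "uplink-1",
       ipaddress.ip_network("0.0.0.0/0"): "uplink-0"}

def fib_lookup(dst):
    addr = ipaddress.ip_address(dst)
    best = max((n for n in FIB if addr in n), key=lambda n: n.prefixlen)
    return FIB[best]

# A Nachi-style flood: 100,000 packets to pseudo-random destinations.
rng = random.Random(1)
cache = FlowCache(capacity=4096)
for _ in range(100_000):
    dst = "10.%d.%d.%d" % (rng.randrange(256), rng.randrange(256),
                           rng.randrange(256))
    cache.forward(dst, fib_lookup)

print("slow-path punts: %d of 100000" % cache.slow_path_punts)
```

Under normal traffic (a few thousand popular destinations) the cache would absorb almost everything; under this flood, nearly every packet misses and punts, which is exactly the gigabit-to-6Mb/sec collapse described in the original post.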
I believe the old IBM "routers" used for the NSFnet implemented fully distributed routing tables in each line card. At that time, the commercial router vendors were still faulting routes into the line cards (or central accelerator cards) on demand. I think the good folks at Merit, Watson, and ANS were some of the early advocates of fully distributed tables, due in part to their analysis of samples of real-world backbone traffic.

----- Original Message -----
From: "Richard A Steenbergen" <ras@e-gerbil.net>
To: <jcoombs@gwi.net>
Cc: <nanog@merit.edu>
Sent: Monday, August 25, 2003 4:03 PM
Subject: Re: Extreme + Nachi = ipfdb overflow
On Mon, 25 Aug 2003, Richard A Steenbergen wrote:
At least some of the other vendors have workarounds (lame as they might be *coughnetaggcough*), or newer supervisors with FIBs, but I'm not aware of anything you can do to make an L3 Barney Switch behave well under a random dest flood.
The options in the market that I know of in the $3k-$8k range either have a very small routing table (the Cisco 3550, for instance) or a large route cache (the Extreme Summit i-platform is a good example). So it's either a 3550, with a lamely low number of routes, MAC addresses, and memory, that behaves well under a random-destination flood, or it's Extreme, with a good number of MAC addresses and routes, that normally does everything it should but behaves badly under random-destination load.

--
Mikael Abrahamsson email: swmike@swm.pp.se
At 03:38 PM 8/25/2003, Joshua Coombs wrote:
After battling Nachi and its flood of ICMP traffic, I've discovered that it's not the Cisco gear that gets hit hard by it, it's the Extreme gear. Nachi generates enough 'random' traffic to flood and subsequently thrash the IP forwarding DB on the Summit 1i we were using, badly enough to drop it from gigabit-capable to barely eking out 6 Mb/sec.
Cisco 65xx gear suffers the same problem. SQL Slammer infected three neighboring customers in a colo space we use. The 6509 (used for aggregation in that colo) dropped 10% or more of our packets, though we were not infected. So much for claims from both of these vendors about "wire speed" forwarding.

When testing switch gear, I think it's time to update Scott Bradner's test suites to use random source and destination IP addresses, so we can find out the true limits of the equipment.
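[Editor's note: the randomized-address test traffic suggested above can be sketched as a simple generator. This is a hypothetical sketch, not part of any actual benchmark suite; the function names and the filtered address ranges are the editor's assumptions. It simply produces (src, dst) pairs with uniformly random unicast IPv4 addresses, the worst case for any demand-populated forwarding cache.]

```python
import ipaddress
import random

def random_unicast_ip(rng):
    """Pick a random IPv4 address, skipping obvious non-unicast ranges
    (multicast, reserved, loopback, unspecified) that a test generator
    would not normally send toward a device under test."""
    while True:
        addr = ipaddress.IPv4Address(rng.getrandbits(32))
        if not (addr.is_multicast or addr.is_reserved or
                addr.is_loopback or addr.is_unspecified):
            return addr

def random_flow_stream(count, seed=0):
    """Yield (src, dst) pairs with uniformly random addresses, so that
    almost every packet targets a destination the device has never seen."""
    rng = random.Random(seed)
    for _ in range(count):
        yield random_unicast_ip(rng), random_unicast_ip(rng)

flows = list(random_flow_stream(1000))
print(len(flows), len({dst for _, dst in flows}))  # nearly all destinations distinct
```

A fixed-address throughput test exercises only the fast path after the first packet; feeding a stream like this instead forces every lookup to miss, which is what exposed the 6 Mb/sec floor described earlier in the thread.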
participants (5)
- Daniel Senie
- Joshua Coombs
- Mikael Abrahamsson
- Richard A Steenbergen
- Robert M. Enger