On Fri, Mar 26, 2010 at 08:29:52PM -0500, James Hess wrote:
> Almost all switch manufacturers provide some type of port security feature that allows an end-user connection port to be disabled automatically, requiring admin intervention to re-activate it, if the number of MAC addresses exceeds a configurable number, e.g. allow 5 MAC addresses, which are remembered as that port's list of "secured" MAC addresses with an aging time of 5 minutes.
In fact, the last time this happened, I implemented exactly what you describe, mac-security with a limit of 5 MACs. The security kicked in and shut down the port, but not before the core shut down the edge switch's uplinks (see below).
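For anyone following along, here is a rough toy model of the MAC-limit behavior described above, written in Python. The limit of 5 MACs and the 5-minute aging time come from the quoted description; the class and method names are made up for illustration, and this is only a sketch of the general idea, not how any particular vendor implements it.

```python
class SecuredPort:
    """Toy model of per-port MAC limiting; hypothetical, not vendor logic."""

    def __init__(self, max_macs=5, aging_seconds=300.0):
        self.max_macs = max_macs            # e.g. allow 5 MAC addresses
        self.aging_seconds = aging_seconds  # e.g. 5-minute aging time
        self.secured = {}                   # source MAC -> time last seen
        self.disabled = False

    def receive(self, src_mac, now):
        """Return True if the frame is accepted, False if the port is down."""
        if self.disabled:
            return False
        # Expire secured MACs that have not been seen within the aging time.
        self.secured = {mac: seen for mac, seen in self.secured.items()
                        if now - seen < self.aging_seconds}
        if src_mac not in self.secured and len(self.secured) >= self.max_macs:
            # One MAC too many: disable the port until an admin steps in.
            self.disabled = True
            return False
        self.secured[src_mac] = now
        return True

    def admin_reenable(self):
        """Manual intervention: clear the secured list and bring the port up."""
        self.secured.clear()
        self.disabled = False
```

In this model a looped port that suddenly sources a sixth MAC is disabled on that frame, which is the early warning discussed in this thread. It says nothing about how quickly real hardware reacts.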
> Use that function, and use the functions of a switch that provide storm control for client ports, with a reasonable aging duration for expiring secured MAC addresses.
Have that.
> If a client port emits packets with more than the expected number of MAC address sources within a short time, then that should be an early warning that traffic has taken an improper path.
Have that.
> Keep in mind that a loop doesn't necessarily create an instant issue until other broadcast traffic on the network crosses the loop and starts messing up the CAM table on the upstream device, as well as possibly generating duplicate traffic.
Which pretty much means within milliseconds on my network.
> But with port security, the number of devices that lose packets due to the loop should be limited (the smaller you set the limit without impacting legitimate use of the port, the better).
So basically, the problem is that the core switches implement a proprietary loop-prevention protocol that sends "beacon" frames out every 500ms, and if more than a threshold number of these special frames come back, it shuts down the port. Even with a 10:1 ratio of threshold settings on the two redundant links to the edge switch, the loop still trips both thresholds fast enough that both redundant links get shut down in the short time before the edge switch's protection mechanism (mac-security, STP, bpduprotect, whatever) kicks in.

I've now set the ratio to 100:1 (500:5 in actual packet counts) and the transmit interval to 1000ms in the hope that at least one core link will survive long enough for the edge port to shut down and break the loop first. But I'm beginning to think that this protocol is crap and that I should just disable it and let the core ride out the loop, hoping that it survives without taking down the entire core before the edge switch shutdown happens.

The good news is that this core is being replaced soon, hopefully with gear that will be able to implement a service-provider-like design with per-port VLAN separation, as was suggested in this thread.

But it surprises me that low-end switch vendors (like NetGear) still put out crap that doesn't do STP, especially when the switch does Auto MDI/MDI-X, which is just asking for trouble. Anyone know if Auto MDI/MDI-X is inherent or required in 1000Base-T? It would be nice if I could shut it off.

Thanks for all the ideas.
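To put rough numbers on the threshold race described above, here is a small back-of-the-envelope sketch in Python. It assumes returned beacons arrive at a constant rate once the loop forms and that a port trips as soon as the returned count exceeds its threshold. How the protocol actually counts or windows returned frames is not known here, and the function names, link names, and return rates are hypothetical.

```python
def seconds_until_trip(threshold, returned_frames_per_second):
    """Time until the returned-beacon count exceeds the threshold,
    assuming a constant return rate (an assumption, not a measured value)."""
    return (threshold + 1) / returned_frames_per_second


def surviving_links(thresholds, returned_frames_per_second, edge_reaction_seconds):
    """Links whose threshold has not been exceeded by the time the edge
    port shuts down and breaks the loop."""
    return [name for name, threshold in thresholds.items()
            if seconds_until_trip(threshold, returned_frames_per_second)
            > edge_reaction_seconds]


# The 100:1 setting (500:5 in packet counts); link names are hypothetical.
thresholds = {"core_link_a": 500, "core_link_b": 5}

# Hypothetical return rates and edge reaction time, purely for illustration.
fast_loop = surviving_links(thresholds, 10_000, edge_reaction_seconds=0.1)
slow_loop = surviving_links(thresholds, 1_000, edge_reaction_seconds=0.1)
```

Under these assumptions the ratio only buys time in proportion to the threshold, so if frames circulate in the loop much faster than the edge switch can react, even the 500-frame link trips first.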