On Mon, Jul 11, 2011 at 7:48 PM, Jimmy Hess <mysidia@gmail.com> wrote:
> If every vendor's implementation is vulnerable to a NDP Exhaustion vulnerability, how come the behavior of specific routers has not been documented specifically?
Well, I am in the business of knowing the behavior of kit being considered by my clients for their applications. Every box breaks when tested, period. I imagine you have tested zero, thus you have no data of your own to go on. No vendors are rushing to spend money on "independent" testing laboratories to produce reports about this, because they pretty much all know their boxes will break (or are not even aware of the potential problem, in the case of a few scary vendors.)
> If "zero" devices are not vulnerable, you came to this conclusion because you tested every single implementation against IPv6 NDP DoS, or?
Although I have tested many routers to verify my thinking, if you actually read the slides and understand how routers work, you too will know that every router is vulnerable. If you don't know, you don't understand how routers work. It's that simple.
> How come there are no security advisories. What's the CWE or CVE number for this vulnerability?
Again, no one is interested in this problem yet because vendors really don't want their customers to demand more knobs. Cisco is the only vendor who has done anything at all. If you read about their knob, you immediately realize that it is a knob to control the failure mode of the box, not to "fix" anything. Why? It can't be "fixed" except by not using /64 (or similar), or by going to the extreme lengths I outline in those slides.
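A minimal sketch of why a per-interface entry limit changes the failure mode rather than removing it. Everything here is hypothetical: the class name, the limit of 1000 entries, and the addresses are made up for illustration, entry aging and timers are left out, and this is not a model of any vendor's implementation.

```python
class CappedNeighborCache:
    """Toy ND cache with a hard per-interface entry limit (hypothetical model)."""

    def __init__(self, limit):
        self.limit = limit
        self.entries = {}  # address -> "INCOMPLETE" or "REACHABLE"

    def resolve(self, address, host_exists):
        """Return True if traffic to `address` could be forwarded."""
        if address in self.entries:
            return self.entries[address] == "REACHABLE"
        if len(self.entries) >= self.limit:
            # Limit reached: the box stays up, but no new neighbor is learned.
            return False
        self.entries[address] = "REACHABLE" if host_exists else "INCOMPLETE"
        return host_exists


cache = CappedNeighborCache(limit=1000)  # assumed limit, not a real default

# Attacker sweeps addresses in the subnet that have no host behind them.
for i in range(5000):
    cache.resolve(f"2001:db8::dead:{i:x}", host_exists=False)

# A real host that was not yet in the cache now needs to be resolved.
legit_ok = cache.resolve("2001:db8::10", host_exists=True)
print(len(cache.entries), legit_ok)  # 1000 False
```

The limit keeps the table from growing without bound, but in this toy model the legitimate host on the same interface still cannot be resolved once junk entries hold all the slots.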
> It would be useful to at least have the risk properly described, in terms of what kind of DoS condition could arise on specific implementations.
Let's take 6500/SUP720 for example. On this platform, a policer is shared between the need to resolve ARP entries and ND table entries. If you attack a dual-stack SUP720 box it will break not only IPv6 neighbor resolution, but also IPv4 neighbor resolution. This is pretty much the "worst-case scenario" because not only will your IPv6 break, which may annoy customers but not be a disaster; it will also break mission-critical IPv4. That's bad. Routing-protocol adjacencies can be affected, disabling not just some hosts downstream of the box, but also its upstream connectivity. It doesn't get any worse than that.

You are right to question my statements. I'm not an independent lab doing professional tests and showing the environment and conditions of how you can reproduce the results. I'm just a guy helping my clients decide what kit to buy, and how they should configure their networks. The only reason I have bothered to produce slides is because we are at a point where we have end-customers questioning our reluctance to provision /64 networks for mixed-use data-center LANs, and until vendors actually do something to address this, or "the standard" changes, I need to increase awareness of this problem so I am not forced to deploy a broken design on my own networks the way a lot of other clueless people are.

Again, this is only hard to understand (or accept) if you don't know how your routers work.

* why do you think there is an ARP and ND table?
* why do you think there are policers to protect the CPU from excessive ARP/ND punts or traffic?
* do you even know the limit of your boxes' ARP / ND tables? Do you realize that limit is a tiny fraction of one /64?
* do you understand what happens when your ARP/ND policers are reached?
* did you think about the impact on neighboring routers and protocol next-hops, not just servers?
* did you ever try to deploy a /16 on a flat LAN with a lot of hosts and see what happens? Doesn't work too well.
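The table-size questions above come down to simple arithmetic. Here is a rough sketch of it. The subnet sizes are exact, but ND_TABLE_LIMIT and ATTACK_PPS are assumed values chosen only for illustration, not figures from any datasheet or test.

```python
V6_SLASH64 = 2 ** (128 - 64)  # addresses in one IPv6 /64
V4_SLASH16 = 2 ** (32 - 16)   # addresses in one IPv4 /16

ratio = V6_SLASH64 // V4_SLASH16
print(f"/64 vs /16: {ratio:,} times larger")  # 281,474,976,710,656

# Hypothetical figures for illustration only.
ND_TABLE_LIMIT = 100_000  # assumed neighbor-table capacity
ATTACK_PPS = 10_000       # assumed attack rate, one new target address per packet

fraction = ND_TABLE_LIMIT / V6_SLASH64
seconds_to_fill = ND_TABLE_LIMIT / ATTACK_PPS
print(f"table covers {fraction:.2e} of a /64")
print(f"table full after {seconds_to_fill:.0f} s at {ATTACK_PPS:,} pps")
```

Plug in the real table limit for your own platform. Under any realistic limit the table covers a vanishing fraction of the subnet, and a single source sending at a modest rate can request more unique neighbors than the table can hold within seconds.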
A v6 /64 is 281 trillion times bigger than a v4 /16. There's no big leap of logic here as to why one rogue machine could break your LAN. There is no router which is not vulnerable to this. If you don't believe me, read the Cisco documentation on their knob limiting ND entries per interface, after which there may be service impact on that interface. That's the best anyone is doing right now.

Of course, vendors understand that we, as customers, can configure a subnet smaller than /64. They are leaving us open to link-local issues right now even with a smaller global subnet size, but at least that cannot be exploited from "the Internet." And as it happens, exactly the same features / knobs are needed to "fix" both problems with /64, and with link-local neighbor learning.

-- 
Jeff S Wheeler <jsw@inconcepts.biz>
Sr Network Operator / Innovative Network Concepts