Cisco 7600 PFC3B(XL) and IPv6 packets with fragmentation header
Just thought I'd share some operational info.

PFC3B will by default punt IPv6 packets with fragmentation header to RP and route them there, with the obvious performance penalty this incurs.

Workaround is to change this behaviour, meaning ACLs won't work for packets with fragmentation header anymore:

#platform ipv6 acl fragment hardware ?
  drop     Drop IPv6 fragments at hardware
  forward  Forward IPv6 fragments at hardware

PFC3C is supposed to not be affected.

A lot of Teredo and 6to4 traffic has fragmentation headers, so this actually is a real problem. We discovered this at our Teredo relay upstream router.

--
Mikael Abrahamsson email: swmike@swm.pp.se
On Fri, Sep 30, 2011 at 1:07 AM, Mikael Abrahamsson <swmike@swm.pp.se> wrote:
Just thought I'd share some operational info.
PFC3B will by default punt IPv6 packets with fragmentation header to RP and route them there, with the obvious performance penalty this incurs.
when will vendors learn that punting to the RE/RP/smarts for packets in the fastpath is ... not just 'unwise' but wholesale stupid? :(
Workaround is to change this behaviour, meaning ACLs won't work for packets with fragmentation header anymore:
#platform ipv6 acl fragment hardware ?
  drop     Drop IPv6 fragments at hardware
  forward  Forward IPv6 fragments at hardware
your recommendation is to ... forward? (or perhaps not 'recommendation' but: "forward means do not pass go, just ship out the proper egress interface; drop means ... send to hell")

If you do nothing the default behavior is to send the packet to the RP... why? (why would you want this packet sent to the RP? it's got a valid destination, no? so deliver it out the egress interface?)

thanks!
-chris
PFC3C is supposed to not be affected.
A lot of Teredo and 6to4 traffic has fragmentation headers, so this actually is a real problem. We discovered this at our Teredo relay upstream router.
-- Mikael Abrahamsson email: swmike@swm.pp.se
On Fri, 30 Sep 2011, Christopher Morrow wrote:
If you do nothing the default behavior is to send the packet to the RP... why? (why would you want this packet sent to the RP? it's got a valid destination, no? so deliver it out the egress interface?)
I was told it's because PFC3B can't look into the packet far enough to determine what the payload is (TCP/UDP etc) and port, that's only the RP that can do ACL handling of the packet. So if you configure "forward", people can put a fragmentation header on the packet and skip past your ACL. -- Mikael Abrahamsson email: swmike@swm.pp.se
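To make the ACL-bypass point concrete, here is a minimal Python sketch of what a classifier has to do to reach the TCP/UDP header of an IPv6 packet. It is not how the PFC3B works internally (that is not described in this thread); the function name `find_upper_layer` and the helper `ipv6_header` are made up for illustration. The only facts relied on are the standard IPv6 layout: a 40-byte fixed header, an 8-byte fragment header (next-header value 44), and other extension headers whose length is given in 8-byte units.

```python
import struct

# IANA protocol numbers for the IPv6 extension headers handled here.
HOP_BY_HOP, ROUTING, FRAGMENT, DEST_OPTS = 0, 43, 44, 60
TCP, UDP = 6, 17
IPV6_FIXED_LEN = 40


def find_upper_layer(packet: bytes):
    """Walk the IPv6 header chain; return (protocol, offset of its header)."""
    if len(packet) < IPV6_FIXED_LEN:
        raise ValueError("truncated IPv6 header")
    next_header = packet[6]  # Next Header field of the fixed header
    offset = IPV6_FIXED_LEN
    while next_header in (HOP_BY_HOP, ROUTING, FRAGMENT, DEST_OPTS):
        if len(packet) < offset + 8:
            raise ValueError("truncated extension header")
        if next_header == FRAGMENT:
            ext_len = 8  # the fragment header has a fixed size
        else:
            # Hdr Ext Len counts 8-byte units beyond the first 8 bytes.
            ext_len = (packet[offset + 1] + 1) * 8
        next_header = packet[offset]
        offset += ext_len
    return next_header, offset


def ipv6_header(next_header: int, payload_len: int) -> bytes:
    """Hypothetical helper: fixed header with all-zero addresses."""
    return struct.pack("!IHBB", 6 << 28, payload_len, next_header, 64) + bytes(32)


tcp_stub = struct.pack("!HH", 12345, 80) + bytes(16)   # ports + rest of TCP header
frag_hdr = struct.pack("!BBHI", TCP, 0, 0, 1)          # first fragment, offset 0

plain = ipv6_header(TCP, len(tcp_stub)) + tcp_stub
fragmented = ipv6_header(FRAGMENT, len(frag_hdr) + len(tcp_stub)) + frag_hdr + tcp_stub

print(find_upper_layer(plain))       # TCP header directly after the fixed header
print(find_upper_layer(fragmented))  # same TCP header, pushed 8 bytes further in
```

A classifier that only inspects the Next Header field of the fixed header sees 44 rather than 6 or 17 for the second packet, so a port-based ACL entry has nothing to match on unless something walks the chain as above. For non-first fragments the TCP/UDP header is not in the packet at all, so no amount of parsing recovers the ports.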
On (2011-09-30 01:55 -0400), Christopher Morrow wrote:
when will vendors learn that punting to the RE/RP/smarts for packets in the fastpath is ... not just 'unwise' but wholesale stupid? :(
What to do with IP options or IPv6 hop-by-hop options? What to do with IPv6 packets which contain options which push TCP/UDP past your lookup view? Punting transit is not only not stupid but also necessary in hardware routers which cannot handle every case in hardware (which is all routers). There should just be adequate way to limit these and there should exist default limitation. -- ++ytti
On Fri, Sep 30, 2011 at 6:02 AM, Saku Ytti <saku@ytti.fi> wrote:
On (2011-09-30 01:55 -0400), Christopher Morrow wrote:
when will vendors learn that punting to the RE/RP/smarts for packets in the fastpath is ... not just 'unwise' but wholesale stupid? :(
What to do with IP options or IPv6 hop-by-hop options? What to do with IPv6 packets which contain options which push TCP/UDP past your lookup view?
a switch to be used that stops processing this sort of thing, in an internet core (and honestly most enterprise core) routers, all I want is packet-in/packet-out. there's no need for anything else, stop trying to send line-rate packets to the cpu.
Punting transit is not only not stupid but also necessary in hardware routers which cannot handle every case in hardware (which is all routers).
no. all you need is a default 'do not process these, just fwd them' switch. (or, a switch at any rate that the operator can select one way or the other, they SHOULD know what is the best for their deployment).
There should just be adequate way to limit these and there should exist default limitation.
I really think zero limit is the right limit... (for a large number of deployments)
On (2011-09-30 10:09 -0400), Christopher Morrow wrote:
a switch to be used that stops processing this sort of thing, in an internet core (and honestly most enterprise core) routers, all I want is packet-in/packet-out. there's no need for anything else, stop trying to send line-rate packets to the cpu.
This would break e.g. RSVP. For some instances dropping all of them in hardware is an option, for other instances ignoring and forwarding without understanding is ok, but in some situations you simply must punt.
no. all you need is a default 'do not process these, just fwd them' switch. (or, a switch at any rate that the operator can select one way or the other, they SHOULD know what is the best for their deployment).
It would also break L4 ACL under certain situations, as well as RSVP as already explained. And probably issues I'm not aware of. Unsure if blind forwarding is best option. But I'm all for giving operator options, but calling it stupid that vendors punt something is misguided.
I really think zero limit is the right limit... (for a large number of deployments)
Traceroute would also break. Unpoliced punting certainly is extremely unwise, but punting to a level that does not introduce significant CPU load, should be safest default. -- ++ytti
On Fri, Sep 30, 2011 at 10:26 AM, Saku Ytti <saku@ytti.fi> wrote:
explained. And probably issues I'm not aware of. Unsure if blind forwarding is best option. But I'm all for giving operator options, but calling it stupid that vendors punt something is misguided.
after this long, yes... this is just dumb, there's no reason that the default should be punt. There are cases (you've brought up a few) where it's required today because of design limitations, there really shouldn't be cases like this anymore. this isn't our first rodeo, 'lessons learned' and all that...
I really think zero limit is the right limit... (for a large number of deployments)
Traceroute would also break. Unpoliced punting certainly is extremely unwise,
traceroute could certainly be handled in the fastpath.
but punting to a level that does not introduce significant CPU load, should be safest default.
what is that limit? from a single port? from a single linecard? from a chassis? how about we remove complexity here and just deal with this in the fastpath? My point in calling this all 'stupid' is that by now we all have been burned by this sort of behavior, vendors have heard from all of us that 'this is really not a good answer', enough is enough please stop doing this. -chris
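The "limit per what?" question can be illustrated with a small token-bucket sketch in Python. This is not the PFC3 `mls` rate limiter or any vendor's implementation; `TokenBucket`, `PuntPolicer`, and the `scope` parameter are hypothetical names chosen for the example, and timestamps are passed in explicitly so the behaviour is deterministic.

```python
class TokenBucket:
    """Classic token bucket: `rate_pps` tokens per second, capped at `burst`."""

    def __init__(self, rate_pps: float, burst: int):
        self.rate = rate_pps
        self.burst = burst
        self.tokens = float(burst)  # start full
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill for the elapsed time, never above the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PuntPolicer:
    """Polices punted packets per chassis, per linecard, or per port."""

    def __init__(self, rate_pps: float, burst: int, scope: str):
        if scope not in ("chassis", "linecard", "port"):
            raise ValueError("unknown scope")
        self.rate_pps, self.burst, self.scope = rate_pps, burst, scope
        self.buckets = {}

    def allow(self, linecard: int, port: int, now: float) -> bool:
        key = {"chassis": (), "linecard": (linecard,), "port": (linecard, port)}[self.scope]
        if key not in self.buckets:
            self.buckets[key] = TokenBucket(self.rate_pps, self.burst)
        return self.buckets[key].allow(now)


# One shared bucket: a flood on linecard 1 uses up the budget for everyone.
chassis = PuntPolicer(rate_pps=10, burst=2, scope="chassis")
flood = [chassis.allow(1, 1, now=0.0) for _ in range(3)]
victim_shared = chassis.allow(2, 1, now=0.0)

# Per-port buckets: the same flood leaves other ports untouched.
per_port = PuntPolicer(rate_pps=10, burst=2, scope="port")
for _ in range(3):
    per_port.allow(1, 1, now=0.0)
victim_isolated = per_port.allow(2, 1, now=0.0)

print(flood, victim_shared, victim_isolated)
```

With a single chassis-wide bucket, legitimate punts from a quiet port are dropped as soon as another port floods; finer scopes isolate the damage, but the aggregate the RP can receive then grows with the number of ports, which is one way to see why there is no obviously right answer to the question.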
On (2011-09-30 10:45 -0400), Christopher Morrow wrote:
after this long, yes... this is just dumb, there's no reason that the default should be punt. There are cases (you've brought up a few) where it's required today because of design limitations, there really shouldn't be cases like this anymore. this isn't our first rodeo, 'lessons learned' and all that...
Certainly possible, but will you pay the premium? I won't. To implement IPv6 according to standard your lookup engine needs to have MTU wide view, so up-to 65kB. Most common view today probably is 64B and highest I know 256B. And for the corner cases where this isn't enough, I'm happy to handle it in software, rather than pay premium to do it all in hardware.
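The lookup-view argument is simple arithmetic, sketched below in Python. The assumptions are loud ones: the window is counted from the start of an untagged Ethernet frame (14-byte L2 header), only the TCP/UDP port fields need to be visible, and the 64B and 256B figures are just the ones quoted above, not the specification of any particular ASIC. `ports_visible` is a hypothetical name.

```python
IPV6_FIXED_LEN = 40   # fixed IPv6 header
L4_PORTS_LEN = 4      # source port + destination port (TCP or UDP)
FRAGMENT_HDR_LEN = 8  # IPv6 fragment header


def ports_visible(window_bytes: int, ext_header_lens, l2_len: int = 14) -> bool:
    """True if the L4 port fields end inside the lookup window."""
    end_of_ports = l2_len + IPV6_FIXED_LEN + sum(ext_header_lens) + L4_PORTS_LEN
    return end_of_ports <= window_bytes


# No extension headers: 14 + 40 + 4 = 58 bytes.
print(ports_visible(64, []))
# One fragment header: 58 + 8 = 66 bytes.
print(ports_visible(64, [FRAGMENT_HDR_LEN]))
print(ports_visible(256, [FRAGMENT_HDR_LEN]))
# 200 bytes of destination options plus a fragment header: 266 bytes.
print(ports_visible(256, [200, FRAGMENT_HDR_LEN]))
```

Under these assumptions any fixed window can be exceeded by a sender who adds enough extension headers, which is the reason given above for keeping a software path for the corner cases.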
traceroute could certainly be handled in the fastpath.
Yup. But again who would pay for this? I cannot be DoSed by TTL exceeds as there is a sufficient protection mechanism in my hardware. So I would not pay a premium for this feature.
what is that limit? from a single port? from a single linecard? from a chassis? how about we remove complexity here and just deal with this in the fastpath?
It would increase cost and complexity greatly. If I could get it for free, then I would take it, but I have a lot more important things I want router vendors to fix first. What I do wish vendors would do is test their boxes with attack vectors and implement sane defaults (IOS-XR is relatively good in this respect, or maybe it just looks that way as the rest of them are really bad with their defaults). Very recently I had a chat with a GSR owner who was happy about how GSR/IOS is a solid DDoS-resistant platform, while actually it is impossible to protect GSR/IOS (outside iACL) as none of the protections (rACL/CoPP) are implemented in hardware. 7600 is reasonably good for its age in this matter. But even modern examples, like MX80, completely fail with defaults. I killed an MX80 in the lab with a bit over 5Mbps of IP options. Protection is quite easy but still most people do not do it, so vendors really should ship boxes with saner defaults. -- ++ytti
On 30/09/2011 15:45, Christopher Morrow wrote:
traceroute could certainly be handled in the fastpath.
which traceroute? icmp? udp? tcp? Traceroute is not a single protocol.
what is that limit? from a single port? from a single linecard? from a chassis? how about we remove complexity here and just deal with this in the fastpath?
on a pfc3, the mls rate limiters deal with handling all punts from the chassis to the RP. It's difficult to handle this in any other way.
My point in calling this all 'stupid' is that by now we all have been burned by this sort of behavior, vendors have heard from all of us that 'this is really not a good answer', enough is enough please stop doing this.
"This is a Hard Problem". There is a balance to be drawn between hardware complexity, cost and lifecycle. In the case of the PFC3, we're talking about hardware which was released in 2000 - 11 years ago. The ipv6 fragment punting problem was fixed in the pfc3c, which was released in 2003. I'm aware that cisco is still selling the pfc3b, but they really only push the rsp720 for internet stuff (if they're pushing the 6500/7600 line at all). Nick
On Fri, 30 Sep 2011, Nick Hilliard wrote:
On 30/09/2011 15:45, Christopher Morrow wrote:
traceroute could certainly be handled in the fastpath.
which traceroute? icmp? udp? tcp? Traceroute is not a single protocol.
what is that limit? from a single port? from a single linecard? from a chassis? how about we remove complexity here and just deal with this in the fastpath?
on a pfc3, the mls rate limiters deal with handling all punts from the chassis to the RP. It's difficult to handle this in any other way.
My point in calling this all 'stupid' is that by now we all have been burned by this sort of behavior, vendors have heard from all of us that 'this is really not a good answer', enough is enough please stop doing this.
"This is a Hard Problem". There is a balance to be drawn between hardware complexity, cost and lifecycle. In the case of the PFC3, we're talking about hardware which was released in 2000 - 11 years ago. The ipv6 fragment punting problem was fixed in the pfc3c, which was released in 2003. I'm aware that cisco is still selling the pfc3b, but they really only push the rsp720 for internet stuff (if they're pushing the 6500/7600 line at all).
They are pushing sup2T - however more for enterprise ip layer (6500 series). Regards, Janos Mohacsi
On 30/09/2011 16:38, Mohacsi Janos wrote:
They are pushing sup2T - however more for enterprise ip layer (6500 series).
they are now, yes. But until the sup2t started becoming available a couple of weeks ago the only option for the 6500 was a sup720. You're right that this was only pushed on the enterprise market. Of course, if you wanted a 10g capable service provider router and didn't want an asr9k, they were pushing the 7600 because the 6500 is a switch and the 7600 is a router and the two are totally different, no really you've gotta believe it. But at least the rsp720 could handle ipv6 fragments better. Nick
On Fri, Sep 30, 2011 at 12:00 PM, Nick Hilliard <nick@foobar.org> wrote:
Of course, if you wanted a 10g capable service provider router and didn't want an asr9k, they were pushing the 7600 because the 6500 is a switch and the 7600 is a router and the two are totally different, no really you've gotta believe it. But at least the rsp720 could handle ipv6 fragments better.
if I turn my head to the side I can almost believe you.
On Fri, Sep 30, 2011 at 11:24 AM, Nick Hilliard <nick@foobar.org> wrote:
On 30/09/2011 15:45, Christopher Morrow wrote:
traceroute could certainly be handled in the fastpath.
which traceroute? icmp? udp? tcp? Traceroute is not a single protocol.
traceroute is really an example of 'packet expired, send unreachable'... that, today is basically:
  o grab 64bytes of header (or something similar)
  o shove that in a payload
  o use the src as the dst
  o stick my src on
  o set icmp
  o crc and fire

there's not really any need to do this in the slow path, is there?

-chris
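The steps listed above can be written out as a short Python sketch for IPv4. Strictly speaking, the message sent on TTL expiry is ICMP Time Exceeded (type 11, code 0) rather than an unreachable. The function names, the 64-byte quote length (taken from the list above), and the TTL of 64 on the reply are assumptions for illustration; a real forwarding path would also rate-limit these replies and would not answer an ICMP error with another ICMP error.

```python
import struct


def inet_checksum(data: bytes) -> int:
    """16-bit one's-complement Internet checksum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def time_exceeded(expired: bytes, router_addr: bytes, quote_len: int = 64) -> bytes:
    """Build an IPv4 ICMP Time Exceeded reply for an expired IPv4 packet."""
    quoted = expired[:quote_len]   # grab the leading bytes of the offending packet
    orig_src = expired[12:16]      # its source address becomes our destination
    # ICMP header: type 11, code 0, checksum placeholder, 4 unused bytes.
    icmp = struct.pack("!BBHI", 11, 0, 0, 0) + quoted
    icmp = icmp[:2] + struct.pack("!H", inet_checksum(icmp)) + icmp[4:]
    # IPv4 header: version/IHL, TOS, total length, ID, flags/offset,
    # TTL, protocol 1 (ICMP), checksum placeholder, source, destination.
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + len(icmp), 0, 0,
                     64, 1, 0, router_addr, orig_src)
    ip = ip[:10] + struct.pack("!H", inet_checksum(ip)) + ip[12:]
    return ip + icmp


# An expiring UDP probe from 192.0.2.1 to 198.51.100.1 (documentation prefixes).
probe_src, probe_dst = bytes([192, 0, 2, 1]), bytes([198, 51, 100, 1])
probe = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 28, 0, 0, 1, 17, 0,
                    probe_src, probe_dst) + struct.pack("!HHHH", 33434, 33434, 8, 0)
router = bytes([203, 0, 113, 1])
reply = time_exceeded(probe, router)
print(len(reply), reply[20], reply[12:16] == router, reply[16:20] == probe_src)
```

Nothing here depends on whether the probe was ICMP, UDP, or TCP: the reply is built from the expired packet's IP header and a fixed-size quote, which is the point made later in the thread about traceroute not needing per-protocol handling on the router.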
On 30/09/2011 17:30, Christopher Morrow wrote:
traceroute is really an example of 'packet expired, send unreachable'... that, today is basically:
  o grab 64bytes of header (or something similar)
  o shove that in a payload
  o use the src as the dst
  o stick my src on
  o set icmp
  o crc and fire
there's not really any need to do this in the slow path, is there?
there are unconfirmed rumours that icmp ping and traceroute are handled by hardware on the asr1k. I don't know if they are true. But you're right - it would be good to support this without resorting to hammering the routing engine. I don't really like the idea of punters running traceroutes reducing my bgp convergence time. Nick
On Fri, Sep 30, 2011 at 12:38 PM, Nick Hilliard <nick@foobar.org> wrote:
On 30/09/2011 17:30, Christopher Morrow wrote:
traceroute is really an example of 'packet expired, send unreachable'... that, today is basically:
  o grab 64bytes of header (or something similar)
  o shove that in a payload
  o use the src as the dst
  o stick my src on
  o set icmp
  o crc and fire
there's not really any need to do this in the slow path, is there?
there are unconfirmed rumours that icmp ping and traceroute are handled by hardware on the asr1k. I don't know if they are true. But you're right -
some platforms do some/all of this in hardware, yes. (I forget the matrix)
it would be good to support this without resorting to hammering the routing engine. I don't really like the idea of punters running traceroutes reducing my bgp convergence time.
this is exactly why punting anything NOT management and/or routing-protocols should be banned. Thanks for making that point explicitly. -chris
On Sep 30, 2011, at 11:44 PM, Christopher Morrow wrote:
this is exactly why punting anything NOT management and/or routing-protocols should be banned. Thanks for making that point explicitly.
And this is the requirement which should be placed in RFPs, along with other specific requirements for ACL handling, flow telemetry functionality, uRPF, et. al. If folks want to influence vendors to do the Right Thing, they have to expend the time and effort to quantify and qualify said Right Thing(s), and then put it into RFP requirements. Otherwise, complaining post-procurement isn't generally going to accomplish much. ----------------------------------------------------------------------- Roland Dobbins <rdobbins@arbor.net> // <http://www.arbornetworks.com> The basis of optimism is sheer terror. -- Oscar Wilde
On Fri, Sep 30, 2011 at 9:32 PM, Dobbins, Roland <rdobbins@arbor.net> wrote:
On Sep 30, 2011, at 11:44 PM, Christopher Morrow wrote:
this is exactly why punting anything NOT management and/or routing-protocols should be banned. Thanks for making that point explicitly.
And this is the requirement which should be placed in RFPs, along with other specific requirements for ACL handling, flow telemetry functionality, uRPF, et. al.
If folks want to influence vendors to do the Right Thing, they have to expend the time and effort to quantify and qualify said Right Thing(s), and then put it into RFP requirements. Otherwise, complaining post-procurement isn't generally going to accomplish much.
yes, my bitchfest was also a 'could we all start asking for this, now?' ... :)
On Fri, Sep 30, 2011 at 9:44 PM, Christopher Morrow <morrowc.lists@gmail.com> wrote:
On Fri, Sep 30, 2011 at 9:32 PM, Dobbins, Roland <rdobbins@arbor.net> wrote:
On Sep 30, 2011, at 11:44 PM, Christopher Morrow wrote:
this is exactly why punting anything NOT management and/or routing-protocols should be banned. Thanks for making that point explicitly.
And this is the requirement which should be placed in RFPs, along with other specific requirements for ACL handling, flow telemetry functionality, uRPF, et. al.
If folks want to influence vendors to do the Right Thing, they have to expend the time and effort to quantify and qualify said Right Thing(s), and then put it into RFP requirements. Otherwise, complaining post-procurement isn't generally going to accomplish much.
yes, my bitchfest was also a 'could we all start asking for this, now?' ... :)
which traceroute? icmp? udp? tcp? Traceroute is not a single protocol.
Router processing is only dependent on noticing that TTL is expiring, and being able to return an ICMP message (including a "quote" of part of the original packet) to the sender.
what is that limit? from a single port? from a single linecard? from a chassis? how about we remove complexity here and just deal with this in the fastpath?
on a pfc3, the mls rate limiters deal with handling all punts from the chassis to the RP. It's difficult to handle this in any other way.
If the rate limit is done "in hardware" (which one should hope), then it would be more natural to do it on a per-PFC/DFC basis. So on a box with DFCs on all linecards, it would be per linecard, not per chassis. Maybe someone who knows for sure can decide.
My point in calling this all 'stupid' is that by now we all have been burned by this sort of behavior, vendors have heard from all of us that 'this is really not a good answer', enough is enough please stop doing this.
"This is a Hard Problem". There is a balance to be drawn between hardware complexity, cost and lifecycle. In the case of the PFC3, we're talking about hardware which was released in 2000 - 11 years ago.
Um, no, in 2000 there was no PFC3. That came out (on the Supervisor 720) in March 2003.
The ipv6 fragment punting problem was fixed in the pfc3c, which was released in 2003.
The PFC 3C was announced (with the RSP720) in December 2006.
I'm aware that cisco is still selling the pfc3b, but they really only push the rsp720 for internet stuff (if they're pushing the 6500/7600 line at all).
See Janos' reply, the Catalyst 6500 seems alive and kicking with the Supervisor 2T. The 7600 is a somewhat different story. As far as I see, all development is going into feature-rich ES+ cards and a few relatively narrow applications such as mobile backhaul and FTTH aggregation(?). We have been using the 7600 as a cheap fast IPv4/IPv6 (and later also MPLS) backbone router. According to Cisco we should probably move "up" to the ASR9000 or CRS-3, but I'm tempted to "downgrade" to Catalyst 6500 with Sup-2T (until we need 100G :-). -- Simon.
On Sep 30, 2011, at 9:45 PM, Christopher Morrow wrote:
enough is enough please stop doing this.
Yes, but keep in mind that this particular issue has to do with an ASIC which is several years old and which contains other significant handicaps as well (viz. NetFlow caveats, no per-interface uRPF mode, etc.). So, complaining about most anything on this particular ASIC isn't going to accomplish much, unfortunately. The key is to a) evaluate newer ASICs on more operationally useful platforms in order to see how they handle this sort of thing (EARL8 should be fine, AFAICT) and b) put the appropriate requirements into RFPs so that vendors have a monetary value associated with doing the right thing. ----------------------------------------------------------------------- Roland Dobbins <rdobbins@arbor.net> // <http://www.arbornetworks.com> The basis of optimism is sheer terror. -- Oscar Wilde
Path MTU discovery would also break... oh wait, that's usually broken anyway.

-Vinny

-----Original Message-----
From: Saku Ytti [mailto:saku@ytti.fi]
Sent: Friday, September 30, 2011 10:27 AM
To: nanog@nanog.org
Subject: Re: Cisco 7600 PFC3B(XL) and IPv6 packets with fragmentation header

On (2011-09-30 10:09 -0400), Christopher Morrow wrote:
a switch to be used that stops processing this sort of thing, in an internet core (and honestly most enterprise core) routers, all I want is packet-in/packet-out. there's no need for anything else, stop trying to send line-rate packets to the cpu.
This would break e.g. RSVP. For some instances dropping all of them in hardware is an option, for other instances ignoring and forwarding without understanding is ok, but in some situations you simply must punt.
no. all you need is a default 'do not process these, just fwd them' switch. (or, a switch at any rate that the operator can select one way or the other, they SHOULD know what is the best for their deployment).
It would also break L4 ACL under certain situations, as well as RSVP as already explained. And probably issues I'm not aware of. Unsure if blind forwarding is best option. But I'm all for giving operator options, but calling it stupid that vendors punt something is misguided.
I really think zero limit is the right limit... (for a large number of deployments)
Traceroute would also break. Unpoliced punting certainly is extremely unwise, but punting to a level that does not introduce significant CPU load, should be safest default. -- ++ytti
On Fri, Sep 30, 2011 at 12:55 AM, Christopher Morrow <morrowc.lists@gmail.com> wrote:
On Fri, Sep 30, 2011 at 1:07 AM, Mikael Abrahamsson <swmike@swm.pp.se> wrote:
when will vendors learn that punting to the RE/RP/smarts for packets in the fastpath is ... not just 'unwise' but wholesale stupid? :(

Yeah, that's a nice one, thanks.
At this point, I would have to describe it as ludicrous product engineering. Unless we're talking about small-business CPE devices, or true beasts with RPs capable of actually handling the load at wire speed. It goes beyond 'stupid' and well into the range of unreasonably insane UI design. Are cars designed to automatically slow to a stop when you turn on the radio if you forget to push a "don't let the radio interfere with my engine" button?

The default/convention on real routers should be: Never punt a packet to RP for ACL processing. If someone asks to establish an ACL for a type of traffic that would be subject to that, the request should generate an error. Or it should warn the user "% ACL Processing for this command will not be performed on fragments, unless you enable software ACL processing of IPv6 fragments using the blah blah blah command." And ask the human to manually turn on a "platform ipv6 acl fragment allow-software yes-i-am-really-really-sure" setting.

-- -JH
participants (9)
- Christopher Morrow
- Dobbins, Roland
- Jimmy Hess
- Mikael Abrahamsson
- Mohacsi Janos
- Nick Hilliard
- Saku Ytti
- Simon Leinen
- Vinny_Abello@Dell.com