Experiences with IPv6 and Routing Efficiency
Hello folks,

Does anyone have any experiences or insights to share on how more (or less) efficient routing is with IPv6? Any specific thoughts with respect to how the following characteristics help or not with routing efficiency?

- fixed header size
- extension header chain
- flow labels in header
- no intermediate fragmentation
- no checksums

Thanks in advance.

--
Mukom Akong T.
http://about.me/perfexcellence | twitter: @perfexcellent
"When you work, you are the FLUTE through whose lungs the whispering of the hours turns to MUSIC" - Kahlil Gibran
On 18/01/2014 04:09, Mukom Akong T. wrote:
Does anyone have any experiences or insights to share on how more (or less) efficient routing is with IPv6? Any specific thoughts with respect to how the following characteristics help or not with routing efficiency?

- fixed header size
- extension header chain
- flow labels in header
- no intermediate fragmentation
- no checksums
Extension headers are a poor idea because they're troublesome to process on cheap hardware. Because of this, packets with any sort of extension headers are routinely dropped by a large percentage of organisations. Flow labels are generally unused (i.e. set to zero by many host stacks).

Nick
On (2014-01-18 12:22 +0000), Nick Hilliard wrote:
On 18/01/2014 04:09, Mukom Akong T. wrote:
Does anyone have any experiences or insights to share on how more (or less) efficient routing is with IPv6? Any specific thoughts with respect to how the following characteristics help or not with routing efficiency?

- fixed header size
- extension header chain
- flow labels in header
- no intermediate fragmentation
- no checksums
Extension headers are a poor idea because they're troublesome to process on cheap hardware. Because of this, packets with any sort of extension headers are routinely dropped by a large percentage of organisations. Flow labels are generally unused (i.e. set to zero by many host stacks).
Fully agreed. The main issues in IPv6, in my view:

1. EH
   - allows bypassing L4 ACL matches in practical devices
   - the EH chain can be as long as the packet, 64k, i.e. the L4 header might be in a fragment
   - some HW simply silently drops packets with more EHs than it can parse
   - some HW punts packets with EHs it can't parse

2. lack of checksum
   - in some instances packet corruption may be impossible to detect in the network

3. solicited-node multicast in LAN
   - replaces the broadcast 'problem' with a vastly harder problem
   - likely most practical deployments will just use traditional flooding

4. large LANs
   - not really IPv6's fault, but addressing policy's fault
   - due to the vast scale, a large LAN adds hard-to-solve DoS vectors

I think IPv6 was probably designed at a somewhat unfortunate time, when L2 was already all ASIC but maybe not everyone saw that L3 would be too; perhaps it could have used more interdisciplinary cooperation. But none of those are really show-stoppers, perfect is the enemy of done, and hindsight is 20/20. Maybe instead of attempting to mandate IPsec in it (a requirement since dropped), it should have defined a new mandatory PKI-based L4; that could have been the selling point which pushes adoption.

--
++ytti
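For reference, a sketch of the 40-octet fixed header under discussion (layout per RFC 2460; the struct name is illustrative). Note that, unlike IPv4, there is no header checksum field, and the whole extension header chain hangs off the single next_header byte:

    #include <stdint.h>

    /* IPv6 fixed header: always 40 octets, no options, no checksum. */
    struct ipv6_hdr {
        uint32_t vtc_flow;     /* 4-bit version, 8-bit traffic class,
                                  20-bit flow label                       */
        uint16_t payload_len;  /* everything after this header, in octets */
        uint8_t  next_header;  /* start of the extension header chain     */
        uint8_t  hop_limit;
        uint8_t  src[16];
        uint8_t  dst[16];
    };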
Thank you for your responses, Saku.

On Sat, Jan 18, 2014 at 5:00 PM, Saku Ytti <saku@ytti.fi> wrote:
2. lack of checksum
   - in some instances packet corruption may be impossible to detect in the network
How prevalent is this problem? There might be no point in fixing a problem with a 0.2% probability of occurring, especially as it might be cheaper to detect and fix the errors at the application layer.
3. solicited-node multicast in LAN
   - replaces the broadcast 'problem' with a vastly harder problem
   - likely most practical deployments will just use traditional flooding
Could you please explain how broadcast is better than solicited-node multicast? In any case, we aren't getting around it for now, as it is deeply embedded in NDP. I am interested in your negative experiences with solicited-node multicast.
4. large LANs
   - not really IPv6's fault, but addressing policy's fault
   - due to the vast scale, a large LAN adds hard-to-solve DoS vectors
Just because you can have 2^64 possible hosts on a LAN still doesn't mean we throw the principles of good LAN design out the door. :-) So I'd say it's rather the fault of shoddy network design than of address policy.
--
Mukom Akong T.
On 19/01/2014 04:08, Mukom Akong T. wrote:
Just because you can have 2^64 possible hosts on a LAN still doesn't mean we throw the principles of good LAN design out the door. :-) So I'd say it's rather the fault of shoddy network design than of address policy.
No, it's a problem with the number of addresses available on the LAN; it has nothing to do with shoddy network design.

Each device on the LAN will have a certain amount of capacity for caching neighbour addressing details. If some third party decides to send packets to a massive number of addresses on that LAN, then the router which is forwarding these packets will attempt to perform ND for these addresses. This can trivially be used as a cache exhaustion attack, which can cause regular connectivity on that LAN to be trashed.

Nick
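To put rough numbers on it, a sketch with assumed figures (the cache size and scan rate are illustrative, not measured from any particular router):

    #include <stdio.h>

    int main(void)
    {
        double lan   = 18446744073709551616.0; /* 2^64 addresses in one /64     */
        double cache = 65536.0;                /* assumed ND cache capacity      */
        double rate  = 10000.0;                /* assumed scan rate, packets/sec */

        /* Each packet to a new address triggers ND and burns a cache slot. */
        printf("cache exhausted after %.1f s\n", cache / rate);
        printf("fraction of the /64 needed: %.2e\n", cache / lan);
        return 0;
    }

With these assumptions the cache is gone in under seven seconds, after the attacker has touched a vanishingly small fraction of the /64.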
On Sun, Jan 19, 2014 at 8:15 PM, Nick Hilliard <nick@foobar.org> wrote:
If some third party decides to send packets to a massive number of addresses on that LAN, then the router which is forwarding these packets will attempt to perform ND for these addresses. This can trivially be used as a cache exhaustion attack, which can cause regular connectivity on that LAN to be trashed.
I totally forgot about this scenario. Yes, it is a real problem.

--
Mukom Akong T.
On (2014-01-19 08:08 +0400), Mukom Akong T. wrote:
How prevalent is this problem? There might be no point in fixing a problem with a 0.2% probability of occurring, especially as it might be cheaper to detect and fix the errors at the application layer.
I have no data on prevalence. But just this week we caught an issue where an ingress PE was mangling packets on IP2MPLS encap and calculating a correct FCS on the mangled frame. All egress PE routers logged an IP checksum error; it was very rare, maybe 1 per 30 min on average. If it had been IPv6, no error would have been logged, and customers would have received their share of these, <1 per month per customer, for sure; we would never have found this issue in an IPv6 network.
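For context, the per-hop check IPv6 gave up: a minimal sketch of the RFC 1071 ones'-complement checksum carried in every IPv4 header (assumes an even header length, which IPv4 guarantees):

    #include <stdint.h>
    #include <stddef.h>

    /* RFC 1071 Internet checksum over an IPv4 header (hdr_len in octets).
       Every router verifies and rewrites this; IPv6 has no equivalent,
       so header corruption inside the network surfaces only if an
       end-to-end L4 checksum happens to catch it. */
    static uint16_t ipv4_checksum(const uint8_t *hdr, size_t hdr_len)
    {
        uint32_t sum = 0;
        for (size_t i = 0; i + 1 < hdr_len; i += 2)
            sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
        while (sum >> 16)                      /* fold carries back in */
            sum = (sum & 0xffff) + (sum >> 16);
        return (uint16_t)~sum;
    }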
Could you please explain how broadcast is better than solicited-node multicast? In any case, we aren't getting around it for now, as it is deeply embedded in NDP. I am interested in your negative experiences with solicited-node multicast.
It requires group state in switches (potentially 16M groups); switches typically support a few thousand, and only populate them in SW (but forward in HW once built). Several attack vectors there.
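The 16M figure follows from how the group address is derived; a sketch per RFC 4291 (the function name is illustrative). Only the low 24 bits of the unicast address feed the group, so there are 2^24 (about 16.7M) possible groups per link:

    #include <stdint.h>

    /* Solicited-node multicast group (RFC 4291): ff02::1:ffXX:XXXX,
       where XX:XXXX are the low 24 bits of the unicast address. */
    static void solicited_node(const uint8_t addr[16], uint8_t group[16])
    {
        static const uint8_t prefix[13] =
            { 0xff, 0x02, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0x01, 0xff };
        for (int i = 0; i < 13; i++)
            group[i] = prefix[i];
        group[13] = addr[13];   /* low 24 bits carried over ...  */
        group[14] = addr[14];
        group[15] = addr[15];   /* ... hence 2^24 possible groups */
    }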
Just because you can have 2^64 possible hosts on a LAN still doesn't mean we throw the principles of good LAN design out the door. :-) So I'd say it's rather the fault of shoddy network design than of address policy.
Nick covered this, thanks.

--
++ytti
On Sat, Jan 18, 2014 at 4:22 PM, Nick Hilliard <nick@foobar.org> wrote:
Extension headers are a poor idea because they're troublesome to process on cheap hardware.
Have you found them to be more troublesome to process than IPv4 options are/were?
Because of this, packets with any sort of extension headers are routinely dropped by a large percentage of organisations. Flow labels are generally unused (i.e. set to zero by many host stacks).
--
Mukom Akong T.
On 1/19/2014 7:00 AM, Mukom Akong T. wrote:
On Sat, Jan 18, 2014 at 4:22 PM, Nick Hilliard <nick@foobar.org> wrote:
Extension headers are a poor idea because they're troublesome to process on cheap hardware.
Have you found them to be more troublesome to process than IPv4 options are/were?
At what position in the packet is the TCP port?

a) in v4
b) in v6
c) v6 with a few extension headers

Now program a chip to filter based on this port number...

Frank
On Jan 18, 2014, at 23:19, Frank Habicht <geier@geier.ne.tz> wrote:
On 1/19/2014 7:00 AM, Mukom Akong T. wrote:
On Sat, Jan 18, 2014 at 4:22 PM, Nick Hilliard <nick@foobar.org> wrote:
Extension headers are a poor idea because they're troublesome to process on cheap hardware.
Have you found them to be more troublesome to process than IPv4 options are/were?
At what position in the packet is the TCP port?

a) in v4
Depends on the IPv4 options. Most of the (cheap) hardware that processes IPv4 punts packets with options to the slow path. In general, it depends on the IPv4 packet not containing options.
b) in v6
Assuming (based on (c) below) that this means v6 without extension headers, it will be at n+40 octets into the packet, where n is the position of the desired port number (where desired is one of {source, destination}) within the TCP header.
c) v6 with a few extension headers
In this case, it will be at 40+o+n octets into the packet where o is the number of octets contained in extension headers prior to the TCP header and n is defined as in (b) above.
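For example, with an 8-octet Hop-by-Hop Options header and a 24-octet Routing header in front of TCP, o = 32, so the destination port (n = 2) sits at octet 40 + 32 + 2 = 74; the filter can no longer assume a fixed offset.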
now program a chip to filter based on this port number...
I think you might want to be more specific. After all, an ARM9 is a chip which can easily be programmed to do so (in fact, I can point to iptables/ip6tables as running code which does this on the ARM9).

So... I suppose that whether your complaint has merit depends entirely on whether or not extension headers become more common on IPv6 packets than options have become on IPv4 packets, and also on how hard it is to build fast-path hardware that bypasses extension headers that it does not care about. Since you only need to parse the first two fields of each extension header (Next Header Type and Header Length) to know everything you need to bypass the current header, it shouldn't be too hard to code that into a chip...

Owen
Hi Owen,

On 1/21/2014 12:13 PM, Owen DeLong wrote:
On Jan 18, 2014, at 23:19 , Frank Habicht <geier@geier.ne.tz> wrote:
c) v6 with a few extension headers

In this case, it will be at 40+o+n octets into the packet where o is the number of octets contained in extension headers prior to the TCP header and n is defined as in (b) above.
My point was that it can be hard for an ASIC to know 'o'.
now program a chip to filter based on this port number...

I think you might want to be more specific. After all, an ARM9 is a chip which can easily be programmed to do so (in fact, I can point to iptables/ip6tables as running code which does this on the ARM9).
I was thinking about hardware that forwards packets "not in software". Some of those boxes probably want to filter TCP ports 179 and 22.
So... I suppose that whether your complaint has merit depends entirely on whether or not extension headers become more common on IPv6 packets than options have become on IPv4 packets, and also on how hard it is to build fast-path hardware that bypasses extension headers that it does not care about. Since you only need to parse the first two fields
                                       ^^^^ ?
of each extension header (Next Header Type and Header Length)

... recursively, for all extension headers ...
to know everything you need to bypass the current header, it shouldn't be too hard to code that into a chip...

Who's done that so far? Up to what number of EHs or octet-length?
Thanks, Frank
On Jan 21, 2014, at 02:52, Frank Habicht <geier@geier.ne.tz> wrote:
Hi Owen,
On 1/21/2014 12:13 PM, Owen DeLong wrote:
On Jan 18, 2014, at 23:19 , Frank Habicht <geier@geier.ne.tz> wrote:
c) v6 with a few extension headers

In this case, it will be at 40+o+n octets into the packet where o is the number of octets contained in extension headers prior to the TCP header and n is defined as in (b) above.
My point was that it can be hard for an ASIC to know 'o'.
now program a chip to filter based on this port number...

I think you might want to be more specific. After all, an ARM9 is a chip which can easily be programmed to do so (in fact, I can point to iptables/ip6tables as running code which does this on the ARM9).
I was thinking about hardware that forwards packets "not in software". Some of those boxes probably want to filter TCP ports 179 and 22.
The difference between hardware and software gets blurrier every day. For example, many of the "forwarding ASICs" today are actually FPGAs with "software" loaded into them to do certain forwarding tasks.

Yes, I took it to extremes by proposing a more general-purpose CPU, but the fact of the matter remains that traversing a header chain looking for a known header type that you care about actually isn't complex and can easily be implemented in hardware. The task boils down to:

    #include <stdint.h>
    #include <stddef.h>

    /* Walk the extension header chain; return the offset of header type
       'want', or -1 to punt (header not present or unparseable).
       'p' points at the start of the 40-octet IPv6 fixed header. */
    static long find_header(const uint8_t *p, size_t len, uint8_t want)
    {
        uint8_t nh = p[6];      /* Next Header field of the fixed header */
        size_t off = 40;

        while (off + 2 <= len) {
            if (nh == want)
                return (long)off;           /* INSPECT THIS HEADER here */
            uint8_t hlen = p[off + 1];
            switch (nh) {
            case 0: case 43: case 60: case 135: /* HBH, Routing, DestOpts, Mobility */
                nh = p[off]; off += ((size_t)hlen + 1) * 8; break;
            case 44:                            /* Fragment: fixed 8 octets */
                nh = p[off]; off += 8; break;
            case 51:                            /* AH: length in 4-octet units */
                nh = p[off]; off += ((size_t)hlen + 2) * 4; break;
            default:                            /* ESP or upper layer: punt, */
                return -1;                      /* header not present        */
            }
        }
        return -1;
    }

Not a particularly complex program for a modern ASIC, actually.

Where you run into potential problems (and this can apply to IPv4 as well, though it's less likely there) is if you get an IPv6 packet where the header chain is so unreasonably long that the header you want is not in the first fragment. With a minimum MTU of 1280 octets, it's almost impossible to create a legitimate packet with a header chain so long that this would be an issue. That is one of the reasons that there are proposals to consider such packets invalid, and why I support those proposals. However, those have nothing to do with ASIC difficulty. The fragment problem is hard to solve no matter how much CPU you throw at it, because unless you are the end receiving host, you cannot reliably expect to be able to perform reassembly.
So... I suppose that whether your complaint has merit depends entirely on whether or not extension headers become more common on IPv6 packets than options have become on IPv4 packets, and also on how hard it is to build fast-path hardware that bypasses extension headers that it does not care about. Since you only need to parse the first two fields
                                       ^^^^ ?
of each extension header (Next Header Type and Header Length)

... recursively, for all extension headers ...
Which in anything but the most contrived of cases is likely n=0 for 95+% of all (unencrypted) packets and likely n<4 for all others.

Further, it's not recursive, it's iterative. There is a difference. (See above; note that the loop is _NOT_ recursive, it is iterative.) Recursive would be something like:

    /* Recursive variant: recurse while the next header is itself an
       extension header, then parse on the way back up.  Length rules
       simplified; see the iterative version above for per-type rules. */
    static int parse_next_header(const uint8_t *hdr)
    {
        int result = 0;
        uint8_t nh = hdr[0];                /* Next Header field */

        if (nh == 0 || nh == 43 || nh == 44 || nh == 51 ||
            nh == 60 || nh == 135)
            result = parse_next_header(hdr + (1 + hdr[1]) * 8); /* Hdr Ext Len */
        /* code to parse this header (modifies result accordingly) */
        return result;
    }

This might be possible to do in an ASIC, but it might be more difficult than the iterative solution proposed above.
to know everything you need to bypass the current header, it shouldn't be too hard to code that into a chip...

Who's done that so far?
I don't know.
Up to what number of EHs or octet-length?
Why would the number of EHs or the octet-length (up to the maximum MTU supported by the box) matter in this question? I understand that if you do it recursively, you get into potential stack resource issues, but with an iterative approach looking for the header of interest, I don't see the issue.

Owen
On 19/01/2014 04:00, Mukom Akong T. wrote:
Have you found them to be more troublesome to process than IPv4 options are/were?
The problem is that you can have long EH chains, one after another. Generally speaking, most hardware forwarding engines will perform a lookup based on the first N bytes of a packet. If arbitrary-length EHs are not supported by the hardware, then you have three options: forward the packets unilaterally, drop the packets unilaterally, or punt to a CPU/NPU. Punting and forwarding both open up denial-of-service attacks for hardware-forwarded routers, so generally the only sensible option is to drop packets with long EH chains.

Nick
On (2014-01-19 16:11 +0000), Nick Hilliard wrote:
attacks for hardware-forwarded routers, so generally the only sensible option is to drop packets with long EH chains.
I think the sensible approach is to handle them in HW when possible and punt, rate-limited, when you must. Dropping standards-compliant data seems dubious at best.

Now, should it be standards-compliant? http://tools.ietf.org/html/draft-ietf-6man-oversized-header-chain-09 is looking to restrict EHs further. I contacted the authors hoping for even more limitation than it currently suggests; they thought 6man would never accept limits as strict as I suggested. My suggestion is that IP + EH (not L4) SHOULD NOT span more than 128B, and an implementation MAY drop frames with larger headers.

--
++ytti
On 1/19/14, 9:05 AM, Saku Ytti wrote:
On (2014-01-19 16:11 +0000), Nick Hilliard wrote:
attacks for hardware-forwarded routers, so generally the only sensible option is to drop packets with long EH chains.
I think the sensible approach is to handle them in HW when possible and punt, rate-limited, when you must. Dropping standards-compliant data seems dubious at best.
There are routers and switches that by design have no recourse to a software forwarding path. It doesn't make a lot of sense to have a device with a nominal capacity of several Tb/s attempt to punt packets up to a control-plane processor that's gig-e connected.
Now, should it be standards-compliant?
http://tools.ietf.org/html/draft-ietf-6man-oversized-header-chain-09 is looking to restrict EHs further. I contacted the authors hoping for even more limitation than it currently suggests; they thought 6man would never accept limits as strict as I suggested. My suggestion is that IP + EH (not L4) SHOULD NOT span more than 128B, and an implementation MAY drop frames with larger headers.
On (2014-01-19 09:52 -0800), joel jaeggli wrote:
It doesn't make a lot of sense to have a device with a nominal capacity of several Tb/s attempt to punt packets up to a control-plane processor that's gig-e connected.
You already punt IPv4 options; the vast majority of deployments won't see significant pps from IPv6 EHs. As long as you police them to an acceptable rate before the punt, and you don't share the policer with other punt masks, it is the right thing to do; blindly dropping standards-compliant packets is not.

--
++ytti
On Sunday, January 19, 2014 07:52:38 PM joel jaeggli wrote:
It doesn't make a lot of sense to have a device with a nominal capacity of several Tb/s attempt to punt packets up to a control-plane processor that's gig-e connected.
Not that the control plane would forward said traffic at Gig-E speeds anyway. Mark.
On Saturday, January 18, 2014 06:09:58 AM Mukom Akong T. wrote:
Does anyone have any experiences or insights to share on how more (or less) efficient routing is with IPv6? Any specific thoughts with respect to how the following characteristics help or not with routing efficiency?

- fixed header size
- extension header chain
- flow labels in header
- no intermediate fragmentation
- no checksums
One thing that affects routing efficiency in practice is traffic engineering. At this time, networks that employ MPLS-TE for IPv4 and run native IPv6 have challenges doing the same for IPv6, mostly because it's not possible to point IPv6 traffic into MPLS-TE tunnels built over an IPv4 control plane. If you are doing 6PE, this could be possible, but most vendors can't do it for native IPv6.

Native IPv6 control planes for MPLS (and, by extension, MPLS-TE) will mean that IPv6 traffic can travel the same path as IPv4 traffic in MPLS-TE'd networks. When that will arrive remains to be seen. Until then, the most we can do for native IPv6 traffic is fiddle around with IGP metrics to obtain some kind of reasonable TE.

Mark.
This is exactly what pushed us into 6PE... it was the only way to make performance similar to v4 from a routing standpoint. John @ AS11404
We ended up with 6PE to make the v6 support on our Cisco-based network behave the same way as v4, i.e. use TE tunnels, etc. Given the v4 MPLS, this was the only real way to make it the same.

-----Original Message-----
From: joel jaeggli [mailto:joelja@bogus.com]
Sent: Saturday, January 18, 2014 10:56 AM
To: John van Oppen; 'mark.tinka@seacom.mu'; nanog@nanog.org
Subject: Re: Experiences with IPv6 and Routing Efficiency

On 1/18/14, 10:30 AM, John van Oppen wrote:
This is exactly what pushed us into 6PE... it was the only way to make performance similar to v4 from a routing standpoint.
This statement is a bit facile... What platform are you referring to?
John @ AS11404
On (2014-01-19 09:10 +0000), John van Oppen wrote:
We ended up with 6PE to make the v6 support on our Cisco-based network behave the same way as v4, i.e. use TE tunnels, etc. Given the v4 MPLS, this was the only real way to make it the same.
Fully agreed. I have no problem being in 6PE until a fork-lift upgrade at some future point to an IPv6 core and 4PE. The signalling AFI in the core and the AFI sold to the customer have little codependency.

People have too sentimental a view on this: if you label your IPv4, it is silly not to run 6PE; you're just creating complexity and removing functionality.

--
++ytti
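For the record, on an existing labeled-IPv4 core 6PE is a small configuration delta. A minimal IOS-style sketch (the AS number and addresses are hypothetical; assumes LDP label switching is already running between PE loopbacks):

    router bgp 64500
     neighbor 192.0.2.2 remote-as 64500
     neighbor 192.0.2.2 update-source Loopback0
     !
     address-family ipv6
      neighbor 192.0.2.2 activate
      neighbor 192.0.2.2 send-label
     exit-address-family

The send-label knob is what makes the IPv4-peered BGP session advertise labeled IPv6 prefixes, so v6 rides the existing v4 LSPs (including TE tunnels).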
On Sunday, January 19, 2014 12:10:47 PM Saku Ytti wrote:
Fully agreed. I have no problem being in 6PE until a fork-lift upgrade at some future point to an IPv6 core and 4PE.
Assuming your addressing will continue to grow on IPv6 and remain reasonably static on IPv4, your forklift should allow you to remain native on both (on the basis that, at that time, we do have native control planes for both MPLSv4 and MPLSv6, of course). So "4|6PE" would not be necessary.

Personally, I think it's unnecessary labor to remove IPv4 in the future, especially when it's not expanding. One is welcome to do this, of course, if they are really bored :-). Removing native IPv4 in the future only to replace it with 4PE seems quite complex to me.
People have too sentimental a view on this: if you label your IPv4, it is silly not to run 6PE; you're just creating complexity and removing functionality.
Turning on native IPv6 in your core is not adding complexity, I think. Yes, I agree that you may lose parity between MPLS-TEv4 and TEv6 as of today, but some would say that MPLS-TE itself adds quite a bit of complexity, especially if used on a long-term basis.

Mark.
participants (8): Frank Habicht, joel jaeggli, John van Oppen, Mark Tinka, Mukom Akong T., Nick Hilliard, Owen DeLong, Saku Ytti