subnet prefix length > 64 breaks IPv6?

Glen Kent

24 Dec 2011

5:32 a.m.

Hi,

I am trying to understand why standards say that "using a subnet prefix length other than a /64 will break many features of IPv6, including Neighbor Discovery (ND), Secure Neighbor Discovery (SEND) [RFC3971], .. " [reference RFC 5375]

Or "A number of other features currently in development, or being proposed, also rely on /64 subnet prefixes."

Is it because the 128 bits are divided into two 64-bit halves, where the latter identifies an Interface ID which is uniquely derived from the 48-bit MAC address?

I am not sure if this is the reason, as this only applies to the link-local IP address. One could still assign a global IPv6 address. So, why does basic IPv6 (ND process, etc.) break if I use a netmask of, say, /120?

I know that several operators use /120, as a /64 can be quite risky in terms of ND attacks. So, how does that work? I tried googling but couldn't find any references that explain how IPv6 breaks when using a netmask other than 64.

Glen


sthaug＠nethelp.no

24 Dec

7:08 a.m.

...

I am not sure if this is the reason as this only applies to the link local IP address. One could still assign a global IPv6 address. So, why does basic IPv6 (ND process, etc) break if i use a netmask of say /120?

As long as you assign addresses statically, IPv6 works just fine with a netmask > 64. We've been using this for several years now. No problem.

Steinar Haug, Nethelp consulting, sthaug@nethelp.no
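As a rough illustration of the static case, here is a minimal sketch using Python's standard `ipaddress` module. The address is a made-up value from the 2001:db8::/32 documentation range, not anything taken from this thread; the point is only that a /120 is an ordinary, well-defined subnet when addresses are assigned by hand.

```python
import ipaddress

# Hypothetical statically assigned address (documentation prefix), with a /120 mask.
iface = ipaddress.IPv6Interface("2001:db8:0:1::10/120")

# The subnet that the /120 mask implies, and how many addresses it holds.
network = iface.network
size = network.num_addresses

# On-link test: the last 8 bits vary inside the subnet, anything else is off-link.
on_link = ipaddress.IPv6Address("2001:db8:0:1::fe") in network
off_link = ipaddress.IPv6Address("2001:db8:0:1::1:0") in network

print(network, size, on_link, off_link)
```

Nothing here depends on the prefix being 64 bits long; only autoconfiguration cares about that.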

Glen Kent

10:07 a.m.

Ok. So does SLAAC break with masks > 64?

Glen

On Sat, Dec 24, 2011 at 12:38 PM, <sthaug@nethelp.no> wrote:

...

...
I am not sure if this is the reason as this only applies to the link local IP address. One could still assign a global IPv6 address. So, why does basic IPv6 (ND process, etc) break if i use a netmask of say /120?

As long as you assign addresses statically, IPv6 works just fine with a netmask > 64. We've been using this for several years now. No problem.

Steinar Haug, Nethelp consulting, sthaug@nethelp.no

Karl Auer

10:58 a.m.

On Sat, 2011-12-24 at 15:37 +0530, Glen Kent wrote:

...

Ok. So does SLAAC break with masks > 64?

"Break" is not the right word. SLAAC only works with /64, But that is by design. Regards, K. -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Karl Auer (kauer@biplane.com.au) +61-2-64957160 (h) http://www.biplane.com.au/kauer/ +61-428-957160 (mob) GPG fingerprint: AE1D 4868 6420 AD9A A698 5251 1699 7B78 4EEE 6017 Old fingerprint: DA41 51B1 1481 16E1 F7E2 B2E9 3007 14ED 5736 F687

Alexandru Petrescu

11:16 a.m.

Le 24/12/2011 11:58, Karl Auer a écrit :

...

On Sat, 2011-12-24 at 15:37 +0530, Glen Kent wrote:

...
Ok. So does SLAAC break with masks> 64?

"Break" is not the right word. SLAAC only works with /64, But that is by design.

SLAAC only works with /64 - yes - but only if it runs on Ethernet-like Interface IDs of 64-bit length (RFC 2464).

SLAAC could work OK with /65 on non-Ethernet media, like a point-to-point link whose Interface ID length is negotiated during the setup phase.

Other non-64 Interface IDs could be constructed for 802.15.4 links; for example, a 16-bit MAC address could be converted into a 32-bit Interface ID. SLAAC would thus use a /96 prefix in the RA and a 32-bit IID.

IP-over-USB lacks an Interface ID altogether, so one is free to define its length.

Alex

...

Regards, K.

Glen Kent

2:48 p.m.

...

SLAAC only works with /64 - yes - but only if it runs on Ethernet-like Interface ID's of 64bit length (RFC2464).

Ok, the last 64 bits of the 128 bit address identifies an Interface ID which is uniquely derived from the 48bit MAC address (which exists only in ethernet).

...

SLAAC could work ok with /65 on non-Ethernet media, like a point-to-point link whose Interface ID's length be negotiated during the setup phase.

If we can do this for a p2p link, then why can't the same be done for an Ethernet link?

Glen

...

Other non-64 Interface IDs could be constructed for 802.15.4 links, for example a 16bit MAC address could be converted into a 32bit Interface ID. SLAAC would thus use a /96 prefix in the RA and a 32bit IID.

IP-over-USB misses an Interface ID altogether, so one is free to define its length.

Alex

...
Regards, K.

Jonathan Lassoff

3:39 p.m.

On Sat, Dec 24, 2011 at 6:48 AM, Glen Kent <glen.kent@gmail.com> wrote:

...

...
SLAAC only works with /64 - yes - but only if it runs on Ethernet-like Interface ID's of 64bit length (RFC2464).

Ok, the last 64 bits of the 128 bit address identifies an Interface ID which is uniquely derived from the 48bit MAC address (which exists only in ethernet).

...
SLAAC could work ok with /65 on non-Ethernet media, like a point-to-point link whose Interface ID's length be negotiated during the setup phase.

If we can do this for a p2p link, then why cant the same be done for an ethernet link?

I think by "point-to-point", Alexandru was referring to PPP-signalled links.

In the case of Ethernet and SLAAC, the standards define a way to turn a globally unique 48-bit 802.3 MAC-48 address into an EUI-64 identifier by flipping and adding some bits. This uniquely maps conventional MAC-48 addresses into EUI-64 addresses. I imagine this was chosen because the IEEE is encouraging new standards and numbering schemes to use the 64-bit schemes over the older 48-bit ones, presumably to avoid exhaustion in the future (like we're seeing with IPv4).

The result is that with the standards we've got today, we can easily map a piece of hardware's globally unique MAC address into a globally unique 64-bit identifier -- which happens to fit cleanly into the second half of the v6 address space.

I suppose one could make an argument to use /80 networks and just use the MAC-48 identifier for the address portion, but given the vastness of v6 space I don't think it's really worth the extra savings of bit space.

So, to address your original question: in v6 networks with netmask lengths greater than 64 bits nothing "breaks" per se, but some of the conventional standards and ideas about what a "network" is in that context are broken. While it's not possible to have hosts uniquely pick addresses for themselves, one can use other addressing mechanisms like DHCPv6 or static addresses.

--j
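The "flipping and adding some bits" step can be sketched in a few lines of Python. This follows the modified EUI-64 construction as commonly described for Ethernet (invert the universal/local bit of the first octet, insert FF:FE between the two halves of the MAC). The function name and the sample MAC address are illustrative assumptions, not anything from the thread.

```python
def mac_to_modified_eui64(mac: str) -> bytes:
    """Return the 8-byte modified EUI-64 interface ID for a 48-bit MAC."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address such as 00:1b:21:3a:4c:5d")
    # Invert the universal/local bit (0x02) of the first octet,
    # then insert FF:FE between the OUI and the device-specific half.
    return bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:6]


# Hypothetical MAC address used only for illustration.
iid = mac_to_modified_eui64("00:1b:21:3a:4c:5d")
print(":".join(f"{b:02x}" for b in iid))
```

The result is always 64 bits long, which is why it only fits behind a 64-bit prefix.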

Owen DeLong

3 Jan

10:36 p.m.

On Dec 24, 2011, at 6:48 AM, Glen Kent wrote:

...

...
SLAAC only works with /64 - yes - but only if it runs on Ethernet-like Interface ID's of 64bit length (RFC2464).

Ok, the last 64 bits of the 128 bit address identifies an Interface ID which is uniquely derived from the 48bit MAC address (which exists only in ethernet).

Not exactly. Most media have some form of link-layer addressing. For Firewire, it's native EUI-64. For Ethernet, it's EUI-48 MAC addresses. For token ring, I believe there are also EUI-48 addresses. For FDDI (Remember FDDI?) I believe it was EUI-48 addresses. ATM and Frame Relay also have EUI addresses built in to their interfaces (though I don't remember the exact format and am too lazy to look it up at the moment).

...

...
SLAAC could work ok with /65 on non-Ethernet media, like a point-to-point link whose Interface ID's length be negotiated during the setup phase.

If we can do this for a p2p link, then why cant the same be done for an ethernet link?

I'm not so sure the statement above is actually true.

Owen

...

Glen

...
Other non-64 Interface IDs could be constructed for 802.15.4 links, for example a 16bit MAC address could be converted into a 32bit Interface ID. SLAAC would thus use a /96 prefix in the RA and a 32bit IID.

IP-over-USB misses an Interface ID altogether, so one is free to define its length.

Alex

...
Regards, K.

Alexandru Petrescu

4 Jan

6:50 p.m.

Le 03/01/2012 23:36, Owen DeLong a écrit :

...

On Dec 24, 2011, at 6:48 AM, Glen Kent wrote:

...
...
SLAAC only works with /64 - yes - but only if it runs on Ethernet-like Interface ID's of 64bit length (RFC2464).

Ok, the last 64 bits of the 128 bit address identifies an Interface ID which is uniquely derived from the 48bit MAC address (which exists only in ethernet).

Not exactly. Most media have some form of link-layer addressing. For Firewire, it's native EUI-64. For Ethernet, it's EUI-48 MAC addresses. For token ring, I believe there are also EUI-48 addresses. For FDDI (Remember FDDI?) I believe it was EUI-48 addresses. ATM and Frame Relay also have EUI addresses built in to their interfaces (though I don't remember the exact format and am too lazy to look it up at the moment).

...
...
SLAAC could work ok with /65 on non-Ethernet media, like a point-to-point link whose Interface ID's length be negotiated during the setup phase.

If we can do this for a p2p link, then why cant the same be done for an ethernet link?

I'm not so sure the statement above is actually true.

I think that's right, sorry. I mean - a reread of the PPPv6 RFC shows that the Interface ID negotiated by PPP is strictly 64 bits long (although it does refer to RFC 4941, which specifically acknowledges: "note that an IPv6 identifier does not necessarily have to be 64 bits in length").

It's a mess :-)

Alex

...

Owen

...
Glen

...
Other non-64 Interface IDs could be constructed for 802.15.4 links, for example a 16bit MAC address could be converted into a 32bit Interface ID. SLAAC would thus use a /96 prefix in the RA and a 32bit IID.

IP-over-USB misses an Interface ID altogether, so one is free to define its length.

Alex

...
Regards, K.

Sven Olaf Kamphuis

24 Dec

3:30 p.m.

it only breaks the auto configure crap which you don't want to use anyway.

(unless you want to have any computer on your network be able to tell any other computer "oh hai i'm a router, please route all your packets through me so i can intercept them" and/or flood its route table ;)

we use all kinds of things from /126'es to /112 (but hardly any /64 crap)

works perfectly fine.

as long as its nibble aligned (for other reasons ;)

-- Greetings,

Sven Olaf Kamphuis, CB3ROB Ltd. & Co. KG
=========================================================================
Address: Koloniestrasse 34, D-13359 BERLIN, Germany
VAT Tax ID: DE267268209
Registration: HRA 42834 B
Phone: +31/(0)87-8747479
GSM: +49/(0)152-26410799
RIPE: CBSK1-RIPE
e-Mail: sven@cb3rob.net
=========================================================================
<penpen> C3P0, der elektrische Westerwelle
http://www.facebook.com/cb3rob
=========================================================================

Confidential: Please be advised that the information contained in this email message, including all attached documents or files, is privileged and confidential and is intended only for the use of the individual or individuals addressed. Any other use, dissemination, distribution or copying of this communication is strictly prohibited.

On Sat, 24 Dec 2011, Glen Kent wrote:

...

Hi,

I am trying to understand why standards say that "using a subnet prefix length other than a /64 will break many features of IPv6, including Neighbor Discovery (ND), Secure Neighbor Discovery (SEND) [RFC3971], .. " [reference RFC 5375]

Or "A number of other features currently in development, or being proposed, also rely on /64 subnet prefixes."

Is it because the 128 bits are divided into two 64 bit halves, where the latter identifies an Interface ID which is uniquely derived from the 48bit MAC address.

I am not sure if this is the reason as this only applies to the link local IP address. One could still assign a global IPv6 address. So, why does basic IPv6 (ND process, etc) break if i use a netmask of say /120?

I know that several operators use /120 as a /64 can be quite risky in terms of ND attacks. So, how does that work? I tried googling but couldnt find any references that explain how IPv6 breaks with using a netmask other than 64.

Glen

Sven Olaf Kamphuis

3:44 p.m.

things that -do- break on ipv6 a lot (not necessarily related to the /64 thing) are premature protocols like ospf6 and ripng that for some magic reason refuse to work on point-to-point interfaces (as opposed to putting the interface in broadcast mode, like ethernet) without (additional) link-local addresses, despite the option to clearly specify the interface and/or address of the peer and/or address ranges they should work on (these do not necessarily have to be /64, but they do need to be scope link local and start with a multicast prefix).

also various bgp implementations will send the autoconfigure crap ip as the next-hop instead of the session ip, resulting in all kinds of crap in your route table (if not fixed with nasty hacks on your end ;) which doesn't exactly make it easy to figure out which one belongs to which peer.

all the more reason not to use that autoconfigure crap ;)

on the whole, ipv6 simply still needs a -lot- of work.

for those that do want autoconfigure (workstations?), a proper dhcp implementation would be preferred over keeping that RA stuff around in future implementations of the v6 stack; as far as we're concerned, it can go the way of the dinosaur (already ;)

On Sat, 24 Dec 2011, Sven Olaf Kamphuis wrote:

...

it only breaks the auto configure crap which you don't want to use anyway.

(unless you want to have any computer on your network be able to tell any other computer "oh hai i'm a router, please route all your packets through me so i can intercept them" and/or flood its route table ;)

we use all kinds of things from /126'es to /112 (but hardly any /64 crap)

works perfectly fine.

as long as its nibble aligned (for other reasons ;)


On Sat, 24 Dec 2011, Glen Kent wrote:

...
Hi,

I am trying to understand why standards say that "using a subnet prefix length other than a /64 will break many features of IPv6, including Neighbor Discovery (ND), Secure Neighbor Discovery (SEND) [RFC3971], .. " [reference RFC 5375]

Or "A number of other features currently in development, or being proposed, also rely on /64 subnet prefixes."

Is it because the 128 bits are divided into two 64 bit halves, where the latter identifies an Interface ID which is uniquely derived from the 48bit MAC address.

I am not sure if this is the reason as this only applies to the link local IP address. One could still assign a global IPv6 address. So, why does basic IPv6 (ND process, etc) break if i use a netmask of say /120?

I know that several operators use /120 as a /64 can be quite risky in terms of ND attacks. So, how does that work? I tried googling but couldnt find any references that explain how IPv6 breaks with using a netmask other than 64.

Glen

Glen Kent

27 Dec

12:55 a.m.

Sven,

...

also various bgp implementations will send the autoconfigure crap ip as the next-hop instead of the session ip, resulting in all kinds of crap in your route table (if not fixed with nasty hacks on your end ;) which doesn't exactly make it easy to figure out which one belongs to which peer all the more reason not to use that autoconfigure crap ;)

As per RFC 2545, BGP announces a global address as the next-hop. It's only in one particular case that it advertises both global and link-local addresses.

So, I guess, BGP is not broken.

It's only RIPng, afaik, that mandates using a link-local address.

Glen

Glen Kent

11:28 p.m.

It seems ISIS and OSPFv3 use the link-local next-hop in their route advertisements.

We discussed that SLAAC doesn't work with prefixes > 64 on the Ethernet medium (which I believe is quite, if not most, prevalent). If that's the case, then how do operators who assign netmasks > 64 use ISIS and OSPF, since these protocols will use the link-local address?

I had assumed that nodes derive their link-local address from the Router Advertisements. They derive their least significant 64 bits from their MACs and the most significant 64 from the prefix announced in the RAs.

Glen

On Tue, Dec 27, 2011 at 6:25 AM, Glen Kent <glen.kent@gmail.com> wrote:

...

Sven,

...
also various bgp implementations will send the autoconfigure crap ip as the next-hop instead of the session ip, resulting in all kinds of crap in your route table (if not fixed with nasty hacks on your end ;) which doesn't exactly make it easy to figure out which one belongs to which peer all the more reason not to use that autoconfigure crap ;)

As per RFC 2545 BGP announces a global address as the next-hop. Its only in one particular case that it advertises both global and link local addresses.

So, i guess, BGP is not broken.

Its only RIPng afaik that mandates using a link local address.

Glen

Valdis.Kletnieks＠vt.edu

11:35 p.m.

On Wed, 28 Dec 2011 04:58:19 +0530, Glen Kent said:

...

I had assumed that nodes derive their link-local address from the Router Advertisements. They derive their least significant 64 bits from their MACs and the most significant 64 from the prefix announced in the RAs.

No, on Ethernet-ish networks the link-local is derived from an 'fe80::' and the MAC.

Joel Maslak

11:38 p.m.

On Dec 27, 2011, at 4:28 PM, Glen Kent <glen.kent@gmail.com> wrote:

...

I had assumed that nodes derive their link-local address from the Router Advertisements. They derive their least significant 64 bits from their MACs and the most significant 64 from the prefix announced in the RAs.

No, link local addresses are not derived from RAs. Even a system not connected to a router will have a link local address on each ethernet (I couldn't tell you how link local works on PPP, ATM, etc, without looking it up - but it doesn't require /64 networks).

Chuck Anderson

28 Dec

1:05 a.m.

On Wed, Dec 28, 2011 at 04:58:19AM +0530, Glen Kent wrote:

...

It seems ISIS and OSPFv3 use the link local next-hop in their route advertisements.

We discussed that SLAAC doesnt work with prefixes > 64 on the ethernet medium (which i believe is quite, if not most, prevalent). If thats the case then how are operators who assign netmasks > 64 use ISIS and OSPF, since these protocols will use the link local address?

I had assumed that nodes derive their link-local address from the Router Advertisements. They derive their least significant 64 bits from their MACs and the most significant 64 from the prefix announced in the RAs.

Each prefix on an interface can have a different prefix length. Link-locals always have a prefix length of 64, even if a global address assigned to the same interface has a different length. Also, the link-local address is derived locally without any information from RAs.
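A small sketch of that point, using Python's standard `ipaddress` module: two addresses that could sit on the same interface, each carrying its own prefix length. Both addresses are hypothetical examples (the global one is from the documentation range).

```python
import ipaddress

# Hypothetical addresses as they might be configured on a single interface.
link_local = ipaddress.IPv6Interface("fe80::21b:21ff:fe3a:4c5d/64")
global_addr = ipaddress.IPv6Interface("2001:db8:0:1::10/120")

# Each address has its own prefix length; the /120 on the global address
# has no bearing on the /64 used for the link-local one.
for addr in (link_local, global_addr):
    print(addr.ip, addr.network.prefixlen, addr.ip.is_link_local)
```

Routing protocols that use link-local next hops therefore keep working regardless of the length chosen for the global prefix.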

Owen DeLong

3 Jan

11:45 p.m.

On Dec 27, 2011, at 3:28 PM, Glen Kent wrote:

...

It seems ISIS and OSPFv3 use the link local next-hop in their route advertisements.

We discussed that SLAAC doesnt work with prefixes > 64 on the ethernet medium (which i believe is quite, if not most, prevalent). If thats the case then how are operators who assign netmasks > 64 use ISIS and OSPF, since these protocols will use the link local address?

The global unicast prefix length is independent of the link local prefix length. Technically, link local is fe80::/10, though many implementations erroneously treat it as fe80::/64. In most cases, since the 54 bits between fe80 and the IID are almost always 0, this error has no impact.
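The difference between the two readings can be checked with Python's standard `ipaddress` module. The second test address below is a contrived example with a non-zero bit in the 54-bit field, chosen only to show where a /64 interpretation and a /10 interpretation disagree.

```python
import ipaddress

LL_10 = ipaddress.IPv6Network("fe80::/10")   # the reserved link-local prefix
LL_64 = ipaddress.IPv6Network("fe80::/64")   # what many implementations assume

usual = ipaddress.IPv6Address("fe80::1")            # 54 middle bits all zero
unusual = ipaddress.IPv6Address("fe80:0:0:1::1")    # contrived: a middle bit is set

# 10 prefix bits + 54 zero bits + 64 IID bits = 128
middle_bits = 128 - 10 - 64

usual_result = (usual in LL_10, usual in LL_64)
unusual_result = (unusual in LL_10, unusual in LL_64)
print(middle_bits, usual_result, unusual_result)
```

For the usual all-zero case both interpretations agree, which matches the remark that the error rarely has any impact.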

...

I had assumed that nodes derive their link-local address from the Router Advertisements. They derive their least significant 64 bits from their MACs and the most significant 64 from the prefix announced in the RAs.

No, nodes derive their link local address from the reserved prefix fe80::/10 and their EUI-64 IID based on their MAC address. They then use that link local address to send out an RS message in order to get global unicast prefixes from the RAs received in response.

Owen

...

Glen

On Tue, Dec 27, 2011 at 6:25 AM, Glen Kent <glen.kent@gmail.com> wrote:

...
Sven,

...
also various bgp implementations will send the autoconfigure crap ip as the next-hop instead of the session ip, resulting in all kinds of crap in your route table (if not fixed with nasty hacks on your end ;) which doesn't exactly make it easy to figure out which one belongs to which peer all the more reason not to use that autoconfigure crap ;)

As per RFC 2545 BGP announces a global address as the next-hop. Its only in one particular case that it advertises both global and link local addresses.

So, i guess, BGP is not broken.

Its only RIPng afaik that mandates using a link local address.

Glen

Karl Auer

4 Jan

2:41 a.m.

On Tue, 2012-01-03 at 15:45 -0800, Owen DeLong wrote:

...

Technically, link local is fe80::/10, though many implementations erroneously treat it as fe80::/64. In most cases, since the 54 bits between fe80 and the IID are almost always 0, this error has no impact.

Yes, well, I'm a bit confused about that. Maybe I haven't read the trail of overlapping, obsoleting and conflicting RFCs carefully enough.

RFC 4862 (section 5.3) says that the interface ID can run all the way up to the end of the link-local prefix. Since this is defined as a /10, an interface ID can be up to 118 bits long. In RFC 4862 the prefix length is not actually given; instead it says "the well-known link-local prefix FE80::0 [RFC4291] (of appropriate length)". RFC 4862 also says that the whole thing must be consistent with RFC 4291.

RFC 4291 (section 2.5.6) defines the first ten bits as 1111111010, then the next 54 bits as zero - BUT does not specify a prefix length. Those implementations that use /64 can thus be forgiven, I think.

So - are those 54 bits reserved and zero, or can an interface ID be anything up to 118 bits long? I'd be interested in a definitive answer, if there is one.

Regards, K.

Ray Soucy

24 Dec

5:10 p.m.

Your understanding of IPv6 is poor if you think by not using a 64-bit prefix you will be protected against rogue RA.

The prefix length you define on your router will have no impact on a rogue RA sent out. IPv6 hosts can have addresses from multiple prefixes on the same link. Choosing to make use of a 120-bit prefix (for example) will do nothing to protect against a rogue RA announcing its own 64-bit prefix with the A flag set.

You can use a 64-bit prefix and not use SLAAC as well. SLAAC is used only when the A flag is set. It just so happens that the majority of router implementations have it set by default.

You still need to filter RA from unauthorized hosts. Currently, many switches can accomplish this using a PACL on access ports. In the near future, we will begin to see the RA Guard feature become standard on enterprise switches.

Mind you, you should be filtering out rogue RA regardless of whether or not you have deployed IPv6. Windows ICS sending RA is a widespread problem (honestly wish Microsoft would remove ICS from the default install).

There are some things that will break by not using a 64-bit prefix. SLAAC can't function without it. Privacy Extensions for SLAAC can't either (obviously). If you make use of a longer prefix, then you need to use either manual configuration or DHCPv6 for address assignment.

All standards-compliant implementations of IPv6 will work with prefixes longer than 64-bit. In production, we make use of 126-bit prefixes for link networks, and common use of 120 (and similar) prefixes for host networks, and they work perfectly.

That said, the only reason we don't make use of 64-bit prefixes for host networks is in an effort (which may be futile) to mitigate neighbor table exhaustion attacks. We still reserve a full 64-bit prefix, allowing us to expand the prefix in the future without disrupting service. The long term plan is to migrate to 64-bit prefixes when routing equipment is better able to handle neighbor table exhaustion attacks.
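To put the neighbor table exhaustion concern in numbers, here is a back-of-the-envelope sketch. It assumes the attack works by sending traffic to many distinct addresses inside the on-link prefix, so the number of entries a router could be asked to resolve is bounded by the subnet size. The helper name is hypothetical.

```python
def candidate_addresses(prefix_len: int) -> int:
    """Number of distinct addresses inside an IPv6 subnet of the given length."""
    if not 0 <= prefix_len <= 128:
        raise ValueError("prefix length must be between 0 and 128")
    return 2 ** (128 - prefix_len)


# Prefix lengths mentioned in this message: host /64, host /120, link /126.
sizes = {plen: candidate_addresses(plen) for plen in (64, 120, 126)}
for plen, count in sizes.items():
    print(f"/{plen}: {count} addresses")
```

A /120 caps the target space at 256 addresses, while a /64 leaves roughly 1.8e19, far more than any neighbor table can hold.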
As for the comments on the use of multicast: multicast is a good thing. On most devices it is no different than broadcast, but it adds the information needed for more advanced hardware (e.g. managed switches with MLD snooping) to only replicate the traffic to interested parties. The elimination of broadcast traffic in IPv6 is a good thing, and doesn't introduce any problem.

The (related) other comment made was about using ARP with IPv6 instead of ND. This also shows a poor understanding of how IPv6 works. ARP is for IPv4, ND is for IPv6. There is no ARP for IPv6. ND has the advantage that it actually happens over IPv6, rather than a lower level or parallel protocol. This makes filtering such traffic and designing hardware that is aware of it significantly easier. It will be nice to reach a point where non-IPv6 traffic can be filtered and dropped completely.

Other than making use of the link-local scope and using a multicast group instead of broadcast, ND is pretty much the same thing as ARP.

On Sat, Dec 24, 2011 at 10:30 AM, Sven Olaf Kamphuis <sven@cb3rob.net> wrote:

...

it only breaks the auto configure crap which you don't want to use anyway.

(unless you want to have any computer on your network be able to tell any other computer "oh hai i'm a router, please route all your packets through me so i can intercept them" and/or flood its route table ;)

we use all kinds of things from /126'es to /112 (but hardly any /64 crap)

works perfectly fine.

as long as its nibble aligned (for other reasons ;)


On Sat, 24 Dec 2011, Glen Kent wrote:

...
Hi,

I am trying to understand why standards say that "using a subnet prefix length other than a /64 will break many features of IPv6, including Neighbor Discovery (ND), Secure Neighbor Discovery (SEND) [RFC3971], .. " [reference RFC 5375]

Or "A number of other features currently in development, or being proposed, also rely on /64 subnet prefixes."

Is it because the 128 bits are divided into two 64 bit halves, where the latter identifies an Interface ID which is uniquely derived from the 48bit MAC address.

I am not sure if this is the reason as this only applies to the link local IP address. One could still assign a global IPv6 address. So, why does basic IPv6 (ND process, etc) break if i use a netmask of say /120?

I know that several operators use /120 as a /64 can be quite risky in terms of ND attacks. So, how does that work? I tried googling but couldnt find any references that explain how IPv6 breaks with using a netmask other than 64.

Glen

--
Ray Soucy
Epic Communications Specialist
Phone: +1 (207) 561-3526
Networkmaine, a Unit of the University of Maine System
http://www.networkmaine.net/

Glen Kent

25 Dec

10:18 a.m.

Hi Ray,

...

prefixes on the same link. Choosing to make use of a 120-bit prefix (for example) will do nothing to protect against a rogue RA announcing its own 64-bit prefix with the A flag set.

I could not find any "A flag" in the RA. Am I missing something?

...

From http://www.iana.org/assignments/icmpv6-parameters:

Registry:

RA Option Bit   Description                               Reference
-------------   ---------------------------------------   ---------
0               M - Managed Address Configuration Flag    [RFC2461]
1               O - Other Configuration Flag              [RFC2461]
2               H - Mobile IPv6 Home Agent Flag           [RFC3775]
3               Prf - Router Selection Preferences        [RFC4191]
4               Prf - Router Selection Preferences        [RFC4191]
5               P - Neighbor Discovery Proxy Flag         [RFC4389]
6-53            R - Reserved; Available for assignment    [RFC5175]
54-55           Private Experimentation                   [RFC5175]

Glen

sthaug＠nethelp.no

10:26 a.m.

...

...
prefixes on the same link. Choosing to make use of a 120-bit prefix (for example) will do nothing to protect against a rogue RA announcing its own 64-bit prefix with the A flag set.

I could not find any "A flag" in the RA. Am i missing something?

It's part of the Prefix Information option. See http://tools.ietf.org/html/rfc4861#section-4.6.2

Steinar Haug, Nethelp consulting, sthaug@nethelp.no
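For readers hunting for the same flag, here is a minimal sketch of decoding a Prefix Information option, following the 32-byte layout in RFC 4861 section 4.6.2 (type 3, length 4, prefix length, a flags octet whose top two bits are L and A, two lifetimes, a reserved word, then the prefix). The function name and the sample option are made up for illustration; the sketch does no validation beyond the basics.

```python
import ipaddress
import struct


def parse_prefix_info(option: bytes) -> dict:
    """Decode one ND Prefix Information option (type 3, 32 bytes)."""
    if len(option) != 32:
        raise ValueError("a Prefix Information option is 32 bytes long")
    opt_type, length, prefix_len, flags, valid, preferred, _reserved2 = struct.unpack(
        "!BBBBIII", option[:16]
    )
    if opt_type != 3 or length != 4:
        raise ValueError("not a Prefix Information option")
    return {
        "prefix": ipaddress.IPv6Network((option[16:32], prefix_len)),
        "on_link": bool(flags & 0x80),      # L flag
        "autonomous": bool(flags & 0x40),   # A flag: prefix may be used for SLAAC
        "valid_lifetime": valid,
        "preferred_lifetime": preferred,
    }


# Hypothetical option: 2001:db8:0:1::/64 with both L and A set.
sample = struct.pack("!BBBBIII", 3, 4, 64, 0xC0, 2592000, 604800, 0)
sample += ipaddress.IPv6Address("2001:db8:0:1::").packed
info = parse_prefix_info(sample)
print(info)
```

The A flag lives in this per-prefix option, not in the RA header flags listed in the IANA registry, which is why it does not show up there.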

Iljitsch van Beijnum

28 Dec

11:23 a.m.

On 24 Dec 2011, at 6:32 , Glen Kent wrote:

...

I am trying to understand why standards say that "using a subnet prefix length other than a /64 will break many features of IPv6, including Neighbor Discovery (ND), Secure Neighbor Discovery (SEND) [RFC3971], .. " [reference RFC 5375]

For stateless autoconfig the issue is that it uses 64-bit "interface identifiers" (~ MAC addresses) that are supposed to be globally unique. You can't shave off bits and remain globally unique.

With SEND, a cryptographic hash that can be used to determine address ownership is stored in the interface identifier. Here, shaving off bits reduces security.

Also, somehow the rule that all normal address space must use 64-bit interface identifiers found its way into the specs for no reason that I have ever been able to uncover. On the other hand, there's also the rule that IPv6 is classless and therefore routing on any prefix length must be supported, although for some implementations forwarding based on > /64 is somewhat less efficient.
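The stateless autoconfig constraint can be sketched as follows. RFC 4862 (section 5.5.3) has a host ignore an advertised prefix for autoconfiguration when the prefix length plus the interface identifier length does not equal 128 bits; with a 64-bit identifier that leaves /64 as the only length that works. The function name and the sample values below are assumptions for illustration.

```python
import ipaddress
from typing import Optional


def slaac_address(prefix: str, iid: bytes) -> Optional[ipaddress.IPv6Address]:
    """Combine an advertised prefix with an interface ID, or return None."""
    net = ipaddress.IPv6Network(prefix)
    iid_bits = len(iid) * 8
    if net.prefixlen + iid_bits != 128:
        # Lengths do not add up: the prefix is ignored for autoconfiguration.
        return None
    return ipaddress.IPv6Address(int(net.network_address) | int.from_bytes(iid, "big"))


# Hypothetical 64-bit interface ID (modified EUI-64 form) and documentation prefixes.
iid = bytes.fromhex("021b21fffe3a4c5d")
with_64 = slaac_address("2001:db8:0:1::/64", iid)
with_120 = slaac_address("2001:db8:0:1::/120", iid)
print(with_64, with_120)
```

With a /120 there is simply no room for a 64-bit identifier, so the host falls back to whatever else is available (static configuration or DHCPv6).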

Ray Soucy

12:13 p.m.

On Wed, Dec 28, 2011 at 6:23 AM, Iljitsch van Beijnum <iljitsch@muada.com> wrote:

...

Also somehow the rule that all normal address space must use 64-bit interface identifiers found its way into the specs for no reason that I have ever been able to uncover. On the other hand there's also the rule that IPv6 is classless and therefore routing on any prefix length must be supported, although for some implementations forwarding based on > /64 is somewhat less efficient.

This ambiguity has always bothered me.

The address architecture RFC requires a 64-bit interface identifier, but it's required to be unenforced by implementation, which makes it more of a suggestion at best. I think the wording should be updated to change MUST to SHOULD.

That said, and despite my own use of prefix lengths other than 64-bit, I do believe that a 64-bit prefix for each host network is in our long-term interest. Not having to size networks based on the number of hosts is a good thing. Features made possible by a 64-bit address space are a good thing.

--
Ray Soucy

Alexandru Petrescu

29 Dec 29 Dec

2:51 p.m.

Le 28/12/2011 13:13, Ray Soucy a écrit :

...

On Wed, Dec 28, 2011 at 6:23 AM, Iljitsch van Beijnum <iljitsch@muada.com> wrote:

...
Also somehow the rule that all normal address space must use 64-bit interface identifiers found its way into the specs for no reason that I have ever been able to uncover. On the other hand there's also the rule that IPv6 is classless and therefore routing on any prefix length must be supported, although for some implementations forwarding based on > /64 is somewhat less efficient.

This ambiguity has always bothered me. The address architecture RFC requires a 64-bit interface identifier,

Well yes, but only if it's an address which doesn't start with 000 (3 zero bits). I understand an address which starts with 000 can have an interface ID of generic length 128-n, where n is the prefix length (RFC4291 "Addressing Arch", p. 6, 1st par.). Generally speaking, my mind is disturbed by suggestions that the Interface ID must always be precisely of length 64, because there is no particularly valid reason to impose it so, other than the vaguely useful and semantically doubtful 'u' bit. Does any software ever check it on reception? At an extreme reading, it may look like the "secure" bit. Yours, Alex
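For readers who want to poke at these two details, a small Python sketch follows. It checks whether a unicast address falls under the "starts with binary 000" exemption and reads the 'u' bit position of the interface identifier. The function names are hypothetical, and the 'u' bit is only meaningful when the identifier really was built in modified EUI-64 format.

```python
import ipaddress


def requires_64bit_iid(addr):
    """True unless the address starts with binary 000 (the RFC 4291 exemption)."""
    value = int(ipaddress.IPv6Address(addr))
    return (value >> 125) != 0  # top three bits of the 128-bit address


def u_bit(addr):
    """Return the universal/local bit of the low 64 bits (0x02 of the first IID octet)."""
    value = int(ipaddress.IPv6Address(addr))
    return (value >> 57) & 1
```

For example, `requires_64bit_iid("2001:db8::1")` is true because global unicast space starts with binary 001, while `requires_64bit_iid("::1")` is false.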

...

but it's required to be unenforced by implementation, which makes it more of a suggestion at best. I think the wording should be updated to change MUST to SHOULD. That said, and despite my own use of prefix lengths other than 64-bit, I do believe that a 64-bit prefix for each host network is in our long-term interest. Not having to size networks based on the number of hosts is a good thing. Features made possible by a 64-bit address space are a good thing.

sthaug＠nethelp.no

28 Dec 28 Dec

1:10 p.m.

...

On the other hand there's also the rule that IPv6 is classless and therefore routing on any prefix length must be supported, although for some implementations forwarding based on > /64 is somewhat less efficient.

Can you please name names for the "somewhat less efficient" part? I've seen this and similar claims several times, but the lack of specific information is rather astounding. Steinar Haug, Nethelp consulting, sthaug@nethelp.no

Glen Kent

2:32 p.m.

Most vendors have a TCAM that by default does IPv6 routing for netmasks <=64. They have a separate TCAM (which is usually limited in size) that does routing for masks >64 and <=128. TCAMs are expensive and increase the BOM cost of routers. Storing routes with masks > 64 takes up twice the number of TCAM entries as the routes with masks <= 64. Since IPv6 is *supposed* to work with /64 masks, most vendors (usually the not-so-expensive-routers) provide a smaller TCAM for > /64 masks. Glen On Wed, Dec 28, 2011 at 6:40 PM, <sthaug@nethelp.no> wrote:

...

...
On the other hand there's also the rule that IPv6 is classless and therefore routing on any prefix length must be supported, although for some implementations forwarding based on > /64 is somewhat less efficient.

Can you please name names for the "somewhat less efficient" part? I've seen this and similar claims several times, but the lack of specific information is rather astounding.

Steinar Haug, Nethelp consulting, sthaug@nethelp.no

sthaug＠nethelp.no

2:41 p.m.

...

Most vendors have a TCAM that by default does IPv6 routing for netmasks <=64.

They have a separate TCAM (which is usually limited in size) that does routing for masks >64 and <=128.

Please provide references. I haven't seen any documentation of such an architecture myself.

...

TCAMs are expensive and increase the BOM cost of routers. Storing routes with masks > 64 takes up twice the number of TCAM entries as the routes with masks <= 64. Since IPv6 is *supposed* to work with /64 masks, most vendors (usually the not-so-expensive-routers) provide a smaller TCAM for > /64 masks.

Ah, but do the "not-so-expensive-routers" use TCAM at all? Steinar Haug, Nethelp consulting, sthaug@nethelp.no

Ryan Malayter

2:35 p.m.

On Dec 28, 7:10 am, sth...@nethelp.no wrote:

...

...
On the other hand there's also the rule that IPv6 is classless and therefore routing on any prefix length must be supported, although for some implementations forwarding based on > /64 is somewhat less efficient.

Can you please name names for the "somewhat less efficient" part? I've seen this and similar claims several times, but the lack of specific information is rather astounding.

Well, I do know if you look at the specs for most newer L3 switches, they will often say something like "max IPv4 routes 8192, max IPv6 routes 4096". This leads one to believe that the TCAMs/hash tables are only using 64 bits for IPv6 forwarding, and therefore longer prefixes must be handled in software. This may very well not be true "under the hood" at all, but the fact that vendors publish so little IPv6 specification and benchmarking information doesn't help matters.

sthaug＠nethelp.no

2:50 p.m.

...

...
Can you please name names for the "somewhat less efficient" part? I've seen this and similar claims several times, but the lack of specific information is rather astounding.

Well, I do know if you look at the specs for most newer L3 switches, they will often say something like "max IPv4 routes 8192, max IPv6 routes 4096". This leads one to believe that the TCAMs/hash tables are only using 64 bits for IPv6 forwarding, and therefore longer prefixes must be handled in software.

It might lead you to believe so - however, I believe this would be commercial suicide for hardware forwarding boxes because they would no longer be able to handle IPv6 at line rate for prefixes needing more than 64 bit lookups. It would also be an easy way to DoS such boxes...

...

This may very well not be true "under the hood" at all, but the fact that vendors publish so little IPv6 specification and benchmarking information doesn't help matters.

Cisco actually has published quite a bit of info, e.g. http://www.cisco.com/en/US/prod/collateral/switches/ps5718/ps708/product_dat... "Delivering scalable forwarding Performance: up to 400 Mpps IPv4 and 200 Mpps IPv6 with dCEF" They have also published EANTC tests which include IPv6 forwarding rates. Steinar Haug, Nethelp consulting, sthaug@nethelp.no

Ray Soucy

3:19 p.m.

It's fairly common knowledge that most of our systems work on 64-bit at best (and more commonly 32-bit still).

If every route is nicely split at the 64-bit boundary, then it saves a step in matching the prefix. Admittedly a very inexpensive step.

I expect that most hardware and software implementations store IPv6 as either a group of 4 32-bit integers or a pair of 64-bit integers, and a [ 7 or ] 8-bit prefix length field. I haven't read anything about a new 128-bit ASIC for IPv6, at least.

In this context, it is perfectly reasonable, and expected, that the use of longer prefixes will have a higher cost.

However, I think the number of routes, and your network architecture play a significant factor. It is a fairly standard practice to have different routers for your WAN connections (e.g. the routers you use BGP on and need to support thousands of routes) than the routers you use internally, where the routing table can be considerably smaller (and for which you can summarize). For these routers, the cost of routing is generally a non-factor as the tables are much smaller.

I think a greater concern than simple routing and forwarding, would be additional services, such as queuing, or filtering. These may be implemented in hardware when a 64-bit boundary is used, but punted to CPU otherwise. Though this would be implementation specific and is something you would want to research for whatever hardware you're running.

So far, the biggest performance problem I've encountered is related to neighbor discovery. It seems that in most implementations the neighbor discovery process is implemented in software. It doesn't have much to do with the boundary, but rather just that the process (e.g. solicitation for unknown entries) is expensive enough that sweeping through available address space can easily use all available CPU capacity.
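To put rough numbers on the sweep described above, here is a small Python sketch that compares how much address space an attacker can walk through in subnets of different sizes. The packet rate passed to `seconds_to_sweep` is an assumed figure chosen by the caller, not a measurement from any router.

```python
import ipaddress


def subnet_size(prefix):
    """Number of addresses a sweep could target inside the given subnet."""
    return ipaddress.IPv6Network(prefix).num_addresses


def seconds_to_sweep(prefix, packets_per_second):
    """Time needed to send one packet to every address at an assumed rate."""
    return subnet_size(prefix) / packets_per_second
```

A /120 holds 256 addresses, so a sweep is over almost immediately and can trigger at most 256 distinct solicitations. A /64 holds 2**64 addresses, so the attacker never runs out of unresolved targets that each cost the router a solicitation.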
One [somewhat effective] solution to this is to attempt to use longer prefixes so there is much less address space where such an attack would be valid. It is much less costly for a router to discard a packet that it has no route for than it is to issue thousands of neighbor discovery solicitations per second.

There are a few solutions that vendors will hopefully look into. One being to implement neighbor discovery in hardware (at which point table exhaustion also becomes a legitimate concern, so the logic should be such that known associations are not discarded in favor of unknown associations).

I do think, despite these limitations, that hardware is quickly catching up to IPv6, though. I don't think it will be long before we see the major vendors have solid implementations. Some of them already may; I haven't had a chance to play with the newest stuff out there.

On Wed, Dec 28, 2011 at 9:50 AM, <sthaug@nethelp.no> wrote:

...

...
...
Can you please name names for the "somewhat less efficient" part? I've seen this and similar claims several times, but the lack of specific information is rather astounding.

Well, I do know if you look at the specs for most newer L3 switches, they will often say something like "max IPv4 routes 8192, max IPv6 routes 4096". This leads one to believe that the TCAMs/hash tables are only using 64 bits for IPv6 forwarding, and therefore longer prefixes must be handled in software.

It might lead you to believe so - however, I believe this would be commercial suicide for hardware forwarding boxes because they would no longer be able to handle IPv6 at line rate for prefixes needing more than 64 bit lookups. It would also be an easy way to DoS such boxes...

...
This may very well not be true "under the hood" at all, but the fact that vendors publish so little IPv6 specification and benchmarking information doesn't help matters.

Cisco actually has published quite a bit of info, e.g.

http://www.cisco.com/en/US/prod/collateral/switches/ps5718/ps708/product_dat...

"Delivering scalable forwarding Performance: up to 400 Mpps IPv4 and 200 Mpps IPv6 with dCEF"

They have also published EANTC tests which include IPv6 forwarding rates.

Steinar Haug, Nethelp consulting, sthaug@nethelp.no

-- Ray Soucy Epic Communications Specialist Phone: +1 (207) 561-3526 Networkmaine, a Unit of the University of Maine System http://www.networkmaine.net/

sthaug＠nethelp.no

3:45 p.m.

...

If every route is nicely split at the 64-bit boundary, then it saves a step in matching the prefix. Admittedly a very inexpensive step.

My point here is that IPv6 is still defined as "longest prefix match", so unless you *know* that all prefixes are <= 64 bits, you still need the longer match.
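A minimal software sketch of this point, in Python, assuming the "pair of 64-bit integers plus prefix length" storage mentioned earlier in the thread. All names are hypothetical and the linear scan only illustrates the matching rule; it says nothing about how any vendor's hardware does it.

```python
import ipaddress

MASK64 = (1 << 64) - 1


def split_words(addr):
    """Split an IPv6 address into its high and low 64-bit words."""
    value = int(ipaddress.IPv6Address(addr))
    return value >> 64, value & MASK64


def word_mask(bits):
    """Mask covering the top `bits` bits of a 64-bit word (0 <= bits <= 64)."""
    return (MASK64 << (64 - bits)) & MASK64


def make_route(prefix, next_hop):
    """Store a route as (high word, low word, prefix length, next hop)."""
    net = ipaddress.IPv6Network(prefix)
    hi, lo = split_words(str(net.network_address))
    return (hi, lo, net.prefixlen, next_hop)


def matches(route, hi, lo):
    r_hi, r_lo, plen, _ = route
    if plen <= 64:
        mask = word_mask(plen)
        return (hi & mask) == (r_hi & mask)  # only the high word is compared
    mask = word_mask(plen - 64)
    # Longer than /64: the high word must match exactly and the low word partly.
    return hi == r_hi and (lo & mask) == (r_lo & mask)


def lookup(routes, dst):
    """Return the next hop of the longest matching prefix, or None."""
    hi, lo = split_words(dst)
    best = None
    for route in routes:
        if matches(route, hi, lo) and (best is None or route[2] > best[2]):
            best = route
    return best[3] if best else None
```

With routes for `2001:db8::/32`, `2001:db8:0:1::/64` and `2001:db8:0:1::100/120` in the table, a lookup that stopped at the high word would hand `2001:db8:0:1::1ab` to the /64 route, whereas the longest match is the /120. The extra work is the second comparison in `matches`.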

...

In this context, it is perfectly reasonable, and expected, that the use of longer prefixes will have a higher cost.

In a way I agree with you. However, if I put my purchasing hat on, I would refuse to buy equipment that could only forward on the first 64 bits, *or* where the forwarding decision was much slower (hardware vs software) for prefixes longer than 64 bits. I would not be surprised if vendors decide that it is a *commercial* necessity to support full 128 bit matches.

...

However, I think the number of routes, and your network architecture play a significant factor.

Absolutely. In our network by far the largest number of IPv6 prefixes are EBGP prefixes in the 32 to 48 bit range. However, we also have for instance our own 128 bit loopbacks - they are obviously only in our IGP.

...

I think a greater concern than simple routing and forwarding, would be additional services, such as queuing, or filtering. These may be implemented in hardware when a 64-bit boundary is used, but punted to CPU otherwise. Though this would be implementation specific and is something you would want to research for whatever hardware you're running.

Again, that would be an excellent reason *not* to buy such equipment. And yes, we know equipment that cannot *filter* on full IPv6 + port number headers exists (e.g. Cisco 6500/7600 with 144 bit TCAMs) - my original point was that I still haven't seen equipment with forwarding problems for prefixes > 64 bits.

...

There are a few solutions that vendors will hopefully look into. One being to implement neighbor discovery in hardware (at which point table exhaustion also becomes a legitimate concern, so the logic should be such that known associations are not discarded in favor of unknown associations).

I'm afraid I don't believe this is going to happen unless neighbor discovery based attacks become a serious problem. And even then it would take a long time. Steinar Haug, Nethelp consulting, sthaug@nethelp.no

Alexandru Petrescu

29 Dec 29 Dec

2:55 p.m.

Le 28/12/2011 16:45, sthaug@nethelp.no a écrit :

...

...
If every route is nicely split at the 64-bit boundary, then it saves a step in matching the prefix. Admittedly a very inexpensive step.

My point here is that IPv6 is still defined as "longest prefix match",

:-) yes agree, except that it's not known what "longest prefix match" is precisely. It is widely implemented in often closed source code, there are books about it and lectures to first-year students. I have heard it named "crown jewels" of some companies. High value and no specification == speculation. Alex

...

so unless you *know* that all prefixes are <= 64 bits, you still need the longer match.

...
In this context, it is perfectly reasonable, and expected, that the use of longer prefixes will have a higher cost.

In a way I agree with you. However, if I put my purchasing hat on, I would refuse to buy equipment that could only forward on the first 64 bits, *or* where the forwarding decision was much slower (hardware vs software) for prefixes longer than 64 bits. I would not be surprised if vendors decide that it is a *commercial* necessity to support full 128 bit matches.

...
However, I think the number of routes, and your network architecture play a significant factor.

Absolutely. In our network by far the largest number of IPv6 prefixes are EBGP prefixes in the 32 to 48 bit range. However, we also have for instance our own 128 bit loopbacks - they are obviously only in our IGP.

...
I think a greater concern than simple routing and forwarding, would be additional services, such as queuing, or filtering. These may be implemented in hardware when a 64-bit boundary is used, but punted to CPU otherwise. Though this would be implementation specific and is something you would want to research for whatever hardware you're running.

Again, that would be an excellent reason *not* to buy such equipment.

And yes, we know equipment that cannot *filter* on full IPv6 + port number headers exists (e.g. Cisco 6500/7600 with 144 bit TCAMs) - my original point was that I still haven't seen equipment with forwarding problems for prefixes > 64 bits.

...
There are a few solutions that vendors will hopefully look into. One being to implement neighbor discovery in hardware (at which point table exhaustion also becomes a legitimate concern, so the logic should be such that known associations are not discarded in favor of unknown associations).

I'm afraid I don't believe this is going to happen unless neighbor discovery based attacks become a serious problem. And even then it would take a long time.

Steinar Haug, Nethelp consulting, sthaug@nethelp.no

Bjørn Mork

7 Jan 7 Jan

noon

sthaug@nethelp.no writes:

...

And yes, we know equipment that cannot *filter* on full IPv6 + port number headers exists (e.g. Cisco 6500/7600 with 144 bit TCAMs) - my original point was that I still haven't seen equipment with forwarding problems for prefixes > 64 bits.

Depends on what you consider a problem and whether you consider a layer 3 switch a "router" at all, but there are certainly some switches which will be more or less effective depending on prefix length. Ref e.g. http://www.cisco.com/en/US/docs/switches/lan/catalyst3750/software/release/1... where you'll find this carefully worded hint: "Note: An IPv4 route requires only one TCAM entry. Because of the hardware compression scheme used for IPv6, an IPv6 route can take more than one TCAM entry, reducing the number of entries forwarded in hardware. For example, for IPv6 directly connected IP addresses, the desktop template might allow less than two thousand entries." Translated: "The stated numbers for IPv6 routes are twice the real max. However, prefix compression may give better utilisation under certain conditions". Bjørn

sthaug＠nethelp.no

1:24 p.m.

...

"Note: An IPv4 route requires only one TCAM entry. Because of the hardware compression scheme used for IPv6, an IPv6 route can take more than one TCAM entry, reducing the number of entries forwarded in hardware. For example, for IPv6 directly connected IP addresses, the desktop template might allow less than two thousand entries."

Translated: "The stated numbers for IPv6 routes are twice the real max. However, prefix compression may give better utilisation under certain conditions".

Thanks, that's the first *specific* information I've seen of equipment that might have problems (reduced number of entries) with longer than 64 bit prefixes. Fortunately we're not using 3560/3750 for IPv6 routing at the moment. Any other takers? Steinar Haug, Nethelp consulting, sthaug@nethelp.no

Leo Bicknell

28 Dec 28 Dec

3:57 p.m.

In a message written on Wed, Dec 28, 2011 at 10:19:54AM -0500, Ray Soucy wrote:

...

If every route is nicely split at the 64-bit boundary, then it saves a step in matching the prefix. Admittedly a very inexpensive step.

I expect that most hardware and software implementations store IPv6 as either a group of 4 32-bit integers or a pair of 64-bit integers, and a [ 7 or ] 8-bit prefix length field. I haven't read anything about a new 128-bit ASIC for IPv6, at least.

In this context, it is perfectly reasonable, and expected, that the use of longer prefixes will have a higher cost.

The routers are already having to do a 128-bit lookup under the hood.

Consider you have a /48 routed in your IGP (to keep things simple). When you look up the /48 in a router you will see it has a next hop. A 128 bit next hop. This may be a link local, it may be a global unicast (depending on your implementation). This next hop has to be resolved, in the case of Ethernet as an example to a 48 bit MAC address.

So a typical forwarding step is already a two step process:

1. Look up variable length prefix to get next hop.
2. Look up 128 bit next hop to get forwarding information.

Once the vendor has built a 128-bit TCAM for step #2, there's no reason not to use it for step #1 as well. AFAIK, in all recent products this is how all vendors handle the problem (at a high level).

Sadly, this is all a case where mind share is hobbled by a few early adopter problems. If you look at the first IPv6 images for platforms like the Cisco 7500 (in the VIP-2 days) that hardware was built to IPv4 criteria, and had 32 bit TCAMs. To make IPv6 work they did multiple TCAM lookups, some the simple 32 bits x 4, others fancy things trying to guess prefix lengths that might likely be used. All took a substantial line rate hit moving IPv6 as a result.

Those problems simply don't exist in modern gear. Once products were designed to support native IPv6, rational design decisions were made. I don't know of any _current generation_ core router that has any performance difference based on prefix length. That's why prefix length isn't in the test criteria, it simply doesn't matter.

I say this as a proud user of /128's, /126's, and /112's in a multi-vendor network, as well.

-- Leo Bicknell - bicknell@ufp.org - CCIE 3440 PGP keys at http://www.ufp.org/~bicknell/

Glen Kent

4:36 p.m.

...

So a typical forwarding step is already a two step process:

Look up variable length prefix to get next hop. Look up 128 bit next hop to get forwarding information.

Wrong. You only do a lookup once. You look up a TCAM or a hash table that gives you the next hop for a route. You DON'T need to do another TCAM lookup to get the egress encapsulation information. You get the egress encapsulation after your TCAM lookup. It typically gives you an index that stores this information. All routes pointing to one nexthop will typically point to the same index.
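The single-lookup model described here can be sketched in Python as a route table whose result is an index into a separate adjacency table. This is a toy model: the table layout, port names and MAC addresses are invented for illustration, and the linear search stands in for the TCAM or hash lookup.

```python
import ipaddress

# Adjacency table: one entry per next hop, holding the egress rewrite information.
ADJACENCIES = [
    {"port": "port-1", "dst_mac": "00:00:5e:00:53:01"},
    {"port": "port-2", "dst_mac": "00:00:5e:00:53:02"},
]

# Route table: prefix -> adjacency index. Routes that share a next hop share an index.
ROUTES = {
    ipaddress.IPv6Network("2001:db8:a::/48"): 0,
    ipaddress.IPv6Network("2001:db8:b::/48"): 0,
    ipaddress.IPv6Network("2001:db8:c::/64"): 1,
}


def forward(dst):
    """One longest-prefix lookup, then a direct index into the adjacency table."""
    addr = ipaddress.IPv6Address(dst)
    candidates = [net for net in ROUTES if addr in net]
    if not candidates:
        return None  # no route: drop
    best = max(candidates, key=lambda net: net.prefixlen)
    # Plain array indexing, not a second prefix or 128-bit lookup.
    return ADJACENCIES[ROUTES[best]]
```

In this model the 128-bit next-hop address is resolved to a MAC address when the adjacency entry is created (control plane), not per packet, which is the distinction being argued in this exchange.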

...

Once the vendor has built a 128-bit TCAM for step #2, there's no reason not to use it for step #1 as well. AFAIK, in all recent products this is how all vendors handle the problem (at a high level).

You only use the TCAM for #1, not for #2. Glen

Jeff Wheeler

9:08 p.m.

On Wed, Dec 28, 2011 at 10:19 AM, Ray Soucy <rps@maine.edu> wrote:

...

There are a few solutions that vendors will hopefully look into. One being to implement neighbor discovery in hardware (at which point table exhaustion also becomes a legitimate concern, so the logic should be such that known associations are not discarded in favor of unknown associations).

Even if that is done you are still exposed to attacks -- imagine if a downstream machine that is under customer control (not yours) has a whole /64 nailed up on its Ethernet interface, and happily responds to ND solicits for every address. Your hardware table will fill up and then your network has failed -- which way it fails depends on the table eviction behavior. Perhaps this is not covered very well in my slides. There are design limits here that cannot be overcome by any current or foreseen technology. This is not only about what is broken about current routers but what will always be broken about them, in the absence of clever work-arounds like limits on the number of ND entries allowed per physical customer port, etc. We really need DHCPv6 snooping and ND disabled for campus access networks, for example. Otherwise you could give out addresses from a limited range in each subnet and use an ACL (like Owen DeLong suggests for hosting environments -- effectively turning the /64 into a /120 anyway) but this is IMO much worse than just not configuring a /64. On Wed, Dec 28, 2011 at 10:45 AM, <sthaug@nethelp.no> wrote:

...

I'm afraid I don't believe this is going to happen unless neighbor discovery based attacks become a serious problem. And even then it would take a long time.

The vendors seem to range from "huh?" to "what is everyone else doing?" to Cisco (the only vendor to make any forward progress at all on this issue.) I think that will change as this topic is discussed more and more on public mailing lists, and as things like DHCPv6 snooping, and good behavior when ND is disabled on a subnet/interface, begin to make their way into RFPs. As it stands right now, if you want to disable the IPv6 functionality (and maybe IPv4 too if dual-stacked) of almost any datacenter / hosting company offering v6, it is trivial to do that. The same is true of every IXP with a v6 subnet. I think once some bad guys figure this out, they will do us a favor and DoS some important things like IXPs, or a highly-visible ISP, and give the vendors a kick in the pants -- along with operators who still have the "/64 or bust" mentality, since they will then see things busting due to trivial attacks. -- Jeff S Wheeler <jsw@inconcepts.biz> Sr Network Operator / Innovative Network Concepts

Ray Soucy

10:07 p.m.

You will always be exposed to attacks if you're connected to the Internet. (Not really sure what you were trying to say there.)

My primary concerns are attacks originated from external networks. Internal network attacks are a different issue altogether (similar to ARP attacks or MAC spoofing), which require different solutions.

As previously discussed in a recent thread, the attack vector you describe (in a service provider environment) can be mitigated through architecture simply by effective CPE design (isolated link network to CPE, L3 hand-off at CPE, with stateful packet inspection; and small or link-local prefixes for link networks). Thankfully, this isn't a model that is anything new; many networks are already built in this way. The only point contested is the validity of longer-than 64-bit prefixes; which I think I've spoken enough on.

Enterprise and Data Center environments have a different set of [similar] concerns. Which is where the most concern on exploitation of ND and large prefixes comes into play. I think most of us have been at this for long enough to have given up on the one-configuration-fits-all school of network design. A stateful firewall internally can be a strong tool to mitigate this attack vector in such environments, depending on their size. For networks where a stateful firewall isn't practical, though, that is where stronger router implementation comes into play.

The suggestion of disabling ND outright is a bit extreme. We don't need to disable ARP outright to have functional networks with a reasonable level of stability and security. The important thing is that we work with vendors to get a set of tools (not just one) to address these concerns. As you pointed out Cisco has already been doing quite a bit of work in this area, and once we start seeing the implementations become more common, other vendors will more than likely follow (at least if they want our business).

Maybe I'm just a glass-half-full kind of guy. ;-)

I think being able to use longer prefixes than 64-bit helps considerably. I think that seeing routers that can implement ND in hardware (or at least limit its CPU usage), and not bump known associations for unknown ones will help considerably. Stateful firewalls (where appropriate) will help considerably. And L2 security features (ND inspection with rate-limiting, RA guard, DHCPv6 snooping) will all help -- considerably. Combined, they make for an acceptable solution by current standards.

As was also pointed out, though, many networks don't even implement this level of security for IP internally; the difference is that many of them haven't needed to for external attacks because of the widespread use of NAT, stateful firewalls, and much smaller address space. That is a little different in the IPv6 world, and why there is concern being expressed on this list.

The most important thing is that network operators are aware of these issues, have a basic understanding of the implications, and are provided with the knowledge and tools to address them. This really isn't much different than IPv4.

On Wed, Dec 28, 2011 at 4:08 PM, Jeff Wheeler <jsw@inconcepts.biz> wrote:

...

On Wed, Dec 28, 2011 at 10:19 AM, Ray Soucy <rps@maine.edu> wrote:

...
There are a few solutions that vendors will hopefully look into. One being to implement neighbor discovery in hardware (at which point table exhaustion also becomes a legitimate concern, so the logic should be such that known associations are not discarded in favor of unknown associations).

Even if that is done you are still exposed to attacks -- imagine if a downstream machine that is under customer control (not yours) has a whole /64 nailed up on its Ethernet interface, and happily responds to ND solicits for every address. Your hardware table will fill up and then your network has failed -- which way it fails depends on the table eviction behavior.

Perhaps this is not covered very well in my slides. There are design limits here that cannot be overcome by any current or foreseen technology. This is not only about what is broken about current routers but what will always be broken about them, in the absence of clever work-arounds like limits on the number of ND entries allowed per physical customer port, etc.

We really need DHCPv6 snooping and ND disabled for campus access networks, for example. Otherwise you could give out addresses from a limited range in each subnet and use an ACL (like Owen DeLong suggests for hosting environments -- effectively turning the /64 into a /120 anyway) but this is IMO much worse than just not configuring a /64.

On Wed, Dec 28, 2011 at 10:45 AM, <sthaug@nethelp.no> wrote:

...
I'm afraid I don't believe this is going to happen unless neighbor discovery based attacks become a serious problem. And even then it would take a long time.

The vendors seem to range from "huh?" to "what is everyone else doing?" to Cisco (the only vendor to make any forward progress at all on this issue.) I think that will change as this topic is discussed more and more on public mailing lists, and as things like DHCPv6 snooping, and good behavior when ND is disabled on a subnet/interface, begin to make their way into RFPs.

As it stands right now, if you want to disable the IPv6 functionality (and maybe IPv4 too if dual-stacked) of almost any datacenter / hosting company offering v6, it is trivial to do that. The same is true of every IXP with a v6 subnet. I think once some bad guys figure this out, they will do us a favor and DoS some important things like IXPs, or a highly-visible ISP, and give the vendors a kick in the pants -- along with operators who still have the "/64 or bust" mentality, since they will then see things busting due to trivial attacks.

-- Jeff S Wheeler <jsw@inconcepts.biz> Sr Network Operator / Innovative Network Concepts

-- Ray Soucy Epic Communications Specialist Phone: +1 (207) 561-3526 Networkmaine, a Unit of the University of Maine System http://www.networkmaine.net/

Jeff Wheeler

10:39 p.m.

On Wed, Dec 28, 2011 at 5:07 PM, Ray Soucy <rps@maine.edu> wrote:

...

The suggestion of disabling ND outright is a bit extreme. We don't need to disable ARP outright to have functional networks with a reasonable level of stability and security. The important thing is

I don't think it's at all extreme. If you are dealing with an access network where DHCPv6 is the only legitimate way to get an address on a given LAN segment, there is probably no reason for the router to use ND to learn about neighbor L3<>L2 associations. With DHCPv6 snooping the router can simply not use ND on that segment, which eliminates this problem. However, this feature is not yet available. It would also be difficult to convince hosting customers to use a DHCPv6 client to populate their gateway's neighbor table. However, if this feature comes along before other fixes, it will be a good option for safely deploying /64s without ND vulnerabilities.
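Since the feature does not exist yet, the following Python sketch is purely hypothetical: it shows the idea of a neighbor table that is populated only from snooped DHCPv6 leases and never falls back to ND solicitation. The class and method names are invented, and lease handling is reduced to a single expiry time.

```python
import ipaddress


class SnoopedNeighborTable:
    """Neighbor table fed only by snooped DHCPv6 leases (hypothetical design)."""

    def __init__(self):
        self._bindings = {}  # IPv6Address -> (mac, port, expires_at)

    def learn_lease(self, address, mac, port, lifetime, now):
        """Record a binding seen in a DHCPv6 reply on the access segment."""
        self._bindings[ipaddress.IPv6Address(address)] = (mac, port, now + lifetime)

    def resolve(self, address, now):
        """Return (mac, port) for a known binding, else None. Never solicits."""
        key = ipaddress.IPv6Address(address)
        entry = self._bindings.get(key)
        if entry is None:
            return None  # unknown destination: drop instead of sending ND
        mac, port, expires_at = entry
        if now >= expires_at:
            del self._bindings[key]  # lease expired, forget the neighbor
            return None
        return mac, port
```

Because unknown destinations are simply dropped, a sweep across the /64 creates no per-address state and no solicitations; the table can only grow as fast as the DHCPv6 server hands out leases.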

...

that we work with vendors to get a set of tools (not just one) to address these concerns. As you pointed out Cisco has already been doing quite a bit of work in this area, and once we start seeing the implementations become more common, other vendors will more than likely follow (at least if they want our business).

Maybe I'm just a glass-half-full kind of guy. ;-)

I think your view of the Cisco work is a little optimistic. :) What they have done so far is simply acknowledge that, yes, ND exhaustion is a problem, and give the customer the option to mitigate damage to individual interfaces / VLANs, on the very few platforms that support the feature. Cisco has also given the SUP-2T independent policers for ARP and ND, so if you have a SUP-2T instead of a SUP720 / RSP720, your IPv4 won't break when you get an IPv6 ND attack. Unfortunately, there are plenty of people out there who are running IPv6 /64s on SUP720s, most who do not know that an attacker can break all their IPv4 services with an IPv6 ND attack.
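The effect of independent policers can be shown with a toy token-bucket model in Python. This is not Cisco's implementation; the rates, burst sizes and class names are assumptions, and the "shared" mode merely models a platform where ARP and ND punts compete for one budget.

```python
class TokenBucket:
    """Simple token bucket: `rate` tokens per second, up to `burst` tokens."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now):
        # Refill according to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class ControlPlanePolicer:
    """Polices punted ARP and ND packets with a shared or per-protocol bucket."""

    def __init__(self, shared, rate=100, burst=100):
        if shared:
            bucket = TokenBucket(rate, burst)
            self.buckets = {"arp": bucket, "nd": bucket}
        else:
            self.buckets = {"arp": TokenBucket(rate, burst), "nd": TokenBucket(rate, burst)}

    def allow(self, protocol, now):
        return self.buckets[protocol].allow(now)
```

With `shared=True`, an ND flood drains the single bucket and the next legitimate ARP packet is dropped too; with `shared=False`, the ND flood only exhausts the ND bucket and ARP keeps working.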

...

The most important thing is that network operators are aware of these issues, have a basic understanding of the implications, and are provided with the knowledge and tools to address them.

We certainly agree here. I am glad the mailing list has finally moved from listening to Owen DeLong babble about this being a non-problem, to discussing what work-arounds are possible, disadvantages of them, and what vendors can do better in the future. My personal belief is that DHCPv6 snooping, with ND disabled, will be the first widely-available method of deploying /64s "safely" to customer LAN segments. I'm not saying this is good but it is a legitimate solution. -- Jeff S Wheeler <jsw@inconcepts.biz> Sr Network Operator / Innovative Network Concepts

Ray Soucy

11:08 p.m.

As much as I argue with Owen on-list, I still enjoy reading his input. It's a little uncalled for to be so harsh about his posts. A lot of us are strong-willed here, and many of us read things we've posted in the past and ask "what was I thinking, that's ridiculous"; and perhaps I'm just saying that because I do so more than most. But really, let's stay civil.

I don't disagree with your other comments much, but I do think (hope, actually) that DHCPv6 snooping will not filter link-local traffic. That would be a job for an ND inspection kind of technology, and one I would hope was configurable. There is no DHCPv6 for link-local, so it would be kind of silly to have DHCPv6 snooping restrict that traffic completely. It will be a little less straightforward than DHCP snooping is, no doubt.

And I will admit I can be a Cisco fanboy at times, but only because they've consistently been able to deliver on IPv6 more than other vendors I've worked with. Like any vendor it can be hard to get through to the people who matter, but Cisco has been pretty good at responding to us when we poke them on these matters. Surprisingly, most of the time the delay is waiting on a standard to be established so they can implement that rather than their own thing.

-- Ray Soucy Epic Communications Specialist Phone: +1 (207) 561-3526 Networkmaine, a Unit of the University of Maine System http://www.networkmaine.net/

Ryan Malayter

3:30 p.m.

On Dec 28, 8:50 am, sth...@nethelp.no wrote:

...

It might lead you to believe so - however, I believe this would be commercial suicide for hardware forwarding boxes because they would no longer be able to handle IPv6 at line rate for prefixes needing more than 64 bit lookups. It would also be an easy way to DoS such boxes...

That's just what I'm arguing here: no vendor info I've seen positively says they *can* handle line-rate with longer IPv6 prefixes. The other information available leads one to believe that all the published specs are based on /64 prefixes.

Even third-party test reports don't mention IPv6 or prefix length at all: http://www.aristanetworks.com/media/system/pdf/LippisTestReportMay2011.pdf

...

Cisco actually has published quite a bit of info, e.g.

http://www.cisco.com/en/US/prod/collateral/switches/ps5718/ps708/prod...

"Delivering scalable forwarding Performance: up to 400 Mpps IPv4 and 200 Mpps IPv6 with dCEF"

They have also published EANTC tests which include IPv6 forwarding rates.

Except nowhere in there is the prefix length for the test indicated, and the exact halving of forwarding rate for IPv6 leads one to believe that there are two TCAM lookups for IPv6 (hence 64-bit prefix lookups) versus one for IPv4.

For example, what is the forwarding rate for IPv6 when the tables are filled with /124 IPv6 routes that differ only in the last 60 bits?

Even the EANTC test results you reference make no mention of the prefix length for IPv4 or IPv6, or even the number of routes in the lookup table during the testing: http://www.cisco.com/en/US/prod/collateral/switches/ps5718/ps708/prod_white_...
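The worst case described above can be sketched as follows: a handful of /124 routes that share their first 64 bits and differ only in the bits after them. The parent prefix 2001:db8:0:1::/64 and the helper name top64 are assumptions for illustration; the point is only that a lookup keyed on the first 64 bits cannot tell these routes apart, so a second stage would be needed.

```python
import ipaddress


def top64(network: ipaddress.IPv6Network) -> int:
    """Return the most significant 64 bits of the network address."""
    return int(network.network_address) >> 64


# Documentation prefix, assumed for illustration.
parent = ipaddress.IPv6Network("2001:db8:0:1::/64")

# Five /124 routes inside the same /64; each step of 16 addresses is one /124.
routes = [
    ipaddress.IPv6Network((int(parent.network_address) + (i << 4), 124))
    for i in range(1, 6)
]

distinct_full = {int(r.network_address) for r in routes}
distinct_top64 = {top64(r) for r in routes}

# The full 128-bit keys are all different; the 64-bit keys collapse to one.
print(len(distinct_full), len(distinct_top64))
```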

Ray Soucy

3:44 p.m.

For what it's worth, I haven't stress tested it or anything, but I haven't seen any evidence on any of our RSP/SUP 720 boxes that would have caused me to think that routing and forwarding isn't being done in hardware, and we make liberal use of prefixes longer than 64 (including 126 for every link network). They might just be under capacity to the point that I haven't noticed, though. I have no problem getting multi-gigabit IPv6 throughput.

-- Ray Soucy Epic Communications Specialist Phone: +1 (207) 561-3526 Networkmaine, a Unit of the University of Maine System http://www.networkmaine.net/

Ryan Malayter

5:51 p.m.

On Dec 28, 9:44 am, Ray Soucy <r...@maine.edu> wrote:

...

For what it's worth, I haven't stress tested it or anything, but I haven't seen any evidence on any of our RSP/SUP 720 boxes that would have caused me to think that routing and forwarding isn't being done in hardware, and we make liberal use of prefixes longer than 64 (including 126 for every link network). They might just be under capacity to the point that I haven't noticed, though. I have no problem getting multi-gigabit IPv6 throughput.

You can get >10GbE *throughput* from a Linux box doing all forwarding in software as well. That's easy when the packets are big and the routing tables are small, and the hash tables all fit in high-speed processor cache.

The general lack of deep information about how the switching and routing hardware really works for IPv6 is my main problem. It's not enough to make informed buying or design decisions. Unfortunately, I have over the course of my career learned that a "trust but verify" policy is required when managing vendors. Especially vendors that have a near-monopoly market position.

The problem, of course, is that verifying this sort of thing with realistic worst-case benchmarks requires some very expensive equipment and a lot of time, which is why the lack of solid information from vendors and 3rd-party testing labs is worrying.

Surely some engineers from the major switch/router vendors read the NANOG list. Anybody care to chime in with "we forward all IPv6 prefix lengths in hardware for these product families"?
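The point about throughput being easy "when the packets are big" can be made concrete with a little arithmetic. This is a rough sketch, assuming standard Ethernet framing (20 bytes of preamble, start delimiter and inter-frame gap per frame) on a 10 Gb/s link; the function and constant names are made up for illustration.

```python
# Per-frame overhead on the wire beyond the Ethernet frame itself:
# 7-byte preamble + 1-byte start delimiter + 12-byte inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def packets_per_second(link_bps: float, frame_bytes: int) -> float:
    """Maximum frames per second for a frame size that includes header and FCS."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


TEN_GBE = 10e9
small = packets_per_second(TEN_GBE, 64)    # minimum-size frames
large = packets_per_second(TEN_GBE, 1518)  # full-size frames (1500-byte payload)

print(f"{small:,.0f} pps with 64-byte frames")
print(f"{large:,.0f} pps with 1518-byte frames")
print(f"ratio: {small / large:.1f}x")
```

The same link rate requires roughly eighteen times as many forwarding lookups per second with minimum-size frames, which is why a multi-gigabit throughput figure alone says little about lookup performance.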

Ray Soucy

7:14 p.m.

I did look into this a bit before. To be more specific:

IPv6 CEF appears to be functioning normally for prefixes longer than 64-bit on my 720(s).

I'm not seeing evidence of unexpected punting.

The CPU utilization of the software process that would handle IPv6 being punted to software, "IPv6 Input", is at a steady 0.00% average (with spikes up to 0.02%).

So there would seem to be at least one major platform that is OK.

-- Ray Soucy Epic Communications Specialist Phone: +1 (207) 561-3526 Networkmaine, a Unit of the University of Maine System http://www.networkmaine.net/

sthaug＠nethelp.no

7:46 p.m.

...

IPv6 CEF appears to be functioning normally for prefixes longer than 64-bit on my 720(s).

I'm not seeing evidence of unexpected punting.

The CPU utilization of the software process that would handle IPv6 being punted to software, "IPv6 Input", is at a steady 0.00% average (with spikes up to 0.02%).

So there would seem to be at least one major platform that is OK.

And there are other platforms, e.g. Juniper M/MX/T, where there is no concept of "punt a packet to software to perform a forwarding decision". The packet is either forwarded in hardware, or dropped. IPv6 prefixes > 64 bit are handled like any other IPv6 prefixes, i.e. they are forwarded in hardware. Steinar Haug, Nethelp consulting, sthaug@nethelp.no

Mark Tinka

29 Dec

8:56 a.m.

On Thursday, December 29, 2011 03:46:48 AM sthaug@nethelp.no wrote:

...

And there are other platforms, e.g. Juniper M/MX/T, where there is no concept of "punt a packet to software to perform a forwarding decision". The packet is either forwarded in hardware, or dropped. IPv6 prefixes > 64 bit are handled like any other IPv6 prefixes, i.e. they are forwarded in hardware.

IOS XR-based systems operate the same way. Mark.

Saku Ytti

9:10 a.m.

On (2011-12-29 16:56 +0800), Mark Tinka wrote:

...

On Thursday, December 29, 2011 03:46:48 AM sthaug@nethelp.no wrote:

...
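And there are other platforms, e.g. Juniper M/MX/T, where there is no concept of "punt a packet to software to perform a forwarding decision". The packet is either forwarded in hardware, or dropped. IPv6 prefixes > 64 bit are handled like any other IPv6 prefixes, i.e. they are forwarded in hardware.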

IOS XR-based systems operate the same way.

Of course this isn't strictly true: transit traffic might be punted on either platform for various reasons. IP(v6) options come to mind, possibly too many IPv6 extension headers (Cisco.com claims to punt in such instances; JNPR/trio, imho incorrectly, just drops the packet in hardware), glean/ARP resolution, multicast learning, and probably many other reasons I can't think of. -- ++ytti

Mark Tinka

11:39 a.m.

On Thursday, December 29, 2011 05:10:15 PM Saku Ytti wrote:

...

Of course this isn't strictly true,...

Of course, not "strictly". What I meant was the CRS and ASR9000 don't operate like the 6500/7600 and other Cisco switches that punted packets to CPU if, for one reason or another, a bug or misconfiguration caused said packets to be sent to the CPU for forwarding. Mark.

Joel jaeggli

4 Jan

10:16 a.m.

On 12/28/11 07:30 , Ryan Malayter wrote:

...

Except nowhere in there is the prefix length for the test indicated, and the exact halving of forwarding rate for IPv6 leads one to believe that there are two TCAM lookups for IPv6 (hence 64-bit prefix lookups) versus one for IPv4.

A CAM (assuming your router uses one) can easily be partitioned to support 144-bit words, and you can look up the whole address in one go. A router designer might well choose to fold the lookup and partition a CAM table in a different fashion, to reduce memory consumption, save power, etc. If they choose to split lookups (for example, with the 72 most significant bits in the first lookup and the last 56 in a second), it's because they believe the tradeoff associated with two constant-time lookups is acceptable. Remember, the CAM table lookup is competing for mind/market share against a prefix trie lookup with a variable stride pattern done in really fast DRAM.
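The folded lookup described above can be sketched in software as two constant-time dictionary lookups, one keyed on the top 72 bits and one on the low 56. This is only an illustration of the 72/56 split: it does exact matching on full addresses, whereas a real CAM or TCAM does masked matching in hardware, and the class and function names here are hypothetical.

```python
import ipaddress
from typing import Dict, Optional, Tuple

LOW_BITS = 56  # second-stage key width; the first stage uses the top 72 bits
LOW_MASK = (1 << LOW_BITS) - 1


def split_key(addr: str) -> Tuple[int, int]:
    """Split a 128-bit address into (top 72 bits, low 56 bits)."""
    value = int(ipaddress.IPv6Address(addr))
    return value >> LOW_BITS, value & LOW_MASK


class TwoStageTable:
    """Exact-match table folded into two constant-time lookups."""

    def __init__(self) -> None:
        self._stage1: Dict[int, Dict[int, str]] = {}

    def insert(self, addr: str, next_hop: str) -> None:
        high, low = split_key(addr)
        self._stage1.setdefault(high, {})[low] = next_hop

    def lookup(self, addr: str) -> Optional[str]:
        high, low = split_key(addr)
        # First lookup selects the second-stage table, second lookup resolves the entry.
        return self._stage1.get(high, {}).get(low)


table = TwoStageTable()
table.insert("2001:db8::1", "port-1")
print(table.lookup("2001:db8::1"), table.lookup("2001:db8::2"))
```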

...

For example, what is the forwarding rate for IPv6 when the tables are filled with /124 IPv6 routes that differ only in the last 60 bits?

Even the EANTC test results you reference make no mention of the prefix length for IPv4 or IPv6, or even the number of routes in the lookup table during the testing: http://www.cisco.com/en/US/prod/collateral/switches/ps5718/ps708/prod_white_...

Kevin Loch

29 Dec

7:03 p.m.

Iljitsch van Beijnum wrote:

...

On 24 Dec 2011, at 6:32 , Glen Kent wrote:

...
I am trying to understand why standards say that "using a subnet prefix length other than a /64 will break many features of IPv6, including Neighbor Discovery (ND), Secure Neighbor Discovery (SEND) [RFC3971], .. " [reference RFC 5375]

For stateless autoconfig the issue is that it uses 64-bit "interface identifiers" (~ MAC addresses) that are supposed to be globally unique. You can't shave off bits and remain globally unique.

With SEND a cryptographic hash that can be used to determine address ownership is stored in the interface identifier. Here shaving off addresses reduces security.

Also somehow the rule that all normal address space must use 64-bit interface identifiers found its way into the specs for no reason that I have ever been able to uncover. On the other hand there's also the rule that IPv6 is classless and therefore routing on any prefix length must be supported, although for some implementations forwarding based on > /64 is somewhat less efficient.
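The 64-bit interface identifiers mentioned in the quoted text are commonly derived from a 48-bit MAC address using the modified EUI-64 construction from RFC 4291: flip the universal/local bit of the first octet and insert ff:fe in the middle. A minimal sketch follows; the function name and the sample MAC address are made up for illustration.

```python
def mac_to_interface_id(mac: str) -> str:
    """Build a modified EUI-64 interface identifier from a colon-separated MAC."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    octets[0] ^= 0x02  # flip the universal/local bit
    eui64 = octets[:3] + b"\xff\xfe" + octets[3:]  # 48 bits become 64
    groups = [f"{(eui64[i] << 8) | eui64[i + 1]:x}" for i in range(0, 8, 2)]
    return ":".join(groups)


# Hypothetical MAC address used only as an example.
print(mac_to_interface_id("00:11:22:33:44:55"))
```

Because the identifier already occupies all 64 bits, a prefix longer than /64 leaves no room for it, which is why stateless autoconfiguration of this kind assumes a /64.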

The 64-bit "mattress tag" is one of the cute historical quirks of IPv6. Of course, in practice we use all sorts of longer prefixes for the same reasons we do in IPv4: loopback IPs, limiting the scope of infrastructure links and server subnets, the many uses of more-specific static routes, and null routes (including the very important /128 DDoS blackhole).

The good news is that vendors recognized the need to efficiently route all 128 bits. Is there any known platform that does not? I'm starting to think this is an ancient myth that keeps resurfacing.

- Kevin
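The uses of longer prefixes listed above all rely on ordinary longest-prefix matching across the full 128 bits. Here is a small software sketch of that behaviour, with a /128 blackhole route overriding the /64 that contains it. The prefixes and action labels are invented for illustration, and the linear scan is for clarity only; it is not how a hardware forwarding table works.

```python
import ipaddress
from typing import Dict, Optional


def longest_prefix_match(table: Dict[str, str], destination: str) -> Optional[str]:
    """Return the action of the most specific route covering the destination."""
    dest = ipaddress.ip_address(destination)
    best_len = -1
    best_action = None
    for prefix, action in table.items():
        net = ipaddress.ip_network(prefix)
        if dest in net and net.prefixlen > best_len:
            best_len = net.prefixlen
            best_action = action
    return best_action


# Hypothetical routing table using documentation address space.
routes = {
    "2001:db8:0:1::/64": "server-subnet",
    "2001:db8:0:ff::/126": "infrastructure-link",
    "2001:db8:0:1::bad/128": "null-route (blackhole)",
}

for dst in ("2001:db8:0:1::10", "2001:db8:0:1::bad", "2001:db8:0:ff::1"):
    print(dst, "->", longest_prefix_match(routes, dst))
```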

Ray Soucy

9:13 p.m.

On Thu, Dec 29, 2011 at 2:03 PM, Kevin Loch <kloch@kl.net> wrote:

...

The 64 bit "mattress tag"

This phrase made my year. -- Ray Soucy Epic Communications Specialist Phone: +1 (207) 561-3526 Networkmaine, a Unit of the University of Maine System http://www.networkmaine.net/
