Re: [Attendee] Rogue RA

17 Jun 2009

In a message written on Tue, Jun 16, 2009 at 06:25:15PM -0400, Tom Pusateri wrote:
> ...
> Shouldn't we see the same problem with rogue DHCP servers in v4?

There are two fundamental differences.  I've had long discussions with
some IETF folk about them.

* A rogue DHCP server does not break a working connection immediately.

  A common problem is someone booting their laptop with a 3G card still
  in it, set up for connection sharing from wherever they were the day
  before, or similar.  They often find their own connectivity broken or
  slow within a matter of minutes, pull the card out, and reconfigure.
  Let's even assume they were running a DHCPv4 server on it.

  In IPv4+DHCP, if you were already up on the LAN, the fact that a new
  DHCP server came up is uninteresting until you need to renew.  Since
  most leases are 7-30 days, and the broken laptop is in the broken
  config for a matter of minutes, it's quite unlikely anyone will be
  broken, and if anyone is, it is one or two folks.

  In IPv6+RA's, the second that laptop boots it sends RA's, and
  depending on a number of things that may break everyone on the LAN in
  a matter of milliseconds.  Worse, when the person realizes and yanks
  out the 3G card, the software does not withdraw the route, so the
  folks who were broken in milliseconds are now waiting for a two hour
  timeout during which their packets are blackholed.

* IPv6 has the assumption you want multiple addresses.

  Raise your hand if you've accidentally plugged an ethernet cable into
  the wrong port on a switch before.  *looks*  Yeah, I think that's
  everyone.

  In IPv4+DHCP, if I take two working subnets and accidentally connect
  (bridge) them, generally nothing happens.  Even DHCP renewals are
  unaffected.  New clients may select the wrong network, but that's only
  an issue in highly dynamic networks.  Unplugging the cable instantly
  returns everything to a working state.

  In IPv6+RA's, if I take two working subnets and accidentally bridge
  them, within milliseconds the hosts detect this and every host
  numbers itself on both networks.  Worse, if I then unplug the cable
  because I realized 20 seconds later it was the wrong port on the
  switch, /I remove the ability for RA withdrawals/ and every host in
  both networks is back to a two hour expiration.
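
To make "every host numbers itself on both networks" concrete, here is
a minimal Python sketch of a host autoconfiguring on a bridged segment.
The prefixes (taken from the 2001:db8::/32 documentation range), the
interface identifier, and the function name are all illustrative
assumptions, and the 7200 second figure is simply the two hour
expiration mentioned above.

```python
import ipaddress

TWO_HOURS_S = 2 * 60 * 60  # the "two hour expiration" from the text above


def slaac_address(prefix: str, interface_id: int) -> ipaddress.IPv6Address:
    """Combine an advertised /64 prefix with a 64-bit interface identifier."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("this sketch only handles /64 prefixes")
    if not 0 <= interface_id < 2**64:
        raise ValueError("interface identifier must fit in 64 bits")
    return ipaddress.IPv6Address(int(net.network_address) | interface_id)


# Hypothetical prefixes advertised by the routers of the two subnets.
SUBNET_A = "2001:db8:a::/64"
SUBNET_B = "2001:db8:b::/64"
HOST_IID = 0x021122FFFE334455  # made-up interface identifier

# Before the mistake, the host hears only its own router.
before = [slaac_address(SUBNET_A, HOST_IID)]

# Once the subnets are bridged, it hears both routers and configures both.
bridged = [slaac_address(p, HOST_IID) for p in (SUBNET_A, SUBNET_B)]

for addr in bridged:
    print(addr, "may stay configured for up to", TWO_HOURS_S, "s after unplugging")
```

The point of the sketch is only that the second address appears without
any action by the host's owner, and nothing on the wire removes it once
the cable is pulled.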

The end result of all this is that at a theoretical level these
look like similar issues.  DHCP is unauthenticated.  RA's are
unauthenticated.  People run rogue versions of both.  However, at
an operational level the RA model causes more outages, causes them
faster, and makes them last longer.  In that sense, I've taken to
describing IPv6+RA's as less /robust/ than IPv4+DHCP in average
deployments.
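
A rough back-of-the-envelope version of that comparison, as a Python
sketch.  The LAN size and the ten minute rogue window are made-up
numbers, and the DHCP side assumes each host talks to DHCP about once
per lease period with those events spread evenly over time; that is a
simplification for illustration, not a description of any particular
client.

```python
def dhcp_fraction_exposed(lease_days: float, rogue_minutes: float) -> float:
    """Crude upper bound: share of hosts whose lease event falls in the window.

    Assumes one DHCP exchange per host per lease period, spread evenly
    over time (an assumption of this sketch).
    """
    lease_minutes = lease_days * 24 * 60
    return min(1.0, rogue_minutes / lease_minutes)


def ra_fraction_exposed() -> float:
    """Worst case from the text: every host on the LAN accepts the rogue RA."""
    return 1.0


HOSTS = 500               # hypothetical LAN size
ROGUE_MINUTES = 10        # hypothetical time the misconfigured laptop is up
RA_TIMEOUT_MINUTES = 120  # the two hour timeout from the text

for lease_days in (7, 30):
    hit = HOSTS * dhcp_fraction_exposed(lease_days, ROGUE_MINUTES)
    print(f"DHCPv4, {lease_days}-day lease: about {hit:.2f} hosts exposed")

print(f"RA: up to {HOSTS * ra_fraction_exposed():.0f} hosts exposed, "
      f"broken for up to {RA_TIMEOUT_MINUTES} minutes after removal")
```

With these numbers the DHCPv4 figure comes out to less than one host,
while the RA worst case is the whole LAN.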

What is the solution?  Well, there are, in my mind, three basic tracks,
and I think it's important we go down all of them:

* "Secure RA".   There is an effort to secure, using crypto, the RA's
  in some useful way.  I have some concerns, but that is good work and
  should be continued.

* Make it possible to run DHCPv6 in the same way as DHCPv4, that is, to
  hand out a default router.  Today DHCPv6 relies on RA's to work; in
  essence you can't turn off RA's and have a working network.  There
  are plenty of folks experienced with IPv4+DHCPv4, and for a lot of
  deployments it is "robust enough".  The fact that this option has
  been removed is a major step backwards in my mind.  DHCPv6 needs to
  be updated to allow operating an "RA-less" network, which probably
  also means finishing VRRPv6.  I know IPv6 folks hate operators who
  say "make it work like IPv4", but in this case it is an option that
  should be on the table.

* Enhance LAN equipment to track RA's.  Those who need a more secure
  environment today prevent DHCP servers on "user" ports via DHCP
  snooping.  They make sure a host can only source IP packets from the
  address the DHCP server gave it.  There's a set of 3-5 features in
  layer 2 switches to "lock down" dynamic addressing.  The same is
  needed for IPv6: RA snooping, preventing user ports from advertising
  RA's, etc.
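
As a sketch of what "preventing user ports from advertising RA's" could
look like, here is the filtering decision in Python.  This is not any
vendor's implementation: the port role names and function names are
hypothetical, and the parser assumes the ICMPv6 header directly follows
the 40 byte IPv6 header (no extension headers), which a real filter
could not take for granted.

```python
IPV6_HEADER_LEN = 40        # fixed IPv6 header size in bytes
NEXT_HEADER_OFFSET = 6      # position of the Next Header field
ICMPV6 = 58                 # Next Header value for ICMPv6
ROUTER_ADVERTISEMENT = 134  # ICMPv6 type for a Router Advertisement


def is_router_advertisement(ipv6_packet: bytes) -> bool:
    """True if the packet is an RA, assuming no IPv6 extension headers."""
    if len(ipv6_packet) <= IPV6_HEADER_LEN:
        return False
    if ipv6_packet[0] >> 4 != 6:  # IP version nibble
        return False
    if ipv6_packet[NEXT_HEADER_OFFSET] != ICMPV6:
        return False
    return ipv6_packet[IPV6_HEADER_LEN] == ROUTER_ADVERTISEMENT


def should_forward(ipv6_packet: bytes, port_role: str) -> bool:
    """Drop RA's arriving on "user" ports; forward everything else.

    port_role is a hypothetical label ("user" or "router") that the
    operator would assign per switch port.
    """
    if port_role == "user" and is_router_advertisement(ipv6_packet):
        return False
    return True
```

The same shape as DHCP snooping: the operator marks which ports are
allowed to speak for the network, and the switch drops the control
traffic everywhere else.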

Anyone who's been to a NANOG or IETF meeting has seen the issue with
rogue RA's.  Indeed, on a day-to-day basis I would say this is the only
thing left that I consider a significant impediment to IPv6
deployment, and the one place where IPv6 is significantly worse than
IPv4 in all of the typical cases.

-- 
       Leo Bicknell - bicknell@ufp.org - CCIE 3440
        PGP keys at http://www.ufp.org/~bicknell/