Re: The Next Big Thing: Named-Data Networking

5 Sep 2014

      Interesting, here are my speaking notes from a talk I gave at
Hackerspace/SG in June 2011. Slightly different but in a similar
spirit depending on your sense of "similar":

                            THE END OF DNS

A Quick History

The internet uses host names, but routing is usually done on numeric
IP addresses.

For example world.std.com is 192.74.137.5 in dotted notation, or
0xC04A8905 in hex, or 11000000010010101000100100000101 in binary, or
3,226,110,213 in decimal.
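
A quick way to check those conversions, as a minimal sketch using
Python's standard ipaddress module (the variable names are mine, just
for illustration):

```python
import ipaddress

# The same 32-bit address written four ways.
addr = ipaddress.IPv4Address("192.74.137.5")
n = int(addr)                      # the address as a plain integer

dotted = str(addr)                 # dotted notation
hexed = f"0x{n:08X}"               # 8 hex digits
binary = format(n, "032b")         # 32 bits, zero padded
decimal = format(n, ",")           # decimal with thousands separators

for label, value in [("dotted", dotted), ("hex", hexed),
                     ("binary", binary), ("decimal", decimal)]:
    print(f"{label:8} {value}")
```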

The name world.std.com is itself just integers. Packed two characters
to a 16-bit word and shown in octal it can be looked at as:

          w   o   r   l   d   .   s   t   d   .   c   o   m
          073557  071154  062056  071564  062056  061557  066400
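
The octal words above can be reproduced with a short sketch. It
assumes ASCII, two characters per 16-bit word with the first character
in the high byte, and a NUL pad when the name has an odd length; the
function name octal_words is my own.

```python
def octal_words(name: str) -> list[str]:
    """Pack a host name two ASCII characters per 16-bit word, in octal."""
    data = name.encode("ascii")
    if len(data) % 2:              # odd length: pad with a NUL byte
        data += b"\0"
    words = []
    for i in range(0, len(data), 2):
        word = (data[i] << 8) | data[i + 1]   # first char in the high byte
        words.append(format(word, "06o"))     # 6 octal digits, zero padded
    return words

print("  ".join(octal_words("world.std.com")))
```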

So, we need to go from the host name to the ip address and back
reliably.

Originally we used a host table, which was simply a text file. On
unix-like systems you still see a remnant of it in /etc/hosts, with
entries like

   127.0.0.1    localhost
   192.74.137.5 world.std.com world

with other information spread across /etc/nets and /etc/gateways. A
snippet of the distributed format from RFC 810 is:

   NET : 18.0.0.0 : LCSNET :
   GATEWAY : 10.0.0.77, 18.8.0.4 : MIT-GW :: MOS : IP/GW :
   HOST : 10.0.0.73 : SRI-NIC,NIC : FOONLY-F3 : TENEX :
       NCP/TELNET,NCP/FTP, TCP/TELNET, TCP/FTP :
   HOST: 10.2.0.11 : SU-TIP,FELT-TIP :::

This was downloaded from a single host, often at midnight, and there
was a unix program (hosttable) to reformat the information for Unix.

Yes, every host on the internet downloaded a common file, often every
night.

This wouldn't scale well so Paul Mockapetris devised "DNS", the Domain
Name System.

The idea is very simple: each site would be responsible for its own
domain and would respond to simple remote requests for name to ip
address mappings or back again.

There would be a root, or multiple roots, which would respond to
requests to locate who should be asked about a domain, for example if
you want to know the ip address for world.std.com the conversation
goes roughly:

   (To Root Server):       Where is the COM server?
   (From Root Server):     SOMEHOST
   (TO SOMEHOST):          Where is the STD.COM server?
   (From SOMEHOST):        192.74.137.112
   (TO 192.74.137.112):    WHAT IS WORLD.STD.COM's IP ADDRESS (A RECORD)?
   (FROM 192.74.137.112):  192.74.137.5
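
That conversation can be mimicked with a toy, in-memory sketch. No
real DNS packets are sent; each "server" is just a dictionary, and the
SERVERS layout and the resolve() function are assumptions of mine for
illustration only.

```python
# Toy delegation data: who refers to whom, and who holds the A record.
SERVERS = {
    "ROOT": {"referrals": {"com": "SOMEHOST"}, "answers": {}},
    "SOMEHOST": {"referrals": {"std.com": "192.74.137.112"}, "answers": {}},
    "192.74.137.112": {"referrals": {},
                       "answers": {"world.std.com": "192.74.137.5"}},
}

def resolve(name, server="ROOT", max_hops=16):
    """Follow referrals from the root until some server answers."""
    name = name.lower().rstrip(".")
    asked = []
    for _ in range(max_hops):
        asked.append(server)
        zone = SERVERS[server]
        if name in zone["answers"]:
            return zone["answers"][name], asked
        # Pick the most specific referral whose domain is a suffix of name.
        matches = [d for d in zone["referrals"]
                   if name == d or name.endswith("." + d)]
        if not matches:
            raise LookupError(f"{server} knows nothing about {name}")
        server = zone["referrals"][max(matches, key=len)]
    raise RuntimeError("too many referrals")

address, asked = resolve("world.std.com")
print(address, "via", " -> ".join(asked))
```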

It's amazing, to me, that it works let alone so quickly!

But let's examine this, WHY do this mapping?

1. Computationally / Memory efficient

2. Sometimes IP changes, host name can be more stable.  One reason to
   change IP is change of ISP, or LAN re-organization.

3. DNS Tricks! load balancing (e.g., round-robin), failover,
   content caching and distribution.

4. Multiple interfaces

5. Aliases

We also overload some other functions on DNS such as who is this
host's mail server or their SPF information. But let's stick to host
to ip address mapping.

THE BIG IDEA:

Why not just use the host names as ip addresses? They're integers.

In IPv6 addresses are rather long integers, 16 bytes.

Looking thru hundreds of millions of web server log records I found
host names to be about 32 bytes long, including dots.

So sending the host name as the ip address is not an enormous
expansion over current plans for IPv6, about double on average, though
variable length must be accommodated for long host names, up to 1K or
thereabouts.

Routing by ip addresses is really very simple:

A router looks at some number of bits of an IP address, the "network"
portion. Either it knows where to hand this packet next (another
router, or the actual host and we're done), or it only knows that any
network which isn't in its table needs to be handed to a "default"
router neighbor, and with luck the packet will eventually find a router
which knows what to do with it.

A router might use a varying number of bits to determine the best
router to send a packet to next.

At the "center" there are routers which have NO default: either they
know how to get your packet on its way, or they have no route and the
packet is discarded.
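
Here is a minimal sketch of that lookup, assuming a plain dictionary
as the routing table and a linear scan (real routers use far faster
structures). The prefixes and the hop names router-b, local-lan and
upstream are made up for illustration.

```python
import ipaddress

def next_hop(table, dest, default=None):
    """Return the hop for the longest matching prefix, else the default.

    A router with no default (default=None) returns None for an unknown
    destination, meaning the packet is discarded.
    """
    addr = ipaddress.ip_address(dest)
    best_net, best_hop = None, default
    for prefix, hop in table.items():
        net = ipaddress.ip_network(prefix)
        # More network bits means a more specific, better match.
        if addr in net and (best_net is None
                            or net.prefixlen > best_net.prefixlen):
            best_net, best_hop = net, hop
    return best_hop

TABLE = {
    "192.74.0.0/16": "router-b",      # 16 network bits
    "192.74.137.0/24": "local-lan",   # 24 network bits, more specific
}

print(next_hop(TABLE, "192.74.137.5", default="upstream"))
print(next_hop(TABLE, "192.74.1.1", default="upstream"))
print(next_hop(TABLE, "10.1.2.3", default="upstream"))
print(next_hop(TABLE, "10.1.2.3"))    # no default: discard
```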

Really very simple, all the complexity is in keeping the information
in each router current. This turns out to be not only an information
distribution problem but also an information distribution STABILITY
problem.

But it's not our problem today.

All we need concern ourselves with is that there is a host name to ip
addr mapping and these ip addresses are used in routing packets,
that's the point.

But why not just use the host name and skip the mapping?

FQDNs are hierarchical, we can pick them apart and start routing a
packet looking to go to world.std.com by routing to (or towards) a COM
router, then a STD.COM router, and finally hand it off to
WORLD.STD.COM.
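
As a rough sketch of what that might look like (not a worked-out
design): each router keeps a table keyed by domain suffix and picks
the longest suffix it knows. The router names, the ROUTERS layout and
the DELIVER marker are all hypothetical.

```python
def name_next_hop(table, fqdn, default=None):
    """Longest-suffix match: try world.std.com, then std.com, then com."""
    labels = fqdn.lower().rstrip(".").split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in table:
            return table[suffix]
    return default

# Each router only knows the next level down.
ROUTERS = {
    "core": {"com": "com-router"},
    "com-router": {"std.com": "std-router"},
    "std-router": {"world.std.com": "DELIVER"},   # last hop
}

def trace(fqdn, start="core", max_hops=16):
    """List the routers a packet for fqdn passes through."""
    path, here = [start], start
    for _ in range(max_hops):
        hop = name_next_hop(ROUTERS[here], fqdn)
        if hop is None:
            raise LookupError(f"{here} has no route to {fqdn}")
        if hop == "DELIVER":
            return path
        path.append(hop)
        here = hop
    raise RuntimeError("routing loop?")

print(" -> ".join(trace("world.std.com")))
```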

No doubt the devil is in the details, but tonight we're interested in
whether it's worthwhile working out those details. Are there fatal
flaws?

One complaint might be computational complexity. Integers are easy to
put into tables and use as indexes. Most real routers even have
specially built memory to speed this up.

Then again the 70s are over, the 80s are over, the 90s are over, the
2000s are over, computers are fast and getting faster and parallelism
(such as multiple cores and threads) is commodity as are relatively
large memories.

If the average host name is about 32 characters and there are about a
billion hosts then it takes around 32GB to hold all that information,
maybe twice that with table overhead, 64GB. I can buy 64GB flash
drives for around $100! They're too slow but I hope you get my point.
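
The back-of-envelope arithmetic, spelled out. The factor of two for
table overhead is the same guess as above, not a measurement.

```python
AVG_NAME_BYTES = 32            # average host name length, including dots
HOSTS = 1_000_000_000          # about a billion hosts
OVERHEAD_FACTOR = 2            # rough guess at table overhead

raw_gb = AVG_NAME_BYTES * HOSTS / 1e9       # decimal gigabytes
total_gb = raw_gb * OVERHEAD_FACTOR

print(f"names alone:   {raw_gb:.0f} GB")
print(f"with overhead: {total_gb:.0f} GB")
```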

And, besides, you only need to hold each network portion once in a
router's memory, not for every host:

  COM
   THEWORLD  192.74.137.0
    SHELL01  71
    DNS      112

that's simple.

To search the table the router could use a perfect hash function or as
close to that as we really need.

It would probably be better if we all agreed on one or a few hash
functions but it's not necessary, it's only used inside a router, but
it might make debugging easier.
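
A sketch of the table above as nested hash tables. Python's dict is an
ordinary hash table standing in for the perfect hash, and the code
assumes three-label names and a network portion of three octets; both
are simplifications of mine, as is the lookup() name.

```python
# The network portion is stored once per site; hosts keep only the last octet.
TABLE = {
    "COM": {
        "THEWORLD": {
            "network": "192.74.137.0",
            "hosts": {"SHELL01": 71, "DNS": 112},
        },
    },
}

def lookup(fqdn):
    """Map host.site.tld to a dotted address using the nested table."""
    host, site, tld = fqdn.upper().split(".")
    entry = TABLE[tld][site]
    prefix = entry["network"].rsplit(".", 1)[0]   # drop the trailing .0
    return f"{prefix}.{entry['hosts'][host]}"

print(lookup("shell01.theworld.com"))
print(lookup("dns.theworld.com"))
```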

Bazinga! No DNS!

But what about our list of uses of host to ip mappings?

1. Computationally / Memory efficient

   Who cares?

2. IP changes?

   Build it into ICMP and BGP infrastructure, that's a routing
   problem.

   We already have another system, ARP, which deals with similar
   problems to map IP to MAC addresses.

3. DNS Tricks!

   Trix are for kids. But, again, a routing problem.

4. Multiple interfaces

   Same sort of problem, mostly a last hop problem.

5. Aliases

   Still a last hop problem.

What are the problems?

What do we gain?

We get rid of this huge, noisy, complex infrastructure.

We still need registries and registrars because we still need to file
who "owns" a host name.

But we can get rid of the entire RIR structure, the five regional
organizations which hand out IP blocks, usually for $1000 or more per
year depending on the number of bits in the network part (fewer bits
is more expensive.)

Well, they could still coordinate some routing functions, ASNs, etc.

No DNS, no DNS attacks!

To me this seems more secure tho that's a dangerous conjecture to make.

But we have removed a rather public, distributed target and put most
of the function in the routing infrastructure directly which tends to
be more secure. For example, you don't accept routing updates from
anyone, only trusted hosts. And in the near future we can expect even
that to be "signed".

Speaking of signed, no DNSSEC! DNSSEC is a fairly simple concept,
sign DNS information exchanges using public key cryptography, with a
rather complex operational overhead such as key updates and
revocations. Gone!

I've discussed this on very technical (private) mailing lists with the
sort of people who built the MSN infrastructure, Morgan Stanley (no
more than 100msecs downtime PER YEAR!), Google, Vonage, etc.

Worst complaint: We're so accustomed to thinking in terms of DNS that
there must be SOMETHING wrong with your idea!!

A few thought it was great and made reference to other discussions
over the years which were somewhat similar tho not quite as sweeping.

SO WHAT IS WRONG?

-- 
        -Barry Shein

The World              | bzs@TheWorld.com           | http://www.TheWorld.com
Purveyors to the Trade | Voice: 800-THE-WRLD        | Dial-Up: US, PR, Canada
Software Tool & Die    | Public Access Internet     | SINCE 1989     *oo*