Re: Geographic v. topological address allocation

21 Nov 1997

      "Kent W. England" <kwe@geo.net> writes:
...
This is true, but the definition of the top of the hierarchy is arbitrary
and is the nexus of the debate about "topological" versus geographical
addressing, which I interpret as "ISP at top" versus "exchange point at
top" hierarchies. Both are valid topological hierarchies.

As tli pointed out, the top of the hierarchy is not
arbitrary; it must be default-free.

In a hierarchical routing system there are three
forwarding directions to consider: intra-area ("lateral"),
default ("upwards") and sub-area ("aggregate" or "downwards").

At the top of a hierarchy you cannot have an upwards
forwarding direction; therefore the entire address space
must be intra-area or presented as an aggregate.

Consider an addressing structure that looks like this:

      level-3-area-id:level-2-area-id:level-1-area-id:final-flat-id

In an internetwork with three levels of hierarchy, this
pattern is easy to reason about.
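
To make the three forwarding directions concrete, here is a
minimal sketch in Python.  The function name, the tuple
representation of addresses and the sample area ids are all
illustrative assumptions, not part of any real protocol.  It
also assumes end systems live only in level-1 areas, and it
models a top-level router with an empty prefix, so nothing is
ever "upwards" for it.

```python
def forwarding_direction(router_prefix, dest):
    """Classify how a router forwards toward dest.

    router_prefix: tuple of area ids naming the router's own area,
                   outermost first, e.g. ("A", "B") for a level-2 area.
                   The empty tuple models the default-free top (assumption).
    dest:          full address as a tuple
                   (level-3-id, level-2-id, level-1-id, flat-id).
    """
    n = len(router_prefix)
    if tuple(dest[:n]) != tuple(router_prefix):
        # Destination is outside this area: hand it to the level above.
        return "default (upwards)"
    if n == len(dest) - 1:
        # A level-1 area has no sub-areas; only the flat id remains.
        return "intra-area (lateral)"
    # Destination lies inside one of this area's sub-areas.
    return "sub-area (downwards) via " + str(dest[n])
```

A router in level-2 area ("A", "B") would send traffic for
("A", "B", "C", "h1") downwards via "C", and traffic for
("A", "X", "C", "h1") upwards.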

A level-2 router may have some things directly attached to
its level-2 area, including its peer routers, and it would
carry routes towards them, which probably would be in a
flat routing table.  Among the reasons it needs these
routes is that it has to know where to send traffic
towards each of the level-1 areas that are attached to its
own area, and it has to know where to send "default"
traffic towards one or more of its in-area peers that have
level-3 connections.

Each such level-3 router would have to know how to forward
to any given level-2 area, and therefore would need to
carry routes for each level-3/level-2 gateway.

Each level-1 router, by contrast, only needs to know how
to route towards all the things in its area, and how to
reach at least one level-2 router.

(One can be a little tricky and have a single level-1 area
connected to multiple level-2 routers in different level-2
areas, in which case better routing optimality may be
obtained by the level-1 router carrying some level-2
routing information.  This would be analogous to Yakov
Rekhter's "route pull".)

However, the minimum set of routes to carry is that which
can cause traffic to be forwarded along a strict
single-path tree-like hierarchy.  This requires that each
area be fully contiguous at all times.  Other routes may
be introduced in various places to alter this behaviour if
that is desirable, or to effect IS-IS style partition repair.

In order for the hierarchical routing system to scale, the
number of entities known in any given area must be small
enough to route on in what is conceptually a flat manner.
That means that there are bounds on the number of
level-n-minus-one areas attached to any level-n area, and
this in turn requires that the addressing scheme allow for
a deep enough hierarchy.
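
A rough back-of-the-envelope sketch of that trade-off, in
Python.  The function names and the sample figures are
assumptions chosen for illustration, and the model is
deliberately crude: every area holds at most max_per_area
entities, and a non-top router carries one extra default
route.

```python
def levels_needed(endpoints, max_per_area):
    """Smallest depth such that max_per_area ** depth >= endpoints."""
    if max_per_area < 2:
        raise ValueError("an area must be able to hold at least 2 entities")
    depth, capacity = 1, max_per_area
    while capacity < endpoints:
        capacity *= max_per_area  # one more level multiplies the reach
        depth += 1
    return depth


def table_bound(max_per_area, is_top=False):
    """Upper bound on routes held by one router under this crude model."""
    # The top of the hierarchy is default-free, so it has no upward route.
    return max_per_area + (0 if is_top else 1)
```

Under these assumptions, a billion endpoints with at most
1000 entities per area need 3 levels; cap areas at 100
entities and the same population needs 5.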

Consequently, the number of things in the top,
default-free hierarchy is always going to be limited, no
matter what "type" of hierarchical allocation scheme is
proposed.

The further requirement that any given area be fully
contiguous means that the "top" of the hierarchy must be 
self-repairing.

In other words, as tli pointed out, if you have a switch
in some convenient geographical location, like the
Greenwich Observatory or the UN building or MAE-EAST, your
entire routing system fails if that switch or that
location fails.

Consequently, to avoid the single point of failure, there
would be a desire to have several diverse geographical
locations to act as the top of the hierarchy.

The problem, again, is that any given area must be fully
contiguous, and this implies that any level-n router
connecting to one of these diverse locations would have
connectivity to every other level-n router, so that this
top level n area would be contiguous.
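
The contiguity requirement can be checked mechanically.  The
following Python sketch is only an illustration: the link
list format, the router names and both function names are
assumptions.  It treats the top-level area as an undirected
graph and reports every node whose loss would leave the rest
of the area non-contiguous.

```python
from collections import deque


def is_contiguous(links, routers):
    """True if all routers can reach each other using only links
    whose both ends are in `routers` (breadth-first search)."""
    routers = set(routers)
    if not routers:
        return True
    start = next(iter(routers))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for a, b in links:
            if a == node:
                nxt = b
            elif b == node:
                nxt = a
            else:
                continue
            if nxt in routers and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen == routers


def single_points_of_failure(links, routers):
    """Nodes whose removal partitions the remaining area."""
    routers = set(routers)
    return sorted(r for r in routers
                  if not is_contiguous(links, routers - {r}))
```

With three routers hanging off one switch "X" and no other
links, "X" is reported as a single point of failure.  Add a
ring among the three routers and the list comes back empty.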

One could propose to implement this as a big bridged
network.  The original DGIX proposal was along these
lines.  Operational experience with much smaller but still
big bridged exchange points has demonstrated pretty much
conclusively that this is a Really Really Bad Idea.

One could propose to implement this using the native
protocol, effectively connecting all of these exchange
points into a level-n-plus-one area of its own.  As long
as the level-n-plus-one area could route to all the
level-n areas at all times, this would work just nicely,
on a technical basis.

The difference, therefore, between your "ISP at top" option
and your "exchange point at top" option in a hierarchical
addressing system, which is the only way we currently know
how to scale a global internetwork, is merely in the choice
of words.  Whatever is at the top has
to connect reliably and continuously all the things that
are one step down from the top, and simple
belt-and-suspenders implies geographical diversity.

Thus, the top of the hierarchy may be expressed as a big,
geographically diverse bridged network connecting all the
"next-level-down" routers, a single big geographically
diverse routed network comprising a single area, or a
meshed concatenation of the "next-level-down" routers in
such a way that robust interconnectivity among them is
maintained at all times.

The choice is probably best made on the basis of
reliability and cost, but experience shows that it is
likelier made on the basis of politics, autonomy/mistrust
of other operators, marketing goals, and possibly cost.

If it were possible for two routing areas to cleanly
synthesize a next-level-up area, which probably implies
the use of variable length addresses, then it strikes me
intuitively that a better routing hierarchy than is likely
to be cobbled together through the deployment of physical
infrastructure can be enjoyed, keeping the number of
routing entries needed by any given router anywhere in
such an Internet to a minimum.

With a variable length addressing scheme, in other words,
one can consider a set of operations which can be
summarized as "make-hierarchical" or "make-lateral".
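
One possible reading of those two operations, sketched in
Python.  This is purely hypothetical: the function
signatures, the tuple representation of variable length
addresses and the sample ids are my assumptions, not a
worked-out design.

```python
def make_hierarchical(parent_id, *areas):
    """Synthesize a next-level-up area: prepend parent_id to each
    area prefix, so the areas become siblings under one parent."""
    return [(parent_id,) + tuple(area) for area in areas]


def make_lateral(area):
    """Inverse operation: drop the outermost id, so the area becomes
    a peer at the level its parent used to occupy."""
    if not area:
        raise ValueError("area has no enclosing level to remove")
    return tuple(area[1:])
```

For example, make_hierarchical("P", ("A",), ("B", "C")) gives
the prefixes ("P", "A") and ("P", "B", "C"), which only works
because the addresses are allowed to differ in length.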

The obvious and increasingly important first baby step in
an evolutionary path towards a scalable Internet is, to
quote Noel Chiappa, "to make the world safe for NAT, by
making all end-to-end functions use the DNS name; e.g. for
authentication, pseudo-headers for checksums, etc, etc."
As he continued, this can be justified solely on the basis
of working better with NAT, which solves some real-world
problems now, and which is being used now.

There is lots more to discuss.  Is big-internet still in
post-Bass trauma?  If not, let's discuss it there, or
privately.

However, to tie in some tiny degree of NANOG relevance,
and to emphasise through repetition, the idea of using a
large bridged network has been shown to be broken throughout
the history of exchange points, particularly since the lovely
days when people didn't learn from Milo's FIX upgrade path and
began doing multimedia bridging.  Single exchange points
fail, so avoiding the large bridged network by having a
single exchange point be the "top" of a hierarchy won't
work.  Therefore, the current hierarchy implemented in
provider-based addressing with some coordination to
preserve some degree of geographic alignment of addresses
(through ARIN, RIPE and APNIC, and large-ISP allocation
strategies), is almost certainly the most appropriate one.

That is to say, we got CIDR pretty much right.

	Sean.

Sean Doran