Re: Scalability issues in the Internet routing system

24 Oct 2005

One question: what percentage of the routing table of any particular router is
REALLY used, say, during one week?

I have a strong impression that the answer will be no more than 20% even in
the biggest backbones, and will (more likely) be below 1% in the rest of the
world. That leaves a huge space for optimization.
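
One way to put a number on that question is sketched below. It assumes you can export the prefix list and a sample of destination addresses seen over the measurement window (from flow records, for example); the prefixes, addresses, and function names are made up for illustration and are not taken from any real router.

```python
import ipaddress


def longest_match(addr, prefixes):
    """Return the most specific prefix containing addr, or None."""
    hits = [p for p in prefixes if addr in p]
    return max(hits, key=lambda p: p.prefixlen, default=None)


def used_fraction(prefixes, observed_destinations):
    """Fraction of prefixes that were the longest match for some destination."""
    used = set()
    for dst in observed_destinations:
        match = longest_match(ipaddress.ip_address(dst), prefixes)
        if match is not None:
            used.add(match)
    return len(used) / len(prefixes)


# Hypothetical toy table and traffic sample (documentation/private ranges).
table = [ipaddress.ip_network(p) for p in (
    "10.0.0.0/8", "10.1.0.0/16", "192.0.2.0/24",
    "198.51.100.0/24", "203.0.113.0/24",
)]
traffic = ["10.1.2.3", "10.1.9.9", "192.0.2.7"]

fraction = used_fraction(table, traffic)
print(f"{fraction:.0%} of prefixes saw traffic")
```

The linear scan is only meant to make the definition of "used" explicit; it would be far too slow for a full table.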

----- Original Message ----- 
From: "Daniel Senie" <dts@senie.com>
To: <nanog@nanog.org>
Sent: Tuesday, October 18, 2005 9:50 AM
Subject: Re: Scalability issues in the Internet routing system
...
At 11:30 AM 10/18/2005, Andre Oppermann wrote:
...
I guess it's time to have a look at the actual scalability issues we
face in the Internet routing system.  Maybe the area of action becomes
a bit more clear with such an assessment.
In the current Internet routing system we face two distinctive
scalability issues:
1. The number of prefixes*paths in the routing table and interdomain
   routing system (BGP)
This problem scales with the number of prefixes and available paths
to a particular router/network in addition to constant churn in the
reachability state.  The required capacity for a router's control
plane is:
capacity = prefix * path * churnfactor / second
I think it is safe, even with projected AS and IP uptake, to assume
Moore's law can cope with this.
Moore will keep up reasonably with both the CPU needed to keep BGP
perking, and with memory requirements for the RIB, as well as other
non-data-path functions of routers.
...
2. The number of longest match prefixes in the forwarding table
This problem scales with the number of prefixes and the number of
packets per second the router has to process under full or expected
load.  The required capacity for a router's forwarding plane is:
capacity = prefixes * packets / second
This one is much harder to cope with as the number of prefixes and
the link speeds are rising.  Thus the problem is multiplicative to
quadratic.
Here I think Moore's law doesn't cope with the increase in projected
growth in longest prefix match prefixes and link speed.  Doing longest
prefix matches in hardware is relatively complex.  Even more so for
the additional bits in IPv6.  Doing perfect matches in hardware is
much easier though...
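
To make "multiplicative to quadratic" concrete, here is a small arithmetic sketch of the forwarding-plane formula above. The growth factors are arbitrary illustrations, not projections, and the function names are invented for this example.

```python
def forwarding_capacity(prefixes, packets_per_second):
    """capacity = prefixes * packets / second, as in the formula above."""
    return prefixes * packets_per_second


def capacity_growth(prefix_factor, pps_factor):
    """How much the required capacity grows when each input grows by a factor."""
    return forwarding_capacity(prefix_factor, pps_factor) / forwarding_capacity(1, 1)


# Only the table grows: requirement grows linearly with it.
table_only = capacity_growth(2, 1)
# Table and link speed both grow by the same factor g: requirement grows as g**2.
both = capacity_growth(2, 2)
print(table_only, both)
```

If only one input grows the requirement is linear in it; if both grow together by the same factor, the requirement grows with the square of that factor.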
Several items regarding FIB lookup:
1) The design of the FIB need not be the same as the RIB. There is
plenty of room for creativity in router design in this space.
Specifically, the FIB could be dramatically reduced in size via
aggregation. The number of egress points (real or virtual) and/or
policies within a router is likely FAR smaller than the total number
of routes. It's unclear if any significant effort has been put into this.
2) Nothing says the design of the FIB lookup hardware has to be
longest match. Other designs are quite possible. Again, some
creativity in design could go a long way. The end result must match
that which would be provided by longest-match lookup, but that
doesn't mean the ASIC/FPGA or general purpose CPUs on the line card
actually have to implement the mechanism in that fashion.
3) Don't discount novel uses of commodity components. There are fast
CPU chips available today that may be appropriate to embed on line
cards with a bit of firmware, and may be a lot more cost effective
and sufficiently fast compared to custom ASICs of a few years ago.
The definition of what's hardware and what's software on line cards
need not be entirely defined by whether the design is executed
entirely by a hardware engineer or a software engineer.
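
Items 1 and 2 above can be sketched roughly as follows. This is a toy illustration, not a description of how any shipping router builds its FIB. It assumes IPv4, a table with no prefix longer than /16, and invented next-hop labels. First it drops any prefix whose nearest covering prefix already points to the same next hop, then it expands what is left into a fixed-width table so that lookup becomes one exact match on the top 16 bits. The result is checked against a plain longest-match scan.

```python
import ipaddress

STRIDE = 16  # assumption: no prefix in the toy table is longer than /16


def lpm(addr, table):
    """Reference longest-prefix match by linear scan."""
    hits = [p for p in table if addr in p]
    return table[max(hits, key=lambda p: p.prefixlen)] if hits else None


def prune_redundant(table):
    """Drop prefixes whose nearest covering prefix has the same next hop."""
    kept = {}
    for prefix, hop in table.items():
        cover = None
        for plen in range(prefix.prefixlen - 1, -1, -1):
            parent = prefix.supernet(new_prefix=plen)
            if parent in table:
                cover = table[parent]
                break
        if cover != hop:
            kept[prefix] = hop
    return kept


def build_exact(table, stride=STRIDE):
    """Expand every prefix to stride-bit keys; longer prefixes overwrite shorter."""
    exact = {}
    for prefix in sorted(table, key=lambda p: p.prefixlen):
        if prefix.prefixlen > stride:
            raise ValueError(f"{prefix} is longer than /{stride}")
        first = int(prefix.network_address) >> (32 - stride)
        for key in range(first, first + (1 << (stride - prefix.prefixlen))):
            exact[key] = table[prefix]
    return exact


def exact_lookup(addr, exact, stride=STRIDE):
    """Single exact-match lookup on the top stride bits of the address."""
    return exact.get(int(addr) >> (32 - stride))


# Hypothetical RIB: prefix -> next-hop label.
rib = {ipaddress.ip_network(p): hop for p, hop in [
    ("10.0.0.0/8", "A"), ("10.0.0.0/15", "B"), ("10.1.0.0/16", "A"),
    ("10.2.0.0/16", "A"), ("172.16.0.0/12", "C"),
]}
fib = prune_redundant(rib)   # 10.2.0.0/16 is covered by 10.0.0.0/8 with the same hop
exact = build_exact(fib)

probes = [ipaddress.ip_address(a) for a in (
    "10.0.0.1", "10.1.0.1", "10.2.0.1", "10.200.0.1", "172.16.5.5", "8.8.8.8",
)]
consistent = all(lpm(a, rib) == lpm(a, fib) == exact_lookup(a, exact) for a in probes)
print(len(rib), len(fib), len(exact), consistent)
```

Note that 10.1.0.0/16 has to stay even though 10.0.0.0/8 has the same next hop, because 10.0.0.0/15 sits in between with a different one. The expansion trades memory for lookup simplicity, and a real design would need several strides or some other scheme to handle longer prefixes.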
Finally, don't discount the value and performance of software-based
routers. MPLS was first "sold" as a way to deal with core routers not
handling Gigabit links. The idea was to get the edge routers to take
over. Present CPU technology, especially with good embedded systems
software design, is quite capable of performing the functions needed
for edge routers in many circumstances. It may well make sense to
consider a mix of router types based on port count and speed at edges
and/or chassis routers with line cards that are using general purpose
CPUs for forwarding engines instead of ASICs for lower-volume sites.
If we actually wind up with the core of most backbones running MPLS
after all, well, we've got the technology so use it. Inter-AS routers
for backbones will likely need to continue to be large, power-hungry
boxes so that policy can be separately applied on the borders.
I should point out that none of this really is about scalability of
the routing system of the Internet; it's all about hardware and
software design to allow the present system to scale. Looking at
completely different and more scalable routing would require finding
a better way to do things than the present BGP approach.

Alexei Roudnev