At 11:30 AM 10/18/2005, Andre Oppermann wrote:
I guess it's time to have a look at the actual scalability issues we face in the Internet routing system. Maybe the area of action becomes a bit more clear with such an assessment.
In the current Internet routing system we face two distinctive scalability issues:
1. The number of prefixes*paths in the routing table and interdomain routing system (BGP)
This problem scales with the number of prefixes and available paths to a particular router/network in addition to constant churn in the reachability state. The required capacity for a router's control plane is:
capacity = prefix * path * churnfactor / second
I think it is safe, even with projected AS and IP uptake, to assume Moore's law can cope with this.
Moore will keep up reasonably with both the CPU needed to keep BGP perking, and with memory requirements for the RIB, as well as other non-data-path functions of routers.
2. The number of longest match prefixes in the forwarding table
This problem scales with the number of prefixes and the number of packets per second the router has to process under full or expected load. The required capacity for a router's forwarding plane is:
capacity = prefixes * packets / second
This one is much harder to cope with as the number of prefixes and the link speeds are rising. Thus the problem is multiplicative to quadratic.
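To make "multiplicative to quadratic" concrete, here is a minimal arithmetic sketch of the forwarding-plane formula above. The function name and the growth factors are illustrative assumptions, not measurements or projections.

```python
def capacity_growth(prefix_factor, pps_factor):
    """Relative growth of prefixes * packets/second when each term scales.

    Both factors are hypothetical multipliers (2.0 means "doubles").
    """
    return prefix_factor * pps_factor


# Only the table grows: the required capacity grows linearly with it.
only_prefixes = capacity_growth(2.0, 1.0)

# Table and link speed grow by the same factor: the result is quadratic.
both_double = capacity_growth(2.0, 2.0)
```

With these assumed factors, doubling only the prefix count doubles the requirement, while doubling both prefix count and packet rate quadruples it.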
Here I think Moore's law doesn't cope with the increase in projected growth in longest prefix match prefixes and link speed. Doing longest prefix matches in hardware is relatively complex. Even more so for the additional bits in IPv6. Doing perfect matches in hardware is much easier though...
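One way to see the relationship between longest-prefix match and exact match is that a longest-match lookup can be expressed as a sequence of exact-match probes, one per prefix length, tried from most to least specific. The sketch below is an illustration only, assuming IPv4 and plain Python dictionaries as a stand-in for exact-match tables; `build_tables`, `lookup`, and the sample routes are hypothetical names and data, not a description of any real router.

```python
import ipaddress


def build_tables(routes):
    """Group routes into one exact-match dict per prefix length."""
    tables = {}
    for prefix, next_hop in routes:
        net = ipaddress.ip_network(prefix)
        tables.setdefault(net.prefixlen, {})[int(net.network_address)] = next_hop
    return tables


def lookup(tables, address, width=32):
    """Return the next hop of the longest matching prefix, or None."""
    addr = int(ipaddress.ip_address(address))
    # Probe the most specific prefix lengths first.
    for length in sorted(tables, reverse=True):
        # Keep only the top `length` bits, then do a single exact match.
        mask = ((1 << length) - 1) << (width - length)
        hit = tables[length].get(addr & mask)
        if hit is not None:
            return hit
    return None


# Hypothetical routes: (prefix, next hop label).
routes = [("0.0.0.0/0", "C"), ("10.0.0.0/8", "A"), ("10.1.0.0/16", "B")]
tables = build_tables(routes)
```

Here `lookup(tables, "10.1.2.3")` probes the /16 table first and finds "B", while `lookup(tables, "10.2.0.1")` misses at /16 and falls through to the /8 entry "A". The cost is one probe per distinct prefix length in the worst case, which is the trade-off this approach makes for using only exact matches.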
Several items regarding FIB lookup:

1) The design of the FIB need not be the same as the RIB. There is plenty of room for creativity in router design in this space. Specifically, the FIB could be dramatically reduced in size via aggregation. The number of egress points (real or virtual) and/or policies within a router is likely FAR smaller than the total number of routes. It's unclear if any significant effort has been put into this.

2) Nothing says the design of the FIB lookup hardware has to be longest match. Other designs are quite possible. Again, some creativity in design could go a long way. The end result must match that which would be provided by longest-match lookup, but that doesn't mean the ASIC/FPGA or general purpose CPUs on the line card actually have to implement the mechanism in that fashion.

3) Don't discount novel uses of commodity components. There are fast CPU chips available today that may be appropriate to embed on line cards with a bit of firmware, and may be a lot more cost effective and sufficiently fast compared to custom ASICs of a few years ago. The definition of what's hardware and what's software on line cards need not be entirely defined by whether the design is executed entirely by a hardware engineer or a software engineer.

Finally, don't discount the value and performance of software-based routers. MPLS was first "sold" as a way to deal with core routers not handling Gigabit links. The idea was to get the edge routers to take over. Present CPU technology, especially with good embedded systems software design, is quite capable of performing the functions needed for edge routers in many circumstances. It may well make sense to consider a mix of router types based on port count and speed at edges and/or chassis routers with line cards that are using general purpose CPUs for forwarding engines instead of ASICs for lower-volume sites.
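As a rough illustration of the FIB aggregation idea in item 1, the sketch below drops any entry whose nearest covering prefix already forwards to the same next hop, since a longest-match lookup gives the same answer without it. This is a minimal sketch under stated assumptions: the `aggregate` function, the interface labels, and the sample table are all hypothetical, and real aggregation schemes can also merge sibling prefixes, which this does not attempt.

```python
import ipaddress


def aggregate(fib):
    """Drop entries whose nearest covering prefix has the same next hop.

    `fib` maps prefix strings to next-hop labels. Forwarding results under
    longest-match lookup are unchanged for every address the table covers.
    """
    table = {ipaddress.ip_network(p): nh for p, nh in fib.items()}
    kept = {}
    for net, next_hop in table.items():
        covering_hop = None
        # Walk up from the closest possible parent to the default route.
        for length in range(net.prefixlen - 1, -1, -1):
            parent = net.supernet(new_prefix=length)
            if parent in table:
                covering_hop = table[parent]
                break
        # Keep the entry only if it changes the forwarding decision.
        if covering_hop != next_hop:
            kept[str(net)] = next_hop
    return kept


# Hypothetical FIB: five routes, but only two distinct egress points.
fib = {
    "10.0.0.0/8": "if0",
    "10.1.0.0/16": "if0",
    "10.1.2.0/24": "if1",
    "10.1.3.0/24": "if0",
    "192.0.2.0/24": "if1",
}
small_fib = aggregate(fib)
```

In this example the /16 and the 10.1.3.0/24 entries are redundant because the /8 above them already points at "if0", so the five-entry table shrinks to three. The 10.1.2.0/24 entry stays because it overrides its covering prefix with a different egress.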
If we actually wind up with the core of most backbones running MPLS after all, well, we've got the technology, so use it. Inter-AS routers for backbones will likely need to continue to be large, power-hungry boxes so that policy can be separately applied on the borders.

I should point out that none of this is really about the scalability of the Internet's routing system; it's all about hardware and software design to allow the present system to scale. Looking at completely different and more scalable routing would require finding a better way to do things than the present BGP approach.