A response I received from Mark Tinka on the afnog mailing list. Reposting for the nanog community:

---------- Forwarded message ----------
From: Mark Tinka <mtinka@globaltransit.net>
Date: Mon, Aug 3, 2009 at 7:55 AM
Subject: Re: Weekly Routing Table Report
To: Zartash Uzmi <zartash@gmail.com>
Cc: pfs@cisco.com, afnog@afnog.org

On Monday 03 August 2009 07:29:28 am Zartash Uzmi wrote:

<deleting other mailing lists and such>
Apologies if this is too naive to ask...
Any question is a good question.
but is there some detail available about the items listed in the summary?
This is probably the link you seek: http://thyme.apnic.net/about.html
1) In particular, what exactly is the difference between the "BGP routing table entries examined (292961)" and "Unique aggregates announced to Internet (145391)"?
What this tells us is that, if all the routing entries currently present in the routing table (as seen by the infrastructure that enables this weekly CIDR report) were aggregated, the routing table would have held at most 145,391 entries at the time this report was generated; just a little shy of half of what we're seeing today.
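To make "aggregated" a little more concrete, here is a minimal sketch using Python's standard ipaddress module. The prefixes are made up for illustration and are not taken from the report, and the sketch ignores origin AS and path attributes, which any real aggregation has to respect. It only shows the general idea of merging adjacent prefixes and dropping covered ones.

```python
import ipaddress

# Hypothetical announcements (not from the report): four contiguous /24s,
# one more-specific that they already cover, and one unrelated prefix.
announced = [
    "10.0.0.0/24",
    "10.0.1.0/24",
    "10.0.2.0/24",
    "10.0.3.0/24",
    "10.0.1.128/25",
    "192.168.5.0/24",
]

networks = [ipaddress.ip_network(p) for p in announced]

# collapse_addresses merges adjacent networks and drops covered ones.
aggregates = list(ipaddress.collapse_addresses(networks))

print(len(networks), "entries examined")
print(len(aggregates), "entries after aggregation:", [str(n) for n in aggregates])
```

Here the four /24s and the /25 fold into a single 10.0.0.0/22, so six entries become two.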
2) I believe 292961 is the worst case routing table size for any router.
Well, different networks will see slightly different views (more or less), but yes, the average number each router in the DFZ (default-free zone) is seeing would be just about there.

But if you mean the number of routing entries each router can support, then that's a different issue. This depends on a number of factors, particularly the router's architecture.

<At the risk of massively derailing this thread, little primer on routing architectures follows, for perspective>

Software-based routers will, generally, hold as many routing entries as their RAM can support. A number of software-based routers today support 2GB of RAM, e.g., Cisco's 7201 or 7206-VXR/NPE-G2, or Juniper's J4350 and J6350 routers.

The limitation, obviously, is that because packet forwarding occurs in the CPU path (a software interrupt process, hence the name "software-based routers"), there is a finite amount of traffic software-based platforms can forward, especially with other features enabled, before they run out of ideas, so to speak :-).

Hardware-based routers, on the other hand, separate the handling of routing entries from the forwarding of traffic, in a somewhat distributed fashion. Hardware-based routers typically have what are known as control planes and data planes.

Control planes handle management and similar housekeeping functions, which include BGP. This function is generally similar to what you see in software-based routers, in that RAM is abundant (up to 4GB in today's largest systems) to hold as many routing entries from as many paths as possible.

Where hardware-based routers differ from their software-based cousins is in how routing entries are used to forward traffic. Once BGP has chosen the best path to all destinations in the control plane, those best paths are then "downloaded", if you will, to the router's data plane, which is normally a special chip that has a single function - forward traffic as quickly as possible. Dumb, but very efficient.
The vendors call these ASICs (Application Specific Integrated Circuits) or Programmable Chips. These ASICs or Programmable Chips usually hold only one copy of each routing entry (the best path), even though the control plane can have several copies of the same routing entry, as seen from multiple BGP paths.

These ASICs/Programmable Chips use expensive, specialized memory to hold these entries; it could be TCAM (Ternary Content Addressable Memory), SSRAM (Synchronous Static RAM), RLDRAM (Reduced Latency DRAM), etc. These are high-speed types of memory that are built, in most cases, for networking applications, e.g., routers, switches, etc., and work at very high bandwidths, supporting the high-speed route entry look-ups that allow hardware-based routers to forward traffic at the speeds they do, e.g., 1Gbps, 10Gbps, 40Gbps, etc.

The reason I brought this up is that the explosion of the IPv4 Internet routing table is putting pressure on routers, more specifically, hardware-based routers. This is because TCAM, SSRAM, RLDRAM, etc., are very expensive, so platforms ship with a limited amount of them, and there is a finite number of entries they can hold (I say entries because on some platforms, entries include IPv4 routes, IPv6 routes, MPLS LSPs, ACLs, NetFlow data, etc.). Upgrading these means swapping out expensive data plane infrastructure, which many service providers would like to avoid, if at all possible.

</At the risk of massively derailing this thread, little primer on routing architectures follows, for perspective>
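The split described in the primer above can be sketched roughly as follows. This is a toy model, not any vendor's implementation: the `Path` record, the `best_paths` and `download_to_fib` helpers, and the capacity of 2 are all hypothetical, and "shortest AS path wins" stands in for the full BGP decision process. What a real platform does when its forwarding memory fills up is platform-specific; the sketch simply reports the prefixes that did not fit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    prefix: str        # destination prefix, e.g. "10.0.0.0/22"
    next_hop: str      # neighbour the path was learned from
    as_path_len: int   # stand-in for the full BGP decision process


def best_paths(rib):
    """Pick one path per prefix; shortest AS path is a deliberate simplification."""
    best = {}
    for path in rib:
        current = best.get(path.prefix)
        if current is None or path.as_path_len < current.as_path_len:
            best[path.prefix] = path
    return best


def download_to_fib(best, fib_capacity):
    """Install best paths into a fixed-size forwarding table.

    Returns (fib, rejected); prefixes that do not fit are reported, not installed.
    """
    fib, rejected = {}, []
    for prefix, path in best.items():
        if len(fib) < fib_capacity:
            fib[prefix] = path.next_hop
        else:
            rejected.append(prefix)
    return fib, rejected


# Control plane: three prefixes, two of them learned from two neighbours each.
rib = [
    Path("10.0.0.0/22", "192.0.2.1", 3),
    Path("10.0.0.0/22", "192.0.2.2", 2),
    Path("10.1.0.0/16", "192.0.2.1", 4),
    Path("10.1.0.0/16", "192.0.2.2", 5),
    Path("10.2.0.0/16", "192.0.2.1", 1),
]

best = best_paths(rib)
fib, rejected = download_to_fib(best, fib_capacity=2)

print(len(rib), "paths in the control plane")
print(len(fib), "entries in the data plane:", fib)
print("did not fit:", rejected)
```

The control plane comfortably holds all five paths, while the data plane holds one entry per prefix and runs out of room at the third prefix.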
If the number of unique aggregates announced to the Internet is 145391, how can the routing table size anywhere exceed this number?
It does, and that is the very reason we receive this report on a regular basis: to remind us of what we could do to reduce the pollution of the routing table, and hence, increase the lifetime of the (powerful?) routers we have in the network, which can no longer serve us once they can't hold any more routing entries without falling over.

I wouldn't do Philip Smith's live presentation of the state of the Internet routing table any justice by trying to explain it here :-), but basically, we see roughly twice as many routing entries as we should, mainly due to de-aggregation.

This happens for a number of reasons, but one of the main ones is traffic engineering, where networks announce longer (more specific) versions of their prefixes in order to balance inbound traffic to their network. This is usually a noble, commercially-driven decision, but with the side effects explained above. Cake, eat, both... :-).
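A small sketch of why de-aggregation works as traffic engineering, and what it costs. The /22, the upstream names and the `lookup` helper are hypothetical; the only real mechanism assumed is longest-prefix match, where the most specific covering prefix wins.

```python
import ipaddress

# Hypothetical allocation, reachable over two upstream links.
aggregate = ipaddress.ip_network("10.0.0.0/22")

# De-aggregating: split the /22 into two /23s and announce one per upstream,
# keeping the covering /22 on both as a backup.
low_half, high_half = aggregate.subnets(prefixlen_diff=1)
table = {
    aggregate: "either upstream",
    low_half: "upstream A",
    high_half: "upstream B",
}


def lookup(table, address):
    """Longest-prefix match: the most specific covering prefix wins."""
    addr = ipaddress.ip_address(address)
    matches = [net for net in table if addr in net]
    if not matches:
        return None
    return table[max(matches, key=lambda net: net.prefixlen)]


print(lookup(table, "10.0.0.10"))
print(lookup(table, "10.0.3.10"))
print(len(table), "entries in everyone's table instead of 1")
```

Inbound traffic for each half now follows the /23 for that half, which is the balancing effect the network wanted, but every other router on the Internet carries three entries where one would have been enough.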
3) Is aggregation done at a particular router for (i) reducing the table size in that router, or (ii) reducing the number of announced prefixes by that router, or (iii) both?
Border and peering routers are generally the ones that face the world. They connect to the Internet either by purchasing transit from upstream providers, or peering with other networks privately or at public exchange points.

Aggregation is typically done inside your network, but whatever the case, the prefixes that these routers announce to the outside should, ideally, be aggregates of the allocations a network receives from its RIR.

Different networks implement this differently, e.g., some networks configure and announce aggregates on their border and peering routers, while others, like us, do the same on the route reflectors, which I think scales better if you have multiple border and peering routers spread across the network.

But as you can tell, this is an internal design issue - the end goal is to announce to the world only what you need to announce to the world, and hopefully, keep the Internet routing table lean & mean.

Hope this helps.

Cheers,

Mark.