Then you could have knobs for what other routes you discard when you run out of space. Receiving a covering /16? Maybe you can drop the /24s, even if they have a different next hop - routing will be sub-optimal, but it will work. (I know, previous discussions around traffic engineering and whether the originating network must / does do that in practice...)
What you are describing is exactly what the /24 convention is doing already, just with different mask lengths. By and large, RIB/FIB size can be effectively managed today by thoughtful use of policies. If a point is reached where that doesn't work anymore, it's _probably_ time to re-evaluate the hardware or the design.

On Mon, Oct 2, 2023 at 9:20 AM tim@pelican.org <tim@pelican.org> wrote:
On Monday, 2 October, 2023 09:39, "William Herrin" <bill@herrin.us> said:
That depends. When the FIB gets too big, routers don't immediately die. Instead, their performance degrades. Just like what happens with oversubscription elsewhere in the system.
With a TCAM-based router, the least specific routes get pushed off the TCAM (out of the fast path) up to the main CPU. As a result, the PPS (packets per second) degrades really fast.
With a DRAM+SRAM cache system, the least used routes fall out of the cache. They haven't actually been pushed out of the fast path, but the fast path gets a little bit slower. The PPS degrades, but not as sharply as with a TCAM-based router.
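As a rough illustration of the cache behaviour described above, here is a toy model in Python. It is only a sketch: the `RouteCache` class and its counters are made up for this example, lookups are keyed by exact prefix string rather than doing a real longest-prefix match, and no actual forwarding hardware is claimed to work this way. It only shows that a miss costs a slower lookup in the full table instead of a punt to the CPU.

```python
from collections import OrderedDict


class RouteCache:
    """Toy LRU cache sitting in front of a full (slower) forwarding table.

    Hypothetical model for illustration; keys are exact prefix strings,
    not a longest-prefix match.
    """

    def __init__(self, full_table, capacity):
        self.full_table = full_table   # prefix -> next hop, the "DRAM" copy
        self.capacity = capacity       # entries that fit in the "SRAM" cache
        self.cache = OrderedDict()     # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def lookup(self, prefix):
        if prefix in self.cache:
            # Fast case: refresh recency and answer from the cache.
            self.cache.move_to_end(prefix)
            self.hits += 1
            return self.cache[prefix]
        # Slow case: still answered from the full table, just more slowly.
        self.misses += 1
        next_hop = self.full_table[prefix]
        self.cache[prefix] = next_hop
        if len(self.cache) > self.capacity:
            # Evict the least recently used entry.
            self.cache.popitem(last=False)
        return next_hop
```

Every lookup still returns a next hop; only the ratio of `hits` to `misses` changes as the working set outgrows `capacity`, which is the gentler degradation the paragraph above describes.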
Spit-balling here, is there a possible design for not-Tier-1 providers where routing optimality (which is probably not a word) degrades rather than packet-shifting performance?
If the FIB is full, can we start making controlled and/or smart decisions about what to install, rather than either of the simple overflow conditions?
For starters, as long as you have *somewhere* you can point a default at in the worst case, even if it's far from the *best* route, you make damn sure you always install a default.
Then you could have knobs for what other routes you discard when you run out of space. Receiving a covering /16? Maybe you can drop the /24s, even if they have a different next hop - routing will be sub-optimal, but it will work. (I know, previous discussions around traffic engineering and whether the originating network must / does do that in practice...)
Understand which routes your customers care about / where most of your traffic goes? Set the "FIB-preference" on those routes as you receive them, to give them the greatest chance of getting installed.
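The three knobs above (always keep a default, shed more-specifics that sit under a covering route, honour a per-route "FIB-preference") could be sketched roughly as follows. This is a thought experiment in Python, not anything a vendor is known to ship: `Route`, `fib_preference`, `is_covered` and `select_fib` are all hypothetical names, the ranking order is just one plausible choice, and it ignores churn and the cost of FIB updates entirely.

```python
from dataclasses import dataclass
from ipaddress import IPv4Network, IPv6Network, ip_network
from typing import List, Union

Network = Union[IPv4Network, IPv6Network]


@dataclass(frozen=True)
class Route:
    prefix: Network
    next_hop: str
    fib_preference: int = 0  # hypothetical knob: higher means "install me first"


def is_covered(route: Route, rib: List[Route]) -> bool:
    """True if a less-specific, non-default route in the RIB contains this prefix."""
    return any(
        other.prefix.version == route.prefix.version
        and 0 < other.prefix.prefixlen < route.prefix.prefixlen
        and route.prefix.subnet_of(other.prefix)
        for other in rib
    )


def select_fib(rib: List[Route], capacity: int) -> List[Route]:
    """Pick at most `capacity` routes to install when the RIB does not fit."""
    defaults = [r for r in rib if r.prefix.prefixlen == 0]
    rest = [r for r in rib if r.prefix.prefixlen != 0]

    # Knob 1: the default always goes in first, so dropped prefixes still forward.
    fib = defaults[:capacity]
    room = capacity - len(fib)

    # Knobs 2 and 3: rank by FIB-preference, then keep uncovered routes ahead of
    # covered more-specifics, then prefer shorter prefixes.
    ranked = sorted(
        rest,
        key=lambda r: (-r.fib_preference, is_covered(r, rib), r.prefix.prefixlen),
    )
    return fib + ranked[:room]


example_rib = [
    Route(ip_network("0.0.0.0/0"), "192.0.2.1"),
    Route(ip_network("10.0.0.0/16"), "192.0.2.2"),
    Route(ip_network("10.0.1.0/24"), "192.0.2.3"),
    Route(ip_network("10.0.2.0/24"), "192.0.2.4", fib_preference=10),
    Route(ip_network("172.16.5.0/24"), "192.0.2.5"),
]
installed = select_fib(example_rib, capacity=4)
```

With four slots for five routes, the route left out is 10.0.1.0/24: it has no raised preference and traffic for it can still follow the covering 10.0.0.0/16, albeit via a different next hop. Note that the sketch only stays safe if a default (or the covering route) really does get installed.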
Not a hardware designer, I have little idea as to how feasible this is - I suspect it depends on the rate of churn, complexity of FIB updates, etc. But it feels like there could be a way to build something other than "shortest -> punt to CPU" or "LRU -> punt to CPU".
Or is everyone who could make use of this already doing the same filtering at the RIB level, and not trying to fit a quart RIB into a pint FIB in the first place?
Thanks, Tim.