On Mon, Jun 8, 2020 at 10:52 AM <ljwobker@gmail.com> wrote:
Every "fast" FIB implementation I'm aware of takes a set of prefixes, stores them in some sort of data structure, which can perform a longest-prefix lookup on the destination address and eventually get to an actual physical interface for forwarding that packet. Exactly how those prefixes are stored and exactly how load-balancing is performed is *very* platform specific, and has tons of variability. I've worked on at least a dozen different hardware based forwarding planes, and not a single pair of them used the same set of data structures and design tradeoffs.
Howdy, AFAIK, there are two basic approaches: TCAM and Trie. You can get off in to the weeds fast dealing with how you manage that TCAM or Trie and the Trie-based implementations have all manner of caching strategies to speed them up but the basics go back to TCAM and Trie. TCAM (ternary content addressable memory) is a sort of tri-state SRAM with a special read function. It's organized in rows and each bit in a row is set to 0, 1 or Don't-Care. You organize the routes in that memory in order from most to least specific with the netmask expressed as don't-care bits. You feed the address you want to match in to the TCAM. It's evaluated against every row in parallel during that clock cycle. The TCAM spits out the first matching row. A Trie is a tree data structure organized by bits in the address. Ordinary memory and CPU. Log-nish traversal down to the most specific route. What you expect from a tree. Or have I missed one? Regards, Bill Herrin -- William Herrin bill@herrin.us https://bill.herrin.us/
Juniper Networks has also tried using Bloom filters. https://patents.google.com/patent/US20170187624 I think the QFX10002 was the first product they made which used this approach. https://forums.juniper.net/t5/Archive/Juniper-QFX10002-Technical-Overview/ba... On Mon, Jun 8, 2020 at 1:45 PM William Herrin <bill@herrin.us> wrote:
On Mon, Jun 8, 2020 at 10:52 AM <ljwobker@gmail.com> wrote:
Every "fast" FIB implementation I'm aware of takes a set of prefixes, stores them in some sort of data structure, which can perform a longest-prefix lookup on the destination address and eventually get to an actual physical interface for forwarding that packet. Exactly how those prefixes are stored and exactly how load-balancing is performed is *very* platform specific, and has tons of variability. I've worked on at least a dozen different hardware based forwarding planes, and not a single pair of them used the same set of data structures and design tradeoffs.
Howdy,
AFAIK, there are two basic approaches: TCAM and Trie. You can get off in to the weeds fast dealing with how you manage that TCAM or Trie and the Trie-based implementations have all manner of caching strategies to speed them up but the basics go back to TCAM and Trie.
TCAM (ternary content addressable memory) is a sort of tri-state SRAM with a special read function. It's organized in rows and each bit in a row is set to 0, 1 or Don't-Care. You organize the routes in that memory in order from most to least specific with the netmask expressed as don't-care bits. You feed the address you want to match in to the TCAM. It's evaluated against every row in parallel during that clock cycle. The TCAM spits out the first matching row.
A Trie is a tree data structure organized by bits in the address. Ordinary memory and CPU. Log-nish traversal down to the most specific route. What you expect from a tree.
Or have I missed one?
Regards, Bill Herrin
-- William Herrin bill@herrin.us https://bill.herrin.us/
There are many different tries -- see here for some examples. https://www.drdobbs.com/cpp/fast-ip-routing-with-lc-tries/184410638 And an enhancement to LC-tries http://www.diva-portal.org/smash/record.jsf?pid=diva2%3A469814&dswid=-2401 Then there are radix-n (n-ary trie) lookups, e.g. radix-4 would look up 4-bits at a time and branch 16 ways. Here's a good tutorial, and I don't think even this is exhaustive. http://klamath.stanford.edu/~pankaj/talks/hoti_tutorial.ppt On Mon, Jun 8, 2020 at 4:19 PM Josh Hoppes <josh.hoppes@gmail.com> wrote:
Juniper Networks has also tried using Bloom filters.
https://patents.google.com/patent/US20170187624
I think the QFX10002 was the first product they made which used this approach.
https://forums.juniper.net/t5/Archive/Juniper-QFX10002-Technical-Overview/ba...
On Mon, Jun 8, 2020 at 1:45 PM William Herrin <bill@herrin.us> wrote:
On Mon, Jun 8, 2020 at 10:52 AM <ljwobker@gmail.com> wrote:
Every "fast" FIB implementation I'm aware of takes a set of prefixes,
stores them in some sort of data structure, which can perform a longest-prefix lookup on the destination address and eventually get to an actual physical interface for forwarding that packet. Exactly how those prefixes are stored and exactly how load-balancing is performed is *very* platform specific, and has tons of variability. I've worked on at least a dozen different hardware based forwarding planes, and not a single pair of them used the same set of data structures and design tradeoffs.
Howdy,
AFAIK, there are two basic approaches: TCAM and Trie. You can get off in to the weeds fast dealing with how you manage that TCAM or Trie and the Trie-based implementations have all manner of caching strategies to speed them up but the basics go back to TCAM and Trie.
TCAM (ternary content addressable memory) is a sort of tri-state SRAM with a special read function. It's organized in rows and each bit in a row is set to 0, 1 or Don't-Care. You organize the routes in that memory in order from most to least specific with the netmask expressed as don't-care bits. You feed the address you want to match in to the TCAM. It's evaluated against every row in parallel during that clock cycle. The TCAM spits out the first matching row.
A Trie is a tree data structure organized by bits in the address. Ordinary memory and CPU. Log-nish traversal down to the most specific route. What you expect from a tree.
Or have I missed one?
Regards, Bill Herrin
-- William Herrin bill@herrin.us https://bill.herrin.us/
On Tue, 9 Jun 2020 at 02:22, Josh Hoppes <josh.hoppes@gmail.com> wrote:
Juniper Networks has also tried using Bloom filters.
https://patents.google.com/patent/US20170187624
I think the QFX10002 was the first product they made which used this approach.
https://forums.juniper.net/t5/Archive/Juniper-QFX10002-Technical-Overview/ba...
How they use it is not clear and it's not communicated in the article. Bloom filter of course only answers to you 'maybe' or 'no', which is not sufficient for egress rewrite information. The article also implies the bloom filter is in HMC, which I don't believe. I suspect roughly what is happening a) there are N bloom filters on-chip b) lookup is ran parallel to all N bloom filters c) bloom filters which give 'maybe' answer, cause lookup to off-chip HMC The naive approach would be 1 bloom filter per-netmask, so you have LEM only lookups in HMC, and on-chip tells which LEM lookups to do, to reduce LEM count from 32 to something rather less. But this surely isn't what they are doing, because the hash population is very different for /1 than it is to /24 and you want a reasonably fair balance, while guaranteeing LEM to off-chip. But I'm sure this is the gist of it, imprecise fast on-chip allowing LEM only off-chip. -- ++ytti
participants (4)
-
Anoop Ghanwani
-
Josh Hoppes
-
Saku Ytti
-
William Herrin