You might remember back in October I described how my qp trie is a combination of djb's crit-bit tree and Bagwell's HAMT. The extra cherry on top of those ingredients was spotting that with the right layout in memory, qp trie traversal gets a big boost from cache prefetching.
There are a lot of papers about doing routing table lookup using a trie of some kind. So how do routing tries relate to qp tries?
Rather than general-purpose data structures, the literature has a lot of application-specific tries designed just for routing. And rather than software, they are often tuned for implementation in hardware.
There are a couple of examples below; the tl;dr is that they are more like the un-hashed variant of Bagwell's HAMT than like qp tries, since they don't have the PATRICIA / crit-bit trick of omitting nodes with one child.
In August 2015, a couple of months before my qp trie work, a paper titled Poptrie: a compressed trie with population count for fast and scalable software IP routing table lookup was presented at the ACM SIGCOMM conference.
The Poptrie paper cites a paper titled Tree Bitmap: hardware/software IP lookups with incremental updates published in April 2004 and sadly paywalled.
[Getting less relevant, the Tree Bitmap paper frequently compares its data structure with the "Lulea" one described in small forwarding tables for fast routing lookups (SIGCOMM 1997). That one is worth noting as an earlier non-HAMT-like structure.]
I found the poptrie paper after I thought up the qp trie, when trying to choose a name for it that wasn't already taken. Algorithm 1 in that paper is very similar to the qp trie inner loop; the crit-bit difference is that in a poptrie the bit offset into the key has a fixed step per loop iteration (like a HAMT), whereas in a qp trie the offset is loaded from the node so it can skip ahead arbitrarily.
Another HAMT similarity occurs in section 3.4 of the poptrie paper, which describes using a jumbo node at the root of the trie to reduce the number of indirections. I have not tried implementing a qp trie with this feature.
The HAMT-like lack of PATRICIA-style skipping in a routing trie is one aspect of a pervasive structural difference due to the different kinds of data.
A qp trie stores a relatively sparse set of strings. In any trie most of the possible strings you might try to look up will not be found. Strings are variable length. And strings are prefix-free - a short string will not be a prefix of a longer one if you include its '\0' terminator.
A routing trie stores a dense set of address prefixes. Every address you look up will produce an answer - the lack of a route is handled at a higher level. Address prefixes have a limited set of possible lengths. And routing tables are not prefix-free: you often have a route for a large address range (short prefix) with a more specific route for a smaller address range (longer prefix) that carves a chunk out of it.
The sparse/dense string/route difference means that the qp trie inner loop has a three-way decision (fail to find anything; find a leaf; go down another branch) whereas a routing trie has a two-way decision (leaf or branch).
The lack of prefix-freedom means a routing trie needs a different compression approach than a qp trie. Section 3.3 of the poptrie paper has details of their version; one consequence is that a poptrie node has two bitmaps and two child pointers for leaves and branches, whereas an HAMT node or a qp trie node has one bitmap and one pointer for both.
I think poptries would benefit from prefetching like qp tries and HAMTs, but the poptrie paper doesn't mention it. However the Tree Bitmap paper does talk about doing bitmap and index calculations while waiting for a memory fetch.
Obviously this was nothing like a proper literature search so there are certainly other papers covering similar ground. I would be interested in pointers to other literature in this area!