TREE BITMAP

The material is taken from [1]. However, the example scenarios prepared by me

The tree bitmap is a fixed stride multibit trie algorithm that allows fast searches and uses storage comparable to the Lulea algorithm. However, the tree bitmap differs in one main aspect—the ability to allow fast insertions/updates. While fast lookups are needed for wire speed forwarding, from a commercial router perspective, fast insertions and modifications are clearly desirable. This is needed since the route updates/changes must modify a few hundred prefixes a second.

As observed in the previous section, inserting and deleting prefixes in the Lulea scheme require rebuilding the entire data structure. This is because the prefixes are expanded and pushed to trie leaves. If the packets are to be forwarded nonstop (which is now a requisite for a commercial router), we need to maintain two copies: one that is being used for lookups and the other for building the new data structure for prefix insertions and updating. This potentially doubles the memory requirements increasing the storage cost. In the case of implementations using a fixed size on-chip SRAM that stores the entire prefix database, the number of prefixes that can be stored is halved. The key to avoiding two copies and still allowing fast insertions and deletions is to eliminate leaf pushing. The tree bitmap algorithm achieves this by storing two bitmaps: one for pointers to the child and the other for prefixes.

The next section outlines the principles and optimizations used in the design of a tree bitmap.

Design Rationale

The tree bitmap algorithm design considers that a multibit trie node is intended to serve two purposes—one to direct the search to its child nodes and the other to retrieve the forwarding information corresponding to the best matching prefix that exists in the node. It further emphasizes that these functions are distinct from each other. The design of the data structure reflects this observation by using two bitmaps per trie node instead of a single bitmap as in the Lulea algorithm. One bitmap used for storing internal prefixes belonging to that node is referred to as a prefix bitmap and the other bitmap used for storing the pointers to children is referred to as a pointer bitmap. Such a use of two bitmaps obviates leaf pushing, allowing fast insertions and updates.

Furthermore, the tree bitmap attempts to reduce the number of child node pointers by storing all the child nodes of a given trie node contiguously. As a result, only one pointer that points to the beginning of this child node block needs to be stored in the trie node. Such an optimization potentially reduces the number of required pointers by a factor of two compared to standard multibit tries. An additional advantage is that it reduces the size of the trie nodes. In such a scheme, the address for any child node can be computed efficiently using simple arithmetic, assuming a fixed size for each child node.

The tree bitmap algorithm attempts to keep the trie nodes as small as possible to reduce the size of a memory access for a given stride. A tree bitmap trie node is expected to contain the pointer bitmap, the prefix bitmap, the base pointer to the child block, and the next-hop information associated with the prefixes in the node. If the next-hop information is stored along with the trie node, it would make the size of the trie node much larger. Instead, the next-hop information is stored separately in an array associated with this node and a pointer to the first element is stored in the trie node. Would storing next-hop information in a separate array require two memory accesses per trie node: one for accessing the trie node and the other to fetch the next-hop information? The algorithm gets around the problem by a simple lazy evaluation strategy. It does not fetch the resulting next-hop information until the search is terminated. Once the search ends, the desired node is fetched. This node carries the next-hop information corresponding to a valid prefix present in the last trie node encountered in the path.

Finally, a tree bitmap uses only one memory access per node. With burst-based memory technologies, the size of a single random access can be large as 32 bytes. If the entire trie node has to be accessed in a single memory access, it cannot be larger than the optimal memory burst sizes. The size of the trie node greatly depends on the stride size. Consider a stride size of 8 bits. The pointer bitmap will require 256 bits (28). The prefix bitmap will require 511 bits (29 − 1) since there are 29 − 1 possible prefixes of length 8 or less. In addition, we need another 4 bytes for storing the base pointers to the child and next-hop information. Hence the size of the trie node is approximately 100 bytes ((= 256 + 511 + 4 × 8)/8), which requires more than one memory access even with burst-based memory technologies. By using smaller strides, say 4 bits, the tree bitmap makes the bitmaps small enough that the entire node can be accessed in a single wide memory access. Use of small strides also keeps the update times bounded. Accessing the entire node includes both the bitmaps and the base pointers for the child and next-hop information. Since the bitmaps are smaller, special circuitry using combinatorial logic can be used to count the bits efficiently. This is unlike the Lulea algorithm, which requires at least two memory accesses because of large strides of 8 or 16 bits. Such large strides require a bitmap of larger size and necessitate the use of a separate summary table for counting, which must be accessed separately.

Consider the root node of the fixed stride multibit trie shown in Fig. 1. The corresponding tree bitmap node is shown in Fig. 2.

multibitFig. 1: “Fixed Stride Multibit Trie Example”

TreeBitMapRoot
Fig. 2: “Structure of Tree Bitmap Node Corresponding to Root Node of Trie in Fig. 1

As discussed earlier, a tree bitmap node consists of two bitmaps. The first bitmap shown vertically is the pointer bitmap, which indicates the position where child pointers exist. In Fig. 2, this bitmap is referred to as ptrbitarr. It shows that pointers exist for entries 000 and 111. These pointers correspond to two subtries rooted at the entries 000 and 111 in Figure 2. Now instead of storing explicit child pointers in a separate array as in the Lulea scheme, the tree bitmap node stores a pointer to the first child trie node as one of the fields in ptrblk.

The second bitmap shown horizontally is the prefix bitmap. It stores a list of all the prefixes within the first 3 bits that belong to the node. This bitmap is different from the Lulea scheme because it has a bit assigned for all possible prefixes of 3 bits or less. The bitmap positions are assigned to the prefixes in the order of 1-bit prefixes followed by 2-bit prefixes and so on. A bit in the prefix bitmap is set if that prefix occurs within the trie node. As you can see in Fig. 2, the root node contains the prefixes P1, P2, P3, P4, and P5. Hence, the bit positions corresponding to prefixes ∗, 1∗, 00∗, 101∗, and 111∗ are set in the prefix bitmap. The entire tree bitmap data structure for the fixed multibit trie in Fig. 1 is shown in Fig. 3.

TreeBitMapN2
Fig. 3: “Tree Bitmap Data Structure Example”

Search Algorithm: The search starts from the root trie node and using the same number of bits as the stride for the first level indexes into the pointer bitmap. Let us call this position P. If the bit in position P is set, it implies that there is a valid child pointer that points to a subtrie. To obtain the value of the child pointer, the number of 1 bits in the pointer bitmap is counted up to the indexed position P. Assuming the count is C and the base address to the child block in root trie node is A, the expression A+(C −1)×S gives the value of the child pointer, where S refers to the size of each child trie node.

Before following the child pointer to the next level, the search examines the prefix bitmap to determine if one or more prefixes match. The bits used to match these prefixes are the same set of bits used to index the pointer bitmap. For the sake of discussion, let us assume that these bits are 111. First, the bit corresponding to prefix 111∗ is examined at position 15. The bit is set, which indicates that it is the best matching prefix. If the bit had not been set, the last bit would be dropped and the prefix bitmap would be again searched for prefix 11∗. If there is no 11∗ prefix, the algorithm continues and checks for prefix 1∗. The reader might note that the search algorithm has to perform multiple iterations proportional to the number of bits in the prefix. If all these iterations have to be performed in O(1) time, they need to be executed in parallel. In custom ASICs (application-specific integrated circuits), this can be accomplished using dedicated combinatorial logic circuits and a priority encoder that returns the longest matching prefix. This could easily scale even for bitmaps as large as 256 or 512 bits.

If a matched prefix is found in the prefix bitmap, the next-hop information corresponding to the prefix is not fetched immediately. Instead, the matched prefix position and pointer to the trie node are remembered and the search continues to descend using the computed child pointer. As the search advances to lower levels, if better matched prefixes are found, the remembered matching prefix position and the pointer to the trie node are updated. The search terminates when it encounters a bit that is zero in the child pointer bitmap, meaning there are no children to continue further. At this point, the pointer to the next-hop information corresponding to the last remembered best matching prefix is computed using its trie node. This yields the desired result. For a better understanding of the search, let us look at an example.

Consider the search for the destination address beginning with 1110100 in the tree bitmap data structure shown in Fig. 3. The search first examines the pointer bitmap ptrbitarr(Root) of the root node using the first three bits 111. Since the bit is set, the search needs to examine the child subtrie. In the pointer bitmap, as it is the second bit set, the child pointer is computed as ptr + (2 − 1) × S = ptr + S where ptr is the base address and S is the size of the trie node. Before continuing the search to the child subtrie, the prefix bitmap pfxbitarr(R) is examined for matching prefixes. First, the entry corresponding to the first three bits of the address 111 is examined. Since the bit is set, a prefix match has been found, the pointer to the nexthop information is computed (similar to the computation of child pointer). This next-hop information is not fetched and instead remembered as the best matching prefix so far.

Now the computed child pointer is used to fetch the child node N2. Using the next three bits of the address 010, the child bitmap ptrbitarr(N2) is examined. Since the bit corresponding to entry 010 is not set, there is no more child subtrie to examine. The prefix bitmap pfxbitarr(N2) is examined for the entry 010. Since the bit is not set, there is no matching prefix, the search then drops the last bit and examines the entry 01. The bit is set indicating the presence of matching prefix P7. This prefix is updated as the best matching prefix; the pointer to its next-hop information is computed and fetched, terminating the search.

The algorithms for inserting and updating in a tree bitmap are very similar to the same set of operations in a multibit trie without leaf pushing. Inserting a prefix could change the entire trie node. In such cases, a new node is created and the original contents are copied, after which the necessary modifications are made and then atomically linked to the original trie. Since child nodes have to be contiguous, the entire child block may have to be copied even though only one child might require modifications. Performance results [198] show that the tree bitmap scheme is comparable to the Lulea scheme in terms of the size of the data structure and speed of lookup, but also provides the advantage of fast insertions. The algorithm lends itself to implementations ranging from software to architectures using an on-chip SRAM.

3 Responses to TREE BITMAP

  1. Michael says:

    Good explanation, but there is an error in the examples, I think. In figure 1 there are pointers for ‘100’ and ‘111’ and in drawings 2 & 3 the pointers are shown for ‘000’ and ‘111’ (in the root node)

Leave a comment