Hash table revisited

I came across how Facebook implements Hash table from this post: https://engineering.fb.com/developer-tools/f14/. It introduces several techniques making modern hash tables more efficient.

The first technique is called chunking, which reduces the time for resolving hash collision. The idea is to map keys to a chunk (a block of slots) rather than a single slot then search within the chunk in parallel. Specifically, FB designates 14 slots in one chunk. I am not totally sure how, once a key is determined to belong to a chunk, a slot within the chunk is allocated for that key. For now I assume allocating a slot in a chunk is super fast, as indicated by the article that the search within the chunk is done through highly parallelizable vector instructions. In chunked hash tables, undesired cost would happen only in chunk overflow, i.e., if the chunk is full and a key needs to be relocated to another chunk. Similarly you can also call slot overflow in non-chunked hash tables for the scenario where a slot is full and a key needs to find another slot. The chance of seeing a chunk overflow (15 keys happen to belong to the same chunk) is much lower than seeing a slot overflow, hence chunking saves the time for resolving collision.

The second technique is called count-based tombstone. We can first revisit what is a tombstone in a classic hash table implementation. It refers to setting a slot to a special empty symbol when deleting an entry in a hash table. Tombstones are needed because the freed slots can hold new entries in the future but to reliably determine whether a to-be-inserted key exists we still need to probe beyond any tombstone slot encountered, as if that tombstone slot is not empty. We call the behavior “probe” as the process to search the next slot in the presence of key collision until we can determine whether the key exists or not. Probing can happen during either inserting or searching a key.

Tombstone is easily understood from an example:

In the example, we use every key’s middle-digit as the primary hash function $h$ . We use the last-digit as the secondary hash function $i$ . When a key’s primary hash $j=h(k)$ is full, we probe the next position at $j=j-i(k)$ . The above hash table is generated by inserting 666, 761, 861 and 961 in order.

Suppose now we remove 761, followed by inserting 961 again. When 961 is probed first at index $j=h(961)=6$ , slot 6 is full. So next it probed $j=j-i(961)=6-1=5$ . Slot 5 is empty. However, if we had stopped at slot 5, we would wrongly assume 961 was not in the hash table. That’s why when we delete key 761, we need to mark slot 5 as a special empty symbol different than just a new blank slot.

Now let’s illustrate the idea of auxiliary-bit tombstones. For each slot, we use an extra bit to denote whether this slot is probed during insertion (recall probe can happen also in search). Then when we search a key, we can stop probing as soon as a 0 bit is encountered.

Auxiliary-bit tombstones can be illustrated in the following example. Suppose we generate a hash table by inserting 666, 761, 861, 333, 222, and 921 in order. Since 666 is probed during inserting 761 and 861, 761 is probed during inserting 861, and 222 is probed during inserting 921, the extra bit for 666, 761, and 222 is set to 1. After insertion, we remove key 861 and denote slot 4 as a tombstone.

Now if we want to search if key 361 exists, it will start from index 6 (666). Without the extra bits, we will probe denoted until index 1 (the probe will pass slot 4 pretending it is a full slot). But with the extra bit, we will only need to probe until index 4. At index 4, we know no more key that sharing the same probe path has inserted beyond this point.

Facebook extends the auxiliary bits to auxiliary bytes which are used to count the insertion probes for each slot. Let’s create another hash table using the same data: 666, 761, 861, 333, 222, and 921 in order. The extra bytes are created as shown in the diagram. For example, since 666 is probed during inserting 761 and 861, its auxiliary byte stores 2.

If we remove 861, the auxiliary byte will be subtracted 1 on all the slots on its probe path:

Now, if want to search if key 361 exists, we will only need to probe until index 5 because slot 5’s auxiliary byte is 0. Comparing to just using auxiliary bits, we save more unnecessary probes.

The rest techniques are more C++ relevant, which I’ll skip.

Hash table revisited

Leave a comment

Cancel reply