Home

[This page is auto-generated; please prefer to read the article’s PDF Form.]



[Prev]   [Up]   [Next]   

Two-Level Hashing

Above method does help us to find a collision-free function in simple steps quickly. But its drawback is that it requires the hashtable to have size m in O(n2), which may be acceptable only when n is small enough. To solve this issue, we use two level of hashing whenever n is not sufficiently small. The level-1 hash-function h1 (subscript ‘1’ indicates level-1) maps the n keys of S to m. We will use Si to denote the set of keys which get mapped to slot i by h1, and si to denote |Si|, for each i m.

For any slot i with si 2, there are collisions. To handle such collisions, we use level-2 hashtables Ti, one for each slot i. For each slot with si 2, we map its si keys to hashtable Ti, using a collision-free hash-function h2,i (subscript ‘2’ indicates level-2). Using the method from the last section, we can search for a collision-free function, given the set of keys Si. That is, we first choose a 1∕mi-Universal class with table-size mi = si(si 1) and search in it for the function which is collision-free for keys in Si. For the h2,i function thus found, we will store its necessary details somewhere, to allow its reconstruction later.

For slots with si = 1, the table Ti has a single key and hence does not require hashing. For slots with si = 0, table Ti is simply empty.

To lookup any key x, we first compute i h1(x). If si 2, we compute j h2,i(x). Else if si = 1,j 0. We can then return the key from the level-2 hashtable Ti, from its slot j. If si = 0, we conclude that the key is not found. Note that if the lookups may also be performed for keys outside the static set S, we must also compare the key thus found with key x to ensure they are equal. Because, other keys outside of set S may still map to the same location as any key from S.

To implement this approach, we don’t need to keep a level-1 hashtable, and level-2 hashtables (Ti) separately. We can simply maintain a single table T which is a concatenation of tables T0,T1,T2,,Tm1, in that order. The offsets of all these Ti tables in T will be calculated and stored somewhere. So, the slot j of table Ti is located at this index of T: (offset of Ti) + j.

Number of elements required in T, corresponding to each Ti,i m is:

    {
      si,        si = 0 or 1
ei =
      si(si − 1), si ≥ 2

Note that, for si 2, ei = si(si 1) si2. Also, for si = 0 or 1, ei = si = si2. Thus, the total number of elements in T are:

 ∑       ∑
    ei ≤     s2i
i∈ℤm     i∈ℤm

The last section’s method required space in O(n2). What can we say about the space complexity of table T? Note that if h1 maps all n keys to one slot, im si2 = n2, and if it maps each to a different slot, im si2 = n.

We can take help from Corollary 4 in [3]. We can search for h1 in a 1∕m-Universal class (𝜖 = 1∕m). That corollary then ensures that for at least half of the functions in the class, im si2 has the given bound:

 ∑
    s2i < 2n-(n-−-1) + n
i∈ℤm         m

Choosing m = n, we get the bound as 3n 2, which is O(n) and hence very useful. It tells that at least half of the functions in the class will keep the table T size within 3n. So, we can randomly select a function from this class and hash all the keys from S to see if this 3n bound is met. We can repeat until we find such a function, and then use it as h1.

We could thus create an injective (“collision-free”) mapping of an arbitrary set of n keys to the indices in table T, which are at most 3n integers {0,1,2,...,3n 1}. This is an interesting achievement of the FKS method.

[Prev]   [Up]   [Next]