Home

[This page is auto-generated; please prefer to read the article’s PDF Form.]



[Prev]   [Up]   [Next]   

Division Method and Key Patterns

In the Division Method of hashing, the hash-function h(k) is simply:

h(k) = k mod m

It can be proved that, if the input keys k are randomly uniformly distributed over any set of mn consecutive integers (for some positive integer n), then their mod m values too will be uniformly distributed over m = {0,1,2,,m 1}. That is, their hash-values h(k) will be uniformly distributed over m. In this situation with input keys, m being a prime or not has no importance.

But often it is not possible to be certain that the input keys are such random. They all, or a subset of them, may exhibit some patterns. For example, a subset of input keys may follow a “linear-pattern”, where each key has the form b + ai, where a and b are fixed integers and i acts as the variable integer. Such pattern can emerge due to a variety of reasons. For example:

  1. if keys are pointers, which are all multiples of 8. Here a = 8.
  2. if keys are all even or all odd. Here a = 2.
  3. if keys are prices of goods most of which end at ‘9’, then a large subset of keys will be in form 9 + 10i.
  4. consider the case when the key is formed by combining multiple components, like multiple members of an struct/object or an array. Say there are l components, each an integer: c0,c1,c2,,cl1. Suppose, they are multiplied by fixed integers ri to derive the integer key k as:
        l−1
k = ∑   cr
        i i
    i=0

    Say, a group of such keys have all components in common, except one component say cn. Then, these keys will have k in form b + (rn)cn, where b is fixed and component cn varies.

    A common example of this is strings, where each character is a component ci, last character being c0. Here, 0 ci 127 and each ri can be chosen ri, where r = 128.

    Suppose a group of string keys have their last few, say n, chars common. Then every such string will have k in form b + (rn)i, where b is the numeric value of the n common chars, b = i=0n1ciri. For example, addresses often have their last part as country name, which would be common among many addresses. Similarly, email-addresses have their last part, the domain, quite common.

[Prev]   [Up]   [Next]