# Hash Table

## Appendix

• Hash Function
• Definition

• Hashing Integer

• Hashing String

• converting into table index

• user defined hash

• Collision resolution

• Separate Chaining

• linear probing

## Hash Function

### Definition

Basic idea of Hashing is put key into a function, the function turns the key into table index, ideally each key has its own index, which refer to a address in memory so that we can easily get the key in O(1) time complexity. Here are some important convention toward hashing

• Good Hash function will distribute the key uniformly. which mean no big cluster or collision.

• Hashing is a good example of time and memory trade off, memory is cheap so try as much as you can to distribute the key uniquely through hashing.

• Key with special meaning need to be carefully considered.

### Hash Integer

• Modular hashing
we choose the array size M to be prime and, for any positive integer key k, compute the remainder when dividing k by M. This function is very easy to compute (k % M, in Java) and is effective in dispersing the keys evenly between 0 and M – 1.

### Hash String

A classic algorithm known as Horner’s method gets the job done with N multiplications, additions, and modulus operations.

``````public int GetHashCode(string str)
{
char[] s = str.ToCharArray();
int hash = 0;
for (int i = 0; i < s.Length; i++)
{
hash = s[i] + (31 * hash);
}
return hash;
}
``````

h = s · 31^(L–1) + … + s[L – 3] · 31^2 + s[L – 2] · 31^1 + s[L – 1] · 31^0.

In order to save computing time, we don’t need to add every number’s hash, we can randomly choose or define string step

``````public int GetHashCode(string str)
{
char[] s = str.ToCharArray();
int hash = 0;
int skip = Math.Max(1, s.Length / 8);
for (int i = 0; i < s.Length; i+=step)
{
hash = s[i] + (31 * hash);
}
return hash;
}
``````

### converting to table index

``````private int hash(Key x)
{
return (x.hashCode() & 0x7fffffff) % M;
}
``````

### user defined hash( for struct)

In user defined object, simply adding each primitive data type’s hash together to generate reference data type’s hash code

``````public class Transaction
{
...
private final String who;
private final Date when;
private final double amount;
public int hashCode()
{
int hash = 17;
hash = 31 * hash + who.hashCode();
hash = 31 * hash + when.hashCode();
hash = 31 * hash
+ ((Double) amount).hashCode();
return hash;
}
...
}
``````

## Collision resolution

### Separate Chaining ;

Create a array with each address reference to a linked list (symbol table), if collision happens add them to the symbol table

``````public class SeperateChainingHashSet<TKey, TValue>
{
private int M;
private SequentSearchSymbolTable<TKey, TValue>[] st;//

public SeperateChainingHashSet()
{ this(997);}

public SeperateChainingHashSet(int m)
{
this.M = m;
st = new SequentSearchSymbolTable<TKey, TValue>[m];
for (int i = 0; i < m; i++)
{
st[i] = new SequentSearchSymbolTable<TKey, TValue>();
}
}

private int hash(TKey key)
{
return (key.HashCode() & 0x7fffffff) % M;
}

public override TValue Get(TKey key)
{
return st[hash(key)].Get(key);
}

public override void Put(TKey key, TValue value)
{
st[hash(key)].Put(key, value);
}
}
``````

### linear probing ;

The simplest open-addressing method is called linear probing: when there is a colli- sion (when we hash to a table index that is already occupied with a key different from the search key), then we just check the next entry in the table (by incrementing the index). Linear probing is characterized by identifying three possible outcomes:

• Key equal to search key: search hit
• Empty position (null key at indexed position): search miss
• Key not equal to search key: try next entry
``````public class LinearProbingHashSet<TKey, TValue>
{
private int N;
private int M = 16;
private TKey[] keys;
private TValue[] values;

public LinearProbingHashSet()
{
keys = new TKey[M];
values = new TValue[M];
}

private int hash(TKey key)
{
return (key.GetHashCode() & 0xFFFFFFF) % M;
}

public TValue Get(TKey key)
{
for (int i = hash(key); keys[i] != null; i = (i + 1) % M)
{
if (key.Equals(keys[i])) { return values[i]; }
}
return null;
}

public void Put(TKey key, TValue value)
{
int hashCode = hash(key);
for (int i = hashCode; keys[i] != null; i = (i + 1) % M)
{
if (keys[i].Equals(key))
{
values[i] = value;
return;
}

keys[i] = key;
values[i] = value;
N++;
}
}

public void delete(Key key)
{
if (!contains(key)) return;
int i = hash(key);
//search the position
while (!key.equals(keys[i]))
i = (i + 1) % M;
//search hit, replace it with null
keys[i] = null;
vals[i] = null;

//re-put all the key that list after the target key
i = (i + 1) % M;
while (keys[i] != null)
{
Key   keyToRedo = keys[i];
Value valToRedo = vals[i];
keys[i] = null;
vals[i] = null;
N--;
put(keyToRedo, valToRedo);
i = (i + 1) % M;
}
N--;
if (N > 0 N == M/8) resize(M/2);
}
}
``````