Hash Table
Appendix
 Hash Function

Definition

Hashing Integer

Hashing String

converting into table index

user defined hash


Collision resolution

Separate Chaining

linear probing

Hash Function
Definition
Basic idea of Hashing is put key into a function, the function turns the key into table index, ideally each key has its own index, which refer to a address in memory so that we can easily get the key in O(1) time complexity. Here are some important convention toward hashing

Good Hash function will distribute the key uniformly. which mean no big cluster or collision.

Hashing is a good example of time and memory trade off, memory is cheap so try as much as you can to distribute the key uniquely through hashing.

Key with special meaning need to be carefully considered.
Hash Integer
 Modular hashing
we choose the array size M to be prime and, for any positive integer key k, compute the remainder when dividing k by M. This function is very easy to compute (k % M, in Java) and is effective in dispersing the keys evenly between 0 and M – 1.
Hash String
A classic algorithm known as Horner’s method gets the job done with N multiplications, additions, and modulus operations.
public int GetHashCode(string str)
{
char[] s = str.ToCharArray();
int hash = 0;
for (int i = 0; i < s.Length; i++)
{
hash = s[i] + (31 * hash);
}
return hash;
}
h = s[0] · 31^(L–1) + … + s[L – 3] · 31^2 + s[L – 2] · 31^1 + s[L – 1] · 31^0.
In order to save computing time, we don’t need to add every number’s hash, we can randomly choose or define string step
public int GetHashCode(string str)
{
char[] s = str.ToCharArray();
int hash = 0;
int skip = Math.Max(1, s.Length / 8);
for (int i = 0; i < s.Length; i+=step)
{
hash = s[i] + (31 * hash);
}
return hash;
}
converting to table index
private int hash(Key x)
{
return (x.hashCode() & 0x7fffffff) % M;
}
user defined hash( for struct)
In user defined object, simply adding each primitive data type’s hash together to generate reference data type’s hash code
public class Transaction
{
...
private final String who;
private final Date when;
private final double amount;
public int hashCode()
{
int hash = 17;
hash = 31 * hash + who.hashCode();
hash = 31 * hash + when.hashCode();
hash = 31 * hash
+ ((Double) amount).hashCode();
return hash;
}
...
}
Collision resolution
Separate Chaining
;
Create a array with each address reference to a linked list (symbol table), if collision happens add them to the symbol table
public class SeperateChainingHashSet<TKey, TValue>
{
private int M;
private SequentSearchSymbolTable<TKey, TValue>[] st;//
public SeperateChainingHashSet()
{ this(997);}
public SeperateChainingHashSet(int m)
{
this.M = m;
st = new SequentSearchSymbolTable<TKey, TValue>[m];
for (int i = 0; i < m; i++)
{
st[i] = new SequentSearchSymbolTable<TKey, TValue>();
}
}
private int hash(TKey key)
{
return (key.HashCode() & 0x7fffffff) % M;
}
public override TValue Get(TKey key)
{
return st[hash(key)].Get(key);
}
public override void Put(TKey key, TValue value)
{
st[hash(key)].Put(key, value);
}
}
linear probing
;
The simplest openaddressing method is called linear probing: when there is a colli sion (when we hash to a table index that is already occupied with a key different from the search key), then we just check the next entry in the table (by incrementing the index). Linear probing is characterized by identifying three possible outcomes:
 Key equal to search key: search hit
 Empty position (null key at indexed position): search miss
 Key not equal to search key: try next entry
public class LinearProbingHashSet<TKey, TValue>
{
private int N;
private int M = 16;
private TKey[] keys;
private TValue[] values;
public LinearProbingHashSet()
{
keys = new TKey[M];
values = new TValue[M];
}
private int hash(TKey key)
{
return (key.GetHashCode() & 0xFFFFFFF) % M;
}
public TValue Get(TKey key)
{
for (int i = hash(key); keys[i] != null; i = (i + 1) % M)
{
if (key.Equals(keys[i])) { return values[i]; }
}
return null;
}
public void Put(TKey key, TValue value)
{
int hashCode = hash(key);
for (int i = hashCode; keys[i] != null; i = (i + 1) % M)
{
if (keys[i].Equals(key))
{
values[i] = value;
return;
}
keys[i] = key;
values[i] = value;
N++;
}
}
public void delete(Key key)
{
if (!contains(key)) return;
int i = hash(key);
//search the position
while (!key.equals(keys[i]))
i = (i + 1) % M;
//search hit, replace it with null
keys[i] = null;
vals[i] = null;
//reput all the key that list after the target key
i = (i + 1) % M;
while (keys[i] != null)
{
Key keyToRedo = keys[i];
Value valToRedo = vals[i];
keys[i] = null;
vals[i] = null;
N;
put(keyToRedo, valToRedo);
i = (i + 1) % M;
}
N;
if (N > 0 N == M/8) resize(M/2);
}
}