Documente Academic
Documente Profesional
Documente Cultură
Examples:
I
h(x) = x mod N
is a hash function for integer keys
h((x, y )) = (5 x + 7 y ) mod N
is a hash function for pairs of integers
hash function h
0
1
2
3
4
6
2
tea
coffee
14
chocolate
(Alice, 020598555)
Alice
John
Sue
(Sue, 060011223)
(John, 020123456)
Problem: collisions
I
(025-611-001, Mr. X)
3
..
.
997
998
999
Collisions
Collisions
occur when different elements are mapped to the same cell:
I
0
1
(025-611-001, Mr. X)
(123-456-002, Dipsy)
3
..
.
Different possibilities of handing collisions:
I
chaining,
linear probing,
double hashing, . . .
Collisions continued
Usual setting:
I
N
N
N
(N 1) (N 2) (N p + 1)
=
N p1
q(p, N) =
hash function
The load factor of a hash table is the ratio n/N, that is, the
number of elements in the table divided by size of the table.
High load factor 0.85 has negative effect on efficiency:
I
lots of collisions
Every has value (cell in the hash table) has equal probabilty.
h(b 0 ) = 80
h(ab 0 ) = 354
h(ba 0 ) = 979
h(c 0 ) = 618
...
h(d 0 ) = 163
(025-611-000, Mr. X)
(987-067-001, Ms. X)
..
.
Then 80% of the cells in the table stay unused! Bad hash!
(for some m Z)
b
0010
c
0011
000100100011 = 291
Mid-square method:
I
Random method:
I
(01.01., Sue)
(12.03., John)
(16.08., Madonna)
2
3
4
Worst-case: everything in one cell, that is, linear list.
Linear probing:
I
41
1 2 3
18 44 59 32 22 31 73
4 5 6 7 8 9 10 11 12
findElement(k):
i = h(k)
p=0
while p < N do
c = A[i]
if c == then return No_Such_Key
if c.key == k then return c.element
i = (i + 1) mod N
p=p+1
return No_Such_Key
15 2
1 2 3
3
4
4
5
5
6
6
7
7
8
9 10 11 12
2
2
3
3
4
4
5
5
6
6
7
7
9 10 11 12
0
I
15
A 2
1 2 3
3
4
4
5
5
6
6
7
7
8
9 10 11 12
5 6
A 16 17 4 A
1 2 3 4 5 6 7
7
8
9 10 11 12
A 16 17 4
1 2 3 4 5
3
6
6
7
7
8
9 10 11 12
reduced performance
Examples:
I
Double Hashing
N = 13
h(k) = k mod 13
d(k) = 7 (k mod 7)
31
41
0 1 2 3
k
18
41
22
44
59
32
31
73
h(k)
5
2
9
5
7
6
5
8
d(k)
3
1
6
5
4
3
4
4
18 32 59 73 22 44
4 5 6 7 8 9 10 11 12
Probes
5
2
9
5, 10
7
6
5,9,0
8
Performance of Hashing
In worst case insertion, lookup and removal take O(n) time:
I
0.4
0.6
0.8
Performance of Hashing
In worst case insertion, lookup and removal take O(n) time:
I
small databases
compilers
browser caches
Universal Hashing
there always exist keys that are mapped to the same value
1
N
Proof Sketch.
Let 0 i 6= j < M. For every i 0 6= j 0 < p there exist unique a, b
such that i 0 = a i + b mod p and j 0 = a i + b mod p. Thus
every pair (i 0 , j 0 ) with i 0 6= j 0 has equal probability. Consequently
the probability for i 0 mod N = j 0 mod N is N1 .
search
O(log2 n)
O(1) 1
insert
O(log2 n)
O(1) 1
remove
O(log2 n)
O(1) 1
closestAfter
O(log2 n)
O(n + N)
closestBefore
O(log2 n)
O(n + N)