Torsten Grust
Chapter 6
Hash-Based Indexing
Efficient Support for Equality Search
Hash-Based Indexing
Static Hashing
Hash Functions
Extendible Hashing
Search
Insertion
Procedures
Linear Hashing
Insertion (Split, Rehashing)
Running Example
Procedures
Torsten Grust
Wilhelm-Schickard-Institut für Informatik
Universität Tübingen
Equality selection
SELECT *
  FROM R
 WHERE A = k
Static Hashing
$h : \text{INTEGER} \to [0, \dots, N-1]$
if A is of SQL type INTEGER.
[Figure: static hash table. A key k is hashed to one bucket among the primary buckets (numbered up to N-1); overflow pages are chained to primary buckets.]
$h(k) = h(k')$ even if $k \neq k'$
if n = 1 then
    return 1;
else
    /* first factor: probability that n − 1 persons have different birthdays */
    return different_birthday(n − 1) × (365 − (n − 1)) / 365;
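The recursion above can be run directly. The following is a minimal Python sketch, assuming a 365-day year and uniformly distributed birthdays; the helper `smallest_group_with_likely_collision` is a hypothetical name added here for illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def different_birthday(n: int) -> float:
    """Probability that n persons all have different birthdays."""
    if n == 1:
        return 1.0
    # The n-th person must avoid the n - 1 birthdays already taken.
    return different_birthday(n - 1) * (365 - (n - 1)) / 365


def smallest_group_with_likely_collision() -> int:
    """Smallest n for which a shared birthday is more likely than not."""
    n = 1
    while different_birthday(n) >= 0.5:
        n += 1
    return n
```

Read as a statement about hashing: with 365 buckets, a collision becomes more likely than not after only a few dozen keys, even if the hash values were perfectly random.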
Hash Functions
It is impossible to generate truly random hash values from
the non-random key values found in actual tables. Can we
define hash functions that scatter even better than a random
function?
Hash function
$h(k) = k \bmod N$
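To make the interplay of the division hash function, primary buckets, and overflow pages concrete, here is a small Python sketch of a static hash index. It is an illustration under assumptions, not the slides' implementation: the page capacity `CAPACITY`, the class names `Page` and `StaticHashIndex`, and the in-memory lists standing in for disk pages are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

CAPACITY = 4  # assumed number of entries per page


@dataclass
class Page:
    entries: List[int] = field(default_factory=list)
    overflow: Optional["Page"] = None  # next page in the overflow chain


class StaticHashIndex:
    def __init__(self, n_buckets: int):
        self.n = n_buckets
        self.buckets = [Page() for _ in range(n_buckets)]  # primary buckets

    def h(self, k: int) -> int:
        return k % self.n  # division hash: h(k) = k mod N

    def insert(self, k: int) -> None:
        page = self.buckets[self.h(k)]
        # Walk the chain until a page with free space is found.
        while len(page.entries) >= CAPACITY:
            if page.overflow is None:
                page.overflow = Page()
            page = page.overflow
        page.entries.append(k)

    def search(self, k: int) -> bool:
        page = self.buckets[self.h(k)]
        while page is not None:
            if k in page.entries:
                return True
            page = page.overflow
        return False

    def chain_length(self, b: int) -> int:
        """Number of pages (primary plus overflow) for bucket b."""
        length, page = 0, self.buckets[b]
        while page is not None:
            length += 1
            page = page.overflow
        return length
```

Inserting only multiples of N sends every key to bucket 0, so its overflow chain grows while the other buckets stay empty. This is the skew that a fixed N cannot react to.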
Extendible Hashing
Example (Extendible hash table setup; ignore the fields labeled 2 for now)

directory (field: 2)      hash table
h →  00  →  bucket A (2):  4* 12* 32* 16*
     01  →  bucket B (2):  1* 5* 21*
     10  →  bucket C (2):  10*
     11  →  bucket D (2):  15* 7* 19*
To find a record with key k such that $h(k) = 5 = 101_2$, follow the
second directory pointer ($101_2 \mathbin{\&} 11_2 = 01_2$) to bucket B, then use
entry 5* to access the wanted record.
Hash table with entry 13* placed in bucket B:

directory (field: 2)      hash table
h →  00  →  bucket A (2):  4* 12* 32* 16*
     01  →  bucket B (2):  1* 5* 21* 13*
     10  →  bucket C (2):  10*
     11  →  bucket D (2):  15* 7* 19*
$4 = 100_2$
$12 = 1100_2$
$32 = 100000_2$
$16 = 10000_2$
$20 = 10100_2$

third bit (from the right) = 0:  32, 16      (Bucket A)
third bit (from the right) = 1:  4, 12, 20   (Bucket A2)
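The redistribution between Bucket A and Bucket A2 can be reproduced with a single bit test. The sketch below is illustrative Python; the function name `split_bucket` and the convention of counting bit positions from 0 at the least significant bit are assumptions made here.

```python
def split_bucket(hash_values, bit):
    """Partition hash values by the given bit (0 = least significant).

    Values with the bit cleared stay in the old bucket, values with the
    bit set move to the newly created split bucket.
    """
    stay, move = [], []
    for hv in hash_values:
        if hv & (1 << bit):
            move.append(hv)
        else:
            stay.append(hv)
    return stay, move


# Entries of the full bucket A plus the new entry 20; the bucket was
# addressed by two bits so far, so the third bit (bit 2) decides.
bucket_a, bucket_a2 = split_bucket([4, 12, 32, 16, 20], bit=2)
```

Here `bucket_a` ends up holding 32 and 16 while `bucket_a2` receives 4, 12, and 20, matching the binary representations listed above.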
directory (field: 3)      hash table
000  →  bucket A  (3):  32* 16*
001  →  bucket B  (2):  1* 5* 21* 13*
010  →  bucket C  (2):  10*
011  →  bucket D  (2):  15* 7* 19*
100  →  bucket A2 (3):  4* 12* 20*
101  →  bucket B
110  →  bucket C
111  →  bucket D
directory (field: 3)      hash table
000  →  bucket A  (3):  32* 16*
001  →  bucket B  (3):  1* 9*
010  →  bucket C  (2):  10*
011  →  bucket D  (2):  15* 7* 19*
100  →  bucket A2 (3):  4* 12* 20*
101  →  bucket B2 (3):  5* 21* 13*
110  →  bucket C
111  →  bucket D
Function: hsearch(k)
n ← global depth ;
b ← h(k) & (2^n − 1) ;      /* mask all but the low n bits */
return bucket[b] ;
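A direct Python transcription of this search procedure might look as follows. It is a sketch: the directory is modeled as a plain list of bucket lists, the stored values are the hash values themselves (as in the example figures), and the parameter names are chosen here.

```python
def hsearch(directory, global_depth, hash_value):
    """Return the bucket that the directory assigns to hash_value."""
    # Mask all but the low `global_depth` bits.
    b = hash_value & ((1 << global_depth) - 1)
    return directory[b]


# Example table with global depth 2 (entries are hash values).
bucket_a = [4, 12, 32, 16]
bucket_b = [1, 5, 21]
bucket_c = [10]
bucket_d = [15, 7, 19]
directory = [bucket_a, bucket_b, bucket_c, bucket_d]  # slots 00, 01, 10, 11

found = hsearch(directory, 2, 5)  # 5 = 101 in binary, low two bits are 01
```

The lookup costs one directory access plus one bucket access, independent of how many buckets exist.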
Function: hinsert(k)
n ← global depth ;
b ← hsearch(k) ;
if b has capacity then
    Place k in bucket b ;
    return;
⋮
/* redistribute entries of b including k */
foreach k′ in bucket b do
    if h(k′) & 2^d ≠ 0 then
        Move k′ to bucket b2 ;
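Since parts of the insertion procedure are elided above, the following Python sketch fills them in with the textbook formulation of extendible hashing: double the directory when the overflowing bucket's local depth equals the global depth, then redistribute by the next bit. The class and attribute names are assumptions, the stored values are hash values, and overflow chains are not modeled (the sketch assumes at most `capacity` identical hash values).

```python
class Bucket:
    def __init__(self, depth):
        self.depth = depth    # local depth
        self.entries = []


class ExtendibleHash:
    def __init__(self, capacity=4, global_depth=2):
        self.capacity = capacity
        self.n = global_depth  # global depth
        self.directory = [Bucket(global_depth) for _ in range(1 << global_depth)]

    def search(self, hv):
        return self.directory[hv & ((1 << self.n) - 1)]

    def insert(self, hv):
        b = self.search(hv)
        if len(b.entries) < self.capacity:
            b.entries.append(hv)
            return
        d = b.depth
        if d == self.n:
            # Directory doubling: the upper half mirrors the lower half.
            self.directory = self.directory + self.directory
            self.n += 1
        b2 = Bucket(d + 1)
        b.depth = d + 1
        bit = 1 << d
        # Directory slots with bit d set now point to the split bucket.
        for i, slot in enumerate(self.directory):
            if slot is b and i & bit:
                self.directory[i] = b2
        # Redistribute the old entries by bit d.
        old = b.entries
        b.entries = [x for x in old if not x & bit]
        b2.entries = [x for x in old if x & bit]
        # Retry; splits again if all old entries landed on hv's side.
        self.insert(hv)


table = ExtendibleHash(capacity=4, global_depth=2)
for hv in [4, 12, 32, 16, 1, 5, 21, 10, 15, 7, 19]:
    table.insert(hv)       # initial example table
for hv in [13, 20, 9]:
    table.insert(hv)       # 20 doubles the directory, 9 splits bucket B
```

Replaying the inserts of 13, 20, and 9 yields the bucket contents shown in the example figures: inserting 20 doubles the directory, whereas inserting 9 only splits bucket B because its local depth was still below the global depth.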
Overflow chains?
Extendible hashing uses overflow chains hanging off a bucket
only as a last resort. Under which circumstances will extendible
hashing create an overflow chain?
Linear Hashing
[Figure: ranges of the hash function family h_level, h_level+1, h_level+2. Each function doubles the range of its predecessor: from 0 … N−1 to 0 … 2N−1, and from 0 … 2N−1 to 0 … 4N−1.]
$h_{\mathit{level}}(k) = h(k) \bmod (2^{\mathit{level}} \cdot N) \qquad (\mathit{level} = 0, 1, 2, \dots)$
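A small Python sketch can illustrate how such a family of hash functions behaves. It assumes the common definition h_level(k) = h(k) mod (2^level · N), an identity base hash h, and N = 4 initial buckets; all three are assumptions made for this illustration.

```python
N = 4  # assumed initial number of buckets


def h(k: int) -> int:
    return k  # assumed base hash function (identity)


def h_level(level: int, k: int) -> int:
    """Hash function of the family with range 0 .. 2^level * N - 1."""
    return h(k) % (2 ** level * N)


def candidates(level: int, k: int):
    """The two buckets h_level+1 can choose between for key k."""
    b = h_level(level, k)
    return b, b + 2 ** level * N
```

The property that linear hashing relies on is that h_level+1(k) is always either h_level(k) or h_level(k) + 2^level · N, so rehashing a bucket distributes its entries over the bucket itself and exactly one new bucket.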
[Figure: hash table with buckets 0 … 2^level · N − 1. Bucket next is rehashed using h_level+1; bucket 2^level · N + next is appended to the table.]
3. Increment next by 1.
[Figure: hash buckets during a round of splits. h_level(k) addresses the range of h_level (up to bucket 2^level · N − 1), which lies inside the range of h_level+1. Marked regions: buckets already split (h_level+1) and images of already split buckets (h_level+1).]
$\mathit{next} > 2^{\mathit{level}} \cdot N - 1$?
level = 0

h1    h0    hash buckets    overflow pages
000   00    …
001   01    9* 25* 5*
010   10    …
011   11    …
h1    h0    hash buckets    overflow pages
000   00    32*
001   01    9* 25* 5*                         ← next
010   10    …
011   11    …               43*
100         44* 36*
h1    h0    hash buckets    overflow pages
000   00    32*
001   01    9* 25* 5* 37*                     ← next
010   10    …
011   11    …               43*
100         44* 36*
h1    h0    hash buckets    overflow pages
000   00    32*
001   01    9* 25*
010   10    …                                 ← next
011   11    …               43*
100         44* 36*
101         5* 37* 29*
h1    h0    hash buckets    overflow pages
000   00    32*
001   01    9* 25*
010   10    …
011   11    …               43*               ← next
100         44* 36*
101         5* 37* 29*
110         …
      hash buckets    overflow pages
000   32*
001   9* 25*
010   …               50*
011   …
100   44* 36*
101   5* 37* 29*
110   …
111   31* 7*
Function: hsearch(k)
b ← h_level(k) ;
if b < next then
    /* b has already been split, record for key k */
    /* may be in bucket b or bucket 2^level · N + b */
    /* rehash */
    b ← h_level+1(k) ;
return bucket[b] ;
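The address computation of this search procedure fits in a few lines of Python. The sketch below assumes h_level(k) = k mod (2^level · N) with N = 4 and uses the bucket contents of the running example at the point where next = 2; buckets 2 and 3 are left empty here because their contents are not needed for the lookups. The function and parameter names are chosen for this illustration.

```python
def hsearch(k, level, next_, n0, buckets):
    """Return the bucket responsible for key k in a linear hash table."""
    b = k % (2 ** level * n0)              # h_level(k)
    if b < next_:
        # Bucket b has already been split in this round: rehash.
        b = k % (2 ** (level + 1) * n0)    # h_level+1(k)
    return buckets[b]


# State with level = 0, next = 2, N = 4 (buckets 0 and 1 already split).
buckets = [
    [32],           # 000
    [9, 25],        # 001
    [],             # 010 (contents omitted)
    [],             # 011 (contents omitted)
    [44, 36],       # 100, image of bucket 000
    [5, 37, 29],    # 101, image of bucket 001
]

hit = hsearch(37, level=0, next_=2, n0=4, buckets=buckets)
```

Key 37 first hashes to bucket 1, which lies in front of next, so it is rehashed and found in bucket 5. No directory is consulted; level and next are the only state needed.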
Function: hinsert(k)
b ← h_level(k) ;
if b < next then
    /* rehash */
    b ← h_level+1(k) ;
Place k in bucket[b] ;
if overflow(bucket[b]) then
    Allocate new page b′ ;
    /* Grow hash table by one page */
    bucket[2^level · N + next] ← addr(b′) ;
    ⋮
    next ← next + 1 ;
    /* did we split every bucket in the hash? */
    if next > 2^level · N − 1 then
        /* hash table size doubled, split from top */
        level ← level + 1 ;
        next ← 0 ;
return;
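The insertion procedure, including the elided redistribution step, can be sketched in Python as follows. Several points are assumptions of this sketch rather than statements of the slides: h_level(k) = k mod (2^level · N), each bucket is one list whose entries beyond `capacity` stand for its overflow pages, a split is triggered by every insertion that leaves a bucket over capacity, and the initial contents of buckets 10 and 11 (14, 18, 10, 30 and 31, 35, 7, 11) are assumed fill values.

```python
class LinearHash:
    def __init__(self, n0=4, capacity=4):
        self.n0, self.capacity = n0, capacity
        self.level, self.next = 0, 0
        # Entries beyond `capacity` in a list model overflow pages.
        self.buckets = [[] for _ in range(n0)]

    def _h(self, level, k):
        return k % (2 ** level * self.n0)

    def _addr(self, k):
        b = self._h(self.level, k)
        if b < self.next:              # already split: rehash
            b = self._h(self.level + 1, k)
        return b

    def insert(self, k):
        b = self._addr(k)
        self.buckets[b].append(k)
        if len(self.buckets[b]) > self.capacity:
            self._split()

    def _split(self):
        # Grow the table by one bucket at position 2^level * N + next.
        self.buckets.append([])
        old = self.buckets[self.next]
        self.buckets[self.next] = []
        for x in old:                  # redistribute via h_level+1
            self.buckets[self._h(self.level + 1, x)].append(x)
        self.next += 1
        if self.next > 2 ** self.level * self.n0 - 1:
            # Every bucket was split once: table size has doubled.
            self.level += 1
            self.next = 0


table = LinearHash(n0=4, capacity=4)
for k in [32, 44, 36, 9, 25, 5, 14, 18, 10, 30, 31, 35, 7, 11]:
    table.insert(k)                    # assumed initial contents
for k in [43, 37, 29, 22, 66, 34, 50]:
    table.insert(k)
```

Note that the bucket being split is always bucket next, not the bucket that overflowed. Inserting 43 overflows bucket 11 but splits bucket 00, which is why 43* stays on an overflow page until the round-robin split reaches its bucket.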
Function: hdelete(k)
b ← h_level(k) ;
⋮
Remove k from bucket[b] ;
if empty(bucket[b]) then
    Move entries from page bucket[2^level · N + next − 1]
        to page bucket[next − 1] ;
    next ← next − 1 ;
    if next < 0 then
        /* round-robin scheme for deletion */
        level ← level − 1 ;
        next ← 2^level · N − 1 ;
return;