Pooria Azimi
Project 1
Questions?
Abstraction Layers
DBMS Architecture Levels:
Outline
How to store a table on disk?
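Users table:
Columns: UID, Name, Email, City, Salary
UID: 1, 2, 3, 4, 5, 6, 7
Email: alice@aut.ac.ir, bob@gmail.com, dave@yahoo.com, carol@gmail.com, eve@hotmail.com, peggy@aut.ac.ir, steve@me.com
City: London, Paris, Shanghai, Sydney, Toronto, Shanghai, Cairo
Rows with gender and age group shown:
carol@gmail.com | Female | 20-30
eve@hotmail.com | Female | 50-60
peggy@aut.ac.ir | Female | 40-50
steve@me.com | Male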
Sequential Files
CSV (comma-separated values)
Fixed-length fields
Variable-length fields
XML
...
Small tables and short logs
Tables that don't change very often
Tables that must be read sequentially (moving data between two applications)
Logs
Tables that can be read sequentially
Tables with short, fixed-length fields
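To make the fixed-length case concrete, here is a minimal Python sketch of a sequential file of fixed-size records. The record layout (4-byte UID, 16-byte name, 32-byte email) and the helper names are assumptions made for illustration; they do not come from the slides. Because every record has the same size, record number i starts at byte i * record size, so it can be read with a single seek.

```python
import io
import struct

# Assumed layout (not from the slides): 4-byte unsigned UID,
# 16-byte name, 32-byte email, all fixed width, no padding.
RECORD = struct.Struct("<I16s32s")

def write_records(f, rows):
    """Append each (uid, name, email) row as one fixed-size record."""
    for uid, name, email in rows:
        f.write(RECORD.pack(uid, name.encode(), email.encode()))

def read_record(f, index):
    """Jump straight to record `index`; earlier records are not scanned."""
    f.seek(index * RECORD.size)
    uid, name, email = RECORD.unpack(f.read(RECORD.size))
    return uid, name.rstrip(b"\0").decode(), email.rstrip(b"\0").decode()

f = io.BytesIO()  # stands in for a file opened in binary mode
write_records(f, [(2, "Bob", "bob@gmail.com"), (5, "Smith", "smith@me.com")])
second = read_record(f, 1)
```

The price of this simplicity is wasted space for short values and truncation of long ones, which is why the format suits short fields.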
Hash Map
2 | Bob | 26 | bob@gmail.com
14 | Jane | 20 | clark@gmail.com
5 | Smith | 14 | smith@me.com
13 | Adams | 35 | adams@yahoo.com
Hash function on UID
Example lookups: UID=7, City=Cupertino
Good:
Fast (for the lookup field)
Easy to implement
Relatively space-efficient (no pointers)
Bad:
Naive implementations only allow one index (lookup key) per table
Not scalable (for non-memory-resident data)
Not suitable for range queries
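The trade-offs above can be sketched with a Python dict standing in for the hash map. The rows are the ones on the slide; reading the third column as an age, and the function names, are assumptions. A lookup on the key is a single hash probe, while a range query or a query on any other field falls back to checking every entry.

```python
# Rows taken from the slide; the field order (UID, name, age, email) is assumed.
rows = [
    (2, "Bob", 26, "bob@gmail.com"),
    (14, "Jane", 20, "clark@gmail.com"),
    (5, "Smith", 14, "smith@me.com"),
    (13, "Adams", 35, "adams@yahoo.com"),
]

# One hash map, keyed on the single lookup field (UID).
by_uid = {row[0]: row for row in rows}

def lookup(uid):
    """Point lookup: one hash computation, no scan."""
    return by_uid.get(uid)

def uid_range(lo, hi):
    """Range query: hashing scatters keys, so every entry must be checked."""
    return sorted(row for uid, row in by_uid.items() if lo <= uid <= hi)

def by_name(name):
    """Query on a non-key field: the map gives no help, so scan everything."""
    return [row for row in by_uid.values() if row[1] == name]
```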
B tree
[Figure: a B tree keyed on UID. An upper node holds 9 | Joh-. Below it, an internal node holds 12 | Jon- and 15 | Ada-, with child nodes 10 | Bob, 11 | Smi-; 13 | Bla-, 14 | Cla-; and 16 | Jan-, 17 | Ste-.]
They were specifically built for database storage
They are I/O-friendly
B trees are fast and scalable
They can be modified to allow more than one index per table
They work well with range queries
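A minimal sketch of B tree search and range scanning, assuming a static tree built by hand from the keys 10 to 17 shown on the slide (insertion, node splitting, and disk pages are left out). The `Node` class and both function names are hypothetical. Each node holds sorted keys, and child i holds the keys smaller than key i, so a search touches one node per level and a range query can skip subtrees that lie outside the range.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    keys: list                                    # sorted keys in this node
    children: list = field(default_factory=list)  # empty for a leaf

def search(node, key):
    """Return True if `key` is stored anywhere under `node`."""
    i = 0
    while i < len(node.keys) and key > node.keys[i]:
        i += 1
    if i < len(node.keys) and node.keys[i] == key:
        return True
    if not node.children:
        return False
    return search(node.children[i], key)

def range_query(node, lo, hi, out=None):
    """Collect keys in [lo, hi] in sorted order, skipping subtrees out of range."""
    if out is None:
        out = []
    for i, k in enumerate(node.keys):
        if node.children and lo < k:
            range_query(node.children[i], lo, hi, out)
        if lo <= k <= hi:
            out.append(k)
        if k >= hi:
            return out      # everything further right is larger than hi
    if node.children:
        range_query(node.children[-1], lo, hi, out)
    return out

# The subtree from the slide: 12 and 15 in the internal node, three leaves.
root = Node([12, 15], [Node([10, 11]), Node([13, 14]), Node([16, 17])])
```

For example, `range_query(root, 11, 15)` visits only the first two leaves and never reads the leaf holding 16 and 17.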
Demo
B tree
Good:
Fast (for indexed fields)
Scalable
Suitable for range queries (on indexed fields)
Bad:
Space overhead
Index
Why use indices?
Faster retrieval
Most efficient in range queries
Tradeoffs
Slower writes/updates vs. faster retrieval
Disk space overhead
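The trade-off can be seen in a small Python sketch. It assumes a sorted list of (key, row position) pairs as a stand-in for a real index structure; the `IndexedTable` class and its methods are hypothetical. Reads on the indexed column use binary search instead of a full scan, while every insert has to update the index as well as the table, and the index itself takes extra space.

```python
import bisect

class IndexedTable:
    """Rows in insertion order plus one sorted index on a chosen column."""

    def __init__(self, column):
        self.column = column
        self.rows = []    # the table itself (a list of dicts)
        self.index = []   # sorted (key, row position) pairs: the space overhead

    def insert(self, row):
        # Every write pays twice: append the row, then keep the index sorted.
        self.rows.append(row)
        bisect.insort(self.index, (row[self.column], len(self.rows) - 1))

    def find_range(self, lo, hi):
        # Binary search for the first key >= lo, then walk forward while key <= hi.
        start = bisect.bisect_left(self.index, (lo, -1))
        result = []
        for key, pos in self.index[start:]:
            if key > hi:
                break
            result.append(self.rows[pos])
        return result

    def scan_range(self, lo, hi):
        # Without an index, every row has to be examined.
        return [r for r in self.rows if lo <= r[self.column] <= hi]

table = IndexedTable("uid")
for uid in (14, 2, 13, 5):
    table.insert({"uid": uid})
```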
SELECT name, city FROM users WHERE city LIKE 'S%';
SELECT name, city FROM users WHERE uid BETWEEN 10 AND 20;
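For readers who want to run the two queries, here is a minimal sketch using Python's built-in sqlite3 module. Note that this swaps in SQLite for MySQL, and that the rows and the index name `ix_users_city` are hypothetical, chosen only so that each query returns something.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (uid INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [  # hypothetical rows, only for illustration
        (9, "Ann", "Paris"),
        (12, "Ben", "Shanghai"),
        (15, "Cid", "Sydney"),
        (18, "Eve", "Cairo"),
        (25, "Dee", "London"),
    ],
)
conn.execute("CREATE INDEX ix_users_city ON users (city)")

# Prefix match on city.
prefix = conn.execute(
    "SELECT name, city FROM users WHERE city LIKE 'S%' ORDER BY uid"
).fetchall()

# Range match on uid.
in_range = conn.execute(
    "SELECT name, city FROM users WHERE uid BETWEEN 10 AND 20 ORDER BY uid"
).fetchall()
```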
10 11 12 13 14 15
Each index entry takes 8 bytes (4 bytes for the key, 4 bytes for the pointer)
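The 8-byte entry size makes index sizes easy to estimate. The sketch below is back-of-the-envelope arithmetic only: it counts one entry per row and ignores the upper tree levels, and both the 4096-byte page size and the 10-million-row table are assumptions, not figures from the slides.

```python
ENTRY_BYTES = 4 + 4  # 4-byte key + 4-byte pointer, as on the slide

def index_leaf_bytes(num_rows):
    """Bytes needed for one index entry per row (upper tree levels ignored)."""
    return num_rows * ENTRY_BYTES

def entries_per_page(page_bytes=4096):
    """How many entries fit in one page; the 4096-byte page is an assumption."""
    return page_bytes // ENTRY_BYTES

# A hypothetical table of 10 million rows needs about 80 MB of index entries.
size_mb = index_leaf_bytes(10_000_000) / 1_000_000
```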
B tree
[Figure: a B tree index on UID held in memory, with nodes containing keys only (10, 11; 13, 14; 16, 17). Each key points to its data row on disk. Data (UID): 12 | Jon-, 15 | Ada-; 10 | Bob, 11 | Smi-; 13 | Bla-, 14 | Cla-; 16 | Jan-, 17 | Ste-.]
Multiple Indexes
[Figure: three B tree indexes held in memory over the same data rows on disk.
UID: 12, 15 above the nodes 10, 11; 13, 14; 16, 17.
Username: Bo, Jon above the nodes Ad, Bl; Cl, Jan; Sm, Ste.
Age: 20, 35 above the nodes 14, 19; 23, 26; 48, 56.
Data rows on disk: 12 | Jon-, 15 | Ada-; 10 | Bob, 11 | Smi-; 13 | Bla-, 14 | Cla-; 16 | Jan-, 17 | Ste-.]
Index trees are small. Therefore, we can fit all index trees in main memory.
Clustered Index:
The ordering of the physical data rows is in accordance with the index blocks that point to them. Therefore, only one clustered index can be created on a given table.
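A minimal sketch of the difference, assuming a Python list stands in for the rows on disk. The UIDs and truncated name labels are taken from the slides; the function names are hypothetical. With the rows physically sorted by UID, a UID range is one contiguous slice, whereas a range on a non-clustered index yields row positions scattered across the table.

```python
import bisect

# Rows kept physically sorted by UID: the clustered order.
# Names are the truncated labels from the slides.
rows = [(10, "Bob"), (11, "Smi-"), (12, "Jon-"), (13, "Bla-"),
        (14, "Cla-"), (15, "Ada-"), (16, "Jan-"), (17, "Ste-")]
uids = [uid for uid, _ in rows]

# Non-clustered index on name: sorted names, each pointing at a row position.
name_index = sorted((name, pos) for pos, (_, name) in enumerate(rows))

def uid_range(lo, hi):
    """Clustered: matching rows sit next to each other, so one slice suffices."""
    start = bisect.bisect_left(uids, lo)
    end = bisect.bisect_right(uids, hi)
    return rows[start:end]

def name_range(lo, hi):
    """Non-clustered: index entries are adjacent, but the row positions
    they point to can be anywhere in the table."""
    start = bisect.bisect_left(name_index, (lo, -1))
    positions = []
    for name, pos in name_index[start:]:
        if name > hi:
            break
        positions.append(pos)
    return positions
```

Since the rows can only be stored in one physical order, only one of the indexes can be clustered.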
[Figure: B tree indexes held in memory over the data rows on disk, with one index labeled Clustered and another labeled Non-clustered.
UID: 12, 15 above the nodes 10, 11; 13, 14; 16, 17.
Username: Bo, Jon above the nodes Ad, Bl; Cl, Jan; Sm, Ste.
A third set of nodes: 14, 19; 23, 26; 35, 48.
Data rows on disk: 12 | Jon-, 15 | Ada-; 10 | Bob, 11 | Smi-; 13 | Bla-, 14 | Cla-; 16 | Jan-, 17 | Ste-.]
Reverse Index
Sample Query: Find all users with an aut.ac.ir email account
A pattern with a leading wildcard cannot use an ordinary index on email:
CREATE INDEX IX_users_email ON users (email);
SELECT name FROM users WHERE email LIKE '%@aut.ac.ir';
Indexing the reversed string turns the suffix match into a prefix match (functional index syntax, MySQL 8.0.13 and later):
CREATE INDEX IX_users_email_reversed ON users ((REVERSE(email)));
SELECT name FROM users WHERE REVERSE(email) LIKE REVERSE('%@aut.ac.ir');
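The idea can be sketched in Python with a sorted list of reversed strings standing in for the index. The addresses are the ones from the Users table slide; the function name `ends_with` is hypothetical. Reversing both the stored values and the pattern turns "ends with @aut.ac.ir" into "starts with ri.ca.tua@", which a sorted structure can answer with a binary search followed by a short forward walk.

```python
import bisect

emails = [  # addresses from the Users table slide
    "alice@aut.ac.ir", "bob@gmail.com", "carol@gmail.com", "dave@yahoo.com",
    "eve@hotmail.com", "peggy@aut.ac.ir", "steve@me.com",
]

# The "reverse index": reversed strings kept in sorted order.
reversed_index = sorted(e[::-1] for e in emails)

def ends_with(suffix):
    """Suffix match on email == prefix match on reversed email."""
    prefix = suffix[::-1]
    start = bisect.bisect_left(reversed_index, prefix)
    matches = []
    for rev in reversed_index[start:]:
        if not rev.startswith(prefix):
            break   # sorted order: all prefix matches are contiguous
        matches.append(rev[::-1])
    return matches
```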
Bitmap Index
Suitable for: columns with few distinct values, such as gender or age group
Gender (Female): 0111011000101010101
Age Group (20-30): 0110010000100000101
Age Group (30-40): 0001000111000000010
Female, Age between 20 and 40: 0111010000100000101
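The combined bitmap can be reproduced with two bitwise operations. The sketch below uses Python integers as bitmaps and copies the three bit strings from the slide; treating the leftmost bit as the first row is an assumption about how the slide numbers its rows.

```python
# Bitmaps copied from the slide: one bit per row.
# Assumption: the leftmost bit corresponds to the first row.
female    = int("0111011000101010101", 2)
age_20_30 = int("0110010000100000101", 2)
age_30_40 = int("0001000111000000010", 2)

# "Female AND (age 20-30 OR age 30-40)" is two bitwise operations.
result = female & (age_20_30 | age_30_40)
result_bits = format(result, "019b")   # pad back to 19 rows

# Row positions (0-based, left to right) whose bit is set.
matching_rows = [i for i, bit in enumerate(result_bits) if bit == "1"]
```

The result, `0111010000100000101`, matches the bitmap on the slide.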
Good:
Much faster search (on indexed fields)
Range queries
Bad:
Slower insert/update/delete
Space overhead
Indices in MySQL
CREATE INDEX IX_users_username ON users (username);
Questions
Storage, B+ tree, ...
Indices
Project 1: MySQL
Project 2: MongoDB
Optional Project