
Architecture and Implementation of Database Systems

HS 07
0. Introduction/Record Management
Dr. Jens Dittrich
jens.dittrich@inf
www.inf.ethz.ch/~jensdi
Institut of Information Systems

Record/Tuple Management
Tasks: Mapping from records/tuples to pages/blocks
- Agenda
    - structure of a page
    - record addressing
    - record mapping
    - record layout
    - storage models
        - NSM
        - DSM
        - PAX
    - compression
    - long fields
    - memory management
[Figure: the records/tuples (42, Hugo, Müller), (12, Simon, Schmidt) and (77, Frank, Meier) are mapped to a block/page.]

Structure of a Page
- A page consists of three parts:
    - head (meta data, e.g. page id, log sequence number: see recovery section)
    - slots (pointers to records)
    - data (record data)
- slot = (pointer, size of the record)
- space for slots is allocated top-down
- space for record data is allocated bottom-up
Advantage: records may easily migrate inside a page
[Figure: page layout from top to bottom: head, slots, free space, data.]
October 4, 2007 Dr. Jens Dittrich / Institut of Information Systems / jens.dittrich@inf
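The page structure described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the class name `SlottedPage`, the page size, the header size and the slot size are made up for the example and do not come from the lecture.

```python
class SlottedPage:
    """Toy slotted page: slots grow from the top, record data from the bottom."""

    def __init__(self, page_id, size=256, header_size=16, slot_size=4):
        # size, header_size and slot_size are assumed values for illustration
        self.page_id = page_id
        self.size = size
        self.header_size = header_size
        self.slot_size = slot_size
        self.data = bytearray(size)
        self.slots = []            # each slot = (pointer, size of the record)
        self.free_end = size       # record data is allocated from here downwards

    def free_space(self):
        slots_end = self.header_size + len(self.slots) * self.slot_size
        return self.free_end - slots_end

    def insert(self, record):
        """Store a record and return its slot number."""
        if len(record) + self.slot_size > self.free_space():
            raise ValueError("record does not fit into this page")
        self.free_end -= len(record)
        self.data[self.free_end:self.free_end + len(record)] = record
        self.slots.append((self.free_end, len(record)))
        return len(self.slots) - 1

    def read(self, slot):
        pointer, length = self.slots[slot]
        return bytes(self.data[pointer:pointer + length])

    def move_inside_page(self, slot, new_pointer):
        """Relocate a record within the page; only its slot entry changes."""
        pointer, length = self.slots[slot]
        record = self.data[pointer:pointer + length]   # slicing copies the bytes
        self.data[new_pointer:new_pointer + length] = record
        self.slots[slot] = (new_pointer, length)
```

Because callers only ever hold the slot number, `move_inside_page` can relocate the bytes without invalidating any reference to the record.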

Tuple IDs (TID)
- Indirection based on tuple-ID (TID)
- TID = (page, slot)
[Figure: the TID (42, 2) refers to slot 2 of page 42. The page holds the records (42, Hugo, Müller), (12, Simon, Schmidt) and (77, Frank, Meier).]

Migration of Tuples Inside a Page
[Figure: a record is moved within page 42; the TID (42, 2) stays the same.]

Migration of a Tuple to Another Page
[Figure, before: the TID (42, 2) refers to the record (12, Simon, Schmidt) on page 42, next to (42, Hugo, Müller) and (77, Frank, Meier).]
[Figure, after: (12, Simon, Schmidt) is stored on page 77. Page 42 keeps the forward TID (77, 3) in its place, so the TID (42, 2) remains valid.]

Discussion
- access trivial if tuple does not migrate to a new page
- migration to other page using forward TIDs
- if migrated tuple migrates again: update first forwarding TID,
  so at most one indirection is caused by a TID
- performance
    - at least one page access required
    - at most 2 page accesses required (if forward TID has to be followed)

Indirect Addressing: Mapping Table
- Idea:
    1. keep a separate mapping table
    2. hide physical addresses (outside world only knows logical addresses)
- no forwarding
- if tuple needs to be moved: change entry in mapping table
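The forward-TID scheme from the discussion can be sketched as follows. The dictionary `pages`, the helper names `fetch` and `migrate`, and the slot numbers of the two records that stay on page 42 are assumptions made for this example.

```python
# pages[page][slot] holds either ("record", payload) or ("forward", tid)
pages = {
    42: {
        1: ("record", (42, "Hugo", "Müller")),     # slot number assumed
        2: ("forward", (77, 3)),                   # tuple moved away, forward TID left behind
        3: ("record", (77, "Frank", "Meier")),     # slot number assumed
    },
    77: {3: ("record", (12, "Simon", "Schmidt"))},
}


def fetch(tid):
    """Return (record, number of page accesses) for a TID = (page, slot)."""
    page, slot = tid
    kind, value = pages[page][slot]
    accesses = 1
    if kind == "forward":                # follow the single forward TID
        page, slot = value
        kind, value = pages[page][slot]
        accesses += 1
    return value, accesses


def migrate(home_tid, new_tid):
    """Move a tuple to new_tid and update the forward TID in its home slot."""
    home_page, home_slot = home_tid
    kind, value = pages[home_page][home_slot]
    if kind == "forward":                # migrated before: remove the old copy
        old_page, old_slot = value
        entry = pages[old_page].pop(old_slot)
    else:
        entry = (kind, value)
    new_page, new_slot = new_tid
    pages.setdefault(new_page, {})[new_slot] = entry
    pages[home_page][home_slot] = ("forward", new_tid)
```

Since `migrate` always rewrites the forward TID in the home slot, a lookup never follows more than one indirection, no matter how often the tuple moves.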

Migration of Tuples: Mapping Table
[Figure, before: the mapping table maps the logical address 11 to the physical address (42, 2); page 42 holds (42, Hugo, Müller), (12, Simon, Schmidt) and (77, Frank, Meier).]
[Figure, during: (12, Simon, Schmidt) is now stored on page 77; the mapping table entry for 11 still reads (42, 2).]

Migration of Tuples: Mapping Table
[Figure, after: the mapping table entry for 11 has been changed to (77, 3); page 42 holds (42, Hugo, Müller) and (77, Frank, Meier), page 77 holds (12, Simon, Schmidt).]

Mapping Table
- Drawbacks of a mapping table:
    - 2 block accesses (1 mapping table block + 1 data block)
- Advantage:
    - no space wasted for forward TIDs

Mapping Table (optimized, aka PPP)
- Drawbacks of a mapping table:
    - 2 block accesses (1 mapping table block + 1 data block)
- Optimization:
    - Access to mapping table can be avoided if frequently accessed entries
      are kept in a separate cache in main memory

Mapping Table (optimized, aka PPP)
Algorithm to find a record:
    IF record address contained in cache:
        no I/O access on mapping table
        address := cache address
        IF record was not found at address:
            I/O access on mapping table
    ELSE:
        I/O access on mapping table
[Figure: the optimized method consults the cache, which still maps 11 to (42, 2); the old method consults the mapping table, which maps 11 to (77, 3).]

Further Optimizations
- Observation: Mapping table corresponds to a big index page!
- So why not store the mapping
      logical record address -> physical record address
  in an index structure in the first place, e.g., a B+-tree?
- Discussion
    - Advantages:
        - implicit ordering of entries guaranteed
        - no additional memory management required
    - Disadvantages:
        - expensive access using a multi-level tree structure (for each record access)
        - memory usage

How to Store Values: Data and Metadata
- Separation of data and metadata
    - Metadata: data in DB catalogue
        - attribute name
        - type
    - Data: data on each page/block
        - actual values
- Note:
    - In XML data and metadata are stored together:
          <record>
              <firstname> hugo </firstname>
              <lastname> müller </lastname>
          </record>
      --> Thus we should not do this!

Record Layout
- fixed-sized part:
    - stores all values having a type of fixed size
    - e.g. numeric(10,2), date, char[42]
    - Advantage: direct addressing
          address = sizeof(type) * pos
      (this holds if all fixed-size attributes have the same size; in general the address is the sum of the sizes of the preceding attributes)
- variable-sized part:
    - e.g. varchars
    - store size and pointer in fixed-size part
    - store actual values in variable-sized part
    - Disadvantage: indirect addressing
          address = pointer
- Important: even if variable-sized types are used for some attributes of a record, direct addressing of the fixed-sized part is still possible!

Record Layout
- NULL-values
    - small bitmap of fixed size at the beginning of each record
    - "1" if attribute is set to NULL, else "0"
    - Advantage: simple and efficient
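The record layout can be illustrated with Python's `struct` module. The schema (an integer key plus two varchars), the field widths and the function names `encode` and `decode` are assumptions chosen for this sketch, not the layout of any particular system.

```python
import struct

# Assumed schema: key INT (fixed size), fname VARCHAR, lname VARCHAR.
# Fixed-size part: 1-byte NULL bitmap, 4-byte key, then (pointer, size) per varchar.
FIXED = struct.Struct("<BiHHHH")


def encode(key, fname, lname):
    values = (key, fname, lname)
    bitmap = sum(1 << i for i, v in enumerate(values) if v is None)  # bit "1" = NULL
    var_part = b""
    pointers = []
    for text in (fname, lname):
        raw = b"" if text is None else text.encode("utf-8")
        pointers += [FIXED.size + len(var_part), len(raw)]   # pointer and size
        var_part += raw
    return FIXED.pack(bitmap, key or 0, *pointers) + var_part


def decode(record):
    bitmap, key, *pointers = FIXED.unpack_from(record)
    values = [key]
    for i in range(0, len(pointers), 2):
        pointer, size = pointers[i], pointers[i + 1]
        values.append(record[pointer:pointer + size].decode("utf-8"))
    # attributes whose bitmap bit is set are reported as NULL (None)
    return tuple(None if (bitmap >> i) & 1 else v for i, v in enumerate(values))
```

The key always sits at byte offset 1 of the record regardless of how long the names are, which is the direct addressing of the fixed-sized part; the varchars are reached through the stored pointer.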

Record Layout
[Figure: a page with head next to the example table:]
    Key  fname  lname
    77   Frank  Meier
    12   Simon  Schmidt
    42   Hugo   Müller
    11   Hans   Meier
    25   Jens   Dittrich
    76   Hugo   Schmidt

n-ary Storage Model (NSM)
- record values are assigned row-wise to page
- all attribute values of a record are adjacent on the page
[Figure: row-wise assignment of record values to page; the records of the example table are stored one after another on a page with head.]

Decomposition Storage Model (DSM)
- split table into set of two-column tables
- alternatively: split table into one-column table storing the RID implicitly (array-like representation)

Original table with record IDs (RID):
    RID  Key  fname  lname
    1    77   Frank  Meier
    2    12   Simon  Schmidt
    3    42   Hugo   Müller
    4    11   Hans   Meier
    5    25   Jens   Dittrich
    6    76   Hugo   Schmidt

Decomposed into two-column tables:
    RID  Key      RID  fname      RID  lname
    1    77       1    Frank      1    Meier
    2    12       2    Simon      2    Schmidt
    3    42       3    Hugo       3    Müller
    4    11       4    Hans       4    Meier
    5    25       5    Jens       5    Dittrich
    6    76       6    Hugo       6    Schmidt

[Figure: three pages, each with its own head, storing the Key, fname and lname values respectively.]

Decomposition Storage Model (DSM)
- optimized for accessing few attributes
- Advantage: very efficient when only few attributes need to be accessed
- Disadvantage: inefficient when many attributes are accessed
- Disadvantage: rows distributed to many pages
- Literature
    - Don S. Batory: On Searching Transposed Files. ACM Trans. Database Syst. 1979.
    - George P. Copeland, Setrag Khoshafian: A Decomposition Storage Model. SIGMOD 1985.

Partition Attributes Across (PAX)
- Idea: colocate values of the same attribute inside a page
[Figure: the example table (Key, fname, lname) stored on a single page with head and three subpages: subpage 1 holds the lname values, subpage 2 the fname values, subpage 3 the Key values.]
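The three storage models differ only in how the same values are grouped. The following sketch lays out the example table in each model; `ROWS_PER_PAGE` and the function names are assumptions, and DSM is simplified to one array per attribute (the array-like variant with implicit RIDs).

```python
rows = [
    (77, "Frank", "Meier"), (12, "Simon", "Schmidt"), (42, "Hugo", "Müller"),
    (11, "Hans", "Meier"), (25, "Jens", "Dittrich"), (76, "Hugo", "Schmidt"),
]
ROWS_PER_PAGE = 3   # assumed page capacity


def chunks(seq, n):
    return [seq[i:i + n] for i in range(0, len(seq), n)]


def nsm(rows):
    """Row-wise: every page holds complete records."""
    return chunks(rows, ROWS_PER_PAGE)


def dsm(rows):
    """One array per attribute; the RID is implied by the array position."""
    return [list(column) for column in zip(*rows)]


def pax(rows):
    """Same records per page as NSM, but grouped by attribute inside each page."""
    return [[list(column) for column in zip(*page)] for page in nsm(rows)]
```

With PAX, `pax(rows)[0]` contains the same three records as `nsm(rows)[0]`, only regrouped into one subpage per attribute, which is why record reconstruction never has to leave the page.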

Partition Attributes Across (PAX)
- Advantages
    - improves locality for single attributes
    - data values are reorganized inside a page only:
      no change to the outside system (if appropriate information hiding was used...)
    - record reconstruction cheap
    - 15% to 2x performance improvements when compared with NSM
- Disadvantages
    - not the best possible solution for decision support (DSS): DSM wins...
- Literature: Anastassia Ailamaki, David J. DeWitt, Mark D. Hill, Marios Skounakis: Weaving Relations for Cache Performance. VLDB 2001.

Compression
- orthogonal to the record layout used
- single attribute vs. entire tuples (vs. entire page vs. entire extent vs. ...)
- very efficient for DSM, why?
- Literature (selection):
    - Meikel Pöss, Dmitry Potapov: Data Compression in Oracle. VLDB 2003.
    - Balakrishna R. Iyer, David Wilhite: Data Compression Support in Databases. VLDB 1994.
    - Till Westmann, Donald Kossmann, Sven Helmer, Guido Moerkotte: The Implementation and Performance of Compressed Databases. SIGMOD Record 29(3) (2000).
    - Ian H. Witten, Alistair Moffat, Timothy C. Bell: Managing Gigabytes: Compressing and Indexing Documents and Images. 2nd edition. Morgan Kaufmann Publishing. 1999.

How to Map Long Records
- Problem:
    - What if a record is larger than a page?
    - Example: Blobs (Binary Large Objects)
- 1. Solution
    - split record into pieces of size page_size
    - create index (byte offset -> block) for pieces using a hash-table or a B+-tree
- 2. Solution
    - store record in separate storage space (e.g. file system of the OS)
    - store only a link to the separate storage on the DB-page

Memory Management: Append Only
- Task: Find a free slot for a newly inserted record
- Naive implementation (append only):
    - only consider the last created page
    - if record fits into this page: OK
    - else: create a new page
- Discussion
    - very efficient insert
    - poor memory usage (deletes?)

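The first solution for long records (split into page-sized pieces plus an index from byte offset to block) might look like this in Python. A plain dictionary stands in for the hash-table; `PAGE_SIZE`, `blocks`, `allocate_block`, `store_long_record` and `read_bytes` are hypothetical names for this sketch.

```python
PAGE_SIZE = 8          # tiny page size so the example is easy to follow (assumed)
blocks = []            # simulated block storage; the block id is the list position


def allocate_block(piece):
    blocks.append(piece)
    return len(blocks) - 1


def store_long_record(blob):
    """Split blob into pieces of PAGE_SIZE; return the index byte offset -> block id."""
    index = {}
    for offset in range(0, len(blob), PAGE_SIZE):
        index[offset] = allocate_block(blob[offset:offset + PAGE_SIZE])
    return index


def read_bytes(index, start, length):
    """Read blob[start:start + length] by visiting only the pieces that are needed."""
    out = bytearray()
    pos, end = start, start + length
    while pos < end:
        piece_offset = (pos // PAGE_SIZE) * PAGE_SIZE
        if piece_offset not in index:
            break                                   # request starts past the end
        piece = blocks[index[piece_offset]]
        lo = pos - piece_offset
        hi = min(len(piece), end - piece_offset)
        if lo >= hi:
            break                                   # reached the end of the blob
        out += piece[lo:hi]
        pos = piece_offset + hi
    return bytes(out)
```

Reading a small byte range touches only the one or two blocks that contain it instead of the whole long record.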
Append Only(n)
- Generalization of append only:
    - only consider the n last created pages
    - if record fits into one of these pages: OK
    - else: create a new page
- Discussion
    - very fast insert
    - still: poor memory usage (deletes?)

Best Fit, First Fit, Next Fit
- Best Fit:
    - start search from the beginning of the page list
    - find optimal page
    - Cost = linear search in list
- First Fit:
    - start search from the beginning of the page list
    - take the first page that fits
    - disadvantage: pages at the beginning of the list will soon be full (waste of time to search through these pages)
- Next Fit:
    - like first fit, but: start search from the last position
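The placement strategies can be compared on a simple list `free`, where `free[p]` is the number of free bytes on page `p`. This is a sketch under assumptions: "optimal page" is interpreted as the page with the least leftover space, and the functions only choose a page without updating `free` (except that `append_only` appends a new empty page).

```python
def first_fit(free, size):
    """Scan from the beginning and take the first page that fits."""
    for page, avail in enumerate(free):
        if avail >= size:
            return page
    return None


def best_fit(free, size):
    """Scan all pages and take the one with the least free space that still fits."""
    candidates = [(avail, page) for page, avail in enumerate(free) if avail >= size]
    return min(candidates)[1] if candidates else None


def next_fit(free, size, last):
    """Like first fit, but start at the position of the previous search."""
    n = len(free)
    for step in range(n):
        page = (last + step) % n
        if free[page] >= size:
            return page
    return None


def append_only(free, size, n=1, page_size=100):
    """Consider only the n last created pages; otherwise create a new page."""
    for page in range(max(0, len(free) - n), len(free)):
        if free[page] >= size:
            return page
    free.append(page_size)          # page_size is an assumed value
    return len(free) - 1
```

For `free = [10, 50, 30, 80]` and a 25-byte record, first fit picks page 1 while best fit picks page 2, which leaves the larger gap on page 1 available for bigger records.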

Hybrid Approaches HY(n,u)
- Algorithm:
      If memory usage better than u:
          use append only(n)
      Else:
          use Next Fit
- Discussion: acceptable compromise

Free Memory Table
- Idea: for each page: store available memory
- Implementation: extra table in segment either
  1. providing accurate memory information

         page          1   2   3   4   5    (2 byte/entry)
         bytes avail.  f1  f2  f3  f4  f5   (2 byte/entry)

  or 2. providing approximate memory information

         page          1   2   3   4   5    (2 byte/entry)
         bytes avail.  f1  f2  f3  f4  f5   (k bits/entry)

         memory available >= (f_i / 2^k) * page_size
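For the approximate variant, the entry f_i can be obtained by rounding the free space down to one of 2^k levels, which is what makes the bound above hold. The values of `PAGE_SIZE` and `K` and the function names are assumptions for this sketch.

```python
PAGE_SIZE = 4096   # assumed page size in bytes
K = 4              # assumed number of bits per entry


def encode_free(bytes_free):
    """Round the free space down to one of 2**K levels (fits into K bits)."""
    return min(bytes_free * 2**K // PAGE_SIZE, 2**K - 1)


def guaranteed_free(f):
    """Lower bound promised by an entry f: (f / 2^k) * page_size."""
    return f * PAGE_SIZE // 2**K
```

A page with 1000 free bytes is stored as the 4-bit value 3, which guarantees at least 768 free bytes; the price of the smaller table is that up to one level of free space per page goes unnoticed.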

Space Map
- Drawback of free memory table: linear search for suitable page
- Solution "space map":
    - inversion of the free memory table

          free memory table                        space map
          page          1   2   3   4   5          bytes avail.  f4  f2   f3  f5
          bytes avail.  f1  f2  f3  f4  f5         page          4   1,2  3   5

    - index on "bytes avail." attribute
    - binary search
    - works for accurate and approximate variant

Space Map
- Trade-off: granularity vs. memory needed
- Trade-off: granularity vs. performance
- Advantage: efficient search for best fit: O(log n)
- Disadvantage: maintenance cost for space map

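A space map can be sketched as a list of (bytes available, page) pairs kept in sorted order and searched with Python's `bisect` module. The class name `SpaceMap`, its methods and the sample free-space values are assumptions; page numbers are assumed to be non-negative.

```python
import bisect


class SpaceMap:
    """Inverted free memory table: (bytes available, page) pairs in sorted order."""

    def __init__(self, free_by_page):
        self.entries = sorted((avail, page) for page, avail in free_by_page.items())

    def best_fit(self, size):
        """Binary search for the page with the least free space >= size."""
        i = bisect.bisect_left(self.entries, (size, -1))
        return self.entries[i][1] if i < len(self.entries) else None

    def update(self, page, old_avail, new_avail):
        """Maintenance step: re-sort one page after its free space has changed."""
        i = bisect.bisect_left(self.entries, (old_avail, page))
        del self.entries[i]
        bisect.insort(self.entries, (new_avail, page))
```

The search itself needs O(log n) comparisons, while `update` shows where the maintenance cost comes from: every insert or delete on a page changes its position in the map.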
Indexing
1. One-dimensional Index Structures

Motivation
- Example queries:
    - What is the address of the student having ID 424342?
    - Which students attend fewer than 2 lectures in this semester?
    - Which students live in canton Zurich?
    - Which students do not live in canton Zurich?
- How does the DBMS compute the results to these queries?
    - inspect all records (sequential access pattern)
    - organize data cleverly such that records may be found fast (index-based access patterns)

What does "indexing" mean?
- Mapping:
      key -> set of records
- key does not have to correspond to the primary key of a relation
- index structures provide an efficient implementation of this mapping

Access Paths
- Very often the same record can be accessed via different access paths
- access path = possible way to retrieve a record
- access paths have huge impact on the efficient processing of a query result

Primary vs. Secondary Access Path
- Two classes of access paths:
  1. Access paths for primary key
     Example:
         SELECT *
         FROM employees
         WHERE empID = 42
  2. Access paths for secondary key
     Example:
         SELECT *
         FROM employees
         WHERE canton = 'ZH'

Secondary Access Path and Inversion
- Queries using a secondary access path may return more than one result
  (1:n relationship among keys and results)
[Figure: an inversion on fname with the entries Frank, Hans, Hugo, Jens and Simon, each referring to the matching TIDs of the table below.]
    TID  Key  fname  lname
    1,0  77   Frank  Meier
    1,1  12   Simon  Schmidt
    1,2  42   Hugo   Müller
    1,3  11   Hans   Meier
    1,4  25   Jens   Dittrich
    1,5  76   Hugo   Schmidt

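An inversion on fname for the example table can be built with a dictionary that maps each key to the list of matching TIDs. The names `table`, `build_inversion` and `fname_index` are chosen for this sketch only.

```python
from collections import defaultdict

# TID = (page, slot) -> record (Key, fname, lname), as in the example table
table = {
    (1, 0): (77, "Frank", "Meier"),
    (1, 1): (12, "Simon", "Schmidt"),
    (1, 2): (42, "Hugo", "Müller"),
    (1, 3): (11, "Hans", "Meier"),
    (1, 4): (25, "Jens", "Dittrich"),
    (1, 5): (76, "Hugo", "Schmidt"),
}


def build_inversion(table, attribute_pos):
    """Map every value of one attribute to the TIDs of the records that contain it."""
    inversion = defaultdict(list)
    for tid, record in sorted(table.items()):
        inversion[record[attribute_pos]].append(tid)
    return dict(inversion)


fname_index = build_inversion(table, 1)
```

Looking up `fname_index["Hugo"]` yields two TIDs, which is the 1:n relationship between a secondary key and its results.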
One-dimensional Access Paths
[Figure: taxonomy of one-dimensional access paths]
- sequential structures (key comparison, continuous)
    - chains (logical)
    - lists (physical)
- tree structures (key comparison, tree-structured)
    - multi-way trees
        - B-trees
        - B+-trees
- hashed structures (key transformation)
    - static hash methods (constant)
    - dynamic hashing methods (dynamic)

Sequential Access Paths: Chains
- list of records ignoring block order on disk
- records are not physically clustered on blocks/pages
- blocks are not physically clustered on disk/memory
[Figure: blocks]
- Discussion
    - poor I/O-performance
    - Worst case: 1 random access for each record
    - not used in DBMSs

Sequential Access Paths: Lists
- list of records grouped into blocks
- records are physically clustered on blocks/pages
- blocks are not physically clustered on disk/memory
[Figure: blocks]
- Discussion
    - better I/O-performance
    - Worst case: 1 random access for each page
    - used in DBMSs

Sequential Access Paths: Sequence
- Like lists, but in addition: blocks contiguous on disk/memory
- Records and blocks are physically clustered
[Figure: blocks]
- Discussion
    - optimal I/O-performance
    - Worst case: 1 random access + sequential access
    - very important for DBMSs (especially read-mostly environments)
    - Drawback: hard to maintain in the presence of inserts and updates

Tree Structures
- binary trees
    - not suitable for DBMSs
    - difficult to map nodes to pages
- digital trees
    - only for special applications
    - important for non-relational data (e.g., spatial data)
- B-trees
    - most important index structure for DBMSs
    - advantage: very versatile, easy to extend
    - invented 30 years ago, still being improved
    - several index structures exist that are based on similar ideas (e.g. R-tree, M-tree)

B-trees
- Basics: see also lectures "Introduction to Databases" and "IS-K"
- Agenda (following slides):
    - basics (repetition)
    - ISAM
    - Bulk-loading
    - prefix B+-trees
    - prefix/suffix-compression
    - cache-conscious B+-trees
    - primary vs. secondary B+-trees
    - clustered B+-trees
    - secondary index and search engines (e.g. Google)
    - ...

Next Week: Indexing
1. One-dimensional Index Structures (Part 2)
