Search Tree
• A search tree is a special type of tree that is used to guide the search for a record, given the search key.
• A multilevel index (MLI) is a variation of the search tree.
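[Figure: a search tree node <P1, K1, ..., Ki-1, Pi, Ki, ..., Kq-1, Pq>. Every value X in the subtree pointed to by Pi satisfies X < Ki for i = 1, Ki-1 < X < Ki for 1 < i < q, and Ki-1 < X for i = q.]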
Search Tree
• Each key value in the tree is associated with a pointer to the record in the data file having that value.
• The pointer could be to the disk block containing the record.
• The search tree itself can be stored on disk by assigning each tree node to a disk block.
Search Tree
• Constraints:
• Search keys within a node are ordered (increasing from left to right): within each node, K1 < K2 < ... < Kq-1.
• For all values X in the subtree pointed to by Pi, we have Ki-1 < X < Ki for 1 < i < q, X < Ki for i = 1, and Ki-1 < X for i = q.
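The ordering constraints above determine which pointer to follow at each node. A minimal Python sketch, assuming a node's keys are held in a sorted list (the function name `child_index` and the 0-based indexing are illustrative choices, not part of the slides):

```python
from bisect import bisect_left

def child_index(keys, x):
    """Return the 0-based index of the pointer to follow for value x,
    or None if x equals one of the node's keys (the record is found here)."""
    i = bisect_left(keys, x)              # number of keys strictly less than x
    if i < len(keys) and keys[i] == x:
        return None                       # x is stored in this node
    return i                              # corresponds to P(i+1) in the slide's 1-based notation

keys = [10, 20, 30]                       # K1 < K2 < K3, so the node has q = 4 pointers
print(child_index(keys, 5))               # 0: X < K1
print(child_index(keys, 25))              # 2: K2 < X < K3
print(child_index(keys, 35))              # 3: K3 < X
print(child_index(keys, 20))              # None: key is in this node
```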
Search Tree
• Algorithms for inserts and deletes do not guarantee that a search tree is balanced.
• Keeping a search tree balanced helps: it yields a uniform search speed regardless of the value of the search key.
• Deletions may lead to nearly empty nodes, thus wasting space and increasing the number of levels.
B-Tree
• A B-tree has additional constraints that ensure that the tree is always balanced and that the space wasted by deletion is never excessive.
• Algorithms for inserts and deletes are more complex in order to maintain these additional constraints.
• They are mostly simple, and become complicated only when inserts and deletes lead to splitting and merging of nodes, respectively.
B-Tree: Characteristics
• Automatically maintains as many levels of index as is appropriate for the size of the file being indexed.
• Manages space on the blocks it uses so that every block is between half full and completely full.
• Each node corresponds to a disk block.
Structure of B-Trees
• Balanced tree: all paths from the root to a leaf have the same length.
• Three layers in a B-tree: root, intermediate layer, leaves.
• A parameter p is associated with each B-tree.
• Each node has room for p search keys and p+1 pointers.
• Pick p to be as large as will allow p+1 pointers and p keys to fit in one block.
Example
• Block size = 4096 bytes.
• Search key: 4-byte integer.
• Pointer: 8 bytes.
• Assume no header information is kept in the block.
• We choose the largest p such that 4p + 8(p + 1) ≤ 4096, which gives p = 340.
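The sizing rule in this example (p keys and p+1 pointers must fit in one block, with no header) can be checked numerically. A short Python sketch; the function name `max_order` is an illustrative assumption:

```python
def max_order(block_size, key_size, pointer_size):
    """Largest p with p*key_size + (p + 1)*pointer_size <= block_size."""
    return (block_size - pointer_size) // (key_size + pointer_size)

p = max_order(block_size=4096, key_size=4, pointer_size=8)
print(p)                                  # 340
print(p * 4 + (p + 1) * 8)                # 4088 bytes used, which fits in 4096
print((p + 1) * 4 + (p + 2) * 8)          # 4100 bytes, so p + 1 would not fit
```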
B-tree structures. (a) A node in a B-tree with q - 1 search values. (b) A B-tree of order p = 3. The values were inserted in the order 8, 5, 1, 7, 3, 12, 9, 6.
4. Each node has at most p tree pointers.
5. Each node, except the root and leaf nodes, has at least ⌈p/2⌉ tree pointers. The root node has at least two tree pointers unless it is the only node in the tree.
6. A node with q tree pointers, q ≤ p, has q - 1 search key field values.
7. All leaf nodes are at the same level. Leaf nodes have the same structure as internal nodes except that all of their tree pointers Pi are null.
• An insertion into a node that is not full is quite efficient; if a node is full, the insertion causes a split into two nodes.
• Splitting may propagate to other tree levels.
• A deletion is quite efficient if a node does not become less than half full.
B+-Tree Structure
• A B+-tree is a balanced tree in which every path from the root of the tree to a leaf of the tree is the same length.
• Each nonleaf node in the tree has between ⌈n/2⌉ and n children, where n is fixed.
• B+-trees are good for searches, but incur some overhead in wasted space.
• A B+-tree is a variation of the B-tree data structure.
• Data pointers are stored only at leaf nodes.
• The leaf nodes of a B+-tree are linked together to provide ordered access on the search field of the records.
FIGURE 14.11
The nodes of a B+-tree. (a) Internal node of a B+-tree with q - 1 search values. (b) Leaf node of a B+-tree with q - 1 search values and q - 1 data pointers.
1. Each internal node is of the form <P1, K1, P2, K2, ..., Pq-1, Kq-1, Pq> where q ≤ p and each Pi is a tree pointer.
2. Within each internal node, K1 < K2 < ... < Kq-1.
3. For all search key field values X in the subtree pointed at by Pi, we have Ki-1 < X ≤ Ki for 1 < i < q, X ≤ Ki for i = 1, and Ki-1 < X for i = q.
4. Each internal node has at most p tree pointers.
5. Each internal node, except the root, has at least ⌈p/2⌉ tree pointers. The root node has at least two tree pointers if it is an internal node.
6. An internal node with q tree pointers, q ≤ p, has q - 1 search key field values.
1. Each leaf node is of the form <<K1, Pr1>, <K2, Pr2>, ..., <Kq-1, Prq-1>, Pnext> where q ≤ p and each Pri is a data pointer.
2. Within each leaf node, K1 < K2 < ... < Kq-1, q ≤ p.
3. Each Pri is a data pointer: a pointer to the record whose search key field value is equal to Ki, or to a file block containing the record.
4. Each leaf node has at least ⌈p/2⌉ values.
5. All leaf nodes are at the same level.
• The pointers in the internal nodes are tree pointers to blocks that are tree nodes.
• Pointers in leaf nodes are data pointers to the data file records or blocks, except for the Pnext pointer.
• The Pnext pointer is a tree pointer to the next leaf node.
• We can traverse leaf nodes as a linked list using the Pnext pointers.
• We can also include Pprevious pointers.
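Traversing the leaf level through the Pnext pointers can be sketched as follows. This is a minimal Python illustration, assuming a hypothetical `Leaf` class that stores only keys and a `p_next` reference; data pointers are omitted for brevity:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Leaf:
    keys: list                            # search key values K1 < K2 < ...
    p_next: Optional["Leaf"] = None       # Pnext: the next leaf, or None for the last one

def scan_leaves(first):
    """Yield every key in order by following the Pnext chain."""
    leaf = first
    while leaf is not None:
        yield from leaf.keys
        leaf = leaf.p_next

# Build three linked leaves from right to left.
c = Leaf([14, 16])
b = Leaf([5, 7, 8], c)
a = Leaf([2, 3], b)
print(list(scan_leaves(a)))               # [2, 3, 5, 7, 8, 14, 16]
```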
B+-Tree Updates
• Insertion: if the new record has a search key that already exists in a leaf node, the record is added to the file and a pointer to it is added to that key's bucket of pointers. If the search key is different from all others, it is inserted in order.
• Deletion: the search key value is removed from the node.
• When several records share a search key value, the Pr pointers need to be block pointers that point to blocks containing a set of record pointers to the actual records in the data file.
[Figure: example B+-tree with index keys 20, 33, 51, 63 and leaf data entries 10*, 15*, 20*, 27*, 33*, 37*, 40*, 46*, 51*, 55*, 63*, 97*.]
• Examples: search for 5*, 16*, all data entries >= 24* ...
• The last one is a range search: locate the first leaf that can contain a qualifying entry, then scan the leaves sequentially from there.
[Figure: B+-tree with root keys 13, 17, 24, 30; leaves shown: 2*, 3*, 5*, 7* and 14*, 15*.]
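The example searches can be sketched in Python on a two-level tree with root keys 13, 17, 24, 30. The leaf contents below are illustrative assumptions, since the figure is only partly preserved, and the sketch assumes the convention used in the insertion examples, where an entry equal to an index key lives in the subtree to the right of that key. A list of leaves stands in for the Pnext chain.

```python
from bisect import bisect_right

root_keys = [13, 17, 24, 30]
# Hypothetical leaf contents, one list per root pointer, left to right.
leaves = [[2, 3, 5, 7], [14, 16], [19, 20, 22], [24, 27, 29], [33, 34, 38, 39]]

def find_leaf(key):
    """Index of the leaf that may hold key (entries >= an index key go right)."""
    return bisect_right(root_keys, key)

def search(key):
    """Equality search: descend from the root, then look inside one leaf."""
    return key in leaves[find_leaf(key)]

def range_from(low):
    """Range search: start at the leaf for `low`, then scan the following leaves."""
    start = find_leaf(low)
    return [k for leaf in leaves[start:] for k in leaf if k >= low]

print(search(5))                          # True
print(search(16))                         # True
print(search(15))                         # False
print(range_from(24))                     # [24, 27, 29, 33, 34, 38, 39]
```

Only the leaves from the starting leaf onward are read, which is why the leaf chain makes range searches cheap.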
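[Figure: inserting 8*. The leaf 2*, 3*, 5*, 7* would become 2*, 3*, 5*, 7*, 8*, which overflows; the neighboring leaf holds 14*, 15*, 16*. Parent keys: 13, 17, 24, 30.]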
One new child (a leaf node) is generated, so one more pointer, and thus one more key value, must be added to its parent.
Inserting 8* (cont.)
• Copy up the middle key: 5 is the entry to be inserted in the parent node, which holds 13, 17, 24, 30. (Note that 5 is copied up and continues to appear in the leaf.)
[Figure: the new leaf holds 5*, 7*, 8*.]
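[Figure: the parent node 13, 17, 24, 30 overflows when 5 is inserted.]
We split this node, redistribute entries evenly, and push up the middle key.
Entry to be inserted in the parent node: 17. (Note that 17 is pushed up and only appears once in the index. Contrast this with a leaf split.)
[Figure: tree after the split. 17 is the new root key; the index nodes below it hold the remaining keys, including 13, 24, and 30. Leaves shown: 2*, 3* | 5*, 7*, 8* | 14*, 15*.]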
• This can happen recursively.
• To split an index node, redistribute entries evenly, but push up the middle key. (Contrast with leaf splits.)
• Splits grow the tree; a root split increases its height.
• Tree growth: the tree gets wider or one level taller at the top.
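The contrast between a leaf split (copy up) and an index split (push up) can be made concrete with a small Python sketch. The function names are illustrative, and nodes are modeled as plain sorted lists of keys:

```python
def split_leaf(entries):
    """Split an overfull leaf; the first key of the right half is COPIED up,
    so it still appears in the leaf level."""
    mid = len(entries) // 2
    left, right = entries[:mid], entries[mid:]
    return left, right, right[0]

def split_index(keys):
    """Split an overfull index node; the middle key is PUSHED up,
    so it appears in neither half."""
    mid = len(keys) // 2
    return keys[:mid], keys[mid + 1:], keys[mid]

# Leaf split from the example: inserting 8* overflows the leaf.
print(split_leaf([2, 3, 5, 7, 8]))        # ([2, 3], [5, 7, 8], 5)

# Index split from the example: adding 5 overflows the parent.
print(split_index([5, 13, 17, 24, 30]))   # ([5, 13], [24, 30], 17)
```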
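[Figure: deletion with redistribution. A leaf left with only 22* takes 24* from its sibling, giving the leaves 22*, 24* and 27*, 29*; in the parent, the index key 24 is replaced by 27. Other leaves shown: 2*, 3* | 5*, 7*, 8* | 14*, 16*.]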
• Notice how 27 is copied up. But can we move it up?
• Now we want to delete 24*.
• Underflow again! But can we redistribute this time?
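Whether redistribution is possible depends on how many entries the two siblings hold together. A minimal Python sketch, assuming a minimum of 2 entries per leaf (an assumption matching the example figures) and a hypothetical helper name:

```python
def redistribute_from_right(leaf, right, min_entries):
    """Try to refill an underfull leaf from its right sibling.
    Returns the new separator key to copy up, or None if the two leaves
    together hold too few entries and must be merged instead."""
    if len(leaf) + len(right) < 2 * min_entries:
        return None
    while len(leaf) < min_entries:
        leaf.append(right.pop(0))         # borrow the sibling's smallest entry
    return right[0]                       # copied up as the new separator

# First deletion: the sibling has a spare entry, so 24* moves over and 27 is copied up.
leaf, right = [22], [24, 27, 29]
print(redistribute_from_right(leaf, right, 2), leaf, right)   # 27 [22, 24] [27, 29]

# After deleting 24*: the sibling is at the minimum, so redistribution fails.
leaf, right = [22], [27, 29]
print(redistribute_from_right(leaf, right, 2))                # None
```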
Deleting 24*
• Observe that the two leaf nodes are merged, and 27 is discarded from their parent.
• Observe the "pull down" of the index entry (below).
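[Figure: after deleting 24*, the merged leaf 22*, 27*, 29* and the leaf 33*, 34*, 38*, 39* sit under the index key 30. Pulling down the index entry produces a new root whose keys include 13, 17, 30, over the leaves 2*, 3* | 5*, 7*, 8* | 14*, 16* | 22*, 27*, 29* | 33*, 34*, 38*, 39*.]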
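The leaf merge in this step can be sketched in Python. The helper name and the list-based node model are illustrative assumptions; the sketch only covers the leaf level and does not model the subsequent pull-down in the index.

```python
def merge_leaves(left, right, parent_keys, separator):
    """Merge two sibling leaves and drop their separator key from the parent.
    Returns the merged leaf and the parent's remaining keys."""
    merged = left + right
    remaining = [k for k in parent_keys if k != separator]
    return merged, remaining

# Deleting 24* leaves [22], which merges with [27, 29]; 27 is discarded from the parent.
print(merge_leaves([22], [27, 29], [27, 30], 27))   # ([22, 27, 29], [30])
```

The parent is left with a single key here, which is what triggers the pull-down of the index entry from the level above.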
Conclusion
• A search tree is a special type of tree that is used to guide the search for a record, given the search key.
• A B-tree is a balanced tree in which every path from the root of the tree to a leaf of the tree is the same length.
• A B+-tree is a variation of the B-tree.
• B-trees and B+-trees are dynamic: they adjust gracefully under inserts and deletes.