3. Maximum number of keys in the B-tree of order m, height d:
- Level analysis
Level 1: $m-1$ keys max
Level 2: $m(m-1)$ keys max
Level 3: $m^{2}(m-1)$ keys max
Level d: $m^{d-1}(m-1)$ keys max
Summation of all levels $= (m-1) + m(m-1) + m^{2}(m-1) + \dots + m^{d-1}(m-1)$
$= (m-1)\left(1 + m + m^{2} + \dots + m^{d-1}\right)$
so, max # of keys $= (m-1)\left(\frac{m^{d}-1}{m-1}\right) = m^{d} - 1$
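The level-by-level sum can be checked numerically against the closed form. The following is a minimal Python sketch; the function names are illustrative and not part of the original notes, and it assumes every node on every level is full.

```python
def max_keys_by_level(m: int, d: int) -> int:
    """Add up the maximum key count of each level for order m, height d."""
    total = 0
    for level in range(1, d + 1):
        nodes = m ** (level - 1)   # a completely full level has m^(level-1) nodes
        total += nodes * (m - 1)   # each full node holds m - 1 keys
    return total


def max_keys_closed_form(m: int, d: int) -> int:
    """Closed form of the same sum: m^d - 1."""
    return m ** d - 1


if __name__ == "__main__":
    print(max_keys_by_level(21, 3), max_keys_closed_form(21, 3))
```

Both functions should agree for any order and height, since the loop is just the geometric series written out term by term.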
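4. Minimum number of keys in the B-tree of order m, height d:
- Level analysis (the root holds at least 1 key and has at least 2 children; every other node holds at least $\lceil m/2 \rceil - 1$ keys and has at least $\lceil m/2 \rceil$ children)
Level 1: $1$ key min
Level 2: $2\left(\lceil m/2 \rceil - 1\right)$ keys min
Level 3: $2\lceil m/2 \rceil\left(\lceil m/2 \rceil - 1\right)$ keys min
Level d: $2\lceil m/2 \rceil^{d-2}\left(\lceil m/2 \rceil - 1\right)$ keys min
Summation of all levels $= 1 + 2\left(\lceil m/2 \rceil - 1\right)\left(1 + \lceil m/2 \rceil + \dots + \lceil m/2 \rceil^{d-2}\right) = 1 + 2\left(\lceil m/2 \rceil^{d-1} - 1\right) = 2\lceil m/2 \rceil^{d-1} - 1$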
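The minimum case can be checked the same way. This is a small Python sketch under the usual B-tree assumptions (the root has at least 1 key and 2 children, every other node has at least ceil(m/2) children and ceil(m/2) - 1 keys); the function names are illustrative only.

```python
def min_keys_by_level(m: int, d: int) -> int:
    """Add up the minimum key count of each level for order m, height d."""
    half = (m + 1) // 2                    # ceil(m/2) using integer arithmetic
    total = 1                              # level 1: the root holds at least 1 key
    for level in range(2, d + 1):
        nodes = 2 * half ** (level - 2)    # fewest nodes possible on this level
        total += nodes * (half - 1)        # each holds at least ceil(m/2) - 1 keys
    return total


def min_keys_closed_form(m: int, d: int) -> int:
    """Closed form of the same sum: 2 * ceil(m/2)^(d-1) - 1."""
    half = (m + 1) // 2
    return 2 * half ** (d - 1) - 1


if __name__ == "__main__":
    print(min_keys_by_level(21, 3), min_keys_closed_form(21, 3))
```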
Summary of the formulas for the capacities of a B-tree of order = m, height = d:
1. max # nodes $= \frac{m^{d} - 1}{m - 1}$
2. min # nodes $= 1 + 2\left(\frac{\lceil m/2 \rceil^{d-1} - 1}{\lceil m/2 \rceil - 1}\right)$
3. max # of keys $= m^{d} - 1$
4. min # keys $= 2\lceil m/2 \rceil^{d-1} - 1$
Ex: If the tree has order = 21, d = 3
then: 1. max # nodes $= \frac{21^{3} - 1}{21 - 1} = \frac{9261 - 1}{20} = 463$ nodes
2. min # nodes $= 1 + 2\left(\frac{\lceil m/2 \rceil^{d-1} - 1}{\lceil m/2 \rceil - 1}\right) = 1 + 2\left(\frac{11^{d-1} - 1}{11 - 1}\right) = 1 + 2\left(\frac{11^{2} - 1}{11 - 1}\right) = 1 + 2\left(\frac{120}{10}\right) = 25$
3. Max # of keys $= m^{d} - 1 = 21^{3} - 1 = 9260$
4. Min # of keys $= 2\lceil m/2 \rceil^{d-1} - 1$
$= 2(11^{2}) - 1$
$= 2(121) - 1$
$= 242 - 1$
$= 241$
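The four capacity formulas are easy to evaluate together. The sketch below is one possible Python rendering; the function name and the dictionary keys are illustrative, and it assumes m >= 3 so that ceil(m/2) - 1 is nonzero.

```python
def btree_capacities(m: int, d: int) -> dict:
    """Evaluate the four capacity formulas for a B-tree of order m, height d."""
    if m < 3 or d < 1:
        raise ValueError("this sketch assumes order m >= 3 and height d >= 1")
    half = (m + 1) // 2                                      # ceil(m/2)
    return {
        "max_nodes": (m ** d - 1) // (m - 1),                # geometric series in m
        "min_nodes": 1 + 2 * ((half ** (d - 1) - 1) // (half - 1)),
        "max_keys": m ** d - 1,
        "min_keys": 2 * half ** (d - 1) - 1,
    }


if __name__ == "__main__":
    for name, value in btree_capacities(21, 3).items():
        print(f"{name}: {value}")
```

Both divisions are exact (each numerator is a geometric series multiplied by its denominator), so integer division loses nothing here.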
Relationship between N, m, and d
Let N = # of records in file
= # of keys in B-tree index structure
From the formulas for the maximum number of keys and the minimum number of keys that can
be stored in the B-tree index structure having order = m and depth = d (= number of levels),
$N \ge 2\lceil m/2 \rceil^{d-1} - 1$
and
$N \le m^{d} - 1$
We now use these two inequalities to determine lower and upper bounds on one of these three
values, N, m, or d, given the other two. These three cases will be considered next:
1. Given m and d, find bounds on N
2. Given N and d, find bounds on m
3. Given N and m, find bounds on d
1. Given m and d, these inequalities determine the lower and upper bounds on N = the file size (#
of records in the data file) that can be stored in the file as
$2\lceil m/2 \rceil^{d-1} - 1 \le N \le m^{d} - 1$
where the left side is the lower bound on N and the right side is the upper bound on N.
For example, if m = 21 and d = 3, then $241 \le N \le 9260$.
2. Given N and d, we can find m by solving each inequality for m.
First, from
$N \le m^{d} - 1$
get
$N + 1 \le m^{d}$
and so
$\sqrt[d]{N+1} \le m$, which is a formula for a lower bound on the order m.
Also, from
$2\lceil m/2 \rceil^{d-1} - 1 \le N$
get
$\lceil m/2 \rceil \le \sqrt[d-1]{\frac{N+1}{2}}$
so $\frac{1}{2}m \le \lceil m/2 \rceil \le \sqrt[d-1]{\frac{N+1}{2}}$
and so
$m \le 2\sqrt[d-1]{\frac{N+1}{2}}$
which is a formula for an upper bound on the order m.
That is, the lower and upper bounds on m are
$\sqrt[d]{N+1} \le m \le 2\sqrt[d-1]{\frac{N+1}{2}}$
Ex: Given N = 9000 and d = 3,
then $21 \le m \le 134$, where 21 is the best case order and 134 is the worst case order (largest node size).
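The bounds on the order can also be found by direct search over integer values of m, which avoids floating-point roots. The sketch below is an illustration only: the function names are not from the notes, it assumes m >= 3 and d >= 2, and it tests the exact capacity formulas rather than the closed-form bounds, so its upper bound can occasionally be slightly tighter than $2\sqrt[d-1]{(N+1)/2}$.

```python
def max_keys(m: int, d: int) -> int:
    return m ** d - 1


def min_keys(m: int, d: int) -> int:
    half = (m + 1) // 2            # ceil(m/2)
    return 2 * half ** (d - 1) - 1


def order_bounds(n: int, d: int) -> tuple:
    """Smallest and largest order m >= 3 whose depth-d B-tree can hold n keys."""
    if n < 1 or d < 2:
        raise ValueError("this sketch assumes n >= 1 and d >= 2")
    lo = 3
    while max_keys(lo, d) < n:     # order too small: even a full tree cannot hold n keys
        lo += 1
    if min_keys(lo, d) > n:        # even the sparsest depth-d tree needs more than n keys
        raise ValueError("no order m >= 3 gives a depth-d tree with n keys")
    hi = lo
    while min_keys(hi + 1, d) <= n:
        hi += 1
    return lo, hi


if __name__ == "__main__":
    print(order_bounds(9000, 3))
```

For N = 9000 and d = 3 this search reproduces the range 21 to 134 given in the example.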
3. Given N and m, solve both inequalities for d. First, from $N \le m^{d} - 1$, get $N + 1 \le m^{d}$,
and now solve for d by taking $\log_{m}$ of both sides of the inequality, to get
$\log_{m}(N + 1) \le \log_{m}(m^{d}) = d$
so
$\log_{m}(N + 1) \le d$, which gives a lower bound for d.
Now from $2\lceil m/2 \rceil^{d-1} - 1 \le N$, get $\lceil m/2 \rceil^{d-1} \le (N+1)/2$, and solve this for d by taking
log base $\lceil m/2 \rceil$ of both sides of the inequality to get
$d \le 1 + \log_{\lceil m/2 \rceil}\left(\frac{N+1}{2}\right)$, which gives an upper bound on d.
In summary, the bounds on d are
$\log_{m}(N+1) \le d \le 1 + \log_{\lceil m/2 \rceil}\left(\frac{N+1}{2}\right)$
where the left side is the lower bound and the right side is the upper bound.
Ex. Given N = 9000 and m = 67
$\log_{67}(9001) \le d \le 1 + \log_{34}\left(\frac{9001}{2}\right)$
so $2.16 \le d \le 3.38$
and so $3 \le d \le 3$, that is, d must be 3.
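The depth bounds can be evaluated both as real-valued logarithms and as an exact integer range. The following Python sketch shows one way to do this; the function names are illustrative, and it assumes m >= 3 and N >= 1.

```python
import math


def depth_bounds_real(n: int, m: int) -> tuple:
    """Real-valued bounds: log_m(n+1) <= d <= 1 + log_{ceil(m/2)}((n+1)/2)."""
    half = (m + 1) // 2                         # ceil(m/2)
    lower = math.log(n + 1, m)
    upper = 1 + math.log((n + 1) / 2, half)
    return lower, upper


def depth_bounds_int(n: int, m: int) -> tuple:
    """Integer depths d for which min_keys(m, d) <= n <= max_keys(m, d)."""
    if n < 1 or m < 3:
        raise ValueError("this sketch assumes n >= 1 and m >= 3")
    half = (m + 1) // 2
    lo = 1
    while m ** lo - 1 < n:                      # a full tree of depth lo is still too small
        lo += 1
    hi = 1
    while 2 * half ** hi - 1 <= n:              # min keys at depth hi + 1 still fits in n
        hi += 1
    return lo, hi


if __name__ == "__main__":
    print(depth_bounds_real(9000, 67))
    print(depth_bounds_int(9000, 67))
```

The integer version uses only exact arithmetic, which sidesteps rounding concerns when a logarithm lands very close to a whole number.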
Space Analysis
Memory Analysis
Memory required for a B-tree structure includes space for one B-tree node at each level.
memory space = d * (memory for a B-tree node)
Since a node has m node pointers (RRNs of B-tree nodes), (m-1) keys, and (m-1) data record
addresses, then
memory space = d * [ m* RRNsize + (m-1)* Keysize + (m-1)* RRNsize ]
Ex: For d = 3 and m = 67, with RRNsize = 4 bytes and Keysize = 21 bytes, the memory for 3 B-tree nodes is
memory space = 3 (67*4 + 66*21 + 66*4)
= 3 ( 268 + 1386 + 264 ) = 5754 bytes
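The memory formula translates directly into code. In this Python sketch the function and parameter names are illustrative; the sizes 4 and 21 bytes are the values used in the worked example.

```python
def btree_memory_bytes(d: int, m: int, rrn_size: int, key_size: int) -> int:
    """Memory for one B-tree node per level, for a tree of order m, depth d."""
    node_bytes = (
        m * rrn_size              # m pointers to child nodes (RRNs)
        + (m - 1) * key_size      # m - 1 keys
        + (m - 1) * rrn_size      # m - 1 data record addresses
    )
    return d * node_bytes


if __name__ == "__main__":
    print(btree_memory_bytes(d=3, m=67, rrn_size=4, key_size=21))
```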
Disk space for B-tree indexfile
To compute the possible disk space requirements for the B-tree index file (of the B-tree index
nodes), use the other two formulas that involve the maximum and minimum number of nodes in
the B-tree structure of order = m and depth = d:
1. Max # of nodes $= \frac{m^{d} - 1}{m - 1}$
2. Min # of nodes $= 1 + 2\left(\frac{\lceil m/2 \rceil^{d-1} - 1}{\lceil m/2 \rceil - 1}\right)$
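These provide lower and upper bounds on the number of nodes in the B-tree of order = m,
depth = d:
$1 + 2\left(\frac{\lceil m/2 \rceil^{d-1} - 1}{\lceil m/2 \rceil - 1}\right) \le \#\text{nodes} \le \frac{m^{d} - 1}{m - 1}$
so the disk space for the index file lies between
$\left(1 + 2\left(\frac{\lceil m/2 \rceil^{d-1} - 1}{\lceil m/2 \rceil - 1}\right)\right)(\text{nodesize})$ and $\left(\frac{m^{d} - 1}{m - 1}\right)(\text{nodesize})$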
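Ex: If m = 67 and d = 3, then with node size = 1929 B
$\left(1 + 2\left(\frac{\lceil 67/2 \rceil^{2} - 1}{\lceil 67/2 \rceil - 1}\right)\right)(1929\ \text{B}) = \left(1 + 2\left(\frac{34^{2} - 1}{34 - 1}\right)\right)(1929\ \text{B}) = 71 \times 1929\ \text{B} = 136{,}959\ \text{B} \approx 137\ \text{KB}$ best case
and $\left(\frac{67^{3} - 1}{67 - 1}\right)(1929\ \text{B}) = 4557 \times 1929\ \text{B} = 8{,}790{,}453\ \text{B} \approx 8.8\ \text{MB}$ worst case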
This is the range that will be used for disk space requirements for the B-tree indexfile.
Again, the data file size requirements are not considered since the data records must be stored
regardless of the type of file structure being used.
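The disk space range follows from multiplying the node-count bounds by the node size. The sketch below is a minimal Python version; the function name is illustrative, and it assumes m >= 3 and a fixed node size in bytes.

```python
def index_disk_space_bytes(m: int, d: int, node_size: int) -> tuple:
    """Best-case and worst-case index file size for order m, depth d."""
    if m < 3 or d < 1:
        raise ValueError("this sketch assumes order m >= 3 and depth d >= 1")
    half = (m + 1) // 2                                          # ceil(m/2)
    min_nodes = 1 + 2 * ((half ** (d - 1) - 1) // (half - 1))    # sparsest tree
    max_nodes = (m ** d - 1) // (m - 1)                          # completely full tree
    return min_nodes * node_size, max_nodes * node_size


if __name__ == "__main__":
    best, worst = index_disk_space_bytes(m=67, d=3, node_size=1929)
    print(f"best case:  {best} bytes")
    print(f"worst case: {worst} bytes")
```

With m = 67 and d = 3 the sparsest tree has 71 nodes, which gives the 136,959-byte best case.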
Time Analysis
Expected time to retrieve a record from B-tree file with order = m, depth = d
E(T/retrieval) = E (time to read 1st node)
+ E (time to search node)
+ E (time to read 2nd node)
+ E (time to search node)
+ E (time to read 3rd node)
+ E (time to search node)
+ E (time to read data record)
= d * [ E (node read) + E (node search) ] + E (record read)
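Ex: For m = 67, d = 3,
max E(time/retrieval) = 3 [ (12 ms + 6 ms + node transfer time) + (node search time) ]
+ (12 ms + 6 ms + record transfer time)
where the node transfer term scales with 1929/512, the record transfer term scales with 200/512, and the node search term depends on the 67 - 1 keys in a node.
= ... ms maximum time for search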