Capacity of a B-tree with order = m, height = d

Four measures of capacity


1. The max # of nodes in the B-tree
2. The min # of nodes in the B-tree
3. The max # of keys in the B-tree
4. The min # of keys in the B-tree
1. Maximum number of nodes in the B-tree of order m, height d:
- Level analysis
Level 1: max of 1 node
Level 2: max of m nodes, with max of $m^2$ descendants
Level 3: max of $m^2$ nodes
. . .
Level d: max of $m^{d-1}$ nodes
Summation of levels = $1 + m + m^2 + \dots + m^{d-1} = \frac{m^d - 1}{m - 1}$
so, max nodes = $\frac{m^d - 1}{m - 1}$
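The level analysis for the maximum number of nodes can be checked numerically. The following is a minimal sketch, not part of the original notes; the function names `max_nodes_by_level` and `max_nodes` are illustrative. It compares the level-by-level sum with the closed form for a full tree of order m and height d.

```python
def max_nodes_by_level(m: int, d: int) -> list[int]:
    """Maximum node count on each of the d levels: 1, m, m^2, ..., m^(d-1)."""
    return [m ** level for level in range(d)]


def max_nodes(m: int, d: int) -> int:
    """Closed form (m^d - 1) / (m - 1); the division is exact for m >= 2."""
    return (m ** d - 1) // (m - 1)


levels = max_nodes_by_level(21, 3)
print(levels)                          # [1, 21, 441]
print(sum(levels), max_nodes(21, 3))   # 463 463
```

Integer division is used so that the result stays an exact integer even for large m and d.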

2. Minimum number of nodes in the B-tree of order m, height d:
- Level analysis
Level 1: 1 node with at least 2 descendants
Level 2: 2 nodes minimum with at least $2\lceil m/2 \rceil$ descendants
Level 3: $2\lceil m/2 \rceil$ nodes min with at least $2\lceil m/2 \rceil^{2}$ descendants
Level 4: $2\lceil m/2 \rceil^{2}$ nodes min with at least $2\lceil m/2 \rceil^{3}$ descendants
. . .
Level d: $2\lceil m/2 \rceil^{d-2}$ nodes min
Total for all levels with minimum numbers of nodes at each level is
Min # nodes = $1 + 2 + 2\lceil m/2 \rceil + 2\lceil m/2 \rceil^{2} + \dots + 2\lceil m/2 \rceil^{d-2}$
= $1 + 2\left(\frac{\lceil m/2 \rceil^{d-1} - 1}{\lceil m/2 \rceil - 1}\right)$
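The minimum-node count can be sketched the same way. The code below is an illustration under the assumption that every non-root node has at least ⌈m/2⌉ descendants and the root has at least 2; the names `min_nodes_by_level` and `min_nodes` are hypothetical, and m >= 3 is assumed so that the closed form does not divide by zero.

```python
def min_nodes_by_level(m: int, d: int) -> list[int]:
    """Minimum node count per level: 1, 2, 2*h, 2*h^2, ..., 2*h^(d-2), h = ceil(m/2)."""
    half = (m + 1) // 2                      # ceil(m / 2) in integer arithmetic
    counts = [1]                             # level 1: the root
    for level in range(2, d + 1):
        counts.append(2 * half ** (level - 2))
    return counts


def min_nodes(m: int, d: int) -> int:
    """Closed form 1 + 2 * (h^(d-1) - 1) / (h - 1); assumes m >= 3 so that h >= 2."""
    half = (m + 1) // 2
    return 1 + 2 * (half ** (d - 1) - 1) // (half - 1)


print(min_nodes_by_level(21, 3))   # [1, 2, 22]
print(min_nodes(21, 3))            # 25
```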

3. Maximum number of keys in the B-tree of order m, height d:
- Level analysis
Level 1: m - 1 keys max
Level 2: m(m-1) keys max
Level 3: $m^2(m-1)$ keys max
. . .
Level d: $m^{d-1}(m-1)$ keys max
Summation of all levels = $(m-1) + m(m-1) + m^2(m-1) + \dots + m^{d-1}(m-1)$
= $(m-1)(1 + m + m^2 + \dots + m^{d-1})$
so, max # of keys = $(m-1)\left(\frac{m^d - 1}{m - 1}\right) = m^d - 1$
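Since a full node holds m - 1 keys, the maximum key count is simply (m - 1) times the maximum node count. The sketch below (illustrative names, not from the original notes) checks that identity against the closed form m^d - 1.

```python
def max_keys_by_level(m: int, d: int) -> list[int]:
    """Maximum keys per level: m^(level-1) full nodes, each holding m - 1 keys."""
    return [(m - 1) * m ** level for level in range(d)]


def max_keys(m: int, d: int) -> int:
    """Closed form m^d - 1."""
    return m ** d - 1


per_level = max_keys_by_level(21, 3)
print(per_level)                          # [20, 420, 8820]
print(sum(per_level), max_keys(21, 3))    # 9260 9260
```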
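4. Minimum number of keys in the B-tree of order m, height d:
- Level analysis
Level 1: 1 key min
Level 2: $2(\lceil m/2 \rceil - 1)$ keys min
Level 3: $2\lceil m/2 \rceil(\lceil m/2 \rceil - 1)$ keys min
. . .
Level d: $2\lceil m/2 \rceil^{d-2}(\lceil m/2 \rceil - 1)$ keys min
Summation of all levels = $1 + 2(\lceil m/2 \rceil - 1)(1 + \lceil m/2 \rceil + \dots + \lceil m/2 \rceil^{d-2}) = 2\lceil m/2 \rceil^{d-1} - 1$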
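The minimum key count can be sketched level by level as well. This example assumes that the root holds at least one key and every other node at least ⌈m/2⌉ - 1 keys (one fewer than its minimum number of descendants); the function names are illustrative.

```python
def min_keys_by_level(m: int, d: int) -> list[int]:
    """Minimum keys per level, assuming minimally filled nodes on every level."""
    half = (m + 1) // 2                      # ceil(m / 2)
    keys = [1]                               # the root holds at least one key
    for level in range(2, d + 1):
        nodes = 2 * half ** (level - 2)      # minimum node count on this level
        keys.append(nodes * (half - 1))      # each node holds at least half - 1 keys
    return keys


def min_keys(m: int, d: int) -> int:
    """Closed form 2 * ceil(m/2)^(d-1) - 1."""
    half = (m + 1) // 2
    return 2 * half ** (d - 1) - 1


per_level = min_keys_by_level(21, 3)
print(per_level)                          # [1, 20, 220]
print(sum(per_level), min_keys(21, 3))    # 241 241
```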
Summary of the formulas for the capacities of a B-tree of order = m, height = d:
1. max # nodes = $\frac{m^d - 1}{m - 1}$
2. min # nodes = $1 + 2\left(\frac{\lceil m/2 \rceil^{d-1} - 1}{\lceil m/2 \rceil - 1}\right)$
3. max # of keys = $m^d - 1$
4. min # keys = $2\lceil m/2 \rceil^{d-1} - 1$
Ex: If the tree has order = 21, d=3
then: 1. max # nodes = $\frac{21^3 - 1}{21 - 1} = \frac{9261 - 1}{20} = 463$ nodes
2. min # nodes = $1 + 2\left(\frac{\lceil m/2 \rceil^{d-1} - 1}{\lceil m/2 \rceil - 1}\right) = 1 + 2\left(\frac{11^{2} - 1}{11 - 1}\right) = 1 + 2\left(\frac{120}{10}\right) = 25$
3. Max # of keys = $m^d - 1 = 21^3 - 1 = 9260$
4. Min # of keys = $2\lceil m/2 \rceil^{d-1} - 1 = 2(11^2) - 1 = 2(121) - 1 = 242 - 1 = 241$
Relationship between N, m, and d
Let N = # of records in file
= # of keys in B-tree index structure
From the formulas for the maximum and minimum number of keys that can be stored in a B-tree index structure of order = m and depth = d (= number of levels), we have
$N \ge 2\lceil m/2 \rceil^{d-1} - 1$
and
$N \le m^d - 1$
We now use these two inequalities to determine lower and upper bounds on one of these three
values, N, m, or d, given the other two. These three cases will be considered next:
1. Given m and d, find bounds on N
2. Given N and d, find bounds on m
3. Given N and m, find bounds on d
1. Given m and d, these inequalities determine the lower and upper bounds on N = the file size (# of records in the data file) that can be stored in the file as
$2\lceil m/2 \rceil^{d-1} - 1 \le N \le m^d - 1$
(lower bound on N on the left, upper bound on N on the right)
For example, if m = 21 and d = 3, then $241 \le N \le 9260$.
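The bounds on N translate directly into a capacity check. The sketch below is illustrative only; `key_bounds` and `fits` are hypothetical helper names, and both bounds are treated as attainable.

```python
def key_bounds(m: int, d: int) -> tuple[int, int]:
    """(min keys, max keys) for a B-tree of order m and height d."""
    half = (m + 1) // 2                      # ceil(m / 2)
    return 2 * half ** (d - 1) - 1, m ** d - 1


def fits(n: int, m: int, d: int) -> bool:
    """True if n keys can be stored in a B-tree of order m with exactly d levels."""
    lo, hi = key_bounds(m, d)
    return lo <= n <= hi


print(key_bounds(21, 3))    # (241, 9260)
print(fits(9000, 21, 3))    # True
print(fits(240, 21, 3))     # False
```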
2. Given N and d, we can find bounds on m by solving each inequality for m.
First, from
$N \le m^d - 1$
get
$N + 1 \le m^d$
and so
$\sqrt[d]{N+1} \le m$, which is a formula for a lower bound on the order m.
Also, from
$2\lceil m/2 \rceil^{d-1} - 1 \le N$
get
$\lceil m/2 \rceil^{d-1} \le \frac{N+1}{2}$
so $\frac{1}{2}m \le \lceil m/2 \rceil \le \sqrt[d-1]{\frac{N+1}{2}}$
and so
$m \le 2\sqrt[d-1]{\frac{N+1}{2}}$
which is a formula for an upper bound on the order m.
That is, the lower and upper bounds on m are
$\sqrt[d]{N+1} \le m \le 2\sqrt[d-1]{\frac{N+1}{2}}$
or, written with exponents,
$(N+1)^{1/d} \le m \le 2\left(\frac{N+1}{2}\right)^{1/(d-1)}$
Ex: Given N = 9000 and d = 3,
then 21 (best case order) $\le m \le$ 134 (worst case node size order)
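The bounds on m can also be found by a direct integer search, which avoids floating-point roots. This is a sketch under stated assumptions rather than the method of the notes: `order_bounds` is a hypothetical name, d >= 2 is required (for d = 1 the minimum-key bound does not constrain m), and both key-count bounds are treated as attainable.

```python
def min_keys(m: int, d: int) -> int:
    half = (m + 1) // 2                      # ceil(m / 2)
    return 2 * half ** (d - 1) - 1


def max_keys(m: int, d: int) -> int:
    return m ** d - 1


def order_bounds(n: int, d: int) -> tuple[int, int]:
    """Smallest and largest order m for which n keys fit in exactly d levels."""
    if d < 2:
        raise ValueError("d must be at least 2 for an upper bound on m to exist")
    lo = 2
    while max_keys(lo, d) < n:               # smallest m with m^d - 1 >= n
        lo += 1
    hi = lo
    while min_keys(hi + 1, d) <= n:          # largest m with 2*ceil(m/2)^(d-1) - 1 <= n
        hi += 1
    return lo, hi


print(order_bounds(9000, 3))   # (21, 134)
```

The linear search is adequate for orders of this size; a binary search would be a natural replacement for very large N.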
3. Given N and m, solve both inequalities for d. First, from $N \le m^d - 1$, get $N + 1 \le m^d$,
and now solve for d by taking $\log_m$ of both sides of the inequality, to get
$\log_m(N + 1) \le \log_m(m^d) = d$
so
$\log_m(N + 1) \le d$, which gives a lower bound for d.
Now from $2\lceil m/2 \rceil^{d-1} - 1 \le N$, get $\lceil m/2 \rceil^{d-1} \le (N+1)/2$, and solve this for d by taking log base $\lceil m/2 \rceil$ of both sides of the inequality to get
$d \le 1 + \log_{\lceil m/2 \rceil}\left(\frac{N+1}{2}\right)$, which gives an upper bound on d.
In summary, the bounds on d are
$\log_m(N+1) \le d \le 1 + \log_{\lceil m/2 \rceil}\left(\frac{N+1}{2}\right)$
(lower bound on the left, upper bound on the right)
Ex. Given N = 9000 and m = 67
$\log_{67}(9001) \le d \le 1 + \log_{34}\left(\frac{9001}{2}\right)$
so $2.16 \le d \le 3.38$
and so $3 \le d \le 3$, that is, d must be 3.
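The bounds on d can be evaluated both with logarithms and with exact integer arithmetic. The sketch below is illustrative: the function names are hypothetical, m >= 3 is assumed so that the logarithm base ⌈m/2⌉ is at least 2, and both key-count bounds are treated as attainable.

```python
import math


def height_bounds_real(n: int, m: int) -> tuple[float, float]:
    """Real-valued bounds: log_m(n + 1) and 1 + log_{ceil(m/2)}((n + 1) / 2)."""
    half = (m + 1) // 2
    return math.log(n + 1, m), 1 + math.log((n + 1) / 2, half)


def height_bounds(n: int, m: int) -> tuple[int, int]:
    """Integer bounds on d found by exact search (assumes m >= 3)."""
    half = (m + 1) // 2
    d_lo = 1
    while m ** d_lo - 1 < n:                 # smallest d with m^d - 1 >= n
        d_lo += 1
    d_hi = 1
    while 2 * half ** d_hi - 1 <= n:         # does a tree of height d_hi + 1 still fit?
        d_hi += 1
    return d_lo, d_hi


print(height_bounds(9000, 67))   # (3, 3)
```

For N = 9000 and m = 67 the real-valued bounds come out at approximately 2.165 and 3.385, so rounding the lower bound up and the upper bound down gives the same d = 3 as the integer search.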
Space Analysis
Memory Analysis
Memory required for a B-tree structure includes space for one B-tree node at each level.
memory space = d * (memory for a B-tree node)
Since a node has m node pointers (RRNs of B-tree nodes), (m-1) keys and (m-1) data record addresses,
memory space = d * [ m * RRNsize + (m-1) * Keysize + (m-1) * RRNsize ]
Ex: For d = 3 and m = 67, with RRNsize = 4 bytes and Keysize = 21 bytes, the memory for 3 B-tree nodes is
memory space = 3 (67*4 + 66*21 + 66*4)
= 3 ( 268 + 1386 + 264 ) = 3 (1918) = 5754 bytes
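The memory formula is easy to parameterize. In the sketch below the function names are illustrative, and the sizes of 4 bytes per RRN and 21 bytes per key are read off the arithmetic of the example above rather than stated explicitly in the notes.

```python
def node_bytes(m: int, rrn_size: int, key_size: int) -> int:
    """Bytes for one node: m child pointers, m - 1 keys, m - 1 record addresses."""
    return m * rrn_size + (m - 1) * key_size + (m - 1) * rrn_size


def memory_bytes(d: int, m: int, rrn_size: int, key_size: int) -> int:
    """Memory for holding one node per level, i.e. d nodes in total."""
    return d * node_bytes(m, rrn_size, key_size)


print(node_bytes(67, 4, 21))        # 1918
print(memory_bytes(3, 67, 4, 21))   # 5754
```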
Disk space for B-tree indexfile
To compute the possible disk space requirements for the B-tree index file (of the B-tree index
nodes), use the other two formulas that involve the maximum and minimum number of nodes in
the B-tree structure of order = m and depth = d:
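1. Max # of nodes = $\frac{m^d - 1}{m - 1}$
2. Min # of nodes = $1 + 2\left(\frac{\lceil m/2 \rceil^{d-1} - 1}{\lceil m/2 \rceil - 1}\right)$
These provide lower and upper bounds on the number of nodes in the B-tree of order = m, depth = d:
$1 + 2\left(\frac{\lceil m/2 \rceil^{d-1} - 1}{\lceil m/2 \rceil - 1}\right) \le \#\text{nodes} \le \frac{m^d - 1}{m - 1}$
(lower bound on the left, upper bound on the right)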


Then the disk space required for the B-tree indexfile would be computed by
disk space = # nodes * node size
but since we do not know the actual number of B-tree nodes required, then the disk space is
between
( (min # nodes in B-tree) * nodesize ) and ( (max # nodes in B-tree) * nodesize )
so the index file size is between
$\left(1 + 2\left(\frac{\lceil m/2 \rceil^{d-1} - 1}{\lceil m/2 \rceil - 1}\right)\right) \cdot \text{nodesize}$ and $\left(\frac{m^d - 1}{m - 1}\right) \cdot \text{nodesize}$
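Ex: If m = 67 and d = 3, then with node size = 1929 B
$\left(1 + 2\left(\frac{\lceil 67/2 \rceil^{2} - 1}{\lceil 67/2 \rceil - 1}\right)\right) \cdot 1929\ \text{B} = \left(1 + 2\left(\frac{34^2 - 1}{34 - 1}\right)\right) \cdot 1929 = 71 \cdot 1929 = 136{,}959\ \text{B} \approx 137$ KB best case
and $\left(\frac{67^3 - 1}{67 - 1}\right) \cdot 1929\ \text{B} = 4557 \cdot 1929 = 8{,}790{,}453\ \text{B} \approx 8.8$ MB worst case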
This is the range that will be used for disk space requirements for the B-tree indexfile.
Again, the data file size requirements are not considered since the data records must be stored
regardless of the type of file structure being used.
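The disk-space range follows from the two node-count formulas. The sketch below uses an illustrative function name and evaluates both formulas directly; for m = 67 and d = 3 the maximum node count is 1 + 67 + 67^2 = 4557, and m >= 3 is assumed.

```python
def index_file_size_range(m: int, d: int, node_size: int) -> tuple[int, int]:
    """(best case, worst case) index file size in bytes for order m, depth d."""
    half = (m + 1) // 2                                   # ceil(m / 2)
    min_nodes = 1 + 2 * (half ** (d - 1) - 1) // (half - 1)
    max_nodes = (m ** d - 1) // (m - 1)
    return min_nodes * node_size, max_nodes * node_size


best, worst = index_file_size_range(67, 3, 1929)
print(best)    # 136959  (71 nodes)
print(worst)   # 8790453 (4557 nodes)
```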
Time Analysis
Expected time to retrieve a record from B-tree file with order = m, depth = d
E(T/retrieval) = E(time to read 1st node)
+ E(time to search node)
+ E(time to read 2nd node)
+ E(time to search node)
+ E(time to read 3rd node)
+ E(time to search node)
+ E(time to read data record)
= d * [ E(node read) + E(node search) ] + E(record read)
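The retrieval-time model is a simple linear combination, sketched below. The function name and the timing values in the usage line are hypothetical placeholders chosen only to show the arithmetic; they are not the device parameters of the notes.

```python
def expected_retrieval_ms(d: int, node_read_ms: float, node_search_ms: float,
                          record_read_ms: float) -> float:
    """d node visits (read + search each) followed by one data record read."""
    return d * (node_read_ms + node_search_ms) + record_read_ms


# Hypothetical timings: 18 ms per node read, 0.5 ms per node search, 18 ms record read.
print(expected_retrieval_ms(3, 18.0, 0.5, 18.0))   # 73.5
```

The model makes the cost of tree height explicit: each additional level adds one node read and one node search to every retrieval.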
Ex: For m = 67, d = 3,
max E(time/retrieval) = 3 [ (12 ms + 6 ms + [node transfer term: 1929-byte node, 512, 2.7]) + [node search term: 67 - 1 keys, 10] ]
+ (12 ms + 6 ms + [record transfer term: 200-byte record, 512, 2.7])
= [value not given] ms maximum time for search
