One disadvantage of the hash table is that it has no way of sorting its contents. This is primarily
due to the scattering of data over buckets via hashing. If we could improve upon this feature, we
would have everything: fast add, fast search, fast delete and the ability to sort.
We could muck with the hash table definition as it stands and grind out the ability to order
elements, but it would be a messy affair and would inevitably slow down the data structure's
overall performance by more than it is worth.
Rather, we look to another storage methodology, the binary tree, a data structure which offers
fast add, fast search and fast delete along with the ability to order elements.
By definition, a binary tree is a structure of nodes with two branches, left and right, the left
branch supporting or "growing" data that is less than the node, the right branch supporting data
that is greater than the node. Equal data never appears in either branch because duplicates are
rejected when elements are added.
Here is a visual example of a tree of integers of depth 4:

                 (root)
                   80
                 /    \                /\
               20      200             ||  4 levels of branches
              /  \        \            ||
            16    55       300         \/
           /     /        /   \
          3     42      243    587
                (leaves)

Searching for elements in a binary tree
We can see that it is easy to search for data in such a structure because we start from the root (80
in this case), and keep working down left and right branches until we find the search element or
reach a terminal leaf.
For example, if we wanted to find the data 55 in the above tree, we check node 80 and find
55 < 80, so we choose the left subtree of 80, which gives us 20. We find 55 > 20, so we choose
the right subtree of 20, which gives us 55, and we are done in three comparisons.

                   80                < first a comparison on 80
                 /    \
               20      200           < then a comparison on 20
              /  \        \
            16    55       300       < we find a match at 55
           /     /        /   \
          3     42      243    587

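The walk described above can be sketched in a few lines of C. This is only an illustration: the `IntNode` type and the `intSearch` function are hypothetical names, and integer keys are stored directly in the node instead of behind the `void*` data pointer used later in this chapter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node holding an integer key directly */
typedef struct IntNode {
    int key;
    struct IntNode *left, *right;
} IntNode;

/* Walk down from <root>; *nCompares receives the number of nodes examined.
   Returns the matching node, or NULL when <key> is not in the tree. */
const IntNode *intSearch(const IntNode *root, int key, int *nCompares)
{
    assert(nCompares != NULL);
    *nCompares = 0;
    while (root != NULL) {
        (*nCompares)++;
        if (key == root->key)
            return root;                      /* match */
        root = (key < root->key) ? root->left : root->right;
    }
    return NULL;                              /* ran off a leaf: no match */
}
```

Called on the sample tree with key 55, the loop examines 80, 20 and 55 in turn, so `*nCompares` ends up as 3.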
Search performance
Just how fast is the search? If we look at the shape of a tree, the worst case number of
comparisons is no greater than the largest depth from root to any one leaf.
In the sample tree given in the previous examples the depth is 4, so a maximum of 4 comparisons
are needed to find any element.
We can see that a tree functions best when it is balanced, i.e. all branches are roughly
equidistant from root to leaf. Here is an example of an unbalanced tree holding the same ten
elements as the balanced tree:

    3
     \                       Unbalanced tree
      587                    provides approximately linear search
     /                       performance
    16
     \                       worst search case is 8 comparisons
      300
     /
    20
     \
      200
     /   \
    55    243
   /  \
  42   80

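How much the insertion order matters can be checked with a small experiment. The sketch below is an assumption-laden illustration, not part of the chapter's code: `IntNode`, `intInsert` and `treeDepth` are hypothetical names, keys are plain integers, and nodes are never freed.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct IntNode {
    int key;
    struct IntNode *left, *right;
} IntNode;

/* Insert <key> below <root> and return the (possibly new) subtree root.
   Duplicate keys are silently ignored. */
IntNode *intInsert(IntNode *root, int key)
{
    if (root == NULL) {
        root = malloc(sizeof *root);
        assert(root != NULL);                 /* sketch: no recovery on failure */
        root->key = key;
        root->left = root->right = NULL;
    } else if (key < root->key) {
        root->left = intInsert(root->left, key);
    } else if (key > root->key) {
        root->right = intInsert(root->right, key);
    }
    return root;
}

/* Number of nodes on the longest path from <n> down to a leaf */
int treeDepth(const IntNode *n)
{
    int l, r;
    if (n == NULL)
        return 0;
    l = treeDepth(n->left);
    r = treeDepth(n->right);
    return 1 + (l > r ? l : r);
}
```

Inserting the ten sample keys in the order 80, 20, 200, 16, 55, 300, 3, 42, 243, 587 gives a depth of 4, while inserting them in ascending order gives a depth of 10, which is a linked list in all but name.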
The maximum number of elements a balanced tree of depth d can support is 2^d - 1, or roughly 2^d.
Hence the worst case number of search comparisons for a balanced tree of n elements is about
log2(n) = log2(2^d) = d, a very small number. For example, if we have 10000 elements in our
tree, the number of comparisons is at most 14!
Here is a chart of worst-case comparison counts for various values of N:

N            :  10  50  100  200  500  1000  2000  5000  10000  20000  50000  100000
#comparisons :   4   6    7    8    9    10    11    13     14     15     16      17

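The worst-case count for a perfectly balanced tree can be computed directly: it is the smallest depth d for which 2^d - 1 is at least n. The function below is a minimal sketch of that arithmetic; the name `maxComparisons` is hypothetical.

```c
#include <assert.h>

/* Worst-case comparisons for a perfectly balanced tree of n keys:
   the smallest d with 2^d - 1 >= n, i.e. the number of binary digits of n.
   Returns 0 for an empty tree. */
unsigned maxComparisons(unsigned long n)
{
    unsigned d = 0;
    while (n > 0) {
        n >>= 1;      /* each halving corresponds to one level of the tree */
        d++;
    }
    return d;
}
```

For n = 10000 this returns 14, the figure quoted above.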
Adding elements to a binary tree
Adding an element to the tree entails inserting it in the proper place.
Let's say we wanted to add the data 77. Following our search algorithm, we arrive at a
no-find situation. The last node visited was 55. 77 is greater than 55, so we know to append 77
on 55's right side:

                   80
                 /    \
               20      200
              /  \        \
            16    55       300       < "55" is the last node visited
           /     /  \     /   \        in the search, so logically
          3     42   77 243    587     77 should be tagged here

On uniqueness
If 77 already existed, our add() function should report an error saying that duplicates are not
allowed and terminate the add. We will enforce uniqueness of keys.
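Insertion with the uniqueness rule can be sketched as follows. This is an illustration only, using a hypothetical `IntNode` type with integer keys and a hypothetical `intAdd` function; it follows a pointer to the link being examined so that the empty-tree case needs no special handling.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct IntNode {
    int key;
    struct IntNode *left, *right;
} IntNode;

/* Returns 1 if <key> was inserted, 0 if it was already present
   (or if memory could not be allocated). */
int intAdd(IntNode **root, int key)
{
    IntNode **link = root;            /* the pointer we may have to fill in */
    assert(root != NULL);
    while (*link != NULL) {
        if (key == (*link)->key)
            return 0;                 /* duplicate: refuse the add */
        link = (key < (*link)->key) ? &(*link)->left : &(*link)->right;
    }
    *link = malloc(sizeof **link);
    if (*link == NULL)
        return 0;
    (*link)->key = key;
    (*link)->left = (*link)->right = NULL;
    return 1;
}
```

Adding 77 to the sample tree ends the search at node 55 and fills in its right link; a second attempt to add 77 returns 0.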
How is the tree organized in memory?
All this looks nice on paper, but how are we going to store such a complicated piece of
machinery in memory? If we look at the potential of the tree, it could fan out to myriads of
branches with subbranches with subbranches with subbranches...
At first glance the coding looks painful and impossible, but on closer examination, we recognize
an important pattern. We see that each node in the tree is in itself a tree, or rather a subtree.
And each node potentially spawns two more trees of its own, a left subtree and a right subtree.

              ...
                \
                 \
                | 300 |              < 300 in itself is the root
                                       of a tree
        left   /       \   right     The node that 300 resides
      subtree /         \ subtree    in consists of the data "300"
             /           \           and a reference to a left
                                     subtree and a right subtree
        | 243 |         | 587 |
                                     < 587's parent is node 300
         /   \           /   \         243's parent is node 300
        /     \         /     \        300's children are 243, 587
      ...     ...     ...     ...

This relationship is convenient and manageable. We can come up with a recursive definition of
a tree, recursive in the sense that we can define it as a function of itself.
Binary tree definition
There are two elements of the binary tree: a BTNode which describes a single element and its
linkage, and a BTree structure which describes the high level characteristics of the tree, mainly a
pointer to the first node in the tree.
BTNode
    data      : pointer to data
    left      : pointer to left subtree
    right     : pointer to right subtree
BTree
    root      : pointer to first BTNode
    nElements : total number of elements in tree
Similar to nodes in the linked list, nodes in the binary tree can be allocated on the fly in heap
space and be scattered throughout memory, but every node has a parent that knows where it is
and manages it. The ultimate parent is the root, which can reach all of its children,
grandchildren, great-grandchildren and so on.
If a node's left and right subtrees are null, then we consider the node a terminal leaf.
An empty tree contains a null root pointer.
Ordering elements in the tree
Visually on paper the tree has a well-defined relationship. All elements to the left of any node
have data less than the node's data and all elements to the right have data greater than the
node's data. But how does this help us with sorting the maze?
If we relabel the tree in terms of the order in which we would display each node to display its
contents in ascending order, we get:

                    6
                  /   \
                 3     7
                / \     \
               2   5     9
              /   /     / \
             1   4     8   10

yielding the output : {3,16,20,42,55,80,200,243,300,587}
Note the first element processed in the display is the deepest leftmost and the last element
processed is the deepest rightmost.
We must perform a depth-first traversal to process the "deepest" elements first, working from
left to right. A recursive algorithm can be sketched briefly and concisely:
display(BTNode node)
    if (node is empty)
        return
    display(node.left)
    print node.data
    display(node.right)
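Visiting the left subtree, then the node itself, then the right subtree yields the keys in ascending order. The C sketch below shows this with a hypothetical `IntNode` type and a hypothetical `inorderCollect` function that writes the keys into a caller-supplied array (assumed large enough) instead of printing them.

```c
#include <assert.h>
#include <stddef.h>

typedef struct IntNode {
    int key;
    struct IntNode *left, *right;
} IntNode;

/* Store the keys of the tree rooted at <n> into out[], starting at index
   <count>, in ascending order. Returns the index after the last key stored. */
size_t inorderCollect(const IntNode *n, int *out, size_t count)
{
    assert(out != NULL);
    if (n == NULL)
        return count;                              /* empty subtree */
    count = inorderCollect(n->left, out, count);   /* smaller keys first */
    out[count++] = n->key;                         /* then the node itself */
    return inorderCollect(n->right, out, count);   /* then larger keys */
}
```

On the sample tree the array receives 3, 16, 20, 42, 55, 80, 200, 243, 300, 587.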
Deleting elements in the tree
Removing nodes from trees is perhaps the trickiest operation. Why? Because an empty space or
"hole" in a tree creates a mess, involving a non-trivial reorganization of the tree.
We can attack the problem by looking at some simple cases and trying to find patterns which can
hopefully germinate into an algorithm.
Case 1:
      5                                 5
     / \                               / \
    2   8                     =>      2   8
         \
          20  < deleting 20
                is easy because it
                is a terminal leaf

Case 2:
      5                                 5
     / \                               / \
    2   8   < deleting 8 is   =>      2   20
         \    a little trickier
          20  because what happens
              to 20? It must be
              shifted up below 5

Case 3:
                                                (a)             (b)
      5                                 =>       5               5
     / \                                        / \      OR     / \
    2   8    < deleting 8 is trickier          2   12          2   6
       / \     this time because shifting         /  \              \
      6   20   up 20 fails! 6 or 12              6    20             20
         /     are the only possibilities.                          /
        12                                                         12

Actually, case (3a) results in a better balanced tree than (3b), so we should opt for replacing 8
with 12 rather than 6.
A deletion algorithm
Case 3 describes the most general scenario of removing a node from a tree. The worst case is
that we are trying to remove an element from the middle of the tree.
Trailing or "child" nodes must be rearranged, as must the ancestor or "parent" node of the
node to be deleted.
We also have a choice of taking the "leftmost next" or the "rightmost next" node of the node
to be deleted as the replacement node. The leftmost next node is the terminal leaf in the left
subtree of the node to be deleted that would logically appear next in line to it. The rightmost
next node is the same, but in the right subtree of nodeToDelete.
The one we choose for the actual replacement should be the one that descends deeper in the tree.

     >       nodeToDelete       <
     |         /      \         |
     |       ...      ...       |
     |         \      /         |
     leftMostNext      rightMostNext

Pseudocode:
* Remove <data> element from binary tree <bt>
remove(bt,data)
    Search for <data>
    if (found)
        Find <leftMostNext>
        Find <rightMostNext>
        if leftMostNext descends deeper than rightMostNext
            replacementNode = leftMostNext
        else
            replacementNode = rightMostNext
        Attach child pointers of <nodeToDelete>
            to child pointers of <replacementNode>
        Attach <replacementNode> to <nodeToDelete's> parent
        deallocate <nodeToDelete>
        bt.nElements = bt.nElements - 1
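For comparison, here is a sketch of a different, commonly used replacement rule: take the in-order predecessor, that is, the rightmost node of the left subtree. It is not the BTRemove() of this chapter. The `IntNode` type and the `newNode` and `intRemove` functions are hypothetical, keys are plain integers, nodes are assumed to come from malloc(), and the balancing preference described above is ignored.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct IntNode {
    int key;
    struct IntNode *left, *right;
} IntNode;

/* Allocate a node with the given key and subtrees */
IntNode *newNode(int key, IntNode *left, IntNode *right)
{
    IntNode *n = malloc(sizeof *n);
    assert(n != NULL);                    /* sketch: no recovery on failure */
    n->key = key;
    n->left = left;
    n->right = right;
    return n;
}

/* Remove <key> from the tree at *root. Returns 1 if removed, 0 if absent. */
int intRemove(IntNode **root, int key)
{
    IntNode **link = root, **predLink, *victim, *pred;
    assert(root != NULL);
    while (*link != NULL && (*link)->key != key)
        link = (key < (*link)->key) ? &(*link)->left : &(*link)->right;
    if (*link == NULL)
        return 0;                         /* nothing to delete */
    victim = *link;
    if (victim->left == NULL) {
        *link = victim->right;            /* leaf, or only a right child */
    } else if (victim->right == NULL) {
        *link = victim->left;             /* only a left child */
    } else {
        /* Two children: find the rightmost node of the left subtree */
        predLink = &victim->left;
        while ((*predLink)->right != NULL)
            predLink = &(*predLink)->right;
        pred = *predLink;
        *predLink = pred->left;           /* unlink it; its left child moves up */
        pred->left = victim->left;
        pred->right = victim->right;
        *link = pred;
    }
    free(victim);
    return 1;
}
```

Applied to Case 3, deleting 8 promotes 6, giving arrangement (3b). Because the predecessor is the largest key smaller than the deleted one, an inorder traversal after the removal is still ascending. In the test output later in this chapter, Boo is listed before Bob after the first delete; choosing the in-order neighbour as the replacement avoids that kind of reordering, at the cost of the balancing heuristic.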
The Binary Tree definition (BT.H)
#ifndef BTH
#define BTH
#include "std.h"
/* Data description */
typedef struct btnode
{
void *data;
struct btnode *left,*right;
} BTNode;
typedef struct
{
int nElements;
BTNode *root;
} BTree;
/* Executable function signature */
typedef void (*FunctionToExecute)(void *data,void *args[]);
/* Function prototypes */
extern void BTInit(BTree *bt);
extern void BTWrapup(BTree *b,int shouldDelete);
extern BTNode* BTSearch(BTree *b,void *data,CompareFunction cmp);
extern BTNode* BTAdd(BTree *bt,void *data,CompareFunction cmp);
extern int BTRemove(BTree *b,void *data,CompareFunction cmp,
int shouldDelete);
extern void BTInorder(BTree *bt,FunctionToExecute f,void *args[]);
extern void BTDisplayLikeTree(BTree *b,FunctionToExecute displayer);
extern int BTNElements(BTree *bt);
extern BTNode* BTRoot(BTree *bt);
extern BTNode* BTLeftTree(BTNode *btn);
extern BTNode* BTRightTree(BTNode *btn);
#endif
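The header relies on "std.h" for CompareFunction, TRUE/FALSE and the comparison result codes, but that file is not shown here. The following is a guess at a minimal version, offered purely as an assumption so the listings can be compiled; the actual values and includes in the original std.h may differ. The `intComparer` function is a hypothetical example of a comparer for data that points at int keys.

```c
/* Hypothetical contents of "std.h" (assumed, not taken from the original) */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TRUE  1
#define FALSE 0

/* Result codes returned by comparison functions (values are an assumption) */
enum { LESS_THAN = -1, EQUAL = 0, GREATER_THAN = 1 };

/* Compares two data items and returns one of the codes above */
typedef int (*CompareFunction)(void *data1, void *data2);

/* Example comparer for data items that point at int keys */
int intComparer(void *data1, void *data2)
{
    int a, b;
    assert(data1 != NULL && data2 != NULL);
    a = *(int*)data1;
    b = *(int*)data2;
    if (a == b)
        return EQUAL;
    return (a < b) ? LESS_THAN : GREATER_THAN;
}
```

A function such as `intComparer` would be passed as the `cmp` argument of BTAdd(), BTSearch() and BTRemove().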
Pseudocode for the operations
****************************************************************
* Initialize a binary tree <b>
init(b)
    b.root <- 0
    b.nElements <- 0
****************************************************************
* Deallocate all nodes in tree <b>
* Deallocates each node's data if <shouldDelete> is TRUE
wrapup(b,shouldDelete)
    traverse <b> in postorder, deallocating each node last
****************************************************************
* Search for <data> in tree <b> given comparison function <cmp>
* If <data> is found, returns <BTNode> where <data> resides
* otherwise returns 0
search(b,data,cmp)
    if (find(b.root,data,cmp,findNode))
        return findNode.left
    else
        return 0
****************************************************************
* Find <data> in BTNode <bt> given comparison function <cmp>
* If found, <findNode.left> contains the node match
*           <findNode.right> contains the parent of the matched node
* otherwise, <findNode.left> contains the last node where <data>
*            should have been if it existed in <bt>
find(bt,data,cmp,findNode)
    if (bt is empty)
        return FALSE                        * (1) no find
    findNode.left <- bt
    if (data = bt.data)                     * (2) Match found
        return TRUE
    * On case of next find call yielding equality, remember parent
    findNode.right <- bt
    if (data < bt.data)                     * (3) search left subtree
        return find(bt.left,data,cmp,findNode)
    else                                    * (4) search right subtree
        return find(bt.right,data,cmp,findNode)
****************************************************************
* Add <data> in tree <b> given comparison function <cmp>
* If <data> already exists in <b> the duplicate is NOT added
* otherwise <data> is added in the proper place in the tree
add(b,data,cmp)
    if (find(b.root,data,cmp,findNode))
        return                              * duplicate, nothing added
    newNode <- createBTNode(0,data,0)
    if (b.nElements = 0)
        * First element added, point root to it
        b.root <- newNode
    else
        * Add element to left or right of <findNode>
        nodeToInsertAt <- findNode.left
        if (data < nodeToInsertAt.data)
            nodeToInsertAt.left <- newNode
        else
            nodeToInsertAt.right <- newNode
    b.nElements <- b.nElements + 1
****************************************************************
* Traverse <bt> in sorted order and execute function <f>
* passing <args> for every node passed
inorder(b,f,args[])
    processInorder(b.root,f,args)
processInorder(bt,f,args[])
    if (bt is empty)
        return
    processInorder(bt.left,f,args)
    f(bt.data,args)
    processInorder(bt.right,f,args)
****************************************************************
* Allocate, initialize and return a new <BTNode>
* given <left> subtree, <right> subtree and node <data>
createBTNode(left,data,right)
    allocate new BTNode <newNode>
    newNode.left <- left
    newNode.data <- data
    newNode.right <- right
    return newNode
****************************************************************
* Deallocate BTNode <btn>
* Deallocates <btn.data> if <shouldDelete> is TRUE
deallocateBTNode(btn,shouldDelete)
    if (btn is empty)
        return
    if (shouldDelete)
        deallocate btn.data
    deallocate btn
The Binary Tree operations (BT.C)
#include "bt.h"
static int shouldDebugBTRemove = FALSE;
#define TREE_DISPLAY_TAB_WIDTH 8
static BTNode* createBTNode(BTNode *left,void *data,BTNode *right);
static void deallocateBTNode(BTNode *btn,int shouldDelete);
static void printNode(char *descr,BTNode *node);
static void BTProcessWrapup(BTNode *btn,int shouldDelete);
/*--------------------------------------------------------------------*/
/* Code to set up a tree <b> for operation                            */
/*--------------------------------------------------------------------*/
void BTInit(BTree *b)
{
    b->nElements = 0;
    b->root = 0;
}
/*--------------------------------------------------------------------*/
/* Code to deallocate all nodes in tree <b>                           */
/* Does postorder traversal of <b> whose last operation is to         */
/* deallocate the node and optionally deallocate node's <data>        */
/* if <shouldDelete> is TRUE                                          */
/*--------------------------------------------------------------------*/
void BTWrapup(BTree *b,int shouldDelete)
{
    BTProcessWrapup(b->root,shouldDelete);
}
static void BTProcessWrapup(BTNode *btn,int shouldDelete)
{
    if (!btn)
        return;
    BTProcessWrapup(btn->left,shouldDelete);
    BTProcessWrapup(btn->right,shouldDelete);
    if (shouldDelete)
        free(btn->data);
    free(btn);
}
/*--------------------------------------------------------------------*/
/* Code to find <data> in <bt> given comparison function <cmp>        */
/* If found, <findNode->left> contains the node match                 */
/*           <findNode->right> contains the parent of the matched node */
/* If not found, <findNode->left> contains the last node where <data> */
/*           should have been if it existed in <bt>                   */
/*--------------------------------------------------------------------*/
static int BTFind(BTNode *bt,void *data,CompareFunction cmp,BTNode *findNode)
{
    int compareResult;
    /* CASE 1: No match found */
    if (!bt)
        return (FALSE);
    compareResult = (*cmp)(data,bt->data);
    findNode->left = bt;
    /* CASE 2: Match found */
    if (compareResult == EQUAL)
        return (TRUE);
    /* On case of next BTFind() call yielding equality, remember parent */
    findNode->right = bt;
    /* CASE 3 : Search left subtree */
    if (compareResult == LESS_THAN)
        return (BTFind(bt->left,data,cmp,findNode));
    /* CASE 4 : Search right subtree */
    else
        return (BTFind(bt->right,data,cmp,findNode));
}
BTNode* BTSearch(BTree *b,void *data,CompareFunction cmp)
{
    BTNode findNode;   /* dummy node whose <left> subtree pointer
                          points to last node processed in find */
    if (BTFind(b->root,data,cmp,&findNode))
        return (findNode.left);   /* CASE 1: Match */
    else
        return NULL;              /* CASE 2: No Match */
}
/*--------------------------------------------------------------------*/
/* Code to add <data> to <b> given comparison function <cmp>          */
/* If <data> already exists in <b>                                    */
/*  . the duplicate is NOT added, and the node data belongs to is returned */
/* otherwise                                                          */
/*  . <data> is added in the proper place in the tree and NULL is returned */
/*--------------------------------------------------------------------*/
BTNode* BTAdd(BTree *b,void *data,CompareFunction cmp)
{
    BTNode findNode;
    if (BTFind(b->root,data,cmp,&findNode))
    {
        /* CASE 1: No adding the same element twice */
        return (findNode.left);   /* inform user where duplicate is */
    }
    else
    {
        /* CASE 2: add the element */
        BTNode *newNode = createBTNode(0,data,0);
        if (!b->nElements)
        {
            /* CASE 2a: First element added, point root to it */
            b->root = newNode;
        }
        else
        {
            /* CASES 2b,2c: add element to left or right of <findNode> */
            BTNode *nodeToInsertAt = findNode.left;
            if ((*cmp)(data,nodeToInsertAt->data) == LESS_THAN)
                nodeToInsertAt->left = newNode;
            else
                nodeToInsertAt->right = newNode;
        }
        b->nElements++;
        return (NULL);   /* Add successful */
    }
}
/*--------------------------------------------------------------------*/
/* Code to traverse <bt> in sorted order and execute function <f>     */
/* passing <args> for every node passed                               */
/*--------------------------------------------------------------------*/
static void BTProcessInorder(BTNode *bt,FunctionToExecute f,void *args[])
{
    if (!bt)
        return;
    BTProcessInorder(bt->left,f,args);
    (*f)(bt->data,args);
    BTProcessInorder(bt->right,f,args);
}
void BTInorder(BTree *b,FunctionToExecute f,void *args[])
{
    BTProcessInorder(b->root,f,args);
}
/*--------------------------------------------------------------------*/
/* Helper functions for tree delete algorithm                         */
/* Find and return leftmost next node and rightmost next node         */
/* after node <btn>                                                   */
/*--------------------------------------------------------------------*/
static BTNode* BTLeftMostNode(
    BTNode *btn,
    int *leftMostNextNodeDepth,
    BTNode **leftMostNextParent)
{
    *leftMostNextNodeDepth = 0;
    *leftMostNextParent = 0;
    if (!btn)
        return(NULL);
    *leftMostNextParent = btn;
    btn = btn->left;
    if (btn)
    {
        /* Descend to a terminal leaf, preferring right branches */
        while (btn->right || btn->left)
        {
            *leftMostNextParent = btn;
            if (btn->right)
                btn = btn->right;
            else
                btn = btn->left;
            (*leftMostNextNodeDepth)++;
        }
    }
    return(btn);
}
static BTNode* BTRightMostNode(
    BTNode *btn,
    int *rightMostNextNodeDepth,
    BTNode **rightMostNextParent)
{
    *rightMostNextNodeDepth = 0;
    *rightMostNextParent = 0;
    if (!btn)
        return(NULL);
    *rightMostNextParent = btn;
    btn = btn->right;
    if (btn)
    {
        /* Descend to a terminal leaf, preferring left branches */
        while (btn->left || btn->right)
        {
            *rightMostNextParent = btn;
            if (btn->left)
                btn = btn->left;
            else
                btn = btn->right;
            (*rightMostNextNodeDepth)++;
        }
    }
    return(btn);
}
/*--------------------------------------------------------------------*/
/* Code to remove <data> in <b> given comparison function <cmp>       */
/* and <shouldDelete>                                                 */
/* If <data> is found,                                                */
/*  . the node data belongs to is removed and <data>                  */
/*    freed if <shouldDelete> is TRUE                                 */
/*  . <b> is reorganized to fill the hole where <data> used to be     */
/*    according to the best overall balancing                         */
/* Sample tree to delete element <96>
   <nodeToDelete>           = 96
   <leftMostNextNode>       = 30
   <leftMostNextNodeDepth>  = 1
   <leftMostNextParent>     = 96
   <rightMostNextNode>      = 102
   <rightMostNextNodeDepth> = 3
   <rightMostNextParent>    = 100
   <nodeToDeletesParent>    = 200

   Before delete:            After delete:
        6                        6
       / \                      / \
      1   200                  1   200
          /                        /
       |96|                     |102|
       /  \                     /   \
     30    104                30     104
          /   \                     /   \
        100   190                 100   190
           \
           102
*/
/*--------------------------------------------------------------------*/
int BTRemove(BTree *b,void *data,CompareFunction cmp,int shouldDelete)
{
    BTNode findNode,*nodeToDelete,*nodeToDeletesParent;
    BTNode *leftMostNextNode,*leftMostNextParent;
    BTNode *rightMostNextNode,*rightMostNextParent;
    BTNode *replacementNode;
    int leftMostNextNodeDepth,rightMostNextNodeDepth;
    /* CASE 1: Can't find element to delete, quit */
    findNode.right = b->root;
    if (!BTFind(b->root,data,cmp,&findNode))
        return FALSE;
    /* CASE 2: Found the element to delete */
    /* Figure out which element (if any) to replace */
    /* <element to delete> with */
    nodeToDelete = findNode.left;
    nodeToDeletesParent = findNode.right;
    leftMostNextNode = BTLeftMostNode(nodeToDelete,
                                      &leftMostNextNodeDepth,
                                      &leftMostNextParent);
    rightMostNextNode = BTRightMostNode(nodeToDelete,
                                        &rightMostNextNodeDepth,
                                        &rightMostNextParent);
    if (!leftMostNextNode && !rightMostNextNode)
    {
        /* CASE 2a: No replacement of <nodeToDelete> necessary */
        replacementNode = 0;
    }
    else if (leftMostNextNode
             && (leftMostNextNodeDepth >= rightMostNextNodeDepth))
    {
        /* CASE 2b: choose <leftMostNextNode> to replace <nodeToDelete> */
        /* for best balancing */
        replacementNode = leftMostNextNode;
        /* Detach the replacement from its parent BEFORE copying the child */
        /* pointers, so it cannot end up pointing at itself when its       */
        /* parent is <nodeToDelete>                                        */
        if (leftMostNextParent->left == leftMostNextNode)
            leftMostNextParent->left = NULL;
        else if (leftMostNextParent->right == leftMostNextNode)
            leftMostNextParent->right = NULL;
        leftMostNextNode->left = nodeToDelete->left;
        leftMostNextNode->right = nodeToDelete->right;
    }
    else
    {
        /* CASE 2c: choose <rightMostNextNode> to replace <nodeToDelete> */
        /* for best balancing */
        replacementNode = rightMostNextNode;
        /* Detach first, for the same reason as in CASE 2b */
        if (rightMostNextParent->left == rightMostNextNode)
            rightMostNextParent->left = NULL;
        else if (rightMostNextParent->right == rightMostNextNode)
            rightMostNextParent->right = NULL;
        rightMostNextNode->left = nodeToDelete->left;
        rightMostNextNode->right = nodeToDelete->right;
    }
    /* Patch up <nodeToDeletesParent> properly */
    if (nodeToDelete == b->root)
        b->root = replacementNode;
    else
    {
        if (nodeToDeletesParent->left == nodeToDelete)
            nodeToDeletesParent->left = replacementNode;
        else
            nodeToDeletesParent->right = replacementNode;
    }
    /* get rid of <nodeToDelete> as the final operation */
    deallocateBTNode(nodeToDelete,shouldDelete);
    b->nElements--;
    return(TRUE);
}
/*--------------------------------------------------------------------*/
/* Code to display a binary tree in tree format                       */
/*--------------------------------------------------------------------*/
static void displayTab(int nSpaces)
{
    int i;
    for (i = 0; i < nSpaces; i++)
        printf(" ");
}
static void displayLikeTree(BTNode *node,int depth,FunctionToExecute displayer)
{
    if (!node)
        return;
    displayLikeTree(node->left,depth+1,displayer);
    displayTab(depth*TREE_DISPLAY_TAB_WIDTH);
    (*displayer)(node->data,0);
    displayLikeTree(node->right,depth+1,displayer);
}
void BTDisplayLikeTree(BTree *b,FunctionToExecute displayer)
{
    printf("\n");
    displayLikeTree(b->root,0,displayer);
}
int BTNElements(BTree *b) { return b->nElements; }
BTNode* BTRoot(BTree *b) { return b->root; }
/* BTNode functions */
static BTNode* createBTNode(BTNode *left,void *data,BTNode *right)
{
    BTNode *newNode = (BTNode*)malloc(sizeof(BTNode));
    newNode->left = left;
    newNode->data = data;
    newNode->right = right;
    return (newNode);
}
static void deallocateBTNode(BTNode *btn,int shouldDelete)
{
    if (!btn)
        return;
    if (shouldDelete)
        free(btn->data);
    free(btn);
}
static void printNode(char *descr,BTNode *node)
{
    printf("%s %s\n",descr,(char*)node->data);
}
BTNode* BTLeftTree(BTNode *btn) { return btn->left; }
BTNode* BTRightTree(BTNode *btn) { return btn->right; }
Code to test the binary tree (BTTEST.C)
#include "bt.h"
static int comparer(void *data1,void *data2)
{
char *d1 = (char*)data1;
char *d2 = (char*)data2;
int result = strcmp(d1,d2);
if (result == 0)
return (EQUAL);
else if (result < 0)
return (LESS_THAN);
else
return (GREATER_THAN);
}
static void displayer(void *data,void *args[]) // args not used
{
char *stringData = (char*)data;
printf("%s\n",stringData);
}
static void selectiveDisplayer(void *data,void *args[])
{
/* args[0] gives us the first character of the names to display */
char firstCharToAccept = *(char*)args[0];
char *stringData = (char*)data;
if (stringData
&& (stringData[0] == firstCharToAccept))
printf("%s\n",stringData);
}
int main(void)
{
BTree bT,*b = &bT;
BTNode *btNode;
void *args[1];
BTInit(b);
/* Add some elements to the tree and display them */
BTAdd(b,(void*)"Joe",comparer);
BTAdd(b,(void*)"Nancy",comparer);
BTAdd(b,(void*)"Zeo",comparer);
BTAdd(b,(void*)"Emmett",comparer);
BTAdd(b,(void*)"Boo",comparer);
BTAdd(b,(void*)"Bob",comparer);
BTAdd(b,(void*)"Frank",comparer);
printf("Here are the elements in sorted order:\n");
printf("\n");
BTInorder(b,displayer,NULL);
printf("Total # of elements: %d\n\n",BTNElements(b));
printf("Here is tree display of elements:\n");
printf("\n");
BTDisplayLikeTree(b,displayer);
/* Search for an existing and nonexistent */
btNode = BTSearch(b,(void*)"Emmett",comparer);
if (btNode)
printf("Emmett found\n");
btNode = BTSearch(b,(void*)"Zeke",comparer);
if (!btNode)
printf("Zeke not found\n");
printf("Here are the elements after 1st delete:\n");
printf("\n");
if (BTRemove(b,(void*)"Emmett",comparer,FALSE))
printf("Emmett removed...\n");
BTDisplayLikeTree(b,displayer);
printf("\n\n");
printf("Here are the elements after 2nd delete:\n");
printf("\n");
if (!BTRemove(b,(void*)"KK",comparer,FALSE))
printf("KK not removed...\n");
if (BTRemove(b,(void*)"Joe",comparer,FALSE))
printf("Joe removed...\n");
BTDisplayLikeTree(b,displayer);
printf("Total # of elements: %d\n\n",BTNElements(b));
printf("\n\n");
/* Display only 'B' names */
printf("Here are 'B' elements:\n");
printf("\n");
args[0] = (void*)"B";
BTInorder(b,selectiveDisplayer,args);
BTWrapup(b,FALSE);
}
Output of BTTEST.C
Here are the elements in sorted order:
Bob
Boo
Emmett
Frank
Joe
Nancy
Zeo
Total # of elements: 7
Here is tree display of elements:
                        Bob
                Boo
        Emmett
                Frank
Joe
        Nancy
                Zeo
Emmett found
Zeke not found
Here are the elements after 1st delete:
Emmett removed...
                Boo
        Bob
                Frank
Joe
        Nancy
                Zeo
Here are the elements after 2nd delete:
KK not removed...
Joe removed...
                Boo
        Bob
Frank
        Nancy
                Zeo
Total # of elements: 5
Here are 'B' elements:
Boo
Bob
Expression trees
Binary trees are very commonly used to evaluate mathematical expressions. Each node consists
of an operator or a number, and the very structure of the tree decides the ordering of the
operations. Consider the following trees, each describing a different expression:

        +                       *
       / \                     / \
      3   *                   +   6
         / \                 / \
        2   6               3   2

    3 + 2*6               (3 + 2)*6

In evaluating the expression tree we start from the leaves and work upwards. In the 3 + 2*6
example, we can work through the intermediate steps on paper to compute a final result:

        +                    +                   15
       / \                  / \
      3   *        =>      3   12        =>
         / \
        2   6

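The bottom-up evaluation can be sketched independently of the BTree types. The `ExprNode` type and the `evalExpr` function below are hypothetical and exist only for this illustration; errors are reported through an `ok` flag, which the caller is assumed to set to 1 before the call.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ExprNode {
    char type;                 /* 'v' for a value, else '+', '-', '*' or '/' */
    double value;              /* used only when type == 'v' */
    struct ExprNode *left, *right;
} ExprNode;

/* Evaluate the expression rooted at <n>, leaves first.
   Sets *ok to 0 on division by zero or an unknown operator. */
double evalExpr(const ExprNode *n, int *ok)
{
    double l, r;
    assert(n != NULL && ok != NULL);
    if (n->type == 'v')
        return n->value;               /* a leaf evaluates to itself */
    l = evalExpr(n->left, ok);         /* operands are computed first... */
    r = evalExpr(n->right, ok);
    switch (n->type) {                 /* ...then combined by the operator */
    case '+': return l + r;
    case '-': return l - r;
    case '*': return l * r;
    case '/': if (r != 0.0) return l / r; break;
    }
    *ok = 0;
    return 0.0;
}
```

For the two trees shown earlier, the function returns 15 for 3 + 2*6 and 30 for (3 + 2)*6; the only difference between the two calls is the shape of the tree.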
Creating or parsing the expression tree from a string is in itself another matter.
A complete application allows the user to enter a mathematical expression and proceeds to
evaluate it.
Let's build an application that performs the evaluation portion.
Expression tree evaluator (EXP.H)
#ifndef EXPH
#define EXPH
enum {VALUE='v',ADD='+',SUBTRACT='-',MULTIPLY='*',DIVIDE='/'};
typedef struct
{
double value;
char type;
} ExpressionNode;
#endif
Expression tree evaluator (EXP.C)
/* Code to evaluate a numeric tree expression */
#include "bt.h"
#include "exp.h"
static double processEvaluateTreeExpression(BTNode *btn);
static ExpressionNode* createExpNode(int type,double value)
{
    ExpressionNode *expNode = (ExpressionNode*)malloc(sizeof(ExpressionNode));
    expNode->type = type;
    expNode->value = value;
    return (expNode);
}
/* Create an expression tree that describes: (3+2) * (8 - (-6)) */
/* Note: createBTNode() is declared static in BT.C, so it must be made */
/* visible to this file for the code below to link */
static void createTreeExpression(BTree *b)
{
    BTNode *t1L = createBTNode(0,(void*)createExpNode(VALUE,3.0),0);
    BTNode *t1R = createBTNode(0,(void*)createExpNode(VALUE,2.0),0);
    BTNode *t1 = createBTNode(t1L,(void*)createExpNode(ADD,0.0),t1R);
    BTNode *t2L = createBTNode(0,(void*)createExpNode(VALUE,8.0),0);
    BTNode *t2R = createBTNode(0,(void*)createExpNode(VALUE,-6.0),0);
    BTNode *t2 = createBTNode(t2L,(void*)createExpNode(SUBTRACT,0.0),t2R);
    BTNode *t3 = createBTNode(t1,(void*)createExpNode(MULTIPLY,0.0),t2);
    BTInit(b);
    b->root = t3;
}
static double evaluateTreeExpression(BTree *b)
{
    if (!b->root)
        return 0.0;
    else
        return (processEvaluateTreeExpression(b->root));
}
static void displayer(void *data,void *args[]) // args not used
{
    ExpressionNode *expNode = (ExpressionNode*)data;
    if (expNode->type == VALUE)
        printf("%f\n",expNode->value);
    else
        printf("%c\n",expNode->type);
}
static double processEvaluateTreeExpression(BTNode *btn)
{
    ExpressionNode *expNode = (ExpressionNode*)btn->data;
    double result,leftResult,rightResult;
    /* A value node evaluates to itself */
    if (expNode->type == VALUE)
        return expNode->value;
    /* An operator node needs both of its operands */
    if (!btn->left || !btn->right)
    {
        printf("Operator %c is missing an operand\n",expNode->type);
        exit(1);
    }
    leftResult = processEvaluateTreeExpression(btn->left);
    rightResult = processEvaluateTreeExpression(btn->right);
    if (expNode->type == ADD)
        result = leftResult + rightResult;
    else if (expNode->type == SUBTRACT)
        result = leftResult - rightResult;
    else if (expNode->type == MULTIPLY)
        result = leftResult * rightResult;
    else if (expNode->type == DIVIDE)
    {
        if (rightResult == 0.0)
        {
            printf("?Division by zero error on %f/%f",leftResult,rightResult);
            exit(1);
        }
        result = leftResult / rightResult;
    }
    else
    {
        printf("Unknown operation: %c\n",expNode->type);
        exit(1);
    }
    return result;
}
int main(void)
{
    BTree expTree,*exp=&expTree;
    double result;
    createTreeExpression(exp);
    printf("Here is the expression tree:\n");
    BTDisplayLikeTree(exp,displayer);
    result = evaluateTreeExpression(exp);
    printf("\nThe result is %f\n",result);
    BTWrapup(exp,FALSE);
    return 0;
}
Output
Here is the expression tree:
                3.000000
        +
                2.000000
*
                8.000000
        -
                -6.000000
The result is 70.000000
Cross-referencing with trees
In large competing companies, especially in the software industry, it is not unusual to find
employees drifting amongst competitors. While one company suffers "brain drain" of managers,
programmers, analysts and engineers, another gains by snatching them up. Several such
companies have agreed to bring in a third party to pool their employee databases to better
monitor and assess the degree of employee migration.
Studies have been done and a senior designer has proposed to centre the core data on an array of
trees. An array was chosen to record companies because generally the number of companies is
fixed. Each company has its own tree of employees. A tree is excellent for fast search (and
ordered display). Fast search is a required characteristic if we are to cross-reference employees
across companies.
The program will create a tree of employees and each employee will have a linked list describing
their employment history amongst the "family" of competitors. If more companies are added to
the pool, a new company tree is added to the array and the program updates the employee history
links as need be.
Record data
Employee
    name              : String
    currentEmployer-> : Company
    history->         : Linked list of History records
History
    salary            : Integer
    title             : String
    hireDate          : MM/YY
    duration          : Months
    company->         : Company
Company
    name              : String
    Employees         : BTree
Company-employee original database:
| *Fred |
| / \ | * means "is a current employee"
| Kal Inc | Joe *Henrietta |
| \ |
| *Nancy |
| *Albert |
| / |
| DZM Ltd.| *Carol Fred |
| \ / \ |
| *Tim Joe |
.
.
.
| Anthony |
| / |
| XYZ Ltd.| Carol *Joe |
| \ / |
| Sandy |
Employee History:
Joe, currently employed by XYZ as a manager, has had previous positions
at DZM and Kal as analyst and programmer respectively

|Joe    |      |68000         |      |50000         |      |45000         |
|XYZ    |  ->  |Manager       |  ->  |Analyst       |  ->  |Programmer    |
               |05/95         |      |03/94         |      |06/91         |
               |19            |      |14            |      |33            |
               |XYZ           |      |DZM           |      |Kal           |

Fred, currently working for Kal, has had only 1 previous position,
at DZM, also employed as a programmer

|Fred   |      |34000         |      |28000         |
|Kal    |  ->  |Programmer    |  ->  |Programmer    |
               |05/96         |      |10/94         |
               |7             |      |24            |
               |Kal           |      |DZM           |

Anthony, currently unemployed amongst the pool of companies,
worked for XYZ for 5 years at an ending salary of 20k

|Anthony|  ->  |20000         |
|0      |      |Mech Engineer |
               |04/88         |
               |60            |
               |XYZ           |

etc...
Employee history generated tree

                     | Joe            |
                     | XYZ 68000 ...  |
                     |  |             |
                     | DZM 50000 ...  |
                     |  |             |
                     | Kal 45000 ...  |
                       /            \
                      /              \
        | Fred           |        | Tim            |
        | Kal 34000 ...  |        | DZM 55000 ...  |
        |  |             |
        | DZM 28000 ...  |
          /          \               /          \
         /            \             /            \
       ...            ...         ...
  | Anthony    |
  | XYZ 20000  |
     /     \
    /       \
  ...       ...

Performing queries
The application as designed has much room for user queries. Here are some examples:
. Name all the employees who have worked for DZM and Kal in the last 3 years
. Who amongst the pool of companies has "Joe" worked for in the last 7 years?
. What is the average salary of migratory personnel this year versus last year?
. Compile a list of yearly migrations for the last N years:

                       Employees
            1997         1996         1995       ...
          Out   In     Out   In     Out   In
  Kal   |  4     2  |   3     1  |   0     1  |
  DZM   |  2     5  |   2     2  |   4     2  |  ...
  ...
  XYZ   |  3     2  |   1     3  |   1     2  |
  Total    9     9      6     6      5     5

To get a feel for the application, the following code is supplied to demonstrate how to create a
visual history profile map of employees back N years, N=7 years in this case:
1997 1996 1995 1994 1993 1992 1991 1990
Albert DZM
Anthony
Carol DZM XYZ
Fred Kal DZM
Henrietta Kal
Joe XYZ DZM Kal
Nancy Kal
Sandy XYZ
Tim DZM
To maximize speed, the ordering of employee history records can be done after the company
cross-referencing.
COMPANY.DAT input file
3
Kal 4
Joe 0 45 Programmer 6 1991 33
Fred 1 34 Programmer 5 1996 7
Henrietta 1 25 Personell 10 1992 50
Nancy 1 45 Programmer 2 1997 0
DZM 5
Carol 1 77 Manager 12 1995 0
Albert 1 44 Accountant 7 1994 24
Tim 1 45 Programmer 11 1991 28
Fred 0 22 Consultant 7 1994 12
Joe 0 50 Analyst 3 1994 14
XYZ 4
Carol 0 12 Personell 3 1994 15
Anthony 0 30 MechEngineer 4 1988 60
Sandy 0 87 Analyst 2 1991 11
Joe 1 68 Manager 5 1995 19
# File format is:
<Number of Companies>
<Company Name> <Number of Employees>
<Employee name> <currently employed> <salary> <title> <hireMonth> <hireYear> <monthly duration>
...
...
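Reading one employee line of this format can be sketched with sscanf(). The `EmployeeLine` type and the `parseEmployeeLine` function are hypothetical helpers for this illustration only; the 20-character buffers mirror the sizes used for names and titles in COMPANY.H, and the reading of the salary field as thousands is an assumption based on the sample data.

```c
#include <assert.h>
#include <stdio.h>

typedef struct {
    char name[20];
    int  isCurrent;        /* 1 = currently employed at this company */
    int  salary;           /* appears to be in thousands in the sample file */
    char title[20];
    int  hireMonth, hireYear;
    int  monthDuration;
} EmployeeLine;            /* hypothetical helper type */

/* Returns 1 when all seven fields were read from <line>, otherwise 0 */
int parseEmployeeLine(const char *line, EmployeeLine *out)
{
    assert(line != NULL && out != NULL);
    /* %19s limits each word to the 19 characters plus terminator that fit */
    return sscanf(line, "%19s %d %d %19s %d %d %d",
                  out->name, &out->isCurrent, &out->salary, out->title,
                  &out->hireMonth, &out->hireYear, &out->monthDuration) == 7;
}
```

Given the line "Joe 0 45 Programmer 6 1991 33", the function fills in the name Joe, a hire date of 6/1991 and a duration of 33 months. A company header line such as "Kal 4" has too few fields and is rejected, so the caller must read those lines separately.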
Employee cross-referencer (COMPANY.H)
#ifndef COMPANYH
#define COMPANYH
#include "bt.h"
#include "dsarray.h"
#include "ll.h"
/*
Record data
Employee
    name              : String
    currentEmployer-> : Company
    history->         : Linked list of History records
History
    salary            : Integer
    title             : String
    hireDate          : MM/YY
    duration          : Months
    company->         : Company
Company
    name              : String
    Employees         : BTree
*/
typedef struct
{
char name[20];
BTree empTree;
} Company;
typedef struct
{
int month;
int year;
} HireDate;
typedef struct
{
int salary;
char title[20];
HireDate hireDate;
int monthDuration;
Company *company;
} History;
typedef struct
{
char name[20];
Company *currentEmployer;
History currentHistory;
LinkedList *historyLL;
} Employee;
#endif
Employee cross-referencer (COMPANY.C)
/* Code to read into memory an employee database
   for several companies.
   A cross-reference employee tree is created to
   produce a visual map of employee work history */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "company.h"

#define COMPANY_FILE     "company.dat"
#define NEMPLOYEE_CHARS  10
#define NCOMPANY_CHARS   7
#define CURRENT_YEAR     1997
#define NYEARS_TO_REPORT 10

static void regularDisplayer(void *data,void *args[]);
static void printTab(int tab);

static int nameComparer(void *data1,void *data2)
{
    Employee *e1 = (Employee*)data1;
    Employee *e2 = (Employee*)data2;
    char *d1 = (char*)e1->name;
    char *d2 = (char*)e2->name;
    int result = strcmp(d1,d2);

    if (result == 0)
        return (EQUAL);
    else if (result < 0)
        return (LESS_THAN);
    else
        return (GREATER_THAN);
}

/* Compares two History records by hire date. The records sorted in
   sortEmployeeHistory() are History pointers, so they are cast to
   History* here rather than to Employee* */
static int dateComparer(void *data1,void *data2)
{
    History *h1 = (History*)data1;
    History *h2 = (History*)data2;

    if (h1->hireDate.year > h2->hireDate.year)
        return (GREATER_THAN);
    else if (h1->hireDate.year < h2->hireDate.year)
        return (LESS_THAN);
    else if (h1->hireDate.month > h2->hireDate.month)
        return (GREATER_THAN);
    else if (h1->hireDate.month < h2->hireDate.month)
        return (LESS_THAN);
    else
        return (EQUAL);
}

/* Same as dateComparer() but with the ordering reversed */
static int inverseDateComparer(void *data1,void *data2)
{
    int result = dateComparer(data1,data2);

    if (result == GREATER_THAN)
        result = LESS_THAN;
    else if (result == LESS_THAN)
        result = GREATER_THAN;
    return (result);
}

static int readCompanyEmployees(DSArray *ca,FILE *fp)
{
    int nCompanies,i,j,nEmployees;
    BTree *b;
    Company *cNode;
    Employee *eNode;
    History *hNode;

    if (fscanf(fp,"%d",&nCompanies) != 1)
        return FALSE;
    DInit(ca,nCompanies,10);
    for (i = 0; i < nCompanies; i++)
    {
        cNode = (Company*)malloc(sizeof(Company));
        if (fscanf(fp,"%19s %d",cNode->name,&nEmployees) != 2)
            return FALSE;
        b = &cNode->empTree;
        BTInit(b);
        for (j = 0; j < nEmployees; j++)
        {
            int currentlyEmployed;

            eNode = (Employee*)malloc(sizeof(Employee));
            hNode = &eNode->currentHistory;
            /* the %19s widths match the 20 character name and title buffers */
            if (fscanf(fp,"%19s %d %d %19s %d %d %d",
                       eNode->name,&currentlyEmployed,&hNode->salary,hNode->title,
                       &hNode->hireDate.month,&hNode->hireDate.year,
                       &hNode->monthDuration) != 7)
                return FALSE;
            /* malloc() does not zero memory, so clear the links
               that are tested later */
            eNode->currentEmployer = currentlyEmployed ? cNode : NULL;
            eNode->historyLL = NULL;
            hNode->company = cNode;
            BTAdd(b,(void*)eNode,nameComparer);
        }
        DAdd(ca,(void*)cNode);
    }
    return TRUE;
}

static void displayEmployeesByCompany(DSArray *ca)
{
    int i;
    Company *c;
    BTree *b;

    printf("Companies:\n");
    printf("\n");
    for (i = 0; i < DNElements(ca); i++)
    {
        c = (Company*)DGet(ca,i);
        printf("(%d) %s\n",i,c->name);
        b = &c->empTree;
        BTDisplayLikeTree(b,regularDisplayer);
    }
}

static void crossReferenceAnEmployee(void *data,void *args[])
{
    Employee *eNode,*otherENode;
    int currentCompany,i;
    Company *company;
    LinkedList *ll;
    BTree *employeeTree,*b;
    BTNode *btn;
    DSArray *ca;

    eNode = (Employee*)data;

    /* avoid processing employees twice */
    if (eNode->historyLL)
        return;

    /* grab arguments from argument list */
    currentCompany = *((int*)args[0]);
    ca = (DSArray*)args[1];
    employeeTree = (BTree*)args[2];

    /* create the history linked list for <eNode> and
       add current history for starters */
    eNode->historyLL = (LinkedList*)malloc(sizeof(LinkedList));
    ll = eNode->historyLL;
    LLInit(ll);
    LLAdd(ll,(void*)&eNode->currentHistory,0);

    /* look through all other company employee lists
       for the employee named in <eNode> */
    for (i = 0; i < DNElements(ca); i++)
    {
        /* avoid same company that the employee is working for */
        if (i == currentCompany)
            continue;
        company = (Company*)DGet(ca,i);
        b = &company->empTree;
        /* nameComparer() expects Employee records, so search with <eNode> */
        if ((btn = BTSearch(b,(void*)eNode,nameComparer)) != NULL)
        {
            /* the employee worked for another company, add
               current history for the employee working for <company> */
            otherENode = (Employee*)btn->data;
            LLAdd(ll,(void*)&otherENode->currentHistory,0);
        }
    }

    /* finally, add the <eNode> for the employee to the <employeeTree> */
    BTAdd(employeeTree,(void*)eNode,nameComparer);
}

static void createEmployeeHistoryLinks(DSArray *ca,BTree *employeeTree)
{
    /* For each employee of each company, see if they have been
       past employees in any of the other companies.
       If so, add a record for the employee
       and mark the employee as having been visited */
    int i;
    Company *company;
    BTree *b;
    void *args[3];    /* three slots: company index, company array, employee tree */

    args[1] = (void*)ca;
    args[2] = (void*)employeeTree;
    BTInit(employeeTree);
    for (i = 0; i < DNElements(ca); i++)
    {
        company = (Company*)DGet(ca,i);
        b = &company->empTree;
        args[0] = (void*)&i;
        BTInorder(b,crossReferenceAnEmployee,args);
    }
}

static void sortEmployeeHistory(void *data,void *args[]) // args not used
{
    /* Copy history link list nodes into a DSArray, sort the DSArray
       by <hireDate> and pipe sorted nodes back into the link list */
    Employee *eNode;
    LLNode *node;
    LinkedList *llist;
    History *hNode;
    int i;
    DSArray dS,*ds=&dS;

    DInit(ds,10,10);
    eNode = (Employee*)data;
    llist = eNode->historyLL;
    LLBegin(llist);
    while ((node = LLNext(llist)) != NULL)
    {
        hNode = (History*)node->data;
        DAdd(ds,(void*)hNode);
    }
    DSSort(ds,inverseDateComparer);
    LLWrapup(llist,FALSE);
    LLInit(llist);
    /* walk the sorted array backwards because each record is
       added at position 0 of the list */
    for (i = DNElements(ds)-1; i >= 0; i--)
    {
        hNode = (History*)DGet(ds,i);
        LLAdd(llist,(void*)hNode,0);
    }
    DWrapup(ds,FALSE);
}

static void orderEmployeeHistoryLinks(BTree *employeeTree)
{
    BTInorder(employeeTree,sortEmployeeHistory,0);
}

static void regularDisplayer(void *data,void *args[]) // args not used
{
    Employee *eNode;
    LLNode *node;
    LinkedList *llist;
    History *hNode;

    eNode = (Employee*)data;
    printf("%s \n",eNode->name);
    llist = eNode->historyLL;
    /* the history list does not exist until the cross-referencing is done */
    if (llist)
    {
        LLBegin(llist);
        while ((node = LLNext(llist)) != NULL)
        {
            hNode = (History*)node->data;
            printf("<%d> ",hNode->salary);
        }
    }
    printf("\n");
}

static void yearlyDisplayer(void *data,void *args[]) // args[0] = number of years to list
{
    Employee *eNode;
    LLNode *node;
    LinkedList *llist;
    History *hNode;
    Company **coStart;
    int i,nYears,hireYear,index,nSpaces;
    char shortName[100];

    eNode = (Employee*)data;
    nYears = *((int*)args[0]);
    coStart = (Company**)malloc(sizeof(Company*)*nYears);
    for (i = 0; i < nYears; i++)
        coStart[i] = NULL;

    /* record the company the employee started with in each listed year */
    llist = eNode->historyLL;
    LLBegin(llist);
    while ((node = LLNext(llist)) != NULL)
    {
        hNode = (History*)node->data;
        hireYear = hNode->hireDate.year;
        /* strictly greater than, so that <index> stays below <nYears> */
        if ((hireYear <= CURRENT_YEAR)
            && (hireYear > CURRENT_YEAR-nYears))
        {
            index = CURRENT_YEAR-hireYear;
            coStart[index] = hNode->company;
        }
    }

    /* strncpy() does not terminate a truncated copy, so do it by hand */
    strncpy(shortName,eNode->name,NEMPLOYEE_CHARS);
    shortName[NEMPLOYEE_CHARS] = '\0';
    printf("%s ",shortName);
    nSpaces = NEMPLOYEE_CHARS-(int)strlen(shortName);
    printTab(nSpaces);
    for (i = 0; i < nYears; i++)
    {
        Company *c = coStart[i];
        if (c)
        {
            strncpy(shortName,c->name,NCOMPANY_CHARS);
            shortName[NCOMPANY_CHARS] = '\0';
            printf("%s",shortName);
            printTab(NCOMPANY_CHARS-(int)strlen(shortName));
        }
        else
            printTab(NCOMPANY_CHARS);
    }
    printf("\n");
    free(coStart);
}

static void printTab(int tab)
{
    int j;

    for (j = 0; j < tab; j++)
        printf(" ");
}

static void displayEmployeeHistories(BTree *employeeTree)
{
    int i,nYearsToList = NYEARS_TO_REPORT;
    void *args[1];
    char yearString[30];

    args[0] = (void*)&nYearsToList;
    printf("Employee history map:\n");
    printf("\n");
    printTab(NEMPLOYEE_CHARS+1);
    for (i = 0; i < nYearsToList; i++)
    {
        sprintf(yearString,"%d",CURRENT_YEAR-i);
        printf("%s",yearString);
        printTab(NCOMPANY_CHARS-(int)strlen(yearString));
    }
    printf("\n");
    BTInorder(employeeTree,yearlyDisplayer,args);
}

int main(void)
{
    FILE *fp;
    DSArray cA,*ca=&cA;
    BTree employeeTree,*eT=&employeeTree;

    fp = fopen(COMPANY_FILE,"r");
    if (!fp)
    {
        printf("Can't open %s\n",COMPANY_FILE);
        exit(1);
    }
    if (!readCompanyEmployees(ca,fp))
        exit(1);
    fclose(fp);
    displayEmployeesByCompany(ca);
    createEmployeeHistoryLinks(ca,eT);
    orderEmployeeHistoryLinks(eT);
    displayEmployeeHistories(eT);
    return 0;
}
Output of COMPANY.C
Companies:
(0) Kal
Fred
Henrietta
Joe
Nancy
(1) DZM
Albert
Carol
Fred
Joe
Tim
(2) XYZ
Anthony
Carol
Joe
Sandy
Employee history map:
           1997   1996   1995   1994   1993   1992   1991   1990   1989   1988
Albert                          DZM
Anthony                                                                   XYZ
Carol                    DZM    XYZ
Fred              Kal           DZM
Henrietta                                     Kal
Joe                      XYZ    DZM                  Kal
Nancy      Kal
Sandy                                                XYZ
Tim                                                  DZM
Expanding a BTNode's definition
Currently the binary tree maintains uniqueness of keys. If we wanted to store multiple records
with the same key data, we could expand upon the definition of <data> as stored in a BTNode.
In principle <data> could be anything, so why not an array of data?
typedef struct Btnode
{
    void **data;
    int nNodeElements;
    struct Btnode *left,*right;
} BTNode;
Consider one node in a tree that orders elements by <lastName>:
                 ...
                   \
     data[0]       |Jones, Harry |   <- This node stores all the
     data[1]       |Jones, Betty |      "Jones" <lastName> matches
     data[2]       |Jones, Silvia|      in an array
     nNodeElements |      3      |
                   /             \
                 ...             ...
Alternatively, if we wanted the names within a node sorted by <firstName>, we could import our
DSArray data structure:
typedef struct Btnode
{
    DSArray ds;
    struct Btnode *left,*right;
} BTNode;
Better yet, if we wanted fast update and fast search on <firstName> within a node, we could embed
either a HashTable or a BTree data structure in the BTNode definition. Consider the tree within a
tree case, where each of our BTNodes is itself a tree:
           ...
             \
         |    Harry     |
         |   //   \\    |
         | Betty  Silvia|
         /              \
       ...              ...
There is really no end to how we can combine and embed data structures to work together!
Supporting complex BTNodes
The disadvantage of embedding a data structure like DSArray directly inside the BTNode
definition is that we alter the original definition, and as a result must provide support for it. We
must recode all BTree functions that reference void* data in BTNode, namely add(), search() and
remove(), to make the appropriate add(), search() and remove() calls on the DSArray.
There is no way of getting around these calls, since some part of the code has to make them, but
whether or not the Binary Tree is responsible for micromanaging the user's definition of the node
data is another matter.
More often than not the answer is NO. A generalized Binary Tree definition should NOT bend
over backwards to provide support for arbitrary node data. At most (or best), upon an add() the
binary tree can ask for a user function that describes how to add <data> to BTNode. We should
keep the original BTNode void *data definition and simply add a few hooks in the BTree code to
provide adder() support for whatever definition the user of BTree wishes. Suppose it is a
DSArray. Then we have:
* DSARRAY adder
adder(nodeToAddTo,data)
    if nodeToAddTo.data is empty
        da <- new DSArray                  * Create and initialize a new
        da.init(...)                       * dsarray and patch it into
        nodeToAddTo.data = da              * <nodeToAddTo>
    else
        da = nodeToAddTo.data
    da.add(data)

* BTree add() logic
* <adder()> describes how to add <data> to <BTNode.data>
add(bt,data,cmp,adder)
    if (find(bt,data,cmp,findNode))
        nodeToAdd <- findNode.left         * (1) Add duplicate key
                                           *     to an existing BTNode
    else
        nodeToAdd <- createBTNode(0,0,0)   * (2) Add first time data
    adder(nodeToAdd,data)                  *     to a new BTNode
    ...
Inorder, Postorder, Preorder traversals
We have used Inorder traversals quite frequently throughout the examples to process nodes in a
tree in sorted order. Recall that for each node, we visit first the left subtree then process the node
data, then visit the right subtree.
Other orders of visitation are possible, two more in fact. A Postorder traversal processes the left
subtree, then the right, and processes the node data last. The tree wrapup code does exactly this,
destroying all children before deleting the parent to preserve the parent-child pointer up until the
last moment.
Preorder processes the node first, then the left subtree, then the right.
Consider the following tree and a summary of the order of node visitation for the three traversal
methods:
             89
           /    \
         12      900
        /  \    /   \
       5    54 201   908
              \         \
              67        1000
                        /  \
                     911    1045
InOrder : 5 12 54 67 89 201 900 908 911 1000 1045
Postorder: 5 67 54 12 201 911 1045 1000 908 900 89
Preorder : 89 12 5 54 67 900 201 908 1000 911 1045
Balancing binary trees
Sometimes trees become "jagged" in shape and some internal management or balancing is
necessary to keep the tree efficient.
Let's sketch some ideas on designing a tree balancer. We already touched on an algorithm when
we designed the remove() function. Recall that to promote the best tree shape, we based our
choice of replacement node on the deepest leaf in the left or right subtree.
Let's pick an unbalanced node in an unbalanced tree. Clearly 30 is unbalanced, as its leftmost
subtree depth is zero and its rightmost subtree depth is 4. We can shift it to a better position in
the tree by deleting it and re-adding it:
        8                     8                     8
       / \                   / \                   / \
      4   40                4   40                4   40
         /  \                  /  \                  /  \
       30    50     =>       33    50     =>       33    50
         \                     \                  /  \
          35                    35              30    35
         /                     /                     /
       32                    32                    32
      /  \                  /  \                  /  \
    31    34              31    34              31    34
         /
       33

 (1) Decide to          (2) Delete 30         (3) Re-add 30
     reorganize 30
Notice that the tree's depth has now changed from 7 to 6.
We can continue this process for other unbalanced nodes, say elements 8, 33 and 40 to get an
overall better weighted tree.
Balancing algorithm pseudocode
nBalances       : shared integer, current # of node reorganizations done
maxBalances     : shared integer, maximum # of node reorganizations to perform
DEPTH_TOLERANCE = 2
************************************************************************
* Perform a maximum of <maxNumberOfBalances> shifts of nodes
* working in preorder fashion from root to leaves in <bt>
balanceATree(bt,maxNumberOfBalances)
    nBalances   <- 0
    maxBalances <- maxNumberOfBalances
    BTPreorderBalance(bt,balanceASubtree)
************************************************************************
* Decide if <node> needs to be reorganized
* YES, if <node>'s left subtree and right subtree depths are skewed
* by more than <DEPTH_TOLERANCE>
*
* returns TRUE or FALSE depending on whether <node> was rebalanced
balanceASubtree(node,bt)
    depthDifference = | leftMostSubtreeDepth(node) - rightMostSubtreeDepth(node) |
    if (depthDifference > DEPTH_TOLERANCE)
        BTDelete(bt,node)
        BTAdd(bt,node)
        return TRUE
    else
        return FALSE
************************************************************************
* Keep performing preorder traversals of <bt> calling function <f>
* until <bt> is sufficiently balanced, that is,
* until we have performed a cumulative total of <maxBalances> OR a
* preorder traversal results in failure to find an unbalanced node
* Notes: The preorder traversal is terminated upon the FIRST rebalance
*        or if <f> has returned TRUE more than <maxBalances> times
*        We MUST retraverse the tree starting from the root EACH time after
*        the next rebalance because the tree changes and we can't depend
*        upon any old links in the recursive traversal being accurate
*        Fortunately, Preorder grabs nodes from the TOP of the
*        tree first so the algorithm is relatively quick to find
*        unbalanced nodes
BTPreorderBalance(bt,f)
    while (nBalances < maxBalances)
        if (!BTProcessPreorder(bt,bt.root,f))
            nBalances = maxBalances    * balancing successful, force termination of loop

BTProcessPreorder(bt,node,f)
    if (node is empty)                         * nothing left to visit on this branch
        return FALSE
    if (f(node,bt))
        nBalances = nBalances + 1              * (1) exit upon a rebalance
        return TRUE
    if (nBalances >= maxBalances)              * (2) exit upon sufficient # of balances
        return TRUE
    if (BTProcessPreorder(bt,node.left,f))     * (3) exit recursively upon either
        return TRUE                            *     of above conditions
    if (BTProcessPreorder(bt,node.right,f))
        return TRUE
    return FALSE
B-Trees
The binary tree maintains left and right subtrees only, but there is no reason why one cannot
maintain more subtrees per node. We can expand a node definition to support N pieces of data
and N+1 subtrees. Such a structure is called a B-Tree of order N+1.
Consider a B-Tree of order 3 where we have at most two pieces of data per node and at most
three subtrees, a left, right, and middle:
                          60 90
                        /   |   \
                      /     |     \
                    /       |       \
              20 35       64 82       150 177
              /  |         /  |         /    \
             /   |        /   |        /      \
         5 15  23 24    61  68 79  110 141  200 214
                              |              /
                              |             /
                            72 76        188
                            /   \
                           /     \
                         70     77 78
The advantage of the B-Tree is that it supports more data per node. If we consider each node as
being a record of data, we can save time on record access. If each node represents a disk access
and each subtree link a pointer to a disk address (or record index), then we can reduce the
number of accesses by making the order, or number of elements per node, large, say 256. In
merely a few disk accesses we can find an element in a group of several million entries!
The disadvantage of B-Trees is that tight monitoring of the B-Tree's shape is required to preserve
efficiency. In an evolving tree, constant rebalancing must be done, not to mention that the basic
functions {add,search,remove,...} are definitely trickier to implement.
Unit V exercises
(1) Write an application that stores information on computer supplies in a binary tree:
Part
    Make        : String   <=| combined
    Model       : String   <=| search key
    Description : String
    Quantity    : integer
    Retail price: float

                         Samsung
                         SP2413
                         Printer
                        /        \
                Logitech          Samsung
                L123              SP2417
                Mouse             Printer
               /                 /        \
          Epson            Samsung         Sony
          MX970            SP2415          TrinitronM17sf
          Printer          Printer         Monitor
                                          /
                                      Sony
                                      TrinitronM15sf
                                      Monitor
Allow the user to search for the details of a particular make and model. Create a repeatSearch()
function that reports all parts of a certain make, given the subtree to search in. Don't make this
part of the binary tree (bt.c) code!
Enter query => Epson MX970
Epson MX970 10 $289
Enter query => Sony
Sony TrinitronM17sf 3 $900
Sony TrinitronM15sf 2 $650
Sample application pseudocode for key make search only:
[ user enters the {make} key ]
subTree = BTSearch(bt,key,findMake)
repeatSearch(subTree,key,findMake,displayDetails)

findMake(part1,part2)
    make1 = part1.make
    make2 = part2.make
    if (make1 equals make2)
        return EQUAL
    else if (make1 < make2)
        return LESS_THAN
    else
        return GREATER_THAN

repeatSearch(node,key,findMake,displayDetails)
    args[0] = key
    BTProcessInorder(node,displayDetails,args)

displayDetails(part,args)
    key = args[0]
    if (findMake(part,key) == EQUAL)
        display part.make,part.model,...
(2) For security purposes, a daily in-memory structure of the transactions occurring on a family
of ATM machines is required. Instant queries on cardholders must be possible in order to
expedite research on fraud cases involving theft, card loss or scams. Implement a binary tree that
embeds a hash table of Transactions as BTNode data using the ideas outlined in the notes under
"Supporting complex BTNodes". To test your data structure, create the following node
definitions:
ATM
    Bank name : String         <-| combined
    Branch#   : long integer   <-| BTNode search key
Transaction
    Time      : String
    TypeOfCard: String
    CardNumber: String         <- hash table key
                    | RoyalBank                           |
                    | 3456                                | <= btnode
                    | 14:34 RoyalBank  9923 54 53344 55   |
                    |  8:45 Visa       4521 4324 2343 243 | <== hash table
                    | 11:48 MasterCard 2343 3454 435 34   |
                    | . . .                               |
                   /                                       \
                  /                                         \
                 /                                           \
 | CanadaTrust                |          | ScotiaBank                          |
 | 612                        |          | 432                                 |
 |  9:14 CT   4323 234 233 32 |          | 13:11 Visa       4332 4334 343      |
 | 10:42 CIBC 45 2354 4397 6  |          | 12:01 ScotiaCard 5525 2353 34555 3  |
 | . . .                      |          | . . .                               |
   /                                                                          \
  /                                                                            \
 | BankOfMontreal |                                              | ScotiaBank |
 | 1122           |                                              | 2434       |
       ...                                                             ...
Write an application which populates your tree with sample banks and transactions. Let the user
simulate search for transactions on a lost/stolen card.
(3) Support the following user queries on the company-employee database (company.dat)
developed earlier:
(a) Name all the employees who have worked for companies a,b,c,... in the last n years.
(b) Which of the companies in the pool has employee x worked for in the last n years?
(c) What is the average salary of migratory personnel this year versus last year? In year k versus
year m?
(d) Compile a list of yearly migrations for the last n years:
                Employees
          1997      1996      1995   ...
         Out  In   Out  In   Out  In
Kal    |   4   2 |   3   1 |   0   1 |
DZM    |   2   5 |   2   2 |   4   2 | ...
...
XYZ    |   3   2 |   1   3 |   1   2 |
Total      9   9     6   6     5   5
(4) Translate the balancing algorithm given in the notes into code. Prove that your solution
works.
(5) Write code to support a B-Tree of order k. Write some code to test your implementation.
(6) Mathematicians wish to be able to type formulas of various
variables into the computer and have it evaluate the formulas for various values. Use the code
developed in EXP.C as a skeleton to implement a numerical parser and evaluator.
Sample run:
1 Enter formula
2 Add variable
3 Evaluate
4 Quit
=> 1
Enter formula : => 3*x+2+y
=> 2
Enter variable,value : => x,2.1
=> 2
Enter variable,value : => y,3.2
=> 3
Result : 11.5
...
Start by designing a very simple expression parser/evaluator that handles expressions that deal
only with addition:
x + 2
8 + x + x
x + 2 + 1 + x
etc....
When you feel comfortable with addition, add subtraction, multiplication, division and order of
operations parentheses () to your evaluator's capabilities.
A full-blown parser/evaluator includes the ability to evaluate
expressions that use all the common math functions:
{sin,cos,tan,abs,log,pow,...}
Here is an example expression:
2*sin(x+c)
At parse time, your program encodes the above expression
into a parse tree looking something like:
      *
     / \
    2   sin
          \
           +
          / \
         x   c
At evaluate time, the values of x and c are bound numerically
via a lookup into a value table and the tree is recursively evaluated to a floating point result.
You may find useful the following grammar to parse your
expressions:
Expression = <Term>
Expression = <Term> + <Term>
Expression = <Term> - <Term>
Term       = <Factor>
Term       = <Factor> * <Factor>
Term       = <Factor> / <Factor>
Factor     = ( <Expression> )
Factor     = Floating point number
Factor     = <math function>( <Expression> )
Factor     = Identifier
You can build functions parseExpression(), parseTerm() and parseFactor() to parse the
appropriate units of an expression.