
PROGRAMMING AND DATA STRUCTURES - MIT 564

ALGORITHM is a logical sequence of instructions for solving a problem. There can be
more than one solution to a given problem. An algorithm can be implemented using
different programming languages on different platforms.

An algorithm must be correct. Once we have a correct algorithm for a problem, we have
to determine the efficiency of that algorithm.

Algorithms that use a similar problem-solving approach can be grouped together. This
classification scheme is neither exhaustive nor disjoint.

CONVENTIONAL WAYS OF WRITING ALGORITHMS (LEARN HOW TO WRITE AN ALGORITHM)

1. Identification of the problem
2. Indentation (control structures)
3. Documentation (internal and external)
4. Parameters - define all parameters that will be used in the algorithm. Explicitly
define both local and global parameters.
5. Define all multiple assignment statements

HOW TO ANALYSE ALGORITHM

The performance of an algorithm is tied to factors or resources such as:

1. memory
2. communication bandwidth
3. logic gates
4. complexity of the underlying algorithm
5. input size of data
6. input data type
7. number of primitive operations

Algorithmic Performance

There are two aspects of algorithmic performance:

• Time

• Instructions take time.

• How fast does the algorithm perform?

• What affects its runtime?

• Space

• Data structures take space

• What kind of data structures can be used?

• How does choice of data structure affect the runtime?

• We will focus on time:

– How to estimate the time required for an algorithm

– How to reduce the time required

COMPLEXITY OF ALGORITHM

What to Analyze

• When we analyze algorithms, we should employ mathematical techniques that
analyze algorithms independently of specific implementations, computers, or data.

• An algorithm can require different times to solve different problems of the same
size.

– E.g. searching for an item in a list of n elements using sequential search.
→ Cost: 1, 2, ..., n

• WORST-CASE ANALYSIS – The maximum amount of time that an algorithm
requires to solve a problem of size n.

– This gives an upper bound for the time complexity of an algorithm.

– Normally, we try to find worst-case behavior of an algorithm.


• BEST-CASE ANALYSIS – The minimum amount of time that an algorithm
requires to solve a problem of size n.

– The best case behavior of an algorithm is NOT so useful.

• AVERAGE-CASE ANALYSIS – The average amount of time that an algorithm
requires to solve a problem of size n.

– Sometimes, it is difficult to find the average-case behavior of an algorithm.

– We have to look at all possible data organizations of a given size n, and at the
probability distribution of these organizations.
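
The sequential search example above can be sketched in C as follows. This is a minimal illustration, not part of the original notes: the function name sequential_search and the comparisons counter are assumptions introduced here so that the cost (1, 2, ..., n) can be observed directly.

```c
#include <stddef.h>

/* Sequential search over a[0..n-1].
 * Returns the index of key, or -1 if key is absent.
 * *comparisons receives the number of elements examined (the cost).
 * The name and the counter are illustrative assumptions. */
int sequential_search(const int a[], size_t n, int key, size_t *comparisons) {
    *comparisons = 0;
    for (size_t i = 0; i < n; i++) {
        (*comparisons)++;
        if (a[i] == key) {
            return (int)i;   /* found after i + 1 comparisons */
        }
    }
    return -1;               /* not found: all n elements were examined */
}
```

The best case is a key in the first position (1 comparison); the worst case is a key in the last position or not present at all (n comparisons).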

Kinds of analyses
Worst-case: (usually)
• T(n) = maximum time of algorithm
on any input of size n.
Average-case: (sometimes)
• T(n) = expected time of algorithm
over all inputs of size n.
• Need assumption of statistical
distribution of inputs.
Best-case: (bogus)
• Cheat with a slow algorithm that
works fast on some input.
Worst-case analysis is more common than average-case analysis.

Algorithm

for j ← 2 to length[A]
    do key ← A[j]
       // insert A[j] into the sorted sequence A[1 .. j − 1]
       i ← j − 1
       while i > 0 and A[i] > key
           do A[i + 1] ← A[i]
              i ← i − 1
       A[i + 1] ← key

Solution: refer to the table.

The running time of the algorithm is the sum of the running times for each statement executed;
a statement that takes c_i steps to execute and is executed n times will contribute c_i·n to the
total running time. To compute T(n), the running time of INSERTION-SORT, we sum the
products of the cost and times columns of the table.
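
As a rough C counterpart to the INSERTION-SORT algorithm discussed in this section, the sketch below sorts an integer array in place. It assumes zero-based indexing (the pseudocode is one-based), and the function name insertion_sort is chosen here for illustration.

```c
#include <stddef.h>

/* Insertion sort on a[0..n-1], ascending order.
 * Zero-based translation of the one-based pseudocode; a sketch only. */
void insertion_sort(int a[], size_t n) {
    for (size_t j = 1; j < n; j++) {
        int key = a[j];
        size_t i = j;                 /* a[0..j-1] is already sorted */
        while (i > 0 && a[i - 1] > key) {
            a[i] = a[i - 1];          /* shift larger elements one place right */
            i--;
        }
        a[i] = key;                   /* drop key into the gap */
    }
}
```

The inner while loop runs zero times per element on already sorted input (best case) and j times per element on reverse-sorted input (worst case), which is why the two cases give different running times.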

Data structures
Data structures are about the way we organise and structure data. The organisation can be linear or non-linear.
The linear forms include arrays, lists, linked lists, stacks and queues. The non-linear forms
include trees, B-trees, binary trees and heaps.

ARRAYS

An array is a linear data structure: a collection of elements (values or variables) of the
same data type stored in consecutive memory locations. Elements of data are stored
sequentially in blocks within the array. Each element is referenced by an index. The
index is usually a number used to address an element in the array.
Applications
Arrays are used to implement mathematical vectors and matrices. Arrays are often used
to implement tables, especially lookup tables. Arrays are used to implement other data
structures, such as heaps, hash tables, deques, queues, stacks, strings, and VLists.

CHARACTERISTICS OF ARRAY
1. An array holds elements that have the same data type (homogeneous type).
2. Array elements are stored in consecutive memory locations.
3. It should have a name.
4. It is composed of elements, which are separated by commas.
5. It should have a size.
6. Every array should have an index that acts as a pointer to any of its elements.
7. These indexes are used as the addresses of the various elements.
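
Because the elements have the same size and sit in consecutive locations, an index can be turned into an address by simple arithmetic. The sketch below illustrates this under stated assumptions: indices start at 0, addresses are plain numbers, and the helper names element_address and element_address_2d are hypothetical.

```c
#include <stddef.h>

/* Address of element `index` in a one-dimensional array that starts at
 * `base`, where every element occupies `cell_size` memory cells.
 * Assumes zero-based indices; the name is illustrative. */
unsigned long element_address(unsigned long base, size_t index, size_t cell_size) {
    return base + (unsigned long)(index * cell_size);
}

/* Address of element (row, col) in a two-dimensional array with `ncols`
 * columns stored in row-major order (rows laid out one after another). */
unsigned long element_address_2d(unsigned long base, size_t row, size_t col,
                                 size_t ncols, size_t cell_size) {
    return base + (unsigned long)((row * ncols + col) * cell_size);
}
```

For example, with base address 20 and 2 cells per element, the element at index 3 starts at address 20 + 3 × 2 = 26.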

LIST
It can be defined as a collection of records usually identified by pointers. A list can be a
single list or a dense list. A single list is made up of a block of
consecutive memory locations in which each element occupies a fixed amount of memory.

CHARACTERISTICS OF LIST
1. A list can be in either alphabetical order or serial order.
2. A list can be dynamic, meaning it can shrink or grow.
3. All list structures use pointers. A pointer is a memory cell which contains the
address of the main data item or element.
4. A list has a size, which indicates how many elements there are in the list.

PROBLEM OF LIST
1. Deletion
It takes a lot of time to get to the end of the list.
It involves a lot of data movement.
2. Addition
When the data structure is large, addition involves a lot of data movement, so
it is not advisable to use this structure for large data sets.

LINKED LIST
A linked list consists of a series of nodes. Each node consists of at least two (2) fields: the data item
field and the link field to the next node. The terminal node is marked by a null link.

Linked lists are among the simplest and most common data structures. They can be used
to implement several other common abstract data types, including stacks, queues, and
associative arrays.

The principal benefit of a linked list over a conventional array is that the list elements can
easily be inserted or removed without reallocation or reorganization of the entire
structure because the data items need not be stored contiguously in memory or on disk.

On the other hand, simple linked lists by themselves do not allow random access to the
data, or any form of efficient indexing.

Each record of a linked list is often called an element or node. The field of each node
that contains the address of the next node is usually called the next link or next pointer.
The remaining fields are known as the data, information, value, cargo, or payload
fields.
The head of a list is its first node. The tail of a list may refer either to the rest of the list
after the head, or to the last node in the list.

Deletion
(DRAW THE NODES)

INSERTION
(DRAW THE NODES)
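
In place of the diagrams, insertion and deletion on a singly linked list can be sketched in C. This is a minimal illustration under assumptions: the names list_node, insert_after and delete_after are introduced here, and both operations work on the node that follows a given node prev.

```c
#include <stdlib.h>

/* One node of a singly linked list: a data field and a link field. */
struct list_node {
    int data;
    struct list_node *next;   /* NULL marks the terminal node */
};

/* Insert a new node holding `value` immediately after `prev`.
 * Returns the new node, or NULL if memory could not be allocated. */
struct list_node *insert_after(struct list_node *prev, int value) {
    struct list_node *n = malloc(sizeof *n);
    if (n == NULL) {
        return NULL;
    }
    n->data = value;
    n->next = prev->next;     /* new node points at the old successor */
    prev->next = n;           /* predecessor now points at the new node */
    return n;
}

/* Delete the node immediately after `prev`.
 * Returns 1 if a node was removed, 0 if `prev` was the last node. */
int delete_after(struct list_node *prev) {
    struct list_node *victim = prev->next;
    if (victim == NULL) {
        return 0;
    }
    prev->next = victim->next;   /* bypass the node being removed */
    free(victim);
    return 1;
}
```

Only two links change in each operation, and no other element is moved; this is what makes insertion and deletion cheap once the position is known.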

LINEAR AND CIRCULAR LISTS


In the last node of a list, the link field often contains a null reference, a special value used
to indicate the lack of further nodes. A less common convention is to make it point to the
first node of the list; in that case the list is said to be circular or circularly linked;
otherwise it is said to be open or linear.

A circular linked list

Singly, doubly, and multiply linked lists


Singly linked lists contain nodes which have a data field as well as a next field, which
points to the next node in the linked list.

A singly linked list whose nodes contain two fields: an integer value and a link to the next
node
In a doubly linked list, each node contains, besides the next-node link, a second link
field pointing to the previous node in the sequence. The two links may be called
forward(s) and backwards, or next and prev(ious).

A doubly linked list whose nodes contain three fields: an integer value, the link forward
to the next node, and the link backward to the previous node
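
A doubly linked node with the three fields described above can be sketched in C as follows. The names dnode, dlist_insert_after and dlist_unlink are assumptions made for this illustration; the caller is assumed to own the node storage.

```c
#include <stddef.h>

/* Node of a doubly linked list: a value, a forward link and a backward link. */
struct dnode {
    int value;
    struct dnode *next;   /* link forward to the next node */
    struct dnode *prev;   /* link backward to the previous node */
};

/* Link node `n` into the list immediately after node `pos`. */
void dlist_insert_after(struct dnode *pos, struct dnode *n) {
    n->prev = pos;
    n->next = pos->next;
    if (pos->next != NULL) {
        pos->next->prev = n;
    }
    pos->next = n;
}

/* Unlink node `n` from its list; its neighbours are joined to each other. */
void dlist_unlink(struct dnode *n) {
    if (n->prev != NULL) {
        n->prev->next = n->next;
    }
    if (n->next != NULL) {
        n->next->prev = n->prev;
    }
    n->next = NULL;
    n->prev = NULL;
}
```

Because every node knows its predecessor, a node can be removed given only a pointer to that node, which a singly linked list cannot do without a search from the head.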

LINKED LISTS VS. ARRAYS


Linked lists have some advantages over arrays:
1. Insertion or deletion of an element at a specific point of a list is a constant-time
operation once a pointer to that point is held; in an array, the elements after that point must be moved.
2. Arbitrarily many elements may be inserted into a linked list, limited only by the total
memory available, while an array will eventually fill up and then have to be resized.
Arrays, in turn, have some advantages over linked lists:
1. Arrays allow random access, while linked lists allow only sequential access to
elements.
2. Sequential access on arrays is also faster than on linked lists on many machines,
because arrays have greater locality of reference and thus benefit more from processor
caching.
3. Linked lists require extra storage for references, which often makes them
impractical for lists of small data items such as characters or Boolean values.
NB:
ALGORITHM

// Insert a node called "Ama"

T ← Data(I)
T ← Data(I) ← "Ama"
I ← I + 1
Link(T) ← I
Link(I) ← NULL
END

AND POLYNOMIAL APPLICATION

A STACK is one of the most commonly used data structures. A stack is also known as a Last
In First Out (LIFO) system. It can be considered a linear list in which insertion and
deletion can take place only at one end, called the TOP. The push operation adds a new
item to the top of the stack, or initializes the stack if it is empty. If the stack is full and
does not have enough space to accept the given item, the stack is considered to be
in an overflow state. The pop operation removes an item from the top of the stack. A pop
either reveals previously concealed items or results in an empty stack; if the stack is
already empty, it goes into an underflow state (no items are present in the stack to be
removed). The stack pointer always points to the top value of the stack.

STACK OPERATIONS
1. CREATE(S) - create an empty stack S
2. ADD(i, S) - add an element i onto stack S and return the updated stack
3. DELETE(S) - remove the top element of stack S
4. TOP(S) - return the top element of stack S, i.e. access the top element of the stack
5. ISEMPTY(S) - test whether the stack is empty; it returns one of two values, True or False
6. ISFULL(S) - test whether the stack is full; it also returns True or False

CONDITION FOR POP AND PUSH


1. STACK OVERFLOW occurs when a system or program attempts to add an
element to a full stack. Conceptually, a stack as a data structure has no
limit, but an actual implementation does.
2. STACK UNDERFLOW occurs when a program attempts to remove an element
from an empty stack.
In a push or pop operation, a limit test should always be carried out to check for overflow
and underflow.

Before Top ← Top − 1 (a pop), carry out the underflow test.

If Top = limit + 1, the stack has overflowed.

If Top = −1, the stack has underflowed or is empty.

Algorithm

Insertion (PUSH operation)

ADD(item, stack, n, top) // add item to the top of a stack of size n
If Top >= n then report overflow and stop // stack is full
Top ← Top + 1
Stack(Top) ← item

Deletion (POP operation)

DELETE(item, stack, top) // remove the top element and return it in item
If Top <= 0 then report underflow and stop // stack is empty
Item ← Stack(Top)
Top ← Top − 1
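
The push and pop operations, with their overflow and underflow tests, can be sketched in C on top of a fixed-size array. This is an illustration under assumptions: the capacity STACK_LIMIT, the struct layout, and the function names are all introduced here, and an empty stack is marked by top = -1 (zero-based positions).

```c
#include <stdbool.h>

#define STACK_LIMIT 4          /* hypothetical capacity for the sketch */

struct stack {
    int items[STACK_LIMIT];
    int top;                   /* index of the top item; -1 when empty */
};

void stack_create(struct stack *s) { s->top = -1; }

bool stack_is_empty(const struct stack *s) { return s->top == -1; }

bool stack_is_full(const struct stack *s) { return s->top == STACK_LIMIT - 1; }

/* Push: returns false on overflow and leaves the stack unchanged. */
bool stack_push(struct stack *s, int item) {
    if (stack_is_full(s)) {
        return false;          /* overflow test */
    }
    s->top = s->top + 1;
    s->items[s->top] = item;
    return true;
}

/* Pop: returns false on underflow; otherwise stores the top item in *item. */
bool stack_pop(struct stack *s, int *item) {
    if (stack_is_empty(s)) {
        return false;          /* underflow test */
    }
    *item = s->items[s->top];
    s->top = s->top - 1;
    return true;
}
```

Returning a flag instead of aborting lets the caller decide how to react to overflow or underflow; other designs (growing the array, signalling an error) are equally valid.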

QUEUE: This is a First-In-First-Out (FIFO) data structure. A queue is a linear structure in
which elements may be inserted at one end, called the rear, and deleted at the other end, called the
front.

Suppose t1 is the time needed to provide service to one person and t2 is the average time
between one person and the next arriving at the counter or joining the queue.

1. If t1 < t2, there will be a short queue or no queue.
2. If t1 > t2, there will be a long queue.
3. If t1 = t2, on average one person leaves the counter as a new person comes in.

PRIORITY QUEUES
A priority queue is a collection of elements such that each element has been assigned a
priority and such that the order in which elements are deleted and processed comes from
the following rules:
1) An element of higher priority is processed before any element of lower priority.
2) Two elements with the same priority are processed according to the order in which
they were added to the queue.

Many applications involving queues require priority queues rather than the simple FIFO
strategy. For elements of the same priority, the FIFO order is used. For example, in a
multiuser system, there will be several programs competing for use of the central
processor at one time. The programs have a priority value associated with them and are held
in a priority queue. The program with the highest priority is given first use of the central
processor.
Queues will be maintained by a linear array QUEUE and two pointer variables: FRONT,
containing the location of the front element of the queue; and REAR, containing the
location of the rear element of the queue. The condition FRONT = NULL will indicate
that the queue is empty.
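
An array-based queue with FRONT and REAR can be sketched in C as below. The sketch makes several assumptions that are not in the notes: the capacity QUEUE_SIZE is fixed, the index -1 plays the role of NULL for an empty queue, and the indices wrap around the end of the array (a circular arrangement) so that freed slots are reused.

```c
#include <stdbool.h>

#define QUEUE_SIZE 4           /* hypothetical capacity for the sketch */

struct queue {
    int items[QUEUE_SIZE];
    int front;                 /* location of the front element; -1 stands for NULL */
    int rear;                  /* location of the rear element */
};

void queue_create(struct queue *q) { q->front = -1; q->rear = -1; }

bool queue_is_empty(const struct queue *q) { return q->front == -1; }

bool queue_is_full(const struct queue *q) {
    return !queue_is_empty(q) && (q->rear + 1) % QUEUE_SIZE == q->front;
}

/* Insert at the rear; returns false if the queue is full. */
bool queue_insert(struct queue *q, int item) {
    if (queue_is_full(q)) {
        return false;
    }
    if (queue_is_empty(q)) {
        q->front = 0;
        q->rear = 0;
    } else {
        q->rear = (q->rear + 1) % QUEUE_SIZE;   /* wrap around the array */
    }
    q->items[q->rear] = item;
    return true;
}

/* Delete from the front; returns false if the queue is empty. */
bool queue_delete(struct queue *q, int *item) {
    if (queue_is_empty(q)) {
        return false;
    }
    *item = q->items[q->front];
    if (q->front == q->rear) {
        q->front = -1;                          /* last element removed */
        q->rear = -1;
    } else {
        q->front = (q->front + 1) % QUEUE_SIZE;
    }
    return true;
}
```

Elements leave in the order they arrived (FIFO). A priority queue could be built on the same idea, for example with one such queue per priority level, but that is beyond this sketch.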

TRY QUESTION
WRITE A PROGRAM THAT ADDS AND SUBTRACTS POLYNOMIALS.
Each polynomial is to be represented as a linked list. The first node in the list represents the first term in
the polynomial, the second node represents the second term and so forth. Each node contains three
fields. The first field is the term’s coefficient, the second field is the term’s power, and the third field is
a pointer to the next term.

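E.g.

Poly 1: 5x^4 + 6x^3 + 7

[5 | 4] → [6 | 3] → [7 | 0] → λ

Poly 2: 2x^3 − 7x^2 + 3x

[2 | 3] → [−7 | 2] → [3 | 1] → λ

Poly 1 + Poly 2

= 5x^4 + 6x^3 + 7 + 2x^3 − 7x^2 + 3x
= 5x^4 + 8x^3 − 7x^2 + 3x + 7

Result

[5 | 4] → [8 | 3] → [−7 | 2] → [3 | 1] → [7 | 0] → λ

Each node is shown as [coefficient | power], and λ marks the null link at the end of the list.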

The rules for the addition of polynomials are as follows:


1. If the powers are equal, the coefficients are algebraically added.
2. If the powers are unequal, the term with the higher power is inserted in the new polynomial.
3. If the exponent is 0, the term represents x^0, which is 1. The value of the term is therefore the value of the
coefficient.
4. If the result of adding the coefficients is 0, the term is dropped.
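
The addition rules above can be sketched in C as follows. This is one possible approach, not a model answer: it assumes both input lists are ordered by decreasing power, and the names term, append_term and poly_add are introduced here for illustration. Subtraction is not shown; it could be obtained by negating the coefficients of the second polynomial before adding.

```c
#include <stdlib.h>

/* One term of a polynomial: coefficient, power, and link to the next term. */
struct term {
    int coeff;
    int power;
    struct term *next;
};

/* Append a new term at the link *tailp and return the address of the new
 * node's next field. If allocation fails, the term is skipped (sketch only). */
struct term **append_term(struct term **tailp, int coeff, int power) {
    struct term *t = malloc(sizeof *t);
    if (t == NULL) {
        return tailp;
    }
    t->coeff = coeff;
    t->power = power;
    t->next = NULL;
    *tailp = t;
    return &t->next;
}

/* Add two polynomials whose terms are in decreasing order of power.
 * Returns a newly allocated list; the inputs are left unchanged. */
struct term *poly_add(const struct term *p, const struct term *q) {
    struct term *result = NULL;
    struct term **tailp = &result;

    while (p != NULL && q != NULL) {
        if (p->power == q->power) {            /* rule 1: add coefficients */
            int sum = p->coeff + q->coeff;
            if (sum != 0) {                    /* rule 4: drop zero terms */
                tailp = append_term(tailp, sum, p->power);
            }
            p = p->next;
            q = q->next;
        } else if (p->power > q->power) {      /* rule 2: higher power first */
            tailp = append_term(tailp, p->coeff, p->power);
            p = p->next;
        } else {
            tailp = append_term(tailp, q->coeff, q->power);
            q = q->next;
        }
    }
    for (; p != NULL; p = p->next) {           /* copy whatever remains */
        tailp = append_term(tailp, p->coeff, p->power);
    }
    for (; q != NULL; q = q->next) {
        tailp = append_term(tailp, q->coeff, q->power);
    }
    return result;
}
```

Applied to the two example polynomials, the loop visits each term once, so the work grows with the total number of terms in both lists.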

QUESTIONS
1.
a. .
b. of x is 2x^2 + −x + 1, use the linked list structure to solve that simultaneously.
c. Suppose you have a homogeneous array with 6 rows and 8 columns to store data items in row
major order, starting at the base address of 20 in base 10, if each entry in the array requires 1
memory cell, what then will be the address if each entry requires two memory cells? 4 marks
d. Given a sequence of T numbers, i.e. a_1, a_2, a_3, a_4, a_5, …, a_T, in a sorting problem, what
will be the instance of the sorting algorithm if a_{T−1} > a_T?
Use this concept and comment on the correctness of the algorithm based on its instances.
2. What is a linear list?
3. Distinguish between a linear list and a circular linked list.
4. a. With the aid of a diagram write an algorithm to delete the kth element in a list.
[3 marks]
b. Write an algorithm to insert an element y immediately after the kth element
deleted. [3 marks]
5. Using the stack and the queue structures explain the LIFO and the FIFO principles in data
management. [2 marks]
6. Explain the statement: The time taken by an algorithm grows proportionally to the size of the …
7. With the help of a diagram explain the heap data structure. Use that same diagram to explain the
maximum heap and minimum heap. [6 marks]
8. Draw a complete binary tree with exactly 3x7 nodes.
9. Distinguish between the worst case and the average case running time and the way they affect
the running time of any given algorithm.
10. Describe how an array could be used to implement a queue in a program written in a high level
language. 2 marks.
11. If you have a nested parenthesis, convert it into a tree structure.
12. You have an algorithm, use the algorithm to calculate the total running time of execution.

Data Structure - Tree


A tree represents nodes connected by edges. We are going to discuss the binary tree, and the
binary search tree specifically.

A binary tree is a special data structure used for data storage purposes. A binary tree
has the special condition that each node can have at most two children. A binary search
tree has the benefits of both an ordered array and a linked list: when the tree is reasonably
balanced, search is as quick as in a sorted array, and insertion and deletion are as fast as in a linked list.
Terms
The following are important terms with respect to trees.

• Path − a sequence of nodes along the edges of a tree.

• Root − the node at the top of the tree. There is only one root per tree and one
path from the root node to any node.

• Parent − any node except the root node has one edge upward to a node called its parent.

• Child − the node below a given node, connected by its edge downward, is called its child
node.

• Leaf − a node which does not have any child node is called a leaf node.

• Subtree − a subtree represents the descendants of a node.

• Visiting − visiting refers to checking the value of a node when control is on the node.

• Traversing − traversing means passing through nodes in a specific order.

• Levels − the level of a node represents the generation of a node. If the root node is at level 0,
then its next child node is at level 1, its grandchild is at level 2, and so on.

• Keys − a key represents the value of a node, based on which a search operation is
carried out for a node.

Binary Search Tree Representation


A binary search tree exhibits a special behaviour: a node's left child must have a value
less than its parent's value, and a node's right child must have a value greater than its
parent's value.

We are going to implement the tree using node objects connected through
references.

Node
A tree node should look like the below structure. It has data part and references to
its left and right child nodes.

struct node {
   int data;
   struct node *leftChild;
   struct node *rightChild;
};

In a tree, all nodes share a common structure.

BST Basic Operations


The basic operations that can be performed on a binary search tree data structure
are the following −

• Insert − insert an element in a tree / create a tree.

• Search − search for an element in a tree.

• Preorder Traversal − traverse a tree in a preorder manner.

• Inorder Traversal − traverse a tree in an inorder manner.

• Postorder Traversal − traverse a tree in a postorder manner.

We shall learn how to create (insert into) a tree structure and how to search for a data item in
a tree in this chapter. We shall learn about tree traversal methods in the coming
one.

Insert Operation
The very first insertion creates the tree. Afterwards, whenever an element is to be
inserted, first locate its proper location. Start the search from the root node; if the data is
less than the key value, search for an empty location in the left subtree and insert the data.
Otherwise, search for an empty location in the right subtree and insert the data.

Algorithm
If root is NULL
   then create root node
   return

If root exists then
   compare the data with node.data

   while insertion position is not located
      If data is less than node.data
         goto left subtree
      else
         goto right subtree
   endwhile

   insert data
end If

Implementation
The implementation of the insert function should look like this −

#include <stdlib.h>

struct node *root = NULL;   // root of the tree, assumed here to be a global

void insert(int data) {
   struct node *tempNode = (struct node*) malloc(sizeof(struct node));
   struct node *current;
   struct node *parent;

   if(tempNode == NULL) {
      return;               // allocation failed, nothing is inserted
   }

   tempNode->data = data;
   tempNode->leftChild = NULL;
   tempNode->rightChild = NULL;

   //if tree is empty, create root node
   if(root == NULL) {
      root = tempNode;
   } else {
      current = root;
      parent = NULL;

      while(1) {
         parent = current;

         //go to left of the tree
         if(data < parent->data) {
            current = current->leftChild;

            //insert to the left
            if(current == NULL) {
               parent->leftChild = tempNode;
               return;
            }
         } //go to right of the tree
         else {
            current = current->rightChild;

            //insert to the right
            if(current == NULL) {
               parent->rightChild = tempNode;
               return;
            }
         }
      }
   }
}

Search Operation
Whenever an element is to be searched for, start the search from the root node; if the data is
less than the key value, search for the element in the left subtree, otherwise search for it in the
right subtree. Follow the same algorithm for each node.

Algorithm
Set node to root

while node is not NULL
   If data is equal to node.data
      return node

   If data is greater than node.data
      goto right subtree
   else
      goto left subtree
endwhile

return data not found

The implementation of this algorithm should look like this.

#include <stdio.h>

// root is the tree's root pointer, assumed to be the global used by insert
struct node* search(int data) {
   struct node *current = root;

   printf("Visiting elements: ");

   //stop when the data is found or the search runs off the tree
   while(current != NULL && current->data != data) {
      printf("%d ", current->data);

      //go to left tree
      if(current->data > data) {
         current = current->leftChild;
      } //else go to right tree
      else {
         current = current->rightChild;
      }
   }

   //current is NULL here if the data was not found
   return current;
}
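
As a self-contained variant of the insert and search routines, the sketch below avoids the global root by passing the root as a parameter, and adds an inorder traversal that makes the ordering property of a binary search tree visible. The names bst_node, bst_insert, bst_search and bst_inorder are assumptions made for this illustration, and equal keys are sent to the right subtree, as in the insert routine.

```c
#include <stdlib.h>

struct bst_node {
    int data;
    struct bst_node *left;
    struct bst_node *right;
};

/* Insert data into the tree rooted at `root` and return the (possibly new)
 * root. If allocation fails, the tree is returned unchanged. */
struct bst_node *bst_insert(struct bst_node *root, int data) {
    if (root == NULL) {
        struct bst_node *n = malloc(sizeof *n);
        if (n != NULL) {
            n->data = data;
            n->left = NULL;
            n->right = NULL;
        }
        return n;
    }
    if (data < root->data) {
        root->left = bst_insert(root->left, data);
    } else {
        root->right = bst_insert(root->right, data);
    }
    return root;
}

/* Return the node holding data, or NULL if it is not in the tree. */
const struct bst_node *bst_search(const struct bst_node *root, int data) {
    while (root != NULL && root->data != data) {
        root = (data < root->data) ? root->left : root->right;
    }
    return root;
}

/* Inorder traversal: left subtree, node, right subtree.
 * Writes the keys into out[] starting at position `count` and returns the
 * new count. The caller must supply an array that is large enough. */
size_t bst_inorder(const struct bst_node *root, int out[], size_t count) {
    if (root == NULL) {
        return count;
    }
    count = bst_inorder(root->left, out, count);
    out[count++] = root->data;
    return bst_inorder(root->right, out, count);
}
```

An inorder traversal of a binary search tree visits the keys in ascending order, which is a quick way to check that insertion has kept the left-smaller, right-larger rule.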
