
BCA-I Data Structures Sec-B

Introduction to Data Structures

Data is the basic entity or fact used in a calculation or manipulation process. There are two
types of data: numerical and alphanumeric. Integers and floating-point numbers are numerical
data; strings are alphanumeric data. Data may be a single value or a set of values, and it must
be organized in a particular fashion. This organization, or structuring, of data has a profound
impact on the efficiency of a program.

A data structure is the structural representation of the logical relationships between elements
of data. In other words, a data structure is a way of organizing data items by considering their
relationships to each other. The choice of data structure affects the design of both the
structural and the functional aspects of a program.

Data structures are the building blocks of a program. Selecting an appropriate data structure
helps the programmer design more efficient programs, which matters because the complexity and
volume of the problems solved by computers are steadily increasing. A problem becomes much
easier to solve if it is analyzed and divided into subproblems, i.e., divide, conquer and combine.


It is clear from the above discussion that the data structure, together with the operations on
the organized data items, solves the problem on a computer:

Data Structure = Organized data + Allowed Operations
Classification of Data Structures

Data Structure
    Linear: Array, Linked List, Stack, Queue
    Non-Linear: Trees, Graph

Data structures are divided into two categories:

1) Linear Data Structures: In a linear data structure, the data items are arranged in sequence.
Examples are Array, Linked List, Stack and Queue.
2) Non-Linear Data Structures: In a non-linear data structure, the data items are not in
sequence. Examples are Trees and Graphs.

Operations on Data Structures

The various operations that can be performed on different data structures are described
below:

1. Traversing: Accessing each record exactly once so that certain items in the record may
be processed. (This operation is called visiting the element.)
2. Inserting: Adding a new record or element to the structure.
3. Deleting: Removing a record or element from the structure.
4. Searching: Finding the location of a given element.
5. Sorting: Arranging the records in either ascending or descending order.
6. Merging: Combining the records in two different sorted files into a single sorted file.
Advantages and Disadvantages of Data Structures

Data Structure   Advantages                                   Disadvantages
Array            Quick inserts; fast access if index known    Slow search; slow deletes; fixed size
Ordered array    Faster search than unsorted array            Slow inserts; slow deletes; fixed size
Stack            Last-in, first-out access                    Slow access to other items
Queue            First-in, first-out access                   Slow access to other items
Linked list      Quick inserts; quick deletes                 Slow search
Binary tree      Quick search; quick inserts; quick deletes   Deletion algorithm is complex
Graph            Best models real-world situations            Some algorithms are slow and very complex
Mathematical Notations and Functions

Mathematical notation is a system of symbolic representations of mathematical objects and ideas. The
notation uses symbols or symbolic expressions which are intended to have a precise semantic meaning.
Mathematical notations are used in mathematics, the physical sciences, engineering, and economics.
They include relatively simple symbolic representations, such as the numbers 1 and 2, and function
symbols.

A function, in a mathematical sense, expresses the idea that one quantity (the argument of the function,
also known as the input) completely determines another quantity (the value, or the output).

In function notation, the "x" in "f(x)" is called "the argument of the function", or just "the
argument". So if you are given "f(2)" and asked for the "argument", the answer is just "2".

Let's proceed to "evaluation": you evaluate "f(x)" just as you would evaluate "y".

• Given $f(x) = x^2 + 2x - 1$, find $f(2)$.

$f(2) = (2)^2 + 2(2) - 1 = 4 + 4 - 1 = 7$

• Given $f(x) = x^2 + 2x - 1$, find $f(-3)$.

$f(-3) = (-3)^2 + 2(-3) - 1 = 9 - 6 - 1 = 2$

Mathematical Notation

Symbol            Meaning
≡                 is equivalent to
|x|               absolute value of x
Σ (i = 1 to n)    summation for i from 1 through n
√2                the square root of 2
≈                 approximately equal to
<                 less than
>                 greater than
⌈x⌉               x rounded up (to a whole number)
⌊x⌋               x rounded down (to a whole number)
⌊x⌉               x rounded to the nearest whole number
π                 ratio of the circumference to the diameter of a circle
∞                 infinity; larger than any number
1. Floor and Ceiling Functions

If x is a real number, then x lies between two integers, called the floor and the ceiling of x:

⌊x⌋, called the floor of x, denotes the greatest integer that does not exceed x.

⌈x⌉, called the ceiling of x, denotes the least integer that is not less than x.

E.g.

⌊3.14⌋ = 3

⌈3.14⌉ = 4

2. Remainder Function (Modular Arithmetic)

If k is any integer and M is a positive integer, then:

k (mod M)

gives the integer remainder when k is divided by M.

E.g.

25 (mod 7) = 4
25 (mod 5) = 0

3. Integer and Absolute Value Functions

If x is a real number, then the integer function INT(x) converts x into an integer by removing
the fractional part.
E.g.

INT(3.14) = 3
INT(-8.5) = -8

The absolute value function ABS(x), or |x|, gives the absolute value of x, i.e. the non-negative
value of x even if x is negative.
E.g.

ABS(-15) = 15 or |-15| = 15
ABS(7) = 7 or |7| = 7
ABS(-3.33) = 3.33 or |-3.33| = 3.33
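
The floor, ceiling, remainder, INT and ABS functions described above can be sketched in C as follows. This is only an illustration: the function names are hypothetical, the values are assumed to fit in a long, and mod assumes the usual convention that k (mod M) lies in the range 0 to M-1 even when k is negative.

```c
#include <assert.h>

/* Floor: greatest integer that does not exceed x.
   Assumes x fits in the range of long. */
long floor_of(double x) {
    long t = (long)x;              /* the cast truncates toward zero */
    return ((double)t > x) ? t - 1 : t;
}

/* Ceiling: least integer that is not less than x. */
long ceiling_of(double x) {
    long t = (long)x;
    return ((double)t < x) ? t + 1 : t;
}

/* k (mod m) for positive m, always in the range 0 .. m-1. */
long mod(long k, long m) {
    assert(m > 0);
    long r = k % m;                /* C's % may yield a negative remainder */
    return (r < 0) ? r + m : r;
}

/* INT(x): drop the fractional part (truncate toward zero). */
long int_part(double x) {
    return (long)x;
}

/* ABS(x): absolute value of a real number. */
double abs_value(double x) {
    return (x < 0) ? -x : x;
}
```

Note the difference for negative arguments: floor_of(-3.14) is -4 while int_part(-3.14) is -3, and mod(-26, 7) is 2 although the C expression -26 % 7 evaluates to -5.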

4. Summation Symbol (Sums)

The symbol used to denote summation is the Greek letter sigma, Σ.

Let $a_1, a_2, a_3, \ldots, a_n$ be a sequence of numbers. Then the sum
$a_1 + a_2 + a_3 + \cdots + a_n$ is written as:

$\sum_{j=1}^{n} a_j$

where j is called the dummy index or dummy variable.

E.g.

$\sum_{j=1}^{n} j = 1 + 2 + 3 + \cdots + n$
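
In a program, a summation becomes a loop in which the loop variable plays the role of the dummy index. A minimal sketch in C, with hypothetical function names and C's 0-based indexing for the array version:

```c
#include <assert.h>

/* Sum a[0] + a[1] + ... + a[n-1]; the loop variable j plays the
   role of the dummy index. */
long sum_array(const int a[], int n) {
    long total = 0;
    for (int j = 0; j < n; j++) {
        total += a[j];
    }
    return total;
}

/* Sum 1 + 2 + ... + n computed term by term. */
long sum_first_n(int n) {
    assert(n >= 0);
    long total = 0;
    for (int j = 1; j <= n; j++) {
        total += j;
    }
    return total;
}
```

For example, sum_first_n(100) gives 5050, which agrees with the closed form n(n+1)/2.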

5. Factorial Function

n! denotes the product of the positive integers from 1 to n. n! is read as 'n factorial', i.e.

n! = 1 * 2 * 3 * ... * (n-2) * (n-1) * n

E.g.

4! = 1 * 2 * 3 * 4 = 24
5! = 5 * 4 * 3 * 2 * 1 = 120

6. Permutations

Suppose we have a set of n elements. A permutation of this set is an arrangement of the elements
of the set in some order.

E.g.

Suppose the set contains a, b and c. The permutations of these elements are: abc, acb,
bac, bca, cab, cba.

If there are n elements in the set, then there are n! permutations of those elements. For example,
a set with 3 elements has 3! = 1 * 2 * 3 = 6 permutations.
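
The link between factorials and permutations can be checked with a short C sketch. The function names are hypothetical, and the recursive swap technique used to enumerate arrangements is one common approach chosen here for illustration, not a method given in these notes.

```c
#include <assert.h>

/* n! for 0 <= n <= 20 (21! no longer fits in 64 bits). */
unsigned long long factorial(int n) {
    assert(n >= 0 && n <= 20);
    unsigned long long result = 1;
    for (int i = 2; i <= n; i++) {
        result *= (unsigned long long)i;
    }
    return result;
}

/* Count the arrangements of s[k..n-1] by placing each remaining
   element at position k in turn and recursing on the rest. */
unsigned long long count_permutations(char s[], int k, int n) {
    if (k >= n - 1) {
        return 1;                 /* one complete arrangement */
    }
    unsigned long long count = 0;
    for (int i = k; i < n; i++) {
        char t = s[k]; s[k] = s[i]; s[i] = t;   /* choose s[i] */
        count += count_permutations(s, k + 1, n);
        t = s[k]; s[k] = s[i]; s[i] = t;        /* undo the choice */
    }
    return count;
}
```

Calling count_permutations on the three characters a, b, c with k = 0 and n = 3 counts 6 arrangements, the same value as factorial(3).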
7. Exponents and Logarithms

An exponent indicates how many times a number is multiplied by itself. If m is a positive integer,
then:

$a^m = a \cdot a \cdot a \cdots a$ (m times)

E.g.
$2^4 = 2 \cdot 2 \cdot 2 \cdot 2 = 16$

The concept of logarithms is related to exponents. If b is a positive number other than 1, then the
logarithm of any positive number x to the base b is written as $\log_b x$. It represents the exponent
to which b must be raised to get x, i.e. $y = \log_b x$ means $b^y = x$.
E.g.

$\log_2 8 = 3$, since $2^3 = 8$
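
Both ideas can be sketched with integer arithmetic in C. The function names are hypothetical; int_pow does not check for overflow, and floor_log2 returns only the whole-number part of the base-2 logarithm.

```c
#include <assert.h>

/* a^m for a non-negative integer exponent m, by repeated multiplication.
   Overflow is not checked. */
long long int_pow(long long a, unsigned m) {
    long long result = 1;
    for (unsigned i = 0; i < m; i++) {
        result *= a;
    }
    return result;
}

/* Largest y with 2^y <= x, i.e. the floor of log2(x), for x >= 1. */
int floor_log2(unsigned long x) {
    assert(x >= 1);
    int y = 0;
    while (x > 1) {
        x /= 2;
        y++;
    }
    return y;
}
```

Here int_pow(2, 4) gives 16 and floor_log2(8) gives 3, matching the examples above. For a value that is not a power of 2, such as 10, floor_log2 gives 3 because 2^3 = 8 is the largest power of 2 not exceeding 10.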

Algorithm Complexity
After an algorithm is designed, its correctness has to be checked; this is done by analyzing the
algorithm. The algorithm can be analyzed by tracing its instructions step by step, reading it for
logical correctness, testing it on some data, and using mathematical techniques to prove it
correct. Another type of analysis concerns the simplicity of the algorithm: an algorithm designed
in a simple way is easier to implement. However, the simplest and most straightforward way of
solving a problem is not always the best one. Moreover, there may be more than one algorithm to
solve a problem. The choice of a particular algorithm depends on the following performance
measures:

1. Space complexity

2. Time complexity

Space Complexity

The space complexity of an algorithm or program is the amount of memory it needs to run to
completion.

Some of the reasons for studying space complexity are:

1. If the program is to run on a multi-user system, it may be necessary to specify the amount of
memory to be allocated to the program.

2. We may want to know in advance whether sufficient memory is available to run the program.

3. It can be used to estimate the size of the largest problem that a program can solve.

The space needed by a program consists of the following components:

• Instruction space: space needed to store the executable version of the program. This space is fixed.

• Data space:

(a) Space needed by constants and simple variables. This space is fixed.

(b) Space needed by fixed-size structured variables, such as arrays and structures.

(c) Dynamically allocated space. This space usually varies.

(d) Return address: the point from which execution resumes after a called function completes.

Time Complexity

The time complexity of an algorithm is the amount of time it needs to run to completion. The
exact time depends on the implementation of the algorithm, the programming language, the
optimizing capabilities of the compiler used, the CPU speed, other hardware characteristics and
so on. The computer time an algorithm requires usually depends on the "size" of the input data.
In other words, the time complexity of an algorithm is often a function of the input size "n",
that is, f(n). With less input data an algorithm generally takes less time than with more input
data. In addition, the same algorithm may take different amounts of time for different inputs of
the same size.
Algorithm Complexity with Examples

There are two types of algorithm complexity:

1. Space Complexity

2. Time Complexity

1. Space Complexity: the amount of memory needed by the algorithm (program) to run to
completion. We can measure the space by finding out how much memory is consumed by the
instructions and by the variables used.

E.g.

Suppose we want to add two integers. To solve this problem we have the following two
algorithms:

Algorithm 1:

Step 1- Input A.
Step 2- Input B.
Step 3- Set C := A + B.
Step 4- Write: 'Sum is ', C.
Step 5- Exit.

Algorithm 2:

Step 1- Input A.
Step 2- Input B.
Step 3- Write: 'Sum is ', A + B.
Step 4- Exit.

Both algorithms produce the same result. Assuming 2 bytes for each integer, the first algorithm
uses 6 bytes for its three variables and the second uses 4 bytes for its two. The first also has
more instructions than the second. So we choose the second algorithm, as it takes less space than
the first.

2. Time Complexity: the amount of time needed by the algorithm (program) to run to completion.
The total time consists of the compilation time and the run time. The compilation time is the
time taken by the compiler to compile the program. It is not under the control of the programmer;
it depends on the compiler, and one compiler may take less time than another to compile the same
program. So we ignore the compilation time and consider only the run time, which is the time
taken to execute the program. We can estimate the run time from the number of instructions in
the algorithm.

Example.

Consider again the two algorithms above for adding two integers. Suppose 1 second is required to
execute one instruction. Not counting the Exit step, the first algorithm takes 4 seconds and the
second takes 3 seconds. So we choose the second algorithm, as it takes less time.

Best, Average and Worst Case

The running time of an algorithm depends on the input data. There are three cases:

1. Best case

2. Average case

3. Worst case

The best case is the amount of time a program would be expected to take on the best possible
input data.

The average case is the amount of time a program would be expected to take on typical (or
average) input data.

The worst case is the amount of time a program would take on the worst possible input
configuration.
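
The three cases can be seen in a linear search, used here purely as an illustration (it is not an algorithm from this section). The sketch below is in C; the function name and the comparison counter are hypothetical additions that make the cost of each call visible.

```c
#include <assert.h>
#include <stddef.h>

/* Linear search for key in a[0..n-1]. Returns the index of the first
   match, or -1 if key is absent. *comparisons receives the number of
   elements examined. */
int linear_search(const int a[], int n, int key, int *comparisons) {
    assert(n >= 0 && comparisons != NULL);
    *comparisons = 0;
    for (int i = 0; i < n; i++) {
        (*comparisons)++;
        if (a[i] == key) {
            return i;
        }
    }
    return -1;
}
```

With n elements, the best case is a key in the first position (1 comparison), and the worst case is a key in the last position or not present at all (n comparisons). The average case depends on what is assumed about where the key is likely to be.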
Big O Notation

Big-O notation (also known as Big Oh notation and asymptotic notation) describes the limiting
behavior of a function when the argument tends towards a particular value or infinity, usually in terms of
simpler functions. Big O notation characterizes functions according to their growth rates: different
functions with the same growth rate may be represented using the same O notation.

It is not possible to determine the exact amount of time an algorithm requires by simple
analysis. The first complication is that the exact time depends on the implementation of the
algorithm and on the actual machine. Another complication is that the time requirement depends
on the amount of input data. So the estimate of the time required by an algorithm is expressed
as a function of the size of the input data.

Suppose M is an algorithm and n is the size of the input data. The complexity function f(n)
of M increases as n increases. The rate of increase of f(n) is found by comparing f(n) with some
standard functions, such as:

$\log_2 n,\; n,\; n \log_2 n,\; n^2,\; n^3,\; 2^n$

Table for rate of growth of standard functions

According to the above table, the rates of growth are as follows: the logarithmic function
$\log_2 n$ grows most slowly, the exponential function $2^n$ grows most rapidly, and the polynomial
function $n^c$ grows according to the exponent c.
An order of growth is a set of functions whose asymptotic growth behavior is considered
equivalent. For example, 2n, 100n and n + 1 belong to the same order of growth, which is written
O(n) in "Big-Oh notation" and often called "linear" because every function in the set grows
linearly with n. All functions with the leading term $n^2$ belong to $O(n^2)$.

The following table shows some of the orders of growth that appear most commonly in
algorithmic analysis.

Order of growth     Name
$O(1)$              constant
$O(\log_b n)$       logarithmic (for any b)
$O(n)$              linear
$O(n \log_b n)$     "en log en"
$O(n^2)$            quadratic
$O(n^3)$            cubic
$O(c^n)$            exponential (for any c)

For example,

let $f(x) = 6x^4 - 2x^3 + 5$, and suppose we wish to simplify this function, using O notation, to
describe its growth rate as x approaches infinity. This function is the sum of three terms: $6x^4$,
$-2x^3$, and $5$. Of these three terms, the one with the highest growth rate is the one with the
largest exponent as a function of x, namely $6x^4$. Next, $6x^4$ is a product of 6 and $x^4$ in
which the first factor does not depend on x. Omitting this constant factor results in the
simplified form $x^4$. Thus, we say that f(x) is big-oh of $x^4$, or mathematically we can write
$f(x) = O(x^4)$.
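
The simplification can be checked against the usual formal definition, which these notes do not state and which is assumed here: $f(x) = O(g(x))$ if there are constants $C > 0$ and $x_0$ such that $|f(x)| \le C\,|g(x)|$ for all $x \ge x_0$. A short derivation, with $C = 13$ and $x_0 = 1$ chosen for illustration:

```latex
\begin{align*}
|6x^4 - 2x^3 + 5| &\le 6x^4 + 2x^3 + 5 \\
                  &\le 6x^4 + 2x^4 + 5x^4
                     && \text{since } x^3 \le x^4 \text{ and } 1 \le x^4 \text{ for } x \ge 1 \\
                  &= 13x^4 .
\end{align*}
```

Any larger constant would serve equally well; the definition only requires that some such C and x_0 exist.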

Limitations of Big O Notation


Complexity analysis can be very useful, but there are problems with it too.

• Too hard to analyze. Many algorithms are simply too hard to analyze mathematically.
• Average case unknown. There may not be sufficient information to know what the most
important "average" case really is, so the analysis is impossible.
• It takes no account of programming effort.
Time-Space Tradeoff

There may be more than one approach (or algorithm) to solve a problem. The best algorithm (or program)
to solve a given problem is one that requires less space in memory and takes less time to complete its
execution. In practice, it is not always possible to achieve both of these objectives. One algorithm
may require more space but less time to complete its execution, while another requires less space
but takes more time. Thus, we may have to sacrifice one for the other. If space is our constraint,
we have to choose a program that requires less space at the cost of more execution time. On the
other hand, if time is our constraint, as in a real-time system, we have to choose a program that
takes less time to complete its execution at the cost of more space.
Arrays

Arrays are among the most frequently used structures in programming. Mathematical problems
involving matrices, algebra and so on can be handled easily with arrays. An array is a data
structure: a collection of homogeneous data elements (elements of the same type) described by a
single name. Each element of an array is referenced by a subscripted variable, that is, by the
array name followed by a subscript or index enclosed in brackets. If an element is referenced by
a single subscript, the array is known as a one-dimensional array or linear array; if two
subscripts are required to reference an element, the array is known as a two-dimensional array,
and so on. Arrays whose elements are referenced by two or more subscripts are called
multi-dimensional arrays.

E.g.

An array STUDENT containing 8 records, STUDENT[8], is shown below:

STUDENT
Ritika
Gurpreet
Anupama
Hanish
Harsh
Navdeep
Shalini
Kapil
One-Dimensional Array

A one-dimensional array (or linear array) is a set of 'n' (a finite number) homogeneous data
elements such that:

1. The elements of the array are referenced respectively by an index set consisting of 'n'
consecutive numbers.

2. The elements of the array are stored respectively in successive memory locations.

The number 'n' of elements is called the length or size of the array. The elements of an array
'A' may be denoted in C as

A[0], A[1], A[2], ...... A[n - 1].

The number 'k' in A[k] is called a subscript or an index, and A[k] is called a subscripted
variable. If 'n' is 10, then the array elements A[0], A[1] ...... A[9] are stored in sequential
memory locations as follows:

A[0], A[1], A[2], ...... A[9]

In C, an array is normally read or written with a loop. A one-dimensional array requires one loop
for reading and one for writing, for example:

For reading an array of 'n' elements

for (i = 0; i < n; i++)
    scanf("%d", &a[i]);    /* scanf needs the address of each element */

For writing an array

for (i = 0; i < n; i++)
    printf("%d", a[i]);    /* printf takes the value, not the address */

Multi-Dimensional Array

Reading or writing a two-dimensional array requires two nested loops. Similarly, an array of
'n' dimensions requires 'n' nested loops. The structure of a two-dimensional array A[2][2] is
illustrated in the following figure:
Representation of Linear Arrays in Memory

The elements of a linear array are stored in consecutive memory locations.

Finding the Location of an Element of a Linear Array

The computer does not keep track of the address of every element of the array. It keeps track
only of the base address of the array, that is, the address of its first element, and from this
base address the location of any element can be computed with the following formula:

LOC(A[K]) = Base(A) + w(K - LB)

Here LOC(A[K]) is the location of the element A[K].
Base(A) is the base address of A.
w is the number of bytes taken by one element.
K is the index of the element.
LB is the lower bound of the index.
Suppose we want to find LOC(A[3]). We have:

Base(A) = 1000
w = 2 bytes (assuming an integer takes two bytes in memory)
K = 3
LB = 1

Substituting these values into the formula, we get:

LOC(A[3]) = 1000 + 2(3 - 1)
          = 1000 + 2(2)
          = 1000 + 4
          = 1004
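
The address calculation can be written as a small C function. This is a sketch with a hypothetical function name; addresses are modelled as plain unsigned integers rather than real pointers.

```c
#include <assert.h>

/* LOC(A[K]) = Base(A) + w * (K - LB).
   base: address of the first element; w: bytes per element;
   k: index of the wanted element; lb: lower bound of the index. */
unsigned long loc(unsigned long base, unsigned long w, long k, long lb) {
    assert(k >= lb);
    return base + w * (unsigned long)(k - lb);
}
```

Calling loc(1000, 2, 3, 1) reproduces the worked example and gives 1004. In C itself the lower bound is always 0 and w is sizeof of the element type, so the byte offset of a[k] from the start of the array is k * sizeof a[0].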

Traversing a Linear Array

Traversing a linear array means accessing and processing each element of the array exactly once.

Description: Here A is a linear array with lower bound LB and upper bound UB. This algorithm
traverses array A and applies the operation PROCESS to each element of the array.

Algorithm:

1. Repeat For I = LB to UB:

2.     Apply PROCESS to A[I].

[End of For Loop]

3. Exit.

Explanation: A is a linear array stored in memory with lower bound LB and upper bound UB. The
For loop iterates from LB to UB and visits each element of the array. During each visit, it
applies the operation PROCESS to the element.
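
A minimal C sketch of the traversal follows. The names traverse and double_element are hypothetical, and PROCESS is modelled as a function pointer so that any operation can be supplied; doubling is just one example.

```c
#include <assert.h>
#include <stddef.h>

/* Visit every element of a[0..n-1] exactly once, applying process
   to each in turn. In C the lower bound LB is 0 and UB is n - 1. */
void traverse(int a[], int n, void (*process)(int *element)) {
    assert(n >= 0 && process != NULL);
    for (int i = 0; i < n; i++) {
        process(&a[i]);
    }
}

/* One possible PROCESS operation: double the element in place. */
void double_element(int *element) {
    *element *= 2;
}
```

For an array holding 1, 2, 3, the call traverse(a, 3, double_element) leaves it holding 2, 4, 6.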

Insertion
Insertion refers to the operation of adding a new element to an existing list of elements. A new
element can be inserted into an array only if the array has enough space for it. Adding an
element at the end of the array is straightforward: simply store it after the last element.

To insert an element in the middle of the array, each element from the specified location onward
must first be moved one location downward.

Description: Here A is a linear array with N elements and K is the position at which the new
element is to be inserted. ITEM is the new element.

Algorithm:

1. Set J = N
2. Repeat steps 3 and 4 while J >= K
3.     Set A[J + 1] = A[J]        //Move Jth element downward
4.     Set J = J - 1
[End of Loop]
5. Set A[K] = ITEM
6. Set N = N + 1
7. Exit
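
The insertion steps can be sketched in C as below. The function name and the capacity parameter are hypothetical additions; positions are 1-based as in the algorithm, so A[J] there corresponds to a[j - 1] in the code.

```c
#include <assert.h>

/* Insert item at 1-based position k of a[], which holds *n elements
   and has room for capacity elements. Mirrors the steps above with
   0-based C indexing: A[J] in the algorithm is a[j - 1] here. */
void insert_at(int a[], int *n, int capacity, int k, int item) {
    assert(*n < capacity);            /* there must be a free slot */
    assert(k >= 1 && k <= *n + 1);
    for (int j = *n; j >= k; j--) {
        a[j] = a[j - 1];              /* move the Jth element downward */
    }
    a[k - 1] = item;
    (*n)++;
}
```

Inserting 30 at position 3 of an array holding 10, 20, 40, 50 moves 50 and then 40 one place down and yields 10, 20, 30, 40, 50. Shifting starts from the last element so that no value is overwritten before it has been copied.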
Deletion

Deletion refers to the operation of removing one of the elements from a linear array. When an
element is deleted from the middle of the array, each element after it must be moved one location
upward to fill the gap.

Description: Here A is a linear array with N elements and K is the position of the element to be
deleted. The deleted element is stored in ITEM. J is a loop counter.

Algorithm:

1. Set ITEM = A[K]
2. Set J = K
3. Repeat steps 4 and 5 while J < N
4.     Set A[J] = A[J + 1]        //Move element upward
5.     Set J = J + 1
[End of Loop]
6. Set N = N - 1
7. Exit
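
A corresponding C sketch of deletion, again with a hypothetical function name and 1-based positions, so that A[J] in the algorithm corresponds to a[j - 1] in the code:

```c
#include <assert.h>

/* Delete the element at 1-based position k from a[], which holds *n
   elements, and return it. With 0-based C indexing, A[J] in the
   algorithm is a[j - 1] here. */
int delete_at(int a[], int *n, int k) {
    assert(k >= 1 && k <= *n);
    int item = a[k - 1];
    for (int j = k; j < *n; j++) {
        a[j - 1] = a[j];              /* move the next element upward */
    }
    (*n)--;
    return item;
}
```

Deleting position 2 from an array holding 10, 20, 30, 40, 50 returns 20 and leaves 10, 30, 40, 50. Here the shifting runs from the deleted position towards the end, the opposite direction to insertion.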

Multi-Dimensional Array

A two-dimensional array is a table with R rows and C columns, written R x C. An element of the
array is specified by [r, c], where r is the row and c is the column. The figure below illustrates a
3 x 4 array with the elements labelled by row and column.
Row-Major Order

In row-major storage, a multidimensional array is laid out in linear memory so that its rows are
stored one after the other. Row-major ordering assigns successive elements, moving across each
row and then down to the next row, to successive memory locations.

This mapping is demonstrated in the figure.

When using row-major order, the difference between the addresses of array cells in consecutive
rows is larger than the difference between the addresses of cells in consecutive columns. For
example, consider this 2×3 array:

1 2 3
4 5 6

An array declared in C as

int A[2][3] = { {1, 2, 3}, {4, 5, 6} };

would be laid out contiguously in linear memory as:

1 2 3 4 5 6

To traverse this array in the order in which it is laid out in memory, one would use the following
nested loop:

for (i = 0; i < 2; i++)
    for (j = 0; j < 3; j++)
        printf("%d\n", A[i][j]);

Column-Major Order

In column-major order, the elements of a two-dimensional array are stored column by column: the
first column is stored first, then the second, the third, and so on.

Example

The 2 x 3 array

1 2 3
4 5 6

if stored contiguously in linear memory with column-major order would look like the following:

1 4 2 5 3 6
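
The two storage orders differ only in how the pair (row, column) is mapped to a position in linear memory. A sketch in C, using 0-based indices and hypothetical function names:

```c
#include <assert.h>

/* Offset, in elements, of A[i][j] in an array with `rows` rows and
   `cols` columns, counted from the first element. */
int row_major_offset(int i, int j, int rows, int cols) {
    assert(i >= 0 && i < rows && j >= 0 && j < cols);
    return i * cols + j;
}

int col_major_offset(int i, int j, int rows, int cols) {
    assert(i >= 0 && i < rows && j >= 0 && j < cols);
    return j * rows + i;
}

/* Copy a rows x cols matrix, given in C's row-major layout in src,
   into dst in column-major layout. */
void to_col_major(const int *src, int *dst, int rows, int cols) {
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            dst[col_major_offset(i, j, rows, cols)] =
                src[row_major_offset(i, j, rows, cols)];
        }
    }
}
```

For a 2 x 3 array holding 1 to 6 row by row, to_col_major produces the sequence 1 4 2 5 3 6. C always stores its own two-dimensional arrays in row-major order, so the column-major copy is needed only when a different layout is wanted.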

Sparse Array

A sparse array is an important application of arrays. A sparse array is an array in which nearly
all of the elements have the same constant value (usually zero). One-dimensional sparse arrays
are called sparse vectors and two-dimensional sparse arrays are called sparse matrices.

The main objective of using a special representation for sparse arrays is to minimize the memory
space requirement and to improve the execution speed of a program. This can be achieved by
allocating memory space only for the non-zero elements.

For example, a sparse array can be viewed as in the figure.

We store only the non-zero elements of the sparse matrix, because storing all of its elements
would waste memory space. The non-zero elements are stored in an array of the form

A[0......n][1......3]

where 'n' is the number of non-zero elements in the array. In the above figure, n = 7, so the
sparse array given in the figure may be represented in the array A[0......7][1......3].

The elements A[0][1] and A[0][2] contain the number of rows and columns of the sparse array.
A[0][3] contains the total number of non-zero elements in the sparse array.

A[1][1] contains the number of the row where the first non-zero element is present in the sparse
array. A[1][2] contains the number of the column of that element. A[1][3] contains the value of
the element. In the figure, the first non-zero element is found in the 1st row and 3rd column.
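
The representation described above can be built with a short C function. This is a sketch under stated assumptions: the names to_triplets and MAX_NONZERO are hypothetical, the second dimension is indexed 0 to 2 instead of 1 to 3 because C arrays start at 0, and the caller is assumed to supply an output array with at least MAX_NONZERO + 1 rows.

```c
#include <assert.h>

#define MAX_NONZERO 20   /* hypothetical upper limit for this sketch */

/* Build the triplet form t of a rows x cols matrix m, stored row by row.
   t[0] = {rows, cols, count}; t[1..count] = {row, column, value}, with
   rows and columns numbered from 1 as in the text. The text indexes the
   second dimension 1..3; C uses 0..2. Returns the count of non-zero
   elements. */
int to_triplets(const int *m, int rows, int cols, int t[][3]) {
    int count = 0;
    for (int r = 0; r < rows; r++) {
        for (int c = 0; c < cols; c++) {
            int value = m[r * cols + c];
            if (value != 0) {
                assert(count < MAX_NONZERO);
                count++;
                t[count][0] = r + 1;
                t[count][1] = c + 1;
                t[count][2] = value;
            }
        }
    }
    t[0][0] = rows;
    t[0][1] = cols;
    t[0][2] = count;
    return count;
}
```

For a 3 x 4 matrix with 12 elements of which only 3 are non-zero, the triplet form needs 4 rows of 3 integers. The saving grows with the size of the matrix, as long as the number of non-zero elements stays small.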
