
MODULE I
MCA403 ALGORITHM ANALYSIS AND DESIGN (ADMN 2009-10)
Dept. of Computer Science and Applications, SJCET, Palai

1.1 WHAT IS AN ALGORITHM

Definition: An algorithm is a finite set of instructions that, if followed, accomplishes a particular task.

PROPERTIES OF AN ALGORITHM
All algorithms must satisfy the following criteria:
1. Input. Zero or more quantities are externally supplied.
2. Output. At least one quantity is produced.
3. Definiteness. Each instruction is clear and unambiguous.
4. Finiteness. If we trace out the instructions of an algorithm, then for all cases the algorithm terminates after a finite number of steps.
5. Effectiveness. Every instruction must be very basic, so that it can be carried out, in principle, by a person using only pencil and paper. It is not enough that each operation be definite as in criterion 3; it must also be feasible.

An algorithm is composed of a finite set of steps, each of which may require one or more operations. The possibility of a computer carrying out these operations necessitates that certain constraints be placed on the type of operations an algorithm can include. Criteria 1 and 2 require that an algorithm produce one or more outputs and have zero or more externally supplied inputs. According to criterion 3, each operation must be definite, meaning that it must be perfectly clear what should be done. The fourth criterion is that algorithms terminate after a finite number of operations; a related consideration is that the time to termination should be reasonably short. Criterion 5 requires that each operation be effective: each step must be such that it can, at least in principle, be done by a person using pencil and paper in a finite amount of time. Performing arithmetic on integers is an example of an effective operation, but arithmetic with real numbers is not, since some values may be expressible only by an infinitely long decimal expansion.

1.1.1 DIFFERENCE BETWEEN ALGORITHM, COMPUTATIONAL PROCEDURE AND PROGRAM

COMPUTATIONAL PROCEDURE
Algorithms that are definite and effective are also called computational procedures. One important example of a computational procedure is the operating system of a digital computer. This procedure is designed to control the execution of jobs in such a way that when no jobs are available, it does not terminate but continues in a waiting state until a new job is entered.

PROGRAM
To help us achieve the criterion of definiteness, algorithms are written in a programming language. Such languages are designed so that each legitimate sentence has a unique meaning. A program is the expression of an algorithm in a programming language. Sometimes words such as procedure, function and subroutine are used synonymously for program.

STUDY OF ALGORITHMS
The study of algorithms includes many important and active areas of research. There are four distinct areas of study:

1. How to devise algorithms: Creating an algorithm is an art which may never be fully automated. There are several techniques with which you can devise new and useful algorithms; dynamic programming is one such technique. Some of the techniques are especially useful in fields other than computer science, such as operations research and electrical engineering.

2. How to validate algorithms: Once an algorithm is devised, it is necessary to show that it computes the correct answer for all possible legal inputs.
This process is referred to as algorithm validation. It is sufficient to state the algorithm in any precise way; it need not be expressed as a program. The purpose of validation is to assure us that the algorithm will work correctly, independent of the issues concerning the programming language in which it will eventually be written.

Once the validity of the method has been shown, a program can be written and a second phase begins. This phase is referred to as program proving or program verification. A proof of correctness requires that the solution be stated in two forms. One form is usually a program annotated by a set of assertions about the input and output variables of the program; these assertions are often expressed in predicate calculus. The second form, called a specification, may also be expressed in predicate calculus. A proof consists of showing that these two forms are equivalent, in that for every given legal input they describe the same output. A complete proof of program correctness requires that each statement of the programming language be precisely defined and all basic operations be proved correct.
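As a toy illustration of these two forms (the function, the division problem and the use of C's assert macro are a made-up example; the text itself does not fix a language), a program can carry its input and output assertions like this:

    #include <assert.h>

    /* Hypothetical example: integer division specified by assertions.
     * Input assertion:  b > 0
     * Output assertion: a == q*b + r  and  0 <= r < b             */
    void div_mod(int a, int b, int *q, int *r)
    {
        assert(b > 0);                /* assertion about the input  */
        *q = a / b;
        *r = a % b;
        if (*r < 0) {                 /* adjust C's truncated division for negative a */
            *r = *r + b;
            *q = *q - 1;
        }
        assert(a == *q * b + *r);     /* assertions about the output */
        assert(0 <= *r && *r < b);
    }

A proof of correctness would show that this program and its specification describe the same output for every legal input.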
3. How to analyze algorithms: This field of study is called analysis of algorithms. As an algorithm is executed, it uses the computer's central processing unit (CPU) to perform operations and its memory (both immediate and auxiliary) to hold the program and data. Analysis of algorithms, or performance analysis, refers to the task of determining how much computing time and storage an algorithm requires. An important result of this study is that it allows you to make quantitative judgments about the value of one algorithm over another. Another is that it allows you to predict whether the software will meet any efficiency constraints that exist. Questions such as how well an algorithm performs in the best case, in the worst case, or on the average are typical.

4. How to test a program: Testing a program consists of two phases: debugging and profiling (or performance measurement). Debugging is the process of executing programs on sample data sets to determine whether faulty results occur and, if so, to correct them. In cases in which we cannot verify the correctness of output on sample data, the following strategy can be employed: let more than one programmer develop programs for the same problem, and compare the outputs produced by these programs; if the outputs match, then there is a good chance that they are correct. A proof of correctness is much more valuable than a thousand tests, since it guarantees that the program will work correctly for all possible inputs. Profiling or performance measurement is the process of executing a correct program on data sets and measuring the time and space it takes to compute the results. These timing figures are useful in that they may confirm a previously done analysis and point out logical places to perform useful optimization.

PSEUDOCODE CONVENTIONS
We can describe an algorithm in many ways. We can use a natural language like English, although if we select this option we must make sure that the resulting instructions are definite. We can present most of our algorithms using a pseudocode that resembles C.

1) Comments begin with // and continue until the end of the line.
Eg: count := count + 1; // count is global; it is initially zero.

2) Blocks are indicated with matching braces: { and }. A compound statement can be represented as a block. The body of a procedure also forms a block. Statements are delimited by ;.
Eg:
    for j := 1 to n do
    {
        count := count + 1;
        c[i, j] := a[i, j] + b[i, j];
        count := count + 1;
    }

3) An identifier begins with a letter. The data types of variables are not explicitly declared; the types will be clear from the context. Whether a variable is global or local to a procedure will also be evident from the context. Compound data types can be formed with records. Eg:

    node = record
    {
        datatype_1 data_1;
          :
        datatype_n data_n;
        node *link;
    }

4) Assignment of values to variables is done using the assignment statement
    <variable> := <expression>;
Eg: count := count + 1;

5) There are two Boolean values, true and false. In order to produce these values, the logical operators and, or and not and the relational operators <, <=, =, !=, >= and > are provided.
Eg: if (j > 1) then k := i - 1; else k := n - 1;

6) Elements of multidimensional arrays are accessed using [ and ]. For example, if A is a two-dimensional array, the (i, j)th element of the array is denoted as A[i, j]. Array indices start at zero.

7) The following looping statements are employed: for, while and repeat-until. The while loop takes the following form:
    while (condition) do
    {
        <statement 1>
          :
        <statement n>
    }

8) A conditional statement has the following forms:
    if <condition> then <statement>
    if <condition> then <statement 1> else <statement 2>
Here <condition> is a Boolean expression and <statement>, <statement 1> and <statement 2> are arbitrary statements.

9) Input and output are done using the instructions read and write. No format is used to specify the size of input or output quantities.
Eg: write ("n is even");

10) There is only one type of procedure: Algorithm. An algorithm consists of a heading and a body. The heading takes the form
    Algorithm Name(<parameter list>)

1.2 RECURSIVE ALGORITHMS

A function that calls itself repeatedly, until some condition is satisfied, is called a recursive function, and an algorithm that does this is called a recursive algorithm. Using recursion, we split a complex problem down to its single simplest case; the recursive function only knows how to solve that simplest case directly.

TYPES OF RECURSION

1.2.1 Linear Recursion
A linear recursive function is a function that makes only a single call to itself each time the function runs (as opposed to one that would call itself multiple times during its execution). The factorial function is a good example of linear recursion. Another example of a linear recursive function would be one to compute the square root of a number using Newton's method (assume EPSILON to be a very small number close to 0):

    double my_sqrt(double x, double a)
    {
        /* a is the current guess; stop when a*a is within EPSILON of x */
        double difference = a*a - x;
        if (difference < 0.0)
            difference = -difference;
        if (difference < EPSILON)
            return a;
        else
            return my_sqrt(x, (a + x/a)/2.0);   /* the single recursive call */
    }
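The factorial function mentioned above can be written, for example, as the following C function (a minimal sketch; the text names factorial but does not list this code):

    /* Linear recursion: exactly one recursive call per invocation. */
    long factorial(int n)
    {
        if (n <= 1)
            return 1;                    /* base case: 0! = 1! = 1 */
        return n * factorial(n - 1);     /* the single self-call */
    }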

1.2.2 Tail Recursion
Tail recursion is a form of linear recursion in which the recursive call is the last thing the function does. Often, the value of the recursive call is returned. As such, tail recursive functions can often easily be implemented in an iterative manner: by taking out the recursive call and replacing it with a loop, the same effect can generally be achieved. In fact, a good compiler can recognize tail recursion and convert it to iteration in order to optimize the performance of the code. A good example of a tail recursive function is a function to compute the GCD, or Greatest Common Divisor, of two numbers:

    int gcd(int m, int n)
    {
        int r;
        if (m < n) return gcd(n, m);   /* ensure m >= n */
        r = m % n;
        if (r == 0)
            return n;
        else
            return gcd(n, r);          /* tail call: the last thing done */
    }
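To illustrate the conversion just described, here is the same GCD computed with a loop in place of the tail call (a sketch, not from the text):

    /* Iterative version of gcd: the tail call is replaced by a loop. */
    int gcd_iter(int m, int n)
    {
        if (m < n) {                 /* ensure m >= n, as in the recursive version */
            int t = m; m = n; n = t;
        }
        while (n != 0) {
            int r = m % n;           /* the same remainder step */
            m = n;
            n = r;
        }
        return m;
    }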

1.2.3 Binary Recursion
Some recursive functions don't just have one call to themselves; they have two (or more). Functions with two recursive calls are referred to as binary recursive functions. The mathematical combinations operation is a good example of a function that can quickly be implemented as a binary recursive function. The number of combinations, often represented as nCk, where we are choosing k elements out of a set of n elements, can be implemented as follows:

    int choose(int n, int k)
    {
        if (k == 0 || n == k)
            return 1;                                   /* base cases */
        else
            return choose(n-1, k) + choose(n-1, k-1);   /* two recursive calls */
    }

1.2.4 Exponential Recursion
An exponential recursive function is one that, if you were to draw out a representation of all the function calls, would have an exponential number of calls in relation to the size of the data set (exponential meaning that if there were n elements, there would be O(a^n) function calls, where a is a positive number). A good example of an exponentially recursive function is a function to compute all the permutations of a data set. Let's write a function to take an array of n integers and print out every permutation of it.

    #include <stdio.h>

    void print_array(int arr[], int n)
    {
        int i;
        for (i = 0; i < n; i++)
            printf("%d ", arr[i]);
        printf("\n");
    }

    void print_permutations(int arr[], int n, int i)
    {
        int j, swap;
        if (i == n) {                /* a full permutation has been built */
            print_array(arr, n);
            return;
        }
        for (j = i; j < n; j++) {
            swap = arr[i]; arr[i] = arr[j]; arr[j] = swap;   /* choose arr[j] for position i */
            print_permutations(arr, n, i + 1);
            swap = arr[i]; arr[i] = arr[j]; arr[j] = swap;   /* undo the choice */
        }
    }

To run this function on an array arr of length n, we'd do print_permutations(arr, n, 0), where the 0 tells it to start at the beginning of the array.

1.2.5 Nested Recursion
In nested recursion, one of the arguments to the recursive function is itself a recursive call; these functions tend to grow extremely fast. A good example is the classic mathematical function, Ackermann's function. It grows very quickly (even for small values of m and n, Ackermann(m, n) is extremely large), and it cannot be computed with only definite iteration (a completely defined for() loop, for example); it requires indefinite iteration (recursion, for example).

Ackermann's function:

    int ackermann(int m, int n)
    {
        if (m == 0)
            return n + 1;
        else if (n == 0)
            return ackermann(m - 1, 1);
        else
            return ackermann(m - 1, ackermann(m, n - 1));   /* nested recursive call */
    }

1.2.6 Mutual Recursion
A recursive function doesn't necessarily need to call itself. Some recursive functions work in pairs or even larger groups: for example, function A calls function B, which calls function C, which in turn calls function A. A simple example of mutual recursion is a pair of functions to determine whether an integer is even or odd:

    int is_odd(unsigned int n);   /* forward declaration, so is_even can call it */

    int is_even(unsigned int n)
    {
        if (n == 0)
            return 1;
        else
            return is_odd(n - 1);
    }

    int is_odd(unsigned int n)
    {
        return !is_even(n);
    }

1.2.7 EXAMPLES OF RECURSIVE ALGORITHMS: The Towers of Hanoi

The Towers of Hanoi puzzle (TOH) was first posed by a French professor, Edouard Lucas, in 1883. Although commonly sold today as a children's toy, it is often discussed in discrete mathematics or computer science books because it provides a simple example of recursion. In addition, its analysis is straightforward and it has many variations of varying difficulty.

The object of the Towers of Hanoi problem is to specify the steps required to move the disks (or, as we will sometimes call them, rings) from pole r (r = 1, 2 or 3) to pole s (s = 1, 2 or 3; s != r), observing the following rules:
i) Only one disk at a time may be moved.
ii) At no time may a larger disk be on top of a smaller one.
The most common form of the problem has r = 1 and s = 3.

[Figure: the Towers of Hanoi problem]

Solution: The algorithm to solve this problem exemplifies the recursive paradigm. We imagine that we know a solution for n - 1 disks ("reduce to a previous case"), and then we use this solution to solve the problem for n disks. Thus, to move n disks from pole 1 to pole 3, we would:
1. Move n - 1 disks (the imagined known solution) from pole 1 to pole 2. However we do this, the nth disk on pole 1 will never be in our way, because any valid sequence of moves with only n - 1 disks will still be valid if there is an nth (larger) disk always sitting at the bottom of pole 1 (why?).
2. Move disk n from pole 1 to pole 3.
3. Use the same method as in step 1 to move the n - 1 disks now on pole 2 to pole 3.
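These three steps translate directly into C. In the sketch below, poles are named by integers and each move is printed; those representation choices are not fixed by the text:

    #include <stdio.h>

    /* Move n disks from pole 'from' to pole 'to', using pole 'via' as the spare. */
    void hanoi(int n, int from, int to, int via)
    {
        if (n == 0) return;             /* no disks: nothing to move */
        hanoi(n - 1, from, via, to);    /* step 1: move n-1 disks out of the way */
        printf("move disk %d from pole %d to pole %d\n", n, from, to);  /* step 2 */
        hanoi(n - 1, via, to, from);    /* step 3: bring the n-1 disks back on top */
    }

For the most common form of the problem (r = 1, s = 3), the call is hanoi(n, 1, 3, 2), and the recursion makes 2^n - 1 moves in all.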

1.3 ALGORITHM DESIGN TECHNIQUES

For a given problem, there are many ways to solve it. The main methods are listed below.
1. Divide and Conquer.
2. Greedy Algorithms.
3. Dynamic Programming.
4. Branch and Bound.
5. Backtracking Algorithms.
6. Randomized Algorithms.

Now let us discuss each method briefly.

1. Divide and Conquer
The divide and conquer method consists of three steps:
a. Divide the original problem into a set of subproblems.
b. Solve every subproblem individually, recursively.
c. Combine the solutions of the subproblems into a solution of the whole original problem.

2. Greedy Approach
Greedy algorithms seek to optimize a function by making choices which are the best locally, without looking at the global problem. The result is a good solution but not necessarily the best one. A greedy algorithm does not always guarantee the optimal solution, but it generally produces solutions that are very close in value to the optimal one.

3. Dynamic Programming
Dynamic programming is a technique for efficient solution. It is a method of solving problems that exhibit the properties of overlapping subproblems and optimal substructure, and it takes much less time than naive methods.

4. Branch and Bound
In a branch and bound algorithm, a given problem which cannot be bounded has to be divided into at least two new restricted subproblems. Branch and bound algorithms can be slow; in the worst case they require effort that grows exponentially with problem size. In some cases, however, the method converges with much less effort. Branch and bound algorithms are methods for global optimization in non-convex problems.

5. Backtracking Algorithms
Backtracking algorithms try each possibility until they find the right one. Backtracking is a depth-first search of the set of possible solutions. During the search, if an alternative doesn't work, the search backtracks to the choice point, the place which presented different alternatives, and tries the next alternative. If there are no more choice points, the search fails.

6. Randomized Algorithms
A randomized algorithm is defined as an algorithm that is allowed to access a source of independent, unbiased random bits, and is then allowed to use these random bits to influence its computation.

1.4 ALGORITHMIC COMPLEXITY

The time complexity of an algorithm is given by the number of steps taken by the algorithm to compute the function it was written for. The number of steps is itself a function of the instance characteristics. Although any specific instance may have several characteristics, the number of steps is computed as a function of some subset of these. Usually, we wish to know how the computing time increases as the number of inputs increases; in this case the number of steps is computed as a function of the number of inputs alone. For a different algorithm, we might be interested in determining how the computing time increases as the magnitude of one of the inputs increases; in this case the number of steps is computed as a function of the magnitude of this input alone. Thus, before the step count of an algorithm can be determined, we need to know which characteristics of the problem instance are to be used; these define the variables in the expression for the step count. In the case of Sum, we chose to measure the time complexity as a function of the number n of elements being added.
For the algorithm Add, the choice of characteristics was the number m of rows and the number n of columns in the matrices being added.
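The listing of Add does not appear in this extract; the following C sketch is a reconstruction of the usual m x n matrix addition it refers to:

    /* Reconstructed sketch of Add: elementwise sum of two m x n matrices. */
    void add(int m, int n, double a[m][n], double b[m][n], double c[m][n])
    {
        for (int i = 0; i < m; i++)              /* m rows */
            for (int j = 0; j < n; j++)          /* n columns */
                c[i][j] = a[i][j] + b[i][j];     /* one addition per element */
    }

Its step count is a function of both characteristics m and n, which is why both were chosen.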

1.4.1 SPACE COMPLEXITY

Consider an algorithm abc(a, b, c) that computes a + b + b*c + (a + b - c)/(a + b) + 4.0. The space needed by an algorithm is the sum of the following components:
1. A fixed part that is independent of the characteristics of the inputs and outputs. This part typically includes the instruction space, space for simple variables and fixed-size component variables, and space for constants.
2. A variable part that consists of the space needed by component variables whose size depends on the particular problem instance being solved, the space needed by reference variables, and the recursion stack space.
The space requirement S(P) of any algorithm P may therefore be written as
    S(P) = c + S_P,
where c is a constant (the fixed part) and S_P depends on the instance characteristics.

1.4.2 TIME COMPLEXITY

COMPLEXITY OF SIMPLE ALGORITHMS
The time complexity of an algorithm is given by the number of steps taken by the algorithm to compute the function it was written for. The time T(P) taken by a program P is the sum of its compile time and its run time. The compile time does not depend on the instance characteristics, and a compiled program will run several times without recompilation. So we concern ourselves with the run time of the program, denoted tP (instance characteristics).

If we knew the characteristics of the compiler to be used, we could proceed to determine the number of additions, subtractions, multiplications, divisions, compares, loads, stores and so on that would be made by the code for P. We could then obtain an expression for tP(n) of the form

    tP(n) = ca ADD(n) + cs SUB(n) + cm MUL(n) + cd DIV(n) + ...

where n denotes the instance characteristics; ca, cs, cm, cd and so on respectively denote the time needed for an addition, subtraction, multiplication, division and so on; and ADD, SUB, MUL, DIV and so on are functions whose values are the numbers of additions, subtractions, multiplications, divisions and so on that are performed when the code for P is used on an instance with characteristic n. The value of tP(n) for any n can be obtained only experimentally: the program is typed, compiled and run on a particular machine, the execution time is physically clocked, and tP(n) obtained. In a multiuser system, the execution time depends on factors such as the system load, the number of other programs running on the computer at the time P is run, the characteristics of these other programs, and so on.

A program step is loosely defined as a syntactically or semantically meaningful segment of a program that has an execution time independent of the instance characteristics. For example, consider the entire statement
    return a + b + b*c + (a + b - c)/(a + b) + 4.0;
of the program given below:

    Algorithm abc(a, b, c)
    {
        return a + b + b*c + (a + b - c)/(a + b) + 4.0;
    }

This line can be considered as a single step, since its execution time is independent of the instance characteristics. The number of steps any program statement is assigned depends on the kind of statement. For example, comments count as zero steps; an assignment statement which does not involve any calls to other algorithms is counted as one step; and in an iterative statement such as the for, while and repeat-until statements, we consider the step count only for the control part of the statement. The control parts of for and while statements have the following forms:

    for i := (expr) to (expr1) do
    while (expr) do
Each execution of the control part of a while statement is given a step count equal to the number of step counts assignable to (expr).

The step count for each execution of the control part of a for statement is one, unless the counts attributable to (expr) and (expr1) are functions of the instance characteristics. In the latter case, the first execution of the control part of the for has a step count equal to the sum of the counts for (expr) and (expr1); the remaining executions of the for statement have a step count of one; and so on.

We can determine the number of steps needed by a program to solve a particular problem instance in one of two ways. In the first method we introduce a new variable, count, into the program. This is a global variable with initial value zero. Statements to increment count by the appropriate amount are introduced into the program, so that each time a statement in the original program is executed, count is incremented by the step count of that statement.

EXAMPLES FOR TIME COMPLEXITY CALCULATION

Example 1: Sum of n numbers

Algorithm with count statements added:

    Algorithm Sum(a, n)
    {
        s := 0.0;
        count := count + 1;     // count is global; it is initially zero
        for i := 1 to n do
        {
            count := count + 1; // for the for statement
            s := s + a[i]; count := count + 1; // for the assignment
        }
        count := count + 1;     // for the last time of the for
        count := count + 1;     // for the return
        return s;
    }

Simplified version of the algorithm:

    Algorithm Sum(a, n)
    {
        for i := 1 to n do count := count + 2;
        count := count + 3;
    }

From the simplified version it follows that each invocation of Sum increases count by exactly 2n + 3.
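The same counting discipline can be rendered in C (a sketch assuming a global counter and 0-based arrays; the step assignments follow the pseudocode above):

    long count = 0;              /* global step counter, initially zero */

    double sum(double a[], int n)
    {
        double s = 0.0;
        count = count + 1;       /* for the assignment to s */
        for (int i = 0; i < n; i++) {
            count = count + 1;   /* for each test of the for condition */
            s = s + a[i];
            count = count + 1;   /* for the assignment to s */
        }
        count = count + 1;       /* for the last test of the for condition */
        count = count + 1;       /* for the return */
        return s;
    }
    /* Each call to sum(a, n) increases count by exactly 2n + 3. */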

Complexity calculation: for a recursive formulation of Sum (call it RSum; its listing is not reproduced here, but each call contributes two counted steps, as does the base case), the step count satisfies

    tRSum(n) = 2 + tRSum(n-1)
             = 2 + 2 + tRSum(n-2)
             = 2(2) + tRSum(n-2)
               :
             = n(2) + tRSum(0)
             = 2n + 2, for n >= 0.

Example 2: Complexity of the Fibonacci series

The Fibonacci series of numbers starts as
    0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, ...
Each new term is obtained by taking the sum of the two previous terms. If we call the first term of the sequence f0, then f0 = 0, f1 = 1, and in general
    fn = fn-1 + fn-2, n >= 2.
The listing below is numbered because the analysis that follows refers to individual lines.

     1  Algorithm Fibonacci(n)
     2  // Compute the nth Fibonacci number.
     3  {
     4      if (n <= 1) then
     5          write (n);
     6      else
     7      {
     8          fnm2 := 0; fnm1 := 1;
     9          for i := 2 to n do
    10          {
    11              fn := fnm1 + fnm2;
    12              fnm2 := fnm1; fnm1 := fn;
    13          }
    14          write (fn);
    15      }
    16  }

To analyse the time complexity of this algorithm, we need to consider the two cases (1) n = 0 or 1 and (2) n > 1; here s/e denotes the steps per execution of a line. When n = 0 or 1, lines 4 and 5 get executed once each. Since each line has an s/e of 1, the total step count for this case is 2. When n > 1, lines 4, 8 and 14 are each executed once. Line 9 gets executed n times, and lines 11 and 12 get executed n - 1 times each. Line 8 has an s/e of 2, line 12 has an s/e of 2, and line 13 has an s/e of 0. The remaining lines that get executed have s/e's of 1. The total number of steps for the case n > 1 is therefore 4n + 1.

1.5 ASYMPTOTIC NOTATION

Introduction
A problem may have numerous algorithmic solutions. In order to choose the best algorithm for a particular task, you need to be able to judge how long a particular solution will take to run. Or, more accurately, you need to be able to judge how long two solutions will take to run, and choose the better of the two. You don't need to know how many minutes and seconds they will take, but you do need some way to compare algorithms against one another.

Asymptotic complexity is a way of expressing the main component of the cost of an algorithm, using idealized units of computational work. Consider, for example, the algorithm for sorting a deck of cards which proceeds by repeatedly searching through the deck for the lowest card. The asymptotic complexity of this algorithm is the square of the number of cards in the deck. This quadratic behavior is the main term in the complexity formula; it says, e.g., that if you double the size of the deck, then the work is roughly quadrupled. The exact formula for the cost is more complex, and contains more details than are needed to understand the essential complexity of the algorithm. With our deck of cards, in the worst case the deck would start out reverse-sorted, so our scans would have to go all the way to the end. The first scan would involve scanning 52 cards, the next would take 51, etc. So the cost formula is 52 + 51 + ... + 1. Generally, letting N be the number of cards, the formula is 1 + 2 + ... + N, which equals (N + 1) * (N / 2) = (N^2 + N) / 2 = (1/2)N^2 + N/2. But the N^2 term dominates the expression, and this is what is key for comparing algorithm costs. (This is in fact an expensive algorithm; the best sorting algorithms run in sub-quadratic time.)

Asymptotically speaking, in the limit as N tends towards infinity, 1 + 2 + ... + N gets closer and closer to the pure quadratic function (1/2)N^2, and what difference does the constant factor of 1/2 make at this level of abstraction? So the behavior is said to be O(N^2).

Now let us consider how we would go about comparing the complexity of two algorithms.

Let f(n) be the cost, in the worst case, of one algorithm, expressed as a function of the input size n, and g(n) be the cost function for the other algorithm. E.g., for sorting algorithms, f(10) and g(10) would be the maximum number of steps that the algorithms would take on a list of 10 items. If, for all values of n >= 0, f(n) is less than or equal to g(n), then the algorithm with complexity function f is strictly faster. But, generally speaking, our concern for computational cost is for the cases with large inputs; so the comparison of f(n) and g(n) for small values of n is less significant than the "long term" comparison of f(n) and g(n) for n larger than some threshold.

Note that we have been speaking about bounds on the performance of algorithms, rather than giving exact speeds. The actual number of steps required to sort our deck of cards (with our naive quadratic algorithm) will depend upon the order in which the cards begin. The actual time to perform each of our steps will depend upon our processor speed, the condition of our processor cache, etc. It's all very complicated in the concrete details, and moreover not relevant to the essence of the algorithm.

1.5.1 BIG-O NOTATION

Definition
Big-O is the formal method of expressing the upper bound of an algorithm's running time. It's a measure of the longest amount of time it could possibly take for the algorithm to complete. More formally, for non-negative functions f(n) and g(n), if there exists an integer n0 and a constant c > 0 such that for all integers n > n0,
    f(n) <= c*g(n),
then f(n) is Big-O of g(n). This is denoted as "f(n) = O(g(n))". If graphed, c*g(n) serves as an upper bound to the curve you are analyzing, f(n).

O-Notation (Upper Bound)
This notation gives an upper bound for a function to within a constant factor. We write f(n) = O(g(n)) if there are positive constants n0 and c such that, to the right of n0, the value of f(n) always lies on or below c*g(n).

Theory Examples
So, let's take an example of Big-O. Say that f(n) = 2n + 8 and g(n) = n^2. Can we find a constant n0 so that 2n + 8 <= n^2 for all n >= n0? The number 4 works here, giving us 16 <= 16, and the inequality continues to hold for every n greater than 4. Since we're trying to generalize this for large values of n, and small values (1, 2, 3) aren't that important, we can say that f(n) grows no faster than g(n); that is, f(n) is bounded above by g(n) from n0 onwards. It could then be said that f(n) runs in O(n^2) time: "f-of-n runs in Big-O of n-squared time".

To find the upper bound (the Big-O time), assuming we know that f(n) is equal to (exactly) 2n + 8, we can take a few shortcuts. For example, we can remove all constants from the runtime; for large enough n they become irrelevant. This makes f(n) = 2n. Also, for convenience of comparison, we remove constant multipliers; in this case, the 2. This makes f(n) = n. It could also be said that f(n) runs in O(n) time; that lets us put a tighter (closer) upper bound onto the estimate.

Practical Examples
O(n): printing a list of n items to the screen, looking at each item once.
O(log n): taking a list of items and cutting it in half repeatedly until there's only one item left.
O(n^2): taking a list of n items and comparing every item to every other item.
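For instance, the last pattern looks like this in C (a sketch, not from the text); the doubly nested loop is what produces the quadratic bound:

    /* Compare every item to every other item: about n*(n-1)/2 comparisons, O(n^2). */
    int count_equal_pairs(const int a[], int n)
    {
        int pairs = 0;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)   /* each unordered pair exactly once */
                if (a[i] == a[j])
                    pairs++;
        return pairs;
    }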
1.5.2 BIG-OMEGA NOTATION

For non-negative functions f(n) and g(n), if there exists an integer n0 and a constant c > 0 such that for all integers n > n0,
    f(n) >= c*g(n),
then f(n) is Big-Omega of g(n). This is denoted as "f(n) = Ω(g(n))". This is almost the same definition as Big-O, except that here f(n) >= c*g(n); this makes g(n) a lower bound function instead of an upper bound function. It describes the best that can happen for a given data size. For example, 3n + 2 = Ω(n), since 3n + 2 >= 3n for all n >= 1.

Ω-Notation (Lower Bound)
This notation gives a lower bound for a function to within a constant factor. We write f(n) = Ω(g(n)) if there are positive constants n0 and c such that, to the right of n0, the value of f(n) always lies on or above c*g(n).

How asymptotic notation relates to analyzing complexity
Temporal comparison is not the only issue in algorithms; there are space issues as well.

Generally, a tradeoff between time and space is noticed in algorithms, and asymptotic notation empowers you to make that tradeoff. If you think of the amount of time and space your algorithm uses as a function of your data (time and space are usually analyzed separately), you can analyze how the time and space are handled when you introduce more data to your program. This is important in data structures, because you want a structure that behaves efficiently as you increase the amount of data it handles. Keep in mind, though, that algorithms that are efficient with large amounts of data are not always simple and efficient for small amounts of data. So if you know you are working with only a small amount of data and you have concerns for speed and code space, a tradeoff can be made for a function that does not behave well for large amounts of data.

A few examples of asymptotic notation
Generally, we use asymptotic notation as a convenient way to examine what can happen in a function in the worst case or in the best case. For example, if you want to write a function that searches through an array of numbers and returns the smallest one:

    function find-min(array a[1..n])
        let j := ∞
        for i := 1 to n:
            j := min(j, a[i])
        repeat
        return j
    end

Regardless of how big or small the array is, every time we run find-min we have to initialize the i and j integer variables and return j at the end. Therefore, we can just think of those parts of the function as constant and ignore them. So, how can we use asymptotic notation to discuss the find-min function? If we search through an array with 87 elements, then the for loop iterates 87 times, even if the very first element we hit turns out to be the minimum. Likewise, for n elements, the for loop iterates n times. Therefore we say the function runs in time O(n). What about this function:

    function find-min-plus-max(array a[1..n])
        // First, find the smallest element in the array
        let j := ∞
        for i := 1 to n:
            j := min(j, a[i])
        repeat
        let minim := j
        // Now find the biggest element, to be added to the smallest
        let j := -∞
        for i := 1 to n:
            j := max(j, a[i])
        repeat
        let maxim := j
        // return the sum of the two
        return minim + maxim
    end

What's the running time for find-min-plus-max? There are two for loops that each iterate n times, so the running time is clearly O(2n). Because 2 is a constant, we throw it away and write the running time as O(n). Why can you do this? If you recall the definition of Big-O notation, the function whose bound you're testing can be multiplied by some constant. If f(x) = 2x, we can see that if g(x) = x, then the Big-O condition holds. Thus O(2n) = O(n). This rule is general for the various asymptotic notations.

1.5.3 THETA

Definition: The function f(n) = Θ(g(n)) (read as "f of n is theta of g of n") iff there exist positive constants c1, c2 and n0 such that
    c1*g(n) <= f(n) <= c2*g(n) for all n >= n0.

Example: The function 3n + 2 = Θ(n), as 3n + 2 >= 3n for all n >= 2 and 3n + 2 <= 4n for all n >= 2, so c1 = 3, c2 = 4 and n0 = 2. Similarly, 3n + 3 = Θ(n), 10n^2 + 4n + 2 = Θ(n^2), 6*2^n + n^2 = Θ(2^n), and 10*log n + 4 = Θ(log n). On the other hand, 3n + 2 != Θ(1), 3n + 3 != Θ(n^2), 10n^2 + 4n + 2 != Θ(n), 10n^2 + 4n + 2 != Θ(1), 6*2^n + n^2 != Θ(n^2), 6*2^n + n^2 != Θ(n^100), and 6*2^n + n^2 != Θ(1).
The theta notation is more precise than both the big-oh and big-omega notations: f(n) = Θ(g(n)) iff g(n) is both a lower and an upper bound on f(n).

1.5.4 Little oh

Definition: The function f(n) = o(g(n)) (read as "f of n is little oh of g of n") iff
    lim (n -> infinity) f(n)/g(n) = 0.

Example: The function 3n + 2 = o(n^2), since lim (n -> infinity) (3n + 2)/n^2 = 0. Similarly, 3n + 2 = o(n log n), 3n + 2 = o(n log log n), 6*2^n + n^2 = o(3^n), and 6*2^n + n^2 = o(2^n log n). However, 3n + 2 != o(n) and 6*2^n + n^2 != o(2^n).

1.5.5 Little omega

Definition: The function f(n) = ω(g(n)) (read as "f of n is little omega of g of n") iff
    lim (n -> infinity) g(n)/f(n) = 0.

Example: 3n + 2 = ω(1), since lim (n -> infinity) 1/(3n + 2) = 0; however, 3n + 2 != ω(n).

As an application of these notations, consider again the iterative algorithm for summing n numbers:

    Algorithm Sum(a, n)
    {
        s := 0.0;
        for i := 1 to n do
            s := s + a[i];
        return s;
    }

    Alg. 1: Iterative function for sum

For this algorithm Sum we determined that tSum(n) = 2n + 3. So tSum(n) = Θ(n).

Asymptotic Notation Properties
Let f(n) and g(n) be asymptotically positive functions. Prove or disprove each of the following conjectures:
a. f(n) = O(g(n)) implies g(n) = O(f(n)).
b. f(n) + g(n) = Θ(min(f(n), g(n))).
c. f(n) = O(g(n)) implies lg(f(n)) = O(lg(g(n))), where lg(g(n)) >= 1 and f(n) >= 1 for all sufficiently large n.
d. f(n) = O(g(n)) implies 2^f(n) = O(2^g(n)).
e. f(n) = O((f(n))^2).
f. f(n) = O(g(n)) implies g(n) = Ω(f(n)).
g. f(n) = Θ(f(n/2)).
h. f(n) + o(f(n)) = Θ(f(n)).

Comparison of functions
Many of the relational properties of real numbers apply to asymptotic comparisons as well. For the following, assume that f(n) and g(n) are asymptotically positive.

Transitivity:
    f(n) = Θ(g(n)) and g(n) = Θ(h(n)) imply f(n) = Θ(h(n)),
    f(n) = O(g(n)) and g(n) = O(h(n)) imply f(n) = O(h(n)),
    f(n) = Ω(g(n)) and g(n) = Ω(h(n)) imply f(n) = Ω(h(n)),
    f(n) = o(g(n)) and g(n) = o(h(n)) imply f(n) = o(h(n)),
    f(n) = ω(g(n)) and g(n) = ω(h(n)) imply f(n) = ω(h(n)).

Reflexivity:
    f(n) = Θ(f(n)),
    f(n) = O(f(n)),
    f(n) = Ω(f(n)).

Symmetry:
    f(n) = Θ(g(n)) if and only if g(n) = Θ(f(n)).

Transpose symmetry:
    f(n) = O(g(n)) if and only if g(n) = Ω(f(n)),
    f(n) = o(g(n)) if and only if g(n) = ω(f(n)).

Because these properties hold for asymptotic notations, one can draw an analogy between the asymptotic comparison of two functions f and g and the comparison of two real numbers a and b:
    f(n) = O(g(n)) is similar to a <= b,
    f(n) = Ω(g(n)) is similar to a >= b,
    f(n) = Θ(g(n)) is similar to a = b,
    f(n) = o(g(n)) is similar to a < b,
    f(n) = ω(g(n)) is similar to a > b.

We say that f(n) is asymptotically smaller than g(n) if f(n) = o(g(n)), and f(n) is asymptotically larger than g(n) if f(n) = ω(g(n)).

One property of real numbers, however, does not carry over to asymptotic notations:

Trichotomy: For any two real numbers a and b, exactly one of the following must hold: a < b, a = b, or a > b.

Although any two real numbers can be compared, not all functions are asymptotically comparable. That is, for two functions f(n) and g(n), it may be the case that neither f(n) = O(g(n)) nor f(n) = Ω(g(n)) holds. For example, the functions n and n^(1 + sin n) cannot be compared using asymptotic notation, since the value of the exponent in n^(1 + sin n) oscillates between 0 and 2, taking on all values in between.
