Sunteți pe pagina 1din 36

Advanced Data Structures

Lecture 1: Introduction Mircea Marin

October 1 2013

Mircea Marin

Advanced Data Structures

Organizatorial items
Lecturer and TA: Mircea Marin
email: mmarin@info.uvt.ro

Course objectives:
1

Become familiar with some of the advanced data structures and algorithms which underpin much of todays computer programming Recognize the data structure and algorithms that are best suited to model and solve your problem Be able to perform a qualitative analysis of your algorithms time complexity analysis

Course webpage: http://web.info.uvt.ro/mmarin/lectures/ADS Handouts: will be posted on the webpage of the lecture Grading: 50% nal exam (written), 50% lab + homework Attendance: required
Mircea Marin Advanced Data Structures

Organizatorial items
Lab work: implementation in C++ or Java of applications, using data structures and algorithms presented in this lecture Requirements:
Be up-to-date with the presented material Prepare the programming assignments for the stated deadlines

Recommended textbooks
Cormen, Leiserson, Rivest. Introduction to Algorithms. MIT Press. Adam Drozdek. Data structures and algorithms in C++. Third edition, Thomson Course Technology, 2005. Aho, Hopcroft, Ullmann. Data structures and algorithms, Addison-Wesley, 1985.

Mircea Marin

Advanced Data Structures

From problems to programs

Mircea Marin

Advanced Data Structures

From problems to programs


Initial problems have no simple, precise specications

Mircea Marin

Advanced Data Structures

From problems to programs


Initial problems have no simple, precise specications Identify a good data structure to model the problem

Mircea Marin

Advanced Data Structures

From problems to programs


Initial problems have no simple, precise specications Identify a good data structure to model the problem
Familiarization with the most common data structures used in computer science is a good prerequisite!

Mircea Marin

Advanced Data Structures

From problems to programs


Initial problems have no simple, precise specications Identify a good data structure to model the problem
Familiarization with the most common data structures used in computer science is a good prerequisite!

Find a solution in the form of an algorithm, using the selected model


Algorithm =

Mircea Marin

Advanced Data Structures

From problems to programs


Initial problems have no simple, precise specications Identify a good data structure to model the problem
Familiarization with the most common data structures used in computer science is a good prerequisite!

Find a solution in the form of an algorithm, using the selected model


Algorithm = nite sequence of instructions, each of which has a clear meaning and can be performed with a nite amount of effort in a nite length of time.

Mircea Marin

Advanced Data Structures

From problems to programs


Initial problems have no simple, precise specications Identify a good data structure to model the problem
Familiarization with the most common data structures used in computer science is a good prerequisite!

Find a solution in the form of an algorithm, using the selected model


Algorithm = nite sequence of instructions, each of which has a clear meaning and can be performed with a nite amount of effort in a nite length of time. Do not confuse algorithm with program!

Mircea Marin

Advanced Data Structures

From problems to programs


Initial problems have no simple, precise specications Identify a good data structure to model the problem
Familiarization with the most common data structures used in computer science is a good prerequisite!

Find a solution in the form of an algorithm, using the selected model


Algorithm = nite sequence of instructions, each of which has a clear meaning and can be performed with a nite amount of effort in a nite length of time. Do not confuse algorithm with program!
Algorithm is language-independent: it can be written in pseudocode

Mircea Marin

Advanced Data Structures

From problems to programs


Initial problems have no simple, precise specications Identify a good data structure to model the problem
Familiarization with the most common data structures used in computer science is a good prerequisite!

Find a solution in the form of an algorithm, using the selected model


Algorithm = nite sequence of instructions, each of which has a clear meaning and can be performed with a nite amount of effort in a nite length of time. Do not confuse algorithm with program!
Algorithm is language-independent: it can be written in pseudocode To write a program, you need to choose a programming language

Mircea Marin

Advanced Data Structures

From problems to programs


A modeling scenario

Problem specication:
intersection of 5 roads: A, B, C, D, E; Roads C and E are oneway; There are 13 possible turns: AB, EC, etc. Design a light intersection system to avoid car collisions.

Chosen model: a graph with


nodes = possible turns: {AB , AC , AD , BA, BC , BD , DA, DB , DC , EA, EB , EC , ED } edges are between turns that can be performed simultaneously.
Mircea Marin Advanced Data Structures

From problems to programs


A modeling scenario (contd.)

Graph coloring: assignment of a color to each node so that no two vertices connected by an edge have the same color. R EMARK : Our problem is the same as coloring the graph of incompatible turns using as few colors as possible. Advantages of the chosen model: Graph are a well-known abstract data structure. Graph coloring is a well-studied problem.
Bad news: the problem is NP complete: to nd the best solution, we must essentially try all possibilities :-( nding optimal solution may be very inefcient. Alternative: be happy with a reasonably good solution, using a heuristic algorithm. E.g., use the following "greedy" algorithm.
Mircea Marin Advanced Data Structures

From problems to programs


A modeling scenario (contd.)

Example (Optimal coloring)

color blue red green yellow

turns AB , AC , AD , BA, DC , ED BC , BD , EA DA, DB EB , EC

Mircea Marin

Advanced Data Structures

From problems to programs


A modeling scenario (contd.)

Example (Optimal coloring)

color blue red green yellow

turns AB , AC , AD , BA, DC , ED BC , BD , EA DA, DB EB , EC

The "greedy" algorithm - informal description


1 2

Select some uncolored node and color it with a new color. Scan the list of uncolored nodes. For each node, determine whether it has an edge to any vertex already colored with the new color. If there is no such edge, color the present vertex with the new color.
Mircea Marin Advanced Data Structures

Remarks about the greedy algorithm

Does not always use the minimum possible number of colors. We can use the theory of algorithms to evaluate the goodness of the solution produced. k -clique = set of k nodes, every pair of which is connected by an edge. R EMARK 1: We need k colors to color a k -clique. R EMARK 2: The illustrated graph of incompatible turns contains a 4-clique AC , DA, BD , EB 4 colors are needed.

Mircea Marin

Advanced Data Structures

Pseudo-Language and Stepwise Renements


Illustrated algorithm: greedy

Mircea Marin

Advanced Data Structures

Pseudo-Language and Stepwise Renements


Illustrated algorithm: greedy

This algorithm can be rened to more specic implementation.

Mircea Marin

Advanced Data Structures

Pseudo-Language and Stepwise Renements


Illustrated algorithm: greedy

First renement: clarify the notion of adjacency

Mircea Marin

Advanced Data Structures

Pseudo-Language and Stepwise Renements


Illustrated algorithm: greedy

Second renement: clarify the notion of for each loop


procedure greedy (var G:GRAPH; var newclr :LIST); bool found ; int v , w ; begin while v ! = null do begin found := false w := first vertex in newclr if there is an edge between v and w in G then found :=true; w :=next vertex in newclr end if found =false do begin mark v colored; add v to newclr end v :=next uncolored vertex in G end { greedy }
Mircea Marin Advanced Data Structures

Representations of graphs
Adjacency matrix, etc.

Adjacency matrix
AB AB AC AD BA BC BD DA DB DC EA EB EC ED AC AD BA BC 1 BD 1 1 DA 1 1 DB 1 DC EA 1 1 1 EB 1 1 1 1 1 EC ED

1 1 1

1 1 1 1 1 1 1 1 1 1 1 1

1 1 1

1 1

1 1

Mircea Marin

Advanced Data Structures

Abstract Data Types (ADT)

ADT = mathematical model with a collection of operations dened on that model. ADT = generalization of primitive data types (integer, oat, char, boolean, etc.) Typical examples of ADT:
Lists, stacks, heaps, trees, sets, hash-tables, etc.

In OOP (e.g., C++), they can be implemented as classes.


Benets: encapsulation, generalization (by class extension), overloading, . . .

Mircea Marin

Advanced Data Structures

ADTs for the greedy algorithm


For colors: List Desirable operations:
init(colorList) - initialize the list of colors to be empty. rst(colorList) - return the rst element of the list, and null if the list is empty. next(colorList) - get the next element of the list, and null if there is no next member insert(c, colorList) - insert a new color c into the colorList

For colored graphs: Graph Desirable operations:


get the rst uncolored node, test whether there is an edge between two nodes, mark a node colored, get the next uncolored node, ...

Mircea Marin

Advanced Data Structures

Running time of a program

Factors on which it depends: Size of the input. Quality of compiler-generated code. Nature and speed of computer instructions. Time complexity of the underlying algorithm. The running time of a program should be dened as a function T (n) of the size n of input.

Mircea Marin

Advanced Data Structures

The Big-Oh, Big-Omega, and Big-Theta notation

Assumptions: T (n) : running time function, depending on input size n N. "Big-Oh" notation: T (n) is O (f (n)) if there are c R, c > 0, and n0 N such that T (n) c f (n) for all n n0 . "Big-Omega" notation: T (n) is (f (n)) if there are c R, c > 0 such that T (n) c f (n) for innitely many values of n. "Big -Theta" notation: T (n) is (f (n)) if there are c1 , c2 R, c1 , c2 > 0, and n0 N such that c1 f (n) T (n) c2 f (n) for all n n0 .

Mircea Marin

Advanced Data Structures

Desirable running times


Polynomial running time: T (n) is O (nk ) for some k > 0.
Algorithms with polynomial running time are called tractable.

Question: Algorithm A1 has running time 5 n3 and A2 has running time 100 n2 . Which alg. has better running time?

Mircea Marin

Advanced Data Structures

Desirable running times


Polynomial running time: T (n) is O (nk ) for some k > 0.
Algorithms with polynomial running time are called tractable.

Question: Algorithm A1 has running time 5 n3 and A2 has running time 100 n2 . Which alg. has better running time?

Mircea Marin

Advanced Data Structures

Desirable running times


Polynomial running time: T (n) is O (nk ) for some k > 0.
Algorithms with polynomial running time are called tractable.

Question: Algorithm A1 has running time 5 n3 and A2 has running time 100 n2 . Which alg. has better running time?

Answer: A1 if n < 20; A2 otherwise.


Mircea Marin Advanced Data Structures

Calculating the running time of program

Rule for sums for program P = P1 ; . . . ; Pk made of sequence of programs P1 ,. . . , Pk .


If Pi is O (fi (n)) for i = 1, . . . , k then P is O (f (n)) where f (n) = max{fi (n) | 1 i k }.
If there exists n0 N and g (n) such that fi (n) g (n) for all n n0 then P is in O (g (n)).

Rule for products: If T1 (n) is O (f (n)) and T2 (n) is O (g (n)) then T1 (n) T2 (n) is O (f (n)g (n)).

Mircea Marin

Advanced Data Structures

Case study: Bubble-sort

Input size: n Question: What is T (n)?

Mircea Marin

Advanced Data Structures

General rules for the analysis of programs


A SSUMPTION : The only parameter for the running time function T is the input size (n)
The running time of each assignment, read, and write statement can usually be taken to be O (1). The running time of a sequence of stmts. is given by sum rule. That is, the running time of the sequence is, to within a constant factor, the largest running time of any stmt. in the sequence. An if statement has running time = time for evaluating condition (usually O (1)) + time for evaluating the corresponding branch. The running time to execute a loop is the sum, over all times around the loop, of the time to execute the body and the time to evaluate the condition for termination (usually the latter is O (1)). Often this time is, neglecting constant factors, the product of the number of times around the loop and the largest possible time for one execution of the body, but we must consider each loop separately to make sure.
Mircea Marin Advanced Data Structures

Exercise 1

What is the running time T (n) for the computation of fact (n) by the following recursive program:

Mircea Marin

Advanced Data Structures

Exercise 3

Give, using big oh notation, the worst case running times of the following procedures as a function of n.

Mircea Marin

Advanced Data Structures

From algorithms to programs

Object-oriented programming languages provide good support for abstract data types
Programming in C++ is encouraged Programming in Java is also ok

Modern OOP languages, including C++ and Java, have built-in implementations of several advanced data structures (ADS) and algorithms presented in this course
The missing ADS and algorithms are not hard to implement by yourself

C++ provides the Standard Template Library STL to work with efcient implementations of data structures.

Mircea Marin

Advanced Data Structures

References

Alfred V. Aho et al. Data Structures and Algorithms. Chapter 1: Design and Analysis of Algorithms. 1983. Addison-Wesley.

Mircea Marin

Advanced Data Structures

S-ar putea să vă placă și