Andrew Ensor
School of Engineering, Computer and Mathematical Sciences
Spring 2019
Course Information
Instructors
Assoc. Professor Andrew Ensor          Dr Maryam Doborjeh
andrew.ensor@aut.ac.nz                 maryam.gholami.doborjeh@aut.ac.nz
WT609                                  WT701
921-9999 ext 8485
Office Hours Students are very welcome to discuss with their instructor prob-
lems regarding the course or other matters. Office hours set aside each
week specifically for student questions will be announced in class.
AUT online Students are encouraged to regularly check the course web site on
AUT online at https://blackboard.aut.ac.nz/. This web site contains
class announcements, discussion forums, assignment information, class re-
sources, as well as updated class marks.
Course Work The course work grade will be based equally on four practical
computer assignments and is worth a total of 40% of the final grade.
These assignments are to be completed in the student’s own time and
assignments should be submitted in the correct assignment folder on AUT
online by 5pm on the due date.
Late Assignments The policy with late assignments is that for each day an
assignment is late it has one fifth of its marks deducted. However each
student is entitled to a total of three grace days throughout the semester
before they are penalized.
Mid-Semester Test This is a two hour test worth 10% of the final grade.
Final Examination This is a three hour test scheduled during the examina-
tion period. It is worth 50% of the final grade.
Textbook The textbook for this course is Introduction to Algorithms by Cormen, Leiserson, Rivest, and Stein.
Collaboration Students are welcome to discuss their assignment work with
their instructor and with other students. However, any group work must
clearly state the contribution of each member of the group. Hence no stu-
dent may receive any part of an assignment from another person, whether
in printed or electronic form without the proper acknowledgement. Fail-
ure to abide by this will result in the assignment not being accepted. Any
form of cheating during the mid-semester test or the final examination is
not acceptable.
Timetable for Spring 2019

Week 1 (Wednesday 17 July): Class 1 Mutex Algorithms; Class 2 Semaphores and Monitors
Week 2 (Wednesday 24 July): Class 3 Client-Server Framework; Class 4 Creational Patterns
Week 3 (Wednesday 31 July): Class 5 Structural Patterns; Class 6 Behavioral Patterns
Week 4 (Wednesday 7 August): Class 7 Basic Analysis; Class 8 Recurrence Analysis
Week 5 (Wednesday 14 August): Class 9 Divide-and-Conquer Technique; Class 10 Dynamic Programming
Week 6 (Wednesday 21 August): Class 11 Elements of Dynamic Programming; Class 12 Greedy Technique
Week 7 (Wednesday 28 August): Class 13 Red-Black Trees; Class 14 Augmenting Data Structures

Mid-Semester Break (2 September to 13 September)

Week 8 (Wednesday 18 September): Class 15 B-Trees; Class 16 Disjoint Sets
Week 9 (Wednesday 25 September): Class 17 Elementary Graph Algorithms; Class 18 Minimal Spanning Trees
Week 10 (Wednesday 2 October): Class 19 Single-Source Shortest Paths; Class 20 All-Pairs Shortest Paths
Week 11 (Wednesday 9 October): Class 21 Maximum Flow; Class 22 Network Routing
Week 12 (Wednesday 16 October): Class 23 Matrix Operations; Class 24 Fast Fourier Transforms

Examination Period (21 October to 8 November)
Contents

1 Concurrency Algorithms
  1.1 Mutex Algorithms
  1.2 Semaphores and Monitors
  1.3 Client-Server Framework

2 Design Patterns
  2.1 Creational Patterns
  2.2 Structural Patterns
  2.3 Behavioral Patterns

3 Algorithmic Analysis
  3.1 Basic Analysis
  3.2 Recurrence Analysis

4 Design Techniques
  4.1 Divide-and-Conquer Technique
  4.2 Dynamic Programming
  4.3 Elements of Dynamic Programming
  4.4 Greedy Technique
Chapter 1

Concurrency Algorithms

1.1 Mutex Algorithms

When several threads share a variable x, even a simple operation such as incrementing x is typically performed as three machine instructions: the value of x is loaded into a register (LOAD R,x), the register is incremented (INC R), and the register is stored back to x (STORE R,x). If the operating system swaps between threads part way through these instructions the threads can interfere with each other. For instance, if a second thread loads x before the first thread's STORE instruction has executed, it loads the original value:
Thread A              Thread B
LOAD R,x
INC R
                      LOAD R,x    ; original value loaded
STORE R,x
If instead the STORE instruction does get executed first then the second thread
loads the incremented value of x:
Thread A              Thread B
LOAD R,x
INC R
STORE R,x
                      LOAD R,x    ; incremented value loaded
Worse still, if both threads attempt to modify the value of x then the results
are unpredictable. To illustrate this suppose two threads concurrently attempt
to increment the value of x. Most testing would show that x is incremented by
two:
Thread A              Thread B
LOAD R,x
INC R
STORE R,x
                      LOAD R,x    ; incremented value loaded
                      INC R
                      STORE R,x   ; incremented by two
However, if the operating system happens to swap between the two threads part
way through the three instructions then both threads increment the original
value of x.
Thread A              Thread B
LOAD R,x
INC R
                      LOAD R,x    ; original value loaded
                      INC R
STORE R,x
                      STORE R,x   ; incremented by one
This is known as a lost update since it appears as though one of the increment
statements has been ignored. Such problems can be difficult to detect, making
debugging multithreaded code difficult, so it is important to have a very clear
understanding of concurrency issues. The class LostUpdate demonstrates how serious the problem can become: due to compiler optimizations of the loop (probably holding the value of x in a register for efficiency, just loading its value at the start of the loop and storing it at the end) the net result of one entire thread might be lost.

/**
 A class that demonstrates the lost update problem in concurrency
 by creating two threads that concurrently try to increment x
 each a total of ITERATIONS times.
 Sometimes the final value of x is not 2*ITERATIONS
 @author Andrew Ensor
*/
public class LostUpdate
{  private int x; // the shared variable incremented by both threads

   public LostUpdate()
   {  x = 0;
   }
   // remainder of the class (the two incrementing threads) is
   // abbreviated in these notes
}
A critical section is a segment of code that should only be executed by one
thread at a time and mutual exclusion refers to the synchronization of thread
access to critical sections so that only one thread can be inside the code at a time.
If data get modified by multiple threads then any code that modifies the data is
considered a critical section and so some algorithm is required to ensure mutual
exclusion. Mutual exclusion can be obtained for a critical section by insisting
that each thread t must acquire a unique lock before it can enter the critical
section and only release the lock once it has completed the critical section.
In practice, a multithreaded system calls a method such as Acquire-Lock
whenever a thread is about to enter a critical section of code, which blocks
that thread until the lock is available. When the lock becomes available the
Acquire-Lock method then unblocks one blocked thread so that it completes
the method and starts executing code in the critical section. When the thread
completes the critical section the system calls a method such as Release-Lock
to release the lock so that it becomes available for other threads.
An algorithm for implementing mutual exclusion via the Acquire-Lock
and Release-Lock methods is called a mutex algorithm. A first attempt at
implementing a mutex algorithm might be to include a boolean flag exclude
that is initially false but is set to true when a thread t enters the critical sec-
tion, as given by the following pseudocode (a compact and language-independent
way of specifying algorithms via a mixture of natural language and high-level
programming constructs):
Incorrect-Acquire-Lock(t)
1 while exclude do
2 delay
3 exclude ← true lock has been acquired by thread t
Incorrect-Release-Lock(t)
1 exclude ← false lock released by thread t
This first attempt is incorrect because the test of exclude in the while loop and the assignment on line 3 are not performed as one atomic operation: two threads might both find exclude false and then both set it to true, so both would enter the critical section. Dekker's algorithm overcomes this for two threads by using a boolean flag requested[t] for each thread t, together with a priority variable that determines which thread backs off when both request the lock at the same time.

Dekker-Acquire-Lock(t)
1 s ← the other thread than t
2 requested [t] ← true
3 while requested [s] do
4 if priority = s then
5 requested [t] ← false
6 while priority = s do
7 delay
8 requested [t] ← true
Dekker-Release-Lock(t)
1 s ← the other thread than t
2 priority ← s
3 requested [t] ← false
Dekker’s algorithm can be simplified to give Peterson’s algorithm, by taking
advantage of a race condition on a variable turn in the while loop. If one thread
is caught in the while loop because the other thread has also just requested the
lock, then when the other thread changes turn the first thread is released from
the loop and the second thread gets caught by it until the first releases the lock.
Peterson-Acquire-Lock(t)
1 s ← the other thread than t
2 requested [t] ← true
3 turn ← t
4 while turn = t and requested [s] do
5 delay thread s has not released lock
Peterson-Release-Lock(t)
1 requested [t] ← false
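As an illustration (a sketch that is not part of the original notes, with illustrative class and method names), Peterson's algorithm for two threads with ids 0 and 1 might be written in Java as follows, using AtomicBoolean and a volatile variable so that each thread sees the other's writes:

import java.util.concurrent.atomic.AtomicBoolean;

public class PetersonLock
{  private final AtomicBoolean[] requested =
      {new AtomicBoolean(false), new AtomicBoolean(false)};
   private volatile int turn;

   public void acquireLock(int t)
   {  int s = 1 - t; // the other thread than t
      requested[t].set(true);
      turn = t;
      while (turn == t && requested[s].get())
         Thread.yield(); // delay: thread s has not released the lock
   }

   public void releaseLock(int t)
   {  requested[t].set(false);
   }
}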
Peterson’s algorithm can be extended to work for more than two threads
but the algorithm grows in complexity. As an alternative, Lamport’s bakery
algorithm works with any number of concurrent threads much like customers
being assigned order numbers in a bakery. Each thread t that requests the lock
is assigned a number number [t], and threads are allowed access to the lock in
increasing order of the number. In the case where two threads might be assigned the same number, their distinct thread id numbers are used to break the tie.
The algorithm ensures that the lock is not provided to a thread while there is a
thread s with a lower number or that is part way through choosing a number.
Lamport-Acquire-Lock(t)
1 choosing[t] ← true
2 assign a number to thread t one larger than currently assigned maximum
3 choosing[t] ← false
4 for each thread s do
5 while choosing[s] do
6 delay thread s has not chosen a number
7 while (number [s] ≠ 0 and number [s] < number [t])
8 or (number [s] = number [t] and Id [s] < Id [t]) do
9 delay thread s has not released the lock
Lamport-Release-Lock(t)
1 number [t] ← 0 unassigned
Many processors provide an atomic test-and-set instruction, which in a single atomic step sets a boolean variable to a specified value and returns the value it previously held. Denoting this operation by TestAndSet, a much simpler mutex algorithm results:

TestAndSet-Acquire-Lock(t)
1 while TestAndSet(exclude, true) do
2 delay
TestAndSet-Release-Lock(t)
1 TestAndSet(exclude, false)
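Java exposes an atomic test-and-set through the java.util.concurrent.atomic package, whose AtomicBoolean class provides getAndSet. A sketch of the resulting spinning lock (the class and method names are illustrative, not from the notes):

import java.util.concurrent.atomic.AtomicBoolean;

public class TestAndSetLock
{  private final AtomicBoolean exclude = new AtomicBoolean(false);

   public void acquireLock()
   {  // getAndSet atomically stores true and returns the previous value
      while (exclude.getAndSet(true))
         Thread.yield(); // spin while another thread holds the lock
   }

   public void releaseLock()
   {  exclude.set(false);
   }
}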
1.2 Semaphores and Monitors

A semaphore generalizes a lock from guarding a single critical section to guarding a pool of resources: it maintains a counter s of the number of available resources, which a thread decrements when acquiring a resource and increments when releasing it.

Semaphore-Acquire-Resource(t)
1 following lines must all be one atomic operation
2 while s ≤ 0 do
3 delay
4 s←s−1
Semaphore-Release-Resource(t)
1 following line must be an atomic operation
2 s←s+1
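Java provides this counting behaviour directly in the java.util.concurrent.Semaphore class; the following usage sketch (not from the notes) guards three identical resources:

import java.util.concurrent.Semaphore;

public class ResourceUser
{  private static final Semaphore SEMAPHORE = new Semaphore(3); // s starts at 3

   public void useResource() throws InterruptedException
   {  SEMAPHORE.acquire(); // waits while s <= 0, then performs s <- s-1
      try
      {  // use one of the three shared resources here
      }
      finally
      {  SEMAPHORE.release(); // s <- s+1
      }
   }
}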
When a thread that holds a monitor notifies a waiting thread, there are two common conventions for deciding which thread then proceeds:

• the thread that has been made eligible to run continues execution with the
monitor whereas the thread that called notify is made to block until the
monitor is available (called a signal-and-exit monitor or a Hoare monitor),
• the thread that has been made eligible to run now blocks until the thread
that called notify relinquishes the monitor (called a signal-and-continue
monitor).
Note that the wait and notify operations allow for multiple threads to be
inside code synchronized on the same monitor at the same time, although only
one of the threads can hold the monitor at a time and so be executing code.
Each thread can use the notify operation to pass control to another thread to
resume executing synchronized code.
In Java any object can act as a signal-and-continue monitor (rather than a signal-and-exit monitor), and it inherits the Object methods:

wait to make the current thread relinquish the monitor and be placed in the monitor's wait set,

notify to remove one (arbitrarily chosen) thread from the wait set,

notifyAll to remove all the threads from the wait set.

A thread holds an object's monitor while it executes code synchronized on that object (such as a synchronized method). A thread can be made to relinquish its monitor part way
through the code and be put in the monitor’s wait set by calling the monitor’s
wait method. This is typically done by a thread when it holds the monitor inside
synchronized code but can not continue with its task until another thread has
done something. If another thread which currently holds the monitor calls the
monitor’s notify method then one arbitrary thread is notified and removed
from the wait set. More commonly, the thread holding the monitor calls the
monitor’s notifyAll method to remove all the waiting threads from the wait
set, giving them each a chance to proceed, and all but one are then made to wait
again (hence in Java the wait method usually appears within a while loop with
a condition that the notifying thread has changed before calling notifyAll).
Once removed from the wait set a thread remains blocked until the monitor
becomes available, as Java uses signal-and-continue monitors so the notifying
thread still holds the monitor (hence often a notifying thread calls notify or
notifyAll just before it completes the synchronized code). When the monitor
becomes available that thread then competes with other threads that might also
be trying to obtain the monitor.
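A minimal sketch of this usual wait/notifyAll idiom (the names are illustrative, not from the notes):

public class ReadySignal
{  private boolean ready = false;

   public synchronized void waitUntilReady() throws InterruptedException
   {  while (!ready) // condition is re-tested each time the thread is woken
         wait();     // relinquishes the monitor while waiting
   }

   public synchronized void makeReady()
   {  ready = true;
      notifyAll();   // removes all waiting threads from the wait set
   }
}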
Threads can be managed by the Thread methods:
setPriority changes the thread priority between the limits Thread.MIN_PRIORITY and Thread.MAX_PRIORITY,
start puts the thread in the runnable state so that it begins executing the run
method of a Runnable object (that was specified to the Thread construc-
tor),
sleep (static method) indicates that the currently executing thread should
not get any further processor time for a specified number of milliseconds
(the thread does not relinquish any monitors it holds),
yield (static method) indicates that the currently executing thread should
give up its current slice of processor time (the thread does not relinquish
any monitors it holds),
join blocks the currently executing thread until the thread whose join method
was called is dead,
interrupt interrupts the thread from its sleep, wait, join, or blocking on an
I/O operation.
Threads can also be assigned to a ThreadGroup object when they are created,
allowing them to be worked with together as a group. The example class
BlockingLock demonstrates how the monitor wait and notify methods can
be used to implement a blocking lock in Java.
One common application of synchronization arises in coordinating access
to a shared database. The reader-writer problem requires synchronization so
that if a thread is writing to the database then no other thread can read from
nor write to it, but multiple threads should be able to read concurrently when
no thread is writing. The reader-writer problem can be solved using a lock
writeLock to ensure that other threads cannot read nor write while one thread
is writing, a count numReaders of the number of threads currently reading, and
another lock readLock to ensure that reading threads synchronize their access
to numReaders.
/**
 A class that represents a blocking (rather than spinning) lock
 where the BlockingLock instance is used as a monitor to control
 access to its acquireLock and releaseLock methods
 @author Andrew Ensor
*/
public class BlockingLock
{  private boolean s; // whether the lock is currently available

   public BlockingLock()
   {  s = true; // initially lock is available
   }
   // acquireLock waits (in a while loop) until s is true and then
   // sets it to false; releaseLock sets s back to true and calls
   // notifyAll; the full listing is abbreviated in these notes
}
Read(t)
1 Acquire-Lock(readLock , t)
2 if numReaders = 0 then block any writing threads
3 Acquire-Lock(writeLock , t)
4 numReaders ← numReaders +1
5 Release-Lock(readLock , t)
6 Read from the database
7 Acquire-Lock(readLock , t)
8 numReaders ← numReaders −1
9 if numReaders = 0 then unblock any writing threads
10 Release-Lock(writeLock , t)
11 Release-Lock(readLock , t)
Write(t)
1 Acquire-Lock(writeLock , t)
2 Write to the database
3 Release-Lock(writeLock , t)
This algorithm can suffer from a flaw that is common in concurrent programming
known as starvation, where a thread is forever blocked from obtaining a lock. In
this case if threads frequently read then any thread that wants to write might
never be able to acquire writeLock .
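As an aside (not part of the original notes), the Java standard library provides java.util.concurrent.locks.ReentrantReadWriteLock, which implements exactly this reader-writer policy; constructing it with its fairness parameter set to true grants the locks in approximate arrival order, which helps prevent frequent readers from starving a writer:

import java.util.concurrent.locks.ReentrantReadWriteLock;

public class SharedDatabase
{  private final ReentrantReadWriteLock rwLock =
      new ReentrantReadWriteLock(true); // fair ordering of waiting threads

   public void read()
   {  rwLock.readLock().lock(); // many readers may hold this concurrently
      try
      {  // read from the database
      }
      finally
      {  rwLock.readLock().unlock();
      }
   }

   public void write()
   {  rwLock.writeLock().lock(); // exclusive: no readers nor other writers
      try
      {  // write to the database
      }
      finally
      {  rwLock.writeLock().unlock();
      }
   }
}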
The dining philosopher problem is another classic synchronization problem
that illustrates the need for coordinated access to shared resources. It involves
five philosophers sitting around a table with five chopsticks (the shared re-
source), where a single chopstick is between each adjacent pair of philosophers.
The philosophers sit and think, but from time to time they get hungry. In
order to eat they need to pick up the chopstick on either side of them, which
prevents the philosophers on either side from eating. Whenever a philosopher
stops eating the chopsticks are made available for the philosophers on either
side to use.
A simple attempt to solve the dining philosopher problem might be to use a
lock for controlling access to each chopstick, ensuring that only one philosopher
can hold each chopstick at a time. When a philosopher is hungry they could
try to acquire the lock for the chopstick on their left and on their right. The
classes Philosopher and DiningPhilosophersProblem demonstrate how this
strategy could be implemented, where each philosopher repeatedly changes from
thinking, to hungry (where the philosopher tries to acquire both chopsticks),
to eating. Although this approach might appear to work it does have a very
significant defect which is eventually apparent if the calls to sleep are removed
from Philosopher. The DiningPhilosophersProblem class risks resulting in
deadlock. A deadlock is a situation where no thread can progress because every
thread is waiting on resources held by another thread which in turn is also
waiting. In this example, since each philosopher first picks up the chopstick on
the left and holds it until the chopstick on the right is available, eventually the scenario can arise in which each philosopher holds the chopstick on their left
and waits forever for the chopstick on the right.
The class DiningPhilosophersSolution demonstrates how the deadlock in
the dining philosophers problem can be fixed. A DiningPhilosophersSolution
uses itself as a monitor to ensure that chopsticks on either side of a philosopher
are available before either is picked up and that only one philosopher at a time
is in the process of acquiring the chopsticks. Although this approach does avoid
deadlock it is not guaranteed to be free from starvation.
/**
A class that represents a philosopher for the Dining Philosophers
problem (remove sleep and try/catch to make problem more evident)
@see DiningPhilosophersSolution.java
*/
import java.util.Random;
/**
An implementation of the dining philosophers problem using blocking
locks. Note that this implementation might cause a deadlock
@see DiningPhilosophersSolution.java
*/
public class DiningPhilosophersProblem
{  private static final int NUM = 5;  // the number of philosophers
   private BlockingLock[] chopsticks; // one lock per chopstick

   public DiningPhilosophersProblem()
   {  chopsticks = new BlockingLock[NUM];
      for (int i=0; i<NUM; i++)
         chopsticks[i] = new BlockingLock();
   }
   // remainder of the class is abbreviated in these notes
}
/**
An implementation of the dining philosophers problem using a
monitor. Note that this implementation does not cause deadlock
but may cause starvation of a thread
@author Andrew Ensor
*/
public class DiningPhilosophersSolution
{  private static final int NUM = 5; // the number of philosophers
   private enum State {THINKING, HUNGRY, EATING}; // states from the text
   private State[] philState; // the current state of each philosopher

   public DiningPhilosophersSolution()
   {  philState = new State[NUM];
      for (int i=0; i<NUM; i++)
         philState[i] = State.THINKING;
   }
   // remainder of the class is abbreviated in these notes
}
/**
A class that represents an element of generic type E that includes
a method for swapping two references which can cause deadlock
@author Andrew Ensor
*/
1.3 Client-Server Framework

In Java a server listens for connection requests from clients by creating a ServerSocket bound to a port on the host machine. When a client requests a connection to that port (via its own socket)
the server socket creates a new socket to handle the server end of the connection.
A client can be implemented in Java by the following steps:

Request a connection The client creates a socket, specifying the host name and the port on which the server is listening, typically:

Socket socket = new Socket(hostName, port);

Obtain streams Input and/or output streams are obtained from the socket and are layered with appropriate filtering streams:

PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
BufferedReader in = new BufferedReader(
   new InputStreamReader(socket.getInputStream()));
Communicate The client uses the streams to communicate with the server
using an agreed protocol (either a standard protocol such as HTTP or
SMTP or a custom protocol for the application):
out.println(request);
String serverResponse = in.readLine();
Close connection When completed the client cleans up by closing the streams
and then the socket:
out.close();
in.close();
socket.close();
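Putting the client steps together (a runnable sketch, not from the notes, with a hypothetical host name and port):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.Socket;

public class SimpleClient
{  public static void main(String[] args) throws IOException
   {  Socket socket = new Socket("localhost", 8000); // hypothetical server
      PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
      BufferedReader in = new BufferedReader(
         new InputStreamReader(socket.getInputStream()));
      out.println("hello");               // send a request
      System.out.println(in.readLine());  // read the reply
      out.close();
      in.close();
      socket.close();
   }
}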
A server must be running on the host machine before any client requests
a connection. Implementing a server is similar to a client but just requires a
few extra steps since a typical server should be able to handle multiple clients
concurrently:
Create a server socket The server states its intention to listen on a specific port by creating a ServerSocket object for that port. It might be configured with a timeout:

ServerSocket serverSocket = new ServerSocket(port);
serverSocket.setSoTimeout(timeout); // optional timeout on accept

Accept connections The server blocks in the accept method until a client requests a connection, typically starting a new thread to handle each connection:

Socket socket = serverSocket.accept();

Obtain streams Input and/or output streams are obtained from the socket and are layered with appropriate filtering streams:

PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
BufferedReader in = new BufferedReader(
   new InputStreamReader(socket.getInputStream()));
Communicate The server thread uses the streams to communicate with the
client using an agreed protocol:
String clientRequest = in.readLine();
out.println(response);
Close connection When the communication has completed with that client
the server cleans up by closing the streams and then the socket (if the
server itself is completed then it also closes the server socket):
out.close();
in.close();
socket.close();
For example, the classes GuessClient and GuessServer demonstrate a sim-
ple networked application between one or more clients that try to guess random
numbers which are determined by a server. Note that these two classes have no
reference to each other, each client just knows the host name, correct port for
the server, and the protocol to use for the communication. The protocol used
in this example is as follows:
• the server initiates the communication by sending the Unicode message
Guess the number between m and n inclusive,
• the client then responds with a number,
• the server replies to each guess in turn, ending the game when it responds with the message Correct guess!.
/**
A class that represents a client in a number guessing game
@see GuessServer.java
*/
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.net.Socket;
import java.util.Scanner; // Java 1.5 equivalent of cs1.Keyboard
public class GuessClient
{  public GuessClient()
   {
   }
   // remainder of the class (connecting to the server and playing the
   // game using the above protocol) is abbreviated in these notes
}
/**
A class that represents a server in a number guessing game where
GuessClient objects connect to this GuessServer and try to guess
a random integer value between min (incl) and max (excl)
The game initiates with a response from the server and ends when
the server responds with "Correct guess!"
@author Andrew Ensor
*/
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Random;
// within the method of GuessServer that runs the server loop:
try
{ while (!stopRequested)
{ // block until the next client requests a connection
// note that the server socket could set an accept timeout
Socket socket = serverSocket.accept();
System.out.println("Connection made with "
+ socket.getInetAddress());
// start a game with this connection, note that a server
// might typically keep a reference to each game
GuessGame game = new GuessGame(socket,
generator.nextInt(max-min)+min);
Thread thread = new Thread(game);
thread.start();
}
serverSocket.close();
}
catch (IOException e)
{ System.err.println("Can’t accept client connection: " + e);
}
System.out.println("Server finishing");
}
// stops server AFTER the next client connection has been made
// (since this server socket doesn’t timeout on client connections)
public void requestStop()
{ stopRequested = true;
}
Chapter 2

Design Patterns

Design patterns are general, reusable solutions to commonly occurring design problems. They are usually grouped into three categories:

creational patterns which deal with how instances of objects are created,

structural patterns which describe how classes and objects can be combined to form larger structures,

behavioral patterns which describe how classes and objects interact and distribute responsibility among one another.

2.1 Creational Patterns
Creational patterns deal with the best way of creating instances of objects.
Objects are always created in a language such as Java via a constructor:

SomeClass object = new SomeClass();
Placing such a statement in a program requires that the most appropriate class
and constructor be known when the program is coded. However, sometimes the
exact nature of the object might vary, so it might be preferable to delegate the
responsibility of calling an appropriate constructor to some class that is better
suited to determining the appropriate object to instantiate.
The Factory Pattern is a creational pattern which uses a class commonly
known as a factory class to create instances of objects. The factory class has
methods for instantiating and returning an object instance. The constructor
and class used to create the appropriate object are chosen by the factory from
one or more subclasses of some abstract class (or classes that all implement
the same interface). The client program is generally unaware of which actual
subclass was chosen by the factory to create the object.
[Class diagram for the Factory Pattern: a Client uses a Product; a Factory with a +getProduct():Product method produces Product instances, choosing between the subclasses ConcreteProductA and ConcreteProductB.]
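A minimal Java sketch of the pattern (illustrative names matching the diagram, not code from the notes):

public interface Product
{  String describe();
}

class ConcreteProductA implements Product
{  public String describe() { return "product A"; }
}

class ConcreteProductB implements Product
{  public String describe() { return "product B"; }
}

class Factory
{  // the factory, not the client, decides which subclass to instantiate
   public Product getProduct(boolean special)
   {  if (special)
         return new ConcreteProductB();
      else
         return new ConcreteProductA();
   }
}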
[Class diagram: an address class with methods +setStreet(street:String):void, +setCity(city:String):void, +setPostalCode(code:int):void, +setRegion(region:String):void, +getCountry():String and +getFullAddress():String, together with a PhoneNumber class with methods +setPhoneNumber(phone:int):void and +getFullPhoneNumber():String.]
Typical New Zealand and French addresses and phone numbers appear as:
31 Symonds Street        Musée du Louvre
Auckland 1020            F-75058 Paris Cedex
New Zealand              FRANCE
+64 9 921 9999           +33 1 40 20 50 50
Prepare a suitable software design that has the flexibility for handling such in-
formation.
2.2 Structural Patterns

[Class diagrams for the Adapter Pattern: a Client uses a Requirement interface, which an Adapter implements by adapting an existing class.]

[Class diagram for the Composite Pattern: Leaf and Composite classes both implement a common Node interface.]
Decorators are widely used when performing input/output with streams. Fil-
tering streams such as BufferedInputStream, DataInputStream, and Input-
StreamReader all follow the Decorator Pattern, since they accept an existing
stream as a parameter to their constructor and manipulate the data in the
stream to give a new (decorated) stream.
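For instance (a brief sketch, assuming a hypothetical file name), each filtering stream decorates the stream passed to its constructor:

import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class StreamDecoration
{  public static void main(String[] args) throws IOException
   {  DataInputStream in = new DataInputStream(  // adds readInt etc
         new BufferedInputStream(                // adds buffering
            new FileInputStream("data.bin")));   // hypothetical file
      int value = in.readInt(); // the call passes through both decorators
      System.out.println(value);
      in.close();
   }
}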
/**
An example of a decorator for gui components
Adapted from The Design Patterns Java Companion by Cooper
@author Andrew Ensor
*/
import java.awt.BorderLayout;
import javax.swing.JComponent;
[Class diagram for the Façade Pattern: a client uses a Façade, which provides a simplified interface to the classes of a subsystem.]
/**
Decorator for any JComponent to change the foreground colour
whenever the mouse passes over the component
@see GUIDecorator.java
*/
import java.awt.Color;
import java.awt.Dimension;
import java.awt.Toolkit;
import java.awt.event.MouseAdapter;
import java.awt.event.MouseEvent;
import javax.swing.JButton;
import javax.swing.JComponent;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.JPanel;
import javax.swing.JTextField;
[Class diagram for the Proxy Pattern: a Client uses a Subject interface, which is implemented by both the RealSubject and a Proxy that holds a 0..1 reference to the RealSubject.]
/**
An abstract class that represents an image which is
obtained via a URL, as a demonstration of the Proxy Pattern
@author Andrew Ensor
*/
import java.awt.Image;
import java.net.URL;
Exercise 2.2 (Parity Check for Streams) Prepare a class called ParityOutputStream that extends the class FilterOutputStream and which uses the Decorator Pattern to add an even parity bit to every seven bits written to the stream. Take care with the last byte written and test your class with a simple driver program.
/**
A class that represents a ConcreteImage which is obtained via a URL,
and whose getImage method blocks until the image is loaded
@see RemoteImage.java
*/
import java.awt.Component;
import java.awt.Image;
import java.awt.MediaTracker;
import java.awt.Toolkit;
import java.net.URL;
import java.util.Properties;
/**
A class that represents a proxy image which is obtained via a URL,
with a temporary image displayed until the actual image is loaded
@see RemoteImage.java
*/
import java.awt.Component;
import java.awt.Image;
import java.net.URL;
import javax.swing.ImageIcon;
2.3 Behavioral Patterns

The Iterator Pattern provides a uniform way for a client to step through the elements of a collection one element at a time, without exposing how the collection is implemented.

[Class diagram: a Client uses a Collection<E> and obtains elements from an Iterator<E>, an interface with methods +next():E, +hasNext():boolean and +remove():void; a ConcreteCollection produces a ConcreteIterator.]
The Iterator Pattern does not specify the behaviour of an iterator if the
collection is modified somehow as it is iterating through its elements, such as if
another thread adds an element. A fail-fast iterator is an iterator that throws
an exception (such as a ConcurrentModificationException) if the collection
is modified in some way other than by using the iterator’s remove method.
Some iterators such as Java’s ListIterator interface provide further meth-
ods supporting bi-directional iteration through the elements in a collection. Usu-
ally such iterators are only used for collections that can somehow be efficiently
traversed in both forward and backward directions (such as a doubly-linked list).
The cursor position (index) of a bi-directional iterator is considered to always lie between elements, immediately after the (previous) element most recently returned by next.
For example, the JDBC API libraries use a database connection to create a statement of a specified type, either the default TYPE_FORWARD_ONLY (result set is not scrollable, meaning that the iterator moves only in one direction), TYPE_SCROLL_INSENSITIVE (result set is scrollable but not sensitive to database changes, meaning that the iterator can be jumped to any index), or TYPE_SCROLL_SENSITIVE (result set is scrollable and sensitive to database changes), and of a specified concurrency, either CONCUR_READ_ONLY (result set cannot be used to update the database) or CONCUR_UPDATABLE (result set can be used to update the database). One peculiarity with a ResultSet iterator is
that it requires an initial call to next to position the cursor at the first record:
Statement stmt = con.createStatement(type, concurrency );
String command = "SELECT ...";
ResultSet rs = stmt.executeQuery(command);
while (rs.next())
{ ...= rs.getXxx (...);
...
}
Another type of iterator is a filtered iterator which iterates through only
those elements of the collection that satisfy some condition (this portion of the
collection is called a view ). Others might allow some control over the order
in which the iterator returns the elements of the collection. For example, a
Java ME record store has a RecordEnumeration iterator for iterating through
its byte[] records, which allows an optional filter to filter out only certain
records, an optional comparator to iterate the elements in a certain order, and a
boolean to indicate whether the iterator should be kept updated with (sensitive
to) changes in the record store:
RecordEnumeration re = rs.enumerateRecords(filter,
comparator, sensitive );
while (re.hasNextElement())
{ byte[] bytes = re.nextRecord();
...
}
[Class diagram: an Observable holds -listeners:Collection<Observer> and has methods +addListener(o:Observer):void, +removeListener(o:Observer):void and +notifyAll():void; it notifies each registered Observer, an interface with the method +process(e:Event):void.]
The Observer Pattern provides a way to have one or more classes (called ob-
servers or listeners) notified of particular events. The subject that will generate
the events has methods such as addListener and removeListener for manipu-
lating a collection it holds of observers registered with it that should be notified
of the events. It also often has a method called notifyAll or fireXxx which
calls some process method for each registered observer, passing it details of
the event as a parameter. This process method would typically perform some
action as a consequence of the event.
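A minimal sketch of the pattern in Java (illustrative names; a fireEvent method is used rather than notifyAll, since notifyAll is a final method of Object):

import java.util.ArrayList;
import java.util.Collection;

interface Observer
{  void process(String event); // the event details (illustrative type)
}

class Observable
{  private Collection<Observer> listeners = new ArrayList<Observer>();

   public void addListener(Observer o)
   {  listeners.add(o);
   }

   public void removeListener(Observer o)
   {  listeners.remove(o);
   }

   public void fireEvent(String event)
   {  for (Observer o : listeners)
         o.process(event); // notify each registered observer
   }
}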
[Class diagram for the Strategy Pattern: ConcreteStrategyA and ConcreteStrategyB each implement a common +perform():void operation of a strategy interface held by the context.]
The Strategy Pattern is used to select between several alternative algorithms
or strategies. A context object holds a reference to an instance of the chosen
strategy which might be determined by the client or instead selected by the
context itself according to the situation. The pattern encourages the algorithm
for each strategy to be implemented in its own class rather than coded as meth-
ods in the context itself. All the strategies inherit from a common parent class
(or implement a common interface), so new strategies can be easily added with only minor additions to the context. This pattern is typically used when there are several alternative ways of performing some task, and allows the choice to be made, or changed, while the program runs.
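A minimal sketch of the pattern (illustrative names, not code from the notes):

interface Strategy
{  void perform();
}

class ConcreteStrategyA implements Strategy
{  public void perform() { System.out.println("performing strategy A"); }
}

class ConcreteStrategyB implements Strategy
{  public void perform() { System.out.println("performing strategy B"); }
}

class Context
{  private Strategy strategy; // the currently chosen strategy

   public Context(Strategy strategy)
   {  this.strategy = strategy;
   }

   public void performTask()
   {  strategy.perform(); // delegate to the chosen algorithm
   }
}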
/**
A class that represents a Transmitter of objects of type E that
are sent to the Receiver (operating in a separate thread) one
item at a time via a Messenger
@author Andrew Ensor
*/
import java.util.Collection;
import java.util.Random;
/**
A class that represents a Receiver of objects of type E that
are sent from the Transmitter (operating in a separate thread) one
item at a time via a Messenger
@author Andrew Ensor
*/
import java.util.Collection;
import java.util.Random;
/**
A class that represents a Messenger which is passed an object of
type E (by a Transmitter) and holds it until it is requested (by a
Receiver). Note that further attempts by a thread to give the
messenger an object while it holds one are made to wait, and
attempts by a thread to obtain the object while the messenger does
not hold one are made to wait.
@author Andrew Ensor
*/
public class Messenger<E>
{  private E item;            // the object currently held, if any
   private boolean empty;     // whether no object is currently held
   private boolean accepting; // whether further objects are accepted

   public Messenger()
   {  item = null;
      empty = true;
      accepting = true;
   }
   // the methods for giving and obtaining the held object are
   // synchronized monitor methods built on wait and notifyAll; the
   // full listing is abbreviated in these notes
}
/**
A class which demonstrates communication between two threads
one created from a Transmitter and the other from a Receiver
which communicate via a Messenger object by observing its
thread notifications
@author Andrew Ensor
*/
import java.util.ArrayList;
import java.util.Collection;
The Template Method Pattern defines the outline of an algorithm in a superclass method which calls a combination of:

concrete methods that are implemented in the superclass and which the subclasses would use without being overridden (such as final methods),

abstract methods that are not implemented in the superclass and so must be implemented by subclasses,
Exercise 2.3 (Strategy for Layout Managers) Write a simple gui panel
that demonstrates the Strategy Pattern for determining the layout of the compo-
nents of the panel. You might like to consider using a custom layout manager
by preparing a class that implements the LayoutManager interface (such as the
example CircleLayout that is available on AUT online).
Chapter 3

Algorithmic Analysis

3.1 Basic Analysis
Insertion-Sort(A)
1 for i ← 1 to length[A] − 1 do
2 key ← A[i]
3 Insert A[i] into the sorted sequence A[0 . . i − 1]
4 insertIndex ← i
5 while insertIndex > 0 and A[insertIndex −1] > key do
6 shift element at insertIndex −1 along one to make space
7 A[insertIndex ] ← A[insertIndex −1]
8 insertIndex ← insertIndex −1
9 A[insertIndex ] ← key
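A direct Java translation of this pseudocode might look as follows (a sketch, not a listing from the notes):

public static void insertionSort(int[] a)
{  for (int i = 1; i < a.length; i++)
   {  int key = a[i]; // insert a[i] into the sorted subarray a[0..i-1]
      int insertIndex = i;
      while (insertIndex > 0 && a[insertIndex-1] > key)
      {  a[insertIndex] = a[insertIndex-1]; // shift element along one
         insertIndex--;
      }
      a[insertIndex] = key;
   }
}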
The time T(n) required by Insertion-Sort can be analyzed by assigning a constant cost ck to line k of the pseudocode and counting the number of times each line is executed. If mi denotes the number of times the while loop condition on line 5 is tested during iteration i of the for loop, this gives:

T(n) = c1 n + c2 (n−1) + c4 (n−1) + c5 Σ_{i=1}^{n−1} mi + c7 Σ_{i=1}^{n−1} (mi − 1) + c8 Σ_{i=1}^{n−1} (mi − 1) + c9 (n−1).

The formula for the time T(n) depends not only on the number n of elements to be sorted but also on the initial ordering of the elements, which determines the values m1, m2, ..., m_{n−1}. The best case (i.e. least time) is when m1 = 1, m2 = 1, ..., m_{n−1} = 1, which corresponds to the elements all being initially in order. In this case:

T(n) = c1 n + c2 (n−1) + c4 (n−1) + c5 (n−1) + c9 (n−1)
     = (c1 + c2 + c4 + c5 + c9) n − (c2 + c4 + c5 + c9),
so the running time is a linear function T (n) = an + b of n. The worst case (i.e.
greatest time) is when m1 = 1, m2 = 2, . . . , mn−1 = n − 1, which does occur if
the elements are exactly in reverse order. In this case:
T(n) = c1 n + c2 (n−1) + c4 (n−1) + c5 Σ_{i=1}^{n−1} i + c7 Σ_{i=1}^{n−1} (i − 1) + c8 Σ_{i=1}^{n−1} (i − 1) + c9 (n−1).
Using the formulas Σ_{i=1}^{n−1} i = n(n−1)/2 and Σ_{i=1}^{n−1} (i − 1) = (n−1)(n−2)/2 gives:

T(n) = c1 n + c2 (n−1) + c4 (n−1) + c5 n(n−1)/2 + c7 (n−1)(n−2)/2 + c8 (n−1)(n−2)/2 + c9 (n−1)
     = ((c5 + c7 + c8)/2) n² + (c1 + c2 + c4 − c5/2 − (3/2)c7 − (3/2)c8 + c9) n + (−c2 − c4 + c7 + c8 − c9),
so the running time is a quadratic function of n. For this reason the Insertion-Sort algorithm is considered to have O(n²) time complexity, since for some inputs its running time can be at worst a quadratic function of the input size n.
One technique that is useful in determining the correctness of algorithms
containing loops is to use loop invariants. A loop invariant is a boolean state-
ment that is correct for each iteration of the loop. For example, a loop invariant
for the for loop in the Insertion-Sort algorithm is:

   At the start of iteration i the subarray A[0 . . i − 1] holds the elements originally in A[0 . . i − 1] but in sorted order.
To show a loop invariant is true an inductive proof is used. First the statement
is shown to be true at the start of the first iteration of the loop. Next it is
shown that if it is true for iteration i then it must also be true for the following
iteration i + 1. Upon termination of the loop the loop invariant should provide
useful information for the analysis of the correctness of the algorithm.
The given loop invariant for the Insertion-Sort algorithm can be verified
as follows. First, the loop starts with i = 1 and so the subarray A[0 . . i − 1] consists of only one element A[0], which is the original element at A[0] (since
the loop has not yet swapped any elements) and is trivially sorted. Next, if the
loop invariant is true for iteration i then at the start of iteration i the subarray
A[0 . . i − 1] is a sorted permutation of the original elements A[0 . . i − 1]. To
see that the loop invariant holds for the start of the next iteration i + 1 the
statements inside the for loop executed during iteration i are studied. Note
that the inner while loop just inserts the next element A[i] in its correct sorted
position (formally this would be justified with another loop invariant for the
while loop). Hence at the completion of iteration i the subarray A[0 . . i] holds
the elements that were originally at A[0 . . i − 1] and A[i]. Since A[0 . . i − 1] were
in sorted order at the start of the iteration and A[i] has been inserted at the
correct sorted position, the loop invariant will be valid for the start of the next
iteration i + 1. Note that the loop terminates at the start of iteration i = n, and
at this stage the loop invariant states that A[0 . . n − 1] is a permutation of the
elements that were originally in A[0 . . n − 1] but in sorted order. This proves
that the Insertion-Sort algorithm does correctly solve the sorting problem.
Another algorithm that correctly solves the sorting problem is the Merge-
Sort algorithm. This algorithm uses recursion to repeatedly divide the array
A into two smaller arrays L and R, sort each, and then merge the two sorted
halves back together to one sorted array. To achieve this it typically uses an auxiliary pair of arrays L and R for the merge step (see the Merge-Sort pseudocode in Section 3.2), and its time requirement satisfies a recurrence of the form T(n) = 2T(n/2) + cn for n > 1.
Solving this recurrence (using the Master Theorem of Section 3.2) gives that
the dominant term of T (n) is n log2 n in all cases (regardless of the original
order of the elements). Hence the Merge-Sort algorithm is considered to
have O (n log n) complexity.
Exercise 3.1 (Analyzing Selection Sort) Find an expression for the time
T (n) for the following Selection-Sort algorithm, and a suitable loop invariant
to verify that the algorithm correctly solves the sorting problem.
Selection-Sort(A)
1 n ← length[A]
2 for i ← 0 to n − 2 do
3 Find the least element in A[i . . n − 1]
4 indexMin ← i
5 for j ← i + 1 to n − 1 do
6 if A[j] < A[indexMin] then
7 indexMin ← j
8 swap elements at indexMin and i
3.2 Recurrence Analysis
Merge-Sort(A)
1 n ← length[A]
2 Merge-Sort-Segment(A, 0, n)
Merge-Sort-Segment(A, p, r)
1 if p + 1 < r then there are several elements to be sorted
2 q ← (p + r)/2 q is the middle index between p and r
3 Merge-Sort-Segment(A, p, q)
4 Merge-Sort-Segment(A, q, r)
5 n1 ← q − p
6 n2 ← r − q
7 create arrays L[0 . . n1 − 1] and R[0 . . n2 − 1]
8 for i ← 0 to n1 − 1 do
9 L[i] ← A[p + i]
10 for j ← 0 to n2 − 1 do
11 R[j] ← A[q + j]
12 i←0
13 j←0
14 for k ← p to r − 1 do
15 determine which element to next put in A
16 if i < n1 then there are more elements in L
17 if j < n2 then there are more elements in R
18 if L[i] ≤ R[j] then
19 A[k] ← L[i]
20 i←i+1
21 else A[k] ← R[j]
22 j ←j+1
23 else A[k] ← L[i]
24 i←i+1
25 else A[k] ← R[j]
26 j ←j+1
The O-notation is used to give an asymptotic upper bound on a function f(n) to within a constant factor. For a function g(n), the set O(g(n)) is defined by:

O(g(n)) = {f(n) : ∃c > 0 and ∃n₀ for which 0 ≤ f(n) ≤ c g(n) for every n ≥ n₀}.

This means that the set O(g(n)) consists of all those (non-negative) functions f(n) that for n sufficiently large are bounded above by some (possibly large) multiple of the function g(n). One often says f(n) is O(g(n)) if f(n) ∈ O(g(n)).

[Figure: f(n) lying below c g(n) for all n ≥ n₀.]
For example, 7n + 18 is O(n) since it is bounded above by 8n for n ≥ 18. Also 5n² + 100n log₂ n − 1 is O(n²) since it is bounded above by 6n² for n ≥ 996. But 5n² + 100n log₂ n − 1 is not O(n) as no constant multiple of any linear function can be an upper bound for 5n² + 100n log₂ n − 1, which eventually grows above any linear function. However, 7n + 18 is also considered O(n²) since it is bounded above by 1n² for n ≥ 9. For this reason one often says that the time complexity of the Insertion-Sort algorithm is O(n²) even though for some inputs it has linear complexity.
The Ω-notation is used to give an asymptotic lower bound on a function f(n) to within a constant factor. The set Ω(g(n)) is given by:

Ω(g(n)) = {f(n) : ∃c′ > 0 and ∃n₀ for which 0 ≤ c′ g(n) ≤ f(n) for every n ≥ n₀}.

For example, 7n + 18 is Ω(n) since it is bounded below by 7n for n ≥ 1. It is also Ω(1) since it is bounded below by the constant function 25 for n ≥ 1. Also 5n² + 100n log₂ n − 1 is Ω(n²) since it is bounded below by 4n² for n ≥ 1. Note that the Insertion-Sort algorithm is Ω(n), since for some inputs it has linear time complexity, whereas for others it is quadratic.

[Figure: f(n) lying above c′ g(n) for all n ≥ n₀.]
The Θ-notation is used to asymptotically bound a function f(n) from above and from below by two multiples of the same function g(n), where Θ(g(n)) is the set:

{f(n) : ∃c, c′ > 0 and ∃n₀ for which 0 ≤ c′ g(n) ≤ f(n) ≤ c g(n) for every n ≥ n₀}.

Hence a function f(n) is Θ(g(n)) means that for large enough values of n the function f(n) is bounded above by one multiple c g(n) of g(n) and below by another multiple c′ g(n), in which case g(n) is called an asymptotically tight bound for f(n). For example, 7n + 18 is Θ(n) and 5n² + 100n log₂ n − 1 is Θ(n²).

[Figure: f(n) lying between c′ g(n) and c g(n) for all n ≥ n₀.]
Note that f (n) is Θ (g(n)) if and only if f (n) is both Ω (g(n)) and O (g(n)).
Hence the Θ-notation cannot be applied to the Insertion-Sort algorithm since
for some inputs it has linear complexity, whereas for others it has quadratic
complexity. On the other hand, the Merge-Sort algorithm is Θ (n log n) since
regardless of the input its time requirement T (n) always has dominant term
n log n.
There are two cautions however with using the O-, Ω-, and Θ-notations to compare algorithms for a problem. Firstly, these notations hide the constant factors ci
in an algorithm, which can be quite large (and significant when comparing two
algorithms of the same complexity or algorithms for small values of n). Secondly,
they also hide the non-dominant terms which might also be significant for small
values of n. For example, although the Insertion-Sort algorithm is O(n²) and the Merge-Sort algorithm is Θ(n log n), for small values of n Insertion-Sort is quite quick, and for some special inputs even for large values of n its time complexity is linear (since it is Ω(n)).
The analysis of the time requirements of an algorithm often involves solving
recurrences, which are equations that express the values of a function T (n) for
n > 1 recursively in terms of its values for smaller values of n. Simple recurrences
of the form T (n) = c1 T (n − 1) + c2 T (n − 2) (such as the Fibonacci numbers)
can be solved using a technique that makes use of the roots of the quadratic
x² − c1 x − c2. However many of the recurrences encountered in algorithmic analysis have the more complicated form:

T(n) = a T(n/b) + f(n)

where a ≥ 1 and b > 1 are constants, and f(n) is some known function. Technically n/b might mean ⌊n/b⌋ or ⌈n/b⌉, where the floor function ⌊x⌋ denotes the largest integer that is not more than the number x and the ceiling function ⌈x⌉
denotes the smallest integer that is not less than x. However, usually the choice
does not affect the analysis and for convenience often only certain values of n,
such as powers of 2, are analyzed.
Sometimes a pattern can be found in the values given by a recurrence
and an explicit form obtained for T (n). For example, consider the recurrence
T (n) = 2T (n/2) + cn where c is a constant. Calculating some values of T (n)
for successive values of n ≥ 2 gives:
T (2) = 2T (1) + 2c
T (4) = 2T (2) + 4c
= 4T (1) + 8c
T (8) = 2T (4) + 8c
= 8T (1) + 24c
T (16) = 2T (8) + 16c
= 16T (1) + 64c
T (32) = 2T (16) + 32c
= 32T (1) + 160c.
Each value has the form T(n) = n T(1) + c n log₂ n, which can be confirmed using a recursion tree. The tree starts with a root node holding the term cn and two child leaf nodes each holding the term T(n/2). Next, the recursion tree is grown by expanding each of the T(n/2) leaf nodes using the recurrence, where T(n/2) = 2T(n/4) + cn/2, so each becomes a node holding cn/2 with two child leaf nodes holding T(n/4). This process is repeated until all the leaf nodes in the tree hold the value T(1), which cannot be expanded further using the recurrence. For the recurrence T(n) = 2T(n/2) + cn this results in a tree with height log₂ n where there are n leaf nodes on the lowest level, each holding the term T(1), and each of the other levels has total cn. Summing all the node values for the recursion tree then gives that T(n) = n T(1) + c n log₂ n. Hence in this example if c ≠ 0 then T(n) is again seen to be Θ(n log₂ n).

[Recursion tree: the root cn has children cn/2 and cn/2, whose children are the leaves T(n/4); the total at each level above the leaves is cn.]
The recursion tree for a general recurrence of the form T(n) = a T(n/b) + f(n) will have height log_b n and have a^(log_b n) (which is the same as n^(log_b a)) leaf nodes on the lowest level, each holding the term T(1).
[Recursion tree for the general recurrence, with the root level total f(n).]

The Master Theorem gives the asymptotic behaviour of T(n) for many recurrences of this form. Suppose that T(n) = a T(n/b) + f(n) where a ≥ 1 and b > 1 are constants, and f(n) is some known function.
• If f(n) is O(n^(log_b a − ε)) for some constant ε > 0 then T(n) is Θ(n^(log_b a)).
• If f(n) is Θ(n^(log_b a)) then T(n) is Θ(n^(log_b a) log_b n).
• If f(n) is Ω(n^(log_b a + ε)) for some constant ε > 0, and if a f(n/b) ≤ r f(n) for some constant r < 1 and all large enough n, then T(n) is Θ(f(n)).
The key to applying the Master Theorem is to compare n^(log_b a) with the complexity of f(n). For example, suppose T(n) is given by the recurrence T(n) = 4T(n/2) + n^(3/2). In this case n^(log₂ 4) = n² and f(n) = n^(3/2) is O(n^(2−ε)) taking ε = 1/2, so the first case of the Master Theorem gives that T(n) is Θ(n²).
In the earlier example T(n) was given by the recurrence T(n) = 2T(n/2) + cn. In this case n^(log_b a) = n and f(n) = cn is Θ(n), so the second case of the Master Theorem gives that T(n) is Θ(n log₂ n).
As a final example, suppose T(n) is given by the recurrence T(n) = T(n/3) + n. In this case n^(log₃ 1) = n⁰ = 1 and f(n) = n is Ω(n^(0+ε)) taking ε = 1, with a f(n/b) = n/3 = (1/3) f(n), so the third case of the Master Theorem gives that T(n) is Θ(n).
Chapter 4

Design Techniques

4.1 Divide-and-Conquer Technique

The divide-and-conquer technique solves a problem of size n by dividing it into a subproblems each of size n/b, recursively solving each subproblem, and combining their solutions into a solution of the original problem, with problems of size at most some threshold n₀ solved directly:
Divide-and-Conquer(n)
1 if n ≤ n0 then
2 directly solve problem without further dividing
3 else
4 divide problem into a subproblems each of size n/b
5 for i ← 0 to a − 1 do
6 Divide-and-Conquer(n/b) subproblem i
7 combine the a solutions into solution of the problem of size n
Let f (n) denote the time required to divide the problem of size n into a sub-
problems and later combine their solutions back together into a solution for the
original problem. Then the total time T (n) to solve a problem of size n > n0 is
given by the recurrence:
T (n) = aT (n/b) + f (n).
The Merge-Sort algorithm of Section 3.2 is one example of a divide-and-
conquer algorithm. It starts with a list or array of n elements to be sorted
and divides the sorting problem into two smaller sorting problems, simply by
splitting the array in half to give two arrays each of size n/2 (so a = 2 and
b = 2). It then recursively sorts each, and combines the two sorted arrays
together into a solution of the sorting problem by merging them into one sorted
array. The threshold is commonly taken to be n0 = 1, an array that contains
just one element and so is already sorted. With the Merge-Sort algorithm
T (n) = 2T (n/2) + c2 n + c3 for n > 1, so the second case of the Master Theorem
gives that the algorithm is Θ (n log2 n).
Quick sort is another sorting algorithm that uses the divide-and-conquer
technique. It also divides an array into two smaller arrays, using one chosen
element as the pivot to partition the array into two parts, however with quick
sort these arrays are not guaranteed to have the same approximate size. Once
each is recursively sorted, they are easily combined by appending the second
sorted side of the partition after the first to give the complete sorted array.
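A sketch of quick sort in Java using the last element of each segment as the pivot (illustrative code, not from the notes):

public static void quickSort(int[] a, int p, int r)
{  if (r - p > 1) // more than one element in the segment a[p..r-1]
   {  int pivot = a[r-1];
      int boundary = p; // elements before boundary are less than pivot
      for (int i = p; i < r-1; i++)
      {  if (a[i] < pivot)
         {  int temp = a[i]; a[i] = a[boundary]; a[boundary] = temp;
            boundary++;
         }
      }
      int temp = a[r-1]; a[r-1] = a[boundary]; a[boundary] = temp;
      quickSort(a, p, boundary);   // recursively sort the left part
      quickSort(a, boundary+1, r); // recursively sort the right part
   }
}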
A binary search also uses the divide-and-conquer technique. To search a
list or array that is sorted in order the Binary-Search algorithm starts by
comparing the target with the element in the middle. This effectively divides
the search problem in half, as if the target has not been found only one half
of the elements need to be searched further, which can be done by recursively
calling the algorithm again. For this algorithm a = 1 since only one of the halves
is searched further, and b = 2 since the problem size is divided by two each time.
So the recurrence is T (n) ≤ T (n/2) + c for n ≥ 2 (an inequality is used since the
target might be found at any stage), where c is the constant time to perform the
recursive call, calculate the midpoint, and perform each comparison. The second
case of the Master Theorem applied to T (n) = T (n/2) + c gives Θ (log2 n), so
the Binary-Search algorithm must be O (log2 n).
Binary-Search(A, target)
1 return Binary-Search-Segment(A, target, 0, length[A])
Binary-Search-Segment(A, target, p, q)
1 if p ≥ q then
2 return -p-1 not found so return -insertion point-1
3 else
4 mid ← (p + q)/2
5 if target = A[mid ] then
6 return mid target has been found
7 else if target < A[mid ] then
8 return Binary-Search-Segment(A, target, p, mid )
9 else
10 return Binary-Search-Segment(A, target, mid +1, q)
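A direct Java translation of this pseudocode (a sketch, not a listing from the notes):

public static int binarySearch(int[] a, int target)
{  return binarySearchSegment(a, target, 0, a.length);
}

private static int binarySearchSegment(int[] a, int target, int p, int q)
{  if (p >= q)
      return -p-1; // not found, so return -(insertion point)-1
   int mid = (p + q)/2;
   if (target == a[mid])
      return mid; // target has been found
   else if (target < a[mid])
      return binarySearchSegment(a, target, p, mid);
   else
      return binarySearchSegment(a, target, mid+1, q);
}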
As another example, the divide-and-conquer technique can be used to multiply two n-digit integers p and q with digits p0, p1, ..., p_{n−1} and q0, q1, ..., q_{n−1}. Let pH be the integer with digits p0, p1, ..., p_{⌊n/2⌋−1} and let pL have digits p_{⌊n/2⌋}, p_{⌊n/2⌋+1}, ..., p_{n−1}. Likewise, take qH and qL to be the integers formed from the first and second halves of the digits for q. Then p = pH · 10^⌈n/2⌉ + pL and q = qH · 10^⌈n/2⌉ + qL. For example, if p = 123456 and q = 987654 then pH = 123, pL = 456, qH = 987, and qL = 654. Next, note that

p · q = (pH · 10^⌈n/2⌉ + pL) · (qH · 10^⌈n/2⌉ + qL)
      = pH · qH · 10^(2⌈n/2⌉) + (pH · qL + pL · qH) · 10^⌈n/2⌉ + pL · qL.

Although this appears to require four products of half-length integers, the middle coefficient can be obtained from the other two products using only one further multiplication, since:

pH · qL + pL · qH = (pH + pL) · (qH + qL) − pH · qH − pL · qL.

Hence the multiplication p · q can be performed by calculating the three products pH · qH, pL · qL, and (pH + pL) · (qH + qL), giving the recurrence T(n) = 3T(n/2) + f(n) with f(n) covering the additions and digit shifts, so that T(n) is Θ(n^(log₂ 3)).

The example IntegerMultiplication shows how this Θ(n^1.585) algorithm can be implemented in Java (it also uses a java.math class called BigInteger to check that the product is actually correct).
There are many other common algorithms that use the divide-and-conquer
technique. As a further example, consider a set of points P0 , P1 , . . . , Pn−1 where
Pi = (xi , yi ) and suppose x0 ≤ x1 ≤ · · · ≤ xn−1 (otherwise an O (n log n)
algorithm could first be used to sort them in order of increasing x-coordinates).
Finding the two points that are closest to each other has applications to many areas, and the divide-and-conquer technique gives an efficient algorithm for it (see Exercise 4.1).
/**
A class that demonstrates how the divide-and-conquer technique can
be used to provide an O(n^(log_2 3)) algorithm for multiplying
(positive) integer values of arbitrary length n
@author Andrew Ensor
*/
public class IntegerMultiplication
{
...
// multiply the two positive integer values held in the strings
public String multiply(String p, String q)
{ int pLength = p.length();
int qLength = q.length();
String product;
if (pLength+qLength<=THRESHOLD)
{ // directly evaluate the multiplication as int values
product = Integer.toString
(Integer.parseInt(p)*Integer.parseInt(q));
}
else
{ // ensure that p and q have the same length
if (pLength>qLength)
{ q = padWithZeros(q, pLength);
qLength = q.length();
}
else if (qLength>pLength)
{ p = padWithZeros(p, qLength);
pLength = p.length();
}
// divide the integer strings m and n into two parts
int middle = pLength/2;
String pHigh = p.substring(0, middle);
String pLow = p.substring(middle);
String qHigh = q.substring(0, middle);
String qLow = q.substring(middle);
// perform recursive conquer with three multiplications
String highPartProduct = multiply(pHigh, qHigh);
String lowPartProduct = multiply(pLow, qLow);
String mixedPart = multiply(add(pHigh,pLow),add(qHigh,qLow));
// combine three multiplications together to get product pq
String highPartShifted = appendZeros(highPartProduct,
2*(pLength-middle));
String midPartShifted = appendZeros(subtract(subtract(
mixedPart,highPartProduct),lowPartProduct),pLength-middle);
product = add(add(highPartShifted, midPartShifted),
lowPartProduct);
}
return product;
}
}
Exercise 4.1 (Closest Pair of Points) Write a program that accepts a col-
lection of points and which implements the divide-and-conquer technique to find
the distance between the closest two points in the collection.
4.2 Dynamic Programming

The dynamic programming technique is applicable when a problem can be divided into parts that share common subparts. The solution to each part is stored as the algorithm progresses so that those parts do not get solved again and again as the solution
to the original problem is constructed. By comparison, the divide-and-conquer
technique treats subproblems as being independent, and so might do more work
than necessary if the same subproblem is encountered several times.
For example, suppose a car factory produces cars in n steps along two as-
sembly lines A and B that work in parallel. Let ai and bi be the times required
for step i of the production (for 0 ≤ i < n) along each line, eA and eB be the
time required for a chassis to enter the line, and xA and xB the time for the
completed car to exit the line. Hence the total time taken to produce a car on
n−1
X
assembly line A is eA + ai + xA , and the time taken on assembly line B is
i=0
n−1
X
eB + bi + xB . Suppose that for rush jobs a partially completed car can be
i=0
shifted from one assembly line to the other at a time cost of si (from line A to
line B after step i of production) and ti (from line B to line A).
[Diagram: the two assembly lines with entry times eA and eB, step times a0, a1, ..., a_{n−1} along line A and b0, b1, ..., b_{n−1} along line B, transfer times si and ti between the lines, and exit times xA and xB.]
The problem is to efficiently find the minimal time required to assemble a
car. Ignoring paths that loop back and forth between the same step of assembly
(which just waste time), there are 2^n distinct possible paths that could be
taken through the assembly process, so a brute-force approach of checking every
possible path is totally infeasible except for very small values of n.
A dynamic programming approach starts by characterizing the structure of
an optimal solution. Clearly an optimal solution would not transfer from one
line to the other and then straight back so it must consist of a path that first
starts with step 0 of one of the lines (requiring either time eA + a0 or eB + b0 ),
and then either continue on the same line (either a1 or b1 ) or else swap to
the other line and perform step 1 there (either s0 + b1 or t0 + a1 ). It repeats
this until finishing step n − 1 and exiting (which adds a further time of either
an−1 + xA or bn−1 + xB ). Note that if an optimal solution performs step i on
a particular line then that optimal path must also be an optimal solution up
to that point of assembly. This must be the case since if there were a quicker
way of getting to that point of assembly, then replacing the first portion of the
optimal solution with the quicker way would give a solution even quicker than
the optimal solution. Put another way, each step of an optimal path for the
entire problem gives the fastest way to get to each point in the production line.
For example, if an optimal solution to the assembly problem enters on line A,
performs step 0 on that line and then switches to line B for step 1, then the
fastest way of getting to the start of step 2 on line B must also be by performing
step 0 on line A and then switching to line B for step 1. The identification that
the optimal solution to some problem must have optimal substructure is the key
to applying dynamic programming to solve the problem.
Next, the optimal solution must be defined recursively in terms of optimal solutions to subproblems. Let fA(i) and fB(i) denote the fastest times for a car to reach the start of step i on lines A and B respectively, and let lA(i) and lB(i) record which line such a fastest path used for step i − 1. The Fastest-Way algorithm builds these values bottom-up, finishing with the fastest total time f* and the line l* used for the final step:
Fastest-Way(a, b, s, t, e, x)
1 fA (0) ← eA
2 fB (0) ← eB
3 lA (0) ← A only path to get time fA (0) is along line A
4 lB (0) ← B only path to get time fB (0) is along line B
5 for i ← 1 to n − 1 do
6 find quickest time fA (i) to start of step i on line A
7 if fA (i − 1) + ai−1 < fB (i − 1) + bi−1 + ti−1 then
8 fA (i) ← fA (i − 1) + ai−1
9 lA (i) ← A quickest path used step i − 1 of line A
10 else
11 fA (i) ← fB (i − 1) + bi−1 + ti−1
12 lA (i) ← B quickest path used step i − 1 of line B
13 find quickest time fB (i) to start of step i on line B
14 if fB (i − 1) + bi−1 < fA (i − 1) + ai−1 + si−1 then
15 fB (i) ← fB (i − 1) + bi−1
16 lB (i) ← B quickest path used step i − 1 of line B
17 else
18 fB (i) ← fA (i − 1) + ai−1 + si−1
19 lB (i) ← A quickest path used step i − 1 of line A
20 if fA (n − 1) + an−1 + xA < fB (n − 1) + bn−1 + xB then
21 f ∗ = fA (n − 1) + an−1 + xA
22 l∗ = A
23 else
24 f ∗ = fB (n − 1) + bn−1 + xB
25 l∗ = B
26 return f ∗ and l∗
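If only the fastest total time f* is needed (and not the path l*), the algorithm reduces to a few lines of Java (a sketch, not code from the notes; the array and variable names follow the pseudocode):

public static double fastestWay(double[] a, double[] b, double[] s,
   double[] t, double eA, double eB, double xA, double xB)
{  int n = a.length;
   double fA = eA, fB = eB; // fastest times to the start of step i
   for (int i = 1; i < n; i++)
   {  double newFA = Math.min(fA + a[i-1], fB + b[i-1] + t[i-1]);
      double newFB = Math.min(fB + b[i-1], fA + a[i-1] + s[i-1]);
      fA = newFA;
      fB = newFB;
   }
   return Math.min(fA + a[n-1] + xA, fB + b[n-1] + xB);
}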
As another example, consider the problem of multiplying a chain of matrices A0 A1 ... A_{n−1}, where matrix Ai has order pi × p_{i+1}, using as few scalar multiplications as possible. Let mij denote the minimum number of scalar multiplications needed to optimally form the product Ai A_{i+1} ... A_{j−1}, and for j > i + 1 let sij denote that value of k with i < k < j where the optimal product splits as:

(Ai A_{i+1} ... A_{sij −1}) (A_{sij} A_{sij +1} ... A_{j−1}),

since the first product is a pi × p_{sij} matrix and the second product is a p_{sij} × pj matrix.
Continuing the dynamic programming approach, the desired value of m_{0n} is found from the recurrence not by a top-down recursive approach, but instead with a bottom-up approach, starting with m_{01}, m_{12}, ..., m_{n−1 n}, then using them to find m_{02}, m_{13}, ..., m_{n−2 n}, then m_{03}, m_{14}, ..., m_{n−3 n}, etcetera until m_{0n} is found.
Note as in the assembly line problem that there is overlap in the subproblems.
Here a term such as m02 cannot be found until m01 and m12 are known, but
m13 also needs the value of m12 (as well as m23 ). The Matrix-Chain-Order
algorithm shows how the optimal solution is found in O (n^3 ) using dynamic
programming, and the program MatrixChainOrder is a Java implementation of
this algorithm using ragged arrays for m and s, where mij is the entry m[j][i]
in the array (so that each m[j] is a one-dimensional array of length j).
Matrix-Chain-Order(p)
1 n ← length[p] − 1
2 for j ← 1 to n do
3 i←j−1
4 mij ← 0
5 for l ← 2 to n do number of matrices multiplied together
6 for j ← l to n do
7 i←j−l
8 sij ← the value of k with i < k < j minimizing mik + mkj + pi pk pj (requires a loop)
9 mij ← mik + mkj + pi pk pj for that k = sij
10 return m and s
/**
A class that demonstrates how the dynamic programming technique can
be used to find an optimum way of multiplying a chain of n
matrices where each matrix has order p_i times p_{i+1}
@author Andrew Ensor
*/
public class MatrixChainOrder
{
   private int n; // the number of matrices in the chain
   private int[] p; // matrix A_i has order p[i] times p[i+1]
   private int[][] m; // ragged array where m[j][i] holds m_ij
   private int[][] s; // ragged array where s[j][i] holds s_ij

   public MatrixChainOrder(int[] p)
{ this.p = p;
n = p.length-1;
m = new int[n+1][];
m[0] = null; // m[j][i] not used for j=0
s = new int[n+1][];
s[0] = null; s[1] = null; // s[j][i] not used for j=0 nor j=1
for (int j=1; j<=n; j++)
{ int i = j-1;
m[j] = new int[j]; // create m[j][0], ..., m[j][j-1]
m[j][i] = 0;
s[j] = new int[j-1]; // create s[j][0], ..., s[j][j-2]
}
for (int l=2; l<=n; l++) // l is number of matrices in product
{ for (int j=l; j<=n; j++)
{ int i = j-l;
// find k for which m[k][i]+m[j][k]+p[i]p[k]p[j] minimized
int indexK = i+1;
int minM = m[indexK][i]+m[j][indexK]+p[i]*p[indexK]*p[j];
for (int k=i+2; k<j; k++)
{ int anotherM = m[k][i]+m[j][k]+p[i]*p[k]*p[j];
if (anotherM<minM)
{ indexK = k;
minM = anotherM;
}
}
// update arrays m and s
s[j][i] = indexK;
m[j][i] = minM;
}
}
}
}
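The optimal order can then be recovered from the array s. For example, a
method along the following lines could be added to the class (a sketch, not
part of the original program) to give the optimal parenthesization of the
product Ai . . . Aj−1 :

   // sketch of a method that could be added to MatrixChainOrder,
   // called as parenthesization(0, n) for the whole chain
   public String parenthesization(int i, int j)
   {  if (j == i+1)
         return "A"+i; // chain consists of the single matrix A_i
      int k = s[j][i]; // optimal split (A_i...A_{k-1})(A_k...A_{j-1})
      return "("+parenthesization(i, k)+parenthesization(k, j)+")";
   }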
4.3 Elements of Dynamic Programming
Applying dynamic programming requires that an optimal solution to a problem
contains within it optimal solutions to subproblems. Consider for instance a
shortest path between two vertices u and v that passes through a vertex w; the
portion of the path between u and w must itself be a shortest path between
those two vertices. If this were not the case and say a shorter path could be
found between u and w then combine it with the portion between w and v. If
the result were a path between u and v then it would be shorter than the
shortest path, a contradiction. If it were not a path then the two portions
would have edges in common, and removing the common edges would lead to
an even shorter path, again a contradiction. Hence the shortest path problem
has optimal substructure.
Now instead consider a longest path between u and v. If w is a vertex along
this path one might be tempted to claim that the portion of the path between
u and w is a longest path between these two vertices, but this might be
incorrect.
[Figure: a graph where the longest path problem fails to have optimal substructure]
Suppose Z = ⟨z0 , . . . , zk−1 ⟩ is a longest common subsequence of X = ⟨x0 , . . . , xm−1 ⟩
and Y = ⟨y0 , . . . , yn−1 ⟩, where Xi denotes the prefix ⟨x0 , . . . , xi−1 ⟩ of X (and
similarly for Yj ). Then either:
• xm−1 = yn−1 , and so too zk−1 = xm−1 since Z is a longest common subsequence,
in which case Zk−1 must be a longest common subsequence of Xm−1 and Yn−1 , or
• xm−1 ≠ yn−1 , and so zk−1 cannot equal both of them. Hence Z must be
a longest common subsequence of either Xm−1 and Y (if zk−1 ≠ xm−1 )
or of X and Yn−1 (if zk−1 ≠ yn−1 ).
This shows that an optimal solution to the longest common subsequence problem
is composed of optimal solutions to subproblems of finding the longest common
subsequence of Xi and Yj for i < m and j < n. A recursive solution to the
problem would first check whether xm−1 = yn−1 , and if so would recursively find
the longest common subsequence of Xm−1 and Yn−1 and append xm−1 to that
subsequence. If not it would recursively find the longest common subsequence
of Xm−1 and Y , and of X and Yn−1 and take the longer of the two. Note that
the second possibility will probably require finding the longest common subse-
quence of Xm−1 and Yn−1 in the next recursion, so there are many overlapping
subproblems.
For i ≤ m and j ≤ n let cij be the length of the longest common subse-
quence of Xi and Yj , and let bij give information about how the subsequence
is constructed from the recurrence, with bij = INCLUDE if xi−1 = yj−1 and
so xi−1 is used in the subsequence, bij = NOTX if xi−1 is not included in the
subsequence, or instead bij = NOTY if yj−1 is not included in the subsequence.
Then every ci0 = 0 since Y0 = ⟨ ⟩ and every c0j = 0 since X0 = ⟨ ⟩. For i > 0
and j > 0 the value of cij is given by the recurrence:
cij = c(i−1)(j−1) + 1 if xi−1 = yj−1
cij = max(c(i−1)j , ci(j−1) ) if xi−1 ≠ yj−1 .
The LCS-Length algorithm shows how the length of the longest common
subsequence is found in O(mn), and the class LCSLength is a Java implementa-
tion of this algorithm which uses the values of bij to output one of the longest
common subsequences, starting with bmn .
Exercise 4.3 (Knapsack Problem) The 0-1 knapsack problem involves a set
of n items for which item i has positive benefit bi and positive integer weight wi .
Suppose a person with a knapsack can only carry up to a maximum total weight
W . The problem is to decide which items to choose so as to maximize the total
benefit.
Let Bkw be the maximum benefit that can be obtained from selecting some of
items 0 to k − 1 using a knapsack that has capacity w. Note that B0w = 0, and
for k > 0 that Bkw is given by the recurrence:
B(k−1)w if w < wk−1
Bkw =
max B(k−1)w , B(k−1)(w−wk−1 ) + bk−1 if w ≥ wk−1 .
Design an O(nW ) algorithm for solving the 0-1 knapsack problem and implement
the algorithm in a program using an int[n+1][W+1] array of benefits. Then
use a boolean[n+1][W+1] to keep track of which items are used in the optimal
solution.
LCS-Length(X, Y )
1 m ← length[X]
2 n ← length[Y ]
3 for i ← 1 to m do
4 ci0 ← 0
5 for j ← 0 to n do
6 c0j ← 0
7 for i ← 1 to m do
8 for j ← 1 to n do
9 if xi−1 = yj−1 then
10 cij ← c(i−1)(j−1) + 1
11 bij ← INCLUDE
12 else if c(i−1)j > ci(j−1) then
13 cij ← c(i−1)j
14 bij ← NOTX
15 else
16 cij ← ci(j−1)
17 bij ← NOTY
18 return c and b
/**
A class that demonstrates how the dynamic programming technique can
be used to find a longest common subsequence of strings x and y
@author Andrew Ensor
*/
public LCSLength()
{
}
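Since the remainder of the class is not reproduced here, the following minimal
sketch (not the course's LCSLength class) shows the bottom-up computation
of the length cmn in Java:

public class LCSLengthSketch
{
   // returns the length of a longest common subsequence of x and y,
   // where c[i][j] holds the LCS length of the prefixes X_i and Y_j
   public static int lcsLength(String x, String y)
   {  int m = x.length();
      int n = y.length();
      int[][] c = new int[m+1][n+1]; // row 0 and column 0 stay 0
      for (int i = 1; i <= m; i++)
      {  for (int j = 1; j <= n; j++)
         {  if (x.charAt(i-1) == y.charAt(j-1))
               c[i][j] = c[i-1][j-1]+1;
            else
               c[i][j] = Math.max(c[i-1][j], c[i][j-1]);
         }
      }
      return c[m][n];
   }
}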
2. prove that there must be at least one optimal solution to the problem
which makes the greedy choice,
Note that the second step is not claiming that every optimal solution contains
the greedy choice, but rather that at least one optimal solution contains the
greedy choice (this is the one that a greedy algorithm would find).
Choosing the activity ai in S that finishes first for the solution still leaves the
subproblem of finding the maximum number of activities that can be selected
which start after time fi . Thus the optimal solution can be found by first
making a greedy choice and then combining it with an optimal solution to
the subproblem. The simpler subproblem can be solved in the same way by
first finding the activity aj that finishes first from those that start after time
fi , and then finding the maximum number of activities that can be selected
starting after time fj . Hence the activity selection problem possesses the optimal
subproblem property (and note that there is only one further subproblem that
needs to be solved at each step).
Rather than searching for the activity that finishes first each time, the ac-
tivities in S can be sorted in order of finishing time before the algorithm com-
mences. Once this sorting has been done (say by an O (n log n) algorithm) the
Activity-Selector algorithm shows how an optimal solution can be found
in Θ(n). In this example, the recursive algorithm could easily be replaced by a
non-recursive algorithm which simply iterates through the activities in order.
Activity-Selector(s, f )
1 find maximum number of compatible activities from s that finish by f
2 return {a0 } ∪ Recursive-Activity-Selector(s, f, 0)
Recursive-Activity-Selector(s, f, i)
1 find the next activity in sorted order that starts after time fi
2 m←i+1
3 while m < length[s] and sm < fi do
4 m←m+1
5 if m < length[s] then use am as the greedy choice
6 return {am } ∪ Recursive-Activity-Selector(s, f, m)
7 else
8 return ∅
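As suggested above, the recursion can be replaced by a simple iteration through
the sorted activities. The following is a minimal Java sketch (not one of the
course class files), assuming hypothetical arrays s and f of start and finish
times already sorted into increasing order of finishing time:

import java.util.ArrayList;
import java.util.List;

public class ActivitySelector
{
   // returns the indices of a maximum-size set of compatible
   // activities, assuming f is sorted into increasing order
   public static List<Integer> selectActivities(double[] s, double[] f)
   {  List<Integer> chosen = new ArrayList<Integer>();
      chosen.add(0); // greedy choice: the activity finishing first
      int last = 0; // most recently chosen activity
      for (int m = 1; m < s.length; m++)
      {  if (s[m] >= f[last]) // compatible with the last chosen one
         {  chosen.add(m);
            last = m;
         }
      }
      return chosen;
   }
}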
A Huffman code encodes each character of a string using a variable number of
bits, based on the frequency with which each character occurs (either found by
iterating through the string or else estimated from
previous typical strings). The algorithm works by building a binary tree whose
leaf nodes are the characters, built so that characters with a higher frequency
appear further up the tree. Tracing a path from the root to any character gives
the bits in the encoding for that character, each left edge followed corresponds
to a 0 in the code and each right edge corresponds to a 1, so that the number
of bits in a character’s code is given by the depth of its leaf in the tree. The construction
is started by making a forest of binary trees, one per allowable character and
each having height 0. Then progressively two of the binary trees are chosen at a
time to join together to form a tree of height one greater, until only one binary
tree remains which holds the allowable characters at its leaf nodes. The choice
of which two binary trees to combine at each step is a greedy choice, the two
trees whose combined characters occur with the least frequency are chosen to
combine each time.
For example, suppose a Huffman code is to be made for the string:
“peter piper picked a peck of pickled peppers”.
First a table is made of each character and its frequency in the string:
character a c d e f i k l o p r s t space
frequency 1 3 2 8 1 3 3 1 1 9 3 1 1 7
Next, a separate binary tree is made for each character, and the trees are merged
together two at a time, until only one tree remains. The leaf nodes of each tree
holds the characters and for convenience each node holds the total frequency for
all its descendant character leaf nodes. First the leaf nodes with least frequency
are joined together two at a time, such as leaf l with leaf t, leaf s with leaf f ,
and leaf o with leaf a.
[Figure: the initial forest, with new parent nodes of frequency 2 above the
pairs l:1 and t:1, s:1 and f:1, o:1 and a:1, alongside the remaining leaves
d:2, r:3, i:3, c:3, k:3, space:7, e:8, p:9]
Then the two smallest trees with frequency 2 are joined together and the process
continues until only one tree remains.
[Figure: the final Huffman tree of total frequency 44, with subtrees of
frequency 17 and 27, internal nodes of frequency 8, 12, 15, 4, 4, 6, 6, and
leaves including p:9, space:7 and e:8 near the root]
A fixed-length code for the string would have required at least 4 bits per character (since there are 14 distinct
characters including the space), and the entire 44 character string would be
encoded using a total of 176 bits. Using the variable-length Huffman code
found in this example would result in a total of 149 bits (a savings of just over
15%).
The algorithm Huffman shows how the encoding algorithm is performed
in O (n log2 n), and the program Huffman (and interface HuffmanNode) is a
Java implementation of the algorithm. For efficiency the program stores the
root of each binary tree in a priority queue (implemented by a heap to keep the
operations O (log n)) ordered by frequencies.
Huffman(C)
1 create a priority queue Q to hold root of each binary tree
2 put each potential character and its frequency in the queue
3 while length[Q] > 1 do
4 x ← Extract-Min(Q)
5 y ← Extract-Min(Q)
6 allocate new node z with x and y as its children
7 Insert(Q, z) add z to queue Q
8 return Extract-Min(Q) return root node of last remaining tree
/**
An interface that represents a node in a binary tree that is used
for the Huffman encoding algorithm
@see Huffman.java
*/
/**
A class that performs the Huffman encoding algorithm for a string
@author Andrew Ensor
*/
import java.util.Comparator;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.PriorityQueue;
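Since the full program is not reproduced here, the following minimal sketch
(not the course's Huffman class) shows how the tree itself can be built using
a java.util.PriorityQueue of tree roots ordered by frequency:

import java.util.PriorityQueue;

public class HuffmanSketch
{
   // node in a Huffman tree: a leaf holds a character, and every
   // node holds the total frequency of its descendant leaves
   private static class Node implements Comparable<Node>
   {  public char c;
      public int frequency;
      public Node left, right;
      public Node(char c, int frequency)
      {  this.c = c;
         this.frequency = frequency;
      }
      public Node(Node left, Node right)
      {  this.frequency = left.frequency+right.frequency;
         this.left = left;
         this.right = right;
      }
      public int compareTo(Node other)
      {  return Integer.compare(frequency, other.frequency);
      }
   }

   // builds the Huffman tree for the given characters and their
   // frequencies and returns its root node
   public static Node buildTree(char[] chars, int[] frequencies)
   {  PriorityQueue<Node> queue = new PriorityQueue<Node>();
      for (int i = 0; i < chars.length; i++)
         queue.add(new Node(chars[i], frequencies[i]));
      while (queue.size() > 1)
      {  Node x = queue.poll(); // two trees of least frequency
         Node y = queue.poll();
         queue.add(new Node(x, y)); // join them under a new parent
      }
      return queue.poll(); // root of the last remaining tree
   }
}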
Chapter 5
Advanced Data Structures
5.1 Red-Black Trees
Reading: pp273-293
A binary search tree (or sort tree) is a binary tree (a tree where each node
has at most two child nodes) for which the element held in each node is greater
than or equal to that held in its left child but less than or equal to that in its
right child. The abstract data type for a binary search tree has add, contains,
and remove methods, but might also include other methods such as iterator,
maximum, minimum, predecessor, and successor.
A tree is said to be balanced if the left and right subtrees of any node always
have height within one of each other. As a consequence a balanced binary search
tree with n nodes has height O (log2 n), so it has only a constant factor more
levels than the fewest possible for any binary tree with the same number of nodes.
Part of the appeal of using a binary search tree to store elements that can
be compared is that the add, contains, and remove methods all take time
proportional to the height of the tree, which for a balanced binary search tree is
log2 n. This is preferable to using a linear data structure such as a sorted array
list, which has O (log2 n) contains, but O (n) add and remove methods (since
elements need to be shifted along to maintain the ordering).
However, each call to add and remove affects the shape of the tree, so the
performance of the add, contains, and remove methods might drop if there are
more levels in the tree than necessary.
For example, if the elements “cow”, “fly”, “dog”, “bat”, “fox”, “cat”, “eel”,
“ant” are added in this order then a binary tree is built with root holding the
element “cow”, with children “bat” and “fly”, then “ant”, “cat”, “dog”, “fox”,
and finally “eel” as the right child of “dog”. Searching for an element in this
balanced tree would require at most 4 ≈ log2 n comparisons.
[Figure: the binary search tree built from these additions]
A left rotation about a node x with right child y promotes y up a level and
demotes x down a level to become the left child of y, transferring the middle
subtree b from y to x: two links are first severed and then two other links are
made.
[Figure: a left rotation about x with subtrees a, b, c, promoting the right child y]
A right rotation reverses this process by promoting the left child x up a level
and demoting the right child c (if present) down a level (the mirror image of a
left rotation), first severing two links and then making two other links.
[Figure: a right rotation about y, promoting its left child x, again first
severing two links and then making two other links]
The Θ(1) algorithm Left-Rotate(T, x) demonstrates how a left rotation
could be performed around a node x in a tree T where each node x has operations
for obtaining its parent parent[x], left child left[x], and right child right[x].
5.1. RED-BLACK TREES 83
Left-Rotate(T, x)
1 y ← right[x] y is child to promote up one level
2 transfer left child of y to become right child of x
3 right[x] ← left[y]
4 if left[y] ≠ null then
5 parent[left[y]] ← x
6 make y the parent
7 parent[y] ← parent[x] reassign the parent of y
8 if parent[x] = null then x was the root node of T
9 root[T ] ← y
10 else determine whether x was left child or right child of its parent
11 if x = left[parent[x]] then
12 left[parent[x]] ← y
13 else
14 right[parent[x]] ← y
15 left[y] ← x
16 parent[x] ← y
Note that this algorithm updates the links to parent nodes as well as the links
to the left and right child nodes. Links to parents are typically required since
rebalancing algorithms need to efficiently move up the tree. An algorithm to
perform a right rotation is quite similar.
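As an illustration, a left rotation might be implemented in Java along the
following lines (a sketch assuming a hypothetical Node class with public
parent, left and right links and a tree with a root field, not the course's
RedBlackTree class):

   // sketch of a left rotation about node x, assuming x has a right
   // child; null marks a missing child
   private void leftRotate(Node x)
   {  Node y = x.right; // y is the child to promote up one level
      x.right = y.left; // transfer left child of y to x
      if (y.left != null)
         y.left.parent = x;
      y.parent = x.parent; // reassign the parent of y
      if (x.parent == null)
         root = y; // x was the root node of the tree
      else if (x == x.parent.left)
         x.parent.left = y;
      else
         x.parent.right = y;
      y.left = x;
      x.parent = y;
   }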
There are several alternative algorithms that are widely used to determine
when a tree needs rebalancing. An AVL tree is a binary tree where each node is
assigned a balance factor, which is the height of its left subtree minus the height
of its right subtree. After the addition or removal of a node the balance factors
of its ancestor nodes are recalculated (so it is convenient for each node in an
AVL tree to keep a link to its parent). If the balance factor is not −1, 0, or 1
for a node, then checking the balance factor for its two child nodes determines
which type of rotation is required to return the tree to be balanced (see the
discussion in the textbook p296).
A common alternative to an AVL tree is to use a red-black tree. In a red-
black tree each node is assigned one of two colours subject to the following
restrictions:
• the root node must be black,
• both children of a red node must be black (so a red node cannot have a
red parent), and
• there must be the same number of black nodes along any path from the
root to a null reference.
Take the black height bh(x) of a node x to be one more than the number of black
nodes down any path from either child of x to a null reference (one is added
since the null node is itself considered black). Induction on the height of a
node x in the tree can be used to show that the subtree rooted at x has at least
2^bh(x) − 1 nodes. Indeed, if the height of x is 0 then x is a leaf, and so bh(x) = 1
and the subtree rooted at x consists of just x (2^1 − 1 = 1 node). Furthermore,
if every subtree rooted at y with height k has at least 2^bh(y) − 1 nodes, then
consider a node x with height k + 1. If y is one of the (up to two) child nodes
of x then its height is at most k, and if y is black then bh(y) = bh(x) − 1 and so its
subtree has at least 2^(bh(x)−1) − 1 nodes (by the inductive hypothesis), whereas
if y is red then bh(y) = bh(x), and so its subtree has at least 2^bh(x) − 1 nodes
(also by the inductive hypothesis). Hence the subtree rooted at x must have at
least 2 · (2^(bh(x)−1) − 1) + 1 = 2^bh(x) − 1 nodes, which completes the induction
argument. Now suppose a red-black tree with n nodes has height h. Since the
parent of any red node must be black the black height of the root must be at
least ⌊h/2⌋ + 1 ≥ h/2, and so by the previous discussion the tree must have at
least 2^(h/2) − 1 nodes, so n ≥ 2^(h/2) − 1. Hence the following important fact is
obtained: the height of a red-black tree with n nodes is at most 2 log2 (n + 1).
Thus the red-black properties ensure that a red-black tree is kept reasonably
balanced, with height at most about double that of a perfectly balanced binary
tree.
After any node has been added or removed from the tree the red-black
properties must be reestablished. The algorithm RB-Insert uses a while loop
to traverse down the tree starting from the root to find the parent node y where
the new node z should be added as a leaf. It then adds z and assigns it the
color red, so that the third red-black property is not violated. However, if the
parent parent[z] of z is also red then the RB-Insert-Fixup(T, z) algorithm
makes adjustments to reestablish the second red-black property. It does this by
handling six possible cases, which can be divided into two groups depending on
whether parent[z] is a left child or a right child of the grandparent of z. Note
that parent[z] cannot be the root since it is red, and that the grandparent must
currently be black.
Suppose parent[z] is the left child of the grandparent. Case one supposes
that z has an uncle y that is also red. In this case parent[z] and y are both
made black but the grandparent is made red to preserve the black height in
the tree.
[Figure: case one recolours parent[z] and the uncle y black and the
grandparent red, preserving the black height in the tree]
This works regardless of whether z is the left or the right child of parent[z], but
might result in the grandparent and its parent node both being red. Hence the
while loop is repeated with z taken as the grandparent node.
Case two supposes that z has been added as the right child of its parent and
either it does not have an uncle or else its uncle is black. In this case a simple
recolouring is not sufficient so a left rotation is performed about its parent,
and then case three is used with z taken to be the former parent. Case three
supposes that z has been added as the left child of its parent and again either
it does not have an uncle or else its uncle is black. In this case a combination
of recolouring and a rotation is applied, parent[z] is labelled black and the
grandparent red, then a right rotation is performed about the grandparent.
Both case two and case three reestablish the red-black properties, so the while
loop will terminate.
RB-Insert(T, z)
1 find node y that will be the parent of added node z
2 y ← null
3 x ← root[T ]
4 while x ≠ null do
5 y←x
6 if key[z] < key[x] then
7 x ← left[x]
8 else
9 x ← right[x]
10 add node z as a child of y
11 parent[z] ← y
12 if y = null then
13 root[T ] ← z
14 else if key[z] < key[y] then
15 left[y] ← z
16 else
17 right[y] ← z
18 left[z] ← null
19 right[z] ← null
20 colour [z] ← red added node assigned red label
21 RB-Insert-Fixup(T, z) fix any violation of red-black labelling
RB-Insert-Fixup(T, z)
1 while parent[z] ≠ null and colour [parent[z]] = red do
2 since parent[z] is red it is not root[T ] so z has grandparent
3 if parent[z] = left[parent[parent[z]]] then parent[z] is left child
4 y ← right[parent[parent[z]]] y is uncle of z or null
5 if y ≠ null and colour [y] = red then case one
6 colour [parent[z]] ← black
7 colour [y] ← black
8 colour [parent[parent[z]]] ← red
9 z ← parent[parent[z]] repeat loop again
10 else
11 if z = right[parent[z]] then case two
12 z ← parent[z]
13 Left-Rotate(T, z)
14 colour [parent[z]] ← black case three
15 colour [parent[parent[z]]] ← red
16 Right-Rotate(T, parent[parent[z]])
17 else parent[z] is right child
18 same as above but interchange left and right
19 colour [root[T ]] ← black
[Figure: case two performs a left rotation about parent[z] to reduce to case
three, which recolours parent[z] black and the grandparent red and then
performs a right rotation about the grandparent]
The other three cases presume parent[z] is a right child of the grandparent
and are the same as the first three cases but with the roles of left and right
interchanged.
Removing a node from a red-black tree is only slightly more complicated.
The RB-Delete algorithm first checks whether z has at most one child; if so
it replaces z with its child (or with null), otherwise a loop is used to find the successor
node of z and replace z with it, adopting the original colour of the node z.
If z is replaced by its successor this could cause a colouring violation for the
successor’s right child if the successor is black. If instead z is replaced by one
of its children or by null, a colouring problem would arise if z were black. In
these cases the RB-Delete-Fixup algorithm is used to reestablish the colour
properties starting at the node x that has the colouring problem. The fix up
checks eight possible cases, divided into two groups depending on whether x is
a left child or a right child of its parent. The cases depend on the colouring
of the sibling w of x, and of the colouring of the child nodes of w, performing
suitable recolouring and rotations to reestablish the colour properties. In the
first four cases x is the left child, and case one ensures that its sibling will be
black for later cases. Case two handles when w has no red children, and just
performs a simple recolouring and repeats the loop. Instead, case three handles
when left[w] is red and performs a rotation to instead make right[w] red for
case four, which performs recolouring and a single left rotation to reestablish
the colouring properties.
RB-Delete(T, z)
1 find node r that will take the place of z
2 checkN ode ← null node whose colour needs checking
3 if left[z] ≠ null and right[z] ≠ null then z has two children
4 find successor node (left-most descendant of right subtree of z)
5 r ← right[z]
6 while left[r] ≠ null do
7 r ← left[r]
8 r is now the successor of z
9 make the right child of r the left child of parent of r
10 left[parent[r]] ← right[r]
11 if right[r] ≠ null then
12 parent[right[r]] ← parent[r]
13 if colour [r] = black then
14 checkN ode ← right[r]
15 have r adopt both child nodes of z
16 left[r] ← left[z]
17 parent[left[z]] ← r
18 right[r] ← right[z]
19 parent[right[z]] ← r
20 colour [r] ← colour [z] keep same colour
21 else if left[z] ≠ null or right[z] ≠ null then z has only one child
22 if left[z] ≠ null then
23 r ← left[z] replace z by its left child
24 else
25 r ← right[z] replace z by its right child
26 if colour [z] = black then
27 checkN ode ← r
28 else z had no children
29 r ← null
30 if colour [z] = black then
31 checkN ode ← parent[z]
32 update the link with parent of z
33 if r ≠ null then
34 parent[r] ← parent[z]
35 if parent[z] = null then
36 root[T ] ← r
37 else if z = left[parent[z]] then
38 left[parent[z]] ← r
39 else
40 right[parent[z]] ← r
41 if checkNode ≠ null then
42 RB-Delete-Fixup(T, checkNode)
RB-Delete-Fixup(T, x)
1 while x ≠ root[T ] and colour [x] = black do
2 if x = left[parent[x]] then x is left child
3 w ← right[parent[x]] w is sibling of x
4 if w ≠ null and colour [w] = red then
5 case one
6 colour [w] ← black
7 colour [parent[x]] ← red
8 Left-Rotate(T, parent[x])
9 w ← right[parent[x]]
10 note the sibling w is now black
11 if w = null or
12 ((left[w] = null or colour [left[w]] = black) and
13 (right[w] = null or colour [right[w]] = black)) then
14 case two
15 if w ≠ null then
16 colour [w] ← red
17 x ← parent[x]
18 else
19 if right[w] = null or
20 colour [right[w]] = black then
21 case three
22 colour [left[w]] ← black
23 colour [w] ← red
24 Right-Rotate(T, w)
25 w ← right[parent[x]]
26 note right[w] is now red
27 case four
28 colour [w] ← colour [parent[x]]
29 colour [parent[x]] ← black
30 colour [right[w]] ← black
31 Left-Rotate(T, parent[x])
32 x ← root[T ] terminate loop
33 else x is right child
34 same as above but interchange left and right
35 colour [x] ← black
5.2 Augmenting Data Structures
A simple example of augmenting a data structure is an order-statistic tree, a
red-black tree in which each node is given an additional size field that records
the number of nodes in the subtree rooted at that node (the nil sentinel having
size 0), so that the node at a given ordinal position in an inorder walk can be
located efficiently. Each newly added node starts with size 1, and a removal
updates the sizes along the path above the removed node:
/**
* Initializes a new node in an order-statistic tree
* @param data Data to save in the node.
*/
public Node(Comparable data)
{ super(data);
size = 1;
}
/**
* Deletes a node from the tree.
* @param handle Handle to the node being deleted.
* @throws ClassCastException if <code>handle</code> is not
* <code>Node</code> object.
*/
public void delete(Object handle)
{ // Walk up the tree by following parent pointers while
// updating the size of each node along the path.
Node x = (Node) handle;
for (Node i=(Node)x.parent; i!=nil; i=(Node)i.parent)
i.size--;
// Now actually remove the node.
super.delete(handle);
}
One complication with using a red-black tree is that the size field also needs
to be updated every time a left or right rotation is performed:
/**
* Calls {@link RedBlackTree#leftRotate} and then fixes the
* <code>size</code> fields.
* @param x handle The node being left rotated.
*/
protected void leftRotate(RedBlackTree.Node x)
{ Node y = (Node) x.right;
super.leftRotate(x);
y.size = ((Node) x).size;
((Node)x).size=((Node)x.left).size+((Node)x.right).size+1;
}
/**
* Calls {@link RedBlackTree#rightRotate} and then fixes the
* <code>size</code> fields.
* @param x handle The node being right rotated.
*/
protected void rightRotate(RedBlackTree.Node x)
{ Node y = (Node) x.left;
super.rightRotate(x);
y.size = ((Node) x).size;
((Node)x).size=((Node)x.left).size+((Node)x.right).size+1;
}
Once the existing operations of the red-black tree have been extended so
that they maintain the new size field, new operations that use the field can be
added, such as select which obtains the i-th order statistic in O (log2 n):
/**
* Finds node in a subtree that is at given ordinal position
* in an inorder walk of the subtree.
* @param x Root of the subtree.
* @param i The ordinal position.
* @return node that is ith in an inorder walk of the subtree
*/
protected Object select(BinarySearchTree.Node x, int i)
{ int r = 1 + ((Node)x.left).size;
if (i == r)
return x;
else if (i < r)
return select(x.left, i);
else
return select(x.right, i - r);
}
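A rank operation, giving the ordinal position of a node in an inorder walk of
the whole tree, can likewise be computed from the size fields by walking up
toward the root. The following is a sketch of such a method (hypothetical,
not part of the course class files, and again assuming the nil sentinel has
size 0):

   // sketch of a method giving the ordinal position of node x in an
   // inorder walk of the whole tree
   protected int rank(Node x)
   {  int r = ((Node)x.left).size+1; // rank of x within its own subtree
      Node y = x;
      while (y != root) // climb toward the root
      {  Node parent = (Node)y.parent;
         if (y == parent.right) // subtree left of parent precedes y
            r += ((Node)parent.left).size+1;
         y = parent;
      }
      return r;
   }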
In general, a red-black tree (or any other type of balanced binary search tree)
can be augmented with a new field f without affecting O (log2 n) performance of
the tree’s add, contains, remove methods provided that f (x) can be calculated
for a node x using only the nodes x, left[x], right[x], and the values of f (left[x]),
f (right[x]). This ensures the operations of the tree can maintain the field f
without affecting the O (log2 n) performance of these operations. This helps
explain why the number of nodes in the subtree was chosen as an appropriate
field, rather than the rank of each node, since the rank of a node cannot be
determined solely from the rank of its left child and right child.
Usually interval trees are implemented using a red-black tree as the underly-
ing data structure, with intervals ordered by the starting value of each interval.
The data structure is augmented by an additional max field which gives the
maximum ending value for any interval in the subtree. The value of max for
a node x which holds an interval [a, b] is simply the largest of b, max [left[x]],
and max [right[x]], thus the new field can be maintained by the add, remove,
left rotate, and right rotate operations of the tree without affecting their per-
formance. The max field ensures that the search operation can be performed in
O (log2 n) as given by the following algorithm:
Interval-Search(T, i)
1 find an interval in interval tree T that overlaps with interval i
2 x ← root[T ]
3 while x ≠ null and i ∩ interval [x] = ∅ do
4 if left[x] 6= null and max [left[x]] ≥ start[i] then
5 x ← left[x]
6 else
7 x ← right[x]
8 return x
This algorithm works its way down the tree starting at the root. If the interval
in a node x does not overlap with the interval i the algorithm moves to the left
or right child of x depending on whether the left subtree of x contains an interval
that ends after the start of i, given by the comparison max [left[x]] ≥ start[i].
The first interval found that overlaps with i is returned, whereas if there is no
such interval in the tree, null is returned.
[Figure: intervals drawn on a number line from 0 to 20, and the corresponding
interval tree with root [7,11] (max 20), left child [2,3] (max 5), right child
[9,12] (max 20), and further nodes [16,17] (max 17) and [18,19] (max 19)]
Note that the Interval-Search algorithm will find an interval x that overlaps
with i if one exists in the tree. If max [left[x]] < start[i] then there is no interval
in the left subtree that overlaps with i, so if there is one it must be in the right
subtree. On the other hand, if max [left[x]] ≥ start[i] then some interval i′ in
the left subtree has end [i′ ] ≥ start[i], and if i does not overlap with i′ then
start[i′ ] > end [i]. Then the ordering property of T gives that no interval in the
right subtree of x can start before start[i′ ], so no interval in the right subtree
can overlap with i (so it is the left subtree that should be searched).
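The overlap test used on line 3 can be expressed directly, since two closed
intervals overlap exactly when each starts no later than the other ends. A
minimal Java sketch of this test (a hypothetical helper, not from the course's
IntervalTree class):

   // returns whether the closed intervals [start1,end1] and
   // [start2,end2] have at least one point in common
   private static boolean overlaps(double start1, double end1,
                                   double start2, double end2)
   {  return start1 <= end2 && start2 <= end1;
   }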
Exercise 5.2 (Interval Trees) Design a class called IntervalTree that im-
plements an interval tree using a red-black tree augmented with a max field, and
which has a search method to find an interval which overlaps with a specified
interval.
5.3 B-Trees
Reading: pp434-452
The design and analysis of most data structures presumes that the elements
are held in main memory so that each element is accessed in constant time. If a
collection is very large then its elements might need to be held on a secondary
storage device and only parts of it read in and out from main memory as needed.
The time required to obtain data from secondary storage consists of the seek
time (time required for the device to physically locate the data), the read/write
time (time to read or write the data), and the transfer time (time to transfer
the data to/from main memory). Usually the seek time is the most significant,
so typically entire pages (blocks) of consecutive data are read/written with each
seek, and the page manipulated in main memory to minimize the time delays
associated with the device.
A B-tree is a generalization of a balanced binary search tree that is designed
for use with secondary storage devices where the number of accesses to data
must be kept as low as possible. Each parent node in a B-tree holds multiple
elements and has many child nodes. This keeps the height of the tree low, so
that although a search must compare against several keys within each node, it
only requires up to log_t ((n + 1)/2) disk reads (once the root is initially read
into memory), where t is the minimum order of the B-tree and each node holds
at most 2t − 1 keys.
Inserting a key into a B-tree requires some effort to keep the B-tree balanced.
New keys are always inserted into existing leaf nodes, but since there is an upper
limit 2t − 1 on the number of keys in a node, a full node first needs to be split
(around its median key) into two nodes. The B-Tree-Split-Child algorithm
splits a full child ci (with 2t − 1 keys) of a parent node x (presumed to not itself
be full) into two half-full nodes, y with t − 1 keys and a new node z with t − 1
keys, with the median key key t−1 [y] moved up to be the key key i [x] in x.
B-Tree-Split-Child(x, i)
1 split child ci of parent x into existing child y and new child z
2 y ← ci
3 z ← Allocate-Node()
4 leaf [z] ← leaf [y] z is made a leaf if y is a leaf
5 n[z] ← t − 1
6 move the second half of the keys in y to z
7 for j ← 0 to t − 2 do
8 key j [z] ← key j+t [y]
9 key j+t [y] ← null
10 move the second half of the links (if y not a leaf) in y to z
11 if not leaf [y] then
12 for j ← 0 to t − 1 do
13 cj [z] ← cj+t [y]
14 cj+t [y] ← null
15 n[y] ← t − 1
16 insert link to z in parent node x
17 for j ← n[x] downto i + 1 do
18 cj+1 [x] ← cj [x]
19 ci+1 [x] ← z
20 insert key (median in y) to z in parent node x
21 for j ← n[x] − 1 downto i do
22 key j+1 [x] ← key j [x]
23 key i [x] ← key t−1 [y]
24 key t−1 [y] ← null
25 n[x] ← n[x] + 1
26 update the secondary storage
27 Disk-Write(y)
28 Disk-Write(z)
29 Disk-Write(x)
To avoid changes propagating up the tree (as happens in red-black trees), requir-
ing the retrieval of ancestor nodes again from the storage, the B-Tree-Insert
algorithm anticipates the problem as it traverses down the tree, and splits any
full nodes it encounters. If the root is full, then firstly it gets split, which results
in an additional level being added to the tree. The B-Tree-Insert-Nonfull
algorithm is then used to insert the key k starting at the node x (presumed
not full), recursively splitting any full child nodes it encounters, until a leaf is
reached, in which case k is inserted (in the non-full leaf node).
B-Tree-Insert(T, k)
1 r ← root[T ]
2 if n[r] = 2t − 1 then split the root, increasing height of T by 1
3 s ← Allocate-Node()
4 root[T ] ← s
5 leaf [s] ← false
6 n[s] ← 0
7 c0 [s] ← r
8 B-Tree-Split-Child(s, 0)
9 B-Tree-Insert-Nonfull(s, k)
10 else
11 B-Tree-Insert-Nonfull(r, k)
B-Tree-Insert-Nonfull(x, k)
1 i ← n[x] − 1
2 if leaf [x] then
3 insert k in the leaf at correct position
4 while i ≥ 0 and k < key i [x] do
5 key i+1 [x] ← key i [x]
6 i←i−1
7 key i+1 [x] ← k
8 n[x] ← n[x] + 1
9 Disk-Write(x)
10 else
11 find which child ci [x] of x to traverse to
12 while i ≥ 0 and k < key i [x] do
13 i←i−1
14 i←i+1
15 Disk-Read(ci [x])
16 if n[ci [x]] = 2t − 1 then child ci [x] is full so split it
17 B-Tree-Split-Child(x, i)
18 if k > key i [x] then
19 i←i+1
20 B-Tree-Insert-Nonfull(ci [x], k)
For example, a B-tree with minimum order t = 2 to which the characters
‘F’, ‘S’, ‘Q’, ‘K’, ‘C’, ‘L’, ‘H’, ‘T’, ‘V’, ‘W’, ‘M’, ‘R’, ‘N’, ‘P’ are added in
this order is built as follows.
[Figure: the successive B-trees built by these insertions, beginning with a
single node F, splitting to give root Q once the node F Q S is full, and
ending with root Q, children F K M and T, and leaves C, H, L, N P, R S, V W]
The B-Tree-Delete algorithm searches for the key k to be removed by
traversing down the B-tree starting at the root root[T ]. If the key k is not
found in some node x then its child node ci [x] that will be searched next is
ensured to have at least t keys, one more than the minimum number t − 1 (the
exception is the root which must have at least 1 key if it is not itself a leaf).
This is done by checking the adjacent siblings of the child ci [x], if either has
at least t keys then its end key is moved up to the parent to replace the key
key i [x], which is moved down to ci [x], so that the child now has t keys (this is
essentially a rotation). If neither of the adjacent siblings has t keys, then they
must both have the minimum t − 1 keys, so one is chosen to be merged with
ci [x] together with the parent’s key for ci [x], giving a total of 2t − 1 keys in ci [x].
Note that since x has already been ensured to have t keys (or 2 for the root),
this second case does not reduce the number of keys in x below the minimum
t − 1.
Since the node x to be searched is ensured to have at least t keys, if k is
found in x (either by a linear search or a binary search) then two cases are
possible. If x is a leaf node then the key k is simply removed. However, if x is
a parent node then the removal of k = key i [x] would have repercussions for the
child nodes ci [x] and ci+1 [x]. If ci [x] has at least t keys then its greatest key k 0
(the predecessor of k) can be moved to take the place of k. Likewise, if ci+1 [x]
has at least t keys then its least key (the successor of k) can be moved. The only
remaining possibility is if both ci [x] and ci+1 [x] have t − 1 keys. In this case the
two siblings are merged together into one node with the key k placed between
their respective keys, and recursively k is deleted from the merged node.
For example, removing the key ‘P’ from the previous B-tree would be achieved
by starting at the root and ensuring that the node with keys ‘F’, ‘K’, ‘M’ had
at least t = 2 keys, and then that the node with keys ‘N’ and ‘P’ had at least
t = 2 keys, which they do, so the key ‘P’ is simply removed from the leaf.
[Figure: the B-tree with root Q, children F K M and T, and leaves C, H, L, N, R S, V W]
However, to now remove the key ‘S’ more work is required since the node with
key ‘T’ does not have at least 2 keys, which must be remedied before moving to
this node. Since its sibling has at least t keys the key ‘M’ is moved to replace
‘Q’ which is moved down to join ‘T’ (much like a right rotation in a red-black
tree). Then the node has ‘Q’ and ‘T’ and so is suitable to be searched, as is
the node with ‘R’ and ‘S’, so the key ‘S’ gets removed.
[Figure: the B-tree with root M, children F K and Q T, and leaves C, H, L, N, R, V W]
Next, consider removing the key ‘K’. Since ‘K’ lies in a parent node the children
on either side of the key are checked to see whether either has at least 2 keys
(and if so then ‘K’ would be replaced by either its predecessor or successor
key). In this case both have only the minimum t − 1 = 1 key, so as an
intermediate step ‘H’, ‘K’, ‘L’ are merged into one node with 2t − 1 = 3 keys,
and ‘K’ recursively removed from it.
[Figure: the B-tree with root M, children F and Q T, and leaves C, H L, N, R, V W]
Exercise 5.3 (Modifying a B-Tree) Perform some further add and remove
operations on the above B-tree, and use the classes BTree and BTreeTest to
check your answers.
5.4 Disjoint Sets
A disjoint set collection maintains a collection of disjoint (non-overlapping)
sets of elements and supports the following operations:
Make-Set(x) creates a new set containing just the element x (which must not
be in any other set in the collection),
Union(x, y) forms a new set that is the union of the set containing the element
x with the set containing the element y, and removes those two sets from
the collection, and
Find-Set(x) returns a representative of the set which currently contains the
element x.
B-Tree-Delete(T, k)
1 r ← root[T ]
2 B-Tree-Delete(r, k)
3 if not leaf [r] and n[r] = 0 then root is now parent with no keys
4 root[T ] ← c0 [r] reduce height of B-tree by 1
B-Tree-Delete(x, k)
1 search for k in node x using linear search
2 i←0
3 while i < n[x] and k > key i [x] do
4 i←i+1
5 if i < n[x] and k = key i [x] then k found in x at position i
6 if leaf [x] then
7 delete k from leaf x by moving keys on right left one
8 for j ← i + 1 to n[x] − 1 do
9 key j−1 [x] ← key j [x]
10 key[n[x] − 1] ← null
11 n[x] ← n[x] − 1
12 Disk-Write(x)
13 else delete k from the parent x
14 try to replace k by predecessor key
15 y ← ci [x]
16 Disk-Read(y)
17 if n[y] ≥ t then move key k′ from y
18 k′ ← Delete-Greatest-In-Subtree(y)
19 key i [x] ← k′
20 else try to replace k by successor key
21 z ← ci+1 [x]
22 Disk-Read(z)
23 if n[z] ≥ t then move key k′ from z
24 k′ ← Delete-Least-In-Subtree(z)
25 key i [x] ← k′
26 else both y and z have t-1 keys so merge
27 key n[y] [y] ← k move k to node y
28 remove k from x
29 move all keys and links of node z to y
30 n[y] ← n[y] + n[z] + 1
31 Disk-Write(x)
32 Disk-Write(y)
33 Disk-Free(z)
34 B-Tree-Delete(y, k) recursively delete k from the merged node
35 else k not found in x so search child node ci [x]
36 Disk-Read(ci [x])
37 B-Tree-Ensure-Full-Enough(ci [x])
38 B-Tree-Delete(ci [x], k)
[Figure: the linked list representation of a disjoint set, where each node
holds a link to the next node and a repNode link back to the representative
at the head of the list]
The Union(x, y) operation can be implemented to add the elements of the set
containing y to the set containing x, by iterating through each node in that set
(setting each representative link), changing two further links to link it into the
set containing x, and then removing the set that originally contained y from the
collection.
An aggregate analysis of n Make-Set operations followed by n − 1 Union
operations can be undertaken using the number of links changed as a measure
of the time cost. In the worst case Union would always add the largest set in
the collection to one of the smallest sets.

Operation           Cost    Total
Make-Set(x0 )       1       1
Make-Set(x1 )       1       2
Make-Set(x2 )       1       3
...                 1       ...
Make-Set(xn−1 )     1       n
Union(x1 , x0 )     3       n + 3
Union(x2 , x0 )     4       n + 7
Union(x3 , x0 )     5       n + 12
...                 ...     ...
Union(xn−1 , x0 )   n + 1   n + (n+4)(n−1)/2

Hence these 2n − 1 operations take Θ(n^2 ) time in total in the worst case, an
average of Θ(n) per operation.
/**
An interface that defines the abstract data type for a disjoint
set collection whose sets hold elements with type E
*/
public interface DisjointSetsADT<E>
{
/**
Creates a new set containing just the element x
@param x The element to place in the new set, which must not
already be in any set in the collection
@return A representative of the new set
*/
public E makeSet(E x);
/**
Forms the union of the sets which currently contain the
elements x and y
@param x, y Elements in each set to union (merge) together
@return A representative of the set
*/
public E union(E x, E y);
/**
Returns a representative of the set which currently contains x
@param x The element in the set
@return A representative of the set
*/
public E findSet(E x);
}
/**
A class that implements a disjoint set collection using a linked
list for each set, where each node has a link to the next node in
the list and a link back to the representative at the head
@author Andrew Ensor
*/
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class LinkedDisjointSets<E> implements DisjointSetsADT<E>
{
   private List<Node<E>> repNodes; // head node of each set
   private Map<Node<E>,Integer> setSizes; //each repNode gives set size
   private Map<E, Node<E>> elementMap; // map of elements to locators
public LinkedDisjointSets()
{ repNodes = new ArrayList<Node<E>>();
setSizes = new HashMap<Node<E>, Integer>();
elementMap = new HashMap<E, Node<E>>();
}
public E makeSet(E x)
{ if (elementMap.containsKey(x))
throw new IllegalArgumentException("element already used");
Node<E> node = new Node<E>(x);
node.repNode = node; // rep of the new set is x itself
repNodes.add(node); // add the head of the new set to the list
setSizes.put(node, new Integer(1));
elementMap.put(x, node); // add the new element to the map
return x;
}
public E union(E x, E y)
{ Node<E> nodeX = elementMap.get(x);
Node<E> nodeY = elementMap.get(y);
if (nodeX == null && nodeY == null)
return null; // neither element is in any set
if (nodeX == null)
return nodeY.repNode.x; // x was not in any set
else if (nodeY == null)
return nodeX.repNode.x; // y was not in any set
Node<E> repX = nodeX.repNode;
Node<E> repY = nodeY.repNode;
if (repX == repY)
return repX.x; // same set
else // add the smaller set to the larger set for efficiency
{ int sizeX = setSizes.get(repX).intValue();
int sizeY = setSizes.get(repY).intValue();
if (sizeX < sizeY)
return link(repY, repX); // add set with x to set with y
else
return link(repX, repY); // add set with y to set with x
}
}
// helper method adds second non-empty set to first where each set
// is specified by the node of its representative
private E link(Node<E> repX, Node<E> repY)
{ // insert nodes of second set into first immediately after repX
Node<E> nodeY = repY;
Node<E> previousY = null;
do
{ nodeY.repNode = repX;
previousY = nodeY;
nodeY = nodeY.next;
}
while (nodeY != null);
// link second set into the first set
previousY.next = repX.next;
repX.next = repY;
// update the map of set sizes and list of repNodes
int sizeX = setSizes.get(repX).intValue();
int sizeY = setSizes.get(repY).intValue();
setSizes.put(repX, new Integer(sizeX+sizeY));
setSizes.remove(repY);
repNodes.remove(repY); // setY no longer exists
return repX.x;
}
public E findSet(E x)
{ Node<E> node = elementMap.get(x);
if (node == null)
return null; // element not in any set
else // element is in a set
return node.repNode.x; // return representative of the set
}
// inner class that represents a node in the linked list for a set
private static class Node<E>
{  public E x; // the element held by this node
   public Node<E> next; // link to the next node in the list
   public Node<E> repNode; // link back to the representative
   public Node(E x)
{ this.x = x;
next = null;
repNode = null;
}
}
}
An alternative is to implement each set as a tree whose root holds the
representative for the set and for which each node only holds a link to its
parent node. It can be shown that any sequence of m operations on such a
disjoint set forest (using union by rank and path compression) has worst case
time O(mα(n)), where α(n) is an extremely slow growing function (in fact
α(n) ≤ 4 for n ≤ 10^80 ), so the operations are on average virtually O(1).
/**
A class that implements a disjoint set collection using a tree for
each set, where each node has a link to the parent node in
the tree and the representative is at the root
@author Andrew Ensor
*/
...
public class ForestDisjointSets<E> implements DisjointSetsADT<E>
{
private List<Node<E>> repNodes; // root for each tree in forest
private Map<Node<E>,Integer> setRanks; //each repNode gives set rank
private Map<E, Node<E>> elementMap; // map of elements to locators
public ForestDisjointSets()
{ repNodes = new ArrayList<Node<E>>();
setRanks = new HashMap<Node<E>, Integer>();
elementMap = new HashMap<E, Node<E>>();
}
public E makeSet(E x)
{ if (elementMap.containsKey(x))
throw new IllegalArgumentException("element already used");
Node<E> node = new Node<E>(x);
node.parentNode = node; // parent of the new node is itself
repNodes.add(node); // add the root of the new tree to the list
setRanks.put(node, new Integer(0)); // initial rank is zero
elementMap.put(x, node); // add the new element to the map
return x;
}
public E union(E x, E y)
{ Node<E> nodeX = elementMap.get(x);
Node<E> nodeY = elementMap.get(y);
if (nodeX == null && nodeY == null)
return null; // neither element is in any set
if (nodeX == null)
return getRootNode(nodeY).x; // x was not in any set
else if (nodeY == null)
return getRootNode(nodeX).x; // y was not in any set
Node<E> repX = getRootNode(nodeX);
Node<E> repY = getRootNode(nodeY);
if (repX == repY)
   return repX.x; // both elements already in the same set
else // union by rank: add the tree of smaller rank to the other
{  int rankX = setRanks.get(repX).intValue();
   int rankY = setRanks.get(repY).intValue();
   if (rankX < rankY)
      return link(repY, repX); // add tree with x to tree with y
   else
      return link(repX, repY); // add tree with y to tree with x
}
}
// helper method that returns root node of tree with specified node
private Node<E> getRootNode(Node<E> node)
{ while (node.parentNode != node)
node = node.parentNode;
return node;
}
// helper method adds second non-empty set to first where each set
// is specified by the node of its representative
private E link(Node<E> repX, Node<E> repY)
{ // add the tree rooted at repY as a child of tree rooted at repX
repY.parentNode = repX;
// update the map of set ranks and list of repNodes
int rankX = setRanks.get(repX).intValue();
int rankY = setRanks.get(repY).intValue();
if (rankX == rankY)
setRanks.put(repX, new Integer(++rankX));//add 1 to setX rank
setRanks.remove(repY);
repNodes.remove(repY); // setY no longer exists
return repX.x;
}
public E findSet(E x)
{ Node<E> node = elementMap.get(x);
if (node == null)
return null; // element not in any set
else // element is in a set
return pathCompress(node).x; // return representative of set
}
// recursive helper method that path compresses path up from node
// to root so all nodes along path are now children of root
// returns the eventual parent node (root node) of node
private Node<E> pathCompress(Node<E> node)
{ if (node.parentNode == node)
return node; // node is the root node
Node<E> rootNode = pathCompress(node.parentNode);
node.parentNode = rootNode;
return rootNode;
}
// inner class that represents a node in the tree for a set
private static class Node<E>
{  public E x; // the element held by this node
   public Node<E> parentNode; // link to the parent node in the tree
   public Node(E x)
{ this.x = x;
parentNode = null;
}
}
}
Chapter 6
Graph Algorithms
Reading: pp527-557
A directed graph G is a pair (V, E) where the vertex set V is a set whose
elements are called vertices and the edge set E is a binary relation on V whose
elements are called edges. For any edge (u, v) in E the edge is said to be incident
from the vertex u (or leaves u) and incident to the vertex v (or enters v), and
the vertices u and v are said to be adjacent. The degree of a vertex is the sum
of the number of edges leaving it with the number of edges entering it.
[Figure: a directed graph example, the states of a Java thread (new, runnable,
blocked, sleeping, waiting, dead) with edges for the transitions start, run,
sleep, time expired, wait, notified, monitor unavailable, monitor obtained,
and finished]
[Figure: representations of a flight graph with vertices auc, wel, chr, cha,
fij, sam, tah, bri, syd, mel: an edge set together with adjacency lists for
each vertex (such as auc: auc-wel, auc-chr, auc-fij, auc-sam, auc-tah,
auc-bri, ...; wel: auc-wel, wel-chr, wel-cha), and alternatively a vertex
list indexing a 10 × 10 adjacency matrix whose entries hold the edges]
A breadth-first search starts at some chosen vertex and first visits each of its adjacent vertices, then it visits
each of their adjacent vertices that have not already been visited, and continues
in this way until no further vertices can be reached. Typically a breadth-first
search uses a processing queue of vertices that have been visited but not yet
processed (had all their adjacent vertices visited). Also it needs to keep track
of the vertices it has already visited, either storing them in a collection or
else decorating the vertices (such as by applying the Decorator Pattern to the
vertex node) by colouring the visited vertices (such as white for unvisited, grey
for visited but not yet processed, and black for processed vertices).
The edges that are followed (called tree edges) when visiting a vertex for the
first time form a tree called the predecessor subgraph. For example, performing
a breadth-first search of the directed graph shown on Page ?? starting at
vertex d results in the vertices e, b, a being visited first and then vertices
f and c.
[Figure: the successive states of the processing queue during this search]
One application of breadth-first search is to find paths of least length between
a vertex and other vertices in the graph. Since a breadth-first search starting
at a vertex v builds paths from v incrementing in length by one each time a
vertex is processed it can be shown (see the textbook pp535-537) that the path
obtained when a vertex is first visited actually has minimal length.
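As an illustration, the following is a minimal Java sketch of breadth-first
search (not the course's BreadthFirstSearch class, which instead works with
the GraphADT interface), using a hypothetical adjacency map from each vertex
to the list of its adjacent vertices:

import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

public class BFSSketch
{
   // visits every vertex reachable from start and returns the set of
   // visited vertices, assuming every vertex appears as a key in the
   // map (with an empty list if it has no outgoing edges)
   public static <V> Set<V> search(Map<V,List<V>> adjacency, V start)
   {  Set<V> visited = new HashSet<V>();
      Queue<V> pending = new ArrayDeque<V>(); // the processing queue
      visited.add(start);
      pending.add(start);
      while (!pending.isEmpty())
      {  V u = pending.remove(); // process u
         for (V v : adjacency.get(u))
         {  if (visited.add(v)) // true when v not already visited
               pending.add(v);
         }
      }
      return visited;
   }
}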
A depth-first search first visits one of the adjacent vertices of the starting
vertex and then visits one of its adjacent vertices, continuing until a vertex
is reached that has no unvisited adjacent vertices. Then it backtracks to the
previously visited vertex and visits another of its adjacent vertices, repeating the
process until no further vertices can be reached. Typically a depth-first search
uses either a stack of vertices that have been visited but not yet processed
(the grey vertices), or else uses recursion to perform the search. Similarly to
breadth-first search, the depth-first search algorithm also needs to keep track
of the vertices it has already visited, either by building a collection of visited
vertices or by decorating the vertex nodes with a colour.
A depth-first search of the directed graph shown on Page ?? starting at vertex
d might result in the vertices e, b, c being visited first, then vertex a,
followed by f .
[Figure: the successive states of the stack during this search]
It is interesting to note that if each vertex v is given a time stamp d[v] when it
is first visited (changed from white to grey) and another f [v] when it has been
fully processed (changed from grey to black) then for any two reachable vertices
u and v the time interval from d[u] to f [u] and the time interval from d[v] to
f [v] must satisfy that either one is contained entirely within the other or else
they are disjoint (do not overlap at all).
The classes BreadthFirstSearch and DepthFirstSearch demonstrate how
each search algorithm can be implemented for both undirected and directed
graphs. The constructor of each class starts by colouring each vertex white (un-
visited), using a map to store the colours (as an alternative the Vertex interface
could be decorated with a colour property using the Decorator Pattern). The
search method can be called various times for one graph, since each search
only searches one strongly connected component of the graph. Note that both
these classes use the Template Pattern, with the search algorithm in a tem-
plate method (search) and hook methods vertexDiscovered (called whenever
the colour of a vertex is changed from white to grey), vertexFinished (called
whenever the colour is changed from grey to black), and edgeTraversed (called
when an edge is being followed to a previously unvisited vertex). These methods
can be overridden by subclasses to incorporate the search algorithm as a part
of a more sophisticated algorithm.
To illustrate the flexibility of using hook methods, the class StronglyCon-
nectedComponents uses two subclasses of DepthFirstSearch to determine the
strongly connected components of any directed graph. The CompleteDepth-
FirstSearch subclass repeatedly performs a depth-first search of a graph until
all vertices have been visited. The RecordDepthFirstSearch subclass records
the vertices that have been visited when a search is performed.
Finding strongly connected components is an important tool as many graph
algorithms can only be used on strongly connected graphs, so the StronglyCon-
nectedComponents class can be used to break a graph into its strongly connected
subgraphs, and the algorithms applied to each separately. The components of
a graph G are found in three steps:
• repeatedly perform depth-first searches of G until all vertices have been
visited, and store each vertex once a depth-first search has finished with
it on a stack (the order is essential here),
• form the transpose of G which is the graph formed with the same (equiv-
alent) vertices as G but has all edges reversed,
• use each vertex on the stack in order as a starting vertex for a depth-first
search of the transpose graph, and note the vertices that are visited in
each search (which give the components of the original graph).
Topological-Sort(G)
1 repeatedly call Depth-First-Search(G, v) until all vertices visited
2 add each vertex to the front of a list as it is finished in the search
3 return the list of vertices
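A topological sort orders the vertices of a directed acyclic graph so that every
edge leads from an earlier vertex to a later one. As a sketch, it can be written
directly in Java with a recursive depth-first search, again assuming the
hypothetical adjacency-map representation used in the breadth-first sketch
above (the course classes would instead subclass DepthFirstSearch):

import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class TopologicalSortSketch
{
   public static <V> List<V> sort(Map<V,List<V>> adjacency)
   {  Set<V> visited = new HashSet<V>();
      LinkedList<V> order = new LinkedList<V>();
      for (V v : adjacency.keySet()) // until all vertices visited
      {  if (!visited.contains(v))
            visit(adjacency, v, visited, order);
      }
      return order;
   }

   // depth-first search that adds each vertex to the front of the
   // list as it is finished
   private static <V> void visit(Map<V,List<V>> adjacency, V u,
      Set<V> visited, LinkedList<V> order)
   {  visited.add(u);
      for (V v : adjacency.get(u))
      {  if (!visited.contains(v))
            visit(adjacency, v, visited, order);
      }
      order.addFirst(u);
   }
}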
/**
A class which contains the breadth first search algorithm for any
graph that implements the GraphADT interface and whose vertices
hold elements of generic type E that has suitable hashing function
@author Andrew Ensor
*/
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Set;
/**
A class which contains the depth first search algorithm for any
graph that implements the GraphADT interface and whose vertices
hold elements of generic type E that has suitable hashing function
@author Andrew Ensor
*/
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Set;
/**
A class which contains the strongly connected components algorithm
for a graph that implements GraphADT interface and whose vertices
hold elements of generic type E that has suitable hashing function
@author Andrew Ensor
*/
import java.util.HashSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.Stack;
6.2 Minimal Spanning Trees
[Figure: two weighted graphs, one with vertices Auc, Wel, Chr, Cha, Fij, Sam,
Tah, Bri, Syd, Mel and dollar-cost edges such as Fij-Sam $380, Auc-Wel $90,
Wel-Chr $120 and Bri-Syd $750, the other with vertices a, b, c, d, e, f and
integer edge weights between 2 and 15]
A spanning tree for an undirected connected graph G = (V, E) is a tree
which is a subgraph of G that includes all the vertices V . If a connected graph
has only a finite number of edges then a spanning tree can be obtained from the
graph by progressively removing edges from closed paths until no closed paths
remain. Part of the importance of having a spanning tree for a connected graph
is that every two vertices in the graph are then connected by a unique path in
the spanning tree. For example, the following are two spanning trees for the
previous graphs.
[Figure: a spanning tree for each of the previous two graphs]
Many spanning trees are possible for a weighted graph, and they can have
different total weights. A minimal spanning tree for a connected weighted graph
is a spanning tree for the graph that has the smallest possible weight, in the
sense that the sum of the weights for all its edges is as small as possible.
Rather than starting with a connected graph and removing heaviest edges
one by one until a minimal spanning tree is found, it is more practical (and
efficient) to gradually build up a minimal spanning tree starting from any chosen
vertex. The following terminology assists in building a minimal spanning tree.
A cut of a graph G = (V, E) is a partition of the vertices V into two disjoint
(and non-empty) sets S and V − S. An edge (u, v) is said to cross the cut if
either u ∈ S and v ∈ V − S or else u ∈ V − S and v ∈ S. The cut is said to
respect a set A of edges of the graph if no edge in A crosses the cut.
[Figure: a cut of the six-vertex weighted graph, with the edges that cross the
cut shown crossing the partition]
Suppose A is contained in some minimal spanning tree T for an undirected
and connected weighted graph, and suppose S, V − S is a cut of the graph that
respects A. Then A can be extended as follows. Since the graph is connected
there must be edges that cross the cut, and adding any of them to A would not
create a closed path in A (as A respects the cut). So one of the edges (u, v) that
crosses the cut which has least weight is chosen to be added to A. Note that A
will still be a subset of some minimal spanning tree as if T itself did not contain
the edge (u, v) then it must contain a path from u to v that includes some other
MST-Kruskal(G, w)
1 A←∅ set A holds edges in minimal spanning tree
2 for each vertex v ∈ V [G] do
3 Make-Set(v)
4 sort the edges E of G into increasing order of weight w
5 for each edge (u, v) ∈ E, taken in increasing order do
6 if Find-Set(u) ≠ Find-Set(v) then
7 add edge (u, v) to A
8 Union(u, v)
9 return A
edge (x, y) that crosses the cut (because T spans the graph). This edge (x, y)
in T could be replaced by the edge (u, v) without increasing the weight of T
(since (u, v) was chosen to have least weight crossing the cut) and still be a tree
(without closed paths) that spans the graph. Hence A is still contained in some
minimal spanning tree.
The fact that a subset of a minimal spanning tree can be grown across a
cut of the graph by adding an edge that crosses the cut with least weight shows
that the problem of finding a minimal spanning tree satisfies the greedy-choice
property. It also has optimal substructure as if T is a minimal spanning tree
and a cut is made of the graph for which exactly one edge (u, v) of T crosses
the cut then the two subtrees formed by removing edge (u, v) from T must
also both be minimal spanning trees on each side of the cut (otherwise a tree
with smaller weight than T could be built). Thus a greedy technique would be
suitable for solving the minimal spanning tree problem if appropriate cuts of
the graph could be found.
One greedy algorithm for finding a minimal spanning tree is Kruskal’s algo-
rithm. It starts with a forest A of disjoint trees, each initially a single vertex,
and gradually joins them together until only a single tree remains. At each step
the algorithm takes the least weight edge (u, v) not already considered: if u and
v belong to the same tree in A then the edge is discarded; otherwise a cut of the
graph can be made which respects A and which (u, v) crosses, so (u, v) can be
added to A. A convenient way of efficiently checking whether the end points u
and v already belong to the same tree in A (and so would cause a closed path)
is to also hold the vertices of A in a disjoint set data structure, with one set of
vertices for each tree. If both end points u and v belong to the same set then
the edge should not be used to grow A.
For example, Kruskal's algorithm can be applied to the weighted graph with
six vertices and ten edges illustrated earlier. Each edge is considered in
turn, in order of increasing weight. Starting with six disjoint trees, each a single
vertex, the least weight edge (a, b) is added to A, joining together two of the trees
in A, then the next-smallest weight edge (c, d) is added, joining together two
more trees in A. When the edge (c, e) is eventually considered it is discarded as
both endpoints c and e are already in the same tree (so also in the same set in
the disjoint set data structure). Once all the edges have been considered there
remains just one tree in A which is a minimal spanning tree.
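As a standalone illustration (not the GraphADT-based classes of the course), Kruskal's algorithm might be coded as the following minimal sketch, with vertices 0..n−1, each edge an int triple {u, v, weight}, and the disjoint sets held in a simple parent array with path compression:
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
// Minimal sketch of Kruskal's algorithm; vertices are 0..n-1 and each
// edge is an int triple {u, v, weight} (illustrative representation)
public class KruskalSketch
{
   private int[] parent; // disjoint-set forest, one set per tree in A

   private int findSet(int v)
   {  if (parent[v] != v)
         parent[v] = findSet(parent[v]); // path compression
      return parent[v];
   }

   public List<int[]> minimalSpanningTree(int n, int[][] edges)
   {  parent = new int[n];
      for (int v = 0; v < n; v++)
         parent[v] = v; // Make-Set(v)
      // sort the edges into increasing order of weight
      Arrays.sort(edges, (e1, e2) -> Integer.compare(e1[2], e2[2]));
      List<int[]> a = new ArrayList<int[]>(); // edges of spanning tree
      for (int[] edge : edges)
      {  int ru = findSet(edge[0]), rv = findSet(edge[1]);
         if (ru != rv) // endpoints in different trees, so no closed path
         {  a.add(edge);
            parent[ru] = rv; // Union(u, v) joins the two trees
         }
      }
      return a;
   }
}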
MST-Prim(G, w, r)
1 for each vertex u ∈ V [G] do
2 if u ≠ r then
3 leastEdge[u] ← null
4 Enqueue(Q, u)
5 A←∅ set A holds edges in minimal spanning tree
6 addedVertex ← r keeps track of most-recent vertex added to A
7 while size[Q] > 0 do
8 update vertices on Q that are adjacent to addedVertex
9 for each edge e ∈ incident[addedVertex ] do
10 v ← opposite[e, addedVertex ]
11 if v ∈ Q then
12 if weight[leastEdge[v]] > weight[e] then
13 leastEdge[v] ← e
14 priority queue Q now has vertex with least cross edge at head
15 v ← dequeue[Q]
16 add [A, leastEdge[v]]
17 addedVertex ← v
18 return A
[Figure: six snapshots of Kruskal's algorithm growing the forest A on the graph with vertices a–f.]
Another greedy algorithm for finding a minimal spanning tree is Prim’s
algorithm. Prim’s algorithm works by enlarging a single tree A, starting from
any specified vertex r. At each stage a cut is made to separate A from the rest
of the graph and an edge (u, v) with least weight that crosses the cut is added
to A. In order to efficiently find the least weight edge (u, v) a priority queue Q
is used to hold each vertex that has not yet been added to A, together with the
least weight edge found so far between that vertex and some vertex in A. Each
time another edge and vertex is added to A the vertices still in the queue that
are adjacent to the new vertex are checked to see whether their least weight
edge should be updated, as the adjacent vertices might now be closer to A.
For example, Prim’s algorithm can be used on the previous weighted graph
that has six vertices and ten edges. Starting with the vertex a, the smallest
weight edge that joins this vertex to another vertex has weight 2 (and ap-
pears at the head of the queue). Once this edge is added to the tree, the tree
has four edges joining its two vertices to other vertices, of weights 15, 9, 10,
5. Adding the edge with smallest weight (weight 5), and continuing the al-
gorithm the tree grows until all the vertices are included in the set A. The
class MinimalSpanningTreePrim demonstrates how Prim’s algorithm can be
implemented using a queue for the vertices that are still to be processed.
[Figure: six snapshots of Prim's algorithm growing a minimal spanning tree of the graph with vertices a–f, starting from vertex a.]
If the graph has m edges and n vertices then either algorithm is actually
O(m log2 n), although with a judicious choice of data structure (such as a
Fibonacci heap) Prim's algorithm can be made O(m + n log2 n) (an improvement
for dense graphs).
/**
A class which contains Prim’s algorithm for finding a minimal
spanning tree in an undirected and connected weighted graph that
implements GraphADT interface and whose vertices hold elements of
generic type E that has suitable hashing function
@author Andrew Ensor
*/
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Set;
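As a minimal standalone sketch of the same idea (an illustrative reconstruction, not the GraphADT-based class above), Prim's algorithm over an adjacency matrix can be coded as follows; java.util.PriorityQueue provides no decrease-key operation, so the sketch uses lazy deletion and simply skips stale queue entries:
import java.util.Arrays;
import java.util.PriorityQueue;
// Minimal sketch of Prim's algorithm; w[u][v] > 0 gives the weight of
// edge (u,v) and w[u][v] == 0 means no edge (illustrative representation)
public class PrimSketch
{
   // returns leastEdge where leastEdge[v] is the parent of v in a
   // minimal spanning tree rooted at r (and leastEdge[r] is -1)
   public static int[] minimalSpanningTree(int[][] w, int r)
   {  int n = w.length;
      int[] leastWeight = new int[n]; // least cross-edge weight so far
      int[] leastEdge = new int[n];
      boolean[] inTree = new boolean[n];
      Arrays.fill(leastWeight, Integer.MAX_VALUE);
      Arrays.fill(leastEdge, -1);
      leastWeight[r] = 0;
      PriorityQueue<int[]> queue = // {vertex, key} pairs ordered by key
         new PriorityQueue<int[]>((p, q) -> Integer.compare(p[1], q[1]));
      queue.add(new int[]{r, 0});
      while (!queue.isEmpty())
      {  int u = queue.poll()[0];
         if (inTree[u])
            continue; // stale entry, u was already added to the tree
         inTree[u] = true;
         for (int v = 0; v < n; v++) // update adjacent vertices
         {  if (w[u][v] > 0 && !inTree[v] && w[u][v] < leastWeight[v])
            {  leastWeight[v] = w[u][v];
               leastEdge[v] = u;
               queue.add(new int[]{v, w[u][v]}); // lazy decrease-key
            }
         }
      }
      return leastEdge;
   }
}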
Bellman-Ford(G, w, s)
1 for each vertex v ∈ V [G] do
2 d[v] ← ∞
3 leastEdge[v] ← null last edge on shortest path to v
4 d[s] ← 0
5 n ← |V [G]| n is the number of vertices in G
6 for i ← 1 to n − 1 do
7 for each edge e ∈ E[G] do e can be directed or undirected
8 relax edge e = (u, v)
9 u ← start[e]
10 v ← end [e]
11 if d[u] + w[e] < d[v] then
12 d[v] ← d[u] + w[e]
13 leastEdge[v] ← e
14 check whether any edge can still be relaxed
15 for each edge e ∈ E[G] do
16 u ← start[e]
17 v ← end [e]
18 if d[u] + w[e] < d[v] then
19 return false G has a negative weight closed path
20 return true
As well as the shortest-path estimate d[v] for each vertex v, the Bellman-Ford
algorithm also holds the edge leastEdge[v]
that is the last edge in a path from s to v with weight d[v] (so that the shortest
path with weight d[v] can be reconstructed once the algorithm has finished).
Since each of the n − 1 iterations checks whether every edge can be relaxed the
algorithm is Θ(mn).
To verify that the algorithm does find for every vertex v a shortest path
from s to v consider the following loop invariant:
at the start of iteration i the value d[v] is the weight of the shortest
path from s to v which uses at most i − 1 edges (or possibly i
depending on the order in which edges are processed).
At the start of iteration i = 1 the vertex s has d[s] = 0 and all other vertices
v have d[v] = ∞, so the loop invariant holds for i = 1 (since the path using 0
edges from s to s has total weight 0). Suppose the loop invariant holds for some
value of i ≥ 1, and consider a vertex v during iteration i of the loop. Any path
e1 , e2 , . . . , ei of i edges from s to v must pass through an adjacent vertex u of
v, and so e1 , e2 , . . . , ei−1 must form a path from s to u. By the loop invariant,
d[u] is at most the weight of the path e1 , e2 , . . . , ei−1 , so the total weight of
the i edges is at least d[u] + w(ei ). But iteration i of the loop checks whether
d[u] + w(ei ) < d[v] and updates the value of d[v] if this is the case. Hence the
loop invariant holds at the end of the iteration (and start of iteration i + 1). At
the start of iteration n (when the loop terminates) the loop invariant gives that
d[v] is the weight of the shortest path from s to v that uses at most n − 1 edges.
Note that if there were a shorter path using n (or more) edges, then such a path
would have to include some vertex twice, so contain a closed path of negative
weight. The algorithm finishes by checking for the possibility of a closed path
of negative weight, by seeing whether any edge can still be relaxed; if none can
then a shortest path to each vertex v has been found.
For example, suppose the Bellman-Ford algorithm is to be applied to the
illustrated directed graph using the source vertex a, finding the shortest path
from a to any other vertex v in the (connected) graph. Initially, d[a] = 0 and
d[v] = ∞ for all other vertices. After one iteration the edges incident to the
source vertex get relaxed, so d[b] = 2 and d[c] = 15. After the following
iteration further edges get relaxed (the number of relaxations depends on the
order in which edges are considered), resulting in d[c] being decreased to 11,
and d[d] = 13, d[e] = 7. After three further iterations each d[v] holds the weight
of the shortest path from a to v and the leastEdge[v] edges can be followed in
reverse from v to a to find the edges in the shortest paths.
[Figure: a directed weighted graph with vertices a–f, including edges of negative weight −1 and −2.]
[Figure: six snapshots of the Bellman-Ford algorithm on the example graph, with the d[v] values converging to d[a] = 0, d[b] = 2, d[c] = 5, d[d] = 8, d[e] = 7, d[f] = 10.]
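As a standalone illustration (not a listing from the course classes), the Bellman-Ford algorithm might be coded as the following minimal sketch, with vertices 0..n−1 and each edge an int triple {u, v, weight}; the INFINITY sentinel is chosen so that sums cannot overflow:
// Minimal sketch of the Bellman-Ford algorithm; edges are {u, v, weight}
// triples over vertices 0..n-1, and d[v] is the shortest-path estimate
public class BellmanFordSketch
{
   private static final long INFINITY = Long.MAX_VALUE / 2;

   public static long[] shortestPaths(int n, int[][] edges, int s)
   {  long[] d = new long[n];
      java.util.Arrays.fill(d, INFINITY);
      d[s] = 0;
      for (int i = 1; i <= n - 1; i++) // n-1 passes over every edge
      {  for (int[] e : edges)
         {  if (d[e[0]] < INFINITY && d[e[0]] + e[2] < d[e[1]])
               d[e[1]] = d[e[0]] + e[2]; // relax edge e = (u,v)
         }
      }
      for (int[] e : edges) // check whether any edge can still be relaxed
      {  if (d[e[0]] < INFINITY && d[e[0]] + e[2] < d[e[1]])
            throw new IllegalStateException(
               "graph has a negative weight closed path");
      }
      return d;
   }
}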
If the edge weights in a directed or undirected graph are all non-negative then
a more efficient alternative to the Bellman-Ford algorithm can be used to solve
the single-source shortest path problem. Dijkstra’s algorithm is an algorithm
that grows a set S of vertices whose final shortest-path weights from the source
s have already been found, starting with S = {s}. It is a greedy algorithm as
it always selects the vertex v ∈ V − S not already in S that has the smallest
shortest-path estimate d[v] as the next element to add to S. For efficiency the
vertices not yet added to S are held in a priority queue Q ordered by their
shortest-path estimates d[v].
Dijkstra(G, w, s)
1 for each vertex v ∈ V [G] do
2 leastEdge[v] ← null last edge on shortest path to v
3 if v ≠ s then
4 d[v] ← ∞
5 enqueue[Q, v]
6 else
7 d[s] ← 0
8 S ← {s} set S holds vertices whose final shortest path are known
9 A←∅ set A holds edges in shortest paths tree
10 addedVertex ← s
11 while size[Q] > 0 do
12 relax edges incident to addedVertex
13 for each edge e ∈ incident[addedVertex ] do
14 relax edge e = (addedVertex , v)
15 v ← opposite[e, addedVertex ]
16 if d[addedVertex ] + w[e] < d[v] then
17 d[v] ← d[addedVertex ] + w[e]
18 leastEdge[v] ← e
19 priority queue Q now has vertex with smallest d[v] at head
20 addedVertex ← dequeue[Q]
21 add [S, addedVertex ]
22 add [A, leastEdge[addedVertex ]]
23 return A
To see why a greedy choice works in the single-source shortest path problem
consider any iteration of the while loop where the set S already holds vertices
whose final shortest-path weights are known and a vertex u is at the head of
the priority queue, so that d[u] ≤ d[y] for all other vertices y ∈ V − S. For the
greedy-choice property to hold it must be shown that d[u] is the weight of the
shortest path from s to u. Suppose instead that there is a shorter path that is
a shortest path from s to u (proof by contradiction), and let (x, y) denote the
first edge in this shorter path for which x ∈ S but y ∉ S. As x ∈ S the edge
(x, y) must have been relaxed in the earlier iteration when x was added to S
so d[y] is at most the weight of the portion of this path to y. Furthermore, as
this is a shorter path to u it follows that d[y] < d[u] (so long as all edges have
non-negative weights), which contradicts the fact that u is at the head of the
queue.
/**
A class which contains Dijkstra’s algorithm for solving the
single-source shortest path problem in a directed or undirected
weighted graph that implements GraphADT interface and whose
vertices hold elements of generic type E that has suitable hashing
function (note all edges presumed to have non-negative weight)
@author Andrew Ensor
*/
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Set;
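A minimal standalone sketch of the same algorithm over an adjacency matrix is the following (again using lazy deletion in place of a decrease-key operation; the names and representation are illustrative, not the course's GraphADT):
import java.util.Arrays;
import java.util.PriorityQueue;
// Minimal sketch of Dijkstra's algorithm; w[u][v] > 0 gives the weight
// of edge (u,v) and w[u][v] == 0 means no edge (all weights non-negative)
public class DijkstraSketch
{
   public static long[] shortestPaths(int[][] w, int s)
   {  int n = w.length;
      long[] d = new long[n];
      Arrays.fill(d, Long.MAX_VALUE);
      d[s] = 0;
      boolean[] done = new boolean[n]; // the set S of finished vertices
      PriorityQueue<long[]> queue = // {vertex, estimate} pairs
         new PriorityQueue<long[]>((p, q) -> Long.compare(p[1], q[1]));
      queue.add(new long[]{s, 0});
      while (!queue.isEmpty())
      {  int u = (int)queue.poll()[0];
         if (done[u])
            continue; // stale queue entry
         done[u] = true; // add u to S, its shortest path is now known
         for (int v = 0; v < n; v++)
         {  if (w[u][v] > 0 && !done[v] && d[u] + w[u][v] < d[v])
            {  d[v] = d[u] + w[u][v]; // relax edge (u,v)
               queue.add(new long[]{v, d[v]});
            }
         }
      }
      return d;
   }
}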
Hence, once d1 is found, then d2, d4, d8, d16, . . . , dk are each found in turn until
k ≥ n − 1. This approach gives a Θ(n³ log2 n) algorithm for solving the all-pairs
shortest path problem.
Note that each dk can be viewed as an n × n matrix, and the process of
finding dk+l from dk and dl is often viewed as a matrix multiplication where
instead of adding and multiplying together the entries of the matrices (weights)
the operations used are minimum and addition.
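In code this min-plus "multiplication" is the ordinary triple loop with minimum in place of addition and addition in place of multiplication. A minimal sketch (the INFINITY sentinel is chosen so that sums cannot overflow):
// Minimal sketch of the min-plus matrix "product" used to combine
// shortest-path matrices: entry [i][j] of the result is the least value
// of dk[i][x] + dl[x][j] over all intermediate vertices x
public class MinPlusProduct
{
   public static final long INFINITY = Long.MAX_VALUE / 2;

   public static long[][] multiply(long[][] dk, long[][] dl)
   {  int n = dk.length;
      long[][] result = new long[n][n];
      for (int i = 0; i < n; i++)
      {  for (int j = 0; j < n; j++)
         {  long least = INFINITY;
            for (int x = 0; x < n; x++) // min replaces sum, + replaces product
               least = Math.min(least, dk[i][x] + dl[x][j]);
            result[i][j] = least;
         }
      }
      return result;
   }
}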
For example, suppose the weight of the shortest paths between every pair of
vertices is to be found for the illustrated directed graph. Taking the vertices in
the order a, b, c, d, e, f and forming 6 × 6 matrices dk, the values of d0, d1,
d2 = d1 · d1, d4 = d2 · d2, and d8 = d4 · d4 can be found.
[Figure: the directed weighted graph with vertices a–f used in the previous section.]
d0 = ( 0  ∞  ∞  ∞  ∞  ∞ )      d1 = ( 0   2  15   ∞  ∞   ∞ )
     ( ∞  0  ∞  ∞  ∞  ∞ )           ( ∞   0   9  11  5   ∞ )
     ( ∞  ∞  0  ∞  ∞  ∞ )           ( ∞  −1   0   3  6   ∞ )
     ( ∞  ∞  ∞  0  ∞  ∞ )           ( ∞   ∞   ∞   0  5   2 )
     ( ∞  ∞  ∞  ∞  0  ∞ )           ( ∞   ∞  −2   ∞  0   7 )
     ( ∞  ∞  ∞  ∞  ∞  0 )           ( ∞   ∞   ∞   1  ∞   0 )

d2 = d1 · d1 = ( 0   2  11  13  7   ∞ )
               ( ∞   0   3  11  5  12 )
               ( ∞  −1   0   3  4   5 )
               ( ∞   ∞   3   0  5   2 )
               ( ∞  −3  −2   1  0   7 )
               ( ∞   ∞   ∞   1  6   0 )

d4 = d2 · d2 = ( 0   2   5   8  7  14 )
               ( ∞   0   3   6  5   8 )
               ( ∞  −1   0   3  4   5 )
               ( ∞   2   3   0  5   2 )
               ( ∞  −3  −2   1  0   3 )
               ( ∞   3   4   1  6   0 )

d8 = d4 · d4 = ( 0   2   5   8  7  10 )
               ( ∞   0   3   6  5   8 )
               ( ∞  −1   0   3  4   5 )
               ( ∞   2   3   0  5   2 )
               ( ∞  −3  −2   1  0   3 )
               ( ∞   3   4   1  6   0 )
The entries of the matrix d8 then give the weights of the shortest paths between
any pair of vertices. For example, since the 2, 6-entry is 8 the shortest path from
vertex b to vertex f must have weight 8.
The all-pairs shortest path problem can actually be solved in Θ(n³) using
another characterization of shortest path, where the choice of intermediate (non-
endpoint) vertices that are in the path is considered rather than the number
of edges in the path. Let v0 , v1 , . . . , vn−1 denote the n vertices of the graph,
which is presumed again not to have any closed paths with negative weight.
Suppose u and v are vertices and that some shortest path from u to v only uses
intermediate vertices from v0, v1, . . . , vk−1 (so apart from the end points u and
v the path does not include any of vk, vk+1, . . . , vn−1). Since the graph
does not contain any closed paths with negative weights the shortest path can
be presumed to not use any vertex twice. There are two possibilities to consider,
either the path includes vertex vk−1 once or else it doesn’t include vk−1 as an
intermediate vertex at all. In the case where vk−1 does appear in the path, the
portion of the path up to vk−1 must be a shortest path from u to vk−1 and
the portion after vk−1 must be a shortest path from vk−1 to v. Furthermore,
neither of these portions contain vk−1 as an intermediate vertex (since vk−1 only
appeared once in the shortest path from u to v). This shows that the all-pairs
shortest path problem has another type of optimal substructure.
The Floyd-Warshall algorithm is a Θ(n³) dynamic programming solution
for the all-pairs shortest path problem. It iteratively finds the smallest possible
134 CHAPTER 6. GRAPH ALGORITHMS
weights for paths between pairs of vertices that only use intermediate vertices
from v0 , v1 , . . . , vk−1 , starting with k = 0, then for k = 1, 2, . . . , n. When k = n
there is no restriction made on the intermediate vertices so the problem is solved.
For vertices u and v let dk [u, v] denote the smallest weight for paths from
u to v that only use intermediate vertices from v0 , v1 , . . . , vk−1 , and let pk [u, v]
denote the vertex x (the last intermediate vertex) before v on that smallest
path from u to v. For k = 0, if u and v are adjacent with least weight edge
e incident to them then d0 [u, v] = w[e] and p0 [u, v] = u, whereas if u and
v are not adjacent then d0 [u, v] = ∞ and p0 [u, v] is taken to be null. For
k > 0, if there is no path from u to v that uses intermediate vertices from
v0, v1, . . . , vk−1 then dk [u, v] = ∞ and pk [u, v] = null (the same as dk−1 [u, v]
and pk−1 [u, v]). If instead there is a path from u to v using only intermediate
vertices from v0 , v1 , . . . , vk−1 then the optimal substructure gives that either
dk [u, v] = dk−1 [u, v] and pk [u, v] = pk−1 [u, v] (if the least weight such path
doesn’t include vk−1 ) or else dk [u, v] = dk−1 [u, vk−1 ] + dk−1 [vk−1 , v] (the sum of
weights of two shortest paths, neither of which includes vk−1 as an intermediate
vertex) and pk [u, v] = pk−1 [vk−1 , v]. Thus for k > 0:
dk [u, v] = min (dk−1 [u, v], dk−1 [u, vk−1 ] + dk−1 [vk−1 , v]) .
Floyd-Warshall(w)
1 solves all-pairs shortest paths problem for graph with weights w
2 n ← rows[w] graph has vertices v0 , v1 , . . . , vn−1
3 for i ← 0 to n − 1 do
4 for j ← 0 to n − 1 do
5 d0 [i, j] ← w[i, j] weight of edge from vi to vj
6 if there is an edge from vi to vj then
7 p0 [i, j] ← i
8 else
9 p0 [i, j] ← null
10 for k ← 1 to n do consider paths with intermed. from v0 , v1 , . . . , vk−1
11 for i ← 0 to n − 1 do consider paths from vi
12 for j ← 0 to n − 1 do consider paths to vj
13 find least weight such path
14 s ← dk−1 [i, k − 1] + dk−1 [k − 1, j] via vk−1
15 if dk−1 [i, j] ≤ s then
16 dk [i, j] ← dk−1 [i, j] least not via vk−1
17 pk [i, j] ← pk−1 [i, j]
18 else least is via vk−1
19 dk [i, j] ← s
20 pk [i, j] ← pk−1 [k − 1, j]
21 return dn and pn
Checking an entry such as the 2, 6-entry of p6 gives that the vertex preceding
f on the shortest path from b to f is v3 = d. Then checking the 2, 4-entry
gives the vertex preceding d is v2 = c, checking the 2, 3-entry gives the vertex
preceding c is v4 = e, and checking the 2, 5-entry gives the vertex preceding e
is v1 = b. Hence the shortest path from b to f is via e, c, and then d.
Transitive-Closure(G)
1 determines the transitive closure of the graph G
2 n ← |V [G]|
3 for i ← 0 to n − 1 do
4 for j ← 0 to n − 1 do
5 if i = j or there is an edge from vi to vj then
6 t0 [i, j] ← true
7 else
8 t0 [i, j] ← false
9 for k ← 1 to n do consider paths with intermed. from v0 , v1 , . . . , vk−1
10 for i ← 0 to n − 1 do consider paths from vi
11 for j ← 0 to n − 1 do consider paths to vj
12 tk [i][j] ← tk−1 [i][j] | (tk−1 [i][k−1] & tk−1 [k−1][j])
13 return tn
/**
A class that demonstrates the Floyd-Warshall algorithm for solving
the all-pairs shortest paths problem in O(n^3)
*/
public class AllPairsFloydWarshall
{
private static final int INFINITY = Integer.MAX_VALUE;
private static final int NO_VERTEX = -1;
private int n; // number of vertices in the graph
private int[][][] d; //d[k][i][j] is weight of path from v_i to v_j
private int[][][] p; //p[k][i][j] is penultimate vertex in path
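A solve method continuing this class along the lines of the pseudocode might look like the following sketch; it presumes w[i][j] holds INFINITY where there is no edge and 0 on the diagonal, and uses long arithmetic to guard against overflow when INFINITY entries are added:
   // (sketch) fills d and p following the Floyd-Warshall pseudocode,
   // where w[i][j] is the weight of the edge from v_i to v_j
   public void solve(int[][] w)
   {  n = w.length;
      d = new int[n+1][n][n];
      p = new int[n+1][n][n];
      for (int i = 0; i < n; i++)
      {  for (int j = 0; j < n; j++)
         {  d[0][i][j] = w[i][j];
            p[0][i][j] = (w[i][j] < INFINITY) ? i : NO_VERTEX;
         }
      }
      for (int k = 1; k <= n; k++) // intermediates from v_0,...,v_{k-1}
      {  for (int i = 0; i < n; i++)
         {  for (int j = 0; j < n; j++)
            {  // long arithmetic guards against INFINITY overflow
               long s = (long)d[k-1][i][k-1] + d[k-1][k-1][j];
               if (d[k-1][i][j] <= s)
               {  d[k][i][j] = d[k-1][i][j]; // least not via v_{k-1}
                  p[k][i][j] = p[k-1][i][j];
               }
               else
               {  d[k][i][j] = (int)s; // least is via v_{k-1}
                  p[k][i][j] = p[k-1][k-1][j];
               }
            }
         }
      }
   }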
These residual capacities define a residual network whose vertices are the same
as those of the original flow network, but whose edges are the pairs (u, v) for
which cf (u, v) > 0.
For example, the flow from Page 137 has the illustrated residual network.
Note that the edge (a, c) has residual capacity cf (a, c) = 10 since the flow used
f (a, c) = 5 units of the original capacity c(a, c) = 15, and the edge (c, a) has
residual capacity cf (c, a) = 5, since although the original network did not
contain an edge (c, a) up to 5 units of existing flow from a to c can be cancelled.
[Figure: the residual network for the example flow, with vertices a–f.]
Residual capacities are useful for augmenting the value of a flow f , since if a
flow g can be found in the residual network then the combined flow f + g gives a
flow with value |f | + |g|. An augmenting path is a path in the residual network
from the source to the sink. Such a path can be used to obtain a flow g in the
residual network by taking as its value the minimum of the residual capacities
along the edges of the path.
For example, one augmenting path in the previous residual network starts at
the source and passes through c and e, ending at the sink. The capacities of
the edges are all at least 1 unit, so the residual network has sufficient capacity
for a flow g where g(a, c) = 1, g(c, e) = 1, g(e, f ) = 1. This results in a flow
f + g with larger value 7 + 1 = 8.
[Figure: the residual network with the augmenting path through a, c, e, f highlighted.]
Since the value of a flow increases whenever it is augmented the residual
network for the flow must eventually not contain any further augmenting paths
(so long as it has only integer flow values). The Ford-Fulkerson method uses this
fact to solve the maximum-flow problem by repeatedly combining flows obtained
from any augmenting path until no further augmenting paths are possible. To
understand why this method results in a flow with maximum value it is useful
to consider a cut of the network into two disjoint sets of vertices S and V − S,
where the source is in S and the sink is in V − S on the opposite side of the
cut. The capacity of the cut is taken to be the sum of all c(u, v) where u ∈ S
and v ∈ V − S. It is not difficult to show that the value of a flow |f | cannot
exceed the capacity of any cut of the network (see for example, Page 656 of the
textbook). The following result ensures that when the Ford-Fulkerson method
terminates the flow will be maximum, and its value will be the same as the
capacity of some minimum cut (this is justified on Page 657 of the textbook).
Max-Flow Min-Cut Theorem: Suppose f is a flow in a flow network. Then
the following are equivalent:
• f is a maximum flow,
• the residual network for f contains no augmenting paths,
• |f | = c(S, V − S) for some cut S, V − S of the network.
The time taken by the Ford-Fulkerson method can be improved
by using a greedy technique, always choosing an augmenting path that uses the
least number of edges in the residual network. When augmenting paths are
chosen in this order the Ford-Fulkerson method is known as the Edmonds-Karp
algorithm, and is O(m²n). One simple strategy for finding an augmenting path
with the least number of edges is to use a breadth-first search in the residual
network for the sink starting at the source.
The class MaxFlowEdmondsKarp uses the Edmonds-Karp algorithm to solve
the maximum flow problem for the network on Page 137. It starts with a
flow with value 0, and augments it by a flow with value 2 using an augmenting
path through the vertices a, b, d, f . Next, it augments by a flow with value
6 through the vertices a, c, e, f . Then it augments by a flow with value 1
through the vertices a, c, b, e, f (note the longer path). Since there are no
further augmenting paths available in the residual network the flow must now
have maximum value 2 + 6 + 1 = 9.
[Figure: four snapshots of the Edmonds-Karp algorithm on the example network as the flow value grows from 0 to 2, then 8, then 9.]
Exercise 6.5 (Finding the Maximum Flow) Apply the Edmonds-Karp al-
gorithm to find the maximum flow for some sample flow networks and use the
class MaxFlowEdmondsKarp to check your answers.
/**
A class that solves the maximum-flow problem by the Edmonds-Karp
algorithm for a network whose capacities are specified by a square
array (presumed to hold non-negative values)
@author Andrew Ensor
*/
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
if (sinkFound)
{ // determine the path that was found from parentVertices
List<Integer> reversePath = new ArrayList<Integer>();
int currentVertex = sink;
while (currentVertex != source)
{ reversePath.add(currentVertex);
currentVertex = parentVertices.get(currentVertex);
}
reversePath.add(source);
// transfer the vertices in path to an int[] array
int numVertices = reversePath.size();
int[] path = new int[numVertices];
for (int i=0; i<numVertices; i++)
{ path[i] = reversePath.get(numVertices-1-i);
}
return path;
}
else
return null; // no path found
}
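Once findPath returns an augmenting path the flow is augmented along it. A sketch of that step, assuming the square array of residual capacities described above (the method and parameter names here are illustrative):
// (sketch) augment the flow along the path found by the breadth-first
// search, where residualCapacities[u][v] is a square array holding the
// current residual capacities (illustrative names)
private int augmentAlongPath(int[] path, int[][] residualCapacities)
{  // the flow value that can be sent is the least residual capacity
   int bottleneck = Integer.MAX_VALUE;
   for (int i = 0; i < path.length - 1; i++)
      bottleneck = Math.min(bottleneck,
         residualCapacities[path[i]][path[i+1]]);
   // sending the flow reduces each forward residual capacity and
   // increases each reverse residual capacity (flow can be cancelled)
   for (int i = 0; i < path.length - 1; i++)
   {  residualCapacities[path[i]][path[i+1]] -= bottleneck;
      residualCapacities[path[i+1]][path[i]] += bottleneck;
   }
   return bottleneck; // the value added to the flow
}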
The simplest algorithm for broadcast routing is the flooding algorithm. When
a router wants to send a packet to all the other routers on the network it simply
sends it to all its adjacent routers which in turn send the packet to all their
other adjacent routers. Of course most networks contain closed paths so the
flooding algorithm must be modified to avoid infinite loops where a packet is
forever transmitted around the path.
One strategy for this is to include a positive hop counter with each packet.
Every time a router receives a packet it decrements the hop counter and only
forwards the packet if the counter is still positive. If the diameter of the network
(the largest number of edges in a shortest path between any two vertices in the
graph) is known then starting the hop counter equal to the diameter ensures
that the packet will reach every router in the network.
An alternative strategy for the flooding algorithm is for the source router of
the packet to assign its own unique sequence number to the packet, and each
router to maintain a hash table of sequence numbers it has already received
from all the other routers in the network. When a router receives a packet it
checks the packet’s source and sequence number with its hash table and only
forwards the packet if it has not previously received the packet. To avoid the
potentially large space requirements for each hash table, often each router only
stores the most recent sequence number for each other router in the network
(presuming that if it receives a packet with a certain sequence number from the
source then it has probably already processed all the previous packets from that
source).
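A sketch of this second strategy in Java (the class and method names are illustrative, not part of the course's router classes): a packet is forwarded only when its sequence number is newer than the most recent one recorded for its source.
import java.util.HashMap;
import java.util.Map;
// Sketch of duplicate suppression for the flooding algorithm: a packet
// is forwarded only if its sequence number is newer than the most
// recent one seen from its source
public class FloodingFilter
{
   private final Map<String,Integer> latestSequence
      = new HashMap<String,Integer>();

   // returns true if the packet should be forwarded to adjacent routers
   public synchronized boolean shouldForward(String source, int sequence)
   {  Integer latest = latestSequence.get(source);
      if (latest != null && sequence <= latest)
         return false; // probably already processed this packet
      latestSequence.put(source, sequence);
      return true;
   }
}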
One advantage of the flooding algorithm is that it requires no setup costs for
the network (each router need only be aware of its adjacent routers), and so it
is suitable when the topology of the network (the configuration of routers and
connections) changes frequently. The flooding algorithm ensures that a packet
is sent to all the other routers in the shortest possible time but it does so at the
expense of the load on the network (involving many redundant packets being
transmitted).
For unicast routing the flooding algorithm results in too high a network load
so instead unicast routing algorithms make use of the topology of the network,
treating it as a weighted graph. Each weight represents the cost of sending a
packet through that edge, such as time delay in the transmission. A router
can estimate the time delay for an edge by sending an ECHO packet along the
connection to the adjacent router and timing the round-trip delay for the packet
to return.
The distance vector algorithm is an adaptation of the Bellman-Ford algo-
rithm from Page 126 for a distributed network. Each router stores a distance
vector (routing table), giving a shortest-path estimate d[v] for each router in
the network (identified by its IP address), and the adjacent edge leastEdge[v]
for that shortest-path estimate. Whenever a router receives a packet intended
for some destination router it checks the destination v with its current distance
vector and forwards the packet along the edge leastEdge[v] (the flooding algo-
rithm would instead forward the packet along every adjacent edge). Each router
x periodically sends its current distance vector to its adjacent routers. Each of
these routers compares that distance vector dx with its own and relaxes any of
its own shortest-path estimates d[v] if a path to v via the adjacent router would
give a shorter estimate than its own. Over a period of time each router builds
a more accurate distance vector for routing packets through the network.
Initialize-Distance-Vector()
1 determine the initial shortest path estimates from this vertex (router) u
2 for each known vertex (router) v do
3 if v ∈ adjacent[u] then
4 find estimated time w for packet along edge (u, v)
5 d[v] ← w
6 leastEdge[v] ← (u, v)
7 else no shortest path estimate yet for path to v
8 d[v] ← ∞
9 leastEdge[v] ← null
10 d[u] ← 0
11 leastEdge[u] ← null
Relax-Distance-Vector(dx )
1 relax distance vector d of this vertex u given distance vector dx of
2 adjacent vertex x
3 e ← (u, x)
4 for each known vertex (router) v do
5 if w[e] + dx [v] < d[v] then
6 d[v] ← w[e] + dx [v]
7 leastEdge[v] ← e
/**
A class that represents a router for performing unicast routing
across a network built from unicast routers
@author Andrew Ensor
*/
...
public class UnicastRouter implements Runnable
{
private int routerPort; // port on local machine used by server
private InetAddress routerAddress; // IP address on local machine
private ServerSocket serverSocket;
private boolean stopRequested;
private RouterDistanceVector distanceVector;
private QueueADT<UnicastPacket> packetQueue; // packets to process
private Thread processingThread; // handles the queue processing
private Map<InetAddress, RouterConnection> adjacentRouters; //synch
private RouterDisplay routerDisplay; // display for router output
while (!stopRequested)
{ try
{ // block until the next client requests a connection
// or the server timeout is reached
Socket socket = serverSocket.accept();
InetAddress newRouter = socket.getInetAddress();
displayText("Connection made with " + newRouter);
RouterConnection connection
= new RouterConnection(this, socket);
// add the connection to the adjacentRouter map
adjacentRouters.put(newRouter, connection);
displayText("Connection made with " + newRouter);
displayAdjacentRouters();
// note that newRouter is not added to distanceVector
// until UnicastConnectPacket received from newRouter
// since we don't yet know the distance from newRouter
Thread.sleep(50); // give other threads a chance
}
catch (SocketTimeoutException e)
{} // ignore the timeout and pass around while loop again
catch (InterruptedException e)
{} // ignore the interruption
catch (IOException e)
{ displayError("Can’t accept client connection: "+e);
stopRequested = true;
}
}
displayText("Router finishing");
try
{ serverSocket.close();
}
catch (IOException e)
{ displayError("Can’t close server: " + e);
}
}
/**
A class that represents a distance vector held by a unicast router
Note that the maps used in this class have been synchronized
and so are suitable for concurrent thread usage
@see UnicastRouter.java
*/
...
public class RouterDistanceVector implements Serializable
{
private InetAddress source;
private Map<InetAddress,Double> distanceVector; //shortest distance
private Map<InetAddress,InetAddress> pathVector;//shortest direction
For a large network the routing tables would become too large for each
individual router to store. Instead, the routers are arranged into domain
hierarchies, and routers only use routing tables for destinations in the same
domain. Packets destined for another domain are sent to some known router
that handles that domain. This compromise means that packets might not take
a shortest path across the network, although they still do within each domain.
The reverse path forwarding algorithm is an adaptation of the flooding al-
gorithm for multicast routing, that takes advantage of the routing tables built
for unicast routing. The router that is the source of a packet for multicasting
to a group of routers starts by sending the packet to all its adjacent routers.
When a router u receives a packet from a source via its adjacent router x it
checks its routing table to see whether x is on the shortest path from u to the
source. If this is the case then u forwards the packet on to the other adjacent
routers (except x). If however x is not on the shortest path then the packet is
not forwarded and instead u sends x a prune message telling it to stop sending
u multicast packets that originate from that source. In this way the multicast
packet floods the entire network along edges in the shortest path tree that is
rooted at the source.
The reverse path forwarding algorithm leads to some waste, particularly if
only a few routers have client machines that are part of the group. To reduce
this the algorithm includes a type of group prune message, specifically for each
multicast group and source of packets for that group. If a router has pruned all
but one of its adjacent routers for a given source (so it is a leaf in the shortest
path tree rooted at the source), and if it is not itself part of the group then it
can tell the remaining adjacent router to prune it for any multicast packets from
the source for that group. This might result in the adjacent router becoming
a leaf instead, so it too might in turn have itself pruned for such multicast
packets. Eventually, only those routers required along the shortest path to the
group are included in the multicasting. One complication is that some client
might eventually want to join the group, but its router is already pruned. To
accommodate this possibility, any group pruning is made to expire after a certain
time period, so any router that wants to remain pruned must periodically resend
the group prune message to its adjacent routers.
A more efficient algorithm for multicast routing is the centre-based trees
algorithm. In this algorithm each group is assigned one router z on the network
that acts as a rendezvous centre for the group. Any multicast packet destined
for the group is forwarded to the centre router, which then uses a shortest path
tree to forward the packet to all the members of the group in the network. This
requires that any router that is part of the shortest path tree (and which might
itself not even be part of the group) be aware that the packet has come from its
parent in the tree and so should be forwarded to the child routers in the tree,
rather than be forwarded again to the centre (otherwise the packet would be
forwarded in a loop indefinitely).
A Steiner tree for a collection of vertices in a weighted graph G = (V, E, w)
is a tree with minimum total weight that includes those vertices (possibly along
with some other vertices of the graph). As special cases, the Steiner tree for
two vertices is the tree that gives the shortest path between the two vertices,
and the Steiner tree for all the vertices V in the graph is a minimal spanning
tree. Unfortunately, apart from these special cases there is no known efficient
algorithm for finding the Steiner tree for a collection of vertices in a graph
(this problem belongs to a class of problems known as NP-hard, see pp966-1017
of the textbook).
Exercise 6.6 (Unicast Routing) Working in a team use the class Unicast-
Router to prepare a network of routers, and investigate the effect on routing
when a router's distance vector is relaxed or when the topology of the network
changes.
Chapter 7
Numerical Algorithms
7.1 Matrix Operations
Suppose two n × n matrices A and B (with n a power of 2) are each partitioned
into four (n/2) × (n/2) blocks, so that their product is:
( A1 A2 ) ( B1 B2 )   ( A1·B1 + A2·B3   A1·B2 + A2·B4 )
( A3 A4 ) ( B3 B4 ) = ( A3·B1 + A4·B3   A3·B2 + A4·B4 )
Thus the product of two n × n matrices can be broken down into four sums
of eight (n/2) × (n/2) products. The time required T(n) for this divide-and-conquer
technique is given by the recurrence T(n) = 8T(n/2) + f(n) where f(n) is
Θ((n/2)²) = Θ(n²) (the time required to perform the four (n/2) × (n/2) sums). So by
the first case of the Master Theorem with a = 8 and b = 2 one obtains that
T(n) is Θ(n³), asymptotically not an improvement over the straightforward
approach to matrix multiplication.
Strassen’s algorithm uses an (obscure) combination of seven products of the
matrices A1 , A2 , A3 , A4 , B1 , B2 , B3 , B4 to give a divide-and-conquer technique
that has lower complexity. First one uses seven (n/2) × (n/2) multiplications to form
the following matrices:
P1 = A1 (B2 − B4 )
P2 = (A1 + A2 ) B4
P3 = (A3 + A4 ) B1
P4 = A4 (B3 − B1 )
P5 = (A1 + A4 ) (B1 + B4 )
P6 = (A2 − A4 ) (B3 + B4 )
P7 = (A1 − A3 ) (B1 + B2 ) .
Then the final n × n product can be given by five further additions and three
subtractions:
( P5 + P4 − P2 + P6        P1 + P2           )
( P3 + P4                  P5 + P1 − P3 − P7 )
This gives the recurrence T(n) = 7T(n/2) + f(n) where f(n) is again Θ(n²). So
the first case of the Master Theorem gives that T(n) is Θ(n^(log2 7)) ≈ Θ(n^2.808).
In practice, if the matrices are sparse (with many zero entries) then there
are alternative algorithms that are preferable. If n is not large (below about 20)
then the extra overhead involved with Strassen's algorithm that is hidden in the
asymptotic complexity makes it slower than the straightforward multiplication
algorithm. For very large matrices Strassen’s algorithm might be used to recur-
sively reduce the problem to evaluating matrix multiplications that are smaller
than about 20 × 20, and then the straightforward algorithm used to directly
calculate the result. Actually, there are more sophisticated algorithms for
matrix multiplication which have even lower complexity; the current best bound
is about O(n^2.376). It is currently unknown whether there is an algorithm closer
to the lower bound Ω(n²).
If A is an invertible n × n matrix then its LUP-decomposition consists of
three n × n matrices L, U , P for which P A = LU , where P is a permutation
matrix obtained from the n×n identity matrix In by interchanging various rows
(so all but n entries of P are zero and it has a 1 in each row and each column), L
is a lower triangular matrix where all entries above the diagonal are zero (each
entry lij = 0 for i < j), and U is an upper triangular matrix where all entries
below the diagonal are zero (each entry uij = 0 for i > j).
An LUP-decomposition for an invertible matrix A can be found by using
Gaussian elimination to row reduce A to row echelon form (which is the matrix
U ). Operations that interchange two rows are kept track of by applying them
to the identity matrix In , which will become the matrix P . The matrix L can
be found by keeping track of the inverse of each operation, starting with the
zero matrix, and when the process is completed In is added to give the matrix L.
One starts with the augmented matrix (0n |A|In ), and applies Gaussian elimi-
nation to the matrix A noting the inverse of every operation in the matrix on
the left and just the row interchanges in the matrix on the right.
( 0 0 0 |  4 −5  6 | 1 0 0 )
( 0 0 0 |  8 −6  7 | 0 1 0 )
( 0 0 0 | 12 −7 12 | 0 0 1 )
First, since |12| is the largest entry in the first column, row 1 is interchanged
with row 3, and then 2/3 and 1/3 times the new row 1 are subtracted from rows
2 and 3, noting the inverse multipliers in the matrix on the left and the
interchange in the matrix on the right. Next, since |−8/3| > |−4/3| row 2 is
interchanged with row 3 to minimize numerical inaccuracies:
(   0   0 0 | 12   −7  12 | 0 0 1 )
( 1/3   0 0 |  0 −8/3   2 | 1 0 0 )
( 2/3   0 0 |  0 −4/3  −1 | 0 1 0 )
Then the entry −8/3 is used to put a zero in the row below it by adding −1/2
times row 2 to row 3:
(   0   0   0 | 12   −7  12 | 0 0 1 )
( 1/3   0   0 |  0 −8/3   2 | 1 0 0 )
( 2/3 1/2   0 |  0    0  −2 | 0 1 0 )
Now that the matrix A is an upper triangular matrix Gaussian elimination can
be stopped and the identity In added to the matrix on the left:
(   1   0   0 | 12   −7  12 | 0 0 1 )
( 1/3   1   0 |  0 −8/3   2 | 1 0 0 )
( 2/3 1/2   1 |  0    0  −2 | 0 1 0 )
LUP-Decomposition(A)
1 n ← rows[A] A is matrix with rows 0, . . . , n − 1 to decompose
2 for i ← 0 to n − 1 do
3 π[i] ← i permutation matrix π is initially identity matrix
4 find row k of the decomposition π, lower, upper triangular matrices
5 for k ← 0 to n − 1 do
6 find the largest entry from akk , . . . , a(n−1)k
7 p ← |akk |
8 k′ ← k
9 for i ← k + 1 to n − 1 do
10 if |aik | > p then
11 p ← |aik |
12 k′ ← i
13 if p = 0 then only zero entries down the column
14 throw exception as matrix is not invertible
15 swap row k with row k′
16 swap π[k] with π[k′] modify permutation matrix for row swap
17 for j ← 0 to n − 1 do
18 swap akj with ak′j
19 for i ← k + 1 to n − 1 do
20 subtract aik /akk times row k from row i
21 aik ← aik /akk apply inverse operation for lower
22 for j ← k + 1 to n − 1 do
23 aij ← aij − aik akj apply operation to upper
24 return π, lower triangle of A with 1 on diagonal, upper triangle of A
The resulting upper triangular system is also quickly solved using back
substitution, as −2x3 = −5/2 gives that x3 = 5/4, then −8/3x2 + 2x3 = 1/3 gives
that x2 = 13/16, and so 12x1 − 7x2 + 12x3 = 5 gives that x1 = −23/64. Thus once
the LUP decomposition is known for an invertible matrix A a system AX = B
can be solved in Θ(n²).
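That Θ(n²) solve step might be coded as the following sketch, assuming the decomposition is held with L below the diagonal (its diagonal of 1s implicit) and U on and above the diagonal, as returned by LUP-Decomposition (the names here are illustrative):
// (sketch) solve AX = B in Theta(n^2) from an LUP-decomposition, where
// lu holds L below the diagonal and U on and above it, and pi[i] gives
// the row permutation
public static double[] lupSolve(double[][] lu, int[] pi, double[] b)
{  int n = lu.length;
   double[] y = new double[n];
   for (int i = 0; i < n; i++) // forward substitution solves LY = PB
   {  y[i] = b[pi[i]];
      for (int j = 0; j < i; j++)
         y[i] -= lu[i][j] * y[j];
   }
   double[] x = new double[n];
   for (int i = n - 1; i >= 0; i--) // back substitution solves UX = Y
   {  x[i] = y[i];
      for (int j = i + 1; j < n; j++)
         x[i] -= lu[i][j] * x[j];
      x[i] /= lu[i][i];
   }
   return x;
}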
7.2 Fast Fourier Transforms
Given an n-tuple of coefficients (a0, a1, a2, . . . , an−1) with n a power of 2,
consider the polynomial p(x) = a0 + a1x + · · · + an−1x^(n−1), and let peven and
podd denote the polynomials formed from the even-index and odd-index
coefficients of p, so that p(x) = peven(x²) + x · podd(x²). So if
ω = cos(2π/n) − i sin(2π/n) then the Discrete Fourier Transform of the n-tuple
(a0, a1, a2, . . . , an−1) is (y0, y1, y2, . . . , yn−1) where:

yk = p(ω^k) = peven(ω^2k) + ω^k · podd(ω^2k)

for 0 ≤ k < n. But note that peven and podd are both polynomials of degree
at most n/2 − 1 and ω² is actually an n/2-th root of unity, so for 0 ≤ k <
n/2 the values peven(ω^2k) and podd(ω^2k) are just entry k of the DFT of
(a0, a2, . . . , an−2) and of (a1, a3, . . . , an−1) respectively. For n/2 ≤ k < n note
that ω^2k = ω^2(k−n/2), since ω^n = 1, and ω^k = ω^((k−n/2)+n/2) = −ω^(k−n/2), since
ω^(n/2) = −1. Thus all n entries of the DFT of (a0, a1, a2, . . . , an−1) can be found
in a loop from k = 0 to k = n/2 − 1 where:

yk = peven(ω^2k) + ω^k · podd(ω^2k)
yk+n/2 = peven(ω^2k) − ω^k · podd(ω^2k).
Recursive-FFT(a)
1 n ← length[a] a = (a0 , a1 , a2 , a3 , . . . , an−1 ) and n is a power of 2
2 if n = 1 then
3 return a base case a = (a0 )
4 ω ← cos(2π/n) − i sin(2π/n) n-th root of unity
5 aeven ← (a0 , a2 , . . . , an−2 )
6 aodd ← (a1 , a3 , . . . , an−1 )
7 yeven ← Recursive-FFT(aeven )
8 yodd ← Recursive-FFT(aodd )
9 x←1 point x = ω k is a complex root of unity
10 for k ← 0 to n/2 − 1 do
11 y[k] ← yeven [k] + x · yodd [k]
12 y[k + n/2] ← yeven [k] − x · yodd [k]
13 x←x·ω
14 return y y = (y[0], y[1], y[2], . . . , y[n − 1])
Each iteration of the loop in lines 10–13 of Recursive-FFT performs what is
known as a butterfly operation (where x = ω^k):
1 for k ← 0 to n/2 − 1 do
2 t ← x · yodd [k]
3 u ← yeven [k]
4 y[k] ← u + t
5 y[k + n/2] ← u − t
6 x ← x · ω
Next, note that the allocation of the arrays aeven and aodd can be avoided
if the elements of the initial array (n-tuple) a can be suitably rearranged so
that the algorithm can operate with portions of a single array. For example,
if n = 2³ = 8 and a = (a0, a1, a2, a3, a4, a5, a6, a7) then it would be more
convenient to have a arranged as (a0, a4, a2, a6, a1, a5, a3, a7) so that the DFT
is applied bottom-up.
[Figure: the recursion tree splitting (a0, a1, a2, a3, a4, a5, a6, a7) into the even-index half (a0, a2, a4, a6) and odd-index half (a1, a3, a5, a7), and so on down to singletons, whose leaves appear in bit-reversed order.]
Bit-Reverse-Copy(a)
1 rearrange array a to give array y by reversing bits of each index
2 n ← length[a] n is presumed to be a power of 2
3 for k ← 0 to n − 1 do
4 k′ ← k
5 r←0 r is the bit reversal of the index k
6 for j ← 0 to log2 n − 1 do
7 b ← k 0 &1 find bit j from right of n
8 r ← (r << 1) + b shift bits left and add bit b
9 k 0 ← k 0 >>> 1 shift bits right
10 y[r] ← a[k]
11 return y
Iterative-FFT(a)
1 n ← length[a] a = (a0 , a1 , a2 , a3 , . . . , an−1 ) and n is a power of 2
2 y ← Bit-Reverse-Copy(a)
3 use butterfly operations on y to find the DFT of a
4 for s ← 1 to log2 n do
5 m ← 2^s apply butterfly operations to m-tuples
6 ω ← cos(2π/m) − i sin(2π/m) m-th root of unity
7 for k ← 0 to n − 1 by m do increment k in steps of m
8 x←1 point x = ω k is a complex root of unity
9 for j ← 0 to m/2 − 1 do
10 perform butterfly operation in-place at k + j
11 t ← x · y[k + j + m/2]
12 u ← y[k + j]
13 y[k + j] ← u + t
14 y[k + j + m/2] ← u − t
15 x←x·ω
16 return y y = (y[0], y[1], y[2], . . . , y[n − 1])
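Java has no built-in complex number type, so one way to code Iterative-FFT is with parallel arrays for the real and imaginary parts. The following minimal sketch (an illustrative translation of the pseudocode, not a course listing) assumes the input has already been rearranged by Bit-Reverse-Copy:
// (sketch) Iterative-FFT with parallel arrays for real and imaginary
// parts; n = re.length must be a power of 2, and the input is assumed
// already in bit-reversed order
public static void iterativeFFT(double[] re, double[] im)
{  int n = re.length;
   for (int m = 2; m <= n; m *= 2) // butterfly operations on m-tuples
   {  // omega = cos(2*pi/m) - i*sin(2*pi/m), an m-th root of unity
      double wRe = Math.cos(2*Math.PI/m), wIm = -Math.sin(2*Math.PI/m);
      for (int k = 0; k < n; k += m)
      {  double xRe = 1.0, xIm = 0.0; // x steps through powers of omega
         for (int j = 0; j < m/2; j++)
         {  // butterfly operation in-place at k+j and k+j+m/2
            double tRe = xRe*re[k+j+m/2] - xIm*im[k+j+m/2];
            double tIm = xRe*im[k+j+m/2] + xIm*re[k+j+m/2];
            double uRe = re[k+j], uIm = im[k+j];
            re[k+j] = uRe + tRe;  im[k+j] = uIm + tIm;
            re[k+j+m/2] = uRe - tRe;  im[k+j+m/2] = uIm - tIm;
            double nextRe = xRe*wRe - xIm*wIm; // x <- x * omega
            xIm = xRe*wIm + xIm*wRe;
            xRe = nextRe;
         }
      }
   }
}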
Appendix A
Advanced Analysis
Techniques
Using the probability distribution for the coin tossing example (a fair coin
tossed three times, giving a sample space of eight equally likely outcomes), the
conditional probability that at least two heads are obtained given that at least one
event that at least two heads are obtained, and B denote the (non-exclusive)
event that at least one head is obtained. Then:
Pr(B) = 1 − Pr(no heads are obtained) = 1 − Pr({TTT}) = 7/8,
and Pr(A ∩ B) = Pr(A) = 4/8 since A ⊆ B. So Pr(A|B) = (4/8)/(7/8) = 4/7.
Two events A and B are said to be independent if Pr(A ∩ B) = Pr(A) Pr(B),
or equivalently, if Pr(A|B) = Pr(A) (so long as Pr(B) > 0). For example, if A
denotes the event that the first toss is heads and if B denotes the event that
the third toss is tails, then A ∩ B = {HHT, HTT} so Pr(A ∩ B) = 2/8. Since
Pr(A) = 4/8 and Pr(B) = 4/8, and Pr(A ∩ B) = Pr(A) Pr(B), the events A and B
are independent.
A random variable X is a function from a sample space S to real numbers,
associating a real number with each possible outcome of an experiment. If a
sample space S is discrete then the expected value E(X) of a random variable
is defined by:

E(X) = Σ_x x Pr(X = x)
For example, the experiment of rolling two dice has a sample space with 36
elementary events, and if the dice are unbiased then the uniform probability
distribution would be used where each elementary event is assigned equal
probability 1/36. If X denotes the random variable whose value is the number
showing on the first die then its expected value is:

E(X) = 1 · 1/6 + 2 · 1/6 + 3 · 1/6 + 4 · 1/6 + 5 · 1/6 + 6 · 1/6 = 3.5
An indicator random variable is a random variable that only has values 0
and 1. In particular, if A is some event and X is the indicator random variable
defined by X(a) = 1 if a ∈ A and X(a) = 0 if a ∉ A then E(X) = Pr(A).
Indicator random variables provide a convenient way of handling probabilities.
It is not difficult to show that the expected value of the sum of two random
variables X and Y is the sum of the expected value of each, that is E(X + Y ) =
E(X) + E(Y ). If in the dice experiment Y denotes the random variable whose
value is the number showing on the second die then E(Y ) = 3.5, and X + Y is
the random variable representing the sum of the dice. Hence the expected value
of the sum is E(X + Y ) = E(X) + E(Y ) = 7.
Two random variables X and Y are said to be independent if for all real
values x and y the event X = x and the event Y = y are independent, that is,
Pr(X = x and Y = y) = Pr(X = x) Pr(Y = y). If X and Y are independent
then one can verify that E(XY ) = E(X)E(Y ). Hence in the dice experiment,
the expected value of the product of the values showing on the two dice is
E(XY ) = 3.5 · 3.5 = 12.25.
Find-Index-Maximum(A)
1 maxIndex ← 0
2 for i ← 1 to length[A] − 1 do
3 if A[i] > A[maxIndex ] then
4 maxIndex ← i
5 return maxIndex
For each i let Xi denote the indicator random variable for the event that A[i]
is greater than all of A[0], A[1], . . . , A[i−1] (so that the assignment in line 4 is
performed during iteration i), and let X = X0 + X1 + · · · + Xn−1, so that for
input A the assignment statement is performed X(A) times. If one assumes that
the elements in the input array are in random order then E(Xi) = 1/(i+1) since
there would be a probability of 1/(i+1) that ai is greater than all of
a0, a1, . . . , ai−1. The expected number of assignment statements E(X) can then
be found by:

E(X) = E(Σ_{i=0}^{n−1} Xi) = Σ_{i=0}^{n−1} E(Xi) = Σ_{i=0}^{n−1} 1/(i+1),

which is a quantity bounded below by loge(n+1) and above by loge n + 1. Hence
E(X) ≈ loge n.
A randomized algorithm is an algorithm whose behaviour is determined not
only by its inputs but also by the values produced by a random number gener-
ator. Randomness might be used to alter the behaviour of an algorithm each
time it is run, or to ensure the expected case behaviour of an algorithm by
randomizing the order of its inputs (some algorithms such as Insertion-Sort
perform very poorly if their inputs have a certain order; such behaviour can
be avoided if the order is randomized). For example, consider the simple al-
gorithm Randomize-In-Place which can be used to randomize the order in
which elements appear in the input array A.
Randomize-In-Place(A)
1 n ← length[A]
2 for i ← 0 to n − 1 do
3 j ← Random(n − i) + i random number between i and n-1
4 swap A[i] with A[j]
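In Java the algorithm is only a few lines; a minimal generic sketch (assuming a java.util.Random source of randomness):
import java.util.Random;
// (sketch) Randomize-In-Place: each iteration swaps A[i] with an
// element chosen uniformly from A[i..n-1]
public static <E> void randomizeInPlace(E[] a, Random random)
{  int n = a.length;
   for (int i = 0; i < n; i++)
   {  int j = i + random.nextInt(n - i); // random index between i and n-1
      E temp = a[i];  a[i] = a[j];  a[j] = temp;
   }
}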
The claim that this Θ(n) algorithm actually does randomize the order of the
elements in an array A of n elements means that every permutation of the n
elements is equally likely. Since there are n! = 1 · 2 · 3 · . . . · n permutations of
the elements each permutation should have probability 1/n! if the algorithm truly
randomizes the order. To see that this is the case consider the following loop
invariant:
at the start of iteration i of the for loop, for each of the n!/(n − i)!
possible permutations of i elements of A, the probability that the
subarray A[0 . . i − 1] holds that permutation is (n − i)!/n!.
First, the loop starts with i = 0 and so the subarray A[0 . . i − 1] consisting of
no elements of A is a trivial permutation with probability 1 (since the trivial
permutation is the only permutation of no elements). Next, suppose at the
start of iteration i every possible permutation of i elements has equal probability
(n−i)!/n! of appearing in A[0 . . i−1]. Iteration i of the for loop swaps A[i] with
one element randomly chosen from A[i], A[i + 1], . . . , A[n − 1], each having equal
probability 1/(n − i) of being chosen. Consider any permutation ⟨a0, a1, . . . , ai⟩ of
i+1 elements of A, and let E1 denote the event that iterations 0, 1, . . . , i−1 have
created the permutation ⟨a0, a1, . . . , ai−1⟩, which is supposed to have probability
(n−i)!/n!. Let E2 denote the event that iteration i swaps ai into position i. The
probability that iteration i results in the permutation ⟨a0, a1, . . . , ai⟩ is given
by:
Pr(E1 ∩ E2) = Pr(E1) Pr(E2|E1) = (n − i)!/n! · 1/(n − i) = (n − (i + 1))!/n!.
Hence at the start of iteration i+1 the loop invariant is valid. Upon termination
of the loop i = n and so the loop invariant gives that each of the n! permutations
of the n elements of A has equal probability 1/n! of appearing in A[0 . . n − 1].
As a final example of probabilistic analysis consider the selection problem.
This problem takes as input an array or list of n elements and an integer i with
0 ≤ i < n, and has as output the element that is the i + 1-th smallest element
(the element that is larger than i other elements). The Randomized-Select
algorithm for solving the selection problem makes use of a Θ(n) Randomized-
Partition algorithm which picks an element at random as the partition element
and partitions the elements (in-place). All the elements smaller than the parti-
tion are placed to the left and all elements greater are placed to the right, with
the partition element between, and the eventual index of the partition element
is returned. The Randomized-Partition is a randomized algorithm to ensure
that the left and right sides of the partition can be expected to be of equal sizes
regardless of the initial ordering of A.
Randomized-Partition(A, p, r)
1 choose an element at random as partition and place it at index p
2 swap A[p] with A[Random(r − p) + p]
3 swap elements so elements on left are smaller than partition
4 and elements on right are larger than partition
5 leftIndex ← p + 1
6 rightIndex ← r − 1
7 while leftIndex < rightIndex do
8 find element starting from left that is greater than partition
9 while A[leftIndex ] ≤ A[p] and leftIndex < rightIndex do
10 leftIndex ← leftIndex +1
11 find element starting from right that is less than partition
12 while A[rightIndex ] > A[p] do
13 rightIndex ← rightIndex −1
14 if leftIndex < rightIndex then
15 swap A[leftIndex ] with A[rightIndex ]
16 if A[rightIndex ] > A[p] then can only occur when the subarray has two elements
17 rightIndex ← rightIndex −1
18 place partition element between the left and right sides of partition
19 swap A[p] with A[rightIndex ]
20 return rightIndex
Randomized-Select(A, p, r, i)
1 if p + 1 ≥ r then only one element
2 return A[p]
3 else
4 pick an element at random and partition A[p . . r − 1] using it
5 indexPartition ← Randomized-Partition(A, p, r)
6 if indexPartition = i then use partition element
7 return A[indexPartition]
8 else if i < indexPartition then check left side of partition
9 return Randomized-Select(A, p, indexPartition, i)
10 else check right side of partition
11 return Randomized-Select(A, indexPartition +1, r, i)
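As a standalone illustration the selection algorithm might be coded as follows; this sketch uses the simpler Lomuto partitioning scheme in place of the two-index scheme above, so it is a variation rather than a direct transcription:
import java.util.Random;
// (sketch) Randomized-Select for int arrays, using Lomuto partitioning
public class RandomizedSelectSketch
{
   private static final Random RANDOM = new Random();

   // returns the element of a[p..r-1] in final sorted position i
   public static int select(int[] a, int p, int r, int i)
   {  if (p + 1 >= r)
         return a[p]; // only one element
      int indexPartition = randomizedPartition(a, p, r);
      if (indexPartition == i)
         return a[indexPartition]; // use partition element
      else if (i < indexPartition)
         return select(a, p, indexPartition, i); // left side
      else
         return select(a, indexPartition + 1, r, i); // right side
   }

   private static int randomizedPartition(int[] a, int p, int r)
   {  swap(a, p + RANDOM.nextInt(r - p), r - 1); // random partition element
      int store = p; // a[p..store-1] holds elements <= partition
      for (int j = p; j < r - 1; j++)
      {  if (a[j] <= a[r-1])
            swap(a, j, store++);
      }
      swap(a, store, r - 1); // place partition between the two sides
      return store;
   }

   private static void swap(int[] a, int i, int j)
   {  int temp = a[i];  a[i] = a[j];  a[j] = temp;
   }
}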
Consider a call to the algorithm on n elements, and call a partition good if each
side of the partition holds at least a quarter of the elements; since the partition
element is chosen at random, each call to Randomized-Partition gives a good
partition with probability about 1/2. Let X denote the number of partitions
made until a good partition is obtained. Note that:

E(X) = 1 · 1/2 + 2 · 1/4 + 3 · 1/8 + 4 · 1/16 + · · · = 2.

Each call to Randomized-Partition takes time bn where b > 0 is some con-
stant. Suppose when a good partition is finally made that the index i falls on
the larger side of the partition, which in the worst case would have up to 3n/4
elements. Then T(n) is at worst the time required to make X partitions until a
good partition is made, plus the time required to perform the algorithm on the
side with 3n/4 elements, so T(n) = bnX + T(3n/4). Hence:

E(T(n)) = bn · E(X) + E(T(3n/4)).
Increment(A)
1 i←0
2 while i < length[A] and A[i] = 1 do
3 A[i] ← 0
4 i←i+1
5 if i < length[A] then
6 A[i] ← 1
7 else
8 overflow exception
A simple amortized analysis can be used to determine the average time for the
increment operation on the array data structure by averaging the total time
required to perform the operation n times over.
The aggregate method finds the worst-case running time T (n) for a sequence
of n operations, and the amortized cost per operation is then taken to be T (n)/n,
regardless of whether there were several different types of operations in the
sequence. When the sequence of operations is not specified it is usually presumed
to be a worst-case sequence of operations on a newly created data structure.
An aggregate method analysis of the increment operation considers the total
time to perform the operation n consecutive times, which can be measured by
the total number of bits that get flipped. Note that bit 0 flips every time the
increment operation is performed, whereas bit 1 flips every second time. In
general, bit i flips every 2^i times the increment operation is performed, so the
total number of flips for n consecutive increment operations is

Σ_{i=0}^{k−1} ⌊n/2^i⌋ < 2n,

where k is the number of bits in the array.
Counter A[7] A[6] A[5] A[4] A[3] A[2] A[1] A[0] Cost Total
0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 1 1 1
2 0 0 0 0 0 0 1 0 2 3
3 0 0 0 0 0 0 1 1 1 4
4 0 0 0 0 0 1 0 0 3 7
5 0 0 0 0 0 1 0 1 1 8
6 0 0 0 0 0 1 1 0 2 10
7 0 0 0 0 0 1 1 1 1 11
8 0 0 0 0 1 0 0 0 4 15
9 0 0 0 0 1 0 0 1 1 16
10 0 0 0 0 1 0 1 0 2 18
11 0 0 0 0 1 0 1 1 1 19
12 0 0 0 0 1 1 0 0 3 22
13 0 0 0 0 1 1 0 1 1 23
14 0 0 0 0 1 1 1 0 2 25
15 0 0 0 0 1 1 1 1 1 26
16 0 0 0 1 0 0 0 0 5 31
Hence the average cost of each increment is less than 2n/n = 2, so the increment
operation requires constant time on average (and each increment flips fewer
than 2 bits on average).
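The bound is also easy to check empirically; the following small sketch (illustrative) performs n increment operations on a 32-bit counter and reports the average number of bit flips, which approaches 2:
// (sketch) count the total number of bit flips over n increment
// operations on a k-bit binary counter, checking the 2n bound
public class BinaryCounterSketch
{
   public static void main(String[] args)
   {  int k = 32, n = 1000000;
      int[] a = new int[k];
      long totalFlips = 0;
      for (int count = 0; count < n; count++)
      {  int i = 0;
         while (i < k && a[i] == 1)
         {  a[i] = 0; // flipping a 1 bit back to 0
            i++;
            totalFlips++;
         }
         if (i < k)
         {  a[i] = 1; // flipping a 0 bit to 1
            totalFlips++;
         }
      }
      // prints a value just below 2.0 for large n
      System.out.println("average flips per increment: "
         + (double)totalFlips / n);
   }
}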
The accounting method uses a scheme of credits and debits to keep track of
the running time of a sequence of operations, assigning an amortized charge to
each operation. Interestingly, these charges are permitted to differ from their
real time cost, to allow for the fact that one type of operation might make more
170 APPENDIX A. ADVANCED ANALYSIS TECHNIQUES
work later for another operation and so should be charged more. So long as
the total real cost of the sequence of operations is bounded by the total of the
amortized costs, the amortized costs can be used instead of the real costs to
analyze an algorithm, removing the hassle of variability in the real costs from
the analysis. Conceptually the data structure functions like a bank holding a
balance of credit. Whenever a change is made its amortized cost is added to the
data structure’s credit (which should never drop into debt) and the real cost of
the operation is deducted from that balance.
Consider again the increment operation of the binary counter. The real cost
of the increment operation is variable due to the while loop that can iterate
a different number of times depending on the state of the data structure, and
is given by the total number of bits flipped. An amortization scheme for this
example could consider flipping a bit as a real cost of one dollar (a dollar in the
sense of some time cost unit). Whenever the increment operation sets a bit to 1
extra work needs to be done by a later increment operation to put it back to 0,
since the Increment algorithm only has extra work to do when it encounters
1 bits. To compensate for this each increment operation could be charged a
fixed amortized cost of two dollars. This is done to ensure that there will be
sufficient credit accumulated in the data structure to pay for future increment
operations that might have a higher real cost than this amortized cost.
Counter A[7] A[6] A[5] A[4] A[3] A[2] A[1] A[0] Cost Balance
0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 1 1 1
2 0 0 0 0 0 0 1 0 2 1
3 0 0 0 0 0 0 1 1 1 2
4 0 0 0 0 0 1 0 0 3 1
5 0 0 0 0 0 1 0 1 1 2
6 0 0 0 0 0 1 1 0 2 2
7 0 0 0 0 0 1 1 1 1 3
8 0 0 0 0 1 0 0 0 4 1
9 0 0 0 0 1 0 0 1 1 2
10 0 0 0 0 1 0 1 0 2 2
11 0 0 0 0 1 0 1 1 1 3
12 0 0 0 0 1 1 0 0 3 2
13 0 0 0 0 1 1 0 1 1 3
14 0 0 0 0 1 1 1 0 2 3
15 0 0 0 0 1 1 1 1 1 4
16 0 0 0 1 0 0 0 0 5 1
One way to see that this is the case (and that the data structure won't become
bankrupt) is to consider that for the two dollars in total paid by each increment
operation, one dollar is paid for actually flipping the least significant bit. If the
least-significant bit was 0 then the other dollar is credited to the data structure
(and imagined placed with the new 1 bit). If instead the least-significant bit
was 1 then the data structure already has a one dollar credit, which combined
with the other dollar gives the data structure two dollars to pay for flipping
the next bit to the left. This can be visualized by considering each 1 bit in the
data structure as corresponding to a credit of one dollar that the data structure
currently has. Thus a sequence of n consecutive increment operations would
have an amortized cost of 2n (the real cost might be slightly less), so each
increment operation takes constant time on average.
For comparison, suppose a stack is implemented using an array whose capacity
is doubled each time it would be exceeded. A sequence of n push operations
then has total cost at worst T(n) = n + (1 + 2 + 4 + · · · ) < n + 2n = 3n,
where the first term (n) is due to the storing of the pushed element in the array
and the other terms account for the (approximately log2 n) times the capacity is
expanded. The aggregate method thus shows that the total cost of any sequence
of n operations is at worst 3n, so the amortized cost of any single operation is
3 (constant time on average).
Alternatively, the accounting method could be used where the amortized cost
of the pop operation is taken to be 1 dollar (and real cost is 1), and the amortized
cost of the push operation is taken as 3 dollars (which has variable real cost).
To justify this choice and that the stack won’t fall into debt, note that for each
push operation 1 is spent in the basic push, 1 is placed as credit for the pushed
element to cover the cost of copying it to a larger array when the capacity is
next exceeded, and the remaining 1 is placed as credit for an element in the first
half of the array as credit for the cost of copying it (since the credit allocated
to it when it was first added was used up in an earlier expansion). Using the
amortized costs the occasional expansion of capacity can now be ignored in the
analysis since its real cost is pre-paid by the push method. Hence the worst case
amortized cost for a sequence of n push operations is 3n, so each has average 3
(constant time). Another valid scheme for the accounting method would be to
assign 4 dollars to the push operation and 0 to the pop operation, so that the
push operation also pays in advance for any eventual pop of that element. One
suitable potential function for applying the potential method would be to take
the potential of a stack holding n elements and capacity L to be Φ = 0 if n < L/2
and Φ = 2n − L if n ≥ L/2.
It is instructive to compare this analysis with the analysis of an array im-
plementation of a stack that increases the capacity by a fixed amount K each
time (rather than doubling). Repeating the analysis with the aggregate method
gives a total running time of:
T(n) = n + (K + 2K + 3K + 4K + · · · + (n − K) + n) = n + n(n/K + 1)/2,
which is quadratic in n, so each push method would require linear time on
average. For this reason it is much smarter to double the capacity each time
rather than increase it by a fixed amount.