10 DP

Spell Checking Problem
Given a string "exponen" that is not in the dictionary, how should a spell checker suggest a nearby string?
What does nearness mean?
Question: Given two strings $x_1 x_2 \ldots x_n$ and $y_1 y_2 \ldots y_m$, what is a distance between them?
Edit Distance: the minimum number of edits needed to transform x into y.

Edit Distance
Edit Distance: the minimum number of edits needed to transform x into y.
Edit operations:
delete a letter
add a letter
substitute a letter with another letter
Why is substitute not delete plus add?
In general, different edits can have different costs. Treating substitution as an edit of its own lets it count as a single operation as far as distance is concerned.
Edit Distance Problem
Input: Two strings $x = x_1 x_2 \ldots x_n$ and $y = y_1 y_2 \ldots y_m$ over some fixed alphabet
Goal: Find the edit distance between x and y: the minimum number of edits needed to transform x into y
Note: EditDist(x, y) = EditDist(y, x)

Recursive Solution
Letters of x are mapped to letters of y (but some are not).

Case 1: $x_i$ is mapped to $y_j$. Then $x_1 x_2 \ldots x_{i-1}$ is mapped to $y_1 y_2 \ldots y_{j-1}$.
Case 2a: $x_i$ is deleted, and $y_j$ is to the left of the deleted letter. Then $x_1 x_2 \ldots x_{i-1}$ is mapped to $y_1 y_2 \ldots y_j$.
Case 2b: $y_j$ is inserted, and $x_i$ is to the left of the inserted letter. Then $x_1 x_2 \ldots x_i$ is mapped to $y_1 y_2 \ldots y_{j-1}$.
Subproblems involve the edit distance between prefixes of the two strings.
Find the edit distance between the prefix x[1..i] of x and the prefix y[1..j] of y.
EditDist(x, y) is the distance between x[1..n] and y[1..m].
Recursive Solution
E[i, j]: edit distance between x[1..i] and y[1..j]
Case 1: $x_i$ is mapped to $y_j$.
$E[i, j] = \mathrm{diff}(x_i, y_j) + E[i-1, j-1]$
where $\mathrm{diff}(x_i, y_j) = 0$ if $x_i = y_j$, and $\mathrm{diff}(x_i, y_j) = 1$ otherwise

Case 2a: $x_i$ is deleted, and $y_j$ is to the left of the deleted letter.
$E[i, j] = 1 + E[i-1, j]$
Case 2b: $y_j$ is inserted, and $x_i$ is to the left of the inserted letter.
$E[i, j] = 1 + E[i, j-1]$
Recursive Solution
$E[i, j] = \min\{\mathrm{diff}(x_i, y_j) + E[i-1, j-1],\; 1 + E[i-1, j],\; 1 + E[i, j-1]\}$
Base cases:
$E[i, 0] = i$ for $i \ge 0$
$E[0, j] = j$ for $j \ge 0$
How many subproblems? O(mn)
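The recurrence and base cases above can be written almost verbatim as a memoized recursion. The following is a minimal sketch, not a tuned implementation: the function name `edit_distance_recursive` is an assumption made for illustration, Python strings are 0-indexed so $x_i$ becomes `x[i - 1]`, and the recursion depth limits it to short strings.

```python
from functools import lru_cache


def edit_distance_recursive(x: str, y: str) -> int:
    """Edit distance via the recurrence for E[i, j], memoized on (i, j)."""

    @lru_cache(maxsize=None)
    def E(i: int, j: int) -> int:
        # Base cases: one of the prefixes is empty.
        if i == 0:
            return j
        if j == 0:
            return i
        # diff(x_i, y_j): 0 if the letters match, 1 otherwise.
        diff = 0 if x[i - 1] == y[j - 1] else 1
        return min(
            diff + E(i - 1, j - 1),  # Case 1: x_i mapped to y_j
            1 + E(i - 1, j),         # Case 2a: x_i deleted
            1 + E(i, j - 1),         # Case 2b: y_j inserted
        )

    return E(len(x), len(y))
```

Memoization is what keeps the work proportional to the O(mn) distinct subproblems rather than to the number of recursive calls.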
Iterative Solution
What is the table? E is a two-dimensional array of size $(n + 1) \times (m + 1)$.

How do we order the computation?
To compute E[i, j] we need to have already computed E[i-1, j-1], E[i-1, j], and E[i, j-1].
for i = 0 to n do
    E[i, 0] = i
for j = 0 to m do
    E[0, j] = j
for i = 1 to n do
    for j = 1 to m do
        E[i, j] = min{diff(x_i, y_j) + E[i-1, j-1], 1 + E[i-1, j], 1 + E[i, j-1]}
Running Time: O(nm)

Space: O(nm)
Space can be reduced to O(n + m) if only the distance is needed (but it is then not obvious how to recover the actual edits).
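One way to see the space saving is that row i of the table depends only on row i-1. The sketch below keeps just two rows. The function name `edit_distance_two_rows` is an assumption for illustration, and the extra storage is O(m) on top of the two input strings.

```python
def edit_distance_two_rows(x: str, y: str) -> int:
    """Edit distance keeping only the previous and current table rows."""
    n, m = len(x), len(y)
    prev = list(range(m + 1))  # row 0: E[0, j] = j
    for i in range(1, n + 1):
        curr = [i] + [0] * m   # E[i, 0] = i
        for j in range(1, m + 1):
            diff = 0 if x[i - 1] == y[j - 1] else 1
            curr[j] = min(
                diff + prev[j - 1],  # E[i-1, j-1]
                1 + prev[j],         # E[i-1, j]
                1 + curr[j - 1],     # E[i, j-1]
            )
        prev = curr
    return prev[m]
```

Because earlier rows are discarded, this version returns only the distance and cannot trace back which edits were used.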
Where is the DAG?
One node for each (i, j), $0 \le i \le n$, $0 \le j \le m$.

Edges into node (i, j): from (i-1, j-1) of cost $\mathrm{diff}(x_i, y_j)$, from (i-1, j) of cost 1, and from (i, j-1) of cost 1.
Find a shortest path from (0, 0) to (n, m).
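With the full table in memory, a shortest path in this DAG can be read off by walking back from (n, m) to (0, 0) along edges that achieve the minimum. The sketch below does this and reports the edits on the path. The function name `edit_script` and the tuple format of the operations are assumptions for illustration, and ties between equally good paths are broken arbitrarily (diagonal first).

```python
def edit_script(x: str, y: str):
    """Return (distance, list of edits) by tracing a shortest DAG path."""
    n, m = len(x), len(y)
    E = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        E[i][0] = i
    for j in range(m + 1):
        E[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diff = 0 if x[i - 1] == y[j - 1] else 1
            E[i][j] = min(diff + E[i - 1][j - 1],
                          1 + E[i - 1][j],
                          1 + E[i][j - 1])

    # Walk back from (n, m) to (0, 0) along edges that achieve the minimum.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            diff = 0 if x[i - 1] == y[j - 1] else 1
            if E[i][j] == diff + E[i - 1][j - 1]:
                if diff:
                    ops.append(("substitute", x[i - 1], y[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and E[i][j] == 1 + E[i - 1][j]:
            ops.append(("delete", x[i - 1]))
            i -= 1
        else:
            ops.append(("insert", y[j - 1]))
            j -= 1
    ops.reverse()
    return E[n][m], ops
```

Every edge of cost 1 on the path contributes exactly one edit, so the number of reported edits equals the distance.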
Binary Search Trees
Given n totally ordered keys $a_1 < a_2 < \ldots < a_n$.

Data structure to store the keys so that one can answer dictionary queries: is a one of the keys?
Binary Search Tree:

a full binary tree T
keys stored at the leaves of the tree
leaves in left to right order give the sorted order $a_1, a_2, \ldots, a_n$
internal nodes store relevant information to guide the search
Given a key a, we can walk down the tree to check whether a is in the tree or not.
Balanced Binary Search Trees
General setting: keys are dynamic with insertions, deletions, etc.
Dynamic search trees: keep the tree balanced so that the height of the tree is O(log n). Search/insertion/deletion take O(log n) time.
Static Setting with Statistical Information
Static setting:
keys $a_1, a_2, \ldots, a_n$ known in advance
no insertions or deletions, only search queries
frequencies of search queries are also known: $p_i$ is the probability of querying $a_i$
Problem: design a binary search tree T so as to minimize the average search time
$$\sum_{i=1}^{n} p_i \, s_T(a_i)$$
where $s_T(a_i)$ is the search time for $a_i$ in T.
What is $s_T(a_i)$? The depth of $a_i$ in T, denoted by $d_T(a_i)$.
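To make the objective concrete, the short sketch below evaluates the average search time for a given list of leaf depths. The function name `average_search_time` and the sample numbers are illustrative assumptions. It also assumes that depth counts the nodes on the root-to-leaf path (the root has depth 1), which is the convention that agrees with the base case S(i, i) = p_i used in the recurrence later on.

```python
def average_search_time(probs, depths):
    """Sum of p_i * d_T(a_i) over all keys, given each key's leaf depth."""
    return sum(p * d for p, d in zip(probs, depths))


# Hypothetical example: three keys with probabilities 0.5, 0.25, 0.25.
probs = [0.5, 0.25, 0.25]
tree_a = [2, 3, 3]  # a_1 directly below the root, a_2 and a_3 one level deeper
tree_b = [3, 3, 2]  # a_3 directly below the root instead
cost_a = average_search_time(probs, tree_a)  # 2.5
cost_b = average_search_time(probs, tree_b)  # 2.75
```

Placing the most frequently queried key closer to the root gives the lower average, which is exactly the trade-off the optimization problem is about.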

Real Problem
Can search for any key a

Statistical information: $q_0, p_1, q_1, p_2, q_2, \ldots, p_n, q_n$
$p_i$: probability that $a_i$ is searched for
$q_i$: probability that a key a in the range $(a_i, a_{i+1})$ is searched for
Ideas from the simpler problem can be extended to this real problem.

Optimal Binary Search Trees: Recursive Solution?
Can we solve the problem recursively?
S(i, j): optimum cost of a binary search tree for $a_i, a_{i+1}, \ldots, a_j$ with probabilities $p_i, p_{i+1}, \ldots, p_j$
Want S(1, n)
Recurrence for S(i, j)

$$S(i, j) = \min_{i \le k < j} \Bigl( S(i, k) + S(k + 1, j) + \sum_{\ell = i}^{j} p_\ell \Bigr)$$
Base case: $S(i, i) = p_i$ for $1 \le i \le n$
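The recurrence can be tried out directly with a memoized recursion. This is a minimal sketch under stated assumptions: the function name `optimal_bst_cost_recursive` is hypothetical, `p` is a non-empty list holding $p_1, \ldots, p_n$ (so $p_i$ is `p[i - 1]`), and prefix sums are used to obtain the sum of $p_i, \ldots, p_j$ in constant time.

```python
from functools import lru_cache


def optimal_bst_cost_recursive(p):
    """Optimum cost S(1, n) for keys stored at the leaves, by memoization."""
    n = len(p)
    prefix = [0.0]
    for value in p:
        prefix.append(prefix[-1] + value)  # prefix[j] = p_1 + ... + p_j

    @lru_cache(maxsize=None)
    def S(i, j):
        if i == j:
            return p[i - 1]                 # base case S(i, i) = p_i
        weight = prefix[j] - prefix[i - 1]  # p_i + ... + p_j
        # Try every split point k with i <= k < j.
        return min(S(i, k) + S(k + 1, j) for k in range(i, j)) + weight

    return S(1, n)
```

The added weight accounts for every key in the range moving one level deeper when two subtrees are joined under a new root.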

Iterative Algorithm
$$S(i, j) = \min_{i \le k < j} \Bigl( S(i, k) + S(k + 1, j) + \sum_{\ell = i}^{j} p_\ell \Bigr)$$
How many subproblems? $O(n^2)$

Precomputation: $P(i, j) = \sum_{\ell = i}^{j} p_\ell$ for all $i \le j$ in $O(n^2)$ time.
Iterative Algorithm
$$S(i, j) = \min_{i \le k < j} \bigl( S(i, k) + S(k + 1, j) + P(i, j) \bigr)$$

for i = 1 to n do
    S[i, i] = P[i, i]
for d = 1 to n-1 do
    for i = 1 to n-d do
        j = i + d
        S[i, j] = min_{i <= k < j} (S[i, k] + S[k+1, j] + P[i, j])
Running time: $O(n^3)$

Space: $O(n^2)$
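The iterative algorithm translates to Python fairly directly. The sketch below is illustrative only: the name `optimal_bst_cost` is an assumption, `p` is assumed to be a non-empty list with $p_i$ stored at `p[i - 1]`, and the tables are padded so that indices run from 1 to n as in the pseudocode.

```python
def optimal_bst_cost(p):
    """Fill S[i][j] in order of increasing d = j - i; return S[1][n]."""
    n = len(p)
    # Precompute P[i][j] = p_i + ... + p_j in O(n^2) time.
    P = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        running = 0.0
        for j in range(i, n + 1):
            running += p[j - 1]
            P[i][j] = running

    S = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        S[i][i] = P[i][i]
    for d in range(1, n):
        for i in range(1, n - d + 1):
            j = i + d
            # Both S[i][k] and S[k+1][j] cover shorter ranges, so they are ready.
            S[i][j] = min(S[i][k] + S[k + 1][j] for k in range(i, j)) + P[i][j]
    return S[1][n]
```

Ordering by the range length d guarantees that every entry on the right-hand side has been computed before it is read.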
Computing the Table: Alternative 1
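for i = 1 to n do
    S[i, i] = P[i, i]
for i = n downto 1 do
    for j = i+1 to n do
        S[i, j] = min_{i <= k < j} (S[i, k] + S[k+1, j] + P[i, j])
Computing the Table: Alternative 2
for i = 1 to n do
    S[i, i] = P[i, i]
for j = 1 to n do
    for i = j-1 downto 1 do
        S[i, j] = min_{i <= k < j} (S[i, k] + S[k+1, j] + P[i, j])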
Knapsack Problem
Input: n items. Each item i has a positive integer size $s_i$ and a positive integer profit $p_i$. A knapsack of integer capacity B.
Goal: Pack a maximum-profit subset of items into the knapsack.
Towards a Recursive Solution
Observation
Consider an optimal solution O.
Case item $n \in O$: then $O \setminus \{n\}$ is an optimum solution for items 1 to n-1 in a knapsack of capacity $B - s_n$.
Case item $n \notin O$: then O is an optimal solution for items 1 to n-1 in a knapsack of capacity B.
Subproblems depend also on remaining capacity.
OPT(i, C): optimum profit for items 1 to i in a knapsack of size C
Goal: compute OPT(n, B)

Recursive Solution
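OPT(i, C): optimum profit for items 1 to i in a knapsack of size C

$$\mathrm{OPT}(i, C) = \max\left\{ \mathrm{OPT}(i-1, C),\; \begin{cases} p_i + \mathrm{OPT}(i-1, C - s_i) & \text{if } s_i \le C \\ 0 & \text{if } s_i > C \end{cases} \right\}$$

Base cases: OPT(i, 0) = 0 for i = 0 to n, and OPT(0, C) = 0 for C = 0 to B.
How many subproblems? O(nB)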
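The recurrence for OPT(i, C) can be checked on small inputs with a memoized recursion. The sketch below is a hedged illustration: the function name `knapsack_recursive` and the sample instance are assumptions, and items are 0-indexed in the Python lists, so item i has size `sizes[i - 1]` and profit `profits[i - 1]`.

```python
from functools import lru_cache


def knapsack_recursive(sizes, profits, B):
    """Optimum profit OPT(n, B) via the recurrence, memoized on (i, C)."""
    n = len(sizes)

    @lru_cache(maxsize=None)
    def OPT(i, C):
        if i == 0 or C == 0:
            return 0               # no items left or no capacity left
        best = OPT(i - 1, C)       # item i is not packed
        if sizes[i - 1] <= C:      # item i fits: consider packing it
            best = max(best, profits[i - 1] + OPT(i - 1, C - sizes[i - 1]))
        return best

    return OPT(n, B)
```

Only pairs (i, C) that are actually reached get evaluated, but in the worst case that is still on the order of nB subproblems.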

Iterative Algorithm
for i = 0 to n do
    OPT[i, 0] = 0
for C = 0 to B do
    OPT[0, C] = 0
for i = 1 to n do
    for C = 1 to B do
        if s_i <= C then
            OPT[i, C] = max(OPT[i-1, C], p_i + OPT[i-1, C - s_i])
        else
            OPT[i, C] = OPT[i-1, C]
Output OPT[n, B]
Running time: O(nB)

Space: O(nB)
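A possible Python rendering of the table-filling algorithm is sketched below. It also walks back through the table to report which items were packed, a step the pseudocode does not include. The function name `knapsack`, the 1-based item numbers in the result, and the sample instance in the tests are assumptions for illustration.

```python
def knapsack(sizes, profits, B):
    """Return (optimum profit, list of packed item numbers, 1-based)."""
    n = len(sizes)
    # Row 0 and column 0 stay 0: no items, or no capacity.
    OPT = [[0] * (B + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        s, p = sizes[i - 1], profits[i - 1]
        for C in range(1, B + 1):
            OPT[i][C] = OPT[i - 1][C]
            if s <= C:
                OPT[i][C] = max(OPT[i][C], p + OPT[i - 1][C - s])

    # Trace back: item i was packed exactly when it changed the optimum.
    chosen = []
    C = B
    for i in range(n, 0, -1):
        if OPT[i][C] != OPT[i - 1][C]:
            chosen.append(i)
            C -= sizes[i - 1]
    chosen.reverse()
    return OPT[n][B], chosen
```

The traceback relies on keeping the whole O(nB) table. If only the optimum value is needed, a single row indexed by capacity would suffice.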
Knapsack Algorithm and Polynomial time
Input size for Knapsack: $O(n) + \log B + \sum_{i=1}^{n} (\log s_i + \log p_i)$
Running time of dynamic programming algorithm: O(nB)
Not a polynomial time algorithm!
Example: $B = 2^n$ and $s_i, p_i \in [1..2^n]$.

Input size is $O(n^2)$ bits (each number takes at most about n bits), while the running time is $O(n 2^n)$.
The algorithm is called a pseudo-polynomial time algorithm because its running time is polynomial if the numbers in the input have magnitude polynomial in the combinatorial size of the problem (here, n).
Knapsack is NP-hard when the numbers are not polynomially bounded in n.

Traveling Salesman Problem
Input: A graph G = (V, E) with non-negative edge costs/lengths; c(e) denotes the cost of edge e
Goal: Find a tour of minimum cost that visits each node.
No polynomial time algorithm known. Problem is NP-Hard.

An Exponential Time Algorithm
How many different tours are there? n!

Stirling's formula: $n! \approx \sqrt{2 \pi n}\,(n/e)^n$, which is $2^{\Theta(n \log n)}$.
Can we do better? Can we get a $2^{O(n)}$ time algorithm?

A More General Path Problem
Given G and nodes $v_i$, $v_j$, find a minimum cost path from $v_i$ to $v_j$ that visits every node exactly once.
Can solve TSP using the above. Do you see how?
Let $f(i, j, V)$ be the cost of a minimum cost path from $v_i$ to $v_j$ that visits all nodes.
Can we express this as a recursive solution?
What is the next node in the optimum path from i to j? Suppose it is $v_k$. Then what is $f(i, j, V)$?
$f(i, j, V) = c(v_i, v_k) + f(k, j, V \setminus \{i\})$

A Recursive Solution
$$f(i, j, V) = \min_{k \ne i, j} \bigl( c(v_i, v_k) + f(k, j, V \setminus \{i\}) \bigr)$$
Why is $f(k, j, V \setminus \{i\})$ a subproblem?

What are the subproblems?
$f(a, b, S)$ for $a = 1, 2, \ldots, n$, $b = 1, 2, \ldots, n$, $S \subseteq V$.
How many subproblems? $O(n^2 2^n)$

Exercise: Show that one can compute TSP using the above dynamic program in $O(n^3 2^n)$ time and $O(n^2 2^n)$ space.
Disadvantage of dynamic programming solution: memory!
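One possible way to turn the subproblems f(a, b, S) into code is sketched below. It is not a worked solution to the exercise, and several choices are assumptions made for illustration: the function name `tsp_cost`, nodes numbered 0 to n-1, a complete cost matrix `c` (a missing edge could be given cost `float("inf")`), the set S encoded as a bitmask, and the tour taken to start and end at node 0. The base case, a set containing only a and b, is not spelled out above and is filled in here as the direct edge cost.

```python
from functools import lru_cache


def tsp_cost(c):
    """Minimum tour cost for cost matrix c, using subproblems f(a, b, S)."""
    n = len(c)
    if n == 1:
        return 0
    full = (1 << n) - 1  # bitmask with all n nodes present

    @lru_cache(maxsize=None)
    def f(a, b, S):
        # Min cost path from a to b visiting exactly the nodes in S once.
        rest = S & ~(1 << a)      # S without the start node a
        if rest == (1 << b):
            return c[a][b]        # only b remains: take the edge a -> b
        # Otherwise the next node k is some node of S other than a and b.
        return min(c[a][k] + f(k, b, rest)
                   for k in range(n)
                   if k != b and (rest >> k) & 1)

    # A tour is a path from node 0 to some j covering all nodes, plus edge j -> 0.
    return min(f(0, j, full) + c[j][0] for j in range(1, n))
```

The memo table is what makes the memory cost visible: it can hold an entry for every reachable triple (a, b, S), which grows as $n^2 2^n$.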
