
Final Review Slides:

Connected Components

Suppose you reverse all of the edges of graph G to form graph G’. Do G and G’ have the same weakly connected components? Strongly connected components?
G and G’ have the same weakly connected components and the same strongly connected components.

Suppose we replaced DFS-loop in Kosaraju’s algorithm and just mapped the indexes in the topological ordering of the vertices to finishing times. Would the algorithm still function as normal?
No, topological orderings are undefined for directed graphs with cycles. Why? By definition, if (u, v) ∈ E then u precedes v in the topological ordering. But in a cycle v1 → … → vk → v1, v1 must precede vk, which must precede v1, a contradiction. Though, if you use the topo_sort algorithm, it will work.

Suppose a graph G contains two distinct SCCs C1 and C2 and there exists an edge (u, v) such that u ∈ C1 and v ∈ C2. Does u have a lower finishing time than v in G_rev?
Yes. If (u, v) is an edge in G then (v, u) is an edge in G_rev, and for u to have a higher finishing time than v (i.e. finish after v), there must be another path u → (some intermediate vertices) → v in G_rev. But this creates a cycle involving u and v, which can’t happen since u and v are in different SCCs.

Dijkstra’s

Suppose that we construct a graph G’ from a graph G with non-negative edge weights such that for each (u, v) ∈ E, we introduce two new vertices a’ and b’ where u → a’ → b’ → v and w_G’(u, a’) = -1, w_G’(b’, v) = 1, and w_G’(a’, b’) = w_G(u, v). Can we solve the single-source shortest path problem for G by evaluating Dijkstra’s on G’?
Yes. Even though we introduced negative edge weights in G’, Dijkstra’s would only fail if other edges extended from a’. In fact, it works as long as w_G’(u, a’) + w_G’(a’, b’) + w_G’(b’, v) = w_G(u, v).

Same problem as the previous slide, except w_G’(u, a’) = w_G’(b’, v) = 1.
No, the path weights in G’ unfairly penalize shortest paths in G that consist of many small-weight edges, because every edge of G now costs an extra 2.

Dynamic Programming

Do Dijkstra’s, Bellman-Ford, or Floyd-Warshall use dynamic programming? Are any of them greedy?
All three use DP. Dijkstra’s can be considered greedy since it always chooses the vertex with the lowest weight next.

What’s the difference between the invariant maintained in Bellman-Ford and the one in Floyd-Warshall?
Bellman-Ford records the shortest path that uses at most k edges, and allows this path to pass through any vertex. Floyd-Warshall requires the path to pass only through vertices 1 to k-1.

Find the longest arithmetic subsequence in an ordered list of distinct integers.
e.g. A = [1, 2, 3, 6, 5, 4, 7, 9, 8] => 5 from [1, 3, 5, 7, 9].
At every index, populate a dict B with a mapping from the increment to the length of the sequence so far. To populate index i, set B[i, A[i] - A[k]] = B[k, A[i] - A[k]] + 1 for all indices k = 1 to i - 1, treating a missing entry B[k, d] as 1 (the single element A[k]). It helps to keep track of the max, so you don’t need to fetch it from B afterwards.

Minimum Spanning Trees

Suppose we’re using Prim’s algorithm to find the MST, and after half of the vertices have been added to A, we want to switch to Kruskal’s. How can we modify Kruskal’s to accommodate this?
Instead of initializing all of the vertices to their own sets, initialize only the vertices in V\A to their own sets and put all of the vertices in A into one large set. Then proceed as usual.
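The dictionary recurrence from the longest arithmetic subsequence question above can be sketched in Python roughly as follows. This is a minimal illustration, not the slides' own code: the function name is hypothetical, indices are 0-based, and a missing entry B[k, d] is assumed to count as 1 (the lone element A[k]). It relies on the integers being distinct, as the question states.

```python
def longest_arithmetic_subsequence(a):
    """Length of the longest arithmetic subsequence of a list of distinct integers."""
    if len(a) < 2:
        return len(a)
    # b[i][d] = length of the longest arithmetic subsequence ending at index i
    # with common difference d (the "increment" in the slide).
    b = [dict() for _ in a]
    best = 2  # any two elements form an arithmetic subsequence
    for i in range(len(a)):
        for k in range(i):
            d = a[i] - a[k]
            # Extend the best run ending at k with difference d;
            # a missing entry means the run is just a[k] itself.
            b[i][d] = b[k].get(d, 1) + 1
            best = max(best, b[i][d])  # track the max as we go
    return best


print(longest_arithmetic_subsequence([1, 2, 3, 6, 5, 4, 7, 9, 8]))
```

The double loop makes this O(n^2) time and space for a list of n integers.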
Greedy as good as optimal: Just confirming, xi is the number of new nodes that you cover by adding the ith set, and yi is the number of nodes covered by your current solution before adding that ith set, right?
The key argument is that, by definition, the optimal solution uses k sets and covers OPT nodes. So, no matter how you've picked your i−1 previous sets, there are some sets which are used by the optimal solution which you
haven't used yet. In particular, if the current solution covers yi nodes, then at least OPT − yi of the nodes covered by the optimal solution are not yet covered. And, since it's possible to cover all OPT nodes of the optimal solution
using k sets, it's definitely possible to cover those OPT − yi nodes with those k sets. This means that at least one of those k sets covers at least (OPT − yi)/k new nodes. Since the greedy algorithm chooses the next set which adds the
most new nodes, and one of the options adds at least (OPT − yi)/k new nodes, whatever greedy picks adds at least (OPT − yi)/k new nodes, i.e. xi ≥ (OPT − yi)/k.
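To make xi and yi concrete, here is a small Python sketch of the greedy rule described above (pick the set that adds the most new nodes, k times). The function name and the sample sets are illustrative assumptions, not taken from the original problem.

```python
def greedy_max_coverage(sets, k):
    """Greedily pick up to k sets, each time taking the one that adds the most new nodes.

    Returns (chosen_indices, gains, covered), where gains[i] plays the role of x_i.
    """
    covered = set()
    chosen, gains = [], []
    if not sets:
        return chosen, gains, covered
    for _ in range(k):
        # The set with the largest number of not-yet-covered nodes.
        best = max(range(len(sets)), key=lambda j: len(sets[j] - covered))
        gain = len(sets[best] - covered)
        if gain == 0:
            break  # nothing new can be covered
        chosen.append(best)
        gains.append(gain)
        covered |= sets[best]
    return chosen, gains, covered


# Hypothetical instance: with k = 2 the optimum covers OPT = 6 nodes (sets 0 and 2).
sets = [{1, 2, 3, 4}, {3, 4, 5}, {5, 6}, {1, 6}]
chosen, gains, covered = greedy_max_coverage(sets, 2)

# Check x_i >= (OPT - y_i) / k at every step, with y_i the coverage before step i.
OPT, k, y = 6, 2, 0
for x in gains:
    assert x >= (OPT - y) / k
    y += x
```

On this instance greedy happens to match the optimum; the inequality is what holds in general.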

(d) (2 points) Recall that we spent a lecture showing that if we choose the “pivot” in quicksort uniformly at random from the 𝑛
numbers being sorted, we obtain a randomized algorithm that is correct with probability 1, and which has expected runtime 𝑂(𝑛
log 𝑛). In one sentence describe how we could modify the randomized quicksort algorithm to obtain a deterministic quicksort
algorithm that also runs in time 𝑂(𝑛 log 𝑛). Justify why your modified algorithm still runs in time 𝑂(𝑛 log 𝑛) in at most two
sentences. (Feel free to refer to other algorithms that we studied in class.)
At each recursive step, rather than picking the pivot randomly, pick the median as the pivot using the Select algorithm, which runs in 𝑂(𝑛)
time. Then at each step we are guaranteed to recurse on problems half the size, and at each level we need linear time to find the pivot
and partition the array, so this version of quicksort obeys the deterministic recurrence 𝑇(𝑛) = 2𝑇(𝑛/2) + 𝑂(𝑛) = 𝑂(𝑛 log 𝑛).
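As a rough sketch of this idea, the Python below picks the pivot with a median-of-medians selection routine, which is assumed here to be what "the Select algorithm" refers to. The function names are illustrative, and the code builds new lists instead of partitioning in place, to keep it short.

```python
def select(items, rank):
    """Return the element with 0-based rank `rank` in sorted order (median of medians)."""
    if len(items) <= 5:
        return sorted(items)[rank]
    # Median of each group of (at most) five elements.
    groups = [items[i:i + 5] for i in range(0, len(items), 5)]
    medians = [sorted(g)[len(g) // 2] for g in groups]
    # Recursively find the median of the medians and use it as the pivot.
    pivot = select(medians, len(medians) // 2)
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    if rank < len(smaller):
        return select(smaller, rank)
    if rank < len(smaller) + len(equal):
        return pivot
    larger = [x for x in items if x > pivot]
    return select(larger, rank - len(smaller) - len(equal))


def deterministic_quicksort(items):
    """Quicksort that always pivots on the median, found deterministically."""
    if len(items) <= 1:
        return list(items)
    pivot = select(items, len(items) // 2)
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return deterministic_quicksort(smaller) + equal + deterministic_quicksort(larger)


print(deterministic_quicksort([5, 3, 8, 1, 9, 2, 7]))
```

Because the pivot is the median, both recursive calls receive at most half of the elements, which is where the recurrence T(n) = 2T(n/2) + O(n) comes from.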

(f) (2 points) What are the key similarities and differences between balanced binary search trees, and hash tables? For full credit,
describe at least one similarity, and at least one difference.
Similarity: Hash tables and balanced binary search trees are both data structures that allow fast insert, lookup, and delete operations.
Difference: In a hash table, insert, lookup, and delete operations take constant time (in expectation). However, these operations take 𝑂(log 𝑛)
time in balanced binary search trees.

(g) (2 points) Suppose I have a deterministic algorithm for Problem-X that interacts with an input via a series of Yes/No/Maybe
questions, and then produces the correct output. Suppose that there are 𝑛 possible inputs to the problem, and 𝑚 possible outputs,
with 𝑚 < 𝑛 (and that each input has exactly one correct output). There exists an input such that my algorithm will ask at least
XXXX questions. Which of the following quantities is the correct value of XXXX? (a) log₂ 𝑛 (b) log₂ 𝑚 (c) log₃ 𝑛 (d) log₃ 𝑚
Choose one of these four options, and give a one-sentence proof/justification for this lower-bound.
(d) log₃ 𝑚. Suppose the algorithm asks at most 𝑠 questions on all inputs; since the algorithm is deterministic, each of the 3^𝑠 sequences of 𝑠 answers
leads to at most one output, hence we must have 𝑚 ≤ 3^𝑠, which implies 𝑠 ≥ log₃ 𝑚, so on some input the algorithm must ask at least log₃ 𝑚 questions.

(h) (2 points) Consider a set of hash functions, 𝐻 = {ℎ1, . . . , ℎ𝑘}, with each function ℎ𝑖 : 𝑈 → {1, 2, . . . , 𝑛} mapping a universe 𝑈 of
“keys” into 𝑛 buckets. This set, 𝐻, is a “Universal” family if, for all 𝑘, 𝑘′ ∈ 𝑈 with 𝑘 ≠ 𝑘′, if ℎ is drawn uniformly at random from
the set 𝐻, Pr[ℎ(𝑘) = ℎ(𝑘′)] ≤ 1/𝑛. For any 𝑈 and 𝑛, there exists a universal family, 𝐻𝑈,𝑛, such that if ℎ is chosen uniformly at
random from 𝐻𝑈,𝑛, then Pr[ℎ(𝑘) = ℎ(𝑘′)] = 1/𝑛. [Note the STRICT equality sign; this is not a typo!!] Describe how to construct
the set 𝐻𝑈,𝑛 in at most one sentence. No proof necessary.
𝐻𝑈,𝑛 is simply the set of all hash functions from 𝑈 to {1, 2, . . . , 𝑛}.
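For a tiny universe this claim can be checked by brute force. The sketch below enumerates every function from a small, made-up universe to {1, ..., n} and counts how many of them collide on a given pair of distinct keys; the function names and the example universe are assumptions for illustration only.

```python
from fractions import Fraction
from itertools import product


def all_functions(universe, n):
    """Yield every function from `universe` to {1, ..., n}, each as a dict."""
    keys = list(universe)
    for values in product(range(1, n + 1), repeat=len(keys)):
        yield dict(zip(keys, values))


def collision_probability(universe, n, key_a, key_b):
    """Exact Pr[h(key_a) == h(key_b)] when h is uniform over all functions."""
    family = list(all_functions(universe, n))
    collisions = sum(1 for h in family if h[key_a] == h[key_b])
    return Fraction(collisions, len(family))


# Hypothetical universe of three keys hashed into four buckets.
print(collision_probability("abc", 4, "a", "b"))
```

The family has n^|U| members, so this is only a sanity check for toy sizes, not a practical construction.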

(i) (3 points) Dijkstra’s algorithm allows us to find the shortest paths between a “source” vertex, 𝑠, and every other vertex in a
weighted graph. Consider a slightly easier version of this problem, where all the edges have weight 1. In this case, does Dijkstra’s
algorithm visit nodes in the same order as Depth-First-Search, Breadth-First-Search, neither DFS nor BFS, or both? Justify your
answer in one or two sentences—no need for a formal proof.
We accepted two answers (either way, a convincing 1- or 2-sentence justification is required):
∙ “BFS”. BFS, starting at node 𝑠, visits vertices in order of their unweighted distance, or unweighted shortest path length, from 𝑠. When all
edges have weight 1, Dijkstra’s algorithm also visits nodes by unweighted distance from 𝑠.
∙ “Neither”. Dijkstra’s algorithm visits nodes in the order they are popped from its priority queue (a black box). Therefore, nodes with the
same distance from 𝑠 may not be visited in the same order by Dijkstra’s algorithm as by BFS.
Note that DFS does not proceed by distance, so the answers “DFS” and “either/both BFS and DFS” are incorrect.
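Both accepted answers can be seen in a small experiment. The sketch below runs BFS and a unit-weight Dijkstra on a made-up graph, given as an adjacency dict in which every vertex appears as a key (an assumption of this example). The two agree on distances and both visit vertices in nondecreasing distance, while the order among equally distant vertices depends on how each queue breaks ties.

```python
import heapq
from collections import deque


def bfs_distances(graph, source):
    """Hop distances from source; dict insertion order is the BFS visit order."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def dijkstra_unit_distances(graph, source):
    """Dijkstra with every edge weight equal to 1; insertion order is the pop order."""
    dist = {}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in dist:
            continue  # stale heap entry
        dist[u] = d
        for v in graph[u]:
            if v not in dist:
                heapq.heappush(heap, (d + 1, v))
    return dist


# Hypothetical directed graph for illustration.
graph = {"s": ["b", "a"], "a": ["c"], "b": ["c"], "c": ["d"], "d": []}
bfs = bfs_distances(graph, "s")
dij = dijkstra_unit_distances(graph, "s")
print(list(bfs), list(dij))
```

Here the heap breaks the tie between "a" and "b" by vertex name, whereas BFS follows adjacency-list order, which is exactly the tie-breaking difference the “Neither” answer points to.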

Problem 1
F, F, skipped (I don't think we covered this), skipped (I don't think we covered this), F, F, T, T, T, T

T F n^(1/10) = O((log n)^5)
T F The worst-case running time of Randomized Quicksort on an array of n elements is O(n log n).
T F A heap can be used to sort an array of n elements with worst-case running time O(n log n).
T F A hash table can be used to sort an array of n elements with expected running time O(n).
T F Every algorithm that always correctly computes the median of an array of n elements has worst-case running time Ω(n log n).
T F Dijkstra’s algorithm is always correct even in graphs with negative edge weights.
T F Prim’s algorithm is always correct even in graphs with negative edge weights.
T F Using a suitable data structure, Dijkstra’s algorithm can be implemented in O(m log n) time in graphs with m edges and n vertices.
T F Using a suitable data structure, Prim’s algorithm can be implemented in O(m log n) time in graphs with m edges and n vertices.
T F Linearity of expectation (that the expectation of a sum of random variables equals the sum of the random variables’ expectations) holds even for random variables that are not independent.
