cs180 - introduction to algorithms and complexity
course notes from summer '24
Stable Matching
A common example of a stable matching problem is that of marriage, in which we’d like to match \(n\) men with \(n\) women based on their preferences. Here, we consider the scenario of med-school student admissions to hospitals. We are given a set of preferences among hospitals and med-school students and we’d like to come up with an algorithm to match students to hospitals.
- unstable pair
- hospital \(h\) and student \(s\) form an unstable pair if both
- \(h\) prefers \(s\) to one of its admitted students
- \(s\) prefers \(h\) to its assigned hospital
In other words, an unstable pair occurs when the hospital and student both prefer each other more than their current matching. Given this, we can now define a stable matching, which is simply an assignment with no unstable pairs.
- matching
- a matching \(M\) is a set of ordered pairs \(h-s\) with \(h \in H\) and \(s \in S\) such that all hospitals and students appear in at most one pair of \(M\)
- perfect matching
- a matching \(M\) is perfect if all hospitals and students are matched
Note that, graphically, we can think of these matchings as a bipartite graph.
- stable matching
- a stable matching is a perfect matching with no unstable pairs
Gale-Shapley
\begin{algorithm}
\caption{Gale-Shapley}
\begin{algorithmic}
\STATE initialize $M$ to empty matching
\WHILE{some hospital $h$ is unmatched and hasn't offered to every student}
\STATE $s =$ first student on $h$'s list to whom $h$ has not yet offered
\IF{$s$ is unmatched}
\STATE add $h$-$s$ to matching $M$
\ELIF{$s$ prefers $h$ to current partner $h'$}
\STATE replace $h'$-$s$ with $h$-$s$ in matching $M$
\ELSE
\STATE $s$ rejects $h$
\ENDIF
\ENDWHILE
\end{algorithmic}
\end{algorithm}
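Below is a minimal Python sketch of Gale-Shapley, assuming complete preference lists, equally many hospitals and students, and our own conventions for names (gale_shapley, hospital_prefs, student_prefs) and data layout (preference lists as dictionaries keyed by name).

from typing import Dict, List

def gale_shapley(hospital_prefs: Dict[str, List[str]],
                 student_prefs: Dict[str, List[str]]) -> Dict[str, str]:
    # rank[s][h] = position of hospital h on student s's list (lower is better)
    rank = {s: {h: i for i, h in enumerate(prefs)}
            for s, prefs in student_prefs.items()}
    next_offer = {h: 0 for h in hospital_prefs}  # index of next student to offer to
    partner: Dict[str, str] = {}                 # student -> current hospital
    free = list(hospital_prefs)                  # unmatched hospitals

    while free:
        h = free.pop()
        s = hospital_prefs[h][next_offer[h]]     # best student not yet offered
        next_offer[h] += 1
        if s not in partner:                     # s is unmatched: accept
            partner[s] = h
        elif rank[s][h] < rank[s][partner[s]]:   # s prefers h: trade up
            free.append(partner[s])              # old partner becomes unmatched
            partner[s] = h
        else:                                    # s rejects h
            free.append(h)
    return {h: s for s, h in partner.items()}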
To prove the correctness of this algorithm, we must show that it (1) terminates, (2) produces a perfect matching, and (3) results in a stable matching.
- Claim: Gale-Shapley terminates in at most \(n^2\) iterations.
- By brief inspection, we can prove termination by noticing that each iteration of the while loop results in one offer, and each hospital offers to each student at most once. Since there are \(n\) hospitals and \(n\) students, hospitals make at most \(n^2\) offers in total, so the while loop runs at most \(n^2\) iterations.
- Claim: Gale-Shapley produces a perfect matching.
- Subclaim: Gale-Shapley produces a matching. A hospital makes offers only while it is unmatched, so each hospital appears in at most one pair. A student holds on to at most one offer at a time, keeping only the best so far, so each student also appears in at most one pair. Because both hospitals and students appear in at most one pair, the algorithm produces a matching.
- Subclaim: All hospitals and students are matched. We can show that all hospitals are matched with a proof by contradiction. Suppose that there exists a hospital \(h\) that is unmatched at the end of the algorithm. Since there are \(n\) hospitals and \(n\) students, an unmatched hospital implies an unmatched student \(s\). Once a student receives any offer, they stay matched for the rest of the algorithm, so \(s\) was never given an offer. However, the while loop only terminates once the unmatched \(h\) has offered to all \(n\) students, including \(s\). Thus, by contradiction, all hospitals are matched. It follows that if all hospitals are matched and every hospital is matched to at most one student, then all students are matched.
- Since Gale-Shapley produces a matching and all hospitals and students are matched, we can conclude that it produces a perfect matching.
- Claim: Gale-Shapley produces a stable matching.
- We can prove this by contradiction. Suppose the matching \(M\) from Gale-Shapley is not stable, meaning that there exists an unstable pair \(h-s\). We know that \(h\) prefers \(s\) over \(s'\) and \(s\) prefers \(h\) over \(h'\), where \(s'\) and \(h'\) are their current matches in \(M\). There are two cases for which this could occur.
- Case 1: \(h\) never offers to \(s\) and ends up matched with \(s'\). However, since \(h\) offers in order of preference, this would imply that \(h\) prefers \(s'\) over \(s\) and hence we have a contradiction.
- Case 2: \(h\) offers to \(s\), but \(s\) rejects \(h\) at some point, either immediately or later for a better offer. Since students only ever trade up, \(s\)'s final partner \(h'\) must be preferred over \(h\) and hence we have a contradiction.
- By contradiction, we have proven that the Gale-Shapley matching \(M\) does not have an unstable pair and thus produces a stable matching.
- valid partner
- a student \(s\) is a valid partner of \(h\) if there exists any stable matching in which \(h\) and \(s\) are matched
Claim: Gale-Shapley is hospital-optimal, meaning every hospital is matched with its best valid partner.
Proof Techniques
Loop Invariants
- loop invariant
- a loop invariant is a condition that must be true before entering the loop, after each iteration, and after the loop has completed
Induction
Claim: Any \(2^n \times 2^n\) board with any square removed can be tiled with L-shaped tiles for any \(n\) greater than or equal to \(1\).
We can prove this with induction.
- Base Case: For \(n = 1\), we have a \(2 \times 2\) board. By inspection, we can see that regardless of which square we remove, we can tile the remaining squares with an L-shaped tile. The four cases are shown below, in which the green square is removed and the blue squares are covered by the L-shaped tile.
- Inductive Hypothesis: Suppose that a \(2^n \times 2^n\) board with any square removed can be tiled with L-shaped tiles for some \(n \geq 1\).
- Inductive Step: For the \(n + 1\) case, we have a \(2^{n + 1} \times 2^{n + 1}\) board, which we can split into four \(2^n \times 2^n\) quadrants. Now, remove any square we’d like. The quadrant that this removed square sits in is now a \(2^n \times 2^n\) board with one square removed. For the remaining quadrants, we can place an L-shaped tile at the center of the \(2^{n + 1} \times 2^{n + 1}\) board, which essentially also turns the other three quadrants into \(2^n \times 2^n\) boards with one square removed. By the inductive hypothesis, we can now tile all of these quadrants with L-shaped tiles. The \(n = 2\) case is shown below, in which the green square is removed, the yellow squares are covered by the \(1\) L-shaped tile that we place, and the blue squares are covered by L-shaped tiles placed as a result of our inductive hypothesis. A recursive sketch of this construction follows.
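The inductive step translates directly into a recursive algorithm. Here is a minimal Python sketch under our own conventions: the board is a grid of tile ids, with \(0\) for uncovered squares and \(-1\) for the removed square, and next_id is a one-element list used as a mutable counter.

from typing import List

def tile(board: List[List[int]], top: int, left: int,
         size: int, hole_r: int, hole_c: int, next_id: List[int]) -> None:
    # tiles the size x size sub-board at (top, left) whose only
    # uncovered square is (hole_r, hole_c)
    if size == 1:
        return                                    # base case: just the hole
    next_id[0] += 1
    t = next_id[0]                                # id of the central L-shaped tile
    half = size // 2
    mid_r, mid_c = top + half, left + half
    for dr, dc in [(0, 0), (0, half), (half, 0), (half, half)]:
        r0, c0 = top + dr, left + dc
        if r0 <= hole_r < r0 + half and c0 <= hole_c < c0 + half:
            hr, hc = hole_r, hole_c               # quadrant containing the hole
        else:
            # cover this quadrant's center-most square with the central L-tile
            hr = mid_r - 1 if dr == 0 else mid_r
            hc = mid_c - 1 if dc == 0 else mid_c
            board[hr][hc] = t
        tile(board, r0, c0, half, hr, hc, next_id)

n = 2
size = 2 ** n
board = [[0] * size for _ in range(size)]
board[0][0] = -1                                  # remove the corner square
tile(board, 0, 0, size, 0, 0, [0])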
Asymptotic Notation
- big-o
- \(O(g(n))\) is the set of all functions \(f(n)\) such that there exist positive constants \(c\) and \(n_0\), where for any \(n \geq n_0\), we have \(0 \leq f(n) \leq c \cdot g(n)\)
Equivalently, \(f(n) \in O(g(n))\) if and only if \(\lim_{n \rightarrow \infty} \frac{f(n)}{g(n)} \lt \infty\).
- big-omega
- \(\Omega(g(n))\) is the set of all functions \(f(n)\) such that there exist positive constants \(c\) and \(n_0\), where for any \(n \geq n_0\), we have \(0 \leq c \cdot g(n) \leq f(n)\)
We can also use the limit rule definition, which says \(f(n) \in \Omega(g(n))\) if and only if \(\lim_{n \rightarrow \infty} \frac{f(n)}{g(n)} > 0\).
- big-theta
- \(\Theta(g(n))\) is the set of all functions \(f(n)\) such that there exist positive constants \(c_1, c_2\), and \(n_0\), where for any \(n \geq n_0\), we have \(0 \leq c_1 \cdot g(n) \leq f(n) \leq c_2 \cdot g(n)\)
The limit rule definition tells us that \(f(n) \in \Theta(g(n))\) if and only if \(\lim_{n \rightarrow \infty} \frac{f(n)}{g(n)} = k\) for some positive constant \(k\).
A general hierarchy of functions exists: constant → logarithmic → polynomial → exponential. When ordering functions, we can first identify which broad category they belong to and then order each class of functions. For instance, we want to rank the following functions: \(\log{n^n}, n^2, 2^{\log{n}}, \log^2n, n^n, n^{\log{n}}, 2^n, 2^{1000}, n^{\sqrt{2}}\).
| Constant | Logarithmic | Polynomial | Exponential |
|---|---|---|---|
| \(2^{1000}\) | \(\log^2n\) | \(\log{n^n} = n\log{n}, n^2, 2^{\log{n}} = n, n^{\sqrt{2}}\) | \(n^n, n^{\log{n}}, 2^n\) |
For the polynomial functions, we immediately know that \(n \lt n^{\sqrt{2}} \lt n^2\). For \(n\log{n}\), we can factor out \(n\) from all \(4\) functions and since we know that constant \(\lt\) logarithmic \(\lt\) polynomial functions, then the final ordering is \(n \lt n\log{n} \lt n^{\sqrt{2}} \lt n^2\).
For the exponential functions, we can compare them by taking the \(\log\) of every function. We can do this because \(\log\) is an order-preserving function. So, we get \(\log(n^{\log{n}}) = \log^2{n}\), \(\log(2^n) = n\), and \(\log(n^n) = n\log{n}\). We can easily order these functions (\(\log^2{n} \lt n \lt n\log{n}\)) and get \(n^{\log{n}} \lt 2^n \lt n^n\).
So, the overall ordering of functions is \(2^{1000} \lt \log^2n \lt 2^{\log{n}} \lt \log{n^n} \lt n^{\sqrt{2}} \lt n^2 \lt n^{\log{n}} \lt 2^n \lt n^n\).
Properties
\(O, \Omega, \Theta\) are all reflexive [\(f(n) \in O(f(n))\)] and transitive [if \(f(n) \in O(g(n))\) and \(g(n) \in O(h(n))\), then \(f(n) \in O(h(n))\)]. \(\Theta\) is also symmetric [if \(f(n) \in \Theta(g(n))\), then \(g(n) \in \Theta(f(n))\)].
Let \(f(n)\) and \(g(n)\) be non-negative functions. We want to prove that \(O(f(n) + g(n)) = O(\max(f(n), g(n)))\). To show that two sets are equal, we must show that both are subsets of one another.
- Claim: If \(h(n) \in O(f(n) + g(n))\), then \(h(n) \in O(\max(f(n), g(n)))\).
- Using the formal definition, we know that there exist positive constants \(c\) and \(n_0\), where for any \(n \geq n_0\), we have \(0 \leq h(n) \leq c \cdot (f(n) + g(n))\). Since \(f(n) + g(n) \leq 2 \cdot \max(f(n), g(n))\), it follows that \(0 \leq h(n) \leq 2c \cdot \max(f(n), g(n))\). Let \(c' = 2c\) and \(n_0' = n_0\); then by the formal definition, we can conclude that \(h(n) \in O(\max(f(n), g(n)))\).
- Claim: If \(h(n) \in O(\max(f(n), g(n)))\), then \(h(n) \in O(f(n) + g(n))\).
- The formal definition tells us that there exist positive constants \(c\) and \(n_0\), where for any \(n \geq n_0\), we have \(0 \leq h(n) \leq c \cdot \max(f(n), g(n))\). Since \(\max(f(n), g(n)) \leq f(n) + g(n)\), it follows that \(0 \leq h(n) \leq c \cdot (f(n) + g(n))\). Let \(c' = c\) and \(n_0' = n_0\); then by the formal definition, we can conclude that \(h(n) \in O(f(n) + g(n))\).
Since we’ve proven both are subsets of one another, we can conclude that \(O(f(n) + g(n)) = O(\max(f(n), g(n)))\).
To conduct runtime analysis, we can follow these steps:
- Given an algorithm and input of size \(n\), express the runtime as a function of \(n\), \(T(n)\).
- Derive a closed form expression for \(T(n)\) if necessary.
- Apply asymptotic notation to \(T(n)\) to obtain its order of growth.
from typing import List

def shrinking_array_max(a: List[int]) -> None:
    l = len(a)             # c_1
    while l >= 1:
        print(max(a[:l]))  # c_4 * (n / 2^i)
        l //= 2            # c_2 (integer halving)
        if l < 1:          # c_3
            break
For the above function, we can see that it consists of constant time operations \(c_1, c_2, c_3\) inside and outside of the while loop as well as a linear-time max operation. It’s important to note that because \(l\) is halved at every iteration, the size of the input array for max is also halved. It follows that the runtime for the max operation is actually \(c_4 \cdot \frac{n}{2^i}\), where \(i\) corresponds to the number of times \(l\) has been halved. We know that the number of times \(n\) can be halved is \(\log{n}\). So, in summary, we have the following:
\[T(n) = c_1 + \sum_{i = 0}^{\log{n}} \left(c_4 \cdot \frac{n}{2^i} + c_2 + c_3\right)\]
Now, we can apply asymptotic notation. The geometric series \(\sum_{i = 0}^{\log{n}} \frac{n}{2^i}\) is at most \(2n\) and the constant terms contribute only \(O(\log{n})\), giving us \(T(n) = O(n)\).
Graphs
- graph
- a graph \(G = (V, E)\) is a collection of vertices \(V\) of size \(n\) and edges \(E\) of size \(m\)
- degree
- in an undirected graph, the degree of a vertex is the number of edges for which it is an endpoint
Claim: The sum of degrees of all vertices in a graph is twice the number of edges. We can prove this with a loop invariant over the process of adding edges one at a time: after \(i\) iterations, the sum of degrees over all vertices is \(2i\), since each added edge contributes exactly \(1\) to the degree of each of its two endpoints.
Graph Representation
- adjacency matrix
- an adjacency matrix is an \(n \times n\) matrix \(A\) where \(A_{u, v} = 1\) if \((u, v)\) is an edge
An adjacency matrix has two representations for each edge of an undirected graph (\(A_{u, v} = A_{v, u}\)). Its spatial complexity is proportional to \(n^2\). We can check if \((u, v)\) is an edge in \(\Theta(1)\) time, but iterating through all the edges takes \(\Theta(n^2)\) time.
- adjacency list
- an adjacency list is a node-indexed array of lists
An adjacency list also has two representations for each edge, but only requires \(\Theta(m + n)\) space. We can check if \((u, v)\) is an edge in \(O(degree(u))\) time and identify all edges in \(\Theta(m + n)\) time.
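As a concrete illustration, here are both representations of the same small undirected graph (a hypothetical example of our own, with vertices \(0, 1, 2\) and edges \((0, 1)\) and \((0, 2)\)):

from typing import Dict, List

adj_matrix: List[List[int]] = [
    [0, 1, 1],
    [1, 0, 0],
    [1, 0, 0],
]                                   # Theta(n^2) space
adj_list: Dict[int, List[int]] = {  # Theta(m + n) space
    0: [1, 2],
    1: [0],
    2: [0],
}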
- path
- a path in an undirected graph \(G = (V, E)\) is a sequence of nodes \(v_1, v_2, ..., v_k\) where each consecutive pair \(v_{i - 1}, v_i\) is joined by an edge \((v_{i - 1}, v_i) \in E\)
We say that a graph is connected if there exists a path between every pair of nodes \(u\) and \(v\).
- cycle
- a cycle is a path \(v_1, v_2, ..., v_k\) in which \(v_1 = v_k\), \(k > 2\), and the first \(k - 1\) nodes are distinct
- tree
- an undirected graph is a tree if it is connected and does not contain a cycle
Claim: A tree on \(n\) vertices has exactly \(n - 1\) edges for all \(n\) greater than or equal to \(1\). We can prove this with induction.
- Base Case: For \(n = 1\), the tree has \(n - 1 = 0\) edges.
- Inductive Hypothesis: Suppose that a tree on \(n\) vertices has exactly \(n - 1\) edges for some \(n \geq 1\).
- Inductive Step: Then, for the \(n + 1\) case, we want to show that a tree \(T\) on \(n + 1\) vertices has exactly \(n\) edges. Let us pick a leaf node \(v\), which is a vertex with degree \(1\) (every tree on at least \(2\) vertices has one). If we remove \(v\) and the edge \(e\) that connects it to the rest of the tree, we obtain a graph \(T'\) with \(n\) vertices. We know that \(T'\) is also a tree because removing an edge and vertex doesn’t create a cycle and \(T'\) remains connected. Since \(T'\) is a tree on \(n\) vertices, then by the inductive hypothesis, \(T'\) has exactly \(n - 1\) edges. Now, we re-add \(v\) and \(e\) to obtain \(T\). So, \(T\) has \((n - 1) + 1 = n\) edges and we have proven the \(n + 1\) case.
Thus, by induction we have shown that a tree on \(n\) vertices has exactly \(n - 1\) edges for all \(n\) greater than or equal to \(1\).
Claim: Let \(G\) be an undirected graph on \(n\) nodes. Then, any two of the following implies the third: (1) \(G\) is connected, (2) \(G\) does not contain a cycle, (3) \(G\) has \(n - 1\) edges.
Graph Traversals
- breadth first search
- a breadth first search explores the graph from the starting node \(s\) in all possible directions and finds nodes level by level
\begin{algorithm}
\caption{Breadth First Search}
\begin{algorithmic}
\STATE mark $s$ as visited and the rest unvisited
\STATE set level[$0$] $=$ \{$s$\} and $i = 0$
\WHILE{level[$i$] is not empty}
\STATE initialize level[$i + 1$] as an empty list
\FOR{vertices $u$ in level[$i$]}
\FOR{neighbors $v$ of $u$}
\IF{$v$ has not been visited}
\STATE mark $v$ as visited
\STATE add $v$ to level[$i + 1$]
\ENDIF
\ENDFOR
\ENDFOR
\STATE increment $i$
\ENDWHILE
\end{algorithmic}
\end{algorithm}
Breadth first search runs in \(O(m + n)\) time. We can arrive at this fact by observing that the outer for loop will execute \(n\) times, since each vertex appears in only one level and the inner for loop will execute \(2m\) times, since \(\sum_{v \in V} degree(v) = 2m\).
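A minimal Python sketch of level-by-level BFS, assuming the adjacency-list dictionary convention from above (with every vertex appearing as a key):

from typing import Dict, List

def bfs_levels(adj: Dict[int, List[int]], s: int) -> List[List[int]]:
    # explore from s, collecting vertices level by level
    visited = {s}
    levels = [[s]]
    while levels[-1]:
        nxt = []
        for u in levels[-1]:
            for v in adj[u]:
                if v not in visited:
                    visited.add(v)
                    nxt.append(v)
        levels.append(nxt)
    return levels[:-1]  # drop the final empty level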
- depth first search
- a depth first search explores the graph in one direction as far as possible before backtracking
\begin{algorithm}
\caption{Depth First Search}
\begin{algorithmic}
\STATE mark all nodes as unvisited
\STATE initialize stack to only contain $s$
\WHILE{stack is not empty}
\STATE pop from the stack to obtain $u$
\IF{$u$ has not been discovered}
\STATE mark $u$ as discovered
\FOR{neighbors $v$ of $u$}
\STATE push $v$ onto stack
\ENDFOR
\ENDIF
\ENDWHILE
\end{algorithmic}
\end{algorithm}
Depth first search also runs in \(O(m + n)\) time.
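The same adjacency-list convention gives a short iterative Python sketch with an explicit stack:

from typing import Dict, List

def dfs(adj: Dict[int, List[int]], s: int) -> List[int]:
    # return vertices in the order DFS discovers them from s
    discovered, order, stack = set(), [], [s]
    while stack:
        u = stack.pop()
        if u not in discovered:
            discovered.add(u)
            order.append(u)
            for v in adj[u]:
                stack.append(v)  # duplicates allowed; the check above skips them
    return order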
- connected component
- the connected component of \(s\) is the set of all nodes reachable from \(s\)
To do graph induction proofs, we can use the following template:
- Start from \(G\) that satisfies the premise.
- Perform graph operations on \(G\) and obtain \(G'\) such that the premise of the inductive hypothesis holds for \(G'\).
- By the inductive hypothesis, the claim holds for \(G'\).
- Reconstruct \(G\) from \(G'\) and show that the claim also holds for \(G\).
- bipartite
- a graph is bipartite if its vertices can be partitioned into two sets \(L\) and \(R\) such that (1) \(L \cup R = V\), (2) \(L \cap R = \emptyset\), (3) every edge has one end in \(L\) and the other in \(R\)
We can also show that a graph is bipartite if and only if it is two-colorable and contains no odd length cycles.
Claim: A graph is bipartite if and only if it is two-colorable.
- (→) We want to prove that if a graph is bipartite, then it is two-colorable. We know that \(V\) can be partitioned into \(L\) and \(R\), which we can color blue and green, respectively. Since all edges \(e \in E\) have one end in \(L\) and the other in \(R\), then all edges are not monochromatic and thus we can conclude that the graph is two-colorable.
- (←) We want to prove that if a graph is two-colorable, then it is bipartite. Without loss of generality, we can assign \(L\) to be blue vertices and \(R\) to be green vertices. Since there are no monochromatic edges, all edges have one end in \(L\) and the other in \(R\) and thus we can conclude that the graph is bipartite.
Claim: A graph is bipartite if and only if it does not contain an odd length cycle.
- (→) We want to prove that if a graph is bipartite, then it does not contain an odd length cycle. By our previous result, \(G\) is also two-colorable. Now, suppose we have a \(k\)-length cycle in \(G\), \((v_0, v_1, ..., v_k)\). Since adjacent vertices must receive different colors, the colors alternate along the cycle: without loss of generality, the vertices at even indices are blue and those at odd indices are green. Since this is a cycle, \(v_0 = v_k\) and they must have the same color. \(v_0\) is colored blue, so \(v_k\) must also be blue, which implies that \(k\) is even. Thus, any arbitrary cycle in \(G\) must be of even length.
- (←) We want to prove that if a graph does not contain an odd length cycle, then it is bipartite. Let’s pick an arbitrary \(v \in V\) and use breadth first search to partition the vertices into \(L\) and \(R\) according to the parity of their distance from \(v\). A vertex is in \(L\) if the distance is even; otherwise it is in \(R\). Now color vertices in \(L\) blue and vertices in \(R\) green. We want to show that no monochromatic edge exists and we can do so by contradiction. Suppose a monochromatic edge \(e = (u', v')\) does exist. Without loss of generality, let \(u', v' \in L\), so both are even distances away from \(v\). If we connect the path from \(u'\) to \(v\), the path from \(v'\) to \(v\), and \(e\), we obtain a closed walk of odd length (two even-length paths plus one edge), which must contain an odd length cycle. However, this contradicts our original assumption and so we have shown that such a monochromatic edge could not exist. Thus, \(G\) is bipartite.
- strong connectivity
- a graph is strongly connected if there exists a path from \(u\) to \(v\) and \(v\) to \(u\) for every pair of nodes \(u, v\)
- directed acyclic graphs
- a directed acyclic graph or DAG is a directed graph that contains no directed cycle
- topological order
- a topological order of a directed graph \(G = (V, E)\) is an ordering of its nodes \(v_1, v_2, ..., v_n\) such that for every edge \((v_i, v_j)\) in \(G\), we have \(i < j\)
Claim: If \(G\) has a topological order, then \(G\) is a DAG. We can prove this by contradiction. Suppose \(G\) has a topological order \(v_1, v_2, ..., v_i, ..., v_j, ..., v_n\), but \(G\) is not a DAG. This means that there exists some directed cycle \((v_i, ..., v_j, v_i)\) for some \(i < j\). However, we have an edge \((v_j, v_i)\), where \(j > i\). Thus, we have contradicted one of the criteria of a topological order and so we can conclude that \(G\) has to be a DAG.
Claim: If \(G\) is a DAG, then \(G\) has a topological ordering. We can prove this by induction over the number of vertices \(n\).
- Base Case: For \(n = 1\), it is trivial that the graph \(G\) is a DAG and has a topological ordering.
- Inductive Hypothesis: Suppose that the statement holds for some \(n \geq 1\). In other words, if a graph \(G\) with \(n\) vertices is a DAG, then it has a topological ordering.
- Inductive Step: For the \(n + 1\) case, we want to show that if a graph \(G\) with \(n + 1\) vertices is a DAG, then it has a topological ordering. Because \(G\) is a DAG, we know there exists a node \(v\) with in-degree \(0\). Let’s remove \(v\) and its associated edges to obtain \(G' = G - \{v\}\). \(G'\) is a graph with \(n\) vertices. Because removing a node and its edges does not create any cycles, we know that \(G'\) is still a DAG. Our inductive hypothesis tells us that \(G'\) has a topological ordering. Now, we can add \(v\) back to obtain \(G\) and prepend it to the topological ordering of \(G'\), resulting in a topological ordering of \(G\) (every edge incident to \(v\) points away from it, since \(v\) has in-degree \(0\)).
Thus, by induction, we have shown that if \(G\) is a DAG, then \(G\) has a topological ordering.
\begin{algorithm}
\caption{Topological Ordering}
\begin{algorithmic}
\STATE find a node $v$ with no incoming edges and order it first
\STATE delete $v$ from $G$
\STATE recursively compute a topological ordering of $G - $\{$v$\} and append this order after $v$
\end{algorithmic}
\end{algorithm}
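This recursive description matches the iterative, queue-based Python sketch below (a Kahn-style variant under our own adjacency-list convention, with every vertex appearing as a key); a node becomes ready to order once its in-degree drops to \(0\):

from collections import deque
from typing import Dict, List

def topological_order(adj: Dict[int, List[int]]) -> List[int]:
    # count incoming edges for every node
    indeg = {u: 0 for u in adj}
    for u in adj:
        for v in adj[u]:
            indeg[v] += 1
    ready = deque(u for u in adj if indeg[u] == 0)  # no incoming edges
    order: List[int] = []
    while ready:
        u = ready.popleft()
        order.append(u)                 # order u next, then "delete" it
        for v in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return order  # shorter than len(adj) exactly when G has a directed cycle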
Greedy Algorithms
Interval Scheduling
- interval scheduling
- an interval scheduling problem is as follows: given a set of jobs, where job \(j\) has a start \(s_j\) and finish \(f_j\) time, we want to find the maximum subset of jobs where no two jobs overlap
One attempt could be to consider scheduling jobs with the earliest start time. However, we can see in the following case that selecting with this criteria would not result in the optimal solution.
Another attempt schedules jobs with the smallest interval first. However, we can also find a counterexample like below.
Claim: Scheduling by earliest finish time is optimal. We can prove this by contradiction. Suppose the schedule \(S\) produced by this algorithm is not optimal. Let \(i_1, ..., i_k\) be the jobs in \(S\) and \(j_1, ..., j_m\) be the jobs in an optimal solution \(S^*\), both ordered by finish time, so \(m > k\). Now, let \(r\) be the largest number such that \(i_1 = j_1, ..., i_r = j_r\). Since \(S\) always picks the compatible job with the earliest finish time, we know that \(f_{i_{r + 1}} \leq f_{j_{r + 1}}\). It follows that we can swap \(j_{r + 1}\) for \(i_{r + 1}\) in \(S^*\) while maintaining feasibility and optimality. However, this means that \(r\) was not the largest number on which \(S\) and \(S^*\) agree, and thus we have a contradiction and have shown that scheduling by the earliest finish time is optimal.
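A minimal Python sketch of the greedy algorithm, assuming jobs are given as (start, finish) pairs (our own convention):

from typing import List, Tuple

def schedule(jobs: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    # greedily take the earliest-finishing job compatible with the selection
    selected: List[Tuple[int, int]] = []
    last_finish = float("-inf")
    for s, f in sorted(jobs, key=lambda job: job[1]):  # ascending finish time
        if s >= last_finish:                           # compatible
            selected.append((s, f))
            last_finish = f
    return selected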
- interval partitioning
- an interval partitioning problem is as follows: given a set of lectures, where lecture \(j\) starts at \(s_j\) and finishes at \(f_j\), find the minimum number of classrooms to schedule all lectures such that no two lectures occur at the same time in the same room
Here, we can’t use the earliest finish time algorithm to schedule. A counterexample is shown below. Instead, processing lectures in order of start time and assigning each one to any available classroom, opening a new classroom only when none is free, yields an optimal schedule.
Dijkstra’s Algorithm
- single-pair shortest path
- given a digraph \(G = (V, E)\), positive edge lengths \(l_e\), source \(s \in V\), and destination \(t \in V\), find a shortest directed path from \(s\) to \(t\)
\begin{algorithm}
\caption{Dijkstra's Algorithm}
\begin{algorithmic}
\STATE create an empty priority queue q
\STATE set dist[$s$] = $0$
\STATE add ($s$, $0$) to q
\FOR{each vertex $v \neq s$}
\STATE set dist[$v$] = $\infty$, pred[$v$] = NULL
\STATE add ($v$, $\infty$) to q
\ENDFOR
\WHILE{$q$ is not empty}
\STATE pop q to get $u$
\FOR{neighbors $v$ of $u$}
\IF{dist[$u$] + $l_e <$ dist[$v$] for edge $e = (u, v)$}
\STATE set pred[$v$] = $u$
\STATE set dist[$v$] = dist[$u$] + $l_e$
\STATE decrease priority in q for $v$ to dist[$v$]
\ENDIF
\ENDFOR
\ENDWHILE
\end{algorithmic}
\end{algorithm}
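A compact Python sketch using heapq. Since the standard library heap has no decrease-priority operation, this version pushes a fresh entry instead and skips stale ones when popped, a common "lazy deletion" substitute for that step in the pseudocode; the adjacency-list convention of (neighbor, length) pairs is our own:

import heapq
from typing import Dict, List, Tuple

def dijkstra(adj: Dict[str, List[Tuple[str, float]]], s: str) -> Dict[str, float]:
    dist = {s: 0.0}
    pq: List[Tuple[float, str]] = [(0.0, s)]
    done = set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in done:
            continue              # stale entry: a shorter path was already found
        done.add(u)
        for v, l in adj[u]:
            if d + l < dist.get(v, float("inf")):
                dist[v] = d + l   # relax edge (u, v)
                heapq.heappush(pq, (dist[v], v))
    return dist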
Minimum Spanning Trees
- cut
- a cut is a partition of the nodes into two nonempty subsets \(S\) and \(V - S\)
- cutset
- a cutset of a cut \(S\) is the set of all edges with exactly one endpoint in \(S\)
- spanning tree
- a spanning tree of \(G\) is a subgraph that is a tree and contains all vertices
Let \(H = (V, T)\) be a subgraph of an undirected graph \(G = (V, E)\). Then, the following are equivalent:
- \(H\) is a spanning tree of \(G\)
- \(H\) is acyclic and connected
- \(H\) is connected and has \(\mid V \mid - 1\) edges
- \(H\) is acyclic and has \(\mid V \mid - 1\) edges
- \(H\) is minimally connected (i.e. removal of any edge disconnects it)
- \(H\) is maximally acyclic (i.e. adding any edge creates a cycle)
- minimum spanning tree
- a minimum spanning tree is a spanning tree with the minimum sum of edge weights
Claim: For any cycle in \(G\), the maximum cost edge in the cycle is not in any MST of \(G\).
Claim: For any cut of \(G\), the minimum cost edge spanning the cut must be contained in every MST (assuming distinct edge costs). We can prove this by contradiction. Assume that there exists an MST \(T\) where this minimum cost edge \(e = (u, v)\), with \(u \in S\) and \(v \in V - S\), is not contained in \(T\). Since \(T\) is a spanning tree, there exists a path from \(u\) to \(v\) in \(T\); because this path starts in \(S\) and ends in \(V - S\), it must use some edge \(e^*\) that spans the cut. Let \(T' = T - \{e^*\} + \{e\}\). Replacing edge \(e^*\) with \(e\) maintains connectivity because both span the cut, and \(T'\) still has \(|V| - 1\) edges, so \(T'\) is still a spanning tree. However, the cost of \(T'\) is less than that of \(T\) because \(c_e < c_{e^*}\). This is a contradiction to our assumption that \(T\) is a minimum spanning tree. Thus, we have shown that for any cut of \(G\), the minimum cost edge spanning the cut must be contained in every MST.
\begin{algorithm}
\caption{Prim's Algorithm}
\begin{algorithmic}
\STATE initialize $S = \{s\}$ for an arbitrary node $s$
\STATE set $T = \emptyset$
\FOR{$n - 1$ iterations}
\STATE add to $T$ a min-cost edge with exactly one endpoint in $S$
\STATE add the other endpoint to $S$
\ENDFOR
\end{algorithmic}
\end{algorithm}
\begin{algorithm}
\caption{Kruskal's Algorithm}
\begin{algorithmic}
\STATE sort edges in ascending order of cost
\FOR{edges $e$}
\STATE add $e$ to tree $T$ unless doing so creates a cycle
\ENDFOR
\end{algorithmic}
\end{algorithm}
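A Python sketch of Kruskal's algorithm with a simple union-find structure to detect cycles; representing edges as (cost, u, v) triples over vertices 0 through n − 1 is our own convention:

from typing import List, Tuple

def kruskal(n: int, edges: List[Tuple[float, int, int]]) -> List[Tuple[int, int]]:
    parent = list(range(n))                # union-find forest over the vertices

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree: List[Tuple[int, int]] = []
    for cost, u, v in sorted(edges):       # ascending order of cost
        ru, rv = find(u), find(v)
        if ru != rv:                       # adding (u, v) creates no cycle
            parent[ru] = rv
            tree.append((u, v))
    return tree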
Divide and Conquer
- divide and conquer
- a divide and conquer strategy divides problems up into smaller subproblems, solves subproblems recursively, and combines subproblem solutions to obtain the overall solution
For the problem setting of sorting, mergesort is an example of a divide and conquer approach. In merge sort, we recursively sort the left and right halves and merge the subresults in linear time. This linear time merging is done by scanning both sorted lists \(A\) and \(B\) from left to right and appending whichever top element is smaller.
\begin{algorithm}
\caption{Merge Sort}
\begin{algorithmic}
\PROCEDURE{Mergesort}{$L$}
\IF{list $L$ has only one element}
\STATE return $L$
\ENDIF
\STATE divide list $L$ into two halves $A$ and $B$
\STATE sort list $A$ with \CALL{Mergesort}{$A$}
\STATE sort list $B$ with \CALL{Mergesort}{$B$}
\STATE return \CALL{Merge}{$A$, $B$}
\ENDPROCEDURE
\end{algorithmic}
\end{algorithm}
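A direct Python translation (returning a new sorted list rather than sorting in place, a choice of our own):

from typing import List

def mergesort(L: List[int]) -> List[int]:
    if len(L) <= 1:
        return L
    mid = len(L) // 2
    A, B = mergesort(L[:mid]), mergesort(L[mid:])
    # linear-time merge: scan both sorted halves left to right,
    # appending whichever front element is smaller
    merged, i, j = [], 0, 0
    while i < len(A) and j < len(B):
        if A[i] <= B[j]:
            merged.append(A[i])
            i += 1
        else:
            merged.append(B[j])
            j += 1
    return merged + A[i:] + B[j:]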
Claim: If \(T(n)\) satisfies \(T(n) = 2 T(\frac{n}{2}) + n\) where \(T(1) = 0\), then \(T(n) = n \log n\). Let’s first visualize this algorithm with a recursion tree.
We can see that at every level the work being done is \(n\) and since there are \(\log n\) levels, the runtime is \(T(n) = n \log n\).
from typing import List, Optional

def bsearch(a: List[int], target: int, lo: int = 0, hi: Optional[int] = None) -> int:
    # search a sorted list using index bounds rather than slices, so each
    # call does constant work and returns a correct index into a
    if hi is None:
        hi = len(a)
    if lo >= hi:
        return -1              # base case: target not present
    m = (lo + hi) // 2         # c_1
    if target == a[m]:         # c_2
        return m               # c_3
    elif target < a[m]:        # c_4
        return bsearch(a, target, lo, m)      # T(n / 2)
    else:
        return bsearch(a, target, m + 1, hi)  # T(n / 2)
For the above function, we can see that each bsearch call does constant work and recursively calls itself on half of the remaining range for an additional \(T(\frac{n}{2})\) amount of work. Thus, the recurrence relation is:
\[T(n) = T\left(\frac{n}{2}\right) + c\]
which gives \(T(n) = O(\log{n})\).
\begin{algorithm}
\caption{Counting Inversions}
\begin{algorithmic}
\PROCEDURE{SortAndCount}{$L$}
\IF{list $L$ has only one element}
\STATE return ($0$, $L$)
\ENDIF
\STATE divide list $L$ into two halves $A$ and $B$
\STATE sort list $A$ with \CALL{SortAndCount}{$A$} to obtain ($r_A, A$)
\STATE sort list $B$ with \CALL{SortAndCount}{$B$} to obtain ($r_B, B$)
\STATE \CALL{MergeAndCount}{$A$, $B$} to obtain ($r_{AB}$, $L$)
\STATE return ($r_A + r_B + r_{AB}$, $L$)
\ENDPROCEDURE
\end{algorithmic}
\end{algorithm}
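A Python sketch mirroring the pseudocode; the key observation in the merge is that when the front element of \(B\) is placed before the rest of \(A\), it is inverted with every element remaining in \(A\):

from typing import List, Tuple

def sort_and_count(L: List[int]) -> Tuple[int, List[int]]:
    if len(L) <= 1:
        return 0, L
    mid = len(L) // 2
    r_a, A = sort_and_count(L[:mid])
    r_b, B = sort_and_count(L[mid:])
    # merge-and-count
    merged, i, j, r_ab = [], 0, 0, 0
    while i < len(A) and j < len(B):
        if A[i] <= B[j]:
            merged.append(A[i])
            i += 1
        else:
            merged.append(B[j])
            j += 1
            r_ab += len(A) - i   # B[j] is inverted with A[i], ..., A[-1]
    return r_a + r_b + r_ab, merged + A[i:] + B[j:]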
\begin{algorithm}
\caption{Closest Pair}
\begin{algorithmic}
\PROCEDURE{ClosestPair}{p$_1$, p$_2$, ..., p$_n$}
\STATE compute vertical line $L$ such that it splits points in half
\STATE $d_1$ = \CALL{ClosestPair}{points in left half}
\STATE $d_2$ = \CALL{ClosestPair}{points in right half}
\STATE $d$ = min($d_1$, $d_2$)
\STATE $A$ = list of all points closer than $d$ to line $L$
\STATE sort $A$ by y-coordinates
\STATE scan $A$ in y-order, compare distance between each point and next $7$ neighbors, and update $d$ if smaller distance
\STATE return $d$
\ENDPROCEDURE
\end{algorithmic}
\end{algorithm}
Master Theorem
The general recurrence relation for a divide and conquer approach is:
\[T(n) = a T(\frac{n}{b}) + f(n)\]
Here, \(a \geq 1\) is the number of subproblems, \(b \geq 2\) is the factor by which the subproblem size decreases, and \(f(n) \geq 0\) is the work required to divide and combine subproblems. In the corresponding recursion tree, \(a\) is the branching factor, \(a^i\) is the number of subproblems at level \(i\) (root is level \(0\)), there are \(1 + \log_{b} n\) levels, and \(\frac{n}{b^i}\) is the size of subproblems at level \(i\).
Suppose that \(T(n)\) is a function that satisfies the relation:
\[T(n) = a T(\frac{n}{b}) + \Theta(n^c)\]
Here, \(T(0) = 0\) and \(T(1) = \Theta(1)\). Then, the master theorem tells us the following:
- Case 1: If \(c \gt \log_{b} a\), then \(T(n) = \Theta(n^c)\).
- Case 2: If \(c = \log_{b} a\), then \(T(n) = \Theta(n^c \log n)\).
- Case 3: If \(c \lt \log_{b} a\), then \(T(n) = \Theta(n^{\log_{b} a})\).
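To make the cases concrete: mergesort has \(a = 2\), \(b = 2\), \(c = 1\); since \(\log_2 2 = 1 = c\), case 2 gives \(T(n) = \Theta(n \log n)\). Binary search has \(a = 1\), \(b = 2\), \(c = 0\); since \(\log_2 1 = 0 = c\), case 2 again gives \(T(n) = \Theta(n^0 \log n) = \Theta(\log n)\).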
Dynamic Programming
- dynamic programming
- a dynamic programming approach breaks up the original problem into overlapping subproblems and combines the solutions to smaller subproblems to form a solution to the larger problem
Weighted Interval Scheduling
- weighted interval scheduling
- a weighted interval scheduling problem is an extension of the interval scheduling problem, where each job now has a weight \(w_j\) and the goal is to find the optimal subset of mutually compatible jobs, such that the total weight is maximized
We can solve this with dynamic programming. First, sort the jobs in ascending order based on their finish times. Define \(p(j)\) as the largest index \(i < j\) such that job \(i\) is compatible with \(j\). Also, define \(OPT(j)\) as the maximum weight of any subset of mutually compatible jobs for the subproblem consisting only of jobs \(1, 2, ..., j\). It follows that our goal is to determine \(OPT(n)\). When it comes to evaluating \(OPT(j)\), there are two cases:
- Case 1: \(OPT(j)\) does not select job \(j\), then the optimal solution is \(OPT(j - 1)\).
- Case 2: \(OPT(j)\) selects job \(j\), then the optimal solution is \(w_j + OPT(p(j))\).
Now, we can write out the Bellman equation as follows:
\[OPT(j) = \begin{cases} 0 & \text{if } j = 0 \\ \max\{OPT(j - 1), w_j + OPT(p(j))\} & \text{if } j > 0 \end{cases}\]
- memoization
- memoization or top-down dynamic programming is a recursive approach that caches results of smaller subproblems
- tabulation
- tabulation or bottom-up dynamic programming is an iterative approach that tabulates previous results
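A memoized (top-down) Python sketch of the recurrence; representing jobs as (start, finish, weight) triples and computing \(p(j)\) with binary search are our own choices, and we assume every job has start time strictly less than finish time:

import bisect
from functools import lru_cache
from typing import List, Tuple

def max_weight(jobs: List[Tuple[int, int, int]]) -> int:
    jobs = sorted(jobs, key=lambda job: job[1])  # ascending finish time
    finishes = [f for _, f, _ in jobs]

    def p(j: int) -> int:
        # largest index i < j (1-indexed) with f_i <= s_j; 0 if none
        return bisect.bisect_right(finishes, jobs[j - 1][0])

    @lru_cache(maxsize=None)
    def opt(j: int) -> int:
        if j == 0:
            return 0
        # either skip job j or take it together with OPT(p(j))
        return max(opt(j - 1), jobs[j - 1][2] + opt(p(j)))

    return opt(len(jobs))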
Knapsack Problem
- knapsack problem
- in the knapsack problem, we are given \(n\) items each with a value \(v_i \gt 0\) and a weight \(w_i \gt 0\), and we want to maximize the total value of items taken under the constraint that the knapsack has a total weight limit \(W\)
To use dynamic programming, we can define \(OPT(i, w)\) as the maximum value of items taken from items \(1, ..., i\) subject to the weight limit \(w\). It follows that our objective is to compute \(OPT(n, W)\). To evaluate \(OPT(i, w)\), there are two cases:
- Case 1: \(OPT(i, w)\) does not select item \(i\), then the optimal solution is \(OPT(i - 1, w)\).
- Case 2: \(OPT(i, w)\) selects item \(i\) (possible only when \(w_i \leq w\)), then the optimal solution is \(v_i + OPT(i - 1, w - w_i)\).
Now, we can write out the Bellman equation as follows:
\[OPT(i, w) = \begin{cases} 0 & \text{if } i = 0 \\ OPT(i - 1, w) & \text{if } w_i > w \\ \max\{OPT(i - 1, w), v_i + OPT(i - 1, w - w_i)\} & \text{otherwise} \end{cases}\]
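A bottom-up Python sketch that fills the \(OPT(i, w)\) table; representing items as (value, weight) pairs is our own convention, and the table of size \(\Theta(nW)\) makes the pseudo-polynomial runtime explicit:

from typing import List, Tuple

def knapsack(items: List[Tuple[int, int]], W: int) -> int:
    n = len(items)
    # opt[i][w] = max value using items 1..i with weight limit w
    opt = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        v_i, w_i = items[i - 1]
        for w in range(W + 1):
            if w_i > w:
                opt[i][w] = opt[i - 1][w]  # item i cannot fit
            else:
                opt[i][w] = max(opt[i - 1][w], v_i + opt[i - 1][w - w_i])
    return opt[n][W]

RNA Secondary Structure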
- secondary structure
- a valid secondary structure is a set of pairs \(S = \{(b_i, b_j)\}\) that satisfies (1) \(S\) is a matching and each pair in \(S\) is either \(A-U\), \(U-A\), \(C-G\), or \(G-C\), (2) if \((b_i, b_j) \in S\), then \(i < j - 4\), (3) if \((b_i, b_j)\) and \((b_k, b_l)\) are two pairs in \(S\), then we cannot have \(i < k < j < l\)
Our objective is to find a secondary structure with the minimum total free energy, which is achieved by maximizing the number of base pairs. Here, we can define \(OPT(i, j)\) as the maximum number of base pairs in a secondary structure of the substring \(b_i, b_{i + 1}, ..., b_j\). It follows that our goal is to figure out \(OPT(1, n)\). To compute \(OPT(i, j)\), there are three cases:
- Case 1: If \(i \geq j - 4\), then \(OPT(i, j) = 0\) because of the no sharp turns constraint.
- Case 2: Base \(b_j\) is not involved in a pair, so the optimal solution is \(OPT(i, j - 1)\).
- Case 3: Base \(b_j\) pairs with \(b_t\) for some \(i \leq t \lt j - 4\) and the optimal solution is \(1 + \max_t \{OPT(i, t - 1) + OPT(t + 1, j - 1)\}\).
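Putting these cases together (with case 1 as the base case and the inner max ranging over \(t\) with \(i \leq t \lt j - 4\) such that \(b_t\) and \(b_j\) form a valid base pair), one way to write the recurrence is:
\[OPT(i, j) = \begin{cases} 0 & \text{if } i \geq j - 4 \\ \max\{OPT(i, j - 1),\ 1 + \max_t \{OPT(i, t - 1) + OPT(t + 1, j - 1)\}\} & \text{otherwise} \end{cases}\]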
Sequence Alignment
- alignment
- an alignment \(M\) is a set of ordered pairs \(x_i-y_j\) such that each character appears in at most one pair and there are no crossings (if \(x_i-y_j\) and \(x_{i'}-y_{j'}\) are both in \(M\) with \(i < i'\), then \(j < j'\))