# CS261: The Book

Despite no popular demand, I have collected all the notes from CS261, the course on algorithms for combinatorial optimization problems that I taught in the past term, in one pdf file, available here, and I have created a new page to collect links to my lecture notes.

For the occasion, I have also posted a single file containing the notes from my Spring 2009 class on the foundations of cryptography. As explained in the foreword to the crypto notes, they use a definition of CCA-security that is wrong, that is, a definition that is weaker than the standard one in a way that actually allows potentially dangerous attacks. The weaker definition, however, is much simpler to define and work with, and I think it is pedagogically justified. I believe that everything else in the notes is consistent with standard definitions. As far as I know, the notes are the only place in which one can find a concrete-security treatment of zero knowledge.

# CS261 Lecture 18: Using Expert Advice

In which we show how to use expert advice, and introduce the powerful “multiplicative weight” algorithm.

We study the following online problem. We have ${n}$ “experts” that, at each time step ${t=1,\ldots,T}$, suggest a strategy about what to do at that time (for example, they might be advising on what technology to use, on what investments to make, they might make predictions on whether something is going to happen, thus requiring certain actions, and so on). Based on the quality of the advice that the experts offered in the past, we decide which advice to follow, or with what fraction of our investment to follow which strategy. Subsequently, we find out which loss or gain was associated to each strategy, and, in particular, what loss or gain we personally incurred with the strategy or mix of strategies that we picked, and we move to step ${t+1}$.

We want to come up with an algorithm to use the expert advice such that, at the end, that is, at time ${T}$, we are about as well off as if we had known in advance which expert was the one that gave the best advice, and we had always followed the strategy suggested by that expert at each step. Note that we make no probabilistic assumption, and our analysis will be a worst-case analysis over all possible sequences of events.

The “multiplicative update” algorithm provides a very good solution to this problem, and the analysis of this algorithm is a model for the several other applications of this algorithm, in rather different contexts.

# CS261 Lecture 17: Online algorithms

In which we introduce online algorithms and discuss the buy-vs-rent problem, the secretary problem, and caching.

In this lecture and the next we will look at various examples of algorithms that operate under partial information. The input to these algorithms is provided as a “stream,” and, at each point in time, the algorithms need to make certain decisions, based on the part of the input that they have seen so far, but without knowing the rest of the input. If we knew that the input was coming from a simple distribution, then we could “learn” the distribution based on an initial segment of the input, and then proceed based on a probabilistic prediction of what the rest of the input is going to be like. In our analysis, instead, we will mostly take a worst-case point of view in which, at any point in time, the unknown part of the input could be anything. Interestingly, however, algorithms that are motivated by “learn and predict” heuristics often work well also from the point of view of worst-case analysis.

# CS261 Lecture 16: Multicommodity flow

In which we define a multi-commodity flow problem, and we see that its dual is the relaxation of a useful graph partitioning problem. The relaxation can be rounded to yield an approximate graph partitioning algorithm.

# CS261 Lecture 15: the LP of Max Flow

In which we look at the linear programming formulation of the maximum flow problem, construct its dual, and find a randomized-rounding proof of the max flow – min cut theorem.

In the first part of the course, we designed approximation algorithms “by hand,” following our combinatorial intuition about the problems. Then we looked at linear programming relaxations of the problems we worked on, and we saw that approximation algorithms for those problems could also be derived by rounding a linear programming solution. We also saw that our algorithms could be interpreted as constructing, at the same time, an integral primal solution and a feasible solution for the dual problem.

Now that we have developed exact combinatorial algorithms for a few problems (maximum flow, minimum s-t cut, global min cut, maximum matching and minimum vertex cover in bipartite graphs), we are going to look at linear programming relaxations of those problems, and use them to gain a deeper understanding of the problems and of our algorithms.

# CS261 Lecture14: Algorithms in Bipartite Graphs

In which we show how to solve the maximum matching problem and the minimum vertex cover problem in bipartite graphs.

# CS261 Lecture 12: the Push-Relabel Algorithm

In which we prove that the basic implementation of the push-relabel algorithm runs in time ${O(|V|^2\cdot |E|)}$.

# CS261 Lecture 11: Strongly Polynomial Time Algorithms

In which we prove that the Edmonds-Karp algorithm for maximum flow is a strongly polynomial time algorithm, and we begin to talk about the push-relabel approach.

# CS 261 Lecture 10: the fattest path

In which we discuss the worst-case running of the Ford-Fulkerson algorithm, discuss plausible heuristics to choose an augmenting path in a good way, and begin analyzing the “fattest path” heuristic.

In the last lecture we proved the Max-Flow Min-Cut theorem in a way that also established the optimality of the Ford-Fulkerson algorithm: if we iteratively find an augmenting path in the residual network and push more flow along that path, as allowed by the capacity constraints, we will eventually find a flow for which no augmenting path exists, and we proved that such a flow must be optimal.

Each iteration of the algorithm takes linear time in the size of the network: the augmenting path can be found via a DFS of the residual network, for example. The problem is that, in certain cases, the algorithm might take a very long time to finish. Consider, for example, the following network. Suppose that, at the first step, we pick the augmenting path ${s\rightarrow a\rightarrow b \rightarrow t}$. We can only push one unit of flow along that path. After this first step, our residual network (not showing edges out of ${t}$ and into ${s}$, which are never used in an augmenting path) is Now it is possible that the algorithm picks the augmenting path ${s\rightarrow b\rightarrow a\rightarrow t}$ along which, again, only one unit of flow can be routed. We see that, indeed, it is possible for the algorithm to keep picking augmenting paths that involve a link between ${a}$ and ${b}$, so that only one extra unit of flow is routed at each step.

The problem of very slow convergence times as in the above example can be avoided if, at each iteration, we choose more carefully which augmenting path to use. One reasonable heuristic is that it makes sense to pick the augmenting path along which the most flow can be routed in one step. If we had used such an heuristic in the above example, we would have found the optimum in two steps. Another, alternative, heuristic is to pick the shortest augmenting path, that is, the augmenting path that uses the fewest edges; this is reasonable because in this way we are going to use the capacity of fewer edges and keep more residual capacity for later iterations. The use of this heuristic would have also resulted in a two-iterations running time in the above example.