In which we look at the linear programming formulation of the maximum flow problem, construct its dual, and find a randomized-rounding proof of the max flow – min cut theorem.
In the first part of the course, we designed approximation algorithms “by hand,” following our combinatorial intuition about the problems. Then we looked at linear programming relaxations of the problems we worked on, and we saw that approximation algorithms for those problems could also be derived by rounding a linear programming solution. We also saw that our algorithms could be interpreted as constructing, at the same time, an integral primal solution and a feasible solution for the dual problem.
Now that we have developed exact combinatorial algorithms for a few problems (maximum flow, minimum s-t cut, global min cut, maximum matching and minimum vertex cover in bipartite graphs), we are going to look at linear programming relaxations of those problems, and use them to gain a deeper understanding of the problems and of our algorithms.
We start with the maximum flow and the minimum cut problems.
1. The LP of Maximum Flow and Its Dual
Given a network $(G, s, t, c)$, where $G = (V,E)$ is a directed graph, $s \in V$ is the source, $t \in V$ is the sink, and $c(u,v) > 0$ is the capacity of each edge $(u,v) \in E$, the problem of finding the maximum flow in the network can be formulated as a linear program by simply writing down the definition of feasible flow.
We have one variable $f(u,v)$ for every edge $(u,v)$ of the network, and the problem is:

$$\begin{array}{lll} \text{maximize} & \displaystyle\sum_{v: (s,v)\in E} f(s,v) & \\ \text{subject to} & & \\ & f(u,v) \le c(u,v) & \forall (u,v)\in E\\ & \displaystyle\sum_{u: (u,v)\in E} f(u,v) = \sum_{w: (v,w)\in E} f(v,w) & \forall v \in V - \{s,t\}\\ & f(u,v) \ge 0 & \forall (u,v)\in E \end{array} \qquad (1)$$
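To see the formulation in action, here is a minimal sketch that writes down this linear program for a tiny made-up network and solves it with a generic LP solver. It assumes `scipy` is available; the five edges, their capacities, and the variable ordering are all hypothetical choices for illustration.

```python
# Sketch: the max-flow LP for a small hypothetical network, solved with
# scipy.optimize.linprog (linprog minimizes, so we negate the objective).
from scipy.optimize import linprog

# Edges s->a, s->b, a->b, a->t, b->t with capacities 3, 2, 1, 2, 3.
# Variable order: f(s,a), f(s,b), f(a,b), f(a,t), f(b,t)
caps = [3, 2, 1, 2, 3]

# Objective: maximize f(s,a) + f(s,b)  ->  minimize -(f(s,a) + f(s,b))
c = [-1, -1, 0, 0, 0]

# Conservation at a: f(s,a) - f(a,b) - f(a,t) = 0
# Conservation at b: f(s,b) + f(a,b) - f(b,t) = 0
A_eq = [[1, 0, -1, -1, 0],
        [0, 1, 1, 0, -1]]
b_eq = [0, 0]

# Capacity constraints 0 <= f(u,v) <= c(u,v) become variable bounds.
bounds = [(0, cap) for cap in caps]

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
print(-res.fun)  # value of the maximum flow on this instance
```

On this instance both edges out of $s$ can be saturated, so the optimum equals the capacity of the cut $\{s\}$.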
Now we want to construct the dual.
When constructing the dual of a linear program, it is often useful to rewrite it in a way that has a simpler structure, especially if it is possible to rewrite it in a way that has fewer constraints (which will correspond to fewer dual variables), even at the cost of introducing several new variables in the primal.
A very clean way of formulating the maximum flow problem is to think in terms of the paths along which we are going to send the flow, rather than in terms of how much flow is passing through a specific edge, and this point of view makes the conservation constraints unnecessary.
In the following formulation, we have one variable $x_p$ for each of the possible simple paths $p$ from $s$ to $t$ (we denote by $P$ the set of such paths), specifying how much of the flow from $s$ to $t$ is being routed along the path $p$:

$$\begin{array}{lll} \text{maximize} & \displaystyle\sum_{p\in P} x_p & \\ \text{subject to} & & \\ & \displaystyle\sum_{p\in P: (u,v)\in p} x_p \le c(u,v) & \forall (u,v)\in E\\ & x_p \ge 0 & \forall p\in P \end{array} \qquad (2)$$
Note that, usually, a network has exponentially many possible paths from $s$ to $t$, and so the linear program (2) has an exponential number of variables. This is ok because we are never going to write down (2) for a specific network and pass it to a linear programming solver; we are interested in (2) as a mathematical specification of the maximum flow problem. If we want to actually find a maximum flow via linear programming, we will use the equivalent formulation (1).
(There are several other cases in combinatorial optimization in which a problem has an easier-to-understand linear programming relaxation or formulation that is exponentially big, and one can prove that it is equivalent to another relaxation or formulation of polynomial size. One then proves theorems about the big linear program, and the theorems apply to the small linear program as well, because of the equivalence. Then the small linear program can be efficiently solved, and the theorems about the big linear program can be turned into efficient algorithms.)
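For a network small enough that $P$ can be enumerated explicitly, though, writing (2) down directly is a useful sanity check. The following sketch does so on the same kind of tiny hypothetical network as before, again assuming `scipy` is available; the path enumeration is a plain depth-first search.

```python
# Sketch of formulation (2) on a small hypothetical network: enumerate all
# simple s-t paths, then maximize the total path flow subject to the
# capacity constraint of each edge.
from scipy.optimize import linprog

graph = {'s': ['a', 'b'], 'a': ['b', 't'], 'b': ['t'], 't': []}
cap = {('s','a'): 3, ('s','b'): 2, ('a','b'): 1, ('a','t'): 2, ('b','t'): 3}

def simple_paths(u, goal, seen):
    # Depth-first enumeration of simple paths from u to goal.
    if u == goal:
        yield [u]
        return
    for w in graph[u]:
        if w not in seen:
            for rest in simple_paths(w, goal, seen | {w}):
                yield [u] + rest

paths = list(simple_paths('s', 't', {'s'}))  # the set P

# One constraint per edge: sum of x_p over the paths using the edge <= capacity.
edges = list(cap)
A_ub = [[1 if any((p[i], p[i+1]) == e for i in range(len(p) - 1)) else 0
         for p in paths] for e in edges]
b_ub = [cap[e] for e in edges]

# Maximize sum_p x_p  ->  minimize -sum_p x_p, with x_p >= 0.
res = linprog([-1] * len(paths), A_ub=A_ub, b_ub=b_ub,
              bounds=[(0, None)] * len(paths))
print(-res.fun)  # same optimum as formulation (1) on this network
```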
Fact 1 For every feasible solution to (1), there is a feasible solution to (2) of the same cost.

Proof: Note that this is exactly the Flow Decomposition Theorem that we proved in Lecture 11, in which it is stated as Lemma 2.
Fact 2 For every feasible solution to (2), there is a feasible solution to (1) of the same cost.

Proof: Given a feasible solution $\{x_p\}_{p\in P}$ of (2), define, for every edge $(u,v)\in E$,

$$f(u,v) := \sum_{p\in P : (u,v)\in p} x_p$$

that is, let $f(u,v)$ be the sum of the flows of all the paths that use the edge $(u,v)$. Then $f$ satisfies the capacity constraints and, regarding the conservation constraints, we have, for every $v\in V - \{s,t\}$,

$$\sum_{u: (u,v)\in E} f(u,v) = \sum_{p\in P: v\in p} x_p = \sum_{w: (v,w)\in E} f(v,w)$$

because every simple path from $s$ to $t$ that reaches $v$ enters it exactly once and leaves it exactly once. Finally, the cost of $f$ is $\sum_v f(s,v) = \sum_{p\in P} x_p$, because every path in $P$ uses exactly one edge leaving $s$.
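The conversion in the proof above is easy to carry out mechanically. Here is a short sketch, on hypothetical path-flow values, that builds the edge flow $f$ and checks conservation at the internal vertices:

```python
# Sketch of the proof of Fact 2: turn a path flow {x_p} into an edge flow f
# by summing, over each edge, the flows of the paths that use it.
# The network and the flow values are hypothetical.
from collections import defaultdict

x = {('s', 'a', 't'): 2, ('s', 'b', 't'): 2, ('s', 'a', 'b', 't'): 1}

f = defaultdict(float)
for path, flow in x.items():
    for u, v in zip(path, path[1:]):
        f[(u, v)] += flow

# Conservation at every internal vertex: inflow equals outflow.
for v in ['a', 'b']:
    inflow = sum(fl for (a, b), fl in f.items() if b == v)
    outflow = sum(fl for (a, b), fl in f.items() if a == v)
    assert inflow == outflow

print(dict(f))
```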
Let us now construct the dual of (2). We have one dual variable $y_{u,v}$ for every edge $(u,v)$, and the linear program is:

$$\begin{array}{lll} \text{minimize} & \displaystyle\sum_{(u,v)\in E} c(u,v)\cdot y_{u,v} & \\ \text{subject to} & & \\ & \displaystyle\sum_{(u,v)\in p} y_{u,v} \ge 1 & \forall p\in P\\ & y_{u,v} \ge 0 & \forall (u,v)\in E \end{array} \qquad (3)$$
The linear program (3) is assigning a weight $y_{u,v}$ to each edge, which we may think of as a “length,” and the constraints are specifying that, along each possible path, $s$ and $t$ are at distance at least one. This means that the dual variables are expressing a way of “separating” $s$ from $t$ and, after thinking about it for a moment, we see that (3) can be seen as a linear programming relaxation of the minimum cut problem.
Fact 3 For every feasible cut $A$ in the network $(G,s,t,c)$, there is a feasible solution to (3) whose cost is the same as the capacity of $A$.
Proof: Define $y_{u,v} := 1$ if $u\in A$ and $v\notin A$, and let $y_{u,v} := 0$ otherwise. Then

$$\sum_{(u,v)\in E} c(u,v)\cdot y_{u,v} = \sum_{(u,v)\in E: u\in A, v\notin A} c(u,v)$$

which is the capacity of $A$. The solution is feasible because every path $p\in P$ starts at $s\in A$ and ends at $t\notin A$, so it must use at least one edge from $A$ to $V-A$, that is, at least one edge $(u,v)$ with $y_{u,v}=1$.
This means that the optimum of (3) is smaller than or equal to the capacity of the minimum cut in the network. Now we are going to describe a randomized rounding method that shows that the optimum of (3) is actually equal to the capacity of the minimum cut. Since the optimum of (3) is equal to the optimum of (2) by the Strong Duality Theorem, and we have proved that the optimum of (2) is equal to the cost of the maximum flow of the network, Lemma 4 below will prove that the cost of the maximum flow in the network is equal to the capacity of the minimum cut, that is, it will be a different proof of the max flow – min cut theorem. It is actually a more difficult proof (because it uses the Strong Duality Theorem whose proof, which we have skipped, is not easy), but it is a genuinely different one, and a useful one to understand, because it gives an example of how to use randomized rounding to solve a problem optimally. (So far, we have only seen examples of the use of randomized rounding to design approximation algorithms.)
Lemma 4 Given any feasible solution $\{y_{u,v}\}$ to (3), it is possible to find a cut $A$ such that

$$\sum_{(u,v)\in E: u\in A, v\notin A} c(u,v) \le \sum_{(u,v)\in E} c(u,v)\cdot y_{u,v}$$

that is, a cut whose capacity is at most the cost of the dual solution.
Proof: Interpret the $y_{u,v}$ as weights on the edges, and use Dijkstra’s algorithm to find, for every vertex $v$, the distance $d(v)$ from $s$ to $v$ according to the weights $y_{u,v}$.
The constraints in (3) imply that $d(t) \ge 1$.
Pick a value $T$ uniformly at random in the interval $[0,1)$, and let $A$ be the set

$$A := \{ v : d(v) \le T \}$$
Then, for every choice of $T$, $A$ contains $s$ (because $d(s) = 0$) and does not contain $t$ (because $d(t) \ge 1 > T$), and so it is a feasible cut.
Using linearity of expectation, the average (over the choice of $T$) capacity of $A$ can be written as

$$\mathop{\mathbb{E}}_T [\, \text{capacity}(A) \,] = \sum_{(u,v)\in E} c(u,v)\cdot \Pr[\, u\in A \wedge v\notin A \,]$$

and, for every edge $(u,v)\in E$,

$$\Pr[\, u\in A \wedge v\notin A \,] = \Pr[\, d(u) \le T < d(v) \,] \le \max\{ 0, d(v) - d(u) \}$$
Finally, we observe the “triangle inequality”

$$d(v) \le d(u) + y_{u,v} \qquad \forall (u,v)\in E$$

which says that the length of the shortest path from $s$ to $v$ is at most the length of the shortest path from $s$ to $u$ plus the length of the edge $(u,v)$.
Putting it all together, and using the nonnegativity of the $y_{u,v}$, we have

$$\mathop{\mathbb{E}}_T [\, \text{capacity}(A) \,] \le \sum_{(u,v)\in E} c(u,v)\cdot \max\{ 0, d(v) - d(u) \} \le \sum_{(u,v)\in E} c(u,v)\cdot y_{u,v}$$
and there clearly must exist a choice of $T$ for which the capacity of $A$ is at most the expected capacity.
About finding $A$ efficiently, we can also note that, although there is an infinite number of choices for $T$, there are only at most $|V| - 1$ different cuts that can be generated. If we sort the vertices in increasing order of $d(\cdot)$, and let them be $v_1,\ldots,v_n$ in this order, then we have $v_1 = s$ and $d(v_1) = 0$; let $k$ be such that $d(v_k) < 1$ but $d(v_{k+1}) \ge 1$. Then the only cuts which are generated in our probability distribution are the cuts of the form

$$A_i := \{ v_1, \ldots, v_i \}$$
for $i = 1,\ldots,k$, and one of them must have capacity at most $\sum_{(u,v)\in E} c(u,v)\cdot y_{u,v}$. We can compute the capacity of each $A_i$ and pick the $A_i$ with the smallest capacity.
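The derandomized procedure can be sketched end to end in a few lines. In this sketch the network and the feasible solution $y$ of (3) are made up for illustration ($y$ puts length one on the two edges out of the source), and distances are computed with a standard Dijkstra over the edge list:

```python
# Sketch of the rounding in Lemma 4, derandomized: compute distances d(v)
# under the edge lengths y, sort vertices by distance, and among the prefix
# cuts pick the one of minimum capacity. Network and y are hypothetical.
import heapq

cap = {('s','a'): 3, ('s','b'): 2, ('a','b'): 1, ('a','t'): 2, ('b','t'): 3}
y = {('s','a'): 1, ('s','b'): 1, ('a','b'): 0, ('a','t'): 0, ('b','t'): 0}

def dijkstra(source):
    # Standard Dijkstra over the edges of the network with lengths y.
    d = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        du, u = heapq.heappop(heap)
        if du > d.get(u, float('inf')):
            continue
        for (a, b), w in y.items():
            if a == u and du + w < d.get(b, float('inf')):
                d[b] = du + w
                heapq.heappush(heap, (d[b], b))
    return d

d = dijkstra('s')

def capacity(A):
    return sum(c for (u, v), c in cap.items() if u in A and v not in A)

# The rounding can only produce prefixes of the vertices sorted by distance,
# restricted to distances < 1 (so that t always stays outside the cut).
order = sorted((v for v in d if d[v] < 1), key=lambda v: d[v])
cuts = [set(order[:i + 1]) for i in range(len(order))]
best = min(cuts, key=capacity)
print(best, capacity(best))
```

On this instance every vertex other than $s$ is at distance one, so the only candidate cut is $\{s\}$, whose capacity matches the cost of $y$.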
Let us now see what the dual of (1) looks like. It will look somewhat more mysterious than (3), but now we know what to expect: because of the equivalence between (1) and (2), the dual of (1) will have to be a linear programming relaxation of the minimum cut problem, and it will have an exact randomized rounding procedure.
The dual of (1) has one variable for each vertex $v$ (except $s$ and $t$), which we shall call $x_v$, corresponding to the conservation constraints, and one variable for each edge $(u,v)$, which we shall call $y_{u,v}$, corresponding to the capacity constraints. It is:

$$\begin{array}{lll} \text{minimize} & \displaystyle\sum_{(u,v)\in E} c(u,v)\cdot y_{u,v} & \\ \text{subject to} & & \\ & y_{s,v} + x_v \ge 1 & \forall (s,v)\in E, v\ne t\\ & y_{u,v} - x_u + x_v \ge 0 & \forall (u,v)\in E, u\ne s, v\ne t\\ & y_{u,t} - x_u \ge 0 & \forall (u,t)\in E, u\ne s\\ & y_{u,v} \ge 0 & \forall (u,v)\in E \end{array} \qquad (4)$$

(If the edge $(s,t)$ is present, its constraint is simply $y_{s,t} \ge 1$.)
Let us see that (4) is a linear programming relaxation of the minimum cut problem and that it admits an exact rounding algorithm.
Fact 5 If $A$ is a feasible cut in the network, then there is a feasible solution to (4) such that

$$\sum_{(u,v)\in E} c(u,v)\cdot y_{u,v} = \text{capacity}(A)$$
Proof: Define $x_v := 1$ if $v\in A$, and $x_v := 0$ if $v\notin A$. Define $y_{u,v} := 1$ if $u\in A$ and $v\notin A$, and $y_{u,v} := 0$ otherwise. With these definitions, the cost $\sum_{(u,v)\in E} c(u,v)\cdot y_{u,v}$ is precisely the capacity of $A$.
To see that it is a feasible solution, let us first consider the constraints of the first kind. They are always satisfied because, if $v\in A$, then $x_v = 1$ and, if $v\notin A$, then the edge $(s,v)$ crosses the cut and $y_{s,v} = 1$, so the left-hand side is always at least one. We can similarly see that the constraints of the third kind are satisfied.
Regarding the constraints of the second kind, we can do a case analysis and see that the constraint $y_{u,v} - x_u + x_v \ge 0$ is valid if $x_u = 0$ (regardless of the value of the other variables), and it is also valid if $x_v = 1$ (regardless of the value of the other variables). The remaining case is $x_u = 1$ and $x_v = 0$, which is the case in which the edge $(u,v)$ crosses the cut and so $y_{u,v} = 1$.
Fact 6 Given a feasible solution $(x,y)$ of (4), we can find a feasible cut whose capacity is at most the cost $\sum_{(u,v)\in E} c(u,v)\cdot y_{u,v}$ of the solution.
Proof: Pick $T$ uniformly at random in $[0,1)$, then define

$$A := \{ s \} \cup \{ v : x_v > T \}$$
This is always a cut, because, by construction, it contains $s$ and it does not contain $t$. (Recall that there is no variable $x_t$ because there is no conservation constraint for $t$.)
Then we have

$$\mathop{\mathbb{E}}_T [\, \text{capacity}(A) \,] = \sum_{(u,v)\in E} c(u,v)\cdot \Pr[\, u\in A \wedge v\notin A \,]$$
It remains to argue that, for every edge $(u,v)$, we have

$$\Pr[\, u\in A \wedge v\notin A \,] \le y_{u,v}$$
For edges of the form $(s,v)$, we have

$$\Pr[\, s\in A \wedge v\notin A \,] = \Pr[\, x_v \le T \,] = 1 - x_v \le y_{s,v}$$
(Actually, the above formula applies only if $0 \le x_v \le 1$. If $x_v > 1$, then the probability is zero and $y_{s,v} \ge 0$, and we are fine; if $x_v < 0$, then the probability is one, and $y_{s,v} \ge 1 - x_v > 1$, so we are again fine.)
For edges of the form $(u,v)$ in which $u$ and $v$ are in $V - \{s,t\}$, we have

$$\Pr[\, u\in A \wedge v\notin A \,] = \Pr[\, x_v \le T < x_u \,] = x_u - x_v \le y_{u,v}$$
(Again, we have to look out for various exceptional cases, such as the case $x_u < x_v$, in which the probability is zero and $y_{u,v} \ge 0$, and the case $x_u > 1$, and the case $x_v < 0$.)
For edges of the form $(u,t)$, we have

$$\Pr[\, u\in A \wedge t\notin A \,] = \Pr[\, x_u > T \,] = x_u \le y_{u,t}$$

(again, the middle equality applies when $0 \le x_u \le 1$, and the bound $\Pr[u\in A] \le y_{u,t}$ continues to hold in the remaining cases).
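As with Lemma 4, this rounding can be derandomized: the cut $\{s\} \cup \{v : x_v > T\}$ changes only when $T$ crosses one of the values $x_v$, so it suffices to try those finitely many thresholds. A sketch, on a hypothetical feasible solution of (4) for the same kind of tiny network as before:

```python
# Sketch of the rounding in the proof of Fact 6, derandomized: try, as
# thresholds, the distinct values x_v in [0,1) (plus 0, which gives {s}),
# and keep the cheapest resulting cut. Network, x, and y are hypothetical.
cap = {('s','a'): 3, ('s','b'): 2, ('a','b'): 1, ('a','t'): 2, ('b','t'): 3}
x = {'a': 0.0, 'b': 0.0}                      # one variable per internal vertex
y = {('s','a'): 1, ('s','b'): 1, ('a','b'): 0, ('a','t'): 0, ('b','t'): 0}

cost = sum(c * y[e] for e, c in cap.items())  # objective value of (4)

def capacity(A):
    return sum(c for (u, v), c in cap.items() if u in A and v not in A)

thresholds = {0.0} | {t for t in x.values() if 0 <= t < 1}
cuts = [{'s'} | {v for v in x if x[v] > T} for T in thresholds]
best = min(cuts, key=capacity)

assert capacity(best) <= cost  # the guarantee of Fact 6 on this instance
print(best, capacity(best), cost)
```

Here both $x_a$ and $x_b$ are zero, so the only cut produced is $\{s\}$, whose capacity equals the cost of the dual solution.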