*In which we show how to round a linear programming relaxation in order to approximate the set cover problem, and we show how to reason about the dual of the relaxation to derive a simple combinatorial approximation algorithm for the weighted case.*

Recall that in Set Cover we are given a finite set $U$ and a collection $S_1,\ldots,S_n$ of subsets of $U$. We want to find the fewest sets whose union is $U$, that is, the smallest $I \subseteq \{1,\ldots,n\}$ such that $\bigcup_{i\in I} S_i = U$.

We described an algorithm whose approximation guarantee is $\ln |U| + O(1)$. The lower bound technique that we used in our analysis was to realize that if we have a subset $D\subseteq U$ of elements, such that every set $S_i$ contains at most $t$ elements of $D$, then $opt \geq \frac{|D|}{t}$.

Today we will describe a linear programming relaxation of set cover, and we will show a way to round it, in order to achieve an $O(\ln |U|)$ approximation.

We will also show that the lower bounds to the optimum that we used in the analysis of the greedy algorithm can also be expressed as feasible solutions for the dual of our linear programming relaxation.

**1. A Linear Programming Relaxation of Set Cover**

We begin by formulating the set cover problem as an Integer Linear Programming problem. Given an input $(U, S_1,\ldots,S_n)$ of the set cover problem, we introduce a variable $x_i$ for every set $S_i$, with the intended meaning that $x_i = 1$ when $S_i$ is selected, and $x_i = 0$ otherwise. We can express the set cover problem as the following integer linear program:

$$\begin{array}{lll} \min & \displaystyle\sum_{i=1}^n x_i \\ \text{subject to} \\ & \displaystyle\sum_{i:\ v\in S_i} x_i \geq 1 & \forall v\in U \qquad\qquad (2)\\ & x_i \in \{0,1\} & \forall i\in \{1,\ldots,n\} \end{array}$$

From which we derive the linear programming relaxation

$$\begin{array}{lll} \min & \displaystyle\sum_{i=1}^n x_i \\ \text{subject to} \\ & \displaystyle\sum_{i:\ v\in S_i} x_i \geq 1 & \forall v\in U \qquad\qquad (3)\\ & 0 \leq x_i \leq 1 & \forall i\in \{1,\ldots,n\} \end{array}$$

**Remark 1** In an earlier lecture, we noted that every instance $G=(V,E)$ of the vertex cover problem can be translated into an instance of the set cover problem in which every element of $U$ belongs precisely to two sets. The reader should verify that the linear programming relaxation (3) of the resulting instance of set cover is identical to the linear programming relaxation of the vertex cover problem on the graph $G$.

More generally, it is interesting to consider a *weighted* version of set cover, in which we are given the set $U$, the collection of sets $S_1,\ldots,S_n$, and also a *weight* $w_i \geq 0$ for every set $S_i$. We want to find a sub-collection of *minimal total weight* whose union is $U$, that is, we want to find $I$ such that $\bigcup_{i\in I} S_i = U$, and such that $\sum_{i\in I} w_i$ is minimized. The unweighted problem corresponds to the case in which all weights $w_i$ equal $1$.

The ILP and LP formulation of the unweighted problem can easily be generalized to the weighted case: just change the objective function from $\sum_{i=1}^n x_i$ to $\sum_{i=1}^n w_i x_i$.
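As a concrete illustration, the weighted relaxation can be handed to an off-the-shelf LP solver. The following sketch is our own example, not part of the notes: the instance, the variable names, and the use of `scipy.optimize.linprog` are all illustrative.

```python
# Illustrative only: solve the weighted LP relaxation (3) of a small
# set cover instance with scipy's LP solver.
from scipy.optimize import linprog

U = [0, 1, 2, 3]                          # the universe of elements
sets = [{0, 1}, {1, 2}, {2, 3}, {0, 3}]   # S_1, ..., S_n (a 4-cycle)
w = [1.0, 1.0, 1.0, 1.0]                  # weights w_i (all 1: unweighted)

# One covering constraint per element v: sum_{i : v in S_i} x_i >= 1,
# rewritten as -sum_{i : v in S_i} x_i <= -1 for linprog's A_ub @ x <= b_ub form.
A_ub = [[-1.0 if v in S else 0.0 for S in sets] for v in U]
b_ub = [-1.0] * len(U)

res = linprog(c=w, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * len(sets))
print(res.fun)  # optimum of the relaxation: a lower bound on opt
```

On this instance the fractional solution $x_i = \frac 12$ for every $i$ is optimal with value $2$, which here happens to match the integral optimum (pick the first and third set).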

Suppose now that we solve the linear programming relaxation (3), and we find an optimal fractional solution, that is, we are given a number $x^*_i$ in the range $[0,1]$ for every set $S_i$. Unlike the case of vertex cover, we cannot round the $x^*_i$ to the nearest integer, because if an element $v$ belongs to many sets, it could be that $x^*_i$ is a small positive value for each of those sets, and we would be rounding all those numbers to zero, leaving $v$ not covered. If we knew that every element belongs to at most $k$ sets, then we could round the numbers $x^*_i \geq \frac 1k$ to 1, and the numbers $x^*_i < \frac 1k$ to zero. This would give a feasible cover, and we could prove that we achieve a $k$-approximation. Unfortunately, $k$ could be very large, as big as the number of sets, while we are trying to prove an approximation guarantee that is never worse than $O(\ln |U|)$.
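For intuition, here is a minimal sketch (our own, with illustrative names) of the threshold rounding just described: if every element belongs to at most $k$ sets, then the at most $k$ values $x_i$ covering an element sum to at least 1, so at least one of them is $\geq \frac 1k$, and rounding every $x_i \geq \frac 1k$ up to 1 covers every element.

```python
# Hypothetical helper: round up every x_i >= 1/k, round the rest to zero.
# If each element belongs to at most k sets, the output is a feasible cover,
# and each kept set has x_i >= 1/k, so the cost is at most k times the LP cost.
def threshold_round(x, k):
    return {i for i, xi in enumerate(x) if xi >= 1.0 / k}
```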

Maybe the next most natural approach after rounding the $x^*_i$ to the nearest integer is to think of each $x^*_i$ as a *probability*, and we can think of the solution as describing a probability distribution over ways of choosing some of the subsets $S_1,\ldots,S_n$, in which we choose $S_1$ with probability $x^*_1$, $S_2$ with probability $x^*_2$, and so on.

Algorithm *RandomPick*

- Input: values $(x_1,\ldots,x_n)$ feasible for (3)
- $I := \emptyset$
- for $i := 1$ to $n$:
  - with probability $x_i$, assign $I := I \cup \{i\}$; otherwise do nothing
- return $I$

Using this probabilistic process, the expected cost of the sets that we pick is $\sum_{i=1}^n w_i x_i$, which is the same as the cost of $(x_1,\ldots,x_n)$ in the linear programming problem, and is at most the optimum of the set cover problem. Unfortunately, the solution that we construct could have a high probability of missing some of the elements.
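In code, one round of this process might look as follows (a sketch with function and parameter names of our choosing):

```python
import random

def random_pick(x, seed=None):
    """One round of randomized rounding: include set i with probability x[i].

    By linearity of expectation, the expected cost of the output is
    sum_i w_i * x[i], the LP cost of the fractional solution x.
    """
    rng = random.Random(seed)
    return {i for i, xi in enumerate(x) if rng.random() < xi}
```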

Indeed, consider the probabilistic process “from the perspective” of an element $v$. The element $v$ belongs to some of the subsets, let’s say for simplicity the first $k$ sets $S_1,\ldots,S_k$. As long as we select at least one of $S_1,\ldots,S_k$, then we are going to cover $v$. We select $S_i$ with probability $x_i$ and we know that $x_1 + \cdots + x_k \geq 1$; what is the probability that $v$ is covered? It is definitely not 1, as we can see by thinking of the case that $v$ belongs to only two sets, and that each set has an associated $x_i$ equal to $\frac 12$; in such a case $v$ is covered with probability $\frac 34$. This is not too bad, but maybe there are many elements like that, each having probability $\frac 14$ of remaining uncovered, so that we would expect a constant fraction of the elements to be uncovered on average. In some examples, $\Omega(|U|)$ elements would remain uncovered with probability $\Omega(1)$. How do we deal with the uncovered elements?

First of all, let us see that every element has a reasonably good probability of being covered.

**Lemma 1** Consider a sequence of $k$ independent experiments, in which the $i$-th experiment has probability $p_i$ of being successful, and suppose that $p_1 + \cdots + p_k \geq 1$. Then there is a probability at least $1 - \frac 1e$ that at least one experiment is successful.

*Proof:* The probability that no experiment is successful is

$$\prod_{i=1}^k (1 - p_i)$$

This is the kind of expression for which the following trick is useful: $1-x$ is approximately $e^{-x}$ for small values of $x$.

More precisely, using the Taylor expansion around $0$ of $e^{-x}$,

$$e^{-x} = 1 - x + \frac{x^2}{2!} - \frac{x^3}{3!} + \frac{x^4}{4!} - \cdots$$

we can see that, for $0 \leq x \leq 1$,

$$1 - x \leq e^{-x}$$

and so

$$\prod_{i=1}^k (1-p_i) \leq \prod_{i=1}^k e^{-p_i} = e^{-(p_1 + \cdots + p_k)} \leq e^{-1}$$

because

$$e^{-x} - (1-x) = \left( \frac{x^2}{2!} - \frac{x^3}{3!} \right) + \left( \frac{x^4}{4!} - \frac{x^5}{5!} \right) + \cdots \geq 0$$

where the last inequality follows from the fact that we have an infinite sum of non-negative terms: for $0 \leq x \leq 1$, each grouped term satisfies $\frac{x^{2j}}{(2j)!} \geq \frac{x^{2j+1}}{(2j+1)!}$. Hence the probability that at least one experiment is successful is at least $1 - \frac 1e$.
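A quick numeric check of the lemma (our own illustration, not part of the notes): for several choices of probabilities summing to at least 1, the failure probability $\prod_i (1-p_i)$ indeed stays below $\frac 1e$.

```python
import math

def failure_probability(ps):
    """Probability that none of the independent experiments succeeds."""
    assert sum(ps) >= 1
    out = 1.0
    for p in ps:
        out *= 1.0 - p
    return out

# prod (1 - p_i) <= e^{-sum p_i} <= 1/e whenever sum p_i >= 1
for ps in ([0.5, 0.5], [0.01] * 100, [1.0], [0.2, 0.3, 0.5]):
    assert failure_probability(ps) <= math.exp(-1) + 1e-12
```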

Our randomized rounding process will be as follows: we repeat the procedure *RandomPick* until we have covered all the elements.

Algorithm *RandomizedRound*

- Input: values $(x_1,\ldots,x_n)$ feasible for (3)
- $I := \emptyset$
- while there are elements $v$ such that $v \notin \bigcup_{i\in I} S_i$:
  - for $i := 1$ to $n$:
    - with probability $x_i$, assign $I := I \cup \{i\}$; otherwise do nothing
- return $I$
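The whole procedure can be sketched as follows (our own code and names; the feasibility of $x$, which makes each element's probabilities sum to at least 1, is what guarantees that the loop terminates with probability 1):

```python
import random

def randomized_round(x, sets, universe, seed=None):
    """Repeat RandomPick until every element of `universe` is covered.

    `sets[i]` is the set S_i and `x[i]` its value in a feasible solution
    of the LP relaxation; returns the indices I of the chosen sets.
    """
    rng = random.Random(seed)
    I, covered = set(), set()
    while covered != universe:
        for i, xi in enumerate(x):
            if rng.random() < xi:
                I.add(i)
                covered |= sets[i]
    return I
```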

How do we analyze such a procedure? We describe a very simple way to get a loose bound on the cost of the solution found by the algorithm.

**Fact 2** There is a probability at most $\frac 1e$ that the *while* loop is executed for more than $1 + \ln |U|$ times. In general, there is a probability at most $e^{-k}$ that the *while* loop is executed for more than $k + \ln |U|$ times.

*Proof:* The probability that we need to run the *while* loop for more than $t$ times is the same as the probability that, if we run the body of the *while* loop (that is, the *RandomPick* procedure) for exactly $t$ steps, we end up with some uncovered elements.

Consider an element $v$. For each iteration of the *while* loop, there is a probability at most $\frac 1e$ that $v$ is not covered by the sets added to $I$ in that iteration: this follows from Lemma 1, because the feasibility of $(x_1,\ldots,x_n)$ guarantees that the probabilities $x_i$ of the sets containing $v$ sum to at least 1. Since the iterations are independent, the probability that $v$ is not covered after $t$ iterations is then at most

$$\left( \frac 1e \right)^t = e^{-t}$$

The probability that, after $t$ iterations, there is an element that is not covered, is at most the sum over all $v \in U$ of the probability that $v$ is not covered, which is at most $|U| \cdot e^{-t}$. For $t = k + \ln |U|$, this bound equals $e^{-k}$.

**Fact 3** Fix any positive integer parameter $t$ and any feasible solution $(x_1,\ldots,x_n)$ for (3). Then the expected cost of the solution $I$ in Algorithm *RandomizedRound* on input $(x_1,\ldots,x_n)$ after $t$ iterations (or at the end of the algorithm, if it ends in fewer than $t$ iterations) is at most

$$t \cdot \sum_{i=1}^n w_i x_i$$

*Proof:* Let $C_1,\ldots,C_t$ be the total cost of the sets assigned to $I$ by the algorithm during the first, second, $\ldots$, $t$-th iteration, respectively. Let $C$ denote the total cost of $I$ after $t$ iterations (or at the end of the algorithm if it terminates in fewer than $t$ iterations). Note that these are random variables determined by the random choices made by the algorithm. Clearly, we have, with probability 1,

$$C \leq C_1 + \cdots + C_t$$

where we do not have equality because certain sets might be assigned to $I$ in more than one iteration.

If the algorithm stops before the $j$-th iteration, then $C_j = 0$, because there is no $j$-th iteration and so no assignment happens during it. Conditioned on the $j$-th iteration occurring, the expectation of $C_j$ is $\sum_{i=1}^n w_i x_i$, and so we have

$$\mathbb E[C] \leq \sum_{j=1}^t \mathbb E[C_j] \leq t \cdot \sum_{i=1}^n w_i x_i$$

Recall that Markov’s inequality says that if $X$ is a non-negative random variable (for example, the size of a set), then, for every $c > 0$,

$$\mathbb P[ X \geq c ] \leq \frac{\mathbb E[X]}{c}$$

For example, $\mathbb P[ X \geq 2\,\mathbb E[X] ] \leq \frac 12$.

We can combine these observations to get a rather loose bound on the cost of the solution output by *RandomizedRound*.

**Fact 4** Given an optimal solution $(x^*_1,\ldots,x^*_n)$ to (3), algorithm *RandomizedRound* outputs, with probability at least $\frac 12$, a feasible solution to the set cover problem of total cost at most $O(\ln |U|) \cdot opt$.

*Proof:* Let $t := 2 + \ln |U|$. There is a probability at most $e^{-2} < \frac 14$ that the algorithm runs the *while* loop for more than $t$ steps. After the first $t$ steps, the expected total weight of the solution is at most $t \cdot \sum_i w_i x^*_i$, which is at most $t \cdot opt$. Thus, by Markov’s inequality, the probability that the solution $I$, after the first $t$ steps, contains sets of total weight more than $4t \cdot opt$ is at most $\frac 14$. Taking the union of the two events, there is a probability at most $\frac 12$ that either the algorithm runs for more than $t$ iterations, or that it adds to its solution sets of total cost more than $4t \cdot opt$ in the first $t$ steps.

Equivalently, there is a probability at least $\frac 12$ that the algorithm halts within $t = 2 + \ln |U|$ iterations and that, within that time, it constructs a solution of total cost at most $4t \cdot opt = O(\ln |U|) \cdot opt$.

**2. The Dual of the Relaxation**

In a past lecture, we analyzed a simple greedy algorithm for unweighted set cover, that repeatedly picks a set that covers the largest number of uncovered elements, until all elements are covered, and we proved that the algorithm uses at most $(\ln |U| + O(1)) \cdot opt$ sets.

As in all analyses of approximation algorithms, we had to come up with ways to prove lower bounds to the optimum. In the unweighted case, we noted that if we have a subset $D \subseteq U$ of elements ($D$ for “difficult” to cover), and every $S_i$ contains at most $t$ elements of $D$, then $opt \geq \frac{|D|}{t}$. In the weighted case, neither this lower bound technique nor the approximation analysis works. It is possible to modify the algorithm so that, at every step, it picks the set that is most “cost-effective” at covering uncovered elements, and we can modify the analysis to take weights into account. This modification will be easier to figure out if we first think about the analysis of our previous algorithm in terms of the dual of the LP relaxation of set cover.

To form the dual of (3), we first note that the constraints $x_i \leq 1$ do not change the optimum: a solution in which some $x_i$ are bigger than 1 can be converted, by lowering those variables to 1, into a feasible solution in which all variables are $\leq 1$ while decreasing the objective function, and so no variable is larger than 1 in an optimal solution, even if we do not have the constraints $x_i \leq 1$.

If we work with this simplified version of the LP relaxation of set cover

$$\begin{array}{lll} \min & \displaystyle\sum_{i=1}^n w_i x_i \\ \text{subject to} \\ & \displaystyle\sum_{i:\ v\in S_i} x_i \geq 1 & \forall v\in U \\ & x_i \geq 0 & \forall i\in \{1,\ldots,n\} \end{array}$$

we see that its dual is an LP with one variable $y_v$ for every element $v$, and it is defined as follows:

$$\begin{array}{lll} \max & \displaystyle\sum_{v\in U} y_v \\ \text{subject to} \\ & \displaystyle\sum_{v\in S_i} y_v \leq w_i & \forall i\in \{1,\ldots,n\} \\ & y_v \geq 0 & \forall v\in U \end{array}$$

That is, we assign a “charge” $y_v$ to every element, subject to the constraint that, for every set $S_i$, the sum of the charges of the elements of the set is at most the weight $w_i$ of the set. The cost of the dual solution (and, by weak duality, a lower bound to the optimum of the set cover problem) is the sum of the charges.

In the unweighted case, $w_i = 1$ for every $i$. Suppose that we are in the unweighted case, and that we notice a set $D$ of elements such that every $S_i$ contains at most $t$ elements of $D$. Then we can form the feasible dual solution

$$y_v := \begin{cases} \frac 1t & \text{if } v\in D \\ 0 & \text{otherwise} \end{cases}$$

This is a feasible dual solution of cost $\frac{|D|}{t}$, and so it is a way to formulate our old lower bound argument in the language of dual solutions. The simplest extension of this example to the weighted setting is that: if we have a set $D$ of elements such that every set $S_i$ of weight $w_i$ contains at most $t \cdot w_i$ elements of $D$, then the assignment $y_v := \frac 1t$ for $v \in D$ and $y_v := 0$ for $v \notin D$ is a feasible dual solution of cost $\frac{|D|}{t}$, and so the optimum is at least $\frac{|D|}{t}$.
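These dual lower bounds are easy to check mechanically. The following sketch (the helper names are our own, not from the notes) verifies that a charge vector is dual feasible and reports the lower bound it certifies:

```python
def is_dual_feasible(y, sets, w):
    """Check the dual constraints: for every set S_i, the charges of its
    elements sum to at most w_i (a small tolerance absorbs float error)."""
    return all(sum(y[v] for v in S) <= wi + 1e-12 for S, wi in zip(sets, w))

def certified_lower_bound(y):
    """The value of the dual solution: a lower bound on opt."""
    return sum(y.values())

# Example: on a 4-cycle every set contains at most t = 2 elements of D = U,
# so charging y_v = 1/2 to every element is feasible and certifies opt >= 2.
sets = [{0, 1}, {1, 2}, {2, 3}, {0, 3}]
y = {v: 0.5 for v in range(4)}
assert is_dual_feasible(y, sets, [1, 1, 1, 1])
```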

These observations are already enough to derive a version of the greedy algorithm for weighted instances.

Algorithm: SetCoverGreedy

- Input: set $U$, subsets $S_1,\ldots,S_n$, weights $w_1,\ldots,w_n$
- $I := \emptyset$
- while there are uncovered elements, that is, elements $v$ such that $v \notin S_i$ for every $i \in I$:
  - let $D$ be the set of uncovered elements
  - for every set $S_i$ such that $S_i \cap D \neq \emptyset$, let $e_i := \frac{w_i}{|S_i \cap D|}$ be the *cost-effectiveness* of $S_i$
  - let $S_{i^*}$ be a set with the best (smallest) cost-effectiveness
  - $I := I \cup \{i^*\}$
- return $I$
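The pseudocode above translates directly into the following sketch (our own function and variable names):

```python
import math

def set_cover_greedy(universe, sets, w):
    """Weighted greedy set cover: repeatedly add the set of smallest
    cost-effectiveness w_i / |S_i & D|, where D is the set of uncovered
    elements. Returns the indices of the chosen sets."""
    uncovered = set(universe)
    I = set()
    while uncovered:
        best, best_eff = None, math.inf
        for i, S in enumerate(sets):
            newly = len(S & uncovered)      # |S_i intersect D|
            if newly > 0 and w[i] / newly < best_eff:
                best, best_eff = i, w[i] / newly
        if best is None:
            raise ValueError("the sets do not cover the universe")
        I.add(best)
        uncovered -= sets[best]
    return I
```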

We adapt the analysis of the unweighted case.

Let $v_1,\ldots,v_m$ be an enumeration of the elements of $U$ in the order in which they are covered by the algorithm. Let $c_j$ be the cost-effectiveness of the set that was picked at the iteration in which $v_j$ was covered for the first time. Then, in that iteration, the set $D$ of uncovered elements has at least $m-j+1$ elements, and every set $S_i$ of weight $w_i$ contains at most $\frac{w_i}{c_j}$ elements of $D$ (otherwise $S_i$ would have been more cost-effective than the chosen set). This means that setting $y_v := c_j$ for $v \in D$ and $y_v := 0$ otherwise is a feasible dual solution of cost $|D| \cdot c_j \geq (m-j+1) \cdot c_j$, and so

$$c_j \leq \frac{opt}{m-j+1}$$

If we denote by $apx$ the cost of the solution found by our algorithm, we see that

$$apx = \sum_{j=1}^m c_j$$

because, at each iteration, if we pick a set of weight $w$ that covers $k$ new elements, then each of those $k$ elements has a parameter $c_j$ equal to $\frac wk$, and so when we sum over all elements that are covered for the first time at that iteration, we get exactly the weight $w$.

Rearranging the equations, we get again

$$apx \leq \sum_{j=1}^m \frac{opt}{m-j+1} = opt \cdot \sum_{j=1}^m \frac 1j \leq (\ln m + O(1)) \cdot opt = (\ln |U| + O(1)) \cdot opt$$