Fractional Set Cover

Here we derive a Lagrangian-relaxation algorithm for fractional set cover. The derivation illustrates the use of random stopping times to handle weights in the cost function.

A fractional cover x is a C-multicover if its coverage of every element is at least C.

We use C-multicovers with C\gt 1 as a stepping stone for finding fractional set covers. We apply the method of conditional probabilities to a randomized-rounding scheme to derive the following algorithm to find an approximately minimum-cost integer C-multicover. Then we’ll use that algorithm, and scaling, to compute an approximately minimum-cost fractional set cover.

  input: Collection S of subsets of U, cost vector c, coverage C, and \varepsilon \in (0,1).
  output: Integer cover \tilde x of near-minimum cost with coverage at least C.
1. Initialize \tilde x_ s=0 for every set s\in S. Recall that \tilde x(e) denotes \sum _{s\ni e} \tilde x_ s.
2. While \tilde x’s coverage is less than C:
3. Choose a set s\in S maximizing \sum _{e\ni s} (1-\varepsilon )^{\tilde x(e)} / c_ s.
4. Increase \tilde x_ s by 1.
5. Return \tilde x.

Let n=|U| be the number of elements.

Theorem (performance guarantee).

Assume C is an integer. For any \varepsilon \in (0,1) such that C \ge \ln (n)/\varepsilon ^2, the algorithm returns an integer cover \tilde x of coverage C whose cost |c\cdot \tilde x| is at most c\cdot x^*/(1-2\varepsilon ), where x^* is any minimum-cost cover with coverage C.

We prove the theorem by applying the method of conditional probabilities to the following rounding scheme.

Fix any fractional cover x with coverage C.

Initialize \tilde x_ s = 0 for all s\in S.
Randomly sample sets independently from the distribution x/|x|;
with each sample s, increment \tilde x_ s.
When \tilde x achieves coverage C, stop and return \tilde x.

To analyze the rounding scheme, we use the following variant of the Chernoff bound. Suppose a random process starts in some fixed start state S_0, then goes through a random sequence of states S_1,S_2,S_3,\ldots . As it goes, it maintains n values (each initially zero). In state S_ t, each of the n values y^ t(i) is determined. (In our setting, the state S_ t consists of the first t samples. Value y^ t(e) represents the coverage of element e by those samples.)

Lemma (Chernoff variant).

Let T be a stopping time for the system. Fix \varepsilon \in (0,1), and \mu \gt 0. Assume that, at each time step, each of the n counters increases by at most 1 and by at least \mu in expectation. Then the expectation of the minimum of the values at time T is at least (1-\varepsilon )\big (\mu \, \textrm{E}[T] \, -\, \varepsilon ^{-1}\ln n\big ).

(Note: there is a tighter version of this lemma.)

Now we analyze the rounding scheme.

Lemma (rounding scheme).

Assume C is an integer. Fix any \varepsilon \in (0,1) such that C\ge \ln (n)/\varepsilon ^2. Let x be any fractional cover with coverage at least C. Given x, the rounding scheme returns an integer cover \tilde x with coverage at least C and whose expected cost is at most c\cdot x/(1-2\varepsilon ).

Next, to prove the theorem, we describe how applying the method of conditional probabilities to the rounding scheme gives the algorithm.

Finding a near-optimal fractional set cover

Given an instance S and c, and an arbitrary \varepsilon \in (0,1/2), one can also use the algorithm to compute a 1/(1-2\varepsilon )-approximate fractional (non-integer) set cover, as follows.

Define coverage C to be C = \lceil 2(1-\varepsilon )\ln (n)/\varepsilon ^2\rceil . Run the algorithm with coverage C to find a 1/(1-2\varepsilon )-approximate integer C-multicover \tilde x, then scale \tilde x by 1/C to obtain the fractional set cover x=\tilde x/C for the original problem, with coverage 1. The cover x will have cost at most 1/(1-2\varepsilon ) times the minimum cost of any fractional set cover (with coverage 1).

Running time

The number of iterations for this algorithm can be large, even when C is small. (Consider an example with C=1 and \varepsilon =1/2, and with two sets s=\{ e\} and s’=\{ e’\} , where c_ s equals 1 and c_{s’} is arbitrarily large. The algorithm will choose set s about \log _2 c_{s’} times before choosing s’.)

As discussed in other notes here, it is possible to modify the algorithm to guarantee a polynomial running time.