Greedy Set Cover I: unweighted ln(n)-approximation

Randomized rounding yields the Johnson/Lovász greedy algorithm for unweighted Set Cover.

The rounding scheme samples sets i.i.d. from the fractional cover until all elements are covered. Applying the method of conditional probabilities yields the Johnson/Lovász greedy algorithm for unweighted Set Cover, and a proof that it is a \ln (n)-approximation algorithm.



Randomized-rounding scheme for unweighted Set Cover

  input: Set-Cover instance {\cal I}
  output: set cover for {\cal I}
0. Compute a fractional set cover x^* minimizing |x^*|.
1. Repeat until the sampled sets form a cover:
2. Sample a random set from the distribution defined by x^*/|x^*|.
3. Return the sampled sets.

Note: |x^*| denotes the 1-norm \sum _ s x^*_ s of x^*.

Lemma (existence, [3]).

Let n\gt 1 be the number of elements. With non-zero probability the rounding scheme returns a set cover of size at most T = \lceil |x^*| \ln n\rceil .

Proof.

The fractional solution x^* gives total weight at least 1 to the sets containing any given element e, so the probability that e is covered in any given round is at least 1/|x^*|. Hence, after T rounds, the expected number of the n elements left uncovered is at most n(1-1/|x^*|)^ T which is strictly less than n\exp (T/|x^*|), which is 1 or less by the choice of T.

 
 

Next, we apply the method of conditional probabilities to derive the algorithm.

Lemma (algorithm, [3]).

Applying the method of conditional probabilities to the existence proof yields the following algorithm.

Greedy algorithm for Set Cover (unweighted): ln(n)-approximation

  input: Set-Cover instance {\cal I}
  output: set cover for {\cal I}

1. Repeat until the chosen sets form a cover:
2. Choose a set that contains a maximum number of elements that are not yet covered by chosen sets.
3. Return the chosen sets.

To derive the algorithm, we need an appropriate pessimistic estimator \phi _ t for the expectation of the number of elements that will be left uncovered by the end of iteration T, conditioned on the state at the end of a given iteration t\le T. Letting n_ t denote the number of elements remaining uncovered at the end of iteration t, this conditional expectation is at most

\begin{eqnarray} \label{eq} \phi _ t ~ =~ n_ t(1-1/|x^*|)^{T-t}. \end{eqnarray}

The algorithm chooses the set that contains a maximum number of not-yet-covered elements. This choice also minimizes the resulting value of the pessimistic estimator. (The factor of (1-1/|x^*|)^{T-t} in \phi _ t can be ignored, since it is independent of the choices made by the algorithm. Hence, minimizing \phi _ t is the same as minimizing n_ t.) Hence, the algorithm guarantees a successful outcome.

Theorem ([1, 2]).

For unweighted set cover with n\gt 1, the greedy Set-Cover algorithm returns a cover of size at most \lceil |x^*| \ln n\rceil , where |x^*| is the minimum size of any fractional cover and n is the number of elements.