Greedy Set Cover II: weighted H(n)-approximation via random stopping time

Randomized rounding yields Chvátal’s greedy algorithm for weighted Set Cover.

The rounding scheme samples sets i.i.d. from the fractional cover until all elements are covered. Applying the method of conditional probabilities yields Chvátal’s greedy algorithm for weighted Set Cover, and a proof that it is an {\rm H}(n)-approximation algorithm.



Rounding scheme for weighted Set Cover

  input: weighted Set-Cover instance {\cal I}
  output: set cover for {\cal I}
1. Compute a min-cost fractional set cover x^*.
2. Repeat until the chosen sets form a cover:
3. Choose a set randomly from the distribution defined by x^*/|x^*|.
4. Return the chosen sets.

Recall that |x^*| denotes the 1-norm \sum _ s x^*_ s of x^* and {\rm H}(n) is 1+1/2+\cdots + 1/n, about 0.5+\ln n.

Lemma (existence).

With non-zero probability the rounding scheme returns a cover of cost at most {\rm H}(n) times the cost of the fractional set cover x^*.

Proof.

Let random variable T be the number of draws until all n elements are covered, and let r.v. n_ t be the number of elements not yet covered after t\le T samples.

Let c_ s denote the cost of set s. The expected cost of each sampled set is \sum _ s c_ s x^*_ s / |x^*|, that is, c\cdot x^* / |x^*|. By Wald’s equation, the expected cost of the chosen sets is \textrm{E}[T]\, c\cdot x^*/|x^*|.

The fractional cover x^* gives total weight at least 1 to the sets containing any given element e, so each randomly sampled set covers e with probability at least 1/|x^*|. Hence, in expectation the number of uncovered elements reduces by at least a factor of 1-1/|x^*| with each sample:

\begin{equation} \label{eqn} \textrm{E}[n_ t – n_{t+1} \, |\, n_ t] ~ \ge ~ n_ t/|x^*|. \end{equation}

By Wald’s equation for dependent decrements, bound \eqref{eqn} implies that the expected number of sampled sets, \textrm{E}[T], is at most |x^*|\, {\rm H}(n).

Hence, the expected cost of the chosen sets is at most the product {\rm H}(n)\, c\cdot x^*.

 
 

Next we apply the method of conditional probabilities to derive the following algorithm.

H(n)-approximation via random stopping time

  input: weighted Set-Cover instance {\cal I}
  output: set cover for {\cal I}
1. Repeat until the chosen sets form a cover:
2. Choose a set s minimizing the cost of s divided by the number of elements in s not yet covered by chosen sets.
3. Return the chosen sets.

To derive the algorithm we use the following pessimistic estimator \phi _ t for the expectation of the final cost, conditioned on the state at the end of a given iteration t\le T:

\[ \phi _ t ~ =~ {\rm H}(n_ t) \, c\cdot x^*/|x^*| ~ +~ \sum _{s\in S_ t} c_ s, \]

where S_ t contains the first t sets chosen. The second term in \phi _ t is the cost of the sets chosen so far. The first term is an upper bound on the expected cost of the sets remaining to be chosen before all remaining elements are covered (because each iteration costs c\cdot x^*/|x^*| in expectation, and we expect at most |x^*| {\rm H}(n_ t) more iterations).

The algorithm keeps the pessimistic estimator from increasing, ensuring a successful outcome (see the verification of the pessimistic estimator for details). The well-known performance guarantee follows as a corollary:

Theorem ([1]).

The algorithm above returns a cover of cost at most {\rm H}(n) times the minimum cost of any fractional set cover, where n is the number of elements.

The next note discusses the stronger bound of {\rm H}(d), where d=\max _ s |s|.

Related