# Greedy Set Cover II: weighted H(n)-approximation via random stopping time

Randomized rounding yields Chvátal’s greedy algorithm for weighted Set Cover.

The rounding scheme samples sets i.i.d. from the fractional cover until all elements are covered. Applying the method of conditional probabilities yields Chvátal’s greedy algorithm for weighted Set Cover, and a proof that it is an -approximation algorithm.

# Rounding scheme for weighted Set Cover

 input: weighted Set-Cover instance output: set cover for 1. Compute a min-cost fractional set cover . 2. Repeat until the chosen sets form a cover: 3. Choose a set randomly from the distribution defined by . 4. Return the chosen sets.

Recall that denotes the 1-norm of and is , about .

Lemma (existence).

With non-zero probability the rounding scheme returns a cover of cost at most times the cost of the fractional set cover .

Proof.

Let random variable be the number of draws until all elements are covered, and let r.v.  be the number of elements not yet covered after samples.

Let denote the cost of set . The expected cost of each sampled set is , that is, . By Wald’s equation, the expected cost of the chosen sets is .

The fractional cover gives total weight at least 1 to the sets containing any given element , so each randomly sampled set covers with probability at least . Hence, in expectation the number of uncovered elements reduces by at least a factor of with each sample:

$$\label{eqn} \textrm{E}[n_ t – n_{t+1} \, |\, n_ t] ~ \ge ~ n_ t/|x^*|.$$

By Wald’s equation for dependent decrements, bound \eqref{eqn} implies that the expected number of sampled sets, , is at most .

Hence, the expected cost of the chosen sets is at most the product .

Next we apply the method of conditional probabilities to derive the following algorithm.

# H(n)-approximation via random stopping time

 input: weighted Set-Cover instance output: set cover for 1. Repeat until the chosen sets form a cover: 2. Choose a set minimizing the cost of divided by the number of elements in not yet covered by chosen sets. 3. Return the chosen sets.

To derive the algorithm we use the following pessimistic estimator for the expectation of the final cost, conditioned on the state at the end of a given iteration :

$\phi _ t ~ =~ {\rm H}(n_ t) \, c\cdot x^*/|x^*| ~ +~ \sum _{s\in S_ t} c_ s,$

where contains the first sets chosen. The second term in is the cost of the sets chosen so far. The first term is an upper bound on the expected cost of the sets remaining to be chosen before all remaining elements are covered (because each iteration costs in expectation, and we expect at most more iterations).

The algorithm keeps the pessimistic estimator from increasing, ensuring a successful outcome (see the verification of the pessimistic estimator for details). The well-known performance guarantee follows as a corollary:

Theorem ([1]).

The algorithm above returns a cover of cost at most times the minimum cost of any fractional set cover, where is the number of elements.

The next note discusses the stronger bound of , where .