Randomized rounding yields Chvátal’s greedy algorithm for weighted Set Cover.

The rounding scheme samples sets i.i.d. from the fractional cover until all elements are covered. Applying the method of conditional probabilities yields Chvátal’s greedy algorithm for weighted Set Cover, together with a proof that it is an ${\rm H}(n)$-approximation algorithm.

Rounding scheme for weighted Set Cover

	input: weighted Set-Cover instance ${\cal I}$
	output: set cover for ${\cal I}$
1.	Compute a min-cost fractional set cover $x^*$.
2.	Repeat until the chosen sets form a cover:
3.		Choose a set randomly from the distribution defined by $x^*/|x^*|$ (set $s$ is chosen with probability $x^*_s/|x^*|$).
4.	Return the chosen sets.
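
The sampling loop can be sketched in Python as follows. This is a minimal sketch, not the author's code: the function name `round_fractional_cover`, the dictionary-based instance format, and the `rng` parameter are assumptions made for illustration, and step 1 (solving the LP) is omitted, so the fractional cover `x` is taken as an input that is assumed to be feasible.

```python
import random

def round_fractional_cover(sets, x, rng=None):
    """Sample sets i.i.d. with probability x[s]/|x| until every element is covered.

    sets: dict mapping a set's name to the collection of elements it contains
    x:    dict mapping a set's name to its fractional weight
          (assumed to be a feasible fractional cover; the LP solve is omitted)
    """
    rng = rng or random.Random()
    universe = set().union(*sets.values())
    names = list(sets)
    weights = [x[s] for s in names]  # random.choices normalizes these by |x|
    chosen, covered = [], set()
    while covered != universe:
        s = rng.choices(names, weights=weights, k=1)[0]
        chosen.append(s)             # repeats are possible and are kept
        covered |= set(sets[s])
    return chosen

# Hypothetical instance: each element lies in two sets of weight 1/2.
sets = {"A": {1, 2}, "B": {2, 3}, "C": {1, 3}}
x = {"A": 0.5, "B": 0.5, "C": 0.5}
print(round_fractional_cover(sets, x, random.Random(0)))
```

The returned list may contain a set more than once; the analysis charges for every sample, so removing duplicates can only lower the cost.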

Recall that $|x^*|$ denotes the 1-norm $\sum_s x^*_s$ of $x^*$, and ${\rm H}(n)$ is the harmonic number $1+1/2+\cdots + 1/n$, approximately $\ln n + 0.58$.
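
For a feel for how ${\rm H}(n)$ grows, the short Python sketch below compares it with $\ln n$ plus the Euler-Mascheroni constant (about 0.5772). The helper name `harmonic` is an assumption made for illustration.

```python
import math

def harmonic(n):
    """Return H(n) = 1 + 1/2 + ... + 1/n, with H(0) = 0."""
    return sum(1.0 / i for i in range(1, n + 1))

for n in (1, 10, 100, 1000):
    # 0.5772 approximates the Euler-Mascheroni constant
    print(n, harmonic(n), math.log(n) + 0.5772)
```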

Lemma (existence).

With non-zero probability the rounding scheme returns a cover of cost at most ${\rm H}(n)$ times the cost of the fractional set cover $x^*$.

Proof.

Let random variable $T$ be the number of draws until all $n$ elements are covered, and let r.v. $n_ t$ be the number of elements not yet covered after $t\le T$ samples.

Let $c_s$ denote the cost of set $s$. The expected cost of each sampled set is $\sum_s c_s x^*_s / |x^*|$, that is, $c\cdot x^* / |x^*|$. By Wald’s equation, the expected cost of the chosen sets is $\textrm{E}[T]\, c\cdot x^*/|x^*|$.

The fractional cover $x^*$ gives total weight at least 1 to the sets containing any given element $e$, so each randomly sampled set covers $e$ with probability at least $1/|x^*|$. Hence, in expectation, each sample covers at least a $1/|x^*|$ fraction of the elements not yet covered:

\begin{equation} \label{eqn} \textrm{E}[n_t - n_{t+1} \, |\, n_t] ~ \ge ~ n_t/|x^*|. \end{equation}

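By Wald’s equation for dependent decrements, bound \eqref{eqn} implies that the expected number of sampled sets, $\textrm{E}[T]$, is at most $|x^*|\, {\rm H}(n)$.

Hence, the expected cost of the chosen sets is at most the product ${\rm H}(n)\, c\cdot x^*$. Since the expectation is at most this value, with non-zero probability the cost of the returned cover is at most this value as well.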
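
The bound $\textrm{E}[T] \le |x^*|\, {\rm H}(n)$ can be checked empirically on a small instance. The sketch below is only an illustration under stated assumptions: the instance, the feasible fractional cover `x` (used in place of an optimal $x^*$), and the helper names `draws_until_covered` and `harmonic` are all hypothetical. For this instance a direct calculation gives $\textrm{E}[T] = 2.5$ (the first draw covers two elements, and each later draw covers the last element with probability $2/3$), while the bound is $1.5 \cdot {\rm H}(3) = 2.75$.

```python
import random

def draws_until_covered(sets, x, rng):
    """Return T, the number of i.i.d. samples drawn until all elements are covered."""
    universe = set().union(*sets.values())
    names = list(sets)
    weights = [x[s] for s in names]
    covered, draws = set(), 0
    while covered != universe:
        covered |= sets[rng.choices(names, weights=weights, k=1)[0]]
        draws += 1
    return draws

def harmonic(n):
    return sum(1.0 / i for i in range(1, n + 1))

# Hypothetical instance: each element lies in two sets of weight 1/2.
sets = {"A": {1, 2}, "B": {2, 3}, "C": {1, 3}}
x = {"A": 0.5, "B": 0.5, "C": 0.5}

rng = random.Random(1)
trials = 2000
mean_T = sum(draws_until_covered(sets, x, rng) for _ in range(trials)) / trials
bound = sum(x.values()) * harmonic(3)  # |x| * H(n)
print(mean_T, bound)
```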

Next we apply the method of conditional probabilities to derive the following algorithm.

H(n)-approximation via random stopping time

	input: weighted Set-Cover instance ${\cal I}$
	output: set cover for ${\cal I}$
1.	Repeat until the chosen sets form a cover:
2.		Choose a set $s$ minimizing the cost of $s$ divided by the number of elements in $s$ not yet covered by chosen sets (considering only sets that contain at least one uncovered element).
3.	Return the chosen sets.
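
A minimal Python sketch of this greedy rule follows. The function name `greedy_set_cover`, the dictionary-based instance format, and the example instance are assumptions made for illustration; ties are broken by dictionary order, which the algorithm above does not specify.

```python
def greedy_set_cover(sets, cost):
    """Repeatedly pick the set minimizing cost per newly covered element.

    sets: dict mapping a set's name to the set of elements it contains
    cost: dict mapping a set's name to its (positive) cost
    """
    universe = set().union(*sets.values())
    covered, chosen = set(), []
    while covered != universe:
        # Only sets covering at least one new element are candidates.
        best = min(
            (s for s in sets if sets[s] - covered),
            key=lambda s: cost[s] / len(sets[s] - covered),
        )
        chosen.append(best)
        covered |= sets[best]
    return chosen

# Hypothetical instance on elements 1..4.
sets = {"big": {1, 2, 3, 4}, "a": {1, 2}, "b": {3, 4}, "c": {4}}
cost = {"big": 5.0, "a": 1.0, "b": 1.5, "c": 0.1}
print(greedy_set_cover(sets, cost))  # ['c', 'a', 'b']
```

On this instance the ratios in the first iteration are 1.25, 0.5, 0.75, and 0.1, so `c` is chosen first; `a` (ratio 0.5) and then `b` (ratio 1.5, since only element 3 is still uncovered) follow.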

To derive the algorithm we use the following pessimistic estimator $\phi_t$ for the expectation of the final cost, conditioned on the state at the end of a given iteration $t\le T$:

\[ \phi_t ~=~ {\rm H}(n_t)\, c\cdot x^* ~+~ \sum_{s\in S_t} c_s, \]

where $S_t$ contains the first $t$ sets chosen. The second term in $\phi_t$ is the cost of the sets chosen so far. The first term is an upper bound on the expected cost of the sets remaining to be chosen before all remaining elements are covered (because each iteration costs $c\cdot x^*/|x^*|$ in expectation, and we expect at most $|x^*|\, {\rm H}(n_t)$ more iterations).

Verification of the pessimistic estimator

1. The pessimistic estimator is initially ${\rm H}(n)\, c\cdot x^*$. (By inspection.)

2. The pessimistic estimator is a super-martingale w.r.t. the rounding scheme.

When a set $s'$ is chosen in iteration $t$, the increase $\phi_t - \phi_{t-1}$ equals

\[ c_{s'} ~-~ ({\rm H}(n_{t-1}) - {\rm H}(n_t))\, c\cdot x^*. \]

Using ${\rm H}(b) - {\rm H}(a) \ge (b-a)/b$ for integers $b\ge a\ge 0$ (the left-hand side is a sum of $b-a$ terms, each at least $1/b$), the increase is at most

\begin{equation} \label{desired} c_{s'} \, -\, \frac{n_{t-1} - n_t}{n_{t-1}}\, c\cdot x^*. \end{equation}

For a set $s'$ chosen randomly from $x^*/|x^*|$, the bound \eqref{desired} is non-positive in expectation, because $\textrm{E}[c_{s'}] = c\cdot x^*/|x^*|$ while $\textrm{E}[n_{t-1} - n_t \,|\, n_{t-1}] \ge n_{t-1}/|x^*|$.

3. If the final value of the pessimistic estimator is at most ${\rm H}(n)\, c\cdot x^*$, then the outcome is successful. (By inspection: at the end $n_T = 0$, so $\phi_T$ equals the cost of the chosen sets.)

Now that we’ve verified the pessimistic estimator, we verify that the algorithm keeps it from increasing at each step. As observed above, the expectation of \eqref{desired} is non-positive. Hence, there exists a set $s'$ making it non-positive. Gathering terms that depend on $s'$ (namely $c_{s'}$ and $n_t$), it is non-positive iff

\[ \frac{c_{s'}}{n_{t-1} - n_t} \, \le \, \frac{c\cdot x^*}{n_{t-1}}. \]

Some set $s'$ satisfies this, so the algorithm’s choice of $s'$ (which minimizes the left-hand side) must do so.

The algorithm keeps the pessimistic estimator from increasing, ensuring a successful outcome (see the verification of the pessimistic estimator for details). The well-known performance guarantee follows as a corollary:

Theorem ([1]).

The algorithm above returns a cover of cost at most ${\rm H}(n)$ times the minimum cost of any fractional set cover, where $n$ is the number of elements.
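
As a sanity check on the argument, the sketch below runs the greedy rule on a small instance and records the value ${\rm H}(n_t)\, c\cdot x + \sum_{s\in S_t} c_s$ after each iteration. Everything here is an illustrative assumption rather than part of the original note: the instance, the helper names, and the use of a hand-picked feasible fractional cover `x` in place of an optimal $x^*$ (the argument only uses feasibility, so the recorded values should never increase). Exact rational arithmetic avoids rounding noise in the comparison.

```python
from fractions import Fraction

def harmonic(n):
    """Return H(n) exactly, with H(0) = 0."""
    return sum(Fraction(1, i) for i in range(1, n + 1))

def estimator_trace(sets, cost, x):
    """Run the greedy rule and return [phi_0, phi_1, ...], where
    phi_t = H(n_t) * (c . x) + (cost of the first t chosen sets)."""
    universe = set().union(*sets.values())
    cx = sum(cost[s] * x[s] for s in sets)
    covered, spent = set(), Fraction(0)
    trace = [harmonic(len(universe)) * cx]
    while covered != universe:
        best = min((s for s in sets if sets[s] - covered),
                   key=lambda s: cost[s] / len(sets[s] - covered))
        covered |= sets[best]
        spent += cost[best]
        trace.append(harmonic(len(universe - covered)) * cx + spent)
    return trace

# Hypothetical instance; x is a feasible (not necessarily optimal) fractional cover.
sets = {"big": {1, 2, 3, 4}, "a": {1, 2}, "b": {3, 4}, "c": {4}}
cost = {"big": Fraction(5), "a": Fraction(1), "b": Fraction(3, 2), "c": Fraction(1, 10)}
x = {"big": 0, "a": 1, "b": 1, "c": 0}

trace = estimator_trace(sets, cost, x)
print([float(v) for v in trace])
```

The last entry of the trace is the cost of the greedy cover (here $13/5$), and the first is ${\rm H}(4)\, c\cdot x = 125/24$, so the final cost is within the claimed bound for this `x`.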

The next note discusses the stronger bound of ${\rm H}(d)$, where $d=\max_s |s|$.

Related

$O(\log n)$-approximation for weighted Set Cover without Wald’s: analyzing the rounding scheme at a fixed stopping time is possible, but yields a different algorithm and a weaker performance guarantee.
Set cover problem (Wikipedia)
Václav Chvátal (Wikipedia)

Bibliography

[1]	V. Chvátal. A greedy heuristic for the set-covering problem. Mathematics of Operations Research, 4(3):233–235, 1979.
