Using the Markov bound on the cost yields an ugly greedy algorithm.

As illustrated in a previous note, one can use random stopping times and Wald’s equation to bound the cost, leading to Chvatal’s algorithm. Here we describe another approach: analyzing a fixed number of samples, then using the Markov bound to bound the cost. This leads to a more complicated algorithm with a slightly weaker approximation ratio.

Background material: Weighted set cover via expected cost

Ugly algorithm

We show the following performance guarantee:

Theorem.

The algorithm below is a $2(1+\ln (2n))$ -approximation algorithm.

	input: collection $S$ of sets, costs $c: S\rightarrow {\mathbb R}_+$
	output: set cover $C$
1.	Let $T\doteq \lceil \ln (2n) n\rceil$, where $n$ is the number of elements.
2.	For $t=1,2,\ldots ,T$ do:
3.	Let set $U$ contain the elements not yet covered by chosen sets.
4.	Define $\mbox{rhs}(s) \doteq \frac{1}{2T} + |U|(1-1/n)^{T-t+1} - |U\setminus s|(1-1/n)^{T-t}$ .
5.	If $\mbox{rhs}(\emptyset ) \ge 0$ do nothing (i.e., choose $s=\emptyset$ ).
6.	Else, choose a set $s$ to maximize $\mbox{rhs}(s)/c_ s$ .
7.	Return the chosen sets.
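The steps above can be sketched in Python. This is a minimal illustration rather than a reference implementation: the function name `ugly_greedy_cover` and the dictionary-based input format are assumptions made here, the costs are assumed strictly positive, the universe is assumed non-empty, and the instance is assumed to admit a cover.

```python
import math

def ugly_greedy_cover(universe, sets, costs):
    """Sketch of the algorithm above (hypothetical interface).

    universe: non-empty set of elements
    sets:     dict mapping a set's name to its elements
    costs:    dict mapping a set's name to a strictly positive cost
    Returns the list of chosen set names, in the order chosen.
    """
    n = len(universe)
    T = math.ceil(math.log(2 * n) * n)
    q = 1.0 - 1.0 / n
    uncovered = set(universe)
    chosen = []
    for t in range(1, T + 1):
        def rhs(s):
            # rhs(s) = 1/(2T) + |U| q^(T-t+1) - |U \ s| q^(T-t)
            return (1.0 / (2 * T)
                    + len(uncovered) * q ** (T - t + 1)
                    - len(uncovered - s) * q ** (T - t))
        if rhs(frozenset()) >= 0:
            continue  # step 5: choosing the empty set is good enough
        # step 6: choose the set maximizing rhs(s) / c_s
        best = max(sets, key=lambda name: rhs(sets[name]) / costs[name])
        chosen.append(best)
        uncovered -= sets[best]
    return chosen

# Small usage example on a made-up instance.
example_sets = {"A": {1, 2}, "B": {3, 4}, "C": {1, 2, 3, 4}}
example_costs = {"A": 1.0, "B": 1.0, "C": 1.5}
print(ugly_greedy_cover({1, 2, 3, 4}, example_sets, example_costs))
```

Note that the sketch never computes a fractional cover: as in the pseudocode, each round needs only the current uncovered set, the round index, and the costs.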

To prove the theorem we apply the method of conditional probabilities to the usual rounding scheme. We start by analyzing the rounding scheme.

Rounding scheme.

	input: weighted Set-Cover instance ${\cal I}$
	output: set cover for ${\cal I}$
0.	Compute a min-cost fractional set cover $x^*$ .
1.	Repeat until the chosen sets form a cover:
2.	Choose a set randomly from the distribution defined by $x^*/|x^*|$ .
3.	Return the chosen sets.
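The sampling loop (steps 1 to 3) might look as follows in Python. This is a sketch under assumptions: the fractional cover is taken as given (step 0, solving the linear program, is not shown), and the function name `round_fractional_cover` and its arguments are hypothetical.

```python
import random

def round_fractional_cover(universe, sets, x, rng=None):
    """Sample sets from the distribution x/|x| until they form a cover.

    universe: set of elements
    sets:     dict mapping a set's name to its elements
    x:        dict mapping a set's name to its fractional weight
              (assumed to be a fractional set cover, so the loop
              terminates with probability 1)
    Returns the list of sampled set names (repeats are possible).
    """
    rng = rng or random.Random()
    names = [s for s in x if x[s] > 0]
    weights = [x[s] for s in names]  # random.choices normalizes by |x|
    uncovered = set(universe)
    chosen = []
    while uncovered:
        s = rng.choices(names, weights=weights, k=1)[0]
        chosen.append(s)
        uncovered -= sets[s]
    return chosen
```

For the analysis below, only the first $T$ samples matter; the sketch simply keeps sampling until everything is covered, as the scheme is stated.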

Lemma.

Fix $T\doteq \lceil \ln (2n)|x| \rceil$ . With positive probability, the rounding scheme returns a cover $C$ of cost at most $2 T c\cdot x /|x|$ (which is at most $2(1+\ln (2n))c\cdot x$ ).

Proof of lemma.

Let $C$ contain the first $T$ sampled sets.

By calculation (as in the unweighted case), the probability that any given element remains uncovered after $T$ rounds is less than $1/(2n)$ . Thus, the expected number of elements not covered by $C$ is less than 1/2. By the Markov bound, the probability that $C$ is not a cover is less than 1/2.
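For reference, the coverage calculation can be spelled out as follows, assuming (as for any fractional set cover) that the weights of the sets containing a given element $e$ sum to at least 1. Each sample then covers $e$ with probability $\sum_{s\ni e} x_s/|x| \ge 1/|x|$, so

\[ \Pr[e \text{ uncovered after } T \text{ samples}] ~ \le ~ (1-1/|x|)^{T} ~ < ~ \exp(-T/|x|) ~ \le ~ \exp(-\ln (2n)) ~ = ~ \frac{1}{2n}, \]

where the last inequality uses $T \ge \ln (2n)|x|$ . Summing over the $n$ elements gives an expected number of uncovered elements less than $1/2$ .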

By calculation, each sample costs $c\cdot x/|x|$ in expectation. By linearity of expectation, the expected cost of $C$ is $T \, c\cdot x/|x|$ . By the Markov bound, the chance that the cost of $C$ exceeds twice this is at most 1/2.

By the naive union bound, the probability that $C$ is not a cover or costs too much is less than 1.

Method of conditional probabilities

Proof of theorem.

To prove the theorem, we show that the algorithm keeps the following pessimistic estimator (on the probability of failure) from increasing:

\[ \phi _ t ~ \doteq ~ \frac{\sum _{s\in S_ t} c_ s ~ +~ (T-t)c\cdot x/|x|}{2\, T\, c\cdot x/|x|} \, +\, n_ t(1-1/|x|)^{T-t}, \]

where $S_ t$ contains the first $t$ sets chosen and $n_ t$ is the number of elements left uncovered by $S_ t$ .
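As a numeric sanity check, the estimator can be evaluated directly. The helper below is only an illustration: the name `pessimistic_estimator` and its parameters are assumptions, with `cx` standing for $c\cdot x$ and `x_size` for $|x|$ .

```python
def pessimistic_estimator(t, T, chosen_cost, n_uncovered, cx, x_size):
    """Evaluate phi_t from its definition.

    chosen_cost: total cost of the first t chosen sets
    n_uncovered: number of elements left uncovered by those sets (n_t)
    cx, x_size:  c.x and |x| for the fractional cover x
    """
    per_sample = cx / x_size  # expected cost of one random sample
    cost_term = (chosen_cost + (T - t) * per_sample) / (2 * T * per_sample)
    cover_term = n_uncovered * (1 - 1 / x_size) ** (T - t)
    return cost_term + cover_term
```

For example, with $n=4$ , $|x|=4$ and $T=\lceil 4\ln 8\rceil = 9$ , the call `pessimistic_estimator(0, 9, 0.0, 4, 1.5, 4)` evaluates $1/2 + 4(3/4)^9$ , which is below 1.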

Verification of pessimistic estimator.

If the algorithm chooses a set $s$ in round $t$ , the increase in the pessimistic estimator $\phi _{t}-\phi _{t-1}$ is

\[ \frac{c_ s}{2T c\cdot x/|x|} \, -\, \frac{1}{2T} \, +\, n_{t}(1-1/|x|)^{T-t} \, -\, n_{t-1}(1-1/|x|)^{T-t+1}. \]

Given $n_{t-1}$ , for $s$ chosen randomly from $x/|x|$ , the expected increase is non-positive (because $\textrm{E}[c_ s] = c\cdot x/|x|$ and $\textrm{E}[n_{t}] \le (1-1/|x|)n_{t-1}$ ), so some set $s$ makes it non-positive. The increase will be non-positive iff

\[ \frac{c_ s}{2T c\cdot x/|x|} ~ \le ~ \frac{1}{2T} \, +\, n_{t-1}(1-1/|x|)^{T-t+1} \, -\, n_{t}(1-1/|x|)^{T-t}. \]

Unfortunately, choosing $s$ to ensure the above inequality requires knowing $|x|$ . Work around this with the following trick: modify the input instance by adding the empty set $\emptyset$ , with cost 0, to the collection of sets. Then assume without loss of generality that $|x|=n$ . (If not, increase $x_{\emptyset }$ until $|x|= n$ , without changing $c\cdot x$ .) Now $T =\lceil \ln (2n)n\rceil$ and the desired condition is

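\[ \frac{c_ s}{2T c\cdot x/n} ~ \le ~ \frac{1}{2T} \, +\, n_{t-1}(1-1/n)^{T-t+1} \, -\, n_{t}(1-1/n)^{T-t}. \]

Since this holds for some $s$ , one of the following two choices will ensure that it holds: choosing $s=\emptyset$ (which works whenever the right-hand side is non-negative for $s=\emptyset$ ), or choosing $s$ to maximize the right-hand side divided by the left-hand side. The factor $2Tc\cdot x/n$ is the same for every $s$ , so maximizing this ratio is the same as maximizing $\mbox{rhs}(s)/c_ s$ , where $\mbox{rhs}(s)$ denotes the right-hand side; in particular, the algorithm does not need to know $c\cdot x$ . This gives the algorithm.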

The algorithm ensures $\phi _ T \le \phi _0 < 1/2 + n\exp (-T/|x|) \le 1$ (by the choice of $T$ ), which by inspection of $\phi _ T$ ensures that at time $t=T$ all elements are covered and the cost of the cover is at most $2T c\cdot x/|x|$ , as desired.
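The two endpoint values used here follow directly from the definition of $\phi _ t$ , with $n_0 = n$ and $S_0=\emptyset$ :

\[ \phi _0 ~ =~ \frac{T\, c\cdot x/|x|}{2\, T\, c\cdot x/|x|} \, +\, n(1-1/|x|)^{T} ~ =~ \frac{1}{2} + n(1-1/|x|)^{T}, \qquad \phi _ T ~ =~ \frac{\sum _{s\in S_ T} c_ s}{2\, T\, c\cdot x/|x|} \, +\, n_ T. \]

Since $n_ T$ is a non-negative integer and both addends of $\phi _ T$ are non-negative, $\phi _ T < 1$ forces $n_ T = 0$ and $\sum _{s\in S_ T} c_ s < 2T c\cdot x/|x|$ .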

Explanation of pessimistic estimator.

Why is the conditional probability of failing to find the desired cover, given $S_ t$ , at most

\[ \phi _ t ~ \doteq ~ \frac{\sum _{s\in S_ t} c_ s ~ +~ (T-t)c\cdot x/|x|}{2\, T\, c\cdot x/|x|} \, +\, n_ t(1-1/|x|)^{T-t}? \]

The first addend is the conditional expectation of $c\cdot {\tilde x}^{\scriptscriptstyle (T)}/(2T c\cdot x/|x|)$ , which (by the Markov bound) is an upper bound on the conditional probability that the cost is too high. The second addend is an upper bound on the conditional expectation of the number of elements left uncovered, which in turn is an upper bound on the conditional probability that some element is left uncovered.
