Randomly rounding a fractional LP solution to an integer solution.

The idea, introduced by Raghavan and Thompson in 1987, is to use the probabilistic method to round the solution of a linear program, converting it into an approximately optimal integer solution . It’s a broadly useful technique. For many problems, randomized rounding yields algorithms with optimal approximation ratios (assuming P $\neq$ NP). This note describes randomized rounding and gives a few examples.

Click for background material…

Trivial example: Vertex Cover

Here is a very simple randomized-rounding scheme for Vertex Cover:

1. Given any graph $G$ , compute a min-cost fractional vertex cover $x^*$ .
2. Return the random vertex cover $\tilde x$ where $\tilde x_ v = 1$ with probability $\min (2x^*_ v,1)$ .

The algorithm is a 2-approximation algorithm in expectation. Indeed, the rounded solution $\tilde x$ is a cover because for any edge one of its two endpoints $v$ must have $x^*_ v \ge 1/2$ , which forces $v$ into the cover $\tilde x$ . By inspection, the expected cost of $\tilde x$ is at most twice the cost of $x^*$ , which, since the cost of $x^*$ is a lower bound on the cost of the integral , is at most $2{\textsc{opt}}$ .

Relation to the probabilistic method in combinatorics

If you’re already familiar with the probabilistic method, the new element here is that the fractional solution $x^*$ is used as basis for the probability distribution. This allows the cost of the rounded solution to be related to the cost of $x^*$ , and via that (since $x^*$ is an optimal solution to a relaxation) to the cost of opt. In this way, one obtains the per-input worst-case performance guarantees necessary for approximation algorithms.

Standard randomized-rounding scheme for Set Cover

Next is a more sophisticated example of randomized rounding: a standard randomized-rounding scheme for Set Cover. This example is comparable to one in [4, Sec.14.2]. For more examples, see [1, ch.4].

Click for the definition of Set Cover…

Given a collection $S$ of finite sets from a universe $U$ and a cost $c_ s$ for each set $s\in S$ , the problem is to choose a minimum-cost set cover — a collection $C$ of sets in $S$ such that each element $e\in U$ is contained in at least one set in $C$ . The cost of $C$ is $\sum _{s\in C} c_ s$ .

Rounding scheme.

Let $x^*$ be a minimum-cost fractional set cover $x^*$ (computed, say, by linear programming). Consider first the most natural scheme for rounding $x^*$ :

For each set $s\in S$ independently, take $\tilde x_ s = 1$ with probability $x^*_ s$ , otherwise take $\tilde x_ s = 0$ .

With this rounding scheme, the expected cost of the chosen sets is $\sum _ s c_ s x^*_ s$ , the cost of the fractional cover. This is good. Unfortunately the coverage is not good in the worst case, when the set weights are relatively small, for then the probability that an element $e$ is not covered is about

\[ \prod _{s\ni e} 1-x^*_ s {\, {\approx }\, }\prod _{s\ni e} \exp (-x^*_ s) {\, {=}\, }\exp \Big(-\sum _{s\ni e}x^*_ s\Big) {\, {\approx }\, }\exp (-1). \]

Only a constant fraction of the elements will be covered in expectation. To increase the coverage, the standard rounding scheme scales up the rounding probabilities by an appropriate factor $\lambda \gt 1$ :

For each set $s\in S$ independently, take $\tilde x_ s = 1$ with probability $p_ s = \min (\lambda x^*_ s, 1)$ , otherwise take $\tilde x_ s = 0$ .

Scaling the probabilities increases the expected cost by $\lambda$ , but makes coverage of all elements likely.

The next step is to show that this rounding scheme works with positive probability.

Lemma (existence proof).

Fix $\lambda = \ln (2n)$ , where $n$ is the number of elements. With non-zero probability, the rounding scheme returns a set cover $\tilde x$ of cost at most $2\ln (2 n)\, c\cdot x^*$ (and thus, of cost $O(\log n)$ times ${\textsc{opt}}$ ).

(Note: with care the $O(\log n)$ can be reduced to $\ln (n)+O(\log \log n)$ .)

Proof.

The output $\tilde x$ of the random rounding scheme has the desired properties as long as none of the following “bad” events occur:

the cost $c\cdot \tilde x$ is too high: it exceeds $2\lambda c\cdot x^*$ , or
$\tilde x$ is not a cover: it fails to cover one of the $n$ elements $e$ .

Regarding the cost, the expectation of each $\tilde x_ s$ is at most $\lambda x_ s^*$ . By linearity of expectation, the expectation of $c\cdot \tilde x$ is at most $\sum _ s c_ s\lambda x_ s^*=\lambda c\cdot x^*$ . Thus, by the Markov bound, the probability of the first bad event (the cost is too high) is at most $1/2$ .

For the remaining bad events (coverage of each element $e$ ), note that, for any given element $e$ , since $\sum _{s\ni e} x^*_ s \ge 1$ , the probability that $e$ is not covered is

\[ \prod _{s\ni e} 1-p_ s {\, {<}\, }\prod _{s\ni e} \exp ({-}\lambda x^*_ s) {\, {=}\, }\exp \Big({-}\lambda \sum _{s\ni e} x^*_ s \Big) {\, {\le }\, }\exp ({-}\lambda ) {\, {=}\, }1/(2n). \]

(The first step uses $p_ s = \min (\lambda x^*_ s,1)$ and $1-z\le e^{-z}$ , which is strict for $z\ne 0$ . The last step uses the definition of $\lambda$ .) Thus, for each of the $n$ elements, the probability that the element is not covered is less than $1/(2n)$ . By the naive union bound, the probability that one of the $1+n$ bad events happens is less than $1/2 + n/(2n)=1$ . Thus, with positive probability there are no bad events and $\tilde x$ is a set cover of cost at most $2\lambda c\cdot x^*$ .

Deconstructing the Set Cover existence proof to find $Q$

The argument above shows the existence of a set cover of cost $O(\log (n)c\cdot x^*$ ). The next step is to apply the method of conditional probabilities to the existence proof. The resulting algorithm will consider each set $s\in S$ in turn, and set $\tilde x_ s$ to either 0 or 1. Instead of setting $\tilde x_ s$ randomly based on $x^*$ , it will set $\tilde x_ s$ to keep the conditional probability of failure (defined with respect to the random experiment) below 1.

Although the concept is simple, implementing it requires a careful calculation bounding the conditional probability of failure as the scheme proceeds. The original existence proof implicitly bounds the (unconditioned) probability of failure by the expectation of

\[ Q = \frac{c\cdot \tilde x}{2\lambda c\cdot x^*} {\, {+}\, }n_ m \]

where $n_ m$ is the number of elements left uncovered at the end that is, $n_ m$ is the size of the set $U_ m=\{ e\, \, |\, \, \forall s\ni e.~ \tilde x_ s = 0 \}$ .

The form of $Q$ mirrors the probabilistic proof in a systematic way.¹ The first term in $Q$ comes from applying the Markov bound to bound the probability of the first bad event. It contributes at least 1 to $Q$ if the cost is too high. The second term $n_ m$ counts the number of bad events of the second kind (uncovered elements). It contributes at least 1 to $Q$ if the coverage fails. Since failure implies $Q\ge 1$ , by the Markov bound, the probability of failure is at most $E[Q]$ . Finally (as shown by calculation in the proof of the lemma) $E[Q] \lt 1$ .

Algorithm via the method of conditional probabilities

The algorithm will choose each set to keep the conditional expectation of $Q$ , given the choices made so far, below 1. Consider the state after the $t$ th choice. Let $S_ t=(s_1,s_2,\ldots ,s_ t)$ denote the sets considered so far — for each set $s\in S_ t$ , the value of $\tilde x_ s$ has been set to 0 or 1; the values for the remaining sets have not yet been determined. The conditional expectation of $Q$ , given this state, is

\[ \phi _ t ~ =~ \frac{\sum _{s\in S_ t} c_ s \tilde x_ s + \sum _{s\not\in S_ t} c_ s p_ s}{2\lambda c\cdot x^*} ~ {\, {+}\, }~ \sum _ e \prod _{s\in S_ t} (1-[e\in s] \tilde x_ s) \prod _{s\not\in S_ t} (1-[e\in s]p_ s). \]

The algorithm will keep $\phi _ t$ (the conditional expectation of $Q$ ) below 1 by setting each $\tilde x_{s_ t}$ to minimize the resulting value of $\phi _ t$ . Let $s’=s_ t$ denote the $t$ th set. When the algorithm sets $\tilde x_{s'}$ , each $\tilde x_ s$ for $s\in S_{t-1}$ has already been set. Considered as a function of just $\tilde x_{s'}$ (with all other quantities fixed) the quantity $\phi _ t$ is linear in $\tilde x_{s'}$ . Let $U= \{ e \, |\, \prod _{s\in S_{t-1}} (1-[e\in s] \tilde x_ s) = 1\}$ denote the set of elements not covered before $s’$ is considered. The coefficient of $\tilde x_{s'}$ in $\phi _ t$ is

\[ \frac{c_{s’}}{2\lambda c\cdot x^*} {\, {-}\, }\sum _{e\in U}[e\in s’]\prod _{s\not\in S_ t} (1-[e\in s]p_ s). \]

To minimize $\phi _ t$ , if this coefficient is positive, then the algorithm sets $\tilde x_{s'}$ to 0. Otherwise it sets $\tilde x_{s'}$ to 1:

	input: set system $S$ , universe $U$ , cost vector $c$
	output: set cover $\tilde x$
1.	Compute min-cost fractional set cover $x^*$ .
2.	Take $\lambda \leftarrow \ln (2n)$ and set $p_ s \leftarrow \max (\lambda x^*_{s},1)$ for each $s$ .
3.	For each $s’\in S$ do:
4.	Let $S \leftarrow S - \{ s’\}$ . … $S$ contains the not-yet-decided sets.
5.	If $\displaystyle \frac{c_{s'}}{2\lambda c\cdot x^*} \gt \sum _{e\in s'\cap U} \prod _{s\in S, s\ni e}(1-p_ s)$ then set $\tilde x_ s\leftarrow 0$ , else set $\tilde x_ s\leftarrow 1$ and $U\leftarrow U - s’$ .
6.	Return $\tilde x$ .

By ensuring $\phi _ t \le \phi _{t-1}$ at each step, the algorithm guarantees an outcome with $Q\le \textrm{E}[Q]\lt 1$ , so that $\tilde x$ has cost at most $2\ln (2n)c\cdot x^*$ and covers all elements:

Lemma (rounding algorithm approximation guarantee).

The algorithm above returns a set cover $\tilde x$ of cost at most $2\ln (2n)$ times the minimum cost of any (fractional) set cover.

Randomized rounding without solving the linear program

The algorithm’s choice for each $\tilde x_{s'}$ depends intrinsically on the individual values within $x^*$ . (For example, if each element $e\in s’$ is contained in some not-yet-considered set $s$ with $p_{s}=1$ , then $s’$ will not be added to the cover even if $p_{s'}=1$ .) This dependence is not surprising, but it is not inevitable. By starting with a different randomized rounding scheme, the method of conditional probabilities can yield an algorithm that, in each step, makes a choice that is the same for any feasible $x^*$ . Thus, the algorithm can compute the approximate cover $\tilde x$ without first computing $x^*$ . You can read more about that here.

Textbooks: [4, Sec.14.2], [1, ch.4], [5, ch.5] (online)
Randomized rounding (wikipedia)
Randomized rounding without solving the linear program

Bibliography

[1]	R. Motwani and P. Raghavan. Randomized Algorithms. Cambridge University Press, 1995.
[2]	P. Raghavan. Probabilistic construction of deterministic algorithms approximating packing integer programs. J. Computer System Sciences, 37(2):130–143, Oct. 1988.
[3]	P. Raghavan and C. D. Thompson. Randomized rounding: A technique for provably good algorithms and algorithmic proofs. Combinatorica, 7:365–374, 1987.
[4]	V. Vazirani. Approximation Algorithms. Springer Verlag, 2003.
[5]	D. P. Williamson and D. B. Shmoys. The design of approximation algorithms. Cambridge University Press, 2011. (Available online).

Footnotes

(1) For an application of the Markov bound, ( $\Pr [X\ge {}c] {\, {\le }\, }E[X]/c$ ), the pessimistic estimator will be the conditional expectation of $X/(c\textrm{E}[X])$ .
(2) For an application of the naive union bound ( $\Pr [A\text { or } B] {\, {\le }\, }\Pr [A] + \Pr [B]$ ) the pessimistic estimator will be the sum of the pessimistic estimators of the individual events.
(3) For a bad event whose probability is computed (or estimated) directly, the pessimistic estimator will be the conditional probability (or the corresponding estimate).

Discussion

Can you please explain this line in last line of page 5 “By ensuring ϕt≤ϕt−1ϕt≤ϕt−1 at each step, the algorithm guarantees an outcome with Q≤E[Q]<1, so that x~ has cost at most 2ln(2n)c⋅x∗and covers all elements. "

- Arindam February 21, 2016
That line summarizes the reasoning in the previous pages. To break it down, the proof of the lemma on page 2 explains why the randomized-rounding scheme (with non-zero probability) returns $\tilde x$ of at most that cost ($2\ln(2n) \,c\cdot x^*$). The discussion on page 4 explains that to find an outcome where $\tilde x$ achieves that cost bound, it’s enough to find an outcome where $Q$ (as defined there) is less than 1, and why $E[Q]< 1$. The discussion on page 5 explains why $\phi_t$ is the conditional expectation of $Q$ given the first $t$ choices. That page also sketches why the deterministic algorithm there, in each iteration $t$, makes a choice that ensures that $\phi_t$ is at most $\phi_{t-1}$, so that the algorithm keeps the conditional expectation of $Q$ below 1, and thus reaches an outcome where $Q<1$, so that $\tilde x$ achieves the desired cost bound. If these comments don't make much sense to you, try going over the background material, linked to at the top. Or, if you have a question about any one of these steps, please post it.

- Neal Post authorFebruary 21, 2016

Notes on algorithms

Lecture notes on algorithms

Randomized rounding

Trivial example: Vertex Cover

Relation to the probabilistic method in combinatorics

Standard randomized-rounding scheme for Set Cover

Deconstructing the Set Cover existence proof to find $Q$

Algorithm via the method of conditional probabilities

Randomized rounding without solving the linear program

Related

Bibliography

Footnotes

Discussion

Trivial example: Vertex Cover

Relation to the probabilistic method in combinatorics

Standard randomized-rounding scheme for Set Cover

Deconstructing the Set Cover existence proof to find QQ

Algorithm via the method of conditional probabilities

Randomized rounding without solving the linear program

Related

Bibliography

Footnotes

Discussion

Deconstructing the Set Cover existence proof to find $Q$