Deriving greedy algorithms via randomized rounding.

Greedy algorithms are often natural, but coming up with exactly the right algorithm or analysis can still be a challenge. Here, the idea is to view these algorithms as derandomized versions of a sample-and-increment rounding scheme. The proof of the performance guarantee starts with the rounding scheme, using standard probabilistic tools. Applying the method of conditional probabilities then yields the algorithm in a systematic way.

Click for background material…

Introduction — About the basic idea of using randomized rounding to derive greedy and Lagrangian-relaxation algorithms.
linear programming — Notes on integer linear programming, including linear programs for Set Cover.
example algorithm — A review of the greedy Set-Cover algorithm.
randomized rounding — Notes on conventional randomized rounding. (See also the method of conditional probabilities and pessimistic estimators.)
Wald’s equation — A review of Wald’s equation, for analyzing random sampling schemes with random stopping times.

Three introductory examples

1. Unweighted Set Cover.

This introductory example illustrates the basic idea. The rounding scheme randomly samples a fixed number $N = \lceil |x|\ln n\rceil$ of sets from the distribution defined by the fractional solution $x$ . By direct calculation, the expected number of elements left uncovered is less than 1. Applying the method of conditional probabilities yields Johnson and Lovász’s greedy set-cover algorithm, and a proof that it returns a cover of size at most $\lceil |x^*|\ln n\rceil$ , where $x^*$ is an optimal fractional cover. More on deriving the greedy set-cover algorithm…

2. Weighted Set Cover.

For problems whose cost functions have weights, the increase in cost with each sample depends on the sample. Also, instead of sampling for a pre-determined number of samples, the most natural rounding scheme stops when some condition is met (such as when all elements are covered). The analysis of the rounding scheme becomes a little more subtle. The analysis for the unweighted case can be adapted by using Wald’s equation.

In this example, the rounding scheme samples from the distribution defined by the fractional solution until all elements are covered. We use Wald’s equation to show that the expected number of samples before all elements are covered is at most $|x|{\rm H}(n)$ , so the expected cost of all the samples is at most $c\cdot x\, {\rm H}(n)$ (the fractional cost times ${\rm H}(n)=1+1/2+\cdots +1/n$ ). Applying the method of conditional probabilities yields Chvátal’s algorithm (which generalizes the greedy set-cover algorithm to the weighted case), and a proof that it returns a ${\rm H}(n)$ -approximation. More on using Wald’s to handle weighted set cover…

3. Improving the ratio to ${\rm H}(d)$ by “localizing.”

Here $d = \max _ s |s|$ is the maximum set size. The analyses in the examples above analyze the number of samples necessary to meet all $n$ constraints, then multiply that by the cost per sample. With this “global” approach, the approximation ratio inevitably grows as $n$ increases, because the number of samples required to meet all constraints grows with $n$ (it’s about $|x|{\rm H}(n)$ ). We can avoid this by recasting the analysis to make it “local”.

In this example, we modify the random sampling scheme so that, when a set $s$ is sampled, the set is added to the cover only if $s$ contains an element that is not yet covered. For the analysis we focus on each set $s$ in isolation. Following the same reasoning as in the global analysis, the expected number of samples before all elements in $s$ are covered is at most $|x|{\rm H}(|s|)$ . (And $s$ won’t be added to the cover after that.) Thus, the probability that $s$ is added is at most $x_ s{\rm H}(|s|)$ . By linearity of expectation the expected cost of the final cover is at most $\sum _ s c_ s x_ s {\rm H}(|s|)$ .

Applying the method of conditional probabilities yields Chvátal’s algorithm again (although the pessimistic estimator is more subtle) and a proof that the algorithm returns a cover of cost at most $\min _{x^*} \sum _ s c_ s x^*_ s {\rm H}(|s|)$ , which is at most ${\rm H}(d)\min _{x^*} c\cdot x^*$ , where $x^*$ ranges over the feasible fractional set covers.

More on improving the bound by localizing it…

Other related examples

Notes not yet added

more general covering problems, including k-medians and facility locations
open problems (e.g. greedy algorithm for integer covering with box constraints)
Implicit dual solutions

Notes on algorithms

Lecture notes on algorithms

Greedy examples

Three introductory examples

Other related examples

Notes not yet added