Deriving greedy algorithms via randomized rounding.
Greedy algorithms are often natural, but coming up with exactly the right algorithm or analysis can still be a challenge. Here, the idea is to view these algorithms as derandomized versions of a sample-and-increment rounding scheme. The proof of the performance guarantee starts with the rounding scheme, using standard probabilistic tools. Applying the method of conditional probabilities then yields the algorithm in a systematic way.
Three introductory examples
This introductory example illustrates the basic idea. The rounding scheme randomly samples a fixed number N = \lceil |x|\ln n\rceil of sets from the distribution defined by the fractional solution x. By direct calculation, the expected number of elements left uncovered is less than 1. Applying the method of conditional probabilities yields Johnson and Lovász’s greedy set-cover algorithm, and a proof that it returns a cover of size at most \lceil |x^*|\ln n\rceil , where x^* is an optimal fractional cover. More on deriving the greedy set-cover algorithm…
For problems whose cost functions have weights, the increase in cost with each sample depends on the sample. Also, instead of sampling for a pre-determined number of samples, the most natural rounding scheme stops when some condition is met (such as when all elements are covered). The analysis of the rounding scheme becomes a little more subtle. The analysis for the unweighted case can be adapted by using Wald’s equation.
In this example, the rounding scheme samples from the distribution defined by the fractional solution until all elements are covered. We use Wald’s equation to show that the expected number of samples before all elements are covered is at most |x|{\rm H}(n), so the expected cost of all the samples is at most c\cdot x\, {\rm H}(n) (the fractional cost times {\rm H}(n)=1+1/2+\cdots +1/n). Applying the method of conditional probabilities yields Chvátal’s algorithm (which generalizes the greedy set-cover algorithm to the weighted case), and a proof that it returns a {\rm H}(n)-approximation. More on using Wald’s to handle weighted set cover…
Here d = \max _ s |s| is the maximum set size. The analyses in the examples above analyze the number of samples necessary to meet all n constraints, then multiply that by the cost per sample. With this “global” approach, the approximation ratio inevitably grows as n increases, because the number of samples required to meet all constraints grows with n (it’s about |x|{\rm H}(n)). We can avoid this by recasting the analysis to make it “local”.
In this example, we modify the random sampling scheme so that, when a set s is sampled, the set is added to the cover only if s contains an element that is not yet covered. For the analysis we focus on each set s in isolation. Following the same reasoning as in the global analysis, the expected number of samples before all elements in s are covered is at most |x|{\rm H}(|s|). (And s won’t be added to the cover after that.) Thus, the probability that s is added is at most x_ s{\rm H}(|s|). By linearity of expectation the expected cost of the final cover is at most \sum _ s c_ s x_ s {\rm H}(|s|).
Applying the method of conditional probabilities yields Chvátal’s algorithm again (although the pessimistic estimator is more subtle) and a proof that the algorithm returns a cover of cost at most \min _{x^*} \sum _ s c_ s x^*_ s {\rm H}(|s|), which is at most {\rm H}(d)\min _{x^*} c\cdot x^*, where x^* ranges over the feasible fractional set covers.
Notes not yet added
-
more general covering problems, including k-medians and facility locations
-
open problems (e.g. greedy algorithm for integer covering with box constraints)
-
Implicit dual solutions