# Set Cover / Wolsey’s generalization

Wolsey’s generalization of the greedy Set-Cover algorithm to a large class of problems.

It is natural to ask what general classes of problems the greedy Set-Cover algorithm generalizes to. Here we describe one such class, due to Wolsey (1982), that captures many, but not all, such problems.

# Minimizing a linear cost subject to a submodular constraint

The class is called choosing a subset to minimize a linear cost function subject to a submodular constraint. It contains all problems of the form

$\min _ S \Big\{ \sum _{s \in S} c_ s ~ \Big|~ S\subseteq C, f(S) \ge k\Big\}$

where is a universe of elements and is a monotone submodular function (mapping subsets to reals).1

The generalized algorithm starts with , then repeatedly chooses an element to add to , where maximizes , stopping when achieves the maximum possible value .

Performance guarantees.

Wolsey proves a number of performance guarantees for the algorithm [3, 2]. The most satisfactory guarantee holds only when the submodular function is integer-valued. In this case, the approximation ratio is , where is , the largest possible increase in from adding a single element. The proof is primal-dual, based on an abstract linear-program relaxation of the problem.

For more general submodular functions, Wolsey bounds the approximation ratio (somewhat weakly) by , where

• is
where contains the first elements chosen by the algorithm and ranges over . This generalizes Wolsey’s result for integer-valued .

• , where is the number of iterations and is the value of the quantity that is maximized in iteration of the algorithm (that is, ).

• is .

These latter bounds aren’t fully satisfying.

# Examples

The class of problems is somewhat abstract. Here are some examples:

• Set Partial Cover: Given an integer and a collection of sets over a universe , where each set has a cost , choose a minimum-cost collection of the sets so that their union contains at least elements. The approximation ratio is , where is the maximum size of any single set .

• Set Multicover: Given a collection of sets, where each set has a cost , and an integer requirement for each element , choose a minimum-cost collection of the sets so that each element is contained in at least of the chosen sets. Each set can only be chosen once. The approximation ratio is , where is the maximum size of any single set .

• Given a collection of sets of vectors of , where each set has a cost , choose a minimum-cost collection of the sets so that their union spans . The approximation ratio is , where is the maximum rank of any single set .

• Given a collection of sets of edges in a graph , choose a minimum-cost collection of the sets so that their union connects the graph. The approximation ratio is , where is the maximum size of any set . If the problem is MST, the algorithm gives 1-approximation (it reduces to Kruskal’s algorithm).

# An open problem

One of the most natural generalizations of Set Cover is Integer Covering with upper bounds on the variables, that is, where each . (To model a more general constraint for a variable , just replace each variable with copies.)

A nice open question is whether the approximation ratio (or perhaps a ratio) can be shown for any natural greedy algorithm for this problem, where is the maximum number of constraints in which any variable appears. An -approximation algorithm is known for the problem, but it is not a greedy algorithm .

For this problem, it is natural to define the underlying submodular function by or . But for these functions, none of Wolsey’s three quantities above are bounded independent of and .

The question is resolved positively for for two slightly easier problems:

• (no upper bounds on the variables), and

• where . (This one is Set Multicover, mentioned previously, for which the function above is integer-valued, so Wolsey’s result applies.)

## Footnotes

1. is submodular if for any and . That is, adding to the current set gives at least as much benefit as would adding after adding . Intuitively, adding cannot make it better to add . The key property for the analysis of the greedy algorithm is the following: if adding a set of elements would increase by at least , then there must be at least one element in that, if added by itself, would increase by at least .