modeling Set Cover and Multicommodity Flow

Set Cover and Multicommodity flow as (integer) linear programs.

Modeling a problem as a linear program or integer linear program is a basic skill. Here are two examples.



Set Cover

Recall the definition of Set Cover: Given a collection {\cal S} of finite sets from a universe U and a cost c_ s for each set s\in {\cal S}, the problem is to choose a minimum-cost set cover — a collection {\cal C} of sets in {\cal S} such that each element e\in U is contained in at least one set in {\cal C}. The cost of {\cal C} is \sum _{s\in {\cal C}} c_ s.

The standard integer linear program for the problem is

\[ \min \Big\{ c\cdot x ~ |~ x\in {\mathbb Z}_+^{|{\cal S}|};\, (\forall e\in U)\, \sum _{s\ni e} x_ s \ge 1\Big\} . \]

In the linear-program relaxation of the problem, the variables can take non-integer values:

\[ \min \Big\{ c\cdot x ~ |~ x\in {\mathbb R}_+^{|{\cal S}|};\, (\forall e\in U)\, \sum _{s\ni e} x_ s \ge 1\Big\} . \]

If a (possibly fractional) x meets the constraint \sum _{s\ni e} x_ s \ge 1 for an element e, we say that x covers e, and, if x covers all elements, it is a fractional set cover — a feasible solution to the linear program. If x is an integer solution, we identify it with the collection of sets \{ s\in {\cal S}: x_ s \gt 0\} .

\includegraphics[type=pdf,ext=.pdf,read=.pdf,width=3.75in]{shared/graphics/set_cover_graph} The figure shows a fractional set cover of size 2 (that is, \sum _ s x_ s = 2). The value assigned to each variable x_ s is written above its set s. For each element e, the total coverage \sum _{s\ni e} x_ s of e is shown below e. The total coverage of each element is at least 1, so x is a fractional set cover.

Multicommodity Flow

Maximum multicommodity flow (defined in this note) is exactly modeled by the following linear program:

\[ \max \Big\{ |f| : f\in {\mathbb R}_+^{|P|}, (\forall e\in E)\, \sum _{p\ni e} f_ p \le c_ e\Big\} . \]

For each path p\in P, there is a variable f_ p representing the flow on path p. For each edge e, there is a constraint that the total flow on paths through the edge is at most the edge capacity c_ e. Subject to these constraints, the objective is to maximize the total flow |f| (i.e., \sum _ p f_ p) on all paths.

Remark.

The number of variables in this formulation can be exponential in the size of the graph.1 Greedy and Lagrangian-relaxation algorithms can deal with this, as long as they have an appropriate oracle (optimization subroutine).

Exercise.

Describe how, given a graph G, to model the problem of finding a maximum-size independent set in G as an integer linear programming problem.

Footnotes

  1. You may be familiar with an alternate formulation of network flow using flow variables for the edges. That formulation can be smaller, but has both negative and positive coefficients — it is not a packing or covering linear program.