Misc. background material

Excerpts

Probabilistic method

A brief introduction to the probabilistic method, followed by some basic bounds and a few examples.

Alon, Spencer, and Erdős (1992) describe the method as follows:

In order to prove the existence of a combinatorial structure with certain properties, we construct an appropriate probability space and show that a randomly chosen element in the space has the desired properties with positive probability.[1]

The method has come into wide use since about 1950, even in areas of mathematics that, a priori, have nothing to do with probability. This is roughly because probability provides a useful high-level conceptual framework for understanding (formalizing, organizing, communicating) many widely applicable counting and averaging arguments. This probabilistic viewpoint is called the probabilistic lens.
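
A canonical first example is Erdős’s lower bound on Ramsey numbers. Color each edge of the complete graph K_n red or blue independently with probability 1/2. For any fixed set of k vertices, the probability that all \binom{k}{2} edges among them receive the same color is 2^{1-\binom{k}{2}}, so by the union bound the probability that some k-set is monochromatic is at most \binom{n}{k}\,2^{1-\binom{k}{2}}. Whenever this quantity is less than 1, a coloring of K_n with no monochromatic K_k exists; that is, R(k,k) > n.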

Alon et al. note that, although in principle analysis with the probabilistic method can be replaced by direct arguments,

…in practice, the probability is essential. It would be hopeless to replace the applications of many of the tools [by counting arguments]. [1, page 2]

In Computer Science, the method is used in randomized rounding to design approximation algorithms for combinatorial optimization problems.

basic bounds

Existence proofs, the naive union bound, linearity of expectation, Markov bound.
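
For reference: the union bound says \textrm{Pr}[\bigcup_i A_i] \le \sum_i \textrm{Pr}[A_i] for any events A_i; linearity of expectation says \textrm{E}[\sum_i X_i] = \sum_i \textrm{E}[X_i], with no independence assumptions; and the Markov bound says \textrm{Pr}[X \ge c] \le \textrm{E}[X]/c for any non-negative random variable X and any c > 0.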

method of conditional probabilities (Max Cut)

The method of conditional probabilities converts a probabilistic existence proof into a deterministic algorithm.
The method of conditional probabilities is a systematic way of converting non-constructive probabilistic existence proofs into efficient deterministic algorithms that explicitly construct the desired object.
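
As a concrete illustration, here is a minimal sketch in Python for Max Cut (the example named in the title above); the function name and graph representation are assumptions for illustration. The random experiment places each vertex on a uniformly random side, so each edge is cut with probability 1/2 and some cut contains at least half the edges. The derandomized algorithm places the vertices one at a time, each time choosing the side that keeps the conditional expectation of the cut size from decreasing: the side containing fewer of the vertex’s already-placed neighbors.

    def max_cut_by_conditional_probabilities(vertices, edges):
        # edges: list of pairs (u, v).  Returns a partition (S, T)
        # cutting at least half of the edges.
        side = {}  # side[v] in {0, 1} once v has been placed
        for v in vertices:
            # Count already-placed neighbors of v on each side.
            placed = [0, 0]
            for a, b in edges:
                if a == v and b in side:
                    placed[side[b]] += 1
                elif b == v and a in side:
                    placed[side[a]] += 1
            # Put v on the side with fewer placed neighbors, so at least
            # half of v's edges to placed neighbors are cut.  This choice
            # keeps the conditional expectation of the cut size (cut edges
            # so far, plus half of the undetermined edges) non-decreasing.
            side[v] = 0 if placed[0] <= placed[1] else 1
        S = {v for v in vertices if side[v] == 0}
        return S, set(vertices) - S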

pessimistic estimators (Turán's theorem)

Pessimistic estimators in the method of conditional probabilities.
In applying the method of conditional probabilities, exact conditional probabilities (or expectations) are sometimes hard to compute. Pessimistic estimators can be used instead. We illustrate the idea by example, using Turán’s theorem.
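
To preview the example, here is one standard instantiation as a Python sketch (the adjacency-dict representation is an assumption for illustration). A uniformly random vertex ordering yields an independent set of expected size at least \sum_v 1/(d(v)+1): take every vertex that precedes all of its neighbors. Tracking the exact conditional expectation is awkward, but \sum_v 1/(d(v)+1), recomputed on the remaining graph, serves as a pessimistic estimator: selecting a minimum-degree vertex and deleting its closed neighborhood costs the estimator at most 1 while adding one vertex to the set, so the final set has size at least the initial estimator value.

    def greedy_independent_set(adj):
        # adj: dict mapping each vertex to the set of its neighbors.
        # Returns an independent set of size at least sum_v 1/(d(v)+1).
        adj = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
        independent = set()
        while adj:
            # A minimum-degree vertex: removing its closed neighborhood
            # decreases the estimator sum_v 1/(deg(v)+1), taken over the
            # remaining vertices, by at most 1, while the set grows by 1.
            v = min(adj, key=lambda u: len(adj[u]))
            independent.add(v)
            removed = adj[v] | {v}
            for u in removed:
                adj.pop(u, None)
            for u in adj:
                adj[u] -= removed
        return independent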

Chernoff bound

A brief introduction to Chernoff bounds.

If you’re already familiar with Chernoff bounds, you may prefer to skip directly to the statement and proof of a typical Chernoff bound.

Chernoff bounds (a.k.a. tail bounds, Hoeffding/Azuma/Talagrand inequalities, the method of bounded differences, etc. [1, 2]) are used to bound the probability that some function (typically a sum) of many “small” random variables falls in the tail of its distribution (far from its expectation).
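
One typical form, stated here for orientation (the constants vary between sources): if X = \sum_i x_i is a sum of independent random variables with each x_i \in [0,1], and \mu \ge \textrm{E}[X], then for any \varepsilon \in (0,1], \textrm{Pr}[X \ge (1+\varepsilon)\mu] \le \exp(-\varepsilon^2\mu/3); symmetrically, if \mu \le \textrm{E}[X], then \textrm{Pr}[X \le (1-\varepsilon)\mu] \le \exp(-\varepsilon^2\mu/2).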

Wald’s equation

Wald’s equation, a form of linearity of expectation for sums with randomly many terms.

Consider a sum \sum_{t=1}^T x_t of random variables, where the number of terms T is itself a random variable. If each term x_t has expectation at most (or at least) \mu, then the expectation of the sum is at most (or at least) \mu\,\textrm{E}[T] (the bound on the expectation of each term, times the expected number of terms). This holds provided the random variables in the sum are bounded above or below and T is a stopping time with finite expectation.
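
For a quick worked instance: roll a fair die repeatedly, stopping after the first six. Each roll has expectation 3.5, the number of rolls T is a stopping time with \textrm{E}[T] = 6, and the rolls are bounded, so Wald’s equation gives \textrm{E}[\sum_{t=1}^T x_t] = 3.5 \cdot 6 = 21.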

Wald’s equation for dependent increments

A variant of Wald’s equation to use when the expected increment depends on the sum so far.
Wald’s equation applies to any sequence that, with each step, increases (or decreases) in expectation by a constant additive amount. What about sequences where the expected change with each step depends on the current value? For example, suppose Alice starts with n coins. In each round t=1,2,\ldots,T, she flips each remaining coin and discards those that come up tails. She stops once all coins are discarded. What is the expected number of rounds?
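
(The answer turns out to be roughly \log_2 n.) Here is a small simulation as a sanity check; the function names and parameters are illustrative only.

    import random

    def rounds_until_no_coins_left(n):
        # One trial: flip each remaining coin; keep heads, discard tails.
        coins, rounds = n, 0
        while coins > 0:
            coins = sum(1 for _ in range(coins) if random.random() < 0.5)
            rounds += 1
        return rounds

    def estimate_expected_rounds(n, trials=10000):
        return sum(rounds_until_no_coins_left(n) for _ in range(trials)) / trials

    # For example, estimate_expected_rounds(1000) comes out close to
    # log2(1000), i.e. a little over 10.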

Markov bound for super-martingales

For any non-negative super-martingale, the probability that its maximum \max_t X_t ever exceeds a given value c is at most \textrm{E}[X_0]/c.

The Markov bound plays a fundamental role in the following sense: many probabilistic proofs, including, for example, the proof of the Chernoff bound, rely ultimately on the Markov bound. This note discusses a bound that plays a role similar to the Markov bound in a particular important scenario: when analyzing the maximum value achieved by a given non-negative super-martingale.

Here’s a simple example. Alice goes to the casino with $1. At the casino, she plays the following game repeatedly: she bets half her current balance on a fair coin flip. (For example, on the first flip, she bets 50 cents, so she wins 50 cents with probability 1/2 and loses 50 cents with probability 1/2.) Will Alice’s winnings ever reach $10 or more? The bound here says this happens with probability at most 1/10.
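
Alice’s balance is a non-negative martingale (hence also a super-martingale) with X_0 = 1, so the bound applies with c = 10. Below is a small simulation as a sanity check; the truncation at a fixed number of rounds is an assumption (by then the balance is almost surely far below $10).

    import random

    def ever_reaches(target=10.0, start=1.0, max_rounds=200):
        # One trial of Alice's game: repeatedly bet half the balance on a
        # fair coin flip; report whether the balance ever reaches target.
        balance = start
        for _ in range(max_rounds):
            if balance >= target:
                return True
            bet = balance / 2
            balance += bet if random.random() < 0.5 else -bet
        return balance >= target

    # The fraction of trials reaching $10 comes out below 1/10 = E[X_0]/c.
    trials = 100000
    print(sum(ever_reaches() for _ in range(trials)) / trials)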

“Stopping-time” Chernoff bounds

Extending the Chernoff bound to handle sums with randomly many terms.
Alice goes to the casino and bets on a sequence of fair coin flips. On the tth bet, Alice chooses an amount a_t \in [0,1] to bet: she wins a_t if this flip is heads, otherwise she loses a_t. Since it’s Vegas, she never stops playing. Fix any \varepsilon > 0. Let W_t be the sum of bets won after the tth bet. Let L_t be the sum of bets lost after the tth bet. Will Alice ever reach a time t such that W_t/(1+\varepsilon) - L_t/(1-\varepsilon) \ge \varepsilon\mu? The bound below says that the probability that she does is less than \exp(-\varepsilon^2\mu).

Expected maximum (or minimum) of many sums

Bounds on the expected maximum (or minimum) among a collection of sums.

It can be technically convenient to work with expectations directly, instead of working with probabilities. Here, given a collection of sums of 0/1 random variables, we bound the expected maximum (or minimum) sum in the collection.

For example, suppose Alice throws balls randomly into n bins just until the first bin has n balls. The bound says that the expected maximum number of balls in any bin will be at most n+2\sqrt{n\ln n}+\ln n. Similarly, the expected minimum number of balls in any bin will be at least n-2\sqrt{n\ln n}.

The bound differs from Chernoff in a few ways:

  • it bounds the expected maximum or minimum (as opposed to the probability of a large deviation),

  • the sums can have randomly many terms, so they don’t have to be concentrated around their means.

Expected deviation of a sum

Bounds on the expected deviation of a sum from a threshold.

Here are bounds on the expected deviation of a sum of 0/1-random variables above or below some threshold (typically near its mean).

For example, suppose Alice flips a fair coin n times. She pays Bob $1 for each head after the first (1+\varepsilon)n/2 heads (if any). What is her expected payment to Bob? The bounds here say: at most \varepsilon^{-1}\exp(-\varepsilon^2 n/6). For example, if \varepsilon = \Omega(1/\sqrt n), the expected payment is O(\sqrt n). If \varepsilon = \sqrt{6c\ln(n)/n}, the expected payment is O(1\,/\,n^{c-1}\sqrt{\ln n}). In general the expectation is about \varepsilon^{-1} times the probability (according to Chernoff) that the sum exceeds its mean by a factor of 1+\varepsilon.
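
Here is a small simulation of the coin-flipping example (parameters and function name are illustrative); the empirical payment can be compared against the bound \varepsilon^{-1}\exp(-\varepsilon^2 n/6).

    import math, random

    def average_payment(n, eps, trials=20000):
        # Alice flips n fair coins and pays $1 per head beyond (1+eps)n/2.
        threshold = (1 + eps) * n / 2
        total = 0.0
        for _ in range(trials):
            heads = sum(random.getrandbits(1) for _ in range(n))
            total += max(0.0, heads - threshold)
        return total / trials

    # e.g. average_payment(1000, 0.1) versus the bound
    # (1 / 0.1) * math.exp(-0.1**2 * 1000 / 6), which is about 1.9.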

modeling Set Cover and Multicommodity Flow

Set Cover and Multicommodity Flow as (integer) linear programs.
Modeling a problem as a linear program or integer linear program is a basic skill. Here are two examples.
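
For example, in Set Cover we are given elements e and a family of sets s, each with a cost c_s (take c_s = 1 in the unweighted case), and we want a minimum-cost collection of sets covering every element. As an integer linear program: minimize \sum_s c_s x_s subject to \sum_{s \ni e} x_s \ge 1 for each element e, with x_s \in \{0,1\} for each set s. The linear-program relaxation replaces x_s \in \{0,1\} by x_s \ge 0.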

rounding an LP relaxation

A simple example of computing an approximate solution by rounding the solution to a linear-program relaxation.

The basic paradigm:

  1. Model your problem as an integer linear program.

  2. Solve its linear program relaxation.

  3. Somehow round the solution x^* of the relaxed problem to get a solution \tilde x of the original problem.

The rounding step is typically most easily done with a so-called randomized rounding scheme.

Rounding a relaxation is one of two standard ways to use linear-programming relaxations to design approximation algorithms. (The other way is the primal-dual method.)
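
Here is a minimal end-to-end sketch of the three steps for (unweighted) Set Cover, assuming SciPy’s linprog is available for step 2; the data representation and the particular rounding scheme in step 3 (O(\ln n) independent passes, including each set with probability equal to its fractional value, followed by a patch-up pass for any uncovered element) are illustrative.

    import math, random
    import numpy as np
    from scipy.optimize import linprog

    def set_cover_by_lp_rounding(sets, universe, seed=0):
        # sets: list of sets whose union is `universe`.
        # Returns the indices of a cover.
        random.seed(seed)
        m, n = len(sets), len(universe)
        elements = list(universe)

        # Steps 1 and 2: the LP relaxation -- minimize sum_s x_s subject to
        # sum over sets s containing e of x_s >= 1 for each element e, and
        # 0 <= x_s <= 1.  (Written as A_ub x <= b_ub for linprog.)
        A = np.array([[-1.0 if e in s else 0.0 for s in sets] for e in elements])
        b = -np.ones(n)
        lp = linprog(np.ones(m), A_ub=A, b_ub=b, bounds=[(0, 1)] * m, method="highs")
        x = lp.x

        # Step 3: randomized rounding -- O(ln n) independent passes, each
        # including set s with probability x[s].
        chosen = set()
        for _ in range(2 * math.ceil(math.log(max(n, 2))) + 1):
            chosen |= {s for s in range(m) if random.random() < x[s]}

        # Patch up any element left uncovered (unlikely), to ensure a cover.
        covered = set()
        for s in chosen:
            covered |= sets[s]
        for e in universe:
            if e not in covered:
                s = next(i for i in range(m) if e in sets[i])
                chosen.add(s)
                covered |= sets[s]
        return chosen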

Randomized rounding

Randomly rounding a fractional LP solution to an integer solution.
The idea, introduced by Raghavan and Thompson in 1987, is to use the probabilistic method to round the solution of a linear program, converting it into an approximately optimal integer solution [3]. It’s a broadly useful technique. For many problems, randomized rounding yields algorithms with optimal approximation ratios (assuming P\neq NP). This note describes randomized rounding and gives a few examples.
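
For the Set Cover rounding sketched in the previous section, the standard calculation is short: in a single rounding pass, an element e is left uncovered with probability \prod_{s \ni e}(1 - x^*_s) \le \exp(-\sum_{s \ni e} x^*_s) \le 1/e, since the LP constraint guarantees \sum_{s \ni e} x^*_s \ge 1. After c\ln n independent passes, e remains uncovered with probability at most n^{-c}, so by the union bound every element is covered with probability at least 1 - n^{1-c}, while the expected number of sets chosen is at most c\ln n times the LP optimum, hence at most c\ln n times the optimum cover.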

Set Cover / Wolsey’s generalization

Wolsey’s generalization of the greedy Set-Cover algorithm to a large class of problems.
It is natural to ask what general classes of problems the greedy Set-Cover algorithm generalizes to. Here we describe one such class, due to Wolsey (1982), that captures many, but not all, such problems.

Lagrangian relaxation / example

A simple example of a Lagrangian-relaxation algorithm.
The algorithm is for Maximum Multicommodity Flow. It illustrates some prototypical aspects of Lagrangian relaxation.