Wald’s equation
A form of linearity of expectation for sums with randomly many terms.
Consider a sum \sum_{t=1}^T x_t of random variables, where the number of terms T is itself a random variable. If each term x_t has expectation at most (or at least) \mu, then the expectation of the sum is at most (or at least) \mu\,\textrm{E}[T] (the bound on the expectation of each term, times the expected number of terms). This holds provided the random variables in the sum are bounded above or below and T is a stopping time with finite expectation.
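As a concrete illustration (a sketch added here, not taken from the entry; the die-rolling setup is just an assumed example), here is a small numerical check of Wald's equation:

```python
import random

# Roll a fair six-sided die until the first 6 appears, summing all rolls
# (including the final 6).  Each roll has expectation 3.5 and the stopping
# time T has E[T] = 6, so Wald's equation gives E[sum] = 3.5 * E[T] = 21.
def one_trial():
    total = 0
    while True:
        roll = random.randint(1, 6)
        total += roll
        if roll == 6:
            return total

trials = 100_000
print(sum(one_trial() for _ in range(trials)) / trials)  # approximately 21
```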
Wald’s equation for dependent increments
A variant of Wald's equation to use when the expected increment depends on the sum so far.
Wald's equation applies to any sequence that, with each step, increases (or decreases) in expectation by a constant additive amount. What about sequences where the expected change with each step depends on the current value? For example, suppose Alice starts with n coins. In each round t=1,2,\ldots,T, she flips each remaining coin and discards those that come up tails. She stops once all coins are discarded. What is the expected number of rounds?
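A quick simulation of the coin-discarding example (an illustrative sketch added here, not part of the entry):

```python
import random

# Start with n coins; each round, flip every remaining coin and discard those
# that come up tails; count rounds until no coins remain.  Since each round
# removes about half the remaining coins in expectation, the variant of
# Wald's equation suggests roughly log2(n) rounds in expectation.
def rounds_until_empty(n):
    rounds = 0
    while n > 0:
        n = sum(1 for _ in range(n) if random.random() < 0.5)  # heads survive
        rounds += 1
    return rounds

n, trials = 1000, 5_000
avg = sum(rounds_until_empty(n) for _ in range(trials)) / trials
print(avg)  # close to log2(1000) plus a small constant
```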
Markov bound for super-martingales
For any non-negative super-martingale X_0, X_1, X_2, \ldots, the probability that its maximum \max_t X_t ever exceeds a given value c is at most \textrm{E}[X_0]/c.
The Markov bound plays a fundamental role in the following sense: many probabilistic proofs, including, for example, the proof of the Chernoff bound, rely ultimately on the Markov bound. This note discusses a bound that plays a similar role in one particularly important scenario: analyzing the maximum value achieved by a given non-negative super-martingale.
Here’s a simple example. Alice goes to the casino with $1. At the casino, she plays the following game repeatedly: she bets half her current balance on a fair coin flip. (For example, on the first flip, she bets 50 cents, so she wins 50 cents with probability 1/2 and loses 50 cents with probability 1/2.) Will Alice’s winnings ever reach $10 or more? The bound here says this happens with probability at most 1/10.
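A rough check of this example (an illustrative sketch added here; the truncation parameters are arbitrary choices, not from the entry):

```python
import random

# Alice starts with balance 1.0 and repeatedly bets half her current balance
# on a fair coin flip.  Her balance is a non-negative martingale (hence a
# super-martingale), so the bound gives Pr[balance ever reaches 10] <= 1/10.
# Each run is truncated once the balance is negligible or after max_flips
# flips; by then the balance has almost surely drifted toward 0, so the
# truncation barely affects the estimate.
def ever_reaches(target=10.0, max_flips=5_000):
    balance = 1.0
    for _ in range(max_flips):
        bet = balance / 2
        balance += bet if random.random() < 0.5 else -bet
        if balance >= target:
            return True
        if balance < 1e-9:
            return False
    return False

trials = 50_000
print(sum(ever_reaches() for _ in range(trials)) / trials)  # a bit below 0.1
```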
“Stopping-time” Chernoff bounds
Extending the Chernoff bound to handle sums with randomly many terms.
Alice goes to the casino and places bets on a sequence of fair coin flips. On the tth bet, Alice chooses an amount a_t\in[0,1] to bet: she wins a_t if this flip is heads, otherwise she loses a_t. Since it's Vegas, she never stops playing. Fix any \varepsilon > 0 and any \mu > 0. Let W_t be the sum of bets won after the tth bet. Let L_t be the sum of bets lost after the tth bet. Will Alice ever reach a time t such that W_t/(1+\varepsilon) - L_t/(1-\varepsilon) \ge \varepsilon\mu? The bound below says that the probability that she does is less than \exp(-\varepsilon^2\mu).
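A rough empirical check of this bound (an illustrative sketch added here; the constant bet sizes and the values of eps and mu are arbitrary choices, not specified by the entry):

```python
import math
import random

# With every bet a_t = 1, eps = 0.5 and mu = 20, estimate how often
#   W_t/(1+eps) - L_t/(1-eps) >= eps*mu
# holds at ANY time t.  The bound says this happens with probability less
# than exp(-eps^2 * mu) = exp(-5).  We truncate at `horizon` flips; the
# event only becomes less likely as t grows, so the truncation barely matters.
def event_ever_occurs(eps=0.5, mu=20, horizon=500):
    wins = losses = 0.0
    for _ in range(horizon):
        if random.random() < 0.5:
            wins += 1.0
        else:
            losses += 1.0
        if wins / (1 + eps) - losses / (1 - eps) >= eps * mu:
            return True
    return False

trials = 10_000
freq = sum(event_ever_occurs() for _ in range(trials)) / trials
print(freq, "<", math.exp(-(0.5 ** 2) * 20))  # typically far below the bound
```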
Expected maximum (or minimum) of many sums
Bounds on the expected maximum (or minimum) among a collection of sums.
It can be technically convenient to work with expectations directly, instead of working with probabilities. Here, given a collection of sums of 0/1 random variables, we bound the expected maximum (or minimum) sum in the collection.
For example, suppose Alice throws balls randomly into n bins just until the first bin (bin 1, say) contains n balls. The bound says that the expected maximum number of balls in any bin will be at most n+2\sqrt{n\ln n}+\ln n. Similarly, the expected minimum number of balls in any bin will be at least n-2\sqrt{n\ln n}.
The bound differs from Chernoff in a few ways:
it bounds the expected maximum or minimum (as opposed to the probability of a large deviation),
the sums can have randomly many terms, so they don’t have to be concentrated around their means.
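A simulation of the balls-in-bins example above (an illustrative sketch added here, not part of the entry; the values of n and the number of trials are arbitrary):

```python
import math
import random

# Throw balls uniformly at random into n bins until the first bin (index 0
# here) contains n balls, then record the fullest and emptiest bins.  The
# bounds say E[max] <= n + 2*sqrt(n ln n) + ln n and E[min] >= n - 2*sqrt(n ln n).
def one_trial(n):
    bins = [0] * n
    while bins[0] < n:
        bins[random.randrange(n)] += 1
    return max(bins), min(bins)

n, trials = 100, 500
results = [one_trial(n) for _ in range(trials)]
avg_max = sum(mx for mx, _ in results) / trials
avg_min = sum(mn for _, mn in results) / trials
print(avg_max, "<=", n + 2 * math.sqrt(n * math.log(n)) + math.log(n))
print(avg_min, ">=", n - 2 * math.sqrt(n * math.log(n)))
```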
Expected deviation of a sum
Bounds on the expected deviation of a sum from a threshold.
Here are bounds on the expected deviation of a sum of 0/1-random variables above or below some threshold (typically near its mean).
For example, suppose Alice flips a fair coin n times. She pays Bob $1 for each head after the first (1+\varepsilon)n/2 heads (if any). What is her expected payment to Bob? The bounds here say: at most \varepsilon^{-1}\exp(-\varepsilon^2 n/6). For example, if \varepsilon=\Omega(1/\sqrt n), the expected payment is O(\sqrt n). If \varepsilon=\sqrt{6c\ln(n)/n}, the expected payment is O(1\,/\,n^{c-1/2}\sqrt{\ln n}). In general, the expectation is about \varepsilon^{-1} times the probability (according to Chernoff) that the sum exceeds its mean by a factor of 1+\varepsilon.
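As a quick check of the second estimate (arithmetic added here, not from the entry), substituting \varepsilon=\sqrt{6c\ln(n)/n} into the stated bound gives
\varepsilon^{-1}\exp(-\varepsilon^2 n/6) = \sqrt{\tfrac{n}{6c\ln n}}\,\exp(-c\ln n) = \frac{\sqrt{n}}{\sqrt{6c\ln n}}\cdot n^{-c} = O\!\left(\frac{1}{n^{c-1/2}\sqrt{\ln n}}\right).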