# Chernoff bound

A brief introduction to Chernoff bounds.

If you’re already familiar with Chernoff bounds, you may prefer to skip directly to the statement and proof of a typical Chernoff bound.

Chernoff bounds (a.k.a. tail bounds, Hoeffding/Azuma/Talagrand inequalities, the method of bounded differences, etc. [1, 2]) are used to bound the probability that some function (typically a sum) of many “small” random variables falls in the tail of its distribution (far from its expectation).

# The distribution of a sum of “small” random variables

Let $X = X_1 + X_2 + \cdots + X_n$ be a sum of many small random variables. For example, imagine that $X$ is the number of heads in $n$ coin flips. The figure shows the typical distribution of such a sum around its expectation $\mu = \textrm{E}[X]$. For $\varepsilon \le 1$, the sum has probability about $\exp(-\varepsilon^2\mu/3)$ of deviating from $\mu$ by more than a factor of $1+\varepsilon$ (that is, an additive deviation of $\varepsilon\mu$).

For example (taking $\varepsilon = 1/\sqrt{\mu}$) the sum has constant probability of deviating from $\mu$ by an additive $\sqrt{\mu}$.

Or (taking $\varepsilon = \sqrt{3\ln(\mu)/\mu}$), the sum has probability $1/\mu$ of deviating from $\mu$ by more than $\sqrt{3\mu\ln \mu}$.

A typical Chernoff bound quantifies this behavior. It says that the event that $X$ exceeds $1+\varepsilon$ times its expectation $\mu$ is rare: it happens with probability at most $\exp(-\varepsilon^2\mu/3)$. (A similar bound holds for the event that $X$ is below $1-\varepsilon$ times $\mu$.)
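As a sanity check, the bound can be compared against a quick simulation. The sketch below (the function name `tail_vs_bound` and the parameter values are my own, chosen for illustration) estimates the tail probability for coin flips and compares it with $\exp(-\varepsilon^2\mu/3)$:

```python
import math
import random

def tail_vs_bound(n=100, eps=0.2, trials=2000, seed=1):
    """Estimate Pr[X >= (1+eps)*mu] for X = heads in n fair coin flips,
    and return it alongside the Chernoff bound exp(-eps^2 * mu / 3)."""
    random.seed(seed)
    mu = n / 2
    threshold = (1 + eps) * mu
    hits = sum(
        1
        for _ in range(trials)
        if sum(random.getrandbits(1) for _ in range(n)) >= threshold
    )
    return hits / trials, math.exp(-eps * eps * mu / 3)
```

With these parameters the empirical tail frequency comes out well below the bound; Chernoff bounds trade tightness for generality, so some slack is expected.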

# Example applications

Here are a couple of examples using a Chernoff bound.

Example (balls in bins / max bin).

Throw $6\,n\ln n$ balls randomly into $n$ bins. Let $Y$ be the number of balls thrown into a given bin, say the first. Then $Y$ is a sum of independent random $\{0,1\}$-variables:

$Y=\sum _{t} [\text {ball } t \text { goes in the first bin}].$

The expectation of $Y$ is $\mu = \textrm{E}[Y] = 6\ln n$. By the Chernoff bound, for any $\varepsilon \le 1$,

$\Pr [Y \ge (1+\varepsilon )\mu ] \;<\; \exp (- \varepsilon ^2\textrm{E}[Y]/3 ) \;=\; 1/n^{2\varepsilon ^2}.$

Taking $\varepsilon = 1$ gives $\Pr[Y \ge 12\ln n] < 1/n^2$: the probability that a given bin receives at least $12\ln n$ balls is less than $1/n^2$.

The naive union bound then implies that the probability that any of the $n$ bins has more than $12\ln n$ balls is at most $n \cdot 1/n^2 = 1/n$.
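The example can be simulated directly. This sketch (assuming the concrete parameters of the example: $6n\ln n$ balls into $n$ bins, so a per-bin mean of $6\ln n$; the function name is my own) reports the observed maximum load next to the $12\ln n$ threshold:

```python
import math
import random

def max_load(n=200, seed=7):
    """Throw 6*n*ln(n) balls into n bins uniformly at random; return the
    maximum bin load and the threshold 12*ln(n) from the union bound."""
    random.seed(seed)
    counts = [0] * n
    for _ in range(int(6 * n * math.log(n))):
        counts[random.randrange(n)] += 1
    return max(counts), 12 * math.log(n)
```

By the union-bound calculation, the maximum load exceeds the threshold with probability at most $1/n$, so a typical run stays below it.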

Exercise (more balls in bins).

Show that for any constant $c \ge 6$, if $c\,n\ln n$ balls are thrown, then the probability that any bin has more than $2c\ln n$ balls is at most $1/n$.

# Expectations of exponentials

The proof of the Chernoff bound for a sum $X$ works by reasoning about the expectation of an exponential of $X$, that is, $\textrm{E}[c^X]$ for an appropriately chosen base $c$. Here is some intuition for this approach. (For the full proof, go here.)

Recall the humble Markov bound: for any non-negative random variable $Y$ and any $z > 0$, the probability that $Y$ is at least $z$ is at most $\textrm{E}[Y]/z$. This is useful but somewhat weak. Can we strengthen it by transforming $Y$?

To take a concrete example, let $X_t$ be the number of heads in $t$ random coin flips. Consider $\textrm{E}[3^{X_t}]$. Conditioning on the $(t+1)$-st flip (tails leaves $3^{X_{t+1}} = 3^{X_t}$; heads triples it), by direct calculation,

$\textrm{E}\big [3^{X_{t+1}} \big ] \;=\; {\textstyle \frac{1}{2}} \textrm{E}[3^{X_ t}] + {\textstyle \frac{1}{2}} 3 \cdot \textrm{E}[3^{X_ t}] \;=\; 2\cdot \textrm{E}[3^{X_ t}].$

Since $\textrm{E}[3^{X_0}] = 1$, by induction $\textrm{E}[3^{X_t}] = 2^t$.
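The identity can also be checked directly: $X_t$ is Binomial$(t, 1/2)$, so $\textrm{E}[3^{X_t}]$ can be summed over the binomial distribution (a small sketch; the function name is my own):

```python
import math

def expect_3_pow(t):
    """E[3^(X_t)] for X_t ~ Binomial(t, 1/2), summed directly.
    Each term: Pr[X_t = k] * 3^k = C(t, k) / 2^t * 3^k."""
    return sum(math.comb(t, k) * 3**k for k in range(t + 1)) / 2**t
```

By the binomial theorem the sum is $((1+3)/2)^t = 2^t$, matching the induction.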

Does this give anything useful? Applying the Markov bound to $3^{X_t}$ gives

$\Pr [X_ t \ge c] \;=\; \Pr [3^{X_ t} \ge 3^ c] \;\le\; \frac{\textrm{E}[3^{X_ t}]}{3^{c}} \;=\; \frac{2^ t}{3^{c}} \;\approx\; 2^{\,t - 1.58\, c}.$

For example, taking $t = 100$ and $c = 70$, we can conclude that the probability of getting 70 or more heads in 100 flips is at most about 5/10000. Choosing a base other than 3 may give a better bound.
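The arithmetic is easy to verify, and the bound can be compared with the exact binomial tail (a sketch; the variable names are mine):

```python
import math

t, c = 100, 70
# The Markov-on-exponentials bound 2^t / 3^c for c or more heads in t flips.
bound = 2**t / 3**c
# The exact tail probability Pr[X_t >= c] for X_t ~ Binomial(t, 1/2).
exact = sum(math.comb(t, k) for k in range(c, t + 1)) / 2**t
# bound comes out to roughly 5/10000; the exact tail probability is
# smaller still, so the bound is valid but not tight.
```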

The proof of the Chernoff bound generalizes this idea, and optimizes the base of the exponent. Typically, in order to bound the probability of deviating by a factor of $1+\varepsilon$ from the mean, the best base for the exponent is something like $1+\varepsilon$. To prove bounds on deviations below (as opposed to above) the mean, one uses a base less than 1. For most settings, these bounds are tight up to small constant factors in the exponent.
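For intuition about the base optimization, one can scan bases numerically. The sketch below (my own construction, for $X$ = heads in $n$ fair coin flips, where $\textrm{E}[b^X] = ((1+b)/2)^n$) finds the base $b$ minimizing the bound $\textrm{E}[b^X]/b^{(1+\varepsilon)n/2}$ by grid search:

```python
def best_base(n=100, eps=0.2):
    """Find the base b in (1, 3) minimizing the tail bound
    E[b^X] / b^((1+eps)*n/2) for X = heads in n fair coin flips,
    using E[b^X] = ((1+b)/2)^n and a simple grid search."""
    def bound(b):
        return ((1 + b) / 2) ** n / b ** ((1 + eps) * n / 2)
    return min((1 + i / 1000 for i in range(1, 2000)), key=bound)
```

In this fair-coin case a little calculus shows the optimal base is $(1+\varepsilon)/(1-\varepsilon)$, which is $1 + 2\varepsilon + O(\varepsilon^2)$ for small $\varepsilon$, consistent with a base of roughly $1+\varepsilon$.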