Extending the Chernoff bound to handle sums with randomly many terms.

Alice goes to the casino and places bets on a sequence of fair coin flips. On the $t$th bet, Alice chooses an amount $a_ t\in [0,1]$ to bet: she wins $a_ t$ if the flip is heads, and otherwise she loses $a_ t$ . Since it’s Vegas, she never stops playing. Fix any $\mu \gt 0$ and any $\varepsilon \gt 0$ with $\varepsilon \le 1$ . Let $W_ t$ be the total of the bets she has won among the first $t$ bets, and let $L_ t$ be the total of the bets she has lost among the first $t$ bets. Will Alice ever reach a time $t$ such that $W_ t/(1+\varepsilon ) - L_ t/(1-\varepsilon ) \ge \varepsilon \mu$ ? The bound below says that the probability that she does is less than $\exp (-\varepsilon ^2\mu )$ .
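The casino question can be explored numerically. The following is a minimal Monte Carlo sketch, not part of the original notes. It assumes a hypothetical betting strategy (each bet drawn uniformly from $[0,1]$, independent of the past) and a finite horizon, so it can only under-estimate the probability that Alice ever reaches the threshold. The names `ever_hits` and `estimate_hit_probability` are illustrative, and the code requires $\varepsilon \lt 1$ to avoid dividing by zero.

```python
import math
import random


def ever_hits(bets, flips, eps, mu):
    """Return True if W_t/(1+eps) - L_t/(1-eps) >= eps*mu at some step t."""
    won = 0.0   # W_t: total of the bets won so far
    lost = 0.0  # L_t: total of the bets lost so far
    for bet, heads in zip(bets, flips):
        if heads:
            won += bet
        else:
            lost += bet
        if won / (1 + eps) - lost / (1 - eps) >= eps * mu:
            return True
    return False


def estimate_hit_probability(eps, mu, horizon=200, trials=2000, seed=0):
    """Estimate the chance of hitting the threshold within `horizon` bets.

    Assumption: bets are uniform in [0, 1] and do not adapt to past flips.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        bets = [rng.random() for _ in range(horizon)]
        flips = [rng.random() < 0.5 for _ in range(horizon)]
        if ever_hits(bets, flips, eps, mu):
            hits += 1
    return hits / trials


eps, mu = 0.5, 2.0
estimate = estimate_hit_probability(eps, mu)
bound = math.exp(-eps ** 2 * mu)
print(f"estimated probability {estimate:.3f}, bound {bound:.3f}")
```

The estimate should come out below the bound $\exp (-\varepsilon ^2\mu )$ . The bound itself holds for any betting strategy, including ones that adapt to earlier flips, which this sketch does not try to model.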


“Stopping-time” Chernoff bound

The bound applies to a system that goes through a random sequence of states $S_1,S_2,S_3,\ldots$ . Each state $S_ t$ determines two values, $x_ t$ and $y_ t$ , each in $[0,1]$ , such that, conditioned on the previous state, the expectation of $x_ t$ is at most that of $y_ t$ :

\[ \textrm{E}[x_ t \, |\, S_{t-1}]~ \le ~ \textrm{E}[y_ t \, |\, S_{t-1}]. \]

The bound shows that $\sum _{t=1}^ T x_ t$ is unlikely to ever significantly exceed $\sum _{t=1}^ T y_ t$ (for any $T$ ).
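As a check, here is one way the casino example fits these conditions. The encoding is an assumption made for illustration (the text does not spell it out): let $S_ t$ record the first $t$ bets and flips, let Alice's bet $a_ t$ be determined by $S_{t-1}$ , and set

\[ x_ t = \begin{cases} a_ t & \text{if flip } t \text{ is heads} \\ 0 & \text{otherwise,} \end{cases} \qquad y_ t = \begin{cases} a_ t & \text{if flip } t \text{ is tails} \\ 0 & \text{otherwise.} \end{cases} \]

Then $x_ t,y_ t\in [0,1]$ , and because the coin is fair,

\[ \textrm{E}[x_ t \, |\, S_{t-1}] ~ =~ a_ t/2 ~ =~ \textrm{E}[y_ t \, |\, S_{t-1}], \]

so the condition holds with equality. Under this encoding $\sum _{s=1}^ t x_ s = W_ t$ and $\sum _{s=1}^ t y_ s = L_ t$ .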

Lemma (stopping-time Chernoff).

Given the conditions above, let $\varepsilon ,\mu \gt 0$ with $\varepsilon \le 1$ . The probability that

\begin{equation} \label{event2} \textstyle \exists t.~ ~ \sum _{s=1}^ t x_ s/(1+\varepsilon ) ~ -~ \sum _{s=1}^ t y_ s/(1-\varepsilon ) ~ \ge ~ \varepsilon \mu \end{equation}

is less than $\exp (-\varepsilon ^2\mu )$ .

(Various bounds of this kind are possible. The constants in this bound are not optimized.)

Proof

The proof adapts the standard Chernoff proof.


Define $\phi _ t ~ =~ (1+\varepsilon )^{\sum _{s=1}^ t x_ s}(1-\varepsilon )^{\sum _{s=1}^ t y_ s}e^{-\varepsilon ^2\mu }.$

If \eqref{event2} happens for some $t$ , then by calculation¹ $\phi _ t$ exceeds 1. So it suffices to bound the probability that $\phi _ t$ ever exceeds 1. To do that, we observe that $\phi$ is a non-negative super-martingale. This proves the lemma, because then, by the Markov-type bound for non-negative super-martingales, $\Pr [\exists t. ~ \phi _ t \gt 1]$ is less than $\phi _0 = \exp (-\varepsilon ^2\mu )$ .
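The first claim (the event forces $\phi _ t \gt 1$ ) can be sanity-checked numerically. The sketch below is illustrative only: it evaluates $\phi$ on a grid of possible values for the two sums and counts grid points where the event holds but $\phi$ does not exceed 1. The function names and the grid parameters are assumptions, and a grid check is no substitute for the calculation in the footnote.

```python
import math


def phi(X, Y, eps, mu):
    """phi_t as a function of X = sum of x_s and Y = sum of y_s."""
    return (1 + eps) ** X * (1 - eps) ** Y * math.exp(-eps ** 2 * mu)


def event_holds(X, Y, eps, mu):
    """The event from the lemma: X/(1+eps) - Y/(1-eps) >= eps*mu."""
    return X / (1 + eps) - Y / (1 - eps) >= eps * mu


def count_violations(eps, mu, steps=60, max_sum=30.0):
    """Count grid points where the event holds but phi is not above 1."""
    violations = 0
    for i in range(steps + 1):
        for j in range(steps + 1):
            X = max_sum * i / steps
            Y = max_sum * j / steps
            if event_holds(X, Y, eps, mu) and not phi(X, Y, eps, mu) > 1:
                violations += 1
    return violations


print(count_violations(0.5, 2.0), count_violations(0.1, 5.0))
```

Both counts should be zero, in line with the claim.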

By inspection each $\phi _ t$ is non-negative. To see that $\phi$ is a super-martingale, note that, using $\varepsilon ,x_ t,y_ t\in [0,1]$ ,

\[ \frac{\phi _{t}}{\phi _{t-1}} ~ =~ (1+\varepsilon )^{x_ t}(1-\varepsilon )^{y_ t} ~ \le ~ (1+\varepsilon x_ t)(1-\varepsilon {y_ t}) ~ \le ~ 1+\varepsilon x_ t-\varepsilon {y_ t}. \]

Since $\textrm{E}[x_ t-y_ t\, |\, S_{t-1}] \le 0$ , the expectation of the right-hand side (conditioned on $S_{t-1}$ ) is at most 1.
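For completeness, here is one way to justify the two inequalities in the display above; this is a standard convexity argument, filled in here rather than taken from the text. For fixed $\varepsilon \in [0,1]$ , the function $z\mapsto (1+\varepsilon )^ z$ is convex, so on $[0,1]$ it lies below the chord through its values at $z=0$ and $z=1$ . The same holds for $z\mapsto (1-\varepsilon )^ z$ . Hence, for $x_ t,y_ t\in [0,1]$ ,

\[ (1+\varepsilon )^{x_ t} ~ \le ~ (1-x_ t)\cdot 1 + x_ t(1+\varepsilon ) ~ =~ 1+\varepsilon x_ t, \qquad (1-\varepsilon )^{y_ t} ~ \le ~ (1-y_ t)\cdot 1 + y_ t(1-\varepsilon ) ~ =~ 1-\varepsilon y_ t. \]

All four quantities are non-negative, so the two bounds can be multiplied, giving the first inequality. The second follows by expanding the product:

\[ (1+\varepsilon x_ t)(1-\varepsilon y_ t) ~ =~ 1+\varepsilon x_ t-\varepsilon y_ t-\varepsilon ^2 x_ t y_ t ~ \le ~ 1+\varepsilon x_ t-\varepsilon y_ t. \]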

Pessimistic estimator

Note that $\phi _ t$ in the proof above also serves as a pessimistic estimator for the probability of event \eqref{event2}. That is, (a) its initial value, $\phi _0$ , is $\exp (-\varepsilon ^2\mu )$ ; (b) for each $t$ , given the current state $S_ t$ , the conditional probability of \eqref{event2} is less than $\phi _ t$ (so the event won’t happen as long as the algorithm keeps $\phi _ t \le 1$ ); and (c) $\phi$ is a super-martingale.
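To illustrate how such an estimator might be used, here is a minimal sketch of the method of conditional expectations. It rests on an assumption not stated in the text: at each step the algorithm may pick one of finitely many $(x_ t,y_ t)$ options, and a uniformly random pick would satisfy $\textrm{E}[x_ t] \le \textrm{E}[y_ t]$ . Under that assumption the average of $\phi _ t/\phi _{t-1}$ over the options is at most 1, so the option minimizing the ratio never increases $\phi$ . The names `step_ratio` and `greedy_choices` and the sample options are hypothetical.

```python
import math


def step_ratio(x, y, eps):
    """phi_t / phi_{t-1} when step t contributes the pair (x, y)."""
    return (1 + eps) ** x * (1 - eps) ** y


def greedy_choices(options_per_step, eps, mu):
    """At each step pick the (x, y) option that minimizes phi.

    Assumption: the options at each step average to E[x] <= E[y].
    """
    phi = math.exp(-eps ** 2 * mu)  # phi_0
    chosen = []
    for options in options_per_step:
        x, y = min(options, key=lambda xy: step_ratio(xy[0], xy[1], eps))
        phi *= step_ratio(x, y, eps)
        chosen.append((x, y))
    return chosen, phi


eps, mu = 0.5, 2.0
options_per_step = [
    [(1.0, 0.0), (0.0, 1.0)],
    [(0.8, 0.2), (0.1, 0.9)],
    [(0.5, 0.5), (0.5, 0.5)],
]
chosen, final_phi = greedy_choices(options_per_step, eps, mu)
print(chosen, final_phi)
```

Since $\phi$ starts below 1 and never increases under these choices, it stays below 1, and so the event \eqref{event2} does not occur along the chosen path.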

Related

Azuma’s inequality (wikipedia)

Footnotes

¹ Let $X=\sum _{s=1}^ t x_ s$ and $Y=\sum _{s=1}^ t y_ s$ . \eqref{event2} implies $\exp (X \varepsilon / (1+\varepsilon ) - Y \varepsilon /(1-\varepsilon )) ~ \ge ~ \exp (\varepsilon ^2 \mu )$ . Using $e^{z/(1+z)} \lt 1+z$ for $z\in \{ \varepsilon ,-\varepsilon \}$ gives $(1+\varepsilon )^ X(1-\varepsilon )^ Y ~ \gt ~ \exp (\mu \varepsilon ^2).$
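Spelled out step by step (an expansion of the calculation above, assuming $\varepsilon \lt 1$ so that every term is defined): multiplying \eqref{event2} by $\varepsilon$ gives

\[ \frac{\varepsilon X}{1+\varepsilon } - \frac{\varepsilon Y}{1-\varepsilon } ~ \ge ~ \varepsilon ^2\mu . \]

Taking $z=\varepsilon$ and $z=-\varepsilon$ in $e^{z/(1+z)} \lt 1+z$ gives

\[ e^{\varepsilon /(1+\varepsilon )} ~ \lt ~ 1+\varepsilon \qquad \text{and}\qquad e^{-\varepsilon /(1-\varepsilon )} ~ \lt ~ 1-\varepsilon . \]

Raising the first to the power $X$ and the second to the power $Y$ , then multiplying,

\[ (1+\varepsilon )^ X(1-\varepsilon )^ Y ~ \ge ~ \exp \Big(\frac{\varepsilon X}{1+\varepsilon } - \frac{\varepsilon Y}{1-\varepsilon }\Big) ~ \ge ~ \exp (\varepsilon ^2\mu ). \]

The first inequality is strict because \eqref{event2} forces $X \ge (1+\varepsilon )\varepsilon \mu \gt 0$ . Multiplying both sides by $e^{-\varepsilon ^2\mu }$ shows $\phi _ t \gt 1$ .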

