Markov bound for super-martingales

For any non-negative super-martingale, the probability that its maximum \max _ t X_ t ever exceeds a given value c is at most \textrm{E}[X_0]/c.

The Markov bound plays a fundamental role in the following sense: many probabilistic proofs, including, for example, the proof of the Chernoff bound, rely ultimately on the Markov bound. This note discusses a bound that plays a role similar to the Markov bound in a particular important scenario: when analyzing the maximum value achieved by a given non-negative super-martingale.

Here’s a simple example. Alice goes to the casino with $1. At the casino, she plays the following game repeatedly: she bets half her current balance on a fair coin flip. (For example, on the first flip, she bets 50 cents, so she wins 50 cents with probability 1/2 and loses 50 cents with probability 1/2.) Will Alice’s winnings ever reach $10 or more? The bound here says this happens with probability at most 1/10.

Markov bound for super-martingale maxima

Let X_0,X_1,X_2,\ldots be a non-negative super-martingale — a sequence of non-negative random variables that is non-increasing in expectation: \textrm{E}[X_ t\, |\, X_{t-1}] \le X_{t-1}, and X_ t \ge 0 for each t. The sequence may be finite or infinite.

Consider the event that \max _ t X_ t \ge c, for some given c. To bound the probability of this event, if we have a bound on the expectation of \max _ t X_ t we can use the Markov bound. For example, in the ideal case, if it happens that \textrm{E}[\max _ t X_ t] is at most \textrm{E}[X_0], then the Markov bound implies that the event in question happens with probability at most \textrm{E}[X_0]/c. Although \textrm{E}[\max _ t X_ t] can be much larger than \textrm{E}[X_0], the desired bound in any case:

Lemma (Markov for super-martingale maxima).

Fix any c\ge 0
(a) \Pr [\max _ t X_ t \ge c] \le E[X_0]/c.
(b) \Pr [\max _ t X_ t > c] < E[X_0]/c.

In short, this bound substitutes for the Markov bound to give us a natural bound on the probability of the event \max _ t X_ t\ge c. Note that in most applications X_0 will be a fixed value independent of the outcome.

Proof idea

For our purposes, knowing how to use this bound is more important than knowing how to prove it. Here is the proof just for the sake of completeness.

To get the intuition, consider the following seemingly weaker bound. If T is any stopping time with finite expectation, then by Wald’s equation \textrm{E}[X_ T] is at most \textrm{E}[X_0], so by Markov \Pr [X_ T \ge c] is at most \textrm{E}[X_0]/c. That is, the desired bound holds for the single value X_ T.

The proof of the lemma uses this argument, with T specifically defined to be the first time such that X_ T \ge c (if any, else \infty ). (In the example, this is analogous to Alice quitting as soon as her winnings reach $10.) This T is indeed a stopping time, and, crucially, the event \max _ t X_ t \ge c occurs only if X_ T \ge c. So the bound on \Pr [X_ T \ge c] from the previous paragraph implies the result. A technical obstacle is that T might not have finite expectation, but this is easily overcome via a limit argument.


Pessimistic estimator

(a) Given the value X_{t} at the current time t, as long as the bad event has not yet happened (that is, as long as \forall s \lt t. ~ X_ s \lt c ), the value \phi _ t = X_ t/c is a pessimistic estimator for the conditional probability of the event \exists t.~ X_ t \ge c. The value is initially \textrm{E}[X_0]/c, it is non-increasing in expectation with each step, and, as long as it remains less than 1, the event in question doesn’t happen.

(b) The same \phi _ t is a pessimistic estimator for the event in part (b): the value is initially \textrm{E}[X_0]/c, it is non-increasing in expectation with each step, and, as long as it remains less than or equal to 1, the event in question doesn’t happen.