# Expected maximum (or minimum) of many sums

Bounds on the expected maximum (or minimum) among a collection of sums.

It can be technically convenient to work with expectations directly, instead of working with probabilities. Here, given a collection of sums of 0/1 random variables, we bound the expected maximum (or minimum) sum in the collection.

For example, suppose Alice throws balls randomly into $n$ bins just until the first bin has $T$ balls. The bound says that the expected maximum number of balls in any bin will be at most $T + O(\sqrt{T\ln n} + \ln n)$. Similarly, the expected minimum number of balls in any bin will be at least $T - O(\sqrt{T\ln n} + \ln n)$.
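Here is a small Monte Carlo sanity check of this example (the values of $T$, $n$, and the trial count are arbitrary illustration choices; the bound constants follow Lemma 2 below with $y_t = 1/n$, using that bin 1 receives exactly $T$ balls so $\textrm{E}[\sum_t y_t] = T$):

```python
import math
import random

def throw_until_bin1_has(T, n, rng):
    """Throw balls uniformly at random into n bins until bin 1 holds
    T balls; return the final loads of all n bins."""
    loads = [0] * n
    while loads[0] < T:
        loads[rng.randrange(n)] += 1
    return loads

rng = random.Random(0)
T, n, trials = 20, 10, 300
maxima, minima = [], []
for _ in range(trials):
    loads = throw_until_bin1_has(T, n, rng)
    maxima.append(max(loads))
    minima.append(min(loads))
avg_max = sum(maxima) / trials
avg_min = sum(minima) / trials

# Bounds with eps = sqrt(ln(n)/T); each throw adds 1/n in expectation
# to each bin's load, and E[sum_t y_t] = T.
eps = math.sqrt(math.log(n) / T)
upper = (1 + eps) * (T + math.log(n) / eps)
lower = (1 - eps) * T - math.log(n) / eps
assert lower <= avg_min <= avg_max <= upper
```

The empirical averages sit comfortably inside the two bounds for these parameters.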

The bound differs from Chernoff in a few ways:

• it bounds the expected maximum or minimum (as opposed to the probability of a large deviation),

• the sums can have a random number of terms, so they need not be concentrated around their means.

# Warm-up: sums with a fixed number of terms

As a warm-up, here are bounds directly analogous to standard Chernoff bounds, in that they concern sums with a fixed number of terms.

Lemma 1 (expected maximum or minimum of many sums).

For each $j\in[m]$, let $X^j = \sum_{t=1}^T x^j_t$ be a sum of independent random variables, where each $x^j_t$ is in $[0,1]$ and $T$ is fixed. Let $\varepsilon \in (0,1)$.

(a) If $\textrm{E}[x^j_t] \le \mu$ for all $j\in[m]$ and all $t\in[T]$, then

$\textstyle \textrm{E}\left[\max _{j\in [m]} X^ j\right] ~ \lt ~ (1+\varepsilon )(\mu T + \varepsilon ^{-1}\ln m).$

Taking $\varepsilon = \min\big\{1/2,\ \sqrt{\ln(m)/(\mu T)}\big\}$, the expected maximum is less than $\mu T + O(\sqrt{\mu T \ln m} + \ln m)$.

(b) If $\textrm{E}[x^j_t] \ge \mu$ for all $j\in[m]$ and all $t\in[T]$, then

$\textstyle \textrm{E}\left[\min _{j\in [m]} X^ j\right] ~ \gt ~ (1-\varepsilon )\mu T - \varepsilon ^{-1}\ln m.$

Taking $\varepsilon$ as in (a), the expected minimum is more than $\mu T - O(\sqrt{\mu T \ln m} + \ln m)$.
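To see the warm-up bounds in action, here is a small numerical check; the parameters $m$, $T$, $\mu$, $\varepsilon$, and the choice of Bernoulli terms are arbitrary illustration choices, not from the lemma:

```python
import math
import random

rng = random.Random(1)
m, T, mu, eps = 50, 100, 0.3, 0.5   # arbitrary illustration parameters
trials = 200

sum_max = sum_min = 0.0
for _ in range(trials):
    # m independent sums, each of T independent Bernoulli(mu) terms,
    # so E[x^j_t] = mu exactly (satisfying the hypotheses of both (a) and (b))
    sums = [sum(rng.random() < mu for _ in range(T)) for _ in range(m)]
    sum_max += max(sums)
    sum_min += min(sums)
emp_max, emp_min = sum_max / trials, sum_min / trials

upper = (1 + eps) * (mu * T + math.log(m) / eps)   # bound (a)
lower = (1 - eps) * mu * T - math.log(m) / eps     # bound (b)
assert lower < emp_min <= emp_max < upper
```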

# Sums with a random number of terms

The next bound applies to a system that goes through a random sequence of states $S_1, S_2, \ldots, S_T$, where $T$ is a stopping time with finite expectation. At each time $t$, the current state determines the value $y_t$ and the conditional distribution of each $x^j_t$ for $j\in[m]$, where each $x^j_t$ is in $[0,1]$ and $\textrm{E}[x^j_t \mid S_t] \le y_t$ for (c) (or $\textrm{E}[x^j_t \mid S_t] \ge y_t$ for (d)). The bound shows that the expectation of $\max_{j\in[m]} \sum_{t=1}^T x^j_t$ (or $\min_{j\in[m]} \sum_{t=1}^T x^j_t$) is close to the expectation of $\sum_{t=1}^T y_t$.

Lemma 2 (expected maximum or minimum of many sums).

Given the conditions above, let $\varepsilon \in (0,1)$.

(c) If $\textrm{E}[x^j_t \mid S_t] \le y_t$ for all $j\in[m]$ and all $t\le T$, then

$\textstyle \textrm{E}\Big[\, {\max _{j\in [m]}}\, \sum _{t=1}^ T x^ j_ t\, \Big] ~ \le ~ (1+\varepsilon ) \big (\textrm{E}\big [\sum _{t=1}^ T y_ t\big ] \, +\, \varepsilon ^{-1}\ln m\big ).$

(d) If $\textrm{E}[x^j_t \mid S_t] \ge y_t$ for all $j\in[m]$ and all $t\le T$, then

$\textstyle \textrm{E}\Big[\, {\min _{j\in [m]}}\, \sum _{t=1}^ T x^ j_ t\, \Big] ~ \ge ~ (1-\varepsilon )\textrm{E}\big [\sum _{t=1}^ T y_ t\big ] \, -\, \varepsilon ^{-1}\ln m. ~ ~$

The constants in the bounds are not optimized.
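The following sketch illustrates Lemma 2 with a genuinely random number of terms: each trial stops at an independent geometric stopping time, and bound (c) is checked against the sample averages. All parameters are arbitrary illustration choices:

```python
import math
import random

rng = random.Random(3)
m, p, stop_p, eps = 20, 0.3, 0.1, 1.0   # arbitrary parameters
trials = 500

avg_max = avg_y = 0.0
for _ in range(trials):
    sums = [0] * m
    y_sum = 0.0
    while True:
        # each sum gets an independent Bernoulli(p) term; y_t = p, so
        # E[x^j_t | state] = y_t, as required by (c) and (d)
        for j in range(m):
            sums[j] += rng.random() < p
        y_sum += p
        if rng.random() < stop_p:   # stop after this step (a stopping time)
            break
    avg_max += max(sums)
    avg_y += y_sum
avg_max /= trials
avg_y /= trials

# bound (c): E[max_j sum_t x^j_t] <= (1+eps)(E[sum_t y_t] + ln(m)/eps)
assert avg_max <= (1 + eps) * (avg_y + math.log(m) / eps)
```

The number of terms per trial has mean $1/\texttt{stop\_p} = 10$ but fluctuates widely, so the individual sums are not concentrated; the bound still holds on average.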

# Some pessimistic estimators for the bound (c)

(Right now we’ve just recorded pessimistic estimators for (c). We’ll add the rest here as needed.)

The bound (c) relates $\textrm{E}\big[\max_{j\in[m]} \sum_{t=1}^T x^j_t\big]$ and $\textrm{E}\big[\sum_{t=1}^T y_t\big]$. In a typical use of bound (c), we know a bound on one of the two quantities, and we use (c) to bound the expectation of the other. We start with pessimistic estimators for these two cases. Let $X^j_t = \sum_{s=1}^t x^j_s$ and $Y_t = \sum_{s=1}^t y_s$. Let $\phi_t = \log_{1+\varepsilon} \sum_{j=1}^m (1+\varepsilon)^{X^j_t}$.

1. Consider first the case that we start with an upper bound $U \ge \sum_{t=1}^T y_t$ (holding in every outcome), and we use (c) to upper-bound $\textrm{E}\big[\max_{j\in[m]} \sum_{t=1}^T x^j_t\big]$ by $(1+\varepsilon)(U + \varepsilon^{-1}\ln m)$.

Given the outcomes of the first $t$ iterations (in particular, the values of $Y_t$ and of $X^j_t$ for all $j\in[m]$), the following pessimistic estimator for the conditional expectation of $\max_{j\in[m]} X^j_T$ is implicit in the proof of (c):

\begin{equation} \label{pe:c:U}\textstyle \psi _ t ~ =~ \phi _ t + (1+\varepsilon )(U - Y_ t) ~ =~ \log _{1+\varepsilon }\sum _{j=1}^ m (1+\varepsilon )^{X^ j_ t} ~ ~ +~ (1+\varepsilon )(U - Y_ t). \end{equation}

The initial value, $\log_{1+\varepsilon} m + (1+\varepsilon)U$, is less than $(1+\varepsilon)(U + \varepsilon^{-1}\ln m)$, as desired. The final value is an upper bound on the maximum in question (given that $Y_T \le U$). In a given iteration $t$, letting $\omega_j = (1+\varepsilon)^{X^j_{t-1}}$, the increase is at most

\begin{equation} \label{pe:c:U:delta}\textstyle (1+\varepsilon ) \sum _ j x^ j_ t \omega _ j/\sum _ j \omega _ j ~ -~ (1+\varepsilon )y_ t, \end{equation}

which is non-positive in conditional expectation, because $\textrm{E}[x^j_t \mid S_t] \le y_t$ while each $\omega_j$ is determined by the previous iterations.
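In fact, the per-iteration bound on the increase of $\psi_t$ by Expression \eqref{pe:c:U:delta} is a deterministic inequality, holding in every outcome (via $(1+\varepsilon)^x \le 1+\varepsilon x$ for $x\in[0,1]$ and $\ln(1+\varepsilon) \ge \varepsilon/(1+\varepsilon)$); only its non-positivity is in expectation. A sketch that checks the deterministic inequality step by step on random data (the distributions and the value of $U$ are arbitrary):

```python
import math
import random

rng = random.Random(2)
m, eps, U = 8, 0.7, 60.0   # arbitrary; U plays the role of the a-priori bound
X = [0.0] * m              # partial sums X^j_t
Y = 0.0                    # partial sum Y_t

def psi():
    phi = math.log(sum((1 + eps) ** x for x in X), 1 + eps)
    return phi + (1 + eps) * (U - Y)

for t in range(200):
    w = [(1 + eps) ** x for x in X]          # omega_j, from iterations < t
    y = rng.random() / 2                     # y_t in [0, 1/2]
    xs = [rng.uniform(0, 2 * y) for _ in range(m)]   # each in [0,1], mean y
    before = psi()
    for j in range(m):
        X[j] += xs[j]
    Y += y
    increase = psi() - before
    cap = (1 + eps) * sum(x * wj for x, wj in zip(xs, w)) / sum(w) - (1 + eps) * y
    assert increase <= cap + 1e-9            # the displayed bound caps the increase
```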

2. Consider next the case that we start with a lower bound $L$ on the maximum (that is, $L \le \max_{j\in[m]} \sum_{t=1}^T x^j_t$ in every outcome), and we use (c) to lower-bound $\textrm{E}\big[\sum_{t=1}^T y_t\big]$ by $(1+\varepsilon)^{-1}L - \varepsilon^{-1}\ln m$.

Given the outcomes of the first $t$ iterations (in particular, the values of $Y_t$ and of $X^j_t$ for all $j\in[m]$), the following pessimistic estimator for the conditional expectation of $Y_T$ is implicit in the proof of (c):

\begin{equation} \label{pe:c:L}\textstyle \psi _ t ~ =~ Y_ t + (1+\varepsilon )^{-1}\big (L - \log _{1+\varepsilon }\sum _{j=1}^ m (1+\varepsilon )^{X^ j_ t}\big ). \end{equation}

The initial value, $(1+\varepsilon)^{-1}(L - \log_{1+\varepsilon} m)$, is more than $(1+\varepsilon)^{-1}L - \varepsilon^{-1}\ln m$, as desired. The final value is a lower bound on $Y_T$ (given that the maximum is at least $L$). In a given iteration $t$, letting $\omega_j = (1+\varepsilon)^{X^j_{t-1}}$, the increase is at least

\begin{equation} \label{pe:c:L:delta}\textstyle y_ t - \sum _ j x^ j_ t \omega _ j/\sum _ j \omega _ j, \end{equation}

which is non-negative in conditional expectation, again because $\textrm{E}[x^j_t \mid S_t] \le y_t$.

3. Or, it may be the case that we have no a-priori upper bound $U$ or lower bound $L$. Rather, we simply want an outcome where the inequality in (c) holds without the expectations: $\max_{j\in[m]} \sum_{t=1}^T x^j_t \le (1+\varepsilon)\big(\sum_{t=1}^T y_t + \varepsilon^{-1}\ln m\big)$. In that case, we can use the pessimistic estimator

\begin{equation} \label{pe:c}\textstyle \psi _ t ~ =~ \log _{1+\varepsilon } \sum _ j (1+\varepsilon )^{X^ j_ t} ~ ~ -~ (1+\varepsilon ) (Y_ t + \varepsilon ^{-1}\ln m). \end{equation}

It is initially negative. If its final value is at most 0, then the desired inequality holds. In a given iteration, the increase is at most Expression \eqref{pe:c:U:delta}, which is non-positive in expectation.
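One use of estimator \eqref{pe:c} is derandomization by the method of conditional probabilities: make each random choice so that $\psi_t$ does not increase, and the desired inequality holds in the final outcome. Here is a toy sketch (the load-balancing setting and all parameters are invented for illustration): at each step one of $m$ bins receives a unit, the randomized process would pick the bin uniformly (so $y_t = 1/m$), and the derandomized process instead picks the bin minimizing $\psi_t$, which here is a least-loaded bin.

```python
import math

def derandomized_loads(m, steps, eps):
    """Method of conditional probabilities driven by the estimator psi:
    assign each of `steps` units to one of m bins so that psi never
    increases, guaranteeing max load <= (1+eps)(steps/m + ln(m)/eps)."""
    X = [0] * m   # bin loads (the partial sums X^j_t)
    Y = 0.0       # Y_t; here y_t = 1/m at every step

    def psi():
        s = sum((1 + eps) ** x for x in X)
        return math.log(s, 1 + eps) - (1 + eps) * (Y + math.log(m) / eps)

    assert psi() < 0                  # initially negative (for m >= 2)
    for _ in range(steps):
        Y += 1.0 / m
        # the choice minimizing psi is a bin with minimum weight
        # omega_j = (1+eps)^X[j], i.e., a least-loaded bin
        j = min(range(m), key=lambda i: X[i])
        X[j] += 1
        assert psi() < 0              # the greedy choice keeps psi negative
    return X

loads = derandomized_loads(m=10, steps=100, eps=0.5)
assert max(loads) <= (1 + 0.5) * (100 / 10 + math.log(10) / 0.5)
```

Because some choice always makes the increase of $\psi_t$ at most its (non-positive) conditional expectation, the greedy choice keeps $\psi_t$ negative, so the final loads satisfy the inequality from (c).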