
Probability with Measure

Chapter 6 Inequalities for Random Variables \((\Delta )\)

We’ve seen several examples of useful inequalities in Chapters 4 and 5, including Markov’s and Chebyshev’s inequalities (Lemma 4.2.3 and Exercise 5.2), Fatou’s lemma (Lemma 4.6.3) and the Cauchy-Schwarz inequality (Exercise 5.12). These inequalities can all be written in terms of general integrals, as well as in terms of random variables and expectations. In this chapter we will include some further examples of inequalities that are useful in probability theory.

We will look only at inequalities that apply to general random variables. It is worth noting that beyond what is included here there are several more specialized types of inequality (such as those that apply only to martingales, or to Markov chains, or to some particular distributions) that are also well known in probability theory.

6.1 Chernoff bounds \((\Delta )\)

Chernoff bounds are based on applying Markov's inequality to the random variable \(e^{tX}\), where \(X\) is a random variable and \(t\in \R \), and then choosing \(t\) to make the resulting bound as strong as possible. Perhaps surprisingly, it is known that this method gives near-optimal bounds in many situations.

  • Lemma 6.1.1 (Generic Chernoff Bound) Let \(X\) be a random variable and let \(c\in \R \). Then

    • 1. For all \(t>0\) we have \(\P [X\geq c]\leq e^{-tc}\E [e^{tX}]\).

    • 2. For all \(t<0\) we have \(\P [X\leq c]\leq e^{-tc}\E [e^{tX}]\).

Proof: Note that \(e^{tX}\geq 0\) so \(\E [e^{tX}]\) is well defined as a non-negative extended real number. We prove the two claims in turn. For the first, for \(t>0\), from Markov’s inequality (Lemma 4.2.3) we have

\[\P [X\geq c]=\P [e^{tX}\geq e^{tc}]\leq \frac {1}{e^{tc}}\E [e^{tX}]\]

as required. For the second, note that for \(t<0\) we have \(\P [X\leq c]=\P [e^{tX}\geq e^{tc}]\) (the map \(x\mapsto e^{tx}\) is decreasing when \(t<0\), which reverses the inequality), then proceed as before.   ∎

The function \(t\mapsto \E [e^{tX}]\) is known as the moment generating function or m.g.f. of the random variable \(X\). Its value is an extended real number and it is possible that \(\E [e^{tX}]=\infty \) for all \(t\neq 0\). However, in many cases the moment generating function is finite for \(t\) in (at least) some interval \((-\eps ,\eps )\) where \(\eps >0\). That is often enough to derive a Chernoff bound.
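For instance, if \(X\) has a standard normal distribution then \(\E [e^{tX}]=e^{t^2/2}\), which is finite for all \(t\in \R \), and Lemma 6.1.1 gives \(\P [X\geq c]\leq e^{-tc+t^2/2}\) for all \(t>0\). The following is a minimal numerical sketch (in Python; it assumes the scipy library is available, and the values of \(t\) and \(c\) are chosen purely for illustration) comparing this bound to the exact tail probability as \(t\) varies.

    import numpy as np
    from scipy.stats import norm

    # Generic Chernoff bound for a standard normal X:
    # E[e^{tX}] = e^{t^2/2}, so P[X >= c] <= e^{-tc + t^2/2} for every t > 0.
    c = 2.0
    exact = norm.sf(c)  # exact tail probability P[X >= c]

    for t in [0.5, 1.0, 2.0, 3.0]:
        bound = np.exp(-t * c + t ** 2 / 2)
        print(f"t = {t:.1f}: bound = {bound:.4f}, exact = {exact:.4f}")

The bound is weak for very small and very large \(t\) and strongest in between; here the minimum over \(t\) occurs at \(t=c\), giving \(e^{-c^2/2}\). This optimization over \(t\) is exactly the step carried out in the next example.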

  • Example 6.1.2 Let \(X\) have a Binomial\((n,p)\) distribution. We will take \(p=\frac 12\), so \(\E [X]=\frac {n}{2}\), \(\var (X)=\frac {n}{4}\), and \(\E [e^{tX}]=(\frac 12+\frac 12e^t)^n\). We will derive a Chernoff bound for \(\P [X\geq \frac {3n}{4}]\), but first let us try Markov’s and Chebyshev’s inequalities. Markov’s inequality (Lemma 4.2.3) gives us

    \[\P \l [X\geq \frac {3n}{4}\r ]\leq \frac {\E [X]}{3n/4}=\frac {n/2}{3n/4}=\frac 23,\]

    which provides very little information. Chebyshev’s inequality (Exercise 5.2) gives

    \[\P \l [X\geq \frac {3n}{4}\r ]=\P \l [X-\frac {n}{2}\geq \frac {n}{4}\r ] \leq \P \l [|X-\E [X]|\geq \frac {n}{4}\r ]\leq \frac {1}{(n/4)^2}\var (X)=\frac {4}{n}.\]

    This is significantly better and tends to zero as \(n\to \infty \). Lastly, Chernoff’s bound gives us that

    \begin{equation} \label {eq:pz_ex_c} \P \l [X\geq \frac {3n}{4}\r ]\leq e^{-3nt/4}\l (\frac 12+\frac 12e^t\r )^n \end{equation}

    for all \(t>0\). Choosing \(t\) to minimize the right hand side (differentiating the logarithm of the right hand side and setting the derivative to zero gives \(-\frac {3n}{4}+\frac {ne^t}{1+e^t}=0\), and checking confirms this is a minimum) gives that the minimum occurs when \(e^t=3\). Putting this value for \(e^t\) into the above equation and simplifying results in

    \[\P \l [X\geq \frac {3n}{4}\r ]\leq 3^{-3n/4}\,2^n=\l (\frac {2}{3^{3/4}}\r )^{n}\approx (0.88)^n.\]

    This bound converges to zero much faster, as \(n\to \infty \), than the bound of \(\frac {4}{n}\) obtained from Chebyshev's inequality.
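The gap between these three bounds can also be checked numerically. The sketch below (again Python, with scipy assumed available; all names are illustrative) evaluates the Markov, Chebyshev and optimized Chernoff bounds alongside the exact value of \(\P [X\geq \frac {3n}{4}]\), and confirms by numerical minimization that the right hand side of (6.1) is smallest when \(e^t=3\).

    import numpy as np
    from scipy.optimize import minimize_scalar
    from scipy.stats import binom

    def chernoff_rhs(t, n):
        # right hand side of (6.1): e^{-3nt/4} (1/2 + e^t/2)^n, computed via logs
        return np.exp(n * (-0.75 * t + np.log(0.5 + 0.5 * np.exp(t))))

    for n in [20, 100, 500]:
        exact = binom.sf(np.ceil(3 * n / 4) - 1, n, 0.5)  # P[X >= 3n/4]
        print(f"n={n}: exact={exact:.2e}, Markov={2/3:.2f}, "
              f"Chebyshev={4/n:.2e}, Chernoff={(2 / 3**0.75)**n:.2e}")

    # The minimizing t does not depend on n, so we may take n = 1.
    res = minimize_scalar(chernoff_rhs, bounds=(0.01, 5), args=(1,), method="bounded")
    print(f"optimal e^t = {np.exp(res.x):.4f}  (calculus gives e^t = 3)")

Already at moderate \(n\) the exact tail probability is far below the Chebyshev bound, and the optimized Chernoff bound tracks its exponential rate of decay.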