last updated: May 9, 2024

Probability with Measure

Chapter 4 Lebesgue Integration

The concept of integration as a technique that both acts as an inverse to the operation of differentiation and also computes areas under curves goes back to the origin of the calculus and the work of Isaac Newton (1643-1727) and Gottfried Leibniz (1646-1716). It was Leibniz who introduced the \(\int \cdots dx\) notation. The first rigorous attempt to understand integration as a limiting operation within the spirit of analysis was due to Bernhard Riemann (1826-1866). The approach to Riemann integration that is often taught (as in MAS2004/2009) was developed soon after by Jean-Gaston Darboux (1842-1917). At the time it was developed, this theory seemed to be all that was needed, but as the 19th century drew to a close, some problems appeared:

  • One of the main tasks of integration is to recover a function \(f\) from its derivative \(f'\). But some functions were discovered for which \(f'\) existed and was bounded, but where \(f'\) was not Riemann integrable.

  • Suppose \((f_{n})\) is a sequence of functions converging pointwise to \(f\). People wanted a useful set of conditions under which

    \begin{equation} \label {eq:int_conv} \int f(x)\,dx = \lim _{n \rightarrow \infty }\int f_{n}(x)\,dx, \end{equation}

    but weren’t able to find any suitable conditions. Problem 4.18 illustrates some of the difficulties here; it gives an example of \(f_n,f\) such that \(f_n(x)\to f(x)\) for all \(x\), but in which (4.1) fails.

  • Riemann integration was limited to computing integrals over \(\R ^{n}\) with respect to Lebesgue measure. Although it was not yet apparent, the emerging theory of probability would require the calculation of expectations of random variables \(X\) using the formula \(\E (X) = \int _{\Omega }X(\omega )\,d\P (\omega )\). This requires a version of integration that works on a general measure space.

A new approach to integration was needed. In this chapter, we’ll study Lebesgue integration, which allows us to investigate \(\int _{S}f(x)\,dm(x)\) where \(f:S \rightarrow \R \) is a ‘suitable’ measurable function defined on a general measure space \((S, \Sigma , m)\). It was developed by Henri Lebesgue (pronounced ‘Leb-eyg’) and first published in 1902.

We will see that if we take \(m\) to be Lebesgue measure on \((\R , {\cal B}(\R ))\) then we recover the familiar integral \(\int _{\R }f(x)\,dx\) but we will now be able to integrate a much bigger class of functions than Riemann and Darboux could. Most importantly, the Lebesgue integral solves all three of the issues discussed above.

4.1 The Lebesgue integral for simple functions

We’ll present the construction of the Lebesgue integral in three steps. We’ll work over a general measure space \((S,\Sigma ,m)\) for most of Chapter 4, and we’ll integrate functions \(f:S\to \R \). The first step will involve simple functions. In the second step we extend to non-negative measurable functions, and in the final step we extend to what are known as ‘Lebesgue integrable’ functions.

  • Definition 4.1.1 (Lebesgue Integral, Step 1) If \(A\in \Sigma \) then the integral of the indicator function \(\1_A\) is defined as

    \begin{equation} \label {eq:leb_int_indicators} \int _{S}{\1}_{A}\,dm = m(A). \end{equation}

    More generally, if \(f = \sum _{i=1}^{n}c_{i}{\1}_{A_{i}}\) is a non-negative simple function (i.e. all \(c_i\geq 0\)) then we define

    \begin{equation} \label {eq:leb_int_simple} \int _{S}f\, dm = \sum _{i=1}^{n}c_{i}m(A_{i}). \end{equation}

    Note that \(m(A_i)\) might be infinite, so \(\int _{S}f\,dm \in [0, \infty ]\).

Note that we can represent \(f\) in more than one way as a simple function. For example if \(f=\1_{[0,1]}\) then also \(f=\1_{[0,\frac 12)}+\1_{[\frac 12,1]}\). It is easy to guess that the value of (4.3) does not depend on the choice of representation. We’ll omit a formal proof of this fact. Note also that equations (4.2) and (4.3) are consistent with each other, in the sense that if we take \(n=1\) and \(c_1=1\) in (4.3) then we obtain (4.2). We restrict to non-negative simple functions to make sure ‘\(\infty -\infty \)’ does not occur in (4.3).
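
For example, take \(m\) to be Lebesgue measure on \((\R , {\cal B}(\R ))\), which we denote here by \(\lambda \). The two representations of \(f=\1_{[0,1]}\) above give the same value under (4.3):

\[\int _{\R }\1_{[0,1]}\,d\lambda =\lambda ([0,1])=1,\qquad \int _{\R }\big (\1_{[0,\frac 12)}+\1_{[\frac 12,1]}\big )\,d\lambda =\lambda ([0,\tfrac 12))+\lambda ([\tfrac 12,1])=\tfrac 12+\tfrac 12=1.\]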

When we work with the theory of integration we will tend to write \(\int _S f\,dm\) for the Lebesgue integral of \(f\). We will sometimes use the shorthand notation

\[\mc {I}(f)=\int _S f\,dm,\]

to make our proofs easier to read. For calculations it is often more helpful to write \(\int _S f(x)\,dm(x)\), which is closer to the notation you’ve used before in the case \(S=\R \). We’ll come back to this point after we’ve reached Step 2 of the construction.

In each step of defining the Lebesgue integral, we’ll establish some useful properties of the integral. Because we expand the class of functions we can integrate at each step, we will carry several properties with us as we go, doing some work in each step to upgrade them. We begin this process with the next lemma.

  • Lemma 4.1.2 If \(f\) and \(g\) are non-negative simple functions then:

    • 1. Linearity: for all \(\alpha ,\beta \in \R \)

      \[\int _{S}(\alpha f + \beta g)\,dm = \alpha \int _{S}f\,dm + \beta \int _{S}g\,dm,\]

    • 2. Monotonicity:

      \[f\leq g \quad \ra \quad \int _{S}f\,dm \leq \int _{S}g\,dm.\]

Proof: Let us write \(f = \sum _{i=1}^{n}c_{i}{\1}_{A_{i}}\). Note that we can assume without loss of generality that \(\bigcup _{i=1}^n A_i=S\) by including an extra term with \(c_{n+1}=0\) and \(A_{n+1}=S\sc (\bigcup _{i=1}^n A_i)\) into the summation. Similarly, write \(g = \sum _{j=1}^{m}d_{j}{\1}_{B_{j}}\) where \(\bigcup _{j=1}^m B_j=S\). By the definition of simple functions we have \(A_i\cap A_j=\emptyset \) and \(B_i\cap B_j=\emptyset \) for all \(i\neq j\).

We have

\begin{align} f &= \sum _{i=1}^{n}c_{i}{\1}_{A_{i} \cap S} = \sum _{i=1}^{n}c_{i}{\1}_{A_{i} \cap \bigcup _{j=1}^{m}B_{j}} = \sum _{i=1}^{n}c_{i}{\1}_{\bigcup _{j=1}^m (A_{i}\cap B_{j})} = \sum _{i=1}^{n}c_{i}\sum _{j=1}^m {\1}_{A_{i}\cap {B_j}} \notag \\ &= \sum _{i=1}^{n}\sum _{j=1}^{m}c_{i}{\1}_{A_{i}\cap B_{j}}. \label {eq:f_alpha_beta} \end{align} Here, we use that \(\bigcup _{j=1}^{m}B_{j} = S\) and part (a)(i) of Exercise 2.5 (note that \(A_{i}\cap B_{j_1}\) and \(A_{i}\cap B_{j_2}\) are disjoint whenever \(j_1\neq j_2\)). We can obtain a similar expression for \(g\), giving

\begin{equation} \label {eq:g_alpha_beta} g=\sum _{i=1}^n\sum _{j=1}^m d_j \1_{A_i\cap B_j}. \end{equation}

It follows that

\begin{equation} \label {eq:fg_alpha_beta} \alpha f + \beta g = \sum _{i=1}^{n}\sum _{j=1}^{m}( \alpha c_{i} + \beta d_{j}){\1}_{A_{i} \cap B_{j}}. \end{equation}

Using (4.3), and the representations of \(f,g\) and \(\alpha f + \beta g\) as simple functions in (4.4)-(4.6),

\begin{align} \mc {I}( \alpha f + \beta g) & =\sum _{i=1}^{n}\sum _{j=1}^{m}( \alpha c_{i} + \beta d_{j})m(A_{i} \cap B_{j}), \label {eq:fg_alpha_beta_int}\\ \mc {I}(f) & =\sum _{i=1}^{n}\sum _{j=1}^{m}c_i\, m(A_{i} \cap B_{j}), \notag \\ \mc {I}(g) & =\sum _{i=1}^{n}\sum _{j=1}^{m}d_j\,m(A_{i} \cap B_{j}). \notag \end{align} From these formulae we have \(\mc {I}( \alpha f + \beta g) = \alpha \mc {I}(f) + \beta \mc {I}(g)\), which proves linearity.

For monotonicity, we now assume that \(f\leq g\). Hence \(g-f\) is non-negative. Putting \(\alpha =-1\) and \(\beta =1\) into (4.6), we have that \(g-f\) is a non-negative simple function, and that \(c_i\leq d_j\) whenever \(A_i\cap B_j\neq \emptyset \). Hence, from (4.7), we have \(\mc {I}(g-f)\geq 0\). Linearity gives that

\[\mc {I}(g)=\mc {I}(f)+\mc {I}(g-f)\]

and thus \(\mc {I}(f)\leq \mc {I}(g)\), as required.   ∎
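
To illustrate the common refinement \(A_i\cap B_j\) used in the proof, take \(m\) to be Lebesgue measure \(\lambda \) and let \(f=\1_{[0,2]}\) and \(g=2\cdot \1_{[1,3]}\). On the refined sets \([0,1)\), \([1,2]\) and \((2,3]\) the function \(f+g\) takes the values \(1\), \(3\) and \(2\) respectively (and is zero elsewhere), so

\[\mc {I}(f+g)=1\cdot \lambda ([0,1))+3\cdot \lambda ([1,2])+2\cdot \lambda ((2,3])=6=\mc {I}(f)+\mc {I}(g),\]

since \(\mc {I}(f)=\lambda ([0,2])=2\) and \(\mc {I}(g)=2\,\lambda ([1,3])=4\). Monotonicity can be checked here too: \(f\leq f+g\) and indeed \(\mc {I}(f)=2\leq 6=\mc {I}(f+g)\).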

4.1.1 Integration over subsets

Integrals over the real numbers are commonly written in the form \(\int _a^b\), which denotes integration over the interval \([a,b]\sw \R \). We now introduce some notation for this, in the general case.

  • Definition 4.1.3 Let \(A \in \Sigma \). Whenever \(\int _{S}f\,dm\) is defined for some \(f:S \rightarrow \R \), we define

    \[ \mc {I}_{A}(f) = \int _{A}f\,dm = \int _{S} {\1}_{A} f\,dm .\]

We call \(\mc {I}_A(f)\) the integral of \(f\) over the set \(A\). In general there is no guarantee that \(\mc {I}_{A}(f)\) is defined for a given function \(f\); we need \(f\) to be one of the types of function that we work with in the steps used to define the integral.
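
For example, taking \(m\) to be Lebesgue measure \(\lambda \) and \(f=\1_{[\frac 12,2]}\), Definition 4.1.3 gives

\[\int _{[0,1]}\1_{[\frac 12,2]}\,d\lambda =\int _{\R }\1_{[0,1]}\1_{[\frac 12,2]}\,d\lambda =\int _{\R }\1_{[\frac 12,1]}\,d\lambda =\lambda ([\tfrac 12,1])=\tfrac 12,\]

using the identity \(\1_A\1_B=\1_{A\cap B}\) from Exercise 2.5.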

The following lemma is another property of the Lebesgue integral that we will carry with us as we build up the definition.

  • Lemma 4.1.4 Let \(f:S\to \R \) be a non-negative simple function. Then the map \(\nu :\Sigma \to [0,\infty ]\) given by

    \[\nu (X)=\int _X f\,dm\]

    is a measure.

Proof: Let us write \(f=\sum _{i=1}^n c_i \1_{A_i}\). Note that \(\1_Xf=\sum _{i=1}^n c_i\1_{A_i}\1_{X}=\sum _{i=1}^n c_i\1_{A_i\cap X}\), where we have used the identity \(\1_A\1_B=\1_{A\cap B}\) from Exercise 2.5. By Definition 4.1.3 and (4.3) we have

\begin{equation} \label {eq:intXF_measure} \int _X f\,dm=\sum _{i=1}^n c_i m(A_i\cap X). \end{equation}

By part (b) of Exercise 1.5 we have that \(X\mapsto m(A_i\cap X)\) defines a measure, for each \(i\). Using part (a) of the same exercise, \(X\mapsto c_i m(A_i\cap X)\) also defines a measure. In Exercise 2.4 we showed that a finite sum of measures is also a measure, hence the right-hand side of (4.8) defines a measure as a function of \(X\). This completes the proof.   ∎
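
For example, taking \(m\) to be Lebesgue measure \(\lambda \) and \(f=2\cdot \1_{[0,1]}+3\cdot \1_{[2,4]}\), equation (4.8) gives \(\nu (X)=2\,\lambda ([0,1]\cap X)+3\,\lambda ([2,4]\cap X)\) for \(X\in {\cal B}(\R )\). For instance \(\nu ([0,3])=2\cdot 1+3\cdot 1=5\) and \(\nu (\emptyset )=0\), and countable additivity of \(\nu \) is inherited from that of \(\lambda \).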