last updated: May 9, 2024

Probability with Measure

Chapter 4 Lebesgue Integration

The concept of integration as a technique that both acts as an inverse to the operation of differentiation and also computes areas under curves goes back to the origin of the calculus and the work of Isaac Newton (1643-1727) and Gottfried Leibniz (1646-1716). It was Leibniz who introduced the dx notation. The first rigorous attempt to understand integration as a limiting operation within the spirit of analysis was due to Bernhard Riemann (1826-1866). The approach to Riemann integration that is often taught (as in MAS2004/2009) was developed soon after by Jean-Gaston Darboux (1842-1917). At the time it was developed, this theory seemed to be all that was needed, but as the 19th century drew to a close, some problems appeared:

  • One of the main tasks of integration is to recover a function $f$ from its derivative $f'$. But some functions were discovered for which $f'$ existed and was bounded, but where $f'$ was not Riemann integrable.

  • Suppose $(f_n)$ is a sequence of functions converging pointwise to $f$. People wanted a useful set of conditions under which

    $$\int f(x)\,dx=\lim_{n\to\infty}\int f_n(x)\,dx, \tag{4.1}$$

    but weren’t able to find any suitable conditions. Problem 4.18 illustrates some of the difficulties here; it gives an example of $f_n,f$ such that $f_n(x)\to f(x)$ for all $x$, but in which (4.1) fails.

  • Riemann integration was limited to computing integrals over $\mathbb{R}^n$ with respect to Lebesgue measure. Although it was not yet apparent, the emerging theory of probability would require the calculation of expectations of random variables $X$ using the formula $\mathbb{E}(X)=\int_\Omega X(\omega)\,d\mathbb{P}(\omega)$. This requires a version of integration that works on a general measure space.
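The failure of (4.1) mentioned in the second point can be seen in a standard example, sketched below. It is an illustration chosen here and is not necessarily the example used in Problem 4.18; the functions $f_n$ and $f$ are assumptions made for this sketch, with all integrals taken over $\mathbb{R}$.

```latex
% Assumed example: tall thin spikes of constant area.
f_n(x) = n\,\mathbb{1}_{(0,1/n)}(x), \qquad f(x) = 0 .

% Pointwise convergence: if x \le 0 then f_n(x) = 0 for all n;
% if x > 0 then f_n(x) = 0 as soon as n \ge 1/x. Hence f_n(x) \to f(x) for all x.

\int f_n(x)\,dx = n \cdot \frac{1}{n} = 1 \quad \text{for all } n,
\qquad \text{but} \qquad
\int f(x)\,dx = 0 \neq 1 = \lim_{n\to\infty} \int f_n(x)\,dx .
```

So pointwise convergence alone cannot be a sufficient condition for (4.1); some extra control on the sequence $(f_n)$ is needed.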

A new approach to integration was needed. In this chapter, we’ll study Lebesgue integration, which allows us to investigate $\int_S f(x)\,dm(x)$ where $f:S\to\mathbb{R}$ is a ‘suitable’ measurable function defined on a general measure space $(S,\Sigma,m)$. It was developed by Henri Lebesgue (pronounced ‘Leb-eyg’) and first published in 1902.

We will see that if we take $m$ to be Lebesgue measure on $(\mathbb{R},\mathcal{B}(\mathbb{R}))$ then we recover the familiar integral $\int_{\mathbb{R}} f(x)\,dx$, but we will now be able to integrate a much bigger class of functions than Riemann and Darboux could. Most importantly, the Lebesgue integral solves all three of the issues discussed above.

4.1 The Lebesgue integral for simple functions

We’ll present the construction of the Lebesgue integral in three steps. We’ll work over a general measure space $(S,\Sigma,m)$ for most of Chapter 4, and we’ll integrate functions $f:S\to\mathbb{R}$. The first step will involve simple functions. In the second step we extend to non-negative measurable functions, and in the final step we extend to what are known as ‘Lebesgue integrable’ functions.

  • Definition 4.1.1 (Lebesgue Integral, Step 1) If $A\in\Sigma$ then the integral of the indicator function $\mathbb{1}_A$ is defined as

    $$\int_S \mathbb{1}_A\,dm=m(A). \tag{4.2}$$

    More generally, if $f=\sum_{i=1}^n c_i\mathbb{1}_{A_i}$ is a non-negative simple function (i.e. all $c_i\geq 0$) then we define

    $$\int_S f\,dm=\sum_{i=1}^n c_i\,m(A_i). \tag{4.3}$$

    Note that $m(A_i)$ might be infinite, so $\int_S f\,dm\in[0,\infty]$.

Note that we can represent $f$ in more than one way as a simple function. For example, if $f=\mathbb{1}_{[0,1]}$ then also $f=\mathbb{1}_{[0,\frac12)}+\mathbb{1}_{[\frac12,1]}$. It is easy to guess that the value of (4.3) does not depend on the choice of representation. We’ll omit a formal proof of this fact. Note also that equations (4.2) and (4.3) are consistent with each other, in the sense that if we take $n=1$ and $c_1=1$ in (4.3) then we obtain (4.2). We restrict to non-negative simple functions to make sure that ‘$\infty-\infty$’ does not occur in (4.3).
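As a quick sanity check of formula (4.3) and of the claim about representations, here is a minimal Python sketch. It assumes that $m$ is Lebesgue measure on $\mathbb{R}$ and that every set $A_i$ is a bounded interval, stored as a pair of endpoints; whether the endpoints are included is ignored, because single points have Lebesgue measure zero. The names `length` and `integrate_simple` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction


def length(interval):
    """Lebesgue measure of a bounded interval given as (left, right).

    Assumption: open/closed endpoints are not tracked, since single
    points have Lebesgue measure zero.
    """
    left, right = interval
    return max(Fraction(0), Fraction(right) - Fraction(left))


def integrate_simple(terms):
    """Formula (4.3): sum of c_i * m(A_i) over terms [(c_i, A_i), ...]."""
    if any(c < 0 for c, _ in terms):
        raise ValueError("Step 1 only covers non-negative simple functions")
    return sum((Fraction(c) * length(A) for c, A in terms), Fraction(0))


# Two representations of the same function, the indicator of [0, 1].
rep_one = [(1, (0, 1))]
rep_two = [(1, (0, Fraction(1, 2))), (1, (Fraction(1, 2), 1))]

value_one = integrate_simple(rep_one)
value_two = integrate_simple(rep_two)
print(value_one, value_two)
```

Both representations give the same value, as expected. This checks one example only and is not a substitute for the omitted proof.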

When we work with the theory of integration we will tend to write $\int_S f\,dm$ for the Lebesgue integral of $f$. We will sometimes use the shorthand notation

$$I(f)=\int_S f\,dm,$$

to make our proofs easier to read. For calculations it is often more helpful to write $\int_S f(x)\,dm(x)$, which is closer to the notation you’ve used before in the case $S=\mathbb{R}$. We’ll come back to this point after we’ve reached Step 2 of the construction.

In each step of defining the Lebesgue integral, we’ll establish some useful properties of the integral. Because we enlarge the class of functions that we can integrate at each step, we carry several properties with us as we go, and we do some work in each step to upgrade them. We begin this process with the next lemma.

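  • Lemma 4.1.2 If $f$ and $g$ are non-negative simple functions then:

    • 1. Linearity: for all $\alpha,\beta\in\mathbb{R}$

      $$\int_S(\alpha f+\beta g)\,dm=\alpha\int_S f\,dm+\beta\int_S g\,dm,$$

    • 2. Monotonicity:

      $$f\leq g\;\Rightarrow\;\int_S f\,dm\leq\int_S g\,dm.$$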

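Proof: Let us write $f=\sum_{i=1}^n c_i\mathbb{1}_{A_i}$. Note that we can assume without loss of generality that $\bigcup_{i=1}^n A_i=S$, by including an extra term with $c_{n+1}=0$ and $A_{n+1}=S\setminus\left(\bigcup_{i=1}^n A_i\right)$ in the summation. Similarly, write $g=\sum_{j=1}^m d_j\mathbb{1}_{B_j}$ where $\bigcup_{j=1}^m B_j=S$. By the definition of simple functions we have $A_i\cap A_j=\emptyset$ and $B_i\cap B_j=\emptyset$ for all $i\neq j$.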

We have

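$$\begin{aligned}
f=\sum_{i=1}^n c_i\mathbb{1}_{A_i\cap S}
&=\sum_{i=1}^n c_i\mathbb{1}_{A_i\cap\bigcup_{j=1}^m B_j}
=\sum_{i=1}^n c_i\mathbb{1}_{\bigcup_{j=1}^m (A_i\cap B_j)}
=\sum_{i=1}^n c_i\sum_{j=1}^m \mathbb{1}_{A_i\cap B_j}\\
&=\sum_{i=1}^n\sum_{j=1}^m c_i\mathbb{1}_{A_i\cap B_j}.
\end{aligned} \tag{4.4}$$

Here, we use that $\bigcup_{j=1}^m B_j=S$ and part (a)(i) of Exercise 2.5 (note that $A_i\cap B_{j_1}$ and $A_i\cap B_{j_2}$ are disjoint whenever $j_1\neq j_2$). We can obtain a similar expression for $g$, giving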

$$g=\sum_{i=1}^n\sum_{j=1}^m d_j\mathbb{1}_{A_i\cap B_j}. \tag{4.5}$$

It follows that

$$\alpha f+\beta g=\sum_{i=1}^n\sum_{j=1}^m(\alpha c_i+\beta d_j)\mathbb{1}_{A_i\cap B_j}. \tag{4.6}$$

Using (4.3), and the representations of $f$, $g$ and $\alpha f+\beta g$ as simple functions in (4.4)-(4.6),

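$$\begin{aligned}
I(\alpha f+\beta g)&=\sum_{i=1}^n\sum_{j=1}^m(\alpha c_i+\beta d_j)\,m(A_i\cap B_j),\\
I(f)&=\sum_{i=1}^n\sum_{j=1}^m c_i\,m(A_i\cap B_j),\\
I(g)&=\sum_{i=1}^n\sum_{j=1}^m d_j\,m(A_i\cap B_j).
\end{aligned} \tag{4.7}$$

From these formulae we have $I(\alpha f+\beta g)=\alpha I(f)+\beta I(g)$, which proves linearity.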

For monotonicity, we now assume that $f\leq g$. Hence $g-f$ is non-negative. Putting $\alpha=-1$ and $\beta=1$ into (4.6), we have that $g-f$ is a non-negative simple function, and that $c_i\leq d_j$ whenever $A_i\cap B_j\neq\emptyset$. Hence, from (4.7), we have $I(g-f)\geq 0$. Since $g=f+(g-f)$, linearity gives that

$$I(g)=I(f)+I(g-f)$$

and thus $I(f)\leq I(g)$, as required.   ∎
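The common-refinement argument in the proof above can be checked numerically. The following is a minimal Python sketch under stated assumptions: the measure space is a hypothetical finite one, $S=\{0,\dots,5\}$ with $m$ given by point weights, and the functions $f$, $g$ and the helper names `m`, `integral` and `combine` are invented purely for illustration. Both representations are chosen so that their sets partition $S$, as in the proof.

```python
from fractions import Fraction
from itertools import product

# Hypothetical finite measure space: S = {0, ..., 5}, m defined by point weights.
WEIGHTS = {0: Fraction(1, 2), 1: Fraction(1), 2: Fraction(2),
           3: Fraction(1, 3), 4: Fraction(0), 5: Fraction(3)}


def m(A):
    """Measure of a subset A of S: the sum of the weights of its points."""
    return sum((WEIGHTS[x] for x in A), Fraction(0))


def integral(terms):
    """Formula (4.3): sum of c_i * m(A_i) over terms [(c_i, A_i), ...]."""
    return sum((c * m(A) for c, A in terms), Fraction(0))


def combine(alpha, f_terms, beta, g_terms):
    """Representation (4.6) of alpha*f + beta*g on the sets A_i ∩ B_j."""
    return [(alpha * c + beta * d, A & B)
            for (c, A), (d, B) in product(f_terms, g_terms)]


# f and g as lists of (coefficient, set); each family of sets partitions S.
f_terms = [(Fraction(1), frozenset({0, 1})),
           (Fraction(2), frozenset({2, 3})),
           (Fraction(0), frozenset({4, 5}))]
g_terms = [(Fraction(3), frozenset({0, 1, 2})),
           (Fraction(2), frozenset({3, 4, 5}))]

alpha, beta = Fraction(2), Fraction(5)
lhs = integral(combine(alpha, f_terms, beta, g_terms))
rhs = alpha * integral(f_terms) + beta * integral(g_terms)
print(lhs == rhs)  # linearity

# Here f <= g pointwise, so g - f is non-negative and I(g) = I(f) + I(g - f).
difference = integral(combine(Fraction(-1), f_terms, Fraction(1), g_terms))
print(integral(g_terms) == integral(f_terms) + difference, difference >= 0)
```

Empty intersections $A_i\cap B_j$ contribute nothing because $m(\emptyset)=0$, which is why the coefficients attached to them do not matter.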

4.1.1 Integration over subsets

Integrals over the real numbers are commonly written in the form $\int_a^b$, which denotes integration over the interval $[a,b]\subseteq\mathbb{R}$. We now introduce some notation for this, in the general case.

  • Definition 4.1.3 If $A\in\Sigma$ then, whenever $\int_S f\,dm$ is defined for some $f:S\to\mathbb{R}$, we define

    $$I_A(f)=\int_A f\,dm=\int_S \mathbb{1}_A f\,dm.$$

We call $I_A(f)$ the integral of $f$ over the set $A$. In general there is no guarantee that $I_A(f)$ is defined for a given function $f$. We need $f$ to be one of the types of functions that we work with in the steps used to define the integral.
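As a small worked example of Definition 4.1.3, assume that $m$ is Lebesgue measure on $\mathbb{R}$, and take a simple function $f$ and a set $A$ chosen here purely for illustration. The calculation uses the identity $\mathbb{1}_A\mathbb{1}_B=\mathbb{1}_{A\cap B}$ together with (4.3).

```latex
% Assumed example: f = 2 on [0,1], f = 3 on (1,4], and A = [0,2].
f = 2\,\mathbb{1}_{[0,1]} + 3\,\mathbb{1}_{(1,4]}, \qquad A = [0,2].

\mathbb{1}_A f
  = 2\,\mathbb{1}_{[0,1]\cap[0,2]} + 3\,\mathbb{1}_{(1,4]\cap[0,2]}
  = 2\,\mathbb{1}_{[0,1]} + 3\,\mathbb{1}_{(1,2]}.

I_A(f) = \int_A f\,dm
  = 2\,m([0,1]) + 3\,m((1,2])
  = 2\cdot 1 + 3\cdot 1 = 5,
\qquad \text{whereas} \qquad
\int_{\mathbb{R}} f\,dm = 2\cdot 1 + 3\cdot 3 = 11.
```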

The following lemma is another property of the Lebesgue integral that we will carry with us as we build up the definition.

  • Lemma 4.1.4 Let $f:S\to\mathbb{R}$ be a non-negative simple function. Then $\nu:\Sigma\to[0,\infty]$ defined by

    $$\nu(X)=\int_X f\,dm$$

    is a measure.

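Proof: Let us write $f=\sum_{i=1}^n c_i\mathbb{1}_{A_i}$. Note that $\mathbb{1}_X f=\sum_{i=1}^n c_i\mathbb{1}_{A_i}\mathbb{1}_X=\sum_{i=1}^n c_i\mathbb{1}_{A_i\cap X}$, where we have used the identity $\mathbb{1}_A\mathbb{1}_B=\mathbb{1}_{A\cap B}$ from Exercise 2.5. By Definition 4.1.3 and (4.3) we have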

$$\int_X f\,dm=\sum_{i=1}^n c_i\,m(A_i\cap X). \tag{4.8}$$

By part (b) of Exercise 1.5 we have that $X\mapsto m(A_i\cap X)$ defines a measure, for all $i$. Using part (a) of the same exercise, $X\mapsto c_i\,m(A_i\cap X)$ also defines a measure. In Exercise 2.4 we showed that a finite sum of measures is also a measure, hence the right hand side of (4.8) defines a measure. This completes the proof.   ∎
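Formula (4.8) is easy to experiment with. The sketch below is a minimal Python check on a hypothetical finite measure space ($S=\{0,\dots,5\}$ with point weights); the weights, the function $f$ and the helper names `m` and `nu` are assumptions made for illustration. On a finite space countable additivity reduces to finite additivity, so this is only a sanity check of the lemma and not a proof.

```python
from fractions import Fraction

# Hypothetical finite measure space: S = {0, ..., 5}, m defined by point weights.
WEIGHTS = {0: Fraction(1, 2), 1: Fraction(1), 2: Fraction(2),
           3: Fraction(1, 3), 4: Fraction(0), 5: Fraction(3)}


def m(A):
    """Measure of a subset A of S: the sum of the weights of its points."""
    return sum((WEIGHTS[x] for x in A), Fraction(0))


def nu(X, terms):
    """Formula (4.8): nu(X) = sum of c_i * m(A_i ∩ X)."""
    return sum((c * m(A & X) for c, A in terms), Fraction(0))


# f = 1 on {0, 1}, 2 on {2, 3}, and 0 elsewhere.
f_terms = [(Fraction(1), frozenset({0, 1})),
           (Fraction(2), frozenset({2, 3}))]

X1 = frozenset({0, 2})
X2 = frozenset({1, 3, 5})  # disjoint from X1

nu_empty = nu(frozenset(), f_terms)
nu_union = nu(X1 | X2, f_terms)
nu_sum = nu(X1, f_terms) + nu(X2, f_terms)
print(nu_empty == 0, nu_union == nu_sum)
```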