2.2 Random variables
Our probability space gives us a label \(\omega \in \Omega \) for every possible outcome. Sometimes it is more convenient to think about a property of \(\omega \), rather than about \(\omega \) itself. For this, we use a random variable, \(X:\Omega \to \R \). For each outcome \(\omega \in \Omega \), the value of \(X(\omega )\) is a property of the outcome.
For example, let \(\Omega =\{1,2,3,4,5,6\}\) and \(\mc {F}=\mc {P}(\Omega )\). We might be interested in the property
\[ X(\omega )= \begin {cases} 0 & \text { if }\omega \text { is odd},\\ 1 & \text { if }\omega \text { is even}.\\ \end {cases} \]
We write
\[X^{-1}(A)=\{\omega \in \Omega \-X(\omega )\in A\},\]
for \(A\sw \R \), which is called the pre-image of \(A\) under \(X\). In words, \(X^{-1}(A)\) is the set of outcomes \(\omega \) for which the property \(X(\omega )\) falls inside the set \(A\). In our example above \(X^{-1}(\{0\})=\{1,3,5\}\), \(X^{-1}(\{1\})=\{2,4,6\}\) and \(X^{-1}(\{0,1\})=\{1,2,3,4,5,6\}\).
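If you would like to experiment with pre-images on a computer, here is a short Python sketch, purely as an illustration (the helper name \texttt{preimage} is our own), which computes \(X^{-1}(A)\) for the example above by checking each outcome in turn.
\begin{verbatim}
# Pre-images on a finite sample space: the parity example above.
Omega = {1, 2, 3, 4, 5, 6}

def X(omega):
    # X(omega) = 0 if omega is odd, 1 if omega is even
    return 0 if omega % 2 == 1 else 1

def preimage(X, A):
    # X^{-1}(A) = the set of outcomes omega with X(omega) in A
    return {omega for omega in Omega if X(omega) in A}

print(preimage(X, {0}))      # {1, 3, 5}
print(preimage(X, {1}))      # {2, 4, 6}
print(preimage(X, {0, 1}))   # {1, 2, 3, 4, 5, 6}
\end{verbatim}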
It is common to write \(X^{-1}(a)\) in place of \(X^{-1}(\{a\})\), because it makes for easier reading. Similarly, for an interval \((a,b)\sw \R \) we write \(X^{-1}(a,b)\) in place of \(X^{-1}\big ((a,b)\big )\).
-
Definition 2.2.1 Let \(\mc {G}\) be a \(\sigma \)-field on \(\Omega \). A function \(X:\Omega \to \R \) is said to be \(\mc {G}\)-measurable if
\[X^{-1}(I)\in \mc {G}\quad \text {for every subinterval }I\sw \R .\]
If it is clear which \(\sigma \)-field \(\mc {G}\) we mean to use, we might simply say that \(X\) is measurable. We will often shorten this to writing simply \(X\in m\mc {G}\).
For a probability space \((\Omega ,\mc {F},\P )\), we say that \(X:\Omega \to \R \) is a random variable if \(X\) is \(\mc {F}\)-measurable. The relationship to the usual notation for probability is that \(\P [X\in A]\) means \(\P [X^{-1}(A)]\), so, for example,
\begin{align*} \P \l [a<X<b\r ]&=\P \l [X^{-1}(a,b)\r ]=\P [\omega \in \Omega \-X(\omega )\in (a,b)],\\ \P [X=a]&=\P [X^{-1}(a)]=\P [\omega \in \Omega \-X(\omega )=a]. \end{align*} We usually prefer writing \(\P [X=a]\) and \(\P [a<X<b]\) because we find them more intuitive; we like to think of \(X\) as an object that takes a random value.
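In the same spirit, the identity \(\P [X\in A]=\P [X^{-1}(A)]\) can be checked numerically on a finite space. Here is a minimal sketch, assuming a fair die (uniform measure) and the parity variable from the start of this section; the helper \texttt{P} is our own shorthand for the probability measure.
\begin{verbatim}
# P[X in A] means P[X^{-1}(A)]: a fair die with the parity variable X.
Omega = {1, 2, 3, 4, 5, 6}
P = lambda E: len(E) / len(Omega)            # uniform probability measure
X = lambda omega: 0 if omega % 2 == 1 else 1

A = {0}
preimage = {omega for omega in Omega if X(omega) in A}
print(P(preimage))                           # P[X = 0] = 0.5
\end{verbatim}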
For example, suppose we toss a coin twice, with \(\Omega =\{HH,HT,TH,TT\}\) as in Example 2.1.2. If we take our \(\sigma \)-field to be \(\mc {F}=\mc {P}(\Omega )\), the set of all subsets of \(\Omega \), then any function \(X:\Omega \to \R \) is \(\mc {F}\)-measurable. However, suppose we choose instead
\[\mc {G}=\big \{\Omega ,\{HT,TH,TT\},\{HH\},\emptyset \big \}\]
(as we did in Example 2.1.2). Then if we look at the function
\[X(\omega )=\text { the total number of tails which occurred}\]
we have \(X^{-1}([0,1])=\{HH,HT,TH\}\notin \mc {G},\) so \(X\) is not \(\mc {G}\)-measurable. The intuition here is that the \(\sigma \)-field \(\mc {G}\) ‘isn’t big enough’ for \(X\), because \(\mc {G}\) only contains information about whether we threw two heads (or not).
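To see this failure of measurability concretely, here is another illustrative Python sketch that lists the relevant pre-images and tests whether they belong to \(\mc {G}\); the set-up simply mirrors the example above.
\begin{verbatim}
# Two coin tosses: is X = "number of tails" measurable with respect to
# G = {Omega, {HT,TH,TT}, {HH}, emptyset}?
Omega = {"HH", "HT", "TH", "TT"}
G = [set(), {"HH"}, {"HT", "TH", "TT"}, {"HH", "HT", "TH", "TT"}]

def X(omega):
    return omega.count("T")          # total number of tails

# Since X only takes the values 0, 1, 2, the pre-image of any interval
# equals the pre-image of some subset of {0, 1, 2}.
for A in [{0}, {1}, {2}, {0, 1}]:
    pre = {omega for omega in Omega if X(omega) in A}
    print(A, pre, pre in G)
# The pre-image of [0,1] is {HH, HT, TH}, which is not in G,
# so X is not G-measurable.
\end{verbatim}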
\(\sigma \)-fields and information
It will be very important for us to understand the connection between \(\sigma \)-fields and information. When we talk about the ‘information’ contained in a \(\sigma \)-field \(\G \), we mean the following.
Suppose that an outcome \(\omega \) of our experiment has occurred, but suppose that we don’t know which \(\omega \in \Omega \) it was. Each event \(G\in \mc {G}\) represents a piece of information: whether or not \(\omega \in G\), i.e. whether or not the event \(G\) has occurred. If this information allows us to deduce the exact value of \(X(\omega )\), and if we can do this for any \(\omega \in \Omega \), then \(X\) is \(\mc {G}\)-measurable.
Going back to our example above, of two coin tosses, the information contained in \(\mc {G}\) is whether (or not) we threw two heads. Recall that \(X\in \{0,1,2\}\) was the number of tails thrown. Knowing just the information provided by \(\G \) doesn’t allow us to deduce the value of \(X\): for example, if all we know is that we didn’t throw two heads, we can’t work out exactly how many tails we threw.
The interaction between random variables and \(\sigma \)-fields can be summarised as follows:
\begin{eqnarray*} \boxed {\mbox {event}} &\leftrightarrow & \boxed {\mbox {a piece of information: did the event occur}} \\ \boxed {\mbox {$\sigma $-field $\G $}} &\leftrightarrow & \boxed {\mbox {which information we care about}} \\ \boxed {\mbox {$X$ is $\G $-measurable}} &\leftrightarrow & \boxed {\mbox {$X$ depends only on information that we care about}} \end{eqnarray*}
Rigorously, if we want to check that \(X\) is \(\mc {G}\)-measurable, we have to check that \(X^{-1}(I)\in \mc {G}\) for every subinterval \(I\sw \R \). This can be tedious, especially if \(X\) takes many different values. Fortunately, we will shortly see that, in practice, there is rarely any need to do so. What is important for us is to understand the role played by a \(\sigma \)-field.
\(\sigma \)-fields and pre-images
Suppose that we have a probability space \((\Omega ,\mc {F},\P )\) and a random variable \(X:\Omega \to \R \). We want to check that \(X\) is measurable with respect to some smaller \(\sigma \)-field \(\mc {G}\).
If \(X\) is a discrete random variable, we can use the following lemma.
-
Lemma 2.2.2 Let \(\mc {G}\) be a \(\sigma \)-field on \(\Omega \). Let \(X:\Omega \to \R \), and suppose \(X\) takes a finite or countable set of values \(\{x_1,x_2,\ldots \}\). Then:
\[\text {$X$ is measurable with respect to $\mc {G}$}\quad \Leftrightarrow \quad \text { for all }j,\;\{X=x_j\}\in \mc {G}.\]
Proof: Let us first prove the \(\Rightarrow \) implication. So, assume the left hand side: \(X\in m\mc {G}\). Since \(\{X=x_j\}=X^{-1}(x_j)\) and \(\{x_j\}=[x_j,x_j]\) is a subinterval of \(\R \), by Definition 2.2.1 we have that \(\{X=x_j\}\in \mc {G}\). Since this holds for every \(j\), we have shown the right hand side.
Next, we will prove the \(\Leftarrow \) implication. So, we now assume the right hand side: \(\{X=x_j\}\in \mc {G}\) for all \(j\). Let \(I\) be any subinterval of \(\R \), and define \(J=\{j\-x_j\in I\}\). Then,
\[X^{-1}(I)=\{\omega \in \Omega \-X(\omega )\in I\}=\bigcup _{j\in J}\{\omega \- X(\omega )=x_j\}=\bigcup _{j\in J} \{X=x_j\}.\]
Since \(J\) is countable and \(\{X=x_j\}\in \mc {G}\) for each \(j\in J\), the definition of a \(\sigma \)-field tells us that also \(X^{-1}(I)\in \mc {G}\). ∎
For example, take \(\Omega =\{1,2,3,4,5,6\}\), which we think of as rolling a die, and take \(\mc {F}=\mc {P}(\Omega )\). Consider
\begin{align*} \mc {G}_1&=\big \{\emptyset , \{1,3,5\}, \{2,4,6\}, \Omega \big \},\\ \mc {G}_2&=\big \{\emptyset , \{1,2,3\}, \{4,5,6\}, \Omega \big \}. \end{align*} Here, \(\mc {G}_1\) contains the information of whether the roll is even or odd, and \(\mc {G}_2\) contains the information of whether the roll is \(\leq 3\) or \(>3\). It’s easy to check that \(\mc {G}_1\) and \(\mc {G}_2\) are both \(\sigma \)-fields.
Let us define
\begin{align*} X_1(\omega ) &= \begin{cases} 0 & \text { if }\omega \text { is odd},\\ 1 & \text { if }\omega \text { is even}, \end {cases} \\ X_2(\omega ) &= \begin{cases} 0 & \text { if }\omega \leq 3,\\ 1 & \text { if }\omega >3. \end {cases} \end{align*} That is, \(X_1\) tests if the roll is even, and \(X_2\) tests if the roll is less than or equal to three. Based on our intuition about information, we should expect that \(X_1\) is measurable with respect to \(\G _1\) but not \(\G _2\), and that \(X_2\) is measurable with respect to \(\G _2\) but not \(\G _1\).
We can justify our intuition rigorously using Lemma 2.2.2. We have \(X_1^{-1}(0)=\{1,3,5\}\), \(X_1^{-1}(1)=\{2,4,6\}\), so \(X_1\in m\mc {G}_1\) but \(X_1\notin m\mc {G}_2\). Similarly, we have \(X_2^{-1}(0)=\{1,2,3\}\) and \(X_2^{-1}(1)=\{4,5,6\}\), so \(X_2\notin m\mc {G}_1\) and \(X_2\in m\mc {G}_2\).
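Lemma 2.2.2 is easy to automate on a finite sample space. The sketch below (the helper \texttt{is\_measurable} is our own) reproduces exactly the conclusions of the previous paragraph.
\begin{verbatim}
# Lemma 2.2.2 on a finite sample space: a finitely-valued X is
# G-measurable iff {X = x} is in G for every value x that X takes.
Omega = {1, 2, 3, 4, 5, 6}
G1 = [set(), {1, 3, 5}, {2, 4, 6}, Omega]
G2 = [set(), {1, 2, 3}, {4, 5, 6}, Omega]

def is_measurable(X, G):
    values = {X(omega) for omega in Omega}
    return all({omega for omega in Omega if X(omega) == x} in G
               for x in values)

X1 = lambda omega: 0 if omega % 2 == 1 else 1   # tests "roll is even"
X2 = lambda omega: 0 if omega <= 3 else 1       # tests "roll is <= 3"

print(is_measurable(X1, G1), is_measurable(X1, G2))  # True False
print(is_measurable(X2, G1), is_measurable(X2, G2))  # False True
\end{verbatim}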
Let us extend this example by introducing a third \(\sigma \)-field,
\begin{equation*} \mc {G}_3=\sigma \big (\{1,3\},\{2\},\{4\},\{5\},\{6\}\big ). \end{equation*}
The \(\sigma \)-field \(\mc {G}_3\) is, by Definition 2.1.7, the smallest \(\sigma \)-field containing the events \(\{1,3\}, \{2\}, \{4\}, \{5\}\) and \(\{6\}\). It contains the information of which \(\omega \in \Omega \) we threw except that it can’t tell the difference between a \(1\) and a \(3\). If we tried to write \(\G _3\) out in full we would discover that it had \(32\) elements (and probably make some mistakes!) so instead we just use Definition 2.1.7.
To check if \(X_1\in m\mc {G}_3\), we need to check if \(\G _3\) contains \(\{1,3,5\}\) and \(\{2,4,6\}\). We can write
\[\{1,3,5\}=\{1,3\}\cup \{5\},\quad \quad \{2,4,6\}=\{2\}\cup \{4\}\cup \{6\}\]
which shows that \(\{1,3,5\},\{2,4,6\}\in \mc {G}_3\) because, in both cases, the right hand sides are made up of sets that we already know are in \(\G _3\), combined using countable set operations. Hence, \(X_1\) is \(\mc {G}_3\)-measurable. You can check for yourself that \(X_2\) is also \(\mc {G}_3\)-measurable.
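Rather than writing out all \(32\) elements of \(\G _3\) by hand, we can generate them on a computer: on a finite \(\Omega \), repeatedly closing a collection of sets under complements and pairwise unions produces exactly the generated \(\sigma \)-field. A sketch (the helper \texttt{generate} is our own):
\begin{verbatim}
from itertools import combinations

# Generate the sigma-field on a finite Omega from a collection of sets by
# closing under complements and pairwise unions until nothing new appears.
Omega = frozenset({1, 2, 3, 4, 5, 6})
generators = [frozenset({1, 3}), frozenset({2}), frozenset({4}),
              frozenset({5}), frozenset({6})]

def generate(gens):
    F = {frozenset(), Omega} | set(gens)
    while True:
        new = {Omega - A for A in F}
        new |= {A | B for A, B in combinations(F, 2)}
        if new <= F:
            return F
        F |= new

G3 = generate(generators)
print(len(G3))                      # 32
print(frozenset({1, 3, 5}) in G3)   # True, so X_1 is G_3-measurable
print(frozenset({1, 2, 3}) in G3)   # True
print(frozenset({4, 5, 6}) in G3)   # True, so X_2 is G_3-measurable too
\end{verbatim}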
-
Remark 2.2.3 \(\offsyl \) For continuous random variables, there is no equivalent of Lemma 2.2.2. More sophisticated tools from measure theory are needed – see MAS350/61022.
\(\sigma \)-fields generated by random variables
We can think of random variables as containing information, because their values tell us something about the result of the experiment. We can express this idea formally: there is a natural \(\sigma \)-field associated to each function \(X:\Omega \to \R \).
-
Definition 2.2.4 Let \(X:\Omega \to \R \). The \(\sigma \)-field generated by \(X\) is
\[\sigma (X)=\sigma \big (X^{-1}(I)\-I\text { is a subinterval of }\R \big ).\]
In words, \(\sigma (X)\) is the \(\sigma \)-field generated by the sets \(X^{-1}(I)\) for intervals \(I\). The intuition is that \(\sigma (X)\) is the smallest \(\sigma \)-field of events on which the random behaviour of \(X\) depends.
For example, consider throwing a fair die. Let \(\Omega =\{1,2,3,4,5,6\}\), let \(\mc {F}=\mc {P}(\Omega )\) and let
\[ X(\omega )= \begin {cases} 1 & \text { if }\omega \text { is odd}\\ 2 & \text { if }\omega \text { is even.} \end {cases} \]
Then \(X(\omega )\in \{1,2\}\), with pre-images \(X^{-1}(1)=\{1,3,5\}\) and \(X^{-1}(2)=\{2,4,6\}\). The smallest \(\sigma \)-field that contains both of these subsets is
\[\sigma (X)=\Big \{\emptyset , \{1,3,5\}, \{2,4,6\}, \Omega \Big \}.\]
The information contained in this \(\sigma \)-field is whether \(\omega \) is even or odd, which is precisely the same information given by the value of \(X(\omega )\).
In general, if \(X\) takes lots of different values, \(\sigma (X)\) could be very big and we would have no hope of writing it out explicitly. Here’s another example: suppose that
\[ Y(\omega )= \begin {cases} 1 & \text { if }\omega =1,\\ 2 & \text { if }\omega =2,\\ 3 & \text { if }\omega \geq 3.\\ \end {cases} \]
Then \(Y(\omega )\in \{1,2,3\}\) with pre-images \(Y^{-1}(1)=\{1\}\), \(Y^{-1}(2)=\{2\}\) and \(Y^{-1}(3)=\{3,4,5,6\}\). The smallest \(\sigma \)-field containing these three subsets is
\[\sigma (Y)=\Big \{\emptyset , \{1\}, \{2\}, \{3,4,5,6\}, \{1,2\}, \{1,3,4,5,6\}, \{2,3,4,5,6\}, \Omega \Big \}.\]
The information contained in this \(\sigma \)-field is whether \(\omega \) is equal to \(1\), \(2\) or some number \(\geq 3\). Again, this is precisely the same information as is contained in the value of \(Y(\omega )\).
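For a random variable taking finitely many values, \(\sigma (X)\) can also be computed mechanically: the pre-images of the individual values of \(X\) partition \(\Omega \) into ‘atoms’, and \(\sigma (X)\) consists of all unions of atoms. A Python sketch of this (the helper \texttt{sigma} is our own) recovers the two \(\sigma \)-fields above.
\begin{verbatim}
from itertools import chain, combinations

# sigma(X) for a finitely-valued X on a finite Omega: take the atoms
# {X = x}, then form every possible union of atoms.
Omega = frozenset({1, 2, 3, 4, 5, 6})

def sigma(X):
    atoms = [frozenset(w for w in Omega if X(w) == x)
             for x in set(X(w) for w in Omega)]
    unions = chain.from_iterable(combinations(atoms, r)
                                 for r in range(len(atoms) + 1))
    return {frozenset().union(*s) for s in unions}

X = lambda w: 1 if w % 2 == 1 else 2   # the example X above
Y = lambda w: w if w <= 2 else 3       # the example Y above

print(sorted(map(sorted, sigma(X))))   # the four sets listed for sigma(X)
print(len(sigma(Y)))                   # 8, the eight sets listed for sigma(Y)
\end{verbatim}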
It’s natural that \(X\) should be measurable with respect to the \(\sigma \)-field that contains precisely the information on which \(X\) depends. Formally:
-
Lemma 2.2.5 Let \(X:\Omega \to \R \). Then \(X\) is \(\sigma (X)\)-measurable.
Proof: Let \(I\) be a subinterval of \(\R \). By the definition of \(\sigma (X)\), we have \(X^{-1}(I)\in \sigma (X)\). Since this holds for every subinterval \(I\), \(X\) is \(\sigma (X)\)-measurable. ∎
If we have a finite or countable set of random variables \(X_1,X_2,\ldots \) we define \(\sigma (X_1,X_2,\ldots )\) to be \(\sigma (X^{-1}_1(I),X^{-1}_2(I),\ldots \-I\text { is a subinterval of }\R )\). The intuition is the same: \(\sigma (X_1,X_2,\ldots )\) corresponds to the information jointly contained in \(X_1,X_2,\ldots \).
Combining random variables
Given a collection of random variables, it is useful to be able to construct other random variables from them. To do so we have the following proposition. Since we will eventually deal with more than one \(\sigma \)-field at once, it is useful to express this idea for a sub-\(\sigma \)-field \(\mc {G}\sw \mc {F}\).
-
Proposition 2.2.6 Let \(\alpha \in \R \) and let \(X,Y,X_1,X_2,\ldots \) be \(\mc {G}\)-measurable functions from \(\Omega \to \R \). Then
\(\seteqnumber{0}{2.}{1}\)\begin{equation} \label {eq:its_all_meas} \alpha , \hspace {1pc}\alpha X, \hspace {1pc}X+Y, \hspace {1pc}XY, \hspace {1pc}1/X, \end{equation}
are all \(\mc {G}\)-measurable. Further, if \(X_\infty \) given by
\[X_\infty (\omega )=\lim _{n\to \infty }X_n(\omega )\]
exists for all \(\omega \), then \(X_\infty \) is \(\mc {G}\)-measurable.
Essentially, every natural way of combining random variables together leads to other random variables, and Proposition 2.2.6 can usually be used to show this. We won’t prove Proposition 2.2.6 in this course; see MAS31002/61022.
For example, if \(X\) is a random variable then so is \(\frac {X^2+X}{2}\). For a more difficult example, suppose that \(X\) is a random variable and let \(Y=e^X\), which means that \(Y(\omega )=\lim _{n\to \infty }\sum _{i=0}^n \frac {X(\omega )^i}{i!}\). Recall from analysis that this limit exists, since \(e^x=\lim _{n\to \infty }\sum _{i=0}^n \frac {x^i}{i!}\) exists for all \(x\in \R \). Each of the partial sums
\[Y_n=\sum _{i=0}^n\frac {X^i}{i!}=1+X+\frac {X^2}{2}+\ldots +\frac {X^n}{n!}\]
is a random variable (we could use (2.2) repeatedly to show this) and, since the limit exists, \(Y(\omega )=\lim _{n\to \infty }Y_n(\omega )\) is measurable.
In general, if \(X\) is a random variable and \(g:\R \to \R \) is any ‘sensible’ function, then \(g(X)\) is also a random variable. This includes polynomials, powers, all trig functions, all monotone functions, all continuous and piecewise continuous functions, integrals and derivatives, and so on.
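As a finite-space illustration of this principle (a sketch only, computing \(\sigma (\cdot )\) from atoms as in the earlier fragment), we can check directly that \(\sigma (g(X))\sw \sigma (X)\) for various choices of \(g\); in other words, \(g(X)\) depends only on information already carried by \(X\).
\begin{verbatim}
from itertools import chain, combinations

# Check that sigma(g(X)) is contained in sigma(X) on a finite space,
# i.e. g(X) only depends on information already carried by X.
Omega = frozenset({1, 2, 3, 4, 5, 6})

def sigma(Z):
    atoms = [frozenset(w for w in Omega if Z(w) == z)
             for z in set(Z(w) for w in Omega)]
    unions = chain.from_iterable(combinations(atoms, r)
                                 for r in range(len(atoms) + 1))
    return {frozenset().union(*s) for s in unions}

X = lambda w: 1 if w % 2 == 1 else 2              # the example X from above
for g in (lambda x: (x**2 + x) / 2, lambda x: x**3, abs):
    gX = lambda w: g(X(w))
    print(sigma(gX) <= sigma(X))                  # True for each g
\end{verbatim}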
Independence
We can express the concept of independence, which you already know about for random variables, in terms of \(\sigma \)-fields. Recall that two events \(E_1,E_2\in \mc {F}\) are said to be independent if \(\P [E_1\cap E_2]=\P [E_1]\P [E_2].\) Using \(\sigma \)-fields, we have a consistent way of defining independence, for both random variables and events.
-
Definition 2.2.7 Sub-\(\sigma \)-fields \(\G _1,\G _2\) of \(\F \) are said to be independent if \(\P (G_1\cap G_2)=\P (G_1)\P (G_2)\) for all \(G_1\in \G _1\) and \(G_2\in \G _2\).
Events \(E_1\) and \(E_2\) are independent if \(\sigma (E_1)\) and \(\sigma (E_2)\) are independent.
Random variables \(X_1\) and \(X_2\) are independent if \(\sigma (X_1)\) and \(\sigma (X_2)\) are independent.
It can be checked that, for events and random variables, this definition is equivalent to the definitions you will have seen in earlier courses. The same principle, of using the associated \(\sigma \)-fields, applies to defining what it means for e.g. a random variable and an event to be independent. We won’t check this claim as part of our course, see MAS31002/61022.
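Finally, Definition 2.2.7 can be tested exhaustively on a finite probability space. Here is one last sketch (uniform measure on two fair coin tosses; the helpers \texttt{sigma} and \texttt{independent} are our own), showing that the two individual tosses generate independent \(\sigma \)-fields, while the first toss and the total number of tails do not.
\begin{verbatim}
from itertools import chain, combinations, product

# Independence of sigma-fields (Definition 2.2.7) on a finite space:
# G1, G2 are independent iff P(A & B) = P(A) P(B) for all A in G1, B in G2.
Omega = ["HH", "HT", "TH", "TT"]          # two fair coin tosses
P = lambda E: len(E) / len(Omega)         # uniform probability measure

def sigma(Z):
    # sigma(Z) = all unions of the atoms {Z = z}
    atoms = [frozenset(w for w in Omega if Z(w) == z)
             for z in set(Z(w) for w in Omega)]
    unions = chain.from_iterable(combinations(atoms, r)
                                 for r in range(len(atoms) + 1))
    return {frozenset().union(*s) for s in unions}

def independent(G1, G2):
    return all(abs(P(A & B) - P(A) * P(B)) < 1e-12
               for A, B in product(G1, G2))

first = lambda w: w[0]            # result of the first toss
second = lambda w: w[1]           # result of the second toss
tails = lambda w: w.count("T")    # total number of tails

print(independent(sigma(first), sigma(second)))  # True
print(independent(sigma(first), sigma(tails)))   # False
\end{verbatim}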