last updated: October 24, 2024

Bayesian Statistics

\( \DeclareMathOperator {\Bern }{Bernoulli} \DeclareMathOperator {\Beta }{Beta} \DeclareMathOperator {\Gam }{Gamma} \DeclareMathOperator {\NGam }{NGamma} \DeclareMathOperator {\NBin }{NegBin} \DeclareMathOperator {\Normal }{N} \def \mb {\mathbb } \def \sr {\stackrel } \def \R {\mb {R}} \def \N {\mb {N}} \def \l {\left } \def \r {\right } \def \eqd {\sr {\rm d}{=}} \)

6.3 Exercises on Chapter 6

  • 6.1 \(\color {blue}\star \) Show that the mode of the \(\Gam (\alpha ,\beta )\) distribution is \(\frac {\alpha -1}{\beta }\), where \(\alpha \geq 1\). What about \(\alpha \in (0,1)\)?
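
    A quick numerical sanity check of the claimed mode can be sketched as follows. It assumes that \(\Gam (\alpha ,\beta )\) uses the shape/rate parameterisation, which is the one consistent with the stated mode \(\frac {\alpha -1}{\beta }\). The helper names, grid, and parameter values are hypothetical choices made for illustration, and the check is not a proof.

```python
import math

def gamma_log_density(x, alpha, beta):
    """Log of the Gamma(alpha, beta) p.d.f. at x > 0 (shape alpha, rate beta)."""
    return (alpha * math.log(beta) - math.lgamma(alpha)
            + (alpha - 1) * math.log(x) - beta * x)

def grid_mode(alpha, beta, upper=20.0, steps=20_000):
    """Return the grid point in (0, upper] with the largest log-density."""
    xs = [upper * (i + 1) / steps for i in range(steps)]
    return max(xs, key=lambda x: gamma_log_density(x, alpha, beta))

alpha, beta = 3.0, 2.0                 # illustrative values with alpha >= 1
numerical_mode = grid_mode(alpha, beta)
claimed_mode = (alpha - 1) / beta      # the formula from the exercise
print(numerical_mode, claimed_mode)    # the two values should be close
```

    Working with the log-density avoids underflow for large \(x\) and does not move the maximiser, since the logarithm is increasing.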

  • 6.2 \(\color {blue}\star \,\star \) The following equations, written in Bayesian shorthand, are the key conclusions from results in earlier chapters of these notes. Which results are they from?

    • (a) \(f(x|y)=\frac {f(y,x)}{f(y)}\).

    • (b) If \(\theta \sim \Beta (\alpha ,\beta )\) and \(x|\theta \sim \Bern (\theta )^{\otimes n}\) then \(\theta |x\sim \Beta (\alpha +k,\beta +n-k)\), where \(x=(x_i)_1^n\) and \(k=\sum _1^n x_i\).

    Write the following results in Bayesian shorthand, using similar notation to that in parts (a) and (b).

    • (c) Lemma 4.2.1.

    • (d) From Section 4.5, the two facts above Lemma 4.5.2 concerning marginal and conditional distributions of the \(\NGam \) distribution.

  • 6.3 \(\color {blue}\star \,\star \) The following results are written in Bayesian shorthand.

    • (a) If \(x\sim \Normal (0,1)\) then \(x|\{x>0\}\sim |x|\).

    • (b) If \(x\) and \(y\) are independent then \(x|y\sim x\).

    In each case, write a version of the result in precise mathematical notation. Which parts of Chapter 1 are they closely related to?

  • 6.4 \(\color {blue}\star \,\star \) Suppose that we model \(x|\theta \sim \NBin (m,\theta )^{\otimes n}\), where \(m\in \N \) is fixed and \(\theta \in (0,1)\) is an unknown parameter.

    • (a) Show that \(f(x|\theta )\propto \theta ^{mn}(1-\theta )^{\sum _1^n x_i}.\)

    • (b) Show that the prior \(\theta \sim \Beta (\alpha ,\beta )\) is conjugate to \(\NBin (m,\theta )^{\otimes n}\), and find the posterior parameters.

    • (c)

      • (i) Show that the reference prior for \(\theta \) is given by \(f(\theta )\propto \theta ^{-1}(1-\theta )^{-1/2}\).

      • (ii) Does \(f(\theta )\) define a proper distribution?

      • (iii) Find the posterior density \(f(\theta |x)\) arising from this prior.

    Hint: The setup given is a Bayesian model with model family \(M_{\theta }\sim \NBin (m,\theta )^{\otimes n}\).
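
    The proportionality in part (a) can be checked numerically before proving it. The sketch below assumes that \(\NBin (m,\theta )\) has p.m.f. \(\binom {x+m-1}{x}\theta ^m(1-\theta )^x\) for \(x=0,1,2,\ldots \), which is consistent with the kernel in part (a) but is not stated in this section. The data and function names are hypothetical.

```python
import math

def negbin_log_pmf(x, m, theta):
    """Log p.m.f. of NegBin(m, theta) at x = 0, 1, 2, ...
    Assumed form: C(x + m - 1, x) * theta**m * (1 - theta)**x."""
    return (math.log(math.comb(x + m - 1, x))
            + m * math.log(theta) + x * math.log(1 - theta))

def log_likelihood(xs, m, theta):
    """Log-likelihood of i.i.d. NegBin(m, theta) observations xs."""
    return sum(negbin_log_pmf(x, m, theta) for x in xs)

def log_kernel(xs, m, theta):
    """Log of theta**(m*n) * (1 - theta)**sum(xs), the kernel in part (a)."""
    return m * len(xs) * math.log(theta) + sum(xs) * math.log(1 - theta)

m = 3
xs = [0, 2, 5, 1]                      # hypothetical data
thetas = [0.1, 0.3, 0.5, 0.7, 0.9]
log_ratios = [log_likelihood(xs, m, t) - log_kernel(xs, m, t) for t in thetas]

# If f(x|theta) is proportional to the kernel, the log-ratio does not depend on theta.
spread = max(log_ratios) - min(log_ratios)
print(spread)                          # should be zero up to rounding error
```

    The constant that the log-ratio settles on is the sum of the log binomial coefficients, which is exactly the factor that the \(\propto \) sign discards.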

  • 6.5 Suppose that we model \(x|\mu ,\tau \sim \Normal (\mu ,\frac {1}{\tau })^{\otimes n}\), where both \(\mu \) and \(\tau \) are unknown parameters. We use the improper prior \(f(\mu , \tau )\propto \frac {1}{\tau }\) for \(\tau >0\), and \(f(\mu ,\tau )=0\) elsewhere.

    • (a) \(\color {blue}\star \,\star \) Show that for \(\mu \in \R \) and \(\tau >0\) the posterior distribution satisfies

      \[f(\mu ,\tau |x)\propto \tau ^{\frac {n}{2}-1}\exp \l (-\frac {\tau }{2}\sum _{i=1}^n(x_i-\mu )^2\r ).\]

    • (b) \(\color {blue}\star \star \star \) Find the marginal p.d.f. of \(\tau |x\). Show that \((\mu ,\tau )|x\) is a proper distribution if and only if \(n\geq 2\).

    Hint: The setup given is a Bayesian model with model family \(M_{\mu ,\tau }\sim \Normal (\mu ,\frac {1}{\tau })^{\otimes n}\). For part (b) use the sample-mean-variance identity (4.10).
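
    As a sanity check on part (a), the sketch below compares the product of the prior \(\frac {1}{\tau }\) and the \(\Normal (\mu ,\frac {1}{\tau })^{\otimes n}\) likelihood against the stated kernel at a few points \((\mu ,\tau )\). The data, evaluation points, and function names are hypothetical, and the comparison is an illustration rather than a derivation.

```python
import math

def log_prior_times_likelihood(mu, tau, xs):
    """Log of (1/tau) * prod_i N(x_i; mu, 1/tau), where tau is the precision."""
    n = len(xs)
    log_lik = (0.5 * n * math.log(tau / (2 * math.pi))
               - 0.5 * tau * sum((x - mu) ** 2 for x in xs))
    return -math.log(tau) + log_lik

def log_kernel(mu, tau, xs):
    """Log of tau**(n/2 - 1) * exp(-(tau/2) * sum_i (x_i - mu)**2)."""
    n = len(xs)
    return ((n / 2 - 1) * math.log(tau)
            - 0.5 * tau * sum((x - mu) ** 2 for x in xs))

xs = [1.2, -0.4, 2.5, 0.9]             # hypothetical data
points = [(-1.0, 0.2), (0.0, 1.0), (0.8, 3.5), (2.0, 0.7)]
log_ratios = [log_prior_times_likelihood(mu, tau, xs) - log_kernel(mu, tau, xs)
              for mu, tau in points]

# Proportionality in (mu, tau) means the log-ratio is the same at every point.
spread = max(log_ratios) - min(log_ratios)
print(spread)                          # should be zero up to rounding error
```

    The common value of the log-ratio is \(-\frac {n}{2}\log (2\pi )\), which depends on neither \(\mu \) nor \(\tau \).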

  • 6.6 \(\color {blue}\star \,\star \) Let \((M_\theta )_{\theta \in \Pi }\) be a continuous family of distributions. For \(i=1,2,\) let \(\Theta _i\) be a continuous random variable with p.d.f. \(f_{\Theta _i}\), both taking values in \(\R ^d\). Let \(\alpha ,\beta \in (0,1)\) be such that \(\alpha +\beta =1\).

    • (a) Show that \(f_\Theta (\theta )=\alpha f_{\Theta _1}(\theta )+\beta f_{\Theta _2}(\theta )\) is a probability density function.

    • (b) Consider Bayesian models \((X_1,\Theta _1)\) and \((X_2,\Theta _2)\), with the same model family \((M_\theta )\) and different prior distributions. Consider also a third Bayesian model \((X,\Theta )\) with model family \((M_\theta )\) and prior \(\Theta \) with p.d.f. \(f_\Theta (\theta )=\alpha f_{\Theta _1}(\theta )+\beta f_{\Theta _2}(\theta )\).

      Show that the posterior distributions of these three models satisfy

      \[f_{\Theta |_{\{X=x\}}}(\theta )=\alpha ' f_{\Theta _1|_{\{X_1=x\}}}(\theta ) + \beta ' f_{\Theta _2|_{\{X_2=x\}}}(\theta )\]

      where \(\alpha '=\frac {\alpha Z_1}{\alpha Z_1+\beta Z_2}\) and \(\beta '=\frac {\beta Z_2}{\alpha Z_1+\beta Z_2}\). Here \(Z_1\) and \(Z_2\) are the normalizing constants given in Theorem 3.1.2 for the posterior distributions of \((X_1,\Theta _1)\) and \((X_2,\Theta _2)\).

    • (c) Outline briefly how to modify your argument in (b) to also cover the case of discrete Bayesian models.
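
    The identity in part (b) can be illustrated with a concrete case: Bernoulli data and two Beta priors, using the conjugacy quoted in Exercise 6.2(b). In this sketch `w1` and `w2` play the roles of \(\alpha \) and \(\beta \) from the exercise, renamed to avoid a clash with the Beta parameters `(a1, b1)` and `(a2, b2)`. It assumes that \(Z_i\) is the integral of the likelihood against the prior \(f_{\Theta _i}\); check this against Theorem 3.1.2. All numerical values are hypothetical.

```python
import math

def log_beta_fn(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_pdf(t, a, b):
    """Beta(a, b) p.d.f. at t in (0, 1)."""
    return math.exp((a - 1) * math.log(t) + (b - 1) * math.log(1 - t)
                    - log_beta_fn(a, b))

n, k = 10, 7                           # n Bernoulli trials, k successes
w1, w2 = 0.3, 0.7                      # mixture weights, w1 + w2 = 1
(a1, b1), (a2, b2) = (2.0, 5.0), (6.0, 2.0)

def likelihood(t):
    return t ** k * (1 - t) ** (n - k)

def mixture_prior(t):
    return w1 * beta_pdf(t, a1, b1) + w2 * beta_pdf(t, a2, b2)

# Left-hand side: normalise likelihood * mixture prior with a midpoint rule.
steps = 20_000
grid = [(i + 0.5) / steps for i in range(steps)]
Z = sum(likelihood(t) * mixture_prior(t) for t in grid) / steps

def lhs(t):
    return likelihood(t) * mixture_prior(t) / Z

# Right-hand side: reweighted mixture of the two conjugate posteriors.
Z1 = math.exp(log_beta_fn(a1 + k, b1 + n - k) - log_beta_fn(a1, b1))
Z2 = math.exp(log_beta_fn(a2 + k, b2 + n - k) - log_beta_fn(a2, b2))
w1_post = w1 * Z1 / (w1 * Z1 + w2 * Z2)
w2_post = w2 * Z2 / (w1 * Z1 + w2 * Z2)

def rhs(t):
    return (w1_post * beta_pdf(t, a1 + k, b1 + n - k)
            + w2_post * beta_pdf(t, a2 + k, b2 + n - k))

max_gap = max(abs(lhs(t) - rhs(t)) for t in [0.1, 0.3, 0.5, 0.7, 0.9])
print(w1_post, w2_post, max_gap)       # max_gap should be close to zero
```

    The posterior weights generally differ from the prior weights: the component whose prior explains the data better (larger \(Z_i\)) gains weight.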

  • 6.7 \(\color {blue}\star \star \star \) This question explores the idea in Exercise 4.6 further, but except for (a)(ii) it does not depend on having completed that exercise.

    • (a) Let \((M_\theta )\) be a discrete or absolutely continuous family with range \(R\). Let \((X,\Theta )\) be a Bayesian model with model family \(M_\theta ^{\otimes n}\). Let \(n_1,n_2\in \N \) with \(n_1+n_2=n\). For \(x\in R^n\), write \(x(1)=(x_1,\ldots ,x_{n_1})\) and \(x(2)=(x_{n_1+1},\ldots ,x_{n})\). Let \((X_1,\Theta )\) and \((X_2,\Theta |_{\{X_1=x(1)\}})\) be Bayesian models with model families \(M_\theta ^{\otimes n_1}\) and \(M_\theta ^{\otimes n_2}\) respectively.

      • (i) Show that

        \[(\Theta |_{\{X_1=x(1)\}})|_{\{X_2=x(2)\}}\eqd \Theta |_{\{X=x\}}.\]

        Use likelihood functions to write your argument in a way that covers both the discrete and absolutely continuous cases.

      • (ii) What is the connection between this fact and Exercise 4.6?

    • (b) Rewrite your solution to (a)(i) in a Bayesian shorthand notation of your choice.
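
    The statement in (a)(i) says that updating in two stages gives the same posterior as updating once with all the data. A minimal sketch of this, assuming a Bernoulli model family and a prior discretised onto a grid of parameter values (both hypothetical choices, as are the data and the function names):

```python
def normalise(weights):
    """Rescale non-negative weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]

def update(prior, grid, data):
    """One Bayesian update on a grid: posterior is proportional to likelihood * prior."""
    posterior = []
    for theta, p in zip(grid, prior):
        lik = 1.0
        for x in data:                 # Bernoulli(theta) likelihood of the data
            lik *= theta if x == 1 else 1 - theta
        posterior.append(lik * p)
    return normalise(posterior)

steps = 1000
grid = [(i + 0.5) / steps for i in range(steps)]
prior = normalise([t * (1 - t) ** 2 for t in grid])   # hypothetical prior shape

x = [1, 0, 1, 1, 0, 1]                 # hypothetical data, n = 6
n1 = 2                                 # x(1) = x[:n1], x(2) = x[n1:]

two_stage = update(update(prior, grid, x[:n1]), grid, x[n1:])
one_stage = update(prior, grid, x)

max_gap = max(abs(p - q) for p, q in zip(two_stage, one_stage))
print(max_gap)                         # should be zero up to rounding error
```

    The agreement comes from the likelihood of \(x\) factorising into the likelihoods of \(x(1)\) and \(x(2)\); the intermediate normalisation only rescales by a constant that the final normalisation removes.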