Bayesian Statistics

$\DeclareMathOperator {\Gam }{Gamma} \DeclareMathOperator {\Beta }{Beta} \DeclareMathOperator {\Bern }{Bernoulli} \DeclareMathOperator {\Normal }{N} \DeclareMathOperator {\NGam }{NGamma} \DeclareMathOperator {\NBin }{NegBin}$ \( \def \mb {\mathbb } \def \sr {\stackrel } \def \R {\mb {R}} \def \N {\mb {N}} \def \l {\left } \def \r {\right } \def \eqd {\sr {\rm d}{=}} \)

6.3 Exercises on Chapter 6

6.1 $\color {blue}\star $ Show that the mode of the $\Gam (\alpha ,\beta )$ distribution is $\frac {\alpha -1}{\beta }$, where $\alpha \geq 1$. What about $\alpha \in (0,1)$?
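A numerical sanity check can accompany Exercise 6.1, although it is not a substitute for the proof. The sketch below assumes the shape-rate parametrisation of $\Gam (\alpha ,\beta )$, which is the one consistent with the stated mode $\frac {\alpha -1}{\beta }$. The helper names `gamma_log_pdf` and `grid_mode`, the grid bounds and the parameter values are illustrative choices.

```python
import math

def gamma_log_pdf(x, alpha, beta):
    """Log-density of Gamma(alpha, beta), shape-rate parametrisation (assumed)."""
    return (alpha * math.log(beta) - math.lgamma(alpha)
            + (alpha - 1) * math.log(x) - beta * x)

def grid_mode(alpha, beta, upper=20.0, steps=200_000):
    """Approximate the mode by maximising the log-density over a grid on (0, upper]."""
    grid = (upper * k / steps for k in range(1, steps + 1))
    return max(grid, key=lambda x: gamma_log_pdf(x, alpha, beta))

alpha, beta = 3.0, 2.0
# The two printed numbers should agree up to the grid spacing.
print(grid_mode(alpha, beta), (alpha - 1) / beta)
```

Running `grid_mode` with some $\alpha \in (0,1)$ and looking at where the maximiser lands on the grid may help with the second half of the question.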
6.2 $\color {blue}\star \,\star $ The following equations, written in Bayesian shorthand, are the key conclusions from results in earlier chapters of these notes. Which results are they from?
- (a) $f(x|y)=\frac {f(y,x)}{f(y)}$.
- (b) If $\theta \sim \Beta (\alpha ,\beta )$ and $x|\theta \sim \Bern (\theta )^{\otimes n}$ then $\theta |x\sim \Beta (\alpha +k,\beta +n-k)$, where $x=(x_i)_1^n$ and $k=\sum _1^n x_i$.
Write the following results in Bayesian shorthand, using similar notation to that in parts (a) and (b).
- (c) Lemma 4.2.1.
- (d) From Section 4.5, the two facts above Lemma 4.5.2 concerning marginal and conditional distributions of the $\NGam $ distribution.
6.3 $\color {blue}\star \,\star $ The following results are written in Bayesian shorthand.
- (a) If $x\sim N(0,1)$ then $x|\{x>0\}\sim |x|$.
- (b) If $x$ and $y$ are independent then $x|y\sim x$.
In each case, write a version of the results in precise mathematical notation. Which parts of Chapter 1 are they closely related to?
6.4 $\color {blue}\star \,\star $ Suppose that we model $x|\theta \sim \NBin (m,\theta )^{\otimes n}$, where $m\in \N $ is fixed and $\theta \in (0,1)$ is an unknown parameter.
- (a) Show that $f(x|\theta )\propto \theta ^{mn}(1-\theta )^{\sum _1^n x_i}.$
- (b) Show that the prior $\theta \sim \Beta (\alpha ,\beta )$ is conjugate to $\NBin (m,\theta )^{\otimes n}$, and find the posterior parameters.
- (c)
  - (i) Show that the reference prior for $\theta $ is given by $f(\theta )\propto \theta ^{-1}(1-\theta )^{-1/2}$.
  - (ii) Does $f(\theta )$ define a proper distribution?
  - (iii) Find the posterior density $f(\theta |x)$ arising from this prior.
Hint: The setup given is a Bayesian model with model family $M_{\theta }\sim \NBin (m,\theta )^{\otimes n}$.
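The proportionality claimed in part (a) of Exercise 6.4 can be checked numerically: if $f(x|\theta )\propto \theta ^{mn}(1-\theta )^{\sum _1^n x_i}$, then the ratio of the two sides must not depend on $\theta $. The sketch below assumes that $\NBin (m,\theta )$ has p.m.f. $\binom {x+m-1}{x}\theta ^m(1-\theta )^x$ on $x\in \{0,1,2,\ldots \}$, which is the convention consistent with part (a). The data and the function names are made up for illustration.

```python
import math

def negbin_pmf(x, m, theta):
    """P.m.f. of NegBin(m, theta) on {0, 1, 2, ...} (assumed convention)."""
    return math.comb(x + m - 1, x) * theta**m * (1 - theta)**x

def likelihood(xs, m, theta):
    """Joint p.m.f. of i.i.d. NegBin(m, theta) observations xs."""
    return math.prod(negbin_pmf(x, m, theta) for x in xs)

def kernel(xs, m, theta):
    """The theta-dependent part claimed in part (a)."""
    return theta**(m * len(xs)) * (1 - theta)**sum(xs)

xs, m = [2, 0, 5, 1], 3  # made-up data and a fixed m
ratios = [likelihood(xs, m, t) / kernel(xs, m, t) for t in (0.1, 0.35, 0.6, 0.9)]
# All ratios should coincide (up to rounding): the constant is the product
# of the binomial coefficients, which does not involve theta.
print(ratios)
```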
6.5 Suppose that we model $x|\mu ,\tau \sim \Normal (\mu ,\frac {1}{\tau })^{\otimes n}$, where both $\mu $ and $\tau $ are unknown parameters. We use the improper prior $f(\mu , \tau )\propto \frac {1}{\tau }$ for $\mu \in \R $ and $\tau >0$, and $f(\mu ,\tau )=0$ elsewhere.
- (a) $\color {blue}\star \,\star $ Show that for $\mu \in \R $ and $\tau >0$ the posterior distribution satisfies
  
  \[f(\mu ,\tau |x)\propto \tau ^{\frac {n}{2}-1}\exp \l (-\frac {\tau }{2}\sum _{i=1}^n(x_i-\mu )^2\r ).\]
- (b) $\color {blue}\star \star \star $ Find the marginal p.d.f. of $\tau |x$. Show that $(\mu ,\tau )|x$ is a proper distribution if and only if $n\geq 2$.
Hint: The setup given is a Bayesian model with model family $M_{\mu ,\tau }\sim \Normal (\mu ,\frac {1}{\tau })^{\otimes n}$. For part (b) use the sample-mean-variance identity (4.10).
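The formula in part (a) of Exercise 6.5 can be sanity-checked in the same spirit: the product of the likelihood and the prior $\frac {1}{\tau }$, divided by the claimed expression $\tau ^{\frac {n}{2}-1}\exp \l (-\frac {\tau }{2}\sum _{i=1}^n(x_i-\mu )^2\r )$, should not depend on $(\mu ,\tau )$. This is a minimal sketch with made-up data and hypothetical function names; it checks proportionality only and says nothing about part (b).

```python
import math

def normal_pdf(x, mu, tau):
    """Density of N(mu, 1/tau), parametrised by the precision tau."""
    return math.sqrt(tau / (2 * math.pi)) * math.exp(-0.5 * tau * (x - mu) ** 2)

def unnormalised_posterior(xs, mu, tau):
    """Likelihood of the data xs multiplied by the improper prior 1/tau."""
    return math.prod(normal_pdf(x, mu, tau) for x in xs) / tau

def stated_kernel(xs, mu, tau):
    """The expression claimed in part (a)."""
    n = len(xs)
    return tau ** (n / 2 - 1) * math.exp(-0.5 * tau * sum((x - mu) ** 2 for x in xs))

xs = [1.2, -0.4, 0.7, 2.1, 0.3]                 # made-up data
points = [(0.0, 0.5), (1.0, 2.0), (-0.5, 1.3)]  # a few (mu, tau) pairs
ratios = [unnormalised_posterior(xs, mu, tau) / stated_kernel(xs, mu, tau)
          for mu, tau in points]
# The ratios should all equal the same constant, free of mu and tau.
print(ratios)
```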
6.6 $\color {blue}\star \,\star $ Let $(M_\theta )_{\theta \in \Pi }$ be a continuous family of distributions. For $i=1,2$, let $\Theta _i$ be a continuous random variable with p.d.f. $f_{\Theta _i}$, both taking values in $\R ^d$. Let $\alpha ,\beta \in (0,1)$ be such that $\alpha +\beta =1$.
- (a) Show that $f_\Theta (\theta )=\alpha f_{\Theta _1}(\theta )+\beta f_{\Theta _2}(\theta )$ is a probability density function.
- (b) Consider Bayesian models $(X_1,\Theta _1)$ and $(X_2,\Theta _2)$, with the same model family $(M_\theta )$ and different prior distributions. Consider also a third Bayesian model $(X,\Theta )$ with model family $(M_\theta )$ and prior $\Theta $ with p.d.f. $f_\Theta (\theta )=\alpha f_{\Theta _1}(\theta )+\beta f_{\Theta _2}(\theta )$.
  
  Show that the posterior distributions of these three models satisfy
  
  \[f_{\Theta |_{\{X=x\}}}(\theta )=\alpha ' f_{\Theta _1|_{\{X_1=x\}}}(\theta ) + \beta ' f_{\Theta _2|_{\{X_2=x\}}}(\theta )\]
  
  where $\alpha '=\frac {\alpha Z_1}{\alpha Z_1+\beta Z_2}$ and $\beta '=\frac {\beta Z_2}{\alpha Z_1+\beta Z_2}$. Here $Z_1$ and $Z_2$ are the normalizing constants given in Theorem 3.1.2 for the posterior distributions of $(X_1,\Theta _1)$ and $(X_2,\Theta _2)$.
- (c) Outline briefly how to modify your argument in (b) to also cover the case of discrete Bayesian models.
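The identity in part (b) of Exercise 6.6 can be illustrated on a grid before proving it. The sketch below makes several assumptions that are not part of the exercise: the model family is taken to be i.i.d. exponential data with rate $\theta $, the two priors are $\Gam (2,1)$ and $\Gam (5,2)$, the data are made up, and $Z_i$ is taken to be the integral of likelihood times prior, which is the reading consistent with the formulas for $\alpha '$ and $\beta '$. Integrals over $\theta $ are replaced by midpoint sums on $(0,20)$.

```python
import math

def gamma_pdf(t, shape, rate):
    """Density of Gamma(shape, rate) at t > 0."""
    log_pdf = (shape * math.log(rate) - math.lgamma(shape)
               + (shape - 1) * math.log(t) - rate * t)
    return math.exp(log_pdf)

def exp_likelihood(t, xs):
    """Likelihood of i.i.d. Exp(t) data xs, as a function of the rate t."""
    return t ** len(xs) * math.exp(-t * sum(xs))

# Midpoint grid on (0, 20), standing in for the integral over theta.
N, UPPER = 4000, 20.0
H = UPPER / N
grid = [(j + 0.5) * H for j in range(N)]

def posterior(prior_vals, lik_vals):
    """Return the grid posterior and its normalizing constant Z."""
    unnorm = [p * l for p, l in zip(prior_vals, lik_vals)]
    z = sum(unnorm) * H
    return [u / z for u in unnorm], z

xs = [0.8, 1.5, 0.3, 2.2]  # made-up data
w1, w2 = 0.3, 0.7          # the mixture weights alpha and beta
lik = [exp_likelihood(t, xs) for t in grid]
prior1 = [gamma_pdf(t, 2.0, 1.0) for t in grid]
prior2 = [gamma_pdf(t, 5.0, 2.0) for t in grid]
prior_mix = [w1 * p + w2 * q for p, q in zip(prior1, prior2)]

post1, z1 = posterior(prior1, lik)
post2, z2 = posterior(prior2, lik)
post_mix, _ = posterior(prior_mix, lik)

# Updated weights alpha' and beta' from the statement of part (b).
w1_new = w1 * z1 / (w1 * z1 + w2 * z2)
w2_new = w2 * z2 / (w1 * z1 + w2 * z2)
max_gap = max(abs(m - (w1_new * p + w2_new * q))
              for m, p, q in zip(post_mix, post1, post2))
print(w1_new, w2_new, max_gap)  # max_gap should be at rounding-error level
```

The agreement here is exact up to rounding because the same sums are used on both sides. It illustrates the algebra behind the identity but does not replace the argument asked for in the exercise.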
6.7 $\color {blue}\star \star \star $ This question explores the idea in Exercise 4.6 further. Apart from part (a)(ii), it does not depend on having completed that exercise.
- (a) Let $(M_\theta )$ be a discrete or absolutely continuous family with range $R$. Let $(X,\Theta )$ be a Bayesian model with model family $M_\theta ^{\otimes n}$. Let $x\in R^n$ and write $x(1)=(x_1,\ldots ,x_{n_1})$, $x(2)=(x_{n_1+1},\ldots ,x_{n})$. Let $(X_1,\Theta )$ and $(X_2,\Theta |_{\{X_1=x(1)\}})$ be Bayesian models with model families $M_\theta ^{\otimes n_1}$ and $M_\theta ^{\otimes n_2}$, where $n_1+n_2=n$.
  - (i) Show that
    
    \[(\Theta |_{\{X_1=x(1)\}})|_{\{X_2=x(2)\}}\eqd \Theta |_{\{X=x\}}.\]
    
    Use likelihood functions to write your argument in a way that covers both the discrete and absolutely continuous cases.
  - (ii) What is the connection between this fact and Exercise 4.6?
- (b) Rewrite your solution to (a)(i) in a Bayesian shorthand notation of your choice.
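For a concrete instance of the statement in Exercise 6.7(a)(i), one can take the Beta-Bernoulli update recorded in Exercise 6.2(b): with prior $\Beta (\alpha ,\beta )$ and data $x=(x_i)_1^n$, the posterior is $\Beta (\alpha +k,\beta +n-k)$ where $k=\sum _1^n x_i$. The sketch below checks that updating on $x(1)$ and then on $x(2)$ yields the same parameters as updating on $x$ all at once. The function name `beta_update`, the data and the split point are illustrative.

```python
def beta_update(a, b, xs):
    """Posterior Beta parameters after observing Bernoulli data xs (Exercise 6.2(b))."""
    k = sum(xs)
    return a + k, b + len(xs) - k

x = [1, 0, 0, 1, 1, 0, 1]  # made-up Bernoulli data
n1 = 3                     # x(1) = x[:n1], x(2) = x[n1:]

all_at_once = beta_update(2.0, 3.0, x)
# The posterior after x(1) becomes the prior for the update on x(2).
two_steps = beta_update(*beta_update(2.0, 3.0, x[:n1]), x[n1:])
print(all_at_once, two_steps)
```

This covers one conjugate pair only. The exercise asks for an argument via likelihood functions that works for any discrete or absolutely continuous family.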