\[ \renewcommand\vec{\boldsymbol} \def\bigO#1{\mathcal{O}(#1)} \def\Cond#1#2{\left(#1\,\middle|\, #2\right)} \def\mat#1{\boldsymbol{#1}} \def\der{{\mathop{}\!\mathrm{d}}} \def\argmax{\text{arg}\,\text{max}} \def\Prob{\text{P}} \def\Expec{\text{E}} \def\logit{\text{logit}} \def\diag{\text{diag}} \]
Aim: “Research to improve human health”
Bayesian statistical analysis in health science
What is it about, and how do we compute it?
‘If we cannot determine the truth, we should choose what is most probable’ (René Descartes, 1596-1650).
Conditional risk: Probability of event \(A\) given \(B\)
We aim for Bayes' theorem on conditional risks
- e.g. risk of disease given exposure: \(P(\textrm{disease}\mid\textrm{exposure})\)
\[ P(\textrm{A}\mid\textrm{B})=\frac{P(\textrm{B}\mid\textrm{A})P(\textrm{A})}{P(\textrm{B})} \]
A priori information \(\leadsto\) a posteriori information
Example: Test sensitivity, specificity and prevalence \(\leadsto\) Positive and negative predictive values
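As a minimal sketch of this example, the predictive values follow from Bayes' theorem applied to the test result. The function name `predictive_values` and the numbers used are assumptions made for illustration only; they are not taken from the presentation.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) computed with Bayes' theorem."""
    # P(test+) = P(test+ | disease) P(disease) + P(test+ | no disease) P(no disease)
    p_pos = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    p_neg = 1 - p_pos
    # PPV = P(disease | test+), NPV = P(no disease | test-)
    ppv = sensitivity * prevalence / p_pos
    npv = specificity * (1 - prevalence) / p_neg
    return ppv, npv


# Hypothetical values, chosen only for illustration
ppv, npv = predictive_values(sensitivity=0.90, specificity=0.95, prevalence=0.01)
print(f"PPV = {ppv:.3f}, NPV = {npv:.4f}")
```

With a prevalence this low, the positive predictive value stays small even for a fairly accurate test, which is the point of combining prior information (prevalence) with the test characteristics.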
This goes much further.
We are interested in the posterior distribution
\[ \begin{align*} P(\theta\mid\textrm{data}) &= \frac{P(\textrm{data}\mid\theta)P(\theta)}{\int P(\textrm{data}\mid\eta)P(\eta)\,d\eta} \\ &\propto \underbrace{P(\textrm{data}\mid\theta)}_{\text{likelihood}} \underbrace{P(\theta)}_{\text{prior}} \end{align*} \]
In essence, the posterior distribution is proportional to the likelihood function times the prior distribution.
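One way to see the proportionality at work is a grid approximation: evaluate likelihood times prior on a grid of parameter values and normalise, so that the sum stands in for the integral in the denominator. This is a sketch under assumptions: a binomial likelihood, a flat prior, and made-up data. The name `grid_posterior` is hypothetical.

```python
def grid_posterior(successes, failures, prior, n_grid=1001):
    """Approximate P(theta | data) on a grid for a binomial likelihood."""
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    # Unnormalised posterior: likelihood times prior at each grid point
    unnorm = [t**successes * (1 - t)**failures * prior(t) for t in grid]
    # The sum plays the role of the normalising integral in the denominator
    total = sum(unnorm)
    return grid, [u / total for u in unnorm]


# Hypothetical data and a flat prior, for illustration only
grid, post = grid_posterior(successes=7, failures=3, prior=lambda t: 1.0)
post_mean = sum(t * w for t, w in zip(grid, post))
print(f"posterior mean is approximately {post_mean:.3f}")
```

The grid approach only scales to a few parameters, but it makes explicit that the normalising constant never needs to be known in closed form.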
Example: a proportion \(p\) with a Beta\((a,b)\) prior, observing \(s\) successes and \(f\) failures
\[ \underbrace{g(p \mid\textrm{data})}_{\text{posterior}} \propto \underbrace{g(\textrm{data}\mid p)}_{\text{likelihood}} \underbrace{g(p)}_{\text{prior}} \]
\[g(p\mid \textrm{data}) \propto p^{a+s-1}(1-p)^{b+f-1}\]
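The step to the last display can be spelled out, assuming the standard beta-binomial setup that the exponents suggest: a Beta\((a,b)\) prior on \(p\) and binomial data with \(s\) successes and \(f\) failures.
\[ g(p\mid \textrm{data}) \propto \underbrace{p^{s}(1-p)^{f}}_{\text{likelihood}} \, \underbrace{p^{a-1}(1-p)^{b-1}}_{\text{prior}} = p^{a+s-1}(1-p)^{b+f-1} \]
Under these assumptions the posterior is again a beta distribution, Beta\((a+s,\, b+f)\), with posterior mean \((a+s)/(a+b+s+f)\).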
Given the posterior distribution, we can do inference and prediction
## [1] 0.208 0.365
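The two numbers above come from the original R session, whose code and inputs are not shown here. As a rough sketch of the kind of computation involved, the following assumes a Beta\((a+s,\, b+f)\) posterior and summarises it by simulation with the Python standard library. The function name `beta_posterior_summary`, the prior, and the data are hypothetical, and the result is not meant to reproduce the output above.

```python
import random


def beta_posterior_summary(a, b, s, f, level=0.90, n_draws=100_000, seed=1):
    """Summarise a Beta(a + s, b + f) posterior by simulation."""
    rng = random.Random(seed)
    a_post, b_post = a + s, b + f
    draws = sorted(rng.betavariate(a_post, b_post) for _ in range(n_draws))
    # Inference: equal-tailed credible interval from empirical quantiles
    tail = (1 - level) / 2
    lower = draws[int(tail * n_draws)]
    upper = draws[int((1 - tail) * n_draws) - 1]
    # Prediction: P(next observation is a success) equals the posterior mean
    p_next = a_post / (a_post + b_post)
    return lower, upper, p_next


# Hypothetical prior and data; not the inputs behind the output shown above
lower, upper, p_next = beta_posterior_summary(a=1, b=1, s=7, f=3)
print(f"90% credible interval: ({lower:.3f}, {upper:.3f})")
print(f"P(next observation is a success) = {p_next:.3f}")
```

Simulation is used here only to keep the sketch free of external dependencies; exact beta quantiles would normally come from a statistics library.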
The presentation is at rpubs.com/jhjelmborg/SDU_Bayes_briefly.
References are on the next slide.
Comments
The posterior distribution linking model and data is key