Bayesian Statistics Briefly

Introduction

\[ \renewcommand\vec{\boldsymbol} \def\bigO#1{\mathcal{O}(#1)} \def\Cond#1#2{\left(#1\,\middle|\, #2\right)} \def\mat#1{\boldsymbol{#1}} \def\der{{\mathop{}\!\mathrm{d}}} \def\argmax{\text{arg}\,\text{max}} \def\Prob{\text{P}} \def\Expec{\text{E}} \def\logit{\text{logit}} \def\diag{\text{diag}} \]

Knowledge from data

Aim: “Research to improve human health”

Confounding

Outline

Bayesian statistical analysis in health science

  • Risk modelling, disease mapping, drug testing, \(\ldots\)
  • Subjective or empirical information?

What is it about and how to compute?

Risk principle

‘If we cannot determine the truth, we should choose what is most probable’ (René Descartes, 1596-1650).

Conditional risk: Probability of event \(A\) given \(B\)

We aim for Bayes' theorem on conditional risks.

Conditional risk - definition

\[ P(\textrm{A}\mid\textrm{B})=\frac{P(\textrm{A and B})}{P(\textrm{B})}, \quad P(\textrm{B})\neq 0 \]

E.g. risk of disease given exposure: \(P(\textrm{disease}\mid\textrm{exposure})\)

Conditional risks - targets

  • Sensitivity: risk of positive test given disease
  • Specificity: risk of negative test given no disease
  • PPV: risk of disease given positive test
  • NPV: risk of no disease given negative test
  • RR: risk of outcome given exposure, relative to the risk of outcome given no exposure
  • p-value: risk of the observed value or a more extreme one, given that the null hypothesis is true
  • AUC: risk that two randomly chosen individuals have ordered outcomes given ordered exposures
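
The four diagnostic-test targets in this list can be read directly off a 2x2 table of counts. The following is a minimal Python sketch; the function name `diagnostic_risks` and the counts are hypothetical and chosen only for illustration.

```python
def diagnostic_risks(tp, fn, fp, tn):
    """Conditional risks from a 2x2 table of test result by disease status.

    tp: diseased with positive test,  fn: diseased with negative test
    fp: healthy with positive test,   tn: healthy with negative test
    """
    return {
        "sensitivity": tp / (tp + fn),  # P(positive | disease)
        "specificity": tn / (tn + fp),  # P(negative | no disease)
        "ppv": tp / (tp + fp),          # P(disease | positive)
        "npv": tn / (tn + fn),          # P(no disease | negative)
    }


# Hypothetical counts, not taken from the slides.
risks = diagnostic_risks(tp=90, fn=10, fp=50, tn=850)
for name, value in risks.items():
    print(f"{name}: {value:.3f}")
```

Each quantity conditions on a different margin of the table, which is why sensitivity and PPV can differ widely for the same test.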

Conditional risks - example

Bayes Theorem

  • For events \(A\) and \(B\) such that \(P(B)\neq 0\),

\[ P(\textrm{A}\mid\textrm{B})=\frac{P(\textrm{B}\mid\textrm{A})P(\textrm{A})}{P(\textrm{B})} \]

  • Thomas Bayes (1702-1761), British theologian and mathematician.
  • “An Essay towards solving a Problem in the Doctrine of Chances” was published posthumously in 1764.

A priori information \(\leadsto\) a posteriori information

Example: Test sensitivity, specificity and prevalence \(\leadsto\) Positive and negative predictive values
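
As a sketch of this example, Bayes' theorem turns sensitivity, specificity and prevalence into predictive values. The function name `predictive_values` and the input numbers are assumptions made for illustration, not values from the presentation.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) via Bayes' theorem.

    PPV = P(disease | positive), NPV = P(no disease | negative).
    """
    # Total probability of a positive and of a negative test.
    p_pos = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    p_neg = (1 - sensitivity) * prevalence + specificity * (1 - prevalence)
    ppv = sensitivity * prevalence / p_pos
    npv = specificity * (1 - prevalence) / p_neg
    return ppv, npv


# Hypothetical test characteristics and prevalence.
ppv, npv = predictive_values(sensitivity=0.90, specificity=0.95, prevalence=0.01)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

With a low prevalence the PPV stays small even for a fairly accurate test, because most positive results come from the large non-diseased group.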

This goes much further.

Bayesian Statistics

Comments

We focus on the defining property, the controversy and the scope of applications.

The posterior distribution linking model and data is key

The Bayes Theorem revisited

We are interested in the posterior distribution function \[ P(\theta\mid\textrm{data}) = \frac{P(\textrm{data}\mid\theta)P(\theta)}{\int P(\textrm{data}\mid\eta)P(\eta)d\eta} \]

association

The Bayes Theorem revisited

\[ \begin{align*} P(\theta\mid\textrm{data}) &= \frac{P(\textrm{data}\mid\theta)P(\theta)}{\int P(\textrm{data}\mid\eta)P(\eta)d\eta} \\ &\propto \underbrace{P(\textrm{data}\mid\theta)}_{\text{likelihood}} \underbrace{P(\theta)}_{\text{prior}} \end{align*} \]

The posterior distribution is proportional to the likelihood function times the prior distribution.
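
The proportionality can be made concrete with a grid approximation: evaluate likelihood times prior on a grid of parameter values and divide by the sum, which plays the role of the integral in the denominator. This is a minimal sketch; the helper `grid_posterior`, the binomial-type likelihood, the flat prior and the counts are all assumptions for illustration.

```python
def grid_posterior(likelihood, prior, grid_size=1001):
    """Approximate the posterior on an evenly spaced grid over [0, 1].

    Returns the grid and posterior weights that sum to one.
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    unnormalised = [likelihood(t) * prior(t) for t in grid]
    total = sum(unnormalised)  # stands in for the normalising integral
    return grid, [u / total for u in unnormalised]


# Hypothetical data: 3 events and 7 non-events, with a flat prior.
s, f = 3, 7
grid, post = grid_posterior(lambda t: t**s * (1 - t)**f, lambda t: 1.0)
mode = grid[post.index(max(post))]
print(f"posterior mode on the grid: {mode:.3f}")
```

The normalising constant never has to be known in closed form; only the product of likelihood and prior is evaluated.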

Bayesian statistics

In essence, \[ P(\theta\mid\textrm{data}) \propto \underbrace{P(\textrm{data}\mid\theta)}_{\text{likelihood}} \underbrace{P(\theta)}_{\text{prior}} \]

  • The likelihood function is fundamental in statistics! (R.A. Fisher, 1922)
  • The prior distribution \(P(\theta)\) may be empirical
  • Bayesian statistics: The prior distribution \(P(\theta)\) may be subjective!!

Comments

  • Everybody agrees when all knowledge comes from data.
  • The subjective prior distinguishes the field.
  • We illustrate with an example: what is a prior distribution for the proportion \(p\) of COVID-19 infection?

Bayesian Computation

The test for COVID-19 revisited

  • We focus on the proportion of infected, denoted by \(p\).
  • Hence \(\theta=p\) and we want to learn about \(p\) using data and prior belief.

\[ \underbrace{g(p \mid\textrm{data})}_{\text{posterior}} \propto \underbrace{g(\textrm{data}\mid p)}_{\text{likelihood}} \underbrace{g(p)}_{\text{prior}} \]

  • Likelihood: \(g(\textrm{data}\mid p) \propto p^s(1-p)^f\), where \(s,f\) are numbers of infected and non-infected, respectively.
  • Prior: ‘I believe a much higher proportion are infected, at least …’

The proportion of infected - likelihood

  • Data: counts of infected and tested. Today, infected among those PCR-tested.

The proportion of infected - prior

  • Belief: 90% sure that proportion is less than a half
  • Belief: Median infection at 40%, that is, equally likely that p is smaller or larger than 40%
  • Probability theory: A beta distribution is a natural model for a continuous proportion, with hyper-parameters \(a,b\) chosen to reflect beliefs. \[g(p) \propto p^{a-1}(1-p)^{b-1}\]
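
One way to turn the two stated beliefs into hyper-parameters is to search for \(a,b\) whose beta distribution has median 0.4 and 90% quantile 0.5. The sketch below is only an illustration under stated assumptions: it uses the Python standard library, a midpoint-rule approximation to the beta distribution function, and nested bisection that assumes the median is below one half and that the distribution tightens as \(a\) grows. The function names are hypothetical, and the slides do not state which method or values were actually used.

```python
import math


def beta_cdf(x, a, b, steps=1000):
    """Approximate P(p <= x) for p ~ Beta(a, b) with the midpoint rule."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    h = x / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.exp(log_norm + (a - 1) * math.log(t)
                          + (b - 1) * math.log(1 - t))
    return total * h


def b_for_median(a, median, iters=30):
    """For fixed a, bisect on b so that the median matches (median < 0.5)."""
    lo, hi = a, 10 * a + 10
    for _ in range(iters):
        mid = (lo + hi) / 2
        if beta_cdf(median, a, mid) < 0.5:
            lo = mid  # too little mass below the median: increase b
        else:
            hi = mid
    return (lo + hi) / 2


def select_beta(median=0.4, upper=0.5, upper_prob=0.9, iters=30):
    """Bisect on a so that P(p < upper) = upper_prob at the given median."""
    lo, hi = 1.0, 200.0
    for _ in range(iters):
        a = (lo + hi) / 2
        if beta_cdf(upper, a, b_for_median(a, median)) < upper_prob:
            lo = a  # prior too diffuse: increase a
        else:
            hi = a
    a = (lo + hi) / 2
    return a, b_for_median(a, median)


a, b = select_beta()
print(f"a = {a:.2f}, b = {b:.2f}")
print(f"P(p < 0.4) = {beta_cdf(0.4, a, b):.3f}")
print(f"P(p < 0.5) = {beta_cdf(0.5, a, b):.3f}")
```

Two beliefs determine the two hyper-parameters; further beliefs could be used to check whether the beta family describes the prior opinion adequately.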

The proportion of infected - posterior

  • We may now obtain the posterior by combining data and prior beliefs:

\[g(p\mid \textrm{data}) \propto p^{a+s-1}(1-p)^{b+f-1}\]
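
Multiplying the likelihood \(p^s(1-p)^f\) by the prior \(p^{a-1}(1-p)^{b-1}\) adds the exponents, so the posterior is again a beta distribution with parameters \(a+s\) and \(b+f\). A minimal Python sketch of this update follows; the function names and all numbers are hypothetical and not taken from the presentation.

```python
def posterior_params(a, b, s, f):
    """Beta(a, b) prior with s infected and f non-infected -> Beta(a+s, b+f)."""
    return a + s, b + f


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical prior and data.
a, b = 2.0, 3.0
s, f = 5, 7
a_post, b_post = posterior_params(a, b, s, f)
print(f"posterior: Beta({a_post}, {b_post})")
print(f"prior mean {beta_mean(a, b):.3f} -> posterior mean {beta_mean(a_post, b_post):.3f}")
```

The posterior mean lies between the prior mean \(a/(a+b)\) and the observed proportion \(s/(s+f)\), with more data pulling it towards the latter.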

The Bayes computation

Comments

Given the posterior distribution, we can do inference and prediction.

  • A 90% credible interval for \(p\) is given by \([0.208,\, 0.365]\).
  • Prediction: We may predict number of infected individuals.
  • Bayesian Statistics: The example is archetypical
  • ‘Today’s posterior is tomorrow’s prior’
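
Both the interval and the prediction can be approximated by simulating from the beta posterior. The sketch below uses only the Python standard library; the posterior parameters are hypothetical, so the output is not expected to reproduce the interval quoted on the slide, and the function names are assumptions.

```python
import random


def credible_interval(a, b, level=0.90, draws=100_000, seed=1):
    """Equal-tailed credible interval for p ~ Beta(a, b) by simulation."""
    rng = random.Random(seed)
    sample = sorted(rng.betavariate(a, b) for _ in range(draws))
    tail = (1 - level) / 2
    lower = sample[int(tail * draws)]
    upper = sample[int((1 - tail) * draws) - 1]
    return lower, upper


def predict_infected(a, b, n, draws=5_000, seed=1):
    """Posterior predictive draws of the number infected among n new individuals."""
    rng = random.Random(seed)
    counts = []
    for _ in range(draws):
        p = rng.betavariate(a, b)  # draw a proportion from the posterior
        counts.append(sum(rng.random() < p for _ in range(n)))  # binomial draw
    return counts


# Hypothetical posterior Beta(12, 28).
low, high = credible_interval(12, 28)
print(f"90% credible interval: [{low:.3f}, {high:.3f}]")
counts = predict_infected(12, 28, n=50)
print(f"predicted mean number infected among 50: {sum(counts) / len(counts):.1f}")
```

The predictive draws carry both the uncertainty about \(p\) and the sampling variability of the new individuals, so they spread wider than a binomial with a fixed \(p\).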

Example - sparse data

  • Autoantibodies in twins discordant for rheumatoid arthritis
  • Theory in (Berg and Hjelmborg 2012) and application in (Svendsen et al. 2011)

‘Today’s posterior is tomorrow’s prior’

  • Literature: (Albert 2009), (Leonard and Hsu 1999), (McElreath 2016)
  • Statistical analysis examples: The Stan project; The R-INLA project
  • Experience\(\ldots\): (Berg and Hjelmborg 2012)

Thank You!

The presentation is at rpubs.com/jhjelmborg/SDU_Bayes_briefly.

References are on the next slide.

References

Albert, Jim. 2009. Bayesian Computation with R. Second Edition. Use R! Springer.
Berg, Stephanie M. van den, and Jacob v.B. Hjelmborg. 2012. “Bayesian Estimation of Twin Concordance Rates.” Behav Genet, no. 42: 857–65. https://doi.org/10.1007/s10519-012-9547-9.
Leonard, Thomas, and John S. J. Hsu. 1999. Bayesian Methods. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press.
McElreath, Richard. 2016. Statistical Rethinking: A Bayesian Course with Examples in R and Stan. Texts in Statistical Science. Boca Raton: CRC Press.
Svendsen et al. 2011. “Autoantibodies for Rheumatoid Arthritis.” Ann Rheum Dis, no. 70: 708–9.