Notes 21

Introduction to Bayesian Statistics | Part 3: General Method for any Prior (ex. Using a Histogram Prior)

Setup

Loading packages

library(tidyverse)
## Warning: package 'ggplot2' was built under R version 4.3.3
library(LearnBayes)

A method for any choice of prior (Section 2.5)

The first paragraph of Section 2.5 outlines a “brute-force” method of summarizing posterior computations for an arbitrary prior density.

  1. Create a sequence of values of \(p\) that covers the interval from 0 to 1.

  2. For each value of \(p\), calculate the prior \(\pi(p)\) and the likelihood \(L(p)\) and multiply the two.

  3. Divide each product by the sum of the products (this is called normalization). As a result, we have a discrete probability distribution over the sequence of values of \(p\) (that approximates the posterior density).

  4. Use the function sample() to take a random sample with replacement from this distribution.

  5. We can use this random sample to calculate probabilities associated with \(p\), or a credible interval.

I’ll refer to these steps below (hopefully, to help clarify things).

The book uses a histogram prior as an example of this method. This type of prior is useful because it simply to specify.

Step 1: sequence of values of p

sleep <- tibble(
  p = seq(from=0, to=1, by=.005)
)

Step 2: prior and likelihood

Prior

In the histogram prior, we’ll use the same weights that we used in Notes 19.

Interval Weights
[0, .1) 1
[.1, .2) 5.2
[.2, .3) 8
[.3, .4) 7.2
[.4, .5) 4.6
[.5, .6) 2.1
[.6, .7) 0.7
[.7, .8) 0.1
[.8, 1] 0

We can use case_when() to set the prior values for our sequence of \(p\)s.

sleep <- sleep |> 
  mutate(
    prior = case_when(
      p < .1 ~ 1, 
      p < .2 ~ 5.2, 
      p < .3 ~ 8, 
      p < .4 ~ 7.2, 
      p < .5 ~ 4.6, 
      p < .6 ~ 2.1, 
      p < .7 ~ 0.7, 
      p < .8 ~ 0.1, 
      TRUE ~ 0
    )
  )

Plot of the prior. (Note that this is not technically a proper distribution, because the area under it is not equal to 1, but this doesn’t matter due to the normalization we’ll do later.)

ggplot(
  data = sleep, 
  mapping = aes(x = p, y = prior)
) + 
  geom_step()

Likelihood

Recall that the likelihood \(L(p)\propto p^{11} (1-p)^{16}\).

sleep <- sleep |> 
  mutate(
    L = p^11 *(1-p)^16
  )

Not to confuse you, but you could also obtain the likelihood with

  • the binomial distribution: dbinom(x = 11, size = 27, prob = p)

  • the beta distribution: dbeta(x = p, shape1 = 12, shape2 = 17)

Step 3: Product and normalization

sleep <- sleep |> 
  mutate(
    prod = prior * L, 
    post_probs = prod / sum(prod)
  )
sleep
## # A tibble: 201 × 5
##        p prior        L     prod post_probs
##    <dbl> <dbl>    <dbl>    <dbl>      <dbl>
##  1 0         1 0        0          0       
##  2 0.005     1 4.51e-26 4.51e-26   1.55e-20
##  3 0.01      1 8.51e-23 8.51e-23   2.92e-17
##  4 0.015     1 6.79e-21 6.79e-21   2.33e-15
##  5 0.02      1 1.48e-19 1.48e-19   5.08e-14
##  6 0.025     1 1.59e-18 1.59e-18   5.45e-13
##  7 0.03      1 1.09e-17 1.09e-17   3.73e-12
##  8 0.035     1 5.46e-17 5.46e-17   1.87e-11
##  9 0.04      1 2.18e-16 2.18e-16   7.48e-11
## 10 0.045     1 7.33e-16 7.33e-16   2.51e-10
## # ℹ 191 more rows

Plot of posterior distribution (discrete approx.)

ggplot(data = sleep) + 
  geom_segment(
    mapping = aes(x = p, xend = p, y = 0, yend = post_probs)
  )

Step 4: Using sample() to randomly draw from posterior

This randomly draws 1000 proportions from our sequence of \(p\)s, with replacement, according to the posterior probabilities.

# set.seed(123)
sim <- tibble(
  ps = sample(x = sleep$p, size = 1000, replace = TRUE, prob = sleep$post_probs)
)

Histogram of these proportions

ggplot(
  data = sim, 
  mapping = aes(x = ps)
) + 
  geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Step 5: Posterior inference based on the random draws

What is the probability that \(p\) is greater than 0.5?

sim |> 
  summarize(
    `P(p>0.5)` = mean(ps > .5)
  )
## # A tibble: 1 × 1
##   `P(p>0.5)`
##        <dbl>
## 1       0.06

What is the 90% posterior credible interval for \(p\)?

sim |> 
  summarize(
    lower = quantile(ps, .05), upper = quantile(ps, .95)
  )
## # A tibble: 1 × 2
##   lower upper
##   <dbl> <dbl>
## 1 0.255  0.52

Note about reporting results from simulation

As we said before, simulation results are only approximately correct. As a result,

Only report as many digits as are consistent across multiple simulations (without setting a random seed)

  • Compare the results of the above two chunks across multiple simulations.

  • What would be appropriate values to report? ______ and ______

Using a larger sample size will make the results across simulations more consistent, and allow you to report more digits (but it will take longer to run).

  • Try simulating 100,000 values of \(p\) instead of 1000 and comparing results across simulations.

  • Values to report in this case? ______ and ______