Grid approximation - a simple and not really efficient way to approximate the posterior distribution.
hint: remember the distribution is continuous and we want to avoid integrals
library(tidyverse)
# Store draws (1 = R, 0 = B)
draws <- c(1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1)
# We need to define how coarse the grid is
grid_points <- 100
# Define Bayes theorem through grid approximation
grid_posterior <- tibble(
# GRID OF PARAMETER VALUES
grid = seq(from = 0,
to = 1,
length.out = grid_points),
# UNINFORMATIVE PRIOR
prior = 1,
# LIKELIHOOD
likelihood = dbinom(sum(draws), size = length(draws), prob = grid),
# POSTERIOR
posterior = (prior * likelihood) / sum(prior * likelihood)
)If we want to sample to summarise the data (the posterior), we can simply sample with replacement the parameter values and create a frequency distribution of observed values
:::: columns
Instead, if we want to sample to make predictions (inference), we need an extra step: 1. Sample a parameter value from the posterior 2. Simulate the next N observations with that value 3. Sample one of the observations (weighting each observation by its plausibility)
Repeat this over and over
Imagine we put each possible value in a bag, in a proportion that respects its posterior plausibility. For instance, our bag will contain more 0.30 values that everything else.
We shuffle the bag and we extract one value at random.
Now we sample again
And again…
…and again…
… and again thousands of times
If we have a dataset containing the posterior values, we can sample with replacement its rows. If we have a vector, we can sample its values.
# if we work with a tibble/dataframe we can use dplyr `slice_sample`
slice_sample(my_dataframe, n = 1000, weight_by = plausibility, raplace = TRUE)
# If we work with a single vector of values we use the base function 'sample'
sample(my_vector, size = 1000, replace = TRUE, prob = plausibility)Our data looks like this at the moment
# A tibble: 6 × 3
grid distribution plausibility
<dbl> <chr> <dbl>
1 0 prior 1 e+ 0
2 0 likelihood 0
3 0 posterior 0
4 0.0101 prior 1 e+ 0
5 0.0101 likelihood 3.40e-12
6 0.0101 posterior 4.12e-13
Our data looks like this at the moment
# A tibble: 6 × 3
grid distribution plausibility
<dbl> <chr> <dbl>
1 0 prior 1 e+ 0
2 0 likelihood 0
3 0 posterior 0
4 0.0101 prior 1 e+ 0
5 0.0101 likelihood 3.40e-12
6 0.0101 posterior 4.12e-13
Our data looks like this at the moment
# A tibble: 6 × 3
grid distribution plausibility
<dbl> <chr> <dbl>
1 0 prior 1 e+ 0
2 0 likelihood 0
3 0 posterior 0
4 0.0101 prior 1 e+ 0
5 0.0101 likelihood 3.40e-12
6 0.0101 posterior 4.12e-13
Having defined this, we can use slice_sample to sample from the posterior distribution
# NOTE: we need to select only the rows that refer to the posterior distribution!
# To do so we use `filter` to select those rows where distribution == posterior
posterior_samples <- grid_posterior %>%
filter(distribution == "posterior") %>%
slice_sample(n = 10000, weight_by = plausibility, replace = TRUE)Note, let’s compare the posterior distribution with the posterior samples
They look similar right?
Pay attention that the two distributions have different meaning. Specifically, the sampling distribution is a distribution of frequencies of parameter values and not a distribution of plausibilities.
For instance, we want to know how much of the posterior probability lies between 0.3 and 0.7.
In this case we can compute the proportion of samples where the parameter value is within 0.3 and 0.7:
\[\text{interval} = \frac{\text{number samples in the interval}}{\text{total number of sample}}\]
# A tibble: 1 × 1
proportion
<dbl>
1 0.727
Another way to do this is:
[1] 0.7266
Remember: \(\frac{\text{number samples in the interval}}{\text{total number of sample}}\)
We can also find the compatibility/credibility intervals - aka the confidence intervals of the posterior distribution. While before we defined the boundaries of the data, here we define the mass we want to capture (eg. 80% of the data) and then we find the boundaries that contain this mass.
80%
0.7272727
Let’s say we want to find the 50% compatibility interval of this distribution
Don’t worry, we break this down together step-by-step
# The first part is as before
# Assuming we have our grid posterior in long format from last time
grid_posterior %>%
# We select only the rows that contain information regarding the posterior distribution
filter(distribution == "posterior") %>%
# We now sample with replacement the remaining rows
slice_sample(n = 1000, weight_by = plausibility, replace = TRUE) %>%
# Now for each row (sample) we want to create a binomial distribution using the parameter value as probability
# Basically it's like we are assuming that parameter value is the correct one. Once we have the binomial for each
# row, we will sample one value from it as discussed at the beginning. To do all of this we:
# 1. tell R to work rowwise (row-by-row)
rowwise() %>%
#2. we create the binomial distribution and sample from it in one line. We will store only the sampled value
mutate(sampled_values = sample(0:9, size = 1, prob = dbinom(0:9, 9, prob = grid)))# A tibble: 6 × 4
# Rowwise:
grid distribution plausibility sampled_values
<dbl> <chr> <dbl> <int>
1 0.677 posterior 0.0284 7
2 0.455 posterior 0.0142 4
3 0.616 posterior 0.0293 6
4 0.677 posterior 0.0284 6
5 0.687 posterior 0.0277 7
6 0.556 posterior 0.0255 8
From 3E1 to 3E5