Using Bayesian Models for Understanding and Prediction
February 13, 2023
Discuss: How many values would we have in our grid approximation?
Answer: \(100^{10}\)
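The count follows from multiplying grid sizes across parameters. A minimal sketch, assuming 10 parameters with 100 grid values each (this setup is inferred from the answer, not stated above):

```r
# Assumed setup: 100 grid values for each of 10 parameters.
points_per_parameter <- 100
n_parameters <- 10

# A full grid holds every combination of values, so the sizes multiply.
n_grid_values <- points_per_parameter^n_parameters
format(n_grid_values, scientific = TRUE)
```

This growth is why grid approximation becomes impractical beyond a handful of parameters.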
Suppose we toss the globe 9 times and get 6 waters.
We can use the sample command to draw random parameters from the posterior in proportion to their probability.
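A minimal sketch of this step in base R, assuming a 1000-point grid, a uniform prior, and 10,000 draws (the grid size, seed, and number of draws are illustrative choices, not taken from the slides):

```r
set.seed(100)  # illustrative seed so the draws are reproducible

# Grid approximation for 6 waters in 9 tosses with a uniform prior.
p_grid <- seq(0, 1, length.out = 1000)
prior <- rep(1, 1000)
likelihood <- dbinom(6, size = 9, prob = p_grid)
posterior <- likelihood * prior
posterior <- posterior / sum(posterior)  # normalise so the grid sums to 1

# Draw parameter values in proportion to their posterior probability.
samples <- sample(p_grid, size = 1e4, replace = TRUE, prob = posterior)
head(samples)
```

Values of p with higher posterior probability appear more often in samples, so summaries of samples approximate summaries of the posterior.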
Add the maximum a posteriori (MAP) estimate (i.e. 6/9).
MAP is similar to the maximum likelihood estimate (MLE); with a uniform prior, the two coincide.
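As a rough check, the MAP can be read off a grid-approximate posterior and compared with the MLE. This sketch assumes a 1000-point grid and a uniform prior; the variable names are illustrative:

```r
# Grid-approximate posterior for 6 waters in 9 tosses, uniform prior.
p_grid <- seq(0, 1, length.out = 1000)
posterior <- dbinom(6, size = 9, prob = p_grid) * rep(1, 1000)
posterior <- posterior / sum(posterior)

# MAP: the grid value with the highest posterior probability.
map_estimate <- p_grid[which.max(posterior)]

# MLE for a binomial proportion: observed successes over trials.
mle <- 6 / 9

c(MAP = map_estimate, MLE = mle)
```

With a non-uniform prior the MAP would generally move away from 6/9, while the MLE would not.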
Note: Only need to call library(rethinking) once per R session!
Add analytical posterior to check.
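One way to run this check numerically, using the standard conjugacy result (not stated in the slides) that a uniform prior with 6 waters in 9 tosses gives a Beta(7, 4) posterior. The 1000-point grid is an illustrative choice:

```r
# Grid-approximate posterior (a uniform prior only adds a constant factor).
p_grid <- seq(0, 1, length.out = 1000)
posterior <- dbinom(6, size = 9, prob = p_grid)
posterior <- posterior / sum(posterior)

# Analytical posterior: Beta(6 + 1, 3 + 1).
analytic_density <- dbeta(p_grid, shape1 = 7, shape2 = 4)

# Grid values are probabilities per grid point; divide by the grid
# spacing to put them on the density scale before comparing.
grid_step <- p_grid[2] - p_grid[1]
max_abs_diff <- max(abs(posterior / grid_step - analytic_density))
max_abs_diff
```

On an existing density plot, the same curve could be overlaid with curve(dbeta(x, 7, 4), add = TRUE).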
If we have the grid-approximate posterior (posterior), we can calculate Pr[A \(\leq\) p \(\leq\) B]. For example, if A = 0 and B = 0.5:
So, Pr[p \(\leq\) 0.5] = 0.1718746.
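A sketch of that calculation, assuming a 1000-point grid and a uniform prior (the grid size is an assumption; the quoted value depends on it slightly):

```r
# Grid-approximate posterior for 6 waters in 9 tosses, uniform prior.
p_grid <- seq(0, 1, length.out = 1000)
posterior <- dbinom(6, size = 9, prob = p_grid)
posterior <- posterior / sum(posterior)

# Add up the posterior probability of every grid value below 0.5.
pr_below_half <- sum(posterior[p_grid < 0.5])
pr_below_half

# Cross-check against the analytical Beta(7, 4) posterior.
pbeta(0.5, shape1 = 7, shape2 = 4)
```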
If we have samples from the posterior (samples), we can calculate Pr[A \(\leq\) p \(\leq\) B] as well. For example, if A = 0 and B = 0.5:
Since samples appear in proportion to their probability, we only need to count the number of samples less than 0.5 and divide by the sample size.
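The counting approach might look like this, with the samples vector recreated so the sketch runs on its own (seed, grid size, and number of draws are illustrative):

```r
set.seed(100)

# Recreate posterior samples for 6 waters in 9 tosses, uniform prior.
p_grid <- seq(0, 1, length.out = 1000)
posterior <- dbinom(6, size = 9, prob = p_grid)
posterior <- posterior / sum(posterior)
samples <- sample(p_grid, size = 1e4, replace = TRUE, prob = posterior)

# Proportion of samples below 0.5 approximates Pr[p <= 0.5].
pr_below_half <- sum(samples < 0.5) / length(samples)
pr_below_half

# The same idea works for any interval, e.g. Pr[0.5 <= p <= 0.75].
pr_middle <- sum(samples >= 0.5 & samples <= 0.75) / length(samples)
```

The result differs a little from the grid calculation because of sampling noise, and the gap shrinks as the number of draws grows.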
Definition: An interval [A,B] with posterior probability mass \(x\) is called a (100\(x\))% credible interval or (100\(x\))% compatibility interval, given by Pr[A \(\leq p \leq\) B] = \(x\).
3 special types:
From samples, we can compute quantiles using the quantile function in R.
We can also compute percentile intervals. For example, the 80% PI is:
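A sketch of both calculations with base R's quantile, recreating samples so the code is self-contained (seed and sizes are illustrative):

```r
set.seed(100)

# Recreate posterior samples for 6 waters in 9 tosses, uniform prior.
p_grid <- seq(0, 1, length.out = 1000)
posterior <- dbinom(6, size = 9, prob = p_grid)
posterior <- posterior / sum(posterior)
samples <- sample(p_grid, size = 1e4, replace = TRUE, prob = posterior)

# A single quantile: the value below which 80% of the posterior mass lies.
q_80 <- quantile(samples, 0.8)

# 80% percentile interval: leave 10% of the mass in each tail.
pi_80 <- quantile(samples, c(0.1, 0.9))
pi_80
```

A percentile interval always puts equal mass in each tail, which describes the posterior well when it is roughly symmetric.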
Highest Posterior Density Interval (HPDI): Narrowest interval containing specified probability mass
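To make the definition concrete, here is a base-R sketch that searches sorted samples for the narrowest window holding the requested mass. The helper hpdi_from_samples is hypothetical and written only for illustration; the rethinking package provides its own HPDI function for this purpose.

```r
set.seed(100)

# Recreate posterior samples for 6 waters in 9 tosses, uniform prior.
p_grid <- seq(0, 1, length.out = 1000)
posterior <- dbinom(6, size = 9, prob = p_grid)
posterior <- posterior / sum(posterior)
samples <- sample(p_grid, size = 1e4, replace = TRUE, prob = posterior)

# Hypothetical helper: narrowest interval containing `prob` of the samples.
hpdi_from_samples <- function(x, prob = 0.8) {
  x <- sort(x)
  n <- length(x)
  k <- ceiling(prob * n)              # samples the interval must contain
  starts <- seq_len(n - k + 1)        # every possible left endpoint
  widths <- x[starts + k - 1] - x[starts]
  i <- which.min(widths)              # narrowest candidate window
  c(lower = x[i], upper = x[i + k - 1])
}

hpdi_80 <- hpdi_from_samples(samples, prob = 0.8)
pi_80 <- quantile(samples, c(0.1, 0.9))
hpdi_80
```

For a skewed posterior the HPDI shifts toward the high-density region and comes out narrower than the percentile interval with the same mass.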
Let’s look at different data.
p-th quantile:
Definition: The maximum a posteriori (MAP) estimate is the parameter value at which the posterior distribution is highest (the posterior mode).
Definition: The median is the 0.5 quantile (50th percentile) of the posterior.
Definition: The mean is the average of the posterior P, written as \[ \int p\mathrm{P}(p)\,dp. \]
Useful in Homework #3.
In general, we will not use this technique. We will instead compute estimates from samples from the posterior.
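A sketch of computing these point estimates from samples in base R. Using a kernel density estimate to locate the mode is one illustrative choice, not necessarily what the course uses; the seed and sizes are also assumptions.

```r
set.seed(100)

# Recreate posterior samples for 6 waters in 9 tosses, uniform prior.
p_grid <- seq(0, 1, length.out = 1000)
posterior <- dbinom(6, size = 9, prob = p_grid)
posterior <- posterior / sum(posterior)
samples <- sample(p_grid, size = 1e4, replace = TRUE, prob = posterior)

# Mean and median straight from the samples.
posterior_mean <- mean(samples)
posterior_median <- median(samples)

# Mode (MAP) from samples: peak of a kernel density estimate.
dens <- density(samples)
posterior_mode <- dens$x[which.max(dens$y)]

# For comparison, the integral approximated by a sum over the grid.
posterior_mean_grid <- sum(p_grid * posterior)

c(mean = posterior_mean, median = posterior_median, mode = posterior_mode)
```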
Let’s go back to the original model with 6 waters in 9 tosses, assuming a uniform prior. The resulting posterior distribution from the trained model is:
Definition: A Posterior Predictive Distribution is the distribution of new data averaged over the posterior: the data distribution for each parameter value, weighted by that value's posterior probability.
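One way to write this out for the globe-tossing model, using notation assumed here rather than taken from the slides (\(\tilde{w}\) for a future count of waters, P for the posterior as in the definition of the mean):
\[ \Pr(\tilde{w}) = \int_0^1 \Pr(\tilde{w} \mid p)\,\mathrm{P}(p)\,dp, \]
where \(\Pr(\tilde{w} \mid p)\) is the binomial probability of \(\tilde{w}\) waters given p.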
Question: How do we sample from a posterior predictive distribution?
Two step process:
Step 1. Sample a parameter from the posterior using sample:
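A sketch of a single posterior predictive draw. The second step is not shown in the text here; this sketch assumes it is a binomial simulation with rbinom using the sampled parameter, with 9 tosses to match the original data.

```r
set.seed(100)

# Grid-approximate posterior for 6 waters in 9 tosses, uniform prior.
p_grid <- seq(0, 1, length.out = 1000)
posterior <- dbinom(6, size = 9, prob = p_grid)
posterior <- posterior / sum(posterior)

# Step 1: draw one parameter value from the posterior.
p_draw <- sample(p_grid, size = 1, prob = posterior)

# Step 2 (assumed): simulate one data set of 9 tosses given that value.
w_draw <- rbinom(1, size = 9, prob = p_draw)

c(p = p_draw, waters = w_draw)
```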
For multiple samples, we can use the previous samples vector (recreated here for clarity):
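A vectorised sketch, assuming that each posterior sample is paired with one simulated data set of 9 tosses (seed and sizes are illustrative):

```r
set.seed(100)

# Recreate posterior samples for 6 waters in 9 tosses, uniform prior.
p_grid <- seq(0, 1, length.out = 1000)
posterior <- dbinom(6, size = 9, prob = p_grid)
posterior <- posterior / sum(posterior)
samples <- sample(p_grid, size = 1e4, replace = TRUE, prob = posterior)

# One simulated count of waters per sampled parameter value;
# rbinom recycles over the vector of probabilities.
w_pred <- rbinom(length(samples), size = 9, prob = samples)

# Estimated posterior predictive distribution over 0..9 waters.
ppd <- table(factor(w_pred, levels = 0:9)) / length(w_pred)
round(ppd, 3)
```

Because every draw uses a different p, the simulated counts carry both sampling variation and uncertainty about p.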
It is usually a good idea to check to see how well the model predicts the data you used to train it. It SHOULD be a good fit, as long as
So how likely is 6 waters out of 9 (our training data) according to the posterior predictive distribution?
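One way to estimate this is the share of simulated data sets with exactly 6 waters. The simulation setup below (seed, grid size, number of draws) is illustrative:

```r
set.seed(100)

# Posterior samples, then one simulated 9-toss data set per sample.
p_grid <- seq(0, 1, length.out = 1000)
posterior <- dbinom(6, size = 9, prob = p_grid)
posterior <- posterior / sum(posterior)
samples <- sample(p_grid, size = 1e4, replace = TRUE, prob = posterior)
w_pred <- rbinom(length(samples), size = 9, prob = samples)

# Estimated posterior predictive probability of observing 6 waters.
pr_six <- mean(w_pred == 6)
pr_six
```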
Discuss: Where do I predict 89% of future data will be?
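A possible starting point for the discussion is an 89% percentile interval of the simulated counts. This is only one reading of the question; the equal-tailed interval and quantile type 1 (which returns observed counts rather than interpolated values) are choices made for this sketch.

```r
set.seed(100)

# Posterior samples, then one simulated 9-toss data set per sample.
p_grid <- seq(0, 1, length.out = 1000)
posterior <- dbinom(6, size = 9, prob = p_grid)
posterior <- posterior / sum(posterior)
samples <- sample(p_grid, size = 1e4, replace = TRUE, prob = posterior)
w_pred <- rbinom(length(samples), size = 9, prob = samples)

# 89% interval for future counts: 5.5% of simulations in each tail.
pred_89 <- quantile(w_pred, probs = c(0.055, 0.945), type = 1)
pred_89

# Share of simulated data sets that fall inside the interval.
coverage <- mean(w_pred >= pred_89[1] & w_pred <= pred_89[2])
```

Because counts are discrete, the interval usually covers somewhat more than 89% of the simulations.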