Definition: An interval [A,B] containing posterior probability mass \(x\), that is, Pr[A \(\leq p \leq\) B] = \(x\), is called a (100\(x\))% credible interval or (100\(x\))% compatibility interval.
3 special types:
Quantiles: When A = 0 (the smallest possible value of \(p\)), B is called the \(x\)-quantile.
Percentile Intervals (PI): Equal probability mass in each tail.
Highest Posterior Density Interval (HPDI): Narrowest interval containing specified probability mass.
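The three interval types can be sketched in base R as follows. This is a minimal illustration, not the course's own code: the samples are an assumed stand-in drawn from a Beta(7, 4), and `hpdi_from_samples` is a hypothetical helper written for this example.

```r
set.seed(1)
# Illustrative stand-in for posterior samples of p (an assumption, not data from the notes)
samples <- rbeta(1e4, 7, 4)

# 0.8 quantile: upper bound B of the interval [0, B] holding 80% of the mass
q80 <- quantile(samples, 0.8)

# 50% percentile interval: 25% of the mass in each tail
pi50 <- quantile(samples, c(0.25, 0.75))

# Hypothetical helper: narrowest window of sorted samples containing `prob` of them
hpdi_from_samples <- function(x, prob = 0.5) {
  x <- sort(x)
  n <- length(x)
  k <- ceiling(prob * n)                   # number of samples the interval must contain
  starts <- 1:(n - k + 1)                  # every possible left end point
  widths <- x[starts + k - 1] - x[starts]  # width of each candidate window
  i <- which.min(widths)                   # narrowest candidate
  c(lower = x[i], upper = x[i + k - 1])
}

hpdi50 <- hpdi_from_samples(samples, 0.5)
```

For a skewed posterior the HPDI is shifted toward the peak and is never wider than the percentile interval with the same mass.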
Definition: The maximum a posteriori (MAP) estimate is the parameter value at which the posterior distribution attains its maximum, i.e. the posterior mode.
Definition: The median is the 0.5 quantile (the 50th percentile).
Definition: The mean is the expected value of \(p\) under the posterior \(\mathrm{P}\), written as \[
\int p\,\mathrm{P}(p)\,dp.
\]
Summarizing: Point estimates
Computing MAP estimate from posterior
Useful in Homework #4.
p_grid[ which.max(posterior) ]
[1] 1
In general, we will not use this technique. We will instead compute estimates from samples from the posterior.
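As a minimal sketch of how such samples can be produced, assuming a grid approximation for a binomial proportion with a uniform prior: the data (3 successes in 3 trials), the grid size, and the seed are illustrative assumptions, chosen so that the posterior mode sits at 1 as in the output above.

```r
set.seed(100)
# Assumed setup: grid approximation for a binomial proportion p with a uniform prior.
# The data (3 successes in 3 trials) are an illustrative assumption.
p_grid <- seq(from = 0, to = 1, length.out = 1000)
prior <- rep(1, 1000)
likelihood <- dbinom(3, size = 3, prob = p_grid)
posterior <- likelihood * prior
posterior <- posterior / sum(posterior)   # normalize so the grid weights sum to 1

# Draw grid values in proportion to their posterior probability
samples <- sample(p_grid, size = 1e4, replace = TRUE, prob = posterior)

# Estimates computed from the samples rather than from the grid
mean(samples)
median(samples)
```

Once `samples` exists, every summary becomes an ordinary operation on a numeric vector, which is why the sample-based approach generalizes to models where no grid is available.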
Code summary using samples
MAP estimate:
chainmode( samples, adj=0.01 )
[1] 0.97785
Mean:
mean( samples )
[1] 0.7998607
Median:
median( samples )
[1] 0.8388388
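`chainmode` comes from the rethinking package. If that package is not available, a rough base-R analogue is to take the peak of a kernel density estimate of the samples. The sketch below assumes that approach is acceptable and is not guaranteed to reproduce `chainmode` exactly; `density_mode` is a hypothetical helper and the samples are an illustrative stand-in.

```r
set.seed(100)
# Illustrative samples (assumption): draws from a Beta(4, 1), whose density peaks at 1
samples <- rbeta(1e4, 4, 1)

# Hypothetical helper: mode of a kernel density estimate of the samples
density_mode <- function(x, adjust = 0.01) {
  d <- density(x, adjust = adjust)   # smaller `adjust` means less smoothing
  d$x[which.max(d$y)]                # grid point where the estimated density is highest
}

density_mode(samples)
```

A very small `adjust` tracks the samples closely but makes the estimate noisy, so the result varies from one set of samples to the next.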
Summarizing posterior: Final thoughts
In Bayesian Inference, the entire posterior distribution is your estimate! It includes all of the uncertainty in your inference.
Summarizing using intervals or point estimates severely limits the power of Bayesian inference and introduces certainty where none exists (i.e. it throws away uncertainty).
Best practice is to use the entire posterior when possible.
Using Bayesian Models for Prediction
Training the Model for Prediction
When using the model for inference, the posterior distribution is the object of interest (the full distribution, or point and interval estimates derived from it).
When using the model for prediction, the simulated samples from the trained model are of interest.
The trained model is simply the statistical model with its parameter described by the posterior distribution learned from the data.
[Figure: bar chart of probability (y-axis) against number of successes (x-axis).]
Constrained Model Parameters
After training
Let’s go back to the original model with 6 waters in 9 tosses, assuming a uniform prior. The resulting posterior distribution from the trained model is:
Predicting Future Data
Definition: A Posterior Predictive Distribution is the distribution of future data obtained by averaging the data distribution for each parameter value over the posterior distribution of that parameter.
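In symbols, writing \(\tilde{w}\) for a future count of successes (a symbol introduced here for illustration) and \(\mathrm{P}(p)\) for the posterior, this average is \[
\Pr(\tilde{w}) = \int \Pr(\tilde{w} \mid p)\,\mathrm{P}(p)\,dp.
\]
A minimal R sketch of the same idea by simulation, assuming the 6-waters-in-9-tosses model with a uniform prior and a grid approximation (the grid size, seed, and variable names are assumptions):

```r
set.seed(100)
# Assumed setup: grid posterior for 6 successes in 9 trials with a uniform prior
p_grid <- seq(from = 0, to = 1, length.out = 1000)
posterior <- dbinom(6, size = 9, prob = p_grid) * rep(1, 1000)
posterior <- posterior / sum(posterior)
samples <- sample(p_grid, size = 1e4, replace = TRUE, prob = posterior)

# One simulated outcome per posterior sample: rbinom is vectorized over `prob`,
# so parameter uncertainty is carried into the predictions
w_pred <- rbinom(1e4, size = 9, prob = samples)

# Estimated posterior predictive probabilities for 0..9 successes
ppd <- table(factor(w_pred, levels = 0:9)) / length(w_pred)
ppd
```

Simulating from a single point estimate of \(p\) instead would give a narrower distribution, because it ignores the uncertainty about \(p\) that the posterior carries.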