Prior elicitation is the practice of transforming expert knowledge into a distribution and assigning it proper weight relative to data.
In two groups, complete this exercise and report back next class. Before then, post your group answers to Canvas/Discussion. Here are some functions/packages:
source('clarkFunctions2026.R')
library(repmis)
So far we have used non-informative prior distributions, specified to be overwhelmed by the data. Is the prior nothing more than a device to transform the likelihood into a distribution of parameters (posterior)? How could I make it communicate my prior belief about the distribution of parameters?
If I am asked about my belief that it will rain tomorrow, I would not have a hard time coming up with a value for \(p\). For example, if I feel clueless, my answer would be \(p = 1/2\) (maximum ignorance). If I am asked about the prior distribution of values, i.e., \([p]\), that would be more difficult. How much of the prior distribution should be assigned to values less than, say, \(0.1\) or greater than \(0.8\)?
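One way to make this question concrete is to choose a candidate distribution for \([p]\) and ask how much probability it places in each tail. The sketch below assumes a beta prior, which is my choice for illustration and not something specified above. The three parameter settings are hypothetical, and all share the prior mean \(1/2\).

```r
# Hypothetical Beta(a, b) priors for the probability of rain p (all have mean 1/2)
priors <- list(
  flat     = c(a = 1,  b = 1),    # uniform on (0, 1)
  moderate = c(a = 2,  b = 2),    # mild preference for values near 1/2
  strong   = c(a = 20, b = 20)    # strong belief that p is near 1/2
)

# Prior probability assigned to p < 0.1 and to p > 0.8 under each choice
tailMass <- t( sapply( priors, function(ab){
  c( below0.1 = pbeta( 0.1, ab[['a']], ab[['b']] ),
     above0.8 = pbeta( 0.8, ab[['a']], ab[['b']], lower.tail = FALSE ) )
} ) )
print( round( tailMass, 4 ) )
```

The same prior mean can therefore go with very different amounts of tail probability, which is why stating \([p]\) is harder than stating a single value for \(p\).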
Prior elicitation has a large literature in Bayesian analysis, covering which experts to consult, how to evaluate their input, and how to turn it into a distribution. Alternatively, an objective prior seeks to let the data dominate.
A prior distribution generally has a central tendency, e.g., the prior mean of the normal distribution, and a prior weight relative to the data. The weight of the data is typically proportional to sample size. Together with the noise in the data, the sample size has a direct effect on the posterior variance, e.g., the standard error of a mean estimate, \(\sigma/\sqrt{n}\). The weight of the prior is inversely proportional to its variance. The posterior mean is a weighted average of the data mean and the prior mean, with weights given by these two precisions.
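The weighting can be checked with the simplest case, a normal mean with known \(\sigma\) and a normal prior. This is a minimal sketch with simulated data; the function name `postNormal` and all numerical values are hypothetical. The data weight is \(n/\sigma^2\), the prior weight is the inverse of the prior variance, and the posterior variance is the inverse of their sum.

```r
set.seed( 1 )
sigma <- 2                                  # data standard deviation, assumed known
n     <- 20
y     <- rnorm( n, mean = 5, sd = sigma )   # simulated observations

# Posterior for a normal mean: precision-weighted average of data mean and prior mean
postNormal <- function( y, sigma, priorMean, priorVar ){
  wData    <- length( y )/sigma^2           # weight of the data, n/sigma^2
  wPrior   <- 1/priorVar                    # weight of the prior
  postVar  <- 1/( wData + wPrior )
  postMean <- postVar*( wData*mean( y ) + wPrior*priorMean )
  c( mean = postMean, var = postVar )
}

vague <- postNormal( y, sigma, priorMean = 0, priorVar = 1000 )  # data dominate
tight <- postNormal( y, sigma, priorMean = 0, priorVar = 0.1 )   # prior pulls toward 0
print( rbind( vague, tight ) )
```

With the vague prior the posterior mean sits essentially at `mean( y )` and the posterior variance at \(\sigma^2/n\). With the tight prior the mean is pulled toward zero and the variance shrinks.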
For the regression example we used this model for, say, \(p = 3\) predictors:
\[ [\boldsymbol{\beta}|\mathbf{y}, \mathbf{X}, \sigma^2, \mathbf{b}, \mathbf{B}] \propto \prod_{i=1}^n N(y_i | \mathbf{x}_i' \boldsymbol{\beta}, \sigma^2) \times MVN_p( \boldsymbol{\beta}|\mathbf{b}, \mathbf{B}) \]
Our non-informative prior was centered on zero, but that didn’t matter because the variances were huge:
\[ \mathbf{b} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} , \mathbf{B} = \begin{bmatrix} 1000 & 0 & 0\\ 0 & 1000 & 0 \\ 0 & 0 & 1000 \end{bmatrix} \]
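To see why the zero prior mean does not matter here, this sketch computes the conditional posterior mean of \(\boldsymbol{\beta}\) for simulated data and compares it with least squares. It assumes \(\sigma^2\) is known and uses the standard conjugate result, with posterior covariance \(\mathbf{V} = (\mathbf{X}'\mathbf{X}/\sigma^2 + \mathbf{B}^{-1})^{-1}\) and posterior mean \(\mathbf{V}(\mathbf{X}'\mathbf{y}/\sigma^2 + \mathbf{B}^{-1}\mathbf{b})\). The data and coefficient values are invented for illustration, and this is an ordinary regression, not the Tobit.

```r
set.seed( 2 )
n <- 100
X <- cbind( intercept = 1, x1 = rnorm( n ), x2 = rnorm( n ) )  # p = 3 columns
betaTrue <- c( 1, -2, 0.5 )                 # hypothetical coefficients
sigma2   <- 1                               # residual variance, assumed known
y <- as.vector( X %*% betaTrue + rnorm( n, 0, sqrt( sigma2 ) ) )

b <- rep( 0, 3 )                            # prior mean
B <- 1000*diag( 3 )                         # prior covariance

# Conditional posterior covariance and mean for beta
V        <- solve( crossprod( X )/sigma2 + solve( B ) )
postMean <- as.vector( V %*% ( crossprod( X, y )/sigma2 + solve( B ) %*% b ) )

# Least-squares estimate for comparison
ols <- as.vector( solve( crossprod( X ), crossprod( X, y ) ) )
print( round( cbind( postMean, ols ), 4 ) )
```

Because \(\mathbf{B}^{-1}\) is tiny relative to \(\mathbf{X}'\mathbf{X}/\sigma^2\), the two columns agree closely.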
Here I load the FIA data we used for the Tobit regression:
source_data("https://github.com/jimclarkatduke/gjam/blob/master/xdataCluster.rdata?raw=True")
source_data("https://github.com/jimclarkatduke/gjam/blob/master/baCluster.rdata?raw=True")
y <- baCluster[,'pinusTaeda']
data <- data.frame( cbind(xdataCluster, y) )
Here is this prior distribution, followed by the Tobit fit:
xnames <- c( 'moist','standAge','meanTemp','annualPrec','silt30','sand30' )
form <- vars2formula( xnames )
Q <- length( xnames )
priorB <- matrix( 0, Q, 1, dimnames = list( xnames, NULL ) )
priorVB <- 1000*diag( Q )
colnames( priorVB ) <- rownames( priorVB ) <- xnames
fit1 <- bayesReg(formula = form, data, ng = 5000, TOBIT = TRUE)
Exercise 1. Find the parameter estimates and chains in
object fit1. Determine the following:
To make the prior informative I would likely want something other than zeros in \(\mathbf{b}\) and more weight (smaller variances) in \(\mathbf{B}\).
For the FIA basal area data I want an informative prior reflecting my
knowledge that loblolly pine does well on moist, fertile soils and early
in stand development. I start by specifying variables that I believe to
be important for the species pinusTaeda. These variables
must be in colnames( data ):
| Variable name | Description | Prior belief | Why |
|---|---|---|---|
| moist | site moisture status | positive | field obs |
| standAge | stand age | negative | early successional sp |
| meanTemp | site temperature | unsure | |
| annualPrec | site precipitation | unsure | |
| silt30 | silt content | positive | field obs |
| sand30 | sand content | unsure | |
The prior belief in this table reflects field experience with this species as early-successional and tending to occupy rich (silt) soils.
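As a starting point, the signs in the table can be written into a prior mean vector and a prior covariance matrix. The sketch below only builds the two objects and does not fit a model. The values \(\pm 1\) for the means and 1 for the variances are hypothetical placeholders, and sensible magnitudes depend on the units of each predictor.

```r
xnames <- c( 'moist','standAge','meanTemp','annualPrec','silt30','sand30' )
Q      <- length( xnames )

# Start from the non-informative prior
priorB  <- matrix( 0, Q, 1, dimnames = list( xnames, NULL ) )
priorVB <- 1000*diag( Q )
colnames( priorVB ) <- rownames( priorVB ) <- xnames

# Hypothetical prior means: sign from the table, magnitude is a placeholder
priorB[ 'moist', 1 ]    <-  1
priorB[ 'standAge', 1 ] <- -1
priorB[ 'silt30', 1 ]   <-  1

# More prior weight (smaller variance) only where I hold a belief
informed <- c( 'moist', 'standAge', 'silt30' )
priorVB[ cbind( informed, informed ) ] <- 1   # placeholder variance

print( cbind( priorB, priorVar = diag( priorVB ) ) )
```

Variables marked unsure keep a zero mean and a variance of 1000, so the data still dominate for those coefficients.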
Exercise 2. Make the prior distribution informative and determine its effect. You will need to modify \(\mathbf{b}\) and \(\mathbf{B}\).
In the foregoing exercise I made the prior informative by changing the location and weight of the distribution. An alternative approach uses truncated prior distributions, which assign non-zero probability only to positive values, or only to negative values, of a coefficient.
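To see what such a prior looks like, here is a minimal sketch that draws from a normal distribution truncated to positive or to negative values, using the inverse-CDF method. The function `rTruncSketch` is a hypothetical helper written for this illustration and is not taken from `clarkFunctions2026.R`. The prior mean of 0 and standard deviation of 10 are placeholders.

```r
# Draw from N(mu, sd^2) truncated to (lo, hi) by inverting the normal CDF
rTruncSketch <- function( n, mu, sd, lo = -Inf, hi = Inf ){
  pLo <- pnorm( lo, mu, sd )        # CDF at the lower bound
  pHi <- pnorm( hi, mu, sd )        # CDF at the upper bound
  u   <- runif( n, pLo, pHi )       # uniform draws between the two CDF values
  qnorm( u, mu, sd )                # map back to the coefficient scale
}

set.seed( 3 )
bPos <- rTruncSketch( 5000, mu = 0, sd = 10, lo = 0 )   # positive-only prior
bNeg <- rTruncSketch( 5000, mu = 0, sd = 10, hi = 0 )   # negative-only prior
print( range( bPos ) )
print( range( bNeg ) )
```

The truncated prior fixes the sign of a coefficient without committing to a particular magnitude, which is a weaker statement than moving the prior mean and shrinking the variance.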
The matrix priorLoHi has two columns that specify the
lower and upper bounds on parameter values. Here is an example:
priorLoHi <- matrix( NA, 3, 2,
dimnames = list( c( 'moist', 'standAge', 'sand30' ),
c( 'lo', 'hi' ) ) )
priorLoHi[,1] <- c( 0, -Inf, -Inf )
priorLoHi[,2] <- c( Inf, 0, 0 )
fit3 <- bayesReg(formula = form, data, ng = 5000, TOBIT = TRUE, priorLoHi = priorLoHi)
## [,1] [,2]
## intercept -Inf Inf
## moist 0 Inf
## standAge -Inf 0
## meanTemp -Inf Inf
## annualPrec -Inf Inf
## silt30 -Inf Inf
## sand30 -Inf 0
Exercise 3. Compare the posterior distributions you obtained with the non-informative prior, the prior with non-zero mean, and the truncated prior.