This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more of my R tutorials visit http://mikewk.com/statistics.

Frequentist vs. Bayesian

Frequentists ask, “What is the likelihood of observing the data, given the parameter(s) of the model?” (Maximum likelihood methods work by iteratively searching for the values of \(\theta\) that maximize this function.)

\[ p(y \,|\, \theta) \]
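For intuition, here’s a toy sketch (my own example, not part of the regression below) of what “maximizing the likelihood” means: simulate normal data with a known sd, then let base R’s optimize() search for the \(\theta\) that maximizes the log-likelihood.

# Toy example: maximum likelihood for the mean of normal data (sd known = 1)
set.seed(1)
obs <- rnorm(50, mean = 3, sd = 1)   # made-up data with true mean 3
log.lik <- function(theta) sum(dnorm(obs, mean = theta, sd = 1, log = TRUE))
opt <- optimize(log.lik, interval = c(-10, 10), maximum = TRUE)
opt$maximum                          # numerically close to mean(obs), the analytic MLE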

Bayesian statistics, in contrast, describes the probability of the parameters given the data.

\[ p(\theta \,|\, y) = \frac{p(y \,|\, \theta) \, p(\theta)} {p(y)} \]

Bayes’ rule is commonly presented in this proportional form, which drops the denominator \(p(y)\): the marginal probability of the data does not depend on \(\theta\), so it only rescales the posterior so that it sums (or integrates) to 1.

\[ p(\theta \,|\, y) \propto p(y \,|\, \theta) \, p(\theta) \]
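To see why dropping \(p(y)\) is harmless, it helps to compute a posterior by brute force on a grid. This toy example (mine, not part of the regression below) multiplies the likelihood by the prior and then rescales so the result sums to 1; that rescaling is all \(p(y)\) does.

# Toy grid approximation: posterior for a coin's probability of heads
# after observing 7 heads in 10 flips, with a Beta(2, 2) prior
theta <- seq(0, 1, length.out = 1001)             # grid of candidate values
prior <- dbeta(theta, 2, 2)                       # p(theta)
likelihood <- dbinom(7, size = 10, prob = theta)  # p(y | theta)
unnormalized <- likelihood * prior                # numerator of Bayes' rule
posterior <- unnormalized / sum(unnormalized)     # dividing by p(y) just rescales
theta[which.max(posterior)]                       # posterior mode, about 0.67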

Inference Problem

The inference we are used to making is the probability of observing data at least as extreme as ours, given that the null hypothesis is true:

\[ p(y \geq Y \,|\, H_0) \]

There are two problems with this:

  1. People misinterpret this all the time, and
  2. it’s not the inference you really want.

People confuse \(p(y \geq Y \,|\, H_0)\) with \(p(H_0 \,|\, y \geq Y)\). If the probability of the data, given the null is true, is small, the probability that the null is true, given the data, must be small, too, right?! RIGHT?! Sadly, NO.

Through Bayes’ rule, we know they’re only equal if the marginal probability of \(H_0\) being true equals the marginal probability of \(y\) being greater than or equal to \(Y\).
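Here’s a quick numeric sketch, with made-up probabilities, to show how far apart the two quantities can be:

# Made-up numbers, purely illustrative
p.H0 <- 0.9                    # marginal probability that H0 is true
p.data.H0 <- 0.04              # p(y >= Y | H0): a 'significant' p-value
p.data.H1 <- 0.20              # p(y >= Y | H1)
p.data <- p.data.H0 * p.H0 + p.data.H1 * (1 - p.H0)  # marginal p(y >= Y)
p.data.H0 * p.H0 / p.data      # p(H0 | y >= Y) is about 0.64, nowhere near 0.04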

Bayesian regression example

# Simulate two groups of 100 with different slopes and noise levels
x <- rnorm(200, 0, 1) * 5
dat <- data.frame(x)
dat[1:100, 'group'] <- 'A'
dat[1:100, 'y'] <- (x[1:100] * 2) + rnorm(100, 0, 2)      # group A: slope 2, sd 2
dat[101:200, 'group'] <- 'B'
dat[101:200, 'y'] <- (x[101:200] / 3) + rnorm(100, 0, 3)  # group B: slope 1/3, sd 3

The model saved as a .bug file

rjags.example.bug <- '
model {
   for (i in 1:N){
     y[i] ~ dnorm(y.hat[i], tau)   # likelihood: y varies normally around the line
     y.hat[i] <- a + b * x[i]      # linear predictor
   }
   a ~ dnorm(0, .0001)             # vague prior on the intercept
   b ~ dnorm(0, .0001)             # vague prior on the slope
   tau <- pow(sigma, -2)           # JAGS parameterizes dnorm with precision, not sd
   sigma ~ dunif(0, 100)           # vague prior on the residual sd
}
'
writeLines(rjags.example.bug, 'rjags.example.bug')  # write the file jags.model() reads

Run the model

library(rjags)
xVar <- dat$x
yVar <- dat$y
N <- 200
jags <- jags.model('rjags.example.bug',
            data = list('x' = xVar,   # names must match the nodes in the model
                        'y' = yVar,
                        'N' = N),
            n.chains = 4,
            n.adapt = 100,
            quiet = TRUE)

Generate posterior samples

coefs <- jags.samples(jags,
  c('a', 'b'),
  n.iter = 2000)
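As an aside, rjags also provides coda.samples(), which returns an mcmc.list that works with coda’s summary and diagnostic functions. Something like this should work on the same model object:

# Alternative: coda.samples() for summaries and convergence checks
samp <- coda.samples(jags, variable.names = c('a', 'b'), n.iter = 2000)
summary(samp)       # posterior means, sds, and quantiles
gelman.diag(samp)   # Gelman-Rubin diagnostic across the 4 chains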

Extract estimates: posterior means from the JAGS samples, plus a frequentist lm() fit for comparison

intercept <- mean(coefs$a[])   # posterior mean over all iterations and chains
b <- mean(coefs$b[])

lm1 <- lm(y ~ x, data = dat)

Compare the results

# results from frequentist lm() model
round(coefficients(lm1), 3)
## (Intercept)           x 
##      -0.405       1.210
# results from bayesian jags.samples() model
round(data.frame(intercept, b), 3)
##   intercept     b
## 1    -0.404 1.209
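
Interval estimates can be compared the same way, e.g. a 95% credible interval from the posterior draws of the slope next to lm()’s 95% confidence interval:

# 95% credible interval vs. 95% confidence interval for the slope
quantile(coefs$b[], c(0.025, 0.975))
confint(lm1, 'x', level = 0.95)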