Monte Carlo Simulation of Bayesian Models

\(y_i=\mu+\epsilon_i,\space \space \epsilon \stackrel{iid}{\sim}N(0,\sigma^2),\space i=1,..,n\)

Likelihood: \(P(y|\theta) = \frac{P(y,\theta)}{P(\theta)}\), where \(P(\theta)\) is the prior probability distribution of \(\theta\).

Posterior: \(P(\theta|y) = \frac{P(y,\theta)}{P(y)} = \frac{P(y,\theta)}{\int P(y,\theta) \space d\theta} = \frac{P(y|\theta)P(\theta)}{\int P(y|\theta)P(\theta) \space d\theta} = \frac{1}{K}P(y|\theta)P(\theta)\)

Hierarchical Bayesian Model

\(\space \space y_i \stackrel{iid}{\sim}N(0,\sigma^2)\)
\(\space \space \mu|\sigma^2 \sim N(\mu_0,\frac{\sigma_0^2}{\omega_0})\)
\(\space \space \sigma^2 \sim \Gamma^{-1}(\alpha_{\space0},\beta_{\space0})\)

Let’s look at the \(Gamma\) distribution:

set.seed(32)
m=100
a=2
b=1/3

Get \(m=100\) samples from the \(Gamma\) distribution.

theta=rgamma(n=m,shape=a,rate=b)
head(theta)

## [1] 4.553954 8.280642 1.515590 9.160268 3.507859 5.606974

tail(theta)

## [1] 2.432403 7.148443 2.715706 2.763691 5.413677 5.865641

hist(theta,freq=FALSE)
curve(dgamma(x,shape=a,rate=b), col='blue',add=TRUE)

Calculate mean of the simulated values for \(\theta\):

mean(theta)

## [1] 5.514068

var(theta)

## [1] 12.56345

The mean and variance of a gamma distribution \(\Gamma(a,b)\) is \(\frac{a}{b}\) and \(\frac{a}{b^2}\) respectively.

a/b

## [1] 6

a/b^2

## [1] 18

Looks like we need to get more samples for \(\theta\).

m=10000
theta=rgamma(n=m,shape=a,rate=b)
mean(theta)

## [1] 6.023273

var(theta)

## [1] 18.04318

Let’s look at \(P(\theta < 5)\)

ind=theta<5
head(ind)

## [1] FALSE FALSE  TRUE  TRUE FALSE FALSE

mean(ind) # Probability that theta < 5

## [1] 0.497

Now, lets calculate \(P(\theta<5)\) using the pgamma function.

pgamma(q=5,shape=a,rate=b)

## [1] 0.4963317

Now, lets take a look at the 90th percentile of \(\theta\):

quantile(theta,probs=0.9)

##      90% 
## 11.74338

Now, lets the 90th percentile using the qgamma function.

qgamma(0.9,shape=a,rate=b)

## [1] 11.66916

Let’s calculate the standard error and confidence interval for \(\theta\):

std_error = sd(theta)/sqrt(m)
lower = mean(theta)-std_error
lower

## [1] 5.980796

upper = mean(theta)+std_error
upper

## [1] 6.065751

Lets go back to \(P(\theta<5)\) using the pgamma function.

ind=theta<5
mean(ind) # Probability that theta < 5

## [1] 0.497

pgamma(5,shape=a,rate=b)

## [1] 0.4963317

Our Monte Carlo estimate of the probability is within 0.01 of the true mean.

std_error=sd(ind)/sqrt(m)
2*std_error

## [1] 0.01000032

Now, let’s simulate \(\phi_i\) from \(Beta(2,2)\), then we will simulate \(y_i\) from \(Binom(10,\phi_i)\)

m=1e5
y=numeric(m)
phi=numeric(m)

Now loop \(i=1:m\) times to fill the \(\phi_i\) and \(y_i\) vectors using the rbeta and rbinom functions.

# for (i in 1:m) {
#  phi[i]=rbeta(1,shape1=2,shape2=2)
#  y[i]=rbinom(n=1,size=10,prob=phi[i])
#}

Loops are not particularly efficient, let’s use vectorized code

phi=rbeta(m,shape1=2,shape2=2)
y=rbinom(n=m,size=10,prob=phi)

table(y)/m

## y
##       0       1       2       3       4       5       6       7       8       9 
## 0.03908 0.06929 0.09475 0.11288 0.12243 0.12448 0.12011 0.11275 0.09604 0.07029 
##      10 
## 0.03790

plot(table(y)/m)

mean(y)

## [1] 4.99939

Problem 1: Laura keeps a record of her loan applications and performs a Bayesian analysis of her success rate, \(\theta\). Her analysis yields a \(Beta(5,3)\) posterior distribution for \(\theta\).

The posterior mean for \(\theta = 5/(5+3) = 0.625\). However, Laura likes to think in terms of odds of succeeding, defined as \(\frac{\theta}{1-\theta}\), the probability of success divided by the probability of failure.

Simulate a large number of samples for the posterior distribution of \(\theta\) and use these samples to approximate the posterior mean for Laura’s odds of success, namely \(E(\frac{\theta}{1-\theta})\).

m=1e5
a=5
b=3
theta=rbeta(m,shape1 = a,shape2 = b)
mean(theta/(1-theta))

## [1] 2.506003

Problem 2: Laura also wants to know the posterior probability that her odds for success on loan applications is > 1 (in other words, better that 50:50 odds).

ind=(theta/(1-theta))>1
mean(ind)

## [1] 0.77502

# ind=(theta/(1-theta))<1
# 1-mean(ind)

Problem 3: Use a (large) Monte Carlo sample to approximate the 0.3 quantile of the standard Normal distribution, \(N(0,1)\), the number such that the probability of being less than 0.3. Use the quantile function in R. We can also check our answer using the qnorm function.

m=1e6
theta=rnorm(m,0,1)
quantile(theta,probs=0.3)

##        30% 
## -0.5248922

qnorm(0.3,0,1)

## [1] -0.5244005

To measure how accurate our Monte Carlo approximations are, we can use the Central Limit Theorem (CLT). If the number of samples drawn, \(m\), is large, then the Monte Carlo sample mean \(\bar{\theta^*}\) used to estimate \(E(\theta)\) approximately follows a normal distribution with mean \(E(\theta)\) and variance \(Var(\theta)/m\). If we substitute the sample variance for \(Var(\theta)\), we can get a rough estimate of our Monte Carlo standard error (or standard deviation).

Suppose we have 100 samples from our posterior distribution \(\theta_i^*\), and that the sample variance of these draws is 5.2. A rough estimate of our Monte Carlo standard error would then be \(\sqrt{5.2/100} \approx 0.228\). So our estimate, \(\theta^*\), is probably within about \(0.456\) (two standard errors) of the true \(E(\theta)\).

Problem 4: What does the standard error of our Monte Carlo estimate become if we increase our sample size to 5000? Assume that the variance of the sample draws is still 5.2.

sqrt(5.2/5000)

## [1] 0.03224903

Monte Carlo Simulation of Bayesian Models

Ken Wood

2023-07-11

Hierarchical Bayesian Model