The following code builds a data frame of beta prior parameters \(a, b\) and the posterior probability \(Pr(\theta > .5 | \sum Y_i = 57)\) for different combinations of \(\theta_0\) and \(n_0\).
# prior values of theta0 and n0
theta0 <- seq(.1, .9, .1)
ltheta0 <- length(theta0)
n0 <- c(1, 2, 8, 16, 32)
ln0 <- length(n0)
# every combination of theta0 and n0
n0 <- rep(n0, ltheta0)
theta0 <- rep(theta0, each = ln0)
data <- data.frame(theta0 = theta0, n0 = n0)
# beta prior parameters implied by theta0 and n0
data$a <- data$theta0 * data$n0
data$b <- (1 - data$theta0) * data$n0
# observed data: 57 successes in 100 trials
y <- 57
n <- 100
# probability that the posterior theta is greater than .5
data$ptheta.5 <- pbeta(.5, data$a + y, data$b + n - y, lower.tail = FALSE)
data
## theta0 n0 a b ptheta.5
## 1 0.1 1 0.1 0.9 0.9067174
## 2 0.1 2 0.2 1.8 0.8914671
## 3 0.1 8 0.8 7.2 0.7686623
## 4 0.1 16 1.6 14.4 0.5445167
## 5 0.1 32 3.2 28.8 0.1554088
## 6 0.2 1 0.2 0.8 0.9100204
## 7 0.2 2 0.4 1.6 0.8987201
## 8 0.2 8 1.6 6.4 0.8130640
## 9 0.2 16 3.2 12.8 0.6591158
## 10 0.2 32 6.4 25.6 0.3248808
## 11 0.3 1 0.3 0.7 0.9132361
## 12 0.3 2 0.6 1.4 0.9056150
## 13 0.3 8 2.4 5.6 0.8517895
## 14 0.3 16 4.8 11.2 0.7606645
## 15 0.3 32 9.6 22.4 0.5417239
## 16 0.4 1 0.4 0.6 0.9163656
## 17 0.4 2 0.8 1.2 0.9121591
## 18 0.4 8 3.2 4.8 0.8847656
## 19 0.4 16 6.4 9.6 0.8430584
## 20 0.4 32 12.8 19.2 0.7465826
## 21 0.5 1 0.5 0.5 0.9194100
## 22 0.5 2 1.0 1.0 0.9183604
## 23 0.5 8 4.0 4.0 0.9121799
## 24 0.5 16 8.0 8.0 0.9042520
## 25 0.5 32 16.0 16.0 0.8894420
## 26 0.6 1 0.6 0.4 0.9223703
## 27 0.6 2 1.2 0.8 0.9242272
## 28 0.6 8 4.8 3.2 0.9344278
## 29 0.6 16 9.6 6.4 0.9458344
## 30 0.6 32 19.2 12.8 0.9628561
## 31 0.7 1 0.7 0.3 0.9252477
## 32 0.7 2 1.4 0.6 0.9297689
## 33 0.7 8 5.6 2.4 0.9520516
## 34 0.7 16 11.2 4.8 0.9716716
## 35 0.7 32 22.4 9.6 0.9905703
## 36 0.8 1 0.8 0.2 0.9280434
## 37 0.8 2 1.6 0.4 0.9349949
## 38 0.8 8 6.4 1.6 0.9656774
## 39 0.8 16 12.8 3.2 0.9863403
## 40 0.8 32 25.6 6.4 0.9982198
## 41 0.9 1 0.9 0.1 0.9307587
## 42 0.9 2 1.8 0.2 0.9399155
## 43 0.9 8 7.2 0.8 0.9759580
## 44 0.9 16 14.4 1.6 0.9939428
## 45 0.9 32 28.8 3.2 0.9997538
For the contour plot, I placed the prior values \(\theta_0\) on the x-axis, the prior number of observations \(n_0\) on the y-axis, and the posterior probability \(P(\theta > .5 | \sum Y_i = 57)\) on the z-axis (the contour values).
library(plotly)
fig <- plot_ly(
  x = data$theta0,
  y = data$n0,
  z = data$ptheta.5,
  type = "contour",
  contours = list(
    start = 0,
    end = 1,
    size = .05
  )
)
fig
As the contour plot shows, for a fixed \(n_0\), the higher the prior \(\theta_0\), the higher \(P(\theta > .5 | \sum Y_i = 57)\). For a fixed low value of \(\theta_0\) (\(\theta_0 < .5\)), the greater \(n_0\), the lower \(P(\theta > .5 | \sum Y_i = 57)\). But at high values of \(\theta_0\) (\(\theta_0 > .5\)), the greater \(n_0\), the higher \(P(\theta > .5 | \sum Y_i = 57)\).
If someone were to use this plot with a fixed beta prior distribution, they would first find the point \((x, y)\) on the graph where x is their prior \(\theta_0\) and y is their prior number of observations \(n_0\). Given this point, the z value gives \(P(\theta > .5 | \sum Y_i = 57)\). If this value is large enough (greater than some pre-specified \(\alpha\)), they should believe \(\theta > .5\); otherwise they should not.
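As a concrete sketch of this lookup (the prior \(\theta_0 = .3\), \(n_0 = 16\) is a hypothetical user's choice, corresponding to row 14 of the table above), the z value can be computed directly:
# hypothetical user prior: theta0 = .3, n0 = 16 (row 14 of the table above)
theta0.user <- .3
n0.user <- 16
a.user <- theta0.user * n0.user        # 4.8
b.user <- (1 - theta0.user) * n0.user  # 11.2
# z value: posterior probability that theta > .5
pbeta(.5, a.user + 57, b.user + 100 - 57, lower.tail = FALSE)  # ~0.761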
Since the sampling distribution for each group is Poisson, a gamma prior is conjugate: the posterior distribution for each of \(\theta_A, \theta_B\) is \(gamma(a + \sum Y_i,\ b + n)\), as the short derivation below shows.
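For completeness, with a \(gamma(a, b)\) prior and \(n\) Poisson observations, Bayes' rule gives
\(\large p(\theta | y_1 ... y_n) \propto \theta^{a-1} e^{-b\theta} \times \theta^{\sum y_i} e^{-n\theta} = \theta^{a + \sum y_i - 1} e^{-(b+n)\theta},\)
which is the kernel of a \(gamma(a + \sum y_i, b + n)\) density. So the posterior parameters for \(\theta_A\) are the following.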
# finding the posterior distribution for theta_A
y.a <- c(12, 9, 12, 14, 13, 13, 15, 8, 15, 6)
# prior parameters
a <- 120
b <- 10
sum.ya <- sum(y.a)
na <- length(y.a)
# posterior parameters
post.a1 <- sum.ya + a
post.a2 <- na + b
c(post.a1, post.a2)
## [1] 237 20
And \(\theta_A | Y_{A1}, ..., Y_{An} \sim gamma(237, 20)\). The following output gives the posterior mean, variance, and 95% quantile-based interval for \(\theta_A\).
mean.a <- post.a1/post.a2
mean.a
## [1] 11.85
var.a <- post.a1/(post.a2^2)
var.a
## [1] 0.5925
lower.a <- qgamma(.025, post.a1, post.a2)
upper.a <- qgamma(.975, post.a1, post.a2)
c(lower.a, upper.a)
## [1] 10.38924 13.40545
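As a sanity check, these closed-form summaries can be approximated by Monte Carlo samples from the posterior (a sketch; the sample size of 100,000 and the seed are arbitrary):
set.seed(1)  # arbitrary seed for reproducibility
theta.a.mc <- rgamma(100000, post.a1, post.a2)
# should be close to 11.85, 0.5925, and (10.39, 13.41) respectively
mean(theta.a.mc)
var(theta.a.mc)
quantile(theta.a.mc, c(.025, .975))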
Following a similar process for \(\theta_B\):
# finding the posterior distribution for theta_B
y.b <- c(11, 11, 10, 9, 9, 8, 7, 10, 6, 8, 8, 9, 7)
#prior parameters
a <- 12
b <- 1
sum.yb <- sum(y.b)
nb <- length(y.b)
#posterior parameters
post.b1 <- a + sum.yb
post.b2 <- b + nb
c(post.b1, post.b2)
## [1] 125 14
So the posterior distribution \(\theta_B | Y_{B1}...Y_{Bn} \sim gamma(125, 14)\). Again, the posterior summary statistics for \(\theta_B\) are provided as output in the following order: mean, variance, 95% quantile-based interval.
mean.b <- post.b1/post.b2
mean.b
## [1] 8.928571
var.b <- post.b1/(post.b2^2)
var.b
## [1] 0.6377551
lower.b <- qgamma(.025, post.b1, post.b2)
upper.b <- qgamma(.975, post.b1, post.b2)
c(lower.b, upper.b)
## [1] 7.432064 10.560308
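With both posteriors in hand, one illustrative extension (a Monte Carlo sketch, not required by the prompt) is to estimate the posterior probability that type B mice have a lower tumor rate than type A mice by sampling from the two posteriors:
set.seed(1)  # arbitrary seed for reproducibility
theta.a.mc <- rgamma(100000, post.a1, post.a2)  # gamma(237, 20)
theta.b.mc <- rgamma(100000, post.b1, post.b2)  # gamma(125, 14)
# Monte Carlo estimate of Pr(theta_B < theta_A | data)
mean(theta.b.mc < theta.a.mc)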
The following code computes the posterior expected value of \(\theta_B\) under the modified prior \(gamma(12 \times n_0,\ n_0)\) for \(n_0 = 1, ..., 50\). The values are then graphed alongside the 95% quantile-based posterior interval for \(\theta_A\).
n0 <- 1:50
# modified prior parameters as a function of n0
a <- 12 * n0
b <- n0
post.b1 <- a + sum.yb
post.b2 <- b + nb
post.exp <- post.b1/post.b2
plot(n0, post.exp, type = "l", xlab = "n0 values", ylab = "posterior expectation",
     lwd = 2, ylim = c(9, 13.5), main = "Posterior Expectation of theta_B")
abline(h = c(lower.a, upper.a), col = "blue", lwd = 2)
legend(x = c(20, 51), y = c(12, 13), inset = .2, lty = c(1, 1), lwd = 2,
       col = c("blue", "black"),
       legend = c("95% quantile interval for theta_A", "posterior expectation for theta_B"))
The code below finds the smallest prior \(n_0\) for which \(E(\theta_B | Y_{B1} ... Y_{Bn})\) falls inside the 95% quantile-based posterior interval for \(\theta_A\).
# report the first n0 whose posterior expectation exceeds the interval's lower bound
for(i in seq_along(post.exp)){
  if(post.exp[i] > lower.a){
    print(n0[i])
    break
  }
}
## [1] 14
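As a check, this agrees with solving for \(n_0\) directly: the posterior expectation is \(E(\theta_B | Y_{B1} ... Y_{Bn}) = \frac{12 n_0 + 113}{n_0 + 13}\), and requiring it to exceed the interval's lower bound gives
\(\large \frac{12 n_0 + 113}{n_0 + 13} > 10.389 \iff n_0 > \frac{13(10.389) - 113}{12 - 10.389} \approx 13.7,\)
so the smallest integer solution is \(n_0 = 14\).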
So the prior belief we would need for \(\theta_B\) to be close to \(\theta_A\) is \(n_0 \geq 14\). The following plot visualizes this, with a red line marking the minimum \(n_0\) necessary for the posterior expectation of \(\theta_B\) to fall inside \(\theta_A\)'s interval.
plot(n0, post.exp, type = "l", xlab = "n0 values", ylab = "posterior expectation",
     lwd = 2, ylim = c(9, 13.5), main = "Posterior Expectation of theta_B")
abline(h = c(lower.a, upper.a), col = "blue", lwd = 2)
legend(x = c(25, 51), y = c(12, 13), inset = .2, lty = c(1, 1), lwd = 2,
       col = c("blue", "black"),
       legend = c("95% quantile interval for theta_A", "posterior expectation for theta_B"))
# red line at the smallest n0 inside the interval
abline(v = 14, col = "red", lwd = 2)
The prompt states explicitly that type B mouse tumor rates are unknown, but that type B mice are related to type A mice. This relation implies that some of our prior information about type B mice comes from our knowledge of type A mice. Therefore the expression \(p(\theta_A, \theta_B) = p(\theta_A) \times p(\theta_B)\) does not make sense, since the two group means are not independent of one another. Rather, the information we have about \(\theta_A\) is considered when forming the prior for \(\theta_B\). We could instead write \(p(\theta_A, \theta_B) = p(\theta_A) \times p(\theta_B | \theta_A)\).
The Galenshore distribution is itself a conjugate class of prior densities for \(\theta\). Its density is implemented and plotted below for several \((a, \theta)\) values.
library(ggplot2)
# Galenshore(a, theta) density; gamma(a) = factorial(a - 1) for integer a,
# and is defined for all a > 0
galenshore <- function(a, theta, y){
  (2 / gamma(a)) * theta^(2 * a) * y^(2 * a - 1) * exp(-theta^2 * y^2)
}
y <- seq(0, 5, length.out = 100)
one <- data.frame(density = galenshore(1, 1, y), type = "galenshore(1,1)")
two <- data.frame(density = galenshore(2, 1, y), type = "galenshore(2,1)")
three <- data.frame(density = galenshore(1, 2, y), type = "galenshore(1,2)")
four <- data.frame(density = galenshore(4, 2, y), type = "galenshore(4,2)")
densities <- rbind(one, two, three, four)
yvalues <- rep(y, 4)
data <- data.frame(yvalues = yvalues, densities)
g <- ggplot(data, aes(x = yvalues, y = density, group = type, color = type)) +
  geom_line() + xlab("y")
g
Finding the posterior distribution of \(\theta | Y_1 ... Y_n\) using a Galenshore prior:
\(\large P(\theta | Y_1 ... Y_n) = \frac{P(\theta)P(Y_1...Y_n | \theta)}{\int P(\theta)P(Y_1...Y_n | \theta) d \theta}\)
\(\large P(Y_1... Y_n | \theta) = c(Y_1 ... Y_n, a) \theta ^ {2na} e ^ {-\theta ^2 \sum y_i ^ 2}\) and
\(\large P(\theta) = d(z,b) \theta ^ {2b-1} e ^ {-\theta ^ 2 z ^ 2}\)
\(\Large P(\theta | Y_1 ... Y_n) = \frac{c(Y_1 ... Y_n, a)\, \theta^{2na} e^{-\theta^2 \sum y_i^2}\, d(z,b)\, \theta^{2b-1} e^{-\theta^2 z^2}}{c(Y_1 ... Y_n, a)\, d(z,b) \int \theta^{2na} e^{-\theta^2 \sum y_i^2}\, \theta^{2b-1} e^{-\theta^2 z^2}\, d\theta}\)
= \(\huge \frac{\theta^{2(na+b)-1} e^{-\theta^2 (\sum y_i^2 + z^2)}}{\int \theta^{2(na+b)-1} e^{-\theta^2 (\sqrt{\sum y_i^2 + z^2})^2}\, d\theta}\) = \(\huge \frac{\theta^{2(na+b)-1} e^{-\theta^2 (\sqrt{\sum y_i^2 + z^2})^2}}{\frac{\Gamma(na+b)}{2(\sqrt{\sum y_i^2 + z^2})^{2(na+b)}}}\)
So \(\large p(\theta | Y_1 ... Y_n) = \frac{2}{\Gamma(na+b)} (\sqrt{\sum y_i^2 + z^2})^{2(na+b)}\, e^{-\theta^2 (\sqrt{\sum y_i^2 + z^2})^2}\, \theta^{2(na+b)-1}\)
And we have \(\theta | Y_1 ... Y_n \sim Galenshore(na+b,\ \sqrt{\sum y_i^2 + z^2})\).
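As a quick numeric sanity check (a sketch with hypothetical values \(a = 2\), \(b = 3\), \(z = 1\), and made-up observations), the posterior density from this derivation should integrate to 1:
# hypothetical sampling parameter a and prior Galenshore(b, z)
a <- 2; b <- 3; z <- 1
y.obs <- c(0.8, 1.2, 0.5)  # made-up observations
n <- length(y.obs)
# posterior parameters from the derivation above
post.shape <- n * a + b
post.theta <- sqrt(sum(y.obs^2) + z^2)
# the density should integrate to 1 over (0, Inf)
integrate(function(t) galenshore(post.shape, post.theta, t), 0, Inf)$value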
\(\huge \frac{P(\theta_x | Y_1 ... Y_n)}{P(\theta_q | Y_1 ... Y_n)} = \frac{\frac{2}{\Gamma(na+b)} (\sqrt{\sum y_i^2 + z^2})^{2(na+b)}\, e^{-\theta_x^2 (\sum y_i^2 + z^2)}\, \theta_x^{2(na+b)-1}}{\frac{2}{\Gamma(na+b)} (\sqrt{\sum y_i^2 + z^2})^{2(na+b)}\, e^{-\theta_q^2 (\sum y_i^2 + z^2)}\, \theta_q^{2(na+b)-1}}\) = \(\Large e^{-(\sum y_i^2 + z^2)(\theta_x^2 - \theta_q^2)} \left(\frac{\theta_x}{\theta_q}\right)^{2(na+b)-1}\), and a sufficient statistic is \(\sum y_i^2\).
Since for the Galenshore distribution \(\large E[Y] = \frac{\Gamma(a + 1/2)}{\theta\, \Gamma(a)}\), we have \(\large E[\theta | Y_1 ... Y_n] = \frac{\Gamma(na + b + 1/2)}{\sqrt{\sum y_i^2 + z^2}\; \Gamma(na+b)}\).
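Continuing the hypothetical check above (same made-up values), this closed form can be compared against direct numerical integration:
# closed-form posterior mean vs. numerical integration
mean.formula <- gamma(post.shape + .5) / (post.theta * gamma(post.shape))
mean.numeric <- integrate(function(t) t * galenshore(post.shape, post.theta, t), 0, Inf)$value
c(mean.formula, mean.numeric)  # the two should agree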