Problem 1. If the prior mean and prior variance (or standard deviation) are
known, then we can plug them into the above formulas and solve for a and b.
Suppose Sophie, the editor of the student newspaper, is going to conduct
a survey of students to determine the level of support for the current
president of the students’ association. She needs to determine her prior
distribution for p, the proportion of students who support the
president. She decides her prior mean is 0.5, and her prior standard
deviation is 0.15.
a. Show algebraically that a=b=5.06
\(E(Y) = \frac{a}{a+b} = 0.5\)
\(a = 0.5(a+b)\)
\(a-0.5a = 0.5b\)
\(a = b\)
Then substitute a = b into V(Y):
\(V(Y) = \frac{ab}{(a+b)^2(a+b+1)}\)
\((0.15)^2 = \frac{a^2}{(2a)^2(2a+1)} = \frac{1}{4(2a+1)}\)
\(4(2a+1) = \frac{1}{0.0225} = \frac{400}{9}\)
\(8a = \frac{400}{9} - 4 = \frac{364}{9}\)
\(a = (\frac{364}{9})(\frac{1}{8}) \approx 5.06\)
Since a = b, a = 5.06 and b = 5.06.
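The algebra above can be checked numerically. The following is a minimal sketch that uses the general moment-matching identity for a Beta(a, b) distribution, a + b = m(1 - m)/v - 1, where m is the prior mean and v the prior variance; the variable names are illustrative and not part of the original solution.

```r
# Moment matching for a Beta(a, b) prior (illustrative variable names).
m <- 0.5           # prior mean
s_prior <- 0.15    # prior standard deviation

# For a Beta distribution, Var = m * (1 - m) / (a + b + 1),
# so a + b = m * (1 - m) / Var - 1.
k <- m * (1 - m) / s_prior^2 - 1

a <- m * k         # a = mean * (a + b)
b <- (1 - m) * k   # b = (1 - mean) * (a + b)

print(round(c(a = a, b = b), 2))
```

Because the prior mean is 0.5, the two shape parameters come out equal, in agreement with the derivation.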
b. Out of the 68 students that she polls, y=21 support the current
president. Determine the posterior distribution using the Beta(5.06,5.06)
prior.
a = 5.06
b = 5.06
s = 21
f = 47
\(a' = s+a\)
\(a' = 21+5.06\)
\(a' = 26.06\)
\(b' = f+b\)
\(b' = 47+5.06\)
\(b' = 52.06\)
Thus, the posterior is Beta(26.06, 52.06)
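The conjugate update used here (add the successes to a and the failures to b) can be sketched as a small function. The helper name `beta_update` is hypothetical and does not come from ProbBayes or LearnBayes.

```r
# Hypothetical helper: conjugate Beta-binomial update.
# a, b: prior shape parameters; s: successes; f: failures.
beta_update <- function(a, b, s, f) {
  c(a = a + s, b = b + f)
}

n <- 68   # students polled
s <- 21   # students who support the president
post <- beta_update(a = 5.06, b = 5.06, s = s, f = n - s)

# Posterior mean of p, a' / (a' + b')
post_mean <- unname(post["a"] / sum(post))

print(post)
print(post_mean)
```

The posterior mean, about 0.33, sits between the prior mean 0.5 and the sample proportion 21/68, and is much closer to the latter because the data carry far more weight than the prior.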
c. Plot both the prior and posterior distributions.
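p <- seq(0, 1, length.out = 500)  # fine grid on [0, 1] for smooth curves
a <- 5.06
b <- 5.06
s <- 21  # successes
f <- 47  # failures
prior <- dbeta(p, a, b)
post <- dbeta(p, a + s, b + f)
# Plot the taller posterior first so the y-axis fits both curves
plot(p, post, type = "l", ylab = "Density", lty = 2, lwd = 3)
lines(p, prior, lty = 3, lwd = 3)
# Legend line types must match the curves: prior = 3, posterior = 2
legend(.7, 4, c("Prior", "Posterior"),
       lty = c(3, 2), lwd = c(3, 3))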

d. Construct and interpret a 90% credible interval for p.
ProbBayes::beta_interval(0.9,c(26.06, 52.06))

Given the observed data and the prior, there is a 90% posterior
probability that the true (unknown) proportion p lies within the interval
0.250 to 0.422.
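As a cross-check, an interval of this kind can be computed with base R alone. This sketch assumes that the interval reported by `beta_interval` is equal-tailed, that is, 5% posterior probability is left in each tail; the variable names are illustrative.

```r
a_post <- 26.06
b_post <- 52.06

# Equal-tailed 90% credible interval: 5th and 95th posterior percentiles
ci <- qbeta(c(0.05, 0.95), a_post, b_post)

# Posterior probability contained in the interval (0.90 by construction)
coverage <- diff(pbeta(ci, a_post, b_post))

print(round(ci, 3))
print(coverage)
```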
Problem 2. Suppose Sophie, the editor of the student newspaper,
conducted a survey of 68 students to determine the level of support for
the current president of the students’ association. Out of the 68
students that she polls, y=21 support the current president. She needs
to determine her prior distribution for p, the proportion of students
who support the president. Suppose it is known that the 25th and 75th
percentiles of a Beta(a,b) prior are 0.393 and 0.607, respectively.
a. Find a and b.
library(ProbBayes)
beta.select(list(x=0.393,p=0.25),
list(x=0.607,p=0.75))
## [1] 5.1 5.1
a = 5.1, b = 5.1
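One way to sanity-check the selected prior is to ask base R for the quartiles of Beta(5.1, 5.1) and compare them with the targets 0.393 and 0.607. This is only a sketch; small differences are expected because the shape parameters are reported rounded to one decimal place.

```r
a <- 5.1
b <- 5.1

# 25th and 75th percentiles of the selected Beta prior
q <- qbeta(c(0.25, 0.75), a, b)
print(round(q, 3))

# With a = b the prior is symmetric about 0.5, so the quartiles sum to 1
print(sum(q))
```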
b. Find and plot the posterior distribution.
a = 5.1
b = 5.1
s = 21
f = 47
\(a' = s+a\)
\(a' = 21+5.1\)
\(a' = 26.1\)
\(b' = f+b\)
\(b' = 47+5.1\)
\(b' = 52.1\)
Thus, the posterior is Beta(26.1, 52.1)
p <- seq(0, 1, length.out = 500)  # fine grid on [0, 1] for a smooth curve
a <- 5.1
b <- 5.1
s <- 21  # successes
f <- 47  # failures
post <- dbeta(p, a + s, b + f)
plot(p, post, type = "l", ylab = "Density", lty = 2, lwd = 3)
# Single curve, so the legend needs one line type matching the plot
legend(.7, 4, "Posterior", lty = 2, lwd = 3)

c. Sophie claims that at least 85% of the students support the
current president. Should we reject her claim? Support your
answer.
ProbBayes::beta_area(lo = 0.85, hi = 1.0,
shape_par = c(26.1, 52.1))

1-pbeta(0.85,26.1, 52.1)
## [1] 0
rval <- rbeta(1000, 26.1, 52.1)
prop <- sum(rval >= 0.85) / 1000
print(prop)
## [1] 0
In all three calculations, P(p ≥ 0.85) ≈ 0 under the posterior. The
claim that at least 85% of the students support the current president is
therefore essentially impossible given the data, and we reject it.
d. Using the posterior distribution, how many students are expected
to support the current president if the survey is given to a sample of
100 students?
pred.prob.dist <- pbetap(c(26.1, 52.1),100,0:100)
discint(cbind(0:100,pred.prob.dist), 0.85)
## $prob
## [1] 0.8623549
##
## $set
## [1] 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43
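The posterior predictive mean is \(100 \times \frac{26.1}{78.2} \approx 33\), so
about 33 of the 100 students are expected to support the current
president. Moreover, \(P(23 \le \tilde{Y} \le 43) \approx 0.86\), so there is an
86% chance that between 23 and 43 students in the sample will
support the current president.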
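The predictive probabilities can also be reproduced without `pbetap`, using the beta-binomial probability mass function directly. This is a sketch under the assumption that the predictive distribution in (d) is beta-binomial with the Beta(26.1, 52.1) posterior; the variable names are illustrative.

```r
a_post <- 26.1
b_post <- 52.1
n <- 100      # size of the future sample
y <- 0:n

# Beta-binomial pmf on the log scale for numerical stability:
# P(Y = y) = choose(n, y) * B(a + y, b + n - y) / B(a, b)
log_pmf <- lchoose(n, y) + lbeta(a_post + y, b_post + n - y) -
  lbeta(a_post, b_post)
pred <- exp(log_pmf)

# Expected number of supporters among the n students
expected_y <- n * a_post / (a_post + b_post)

# Predictive probability that between 23 and 43 students support her
prob_23_43 <- sum(pred[y >= 23 & y <= 43])

print(expected_y)
print(prob_23_43)
```

The expected count answers the "how many" part of the question, while the interval probability conveys how uncertain that count is.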
e. Does the posterior distribution describe well the prediction in
(d)? Explain.
pred_p_sim <- rbeta(1000, 26.1, 52.1)
pred_y_sim <- rbinom(1000,100, pred_p_sim)
hist(pred_y_sim,xlab="Simulated Y", main = " ")
abline(v = mean(pred_y_sim), col = "red", lwd = 3, lty = 2)

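Scaled to a sample of 100, the observed proportion 21/68 ≈ 0.31
corresponds to about 31 students, which lies near the middle of the
simulated distribution. Notice also that the Bayesian point estimate of p
is the mean of Beta(a', b'), 26.1/78.2 ≈ 0.334, so about 33 of 100
students are expected to support the president, close to the simulated
mean marked by the red dashed line. This suggests that the posterior
distribution describes the prediction in (d) well.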