1 Three general steps of Bayesian inference:

Express an opinion about the location of the proportion \(p\) before sampling (prior).
Take the sample and record the observed proportion (data/likelihood).
Use Bayes’ rule to sharpen or update the previous opinion about \(p\) given the information in the sample (posterior).

2 An example

Suppose a person is interested in learning about the sleeping habits of graduating college students. He hears that doctors recommend eight hours of sleep for an average adult. What proportion of graduating students get at least eight hours of sleep?

Let \(p\) be the proportion of all graduating students who sleep (on a typical night) at least eight hours. We are interested in learning about the location of \(p\).

The value of the proportion \(p\) is unknown. In the Bayesian viewpoint, a person’s beliefs about the uncertainty in this proportion are represented by a probability distribution placed on this parameter. This distribution reflects the person’s subjective prior opinion about plausible values of \(p\).

A random sample of graduating students from a particular university will be taken to learn about this proportion. But first the researcher does some initial research to learn about the sleeping habits of graduating college students. This research will help him in constructing a prior distribution.

2.1 The prior distribution for \(p\)

One article about the sleep patterns of students reported that college students generally get less than eight hours of sleep, so \(p\) (the proportion that sleep at least eight hours) is likely smaller than 0.5. After some reflection, his best guess at the value of \(p\) is 0.3. But it is very plausible that this proportion could be any value in the interval from 0 to 0.5.

A simple approach for assessing a prior for \(p\) is to write down a list of plausible proportion values and then assign weights to these values. The person in our example believes that

0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95

are possible values for \(p\). Based on his beliefs, he assigns these values the corresponding weights

1, 5.2, 8, 7.2, 4.6, 2.1, 0.7, 0.1, 0, 0

which can be converted to prior probabilities by dividing each weight by their sum.

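# Grid of plausible values for p and the corresponding prior weights
p <- seq(0.05, 0.95, by = 0.1)
wt <- c(1, 5.2, 8, 7.2, 4.6, 2.1, 0.7, 0.1, 0, 0)
# Normalize the weights so that the prior probabilities sum to 1
prior <- wt / sum(wt)
PriorD <- data.frame(p, prior)
knitr::kable(PriorD)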

p	prior
0.05	0.0346021
0.15	0.1799308
0.25	0.2768166
0.35	0.2491349
0.45	0.1591696
0.55	0.0726644
0.65	0.0242215
0.75	0.0034602
0.85	0.0000000
0.95	0.0000000

library(ggplot2)
library(dplyr)  # provides the %>% pipe

PriorD %>%
  ggplot(aes(x = p, y = prior)) +
  geom_col(width = 0.001) +
  scale_y_continuous(expand = c(0, 0), limits = c(0, 0.3)) +
  labs(y = "Prior Probability") +
  theme_classic()
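As a quick sanity check, the prior can be compared with the beliefs stated above (a best guess near 0.3, with most of the mass below 0.5). The sketch below is only illustrative; the names `prior_total`, `prior_mean`, and `prior_below_half` are introduced here and are not part of the original notes.

```r
# Grid of candidate proportions and the weights given in the text
p <- seq(0.05, 0.95, by = 0.1)
wt <- c(1, 5.2, 8, 7.2, 4.6, 2.1, 0.7, 0.1, 0, 0)
prior <- wt / sum(wt)

# A valid discrete prior must sum to 1
prior_total <- sum(prior)

# Prior mean: probability-weighted average of the grid values
prior_mean <- sum(p * prior)

# Prior probability that p lies below 0.5
prior_below_half <- sum(prior[p < 0.5])

round(c(total = prior_total, mean = prior_mean, below_half = prior_below_half), 4)
```

The prior mean comes out at roughly 0.32 and the prior probability of \(p < 0.5\) at roughly 0.90, which is in line with the researcher's stated opinion.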

2.2 The likelihood of the data

A sample of 27 graduating students is taken – in this group, 11 record that they had at least eight hours of sleep the previous night. Based on the prior information and these observed data, the researcher is interested in estimating the proportion \(p\).

If we regard a “success” as sleeping at least eight hours and we take a random sample of size \(n\) with \(s\) successes and \(f\) failures, then the likelihood function is given by

\[ L(y|p) = \binom{n}{s} p^s (1-p)^f \propto p^s (1-p)^f \]

In our example, 11 of 27 students sleep a sufficient number of hours, so \(s = 11\) and \(f = 16\), and the likelihood function is

\[ L(y|p) \propto p^{11} (1-p)^{16} \]

# Likelihood kernel p^s (1 - p)^f with s = 11 and f = 16, evaluated on the grid
likelihood <- p^11 * (1 - p)^16
plot(p, likelihood, type="h", ylab="Likelihood(p)")
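Dropping the binomial coefficient is harmless because it does not depend on \(p\). The following sketch, using base R's `dbinom()`, checks this on the grid; the variable names `kernel`, `full`, and `ratio` are chosen here for illustration.

```r
p <- seq(0.05, 0.95, by = 0.1)
s <- 11
f <- 16
n <- s + f

# Likelihood kernel without the binomial coefficient
kernel <- p^s * (1 - p)^f

# Full binomial likelihood, including choose(n, s)
full <- dbinom(s, size = n, prob = p)

# The two differ only by the constant choose(n, s) at every grid value
ratio <- full / kernel
all.equal(ratio, rep(choose(n, s), length(p)))

# After normalizing over the grid, the constant cancels
all.equal(kernel / sum(kernel), full / sum(full))
```

Either version can therefore be multiplied by the prior; the normalized result is the same.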

2.3 The posterior distribution
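For a discrete prior, Bayes' rule multiplies each prior probability by the likelihood at that value and then rescales so the results sum to 1. Writing the grid values as \(p_1, \dots, p_{10}\) (an indexing introduced here for convenience), the computation carried out in this section is

\[ P(p_i \mid y) = \frac{P(p_i)\, L(y|p_i)}{\sum_{j=1}^{10} P(p_j)\, L(y|p_j)} \propto P(p_i)\, p_i^{11} (1-p_i)^{16} \]
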

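# Unnormalized posterior: prior times likelihood at each grid value
PL <- prior * likelihood
# Normalize so that the posterior probabilities sum to 1
post <- PL / sum(PL)
PostD <- data.frame(p, prior, post)
knitr::kable(PostD)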

p	prior	post
0.05	0.0346021	0.0000000
0.15	0.1799308	0.0022560
0.25	0.2768166	0.1291349
0.35	0.2491349	0.4767910
0.45	0.1591696	0.3338350
0.55	0.0726644	0.0558782
0.65	0.0242215	0.0020983
0.75	0.0034602	0.0000066
0.85	0.0000000	0.0000000
0.95	0.0000000	0.0000000

PostD %>%
  ggplot(aes(x = p, y = post)) +
  geom_col(width = 0.001) +
  scale_y_continuous(expand = c(0, 0), limits = c(0, 0.5)) +
  labs(y = "Posterior Probability") +
  theme_classic()

2.4 Better alternative

The R function pdisc() in the package LearnBayes computes the posterior probabilities. To use pdisc(), one inputs the vector of proportion values \(p\), the vector of prior probabilities prior, and a data vector data consisting of \(s\) and \(f\). The output of pdisc() is a vector of posterior probabilities.

library(LearnBayes)
data <- c(s = 11, f = 16)
post1 <- pdisc(p, prior, data)
knitr::kable(as.data.frame(cbind(p,prior,post1)))

p	prior	post1
0.05	0.0346021	0.0000000
0.15	0.1799308	0.0022560
0.25	0.2768166	0.1291349
0.35	0.2491349	0.4767910
0.45	0.1591696	0.3338350
0.55	0.0726644	0.0558782
0.65	0.0242215	0.0020983
0.75	0.0034602	0.0000066
0.85	0.0000000	0.0000000
0.95	0.0000000	0.0000000
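For readers who do not have LearnBayes installed, the same posterior can be sketched in base R. The helper `post_discrete_log()` below is hypothetical: it is not part of LearnBayes and is not a claim about how `pdisc()` is implemented. It works on the log scale, which may help avoid underflow for larger samples, and it assumes every grid value lies strictly between 0 and 1.

```r
# Hypothetical helper (not from LearnBayes): discrete posterior on the log scale
post_discrete_log <- function(p, prior, s, f) {
  keep <- prior > 0                      # values with zero prior stay at zero
  log_post <- rep(-Inf, length(p))
  log_post[keep] <- log(prior[keep]) +
    s * log(p[keep]) + f * log(1 - p[keep])
  log_post <- log_post - max(log_post)   # shift so the largest term is exp(0)
  post <- exp(log_post)
  post / sum(post)                       # normalize to sum to 1
}

p <- seq(0.05, 0.95, by = 0.1)
wt <- c(1, 5.2, 8, 7.2, 4.6, 2.1, 0.7, 0.1, 0, 0)
prior <- wt / sum(wt)

post_log <- post_discrete_log(p, prior, s = 11, f = 16)
round(post_log, 7)
```

The result should agree with the table above up to rounding.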

2.5 Inference questions

What is the expected proportion of graduating students who have slept at least eight hours per night?
What is the chance that between 30 and 50 percent of graduating students have slept at least eight hours per night?
Is there statistical evidence that less than 50% of graduating students have slept at least eight hours per night?

To answer Question 1, we can use the classical estimator which is given by

\[ \hat{p} = \frac{11}{27} \approx 0.41 \]

About 41% of graduating students have slept at least eight hours per night.

In Bayesian statistics, there are many statistics we can use to summarize the posterior distribution and obtain an estimate of \(p\). One example is the mean of the posterior distribution. For this example, the posterior mean is

\[ \hat{p}_{Bayes} \approx 0.38 \]

The mode of the posterior distribution is 0.35.
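Both summaries can be read off the posterior vector directly. The sketch below recomputes the posterior so that it runs on its own; `post_mean` and `post_mode` are names introduced here.

```r
p <- seq(0.05, 0.95, by = 0.1)
wt <- c(1, 5.2, 8, 7.2, 4.6, 2.1, 0.7, 0.1, 0, 0)
prior <- wt / sum(wt)
likelihood <- p^11 * (1 - p)^16
post <- prior * likelihood / sum(prior * likelihood)

# Posterior mean: probability-weighted average of the grid values
post_mean <- sum(p * post)

# Posterior mode: the grid value with the largest posterior probability
post_mode <- p[which.max(post)]

round(c(mean = post_mean, mode = post_mode), 2)
```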

The answer to Question 2 is:

\[ P(0.30 \leq p \leq 0.5) = P(p = 0.35) + P(p = 0.45) = 0.8106 \]

The answer to Question 3 can be framed as a hypothesis test:

\(H_0: p \geq 0.5\) vs \(H_1: p < 0.5\)

\[\begin{align} P(H_1) &= P(p < 0.5) \notag \\ &= P(p = 0.05) + P(p = 0.15) + P(p = 0.25) + P(p = 0.35) + P(p = 0.45) \notag \\ &\approx 0.9420 \end{align}\]

Therefore, it is highly likely that less than 50% of graduating students have slept at least eight hours per night.
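The probabilities used for Questions 2 and 3 are sums of posterior probabilities over the relevant grid values. A minimal sketch, recomputing the posterior so the block is self-contained (`prob_q2` and `prob_q3` are names introduced here):

```r
p <- seq(0.05, 0.95, by = 0.1)
wt <- c(1, 5.2, 8, 7.2, 4.6, 2.1, 0.7, 0.1, 0, 0)
prior <- wt / sum(wt)
likelihood <- p^11 * (1 - p)^16
post <- prior * likelihood / sum(prior * likelihood)

# Question 2: posterior probability that p is between 0.30 and 0.50
prob_q2 <- sum(post[p >= 0.30 & p <= 0.50])

# Question 3: posterior probability that p is below 0.50
prob_q3 <- sum(post[p < 0.50])

round(c(Q2 = prob_q2, Q3 = prob_q3), 4)
```

Logical indexing picks out the grid values 0.35 and 0.45 for Question 2, and the five grid values from 0.05 to 0.45 for Question 3.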

2.6 Question to ponder

What happens if the prior distribution assigns equal probability (non-informative prior) for each possible value of \(p\)?

3 Takeaways

Bayesian inference (point estimation, credible intervals, and hypothesis testing) is based on the posterior distribution of the parameter \(p\).
The posterior distribution of the parameter is determined by the prior distribution and the likelihood (data).
If the prior distribution is “non-informative”, the posterior distribution is “dominated” by the likelihood.
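The last point can be checked on the sleep example with a flat prior over the same ten grid values. This is a sketch under that assumption; `prior_flat` and `post_flat` are names introduced here.

```r
p <- seq(0.05, 0.95, by = 0.1)
likelihood <- p^11 * (1 - p)^16

# Flat ("non-informative") prior: equal probability on every grid value
prior_flat <- rep(1 / length(p), length(p))

# Posterior under the flat prior
post_flat <- prior_flat * likelihood / sum(prior_flat * likelihood)

# With a flat prior the posterior is just the normalized likelihood
all.equal(post_flat, likelihood / sum(likelihood))

# Grid value with the highest posterior probability under the flat prior
p[which.max(post_flat)]
```

Under the flat prior the posterior mode moves to 0.45, the grid value where the likelihood is largest, rather than the 0.35 obtained with the informative prior.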

Stat 136 (Bayesian Statistics)

Lesson 2.1 (Bayesian inference for a proportion using a discrete prior)

NE Milla, Jr.

2023-03-01