We have a taste test experiment where a subject tries to identify which of two brands (e.g., tea or shampoo) they are provided with over a series of trials.
Goal: Determine whether the individual possesses discriminating powers beyond random guessing.
Key quantities:
Data from first 6 trials: 1, 1, 1, 1, 1, 0 (5 successes, 1 failure)
y <- c(1, 1, 1, 1, 1, 0)
n <- length(y)
x <- sum(y)
cat("Data:", paste(y, collapse = ", "))
## Data: 1, 1, 1, 1, 1, 0
cat("\nNumber of trials (n):", n)
##
## Number of trials (n): 6
cat("\nNumber of successes (x):", x)
##
## Number of successes (x): 5
cat("\nObserved proportion:", x/n)
##
## Observed proportion: 0.8333333
The likelihood function for the observed data:
\[ L(\theta \mid y) = \prod_{i=1}^n \theta^{y_i} (1-\theta)^{1-y_i} = \theta^x (1-\theta)^{n-x} \]
For our specific data with 5 successes and 1 failure:
\[ L(\theta \mid y) = \theta^5 (1-\theta)^1 \]
likelihood <- function(theta, x, n) {
return(theta^x * (1-theta)^(n-x))
}
test_theta <- c(0.3, 0.5, 0.7, 0.9)
lik_vals <- sapply(test_theta, likelihood, x = x, n = n)
lik_df <- data.frame(
theta = test_theta,
likelihood = lik_vals
)
kable(lik_df, digits = 4, caption = "Likelihood at selected θ values")
| theta | likelihood |
|---|---|
| 0.3 | 0.0017 |
| 0.5 | 0.0156 |
| 0.7 | 0.0504 |
| 0.9 | 0.0590 |
Null Hypothesis (\(H_0\))
- Statement: \(\theta = \frac{1}{2}\)
(no discriminatory power, just guessing)
- Prior: Point mass at \(\theta =
\frac{1}{2}\)
- Prior “density”: \(\pi_0(\theta = 1/2) =
1\)
Alternative Hypothesis (\(H_1\))
- Statement: \(\theta >
\frac{1}{2}\) (has some discriminatory ability)
- Prior: Uniform distribution on \((\frac{1}{2}, 1)\)
- Prior density: \(\pi_1(\theta) = 2\)
for \(\frac{1}{2} < \theta <
1\)
Why these priors?
Under \(H_0\):
The “1” is a normalization constant for a point mass. Since there’s only
one possible value under \(H_0\), the
probability must sum to 1:
\[ \sum_{\theta \in \Theta_0} \pi_0(\theta) = \pi_0(\theta = 1/2) = 1 \]
Under \(H_1\):
The “2” ensures proper normalization of the uniform density on an
interval of length \(1/2\):
\[ \int_{1/2}^1 \pi_1(\theta) d\theta = \int_{1/2}^1 2 \, d\theta = 2 \times \frac{1}{2} = 1 \]
par(mfrow = c(1, 2), mar = c(4, 4, 3, 1), cex.main = 0.9)
plot(0.5, 1, type = "h", lwd = 3, col = "blue",
xlim = c(0.4, 1.1), ylim = c(0, 2.5),
xlab = expression(theta), ylab = "Prior 'density'",
main = expression("Under" ~ H[0] ~ "(Point Mass)"))
points(0.5, 1, pch = 16, col = "blue", cex = 1.5)
text(0.5, 1.2, expression(pi[0](theta) == 1), pos = 3)
theta_seq <- seq(0.5, 1, length.out = 100)
plot(theta_seq, rep(2, 100), type = "l", lwd = 2, col = "red",
xlim = c(0.4, 1.1), ylim = c(0, 2.5),
xlab = expression(theta), ylab = "Prior density",
main = expression("Under" ~ H[1] ~ "(Uniform)"))
polygon(c(0.5, theta_seq, 1), c(0, rep(2, 100), 0),
col = rgb(1, 0, 0, 0.3), border = NA)
text(0.75, 2.2, expression(pi[1](theta) == 2), pos = 3)
Let’s visualize the likelihood function to understand how different values of \(\theta\) are supported by our data.
theta_plot <- seq(0, 1, length.out = 1000)
lik_plot <- sapply(theta_plot, likelihood, x = x, n = n)
lik_df_plot <- data.frame(theta = theta_plot, likelihood = lik_plot)
mle_theta <- x/n
mle_lik <- likelihood(mle_theta, x, n)
ggplot(lik_df_plot, aes(x = theta, y = likelihood)) +
geom_line(color = "darkgreen", size = 1.2) +
geom_vline(xintercept = 0.5, linetype = "dashed", color = "blue", size = 0.8) +
geom_vline(xintercept = mle_theta, linetype = "dashed", color = "red", size = 0.8) +
geom_point(aes(x = 0.5, y = likelihood(0.5, x, n)), color = "blue", size = 3) +
geom_point(aes(x = mle_theta, y = mle_lik), color = "red", size = 3) +
annotate("text", x = 0.5, y = likelihood(0.5, x, n) + 0.005,
label = expression(theta == 0.5), color = "blue") +
annotate("text", x = mle_theta, y = mle_lik + 0.005,
label = paste("MLE:", round(mle_theta, 3)), color = "red") +
labs(
title = "Likelihood Function",
subtitle = paste("Data:", paste(y, collapse = ", ")),
x = expression(theta),
y = expression(L(theta ~ "|" ~ y))
) +
theme_minimal()
Interpretation: The likelihood peaks at \(\theta = 5/6 \approx 0.833\), which is the observed proportion of successes. The null hypothesis value \(\theta = 0.5\) has lower likelihood, suggesting the data are more consistent with values greater than 0.5.
Under \(H_0\), \(\theta\) is fixed at \(\frac{1}{2}\). Therefore, the marginal likelihood is simply the likelihood evaluated at that point:
\[ m_0(y) = L(\theta = \frac{1}{2} | y) = \left(\frac{1}{2}\right)^5 \times \left(1 - \frac{1}{2}\right)^1 = \left(\frac{1}{2}\right)^6 \]
theta0 <- 1/2
m0 <- likelihood(theta0, x, n)
cat("m₀(y) = (1/2)^6 =", m0)
## m₀(y) = (1/2)^6 = 0.015625
cat("\nm₀(y) as fraction:", fractions(m0))
##
## m₀(y) as fraction: 0.015625
Step-by-step calculation:
\[ m_0(y) = \left(\frac{1}{2}\right)^5 \times \frac{1}{2} = \frac{1}{32} \times \frac{1}{2} = \frac{1}{64} = 0.015625 \]
Why no integration? Because \(H_0\) has no free parameters - \(\theta\) is fixed at a single value. The “prior” is a point mass, so the marginal likelihood is just the likelihood at that point.
Under \(H_1\), \(\theta\) can vary between \(\frac{1}{2}\) and \(1\). We need to integrate the likelihood weighted by the prior density:
\[ m_1(y) = \int_{1/2}^1 L(\theta \mid y) \pi_1(\theta) d\theta = \int_{1/2}^1 \theta^5 (1-\theta) \times 2 \, d\theta \]
Let’s solve this integral step by step.
Step 1: Expand the integrand
\[ \int \theta^5 (1-\theta) d\theta = \int (\theta^5 - \theta^6) d\theta \]
Step 2: Integrate term by term
\[ \int \theta^5 d\theta = \frac{\theta^6}{6}, \quad \int \theta^6 d\theta = \frac{\theta^7}{7} \]
So:
\[ \int (\theta^5 - \theta^6) d\theta = \frac{\theta^6}{6} - \frac{\theta^7}{7} \]
Step 3: Evaluate from \(1/2\) to \(1\)
\[ \left[\frac{\theta^6}{6} - \frac{\theta^7}{7}\right]_{1/2}^1 = \left(\frac{1}{6} - \frac{1}{7}\right) - \left(\frac{(1/2)^6}{6} - \frac{(1/2)^7}{7}\right) \]
Step 4: Simplify
\[ \frac{1}{6} - \frac{1}{7} = \frac{7}{42} - \frac{6}{42} = \frac{1}{42} \]
\[ \frac{(1/2)^6}{6} = \frac{1/64}{6} = \frac{1}{384}, \quad \frac{(1/2)^7}{7} = \frac{1/128}{7} = \frac{1}{896} \]
So:
\[ \frac{1}{384} - \frac{1}{896} = \frac{896 - 384}{384 \times 896} = \frac{512}{344064} = \frac{1}{672} \]
Step 5: Subtract
\[ \frac{1}{42} - \frac{1}{672} = \frac{16}{672} - \frac{1}{672} = \frac{15}{672} = \frac{5}{224} \]
Step 6: Multiply by the prior density (2)
\[ m_1(y) = 2 \times \frac{5}{224} = \frac{10}{224} = \frac{5}{112} \approx 0.04464286 \]
F_upper <- (1^6)/6 - (1^7)/7
F_lower <- ((1/2)^6)/6 - ((1/2)^7)/7
integral_value <- F_upper - F_lower
m1 <- 2 * integral_value
cat("Step 1: ∫ θ^5(1-θ) dθ from 1/2 to 1 =", integral_value)
## Step 1: ∫ θ^5(1-θ) dθ from 1/2 to 1 = 0.02232143
cat("\nStep 2: m₁(y) = 2 ×", integral_value, "=", m1)
##
## Step 2: m₁(y) = 2 × 0.02232143 = 0.04464286
cat("\n\nm₁(y) as fraction:", fractions(m1))
##
##
## m₁(y) as fraction: 0.04464286
We can verify our analytical result using numerical integration in R:
integrand <- function(theta) {
return(likelihood(theta, x, n) * 2)
}
m1_numeric <- integrate(integrand, lower = 1/2, upper = 1)
cat("Numerical integration result:")
## Numerical integration result:
print(m1_numeric)
## 0.04464286 with absolute error < 5e-16
cat("\nDifference from analytical:", abs(m1 - m1_numeric$value))
##
## Difference from analytical: 6.938894e-18
✅ Both methods agree perfectly! The marginal likelihood under \(H_1\) is \(5/112 \approx 0.04464\).
The following plots show what we’re integrating:
theta_vis <- seq(0.5, 1, length.out = 200)
likelihood_vis <- sapply(theta_vis, likelihood, x = x, n = n)
prior_vis <- rep(2, length(theta_vis))
integrand_vis <- likelihood_vis * prior_vis
vis_df <- data.frame(
theta = theta_vis,
likelihood = likelihood_vis,
prior = prior_vis,
integrand = integrand_vis
)
p1 <- ggplot(vis_df, aes(x = theta, y = integrand)) +
geom_line(color = "purple", size = 1.2) +
geom_ribbon(aes(ymin = 0, ymax = integrand),
fill = "purple", alpha = 0.3) +
labs(
title = "Integrand: L(θ|y) × π₁(θ)",
subtitle = paste("Area = m₁(y) =", round(m1, 6)),
x = expression(theta),
y = expression(L(theta~"|"~y) %*% pi[1](theta))
) +
theme_minimal()
vis_df_long <- vis_df %>%
dplyr::select(theta, likelihood, integrand) %>%
pivot_longer(cols = c(likelihood, integrand),
names_to = "Quantity",
values_to = "value")
p2 <- ggplot(vis_df_long, aes(x = theta, y = value, color = Quantity)) +
geom_line(size = 1.2) +
labs(
title = "Likelihood vs Integrand",
subtitle = "Integrand = Likelihood × Prior (prior = 2)",
x = expression(theta),
y = "Value"
) +
theme_minimal() +
scale_color_manual(values = c("likelihood" = "darkgreen",
"integrand" = "purple"),
labels = c(expression(L(theta~"|"~y)),
expression(L(theta~"|"~y) %*% pi[1](theta))))
grid.arrange(p1, p2, ncol = 2)
Key insight: The integrand is the product of the likelihood and the prior density. The area under this purple curve equals \(m_1(y)\), the marginal likelihood under \(H_1\).
The Bayes factor quantifies the evidence provided by the data for one hypothesis versus another. It is the ratio of the marginal likelihoods:
\[ B_{01}(y) = \frac{m_0(y)}{m_1(y)} \]
Interpretation:
B01 <- m0 / m1
B01_fraction <- fractions(B01)
cat("Bayes Factor B₀₁ =", B01)
## Bayes Factor B₀₁ = 0.35
cat("\nBayes Factor as fraction:", B01_fraction)
##
## Bayes Factor as fraction: 0.35
cat("\n\n1/B₀₁ =", 1/B01)
##
##
## 1/B₀₁ = 2.857143
Calculation:
\[ B_{01}(y) = \frac{1/64}{5/112} = \frac{1}{64} \times \frac{112}{5} = \frac{112}{320} = \frac{7}{20} = 0.35 \]
Alternatively:
\[ \frac{1}{B_{01}} = \frac{20}{7} \approx 2.857 \]
This matches the problem statement: \(B_{01}(y) = 1/2.86\)
Conclusion: Since \(B_{01} = 0.35 < 1\), the data favor \(H_1\) over \(H_0\). Specifically, the data are about 2.86 times more likely under the alternative hypothesis that the subject has discriminatory power than under the null hypothesis of random guessing.