The proportion difference test uses the difference in sample proportions \(\hat{d} = p_X - p_Y\)Y as an estimate of the difference in population proportions \(d = \pi_X - \pi_Y\) to evaluate an hypothesized difference in population proportions \(d_0 = \pi_{X_0} - \pi_{Y_0}\). The null hypothesis is \(H_0: d = d_0\) and the alternative hypothesis is \(H_a: d \ne d_0\), \(H_a: d < d_0\), or \(H_a: d > d_0\). One can also construct a \((1 - \alpha)\%\) confidence interval around \(\hat{d}\) to estimate \(d\) within a margin of error \(\epsilon\).1 See PSU STAT 500 online material, and Stat Trek.
The normal approximation method applies when the central limit theorem conditions hold:
Under the normal approximation method, the sampling distribution of the difference in population proportions has a normal distribution centered at the measured difference in proportions \(\hat{d} = p_X - p_Y\)Y with standard error \(se = \sqrt{\frac{p_X (1 - p_X)}{n_X} + \frac{p_Y (1 - p_Y)}{n_Y}}\). The measured difference in proportions \(\hat{d}\) and standard deviation \(s\) approximate the population difference in proportions \(d = \pi_X - \pi_Y\) and standard deviation \(\sigma\). Test \(H_0: d = d_0\) with test statistic \(Z = \frac{\hat{d} - d_0}{se_{d_0}}\) where \(se_{d_0} = \sqrt{p (1 - p) \frac{1}{n_X} + \frac{1}{n_Y}}\) is the standard error of the hypothesized population proportion sampling distribution and \(p\) is the pooled proportion \(p = \frac{x+y}{n_X + n_Y}\).[Why do we pool the proporitions for the hypothesis test, but not for the CI?] Define a \((1 - \alpha)\%\) confidence interval as \(\hat{d} \pm z_{\alpha {/} 2} se\) where \(z_{\alpha {/} 2} se = \epsilon\) is the margin of error.
In an experiment on the effectiveness of computer training, \(n_X = 125\) students were assigned to computer training and \(n_Y = 175\) were assigned to traditional training. \(x = 94 (75.2 \%)\) of the computer training students passed the certification exam and \(y = 113 (64.6 \%)\) of the tradition training students passed the certification exam, a difference of $ = .752 - .646 = .1063. At an \(\alpha = 0.05\) level of significance, does computer training produce higher success rates than traditional training? What is the \((1 - \alpha) = 95%\) confidence interval of the difference in success rates?
x <- 94
n_x <- 125
y <- 113
n_y <- 175
(p_x <- x / n_x)
## [1] 0.752
(p_y <- y / n_y)
## [1] 0.6457143
(d <- (p_x - p_y))
## [1] 0.1062857
(d_hat <- p_x - p_y)
## [1] 0.1062857
alpha <- 0.05
The samples are independently drawn because the experiment is random assignment. There are \(>= 5\) success, \(>= 5\) failures, \(>= 30\) observations, and \(.2 < p <.8\) probability of success for both \(X\) and \(Y\), so we can use the difference in proportions test.
The null hypothesis is \(H_0: d = d_0 = (\pi_x - \pi_y) = 0\) with alternative hypothesis \(H_a: d > 0\). This is a right-tail test. The test statistic is \(Z = \frac{d - d_0}{se_{d_0}}\) where \(se_{d_0} = \sqrt{p (1 - p) \frac{{1}}{{n_X}} + \frac{{1}}{{n_Y}}}\) with \(p = \frac{{x+y}}{{n_X + n_Y}} = \frac{94 + 113}{125 + 175} = 0.69\). \(se_{d_0} = \sqrt{.69 (1 - .69) \frac{{1}}{{125}} + \frac{{1}}{{175}}} = .0542\). \(Z = \frac{.1063 - 0}{.0542} = 1.96\). \(P(z > 1.96) = .0249\), so reject \(H_0\) at the \(\alpha = 0.05\) level of significance.
d_0 <- 0
(p <- (x + y) / (n_x + n_y))
## [1] 0.69
(se_d0 <- sqrt(p * (1 - p) * (1 / n_x + 1 / n_y)))
## [1] 0.05416166
(z = (d_hat - d_0) / se_d0)
## [1] 1.962379
pnorm(q = d,
mean = d_0,
sd = se_d0,
lower.tail = FALSE)
## [1] 0.02485918
R package stats function prop.test does this same calculation.
(prop.test.result <- prop.test(x = c(x, y),
n = c(n_x, n_y),
alternative = "greater",
conf.level = 1 - alpha,
correct = FALSE))
##
## 2-sample test for equality of proportions without continuity
## correction
##
## data: c(x, y) out of c(n_x, n_y)
## X-squared = 3.8509, df = 1, p-value = 0.02486
## alternative hypothesis: greater
## 95 percent confidence interval:
## 0.01926052 1.00000000
## sample estimates:
## prop 1 prop 2
## 0.7520000 0.6457143
Graph the hypothesis test with rejection range in red.
library(ggplot2)
library(dplyr) # for %>%
lrr = -Inf
urr = qnorm(p = alpha, mean = d_0, sd = se_d0, lower.tail = FALSE)
data.frame(d = -300:300 / 1000) %>%
mutate(density = dnorm(x = d, mean = d_0, sd = se_d0)) %>%
mutate(rr = ifelse(d < lrr | d > urr, density, 0)) %>%
ggplot() +
geom_line(aes(x = d, y = density)) +
geom_area(aes(x = d, y = rr, fill = "red"), alpha = 0.3) +
geom_vline(aes(xintercept = d_hat), color = "blue") +
geom_vline(aes(xintercept = d_0), color = "black") +
labs(title = bquote("Hypothesis Test of Difference in Proportions"),
subtitle = bquote('p_X ='~.(round(p_x,3))~'p_Y ='~.(round(p_y,3))),
x = "d",
y = "Density") +
theme(legend.position="none")
Define a \((1 - \alpha)\%\) confidence interval as \(\hat{d} \pm z_{\alpha {/} 2} se\) where \(z_{\alpha {/} 2} se = \epsilon\) is the margin of error. \(se = \sqrt{\frac{p_X (1 - p_X)}{n_X} + \frac{p_Y (1 - p_Y)}{n_Y}} = \sqrt{\frac{.752 (1 - .752)}{125} + \frac{.646 (1 - .646)}{175}} = .0529\) and \(z_{\alpha {/} 2} = 1.96\).
(se <- sqrt(p_x * (1 - p_x) / n_x + p_y * (1 - p_y) / n_y))
## [1] 0.05290756
(z_crit <- qnorm(p = alpha / 2, mean = 0, sd = 1, lower.tail = FALSE))
## [1] 1.959964
lcl <- d_hat - z_crit * se
ucl <- d_hat + z_crit * se
cat("95% CI: ", round(d_hat, 4), "+/-", round(se * z_crit, 4), "= (", round(lcl, 4), ", ", round(ucl, 4), ")")
## 95% CI: 0.1063 +/- 0.1037 = ( 0.0026 , 0.21 )
R function prop.test can calculate a confidence interval around a proportion. In this case, use a two-tail distribution.
(prop.test.result <- prop.test(x = c(x, y),
n = c(n_x, n_y),
conf.level = 1 - alpha,
correct = FALSE))
##
## 2-sample test for equality of proportions without continuity
## correction
##
## data: c(x, y) out of c(n_x, n_y)
## X-squared = 3.8509, df = 1, p-value = 0.04972
## alternative hypothesis: two.sided
## 95 percent confidence interval:
## 0.002588801 0.209982628
## sample estimates:
## prop 1 prop 2
## 0.7520000 0.6457143
Graph the 95% CI.
lcl <- round(prop.test.result$conf.int[1], 3)
ucl <- round(prop.test.result$conf.int[2], 3)
data.frame(d = -300:300 / 1000) %>%
mutate(density = dnorm(x = d, mean = d_hat, sd = se)) %>%
mutate(rr = ifelse(d < lcl | d > ucl, density, 0)) %>%
ggplot() +
geom_line(aes(x = d, y = density)) +
geom_area(aes(x = d, y = rr), fill = "red", alpha = 0.3) +
geom_vline(aes(xintercept = d_hat), color = "blue") +
labs(title = bquote("Difference in Proportions Confidence Interval"),
subtitle = bquote('p_X ='~.(round(p_x,3))~'p_Y ='~.(round(p_y,3))),
x = "d",
y = "Density") +
theme(legend.position="none")