Difference in Proportions CI and Test

The proportion difference test uses the difference in sample proportions $\hat{d} = p_X - p_Y$Y as an estimate of the difference in population proportions $d = \pi_X - \pi_Y$ to evaluate an hypothesized difference in population proportions $d_0 = \pi_{X_0} - \pi_{Y_0}$. The null hypothesis is $H_0: d = d_0$ and the alternative hypothesis is $H_a: d \ne d_0$, $H_a: d < d_0$, or $H_a: d > d_0$. One can also construct a $(1 - \alpha)\%$ confidence interval around $\hat{d}$ to estimate $d$ within a margin of error $\epsilon$.11 See PSU STAT 500 online material, and Stat Trek.

The normal approximation method applies when the central limit theorem conditions hold:

Under the normal approximation method, the sampling distribution of the difference in population proportions has a normal distribution centered at the measured difference in proportions $\hat{d} = p_X - p_Y$Y with standard error $se = \sqrt{\frac{p_X (1 - p_X)}{n_X} + \frac{p_Y (1 - p_Y)}{n_Y}}$. The measured difference in proportions $\hat{d}$ and standard deviation $s$ approximate the population difference in proportions $d = \pi_X - \pi_Y$ and standard deviation $\sigma$. Test $H_0: d = d_0$ with test statistic $Z = \frac{\hat{d} - d_0}{se_{d_0}}$ where $se_{d_0} = \sqrt{p (1 - p) \frac{1}{n_X} + \frac{1}{n_Y}}$ is the standard error of the hypothesized population proportion sampling distribution and $p$ is the pooled proportion $p = \frac{x+y}{n_X + n_Y}$.[Why do we pool the proporitions for the hypothesis test, but not for the CI?] Define a $(1 - \alpha)\%$ confidence interval as $\hat{d} \pm z_{\alpha {/} 2} se$ where $z_{\alpha {/} 2} se = \epsilon$ is the margin of error.

Example

In an experiment on the effectiveness of computer training, $n_X = 125$ students were assigned to computer training and $n_Y = 175$ were assigned to traditional training. $x = 94 (75.2 \%)$ of the computer training students passed the certification exam and $y = 113 (64.6 \%)$ of the tradition training students passed the certification exam, a difference of $ = .752 - .646 = .1063. At an $\alpha = 0.05$ level of significance, does computer training produce higher success rates than traditional training? What is the $(1 - \alpha) = 95%$ confidence interval of the difference in success rates?

x <- 94
n_x <- 125
y <- 113
n_y <- 175
(p_x <- x / n_x)

## [1] 0.752

(p_y <- y / n_y)

## [1] 0.6457143

(d <- (p_x - p_y))

## [1] 0.1062857

(d_hat <- p_x - p_y)

## [1] 0.1062857

alpha <- 0.05

The samples are independently drawn because the experiment is random assignment. There are $>= 5$ success, $>= 5$ failures, $>= 30$ observations, and $.2 < p <.8$ probability of success for both $X$ and $Y$, so we can use the difference in proportions test.

The null hypothesis is $H_0: d = d_0 = (\pi_x - \pi_y) = 0$ with alternative hypothesis $H_a: d > 0$. This is a right-tail test. The test statistic is $Z = \frac{d - d_0}{se_{d_0}}$ where $se_{d_0} = \sqrt{p (1 - p) \frac{{1}}{{n_X}} + \frac{{1}}{{n_Y}}}$ with $p = \frac{{x+y}}{{n_X + n_Y}} = \frac{94 + 113}{125 + 175} = 0.69$. $se_{d_0} = \sqrt{.69 (1 - .69) \frac{{1}}{{125}} + \frac{{1}}{{175}}} = .0542$. $Z = \frac{.1063 - 0}{.0542} = 1.96$. $P(z > 1.96) = .0249$, so reject $H_0$ at the $\alpha = 0.05$ level of significance.

d_0 <- 0

(p <- (x + y) / (n_x + n_y))

## [1] 0.69

(se_d0 <- sqrt(p * (1 - p) * (1 / n_x + 1 / n_y)))

## [1] 0.05416166

(z = (d_hat - d_0) / se_d0)

## [1] 1.962379

pnorm(q = d, 
      mean = d_0, 
      sd = se_d0, 
      lower.tail = FALSE)

## [1] 0.02485918

R package stats function prop.test does this same calculation.

(prop.test.result <- prop.test(x = c(x, y), 
                               n = c(n_x, n_y), 
                               alternative = "greater", 
                               conf.level = 1 - alpha, 
                               correct = FALSE))

## 
##  2-sample test for equality of proportions without continuity
##  correction
## 
## data:  c(x, y) out of c(n_x, n_y)
## X-squared = 3.8509, df = 1, p-value = 0.02486
## alternative hypothesis: greater
## 95 percent confidence interval:
##  0.01926052 1.00000000
## sample estimates:
##    prop 1    prop 2 
## 0.7520000 0.6457143

Graph the hypothesis test with rejection range in red.

library(ggplot2)
library(dplyr)  # for %>%

lrr = -Inf
urr = qnorm(p = alpha, mean = d_0, sd = se_d0, lower.tail = FALSE)
data.frame(d = -300:300 / 1000) %>%
  mutate(density = dnorm(x = d, mean = d_0, sd = se_d0)) %>%
  mutate(rr = ifelse(d < lrr | d > urr, density, 0)) %>%
ggplot() +
  geom_line(aes(x = d, y = density)) +
  geom_area(aes(x = d, y = rr, fill = "red"), alpha = 0.3) +
  geom_vline(aes(xintercept = d_hat), color = "blue") +
  geom_vline(aes(xintercept = d_0), color = "black") +
  labs(title = bquote("Hypothesis Test of Difference in Proportions"),
       subtitle = bquote('p_X ='~.(round(p_x,3))~'p_Y ='~.(round(p_y,3))),
       x = "d",
       y = "Density") +
  theme(legend.position="none")

Define a $(1 - \alpha)\%$ confidence interval as $\hat{d} \pm z_{\alpha {/} 2} se$ where $z_{\alpha {/} 2} se = \epsilon$ is the margin of error. $se = \sqrt{\frac{p_X (1 - p_X)}{n_X} + \frac{p_Y (1 - p_Y)}{n_Y}} = \sqrt{\frac{.752 (1 - .752)}{125} + \frac{.646 (1 - .646)}{175}} = .0529$ and $z_{\alpha {/} 2} = 1.96$.

(se <- sqrt(p_x * (1 - p_x) / n_x + p_y * (1 - p_y) / n_y))

## [1] 0.05290756

(z_crit <- qnorm(p = alpha / 2, mean = 0, sd = 1, lower.tail = FALSE))

## [1] 1.959964

lcl <- d_hat - z_crit * se
ucl <- d_hat + z_crit * se
cat("95% CI: ", round(d_hat, 4), "+/-", round(se * z_crit, 4), "= (", round(lcl, 4), ", ", round(ucl, 4), ")")

## 95% CI:  0.1063 +/- 0.1037 = ( 0.0026 ,  0.21 )

R function prop.test can calculate a confidence interval around a proportion. In this case, use a two-tail distribution.

(prop.test.result <- prop.test(x = c(x, y), 
                               n = c(n_x, n_y), 
                               conf.level = 1 - alpha, 
                               correct = FALSE))

## 
##  2-sample test for equality of proportions without continuity
##  correction
## 
## data:  c(x, y) out of c(n_x, n_y)
## X-squared = 3.8509, df = 1, p-value = 0.04972
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  0.002588801 0.209982628
## sample estimates:
##    prop 1    prop 2 
## 0.7520000 0.6457143

Graph the 95% CI.

lcl <- round(prop.test.result$conf.int[1], 3)
ucl <- round(prop.test.result$conf.int[2], 3)
data.frame(d = -300:300 / 1000) %>%
  mutate(density = dnorm(x = d, mean = d_hat, sd = se)) %>%
  mutate(rr = ifelse(d < lcl | d > ucl, density, 0)) %>%
ggplot() +
  geom_line(aes(x = d, y = density)) +
  geom_area(aes(x = d, y = rr), fill = "red", alpha = 0.3) +
  geom_vline(aes(xintercept = d_hat), color = "blue") +
  labs(title = bquote("Difference in Proportions Confidence Interval"),
       subtitle = bquote('p_X ='~.(round(p_x,3))~'p_Y ='~.(round(p_y,3))),
       x = "d",
       y = "Density") +
  theme(legend.position="none")

Difference in Proportions CI and Test

Calculating confidence intervals and conducting hypothesis tests for the difference in proportions of two independent categorical variables using R.

Michael Foley

2019-02-25

Example