Difference in Variances CI and Test

Calculating confidence intervals and conducting hypothesis tests for the difference (ratio) in variances of two independent quantitative variables using R.

Michael Foley

2019-03-01

The F test to compare two variances uses the ratio of sample variances \(\hat{r} = s_X^2 / s_Y^2\) as an estimate of the ratio of population variances \(r = \sigma_X^2 / \sigma_Y^2\) to evaluate a hypothesized ratio of population variances \(r_0\). Why use this test? Use it to test the equal-variances assumption in the pooled-variances t-test, or in quality control situations to identify the smaller-variance process. The F test uses the F statistic \(F = (s_X^2 / s_Y^2) / (\sigma_X^2 / \sigma_Y^2)\). When \(\sigma_X^2 = \sigma_Y^2\), the F statistic reduces to \(F = s_X^2 / s_Y^2\). The F distribution starts at zero and is skewed right, like the chi-square. The F test returns valid results only if the samples are independent and normally distributed. If the data fail the normality condition, use Bonett's test, or Levene's test when the data are heavily skewed or the samples are small (n < 20). The null hypothesis is \(H_0: r = r_0\), and the alternative hypothesis is \(H_a: r \ne r_0\), \(H_a: r < r_0\), or \(H_a: r > r_0\). One can also construct a \(100(1 - \alpha)\%\) confidence interval around \(\hat{r}\) to estimate \(r\) as \(\left( \hat{r} / F_{1 - \alpha/2, df_X, df_Y},\; \hat{r} / F_{\alpha/2, df_X, df_Y} \right)\), where \(F_{p, df_X, df_Y}\) is the \(p\) quantile of the F distribution with \(df_X\) and \(df_Y\) degrees of freedom. See the PSU STAT 500 online material.
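
To see why \(F = s_X^2 / s_Y^2\) follows an F distribution under equal variances, here is a minimal simulation sketch (the sample size and seed are arbitrary choices, unrelated to the example below):

set.seed(1)  # arbitrary seed, for reproducibility
n <- 10
# Ratio of sample variances from two independent standard normal samples.
ratios <- replicate(10000, var(rnorm(n)) / var(rnorm(n)))
# The simulated 97.5th percentile should approximate the theoretical F quantile.
quantile(ratios, 0.975)
qf(p = 0.975, df1 = n - 1, df2 = n - 1)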

Example

A packing plant tests the hypothesis that a faster new machine has the same packing-time variance as the old machine. It measures the packing time for \(n_{new} = n_{old} = 10\) cartons. At the \(\alpha = 5\%\) level of significance, does the new machine have the same variance as the old one?

library(readr)

machine <- read_tsv(file = "Data/machine.txt", col_names = TRUE)
# Rename the columns to convenient short names.
colnames(machine)[colnames(machine) == "New machine"] <- "new"
colnames(machine)[colnames(machine) == "Old machine"] <- "old"

(n_old <- nrow(machine))
## [1] 10
(n_new <- nrow(machine))
## [1] 10

The samples are independent because the machines are not related. The sample sizes are not large \((n_{new}, n_{old} < 30)\), so we need to check that each population is normally distributed before conducting the F test to compare two variances. Neither normal Q-Q plot below indicates a non-normal population. The Anderson-Darling normality tests both have p-values > .05, so do not reject the null hypothesis of normally distributed populations.

# Q-Q plots: points falling near the reference line are consistent
# with normally distributed populations.
qqnorm(machine$old)
qqline(machine$old)

qqnorm(machine$new)
qqline(machine$new)

library(nortest)  # for Anderson-Darling 
ad.test(machine$old)
## 
##  Anderson-Darling normality test
## 
## data:  machine$old
## A = 0.27475, p-value = 0.5782
ad.test(machine$new)
## 
##  Anderson-Darling normality test
## 
## data:  machine$new
## A = 0.14231, p-value = 0.9548
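
If the nortest package is not available, base R's shapiro.test is an alternative normality check (output not shown):

shapiro.test(machine$old)
shapiro.test(machine$new)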

The null hypothesis is that the ratio of variances is one, \(H_0: r = \sigma_{new}^2 / \sigma_{old}^2 = 1\), with the alternative hypothesis that the new machine has a different variance, \(H_a: r \ne 1\). We use an \(\alpha = 0.05\) level of significance.

r_0 <- 1       # hypothesized ratio of variances
alpha <- 0.05  # significance level

Conduct an F test to compare two variances. The F statistic is \(F = \frac{s_{new}^2}{s_{old}^2} = \frac{0.6835^2}{0.7499^2} = 0.8307\). Because \(f < 1\), the two-sided p-value is twice the lower-tail probability, \(p = 2 \cdot P(F_{9,9} \le f) = .7868\), so do not reject the null hypothesis of equal variances in packing times.

(s_new <- sd(machine$new))
## [1] 0.6834553
(s_old <- sd(machine$old))
## [1] 0.7498889
(f <- s_new^2 / s_old^2)
## [1] 0.8306659
# Two-sided p-value: double the lower tail (valid because f < 1).
pf(q = f, 
   df1 = n_new - 1, 
   df2 = n_old - 1, 
   lower.tail = TRUE) * 2
## [1] 0.7867797
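
Doubling the lower tail works here because \(f < 1\). A direction-agnostic alternative doubles the smaller of the two tails, which yields the same value in this example:

# Double the smaller tail so the calculation works whether f < 1 or f > 1.
2 * min(pf(q = f, df1 = n_new - 1, df2 = n_old - 1, lower.tail = TRUE),
        pf(q = f, df1 = n_new - 1, df2 = n_old - 1, lower.tail = FALSE))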

The F distribution is not symmetric, so calculate the lower and upper limits of the \(100(1 - \alpha)\%\) CI separately: \(\left( \frac{s_{new}^2 / s_{old}^2}{F_{1-\alpha/2, df_{new}, df_{old}}}, \frac{s_{new}^2 / s_{old}^2}{F_{\alpha/2, df_{new}, df_{old}}} \right) = (0.8307 / 4.026,\; 0.8307 / 0.2484) = (0.2063, 3.3443)\). Because the degrees of freedom are equal in this example, \(F_{\alpha/2, 9, 9} = 1 / F_{1-\alpha/2, 9, 9}\), so the code below multiplies by the opposite-tail quantiles instead: \(0.8307 \times (0.2484, 4.026)\).

(r <- s_new^2 / s_old^2)
## [1] 0.8306659
(f_lower <- qf(p = alpha / 2,
               df1 = n_new - 1, 
               df2 = n_old - 1, 
               lower.tail = TRUE))
## [1] 0.2483859
(f_upper <- qf(p = alpha / 2,
               df1 = n_new - 1, 
               df2 = n_old - 1, 
               lower.tail = FALSE))
## [1] 4.025994
# Multiplying by the opposite-tail quantiles works here because the degrees
# of freedom are equal: qf(alpha/2, 9, 9) = 1 / qf(1 - alpha/2, 9, 9).
(lcl <- r * f_lower)
## [1] 0.2063257
(ucl <- r * f_upper)
## [1] 3.344256
cat("95% CI: (", round(lcl, 4), ", ", round(ucl, 4), ")")
## 95% CI: ( 0.2063 ,  3.3443 )
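
When the sample sizes (and so the degrees of freedom) differ, the multiplication shortcut no longer applies. The general form divides the estimate by the opposite-tail quantiles and reproduces the same limits here:

# General CI form: divide by the opposite-tail quantile.
r / qf(p = 1 - alpha / 2, df1 = n_new - 1, df2 = n_old - 1)  # lower limit
r / qf(p = alpha / 2, df1 = n_new - 1, df2 = n_old - 1)      # upper limit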

The stats package function var.test does all of these calculations.

(var.test.result <- var.test(x = machine$new, 
                             y = machine$old, 
                             ratio = r_0, 
                             alternative = "two.sided", 
                             conf.level = (1 - alpha)))
## 
##  F test to compare two variances
## 
## data:  machine$new and machine$old
## F = 0.83067, num df = 9, denom df = 9, p-value = 0.7868
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
##  0.2063257 3.3442560
## sample estimates:
## ratio of variances 
##          0.8306659
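
var.test returns an htest object, so the individual components are available programmatically:

var.test.result$estimate  # ratio of variances, 0.8307
var.test.result$conf.int  # 95% CI, (0.2063, 3.3443)
var.test.result$p.value   # 0.7868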

The plot of the hypothesis test shows the observed ratio of variances (blue line) falls well outside the rejection region (shaded red); the black line marks the hypothesized ratio \(r_0 = 1\).

library(ggplot2)
library(dplyr)  # for %>%
library(ggthemes)

data.frame(f = 0:800 / 100) %>%
  mutate(prob = df(x = f, df1 = n_new - 1, df2 = n_old - 1)) %>%
  # Shade the rejection region beyond the critical values f_lower and f_upper.
  mutate(rr = ifelse(f < f_lower | f > f_upper, prob, 0)) %>%
ggplot() +
  geom_line(aes(x = f, y = prob)) +
  geom_area(aes(x = f, y = rr), fill = "red", alpha = 0.3) +
  geom_vline(aes(xintercept = s_new^2 / s_old^2), color = "blue") +
  geom_vline(aes(xintercept = r_0), color = "black") +
  labs(title = "F test to compare two variances",
       x = "f = s_new^2 / s_old^2",
       y = "Density") +
  theme_tufte()