Use the \(\chi^2\) statistic with \(n-1\) degrees of freedom to test a population variance against a hypothesized value, or to build an interval estimate of \(\sigma^2\) around the point estimate \(s^2\), which is an unbiased estimator of \(\sigma^2\). For repeated samples of size \(n\) from a normal population, the statistic \(\chi^2 = \frac{(n-1)s^2}{\sigma^2}\) follows a chi-square distribution with \(n-1\) degrees of freedom. Because the chi-square distribution is asymmetric, the confidence interval for \(\sigma^2\) uses a separate critical value for each tail.
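\(\frac{(n-1)s^2}{\chi_H^2} < \sigma^2 < \frac{(n-1)s^2}{\chi_L^2}\), where \(\chi_L^2\) and \(\chi_H^2\) are the \(\alpha/2\) and \(1-\alpha/2\) quantiles of the chi-square distribution with \(n-1\) degrees of freedom.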
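The interval follows from inverting the probability statement for the chi-square statistic. The short derivation below is a sketch that writes \(\chi^2_{p}\) for the quantile of the chi-square distribution with \(n-1\) degrees of freedom that has lower-tail probability \(p\); this quantile notation is an assumption introduced here for clarity. It shows why the larger critical value ends up in the lower limit.

```latex
\begin{align*}
P\!\left(\chi^2_{\alpha/2} < \frac{(n-1)s^2}{\sigma^2} < \chi^2_{1-\alpha/2}\right) &= 1-\alpha \\
% All terms are positive, so taking reciprocals reverses the inequalities.
P\!\left(\frac{1}{\chi^2_{1-\alpha/2}} < \frac{\sigma^2}{(n-1)s^2} < \frac{1}{\chi^2_{\alpha/2}}\right) &= 1-\alpha \\
% Multiply through by (n-1)s^2 > 0.
P\!\left(\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{\alpha/2}}\right) &= 1-\alpha
\end{align*}
```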
Do not use chi-square inference procedures if the sample data has substantial skewness or a substantial number of outliers. If the sample data is non-normal, use a bootstrapping technique.
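One possible bootstrapping technique is a percentile bootstrap interval for the variance. The sketch below uses only base R and reuses the sample from the example that follows. The number of resamples (`n_boot`), the seed, and the choice of the percentile method are assumptions made for illustration, and with a sample this small the percentile interval may well be too narrow.

```r
# Percentile bootstrap interval for a variance (illustrative sketch, base R only).
set.seed(123)   # arbitrary seed, assumed here only for reproducibility

x <- c(12.43, 11.71, 14.41, 11.05, 9.53,
       11.66, 9.33, 11.71, 14.35, 13.81)
alpha <- 0.05
n_boot <- 5000  # number of resamples; an assumed, adjustable value

# Resample x with replacement and recompute the sample variance each time.
boot_var <- replicate(n_boot, var(sample(x, size = length(x), replace = TRUE)))

# The alpha/2 and 1 - alpha/2 quantiles of the bootstrap variances form the interval.
(boot_ci <- quantile(boot_var, probs = c(alpha / 2, 1 - alpha / 2)))
```

The resulting interval can be compared with the chi-square interval; a large disagreement is a hint that the normality assumption matters for the data at hand.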
The size of prey (in millimeters) of two species of net-casting spiders, Deinopis (X) and Menneus (Y), is sampled for \(n_X = n_Y = 10\) spiders. What is the difference in the mean size of the prey of the two species?
library(dplyr)
library(ggplot2)
library(EnvStats)
x <- c(12.43, 11.71, 14.41, 11.05, 9.53,
       11.66, 9.33, 11.71, 14.35, 13.81)
(x_bar <- mean(x))
## [1] 11.999
(s <- sd(x))
## [1] 1.801471
(n <- length(x))
## [1] 10
df <- n - 1
alpha <- 0.05
sigma <- 1.00  # hypothesized population standard deviation under H0
# Check the one condition for a single variance test: the population is normally distributed.
# The sample is approximately normal (see the Q-Q plot below), so assume a normal population.
qqnorm(x)
qqline(x)
# Apply the Anderson-Darling normality test. The p-value is > 0.05, so do not reject H0 that the data are normally distributed.
library(nortest)
ad.test(x)
##
## Results of Hypothesis Test
## --------------------------
##
## Alternative Hypothesis:
##
## Test Name: Anderson-Darling normality test
##
## Data: x
##
## Test Statistic: A = 0.3383163
##
## P-value: 0.4225886
# Conduct a chi-square test on the variance. The p-value = 0.001196, so reject H0 that sigma^2 = 1. The 95% confidence interval for sigma^2 is (1.535, 10.816).
varTest(x = x, alternative = "two.sided", sigma.squared = sigma^2, conf.level = (1 - alpha))
##
## Results of Hypothesis Test
## --------------------------
##
## Null Hypothesis: variance = 1
##
## Alternative Hypothesis: True variance is not equal to 1
##
## Test Name: Chi-Squared Test on Variance
##
## Estimated Parameter(s): variance = 3.245299
##
## Data: x
##
## Test Statistic: Chi-Squared = 29.20769
##
## Test Statistic Parameter: df = 9
##
## P-value: 0.001195555
##
## 95% Confidence Interval: LCL = 1.535407
## UCL = 10.816103
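As a cross-check, the test statistic and two-sided p-value can be recomputed by hand. The sketch below uses only base R and should agree with the `varTest()` output above. The name `sigma_sq_0` is a hypothetical helper introduced here, and doubling the smaller tail probability is assumed to be how the two-sided p-value is formed.

```r
# Recompute the chi-square variance test by hand with base R only.
x <- c(12.43, 11.71, 14.41, 11.05, 9.53,
       11.66, 9.33, 11.71, 14.35, 13.81)
sigma_sq_0 <- 1   # hypothesized variance under H0 (hypothetical variable name)
n <- length(x)
df <- n - 1

# Test statistic: (n - 1) * s^2 / sigma_0^2
chi_sq_stat <- df * var(x) / sigma_sq_0

# Two-sided p-value: double the smaller of the two tail probabilities.
p_lower <- pchisq(chi_sq_stat, df = df, lower.tail = TRUE)
p_upper <- pchisq(chi_sq_stat, df = df, lower.tail = FALSE)
p_value <- 2 * min(p_lower, p_upper)

c(statistic = chi_sq_stat, p_value = p_value)
```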
# (1 - alpha) confidence interval graph
(lcl = (n - 1) * s^2 / qchisq(p = alpha / 2, df = df, lower.tail = FALSE))
## [1] 1.535407
(ucl = (n - 1) * s^2 / qchisq(p = alpha / 2, df = df, lower.tail = TRUE))
## [1] 10.8161
s_rnd = round(s, 2)
# Evaluate the chi-square density on a grid and flag the tails outside the confidence limits.
dat <- data.frame(chi_sq = 100:3000 / 100) %>%
  mutate(sigma_sq = (n - 1) * s^2 / chi_sq) %>%
  mutate(prob = dchisq(x = chi_sq, df = df)) %>%
  mutate(rr = ifelse(sigma_sq < lcl | sigma_sq > ucl, prob, 0))
# Tick positions on the chi^2 axis, labeled with the matching sigma^2 values.
x_breaks <- c(1, (n - 1), 30)
ggplot(dat) +
  geom_line(aes(x = chi_sq, y = prob)) +
  geom_area(aes(x = chi_sq, y = rr), alpha = 0.3) +
  geom_vline(aes(xintercept = (n - 1)), color = "blue") +
  labs(title = bquote('95% Interval Estimate'),
       subtitle = bquote('s^2 = '~.(s_rnd^2)~' LCL'~.(lcl)~' UCL'~.(ucl)~' using chisq dist with'~.(df)~'df.'),
       x = "chi^2",
       y = "Probability") +
  scale_x_continuous(breaks = x_breaks, labels = round((n - 1) * s^2 / x_breaks, 2))
# Hypothesis test graph
lcl <- (n - 1) * sigma^2 / qchisq(p = alpha / 2, df = df, lower.tail = FALSE)
ucl <- (n - 1) * sigma^2 / qchisq(p = alpha / 2, df = df, lower.tail = TRUE)
# Blue line: n - 1, the mean of the chi-square distribution. Red line: the observed test statistic.
data.frame(chi_sq = 100:3000 / 100) %>%
  mutate(sigma_sq = (n - 1) * sigma^2 / chi_sq) %>%
  mutate(prob = dchisq(x = chi_sq, df = df)) %>%
  mutate(rr = ifelse(sigma_sq < lcl | sigma_sq > ucl, prob, 0)) %>%
  ggplot() +
  geom_line(aes(x = chi_sq, y = prob)) +
  geom_area(aes(x = chi_sq, y = rr), alpha = 0.3) +
  geom_vline(aes(xintercept = (n - 1)), color = "blue") +
  geom_vline(aes(xintercept = (n - 1) * s^2 / sigma^2), color = "red") +
  labs(title = bquote('Hypothesis Test of H0: sigma^2 = 1'),
       subtitle = bquote('s^2 = '~.(s_rnd^2)~' LCL'~.(lcl)~' UCL'~.(ucl)~' using chisq dist with'~.(df)~'df.'),
       x = "chi^2",
       y = "Probability") +
  theme(legend.position = "none")