In the Flight Delays Case Study in Section 1.1, the data contain flight delays for two airlines, American Airlines and United Airlines.
Conduct a two-sided permutation test to see if the difference in mean delay times between the two carriers is statistically significant.
The flight delays occurred in May and June of 2009. Conduct a two-sided permutation test to see if the difference in mean delay times between the two months is statistically significant.
library(dplyr)  # for the group_by()/summarise() pipelines used throughout
FD <- read.csv("http://www1.appstate.edu/~arnholta/Data/FlightDelays.csv")
# a. Difference in mean delay times between carriers
## Find the actual observed difference of means (UA - AA, to match the hypotheses below)
obs_diff <- FD %>%
  group_by(Carrier) %>%
  summarise(Mean = mean(Delay)) %>%
  summarise(actual_diff = Mean[2] - Mean[1])
obs_diff <- obs_diff$actual_diff
sims <- 10^4 - 1
perm_diffs <- numeric(sims)
for(i in 1:sims) {
  # randomly relabel a UA-sized group and recompute the difference in means
  index <- sample(nrow(FD), sum(FD$Carrier == "UA"), replace = FALSE)
  perm_diffs[i] <- mean(FD$Delay[index]) - mean(FD$Delay[-index])
}
p_value <- 2 * ((sum(perm_diffs >= obs_diff)) + 1)/(sims + 1)
p_value
[1] 4e-04
p_value < 0.05
[1] TRUE
\(H_0: \mu_{U} - \mu_{A} = 0\)
\(H_A: \mu_{U} - \mu_{A} \neq 0\)
According to our p-value of 0.0004, we reject \(H_0\): we have sufficient evidence to suggest that the difference in mean delay times between the two carriers is statistically significant.
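As a quick visual check on this test (a minimal sketch using the perm_diffs and obs_diff objects created above), the permutation distribution can be plotted with the observed difference marked:

# Histogram of the permutation distribution with the observed UA - AA difference marked
hist(perm_diffs, breaks = 50, main = "Permutation distribution of the difference in means",
     xlab = "Difference in mean delay (minutes)")
abline(v = obs_diff, col = "red", lwd = 2)

The observed difference should sit far out in the tail of this distribution, which is what the small p-value reflects.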
# Get Observed Value
obs_diff <- FD %>%
  group_by(Month) %>%
  summarise(Mean = mean(Delay)) %>%
  summarise(actual_diff = Mean[1] - Mean[2])  # June - May
obs_diff <- obs_diff$actual_diff
# Run a simulation
sims <- 10^4 - 1
perm_diffs <- numeric(sims)
for(i in 1:sims) {
  index <- sample(nrow(FD), sum(FD$Month == "June"), replace = FALSE)
  perm_diffs[i] <- mean(FD$Delay[index]) - mean(FD$Delay[-index])
}
# Calculate and Display a p-value
p_value <- 2 * ((sum(perm_diffs >= obs_diff)) + 1)/(sims + 1)
p_value
[1] 2e-04
p_value < 0.05
[1] TRUE
\(H_0: \mu_{M} - \mu_{J} = 0\)
\(H_A: \mu_{M} - \mu_{J} \neq 0\)
According to our p-value of 0.0002, we reject \(H_0\): we have sufficient evidence to suggest that the difference in mean delay times between the two months is statistically significant.
In the Flight Delays Case Study in Section 1.1, the data contain flight delays for two airlines, American Airlines and United Airlines.
Compute the proportion of times that each carrier's flights were delayed more than 20 minutes. Conduct a two-sided test to see if the difference in these proportions is statistically significant.
Compute the variance in the flight delay lengths for each carrier. Conduct a test to see if the variance for United Airlines is greater than that of American Airlines.
# Get Observed Values
obs_prop <- FD %>%
  group_by(Carrier) %>%
  summarise(prop_high_delay = sum(Delay > 20)/n())
obs_prop_diff <- obs_prop$prop_high_delay[2] - obs_prop$prop_high_delay[1]  # UA - AA
# Run a Simulation
sims <- 10^4 - 1
perm_prop_diff <- numeric(sims)
for(i in 1:sims) {
  index <- sample(nrow(FD), sum(FD$Carrier == "UA"), replace = FALSE)
  perm_prop_diff[i] <- (sum(FD$Delay[index] > 20) / sum(FD$Carrier == "UA")) -
    (sum(FD$Delay[-index] > 20) / sum(FD$Carrier == "AA"))
}
# Get and display p-value
p_value <- 2 * ((sum(perm_prop_diff >= obs_prop_diff)) + 1)/(sims + 1)
p_value
[1] 2e-04
p_value < 0.05
[1] TRUE
\(H_0: {p}_{U} - {p}_{A} = 0\)
\(H_A: {p}_{U} - {p}_{A} \neq 0\)
According to our p-value of 0.0002, we reject \(H_0\): we have sufficient evidence to suggest that the difference between the carriers in the proportion of flights delayed more than 20 minutes is statistically significant.
# Get Observed Value
obs_var <- FD %>%
  group_by(Carrier) %>%
  summarise(variance = var(Delay))
obs_var_diff <- obs_var$variance[2] - obs_var$variance[1]  # UA - AA
# Run a simulation
sims <- 10^4 - 1
perm_var_diff <- numeric(sims)
for(i in 1:sims) {
  index <- sample(nrow(FD), sum(FD$Carrier == "UA"), replace = FALSE)
  perm_var_diff[i] <- var(FD$Delay[index]) - var(FD$Delay[-index])
}
# Calculate a p-value
p_value <- (sum(perm_var_diff >= obs_var_diff) + 1)/(sims + 1)
We have found that the variance of delay time for UA is 2037.52 and the variance of delay time for AA is 1606.46.
\(H_0: \sigma^2_{U}-\sigma^2_{A} = 0\)
\(H_A: \sigma^2_{U}-\sigma^2_{A} > 0\)
Given our p-value of 0.145, we fail to reject \(H_0\). We have insufficient evidence to support the hypothesis that the variance of delays for United Airlines is larger than that of American Airlines.
The next exercise repeats the carrier comparison using three test statistics (the mean UA delay, the total UA delay, and the difference in mean delays), all computed within one for loop.
# Get Observed Values
obs_vals <- FD %>%
  group_by(Carrier) %>%
  summarise(mean_delay = mean(Delay), sum_delay = sum(Delay))
obs_diff <- obs_vals %>%
  summarise(mean_diff = mean_delay[2] - mean_delay[1])  # UA - AA
# Run a simulation (multiple statistics per iteration)
sims <- 10^4 - 1
perm_mean <- numeric(sims)
perm_sum <- numeric(sims)
perm_diff <- numeric(sims)
for(i in 1:sims) {
  index <- sample(nrow(FD), sum(FD$Carrier == "UA"), replace = FALSE)
  perm_mean[i] <- mean(FD$Delay[index])
  perm_sum[i] <- sum(FD$Delay[index])
  perm_diff[i] <- mean(FD$Delay[index]) - mean(FD$Delay[-index])
}
# Calculate a p-value (multiple p-values)
p_value_mean <- 2 * ((sum(perm_mean >= obs_vals$mean_delay[2]) + 1) / (sims + 1))
p_value_sum <- 2 * ((sum(perm_sum >= obs_vals$sum_delay[2]) + 1) / (sims + 1))
p_value_diff <- 2 * ((sum(perm_diff >= obs_diff$mean_diff) + 1) / (sims + 1))
The hypotheses for all three statistics are the same as in the earlier comparison of carrier mean delays (\(H_0: \mu_{U} - \mu_{A} = 0\) versus \(H_A: \mu_{U} - \mu_{A} \neq 0\)), and the three p-values (p_value_mean, p_value_sum, and p_value_diff) are computed above.
All three p-values are identical. This is expected: with the group sizes and the total delay fixed by the permutation scheme, the UA sum, the UA mean, and the difference in means are increasing linear functions of one another, so they rank every permutation resample the same way.
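To see why, note that the permutation scheme fixes the total delay \(T\) and the group sizes \(m\) (UA) and \(n\) (AA), so each statistic is an increasing linear function of the sum \(S\) of the delays assigned to the UA-sized group:

\[
\bar{x}_U = \frac{S}{m}, \qquad \bar{x}_U - \bar{x}_A = \frac{S}{m} - \frac{T - S}{n} = S\left(\frac{1}{m} + \frac{1}{n}\right) - \frac{T}{n}.
\]

Because the three statistics order the resamples identically, a resample is at least as extreme as the observed value for one statistic exactly when it is for the others, so the three upper-tail counts, and hence the three p-values, agree.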
In the Flight Delays Case Study in Section 1.1,
Find the 25% trimmed mean of the delay times for United Airlines and American Airlines.
Conduct a two-sided test to see if the difference in trimmed means is statistically significant.
# Get Observed Values
trimmed_mean_delays <- FD %>%
  group_by(Carrier) %>%
  summarise(trimmed_mean = mean(Delay, trim = 0.25))
obs_diff <- trimmed_mean_delays$trimmed_mean[2] - trimmed_mean_delays$trimmed_mean[1]  # UA - AA
# Run a simulation
sims <- 10^4 - 1
perm_diff <- numeric(sims)
for(i in 1:sims) {
  index <- sample(nrow(FD), sum(FD$Carrier == "UA"), replace = FALSE)
  perm_diff[i] <- mean(FD$Delay[index], trim = 0.25) - mean(FD$Delay[-index], trim = 0.25)
}
# Calculate a p-value
p_value <- 2 * ((sum(perm_diff >= obs_diff) + 1)/(sims + 1))
The calculated trimmed mean of the United Airlines delays was -1 and the trimmed mean of the American Airlines delays was -3.
\(H_0: \mu_{U} - \mu_{A} = 0\)
\(H_A: \mu_{U} - \mu_{A} \neq 0\)
Given our p-value of 0.0002, we reject \(H_0\). We have sufficient evidence to suggest that the difference in the 25% trimmed means of delay times between the two airlines is statistically significant.
In the Flight Delays Case Study in Section 1.1,
Compute the proportion of times the flights in May and in June were delayed more than 20 min, and conduct a two-sided test of whether the difference between months is statistically significant.
Compute the variance of the flight delay times in May and June and then conduct a two-sided test of whether the ratio of variances is statistically significantly different from 1.
# Get Observed Values
obs_prop <- FD %>%
  group_by(Month) %>%
  summarise(prop_very_delayed = sum(Delay > 20)/length(Delay))
obs_diff <- obs_prop$prop_very_delayed[1] - obs_prop$prop_very_delayed[2]  # June - May
# Run a simulation
sims <- 10^4 - 1
perm_diffs <- numeric(sims)
for(i in 1:sims) {
  index <- sample(nrow(FD), sum(FD$Month == "June"), replace = FALSE)
  perm_diffs[i] <- (sum(FD$Delay[index] > 20)/sum(FD$Month == "June")) -
    (sum(FD$Delay[-index] > 20)/sum(FD$Month == "May"))
}
# Calculate our p-value
p_value <- 2 * ((sum(perm_diffs >= obs_diff)) + 1)/(sims + 1)
\(H_0: {p}_{J} - {p}_{M} = 0\)
\(H_A: {p}_{J} - {p}_{M} \neq 0\)
Given our p-value of 0.0002, we reject \(H_0\). We have sufficient evidence to suggest that the difference in the proportions of flights delayed more than 20 minutes between May and June is statistically significant.
# Get Observed Values
obs_ratio <- FD %>%
  group_by(Month) %>%
  summarise(variance = var(Delay)) %>%
  summarise(variance_ratio = variance[1]/variance[2])  # June / May
# Run a simulation
sims <- 10^4 - 1
perm_ratio <- numeric(sims)
for(i in 1:sims) {
  index <- sample(nrow(FD), sum(FD$Month == "June"), replace = FALSE)
  perm_ratio[i] <- var(FD$Delay[index])/var(FD$Delay[-index])
}
# Calculate our p-value
p_value <- 2 * ((sum(perm_ratio >= obs_ratio$variance_ratio)) + 1)/(sims + 1)
\(H_0: \sigma^2_{J}/\sigma^2_{M} = 1\)
\(H_A: \sigma^2_{J}/\sigma^2_{M} \neq 1\)
We fail to reject \(H_0\): we have insufficient evidence to suggest that the ratio of the variances of the delay times for the two months is statistically significantly different from 1.
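Because a variance ratio is bounded below by zero and its permutation distribution is not symmetric, doubling the upper-tail count (as above) can misstate a two-sided p-value when the observed ratio falls in the lower tail. A common alternative, shown here only as a sketch using the perm_ratio and obs_ratio objects from the code above (this is not the calculation reported in the conclusion), is to double the smaller of the two tail probabilities:

# Two-sided p-value as twice the smaller tail probability of the permutation distribution
p_upper <- (sum(perm_ratio >= obs_ratio$variance_ratio) + 1)/(sims + 1)
p_lower <- (sum(perm_ratio <= obs_ratio$variance_ratio) + 1)/(sims + 1)
p_value_alt <- 2 * min(p_upper, p_lower)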
Researchers at the University of Nebraska conducted a study to investigate sex differences in dieting trends among a group of Midwestern college students (Davy et al. (2006)). Students were recruited from an introductory nutrition course during one term. Below are data from one question asked of 286 participants.
Write down the appropriate hypotheses to test whether there is a relationship between gender and diet, and then carry out the test.
Can the results be generalized to a population? Explain.
LowFatDiet
Gender Yes No
Women 35 146
Men 8 97
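The table object passed to chisq.test() below is not constructed in the excerpt; a minimal sketch that rebuilds it from the counts shown above, using the name DT that the call assumes, is:

# Build the 2 x 2 table of counts shown above
DT <- matrix(c(35, 146, 8, 97), nrow = 2, byrow = TRUE,
             dimnames = list(Gender = c("Women", "Men"), LowFatDiet = c("Yes", "No")))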
(obs_chisq <- chisq.test(DT))
Pearson's Chi-squared test with Yates' continuity correction
data: DT
X-squared = 6.2549, df = 1, p-value = 0.01239
\(H_0:\) Gender and Diet are Independent (the proportion on a low-fat diet is the same for women and men, \(p_{W} = p_{M}\))
\(H_A:\) Gender and Diet are Dependent (\(p_{W} \neq p_{M}\))
We test \(H_0\) by computing the p-value under the assumption that \(H_0\) is true. The chi-square test above gives a p-value of 0.01239 < 0.05, so we reject \(H_0\): there is evidence of a relationship between gender and being on a low-fat diet.
These results should not be generalized to a broader population. The participants were college students (a restricted age range) recruited from an introductory nutrition course, a convenience sample that likely over-represents students already interested in nutrition. The small number of men on a low-fat diet (8) also limits the precision of any estimate for that group.
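To check the chi-square approximation itself, the expected counts are stored in the test object (a minimal check using the obs_chisq object from above):

# Expected cell counts under independence
obs_chisq$expected

All four expected counts exceed 5 (the smallest, men on a low-fat diet, is about 15.8), so the approximation is reasonable; the generalization concerns above come from how the sample was collected rather than from the test itself.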
A national polling company conducted a survey in 2001 asking a randomly selected group of Americans of 18 years of age or older whether they supported limited use of marijuana for medicinal purposes. Here is a summary of the data:
Write down the appropriate hypothesis to test whether there is a relationship between age and support for medicinal marijuana and carry out the test.
\(H_0:\) Age and Support for Medicinal Marijuana are Independent
\(H_A:\) Age and Support for Medicinal Marijuana are Dependent
Support
Age Against For
18-29 years old 52 172
30-49 years old 103 313
50 years or older 119 258
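The permutation test below shuffles individual responses, so it needs a data frame of individual observations and its summary table; neither is created in the excerpt. A minimal sketch that rebuilds them from the counts above, using the names DF and T1 that the later code assumes, is:

# Rebuild individual-level data from the summary counts shown above
Age <- rep(c("18-29 years old", "30-49 years old", "50 years or older"),
           times = c(52 + 172, 103 + 313, 119 + 258))
Support <- c(rep(c("Against", "For"), times = c(52, 172)),
             rep(c("Against", "For"), times = c(103, 313)),
             rep(c("Against", "For"), times = c(119, 258)))
DF <- data.frame(Age, Support)
T1 <- xtabs(~Age + Support, data = DF)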
# Actual observed chisq test
obs_chisq <- chisq.test(T1)
# Run a simulation
sims <- 10^4 - 1
perm_values <- numeric(sims)
for(i in 1:sims) {
  perm_tab <- xtabs(~sample(Age, replace = FALSE) + Support, data = DF)
  perm_values[i] <- chisq.test(perm_tab)$statistic
}
# Calculate our p-value
p_value <- (sum(perm_values >= obs_chisq$statistic) + 1)/(sims + 1)
Our p-value is below 0.05, so we reject the null hypothesis. We have sufficient evidence to suggest that there is a relationship between age and support for medicinal marijuana.
Two students went to a local supermarket and collected data on cereals; they classified each cereal by its target consumer (children versus adults) and by its placement on the shelf (bottom, middle, or top). The data are given in Cereals.
Create a table to summarize the relationship between age of target consumer and shelf location.
Conduct a chi-square test using R’s chisq.test command.
R returns a warning message. Compute the expected counts for each cell to see why.
The warning ("Chi-squared approximation may be incorrect") appears because at least one expected cell count falls below 5, so the chi-square approximation to the null distribution of the statistic may be unreliable; the expected counts are extracted from the test object after the chi-square output below.
d. Conduct a permutation test for independence.
Cereals <- read.csv("http://www1.appstate.edu/~arnholta/Data/Cereals.csv")
# Create Table
T1 <- xtabs(~Age + Shelf, data = Cereals)
T1
Shelf
Age bottom middle top
adult 2 1 14
children 7 18 1
# Actual observed chisq test
(obs_chisq <- chisq.test(T1))
Pearson's Chi-squared test
data: T1
X-squared = 28.625, df = 2, p-value = 6.083e-07
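For part (c), the expected counts that trigger the warning can be pulled from the test object (a minimal check using obs_chisq from above):

# Expected cell counts under independence; the adult/bottom cell is below 5, which triggers the warning
obs_chisq$expected

This is why the permutation test in part (d) is the more trustworthy analysis for these data.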
# Run a simulation
sims <- 10^4 - 1
perm_values <- numeric(sims)
for(i in 1:sims) {
  perm_tab <- xtabs(~sample(Age, replace = FALSE) + Shelf, data = Cereals)
  perm_values[i] <- suppressWarnings(chisq.test(perm_tab)$statistic)  # low expected counts trigger warnings
}
# Calculate our p-value
p_value <- (sum(perm_values >= obs_chisq$statistic) + 1)/(sims + 1)
In our permutation test for independence, we found a p-value of 0.0001, which indicates that we can reject \(H_0\). We have found sufficient evidence to suggest that the target audience for a cereal and the placement of a cereal are related.
From GSS 2002 Case Study in Section 1.6,
Create a table to summarize the relationship between gender and the person’s choice for president in the 2000 election.
Test to see if a person’s choice for president in the 2000 election is independent of gender (use chisq.test in R).
Repeat the test but use the permutation test for independence. Does your conclusion change? (Be sure to remove observations with missing values)
GSS2002 <- read.csv("http://www1.appstate.edu/~arnholta/Data/GSS2002.csv")
# a. Create a Table
T1 <- xtabs(~Gender + Pres00, data = GSS2002)
T1
Pres00
Gender Bush Didnt vote Gore Nader Other
Female 459 5 492 26 3
Male 426 5 289 31 13
# Actual observed chisq test
(obs_chisq <- chisq.test(T1))
Pearson's Chi-squared test
data: T1
X-squared = 33.29, df = 4, p-value = 1.042e-06
# Run a simulation (keep complete cases only, per the problem statement)
GSSc <- na.omit(GSS2002[, c("Gender", "Pres00")])
sims <- 10^4 - 1
perm_values <- numeric(sims)
for(i in 1:sims) {
  perm_tab <- xtabs(~sample(Gender, replace = FALSE) + Pres00, data = GSSc)
  perm_values[i] <- suppressWarnings(chisq.test(perm_tab)$statistic)  # small cells trigger warnings
}
# Calculate our p-value
p_value <- (sum(perm_values >= obs_chisq$statistic) + 1)/(sims + 1)
\(H_0:\) Gender and Voting Choice are Independent
\(H_A:\) Gender and Voting Choice are Dependent
Given our p-value of 0.0001, we can reject the \(H_0\). We have sufficient evidence to suggest that gender and voting choice in the 2000 election are dependent.
From GSS 2002 Case Study in Section 1.6,
Create a table to summarize the relationship between gender and the person's general level of happiness (Happy).
Conduct a permutation test to see if gender and level of happiness are independent (Be sure to remove the observations with missing values).
GSS2002 <- read.csv("http://www1.appstate.edu/~arnholta/Data/GSS2002.csv")
# a. Create a Table
T1 <- xtabs(~Gender + Happy, data = GSS2002)
T1
Happy
Gender Not too happy Pretty happy Very happy
Female 109 406 205
Male 61 378 210
# b. Run a chisq test and then a permutation test
# Actual observed chisq test
(obs_chisq <- chisq.test(T1))
Pearson's Chi-squared test
data: T1
X-squared = 10.96, df = 2, p-value = 0.004168
# Run a simulation (keep complete cases only, per the problem statement)
GSSc <- na.omit(GSS2002[, c("Gender", "Happy")])
sims <- 10^4 - 1
perm_values <- numeric(sims)
for(i in 1:sims) {
  perm_tab <- xtabs(~sample(Gender, replace = FALSE) + Happy, data = GSSc)
  perm_values[i] <- chisq.test(perm_tab)$statistic
}
# Calculate our p-value
p_value <- (sum(perm_values >= obs_chisq$statistic) + 1)/(sims + 1)
\(H_0:\) Gender and Happiness are Independent
\(H_A:\) Gender and Happiness are Dependent
Given our p-value of 0.0001, we can reject the \(H_0\). We have found sufficient evidence to suggest that happiness and gender are dependent.
From GSS 2002 Case Study in Section 1.6,
Create a table to summarize the relationship between support for gun laws (GunLaw) and views on government spending on the military (SpendMilitary).
Conduct a permutation test to see if support for gun laws and views on government spending on the military are independent (Be sure to remove observations with missing values).
GSS2002 <- read.csv("http://www1.appstate.edu/~arnholta/Data/GSS2002.csv")
# a. Create a Table
T1 <- xtabs(~GunLaw + SpendMilitary, data = GSS2002)
T1
SpendMilitary
GunLaw About right Too little Too much
Favor 168 101 72
Oppose 34 33 19
# b. Run a chisq test and then a permutation test
# Actual observed chisq test
(obs_chisq <- chisq.test(T1))
Pearson's Chi-squared test
data: T1
X-squared = 3.0827, df = 2, p-value = 0.2141
# Run a simulation (keep complete cases only, per the problem statement)
GSSc <- na.omit(GSS2002[, c("GunLaw", "SpendMilitary")])
sims <- 10^4 - 1
perm_values <- numeric(sims)
for(i in 1:sims) {
  perm_tab <- xtabs(~sample(GunLaw, replace = FALSE) + SpendMilitary, data = GSSc)
  perm_values[i] <- chisq.test(perm_tab)$statistic
}
# Calculate our p-value
p_value <- (sum(perm_values >= obs_chisq$statistic) + 1)/(sims + 1)
\(H_0:\) Views on Gun Laws and Views on Government Military Spending are Independent
\(H_A:\) Views on Gun Laws and Views on Government Military Spending are Dependent
Our permutation p-value is consistent with the chi-square p-value of 0.2141, so we fail to reject \(H_0\). We have insufficient evidence to suggest that support for gun laws and views on government military spending are dependent.