A civil engineer is interested in determining whether four different methods of estimating flood flow frequency produce equivalent estimates of peak discharge when applied to the same watershed. Each procedure is used six times on the watershed, and the resulting discharge data (in cubic feet per second) are shown below.
# load data into a data frame
dat<-read.csv("https://raw.githubusercontent.com/forestwhite/RStatistics/main/FlippedAssignment10.csv", header=TRUE)
names(dat) <- sub('X', '', names(dat))
dat
## 1 2 3 4
## 1 0.34 0.91 6.31 17.15
## 2 0.12 2.94 8.37 11.82
## 3 1.23 2.14 9.75 10.97
## 4 0.70 2.36 6.09 17.20
## 5 1.75 2.86 9.82 14.35
## 6 0.12 4.55 7.24 16.82
The following equation describes the components of the linear effects model:
The hypotheses to test are:
H0: µk = µl for all k,l ∈ {1,2,3,4}
H1: µk ≠ µl for any k,l ∈ {1,2,3,4} where k ≠ l
The data is NOT normally distributed, because the ends deviate in different directions.
# AOV normal Q-Q plots of the data for the 4 estimation methods
aov.model<-aov(values ~ ind,data=(stack(dat)))
plot(aov.model,2)
The data sets for each treatment do not have equal variance.
# AOV residuals vs fitted comparison the data for the 4 estimation methods
aov.model<-aov(values ~ ind,data=(stack(dat)))
plot(aov.model,1)
# boxplots comparing the 4 estimation methods
boxplot(values ~ ind, data = stack(dat), col=c("steelblue","firebrick2", "forestgreen","darkorange"),xlab = "estimation methods",ylab = "discharge (cubic fee/second)",main="peak discharge estimates")
The P-value = 9.771e-05 < 0.05 = α, therefore we reject the null hypothesis. One or more of the estimation methods differs significantly from at least one of the others.
# Kruskal-Wallis Test
kruskal.test(values ~ ind,data=(stack(dat)))
##
## Kruskal-Wallis rank sum test
##
## data: values by ind
## Kruskal-Wallis chi-squared = 21.156, df = 3, p-value = 9.771e-05
From boxcox, our λ ≈ 0.55, and after the power transformation, the P-value = 1.67e-11 < 0.05 = α, therefore we reject the null hypothesis.
# Boxcox estimate
library(MASS)
boxcox(stack(dat)[,1]~stack(dat)[,2], plotit=TRUE )
# power transformation of the data using λ ≈ 0.55
dat[,1]<-dat[,1]^0.55
dat[,2]<-dat[,2]^0.55
dat[,3]<-dat[,3]^0.55
dat[,4]<-dat[,4]^0.55
dat
## 1 2 3 4
## 1 0.5524760 0.9494515 2.754331 4.773592
## 2 0.3115657 1.8096352 3.217355 3.889911
## 3 1.1205928 1.5195940 3.499069 3.733478
## 4 0.8218715 1.6036205 2.701093 4.781241
## 5 1.3604135 1.7823842 3.512864 4.327814
## 6 0.3115657 2.3009452 2.970682 4.722852
# Verify that boxcox λ ≈ 1 after transformation = SUCCESS
boxcox(stack(dat)[,1]~stack(dat)[,2], plotit=TRUE )
# One-way ANOVA
aov<-aov(values ~ ind,data=(stack(dat)))
summary(aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## ind 3 45.90 15.299 83.91 1.67e-11 ***
## Residuals 20 3.65 0.182
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Complete R code used in this analysis.
# load data into a data frame
dat<-read.csv("https://raw.githubusercontent.com/forestwhite/RStatistics/main/FlippedAssignment10.csv", header=TRUE)
names(dat) <- sub('X', '', names(dat))
dat
# AOV normal Q-Q plots of the data for the 4 estimation methods
aov.model<-aov(values ~ ind,data=(stack(dat)))
plot(aov.model,2)
# AOV residuals vs fitted comparison the data for the 4 estimation methods
aov.model<-aov(values ~ ind,data=(stack(dat)))
plot(aov.model,1)
# boxplots comparing the 4 estimation methods
boxplot(values ~ ind, data = stack(dat), col=c("steelblue","firebrick2", "forestgreen","darkorange"),xlab = "estimation methods",ylab = "discharge (cubic fee/second)",main="peak discharge estimates")
# Kruskal-Wallis Test
kruskal.test(values ~ ind,data=(stack(dat)))
# Boxcox estimate
library(MASS)
boxcox(stack(dat)[,1]~stack(dat)[,2], plotit=TRUE )
# power transformation of the data using λ ≈ 0.55
dat[,1]<-dat[,1]^0.55
dat[,2]<-dat[,2]^0.55
dat[,3]<-dat[,3]^0.55
dat[,4]<-dat[,4]^0.55
dat
# Verify that boxcox λ ≈ 1 after transformation = SUCCESS
boxcox(stack(dat)[,1]~stack(dat)[,2], plotit=TRUE )
# One-way ANOVA
aov<-aov(values ~ ind,data=(stack(dat)))
summary(aov)
# Kruskal-Wallis Test
kruskal.test(values ~ ind,data=(stack(dat)))