Flipped Assignment 10
Group 3: Muneeb, Nitish, Rajesh
A civil engineer is interested in determining whether four different methods of estimating flood flow frequency produce equivalent estimates of peak discharge when applied to the same watershed. Each procedure is used six times on the watershed, and the resulting discharge data (in cubic feet per second) are shown below.
Part A
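Linear Effects Equation: Y_ij = μ + τ_i + ε_ij, where i = 1, ..., 4 indexes the estimation method and j = 1, ..., 6 indexes the replicate
Null Hypothesis: H0: τ_i = 0 for all i
Alternative Hypothesis: Ha: τ_i ≠ 0 for at least one i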
Part B
Reading Data
Frequency <- c(0.34, 0.12, 1.23, 0.70, 1.75, 0.12, 0.91, 2.94, 2.14, 2.36, 2.86, 4.55, 6.31, 8.37, 9.75, 6.09, 9.82, 7.24, 17.15, 11.82, 10.97, 17.2, 14.35, 16.82)
Type <- c(rep(1,6), rep(2,6), rep(3,6), rep(4,6))
Data <- cbind(Frequency, Type)
Data <- data.frame(Data)
Data$Type <- as.factor(Data$Type)
str(Data)
## 'data.frame': 24 obs. of 2 variables:
## $ Frequency: num 0.34 0.12 1.23 0.7 1.75 0.12 0.91 2.94 2.14 2.36 ...
## $ Type : Factor w/ 4 levels "1","2","3","4": 1 1 1 1 1 1 2 2 2 2 ...
Answer Part B: There are too few data points for a reasonable test of normality. To assess constant variance, we can use the “Residuals vs Fitted” plot, which shows that the variance is not stable across the different methods.
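The code shown here does not produce the “Residuals vs Fitted” plot mentioned in the answer. A minimal sketch of how it could be generated, assuming a one-way ANOVA fit with `aov()`; the names `fit` and `group_var` are illustrative and not part of the original analysis, and the data are repeated so the chunk runs on its own:

```r
# Same data as above, repeated so this chunk is self-contained
Frequency <- c(0.34, 0.12, 1.23, 0.70, 1.75, 0.12,
               0.91, 2.94, 2.14, 2.36, 2.86, 4.55,
               6.31, 8.37, 9.75, 6.09, 9.82, 7.24,
               17.15, 11.82, 10.97, 17.2, 14.35, 16.82)
Data <- data.frame(Frequency = Frequency,
                   Type = factor(rep(1:4, each = 6)))

# One-way ANOVA fit on the untransformed data (assumed model for the plot)
fit <- aov(Frequency ~ Type, data = Data)

# Residuals vs Fitted: a spread that widens with the fitted value
# suggests non-constant variance
plot(fit, which = 1)

# Sample variance within each method, as a numeric companion to the plot
group_var <- tapply(Data$Frequency, Data$Type, var)
print(group_var)
```

If the within-method variances grow along with the method means, that agrees with the reading of the plot and motivates either a non-parametric test or a variance-stabilizing transformation.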
Part C
Since the variance is not stable across the method types, we use a non-parametric test.
Kruskal-Wallis Test
kruskal.test(Frequency~Type,data=Data)
##
## Kruskal-Wallis rank sum test
##
## data: Frequency by Type
## Kruskal-Wallis chi-squared = 21.156, df = 3, p-value = 9.771e-05
Answer Part C: Since the p-value (9.771e-05) is very small, we reject the null hypothesis at any reasonable level of significance.
Part D
We transform the data using the Box-Cox procedure so that we can test our hypothesis on the transformed data.
Box-Cox
library(MASS)
boxplot(Data$Frequency~Data$Type,xlab="Method Type",ylab="Flood Flow Frequency",main="Boxplot of Observations")

# Use Data so that Type is treated as a factor (the global Type vector is numeric)
boxcox(Frequency~Type,data=Data)

lambda <- 0.5
TFrequency<-Frequency^(lambda)
boxplot(TFrequency~Data$Type,xlab="Method Type",ylab="Square Root of Flood Flow Frequency",main="Boxplot of Transformed Observations")

The Box-Cox plot shows that the log-likelihood is maximized near λ = 0.5, so we raise the frequency observations to the power 0.5, which is the same as taking the square root.
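The value of lambda above is read off the plot by eye. As a hedged sketch, the maximizing value could also be extracted from the grid that `boxcox()` returns; the names `bc` and `lambda_hat` are illustrative, and the default grid has a step of 0.1, so the result is only accurate to that resolution:

```r
library(MASS)

# Same data as above, repeated so this chunk is self-contained
Frequency <- c(0.34, 0.12, 1.23, 0.70, 1.75, 0.12,
               0.91, 2.94, 2.14, 2.36, 2.86, 4.55,
               6.31, 8.37, 9.75, 6.09, 9.82, 7.24,
               17.15, 11.82, 10.97, 17.2, 14.35, 16.82)
Data <- data.frame(Frequency = Frequency,
                   Type = factor(rep(1:4, each = 6)))

# Profile log-likelihood over the default lambda grid, without drawing the plot
bc <- boxcox(Frequency ~ Type, data = Data, plotit = FALSE)

# Grid value of lambda with the highest log-likelihood
lambda_hat <- bc$x[which.max(bc$y)]
print(lambda_hat)
```

In practice the grid maximum is usually rounded to a convenient nearby value (such as 0.5 for a square root), provided that value lies inside the confidence interval drawn on the Box-Cox plot.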
Answer Part D: The ANOVA on the transformed data gives a very small p-value, so at the 0.05 level of significance we reject the null hypothesis and conclude that the estimation method has a significant effect on mean flood flow frequency.
Also, the residual plots show that after the transformation the variance across estimation methods is much more stable, though not perfect.
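The ANOVA and residual plot behind this answer are not shown in the code above. A minimal sketch of how they could be reproduced, assuming the square-root transformation chosen in Part D; `fit_t`, `var_ratio`, `ratio_raw`, `ratio_sqrt`, and `p_value` are illustrative names, and the ratio of largest to smallest group variance is only a rough stability check that we add here:

```r
# Same data as above, repeated so this chunk is self-contained
Frequency <- c(0.34, 0.12, 1.23, 0.70, 1.75, 0.12,
               0.91, 2.94, 2.14, 2.36, 2.86, 4.55,
               6.31, 8.37, 9.75, 6.09, 9.82, 7.24,
               17.15, 11.82, 10.97, 17.2, 14.35, 16.82)
Data <- data.frame(Frequency = Frequency,
                   Type = factor(rep(1:4, each = 6)))

# Square-root transformation (lambda = 0.5)
Data$TFrequency <- sqrt(Data$Frequency)

# One-way ANOVA on the transformed response
fit_t <- aov(TFrequency ~ Type, data = Data)
print(summary(fit_t))

# p-value for the Type effect, taken from the ANOVA table
p_value <- summary(fit_t)[[1]][["Pr(>F)"]][1]

# Residuals vs Fitted for the transformed model
plot(fit_t, which = 1)

# Ratio of largest to smallest within-method variance (rough stability check)
var_ratio <- function(y, g) {
  v <- tapply(y, g, var)
  max(v) / min(v)
}
ratio_raw  <- var_ratio(Data$Frequency, Data$Type)
ratio_sqrt <- var_ratio(Data$TFrequency, Data$Type)
print(c(raw = ratio_raw, sqrt = ratio_sqrt))
```

A ratio closer to 1 after the transformation is consistent with the more stable spread seen in the residual plot.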