Part A: Write the linear effects equation and the hypotheses you are testing
Linear Effects Equation: $Y_{ij} = \mu + \tau_{i} + \epsilon_{ij}$
Null Hypothesis: $H_0: \tau_{i} = 0$ for all $i$
Alternative Hypothesis: $H_A: \tau_{i} \neq 0$ for some $i$
Part B: Does it appear the data is normally distributed? Does it appear that the variance is constant?
Entering the data
met1<-c(0.34,0.12,1.23,0.70,1.75,0.12)
met2<-c(0.91,2.94,2.14,2.36,2.86,4.55)
met3<-c(6.31,8.37,9.75,6.09,9.82,7.24)
met4<-c(17.15,11.82,10.97,17.20,14.35,16.82)
dat<-data.frame(met1,met2,met3,met4)
library(tidyr)
dat<-pivot_longer(dat,c(met1,met2,met3,met4))
dat$name<-as.factor(dat$name)
ANOVA
aov.model<-aov(value~name,data=dat)
summary(aov.model)
## Df Sum Sq Mean Sq F value Pr(>F)
## name 3 708.7 236.2 76.29 4e-11 ***
## Residuals 20 61.9 3.1
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(aov.model)
With only six observations per method, there are too few data points to
judge normality with confidence. In the "Residuals vs Fitted" plot the
spread of the residuals differs between groups, so the variance does not
appear to be constant.
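With so few points per group, formal tests can supplement the visual check. The following is a minimal sketch, assuming that a Shapiro-Wilk test on the residuals and Bartlett's test across methods are acceptable checks here (neither is part of the original analysis). The data frame is rebuilt in base R so the block runs on its own, and the names `sw` and `bt` are illustrative.

```r
# Rebuild the data in long format with base R (same values as above)
met1 <- c(0.34, 0.12, 1.23, 0.70, 1.75, 0.12)
met2 <- c(0.91, 2.94, 2.14, 2.36, 2.86, 4.55)
met3 <- c(6.31, 8.37, 9.75, 6.09, 9.82, 7.24)
met4 <- c(17.15, 11.82, 10.97, 17.20, 14.35, 16.82)
dat <- data.frame(
  name  = factor(rep(c("met1", "met2", "met3", "met4"), each = 6)),
  value = c(met1, met2, met3, met4)
)
aov.model <- aov(value ~ name, data = dat)

# Shapiro-Wilk: H0 is that the residuals are normally distributed
sw <- shapiro.test(residuals(aov.model))
# Bartlett: H0 is that all methods share the same variance
bt <- bartlett.test(value ~ name, data = dat)

print(sw)
print(bt)
```

A small p-value from either test would be evidence against the corresponding assumption. Bartlett's test is itself sensitive to non-normality, so its result should be read together with the plots.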
Part C: Perform a Kruskal-Wallis test in R (alpha = 0.05)
kruskal.test(value~name,data=dat)
##
## Kruskal-Wallis rank sum test
##
## data: value by name
## Kruskal-Wallis chi-squared = 21.156, df = 3, p-value = 9.771e-05
Since the p-value (9.771e-05) is far below alpha = 0.05, we reject the null hypothesis and conclude that at least one estimation method differs from the others.
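To see where the test statistic comes from, it can be reproduced by hand from the rank sums. This is a sketch using the standard Kruskal-Wallis formula with a tie correction; the variable names (`R`, `H`, `p`) are illustrative and the data are rebuilt in base R so the block runs on its own.

```r
met1 <- c(0.34, 0.12, 1.23, 0.70, 1.75, 0.12)
met2 <- c(0.91, 2.94, 2.14, 2.36, 2.86, 4.55)
met3 <- c(6.31, 8.37, 9.75, 6.09, 9.82, 7.24)
met4 <- c(17.15, 11.82, 10.97, 17.20, 14.35, 16.82)
value <- c(met1, met2, met3, met4)
group <- factor(rep(c("met1", "met2", "met3", "met4"), each = 6))

N <- length(value)
r <- rank(value)                  # tied values receive the average rank
R <- tapply(r, group, sum)        # rank sum for each method
n <- tapply(r, group, length)     # sample size for each method

# Uncorrected statistic: 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
H <- 12 / (N * (N + 1)) * sum(R^2 / n) - 3 * (N + 1)

# Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N)
ties <- table(value)
H <- H / (1 - sum(ties^3 - ties) / (N^3 - N))

# Compare with a chi-squared distribution on (number of groups - 1) df
p <- pchisq(H, df = nlevels(group) - 1, lower.tail = FALSE)
print(R)
print(c(H = H, p = p))
```

The rank sums are 23, 55, 93 and 129, and the resulting statistic agrees with the 21.156 reported by `kruskal.test`.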
Part D: Select an appropriate transformation using Box-Cox, transform the data and test the hypothesis in R (alpha = 0.05)
library(MASS)
boxplot(dat$value~dat$name,xlab = "Method Type",ylab = "Flood Flow Frequency",main="Boxplot of Observations")
boxcox(dat$value~dat$name)
lambda<-0.5  # value read off the Box-Cox plot
NNE<-dat$value^lambda  # square-root transform of the response
dat2<-data.frame(NNE,dat$name)
boxplot(dat2$NNE~dat2$dat.name,xlab="Method Type",ylab = "sqrt(Flood Flow Frequency)",main="Boxplot of Transformed Observations")
From the Box-Cox plot we see that the log-likelihood is maximized near lambda = 0.5. Therefore, we raise the observations to the power 0.5, i.e. we apply a square-root transformation.
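Rather than reading lambda off the plot, the maximizing value can be extracted numerically. This is a sketch assuming the default grid of `boxcox` (lambda from -2 to 2 in steps of 0.1) is fine enough; `lambda_hat` and `ci` are illustrative names, and the data are rebuilt in base R so the block runs on its own.

```r
library(MASS)

met1 <- c(0.34, 0.12, 1.23, 0.70, 1.75, 0.12)
met2 <- c(0.91, 2.94, 2.14, 2.36, 2.86, 4.55)
met3 <- c(6.31, 8.37, 9.75, 6.09, 9.82, 7.24)
met4 <- c(17.15, 11.82, 10.97, 17.20, 14.35, 16.82)
dat <- data.frame(
  name  = factor(rep(c("met1", "met2", "met3", "met4"), each = 6)),
  value = c(met1, met2, met3, met4)
)

# plotit = FALSE returns the grid (x) and the profile log-likelihood (y)
bc <- boxcox(dat$value ~ dat$name, plotit = FALSE)

# Grid value with the highest log-likelihood
lambda_hat <- bc$x[which.max(bc$y)]

# Approximate 95% interval: grid values whose log-likelihood lies within
# qchisq(0.95, 1) / 2 of the maximum
ci <- range(bc$x[bc$y > max(bc$y) - qchisq(0.95, df = 1) / 2])

print(lambda_hat)
print(ci)
```

In practice a convenient value inside the interval (such as 0.5 for a square root) is usually preferred over the exact maximizer, because it is easier to interpret.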
Testing the Hypothesis
datT<-cbind(NNE,dat$name)
datT<-data.frame(datT)
datT$V2<-as.factor(datT$V2)
aov.modelT<-aov(NNE~V2,data = datT)
summary(aov.modelT)
## Df Sum Sq Mean Sq F value Pr(>F)
## V2 3 32.69 10.898 81.17 2.27e-11 ***
## Residuals 20 2.69 0.134
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(aov.modelT)
From the ANOVA on the transformed data, the p-value (2.27e-11) is far
below alpha = 0.05. Therefore we reject the null hypothesis and conclude
that the estimation method has a significant effect on mean flood flow
frequency.
The residual plots also suggest that, after the transformation, the variance across estimation methods is more stable.
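The stabilizing effect of the transformation can also be checked numerically by comparing the within-method sample variances before and after taking square roots. This is a sketch using the ratio of the largest to the smallest group variance as a rough summary; that ratio is an illustrative choice and not part of the original analysis.

```r
met1 <- c(0.34, 0.12, 1.23, 0.70, 1.75, 0.12)
met2 <- c(0.91, 2.94, 2.14, 2.36, 2.86, 4.55)
met3 <- c(6.31, 8.37, 9.75, 6.09, 9.82, 7.24)
met4 <- c(17.15, 11.82, 10.97, 17.20, 14.35, 16.82)
dat <- data.frame(
  name  = factor(rep(c("met1", "met2", "met3", "met4"), each = 6)),
  value = c(met1, met2, met3, met4)
)

# Sample variance within each method, on the raw and square-root scales
var_raw  <- tapply(dat$value, dat$name, var)
var_sqrt <- tapply(sqrt(dat$value), dat$name, var)

# Largest-to-smallest variance ratio; values near 1 indicate similar spread
ratio_raw  <- max(var_raw) / min(var_raw)
ratio_sqrt <- max(var_sqrt) / min(var_sqrt)

print(round(rbind(raw = var_raw, sqrt = var_sqrt), 3))
print(c(ratio_raw = ratio_raw, ratio_sqrt = ratio_sqrt))
```

On the raw scale the variance grows with the group mean, while on the square-root scale the ratio is much closer to 1, which is consistent with the residual plots.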