The hypothesis that we are testing is:
\[ H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu \]
\[ H_1: \text{at least one mean is different} \]
#a)
f1<-c(17.6,18.9,16.3,17.4,20.1,21.6)
f2<-c(16.9,15.3,18.6,17.1,19.5,20.3)
f3<-c(21.4,23.6,19.4,18.5,20.5,22.3)
f4<-c(19.3,21.1,16.9,17.5,18.3,19.8)
dat<-data.frame(f1,f2,f3,f4)
library(tidyr)
dat2<-pivot_longer(dat,c(f1,f2,f3,f4))
aov.model<-aov(value~name,data=dat2)
summary(aov.model)
## Df Sum Sq Mean Sq F value Pr(>F)
## name 3 30.16 10.05 3.047 0.0525 .
## Residuals 20 65.99 3.30
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Conclusion:
Since the P-value (0.0525) > alpha (0.05), we do not reject the hypothesis that the means are equal.
Looking at the data, I would choose fluid 3, since its observed values tend to be higher.
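The visual impression can be backed up with the sample means. The following is a minimal sketch using the same vectors as above; `group_means` and `best_fluid` are names I introduce here for illustration, and the boxplot is just one possible way to look at the data.

```r
# Fluid data as entered in part (a)
f1 <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6)
f2 <- c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3)
f3 <- c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3)
f4 <- c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
dat <- data.frame(f1, f2, f3, f4)

# Sample mean of each fluid (one value per column)
group_means <- sapply(dat, mean)
round(group_means, 2)

# Name of the fluid with the largest sample mean
best_fluid <- names(which.max(group_means))
best_fluid

# Side-by-side boxplots for the visual comparison
boxplot(dat, xlab = "fluid", ylab = "observed value")
```

Fluid 3 has the largest sample mean, but since the F test above does not reject at alpha = 0.05, this ranking is only descriptive.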
#c)
pop<-c(f1,f2,f3,f4)
meanx<-c(rep(mean(f1),6),rep(mean(f2),6),rep(mean(f3),6),rep(mean(f4),6))
res<-pop-meanx
qqnorm(res)
qqline(res)
plot(meanx,res,xlab="population average", ylab="residual",main="constant variance")
Conclusion:
The residuals look fairly normal, and the variance appears roughly constant across the samples.
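The residuals computed by hand above can also be taken directly from the fitted model, and the visual checks can be complemented with formal tests. This is only a sketch: it uses base R `stack()` instead of `pivot_longer()` (so the columns are called `values` and `ind`), and the choice of the Shapiro-Wilk and Bartlett tests is my assumption, not part of the original exercise.

```r
f1 <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6)
f2 <- c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3)
f3 <- c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3)
f4 <- c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)

# Long format with base R: 'values' holds the observations, 'ind' the fluid
long <- stack(data.frame(f1, f2, f3, f4))
fit <- aov(values ~ ind, data = long)

# Residuals from the model vs. observation minus its group mean
res_fit <- residuals(fit)
res_manual <- long$values - ave(long$values, long$ind)
all.equal(unname(res_fit), res_manual)

# Formal checks to go with the Q-Q plot and the residual plot
shapiro.test(res_fit)                      # normality of residuals
bartlett.test(values ~ ind, data = long)   # equality of group variances
```

Large P-values in both tests would be consistent with the visual conclusion; with only six observations per fluid, neither test is very powerful.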
The hypothesis that we are testing is:
\[ H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5 = \mu \]
\[ H_1: \text{at least one mean is different} \]
m1<-c(110, 157, 194, 178)
m2<-c(1, 2, 4, 18)
m3<-c(880, 1256, 5276, 4355)
m4<-c(495, 7040, 5307, 10050)
m5<-c(7, 5, 29, 2)
dat3<-data.frame(m1,m2,m3,m4,m5)
library(tidyr)
dat4<-pivot_longer(dat3,c(m1,m2,m3,m4,m5))
aov.model<-aov(value~name,data=dat4)
summary(aov.model)
## Df Sum Sq Mean Sq F value Pr(>F)
## name 4 103191489 25797872 6.191 0.00379 **
## Residuals 15 62505657 4167044
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Conclusion:
Since the P-value (0.00379) < 0.01, we reject the hypothesis that the means are equal.
pop2<-c(m1,m2,m3,m4,m5)
meanx2<-c(rep(mean(m1),4),rep(mean(m2),4),rep(mean(m3),4),rep(mean(m4),4),rep(mean(m5),4))
res2<-pop2-meanx2
qqnorm(res2)
qqline(res2)
plot(meanx2,res2,xlab="population average", ylab="residual",main="variance")
Conclusion:
The residual plot clearly shows a funnel shape: the spread of the residuals differs from one population to another, so the variance is not constant. In addition, the normal Q-Q plot shows that the residuals are not normally distributed.
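The funnel shape can be quantified by comparing each group's standard deviation with its mean. A minimal sketch with the same data; `group_mean`, `group_sd` and `sd_ratio` are illustrative names.

```r
m1 <- c(110, 157, 194, 178)
m2 <- c(1, 2, 4, 18)
m3 <- c(880, 1256, 5276, 4355)
m4 <- c(495, 7040, 5307, 10050)
m5 <- c(7, 5, 29, 2)
dat3 <- data.frame(m1, m2, m3, m4, m5)

# Mean and standard deviation of each group
group_mean <- sapply(dat3, mean)
group_sd <- sapply(dat3, sd)
data.frame(mean = group_mean, sd = group_sd)

# Ratio of the largest to the smallest standard deviation
sd_ratio <- max(group_sd) / min(group_sd)
sd_ratio

# Same ordering of groups by mean and by sd: spread grows with the mean
all(order(group_mean) == order(group_sd))
```

The standard deviation increases with the group mean, which is the pattern a log transformation is usually meant to remove.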
dat5<-log(dat3)
dat6<-pivot_longer(dat5,c(m1,m2,m3,m4,m5))
aov.model<-aov(value~name,data=dat6)
summary(aov.model)
## Df Sum Sq Mean Sq F value Pr(>F)
## name 4 165.06 41.26 37.66 1.18e-07 ***
## Residuals 15 16.44 1.10
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Conclusion:
I applied a log transformation to the data and, as before, the hypothesis that the means are equal is rejected.
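To see whether the transformation actually helps, the same residual checks can be repeated on the log scale. This is a sketch under the assumption that the diagnostics are wanted here as well; it uses base R `stack()` (columns `values` and `ind`) rather than `pivot_longer()`.

```r
m1 <- c(110, 157, 194, 178)
m2 <- c(1, 2, 4, 18)
m3 <- c(880, 1256, 5276, 4355)
m4 <- c(495, 7040, 5307, 10050)
m5 <- c(7, 5, 29, 2)
dat3 <- data.frame(m1, m2, m3, m4, m5)
dat5 <- log(dat3)

# Fit the one-way model on the log scale
long_log <- stack(dat5)
fit_log <- aov(values ~ ind, data = long_log)
res_log <- residuals(fit_log)

# Same diagnostics as before, now for the transformed data
qqnorm(res_log)
qqline(res_log)
plot(fitted(fit_log), res_log,
     xlab = "group mean (log scale)", ylab = "residual")

# Spread of the groups before and after the transformation
ratio_raw <- max(sapply(dat3, sd)) / min(sapply(dat3, sd))
ratio_log <- max(sapply(dat5, sd)) / min(sapply(dat5, sd))
c(raw = ratio_raw, log = ratio_log)
```

The ratio of the largest to the smallest group standard deviation is much smaller on the log scale, so the constant-variance assumption is more plausible after the transformation.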
The hypothesis that we are testing is:
\[ H_0: \mu_1 = \mu_2 = \mu_3 = \mu \]
\[ H_1: \text{at least one mean is different} \]
w1<-c(31, 10, 21, 4, 1)
w2<-c(62, 40, 24, 30, 35)
w3<-c(53, 27, 120, 97, 68)
dat7<-data.frame(w1,w2,w3)
library(tidyr)
dat8<-pivot_longer(dat7,c(w1,w2,w3))
aov.model<-aov(value~name,data=dat8)
summary(aov.model)
## Df Sum Sq Mean Sq F value Pr(>F)
## name 2 8964 4482 7.914 0.00643 **
## Residuals 12 6796 566
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Conclusion:
Since the P-value (0.00643) < alpha (0.05), we reject the hypothesis that the means are equal.
#Item (b)
pop3<-c(w1,w2,w3)
meanx3<-c(rep(mean(w1),5),rep(mean(w2),5),rep(mean(w3),5))
res3<-pop3-meanx3
qqnorm(res3)
qqline(res3)
plot(meanx3,res3,xlab="population average", ylab="residual",main="variance")
Conclusion:
As in the previous question, the residuals show a funnel shape, indicating non-constant variance. In the normal Q-Q plot the residuals do not look normal: there is a slight "S"-shaped deviation along the Q-Q line.
w1_1<-sqrt(w1)
w2_2<-sqrt(w2)
w3_3<-sqrt(w3)
dat9<-data.frame(w1_1,w2_2,w3_3)
library(tidyr)
dat10<-pivot_longer(dat9,c(w1_1,w2_2,w3_3))
aov.model<-aov(value~name,data=dat10)
summary(aov.model)
## Df Sum Sq Mean Sq F value Pr(>F)
## name 2 63.90 31.95 9.84 0.00295 **
## Residuals 12 38.96 3.25
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Conclusion:
After applying a square root transformation, the difference between the means is more evident: the P-value is 0.00295, which is smaller than the 0.00643 obtained before the transformation.
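The effect of the square root on the spread of the groups can be checked directly. A minimal sketch, assuming Bartlett's test is an acceptable check here (it is not part of the original exercise); `ratio_raw` and `ratio_sqrt` are illustrative names.

```r
w1 <- c(31, 10, 21, 4, 1)
w2 <- c(62, 40, 24, 30, 35)
w3 <- c(53, 27, 120, 97, 68)
raw <- data.frame(w1, w2, w3)
rooted <- sqrt(raw)   # square root applied to every column

# Ratio of largest to smallest group variance, before and after
ratio_raw <- max(sapply(raw, var)) / min(sapply(raw, var))
ratio_sqrt <- max(sapply(rooted, var)) / min(sapply(rooted, var))
c(raw = ratio_raw, sqrt = ratio_sqrt)

# Bartlett's test of equal variances on both scales
bartlett.test(values ~ ind, data = stack(raw))
bartlett.test(values ~ ind, data = stack(rooted))
```

The variance ratio drops after the transformation, which is consistent with the smaller P-value of the F test: the groups are more comparable in spread on the square root scale.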
f1<-c(17.6,18.9,16.3,17.4,20.1,21.6)
f2<-c(16.9,15.3,18.6,17.1,19.5,20.3)
f3<-c(21.4,23.6,19.4,18.5,20.5,22.3)
f4<-c(19.3,21.1,16.9,17.5,18.3,19.8)
dat<-data.frame(f1,f2,f3,f4)
library(tidyr)
dat2<-pivot_longer(dat,c(f1,f2,f3,f4))
kruskal.test(value~name,data=dat2)
##
## Kruskal-Wallis rank sum test
##
## data: value by name
## Kruskal-Wallis chi-squared = 6.2177, df = 3, p-value = 0.1015
Conclusion:
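Since the P-value (0.1015) > 0.05, we do not reject the hypothesis that the fluids do not differ. Note that the Kruskal-Wallis test compares the rank distributions of the groups rather than their means; its conclusion agrees with the ANOVA.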
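To make clear what `kruskal.test()` computes, the statistic can be reproduced by hand from the ranks. This is a sketch of the usual formula with the tie correction (the value 16.9 appears twice in the data); the variable names are mine, and base R `stack()` is used instead of `pivot_longer()`.

```r
f1 <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6)
f2 <- c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3)
f3 <- c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3)
f4 <- c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
long <- stack(data.frame(f1, f2, f3, f4))

# Rank all observations together (tied values get the average rank)
r <- rank(long$values)
N <- length(r)

# Rank sum and sample size of each fluid
rank_sums <- tapply(r, long$ind, sum)
n_i <- tapply(r, long$ind, length)

# Kruskal-Wallis statistic, then the correction for ties
H <- 12 / (N * (N + 1)) * sum(rank_sums^2 / n_i) - 3 * (N + 1)
ties <- table(long$values)
H_adj <- H / (1 - sum(ties^3 - ties) / (N^3 - N))

# Chi-squared approximation with (number of groups - 1) degrees of freedom
p_value <- pchisq(H_adj, df = length(n_i) - 1, lower.tail = FALSE)

# Compare with the built-in test
kw <- kruskal.test(values ~ ind, data = long)
c(by_hand = H_adj, built_in = unname(kw$statistic))
```

The hand computation should match the chi-squared value reported above, since the built-in function applies the same tie correction.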