\(H_0: \mu_1=\mu_2 =\mu_3=\mu_4=\mu\)
\(H_a\): At least one \(\mu\) is not the same.
library(tidyr)
f1<-c(17.6,18.9,16.3,17.4,20.1,21.6)
f2<-c(16.9,15.3,18.6,17.1,19.5,20.3)
f3<-c(21.4,23.6,19.4,18.5,20.5,22.3)
f4<-c(19.3,21.1,16.9,17.5,18.3,19.8)
dat <- data.frame(f1,f2,f3,f4)
dat2 <- pivot_longer(dat,c(f1,f2,f3,f4))
aov.model <- aov(value~name,data=dat2)
summary(aov.model)
## Df Sum Sq Mean Sq F value Pr(>F)
## name 3 30.16 10.05 3.047 0.0525 .
## Residuals 20 65.99 3.30
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
p value is 0.0525 larger than 0.05, we fail to reject the null hypothesis. There is no indication that the fluids differ.
I would choose fluid type 3 for longer life.
pop <- c(f1,f2,f3,f4)
meanx <- c(rep(mean(f1),6),rep(mean(f2),6),rep(mean(f3),6),rep(mean(f4),6))
res <- pop-meanx
qqnorm(res)
qqline(res)
plot(meanx,res,xlab="population average", ylab="residual",main="variance")
The variance appear to be consistent.
\(H_0: \mu_1=\mu_2 =\mu_3=\mu_4=\mu_5=\mu\)
\(H_a\): At least one \(\mu\) is not the same.
m1<-c(110, 157, 194, 178)
m2<-c(1, 2, 4, 18)
m3<-c(880, 1256, 5276, 4355)
m4<-c(495, 7040, 5307, 10050)
m5<-c(7, 5, 29, 2)
dat <- data.frame(m1,m2,m3,m4,m5)
library(tidyr)
dat2 <- pivot_longer(dat,c(m1,m2,m3,m4,m5))
aov.model <- aov(value~name, data=dat2)
summary(aov.model)
## Df Sum Sq Mean Sq F value Pr(>F)
## name 4 103191489 25797872 6.191 0.00379 **
## Residuals 15 62505657 4167044
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
P value = 0.00379 which is smaller than 0.01, so we can reject the H0. The five materials do not have the same effect on mean failure time.
pop <- c(m1,m2,m3,m4,m5)
meanx <- c(rep(mean(m1),4),rep(mean(m2),4),rep(mean(m3),4),rep(mean(m4),4),rep(mean(m5),4))
res <- pop-meanx
qqnorm(res)
qqline(res)
plot(meanx,res,xlab="population average", ylab="residual",main="variance")
The variances do not appear to be constant. The data is not normally distributed.
library(MASS)
boxcox(pop~meanx)
dat3<-log(dat)
dat4<-pivot_longer(dat3,c(m1,m2,m3,m4,m5))
aov.model<-aov(value~name,data=dat4)
summary(aov.model)
## Df Sum Sq Mean Sq F value Pr(>F)
## name 4 165.06 41.26 37.66 1.18e-07 ***
## Residuals 15 16.44 1.10
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
P value is much smaller, and we can reject the H0.
\(H_0: \mu_1=\mu_2 =\mu_3=\mu\)
\(H_a\): At least one \(\mu\) is not the same.
M1<-c(31, 10, 21, 4, 1)
M2<-c(62, 40, 24, 30, 35)
M3<-c(53, 27, 120, 97, 68)
dat<-data.frame(M1,M2,M3)
library(tidyr)
dat2<-pivot_longer(dat,c(M1,M2,M3))
aov.model<-aov(value~name,data=dat2)
summary(aov.model)
## Df Sum Sq Mean Sq F value Pr(>F)
## name 2 8964 4482 7.914 0.00643 **
## Residuals 12 6796 566
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
P value = 0.00643 which is smaller than 0.01, so we can reject H0. All methods don’t have the same effect.
pop <- c(M1,M2,M3)
meanx <- c(rep(mean(M1),5),rep(mean(M2),5),rep(mean(M3),5))
res <- pop-meanx
qqnorm(res)
qqline(res)
plot(meanx,res,xlab="population average", ylab="residual",main="variance")
The data seems to be normally distributed, but the variance is obviously not constant,
library(MASS)
boxcox(pop~meanx)
dat5<-sqrt(dat)
dat6<-pivot_longer(dat5,c(M1,M2,M3))
aov.model<-aov(value~name,data=dat6)
summary(aov.model)
## Df Sum Sq Mean Sq F value Pr(>F)
## name 2 63.90 31.95 9.84 0.00295 **
## Residuals 12 38.96 3.25
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
P value is 0.00295, which is even smaller than before the data transformation. We can safely reject the null hypothesis.
f1<-c(17.6,18.9,16.3,17.4,20.1,21.6)
f2<-c(16.9,15.3,18.6,17.1,19.5,20.3)
f3<-c(21.4,23.6,19.4,18.5,20.5,22.3)
f4<-c(19.3,21.1,16.9,17.5,18.3,19.8)
dat<-data.frame(f1,f2,f3,f4)
library(tidyr)
dat2<-pivot_longer(dat,c(f1,f2,f3,f4))
kruskal.test(value~name,data=dat2)
##
## Kruskal-Wallis rank sum test
##
## data: value by name
## Kruskal-Wallis chi-squared = 6.2177, df = 3, p-value = 0.1015
P value is higher than 0.05, so we fail to reject H0.
The results is different from the usual analysis of variance, which assume the normal distribution and the constant variance. In this case, Kruskal-Wallis test gives us more reliable and conservative results.
library(GAD)
chemical<-c(rep(1,5),rep(2,5),rep(3,5),rep(4,5))
bolt <- c(seq(1,5),seq(1,5),seq(1,5),seq(1,5))
obs<- c(73,68,74,71,67,
73,67,75,72,70,
75,68,78,73,68,
73,71,75,75,69)
chemical<-as.fixed(chemical)
bolt <- as.fixed(bolt)
model<-lm(obs~chemical+bolt)
gad(model)
## $anova
## Analysis of Variance Table
##
## Response: obs
## Df Sum Sq Mean Sq F value Pr(>F)
## chemical 3 12.95 4.317 2.3761 0.1211
## bolt 4 157.00 39.250 21.6055 2.059e-05 ***
## Residuals 12 21.80 1.817
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The results suggest that there is significant variability in tensile strength between different bolts (p value is smaller than 0.05), but the different chemical agents did not have a statistically significant effect on tensile strength at the 0.05 level.
mean_all <- mean(obs)
c1 <- c(73,68,74,71,67)
c2 <- c(73,67,75,72,70)
c3 <- c(75,68,78,73,68)
c4 <- c(73,71,75,75,69)
mean(c1)
## [1] 70.6
mean(c2)
## [1] 71.4
mean(c3)
## [1] 72.4
mean(c4)
## [1] 72.6
t1=mean(c1)-mean_all
t2=mean(c2)-mean_all
t3=mean(c3)-mean_all
t4=mean(c4)-mean_all
t1
## [1] -1.15
t2
## [1] -0.35
t3
## [1] 0.65
t4
## [1] 0.85
b1 <- c(73,73,75,73)
b2 <- c(68,67,68,71)
b3 <- c(74,75,78,75)
b4 <- c(71,72,73,75)
b5 <- c(67,70,68,69)
mean(b1)
## [1] 73.5
mean(b2)
## [1] 68.5
mean(b3)
## [1] 75.5
mean(b4)
## [1] 72.75
mean(b5)
## [1] 68.5
B1 <- mean(b1)-mean_all
B2 <- mean(b2)-mean_all
B3 <- mean(b3)-mean_all
B4 <- mean(b4)-mean_all
B5 <- mean(b5)-mean_all
B1
## [1] 1.75
B2
## [1] -3.25
B3
## [1] 3.75
B4
## [1] 1
B5
## [1] -3.25
\(\tau_1=\)-1.15
\(\tau_2=\)-0.35
\(\tau_3=\) 0.65
\(\tau_4=\) 0.85
\(\beta_1=\) 1.75
\(\beta_2=\) -3.25
\(\beta_3=\) 3.75
\(\beta_4=\) 1
\(\beta_5=\) -3.25
batch<-c(rep(1,5),rep(2,5),rep(3,5),rep(4,5),rep(5,5))
day<-c(seq(1,5),seq(1,5),seq(1,5),seq(1,5),seq(1,5))
value <- c(8,7,1,7,3,11,2,7,3,8,4,9,10,1,5,6,8,6,6,10,4,2,3,8,8)
ingre <- c('A','B','D','C','E','C','E','A','D','B','B','A','C','E','D','D','C','E','B','A','E','D','B','A','C')
dat <- data.frame(batch,day,value,ingre)
dat$batch <- as.factor(dat$batch)
dat$day <- as.factor(dat$day)
dat$ingre <- as.factor(dat$ingre)
aov.model <- aov(value~batch+day+ingre, data = dat)
summary(aov.model)
## Df Sum Sq Mean Sq F value Pr(>F)
## batch 4 15.44 3.86 1.235 0.347618
## day 4 12.24 3.06 0.979 0.455014
## ingre 4 141.44 35.36 11.309 0.000488 ***
## Residuals 12 37.52 3.13
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The ingredients appears to significantly affect the reaction time of the chemical process, since the p value is smaller than 0.001.