Problem 3.23:
(a):
Life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6, 16.9, 15.3, 18.6, 17.1, 19.5, 20.3, 21.4, 23.6, 19.4, 18.5, 20.5, 22.3, 19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
Type <- c(rep(1,6), rep(2,6), rep(3,6), rep(4,6))
Data <- data.frame(Life, Type)
Data$Type <- as.factor(Data$Type)
aov.model<-aov(Life~Type,data=Data)
summary(aov.model)
## Df Sum Sq Mean Sq F value Pr(>F)
## Type 3 30.16 10.05 3.047 0.0525 .
## Residuals 20 65.99 3.30
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The p-value (0.0525) is slightly higher than \(\alpha=0.05\), so we fail to reject the null hypothesis: there is no significant evidence that the fluid types differ in mean life.
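The decision in part (a) can be checked directly against the F distribution. The sketch below recomputes the p-value and the 5% critical value from the F statistic and degrees of freedom printed in the ANOVA table; the variable names are illustrative and not part of the original solution.

```r
# F statistic and degrees of freedom copied from the ANOVA table above
F_obs  <- 3.047
df_trt <- 3
df_err <- 20

# Upper-tail probability of the F distribution (the p-value)
p_value <- pf(F_obs, df_trt, df_err, lower.tail = FALSE)

# Critical value for a test at alpha = 0.05
F_crit <- qf(0.95, df_trt, df_err)

# H0 is rejected only if F_obs exceeds F_crit (equivalently p_value < 0.05)
reject <- F_obs > F_crit
print(c(p_value = p_value, F_crit = F_crit))
print(reject)
```

Because the observed F falls just short of the critical value, the test is a close call, which is worth keeping in mind when reading the follow-up comparisons.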
(b):
library(agricolae)
?LSD.test
LSD.test(aov.model,"Type",console=TRUE)
##
## Study: aov.model ~ "Type"
##
## LSD t Test for Life
##
## Mean Square Error: 3.299667
##
## Type, means and individual ( 95 %) CI
##
## Life std r se LCL UCL Min Max Q25 Q50 Q75
## 1 18.65000 1.952178 6 0.7415824 17.10309 20.19691 16.3 21.6 17.450 18.25 19.800
## 2 17.95000 1.854454 6 0.7415824 16.40309 19.49691 15.3 20.3 16.950 17.85 19.275
## 3 20.95000 1.879096 6 0.7415824 19.40309 22.49691 18.5 23.6 19.675 20.95 22.075
## 4 18.81667 1.554885 6 0.7415824 17.26975 20.36358 16.9 21.1 17.700 18.80 19.675
##
## Alpha: 0.05 ; DF Error: 20
## Critical Value of t: 2.085963
##
## least Significant Difference: 2.187666
##
## Treatments with the same letter are not significantly different.
##
## Life groups
## 3 20.95000 a
## 4 18.81667 ab
## 1 18.65000 b
## 2 17.95000 b
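Since the objective is to select a fluid with long life, type 3, which has the highest mean life (20.95), should be selected. Note that its mean differs significantly from those of types 1 and 2 but not from that of type 4 (types 3 and 4 share the letter a).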
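The least significant difference reported by `LSD.test` can be reproduced by hand, which makes the letter groupings easier to interpret. This is a minimal sketch assuming equal group sizes, using the mean square error, degrees of freedom, and group means printed above; the object names are illustrative.

```r
# Values taken from the LSD.test output above
mse    <- 3.299667   # mean square error
df_err <- 20         # error degrees of freedom
n      <- 6          # observations per fluid type

# Fisher's LSD for equal group sizes: t_{alpha/2, df} * sqrt(2 * MSE / n)
lsd <- qt(0.975, df_err) * sqrt(2 * mse / n)

# Group means from the output above
means <- c("1" = 18.65, "2" = 17.95, "3" = 20.95, "4" = 18.81667)

# Absolute pairwise differences; a pair differs significantly if it exceeds lsd
diffs <- abs(outer(means, means, "-"))
print(lsd)
print(diffs > lsd)
```

Only the pairs (3, 1) and (3, 2) exceed the LSD, which is consistent with the grouping letters in the output.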
(c):
plot(aov.model)
Looking at the plots, the residuals look approximately normal and their spread is similar across the fitted values, so both the normality and constant-variance assumptions appear to be satisfied.
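The visual check in part (c) could be supplemented with formal tests. The sketch below is an addition, not part of the original solution: it applies the Shapiro-Wilk test to the residuals and Bartlett's test to the group variances, both available in base R.

```r
# Data from problem 3.23, repeated so the block runs on its own
Life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6, 16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3, 19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
Type <- factor(rep(1:4, each = 6))
fit  <- aov(Life ~ Type)

# Shapiro-Wilk test on the residuals (H0: residuals are normally distributed)
sw <- shapiro.test(residuals(fit))

# Bartlett test (H0: the four fluid types have equal variances)
bt <- bartlett.test(Life ~ Type)

print(sw$p.value)
print(bt$p.value)
```

Large p-values from both tests would support the conclusion drawn from the plots; with only six observations per group, these tests have limited power, so the plots remain the primary check.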
Problem 3.28:
(a):
Time <- c(110, 157, 194, 178, 1, 2, 4, 18, 880, 1256, 5276, 4355, 495, 7040, 5307, 10050, 7, 5, 29, 2)
Type <- c(rep(1,4), rep(2,4), rep(3,4), rep(4,4), rep(5,4))
Data <- data.frame(Time,Type)
Data$Type <- as.factor(Data$Type)
aov.model<-aov(Time~Type,data=Data)
summary(aov.model)
## Df Sum Sq Mean Sq F value Pr(>F)
## Type 4 103191489 25797872 6.191 0.00379 **
## Residuals 15 62505657 4167044
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Since the p-value (0.00379) is less than the significance level, we reject H0 and conclude that the five materials do not all have the same mean failure time.
(b):
plot(aov.model)
Looking at the plots, the residuals do not look normal and the variance does not appear to be constant, so both ANOVA assumptions are violated.
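The non-constant variance seen in the plots also shows up in simple group summaries. The following sketch (an illustrative addition) tabulates the mean and standard deviation of failure time for each material.

```r
# Data from problem 3.28, repeated so the block runs on its own
Time <- c(110, 157, 194, 178, 1, 2, 4, 18, 880, 1256, 5276, 4355,
          495, 7040, 5307, 10050, 7, 5, 29, 2)
Type <- factor(rep(1:5, each = 4))

# Mean and standard deviation of failure time within each material
group_mean <- tapply(Time, Type, mean)
group_sd   <- tapply(Time, Type, sd)
print(cbind(mean = group_mean, sd = group_sd))

# Ratio of the largest to the smallest group standard deviation
sd_ratio <- max(group_sd) / min(group_sd)
print(sd_ratio)
```

The standard deviations grow with the group means and differ by orders of magnitude, which is the pattern a variance-stabilizing transformation is meant to address.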
(c):
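Stabilizing the variance using the Box-Cox method, as in the last assignment.
library(MASS)
# Use the factor version of Type; the bare numeric vector would fit a regression on the type codes
boxcox(Time~Type,data=Data)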
Since \(\lambda\) is close to zero, we apply a log transformation to the data.
LogTime <- log(Time)
# Fit the model on the transformed data and plot the diagnostics
DataT<-data.frame(LogTime,Type)
DataT$Type <- as.factor(DataT$Type)
aovmodelT<-aov(LogTime~Type,data=DataT)
plot(aovmodelT)
Looking at the plots for the transformed data, the residuals look reasonably normal, but the variance is still not stabilized.
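The value of \(\lambda\) read from the Box-Cox plot in part (c) can also be obtained numerically. This is a sketch, assuming material type is treated as a factor and using a finer \(\lambda\) grid than the default; `lambda_hat` and `ci` are illustrative names.

```r
library(MASS)  # provides boxcox()

# Data from problem 3.28, repeated so the block runs on its own
Time <- c(110, 157, 194, 178, 1, 2, 4, 18, 880, 1256, 5276, 4355,
          495, 7040, 5307, 10050, 7, 5, 29, 2)
Type <- factor(rep(1:5, each = 4))

# Profile log-likelihood over a grid of lambda values, without drawing the plot
bc <- boxcox(Time ~ Type, lambda = seq(-2, 2, by = 0.01), plotit = FALSE)

# Lambda that maximizes the profile log-likelihood
lambda_hat <- bc$x[which.max(bc$y)]

# Approximate 95% interval: lambdas within qchisq(0.95, 1) / 2 of the maximum
ci <- range(bc$x[bc$y > max(bc$y) - qchisq(0.95, 1) / 2])
print(lambda_hat)
print(ci)
```

If the interval contains 0, the log transformation is a reasonable choice; if it contains 1, no transformation is indicated.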
Problem 3.29:
(a):
Count <- c(31, 10, 21, 4, 1, 62, 40, 24, 30, 35, 53, 27, 120, 97, 68)
Type <- c(rep(1,5), rep(2,5), rep(3,5))
Data <- data.frame(Count,Type)
Data$Type <- as.factor(Data$Type)
aov.model<-aov(Count~Type,data=Data)
summary(aov.model)
## Df Sum Sq Mean Sq F value Pr(>F)
## Type 2 8964 4482 7.914 0.00643 **
## Residuals 12 6796 566
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Since the p-value (0.00643) is smaller than 0.05, we reject H0 and conclude that at least one of the means is different.
(b):
plot(aov.model)
Looking at the plots, normality is roughly satisfied, but the spread is not constant, so the constant-variance assumption is violated.
(c):
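Performing a transformation as in the previous problem.
library(MASS)
# Use the factor version of Type; the bare numeric vector would fit a regression on the type codes
boxcox(Count~Type,data=Data)
The value of \(\lambda\) is close to 0.4, so we use this value.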
lambda <- 0.4
CountT<-Count^(lambda)
boxcox(CountT~factor(Type))
Now the estimated \(\lambda\) is no longer below 1, so no further power transformation is needed and we proceed with the transformed data.
DataT <- data.frame(CountT,Type)
DataT$Type <- as.factor(DataT$Type)
aov.modelT<-aov(CountT~Type,data=DataT)
summary(aov.modelT)
## Df Sum Sq Mean Sq F value Pr(>F)
## Type 2 21.21 10.605 9.881 0.00291 **
## Residuals 12 12.88 1.073
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(aov.modelT)
Now the data seem to satisfy both the normality and constant-variance assumptions.
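The effect of the transformation on the spread can be quantified by comparing group standard deviations before and after. This is an illustrative sketch using the same \(\lambda = 0.4\) as above; the object names are not from the original solution.

```r
# Data from problem 3.29, repeated so the block runs on its own
Count <- c(31, 10, 21, 4, 1, 62, 40, 24, 30, 35, 53, 27, 120, 97, 68)
Type  <- factor(rep(1:3, each = 5))
lambda <- 0.4

# Group standard deviations on the raw and the transformed scale
sd_raw <- tapply(Count, Type, sd)
sd_t   <- tapply(Count^lambda, Type, sd)

# Ratio of largest to smallest standard deviation on each scale
ratio_raw <- max(sd_raw) / min(sd_raw)
ratio_t   <- max(sd_t) / min(sd_t)
print(c(raw = ratio_raw, transformed = ratio_t))
```

A smaller ratio on the transformed scale indicates that the group variances are closer to each other after the power transformation.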
Problem 3.51, 3.52:
Loading the data again from problem 3.23 and performing Kruskal-Wallis test on it:
Life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6, 16.9, 15.3, 18.6, 17.1, 19.5, 20.3, 21.4, 23.6, 19.4, 18.5, 20.5, 22.3, 19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
Type <- c(rep(1,6), rep(2,6), rep(3,6), rep(4,6))
Data <- data.frame(Life,Type)
Data$Type <- as.factor(Data$Type)
kruskal.test(Life~Type,data=Data)
##
## Kruskal-Wallis rank sum test
##
## data: Life by Type
## Kruskal-Wallis chi-squared = 6.2177, df = 3, p-value = 0.1015
Since the p-value (0.1015) is again greater than 0.05, we fail to reject H0, as in 3.23, although the ANOVA p-value in 3.23 (0.0525) was lower. Both tests therefore agree that there is no significant difference among the fluid types at \(\alpha=0.05\).
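To see where the Kruskal-Wallis statistic comes from, it can be computed by hand from the ranks. The sketch below is an addition for illustration: it uses midranks for the tied value, applies the standard tie correction, and compares the result with `kruskal.test`.

```r
# Data from problem 3.23, repeated so the block runs on its own
Life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6, 16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3, 19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
Type <- factor(rep(1:4, each = 6))
N <- length(Life)

# Rank all observations together (ties receive the average rank)
r   <- rank(Life)
R   <- tapply(r, Type, sum)      # rank sum per fluid type
n_i <- tapply(r, Type, length)   # observations per fluid type

# Kruskal-Wallis statistic before the tie correction
H <- 12 / (N * (N + 1)) * sum(R^2 / n_i) - 3 * (N + 1)

# Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N), t = size of each tie group
ties  <- table(Life)
H_adj <- H / (1 - sum(ties^3 - ties) / (N^3 - N))

# Chi-squared approximation with (number of groups - 1) degrees of freedom
p_value <- pchisq(H_adj, df = nlevels(Type) - 1, lower.tail = FALSE)

kw <- kruskal.test(Life ~ Type)
print(c(by_hand = H_adj, kruskal_test = unname(kw$statistic), p_value = p_value))
```

The hand computation and the built-in test give the same statistic, since the test is based on ranks rather than on the observed values themselves.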
Problem 4.3:
library(GAD)
obs<-c(73,68,74,71,67,73,67,75,72,70,75,68,78,73,68,73,71,75,75,69)
chemical<-c(rep(1,5),rep(2,5),rep(3,5),rep(4,5))
bolt<-c(seq(1,5),seq(1,5),seq(1,5),seq(1,5))
bolt<-as.fixed(bolt)
chemical<-as.fixed(chemical)
model<-lm(obs~chemical+bolt)
gad(model)
## $anova
## Analysis of Variance Table
##
## Response: obs
## Df Sum Sq Mean Sq F value Pr(>F)
## chemical 3 12.95 4.317 2.3761 0.1211
## bolt 4 157.00 39.250 21.6055 2.059e-05 ***
## Residuals 12 21.80 1.817
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Since the p-value for chemical (0.1211) is greater than \(\alpha=0.05\), we fail to reject H0 and conclude that there is no significant difference among the chemical means.
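The same randomized block analysis can be run with base R alone, which is handy if the GAD package is not installed. This is a sketch of an equivalent fit with `aov`, treating both chemical and bolt as factors; `F_chem` and `p_chem` are illustrative names.

```r
# Data from problem 4.3, repeated so the block runs on its own
obs <- c(73, 68, 74, 71, 67, 73, 67, 75, 72, 70,
         75, 68, 78, 73, 68, 73, 71, 75, 75, 69)
chemical <- factor(rep(1:4, each = 5))   # treatment
bolt     <- factor(rep(1:5, times = 4))  # block

# Additive model: treatment effect plus block effect
fit <- aov(obs ~ chemical + bolt)
tab <- summary(fit)[[1]]
print(tab)

# F statistic and p-value for the chemical (treatment) row
F_chem <- tab[["F value"]][1]
p_chem <- tab[["Pr(>F)"]][1]
```

Because the design is balanced and both factors are fixed, this table should agree with the `gad` output above.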
Problem 4.16:
chemical<-as.fixed(chemical)
bolt<-as.fixed(bolt)
# Grand mean
Gmean<-mean(obs)
c1 <- c(73, 68, 74, 71, 67)
c2 <- c(73, 67, 75, 72, 70)
c3 <- c(75, 68, 78, 73, 68)
c4 <- c(73, 71, 75, 75, 69)
Y1<-mean(c1)
Y2<-mean(c2)
Y3<-mean(c3)
Y4<-mean(c4)
t1<-Y1-Gmean
t2<-Y2-Gmean
t3<-Y3-Gmean
t4<-Y4-Gmean
cat(t1,t2,t3,t4)
## -1.15 -0.35 0.65 0.85
Now for bolts:
b1<-c(73,73,75,73)
b2<-c(68,67,68,71)
b3<-c(74,75,78,75)
b4<-c(71,72,73,75)
b5<-c(67,70,68,69)
Y.1<-mean(b1)
Y.2<-mean(b2)
Y.3<-mean(b3)
Y.4<-mean(b4)
Y.5<-mean(b5)
beta1<-Y.1-Gmean
beta2<-Y.2-Gmean
beta3<-Y.3-Gmean
beta4<-Y.4-Gmean
beta5<-Y.5-Gmean
cat(beta1,beta2,beta3,beta4,beta5)
## 1.75 -3.25 3.75 1 -3.25
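The estimates above follow \(\hat{\tau}_i=\bar{y}_{i.}-\bar{y}_{..}\) for chemicals and \(\hat{\beta}_j=\bar{y}_{.j}-\bar{y}_{..}\) for bolts. As a compact alternative to typing each group by hand, the same quantities can be obtained with `tapply`; this sketch uses illustrative names `tau_hat` and `beta_hat`.

```r
# Data from problem 4.3, repeated so the block runs on its own
obs <- c(73, 68, 74, 71, 67, 73, 67, 75, 72, 70,
         75, 68, 78, 73, 68, 73, 71, 75, 75, 69)
chemical <- factor(rep(1:4, each = 5))
bolt     <- factor(rep(1:5, times = 4))

grand_mean <- mean(obs)

# Treatment effects: chemical means minus the grand mean
tau_hat <- tapply(obs, chemical, mean) - grand_mean

# Block effects: bolt means minus the grand mean
beta_hat <- tapply(obs, bolt, mean) - grand_mean

print(tau_hat)
print(beta_hat)

# Each set of effects sums to zero, up to floating-point rounding
print(c(sum(tau_hat), sum(beta_hat)))
```

Deriving the groups from the factor labels avoids transcription errors when the data are re-entered row by row.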
Problem 4.22:
Obs <- c(8,7,1,7,3,11,2,7,3,8,4,9,10,1,5,6,8,6,6,10,4,2,3,8,8)
Batch <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4,5,5,5,5,5)
Day <- c(rep(seq(1,5),5))
Ingredient <- c(1,2,4,3,5,3,5,1,4,2,2,1,3,5,4,4,3,5,2,1,5,4,2,1,3)
Batch <- as.factor(Batch)
Day <- as.factor(Day)
Ingredient <- as.factor(Ingredient)
Data <- data.frame(Obs, Batch, Day, Ingredient)
model<-aov(Obs~Ingredient+Batch+Day,data=Data)
summary(model)
## Df Sum Sq Mean Sq F value Pr(>F)
## Ingredient 4 141.44 35.36 11.309 0.000488 ***
## Batch 4 15.44 3.86 1.235 0.347618
## Day 4 12.24 3.06 0.979 0.455014
## Residuals 12 37.52 3.13
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
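The p-value for ingredient (0.000488) is lower than 0.05, so we reject H0 and conclude that the ingredients have a significant effect on the mean response. The p-values for batch and day, the two nuisance (blocking) factors, are greater than \(\alpha\), so neither shows a significant effect.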
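The three-factor additive model above is only appropriate if the layout is a valid Latin square, meaning every ingredient appears exactly once in each batch and once on each day. The following sketch (an illustrative addition) checks this with cross-tabulations.

```r
# Design from problem 4.22, repeated so the block runs on its own
Batch      <- factor(rep(1:5, each = 5))
Day        <- factor(rep(1:5, times = 5))
Ingredient <- factor(c(1, 2, 4, 3, 5, 3, 5, 1, 4, 2, 2, 1, 3, 5, 4,
                       4, 3, 5, 2, 1, 5, 4, 2, 1, 3))

# In a Latin square each ingredient occurs once per batch and once per day
by_batch <- table(Batch, Ingredient)
by_day   <- table(Day, Ingredient)
is_latin <- all(by_batch == 1) && all(by_day == 1)
print(is_latin)

# Error degrees of freedom: (N - 1) minus 4 df for each of the three factors
df_error <- (length(Ingredient) - 1) - 3 * 4
print(df_error)
```

The residual degrees of freedom computed this way match the 12 shown in the ANOVA table.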