Getting the data we have
# One-way layout: 4 fluid types with 6 observations each
fluidtype <- rep(1:4, each = 6)
life <- c(17.6,18.9,16.3,17.4,20.1,21.6,16.9,15.3,18.6,17.1,19.5,20.3,21.4,23.6,19.4,18.5,20.5,22.3,19.3,21.1,16.9,17.5,18.3,19.8)
# Store the fluid type as a factor so aov() treats it as a categorical treatment
dat5 <- data.frame(fluidtype = factor(fluidtype), life = life)
Question 3.23 a
Let
u1 = mean life (in h) at 35 kV load for fluid type 1
u2 = mean life (in h) at 35 kV load for fluid type 2
u3 = mean life (in h) at 35 kV load for fluid type 3
u4 = mean life (in h) at 35 kV load for fluid type 4
Null hypothesis
H0: u1 = u2 = u3 = u4, that is, the mean lives of all four fluid types (1, 2, 3, 4) are equal.
Alternative hypothesis
Ha: at least one of the means (u's) differs.
Model<- aov(dat5$life~dat5$fluidtype,data=dat5)
summary(Model)
## Df Sum Sq Mean Sq F value Pr(>F)
## dat5$fluidtype 3 30.17 10.05 3.047 0.0525 .
## Residuals 20 65.99 3.30
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Since the p-value from our model (0.0525) is greater than our reference level of 0.05, we fail to reject the null hypothesis: at the 5% level there is not enough evidence that the mean lives of the fluids (1, 2, 3, 4) differ.
Note, however, that the p-value is only marginally above 0.05, so the result is borderline and should be interpreted with caution.
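To make the ANOVA table less of a black box, the F statistic can be rebuilt from the group means. The following is a minimal sketch under the assumption of a balanced one-way layout (6 observations per fluid); the variable names are chosen here for illustration and are not part of the original analysis.

```r
# Same data as in dat5
life <- c(17.6,18.9,16.3,17.4,20.1,21.6,16.9,15.3,18.6,17.1,19.5,20.3,
          21.4,23.6,19.4,18.5,20.5,22.3,19.3,21.1,16.9,17.5,18.3,19.8)
fluidtype <- factor(rep(1:4, each = 6))

a <- nlevels(fluidtype)   # number of treatments
n <- 6                    # replicates per treatment (balanced design)
N <- length(life)         # total number of observations

group_means <- tapply(life, fluidtype, mean)
grand_mean  <- mean(life)

# Between-treatment and within-treatment sums of squares
ss_treat <- n * sum((group_means - grand_mean)^2)
ss_error <- sum((life - ave(life, fluidtype))^2)  # ave() returns each row's group mean

# F = MS_treatments / MS_error, with (a - 1, N - a) degrees of freedom
f_stat  <- (ss_treat / (a - 1)) / (ss_error / (N - a))
p_value <- pf(f_stat, a - 1, N - a, lower.tail = FALSE)

round(c(SS_treat = ss_treat, SS_error = ss_error, F = f_stat, p = p_value), 4)
```

The values should agree with the `summary(Model)` table above up to rounding.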
Question 3.23 b
Which fluid would you select, given that the objective is long life?
Model2<- TukeyHSD(Model)
Model2
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = dat5$life ~ dat5$fluidtype, data = dat5)
##
## $`dat5$fluidtype`
## diff lwr upr p adj
## 2-1 -0.7000000 -3.63540073 2.2354007 0.9080815
## 3-1 2.3000000 -0.63540073 5.2354007 0.1593262
## 4-1 0.1666667 -2.76873407 3.1020674 0.9985213
## 3-2 3.0000000 0.06459927 5.9354007 0.0440578
## 4-2 0.8666667 -2.06873407 3.8020674 0.8413288
## 4-3 -2.1333333 -5.06873407 0.8020674 0.2090635
plot(Model2)
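Based on the Tukey comparisons above, we would select fluid type 3 when the objective is long life, because its average life is higher than that of the other three fluids (1, 2, 4). Note, however, that only the difference between fluids 3 and 2 is significant at the 5% family-wise level (adjusted p = 0.044); the intervals for 3-1 and 4-3 both contain zero.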
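The choice can also be read off numerically rather than from the plot. The sketch below refits the same model with plain column names (an assumption made here so that the Tukey result is stored under the name `fluidtype`), sorts the group means, and keeps only the pairs whose adjusted p-value is below 0.05.

```r
# Same data as in dat5
life <- c(17.6,18.9,16.3,17.4,20.1,21.6,16.9,15.3,18.6,17.1,19.5,20.3,
          21.4,23.6,19.4,18.5,20.5,22.3,19.3,21.1,16.9,17.5,18.3,19.8)
fluidtype <- factor(rep(1:4, each = 6))

fit <- aov(life ~ fluidtype)

# Mean life per fluid, largest first
group_means <- sort(tapply(life, fluidtype, mean), decreasing = TRUE)
group_means

# TukeyHSD() returns one matrix per model term, named after the term
tukey <- TukeyHSD(fit)$fluidtype
significant <- tukey[tukey[, "p adj"] < 0.05, , drop = FALSE]
significant
```

With these data the largest mean belongs to fluid 3, and the only pair flagged at the 5% level is 3-2, in line with the table above.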
plot(Model)
3.23 (c) Analyze the residuals from this experiment. Are the basic analysis of variance assumptions satisfied?
The points in the Normal Q-Q plot of the residuals fall fairly close to a straight line, so the normality assumption appears reasonable.
In the residuals vs fitted plot, the residuals show roughly the same spread at each fitted value, so the constant-variance assumption also appears satisfied.
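The visual checks can be supplemented with formal tests. This is an optional sketch, not part of the original solution: it applies the Shapiro-Wilk test to the residuals and Bartlett's test to the group variances. With only 6 observations per fluid, both tests have limited power, so they should be read alongside the plots rather than instead of them.

```r
# Same data as in dat5
life <- c(17.6,18.9,16.3,17.4,20.1,21.6,16.9,15.3,18.6,17.1,19.5,20.3,
          21.4,23.6,19.4,18.5,20.5,22.3,19.3,21.1,16.9,17.5,18.3,19.8)
fluidtype <- factor(rep(1:4, each = 6))

fit <- aov(life ~ fluidtype)

# H0: the residuals come from a normal distribution
normality <- shapiro.test(residuals(fit))

# H0: all fluid types share the same variance
equal_var <- bartlett.test(life ~ fluidtype)

c(shapiro_p = normality$p.value, bartlett_p = equal_var$p.value)
```

Large p-values here would be consistent with the reading of the plots; small ones would suggest revisiting the assumptions.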
# 5 materials with 4 failure times (in minutes) each
material <- rep(1:5, each = 4)
t <- c(110,157,194,178,1,2,4,18,880,1256,5276,4355,495,7040,5307,10050,7,5,29,2)
# Treat material as a factor; keep the failure time numeric
dat <- data.frame(material = factor(material), t = t)
Question 3.28 a
Let
u1 = mean failure time (in minutes) of material 1
u2 = mean failure time (in minutes) of material 2
u3 = mean failure time (in minutes) of material 3
u4 = mean failure time (in minutes) of material 4
u5 = mean failure time (in minutes) of material 5
Null hypothesis
H0: u1 = u2 = u3 = u4 = u5, that is, the mean failure times of all five materials (1, 2, 3, 4, 5) are equal.
Alternative hypothesis
Ha: at least one of the means (u's) differs.
Model3<- aov(dat$t~dat$material,data = dat)
summary(Model3)
## Df Sum Sq Mean Sq F value Pr(>F)
## dat$material 4 103191489 25797872 6.191 0.00379 **
## Residuals 15 62505657 4167044
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The p-value from our model (p = 0.00379) is less than 0.05, so we reject the null hypothesis and conclude that at least one of the means (u's) differs.
Question 3.28 b
plot(Model3)
The points in the Normal Q-Q plot do not fall on a straight line, so the residuals do not appear to be normally distributed.
In the residuals vs fitted plot, the spread of the residuals changes with the fitted value, so the variance is not constant.
library(MASS)
boxcox(Model3)
Reading off the Box-Cox plot, the log-likelihood is maximized at a lambda value close to zero.
A lambda of zero corresponds to the log transformation, so we take the log of the data.
dat$time <- log(dat$t)
head(dat)
## material t time
## 1 1 110 4.7004804
## 2 1 157 5.0562458
## 3 1 194 5.2678582
## 4 1 178 5.1817836
## 5 2 1 0.0000000
## 6 2 2 0.6931472
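The lambda that was read off the Box-Cox plot can also be extracted numerically. The sketch below is one way to do it, assuming the `MASS` package is available; the finer lambda grid and the name `t_fail` are choices made here for illustration. `boxcox()` with `plotit = FALSE` returns the grid of lambdas (`x`) and the profile log-likelihood (`y`).

```r
library(MASS)

# Same data as in dat (failure times in minutes, 5 materials x 4 replicates)
material <- factor(rep(1:5, each = 4))
t_fail <- c(110,157,194,178,1,2,4,18,880,1256,5276,4355,
            495,7040,5307,10050,7,5,29,2)

# Profile log-likelihood over a fine grid of lambda values, without plotting
bc <- boxcox(t_fail ~ material, lambda = seq(-2, 2, by = 0.01), plotit = FALSE)

# Lambda with the highest log-likelihood
lambda_hat <- bc$x[which.max(bc$y)]

# Approximate 95% interval: lambdas whose log-likelihood lies within
# qchisq(0.95, 1) / 2 of the maximum (the band drawn on the Box-Cox plot)
lambda_ci <- range(bc$x[bc$y > max(bc$y) - qchisq(0.95, 1) / 2])

lambda_hat
lambda_ci
```

If zero falls inside the interval, the log transformation is a natural, easily interpreted choice.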
Model4<-aov(dat$time~dat$material,data = dat)
summary(Model4)
## Df Sum Sq Mean Sq F value Pr(>F)
## dat$material 4 165.06 41.26 37.66 1.18e-07 ***
## Residuals 15 16.44 1.10
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(Model4)
After using Box-Cox to choose a power transformation and applying it (here, the log transformation), the residual plots look much better.
The points in the Normal Q-Q plot of the residuals fall fairly close to a straight line, so the normality assumption appears reasonable.
In the residuals vs fitted plot, the residuals have fairly similar spread, so the constant-variance assumption appears reasonable.
The model is also highly significant, with a p-value of 1.18e-07.
Getting data
# 3 methods with 5 particle counts each
method <- rep(1:3, each = 5)
count <- c(31,10,21,4,1,62,40,24,30,35,53,27,120,97,68)
# Treat method as a factor; keep the count numeric
dat1 <- data.frame(method = factor(method), count = count)
Question 3.29 a
Let
u1 = mean count using method 1
u2 = mean count using method 2
u3 = mean count using method 3
Null hypothesis
H0: u1 = u2 = u3, that is, the three methods (1, 2, 3) have the same effect on mean particle count.
Alternative hypothesis
Ha: at least one of the three methods (1, 2, 3) has a different effect on mean particle count.
Model4<- aov(dat1$count~dat1$method,data = dat1)
summary(Model4)
## Df Sum Sq Mean Sq F value Pr(>F)
## dat1$method 2 8964 4482 7.914 0.00643 **
## Residuals 12 6796 566
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The p-value from our model (p = 0.00643) is less than 0.05, so we reject the null hypothesis and conclude that at least one of the means (u's) differs; the methods (1, 2, 3) do not all have the same effect on mean particle count.
Question 3.29 b
plot(Model4)
The points in the Normal Q-Q plot of the residuals fall fairly close to a straight line, so the normality assumption appears reasonable.
The constant-variance assumption is more doubtful: the within-method standard deviations are about 12.5, 14.6 and 36.5 for methods 1, 2 and 3, so the spread of the residuals grows with the fitted value.
Question 3.29 c
I would suggest applying a power transformation to the response so that appropriate conclusions can be drawn.
boxcox(Model4)
Based on the Box-Cox plot, the lambda value can be taken as 0.45.
Hence, we raise the data to the power 0.45.
dat1$new_count<-(dat1$count)^0.45
head(dat1)
## method count new_count
## 1 1 31 4.689351
## 2 1 10 2.818383
## 3 1 21 3.935489
## 4 1 4 1.866066
## 5 1 1 1.000000
## 6 2 62 6.405843
Model6<-aov(dat1$new_count~dat1$method,data = dat1)
summary(Model6)
## Df Sum Sq Mean Sq F value Pr(>F)
## dat1$method 2 37.17 18.59 9.887 0.0029 **
## Residuals 12 22.56 1.88
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(Model6)
After using Box-Cox to choose a power transformation and applying it (here, raising the counts to the power 0.45), the residual plots look better.
The points in the Normal Q-Q plot of the residuals fall fairly close to a straight line, so the normality assumption appears reasonable.
In the residuals vs fitted plot, the residuals have fairly similar spread, so the constant-variance assumption appears reasonable.
The model is also highly significant, with a p-value of 0.0029.
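A quick numerical companion to the residual plots is to compare the within-method standard deviations before and after the transformation. This is only a rough sketch: the max/min ratio used here is an informal yardstick chosen for illustration, not a formal test.

```r
# Same data as in dat1
method <- factor(rep(1:3, each = 5))
count <- c(31,10,21,4,1,62,40,24,30,35,53,27,120,97,68)

# Within-method standard deviations on the raw and the transformed scale
sd_raw <- tapply(count, method, sd)
sd_pow <- tapply(count^0.45, method, sd)

# Ratio of largest to smallest standard deviation; values nearer 1
# indicate more homogeneous spread across methods
ratio_raw <- max(sd_raw) / min(sd_raw)
ratio_pow <- max(sd_pow) / min(sd_pow)

round(rbind(raw = sd_raw, power_0.45 = sd_pow), 2)
round(c(ratio_raw = ratio_raw, ratio_pow = ratio_pow), 2)
```

On the raw scale the spread increases with the method mean; after the power transformation the ratio is noticeably smaller, which is what the residual vs fitted plot for `Model6` shows graphically.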
Use the Kruskal–Wallis test for the experiment in Problem 3.23. Compare the conclusions obtained with those from the usual analysis of variance.
kruskal.test(dat5$life~dat5$fluidtype,data = dat5)
##
## Kruskal-Wallis rank sum test
##
## data: dat5$life by dat5$fluidtype
## Kruskal-Wallis chi-squared = 6.2177, df = 3, p-value = 0.1015
The p-value from the Kruskal-Wallis test (p = 0.1015) is greater than 0.05, so we fail to reject the null hypothesis: there is not enough evidence that the life distributions of the fluids differ.
Comparing the Kruskal-Wallis result with the usual analysis of variance (ANOVA, p = 0.0525), both tests lead to the same conclusion: we fail to reject the null hypothesis that the fluids (1, 2, 3, 4) do not differ.
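For reference, the Kruskal-Wallis statistic can be reproduced from the ranks. The sketch below uses the standard formula with a tie correction (the value 16.9 occurs twice in the data); the variable names are chosen here for illustration.

```r
# Same data as in dat5
life <- c(17.6,18.9,16.3,17.4,20.1,21.6,16.9,15.3,18.6,17.1,19.5,20.3,
          21.4,23.6,19.4,18.5,20.5,22.3,19.3,21.1,16.9,17.5,18.3,19.8)
fluidtype <- factor(rep(1:4, each = 6))

N <- length(life)
r <- rank(life)                        # tied values receive their average rank

rank_sums <- tapply(r, fluidtype, sum) # sum of ranks within each fluid
n_i <- tapply(r, fluidtype, length)    # observations per fluid

# Uncorrected statistic: 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
h_raw <- 12 / (N * (N + 1)) * sum(rank_sums^2 / n_i) - 3 * (N + 1)

# Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N),
# where t is the size of each group of tied ranks
ties <- table(r)
h_stat <- h_raw / (1 - sum(ties^3 - ties) / (N^3 - N))

# Chi-squared approximation with (number of groups - 1) degrees of freedom
p_value <- pchisq(h_stat, df = nlevels(fluidtype) - 1, lower.tail = FALSE)

round(c(H = h_stat, p = p_value), 4)
```

The result should match the `kruskal.test` output above. Because the test works on ranks only, it does not rely on the normality assumption behind the ANOVA.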