Question 3.23

Getting the data we have

# Four fluid types with six life observations (in h) each
fluidtype <- rep(1:4, each = 6)
life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
          16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
          19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
# Store fluid type as a factor so that aov() fits a one-way layout
dat5 <- data.frame(fluidtype = as.factor(fluidtype), life = life)

Question 3a

let

u1= mean of life (in h) at 35 kV load for fluid type 1

u2= mean of life (in h) at 35 kV load for fluid type 2

u3= mean of life (in h) at 35 kV load for fluid type 3

u4= mean of life (in h) at 35 kV load for fluid type 4

Null hypothesis is that

H0: u1 = u2 = u3 = u4, that is, the mean lives of all four fluid types (1, 2, 3, 4) are equal.

Alternative hypothesis

Ha: At least one of the means (u’s) differs from the others.

Model<- aov(dat5$life~dat5$fluidtype,data=dat5)
summary(Model)
##                Df Sum Sq Mean Sq F value Pr(>F)  
## dat5$fluidtype  3  30.17   10.05   3.047 0.0525 .
## Residuals      20  65.99    3.30                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Since the p-value (0.0525) from our model is greater than our reference value of 0.05, we fail to reject the null hypothesis: at the 5% level there is not enough evidence to conclude that the mean lives of the fluids (1, 2, 3, 4) differ.

Note, however, that the p-value is only slightly above 0.05, so the result is borderline; the null hypothesis would be rejected at the 10% level.
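The F value and p-value in the ANOVA table can be reproduced from the sums of squares. The following is a minimal sketch, assuming the same data as above; the names `fitted_means`, `ss_treat`, `ss_error`, `f_stat`, and `p_value` are introduced here for illustration only.

```r
# Problem 3.23 data, rebuilt here so the sketch runs on its own.
fluidtype <- factor(rep(1:4, each = 6))
life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
          16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
          19.3, 21.1, 16.9, 17.5, 18.3, 19.8)

a <- nlevels(fluidtype)   # number of treatments
N <- length(life)         # total number of observations

# ave() returns, for every observation, the mean of its own group.
fitted_means <- ave(life, fluidtype)

ss_treat <- sum((fitted_means - mean(life))^2)  # between-fluid sum of squares
ss_error <- sum((life - fitted_means)^2)        # within-fluid sum of squares

# F is the ratio of the two mean squares; the p-value is the upper tail
# of the F distribution with (a - 1, N - a) degrees of freedom.
f_stat  <- (ss_treat / (a - 1)) / (ss_error / (N - a))
p_value <- pf(f_stat, df1 = a - 1, df2 = N - a, lower.tail = FALSE)

round(c(ss_treat = ss_treat, ss_error = ss_error, F = f_stat, p = p_value), 4)
```

The rounded values should agree with the `summary(Model)` table.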

Question 3.23 b

Which fluid would you select, given that the objective is long life?

Model2<- TukeyHSD(Model)
Model2
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = dat5$life ~ dat5$fluidtype, data = dat5)
## 
## $`dat5$fluidtype`
##           diff         lwr       upr     p adj
## 2-1 -0.7000000 -3.63540073 2.2354007 0.9080815
## 3-1  2.3000000 -0.63540073 5.2354007 0.1593262
## 4-1  0.1666667 -2.76873407 3.1020674 0.9985213
## 3-2  3.0000000  0.06459927 5.9354007 0.0440578
## 4-2  0.8666667 -2.06873407 3.8020674 0.8413288
## 4-3 -2.1333333 -5.06873407 0.8020674 0.2090635
plot(Model2)

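Based on the Tukey comparisons and the plot above, we would select fluid type 3 when the objective is long life. Its average life is higher than that of the other three fluids: the estimated differences are 3-1 = 2.30 h, 3-2 = 3.00 h, and 4-3 = -2.13 h. Only the difference between fluids 3 and 2 is significant at the 5% family-wise level (p adj = 0.044), so the preference for fluid 3 over fluids 1 and 4 rests on the sample means rather than on a significant difference.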
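The Tukey intervals can be checked by hand: with equal group sizes, two means differ significantly when their gap exceeds a single critical difference. A rough sketch under that assumption follows; `hsd`, `group_means`, and `diffs` are illustrative names, not part of the original analysis.

```r
# Problem 3.23 data, rebuilt here so the sketch runs on its own.
fluidtype <- factor(rep(1:4, each = 6))
life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
          16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
          19.3, 21.1, 16.9, 17.5, 18.3, 19.8)

a <- nlevels(fluidtype)   # number of treatments
N <- length(life)         # total number of observations
n <- N / a                # observations per fluid (balanced design)

# Error mean square from the one-way ANOVA.
mse <- sum((life - ave(life, fluidtype))^2) / (N - a)

# Critical difference: studentized-range quantile times the standard
# error of a single group mean.
hsd <- qtukey(0.95, nmeans = a, df = N - a) * sqrt(mse / n)

# Group means, largest first.
group_means <- sort(tapply(life, fluidtype, mean), decreasing = TRUE)
group_means
hsd

# Pairs whose absolute difference in means exceeds the critical difference.
diffs <- abs(outer(group_means, group_means, "-"))
which(diffs > hsd & upper.tri(diffs), arr.ind = TRUE)
```

The value of `hsd` should equal the half-width of every interval in the `TukeyHSD` output, since all groups have the same size.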

3.23 c

plot(Model)

3.23 (c) Analyze the residuals from this experiment. Are the basic analysis of variance assumptions satisfied?

The residuals appear approximately normally distributed, because the points fall fairly close to a straight line on the residual Normal Q-Q plot.

From the residuals vs fitted plot, we see that the residuals have about the same spread/dispersion in each group. Hence the constant-variance assumption appears to be satisfied.
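The visual checks can be supplemented with formal tests. The sketch below is one possible approach, not part of the original solution: it applies the Shapiro-Wilk test to the residuals and Bartlett's test to the group variances, both from base R's `stats` package. The object names `fit`, `res`, `shapiro`, and `bartlett` are illustrative.

```r
# Problem 3.23 data, rebuilt here so the sketch runs on its own.
dat5 <- data.frame(
  fluidtype = factor(rep(1:4, each = 6)),
  life = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
           16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
           21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
           19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
)

fit <- aov(life ~ fluidtype, data = dat5)
res <- residuals(fit)

# Normality of the residuals (null hypothesis: residuals are normal).
shapiro <- shapiro.test(res)

# Homogeneity of variance across fluids (null hypothesis: equal variances).
bartlett <- bartlett.test(life ~ fluidtype, data = dat5)

shapiro
bartlett
```

Large p-values here would be consistent with the reading of the plots; with only six observations per fluid, both tests have limited power, so the plots remain the primary diagnostic.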

Question 3.28 a

# Five materials with four failure-time observations (in minutes) each
material <- rep(1:5, each = 4)
t <- c(110, 157, 194, 178,
       1, 2, 4, 18,
       880, 1256, 5276, 4355,
       495, 7040, 5307, 10050,
       7, 5, 29, 2)
# Store material as a factor so that aov() fits a one-way layout
dat <- data.frame(material = as.factor(material), t = t)

let

u1= mean failure time in minutes of material 1

u2= mean failure time in minutes of material 2

u3= mean failure time in minutes of material 3

u4= mean failure time in minutes of material 4

u5= mean failure time in minutes of material 5

Null hypothesis is that

H0: u1 = u2 = u3 = u4 = u5, that is, the mean failure times of all five material types (1, 2, 3, 4, 5) are equal.

Alternative hypothesis

Ha: At least one of the means (u’s) differs from the others.

Model3<- aov(dat$t~dat$material,data = dat)
summary(Model3)
##              Df    Sum Sq  Mean Sq F value  Pr(>F)   
## dat$material  4 103191489 25797872   6.191 0.00379 **
## Residuals    15  62505657  4167044                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The p-value from our model (p = 0.00379) is less than 0.05, so we reject the null hypothesis and conclude that at least one of the means (u’s) differs.

Question 3.28 b

plot(Model3)

3.28 (b) Plot the residuals versus the predicted response. Construct a normal probability plot of the residuals. What information is conveyed by these plots?

The residuals do not appear to be normally distributed, because the points depart clearly from a straight line on the Normal Q-Q plot.

From the residuals vs fitted plot, we see that the spread/dispersion of the residuals changes with the fitted value. Hence the variance is not constant.
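The non-constant variance can also be seen numerically by comparing each material's standard deviation with its mean. The sketch below additionally uses the common empirical rule that if the standard deviation is roughly proportional to the mean raised to a power alpha, then lambda = 1 - alpha is a candidate variance-stabilizing power. The variable `failure_time` is named here for illustration (the original stores it as `t`), as are `grp_mean`, `grp_sd`, `slope`, and `lambda_guess`.

```r
# Problem 3.28 data, rebuilt here so the sketch runs on its own.
material <- factor(rep(1:5, each = 4))
failure_time <- c(110, 157, 194, 178,
                  1, 2, 4, 18,
                  880, 1256, 5276, 4355,
                  495, 7040, 5307, 10050,
                  7, 5, 29, 2)

# Mean and standard deviation of failure time within each material.
grp_mean <- tapply(failure_time, material, mean)
grp_sd   <- tapply(failure_time, material, sd)
cbind(mean = grp_mean, sd = grp_sd)

# Slope of log(sd) on log(mean) estimates alpha in sd ~ mean^alpha.
slope <- coef(lm(log(grp_sd) ~ log(grp_mean)))[[2]]

# Candidate power for a variance-stabilizing transformation.
lambda_guess <- 1 - slope
c(slope = slope, lambda_guess = lambda_guess)
```

A slope near 1 gives a `lambda_guess` near 0, which points toward a log transformation.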

Question 3.28 c

library(MASS)
boxcox(Model3)

Reading off the Box-Cox plot, we can see that the lambda value that maximizes the log-likelihood is close to zero.

Therefore we take the log transformation of the data.
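Rather than reading lambda off the graph, the maximizing value and an approximate 95% interval can be extracted from the object that `boxcox()` returns. This is a small sketch, assuming a finer lambda grid than the default; `bc`, `lambda_hat`, and `ci` are illustrative names.

```r
library(MASS)

# Problem 3.28 data, rebuilt here so the sketch runs on its own.
dat <- data.frame(
  material = factor(rep(1:5, each = 4)),
  t = c(110, 157, 194, 178,
        1, 2, 4, 18,
        880, 1256, 5276, 4355,
        495, 7040, 5307, 10050,
        7, 5, 29, 2)
)

# Profile log-likelihood over a grid of lambda values, without plotting.
bc <- boxcox(t ~ material, data = dat,
             lambda = seq(-1, 1, by = 0.01), plotit = FALSE)

# Grid value of lambda with the highest log-likelihood.
lambda_hat <- bc$x[which.max(bc$y)]

# Approximate 95% interval: lambdas whose log-likelihood lies within
# qchisq(0.95, 1) / 2 of the maximum (truncated at the grid limits).
ci <- range(bc$x[bc$y > max(bc$y) - qchisq(0.95, df = 1) / 2])

lambda_hat
ci
```

If zero lies inside `ci`, the log transformation is a reasonable and easily interpreted choice.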

dat$time <- log(dat$t) 
head(dat)
##   material   t      time
## 1        1 110 4.7004804
## 2        1 157 5.0562458
## 3        1 194 5.2678582
## 4        1 178 5.1817836
## 5        2   1 0.0000000
## 6        2   2 0.6931472
Model4<-aov(dat$time~dat$material,data = dat)
summary(Model4)
##              Df Sum Sq Mean Sq F value   Pr(>F)    
## dat$material  4 165.06   41.26   37.66 1.18e-07 ***
## Residuals    15  16.44    1.10                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(Model4)

After applying the log transformation suggested by the Box-Cox plot, the residuals appear approximately normally distributed: the points fall fairly close to a straight line on the residual Normal Q-Q plot.

Also, from the residuals vs fitted plot, we see that the residuals have fairly the same spread/dispersion. Hence the variance is approximately constant.

The model is also highly significant, with a p-value of 1.18e-07.

Question 3.29

a

Getting data

# Three methods with five particle counts each
method <- rep(1:3, each = 5)
count <- c(31, 10, 21, 4, 1,
           62, 40, 24, 30, 35,
           53, 27, 120, 97, 68)
# Store method as a factor so that aov() fits a one-way layout
dat1 <- data.frame(method = as.factor(method), count = count)

let

u1= mean count using method 1

u2= mean count using method 2

u3= mean count using method 3

Null hypothesis is that

H0: u1 = u2 = u3, that is, the three methods (1, 2, 3) have the same mean particle count.

Alternative hypothesis

Ha: At least one of the three methods (1, 2, 3) has a different mean particle count.

Model4<- aov(dat1$count~dat1$method,data = dat1)
summary(Model4)
##             Df Sum Sq Mean Sq F value  Pr(>F)   
## dat1$method  2   8964    4482   7.914 0.00643 **
## Residuals   12   6796     566                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The p-value from our model (p = 0.00643) is less than 0.05, so we reject the null hypothesis and conclude that at least one of the means (u’s) differs; the methods (1, 2, 3) do not all have the same mean particle count.

Question 3.29 b

plot(Model4)

The residuals appear approximately normally distributed, because the points fall fairly close to a straight line on the residual Normal Q-Q plot.

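From the residuals vs fitted plot, the spread of the residuals should be compared across the three fitted values. The within-method standard deviations of the counts are about 12.5, 14.6, and 36.5 for methods 1, 2, and 3, so the dispersion grows with the mean and the constant-variance assumption is questionable.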

Question 3.29 c

I would suggest applying a power transformation to the counts so that appropriate conclusions can be drawn.

boxcox(Model4)

Based on the Box-Cox plot, the lambda value can be taken as about 0.45.

Hence, we transform the data by raising the counts to the power 0.45.
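One way to see what the power transformation does is to compare how unequal the within-method standard deviations are on several scales. The sketch below is illustrative only: the helper `sd_ratio()` and the vector `ratios` are names introduced here, and the square-root and log scales are included purely for comparison.

```r
# Problem 3.29 data, rebuilt here so the sketch runs on its own.
method <- factor(rep(1:3, each = 5))
count <- c(31, 10, 21, 4, 1,
           62, 40, 24, 30, 35,
           53, 27, 120, 97, 68)

# Hypothetical helper: ratio of the largest to the smallest
# within-method standard deviation (1 would mean equal spread).
sd_ratio <- function(y, g) {
  s <- tapply(y, g, sd)
  max(s) / min(s)
}

ratios <- c(
  raw        = sd_ratio(count, method),
  sqrt       = sd_ratio(sqrt(count), method),
  power_0.45 = sd_ratio(count^0.45, method),
  log        = sd_ratio(log(count), method)
)
round(ratios, 2)
```

A ratio closer to 1 indicates more homogeneous spread, so a smaller value for the 0.45 power than for the raw counts supports the transformation chosen from the Box-Cox plot.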

dat1$new_count<-(dat1$count)^0.45
head(dat1)
##   method count new_count
## 1      1    31  4.689351
## 2      1    10  2.818383
## 3      1    21  3.935489
## 4      1     4  1.866066
## 5      1     1  1.000000
## 6      2    62  6.405843
Model6<-aov(dat1$new_count~dat1$method,data = dat1)
summary(Model6)
##             Df Sum Sq Mean Sq F value Pr(>F)   
## dat1$method  2  37.17   18.59   9.887 0.0029 **
## Residuals   12  22.56    1.88                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(Model6)

After applying the power transformation suggested by the Box-Cox plot (count raised to the power 0.45), the residuals appear approximately normally distributed: the points fall fairly close to a straight line on the residual Normal Q-Q plot.

Also, from the residuals vs fitted plot, we see that the residuals have fairly the same spread/dispersion. Hence the variance is approximately constant.

The model is also highly significant, with a p-value of 0.0029.

Question 3.51

Use the Kruskal–Wallis test for the experiment in Problem 3.23. Compare the conclusions obtained with those from the usual analysis of variance.

kruskal.test(dat5$life~dat5$fluidtype,data = dat5)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  dat5$life by dat5$fluidtype
## Kruskal-Wallis chi-squared = 6.2177, df = 3, p-value = 0.1015

The p-value from the Kruskal-Wallis test (p = 0.1015) is greater than 0.05, so we fail to reject the null hypothesis: there is not enough evidence that the life distributions of the four fluids differ.

Comparing the two analyses, both the ANOVA (p = 0.0525) and the Kruskal-Wallis test (p = 0.1015) fail to reject the null hypothesis at the 5% level.
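The Kruskal-Wallis statistic can be reproduced from the ranks of the observations, which makes clear that the test uses only the ordering of the data. The sketch below uses midranks and the usual correction for ties (the value 16.9 occurs twice); `H`, `H_adj`, and `p_value` are illustrative names.

```r
# Problem 3.23 data, rebuilt here so the sketch runs on its own.
fluidtype <- factor(rep(1:4, each = 6))
life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
          16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
          19.3, 21.1, 16.9, 17.5, 18.3, 19.8)

N   <- length(life)
r   <- rank(life)                      # midranks are assigned to ties
R_i <- tapply(r, fluidtype, sum)       # rank sum for each fluid
n_i <- tapply(r, fluidtype, length)    # sample size for each fluid

# Uncorrected Kruskal-Wallis statistic.
H <- 12 / (N * (N + 1)) * sum(R_i^2 / n_i) - 3 * (N + 1)

# Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N),
# where t is the size of each group of tied values.
ties  <- table(life)
H_adj <- H / (1 - sum(ties^3 - ties) / (N^3 - N))

# Chi-squared approximation with (number of groups - 1) degrees of freedom.
p_value <- pchisq(H_adj, df = nlevels(fluidtype) - 1, lower.tail = FALSE)

c(H_adj = H_adj, p = p_value)

# Cross-check against the built-in test.
kw <- kruskal.test(life, fluidtype)
all.equal(H_adj, unname(kw$statistic))
```

The hand-computed statistic and p-value should match the `kruskal.test` output shown above.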

Question 3.52

Comparing the Kruskal-Wallis test results with those obtained from the usual analysis of variance (ANOVA), we come to the same conclusion: we fail to reject the null hypothesis, so there is not enough evidence that the fluids (1, 2, 3, 4) differ.