Q3.23) The effective life of insulating fluids at an accelerated load of 35 kV is being studied. Test data have been obtained for four types of fluids. The results from a completely randomized experiment were as follows:
Getting data
ftype <- c(rep(1,6),rep(2,6),rep(3,6),rep(4,6))
obs <- c(17.6,18.9,16.3,17.4,20.1,21.6,16.9,15.3,18.6,17.1,19.5,20.3,21.4,23.6,19.4,18.5,20.5,22.3,19.3,21.1,16.9,17.5,18.3,19.8) 
dat1 <- cbind(ftype,obs)
dat1 <- as.data.frame(dat1)
dat1$ftype <- as.factor(dat1$ftype)
dat1$obs <- as.numeric(dat1$obs)
Q3.23)a) Is there any indication that the fluids differ? Use \(\alpha = 0.05\).
Hypothesis :
Null Hypothesis : \(H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4\)
Alternative Hypothesis : \(H_a\): at least one of the \(\mu_i\) differs
first_model <- aov(dat1$obs~dat1$ftype,data=dat1)
summary(first_model)
##             Df Sum Sq Mean Sq F value Pr(>F)  
## dat1$ftype   3  30.17   10.05   3.047 0.0525 .
## Residuals   20  65.99    3.30                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
 
Conclusion :- The P value 0.0525 is greater than 0.05, hence we fail to reject the null hypothesis: at the 5% level there is not enough evidence that the fluids differ. Note, however, that the P value is only slightly above 0.05, so the result is borderline and some \(\mu_i\) may still differ. For now we proceed with the conclusion that we fail to reject the null hypothesis.
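The F value in the table above can be reproduced by hand from the treatment and error sums of squares. The following is a minimal sketch, assuming the same data as above; the variable names (`ss_treat`, `ss_error`, `f_crit`, and so on) are illustrative and not part of the original analysis.

```r
# Fluid life data from Problem 3.23 (same values as above)
ftype <- factor(rep(1:4, each = 6))
obs <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
         16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
         21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
         19.3, 21.1, 16.9, 17.5, 18.3, 19.8)

a <- nlevels(ftype)   # number of treatments
N <- length(obs)      # total number of observations

grand_mean  <- mean(obs)
group_means <- as.numeric(tapply(obs, ftype, mean))
group_n     <- as.numeric(tapply(obs, ftype, length))

# Between-treatment and within-treatment sums of squares
ss_treat <- sum(group_n * (group_means - grand_mean)^2)
ss_error <- sum((obs - ave(obs, ftype))^2)  # ave() returns each observation's group mean

# F statistic, 5% critical value, and P value
f_stat  <- (ss_treat / (a - 1)) / (ss_error / (N - a))
f_crit  <- qf(0.95, df1 = a - 1, df2 = N - a)
p_value <- pf(f_stat, df1 = a - 1, df2 = N - a, lower.tail = FALSE)

c(F = f_stat, F_crit = f_crit, p = p_value)
```

The computed F statistic falls just below the 5% critical value (about 3.10 for 3 and 20 degrees of freedom), which is why the decision is so close.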
 
Q3.23)b) Which fluid would you select, given that the objective is long life?
Tukey_model1 <- TukeyHSD(first_model)
Tukey_model1
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = dat1$obs ~ dat1$ftype, data = dat1)
## 
## $`dat1$ftype`
##           diff         lwr       upr     p adj
## 2-1 -0.7000000 -3.63540073 2.2354007 0.9080815
## 3-1  2.3000000 -0.63540073 5.2354007 0.1593262
## 4-1  0.1666667 -2.76873407 3.1020674 0.9985213
## 3-2  3.0000000  0.06459927 5.9354007 0.0440578
## 4-2  0.8666667 -2.06873407 3.8020674 0.8413288
## 4-3 -2.1333333 -5.06873407 0.8020674 0.2090635
plot(Tukey_model1)

Answer:
Mean life for fluid 1 is 18.65 hours
Mean life for fluid 2 is 17.95 hours
Mean life for fluid 3 is 20.95 hours
Mean life for fluid 4 is 18.8167 hours
From the Tukey plot and the means above, fluid 3 has the highest mean life (20.95 hours), so we suggest choosing fluid 3 when the objective is long life. Note that only its difference from fluid 2 is significant at the 5% level.
We may also note that the overall F-test above narrowly failed to reject the null hypothesis (P value 0.0525). The Tukey comparisons, however, show that fluid 3 differs from fluid 2 (difference 3.00, adjusted P value 0.0441, and the 95% interval excludes zero). On that basis we revise our earlier decision and conclude that at least one \(\mu_i\) differs.
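The treatment means and the Tukey interval half-width quoted above can be checked directly. This is a sketch under the assumption of equal sample sizes (n = 6 per fluid); the names `fluid_means`, `half_width`, and `significant` are illustrative only.

```r
# Fluid life data from Problem 3.23 (same values as above)
ftype <- factor(rep(1:4, each = 6))
obs <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
         16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
         21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
         19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
fit <- aov(obs ~ ftype)

# Mean life per fluid type
fluid_means <- tapply(obs, ftype, mean)

# Tukey half-width for equal sample sizes: q(alpha; a, df_error) * sqrt(MSE / n)
n   <- 6
mse <- deviance(fit) / df.residual(fit)   # residual mean square
half_width <- qtukey(0.95, nmeans = 4, df = df.residual(fit)) * sqrt(mse / n)

# A pair of fluids differs at the 5% family-wise level when the absolute
# difference of their means exceeds the half-width
diffs <- outer(fluid_means, fluid_means, "-")
significant <- abs(diffs) > half_width

fluid_means
half_width
significant
```

With these data the only pair flagged is fluid 3 versus fluid 2, in agreement with the `TukeyHSD` output.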
 
Q3.23)c) Analyze the residuals from this experiment. Are the basic analysis of variance assumptions satisfied? There is nothing unusual in the residual plots.
plot(first_model)




Answer:- 
The normal probability plot of the residuals shows the points falling fairly close to a straight line, so the residuals appear approximately normally distributed.
The residuals versus fitted values plot shows the points spread in a roughly rectangular band, which indicates that the variance is constant.
Hence the ANOVA assumptions of normality and constant variance appear to be satisfied.
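The visual checks above can be supplemented with formal tests. The sketch below uses the Shapiro-Wilk test on the residuals and Bartlett's test for equal variances; neither test is part of the original analysis, and they are offered only as a cross-check of the plots.

```r
# Fluid life data from Problem 3.23 (same values as above)
ftype <- factor(rep(1:4, each = 6))
obs <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
         16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
         21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
         19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
fit <- aov(obs ~ ftype)

# Normality of residuals: a small P value would indicate departure from normality
sw <- shapiro.test(residuals(fit))

# Homogeneity of variance across fluids: a small P value would indicate unequal variances
bt <- bartlett.test(obs ~ ftype)

sw
bt

# Per-fluid sample variances, for a direct look at the spread
tapply(obs, ftype, var)
```

Bartlett's test is sensitive to non-normality, so it is best read together with the normal probability plot rather than on its own.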
 
 
 
Q3.28)
mat <- c(rep(1,4),rep(2,4),rep(3,4),rep(4,4),rep(5,4))
time <- c(110,157,194,178,1,2,4,18,880,1256,5276,4355,495,7040,5307,10050,7,5,29,2)
dat2 <- cbind(mat,time)
dat2 <- as.data.frame(dat2)
dat2$mat <- as.factor(as.character(dat2$mat))
dat2$time <- as.numeric(dat2$time)
str(dat2)
## 'data.frame':    20 obs. of  2 variables:
##  $ mat : Factor w/ 5 levels "1","2","3","4",..: 1 1 1 1 2 2 2 2 3 3 ...
##  $ time: num  110 157 194 178 1 ...
Q3.28)A) Do all five materials have the same effect on mean failure time?
Hypothesis :
Null Hypothesis : \(H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5\)
Alternative Hypothesis : \(H_a\): at least one of the \(\mu_i\) differs
second_model <- aov(dat2$time~dat2$mat,data = dat2)
summary(second_model)
##             Df    Sum Sq  Mean Sq F value  Pr(>F)   
## dat2$mat     4 103191489 25797872   6.191 0.00379 **
## Residuals   15  62505657  4167044                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
 
Conclusion :- The P value 0.00379 is less than 0.05. Hence we reject the null hypothesis and conclude that at least one \(\mu_i\) differs, i.e. the materials do not all have the same effect on mean failure time.
 
 
Q3.28)B) Residual plots
plot(second_model)




Conclusion :- We can see that
1) The normal probability plot of the residuals shows that the residuals are not normally distributed, as the points do not fall along a straight line.
2) The residuals versus fitted values plot shows that the variance is not constant: the spread of the points forms a funnel shape rather than a rectangular band.
I would recommend a Box-Cox transformation of the response, followed by refitting the model and rechecking the hypothesis test and the ANOVA assumptions.
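A minimal sketch of the recommended Box-Cox step follows, assuming the `MASS` package is installed. The data frame `d`, the lambda grid, and the names `lambda_hat` and `fit_log` are illustrative. The refit uses `log(time)`, the transform applied in the source listing at the end of this document, which corresponds to a Box-Cox lambda of 0.

```r
library(MASS)  # provides boxcox()

# Failure-time data from Problem 3.28 (same values as above)
d <- data.frame(
  mat  = factor(rep(1:5, each = 4)),
  time = c(110, 157, 194, 178, 1, 2, 4, 18, 880, 1256, 5276, 4355,
           495, 7040, 5307, 10050, 7, 5, 29, 2)
)

# Fit with y = TRUE so boxcox() can use the stored response without refitting
fit_raw <- lm(time ~ mat, data = d, y = TRUE)

# Profile log-likelihood over a grid of lambda values (grid choice is an assumption)
bc <- boxcox(fit_raw, lambda = seq(-1, 1, by = 0.05), plotit = FALSE)
lambda_hat <- bc$x[which.max(bc$y)]
lambda_hat

# Refit on the log scale and recheck the F-test and the residual plots
fit_log <- aov(log(time) ~ mat, data = d)
summary(fit_log)
p_log <- summary(fit_log)[[1]][["Pr(>F)"]][1]
# plot(fit_log)  # uncomment to inspect the residual plots of the transformed model
```

If the maximizing lambda is close to 0, the log transform is a natural and easily interpreted choice; the residual plots of the refitted model should then be checked again for the funnel pattern.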
 
 
Q3.29)
Getting data
method <- c(rep(1,5),rep(2,5),rep(3,5))
count <- c(31,10,21,4,1,62,40,24,30,35,53,27,120,97,68)
dat3 <- cbind(method,count)
dat3 <- as.data.frame(dat3)
dat3$method <- as.factor(as.character(dat3$method))
dat3$count <- as.numeric(dat3$count)
str(dat3)
## 'data.frame':    15 obs. of  2 variables:
##  $ method: Factor w/ 3 levels "1","2","3": 1 1 1 1 1 2 2 2 2 2 ...
##  $ count : num  31 10 21 4 1 62 40 24 30 35 ...
 
Q3.29)a) Do all methods have the same effect on mean particle count?
third_model <- aov(dat3$count~dat3$method,data = dat3)
summary(third_model)
##             Df Sum Sq Mean Sq F value  Pr(>F)   
## dat3$method  2   8964    4482   7.914 0.00643 **
## Residuals   12   6796     566                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Conclusion :- The P value 0.00643 is less than 0.05. Hence we reject the null hypothesis and conclude that at least one \(\mu_i\) differs, i.e. the methods do not all have the same effect on mean particle count.
 
Q3.29)B) Residual plots
plot(third_model)




Conclusion :- We can see that
1) The normal probability plot of the residuals shows that the residuals are approximately normally distributed, as the points fall fairly close to a straight line.
2) The residuals versus fitted values plot shows that the variance is not constant: the spread of the points forms a funnel shape rather than a rectangular band.
I would recommend a Box-Cox transformation of the response, followed by refitting the model and rechecking the hypothesis test and the ANOVA assumptions.
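As a complement to Box-Cox, a power transformation can also be chosen empirically: if the group standard deviation is proportional to a power of the group mean, the slope of log(sd) on log(mean) suggests a transformation exponent of 1 minus that slope. The sketch below applies this to the particle counts; it is an illustrative alternative, not the method used in the original analysis, and all variable names are assumptions.

```r
# Particle-count data from Problem 3.29 (same values as above)
method <- factor(rep(1:3, each = 5))
count  <- c(31, 10, 21, 4, 1, 62, 40, 24, 30, 35, 53, 27, 120, 97, 68)

# Per-method mean and standard deviation
grp_mean <- as.numeric(tapply(count, method, mean))
grp_sd   <- as.numeric(tapply(count, method, sd))

# Slope of log(sd) against log(mean) estimates the power relating spread to level
slope_fit  <- lm(log(grp_sd) ~ log(grp_mean))
alpha_hat  <- unname(coef(slope_fit)[2])
lambda_hat <- 1 - alpha_hat   # suggested exponent for count^lambda
lambda_hat

# Refit with the rounded exponent and recheck the F-test and residual plots
lambda_used <- round(lambda_hat, 1)
fit_pow <- aov(I(count^lambda_used) ~ method)
summary(fit_pow)
# plot(fit_pow)  # uncomment to inspect the residual plots of the transformed model
```

With only three groups the slope estimate is rough, so the suggested exponent is best treated as a starting point. Here it rounds to 0.4, the exponent used in the source listing at the end of this document.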
 
 
Q3.51) Use the Kruskal–Wallis test for the experiment in Problem 3.23. Compare the conclusions obtained with those from the usual analysis of variance.
kruskal.test(dat1$obs~dat1$ftype,data = dat1)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  dat1$obs by dat1$ftype
## Kruskal-Wallis chi-squared = 6.2177, df = 3, p-value = 0.1015
Answer : The P value 0.1015 is greater than 0.05, hence we fail to reject the null hypothesis and cannot conclude that the fluids differ.
This matches the conclusion from the ANOVA F-test (P value 0.0525): we fail to reject the null hypothesis with both methods.
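The Kruskal-Wallis statistic reported above can be reproduced from the ranks. The sketch below computes H with the standard correction for ties (the value 16.9 occurs twice in these data); the variable names are illustrative.

```r
# Fluid life data from Problem 3.23 (same values as above)
ftype <- factor(rep(1:4, each = 6))
obs <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
         16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
         21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
         19.3, 21.1, 16.9, 17.5, 18.3, 19.8)

N <- length(obs)
r <- rank(obs)   # tied observations receive the average of their ranks

rank_sums <- as.numeric(tapply(r, ftype, sum))
n_i       <- as.numeric(tapply(r, ftype, length))

# Uncorrected statistic: H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
H <- 12 / (N * (N + 1)) * sum(rank_sums^2 / n_i) - 3 * (N + 1)

# Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N), t = size of each tie group
ties  <- as.numeric(table(obs))
H_adj <- H / (1 - sum(ties^3 - ties) / (N^3 - N))

# Chi-squared approximation with a - 1 degrees of freedom
p_value <- pchisq(H_adj, df = nlevels(ftype) - 1, lower.tail = FALSE)

c(H = H_adj, p = p_value)
```

The result should agree with the `kruskal.test` output above, since that function applies the same tie correction.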
 
——————————– End ————————————
Source Code
library(dplyr)
library(MASS)
ftype <- c(rep(1,6),rep(2,6),rep(3,6),rep(4,6))
obs <- c(17.6,18.9,16.3,17.4,20.1,21.6,16.9,15.3,18.6,17.1,19.5,20.3,21.4,23.6,19.4,18.5,20.5,22.3,19.3,21.1,16.9,17.5,18.3,19.8) 
dat1 <- cbind(ftype,obs)
dat1 <- as.data.frame(dat1)
dat1$ftype <- as.factor(dat1$ftype)
dat1$obs <- as.numeric(dat1$obs)
first_model <- aov(dat1$obs~dat1$ftype,data=dat1)
summary(first_model)
Tukey_model1 <- TukeyHSD(first_model)
Tukey_model1
plot(Tukey_model1)
plot(first_model)
mat <- c(rep(1,4),rep(2,4),rep(3,4),rep(4,4),rep(5,4))
time <- c(110,157,194,178,1,2,4,18,880,1256,5276,4355,495,7040,5307,10050,7,5,29,2)
dat2 <- cbind(mat,time)
dat2 <- as.data.frame(dat2)
dat2$mat <- as.factor(as.character(dat2$mat))
dat2$time <- as.numeric(dat2$time)
str(dat2)
second_model <- aov(dat2$time~dat2$mat,data = dat2)
summary(second_model)
plot(second_model)
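# Box-Cox profile likelihood for the failure-time model
boxcox(second_model)
# Log transform of the response (corresponds to Box-Cox lambda = 0)
dat2$time_transformed <- log(dat2$time)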
head(dat2)
second_model2 <- aov(dat2$time_transformed~dat2$mat,data = dat2)
summary(second_model2)
plot(second_model2)
method <- c(rep(1,5),rep(2,5),rep(3,5))
count <- c(31,10,21,4,1,62,40,24,30,35,53,27,120,97,68)
dat3 <- cbind(method,count)
dat3 <- as.data.frame(dat3)
dat3$method <- as.factor(as.character(dat3$method))
dat3$count <- as.numeric(dat3$count)
str(dat3)
third_model <- aov(dat3$count~dat3$method,data = dat3)
summary(third_model)
plot(third_model)
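# Box-Cox profile likelihood for the particle-count model
boxcox(third_model)
# Power transform of the response with exponent 0.4
dat3$count_transformed <- (dat3$count)^0.4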
head(dat3)
third_model2 <- aov(dat3$count_transformed~dat3$method,data = dat3)
summary(third_model2)
plot(third_model2)
kruskal.test(dat1$obs~dat1$ftype,data = dat1)