IE5342_HW_Week

############################### Question No: 3.23 #####################

Data entry and construction of the data frame:

T1_Fluid<-c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6)
T2_Fluid<-c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3)
T3_Fluid<-c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3)
T4_Fluid<-c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)

dat<-cbind.data.frame(T1_Fluid, T2_Fluid, T3_Fluid, T4_Fluid)

Answer to the Question No: 3.23(a)

library(tidyr)
dat1 <- pivot_longer(data = dat, c(T1_Fluid, T2_Fluid, T3_Fluid, T4_Fluid))
colnames(dat1)<- c("Fluid Types", "Life 35KV")
dat1$`Fluid Types`<-as.factor(dat1$`Fluid Types`)
dat1$`Life 35KV`<- as.numeric(dat1$`Life 35KV`)
model_1 <- aov(dat1$`Life 35KV`~dat1$`Fluid Types`,data=dat1)
summary(model_1)

##                    Df Sum Sq Mean Sq F value Pr(>F)  
## dat1$`Fluid Types`  3  30.16   10.05   3.047 0.0525 .
## Residuals          20  65.99    3.30                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Comments: Hypothesis statement: null hypothesis H0: u1=u2=u3=u4; alternative hypothesis Ha: at least one u(i) differs, where u(i) is the mean life of fluid type i (i=1,2,3,4). Since the P-value (0.0525) > 0.05, we fail to reject H0 at alpha=0.05: the data do not show a significant difference among the fluid types. The P-value is only slightly above 0.05, however, so the evidence is borderline and a real difference cannot be ruled out.
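The decision above can be cross-checked from the printed ANOVA table alone. The sketch below is an illustration, not part of the original analysis: it takes the F value and degrees of freedom from the table (the variable names are chosen here) and recomputes the P-value and the critical F value at alpha=0.05.

```r
# Values read from the ANOVA table above
f_obs    <- 3.047   # observed F statistic
df_treat <- 3       # degrees of freedom for fluid types
df_error <- 20      # residual degrees of freedom
alpha    <- 0.05

# Upper-tail probability of the F distribution gives the P-value
p_value <- pf(f_obs, df_treat, df_error, lower.tail = FALSE)

# Critical value: reject H0 only if f_obs exceeds it
f_crit <- qf(1 - alpha, df_treat, df_error)
reject_h0 <- f_obs > f_crit

print(c(p_value = p_value, f_crit = f_crit))
print(reject_h0)
```

Comparing the F statistic with its critical value and comparing the P-value with alpha are equivalent rules, so both should lead to the same decision.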

Answer to the Question No: 3.23(b)

library(agricolae)

## Warning: package 'agricolae' was built under R version 4.1.3

LSD.test(model_1,"dat1$`Fluid Types`",console = TRUE)

## 
## Study: model_1 ~ "dat1$`Fluid Types`"
## 
## LSD t Test for dat1$`Life 35KV` 
## 
## Mean Square Error:  3.299667 
## 
## dat1$`Fluid Types`,  means and individual ( 95 %) CI
## 
##          dat1..Life.35KV.      std r      LCL      UCL  Min  Max
## T1_Fluid         18.65000 1.952178 6 17.10309 20.19691 16.3 21.6
## T2_Fluid         17.95000 1.854454 6 16.40309 19.49691 15.3 20.3
## T3_Fluid         20.95000 1.879096 6 19.40309 22.49691 18.5 23.6
## T4_Fluid         18.81667 1.554885 6 17.26975 20.36358 16.9 21.1
## 
## Alpha: 0.05 ; DF Error: 20
## Critical Value of t: 2.085963 
## 
## least Significant Difference: 2.187666 
## 
## Treatments with the same letter are not significantly different.
## 
##          dat1$`Life 35KV` groups
## T3_Fluid         20.95000      a
## T4_Fluid         18.81667     ab
## T1_Fluid         18.65000      b
## T2_Fluid         17.95000      b

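Comments: According to the LSD test, T3_Fluid (group a) has the largest mean life at 35 kV and differs significantly from T1_Fluid and T2_Fluid (group b). It does not differ significantly from T4_Fluid (group ab): the difference in means, 20.95 - 18.82 = 2.13, is below the least significant difference of 2.19. T1_Fluid, T2_Fluid, and T4_Fluid do not differ significantly from one another.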
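The letter groups reported by LSD.test can be reproduced by hand. The following base-R sketch assumes equal group sizes (n = 6), so the mean square error is the average of the group variances; the object names are chosen here for illustration and are not from the original script.

```r
fluids <- list(
  T1 = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6),
  T2 = c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3),
  T3 = c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3),
  T4 = c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
)
n        <- 6                          # observations per fluid
means    <- sapply(fluids, mean)
mse      <- mean(sapply(fluids, var))  # pooled variance (equal n only)
df_error <- length(fluids) * (n - 1)

# Least significant difference: t(alpha/2, df_error) * sqrt(2 * MSE / n)
lsd <- qt(0.975, df_error) * sqrt(2 * mse / n)

# Absolute pairwise differences between group means
diffs <- abs(outer(means, means, "-"))
signif_pairs <- diffs > lsd            # TRUE where two fluids differ

print(lsd)
print(round(diffs, 3))
print(signif_pairs)
```

A pair of fluids is declared different only when its entry in `diffs` exceeds `lsd`, which is how the letter groups in the output above are formed.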

Answer to the Question No: 3.23(c)

# pivot_longer() interleaves the rows (T1, T2, T3, T4, T1, ...), so group by the factor column
boxplot(`Life 35KV`~`Fluid Types`,data=dat1,xlab="Fluid Types",ylab="Life 35KV",main="Boxplot")

meanx<-c(rep(mean(T1_Fluid),6),rep(mean(T2_Fluid),6),rep(mean(T3_Fluid),6),rep(mean(T4_Fluid),6))
Type<-c(T1_Fluid,T2_Fluid,T3_Fluid,T4_Fluid)
res<-Type-meanx
qqnorm(res)
qqline(res)

plot(meanx,res,xlab="Fitted value (group mean)",ylab="Residual",
     main="Constant Variance Checking Plot")

Comments: In the plot of residuals against fitted values, the residuals show roughly the same spread for every fluid type, so the constant-variance assumption appears reasonable and the residual plots reveal no anomalies. In the normal Q-Q plot the residuals fall close to the reference line, so the normality assumption also appears reasonable.
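The residuals built by hand above can also be taken directly from a fitted model. The sketch below is one possible way to do this in base R; it reshapes the data with stack() (whose default column names are `values` and `ind`) and adds Bartlett's test and the Shapiro-Wilk test as numeric companions to the plots. Neither test is part of the original answer; they are included here only as an optional cross-check.

```r
fluid_life <- data.frame(
  T1_Fluid = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6),
  T2_Fluid = c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3),
  T3_Fluid = c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3),
  T4_Fluid = c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
)
long <- stack(fluid_life)   # long format: 'values' and factor 'ind'

fit       <- aov(values ~ ind, data = long)
res       <- residuals(fit)   # observation minus its group mean
fit_vals  <- fitted(fit)      # group mean for each observation

# Same residuals computed manually, as in the answer above
manual_res <- long$values - ave(long$values, long$ind)

# Optional numeric checks of the two assumptions
equal_var <- bartlett.test(values ~ ind, data = long)
normality <- shapiro.test(res)

print(equal_var$p.value)
print(normality$p.value)
```

With these objects, `qqnorm(res); qqline(res)` and `plot(fit_vals, res)` would give the same two diagnostic plots without building the group means by hand.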

######################## Question No: 3.28 #############################

M_1 <- c(110,157,194,178)
M_2 <- c(1,2,4,18)
M_3 <- c(880,1256,5276,4355)
M_4 <- c(495,7040,5307,10050)
M_5<- c(7,5,29,2)
dat <- data.frame(M_1,M_2,M_3,M_4,M_5)

Answer to the Question No: 3.28(a)

library(tidyr)
library(dplyr)

## Warning: package 'dplyr' was built under R version 4.1.3

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

dat1 <- pivot_longer(dat,c(M_1,M_2,M_3,M_4,M_5))
colnames(dat1)<-c("Materials","Failure Time")
dat1$Materials <- as.factor(dat1$Materials)
dat1$`Failure Time` <- as.numeric(dat1$`Failure Time`)
model_2 <- aov(dat1$`Failure Time`~dat1$Materials,data=dat1)
summary(model_2)

##                Df    Sum Sq  Mean Sq F value  Pr(>F)   
## dat1$Materials  4 103191489 25797872   6.191 0.00379 **
## Residuals      15  62505657  4167044                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Comments: Hypothesis statement: null hypothesis H0: u1=u2=u3=u4=u5; alternative hypothesis Ha: at least one u(i) differs, where u(i) is the mean failure time of material i (i=1,2,3,4,5). Since the P-value (0.00379) < 0.05, we reject H0 at alpha=0.05. No, the five materials do not all have the same effect on mean failure time; at least one material is different.

Answer to the Question No: 3.28(b)

# pivot_longer() interleaves the rows (M_1, ..., M_5, M_1, ...), so group by the factor column
boxplot(`Failure Time`~Materials,data=dat1,xlab="Materials",ylab="Failure Time",main="Boxplot")

meanx<-c(rep(mean(M_1),4),rep(mean(M_2),4),rep(mean(M_3),4),rep(mean(M_4),4),rep(mean(M_5),4))
Materials<-c(M_1,M_2,M_3,M_4,M_5)
res<-Materials-meanx
qqnorm(res)
qqline(res)

plot(meanx,res,xlab="Fitted value (group mean)",ylab="Residual",
     main="Constant Variance Checking Plot")

Comments: The plot of residuals against fitted values shows that the variance of the original observations is not constant: the spread grows with the group mean. The normal probability plot also indicates that the normality assumption is violated. A Box-Cox transformation is therefore needed.
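The non-constant variance can also be seen numerically by comparing each material's standard deviation with its mean. The sketch below goes one step beyond the original answer and is only an illustration: it regresses log(sd) on log(mean), the empirical rule that if the standard deviation is proportional to the mean raised to a power alpha, then a power transformation with lambda = 1 - alpha is suggested. The object names are chosen here.

```r
failure <- list(
  M_1 = c(110, 157, 194, 178),
  M_2 = c(1, 2, 4, 18),
  M_3 = c(880, 1256, 5276, 4355),
  M_4 = c(495, 7040, 5307, 10050),
  M_5 = c(7, 5, 29, 2)
)
group_mean <- sapply(failure, mean)
group_sd   <- sapply(failure, sd)

# Spread per material: sd rises sharply with the mean
print(data.frame(mean = group_mean, sd = group_sd))

# Slope of log(sd) on log(mean) estimates alpha in sd ~ mean^alpha
slope_fit    <- lm(log(group_sd) ~ log(group_mean))
alpha_hat    <- unname(coef(slope_fit)[2])
lambda_guess <- 1 - alpha_hat   # suggested power; near 0 points to the log

print(c(alpha_hat = alpha_hat, lambda_guess = lambda_guess))
```

A slope close to 1 would point toward the log transformation, which is the choice made in part (c).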

Answer to the Question No: 3.28(c)

library(MASS)

## Warning: package 'MASS' was built under R version 4.1.3

## 
## Attaching package: 'MASS'

## The following object is masked from 'package:dplyr':
## 
##     select

boxcox(dat1$`Failure Time`~dat1$Materials,data=dat1)

Comments: lambda=1 is not within the 95% CI, so a Box-Cox transformation is appropriate. The log transformation used below corresponds to lambda=0.
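Instead of reading the interval off the plot, the value of lambda that maximizes the profile log-likelihood and an approximate 95% interval can be extracted from the object that boxcox() returns. This is a sketch under the usual likelihood-ratio cutoff (maximum log-likelihood minus half the 0.95 quantile of chi-squared with 1 degree of freedom); the lambda grid and object names are assumptions made here.

```r
library(MASS)

failure <- data.frame(
  M_1 = c(110, 157, 194, 178),
  M_2 = c(1, 2, 4, 18),
  M_3 = c(880, 1256, 5276, 4355),
  M_4 = c(495, 7040, 5307, 10050),
  M_5 = c(7, 5, 29, 2)
)
long <- stack(failure)   # columns: values (response), ind (material)

# Profile log-likelihood on a fine grid, without drawing the plot
bc <- boxcox(values ~ ind, data = long,
             lambda = seq(-2, 2, by = 0.01), plotit = FALSE)

lambda_hat <- bc$x[which.max(bc$y)]

# Approximate 95% interval: lambdas within qchisq(0.95, 1) / 2 of the maximum
cutoff <- max(bc$y) - qchisq(0.95, df = 1) / 2
ci     <- range(bc$x[bc$y >= cutoff])

one_in_ci <- ci[1] <= 1 && 1 <= ci[2]   # is "no transformation" plausible?

print(lambda_hat)
print(ci)
print(one_in_ci)
```

If `one_in_ci` is FALSE the data support a transformation, and a convenient value inside `ci` (such as 0 for the log or 0.5 for the square root) is usually preferred over the exact maximizer.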

y <- log(dat1$`Failure Time`)
boxplot(y~dat1$Materials,xlab="Materials",ylab="log(Failure Time)",main="Boxplot")

model_3 <- aov(y~dat1$Materials,data=dat1)
summary(model_3)

##                Df Sum Sq Mean Sq F value   Pr(>F)    
## dat1$Materials  4 165.06   41.26   37.66 1.18e-07 ***
## Residuals      15  16.44    1.10                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

plot(model_3)

Comments: Since the P-value (1.18e-07) < 0.05, we reject H0 at alpha=0.05: not all materials have the same effect on mean log failure time, and at least one material differs. After the log transformation, the residual and normal Q-Q plots show nothing unusual, so the ANOVA assumptions of normality and constant variance appear to hold.

######################### Question No: 3.29 #############################

Data entry and construction of the data frame:

M_1 <- c(31,10,21,4,1)
M_2 <- c(62,40,24,30,35)
M_3<- c(53,27,120,97,68)
dat <- data.frame(M_1,M_2,M_3)

Answer to the question No: 3.29(a)

library(tidyr)
library(dplyr)
dat1 <- pivot_longer(dat,c(M_1,M_2,M_3))
colnames(dat1)<-c("Methods","Counts")
dat1$Methods <- as.factor(dat1$Methods)
dat1$Counts <- as.numeric(dat1$Counts)
model_4 <- aov(dat1$Counts~dat1$Methods,data=dat1)
summary(model_4)

##              Df Sum Sq Mean Sq F value  Pr(>F)   
## dat1$Methods  2   8964    4482   7.914 0.00643 **
## Residuals    12   6796     566                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Comments: Hypothesis statement: null hypothesis H0: u1=u2=u3; alternative hypothesis Ha: at least one u(i) differs, where u(i) is the mean particle count of method i (i=1,2,3). Since the P-value (0.00643) < 0.05, we reject H0 at alpha=0.05. The three methods do not all have the same effect on mean particle count; at least one method is different.
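The same fit can be written more conventionally by naming the columns in the formula and letting `data=` supply them. The sketch below is an alternative form, not the original script: it uses base R's stack() for the reshaping, which keeps each method's observations together, and pulls the F value and P-value out of the summary table. The column names `Counts` and `Methods` are reused from above; the other object names are chosen here.

```r
counts <- data.frame(
  M_1 = c(31, 10, 21, 4, 1),
  M_2 = c(62, 40, 24, 30, 35),
  M_3 = c(53, 27, 120, 97, 68)
)
long <- stack(counts)                  # one row per observation
names(long) <- c("Counts", "Methods")  # Methods is already a factor

# Bare column names in the formula; data= tells aov() where to find them
fit <- aov(Counts ~ Methods, data = long)
tab <- summary(fit)[[1]]               # ANOVA table as a data frame

f_value <- tab[["F value"]][1]
p_value <- tab[["Pr(>F)"]][1]

print(tab)
```

Written this way, the term in the ANOVA table is labelled `Methods` rather than `dat1$Methods`, and functions that take a treatment name can refer to it directly.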

Answer to the Question No: 3.29(b)

# pivot_longer() interleaves the rows (M_1, M_2, M_3, M_1, ...), so group by the factor column
boxplot(Counts~Methods,data=dat1,xlab="Methods",ylab="Counts",main="Boxplot")

meanx<-c(rep(mean(M_1),5),rep(mean(M_2),5),rep(mean(M_3),5))
Methods<-c(M_1,M_2,M_3)
res<-Methods-meanx
qqnorm(res)
qqline(res)

plot(meanx,res,xlab="Fitted value (group mean)",ylab="Residual",
     main="Constant Variance Checking Plot")

Comments: The plot of residuals against fitted values shows that the variance of the original observations is not constant. The normal probability plot also indicates that the normality assumption is violated. A Box-Cox transformation is therefore needed.

Answer to the Question No: 3.29(c)

library(MASS)
boxcox(dat1$Counts~dat1$Methods,data=dat1)

# lambda=1 is not within the 95% CI, so we apply a Box-Cox transformation; the square root below corresponds to lambda=0.5
y <- (dat1$Counts)^0.5
head(y)

## [1] 5.567764 7.874008 7.280110 3.162278 6.324555 5.196152

boxplot(y~dat1$Methods,xlab="Methods",ylab="sqrt(Counts)",main="Boxplot")

model_5 <- aov(y~dat1$Methods,data=dat1)
summary(model_5)

##              Df Sum Sq Mean Sq F value  Pr(>F)   
## dat1$Methods  2  63.90   31.95    9.84 0.00295 **
## Residuals    12  38.96    3.25                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

plot(model_5)

Comments: Since the P-value (0.00295) < 0.05, we reject H0 at alpha=0.05: not all methods have the same effect on mean particle count (on the square-root scale), and at least one method differs. After the Box-Cox (square-root) transformation, the residual and normal Q-Q plots show nothing unusual, so the ANOVA assumptions of normality and constant variance appear to hold.
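One simple way to see what the square-root transformation does to the spread is to compare the group standard deviations before and after it. The sketch below is only an illustration with names chosen here; the ratio of largest to smallest standard deviation is used as a rough summary of how unequal the variances are.

```r
counts <- list(
  M_1 = c(31, 10, 21, 4, 1),
  M_2 = c(62, 40, 24, 30, 35),
  M_3 = c(53, 27, 120, 97, 68)
)

# Standard deviation of each method on the raw and square-root scales
sd_raw  <- sapply(counts, sd)
sd_sqrt <- sapply(counts, function(v) sd(sqrt(v)))

# Ratio of largest to smallest sd; closer to 1 means more equal spread
ratio_raw  <- max(sd_raw) / min(sd_raw)
ratio_sqrt <- max(sd_sqrt) / min(sd_sqrt)

print(rbind(raw = sd_raw, sqrt = sd_sqrt))
print(c(ratio_raw = ratio_raw, ratio_sqrt = ratio_sqrt))
```

A smaller ratio on the transformed scale is consistent with the residual plot looking more even after the transformation, although with five observations per method any such summary is rough.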

################### Question No: 3.51 ############################

T1_Fluid<-c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6)
T2_Fluid<-c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3)
T3_Fluid<-c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3)
T4_Fluid<-c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)

dat<-cbind.data.frame(T1_Fluid, T2_Fluid, T3_Fluid, T4_Fluid)
library(tidyr)
dat1 <- pivot_longer(data = dat, c(T1_Fluid, T2_Fluid, T3_Fluid, T4_Fluid))
colnames(dat1)<- c("Fluid Types", "Life 35KV")
dat1$`Fluid Types`<-as.factor(dat1$`Fluid Types`)
dat1$`Life 35KV`<- as.numeric(dat1$`Life 35KV`)
model_1 <- aov(dat1$`Life 35KV`~dat1$`Fluid Types`,data=dat1)
summary(model_1)

##                    Df Sum Sq Mean Sq F value Pr(>F)  
## dat1$`Fluid Types`  3  30.16   10.05   3.047 0.0525 .
## Residuals          20  65.99    3.30                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

kruskal.test(dat1$`Life 35KV`~dat1$`Fluid Types`,data=dat1)

## 
##  Kruskal-Wallis rank sum test
## 
## data:  dat1$`Life 35KV` by dat1$`Fluid Types`
## Kruskal-Wallis chi-squared = 6.2177, df = 3, p-value = 0.1015

Comments: Hypothesis statement: null hypothesis H0: u1=u2=u3=u4; alternative hypothesis Ha: at least one u(i) differs, where u(i) is the mean life of fluid type i (i=1,2,3,4). For the ANOVA, the P-value (0.0525) > 0.05, so we fail to reject H0 at alpha=0.05, although the result is borderline. For the Kruskal-Wallis test, which compares the fluid types using ranks and does not assume normality, the P-value (0.1015) > 0.05, so we again fail to reject H0 at alpha=0.05. The two tests agree: the data do not show a significant difference among the fluid types.
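The Kruskal-Wallis statistic reported above can be reproduced from the ranks. The sketch below follows the standard formula, H = 12 / (N(N+1)) * sum(R_i^2 / n_i) - 3(N+1), divided by a tie correction because the value 16.9 occurs twice in the data. The object names are chosen here for illustration.

```r
life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,   # T1_Fluid
          16.9, 15.3, 18.6, 17.1, 19.5, 20.3,   # T2_Fluid
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3,   # T3_Fluid
          19.3, 21.1, 16.9, 17.5, 18.3, 19.8)   # T4_Fluid
fluid <- factor(rep(c("T1_Fluid", "T2_Fluid", "T3_Fluid", "T4_Fluid"),
                    each = 6))

r         <- rank(life)                 # tied values get their average rank
n_total   <- length(life)
rank_sums <- tapply(r, fluid, sum)      # R_i for each fluid
n_group   <- tapply(r, fluid, length)   # n_i for each fluid

h_raw <- 12 / (n_total * (n_total + 1)) * sum(rank_sums^2 / n_group) -
  3 * (n_total + 1)

# Tie correction: 1 - sum(t^3 - t) / (N^3 - N), t = size of each tied set
tie_sizes <- as.vector(table(life))
tie_corr  <- 1 - sum(tie_sizes^3 - tie_sizes) / (n_total^3 - n_total)

h_stat  <- h_raw / tie_corr
p_value <- pchisq(h_stat, df = nlevels(fluid) - 1, lower.tail = FALSE)

builtin <- kruskal.test(life ~ fluid)   # reference result from base R
print(c(h_stat = h_stat, p_value = p_value))
```

Because the test works on ranks, it is unaffected by how far the largest observations lie from the rest, which is why it serves as a check on the ANOVA conclusion.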

######################### Question No: 3.52 ####################

In the book, this problem is the same as Problem 3.51, so the answer is the same as for Question 3.51.

IE5342_HW_Week_6

Md Jahir Ahmed

2022-10-09