1 Question 3.23

The effective life of insulating fluids at an accelerated load of 35 kV is being studied. Test data have been obtained for four types of fluids. The results from a completely randomized experiment were as follows:

  1. Is there any indication that the fluids differ? Use α=0.05

  2. Which fluid would you select, given that the objective is long life?

  3. Analyze the residuals from this experiment. Are the basic analysis of variance assumptions satisfied?

1.1 Answer

Part a:

prob1<- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6, 16.9, 15.3, 18.6, 17.1, 19.5, 20.3, 21.4, 23.6, 19.4, 18.5, 20.5, 22.3, 19.3, 21.1, 16.9, 17.5, 18.3, 19.8) 
x <- c(rep(1,6), rep(2,6), rep(3,6), rep(4,6))
Data <- data.frame(prob1,x)
Data$x <- as.factor(Data$x)
str(Data)
## 'data.frame':    24 obs. of  2 variables:
##  $ prob1: num  17.6 18.9 16.3 17.4 20.1 21.6 16.9 15.3 18.6 17.1 ...
##  $ x    : Factor w/ 4 levels "1","2","3","4": 1 1 1 1 1 1 2 2 2 2 ...

Using ANOVA:

aov.model<-aov(prob1~x,data=Data)
summary(aov.model)
##             Df Sum Sq Mean Sq F value Pr(>F)  
## x            3  30.16   10.05   3.047 0.0525 .
## Residuals   20  65.99    3.30                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Conclusion:

–> The p-value (0.0525) is slightly above α = 0.05, so we fail to reject the null hypothesis: at this level there is no indication that the fluids differ. The result is borderline, however; the difference in means would be declared significant at α = 0.10.
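The F ratio in the table can be reproduced by hand, which makes the borderline result easier to see. The following is a minimal sketch using only base R; the variable names (`life`, `fluid`, `ss_trt`, and so on) are illustrative and are not part of the original solution.

```r
# Fluid life data from Problem 3.23 (6 observations per fluid type)
life  <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
           16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
           21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
           19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
fluid <- factor(rep(1:4, each = 6))

a <- nlevels(fluid)   # number of treatments
N <- length(life)     # total number of observations

group_mean <- tapply(life, fluid, mean)
group_n    <- tapply(life, fluid, length)

# Between-treatment and within-treatment sums of squares
ss_trt <- sum(group_n * (group_mean - mean(life))^2)
ss_err <- sum((life - ave(life, fluid))^2)   # ave() returns each group's mean

# F ratio, 5% critical value, and p-value
f0     <- (ss_trt / (a - 1)) / (ss_err / (N - a))
f_crit <- qf(0.95, df1 = a - 1, df2 = N - a)
p_val  <- pf(f0, a - 1, N - a, lower.tail = FALSE)

cat("F0 =", round(f0, 3), " critical F =", round(f_crit, 3),
    " p-value =", round(p_val, 4), "\n")
```

Comparing `f0` with `f_crit` is equivalent to comparing the p-value with α; both views show how close the test is to the rejection boundary.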

Part b:

Using the LSD test

library(agricolae)
LSD.test(aov.model,"x",console=TRUE)
## 
## Study: aov.model ~ "x"
## 
## LSD t Test for prob1 
## 
## Mean Square Error:  3.299667 
## 
## x,  means and individual ( 95 %) CI
## 
##      prob1      std r        se      LCL      UCL  Min  Max    Q25   Q50    Q75
## 1 18.65000 1.952178 6 0.7415824 17.10309 20.19691 16.3 21.6 17.450 18.25 19.800
## 2 17.95000 1.854454 6 0.7415824 16.40309 19.49691 15.3 20.3 16.950 17.85 19.275
## 3 20.95000 1.879096 6 0.7415824 19.40309 22.49691 18.5 23.6 19.675 20.95 22.075
## 4 18.81667 1.554885 6 0.7415824 17.26975 20.36358 16.9 21.1 17.700 18.80 19.675
## 
## Alpha: 0.05 ; DF Error: 20
## Critical Value of t: 2.085963 
## 
## least Significant Difference: 2.187666 
## 
## Treatments with the same letter are not significantly different.
## 
##      prob1 groups
## 3 20.95000      a
## 4 18.81667     ab
## 1 18.65000      b
## 2 17.95000      b

Conclusion:

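–> Fluid Type 3 has the highest mean life (20.95) and is the one to select. By the LSD test it differs significantly from Fluids 1 and 2 but not from Fluid 4 (the two share the letter "a"), so Fluid 4 is an acceptable alternative. Since the overall F test in part a was not significant at α = 0.05, these pairwise comparisons should be read with some caution.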
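The least significant difference reported by `LSD.test` can be checked directly from the ANOVA fit. This is a sketch in base R under the assumption of equal group sizes (n = 6); the names `life`, `fluid`, `lsd`, and `diffs` are illustrative.

```r
# Fluid life data from Problem 3.23
life  <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
           16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
           21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
           19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
fluid <- factor(rep(1:4, each = 6))

fit <- aov(life ~ fluid)

mse    <- deviance(fit) / df.residual(fit)     # mean square error
n      <- 6                                    # observations per fluid
t_crit <- qt(1 - 0.05 / 2, df.residual(fit))   # two-sided 5% t value

# LSD for a balanced design: t * sqrt(2 * MSE / n)
lsd <- t_crit * sqrt(2 * mse / n)

# Differences between Fluid 3 (largest mean) and the other fluids
means <- sapply(split(life, fluid), mean)
diffs <- means[["3"]] - means[c("1", "2", "4")]

print(round(lsd, 4))
print(round(diffs, 4))
print(diffs > lsd)   # TRUE where Fluid 3 differs significantly
```

A difference larger than `lsd` corresponds to two fluids that do not share a letter in the grouping table above.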

Part c:

Using plots to analyze residuals:

plot(aov.model)

Conclusion:

–> The plots show nothing unusual, so the normality and constant-variance assumptions appear to be satisfied.
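The visual check can be supplemented with formal tests. The sketch below is an addition to the original solution: it applies the Shapiro-Wilk test to the residuals and Bartlett's test to the group variances, both from base R's `stats` package. With only six observations per fluid these tests have little power, so they support the plots and do not replace them.

```r
# Fluid life data from Problem 3.23
life  <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
           16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
           21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
           19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
fluid <- factor(rep(1:4, each = 6))

fit <- aov(life ~ fluid)

# H0: the residuals come from a normal distribution
sw <- shapiro.test(residuals(fit))

# H0: all fluid types have the same variance
bt <- bartlett.test(life ~ fluid)

print(sw)
print(bt)
```

Large p-values in both tests are consistent with the conclusion drawn from the plots.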

2 Question 3.28

An experiment was performed to investigate the effectiveness of five insulating materials. Four samples of each material were tested at an elevated voltage level to accelerate the time to failure. The failure times (in minutes) are shown below:

  1. Do all five materials have the same effect on mean failure time?

  2. Plot the residuals versus the predicted response. Construct a normal probability plot of the residuals. What information is conveyed by these plots?

  3. Based on your answer to part (b) conduct another analysis of the failure time data and draw appropriate conclusions.

2.1 Solution:

Part a:

Time<-c(110, 157, 194, 178, 1, 2, 4, 18, 880, 1256, 5276, 4355, 495, 7040, 5307, 10050, 7, 5, 29, 2)
Type<-c(rep(1,4), rep(2,4), rep(3,4), rep(4,4), rep(5,4))
Data<-data.frame(Time,Type)
Data$Type<-as.factor(Data$Type)
str(Data)
## 'data.frame':    20 obs. of  2 variables:
##  $ Time: num  110 157 194 178 1 ...
##  $ Type: Factor w/ 5 levels "1","2","3","4",..: 1 1 1 1 2 2 2 2 3 3 ...

Using ANOVA:

aov.model<-aov(Time~Type,data=Data)
summary(aov.model)
##             Df    Sum Sq  Mean Sq F value  Pr(>F)   
## Type         4 103191489 25797872   6.191 0.00379 **
## Residuals   15  62505657  4167044                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Conclusion:

–> With a p-value of 0.00379, below the significance level of 0.05, we reject the null hypothesis in favor of the alternative: at least one mean differs.

Part b:

Plotting the results

library(ggfortify)
library(ggplot2)
autoplot(aov.model)

Conclusion:

–> The normality assumption is doubtful, because the points on the normal probability plot do not fall along a straight line. The residuals-versus-fitted plot also shows non-constant variance: the spread of the residuals grows with the fitted value instead of forming a band of roughly constant width. The ANOVA assumptions are therefore violated.
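The two plots requested in the question can also be drawn with base graphics, without `ggfortify`. This is a minimal sketch; `fail_time`, `material`, and `spread` are illustrative names. The last lines summarize the residual range per material as a numeric counterpart to the funnel shape.

```r
# Failure times from Problem 3.28 (4 samples per material)
fail_time <- c(110, 157, 194, 178,
               1, 2, 4, 18,
               880, 1256, 5276, 4355,
               495, 7040, 5307, 10050,
               7, 5, 29, 2)
material  <- factor(rep(1:5, each = 4))

fit <- aov(fail_time ~ material)
res <- residuals(fit)
fv  <- fitted(fit)

op <- par(mfrow = c(1, 2))
# Residuals versus predicted response
plot(fv, res, xlab = "Fitted value", ylab = "Residual",
     main = "Residuals vs Fitted")
abline(h = 0, lty = 2)
# Normal probability plot of the residuals
qqnorm(res)
qqline(res)
par(op)

# Range of the residuals within each material
spread <- sapply(split(res, material), function(r) diff(range(r)))
print(round(spread, 1))
```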

Part c:

Using Boxplot to analyze the result

boxplot(Data$Time~Data$Type,xlab="Material Type",ylab="Failure Time",main="Boxplot of Observations")

The boxplot shows that both the means and the variances differ greatly between material types.

Using the Box-Cox transformation:

The estimated λ is approximately 0, so we apply the natural log transformation to the data.
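The Box-Cox step itself is not shown in the solution. A sketch of how λ might be estimated is given below, assuming the `boxcox()` function from the `MASS` package; the grid of λ values and the names `fail_time`, `material`, `lambda_hat`, and `ci` are choices made for this illustration.

```r
library(MASS)   # provides boxcox()

# Failure times from Problem 3.28
fail_time <- c(110, 157, 194, 178,
               1, 2, 4, 18,
               880, 1256, 5276, 4355,
               495, 7040, 5307, 10050,
               7, 5, 29, 2)
material  <- factor(rep(1:5, each = 4))

# Profile log-likelihood over a grid of lambda values (no plot)
bc <- boxcox(fail_time ~ material,
             lambda = seq(-2, 2, by = 0.05), plotit = FALSE)

# Grid value that maximizes the log-likelihood
lambda_hat <- bc$x[which.max(bc$y)]

# Approximate 95% confidence interval: values within
# qchisq(0.95, 1) / 2 of the maximum log-likelihood
ci <- range(bc$x[bc$y > max(bc$y) - qchisq(0.95, 1) / 2])

print(lambda_hat)
print(ci)
```

If the interval excludes 1, the untransformed scale is not supported by the data; a value of λ near 0 points to the log transformation.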

LogTime <- log(Time) # natural log: the Box-Cox transformation for lambda = 0
boxplot(LogTime~Data$Type,xlab="Material Type",ylab="log(Failure Time)",main="Boxplot of Log-Transformed Observations")

The box plot of the log-transformed data now shows a much more similar spread of observations across material types. The transformation is not perfect, however: Type 1 still has a visibly smaller spread than the others.
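Once the data are on the log scale, one natural follow-up is to rerun the one-way ANOVA on the transformed response. The solution takes the nonparametric route instead, so the sketch below is an alternative analysis and not the author's; `fail_time`, `material`, and `fit_log` are illustrative names.

```r
# Failure times from Problem 3.28
fail_time <- c(110, 157, 194, 178,
               1, 2, 4, 18,
               880, 1256, 5276, 4355,
               495, 7040, 5307, 10050,
               7, 5, 29, 2)
material  <- factor(rep(1:5, each = 4))

# One-way ANOVA on the natural log of failure time
fit_log <- aov(log(fail_time) ~ material)
tab     <- summary(fit_log)[[1]]
print(tab)

# F ratio and p-value for the material effect (first row of the table)
f_log <- tab[["F value"]][1]
p_log <- tab[["Pr(>F)"]][1]
```

The residual plots of `fit_log` should be inspected in the same way as in part b before relying on this F test.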

Using the Kruskal-Wallis test:

kruskal.test(Time~Type,data=Data)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  Time by Type
## Kruskal-Wallis chi-squared = 16.873, df = 4, p-value = 0.002046

Conclusion:

–> The nonparametric Kruskal-Wallis test gives a p-value of 0.002046, below α = 0.05. We therefore reject the null hypothesis in favor of the alternative and conclude that the five materials do not all have the same effect (at least one differs).
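The Kruskal-Wallis statistic is an ANOVA-type statistic computed on ranks, and it can be reproduced by hand. The sketch below uses base R only; it includes the standard correction for ties, which is needed here because the value 2 occurs twice. The names `fail_time`, `material`, `h`, and `h_adj` are illustrative.

```r
# Failure times from Problem 3.28
fail_time <- c(110, 157, 194, 178,
               1, 2, 4, 18,
               880, 1256, 5276, 4355,
               495, 7040, 5307, 10050,
               7, 5, 29, 2)
material  <- factor(rep(1:5, each = 4))

N <- length(fail_time)
r <- rank(fail_time)   # tied values receive the average rank

rank_sum <- tapply(r, material, sum)
n_i      <- tapply(r, material, length)

# Uncorrected statistic: 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
h <- 12 / (N * (N + 1)) * sum(rank_sum^2 / n_i) - 3 * (N + 1)

# Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N)
ties  <- table(fail_time)
h_adj <- h / (1 - sum(ties^3 - ties) / (N^3 - N))

# Chi-squared approximation with a - 1 degrees of freedom
p_val <- pchisq(h_adj, df = nlevels(material) - 1, lower.tail = FALSE)

cat("H =", round(h_adj, 3), " p-value =", signif(p_val, 4), "\n")
```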

3 Question 3.29

A semiconductor manufacturer has developed three different methods for reducing particle counts on wafers. All three methods are tested on five different wafers and the after treatment particle count obtained. The data are shown below:

  1. Do all methods have the same effect on mean particle count?

  2. Plot the residuals versus the predicted response. Construct a normal probability plot of the residuals. Are there potential concerns about the validity of the assumptions?

  3. Based on your answer to part (b) conduct another analysis of the particle count data and draw appropriate conclusions.

3.1 Solution

Part a:

Count <- c(31, 10, 21, 4, 1, 62, 40, 24, 30, 35, 53, 27, 120, 97, 68)
Type <- c(rep(1,5), rep(2,5), rep(3,5))
Data <- data.frame(Count,Type)
Data$Type <- as.factor(Data$Type)
str(Data)
## 'data.frame':    15 obs. of  2 variables:
##  $ Count: num  31 10 21 4 1 62 40 24 30 35 ...
##  $ Type : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 2 2 2 2 2 ...

Using ANOVA:

aov.model<-aov(Count~Type,data=Data)
summary(aov.model)
##             Df Sum Sq Mean Sq F value  Pr(>F)   
## Type         2   8964    4482   7.914 0.00643 **
## Residuals   12   6796     566                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Conclusion:

–> The p-value is 0.00643, below the 0.05 level of significance. We therefore reject the null hypothesis in favor of the alternative and conclude that at least one mean differs.

Part b:

Plotting the residuals:

autoplot(aov.model)

Conclusion:

Based on the normal probability plot, the residuals follow a roughly straight line, so the normality assumption appears to be satisfied. In the residuals-versus-fitted plot, however, the spread is not constant, so we cannot assume constant variance. The ANOVA assumptions are therefore not fully met.

Part c:

Due to the result of part b, we will most likely need to transform the data.

boxplot(Data$Count~Data$Type,xlab="Method Type",ylab="Particle Count",main="Boxplot of Observations")

The boxplots confirm that the variances differ between methods, so they need to be stabilized with a transformation.

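Performing the Box-Cox procedure:

library(MASS) # provides boxcox()
boxcox(Count~Type,data=Data) # data=Data so that Type is treated as a factor

The value of λ appears to be approximately 0.5.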
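A second way to arrive at a similar λ is the empirical rule for variance-stabilizing transformations: if the group standard deviation is proportional to a power of the group mean, σ ∝ μ^α, then λ = 1 − α. The sketch below estimates α as the slope of log(sd) on log(mean). With only three groups the estimate is rough, and the names `count`, `method`, `slope`, and `lambda_emp` are illustrative.

```r
# Particle counts from Problem 3.29 (5 wafers per method)
count  <- c(31, 10, 21, 4, 1,
            62, 40, 24, 30, 35,
            53, 27, 120, 97, 68)
method <- factor(rep(1:3, each = 5))

# Mean and standard deviation of each method
m <- sapply(split(count, method), mean)
s <- sapply(split(count, method), sd)

# Slope of log(sd) versus log(mean) estimates alpha
slope <- unname(coef(lm(log(s) ~ log(m)))[2])

# Suggested power transformation
lambda_emp <- 1 - slope

print(round(cbind(mean = m, sd = s), 2))
cat("alpha =", round(slope, 2), " lambda =", round(lambda_emp, 2), "\n")
```

A slope near 0.5 corresponds to the square-root transformation, which is also the usual choice for count data.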

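lambda <- 0.5
CountT<-Count^(lambda) # square-root transformation
boxcox(CountT~factor(Type)) # factor() so that Type is not treated as numeric

boxplot(CountT~Data$Type,xlab="Method Type",ylab="sqrt(Particle Count)",main="Boxplot of Transformed Observations")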

–> The variances appear to have stabilized. The second Box-Cox plot gives a λ close to 1, indicating that no further transformation is needed, and the boxplot shows more nearly equal spreads. The ANOVA assumptions are now more plausible.
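The effect of the transformation on the spread can also be checked numerically by comparing the group standard deviations before and after taking square roots. This is a rough sketch; the ratio of largest to smallest standard deviation is only an informal yardstick, and the variable names are illustrative.

```r
# Particle counts from Problem 3.29
count  <- c(31, 10, 21, 4, 1,
            62, 40, 24, 30, 35,
            53, 27, 120, 97, 68)
method <- factor(rep(1:3, each = 5))

# Group standard deviations on the raw and square-root scales
sd_raw  <- sapply(split(count, method), sd)
sd_sqrt <- sapply(split(sqrt(count), method), sd)

# Ratio of largest to smallest standard deviation on each scale
ratio_raw  <- max(sd_raw) / min(sd_raw)
ratio_sqrt <- max(sd_sqrt) / min(sd_sqrt)

print(round(rbind(raw = sd_raw, sqrt = sd_sqrt), 3))
cat("max/min sd, raw scale: ", round(ratio_raw, 2), "\n")
cat("max/min sd, sqrt scale:", round(ratio_sqrt, 2), "\n")
```

A smaller ratio on the transformed scale indicates that the group spreads have become more alike, although they need not be equal.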

Using ANOVA on the transformation:

DataT <- data.frame(CountT,Type)
DataT$Type <- as.factor(DataT$Type)
str(DataT)
## 'data.frame':    15 obs. of  2 variables:
##  $ CountT: num  5.57 3.16 4.58 2 1 ...
##  $ Type  : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 2 2 2 2 2 ...
aov.modelT<-aov(CountT~Type,data=DataT)
summary(aov.modelT)
##             Df Sum Sq Mean Sq F value  Pr(>F)   
## Type         2  63.90   31.95    9.84 0.00295 **
## Residuals   12  38.96    3.25                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
autoplot(aov.modelT)

Conclusion:

–> After transforming the data, the ANOVA model is more adequate, since the normality and constant-variance assumptions are now reasonably satisfied. The new p-value of 0.00295 is below the 0.05 level of significance, so we reject the null hypothesis in favor of the alternative: at least one mean differs.
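The F test says only that at least one method differs. To see which ones, a multiple-comparison procedure can be run on the transformed scale. The sketch below uses Tukey's HSD from base R (`TukeyHSD`); this follow-up is an addition to the original solution, and `count`, `method`, `fit_sqrt`, and `tk` are illustrative names.

```r
# Particle counts from Problem 3.29
count  <- c(31, 10, 21, 4, 1,
            62, 40, 24, 30, 35,
            53, 27, 120, 97, 68)
method <- factor(rep(1:3, each = 5))

# ANOVA on the square-root scale, as in the analysis above
fit_sqrt <- aov(sqrt(count) ~ method)

# All pairwise comparisons with a 95% family-wise confidence level
tk <- TukeyHSD(fit_sqrt, "method", conf.level = 0.95)
print(tk)

# Adjusted p-values, one per pair of methods
p_adj <- tk$method[, "p adj"]
```

Pairs whose adjusted p-value is below 0.05 (equivalently, whose interval excludes 0) differ significantly on the square-root scale.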

4 Questions 3.51 & 3.52

Use the Kruskal–Wallis test for the experiment in Problem 3.23.

3.51) Compare the conclusions obtained with those from the usual analysis of variance.

3.52) Are the results comparable to those found by the usual analysis of variance?

4.1 Solution

Life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6, 16.9, 15.3, 18.6, 17.1, 19.5, 20.3, 21.4, 23.6, 19.4, 18.5, 20.5, 22.3, 19.3, 21.1, 16.9, 17.5, 18.3, 19.8) 
Type <- c(rep(1,6), rep(2,6), rep(3,6), rep(4,6))
Data <- data.frame(Life,Type)
Data$Type <- as.factor(Data$Type)
str(Data)
## 'data.frame':    24 obs. of  2 variables:
##  $ Life: num  17.6 18.9 16.3 17.4 20.1 21.6 16.9 15.3 18.6 17.1 ...
##  $ Type: Factor w/ 4 levels "1","2","3","4": 1 1 1 1 1 1 2 2 2 2 ...

Performing the Kruskal-Wallis test:

kruskal.test(Life~Type,data=Data)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  Life by Type
## Kruskal-Wallis chi-squared = 6.2177, df = 3, p-value = 0.1015

Conclusion:

–> With a p-value of 0.1015, above the significance level of 0.05, we fail to reject the null hypothesis: there is no evidence that the fluid means differ. The ANOVA in Question 3.23 led to the same decision, though with a smaller p-value (0.0525), so the results and conclusions are comparable.
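The comparison can be made explicit by computing both p-values side by side. This is a minimal sketch in base R; `life`, `fluid`, `p_anova`, and `p_kw` are illustrative names.

```r
# Fluid life data from Problem 3.23
life  <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
           16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
           21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
           19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
fluid <- factor(rep(1:4, each = 6))

# p-value of the usual one-way ANOVA F test
p_anova <- summary(aov(life ~ fluid))[[1]][["Pr(>F)"]][1]

# p-value of the Kruskal-Wallis rank sum test
p_kw <- kruskal.test(life ~ fluid)$p.value

alpha <- 0.05
print(data.frame(test    = c("ANOVA", "Kruskal-Wallis"),
                 p_value = round(c(p_anova, p_kw), 4),
                 reject  = c(p_anova, p_kw) < alpha))
```

Both tests lead to the same decision at α = 0.05. The rank-based test uses only the ordering of the observations, which is one reason its p-value can be larger when the normal-theory assumptions are reasonable.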

#Complete R code

##Question 3.23:

#Part a
prob1<- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6, 16.9, 15.3, 18.6, 17.1, 19.5, 20.3, 21.4, 23.6, 19.4, 18.5, 20.5, 22.3, 19.3, 21.1, 16.9, 17.5, 18.3, 19.8) 
x <- c(rep(1,6), rep(2,6), rep(3,6), rep(4,6))
Data <- data.frame(prob1,x)
Data$x <- as.factor(Data$x)
str(Data)
#use ANOVA
aov.model<-aov(prob1~x,data=Data)
summary(aov.model)
#Part B
library(agricolae)
LSD.test(aov.model,"x",console=TRUE)
#Part C
plot(aov.model)


##Question 3.28
#Part a
Time<-c(110, 157, 194, 178, 1, 2, 4, 18, 880, 1256, 5276, 4355, 495, 7040, 5307, 10050, 7, 5, 29, 2)
Type<-c(rep(1,4), rep(2,4), rep(3,4), rep(4,4), rep(5,4))
Data<-data.frame(Time,Type)
Data$Type<-as.factor(Data$Type)
str(Data)
#Use ANOVA
aov.model<-aov(Time~Type,data=Data)
summary(aov.model)
#Part B
library(ggfortify)
library(ggplot2)
autoplot(aov.model)
#Part c
boxplot(Data$Time~Data$Type,xlab="Material Type",ylab="Failure Time",main="Boxplot of Observations")
LogTime <- log(Time) # natural log: the Box-Cox transformation for lambda = 0
boxplot(LogTime~Data$Type,xlab="Material Type",ylab="log(Failure Time)",main="Boxplot of Log-Transformed Observations")
#Use Kruskal Wallis Test
kruskal.test(Time~Type,data=Data)

##Question 3.29
#Part A
Count <- c(31, 10, 21, 4, 1, 62, 40, 24, 30, 35, 53, 27, 120, 97, 68)
Type <- c(rep(1,5), rep(2,5), rep(3,5))
Data <- data.frame(Count,Type)
Data$Type <- as.factor(Data$Type)
str(Data)
#use ANOVA
aov.model<-aov(Count~Type,data=Data)
summary(aov.model)
#Part B
autoplot(aov.model)
#Part C
boxplot(Data$Count~Data$Type,xlab="Method Type",ylab="Particle Count",main="Boxplot of Observations")
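library(MASS) # provides boxcox()
boxcox(Count~Type,data=Data) # data=Data so that Type is treated as a factor
lambda <- 0.5
CountT<-Count^(lambda) # square-root transformation
boxcox(CountT~factor(Type))
boxplot(CountT~Data$Type,xlab="Method Type",ylab="sqrt(Particle Count)",main="Boxplot of Transformed Observations")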
#ANOVA on transformation
DataT <- data.frame(CountT,Type)
DataT$Type <- as.factor(DataT$Type)
str(DataT)
aov.modelT<-aov(CountT~Type,data=DataT)
summary(aov.modelT)
autoplot(aov.modelT)

##Question 3.51 & 3.52
Life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6, 16.9, 15.3, 18.6, 17.1, 19.5, 20.3, 21.4, 23.6, 19.4, 18.5, 20.5, 22.3, 19.3, 21.1, 16.9, 17.5, 18.3, 19.8) 
Type <- c(rep(1,6), rep(2,6), rep(3,6), rep(4,6))
Data <- data.frame(Life,Type)
Data$Type <- as.factor(Data$Type)
str(Data)
#Kruskal Wallis Test
kruskal.test(Life~Type,data=Data)