Question 3.23

The effective life of insulating fluids at an accelerated load of 35 kV is being studied. Test data have been obtained for four types of fluids.

Part A:

Null Hypothesis: Ho: mu1 = mu2 = mu3 = mu4 = mu

Alternative Hypothesis: Ha: At least one mui differs, i = {1,2,3,4}

Reading Data

Life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6, 16.9, 15.3, 18.6, 17.1, 19.5, 20.3, 21.4, 23.6, 19.4, 18.5, 20.5, 22.3, 19.3, 21.1, 16.9, 17.5, 18.3, 19.8) 
Type <- c(rep(1,6), rep(2,6), rep(3,6), rep(4,6))
Data <- cbind(Life, Type)
Data <- data.frame(Data)
Data$Type <- as.factor(Data$Type)
str(Data)
## 'data.frame':    24 obs. of  2 variables:
##  $ Life: num  17.6 18.9 16.3 17.4 20.1 21.6 16.9 15.3 18.6 17.1 ...
##  $ Type: Factor w/ 4 levels "1","2","3","4": 1 1 1 1 1 1 2 2 2 2 ...

ANOVA Analysis

aov.model<-aov(Life~Type,data=Data)
summary(aov.model)
##             Df Sum Sq Mean Sq F value Pr(>F)  
## Type         3  30.17   10.05   3.047 0.0525 .
## Residuals   20  65.99    3.30                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Answer Part A: At the 0.05 level of significance we fail to reject Ho, so there is no indication that the fluids differ. However, the p-value (0.0525) is only slightly above alpha = 0.05, so the result is borderline: at a slightly larger significance level, such as 0.10, we would reject Ho and conclude that the mean lives differ.
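As a cross-check on the ANOVA table, the F statistic can be rebuilt by hand from the treatment and error sums of squares. This is a minimal sketch; the names `life`, `fluid`, `ss_treat`, `ss_error`, and `f_crit` are introduced here for illustration and are not part of the original analysis.

```r
# Same 24 observations as above, rebuilt so the block runs on its own
life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
          16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
          19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
fluid <- factor(rep(1:4, each = 6))

a <- nlevels(fluid)   # number of treatments
N <- length(life)     # total number of observations

group_means <- tapply(life, fluid, mean)
group_n     <- tapply(life, fluid, length)

# Between-treatment and within-treatment sums of squares
ss_treat <- sum(group_n * (group_means - mean(life))^2)
ss_error <- sum((life - ave(life, fluid))^2)

# F statistic, its p-value, and the 5% critical value
f_stat  <- (ss_treat / (a - 1)) / (ss_error / (N - a))
p_value <- pf(f_stat, a - 1, N - a, lower.tail = FALSE)
f_crit  <- qf(0.95, a - 1, N - a)

round(c(F = f_stat, p = p_value, F_crit = f_crit), 4)
```

The computed F falls just below the 5% critical value, which is why the test narrowly fails to reject Ho.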

Part B:

Tukey’s Test

TukeyHSD(aov.model)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = Life ~ Type, data = Data)
## 
## $Type
##           diff         lwr       upr     p adj
## 2-1 -0.7000000 -3.63540073 2.2354007 0.9080815
## 3-1  2.3000000 -0.63540073 5.2354007 0.1593262
## 4-1  0.1666667 -2.76873407 3.1020674 0.9985213
## 3-2  3.0000000  0.06459927 5.9354007 0.0440578
## 4-2  0.8666667 -2.06873407 3.8020674 0.8413288
## 4-3 -2.1333333 -5.06873407 0.8020674 0.2090635
plot(TukeyHSD(aov.model))

F1 <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6)
F2 <- c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3)
F3 <- c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3)
F4 <- c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
A <- mean(F1)
B <- mean(F2)
C <- mean(F3)
D <- mean(F4)

A
## [1] 18.65
B
## [1] 17.95
C
## [1] 20.95
D
## [1] 18.81667

Answer Part B: Tukey’s test shows a significant difference only between fluid types 3 and 2 (adjusted p = 0.044); the confidence interval for 3-2 is the only one that excludes zero. Fluid type 3 also has the highest sample mean life (20.95), so it would be the natural choice if the goal is long life, although it is significantly better only than type 2.
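The half-width of the Tukey intervals above can be reproduced from the studentized range quantile, which shows why only the 3-2 difference is flagged. This is a sketch under the equal-sample-size setup of this problem; the names `hsd`, `mse`, and `diff_32` are illustrative.

```r
life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
          16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
          19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
fluid <- factor(rep(1:4, each = 6))

fit <- aov(life ~ fluid)
mse <- deviance(fit) / df.residual(fit)  # mean square error
n   <- 6                                 # observations per fluid
a   <- nlevels(fluid)                    # number of fluids

# Tukey honestly significant difference at the 95% family-wise level
hsd <- qtukey(0.95, nmeans = a, df = df.residual(fit)) * sqrt(mse / n)

means   <- tapply(life, fluid, mean)
diff_32 <- unname(means["3"] - means["2"])

hsd                  # any pair of means further apart than this differs
abs(diff_32) > hsd   # TRUE for fluids 3 and 2
```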

Part C:

Plots

aov.model<-aov(Life~Type,data=Data)
plot(aov.model)

Answer Part C: In the “Normal Q-Q” and “Residuals vs Fitted” plots, the residuals show no serious departure from normality or from constant variance, so both assumptions appear reasonable and the model is adequate.
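The visual check can be supplemented with formal tests. The sketch below uses the Shapiro-Wilk test on the residuals and Bartlett's test for equal variances; these tests are an addition and not part of the original solution, and with only six observations per fluid they have limited power.

```r
life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
          16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
          19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
fluid <- factor(rep(1:4, each = 6))

fit <- aov(life ~ fluid)

# Normality of the residuals (Ho: residuals are normally distributed)
sw <- shapiro.test(residuals(fit))

# Homogeneity of variance across fluids (Ho: all variances are equal)
bt <- bartlett.test(life ~ fluid)

c(shapiro_p = sw$p.value, bartlett_p = bt$p.value)
```

Large p-values here would be consistent with the conclusion drawn from the plots.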

Question 3.28

An experiment was performed to investigate the effectiveness of five insulating materials. Four samples of each material were tested at an elevated voltage level to accelerate the time to failure.

Part A:

Null Hypothesis: Ho: mu1 = mu2 = mu3 = mu4 = mu5 = mu

Alternative Hypothesis: Ha: At least one mui differs, i = {1,2,3,4,5}

Reading Data

Time <- c(110, 157, 194, 178, 1, 2, 4, 18, 880, 1256, 5276, 4355, 495, 7040, 5307, 10050, 7, 5, 29, 2)
Type <- c(rep(1,4), rep(2,4), rep(3,4), rep(4,4), rep(5,4))
Data <- cbind(Time,Type)
Data <- data.frame(Data)
Data$Type <- as.factor(Data$Type)
str(Data)
## 'data.frame':    20 obs. of  2 variables:
##  $ Time: num  110 157 194 178 1 ...
##  $ Type: Factor w/ 5 levels "1","2","3","4",..: 1 1 1 1 2 2 2 2 3 3 ...

ANOVA Analysis

aov.model<-aov(Time~Type,data=Data)
summary(aov.model)
##             Df    Sum Sq  Mean Sq F value  Pr(>F)   
## Type         4 103191489 25797872   6.191 0.00379 **
## Residuals   15  62505657  4167044                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Answer Part A: The p-value is 0.00379, which is very small, so at any reasonable level of significance we reject Ho and conclude that at least one mui differs; that is, the five materials do not all have the same mean failure time.

Part B:

Plots

aov.model<-aov(Time~Type,data=Data)
plot(aov.model)

Answer Part B: The “Normal Q-Q” and “Residuals vs Fitted” plots show that neither the normality assumption nor the constant variance assumption is satisfied. In fact, the “Residuals vs Fitted” plot shows a very large difference in spread between materials. Thus we need to transform the data.

Part C:

Box Cox

library(MASS)
boxplot(Data$Time~Data$Type,xlab="Material Type",ylab="Failure Time",main="Boxplot of Observations")

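# Pass data=Data so that Type is the factor column, not the numeric vector in the workspace
boxcox(Time~Type,data=Data)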

In the Box-Cox plot, lambda = 1 lies outside the 95% confidence interval, which confirms that the data require a transformation. The log-likelihood is maximized very close to lambda = 0, so we apply a log transformation to the data.
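Instead of reading lambda off the plot, the maximizing value and an approximate 95% interval can be extracted from the values that `boxcox` returns. This is a sketch that treats material as a factor; the names `time`, `material`, `lambda_hat`, and `lambda_ci` are illustrative, and the finer lambda grid is a choice made here.

```r
library(MASS)

time <- c(110, 157, 194, 178,   1, 2, 4, 18,
          880, 1256, 5276, 4355, 495, 7040, 5307, 10050,
          7, 5, 29, 2)
material <- factor(rep(1:5, each = 4))

# Profile log-likelihood on a grid of lambda values, without drawing the plot
bc <- boxcox(time ~ material, lambda = seq(-2, 2, by = 0.05), plotit = FALSE)

# Grid value of lambda with the highest log-likelihood
lambda_hat <- bc$x[which.max(bc$y)]

# Approximate 95% interval: lambdas within qchisq(0.95, 1) / 2 of the maximum
cutoff    <- max(bc$y) - qchisq(0.95, df = 1) / 2
lambda_ci <- range(bc$x[bc$y >= cutoff])

lambda_hat
lambda_ci
```

If the interval stays well below 1 and sits near 0, the log transformation is the natural choice.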

Log Transformation

LogTime <- log(Time)
boxplot(LogTime~Data$Type,xlab="Material Type",ylab="Log Failure Time",main="Boxplot of Log-Transformed Observations")

As the boxplot shows, the spread of the log failure times is now much more uniform across materials than it was on the original scale.
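The improvement in spread can also be put in numbers by comparing the group standard deviations before and after taking logs. This is a small sketch; `ratio_raw` and `ratio_log` are illustrative names for the largest-to-smallest standard deviation ratios.

```r
time <- c(110, 157, 194, 178,   1, 2, 4, 18,
          880, 1256, 5276, 4355, 495, 7040, 5307, 10050,
          7, 5, 29, 2)
material <- factor(rep(1:5, each = 4))

# Standard deviation of each material on the raw and on the log scale
sd_raw <- tapply(time, material, sd)
sd_log <- tapply(log(time), material, sd)

# Ratio of largest to smallest group standard deviation
ratio_raw <- max(sd_raw) / min(sd_raw)
ratio_log <- max(sd_log) / min(sd_log)

round(rbind(raw = sd_raw, log = sd_log), 2)
c(ratio_raw = ratio_raw, ratio_log = ratio_log)
```

On the raw scale the standard deviations differ by a factor of several hundred; on the log scale they differ by a single-digit factor.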

ANOVA Analysis on Transformed Data

DataT <- cbind(LogTime,Type)
DataT <- data.frame(DataT)
DataT$Type <- as.factor(DataT$Type)
str(DataT)
## 'data.frame':    20 obs. of  2 variables:
##  $ LogTime: num  4.7 5.06 5.27 5.18 0 ...
##  $ Type   : Factor w/ 5 levels "1","2","3","4",..: 1 1 1 1 2 2 2 2 3 3 ...
aov.modelT<-aov(LogTime~Type,data=DataT)
summary(aov.modelT)
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## Type         4 165.06   41.26   37.66 1.18e-07 ***
## Residuals   15  16.44    1.10                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(aov.modelT)

Answer Part C: In the ANOVA on the transformed data the p-value (1.18e-07) is extremely small, so there is very strong evidence that at least one mui differs. The residual plots of the transformed data also show no serious violations, so we can reasonably conclude that the model is adequate.

Question 3.29

A semiconductor manufacturer has developed three different methods for reducing particle counts on wafers. All three methods are tested on five different wafers and the after-treatment particle count is obtained.

Part A:

Null Hypothesis: Ho: mu1 = mu2 = mu3 = mu

Alternative Hypothesis: Ha: At least one mui differs, i = {1,2,3}

Reading Data

Count <- c(31, 10, 21, 4, 1, 62, 40, 24, 30, 35, 53, 27, 120, 97, 68)
Type <- c(rep(1,5), rep(2,5), rep(3,5))
Data <- cbind(Count,Type)
Data <- data.frame(Data)
Data$Type <- as.factor(Data$Type)
str(Data)
## 'data.frame':    15 obs. of  2 variables:
##  $ Count: num  31 10 21 4 1 62 40 24 30 35 ...
##  $ Type : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 2 2 2 2 2 ...

ANOVA Analysis

aov.model<-aov(Count~Type,data=Data)
summary(aov.model)
##             Df Sum Sq Mean Sq F value  Pr(>F)   
## Type         2   8964    4482   7.914 0.00643 **
## Residuals   12   6796     566                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Answer Part A: Since the p-value is 0.00643, at the 0.05 level of significance we reject Ho and conclude that at least one mui differs, which means at least one method has a different effect on mean particle count.

Part B:

Plots

aov.model<-aov(Count~Type,data=Data)
plot(aov.model)

Answer Part B: The “Residuals vs Fitted” plot shows that the spread of the residuals is not the same for the three methods; it grows with the fitted value. The constant variance assumption therefore does not hold for the data in their current form, so we transform the data.
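One informal way to see both the problem and a candidate remedy is to regress the log of the group standard deviations on the log of the group means; if the slope is roughly alpha, a power transformation with lambda = 1 - alpha tends to stabilize the variance. This empirical rule is a supplement to the original solution, and with only three groups it is a rough guide at best. The names `slope` and `lambda_guess` are illustrative.

```r
count <- c(31, 10, 21, 4, 1,
           62, 40, 24, 30, 35,
           53, 27, 120, 97, 68)
method <- factor(rep(1:3, each = 5))

# Per-method mean and standard deviation
m <- as.numeric(tapply(count, method, mean))
s <- as.numeric(tapply(count, method, sd))

# Slope of log(sd) on log(mean) estimates alpha in sd ~ mean^alpha
slope <- unname(coef(lm(log(s) ~ log(m)))[2])

# Variance-stabilizing power suggested by the rule lambda = 1 - alpha
lambda_guess <- 1 - slope

round(cbind(mean = m, sd = s), 2)
c(slope = slope, lambda_guess = lambda_guess)
```

The standard deviation rises with the mean, and the suggested power is in the neighborhood of one half.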

Part C:

Box Cox

library(MASS)
boxplot(Data$Count~Data$Type,xlab="Method Type",ylab="Particle Count",main="Boxplot of Observations")

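# Pass data=Data so that Type is the factor column, not the numeric vector in the workspace
boxcox(Count~Type,data=Data)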

In the Box-Cox plot, lambda = 1 lies outside the 95% confidence interval, which confirms that the data require a transformation. The log-likelihood is maximized close to lambda = 0.5, so we apply a power transformation with lambda = 0.5.

Transformation

library(MASS)
lambda <- 0.5
TCount<-Count^(lambda)
boxplot(TCount~Data$Type,xlab="Method Type",ylab="sqrt(Particle Count)",main="Boxplot of Transformed Observations")

As the boxplot shows, the spread of the transformed particle counts is more uniform than before the transformation, although still not perfectly equal across methods.

P.S.: The transformation with lambda = 0.5 is the same as the square root transformation.
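The sketch below confirms that raising to the power 0.5 matches `sqrt`, and that using the full Box-Cox form (y^lambda - 1) / lambda instead of the plain power leaves the F statistic unchanged, since the two differ only by a linear rescaling. The names `power_form`, `boxcox_form`, `f_power`, and `f_boxcox` are illustrative.

```r
count <- c(31, 10, 21, 4, 1,
           62, 40, 24, 30, 35,
           53, 27, 120, 97, 68)
method <- factor(rep(1:3, each = 5))
lambda <- 0.5

power_form  <- count^lambda                 # transformation used in this solution
boxcox_form <- (count^lambda - 1) / lambda  # formal Box-Cox form

# Power 0.5 and sqrt give the same values
all.equal(power_form, sqrt(count))

# F statistics from one-way ANOVA on each version of the response
f_power  <- anova(lm(power_form ~ method))[["F value"]][1]
f_boxcox <- anova(lm(boxcox_form ~ method))[["F value"]][1]

c(f_power = f_power, f_boxcox = f_boxcox)
```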

ANOVA Analysis on Transformed Data

DataT <- cbind(TCount,Type)
DataT <- data.frame(DataT)
DataT$Type <- as.factor(DataT$Type)
str(DataT)
## 'data.frame':    15 obs. of  2 variables:
##  $ TCount: num  5.57 3.16 4.58 2 1 ...
##  $ Type  : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 2 2 2 2 2 ...
aov.modelT<-aov(TCount~Type,data=DataT)
summary(aov.modelT)
##             Df Sum Sq Mean Sq F value  Pr(>F)   
## Type         2  63.90   31.95    9.84 0.00295 **
## Residuals   12  38.96    3.25                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(aov.modelT)

Answer Part C: In the ANOVA on the transformed data the p-value is 0.00295, so at the 0.05 level of significance we conclude that at least one mui differs, which means that method type has a significant effect on mean particle count. The residual plots of the transformed data also look acceptable, so we can reasonably consider the model adequate.

Question 3.51

Use the Kruskal–Wallis test for the experiment in Problem 3.23. Compare the conclusions obtained with those from the usual analysis of variance.

Reading Data

Life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6, 16.9, 15.3, 18.6, 17.1, 19.5, 20.3, 21.4, 23.6, 19.4, 18.5, 20.5, 22.3, 19.3, 21.1, 16.9, 17.5, 18.3, 19.8) 
Type <- c(rep(1,6), rep(2,6), rep(3,6), rep(4,6))
Data <- cbind(Life, Type)
Data <- data.frame(Data)
Data$Type <- as.factor(Data$Type)
str(Data)
## 'data.frame':    24 obs. of  2 variables:
##  $ Life: num  17.6 18.9 16.3 17.4 20.1 21.6 16.9 15.3 18.6 17.1 ...
##  $ Type: Factor w/ 4 levels "1","2","3","4": 1 1 1 1 1 1 2 2 2 2 ...

Kruskal-Wallis Test

kruskal.test(Life~Type,data=Data)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  Life by Type
## Kruskal-Wallis chi-squared = 6.2177, df = 3, p-value = 0.1015

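Answer: Since the p-value is 0.1015, we fail to reject Ho at the 0.05 level, so there is no indication that the fluids differ. This agrees with the conclusion from the usual analysis of variance at the 0.05 level. The two tests are not identical in strength, however: the ANOVA p-value (0.0525) is borderline and would lead to rejection at the 0.10 level, while the Kruskal-Wallis p-value would not.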
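To see where the Kruskal-Wallis statistic comes from, it can be rebuilt from the rank sums, including the correction for the one tied value (16.9 appears twice). This is a sketch using the standard formula; the names `h_raw`, `tie_correction`, and `h_stat` are illustrative.

```r
life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
          16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
          19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
fluid <- factor(rep(1:4, each = 6))

N <- length(life)
r <- rank(life)   # tied observations receive the average of their ranks

rank_sums <- tapply(r, fluid, sum)
n_i       <- tapply(r, fluid, length)

# Uncorrected H statistic
h_raw <- 12 / (N * (N + 1)) * sum(rank_sums^2 / n_i) - 3 * (N + 1)

# Tie correction: 1 - sum(t^3 - t) / (N^3 - N) over groups of tied values
ties <- table(life)
tie_correction <- 1 - sum(ties^3 - ties) / (N^3 - N)

h_stat  <- h_raw / tie_correction
p_value <- pchisq(h_stat, df = nlevels(fluid) - 1, lower.tail = FALSE)

round(c(H = h_stat, p = p_value), 4)
```

The result should match the chi-squared value and p-value reported by `kruskal.test` above.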