The effective life of insulating fluids at an accelerated load of 35 kV is being studied. Test data have been obtained for four types of fluids. The results from a completely randomized experiment were as follows:
Is there any indication that the fluids differ? Use \(\alpha=0.05\).
Which fluid would you select, given that the objective is long life?
Analyze the residuals from this experiment. Are the basic analysis of variance assumptions satisfied?
PART A:
Reading the Data:
Life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6, 16.9, 15.3, 18.6, 17.1, 19.5, 20.3, 21.4, 23.6, 19.4, 18.5, 20.5, 22.3, 19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
Type <- c(rep(1,6), rep(2,6), rep(3,6), rep(4,6))
Data <- data.frame(Life, Type)
Data$Type <- as.factor(Data$Type)
str(Data)
## 'data.frame': 24 obs. of 2 variables:
## $ Life: num 17.6 18.9 16.3 17.4 20.1 21.6 16.9 15.3 18.6 17.1 ...
## $ Type: Factor w/ 4 levels "1","2","3","4": 1 1 1 1 1 1 2 2 2 2 ...
Performing ANOVA:
First Stating the Hypothesis:
Null: H0: μ1 = μ2 = μ3 = μ4
Alternate: H1: μi ≠ μj for at least one pair (i,j)
aov.model<-aov(Life~Type,data=Data)
summary(aov.model)
## Df Sum Sq Mean Sq F value Pr(>F)
## Type 3 30.17 10.05 3.047 0.0525 .
## Residuals 20 65.99 3.30
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Conclusion:
---> At α = 0.05 there is no indication that the fluids differ: the p-value (0.0525) exceeds 0.05, so we fail to reject the null hypothesis. Because the p-value is only slightly above 0.05, the difference would be declared significant at a slightly larger significance level (for example α = 0.10).
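The same decision can be reached by comparing the F statistic with its critical value instead of reading the p-value. The following is a minimal sketch in base R that rebuilds the F statistic from the sums of squares; the names `f_stat`, `f_crit` and `p_value` are illustrative and are not part of the analysis above.

```r
# Fluid life data from the problem (6 observations per fluid type)
Life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6, 16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3, 19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
Type <- factor(rep(1:4, each = 6))

a <- nlevels(Type)   # number of treatments
N <- length(Life)    # total number of observations

# Between-treatment and within-treatment sums of squares
group_mean_per_obs <- ave(Life, Type)   # group mean repeated for each observation
ss_treat <- sum((group_mean_per_obs - mean(Life))^2)
ss_error <- sum((Life - group_mean_per_obs)^2)

# F statistic, critical value at alpha = 0.05, and p-value
f_stat  <- (ss_treat / (a - 1)) / (ss_error / (N - a))
f_crit  <- qf(0.95, df1 = a - 1, df2 = N - a)
p_value <- pf(f_stat, df1 = a - 1, df2 = N - a, lower.tail = FALSE)

print(round(c(F = f_stat, F_crit = f_crit, p = p_value), 4))
```

The F statistic falls just short of the critical value, which is the same borderline result that the p-value of 0.0525 expresses.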
PART B:
Performing the LSD test to identify which treatment mean is highest and which pairs of means differ significantly:
library(agricolae)
?LSD.test
LSD.test(aov.model,"Type",console=TRUE)
##
## Study: aov.model ~ "Type"
##
## LSD t Test for Life
##
## Mean Square Error: 3.299667
##
## Type, means and individual ( 95 %) CI
##
## Life std r LCL UCL Min Max
## 1 18.65000 1.952178 6 17.10309 20.19691 16.3 21.6
## 2 17.95000 1.854454 6 16.40309 19.49691 15.3 20.3
## 3 20.95000 1.879096 6 19.40309 22.49691 18.5 23.6
## 4 18.81667 1.554885 6 17.26975 20.36358 16.9 21.1
##
## Alpha: 0.05 ; DF Error: 20
## Critical Value of t: 2.085963
##
## least Significant Difference: 2.187666
##
## Treatments with the same letter are not significantly different.
##
## Life groups
## 3 20.95000 a
## 4 18.81667 ab
## 1 18.65000 b
## 2 17.95000 b
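The least significant difference reported above can be reproduced without `agricolae`. This is a sketch that assumes the balanced design of the problem (n = 6 per fluid) and uses only base R; `lsd` and `diffs` are illustrative names.

```r
# Fluid life data from the problem (6 observations per fluid type)
Life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6, 16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3, 19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
Type <- factor(rep(1:4, each = 6))

fit <- aov(Life ~ Type)
mse <- deviance(fit) / df.residual(fit)        # residual mean square
n   <- 6                                       # replicates per treatment (balanced)

# LSD = t(alpha/2, error df) * sqrt(2 * MSE / n)
t_crit <- qt(1 - 0.05 / 2, df.residual(fit))
lsd    <- t_crit * sqrt(2 * mse / n)

# Absolute differences between every pair of treatment means
means <- sapply(split(Life, Type), mean)
diffs <- abs(outer(means, means, "-"))

print(round(lsd, 4))
print(diffs > lsd)   # TRUE marks pairs that differ by more than the LSD
```

Pairs flagged `TRUE` correspond to treatments that do not share a letter in the grouping above.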
Conclusion:
---> Given that the objective is long life, I would choose Fluid Type 3 because it has the highest mean life (20.95). In the LSD grouping it differs significantly from Types 1 and 2 but not from Type 4.
PART C:
Analyzing Residuals:
plot(aov.model)
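The residual plots can be complemented with formal tests. The sketch below is an addition, not part of the original analysis: it applies the Shapiro-Wilk test to the residuals and Bartlett's test to the group variances, both from base R's `stats` package. With only 24 observations these tests have limited power, so they should be read alongside the plots rather than instead of them.

```r
# Fluid life data from the problem (6 observations per fluid type)
Life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6, 16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3, 19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
Type <- factor(rep(1:4, each = 6))

fit <- aov(Life ~ Type)
res <- residuals(fit)

# H0: the residuals come from a normal distribution
sw <- shapiro.test(res)

# H0: all fluid types have the same variance
bt <- bartlett.test(Life ~ Type)

print(sw)
print(bt)
```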
Conclusion:
---> Nothing unusual appears in the residual plots: the normality and constant-variance assumptions appear to be satisfied, and the model is adequate.
An experiment was performed to investigate the effectiveness of five insulating materials. Four samples of each material were tested at an elevated voltage level to accelerate the time to failure. The failure times (in minutes) are shown below:
Do all five materials have the same effect on mean failure time?
Plot the residuals versus the predicted response. Construct a normal probability plot of the residuals. What information is conveyed by these plots?
Based on your answer to part (b) conduct another analysis of the failure time data and draw appropriate conclusions.
PART A:
Reading the Data:
Time <- c(110, 157, 194, 178, 1, 2, 4, 18, 880, 1256, 5276, 4355, 495, 7040, 5307, 10050, 7, 5, 29, 2)
Type <- c(rep(1,4), rep(2,4), rep(3,4), rep(4,4), rep(5,4))
Data <- data.frame(Time,Type)
Data$Type <- as.factor(Data$Type)
str(Data)
## 'data.frame': 20 obs. of 2 variables:
## $ Time: num 110 157 194 178 1 ...
## $ Type: Factor w/ 5 levels "1","2","3","4",..: 1 1 1 1 2 2 2 2 3 3 ...
Performing ANOVA:
First Stating the Hypothesis:
Null Hypothesis: \(H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5\)
Alternative Hypothesis: \(H_a:\) at least one \(\mu_i\) differs
aov.model<-aov(Time~Type,data=Data)
summary(aov.model)
## Df Sum Sq Mean Sq F value Pr(>F)
## Type 4 103191489 25797872 6.191 0.00379 **
## Residuals 15 62505657 4167044
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Conclusion:
---> With a p-value of 0.00379, below the significance level of 0.05, we reject the null hypothesis: the five materials do not all have the same effect on mean failure time.
PART B:
Plotting the residuals using Anova Plots:
library(ggfortify)
library(ggplot2)
autoplot(aov.model)
Conclusion:
---> 1) The normal probability plot of the residuals shows that the residuals are not normally distributed, because the points do not fall along a straight line. 2) The residuals versus fitted values plot shows that the variance is not constant, because the spread of the residuals does not form a roughly rectangular band. The requirements for a valid ANOVA are violated.
PART C:
Based on Part B results we have to perform Data transformation (Either BoxCox or Natural Log) in order to stabilize the variance:
To visually see differences in variance, plotting Box Plot:
boxplot(Data$Time~Data$Type,xlab="Material Type",ylab="Failure Time",main="Boxplot of Observations")
---> Huge differences in variances between Failure Times data of different Material Types
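The differences in spread can also be put into numbers. The sketch below tabulates the mean and standard deviation of each material on the raw scale, plus the standard deviation on the log scale; the helper names `grp` and `stats` are illustrative.

```r
# Failure time data from the problem (4 samples per material)
Time <- c(110, 157, 194, 178, 1, 2, 4, 18, 880, 1256, 5276, 4355,
          495, 7040, 5307, 10050, 7, 5, 29, 2)
Type <- factor(rep(1:5, each = 4))

grp <- split(Time, Type)
stats <- data.frame(
  mean   = sapply(grp, mean),
  sd     = sapply(grp, sd),
  log_sd = sapply(grp, function(x) sd(log(x)))   # spread after a log transformation
)
print(round(stats, 2))

# Ratio of the largest to the smallest standard deviation on each scale
raw_ratio <- max(stats$sd) / min(stats$sd)
log_ratio <- max(stats$log_sd) / min(stats$log_sd)
cat("sd ratio, raw scale:", round(raw_ratio, 1), "\n")
cat("sd ratio, log scale:", round(log_ratio, 1), "\n")
```

On the raw scale the standard deviation grows with the mean, which is the pattern a variance-stabilizing transformation is meant to remove. The log scale narrows the gap between groups but does not make the spreads equal.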
Performing BoxCox Transformation:
library(MASS)
boxcox(Time~Type,data=Data)
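Instead of reading λ off the plot, the maximizing value can be extracted from the object that `boxcox()` returns. This is a sketch under two assumptions: `Type` is supplied as a factor through a data frame so that the fitted model matches the one-way ANOVA, and the approximate 95% interval is taken as the set of λ values whose log-likelihood lies within `qchisq(0.95, 1) / 2` of the maximum. `lambda_hat` and `ci` are illustrative names.

```r
library(MASS)

# Failure time data from the problem (4 samples per material)
Time <- c(110, 157, 194, 178, 1, 2, 4, 18, 880, 1256, 5276, 4355,
          495, 7040, 5307, 10050, 7, 5, 29, 2)
Data <- data.frame(Time = Time, Type = factor(rep(1:5, each = 4)))

# plotit = FALSE returns the lambda grid (x) and profile log-likelihood (y)
bc <- boxcox(Time ~ Type, data = Data, plotit = FALSE)

# Lambda with the highest log-likelihood on the default grid (-2 to 2, step 0.1)
lambda_hat <- bc$x[which.max(bc$y)]

# Approximate 95% confidence interval for lambda
ci <- range(bc$x[bc$y > max(bc$y) - qchisq(0.95, 1) / 2])

cat("lambda_hat:", lambda_hat, " approx. 95% CI:", ci[1], "to", ci[2], "\n")
```

A value of λ near 0 points to the log transformation, while λ = 1 inside the interval would mean that no transformation is indicated.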
Performing Natural Log Transformation on Failure times data and Analyzing Variances:
LogTime <- log(Time)
boxplot(LogTime~Data$Type,xlab="Material Type",ylab="log(Failure Time)",main="Boxplot of Log-Transformed Observations")
Checking Residuals Plots for Log Transformed Data:
DataT<-data.frame(LogTime,Type)
DataT$Type <- as.factor(DataT$Type)
aovmodelT<-aov(LogTime~Type,data=DataT)
autoplot(aovmodelT)
---> The log transformation did not fully resolve the problem, because the data remain messy with outliers. We therefore resort to a non-parametric alternative to ANOVA, the Kruskal-Wallis test:
kruskal.test(Time,Type,data=Data)
##
## Kruskal-Wallis rank sum test
##
## data: Time and Type
## Kruskal-Wallis chi-squared = 16.873, df = 4, p-value = 0.002046
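The statistic reported above can be rebuilt from the ranks, which shows what the test actually compares. The sketch below computes H from the rank sums of each material and applies the usual correction for ties (the value 2 occurs twice in the data); `H`, `H_adj` and `p_value` are illustrative names.

```r
# Failure time data from the problem (4 samples per material)
Time <- c(110, 157, 194, 178, 1, 2, 4, 18, 880, 1256, 5276, 4355,
          495, 7040, 5307, 10050, 7, 5, 29, 2)
Type <- factor(rep(1:5, each = 4))

N   <- length(Time)
r   <- rank(Time)                     # tied values receive their average rank
R   <- sapply(split(r, Type), sum)    # rank sum per material
n_i <- sapply(split(r, Type), length) # sample size per material

# Kruskal-Wallis statistic before the tie correction
H <- 12 / (N * (N + 1)) * sum(R^2 / n_i) - 3 * (N + 1)

# Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N), t = size of each tie group
t_sizes <- as.numeric(table(Time))
H_adj   <- H / (1 - sum(t_sizes^3 - t_sizes) / (N^3 - N))

# Chi-squared approximation with (number of groups - 1) degrees of freedom
p_value <- pchisq(H_adj, df = nlevels(Type) - 1, lower.tail = FALSE)

print(round(c(H = H, H_adj = H_adj, p = p_value), 6))
```

Because only ranks enter the calculation, the extreme failure times of materials 3 and 4 carry no more weight than their rank order, which is why the test is not affected by the unequal variances seen above.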
Conclusion:
---> The Kruskal-Wallis test gives a p-value of 0.002046, which is less than α = 0.05. We therefore again reject the null hypothesis and conclude that the five materials do not all have the same effect on failure time. Because this test does not rely on the normality and constant-variance assumptions that were violated above, it supports the ANOVA conclusion on firmer ground.
A semiconductor manufacturer has developed three different methods for reducing particle counts on wafers. All three methods are tested on five different wafers and the after treatment particle count obtained. The data are shown below:
Do all methods have the same effect on mean particle count?
Plot the residuals versus the predicted response. Construct a normal probability plot of the residuals. Are there potential concerns about the validity of the assumptions?
Based on your answer to part (b) conduct another analysis of the particle count data and draw appropriate conclusions.
PART A:
Reading the Data:
Count <- c(31, 10, 21, 4, 1, 62, 40, 24, 30, 35, 53, 27, 120, 97, 68)
Type <- c(rep(1,5), rep(2,5), rep(3,5))
Data <- data.frame(Count,Type)
Data$Type <- as.factor(Data$Type)
str(Data)
## 'data.frame': 15 obs. of 2 variables:
## $ Count: num 31 10 21 4 1 62 40 24 30 35 ...
## $ Type : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 2 2 2 2 2 ...
Performing ANOVA:
First Stating the Hypothesis:
\(H_0: \mu_1 = \mu_2 = \mu_3\)
\(H_a:\) at least one \(\mu_i\) differs
Where 1, 2, and 3 correspond to Method 1, Method 2, and Method 3
aov.model<-aov(Count~Type,data=Data)
summary(aov.model)
## Df Sum Sq Mean Sq F value Pr(>F)
## Type 2 8964 4482 7.914 0.00643 **
## Residuals 12 6796 566
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Conclusion:
---> Since the p-value is 0.00643, at the 0.05 level of significance we reject \(H_0\) and conclude that at least one \(\mu_i\) differs, which means at least one method has a different effect on mean particle count.
PART B:
Plotting the Residuals:
autoplot(aov.model)
If we look at the “Residuals vs Fitted” plot, we can see that the spread of the residuals is not constant across the three methods, so the constant-variance assumption required for a valid ANOVA does not hold.
PART C:
As per the Part B results, we need to transform the data in order to draw appropriate conclusions:
library(MASS)
boxplot(Data$Count~Data$Type,xlab="Method Type",ylab="Particle Count",main="Boxplot of Observations")
Since Variances vary, we will stabilize them using BoxCox transformation:
boxcox(Count~Type)
---> λ = 1 lies outside the confidence interval and the likelihood function is maximized near λ = 0.4, so we transform the count data using λ = 0.4:
lambda <- 0.4
CountT<-Count^(lambda)
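A quick numeric check of whether the power transformation evens out the spread is to compare the ratio of the largest to the smallest group standard deviation before and after. This is a sketch; `sd_ratio` is a hypothetical helper written for this illustration, not a function from any package.

```r
# Particle count data from the problem (5 wafers per method)
Count <- c(31, 10, 21, 4, 1, 62, 40, 24, 30, 35, 53, 27, 120, 97, 68)
Type  <- factor(rep(1:3, each = 5))

# Hypothetical helper: largest group sd divided by smallest group sd
sd_ratio <- function(y, g) {
  s <- sapply(split(y, g), sd)
  max(s) / min(s)
}

ratio_raw <- sd_ratio(Count, Type)        # original scale
ratio_pow <- sd_ratio(Count^0.4, Type)    # after the lambda = 0.4 transformation

cat("sd ratio, raw counts:   ", round(ratio_raw, 2), "\n")
cat("sd ratio, Count^0.4:    ", round(ratio_pow, 2), "\n")
```

A ratio closer to 1 means more homogeneous variances, so a drop in the ratio after the transformation is the numeric counterpart of the boxplots becoming more alike.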
We look to see how the transformation did:
boxplot(CountT~Data$Type,xlab="Method Type",ylab="Transformed Particle Count",main="Boxplot of Transformed Observations")
boxcox(CountT~Type)
ANOVA Analysis on Transformed Data:
DataT <- data.frame(CountT,Type)
DataT$Type <- as.factor(DataT$Type)
str(DataT)
## 'data.frame': 15 obs. of 2 variables:
## $ CountT: num 3.95 2.51 3.38 1.74 1 ...
## $ Type : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 2 2 2 2 2 ...
aov.modelT<-aov(CountT~Type,data=DataT)
summary(aov.modelT)
## Df Sum Sq Mean Sq F value Pr(>F)
## Type 2 21.21 10.605 9.881 0.00291 **
## Residuals 12 12.88 1.073
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
autoplot(aov.modelT)
CONCLUSION:
---> After transforming the data, the ANOVA model appears adequate, with approximately normal residuals and constant variance: the residuals versus fitted plot now shows a roughly rectangular band. From the ANOVA on the transformed data, the p-value is 0.00291, so at the 0.05 level of significance at least one \(\mu_i\) differs, which means that method type has a significant effect on mean particle count.
Use the Kruskal–Wallis test for the experiment in Problem 3.23.
3.51) Compare the conclusions obtained with those from the usual analysis of variance.
3.52) Are the results comparable to those found by the usual analysis of variance?
Reading the Data:
Life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6, 16.9, 15.3, 18.6, 17.1, 19.5, 20.3, 21.4, 23.6, 19.4, 18.5, 20.5, 22.3, 19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
Type <- c(rep(1,6), rep(2,6), rep(3,6), rep(4,6))
Data <- data.frame(Life,Type)
Data$Type <- as.factor(Data$Type)
str(Data)
## 'data.frame': 24 obs. of 2 variables:
## $ Life: num 17.6 18.9 16.3 17.4 20.1 21.6 16.9 15.3 18.6 17.1 ...
## $ Type: Factor w/ 4 levels "1","2","3","4": 1 1 1 1 1 1 2 2 2 2 ...
Stating the Hypothesis:
Null: H0: μ1 = μ2 = μ3 = μ4
Alternate: H1: μi ≠ μj for at least one pair (i,j)
Now Performing Kruskal Wallis Test:
kruskal.test(Life~Type,data=Data)
##
## Kruskal-Wallis rank sum test
##
## data: Life by Type
## Kruskal-Wallis chi-squared = 6.2177, df = 3, p-value = 0.1015
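To compare the two analyses directly, the p-values of the one-way ANOVA and of the Kruskal-Wallis test can be extracted side by side. A minimal sketch in base R; `p_anova` and `p_kw` are illustrative names.

```r
# Fluid life data from Problem 3.23 (6 observations per fluid type)
Life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6, 16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3, 19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
Type <- factor(rep(1:4, each = 6))

# p-value of the F test from the one-way ANOVA table
p_anova <- summary(aov(Life ~ Type))[[1]][["Pr(>F)"]][1]

# p-value of the Kruskal-Wallis rank sum test
p_kw <- kruskal.test(Life ~ Type)$p.value

alpha <- 0.05
print(data.frame(
  test     = c("ANOVA", "Kruskal-Wallis"),
  p_value  = round(c(p_anova, p_kw), 4),
  reject_H0 = c(p_anova, p_kw) < alpha
))
```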
Conclusion:
---> The p-value of 0.1015 is greater than 0.05, so we fail to reject the null hypothesis: there is no indication that the fluids differ. The ANOVA led to the same decision with a smaller p-value (0.0525). Both p-values exceed 0.05, so neither test finds a significant difference in life between the fluid types, and the results and conclusions are comparable to those found by the analysis of variance in Question 3.23.
getwd()
##Question 3.23:
#PART A:
#Reading the Data:
Life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6, 16.9, 15.3, 18.6, 17.1, 19.5, 20.3, 21.4, 23.6, 19.4, 18.5, 20.5, 22.3, 19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
Type <- c(rep(1,6), rep(2,6), rep(3,6), rep(4,6))
Data <- data.frame(Life, Type)
Data$Type <- as.factor(Data$Type)
str(Data)
#ANOVA:
aov.model<-aov(Life~Type,data=Data)
summary(aov.model)
#PART B:
library(agricolae)
LSD.test(aov.model,"Type",console=TRUE)
#PART C:
plot(aov.model)
##Question 3.28:
#Reading the data:
Time <- c(110, 157, 194, 178, 1, 2, 4, 18, 880, 1256, 5276, 4355, 495, 7040, 5307, 10050, 7, 5, 29, 2)
Type <- c(rep(1,4), rep(2,4), rep(3,4), rep(4,4), rep(5,4))
Data <- data.frame(Time,Type)
Data$Type <- as.factor(Data$Type)
str(Data)
#PART A:
aov.model<-aov(Time~Type,data=Data)
summary(aov.model)
#PART B:
library(ggfortify)
library(ggplot2)
autoplot(aov.model)
#PART C:
library(MASS)
boxplot(Data$Time~Data$Type,xlab="Material Type",ylab="Failure Time",main="Boxplot of Observations")
boxcox(Time~Type,data=Data)
LogTime <- log(Time)
boxplot(LogTime~Data$Type,xlab="Material Type",ylab="log(Failure Time)",main="Boxplot of Log-Transformed Observations")
DataT<-data.frame(LogTime,Type)
DataT$Type <- as.factor(DataT$Type)
str(DataT)
aovmodelT<-aov(LogTime~Type,data=DataT)
plot(aovmodelT)
?kruskal.test
kruskal.test(Time,Type,data=Data)
##Question 3.29:
#Reading the Data:
Count <- c(31, 10, 21, 4, 1, 62, 40, 24, 30, 35, 53, 27, 120, 97, 68)
Type <- c(rep(1,5), rep(2,5), rep(3,5))
Data <- data.frame(Count,Type)
Data$Type <- as.factor(Data$Type)
str(Data)
#PART A:
aov.model<-aov(Count~Type,data=Data)
summary(aov.model)
#PART B:
autoplot(aov.model)
#PART C:
library(MASS)
boxplot(Data$Count~Data$Type,xlab="Method Type",ylab="Particle Count",main="Boxplot of Observations")
boxcox(Count~Type)
lambda <- 0.4
CountT<-Count^(lambda)
#we look to see how the transformation did
boxplot(CountT~Data$Type,xlab="Method Type",ylab="Transformed Particle Count",main="Boxplot of Transformed Observations")
boxcox(CountT~Type)
#ANOVA Analysis on Transformed Data
DataT <- data.frame(CountT,Type)
DataT$Type <- as.factor(DataT$Type)
str(DataT)
aov.modelT<-aov(CountT~Type,data=DataT)
summary(aov.modelT)
autoplot(aov.modelT)
#Question 3.51 Question 3.52:
#Reading the Data:
Life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6, 16.9, 15.3, 18.6, 17.1, 19.5, 20.3, 21.4, 23.6, 19.4, 18.5, 20.5, 22.3, 19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
Type <- c(rep(1,6), rep(2,6), rep(3,6), rep(4,6))
Data <- data.frame(Life,Type)
Data$Type <- as.factor(Data$Type)
str(Data)
kruskal.test(Life~Type,data=Data)