The effective life of insulating fluids at an accelerated load of 35 kV is being studied. Test data have been obtained for four types of fluids. The results from a completely randomized experiment were as follows:
Is there any indication that the fluids differ? Use α = 0.05.
Which fluid would you select, given that the objective is long life?
Analyze the residuals from this experiment. Are the basic analysis of variance assumptions satisfied?
Let
μ1 = mean life (in h) at 35 kV load for fluid type 1
μ2 = mean life (in h) at 35 kV load for fluid type 2
μ3 = mean life (in h) at 35 kV load for fluid type 3
μ4 = mean life (in h) at 35 kV load for fluid type 4
Null hypothesis
H0: μ1 = μ2 = μ3 = μ4, that is, the mean lives of all four fluid types are equal.
Alternative hypothesis
Ha: At least one of the means (μ's) differs.
Reading the data
life <- c(17.6,18.9,16.3,17.4,20.1,21.6,16.9,15.3,18.6,17.1,19.5,20.3,21.4,23.6,19.4,18.5,20.5,22.3,19.3,21.1,16.9,17.5,18.3,19.8)
fluid_type<- c(rep(1,6),rep(2,6),rep(3,6),rep(4,6))
dat5<-cbind(fluid_type,life)
dat5<- as.data.frame(dat5)
dat5$fluid_type<-as.factor(dat5$fluid_type)
dat5$life <- as.numeric(dat5$life)
Model<- aov(dat5$life~dat5$fluid_type,data=dat5)
summary(Model)
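##                 Df Sum Sq Mean Sq F value Pr(>F)
## dat5$fluid_type  3  30.16   10.05   3.047 0.0525 .
## Residuals       20  65.99    3.30
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Conclusion: the P-value is 0.0525, which is slightly greater than α = 0.05, so we fail to reject the null hypothesis. At the 5% level there is no significant evidence that the mean lives of the fluids differ, although the result is borderline.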
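As a cross-check on the table above, the F statistic can be rebuilt from the sums of squares. The sketch below is only an illustration, not part of the original analysis; the variable names (`ss_treat`, `ss_error`, `f_crit`, and so on) are introduced here for convenience, and the data are re-entered so the block runs on its own.

```r
# Fluid life data, re-entered so this block is self-contained
life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
          16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
          19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
fluid_type <- factor(rep(1:4, each = 6))

a <- nlevels(fluid_type)   # number of treatments
N <- length(life)          # total number of observations
n <- N / a                 # replicates per treatment (balanced design)

group_means <- tapply(life, fluid_type, mean)
grand_mean  <- mean(life)

# Between-treatment and within-treatment sums of squares
ss_treat <- n * sum((group_means - grand_mean)^2)
ss_error <- sum((life - ave(life, fluid_type))^2)  # ave() returns each observation's group mean

# Mean squares, F statistic, P-value and 5% critical value
ms_treat <- ss_treat / (a - 1)
ms_error <- ss_error / (N - a)
f_stat   <- ms_treat / ms_error
p_value  <- pf(f_stat, a - 1, N - a, lower.tail = FALSE)
f_crit   <- qf(0.95, a - 1, N - a)

round(c(SS_treat = ss_treat, SS_error = ss_error, F = f_stat, p = p_value, F_crit = f_crit), 4)
```

Comparing `f_stat` with `f_crit` is equivalent to comparing the P-value with α = 0.05.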
Model2<- TukeyHSD(Model)
Model2
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = dat5$life ~ dat5$fluid_type, data = dat5)
##
## $`dat5$fluid_type`
##           diff         lwr       upr     p adj
## 2-1 -0.7000000 -3.63540073 2.2354007 0.9080815
## 3-1  2.3000000 -0.63540073 5.2354007 0.1593262
## 4-1  0.1666667 -2.76873407 3.1020674 0.9985213
## 3-2  3.0000000  0.06459927 5.9354007 0.0440578
## 4-2  0.8666667 -2.06873407 3.8020674 0.8413288
## 4-3 -2.1333333 -5.06873407 0.8020674 0.2090635
plot(Model2)
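The half-width of each Tukey interval above can be reproduced from the studentized range distribution. This is a sketch under the assumption of a balanced design with n = 6 observations per fluid; the residual sum of squares and degrees of freedom are copied from the ANOVA table, and the names `q_crit` and `hsd` are introduced here only for illustration.

```r
# Values taken from the ANOVA table above
ss_error <- 65.99   # residual sum of squares
df_error <- 20      # residual degrees of freedom
a <- 4              # number of fluid types
n <- 6              # observations per fluid type

mse <- ss_error / df_error

# Upper 5% point of the studentized range for a means and df_error degrees of freedom
q_crit <- qtukey(0.95, nmeans = a, df = df_error)

# Tukey honestly significant difference: two means differ if |diff| > hsd
hsd <- q_crit * sqrt(mse / n)

# Pairwise differences copied from the TukeyHSD output
diffs <- c("2-1" = -0.7000000, "3-1" = 2.3000000, "4-1" = 0.1666667,
           "3-2" = 3.0000000, "4-2" = 0.8666667, "4-3" = -2.1333333)
significant <- abs(diffs) > hsd
print(hsd)
print(significant)
```

Each interval in the table is `diff ± hsd`, so a pair is flagged exactly when its interval excludes zero.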
Which fluid would you select, given that the objective is long life?
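From the plot above, fluid 3 has the longest average life: every Tukey difference involving fluid 3 favors fluid 3, although only the difference between fluids 3 and 2 is significant at the 5% level (adjusted p = 0.044). Since the objective is long life, fluid 3 is our choice.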
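The choice can also be read directly from the sample means. The following is a small illustrative sketch, not part of the original solution; the data are re-entered and the name `best` is an assumption made here.

```r
# Fluid life data, re-entered so this block is self-contained
life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
          16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
          19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
fluid_type <- factor(rep(1:4, each = 6))

# Mean life per fluid type, sorted from longest to shortest
means <- tapply(life, fluid_type, mean)
print(sort(means, decreasing = TRUE))

# Label of the fluid type with the largest mean life
best <- names(which.max(means))
print(best)
```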
Analyze the residuals from this experiment. Are the basic analysis of variance assumptions satisfied?
We produce the diagnostic plots of the fitted model:
plot(Model)
Deductions: 1. The normal Q-Q plot of the residuals shows the points falling roughly on a straight line, so the normality assumption appears reasonable.
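The visual checks can be supplemented with formal tests. The sketch below is an optional addition that the original analysis does not perform: it applies the Shapiro-Wilk test to the residuals and Bartlett's test to the group variances, both from base R's `stats` package. The names `fit`, `res`, `norm_test`, and `var_test` are introduced here.

```r
# Fluid life data, re-entered so this block is self-contained
life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
          16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
          19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
fluid_type <- factor(rep(1:4, each = 6))

fit <- aov(life ~ fluid_type)
res <- residuals(fit)

# Normality of residuals: a large P-value gives no evidence against normality
norm_test <- shapiro.test(res)

# Equality of variances across fluid types: a large P-value gives no evidence
# against constant variance (note that Bartlett's test is sensitive to non-normality)
var_test <- bartlett.test(life ~ fluid_type)

print(norm_test)
print(var_test)
```

These tests have low power with only six observations per group, so they are best read alongside the plots rather than in place of them.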
An experiment was performed to investigate the effectiveness of five insulating materials. Four samples of each material were tested at an elevated voltage level to accelerate the time to failure. The failure times (in minutes) are shown below:
a. Do all five materials have the same effect on mean failure time?
b. Plot the residuals versus the predicted response. Construct a normal probability plot of the residuals. What information is conveyed by these plots?
c. Based on your answer to part (b) conduct another analysis of the failure time data and draw appropriate conclusions.
Reading the data:
material <- c(rep(1,4),rep(2,4),rep(3,4),rep(4,4),rep(5,4))
t <- c(110,157,194,178,1,2,4,18,880,1256,5276,4355,495,7040,5307,10050,7,5,29,2)
dat <- cbind(material,t)
dat <- as.data.frame(dat)
dat$material <- as.factor(as.character(dat$material))
dat$t<- as.numeric(dat$t)
str(dat)
Testing the hypothesis:
Null hypothesis:
H0: μ1 = μ2 = μ3 = μ4 = μ5, that is, the mean failure times of all five materials are equal.
Alternative hypothesis: Ha: At least one of the means (μ's) differs.
Model3<- aov(dat$t~dat$material,data = dat)
summary(Model3)
##              Df    Sum Sq  Mean Sq F value  Pr(>F)
## dat$material  4 103191489 25797872   6.191 0.00379 **
## Residuals    15  62505657  4167044
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Conclusion: the P-value is 0.00379, which is less than 0.05.
As a result, we reject the null hypothesis. This implies that the mean failure times of the five materials are not all the same.
Plot the residuals versus the predicted response. Construct a normal probability plot of the residuals. What information is conveyed by these plots?
plot(Model3)
Conclusion
The residuals do not appear to be normally distributed, since the points on the normal Q-Q plot do not fall on a straight line.
Also, the residuals vs. fitted plot shows that the spread of the residuals changes markedly with the fitted value. This implies that the variance is not constant.
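One way to quantify the non-constant variance is to regress the log of each group's standard deviation on the log of its mean; if the standard deviation is proportional to mean^α, the slope estimates α and 1 - α suggests a transformation power. This empirical check is an addition, not something the original solution does, and the names `failure_time`, `slope`, and `lambda_guess` are introduced here.

```r
# Failure time data, re-entered so this block is self-contained
failure_time <- c(110, 157, 194, 178,
                  1, 2, 4, 18,
                  880, 1256, 5276, 4355,
                  495, 7040, 5307, 10050,
                  7, 5, 29, 2)
material <- factor(rep(1:5, each = 4))

group_mean <- tapply(failure_time, material, mean)
group_sd   <- tapply(failure_time, material, sd)
print(cbind(mean = group_mean, sd = group_sd))

# Slope of log(sd) on log(mean) estimates alpha in sd ~ mean^alpha
fit <- lm(log(group_sd) ~ log(group_mean))
slope <- unname(coef(fit)[2])

# Suggested transformation power; a value near 0 points to the log
lambda_guess <- 1 - slope
print(c(slope = slope, lambda_guess = lambda_guess))
```

A slope close to 1 means the standard deviation grows roughly in proportion to the mean, which is the situation where a log transformation stabilizes the variance.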
library(MASS)
boxcox(Model3)
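The value of λ can also be extracted numerically instead of being read off the plot. The sketch below assumes the `MASS` package is installed and uses the `plotit = FALSE` option of `boxcox()` to return the profile log-likelihood; the grid step of 0.01 and the names `bc`, `lambda_hat`, and `ci` are choices made here for illustration.

```r
library(MASS)

# Failure time data, re-entered so this block is self-contained
failure_time <- c(110, 157, 194, 178,
                  1, 2, 4, 18,
                  880, 1256, 5276, 4355,
                  495, 7040, 5307, 10050,
                  7, 5, 29, 2)
material <- factor(rep(1:5, each = 4))

# Profile log-likelihood over a grid of lambda values, without drawing the plot
bc <- boxcox(failure_time ~ material, lambda = seq(-2, 2, by = 0.01), plotit = FALSE)

# Lambda that maximizes the profile log-likelihood
lambda_hat <- bc$x[which.max(bc$y)]

# Approximate 95% interval: lambdas whose log-likelihood is within
# qchisq(0.95, 1) / 2 of the maximum
inside <- bc$y > max(bc$y) - qchisq(0.95, df = 1) / 2
ci <- range(bc$x[inside])

print(lambda_hat)
print(ci)
```

If the interval contains 0, the log transformation is a natural, easily interpreted choice.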
Since λ is close to zero, we take the natural log of the failure times:
dat$time <- log(dat$t)
head(dat)
##   material   t      time
## 1        1 110 4.7004804
## 2        1 157 5.0562458
## 3        1 194 5.2678582
## 4        1 178 5.1817836
## 5        2   1 0.0000000
## 6        2   2 0.6931472
Model4<-aov(dat$time~dat$material,data = dat)
summary(Model4)
##              Df Sum Sq Mean Sq F value   Pr(>F)
## dat$material  4 165.06   41.26   37.66 1.18e-07 ***
## Residuals    15  16.44    1.10
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Now we plot the model
plot(Model4)
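After applying the log transformation (a power transformation with λ = 0), the residuals look approximately normal, since the points on the normal Q-Q plot fall close to a straight line.
Also, the residuals vs. fitted plot shows a fairly even spread, which indicates roughly constant variance.
The ANOVA on the log scale gives a P-value of 1.18e-07, so we again reject the null hypothesis: the materials do not all have the same effect on mean failure time.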
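Because the analysis is now on the log scale, group means are easier to interpret after back-transforming. Exponentiating the mean of the logs gives the geometric mean failure time for each material. This sketch is an illustrative addition; the names `log_means` and `geo_means` are introduced here.

```r
# Failure time data, re-entered so this block is self-contained
failure_time <- c(110, 157, 194, 178,
                  1, 2, 4, 18,
                  880, 1256, 5276, 4355,
                  495, 7040, 5307, 10050,
                  7, 5, 29, 2)
material <- factor(rep(1:5, each = 4))

# Mean of log(failure time) per material: the quantity the log-scale ANOVA compares
log_means <- tapply(log(failure_time), material, mean)

# Back-transform to the original units (minutes): geometric means
geo_means <- exp(log_means)

print(round(cbind(log_mean = log_means, geometric_mean = geo_means), 3))
```

Differences between log-scale means correspond to ratios of geometric means on the original scale.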
A semiconductor manufacturer has developed three different methods for reducing particle counts on wafers. All three methods are tested on five different wafers and the after treatment particle count obtained. The data are shown below:
a. Do all methods have the same effect on mean particle count?
b. Plot the residuals versus the predicted response. Construct a normal probability plot of the residuals. Are there potential concerns about the validity of the assumptions?
c. Based on your answer to part (b) conduct another analysis of the particle count data and draw appropriate conclusions.
method<- c(rep(1,5),rep(2,5),rep(3,5))
count<- c(31,10,21,4,1,62,40,24,30,35,53,27,120,97,68)
dat1 <- cbind(method,count)
dat1 <- as.data.frame(dat1)
dat1$method <- as.factor(as.character(dat1$method))
dat1$count <- as.numeric(dat1$count)
str(dat1)
Let
μ1 = mean count using method 1
μ2 = mean count using method 2
μ3 = mean count using method 3
Null hypothesis:
H0: μ1 = μ2 = μ3
Alternative hypothesis:
Ha: At least one of the mean counts of the three methods (1, 2, 3) is different.
Model5<- aov(dat1$count~dat1$method,data = dat1)
summary(Model5)
##             Df Sum Sq Mean Sq F value  Pr(>F)
## dat1$method  2   8964    4482   7.914 0.00643 **
## Residuals   12   6796     566
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We have a P-value of 0.00643, which is less than 0.05.
As a result, we reject the null hypothesis: the methods do not all have the same effect on mean particle count.
plot(Model5)
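A quick numerical companion to the residual plots is the table of group means and standard deviations for the particle counts. This is an illustrative sketch rather than part of the original solution; the names `means` and `sds` are introduced here.

```r
# Particle count data, re-entered so this block is self-contained
count  <- c(31, 10, 21, 4, 1,
            62, 40, 24, 30, 35,
            53, 27, 120, 97, 68)
method <- factor(rep(1:3, each = 5))

means <- tapply(count, method, mean)
sds   <- tapply(count, method, sd)

# If the spread grows with the mean, the constant-variance assumption is in doubt
print(round(cbind(mean = means, sd = sds), 2))
```

Here the standard deviation increases from method 1 to method 3 along with the mean, which is the pattern a variance-stabilizing transformation is meant to address.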
We use the Box-Cox procedure to choose a data transformation:
boxcox(Model5)
Since λ appears to be about 0.47, we apply the power transformation y^0.47 to the counts:
dat1$new_count<-(dat1$count)^0.47
head(dat1)
##   method count new_count
## 1      1    31  5.022732
## 2      1    10  2.951209
## 3      1    21  4.182569
## 4      1     4  1.918528
## 5      1     1  1.000000
## 6      2    62  6.957033
Model6<-aov(dat1$new_count~dat1$method,data = dat1)
summary(Model6)
##             Df Sum Sq Mean Sq F value  Pr(>F)
## dat1$method  2  46.26  23.132   9.875 0.00292 **
## Residuals   12  28.11   2.343
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Plotting the model
plot(Model6)
Conclusion
The points on the normal Q-Q plot of the residuals fall close to a straight line, so the residuals are fairly normal.
The residuals vs. fitted plot shows a similar spread across the fitted values, so the constant-variance assumption appears reasonable.
The P-value is 0.0029, which is less than 0.05, so we reject the null hypothesis and conclude that at least one of the method means differs.
Use the Kruskal–Wallis test for the experiment in Problem 3.23.
Compare the conclusions obtained with those from the usual analysis of variance.
Reading the data
life <- c(17.6,18.9,16.3,17.4,20.1,21.6,16.9,15.3,18.6,17.1,19.5,20.3,21.4,23.6,19.4,18.5,20.5,22.3,19.3,21.1,16.9,17.5,18.3,19.8)
fluid_type<- c(rep(1,6),rep(2,6),rep(3,6),rep(4,6))
dat5<-cbind(fluid_type,life)
dat5<- as.data.frame(dat5)
dat5$fluid_type<-as.factor(dat5$fluid_type)
dat5$life <- as.numeric(dat5$life)
Performing the test
kruskal.test(dat5$life~dat5$fluid_type,data = dat5)
##
## Kruskal-Wallis rank sum test
##
## data: dat5$life by dat5$fluid_type
## Kruskal-Wallis chi-squared = 6.2177, df = 3, p-value = 0.1015
Stating the hypotheses:
Null: H0: μ1 = μ2 = μ3 = μ4
Alternative: H1: μi ≠ μj for at least one pair (i, j)
Now analyzing the Kruskal-Wallis test:
We have a P-value of 0.1015, which is higher than 0.05. As a result, we do not reject the null hypothesis.
The ANOVA led to the same conclusion at α = 0.05, although with a lower P-value (0.0525).
We therefore conclude from the Kruskal-Wallis test that there is no significant evidence that the lives of the four fluids differ, which agrees with the result in Problem 3.23.
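To see where the chi-squared value comes from, the Kruskal-Wallis statistic can be computed by hand from the rank sums, including the correction for the tied value 16.9. The sketch below is an illustration added here; the names `rank_sums`, `h_raw`, `tie_corr`, and `h` are assumptions of this example, and the result is compared with `kruskal.test()`.

```r
# Fluid life data, re-entered so this block is self-contained
life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
          16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
          19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
fluid_type <- factor(rep(1:4, each = 6))

N <- length(life)
r <- rank(life)                          # tied observations receive the average rank

rank_sums <- tapply(r, fluid_type, sum)  # sum of ranks per fluid type
n_i       <- tapply(r, fluid_type, length)

# Uncorrected Kruskal-Wallis statistic
h_raw <- 12 / (N * (N + 1)) * sum(rank_sums^2 / n_i) - 3 * (N + 1)

# Tie correction: 1 - sum(t^3 - t) / (N^3 - N), where t is the size of each tie group
tie_sizes <- as.vector(table(life))
tie_corr  <- 1 - sum(tie_sizes^3 - tie_sizes) / (N^3 - N)

h <- h_raw / tie_corr
p <- pchisq(h, df = nlevels(fluid_type) - 1, lower.tail = FALSE)

print(c(H = h, p_value = p))

# Reference value from the built-in test
builtin <- kruskal.test(life ~ fluid_type)
print(unname(builtin$statistic))
```

Because the test works on ranks, it does not rely on the normality assumption that underlies the ANOVA F test.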
#Question 3.23a
life <- c(17.6,18.9,16.3,17.4,20.1,21.6,16.9,15.3,18.6,17.1,19.5,20.3,21.4,23.6,19.4,18.5,20.5,22.3,19.3,21.1,16.9,17.5,18.3,19.8)
fluid_type<- c(rep(1,6),rep(2,6),rep(3,6),rep(4,6))
dat5<-cbind(fluid_type,life)
dat5<- as.data.frame(dat5)
dat5$fluid_type<-as.factor(dat5$fluid_type)
dat5$life <- as.numeric(dat5$life)
Model<- aov(dat5$life~dat5$fluid_type,data=dat5)
summary(Model)
#Question 3.23b
Model2<- TukeyHSD(Model)
Model2
#plotting the model
plot(Model2)
#Question 3.23c
plot(Model)
#Question 3.28a
material <- c(rep(1,4),rep(2,4),rep(3,4),rep(4,4),rep(5,4))
t <- c(110,157,194,178,1,2,4,18,880,1256,5276,4355,495,7040,5307,10050,7,5,29,2)
dat <- cbind(material,t)
dat <- as.data.frame(dat)
dat$material <- as.factor(as.character(dat$material))
dat$t<- as.numeric(dat$t)
str(dat)
#Testing the hypothesis
Model3<- aov(dat$t~dat$material,data = dat)
summary(Model3)
#Question 3.28b
plot(Model3)
#Question 3.28c
library(MASS)
boxcox(Model3)
dat$time <- log(dat$t)
Model4<-aov(dat$time~dat$material,data = dat)
summary(Model4)
#plot the model
plot(Model4)
#Question 3.29
method<- c(rep(1,5),rep(2,5),rep(3,5))
count<- c(31,10,21,4,1,62,40,24,30,35,53,27,120,97,68)
dat1 <- cbind(method,count)
dat1 <- as.data.frame(dat1)
dat1$method <- as.factor(as.character(dat1$method))
dat1$count <- as.numeric(dat1$count)
str(dat1)
#Testing the hypothesis
Model5<- aov(dat1$count~dat1$method,data = dat1)
summary(Model5)
#Question 3.29b
plot(Model5)
#Question 3.29c
#transforming the data
boxcox(Model5)
#taking the power transformation
dat1$new_count<-(dat1$count)^0.47
head(dat1)
Model6<-aov(dat1$new_count~dat1$method,data = dat1)
summary(Model6)
plot(Model6)
#Question 3.51 & 3.52
kruskal.test(dat5$life~dat5$fluid_type,data = dat5)