3.23.

The effective life of insulating fluids at an accelerated load of 35 kV is being studied. Test data have been obtained for four types of fluids. The results from a completely randomized experiment were as follows:

type1 <- c(17.6 ,18.9, 16.3 ,17.4, 20.1, 21.6)
type2 <- c(16.9, 15.3 ,18.6, 17.1, 19.5, 20.3)
type3 <- c(21.4, 23.6 ,19.4, 18.5, 20.5, 22.3)
type4 <-  c(19.3, 21.1, 16.9 ,17.5 ,18.3 ,19.8)
dafr <- stack(data.frame(type1,type2,type3,type4))

(a)

Is there any indication that the fluids differ? Use \(\alpha = 0.05\).

We test whether the mean effective life is the same for all four fluid types.

Ho: \(\mu_1=\mu_2=\mu_3=\mu_4\)

Ha: \(\mu_i\neq\mu_j\) for at least one pair \(i\neq j\)

summary(aov(values~ind,dafr))
##             Df Sum Sq Mean Sq F value Pr(>F)  
## ind          3  30.17   10.05   3.047 0.0525 .
## Residuals   20  65.99    3.30                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Since the p-value (0.0525) is slightly above \(\alpha = 0.05\), we fail to reject the null hypothesis. The data do not provide sufficient evidence that mean life differs by fluid type, although the result is borderline.
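As a cross-check on the table above, the F statistic can be rebuilt from the sums of squares. This is a minimal sketch in base R; the names `fluids`, `ss_treat`, `ss_error`, and `f_crit` are illustrative and not part of the original analysis.

```r
# Rebuild the one-way ANOVA for Problem 3.23 by hand (illustrative names).
fluids <- stack(data.frame(
  type1 = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6),
  type2 = c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3),
  type3 = c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3),
  type4 = c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
))

groups      <- split(fluids$values, fluids$ind)
grand_mean  <- mean(fluids$values)
group_means <- sapply(groups, mean)
group_sizes <- lengths(groups)

# Between-treatment and within-treatment sums of squares
ss_treat <- sum(group_sizes * (group_means - grand_mean)^2)
ss_error <- sum(sapply(groups, function(g) sum((g - mean(g))^2)))

df_treat <- length(groups) - 1
df_error <- nrow(fluids) - length(groups)

f_stat  <- (ss_treat / df_treat) / (ss_error / df_error)
p_value <- pf(f_stat, df_treat, df_error, lower.tail = FALSE)
f_crit  <- qf(0.95, df_treat, df_error)  # rejection threshold at alpha = 0.05

c(F = f_stat, p = p_value, F_crit = f_crit)
```

The computed F falls just short of the critical value, which is the same statement as the p-value sitting just above 0.05.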

(b)

Which fluid would you select, given that the objective is long life?

Fisher's LSD test compares the fluid means pairwise.

library(agricolae)
lsd <- LSD.test(aov(values~ind,data=dafr),'ind')
lsd
## $statistics
##    MSerror Df     Mean       CV  t.value      LSD
##   3.299667 20 19.09167 9.514614 2.085963 2.187666
## 
## $parameters
##         test p.ajusted name.t ntr alpha
##   Fisher-LSD      none    ind   4  0.05
## 
## $means
##         values      std r      LCL      UCL  Min  Max    Q25   Q50    Q75
## type1 18.65000 1.952178 6 17.10309 20.19691 16.3 21.6 17.450 18.25 19.800
## type2 17.95000 1.854454 6 16.40309 19.49691 15.3 20.3 16.950 17.85 19.275
## type3 20.95000 1.879096 6 19.40309 22.49691 18.5 23.6 19.675 20.95 22.075
## type4 18.81667 1.554885 6 17.26975 20.36358 16.9 21.1 17.700 18.80 19.675
## 
## $comparison
## NULL
## 
## $groups
##         values groups
## type3 20.95000      a
## type4 18.81667     ab
## type1 18.65000      b
## type2 17.95000      b
## 
## attr(,"class")
## [1] "group"

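Fluid Type 3 has the highest mean life (20.95). In the LSD grouping it differs significantly from Types 1 and 2, but it shares group "a" with Type 4, so it is not separated from every other fluid. Because the overall F test in part (a) was not significant at 0.05, these pairwise results should be read with caution. If one fluid must be chosen for long life, I would pick Type 3 for its highest mean.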
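The LSD value reported under `$statistics` can be reproduced from the error mean square and the t quantile. The sketch below uses only base R; `fluids`, `mse`, `lsd`, and `diffs` are illustrative names, and it assumes the balanced design with six replicates per fluid.

```r
# Fisher's LSD for Problem 3.23, computed by hand (illustrative names).
fluids <- stack(data.frame(
  type1 = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6),
  type2 = c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3),
  type3 = c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3),
  type4 = c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
))
fit <- aov(values ~ ind, data = fluids)

mse   <- deviance(fit) / df.residual(fit)  # error mean square
n_per <- 6                                 # replicates per fluid type
lsd   <- qt(0.975, df.residual(fit)) * sqrt(2 * mse / n_per)

# Treatment means, largest first, and all pairwise absolute differences
means <- sort(sapply(split(fluids$values, fluids$ind), mean), decreasing = TRUE)
diffs <- abs(outer(means, means, "-"))

lsd
diffs > lsd  # TRUE marks pairs declared different at alpha = 0.05
```

Two means are declared different when their absolute difference exceeds `lsd`, which is how the letter groups in `$groups` arise.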

(c)

Analyze the residuals from this experiment. Are the basic analysis of variance assumptions satisfied?

par(mfrow=c(2,2))
plot(aov(values~ind,dafr))

The constant-variance and normality assumptions appear to hold: the residuals are spread evenly, with no obvious pattern in any of the four diagnostic plots. The basic assumptions of the ANOVA therefore appear to be satisfied.
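The visual check can be supplemented with formal tests. The sketch below is an addition, not part of the original analysis: it applies the Shapiro-Wilk test to the residuals and Bartlett's test to the group variances, both from base R's `stats` package. The names `fluids`, `sw`, and `bt` are illustrative.

```r
# Formal companions to the diagnostic plots for Problem 3.23 (illustrative names).
fluids <- stack(data.frame(
  type1 = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6),
  type2 = c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3),
  type3 = c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3),
  type4 = c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
))
fit <- aov(values ~ ind, data = fluids)

# Shapiro-Wilk: the null hypothesis is that the residuals are normal
sw <- shapiro.test(residuals(fit))

# Bartlett: the null hypothesis is that all fluid types share one variance
bt <- bartlett.test(values ~ ind, data = fluids)

c(shapiro_p = sw$p.value, bartlett_p = bt$p.value)
```

Large p-values would be consistent with the reading of the plots. With only six observations per fluid these tests have limited power, so they complement the plots rather than replace them.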

3.28.

An experiment was performed to investigate the effectiveness of five insulating materials. Four samples of each material were tested at an elevated voltage level to accelerate the time to failure. The failure times (in minutes) are shown below:

material1 <-  c(110, 157 ,194, 178)
material2 <-c(1, 2 ,4 ,18)
material3 <-  c(880, 1256 ,5276 ,4355)
material4 <- c(495, 7040 ,5307 ,10050)
material5 <- c(7 ,5, 29, 2)
dafr <- stack(data.frame(material1,material2,material3,material4,material5))

(a)

Do all five materials have the same effect on mean failure time?

We test whether the mean failure times differ. The null hypothesis states that all five materials have the same mean failure time, and the alternative states that at least one differs.

Ho: \(\mu_1=\mu_2=\mu_3=\mu_4=\mu_5\)

Ha: \(\mu_i\neq\mu_j\) for at least one pair \(i\neq j\)

summary(aov(values~ind,dafr))
##             Df    Sum Sq  Mean Sq F value  Pr(>F)   
## ind          4 103191489 25797872   6.191 0.00379 **
## Residuals   15  62505657  4167044                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

No. Because the p-value (0.00379) is below our chosen \(\alpha = 0.05\), we reject the null hypothesis and conclude that mean failure time differs among the insulating materials.

(b)

Plot the residuals versus the predicted response. Construct a normal probability plot of the residuals. What information is conveyed by these plots?

par(mfcol=c(2,2))
plot(aov(values~ind,dafr))

These plots clearly show that the variances are not equal: the spread of the residuals grows with the fitted value. The normal probability plot also has a distinct pattern. The constant-variance and normality assumptions do not hold for this ANOVA, so the model is inadequate as it stands.
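The unequal variances can also be seen numerically by comparing each material's standard deviation with its mean. This is a sketch with illustrative names (`mat`, `spread`, `slope`). The log-log slope is an extra heuristic not used in the original answer: if the standard deviation is roughly proportional to the mean raised to a power a, a power transform with exponent 1 - a is suggested, and a slope near 1 points toward a log transform.

```r
# Group means and standard deviations for Problem 3.28 (illustrative names).
mat <- stack(data.frame(
  material1 = c(110, 157, 194, 178),
  material2 = c(1, 2, 4, 18),
  material3 = c(880, 1256, 5276, 4355),
  material4 = c(495, 7040, 5307, 10050),
  material5 = c(7, 5, 29, 2)
))

groups <- split(mat$values, mat$ind)
spread <- data.frame(mean = sapply(groups, mean), sd = sapply(groups, sd))
spread

# Slope of log(sd) on log(mean) across the five materials
slope <- unname(coef(lm(log(sd) ~ log(mean), data = spread))[2])
slope
```

Materials with larger mean failure times also have much larger standard deviations, which matches the fan shape in the residual plot.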

(c)

Based on your answer to part (b) conduct another analysis of the failure time data and draw appropriate conclusions.

I chose to perform a log transform on the data, since a quick glance at boxcox showed no real results:

dafr$values <- log(dafr$values)
par(mfcol=c(2,2))
plot(aov(values~ind,dafr))

summary(aov(values~ind,dafr))
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## ind          4 165.06   41.26   37.66 1.18e-07 ***
## Residuals   15  16.44    1.10                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

After the log transform, the residual spread is more even across the groups than before, and the normal probability plot now suggests that the residuals are approximately normally distributed.

The conclusion is unchanged: we reject the null hypothesis. The evidence is much stronger on the log scale, with the p-value dropping from 0.00379 to 1.18e-07.
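To see which materials differ, one option is a pairwise comparison on the log scale. The sketch below uses Tukey's HSD from base R, which is an added technique and not the method used in the original answer, and it back-transforms the group means to geometric means in minutes. The names `mat`, `fit_log`, and `tk` are illustrative; the data are re-entered so the block does not depend on the already transformed `dafr`.

```r
# Pairwise comparisons on the log scale for Problem 3.28 (illustrative names).
mat <- stack(data.frame(
  material1 = c(110, 157, 194, 178),
  material2 = c(1, 2, 4, 18),
  material3 = c(880, 1256, 5276, 4355),
  material4 = c(495, 7040, 5307, 10050),
  material5 = c(7, 5, 29, 2)
))
mat$log_values <- log(mat$values)
fit_log <- aov(log_values ~ ind, data = mat)

# Tukey HSD: all 10 pairwise differences with family-wise 95% intervals
tk <- TukeyHSD(fit_log, "ind")
tk

# Back-transformed group means (geometric means, in minutes)
geo_means <- exp(sapply(split(mat$log_values, mat$ind), mean))
geo_means
```

Intervals in `tk$ind` that exclude zero indicate pairs of materials whose mean log failure times differ at the family-wise 5% level.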

3.29.

A semiconductor manufacturer has developed three different methods for reducing particle counts on wafers. All three methods are tested on five different wafers and the after treatment particle count obtained. The data are shown below:

method1 <- c(31, 10, 21, 4, 1)
method2 <- c(62, 40, 24, 30, 35)
method3 <- c(53, 27, 120, 97, 68)
dafr <- stack(data.frame(method1,method2,method3))

(a)

Do all methods have the same effect on mean particle count?

We again perform an ANOVA to see whether these methods actually differ from one another. The null and alternative hypotheses are:

Ho: \(\mu_1=\mu_2=\mu_3\)

Ha: \(\mu_i\neq\mu_j\) for at least one pair \(i\neq j\)

summary(aov(values~ind,dafr)) 
##             Df Sum Sq Mean Sq F value  Pr(>F)   
## ind          2   8964    4482   7.914 0.00643 **
## Residuals   12   6796     566                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Because the p-value (0.00643) is below \(\alpha = 0.05\), we reject the null hypothesis and conclude that the methods do not all have the same effect on mean particle count.

(b)

Plot the residuals versus the predicted response. Construct a normal probability plot of the residuals. Are there potential concerns about the validity of the assumptions?

par(mfrow=c(2,2))
plot(aov(values~ind,dafr))

The residuals appear to be approximately normally distributed. However, the residuals-versus-fitted plot is somewhat concerning: method 1 shows much less spread than the other two methods, which suggests unequal variances.

(c)

Based on your answer to part (b) conduct another analysis of the particle count data and draw appropriate conclusions.

Based on the plots above, I used the Box-Cox procedure to find an appropriate power transformation:

library(MASS)
boxcox(values~ind,data=dafr)

Based on this, I chose my lambda to be approximately .4.
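The value of lambda can also be read off numerically instead of by eye. This is a sketch, assuming the `MASS` package is available: with `plotit = FALSE`, `boxcox()` returns the lambda grid (`x`) and the profile log-likelihood (`y`). The names `wafers`, `bc`, `lambda_hat`, and `in_ci`, and the grid from -2 to 2 in steps of 0.01, are illustrative choices.

```r
library(MASS)  # provides boxcox()

# Particle counts for Problem 3.29 (illustrative names).
wafers <- stack(data.frame(
  method1 = c(31, 10, 21, 4, 1),
  method2 = c(62, 40, 24, 30, 35),
  method3 = c(53, 27, 120, 97, 68)
))

# plotit = FALSE returns the profile log-likelihood without drawing the plot
bc <- boxcox(values ~ ind, data = wafers, plotit = FALSE,
             lambda = seq(-2, 2, by = 0.01))

lambda_hat <- bc$x[which.max(bc$y)]  # lambda with the highest log-likelihood

# Approximate 95% interval: lambdas within qchisq(0.95, 1) / 2 of the maximum
in_ci <- bc$y > max(bc$y) - qchisq(0.95, 1) / 2
c(lambda_hat = lambda_hat, lower = min(bc$x[in_ci]), upper = max(bc$x[in_ci]))
```

Any convenient power inside the interval is a defensible choice, so a rounded value is usually preferred over the exact maximizer.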

dafr$values <- dafr$values^.4
par(mfrow=c(2,2))
plot(aov(values~ind,dafr))

Our variances are much improved by this transformation, and I would say this transform made our model adequate. We will again perform the ANOVA test:

summary(aov(values~ind,dafr))
##             Df Sum Sq Mean Sq F value  Pr(>F)   
## ind          2  21.21  10.605   9.881 0.00291 **
## Residuals   12  12.88   1.073                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The p-value decreased (from 0.00643 to 0.00291), but the conclusion is the same: we still reject the null hypothesis of equal means and conclude that the methods affect mean particle count.

3.51.

Use the Kruskal–Wallis test for the experiment in Problem 3.23. Compare the conclusions obtained with those from the usual analysis of variance.

type1 <- c(17.6 ,18.9, 16.3 ,17.4, 20.1, 21.6)
type2 <- c(16.9, 15.3 ,18.6, 17.1, 19.5, 20.3)
type3 <- c(21.4, 23.6 ,19.4, 18.5, 20.5, 22.3)
type4 <-  c(19.3, 21.1, 16.9 ,17.5 ,18.3 ,19.8)
dafr <- stack(data.frame(type1,type2,type3,type4))
kruskal.test(values~ind, dafr)
## 
##  Kruskal-Wallis rank sum test
## 
## data:  values by ind
## Kruskal-Wallis chi-squared = 6.2177, df = 3, p-value = 0.1015

The nonparametric Kruskal-Wallis test (p-value = 0.1015) indicates that we should not reject the null hypothesis, which is the same conclusion reached with the analysis of variance in Problem 3.23 (p-value = 0.0525).
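The statistic reported by `kruskal.test` can be rebuilt from the ranks, which shows what the test actually measures. This is a sketch in base R with illustrative names (`fluids`, `rsum`, `h_raw`, `h`); it applies the standard tie correction, which matters here because the value 16.9 occurs twice.

```r
# Kruskal-Wallis statistic for Problem 3.23, computed by hand (illustrative names).
fluids <- stack(data.frame(
  type1 = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6),
  type2 = c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3),
  type3 = c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3),
  type4 = c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
))

N    <- nrow(fluids)
r    <- rank(fluids$values)                 # midranks for tied values
rsum <- sapply(split(r, fluids$ind), sum)   # rank sum per fluid type
n_i  <- lengths(split(r, fluids$ind))       # observations per fluid type

# Uncorrected statistic, then the correction for ties
h_raw <- 12 / (N * (N + 1)) * sum(rsum^2 / n_i) - 3 * (N + 1)
ties  <- table(fluids$values)
h     <- h_raw / (1 - sum(ties^3 - ties) / (N^3 - N))

p_value <- pchisq(h, df = length(rsum) - 1, lower.tail = FALSE)
c(H = h, p = p_value)
```

Because the test works on ranks rather than the raw values, it does not rely on the normality assumption.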