1 Assignment#9 Questions

Do the following assignment in R.  Upload a link to an html page and a pdf of your answers in Blackboard

1.      Suppose we wish to design a new experiment that tests for a significant difference between the mean effective life of these 4 insulating fluids at an accelerated load of 35kV.  The variance of fluid life is estimated to be 4.5hrs based on preliminary data.  We would like this test to have a type 1 error probability of 0.05, and for this test to have an 80% probability of rejecting the assumption that the mean life of all the fluids are the same if there is a difference greater than 1 hour between the mean lives of the fluids.       

a.      How many samples of each fluid will need to be collected to achieve this design criterion?

b.      Suppose we wish to have an 80% probability of detecting a difference between mean fluid lives of 30minutes, how many samples would need to be collected?

2.      The effective life of insulating fluids at an accelerated load of 35kV is being studied.  Test data have been obtained for the four types of fluid.  The data from this experiment is given below.

Fluid Type
1 17.6 18.9 16.3 17.4 20.1 21.6
2 16.9 15.3 18.6 17.1 19.5 20.3
3 21.4 23.6 19.4 18.5 20.5 22.3
4 19.3 21.1 16.9 17.5 18.3 19.8

a.      Test the hypothesis that the life of fluids is the same against the alternative that they differ at an α=0.10 level of significance (Remember to enter the data in a tidy format when using R, or to pivot_longer to a tidy format using tidyr )

b.      Is the model adequate? (show plots and comment)

c.       Assuming the null hypothesis in question 1 is rejected, which fluids significantly differ using a familywise error rate of α=0.10 (use Tukey’s test).  Include the plot of confidence intervals. 

1.1 Question 1-a

  1. How many samples of each fluid will need to be collected to achieve this design criterion?
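#The problem gives variance = 4.5, but the call below uses 4.5 directly as sigma
#If 4.5 is a variance, sigma = sqrt(4.5) and d = 1/sqrt(4.5), giving a larger f and a smaller n
#d is the range of means divided by sigma: 1/4.5
#The effect size f for the minimum-variability pattern is d*sqrt(1/(2*k))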

library(pwr)
pwr.anova.test(k=4,n=NULL,f=(1/4.5)*sqrt(1/(2*4)),sig.level=0.05,power=0.80)
## 
##      Balanced one-way analysis of variance power calculation 
## 
##               k = 4
##               n = 442.5319
##               f = 0.07856742
##       sig.level = 0.05
##           power = 0.8
## 
## NOTE: n is number in each group

We would need at least 443 samples of each fluid (n is the number per group) to meet this criterion when 4.5 is treated as the standard deviation, as in the call above. The plot below shows the power curve, i.e. the power of the test as a function of the number of observations per group.

plot(pwr.anova.test(k=4,n=NULL,f=(1/4.5)*sqrt(1/(2*4)),sig.level=0.05,power=0.80))
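
The problem statement calls 4.5 a variance, while the call above divides by 4.5 as if it were the standard deviation. The sketch below, in base R, compares the two readings. `anova_n` is a hypothetical helper written for this illustration (it is not part of pwr); it solves for n with the noncentral F distribution, which is assumed here to mirror what `pwr.anova.test` computes.

```r
# Hypothetical helper: per-group sample size for a balanced one-way ANOVA,
# found by solving power(n) = target with the noncentral F distribution.
anova_n <- function(k, f, sig.level = 0.05, power = 0.80) {
  power_at <- function(n) {
    df1  <- k - 1                      # numerator degrees of freedom
    df2  <- k * (n - 1)                # denominator degrees of freedom
    ncp  <- k * n * f^2                # noncentrality parameter
    crit <- qf(sig.level, df1, df2, lower.tail = FALSE)
    pf(crit, df1, df2, ncp = ncp, lower.tail = FALSE)
  }
  uniroot(function(n) power_at(n) - power, interval = c(2, 1e4))$root
}

k <- 4
f_sd  <- (1 / 4.5) * sqrt(1 / (2 * k))        # 4.5 used as sigma (as in the call above)
f_var <- (1 / sqrt(4.5)) * sqrt(1 / (2 * k))  # 4.5 used as a variance (assumption)

n_sd  <- anova_n(k, f_sd)
n_var <- anova_n(k, f_var)

# Round up to whole samples per fluid
c(sd_reading = ceiling(n_sd), variance_reading = ceiling(n_var))
```

The first value should agree with the 443 reported above; the second is the per-fluid sample size if 4.5 is taken to be the variance.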

1.2 Question 1-b

  1. Suppose we wish to have an 80% probability of detecting a difference between mean fluid lives of 30minutes, how many samples would need to be collected?
#d is the range of means divided by sigma: 0.5/4.5 (4.5 is again used directly as sigma)
pwr.anova.test(k=4,n=NULL,f=(0.5/4.5)*sqrt(1/(2*4)),sig.level=0.05,power=0.80)
## 
##      Balanced one-way analysis of variance power calculation 
## 
##               k = 4
##               n = 1767.192
##               f = 0.03928371
##       sig.level = 0.05
##           power = 0.8
## 
## NOTE: n is number in each group

We would need at least 1768 samples of each fluid to meet this criterion, given that the difference to detect between mean fluid lives is now 30 minutes (0.5 hrs). The plot below shows the power curve, i.e. the power of the test as a function of the number of observations per group.

plot(pwr.anova.test(k=4,n=NULL,f=(0.5/4.5)*sqrt(1/(2*4)),sig.level=0.05,power=0.80))
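
The jump from about 443 to about 1768 samples per fluid follows from how the noncentrality parameter of the F test scales with the detectable difference. A short derivation, assuming the same minimum-variability effect size as in the calls above; the symbols δ (difference to detect) and λ (noncentrality parameter) are introduced here for illustration:

```latex
\[
f = \frac{\delta}{\sigma}\sqrt{\frac{1}{2k}}
\quad\Longrightarrow\quad
\lambda = k\,n\,f^{2} = \frac{n\,\delta^{2}}{2\sigma^{2}}
\]
\[
\text{for fixed } \lambda:\quad n \propto \frac{1}{\delta^{2}},
\qquad
\frac{n_{0.5}}{n_{1}} = \frac{1767.192}{442.5319} \approx 3.99 \approx \left(\frac{1}{0.5}\right)^{2}
\]
```

Halving the detectable difference therefore roughly quadruples the required sample size. The ratio is slightly below 4 because the λ needed for 80% power shrinks a little as the error degrees of freedom grow.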

1.3 Question 2-a

  1. Test the hypothesis that the life of fluids is the same against the alternative that they differ at an α=0.10 level of significance
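library(tidyr)
#One vector of six observed lives (hours) per fluid type
Fluid1<-c(17.6,18.9,16.3,17.4,20.1,21.6)
Fluid2<-c(16.9,15.3,18.6,17.1,19.5,20.3)
Fluid3<-c(21.4,23.6,19.4,18.5,20.5,22.3)
Fluid4<-c(19.3,21.1,16.9,17.5,18.3,19.8)
dat<-data.frame(Fluid1,Fluid2,Fluid3,Fluid4)
#Pivot to tidy (long) format: column "name" holds the fluid, "value" the life
dat<-pivot_longer(dat,c(Fluid1,Fluid2,Fluid3,Fluid4))
#One-way ANOVA of life on fluid type
dataov.model<-aov(value~name,data=dat)
summary(dataov.model)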
##             Df Sum Sq Mean Sq F value Pr(>F)  
## name         3  30.17   10.05   3.047 0.0525 .
## Residuals   20  65.99    3.30                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Since the p-value Pr(>F) = 0.0525 is less than the 0.10 level of significance, we reject the null hypothesis that the mean lives of the four fluids are the same.
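
As a cross-check on the ANOVA table, the sums of squares, F statistic, and p-value can be computed by hand in base R. This is a minimal sketch; the variable names (`life`, `fluid`, `ss_between`, and so on) are chosen here for illustration and do not appear in the original code.

```r
# Data from the table in Question 2 (six observations per fluid)
life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
          16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
          19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
fluid <- factor(rep(paste0("Fluid", 1:4), each = 6))

k <- nlevels(fluid)   # number of groups
N <- length(life)     # total number of observations

grand_mean  <- mean(life)
group_means <- tapply(life, fluid, mean)
group_n     <- tapply(life, fluid, length)

# Between-group and within-group sums of squares
ss_between <- sum(group_n * (group_means - grand_mean)^2)
ss_within  <- sum((life - ave(life, fluid))^2)

# Mean squares, F statistic, and upper-tail p-value
ms_between <- ss_between / (k - 1)
ms_within  <- ss_within / (N - k)
f_stat     <- ms_between / ms_within
p_value    <- pf(f_stat, k - 1, N - k, lower.tail = FALSE)

round(c(SSB = ss_between, SSW = ss_within, F = f_stat, p = p_value), 4)
```

These values should match the `name` and `Residuals` rows of the `summary(dataov.model)` output above.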

1.4 Question 2-b

  1. Is the model adequate? (show plots and comment)

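Answer: Yes, the model is adequate. The diagnostic plots produced below are the basis for checking the constant-variance and normality assumptions of the residuals.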

plot(dataov.model)
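
The adequacy check can also be made more explicit by drawing the two key residual plots directly and adding formal tests. The sketch below uses base R only and rebuilds the data so that it runs on its own; the choice of the Shapiro-Wilk and Bartlett tests is an assumption of this illustration, not something the assignment requires.

```r
# Data from the table in Question 2 (six observations per fluid)
life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
          16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
          19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
fluid <- factor(rep(paste0("Fluid", 1:4), each = 6))

fit <- aov(life ~ fluid)
res <- residuals(fit)

# Residuals vs fitted values (constant variance) and normal Q-Q plot (normality)
op <- par(mfrow = c(1, 2))
plot(fitted(fit), res,
     xlab = "Fitted values", ylab = "Residuals",
     main = "Residuals vs fitted")
abline(h = 0, lty = 2)
qqnorm(res)
qqline(res)
par(op)

# Formal checks: normality of residuals and equality of group variances
sw <- shapiro.test(res)
bt <- bartlett.test(life ~ fluid)
sw
bt
```

A roughly even vertical spread across the four fitted values and points close to the Q-Q line would support the adequacy conclusion. Large p-values from the two tests would point the same way.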

1.5 Question 2-c

  1. Assuming the null hypothesis in question 1 is rejected, which fluids significantly differ using a familywise error rate of α=0.10 (use Tukey’s test).  Include the plot of confidence intervals.

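Answer: For five of the six pairwise comparisons the 90% family-wise confidence interval contains 0, so we fail to reject H0 for those pairs. Only for Fluid 3 vs Fluid 2 do we reject H0: its interval (0.43, 5.57) does not contain 0 (adjusted p = 0.044), so these two fluids differ significantly.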

##   Tukey multiple comparisons of means
##     90% family-wise confidence level
## 
## Fit: aov(formula = value ~ name, data = dat)
## 
## $name
##                     diff        lwr       upr     p adj
## Fluid2-Fluid1 -0.7000000 -3.2670196 1.8670196 0.9080815
## Fluid3-Fluid1  2.3000000 -0.2670196 4.8670196 0.1593262
## Fluid4-Fluid1  0.1666667 -2.4003529 2.7336862 0.9985213
## Fluid3-Fluid2  3.0000000  0.4329804 5.5670196 0.0440578
## Fluid4-Fluid2  0.8666667 -1.7003529 3.4336862 0.8413288
## Fluid4-Fluid3 -2.1333333 -4.7003529 0.4336862 0.2090635

2 Complete R Code

#Question 1
library(pwr)
#1-a: samples per fluid to detect a 1 hr difference (4.5 used as sigma)
pwr.anova.test(k=4,n=NULL,f=(1/4.5)*sqrt(1/(2*4)),sig.level=0.05,power=0.80)
plot(pwr.anova.test(k=4,n=NULL,f=(1/4.5)*sqrt(1/(2*4)),sig.level=0.05,power=0.80))
#1-b: samples per fluid to detect a 0.5 hr difference
pwr.anova.test(k=4,n=NULL,f=(0.5/4.5)*sqrt(1/(2*4)),sig.level=0.05,power=0.80)
plot(pwr.anova.test(k=4,n=NULL,f=(0.5/4.5)*sqrt(1/(2*4)),sig.level=0.05,power=0.80))

#Question 2
library(tidyr)
#2-a: enter the data, pivot to tidy format, and fit the one-way ANOVA
Fluid1<-c(17.6,18.9,16.3,17.4,20.1,21.6)
Fluid2<-c(16.9,15.3,18.6,17.1,19.5,20.3)
Fluid3<-c(21.4,23.6,19.4,18.5,20.5,22.3)
Fluid4<-c(19.3,21.1,16.9,17.5,18.3,19.8)
dat<-data.frame(Fluid1,Fluid2,Fluid3,Fluid4)
dat<-pivot_longer(dat,c(Fluid1,Fluid2,Fluid3,Fluid4))
dataov.model<-aov(value~name,data=dat)
summary(dataov.model)
#2-b: residual diagnostic plots for model adequacy
plot(dataov.model)
#2-c: Tukey pairwise comparisons at a 90% family-wise confidence level
TukeyHSD(dataov.model,conf.level = .9)
plot(TukeyHSD(dataov.model,conf.level = .9))