1 QUESTION 1

Suppose we wish to design a new experiment that tests for a significant difference between the mean effective lives of four insulating fluids at an accelerated load of 35 kV. The variance of fluid life is estimated to be 3.5 hrs² based on preliminary data. We would like this test to have a type I error probability of 0.05 and an 80% probability of rejecting the hypothesis that the mean lives of all the fluids are the same if there is a difference greater than 2 hours between the mean lives of the fluids, with a minimum mean of 18 hrs and a maximum mean of 20 hrs.

How many samples of each fluid will need to be collected to achieve this design criterion in the case of Min, Intermediate, and Max variability?
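
As a worked sketch of how the variability cases become a between-group variance, assume the between-group variance is the sample variance of the hypothesised group means (this is what `var()` computes for a vector of means). The symbols $\mu_i$ and $\bar{\mu}$ are notation introduced here for illustration, not taken from the assignment text.

```latex
\sigma^2_{\text{between}} = \frac{1}{k-1}\sum_{i=1}^{k}\left(\mu_i - \bar{\mu}\right)^2, \qquad k = 4

\text{Min: } \mu = (18, 19, 19, 20),\ \bar{\mu} = 19:\quad
\sigma^2_{\text{between}} = \frac{1 + 0 + 0 + 1}{3} \approx 0.667

\text{Max: } \mu = (18, 18, 20, 20),\ \bar{\mu} = 19:\quad
\sigma^2_{\text{between}} = \frac{1 + 1 + 1 + 1}{3} \approx 1.333
```

The intermediate case uses means spread between these two extremes, so its between-group variance falls between 0.667 and 1.333.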

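# power.anova.test() is part of base R's stats package, so no extra library is needed

k <- 4               # number of fluids (groups)
within_var <- 3.5    # within-group variance from preliminary data
power <- 0.8         # target power
sig.level <- 0.05    # type I error probability

# Between-group variance of the hypothesised means (range 18 to 20 hrs)
bet_min_case <- var(c(18, 19, 19, 20))      # minimum variability
bet_int_case <- var(c(18, 18.5, 19.3, 20))  # intermediate variability
bet_max_case <- var(c(18, 18, 20, 20))      # maximum variability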

#For case of Min
power.anova.test(groups = k, n = NULL, within.var = within_var, between.var = bet_min_case, sig.level= sig.level ,power = power)

## 
##      Balanced one-way analysis of variance power calculation 
## 
##          groups = 4
##               n = 20.08368
##     between.var = 0.6666667
##      within.var = 3.5
##       sig.level = 0.05
##           power = 0.8
## 
## NOTE: n is number in each group

#For case of Int.
power.anova.test(groups = k, n = NULL, within.var = within_var, between.var = bet_int_case, sig.level= sig.level ,power = power)

## 
##      Balanced one-way analysis of variance power calculation 
## 
##          groups = 4
##               n = 17.38575
##     between.var = 0.7766667
##      within.var = 3.5
##       sig.level = 0.05
##           power = 0.8
## 
## NOTE: n is number in each group

#For case of Max.
power.anova.test(groups = k, n = NULL, within.var = within_var, between.var = bet_max_case, sig.level= sig.level ,power = power)

## 
##      Balanced one-way analysis of variance power calculation 
## 
##          groups = 4
##               n = 10.56952
##     between.var = 1.333333
##      within.var = 3.5
##       sig.level = 0.05
##           power = 0.8
## 
## NOTE: n is number in each group

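Conclusion (n is rounded up to the next whole number of samples per fluid):
- For the case of Min. we need 21 samples of each fluid
- For the case of Int. we need 18 samples of each fluid
- For the case of Max. we need 11 samples of each fluid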
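
The rounding step can be sketched in code as follows. This is a minimal illustration, assuming the same three sets of hypothesised means used above; the names `cases` and `n_required` are hypothetical helpers introduced here, not part of the original script.

```r
# Sketch: solve for n in each variability case, then round up,
# because a fractional sample cannot be collected.
within_var <- 3.5  # within-group variance from the problem statement

# Hypothesised group means for each case (assumed, as in the report)
cases <- list(
  min = c(18, 19, 19, 20),
  int = c(18, 18.5, 19.3, 20),
  max = c(18, 18, 20, 20)
)

n_required <- sapply(cases, function(means) {
  res <- power.anova.test(groups = length(means),
                          between.var = var(means),
                          within.var = within_var,
                          sig.level = 0.05,
                          power = 0.8)
  ceiling(res$n)  # round up to the next whole sample per group
})

print(n_required)
```

Rounding up rather than to the nearest integer keeps the achieved power at or above the 80% target.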

2 QUESTION 2

2.1 QUESTION 2.1

Test the hypothesis that the life of fluids is the same against the alternative that they differ at an α=0.10 level of significance (Remember to enter the data in a tidy format when using R, or to pivot_longer to a tidy format using tidyr)

library(tidyr)

f1 <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6)
f2 <- c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3)
f3 <- c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3)
f4 <- c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
df <- data.frame(f1,f2,f3,f4)
df <- pivot_longer(df, c(f1,f2,f3,f4))
df$name <- as.factor(df$name)
aov.model <- aov(value~name, data = df)
summary(aov.model)

##             Df Sum Sq Mean Sq F value Pr(>F)  
## name         3  30.17   10.05   3.047 0.0525 .
## Residuals   20  65.99    3.30                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Conclusion: Since the p-value (0.0525) is smaller than the significance level (α = 0.10), we reject the null hypothesis. There is a significant difference in mean life between at least two of the fluids.
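
The decision rule can also be checked programmatically. The sketch below rebuilds the data in long format with base R only (an assumption made to keep it self-contained, instead of `pivot_longer`); the names `fluids`, `fit`, `alpha`, and `reject_null` are illustrative.

```r
# Sketch: extract the ANOVA p-value and compare it with alpha.
f1 <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6)
f2 <- c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3)
f3 <- c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3)
f4 <- c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)

# Long (tidy) format: one row per observation
fluids <- data.frame(
  value = c(f1, f2, f3, f4),
  name  = factor(rep(c("f1", "f2", "f3", "f4"), each = 6))
)

fit <- aov(value ~ name, data = fluids)

alpha <- 0.10
# First row of the ANOVA table holds the treatment (fluid) effect
p_value <- summary(fit)[[1]][["Pr(>F)"]][1]
reject_null <- p_value < alpha

print(p_value)
print(reject_null)
```

Note that the same p-value would not lead to rejection at α = 0.05, so the conclusion depends on the chosen significance level.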

2.2 QUESTION 2.2

Is the model adequate? (show plots and comment)

plot(aov.model)

Conclusions:
- For the Q-Q plot (normal probability plot), the residuals roughly follow a straight line, so the normality assumption is reasonable.
- For the “Residuals vs Fitted” plot, the spread of the residuals is similar across the fitted values, so the constant-variance assumption is reasonable.
- Since the assumptions hold, the model is adequate.
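
The visual checks could optionally be complemented with formal tests. The sketch below is not part of the original analysis: it assumes that a Shapiro-Wilk test on the residuals and Bartlett's test across fluids are acceptable supplements to the plots, and the names `fluids`, `fit`, `shapiro`, and `bartlett` are illustrative.

```r
# Sketch: formal checks to accompany the diagnostic plots (an addition,
# not something the original report ran).
fluids <- data.frame(
  value = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,   # f1
            16.9, 15.3, 18.6, 17.1, 19.5, 20.3,   # f2
            21.4, 23.6, 19.4, 18.5, 20.5, 22.3,   # f3
            19.3, 21.1, 16.9, 17.5, 18.3, 19.8),  # f4
  name = factor(rep(c("f1", "f2", "f3", "f4"), each = 6))
)
fit <- aov(value ~ name, data = fluids)

# Normality of residuals: a large p-value gives no evidence against normality
shapiro <- shapiro.test(residuals(fit))

# Homogeneity of variance across fluids: a large p-value gives no
# evidence against equal variances
bartlett <- bartlett.test(value ~ name, data = fluids)

print(shapiro$p.value)
print(bartlett$p.value)
```

With only six observations per fluid these tests have limited power, so the plots remain the primary evidence.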

2.3 QUESTION 2.3

Assuming the null hypothesis in question 1 is rejected, which fluids significantly differ using a familywise error rate of α=0.10 (use Tukey’s test). Include the plot of confidence intervals.

tukeymodel <- TukeyHSD(aov.model, conf.level = 0.9)
tukeymodel

##   Tukey multiple comparisons of means
##     90% family-wise confidence level
## 
## Fit: aov(formula = value ~ name, data = df)
## 
## $name
##             diff        lwr       upr     p adj
## f2-f1 -0.7000000 -3.2670196 1.8670196 0.9080815
## f3-f1  2.3000000 -0.2670196 4.8670196 0.1593262
## f4-f1  0.1666667 -2.4003529 2.7336862 0.9985213
## f3-f2  3.0000000  0.4329804 5.5670196 0.0440578
## f4-f2  0.8666667 -1.7003529 3.4336862 0.8413288
## f4-f3 -2.1333333 -4.7003529 0.4336862 0.2090635

Plot the Tukey model

plot(tukeymodel)

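Conclusion: From the Tukey output, only the f3-f2 comparison has an adjusted p-value (0.0440578) below the significance level (0.10), and its 90% confidence interval (0.43, 5.57) excludes zero, so fluids f2 and f3 differ significantly. No other pair of fluids differs significantly.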
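
The significant pairs can also be pulled out of the Tukey result directly instead of being read off the table. This is a minimal sketch that rebuilds the data with base R; `fluids`, `fit`, `tk`, and `significant` are illustrative names.

```r
# Sketch: list the pairwise comparisons whose adjusted p-value is below alpha.
fluids <- data.frame(
  value = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,   # f1
            16.9, 15.3, 18.6, 17.1, 19.5, 20.3,   # f2
            21.4, 23.6, 19.4, 18.5, 20.5, 22.3,   # f3
            19.3, 21.1, 16.9, 17.5, 18.3, 19.8),  # f4
  name = factor(rep(c("f1", "f2", "f3", "f4"), each = 6))
)
fit <- aov(value ~ name, data = fluids)

alpha <- 0.10
tk <- TukeyHSD(fit, conf.level = 1 - alpha)

# tk$name is a matrix with columns diff, lwr, upr and "p adj"
pairs <- tk$name
significant <- rownames(pairs)[pairs[, "p adj"] < alpha]

print(significant)
```

Equivalently, a pair is significant when its confidence interval (columns `lwr` and `upr`) does not contain zero.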

3 Complete R-Code

# Question 1

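# power.anova.test() is part of base R's stats package, so no extra library is needed

k <- 4               # number of fluids (groups)
within_var <- 3.5    # within-group variance from preliminary data
power <- 0.8         # target power
sig.level <- 0.05    # type I error probability

# Between-group variance of the hypothesised means (range 18 to 20 hrs)
bet_min_case <- var(c(18, 19, 19, 20))      # minimum variability
bet_int_case <- var(c(18, 18.5, 19.3, 20))  # intermediate variability
bet_max_case <- var(c(18, 18, 20, 20))      # maximum variability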

#For case of Min
power.anova.test(groups = k, n = NULL, within.var = within_var, between.var = bet_min_case, sig.level= sig.level ,power = power)

#For case of Int.
power.anova.test(groups = k, n = NULL, within.var = within_var, between.var = bet_int_case, sig.level= sig.level ,power = power)

#For case of Max.
power.anova.test(groups = k, n = NULL, within.var = within_var, between.var = bet_max_case, sig.level= sig.level ,power = power)

# Question 2.1
library(tidyr)

f1 <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6)
f2 <- c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3)
f3 <- c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3)
f4 <- c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
df <- data.frame(f1,f2,f3,f4)
df <- pivot_longer(df, c(f1,f2,f3,f4))
df$name <- as.factor(df$name)
aov.model <- aov(value~name, data = df)
summary(aov.model)

# Question 2.2
plot(aov.model)

# Question 2.3
tukeymodel <- TukeyHSD(aov.model, conf.level = 0.9)
tukeymodel
plot(tukeymodel)

F. Assignment 9 - GROUP 5

Carlos Mas

Yasir IQBAL