Question 1: Power Analysis for ANOVA

In this question, I calculated the sample size per group needed to reach power = 0.80 with 4 groups, within-group variance = 3.5, and significance level = 0.05. I considered three cases of variability in the group means, each with the means ranging from 18 to 20.

Minimum variability

power.anova.test(groups = 4, n = NULL,
                 between.var = var(c(18, 19, 19, 20)),
                 within.var  = 3.5,
                 sig.level   = 0.05,
                 power       = 0.80)
## 
##      Balanced one-way analysis of variance power calculation 
## 
##          groups = 4
##               n = 20.08368
##     between.var = 0.6666667
##      within.var = 3.5
##       sig.level = 0.05
##           power = 0.8
## 
## NOTE: n is number in each group

Intermediate variability

power.anova.test(groups = 4, n = NULL,
                 between.var = var(c(18, 18.6667, 19.3333, 20)),
                 within.var  = 3.5,
                 sig.level   = 0.05,
                 power       = 0.80)
## 
##      Balanced one-way analysis of variance power calculation 
## 
##          groups = 4
##               n = 18.17901
##     between.var = 0.7407259
##      within.var = 3.5
##       sig.level = 0.05
##           power = 0.8
## 
## NOTE: n is number in each group

Maximum variability

power.anova.test(groups = 4, n = NULL,
                 between.var = var(c(18, 18, 20, 20)),
                 within.var  = 3.5,
                 sig.level   = 0.05,
                 power       = 0.80)
## 
##      Balanced one-way analysis of variance power calculation 
## 
##          groups = 4
##               n = 10.56952
##     between.var = 1.333333
##      within.var = 3.5
##       sig.level = 0.05
##           power = 0.8
## 
## NOTE: n is number in each group

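Conclusion for Question 1: Rounding n up to a whole number, the required sample size per group is 21 for minimum variability, 19 for intermediate variability, and 11 for maximum variability. When the variability between the group means is small, more samples per group are needed to detect differences. When the variability between the means is large, fewer samples are needed because the groups are easier to distinguish.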
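
The three calculations above can also be run in one pass. The following is a minimal sketch, assuming the same group means and settings as above; the names `scenarios` and `n_per_group` are illustrative and not part of the original analysis.

```r
# Group-mean scenarios from the three cases above (means range from 18 to 20)
scenarios <- list(
  minimum      = c(18, 19, 19, 20),
  intermediate = c(18, 18.6667, 19.3333, 20),
  maximum      = c(18, 18, 20, 20)
)

# Solve for n in each scenario, then round up to whole units per group
n_per_group <- sapply(scenarios, function(mu) {
  res <- power.anova.test(groups = length(mu),
                          between.var = var(mu),
                          within.var  = 3.5,
                          sig.level   = 0.05,
                          power       = 0.80)
  ceiling(res$n)
})
n_per_group
```

Rounding up with `ceiling()` matters because a fractional n such as 20.08 would fall just short of the target power if rounded down.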

Question 2: Fluid Life at 35 kV Load

The experiment collected six lifetimes for each of four fluids. We want to test whether the mean life is the same for all fluids at α = 0.10. If not, we will use Tukey’s test to see which fluids differ.

Enter the Data

Fluid1 <- c(17.6,18.9,16.3,17.4,20.1,21.6)
Fluid2 <- c(16.9,15.3,18.6,17.1,19.5,20.3)
Fluid3 <- c(21.4,23.6,19.4,18.5,20.5,22.3)
Fluid4 <- c(19.3,21.1,16.9,17.5,18.3,19.8)

dat <- data.frame(Fluid1,Fluid2,Fluid3,Fluid4)
dat   # data is not tidy
##   Fluid1 Fluid2 Fluid3 Fluid4
## 1   17.6   16.9   21.4   19.3
## 2   18.9   15.3   23.6   21.1
## 3   16.3   18.6   19.4   16.9
## 4   17.4   17.1   18.5   17.5
## 5   20.1   19.5   20.5   18.3
## 6   21.6   20.3   22.3   19.8

Convert to tidy format using tidyr

library(tidyr)
dat <- pivot_longer(dat, c(Fluid1,Fluid2,Fluid3,Fluid4))
dat   # now data is tidy 
## # A tibble: 24 × 2
##    name   value
##    <chr>  <dbl>
##  1 Fluid1  17.6
##  2 Fluid2  16.9
##  3 Fluid3  21.4
##  4 Fluid4  19.3
##  5 Fluid1  18.9
##  6 Fluid2  15.3
##  7 Fluid3  23.6
##  8 Fluid4  21.1
##  9 Fluid1  16.3
## 10 Fluid2  18.6
## # ℹ 14 more rows
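
As a hedged variation on the reshaping step, the sketch below names the long-format columns explicitly and stores the fluid label as a factor. The names `wide`, `long`, `fluid`, and `life` are illustrative assumptions, not from the original analysis; `aov()` would also coerce a character column to a factor on its own.

```r
library(tidyr)

# Same measurements as above, one column per fluid
wide <- data.frame(
  Fluid1 = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6),
  Fluid2 = c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3),
  Fluid3 = c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3),
  Fluid4 = c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
)

# One row per observation, with descriptive column names
long <- pivot_longer(wide, cols = everything(),
                     names_to = "fluid", values_to = "life")

# Make the grouping variable an explicit factor
long$fluid <- factor(long$fluid)
str(long)
```

Descriptive column names make later output (for example the `aov()` table and the Tukey comparisons) easier to read than the default `name` and `value`.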

Fit ANOVA model

aov.model <- aov(value ~ name, data=dat)
summary(aov.model)
##             Df Sum Sq Mean Sq F value Pr(>F)  
## name         3  30.17   10.05   3.047 0.0525 .
## Residuals   20  65.99    3.30                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

H0: All four fluids have the same mean life

Ha: At least one fluid has a different mean life

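The ANOVA output gives F = 3.047 on 3 and 20 degrees of freedom with a p-value of 0.0525, which is less than α = 0.10, so we reject H0.

Conclusion: At the 10% significance level, there are significant differences among the fluid means. (At the 5% level the same p-value would not lead to rejection.)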
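
The decision rule can be checked programmatically. This is a minimal sketch, assuming the same data as above; the names `life`, `fluid`, `fit`, `alpha`, `p_value`, and `reject` are illustrative and not part of the original analysis.

```r
# Same measurements as above, stacked fluid by fluid
life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,   # Fluid1
          16.9, 15.3, 18.6, 17.1, 19.5, 20.3,   # Fluid2
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3,   # Fluid3
          19.3, 21.1, 16.9, 17.5, 18.3, 19.8)   # Fluid4
fluid <- factor(rep(paste0("Fluid", 1:4), each = 6))

fit <- aov(life ~ fluid)

# Pull the F statistic and p-value for the fluid effect from the ANOVA table
tab     <- summary(fit)[[1]]
f_stat  <- tab[["F value"]][1]
p_value <- tab[["Pr(>F)"]][1]

# Compare against the chosen significance level
alpha  <- 0.10
reject <- p_value < alpha
c(F = f_stat, p = p_value, reject = reject)
```

Extracting the p-value this way avoids reading it off the printed table and makes the α used for the decision explicit.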

Model adequacy check

plot(aov.model)

Comment:

The ANOVA model assumptions appear adequate: the diagnostic plots (Residuals vs Fitted, Q-Q, Scale-Location, Leverage) show residuals that are roughly normal with fairly constant variance across the fluids.
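
The visual check can be complemented with formal tests. The sketch below is one possible approach rather than part of the original analysis: it assumes the same data, shows the four diagnostic plots in one panel, and applies the Shapiro-Wilk test to the residuals and Bartlett's test to the group variances. The names `life`, `fluid`, `fit`, `sw`, and `bt` are illustrative.

```r
# Same measurements as above, stacked fluid by fluid
life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,   # Fluid1
          16.9, 15.3, 18.6, 17.1, 19.5, 20.3,   # Fluid2
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3,   # Fluid3
          19.3, 21.1, 16.9, 17.5, 18.3, 19.8)   # Fluid4
fluid <- factor(rep(paste0("Fluid", 1:4), each = 6))
fit   <- aov(life ~ fluid)

# Show all four diagnostic plots on one page, then restore the layout
op <- par(mfrow = c(2, 2))
plot(fit)
par(op)

# Normality of residuals (H0: residuals are normally distributed)
sw <- shapiro.test(residuals(fit))

# Homogeneity of variance (H0: all groups share the same variance)
bt <- bartlett.test(life ~ fluid)

c(shapiro_p = sw$p.value, bartlett_p = bt$p.value)
```

With only six observations per fluid these tests have limited power, so they are best read alongside the plots rather than in place of them.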

Tukey’s test for multiple comparisons

TukeyHSD(aov.model)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = value ~ name, data = dat)
## 
## $name
##                     diff         lwr       upr     p adj
## Fluid2-Fluid1 -0.7000000 -3.63540073 2.2354007 0.9080815
## Fluid3-Fluid1  2.3000000 -0.63540073 5.2354007 0.1593262
## Fluid4-Fluid1  0.1666667 -2.76873407 3.1020674 0.9985213
## Fluid3-Fluid2  3.0000000  0.06459927 5.9354007 0.0440578
## Fluid4-Fluid2  0.8666667 -2.06873407 3.8020674 0.8413288
## Fluid4-Fluid3 -2.1333333 -5.06873407 0.8020674 0.2090635
plot(TukeyHSD(aov.model))

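The plot of family-wise confidence intervals shows that only the Fluid3-Fluid2 interval (0.06 to 5.94) excludes zero, so Fluid 3 has a significantly higher mean life than Fluid 2 (adjusted p = 0.044). No other pair differs significantly; the next-smallest adjusted p-value is 0.159 (Fluid3-Fluid1). Note that TukeyHSD() uses a 95% family-wise confidence level by default, which is stricter than the α = 0.10 used for the ANOVA; judged by the adjusted p-values, the conclusions are the same at either level.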
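
To make the intervals match the α = 0.10 used for the ANOVA, the family-wise confidence level can be set to 90% through the `conf.level` argument of `TukeyHSD()`. This is a minimal sketch, assuming the same data as above; the names `life`, `fluid`, `fit`, and `tk90` are illustrative. The adjusted p-values do not depend on `conf.level`; only the interval widths change.

```r
# Same measurements as above, stacked fluid by fluid
life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,   # Fluid1
          16.9, 15.3, 18.6, 17.1, 19.5, 20.3,   # Fluid2
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3,   # Fluid3
          19.3, 21.1, 16.9, 17.5, 18.3, 19.8)   # Fluid4
fluid <- factor(rep(paste0("Fluid", 1:4), each = 6))
fit   <- aov(life ~ fluid)

# Tukey comparisons with 90% family-wise intervals (alpha = 0.10)
tk90 <- TukeyHSD(fit, conf.level = 0.90)
tk90

# Intervals that do not cross zero indicate significant differences
plot(tk90)
```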

Final Conclusion:

At the 10% significance level, we reject H0 and conclude that not all fluids have the same mean life (p = 0.0525). The diagnostic plots give no strong evidence against the ANOVA model assumptions. Tukey’s test indicates that only one pair differs significantly: Fluid 3 has a higher mean life than Fluid 2 (difference = 3.0, adjusted p = 0.044).