Flipped Assignment 9, CRD/One-Way ANOVA and Multiple Comparisons in R

1.

Suppose we wish to design a new experiment that tests for a significant difference between the mean effective lives of these 4 insulating fluids at an accelerated load of 35 kV. The variance of fluid life is estimated to be 4.5 hrs² based on preliminary data. We would like this test to have a Type I error probability of 0.05 and an 80% probability of rejecting the hypothesis that the mean lives of all the fluids are the same if the mean lives differ by more than 1 hour.

a. How many samples of each fluid will need to be collected to achieve this design criterion?

Given

groups: k = 4
σ² = 4.5
effect: f > √(1²/4.5)
α = 0.05
power (1-β) = 0.80

We need 14 samples of each fluid: the required n = 13.28401 per group is rounded up to the next whole sample.

# balanced one-way ANOVA; effect size f = sqrt(1^2/4.5)
library(pwr)
pwr.anova.test(k=4,n=NULL,f=sqrt(1^2/4.5),sig.level=0.05,power=.80)

## 
##      Balanced one-way analysis of variance power calculation 
## 
##               k = 4
##               n = 13.28401
##               f = 0.4714045
##       sig.level = 0.05
##           power = 0.8
## 
## NOTE: n is number in each group
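
As a cross-check on rounding up to 14, the reported power can be reproduced in base R from the noncentral F distribution. This is a minimal sketch that assumes the noncentrality parameter is λ = k·n·f² with k(n − 1) denominator degrees of freedom, taken here to match what `pwr.anova.test` does for a balanced design. `anova_power` is a helper name introduced for illustration and is not part of the `pwr` package.

```r
# Power of a balanced one-way ANOVA from the noncentral F distribution.
# anova_power() is an illustrative helper, not a pwr function.
anova_power <- function(n, k, f, alpha) {
  df1 <- k - 1              # numerator degrees of freedom
  df2 <- k * (n - 1)        # denominator degrees of freedom
  lambda <- k * n * f^2     # noncentrality parameter (assumed convention)
  f_crit <- qf(1 - alpha, df1, df2)
  pf(f_crit, df1, df2, ncp = lambda, lower.tail = FALSE)
}

f <- sqrt(1^2 / 4.5)        # effect size used in part a
power_13 <- anova_power(13, k = 4, f = f, alpha = 0.05)
power_14 <- anova_power(14, k = 4, f = f, alpha = 0.05)
print(c(n13 = power_13, n14 = power_14))
```

Under that assumption, 13 samples per fluid fall just short of 80% power and 14 exceed it, which is why 13.28 is rounded up rather than to the nearest integer.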

b. Suppose we wish to have an 80% probability of detecting a difference of 30 minutes between mean fluid lives. How many samples would need to be collected?

Given

groups: k = 4
σ² = 4.5
effect: f > √(0.5²/4.5)
α = 0.05
power (1-β) = 0.80

We need 51 samples of each fluid: the required n = 50.04922 per group is rounded up to the next whole sample.

# balanced one-way ANOVA; effect size f = sqrt(0.5^2/4.5)
library(pwr)
pwr.anova.test(k=4,n=NULL,f=sqrt(0.5^2/4.5),sig.level=0.05,power=.80)

## 
##      Balanced one-way analysis of variance power calculation 
## 
##               k = 4
##               n = 50.04922
##               f = 0.2357023
##       sig.level = 0.05
##           power = 0.8
## 
## NOTE: n is number in each group
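
Halving the detectable difference halves f, and because the noncentrality parameter scales with n·f², the required sample size rises roughly fourfold (13.28 to 50.05 per group). The sketch below searches for the smallest whole n that reaches the target power, again assuming λ = k·n·f². `min_n_per_group` is a hypothetical helper name, not a `pwr` function.

```r
# Smallest whole n per group that reaches the target power (illustrative helper).
min_n_per_group <- function(f, k = 4, alpha = 0.05, target = 0.80) {
  n <- 2
  repeat {
    df1 <- k - 1
    df2 <- k * (n - 1)
    power <- pf(qf(1 - alpha, df1, df2), df1, df2,
                ncp = k * n * f^2, lower.tail = FALSE)
    if (power >= target) return(n)
    n <- n + 1
  }
}

n_1hr   <- min_n_per_group(sqrt(1^2 / 4.5))    # difference of 1 hour
n_30min <- min_n_per_group(sqrt(0.5^2 / 4.5))  # difference of 30 minutes
print(c(one_hour = n_1hr, thirty_min = n_30min))
```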

2.

The effective life of insulating fluids at an accelerated load of 35kV is being studied. Test data have been obtained for the four types of fluid. The data from this experiment is given below.

# Load data into a data frame
dat<-read.csv("https://raw.githubusercontent.com/forestwhite/RStatistics/main/FlippedAssignment9.csv", header=TRUE)
dat

##     X1   X2   X3   X4
## 1 17.6 16.9 21.4 19.3
## 2 18.9 15.3 23.6 21.1
## 3 16.3 18.6 19.4 16.9
## 4 17.4 17.1 18.5 17.5
## 5 20.1 19.5 20.5 18.3
## 6 21.6 20.3 22.3 19.8

a. Given that n=6 samples of each fluid type were collected, with what power will a hypothesis test with an α=0.10 level of significance be able to detect a difference of 1 hour between the mean lives of the tested fluids?

Given

groups: k = 4
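samples per fluid: n = 6
σ is assumed equal for all treatments. The value used here, σ = 1.952178, is the sample standard deviation of fluid X1 alone: mean() averages only its first argument, so the call below does not average the four standard deviations. A pooled estimate from the ANOVA table in part b would be √3.30 ≈ 1.82.

# intended: average standard deviation across treatments, assumed the same for all
# note: mean() treats only its first argument as data (the others are matched to
# trim, na.rm, and ...), so this returns sd(dat[,1]); mean(c(...)) would average all four
mean(sd(dat[,1]),sd(dat[,2]),sd(dat[,3]),sd(dat[,4]))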

## [1] 1.952178

effect: f > √(1²/1.952178²)
α = 0.10

power (1-β) = 0.5987318

# balanced one-way ANOVA; effect size f = sqrt(1^2/(1.952178)^2)
library(pwr)
pwr.anova.test(k=4,n=6,f=sqrt(1^2/(1.952178)^2),sig.level=0.10,power=NULL)

## 
##      Balanced one-way analysis of variance power calculation 
## 
##               k = 4
##               n = 6
##               f = 0.5122484
##       sig.level = 0.1
##           power = 0.5987318
## 
## NOTE: n is number in each group
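
The call `mean(sd(dat[,1]), sd(dat[,2]), ...)` passes only its first argument as data, so 1.952178 is the standard deviation of fluid X1 alone. The sketch below, with the data copied from the table printed in question 2, computes each fluid's standard deviation and a pooled estimate (the square root of the mean within-group variance, which is appropriate when group sizes are equal). It then recomputes the power with both values of σ, keeping this document's effect-size definition f = 1/σ. Power is computed from the noncentral F distribution with λ = k·n·f², which is an assumed convention, and `anova_power` is an illustrative helper name.

```r
# Data copied from the table printed in question 2.
dat <- data.frame(
  X1 = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6),
  X2 = c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3),
  X3 = c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3),
  X4 = c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
)

sds <- sapply(dat, sd)           # one standard deviation per fluid
sd_x1 <- sds[["X1"]]             # the value the original mean() call returns
sd_pooled <- sqrt(mean(sds^2))   # pooled sd for equal group sizes

# Illustrative helper: power of a balanced one-way ANOVA (lambda = k*n*f^2 assumed).
anova_power <- function(n, k, f, alpha) {
  df1 <- k - 1
  df2 <- k * (n - 1)
  pf(qf(1 - alpha, df1, df2), df1, df2, ncp = k * n * f^2, lower.tail = FALSE)
}

power_x1     <- anova_power(6, k = 4, f = 1 / sd_x1,     alpha = 0.10)
power_pooled <- anova_power(6, k = 4, f = 1 / sd_pooled, alpha = 0.10)
print(round(c(sd_x1 = sd_x1, sd_pooled = sd_pooled), 4))
print(round(c(power_x1 = power_x1, power_pooled = power_pooled), 4))
```

Because the pooled standard deviation is smaller than the X1 value, f is larger and the resulting power is higher than the 0.5987 reported above.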

b. Test the hypothesis that the mean lives of the fluids are the same against the alternative that they differ at an α=0.10 level of significance. (Remember to enter the data in a tidy format when using R, or to pivot_longer to a tidy format using tidyr.)

Pr(>F) = 0.0525 < 0.10 = α, so we reject the null hypothesis and conclude that at least one fluid's mean life differs from the others.

# One way ANOVA test 
aov2 <- aov(values ~ ind,data=(stack(dat)))
summary(aov2)

##             Df Sum Sq Mean Sq F value Pr(>F)  
## ind          3  30.17   10.05   3.047 0.0525 .
## Residuals   20  65.99    3.30                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
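
The ANOVA table can be reproduced by hand from the between-fluid and within-fluid sums of squares. The following sketch uses only base R, with the data copied from the table printed in question 2 and reshaped to tidy (long) format with `stack()`, as in the original call. Variable names such as `ss_between` are introduced here for illustration.

```r
# Data copied from the table printed in question 2.
dat <- data.frame(
  X1 = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6),
  X2 = c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3),
  X3 = c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3),
  X4 = c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
)
long <- stack(dat)               # tidy format: columns `values` and `ind`
k <- nlevels(long$ind)           # number of fluids
N <- nrow(long)                  # total number of observations

group_means <- tapply(long$values, long$ind, mean)
group_sizes <- tapply(long$values, long$ind, length)
grand_mean  <- mean(long$values)

# Between-fluid and within-fluid sums of squares.
ss_between <- sum(group_sizes * (group_means - grand_mean)^2)
ss_within  <- sum((long$values - group_means[as.character(long$ind)])^2)

f_stat  <- (ss_between / (k - 1)) / (ss_within / (N - k))
p_value <- pf(f_stat, k - 1, N - k, lower.tail = FALSE)
print(round(c(SSB = ss_between, SSW = ss_within, F = f_stat, p = p_value), 4))
```

The sums of squares, F value, and p-value should agree with the `summary(aov2)` table to the precision printed there.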

c. Is the model adequate? (show plots and comment)

ANOVA requires that (a) the treatments have equal variance and (b) the residuals be approximately normally distributed.

The variances are nearly equal, as

the residuals versus predicted (fitted) values are fairly symmetrical about 0, which suggests a good model
the residuals versus leverage (factor levels) plot shows similar spreads across fluids and few outliers
the scale-location mean line is fairly constant, so the average distance of the points from the center line is consistent across samples

The normal Q-Q plot of the residuals, which pools all four fluids, indicates that the normality assumption is reasonable

# Diagnostic plots for the one-way ANOVA model of the 4 insulating fluids
aov.model<-aov(values ~ ind,data=(stack(dat)))
plot(aov.model)
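
The visual checks can be complemented with formal tests. The sketch below is not part of the original analysis. It applies the Shapiro-Wilk test to the residuals and Bartlett's test to the group variances, both from base R's `stats` package, with the data copied from the table printed in question 2. With only 6 observations per fluid these tests have limited power, so a large p-value is weak evidence that the assumptions hold.

```r
# Data copied from the table printed in question 2.
dat <- data.frame(
  X1 = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6),
  X2 = c(16.9, 15.3, 18.6, 17.1, 19.5, 20.3),
  X3 = c(21.4, 23.6, 19.4, 18.5, 20.5, 22.3),
  X4 = c(19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
)
long <- stack(dat)
fit <- aov(values ~ ind, data = long)

# Shapiro-Wilk test on the residuals (null hypothesis: residuals are normal).
normality <- shapiro.test(residuals(fit))

# Bartlett's test (null hypothesis: all fluids share one variance).
equal_var <- bartlett.test(values ~ ind, data = long)

print(normality$p.value)
print(equal_var$p.value)
```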

d. Assuming the null hypothesis in question 1 is rejected, which fluids significantly differ using a familywise error rate of α=0.10 (use Tukey’s test)? Include the plot of confidence intervals.

Note: we did not test a hypothesis in question 1, but in question 2b.

The only pair that differs significantly is fluid 2 and fluid 3 (X3-X2: p adj = 0.044 < 0.10, and its 90% interval from 0.43 to 5.57 excludes 0).

# Use Tukey's test to determine Honest Significant Differences between the treatment means, 90% confidence
# TukeyHSD() comes from the stats package loaded with base R; car is not needed for this call
library(car)

## Loading required package: carData

TukeyHSD(aov2, ordered = TRUE, conf.level = 0.90)

##   Tukey multiple comparisons of means
##     90% family-wise confidence level
##     factor levels have been ordered
## 
## Fit: aov(formula = values ~ ind, data = (stack(dat)))
## 
## $ind
##            diff        lwr      upr     p adj
## X1-X2 0.7000000 -1.8670196 3.267020 0.9080815
## X4-X2 0.8666667 -1.7003529 3.433686 0.8413288
## X3-X2 3.0000000  0.4329804 5.567020 0.0440578
## X4-X1 0.1666667 -2.4003529 2.733686 0.9985213
## X3-X1 2.3000000 -0.2670196 4.867020 0.1593262
## X3-X4 2.1333333 -0.4336862 4.700353 0.2090635

plot(TukeyHSD(aov2, ordered = TRUE, conf.level = 0.90))
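
Every interval in the Tukey output has the same half-width because the design is balanced. As a rough check, the half-width can be rebuilt from the studentized range quantile and the residual mean square. The sketch below takes the residual sum of squares and degrees of freedom from the ANOVA table in part b and the differences from the `TukeyHSD` output above. The variable names are illustrative.

```r
mse <- 65.99 / 20   # residual mean square: Residuals Sum Sq / Df from the ANOVA table
n   <- 6            # samples per fluid
k   <- 4            # number of fluids
df  <- 20           # residual degrees of freedom

# Studentized range quantile for a 90% family-wise confidence level.
q_crit <- qtukey(0.90, nmeans = k, df = df)
half_width <- q_crit * sqrt(mse / n)

# Differences copied from the TukeyHSD output above.
diffs <- c("X1-X2" = 0.7000000, "X4-X2" = 0.8666667, "X3-X2" = 3.0000000,
           "X4-X1" = 0.1666667, "X3-X1" = 2.3000000, "X3-X4" = 2.1333333)

# A pair differs significantly when |diff| exceeds the half-width.
significant <- names(diffs)[abs(diffs) > half_width]
print(half_width)
print(significant)
```

A pair is flagged only when its difference exceeds the half-width, which matches reading the table for intervals that exclude 0.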

Complete Code

Complete R code used in this analysis.

# balanced one-way ANOVA; effect size f = sqrt(1^2/4.5)
library(pwr)
pwr.anova.test(k=4,n=NULL,f=sqrt(1^2/4.5),sig.level=0.05,power=.80)

# balanced one-way ANOVA; effect size f = sqrt(0.5^2/4.5)
library(pwr)
pwr.anova.test(k=4,n=NULL,f=sqrt(0.5^2/4.5),sig.level=0.05,power=.80)

# Load data into a data frame
dat<-read.csv("https://raw.githubusercontent.com/forestwhite/RStatistics/main/FlippedAssignment9.csv", header=TRUE)
dat

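# intended: average standard deviation across treatments, assumed the same for all
# note: mean() treats only its first argument as data (the others are matched to
# trim, na.rm, and ...), so this returns sd(dat[,1]); mean(c(...)) would average all four
mean(sd(dat[,1]),sd(dat[,2]),sd(dat[,3]),sd(dat[,4]))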

# balanced one-way ANOVA; effect size f = sqrt(1^2/(1.952178)^2)
library(pwr)
pwr.anova.test(k=4,n=6,f=sqrt(1^2/(1.952178)^2),sig.level=0.10,power=NULL)

# One way ANOVA test 
aov2 <- aov(values ~ ind,data=(stack(dat)))
summary(aov2)

# AOV plots of the data for the 4 insulating fluids
aov.model<-aov(values ~ ind,data=(stack(dat)))
plot(aov.model)

# Use Tukey's test to determine Honest Significant Differences between the treatment means, 90% confidence
library(car)
TukeyHSD(aov2, ordered = TRUE, conf.level = 0.90)
plot(TukeyHSD(aov2, ordered = TRUE, conf.level = 0.90))

Forest Kingfisher, Olivia Owens, Kameron Garza

2021-10-02
