Setup

Load Libraries Into Session

# setup Libraries
library(dplyr)
library(knitr)
library(agricolae)
library(lawstat)
library(BSDA)
library(kableExtra)
library(tidyr)
library(pwr)
library(car)
library(ggplot2)

Problem 1

Suppose we wish to design a new experiment that tests for a significant difference between the mean effective lives of these 4 insulating fluids at an accelerated load of 35kV. The variance of fluid life is estimated to be 4.5 hours squared based on preliminary data. We would like this test to have a Type I error probability of 0.05, and an 80% probability of rejecting the hypothesis that the mean lives of all the fluids are the same if there is a difference greater than 1 hour between the mean lives of the fluids.

Part (a)

How many samples of each fluid will need to be collected to achieve this design criterion?

Running Power Test

pwr.anova.test(k=4,n=NULL,f=sqrt((1)^2/4.5),sig.level=0.05,power=0.80)
## 
##      Balanced one-way analysis of variance power calculation 
## 
##               k = 4
##               n = 13.28401
##               f = 0.4714045
##       sig.level = 0.05
##           power = 0.8
## 
## NOTE: n is number in each group

The power calculation gives n = 13.28 per group. Rounding up to the next whole number, 14 samples of each fluid (56 in total) will need to be collected.

Part (b)

Suppose we wish to have an 80% probability of detecting a difference of 30 minutes between mean fluid lives. How many samples would need to be collected?

Running Power Test

pwr.anova.test(k=4,n=NULL,f=sqrt((0.5)^2/4.5),sig.level=0.05,power=0.80)
## 
##      Balanced one-way analysis of variance power calculation 
## 
##               k = 4
##               n = 50.04922
##               f = 0.2357023
##       sig.level = 0.05
##           power = 0.8
## 
## NOTE: n is number in each group

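The power calculation gives n = 50.05 per group. Rounding up to the next whole number, 51 samples of each fluid (204 in total) will need to be collected. Halving the detectable difference roughly quadruples the required sample size.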
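As a cross-check on the rounding in Parts (a) and (b), the sketch below recomputes the power in base R and searches for the smallest whole number of samples per group that reaches the target. It assumes the power of a balanced one-way ANOVA comes from the noncentral F distribution with noncentrality \(k n f^2\). The helper names anova_power and smallest_n are hypothetical and do not come from the pwr package.

```r
# Power of a balanced one-way ANOVA F test (assumed noncentral F formulation).
anova_power <- function(k, n, f, alpha) {
  df1 <- k - 1            # numerator degrees of freedom
  df2 <- k * (n - 1)      # denominator degrees of freedom
  ncp <- k * n * f^2      # noncentrality parameter
  f_crit <- qf(alpha, df1, df2, lower.tail = FALSE)
  pf(f_crit, df1, df2, ncp = ncp, lower.tail = FALSE)
}

# Smallest integer n per group whose power meets the target.
smallest_n <- function(k, f, alpha, target, n_max = 1000) {
  for (n in 2:n_max) {
    if (anova_power(k, n, f, alpha) >= target) return(n)
  }
  NA_integer_
}

n_part_a <- smallest_n(k = 4, f = sqrt(1^2 / 4.5),   alpha = 0.05, target = 0.80)
n_part_b <- smallest_n(k = 4, f = sqrt(0.5^2 / 4.5), alpha = 0.05, target = 0.80)
c(part_a = n_part_a, part_b = n_part_b)
```

If the assumption holds, the two values should match the rounded-up sample sizes of 14 and 51 reported above.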

Problem 2

The effective life of insulating fluids at an accelerated load of 35kV is being studied. Test data have been obtained for the four types of fluid. The data from this experiment is given below.

Life Under 35kV Load for Different Fluid Types
FluidType 1 2 3 4 5 6
1 17.6 18.9 16.3 17.4 20.1 21.6
2 16.9 15.3 18.6 17.1 19.5 20.3
3 21.4 23.6 19.4 18.5 20.5 22.3
4 19.3 21.1 16.9 17.5 18.3 19.8

Part (a)

Given that \(n=6\) samples of each fluid type were collected, with what power will a hypothesis test with an \(\alpha = 0.10\) level of significance be able to detect a difference of 1 hour between the mean lives of the tested fluids?

Calculating the Variance

Problem2var <- var(c(17.6,18.9,16.3,17.4,20.1,21.6,
                     16.9,15.3,18.6,17.1,19.5,20.3,
                     21.4,23.6,19.4,18.5,20.5,22.3,
                     19.3,21.1,16.9,17.5,18.3,19.8))

The sample variance of all 24 observations, pooled across the four fluid types, is calculated as 4.1807971.
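Because the variance above is computed from all 24 observations together, it includes any variation between fluid types as well as variation within them. The following sketch, which is an aside rather than part of the original analysis, compares that overall variance with the average within-group variance. The object names (fluid, total_var, within_var) are chosen here for illustration.

```r
# Fluid life data, 6 observations for each of the 4 fluid types.
fluid <- data.frame(
  name  = factor(rep(c("Type1", "Type2", "Type3", "Type4"), each = 6)),
  value = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
            16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
            21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
            19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
)

# Variance of all observations, ignoring fluid type (the value used above).
total_var <- var(fluid$value)

# Average of the four within-group variances; with equal group sizes this
# equals the residual mean square of a one-way ANOVA.
within_var <- mean(tapply(fluid$value, fluid$name, var))

c(total = total_var, within = within_var)
```

The within-group figure is smaller, so an effect size based on it would be somewhat larger than the one used in this report. Which variance is appropriate depends on the convention the analysis is meant to follow.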

Running Power Test

pwr.anova.test(k=4,n=6,f=sqrt((1)^2/Problem2var),sig.level=0.10,power=NULL)
## 
##      Balanced one-way analysis of variance power calculation 
## 
##               k = 4
##               n = 6
##               f = 0.4890694
##       sig.level = 0.1
##           power = 0.5618141
## 
## NOTE: n is number in each group

With \(n=6\) samples per fluid type, a hypothesis test at the \(\alpha = 0.10\) level of significance has a power of about 0.56 to detect a difference of 1 hour between the mean lives of the tested fluids. This falls well short of the 80% power targeted in Problem 1.
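The power figure can be cross-checked with base R's power.anova.test, which is parameterised by a between-group variance rather than by the effect size f. The sketch below assumes the conversion \(f^2 = \frac{k-1}{k}\cdot\frac{\text{between.var}}{\text{within.var}}\), so that both functions use the same noncentrality parameter. The variable names are illustrative.

```r
# All 24 fluid life observations (same values as in the data table).
life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
          16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
          19.3, 21.1, 16.9, 17.5, 18.3, 19.8)

k <- 4                     # number of fluid types
n <- 6                     # samples per fluid type
sigma2 <- var(life)        # same variance estimate as used in the report
f <- sqrt(1^2 / sigma2)    # same effect size as passed to pwr.anova.test

# Convert f to the between-group variance expected by power.anova.test
# (assumed conversion, see the lead-in).
between_var <- k / (k - 1) * f^2 * sigma2

check <- power.anova.test(groups = k, n = n,
                          between.var = between_var,
                          within.var = sigma2,
                          sig.level = 0.10)
check$power
```

Under that assumption, the result should agree with the power of roughly 0.56 reported by pwr.anova.test.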

Part (b)

Test the hypothesis that the life of fluids is the same against the alternative that they differ at an \(\alpha=0.10\) level of significance.

Hypotheses

\(H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4\)

\(H_a\): at least one \(\mu\) differs from the other \(\mu\)’s.

Setting up Data Frame for Part (b)

Type1 <- c(17.6,18.9,16.3,17.4,20.1,21.6)
Type2 <- c(16.9,15.3,18.6,17.1,19.5,20.3)
Type3 <- c(21.4,23.6,19.4,18.5,20.5,22.3)
Type4 <- c(19.3,21.1,16.9,17.5,18.3,19.8)

FluidTableb <- data.frame(Type1,Type2,Type3,Type4)
FluidTableLong <- pivot_longer(FluidTableb,c(Type1,Type2,Type3,Type4))
Tidied Fluid Life Table
name value
Type1 17.6
Type2 16.9
Type3 21.4
Type4 19.3
Type1 18.9
Type2 15.3
Type3 23.6
Type4 21.1
Type1 16.3
Type2 18.6
Type3 19.4
Type4 16.9
Type1 17.4
Type2 17.1
Type3 18.5
Type4 17.5
Type1 20.1
Type2 19.5
Type3 20.5
Type4 18.3
Type1 21.6
Type2 20.3
Type3 22.3
Type4 19.8

Running Analysis of Variance for Part (b)

FluidLifeAOV <- aov(value ~ name, data = FluidTableLong)
FluidLifeAOVTable <- summary(FluidLifeAOV)
Df Sum Sq Mean Sq F value Pr(>F)
name 3 30.16500 10.055000 3.047277 0.0524632
Residuals 20 65.99333 3.299667

R calculated a p-value of 0.0525, which is less than our \(\alpha=0.10\), so we reject \(H_0\) and conclude that at least one of the \(\mu_i\)’s differs from the rest. Note that this result would not be significant at the stricter \(\alpha=0.05\) level.
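The decision rule can also be applied programmatically instead of reading the p-value off the table. The following is a minimal base R sketch that rebuilds the long-format data without tidyr, so the object name fluid and the extraction code are illustrative rather than the report's own chunk.

```r
# Long-format fluid life data: one row per observation.
fluid <- data.frame(
  name  = factor(rep(c("Type1", "Type2", "Type3", "Type4"), each = 6)),
  value = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
            16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
            21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
            19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
)

fit <- aov(value ~ name, data = fluid)
aov_table <- summary(fit)[[1]]          # ANOVA table as a data frame

f_value <- aov_table[["F value"]][1]    # F statistic for the fluid-type term
p_value <- aov_table[["Pr(>F)"]][1]     # corresponding p-value

alpha <- 0.10
reject_h0 <- p_value < alpha            # TRUE means reject H0 at level alpha
c(F = f_value, p = p_value, reject = reject_h0)
```

Changing alpha to 0.05 in this sketch shows how sensitive the conclusion is to the chosen significance level.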

Part (c)

Is the model adequate? (show plots and comment)

Analysis of Variance Plots for Part (c)

The following are the model adequacy plots.

Yes, the model appears adequate. The Residuals vs Fitted plot shows a roughly constant spread of the residuals across the fitted values, and the points on the Normal Probability plot fall close to a straight line. Together, these plots suggest that the residuals are approximately normal with constant variance.
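The chunk that produced the adequacy plots is not shown in this report. One way such plots might be drawn is sketched below in base R, with two optional formal tests added as a supplement to the visual check. The choice of shapiro.test and bartlett.test is an assumption made here, not something taken from the original analysis.

```r
# Long-format fluid life data: one row per observation.
fluid <- data.frame(
  name  = factor(rep(c("Type1", "Type2", "Type3", "Type4"), each = 6)),
  value = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
            16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
            21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
            19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
)

fit <- aov(value ~ name, data = fluid)

# Draw the two diagnostic plots side by side.
op <- par(mfrow = c(1, 2))
plot(fit, which = 1)   # Residuals vs Fitted: look for constant spread
plot(fit, which = 2)   # Normal Q-Q: look for points near a straight line
par(op)

# Optional formal checks to accompany the plots.
normality_check <- shapiro.test(residuals(fit))             # normality of residuals
variance_check  <- bartlett.test(value ~ name, data = fluid) # equal group variances
normality_check
variance_check
```

With only six observations per group, these tests have limited power, so they are best read alongside the plots rather than in place of them.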

Part (d)

Assuming the null hypothesis in question 1 is rejected, which fluids differ significantly using a familywise error rate of \(\alpha=0.10\)? Use Tukey’s test and include the plot of confidence intervals.

Performing Tukey’s Test and Plotting Results

TukeyHSD(FluidLifeAOV)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = value ~ name, data = FluidTableLong)
## 
## $name
##                   diff         lwr       upr     p adj
## Type2-Type1 -0.7000000 -3.63540073 2.2354007 0.9080815
## Type3-Type1  2.3000000 -0.63540073 5.2354007 0.1593262
## Type4-Type1  0.1666667 -2.76873407 3.1020674 0.9985213
## Type3-Type2  3.0000000  0.06459927 5.9354007 0.0440578
## Type4-Type2  0.8666667 -2.06873407 3.8020674 0.8413288
## Type4-Type3 -2.1333333 -5.06873407 0.8020674 0.2090635
par(cex.axis=0.6)
plot(TukeyHSD(FluidLifeAOV),las=2)

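The Tukey test shows that Fluid Type 2 and Fluid Type 3 differ significantly: their confidence interval (0.06, 5.94) does not contain 0, and the adjusted p-value of 0.044 is below \(\alpha=0.10\). No other pair has an adjusted p-value below 0.10. Note that TukeyHSD uses a 95% family-wise confidence level by default, which is stricter than the 90% level implied by \(\alpha=0.10\). Because the adjusted p-values do not depend on the confidence level, the conclusion is the same at the requested error rate. The same result can be seen in the 95% family-wise confidence interval plot.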
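To match the familywise error rate of \(\alpha=0.10\) stated in the question, the intervals could be requested at the 90% level through the conf.level argument of TukeyHSD. The sketch below rebuilds the data in base R so that it runs on its own. The object names fluid and tukey90 are illustrative.

```r
# Long-format fluid life data: one row per observation.
fluid <- data.frame(
  name  = factor(rep(c("Type1", "Type2", "Type3", "Type4"), each = 6)),
  value = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
            16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
            21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
            19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
)

fit <- aov(value ~ name, data = fluid)

# 90% family-wise intervals, i.e. a familywise error rate of 0.10.
tukey90 <- TukeyHSD(fit, conf.level = 0.90)
tukey90

# Pairs whose adjusted p-value falls below the familywise error rate.
pairs <- tukey90$name
significant <- rownames(pairs)[pairs[, "p adj"] < 0.10]
significant

# Plot of the 90% intervals, with axis labels rotated for readability.
op <- par(cex.axis = 0.6)
plot(tukey90, las = 2)
par(op)
```

The 90% intervals are narrower than the 95% intervals shown above, while the adjusted p-values are unchanged.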