DoE Assignment 9: Sample Size Determination and Tukey's Test

1. Determination of Sample Size with ANOVA

Minimum Variability Case

We take the means as 18, 19, 19, 20 and determine the sample size for this case as follows:

power.anova.test(groups = 4, n = NULL,
                 between.var = var(c(18,19,19,20)),
                 within.var  = 3.5,
                 sig.level   = 0.05,
                 power       = 0.80)

## 
##      Balanced one-way analysis of variance power calculation 
## 
##          groups = 4
##               n = 20.08368
##     between.var = 0.6666667
##      within.var = 3.5
##       sig.level = 0.05
##           power = 0.8
## 
## NOTE: n is number in each group

Minimum variability case: rounding n = 20.08 up, we need to collect 21 samples per group (84 in total across the 4 groups).
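As a sanity check on the numbers above, the between-group variance can be reproduced by hand and the power at the rounded-up sample size can be confirmed. This is only a sketch; the names `mu`, `between_var` and `achieved` are illustrative and do not appear in the assignment code.

```r
# Hypothesised group means for the minimum-variability case (from the text).
mu <- c(18, 19, 19, 20)

# Sample variance of the means: squared deviations summed, divided by (groups - 1).
between_var <- sum((mu - mean(mu))^2) / (length(mu) - 1)
between_var  # same value that var(mu) returns (0.6666667 in the output above)

# Power achieved when n is fixed at 21 per group instead of being solved for.
achieved <- power.anova.test(groups = 4, n = 21,
                             between.var = between_var,
                             within.var  = 3.5,
                             sig.level   = 0.05)$power
achieved >= 0.80  # TRUE, since 21 exceeds the solved n of 20.08
```

Rounding up rather than to the nearest integer matters here: any n below the solved value would fall short of the 0.80 power target.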

Intermediate Variability Case

We take the means as 18, 18.66, 19.33, 20 and determine the sample size for this case as follows:

power.anova.test(groups = 4, n = NULL,
                 between.var = var(c(18, 18.66, 19.33, 20)),
                 within.var  = 3.5,
                 sig.level   = 0.05,
                 power       = 0.80)

## 
##      Balanced one-way analysis of variance power calculation 
## 
##          groups = 4
##               n = 18.16131
##     between.var = 0.7414917
##      within.var = 3.5
##       sig.level = 0.05
##           power = 0.8
## 
## NOTE: n is number in each group

Intermediate variability case: rounding n = 18.16 up, we need to collect 19 samples per group (76 in total).

Maximum Variability Case

We take the means as 18, 18, 20, 20 and determine the sample size for this case as follows:

power.anova.test(groups = 4, n = NULL,
                 between.var = var(c(18, 18, 20, 20)),
                 within.var  = 3.5,
                 sig.level   = 0.05,
                 power       = 0.80)

## 
##      Balanced one-way analysis of variance power calculation 
## 
##          groups = 4
##               n = 10.56952
##     between.var = 1.333333
##      within.var = 3.5
##       sig.level = 0.05
##           power = 0.8
## 
## NOTE: n is number in each group

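Maximum variability case: rounding n = 10.57 up, we need to collect 11 samples per group (44 in total). The more the hypothesised means are spread out, the larger the between-group variance and the fewer samples are needed to reach 80% power.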
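The three cases can also be run in one pass, which makes the effect of the spread in the means easier to see side by side. The sketch below is one possible way to do this; `cases` and `n_per_group` are illustrative names, not part of the original code.

```r
# Hypothesised means for each variability case (values taken from the text).
cases <- list(
  minimum      = c(18, 19, 19, 20),
  intermediate = c(18, 18.66, 19.33, 20),
  maximum      = c(18, 18, 20, 20)
)

# Solve for n in each case and round up to a whole number of samples per group.
n_per_group <- sapply(cases, function(mu) {
  res <- power.anova.test(groups      = length(mu),
                          between.var = var(mu),
                          within.var  = 3.5,
                          sig.level   = 0.05,
                          power       = 0.80)
  ceiling(res$n)
})

# Per-group and total sample sizes: 21, 19, 11 per group, matching the runs above.
data.frame(n_per_group, n_total = 4 * n_per_group)
```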

2. Tukey's Test on Fluids Data

fluid_type <- c(1,1,1,1,1,1,
                2,2,2,2,2,2,
                3,3,3,3,3,3,
                4,4,4,4,4,4)
life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6, 
          16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3, 
          19.3, 21.1, 16.9, 17.5, 18.3, 19.8)

dat1 <- data.frame(fluid_type, life)

dat1$fluid_type <- as.factor(dat1$fluid_type)
str(dat1$fluid_type)

##  Factor w/ 4 levels "1","2","3","4": 1 1 1 1 1 1 2 2 2 2 ...

Null hypothesis (H0): the mean life is the same for all four fluids, that is, meanlife_1 = meanlife_2 = meanlife_3 = meanlife_4.

Alternative hypothesis (Ha): at least one fluid's mean life differs from the others.
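Before running the test, it can help to look at the sample means that these hypotheses are about. The following is a minimal, self-contained sketch; the data frame name `fluids` is an illustrative stand-in for `dat1`.

```r
# Rebuild the fluids data so this block runs on its own.
fluids <- data.frame(
  fluid_type = factor(rep(1:4, each = 6)),
  life = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
           16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
           21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
           19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
)

# Mean life per fluid type.
group_means <- tapply(fluids$life, fluids$fluid_type, mean)
round(group_means, 2)
# fluid 1: 18.65, fluid 2: 17.95, fluid 3: 20.95, fluid 4: 18.82
```

Fluid 3 has the highest sample mean and fluid 2 the lowest, so that pair is the most likely to differ if the null hypothesis is rejected.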

aov.model <- aov(life ~ fluid_type, data = dat1)
summary(aov.model)

##             Df Sum Sq Mean Sq F value Pr(>F)  
## fluid_type   3  30.16   10.05   3.047 0.0525 .
## Residuals   20  65.99    3.30                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

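Using alpha = 0.10, the p-value is less than alpha (0.0525 < 0.10), so we reject the null hypothesis that the mean life is the same for all fluid types. Note that the result is borderline: at alpha = 0.05 we would fail to reject.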
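The decision above can be reproduced without reading the p-value off the printed table. This sketch assumes the same data as above, rebuilt here under the illustrative name `fluids` so the block runs on its own; `fit`, `p_value` and `alpha` are likewise illustrative names.

```r
# Rebuild the fluids data and refit the one-way ANOVA.
fluids <- data.frame(
  fluid_type = factor(rep(1:4, each = 6)),
  life = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
           16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
           21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
           19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
)
fit <- aov(life ~ fluid_type, data = fluids)

# The first element of the summary is the ANOVA table; take the p-value
# from the treatment row (the residual row has NA in this column).
p_value <- summary(fit)[[1]][["Pr(>F)"]][1]

alpha <- 0.10
p_value < alpha  # TRUE: reject H0 at the 10% level
p_value < 0.05   # FALSE: would not reject at the 5% level
```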

Checking for model adequacy

plot(aov.model)

The normal Q-Q plot shows most of the residuals lying close to a straight line, which is consistent with the normality assumption. The variance also appears to be constant, since the spread of the residuals is roughly the same at each factor level in the Residuals vs Factor Levels (constant leverage) plot. We therefore consider the model adequate.
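The visual checks above could be complemented with formal tests. The sketch below uses the Shapiro-Wilk test for normality of the residuals and Bartlett's test for equal variances; these tests are an addition for illustration and were not part of the original analysis. The data are rebuilt under the illustrative name `fluids`.

```r
# Rebuild the fluids data and refit the model so this block is self-contained.
fluids <- data.frame(
  fluid_type = factor(rep(1:4, each = 6)),
  life = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
           16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
           21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
           19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
)
fit <- aov(life ~ fluid_type, data = fluids)

# Shapiro-Wilk: H0 is that the residuals come from a normal distribution.
normality <- shapiro.test(residuals(fit))

# Bartlett: H0 is that all fluid types share the same variance.
homogeneity <- bartlett.test(life ~ fluid_type, data = fluids)

# Large p-values mean no evidence against the corresponding assumption.
c(shapiro = normality$p.value, bartlett = homogeneity$p.value)
```

With only six observations per group, both tests have limited power, so they are best read alongside the plots rather than as a replacement for them.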

Tukey’s Test

tukey_model <- TukeyHSD(aov.model, conf.level = 0.9)
plot(tukey_model)

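From the Tukey's test plot, we can see that there is a significant difference between the mean life of fluid 3 and fluid 2, because the 90% confidence interval for the difference in their means does not include zero. The intervals for all other pairs include zero, so those pairs are not significantly different.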
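The same conclusion can be read from the numbers behind the plot. The sketch below picks out the pairs whose 90% interval excludes zero and also computes the honestly significant difference (HSD) by hand from the studentized range quantile. The names `fluids`, `fit`, `tk`, `sig` and `hsd` are illustrative, and the data are rebuilt so the block runs on its own.

```r
# Rebuild the fluids data and refit the model.
fluids <- data.frame(
  fluid_type = factor(rep(1:4, each = 6)),
  life = c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6,
           16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
           21.4, 23.6, 19.4, 18.5, 20.5, 22.3,
           19.3, 21.1, 16.9, 17.5, 18.3, 19.8)
)
fit <- aov(life ~ fluid_type, data = fluids)

# Matrix of pairwise comparisons with columns diff, lwr, upr, p adj.
tk <- TukeyHSD(fit, conf.level = 0.90)$fluid_type

# Keep only the pairs whose interval lies entirely above or below zero.
sig <- tk[tk[, "lwr"] > 0 | tk[, "upr"] < 0, , drop = FALSE]
rownames(sig)  # "3-2"

# HSD by hand: q(0.90; 4 means, error df) * sqrt(MSE / n per group).
mse <- deviance(fit) / df.residual(fit)
hsd <- qtukey(0.90, nmeans = 4, df = df.residual(fit)) * sqrt(mse / 6)

# A pair differs significantly when |difference in means| exceeds hsd.
abs(tk[, "diff"]) > hsd
```

The hand calculation and `TukeyHSD` agree because the design is balanced, with six observations for every fluid type.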

Unevaluated complete code

# Question 1: sample size for the minimum, intermediate and maximum variability cases
power.anova.test(groups = 4, n = NULL,
                 between.var = var(c(18,19,19,20)),
                 within.var  = 3.5,
                 sig.level   = 0.05,
                 power       = 0.80)

power.anova.test(groups = 4, n = NULL,
                 between.var = var(c(18, 18.66, 19.33, 20)),
                 within.var  = 3.5,
                 sig.level   = 0.05,
                 power       = 0.80)

power.anova.test(groups = 4, n = NULL,
                 between.var = var(c(18, 18, 20, 20)),
                 within.var  = 3.5,
                 sig.level   = 0.05,
                 power       = 0.80)


# Question 2: one-way ANOVA, model adequacy and Tukey's test on the fluids data
fluid_type <- c(1,1,1,1,1,1,
                2,2,2,2,2,2,
                3,3,3,3,3,3,
                4,4,4,4,4,4)
life <- c(17.6, 18.9, 16.3, 17.4, 20.1, 21.6, 
          16.9, 15.3, 18.6, 17.1, 19.5, 20.3,
          21.4, 23.6, 19.4, 18.5, 20.5, 22.3, 
          19.3, 21.1, 16.9, 17.5, 18.3, 19.8)

dat1 <- data.frame(fluid_type, life)

dat1$fluid_type <- as.factor(dat1$fluid_type)
str(dat1$fluid_type)

aov.model <- aov(life ~ fluid_type, data = dat1)
summary(aov.model)

plot(aov.model)

tukey_model <- TukeyHSD(aov.model, conf.level = 0.9)
plot(tukey_model)