The hypothesis test for the ANOVA here takes the form:
\[ H_0: \mu_1=\mu_2=\mu_3=\mu_4 \\ H_a : \mu_i \neq \mu_j \text{ for at least one pair } i \neq j\]
After performing the ANOVA test, we see that the null hypothesis is rejected, so pairwise comparisons are needed to check for differences in means between the mixing techniques.
# Observations for each of the four mixing techniques
M1 <- c(3129,3000,2865,2890)
M2 <- c(3200,3300,2975,3150)
M3 <- c(2800,2900,2985,3050)
M4 <- c(2600,2700,2600,2765)
dat <- data.frame(M1,M2,M3,M4)
library(tidyr)
# Reshape to long format: one column of values, one column of technique labels
dat_org <- pivot_longer(dat,c(M1,M2,M3,M4))
# One-way ANOVA of the response on mixing technique
aov.model <- aov(value~name,dat_org)
summary(aov.model)
## Df Sum Sq Mean Sq F value Pr(>F)
## name 3 489740 163247 12.73 0.000489 ***
## Residuals 12 153908 12826
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Based on the Fisher LSD method, the means of M1 and M3 do not differ significantly. However, for the pairs M1~M2, M1~M4, M2~M3, M2~M4, and M3~M4 we find statistically significant differences between the means of these techniques (a pairwise cross-check is shown after the LSD output below).
library(agricolae)
print(LSD.test(aov.model,"name"))
## $statistics
## MSerror Df Mean CV t.value LSD
## 12825.69 12 2931.812 3.862817 2.178813 174.4798
##
## $parameters
## test p.ajusted name.t ntr alpha
## Fisher-LSD none name 4 0.05
##
## $means
## value std r LCL UCL Min Max Q25 Q50 Q75
## M1 2971.00 120.55704 4 2847.624 3094.376 2865 3129 2883.75 2945.0 3032.25
## M2 3156.25 135.97641 4 3032.874 3279.626 2975 3300 3106.25 3175.0 3225.00
## M3 2933.75 108.27242 4 2810.374 3057.126 2800 3050 2875.00 2942.5 3001.25
## M4 2666.25 80.97067 4 2542.874 2789.626 2600 2765 2600.00 2650.0 2716.25
##
## $comparison
## NULL
##
## $groups
## value groups
## M2 3156.25 a
## M1 2971.00 b
## M3 2933.75 b
## M4 2666.25 c
##
## attr(,"class")
## [1] "group"
Based on the normal probability plot of the residuals, the data appear fairly normal, with a slight deviation in the upper tail.
# Residuals: deviation of each observation from its group mean
e <- c(dat$M1-mean(dat$M1),dat$M2-mean(dat$M2),dat$M3-mean(dat$M3),dat$M4-mean(dat$M4))
qqnorm(e)
qqline(e)
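As a complementary numeric check of normality (not part of the original analysis), a Shapiro-Wilk test can be applied to the model residuals:
# Shapiro-Wilk test on the ANOVA residuals; a large p-value is consistent
# with the normality assumption suggested by the normal probability plot.
shapiro.test(residuals(aov.model))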
The residuals-versus-fitted plot shows that the variances are fairly equal, with no obvious pattern. Therefore, the ANOVA can be carried out with the constant-variance assumption considered satisfied.
# Residuals vs. fitted values (first diagnostic plot of the aov object)
plot(aov.model,1)
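As an additional, optional check of the equal-variance assumption (not in the original analysis), Bartlett's test can be run on the same data:
# Bartlett's test of homogeneity of variances across the four techniques;
# a non-significant result supports the constant-variance assumption.
bartlett.test(value ~ name, data = dat_org)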
In the scatter plot we can see that the mixing techniques visibly differ in level: their means would not fall on a common horizontal line, so there is a clear difference in means. In contrast, the within-group spreads look fairly similar.
# Recode the technique labels M1-M4 as the numbers 1-4 so they can be used
# as a numeric x-axis, then plot every observation against its technique
dat_org$name <- as.numeric(sub("M", "", dat_org$name))
plot(dat_org$name, dat_org$value)
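An alternative view (not in the original) is a set of side-by-side boxplots, which shows the same shift in centres with broadly similar spreads:
# Boxplots of the response for each technique; the shifted medians mirror
# the difference in means seen in the scatter plot.
boxplot(value ~ name, data = dat_org)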
The hypothesis test for the ANOVA here takes the form:
\[ H_0: \mu_1=\mu_2=\mu_3=\mu_4=\mu_5 \\ H_a : \mu_i \neq \mu_j \text{ for at least one pair } i \neq j\] The ANOVA test gives a very small p-value, which indicates that the null hypothesis is rejected and we can proceed with the LSD test.
# Observations for each cotton level (15, 20, 25, 30 and 35)
cot_15 <- c(7,7,15,11,9)
cot_20 <- c(12,17,12,18,18)
cot_25 <- c(14,19,19,18,18)
cot_30 <- c(19,25,22,19,23)
cot_35 <- c(7,10,11,15,11)
dat <- data.frame(cot_15,cot_20,cot_25,cot_30,cot_35)
# Reshape to long format and fit the one-way ANOVA
dat_org <- pivot_longer(dat,c(cot_15,cot_20,cot_25,cot_30,cot_35))
aov.model <- aov(value~name,dat_org)
summary(aov.model)
## Df Sum Sq Mean Sq F value Pr(>F)
## name 4 475.8 118.94 14.76 9.13e-06 ***
## Residuals 20 161.2 8.06
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
From the Fisher LSD test we can see that the pairwise comparisons between cotton 25 and cotton 20, and between cotton 35 and cotton 15, do not differ significantly. Conversely, cotton 30 ~ cotton 25, cotton 30 ~ cotton 20, cotton 30 ~ cotton 35, cotton 30 ~ cotton 15, cotton 25 ~ cotton 35, cotton 25 ~ cotton 15, cotton 20 ~ cotton 35, and cotton 20 ~ cotton 15 differ significantly from one another, as can be seen in the grouping given by the Fisher LSD method (and cross-checked with the pairwise t-tests after the output below).
print(LSD.test(aov.model,"name"))
## $statistics
## MSerror Df Mean CV t.value LSD
## 8.06 20 15.04 18.87642 2.085963 3.745452
##
## $parameters
## test p.ajusted name.t ntr alpha
## Fisher-LSD none name 5 0.05
##
## $means
## value std r LCL UCL Min Max Q25 Q50 Q75
## cot_15 9.8 3.346640 5 7.151566 12.44843 7 15 7 9 11
## cot_20 15.4 3.130495 5 12.751566 18.04843 12 18 12 17 18
## cot_25 17.6 2.073644 5 14.951566 20.24843 14 19 18 18 19
## cot_30 21.6 2.607681 5 18.951566 24.24843 19 25 19 22 23
## cot_35 10.8 2.863564 5 8.151566 13.44843 7 15 10 11 11
##
## $comparison
## NULL
##
## $groups
## value groups
## cot_30 21.6 a
## cot_25 17.6 b
## cot_20 15.4 b
## cot_35 10.8 c
## cot_15 9.8 c
##
## attr(,"class")
## [1] "group"
We can see slightly higher variance on the left of the residuals plot, but the line is fairly straight and no clear pattern is visible. This suggests that the constant-variance assumption is reasonable.
# Residuals vs. fitted values for the cotton model
plot(aov.model,1)
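A numeric companion to this plot (optional, not in the original analysis) is Bartlett's test for the cotton data:
# Bartlett's test of homogeneity of variances across the five cotton levels;
# a non-significant result supports the equal-variance assumption.
bartlett.test(value ~ name, data = dat_org)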
The number of observations in each group should be 5 (rounding the computed n = 4.66 up) to have a 90% chance of correctly rejecting the null hypothesis.
power.anova.test(groups=4, n=NULL, between.var=var(c(50,60,50,60)),within.var=25,sig.level=0.05,power=0.9)
##
## Balanced one-way analysis of variance power calculation
##
## groups = 4
## n = 4.658128
## between.var = 33.33333
## within.var = 25
## sig.level = 0.05
## power = 0.9
##
## NOTE: n is number in each group
If the variance changes to \(\sigma^2=36\), the number of observations required to reach a power of 0.9 is 7 (n = 6.18 rounded up).
power.anova.test(groups=4, n=NULL, between.var=var(c(50,60,50,60)),within.var=36,sig.level=0.05,power=0.9)
##
## Balanced one-way analysis of variance power calculation
##
## groups = 4
## n = 6.180885
## between.var = 33.33333
## within.var = 36
## sig.level = 0.05
## power = 0.9
##
## NOTE: n is number in each group
If the variance changes to \(\sigma^2=49\), the number of observations required to reach a power of 0.9 increases to 8.
power.anova.test(groups=4, n=NULL, between.var=var(c(50,60,50,60)),within.var=49,sig.level=0.05,power=0.9)
##
## Balanced one-way analysis of variance power calculation
##
## groups = 4
## n = 7.998751
## between.var = 33.33333
## within.var = 49
## sig.level = 0.05
## power = 0.9
##
## NOTE: n is number in each group
Since the variability is increasing, it is natural that the number of observations must also increase to maintain a power of 0.9 for the test, as the sweep below makes explicit.
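The three calculations above can be collapsed into a single sweep over within-group variances (a small convenience sketch, not part of the original solution):
# Required group size n for several within-group variances, keeping the
# assumed treatment means, alpha = 0.05 and power = 0.9 fixed.
sapply(c(25, 36, 49), function(v)
  ceiling(power.anova.test(groups = 4,
                           between.var = var(c(50, 60, 50, 60)),
                           within.var = v,
                           sig.level = 0.05, power = 0.9)$n))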
There is no single formula or script for this problem when the variance is unknown. However, based on the type of experiment, the environmental conditions (noise, the experience of the data collector, etc.), and other factors, the analyst can supply a planning value for the variance, for example one based on the minimum standard deviation allowed.
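For example (purely illustrative, with an assumed planning value rather than a known variance), if the analyst judges that the standard deviation should not exceed roughly 8, that guess can be plugged into the same calculation:
# Hypothetical planning value: assume sd is at most about 8 (within.var = 64),
# keeping the same assumed treatment means, alpha and power as above.
power.anova.test(groups = 4, between.var = var(c(50, 60, 50, 60)),
                 within.var = 8^2, sig.level = 0.05, power = 0.9)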