We'll use the PlantGrowth data set that ships with R. It contains the weights of plants grown under a control condition and two different treatment conditions. We load the data set with data() and then look at the first rows.
data("PlantGrowth")
head(PlantGrowth)
## weight group
## 1 4.17 ctrl
## 2 5.58 ctrl
## 3 5.18 ctrl
## 4 6.11 ctrl
## 5 4.50 ctrl
## 6 4.61 ctrl
A plant fertilizer manufacturer wants to develop a formula that yields the greatest increase in plant weight.
levels(PlantGrowth$group)
## [1] "ctrl" "trt1" "trt2"
One-way ANOVA can be used to determine whether the mean plant weights differ significantly between the three conditions.
We can see that the PlantGrowth data set has three groups, named ctrl, trt1, and trt2. Before we build the model, it is important to look at the summary statistics and to check for outliers. We can use box plots or the R function identify_outliers() from the rstatix package.
library(tidyr) # to create tidy data
library(rstatix) # use for basic Anova/statistical tests
##
## Attaching package: 'rstatix'
## The following object is masked from 'package:stats':
##
## filter
library(ggplot2) # for visualization
# first get the summary statistics
PlantGrowth %>%
  group_by(group) %>%
  get_summary_stats(weight, type = "mean_sd")
## # A tibble: 3 x 5
## group variable n mean sd
## <fct> <chr> <dbl> <dbl> <dbl>
## 1 ctrl weight 10 5.03 0.583
## 2 trt1 weight 10 4.66 0.794
## 3 trt2 weight 10 5.53 0.443
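The box-plot and identify_outliers() checks mentioned above can be run as follows; this is a minimal sketch using only the packages already loaded.
# visual check: box plot of weight by group
ggplot(PlantGrowth, aes(x = group, y = weight)) +
  geom_boxplot() +
  theme_minimal()
# tabular check: flag outliers within each group (rstatix)
PlantGrowth %>%
  group_by(group) %>%
  identify_outliers(weight)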
The aov() function is used to obtain the relevant sums of squares. Calling summary() on the output from aov() produces the desired ANOVA table (without the unneeded row for the total).
res.aov = anova_test(weight ~ group, data = PlantGrowth)
## Coefficient covariances computed by hccm()
weight_aov = aov(weight ~ group, data = PlantGrowth)
summary(weight_aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## group 2 3.766 1.8832 4.846 0.0159 *
## Residuals 27 10.492 0.3886
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
When the p-value is smaller than alpha, we reject H0, which states that all groups have the same mean, in favor of H1, which states that at least one group mean differs from the others.
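We stored the rstatix version of this test in res.aov above but never printed it; doing so reports the same result, F(2, 27) = 4.846, in anova_test()'s own format.
res.aov # same one-way ANOVA, rstatix formatting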
groups = data.frame(group = unique(PlantGrowth$group))
data.frame(groups, weight = predict(weight_aov, groups))
## group weight
## 1 ctrl 5.032
## 2 trt1 4.661
## 3 trt2 5.526
Here data.frame() combines the vector of group labels with the fitted means that predict() returns for them, giving the estimated mean weight of each group.
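As a side note, base R's model.tables() is an equivalent shortcut for these fitted group means when applied to the aov object; a minimal sketch:
model.tables(weight_aov, type = "means") # grand mean plus per-group means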
The ANOVA test assumes that, the data are normally distributed and the variance across groups are homogeneous. We can check that with some diagnostic plots.
plot(weight_aov, 1) # 1. Homogeneity of variances
In the plot, there is no evident relationship between the residuals and the fitted values (the mean of each group), which is a good sign: we can assume homogeneity of variance. The plot also flags observations 17, 15, and 4 as potential outliers.
plot(weight_aov, 2) # 2. Normality
PlantGrowth %>%
  group_by(group) %>%
  shapiro_test(weight)
## # A tibble: 3 x 4
## group variable statistic p
## <fct> <chr> <dbl> <dbl>
## 1 ctrl weight 0.957 0.747
## 2 trt1 weight 0.930 0.452
## 3 trt2 weight 0.941 0.564
Points 4, 15, and 17 are flagged as outliers, which can strongly affect both normality and homogeneity of variance; removing them can help satisfy the test assumptions. To test equal variances formally, I recommend Levene's test, which is less sensitive to deviations from normality than alternatives such as Bartlett's test.
library(car) # `leveneTest()` in car package
## Loading required package: carData
leveneTest(weight ~ group, data = PlantGrowth) # Levene's test (homogeneity of variances)
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 1.1192 0.3412
## 27
From the output above, the p-value (0.34) is greater than 0.05, so the test is not significant: there is no evidence of a difference in variances between the groups. We can therefore assume homogeneity of variance across the treatment groups.
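Bartlett's test is a classical alternative here, but it assumes normality within groups, which is exactly why Levene's test was preferred above; shown only as a hedged cross-check:
bartlett.test(weight ~ group, data = PlantGrowth) # sensitive to non-normality
As a further robustness check, we can rerun the one-way analysis without assuming equal variances at all (Welch's ANOVA):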
oneway.test(weight ~ group, data = PlantGrowth) # no assumption of equal variances
##
## One-way analysis of means (not assuming equal variances)
##
## data: weight and group
## F = 5.181, num df = 2.000, denom df = 17.128, p-value = 0.01739
From the output above, the p-value (0.017) is smaller than the significance level of 0.05, so even without assuming equal variances there is evidence that at least one group mean differs from the others. Note that this Welch-type test compares means, not variances; it is the standard fallback when the homogeneity assumption is in doubt.
kruskal.test(weight ~ group, data = PlantGrowth) # use if ANOVA assumptions are not met
##
## Kruskal-Wallis rank sum test
##
## data: weight by group
## Kruskal-Wallis chi-squared = 7.9882, df = 2, p-value = 0.01842
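Had we relied on the Kruskal-Wallis result, the matching nonparametric post hoc is Dunn's test; a minimal sketch with rstatix's dunn_test():
PlantGrowth %>%
  dunn_test(weight ~ group, p.adjust.method = "bonferroni")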
Suppose we reject the null hypothesis from the ANOVA test for equal means. That tells us that the means are different. But which means? All of them? Some of them? The obvious strategy is to test all possible comparisons of two means. We can do this easily in R.
pwc <- PlantGrowth %>%
  tukey_hsd(weight ~ group)
pwc
## # A tibble: 3 x 8
## term group1 group2 estimate conf.low conf.high p.adj p.adj.signif
## * <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <chr>
## 1 group ctrl trt1 -0.371 -1.06 0.320 0.391 ns
## 2 group ctrl trt2 0.494 -0.197 1.19 0.198 ns
## 3 group trt1 trt2 0.865 0.174 1.56 0.012 *
The output contains the following columns:
estimate: estimate of the difference between means of the two groups conf.low, conf.high: the lower and the upper end point of the confidence interval at 95% (default) p.adj: p-value after adjustment for the multiple comparisons.
Tukey's Honest Significant Difference can be applied directly to an object created with aov(). It adjusts the p-values of the pairwise comparisons of the means to control the FWER, in this case at 0.05. Notice it also gives confidence intervals for the differences of the means.
TukeyHSD(weight_aov, conf.level = 0.95)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = weight ~ group, data = PlantGrowth)
##
## $group
## diff lwr upr p adj
## trt1-ctrl -0.371 -1.0622161 0.3202161 0.3908711
## trt2-ctrl 0.494 -0.1972161 1.1852161 0.1979960
## trt2-trt1 0.865 0.1737839 1.5562161 0.0120064
Based on these results, only the trt1-trt2 comparison is significant (adjusted p = 0.012); neither treatment differs significantly from the control. If you return to the original box plot, this result should come as no surprise.
We can also easily plot these confidence intervals.
plot(TukeyHSD(weight_aov, conf.level = 0.95))
We can see that the trt1-ctrl interval is centered closest to 0, and both intervals involving ctrl cross 0, so we cannot conclude that either treatment differs from the control; only the trt1-trt2 interval excludes 0.
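As a cross-check on Tukey's HSD, base R's pairwise.t.test() runs the same all-pairs comparisons with a different adjustment (Bonferroni here); a minimal sketch:
pairwise.t.test(PlantGrowth$weight, PlantGrowth$group,
                p.adjust.method = "bonferroni")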
library(faraway) # dataset
## Registered S3 methods overwritten by 'lme4':
## method from
## cooks.distance.influence.merMod car
## influence.merMod car
## dfbeta.influence.merMod car
## dfbetas.influence.merMod car
##
## Attaching package: 'faraway'
## The following objects are masked from 'package:car':
##
## logit, vif
library(tidyr)
head(rats)
## time poison treat
## 1 0.31 I A
## 2 0.82 I B
## 3 0.43 I C
## 4 0.45 I D
## 5 0.45 I A
## 6 1.10 I B
levels(rats$poison)
## [1] "I" "II" "III"
We can see that poison has three levels: I, II, and III.
levels(rats$treat)
## [1] "A" "B" "C" "D"
We can see that treat has four levels: A, B, C, and D.
library(rstatix) # use for basic Anova/statistical tests
set.seed(1603)
data("rats", package = "faraway")
rats %>%
  sample_n_by(poison, treat, size = 1)
## # A tibble: 12 x 3
## time poison treat
## <dbl> <fct> <fct>
## 1 0.31 I A
## 2 0.88 I B
## 3 0.76 I C
## 4 0.66 I D
## 5 0.4 II A
## 6 1.24 II B
## 7 0.4 II C
## 8 0.71 II D
## 9 0.18 III A
## 10 0.38 III B
## 11 0.22 III C
## 12 0.36 III D
library(tidyr) # to create tidy data
library(rstatix) # use for basic Anova/statistical tests
library(ggplot2) # for visualization
# first get the summary statistics
rats %>%
  group_by(poison, treat) %>%
  get_summary_stats(type = "mean_sd")
## # A tibble: 12 x 6
## poison treat variable n mean sd
## <fct> <fct> <chr> <dbl> <dbl> <dbl>
## 1 I A time 4 0.412 0.069
## 2 I B time 4 0.88 0.161
## 3 I C time 4 0.568 0.157
## 4 I D time 4 0.61 0.113
## 5 II A time 4 0.32 0.075
## 6 II B time 4 0.815 0.336
## 7 II C time 4 0.375 0.057
## 8 II D time 4 0.667 0.271
## 9 III A time 4 0.21 0.022
## 10 III B time 4 0.335 0.047
## 11 III C time 4 0.235 0.013
## 12 III D time 4 0.325 0.026
# visualization
ggplot(rats, aes(x = poison, y = time, fill = treat)) +
  geom_boxplot() +
  geom_jitter(shape = 15,
              color = "steelblue",
              position = position_jitter(0.21)) +
  theme_minimal()
Poison III acts the fastest, since its survival times are the shortest.
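An interaction plot gives another view of the same data: each line traces one treatment's mean survival time across poisons, and roughly parallel lines suggest little treat-by-poison interaction (consistent with the ANOVA result below). A minimal base-R sketch:
with(rats, interaction.plot(x.factor = poison, trace.factor = treat,
                            response = time, col = 1:4, lty = 1:4,
                            ylab = "mean survival time"))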
# identify outliers
rats %>%
  group_by(poison, treat) %>%
  identify_outliers(time)
## # A tibble: 1 x 5
## poison treat time is.outlier is.extreme
## <fct> <fct> <dbl> <lgl> <lgl>
## 1 I A 0.31 TRUE FALSE
There were no extreme outliers.
mice_int = aov(time ~ treat * poison, data = rats) # interaction model
mice_add = aov(time ~ treat + poison, data = rats) # additive model
mice_treat = aov(time ~ treat, data = rats) # single factor model
mice_toxic = aov(time ~ poison, data = rats) # single factor model
mice_null = aov(time ~ 1, data = rats) # null model
summary(mice_int)
## Df Sum Sq Mean Sq F value Pr(>F)
## treat 3 0.9212 0.3071 13.806 3.78e-06 ***
## poison 2 1.0330 0.5165 23.222 3.33e-07 ***
## treat:poison 6 0.2501 0.0417 1.874 0.112
## Residuals 36 0.8007 0.0222
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(mice_add)
## Df Sum Sq Mean Sq F value Pr(>F)
## treat 3 0.9212 0.3071 12.27 6.7e-06 ***
## poison 2 1.0330 0.5165 20.64 5.7e-07 ***
## Residuals 42 1.0509 0.0250
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(mice_treat)
## Df Sum Sq Mean Sq F value Pr(>F)
## treat 3 0.9212 0.30707 6.484 0.000992 ***
## Residuals 44 2.0839 0.04736
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(mice_toxic)
## Df Sum Sq Mean Sq F value Pr(>F)
## poison 2 1.033 0.5165 11.79 7.66e-05 ***
## Residuals 45 1.972 0.0438
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(mice_null)
## Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 47 3.005 0.06394
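Since aov() fits are linear models underneath, we can also compare all five candidates at once by AIC, where lower is better; a minimal sketch:
AIC(mice_int, mice_add, mice_treat, mice_toxic, mice_null)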
In the R code below, the asterisk in the formula requests the main effect of each variable plus their interaction (and all lower-order interactions); time ~ poison * treat is shorthand for time ~ poison + treat + poison:treat.
aov_test_int <- rats %>%
  anova_test(time ~ poison * treat)
## Coefficient covariances computed by hccm()
aov_test_int
## ANOVA Table (type II tests)
##
## Effect DFn DFd F p p<.05 ges
## 1 poison 2 36 23.222 3.33e-07 * 0.563
## 2 treat 3 36 13.806 3.78e-06 * 0.535
## 3 poison:treat 6 36 1.874 1.12e-01 0.238
There was no statistically significant interaction between treat and poison on survival time, F(6, 36) = 1.874, p = 0.112.
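Because the interaction term is not significant, the additive model is a reasonable simplification; a formal nested-model F-test makes the comparison explicit (a large p-value supports dropping the interaction):
anova(mice_add, mice_int) # additive (reduced) vs. interaction (full) model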
The ANOVA test assumes that, the data are normally distributed and the variance across groups are homogeneous. We can check that with some diagnostic plots.
plot(mice_int, 1) # 1. Homogeneity of variances
In the residuals-vs-fitted plot, observations 24, 26, and 30 stand out as potential outliers.
plot(mice_int, 2) # 2. Normality
In the normal Q-Q plot, the same points (24, 26, and 30) deviate most from the line.
rats %>%
  group_by(treat, poison) %>%
  shapiro_test(time)
## # A tibble: 12 x 5
## poison treat variable statistic p
## <fct> <fct> <chr> <dbl> <dbl>
## 1 I A time 0.782 0.0741
## 2 II A time 0.971 0.848
## 3 III A time 0.927 0.577
## 4 I B time 0.947 0.700
## 5 II B time 0.948 0.701
## 6 III B time 0.831 0.171
## 7 I C time 0.895 0.405
## 8 II C time 0.983 0.921
## 9 III C time 0.993 0.972
## 10 I D time 0.899 0.427
## 11 II D time 0.981 0.907
## 12 III D time 0.946 0.689
Survival times were normally distributed in each cell (all p > 0.05), as assessed by the Shapiro-Wilk normality test.
library(car)
leveneTest(time ~ treat * poison, data = rats) # Levene's test (car package)
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 11 4.1323 0.0005833 ***
## 36
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Levene's test is significant (p < 0.05). Therefore, we cannot assume homogeneity of variances across the different groups.
levene_test(time ~ treat * poison, data = rats) # Levene's test (rstatix package)
## # A tibble: 1 x 4
## df1 df2 statistic p
## <int> <int> <dbl> <dbl>
## 1 11 36 4.13 0.000583
The rstatix version agrees: Levene's test is significant (p < 0.05).
oneway.test(time ~ treat * poison, data = rats) # no assumption of equal variances
##
## One-way analysis of means (not assuming equal variances)
##
## data: time and treat * poison
## F = 14.848, num df = 11.000, denom df = 13.801, p-value = 8.107e-06
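Since Levene's test flagged unequal variances, one classical remedy for these poison data is to analyze the reciprocal of time, i.e., a death rate, which often stabilizes the variance; a hedged sketch (rats2 and rate_int are illustrative names):
rats2 = transform(rats, rate = 1/time) # death rate instead of survival time
rate_int = aov(rate ~ treat * poison, data = rats2)
leveneTest(rate ~ treat * poison, data = rats2) # recheck homogeneity on the rate scale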
As in the one-way case, suppose we reject the null hypothesis of equal means. That tells us the means differ, but not which ones. Here we compare the poisons pairwise within each treatment, using estimated marginal means with a Bonferroni adjustment.
library(emmeans)
pwc <- rats %>%
  group_by(treat) %>%
  emmeans_test(time ~ poison, p.adjust.method = "bonferroni")
pwc
## # A tibble: 12 x 9
## treat .y. group1 group2 df statistic p p.adj p.adj.signif
## * <fct> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <chr>
## 1 A time I II 36 0.877 0.386 1 ns
## 2 A time I III 36 1.92 0.0628 0.188 ns
## 3 A time II III 36 1.04 0.304 0.912 ns
## 4 B time I II 36 0.616 0.542 1 ns
## 5 B time I III 36 5.17 0.00000898 0.0000270 ****
## 6 B time II III 36 4.55 0.0000586 0.000176 ***
## 7 C time I II 36 1.83 0.0762 0.229 ns
## 8 C time I III 36 3.15 0.00325 0.00976 **
## 9 C time II III 36 1.33 0.193 0.578 ns
## 10 D time I II 36 -0.545 0.589 1 ns
## 11 D time I III 36 2.70 0.0104 0.0313 *
## 12 D time II III 36 3.25 0.00252 0.00756 **
As before, Tukey's Honest Significant Difference can be applied directly to the object created with aov(). It adjusts the p-values of the pairwise comparisons to control the FWER at 0.05, and it also gives confidence intervals for the differences of the means.
TukeyHSD(mice_int, conf.level = 0.95)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = time ~ treat * poison, data = rats)
##
## $treat
## diff lwr upr p adj
## B-A 0.36250000 0.19852116 0.52647884 0.0000047
## C-A 0.07833333 -0.08564550 0.24231217 0.5772283
## D-A 0.22000000 0.05602116 0.38397884 0.0048556
## C-B -0.28416667 -0.44814550 -0.12018783 0.0002333
## D-B -0.14250000 -0.30647884 0.02147884 0.1077087
## D-C 0.14166667 -0.02231217 0.30564550 0.1107678
##
## $poison
## diff lwr upr p adj
## II-I -0.073125 -0.2020091 0.05575913 0.3583151
## III-I -0.341250 -0.4701341 -0.21236587 0.0000005
## III-II -0.268125 -0.3970091 -0.13924087 0.0000339
##
## $`treat:poison`
## diff lwr upr p adj
## B:I-A:I 0.4675 0.09942113 0.835578867 0.0041387
## C:I-A:I 0.1550 -0.21307887 0.523078867 0.9393871
## D:I-A:I 0.1975 -0.17057887 0.565578867 0.7673377
## A:II-A:I -0.0925 -0.46057887 0.275578867 0.9989739
## B:II-A:I 0.4025 0.03442113 0.770578867 0.0220694
## C:II-A:I -0.0375 -0.40557887 0.330578867 0.9999999
## D:II-A:I 0.2550 -0.11307887 0.623078867 0.4203275
## A:III-A:I -0.2025 -0.57057887 0.165578867 0.7395476
## B:III-A:I -0.0775 -0.44557887 0.290578867 0.9998038
## C:III-A:I -0.1775 -0.54557887 0.190578867 0.8641326
## D:III-A:I -0.0875 -0.45557887 0.280578867 0.9993833
## C:I-B:I -0.3125 -0.68057887 0.055578867 0.1618705
## D:I-B:I -0.2700 -0.63807887 0.098078867 0.3379251
## A:II-B:I -0.5600 -0.92807887 -0.191921133 0.0003212
## B:II-B:I -0.0650 -0.43307887 0.303078867 0.9999650
## C:II-B:I -0.5050 -0.87307887 -0.136921133 0.0014940
## D:II-B:I -0.2125 -0.58057887 0.155578867 0.6808722
## A:III-B:I -0.6700 -1.03807887 -0.301921133 0.0000139
## B:III-B:I -0.5450 -0.91307887 -0.176921133 0.0004903
## C:III-B:I -0.6450 -1.01307887 -0.276921133 0.0000285
## D:III-B:I -0.5550 -0.92307887 -0.186921133 0.0003699
## D:I-C:I 0.0425 -0.32557887 0.410578867 0.9999996
## A:II-C:I -0.2475 -0.61557887 0.120578867 0.4645761
## B:II-C:I 0.2475 -0.12057887 0.615578867 0.4645761
## C:II-C:I -0.1925 -0.56057887 0.175578867 0.7938454
## D:II-C:I 0.1000 -0.26807887 0.468078867 0.9979417
## A:III-C:I -0.3575 -0.72557887 0.010578867 0.0634939
## B:III-C:I -0.2325 -0.60057887 0.135578867 0.5569107
## C:III-C:I -0.3325 -0.70057887 0.035578867 0.1086625
## D:III-C:I -0.2425 -0.61057887 0.125578867 0.4949176
## A:II-D:I -0.2900 -0.65807887 0.078078867 0.2439183
## B:II-D:I 0.2050 -0.16307887 0.573078867 0.7252287
## C:II-D:I -0.2350 -0.60307887 0.133078867 0.5413031
## D:II-D:I 0.0575 -0.31057887 0.425578867 0.9999899
## A:III-D:I -0.4000 -0.76807887 -0.031921133 0.0234641
## B:III-D:I -0.2750 -0.64307887 0.093078867 0.3126107
## C:III-D:I -0.3750 -0.74307887 -0.006921133 0.0426208
## D:III-D:I -0.2850 -0.65307887 0.083078867 0.2655759
## B:II-A:II 0.4950 0.12692113 0.863078867 0.0019661
## C:II-A:II 0.0550 -0.31307887 0.423078867 0.9999936
## D:II-A:II 0.3475 -0.02057887 0.715578867 0.0790981
## A:III-A:II -0.1100 -0.47807887 0.258078867 0.9953340
## B:III-A:II 0.0150 -0.35307887 0.383078867 1.0000000
## C:III-A:II -0.0850 -0.45307887 0.283078867 0.9995291
## D:III-A:II 0.0050 -0.36307887 0.373078867 1.0000000
## C:II-B:II -0.4400 -0.80807887 -0.071921133 0.0085466
## D:II-B:II -0.1475 -0.51557887 0.220578867 0.9563176
## A:III-B:II -0.6050 -0.97307887 -0.236921133 0.0000893
## B:III-B:II -0.4800 -0.84807887 -0.111921133 0.0029569
## C:III-B:II -0.5800 -0.94807887 -0.211921133 0.0001822
## D:III-B:II -0.4900 -0.85807887 -0.121921133 0.0022537
## D:II-C:II 0.2925 -0.07557887 0.660578867 0.2335610
## A:III-C:II -0.1650 -0.53307887 0.203078867 0.9105785
## B:III-C:II -0.0400 -0.40807887 0.328078867 0.9999998
## C:III-C:II -0.1400 -0.50807887 0.228078867 0.9695827
## D:III-C:II -0.0500 -0.41807887 0.318078867 0.9999976
## A:III-D:II -0.4575 -0.82557887 -0.089421133 0.0054007
## B:III-D:II -0.3325 -0.70057887 0.035578867 0.1086625
## C:III-D:II -0.4325 -0.80057887 -0.064421133 0.0103742
## D:III-D:II -0.3425 -0.71057887 0.025578867 0.0880763
## B:III-A:III 0.1250 -0.24307887 0.493078867 0.9868993
## C:III-A:III 0.0250 -0.34307887 0.393078867 1.0000000
## D:III-A:III 0.1150 -0.25307887 0.483078867 0.9932581
## C:III-B:III -0.1000 -0.46807887 0.268078867 0.9979417
## D:III-B:III -0.0100 -0.37807887 0.358078867 1.0000000
## D:III-C:III 0.0900 -0.27807887 0.458078867 0.9992007
To get the estimated means from each model, we'll build a table of all treat-poison combinations to predict on.
mice_table = expand.grid(treat = unique(rats$treat),
                         poison = unique(rats$poison))
matrix(paste0(mice_table$treat, "-", mice_table$poison), 4, 3, byrow = TRUE)
## [,1] [,2] [,3]
## [1,] "A-I" "B-I" "C-I"
## [2,] "D-I" "A-II" "B-II"
## [3,] "C-II" "D-II" "A-III"
## [4,] "B-III" "C-III" "D-III"
get_est_means = function(model, table) {
  # expand.grid() varies treat fastest, so the 12 predictions arrive in
  # blocks of four treatments (A-D) per poison; fill the matrix by row
  mat = matrix(predict(model, table), nrow = 3, ncol = 4, byrow = TRUE)
  colnames(mat) = c("A", "B", "C", "D")
  rownames(mat) = c("I", "II", "III")
  mat
}
knitr::kable(get_est_means(model = mice_int, table = mice_table))
|     | A      | B      | C      | D      |
|-----|--------|--------|--------|--------|
| I   | 0.4125 | 0.8800 | 0.5675 | 0.6100 |
| II  | 0.3200 | 0.8150 | 0.3750 | 0.6675 |
| III | 0.2100 | 0.3350 | 0.2350 | 0.3250 |
interaction_means = get_est_means(model = mice_int, table = mice_table)
interaction_means["I",] - interaction_means["II",]
##       A       B       C       D
##  0.0925  0.0650  0.1925 -0.0575
In the interaction model, the poison I vs. poison II difference is free to change with the treatment.
additive_means = get_est_means(model = mice_add, table = mice_table)
additive_means["I",] - additive_means["II",]
##        A        B        C        D
## 0.073125 0.073125 0.073125 0.073125
The additive model, by contrast, forces this difference to be identical for every treatment.
knitr::kable(get_est_means(model = mice_treat, table = mice_table))
|     | A         | B         | C         | D         |
|-----|-----------|-----------|-----------|-----------|
| I   | 0.3141667 | 0.6766667 | 0.3925000 | 0.5341667 |
| II  | 0.3141667 | 0.6766667 | 0.3925000 | 0.5341667 |
| III | 0.3141667 | 0.6766667 | 0.3925000 | 0.5341667 |
knitr::kable(get_est_means(model = mice_toxic, table = mice_table))
|     | A        | B        | C        | D        |
|-----|----------|----------|----------|----------|
| I   | 0.617500 | 0.617500 | 0.617500 | 0.617500 |
| II  | 0.544375 | 0.544375 | 0.544375 | 0.544375 |
| III | 0.276250 | 0.276250 | 0.276250 | 0.276250 |
knitr::kable(get_est_means(model = mice_null, table = mice_table))
|     | A        | B        | C        | D        |
|-----|----------|----------|----------|----------|
| I   | 0.479375 | 0.479375 | 0.479375 | 0.479375 |
| II  | 0.479375 | 0.479375 | 0.479375 | 0.479375 |
| III | 0.479375 | 0.479375 | 0.479375 | 0.479375 |
Reading the five tables together shows how each model constrains the fitted means: the interaction model fits every treat-poison cell separately, the additive model keeps treatment differences constant across poisons, the single-factor models ignore one factor entirely, and the null model returns the grand mean everywhere.