In this exercise we consider the built-in R dataset PlantGrowth, which contains the weight of plants obtained under a control condition and two different treatment conditions. Please work through the following steps.
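One way to load the data and print the first rows (a minimal sketch; the original code chunk is not shown):

data("PlantGrowth")
head(PlantGrowth)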
## weight group
## 1 4.17 ctrl
## 2 5.58 ctrl
## 3 5.18 ctrl
## 4 6.11 ctrl
## 5 4.50 ctrl
## 6 4.61 ctrl
Show the levels of the grouping variable:
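For example, using the base levels() function:

levels(PlantGrowth$group)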
## [1] "ctrl" "trt1" "trt2"
where ctrl = control, trt1 = treatment 1, and trt2 = treatment 2.
Compute some summary statistics (count, mean and sd) of the variable weight organized by groups:
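A sketch of one way to produce the table below, assuming the dplyr and rstatix packages:

library(dplyr)
library(rstatix)
PlantGrowth %>%
  group_by(group) %>%
  get_summary_stats(weight, type = "mean_sd")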
## # A tibble: 3 x 5
## group variable n mean sd
## <fct> <chr> <dbl> <dbl> <dbl>
## 1 ctrl weight 10 5.03 0.583
## 2 trt1 weight 10 4.66 0.794
## 3 trt2 weight 10 5.53 0.443
Create a box plot of weight by group:
library(ggplot2)
ggplot(PlantGrowth, aes(x = group, y = weight, fill = group)) +
  geom_boxplot() +
  geom_jitter(shape = 15,
              color = "steelblue",
              position = position_jitter(0.21)) +
  theme_minimal()

Outliers can be easily identified using box plot methods, implemented in the R function identify_outliers() [rstatix package].
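A sketch of the usual grouped call (assuming dplyr and rstatix are loaded as above):

PlantGrowth %>%
  group_by(group) %>%
  identify_outliers(weight)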
## # A tibble: 2 x 4
## group weight is.outlier is.extreme
## <fct> <dbl> <lgl> <lgl>
## 1 trt1 5.87 TRUE FALSE
## 2 trt1 6.03 TRUE FALSE
There were no extreme outliers.
The aov() function is used to obtain the relevant sums of squares, and applying summary() to the output from aov() produces the desired ANOVA table (without the unneeded row for the total).
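One way to reproduce both tables below; the object name res_aov is an assumption (it is reused in the diagnostic checks further down), and rstatix is assumed to be loaded:

res_aov <- aov(weight ~ group, data = PlantGrowth)
PlantGrowth %>% anova_test(weight ~ group)  # rstatix table with effect size (ges)
summary(res_aov)                            # classical ANOVA table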
## Coefficient covariances computed by hccm()
## ANOVA Table (type II tests)
##
## Effect DFn DFd F p p<.05 ges
## 1 group 2 27 4.846 0.016 * 0.264
## Df Sum Sq Mean Sq F value Pr(>F)
## group 2 3.766 1.8832 4.846 0.0159 *
## Residuals 27 10.492 0.3886
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
In the table above, the column ges corresponds to the generalized eta squared (effect size). It measures the proportion of the variability in the outcome variable (here, plant weight) that can be explained by the predictor (here, treatment group). An effect size of 0.264 means that 26.4% of the variation in weight can be accounted for by the treatment conditions. From the ANOVA table above, it can be seen that there are significant differences between groups, highlighted with "*": \(F(2, 27) = 4.85\), \(p = 0.016\), \(\eta^2_g = 0.26\).
where,
- F indicates that we are comparing to an F-distribution (F-test);
- (2, 27) are the degrees of freedom in the numerator (DFn) and the denominator (DFd), respectively;
- 4.85 is the obtained value of the F-statistic;
- p is the p-value;
- ges is the generalized effect size (the amount of variability due to the factor).
The ANOVA test assumes that the data are normally distributed and that the variance across groups is homogeneous. We can check these assumptions with some diagnostic plots.
The normality assumption can be checked using one of the following two approaches:
1. Analyze the ANOVA model residuals to check normality for all groups together. This approach is easier and very handy when there are many groups or few data points per group.
2. Check normality for each group separately. This approach might be used when there are only a few groups and many data points per group.

This section shows how to proceed with both options.
Check the normality assumption by analyzing the model residuals, using a QQ plot and the Shapiro-Wilk test of normality. A QQ plot plots the quantiles of the data against the quantiles of the normal distribution.
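A sketch of these diagnostics with base R, assuming the res_aov object defined above:

plot(res_aov, 1)                     # residuals vs fitted values (homogeneity)
plot(res_aov, 2)                     # normal QQ plot of the residuals
shapiro.test(residuals(res_aov))     # Shapiro-Wilk test on the residuals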
In the residuals-versus-fitted plot, there is no evident relationship between the residuals and the fitted values (the mean of each group), which is good. So we can assume the homogeneity of variances.
Check the normality assumption by groups, computing the Shapiro-Wilk test for each group level. If the data are normally distributed, the p-value should be greater than 0.05.
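A sketch of the grouped test with rstatix (assumed loaded as above):

PlantGrowth %>%
  group_by(group) %>%
  shapiro_test(weight)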
## # A tibble: 3 x 4
## group variable statistic p
## <fct> <chr> <dbl> <dbl>
## 1 ctrl weight 0.957 0.747
## 2 trt1 weight 0.930 0.452
## 3 trt2 weight 0.941 0.564
The scores were normally distributed (\(p \gt 0.05\)) for each group, as assessed by Shapiro-Wilk's test of normality. Note: if the sample size is greater than 50, the normal QQ plot is preferred, because at larger sample sizes the Shapiro-Wilk test becomes very sensitive even to minor deviations from normality.
A QQ plot draws the correlation between the given data and the normal distribution. Create QQ plots for each group level:
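One way to produce faceted QQ plots, assuming the ggpubr package is available:

library(ggpubr)
ggqqplot(PlantGrowth, "weight", facet.by = "group")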
Points 4, 15, and 17 are detected as outliers, since they lie far from the reference line; such points can affect normality and homogeneity of variance, and it can be useful to remove them to meet the test assumptions. Nevertheless, the Shapiro-Wilk test on the ANOVA residuals (\(W = 0.97\), \(p = 0.8\)) finds no indication that normality is violated. For checking the homogeneity of variances, Levene's test is recommended, as it is less sensitive to departures from the normal distribution.
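Levene's test can be computed with levene_test() from rstatix (a sketch, assuming rstatix is loaded):

PlantGrowth %>% levene_test(weight ~ group)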
## # A tibble: 1 x 4
## df1 df2 statistic p
## <int> <int> <dbl> <dbl>
## 1 2 27 1.12 0.341
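The output below is from Welch's one-way test, which does not assume equal variances; it can be obtained with base R's oneway.test():

oneway.test(weight ~ group, data = PlantGrowth)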
##
## One-way analysis of means (not assuming equal variances)
##
## data: weight and group
## F = 5.181, num df = 2.000, denom df = 17.128, p-value = 0.01739
From the Levene's test output above, the p-value is \(0.341 \gt 0.05\), which is not significant. This means there is no significant difference between the variances across groups; therefore, we can assume the homogeneity of variances in the different treatment groups. Welch's test above, which does not assume equal variances, agrees with the standard ANOVA that the group means differ (\(p = 0.017\)).
A significant one-way ANOVA is generally followed up by Tukey post-hoc tests to perform multiple pairwise comparisons between groups. Key R function: tukey_hsd() [rstatix].
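A sketch of the rstatix call (assumed loaded as above):

PlantGrowth %>% tukey_hsd(weight ~ group)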
## # A tibble: 3 x 9
## term group1 group2 null.value estimate conf.low conf.high p.adj p.adj.signif
## * <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
## 1 group ctrl trt1 0 -0.371 -1.06 0.320 0.391 ns
## 2 group ctrl trt2 0 0.494 -0.197 1.19 0.198 ns
## 3 group trt1 trt2 0 0.865 0.174 1.56 0.012 *
The output contains the following columns:
- estimate: the estimated difference between the means of the two groups
- conf.low, conf.high: the lower and upper end points of the confidence interval at 95% (default)
- p.adj: the p-value after adjustment for multiple comparisons

It can be seen from the output that only the difference between trt2 and trt1 is significant (adjusted p-value = 0.012).
Tukey's Honest Significant Difference can be applied directly to an object created with aov(). It adjusts the p-values of the pairwise comparisons of the means to control the family-wise error rate (FWER), in this case at 0.05. Notice that it also gives confidence intervals for the differences of the means.
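For example, reusing the res_aov object assumed above:

TukeyHSD(res_aov)  # res_aov = aov(weight ~ group, data = PlantGrowth)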
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = weight ~ group, data = PlantGrowth)
##
## $group
## diff lwr upr p adj
## trt1-ctrl -0.371 -1.0622161 0.3202161 0.3908711
## trt2-ctrl 0.494 -0.1972161 1.1852161 0.1979960
## trt2-trt1 0.865 0.1737839 1.5562161 0.0120064
Based on these results, only the two treatments differ significantly from each other (trt2 vs. trt1); neither treatment differs significantly from the control.
We could report the results of the one-way ANOVA as follows:
* A one-way ANOVA was performed to evaluate whether plant growth was different for the 3 treatment groups: ctrl, trt1 and trt2, each with \(n = 10\).
* Data are presented as mean +/- standard deviation. Plant growth was statistically significantly different between treatment groups, \(F(2, 27) = 4.85\), \(p = 0.016\), generalized eta squared = 0.26.
* Plant growth decreased in the trt1 group (4.66 +/- 0.79) compared to the ctrl group (5.03 +/- 0.58). It increased in the trt2 group (5.53 +/- 0.44) compared to the trt1 and ctrl groups.
* Tukey post-hoc analyses revealed that the increase from trt1 to trt2 (0.87, 95% CI 0.17 to 1.56) was statistically significant (\(p = 0.012\)), but no other group differences were statistically significant.
Please apply two-way ANOVA procedures to the rats data from the faraway package. There are two factors here: poison and treat. You can use the levels() function to extract the levels of a factor variable. Give your opinions accordingly!
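One way to produce the summaries below:

library(faraway)
data(rats)
summary(rats)
levels(rats$poison)
levels(rats$treat)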
## time poison treat
## Min. :0.1800 I :16 A:12
## 1st Qu.:0.3000 II :16 B:12
## Median :0.4000 III:16 C:12
## Mean :0.4794 D:12
## 3rd Qu.:0.6225
## Max. :1.2400
## [1] "I" "II" "III"
## [1] "A" "B" "C" "D"
Here, 48 rats were randomly assigned one of three poisons and one of four possible treatments. The experimenters then measured their survival time, in tens of hours. This gives a total of 12 groups, each with 4 replicates.
Before running any tests, we should first look at the data. We will create interaction plots, which will help us visualize the effect of one factor, as we move through the levels of another factor.
par(mfrow = c(1, 2))
with(rats, interaction.plot(poison, treat, time, lwd = 2, col = 1:4))
with(rats, interaction.plot(treat, poison, time, lwd = 2, col = 1:3))

If there is no interaction, and thus an additive model holds, we would expect to see parallel lines. Changing the level of one factor can have an effect on the response, but the difference between the levels of the other factor should stay the same.
The obvious indication of interaction would be lines that cross while heading in different directions. We do not really see that here, but the lines are not strictly parallel, and there is some overlap in the right panel. Is this interaction effect significant?
Let’s fit each of the possible models, then investigate their estimates for each of the group means.
rats_int = aov(time ~ poison * treat, data = rats) # interaction model
rats_add = aov(time ~ poison + treat, data = rats) # additive model
rats_pois = aov(time ~ poison , data = rats) # single factor model
rats_treat = aov(time ~ treat, data = rats) # single factor model
rats_null = aov(time ~ 1, data = rats) # null model

To get the estimates, we'll create a table of all factor combinations on which to predict.
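A sketch of building that table with expand.grid(); the name rats_table is an assumption reused below:

rats_table = expand.grid(poison = unique(rats$poison), treat = unique(rats$treat))
rats_table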
## poison treat
## 1 I A
## 2 II A
## 3 III A
## 4 I B
## 5 II B
## 6 III B
## 7 I C
## 8 II C
## 9 III C
## 10 I D
## 11 II D
## 12 III D
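The layout matrix below can be produced by pasting the two columns together and filling by row:

matrix(paste0(rats_table$poison, "-", rats_table$treat), nrow = 4, ncol = 3, byrow = TRUE)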
## [,1] [,2] [,3]
## [1,] "I-A" "II-A" "III-A"
## [2,] "I-B" "II-B" "III-B"
## [3,] "I-C" "II-C" "III-C"
## [4,] "I-D" "II-D" "III-D"
Since we would be repeating ourselves a number of times, we write a function to perform the prediction. Some housekeeping is done to keep the estimates in order and to provide row and column names. Above, it is shown where each of the estimates will be placed in the resulting matrix.
get_est_means = function(model, table) {
mat = matrix(predict(model, table), nrow = 4, ncol = 3, byrow = TRUE)
colnames(mat) = c("I", "II", "III")
rownames(mat) = c("A", "B", "C", "D")
mat
}

First, we obtain the estimates from the interaction model. Note that each cell has a different value.
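For example:

get_est_means(model = rats_int, table = rats_table)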
|   | I | II | III |
|---|---|---|---|
| A | 0.4125 | 0.3200 | 0.210 |
| B | 0.8800 | 0.8150 | 0.335 |
| C | 0.5675 | 0.3750 | 0.235 |
| D | 0.6100 | 0.6675 | 0.325 |
Next, we obtain the estimates from the additive model. Again, each cell has a different value. We also see that these estimates are somewhat close to those from the interaction model.
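For example:

get_est_means(model = rats_add, table = rats_table)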
|   | I | II | III |
|---|---|---|---|
| A | 0.4522917 | 0.3791667 | 0.1110417 |
| B | 0.8147917 | 0.7416667 | 0.4735417 |
| C | 0.5306250 | 0.4575000 | 0.1893750 |
| D | 0.6722917 | 0.5991667 | 0.3310417 |
To understand the difference, let’s consider the effect of the treatments.
additive_means = get_est_means(model = rats_add, table = rats_table)
additive_means["A", ] - additive_means["B", ]

##       I      II     III
## -0.3625 -0.3625 -0.3625

interaction_means = get_est_means(model = rats_int, table = rats_table)
interaction_means["A", ] - interaction_means["B", ]

##       I      II     III
## -0.4675 -0.4950 -0.1250
This is the key difference between the interaction and additive models. The difference between the effect of treatments A and B is the same for each poison in the additive model. They are different in the interaction model.
The remaining three models are much simpler, having either only row effects, only column effects, or, in the case of the null model, no effects at all.
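Their estimated means can be obtained in the same way; the three tables below correspond to these three calls, in order:

get_est_means(model = rats_pois, table = rats_table)   # poison-only model
get_est_means(model = rats_treat, table = rats_table)  # treatment-only model
get_est_means(model = rats_null, table = rats_table)   # null model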
|   | I | II | III |
|---|---|---|---|
| A | 0.6175 | 0.544375 | 0.27625 |
| B | 0.6175 | 0.544375 | 0.27625 |
| C | 0.6175 | 0.544375 | 0.27625 |
| D | 0.6175 | 0.544375 | 0.27625 |
|   | I | II | III |
|---|---|---|---|
| A | 0.3141667 | 0.3141667 | 0.3141667 |
| B | 0.6766667 | 0.6766667 | 0.6766667 |
| C | 0.3925000 | 0.3925000 | 0.3925000 |
| D | 0.5341667 | 0.5341667 | 0.5341667 |
|   | I | II | III |
|---|---|---|---|
| A | 0.479375 | 0.479375 | 0.479375 |
| B | 0.479375 | 0.479375 | 0.479375 |
| C | 0.479375 | 0.479375 | 0.479375 |
| D | 0.479375 | 0.479375 | 0.479375 |
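To test the interaction formally, we can apply summary() to the interaction model, which produces the ANOVA table below:

summary(rats_int)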
## Df Sum Sq Mean Sq F value Pr(>F)
## poison 2 1.0330 0.5165 23.222 3.33e-07 ***
## treat 3 0.9212 0.3071 13.806 3.78e-06 ***
## poison:treat 6 0.2501 0.0417 1.874 0.112
## Residuals 36 0.8007 0.0222
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Using a significance level of \(\alpha = 0.05\), we see that the interaction is not significant. The main effects for poison and treat are both significant, so we select the additive model.
Within the additive model, we could do further testing about the main effects.
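For example, Tukey's HSD applied to the additive model:

TukeyHSD(rats_add)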
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = time ~ poison + treat, data = rats)
##
## $poison
## diff lwr upr p adj
## II-I -0.073125 -0.2089936 0.0627436 0.3989657
## III-I -0.341250 -0.4771186 -0.2053814 0.0000008
## III-II -0.268125 -0.4039936 -0.1322564 0.0000606
##
## $treat
## diff lwr upr p adj
## B-A 0.36250000 0.18976135 0.53523865 0.0000083
## C-A 0.07833333 -0.09440532 0.25107198 0.6221729
## D-A 0.22000000 0.04726135 0.39273865 0.0076661
## C-B -0.28416667 -0.45690532 -0.11142802 0.0004090
## D-B -0.14250000 -0.31523865 0.03023865 0.1380432
## D-C 0.14166667 -0.03107198 0.31440532 0.1416151
For an example with interaction, we investigate the warpbreaks dataset, a default dataset in R.
par(mfrow = c(1, 2))
with(warpbreaks, interaction.plot(wool, tension, breaks, lwd = 2, col = 2:4))
with(warpbreaks, interaction.plot(tension, wool, breaks, lwd = 2, col = 2:3))

Either plot makes it rather clear that the wool and tension factors interact.
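A sketch producing the table below; the object name wb_int is an assumption:

wb_int = aov(breaks ~ wool * tension, data = warpbreaks)  # interaction model
summary(wb_int)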
## Df Sum Sq Mean Sq F value Pr(>F)
## wool 1 451 450.7 3.765 0.058213 .
## tension 2 2034 1017.1 8.498 0.000693 ***
## wool:tension 2 1003 501.4 4.189 0.021044 *
## Residuals 48 5745 119.7
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Using an \(\alpha\) of \(0.05\), the ANOVA test finds that the interaction is significant, so we use the interaction model here.