Today we will be doing one-way ANOVAs in R. There are many ways to do an ANOVA in R. Today, we will keep it simple by assuming that you have balanced data. In this case, the basic procedure for conducting an ANOVA is:
1. aov() to conduct the ANOVA
2. plot() to produce residual plots (for assessing the assumptions of the ANOVA)
3. summary() to produce the ANOVA table

In future tutorials, we will learn other methods, such as lm() + gad() for balanced data in more complicated experimental designs, and lmer() + Anova() to implement linear mixed-effect models that can be used even for unbalanced data.
Remember to set your working directory before you start. (See Week 1’s tutorial if you forget how.) You also need to install and load today’s packages.
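If any of these packages are not yet installed, a one-time command along these lines should do it (the package names are simply those loaded below):

install.packages(c("ggplot2", "ggthemes", "emmeans", "effectsize",
                   "multcomp", "multcompView", "ICC")) # Install once; afterwards you only need library()

Then load the packages for this session: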
library(ggplot2)
library(ggthemes)
library(emmeans)
library(effectsize)
library(multcomp)
library(multcompView)
library(ICC)
The example data set is SSRI.csv. This is simulated data representing the effect of 4 different selective serotonin reuptake inhibitors (SSRIs) on receptor binding. (SSRIs are drugs that are commonly used as antidepressants.) For each SSRI, you have obtained 15 estimates of \(K_i\). \(K_i\) is a measure of neuroreceptor binding (in units of nanomolar concentration, nM). Lower values of \(K_i\) indicate a stronger affinity of the drug for the receptor, suggesting the drug may be more effective. You wish to compare the four SSRIs.
# Read the data and convert variables as appropriate
d <- read.csv("Data/SSRI.csv")
d$SSRI <- factor(d$SSRI)
Check for balance. If the data are unbalanced, it is not recommended
to use the aov() + summary() method because it
implements Type I sums of squares.1 This command returns the sample size of
each group:
# The replications() function returns a frequency table with a count for each group
replications(Ki ~ SSRI, data = d)
## SSRI
## 15
Only one value was printed, meaning that all groups have an equal sample size (i.e., the experiment is balanced).
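If you would rather see an explicit count for each group, a simple frequency table gives the same information:

table(d$SSRI) # Number of observations in each SSRI level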
Now plot the data. A boxplot seems appropriate for exploring the
data. We will also show the raw data using
geom_jitter().
ggplot(data = d, aes(x = SSRI, y = Ki)) +
geom_boxplot(outlier.shape = NA) +
geom_jitter(height=0, width = 0.1, alpha = 0.3) +
labs(y = expression(italic(K[i])*" (nM)")) +
theme_few()
The variances do not appear to be perfectly equal, but they never
will be. There are no glaring outliers. SSRI1 looks a little curious
(there may be multiple groups within SSRI1?). But overall, it seems like
we should be okay to continue with an ANOVA using aov(). We
will check other assumptions later.
Let’s continue exploring the data by getting the mean and standard deviation for each SSRI:
aggregate(Ki ~ SSRI, data = d, mean) # Calculate the mean for each group
## SSRI Ki
## 1 SSRI1 279.7673
## 2 SSRI2 241.7253
## 3 SSRI3 273.2180
## 4 SSRI4 295.0840
aggregate(Ki ~ SSRI, data = d, sd) # Calculate the s.d. for each group
## SSRI Ki
## 1 SSRI1 45.49070
## 2 SSRI2 32.59665
## 3 SSRI3 37.68350
## 4 SSRI4 40.59071
“Contrasts” describe how the different levels of a treatment factor are compared. By default, R uses “treatment contrasts”. In treatment contrasts, the first level of each factor is used as the reference level, and all other levels are compared to the reference level. (For example, SSRI1 would be the reference level, and SSRI2, SSRI3, and SSRI4 would be compared to SSRI1.) This is okay for a linear regression, but it is inappropriate for a classical ANOVA. To check which contrasts R is currently using by default, you can type:
options("contrasts")
## $contrasts
## unordered ordered
## "contr.treatment" "contr.poly"
ANOVAs using aov() in R are best conducted with
orthogonal contrasts, which we set below. These contrasts will
remain set in R until you override them or revert them (or until you
close R).
op <- options(contrasts = c("contr.sum", "contr.poly")) # Set orthogonal "sum-to-zero" contrasts
Suppose that SSRI1 and SSRI2 are both brand-name drugs, and SSRI3 and SSRI4 are both generic drugs. I may be interested in testing whether the brand-name drugs differ from the generic drugs. This would be an example of a “planned comparison”. Planned comparisons are handled by defining a contrast matrix.
To handle our planned brand-name vs. generic drug comparison, we need to define different contrasts, including a contrast that compares the average of SSRI1 and SSRI2 with the average of SSRI3 and SSRI4.
levels(d$SSRI) # Show me the order in which the drugs are stored by R
## [1] "SSRI1" "SSRI2" "SSRI3" "SSRI4"
c1 <- cbind(
c(-1, -1, 1, 1), # Brand-name (SSRI1, SSRI2) vs. generic (SSRI3, SSRI4) contrast
c(-1, 1, 0, 0), # SSRI1 vs. SSRI2 contrast (i.e., within brand-name drugs)
c( 0, 0, -1, 1) # SSRI3 vs. SSRI4 contrast (i.e., within generic drugs)
)
(contrasts(d$SSRI) <- c1)
##      [,1] [,2] [,3]
## [1,]   -1   -1    0
## [2,]   -1    1    0
## [3,]    1    0   -1
## [4,]    1    0    1
Here is how you interpret a contrast matrix:2 each column defines one contrast, and each row corresponds to one factor level, in the order returned by levels(d$SSRI). Within a column, levels whose coefficients have opposite signs are compared with one another (levels sharing the same sign are averaged together first), and levels with a coefficient of 0 are excluded from that contrast.
We can quickly check that these contrasts are orthogonal. (I have hidden the output to save space):
sum(contrasts(d$SSRI)[,1] * contrasts(d$SSRI)[,2]) # = 0
sum(contrasts(d$SSRI)[,1] * contrasts(d$SSRI)[,3]) # = 0
sum(contrasts(d$SSRI)[,2] * contrasts(d$SSRI)[,3]) # = 0
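Equivalently, crossprod() checks every pair at once: it returns the matrix of all pairwise products of the contrast columns, and zeros everywhere off the diagonal confirm that the contrasts are orthogonal.

crossprod(contrasts(d$SSRI)) # Off-diagonal entries of 0 indicate orthogonal contrasts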
The null hypothesis of a one-way ANOVA is that the mean response is identical for all levels of the treatment factor. (E.g., \(\bar{K_i}\) is the same for all SSRIs.) If the null hypothesis is true, this implies that every one of our planned contrasts equals zero (e.g., the brand-name drugs do not differ from the generic drugs, SSRI1 does not differ from SSRI2, and SSRI3 does not differ from SSRI4).
A significant result in an ANOVA indicates that the mean response is not identical across all levels of the treatment factor, so at least one of these contrasts is likely non-zero.
Now we will conduct a fixed-effects ANOVA using the
aov() function; we will produce the ANOVA table using the
summary() command. Note that summary()
implements Type I sums of squares, which are often inappropriate. But in
this case, it is unimportant because we have only one factor and our
data are balanced.
SSRI.aov <- aov(Ki ~ SSRI, data = d) # Do the ANOVA
summary(SSRI.aov) # Print the ANOVA table
##             Df Sum Sq Mean Sq F value  Pr(>F)
## SSRI         3  22657    7552   4.873 0.00442 **
## Residuals   56  86794    1550
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The result looks significant (\(p = 0.004\)). But before we get too excited about the results, we should test whether the data meet the assumptions of an ANOVA.
Look for outliers or unequal spread of the residuals, and check for normality of the residuals. An ANOVA assumes that the data are normally distributed within each group, which can be determined by examining the residuals.
opar <- par(mfrow = c(2, 2)) # Create a 2x2 plotting grid
plot(SSRI.aov, which = 5) # Plot residuals for each factor level
plot(SSRI.aov, which = 1) # Plot residuals vs. fitted values
plot(SSRI.aov, which = 2) # Plot a Q-Q plot
plot(residuals(SSRI.aov) ~ Subject, main="Residuals vs Exp. Unit", font.main=1, data=d)
par(opar) # Return to original plotting settings (i.e., a 1x1 grid)
The residuals do not look terrible to me. The first two plots show roughly equal variance of the residuals among treatments and fitted values. The third plot shows that the residuals are roughly normally distributed. The fourth plot shows the residual for each unit, which could be useful for identifying outliers or patterns in the residuals.
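Visual inspection is usually sufficient, but if you would like a numerical check to accompany the plots, base R provides formal tests of these assumptions:

shapiro.test(residuals(SSRI.aov)) # Shapiro-Wilk test of normality of the residuals
bartlett.test(Ki ~ SSRI, data = d) # Bartlett's test of homogeneity of variances among groups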
Now get the estimated marginal mean for each SSRI, along with its standard error and 95% CI. Standard errors are the same for each group, because they are calculated using the pooled standard deviation.
(emm <- emmeans(SSRI.aov, ~SSRI))
##  SSRI  emmean   SE df lower.CL upper.CL
##  SSRI1    280 10.2 56      259      300
##  SSRI2    242 10.2 56      221      262
##  SSRI3    273 10.2 56      253      294
##  SSRI4    295 10.2 56      275      315
##
## Confidence level used: 0.95
Now calculate the overall effect size. Note that this effect size is not as commonly reported in Biology as it is in some other fields (e.g., Psychology). \(\omega^2\) indicates the proportion of variance in \(K_i\) that is explained by SSRI.
omega_squared(SSRI.aov)
## For one-way between subjects designs, partial omega squared is
## equivalent to omega squared. Returning omega squared.
## # Effect Size for ANOVA
##
## Parameter | Omega2 | 95% CI
## ---------------------------------
## SSRI | 0.16 | [0.02, 1.00]
##
## - One-sided CIs: upper bound fixed at [1.00].
You could report this ANOVA as: “Receptor binding differed significantly among the SSRIs tested (\(F_{3, 56} = 4.873, p = 0.004; \omega^2 = 0.16\ [95\%\ CI = 0.02-1.00]\)).”
To view the results of any planned comparisons, we can use the
“split” option of summary():
summary(SSRI.aov, split = list(SSRI = list("brand-name vs. generic" = 1,
"brand-name 1 vs. brand-name 2" = 2,
"generic 1 vs. generic 2" = 3)))
##                                       Df Sum Sq Mean Sq F value  Pr(>F)
## SSRI                                   3  22657    7552   4.873 0.00442 **
##   SSRI: brand-name vs. generic         1   8217    8217   5.301 0.02505 *
##   SSRI: brand-name 1 vs. brand-name 2  1  10854   10854   7.003 0.01054 *
##   SSRI: generic 1 vs. generic 2        1   3586    3586   2.314 0.13387
## Residuals                             56  86794    1550
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Our brand-name and generic drugs differed significantly (\(F_{1,56}=5.301, p=0.025\)). Here is how you might calculate the averages for each of these two groups:
mean(d[d$SSRI == "SSRI1" | d$SSRI == "SSRI2", "Ki"]) # Mean of brand-name drugs
mean(d[d$SSRI == "SSRI3" | d$SSRI == "SSRI4", "Ki"]) # Mean of generic drugs
## [1] 260.7463
## [1] 284.151
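If you also want a standard error and confidence interval for the brand-name vs. generic difference, you can ask emmeans to estimate the same contrast directly. Here is a sketch using the emm object created earlier; the coefficients of 1/2 average the two drugs within each group:

contrast(emm, method = list("brand-name vs. generic" = c(-1/2, -1/2, 1/2, 1/2))) # Generic minus brand-name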
If we wanted to test every possible pair of SSRI drugs to see which differ, we could use Tukey’s method.3
pairs(emm, adjust = "tukey")
##  contrast      estimate   SE df t.ratio p.value
##  SSRI1 - SSRI2    38.04 14.4 56   2.646  0.0502
##  SSRI1 - SSRI3     6.55 14.4 56   0.456  0.9683
##  SSRI1 - SSRI4   -15.32 14.4 56  -1.065  0.7117
##  SSRI2 - SSRI3   -31.49 14.4 56  -2.191  0.1384
##  SSRI2 - SSRI4   -53.36 14.4 56  -3.712  0.0026
##  SSRI3 - SSRI4   -21.87 14.4 56  -1.521  0.4320
##
## P value adjustment: tukey method for comparing a family of 4 estimates
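It is often useful to report confidence intervals for the pairwise differences as well; confint() returns Tukey-adjusted intervals for the same set of comparisons:

confint(pairs(emm, adjust = "tukey")) # Tukey-adjusted 95% CIs for each pairwise difference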
Now let’s look at the “compact letter display” that more clearly indicates which groups differ significantly.
(tukey_cld <- cld(emm))
##  SSRI  emmean   SE df lower.CL upper.CL .group
##  SSRI2    242 10.2 56      221      262  1
##  SSRI3    273 10.2 56      253      294  12
##  SSRI1    280 10.2 56      259      300  12
##  SSRI4    295 10.2 56      275      315   2
##
## Confidence level used: 0.95
## P value adjustment: tukey method for comparing a family of 4 estimates
## significance level used: alpha = 0.05
## NOTE: If two or more means share the same grouping symbol,
## then we cannot show them to be different.
## But we also did not show them to be the same.
The output suggests that the only significant difference between means is SSRI2 vs. SSRI4. Now we may wish to re-plot our data to show means, 95% confidence intervals (estimated using the pooled standard deviation) and the raw data.
plot(tukey_cld) +
geom_jitter(aes(y = SSRI, x = Ki), height=0.1, width = 0, alpha = 0.3, data = d) +
labs(x = expression(italic(K[i])*" (nM)")) +
coord_flip() +
theme_few()
To add “a”, “b”, “ab” (etc.) indicators to the plot (corresponding to “1”, “2”, “12” (etc.) in the “compact letter display”), the easiest way is to edit the figure in Adobe Illustrator, Inkscape, etc.
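Alternatively, you can place the letters in R itself. The sketch below is one way to do it: it re-runs cld() with the Letters argument so that the groups are labelled "a", "b", "ab" instead of "1", "2", "12", trims the padding spaces from the .group column, and writes each label just beyond the group's upper confidence limit (the offset of 15 nM is arbitrary and may need adjusting).

letter_cld <- cld(emm, Letters = letters) # Same compact letter display, but using letters
letter_df <- as.data.frame(letter_cld)
letter_df$.group <- trimws(letter_df$.group) # Remove the padding spaces around each label

plot(letter_cld) +
  geom_text(aes(y = SSRI, x = upper.CL + 15, label = .group), data = letter_df) +
  geom_jitter(aes(y = SSRI, x = Ki), height = 0.1, width = 0, alpha = 0.3, data = d) +
  labs(x = expression(italic(K[i])*" (nM)")) +
  coord_flip() +
  theme_few()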
If your data do not meet the assumptions of an ANOVA, you have a few choices: you can try transforming the response variable so that the assumptions are met, or you can use a non-parametric alternative such as the Kruskal-Wallis test.4
In the unlikely case that our SSRIs are not specific treatments of interest, but rather a random sample of SSRIs (drawn from a population whose effects are assumed to be normally distributed), and our goal is to estimate how much of the variance in \(K_i\) among people taking SSRIs is attributable to which SSRI they take, then we would use a random-effects ANOVA instead of a fixed-effects ANOVA.
In aov(), random effects are specified using the
Error() term.
SSRI.raov <- aov(Ki ~ Error(SSRI), data = d) # Do the ANOVA
summary(SSRI.raov) # Summarize the ANOVA model
##
## Error: SSRI
## Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 3 22657 7552
##
## Error: Within
## Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 56 86794 1550
R assumes that you are not interested in testing the significance of random effects, so it does not display p-values for them. We can test the significance of the random effect manually as \(F_{3,56} = 7552/1550 = 4.872\):
pf(7552/1550, df1 = 3, df2 = 56, lower.tail = FALSE)
## [1] 0.004424933
Note that these are the same F- and p-values as in the fixed-effects ANOVA above.
Look for outliers or unequal spread of the residuals, and check for normality of the residuals. (We reuse the residual plots from SSRI.aov here, because its fitted values and residuals are identical to those of the random-effects model.)
opar <- par(mfrow = c(2, 2)) # Create a 2x2 plotting grid
plot(SSRI.aov, which = 5) # Plot residuals for each factor level
plot(SSRI.aov, which = 1) # Plot residuals vs. fitted values
plot(SSRI.aov, which = 2) # Plot a Q-Q plot
plot(residuals(SSRI.aov) ~ Subject, main="Residuals vs Exp. Unit", font.main=1, data=d)
par(opar) # Return to original plotting settings (i.e., a 1x1 grid)
These residuals should look familiar. A one-way random-effects ANOVA actually produces the same results as a fixed-effects ANOVA. The difference is in the intent and the interpretation.
In a random-effects ANOVA, it is not usually of interest to perform
contrasts. Instead, we estimate the variance components. The
ICCest() function will estimate variance components, plus
the intraclass correlation (\(\rho_1\))
and its confidence intervals. The intraclass correlation is an estimate
of the proportion of variance in the response variable that is explained
by the random effect.
icc.out <- ICCest(x = SSRI, y = Ki, data = d, alpha = 0.05)
icc.out$vara # The among-group variance
## [1] 400.1521
icc.out$varw # The within-group (i.e., residual) variance
## [1] 1549.899
icc.out$ICC # The intraclass correlation coefficient
## [1] 0.2052008
c(icc.out$LowerCI, icc.out$UpperCI) # The 95% confidence interval for the ICC
## [1] 0.02915407 0.81754091
You could report this ANOVA as: “SSRI explains 20.5% (95% CI: 2.9-81.8%) of the variance in receptor binding (\(F_{3, 56} = 4.873, p = 0.004\)).” Note that this result is identical to our fixed-effects ANOVA except for the interpretation and the effect size.
If you are finished using aov() but you want to continue
using R, then you can return R to using its default contrasts, which we
stored above as op.
options(op)
You are trying to determine how much variability in calcium concentration exists among turnip greens. You randomly selected four turnip plants. From each plant, you randomly selected four different leaves, and you estimated the calcium concentration once for each leaf. This (simulated) data is stored as turnip.csv. Do the appropriate statistical analysis to meet your research goals. Be sure to test assumptions and make publication-ready plots.
Some flower nectars contain caffeine, which may act as a stimulant and a reward for bees. You are interested in knowing how the caffeine content of nectar affects visitation by pollinators. You have set up feeding stations, at which bees were offered a choice between: (1) a control solution of 20%-sucrose syrup (containing no caffeine), and (2) a 20%-sucrose syrup containing one of four levels of caffeine: 50 ppm, 100 ppm, 150 ppm, or 200 ppm.
The response variable is the difference in syrup consumption between the caffeine-supplemented and control feeders at each station (i.e., a positive value means that more caffeine-supplemented syrup was consumed, and a negative value means that more control syrup was consumed). This data (from Singaravelan et al., 2005 and described in Whitlock & Schluter 2015) is stored as caffeine.csv. Do the appropriate statistical analysis to meet your research goals. Be sure to test assumptions and make publication-ready plots.
1. Instead, we can implement Type II/III sums of squares using lm() + Anova() for a fixed-effects ANOVA, or lmer() + Anova(test="F") for the equivalent of a random- or mixed-effects ANOVA.
2. For more information, see Venables & Ripley (1999).
3. Scheffé’s method is a flexible method of multiple comparisons better suited for unequal sample sizes and testing non-pairwise hypotheses (e.g., brand-name vs. generic drugs). If you only want to compare treatments to a control group, use Dunnett’s test.
4. The Kruskal-Wallis test is simply kruskal.test() in R. Rather than Tukey’s method, Dunn’s test can be used for unplanned comparisons. This is implemented as the dunnTest() function in the FSA package.