KickstartR: 2-way ANOVA

Two-way Analysis of Variance, 2-way ANOVA, is an extension of our standard one-way ANOVA that incorporates the effects of two separate categorical predictor variables (X1 and X2) on some continuous response variable (Y). Just as in the case of one-way ANOVA, we use the lm() function.

In biology, we are often interested in whether the two categorical predictors interact with one another, i.e., whether the effect of X1 on Y may depend on X2 and vice versa. For example, we might examine how different antibiotic agents affect bacterial growth in the presence or absence of salt stress. Here, let’s reconsider data from Potvin, Lechowicz, and Tardif (1990) on CO2 uptake by the grass Echinochloa crus-galli. In this experiment on cold tolerance, the plants were either experimentally chilled or left unchilled (X1=Treatment) during the night before the gas-exchange measurements were taken, but the researchers also utilized plants from two different populations, Mississippi and Quebec (X2=Type). Using 2-way ANOVA, we can ask not just whether chilling affects CO2 uptake and whether uptake differs between the two populations, but also whether the effect of chilling differs between the two populations, i.e., whether the two factors interact.

require(mosaic) # be sure to install and load the usual packages if you have not already
require(lattice)
require(datasets)

data(CO2)

This experiment is actually fairly complicated, because the researchers also varied the CO2 concentration. Since we want to focus just on the categorical Treatment and Type variables, let’s see if we can just examine CO2 uptake rates under substrate saturation. To assess where that might be, let’s plot uptake as a function of concentration (conc) for all the combinations of the two categorical variables.

xyplot(uptake~conc|Type, groups=Treatment, auto.key=TRUE, type=c("p","smooth"), data=CO2, xlab="Concentration", ylab=expression(paste(CO[2], " uptake (", mu,"mol ", m^-2, " ", s^-1,")"))) #sorry the y-label is a bit complicated, units sometimes require mathematical expressions...

plot of chunk unnamed-chunk-3

Here it looks like uptake pretty much saturates at concentrations above about 200. We can then subset the data for all of our subsequent analyses, in order to examine CO2 uptake at (or at least near) saturation.

CO2sat = subset(CO2, conc>200)

It is always helpful to look at a plot and summary of the data first. In this case we can use a boxplot to see the distribution of uptake values for the two treatments. We can also use favstats() to see a comprehensive summary of the data.

bwplot(uptake~Treatment|Type, data=CO2sat, xlab="Treatment", ylab=expression(paste(CO[2], " uptake (", mu,"mol ", m^-2, " ", s^-1,")"))) #

plot of chunk unnamed-chunk-5

favstats(uptake~Treatment|Type, data=CO2sat)

##                   .group  min    Q1 median    Q3  max  mean    sd  n
## 1      nonchilled.Quebec 34.8 38.20  40.60 42.50 45.5 40.41 3.230 15
## 2         chilled.Quebec 30.3 34.80  38.10 38.85 42.4 37.05 3.332 15
## 3 nonchilled.Mississippi 25.8 28.00  30.60 31.65 35.5 30.03 2.630 15
## 4    chilled.Mississippi 12.3 14.05  17.90 19.20 22.2 17.27 3.290 15
## 5                 Quebec 30.3 35.82  38.85 41.40 45.5 38.73 3.648 30
## 6            Mississippi 12.3 17.95  24.00 30.45 35.5 23.65 7.121 30
##   missing
## 1       0
## 2       0
## 3       0
## 4       0
## 5       0
## 6       0

From the summary and the boxplot, we can see that the rate of CO2 uptake appears to be higher (on average) in unchilled plants in both population, but the Quebec population appears to have higher average rates of CO2 uptake in general and may be less sensitive to chilling. We can now construct a 2-way ANOVA to make a more rigorous inference:

CO2.2way = lm(uptake~Treatment*Type, data=CO2sat) # note the similarity of the model and plotting statements, but the * is used to indicate that we want to consider the effects of Treatment, Type, AND their interaction Treatment:Type.

Let’s start our analysis of the 2-way ANOVA model with a peek at an anova table to evaluate the different terms in the model.

anova(CO2.2way) # this command gives us an anova table to evalute different terms

## Analysis of Variance Table
## 
## Response: uptake
##                Df Sum Sq Mean Sq F value  Pr(>F)    
## Treatment       1    974     974    99.2 5.2e-14 ***
## Type            1   3411    3411   347.4 < 2e-16 ***
## Treatment:Type  1    332     332    33.8 3.0e-07 ***
## Residuals      56    550      10                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

In order to ask whether the effect of chilling depends on the source population Type, we can examine the Treatment:Type interaction term. Based on the very low Pr(>F) values, the differences among treatments (and their interactions) are all extremely unlikely to arise “by chance.” Thus, at CO2 saturation, we can be pretty confident that: 1. Quebec plants have faster uptake rates than Mississippi plants, 2. Chilled plants have lower uptake rates than unchilled plants, and 3. Quebec plants are less sensitive to chilling than Mississippi plants.

If you think about it, this makes sense in terms of natural selection.

One form of graphical summary that often goes along with 2-way ANOVAs is an “interaction plot.” In this case, we can make one by plotting uptake as a function of the chilling treatment and grouping by the source population of the plants and then using a line to join the average values for each group across the two treatments.

stripplot(uptake~Treatment, groups=Type, auto.key=TRUE, jitter.data=TRUE, type=c("p","a"), data=CO2sat, xlab="Treatment", ylab=expression(paste(CO[2], " uptake (", mu,"mol ", m^-2, " ", s^-1,")")))

plot of chunk unnamed-chunk-8

These plots are called interaction plots because a difference in the slopes of the lines (as observed here) nicely shows that the effect of the chilling treatment depends on the source population.

KickstartR: 2-way ANOVA

Drew Kerkhoff