Scientific Research, Statistics Lab 7

Learning targets

After this class, you should be able to:

conduct an anova power test
implement a two-way ANOVA including an interaction term
identify group differences in the presence of an interaction using a post-hoc test (‘slicing an interaction’)

ANOVA power test

Similar to the power.t.test procedure we have encountered previously, it is possible to conduct an ANOVA power test. In the following example we use the (power.anova.test) function for a one-way ANOVA with 5 groups, each group containing a replication of 16. The within group variance is 90.3, and the between-group variance is 12.7. Comment on the result. What could you undertake to improve the power of your ANOVA?

How many degrees of freedom has your F-distribution that you are using for the above test? Plot a histogram of random numbers that follow such a distribution.

## Power anova test
## Note that the 'power' is not specified, so the function computes the power.
power.anova.test(groups = 5, n = 5, between.var = 12.7, within.var = 90.3)

## 
##      Balanced one-way analysis of variance power calculation 
## 
##          groups = 5
##               n = 5
##     between.var = 12.7
##      within.var = 90.3
##       sig.level = 0.05
##           power = 0.1889547
## 
## NOTE: n is number in each group

power.anova.test(groups = 5, between.var = 12.7, within.var = 90.3, power = 0.8)

## 
##      Balanced one-way analysis of variance power calculation 
## 
##          groups = 5
##               n = 22.18527
##     between.var = 12.7
##      within.var = 90.3
##       sig.level = 0.05
##           power = 0.8
## 
## NOTE: n is number in each group

## Since we know that we should strive for a power of 0.8, we let R calculate the 
## required replicates within each group to attain the desired power. 
## Note how we now state a power value but leave out the 'n' argument (the replicates)
power.anova.test(groups = 5, between.var = 12.7, within.var = 90.3, power = 0.8)

## 
##      Balanced one-way analysis of variance power calculation 
## 
##          groups = 5
##               n = 22.18527
##     between.var = 12.7
##      within.var = 90.3
##       sig.level = 0.05
##           power = 0.8
## 
## NOTE: n is number in each group

Two-way ANOVA

Once you have understood the concept of a simple one-way ANOVA, it is a piece of cake to construct more complex models. Let’s have a look at the built-in dataset crabs (in the built-in package MASS). The dataset contains morphometric measurements of two colour varieties of both sexes of the crab species Leptograpsus variegatus (50 individuals of each colour form).

library(MASS)
library(ggplot2)

## Examine the data set
head(crabs)

str(crabs)

## 'data.frame':    200 obs. of  8 variables:
##  $ sp   : Factor w/ 2 levels "B","O": 1 1 1 1 1 1 1 1 1 1 ...
##  $ sex  : Factor w/ 2 levels "F","M": 2 2 2 2 2 2 2 2 2 2 ...
##  $ index: int  1 2 3 4 5 6 7 8 9 10 ...
##  $ FL   : num  8.1 8.8 9.2 9.6 9.8 10.8 11.1 11.6 11.8 11.8 ...
##  $ RW   : num  6.7 7.7 7.8 7.9 8 9 9.9 9.1 9.6 10.5 ...
##  $ CL   : num  16.1 18.1 19 20.1 20.3 23 23.8 24.5 24.2 25.2 ...
##  $ CW   : num  19 20.8 22.4 23.1 23 26.5 27.1 28.4 27.8 29.3 ...
##  $ BD   : num  7 7.4 7.7 8.2 8.2 9.8 9.8 10.4 9.7 10.3 ...

summary(crabs)

##  sp      sex         index            FL              RW       
##  B:100   F:100   Min.   : 1.0   Min.   : 7.20   Min.   : 6.50  
##  O:100   M:100   1st Qu.:13.0   1st Qu.:12.90   1st Qu.:11.00  
##                  Median :25.5   Median :15.55   Median :12.80  
##                  Mean   :25.5   Mean   :15.58   Mean   :12.74  
##                  3rd Qu.:38.0   3rd Qu.:18.05   3rd Qu.:14.30  
##                  Max.   :50.0   Max.   :23.10   Max.   :20.20  
##        CL              CW              BD       
##  Min.   :14.70   Min.   :17.10   Min.   : 6.10  
##  1st Qu.:27.27   1st Qu.:31.50   1st Qu.:11.40  
##  Median :32.10   Median :36.80   Median :13.90  
##  Mean   :32.11   Mean   :36.41   Mean   :14.03  
##  3rd Qu.:37.23   3rd Qu.:42.00   3rd Qu.:16.60  
##  Max.   :47.60   Max.   :54.60   Max.   :21.60

## Exploratory data analysis
## Plot raw data
ggplot(data = crabs, aes(x = sp, y = CL)) + geom_point() + facet_wrap(~ sex) # not very meaningful to plot the raw data

## Boxplot give a cleare picture
ggplot(data = crabs, aes(x = sp, y = CL)) + geom_boxplot() # pooled sexes

ggplot(data = crabs, aes(x = sex, y = CL)) + geom_boxplot() # pooled colour forms

ggplot(data = crabs, aes(x = sp, y = CL)) + geom_boxplot() + facet_wrap(~ sex) # data split by sex with boxplots for each colour

ggplot(data = crabs, aes(x = sex, y = CL)) + geom_boxplot() + facet_wrap(~ sp) # data split by sp with boxplots for each sex

Just by looking at the data we get a pretty good idea of what could be going on in terms of differences between groups. The multi-panel boxplots are a great tool to drill down into data and provide a lot of information at one glance. The first version ggplot(data = crabs, aes(x = sp, y = CL)) + geom_boxplot() + facet_wrap(~ sex) indicates a reasonably large difference in carapace length between blue and orange coloured crabs but only in the female group not in the male group. The second version ggplot(data = crabs, aes(x = sex, y = CL)) + geom_boxplot() + facet_wrap(~ sp) shows that males appear to be larger than females in the blue coloured crabs but not in the orange coloured crabs. What we described here is a so-called interaction between two explanatory variables, which means that the effect of one variable dependes on the level of the other.

The multi-panel graphing approach is ideally suited to spot potential interactions but we may also use the interaction.plot function for this purpose (interaction.plot(x.factor = crabs$sp, trace.factor = crabs$sex, response = crabs$CL, type = "l") or use interaction.plot(x.factor = crabs$sex, trace.factor = crabs$sp, response = crabs$CL, type = "l"))

First, we will run a simple two-way ANOVA without interaction, which means that we are only looking at the main effects. With this model we are asking: ‘Are there differences in carapace lengths between the two colour forms, regardless of sex?’ and ‘Are there differences in carapace length between sexes, regardless of colour?’

## Run the two-way ANOVA without interaction
m1 <- lm(CL ~ sp + sex, data = crabs)
summary(m1) # lm-style output, for interpretation refer to the 'Making sense of the lm model output' document I have sent around.

## 
## Call:
## lm(formula = CL ~ sp + sex, data = crabs)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -18.198  -4.713   0.642   4.898  16.297 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  29.3125     0.8340  35.145  < 2e-16 ***
## spO           4.0950     0.9631   4.252 3.27e-05 ***
## sexM          1.4910     0.9631   1.548    0.123    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.81 on 197 degrees of freedom
## Multiple R-squared:  0.09416,    Adjusted R-squared:  0.08496 
## F-statistic: 10.24 on 2 and 197 DF,  p-value: 5.884e-05

summary.aov(m1) # ANOVA-style output, giving an overall significance for the two explanatory variabls

##              Df Sum Sq Mean Sq F value   Pr(>F)    
## sp            1    838   838.5  18.080 3.27e-05 ***
## sex           1    111   111.2   2.397    0.123    
## Residuals   197   9136    46.4                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Now, I will show how to account for the interaction between colour form and sex in the model. By the way, in scientific papers interaction terms are commonly expressed using the multiplication symbol like this: “…we found a significant colour $\times$ sex interaction…”

## Run the two-way ANOVA with interaction
m2 <- lm(CL ~ sp + sex + sp:sex, data = crabs) # sp:sex is the interaction term
m2 <- lm(CL ~ sp * sex, data = crabs) # asterisk shorthand notation for: sp + sex + sp:sex
summary(m2)

## 
## Call:
## lm(formula = CL ~ sp * sex, data = crabs)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -16.988  -4.636   0.184   5.130  15.086 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  28.1020     0.9499  29.584  < 2e-16 ***
## spO           6.5160     1.3434   4.851  2.5e-06 ***
## sexM          3.9120     1.3434   2.912  0.00401 ** 
## spO:sexM     -4.8420     1.8998  -2.549  0.01158 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.717 on 196 degrees of freedom
## Multiple R-squared:  0.1232, Adjusted R-squared:  0.1098 
## F-statistic: 9.181 on 3 and 196 DF,  p-value: 1.031e-05

## How do the parameter estimates (estimated group means) add up?
coef(m2)[1] # Intercept parameter: Estimated mean carapace length for blue females

## (Intercept) 
##      28.102

unname(coef(m2)[1] + coef(m2)[2]) # Estimated mean carapace length for orange females

## [1] 34.618

unname(coef(m2)[1] + coef(m2)[3]) # Estimated mean carapace length for blue males

## [1] 32.014

unname(coef(m2)[1] + coef(m2)[2] + coef(m2)[3] + coef(m2)[4]) # Estimated mean carapace length for orange males

## [1] 33.688

## Compare to the calculated means (should be the same)
library(dplyr)
group_by(.data = crabs, sp, sex) %>% summarise(cl = mean(CL))

## To obtain an ANOVA-style summary output use:
summary.aov(m2)

##              Df Sum Sq Mean Sq F value   Pr(>F)    
## sp            1    838   838.5  18.585 2.57e-05 ***
## sex           1    111   111.2   2.464   0.1181    
## sp:sex        1    293   293.1   6.496   0.0116 *  
## Residuals   196   8843    45.1                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Interpretation of an interaction term

If we stick with our crabs example, a significant interaction tells us that colour has a significant effect on carapace length but the magnitude of this effect depends on the level of sex (i.e. it varies with gender). An interaction term can be looked at from ‘both sides’ meaning that we can also interpret it from the sex point of view: sex has a significant effect on carapace length but the magnitude of this effect depends on the colour form (think about the two different versions of the multi-panel boxplots!). Now we want to follow up with a post-hoc test to find out where exactly the statistically significant differences lie. This is also referred to as ‘slicing an interaction’. The concept is simple: we hold one level of the first predictor constant and compare the levels of the second predictor within. Then we switch to the next level of the first predictor and repeat the procedure. What is considered ‘first’ and ‘second’ predictor is arbitrary and can be switched. And here is how this is done in R:

library(emmeans)
## Looking at the differences between colour forms within each level of sex
emms1 <- emmeans(m2, specs = "sp", by = "sex")
pairs(emms1)

## sex = F:
##  contrast estimate       SE  df t.ratio p.value
##  B - O      -6.516 1.343361 196  -4.851  <.0001
## 
## sex = M:
##  contrast estimate       SE  df t.ratio p.value
##  B - O      -1.674 1.343361 196  -1.246  0.2142

## Looking at the differences between sexes within each level of colour form
emms2 <- emmeans(m2, specs = "sex", by = "sp")
pairs(emms2)

## sp = B:
##  contrast estimate       SE  df t.ratio p.value
##  F - M      -3.912 1.343361 196  -2.912  0.0040
## 
## sp = O:
##  contrast estimate       SE  df t.ratio p.value
##  F - M       0.930 1.343361 196   0.692  0.4896

The following code is not important for the exam. It shows you how to prepare a publication-quality figure containing the outcome of our statistical analysis. Let’s say we would like to present our data as a bar plot showing the means and standard errors. First we have to make a decision which grouping we would like to show (think about the boxplots again). In this example it is sensible to show the colour forms in separate panels and contrast the sexes within each panel. We can use both ways of slicing the interaction (both post-hoc tests) for the statistical annotation using the usual convention: different lower case letter indicate statistically significant differences at the $\alpha = 0.05$ significance level. There are several graphical packages in R, perhaps the most popular ones are the built-in base graphics and the ggplot2 package. In the code below I show you both packages in action, so you can decide for yourself which flavour suits you better. Let’s start off with ggplot2 because we have used that for our exploratory plots above.

## Standard error function
se <- function(x, na.rm = FALSE)
{
  if(na.rm == TRUE)
  {
    sqrt(var(x, na.rm = T)/length(na.omit(x)))
  }
  else
  {
    sqrt(var(x)/length(x))
  }
}

## Aggregate the original data (get averages for each colour and sex and the upper and lower extent of the error bars)
library(dplyr)
crabs1 <- as.data.frame(group_by(crabs, sp, sex) %>% summarise(cl = mean(CL), se.up = mean(CL) + se(CL), se.lo = mean(CL) - se(CL)))
crabs1

## Figure created with ggplot2

## Create new, meaningful strip labels
labels <- c(B = "Blue form", O = "Orange form") # feed into scale_x_discrete function below
stats_annotation <- data.frame(sp = c("B", "B", "O", "O"), sex = c("F", "M", "F", "M"), 
                       label = c("a", "b", "b", "b")) # feed into geom_text function below


ggplot(crabs1, aes(x = sex, y = cl)) + geom_bar(stat = "identity", width = 0.6, fill = "grey", colour = "black") + # add bars
  geom_errorbar(mapping = aes(x = sex, ymin = se.lo, ymax = se.up), width = 0.15) + # add error bars
  facet_wrap(~ sp, labeller = labeller(sp = labels)) + # plot as two-panel figure (one panel for each colour form)
  theme(panel.background = element_rect(fill = "white", colour = "black"), panel.border = element_rect(colour = "black", fill = NA), strip.background = element_rect(fill = "grey", colour = "black")) + # tweaking the plot appearance
  labs(x = "Sex", y = "Carapace length (mm)") + # change axis titles
  scale_x_discrete(breaks = c("F", "M"), labels = c("Female", "Male")) + # change x-axis labels
  expand_limits(y = c(0, 40)) + # change y-axis limits to make room for the stats annotation
  geom_text(data = stats_annotation, mapping = aes(y = c(32, 36, 38, 38), label = label), size = 4.5) # add the stats annotation previously prepared in the small stats_annotation data frame

And here is a similar figure produced with base graphics.

## Figure created with base graphics

## The barplot function requires a data matrix in wide format for drawing grouped bar charts.
library(reshape2)
plot.data <- acast(data = crabs1, formula = sex ~ sp, value.var = "cl") # create a data matrix in wide format

## Draw barplot
bp <- barplot(plot.data, beside = T, las = 1, ylim = c(0, 40), col = c("white", "black"), axisnames = F, mgp = c(3, 0.65, 0.5), tcl = -0.3, cex.axis = 0.9) # the bp object stores the midpoints of the bars on the x-axis

## Add error bars
arrows(x0 = bp, y0 = crabs1$se.lo, x1 = bp, y1 = crabs1$se.up, length = 0.04, angle = 90, code = 3) # draw error bars using arrows with head showing a 90° angle

arrows(x0 = bp[2, ], y0 = crabs1$se.lo[c(2, 4)], x1 = bp[2, ], y1 = crabs1$cl[c(2, 4)], length = 0.04, angle = 90, code = 1, col = "white") # fancy things up by drawing the lower part in white when bars switch to black

## Add bar annotation
text(x = bp, y = -2, labels = c("Female", "Male"), srt = 45, adj = c(1, 0.5), cex = 0.9, xpd = T)

## Add group annotation
mtext(c("Blue form", "Orange form"), side = 3, line = 1, at = c(2, 5), cex = 0.9)

## Add y-axis name
mtext("Carapace length (mm)", side = 2, line = 2.7)

## Add statistical annotation
text(x = bp, y = c(32, 36, 38, 38), labels = c("a", "b", "b", "b"))

Scientific Research, Statistics Lab 7

Dr Martin Bader, School of Science, AUT

May 24, 2018

Learning targets

ANOVA power test

Two-way ANOVA

Interpretation of an interaction term