Visualize your data

Box plots and line plots can be used to visualize group differences:

Box plot, which shows the distribution of the response for each combination of the levels of the two factors.
Two-way interaction plot, which plots the mean (or another summary) of the response for each two-way combination of factor levels, thereby illustrating possible interactions.
Here, we’ll use the ggpubr R package for easy ggplot2-based data visualization.

First, let’s draw a box plot of the data, starting with the main question: can you teach an old dog new tricks?

library("ggpubr")

## Loading required package: ggplot2

ggboxplot(dogs, x = "Dog_Type", y = "Time", color = "Dog_Type",
          palette = c("#00AFBB", "#E7B800"))

We can see that the median time old dogs take to learn new tricks is higher than that for puppies; we’ll assess the statistical significance later. Now let’s break this down by trick type:

library("ggpubr")
ggboxplot(dogs, x = "Trick", y = "Time", color = "Dog_Type",
          palette = c("#00AFBB", "#E7B800"))

It looks like ‘Roll Over’ is the hardest trick for both groups to learn. There’s not as much difference between ‘Shake’ and ‘Sit’. But we can also see that for all three tricks, old dogs take longer than puppies to learn the trick (with slightly closer performance for the ‘Sit’ trick).

Next, a line plot with multiple groups:

# Line plots with multiple groups
# +++++++++++++++++++++++
# Plot learning time ("Time") by groups ("Trick")
# Color lines by a second group: "Dog_Type"
# Add error bars: mean_se
# (other values include: mean_sd, mean_ci, median_iqr, ....)

ggline(dogs, x = "Trick", y = "Time", color = "Dog_Type",
       add = c("mean_se", "dotplot"),
       palette = c("#00AFBB", "#E7B800"))

## Bin width defaults to 1/30 of the range of the data. Pick better value with `binwidth`.
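
A similar picture can be drawn without ggpubr using base R’s interaction.plot(), which plots the mean of the response for each combination of the two factors. The sketch below is only an illustration: dogs_demo is a hypothetical, randomly generated stand-in for the dogs data (same column names, made-up times), so the resulting plot says nothing about real dogs.

```r
# Hypothetical stand-in for `dogs`: 5 simulated observations per cell.
# The times are random numbers, NOT the data used in this article.
set.seed(1)
dogs_demo <- expand.grid(
  rep      = 1:5,
  Dog_Type = c("Old_Dogs", "Puppies"),
  Trick    = c("Roll_Over", "Shake", "Sit")
)
dogs_demo$Time <- rnorm(nrow(dogs_demo), mean = 10, sd = 2)

# Mean time for each Trick x Dog_Type cell (a 3 x 2 matrix)
cell_means <- with(dogs_demo, tapply(Time, list(Trick, Dog_Type), mean))
print(cell_means)

# Base R interaction plot: one line per Dog_Type, tricks on the x-axis
with(dogs_demo, interaction.plot(
  x.factor     = Trick,
  trace.factor = Dog_Type,
  response     = Time,
  fun          = mean,
  type         = "b",
  pch          = c(16, 17),
  col          = c("#00AFBB", "#E7B800"),
  xlab         = "Trick",
  ylab         = "Mean time",
  trace.label  = "Dog_Type"
))
```

Roughly parallel lines suggest little interaction; lines that converge or cross suggest that the effect of Dog_Type depends on the trick.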

# Perform the ANOVA

We’ll fit a two-way ANOVA with aov() and use summary() to test the main effects of Dog_Type and Trick on Time:

av_model <- aov(dogs$Time ~ dogs$Dog_Type + dogs$Trick)    # Perform ANOVA
summary(av_model)                                          # Show the result

##               Df Sum Sq Mean Sq F value   Pr(>F)    
## dogs$Dog_Type  1  124.0  124.03   35.49 2.75e-06 ***
## dogs$Trick     2  366.1  183.03   52.37 7.61e-10 ***
## Residuals     26   90.9    3.49                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The dogs$Dog_Type row tells us there is a statistically significant difference (p = 2.75e-06, well below the 0.05 significance level) between Old Dogs and Puppies in learning time, averaged across the three tricks. The dogs$Trick row shows that learning time also differs significantly between tricks (p = 7.61e-10). We can see from the plots that Old Dogs take longer to learn each of the tricks, so we can conclude that you can teach an old dog new tricks, but it takes longer than teaching a puppy.
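
To see where the F value and p-value in the Dog_Type row come from, the arithmetic can be redone by hand. This is a sketch that uses the rounded figures printed in the ANOVA table above, so the results should land close to, but not exactly on, the printed 35.49 and 2.75e-06. The variable names are chosen here for illustration.

```r
# Figures copied from the printed ANOVA table (already rounded)
ms_dog_type <- 124.03   # Mean Sq for dogs$Dog_Type
df_dog_type <- 1        # Df for dogs$Dog_Type
ss_resid    <- 90.9     # Sum Sq for Residuals
df_resid    <- 26       # Df for Residuals

# Residual mean square = residual sum of squares / residual Df
ms_resid <- ss_resid / df_resid

# F statistic = effect mean square / residual mean square
f_value <- ms_dog_type / ms_resid

# p-value = upper tail of the F distribution with (1, 26) Df
p_value <- pf(f_value, df1 = df_dog_type, df2 = df_resid, lower.tail = FALSE)

print(c(F = f_value, p = p_value))
```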

An alternative approach uses lm() and the anova() function instead of summarising aov():

lm2 <- lm(Time ~ Dog_Type + Trick, data = dogs)

anova(lm2)
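
The two routes should give the same ANOVA table, since aov() fits the same linear model that lm() does. A minimal check of that claim, assuming a hypothetical simulated data frame dogs_demo in place of the real dogs data (the times are random and carry no meaning):

```r
# Hypothetical stand-in for `dogs`; the times are simulated, not real data
set.seed(1)
dogs_demo <- expand.grid(
  rep      = 1:5,
  Dog_Type = c("Old_Dogs", "Puppies"),
  Trick    = c("Roll_Over", "Shake", "Sit")
)
dogs_demo$Time <- rnorm(nrow(dogs_demo), mean = 10, sd = 2)

# Same additive model, fitted two ways
aov_fit <- aov(Time ~ Dog_Type + Trick, data = dogs_demo)
lm_fit  <- lm(Time ~ Dog_Type + Trick, data = dogs_demo)

aov_table <- summary(aov_fit)[[1]]  # ANOVA table from summary(aov)
lm_table  <- anova(lm_fit)          # ANOVA table from anova(lm)

# Compare the F statistics and p-values from the two tables
same_f <- isTRUE(all.equal(aov_table[["F value"]], lm_table[["F value"]]))
same_p <- isTRUE(all.equal(aov_table[["Pr(>F)"]], lm_table[["Pr(>F)"]]))
print(c(same_f = same_f, same_p = same_p))
```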

A degrees-of-freedom (Df) value of 2 for ‘Trick’ indicates there are 3 groups represented in the results. So let’s drill into this to find out which of the 3 groups differ from one another, and how significant those differences are.

# Conduct pairwise t-tests between tricks
pairwise.t.test(dogs$Time,
                dogs$Trick,
                p.adjust.method = "none")  # Do not adjust the resulting p-values

## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  dogs$Time and dogs$Trick 
## 
##       Roll_Over Shake 
## Shake 0.0120    -     
## Sit   3.1e-07   0.0004
## 
## P value adjustment method: none

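This tells us there’s a significant difference in the time taken to learn each pair of tricks, for both dog types combined. This is purely a measure of relative trick difficulty. The strongest evidence of a difference is between Roll Over and Sit (p = 3.1e-07), followed by Shake and Sit (p = 0.0004) and Roll Over and Shake (p = 0.0120). Combined with the plots, this points to Roll Over as the hardest trick and Sit as the easiest. Keep in mind that these p-values are unadjusted, and that a p-value measures the strength of evidence for a difference, not its size.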
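
Because three comparisons are made at once, it can be worth checking whether the conclusions survive a multiple-comparison correction. As a rough sketch, the unadjusted p-values printed above can be passed to p.adjust(); the vector names are labels chosen here for readability.

```r
# Unadjusted p-values copied from the pairwise.t.test() output above
p_raw <- c(
  Shake_vs_Roll_Over = 0.0120,
  Sit_vs_Roll_Over   = 3.1e-07,
  Sit_vs_Shake       = 0.0004
)

# Bonferroni: multiply each p-value by the number of comparisons (3)
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# Holm: a step-down version of Bonferroni that is never less powerful
p_holm <- p.adjust(p_raw, method = "holm")

print(cbind(raw = p_raw, bonferroni = p_bonf, holm = p_holm))

# Do all three comparisons stay below 0.05 after adjustment?
print(all(p_bonf < 0.05))
```

With only three comparisons, the Bonferroni-adjusted values are simply three times the raw ones (0.036, 9.3e-07 and 0.0012), so all three differences remain significant at the 0.05 level.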

We can also get all the pairwise comparisons, with p-values adjusted for multiple testing, by passing our model into the TukeyHSD() function:

TukeyHSD(av_model)

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = dogs$Time ~ dogs$Dog_Type + dogs$Trick)
## 
## $`dogs$Dog_Type`
##                       diff       lwr       upr   p adj
## Puppies-Old_Dogs -4.066667 -5.469832 -2.663502 2.7e-06
## 
## $`dogs$Trick`
##                 diff        lwr       upr     p adj
## Shake-Roll_Over -3.4  -5.477488 -1.322512 0.0011071
## Sit-Roll_Over   -8.5 -10.577488 -6.422512 0.0000000
## Sit-Shake       -5.1  -7.177488 -3.022512 0.0000056

# Repeat the test, this time including the Dog_Type:Trick interaction
av_model <- aov(dogs$Time ~ dogs$Dog_Type + dogs$Trick +     
               (dogs$Dog_Type * dogs$Trick))                       

summary(av_model)

##                          Df Sum Sq Mean Sq F value   Pr(>F)    
## dogs$Dog_Type             1  124.0  124.03  47.705 3.83e-07 ***
## dogs$Trick                2  366.1  183.03  70.397 9.10e-11 ***
## dogs$Dog_Type:dogs$Trick  2   28.5   14.23   5.474    0.011 *  
## Residuals                24   62.4    2.60                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
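
The formula above spells out both main effects and then adds their product. In R’s formula syntax, A * B already expands to A + B + A:B, so the shorter form should describe the same model. A quick way to confirm this without any data is to compare the expanded term labels; the bare column names used here assume a data = argument rather than the dogs$ prefix.

```r
# Long form, mirroring the formula used above (without the dogs$ prefix)
f_long  <- Time ~ Dog_Type + Trick + (Dog_Type * Trick)

# Short form: `*` expands to both main effects plus the interaction
f_short <- Time ~ Dog_Type * Trick

# terms() expands a formula; "term.labels" lists the resulting model terms
labels_long  <- attr(terms(f_long),  "term.labels")
labels_short <- attr(terms(f_short), "term.labels")

print(labels_long)
print(identical(labels_long, labels_short))
```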

TukeyHSD(av_model)

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = dogs$Time ~ dogs$Dog_Type + dogs$Trick + (dogs$Dog_Type * dogs$Trick))
## 
## $`dogs$Dog_Type`
##                       diff       lwr       upr p adj
## Puppies-Old_Dogs -4.066667 -5.281857 -2.851476 4e-07
## 
## $`dogs$Trick`
##                 diff        lwr       upr     p adj
## Shake-Roll_Over -3.4  -5.200819 -1.599181 0.0002447
## Sit-Roll_Over   -8.5 -10.300819 -6.699181 0.0000000
## Sit-Shake       -5.1  -6.900819 -3.299181 0.0000008
## 
## $`dogs$Dog_Type:dogs$Trick`
##                                       diff        lwr        upr     p adj
## Puppies:Roll_Over-Old_Dogs:Roll_Over  -6.0  -9.153163 -2.8468366 0.0000606
## Old_Dogs:Shake-Old_Dogs:Roll_Over     -4.0  -7.153163 -0.8468366 0.0074657
## Puppies:Shake-Old_Dogs:Roll_Over      -8.8 -11.953163 -5.6468366 0.0000001
## Old_Dogs:Sit-Old_Dogs:Roll_Over      -10.8 -13.953163 -7.6468366 0.0000000
## Puppies:Sit-Old_Dogs:Roll_Over       -12.2 -15.353163 -9.0468366 0.0000000
## Old_Dogs:Shake-Puppies:Roll_Over       2.0  -1.153163  5.1531634 0.3921292
## Puppies:Shake-Puppies:Roll_Over       -2.8  -5.953163  0.3531634 0.1024519
## Old_Dogs:Sit-Puppies:Roll_Over        -4.8  -7.953163 -1.6468366 0.0011004
## Puppies:Sit-Puppies:Roll_Over         -6.2  -9.353163 -3.0468366 0.0000377
## Puppies:Shake-Old_Dogs:Shake          -4.8  -7.953163 -1.6468366 0.0011004
## Old_Dogs:Sit-Old_Dogs:Shake           -6.8  -9.953163 -3.6468366 0.0000092
## Puppies:Sit-Old_Dogs:Shake            -8.2 -11.353163 -5.0468366 0.0000004
## Old_Dogs:Sit-Puppies:Shake            -2.0  -5.153163  1.1531634 0.3921292
## Puppies:Sit-Puppies:Shake             -3.4  -6.553163 -0.2468366 0.0293531
## Puppies:Sit-Old_Dogs:Sit              -1.4  -4.553163  1.7531634 0.7421536
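
Of the fifteen interaction comparisons, the three that hold the trick fixed speak most directly to the old dogs versus puppies question. The sketch below simply copies those three rows from the Tukey output above into a small data frame and flags which ones are significant at the 0.05 level; the data frame and column names are chosen here for illustration.

```r
# Puppies minus Old_Dogs within each trick, copied from the
# $`dogs$Dog_Type:dogs$Trick` block of the TukeyHSD() output above
within_trick <- data.frame(
  Trick = c("Roll_Over", "Shake", "Sit"),
  diff  = c(-6.0, -4.8, -1.4),
  lwr   = c(-9.153163, -7.953163, -4.553163),
  upr   = c(-2.8468366, -1.6468366, 1.7531634),
  p_adj = c(0.0000606, 0.0011004, 0.7421536)
)

# A comparison is significant when its adjusted p-value is below 0.05,
# which is equivalent to its 95% interval excluding zero
within_trick$significant <- within_trick$p_adj < 0.05

print(within_trick)
```

Read this way, puppies are significantly faster than old dogs at Roll Over and Shake, while the difference for Sit is not significant. That shrinking gap is consistent with the significant Dog_Type:Trick interaction (p = 0.011) in the ANOVA table.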
