You wonder whether you can teach old dogs new tricks. So you go to the pound and adopt 15 old dogs and 15 puppies. Then you attempt to teach each of the 30 dogs one of the standard dog tricks, “sit”, “shake”, and “roll over.” Teaching only one trick to each dog, you keep a record of how many days it takes before they learn the tricks. The results of your experiment are listed in the table below. Use that data to conduct a two-way analysis of variance to determine if old dogs can learn new tricks.
The two-way ANOVA extends the analysis of variance to cases where you have two categorical variables of interest. In this exercise, a two-way ANOVA lets us check whether the time spent training each dog varies across two variables: dog type and trick. You can conduct a two-way ANOVA by passing an extra categorical variable into the formula supplied to the aov() function.
First, let's import the data and do a quick clean-up:
dogs <- read.csv("dog_tricks.csv")
str(dogs)
## 'data.frame': 30 obs. of 3 variables:
## $ Dog_Type: chr "Puppies" "Puppies" "Puppies" "Puppies" ...
## $ Trick : chr "Sit" "Sit" "Sit" "Sit" ...
## $ Time : int 2 1 3 1 2 4 5 4 6 7 ...
# Convert the dog_type and trick into factors (categories)
dogs$Dog_Type <- factor(dogs$Dog_Type)
dogs$Trick <- factor(dogs$Trick)
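Before fitting a model it can help to confirm that the design is balanced. The sketch below does not read dog_tricks.csv; it only rebuilds the layout described in the exercise (15 old dogs, 15 puppies, three tricks). The figure of five dogs per combination is an assumption made for illustration, and the `design` and `cell_counts` names are hypothetical.

```r
# Hypothetical design skeleton: 2 dog types x 3 tricks x 5 dogs per cell (assumed).
design <- expand.grid(
  Dog      = 1:5,
  Trick    = c("Roll_Over", "Shake", "Sit"),
  Dog_Type = c("Old_Dogs", "Puppies"),
  stringsAsFactors = FALSE
)

# Same conversion as above: character columns become factors.
design$Dog_Type <- factor(design$Dog_Type)
design$Trick    <- factor(design$Trick)

levels(design$Dog_Type)   # factor levels are sorted alphabetically by default

# Cross-tabulate the two factors; a balanced design has equal counts in every cell.
cell_counts <- table(design$Dog_Type, design$Trick)
cell_counts
```

On the real data, `table(dogs$Dog_Type, dogs$Trick)` performs the same check.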
Box plots and line plots can be used to visualize group differences:
Box plot to plot the data grouped by the combinations of the levels of the two factors.
Two-way interaction plot, which plots the mean (or other summary) of the response for two-way combinations of factors, thereby illustrating possible interactions.
Here, we’ll use the ggpubr R package for an easy ggplot2-based data visualization.
First, let’s draw a box plot of the data, starting with the main question: can old dogs learn new tricks?
library("ggpubr")
## Loading required package: ggplot2
ggboxplot(dogs, x = "Dog_Type", y = "Time", color = "Dog_Type",
palette = c("#00AFBB", "#E7B800"))
We can see that the median time old dogs take to learn a new trick is higher than that for puppies; we’ll assess the statistical significance later. Now let’s break this down by trick:
library("ggpubr")
ggboxplot(dogs, x = "Trick", y = "Time", color = "Dog_Type",
palette = c("#00AFBB", "#E7B800"))
It looks like ‘Roll Over’ is the hardest trick for both groups to learn and ‘Sit’ the easiest, with ‘Shake’ in between. We can also see that for all three tricks, old dogs take longer than puppies to learn the trick, with the gap narrowest for the ‘Sit’ trick.
Next, a line plot with multiple groups
# Line plots with multiple groups
# +++++++++++++++++++++++
# Plot training time ("Time") by groups ("Trick")
# Color lines by a second group: "Dog_Type"
# Add error bars: mean_se
# (other values include: mean_sd, mean_ci, median_iqr, ....)
ggline(dogs, x = "Trick", y = "Time", color = "Dog_Type",
add = c("mean_se", "dotplot"),
palette = c("#00AFBB", "#E7B800"))
## Bin width defaults to 1/30 of the range of the data. Pick better value with `binwidth`.
# Perform the ANOVA
We’ll fit an additive two-way ANOVA and use summary() to test the main effects of Dog_Type and Trick on Time:
av_model <- aov(dogs$Time ~ dogs$Dog_Type + dogs$Trick) # Perform ANOVA
summary(av_model) # Show the result
## Df Sum Sq Mean Sq F value Pr(>F)
## dogs$Dog_Type 1 124.0 124.03 35.49 2.75e-06 ***
## dogs$Trick 2 366.1 183.03 52.37 7.61e-10 ***
## Residuals 26 90.9 3.49
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The dogs$Dog_Type row tells us there is a statistically significant difference between Old Dogs and Puppies in the time taken to learn a trick (p = 2.75e-06, far below the 0.05 threshold). We can see from the plots that Old Dogs take longer to learn any of the tricks, so we can conclude that you can teach an old dog new tricks, but it takes longer than teaching a puppy.
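As a sanity check on the summary(av_model) table above, the Pr(>F) column can be recomputed from the F values and degrees of freedom alone. This is a minimal sketch that uses the rounded numbers as printed, so the results agree only to about three significant figures; the variable names are illustrative.

```r
# Values copied from the summary(av_model) table above (already rounded there).
f_dog_type <- 35.49; df_dog_type <- 1
f_trick    <- 52.37; df_trick    <- 2
df_resid   <- 26

# Pr(>F) is the upper-tail probability of the F distribution.
p_dog_type <- pf(f_dog_type, df_dog_type, df_resid, lower.tail = FALSE)
p_trick    <- pf(f_trick,    df_trick,    df_resid, lower.tail = FALSE)

signif(c(Dog_Type = p_dog_type, Trick = p_trick), 3)
```

Both values fall far below 0.05, which is what the `***` flags in the table indicate.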
An alternative approach uses lm() and the anova() function instead of summarising aov():
lm2 <- lm(Time ~ Dog_Type + Trick, data = dogs)
anova(lm2)
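The two routes should give the same ANOVA table. The sketch below checks this on simulated stand-in data, not on the values in dog_tricks.csv: the `sim` data frame, the Poisson-generated times, and the five-dogs-per-cell layout are all assumptions made so the example runs on its own.

```r
# Simulated stand-in data (NOT the values in dog_tricks.csv): 5 dogs per cell.
set.seed(1)
sim <- expand.grid(
  Dog      = 1:5,
  Trick    = c("Roll_Over", "Shake", "Sit"),
  Dog_Type = c("Old_Dogs", "Puppies")
)
sim$Dog_Type <- factor(sim$Dog_Type)
sim$Trick    <- factor(sim$Trick)
sim$Time     <- rpois(nrow(sim), lambda = 6) + 1  # made-up training times in days

# Fit the same additive model both ways.
fit_aov <- aov(Time ~ Dog_Type + Trick, data = sim)
fit_lm  <- lm(Time ~ Dog_Type + Trick, data = sim)

aov_table <- summary(fit_aov)[[1]]  # the ANOVA table stored inside the summary
lm_table  <- anova(fit_lm)

# Both routes report the same sequential F tests.
all.equal(aov_table[["F value"]], lm_table[["F value"]])
```

Since aov() fits its model through lm() internally, the choice between the two is mostly a matter of which downstream functions you want to use; TukeyHSD(), for example, expects an aov fit.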
The 2 degrees of freedom (Df) for ‘Trick’ indicate that there are 3 groups represented in the results. So let’s drill into this to find out which of the 3 groups differ from one another, and how significant those differences are.
# Conduct pairwise t-tests between tricks
pairwise.t.test(dogs$Time,
dogs$Trick,
p.adjust.method = "none") # Do not adjust the resulting p-values
##
## Pairwise comparisons using t tests with pooled SD
##
## data: dogs$Time and dogs$Trick
##
## Roll_Over Shake
## Shake 0.0120 -
## Sit 3.1e-07 0.0004
##
## P value adjustment method: none
This tells us that, with both dog types combined, every pair of tricks differs significantly in learning time. This is purely a measure of relative trick difficulty. Roll Over takes much longer to learn than Sit (p = 3.1e-07), and Sit is by far the easiest trick to learn compared with both Roll Over (p = 3.1e-07) and Shake (p = 0.0004). Note that these p-values are not adjusted for multiple comparisons.
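Because three tests are run at once, it is worth seeing what a multiple-comparison correction does to these values. The sketch below applies p.adjust() to the unadjusted p-values printed above; since those are rounded, the adjusted values are approximate, and the vector names are only labels chosen for illustration.

```r
# Unadjusted p-values copied from the pairwise.t.test() output above (rounded as printed).
p_raw <- c(
  "Shake vs Roll_Over" = 0.0120,
  "Sit vs Roll_Over"   = 3.1e-07,
  "Sit vs Shake"       = 0.0004
)

# Two standard corrections for running three tests at once.
p_bonferroni <- p.adjust(p_raw, method = "bonferroni")
p_holm       <- p.adjust(p_raw, method = "holm")

rbind(raw = p_raw, bonferroni = p_bonferroni, holm = p_holm)
```

All three comparisons stay below 0.05 under either correction. On the raw data, passing `p.adjust.method = "holm"` to pairwise.t.test() would apply the adjustment directly.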
We can also get multiplicity-adjusted comparisons for every pair of levels by passing our model into the TukeyHSD() function:
TukeyHSD(av_model)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = dogs$Time ~ dogs$Dog_Type + dogs$Trick)
##
## $`dogs$Dog_Type`
## diff lwr upr p adj
## Puppies-Old_Dogs -4.066667 -5.469832 -2.663502 2.7e-06
##
## $`dogs$Trick`
## diff lwr upr p adj
## Shake-Roll_Over -3.4 -5.477488 -1.322512 0.0011071
## Sit-Roll_Over -8.5 -10.577488 -6.422512 0.0000000
## Sit-Shake -5.1 -7.177488 -3.022512 0.0000056
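The lwr and upr columns can be reconstructed by hand, which shows where the intervals come from. This is a sketch under stated assumptions: the residual sum of squares and degrees of freedom are the rounded values from the additive ANOVA table above, and the count of 10 dogs per trick assumes a balanced design (30 dogs across 3 tricks).

```r
# Inputs taken from the additive ANOVA table above.
mse      <- 90.9 / 26   # residual mean square (Sum Sq / Df)
df_resid <- 26
n_levels <- 3           # Roll_Over, Shake, Sit
n_per    <- 10          # ASSUMED dogs per trick (30 dogs / 3 tricks, balanced)

# Tukey HSD half-width: studentized-range quantile / sqrt(2) * SE of a difference in means.
q_crit     <- qtukey(0.95, nmeans = n_levels, df = df_resid)
se_diff    <- sqrt(mse * (1 / n_per + 1 / n_per))
half_width <- q_crit / sqrt(2) * se_diff

# Interval for Shake - Roll_Over, whose estimated difference is -3.4.
c(lwr = -3.4 - half_width, upr = -3.4 + half_width)
```

The result should agree with the Shake-Roll_Over row to within rounding. Every Trick comparison shares the same half-width because each trick level is assumed to have the same number of dogs.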
# Repeat the test with an interaction term
av_model <- aov(dogs$Time ~ dogs$Dog_Type + dogs$Trick +
(dogs$Dog_Type * dogs$Trick))
summary(av_model)
## Df Sum Sq Mean Sq F value Pr(>F)
## dogs$Dog_Type 1 124.0 124.03 47.705 3.83e-07 ***
## dogs$Trick 2 366.1 183.03 70.397 9.10e-11 ***
## dogs$Dog_Type:dogs$Trick 2 28.5 14.23 5.474 0.011 *
## Residuals 24 62.4 2.60
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
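The formula above lists both main effects and then adds their product. In R's formula syntax the `*` operator already expands to the main effects plus the interaction, so the shorter spellings below describe the same model. The sketch uses bare column names (as in the lm() call earlier) and only inspects the formulas, so no data is needed; `term_labels` is a hypothetical helper.

```r
# Three ways of writing a two-way model with interaction.
f_as_written <- Time ~ Dog_Type + Trick + (Dog_Type * Trick)
f_star       <- Time ~ Dog_Type * Trick
f_explicit   <- Time ~ Dog_Type + Trick + Dog_Type:Trick

# terms() expands each formula; duplicated terms are dropped.
term_labels <- function(f) attr(terms(f), "term.labels")

term_labels(f_as_written)
term_labels(f_star)
term_labels(f_explicit)
```

All three calls list the same terms: the two main effects and the `Dog_Type:Trick` interaction, matching the three effect rows in the table above.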
TukeyHSD(av_model)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = dogs$Time ~ dogs$Dog_Type + dogs$Trick + (dogs$Dog_Type * dogs$Trick))
##
## $`dogs$Dog_Type`
## diff lwr upr p adj
## Puppies-Old_Dogs -4.066667 -5.281857 -2.851476 4e-07
##
## $`dogs$Trick`
## diff lwr upr p adj
## Shake-Roll_Over -3.4 -5.200819 -1.599181 0.0002447
## Sit-Roll_Over -8.5 -10.300819 -6.699181 0.0000000
## Sit-Shake -5.1 -6.900819 -3.299181 0.0000008
##
## $`dogs$Dog_Type:dogs$Trick`
## diff lwr upr p adj
## Puppies:Roll_Over-Old_Dogs:Roll_Over -6.0 -9.153163 -2.8468366 0.0000606
## Old_Dogs:Shake-Old_Dogs:Roll_Over -4.0 -7.153163 -0.8468366 0.0074657
## Puppies:Shake-Old_Dogs:Roll_Over -8.8 -11.953163 -5.6468366 0.0000001
## Old_Dogs:Sit-Old_Dogs:Roll_Over -10.8 -13.953163 -7.6468366 0.0000000
## Puppies:Sit-Old_Dogs:Roll_Over -12.2 -15.353163 -9.0468366 0.0000000
## Old_Dogs:Shake-Puppies:Roll_Over 2.0 -1.153163 5.1531634 0.3921292
## Puppies:Shake-Puppies:Roll_Over -2.8 -5.953163 0.3531634 0.1024519
## Old_Dogs:Sit-Puppies:Roll_Over -4.8 -7.953163 -1.6468366 0.0011004
## Puppies:Sit-Puppies:Roll_Over -6.2 -9.353163 -3.0468366 0.0000377
## Puppies:Shake-Old_Dogs:Shake -4.8 -7.953163 -1.6468366 0.0011004
## Old_Dogs:Sit-Old_Dogs:Shake -6.8 -9.953163 -3.6468366 0.0000092
## Puppies:Sit-Old_Dogs:Shake -8.2 -11.353163 -5.0468366 0.0000004
## Old_Dogs:Sit-Puppies:Shake -2.0 -5.153163 1.1531634 0.3921292
## Puppies:Sit-Puppies:Shake -3.4 -6.553163 -0.2468366 0.0293531
## Puppies:Sit-Old_Dogs:Sit -1.4 -4.553163 1.7531634 0.7421536
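To read the significant interaction, it helps to isolate the three rows that compare puppies with old dogs on the same trick. The sketch below copies those rows from the TukeyHSD() output above into a small data frame (the `within_trick` name and column labels are chosen for illustration) and flags the intervals that exclude zero.

```r
# Puppies minus Old_Dogs within each trick, copied from the TukeyHSD() output above.
within_trick <- data.frame(
  Trick = c("Roll_Over", "Shake", "Sit"),
  diff  = c(-6.0, -4.8, -1.4),
  lwr   = c(-9.153163, -7.953163, -4.553163),
  upr   = c(-2.8468366, -1.6468366, 1.7531634),
  p_adj = c(0.0000606, 0.0011004, 0.7421536)
)

# An interval that excludes zero marks a significant difference
# at the 95% family-wise confidence level.
within_trick$significant <- within_trick$upr < 0 | within_trick$lwr > 0
within_trick
```

Read this way, puppies learn Roll Over and Shake significantly faster than old dogs, while for Sit the difference of 1.4 days is not significant (adjusted p = 0.74). That shrinking gap across tricks is what the Dog_Type:Trick interaction term is picking up.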