Analysis of covariance (ANCOVA) incorporates one or more regression variables into an analysis of variance. As such, we can think of it as analogous to the two-way ANOVA; except that instead of having two different factor variables as predictors, we have one factor variable and one continuous variable.
Our primary interest is to determine whether females’ heart weights differ from males’ heart weights when both have received digitalis.
library(MASS)
data(cats)
str(cats)
## 'data.frame': 144 obs. of 3 variables:
## $ Sex: Factor w/ 2 levels "F","M": 1 1 1 1 1 1 1 1 1 1 ...
## $ Bwt: num 2 2 2 2.1 2.1 2.1 2.1 2.1 2.1 2.1 ...
## $ Hwt: num 7 7.4 9.5 7.2 7.3 7.6 8.1 8.2 8.3 8.5 ...
par(mfrow = c(1,2),mar = c(3,3.5,1,1))
boxplot(Hwt ~ Sex, data = cats)
boxplot(Hwt ~ Bwt, data = cats)
Fisher provided both heart weights and body weights, so we can ask a more complex question, “Is there a sex difference in the heart weights over and above the fact that male cats are naturally larger than female cats?”
par(mfrow=c(1,1))
interaction.plot(cats$Sex, cats$Bwt, cats$Hwt)
The means of response variable vary w.r.t change in factors (lines are not parallel), indicating strong interaction between factor (Sex) and covariate (Bwt).
To apply ANCOVA, first fit a linear model with interaction Bwt*Sex as predictor.
Bwt_Sex_model <- lm(Hwt ~ Bwt * Sex, data = cats)
print(summary(Bwt_Sex_model))
##
## Call:
## lm(formula = Hwt ~ Bwt * Sex, data = cats)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.7728 -1.0118 -0.1196 0.9272 4.8646
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.9813 1.8428 1.618 0.107960
## Bwt 2.6364 0.7759 3.398 0.000885 ***
## SexM -4.1654 2.0618 -2.020 0.045258 *
## Bwt:SexM 1.6763 0.8373 2.002 0.047225 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.442 on 140 degrees of freedom
## Multiple R-squared: 0.6566, Adjusted R-squared: 0.6493
## F-statistic: 89.24 on 3 and 140 DF, p-value: < 2.2e-16
These results indicate that Bwt and Bwt*Sex interaction are two factors that has a statistically significant effect on Hwt. Significant interaction term implies that the effect of Bwt is not consistent across Sex. That is, a male cat will have more Hwt compared to a female cat while the effect of Bwt is blocked.
print(anova(Bwt_Sex_model))
## Analysis of Variance Table
##
## Response: Hwt
## Df Sum Sq Mean Sq F value Pr(>F)
## Bwt 1 548.09 548.09 263.6448 < 2e-16 ***
## Sex 1 0.15 0.15 0.0745 0.78535
## Bwt:Sex 1 8.33 8.33 4.0077 0.04722 *
## Residuals 140 291.05 2.08
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The ANCOVA summary shows the significant contribution of Bwt:Sex interaction.
Let’s inspect the adjusted means w.r.t vs after blocking effect of Bwt. You can also plot the difference in adjusted means using effect function.
adj.means <- effect("Sex", Bwt_Sex_model)
## NOTE: Sex is not a high-order term in the model
print(adj.means)
##
## Sex effect
## Sex
## F M
## 10.16188 10.56197
plot(adj.means)
The effects::effect() function uses the output of the ANCOVA model to estimate the means of each factor level, corrected by the effect of the covariate (Bwt).
We can also compare the grouped means w.r.t vs without taking into consideration (blocking) the effect of its interaction with hp.
grouped.means <- aggregate(cats$Hwt, list(cats$Sex), FUN=mean )
print(grouped.means)
## Group.1 x
## 1 F 9.202128
## 2 M 11.322680
Sex_model <- lm(Hwt~Sex, data=cats)
plot(effect("Sex", Sex_model))
Interestingly, the grouped means are different than adjusted means. Explain why ?
The effect of Sex on Hwt changes when its interaction with Bwt is taken into account.