Explication of linear model stuff

Working notes, to be expanded at some point …

what is anova()?

anova() is R's method for general-purpose model comparison: it is a very broad generalization of ANOVA, which in addition to its interpretation in terms of decomposition of variance can also be interpreted as (1) a test of null hypotheses relating to sets of parameters or (2) a test of comparisons between pairs of models (i.e., linear models with certain effects present or missing). Depending on the particular type of model it is applied to, ANOVA might give a real analysis of variance (for linear models, generated from lm); an analysis of deviance (sometimes called ANODEV; for generalized linear models); or a series of model comparisons, typically based on a likelihood ratio test.

The ways that anova() behaves for different classes of models are generally sensible, but (unfortunately) idiosyncratic. In base R, the available methods are:

methods("anova")

## [1] anova.glm     anova.glmlist anova.lm      anova.loess*  anova.mlm
## [6] anova.nls*
##
##    Non-visible functions are asterisked


(might be useful to explicate how anova() behaves differently for different model types.)

what's the difference between anova() and aov() (and car::Anova())?

• aov is a special-case function that can be applied in simple cases (balanced/nested error strata), analogous to the situations where SAS PROC GLM is useful.
• as described above, anova() is much more general, covering situations that would require SAS PROC MIXED

Parameter tables vs. effect tables

Parameter tables (i.e., typically contained in the output of summary()) have one row for each individual parameter in a model. For each parameter they typically give a point estimate; standard error; $$Z$$ or $$t$$ statistic; and sometimes an effective degrees of freedom and/or a $$p$$-value. They typically represent Wald tests (see below).

Effect tables (i.e., output of anova() or aov() or drop1()) have one row for each individual effect in a model. They differ particularly for categorical variables (factors) with more than two levels.

In the linear model world, Wald and likelihood-ratio approaches give exactly the same answer; there is no difference between parameter-table and effect-table answers except that the effect-table view collapses multi-parameter effects into a single test statistic – for continuous predictors or categorical predictors with only two levels, the $$p$$-values are identical (although typically the effects table reports an $$F$$ statistic while the summary table reports a $$t$$ statistic).

Beyond the linear model world, likelihood-ratio approaches are (always?) more accurate, but they are harder to compute. Sometimes this difference is fairly trivial (i.e., just use a slightly different set of R code), sometimes it's slow and/or a big pain in the butt. Asymptotically the two approaches converge, but who ever has arbitrarily large amounts of data? (But for some cases, e.g. a generalized linear model fitted to a few hundred data points, the difference might be very small.)

By itself, aov() just gives the variance due to each term and to the residuals:

library("faraway")
(a1 <- aov(wear ~ run + position + material, abrasion))

## Call:
##    aov(formula = wear ~ run + position + material, data = abrasion)
##
## Terms:
##                  run position material Residuals
## Sum of Squares   986     1469     4622       367
## Deg. of Freedom    3        3        3         6
##
## Residual standard error: 7.826
## Estimated effects may be unbalanced


summary(aov(...)) gives a standard-looking ANOVA table

summary(a1)

##             Df Sum Sq Mean Sq F value  Pr(>F)
## run          3    986     329    5.37 0.03901 *
## position     3   1469     490    7.99 0.01617 *
## material     3   4622    1541   25.15 0.00085 ***
## Residuals    6    367      61
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


By itself, the results of lm include just the coefficient estimates:

a2 <- lm(wear ~ run + position + material, abrasion)


aov(lm(...)) gives something equivalent for all practical purposes to aov(...)

summary(lm(...)) gives a parameter table, and some other summary information about the fit:

summary(a2)

##
## Call:
## lm(formula = wear ~ run + position + material, data = abrasion)
##
## Residuals:
##    Min     1Q Median     3Q    Max
##  -9.75  -2.25   0.25   3.69   8.00
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   254.75       6.19   41.17  1.4e-08 ***
## run2           -2.25       5.53   -0.41  0.69842
## run3           12.50       5.53    2.26  0.06466 .
## run4           -9.25       5.53   -1.67  0.14566
## position2      26.25       5.53    4.74  0.00318 **
## position3       8.50       5.53    1.54  0.17545
## position4       8.25       5.53    1.49  0.18661
## materialB     -45.75       5.53   -8.27  0.00017 ***
## materialC     -24.00       5.53   -4.34  0.00489 **
## materialD     -35.25       5.53   -6.37  0.00070 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.83 on 6 degrees of freedom
## Multiple R-squared:  0.951,  Adjusted R-squared:  0.877
## F-statistic: 12.8 on 9 and 6 DF,  p-value: 0.00283

anova(a2)

## Analysis of Variance Table
##
## Response: wear
##           Df Sum Sq Mean Sq F value  Pr(>F)
## run        3    986     329    5.37 0.03901 *
## position   3   1469     490    7.99 0.01617 *
## material   3   4622    1541   25.15 0.00085 ***
## Residuals  6    367      61
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

library("car")
Anova(a2)

## Anova Table (Type II tests)
##
## Response: wear
##           Sum Sq Df F value  Pr(>F)
## run          986  3    5.37 0.03901 *
## position    1469  3    7.99 0.01617 *
## material    4622  3   25.15 0.00085 ***
## Residuals    367  6
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

drop1(a2, test = "F")

## Single term deletions
##
## Model:
## wear ~ run + position + material
##          Df Sum of Sq  RSS   AIC F value  Pr(>F)
## <none>                 367  70.1
## run       3       987 1354  85.0    5.37 0.03901 *
## position  3      1469 1836  89.9    7.99 0.01617 *
## material  3      4622 4989 105.9   25.15 0.00085 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


For this example, all of the different approaches (except summary()) give nearly equivalent output. (If we had done an example with only 2 levels per categorical variable (run,position,material) even summary() would have looked similar – the $$p$$-values would have been identical, and the $$t$$ statistics would have been the square root of the $$F$$ statistics reported: in general $$t_n = \sqrt{F_{1,n}}$$.)