Working notes, to be expanded at some point …

`anova()`

?`anova()`

is R's method for general-purpose model comparison: it is a very broad *generalization* of ANOVA, which in addition to its interpretation in terms of decomposition of variance can also be interpreted as (1) a test of null hypotheses relating to sets of parameters or (2) a test of comparisons between pairs of models (i.e., linear models with certain effects present or missing). Depending on the particular type of model it is applied to, ANOVA might give a real analysis of variance (for linear models, generated from `lm`

); an analysis of deviance (sometimes called ANODEV; for generalized linear models); or a series of model comparisons, typically based on a likelihood ratio test.

The ways that `anova()`

behaves for different classes of models are *generally* sensible, but (unfortunately) idiosyncratic. In base R, the available methods are:

```
methods("anova")
```

```
## [1] anova.glm anova.glmlist anova.lm anova.loess* anova.mlm
## [6] anova.nls*
##
## Non-visible functions are asterisked
```

(might be useful to explicate how `anova()`

behaves differently for different model types.)

`anova()`

and `aov()`

(and `car::Anova()`

)?`aov`

is a special-case function that can be applied in simple cases (balanced/nested error strata), analogous to the situations where SAS`PROC GLM`

is useful.- as described above,
`anova()`

is much more general, covering situations that would require SAS`PROC MIXED`

*Parameter tables* (i.e., typically contained in the output of `summary()`

) have one row for each individual parameter in a model. For each parameter they typically give a point estimate; standard error; \( Z \) or \( t \) statistic; and sometimes an effective degrees of freedom and/or a \( p \)-value. They typically represent *Wald tests* (see below).

*Effect tables* (i.e., output of `anova()`

or `aov()`

or `drop1()`

) have one row for each individual *effect* in a model. They differ particularly for categorical variables (factors) with more than two levels.

In the linear model world, Wald and likelihood-ratio approaches give exactly the same answer; there is no difference between parameter-table and effect-table answers *except* that the effect-table view collapses multi-parameter effects into a single test statistic – for continuous predictors or categorical predictors with only two levels, the \( p \)-values are identical (although typically the effects table reports an \( F \) statistic while the summary table reports a \( t \) statistic).

Beyond the linear model world, likelihood-ratio approaches are (always?) more accurate, but they are harder to compute. Sometimes this difference is fairly trivial (i.e., just use a slightly different set of R code), sometimes it's slow and/or a big pain in the butt. *Asymptotically* the two approaches converge, but who ever has arbitrarily large amounts of data? (But for some cases, e.g. a generalized linear model fitted to a few hundred data points, the difference might be very small.)

By itself, `aov()`

just gives the variance due to each term and to the residuals:

```
library("faraway")
(a1 <- aov(wear ~ run + position + material, abrasion))
```

```
## Call:
## aov(formula = wear ~ run + position + material, data = abrasion)
##
## Terms:
## run position material Residuals
## Sum of Squares 986 1469 4622 367
## Deg. of Freedom 3 3 3 6
##
## Residual standard error: 7.826
## Estimated effects may be unbalanced
```

`summary(aov(...))`

gives a standard-looking ANOVA table

```
summary(a1)
```

```
## Df Sum Sq Mean Sq F value Pr(>F)
## run 3 986 329 5.37 0.03901 *
## position 3 1469 490 7.99 0.01617 *
## material 3 4622 1541 25.15 0.00085 ***
## Residuals 6 367 61
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

By itself, the results of `lm`

include just the coefficient estimates:

```
a2 <- lm(wear ~ run + position + material, abrasion)
```

`aov(lm(...))`

gives something equivalent for all practical purposes to `aov(...)`

`summary(lm(...))`

gives a parameter table, and some other summary information about the fit:

```
summary(a2)
```

```
##
## Call:
## lm(formula = wear ~ run + position + material, data = abrasion)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.75 -2.25 0.25 3.69 8.00
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 254.75 6.19 41.17 1.4e-08 ***
## run2 -2.25 5.53 -0.41 0.69842
## run3 12.50 5.53 2.26 0.06466 .
## run4 -9.25 5.53 -1.67 0.14566
## position2 26.25 5.53 4.74 0.00318 **
## position3 8.50 5.53 1.54 0.17545
## position4 8.25 5.53 1.49 0.18661
## materialB -45.75 5.53 -8.27 0.00017 ***
## materialC -24.00 5.53 -4.34 0.00489 **
## materialD -35.25 5.53 -6.37 0.00070 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.83 on 6 degrees of freedom
## Multiple R-squared: 0.951, Adjusted R-squared: 0.877
## F-statistic: 12.8 on 9 and 6 DF, p-value: 0.00283
```

```
anova(a2)
```

```
## Analysis of Variance Table
##
## Response: wear
## Df Sum Sq Mean Sq F value Pr(>F)
## run 3 986 329 5.37 0.03901 *
## position 3 1469 490 7.99 0.01617 *
## material 3 4622 1541 25.15 0.00085 ***
## Residuals 6 367 61
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

```
library("car")
Anova(a2)
```

```
## Anova Table (Type II tests)
##
## Response: wear
## Sum Sq Df F value Pr(>F)
## run 986 3 5.37 0.03901 *
## position 1469 3 7.99 0.01617 *
## material 4622 3 25.15 0.00085 ***
## Residuals 367 6
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

```
drop1(a2, test = "F")
```

```
## Single term deletions
##
## Model:
## wear ~ run + position + material
## Df Sum of Sq RSS AIC F value Pr(>F)
## <none> 367 70.1
## run 3 987 1354 85.0 5.37 0.03901 *
## position 3 1469 1836 89.9 7.99 0.01617 *
## material 3 4622 4989 105.9 25.15 0.00085 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

For this example, all of the different approaches (except `summary()`

) give nearly equivalent output. (If we had done an example with only 2 levels per categorical variable (`run`

,`position`

,`material`

) even `summary()`

would have looked similar – the \( p \)-values would have been identical, and the \( t \) statistics would have been the square root of the \( F \) statistics reported: in general \( t_n = \sqrt{F_{1,n}} \).)