Explication of linear model stuff

Working notes, to be expanded at some point …

what is anova()?

anova() is R's method for general-purpose model comparison: it is a very broad generalization of ANOVA, which in addition to its interpretation in terms of decomposition of variance can also be interpreted as (1) a test of null hypotheses relating to sets of parameters or (2) a test of comparisons between pairs of models (i.e., linear models with certain effects present or missing). Depending on the particular type of model it is applied to, ANOVA might give a real analysis of variance (for linear models, generated from lm); an analysis of deviance (sometimes called ANODEV; for generalized linear models); or a series of model comparisons, typically based on a likelihood ratio test.

The ways that anova() behaves for different classes of models are generally sensible, but (unfortunately) idiosyncratic. In base R, the available methods are:

methods("anova")
## [1] anova.glm     anova.glmlist anova.lm      anova.loess*  anova.mlm    
## [6] anova.nls*   
## 
##    Non-visible functions are asterisked

(might be useful to explicate how anova() behaves differently for different model types.)

what's the difference between anova() and aov() (and car::Anova())?

Parameter tables vs. effect tables

Parameter tables (i.e., typically contained in the output of summary()) have one row for each individual parameter in a model. For each parameter they typically give a point estimate; standard error; \( Z \) or \( t \) statistic; and sometimes an effective degrees of freedom and/or a \( p \)-value. They typically represent Wald tests (see below).

Effect tables (i.e., output of anova() or aov() or drop1()) have one row for each individual effect in a model. They differ particularly for categorical variables (factors) with more than two levels.

In the linear model world, Wald and likelihood-ratio approaches give exactly the same answer; there is no difference between parameter-table and effect-table answers except that the effect-table view collapses multi-parameter effects into a single test statistic – for continuous predictors or categorical predictors with only two levels, the \( p \)-values are identical (although typically the effects table reports an \( F \) statistic while the summary table reports a \( t \) statistic).

Beyond the linear model world, likelihood-ratio approaches are (always?) more accurate, but they are harder to compute. Sometimes this difference is fairly trivial (i.e., just use a slightly different set of R code), sometimes it's slow and/or a big pain in the butt. Asymptotically the two approaches converge, but who ever has arbitrarily large amounts of data? (But for some cases, e.g. a generalized linear model fitted to a few hundred data points, the difference might be very small.)

By itself, aov() just gives the variance due to each term and to the residuals:

library("faraway")
(a1 <- aov(wear ~ run + position + material, abrasion))
## Call:
##    aov(formula = wear ~ run + position + material, data = abrasion)
## 
## Terms:
##                  run position material Residuals
## Sum of Squares   986     1469     4622       367
## Deg. of Freedom    3        3        3         6
## 
## Residual standard error: 7.826
## Estimated effects may be unbalanced

summary(aov(...)) gives a standard-looking ANOVA table

summary(a1)
##             Df Sum Sq Mean Sq F value  Pr(>F)    
## run          3    986     329    5.37 0.03901 *  
## position     3   1469     490    7.99 0.01617 *  
## material     3   4622    1541   25.15 0.00085 ***
## Residuals    6    367      61                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

By itself, the results of lm include just the coefficient estimates:

a2 <- lm(wear ~ run + position + material, abrasion)

aov(lm(...)) gives something equivalent for all practical purposes to aov(...)

summary(lm(...)) gives a parameter table, and some other summary information about the fit:

summary(a2)
## 
## Call:
## lm(formula = wear ~ run + position + material, data = abrasion)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
##  -9.75  -2.25   0.25   3.69   8.00 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   254.75       6.19   41.17  1.4e-08 ***
## run2           -2.25       5.53   -0.41  0.69842    
## run3           12.50       5.53    2.26  0.06466 .  
## run4           -9.25       5.53   -1.67  0.14566    
## position2      26.25       5.53    4.74  0.00318 ** 
## position3       8.50       5.53    1.54  0.17545    
## position4       8.25       5.53    1.49  0.18661    
## materialB     -45.75       5.53   -8.27  0.00017 ***
## materialC     -24.00       5.53   -4.34  0.00489 ** 
## materialD     -35.25       5.53   -6.37  0.00070 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.83 on 6 degrees of freedom
## Multiple R-squared:  0.951,  Adjusted R-squared:  0.877 
## F-statistic: 12.8 on 9 and 6 DF,  p-value: 0.00283
anova(a2)
## Analysis of Variance Table
## 
## Response: wear
##           Df Sum Sq Mean Sq F value  Pr(>F)    
## run        3    986     329    5.37 0.03901 *  
## position   3   1469     490    7.99 0.01617 *  
## material   3   4622    1541   25.15 0.00085 ***
## Residuals  6    367      61                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
library("car")
Anova(a2)
## Anova Table (Type II tests)
## 
## Response: wear
##           Sum Sq Df F value  Pr(>F)    
## run          986  3    5.37 0.03901 *  
## position    1469  3    7.99 0.01617 *  
## material    4622  3   25.15 0.00085 ***
## Residuals    367  6                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
drop1(a2, test = "F")
## Single term deletions
## 
## Model:
## wear ~ run + position + material
##          Df Sum of Sq  RSS   AIC F value  Pr(>F)    
## <none>                 367  70.1                    
## run       3       987 1354  85.0    5.37 0.03901 *  
## position  3      1469 1836  89.9    7.99 0.01617 *  
## material  3      4622 4989 105.9   25.15 0.00085 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

For this example, all of the different approaches (except summary()) give nearly equivalent output. (If we had done an example with only 2 levels per categorical variable (run,position,material) even summary() would have looked similar – the \( p \)-values would have been identical, and the \( t \) statistics would have been the square root of the \( F \) statistics reported: in general \( t_n = \sqrt{F_{1,n}} \).)