Stats 155 Class Notes 2012-10-03

Main Ideas for Today

From Model Terms to Vectors
1. The intercept term as a vector of 1s
2. Understanding interaction terms as vectors
The geometry of fitting against two vectors.
Colinearity and why adding model terms changes the coefficients on old terms. (Reference: housing prices versus bedrooms with and without living area)
1. Simpson's paradox, geometrically.
2. Extreme colinearity: redundancy

Heads up for the future: colinearity has an important effect on confidence intervals.

Review of Geometry through Arithmetic

scaling
addition
linear combination
square-length is sum of squares
dot product operation sum(a*b)
angle in terms of a dot product
projection as a dot product
orthogonality is when the dot product is zero
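
The operations in this list can be checked numerically. What follows is a minimal sketch in base R; the vectors a and b are made-up values chosen only for illustration, not data from the course.

```r
# Two made-up vectors (illustrative values only)
a <- c(3, 4, 0)
b <- c(1, 1, 1)

scaled  <- 2 * a             # scaling
added   <- a + b             # addition
lincomb <- 2 * a - 3 * b     # linear combination

sq_len_a <- sum(a^2)         # square-length is the sum of squares
dot_ab   <- sum(a * b)       # dot product

# angle between a and b (in radians), from the dot product
theta <- acos(dot_ab / sqrt(sum(a^2) * sum(b^2)))

# projection of a onto b: a multiple of b
proj <- (dot_ab / sum(b^2)) * b

# what is left of a after removing the projection is orthogonal to b,
# so this dot product is zero up to rounding error
leftover <- a - proj
sum(leftover * b)
```

The projection multiplier, dot_ab / sum(b^2), is the same kind of number as a fitted coefficient when a response is fitted against a single vector.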

From Model Terms to Vectors

Derive the model vectors for interaction terms.

Make a small, illustrative data set and a model from it.

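library(mosaic)  # assumed here: supplies sample() for data frames and the CPS85 data
small = sample(CPS85, size = 3)[, c("wage", "educ", "sector")]  # 3 randomly chosen cases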
small

##      wage educ  sector
## 245  8.00   12 service
## 13  25.00   14 service
## 57  13.95   18   manag

mod = lm(wage ~ educ * sector, data = small)

I'll predict that the residuals from this model will all be zero: it's a “perfect” fit to the data.

resid(mod)

## 245  13  57 
##   0   0   0

Understanding the geometry of the situation will make it easier for you to understand why this is.

Write down the vectors as columns of numbers

response vector:

with(small, wage)

## [1]  8.00 25.00 13.95

intercept vector (all ones)
main effect due to educ

with(small, educ)

## [1] 12 14 18

indicator vectors due to sector. Include one level that's all zeros in this short data set.
interaction vectors as the component-wise products of the educ vector with each of the sector indicator vectors.
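
The vectors described above can be written out directly. This is a sketch that types in the three cases from the printed output by hand; the variable names are made up for illustration, and "sales" is used as an example of a sector level that happens to have no cases in this small data set.

```r
# The three cases printed above, typed in by hand
wage   <- c(8.00, 25.00, 13.95)
educ   <- c(12, 14, 18)
sector <- c("service", "service", "manag")

intercept <- rep(1, length(wage))          # all ones

# indicator vectors: 1 where the case is in that sector, 0 otherwise
manag   <- as.numeric(sector == "manag")
service <- as.numeric(sector == "service")
sales   <- as.numeric(sector == "sales")   # a level with no cases here: all zeros

# interaction vectors: component-wise products of educ with each indicator
educ_manag   <- educ * manag
educ_service <- educ * service
educ_sales   <- educ * sales

# lay the vectors side by side as columns
cbind(intercept, educ, manag, service, sales,
      educ_manag, educ_service, educ_sales)
```

Each interaction column is just the educ column with the entries blanked to zero for cases outside that sector.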

Fitting as a linear algebra problem

Show a potential linear combination of the vectors. Just make up the coefficients.

Point out that the vectors that are all zeros don't count. They have nothing to contribute.
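
As a sketch of such a linear combination, the code below uses the three cases shown earlier, typed in by hand. The coefficients are made up, as suggested above; they are not fitted values, and the variable names are illustrative.

```r
# Model vectors for the three cases shown earlier, typed in by hand
wage         <- c(8.00, 25.00, 13.95)
intercept    <- c(1, 1, 1)
educ         <- c(12, 14, 18)
service      <- c(1, 1, 0)
educ_service <- educ * service
sales        <- c(0, 0, 0)   # an all-zeros indicator (no cases at this level)

# Made-up coefficients, not fitted values
candidate <- 2 * intercept + 1 * educ + 3 * service + 0.5 * educ_service

candidate
wage - candidate             # how far this combination falls from the response

# Any multiple of an all-zeros vector changes nothing
identical(candidate + 100 * sales, candidate)
```

Fitting amounts to searching over the multipliers for the combination that lands as close as possible to the response vector.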

What R does

coef(mod)

##        (Intercept)               educ      sectorservice 
##            -139.05               8.50              45.05 
## educ:sectorservice 
##                 NA

These are the multipliers on the corresponding vectors.
One of the indicator vectors has been dropped: it's not needed.
- Redundancy: the dropped indicator vector can be constructed from the other vectors in the model. That's why it's not needed.
- QUESTION: How would you construct it from the other vectors?
- Every categorical variable has one vector that is redundant (when the intercept vector is included in the model). R marks the “first” vector, alphabetically, as redundant. This is arbitrary.
- It's a choice to include the intercept vector: lm() includes it, mm() does not. By including the intercept vector, one of the indicator vectors from each categorical variable is made redundant, and the meaning of the coefficients on the other vectors changes: each is a difference from the reference group rather than a groupwise mean.
There's an NA for the interaction coefficient. This signals a vector that R discovered was redundant by examining the vectors themselves.
- This is unlike a categorical variable, where one knows ahead of time that one level must be redundant (if the intercept is in the model).
- How would you construct the redundant vector from the others already in the model?
With three non-redundant vectors, any set of three values can be reached. In particular, the 3 non-redundant vectors in mod can reach any possible response vector exactly. That's why the residuals must be zero in the model.
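
One possible way to check the redundancy claims and the exact fit is sketched below in base R, again with the three cases typed in by hand. The variable names are illustrative, and the particular combinations shown are one answer to the two questions above, worked out for this data set only.

```r
# Model vectors for the three cases, typed in by hand
wage         <- c(8.00, 25.00, 13.95)
intercept    <- c(1, 1, 1)
educ         <- c(12, 14, 18)
manag        <- c(0, 0, 1)
service      <- c(1, 1, 0)
educ_service <- educ * service

# The dropped indicator is the intercept minus the other indicator
all.equal(manag, intercept - service)

# The interaction vector marked NA is a combination of the three kept vectors
all.equal(educ_service, educ - 18 * intercept + 18 * service)

# Only three of the four columns are non-redundant
X <- cbind(intercept, educ, service, educ_service)
qr(X)$rank

# Solving with the three kept vectors reproduces wage exactly
kept  <- cbind(intercept, educ, service)
coefs <- solve(kept, wage)
coefs                 # compare with coef(mod) above
kept %*% coefs - wage # residuals: zero up to rounding error
```

Because the three kept vectors are non-redundant, the square system has exactly one solution, and it hits the response vector with nothing left over.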

Geometry of Fitting with Multiple Vectors