Stats 155 Class Notes 2012-10-03
Main Ideas for Today
- From Model Terms to Vectors
- The intercept term as a vector of 1s
- Understanding interaction terms as vectors
- The geometry of fitting against two vectors.
- Colinearity and why adding model terms changes the coefficients on old terms. (Reference: housing prices versus bedrooms with and without living area)
- Simpson's paradox, geometrically.
- Extreme colinearity: redundancy
Heads up for the future: colinearity has an important effect on confidence intervals.
Review of Geometry through Arithmetic
- scaling
- addition
- linear combination
- square-length is sum of squares
- dot product operation
sum(a*b)
- angle in terms of a dot product
- projection as a dot product
- orthogonality is when the dot product is zero.
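The operations in the list above can be sketched in a few lines of R. The vectors a and b are made-up values chosen only for illustration; the variable names are not from the original notes.

```r
# Two small made-up vectors (illustrative values only)
a <- c(3, 4, 0)
b <- c(1, 2, 2)

2 * a              # scaling
a + b              # addition
2 * a - 0.5 * b    # a linear combination

sq_len_a <- sum(a^2)    # square-length is the sum of squares
dot_ab   <- sum(a * b)  # dot product

# angle between a and b, written in terms of dot products
cos_theta <- dot_ab / (sqrt(sum(a * a)) * sqrt(sum(b * b)))
theta_deg <- acos(cos_theta) * 180 / pi

# projection of b onto a: the multiplier on a is a ratio of dot products
mult    <- sum(a * b) / sum(a * a)
proj    <- mult * a
resid_b <- b - proj

# orthogonality: what is left of b after projection has a
# dot product of zero with a (up to floating-point rounding)
sum(resid_b * a)
```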
From Model Terms to Vectors
Derive the model vectors for interaction terms.
Make a small, illustrative data set and a model from it.
small = sample(CPS85, size = 3)[, c("wage", "educ", "sector")]
small
## wage educ sector
## 245 8.00 12 service
## 13 25.00 14 service
## 57 13.95 18 manag
mod = lm(wage ~ educ * sector, data = small)
I predict that the residuals from this model will all be zero: the model is a “perfect” fit to the data.
resid(mod)
## 245 13 57
## 0 0 0
Understanding the geometry of the situation will make it easier for you to understand why this is.
Write down the vectors as columns of numbers
- response vector (wage)
with(small, wage)
## [1] 8.00 25.00 13.95
- intercept vector (all ones)
- main effect due to educ
with(small, educ)
## [1] 12 14 18
- indicator vectors due to sector. Include one level that's all zeros in this short data set.
- interaction vectors as the component-wise products of the educ vector with each of the sector indicator vectors.
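Here is one way the vectors listed above might be built by hand. It is a sketch under two assumptions: the three cases are typed in from the printout above rather than drawn with sample(), so the mosaic package is not needed, and "sales" is used as the example of a sector level with no cases in this small data set.

```r
# The three cases shown in the notes, typed in by hand (assumption:
# replaces the random sample() from CPS85 so the example is repeatable)
small <- data.frame(
  wage   = c(8.00, 25.00, 13.95),
  educ   = c(12, 14, 18),
  sector = factor(c("service", "service", "manag"),
                  levels = c("manag", "sales", "service"))
)

intercept <- rep(1, nrow(small))   # intercept vector: all ones
educ      <- small$educ            # main effect due to educ

# indicator vectors: 1 where the case is in that sector, 0 elsewhere
ind_manag   <- as.numeric(small$sector == "manag")
ind_sales   <- as.numeric(small$sector == "sales")    # all zeros here
ind_service <- as.numeric(small$sector == "service")

# interaction vectors: component-wise products with educ
educ_manag   <- educ * ind_manag
educ_sales   <- educ * ind_sales                      # all zeros here
educ_service <- educ * ind_service

cbind(intercept, educ, ind_manag, ind_sales, ind_service,
      educ_manag, educ_sales, educ_service)

# For comparison, the vectors R itself builds for educ * sector.
# R leaves out the vectors for the first level (manag).
mm <- model.matrix(~ educ * sector, data = small)
mm
```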
Fitting as a linear algebra problem
Show a potential linear combination of the vectors. Just make up the coefficients.
- Point out that the vectors that are all zeros don't count. They have nothing to contribute.
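A small sketch of such a made-up linear combination, with the vectors typed in from the three cases above. The coefficients 2, 1, 3 and the multipliers on the all-zero vector are arbitrary, and "sales" is assumed as the level with no cases.

```r
wage        <- c(8.00, 25.00, 13.95)
intercept   <- c(1, 1, 1)
educ        <- c(12, 14, 18)
ind_service <- c(1, 1, 0)
ind_sales   <- c(0, 0, 0)   # a level with no cases: all zeros

# A potential linear combination with made-up coefficients
guess <- 2 * intercept + 1 * educ + 3 * ind_service + 100 * ind_sales
guess           # 17 19 20
wage - guess    # how far this guess is from the response

# Changing the multiplier on an all-zero vector changes nothing
guess2 <- 2 * intercept + 1 * educ + 3 * ind_service - 7 * ind_sales
all(guess == guess2)
```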
What R does
coef(mod)
## (Intercept) educ sectorservice
## -139.05 8.50 45.05
## educ:sectorservice
## NA
- These are the multipliers on the corresponding vectors.
- One of the indicator vectors has been dropped: it's not needed.
- Redundancy: the dropped indicator vector can be constructed from the other vectors in the model. That's why it's not needed.
- QUESTION: How would you construct it from the other vectors?
- Every categorical variable has one vector that is redundant (when the intercept vector is included in the model). R marks the “first” vector, alphabetically, as redundant. This is arbitrary.
- It's a choice to include the intercept vector.
lm() includes it; mm() does not. Including the intercept vector makes one of the indicator vectors from each categorical variable redundant, and it changes the meaning of the coefficients on the other vectors: each is a difference from the reference group rather than a groupwise mean.
- There's an NA in the interaction term. This signals a vector that R discovered was redundant by examining the vectors.
- As opposed to categorical variables, where one knows ahead of time that one level must be redundant (if the intercept is in the model).
- How would you construct the redundant vector from the others already in the model?
- With three non-redundant vectors, any set of three values can be reached. In particular, the 3 non-redundant vectors in mod can reach any possible response vector exactly. That's why the residuals must be zero in the model.
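The points above can be checked numerically. This is a sketch, assuming the three cases are typed in by hand from the printout rather than sampled. The helper leftover() is a hypothetical name introduced here, and solve() stands in for what lm() does only in this special case of three non-redundant vectors and three cases.

```r
wage         <- c(8.00, 25.00, 13.95)
intercept    <- c(1, 1, 1)
educ         <- c(12, 14, 18)
service      <- c(1, 1, 0)
manag        <- c(0, 0, 1)       # the indicator vector that was dropped
educ_service <- educ * service   # the vector that got the NA

# Three non-redundant vectors, three cases: solve for the multipliers
M <- cbind(intercept, educ, service)
b <- solve(M, wage)
b                           # compare with coef(mod) above
wage - as.vector(M %*% b)   # residuals: zero up to rounding

# Hypothetical helper: the part of v that the three vectors cannot reach
leftover <- function(v) v - as.vector(M %*% solve(M, v))
leftover(manag)             # zero: redundant
leftover(educ_service)      # zero: redundant
solve(M, manag)             # multipliers that rebuild the dropped indicator
solve(M, educ_service)      # multipliers that rebuild the interaction vector

# The intercept is a choice: without it the coefficients are groupwise
# means; with it they are differences from the reference group
sector  <- factor(c("service", "service", "manag"))
means   <- coef(lm(wage ~ sector - 1))
offsets <- coef(lm(wage ~ sector))
means
offsets
```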
Geometry of Fitting with Multiple Vectors
1. Diagram with two explanatory vectors
2. Show that the residual is orthogonal to each and every model vector.
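A numerical version of item 2, as a sketch with made-up data: five cases and two explanatory vectors, fit without an intercept so that the model has exactly the two vectors in the diagram. The names y, x1, x2 and mod2 are illustrative only.

```r
# Made-up data: five cases, two explanatory vectors
y  <- c(2, 4, 5, 4, 7)
x1 <- c(1, 2, 3, 4, 5)
x2 <- c(1, 0, 1, 0, 1)

# "- 1" removes the intercept, leaving just x1 and x2 as model vectors
mod2 <- lm(y ~ x1 + x2 - 1)
r    <- resid(mod2)

# The residual is orthogonal to each and every model vector:
# both dot products are zero up to floating-point rounding
sum(r * x1)
sum(r * x2)

# So it is also orthogonal to any linear combination of them,
# including the fitted model values
sum(r * fitted(mod2))
```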