He's best known for his children's books, but he also did propaganda work during World War II.
Video: Private Goldbrick SNAFU
…
…
Connecting this to Statistics:
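Apply the model.matrix operator to the model. A demo:
library(mosaic) # fetchData() was provided by the mosaic package when these notes were written
cps = fetchData("CPS85")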
## Data CPS85 found in package.
small = droplevels(sample(cps, size = 8)) # Get rid of unused levels
small$orig.ids <- NULL
small
##      wage educ race sex hispanic south married exper union age   sector
## 227 24.98   18    W   M       NH    NS Married    29   Not  53    manag
## 243 11.25   15    W   M       NH    NS  Single     5   Not  26     prof
## 398  3.35   12    W   F       NH    NS Married     7   Not  25 clerical
## 220  3.75   12   NW   F       NH     S Married     6   Not  24    manuf
## 448  8.00   12    W   F       NH    NS Married    15   Not  33  service
## 70   7.50   12    W   F       NH    NS  Single    10   Not  28  service
## 477  6.25   14    W   F       NH    NS Married    12   Not  32     prof
## 213 18.00   16    W   M       NH     S Married    38   Not  60     prof
mod = lm(wage ~ educ + age * sector, data = small)
model.matrix(mod)
##     (Intercept) educ age sectormanag sectormanuf sectorprof sectorservice
## 227           1   18  53           1           0          0             0
## 243           1   15  26           0           0          1             0
## 398           1   12  25           0           0          0             0
## 220           1   12  24           0           1          0             0
## 448           1   12  33           0           0          0             1
## 70            1   12  28           0           0          0             1
## 477           1   14  32           0           0          1             0
## 213           1   16  60           0           0          1             0
##     age:sectormanag age:sectormanuf age:sectorprof age:sectorservice
## 227              53               0              0                 0
## 243               0               0             26                 0
## 398               0               0              0                 0
## 220               0              24              0                 0
## 448               0               0              0                33
## 70                0               0              0                28
## 477               0               0             32                 0
## 213               0               0             60                 0
## attr(,"assign")
## [1] 0 1 2 3 3 3 3 4 4 4 4
## attr(,"contrasts")
## attr(,"contrasts")$sector
## [1] "contr.treatment"
Compare the columns of the model matrix to the original data. Note the name of each vector.
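The comparison can also be done by hand. The sketch below rebuilds only the wage, educ, age and sector columns of the eight cases from the printout above (the other columns are left out), then checks that one indicator column and one interaction column of the model matrix are what you would get from a comparison and a product. The names prof_indicator and age_by_prof are made up for this illustration.

```r
# Relevant columns of the eight sampled cases, copied from the printout.
small <- data.frame(
  wage   = c(24.98, 11.25, 3.35, 3.75, 8.00, 7.50, 6.25, 18.00),
  educ   = c(18, 15, 12, 12, 12, 12, 14, 16),
  age    = c(53, 26, 25, 24, 33, 28, 32, 60),
  sector = factor(c("manag", "prof", "clerical", "manuf",
                    "service", "service", "prof", "prof")),
  row.names = c("227", "243", "398", "220", "448", "70", "477", "213")
)

# Same model formula as in the demo.
X <- model.matrix(wage ~ educ + age * sector, data = small)

# An indicator vector is a comparison turned into 0s and 1s.
prof_indicator <- as.numeric(small$sector == "prof")

# An interaction vector is the elementwise product of its two parents.
age_by_prof <- small$age * prof_indicator

# Both hand-built vectors should match the model-matrix columns.
print(all.equal(unname(X[, "sectorprof"]), prof_indicator))
print(all.equal(unname(X[, "age:sectorprof"]), age_by_prof))
```

There is no sectorclerical column: clerical is the first level of the factor, so under the treatment contrasts reported in the output it is absorbed into the intercept.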
An indicator vector can be built with ==, as in sex=='M'.
Big R and little r are different. R = \( \sqrt{R^2} \) and is never negative. \( r \) is the cosine of the angle between two vectors (each with its mean subtracted) and might be positive or negative, depending on whether the angle is less than or greater than 90 degrees.
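A small numerical check of the difference, using the wage and educ values of the eight cases printed earlier. This is only a sketch; the helper cos_angle and the other variable names are made up for the illustration.

```r
wage <- c(24.98, 11.25, 3.35, 3.75, 8.00, 7.50, 6.25, 18.00)
educ <- c(18, 15, 12, 12, 12, 12, 14, 16)

# Cosine of the angle between two vectors.
cos_angle <- function(a, b) sum(a * b) / sqrt(sum(a^2) * sum(b^2))

# Little r: cosine of the angle between the mean-centered vectors.
r <- cos_angle(wage - mean(wage), educ - mean(educ))

# Big R: square root of R^2 from the fitted model.
big_R <- sqrt(summary(lm(wage ~ educ))$r.squared)

# Reversing the direction of educ flips the sign of r but leaves R alone.
neg_educ <- -educ
r_flipped <- cos_angle(wage - mean(wage), neg_educ - mean(neg_educ))
big_R_flipped <- sqrt(summary(lm(wage ~ neg_educ))$r.squared)

print(c(r = r, r_flipped = r_flipped))
print(c(R = big_R, R_flipped = big_R_flipped))
```

With a single explanatory variable, R is just the absolute value of r, which is why the two are easy to confuse.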
Read in the manipulate software.
fetchData("M155/littleR.R")
## Retrieving from http://www.mosaic-web.org/go/datasets/M155/littleR.R
## [1] TRUE
Run littleR.
The blue vector is constructed from a linear combination of a red vector and a black vector. Move the slider to change the amount of the black vector that goes into the sum.
The vectors are being displayed in both variable space and case space. Notice how the roundness of the case-space cloud reflects the angle. The correlation coefficient, r, corresponds to the roundness and to the angle \( \theta \) between the blue and black vectors.
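For anyone without the manipulate software, the slider can be imitated in plain R. This is not the littleR code, only a rough stand-in under assumptions: the vectors are simulated random numbers, red is made exactly uncorrelated with black, and blend() and k are hypothetical names for the slider and its setting.

```r
set.seed(155)  # arbitrary seed so the simulated vectors are reproducible
n <- 200

black <- rnorm(n)
# Keep only the part of a random vector that is uncorrelated with black.
red <- unname(resid(lm(rnorm(n) ~ black)))

# Stand-in for the slider: k is the amount of black added to red.
blend <- function(k) red + k * black

ks <- c(0, 0.5, 1, 3)
r_values <- sapply(ks, function(k) cor(blend(k), black))
theta_degrees <- acos(pmin(pmax(r_values, -1), 1)) * 180 / pi

print(data.frame(k = ks, r = r_values, theta = theta_degrees))

# Case-space view for one setting: a rounder cloud goes with a smaller r.
plot(black, blend(0.5), xlab = "black", ylab = "blue")
```

As k grows, the blue vector swings toward the black one, so the angle shrinks, r rises toward 1, and the case-space cloud flattens from round toward a line.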
Exercise showing vector relationships among
Why adding a new explanatory variable, even one that is not strongly correlated with the response, makes big R bigger (or at least no smaller): the response vector can align more closely with a plane than with a single vector.
Xeroxed sheets with angle cut-outs.
(perhaps as a take-home)
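The point about big R growing when a variable is added can be checked numerically. The sketch below uses the wage and educ values of the eight cases printed earlier and adds a made-up variable, noise, that is random by construction. The helper big_R and the seed are assumptions for the illustration.

```r
wage <- c(24.98, 11.25, 3.35, 3.75, 8.00, 7.50, 6.25, 18.00)
educ <- c(18, 15, 12, 12, 12, 12, 14, 16)

set.seed(155)                    # arbitrary seed
noise <- rnorm(length(wage))     # made-up variable, unrelated to wage by construction

# Big R: square root of R^2 from a fitted model.
big_R <- function(model) sqrt(summary(model)$r.squared)

R_one <- big_R(lm(wage ~ educ))          # response against one vector
R_two <- big_R(lm(wage ~ educ + noise))  # response against a plane

# Angle between the centered response and the model subspace, in degrees.
angle_one <- acos(min(R_one, 1)) * 180 / pi
angle_two <- acos(min(R_two, 1)) * 180 / pi

print(c(R_one = R_one, R_two = R_two))
print(c(angle_one = angle_one, angle_two = angle_two))
```

The plane spanned by educ and noise contains the educ vector, so the closest point on the plane is never farther from the response than the closest point along educ alone. That is why the angle cannot grow and R cannot shrink.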