Let's see what a woman with a Master's degree who is 35 in 2015 is likely
to earn working from 2005 (when she was 25) to 2030 (when she retires early at
50). There isn't enough room in these slides to show the entire algorithm, but
it goes something like:
data <- read.csv("../census.csv")
# Keep only the rows for women with a Master's degree, then fit a linear model
rows <- (data$Gender == "Female") & (data$Education == "Master's")
years <- data$Year[rows]
ages <- data$Ages[rows]
incomes <- data$Income[rows]
mdata <- data.frame(years, ages, incomes)
fit <- lm(incomes ~ ., data = mdata)
# Predict income for each working year; age_groups() is part of the full code (not shown)
years <- seq(from = 2005, to = 2030)
ages_yr <- seq(from = 25, to = 50)
ages <- sapply(ages_yr, age_groups)
test <- data.frame(years, ages)
earnings <- predict(fit, newdata = test)
sum(earnings)
The actual algorithm (not echoed here) produces:
## [1] 2063017
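To try the same fit-then-predict pattern without the census file, here is a minimal sketch that runs on its own. It is not the algorithm behind the number above. The training data are synthetic (made-up numbers drawn at random), and `age_groups()` is a hypothetical stand-in whose bracket boundaries are an assumption, so the total it produces says nothing about real earnings.

```r
# Hypothetical helper: map an age in years to an age-bracket label.
# The bracket boundaries are assumed, not taken from the census file.
age_groups <- function(age) {
  as.character(cut(age,
                   breaks = c(17, 24, 34, 44, 54, 64),
                   labels = c("18-24", "25-34", "35-44", "45-54", "55-64")))
}

# Synthetic stand-in for the filtered census rows (made-up numbers).
set.seed(1)
n <- 200
years <- sample(2000:2014, n, replace = TRUE)
ages_yr <- sample(18:64, n, replace = TRUE)
ages <- sapply(ages_yr, age_groups)
incomes <- 30000 + 800 * (years - 2000) + 500 * (ages_yr - 18) +
  rnorm(n, sd = 2000)
mdata <- data.frame(years, ages, incomes)

# Linear model of income on year and age bracket.
fit <- lm(incomes ~ ., data = mdata)

# One row per working year, 2005 (age 25) through 2030 (age 50).
future <- data.frame(years = seq(from = 2005, to = 2030),
                     ages = sapply(seq(from = 25, to = 50), age_groups))
earnings <- predict(fit, newdata = future)
total <- sum(earnings)
```

The columns of `future` must carry the same names as the predictors in `mdata`, because `predict()` matches them by name. Note also that 2015 to 2030 lies beyond the training years, so those predictions are extrapolations of the fitted line.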