In this exercise, you will further analyze the Wage data set considered throughout this chapter.
library(ISLR)
attach(Wage)
summary(Wage)
## year age maritl race
## Min. :2003 Min. :18.00 1. Never Married: 648 1. White:2480
## 1st Qu.:2004 1st Qu.:33.75 2. Married :2074 2. Black: 293
## Median :2006 Median :42.00 3. Widowed : 19 3. Asian: 190
## Mean :2006 Mean :42.41 4. Divorced : 204 4. Other: 37
## 3rd Qu.:2008 3rd Qu.:51.00 5. Separated : 55
## Max. :2009 Max. :80.00
##
## education region
## 1. < HS Grad :268 2. Middle Atlantic :3000
## 2. HS Grad :971 1. New England : 0
## 3. Some College :650 3. East North Central: 0
## 4. College Grad :685 4. West North Central: 0
## 5. Advanced Degree:426 5. South Atlantic : 0
## 6. East South Central: 0
## (Other) : 0
## jobclass health health_ins logwage
## 1. Industrial :1544 1. <=Good : 858 1. Yes:2083 Min. :3.000
## 2. Information:1456 2. >=Very Good:2142 2. No : 917 1st Qu.:4.447
## Median :4.653
## Mean :4.654
## 3rd Qu.:4.857
## Max. :5.763
##
## wage
## Min. : 20.09
## 1st Qu.: 85.38
## Median :104.92
## Mean :111.70
## 3rd Qu.:128.68
## Max. :318.34
##
For this particular problem, I will look at the relationship between wage and age.
To begin, I will search for the polynomial degree that yields the lowest test MSE, as estimated by 10-fold cross-validation.
library(boot)
set.seed(1)
cv.err<-rep(NA, 10)
for (i in 1:10) {
fit<-glm(wage~poly(age, i), data=Wage)
cv.err[i]<-cv.glm(Wage, fit, K=10)$delta[1]
}
plot(1:10, cv.err, xlab="Degree", ylab = "Test MSE", type = "l")
d.min<-which.min(cv.err)
points(d.min, cv.err[d.min], col="red", cex=2, pch=20)
Through cross-validation, degree 4 is the optimal degree for the polynomial, since it has the lowest estimated test MSE. I will now use ANOVA to test the null hypothesis that a chosen model (\(\mathcal{M}_1\)) is sufficient to explain the data against the alternative hypothesis that a more complex model (\(\mathcal{M}_2\)) is required.
fit.1=lm(wage~age, data = Wage)
fit.2=lm(wage~poly(age,2), data = Wage)
fit.3=lm(wage~poly(age,3), data = Wage)
fit.4=lm(wage~poly(age,4), data = Wage)
fit.5=lm(wage~poly(age,5), data = Wage)
anova(fit.1,fit.2,fit.3,fit.4,fit.5)
## Analysis of Variance Table
##
## Model 1: wage ~ age
## Model 2: wage ~ poly(age, 2)
## Model 3: wage ~ poly(age, 3)
## Model 4: wage ~ poly(age, 4)
## Model 5: wage ~ poly(age, 5)
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 2998 5022216
## 2 2997 4793430 1 228786 143.5931 < 2.2e-16 ***
## 3 2996 4777674 1 15756 9.8888 0.001679 **
## 4 2995 4771604 1 6070 3.8098 0.051046 .
## 5 2994 4770322 1 1283 0.8050 0.369682
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The p-value comparing the linear fit (Model 1) to the quadratic fit (Model 2) is essentially 0, indicating that a linear fit is not sufficient to explain the relationship between wage and age. Similarly, the p-value comparing the quadratic fit (Model 2) to the cubic fit (Model 3) is very low (0.0017), so the quadratic fit is also insufficient. The p-value comparing the cubic fit (Model 3) to the degree-4 polynomial fit (Model 4) is approximately 5%, while the degree-5 polynomial seems unnecessary given its p-value of 0.37. From this ANOVA, the cubic or quartic polynomial appears to provide a reasonable fit to the data.
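Incidentally, because poly() generates orthogonal polynomials, these p-values can also be read directly off the coefficient table of the largest model: the square of each t-statistic equals the corresponding F-statistic in the ANOVA table above. A quick check, reusing fit.5 from the chunk above:
# Squaring each orthogonal term's t-statistic reproduces the matching ANOVA
# F-statistic (e.g., the quadratic term squares to roughly 143.59, the value
# reported for Model 2 above).
coef(summary(fit.5))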
Now, I will plot the predictions of these two polynomials.
plot(wage~age, data=Wage, col="grey80", main="Polynomial 3 & 4 on Age Against Wage")
agelim=range(Wage$age)
age.grid=seq(from=agelim[1], to=agelim[2])
fit=lm(wage~poly(age,3), data=Wage)
preds=predict(fit, newdata = list(age=age.grid))
lines(age.grid, preds, col="royalblue1", lwd=2)
fit2=lm(wage~poly(age,4), data=Wage)
preds2=predict(fit2, newdata=list(age=age.grid))
lines(age.grid, preds2, col="indianred1", lwd=2)
As seen in the graph above, the two polynomials give very similar predictions except from roughly age = 75 onward. I believe either would be a sufficient fit to these data; however, if I had to pick between the two on simplicity and interpretability, I would go with the degree-3 polynomial.
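To convey the uncertainty around the chosen cubic fit, we could also overlay approximate 95% confidence bands on the plot above. Here is a minimal sketch, reusing fit and age.grid from the previous chunk:
# Predictions with standard errors, plus +/- 2 SE bands around the cubic fit.
preds.se=predict(fit, newdata=list(age=age.grid), se=TRUE)
matlines(age.grid, cbind(preds.se$fit+2*preds.se$se.fit, preds.se$fit-2*preds.se$se.fit), col="royalblue1", lty=3)
Next, I will fit step functions to the same data and use cross-validation to choose the optimal number of cuts.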
cuts=rep(NA, 10)
for (i in 2:10) {
Wage$age.cut=cut(Wage$age, i)
fit.cut=glm(wage~age.cut, data=Wage)
cuts[i]=cv.glm(Wage, fit.cut, K=10)$delta[1]
}
plot(2:10, cuts[-1], main= "# of Cuts", xlab = "Cuts", ylab="Test MSE", type="l")
d.min=which.min(cuts)
points(which.min(cuts), cuts[which.min(cuts)], col="red", cex=2, pch=20)
For this problem, we see that the cross-validated test error is minimized at 8 cuts. Now I will fit a step function with 8 cuts to the entire data set.
plot(wage~age, data=Wage, col="grey80", main="Step Function with 8 Cuts")
step.fit=glm(wage~cut(age,8), data=Wage)
step.preds=predict(step.fit, list(age=age.grid))
lines(age.grid, step.preds, col="orchid4", lwd=2)
As described in the book, we use a step function in order to avoid imposing a global structure on the data. In the graph above, the data are separated into 8 bins, each with its own fitted mean, which describes the trend in wage across age groups. Step functions are commonly used for data like these, where the goal is to understand trends across several age groups.
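To see exactly where the eight bins fall and how many observations each contains, we can tabulate the factor that cut() produces; the fitted step heights are then just the coefficients of step.fit:
# Bin boundaries and counts for the 8-interval step function.
table(cut(Wage$age, 8))
# Fitted step heights: the intercept is the mean wage in the first bin, and the
# remaining coefficients are offsets relative to that bin.
coef(step.fit)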
This question relates to the College data set.
library(ISLR)
data(College)
summary(College)
## Private Apps Accept Enroll Top10perc
## No :212 Min. : 81 Min. : 72 Min. : 35 Min. : 1.00
## Yes:565 1st Qu.: 776 1st Qu.: 604 1st Qu.: 242 1st Qu.:15.00
## Median : 1558 Median : 1110 Median : 434 Median :23.00
## Mean : 3002 Mean : 2019 Mean : 780 Mean :27.56
## 3rd Qu.: 3624 3rd Qu.: 2424 3rd Qu.: 902 3rd Qu.:35.00
## Max. :48094 Max. :26330 Max. :6392 Max. :96.00
## Top25perc F.Undergrad P.Undergrad Outstate
## Min. : 9.0 Min. : 139 Min. : 1.0 Min. : 2340
## 1st Qu.: 41.0 1st Qu.: 992 1st Qu.: 95.0 1st Qu.: 7320
## Median : 54.0 Median : 1707 Median : 353.0 Median : 9990
## Mean : 55.8 Mean : 3700 Mean : 855.3 Mean :10441
## 3rd Qu.: 69.0 3rd Qu.: 4005 3rd Qu.: 967.0 3rd Qu.:12925
## Max. :100.0 Max. :31643 Max. :21836.0 Max. :21700
## Room.Board Books Personal PhD
## Min. :1780 Min. : 96.0 Min. : 250 Min. : 8.00
## 1st Qu.:3597 1st Qu.: 470.0 1st Qu.: 850 1st Qu.: 62.00
## Median :4200 Median : 500.0 Median :1200 Median : 75.00
## Mean :4358 Mean : 549.4 Mean :1341 Mean : 72.66
## 3rd Qu.:5050 3rd Qu.: 600.0 3rd Qu.:1700 3rd Qu.: 85.00
## Max. :8124 Max. :2340.0 Max. :6800 Max. :103.00
## Terminal S.F.Ratio perc.alumni Expend
## Min. : 24.0 Min. : 2.50 Min. : 0.00 Min. : 3186
## 1st Qu.: 71.0 1st Qu.:11.50 1st Qu.:13.00 1st Qu.: 6751
## Median : 82.0 Median :13.60 Median :21.00 Median : 8377
## Mean : 79.7 Mean :14.09 Mean :22.74 Mean : 9660
## 3rd Qu.: 92.0 3rd Qu.:16.50 3rd Qu.:31.00 3rd Qu.:10830
## Max. :100.0 Max. :39.80 Max. :64.00 Max. :56233
## Grad.Rate
## Min. : 10.00
## 1st Qu.: 53.00
## Median : 65.00
## Mean : 65.46
## 3rd Qu.: 78.00
## Max. :118.00
library(caret)
## Loading required package: lattice
##
## Attaching package: 'lattice'
## The following object is masked from 'package:boot':
##
## melanoma
## Loading required package: ggplot2
library(leaps)
set.seed(1)
train=createDataPartition(College$Outstate, p=0.75, list=FALSE)
training=College[train,]
testing=College[-train,]
dummy.matrix=model.matrix(Outstate~., data=training)
forward.model=regsubsets(x=dummy.matrix, y=training$Outstate, method="forward", nvmax=17)
## Warning in leaps.setup(x, y, wt = weights, nbest = nbest, nvmax = nvmax, :
## 1 linear dependencies found
## Reordering variables and trying again:
## Warning in rval$lopt[] <- rval$vorder[rval$lopt]: number of items to
## replace is not a multiple of replacement length
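The "linear dependencies" warning most likely arises because model.matrix() already contains an intercept column while regsubsets() adds its own, leaving one column redundant; note that after the reordering it triggers, the names returned by coef() may not line up with the selection table. A sketch of a variant that should avoid the warning (dummy.matrix.noint and forward.model2 are hypothetical names):
# Drop the redundant intercept column; regsubsets() supplies its own intercept.
dummy.matrix.noint=model.matrix(Outstate~., data=training)[,-1]
forward.model2=regsubsets(x=dummy.matrix.noint, y=training$Outstate, method="forward", nvmax=17)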
After splitting the data into training and test sets and running forward stepwise selection, I will start to analyze the models fitted to each subset.
forward.summary=summary(forward.model)
forward.summary
## Subset selection object
## 18 Variables (and intercept)
## Forced in Forced out
## PrivateYes FALSE FALSE
## Apps FALSE FALSE
## Accept FALSE FALSE
## Enroll FALSE FALSE
## Top10perc FALSE FALSE
## Top25perc FALSE FALSE
## F.Undergrad FALSE FALSE
## P.Undergrad FALSE FALSE
## Room.Board FALSE FALSE
## Books FALSE FALSE
## Personal FALSE FALSE
## PhD FALSE FALSE
## Terminal FALSE FALSE
## S.F.Ratio FALSE FALSE
## perc.alumni FALSE FALSE
## Expend FALSE FALSE
## Grad.Rate FALSE FALSE
## (Intercept) FALSE FALSE
## 1 subsets of each size up to 17
## Selection Algorithm: forward
## (Intercept) PrivateYes Apps Accept Enroll Top10perc Top25perc
## 1 ( 1 ) " " " " " " " " " " " " " "
## 2 ( 1 ) " " "*" " " " " " " " " " "
## 3 ( 1 ) " " "*" " " " " " " " " " "
## 4 ( 1 ) " " "*" " " " " " " " " " "
## 5 ( 1 ) " " "*" " " " " " " " " " "
## 6 ( 1 ) " " "*" " " " " " " " " " "
## 7 ( 1 ) " " "*" " " "*" " " " " " "
## 8 ( 1 ) " " "*" " " "*" "*" " " " "
## 9 ( 1 ) " " "*" " " "*" "*" " " "*"
## 10 ( 1 ) " " "*" "*" "*" "*" " " "*"
## 11 ( 1 ) " " "*" "*" "*" "*" " " "*"
## 12 ( 1 ) " " "*" "*" "*" "*" " " "*"
## 13 ( 1 ) " " "*" "*" "*" "*" " " "*"
## 14 ( 1 ) " " "*" "*" "*" "*" " " "*"
## 15 ( 1 ) " " "*" "*" "*" "*" "*" "*"
## 16 ( 1 ) " " "*" "*" "*" "*" "*" "*"
## 17 ( 1 ) " " "*" "*" "*" "*" "*" "*"
## F.Undergrad P.Undergrad Room.Board Books Personal PhD Terminal
## 1 ( 1 ) " " " " " " " " " " " " " "
## 2 ( 1 ) " " " " " " " " " " " " " "
## 3 ( 1 ) " " " " "*" " " " " " " " "
## 4 ( 1 ) " " " " "*" " " " " " " " "
## 5 ( 1 ) " " " " "*" " " " " "*" " "
## 6 ( 1 ) " " " " "*" " " " " "*" " "
## 7 ( 1 ) " " " " "*" " " " " "*" " "
## 8 ( 1 ) " " " " "*" " " " " "*" " "
## 9 ( 1 ) " " " " "*" " " " " "*" " "
## 10 ( 1 ) " " " " "*" " " " " "*" " "
## 11 ( 1 ) "*" " " "*" " " " " "*" " "
## 12 ( 1 ) "*" " " "*" " " " " "*" "*"
## 13 ( 1 ) "*" " " "*" " " " " "*" "*"
## 14 ( 1 ) "*" " " "*" "*" " " "*" "*"
## 15 ( 1 ) "*" " " "*" "*" " " "*" "*"
## 16 ( 1 ) "*" " " "*" "*" "*" "*" "*"
## 17 ( 1 ) "*" "*" "*" "*" "*" "*" "*"
## S.F.Ratio perc.alumni Expend Grad.Rate
## 1 ( 1 ) " " " " "*" " "
## 2 ( 1 ) " " " " "*" " "
## 3 ( 1 ) " " " " "*" " "
## 4 ( 1 ) " " "*" "*" " "
## 5 ( 1 ) " " "*" "*" " "
## 6 ( 1 ) " " "*" "*" "*"
## 7 ( 1 ) " " "*" "*" "*"
## 8 ( 1 ) " " "*" "*" "*"
## 9 ( 1 ) " " "*" "*" "*"
## 10 ( 1 ) " " "*" "*" "*"
## 11 ( 1 ) " " "*" "*" "*"
## 12 ( 1 ) " " "*" "*" "*"
## 13 ( 1 ) "*" "*" "*" "*"
## 14 ( 1 ) "*" "*" "*" "*"
## 15 ( 1 ) "*" "*" "*" "*"
## 16 ( 1 ) "*" "*" "*" "*"
## 17 ( 1 ) "*" "*" "*" "*"
par(mfrow=c(2,2))
plot(forward.summary$adjr2, pch=19, col="steelblue4", main="Adjusted R2", xlab="Number of Variables Used", ylab="Adjusted R2 Values")
plot(forward.summary$bic, pch=19, col="steelblue4", main="BIC", xlab="Number of Variables Used", ylab="BIC")
RMSE=sqrt(forward.summary$rss/nrow(training))
plot(RMSE, pch=19, col="steelblue4", main="RMSE", xlab="Number of Variables Used", ylab ="RMSE")
plot(forward.summary$cp, pch=19, col="steelblue4", main="CP", xlab="Number of Variables Used", ylab="CP")
print(forward.summary$bic)
## [1] -339.6571 -535.4045 -643.4238 -704.1590 -738.4836 -754.3565 -753.1089
## [8] -781.5852 -779.5157 -777.0209 -773.3869 -768.8531 -764.1952 -759.3010
## [15] -753.9628 -748.2359 -741.8663
print(RMSE)
## [1] 2975.638 2502.807 2269.312 2142.608 2069.243 2030.210 2021.326
## [8] 1961.912 1954.702 1948.228 1943.669 1940.616 1937.774 1935.327
## [15] 1933.618 1932.554 1932.554
print(forward.summary$adjr2)
## [1] 0.4521220 0.6117377 0.6802524 0.7144688 0.7332271 0.7427516 0.7445553
## [8] 0.7589329 0.7602846 0.7614544 0.7621542 0.7624856 0.7627654 0.7629482
## [15] 0.7629500 0.7627932 0.7623743
print(forward.summary$cp)
## [1] 759.50831 369.63243 203.06526 120.49761 75.75103 53.54405 50.09881
## [8] 16.29674 14.02468 12.20193 11.51819 11.72415 12.05642 12.62281
## [15] 13.62272 15.00030 17.00000
One of the first things I notice across these models is that not all variables are included. Looking at the RMSE and adjusted \(R^2\), the error and the share of explained variance stabilize around 8 variables. At 8 variables, the RMSE is 1961.912, which is noticeably smaller than the errors for 1-6 variables, and almost 76% of the variation in out-of-state tuition is explained.
The variables that the forward stepwise model selected at size 8 are: PrivateYes, Accept, Enroll, PhD, Room.Board, Expend, perc.alumni, and Grad.Rate.
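Since a test set is already held out, the subset size could also be validated directly. regsubsets() has no predict method, so here is a minimal sketch (assuming the names returned by coef() line up with columns of the test model matrix; val.rmse is a hypothetical name) that multiplies the test design matrix by each size's coefficient vector:
# Test-set RMSE for each subset size from the forward-selection fit.
test.matrix=model.matrix(Outstate~., data=testing)
val.rmse=rep(NA, 17)
for (i in 1:17) {
coefi=coef(forward.model, id=i)
val.rmse[i]=sqrt(mean((testing$Outstate-test.matrix[,names(coefi)]%*%coefi)^2))
}
which.min(val.rmse)
In what follows, I keep the 8-variable model chosen above and fit a GAM on those predictors.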
coeff=names(coef(forward.model, id=8))
coeff=coeff[-grep("Intercept", coeff)]
gam.model=train(x=training[,coeff], y=training$Outstate, method="gamSpline", trControl=trainControl(method = "cv", number=10), tuneGrid=expand.grid(df=1:10))
## Loading required package: gam
## Loading required package: splines
## Loading required package: foreach
## Loaded gam 1.16
plot(gam.model, xlim=c(0,10))
The plot shows that the optimal degrees of freedom for the GAM, as chosen by 10-fold cross-validation, is 3.
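The selected tuning value can be confirmed directly on the caret object, since train() stores the winning parameter in bestTune and the full cross-validation results in results:
# df chosen by 10-fold CV, plus the cross-validated error for every candidate df.
gam.model$bestTune
gam.model$results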
postResample(predict(gam.model, training), training$Outstate)
## RMSE Rsquared MAE
## 2067.1961713 0.7362112 1593.9508202
The GAM model chosen by cross-validation achieved an \(R^2\) of 73.62% on the training set, meaning that this percentage of the variation in the outcome variable is explained by the model. The GAM also produced an RMSE of approximately 2067.2. On these training metrics, it did not outperform the forward stepwise model above.
postResample(predict(gam.model, testing), testing$Outstate)
## RMSE Rsquared MAE
## 2324.0904162 0.6702745 1619.7849436
When the model was applied to the test set, the GAM produced an \(R^2\) of 67.03% and a noticeably higher RMSE. This drop is expected, because the model has never seen the test data: the training metrics above are an optimistic estimate of performance.
summary(gam.model)
##
## Call: gam(formula = .outcome ~ s(Terminal, df = 4) + s(Top10perc, df = 4) +
##     s(Grad.Rate, df = 4) + s(Books, df = 4) + s(Enroll, df = 4) +
##     s(Apps, df = 4) + s(Expend, df = 4), family = gaussian, data = <training data>)
## [caret's echo of the gam() function body and the embedded training data is
## omitted here for readability]
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -7781.2 -1165.4 135.2 1323.6 7582.1
##
## (Dispersion Parameter for gaussian family taken to be 4496568)
##
## Null Deviance: 9454415304 on 583 degrees of freedom
## Residual Deviance: 2495596522 on 555.0003 degrees of freedom
## AIC: 10633.77
##
## Number of Local Scoring Iterations: 2
##
## Anova for Parametric Effects
## Df Sum Sq Mean Sq F value Pr(>F)
## s(Terminal, df = 4) 1 1036026604 1036026604 230.4038 <2e-16 ***
## s(Top10perc, df = 4) 1 1703782767 1703782767 378.9073 <2e-16 ***
## s(Grad.Rate, df = 4) 1 863561464 863561464 192.0490 <2e-16 ***
## s(Books, df = 4) 1 4913517 4913517 1.0927 0.2963
## s(Enroll, df = 4) 1 525253363 525253363 116.8121 <2e-16 ***
## s(Apps, df = 4) 1 556944262 556944262 123.8598 <2e-16 ***
## s(Expend, df = 4) 1 1094524665 1094524665 243.4133 <2e-16 ***
## Residuals 555 2495596522 4496568
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Anova for Nonparametric Effects
## Npar Df Npar F Pr(F)
## (Intercept)
## s(Terminal, df = 4) 3 1.570 0.195493
## s(Top10perc, df = 4) 3 3.712 0.011543 *
## s(Grad.Rate, df = 4) 3 5.901 0.000573 ***
## s(Books, df = 4) 3 4.692 0.003024 **
## s(Enroll, df = 4) 3 16.460 2.934e-10 ***
## s(Apps, df = 4) 3 11.015 4.921e-07 ***
## s(Expend, df = 4) 3 42.436 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Based on the ANOVA for nonparametric effects, all of the smoothed variables except Terminal (p ≈ 0.195) show significant evidence of a non-linear relationship with the response; the evidence is strongest for Enroll, Apps, and Expend.
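The same picture emerges from plotting the fitted smooths with pointwise standard-error bands (a quick sketch; I am assuming the fitted model object from above is named gam.fit):
par(mfrow = c(2, 4))  # one panel per smooth term
plot(gam.fit, se = TRUE, col = "royalblue1")
par(mfrow = c(1, 1))
For comparison, the raw relationships in the test set are plotted next.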
plot(Outstate ~ Top10perc, data = testing)
plot(Outstate ~ Grad.Rate, data = testing)
plot(Outstate ~ Books, data = testing)
plot(Outstate ~ Enroll, data = testing)
plot(Outstate ~ Apps, data = testing)
plot(Outstate ~ Expend, data = testing)
As seen in these scatterplots, none of the chosen variables shows a clearly linear relationship with the outcome variable, which is consistent with the nonparametric ANOVA results above.
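For reference, the six scatterplots can also be drawn as a single panel figure (a minimal sketch using the same testing data frame):
par(mfrow = c(2, 3))  # 2 x 3 grid, one panel per predictor
for (v in c("Top10perc", "Grad.Rate", "Books", "Enroll", "Apps", "Expend")) {
  plot(testing[[v]], testing$Outstate, xlab = v, ylab = "Outstate")
}
par(mfrow = c(1, 1))  # reset the layout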
In Section 7.7, it was mentioned that GAMs are generally fit using a backfitting approach. The idea behind backfitting is actually quite simple. We will now explore backfitting in the context of multiple linear regression.
Suppose that we would like to perform multiple linear regression, but we do not have software to do so. Instead, we only have software to perform simple linear regression. Therefore, we take the following iterative approach: we repeatedly hold all but one coefficient estimate fixed at its current value, and update only that coefficient estimate using a simple linear regression. The process is continued until convergence, that is, until the coefficient estimates stop changing.
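Here is a sketch of that procedure written as a small function. This is my own illustration rather than part of the exercise; the function name backfit and the matrix argument X are assumptions:
# Backfitting sketch for multiple linear regression (illustration only).
# y: length-n response; X: n x p matrix of predictors.
backfit = function(y, X, iterations = 20) {
  p = ncol(X)
  beta = rep(0, p)  # arbitrary starting values
  for (it in 1:iterations) {
    for (j in 1:p) {
      # Partial residual: remove the current contribution of the other predictors
      r = y - as.vector(X[, -j, drop = FALSE] %*% beta[-j])
      # Update beta_j by simple linear regression of r on the j-th predictor
      beta[j] = lm(r ~ X[, j])$coef[2]
    }
  }
  beta0 = mean(y - as.vector(X %*% beta))  # intercept from the mean residual
  c(beta0, beta)
}
On the toy data generated below, backfit(y, cbind(x1, x2)) should agree closely with coef(lm(y ~ x1 + x2)).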
We now try this out on a toy example.
Generate a response \(Y\) and two predictors \(X_1\) and \(X_2\), with \(n = 100\).
Initialize \(\hat\beta_1\) to take on a value of your choice. It does not matter what value you choose.
set.seed(1)
y=rnorm(100)
x1=rnorm(100)
x2=rnorm(100)
beta1=2.34
Note that y above was simulated independently of x1 and x2, so the true slopes are essentially zero. Keeping \(\hat\beta_1\) fixed, fit the model \(Y-\hat\beta_1X_1=\beta_0+\beta_2X_2+\epsilon\).
You can do this as follows:
# Regress the partial residual y - beta1*x1 on x2 to update beta2
a = y - beta1 * x1
beta2 = lm(a ~ x2)$coef[2]
a
## [1] 0.82520421 0.08509218 1.29592805 1.22549347 1.86123584
## [6] -4.95592059 -1.18966644 -1.39148299 -0.32321239 -4.24168042
## [11] 2.99940447 1.47009191 -3.97278102 -0.69207042 1.61020186
## [16] 0.87423695 0.73259305 1.59696134 -0.33517950 1.00885465
## [21] 2.10291783 -2.36057455 0.57668080 -1.56918942 0.85427208
## [26] -1.72376790 0.01634520 -1.38268842 1.11693547 1.17673400
## [31] 1.21790412 1.27522537 -0.85602948 3.49923711 -2.09440495
## [36] 3.18029802 0.30999418 1.17686158 2.62592716 0.89631421
## [41] 4.31507746 -3.00656663 4.59299888 1.64132434 1.92249735
## [46] 1.04942131 -4.51938775 0.72782717 2.89759703 4.72012468
## [51] -0.65533194 -0.56859638 1.08539969 1.04534433 4.91368083
## [56] 4.49634987 -2.70728888 0.40962944 3.80927845 -4.50919466
## [61] 1.40688288 0.51919421 -1.78711097 -2.04622685 0.70575552
## [66] -4.97348747 -1.20819538 4.79887234 0.49114841 1.68697196
## [71] -4.92515992 -0.95752397 -0.45865085 -0.75355976 -0.47207143
## [76] 0.37270514 -2.28636855 -4.85496797 -2.32975698 -3.41602660
## [81] 2.31262807 -2.43749425 0.66346296 1.90979827 -0.62524703
## [86] 0.70443615 -2.36403447 1.48844796 1.37671431 2.43419501
## [91] -0.12809676 0.26716024 2.87269334 -1.24285956 4.41374717
## [96] 3.01076995 -4.64890124 1.80381765 -2.18863344 0.41831732
beta2
## x2
## 0.1233176
Keeping \(\hat\beta_2\) fixed, fit the model \(Y-\hat\beta_2X_2=\beta_0+\beta_1X_1+\epsilon\).
You can do this as follows:
# Regress the partial residual y - beta2*x2 on x1 to update beta1
a = y - beta2 * x2
beta1 = lm(a ~ x1)$coef[2]
a
## [1] -0.676940276 -0.024624528 -1.031282939 1.636087568 0.611317604
## [6] -1.128474094 0.405168033 0.671569500 0.577433749 -0.368293749
## [11] 1.532051607 0.337964170 -0.571883100 -2.045729098 1.003113043
## [16] -0.232344965 0.021882892 1.098388934 0.742021519 0.599414746
## [21] 1.132713759 0.781873405 0.152292127 -1.947304258 0.762451512
## [26] -0.278487928 -0.114961088 -1.272764273 -0.502467483 0.385487363
## [31] 1.480249365 0.253467126 0.466654298 -0.124158691 -1.369694624
## [36] -0.402887393 -0.463449037 0.086997872 0.964773425 0.763834761
## [41] -0.251747472 -0.380885396 0.669404300 0.665023340 -0.832169728
## [46] -0.460839554 0.431764266 0.800061631 -0.091860559 0.755266535
## [51] 0.381307319 -0.662237338 0.349709358 -1.098821716 1.347250024
## [56] 1.839049733 -0.070877344 -1.114763511 0.523509501 -0.082611596
## [61] 2.284341114 0.008759804 0.724802346 -0.077731584 -0.955333574
## [66] 0.155489769 -1.752895897 1.612193496 0.194075541 2.288509196
## [71] 0.507440482 -0.758580336 0.715775352 -1.260786615 -1.272872390
## [76] 0.152071753 -0.161002527 -0.090273156 0.236657559 -0.702948957
## [81] -0.617765200 -0.084923156 1.014782560 -1.437092572 0.665546168
## [86] 0.456400221 1.145498040 -0.420741893 0.316535688 0.143144937
## [91] -0.494411522 1.161454713 1.130292776 0.876096326 1.367521768
## [96] 0.541906659 -1.371004063 -0.691050606 -1.218376972 -0.435688203
beta1
## x1
## 0.005663458
I now repeat these two updates in a loop, recording the coefficient estimates at each iteration.
iterations = 100
# Track the estimates across iterations, starting from an arbitrary beta1
df = data.frame(beta0 = 0.0, beta1 = 0.27, beta2 = 0.0)
for (i in 1:iterations) {
  beta1 = df[nrow(df), "beta1"]
  # Update beta2 with beta1 held fixed
  a = y - beta1 * x1
  beta2 = lm(a ~ x2)$coef[2]
  # Update beta0 and beta1 with beta2 held fixed (one lm fit gives both)
  a = y - beta2 * x2
  fit = lm(a ~ x1)
  beta0 = fit$coef[1]
  beta1 = fit$coef[2]
  df[nrow(df) + 1, ] = list(beta0, beta1, beta2)
}
plot(df$beta0, col="purple2", type="l", lwd=2, main="Backfitting", xlab="Iterations", ylab="Betas")
lines(df$beta1, col="skyblue2", lwd=2)
lines(df$beta2, col="palegreen3", lwd=2)
The coefficients quickly attain their least squares values over the iterations; the curves are essentially flat after the first few updates.
lm.fit<-coef(lm(y~x1+x2))
plot(df$beta0, col="purple2", type="l", lwd=2, main="Multiple Linear Regression VS Backfitting", xlab="Iterations", ylab="Betas")
lines(df$beta1, col="skyblue2", lwd=2)
lines(df$beta2, col="palegreen3", lwd=2)
abline(h=lm.fit[1], col="grey60", lty=2, lwd=2)
abline(h=lm.fit[2], col="grey60", lty=2, lwd=2)
abline(h=lm.fit[3], col="grey60", lty=2, lwd=2)
The dashed lines on this graph show that the multiple regression coefficient estimates coincide with the values obtained by backfitting.
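As a quick numerical check (reusing the df and lm.fit objects from above), the final backfitting estimates can be compared with the lm() coefficients directly:
# Last row of df (backfitting) versus the multiple regression coefficients
rbind(backfit = unlist(df[nrow(df), ]), lm = lm.fit)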
As seen in the graphs above, the betas reached their least squares values within roughly five iterations, so about five backfitting iterations were enough to obtain a good approximation to the multiple regression coefficient estimates.
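To make that visual judgment more precise, one could also track the largest coefficient change between successive iterations (a small add-on using the df data frame built above); values near zero indicate convergence:
# Maximum absolute change in any coefficient from one iteration to the next
delta = apply(abs(diff(as.matrix(df))), 1, max)
round(head(delta), 6)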