a) Find a cubic polynomial \[f_1(x) = a_1 + b_1x + c_1x^2 + d_1x^3\] such that \(f(x) = f_1(x)\) for all \(x \le \xi\). Express \(a_1, b_1, c_1, d_1\) in terms of \(\beta_0, \beta_1, \beta_2, \beta_3, \beta_4\).
Recall that \(f(x) = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \beta_4 (x - \xi)^3_+\). The positive-part subscript indicates \((x)_+ = \operatorname{max}\{0,x\}\), so when \(x \le \xi\) the quantity inside the parentheses is non-positive and the truncated term vanishes: \[ \beta_4 (x - \xi)^3_+ = 0 \]
We thus find the following:
\[ f(x) = f_1(x) \newline \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 = a_1 + b_1x + c_1x^2 + d_1x^3 \]
We thus find that this is true when the corresponding coefficients are equal: \[ a_1 = \beta_0 \newline b_1 = \beta_1 \newline c_1 = \beta_2 \newline d_1 = \beta_3 \] b) Find a cubic polynomial
\[ f_2(x) = a_2 + b_2x + c_2x^2 + d_2x^3 \]
such that \(f(x) = f_2(x)\) for all \(x > \xi\). Express \(a_2, b_2, c_2, d_2\) in terms of \(\beta_0, \beta_1, \beta_2, \beta_3, \beta_4\). We have now established that \(f(x)\) is a piecewise polynomial.
In this case, since \(x > \xi\), we cannot set \(\beta_4 (x - \xi)^3_+\) to zero as we did in the previous part. Recall that \((a - b)^3 = (a - b)(a^2 - 2ab + b^2)\), so we can substitute as follows:
\[ f(x) = f_2(x) \newline \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \beta_4(x - \xi)^3_+ = a_2 + b_2x + c_2x^2 + d_2x^3 \newline \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \beta_4(x - \xi)(x^2 - 2x\xi + \xi^2) = a_2 + b_2x + c_2x^2 + d_2x^3 \newline \]
If we simplify this expression we reach the following:
\[ a_2 + b_2x + c_2x^2 + d_2x^3 = (\beta_0 - \beta_4 \xi^3) + (\beta_1 + 3\beta_4\xi^2)x + (\beta_2 - 3\beta_4 \xi)x^2 + (\beta_3 + \beta_4)x^3 \newline \]
We thus find the following values of the coefficients:
\[ a_2 = \beta_0 - \beta_4 \xi^3 \newline b_2 = \beta_1 + 3\beta_4\xi^2 \newline c_2 = \beta_2 - 3\beta_4 \xi \newline d_2 = \beta_3 + \beta_4 \newline \]
c) To show that \(f(x)\) is continuous at \(\xi\), we check that \(f_1(\xi) = f_2(\xi)\): \[ f_1(\xi) = f_2(\xi) \newline \beta_0 + \beta_1 \xi + \beta_2 \xi^2 + \beta_3 \xi^3 = (\beta_0 - \beta_4 \xi^3) + (\beta_1 + 3 \beta_4 \xi^2)\xi + (\beta_2 - 3\beta_4 \xi)\xi^2 + (\beta_3 + \beta_4)\xi^3 \newline \beta_0 + \beta_1 \xi + \beta_2 \xi^2 + \beta_3 \xi^3 = \beta_0 + \beta_1 \xi + \beta_2 \xi^2 + \beta_3 \xi^3 \]
Thus \(f_1(\xi) = f_2(\xi)\), so \(f(x)\) is continuous at \(\xi\).
d) To show that \(f'(x)\) is continuous at \(\xi\), we find the derivative of each piece:
\[ f_1'(x) = b_1 + 2c_1x + 3d_1x^2 \newline f_2'(x) = b_2 + 2c_2x + 3d_2x^2 \newline \]
We can then plug in to the derivatives and set them equal to each other using the values for the coefficients we have previously calculated, and continue with algebra:
\[ \beta_1 + 2\beta_2 \xi + 3\beta_3 \xi^2 = (\beta_1 + 3\beta_4 \xi^2) + 2(\beta_2 - 3\beta_4 \xi)\xi + 3(\beta_3 + \beta_4)\xi^2 \newline \beta_1 + 2\beta_2 \xi + 3\beta_3 \xi^2 = \beta_1 + 2\beta_2 \xi + 3\beta_3 \xi^2 \newline \]
Thus, we find that \(f_1'(\xi) = f_2'(\xi)\).
e) To show that \(f''(x)\) is continuous at \(\xi\), we find the second derivative of each piece:
\[ f_1''(x) = 2c_1 + 6d_1x \newline f_2''(x) = 2c_2 + 6d_2x \newline \]
We can then plug in to the second derivatives and set them equal to each other using the values for the coefficients we have previously calculated, and continue with algebra:
\[ 2\beta_2 + 6\beta_3 \xi = 2(\beta_2 - 3\beta_4\xi) + 6(\beta_3 + \beta_4) \xi \newline 2\beta_2 + 6\beta_3 \xi = 2\beta_2 + 6\beta_3 \xi \]
Thus, we find that \(f_1''(\xi) = f_2''(\xi)\).
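As a quick numerical sanity check of these derivations, we can plug arbitrary values for the coefficients and the knot into the two pieces and confirm that their values and first two derivatives agree at \(\xi\) (a minimal sketch; the numbers below are arbitrary):
# Arbitrary coefficients and knot, for illustration only
beta <- c(1, -2, 0.5, 3, -1.5) # beta_0, ..., beta_4
xi <- 2
# Coefficients of the two pieces as derived above
a1 <- beta[1]; b1 <- beta[2]; c1 <- beta[3]; d1 <- beta[4]
a2 <- beta[1] - beta[5] * xi^3
b2 <- beta[2] + 3 * beta[5] * xi^2
c2 <- beta[3] - 3 * beta[5] * xi
d2 <- beta[4] + beta[5]
# Differences in value, first derivative, and second derivative at the knot
c(value = (a1 + b1*xi + c1*xi^2 + d1*xi^3) - (a2 + b2*xi + c2*xi^2 + d2*xi^3),
  first = (b1 + 2*c1*xi + 3*d1*xi^2) - (b2 + 2*c2*xi + 3*d2*xi^2),
  second = (2*c1 + 6*d1*xi) - (2*c2 + 6*d2*xi)) # all (numerically) zero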
In this exercise, you will further analyze the Wage data set considered throughout this chapter.
library(ISLR) # provides the Wage and College data sets (also available in ISLR2)
library(caret) # createFolds() and createDataPartition() for resampling
library(ggplot2) # plotting
head(Wage)
## year age maritl race education region
## 231655 2006 18 1. Never Married 1. White 1. < HS Grad 2. Middle Atlantic
## 86582 2004 24 1. Never Married 1. White 4. College Grad 2. Middle Atlantic
## 161300 2003 45 2. Married 1. White 3. Some College 2. Middle Atlantic
## 155159 2003 43 2. Married 3. Asian 4. College Grad 2. Middle Atlantic
## 11443 2005 50 4. Divorced 1. White 2. HS Grad 2. Middle Atlantic
## 376662 2008 54 2. Married 1. White 4. College Grad 2. Middle Atlantic
## jobclass health health_ins logwage wage
## 231655 1. Industrial 1. <=Good 2. No 4.318063 75.04315
## 86582 2. Information 2. >=Very Good 2. No 4.255273 70.47602
## 161300 1. Industrial 1. <=Good 1. Yes 4.875061 130.98218
## 155159 2. Information 2. >=Very Good 1. Yes 5.041393 154.68529
## 11443 2. Information 1. <=Good 1. Yes 4.318063 75.04315
## 376662 2. Information 2. >=Very Good 1. Yes 4.845098 127.11574
set.seed(123)
k <- 10
max_degree <- 10
folds <- createFolds(Wage$wage, k=k, list = F)
results <- as.data.frame(matrix(nrow=k, ncol=max_degree)) # initialize a matrix for error
# cross validation
for(i in 1:k){
test_idx <- which(folds == i) # rows assigned to fold i form the test set
train_data <- Wage[-test_idx, ]
test_data <- Wage[test_idx, ]
# get mse for each model
for(d in 1:max_degree){
model <- lm(wage ~ poly(age, d), data=train_data)
preds <- predict(model, newdata=test_data)
results[i,d] <- mean((preds - test_data$wage)^2)
}
}
# Compute the mean MSE for each polynomial degree across all folds
mean_errors <- colMeans(results, na.rm = TRUE)
# Find the index (degree) of the best performing polynomial degree
best_degree <- which.min(mean_errors)
# Print results
cat("Best polynomial degree:", best_degree, "\n")
## Best polynomial degree: 10
cat("Average MSE for the best degree:", mean_errors[best_degree], "\n")
## Average MSE for the best degree: 1058.158
# Plot mean MSE for each polynomial degree
plot(1:max_degree, mean_errors, type = "b", pch = 19, col = "blue",
xlab = "Polynomial Degree", ylab = "Mean MSE",
main = "Cross-Validation Error for Polynomial Degrees")
As we can see, the best performing degree is 10 in this instance.
We can plot the best degree polynomial here:
# Fit the best polynomial regression model using the entire dataset
best_model <- lm(wage ~ poly(age, best_degree), data = Wage)
# Create a sequence of age values for smooth plotting
age_grid <- data.frame(age = seq(min(Wage$age), max(Wage$age), length.out = 100))
# Generate predictions using the best model
age_grid$predicted_wage <- predict(best_model, newdata = age_grid)
# Plot the data and fitted polynomial regression curve
ggplot(Wage, aes(x = age, y = wage)) +
geom_point(alpha = 0.5) + # Scatter plot of actual data
geom_line(data = age_grid, aes(x = age, y = predicted_wage), color = "blue", linewidth = 1.2) + # Fitted curve
labs(title = paste("Polynomial Regression (Degree =", best_degree, ")"),
x = "Age",
y = "Wage") +
theme_minimal()
We can compare this with ANOVA:
set.seed(123)
# Fitting polynomial regression models from degree 1 to 10
fit.1 <- lm(wage ~ age, data = Wage)
fit.2 <- lm(wage ~ poly(age, 2), data = Wage)
fit.3 <- lm(wage ~ poly(age, 3), data = Wage)
fit.4 <- lm(wage ~ poly(age, 4), data = Wage)
fit.5 <- lm(wage ~ poly(age, 5), data = Wage)
fit.6 <- lm(wage ~ poly(age, 6), data = Wage)
fit.7 <- lm(wage ~ poly(age, 7), data = Wage)
fit.8 <- lm(wage ~ poly(age, 8), data = Wage)
fit.9 <- lm(wage ~ poly(age, 9), data = Wage)
fit.10 <- lm(wage ~ poly(age, 10), data = Wage)
# Performing ANOVA to compare models
anova(fit.1, fit.2, fit.3, fit.4, fit.5, fit.6, fit.7, fit.8, fit.9, fit.10)
## Analysis of Variance Table
##
## Model 1: wage ~ age
## Model 2: wage ~ poly(age, 2)
## Model 3: wage ~ poly(age, 3)
## Model 4: wage ~ poly(age, 4)
## Model 5: wage ~ poly(age, 5)
## Model 6: wage ~ poly(age, 6)
## Model 7: wage ~ poly(age, 7)
## Model 8: wage ~ poly(age, 8)
## Model 9: wage ~ poly(age, 9)
## Model 10: wage ~ poly(age, 10)
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 2998 5022216
## 2 2997 4793430 1 228786 143.7638 < 2.2e-16 ***
## 3 2996 4777674 1 15756 9.9005 0.001669 **
## 4 2995 4771604 1 6070 3.8143 0.050909 .
## 5 2994 4770322 1 1283 0.8059 0.369398
## 6 2993 4766389 1 3932 2.4709 0.116074
## 7 2992 4763834 1 2555 1.6057 0.205199
## 8 2991 4763707 1 127 0.0796 0.777865
## 9 2990 4756703 1 7004 4.4014 0.035994 *
## 10 2989 4756701 1 3 0.0017 0.967529
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We thus see that, according to the ANOVA, going from degree 1 to 2 and from 2 to 3 gives statistically significant improvements, meaning these increases in degree help explain the relationship between wage and age. There is no statistically significant improvement after that until the step from degree 8 to 9. Given these results, the ANOVA suggests that the most practical fit is at degree 3, which likely provides enough flexibility without too much variance. Looking at the plot from before, we see that a degree-3 polynomial performs similarly to the best-performing degree of 10, which suggests that the apparent advantage of the degree-10 polynomial may be due to overfitting.
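To make this comparison concrete, we can print the cross-validated MSE of the degree-3 fit alongside that of the selected degree, using the mean_errors vector computed above (a small sketch):
cat("CV MSE, degree 3:", mean_errors[3], "\n")
cat("CV MSE, degree", best_degree, ":", mean_errors[best_degree], "\n")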
Next, we fit a step function to predict wage using age. We will try from 2 to 10 cuts and choose the best number using cross-validation:
set.seed(123)
k <- 10
max_cuts <- 10
folds <- createFolds(Wage$wage, k=k, list = F)
results <- as.data.frame(matrix(nrow=k, ncol=max_cuts)) # initialize a matrix for error
# cross validation
for (i in 1:k) {
test_idx <- which(folds == i) # Get test indices
train_data <- Wage[-test_idx, ]
test_data <- Wage[test_idx, ]
# Get MSE for each number of equal-width cuts
for (c in 2:max_cuts) { # Ensure at least 2 intervals for cut()
# Compute equal-width breakpoints based on the full range of age
breaks <- seq(min(Wage$age), max(Wage$age), length.out = c + 1)
# Apply the same breakpoints to both train and test data
train_data$age_cut <- cut(train_data$age, breaks = breaks, include.lowest = TRUE)
test_data$age_cut <- cut(test_data$age, breaks = breaks, include.lowest = TRUE)
model <- lm(wage ~ age_cut, data = train_data)
preds <- predict(model, newdata = test_data)
results[i, c] <- mean((preds - test_data$wage)^2)
}
}
# Compute the mean MSE for each number of cuts across all folds
mean_errors <- colMeans(results, na.rm = TRUE)
# Find the best-performing number of cuts
best_cuts <- which.min(mean_errors)
# Print results
cat("Best number of cuts:", best_cuts, "\n")
## Best number of cuts: 8
cat("Average MSE for the best cuts:", mean_errors[best_cuts], "\n")
## Average MSE for the best cuts: 1600.994
# Plot mean MSE for each cut
plot(2:max_cuts, mean_errors[2:max_cuts], type = "b", pch = 19, col = "blue",
xlab = "Number of Cuts", ylab = "Mean MSE",
main = "Cross-Validation Error for Number of Cuts")
We see that the best number of cuts is 8. Below is a plot of the fit:
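A sketch of such a plot, refitting the step function with the chosen number of cuts on the full data and overlaying it on the scatter plot (object names here are illustrative):
# Equal-width breakpoints over the full range of age, as in the cross-validation
breaks <- seq(min(Wage$age), max(Wage$age), length.out = best_cuts + 1)
wage_step <- Wage
wage_step$age_cut <- cut(wage_step$age, breaks = breaks, include.lowest = TRUE)
step_model <- lm(wage ~ age_cut, data = wage_step)
# Predict over a fine grid of ages to trace out the step curve
age_grid <- data.frame(age = seq(min(Wage$age), max(Wage$age), length.out = 500))
age_grid$age_cut <- cut(age_grid$age, breaks = breaks, include.lowest = TRUE)
age_grid$predicted_wage <- predict(step_model, newdata = age_grid)
ggplot(Wage, aes(x = age, y = wage)) +
  geom_point(alpha = 0.5) + # Scatter plot of actual data
  geom_step(data = age_grid, aes(x = age, y = predicted_wage), color = "blue", linewidth = 1.2) + # Fitted step function
  labs(title = paste("Step Function Fit with", best_cuts, "Cuts"), x = "Age", y = "Wage") +
  theme_minimal()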
The Wage data set contains a number of other features not explored in this chapter, such as marital status (maritl), job class (jobclass), and others. Explore the relationships between some of these other predictors and wage, and use non-linear fitting techniques in order to fit flexible models to the data. Create plots of the results obtained, and write a summary of your findings.
We will ignore year, as it is not a characteristic of the worker, and logwage, as it is directly related to the response. We will also ignore region, since only one region, the Middle Atlantic, is present in this data set:
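A call like the following produces the region counts shown below:
table(Wage$region)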
##
## 1. New England 2. Middle Atlantic 3. East North Central
## 0 3000 0
## 4. West North Central 5. South Atlantic 6. East South Central
## 0 0 0
## 7. West South Central 8. Mountain 9. Pacific
## 0 0 0
The variables we will focus on in this problem are maritl, race, education, jobclass, health, and health_ins.
Let’s first focus on maritl vs wage:
boxplot(wage ~ factor(maritl), data=Wage, main="Wage by Marital Status",
xlab=" ", ylab="Wage", col=cm.colors(5), las=2, cex.axis=0.8)
We can see that married workers appear to have a slightly higher wage, and their distribution has many more high outliers.
Now we can create a boxplot for race:
boxplot(wage ~ factor(race), data=Wage, main="Wage by Race",
xlab=" ", ylab="Wage", col=cm.colors(4), las=2)
We see that White and Asian workers appear to have slightly higher wages than Black workers or workers of other races.
Next is education:
boxplot(wage ~ factor(education), data=Wage, main="Wage by Education",
xlab=" ", ylab="Wage", col=cm.colors(5), las=2,cex.axis=0.6)
From here we see that higher levels of education are associated with higher wages, and the increase holds at every education level.
Now for jobclass:
boxplot(wage ~ factor(jobclass), data=Wage, main="Wage by Job Class",
xlab=" ", ylab="Wage", col=cm.colors(2), las=1)
Wages appear slightly higher for workers in information jobs than in industrial jobs.
Now for health:
boxplot(wage ~ factor(health), data=Wage, main="Wage by Health",
xlab=" ", ylab="Wage", col=cm.colors(2), las=1)
We can see that healthier workers have a slightly higher wage.
Lastly, for health_ins:
boxplot(wage ~ factor(health_ins), data=Wage, main="Wage by Health Insurance",
xlab=" ", ylab="Wage", col=cm.colors(2), las=1)
From this we can see that workers with health insurance have a clearly higher wage.
We will now use non-linear models with these variables to predict wage. We will fit two new models: the first adds education and health_ins to the age polynomial model from the previous question, as those appear to have the clearest relationship with wage; the second adds all of the variables above to age.
The model containing education and health_ins:
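A fit consistent with the summary output below (the object name is arbitrary):
model_edu_ins <- lm(wage ~ poly(age, best_degree) + education + health_ins, data = Wage)
summary(model_edu_ins)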
##
## Call:
## lm(formula = wage ~ poly(age, best_degree) + education + health_ins,
## data = Wage)
##
## Residuals:
## Min 1Q Median 3Q Max
## -103.034 -18.979 -3.411 13.533 210.123
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 94.210 2.246 41.937 < 2e-16 ***
## poly(age, best_degree)1 306.007 34.979 8.748 < 2e-16 ***
## poly(age, best_degree)2 -328.794 34.878 -9.427 < 2e-16 ***
## poly(age, best_degree)3 57.389 34.526 1.662 0.096575 .
## poly(age, best_degree)4 10.554 34.577 0.305 0.760219
## poly(age, best_degree)5 -59.364 34.558 -1.718 0.085941 .
## poly(age, best_degree)6 49.447 34.485 1.434 0.151716
## poly(age, best_degree)7 60.198 34.474 1.746 0.080880 .
## poly(age, best_degree)8 -15.077 34.476 -0.437 0.661907
## poly(age, best_degree)9 -41.138 34.518 -1.192 0.233446
## poly(age, best_degree)10 21.125 34.537 0.612 0.540806
## education2. HS Grad 8.395 2.400 3.497 0.000477 ***
## education3. Some College 19.285 2.542 7.586 4.37e-14 ***
## education4. College Grad 33.436 2.533 13.200 < 2e-16 ***
## education5. Advanced Degree 57.263 2.750 20.823 < 2e-16 ***
## health_ins2. No -16.907 1.428 -11.843 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 34.44 on 2984 degrees of freedom
## Multiple R-squared: 0.322, Adjusted R-squared: 0.3186
## F-statistic: 94.5 on 15 and 2984 DF, p-value: < 2.2e-16
The model containing all variables (except year, logwage, and region):
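Again, a fit consistent with the summary output below (object name arbitrary):
model_all <- lm(wage ~ poly(age, best_degree) + maritl + race + education +
                  jobclass + health + health_ins, data = Wage)
summary(model_all)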
##
## Call:
## lm(formula = wage ~ poly(age, best_degree) + maritl + race +
## education + jobclass + health + health_ins, data = Wage)
##
## Residuals:
## Min 1Q Median 3Q Max
## -108.222 -18.869 -3.539 13.814 210.673
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 80.6493 2.8269 28.529 < 2e-16 ***
## poly(age, best_degree)1 221.4768 39.7418 5.573 2.73e-08 ***
## poly(age, best_degree)2 -239.0219 36.1688 -6.609 4.58e-11 ***
## poly(age, best_degree)3 19.7752 34.3694 0.575 0.565083
## poly(age, best_degree)4 18.9340 34.0647 0.556 0.578373
## poly(age, best_degree)5 -56.5120 34.1273 -1.656 0.097844 .
## poly(age, best_degree)6 22.2221 34.0107 0.653 0.513558
## poly(age, best_degree)7 58.4591 33.9462 1.722 0.085153 .
## poly(age, best_degree)8 -18.1509 33.9107 -0.535 0.592513
## poly(age, best_degree)9 -43.2739 33.9460 -1.275 0.202485
## poly(age, best_degree)10 29.5938 33.9694 0.871 0.383720
## maritl2. Married 13.1756 1.8234 7.226 6.30e-13 ***
## maritl3. Widowed -1.0106 7.9921 -0.126 0.899380
## maritl4. Divorced -0.1244 2.9415 -0.042 0.966275
## maritl5. Separated 7.6217 4.8681 1.566 0.117534
## race2. Black -4.4729 2.1425 -2.088 0.036909 *
## race3. Asian -2.3087 2.5944 -0.890 0.373599
## race4. Other -5.2105 5.6520 -0.922 0.356656
## education2. HS Grad 7.9729 2.3702 3.364 0.000779 ***
## education3. Some College 18.4320 2.5220 7.309 3.45e-13 ***
## education4. College Grad 31.0826 2.5484 12.197 < 2e-16 ***
## education5. Advanced Degree 53.6361 2.8079 19.102 < 2e-16 ***
## jobclass2. Information 3.3955 1.3202 2.572 0.010160 *
## health2. >=Very Good 6.3416 1.4175 4.474 7.97e-06 ***
## health_ins2. No -16.3431 1.4119 -11.575 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 33.84 on 2975 degrees of freedom
## Multiple R-squared: 0.3476, Adjusted R-squared: 0.3424
## F-statistic: 66.06 on 24 and 2975 DF, p-value: < 2.2e-16
And for comparison, the polynomial age model:
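This is simply the summary of the best_model object fit earlier on the full data:
summary(best_model)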
##
## Call:
## lm(formula = wage ~ poly(age, best_degree), data = Wage)
##
## Residuals:
## Min 1Q Median 3Q Max
## -100.38 -24.45 -4.97 15.49 199.61
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 111.7036 0.7283 153.369 < 2e-16 ***
## poly(age, best_degree)1 447.0679 39.8924 11.207 < 2e-16 ***
## poly(age, best_degree)2 -478.3158 39.8924 -11.990 < 2e-16 ***
## poly(age, best_degree)3 125.5217 39.8924 3.147 0.00167 **
## poly(age, best_degree)4 -77.9112 39.8924 -1.953 0.05091 .
## poly(age, best_degree)5 -35.8129 39.8924 -0.898 0.36940
## poly(age, best_degree)6 62.7077 39.8924 1.572 0.11607
## poly(age, best_degree)7 50.5498 39.8924 1.267 0.20520
## poly(age, best_degree)8 -11.2547 39.8924 -0.282 0.77787
## poly(age, best_degree)9 -83.6918 39.8924 -2.098 0.03599 *
## poly(age, best_degree)10 1.6240 39.8924 0.041 0.96753
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 39.89 on 2989 degrees of freedom
## Multiple R-squared: 0.08912, Adjusted R-squared: 0.08607
## F-statistic: 29.24 on 10 and 2989 DF, p-value: < 2.2e-16
We can compare the adjusted R2 values to account for the different numbers of predictors. Both expanded models have a substantially higher adjusted R2 than the age-only polynomial, with the model using all variables highest at 0.3424; it also has the lowest residual standard error, 33.84. Accounting for these additional variables therefore gives a better model.
This question relates to the College data set.
college <- College
library(caret)
set.seed(123)
train_test <- createDataPartition(college$Outstate, times = 1, p = 0.8, list=F) # indices of training set
train_data <- college[train_test,]
test_data <- college[-train_test,] # hold out the remaining ~20% as a test set
null <- lm(Outstate ~ 1, data=train_data)
all <- lm(Outstate ~ ., data=train_data)
forward <- step(null, direction = 'forward', scope = formula(all), trace = 0)
forward$call
## lm(formula = Outstate ~ Expend + Private + Room.Board + perc.alumni +
## PhD + Grad.Rate + S.F.Ratio + Personal + Terminal + F.Undergrad +
## Accept + Apps + Top10perc + Enroll, data = train_data)
Using forward stepwise selection, we see that we can use Expend + Private + Room.Board + perc.alumni + PhD + Grad.Rate + S.F.Ratio + Personal + Terminal + F.Undergrad + Accept + Apps + Top10perc + Enroll to predict Outstate. This is a total of 14 variables, meaning forward stepwise selection eliminates 3 of the 17 predictors (Top25perc, P.Undergrad, and Books).
library(gam)
## Loading required package: splines
## Loading required package: foreach
## Loaded gam 1.22-5
gam_model <- gam(Outstate ~
Private + # Include categorical variable as-is
s(Expend) + s(Room.Board) + s(perc.alumni) +
s(PhD) + s(Grad.Rate) + s(S.F.Ratio) + s(Personal) +
s(Terminal) + s(F.Undergrad) + s(Accept) + s(Apps) +
s(Top10perc) + s(Enroll),
data = train_data)
par(mfrow=c(2,2)) # arrange the smooth-term plots in a 2x2 grid
plot(gam_model, se=T, col='blue', scale=0) # scale=0 (the default) lets each panel use its own y-axis range
Looking at the plots, it appears that the variables Expend, PhD, Grad.Rate, S.F.Ratio, Personal, Top10perc, and Enroll show some evidence of a non-linear relationship with Outstate.
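A sketch of the evaluation consistent with the output below, assuming test-set MSE and an R2 computed as 1 - SSE/SST on the held-out test_data defined earlier:
gam_preds <- predict(gam_model, newdata = test_data)
step_preds <- predict(forward, newdata = test_data)
sst <- sum((test_data$Outstate - mean(test_data$Outstate))^2) # total sum of squares on the test set
cat("GAM MSE:", mean((test_data$Outstate - gam_preds)^2), "\n")
cat("GAM R2:", 1 - sum((test_data$Outstate - gam_preds)^2) / sst, "\n")
cat("Stepwise Regression MSE:", mean((test_data$Outstate - step_preds)^2), "\n")
cat("Stepwise Regression R2:", 1 - sum((test_data$Outstate - step_preds)^2) / sst, "\n")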
## GAM MSE: 3364854
## GAM R2: 0.7844119
## Stepwise Regression MSE: 3598986
## Stepwise Regression R2: 0.769411
We see that the performance is similar, with the GAM slightly better than the stepwise linear regression: its test MSE is about 6.5% lower and its R2 is about 0.015 higher.
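These figures follow directly from the values reported above:
(3598986 - 3364854) / 3598986 # ~0.065, i.e. about a 6.5% reduction in test MSE
0.7844119 - 0.769411          # ~0.015 increase in R2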
As explained in part (b), the variables Expend, PhD, Grad.Rate, S.F.Ratio, Personal, Top10perc, and Enroll show evidence of a non-linear relationship based on the plots. We can look at the ANOVA for nonparametric effects below for further evidence:
summary(gam_model)
##
## Call: gam(formula = Outstate ~ Private + s(Expend) + s(Room.Board) +
## s(perc.alumni) + s(PhD) + s(Grad.Rate) + s(S.F.Ratio) + s(Personal) +
## s(Terminal) + s(F.Undergrad) + s(Accept) + s(Apps) + s(Top10perc) +
## s(Enroll), data = train_data)
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -6660.75 -1061.88 39.09 1157.13 7347.73
##
## (Dispersion Parameter for gaussian family taken to be 3056278)
##
## Null Deviance: 10155674653 on 622 degrees of freedom
## Residual Deviance: 1739021423 on 568.9997 degrees of freedom
## AIC: 11124.59
##
## Number of Local Scoring Iterations: NA
##
## Anova for Parametric Effects
## Df Sum Sq Mean Sq F value Pr(>F)
## Private 1 2769764849 2769764849 906.2541 < 2.2e-16 ***
## s(Expend) 1 2964900567 2964900567 970.1016 < 2.2e-16 ***
## s(Room.Board) 1 460884042 460884042 150.7991 < 2.2e-16 ***
## s(perc.alumni) 1 172202199 172202199 56.3438 2.361e-13 ***
## s(PhD) 1 59043260 59043260 19.3187 1.320e-05 ***
## s(Grad.Rate) 1 85916813 85916813 28.1116 1.641e-07 ***
## s(S.F.Ratio) 1 10049722 10049722 3.2882 0.070305 .
## s(Personal) 1 30237464 30237464 9.8936 0.001745 **
## s(Terminal) 1 12726158 12726158 4.1639 0.041754 *
## s(F.Undergrad) 1 28484009 28484009 9.3198 0.002373 **
## s(Accept) 1 103109412 103109412 33.7369 1.050e-08 ***
## s(Apps) 1 5061375 5061375 1.6561 0.198660
## s(Top10perc) 1 582274 582274 0.1905 0.662652
## s(Enroll) 1 22824450 22824450 7.4681 0.006476 **
## Residuals 569 1739021423 3056278
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Anova for Nonparametric Effects
## Npar Df Npar F Pr(F)
## (Intercept)
## Private
## s(Expend) 3 25.5850 1.554e-15 ***
## s(Room.Board) 3 0.9495 0.4163201
## s(perc.alumni) 3 1.5824 0.1925380
## s(PhD) 3 3.6749 0.0121149 *
## s(Grad.Rate) 3 3.2413 0.0217872 *
## s(S.F.Ratio) 3 5.4614 0.0010481 **
## s(Personal) 3 5.6025 0.0008628 ***
## s(Terminal) 3 1.7095 0.1639260
## s(F.Undergrad) 3 1.5038 0.2125030
## s(Accept) 3 6.9335 0.0001370 ***
## s(Apps) 3 5.1229 0.0016704 **
## s(Top10perc) 3 1.4598 0.2245227
## s(Enroll) 3 1.5761 0.1940774
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
From this ANOVA, there is evidence that Expend, PhD, Grad.Rate, S.F.Ratio, Personal, Accept, and Apps each have a non-linear relationship with Outstate. This agrees with most of the conclusions drawn from the plots, with a couple of exceptions: Top10perc and Enroll do not have significant nonparametric effects, while Accept and Apps do. In fact, Top10perc shows neither a significant linear nor a significant non-linear effect, while Enroll has a significant linear effect but no significant non-linear component.
We can thus conclude that there is evidence of non-linear relationships for the following variables: Expend, PhD, Grad.Rate, S.F.Ratio, Personal, Accept, and Apps.