(2) For parts (a) through (c), indicate which of i. through iv. is correct.

i. More flexible and hence will give improved prediction accuracy when its increase in bias is less than its decrease in variance.

ii. More flexible and hence will give improved prediction accuracy when its increase in variance is less than its decrease in bias.

iii. Less flexible and hence will give improved prediction accuracy when its increase in bias is less than its decrease in variance.

iv. Less flexible and hence will give improved prediction accuracy when its increase in variance is less than its decrease in bias.

# (a) The lasso, relative to least squares, is: Answer is (iii) Less flexible giving improved prediction accuracy when its increase in bias is less than its decrease in variance.


# (b) Repeat (a) for ridge regression relative to least squares: Answer is (iii) Same as Lasso, Less flexible and will give improved prediction accuracy when its increase in bias is less than its decrease in variance.


# (c) Repeat (a) for non-linear methods relative to least squares: Answer is (ii) More flexible giving improved prediction accuracy when its increase in variance is less than its decrease in bias.

(9)

(a) Split the data set into a training set and a test set.

## Warning: package 'ISLR' was built under R version 3.5.3

(b) Fit a linear model using least squares on the training set, and report the test error obtained.

ls.full<-lm(Apps~.,data=train)
summary(ls.full)
## 
## Call:
## lm(formula = Apps ~ ., data = train)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5023.2  -408.6   -46.6   336.7  7242.5 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -140.27742  518.76903  -0.270 0.786955    
## PrivateYes  -511.17144  176.18600  -2.901 0.003872 ** 
## Accept         1.63646    0.05066  32.302  < 2e-16 ***
## Enroll        -1.08353    0.24583  -4.408 1.27e-05 ***
## Top10perc     58.88924    7.33783   8.025 6.66e-15 ***
## Top25perc    -21.37240    6.10623  -3.500 0.000505 ***
## F.Undergrad    0.07107    0.04330   1.641 0.101333    
## P.Undergrad    0.05855    0.03783   1.548 0.122318    
## Outstate      -0.08871    0.02447  -3.625 0.000317 ***
## Room.Board     0.14479    0.06418   2.256 0.024493 *  
## Books         -0.19750    0.29231  -0.676 0.499547    
## Personal       0.03265    0.07920   0.412 0.680391    
## PhD           -8.78197    6.16418  -1.425 0.154844    
## Terminal      -4.79463    6.69874  -0.716 0.474463    
## S.F.Ratio     17.93368   16.68176   1.075 0.282848    
## perc.alumni    2.29991    5.51102   0.417 0.676608    
## Expend         0.07830    0.01574   4.974 8.92e-07 ***
## Grad.Rate      9.14788    3.83929   2.383 0.017541 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1113 on 525 degrees of freedom
## Multiple R-squared:  0.9308, Adjusted R-squared:  0.9286 
## F-statistic: 415.4 on 17 and 525 DF,  p-value: < 2.2e-16
predicted.apps<-predict(ls.full,test)
testerror<-mean((test$Apps-predicted.apps)^2)
testerror
## [1] 769127.1

The Mean Square error for the test data set is 769127.1

(c) Fit a ridge regression model on the training set, with ?? chosen by cross-validation. Report the test error obtained.

train.mat<-model.matrix(Apps~.,data=train)
test.mat<-model.matrix(Apps~.,data=test)
grid<-10^seq(4,-2,length=100)
library(glmnet)
## Warning: package 'glmnet' was built under R version 3.5.3
## Loading required package: Matrix
## Loading required package: foreach
## Warning: package 'foreach' was built under R version 3.5.3
## Loaded glmnet 2.0-16
ridge<-glmnet(train.mat,train$Apps,alpha=0,lambda=grid,thresh = 1e-12)
cv.ridge<-cv.glmnet(train.mat,train$Apps,alpha=0,lambda=grid,thresh=1e-12)
bestlam.ridge<-cv.ridge$lambda.min
bestlam.ridge
## [1] 0.01
pred.newridge<-predict(ridge,s=bestlam.ridge,newx =test.mat)
mean((test$Apps-pred.newridge)^2)
## [1] 769103.1

The Mean Square error for the test data set is found to be 7.69103.1 for Ridge Regression Model, slightly less than the one for Least Square Model

(d) Fit a lasso model on the training set, with ?? chosen by crossvalidation.Report the test error obtained, along with the number of non-zero coefficient estimates.

lasso<-glmnet(train.mat,train$Apps,alpha=1,lambda=grid,thresh = 1e-12)
cv.lasso<-cv.glmnet(train.mat,train$Apps,alpha=1,lambda=grid,thresh=1e-12)
bestlam.lasso<-cv.lasso$lambda.min
bestlam.lasso
## [1] 0.01
pred.newlasso<-predict(lasso,s=bestlam.lasso,newx =test.mat)
mean((test$Apps-pred.newlasso)^2)
## [1] 769058.3
predict(lasso,s=bestlam.lasso,type="coefficients")
## 19 x 1 sparse Matrix of class "dgCMatrix"
##                         1
## (Intercept) -140.44188745
## (Intercept)    .         
## PrivateYes  -511.17487803
## Accept         1.63637874
## Enroll        -1.08293849
## Top10perc     58.87977942
## Top25perc    -21.36470140
## F.Undergrad    0.07099655
## P.Undergrad    0.05854286
## Outstate      -0.08868893
## Room.Board     0.14477237
## Books         -0.19740566
## Personal       0.03263004
## PhD           -8.78073978
## Terminal      -4.79422684
## S.F.Ratio     17.92955431
## perc.alumni    2.29577721
## Expend         0.07829692
## Grad.Rate      9.14621252

The Mean Square error for the test data set is found to be 7.69058.3 for Lasso Regression Model The mean square error has further reduced compared to Ridge Regression.

(e) Fit a PCR model on the training set, with M chosen by crossvalidation. Report the test error obtained, along with the value of M selected by cross-validation.

library(pls)
## Warning: package 'pls' was built under R version 3.5.3
## 
## Attaching package: 'pls'
## The following object is masked from 'package:stats':
## 
##     loadings
pcrmodel<-pcr(Apps~.,data=train,scale=TRUE,validation="CV")
validationplot(pcrmodel,val.type="MSEP")

predict.pcr<-predict(pcrmodel,test,ncomp=17)
mean((test$Apps-predict.pcr)^2)
## [1] 769127.1

We see that the cross validation error is minimum for M=17.If we use 17 components it gives MSE on test set as 7.69127.1 which is similar to the one obtained for least squares method.

(f) Fit a PLS model on the training set, with M chosen by crossvalidation. Report the test error obtained, along with the value of M selected by cross-validation.

plsrmodel<-plsr(Apps~.,data=train,scale=TRUE,validation="CV")
validationplot(plsrmodel,val.type="MSEP")

predict.plsr<-predict(plsrmodel,test,ncomp=10)
mean((test$Apps-predict.plsr)^2)
## [1] 775233.6

We see that the cross validation error is minimum for M=10.If we use 10 components it gives MSE on test set as 7.75233.6, which is higher compared to the one obtained for least squares method.

(g) Comment on the results obtained. How accurately can we predict the number of college applications received? Is there much difference among the test errors resulting from these five approaches?

We use test R Sqaure statistic to see how accurately we predicted the number of college applications received Higher value of test R-Square shows higher accuracy.

test.avg <- mean(test$Apps)
lm.r2 <- 1 - mean((predicted.apps - test$Apps)^2) / mean((test.avg - test$Apps)^2)
lm.r2
## [1] 0.918807
ridge.r2 <- 1 - mean((pred.newridge - test$Apps)^2) / mean((test.avg - test$Apps)^2)
ridge.r2
## [1] 0.9188096
lasso.r2 <- 1 - mean((pred.newlasso - test$Apps)^2) / mean((test.avg - test$Apps)^2)
lasso.r2
## [1] 0.9188143
pcr.r2 <- 1 - mean((predict.pcr - test$Apps)^2) / mean((test.avg - test$Apps)^2)
pcr.r2
## [1] 0.918807
pls.r2 <- 1 - mean((predict.plsr - test$Apps)^2) / mean((test.avg - test$Apps)^2)
pls.r2
## [1] 0.9181624

It can be seen that the Lasso model predicts the highest R-Square. Though all the models have similar performance, This was expected as the minimum MSE was found for Lasso model across all the models.

(11)

(a) Try out some of the regression methods explored in this chapter, such as best subset selection, the lasso, ridge regression, and PCR. Present and discuss results for the approaches that you consider.

library(MASS)
attach(Boston)
dim(Boston)
## [1] 506  14
str(Boston)
## 'data.frame':    506 obs. of  14 variables:
##  $ crim   : num  0.00632 0.02731 0.02729 0.03237 0.06905 ...
##  $ zn     : num  18 0 0 0 0 0 12.5 12.5 12.5 12.5 ...
##  $ indus  : num  2.31 7.07 7.07 2.18 2.18 2.18 7.87 7.87 7.87 7.87 ...
##  $ chas   : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ nox    : num  0.538 0.469 0.469 0.458 0.458 0.458 0.524 0.524 0.524 0.524 ...
##  $ rm     : num  6.58 6.42 7.18 7 7.15 ...
##  $ age    : num  65.2 78.9 61.1 45.8 54.2 58.7 66.6 96.1 100 85.9 ...
##  $ dis    : num  4.09 4.97 4.97 6.06 6.06 ...
##  $ rad    : int  1 2 2 3 3 3 5 5 5 5 ...
##  $ tax    : num  296 242 242 222 222 222 311 311 311 311 ...
##  $ ptratio: num  15.3 17.8 17.8 18.7 18.7 18.7 15.2 15.2 15.2 15.2 ...
##  $ black  : num  397 397 393 395 397 ...
##  $ lstat  : num  4.98 9.14 4.03 2.94 5.33 ...
##  $ medv   : num  24 21.6 34.7 33.4 36.2 28.7 22.9 27.1 16.5 18.9 ...
summary(Boston)
##       crim                zn             indus            chas        
##  Min.   : 0.00632   Min.   :  0.00   Min.   : 0.46   Min.   :0.00000  
##  1st Qu.: 0.08204   1st Qu.:  0.00   1st Qu.: 5.19   1st Qu.:0.00000  
##  Median : 0.25651   Median :  0.00   Median : 9.69   Median :0.00000  
##  Mean   : 3.61352   Mean   : 11.36   Mean   :11.14   Mean   :0.06917  
##  3rd Qu.: 3.67708   3rd Qu.: 12.50   3rd Qu.:18.10   3rd Qu.:0.00000  
##  Max.   :88.97620   Max.   :100.00   Max.   :27.74   Max.   :1.00000  
##       nox               rm             age              dis        
##  Min.   :0.3850   Min.   :3.561   Min.   :  2.90   Min.   : 1.130  
##  1st Qu.:0.4490   1st Qu.:5.886   1st Qu.: 45.02   1st Qu.: 2.100  
##  Median :0.5380   Median :6.208   Median : 77.50   Median : 3.207  
##  Mean   :0.5547   Mean   :6.285   Mean   : 68.57   Mean   : 3.795  
##  3rd Qu.:0.6240   3rd Qu.:6.623   3rd Qu.: 94.08   3rd Qu.: 5.188  
##  Max.   :0.8710   Max.   :8.780   Max.   :100.00   Max.   :12.127  
##       rad              tax           ptratio          black       
##  Min.   : 1.000   Min.   :187.0   Min.   :12.60   Min.   :  0.32  
##  1st Qu.: 4.000   1st Qu.:279.0   1st Qu.:17.40   1st Qu.:375.38  
##  Median : 5.000   Median :330.0   Median :19.05   Median :391.44  
##  Mean   : 9.549   Mean   :408.2   Mean   :18.46   Mean   :356.67  
##  3rd Qu.:24.000   3rd Qu.:666.0   3rd Qu.:20.20   3rd Qu.:396.23  
##  Max.   :24.000   Max.   :711.0   Max.   :22.00   Max.   :396.90  
##      lstat            medv      
##  Min.   : 1.73   Min.   : 5.00  
##  1st Qu.: 6.95   1st Qu.:17.02  
##  Median :11.36   Median :21.20  
##  Mean   :12.65   Mean   :22.53  
##  3rd Qu.:16.95   3rd Qu.:25.00  
##  Max.   :37.97   Max.   :50.00
set.seed(10743959)
subset<-sample(nrow(Boston),nrow(Boston)*0.70)
boston.train<-Boston[subset,]
boston.test<-Boston[-subset,]

Full model

slr.full<-lm(medv~.,data=boston.train)
slr.predict<-predict(slr.full,boston.test)
slr.MSE<-mean((slr.predict-boston.test$medv)^2)
slr.MSE
## [1] 17.96323

Best subset

library(leaps)
## Warning: package 'leaps' was built under R version 3.5.3
bsm<-regsubsets(medv~.,data=boston.train,nbest=1,nvmax=13)
summary(bsm)
## Subset selection object
## Call: regsubsets.formula(medv ~ ., data = boston.train, nbest = 1, 
##     nvmax = 13)
## 13 Variables  (and intercept)
##         Forced in Forced out
## crim        FALSE      FALSE
## zn          FALSE      FALSE
## indus       FALSE      FALSE
## chas        FALSE      FALSE
## nox         FALSE      FALSE
## rm          FALSE      FALSE
## age         FALSE      FALSE
## dis         FALSE      FALSE
## rad         FALSE      FALSE
## tax         FALSE      FALSE
## ptratio     FALSE      FALSE
## black       FALSE      FALSE
## lstat       FALSE      FALSE
## 1 subsets of each size up to 13
## Selection Algorithm: exhaustive
##           crim zn  indus chas nox rm  age dis rad tax ptratio black lstat
## 1  ( 1 )  " "  " " " "   " "  " " " " " " " " " " " " " "     " "   "*"  
## 2  ( 1 )  " "  " " " "   " "  " " "*" " " " " " " " " " "     " "   "*"  
## 3  ( 1 )  " "  " " " "   " "  " " "*" " " " " " " " " "*"     " "   "*"  
## 4  ( 1 )  " "  " " " "   " "  " " "*" " " "*" " " " " "*"     " "   "*"  
## 5  ( 1 )  " "  " " " "   " "  "*" "*" " " "*" " " " " "*"     " "   "*"  
## 6  ( 1 )  " "  "*" " "   " "  "*" "*" " " "*" " " " " "*"     " "   "*"  
## 7  ( 1 )  " "  "*" " "   "*"  "*" "*" " " "*" " " " " "*"     " "   "*"  
## 8  ( 1 )  " "  "*" " "   "*"  "*" "*" " " "*" " " " " "*"     "*"   "*"  
## 9  ( 1 )  " "  "*" " "   " "  "*" "*" " " "*" "*" "*" "*"     "*"   "*"  
## 10  ( 1 ) " "  "*" " "   "*"  "*" "*" " " "*" "*" "*" "*"     "*"   "*"  
## 11  ( 1 ) "*"  "*" " "   "*"  "*" "*" " " "*" "*" "*" "*"     "*"   "*"  
## 12  ( 1 ) "*"  "*" " "   "*"  "*" "*" "*" "*" "*" "*" "*"     "*"   "*"  
## 13  ( 1 ) "*"  "*" "*"   "*"  "*" "*" "*" "*" "*" "*" "*"     "*"   "*"
boston.test.mat <- model.matrix(medv~ ., data = boston.test, nvmax = 13)
val.errors <- rep(NA, 13)
for (i in 1:13) {
    coefi <- coef(bsm, id = i)
    pred <- boston.test.mat[, names(coefi)] %*% coefi
    val.errors[i] <- mean((pred - boston.test$medv)^2)
}
plot(val.errors, xlab = "Number of predictors", ylab = "Test MSE", pch = 19, type = "b",col="orange")

which.min(val.errors)
## [1] 11
coef(bsm,which.min(val.errors))
##   (Intercept)          crim            zn          chas           nox 
##  42.504262409  -0.051977242   0.058240750   2.579296005 -16.716321472 
##            rm           dis           rad           tax       ptratio 
##   3.017526315  -1.598507331   0.330049548  -0.013725911  -0.904781765 
##         black         lstat 
##   0.008099312  -0.611990159
bsm.MSE<-val.errors[11]
bsm.MSE
## [1] 17.81448

forward selection

nullmodel<-lm(medv ~ 1, data = boston.train)
fullmodel<-lm(medv ~ ., data = boston.train)

fwd.model<-step(nullmodel, scope = list(lower = nullmodel, upper = fullmodel), 
    direction = "forward")
## Start:  AIC=1586.28
## medv ~ 1
## 
##           Df Sum of Sq   RSS    AIC
## + lstat    1   16886.9 14204 1311.0
## + rm       1   13875.0 17216 1379.0
## + indus    1    6900.6 24190 1499.4
## + ptratio  1    6829.7 24261 1500.5
## + tax      1    5850.3 25241 1514.5
## + nox      1    5552.7 25538 1518.6
## + crim     1    4592.6 26498 1531.7
## + age      1    4281.3 26810 1535.8
## + zn       1    4209.0 26882 1536.8
## + black    1    3725.3 27366 1543.1
## + rad      1    3716.7 27374 1543.2
## + dis      1    1831.6 29259 1566.8
## + chas     1    1093.5 29997 1575.6
## <none>                 31091 1586.3
## 
## Step:  AIC=1310.96
## medv ~ lstat
## 
##           Df Sum of Sq   RSS    AIC
## + rm       1   2395.69 11808 1247.6
## + ptratio  1   1732.25 12472 1266.9
## + dis      1    876.01 13328 1290.4
## + chas     1    517.36 13687 1299.8
## + age      1    225.86 13978 1307.3
## + zn       1    110.67 14093 1310.2
## + tax      1     91.75 14112 1310.7
## <none>                 14204 1311.0
## + black    1     77.31 14127 1311.0
## + nox      1     44.17 14160 1311.9
## + crim     1     16.41 14188 1312.5
## + indus    1     10.89 14193 1312.7
## + rad      1      0.00 14204 1313.0
## 
## Step:  AIC=1247.57
## medv ~ lstat + rm
## 
##           Df Sum of Sq   RSS    AIC
## + ptratio  1   1132.61 10676 1213.9
## + dis      1    549.77 11258 1232.7
## + chas     1    387.64 11421 1237.8
## + black    1    222.12 11586 1242.8
## + tax      1    141.87 11666 1245.3
## <none>                 11808 1247.6
## + age      1     41.66 11767 1248.3
## + zn       1     36.09 11772 1248.5
## + rad      1     28.87 11779 1248.7
## + crim     1     27.66 11781 1248.7
## + nox      1      8.20 11800 1249.3
## + indus    1      3.04 11805 1249.5
## 
## Step:  AIC=1213.87
## medv ~ lstat + rm + ptratio
## 
##         Df Sum of Sq   RSS    AIC
## + dis    1    582.48 10093 1196.0
## + chas   1    306.74 10369 1205.5
## + black  1    152.15 10524 1210.8
## + rad    1     81.27 10594 1213.2
## + age    1     63.71 10612 1213.8
## <none>               10676 1213.9
## + indus  1     17.59 10658 1215.3
## + zn     1      4.49 10671 1215.7
## + crim   1      0.99 10675 1215.8
## + tax    1      0.02 10676 1215.9
## + nox    1      0.01 10676 1215.9
## 
## Step:  AIC=1196.01
## medv ~ lstat + rm + ptratio + dis
## 
##         Df Sum of Sq     RSS    AIC
## + nox    1    485.83  9607.3 1180.5
## + zn     1    246.86  9846.3 1189.2
## + black  1    220.03  9873.1 1190.2
## + chas   1    189.94  9903.2 1191.3
## + indus  1    167.38  9925.8 1192.1
## + age    1     96.92  9996.3 1194.6
## + tax    1     84.77 10008.4 1195.0
## <none>               10093.2 1196.0
## + crim   1     14.98 10078.2 1197.5
## + rad    1      0.32 10092.9 1198.0
## 
## Step:  AIC=1180.55
## medv ~ lstat + rm + ptratio + dis + nox
## 
##         Df Sum of Sq    RSS    AIC
## + zn     1   263.206 9344.1 1172.7
## + chas   1   235.838 9371.5 1173.8
## + black  1   134.635 9472.7 1177.5
## + rad    1    94.690 9512.7 1179.0
## <none>               9607.3 1180.5
## + age    1    18.917 9588.4 1181.8
## + indus  1    17.286 9590.1 1181.9
## + crim   1     1.982 9605.4 1182.5
## + tax    1     0.423 9606.9 1182.5
## 
## Step:  AIC=1172.72
## medv ~ lstat + rm + ptratio + dis + nox + zn
## 
##         Df Sum of Sq    RSS    AIC
## + chas   1   228.511 9115.6 1166.0
## + black  1   161.277 9182.9 1168.5
## <none>               9344.1 1172.7
## + rad    1    44.435 9299.7 1173.0
## + crim   1    20.334 9323.8 1173.9
## + indus  1    13.701 9330.4 1174.2
## + tax    1    12.329 9331.8 1174.2
## + age    1     7.138 9337.0 1174.5
## 
## Step:  AIC=1165.95
## medv ~ lstat + rm + ptratio + dis + nox + zn + chas
## 
##         Df Sum of Sq    RSS    AIC
## + black  1   129.863 8985.8 1162.9
## <none>               9115.6 1166.0
## + rad    1    46.912 9068.7 1166.1
## + indus  1    18.764 9096.9 1167.2
## + crim   1    12.611 9103.0 1167.5
## + age    1    10.235 9105.4 1167.5
## + tax    1     6.846 9108.8 1167.7
## 
## Step:  AIC=1162.87
## medv ~ lstat + rm + ptratio + dis + nox + zn + chas + black
## 
##         Df Sum of Sq    RSS    AIC
## + rad    1   113.259 8872.5 1160.4
## <none>               8985.8 1162.9
## + age    1    14.476 8971.3 1164.3
## + indus  1    10.915 8974.9 1164.4
## + crim   1     0.376 8985.4 1164.9
## + tax    1     0.161 8985.6 1164.9
## 
## Step:  AIC=1160.38
## medv ~ lstat + rm + ptratio + dis + nox + zn + chas + black + 
##     rad
## 
##         Df Sum of Sq    RSS    AIC
## + tax    1   226.847 8645.7 1153.2
## <none>               8872.5 1160.4
## + indus  1    25.821 8846.7 1161.3
## + crim   1    20.783 8851.7 1161.5
## + age    1    10.522 8862.0 1162.0
## 
## Step:  AIC=1153.21
## medv ~ lstat + rm + ptratio + dis + nox + zn + chas + black + 
##     rad + tax
## 
##         Df Sum of Sq    RSS    AIC
## <none>               8645.7 1153.2
## + crim   1   24.7408 8620.9 1154.2
## + age    1    6.6613 8639.0 1154.9
## + indus  1    2.1652 8643.5 1155.1
summary(fwd.model)
## 
## Call:
## lm(formula = medv ~ lstat + rm + ptratio + dis + nox + zn + chas + 
##     black + rad + tax, data = boston.train)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -14.5511  -2.7981  -0.5087   1.6696  24.1049 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  41.553676   6.098471   6.814 4.28e-11 ***
## lstat        -0.620575   0.058679 -10.576  < 2e-16 ***
## rm            3.068242   0.484744   6.330 7.69e-10 ***
## ptratio      -0.899612   0.160989  -5.588 4.69e-08 ***
## dis          -1.568736   0.229077  -6.848 3.47e-11 ***
## nox         -16.280488   4.309207  -3.778 0.000186 ***
## zn            0.056446   0.016975   3.325 0.000979 ***
## chas          2.628518   1.055356   2.491 0.013224 *  
## black         0.008998   0.003350   2.686 0.007590 ** 
## rad           0.305467   0.083179   3.672 0.000279 ***
## tax          -0.013603   0.004534  -3.000 0.002898 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.021 on 343 degrees of freedom
## Multiple R-squared:  0.7219, Adjusted R-squared:  0.7138 
## F-statistic: 89.05 on 10 and 343 DF,  p-value: < 2.2e-16
fwd.predict<-predict(fwd.model,boston.test)
fwd.mse<-mean((fwd.predict-boston.test$medv)^2)
fwd.mse
## [1] 18.84049

backward selection

bwd.model<-step(fullmodel,direction = "backward")
## Start:  AIC=1157.9
## medv ~ crim + zn + indus + chas + nox + rm + age + dis + rad + 
##     tax + ptratio + black + lstat
## 
##           Df Sum of Sq     RSS    AIC
## - indus    1      1.52  8615.3 1156.0
## - age      1      5.66  8619.4 1156.1
## - crim     1     23.03  8636.8 1156.8
## <none>                  8613.8 1157.9
## - black    1    140.52  8754.3 1161.6
## - chas     1    148.15  8761.9 1161.9
## - tax      1    200.73  8814.5 1164.1
## - zn       1    282.73  8896.5 1167.3
## - nox      1    323.10  8936.9 1168.9
## - rad      1    344.22  8958.0 1169.8
## - ptratio  1    781.31  9395.1 1186.6
## - rm       1    962.12  9575.9 1193.4
## - dis      1   1057.32  9671.1 1196.9
## - lstat    1   2403.65 11017.4 1243.0
## 
## Step:  AIC=1155.97
## medv ~ crim + zn + chas + nox + rm + age + dis + rad + tax + 
##     ptratio + black + lstat
## 
##           Df Sum of Sq     RSS    AIC
## - age      1      5.62  8620.9 1154.2
## - crim     1     23.70  8639.0 1154.9
## <none>                  8615.3 1156.0
## - black    1    139.54  8754.8 1159.7
## - chas     1    152.16  8767.5 1160.2
## - tax      1    227.08  8842.4 1163.2
## - zn       1    282.42  8897.7 1165.4
## - nox      1    330.81  8946.1 1167.3
## - rad      1    355.54  8970.8 1168.3
## - ptratio  1    782.19  9397.5 1184.7
## - rm       1    961.95  9577.2 1191.4
## - dis      1   1129.57  9744.9 1197.6
## - lstat    1   2415.87 11031.2 1241.5
## 
## Step:  AIC=1154.2
## medv ~ crim + zn + chas + nox + rm + dis + rad + tax + ptratio + 
##     black + lstat
## 
##           Df Sum of Sq     RSS    AIC
## - crim     1     24.74  8645.7 1153.2
## <none>                  8620.9 1154.2
## - black    1    137.25  8758.2 1157.8
## - chas     1    150.23  8771.1 1158.3
## - tax      1    230.80  8851.7 1161.5
## - zn       1    293.36  8914.3 1164.0
## - rad      1    364.43  8985.4 1166.9
## - nox      1    375.39  8996.3 1167.3
## - ptratio  1    795.33  9416.2 1183.4
## - rm       1    965.97  9586.9 1189.8
## - dis      1   1206.59  9827.5 1198.6
## - lstat    1   2683.27 11304.2 1248.1
## 
## Step:  AIC=1153.21
## medv ~ zn + chas + nox + rm + dis + rad + tax + ptratio + black + 
##     lstat
## 
##           Df Sum of Sq     RSS    AIC
## <none>                  8645.7 1153.2
## - chas     1    156.36  8802.0 1157.6
## - black    1    181.81  8827.5 1158.6
## - tax      1    226.85  8872.5 1160.4
## - zn       1    278.70  8924.4 1162.4
## - rad      1    339.94  8985.6 1164.9
## - nox      1    359.79  9005.4 1165.7
## - ptratio  1    787.09  9432.8 1182.1
## - rm       1   1009.85  9655.5 1190.3
## - dis      1   1182.06  9827.7 1196.6
## - lstat    1   2819.25 11464.9 1251.1
summary(bwd.model)
## 
## Call:
## lm(formula = medv ~ zn + chas + nox + rm + dis + rad + tax + 
##     ptratio + black + lstat, data = boston.train)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -14.5511  -2.7981  -0.5087   1.6696  24.1049 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  41.553676   6.098471   6.814 4.28e-11 ***
## zn            0.056446   0.016975   3.325 0.000979 ***
## chas          2.628518   1.055356   2.491 0.013224 *  
## nox         -16.280488   4.309207  -3.778 0.000186 ***
## rm            3.068242   0.484744   6.330 7.69e-10 ***
## dis          -1.568736   0.229077  -6.848 3.47e-11 ***
## rad           0.305467   0.083179   3.672 0.000279 ***
## tax          -0.013603   0.004534  -3.000 0.002898 ** 
## ptratio      -0.899612   0.160989  -5.588 4.69e-08 ***
## black         0.008998   0.003350   2.686 0.007590 ** 
## lstat        -0.620575   0.058679 -10.576  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.021 on 343 degrees of freedom
## Multiple R-squared:  0.7219, Adjusted R-squared:  0.7138 
## F-statistic: 89.05 on 10 and 343 DF,  p-value: < 2.2e-16
bwd.predict<-predict(bwd.model,boston.test)
bwd.mse<-mean((bwd.predict-boston.test$medv)^2)
bwd.mse
## [1] 18.84049

stepwise

step.model<-step(nullmodel, scope = list(lower = nullmodel, upper = fullmodel), 
    direction = "both")
## Start:  AIC=1586.28
## medv ~ 1
## 
##           Df Sum of Sq   RSS    AIC
## + lstat    1   16886.9 14204 1311.0
## + rm       1   13875.0 17216 1379.0
## + indus    1    6900.6 24190 1499.4
## + ptratio  1    6829.7 24261 1500.5
## + tax      1    5850.3 25241 1514.5
## + nox      1    5552.7 25538 1518.6
## + crim     1    4592.6 26498 1531.7
## + age      1    4281.3 26810 1535.8
## + zn       1    4209.0 26882 1536.8
## + black    1    3725.3 27366 1543.1
## + rad      1    3716.7 27374 1543.2
## + dis      1    1831.6 29259 1566.8
## + chas     1    1093.5 29997 1575.6
## <none>                 31091 1586.3
## 
## Step:  AIC=1310.96
## medv ~ lstat
## 
##           Df Sum of Sq   RSS    AIC
## + rm       1    2395.7 11808 1247.6
## + ptratio  1    1732.3 12472 1266.9
## + dis      1     876.0 13328 1290.4
## + chas     1     517.4 13687 1299.8
## + age      1     225.9 13978 1307.3
## + zn       1     110.7 14093 1310.2
## + tax      1      91.8 14112 1310.7
## <none>                 14204 1311.0
## + black    1      77.3 14127 1311.0
## + nox      1      44.2 14160 1311.9
## + crim     1      16.4 14188 1312.5
## + indus    1      10.9 14193 1312.7
## + rad      1       0.0 14204 1313.0
## - lstat    1   16886.9 31091 1586.3
## 
## Step:  AIC=1247.57
## medv ~ lstat + rm
## 
##           Df Sum of Sq   RSS    AIC
## + ptratio  1    1132.6 10676 1213.9
## + dis      1     549.8 11258 1232.7
## + chas     1     387.6 11421 1237.8
## + black    1     222.1 11586 1242.8
## + tax      1     141.9 11666 1245.3
## <none>                 11808 1247.6
## + age      1      41.7 11767 1248.3
## + zn       1      36.1 11772 1248.5
## + rad      1      28.9 11779 1248.7
## + crim     1      27.7 11781 1248.7
## + nox      1       8.2 11800 1249.3
## + indus    1       3.0 11805 1249.5
## - rm       1    2395.7 14204 1311.0
## - lstat    1    5407.6 17216 1379.0
## 
## Step:  AIC=1213.87
## medv ~ lstat + rm + ptratio
## 
##           Df Sum of Sq   RSS    AIC
## + dis      1     582.5 10093 1196.0
## + chas     1     306.7 10369 1205.5
## + black    1     152.2 10524 1210.8
## + rad      1      81.3 10594 1213.2
## + age      1      63.7 10612 1213.8
## <none>                 10676 1213.9
## + indus    1      17.6 10658 1215.3
## + zn       1       4.5 10671 1215.7
## + crim     1       1.0 10675 1215.8
## + tax      1       0.0 10676 1215.9
## + nox      1       0.0 10676 1215.9
## - ptratio  1    1132.6 11808 1247.6
## - rm       1    1796.1 12472 1266.9
## - lstat    1    4415.1 15091 1334.4
## 
## Step:  AIC=1196.01
## medv ~ lstat + rm + ptratio + dis
## 
##           Df Sum of Sq     RSS    AIC
## + nox      1     485.8  9607.3 1180.5
## + zn       1     246.9  9846.3 1189.2
## + black    1     220.0  9873.1 1190.2
## + chas     1     189.9  9903.2 1191.3
## + indus    1     167.4  9925.8 1192.1
## + age      1      96.9  9996.3 1194.6
## + tax      1      84.8 10008.4 1195.0
## <none>                 10093.2 1196.0
## + crim     1      15.0 10078.2 1197.5
## + rad      1       0.3 10092.9 1198.0
## - dis      1     582.5 10675.7 1213.9
## - ptratio  1    1165.3 11258.5 1232.7
## - rm       1    1509.0 11602.2 1243.3
## - lstat    1    4868.4 14961.6 1333.4
## 
## Step:  AIC=1180.55
## medv ~ lstat + rm + ptratio + dis + nox
## 
##           Df Sum of Sq     RSS    AIC
## + zn       1     263.2  9344.1 1172.7
## + chas     1     235.8  9371.5 1173.8
## + black    1     134.6  9472.7 1177.5
## + rad      1      94.7  9512.7 1179.0
## <none>                  9607.3 1180.5
## + age      1      18.9  9588.4 1181.8
## + indus    1      17.3  9590.1 1181.9
## + crim     1       2.0  9605.4 1182.5
## + tax      1       0.4  9606.9 1182.5
## - nox      1     485.8 10093.2 1196.0
## - dis      1    1068.3 10675.6 1215.9
## - ptratio  1    1356.9 10964.3 1225.3
## - rm       1    1444.7 11052.1 1228.1
## - lstat    1    3420.8 13028.1 1286.4
## 
## Step:  AIC=1172.72
## medv ~ lstat + rm + ptratio + dis + nox + zn
## 
##           Df Sum of Sq     RSS    AIC
## + chas     1     228.5  9115.6 1166.0
## + black    1     161.3  9182.9 1168.5
## <none>                  9344.1 1172.7
## + rad      1      44.4  9299.7 1173.0
## + crim     1      20.3  9323.8 1173.9
## + indus    1      13.7  9330.4 1174.2
## + tax      1      12.3  9331.8 1174.2
## + age      1       7.1  9337.0 1174.5
## - zn       1     263.2  9607.3 1180.5
## - nox      1     502.2  9846.3 1189.2
## - ptratio  1     954.0 10298.1 1205.1
## - rm       1    1215.1 10559.2 1214.0
## - dis      1    1326.0 10670.1 1217.7
## - lstat    1    3512.4 12856.5 1283.7
## 
## Step:  AIC=1165.95
## medv ~ lstat + rm + ptratio + dis + nox + zn + chas
## 
##           Df Sum of Sq     RSS    AIC
## + black    1     129.9  8985.8 1162.9
## <none>                  9115.6 1166.0
## + rad      1      46.9  9068.7 1166.1
## + indus    1      18.8  9096.9 1167.2
## + crim     1      12.6  9103.0 1167.5
## + age      1      10.2  9105.4 1167.5
## + tax      1       6.8  9108.8 1167.7
## - chas     1     228.5  9344.1 1172.7
## - zn       1     255.9  9371.5 1173.8
## - nox      1     547.8  9663.4 1184.6
## - ptratio  1     902.1 10017.7 1197.4
## - rm       1    1183.0 10298.7 1207.2
## - dis      1    1240.9 10356.5 1209.1
## - lstat    1    3276.0 12391.6 1272.6
## 
## Step:  AIC=1162.87
## medv ~ lstat + rm + ptratio + dis + nox + zn + chas + black
## 
##           Df Sum of Sq     RSS    AIC
## + rad      1    113.26  8872.5 1160.4
## <none>                  8985.8 1162.9
## + age      1     14.48  8971.3 1164.3
## + indus    1     10.92  8974.9 1164.4
## + crim     1      0.38  8985.4 1164.9
## + tax      1      0.16  8985.6 1164.9
## - black    1    129.86  9115.6 1166.0
## - chas     1    197.10  9182.9 1168.5
## - zn       1    279.84  9265.6 1171.7
## - nox      1    454.26  9440.0 1178.3
## - ptratio  1    820.48  9806.2 1191.8
## - dis      1   1252.80 10238.6 1207.1
## - rm       1   1260.63 10246.4 1207.3
## - lstat    1   2823.02 11808.8 1257.6
## 
## Step:  AIC=1160.38
## medv ~ lstat + rm + ptratio + dis + nox + zn + chas + black + 
##     rad
## 
##           Df Sum of Sq     RSS    AIC
## + tax      1    226.85  8645.7 1153.2
## <none>                  8872.5 1160.4
## + indus    1     25.82  8846.7 1161.3
## + crim     1     20.78  8851.7 1161.5
## + age      1     10.52  8862.0 1162.0
## - rad      1    113.26  8985.8 1162.9
## - chas     1    192.69  9065.2 1166.0
## - black    1    196.21  9068.7 1166.1
## - zn       1    209.06  9081.6 1166.6
## - nox      1    562.12  9434.6 1180.1
## - ptratio  1    904.86  9777.4 1192.8
## - dis      1   1144.82 10017.3 1201.3
## - rm       1   1151.98 10024.5 1201.6
## - lstat    1   2850.03 11722.5 1257.0
## 
## Step:  AIC=1153.21
## medv ~ lstat + rm + ptratio + dis + nox + zn + chas + black + 
##     rad + tax
## 
##           Df Sum of Sq     RSS    AIC
## <none>                  8645.7 1153.2
## + crim     1     24.74  8620.9 1154.2
## + age      1      6.66  8639.0 1154.9
## + indus    1      2.17  8643.5 1155.1
## - chas     1    156.36  8802.0 1157.6
## - black    1    181.81  8827.5 1158.6
## - tax      1    226.85  8872.5 1160.4
## - zn       1    278.70  8924.4 1162.4
## - rad      1    339.94  8985.6 1164.9
## - nox      1    359.79  9005.4 1165.7
## - ptratio  1    787.09  9432.8 1182.1
## - rm       1   1009.85  9655.5 1190.3
## - dis      1   1182.06  9827.7 1196.6
## - lstat    1   2819.25 11464.9 1251.1
summary(step.model)
## 
## Call:
## lm(formula = medv ~ lstat + rm + ptratio + dis + nox + zn + chas + 
##     black + rad + tax, data = boston.train)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -14.5511  -2.7981  -0.5087   1.6696  24.1049 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  41.553676   6.098471   6.814 4.28e-11 ***
## lstat        -0.620575   0.058679 -10.576  < 2e-16 ***
## rm            3.068242   0.484744   6.330 7.69e-10 ***
## ptratio      -0.899612   0.160989  -5.588 4.69e-08 ***
## dis          -1.568736   0.229077  -6.848 3.47e-11 ***
## nox         -16.280488   4.309207  -3.778 0.000186 ***
## zn            0.056446   0.016975   3.325 0.000979 ***
## chas          2.628518   1.055356   2.491 0.013224 *  
## black         0.008998   0.003350   2.686 0.007590 ** 
## rad           0.305467   0.083179   3.672 0.000279 ***
## tax          -0.013603   0.004534  -3.000 0.002898 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.021 on 343 degrees of freedom
## Multiple R-squared:  0.7219, Adjusted R-squared:  0.7138 
## F-statistic: 89.05 on 10 and 343 DF,  p-value: < 2.2e-16
step.predict<-predict(step.model,boston.test)
step.mse<-mean((step.predict-boston.test$medv)^2)
step.mse
## [1] 18.84049

ridge regression

bostontrain.mat<-model.matrix(medv~.,data=boston.train)
bostontest.mat<-model.matrix(medv~.,data=boston.test)
grid<-10^seq(4,-2,length=100)
library(glmnet)
ridge.model<-glmnet(bostontrain.mat,boston.train$medv,alpha=0,lambda=grid)
boston.cv.ridge<-cv.glmnet(bostontrain.mat,boston.train$medv,alpha=0,lambda=grid)
plot(boston.cv.ridge)

boston.bestlam.ridge<-boston.cv.ridge$lambda.min
boston.bestlam.ridge
## [1] 0.1417474
ridge.model.1<-glmnet(bostontrain.mat,boston.train$medv,alpha=0,lambda=boston.bestlam.ridge)
coef(ridge.model.1)
## 15 x 1 sparse Matrix of class "dgCMatrix"
##                        s0
## (Intercept)  39.726176503
## (Intercept)   .          
## crim         -0.043885158
## zn            0.052687856
## indus        -0.005063095
## chas          2.692011518
## nox         -15.182853995
## rm            3.177687894
## age          -0.008481379
## dis          -1.529508653
## rad           0.269963547
## tax          -0.010966192
## ptratio      -0.881138929
## black         0.008225794
## lstat        -0.588333809
boston.pred.newridge<-predict(ridge.model,s=boston.bestlam.ridge,newx=bostontest.mat)
ridge.MSE<-mean((boston.test$medv-boston.pred.newridge)^2)
ridge.MSE
## [1] 17.74425

I performed ridge regression for a range of values of lambda and find MSE for training set. The value of lambda corresponding to minimum MSE is used to build final model and for prediction. The value of lambda is 0.1417474 for which we got minimum train set MSE

lasso regression

lasso.model<-glmnet(bostontrain.mat,boston.train$medv,alpha=0,lambda=grid)

boston.cv.lasso<-cv.glmnet(bostontrain.mat,boston.train$medv,alpha=0,lambda=grid)

plot(boston.cv.lasso)

boston.bestlam.lasso<-boston.cv.lasso$lambda.min
boston.bestlam.lasso
## [1] 0.1232847
lasso.model.1<-glmnet(bostontrain.mat,boston.train$medv,alpha=0,lambda=boston.bestlam.lasso)
coef(lasso.model.1)
## 15 x 1 sparse Matrix of class "dgCMatrix"
##                        s0
## (Intercept)  40.033394715
## (Intercept)   .          
## crim         -0.044466644
## zn            0.053252258
## indus        -0.002505365
## chas          2.680304197
## nox         -15.340063745
## rm            3.166768913
## age          -0.008441933
## dis          -1.541098565
## rad           0.276257251
## tax          -0.011281348
## ptratio      -0.883796393
## black         0.008225935
## lstat        -0.590324265
boston.pred.newlasso<-predict(lasso.model,s=boston.bestlam.lasso,newx=bostontest.mat)
lasso.MSE<-mean((boston.test$medv-boston.pred.newlasso)^2)
lasso.MSE
## [1] 17.76477

I performed lasso regression for a range of values of lambda and find MSE for training set. The value of lambda corresponding to minimum MSE is used to build final model and for prediction The value of lambda is 0.1232847 for which we got minimum train set MSE.

Principal Component Regression (PCR)

library(pls)
pcr.model<-pcr(medv~.,data=boston.train,scale=TRUE,validation="CV")
summary(pcr.model)
## Data:    X dimension: 354 13 
##  Y dimension: 354 1
## Fit method: svdpc
## Number of components considered: 13
## 
## VALIDATION: RMSEP
## Cross-validated using 10 random segments.
##        (Intercept)  1 comps  2 comps  3 comps  4 comps  5 comps  6 comps
## CV           9.398    7.599    7.437    5.937    5.957    5.576    5.598
## adjCV        9.398    7.597    7.434    5.932    5.952    5.567    5.591
##        7 comps  8 comps  9 comps  10 comps  11 comps  12 comps  13 comps
## CV       5.575    5.477    5.447     5.428     5.452     5.253     5.186
## adjCV    5.565    5.467    5.444     5.418     5.445     5.240     5.173
## 
## TRAINING: % variance explained
##       1 comps  2 comps  3 comps  4 comps  5 comps  6 comps  7 comps
## X       47.68    59.12    68.52    75.68    82.00    86.70    90.14
## medv    34.95    39.63    60.77    60.94    65.86    65.96    66.42
##       8 comps  9 comps  10 comps  11 comps  12 comps  13 comps
## X       93.15    95.05     96.84     98.28     99.58    100.00
## medv    67.91    68.10     69.03     69.15     71.49     72.29
validationplot(pcr.model,val.type="MSEP")

pcr.model.5comp<-pcr(medv~.,data=boston.train,scale=TRUE,ncomp=5)
summary(pcr.model.5comp)
## Data:    X dimension: 354 13 
##  Y dimension: 354 1
## Fit method: svdpc
## Number of components considered: 5
## TRAINING: % variance explained
##       1 comps  2 comps  3 comps  4 comps  5 comps
## X       47.68    59.12    68.52    75.68    82.00
## medv    34.95    39.63    60.77    60.94    65.86
pcr.model.5comp$coefficients
## , , 1 comps
## 
##                  medv
## crim    -0.6063915501
## zn       0.5653092042
## indus   -0.7708923694
## chas    -0.0006292165
## nox     -0.7499003656
## rm       0.4304083600
## age     -0.6915127075
## dis      0.7132887312
## rad     -0.7111004067
## tax     -0.7475540656
## ptratio -0.4139434089
## black    0.4877422872
## lstat   -0.6904890541
## 
## , , 2 comps
## 
##                medv
## crim    -1.16076039
## zn       0.04235492
## indus   -0.56193240
## chas     0.63284060
## nox     -0.34381707
## rm       0.58457968
## age     -0.17407242
## dis      0.12326631
## rad     -1.17923658
## tax     -1.15681336
## ptratio -0.97561493
## black    0.97449535
## lstat   -0.73255090
## 
## , , 3 comps
## 
##                medv
## crim    -0.40885303
## zn       1.14506849
## indus   -0.57361682
## chas     2.11629628
## nox      0.07154186
## rm       2.92971918
## age     -0.12664889
## dis     -0.12661619
## rad     -0.06747743
## tax     -0.27611708
## ptratio -2.13668603
## black    0.02081913
## lstat   -1.84931819
## 
## , , 4 comps
## 
##                 medv
## crim    -0.446493901
## zn       1.082229500
## indus   -0.560066308
## chas     2.339542181
## nox      0.002820896
## rm       2.929573203
## age     -0.173545861
## dis     -0.095552865
## rad      0.012642142
## tax     -0.218882977
## ptratio -1.892652735
## black    0.159432982
## lstat   -1.922300636
## 
## , , 5 comps
## 
##                medv
## crim    -0.79069410
## zn       0.31377021
## indus   -0.53227588
## chas     0.97980565
## nox     -0.07258062
## rm       4.11921228
## age      0.08861939
## dis     -0.53232412
## rad      0.24973101
## tax     -0.07788551
## ptratio -1.33947559
## black    0.49875496
## lstat   -2.65801809
boston.predict.pcr<-predict(pcr.model,boston.test,ncomp=5)
pcr.MSE<-mean((boston.test$medv-boston.predict.pcr)^2)
pcr.MSE
## [1] 15.96318

I performed Principal Component Regression and see summary for 13 components plot of MSE vs Number of Components suggests that minimum MSE is obtained for 13 components.This means that PCR is not helping in dimension reduction from p+1 to m+1 I select 5 as number of components in final model as 5 components explain about 80 % of variation Scores of the components are also summarized. Scores are further used to fit the Principal Component Model

PLS

library(pls)
plsr.model<-plsr(medv~.,data=boston.train,scale=TRUE,validation="CV")
validationplot(plsr.model,val.type="MSEP")

plsr.model.5comp<-plsr(medv~.,data=boston.train,scale=TRUE,ncomp=5)
summary(plsr.model.5comp)
## Data:    X dimension: 354 13 
##  Y dimension: 354 1
## Fit method: kernelpls
## Number of components considered: 5
## TRAINING: % variance explained
##       1 comps  2 comps  3 comps  4 comps  5 comps
## X       46.42    57.12    64.14    70.68    76.34
## medv    46.90    67.90    70.21    71.38    71.87
plsr.model.5comp$coefficients
## , , 1 comps
## 
##               medv
## crim    -0.6629002
## zn       0.6346088
## indus   -0.8125719
## chas     0.3234698
## nox     -0.7289083
## rm       1.1522209
## age     -0.6400400
## dis      0.4186334
## rad     -0.5963464
## tax     -0.7481862
## ptratio -0.8083894
## black    0.5970368
## lstat   -1.2711419
## 
## , , 2 comps
## 
##                medv
## crim    -0.42482542
## zn       0.52607570
## indus   -0.47112526
## chas     1.40382834
## nox     -0.24208308
## rm       3.48196710
## age     -0.04543861
## dis     -1.18593277
## rad      0.35129618
## tax     -0.25239724
## ptratio -1.93617630
## black    0.69452051
## lstat   -3.03566596
## 
## , , 3 comps
## 
##               medv
## crim     0.0162841
## zn       0.5614550
## indus   -0.4975755
## chas     0.5868014
## nox     -0.9170569
## rm       3.3403655
## age     -0.3678324
## dis     -1.9690256
## rad      0.9934994
## tax     -0.1655670
## ptratio -1.6703181
## black    0.8684079
## lstat   -4.0099664
## 
## , , 4 comps
## 
##               medv
## crim    -0.0137188
## zn       0.5497794
## indus   -0.3653050
## chas     0.5499696
## nox     -1.3785074
## rm       2.5871245
## age     -0.3460459
## dis     -3.0540460
## rad      1.1149044
## tax     -0.6409048
## ptratio -1.9283887
## black    1.1417590
## lstat   -4.5137243
## 
## , , 5 comps
## 
##                medv
## crim     0.04611646
## zn       1.10369104
## indus   -0.31643319
## chas     0.79412630
## nox     -1.76929120
## rm       2.19271829
## age     -0.40901868
## dis     -3.49536648
## rad      1.46882099
## tax     -0.87414555
## ptratio -2.08221751
## black    0.81717250
## lstat   -4.63089016
boston.predict.plsr<-predict(plsr.model,boston.test,ncomp=5)
plsr.MSE<-mean((boston.test$medv-boston.predict.plsr)^2)
plsr.MSE
## [1] 19.42082

(b) Propose a model (or set of models) that seem to perform well on this data set, and justify your answer. Make sure that you are evaluating model performance using validation set error, crossvalidation, or some other reasonable alternative, as opposed to using training error.

#Validation Set MSE is used for evaluating model performance, Below is the list of different approaches with the MSE obtained for Validation Data set.

##Full Model: 17.96

##Best Subset: 17.81

##Forward Selection: 18.84

##Backward Selection: 18.84

##Stepwise Selection: 18.84

##Ridge Regression: 17.74

##Lasso Regression: 17.76

##Principal Component Regression (PCR): 15.96

##Partial Least Square (PLS): 19.42

##The minimum validation is obtained for Principal Component Regression Model with 5 Principal Components

##It is interesting to see that validation MSE for forward, backward and subset selection models is same

##The second least MSE for Validation set is obtained for Ridge regression

(c) Does your chosen model involve all of the features in the data set? Why or why not?

# The PCR model contains 5 components which comprises of all the 13 predictor variables. The weightage of each predictor variable for every component is given by scores

#The Ridge regression model also contains all the predictor variables

#This suggest that all the predictor variables are contributing in predicting the response variable

The End