Exercise 7.2
1. K-Nearest Neighbors
## k-Nearest Neighbors
##
## 200 samples
## 10 predictor
##
## Pre-processing: centered (10), scaled (10)
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 200, 200, 200, 200, 200, 200, ...
## Resampling results across tuning parameters:
##
## k RMSE Rsquared MAE
## 5 3.466085 0.5121775 2.816838
## 7 3.349428 0.5452823 2.727410
## 9 3.264276 0.5785990 2.660026
## 11 3.214216 0.6024244 2.603767
## 13 3.196510 0.6176570 2.591935
## 15 3.184173 0.6305506 2.577482
## 17 3.183130 0.6425367 2.567787
## 19 3.198752 0.6483184 2.592683
## 21 3.188993 0.6611428 2.588787
## 23 3.200458 0.6638353 2.604529
##
## RMSE was used to select the optimal model using the smallest value.
## The final value used for the model was k = 17.
## RMSE Rsquared MAE
## 3.2040595 0.6819919 2.5683461
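For reference, a minimal sketch of the kind of `caret` call that would produce output like the above, assuming the simulated Friedman benchmark data from `mlbench` (the seed and `tuneLength` are assumptions):

```r
library(caret)
library(mlbench)

# Simulated benchmark: y depends on X1-X5; X6-X10 are noise
set.seed(200)
trainingData <- mlbench.friedman1(200, sd = 1)
trainingData$x <- data.frame(trainingData$x)
testData <- mlbench.friedman1(5000, sd = 1)
testData$x <- data.frame(testData$x)

knnModel <- train(x = trainingData$x, y = trainingData$y,
                  method = "knn",
                  preProc = c("center", "scale"),
                  tuneLength = 10)          # candidate k = 5, 7, ..., 23

# Test-set performance (the final RMSE / Rsquared / MAE line above)
postResample(predict(knnModel, newdata = testData$x), testData$y)
```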
2. Neural Network
Decay = .06, size = 5.
## a 10-5-1 network with 61 weights
## options were - linear output units decay=0.06
## RMSE Rsquared MAE
## 1.5052969 0.9103803 1.1799201
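A sketch of a direct `nnet` fit with the tuned values reported above (size = 5, decay = .06); treating `nnet` (rather than, say, caret's `avNNet`) and `maxit = 500` as the actual settings is an assumption:

```r
library(nnet)

set.seed(200)
nnetModel <- nnet(trainingData$x, trainingData$y,
                  size = 5, decay = 0.06,
                  linout = TRUE,            # linear output units, per the output above
                  trace = FALSE, maxit = 500)
nnetModel                                   # prints "a 10-5-1 network with 61 weights"
postResample(predict(nnetModel, newdata = testData$x), testData$y)
```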
3. MARS
## Selected 12 of 18 terms, and 6 of 10 predictors
## Termination condition: Reached nk 21
## Importance: X1, X4, X2, X5, X3, X6, X7-unused, X8-unused, X9-unused, ...
## Number of terms at each degree of interaction: 1 11 (additive model)
## GCV 2.540556 RSS 397.9654 GRSq 0.8968524 RSq 0.9183982
## RMSE Rsquared MAE
## 1.8136467 0.8677298 1.3911836
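A sketch of the MARS fit via the `earth` package; the `Reached nk 21` termination above is consistent with an untuned call, so default settings are assumed:

```r
library(earth)

marsModel <- earth(trainingData$x, trainingData$y)
summary(marsModel)                          # term and predictor selection shown above

marsPred <- predict(marsModel, newdata = testData$x)[, 1]
postResample(marsPred, testData$y)
```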
4. SVM (linear and radial)
## Support Vector Machines with Radial Basis Function Kernel
##
## 200 samples
## 10 predictor
##
## Pre-processing: centered (10), scaled (10)
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 180, 180, 180, 180, 180, 180, ...
## Resampling results across tuning parameters:
##
## C RMSE Rsquared MAE
## 0.25 2.507298 0.7985476 1.988577
## 0.50 2.219032 0.8221279 1.723539
## 1.00 2.026322 0.8446958 1.569136
##
## Tuning parameter 'sigma' was held constant at a value of 0.06472009
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were sigma = 0.06472009 and C = 1.
## RMSE Rsquared MAE
## 2.2549651 0.8002404 1.7237658
## [1] ""
## [1] "___________________________________________"
## [1] ""
## Support Vector Machines with Linear Kernel
##
## 200 samples
## 10 predictor
##
## Pre-processing: centered (10), scaled (10)
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 180, 180, 180, 180, 180, 180, ...
## Resampling results:
##
## RMSE Rsquared MAE
## 2.443627 0.7597914 1.957003
##
## Tuning parameter 'C' was held constant at a value of 1
## RMSE Rsquared MAE
## 2.7633860 0.6973384 2.0970616
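A sketch of the two `caret` SVM fits (kernlab under the hood); the 10-fold CV and `tuneLength = 3` are inferred from the output, the seed is an assumption:

```r
set.seed(200)
ctrl <- trainControl(method = "cv", number = 10)

svmRadialModel <- train(x = trainingData$x, y = trainingData$y,
                        method = "svmRadial",
                        preProc = c("center", "scale"),
                        tuneLength = 3,     # C = 0.25, 0.5, 1; sigma estimated once
                        trControl = ctrl)
svmLinearModel <- train(x = trainingData$x, y = trainingData$y,
                        method = "svmLinear",
                        preProc = c("center", "scale"),
                        trControl = ctrl)   # C held at 1

postResample(predict(svmRadialModel, testData$x), testData$y)
postResample(predict(svmLinearModel, testData$x), testData$y)
```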
The models produce RMSEs between 1.5 and 3.2. Given that the mean y value is about 14 and the standard deviation is about 5, an RMSE of 1.5 is relatively low. That model (the neural net) also had an R-squared of .91, which suggests a reasonably good fit. MARS selected the first six predictors (X1 through X6) in this case, leaving X7 through X10 unused.
KNN is not generally considered a highly powerful model, so it is not surprising that its predictive power is the weakest here. The neural net was very sensitive to decay: at a decay of .06 it outperforms MARS, but at .01 it does not. The SVM models performed in between, with the radial kernel more accurate than the linear.
## Model RMSE RSquared
## A KNN 3.2 0.68
## B NeuralNet 1.5 0.91
## C MARS 1.8 0.87
## D SVM-Linear 2.8 0.7
## E SVM-Radial 2.1 0.83
Mean and standard deviation of the response:
## [1] 14.38613
## [1] 4.964588
Exercise 7.5
(a) Which nonlinear regression model gives the optimal resampling and test set performance?
1. KNN
## k-Nearest Neighbors
##
## 143 samples
## 57 predictor
##
## Pre-processing: centered (57)
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 143, 143, 143, 143, 143, 143, ...
## Resampling results across tuning parameters:
##
## k RMSE Rsquared MAE
## 5 0.6231081 0.5890175 0.4452523
## 7 0.6253624 0.5924293 0.4487492
## 9 0.6320160 0.5928172 0.4549737
## 11 0.6376085 0.5958979 0.4637498
## 13 0.6430328 0.5978215 0.4684199
## 15 0.6492103 0.6022657 0.4700675
## 17 0.6554245 0.6026970 0.4730035
## 19 0.6620176 0.6023006 0.4773928
## 21 0.6688298 0.6023563 0.4820142
## 23 0.6732539 0.6048079 0.4854839
##
## RMSE was used to select the optimal model using the smallest value.
## The final value used for the model was k = 5.
## RMSE Rsquared MAE
## 0.6453094 0.6883410 0.4320985
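A sketch of the corresponding setup for the chemical manufacturing data; the imputation method, split proportion, and seed are assumptions and should match whatever was done in homework 7:

```r
library(caret)
library(AppliedPredictiveModeling)
data(ChemicalManufacturingProcess)

# knnImpute fills in missing values and also centers/scales every column,
# which would explain the roughly standardized response summarized below
pp   <- preProcess(ChemicalManufacturingProcess, method = "knnImpute")
chem <- predict(pp, ChemicalManufacturingProcess)

set.seed(100)
inTrain <- createDataPartition(chem$Yield, p = 0.8, list = FALSE)
trainX <- chem[inTrain, -1];  trainY <- chem$Yield[inTrain]
testX  <- chem[-inTrain, -1]; testY  <- chem$Yield[-inTrain]

knnChem <- train(trainX, trainY, method = "knn",
                 preProc = "center", tuneLength = 10)
postResample(predict(knnChem, testX), testY)
```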
2. Neural Network
Decay = .03, size = 4.
## RMSE Rsquared MAE
## 0.5587323 0.7956238 0.3826069
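A sketch with the tuned values reported above (size = 4, decay = .03), again fitting `nnet` directly as an assumption:

```r
library(nnet)

set.seed(100)
nnetChem <- nnet(trainX, trainY,
                 size = 4, decay = 0.03, linout = TRUE,
                 trace = FALSE, maxit = 500)
postResample(predict(nnetChem, newdata = testX), testY)
```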
3. MARS
## Selected 13 of 21 terms, and 9 of 57 predictors
## Termination condition: RSq changed by less than 0.001 at 21 terms
## Importance: BiologicalMaterial06, BiologicalMaterial10, ...
## Number of terms at each degree of interaction: 1 12 (additive model)
## GCV 0.1155637 RSS 11.25251 GRSq 0.8768118 RSq 0.9149339
## RMSE Rsquared MAE
## 0.5833055 0.7457736 0.3888165
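A sketch of the MARS fit; the `RSq changed by less than 0.001` termination above again points to default `earth` settings:

```r
library(earth)

marsChem <- earth(trainX, trainY)
evimp(marsChem)                             # predictors kept by the pruned model
postResample(predict(marsChem, newdata = testX)[, 1], testY)
```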
4. SVM
## Support Vector Machines with Radial Basis Function Kernel
##
## 143 samples
## 57 predictor
##
## Pre-processing: centered (57)
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 129, 127, 128, 129, 129, 128, ...
## Resampling results across tuning parameters:
##
## C RMSE Rsquared MAE
## 0.25 0.6002364 0.6847794 0.4089570
## 0.50 0.5507585 0.7084697 0.3705803
## 1.00 0.5273432 0.7107396 0.3509539
##
## Tuning parameter 'sigma' was held constant at a value of 0.01291832
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were sigma = 0.01291832 and C = 1.
## RMSE Rsquared MAE
## 0.6000082 0.7494038 0.3813698
## [1] ""
## [1] "___________________________________________"
## [1] ""
## Support Vector Machines with Linear Kernel
##
## 143 samples
## 57 predictor
##
## Pre-processing: centered (57), scaled (57)
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 128, 128, 131, 130, 128, 128, ...
## Resampling results:
##
## RMSE Rsquared MAE
## 1.100034 0.7018634 0.4937001
##
## Tuning parameter 'C' was held constant at a value of 1
## RMSE Rsquared MAE
## 0.9539542 0.4053442 0.4620395
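The analogous `caret` SVM calls for this data set (the output shows the radial fit centered only, the linear fit centered and scaled); the seed is an assumption:

```r
set.seed(100)
ctrlChem <- trainControl(method = "cv", number = 10)

svmRadialChem <- train(trainX, trainY, method = "svmRadial",
                       preProc = "center", tuneLength = 3,
                       trControl = ctrlChem)
svmLinearChem <- train(trainX, trainY, method = "svmLinear",
                       preProc = c("center", "scale"),
                       trControl = ctrlChem)

postResample(predict(svmRadialChem, testX), testY)
postResample(predict(svmLinearChem, testX), testY)
```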
The models produce RMSEs between .56 and .65 (with an outlier of .95 for the linear SVM). Given that the response has a mean of about 0 and a standard deviation of 1.15, an RMSE of .56 is relatively low. The best model (again the neural net) had an R-squared of .80, which also suggests a reasonably good fit.
With the exception of KNN and the linear SVM, all of the models fall within a small range. The lasso model fit for homework 7 is also a strong contender, with scores very comparable to the neural net's.
## Model RMSE RSquared
## A KNN 0.65 0.69
## B NeuralNet 0.56 0.8
## C MARS 0.58 0.75
## D SVM-Linear 0.95 0.4
## E SVM-Radial 0.6 0.75
## F Lasso 0.57 0.8
Mean and standard deviation of the response:
## [1] -0.0278816
## [1] 1.154999
(b) Which predictors are most important in the optimal nonlinear regression model? Do either the biological or process variables dominate the list? How do the top ten important predictors compare to the top ten predictors from the optimal linear model?
The neural net model puts more process predictors at the forefront than the lasso model does. While the two models' results are similar, there is little overlap between the sets of predictors they chose.
## Overall
## X10 4.091391
## X1 3.693592
## X17 3.367948
## X55 3.229145
## X32 3.110172
## X51 2.877768
## X11 2.842688
## X42 2.716389
## X13 2.695383
## X39 2.689269
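A sketch of how a table like the one above could be produced; `caret`'s `varImp` has a method for `nnet` objects, and the ordering/`head` step is an assumption about presentation:

```r
imp <- varImp(nnetChem)                     # varImp.nnet, based on network weights
head(imp[order(-imp$Overall), , drop = FALSE], 10)
```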
(c) Explore the relationships between the top predictors and the response for the predictors that are unique to the optimal nonlinear regression model. Do these plots reveal intuition about the biological or process predictors and their relationship with yield?
Most of the biological predictors are highly correlated with yield; for the manufacturing predictors, however, r is low and p is high. This suggests that the model finds hidden relationships more important than individual correlations. Because of the high degree of multicollinearity, the model is looking for the unique set of predictors that explains variance, which may mean certain predictors matter only in combination with others.
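A sketch of one way to explore these relationships, assuming the predictors were renamed X1..X57 for the neural net fit (as the importance table above suggests) and using a few of its top names:

```r
topVars <- c("X10", "X1", "X17", "X55", "X32")   # from the varImp output above

# Pearson r and p-value of each top predictor against yield
for (v in topVars) {
  ct <- cor.test(trainX[[v]], trainY)
  cat(sprintf("%-4s r = %6.3f  p = %.4f\n", v, ct$estimate, ct$p.value))
}

# Scatterplots with smoothers for a visual check
featurePlot(trainX[, topVars], trainY, type = c("p", "smooth"))
```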