Do problems 7.2 and 7.5 in Kuhn and Johnson. There are only two but they have many parts. Please submit both a link to your Rpubs and the .rmd file.
library(lattice)
library(pls)
##
## Attaching package: 'pls'
## The following object is masked from 'package:stats':
##
## loadings
library(glmnet)
## Loading required package: Matrix
## Loaded glmnet 4.1-8
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(AppliedPredictiveModeling)
library(caret)
## Loading required package: ggplot2
##
## Attaching package: 'caret'
## The following object is masked from 'package:pls':
##
## R2
Which models appear to give the best performance? Does MARS select the informative predictors (those named X1–X5)?
library(mlbench)
set.seed(200)
trainingData <- mlbench.friedman1(200, sd = 1)
## We convert the 'x' data from a matrix to a data frame
## One reason is that this will give the columns names
trainingData$x <- data.frame(trainingData$x)
## Look at the data using
featurePlot(trainingData$x, trainingData$y)
## or other methods.
## This creates a list with a vector 'y' and a matrix
## of predictors 'x'. Also simulate a large test set to
## estimate the true error rate with good precision:
testData <- mlbench.friedman1(5000, sd = 1)
testData$x <- data.frame(testData$x)
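For context, `mlbench.friedman1` simulates the Friedman 1 benchmark: ten independent uniform predictors, of which only the first five enter the response. The base-R sketch below writes out that data-generating process as the mlbench documentation describes it. The helper name `simulate_friedman1` is hypothetical, and the sketch is an illustration rather than the package's own implementation.

```r
# Sketch of the Friedman 1 benchmark: 10 uniform predictors, only X1-X5 matter.
# `simulate_friedman1` is a hypothetical helper, not part of mlbench.
simulate_friedman1 <- function(n, sd = 1) {
  x <- matrix(runif(n * 10), ncol = 10)
  colnames(x) <- paste0("X", 1:10)
  signal <- 10 * sin(pi * x[, 1] * x[, 2]) +
    20 * (x[, 3] - 0.5)^2 +
    10 * x[, 4] +
    5 * x[, 5]
  list(x = as.data.frame(x), y = signal + rnorm(n, sd = sd), signal = signal)
}

set.seed(1)
sim <- simulate_friedman1(200, sd = 1)
# X6-X10 are pure noise, so their correlation with y should be close to zero
round(cor(sim$x, sim$y)[, 1], 2)
```

Because X6 through X10 never appear in the formula, a model that selects only X1 through X5 has recovered the informative predictors.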
Let’s try some models:
# Try models:
# KNN
library(caret)
knnModel <- train(x = trainingData$x,
                  y = trainingData$y,
                  method = "knn",
                  preProc = c("center", "scale"),
                  tuneLength = 10)
knnModel
## k-Nearest Neighbors
##
## 200 samples
## 10 predictor
##
## Pre-processing: centered (10), scaled (10)
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 200, 200, 200, 200, 200, 200, ...
## Resampling results across tuning parameters:
##
## k RMSE Rsquared MAE
## 5 3.466085 0.5121775 2.816838
## 7 3.349428 0.5452823 2.727410
## 9 3.264276 0.5785990 2.660026
## 11 3.214216 0.6024244 2.603767
## 13 3.196510 0.6176570 2.591935
## 15 3.184173 0.6305506 2.577482
## 17 3.183130 0.6425367 2.567787
## 19 3.198752 0.6483184 2.592683
## 21 3.188993 0.6611428 2.588787
## 23 3.200458 0.6638353 2.604529
##
## RMSE was used to select the optimal model using the smallest value.
## The final value used for the model was k = 17.
knnModel$finalModel
## 17-nearest neighbor regression model
knnPred <- predict(knnModel, newdata = testData$x)
## The function 'postResample' can be used to get the test set
## performance values
postResample(pred = knnPred, obs = testData$y)
## RMSE Rsquared MAE
## 3.2040595 0.6819919 2.5683461
KNN's optimal model (k = 17) had a resampled RMSE of 3.183130 and an Rsquared of 0.6425367. Explaining about 64% of the variance is not a strong result.
The test set performance values are an RMSE of 3.2040595 and an Rsquared of 0.6819919, close to the resampled estimates.
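To make the three numbers reported by `postResample` concrete, here is a base-R sketch of the same statistics. It assumes, as the caret documentation describes, that Rsquared is the squared correlation between predictions and observations. The helper name `regression_metrics` and the toy vectors are hypothetical.

```r
# Sketch of the three summary statistics, written out in base R.
# `regression_metrics` is a hypothetical helper name.
regression_metrics <- function(pred, obs) {
  c(RMSE = sqrt(mean((obs - pred)^2)),   # root mean squared error
    Rsquared = cor(pred, obs)^2,         # squared correlation
    MAE = mean(abs(obs - pred)))         # mean absolute error
}

# Made-up values for illustration only
obs  <- c(10, 12, 15, 20)
pred <- c(11, 12, 14, 18)
regression_metrics(pred, obs)
```

Since Rsquared here is a squared correlation, it measures how well predictions track the observations, not whether they are on the right scale. That is why it is worth reading alongside RMSE.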
Let’s try Neural Networks:
## Neural Networks
# Try with different hidden layer values
nnetGrid <- expand.grid(decay = c(0, 0.01, .1),
                        size = c(1, 3, 5, 7, 9, 11, 13),
                        bag = FALSE)
set.seed(100)
nnetTune <- train(x = trainingData$x, y = trainingData$y,
                  method = "avNNet",
                  tuneGrid = nnetGrid,
                  preProc = c("center", "scale"),
                  linout = TRUE,
                  trace = FALSE,
                  MaxNWts = 13 * (ncol(trainingData$x) + 1) + 13 + 1,
                  maxit = 1000,
                  allowParallel = FALSE)
nnetTune
## Model Averaged Neural Network
##
## 200 samples
## 10 predictor
##
## Pre-processing: centered (10), scaled (10)
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 200, 200, 200, 200, 200, 200, ...
## Resampling results across tuning parameters:
##
## decay size RMSE Rsquared MAE
## 0.00 1 2.591292 0.7404870 2.044739
## 0.00 3 2.416654 0.7693978 1.882941
## 0.00 5 3.551628 0.6331502 2.506577
## 0.00 7 5.141500 0.4508978 3.338480
## 0.00 9 4.141441 0.5326838 2.757371
## 0.00 11 2.834577 0.6924494 2.248478
## 0.00 13 2.731770 0.7169142 2.160916
## 0.01 1 2.550749 0.7477407 2.000384
## 0.01 3 2.345852 0.7845572 1.865673
## 0.01 5 2.503629 0.7573536 1.988640
## 0.01 7 2.770058 0.7131783 2.180200
## 0.01 9 2.774937 0.7082923 2.217840
## 0.01 11 2.655150 0.7329095 2.105368
## 0.01 13 2.473508 0.7608801 1.958684
## 0.10 1 2.534355 0.7496508 1.980586
## 0.10 3 2.344816 0.7821212 1.870001
## 0.10 5 2.381380 0.7787366 1.889124
## 0.10 7 2.506458 0.7599062 1.998846
## 0.10 9 2.502286 0.7550033 1.991627
## 0.10 11 2.508369 0.7534897 1.973798
## 0.10 13 2.325015 0.7865549 1.841374
##
## Tuning parameter 'bag' was held constant at a value of FALSE
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were size = 13, decay = 0.1 and bag = FALSE.
nnetTune$finalModel
## Model Averaged Neural Network with 5 Repeats
##
## a 10-13-1 network with 157 weights
## options were - linear output units decay=0.1
plot(nnetTune)
testResults <- data.frame(obs = testData$y,
                          NNet = predict(nnetTune, testData$x))
nnetPred <- predict(nnetTune, newdata = testData$x)
## The function 'postResample' can be used to get the test set
## performance values
postResample(pred = nnetPred, obs = testData$y)
## RMSE Rsquared MAE
## 2.3265643 0.7811706 1.7869748
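The averaged neural network's optimal model (size = 13, decay = 0.1) had a resampled RMSE of 2.325015 and an Rsquared of 0.7865549.
The test set performance values are an RMSE of 2.3265643 and an Rsquared of 0.7811706, almost identical to the resampled estimates.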
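The `MaxNWts` expression in the call above is the weight count of the largest network in the grid. As a small worked check, the sketch below computes that count for a single-hidden-layer network with one linear output. The helper name `n_weights` is hypothetical.

```r
# Number of weights in a single-hidden-layer network with one output unit:
# each hidden unit has p input weights plus a bias, and the output unit has
# one weight per hidden unit plus a bias.
# `n_weights` is a hypothetical helper name.
n_weights <- function(p, size) {
  size * (p + 1) + size + 1
}

n_weights(p = 10, size = 13)  # 157, matching "a 10-13-1 network with 157 weights"
```

Setting `MaxNWts` to this value for the largest `size` in the grid leaves room for every candidate network to be fitted.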
Now let’s try MARS:
# MARS
set.seed(100)
marsTune <- train(x = trainingData$x, y = trainingData$y,
                  method = "earth",
                  tuneGrid = expand.grid(degree = 1:3, nprune = 2:38))
## Loading required package: earth
## Loading required package: Formula
## Loading required package: plotmo
## Loading required package: plotrix
marsTune
## Multivariate Adaptive Regression Spline
##
## 200 samples
## 10 predictor
##
## No pre-processing
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 200, 200, 200, 200, 200, 200, ...
## Resampling results across tuning parameters:
##
## degree nprune RMSE Rsquared MAE
## 1 2 4.476111 0.2204191 3.673471
## 1 3 3.785524 0.4453226 3.060822
## 1 4 2.888006 0.6739447 2.299054
## 1 5 2.617975 0.7305555 2.088069
## 1 6 2.503672 0.7565683 1.993318
## 1 7 2.118338 0.8216259 1.677739
## 1 8 1.929299 0.8520020 1.524792
## 1 9 1.831854 0.8673078 1.449707
## 1 10 1.803948 0.8715808 1.427875
## 1 11 1.794652 0.8728269 1.415841
## 1 12 1.812615 0.8703780 1.433389
## 1 13 1.811880 0.8707304 1.430757
## 1 14 1.825082 0.8688957 1.443386
## 1 15 1.837181 0.8672550 1.449804
## 1 16 1.854541 0.8647653 1.464247
## 1 17 1.853712 0.8648111 1.461572
## 1 18 1.853712 0.8648111 1.461572
## 1 19 1.853712 0.8648111 1.461572
## 1 20 1.853712 0.8648111 1.461572
## 1 21 1.853712 0.8648111 1.461572
## 1 22 1.853712 0.8648111 1.461572
## 1 23 1.853712 0.8648111 1.461572
## 1 24 1.853712 0.8648111 1.461572
## 1 25 1.853712 0.8648111 1.461572
## 1 26 1.853712 0.8648111 1.461572
## 1 27 1.853712 0.8648111 1.461572
## 1 28 1.853712 0.8648111 1.461572
## 1 29 1.853712 0.8648111 1.461572
## 1 30 1.853712 0.8648111 1.461572
## 1 31 1.853712 0.8648111 1.461572
## 1 32 1.853712 0.8648111 1.461572
## 1 33 1.853712 0.8648111 1.461572
## 1 34 1.853712 0.8648111 1.461572
## 1 35 1.853712 0.8648111 1.461572
## 1 36 1.853712 0.8648111 1.461572
## 1 37 1.853712 0.8648111 1.461572
## 1 38 1.853712 0.8648111 1.461572
## 2 2 4.476111 0.2204191 3.673471
## 2 3 3.785524 0.4453226 3.060822
## 2 4 2.903053 0.6705805 2.306446
## 2 5 2.568112 0.7406424 2.044942
## 2 6 2.454281 0.7643365 1.933426
## 2 7 2.126069 0.8214939 1.680926
## 2 8 1.984433 0.8444773 1.569823
## 2 9 1.833055 0.8679464 1.452162
## 2 10 1.764117 0.8777022 1.393074
## 2 11 1.609337 0.8974280 1.276747
## 2 12 1.505974 0.9094424 1.192012
## 2 13 1.529548 0.9067647 1.192927
## 2 14 1.538968 0.9058228 1.198838
## 2 15 1.523385 0.9077521 1.188052
## 2 16 1.541811 0.9057796 1.200075
## 2 17 1.528599 0.9071459 1.190484
## 2 18 1.530678 0.9070504 1.194269
## 2 19 1.533510 0.9067789 1.196159
## 2 20 1.534539 0.9066600 1.196707
## 2 21 1.531484 0.9070318 1.194709
## 2 22 1.531484 0.9070318 1.194709
## 2 23 1.531484 0.9070318 1.194709
## 2 24 1.531484 0.9070318 1.194709
## 2 25 1.531484 0.9070318 1.194709
## 2 26 1.531484 0.9070318 1.194709
## 2 27 1.531484 0.9070318 1.194709
## 2 28 1.531484 0.9070318 1.194709
## 2 29 1.531484 0.9070318 1.194709
## 2 30 1.531484 0.9070318 1.194709
## 2 31 1.531484 0.9070318 1.194709
## 2 32 1.531484 0.9070318 1.194709
## 2 33 1.531484 0.9070318 1.194709
## 2 34 1.531484 0.9070318 1.194709
## 2 35 1.531484 0.9070318 1.194709
## 2 36 1.531484 0.9070318 1.194709
## 2 37 1.531484 0.9070318 1.194709
## 2 38 1.531484 0.9070318 1.194709
## 3 2 4.476111 0.2204191 3.673471
## 3 3 3.785524 0.4453226 3.060822
## 3 4 2.903053 0.6705805 2.306446
## 3 5 2.568112 0.7406424 2.044942
## 3 6 2.454281 0.7643365 1.933426
## 3 7 2.120578 0.8222940 1.676387
## 3 8 1.969539 0.8458646 1.550942
## 3 9 1.827937 0.8681566 1.442770
## 3 10 1.756453 0.8786655 1.382025
## 3 11 1.606629 0.8979073 1.274194
## 3 12 1.543764 0.9057062 1.211520
## 3 13 1.534280 0.9065041 1.200634
## 3 14 1.559588 0.9035305 1.214722
## 3 15 1.541923 0.9054369 1.199061
## 3 16 1.560220 0.9034803 1.210911
## 3 17 1.568854 0.9021719 1.214138
## 3 18 1.567686 0.9026102 1.214731
## 3 19 1.570078 0.9024163 1.214594
## 3 20 1.570901 0.9022869 1.215046
## 3 21 1.567846 0.9026587 1.213048
## 3 22 1.567846 0.9026587 1.213048
## 3 23 1.567846 0.9026587 1.213048
## 3 24 1.567846 0.9026587 1.213048
## 3 25 1.567846 0.9026587 1.213048
## 3 26 1.567846 0.9026587 1.213048
## 3 27 1.567846 0.9026587 1.213048
## 3 28 1.567846 0.9026587 1.213048
## 3 29 1.567846 0.9026587 1.213048
## 3 30 1.567846 0.9026587 1.213048
## 3 31 1.567846 0.9026587 1.213048
## 3 32 1.567846 0.9026587 1.213048
## 3 33 1.567846 0.9026587 1.213048
## 3 34 1.567846 0.9026587 1.213048
## 3 35 1.567846 0.9026587 1.213048
## 3 36 1.567846 0.9026587 1.213048
## 3 37 1.567846 0.9026587 1.213048
## 3 38 1.567846 0.9026587 1.213048
##
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were nprune = 12 and degree = 2.
# Final Model Info
marsTune$finalModel
## Selected 11 of 18 terms, and 5 of 10 predictors (nprune=12)
## Termination condition: Reached nk 21
## Importance: X1, X4, X2, X5, X3, X6-unused, X7-unused, X8-unused, X9-unused, ...
## Number of terms at each degree of interaction: 1 8 2
## GCV 1.747495 RSS 264.5358 GRSq 0.929051 RSq 0.9457576
plot(marsTune)
# Comment this out for now
#predict(marsTune, testData$x)
marsImp <- varImp(marsTune, scale = FALSE)
plot(marsImp, top = 25)
marsTunePred <- predict(marsTune, testData$x)
## The function 'postResample' can be used to get the test set
## performance values
postResample(pred = marsTunePred, obs = testData$y)
## RMSE Rsquared MAE
## 1.2803060 0.9335241 1.0168673
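Yes, MARS selected X1, X4, X2, X5, and X3 as the important predictors and used only 5 of the 10 predictors, leaving the uninformative ones out. The optimal model (nprune = 12, degree = 2) had a resampled RMSE of 1.505974 and an Rsquared of 0.9094424. The test set performance values are an RMSE of 1.2803060 and an Rsquared of 0.9335241.
MARS produced the best model, with the lowest RMSE and the highest Rsquared. Less than 10% of the variance is left unexplained, which is a good result.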
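MARS builds its terms from hinge functions, and a degree-2 model also allows products of two hinges on different predictors. The final model summary above reports 1 intercept, 8 first-degree terms and 2 second-degree terms. The sketch below shows what such terms look like. The helper `hinge` and the knot values are made up for illustration and are not taken from the fitted model.

```r
# Hinge functions are the building blocks of MARS terms.
# `hinge` is a hypothetical helper; the knots 0.5 and 0.25 are illustrative only.
hinge <- function(x, knot, right = TRUE) {
  if (right) pmax(0, x - knot) else pmax(0, knot - x)
}

x <- seq(0, 1, by = 0.25)
hinge(x, knot = 0.5)                 # 0.00 0.00 0.00 0.25 0.50
hinge(x, knot = 0.5, right = FALSE)  # 0.50 0.25 0.00 0.00 0.00

# A degree-2 term is a product of hinges on two different predictors
x_other <- rev(x)
hinge(x, 0.5) * hinge(x_other, 0.25)
```

A predictor that never appears in any retained hinge term is reported as unused, which is how MARS performs feature selection as a side effect of fitting.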
Let’s try SVM:
# Support Vector Machines
# Radial Basis
set.seed(100)
svmRTune <- train(x = trainingData$x, y = trainingData$y,
                  method = "svmRadial",
                  preProc = c("center", "scale"),
                  tuneLength = 14)
svmRTune
## Support Vector Machines with Radial Basis Function Kernel
##
## 200 samples
## 10 predictor
##
## Pre-processing: centered (10), scaled (10)
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 200, 200, 200, 200, 200, 200, ...
## Resampling results across tuning parameters:
##
## C RMSE Rsquared MAE
## 0.25 2.599779 0.7770102 2.063152
## 0.50 2.375217 0.7936710 1.876608
## 1.00 2.222896 0.8107192 1.751399
## 2.00 2.113595 0.8249813 1.659187
## 4.00 2.060853 0.8320435 1.609666
## 8.00 2.049809 0.8339087 1.600139
## 16.00 2.047569 0.8343621 1.599618
## 32.00 2.047569 0.8343621 1.599618
## 64.00 2.047569 0.8343621 1.599618
## 128.00 2.047569 0.8343621 1.599618
## 256.00 2.047569 0.8343621 1.599618
## 512.00 2.047569 0.8343621 1.599618
## 1024.00 2.047569 0.8343621 1.599618
## 2048.00 2.047569 0.8343621 1.599618
##
## Tuning parameter 'sigma' was held constant at a value of 0.06509124
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were sigma = 0.06509124 and C = 16.
svmRTune$finalModel
## Support Vector Machine object of class "ksvm"
##
## SV type: eps-svr (regression)
## parameter : epsilon = 0.1 cost C = 16
##
## Gaussian Radial Basis kernel function.
## Hyperparameter : sigma = 0.0650912392130387
##
## Number of Support Vectors : 152
##
## Objective Function Value : -70.1788
## Training error : 0.008527
plot(svmRTune, scales = list(x = list(log = 2)))
# Tuning grid for the polynomial kernel fitted further down
svmGrid <- expand.grid(degree = 1:2,
                       scale = c(0.01, 0.005, 0.001),
                       C = 2^(-2:5))
# Most important variable predictors
svmRImp <- varImp(svmRTune, scale = FALSE)
plot(svmRImp, top = 25)
svmRTunePred <- predict(svmRTune, testData$x)
## The function 'postResample' can be used to get the test set
## performance values
postResample(pred = svmRTunePred, obs = testData$y)
## RMSE Rsquared MAE
## 2.0788958 0.8248396 1.5792687
# Polynomial
set.seed(100)
svmPTune <- train(x = trainingData$x, y = trainingData$y,
                  method = "svmPoly",
                  preProc = c("center", "scale"),
                  tuneGrid = svmGrid)
svmPTune
## Support Vector Machines with Polynomial Kernel
##
## 200 samples
## 10 predictor
##
## Pre-processing: centered (10), scaled (10)
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 200, 200, 200, 200, 200, 200, ...
## Resampling results across tuning parameters:
##
## degree scale C RMSE Rsquared MAE
## 1 0.001 0.25 4.862920 0.7155465 4.004585
## 1 0.001 0.50 4.697906 0.7176711 3.863205
## 1 0.001 1.00 4.400200 0.7213558 3.604238
## 1 0.001 2.00 3.901315 0.7289326 3.180745
## 1 0.001 4.00 3.254841 0.7341450 2.663649
## 1 0.001 8.00 2.771845 0.7413433 2.258568
## 1 0.001 16.00 2.622436 0.7408603 2.125900
## 1 0.001 32.00 2.578572 0.7410913 2.071013
## 1 0.005 0.25 4.262548 0.7239938 3.484594
## 1 0.005 0.50 3.703935 0.7308785 3.021418
## 1 0.005 1.00 3.057672 0.7376830 2.503690
## 1 0.005 2.00 2.703878 0.7415164 2.201463
## 1 0.005 4.00 2.602981 0.7405541 2.106804
## 1 0.005 8.00 2.576439 0.7410259 2.065790
## 1 0.005 16.00 2.575065 0.7413928 2.057777
## 1 0.005 32.00 2.578020 0.7420024 2.050418
## 1 0.010 0.25 3.703918 0.7308543 3.021403
## 1 0.010 0.50 3.057681 0.7376812 2.503697
## 1 0.010 1.00 2.703862 0.7415153 2.201458
## 1 0.010 2.00 2.602947 0.7405566 2.106773
## 1 0.010 4.00 2.576329 0.7410363 2.065722
## 1 0.010 8.00 2.575253 0.7413771 2.057924
## 1 0.010 16.00 2.578125 0.7419991 2.050585
## 1 0.010 32.00 2.581512 0.7422381 2.050168
## 2 0.001 0.25 4.697916 0.7176996 3.863224
## 2 0.001 0.50 4.400196 0.7214218 3.604242
## 2 0.001 1.00 3.901199 0.7290208 3.180632
## 2 0.001 2.00 3.254169 0.7344170 2.663125
## 2 0.001 4.00 2.770573 0.7417223 2.257680
## 2 0.001 8.00 2.619500 0.7415846 2.124118
## 2 0.001 16.00 2.569849 0.7427953 2.064056
## 2 0.001 32.00 2.558191 0.7447051 2.045942
## 2 0.005 0.25 3.702765 0.7317002 3.020476
## 2 0.005 0.50 3.054604 0.7388475 2.501021
## 2 0.005 1.00 2.692157 0.7443012 2.192689
## 2 0.005 2.00 2.580493 0.7455833 2.089917
## 2 0.005 4.00 2.530526 0.7503904 2.030787
## 2 0.005 8.00 2.494742 0.7567411 1.991982
## 2 0.005 16.00 2.443603 0.7667976 1.942870
## 2 0.005 32.00 2.361202 0.7820982 1.858781
## 2 0.010 0.25 3.051325 0.7400793 2.498238
## 2 0.010 0.50 2.681190 0.7467634 2.184163
## 2 0.010 1.00 2.560269 0.7502383 2.073691
## 2 0.010 2.00 2.495690 0.7572675 2.002563
## 2 0.010 4.00 2.434944 0.7679398 1.938777
## 2 0.010 8.00 2.358201 0.7823336 1.858808
## 2 0.010 16.00 2.278112 0.7974784 1.777278
## 2 0.010 32.00 2.197354 0.8098275 1.704964
##
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were degree = 2, scale = 0.01 and C = 32.
svmPTune$finalModel
## Support Vector Machine object of class "ksvm"
##
## SV type: eps-svr (regression)
## parameter : epsilon = 0.1 cost C = 32
##
## Polynomial kernel function.
## Hyperparameters : degree = 2 scale = 0.01 offset = 1
##
## Number of Support Vectors : 157
##
## Objective Function Value : -1143.441
## Training error : 0.092611
plot(svmPTune,
     scales = list(x = list(log = 2),
                   between = list(x = .5, y = 1)))
svmPTunePred <- predict(svmPTune, testData$x)
## The function 'postResample' can be used to get the test set
## performance values
postResample(pred = svmPTunePred, obs = testData$y)
## RMSE Rsquared MAE
## 2.238478 0.800259 1.663366
For SVM Radial, the optimal model (C = 16) had a resampled RMSE of 2.047569 and an Rsquared of 0.8343621. This is a fairly good RMSE. The test set performance values are an RMSE of 2.0788958 and an Rsquared of 0.8248396, so the model performed similarly well on the test set.
For SVM Polynomial, the optimal model (degree = 2, scale = 0.01, C = 32) had a resampled RMSE of 2.197354 and an Rsquared of 0.8098275. This is also a fairly good RMSE. The test set performance values are an RMSE of 2.238478 and an Rsquared of 0.800259.
MARS produced the lowest RMSE and the highest Rsquared of all the models, both in resampling and on the test set, so it is the best fit for these data.
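To compare the models side by side, the test set values printed by `postResample` above can be collected into one table. This is a small sketch: the numbers are copied from the output in this report, and the object name `test_perf` is hypothetical.

```r
# Test-set results copied from the postResample() output above
test_perf <- data.frame(
  model    = c("KNN", "avNNet", "MARS", "SVM radial", "SVM polynomial"),
  RMSE     = c(3.2040595, 2.3265643, 1.2803060, 2.0788958, 2.238478),
  Rsquared = c(0.6819919, 0.7811706, 0.9335241, 0.8248396, 0.800259)
)

# Order from best (lowest RMSE) to worst
test_perf[order(test_perf$RMSE), ]
```

Sorting by RMSE puts MARS first, followed by the radial SVM, the polynomial SVM, the averaged neural network, and KNN.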
Exercise 6.3 describes data for a chemical manufacturing process. Use the same data imputation, data splitting, and pre-processing steps as before and train several nonlinear regression models.
data(ChemicalManufacturingProcess)
ChemicalManufacturingProcessCopy <- ChemicalManufacturingProcess
# Imputation - replace each missing value with its column mean
for (i in seq_len(ncol(ChemicalManufacturingProcessCopy))) {
  missing_rows <- is.na(ChemicalManufacturingProcessCopy[, i])
  ChemicalManufacturingProcessCopy[missing_rows, i] <- mean(ChemicalManufacturingProcessCopy[, i], na.rm = TRUE)
}
# Split the data: the first 110 rows are for training, the rest for testing
# (despite the _x suffix, these sets still contain the Yield column)
chem_train_set_x_df <- as.data.frame(ChemicalManufacturingProcessCopy[1:110, ])
chem_train_set_x <- ChemicalManufacturingProcessCopy[1:110, ]
chem_test_set_x <- ChemicalManufacturingProcessCopy[111:nrow(ChemicalManufacturingProcessCopy), ]
# Training rows taken from the original data, before imputation
chem_train_set_y_df <- as.data.frame(ChemicalManufacturingProcess[1:110, ])
chem_train_set_y <- ChemicalManufacturingProcess[1:110, ]
# choose response variable
y <- chem_train_set_x$Yield
# all the predictors need to be put into a matrix
x <- as.matrix(chem_train_set_x[, -which(names(chem_train_set_x) == "Yield")])
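The imputation loop above computes each column mean over all rows, so values from the test rows influence what is imputed into the training rows. One possible alternative, shown here only as a sketch on made-up data, is to estimate the means from the training rows and apply them everywhere. The objects `toy`, `train_rows`, `train_means` and `imputed` are hypothetical and are not used elsewhere in this report.

```r
# Hypothetical toy data standing in for the process data
toy <- data.frame(a = c(1, NA, 3, 4, NA, 6),
                  b = c(NA, 2, 2, 2, 5, NA))
train_rows <- 1:4

# Column means estimated from the training rows only
train_means <- colMeans(toy[train_rows, ], na.rm = TRUE)

# Fill every missing value with the training mean of its column
imputed <- toy
for (col in names(imputed)) {
  imputed[[col]][is.na(imputed[[col]])] <- train_means[[col]]
}
imputed
```

With so few missing values the difference may be small in practice, but this version keeps the test rows out of the preprocessing step.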
Let’s try KNN first:
# KNN
knnModel <- train(x = x,
                  y = y,
                  method = "knn",
                  preProc = c("center", "scale"),
                  tuneLength = 10)
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
knnModel
## k-Nearest Neighbors
##
## 110 samples
## 57 predictor
##
## Pre-processing: centered (57), scaled (57)
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 110, 110, 110, 110, 110, 110, ...
## Resampling results across tuning parameters:
##
## k RMSE Rsquared MAE
## 5 1.244868 0.5817738 1.000147
## 7 1.264645 0.5848482 1.018950
## 9 1.264177 0.5929734 1.007341
## 11 1.284206 0.5848121 1.023910
## 13 1.290174 0.5916196 1.036305
## 15 1.311579 0.5829704 1.051650
## 17 1.321128 0.5808776 1.061011
## 19 1.333719 0.5772114 1.068976
## 21 1.343080 0.5778280 1.078028
## 23 1.358634 0.5729131 1.092949
##
## RMSE was used to select the optimal model using the smallest value.
## The final value used for the model was k = 5.
knnModel$finalModel
## 5-nearest neighbor regression model
chem_test_y <- chem_test_set_x[, which(names(chem_test_set_x) == "Yield")]
chem_test_x <- as.matrix(chem_test_set_x[, -which(names(chem_test_set_x) == "Yield")])
knnPred <- predict(knnModel, newdata = chem_test_x)
## The function 'postResample' can be used to get the test set
## performance values
postResample(pred = knnPred, obs = chem_test_y)
## RMSE Rsquared MAE
## 1.84291993 0.05430223 1.47330303
For KNN, the optimal model (k = 5) had a resampled RMSE of 1.244868 and an Rsquared of 0.5817738.
The test set performance values are an RMSE of 1.84291993 and an Rsquared of 0.05430223. This is a very low Rsquared value: on the test set most of the variance is unaccounted for, so the model performed far worse there than the resampling estimates suggested.
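The `preProcess` warning printed during training says that BiologicalMaterial07 has zero variance in the data being centered and scaled. A constant column carries no information and cannot be scaled meaningfully, so one option is to drop such columns beforehand. The sketch below does this in base R on a made-up matrix; caret's `nearZeroVar` serves a similar purpose. The objects `m`, `zero_var` and `m_clean` are hypothetical.

```r
# Hypothetical toy matrix; the column "const" has a single unique value
m <- cbind(a     = c(1, 2, 3, 4),
           const = c(100, 100, 100, 100),
           b     = c(2, 2, 3, 5))

# Flag columns whose variance is exactly zero
zero_var <- apply(m, 2, var) == 0
names(which(zero_var))

# Drop them before centering and scaling
m_clean <- m[, !zero_var, drop = FALSE]
colnames(m_clean)
```

Note that a column can be constant within a bootstrap resample even when it varies slightly in the full training set, so a check on the full matrix may not silence every warning.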
Now let’s try neural networks:
## Neural Networks
nnetGrid <- expand.grid(decay = c(0, 0.01, .1),
                        size = c(1, 3, 5, 7, 9, 11, 13),
                        bag = FALSE)
set.seed(100)
nnetTune <- train(x = x, y = y,
                  method = "avNNet",
                  tuneGrid = nnetGrid,
                  preProc = c("center", "scale"),
                  linout = TRUE,
                  trace = FALSE,
                  MaxNWts = 13 * (ncol(x) + 1) + 13 + 1,
                  maxit = 1000,
                  allowParallel = FALSE)
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
nnetTune
## Model Averaged Neural Network
##
## 110 samples
## 57 predictor
##
## Pre-processing: centered (57), scaled (57)
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 110, 110, 110, 110, 110, 110, ...
## Resampling results across tuning parameters:
##
## decay size RMSE Rsquared MAE
## 0.00 1 2.221483 0.39428144 1.595476
## 0.00 3 1.469018 0.45830245 1.165347
## 0.00 5 2.126823 0.31766378 1.676008
## 0.00 7 2.098475 0.37979407 1.670170
## 0.00 9 3.144894 0.27626480 2.432084
## 0.00 11 5.823046 0.18118225 4.188792
## 0.00 13 9.715145 0.07074765 6.461437
## 0.01 1 2.540264 0.36279887 1.917291
## 0.01 3 2.471412 0.36209689 1.714530
## 0.01 5 2.433967 0.36890508 1.544417
## 0.01 7 2.249112 0.38440066 1.398157
## 0.01 9 2.114612 0.36244068 1.375064
## 0.01 11 2.123678 0.33871893 1.384059
## 0.01 13 1.987337 0.36755565 1.396594
## 0.10 1 2.511255 0.33675326 1.732347
## 0.10 3 2.642941 0.34464525 1.696917
## 0.10 5 2.573284 0.36049575 1.602714
## 0.10 7 2.457101 0.37785063 1.500059
## 0.10 9 2.252513 0.41132388 1.369509
## 0.10 11 2.120289 0.42678727 1.278255
## 0.10 13 2.015254 0.42952874 1.223559
##
## Tuning parameter 'bag' was held constant at a value of FALSE
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were size = 3, decay = 0 and bag = FALSE.
nnetTune$finalModel
## Model Averaged Neural Network with 5 Repeats
##
## a 57-3-1 network with 178 weights
## options were - linear output units
plot(nnetTune)
testResults <- data.frame(obs = chem_test_y,
                          NNet = predict(nnetTune, chem_test_x))
nnetTunePred <- predict(nnetTune, chem_test_x)
## The function 'postResample' can be used to get the test set
## performance values
postResample(pred = nnetTunePred, obs = chem_test_y)
## RMSE Rsquared MAE
## 1.62533417 0.04802536 1.35891124
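For Neural Networks, the optimal model (size = 3, decay = 0) had a resampled RMSE of 1.469018 and an Rsquared of 0.45830245. This resampled Rsquared is lower than KNN's resampled value of 0.5817738 and is still not great.
The test set performance values are an RMSE of 1.62533417 and an Rsquared of 0.04802536. An Rsquared this low means the model explains almost none of the variance in the test set.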
Now let’s try MARS:
# MARS
set.seed(100)
marsTune <- train(x = x, y = y,
                  method = "earth",
                  tuneGrid = expand.grid(degree = 1:3, nprune = 2:38))
marsTune
## Multivariate Adaptive Regression Spline
##
## 110 samples
## 57 predictor
##
## No pre-processing
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 110, 110, 110, 110, 110, 110, ...
## Resampling results across tuning parameters:
##
## degree nprune RMSE Rsquared MAE
## 1 2 1.588699 0.2994073 1.2582850
## 1 3 1.669369 0.5535076 1.0452364
## 1 4 1.507194 0.6238185 0.9599723
## 1 5 1.354593 0.6627754 0.9000157
## 1 6 1.334627 0.6713258 0.8739952
## 1 7 1.332292 0.6656274 0.8688532
## 1 8 1.294239 0.6530538 0.8825207
## 1 9 1.819951 0.5985816 1.0156710
## 1 10 2.458930 0.5390850 1.1623344
## 1 11 2.479671 0.5443419 1.1599377
## 1 12 2.954855 0.5435698 1.2459141
## 1 13 2.578470 0.5404274 1.1848380
## 1 14 2.720336 0.5422678 1.2113476
## 1 15 3.008675 0.5323508 1.2736239
## 1 16 3.145704 0.4997209 1.3109893
## 1 17 3.114113 0.4969623 1.3102191
## 1 18 3.123815 0.4971977 1.3100418
## 1 19 3.110950 0.5009103 1.3071952
## 1 20 3.113982 0.4998761 1.3074783
## 1 21 3.136307 0.4960707 1.3136513
## 1 22 3.133658 0.4958745 1.3139014
## 1 23 3.138981 0.4932907 1.3175322
## 1 24 3.138123 0.4937134 1.3160941
## 1 25 3.139060 0.4934384 1.3163302
## 1 26 3.157571 0.4921132 1.3176582
## 1 27 3.157119 0.4920943 1.3172509
## 1 28 3.145639 0.4935788 1.3154290
## 1 29 3.140517 0.4944116 1.3151549
## 1 30 3.152495 0.4926473 1.3167767
## 1 31 3.153676 0.4921455 1.3171657
## 1 32 3.153676 0.4921455 1.3171657
## 1 33 3.153676 0.4921455 1.3171657
## 1 34 3.153676 0.4921455 1.3171657
## 1 35 3.153676 0.4921455 1.3171657
## 1 36 3.153676 0.4921455 1.3171657
## 1 37 3.153676 0.4921455 1.3171657
## 1 38 3.153676 0.4921455 1.3171657
## 2 2 1.600795 0.2907941 1.2778536
## 2 3 1.291054 0.5332279 0.9994380
## 2 4 1.106141 0.6553167 0.8611250
## 2 5 1.059342 0.6838166 0.8212886
## 2 6 1.036216 0.7000995 0.8071075
## 2 7 1.328971 0.6631774 0.8656688
## 2 8 1.800931 0.6139357 0.9591714
## 2 9 1.929898 0.6087788 0.9997442
## 2 10 2.014019 0.6040248 1.0126597
## 2 11 1.985624 0.6009410 1.0213670
## 2 12 2.015089 0.5788043 1.0319266
## 2 13 2.114038 0.5779464 1.0384019
## 2 14 1.963194 0.5717780 1.0230509
## 2 15 2.101275 0.5420759 1.0601521
## 2 16 2.235070 0.5396043 1.0818772
## 2 17 2.284834 0.5290841 1.1041820
## 2 18 2.641252 0.5246312 1.1634581
## 2 19 2.732306 0.5119740 1.2003765
## 2 20 3.241683 0.4915273 1.3008548
## 2 21 3.151771 0.4834690 1.3019662
## 2 22 3.075073 0.4768640 1.3060841
## 2 23 3.134874 0.4708840 1.3219670
## 2 24 3.156865 0.4613247 1.3406371
## 2 25 3.164167 0.4490676 1.3589473
## 2 26 3.168371 0.4478407 1.3643826
## 2 27 3.177424 0.4446789 1.3736186
## 2 28 3.111525 0.4453110 1.3653820
## 2 29 3.228253 0.4425177 1.3935305
## 2 30 3.224888 0.4435082 1.3910505
## 2 31 3.224888 0.4435082 1.3910505
## 2 32 3.230032 0.4427358 1.3940785
## 2 33 3.230032 0.4427358 1.3940785
## 2 34 3.230032 0.4427358 1.3940785
## 2 35 3.230032 0.4427358 1.3940785
## 2 36 3.230032 0.4427358 1.3940785
## 2 37 3.230032 0.4427358 1.3940785
## 2 38 3.230032 0.4427358 1.3940785
## 3 2 1.611359 0.2831406 1.2859470
## 3 3 1.287507 0.5357239 1.0085776
## 3 4 1.151840 0.6263777 0.9008127
## 3 5 1.098984 0.6619505 0.8595380
## 3 6 1.079925 0.6842634 0.8292669
## 3 7 1.363673 0.6523168 0.8789438
## 3 8 1.417813 0.6407400 0.8993040
## 3 9 1.559487 0.6359382 0.9377949
## 3 10 1.636319 0.6303950 0.9616302
## 3 11 1.626241 0.6193544 0.9775764
## 3 12 1.647695 0.6304333 0.9653034
## 3 13 1.731049 0.6165197 0.9955543
## 3 14 1.685758 0.5905392 0.9980469
## 3 15 1.714558 0.5799949 1.0106033
## 3 16 1.858930 0.5500968 1.0578350
## 3 17 1.877271 0.5437410 1.0676937
## 3 18 2.100317 0.5049920 1.1172595
## 3 19 2.169463 0.4973989 1.1333040
## 3 20 2.156728 0.4989900 1.1348170
## 3 21 2.201367 0.4994939 1.1503700
## 3 22 2.216714 0.4925901 1.1567132
## 3 23 2.197880 0.4883685 1.1625758
## 3 24 2.200493 0.4835275 1.1597911
## 3 25 2.212609 0.4832953 1.1658835
## 3 26 2.222175 0.4796148 1.1746132
## 3 27 2.236529 0.4765425 1.1822400
## 3 28 2.243705 0.4777620 1.1844329
## 3 29 2.242471 0.4774921 1.1857809
## 3 30 2.269594 0.4758330 1.1969217
## 3 31 2.264465 0.4763329 1.1987303
## 3 32 2.269609 0.4755605 1.2017583
## 3 33 2.269609 0.4755605 1.2017583
## 3 34 2.269609 0.4755605 1.2017583
## 3 35 2.269609 0.4755605 1.2017583
## 3 36 2.269609 0.4755605 1.2017583
## 3 37 2.269609 0.4755605 1.2017583
## 3 38 2.269609 0.4755605 1.2017583
##
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were nprune = 6 and degree = 2.
plot(marsTune)
marsTune$finalModel
## Selected 6 of 36 terms, and 4 of 57 predictors (nprune=6)
## Termination condition: RSq changed by less than 0.001 at 36 terms
## Importance: ManufacturingProcess32, ManufacturingProcess13, ...
## Number of terms at each degree of interaction: 1 4 1
## GCV 0.7989905 RSS 67.63999 GRSq 0.7813103 RSq 0.8285924
#predict(marsTune, chem_test_x)
marsImp <- varImp(marsTune, scale = FALSE)
plot(marsImp, top = 25)
marsTunePred <- predict(marsTune, chem_test_x)
## The function 'postResample' can be used to get the test set
## performance values
postResample(pred = marsTunePred, obs = chem_test_y)
## RMSE Rsquared MAE
## 1.5511787 0.1711802 1.2524785
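For MARS, the selected model (degree = 2, nprune = 6) has a resampled RMSE of 1.036216 and a resampled Rsquared of 0.7000995. The final model’s training Rsquared is 0.8285924, the best so far, with over 80% of the variance in the training data accounted for. That figure is measured on the data the model was fit to, so it is optimistic.
On the test set, the RMSE is 1.5511787 and the Rsquared is 0.1711802, well below the training and resampled values.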
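As a rough check on the earth summary above, the sketch below recovers the training RMSE and the GCV value from the printed RSS. It assumes the 110 training samples and 6 retained terms shown in the output, and it assumes earth's default GCV penalty of 3 for models with degree greater than 1, which the output does not state.

```r
rss     <- 67.63999   # RSS from the earth summary
rsq     <- 0.8285924  # training RSq from the earth summary
n       <- 110        # number of training samples
nterms  <- 6          # terms kept after pruning (nprune = 6)
penalty <- 3          # assumed: earth's default penalty when degree > 1

# Training RMSE implied by the residual sum of squares
train_rmse <- sqrt(rss / n)

# Total sum of squares implied by RSq = 1 - RSS / TSS
tss <- rss / (1 - rsq)

# Effective number of parameters and the resulting GCV
enp <- nterms + penalty * (nterms - 1) / 2
gcv <- rss / (n * (1 - enp / n)^2)

round(c(train_rmse = train_rmse, tss = tss, gcv = gcv), 4)
```

Under these assumptions the computed GCV agrees with the 0.7989905 printed by earth, and the training RMSE comes out at about 0.78, compared with 1.04 under resampling and 1.55 on the test set.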
Let’s try SVM:
# Support Vector Machines
# Radial Basis
set.seed(100)
svmRTune <- train(x = x, y = y,
                  method = "svmRadial",
                  preProc = c("center", "scale"),
                  tuneLength = 14)
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
svmRTune
## Support Vector Machines with Radial Basis Function Kernel
##
## 110 samples
## 57 predictor
##
## Pre-processing: centered (57), scaled (57)
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 110, 110, 110, 110, 110, 110, ...
## Resampling results across tuning parameters:
##
## C RMSE Rsquared MAE
## 0.25 1.355761 0.5886127 1.0752683
## 0.50 1.223504 0.6303527 0.9651716
## 1.00 1.137010 0.6653915 0.8899960
## 2.00 1.098457 0.6778606 0.8591548
## 4.00 1.088640 0.6785654 0.8500208
## 8.00 1.086956 0.6785984 0.8482931
## 16.00 1.087078 0.6786161 0.8481781
## 32.00 1.087078 0.6786161 0.8481781
## 64.00 1.087078 0.6786161 0.8481781
## 128.00 1.087078 0.6786161 0.8481781
## 256.00 1.087078 0.6786161 0.8481781
## 512.00 1.087078 0.6786161 0.8481781
## 1024.00 1.087078 0.6786161 0.8481781
## 2048.00 1.087078 0.6786161 0.8481781
##
## Tuning parameter 'sigma' was held constant at a value of 0.01293828
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were sigma = 0.01293828 and C = 8.
svmRTune$finalModel
## Support Vector Machine object of class "ksvm"
##
## SV type: eps-svr (regression)
## parameter : epsilon = 0.1 cost C = 8
##
## Gaussian Radial Basis kernel function.
## Hyperparameter : sigma = 0.0129382849538257
##
## Number of Support Vectors : 90
##
## Objective Function Value : -41.3735
## Training error : 0.008806
plot(svmRTune, scales = list(x = list(log = 2)))
svmGrid <- expand.grid(degree = 1:2,
                       scale = c(0.01, 0.005, 0.001),
                       C = 2^(-2:5))
# Most important variable predictors
svmRImp <- varImp(svmRTune, scale = FALSE)
plot(svmRImp, top = 25)
svmRTunePred <- predict(svmRTune, chem_test_x)
## The function 'postResample' can be used to get the test set
## performance values
postResample(pred = svmRTunePred, obs = chem_test_y)
## RMSE Rsquared MAE
## 1.5674408 0.1562124 1.2724531
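The warnings raised during the SVM fit say that BiologicalMaterial07 has zero variance, so it cannot be scaled. One way to avoid them is to drop constant and nearly constant predictors before training. The sketch below is a base R illustration on made-up data: is_near_zero_var is a hypothetical helper, not a caret function, and its cutoffs mirror the freqCut = 19 and uniqueCut = 10 values that appear in the warning text.

```r
# Hypothetical helper: flags constant or nearly constant columns
is_near_zero_var <- function(col, freq_cut = 19, unique_cut = 10) {
  counts <- sort(table(col), decreasing = TRUE)
  if (length(counts) <= 1) return(TRUE)   # strictly constant column
  # Ratio of the most common value to the second most common value
  freq_ratio <- counts[[1]] / counts[[2]]
  # Percentage of distinct values relative to the number of rows
  pct_unique <- 100 * length(counts) / length(col)
  freq_ratio > freq_cut && pct_unique < unique_cut
}

# Made-up predictors; 'nearly_const' plays the role of BiologicalMaterial07
toy_x <- data.frame(a            = seq_len(40) / 10,
                    nearly_const = c(rep(100, 39), 100.83),
                    const        = rep(1, 40))

nzv_flags <- vapply(toy_x, is_near_zero_var, logical(1))
toy_x_filtered <- toy_x[, !nzv_flags, drop = FALSE]
names(toy_x_filtered)
```

In caret itself, nearZeroVar() or adding "zv" or "nzv" to the preProc vector should serve the same purpose. A column that is not constant in the full training set can still become constant inside a bootstrap resample, which is why a near-zero-variance filter is used here rather than a strict constant check.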
# Polynomial
set.seed(100)
svmPTune <- train(x = x, y = y,
                  method = "svmPoly",
                  preProc = c("center", "scale"),
                  tuneGrid = svmGrid)
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: BiologicalMaterial07
## Warning in .local(x, ...): Variable(s) `' constant. Cannot scale data.
svmPTune
## Support Vector Machines with Polynomial Kernel
##
## 110 samples
## 57 predictor
##
## Pre-processing: centered (57), scaled (57)
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 110, 110, 110, 110, 110, 110, ...
## Resampling results across tuning parameters:
##
## degree scale C RMSE Rsquared MAE
## 1 0.001 0.25 1.720410 0.4234988 1.3922192
## 1 0.001 0.50 1.610085 0.4283503 1.3046835
## 1 0.001 1.00 1.543744 0.4311445 1.1916996
## 1 0.001 2.00 1.620361 0.4182461 1.0856874
## 1 0.001 4.00 1.629624 0.4526044 1.0100518
## 1 0.001 8.00 1.672883 0.5048335 0.9859143
## 1 0.001 16.00 1.857864 0.5035023 1.0196492
## 1 0.001 32.00 2.235462 0.4506089 1.1332864
## 1 0.005 0.25 1.547668 0.4251583 1.1552749
## 1 0.005 0.50 1.626239 0.4291056 1.0526890
## 1 0.005 1.00 1.637573 0.4697672 0.9982559
## 1 0.005 2.00 1.681775 0.5135275 0.9816983
## 1 0.005 4.00 1.973811 0.4916481 1.0505667
## 1 0.005 8.00 2.369730 0.4360883 1.1786173
## 1 0.005 16.00 3.048753 0.4062931 1.3289141
## 1 0.005 32.00 4.073252 0.3367127 1.5790190
## 1 0.010 0.25 1.626194 0.4291113 1.0526790
## 1 0.010 0.50 1.637505 0.4697980 0.9982353
## 1 0.010 1.00 1.681724 0.5135220 0.9816955
## 1 0.010 2.00 1.973769 0.4916501 1.0505696
## 1 0.010 4.00 2.369672 0.4360882 1.1785867
## 1 0.010 8.00 3.048910 0.4062662 1.3289463
## 1 0.010 16.00 4.072590 0.3367566 1.5789932
## 1 0.010 32.00 5.101183 0.2942691 1.8169842
## 2 0.001 0.25 1.604700 0.4355894 1.3016706
## 2 0.001 0.50 1.511550 0.4556188 1.1792970
## 2 0.001 1.00 1.514201 0.4658014 1.0550742
## 2 0.001 2.00 1.496422 0.5136397 0.9708990
## 2 0.001 4.00 1.554965 0.5245759 0.9591011
## 2 0.001 8.00 1.799799 0.5215671 1.0047817
## 2 0.001 16.00 2.018153 0.4937737 1.0734654
## 2 0.001 32.00 2.784911 0.4041025 1.2523164
## 2 0.005 0.25 1.905831 0.4494745 1.0811059
## 2 0.005 0.50 2.244565 0.3933987 1.0891001
## 2 0.005 1.00 2.363590 0.4206431 1.0928894
## 2 0.005 2.00 2.585938 0.3958022 1.1398564
## 2 0.005 4.00 2.855922 0.3819741 1.2042342
## 2 0.005 8.00 3.845552 0.3688580 1.3847050
## 2 0.005 16.00 4.919619 0.3319692 1.6038240
## 2 0.005 32.00 5.380138 0.3183079 1.6984975
## 2 0.010 0.25 3.240862 0.3275041 1.2517091
## 2 0.010 0.50 2.831368 0.4095898 1.1529139
## 2 0.010 1.00 2.938238 0.3599960 1.1897918
## 2 0.010 2.00 3.883424 0.3287333 1.3568396
## 2 0.010 4.00 4.603693 0.3064813 1.5211378
## 2 0.010 8.00 4.815485 0.2774205 1.5807421
## 2 0.010 16.00 4.820214 0.2734886 1.5849862
## 2 0.010 32.00 4.804172 0.2744159 1.5816452
##
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were degree = 2, scale = 0.001 and C = 2.
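The 48 rows in the table above correspond to every combination of two degrees, three scale values, and eight cost values. As a rough sketch, a grid with those same values could be built with `expand.grid`. The name `svmPGrid` is hypothetical and the grid is reconstructed from the printed output, so it may not match how the grid was originally specified.

```r
# Assumed reconstruction of the tuning grid from the values printed above;
# 'svmPGrid' is a hypothetical name, not taken from the original code.
svmPGrid <- expand.grid(degree = 1:2,
                        scale  = c(0.001, 0.005, 0.010),
                        C      = 2^(-2:5))   # 0.25, 0.5, 1, ..., 32

# Number of candidate models evaluated per resample
nrow(svmPGrid)
```

A grid like this would typically be passed to `train` through its `tuneGrid` argument.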
svmPTune$finalModel
## Support Vector Machine object of class "ksvm"
##
## SV type: eps-svr (regression)
## parameter : epsilon = 0.1 cost C = 2
##
## Polynomial kernel function.
## Hyperparameters : degree = 2 scale = 0.001 offset = 1
##
## Number of Support Vectors : 92
##
## Objective Function Value : -80.6722
## Training error : 0.216849
## 'between' is an argument to the lattice plot, not a component of 'scales'
plot(svmPTune,
scales = list(x = list(log = 2)),
between = list(x = .5, y = 1))
svmPTunePred <- predict(svmPTune, chem_test_x)
## The function 'postResample' can be used to get the test set
## performance values
postResample(pred = svmPTunePred, obs = chem_test_y)
## RMSE Rsquared MAE
## 2.6354160 0.0382175 1.8813095
testResults$SVMr <- predict(svmRTune, chem_test_x)
testResults$SVMp <- predict(svmPTune, chem_test_x)
For the SVM with a radial basis kernel, the resampled RMSE is 1.086956 and R-squared is 0.6785984. On the test set, the RMSE is 1.5674408 and R-squared is 0.1562124.
For the SVM with a polynomial kernel, the resampled RMSE is 1.496422 and R-squared is 0.5136397. On the test set, the RMSE is 2.6354160 and R-squared is 0.0382175. Both SVMs perform considerably worse on the test set than their resampling estimates suggest.
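The resampled and test-set figures are the same three metrics computed on different data. As a minimal sketch, assuming `postResample` reports RMSE, the squared Pearson correlation, and MAE, the calculation can be reproduced by hand. The helper `regression_metrics` and the toy vectors are hypothetical and are not the chemical manufacturing data.

```r
# Hypothetical helper mirroring the three metrics that postResample() reports:
# RMSE, R-squared (as a squared Pearson correlation), and MAE.
regression_metrics <- function(pred, obs) {
  c(RMSE     = sqrt(mean((pred - obs)^2)),
    Rsquared = cor(pred, obs)^2,
    MAE      = mean(abs(pred - obs)))
}

# Toy vectors (illustrative only) to show the calculation
toy_obs  <- c(38.0, 40.5, 41.2, 39.8, 42.1)
toy_pred <- c(38.5, 40.0, 41.0, 40.5, 41.5)

toy_metrics <- regression_metrics(toy_pred, toy_obs)
print(toy_metrics)
```

Because R-squared here is a squared correlation, a test-set value close to zero means the predictions barely track the observed yields, even when the RMSE is only a few units.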
The best-performing model is MARS, which accounts for ~83% of the variance.
The top predictors for MARS are ManufacturingProcess32, ManufacturingProcess13, ManufacturingProcess06, and ManufacturingProcess16. MARS selects only process variables. Compared with the linear model I used, it relies on fewer predictors and on no biological predictors, whereas the linear model included some.
Here are scatter plots showing the relationship between each predictor and the yield:
ggplot(data=ChemicalManufacturingProcessCopy, aes(x=ManufacturingProcess32, y=Yield)) + geom_point()
ggplot(data=ChemicalManufacturingProcessCopy, aes(x=ManufacturingProcess13, y=Yield)) + geom_point()
ggplot(data=ChemicalManufacturingProcessCopy, aes(x=ManufacturingProcess06, y=Yield)) + geom_point()
ggplot(data=ChemicalManufacturingProcessCopy, aes(x=ManufacturingProcess16, y=Yield)) + geom_point()
ManufacturingProcess32 ranges from approximately 150 to 170. The variance of Yield stays fairly consistent across that range. Overall, Yield increases as ManufacturingProcess32 increases, although around 165 it appears to level off or decrease.
ManufacturingProcess13 ranges from approximately 32 to 36, with a few outliers around 38. The variance of Yield stays fairly consistent. Overall, Yield decreases as ManufacturingProcess13 increases.
ManufacturingProcess06 ranges from approximately 200 to 215, with a few outliers. The variance of Yield stays fairly consistent, with slightly more spread at 210. Overall, Yield increases as ManufacturingProcess06 increases, although around 210 it appears to level off or decrease. The highest Yield occurs at 210, and the largest cluster of values lies between 205 and 210.
ManufacturingProcess16 ranges from approximately 4500 to 4600. No real trend is visible: a very narrow range of ManufacturingProcess16 values corresponds to a wide spread of Yield, so ManufacturingProcess16 does not appear to be a strong predictor of Yield.
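The visual impressions above (approximate range, direction of the trend) can be cross-checked numerically. The following is a minimal sketch using base R only. The helper `summarise_predictors` and the toy data frame are hypothetical; in practice the data frame would be `ChemicalManufacturingProcessCopy` and the predictors the four variables plotted above. Spearman rank correlation is used here as one reasonable choice because several of the trends appear to level off rather than stay linear.

```r
# Hypothetical helper: observed range and Spearman rank correlation with the
# response for each named predictor, ignoring rows with missing values.
summarise_predictors <- function(data, predictors, response = "Yield") {
  rows <- lapply(predictors, function(p) {
    ok <- complete.cases(data[[p]], data[[response]])
    data.frame(predictor = p,
               min       = min(data[[p]][ok]),
               max       = max(data[[p]][ok]),
               spearman  = cor(data[[p]][ok], data[[response]][ok],
                               method = "spearman"))
  })
  do.call(rbind, rows)
}

# Toy data frame standing in for the real data (illustrative values only)
toy <- data.frame(Yield = c(38, 39, 40, 41, 42),
                  ProcA = c(150, 155, 160, 165, 170),  # rises with Yield
                  ProcB = c(36, 35, 34, 33, 32))       # falls as Yield rises

toy_summary <- summarise_predictors(toy, c("ProcA", "ProcB"))
print(toy_summary)
```

A correlation near zero for a predictor would be consistent with the "no real trend" reading, while values near +1 or -1 would support an increasing or decreasing relationship.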
Taken together, these plots suggest which values of these predictors to target in order to produce a good yield.