7.2

Friedman (1991) introduced several benchmark data sets created by simulation. On of these simulations used the following nonlinear equations to create data:

\(y = 10 sin(\pi x_1x_2) + 20(x_3- 0.5)^2 + 10x_4 + 5x_5 + N(0, \sigma^2)\)

where the \(x\) values are random variables uniformly distributed between \([0,1]\) (there are also 5 other non-informative variables also created in the simulation). The package mlbench contains a function called mlbench.friedman1 that simulates these data:

library(mlbench)
set.seed(200)
trainingData <- mlbench.friedman1(200, sd = 1)
## We convert the 'x' data from a matrix to a data frame.
trainingData$x <- data.frame(trainingData$x)

featurePlot(trainingData$x, trainingData$y)

# List with a vector 'y' and a matrix of predictors 'x'. 
testData <- mlbench.friedman1(5000, sd = 1)
testData$x <- data.frame(testData$x)

Tune several models on these data.

KNN Model

library(caret)
knnModel <- train(x = trainingData$x, 
                  y = trainingData$y,
                  method = "knn",
                  preProcess = c("center", "scale"),
                  tuneLength = 10)
knnModel
## k-Nearest Neighbors 
## 
## 200 samples
##  10 predictor
## 
## Pre-processing: centered (10), scaled (10) 
## Resampling: Bootstrapped (25 reps) 
## Summary of sample sizes: 200, 200, 200, 200, 200, 200, ... 
## Resampling results across tuning parameters:
## 
##   k   RMSE      Rsquared   MAE     
##    5  3.466085  0.5121775  2.816838
##    7  3.349428  0.5452823  2.727410
##    9  3.264276  0.5785990  2.660026
##   11  3.214216  0.6024244  2.603767
##   13  3.196510  0.6176570  2.591935
##   15  3.184173  0.6305506  2.577482
##   17  3.183130  0.6425367  2.567787
##   19  3.198752  0.6483184  2.592683
##   21  3.188993  0.6611428  2.588787
##   23  3.200458  0.6638353  2.604529
## 
## RMSE was used to select the optimal model using the smallest value.
## The final value used for the model was k = 17.
knnPred <- predict(knnModel, newdata = testData$x)
## The function 'postResample' can be used to get the test set
## performance values
postResample(pred = knnPred, obs = testData$y)
##      RMSE  Rsquared       MAE 
## 3.2040595 0.6819919 2.5683461

Which models appear to give the best performance? Does MARS select the informative predictors (those named X1-X5)?

MARS Model

MARS_grid <- expand.grid(.degree = 1:2, .nprune = 2:15)
MARS_model <- train(x = trainingData$x, 
                  y = trainingData$y,
                  method = "earth",
                  tuneGrid = MARS_grid,
                  preProcess = c("center", "scale"),
                  tuneLength = 10)
## Loading required package: earth
## Loading required package: Formula
## Loading required package: plotmo
## Loading required package: plotrix
## Loading required package: TeachingDemos
MARS_model
## Multivariate Adaptive Regression Spline 
## 
## 200 samples
##  10 predictor
## 
## Pre-processing: centered (10), scaled (10) 
## Resampling: Bootstrapped (25 reps) 
## Summary of sample sizes: 200, 200, 200, 200, 200, 200, ... 
## Resampling results across tuning parameters:
## 
##   degree  nprune  RMSE      Rsquared   MAE     
##   1        2      4.383438  0.2405683  3.597961
##   1        3      3.645469  0.4745962  2.930453
##   1        4      2.727602  0.7035031  2.184240
##   1        5      2.449243  0.7611230  1.939231
##   1        6      2.331605  0.7835496  1.833420
##   1        7      1.976830  0.8421599  1.562591
##   1        8      1.870959  0.8585503  1.464551
##   1        9      1.804342  0.8683110  1.410395
##   1       10      1.787676  0.8711960  1.386944
##   1       11      1.790700  0.8707740  1.393076
##   1       12      1.821005  0.8670619  1.419893
##   1       13      1.858688  0.8617344  1.445459
##   1       14      1.862343  0.8623072  1.446050
##   1       15      1.871033  0.8607099  1.457618
##   2        2      4.383438  0.2405683  3.597961
##   2        3      3.644919  0.4742570  2.929647
##   2        4      2.730222  0.7028372  2.183075
##   2        5      2.481291  0.7545789  1.965749
##   2        6      2.338369  0.7827873  1.825542
##   2        7      2.030065  0.8328250  1.602024
##   2        8      1.890997  0.8551326  1.477422
##   2        9      1.742626  0.8757904  1.371910
##   2       10      1.608221  0.8943432  1.255416
##   2       11      1.474325  0.9111463  1.157848
##   2       12      1.437483  0.9157967  1.120977
##   2       13      1.439395  0.9164721  1.128309
##   2       14      1.428565  0.9184503  1.118634
##   2       15      1.434093  0.9182413  1.121622
## 
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were nprune = 14 and degree = 2.

The optimal MARS model minimized the RMSE when the nprune = r MARS_model\(bestTune\)nprune and the degree = r MARS_model\(bestTune\)degree.

MARS_predictions <- predict(MARS_model, newdata = testData$x)
postResample(pred = MARS_predictions, obs = testData$y)
##      RMSE  Rsquared       MAE 
## 1.2779993 0.9338365 1.0147070

The RMSE of the MARS model is a lot lower than the KNN model. Let’s see what variables are important.

varImp(MARS_model)
## earth variable importance
## 
##    Overall
## X1  100.00
## X4   75.40
## X2   49.00
## X5   15.72
## X3    0.00

The MARS model picks up the X1-X5 varaibles.

SVM Model

SVM_model <- train(x = trainingData$x,
                   y = trainingData$y,
                   method = "svmRadial",
                   preProcess = c("center", "scale"),
                   tuneLength = 10,
                   trControl = trainControl(method = "cv"))
SVM_model
## Support Vector Machines with Radial Basis Function Kernel 
## 
## 200 samples
##  10 predictor
## 
## Pre-processing: centered (10), scaled (10) 
## Resampling: Cross-Validated (10 fold) 
## Summary of sample sizes: 180, 180, 180, 180, 180, 180, ... 
## Resampling results across tuning parameters:
## 
##   C       RMSE      Rsquared   MAE     
##     0.25  2.548095  0.7967225  2.001583
##     0.50  2.290826  0.8118225  1.779407
##     1.00  2.095918  0.8337552  1.634192
##     2.00  1.989469  0.8462100  1.539721
##     4.00  1.901332  0.8565135  1.510431
##     8.00  1.879815  0.8588126  1.514502
##    16.00  1.878866  0.8591568  1.518094
##    32.00  1.878866  0.8591568  1.518094
##    64.00  1.878866  0.8591568  1.518094
##   128.00  1.878866  0.8591568  1.518094
## 
## Tuning parameter 'sigma' was held constant at a value of 0.06670077
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were sigma = 0.06670077 and C = 16.

The optimal SVM mdoel has a \(\sigma\) of r SVM_model\(bestTune\)sigma and an C of r SVM_model\(bestTune\)C.

SVM_predictions <- predict(SVM_model, newdata = testData$x)
postResample(pred = SVM_predictions, obs = testData$y)
##      RMSE  Rsquared       MAE 
## 2.0829707 0.8242096 1.5826017
varImp(SVM_model)
## loess r-squared variable importance
## 
##      Overall
## X4  100.0000
## X1   95.5047
## X2   89.6186
## X5   45.2170
## X3   29.9330
## X9    6.3299
## X10   5.5182
## X8    3.2527
## X6    0.8884
## X7    0.0000

The SVM picked up the important variables.

Neural Network Model

nnet_grid <- expand.grid(.decay = c(0, 0.01, .1), .size = c(1:10), .bag = FALSE)
nnet_maxnwts <- 5 * (ncol(trainingData$x) + 1) + 5 + 1
nnet_model <- train(x = trainingData$x,
                    y = trainingData$y,
                    method = "avNNet",
                    preProcess = c("center", "scale"),
                    tuneGrid = nnet_grid,
                    trControl = trainControl(method = "cv"),
                    linout = TRUE,
                    trace = FALSE,
                    MaxNWts = nnet_maxnwts,
                    maxit = 500)
## Warning: executing %dopar% sequentially: no parallel backend registered
## Warning: model fit failed for Fold01: decay=0.00, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold01: decay=0.01, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold01: decay=0.10, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold01: decay=0.00, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold01: decay=0.01, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold01: decay=0.10, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold01: decay=0.00, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold01: decay=0.01, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold01: decay=0.10, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold01: decay=0.00, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold01: decay=0.01, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold01: decay=0.10, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold01: decay=0.00, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold01: decay=0.01, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold01: decay=0.10, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold02: decay=0.00, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold02: decay=0.01, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold02: decay=0.10, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold02: decay=0.00, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold02: decay=0.01, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold02: decay=0.10, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold02: decay=0.00, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold02: decay=0.01, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold02: decay=0.10, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold02: decay=0.00, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold02: decay=0.01, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold02: decay=0.10, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold02: decay=0.00, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold02: decay=0.01, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold02: decay=0.10, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold03: decay=0.00, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold03: decay=0.01, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold03: decay=0.10, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold03: decay=0.00, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold03: decay=0.01, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold03: decay=0.10, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold03: decay=0.00, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold03: decay=0.01, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold03: decay=0.10, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold03: decay=0.00, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold03: decay=0.01, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold03: decay=0.10, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold03: decay=0.00, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold03: decay=0.01, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold03: decay=0.10, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold04: decay=0.00, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold04: decay=0.01, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold04: decay=0.10, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold04: decay=0.00, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold04: decay=0.01, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold04: decay=0.10, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold04: decay=0.00, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold04: decay=0.01, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold04: decay=0.10, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold04: decay=0.00, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold04: decay=0.01, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold04: decay=0.10, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold04: decay=0.00, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold04: decay=0.01, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold04: decay=0.10, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold05: decay=0.00, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold05: decay=0.01, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold05: decay=0.10, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold05: decay=0.00, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold05: decay=0.01, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold05: decay=0.10, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold05: decay=0.00, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold05: decay=0.01, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold05: decay=0.10, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold05: decay=0.00, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold05: decay=0.01, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold05: decay=0.10, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold05: decay=0.00, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold05: decay=0.01, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold05: decay=0.10, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold06: decay=0.00, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold06: decay=0.01, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold06: decay=0.10, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold06: decay=0.00, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold06: decay=0.01, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold06: decay=0.10, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold06: decay=0.00, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold06: decay=0.01, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold06: decay=0.10, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold06: decay=0.00, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold06: decay=0.01, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold06: decay=0.10, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold06: decay=0.00, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold06: decay=0.01, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold06: decay=0.10, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold07: decay=0.00, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold07: decay=0.01, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold07: decay=0.10, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold07: decay=0.00, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold07: decay=0.01, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold07: decay=0.10, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold07: decay=0.00, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold07: decay=0.01, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold07: decay=0.10, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold07: decay=0.00, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold07: decay=0.01, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold07: decay=0.10, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold07: decay=0.00, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold07: decay=0.01, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold07: decay=0.10, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold08: decay=0.00, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold08: decay=0.01, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold08: decay=0.10, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold08: decay=0.00, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold08: decay=0.01, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold08: decay=0.10, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold08: decay=0.00, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold08: decay=0.01, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold08: decay=0.10, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold08: decay=0.00, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold08: decay=0.01, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold08: decay=0.10, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold08: decay=0.00, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold08: decay=0.01, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold08: decay=0.10, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold09: decay=0.00, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold09: decay=0.01, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold09: decay=0.10, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold09: decay=0.00, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold09: decay=0.01, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold09: decay=0.10, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold09: decay=0.00, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold09: decay=0.01, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold09: decay=0.10, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold09: decay=0.00, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold09: decay=0.01, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold09: decay=0.10, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold09: decay=0.00, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold09: decay=0.01, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold09: decay=0.10, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold10: decay=0.00, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold10: decay=0.01, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold10: decay=0.10, size= 6, bag=FALSE Error in { : task 1 failed - "too many (73) weights"
## Warning: model fit failed for Fold10: decay=0.00, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold10: decay=0.01, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold10: decay=0.10, size= 7, bag=FALSE Error in { : task 1 failed - "too many (85) weights"
## Warning: model fit failed for Fold10: decay=0.00, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold10: decay=0.01, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold10: decay=0.10, size= 8, bag=FALSE Error in { : task 1 failed - "too many (97) weights"
## Warning: model fit failed for Fold10: decay=0.00, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold10: decay=0.01, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold10: decay=0.10, size= 9, bag=FALSE Error in { : task 1 failed - "too many (109) weights"
## Warning: model fit failed for Fold10: decay=0.00, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold10: decay=0.01, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning: model fit failed for Fold10: decay=0.10, size=10, bag=FALSE Error in { : task 1 failed - "too many (121) weights"
## Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
## There were missing values in resampled performance measures.
## Warning in train.default(x = trainingData$x, y = trainingData$y, method =
## "avNNet", : missing values found in aggregated results
nnet_model
## Model Averaged Neural Network 
## 
## 200 samples
##  10 predictor
## 
## Pre-processing: centered (10), scaled (10) 
## Resampling: Cross-Validated (10 fold) 
## Summary of sample sizes: 180, 180, 180, 180, 180, 180, ... 
## Resampling results across tuning parameters:
## 
##   decay  size  RMSE      Rsquared   MAE     
##   0.00    1    2.459523  0.7691775  1.920417
##   0.00    2    2.515976  0.7560670  1.970540
##   0.00    3    2.170413  0.8134990  1.698411
##   0.00    4    2.207703  0.8187669  1.654231
##   0.00    5    2.319841  0.7984667  1.781056
##   0.00    6         NaN        NaN       NaN
##   0.00    7         NaN        NaN       NaN
##   0.00    8         NaN        NaN       NaN
##   0.00    9         NaN        NaN       NaN
##   0.00   10         NaN        NaN       NaN
##   0.01    1    2.443513  0.7700880  1.902169
##   0.01    2    2.498222  0.7610108  1.965275
##   0.01    3    2.243893  0.8018325  1.802807
##   0.01    4    2.060692  0.8303931  1.626429
##   0.01    5    2.146816  0.8264060  1.712472
##   0.01    6         NaN        NaN       NaN
##   0.01    7         NaN        NaN       NaN
##   0.01    8         NaN        NaN       NaN
##   0.01    9         NaN        NaN       NaN
##   0.01   10         NaN        NaN       NaN
##   0.10    1    2.450808  0.7677685  1.905063
##   0.10    2    2.566992  0.7445498  2.007382
##   0.10    3    2.203412  0.8081681  1.717865
##   0.10    4    2.001697  0.8435571  1.594042
##   0.10    5    2.009614  0.8416270  1.595709
##   0.10    6         NaN        NaN       NaN
##   0.10    7         NaN        NaN       NaN
##   0.10    8         NaN        NaN       NaN
##   0.10    9         NaN        NaN       NaN
##   0.10   10         NaN        NaN       NaN
## 
## Tuning parameter 'bag' was held constant at a value of FALSE
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were size = 4, decay = 0.1 and bag = FALSE.

The best neural network has a size = r nnet_model\(bestTune\)size and a decay of r nnet_model\(bestTune\)decay.

nnet_predictions <- predict(nnet_model, newdata = testData$x)
postResample(pred = nnet_predictions, obs = testData$y)
##      RMSE  Rsquared       MAE 
## 2.0532720 0.8346366 1.5444560
varImp(nnet_model)
## loess r-squared variable importance
## 
##      Overall
## X4  100.0000
## X1   95.5047
## X2   89.6186
## X5   45.2170
## X3   29.9330
## X9    6.3299
## X10   5.5182
## X8    3.2527
## X6    0.8884
## X7    0.0000

Summary

results <- data.frame(t(postResample(pred = knnPred, obs = testData$y))) %>% 
  mutate("Model" = "KNN")

results <- data.frame(t(postResample(pred = MARS_predictions, obs = testData$y))) %>%
  mutate("Model"= "MARS") %>%
  bind_rows(results)

results <- data.frame(t(postResample(pred = SVM_predictions, obs = testData$y))) %>%
  mutate("Model"= "SVM") %>%
  bind_rows(results)

results <- data.frame(t(postResample(pred = nnet_predictions, obs = testData$y))) %>%
  mutate("Model"= "Neural Network") %>%
  bind_rows(results)

results %>%
  select(Model, RMSE, Rsquared, MAE) %>%
  arrange(RMSE) %>%
  kable() %>%
  kable_styling()
Model RMSE Rsquared MAE
MARS 1.277999 0.9338365 1.014707
Neural Network 2.053272 0.8346366 1.544456
SVM 2.082971 0.8242096 1.582602
KNN 3.204060 0.6819919 2.568346

The MARS model was the best and identified the right variables as the important ones.

7.5

Exercise 6.3 describes data for a chemical manufacturing process. Use the same data imputation, data splitting, and pre-processing steps as before and training several nonlinear regression models.

library(AppliedPredictiveModeling)
data(ChemicalManufacturingProcess)

# Make this reproducible
set.seed(42)

knn_model <- preProcess(ChemicalManufacturingProcess, "knnImpute")
df <- predict(knn_model, ChemicalManufacturingProcess)

df <- df %>%
  select_at(vars(-one_of(nearZeroVar(., names = TRUE))))

in_train <- createDataPartition(df$Yield, times = 1, p = 0.8, list = FALSE)
train_df <- df[in_train, ]
test_df <- df[-in_train, ]

pls_model <- train(
  Yield ~ ., data = train_df, method = "pls",
  center = TRUE,
  scale = TRUE,
  trControl = trainControl("cv", number = 10),
  tuneLength = 25
)

pls_model
## Partial Least Squares 
## 
## 144 samples
##  56 predictor
## 
## No pre-processing
## Resampling: Cross-Validated (10 fold) 
## Summary of sample sizes: 130, 129, 128, 129, 130, 129, ... 
## Resampling results across tuning parameters:
## 
##   ncomp  RMSE       Rsquared   MAE      
##    1     0.8824790  0.3779221  0.6711462
##    2     1.1458456  0.4219806  0.7086431
##    3     0.7363066  0.5244517  0.5688553
##    4     0.8235294  0.5298005  0.5933120
##    5     0.9670735  0.4846010  0.6371199
##    6     0.9959036  0.4776684  0.6427478
##    7     0.9119517  0.4986338  0.6200233
##    8     0.9068621  0.5012144  0.6293371
##    9     0.8517370  0.5220166  0.6163795
##   10     0.8919356  0.5062912  0.6332243
##   11     0.9173758  0.4934557  0.6463164
##   12     0.9064125  0.4791526  0.6485663
##   13     0.9255289  0.4542181  0.6620193
##   14     1.0239913  0.4358371  0.6944056
##   15     1.0754710  0.4365214  0.7077991
##   16     1.1110579  0.4269065  0.7135684
##   17     1.1492855  0.4210485  0.7222868
##   18     1.1940639  0.4132534  0.7396357
##   19     1.2271867  0.4079005  0.7494818
##   20     1.2077102  0.4022859  0.7470327
##   21     1.2082648  0.4026711  0.7452969
##   22     1.2669285  0.3987044  0.7634170
##   23     1.3663033  0.3970188  0.7957514
##   24     1.4531634  0.3898475  0.8243034
##   25     1.5624265  0.3820102  0.8612935
## 
## RMSE was used to select the optimal model using the smallest value.
## The final value used for the model was ncomp = 3.
pls_predictions <- predict(pls_model, test_df)

results <- data.frame(t(postResample(pred = pls_predictions, obs = test_df$Yield))) %>%
  mutate("Model"= "PLS")

A

Which nonlinear regression model give the optimal resampling and test set performance?

KNN Model

knn_model <- train(
  Yield ~ ., data = train_df, method = "knn",
  center = TRUE,
  scale = TRUE,
  trControl = trainControl("cv", number = 10),
  tuneLength = 25
)
knn_model
## k-Nearest Neighbors 
## 
## 144 samples
##  56 predictor
## 
## No pre-processing
## Resampling: Cross-Validated (10 fold) 
## Summary of sample sizes: 130, 129, 130, 130, 130, 131, ... 
## Resampling results across tuning parameters:
## 
##   k   RMSE       Rsquared   MAE      
##    5  0.7085987  0.4846711  0.5709345
##    7  0.7331379  0.4559107  0.5981516
##    9  0.7417741  0.4556845  0.6124459
##   11  0.7505969  0.4526780  0.6277377
##   13  0.7450412  0.4691830  0.6181457
##   15  0.7539451  0.4604423  0.6246762
##   17  0.7663763  0.4474991  0.6306860
##   19  0.7694326  0.4480450  0.6315047
##   21  0.7706841  0.4508441  0.6288689
##   23  0.7780866  0.4419184  0.6350889
##   25  0.7834333  0.4320541  0.6382165
##   27  0.7877890  0.4339581  0.6454522
##   29  0.7981302  0.4187796  0.6533767
##   31  0.8060716  0.3977540  0.6610186
##   33  0.8063566  0.4128701  0.6600828
##   35  0.8144040  0.3964037  0.6671734
##   37  0.8160580  0.4009313  0.6662999
##   39  0.8220176  0.3884786  0.6693883
##   41  0.8274778  0.3750873  0.6746765
##   43  0.8288000  0.3820179  0.6758220
##   45  0.8285016  0.3878549  0.6747216
##   47  0.8331504  0.3808342  0.6774480
##   49  0.8368096  0.3816669  0.6829726
##   51  0.8391074  0.3804316  0.6857959
##   53  0.8442763  0.3704957  0.6915276
## 
## RMSE was used to select the optimal model using the smallest value.
## The final value used for the model was k = 5.
knn_predictions <- predict(knn_model, test_df)

results <- data.frame(t(postResample(pred = knn_predictions, obs = test_df$Yield))) %>%
  mutate("Model"= "KNN") %>% rbind(results)

MARS Model

MARS_grid <- expand.grid(.degree = 1:2, .nprune = 2:15)

MARS_model <- train(
  Yield ~ ., data = train_df, method = "earth",
  tuneGrid = MARS_grid,
  # If the following lines are uncommented, it throws an error
  #center = TRUE,
  #scale = TRUE,
  trControl = trainControl("cv", number = 10),
  tuneLength = 25
)
MARS_model
## Multivariate Adaptive Regression Spline 
## 
## 144 samples
##  56 predictor
## 
## No pre-processing
## Resampling: Cross-Validated (10 fold) 
## Summary of sample sizes: 129, 130, 130, 128, 130, 128, ... 
## Resampling results across tuning parameters:
## 
##   degree  nprune  RMSE       Rsquared   MAE      
##   1        2      0.7744594  0.4159942  0.6084363
##   1        3      0.7003597  0.5682276  0.5530374
##   1        4      0.6469215  0.6086750  0.5154043
##   1        5      0.6404520  0.6073071  0.5042864
##   1        6      0.6960964  0.5396157  0.5588565
##   1        7      0.6997921  0.5521698  0.5669813
##   1        8      0.7147341  0.5423472  0.5621022
##   1        9      0.7173559  0.5508650  0.5641610
##   1       10      0.7487512  0.5325936  0.5869088
##   1       11      0.7415722  0.5353676  0.5847493
##   1       12      0.7353149  0.5519142  0.5768069
##   1       13      0.7450704  0.5357967  0.5887685
##   1       14      0.7419274  0.5400866  0.5844385
##   1       15      0.7262459  0.5553445  0.5775694
##   2        2      0.7814439  0.4052275  0.6150579
##   2        3      0.6820554  0.5737133  0.5476069
##   2        4      0.7004870  0.5518986  0.5592458
##   2        5      0.6664614  0.5994367  0.5205528
##   2        6      0.6805337  0.5743844  0.5268483
##   2        7      0.6934553  0.5605660  0.5471219
##   2        8      0.6388022  0.6152830  0.5074414
##   2        9      0.6232908  0.6304499  0.4924151
##   2       10      0.6344438  0.6238174  0.4924866
##   2       11      0.6328938  0.6213979  0.4972698
##   2       12      0.6578933  0.6082595  0.5063196
##   2       13      0.6584237  0.6149814  0.5042046
##   2       14      0.6816227  0.6094383  0.5120934
##   2       15      0.6839806  0.6080881  0.5167929
## 
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were nprune = 9 and degree = 2.
MARS_predictions <- predict(MARS_model, test_df)

results <- data.frame(t(postResample(pred = MARS_predictions, obs = test_df$Yield))) %>%
  mutate("Model"= "MARS") %>% rbind(results)

SVM Model

SVM_model <- train(
  Yield ~ ., data = train_df, method = "svmRadial",
  center = TRUE,
  scale = TRUE,
  trControl = trainControl(method = "cv"),
  tuneLength = 25
)
SVM_model
## Support Vector Machines with Radial Basis Function Kernel 
## 
## 144 samples
##  56 predictor
## 
## No pre-processing
## Resampling: Cross-Validated (10 fold) 
## Summary of sample sizes: 129, 129, 129, 130, 131, 131, ... 
## Resampling results across tuning parameters:
## 
##   C           RMSE       Rsquared   MAE      
##         0.25  0.7683015  0.4646282  0.6392703
##         0.50  0.7174781  0.4848381  0.5965095
##         1.00  0.6686679  0.5330949  0.5530530
##         2.00  0.6409061  0.5552794  0.5275947
##         4.00  0.6356640  0.5619233  0.5166512
##         8.00  0.6286151  0.5761807  0.5115837
##        16.00  0.6287131  0.5760690  0.5116965
##        32.00  0.6287131  0.5760690  0.5116965
##        64.00  0.6287131  0.5760690  0.5116965
##       128.00  0.6287131  0.5760690  0.5116965
##       256.00  0.6287131  0.5760690  0.5116965
##       512.00  0.6287131  0.5760690  0.5116965
##      1024.00  0.6287131  0.5760690  0.5116965
##      2048.00  0.6287131  0.5760690  0.5116965
##      4096.00  0.6287131  0.5760690  0.5116965
##      8192.00  0.6287131  0.5760690  0.5116965
##     16384.00  0.6287131  0.5760690  0.5116965
##     32768.00  0.6287131  0.5760690  0.5116965
##     65536.00  0.6287131  0.5760690  0.5116965
##    131072.00  0.6287131  0.5760690  0.5116965
##    262144.00  0.6287131  0.5760690  0.5116965
##    524288.00  0.6287131  0.5760690  0.5116965
##   1048576.00  0.6287131  0.5760690  0.5116965
##   2097152.00  0.6287131  0.5760690  0.5116965
##   4194304.00  0.6287131  0.5760690  0.5116965
## 
## Tuning parameter 'sigma' was held constant at a value of 0.015148
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were sigma = 0.015148 and C = 8.
SVM_predictions <- predict(SVM_model, test_df)

results <- data.frame(t(postResample(pred = SVM_predictions, obs = test_df$Yield))) %>%
  mutate("Model"= "SVM") %>% rbind(results)

Neural Network Model

nnet_grid <- expand.grid(.decay = c(0, 0.01, .1), .size = c(1:10), .bag = FALSE)
nnet_maxnwts <- 5 * ncol(train_df) + 5 + 1
nnet_model <- train(
  Yield ~ ., data = train_df, method = "avNNet",
  center = TRUE,
  scale = TRUE,
  tuneGrid = nnet_grid,
  trControl = trainControl(method = "cv"),
  linout = TRUE,
  trace = FALSE,
  MaxNWts = nnet_maxnwts,
  maxit = 500
)
## Warning: model fit failed for Fold01: decay=0.00, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold01: decay=0.01, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold01: decay=0.10, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold01: decay=0.00, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold01: decay=0.01, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold01: decay=0.10, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold01: decay=0.00, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold01: decay=0.01, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold01: decay=0.10, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold01: decay=0.00, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold01: decay=0.01, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold01: decay=0.10, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold01: decay=0.00, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold01: decay=0.01, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold01: decay=0.10, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold02: decay=0.00, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold02: decay=0.01, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold02: decay=0.10, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold02: decay=0.00, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold02: decay=0.01, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold02: decay=0.10, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold02: decay=0.00, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold02: decay=0.01, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold02: decay=0.10, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold02: decay=0.00, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold02: decay=0.01, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold02: decay=0.10, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold02: decay=0.00, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold02: decay=0.01, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold02: decay=0.10, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold03: decay=0.00, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold03: decay=0.01, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold03: decay=0.10, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold03: decay=0.00, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold03: decay=0.01, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold03: decay=0.10, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold03: decay=0.00, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold03: decay=0.01, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold03: decay=0.10, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold03: decay=0.00, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold03: decay=0.01, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold03: decay=0.10, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold03: decay=0.00, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold03: decay=0.01, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold03: decay=0.10, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold04: decay=0.00, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold04: decay=0.01, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold04: decay=0.10, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold04: decay=0.00, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold04: decay=0.01, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold04: decay=0.10, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold04: decay=0.00, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold04: decay=0.01, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold04: decay=0.10, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold04: decay=0.00, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold04: decay=0.01, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold04: decay=0.10, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold04: decay=0.00, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold04: decay=0.01, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold04: decay=0.10, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold05: decay=0.00, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold05: decay=0.01, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold05: decay=0.10, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold05: decay=0.00, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold05: decay=0.01, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold05: decay=0.10, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold05: decay=0.00, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold05: decay=0.01, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold05: decay=0.10, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold05: decay=0.00, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold05: decay=0.01, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold05: decay=0.10, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold05: decay=0.00, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold05: decay=0.01, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold05: decay=0.10, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold06: decay=0.00, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold06: decay=0.01, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold06: decay=0.10, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold06: decay=0.00, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold06: decay=0.01, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold06: decay=0.10, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold06: decay=0.00, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold06: decay=0.01, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold06: decay=0.10, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold06: decay=0.00, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold06: decay=0.01, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold06: decay=0.10, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold06: decay=0.00, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold06: decay=0.01, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold06: decay=0.10, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold07: decay=0.00, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold07: decay=0.01, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold07: decay=0.10, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold07: decay=0.00, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold07: decay=0.01, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold07: decay=0.10, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold07: decay=0.00, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold07: decay=0.01, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold07: decay=0.10, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold07: decay=0.00, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold07: decay=0.01, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold07: decay=0.10, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold07: decay=0.00, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold07: decay=0.01, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold07: decay=0.10, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold08: decay=0.00, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold08: decay=0.01, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold08: decay=0.10, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold08: decay=0.00, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold08: decay=0.01, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold08: decay=0.10, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold08: decay=0.00, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold08: decay=0.01, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold08: decay=0.10, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold08: decay=0.00, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold08: decay=0.01, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold08: decay=0.10, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold08: decay=0.00, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold08: decay=0.01, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold08: decay=0.10, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold09: decay=0.00, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold09: decay=0.01, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold09: decay=0.10, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold09: decay=0.00, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold09: decay=0.01, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold09: decay=0.10, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold09: decay=0.00, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold09: decay=0.01, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold09: decay=0.10, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold09: decay=0.00, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold09: decay=0.01, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold09: decay=0.10, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold09: decay=0.00, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold09: decay=0.01, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold09: decay=0.10, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold10: decay=0.00, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold10: decay=0.01, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold10: decay=0.10, size= 6, bag=FALSE Error in { : task 1 failed - "too many (349) weights"
## Warning: model fit failed for Fold10: decay=0.00, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold10: decay=0.01, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold10: decay=0.10, size= 7, bag=FALSE Error in { : task 1 failed - "too many (407) weights"
## Warning: model fit failed for Fold10: decay=0.00, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold10: decay=0.01, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold10: decay=0.10, size= 8, bag=FALSE Error in { : task 1 failed - "too many (465) weights"
## Warning: model fit failed for Fold10: decay=0.00, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold10: decay=0.01, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold10: decay=0.10, size= 9, bag=FALSE Error in { : task 1 failed - "too many (523) weights"
## Warning: model fit failed for Fold10: decay=0.00, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold10: decay=0.01, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning: model fit failed for Fold10: decay=0.10, size=10, bag=FALSE Error in { : task 1 failed - "too many (581) weights"
## Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
## There were missing values in resampled performance measures.
## Warning in train.default(x, y, weights = w, ...): missing values found in
## aggregated results
nnet_model
## Model Averaged Neural Network 
## 
## 144 samples
##  56 predictor
## 
## No pre-processing
## Resampling: Cross-Validated (10 fold) 
## Summary of sample sizes: 131, 129, 130, 128, 130, 129, ... 
## Resampling results across tuning parameters:
## 
##   decay  size  RMSE       Rsquared   MAE      
##   0.00    1    0.7934900  0.4209508  0.6106307
##   0.00    2    0.9159339  0.4003914  0.7369839
##   0.00    3    0.7645514  0.4989719  0.6040197
##   0.00    4    0.6948408  0.5731379  0.5488690
##   0.00    5    0.6263516  0.6289645  0.5011214
##   0.00    6          NaN        NaN        NaN
##   0.00    7          NaN        NaN        NaN
##   0.00    8          NaN        NaN        NaN
##   0.00    9          NaN        NaN        NaN
##   0.00   10          NaN        NaN        NaN
##   0.01    1    0.7696616  0.4978434  0.6083154
##   0.01    2    0.7597511  0.5098346  0.6236169
##   0.01    3    0.8156739  0.5151222  0.6584366
##   0.01    4    0.6830777  0.6041899  0.5492441
##   0.01    5    0.6838020  0.5987224  0.5457117
##   0.01    6          NaN        NaN        NaN
##   0.01    7          NaN        NaN        NaN
##   0.01    8          NaN        NaN        NaN
##   0.01    9          NaN        NaN        NaN
##   0.01   10          NaN        NaN        NaN
##   0.10    1    0.7121318  0.5409956  0.5882917
##   0.10    2    0.6924515  0.5545411  0.5497997
##   0.10    3    0.6736422  0.6040624  0.5410502
##   0.10    4    0.6596263  0.6253162  0.5406258
##   0.10    5    0.6456583  0.6408940  0.5200500
##   0.10    6          NaN        NaN        NaN
##   0.10    7          NaN        NaN        NaN
##   0.10    8          NaN        NaN        NaN
##   0.10    9          NaN        NaN        NaN
##   0.10   10          NaN        NaN        NaN
## 
## Tuning parameter 'bag' was held constant at a value of FALSE
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were size = 5, decay = 0 and bag = FALSE.
nnet_predictions <- predict(nnet_model, test_df)

results <- data.frame(t(postResample(pred = nnet_predictions, obs = test_df$Yield))) %>%
  mutate("Model"= "Neural Network") %>% rbind(results)

Summary

results %>%
  select(Model, RMSE, Rsquared, MAE) %>%
  arrange(RMSE) %>%
  kable() %>%
  kable_styling()
Model RMSE Rsquared MAE
PLS 0.6192577 0.6771122 0.5059984
MARS 0.6797644 0.6313203 0.5452501
SVM 0.6815833 0.6251941 0.4926105
KNN 0.7149819 0.5894607 0.5047973
Neural Network 0.7326940 0.5529739 0.5928907

The SVM Model was the best non-linear model.

B

Which predictors are most important in the optimal nonlinear regression model? Do either the biological or process variables dominate the list? How do the top ten important predictors compare to the top ten predictors from the optimal linear model?

varImp(SVM_model, 10)
## loess r-squared variable importance
## 
##   only 20 most important variables shown (out of 56)
## 
##                        Overall
## ManufacturingProcess32  100.00
## ManufacturingProcess13   93.82
## ManufacturingProcess09   89.93
## ManufacturingProcess17   88.20
## BiologicalMaterial06     82.61
## BiologicalMaterial03     79.44
## ManufacturingProcess36   73.85
## BiologicalMaterial12     72.36
## ManufacturingProcess06   69.00
## ManufacturingProcess11   62.34
## ManufacturingProcess31   56.39
## BiologicalMaterial02     50.34
## BiologicalMaterial11     48.53
## BiologicalMaterial09     44.76
## ManufacturingProcess30   41.87
## BiologicalMaterial08     40.24
## ManufacturingProcess29   38.54
## ManufacturingProcess33   38.16
## BiologicalMaterial04     36.92
## ManufacturingProcess25   36.83
varImp(pls_model, 10)
## pls variable importance
## 
##   only 20 most important variables shown (out of 56)
## 
##                        Overall
## ManufacturingProcess32  100.00
## ManufacturingProcess09   88.04
## ManufacturingProcess36   82.20
## ManufacturingProcess13   82.11
## ManufacturingProcess17   80.25
## ManufacturingProcess06   59.06
## ManufacturingProcess11   55.93
## BiologicalMaterial02     55.46
## BiologicalMaterial06     54.64
## BiologicalMaterial03     54.50
## ManufacturingProcess33   53.91
## ManufacturingProcess12   52.04
## BiologicalMaterial08     49.76
## BiologicalMaterial12     47.40
## ManufacturingProcess34   45.47
## BiologicalMaterial11     45.05
## BiologicalMaterial01     44.18
## BiologicalMaterial04     42.95
## ManufacturingProcess04   39.94
## ManufacturingProcess28   36.61

The SVM model was very similar to the PLS model. The manufacuturing process variables is the dominant variable. Based on the SVM model found BiologicalMaterial12 to be important and didn’t find BiologicalMaterial02 to be important.

C

Explore the relationships between the top predictors and the response for the predictors that are unique to the optimal nonlinear regression model. Do these plots reveal intuition about the biological or process predictors and their relationship with yield?

ggplot(train_df, aes(BiologicalMaterial12, Yield)) +
  geom_point()

The graph indicates a positive relationship between the BiologicalMaterial02 and the highest yields are not the furthest to the right on the graph.