7.2. Friedman (1991) introduced several benchmark data sets created by simulation. One of these simulations used the following nonlinear equation to create data: $y = 10 \sin(\pi x_1 x_2) + 20 (x_3 - 0.5)^2 + 10 x_4 + 5 x_5 + N(0, \sigma^2)$, where the $x$ values are random variables uniformly distributed on [0, 1] (five other non-informative variables are also created in the simulation). The package mlbench contains a function called mlbench.friedman1 that simulates these data:
library(mlbench)
## Warning: package 'mlbench' was built under R version 4.0.5
set.seed(200)
trainingData <- mlbench.friedman1(200, sd = 1)
## We convert the 'x' data from a matrix to a data frame
## One reason is that this will give the columns names.
trainingData$x <- data.frame(trainingData$x)
## Look at the data using featurePlot() (from caret)
library(caret)
featurePlot(trainingData$x, trainingData$y)
## or other methods.
## This creates a list with a vector 'y' and a matrix
## of predictors 'x'. Also simulate a large test set to
## estimate the true error rate with good precision:
testData <- mlbench.friedman1(5000, sd = 1)
testData$x <- data.frame(testData$x)
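To make the generating equation concrete, here is a minimal base R sketch that builds the same kind of data by hand. The helper name simFriedman1 is hypothetical (it is not part of mlbench), and the sketch assumes ten uniform predictors of which only the first five enter the signal.

```r
# Hypothetical helper: simulate Friedman (1991) style data by hand.
# Not part of mlbench; shown only to spell out the equation.
simFriedman1 <- function(n, sd = 1, p = 10) {
  # p uniform predictors on [0, 1]; only columns 1 to 5 are informative
  x <- matrix(runif(n * p), nrow = n, ncol = p)
  signal <- 10 * sin(pi * x[, 1] * x[, 2]) +
    20 * (x[, 3] - 0.5)^2 +
    10 * x[, 4] +
    5 * x[, 5]
  # Add Gaussian noise with standard deviation sd
  list(x = x, y = signal + rnorm(n, sd = sd), signal = signal)
}

set.seed(200)
sim <- simFriedman1(200, sd = 1)
dim(sim$x)          # rows and columns of the predictor matrix
range(sim$signal)   # the noiseless part always lies between 0 and 30
```

Columns 6 to 10 never enter the signal, which is why a good model should ignore them.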
Tune several models on these data. For example:
library(caret)
knnModel <- train(x = trainingData$x,
y = trainingData$y,
method = "knn",
preProc = c("center", "scale"),
tuneLength = 10)
knnModel
## k-Nearest Neighbors
##
## 200 samples
## 10 predictor
##
## Pre-processing: centered (10), scaled (10)
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 200, 200, 200, 200, 200, 200, ...
## Resampling results across tuning parameters:
##
## k RMSE Rsquared MAE
## 5 3.466085 0.5121775 2.816838
## 7 3.349428 0.5452823 2.727410
## 9 3.264276 0.5785990 2.660026
## 11 3.214216 0.6024244 2.603767
## 13 3.196510 0.6176570 2.591935
## 15 3.184173 0.6305506 2.577482
## 17 3.183130 0.6425367 2.567787
## 19 3.198752 0.6483184 2.592683
## 21 3.188993 0.6611428 2.588787
## 23 3.200458 0.6638353 2.604529
##
## RMSE was used to select the optimal model using the smallest value.
## The final value used for the model was k = 17.
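The centering and scaling step matters for KNN because the prediction is an average over the nearest training points, and distance depends on the units of each predictor. Below is a minimal base R sketch of that idea, using made-up toy data and a hypothetical helper knnPredictOne; it is not caret's implementation.

```r
# Hypothetical helper: predict one new point as the mean response of its
# k nearest training points (Euclidean distance on standardized predictors).
knnPredictOne <- function(trainX, trainY, newX, k) {
  ctr <- colMeans(trainX)
  sdv <- apply(trainX, 2, sd)
  scaledTrain <- scale(trainX, center = ctr, scale = sdv)
  scaledNew <- (newX - ctr) / sdv
  # Distance from the new point to every training row
  d <- sqrt(rowSums(sweep(scaledTrain, 2, scaledNew)^2))
  mean(trainY[order(d)[seq_len(k)]])
}

# Toy data (made up for illustration): one predictor, four samples
toyX <- matrix(c(0, 1, 2, 10), ncol = 1)
toyY <- c(1, 2, 3, 20)
knnPredictOne(toyX, toyY, newX = 0.9, k = 3)  # averages the responses 1, 2 and 3
```

A larger k pulls in the distant fourth point and shifts the prediction, which is the trade-off the tuning table above explores.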
knnPred <- predict(knnModel, newdata = testData$x)
## The function 'postResample' can be used to get the test set
## performance values
postResample(pred = knnPred, obs = testData$y)
## RMSE Rsquared MAE
## 3.2040595 0.6819919 2.5683461
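The three numbers reported by postResample are standard regression summaries. As a sketch, they can be computed by hand in base R. regMetrics is a hypothetical helper, the toy vectors are made up, and R-squared is taken as the squared correlation between predictions and observations, which is assumed here to match caret's definition.

```r
# Hypothetical helper: RMSE, squared correlation, and MAE for a set of predictions
regMetrics <- function(pred, obs) {
  c(RMSE = sqrt(mean((pred - obs)^2)),
    Rsquared = cor(pred, obs)^2,
    MAE = mean(abs(pred - obs)))
}

# Toy values (made up for illustration)
obsToy <- c(1, 2, 3, 4)
predToy <- c(1.5, 2, 2.5, 4)
toyMetrics <- regMetrics(predToy, obsToy)
toyMetrics
```

RMSE penalizes large misses more heavily than MAE, so the two can rank models differently when a few predictions are far off.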
varImp(knnModel)
## loess r-squared variable importance
##
## Overall
## X4 100.0000
## X1 95.5047
## X2 89.6186
## X5 45.2170
## X3 29.9330
## X9 6.3299
## X10 5.5182
## X8 3.2527
## X6 0.8884
## X7 0.0000
Multivariate Adaptive Regression Splines (MARS)
MARS models are available in several packages, but the most extensive implementation is in the earth package. The MARS model using the nominal forward pass and pruning step can be called simply.
library(earth)
## Warning: package 'earth' was built under R version 4.0.5
## Loading required package: Formula
## Loading required package: plotmo
## Warning: package 'plotmo' was built under R version 4.0.5
## Loading required package: plotrix
## Loading required package: TeachingDemos
## Warning: package 'TeachingDemos' was built under R version 4.0.5
##
## Attaching package: 'plotmo'
## The following object is masked from 'package:urca':
##
## plotres
marsFit <- earth(trainingData$x, trainingData$y)
marsFit
## Selected 12 of 18 terms, and 6 of 10 predictors
## Termination condition: Reached nk 21
## Importance: X1, X4, X2, X5, X3, X6, X7-unused, X8-unused, X9-unused, ...
## Number of terms at each degree of interaction: 1 11 (additive model)
## GCV 2.540556 RSS 397.9654 GRSq 0.8968524 RSq 0.9183982
The summary method generates more extensive output.
summary(marsFit)
## Call: earth(x=trainingData$x, y=trainingData$y)
##
## coefficients
## (Intercept) 18.451984
## h(0.621722-X1) -11.074396
## h(0.601063-X2) -10.744225
## h(X3-0.281766) 20.607853
## h(0.447442-X3) 17.880232
## h(X3-0.447442) -23.282007
## h(X3-0.636458) 15.150350
## h(0.734892-X4) -10.027487
## h(X4-0.734892) 9.092045
## h(0.850094-X5) -4.723407
## h(X5-0.850094) 10.832932
## h(X6-0.361791) -1.956821
##
## Selected 12 of 18 terms, and 6 of 10 predictors
## Termination condition: Reached nk 21
## Importance: X1, X4, X2, X5, X3, X6, X7-unused, X8-unused, X9-unused, ...
## Number of terms at each degree of interaction: 1 11 (additive model)
## GCV 2.540556 RSS 397.9654 GRSq 0.8968524 RSq 0.9183982
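Each h(.) term in the summary is a hinge function, h(u) = max(0, u), so a pair such as h(0.734892-X4) and h(X4-0.734892) describes a piecewise linear effect with a knot at 0.734892. The small base R sketch below evaluates the X4 contribution using the coefficients printed above. The function name h mirrors the printout, and the x4 grid is made up.

```r
# Hinge function used by MARS basis terms
h <- function(u) pmax(0, u)

# Made-up evaluation points: below the knot, at the knot, above the knot
x4 <- c(0.2, 0.734892, 0.9)

# Contribution of the two X4 terms, using the coefficients from summary(marsFit)
x4Effect <- -10.027487 * h(0.734892 - x4) + 9.092045 * h(x4 - 0.734892)
round(x4Effect, 3)
```

Both pieces slope upward in X4 (about 10 below the knot and about 9 above it), which is close to the true coefficient of 10 on x4 in the simulation equation.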
To tune the model using external resampling, the train function can be used.
# Define the candidate models to test
marsGrid <- expand.grid(.degree = 1:2, .nprune = 2:38)
# Fix the seed so that the results can be reproduced
set.seed(1340)
marsTuned <- train(trainingData$x, trainingData$y,
method = "earth",
# Explicitly declare the candidate models to test
tuneGrid = marsGrid,
trControl = trainControl(method = "cv"))
marsTuned
## Multivariate Adaptive Regression Spline
##
## 200 samples
## 10 predictor
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 180, 180, 180, 180, 180, 180, ...
## Resampling results across tuning parameters:
##
## degree nprune RMSE Rsquared MAE
## 1 2 4.594324 0.1963843 3.833072
## 1 3 3.836801 0.4233298 3.139225
## 1 4 2.697201 0.7151127 2.134218
## 1 5 2.384270 0.7718287 1.877921
## 1 6 2.292469 0.7939161 1.752696
## 1 7 1.873309 0.8658661 1.420039
## 1 8 1.774391 0.8812817 1.353393
## 1 9 1.731318 0.8898088 1.346609
## 1 10 1.675093 0.8972301 1.308108
## 1 11 1.683475 0.8950124 1.304817
## 1 12 1.630683 0.9016304 1.257779
## 1 13 1.658973 0.8996103 1.271139
## 1 14 1.663902 0.8982894 1.277294
## 1 15 1.663902 0.8982894 1.277294
## 1 16 1.663902 0.8982894 1.277294
## 1 17 1.663902 0.8982894 1.277294
## 1 18 1.663902 0.8982894 1.277294
## 1 19 1.663902 0.8982894 1.277294
## 1 20 1.663902 0.8982894 1.277294
## 1 21 1.663902 0.8982894 1.277294
## 1 22 1.663902 0.8982894 1.277294
## 1 23 1.663902 0.8982894 1.277294
## 1 24 1.663902 0.8982894 1.277294
## 1 25 1.663902 0.8982894 1.277294
## 1 26 1.663902 0.8982894 1.277294
## 1 27 1.663902 0.8982894 1.277294
## 1 28 1.663902 0.8982894 1.277294
## 1 29 1.663902 0.8982894 1.277294
## 1 30 1.663902 0.8982894 1.277294
## 1 31 1.663902 0.8982894 1.277294
## 1 32 1.663902 0.8982894 1.277294
## 1 33 1.663902 0.8982894 1.277294
## 1 34 1.663902 0.8982894 1.277294
## 1 35 1.663902 0.8982894 1.277294
## 1 36 1.663902 0.8982894 1.277294
## 1 37 1.663902 0.8982894 1.277294
## 1 38 1.663902 0.8982894 1.277294
## 2 2 4.594324 0.1963843 3.833072
## 2 3 3.836801 0.4233298 3.139225
## 2 4 2.697201 0.7151127 2.134218
## 2 5 2.406939 0.7716633 1.890850
## 2 6 2.348855 0.7858024 1.826896
## 2 7 1.865997 0.8649874 1.417428
## 2 8 1.718563 0.8879092 1.330179
## 2 9 1.494916 0.9123315 1.180176
## 2 10 1.418147 0.9229322 1.138736
## 2 11 1.361989 0.9301450 1.072554
## 2 12 1.329251 0.9330332 1.042525
## 2 13 1.295076 0.9359551 1.025561
## 2 14 1.286339 0.9383256 1.021240
## 2 15 1.279173 0.9392309 1.023911
## 2 16 1.295970 0.9382018 1.035481
## 2 17 1.311907 0.9368463 1.050086
## 2 18 1.311907 0.9368463 1.050086
## 2 19 1.322933 0.9358727 1.059458
## 2 20 1.322933 0.9358727 1.059458
## 2 21 1.322933 0.9358727 1.059458
## 2 22 1.322933 0.9358727 1.059458
## 2 23 1.322933 0.9358727 1.059458
## 2 24 1.322933 0.9358727 1.059458
## 2 25 1.322933 0.9358727 1.059458
## 2 26 1.322933 0.9358727 1.059458
## 2 27 1.322933 0.9358727 1.059458
## 2 28 1.322933 0.9358727 1.059458
## 2 29 1.322933 0.9358727 1.059458
## 2 30 1.322933 0.9358727 1.059458
## 2 31 1.322933 0.9358727 1.059458
## 2 32 1.322933 0.9358727 1.059458
## 2 33 1.322933 0.9358727 1.059458
## 2 34 1.322933 0.9358727 1.059458
## 2 35 1.322933 0.9358727 1.059458
## 2 36 1.322933 0.9358727 1.059458
## 2 37 1.322933 0.9358727 1.059458
## 2 38 1.322933 0.9358727 1.059458
##
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were nprune = 15 and degree = 2.
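The candidate grid crosses two interaction degrees with 37 pruning sizes. The sketch below is a quick base R check of its size; marsGridCheck is a name introduced here. The repeated rows at large nprune values in the table above presumably mean that the forward pass produced fewer terms than requested, so asking for a larger pruned model changes nothing. That reading is an inference from the output, not something the printout states.

```r
# Rebuild the candidate grid (without the leading dots) and count the rows
marsGridCheck <- expand.grid(degree = 1:2, nprune = 2:38)
nrow(marsGridCheck)
table(marsGridCheck$degree)
```

Under that reading, a narrower range of nprune values would cover every distinct candidate.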
head(predict(marsTuned, testData$x))
## y
## [1,] 18.586161
## [2,] 21.272051
## [3,] 12.274064
## [4,] 7.736178
## [5,] 10.592710
## [6,] 14.124571
marsPred <- predict(marsTuned, testData$x)
## The function 'postResample' can be used to get the test set
## performance values
postResample(pred = marsPred, obs = testData$y)
## RMSE Rsquared MAE
## 1.1589948 0.9460418 0.9250230
There are two functions that estimate the importance of each predictor in the MARS model: evimp in the earth package and varImp in the caret package (although the latter calls the former):
varImp(marsTuned)
## earth variable importance
##
## Overall
## X1 100.00
## X4 75.24
## X2 48.73
## X5 15.52
## X3 0.00
According to the tuned MARS model, only X1 to X5 matter: the non-informative predictors X6 to X10 do not appear in the importance list.
Neural Networks (nnet)
To fit a regression model, the nnet function accepts both the formula and non-formula interfaces. For regression, the linear relationship between the hidden units and the prediction can be used with the option linout = TRUE.
tooHigh <- findCorrelation(cor(trainingData$x), cutoff = .75)
trainXnnet <- trainingData$x[, -tooHigh]
testXnnet <- testData$x[, -tooHigh]
## Create a specific candidate set of models to evaluate:
nnetGrid <- expand.grid(.decay = c(0, 0.01, .1),
.size = c(1:10),
## The next option is to use bagging (see the
## next chapter) instead of different random
## seeds.
.bag = FALSE)
set.seed(31500)
nnetTuned <- train(trainingData$x, trainingData$y,
method = "avNNet",
tuneGrid = nnetGrid,
trControl = trainControl(method = "cv"),
## Automatically standardize data prior to modeling
## and prediction
preProc = c("center", "scale"),
linout = TRUE,
trace = FALSE,
MaxNWts = 5 * (ncol(trainXnnet) + 1) + 10 + 1,
maxit = 50)
nnetTuned
## Model Averaged Neural Network
##
## 200 samples
## 10 predictor
##
## Pre-processing: centered (10), scaled (10)
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 180, 180, 180, 180, 180, 180, ...
## Resampling results across tuning parameters:
##
## decay size RMSE Rsquared MAE
## 0.00 1 2.660379 0.7364356 2.129650
## 0.00 2 NaN NaN NaN
## 0.00 3 NaN NaN NaN
## 0.00 4 NaN NaN NaN
## 0.00 5 NaN NaN NaN
## 0.00 6 NaN NaN NaN
## 0.00 7 NaN NaN NaN
## 0.00 8 NaN NaN NaN
## 0.00 9 NaN NaN NaN
## 0.00 10 NaN NaN NaN
## 0.01 1 2.658120 0.7257255 2.138039
## 0.01 2 NaN NaN NaN
## 0.01 3 NaN NaN NaN
## 0.01 4 NaN NaN NaN
## 0.01 5 NaN NaN NaN
## 0.01 6 NaN NaN NaN
## 0.01 7 NaN NaN NaN
## 0.01 8 NaN NaN NaN
## 0.01 9 NaN NaN NaN
## 0.01 10 NaN NaN NaN
## 0.10 1 2.467318 0.7592730 1.956655
## 0.10 2 NaN NaN NaN
## 0.10 3 NaN NaN NaN
## 0.10 4 NaN NaN NaN
## 0.10 5 NaN NaN NaN
## 0.10 6 NaN NaN NaN
## 0.10 7 NaN NaN NaN
## 0.10 8 NaN NaN NaN
## 0.10 9 NaN NaN NaN
## 0.10 10 NaN NaN NaN
##
## Tuning parameter 'bag' was held constant at a value of FALSE
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were size = 1, decay = 0.1 and bag = FALSE.
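A likely explanation for the NaN rows, offered as a diagnosis rather than something the output states: the Friedman predictors are nearly uncorrelated, so findCorrelation probably returns an empty vector, and indexing with a negated empty vector drops every column instead of keeping them all. ncol(trainXnnet) would then be 0 and MaxNWts would evaluate to 16, which is enough for one hidden unit on ten predictors (13 weights) but not for two (25). The base R sketch below reproduces both effects on a made-up data frame. countWeights is a hypothetical helper that counts the weights of a single-hidden-layer network with one output and bias terms.

```r
# Made-up data frame standing in for the predictors
toy <- data.frame(a = 1:3, b = 4:6, c = 7:9)
tooHighToy <- integer(0)            # what an empty findCorrelation() result looks like

ncol(toy[, -tooHighToy])            # negating an empty index selects no columns

# Safer pattern: only drop columns when there is something to drop
keepAll <- if (length(tooHighToy) > 0) toy[, -tooHighToy, drop = FALSE] else toy
ncol(keepAll)

# Hypothetical helper: weights for p inputs, `size` hidden units, one output
countWeights <- function(size, p) size * (p + 1) + (size + 1)

limit <- 5 * (0 + 1) + 10 + 1       # MaxNWts when ncol(trainXnnet) is 0
fits <- countWeights(1:10, p = 10) <= limit
fits                                # TRUE only where the network stays under the limit
```

Under that reading, a limit of 10 * (ncol(trainingData$x) + 1) + 10 + 1, that is 121 weights, would cover the largest network in the grid.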
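## NOTE: earth() fits a MARS model, so nnetFit is a second copy of marsFit
## and the output below repeats the MARS summary; it does not describe nnetTuned.
nnetFit <- earth(trainingData$x, trainingData$y)
nnetFit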
## Selected 12 of 18 terms, and 6 of 10 predictors
## Termination condition: Reached nk 21
## Importance: X1, X4, X2, X5, X3, X6, X7-unused, X8-unused, X9-unused, ...
## Number of terms at each degree of interaction: 1 11 (additive model)
## GCV 2.540556 RSS 397.9654 GRSq 0.8968524 RSq 0.9183982
summary(nnetFit)
## Call: earth(x=trainingData$x, y=trainingData$y)
##
## coefficients
## (Intercept) 18.451984
## h(0.621722-X1) -11.074396
## h(0.601063-X2) -10.744225
## h(X3-0.281766) 20.607853
## h(0.447442-X3) 17.880232
## h(X3-0.447442) -23.282007
## h(X3-0.636458) 15.150350
## h(0.734892-X4) -10.027487
## h(X4-0.734892) 9.092045
## h(0.850094-X5) -4.723407
## h(X5-0.850094) 10.832932
## h(X6-0.361791) -1.956821
##
## Selected 12 of 18 terms, and 6 of 10 predictors
## Termination condition: Reached nk 21
## Importance: X1, X4, X2, X5, X3, X6, X7-unused, X8-unused, X9-unused, ...
## Number of terms at each degree of interaction: 1 11 (additive model)
## GCV 2.540556 RSS 397.9654 GRSq 0.8968524 RSq 0.9183982
head(predict(nnetTuned, testData$x))
## 1 2 3 4 5 6
## 17.815453 17.580442 12.202086 8.501727 15.206499 13.941385
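## NOTE: these predictions come from marsTuned, so the metrics below repeat
## the MARS test set results; predict(nnetTuned, testData$x) would score the
## neural network instead.
nnetPred <- predict(marsTuned, testData$x)
## The function 'postResample' can be used to get the test set
## performance values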
postResample(pred = nnetPred, obs = testData$y)
## RMSE Rsquared MAE
## 1.1589948 0.9460418 0.9250230
varImp(nnetTuned)
## loess r-squared variable importance
##
## Overall
## X4 100.0000
## X1 95.5047
## X2 89.6186
## X5 45.2170
## X3 29.9330
## X9 6.3299
## X10 5.5182
## X8 3.2527
## X6 0.8884
## X7 0.0000
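Which models appear to give the best performance? Does MARS select the informative predictors (those named X1–X5)?
MARS gives the best performance: its test set RMSE is 1.16 with an R-squared of 0.946, against 3.20 and 0.682 for KNN. The averaged neural network reached a cross-validated RMSE of 2.47 at best, compared with 1.28 for MARS; its test set figures above cannot be used, because nnetPred was computed from marsTuned and so repeats the MARS results. MARS also selects the informative predictors: its importance list contains only X1 to X5, whereas the rankings for KNN and the neural network cover all ten predictors and do not narrow the set to five.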
7.5. Exercise 6.3 describes data for a chemical manufacturing process. Use the same data imputation, data splitting, and pre-processing steps as before and train several nonlinear regression models.
set.seed(34392)
library(AppliedPredictiveModeling)
## Warning: package 'AppliedPredictiveModeling' was built under R version 4.0.5
library(RANN)
## Warning: package 'RANN' was built under R version 4.0.5
data(ChemicalManufacturingProcess)
df <- ChemicalManufacturingProcess
#sum(is.na(df))
trans <- preProcess(df,"knnImpute")
#sum(is.na(trans))
pred <- predict(trans, df)
library(dplyr)  # provides %>% and select_at()
pred <- pred %>% select_at(vars(-one_of(nearZeroVar(., names = TRUE))))
trainDf <- createDataPartition(pred$Yield, p = 0.8, times = 1, list = FALSE)
trainX <- pred[trainDf, ]
trainY <- pred$Yield[trainDf]
#sum(is.na(trainX))
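One thing to check in the split above: trainX is taken from every column of pred, and pred still contains Yield, so the response travels along with the predictors (the model printouts that follow report 57 predictors, and Yield appears in the importance lists). Below is a minimal base R sketch of keeping the response out of the predictor set, on a made-up data frame. toyChem, trainIdx and the column names are hypothetical.

```r
# Made-up stand-in for the imputed data: one response and two predictors
toyChem <- data.frame(Yield = c(1.2, 0.4, -0.3, 0.9, -1.1),
                      ProcA = c(0.1, 0.5, 0.3, 0.8, 0.2),
                      BioB  = c(1.0, 0.7, 0.2, 0.4, 0.9))
trainIdx <- c(1, 2, 4)   # hypothetical training rows

# Every column except the response is a predictor
predictorNames <- setdiff(names(toyChem), "Yield")

toyTrainX <- toyChem[trainIdx, predictorNames, drop = FALSE]
toyTrainY <- toyChem$Yield[trainIdx]
toyTestX  <- toyChem[-trainIdx, predictorNames, drop = FALSE]
toyTestY  <- toyChem$Yield[-trainIdx]

names(toyTrainX)   # the response is no longer among the predictors
```

With the response excluded, a model can no longer reproduce Yield from itself, so resampled and test set errors become meaningful estimates.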
plsTune <- train(trainX, trainY,
method = "pls",
## The default tuning grid evaluates
## components 1... tuneLength
tuneLength = 20,
trControl = trainControl(method = 'cv'),
preProc = c("center", "scale"))
plsTune
## Partial Least Squares
##
## 144 samples
## 57 predictor
##
## Pre-processing: centered (57), scaled (57)
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 130, 130, 131, 128, 128, 130, ...
## Resampling results across tuning parameters:
##
## ncomp RMSE Rsquared MAE
## 1 0.67852890 0.5941896 0.54887418
## 2 0.63548816 0.6847727 0.47397765
## 3 0.63643355 0.7621490 0.40839579
## 4 0.66167229 0.7942985 0.38549797
## 5 0.57660761 0.8513484 0.29922932
## 6 0.48017797 0.8775200 0.25072728
## 7 0.31537720 0.9063116 0.17967528
## 8 0.21673598 0.9389757 0.13253697
## 9 0.11834743 0.9852244 0.08810216
## 10 0.11620387 0.9798759 0.07497350
## 11 0.11671649 0.9703266 0.06567956
## 12 0.10582328 0.9698120 0.05535640
## 13 0.08437493 0.9803105 0.04493301
## 14 0.05493883 0.9941788 0.03454552
## 15 0.03184658 0.9990082 0.02420603
## 16 0.03150690 0.9988056 0.02189858
## 17 0.03530588 0.9975295 0.02130832
## 18 0.03132606 0.9981684 0.01965167
## 19 0.02353572 0.9989905 0.01439933
## 20 0.01547202 0.9996614 0.01031025
##
## RMSE was used to select the optimal model using the smallest value.
## The final value used for the model was ncomp = 20.
plot(plsTune)
testX <- pred[-trainDf,]
testY <- pred$Yield[-trainDf]
postResample(pred = predict(plsTune, newdata=testX), obs = testY)
## RMSE Rsquared MAE
## 0.06449150 0.99368528 0.02210791
Neural Network
# tooHigh <- findCorrelation(cor(trainX), cutoff = .75)
#
# trainXnnet <- trainX[, -tooHigh]
#
# testXnnet <- testData$x[, -tooHigh]
## Create a specific candidate set of models to evaluate:
nnetGrid <- expand.grid(.decay = c(0, 0.01, .1),
.size = c(1:10),
## The next option is to use bagging (see the
## next chapter) instead of different random
## seeds.
.bag = FALSE)
set.seed(31500)
# nnetTuned <- train(Yield ~ ., trainX,
# method = "avNNet",
# tuneGrid = nnetGrid,
# trControl = trainControl(method = "cv"),
# ## Automatically standardize data prior to modeling
# ## and prediction
# preProc = c("center", "scale"),
# linout = TRUE,
# trace = FALSE,
# MaxNWts = 5 * (ncol(trainDf) + 1) + 5 + 1,
# maxit = 50)
#
# nnetTuned
# plot(nnetTuned)
#postResample(pred = predict(nnetTuned, newdata=testX), obs = testY)
KNN model
knnModel <- train(x = trainX,
y = trainY,
method = "knn",
preProc = c("center", "scale"),
tuneLength = 10)
knnModel
## k-Nearest Neighbors
##
## 144 samples
## 57 predictor
##
## Pre-processing: centered (57), scaled (57)
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 144, 144, 144, 144, 144, 144, ...
## Resampling results across tuning parameters:
##
## k RMSE Rsquared MAE
## 5 0.7154769 0.5471728 0.5667266
## 7 0.7134805 0.5556671 0.5683334
## 9 0.7212001 0.5474275 0.5819166
## 11 0.7284377 0.5428464 0.5893612
## 13 0.7308178 0.5426277 0.5911040
## 15 0.7336511 0.5442605 0.5954581
## 17 0.7336175 0.5491911 0.5965774
## 19 0.7349090 0.5554128 0.5950657
## 21 0.7407129 0.5534573 0.6018688
## 23 0.7449997 0.5545652 0.6039193
##
## RMSE was used to select the optimal model using the smallest value.
## The final value used for the model was k = 7.
plot(knnModel)
postResample(pred = predict(knnModel, newdata=testX), obs = testY)
## RMSE Rsquared MAE
## 0.5032695 0.5990757 0.4012294
MARS model
# Define the candidate models to test
marsGrid <- expand.grid(.degree = 1:2, .nprune = 2:38)
# Fix the seed so that the results can be reproduced
set.seed(1340)
marsTuned <- train(trainX, trainY,
method = "earth",
# Explicitly declare the candidate models to test
tuneGrid = marsGrid,
trControl = trainControl(method = "cv"))
marsTuned
## Multivariate Adaptive Regression Spline
##
## 144 samples
## 57 predictor
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 129, 131, 129, 129, 129, 129, ...
## Resampling results across tuning parameters:
##
## degree nprune RMSE Rsquared MAE
## 1 2 5.491765e-16 1 4.569087e-16
## 1 3 5.491765e-16 1 4.569087e-16
## 1 4 5.491765e-16 1 4.569087e-16
## 1 5 5.491765e-16 1 4.569087e-16
## 1 6 5.491765e-16 1 4.569087e-16
## 1 7 5.491765e-16 1 4.569087e-16
## 1 8 5.491765e-16 1 4.569087e-16
## 1 9 5.491765e-16 1 4.569087e-16
## 1 10 5.491765e-16 1 4.569087e-16
## 1 11 5.491765e-16 1 4.569087e-16
## 1 12 5.491765e-16 1 4.569087e-16
## 1 13 5.491765e-16 1 4.569087e-16
## 1 14 5.491765e-16 1 4.569087e-16
## 1 15 5.491765e-16 1 4.569087e-16
## 1 16 5.491765e-16 1 4.569087e-16
## 1 17 5.491765e-16 1 4.569087e-16
## 1 18 5.491765e-16 1 4.569087e-16
## 1 19 5.491765e-16 1 4.569087e-16
## 1 20 5.491765e-16 1 4.569087e-16
## 1 21 5.491765e-16 1 4.569087e-16
## 1 22 5.491765e-16 1 4.569087e-16
## 1 23 5.491765e-16 1 4.569087e-16
## 1 24 5.491765e-16 1 4.569087e-16
## 1 25 5.491765e-16 1 4.569087e-16
## 1 26 5.491765e-16 1 4.569087e-16
## 1 27 5.491765e-16 1 4.569087e-16
## 1 28 5.491765e-16 1 4.569087e-16
## 1 29 5.491765e-16 1 4.569087e-16
## 1 30 5.491765e-16 1 4.569087e-16
## 1 31 5.491765e-16 1 4.569087e-16
## 1 32 5.491765e-16 1 4.569087e-16
## 1 33 5.491765e-16 1 4.569087e-16
## 1 34 5.491765e-16 1 4.569087e-16
## 1 35 5.491765e-16 1 4.569087e-16
## 1 36 5.491765e-16 1 4.569087e-16
## 1 37 5.491765e-16 1 4.569087e-16
## 1 38 5.491765e-16 1 4.569087e-16
## 2 2 5.491765e-16 1 4.569087e-16
## 2 3 5.491765e-16 1 4.569087e-16
## 2 4 5.491765e-16 1 4.569087e-16
## 2 5 5.491765e-16 1 4.569087e-16
## 2 6 5.491765e-16 1 4.569087e-16
## 2 7 5.491765e-16 1 4.569087e-16
## 2 8 5.491765e-16 1 4.569087e-16
## 2 9 5.491765e-16 1 4.569087e-16
## 2 10 5.491765e-16 1 4.569087e-16
## 2 11 5.491765e-16 1 4.569087e-16
## 2 12 5.491765e-16 1 4.569087e-16
## 2 13 5.491765e-16 1 4.569087e-16
## 2 14 5.491765e-16 1 4.569087e-16
## 2 15 5.491765e-16 1 4.569087e-16
## 2 16 5.491765e-16 1 4.569087e-16
## 2 17 5.491765e-16 1 4.569087e-16
## 2 18 5.491765e-16 1 4.569087e-16
## 2 19 5.491765e-16 1 4.569087e-16
## 2 20 5.491765e-16 1 4.569087e-16
## 2 21 5.491765e-16 1 4.569087e-16
## 2 22 5.491765e-16 1 4.569087e-16
## 2 23 5.491765e-16 1 4.569087e-16
## 2 24 5.491765e-16 1 4.569087e-16
## 2 25 5.491765e-16 1 4.569087e-16
## 2 26 5.491765e-16 1 4.569087e-16
## 2 27 5.491765e-16 1 4.569087e-16
## 2 28 5.491765e-16 1 4.569087e-16
## 2 29 5.491765e-16 1 4.569087e-16
## 2 30 5.491765e-16 1 4.569087e-16
## 2 31 5.491765e-16 1 4.569087e-16
## 2 32 5.491765e-16 1 4.569087e-16
## 2 33 5.491765e-16 1 4.569087e-16
## 2 34 5.491765e-16 1 4.569087e-16
## 2 35 5.491765e-16 1 4.569087e-16
## 2 36 5.491765e-16 1 4.569087e-16
## 2 37 5.491765e-16 1 4.569087e-16
## 2 38 5.491765e-16 1 4.569087e-16
##
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were nprune = 2 and degree = 1.
#plot(marsTuned)
postResample(pred = predict(marsTuned, newdata=testX), obs = testY)
## RMSE Rsquared MAE
## 5.086513e-16 1.000000e+00 4.391019e-16
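Which nonlinear regression model gives the optimal resampling and test set performance?
Taken at face value, the MARS model gives the best resampling and test set performance, with a test set RMSE of 5.086513e-16 and an R-squared of 1. An error at the level of floating-point rounding is not a credible estimate, though. trainX is built from every column of pred, so Yield itself is one of the 57 predictors (the variable importance output below ranks it first at 100), and MARS simply reproduces it. PLS (test set RMSE 0.064) and KNN (test set RMSE 0.503) are affected by the same leakage, so the ranking should be re-checked with Yield removed from the predictors.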
Which predictors are most important in the optimal nonlinear regression model? Do either the biological or process variables dominate the list? How do the top ten important predictors compare to the top ten predictors from the optimal linear model?
#varImp(marsTuned)
#varImp(nnetTuned)
varImp(plsTune)
## Warning: package 'pls' was built under R version 4.0.5
##
## Attaching package: 'pls'
## The following object is masked from 'package:caret':
##
## R2
## The following object is masked from 'package:stats':
##
## loadings
## pls variable importance
##
## only 20 most important variables shown (out of 57)
##
## Overall
## Yield 100.00
## ManufacturingProcess32 43.05
## ManufacturingProcess13 39.27
## ManufacturingProcess36 36.66
## ManufacturingProcess17 36.31
## ManufacturingProcess09 34.64
## BiologicalMaterial02 33.30
## BiologicalMaterial06 31.38
## BiologicalMaterial08 30.86
## BiologicalMaterial12 29.61
## BiologicalMaterial03 29.25
## ManufacturingProcess33 29.20
## BiologicalMaterial11 28.98
## BiologicalMaterial01 27.17
## ManufacturingProcess06 27.14
## BiologicalMaterial04 26.78
## ManufacturingProcess12 25.48
## ManufacturingProcess11 25.31
## ManufacturingProcess04 23.70
## ManufacturingProcess28 22.16
varImp(knnModel)
## loess r-squared variable importance
##
## only 20 most important variables shown (out of 57)
##
## Overall
## Yield 100.00
## ManufacturingProcess32 37.86
## ManufacturingProcess13 36.69
## BiologicalMaterial06 35.28
## BiologicalMaterial12 31.62
## ManufacturingProcess17 31.25
## BiologicalMaterial03 30.61
## BiologicalMaterial02 28.91
## ManufacturingProcess09 27.57
## ManufacturingProcess36 27.23
## BiologicalMaterial11 24.66
## ManufacturingProcess06 23.12
## ManufacturingProcess31 22.35
## BiologicalMaterial04 20.71
## BiologicalMaterial08 19.88
## ManufacturingProcess11 19.18
## ManufacturingProcess33 19.12
## ManufacturingProcess29 18.05
## BiologicalMaterial01 18.02
## ManufacturingProcess02 16.06
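We ran into an issue with the MARS model's selection of the most important predictors, so the comparison uses the PLS and KNN rankings, which are largely the same: nine of the top ten predictors after Yield are shared. Yield heads both lists only because it was left among the predictors. As in Exercise 6.3, the manufacturing process variables lead the rankings (ManufacturingProcess32 and ManufacturingProcess13 come first in both), although the top ten after Yield is split evenly between five process and five biological variables.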
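How closely the two rankings agree can be checked from the printed lists. The sketch below copies the ten highest-ranked predictors after Yield from each printout and compares them; plsTop and knnTop are names introduced here.

```r
# Top ten predictors after Yield, copied from the varImp(plsTune) output
plsTop <- c("ManufacturingProcess32", "ManufacturingProcess13",
            "ManufacturingProcess36", "ManufacturingProcess17",
            "ManufacturingProcess09", "BiologicalMaterial02",
            "BiologicalMaterial06", "BiologicalMaterial08",
            "BiologicalMaterial12", "BiologicalMaterial03")

# Top ten predictors after Yield, copied from the varImp(knnModel) output
knnTop <- c("ManufacturingProcess32", "ManufacturingProcess13",
            "BiologicalMaterial06", "BiologicalMaterial12",
            "ManufacturingProcess17", "BiologicalMaterial03",
            "BiologicalMaterial02", "ManufacturingProcess09",
            "ManufacturingProcess36", "BiologicalMaterial11")

shared <- intersect(plsTop, knnTop)
length(shared)                        # predictors common to both top tens
setdiff(plsTop, knnTop)               # in the PLS list only
setdiff(knnTop, plsTop)               # in the KNN list only
sum(grepl("^Manufacturing", plsTop))  # process variables in the PLS top ten
sum(grepl("^Manufacturing", knnTop))  # process variables in the KNN top ten
```

The comparison uses names only. It ignores how far apart the importance scores are, so it says nothing about how strongly each list separates its leaders from the rest.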
plot(trainX$ManufacturingProcess32, trainX$Yield , pch = 19)
#lines(trainX$ManufacturingProcess32, trainX$Yield, type = "b", col = 3, lwd = 4, pch = 2 )
Very interesting!
## Plot Yield against the ten highest-ranked predictors from the PLS list
topVars <- c("ManufacturingProcess32", "ManufacturingProcess13", "ManufacturingProcess36", "ManufacturingProcess17", "ManufacturingProcess09", "BiologicalMaterial02", "BiologicalMaterial06", "BiologicalMaterial08", "BiologicalMaterial12", "BiologicalMaterial03")
featurePlot(trainX[, topVars], trainX$Yield)