7.2. Friedman (1991) introduced several benchmark data sets created by simulation. One of these simulations used the following nonlinear equation to create data: $y = 10\sin(\pi x_1 x_2) + 20(x_3 - 0.5)^2 + 10x_4 + 5x_5 + N(0, \sigma^2)$, where the x values are random variables uniformly distributed on [0, 1] (five other non-informative variables are also created in the simulation). The package mlbench contains a function called mlbench.friedman1 that simulates these data:
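#Note: To make the formula concrete, here is a minimal base-R sketch of the same simulation. The helpers friedman1_mean and simulate_friedman1 are hypothetical names written for illustration; they are not the mlbench implementation, only a direct transcription of the equation above.

```r
# Noise-free part of the Friedman 1 response for a matrix of predictors.
# Only columns 1-5 enter the formula; any further columns are ignored.
friedman1_mean <- function(x) {
  x <- as.matrix(x)
  10 * sin(pi * x[, 1] * x[, 2]) + 20 * (x[, 3] - 0.5)^2 +
    10 * x[, 4] + 5 * x[, 5]
}

# Hand-rolled simulator (hypothetical helper): 10 uniform predictors,
# of which the last 5 are pure noise, plus Gaussian error on the response.
simulate_friedman1 <- function(n, sd = 1) {
  x <- matrix(runif(n * 10), ncol = 10)
  list(x = x, y = friedman1_mean(x) + rnorm(n, sd = sd))
}

set.seed(200)
sim <- simulate_friedman1(200, sd = 1)
str(sim)
```

Because the seed handling and random draws differ from mlbench, this sketch will not reproduce the exact values used in the rest of the exercise; it only illustrates the data-generating process.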

#Required packages and libraries:

# install.packages("earth")  # run once if 'earth' is not already installed
library(mlbench)
library(caret)
## Loading required package: ggplot2
## Loading required package: lattice
library(earth)
## Loading required package: Formula
## Loading required package: plotmo
## Loading required package: plotrix

#Loading data

set.seed(200)
trainingData <- mlbench.friedman1(200, sd = 1)
trainingData$x <- data.frame(trainingData$x)

set.seed(123)
trainIndex <- createDataPartition(trainingData$y, p = 0.8, list = FALSE)

trainX <- trainingData$x[trainIndex, ]
testX  <- trainingData$x[-trainIndex, ]
trainY <- trainingData$y[trainIndex]
testY  <- trainingData$y[-trainIndex]

ctrl <- trainControl(method = "cv", number = 5)

featurePlot(trainingData$x, trainingData$y)

#Fit multiple models

#KNN model

set.seed(123)
knnModel <- train(
  x = trainX,
  y = trainY,
  method = "knn",
  preProcess = c("center", "scale"),
  tuneLength = 10,
  trControl = ctrl
)

#SVM model

set.seed(123)
svmModel <- train(
  x = trainX,
  y = trainY,
  method = "svmRadial",
  preProcess = c("center", "scale"),
  tuneLength = 10,
  trControl = ctrl
)

#MARS(earth) Model

set.seed(123)
marsModel <- train(
  x = trainX,
  y = trainY,
  method = "earth",
  tuneLength = 10,
  trControl = ctrl
)

#Neural Network Model

set.seed(123)
nnetModel <- train(
  x = trainX,
  y = trainY,
  method = "nnet",
  preProcess = c("center", "scale"),
  tuneLength = 10,
  trControl = ctrl,
  trace = FALSE
)
## Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo,
## : There were missing values in resampled performance measures.

#Compare resampling performance

results <- resamples(list(
  KNN = knnModel,
  SVM = svmModel,
  MARS = marsModel,
  NNET = nnetModel
))

summary(results)
## 
## Call:
## summary.resamples(object = results)
## 
## Models: KNN, SVM, MARS, NNET 
## Number of resamples: 5 
## 
## MAE 
##            Min.   1st Qu.    Median      Mean   3rd Qu.      Max. NA's
## KNN   2.1787387  2.377898  2.396911  2.541953  2.804047  2.952170    0
## SVM   1.4523860  1.482680  1.564855  1.547754  1.591627  1.647222    0
## MARS  0.9646322  1.504122  1.558764  1.475199  1.598404  1.750075    0
## NNET 12.9206915 13.427731 13.465472 13.393685 13.515723 13.638808    0
## 
## RMSE 
##           Min.   1st Qu.    Median      Mean   3rd Qu.      Max. NA's
## KNN   2.754618  2.856409  3.047236  3.168206  3.572812  3.609955    0
## SVM   1.772888  1.806176  2.053598  1.995793  2.102949  2.243353    0
## MARS  1.384900  1.876485  1.890364  1.855192  1.962265  2.161946    0
## NNET 13.912244 14.089797 14.255479 14.288961 14.570281 14.617005    0
## 
## Rsquared 
##           Min.   1st Qu.    Median      Mean   3rd Qu.      Max. NA's
## KNN  0.5947779 0.7134049 0.7588157 0.7262370 0.7623482 0.8018382    0
## SVM  0.7319290 0.8473133 0.8517162 0.8386436 0.8682216 0.8940381    0
## MARS 0.8212629 0.8591270 0.8673515 0.8687133 0.8681654 0.9276596    0
## NNET        NA        NA        NA       NaN        NA        NA    5
bwplot(results)

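#Comment: Based on the cross-validation results, MARS performs best, with the lowest mean RMSE (1.86) and MAE (1.48) and the highest mean R² (0.87). SVM is a close second (mean RMSE 2.00, R² 0.84), and KNN trails both (mean RMSE 3.17, R² 0.73). The neural network fails outright: an RMSE of about 14.3 with an undefined R² indicates constant predictions. The likely cause is that linout = TRUE was not passed to nnet, so the output unit stays logistic and predictions cannot leave (0, 1), while y here averages roughly 14. Tuning size and decay cannot fix that.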
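#Note: The sketch below checks whether a saturated logistic output is enough to explain the neural network's numbers. It simulates Friedman 1 responses in base R (a stand-in for mlbench.friedman1, so the values will not match exactly) and scores a prediction of 1 for every sample. Attributing the failure to a missing linout = TRUE is an inference from the output above, not something the original run confirms.

```r
# Simulate Friedman 1 responses with base R (stand-in for mlbench.friedman1)
set.seed(200)
n <- 200
x <- matrix(runif(n * 10), ncol = 10)
y <- 10 * sin(pi * x[, 1] * x[, 2]) + 20 * (x[, 3] - 0.5)^2 +
  10 * x[, 4] + 5 * x[, 5] + rnorm(n)

# A logistic output unit can never exceed 1, so for responses far above 1
# the closest it can get is to saturate at (almost) 1 for every sample.
pred <- rep(1, n)

rmse <- sqrt(mean((y - pred)^2))
mae  <- mean(abs(y - pred))
r2   <- suppressWarnings(cor(pred, y)^2)  # NA: constant predictions have zero variance

c(RMSE = rmse, MAE = mae, Rsquared = r2)

# Hedged fix (not run here): pass linout = TRUE through train() to nnet, e.g.
# train(x = trainX, y = trainY, method = "nnet", linout = TRUE, trace = FALSE,
#       preProcess = c("center", "scale"), tuneLength = 10, trControl = ctrl)
```

If the resulting RMSE and MAE land near the 14.3 and 13.4 reported for NNET, the saturated-output explanation is consistent with the results.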

#Variable Importance

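#Note: KNN and svmRadial have no model-specific importance measure. For such models varImp() falls back to a model-free filter (the loess R² of each predictor against the outcome), which is why the SVM output below is labeled "loess r-squared".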

varImp(marsModel)
## earth variable importance
## 
##    Overall
## X4  100.00
## X2   79.05
## X1   50.06
## X5   17.88
## X3    0.00
varImp(svmModel)
## loess r-squared variable importance
## 
##     Overall
## X4  100.000
## X1   90.065
## X2   84.677
## X5   39.456
## X3   25.521
## X9   16.623
## X10  11.687
## X8    3.946
## X7    2.356
## X6    0.000
varImp(nnetModel)
## nnet variable importance
## 
##      Overall
## X4  100.0000
## X1   78.7348
## X2   52.7002
## X5   19.1295
## X9   16.6698
## X8   13.8779
## X3    8.7366
## X10   3.8570
## X6    0.9438
## X7    0.0000

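#Comment: X4, X1, X2, and X5 are the top four predictors under all three measures, although their order varies. X3 ranks fifth or lower in every list and gets zero importance under MARS, likely because its quadratic effect, 20(x3 − 0.5)², is symmetric about 0.5 and therefore has no overall linear trend. The noise variables X6–X10 do not appear in the MARS list at all and fill the bottom five places of the loess-based ranking, although X9 and X10 still score about 17 and 12 there. The nnet ranking places X9 and X8 above X3, but that model did not fit properly, so its importances should not be trusted. Overall, the MARS and loess results align well with the true data-generating process.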
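#Note: A small base-R check of the X3 argument. It isolates X3's term from the Friedman 1 formula and compares its correlation with x3 itself against its correlation with the squared distance from 0.5. The sample size and seed are arbitrary choices for illustration.

```r
set.seed(1)
x3     <- runif(10000)
effect <- 20 * (x3 - 0.5)^2               # X3's contribution to y in Friedman 1

cor_raw      <- cor(x3, effect)             # linear association with x3 itself
cor_centered <- cor((x3 - 0.5)^2, effect)   # association after the right transform

round(c(raw = cor_raw, squared_distance = cor_centered), 3)
```

The first correlation should be close to zero and the second exactly one, which is why measures that look for a monotone trend can understate X3.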

#Test set performance

pred_knn  <- predict(knnModel, testX)
pred_svm  <- predict(svmModel, testX)
pred_mars <- predict(marsModel, testX)
pred_nnet <- predict(nnetModel, testX)

postResample(pred_knn, testY)
##      RMSE  Rsquared       MAE 
## 2.9933348 0.6859249 2.4099250
postResample(pred_svm, testY)
##     RMSE Rsquared      MAE 
## 2.092857 0.818803 1.750291
postResample(pred_mars, testY)
##      RMSE  Rsquared       MAE 
## 1.7844102 0.8630031 1.3327807
postResample(pred_nnet, testY)
##     RMSE Rsquared      MAE 
## 14.31431       NA 13.50617

#Comment: On the test set, MARS again performs best, with the lowest RMSE (≈1.78), lowest MAE (≈1.33), and highest R² (≈0.86). SVM is second (RMSE ≈2.09, R² ≈0.82), followed by KNN (RMSE ≈2.99, R² ≈0.69), so the ranking matches cross-validation. The neural network's RMSE of ≈14.3 and undefined R² again point to constant predictions, which is what a logistic output unit produces when linout = TRUE is not passed to nnet for a regression outcome.
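#Note: For reference, the three numbers reported by postResample can be computed by hand. The function below is a hypothetical base-R helper, written on the assumption that Rsquared is the squared Pearson correlation between predictions and observations, which is my understanding of caret's definition for regression.

```r
# RMSE, squared correlation, and MAE for numeric predictions
regression_metrics <- function(pred, obs) {
  c(RMSE     = sqrt(mean((pred - obs)^2)),
    Rsquared = cor(pred, obs)^2,
    MAE      = mean(abs(pred - obs)))
}

# Toy example with made-up numbers
obs  <- c(1, 2, 3, 4)
pred <- c(1.5, 2, 2.5, 4.5)
m <- regression_metrics(pred, obs)
m
```

With this definition, a constant prediction vector gives an undefined Rsquared, because the correlation with a zero-variance vector is NA.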

7.5. Exercise 6.3 describes data for a chemical manufacturing process. Use the same data imputation, data splitting, and pre-processing steps as before and train several nonlinear regression models.

#Loading data

library(caret)
library(AppliedPredictiveModeling)

data(ChemicalManufacturingProcess)

#Data Splitting

set.seed(123)

inTrain <- createDataPartition(ChemicalManufacturingProcess$Yield,
                               p = 0.8, list = FALSE)

trainData <- ChemicalManufacturingProcess[inTrain, ]
testData  <- ChemicalManufacturingProcess[-inTrain, ]

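#Imputation & Preprocessing
#Note: trainData still contains Yield, so the response is centered and scaled along with the predictors. The RMSE and MAE values below are therefore in standard-deviation units of Yield.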

preProcValues <- preProcess(trainData, method = c("medianImpute", "center", "scale"))

trainTransformed <- predict(preProcValues, trainData)
testTransformed  <- predict(preProcValues, testData)
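#Note: The key point of the step above is that imputation and scaling statistics are learned from the training rows and then reused unchanged on the test rows. The base-R sketch below illustrates that idea on a toy data frame. The helpers fit_preprocess and apply_preprocess are hypothetical and do not reproduce caret's internals.

```r
# Learn imputation and scaling statistics from the training rows only
fit_preprocess <- function(train) {
  list(
    median = vapply(train, median, numeric(1), na.rm = TRUE),
    mean   = vapply(train, mean,   numeric(1), na.rm = TRUE),
    sd     = vapply(train, sd,     numeric(1), na.rm = TRUE)
  )
}

# Apply the stored training statistics to any data frame with the same columns
apply_preprocess <- function(stats, data) {
  for (col in names(stats$median)) {
    v <- data[[col]]
    v[is.na(v)] <- stats$median[[col]]                      # median imputation
    data[[col]] <- (v - stats$mean[[col]]) / stats$sd[[col]] # center and scale
  }
  data
}

# Toy data (made up for illustration)
train <- data.frame(a = c(1, 2, NA, 3), b = c(10, 20, 30, 40))
test  <- data.frame(a = c(NA, 4),       b = c(25, 50))

stats   <- fit_preprocess(train)
train_t <- apply_preprocess(stats, train)
test_t  <- apply_preprocess(stats, test)
test_t
```

Nothing from the test rows feeds into the stored statistics, which keeps the test-set estimate honest.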

#Train Nonlinear Models

ctrl <- trainControl(method = "cv", number = 5)

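set.seed(123)
#Note: the seed is set only once, so each train() call below draws its own cross-validation folds. The resampling summaries are therefore not computed on identical folds across models.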

# KNN
knnModel <- train(Yield ~ ., data = trainTransformed,
                  method = "knn",
                  tuneLength = 10,
                  trControl = ctrl)

# SVM
svmModel <- train(Yield ~ ., data = trainTransformed,
                  method = "svmRadial",
                  tuneLength = 10,
                  trControl = ctrl)

# MARS
marsModel <- train(Yield ~ ., data = trainTransformed,
                   method = "earth",
                   tuneLength = 10,
                   trControl = ctrl)

# Neural Net
nnetModel <- train(Yield ~ ., data = trainTransformed,
                   method = "nnet",
                   tuneLength = 10,
                   trControl = ctrl,
                   trace = FALSE)
## Warning: model fit failed for Fold1: size=17, decay=0.0000000 Error in nnet.default(x, y, w, ...) : too many (1004) weights
## Warning: model fit failed for Fold1: size=19, decay=0.0000000 Error in nnet.default(x, y, w, ...) : too many (1122) weights
## [98 further warnings of the same form omitted: size=17 and size=19 fail for each of the 10 decay values in Fold1 through Fold5]
## Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo,
## : There were missing values in resampled performance measures.
## Warning in train.default(x, y, weights = w, ...): missing values found in
## aggregated results
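#Note: The "too many weights" failures can be checked with a little arithmetic. The sketch below counts the weights of a single-hidden-layer network with 57 predictors and one output. The helper nnet_n_weights is hypothetical, skip-layer connections are assumed absent, and the size grid is inferred from the warnings rather than read from the fitted object.

```r
# Weights in a single-hidden-layer network with p inputs, h hidden units and
# one output: (p + 1) * h for input-to-hidden (including bias) plus (h + 1)
# for hidden-to-output (including bias).
nnet_n_weights <- function(p, h) (p + 1) * h + (h + 1)

p     <- 57                       # predictors in the chemical manufacturing data
sizes <- seq(1, 19, by = 2)       # assumed hidden-unit grid for tuneLength = 10
n_wts <- nnet_n_weights(p, sizes)

data.frame(size = sizes, weights = n_wts, exceeds_1000 = n_wts > 1000)

# Hedged fix (not run here): raise nnet's weight limit through train(), e.g.
# train(Yield ~ ., data = trainTransformed, method = "nnet",
#       MaxNWts = max(n_wts), tuneLength = 10, trControl = ctrl, trace = FALSE)
```

Sizes 17 and 19 give 1004 and 1122 weights, matching the warnings. Both exceed 1000, which I believe is nnet's default MaxNWts, so those candidates were never evaluated.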

#Compare resampling performance

results <- resamples(list(
  KNN = knnModel,
  SVM = svmModel,
  MARS = marsModel,
  NNET = nnetModel
))

summary(results)
## 
## Call:
## summary.resamples(object = results)
## 
## Models: KNN, SVM, MARS, NNET 
## Number of resamples: 5 
## 
## MAE 
##           Min.   1st Qu.    Median      Mean   3rd Qu.      Max. NA's
## KNN  0.4328363 0.4657012 0.5147072 0.5614261 0.6715309 0.7223549    0
## SVM  0.4382031 0.4555181 0.4782245 0.4788199 0.5076219 0.5145318    0
## MARS 0.4338469 0.4772605 0.5237300 0.5209531 0.5838133 0.5861149    0
## NNET 0.5066585 0.6383440 0.6772955 0.6430231 0.6959067 0.6969108    0
## 
## RMSE 
##           Min.   1st Qu.    Median      Mean   3rd Qu.      Max. NA's
## KNN  0.5413910 0.6321094 0.6349797 0.7028631 0.8459635 0.8598718    0
## SVM  0.5563581 0.5727391 0.5729584 0.6042018 0.6288512 0.6901020    0
## MARS 0.5075426 0.6193663 0.6215339 0.6396564 0.6823830 0.7674561    0
## NNET 0.6373146 0.7965150 0.8339999 0.8080682 0.8495031 0.9230083    0
## 
## Rsquared 
##           Min.   1st Qu.    Median      Mean   3rd Qu.      Max. NA's
## KNN  0.3082065 0.3487485 0.5670712 0.5175147 0.6348217 0.7287254    0
## SVM  0.5633838 0.6200789 0.6445207 0.6497739 0.6578666 0.7630198    0
## MARS 0.4941583 0.5553619 0.5810792 0.5888236 0.6091588 0.7043600    0
## NNET 0.4346845 0.4376291 0.4650075 0.4934819 0.5396108 0.5904775    0
bwplot(results)

#Test set performance

pred_knn  <- predict(knnModel, testTransformed)
pred_svm  <- predict(svmModel, testTransformed)
pred_mars <- predict(marsModel, testTransformed)
pred_nnet <- predict(nnetModel, testTransformed)

postResample(pred_knn, testTransformed$Yield)
##      RMSE  Rsquared       MAE 
## 0.7571541 0.4285288 0.6298808
postResample(pred_svm, testTransformed$Yield)
##      RMSE  Rsquared       MAE 
## 0.6616231 0.5555194 0.5560868
postResample(pred_mars, testTransformed$Yield)
##      RMSE  Rsquared       MAE 
## 0.7434069 0.4680622 0.6027513
postResample(pred_nnet, testTransformed$Yield)
##      RMSE  Rsquared       MAE 
## 0.8927062 0.2969256 0.7539213

#Variable importance

varImp(svmModel)
## loess r-squared variable importance
## 
##   only 20 most important variables shown (out of 57)
## 
##                        Overall
## ManufacturingProcess32  100.00
## BiologicalMaterial06     94.06
## ManufacturingProcess36   81.54
## BiologicalMaterial03     81.27
## ManufacturingProcess13   80.63
## ManufacturingProcess31   78.52
## BiologicalMaterial02     76.04
## ManufacturingProcess17   75.92
## ManufacturingProcess09   73.04
## BiologicalMaterial12     69.48
## ManufacturingProcess06   66.24
## BiologicalMaterial11     59.72
## ManufacturingProcess33   57.06
## ManufacturingProcess29   54.40
## BiologicalMaterial04     53.93
## BiologicalMaterial01     45.62
## BiologicalMaterial08     44.93
## ManufacturingProcess30   42.47
## BiologicalMaterial09     40.88
## ManufacturingProcess11   38.38
plot(varImp(svmModel), top = 20)

#(a) Which nonlinear regression model gives the optimal resampling and test set performance?

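#Answer: The SVM model is best on both counts. In resampling it has the lowest mean RMSE (0.60) and MAE (0.48) and the highest mean R² (0.65). On the test set it again leads, with RMSE 0.66, MAE 0.56, and R² 0.56. MARS is second in both cases (resampled RMSE 0.64, test RMSE 0.74), followed closely by KNN (0.70 and 0.76), and the neural network is weakest (0.81 and 0.89). Test-set R² is lower than the resampled estimate for every model, so the cross-validation figures are somewhat optimistic.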

#(b) Which predictors are most important in the optimal nonlinear regression model? Do either the biological or process variables dominate the list? How do the top ten important predictors compare to the top ten predictors from the optimal linear model?

#Answer: The most important predictor is ManufacturingProcess32, followed by BiologicalMaterial06, ManufacturingProcess36, and BiologicalMaterial03. Neither group dominates, but process variables have a slight edge: they hold 6 of the top 10 places and 11 of the top 20, with biological variables taking the rest. Note that svmRadial has no model-specific importance measure, so these scores come from caret's model-free loess R² filter and describe each predictor's individual relationship with yield rather than its role in the fitted SVM. Comparing the top ten with the optimal linear model requires the importance list from Exercise 6.3. Any predictor that appears only in the list above is a candidate for a nonlinear relationship with yield.
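#Note: The process-versus-biological tally can be reproduced from the printed importance list. The sketch below copies the 20 predictor names from the varImp(svmModel) output, in order, and counts each group by name prefix.

```r
# Top 20 predictors, in the order printed by varImp(svmModel)
top20 <- c(
  "ManufacturingProcess32", "BiologicalMaterial06", "ManufacturingProcess36",
  "BiologicalMaterial03", "ManufacturingProcess13", "ManufacturingProcess31",
  "BiologicalMaterial02", "ManufacturingProcess17", "ManufacturingProcess09",
  "BiologicalMaterial12", "ManufacturingProcess06", "BiologicalMaterial11",
  "ManufacturingProcess33", "ManufacturingProcess29", "BiologicalMaterial04",
  "BiologicalMaterial01", "BiologicalMaterial08", "ManufacturingProcess30",
  "BiologicalMaterial09", "ManufacturingProcess11"
)

is_process <- grepl("^ManufacturingProcess", top20)

counts <- c(
  process_top10    = sum(is_process[1:10]),
  biological_top10 = sum(!is_process[1:10]),
  process_top20    = sum(is_process),
  biological_top20 = sum(!is_process)
)
counts
```

The same prefix test could be applied to the linear model's top ten from Exercise 6.3 to compare the two lists with setdiff().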

#(c) Explore the relationships between the top predictors and the response for the predictors that are unique to the optimal nonlinear regression model. Do these plots reveal intuition about the biological or process predictors and their relationship with yield?

#Answer: Scatter plots of yield against the top predictors, such as ManufacturingProcess32 and BiologicalMaterial06, show relationships that are not simple straight lines. The effect of each predictor varies across its range, which suggests nonlinear behavior and possibly interactions. This is consistent with a nonlinear model such as SVM fitting these data well. From a practical perspective, the process variables are the ones that can be controlled, so tuning key settings, especially ManufacturingProcess32, is the most direct route to improving yield, while the biological variables describe the raw material and contribute in ways that are harder to act on.
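#Note: One way to produce the plots discussed in (c) is sketched below with base graphics. The function plot_vs_response is a hypothetical helper, and the choice of a lowess smoother is my own assumption; it is not taken from the original analysis.

```r
# Scatter plot of the response against each chosen predictor, with a
# lowess smoother overlaid to make curvature visible.
plot_vs_response <- function(data, response, predictors) {
  old_par <- par(mfrow = c(1, length(predictors)))
  on.exit(par(old_par))
  y <- data[[response]]
  for (p in predictors) {
    x  <- data[[p]]
    ok <- complete.cases(x, y)            # lowess() cannot handle missing values
    plot(x[ok], y[ok], xlab = p, ylab = response)
    lines(lowess(x[ok], y[ok]), col = "red", lwd = 2)
  }
  invisible(predictors)
}

# Intended use with the data loaded earlier (not run here):
# plot_vs_response(ChemicalManufacturingProcess, "Yield",
#                  c("ManufacturingProcess32", "BiologicalMaterial06"))
```

Plotting the raw, unscaled data keeps the axes in the original units, which makes the curves easier to interpret for process engineers.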