library(tidyverse)
library(caret)
In this document, we work through Exercises 7.2 and 7.5 from Applied Predictive Modeling by Kuhn and Johnson.
Friedman (1991) introduced several benchmark data sets created by simulation. One of these simulations used the following nonlinear equation to create data:

y = 10 sin(πx1x2) + 20(x3 − 0.5)² + 10x4 + 5x5 + N(0, σ²)

where the x values are random variables uniformly distributed on [0, 1] (there are also 5 other non-informative variables created in the simulation). The package mlbench contains a function called mlbench.friedman1 that simulates these data:
library(mlbench)
set.seed(200)
trainingData <- mlbench.friedman1(200, sd = 1)
## We convert the 'x' data from a matrix to a data frame
## One reason is that this will give the columns names.
trainingData$x <- data.frame(trainingData$x)
## Look at the data using
featurePlot(trainingData$x, trainingData$y)
## or other methods.
## This creates a list with a vector 'y' and a matrix
## of predictors 'x'. Also simulate a large test set to
## estimate the true error rate with good precision:
testData <- mlbench.friedman1(5000, sd = 1)
testData$x <- data.frame(testData$x)
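To make the generating equation concrete, here is a hand-rolled sketch of what mlbench.friedman1 just did (our own code, for intuition only; the mlbench version is the canonical implementation):

# A minimal sketch of the Friedman (1991) simulation.
# Columns 6-10 are generated but never used in y, i.e. non-informative.
simulate_friedman1 <- function(n, sd = 1) {
  x <- matrix(runif(n * 10), ncol = 10)
  y <- 10 * sin(pi * x[, 1] * x[, 2]) +
    20 * (x[, 3] - 0.5)^2 +
    10 * x[, 4] +
    5 * x[, 5] +
    rnorm(n, sd = sd)   # additive N(0, sd^2) noise
  list(x = x, y = y)
}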
Tune several models on these data. For example:
First we train a KNN model.
knnModel <- train(x = trainingData$x,
                  y = trainingData$y,
                  method = "knn",
                  preProc = c("center", "scale"),
                  tuneLength = 10)
knnModel
## k-Nearest Neighbors
##
## 200 samples
## 10 predictor
##
## Pre-processing: centered (10), scaled (10)
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 200, 200, 200, 200, 200, 200, ...
## Resampling results across tuning parameters:
##
## k RMSE Rsquared MAE
## 5 3.466085 0.5121775 2.816838
## 7 3.349428 0.5452823 2.727410
## 9 3.264276 0.5785990 2.660026
## 11 3.214216 0.6024244 2.603767
## 13 3.196510 0.6176570 2.591935
## 15 3.184173 0.6305506 2.577482
## 17 3.183130 0.6425367 2.567787
## 19 3.198752 0.6483184 2.592683
## 21 3.188993 0.6611428 2.588787
## 23 3.200458 0.6638353 2.604529
##
## RMSE was used to select the optimal model using the smallest value.
## The final value used for the model was k = 17.
Then we evaluate the model on the test set.
predict(knnModel, testData$x) %>%
  postResample(pred = ., obs = testData$y)
## RMSE Rsquared MAE
## 3.2040595 0.6819919 2.5683461
Next we train a linear regression model as a baseline.
lmModel <- train(x = trainingData$x,
                 y = trainingData$y,
                 method = "lm",
                 preProc = c("center", "scale"),
                 tuneLength = 10)
lmModel
## Linear Regression
##
## 200 samples
## 10 predictor
##
## Pre-processing: centered (10), scaled (10)
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 200, 200, 200, 200, 200, 200, ...
## Resampling results:
##
## RMSE Rsquared MAE
## 2.47988 0.7559676 1.955258
##
## Tuning parameter 'intercept' was held constant at a value of TRUE
We are already seeing better results with linear regression because it weights each predictor by its relationship with the response when fitting the model; KNN does not, treating every predictor equally in its distance calculations.
predict(lmModel, testData$x) %>%
  postResample(pred = ., obs = testData$y)
## RMSE Rsquared MAE
## 2.6970680 0.7084666 2.0600540
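As a quick illustration of the distance-dilution point above (our own check, not part of the exercise), refitting KNN on only the five informative predictors should noticeably improve it, since the noise columns corrupt every distance calculation:

knnInformative <- train(x = trainingData$x[, 1:5],   # keep only X1-X5
                        y = trainingData$y,
                        method = "knn",
                        preProc = c("center", "scale"),
                        tuneLength = 10)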
Next we train a radial SVM.
svmrModel <- train(x = trainingData$x,
                   y = trainingData$y,
                   method = "svmRadial",
                   preProc = c("center", "scale"),
                   tuneLength = 14)
svmrModel
## Support Vector Machines with Radial Basis Function Kernel
##
## 200 samples
## 10 predictor
##
## Pre-processing: centered (10), scaled (10)
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 200, 200, 200, 200, 200, 200, ...
## Resampling results across tuning parameters:
##
## C RMSE Rsquared MAE
## 0.25 2.525979 0.7804630 2.016014
## 0.50 2.293423 0.7960080 1.808878
## 1.00 2.156969 0.8112034 1.697751
## 2.00 2.081486 0.8226986 1.631756
## 4.00 2.050864 0.8270475 1.605584
## 8.00 2.046714 0.8280409 1.602156
## 16.00 2.046390 0.8281073 1.601597
## 32.00 2.046390 0.8281073 1.601597
## 64.00 2.046390 0.8281073 1.601597
## 128.00 2.046390 0.8281073 1.601597
## 256.00 2.046390 0.8281073 1.601597
## 512.00 2.046390 0.8281073 1.601597
## 1024.00 2.046390 0.8281073 1.601597
## 2048.00 2.046390 0.8281073 1.601597
##
## Tuning parameter 'sigma' was held constant at a value of 0.06529705
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were sigma = 0.06529705 and C = 16.
Our radial SVM gives us better predictions than either previous model, which is interesting considering the data seems to be mostly linear in structure.
predict(svmrModel, testData$x) %>%
  postResample(pred = ., obs = testData$y)
## RMSE Rsquared MAE
## 2.0792960 0.8247794 1.5796158
Next we train a linear SVM. We attempted a polynomial SVM, but training took far too long: with no degree specified, caret must evaluate many combinations of polynomial degree, scale, and cost. Even with the degree fixed at two, training was still too slow.
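If we did want a polynomial SVM, fixing the tuning grid ourselves would keep it tractable. A sketch, left commented out given the run time (the grid values here are our own assumptions, not from the text):

polyGrid <- expand.grid(degree = 2,              # fix the degree
                        scale = c(0.001, 0.01, 0.1),
                        C = 2^(-2:4))
# svmpModel <- train(x = trainingData$x, y = trainingData$y,
#                    method = "svmPoly", preProc = c("center", "scale"),
#                    tuneGrid = polyGrid)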
svmModel <- train(x = trainingData$x,
                  y = trainingData$y,
                  method = "svmLinear",
                  preProc = c("center", "scale"),
                  tuneLength = 14)
svmModel
## Support Vector Machines with Linear Kernel
##
## 200 samples
## 10 predictor
##
## Pre-processing: centered (10), scaled (10)
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 200, 200, 200, 200, 200, 200, ...
## Resampling results:
##
## RMSE Rsquared MAE
## 2.6106 0.7314207 2.067093
##
## Tuning parameter 'C' was held constant at a value of 1
Interestingly enough, our linear SVM performs worse than both the radial SVM and even a plain linear regression model.
predict(svmModel, testData$x) %>%
  postResample(pred = ., obs = testData$y)
## RMSE Rsquared MAE
## 2.7633860 0.6973384 2.0970616
Next we train a MARS model.
marsGrid <- expand.grid(.degree = 1:2, .nprune = 2:38)
marsModel <- train(x = trainingData$x,
                   y = trainingData$y,
                   method = "earth",
                   preProc = c("center", "scale"),
                   tuneGrid = marsGrid)
marsModel
## Multivariate Adaptive Regression Spline
##
## 200 samples
## 10 predictor
##
## Pre-processing: centered (10), scaled (10)
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 200, 200, 200, 200, 200, 200, ...
## Resampling results across tuning parameters:
##
## degree nprune RMSE Rsquared MAE
## 1 2 4.498265 0.2096403 3.734967
## 1 3 3.707770 0.4625882 2.988300
## 1 4 2.832362 0.6887229 2.293904
## 1 5 2.578666 0.7415914 2.078526
## 1 6 2.428237 0.7703362 1.923512
## 1 7 2.002152 0.8413397 1.571986
## 1 8 1.845327 0.8656727 1.443783
## 1 9 1.784237 0.8749215 1.396701
## 1 10 1.752534 0.8800314 1.362773
## 1 11 1.777895 0.8775044 1.371078
## 1 12 1.812492 0.8739528 1.394908
## 1 13 1.815728 0.8735168 1.402239
## 1 14 1.823563 0.8725096 1.405168
## 1 15 1.831333 0.8710090 1.410873
## 1 16 1.833543 0.8705330 1.415533
## 1 17 1.832521 0.8707123 1.414501
## 1 18 1.832521 0.8707123 1.414501
## 1 19 1.832521 0.8707123 1.414501
## 1 20 1.832521 0.8707123 1.414501
## 1 21 1.832521 0.8707123 1.414501
## 1 22 1.832521 0.8707123 1.414501
## 1 23 1.832521 0.8707123 1.414501
## 1 24 1.832521 0.8707123 1.414501
## 1 25 1.832521 0.8707123 1.414501
## 1 26 1.832521 0.8707123 1.414501
## 1 27 1.832521 0.8707123 1.414501
## 1 28 1.832521 0.8707123 1.414501
## 1 29 1.832521 0.8707123 1.414501
## 1 30 1.832521 0.8707123 1.414501
## 1 31 1.832521 0.8707123 1.414501
## 1 32 1.832521 0.8707123 1.414501
## 1 33 1.832521 0.8707123 1.414501
## 1 34 1.832521 0.8707123 1.414501
## 1 35 1.832521 0.8707123 1.414501
## 1 36 1.832521 0.8707123 1.414501
## 1 37 1.832521 0.8707123 1.414501
## 1 38 1.832521 0.8707123 1.414501
## 2 2 4.501395 0.2102498 3.738169
## 2 3 3.660758 0.4729578 2.946851
## 2 4 2.864397 0.6822248 2.299847
## 2 5 2.528711 0.7527803 2.039786
## 2 6 2.416099 0.7746520 1.935244
## 2 7 2.044893 0.8361433 1.613209
## 2 8 1.841008 0.8676011 1.443831
## 2 9 1.725216 0.8827906 1.356670
## 2 10 1.595760 0.8988541 1.253079
## 2 11 1.521431 0.9075363 1.198704
## 2 12 1.498659 0.9104951 1.169960
## 2 13 1.489963 0.9121527 1.155843
## 2 14 1.502130 0.9105579 1.168189
## 2 15 1.497737 0.9120184 1.154562
## 2 16 1.504215 0.9114432 1.154259
## 2 17 1.513109 0.9102511 1.163959
## 2 18 1.519474 0.9095003 1.168080
## 2 19 1.516635 0.9100302 1.165927
## 2 20 1.518004 0.9098451 1.166194
## 2 21 1.518004 0.9098451 1.166194
## 2 22 1.518004 0.9098451 1.166194
## 2 23 1.518004 0.9098451 1.166194
## 2 24 1.518004 0.9098451 1.166194
## 2 25 1.518004 0.9098451 1.166194
## 2 26 1.518004 0.9098451 1.166194
## 2 27 1.518004 0.9098451 1.166194
## 2 28 1.518004 0.9098451 1.166194
## 2 29 1.518004 0.9098451 1.166194
## 2 30 1.518004 0.9098451 1.166194
## 2 31 1.518004 0.9098451 1.166194
## 2 32 1.518004 0.9098451 1.166194
## 2 33 1.518004 0.9098451 1.166194
## 2 34 1.518004 0.9098451 1.166194
## 2 35 1.518004 0.9098451 1.166194
## 2 36 1.518004 0.9098451 1.166194
## 2 37 1.518004 0.9098451 1.166194
## 2 38 1.518004 0.9098451 1.166194
##
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were nprune = 13 and degree = 2.
Our MARS model is the best fit yet, with an R^2 jumping into the 90s and a test RMSE of 1.32.
predict(marsModel, testData$x) %>%
  postResample(pred = ., obs = testData$y)
## RMSE Rsquared MAE
## 1.3227340 0.9291489 1.0524686
Interestingly enough, MARS's built-in term selection has chosen exactly the informative (non-random) variables.
summary(marsModel)
## Call: earth(x=data.frame[200,10], y=c(18.46,16.1,17...), keepxy=TRUE, degree=2,
## nprune=13)
##
## coefficients
## (Intercept) 21.690154
## h(0.507267-X1) -4.203744
## h(X1-0.507267) 3.072355
## h(0.325504-X2) -5.314859
## h(-0.216741-X3) 3.320304
## h(X3- -0.216741) 2.321760
## h(0.953812-X4) -2.775288
## h(X4-0.953812) 2.778320
## h(1.17878-X5) -1.607769
## h(X1-0.507267) * h(X2- -0.798188) -3.199202
## h(0.606835-X1) * h(0.325504-X2) 2.030856
## h(0.325504-X2) * h(X3-0.795427) 1.369704
##
## Selected 12 of 21 terms, and 5 of 10 predictors (nprune=13)
## Termination condition: Reached nk 21
## Importance: X1, X4, X2, X5, X3, X6-unused, X7-unused, X8-unused, X9-unused, ...
## Number of terms at each degree of interaction: 1 8 3
## GCV 1.842426 RSS 270.9495 GRSq 0.9251967 RSq 0.9444425
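caret's varImp wrapper reaches the same conclusion; as a quick check of our own, it should report meaningful importance only for X1 through X5:

varImp(marsModel)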
Finally, we’ll attempt to train a neural network model.
nnetModel <- train(x = trainingData$x,
                   y = trainingData$y,
                   method = "nnet",
                   preProc = c("center", "scale"),
                   trace = FALSE,
                   linout = TRUE)
nnetModel
## Neural Network
##
## 200 samples
## 10 predictor
##
## Pre-processing: centered (10), scaled (10)
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 200, 200, 200, 200, 200, 200, ...
## Resampling results across tuning parameters:
##
## size decay RMSE Rsquared MAE
## 1 0e+00 2.658645 0.7177289 2.069135
## 1 1e-04 2.640829 0.7243062 2.053681
## 1 1e-01 2.604037 0.7311093 2.007156
## 3 0e+00 3.179406 0.6530360 2.391318
## 3 1e-04 2.915620 0.6891061 2.234140
## 3 1e-01 2.811843 0.7030670 2.194293
## 5 0e+00 3.533839 0.5682995 2.684057
## 5 1e-04 3.420446 0.6210628 2.551700
## 5 1e-01 3.071106 0.6561624 2.443279
##
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were size = 1 and decay = 0.1.
Our neural network model has a decent RMSE and R^2, and further tuning would likely yield real improvements; as trained, it performs roughly on par with linear regression and the linear SVM.
predict(nnetModel, testData$x) %>%
  postResample(pred = ., obs = testData$y)
## RMSE Rsquared MAE
## 2.6493172 0.7177208 2.0295260
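One plausible next step for that further tuning, sketched but not run here (the grid values and weight cap are our own choices, following the pattern in Kuhn and Johnson's computing section):

nnetGrid <- expand.grid(size = 1:10, decay = c(0, 0.01, 0.1))
# nnetModel2 <- train(x = trainingData$x, y = trainingData$y,
#                     method = "nnet", preProc = c("center", "scale"),
#                     tuneGrid = nnetGrid, trace = FALSE, linout = TRUE,
#                     MaxNWts = 10 * (ncol(trainingData$x) + 1) + 10 + 1,
#                     maxit = 500)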
Which models appear to give the best performance? Does MARS select the informative predictors (those named X1–X5)?
Our MARS model and our radial-kernel SVM give the best performance. The MARS model is particularly impressive here because it remains fairly interpretable compared to the SVM and, through its pruning algorithm, selected only the informative predictors (X1–X5).
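For a side-by-side view of the resampling results, caret's resamples() can collect the bootstrap estimates from the models above (our own addition; strictly, the models should share identical resampling indices, e.g. via set.seed before each train call, for a fair paired comparison):

resamps <- resamples(list(KNN = knnModel, LM = lmModel, SVMradial = svmrModel,
                          SVMlinear = svmModel, MARS = marsModel, NNet = nnetModel))
summary(resamps)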
Exercise 6.3 describes data for a chemical manufacturing process. Use the same data imputation, data splitting, and pre-processing steps as before and train several nonlinear regression models.
(Exercise 6.3: A chemical manufacturing process for a pharmaceutical product was discussed in Sect. 1.4. In this problem, the objective is to understand the relationship between biological measurements of the raw materials (predictors), measurements of the manufacturing process (predictors), and the response of product yield. Biological predictors cannot be changed but can be used to assess the quality of the raw material before processing. On the other hand, manufacturing process predictors can be changed in the manufacturing process. Improving product yield by 1% will boost revenue by approximately one hundred thousand dollars per batch.)
These are the same loading and preprocessing steps from Exercise 6.3:
library(AppliedPredictiveModeling)
data(ChemicalManufacturingProcess)
preProcess(ChemicalManufacturingProcess, method = c("knnImpute", "BoxCox", "center", "scale")) |>
  predict(ChemicalManufacturingProcess) -> cmp
part <- createDataPartition(cmp$Yield, p = 0.75, list = FALSE)
cmp_train <- cmp[part,]
cmp_test <- cmp[-part,]
dim(cmp_train)
## [1] 132 58
First we train a KNN model.
knnModel <- train(x = cmp_train[,-1],
                  y = cmp_train$Yield,
                  method = "knn",
                  tuneLength = 10)
knnModel
## k-Nearest Neighbors
##
## 132 samples
## 57 predictor
##
## No pre-processing
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 132, 132, 132, 132, 132, 132, ...
## Resampling results across tuning parameters:
##
## k RMSE Rsquared MAE
## 5 0.7744257 0.4113969 0.6129787
## 7 0.7536568 0.4426736 0.5974297
## 9 0.7602935 0.4361320 0.6023658
## 11 0.7684387 0.4272692 0.6123498
## 13 0.7697262 0.4277985 0.6152154
## 15 0.7684272 0.4339447 0.6126863
## 17 0.7714023 0.4385002 0.6121538
## 19 0.7746543 0.4395945 0.6148734
## 21 0.7798477 0.4366079 0.6171120
## 23 0.7811411 0.4393293 0.6181021
##
## RMSE was used to select the optimal model using the smallest value.
## The final value used for the model was k = 7.
Then we evaluate the model on the test set.
predict(knnModel, cmp_test[,-1]) %>%
  postResample(pred = ., obs = cmp_test$Yield)
## RMSE Rsquared MAE
## 0.7248466 0.4614413 0.5711783
Next we train a radial SVM model.
svmrModel <- train(x = cmp_train[,-1],
                   y = cmp_train$Yield,
                   method = "svmRadial",
                   tuneLength = 14)
svmrModel
## Support Vector Machines with Radial Basis Function Kernel
##
## 132 samples
## 57 predictor
##
## No pre-processing
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 132, 132, 132, 132, 132, 132, ...
## Resampling results across tuning parameters:
##
## C RMSE Rsquared MAE
## 0.25 0.7771903 0.4571285 0.6332184
## 0.50 0.7355233 0.4912732 0.5952206
## 1.00 0.7074115 0.5187540 0.5687826
## 2.00 0.6925739 0.5273558 0.5536189
## 4.00 0.6877438 0.5290517 0.5482240
## 8.00 0.6854898 0.5314065 0.5463209
## 16.00 0.6853818 0.5315932 0.5461541
## 32.00 0.6853818 0.5315932 0.5461541
## 64.00 0.6853818 0.5315932 0.5461541
## 128.00 0.6853818 0.5315932 0.5461541
## 256.00 0.6853818 0.5315932 0.5461541
## 512.00 0.6853818 0.5315932 0.5461541
## 1024.00 0.6853818 0.5315932 0.5461541
## 2048.00 0.6853818 0.5315932 0.5461541
##
## Tuning parameter 'sigma' was held constant at a value of 0.01438654
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were sigma = 0.01438654 and C = 16.
Our radial SVM gives us considerably better predictions than our KNN model.
predict(svmrModel, cmp_test[,-1]) %>%
  postResample(pred = ., obs = cmp_test$Yield)
## RMSE Rsquared MAE
## 0.5462639 0.6875045 0.4454188
Next we train a linear SVM, which has quite a high resampled RMSE compared to our radial model or KNN, suggesting that a purely linear fit is not ideal for these data.
svmModel <- train(x = cmp_train[,-1],
                  y = cmp_train$Yield,
                  method = "svmLinear",
                  preProc = c("center", "scale"),
                  tuneLength = 14)
svmModel
## Support Vector Machines with Linear Kernel
##
## 132 samples
## 57 predictor
##
## Pre-processing: centered (57), scaled (57)
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 132, 132, 132, 132, 132, 132, ...
## Resampling results:
##
## RMSE Rsquared MAE
## 5.226649 0.117571 1.535834
##
## Tuning parameter 'C' was held constant at a value of 1
Strangely, the RMSE on the test set is much better than the resampled training estimate, but it is still beaten by our previous models.
predict(svmModel, cmp_test[,-1]) %>%
  postResample(pred = ., obs = cmp_test$Yield)
## RMSE Rsquared MAE
## 0.7140505 0.5419303 0.5693525
Next we train a MARS model.
marsGrid <- expand.grid(.degree = 1:2, .nprune = 2:14)
marsModel <- train(x = cmp_train[,-1],
                   y = cmp_train$Yield,
                   method = "earth",
                   tuneGrid = marsGrid)
marsModel
## Multivariate Adaptive Regression Spline
##
## 132 samples
## 57 predictor
##
## No pre-processing
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 132, 132, 132, 132, 132, 132, ...
## Resampling results across tuning parameters:
##
## degree nprune RMSE Rsquared MAE
## 1 2 0.8293207 0.3634353 0.6438544
## 1 3 0.7259401 0.5005203 0.5712317
## 1 4 0.6933652 0.5425642 0.5545139
## 1 5 1.4390610 0.5123569 0.6647397
## 1 6 1.4448307 0.4692227 0.6900797
## 1 7 2.3814006 0.4321544 0.8533137
## 1 8 2.3796328 0.4267461 0.8599907
## 1 9 2.7466883 0.3323892 0.9446441
## 1 10 2.6219353 0.3595570 0.9238739
## 1 11 2.7924948 0.3720344 0.9516353
## 1 12 2.9897946 0.3517220 0.9974320
## 1 13 3.1747231 0.3226596 1.0470684
## 1 14 3.3377744 0.3157628 1.0864234
## 2 2 0.8334384 0.3561989 0.6490347
## 2 3 0.7509244 0.4701457 0.5947671
## 2 4 0.8814143 0.4859057 0.6028938
## 2 5 1.1703443 0.4747686 0.6498159
## 2 6 1.5011217 0.4102719 0.7369577
## 2 7 1.4684903 0.3921025 0.7308880
## 2 8 1.4680296 0.4015419 0.7394385
## 2 9 1.6634878 0.3490464 0.7980369
## 2 10 1.6789701 0.3470176 0.8116508
## 2 11 1.8938866 0.3456008 0.8676863
## 2 12 1.9865711 0.3449002 0.8900045
## 2 13 2.0333068 0.3290731 0.9166938
## 2 14 2.1053626 0.3227253 0.9271062
##
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were nprune = 4 and degree = 1.
With MARS we get our second-lowest test RMSE so far.
predict(marsModel, cmp_test[,-1]) %>%
  postResample(pred = ., obs = cmp_test$Yield)
## RMSE Rsquared MAE
## 0.6520461 0.5580345 0.5385178
Finally, we’ll attempt to train a neural network model.
nnetModel <- train(x = cmp_train[,-1],
                   y = cmp_train$Yield,
                   method = "nnet",
                   trace = FALSE,
                   linout = TRUE)
nnetModel
## Neural Network
##
## 132 samples
## 57 predictor
##
## No pre-processing
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 132, 132, 132, 132, 132, 132, ...
## Resampling results across tuning parameters:
##
## size decay RMSE Rsquared MAE
## 1 0e+00 0.9942810 0.3188098 0.7890465
## 1 1e-04 1.0675120 0.2798334 0.8589885
## 1 1e-01 0.9581127 0.3788706 0.7516320
## 3 0e+00 1.0212852 0.3461211 0.8133082
## 3 1e-04 1.0228707 0.3483214 0.8204448
## 3 1e-01 0.9315270 0.4088806 0.7301173
## 5 0e+00 1.0500518 0.3525208 0.8386511
## 5 1e-04 1.1204129 0.3178585 0.9004586
## 5 1e-01 0.8506808 0.4581910 0.6739639
##
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were size = 5 and decay = 0.1.
Our neural network model turns in a middling resampled RMSE and R^2, and its test-set performance ends up the weakest of the group.
predict(nnetModel, cmp_test[,-1]) %>%
  postResample(pred = ., obs = cmp_test$Yield)
## RMSE Rsquared MAE
## 0.8098940 0.4185406 0.5896052
Which nonlinear regression model gives the optimal resampling and test set performance?
Our radial SVM model gave the best test set performance, with an RMSE of 0.546, indicating there is nonlinear structure in these features that the radial kernel can exploit. Our linear PLS model from Exercise 6.3 achieved a test RMSE of 0.6340737, not far behind, so there is still meaningful linear signal in the data.
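For reference, the test-set metrics from the models above can be collected into one table (our own convenience code; these objects are the Exercise 7.5 fits):

rbind(KNN = postResample(predict(knnModel, cmp_test[,-1]), cmp_test$Yield),
      SVMradial = postResample(predict(svmrModel, cmp_test[,-1]), cmp_test$Yield),
      SVMlinear = postResample(predict(svmModel, cmp_test[,-1]), cmp_test$Yield),
      MARS = postResample(predict(marsModel, cmp_test[,-1]), cmp_test$Yield),
      NNet = postResample(predict(nnetModel, cmp_test[,-1]), cmp_test$Yield))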
Which predictors are most important in the optimal nonlinear regression model? Do either the biological or process variables dominate the list? How do the top ten important predictors compare to the top ten predictors from the optimal linear model?
Our top two important predictors are manufacturing process variables, and manufacturing processes outnumber biological measurements in our top ten. In the PLS model, by contrast, the top six were all manufacturing processes, with a sharp drop-off before the biological materials appeared.
Aside from BiologicalMaterial12 making the top ten here in place of BiologicalMaterial08, the two lists contain the same predictors, though in a different order. This shows that the linear and nonlinear models largely agree on which predictors matter while weighting them differently.
plot(varImp(svmrModel), top = 25)
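To compare the lists by name rather than by eye, the top ten can also be pulled out of the importance object directly (our addition):

imp <- varImp(svmrModel)$importance
head(imp[order(-imp$Overall), , drop = FALSE], 10)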
Explore the relationships between the top predictors and the response for the predictors that are unique to the optimal nonlinear regression model. Do these plots reveal intuition about the biological or process predictors and their relationship with yield?
ggplot(data = ChemicalManufacturingProcess, aes(y = Yield, x = BiologicalMaterial12)) +
  geom_point()
Visualizing BiologicalMaterial12, the one top-ten predictor unique to the optimal nonlinear model, it makes sense that it mattered more here than in our linear model. Although there is a slight positive linear correlation with yield, the relationship looks more parabolic: yield rises as the material increases from about 18 to 21 units, then falls off. Thus, we would want to keep BiologicalMaterial12 at around 21 units to maximize yield.
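Overlaying a quadratic fit makes that parabolic shape easier to judge (our own sketch):

ggplot(data = ChemicalManufacturingProcess,
       aes(x = BiologicalMaterial12, y = Yield)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ poly(x, 2))   # quadratic trend line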