HW8 DATA624
Assignment
We are required to complete questions 7.2 and 7.5 from chapter 7 of “Applied Predictive Modeling” by Max Kuhn and Kjell Johnson.
7.2
We are tasked with tuning models on simulated data via the Friedman1 function from the mlbench package. Below, I use the code provided in the book to create the data and tune models.
The code below creates 200 data points with ten x variables and one y variable:
library(mlbench)
library(caret)
set.seed(200)
trainingData <- mlbench.friedman1(200, sd = 1)
trainingData$x <- data.frame(trainingData$x)
featurePlot(trainingData$x, trainingData$y)
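For reference, mlbench.friedman1() simulates Friedman's benchmark: y = 10 sin(pi x1 x2) + 20 (x3 - 0.5)^2 + 10 x4 + 5 x5 + noise, with x6 through x10 carrying no signal. A minimal sketch of the generating process (mirroring the documented formula, not the package's exact code):
# Sketch of the Friedman (1991) generating process used by mlbench.friedman1()
friedman1_sketch <- function(n, sd = 1) {
  x <- matrix(runif(n * 10), ncol = 10)      # ten uniform predictors; x6-x10 are pure noise
  y <- 10 * sin(pi * x[, 1] * x[, 2]) +
       20 * (x[, 3] - 0.5)^2 +
       10 * x[, 4] + 5 * x[, 5] +
       rnorm(n, sd = sd)                     # Gaussian noise
  list(x = x, y = y)
}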
Next, we create test data:
testData <- mlbench.friedman1(5000, sd = 1)
testData$x <- data.frame(testData$x)
Next, we tune models on the data:
knnModel <- train(x = trainingData$x,
                  y = trainingData$y,
                  method = 'knn',
                  preProcess = c('center', 'scale'),
                  tuneLength = 10)
knnModel
k-Nearest Neighbors
200 samples
10 predictor
Pre-processing: centered (10), scaled (10)
Resampling: Bootstrapped (25 reps)
Summary of sample sizes: 200, 200, 200, 200, 200, 200, ...
Resampling results across tuning parameters:
k RMSE Rsquared MAE
5 3.466085 0.5121775 2.816838
7 3.349428 0.5452823 2.727410
9 3.264276 0.5785990 2.660026
11 3.214216 0.6024244 2.603767
13 3.196510 0.6176570 2.591935
15 3.184173 0.6305506 2.577482
17 3.183130 0.6425367 2.567787
19 3.198752 0.6483184 2.592683
21 3.188993 0.6611428 2.588787
23 3.200458 0.6638353 2.604529
RMSE was used to select the optimal model using the smallest value.
The final value used for the model was k = 17.
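For intuition, k-NN regression simply averages the outcomes of the k nearest training points (after the centering and scaling applied above). A minimal hand-rolled sketch, not caret's implementation:
knn_predict_one <- function(x0, X, y, k = 17) {
  d <- sqrt(rowSums(sweep(as.matrix(X), 2, x0)^2))  # Euclidean distance to each training row
  mean(y[order(d)[1:k]])                            # average outcome of the k nearest neighbors
}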
Next, we use the final model (with k = 17) to predict data in the test set:
knnPred <- predict(knnModel, newdata = testData$x)
postResample(pred = knnPred, obs = testData$y)
RMSE Rsquared MAE
3.2040595 0.6819919 2.5683461
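As a quick sanity check, the RMSE reported by postResample is just the root mean squared error computed directly (a sketch against the objects above):
sqrt(mean((testData$y - knnPred)^2))  # should reproduce the RMSE above, ~3.204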
MARS Model
Next, I try fitting a MARS model on the data.
set.seed(100)
marsGrid <- expand.grid(.degree = 1:3, .nprune = 2:50)
marsModel <- train(x = trainingData$x,
                   y = trainingData$y,
                   method = 'earth',
                   preProcess = c('center', 'scale'),
                   tuneGrid = marsGrid,
                   trControl = trainControl(method = "cv", number = 10))
marsModel
Multivariate Adaptive Regression Spline
200 samples
10 predictor
Pre-processing: centered (10), scaled (10)
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 180, 180, 180, 180, 180, 180, ...
Resampling results across tuning parameters:
degree nprune RMSE Rsquared MAE
1 2 4.327937 0.2544880 3.6004742
1 3 3.572450 0.4912720 2.8958113
1 4 2.596841 0.7183600 2.1063410
1 5 2.370161 0.7659777 1.9186686
1 6 2.276141 0.7881481 1.8100006
1 7 1.766728 0.8751831 1.3902146
1 8 1.780946 0.8723243 1.4013449
1 9 1.665091 0.8819775 1.3255147
1 10 1.663804 0.8821283 1.3276573
1 11 1.657738 0.8822967 1.3317299
1 12 1.653784 0.8827903 1.3315041
1 13 1.648496 0.8823663 1.3164065
1 14 1.639073 0.8841742 1.3128329
1 15 1.639073 0.8841742 1.3128329
1 16 1.639073 0.8841742 1.3128329
1 17 1.639073 0.8841742 1.3128329
1 18 1.639073 0.8841742 1.3128329
1 19 1.639073 0.8841742 1.3128329
1 20 1.639073 0.8841742 1.3128329
1 21 1.639073 0.8841742 1.3128329
1 22 1.639073 0.8841742 1.3128329
1 23 1.639073 0.8841742 1.3128329
1 24 1.639073 0.8841742 1.3128329
1 25 1.639073 0.8841742 1.3128329
1 26 1.639073 0.8841742 1.3128329
1 27 1.639073 0.8841742 1.3128329
1 28 1.639073 0.8841742 1.3128329
1 29 1.639073 0.8841742 1.3128329
1 30 1.639073 0.8841742 1.3128329
1 31 1.639073 0.8841742 1.3128329
1 32 1.639073 0.8841742 1.3128329
1 33 1.639073 0.8841742 1.3128329
1 34 1.639073 0.8841742 1.3128329
1 35 1.639073 0.8841742 1.3128329
1 36 1.639073 0.8841742 1.3128329
1 37 1.639073 0.8841742 1.3128329
1 38 1.639073 0.8841742 1.3128329
1 39 1.639073 0.8841742 1.3128329
1 40 1.639073 0.8841742 1.3128329
1 41 1.639073 0.8841742 1.3128329
1 42 1.639073 0.8841742 1.3128329
1 43 1.639073 0.8841742 1.3128329
1 44 1.639073 0.8841742 1.3128329
1 45 1.639073 0.8841742 1.3128329
1 46 1.639073 0.8841742 1.3128329
1 47 1.639073 0.8841742 1.3128329
1 48 1.639073 0.8841742 1.3128329
1 49 1.639073 0.8841742 1.3128329
1 50 1.639073 0.8841742 1.3128329
2 2 4.327937 0.2544880 3.6004742
2 3 3.572450 0.4912720 2.8958113
2 4 2.661826 0.7070510 2.1734709
2 5 2.404015 0.7578971 1.9753867
2 6 2.243927 0.7914805 1.7830717
2 7 1.856336 0.8605482 1.4356822
2 8 1.754607 0.8763186 1.3968406
2 9 1.653859 0.8870129 1.2813884
2 10 1.434159 0.9166537 1.1339203
2 11 1.320482 0.9289120 1.0347278
2 12 1.317547 0.9306879 1.0359899
2 13 1.296910 0.9306902 1.0146112
2 14 1.221407 0.9395223 0.9631486
2 15 1.230516 0.9390469 0.9761484
2 16 1.236911 0.9387407 0.9745362
2 17 1.236911 0.9387407 0.9745362
2 18 1.236911 0.9387407 0.9745362
2 19 1.236911 0.9387407 0.9745362
2 20 1.236911 0.9387407 0.9745362
2 21 1.236911 0.9387407 0.9745362
2 22 1.236911 0.9387407 0.9745362
2 23 1.236911 0.9387407 0.9745362
2 24 1.236911 0.9387407 0.9745362
2 25 1.236911 0.9387407 0.9745362
2 26 1.236911 0.9387407 0.9745362
2 27 1.236911 0.9387407 0.9745362
2 28 1.236911 0.9387407 0.9745362
2 29 1.236911 0.9387407 0.9745362
2 30 1.236911 0.9387407 0.9745362
2 31 1.236911 0.9387407 0.9745362
2 32 1.236911 0.9387407 0.9745362
2 33 1.236911 0.9387407 0.9745362
2 34 1.236911 0.9387407 0.9745362
2 35 1.236911 0.9387407 0.9745362
2 36 1.236911 0.9387407 0.9745362
2 37 1.236911 0.9387407 0.9745362
2 38 1.236911 0.9387407 0.9745362
2 39 1.236911 0.9387407 0.9745362
2 40 1.236911 0.9387407 0.9745362
2 41 1.236911 0.9387407 0.9745362
2 42 1.236911 0.9387407 0.9745362
2 43 1.236911 0.9387407 0.9745362
2 44 1.236911 0.9387407 0.9745362
2 45 1.236911 0.9387407 0.9745362
2 46 1.236911 0.9387407 0.9745362
2 47 1.236911 0.9387407 0.9745362
2 48 1.236911 0.9387407 0.9745362
2 49 1.236911 0.9387407 0.9745362
2 50 1.236911 0.9387407 0.9745362
3 2 4.327937 0.2544880 3.6004742
3 3 3.572450 0.4912720 2.8958113
3 4 2.661826 0.7070510 2.1734709
3 5 2.404015 0.7578971 1.9753867
3 6 2.258530 0.7888892 1.7954652
3 7 1.850728 0.8620159 1.4273124
3 8 1.751759 0.8768118 1.3917401
3 9 1.659166 0.8866163 1.2790604
3 10 1.443606 0.9158100 1.1243194
3 11 1.339761 0.9276771 1.0405949
3 12 1.320350 0.9307765 1.0276609
3 13 1.301929 0.9303287 1.0151967
3 14 1.253136 0.9346362 0.9746054
3 15 1.267729 0.9329718 0.9811453
3 16 1.274882 0.9327201 0.9851629
3 17 1.280571 0.9324050 0.9937222
3 18 1.280823 0.9322527 0.9964978
3 19 1.283637 0.9318523 1.0049435
3 20 1.283637 0.9318523 1.0049435
3 21 1.283637 0.9318523 1.0049435
3 22 1.283637 0.9318523 1.0049435
3 23 1.283637 0.9318523 1.0049435
3 24 1.283637 0.9318523 1.0049435
3 25 1.283637 0.9318523 1.0049435
3 26 1.283637 0.9318523 1.0049435
3 27 1.283637 0.9318523 1.0049435
3 28 1.283637 0.9318523 1.0049435
3 29 1.283637 0.9318523 1.0049435
3 30 1.283637 0.9318523 1.0049435
3 31 1.283637 0.9318523 1.0049435
3 32 1.283637 0.9318523 1.0049435
3 33 1.283637 0.9318523 1.0049435
3 34 1.283637 0.9318523 1.0049435
3 35 1.283637 0.9318523 1.0049435
3 36 1.283637 0.9318523 1.0049435
3 37 1.283637 0.9318523 1.0049435
3 38 1.283637 0.9318523 1.0049435
3 39 1.283637 0.9318523 1.0049435
3 40 1.283637 0.9318523 1.0049435
3 41 1.283637 0.9318523 1.0049435
3 42 1.283637 0.9318523 1.0049435
3 43 1.283637 0.9318523 1.0049435
3 44 1.283637 0.9318523 1.0049435
3 45 1.283637 0.9318523 1.0049435
3 46 1.283637 0.9318523 1.0049435
3 47 1.283637 0.9318523 1.0049435
3 48 1.283637 0.9318523 1.0049435
3 49 1.283637 0.9318523 1.0049435
3 50 1.283637 0.9318523 1.0049435
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nprune = 14 and degree = 2.
The model with the lowest RMSE used second-degree interactions and 14 terms.
ggplot(marsModel)
summary(marsModel)
Call: earth(x=data.frame[200,10], y=c(18.46,16.1,17...), keepxy=TRUE, degree=2,
nprune=14)
coefficients
(Intercept) 22.0339290
h(0.507267-X1) -4.5157241
h(X1-0.507267) 2.6841329
h(0.325504-X2) -5.4438679
h(-0.216741-X3) 3.3683600
h(X3- -0.216741) 2.0371575
h(0.953812-X4) -2.7853388
h(X4-0.953812) 2.7366442
h(1.17878-X5) -1.5636213
h(X1- -0.951872) * h(X2-0.325504) -0.7790969
h(X1-0.507267) * h(X2- -0.798188) -2.6276789
h(0.606835-X1) * h(0.325504-X2) 2.1773145
h(0.325504-X2) * h(X3-0.795427) 1.7739671
h(X2-0.325504) * h(X3- -0.917499) 0.5726623
Selected 14 of 21 terms, and 5 of 10 predictors (nprune=14)
Termination condition: Reached nk 21
Importance: X1, X4, X2, X5, X3, X6-unused, X7-unused, X8-unused, X9-unused, ...
Number of terms at each degree of interaction: 1 8 5
GCV 1.841667 RSS 255.2757 GRSq 0.9252276 RSq 0.9476564
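The h(.) terms in the summary are MARS hinge functions, h(u) = max(0, u), evaluated on the centered-and-scaled predictors. A minimal sketch of how one pair of fitted basis functions contributes, using the X1 coefficients from the summary above:
h <- function(u) pmax(0, u)
x1 <- 0.2  # a hypothetical scaled value of X1
-4.5157241 * h(0.507267 - x1) + 2.6841329 * h(x1 - 0.507267)  # X1's additive contribution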
Next, I try predicting the test data with the MARS model:
MARSPred <- predict(marsModel, newdata = testData$x)
postResample(pred = MARSPred, obs = testData$y)
RMSE Rsquared MAE
1.2779993 0.9338365 1.0147070
SVM
Next, I try fitting an SVM model:
set.seed(225)
svmModel <- train(x = trainingData$x,
                  y = trainingData$y,
                  method = 'svmRadial',
                  preProcess = c('center', 'scale'),
                  tuneLength = 14,
                  trControl = trainControl(method = "cv", number = 10))
svmModel
Support Vector Machines with Radial Basis Function Kernel
200 samples
10 predictor
Pre-processing: centered (10), scaled (10)
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 180, 180, 180, 180, 180, 180, ...
Resampling results across tuning parameters:
C RMSE Rsquared MAE
0.25 2.499701 0.7970965 1.996951
0.50 2.238097 0.8115233 1.792365
1.00 2.049451 0.8324583 1.639456
2.00 1.961412 0.8444204 1.559239
4.00 1.870907 0.8571727 1.498361
8.00 1.836778 0.8629510 1.472100
16.00 1.849663 0.8613122 1.482504
32.00 1.849663 0.8613122 1.482504
64.00 1.849663 0.8613122 1.482504
128.00 1.849663 0.8613122 1.482504
256.00 1.849663 0.8613122 1.482504
512.00 1.849663 0.8613122 1.482504
1024.00 1.849663 0.8613122 1.482504
2048.00 1.849663 0.8613122 1.482504
Tuning parameter 'sigma' was held constant at a value of 0.06254786
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were sigma = 0.06254786 and C = 8.
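For reference, method = 'svmRadial' uses kernlab's Gaussian (RBF) kernel, k(x, x') = exp(-sigma * ||x - x'||^2); caret estimates sigma once with kernlab's sigest() and holds it fixed while tuning C, which is why only C varies above. A minimal sketch of the kernel itself:
rbf_kernel <- function(x, x_prime, sigma = 0.06254786) {
  exp(-sigma * sum((x - x_prime)^2))  # similarity decays with squared distance
}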
plot(svmModel)
Next, I try predicting with the SVM model:
svmPred <- predict(svmModel, newdata = testData$x)
postResample(pred = svmPred, obs = testData$y)
RMSE Rsquared MAE
2.0523643 0.8293154 1.5571606
The MARS model performs better than both the KNN and SVM models: it has a lower test-set RMSE and a (much) higher R-squared. The MARS model also selected the informative predictors (X1-X5).
Below is a plot of the variable importance:
plot(varImp(marsModel), main = 'Variable Importance')
7.5
We are tasked with fitting non-linear models on the chemical manufacturing data.
data("ChemicalManufacturingProcess")
There are NAs in the data set:
sum(is.na(ChemicalManufacturingProcess))
[1] 106
Below, I impute missing values using a k-nearest-neighbors approach via the kNN() function from the VIM package. That function appends indicator columns (flagging which values were imputed) to the right of the original data, so I drop those columns from the data frame.
library(VIM)
chem_data <- kNN(ChemicalManufacturingProcess)
chem_data <- chem_data[, 1:ncol(ChemicalManufacturingProcess)]
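A quick check that the imputation worked and that the indicator columns were dropped (a sketch against the objects above):
sum(is.na(chem_data))                                          # expect 0 after imputation
identical(dim(chem_data), dim(ChemicalManufacturingProcess))   # TRUE once indicators are dropped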
Next, I split the data into a training and test set:
set.seed(10)
chem_trainIndex <- createDataPartition(chem_data$Yield, p = 0.8, list = FALSE)

chem_trainData <- chem_data[chem_trainIndex,]
chem_testData <- chem_data[-chem_trainIndex,]
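createDataPartition stratifies on quantiles of Yield, so the split is approximately 80/20 (a quick check; the 144-row training set matches the model output below):
nrow(chem_trainData)  # 144
nrow(chem_testData)   # the remaining 32 of the 176 rows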
KNN model
First, I will tune a KNN model on the data:
set.seed(210)
knnChemModel <- train(x = chem_trainData[,-1],
                      y = chem_trainData$Yield,
                      method = 'knn',
                      preProcess = c('center', 'scale'),
                      tuneGrid = data.frame(.k = 1:20),
                      trControl = trainControl(method = "cv", number = 10))
knnChemModel
k-Nearest Neighbors
144 samples
57 predictor
Pre-processing: centered (57), scaled (57)
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 128, 130, 130, 129, 130, 130, ...
Resampling results across tuning parameters:
k RMSE Rsquared MAE
1 1.465714 0.4465654 1.116089
2 1.228056 0.5717160 1.013607
3 1.277520 0.5193685 1.035558
4 1.277025 0.5402453 1.023845
5 1.286623 0.5393261 1.032174
6 1.306640 0.5267678 1.044836
7 1.326741 0.5246410 1.062796
8 1.336258 0.5251640 1.078861
9 1.358468 0.5042863 1.092338
10 1.342232 0.5279743 1.077845
11 1.346397 0.5282144 1.094812
12 1.349239 0.5271784 1.092292
13 1.362739 0.5157005 1.100105
14 1.372961 0.5046815 1.101857
15 1.399120 0.4806478 1.112901
16 1.398594 0.4840723 1.114808
17 1.412197 0.4758792 1.126821
18 1.427435 0.4660139 1.140383
19 1.428054 0.4671728 1.135843
20 1.435325 0.4611414 1.145407
RMSE was used to select the optimal model using the smallest value.
The final value used for the model was k = 2.
plot(knnChemModel)
Next, I try predicting the test set with the KNN model:
knnChemPred <- predict(knnChemModel, newdata = chem_testData[,-1])
postResample(pred = knnChemPred, obs = chem_testData$Yield)
RMSE Rsquared MAE
1.5622837 0.2797031 1.2695312
MARS model
Next, I tune a MARS model:
set.seed(215)
marsChemGrid <- expand.grid(.degree = 1:3, .nprune = 2:100)
marsChemModel <- train(x = chem_trainData[,-1],
                       y = chem_trainData$Yield,
                       method = 'earth',
                       preProcess = c('center', 'scale'),
                       tuneGrid = marsChemGrid,
                       trControl = trainControl(method = "cv", number = 10))
marsChemModel
Multivariate Adaptive Regression Spline
144 samples
57 predictor
Pre-processing: centered (57), scaled (57)
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 129, 129, 128, 131, 130, 129, ...
Resampling results across tuning parameters:
degree nprune RMSE Rsquared MAE
1 2 1.397656 0.4796494 1.0945946
1 3 1.175481 0.6259761 0.9526671
1 4 1.211015 0.6036202 0.9718720
1 5 1.250996 0.5842997 1.0087666
1 6 2.596217 0.4764189 1.4077280
1 7 2.645785 0.4428147 1.4283295
1 8 2.912071 0.4227154 1.5171069
1 9 2.769404 0.4223613 1.5086877
1 10 2.719705 0.4170912 1.5176289
1 11 2.747755 0.4140453 1.5183338
1 12 2.432505 0.4160484 1.4308117
1 13 2.395924 0.4136836 1.4295424
1 14 2.372032 0.4223269 1.4183484
1 15 2.375673 0.4197655 1.4198212
1 16 2.377044 0.4187515 1.4287127
1 17 2.377044 0.4187515 1.4287127
1 18 2.377044 0.4187515 1.4287127
1 19 2.377044 0.4187515 1.4287127
1 20 2.377044 0.4187515 1.4287127
1 21 2.377044 0.4187515 1.4287127
1 22 2.377044 0.4187515 1.4287127
1 23 2.377044 0.4187515 1.4287127
1 24 2.377044 0.4187515 1.4287127
1 25 2.377044 0.4187515 1.4287127
1 26 2.377044 0.4187515 1.4287127
1 27 2.377044 0.4187515 1.4287127
1 28 2.377044 0.4187515 1.4287127
1 29 2.377044 0.4187515 1.4287127
1 30 2.377044 0.4187515 1.4287127
1 31 2.377044 0.4187515 1.4287127
1 32 2.377044 0.4187515 1.4287127
1 33 2.377044 0.4187515 1.4287127
1 34 2.377044 0.4187515 1.4287127
1 35 2.377044 0.4187515 1.4287127
1 36 2.377044 0.4187515 1.4287127
1 37 2.377044 0.4187515 1.4287127
1 38 2.377044 0.4187515 1.4287127
1 39 2.377044 0.4187515 1.4287127
1 40 2.377044 0.4187515 1.4287127
1 41 2.377044 0.4187515 1.4287127
1 42 2.377044 0.4187515 1.4287127
1 43 2.377044 0.4187515 1.4287127
1 44 2.377044 0.4187515 1.4287127
1 45 2.377044 0.4187515 1.4287127
1 46 2.377044 0.4187515 1.4287127
1 47 2.377044 0.4187515 1.4287127
1 48 2.377044 0.4187515 1.4287127
1 49 2.377044 0.4187515 1.4287127
1 50 2.377044 0.4187515 1.4287127
1 51 2.377044 0.4187515 1.4287127
1 52 2.377044 0.4187515 1.4287127
1 53 2.377044 0.4187515 1.4287127
1 54 2.377044 0.4187515 1.4287127
1 55 2.377044 0.4187515 1.4287127
1 56 2.377044 0.4187515 1.4287127
1 57 2.377044 0.4187515 1.4287127
1 58 2.377044 0.4187515 1.4287127
1 59 2.377044 0.4187515 1.4287127
1 60 2.377044 0.4187515 1.4287127
1 61 2.377044 0.4187515 1.4287127
1 62 2.377044 0.4187515 1.4287127
1 63 2.377044 0.4187515 1.4287127
1 64 2.377044 0.4187515 1.4287127
1 65 2.377044 0.4187515 1.4287127
1 66 2.377044 0.4187515 1.4287127
1 67 2.377044 0.4187515 1.4287127
1 68 2.377044 0.4187515 1.4287127
1 69 2.377044 0.4187515 1.4287127
1 70 2.377044 0.4187515 1.4287127
1 71 2.377044 0.4187515 1.4287127
1 72 2.377044 0.4187515 1.4287127
1 73 2.377044 0.4187515 1.4287127
1 74 2.377044 0.4187515 1.4287127
1 75 2.377044 0.4187515 1.4287127
1 76 2.377044 0.4187515 1.4287127
1 77 2.377044 0.4187515 1.4287127
1 78 2.377044 0.4187515 1.4287127
1 79 2.377044 0.4187515 1.4287127
1 80 2.377044 0.4187515 1.4287127
1 81 2.377044 0.4187515 1.4287127
1 82 2.377044 0.4187515 1.4287127
1 83 2.377044 0.4187515 1.4287127
1 84 2.377044 0.4187515 1.4287127
1 85 2.377044 0.4187515 1.4287127
1 86 2.377044 0.4187515 1.4287127
1 87 2.377044 0.4187515 1.4287127
1 88 2.377044 0.4187515 1.4287127
1 89 2.377044 0.4187515 1.4287127
1 90 2.377044 0.4187515 1.4287127
1 91 2.377044 0.4187515 1.4287127
1 92 2.377044 0.4187515 1.4287127
1 93 2.377044 0.4187515 1.4287127
1 94 2.377044 0.4187515 1.4287127
1 95 2.377044 0.4187515 1.4287127
1 96 2.377044 0.4187515 1.4287127
1 97 2.377044 0.4187515 1.4287127
1 98 2.377044 0.4187515 1.4287127
1 99 2.377044 0.4187515 1.4287127
1 100 2.377044 0.4187515 1.4287127
2 2 1.397656 0.4796494 1.0945946
2 3 1.340348 0.5209794 1.0631839
2 4 1.266138 0.5662456 1.0098808
2 5 1.249357 0.5916629 0.9692782
2 6 1.289490 0.5910860 0.9982445
2 7 1.308424 0.5886979 1.0096864
2 8 1.292601 0.6021639 1.0138807
2 9 1.351036 0.5880971 1.0486129
2 10 1.517781 0.5523609 1.1008885
2 11 1.513767 0.5611558 1.0919718
2 12 1.573704 0.5268401 1.1442936
2 13 1.527645 0.5537503 1.1066168
2 14 1.592421 0.5664257 1.1111685
2 15 1.560927 0.5768300 1.0956702
2 16 1.721506 0.5571156 1.1260366
2 17 1.816464 0.5702275 1.1714685
2 18 1.797507 0.5638597 1.1788806
2 19 1.792913 0.5605208 1.1756709
2 20 1.752868 0.5752170 1.1511954
2 21 1.769578 0.5748549 1.1496129
2 22 1.770170 0.5733124 1.1550279
2 23 1.774019 0.5675947 1.1584076
2 24 1.769759 0.5672906 1.1540906
2 25 1.783348 0.5639166 1.1623984
2 26 1.804384 0.5577609 1.1796524
2 27 1.808440 0.5554705 1.1864492
2 28 1.808440 0.5554705 1.1864492
2 29 1.808440 0.5554705 1.1864492
2 30 1.808440 0.5554705 1.1864492
2 31 1.808440 0.5554705 1.1864492
2 32 1.808440 0.5554705 1.1864492
2 33 1.808440 0.5554705 1.1864492
2 34 1.808440 0.5554705 1.1864492
2 35 1.808440 0.5554705 1.1864492
2 36 1.808440 0.5554705 1.1864492
2 37 1.808440 0.5554705 1.1864492
2 38 1.808440 0.5554705 1.1864492
2 39 1.808440 0.5554705 1.1864492
2 40 1.808440 0.5554705 1.1864492
2 41 1.808440 0.5554705 1.1864492
2 42 1.808440 0.5554705 1.1864492
2 43 1.808440 0.5554705 1.1864492
2 44 1.808440 0.5554705 1.1864492
2 45 1.808440 0.5554705 1.1864492
2 46 1.808440 0.5554705 1.1864492
2 47 1.808440 0.5554705 1.1864492
2 48 1.808440 0.5554705 1.1864492
2 49 1.808440 0.5554705 1.1864492
2 50 1.808440 0.5554705 1.1864492
2 51 1.808440 0.5554705 1.1864492
2 52 1.808440 0.5554705 1.1864492
2 53 1.808440 0.5554705 1.1864492
2 54 1.808440 0.5554705 1.1864492
2 55 1.808440 0.5554705 1.1864492
2 56 1.808440 0.5554705 1.1864492
2 57 1.808440 0.5554705 1.1864492
2 58 1.808440 0.5554705 1.1864492
2 59 1.808440 0.5554705 1.1864492
2 60 1.808440 0.5554705 1.1864492
2 61 1.808440 0.5554705 1.1864492
2 62 1.808440 0.5554705 1.1864492
2 63 1.808440 0.5554705 1.1864492
2 64 1.808440 0.5554705 1.1864492
2 65 1.808440 0.5554705 1.1864492
2 66 1.808440 0.5554705 1.1864492
2 67 1.808440 0.5554705 1.1864492
2 68 1.808440 0.5554705 1.1864492
2 69 1.808440 0.5554705 1.1864492
2 70 1.808440 0.5554705 1.1864492
2 71 1.808440 0.5554705 1.1864492
2 72 1.808440 0.5554705 1.1864492
2 73 1.808440 0.5554705 1.1864492
2 74 1.808440 0.5554705 1.1864492
2 75 1.808440 0.5554705 1.1864492
2 76 1.808440 0.5554705 1.1864492
2 77 1.808440 0.5554705 1.1864492
2 78 1.808440 0.5554705 1.1864492
2 79 1.808440 0.5554705 1.1864492
2 80 1.808440 0.5554705 1.1864492
2 81 1.808440 0.5554705 1.1864492
2 82 1.808440 0.5554705 1.1864492
2 83 1.808440 0.5554705 1.1864492
2 84 1.808440 0.5554705 1.1864492
2 85 1.808440 0.5554705 1.1864492
2 86 1.808440 0.5554705 1.1864492
2 87 1.808440 0.5554705 1.1864492
2 88 1.808440 0.5554705 1.1864492
2 89 1.808440 0.5554705 1.1864492
2 90 1.808440 0.5554705 1.1864492
2 91 1.808440 0.5554705 1.1864492
2 92 1.808440 0.5554705 1.1864492
2 93 1.808440 0.5554705 1.1864492
2 94 1.808440 0.5554705 1.1864492
2 95 1.808440 0.5554705 1.1864492
2 96 1.808440 0.5554705 1.1864492
2 97 1.808440 0.5554705 1.1864492
2 98 1.808440 0.5554705 1.1864492
2 99 1.808440 0.5554705 1.1864492
2 100 1.808440 0.5554705 1.1864492
3 2 1.397656 0.4796494 1.0945946
3 3 1.253384 0.5900284 1.0217768
3 4 1.223552 0.6241221 0.9930089
3 5 1.222308 0.6389274 0.9839698
3 6 1.300949 0.6094165 1.0303986
3 7 1.422793 0.5593827 1.1082234
3 8 1.714140 0.5081902 1.1985890
3 9 1.781525 0.4865014 1.2305941
3 10 1.764976 0.4850417 1.2225069
3 11 1.786215 0.4893484 1.2423560
3 12 1.783904 0.4838812 1.2567143
3 13 1.790925 0.4836727 1.2443353
3 14 1.760681 0.4968505 1.2329811
3 15 1.811700 0.4703532 1.2635783
3 16 1.800513 0.4816581 1.2519393
3 17 1.863410 0.4640128 1.2807891
3 18 1.848220 0.4684821 1.2741869
3 19 1.849942 0.4543157 1.2756488
3 20 1.870374 0.4624069 1.2816319
3 21 1.850034 0.4670269 1.2518775
3 22 1.868776 0.4726532 1.2452719
3 23 1.899235 0.4727350 1.2535101
3 24 1.913370 0.4671947 1.2616897
3 25 1.928114 0.4585725 1.2780399
3 26 1.918545 0.4599720 1.2768248
3 27 1.911157 0.4657486 1.2683624
3 28 1.911157 0.4657486 1.2683624
3 29 1.911157 0.4657486 1.2683624
3 30 1.673410 0.5193550 1.2156738
3 31 1.673410 0.5193550 1.2156738
3 32 1.673410 0.5193550 1.2156738
3 33 1.673410 0.5193550 1.2156738
3 34 1.673410 0.5193550 1.2156738
3 35 1.673410 0.5193550 1.2156738
3 36 1.673410 0.5193550 1.2156738
3 37 1.673410 0.5193550 1.2156738
3 38 1.673410 0.5193550 1.2156738
3 39 1.673410 0.5193550 1.2156738
3 40 1.673410 0.5193550 1.2156738
3 41 1.673410 0.5193550 1.2156738
3 42 1.673410 0.5193550 1.2156738
3 43 1.673410 0.5193550 1.2156738
3 44 1.673410 0.5193550 1.2156738
3 45 1.673410 0.5193550 1.2156738
3 46 1.673410 0.5193550 1.2156738
3 47 1.673410 0.5193550 1.2156738
3 48 1.673410 0.5193550 1.2156738
3 49 1.673410 0.5193550 1.2156738
3 50 1.673410 0.5193550 1.2156738
3 51 1.673410 0.5193550 1.2156738
3 52 1.673410 0.5193550 1.2156738
3 53 1.673410 0.5193550 1.2156738
3 54 1.673410 0.5193550 1.2156738
3 55 1.673410 0.5193550 1.2156738
3 56 1.673410 0.5193550 1.2156738
3 57 1.673410 0.5193550 1.2156738
3 58 1.673410 0.5193550 1.2156738
3 59 1.673410 0.5193550 1.2156738
3 60 1.673410 0.5193550 1.2156738
3 61 1.673410 0.5193550 1.2156738
3 62 1.673410 0.5193550 1.2156738
3 63 1.673410 0.5193550 1.2156738
3 64 1.673410 0.5193550 1.2156738
3 65 1.673410 0.5193550 1.2156738
3 66 1.673410 0.5193550 1.2156738
3 67 1.673410 0.5193550 1.2156738
3 68 1.673410 0.5193550 1.2156738
3 69 1.673410 0.5193550 1.2156738
3 70 1.673410 0.5193550 1.2156738
3 71 1.673410 0.5193550 1.2156738
3 72 1.673410 0.5193550 1.2156738
3 73 1.673410 0.5193550 1.2156738
3 74 1.673410 0.5193550 1.2156738
3 75 1.673410 0.5193550 1.2156738
3 76 1.673410 0.5193550 1.2156738
3 77 1.673410 0.5193550 1.2156738
3 78 1.673410 0.5193550 1.2156738
3 79 1.673410 0.5193550 1.2156738
3 80 1.673410 0.5193550 1.2156738
3 81 1.673410 0.5193550 1.2156738
3 82 1.673410 0.5193550 1.2156738
3 83 1.673410 0.5193550 1.2156738
3 84 1.673410 0.5193550 1.2156738
3 85 1.673410 0.5193550 1.2156738
3 86 1.673410 0.5193550 1.2156738
3 87 1.673410 0.5193550 1.2156738
3 88 1.673410 0.5193550 1.2156738
3 89 1.673410 0.5193550 1.2156738
3 90 1.673410 0.5193550 1.2156738
3 91 1.673410 0.5193550 1.2156738
3 92 1.673410 0.5193550 1.2156738
3 93 1.673410 0.5193550 1.2156738
3 94 1.673410 0.5193550 1.2156738
3 95 1.673410 0.5193550 1.2156738
3 96 1.673410 0.5193550 1.2156738
3 97 1.673410 0.5193550 1.2156738
3 98 1.673410 0.5193550 1.2156738
3 99 1.673410 0.5193550 1.2156738
3 100 1.673410 0.5193550 1.2156738
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nprune = 3 and degree = 1.
The model with the lowest RMSE was additive (degree = 1) with 3 terms.
ggplot(marsChemModel)
summary(marsChemModel)
Call: earth(x=data.frame[144,57], y=c(38,42.44,42.0...), keepxy=TRUE, degree=1,
nprune=3)
coefficients
(Intercept) 39.357846
h(0.641807-ManufacturingProcess09) -1.011735
h(ManufacturingProcess32- -1.17911) 1.285230
Selected 3 of 21 terms, and 2 of 57 predictors (nprune=3)
Termination condition: RSq changed by less than 0.001 at 21 terms
Importance: ManufacturingProcess32, ManufacturingProcess09, ...
Number of terms at each degree of interaction: 1 2 (additive model)
GCV 1.362344 RSS 182.7907 GRSq 0.6218734 RSq 0.6427314
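Because the selected model is additive with only two hinge terms, the entire fit can be written out directly (a sketch using the coefficients above; inputs are on the centered-and-scaled scale used inside train):
h <- function(u) pmax(0, u)
yield_hat <- function(mp09_scaled, mp32_scaled) {
  39.357846 -
    1.011735 * h(0.641807 - mp09_scaled) +    # yield drops as scaled MP09 falls below 0.64
    1.285230 * h(mp32_scaled - (-1.17911))    # yield rises with scaled MP32 above -1.18
}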
Next, I try predicting the test set data with the MARS model:
MARSChemPred <- predict(marsChemModel, newdata = chem_testData[,-1])
postResample(pred = MARSChemPred, obs = chem_testData$Yield)
RMSE Rsquared MAE
1.2554338 0.4264021 1.0514768
SVM
Below I tune an SVM model:
set.seed(230)
svmChemModel <- train(x = chem_trainData[,-1],
                      y = chem_trainData$Yield,
                      method = 'svmRadial',
                      preProcess = c('center', 'scale'),
                      tuneLength = 14,
                      trControl = trainControl(method = "cv", number = 10))
svmChemModel
Support Vector Machines with Radial Basis Function Kernel
144 samples
57 predictor
Pre-processing: centered (57), scaled (57)
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 129, 130, 130, 129, 131, 129, ...
Resampling results across tuning parameters:
C RMSE Rsquared MAE
0.25 1.415883 0.5336956 1.1612506
0.50 1.281674 0.5964956 1.0546869
1.00 1.200240 0.6241479 0.9774171
2.00 1.181807 0.6332608 0.9536772
4.00 1.190212 0.6356245 0.9511234
8.00 1.162745 0.6559100 0.9465857
16.00 1.160748 0.6567212 0.9440564
32.00 1.160748 0.6567212 0.9440564
64.00 1.160748 0.6567212 0.9440564
128.00 1.160748 0.6567212 0.9440564
256.00 1.160748 0.6567212 0.9440564
512.00 1.160748 0.6567212 0.9440564
1024.00 1.160748 0.6567212 0.9440564
2048.00 1.160748 0.6567212 0.9440564
Tuning parameter 'sigma' was held constant at a value of 0.01320701
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were sigma = 0.01320701 and C = 16.
ggplot(svmChemModel)
Next, I try predicting the test set data with the SVM model:
SVMChemPred <- predict(svmChemModel, newdata = chem_testData[,-1])
postResample(pred = SVMChemPred, obs = chem_testData$Yield)
RMSE Rsquared MAE
1.0995312 0.5563240 0.9233152
A
We are asked to determine which model gives the optimal resampling and test-set performance.
Below is the KNN model:
knnChemModel
k-Nearest Neighbors
144 samples
57 predictor
Pre-processing: centered (57), scaled (57)
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 128, 130, 130, 129, 130, 130, ...
Resampling results across tuning parameters:
k RMSE Rsquared MAE
1 1.465714 0.4465654 1.116089
2 1.228056 0.5717160 1.013607
3 1.277520 0.5193685 1.035558
4 1.277025 0.5402453 1.023845
5 1.286623 0.5393261 1.032174
6 1.306640 0.5267678 1.044836
7 1.326741 0.5246410 1.062796
8 1.336258 0.5251640 1.078861
9 1.358468 0.5042863 1.092338
10 1.342232 0.5279743 1.077845
11 1.346397 0.5282144 1.094812
12 1.349239 0.5271784 1.092292
13 1.362739 0.5157005 1.100105
14 1.372961 0.5046815 1.101857
15 1.399120 0.4806478 1.112901
16 1.398594 0.4840723 1.114808
17 1.412197 0.4758792 1.126821
18 1.427435 0.4660139 1.140383
19 1.428054 0.4671728 1.135843
20 1.435325 0.4611414 1.145407
RMSE was used to select the optimal model using the smallest value.
The final value used for the model was k = 2.
The resampling RMSE was 1.228056.
The RMSE for the test set is:
postResample(pred = knnChemPred, obs = chem_testData$Yield)
RMSE Rsquared MAE
1.5622837 0.2797031 1.2695312
The test-set RMSE of 1.5622837 is notably worse than the resampling estimate.
Below is the MARS model:
marsChemModel
Multivariate Adaptive Regression Spline
144 samples
57 predictor
Pre-processing: centered (57), scaled (57)
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 129, 129, 128, 131, 130, 129, ...
Resampling results across tuning parameters:
degree nprune RMSE Rsquared MAE
1 2 1.397656 0.4796494 1.0945946
1 3 1.175481 0.6259761 0.9526671
1 4 1.211015 0.6036202 0.9718720
1 5 1.250996 0.5842997 1.0087666
1 6 2.596217 0.4764189 1.4077280
1 7 2.645785 0.4428147 1.4283295
1 8 2.912071 0.4227154 1.5171069
1 9 2.769404 0.4223613 1.5086877
1 10 2.719705 0.4170912 1.5176289
1 11 2.747755 0.4140453 1.5183338
1 12 2.432505 0.4160484 1.4308117
1 13 2.395924 0.4136836 1.4295424
1 14 2.372032 0.4223269 1.4183484
1 15 2.375673 0.4197655 1.4198212
1 16 2.377044 0.4187515 1.4287127
1 17 2.377044 0.4187515 1.4287127
1 18 2.377044 0.4187515 1.4287127
1 19 2.377044 0.4187515 1.4287127
1 20 2.377044 0.4187515 1.4287127
1 21 2.377044 0.4187515 1.4287127
1 22 2.377044 0.4187515 1.4287127
1 23 2.377044 0.4187515 1.4287127
1 24 2.377044 0.4187515 1.4287127
1 25 2.377044 0.4187515 1.4287127
1 26 2.377044 0.4187515 1.4287127
1 27 2.377044 0.4187515 1.4287127
1 28 2.377044 0.4187515 1.4287127
1 29 2.377044 0.4187515 1.4287127
1 30 2.377044 0.4187515 1.4287127
1 31 2.377044 0.4187515 1.4287127
1 32 2.377044 0.4187515 1.4287127
1 33 2.377044 0.4187515 1.4287127
1 34 2.377044 0.4187515 1.4287127
1 35 2.377044 0.4187515 1.4287127
1 36 2.377044 0.4187515 1.4287127
1 37 2.377044 0.4187515 1.4287127
1 38 2.377044 0.4187515 1.4287127
1 39 2.377044 0.4187515 1.4287127
1 40 2.377044 0.4187515 1.4287127
1 41 2.377044 0.4187515 1.4287127
1 42 2.377044 0.4187515 1.4287127
1 43 2.377044 0.4187515 1.4287127
1 44 2.377044 0.4187515 1.4287127
1 45 2.377044 0.4187515 1.4287127
1 46 2.377044 0.4187515 1.4287127
1 47 2.377044 0.4187515 1.4287127
1 48 2.377044 0.4187515 1.4287127
1 49 2.377044 0.4187515 1.4287127
1 50 2.377044 0.4187515 1.4287127
1 51 2.377044 0.4187515 1.4287127
1 52 2.377044 0.4187515 1.4287127
1 53 2.377044 0.4187515 1.4287127
1 54 2.377044 0.4187515 1.4287127
1 55 2.377044 0.4187515 1.4287127
1 56 2.377044 0.4187515 1.4287127
1 57 2.377044 0.4187515 1.4287127
1 58 2.377044 0.4187515 1.4287127
1 59 2.377044 0.4187515 1.4287127
1 60 2.377044 0.4187515 1.4287127
1 61 2.377044 0.4187515 1.4287127
1 62 2.377044 0.4187515 1.4287127
1 63 2.377044 0.4187515 1.4287127
1 64 2.377044 0.4187515 1.4287127
1 65 2.377044 0.4187515 1.4287127
1 66 2.377044 0.4187515 1.4287127
1 67 2.377044 0.4187515 1.4287127
1 68 2.377044 0.4187515 1.4287127
1 69 2.377044 0.4187515 1.4287127
1 70 2.377044 0.4187515 1.4287127
1 71 2.377044 0.4187515 1.4287127
1 72 2.377044 0.4187515 1.4287127
1 73 2.377044 0.4187515 1.4287127
1 74 2.377044 0.4187515 1.4287127
1 75 2.377044 0.4187515 1.4287127
1 76 2.377044 0.4187515 1.4287127
1 77 2.377044 0.4187515 1.4287127
1 78 2.377044 0.4187515 1.4287127
1 79 2.377044 0.4187515 1.4287127
1 80 2.377044 0.4187515 1.4287127
1 81 2.377044 0.4187515 1.4287127
1 82 2.377044 0.4187515 1.4287127
1 83 2.377044 0.4187515 1.4287127
1 84 2.377044 0.4187515 1.4287127
1 85 2.377044 0.4187515 1.4287127
1 86 2.377044 0.4187515 1.4287127
1 87 2.377044 0.4187515 1.4287127
1 88 2.377044 0.4187515 1.4287127
1 89 2.377044 0.4187515 1.4287127
1 90 2.377044 0.4187515 1.4287127
1 91 2.377044 0.4187515 1.4287127
1 92 2.377044 0.4187515 1.4287127
1 93 2.377044 0.4187515 1.4287127
1 94 2.377044 0.4187515 1.4287127
1 95 2.377044 0.4187515 1.4287127
1 96 2.377044 0.4187515 1.4287127
1 97 2.377044 0.4187515 1.4287127
1 98 2.377044 0.4187515 1.4287127
1 99 2.377044 0.4187515 1.4287127
1 100 2.377044 0.4187515 1.4287127
2 2 1.397656 0.4796494 1.0945946
2 3 1.340348 0.5209794 1.0631839
2 4 1.266138 0.5662456 1.0098808
2 5 1.249357 0.5916629 0.9692782
2 6 1.289490 0.5910860 0.9982445
2 7 1.308424 0.5886979 1.0096864
2 8 1.292601 0.6021639 1.0138807
2 9 1.351036 0.5880971 1.0486129
2 10 1.517781 0.5523609 1.1008885
2 11 1.513767 0.5611558 1.0919718
2 12 1.573704 0.5268401 1.1442936
2 13 1.527645 0.5537503 1.1066168
2 14 1.592421 0.5664257 1.1111685
2 15 1.560927 0.5768300 1.0956702
2 16 1.721506 0.5571156 1.1260366
2 17 1.816464 0.5702275 1.1714685
2 18 1.797507 0.5638597 1.1788806
2 19 1.792913 0.5605208 1.1756709
2 20 1.752868 0.5752170 1.1511954
2 21 1.769578 0.5748549 1.1496129
2 22 1.770170 0.5733124 1.1550279
2 23 1.774019 0.5675947 1.1584076
2 24 1.769759 0.5672906 1.1540906
2 25 1.783348 0.5639166 1.1623984
2 26 1.804384 0.5577609 1.1796524
2 27 1.808440 0.5554705 1.1864492
2 28 1.808440 0.5554705 1.1864492
2 29 1.808440 0.5554705 1.1864492
2 30 1.808440 0.5554705 1.1864492
2 31 1.808440 0.5554705 1.1864492
2 32 1.808440 0.5554705 1.1864492
2 33 1.808440 0.5554705 1.1864492
2 34 1.808440 0.5554705 1.1864492
2 35 1.808440 0.5554705 1.1864492
2 36 1.808440 0.5554705 1.1864492
2 37 1.808440 0.5554705 1.1864492
2 38 1.808440 0.5554705 1.1864492
2 39 1.808440 0.5554705 1.1864492
2 40 1.808440 0.5554705 1.1864492
2 41 1.808440 0.5554705 1.1864492
2 42 1.808440 0.5554705 1.1864492
2 43 1.808440 0.5554705 1.1864492
2 44 1.808440 0.5554705 1.1864492
2 45 1.808440 0.5554705 1.1864492
2 46 1.808440 0.5554705 1.1864492
2 47 1.808440 0.5554705 1.1864492
2 48 1.808440 0.5554705 1.1864492
2 49 1.808440 0.5554705 1.1864492
2 50 1.808440 0.5554705 1.1864492
2 51 1.808440 0.5554705 1.1864492
2 52 1.808440 0.5554705 1.1864492
2 53 1.808440 0.5554705 1.1864492
2 54 1.808440 0.5554705 1.1864492
2 55 1.808440 0.5554705 1.1864492
2 56 1.808440 0.5554705 1.1864492
2 57 1.808440 0.5554705 1.1864492
2 58 1.808440 0.5554705 1.1864492
2 59 1.808440 0.5554705 1.1864492
2 60 1.808440 0.5554705 1.1864492
2 61 1.808440 0.5554705 1.1864492
2 62 1.808440 0.5554705 1.1864492
2 63 1.808440 0.5554705 1.1864492
2 64 1.808440 0.5554705 1.1864492
2 65 1.808440 0.5554705 1.1864492
2 66 1.808440 0.5554705 1.1864492
2 67 1.808440 0.5554705 1.1864492
2 68 1.808440 0.5554705 1.1864492
2 69 1.808440 0.5554705 1.1864492
2 70 1.808440 0.5554705 1.1864492
2 71 1.808440 0.5554705 1.1864492
2 72 1.808440 0.5554705 1.1864492
2 73 1.808440 0.5554705 1.1864492
2 74 1.808440 0.5554705 1.1864492
2 75 1.808440 0.5554705 1.1864492
2 76 1.808440 0.5554705 1.1864492
2 77 1.808440 0.5554705 1.1864492
2 78 1.808440 0.5554705 1.1864492
2 79 1.808440 0.5554705 1.1864492
2 80 1.808440 0.5554705 1.1864492
2 81 1.808440 0.5554705 1.1864492
2 82 1.808440 0.5554705 1.1864492
2 83 1.808440 0.5554705 1.1864492
2 84 1.808440 0.5554705 1.1864492
2 85 1.808440 0.5554705 1.1864492
2 86 1.808440 0.5554705 1.1864492
2 87 1.808440 0.5554705 1.1864492
2 88 1.808440 0.5554705 1.1864492
2 89 1.808440 0.5554705 1.1864492
2 90 1.808440 0.5554705 1.1864492
2 91 1.808440 0.5554705 1.1864492
2 92 1.808440 0.5554705 1.1864492
2 93 1.808440 0.5554705 1.1864492
2 94 1.808440 0.5554705 1.1864492
2 95 1.808440 0.5554705 1.1864492
2 96 1.808440 0.5554705 1.1864492
2 97 1.808440 0.5554705 1.1864492
2 98 1.808440 0.5554705 1.1864492
2 99 1.808440 0.5554705 1.1864492
2 100 1.808440 0.5554705 1.1864492
3 2 1.397656 0.4796494 1.0945946
3 3 1.253384 0.5900284 1.0217768
3 4 1.223552 0.6241221 0.9930089
3 5 1.222308 0.6389274 0.9839698
3 6 1.300949 0.6094165 1.0303986
3 7 1.422793 0.5593827 1.1082234
3 8 1.714140 0.5081902 1.1985890
3 9 1.781525 0.4865014 1.2305941
3 10 1.764976 0.4850417 1.2225069
3 11 1.786215 0.4893484 1.2423560
3 12 1.783904 0.4838812 1.2567143
3 13 1.790925 0.4836727 1.2443353
3 14 1.760681 0.4968505 1.2329811
3 15 1.811700 0.4703532 1.2635783
3 16 1.800513 0.4816581 1.2519393
3 17 1.863410 0.4640128 1.2807891
3 18 1.848220 0.4684821 1.2741869
3 19 1.849942 0.4543157 1.2756488
3 20 1.870374 0.4624069 1.2816319
3 21 1.850034 0.4670269 1.2518775
3 22 1.868776 0.4726532 1.2452719
3 23 1.899235 0.4727350 1.2535101
3 24 1.913370 0.4671947 1.2616897
3 25 1.928114 0.4585725 1.2780399
3 26 1.918545 0.4599720 1.2768248
3 27 1.911157 0.4657486 1.2683624
3 28 1.911157 0.4657486 1.2683624
3 29 1.911157 0.4657486 1.2683624
3 30 1.673410 0.5193550 1.2156738
3 31 1.673410 0.5193550 1.2156738
3 32 1.673410 0.5193550 1.2156738
3 33 1.673410 0.5193550 1.2156738
3 34 1.673410 0.5193550 1.2156738
3 35 1.673410 0.5193550 1.2156738
3 36 1.673410 0.5193550 1.2156738
3 37 1.673410 0.5193550 1.2156738
3 38 1.673410 0.5193550 1.2156738
3 39 1.673410 0.5193550 1.2156738
3 40 1.673410 0.5193550 1.2156738
3 41 1.673410 0.5193550 1.2156738
3 42 1.673410 0.5193550 1.2156738
3 43 1.673410 0.5193550 1.2156738
3 44 1.673410 0.5193550 1.2156738
3 45 1.673410 0.5193550 1.2156738
3 46 1.673410 0.5193550 1.2156738
3 47 1.673410 0.5193550 1.2156738
3 48 1.673410 0.5193550 1.2156738
3 49 1.673410 0.5193550 1.2156738
3 50 1.673410 0.5193550 1.2156738
3 51 1.673410 0.5193550 1.2156738
3 52 1.673410 0.5193550 1.2156738
3 53 1.673410 0.5193550 1.2156738
3 54 1.673410 0.5193550 1.2156738
3 55 1.673410 0.5193550 1.2156738
3 56 1.673410 0.5193550 1.2156738
3 57 1.673410 0.5193550 1.2156738
3 58 1.673410 0.5193550 1.2156738
3 59 1.673410 0.5193550 1.2156738
3 60 1.673410 0.5193550 1.2156738
3 61 1.673410 0.5193550 1.2156738
3 62 1.673410 0.5193550 1.2156738
3 63 1.673410 0.5193550 1.2156738
3 64 1.673410 0.5193550 1.2156738
3 65 1.673410 0.5193550 1.2156738
3 66 1.673410 0.5193550 1.2156738
3 67 1.673410 0.5193550 1.2156738
3 68 1.673410 0.5193550 1.2156738
3 69 1.673410 0.5193550 1.2156738
3 70 1.673410 0.5193550 1.2156738
3 71 1.673410 0.5193550 1.2156738
3 72 1.673410 0.5193550 1.2156738
3 73 1.673410 0.5193550 1.2156738
3 74 1.673410 0.5193550 1.2156738
3 75 1.673410 0.5193550 1.2156738
3 76 1.673410 0.5193550 1.2156738
3 77 1.673410 0.5193550 1.2156738
3 78 1.673410 0.5193550 1.2156738
3 79 1.673410 0.5193550 1.2156738
3 80 1.673410 0.5193550 1.2156738
3 81 1.673410 0.5193550 1.2156738
3 82 1.673410 0.5193550 1.2156738
3 83 1.673410 0.5193550 1.2156738
3 84 1.673410 0.5193550 1.2156738
3 85 1.673410 0.5193550 1.2156738
3 86 1.673410 0.5193550 1.2156738
3 87 1.673410 0.5193550 1.2156738
3 88 1.673410 0.5193550 1.2156738
3 89 1.673410 0.5193550 1.2156738
3 90 1.673410 0.5193550 1.2156738
3 91 1.673410 0.5193550 1.2156738
3 92 1.673410 0.5193550 1.2156738
3 93 1.673410 0.5193550 1.2156738
3 94 1.673410 0.5193550 1.2156738
3 95 1.673410 0.5193550 1.2156738
3 96 1.673410 0.5193550 1.2156738
3 97 1.673410 0.5193550 1.2156738
3 98 1.673410 0.5193550 1.2156738
3 99 1.673410 0.5193550 1.2156738
3 100 1.673410 0.5193550 1.2156738
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nprune = 3 and degree = 1.
The resampling RMSE was 1.175481.
The RMSE for the test set is:
postResample(pred = MARSChemPred, obs = chem_testData$Yield)
RMSE Rsquared MAE
1.2554338 0.4264021 1.0514768
Below is the SVM model:
svmChemModel
Support Vector Machines with Radial Basis Function Kernel
144 samples
57 predictor
Pre-processing: centered (57), scaled (57)
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 129, 130, 130, 129, 131, 129, ...
Resampling results across tuning parameters:
C RMSE Rsquared MAE
0.25 1.415883 0.5336956 1.1612506
0.50 1.281674 0.5964956 1.0546869
1.00 1.200240 0.6241479 0.9774171
2.00 1.181807 0.6332608 0.9536772
4.00 1.190212 0.6356245 0.9511234
8.00 1.162745 0.6559100 0.9465857
16.00 1.160748 0.6567212 0.9440564
32.00 1.160748 0.6567212 0.9440564
64.00 1.160748 0.6567212 0.9440564
128.00 1.160748 0.6567212 0.9440564
256.00 1.160748 0.6567212 0.9440564
512.00 1.160748 0.6567212 0.9440564
1024.00 1.160748 0.6567212 0.9440564
2048.00 1.160748 0.6567212 0.9440564
Tuning parameter 'sigma' was held constant at a value of 0.01320701
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were sigma = 0.01320701 and C = 16.
The resampling RMSE was 1.160748.
The RMSE for the test set is:
postResample(pred = SVMChemPred, obs = chem_testData$Yield)
RMSE Rsquared MAE
1.0995312 0.5563240 0.9233152
The SVM performed best on the test set in terms of both RMSE and R-squared (though the test-set R-squared of 0.56 is still somewhat low). Thus, I would recommend the SVM model ahead of the KNN and MARS models.
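caret's resamples() makes the cross-validation comparison more direct (a sketch; for a strictly paired comparison the three models would need to share fold indices via a common trainControl(index = ...), which was not done here):
cv_results <- resamples(list(KNN = knnChemModel, MARS = marsChemModel, SVM = svmChemModel))
summary(cv_results)  # side-by-side RMSE, R-squared, and MAE distributions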
B
Below are the top predictors for the SVM model:
plot(varImp(svmChemModel), main = 'Variable Importance')
Based on the plot above, ManufacturingProcess32 and ManufacturingProcess13 have the greatest importance. While several biological materials appear in the top ten, the manufacturing process variables appear to have more influence on the model overall.
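Note that SVMs have no model-specific importance measure, so varImp falls back to caret's model-free filter approach: each predictor is scored by the R-squared of a loess fit against the outcome. This mirrors calling filterVarImp directly (a sketch):
filterVarImp(x = chem_trainData[,-1], y = chem_trainData$Yield)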
The lasso model is different in nature from the SVM model; however, it too appeared to be driven mostly by the manufacturing process predictors.
C
Below, I explore the relationship between Yield and the ten most important predictors:
final_variables <- varImp(svmChemModel)
final_variables <- as.data.frame(final_variables$importance)
final_variables$variable <- rownames(final_variables)

top_ten <- final_variables[order(-final_variables$Overall),][1:10,]
top_ten_names <- top_ten$variable

library(GGally)
chem_data |> dplyr::select(Yield, all_of(top_ten_names)) |> ggpairs()
From the above, we can see that certain manufacturing processes are positively correlated with yield while others are negatively correlated, and the biological material predictor is positively correlated. This suggests that higher biological material measurements may correspond with higher yields, while individual manufacturing processes can correspond with yield either positively or negatively.
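As a numeric complement to the pairs plot (a sketch against the objects above):
sort(cor(chem_data[, top_ten_names], chem_data$Yield)[, 1], decreasing = TRUE)  # correlation of each top predictor with Yield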