For this exercise we have to train several nonlinear models on these data.
library(mlbench)
library(caret)
## Loading required package: ggplot2
## Loading required package: lattice
set.seed(200)
trainingData <- mlbench.friedman1(200,sd = 1)
trainingData$x <- data.frame(trainingData$x)
featurePlot(trainingData$x,trainingData$y)
## mlbench.friedman1 creates a list with a vector y and a matrix of predictors x; we also simulate a large test set to estimate the true error rate with good precision
testData <- mlbench.friedman1(5000,sd = 1)
testData$x <- data.frame(testData$x)
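For reference, mlbench.friedman1 generates data from Friedman's (1991) benchmark, y = 10*sin(pi*x1*x2) + 20*(x3 - 0.5)^2 + 10*x4 + 5*x5 + e, with inputs uniform on [0, 1]; only X1 through X5 are informative, and X6 through X10 are noise.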
## Tune several models on these data.
knnmodel <- train(x = trainingData$x,y = trainingData$y,method = "knn",preProc = c("center","scale"),tuneLength = 10)
knnmodel
## k-Nearest Neighbors
##
## 200 samples
## 10 predictor
##
## Pre-processing: centered (10), scaled (10)
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 200, 200, 200, 200, 200, 200, ...
## Resampling results across tuning parameters:
##
## k RMSE Rsquared MAE
## 5 3.466085 0.5121775 2.816838
## 7 3.349428 0.5452823 2.727410
## 9 3.264276 0.5785990 2.660026
## 11 3.214216 0.6024244 2.603767
## 13 3.196510 0.6176570 2.591935
## 15 3.184173 0.6305506 2.577482
## 17 3.183130 0.6425367 2.567787
## 19 3.198752 0.6483184 2.592683
## 21 3.188993 0.6611428 2.588787
## 23 3.200458 0.6638353 2.604529
##
## RMSE was used to select the optimal model using the smallest value.
## The final value used for the model was k = 17.
plot(knnmodel)
KnnPred <- predict(knnmodel,newdata = testData$x)
## use postResample to get test-set performance
postResample(KnnPred,testData$y)
## RMSE Rsquared MAE
## 3.2040595 0.6819919 2.5683461
KNN does not appear to have done very well on this data (test RMSE of about 3.20 and R-squared of about 0.68), so we will try other models.
## Taking a look at the MARS model from the textbook
library(earth)
## Warning: package 'earth' was built under R version 4.3.3
## Loading required package: Formula
## Loading required package: plotmo
## Warning: package 'plotmo' was built under R version 4.3.3
## Loading required package: plotrix
## Warning: package 'plotrix' was built under R version 4.3.2
marsfit <- earth(trainingData$x,trainingData$y)
summary(marsfit)
## Call: earth(x=trainingData$x, y=trainingData$y)
##
## coefficients
## (Intercept) 18.451984
## h(0.621722-X1) -11.074396
## h(0.601063-X2) -10.744225
## h(X3-0.281766) 20.607853
## h(0.447442-X3) 17.880232
## h(X3-0.447442) -23.282007
## h(X3-0.636458) 15.150350
## h(0.734892-X4) -10.027487
## h(X4-0.734892) 9.092045
## h(0.850094-X5) -4.723407
## h(X5-0.850094) 10.832932
## h(X6-0.361791) -1.956821
##
## Selected 12 of 18 terms, and 6 of 10 predictors
## Termination condition: Reached nk 21
## Importance: X1, X4, X2, X5, X3, X6, X7-unused, X8-unused, X9-unused, ...
## Number of terms at each degree of interaction: 1 11 (additive model)
## GCV 2.540556 RSS 397.9654 GRSq 0.8968524 RSq 0.9183982
# decided to tweak the grid from the textbook.
set.seed(100)
marsGrid <- expand.grid(.degree = 1:2,.nprune = 2:15)
Marsmodel <- train(trainingData$x,trainingData$y,method = "earth",tuneGrid = marsGrid,trControl = trainControl(method = "cv"))
Marsmodel
## Multivariate Adaptive Regression Spline
##
## 200 samples
## 10 predictor
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 180, 180, 180, 180, 180, 180, ...
## Resampling results across tuning parameters:
##
## degree nprune RMSE Rsquared MAE
## 1 2 4.327937 0.2544880 3.600474
## 1 3 3.572450 0.4912720 2.895811
## 1 4 2.596841 0.7183600 2.106341
## 1 5 2.370161 0.7659777 1.918669
## 1 6 2.276141 0.7881481 1.810001
## 1 7 1.766728 0.8751831 1.390215
## 1 8 1.780946 0.8723243 1.401345
## 1 9 1.665091 0.8819775 1.325515
## 1 10 1.663804 0.8821283 1.327657
## 1 11 1.657738 0.8822967 1.331730
## 1 12 1.653784 0.8827903 1.331504
## 1 13 1.648496 0.8823663 1.316407
## 1 14 1.639073 0.8841742 1.312833
## 1 15 1.639073 0.8841742 1.312833
## 2 2 4.327937 0.2544880 3.600474
## 2 3 3.572450 0.4912720 2.895811
## 2 4 2.661826 0.7070510 2.173471
## 2 5 2.404015 0.7578971 1.975387
## 2 6 2.243927 0.7914805 1.783072
## 2 7 1.856336 0.8605482 1.435682
## 2 8 1.754607 0.8763186 1.396841
## 2 9 1.603578 0.8938666 1.261361
## 2 10 1.492421 0.9084998 1.168700
## 2 11 1.317350 0.9292504 1.033926
## 2 12 1.304327 0.9320133 1.019108
## 2 13 1.277510 0.9323681 1.002927
## 2 14 1.269626 0.9350024 1.003346
## 2 15 1.266217 0.9359400 1.013893
##
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were nprune = 15 and degree = 2.
plot(Marsmodel)
Looking at the model summary we see that X1 through X5, which according to the textbook are the informative predictors, were used in the model (along with X6), and the final values used by the tuned model were nprune = 15 and degree = 2.
Marspred <- predict(Marsmodel,testData$x)
# use postResample on the test set
postResample(Marspred,testData$y)
## RMSE Rsquared MAE
## 1.1589948 0.9460418 0.9250230
varImp(Marsmodel)
## earth variable importance
##
## Overall
## X1 100.00
## X4 75.24
## X2 48.73
## X5 15.52
## X3 0.00
We see from the varImp output that X1 had the highest overall importance, followed by X4 and then X2.
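The importance plot referenced above can be drawn directly from the tuned caret object; a minimal line, assuming the Marsmodel object from above:
# Plot the variable importance scores from the tuned MARS model
plot(varImp(Marsmodel))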
We will now attempt to fit a basic neural network model. The code below mimics the computing section from the textbook.
## First we check for correlated predictors before creating our grid
## findCorrelation returns an empty vector here because no pair of predictors exceeds the 0.75 cutoff
tooCorrelated <- findCorrelation(cor(trainingData$x),cutoff = .75)
tooCorrelated
## integer(0)
Since the predictors aren’t too correlated we can skip the step in the textbook where we filter out the correlated predictors and just set up the grid.
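For reference, the filtering step we skipped would look like the sketch below; it is a no-op here since tooCorrelated is empty.
# Sketch of the textbook's filtering step (harmless here: tooCorrelated is empty)
if (length(tooCorrelated) > 0) {
  trainingData$x <- trainingData$x[, -tooCorrelated]
  testData$x <- testData$x[, -tooCorrelated]
}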
## Create a specific candidate set of models to evaluate
## Textbook says our weight decay should be between 0 and 0.1, and I will let size range from 1 to 10 to play around with it
nnetGrid <- expand.grid(size = c(1:10),decay = c(0,.01,.1))
# stick with 10 fold cross validation again
set.seed(100)
ctrl1 <- trainControl(method = "cv",number = 10)
## Create the model with the train function from caret
# The grid allows up to 10 hidden units and maxit is set to 500
## The .bag parameter from the textbook belongs to method = "avNNet", not "nnet", so it is removed here
set.seed(100)
nnetTune <- train(trainingData$x,trainingData$y,method = "nnet",tuneGrid = nnetGrid,trControl = ctrl1,preProc = c("center","scale"),linout = TRUE,trace = FALSE,MaxNWts = 10 * (ncol(trainingData$x)+ 1) + 10 + 1, maxit = 500)
nnetTune
## Neural Network
##
## 200 samples
## 10 predictor
##
## Pre-processing: centered (10), scaled (10)
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 180, 180, 180, 180, 180, 180, ...
## Resampling results across tuning parameters:
##
## size decay RMSE Rsquared MAE
## 1 0.00 2.540546 0.7254252 2.008197
## 1 0.01 2.385425 0.7602782 1.887777
## 1 0.10 2.393895 0.7596503 1.894167
## 2 0.00 2.566314 0.7212618 2.052922
## 2 0.01 2.620311 0.7133931 2.039737
## 2 0.10 2.592617 0.7216119 2.105093
## 3 0.00 2.302390 0.7766520 1.829400
## 3 0.01 2.278686 0.7724293 1.803214
## 3 0.10 2.472966 0.7519439 1.977240
## 4 0.00 2.433270 0.7497128 1.862868
## 4 0.01 2.485616 0.7430911 2.025216
## 4 0.10 2.358607 0.7772165 1.880072
## 5 0.00 2.515220 0.7472471 2.029753
## 5 0.01 2.430803 0.7607865 1.932468
## 5 0.10 2.311168 0.7845703 1.830923
## 6 0.00 6.784753 0.5364566 3.582875
## 6 0.01 2.613883 0.7309686 2.115403
## 6 0.10 2.534052 0.7478124 2.000938
## 7 0.00 6.272111 0.4964439 3.730671
## 7 0.01 3.152634 0.6475999 2.502389
## 7 0.10 2.576842 0.7834795 2.042534
## 8 0.00 9.132172 0.4711305 5.258124
## 8 0.01 2.990065 0.6809087 2.372930
## 8 0.10 2.911291 0.6864066 2.328065
## 9 0.00 8.411991 0.4618170 4.241654
## 9 0.01 3.173268 0.6966041 2.567788
## 9 0.10 2.903788 0.6864446 2.313660
## 10 0.00 4.954073 0.4182651 3.602381
## 10 0.01 3.508813 0.5912309 2.787225
## 10 0.10 2.973902 0.6608165 2.360836
##
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were size = 3 and decay = 0.01.
plot(nnetTune)
It appears that the model chose a weight decay of 0.01 with 3 hidden units to get the lowest RMSE.
NNpred <- predict(nnetTune,testData$x)
postResample(NNpred,testData$y)
## RMSE Rsquared MAE
## 1.9478991 0.8496598 1.4767314
On the test data the neural network did worse than the MARS model (RMSE 1.95 vs. 1.16) but better than the KNN model (RMSE 3.20).
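For reference, the textbook's computing section tunes a model-averaged network, which is where the bag parameter belongs; a hedged sketch of that variant (not run here):
# Sketch of the textbook's avNNet variant; "avNNet" accepts bag in its grid,
# unlike plain "nnet". MaxNWts caps the weight count: a single-hidden-layer
# network with H units and p inputs has H*(p + 1) + H + 1 weights
# (121 for H = 10, p = 10).
avGrid <- expand.grid(size = 1:10, decay = c(0, .01, .1), bag = FALSE)
avTune <- train(trainingData$x, trainingData$y, method = "avNNet",
                tuneGrid = avGrid, trControl = ctrl1,
                preProc = c("center", "scale"), linout = TRUE, trace = FALSE,
                MaxNWts = 10 * (ncol(trainingData$x) + 1) + 10 + 1, maxit = 500)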
Finally, we attempt to build support vector machine models. I am going to build three SVMs using different kernels and choose the one that does best.
Here I will build an SVM with the radial kernel, a tuneLength of 12, and standard 10-fold cross-validation.
set.seed(100)
svmRtuned1 <- train(trainingData$x,trainingData$y,method = "svmRadial",preProc = c("center","scale"),tuneLength = 12,trControl = ctrl1)
svmRtuned1
## Support Vector Machines with Radial Basis Function Kernel
##
## 200 samples
## 10 predictor
##
## Pre-processing: centered (10), scaled (10)
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 180, 180, 180, 180, 180, 180, ...
## Resampling results across tuning parameters:
##
## C RMSE Rsquared MAE
## 0.25 2.530787 0.7922715 2.013175
## 0.50 2.259539 0.8064569 1.789962
## 1.00 2.099789 0.8274242 1.656154
## 2.00 2.002943 0.8412934 1.583791
## 4.00 1.943618 0.8504425 1.546586
## 8.00 1.918711 0.8547582 1.532981
## 16.00 1.920651 0.8536189 1.536116
## 32.00 1.920651 0.8536189 1.536116
## 64.00 1.920651 0.8536189 1.536116
## 128.00 1.920651 0.8536189 1.536116
## 256.00 1.920651 0.8536189 1.536116
## 512.00 1.920651 0.8536189 1.536116
##
## Tuning parameter 'sigma' was held constant at a value of 0.06509124
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were sigma = 0.06509124 and C = 8.
plot(svmRtuned1)
The plot and the summary show that the lowest RMSE occurred at sigma of about 0.065 and a cost of C = 8, and we see that higher costs result in the same RMSE and R-squared values.
svmRtunedpred <- predict(svmRtuned1,testData$x)
postResample(svmRtunedpred,testData$y)
## RMSE Rsquared MAE
## 2.0631908 0.8275736 1.5662213
The SVM with the radial kernel did reasonably well on the test data (RMSE 2.06, R-squared 0.83).
Here I will build an SVM with the linear kernel, with the same tuneLength and cross-validation.
set.seed(100)
svmRtuned2 <- train(trainingData$x,trainingData$y,method = "svmLinear",preProc = c("center","scale"),tuneLength = 12,trControl = ctrl1)
svmRtuned2
## Support Vector Machines with Linear Kernel
##
## 200 samples
## 10 predictor
##
## Pre-processing: centered (10), scaled (10)
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 180, 180, 180, 180, 180, 180, ...
## Resampling results:
##
## RMSE Rsquared MAE
## 2.414092 0.7548203 1.965221
##
## Tuning parameter 'C' was held constant at a value of 1
There is nothing to plot for the linear kernel because caret held C constant at 1, but the resampling summary shows it performed worse than the radial kernel, with a higher RMSE and a lower R-squared value.
svmRtunedpred2 <- predict(svmRtuned2,testData$x)
postResample(svmRtunedpred2,testData$y)
## RMSE Rsquared MAE
## 2.7633860 0.6973384 2.0970616
The SVM with the linear kernel also performed worse on the test data.
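One note on the linear fit: caret's "svmLinear" holds C = 1 regardless of tuneLength, as the summary above shows. A sketch of how the cost could actually be tuned (not run here):
# Sketch: supply an explicit grid so the cost parameter C actually varies
svmLinGrid <- expand.grid(C = 2^seq(-2, 7))
svmLtuned <- train(trainingData$x, trainingData$y, method = "svmLinear",
                   preProc = c("center", "scale"),
                   tuneGrid = svmLinGrid, trControl = ctrl1)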
The SVM with the polynomial kernel was taking a very long time to fit, so I omitted it here. Of the kernels tried, the SVM with the radial kernel performed better than the linear kernel, so for SVM we will stick with the radial kernel.
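For completeness, the omitted polynomial-kernel fit would look like the sketch below (not run because of the runtime just noted).
# Sketch of the omitted polynomial-kernel SVM; "svmPoly" tunes degree, scale, and C
svmPtuned <- train(trainingData$x, trainingData$y, method = "svmPoly",
                   preProc = c("center", "scale"),
                   tuneLength = 4, trControl = ctrl1)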
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
Tabble <- bind_rows(knn = postResample(KnnPred,testData$y),NN = postResample(NNpred,testData$y),SVMRadial = postResample(svmRtunedpred,testData$y),Mars = postResample(Marspred,testData$y))
Tabble$id = c("knn","Neural Network","Support Vector Machine","MARS")
Tabble
## # A tibble: 4 × 4
## RMSE Rsquared MAE id
## <dbl> <dbl> <dbl> <chr>
## 1 3.20 0.682 2.57 knn
## 2 1.95 0.850 1.48 Neural Network
## 3 2.06 0.828 1.57 Support Vector Machine
## 4 1.16 0.946 0.925 MARS
Looking at the various nonlinear models created in this exercise, the MARS model did best, with the lowest RMSE and the highest R-squared on the test set, so we would go with the MARS model.
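As an aside, caret's resamples() offers an alternative comparison based on resampling distributions rather than a single test set, but it requires every model to be fit on identical resampling indices. A sketch, assuming each model were refit with the shared control below (the knnCV/marsCV/nnetCV/svmCV names are hypothetical):
# Sketch: shared CV folds so resamples() can compare models fairly (not run)
set.seed(100)
sharedFolds <- createFolds(trainingData$y, k = 10, returnTrain = TRUE)
ctrlShared <- trainControl(method = "cv", index = sharedFolds)
# refit each model with trControl = ctrlShared, then:
# summary(resamples(list(KNN = knnCV, MARS = marsCV, NNet = nnetCV, SVM = svmCV)))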
For this exercise we will reuse the same data cleaning and preprocessing from the previous homework, but apply different models.
library(AppliedPredictiveModeling)
## Warning: package 'AppliedPredictiveModeling' was built under R version 4.3.3
data("ChemicalManufacturingProcess")
## we have to find the columns with missing values
na_counts <- colSums(is.na(ChemicalManufacturingProcess))
cols_w_na <- names(na_counts[na_counts > 0])
cols_w_na
## [1] "ManufacturingProcess01" "ManufacturingProcess02" "ManufacturingProcess03"
## [4] "ManufacturingProcess04" "ManufacturingProcess05" "ManufacturingProcess06"
## [7] "ManufacturingProcess07" "ManufacturingProcess08" "ManufacturingProcess10"
## [10] "ManufacturingProcess11" "ManufacturingProcess12" "ManufacturingProcess14"
## [13] "ManufacturingProcess22" "ManufacturingProcess23" "ManufacturingProcess24"
## [16] "ManufacturingProcess25" "ManufacturingProcess26" "ManufacturingProcess27"
## [19] "ManufacturingProcess28" "ManufacturingProcess29" "ManufacturingProcess30"
## [22] "ManufacturingProcess31" "ManufacturingProcess33" "ManufacturingProcess34"
## [25] "ManufacturingProcess35" "ManufacturingProcess36" "ManufacturingProcess40"
## [28] "ManufacturingProcess41"
The missing values all appear in the ManufacturingProcess columns.
## Check each column and impute it
trans <- preProcess(ChemicalManufacturingProcess,method = "knnImpute")
We use the preProcess function and apply knnImpute, following Section 3.9 of the textbook.
## Applying predict() to the preProcess object is how the imputed values are actually generated
imp <- predict(trans,newdata = ChemicalManufacturingProcess)
head(imp)
## Yield BiologicalMaterial01 BiologicalMaterial02 BiologicalMaterial03
## 1 -1.1792673 -0.2261036 -1.5140979 -2.68303622
## 2 1.2263678 2.2391498 1.3089960 -0.05623504
## 3 1.0042258 2.2391498 1.3089960 -0.05623504
## 4 0.6737219 2.2391498 1.3089960 -0.05623504
## 5 1.2534583 1.4827653 1.8939391 1.13594780
## 6 1.8386128 -0.4081962 0.6620886 -0.59859075
## BiologicalMaterial04 BiologicalMaterial05 BiologicalMaterial06
## 1 0.2201765 0.4941942 -1.3828880
## 2 1.2964386 0.4128555 1.1290767
## 3 1.2964386 0.4128555 1.1290767
## 4 1.2964386 0.4128555 1.1290767
## 5 0.9414412 -0.3734185 1.5348350
## 6 1.5894524 1.7305423 0.6192092
## BiologicalMaterial07 BiologicalMaterial08 BiologicalMaterial09
## 1 -0.1313107 -1.233131 -3.3962895
## 2 -0.1313107 2.282619 -0.7227225
## 3 -0.1313107 2.282619 -0.7227225
## 4 -0.1313107 2.282619 -0.7227225
## 5 -0.1313107 1.071310 -0.1205678
## 6 -0.1313107 1.189487 -1.7343424
## BiologicalMaterial10 BiologicalMaterial11 BiologicalMaterial12
## 1 1.1005296 -1.838655 -1.7709224
## 2 1.1005296 1.393395 1.0989855
## 3 1.1005296 1.393395 1.0989855
## 4 1.1005296 1.393395 1.0989855
## 5 0.4162193 0.136256 1.0989855
## 6 1.6346255 1.022062 0.7240877
## ManufacturingProcess01 ManufacturingProcess02 ManufacturingProcess03
## 1 0.2154105 0.5662872 0.3765810
## 2 -6.1497028 -1.9692525 0.1979962
## 3 -6.1497028 -1.9692525 0.1087038
## 4 -6.1497028 -1.9692525 0.4658734
## 5 -0.2784345 -1.9692525 0.1087038
## 6 0.4348971 -1.9692525 0.5551658
## ManufacturingProcess04 ManufacturingProcess05 ManufacturingProcess06
## 1 0.5655598 -0.44593467 -0.5414997
## 2 -2.3669726 0.99933318 0.9625383
## 3 -3.1638563 0.06246417 -0.1117745
## 4 -3.3232331 0.42279841 2.1850322
## 5 -2.2075958 0.84537219 -0.6304083
## 6 -1.2513352 0.49486525 0.5550403
## ManufacturingProcess07 ManufacturingProcess08 ManufacturingProcess09
## 1 -0.1596700 -0.3095182 -1.7201524
## 2 -0.9580199 0.8941637 0.5883746
## 3 1.0378549 0.8941637 -0.3815947
## 4 -0.9580199 -1.1119728 -0.4785917
## 5 1.0378549 0.8941637 -0.4527258
## 6 1.0378549 0.8941637 -0.2199332
## ManufacturingProcess10 ManufacturingProcess11 ManufacturingProcess12
## 1 -0.07700901 -0.09157342 -0.4806937
## 2 0.52297397 1.08204765 -0.4806937
## 3 0.31428424 0.55112383 -0.4806937
## 4 -0.02483658 0.80261406 -0.4806937
## 5 -0.39004361 0.10403009 -0.4806937
## 6 0.28819802 1.41736795 -0.4806937
## ManufacturingProcess13 ManufacturingProcess14 ManufacturingProcess15
## 1 0.97711512 0.8093999 1.1846438
## 2 -0.50030980 0.2775205 0.9617071
## 3 0.28765016 0.4425865 0.8245152
## 4 0.28765016 0.7910592 1.0817499
## 5 0.09066017 2.5334227 3.3282665
## 6 -0.50030980 2.4050380 3.1396277
## ManufacturingProcess16 ManufacturingProcess17 ManufacturingProcess18
## 1 0.3303945 0.9263296 0.1505348
## 2 0.1455765 -0.2753953 0.1559773
## 3 0.1455765 0.3655246 0.1831898
## 4 0.1967569 0.3655246 0.1695836
## 5 0.4754056 -0.3555103 0.2076811
## 6 0.6261033 -0.7560852 0.1423710
## ManufacturingProcess19 ManufacturingProcess20 ManufacturingProcess21
## 1 0.4563798 0.3109942 0.2109804
## 2 1.5095063 0.1849230 0.2109804
## 3 1.0926437 0.1849230 0.2109804
## 4 0.9829430 0.1562704 0.2109804
## 5 1.6192070 0.2938027 -0.6884239
## 6 1.9044287 0.3998171 -0.5599376
## ManufacturingProcess22 ManufacturingProcess23 ManufacturingProcess24
## 1 0.05833309 0.8317688 0.8907291
## 2 -0.72230090 -1.8147683 -1.0060115
## 3 -0.42205706 -1.2132826 -0.8335805
## 4 -0.12181322 -0.6117969 -0.6611496
## 5 0.77891831 0.5911745 1.5804530
## 6 1.07916216 -1.2132826 -1.3508734
## ManufacturingProcess25 ManufacturingProcess26 ManufacturingProcess27
## 1 0.1200183 0.1256347 0.3460352
## 2 0.1093082 0.1966227 0.1906613
## 3 0.1842786 0.2159831 0.2104362
## 4 0.1708910 0.2052273 0.1906613
## 5 0.2726365 0.2912733 0.3432102
## 6 0.1146633 0.2417969 0.3516852
## ManufacturingProcess28 ManufacturingProcess29 ManufacturingProcess30
## 1 0.7826636 0.5943242 0.7566948
## 2 0.8779201 0.8347250 0.7566948
## 3 0.8588688 0.7746248 0.2444430
## 4 0.8588688 0.7746248 0.2444430
## 5 0.8969714 0.9549255 -0.1653585
## 6 0.9160227 1.0150257 0.9615956
## ManufacturingProcess31 ManufacturingProcess32 ManufacturingProcess33
## 1 -0.1952552 -0.4568829 0.9890307
## 2 -0.2672523 1.9517531 0.9890307
## 3 -0.1592567 2.6928719 0.9890307
## 4 -0.1592567 2.3223125 1.7943843
## 5 -0.1412574 2.3223125 2.5997378
## 6 -0.3572486 2.6928719 2.5997378
## ManufacturingProcess34 ManufacturingProcess35 ManufacturingProcess36
## 1 -1.7202722 -0.88694718 -0.6557774
## 2 1.9568096 1.14638329 -0.6557774
## 3 1.9568096 1.23880740 -1.8000420
## 4 0.1182687 0.03729394 -1.8000420
## 5 0.1182687 -2.55058120 -2.9443066
## 6 0.1182687 -0.51725073 -1.8000420
## ManufacturingProcess37 ManufacturingProcess38 ManufacturingProcess39
## 1 -1.1540243 0.7174727 0.2317270
## 2 2.2161351 -0.8224687 0.2317270
## 3 -0.7046697 -0.8224687 0.2317270
## 4 0.4187168 -0.8224687 0.2317270
## 5 -1.8280562 -0.8224687 0.2981503
## 6 -1.3787016 -0.8224687 0.2317270
## ManufacturingProcess40 ManufacturingProcess41 ManufacturingProcess42
## 1 0.05969714 -0.06900773 0.20279570
## 2 2.14909691 2.34626280 -0.05472265
## 3 -0.46265281 -0.44058781 0.40881037
## 4 -0.46265281 -0.44058781 -0.31224099
## 5 -0.46265281 -0.44058781 -0.10622632
## 6 -0.46265281 -0.44058781 0.15129203
## ManufacturingProcess43 ManufacturingProcess44 ManufacturingProcess45
## 1 2.40564734 -0.01588055 0.64371849
## 2 -0.01374656 0.29467248 0.15220242
## 3 0.10146268 -0.01588055 0.39796046
## 4 0.21667191 -0.01588055 -0.09355562
## 5 0.21667191 -0.32643359 -0.09355562
## 6 1.48397347 -0.01588055 -0.33931365
All columns were centered and scaled as part of the transformation, and the missing values were imputed with KNN (the preProcess default is k = 5).
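A quick sanity check on the imputation (a minimal sketch):
# No missing values should remain after knnImpute; expect 0
sum(is.na(imp))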
## Separate the response (Yield) from the predictors
impnoY <- imp %>%
select(-Yield)
set.seed(1)
trainRow <- createDataPartition(imp$Yield, p=0.8, list=FALSE)
imp.train <- impnoY[trainRow, ]
Yield.train <- imp[trainRow,]$Yield
imp.test <- impnoY[-trainRow, ]
Yield.test <- imp[-trainRow,]$Yield
## Note: we got a zero-variance predictor warning for BiologicalMaterial07
knnmodel2 <- train(x = imp.train,y = Yield.train,method = "knn",preProc = c("center","scale"),tuneLength = 10)
knnmodel2
## k-Nearest Neighbors
##
## 144 samples
## 57 predictor
##
## Pre-processing: centered (57), scaled (57)
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 144, 144, 144, 144, 144, 144, ...
## Resampling results across tuning parameters:
##
## k RMSE Rsquared MAE
## 5 0.8291537 0.3668913 0.6533378
## 7 0.8211288 0.3769018 0.6488012
## 9 0.8080362 0.3954928 0.6430315
## 11 0.8027950 0.4033675 0.6446828
## 13 0.7954209 0.4188734 0.6373942
## 15 0.8003301 0.4156814 0.6410707
## 17 0.8041667 0.4135066 0.6434692
## 19 0.8080031 0.4134072 0.6473769
## 21 0.8120783 0.4125170 0.6503615
## 23 0.8174328 0.4090511 0.6541833
##
## RMSE was used to select the optimal model using the smallest value.
## The final value used for the model was k = 13.
plot(knnmodel2)
We see that thirteen nearest neighbors gives the lowest RMSE. Based on the resampled training results (R-squared of about 0.42), KNN does not appear to fit the data well.
knnpred3 <- predict(knnmodel2,imp.test)
postResample(knnpred3,Yield.test)
## RMSE Rsquared MAE
## 0.5931631 0.6003265 0.5191605
KNN performed better on the test data than in resampling, suggesting the model was able to generalize to new data.
## First we remove correlated predictors and create our grid
tooCorrelated <- findCorrelation(cor(imp.train),cutoff = .75)
tooCorrelated
## [1] 2 6 8 1 12 4 44 27 41 42 21 26 25 54 57 56 38 37 43 30 52
Some of the predictors are highly correlated with each other, so we will have to remove them from our train and test sets.
## We can create new train and test sets according to the textbook
trainxnet <- imp.train[,-tooCorrelated]
testxnet <- imp.test[,-tooCorrelated]
And now we can create our model.
## Create a specific candidate set of models to evaluate
## Textbook says our weight decay should be between 0 and 0.1, and I will let size range from 1 to 5 to play around with it
nnetGrid2 <- expand.grid(size = c(1:5),decay = c(0,.01,.1))
# stick with 10 fold cross validation
set.seed(102)
ctrl2 <- trainControl(method = "cv",number = 10)
## Create the model with the train function from caret
# The grid allows up to 5 hidden units; MaxNWts is sized for 10 and maxit is 500
## As before, the .bag parameter from the textbook belongs to method = "avNNet", not "nnet", so it is omitted here
set.seed(102)
nnetTune2 <- train(trainxnet,Yield.train,method = "nnet",tuneGrid = nnetGrid2,trControl = ctrl2,preProc = c("center","scale"),linout = TRUE,trace = FALSE,MaxNWts = 10 * (ncol(trainxnet)+ 1) + 10 + 1, maxit = 500)
nnetTune2
## Neural Network
##
## 144 samples
## 36 predictor
##
## Pre-processing: centered (36), scaled (36)
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 130, 129, 130, 130, 128, 129, ...
## Resampling results across tuning parameters:
##
## size decay RMSE Rsquared MAE
## 1 0.00 0.9208304 0.3632362 0.7653970
## 1 0.01 0.8638859 0.3767211 0.6847736
## 1 0.10 0.7981602 0.4796208 0.6396534
## 2 0.00 0.9369980 0.3752629 0.7527670
## 2 0.01 0.9366374 0.4003136 0.7079207
## 2 0.10 0.8862205 0.4525297 0.6856467
## 3 0.00 0.8853312 0.4331014 0.7078507
## 3 0.01 1.1003103 0.3732871 0.8775116
## 3 0.10 0.9017420 0.4690088 0.7068700
## 4 0.00 1.1458739 0.3571929 0.9214667
## 4 0.01 1.0886435 0.3867733 0.8657484
## 4 0.10 0.9806198 0.4010995 0.7518177
## 5 0.00 1.3230848 0.2166120 1.0446854
## 5 0.01 1.0292122 0.3139016 0.8082850
## 5 0.10 0.8210496 0.4731805 0.6478778
##
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were size = 1 and decay = 0.1.
plot(nnetTune2)
The final parameters used for the lowest RMSE were size = 1 and a weight decay of 0.1.
NNpred2 <-predict(nnetTune2,testxnet)
postResample(NNpred2,Yield.test)
## RMSE Rsquared MAE
## 0.6999993 0.4787150 0.5919719
The neural network with these settings performed worse on the test set than KNN.
## Create a grid, allowing up to three degrees
marsgrid2 <- expand.grid(.degree = 1:3,.nprune = 2:30)
set.seed(102)
marsTuned <- train(imp.train,Yield.train,method = "earth",trControl = trainControl(method = "cv"),tuneGrid = marsgrid2)
marsTuned
## Multivariate Adaptive Regression Spline
##
## 144 samples
## 57 predictor
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 130, 129, 130, 130, 128, 129, ...
## Resampling results across tuning parameters:
##
## degree nprune RMSE Rsquared MAE
## 1 2 0.7401828 0.5076835 0.5859174
## 1 3 0.6700539 0.6143070 0.5478887
## 1 4 0.6336196 0.6339889 0.5213229
## 1 5 0.6380850 0.6223464 0.5330441
## 1 6 0.6493623 0.6225629 0.5253757
## 1 7 0.6781870 0.5902393 0.5623102
## 1 8 0.6670887 0.5997152 0.5330688
## 1 9 0.6570267 0.6124918 0.5262138
## 1 10 0.6388756 0.6317758 0.5094791
## 1 11 0.6552031 0.6037919 0.5232743
## 1 12 0.6577968 0.6088902 0.5296082
## 1 13 0.6690541 0.6000783 0.5423353
## 1 14 0.6689753 0.5993107 0.5396548
## 1 15 0.6777116 0.5914683 0.5506578
## 1 16 0.6843519 0.5875380 0.5539098
## 1 17 0.6847553 0.5897997 0.5561409
## 1 18 0.6830386 0.5906434 0.5553786
## 1 19 0.6858592 0.5875657 0.5591931
## 1 20 0.6832162 0.5870476 0.5605455
## 1 21 0.6855152 0.5834529 0.5660780
## 1 22 0.6855152 0.5834529 0.5660780
## 1 23 0.6855152 0.5834529 0.5660780
## 1 24 0.6855152 0.5834529 0.5660780
## 1 25 0.6855152 0.5834529 0.5660780
## 1 26 0.6855152 0.5834529 0.5660780
## 1 27 0.6855152 0.5834529 0.5660780
## 1 28 0.6855152 0.5834529 0.5660780
## 1 29 0.6855152 0.5834529 0.5660780
## 1 30 0.6855152 0.5834529 0.5660780
## 2 2 0.7401828 0.5076835 0.5859174
## 2 3 0.6858121 0.5699007 0.5520895
## 2 4 0.6710553 0.5797737 0.5440391
## 2 5 0.6822840 0.5778308 0.5530664
## 2 6 0.6845224 0.5639186 0.5548198
## 2 7 0.7442537 0.5219663 0.5662516
## 2 8 0.7586647 0.5112531 0.5762678
## 2 9 0.8138133 0.4577877 0.5959540
## 2 10 0.8431044 0.4171317 0.6106736
## 2 11 0.8217784 0.4393290 0.5945527
## 2 12 0.8017329 0.4579097 0.5893465
## 2 13 1.1532784 0.4466222 0.7026523
## 2 14 1.1544112 0.4427076 0.7034843
## 2 15 1.1460628 0.4506502 0.7035306
## 2 16 1.1397684 0.4590504 0.7018349
## 2 17 1.1387850 0.4796401 0.6967700
## 2 18 1.1673808 0.4660005 0.7139729
## 2 19 1.1705725 0.4691587 0.7219652
## 2 20 1.1802355 0.4645687 0.7273957
## 2 21 1.1802355 0.4645687 0.7273957
## 2 22 1.1802355 0.4645687 0.7273957
## 2 23 1.1802355 0.4645687 0.7273957
## 2 24 1.1802355 0.4645687 0.7273957
## 2 25 1.1802355 0.4645687 0.7273957
## 2 26 1.1802355 0.4645687 0.7273957
## 2 27 1.1802355 0.4645687 0.7273957
## 2 28 1.1802355 0.4645687 0.7273957
## 2 29 1.1802355 0.4645687 0.7273957
## 2 30 1.1802355 0.4645687 0.7273957
## 3 2 0.7401828 0.5076835 0.5859174
## 3 3 0.6521967 0.6021238 0.5292473
## 3 4 0.6465669 0.6188624 0.5236559
## 3 5 0.6709029 0.5711027 0.5551302
## 3 6 0.6761283 0.5651955 0.5720745
## 3 7 0.6763515 0.5786132 0.5566866
## 3 8 0.6621672 0.5995342 0.5372266
## 3 9 0.6634479 0.6006042 0.5373258
## 3 10 0.6734689 0.5910775 0.5378345
## 3 11 0.6654518 0.5918416 0.5321995
## 3 12 0.6884503 0.5758691 0.5560502
## 3 13 0.7053613 0.5595778 0.5637820
## 3 14 0.7096525 0.5615177 0.5685833
## 3 15 0.7334520 0.5394750 0.5862697
## 3 16 0.7774417 0.5261401 0.6039763
## 3 17 0.8292614 0.5185614 0.6346414
## 3 18 0.8283822 0.5243659 0.6320672
## 3 19 0.8285162 0.5213362 0.6333991
## 3 20 0.8354142 0.5185224 0.6428146
## 3 21 0.8298619 0.5242205 0.6424197
## 3 22 0.8446325 0.5104117 0.6562010
## 3 23 0.8454015 0.5102857 0.6539848
## 3 24 0.8347014 0.5121538 0.6502187
## 3 25 0.8384520 0.5077728 0.6523437
## 3 26 1.0147562 0.4619587 0.7194270
## 3 27 1.0096110 0.4613875 0.7194332
## 3 28 0.9787094 0.4631364 0.7072870
## 3 29 0.9817739 0.4590483 0.7092167
## 3 30 0.9817739 0.4590483 0.7092167
##
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were nprune = 4 and degree = 1.
plot(marsTuned)
We see fluctuations across the different degrees, but ultimately the lowest RMSE occurs at degree 1 with 4 terms retained (nprune = 4).
maarsPred <- predict(marsTuned,imp.test)
postResample(maarsPred,Yield.test)
## RMSE Rsquared MAE
## 0.6321206 0.5108776 0.5290423
It appears the MARS model did slightly worse than KNN on the test set but better than the neural network.
I am going to stick with the radial kernel for our SVM model.
set.seed(123)
svmRTune <- train(imp.train,Yield.train,method = "svmRadial",preProc = c("center","scale"),tuneLength = 10,trControl = trainControl(method = "cv"))
svmRTune
## Support Vector Machines with Radial Basis Function Kernel
##
## 144 samples
## 57 predictor
##
## Pre-processing: centered (57), scaled (57)
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 128, 129, 129, 130, 128, 131, ...
## Resampling results across tuning parameters:
##
## C RMSE Rsquared MAE
## 0.25 0.7919846 0.4880654 0.6359145
## 0.50 0.7241610 0.5438409 0.5798627
## 1.00 0.6657428 0.6114859 0.5334832
## 2.00 0.6353959 0.6311046 0.5076299
## 4.00 0.6296930 0.6363929 0.4999562
## 8.00 0.6291753 0.6396011 0.4962631
## 16.00 0.6292338 0.6397384 0.4963570
## 32.00 0.6292338 0.6397384 0.4963570
## 64.00 0.6292338 0.6397384 0.4963570
## 128.00 0.6292338 0.6397384 0.4963570
##
## Tuning parameter 'sigma' was held constant at a value of 0.01460699
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were sigma = 0.01460699 and C = 8.
plot(svmRTune)
As with the MARS model, I got a zero-variance warning for a single predictor. The model chose a cost of C = 8 with sigma of about 0.0146.
svmpred <- predict(svmRTune,imp.test)
postResample(svmpred,Yield.test)
## RMSE Rsquared MAE
## 0.5337424 0.6735164 0.4331323
The SVM performed well on the test set as well; its R-squared (0.67) is the highest of the four models.
Tablee2 <- bind_rows(knn = postResample(knnpred3,Yield.test),NN = postResample(NNpred2,Yield.test),mars = postResample(maarsPred,Yield.test),svm = postResample(svmpred,Yield.test))
Tablee2$id = c("KNN","NN","MARS","SVM")
Tablee2
## # A tibble: 4 × 4
## RMSE Rsquared MAE id
## <dbl> <dbl> <dbl> <chr>
## 1 0.593 0.600 0.519 KNN
## 2 0.700 0.479 0.592 NN
## 3 0.632 0.511 0.529 MARS
## 4 0.534 0.674 0.433 SVM
Looking at the table, the SVM with the radial kernel performed best on the test set.
plot(varImp(svmRTune),top = 20)
According to the varImp plot, the ManufacturingProcess predictors play a big part in predicting the yield. Among the top ten we see a mix of ManufacturingProcess and BiologicalMaterial predictors dominating yield prediction. Comparing the top ten predictors across models: in this nonlinear model an even mix of ManufacturingProcess and BiologicalMaterial predictors carries high importance, whereas the linear model from the previous homework placed greater importance on the ManufacturingProcess predictors than on the BiologicalMaterial predictors.
If we look at the top ten predictors, the variable importance plot tells us to focus on ManufacturingProcess32, 13, and 36, and on BiologicalMaterial06 and 03, in order to produce the greatest yield. This also suggests the SVM captured patterns the linear model could not: the nonlinear model indicates that some biological materials are highly important in determining the highest yield.
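As a follow-up, one could relate the top predictors to Yield directly; a sketch, assuming the imp and svmRTune objects from above:
# Sketch: correlation of the ten most important predictors with Yield
impScores <- varImp(svmRTune)$importance
topVars <- rownames(impScores)[order(impScores$Overall, decreasing = TRUE)][1:10]
cor(imp[, topVars], imp$Yield)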