DATA 624 Homework 9
Question 8.1
In this exercise we work with the simulated dataset from Exercise 7.2. We use the mlbench.friedman1 function from the mlbench library to simulate data for ten predictor variables V1 through V10 and one response variable y from the following nonlinear equation:
\[y = 10 \sin (\pi V_1 V_2) + 20 (V_3 -0.5)^2 + 10 V_4 + 5 V_5 + \epsilon\]
where the \(V_j\) are random variables uniformly distributed on [0, 1] and the error term \(\epsilon \sim N(0, \sigma^2)\) is normally distributed. Note that only the first five \(V_j\) variables enter the equation for the response \(y\); the remaining \(V_j\) variables are non-informative / noise variables. Our sample size is 200.
set.seed(200)
simulated <- mlbench.friedman1(200, sd = 1)
simulated <- cbind(simulated$x, simulated$y)
simulated <- as.data.frame(simulated)
colnames(simulated)[ncol(simulated)] <- "y"
(a) Fit a random forest model to all of the predictors, then estimate the variable importance scores:
set.seed(1012)
rf <- randomForest(y ~ ., data = simulated,
importance = TRUE,
ntree = 1000, mtry = 3)
rf
##
## Call:
## randomForest(formula = y ~ ., data = simulated, importance = TRUE, ntree = 1000, mtry = 3)
## Type of random forest: regression
## Number of trees: 1000
## No. of variables tried at each split: 3
##
## Mean of squared residuals: 6.592795
## % Var explained: 72.96
The variable importance scores are output below. The most important variables in the random forest model are V1, V4, V2, and V5; the informative predictor V3 receives only a small score, and none of the non-informative predictors (V6 through V10) receive any meaningful importance.
## Overall
## V1 8.843111596
## V2 6.642947053
## V3 0.670424188
## V4 7.707206179
## V5 2.224687958
## V6 0.214742062
## V7 -0.004430714
## V8 -0.090858381
## V9 0.030887887
## V10 -0.014806425
Additional Predictor
Now add an additional predictor that is highly correlated with one of the informative predictors. For example:
set.seed(200)
simulated$duplicate1 <- simulated$V1 + rnorm(200) * .1
cor(simulated$duplicate1, simulated$V1)
## [1] 0.9497025
Fit another random forest model to these data. Did the importance score for V1 change? What happens when you add another predictor that is also highly correlated with V1?
rf2 <- randomForest(y ~ .,
data = simulated,
importance = TRUE,
ntree = 1000)
rfImp1 <- varImp(rf, scale = FALSE)
rfImp2 <- varImp(rf2, scale = FALSE)
rfImp <- merge(rfImp1, rfImp2, by = "row.names", all = TRUE, sort = FALSE)
colnames(rfImp) <- c("Variable", "RF", "RF_dup")
rfImp %>% kable(digits = 3,
caption = "Variable importance scores for random forest model")
Variable | RF | RF_dup |
---|---|---|
V1 | 8.843 | 6.007 |
V2 | 6.643 | 6.059 |
V3 | 0.670 | 0.585 |
V4 | 7.707 | 6.864 |
V5 | 2.225 | 2.199 |
V6 | 0.215 | 0.109 |
V7 | -0.004 | 0.061 |
V8 | -0.091 | -0.041 |
V9 | 0.031 | 0.061 |
V10 | -0.015 | 0.100 |
duplicate1 | NA | 4.433 |
We see that the importance score for V1 has changed, and in fact has been diluted by the effect of the correlated predictor. In the first RF model the importance score for V1 is ~8.8, whereas in the second RF model the influence of V1 has been split between V1 (with score ~6.0) and duplicate1 (with score ~4.4).
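To answer the second part of the question, here is a minimal sketch of adding yet another predictor highly correlated with V1 and refitting; the seed and object names are illustrative. We would expect the importance of V1 to be diluted further, now split across V1, duplicate1, and duplicate2.
set.seed(201)
simulated$duplicate2 <- simulated$V1 + rnorm(200) * .1
cor(simulated$duplicate2, simulated$V1)
rf3 <- randomForest(y ~ ., data = simulated, importance = TRUE, ntree = 1000)
varImp(rf3, scale = FALSE)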
cForest Function
Use the cforest function in the party package to fit a random forest model using conditional inference trees. The party package function varimp can calculate predictor importance. The conditional argument of that function toggles between the traditional importance measure and the modified version described in Strobl et al. (2007). Do these importances show the same pattern as the traditional random forest model?
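A sketch of how this could be done with the party package (settings are illustrative, not tuned). The conditional = TRUE version adjusts each permutation for correlated predictors, so it should down-weight duplicate1 relative to the traditional measure.
library(party)
set.seed(1012)
cf <- cforest(y ~ ., data = simulated,
              controls = cforest_unbiased(ntree = 1000, mtry = 3))
varimp(cf, conditional = FALSE)  # traditional permutation importance
varimp(cf, conditional = TRUE)   # Strobl et al. (2007) conditional importance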
Boosted Trees and Cubist
Repeat this process with different tree models, such as boosted trees and Cubist. Does the same pattern occur?
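A hedged sketch of the same check with boosted trees (gbm) and Cubist; the tuning values below are illustrative rather than optimized.
library(gbm)
library(Cubist)
set.seed(1012)
gbm_fit <- gbm(y ~ ., data = simulated, distribution = "gaussian",
               n.trees = 1000, interaction.depth = 3, shrinkage = 0.1)
summary(gbm_fit, plotit = FALSE)  # relative influence of each predictor
cubist_fit <- cubist(x = simulated[, setdiff(names(simulated), "y")],
                     y = simulated$y, committees = 10)
varImp(cubist_fit)                # usage-based importance from the rule sets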
Question 8.2
Use a simulation to show tree bias with different granularities. In this chapter we learned that single regression trees suffer from selection bias: predictors with a larger number of distinct values (finer granularity) are favored over predictors with fewer distinct values (coarser granularity).
The simulation below creates two predictors, X1 and X2, each with 200 values. X1 takes only two distinct values (coarse granularity) and is related to the response, while X2 is a continuous noise variable (fine granularity) with no relationship to the response.
Even though X1 is the only predictor truly related to the response, the highly granular X2 offers far more candidate split points, so the tree assigns it the larger importance score. This illustrates the selection bias toward granular predictors.
library(rpart)
set.seed(200)
X1 <- rep(1:2, each = 100)          # coarse predictor: two distinct values, related to Y
X2 <- rnorm(200, mean = 0, sd = 2)  # granular predictor: continuous noise, unrelated to Y
Y <- X1 + rnorm(200, mean = 0, sd = 4)
df1 <- data.frame(Y = Y, X1 = X1, X2 = X2)
mod <- rpart(Y ~ ., data = df1)
varImp(mod)
## Overall
## X1 0.1390440
## X2 0.4393341
Question 8.3
Figure 8.24
In stochastic gradient boosting the bagging fraction and learning rate will govern the construction of the trees as they are guided by the gradient. Although the optimal values of these parameters should be obtained through the tuning process, it is helpful to understand how the magnitudes of these parameters affect magnitudes of variable importance. Figure 8.24 provides the variable importance plots for boosting using two extreme values for the bagging fraction (0.1 and 0.9) and the learning rate (0.1 and 0.9) for the solubility data. The left-hand plot has both parameters set to 0.1, and the right-hand plot has both set to 0.9:
Predictors
Why does the model on the right focus its importance on just the first few of predictors, whereas the model on the left spreads importance across more predictors?
The model on the right focuses its importance on just the first few predictors because a high learning rate makes each tree's contribution large, so the ensemble commits quickly to the strongest predictors and relies on fewer of them. Likewise, a high bagging fraction means each tree is built on nearly the same training data, so the same dominant predictors are chosen repeatedly and fewer predictors end up being identified as important.
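As a minimal sketch of the two extremes in Fig. 8.24, assuming the solubility data from the AppliedPredictiveModeling package (the tree count and depth below are illustrative):
library(gbm)
library(AppliedPredictiveModeling)
data(solubility)  # provides solTrainXtrans and solTrainY
fit_extreme <- function(bag_frac, shrink) {
  gbm.fit(solTrainXtrans, solTrainY, distribution = "gaussian",
          n.trees = 100, interaction.depth = 7,
          shrinkage = shrink, bag.fraction = bag_frac, verbose = FALSE)
}
gbm_low  <- fit_extreme(0.1, 0.1)  # left-hand panel: both parameters 0.1
gbm_high <- fit_extreme(0.9, 0.9)  # right-hand panel: both parameters 0.9
head(summary(gbm_low,  plotit = FALSE), 10)  # importance spread across many predictors
head(summary(gbm_high, plotit = FALSE), 10)  # importance concentrated in a few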
Predictive
Which model do you think would be more predictive of other samples?
The learning rate and bagging fraction control how aggressively gradient boosting fits the training data, so smaller values of both generally give better generalization to unseen samples. A model with a learning rate and bagging fraction of 0.1 should therefore be more predictive of other samples. There is still a bias-variance trade-off to manage, but an ensemble of many weak learners is usually preferable to a few strong ones.
Interaction Depth
How would increasing interaction depth affect the slope of predictor importance for either model in Fig. 8.24?
Increasing the interaction depth would most likely flatten (decrease) the slope of predictor importance for either model. Greater depth produces stronger, more complex learners at each iteration, each of which involves more variables. As a result, the final model draws on a greater diversity of influential variables, so the importance scores are less concentrated in the top predictors.
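A follow-up sketch under the same assumptions (solubility data from AppliedPredictiveModeling), varying only the interaction depth to see how quickly the sorted importance scores fall off:
library(gbm)
library(AppliedPredictiveModeling)
data(solubility)
fit_depth <- function(depth) {
  gbm.fit(solTrainXtrans, solTrainY, distribution = "gaussian",
          n.trees = 100, interaction.depth = depth,
          shrinkage = 0.1, bag.fraction = 0.5, verbose = FALSE)
}
imp_shallow <- summary(fit_depth(1),  plotit = FALSE)  # stumps
imp_deep    <- summary(fit_depth(10), plotit = FALSE)  # deeper trees
head(imp_shallow, 10)  # steeper drop-off in rel.inf
head(imp_deep, 10)     # flatter drop-off: importance spread over more predictors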
Question 8.7 - Chemical Manufacturing Process
Refer to Exercises 6.3 and 7.5 which describe a chemical manufacturing process. Use the same data imputation, data splitting, and pre-processing steps as before and train several tree-based models:
knn_model <- preProcess(ChemicalManufacturingProcess, "knnImpute")
df <- predict(knn_model, ChemicalManufacturingProcess)
df <- df %>%
select_at(vars(-one_of(nearZeroVar(., names = TRUE))))
in_train <- createDataPartition(df$Yield, times = 1, p = 0.8, list = FALSE)
train_df <- df[in_train, ]
test_df <- df[-in_train, ]
pls_model <- train(
Yield ~ ., data = train_df, method = "pls",
center = TRUE,
scale = TRUE,
trControl = trainControl("cv", number = 10),
tuneLength = 25
)
pls_predictions <- predict(pls_model, test_df)
pls_in_sample <- pls_model$results[pls_model$results$ncomp == pls_model$bestTune$ncomp,]
results <- data.frame(t(postResample(pred = pls_predictions, obs = test_df$Yield))) %>%
mutate("In Sample RMSE" = pls_in_sample$RMSE,
"In Sample Rsquared" = pls_in_sample$Rsquared,
"In Sample MAE" = pls_in_sample$MAE,
"Model"= "PLS")
Bagged Tree
set.seed(42)
bagControl = bagControl(fit = ctreeBag$fit, predict = ctreeBag$pred, aggregate = ctreeBag$aggregate)
bag_model <- train(Yield ~ ., data = train_df, method="bag", bagControl = bagControl,
center = TRUE,
scale = TRUE,
trControl = trainControl("cv", number = 10),
tuneLength = 25)
bag_predictions <- predict(bag_model, test_df)
bag_in_sample <- merge(bag_model$results, bag_model$bestTune)
results <- data.frame(t(postResample(pred = bag_predictions, obs = test_df$Yield))) %>%
mutate("In Sample RMSE" = bag_in_sample$RMSE,
"In Sample Rsquared" = bag_in_sample$Rsquared,
"In Sample MAE" = bag_in_sample$MAE,
"Model"= "Bagged Tree") %>%
rbind(results)
bag_model
## Bagged Model
##
## 144 samples
## 56 predictor
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 129, 129, 130, 129, 130, 130, ...
## Resampling results:
##
## RMSE Rsquared MAE
## 0.6934337 0.5219607 0.5598568
##
## Tuning parameter 'vars' was held constant at a value of 56
Gradient Boosting Machine
set.seed(42)
gbm_model <- train(Yield ~ ., data = train_df, method="gbm", verbose = FALSE,
trControl = trainControl("cv", number = 10),
tuneLength = 25)
gbm_predictions <- predict(gbm_model, test_df)
gbm_in_sample <- merge(gbm_model$results, gbm_model$bestTune)
results <- data.frame(t(postResample(pred = gbm_predictions, obs = test_df$Yield))) %>%
mutate("In Sample RMSE" = gbm_in_sample$RMSE,
"In Sample Rsquared" = gbm_in_sample$Rsquared,
"In Sample MAE" = gbm_in_sample$MAE,
"Model"= "Boosted Tree") %>%
rbind(results)
gbm_model
## Stochastic Gradient Boosting
##
## 144 samples
## 56 predictor
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 129, 129, 130, 129, 130, 130, ...
## Resampling results across tuning parameters:
##
## interaction.depth n.trees RMSE Rsquared MAE
## 1 50 0.6309073 0.6307072 0.4937181
## 1 100 0.6067715 0.6547561 0.4803041
## 1 150 0.6010362 0.6564741 0.4779124
## 1 200 0.5980808 0.6590915 0.4775236
## 1 250 0.5958417 0.6577338 0.4756263
## 1 300 0.5927691 0.6578817 0.4780365
## 1 350 0.5963573 0.6555732 0.4777925
## 1 400 0.5946346 0.6562853 0.4749995
## 1 450 0.5963141 0.6567886 0.4757628
## 1 500 0.5926436 0.6578978 0.4732564
## 1 550 0.5936163 0.6585542 0.4739573
## 1 600 0.5946184 0.6583046 0.4739970
## 1 650 0.5918882 0.6592197 0.4715162
## 1 700 0.5885880 0.6612148 0.4705597
## 1 750 0.5881407 0.6608977 0.4694182
## 1 800 0.5896609 0.6617300 0.4700589
## 1 850 0.5892677 0.6614786 0.4706464
## 1 900 0.5898261 0.6616341 0.4701334
## 1 950 0.5886399 0.6635003 0.4692815
## 1 1000 0.5894494 0.6625427 0.4687962
## 1 1050 0.5875729 0.6645981 0.4652955
## 1 1100 0.5873891 0.6656895 0.4653363
## 1 1150 0.5868669 0.6655589 0.4638092
## 1 1200 0.5872305 0.6661440 0.4644110
## 1 1250 0.5884106 0.6647677 0.4655850
## 2 50 0.6311179 0.6134649 0.4966245
## 2 100 0.6047233 0.6348163 0.4771659
## 2 150 0.6088295 0.6287847 0.4808143
## 2 200 0.6023855 0.6353514 0.4753952
## 2 250 0.5974785 0.6433702 0.4731082
## 2 300 0.5977877 0.6436925 0.4720235
## 2 350 0.5992533 0.6431515 0.4724790
## 2 400 0.5975111 0.6442736 0.4715936
## 2 450 0.5956891 0.6446095 0.4693770
## 2 500 0.5954435 0.6452745 0.4691212
## 2 550 0.5953380 0.6449826 0.4683246
## 2 600 0.5952153 0.6443400 0.4690054
## 2 650 0.5963402 0.6436744 0.4694721
## 2 700 0.5970919 0.6428312 0.4702579
## 2 750 0.5967674 0.6430600 0.4696340
## 2 800 0.5972682 0.6426041 0.4697263
## 2 850 0.5973559 0.6423307 0.4692039
## 2 900 0.5968805 0.6425709 0.4687781
## 2 950 0.5970037 0.6421885 0.4688498
## 2 1000 0.5972009 0.6420018 0.4690319
## 2 1050 0.5974290 0.6418480 0.4691441
## 2 1100 0.5972286 0.6419469 0.4687419
## 2 1150 0.5972299 0.6418811 0.4687288
## 2 1200 0.5973797 0.6418380 0.4688392
## 2 1250 0.5973377 0.6418838 0.4687683
## 3 50 0.6242751 0.6263997 0.4994594
## 3 100 0.5979024 0.6602940 0.4734082
## 3 150 0.5974208 0.6607187 0.4728631
## 3 200 0.5935466 0.6631332 0.4704156
## 3 250 0.5912659 0.6657619 0.4680239
## 3 300 0.5902524 0.6653438 0.4664539
## 3 350 0.5902175 0.6639226 0.4682868
## 3 400 0.5894217 0.6653152 0.4695143
## 3 450 0.5877731 0.6661266 0.4680076
## 3 500 0.5873842 0.6662242 0.4677935
## 3 550 0.5864037 0.6667776 0.4670614
## 3 600 0.5862757 0.6666107 0.4671412
## 3 650 0.5868866 0.6659124 0.4677483
## 3 700 0.5867189 0.6657532 0.4676510
## 3 750 0.5866726 0.6656772 0.4675994
## 3 800 0.5868771 0.6654900 0.4677143
## 3 850 0.5866930 0.6656422 0.4676334
## 3 900 0.5867760 0.6655104 0.4677075
## 3 950 0.5866457 0.6656169 0.4676049
## 3 1000 0.5866126 0.6656306 0.4675329
## 3 1050 0.5866318 0.6656287 0.4674474
## 3 1100 0.5866551 0.6655299 0.4674906
## 3 1150 0.5866068 0.6656294 0.4674132
## 3 1200 0.5865965 0.6656305 0.4674021
## 3 1250 0.5866192 0.6655997 0.4674201
## 4 50 0.5983073 0.6621151 0.4728144
## 4 100 0.5900492 0.6594888 0.4691085
## 4 150 0.5838583 0.6657635 0.4651294
## 4 200 0.5856860 0.6639413 0.4636197
## 4 250 0.5847828 0.6669159 0.4666416
## 4 300 0.5842354 0.6661883 0.4658842
## 4 350 0.5850207 0.6656138 0.4659710
## 4 400 0.5836022 0.6675400 0.4638836
## 4 450 0.5841968 0.6672274 0.4647597
## 4 500 0.5842065 0.6672440 0.4649113
## 4 550 0.5844792 0.6669517 0.4652066
## 4 600 0.5843876 0.6670649 0.4652700
## 4 650 0.5844515 0.6669909 0.4653745
## 4 700 0.5846619 0.6668608 0.4657048
## 4 750 0.5845906 0.6670458 0.4656246
## 4 800 0.5847751 0.6668218 0.4657424
## 4 850 0.5847228 0.6669435 0.4656221
## 4 900 0.5847645 0.6669118 0.4657733
## 4 950 0.5847752 0.6669395 0.4658091
## 4 1000 0.5847307 0.6670358 0.4658006
## 4 1050 0.5847444 0.6670450 0.4658176
## 4 1100 0.5847715 0.6670425 0.4658570
## 4 1150 0.5848018 0.6670339 0.4659200
## 4 1200 0.5848436 0.6669873 0.4659820
## 4 1250 0.5848639 0.6669724 0.4660031
## 5 50 0.6174240 0.6193119 0.4874456
## 5 100 0.6160631 0.6202667 0.4821789
## 5 150 0.6119950 0.6228217 0.4750994
## 5 200 0.6093205 0.6237943 0.4728638
## 5 250 0.6100703 0.6221117 0.4757623
## 5 300 0.6094285 0.6220897 0.4764123
## 5 350 0.6101704 0.6215377 0.4772197
## 5 400 0.6096742 0.6222826 0.4764519
## 5 450 0.6099425 0.6223596 0.4773556
## 5 500 0.6090567 0.6232013 0.4771806
## 5 550 0.6090669 0.6233317 0.4777190
## 5 600 0.6085424 0.6239158 0.4779873
## 5 650 0.6085997 0.6241357 0.4781254
## 5 700 0.6084989 0.6242398 0.4782532
## 5 750 0.6085994 0.6242566 0.4785080
## 5 800 0.6086607 0.6243149 0.4787011
## 5 850 0.6087003 0.6243646 0.4788080
## 5 900 0.6087675 0.6243437 0.4789216
## 5 950 0.6087360 0.6244102 0.4789415
## 5 1000 0.6087218 0.6244729 0.4790081
## 5 1050 0.6087400 0.6245642 0.4790156
## 5 1100 0.6086462 0.6246657 0.4790038
## 5 1150 0.6086138 0.6247457 0.4790202
## 5 1200 0.6086170 0.6247716 0.4790445
## 5 1250 0.6086292 0.6247799 0.4790764
## 6 50 0.5987489 0.6346355 0.4795116
## 6 100 0.5998853 0.6346885 0.4760614
## 6 150 0.5957082 0.6392401 0.4740528
## 6 200 0.5933964 0.6397054 0.4701734
## 6 250 0.5943587 0.6373843 0.4717210
## 6 300 0.5941584 0.6373905 0.4728093
## 6 350 0.5938423 0.6381372 0.4719655
## 6 400 0.5933050 0.6387263 0.4716850
## 6 450 0.5935012 0.6387969 0.4715688
## 6 500 0.5932407 0.6395139 0.4714417
## 6 550 0.5931797 0.6395850 0.4713832
## 6 600 0.5930596 0.6398224 0.4711257
## 6 650 0.5929065 0.6398772 0.4709837
## 6 700 0.5930927 0.6397487 0.4711092
## 6 750 0.5931008 0.6397533 0.4711432
## 6 800 0.5931784 0.6396816 0.4710612
## 6 850 0.5932643 0.6395733 0.4711257
## 6 900 0.5933767 0.6394541 0.4711961
## 6 950 0.5934302 0.6394659 0.4712244
## 6 1000 0.5934794 0.6393795 0.4711839
## 6 1050 0.5935689 0.6392978 0.4711826
## 6 1100 0.5934403 0.6394625 0.4710436
## 6 1150 0.5934048 0.6394969 0.4710327
## 6 1200 0.5933951 0.6394931 0.4710004
## 6 1250 0.5934018 0.6394815 0.4710085
## 7 50 0.6128720 0.6315496 0.4832009
## 7 100 0.5893562 0.6545720 0.4611145
## 7 150 0.5863401 0.6569251 0.4630634
## 7 200 0.5883147 0.6548335 0.4679783
## 7 250 0.5870171 0.6552643 0.4685550
## 7 300 0.5866931 0.6541779 0.4695581
## 7 350 0.5865482 0.6531843 0.4691222
## 7 400 0.5870886 0.6528764 0.4698032
## 7 450 0.5866555 0.6531760 0.4699850
## 7 500 0.5864248 0.6532976 0.4699667
## 7 550 0.5862678 0.6534482 0.4699204
## 7 600 0.5863946 0.6533652 0.4700062
## 7 650 0.5864247 0.6532801 0.4700786
## 7 700 0.5862381 0.6533620 0.4697206
## 7 750 0.5860701 0.6535920 0.4695611
## 7 800 0.5862496 0.6534195 0.4696628
## 7 850 0.5862489 0.6533937 0.4696054
## 7 900 0.5862667 0.6534130 0.4695573
## 7 950 0.5862693 0.6533916 0.4694894
## 7 1000 0.5862632 0.6534206 0.4694705
## 7 1050 0.5862083 0.6534182 0.4693263
## 7 1100 0.5862541 0.6533755 0.4693561
## 7 1150 0.5863049 0.6533166 0.4693413
## 7 1200 0.5863285 0.6533060 0.4693287
## 7 1250 0.5863305 0.6532910 0.4693159
## 8 50 0.6037675 0.6462019 0.4795321
## 8 100 0.5891200 0.6640340 0.4728118
## 8 150 0.5834128 0.6675718 0.4707811
## 8 200 0.5817024 0.6705246 0.4684816
## 8 250 0.5840053 0.6684555 0.4709404
## 8 300 0.5842854 0.6684347 0.4713324
## 8 350 0.5829032 0.6707079 0.4705647
## 8 400 0.5826471 0.6704938 0.4711233
## 8 450 0.5818881 0.6710821 0.4711234
## 8 500 0.5819493 0.6711690 0.4716400
## 8 550 0.5822448 0.6709230 0.4721244
## 8 600 0.5821313 0.6709039 0.4723258
## 8 650 0.5817722 0.6711923 0.4723589
## 8 700 0.5818675 0.6710934 0.4726294
## 8 750 0.5819050 0.6710840 0.4727501
## 8 800 0.5820246 0.6710397 0.4730449
## 8 850 0.5822426 0.6708004 0.4734169
## 8 900 0.5821069 0.6709608 0.4733736
## 8 950 0.5821962 0.6709102 0.4734926
## 8 1000 0.5822908 0.6708858 0.4736839
## 8 1050 0.5822700 0.6709013 0.4736985
## 8 1100 0.5822851 0.6708948 0.4737608
## 8 1150 0.5823331 0.6708371 0.4738205
## 8 1200 0.5822841 0.6708891 0.4738277
## 8 1250 0.5822810 0.6709028 0.4738160
## 9 50 0.6252593 0.6030827 0.5039324
## 9 100 0.6089446 0.6306181 0.4934928
## 9 150 0.6002561 0.6372654 0.4862365
## 9 200 0.6012432 0.6413822 0.4897792
## 9 250 0.5992531 0.6437936 0.4888823
## 9 300 0.5996180 0.6432824 0.4887577
## 9 350 0.5989823 0.6445553 0.4878142
## 9 400 0.5989254 0.6448842 0.4872468
## 9 450 0.5993808 0.6446434 0.4875285
## 9 500 0.5993307 0.6448844 0.4869649
## 9 550 0.5997003 0.6447122 0.4872817
## 9 600 0.5997750 0.6448328 0.4870235
## 9 650 0.5996073 0.6451682 0.4865550
## 9 700 0.5998120 0.6450502 0.4866173
## 9 750 0.5998953 0.6450392 0.4865719
## 9 800 0.6001510 0.6447875 0.4867435
## 9 850 0.6001571 0.6448530 0.4868342
## 9 900 0.6003145 0.6447612 0.4870066
## 9 950 0.6002985 0.6448337 0.4870275
## 9 1000 0.6002565 0.6449306 0.4869360
## 9 1050 0.6002589 0.6449754 0.4869078
## 9 1100 0.6002861 0.6449619 0.4869091
## 9 1150 0.6002658 0.6450063 0.4868673
## 9 1200 0.6003251 0.6449879 0.4869265
## 9 1250 0.6003876 0.6449204 0.4869681
## 10 50 0.5988445 0.6405503 0.4619744
## 10 100 0.5823821 0.6516596 0.4540404
## 10 150 0.5783267 0.6572517 0.4548987
## 10 200 0.5805434 0.6520122 0.4569885
## 10 250 0.5792742 0.6544262 0.4547467
## 10 300 0.5782078 0.6551322 0.4536666
## 10 350 0.5783594 0.6548950 0.4546321
## 10 400 0.5775021 0.6552175 0.4539790
## 10 450 0.5778112 0.6550330 0.4540743
## 10 500 0.5772075 0.6555900 0.4537745
## 10 550 0.5771933 0.6556253 0.4534061
## 10 600 0.5769960 0.6559652 0.4532567
## 10 650 0.5769943 0.6558407 0.4531623
## 10 700 0.5768416 0.6560351 0.4528411
## 10 750 0.5768631 0.6559595 0.4526172
## 10 800 0.5767745 0.6560948 0.4524098
## 10 850 0.5768271 0.6560190 0.4523467
## 10 900 0.5770058 0.6557852 0.4523942
## 10 950 0.5769468 0.6558311 0.4522465
## 10 1000 0.5769077 0.6558489 0.4521204
## 10 1050 0.5769275 0.6558564 0.4521123
## 10 1100 0.5769205 0.6558432 0.4520678
## 10 1150 0.5769534 0.6557626 0.4520467
## 10 1200 0.5769722 0.6557517 0.4520160
## 10 1250 0.5769365 0.6557737 0.4519936
## 11 50 0.6027864 0.6446978 0.4750317
## 11 100 0.5835711 0.6607075 0.4639215
## 11 150 0.5808365 0.6608939 0.4626384
## 11 200 0.5777044 0.6615090 0.4602860
## 11 250 0.5809321 0.6581100 0.4636323
## 11 300 0.5807280 0.6576942 0.4638993
## 11 350 0.5806467 0.6575338 0.4646094
## 11 400 0.5812190 0.6571682 0.4649779
## 11 450 0.5814798 0.6566465 0.4649264
## 11 500 0.5815439 0.6569013 0.4651821
## 11 550 0.5815904 0.6569066 0.4652411
## 11 600 0.5817642 0.6567184 0.4652589
## 11 650 0.5818331 0.6567860 0.4653843
## 11 700 0.5819328 0.6567417 0.4655423
## 11 750 0.5820973 0.6566329 0.4657146
## 11 800 0.5822409 0.6565226 0.4658882
## 11 850 0.5823114 0.6564871 0.4659312
## 11 900 0.5823215 0.6565285 0.4659934
## 11 950 0.5822578 0.6565518 0.4659365
## 11 1000 0.5822976 0.6565505 0.4659869
## 11 1050 0.5823645 0.6564762 0.4660221
## 11 1100 0.5823970 0.6564611 0.4660766
## 11 1150 0.5824180 0.6564089 0.4661261
## 11 1200 0.5823577 0.6564525 0.4660589
## 11 1250 0.5823865 0.6564543 0.4660691
## 12 50 0.6378931 0.6111094 0.5008019
## 12 100 0.6259994 0.6162948 0.4914628
## 12 150 0.6294908 0.6145839 0.4971782
## 12 200 0.6268547 0.6209571 0.4947585
## 12 250 0.6286841 0.6195957 0.4954037
## 12 300 0.6268532 0.6203933 0.4934231
## 12 350 0.6270169 0.6208249 0.4940195
## 12 400 0.6270111 0.6215649 0.4938405
## 12 450 0.6270391 0.6216721 0.4942121
## 12 500 0.6270443 0.6216231 0.4947761
## 12 550 0.6268060 0.6218034 0.4942386
## 12 600 0.6263710 0.6221787 0.4939327
## 12 650 0.6262439 0.6223473 0.4938332
## 12 700 0.6260070 0.6226847 0.4936706
## 12 750 0.6259992 0.6228451 0.4936006
## 12 800 0.6260106 0.6229395 0.4936554
## 12 850 0.6261209 0.6227800 0.4936778
## 12 900 0.6261289 0.6228364 0.4937199
## 12 950 0.6260504 0.6228696 0.4936209
## 12 1000 0.6260233 0.6230004 0.4935868
## 12 1050 0.6260933 0.6229501 0.4937400
## 12 1100 0.6261136 0.6229581 0.4937684
## 12 1150 0.6260161 0.6230925 0.4936640
## 12 1200 0.6260897 0.6230419 0.4937236
## 12 1250 0.6260444 0.6230851 0.4936824
## 13 50 0.6064571 0.6569744 0.4835030
## 13 100 0.5858364 0.6704804 0.4672533
## 13 150 0.5828565 0.6688501 0.4668959
## 13 200 0.5794808 0.6705839 0.4638754
## 13 250 0.5756510 0.6723391 0.4608137
## 13 300 0.5744252 0.6718572 0.4600654
## 13 350 0.5747879 0.6723226 0.4586323
## 13 400 0.5740719 0.6727287 0.4579685
## 13 450 0.5748140 0.6716597 0.4581323
## 13 500 0.5748565 0.6714121 0.4578706
## 13 550 0.5745036 0.6718521 0.4571833
## 13 600 0.5743063 0.6720354 0.4568778
## 13 650 0.5744086 0.6719178 0.4567515
## 13 700 0.5745091 0.6718571 0.4567673
## 13 750 0.5745709 0.6718792 0.4565678
## 13 800 0.5746925 0.6718283 0.4564953
## 13 850 0.5745885 0.6719102 0.4563257
## 13 900 0.5746896 0.6718056 0.4563677
## 13 950 0.5746744 0.6718669 0.4562556
## 13 1000 0.5746138 0.6719120 0.4561670
## 13 1050 0.5746373 0.6718875 0.4562084
## 13 1100 0.5746186 0.6719155 0.4562015
## 13 1150 0.5746780 0.6718735 0.4562559
## 13 1200 0.5746388 0.6719026 0.4562067
## 13 1250 0.5746193 0.6719296 0.4562296
## 14 50 0.6142569 0.6391354 0.4903833
## 14 100 0.6048979 0.6438676 0.4851166
## 14 150 0.6041952 0.6444500 0.4801017
## 14 200 0.6005928 0.6470243 0.4742183
## 14 250 0.5979210 0.6481960 0.4741671
## 14 300 0.5990453 0.6469541 0.4761016
## 14 350 0.5985472 0.6484928 0.4760309
## 14 400 0.5986570 0.6480185 0.4760216
## 14 450 0.5982093 0.6484835 0.4753816
## 14 500 0.5982547 0.6485065 0.4749761
## 14 550 0.5981987 0.6485727 0.4749301
## 14 600 0.5980895 0.6488278 0.4747420
## 14 650 0.5978660 0.6489295 0.4745874
## 14 700 0.5978274 0.6491006 0.4746273
## 14 750 0.5980906 0.6488025 0.4748868
## 14 800 0.5981321 0.6487490 0.4749663
## 14 850 0.5981402 0.6487223 0.4750145
## 14 900 0.5980969 0.6487608 0.4750826
## 14 950 0.5980835 0.6487664 0.4751420
## 14 1000 0.5981781 0.6486946 0.4752532
## 14 1050 0.5981618 0.6487649 0.4752696
## 14 1100 0.5981521 0.6487705 0.4753129
## 14 1150 0.5981999 0.6487515 0.4753864
## 14 1200 0.5981955 0.6487499 0.4754039
## 14 1250 0.5982233 0.6487099 0.4754379
## 15 50 0.5909216 0.6617263 0.4685242
## 15 100 0.5843349 0.6634189 0.4600576
## 15 150 0.5843538 0.6609248 0.4570682
## 15 200 0.5780880 0.6676294 0.4529088
## 15 250 0.5774006 0.6682967 0.4515725
## 15 300 0.5782841 0.6682201 0.4519571
## 15 350 0.5785731 0.6677397 0.4524536
## 15 400 0.5773291 0.6688004 0.4517500
## 15 450 0.5769183 0.6689401 0.4519365
## 15 500 0.5769291 0.6690826 0.4522984
## 15 550 0.5765246 0.6693141 0.4522681
## 15 600 0.5765866 0.6692571 0.4523768
## 15 650 0.5763590 0.6694678 0.4523261
## 15 700 0.5762960 0.6696306 0.4525726
## 15 750 0.5761478 0.6697895 0.4524889
## 15 800 0.5762327 0.6697050 0.4526729
## 15 850 0.5762402 0.6696805 0.4528047
## 15 900 0.5763398 0.6695786 0.4529169
## 15 950 0.5762828 0.6696262 0.4529369
## 15 1000 0.5762596 0.6696379 0.4529360
## 15 1050 0.5762588 0.6696612 0.4529509
## 15 1100 0.5762609 0.6696232 0.4530212
## 15 1150 0.5762732 0.6696235 0.4530692
## 15 1200 0.5762262 0.6696792 0.4530276
## 15 1250 0.5762579 0.6696297 0.4530477
## 16 50 0.6012866 0.6463433 0.4727289
## 16 100 0.5850188 0.6538847 0.4619024
## 16 150 0.5832660 0.6567236 0.4626143
## 16 200 0.5812212 0.6601239 0.4636802
## 16 250 0.5807522 0.6590648 0.4641960
## 16 300 0.5808366 0.6596937 0.4654527
## 16 350 0.5802779 0.6606164 0.4656112
## 16 400 0.5810332 0.6594848 0.4666254
## 16 450 0.5815226 0.6590281 0.4672300
## 16 500 0.5822807 0.6582218 0.4675979
## 16 550 0.5824551 0.6579962 0.4679001
## 16 600 0.5825039 0.6581014 0.4681271
## 16 650 0.5824942 0.6581680 0.4682601
## 16 700 0.5825720 0.6580559 0.4683004
## 16 750 0.5826544 0.6579954 0.4682960
## 16 800 0.5827358 0.6578055 0.4683242
## 16 850 0.5828283 0.6577738 0.4682649
## 16 900 0.5827430 0.6579009 0.4682693
## 16 950 0.5828399 0.6577851 0.4683136
## 16 1000 0.5828613 0.6577769 0.4683137
## 16 1050 0.5828707 0.6578081 0.4682935
## 16 1100 0.5828785 0.6577793 0.4682896
## 16 1150 0.5829466 0.6577285 0.4683716
## 16 1200 0.5829581 0.6577134 0.4683682
## 16 1250 0.5829548 0.6577264 0.4683415
## 17 50 0.6159387 0.6397471 0.4917127
## 17 100 0.6117889 0.6467647 0.4860853
## 17 150 0.6150783 0.6443266 0.4893354
## 17 200 0.6199205 0.6401290 0.4946792
## 17 250 0.6167891 0.6436867 0.4915429
## 17 300 0.6172380 0.6422244 0.4919541
## 17 350 0.6181056 0.6406419 0.4909742
## 17 400 0.6182382 0.6408717 0.4904182
## 17 450 0.6181398 0.6415268 0.4901848
## 17 500 0.6187442 0.6410465 0.4904008
## 17 550 0.6187649 0.6409718 0.4902204
## 17 600 0.6186392 0.6413144 0.4901694
## 17 650 0.6187964 0.6411375 0.4901360
## 17 700 0.6189743 0.6408553 0.4901396
## 17 750 0.6189839 0.6407826 0.4900448
## 17 800 0.6189047 0.6408768 0.4897844
## 17 850 0.6189483 0.6408309 0.4897483
## 17 900 0.6189536 0.6408718 0.4896991
## 17 950 0.6190206 0.6408076 0.4898203
## 17 1000 0.6191999 0.6406357 0.4899872
## 17 1050 0.6191412 0.6406861 0.4899929
## 17 1100 0.6192383 0.6406103 0.4900389
## 17 1150 0.6192411 0.6405962 0.4900316
## 17 1200 0.6192974 0.6405650 0.4901263
## 17 1250 0.6192806 0.6405896 0.4901146
## 18 50 0.5969583 0.6435204 0.4677064
## 18 100 0.5713324 0.6682158 0.4531170
## 18 150 0.5704196 0.6665138 0.4570070
## 18 200 0.5745215 0.6623484 0.4562116
## 18 250 0.5749738 0.6610978 0.4588408
## 18 300 0.5751097 0.6615297 0.4590077
## 18 350 0.5753508 0.6613360 0.4593212
## 18 400 0.5762133 0.6603564 0.4603896
## 18 450 0.5764337 0.6603244 0.4606973
## 18 500 0.5765126 0.6603413 0.4609872
## 18 550 0.5761226 0.6607660 0.4607411
## 18 600 0.5762433 0.6606756 0.4608797
## 18 650 0.5761211 0.6608211 0.4607078
## 18 700 0.5762094 0.6607197 0.4607121
## 18 750 0.5759836 0.6610039 0.4604790
## 18 800 0.5759493 0.6610352 0.4603589
## 18 850 0.5760203 0.6610606 0.4602753
## 18 900 0.5758789 0.6612244 0.4601330
## 18 950 0.5758421 0.6612141 0.4600261
## 18 1000 0.5758488 0.6612814 0.4599787
## 18 1050 0.5758377 0.6612890 0.4599797
## 18 1100 0.5757797 0.6613521 0.4598869
## 18 1150 0.5757943 0.6613379 0.4598429
## 18 1200 0.5757798 0.6613617 0.4598723
## 18 1250 0.5757516 0.6614103 0.4598409
## 19 50 0.6207191 0.6220703 0.4844043
## 19 100 0.6023416 0.6299308 0.4738582
## 19 150 0.5993115 0.6340607 0.4741469
## 19 200 0.5985159 0.6339221 0.4708231
## 19 250 0.5975867 0.6342962 0.4699974
## 19 300 0.5976104 0.6336626 0.4696362
## 19 350 0.5975593 0.6337437 0.4694065
## 19 400 0.5977512 0.6333469 0.4690268
## 19 450 0.5969199 0.6341746 0.4686532
## 19 500 0.5968322 0.6346430 0.4686513
## 19 550 0.5968517 0.6346127 0.4685978
## 19 600 0.5968861 0.6344717 0.4688525
## 19 650 0.5968331 0.6345102 0.4688336
## 19 700 0.5969386 0.6345080 0.4687942
## 19 750 0.5969809 0.6344950 0.4688438
## 19 800 0.5969598 0.6345720 0.4687715
## 19 850 0.5968303 0.6347113 0.4686089
## 19 900 0.5968983 0.6346229 0.4686368
## 19 950 0.5969852 0.6345543 0.4687395
## 19 1000 0.5969620 0.6345766 0.4686658
## 19 1050 0.5969978 0.6345046 0.4687356
## 19 1100 0.5970268 0.6344719 0.4687562
## 19 1150 0.5971082 0.6343885 0.4688390
## 19 1200 0.5970651 0.6344436 0.4688086
## 19 1250 0.5970620 0.6344397 0.4688035
## 20 50 0.6011708 0.6410582 0.4721729
## 20 100 0.5952816 0.6392767 0.4649464
## 20 150 0.5918079 0.6423383 0.4613861
## 20 200 0.5916356 0.6421098 0.4626314
## 20 250 0.5888480 0.6423461 0.4605700
## 20 300 0.5875946 0.6429851 0.4593752
## 20 350 0.5878004 0.6425606 0.4595304
## 20 400 0.5874942 0.6428205 0.4590028
## 20 450 0.5878427 0.6424751 0.4589464
## 20 500 0.5880068 0.6422327 0.4589695
## 20 550 0.5877574 0.6423750 0.4588014
## 20 600 0.5879702 0.6421784 0.4589761
## 20 650 0.5879404 0.6421797 0.4588063
## 20 700 0.5877857 0.6423769 0.4586590
## 20 750 0.5878282 0.6422789 0.4588036
## 20 800 0.5877307 0.6423928 0.4586910
## 20 850 0.5877283 0.6423497 0.4587245
## 20 900 0.5876937 0.6423887 0.4587516
## 20 950 0.5877054 0.6423896 0.4587351
## 20 1000 0.5876954 0.6423913 0.4587010
## 20 1050 0.5877652 0.6422974 0.4587706
## 20 1100 0.5877552 0.6423211 0.4587455
## 20 1150 0.5878112 0.6422513 0.4588111
## 20 1200 0.5878029 0.6422375 0.4588283
## 20 1250 0.5877950 0.6422693 0.4587948
## 21 50 0.6079846 0.6431370 0.4861477
## 21 100 0.5947847 0.6516804 0.4860843
## 21 150 0.5872872 0.6612481 0.4785250
## 21 200 0.5904859 0.6560361 0.4845608
## 21 250 0.5896839 0.6567538 0.4838110
## 21 300 0.5890443 0.6568146 0.4817993
## 21 350 0.5894751 0.6565873 0.4814292
## 21 400 0.5889776 0.6576469 0.4810387
## 21 450 0.5888537 0.6578214 0.4806241
## 21 500 0.5890142 0.6577038 0.4805494
## 21 550 0.5887276 0.6581646 0.4802687
## 21 600 0.5890968 0.6577888 0.4807414
## 21 650 0.5889724 0.6580515 0.4807106
## 21 700 0.5891463 0.6579090 0.4809034
## 21 750 0.5892813 0.6577782 0.4811921
## 21 800 0.5894136 0.6576744 0.4814320
## 21 850 0.5893876 0.6577445 0.4814832
## 21 900 0.5894753 0.6577074 0.4816114
## 21 950 0.5894396 0.6577712 0.4816591
## 21 1000 0.5895221 0.6577294 0.4817435
## 21 1050 0.5895201 0.6577644 0.4817566
## 21 1100 0.5895461 0.6577870 0.4818076
## 21 1150 0.5896611 0.6576725 0.4819292
## 21 1200 0.5896186 0.6577557 0.4818894
## 21 1250 0.5896213 0.6577705 0.4819376
## 22 50 0.6007417 0.6439502 0.4785078
## 22 100 0.5858285 0.6571064 0.4622704
## 22 150 0.5836370 0.6593913 0.4643078
## 22 200 0.5843434 0.6597882 0.4660571
## 22 250 0.5859495 0.6570273 0.4673451
## 22 300 0.5859228 0.6571433 0.4682115
## 22 350 0.5865744 0.6562729 0.4688460
## 22 400 0.5872954 0.6553624 0.4698156
## 22 450 0.5877714 0.6550678 0.4701132
## 22 500 0.5877099 0.6553359 0.4695276
## 22 550 0.5878931 0.6551595 0.4696228
## 22 600 0.5878543 0.6553651 0.4696045
## 22 650 0.5880955 0.6550390 0.4696651
## 22 700 0.5878916 0.6553499 0.4696694
## 22 750 0.5879437 0.6553842 0.4697214
## 22 800 0.5880899 0.6552120 0.4699198
## 22 850 0.5880870 0.6552442 0.4698992
## 22 900 0.5881371 0.6552621 0.4700604
## 22 950 0.5882209 0.6551964 0.4700949
## 22 1000 0.5882699 0.6551664 0.4702112
## 22 1050 0.5882655 0.6551736 0.4701910
## 22 1100 0.5883363 0.6551213 0.4702794
## 22 1150 0.5883775 0.6551381 0.4703565
## 22 1200 0.5883910 0.6551301 0.4703784
## 22 1250 0.5884067 0.6550851 0.4703995
## 23 50 0.6121801 0.6294483 0.4851937
## 23 100 0.5940484 0.6413371 0.4758075
## 23 150 0.5855388 0.6477329 0.4712309
## 23 200 0.5873943 0.6429038 0.4758498
## 23 250 0.5866406 0.6432376 0.4764504
## 23 300 0.5868202 0.6429057 0.4762047
## 23 350 0.5874420 0.6406166 0.4770274
## 23 400 0.5872972 0.6401297 0.4771355
## 23 450 0.5870976 0.6399403 0.4769181
## 23 500 0.5872632 0.6396043 0.4768552
## 23 550 0.5870881 0.6396687 0.4765962
## 23 600 0.5869016 0.6397154 0.4765120
## 23 650 0.5867844 0.6399122 0.4764161
## 23 700 0.5868403 0.6398528 0.4764757
## 23 750 0.5868600 0.6398883 0.4764695
## 23 800 0.5868335 0.6398269 0.4763298
## 23 850 0.5867442 0.6399010 0.4762396
## 23 900 0.5868150 0.6397898 0.4762643
## 23 950 0.5868147 0.6397928 0.4762151
## 23 1000 0.5868074 0.6397735 0.4761992
## 23 1050 0.5868337 0.6397474 0.4761874
## 23 1100 0.5868204 0.6397385 0.4761499
## 23 1150 0.5867997 0.6397575 0.4761340
## 23 1200 0.5868320 0.6397508 0.4761425
## 23 1250 0.5868488 0.6397347 0.4761362
## 24 50 0.6060294 0.6446920 0.4876318
## 24 100 0.5979500 0.6392111 0.4854646
## 24 150 0.5918530 0.6485809 0.4792034
## 24 200 0.5876457 0.6512109 0.4768701
## 24 250 0.5882299 0.6505471 0.4770217
## 24 300 0.5890413 0.6487328 0.4769165
## 24 350 0.5890646 0.6493622 0.4772681
## 24 400 0.5886826 0.6495412 0.4772876
## 24 450 0.5891230 0.6491641 0.4775144
## 24 500 0.5890555 0.6494915 0.4773101
## 24 550 0.5893315 0.6492690 0.4774540
## 24 600 0.5892429 0.6494440 0.4772928
## 24 650 0.5888863 0.6498401 0.4769813
## 24 700 0.5889077 0.6497731 0.4769288
## 24 750 0.5887776 0.6499632 0.4767815
## 24 800 0.5887939 0.6499220 0.4767461
## 24 850 0.5886876 0.6501276 0.4766366
## 24 900 0.5885783 0.6502710 0.4766200
## 24 950 0.5886423 0.6501863 0.4766492
## 24 1000 0.5886470 0.6502207 0.4766493
## 24 1050 0.5885765 0.6503184 0.4765954
## 24 1100 0.5885983 0.6502916 0.4766267
## 24 1150 0.5886105 0.6502915 0.4766663
## 24 1200 0.5886333 0.6502488 0.4766886
## 24 1250 0.5886331 0.6502613 0.4766698
## 25 50 0.6204856 0.6275884 0.4822181
## 25 100 0.6185601 0.6274999 0.4804351
## 25 150 0.6144744 0.6310520 0.4786403
## 25 200 0.6127136 0.6339151 0.4775013
## 25 250 0.6130874 0.6326767 0.4808386
## 25 300 0.6130085 0.6327472 0.4818438
## 25 350 0.6123501 0.6342301 0.4818099
## 25 400 0.6111742 0.6352579 0.4813876
## 25 450 0.6114638 0.6349289 0.4815989
## 25 500 0.6117496 0.6348516 0.4821730
## 25 550 0.6112154 0.6354850 0.4815842
## 25 600 0.6117672 0.6347751 0.4821931
## 25 650 0.6111170 0.6353514 0.4817011
## 25 700 0.6109953 0.6355070 0.4815397
## 25 750 0.6109652 0.6355276 0.4813389
## 25 800 0.6109155 0.6355763 0.4812067
## 25 850 0.6109711 0.6355292 0.4812864
## 25 900 0.6111264 0.6353994 0.4814162
## 25 950 0.6109776 0.6355778 0.4812537
## 25 1000 0.6108348 0.6356975 0.4810446
## 25 1050 0.6108560 0.6357154 0.4809885
## 25 1100 0.6108913 0.6356597 0.4809846
## 25 1150 0.6108811 0.6356628 0.4809495
## 25 1200 0.6109206 0.6356205 0.4809650
## 25 1250 0.6109343 0.6356071 0.4809245
##
## Tuning parameter 'shrinkage' was held constant at a value of 0.1
##
## Tuning parameter 'n.minobsinnode' was held constant at a value of 10
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were n.trees = 150, interaction.depth =
## 18, shrinkage = 0.1 and n.minobsinnode = 10.
ggplot(gbm_model, highlight = TRUE) +
labs(title = paste0("Tuning profile: ", gbm_model$modelInfo$label))
Random Forest
set.seed(42)
rf_model <- train(Yield ~ ., data = train_df, method = "ranger",
scale = TRUE,
trControl = trainControl("cv", number = 10),
tuneLength = 25)
rf_predictions <- predict(rf_model, test_df)
rf_in_sample <- merge(rf_model$results, rf_model$bestTune)
results <- data.frame(t(postResample(pred = rf_predictions, obs = test_df$Yield))) %>%
mutate("In Sample RMSE" = rf_in_sample$RMSE,
"In Sample Rsquared" = rf_in_sample$Rsquared,
"In Sample MAE" = rf_in_sample$MAE,
"Model"= "Random Forest") %>%
rbind(results)
rf_model
## Random Forest
##
## 144 samples
## 56 predictor
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 129, 129, 130, 129, 130, 130, ...
## Resampling results across tuning parameters:
##
## mtry splitrule RMSE Rsquared MAE
## 2 variance 0.6500132 0.6405980 0.5270888
## 2 extratrees 0.7081189 0.6063816 0.5839878
## 4 variance 0.6110728 0.6879689 0.4939586
## 4 extratrees 0.6734844 0.6211211 0.5514717
## 6 variance 0.5999764 0.6812662 0.4818284
## 6 extratrees 0.6544688 0.6403787 0.5375388
## 8 variance 0.5955772 0.6880625 0.4738274
## 8 extratrees 0.6376049 0.6644931 0.5191036
## 11 variance 0.5842961 0.6966973 0.4619424
## 11 extratrees 0.6233472 0.6735892 0.5068586
## 13 variance 0.5889070 0.6852941 0.4631030
## 13 extratrees 0.6188364 0.6788552 0.5015429
## 15 variance 0.5820176 0.6922276 0.4547773
## 15 extratrees 0.6129074 0.6817932 0.4986697
## 17 variance 0.5805109 0.6943491 0.4572579
## 17 extratrees 0.6117319 0.6800644 0.4953316
## 20 variance 0.5826146 0.6866492 0.4542811
## 20 extratrees 0.6137834 0.6778865 0.4956006
## 22 variance 0.5823764 0.6889828 0.4537202
## 22 extratrees 0.6046238 0.6880740 0.4880356
## 24 variance 0.5810892 0.6921636 0.4544925
## 24 extratrees 0.6089319 0.6854059 0.4919202
## 26 variance 0.5844387 0.6863990 0.4555478
## 26 extratrees 0.6033226 0.6900209 0.4853285
## 29 variance 0.5863207 0.6841378 0.4594347
## 29 extratrees 0.6052358 0.6838869 0.4851054
## 31 variance 0.5849939 0.6822214 0.4553039
## 31 extratrees 0.6014043 0.6906210 0.4838076
## 33 variance 0.5856966 0.6790755 0.4555637
## 33 extratrees 0.6020048 0.6847682 0.4808528
## 35 variance 0.5884058 0.6780521 0.4561543
## 35 extratrees 0.5979276 0.6919178 0.4788435
## 38 variance 0.5895917 0.6764056 0.4591812
## 38 extratrees 0.5991942 0.6883245 0.4790427
## 40 variance 0.5849066 0.6822193 0.4553453
## 40 extratrees 0.6032657 0.6852441 0.4820079
## 42 variance 0.5883705 0.6769147 0.4566871
## 42 extratrees 0.6015277 0.6852934 0.4804959
## 44 variance 0.5884035 0.6761349 0.4574677
## 44 extratrees 0.5941553 0.6935554 0.4755886
## 47 variance 0.5908001 0.6749328 0.4613816
## 47 extratrees 0.5975337 0.6858579 0.4751913
## 49 variance 0.5955189 0.6665826 0.4644387
## 49 extratrees 0.5922861 0.6930786 0.4719625
## 51 variance 0.5902225 0.6737350 0.4615126
## 51 extratrees 0.5980513 0.6824137 0.4748528
## 53 variance 0.5986187 0.6634534 0.4657746
## 53 extratrees 0.5943250 0.6912784 0.4736702
## 56 variance 0.5979837 0.6651726 0.4689527
## 56 extratrees 0.5954310 0.6860949 0.4740501
##
## Tuning parameter 'min.node.size' was held constant at a value of 5
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were mtry = 17, splitrule = variance
## and min.node.size = 5.
Conditional Inference Random Forest
set.seed(42)
crf_model <- train(Yield ~ ., data = train_df, method = "cforest",
trControl = trainControl("cv", number = 10),
tuneLength = 25)
crf_predictions <- predict(crf_model, test_df)
crf_in_sample <- merge(crf_model$results, crf_model$bestTune)
results <- data.frame(t(postResample(pred = crf_predictions, obs = test_df$Yield))) %>%
mutate("In Sample RMSE" = crf_in_sample$RMSE,
"In Sample Rsquared" = crf_in_sample$Rsquared,
"In Sample MAE" = crf_in_sample$MAE,
"Model"= "Conditional Random Forest") %>%
rbind(results)
crf_model
## Conditional Inference Random Forest
##
## 144 samples
## 56 predictor
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 129, 129, 130, 129, 130, 130, ...
## Resampling results across tuning parameters:
##
## mtry RMSE Rsquared MAE
## 2 0.7764876 0.5335619 0.6386618
## 4 0.6935829 0.5889540 0.5671497
## 6 0.6644925 0.6064106 0.5353275
## 8 0.6475647 0.6301389 0.5187062
## 11 0.6407569 0.6303319 0.5091173
## 13 0.6385331 0.6289130 0.5051144
## 15 0.6339825 0.6364721 0.5015974
## 17 0.6330259 0.6352765 0.5017147
## 20 0.6349996 0.6290909 0.5023303
## 22 0.6378956 0.6233378 0.5040365
## 24 0.6319682 0.6306735 0.4996607
## 26 0.6327051 0.6272160 0.5008591
## 29 0.6309457 0.6317958 0.4992596
## 31 0.6305353 0.6277954 0.4992323
## 33 0.6304848 0.6266151 0.4981848
## 35 0.6364690 0.6171781 0.5011835
## 38 0.6350177 0.6198361 0.5016223
## 40 0.6329029 0.6231098 0.5002293
## 42 0.6373717 0.6163779 0.5038154
## 44 0.6378349 0.6158300 0.5045886
## 47 0.6400866 0.6125926 0.5070005
## 49 0.6368146 0.6150981 0.5039618
## 51 0.6375358 0.6163547 0.5064698
## 53 0.6433914 0.6052982 0.5091913
## 56 0.6407299 0.6108010 0.5057624
##
## RMSE was used to select the optimal model using the smallest value.
## The final value used for the model was mtry = 33.
ggplot(crf_model, highlight = TRUE) +
labs(title = paste0("Tuning profile: ", crf_model$modelInfo$label))
Corey’s Transparent Random Forest Function
This is my own random forest function. It can display any of the underlying trees; here it is set to display only every 100th tree.
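The function itself is not echoed here; a minimal sketch of the idea, bagging rpart trees and plotting every 100th one, might look like the following. Names such as transparent_rf and prediction_overall are illustrative assumptions, not the actual implementation.
library(rpart)
library(rattle)  # fancyRpartPlot()
transparent_rf <- function(formula, data, ntree = 500, plot_every = 100) {
  trees <- vector("list", ntree)
  for (i in seq_len(ntree)) {
    boot <- data[sample(nrow(data), replace = TRUE), ]  # bootstrap sample
    trees[[i]] <- rpart(formula, data = boot)
    if (i %% plot_every == 0) fancyRpartPlot(trees[[i]], main = paste("Tree", i))
  }
  structure(list(trees = trees), class = "transparent_rf")
}
predict.transparent_rf <- function(object, newdata, ...) {
  preds <- sapply(object$trees, predict, newdata = newdata)
  rowMeans(preds)  # average the individual tree predictions
}
# e.g. tf <- transparent_rf(Yield ~ ., data = train_df)
#      test_df$prediction_overall <- predict(tf, test_df)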
cor_RF_predictions <- test_df2$prediction_overall
results <- data.frame(t(postResample(pred = cor_RF_predictions, obs = test_df$Yield))) %>%
mutate("In Sample RMSE" = "N/A",
"In Sample Rsquared" = "N/A",
"In Sample MAE" = "N/A",
"Model"= "Coreys Random Forest") %>%
rbind(results)
RMSE | Rsquared | MAE | In Sample RMSE | In Sample Rsquared | In Sample MAE | Model |
---|---|---|---|---|---|---|
0.6568 | 0.6655 | 0.4831 | 0.5704 | 0.6665 | 0.4570 | Boosted Tree |
0.6999 | 0.6886 | 0.5176 | 0.5805 | 0.6943 | 0.4573 | Random Forest |
0.7315 | 0.5697 | 0.5697 | 0.7984 | 0.4305 | 0.6363 | PLS |
0.7786 | 0.5344 | 0.6058 | 0.6305 | 0.6266 | 0.4982 | Conditional Random Forest |
0.8420 | 0.6049 | 0.6559 | N/A | N/A | N/A | Coreys Random Forest |
0.8807 | 0.3729 | 0.6739 | 0.6934 | 0.5220 | 0.5599 | Bagged Tree |