Data 624 Assignment 9

8.1

a.

Fit a random forest model to all of the predictors, then estimate the variable importance scores:

library(mlbench)
set.seed(200)

simulated <- mlbench.friedman1(200, sd = 1)
simulated <- cbind(simulated$x, simulated$y)
simulated <- as.data.frame(simulated)
colnames(simulated)[ncol(simulated)] <- "y"
library(randomForest)
library(caret)
library(dplyr)   # provides %>% and arrange() used below
model1 <- randomForest(y ~ ., data = simulated, importance = TRUE, ntree = 1000)
rfImp1 <- varImp(model1, scale = FALSE)

rfImp1 %>% arrange(-Overall)
         Overall
V1   8.732235404
V4   7.615118809
V2   6.415369387
V5   2.023524577
V3   0.763591825
V6   0.165111172
V7  -0.005961659
V10 -0.074944788
V9  -0.095292651
V8  -0.166362581

Did the random forest model significantly use the uninformative predictors (V6 - V10)?

According to the variable importance scores, the random forest did not make significant use of V6 - V10; their importance values are near zero (and several are negative).

b.

Now add an additional predictor that is highly correlated with one of the informative predictors. For example:

simulated$duplicate1 <- simulated$V1 + rnorm(200) * .1
cor(simulated$duplicate1, simulated$V1)
[1] 0.9460206

Fit another random forest model to these data. Did the importance score for V1 change? What happens when you add another predictor that is also highly correlated with V1?

model2 <- randomForest(y ~ ., data = simulated, importance = TRUE, ntree = 1000)
rfImp2 <- varImp(model2, scale = FALSE)

rfImp2 %>% arrange(-Overall)
               Overall
V4          7.04752238
V2          6.06896061
V1          5.69119973
duplicate1  4.28331581
V5          1.87238438
V3          0.62970218
V6          0.13569065
V10         0.02894814
V9          0.00840438
V7         -0.01345645
V8         -0.04370565

When we add a predictor that is highly correlated with V1, V1's importance drops (from roughly 8.7 to 5.7 here) and is shared with duplicate1. Adding another predictor that is also highly correlated with V1 would dilute V1's importance even further, splitting it across all of the correlated copies.
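
To answer the second part directly, a sketch of adding a second correlated predictor and refitting (sim2, duplicate2, and model_dup2 are illustrative names; a copy of the data is used so the later parts are unaffected):

sim2 <- simulated
sim2$duplicate2 <- sim2$V1 + rnorm(200) * .1   # second predictor correlated with V1
model_dup2 <- randomForest(y ~ ., data = sim2, importance = TRUE, ntree = 1000)
varImp(model_dup2, scale = FALSE) %>% arrange(-Overall)
# Expectation: V1's importance is split further among V1, duplicate1, and duplicate2.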

c. 

Use the cforest function in the party package to fit a random forest model using conditional inference trees. The party package function varimp can calculate predictor importance. The conditional argument of that function toggles between the traditional importance measure and the modified version described in Strobl et al. (2007). Do these importances show the same pattern as the traditional random forest model?

set.seed(200)
partyforest<-party::cforest(y ~ ., data = simulated)

party::varimp(partyforest) %>% as.data.frame() %>% 
  rename("Variable Importance"="." ) %>% 
  arrange(-`Variable Importance`)
           Variable Importance
V4                 7.690072627
V2                 6.249765490
duplicate1         5.291458919
V1                 4.293239165
V5                 1.594974284
V3                 0.012220217
V7                -0.004297228
V9                -0.013118351
V10               -0.022547729
V6                -0.031833051
V8                -0.044193455

This implementation of the random forest shows slight variations in the individual scores, but the overall pattern is the same: V4, V2, V1, and duplicate1 dominate while V6 - V10 remain unimportant.
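
The conditional measure mentioned in the question can also be requested directly; a sketch (it is noticeably slower because it permutes within strata of correlated predictors):

party::varimp(partyforest, conditional = TRUE) %>% as.data.frame() %>% 
  rename("Variable Importance" = ".") %>% 
  arrange(-`Variable Importance`)
# The conditional measure typically shrinks the scores of correlated predictors
# such as V1 and duplicate1 relative to the unconditional measure.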

d. 

Repeat this process with different tree models, such as boosted trees and Cubist. Does the same pattern occur?

library(Cubist)
library(gbm)
Mboosted <- gbm(y ~ ., data = simulated, distribution = "gaussian")
Mcube <- cubist(simulated[, -11], simulated$y, committees = 100)   # column 11 is y

The boosted model's variable importance shows a similar pattern, and the exact values can vary slightly with the number of trees. V7 through V10 have zero importance in this model, and V6 is close to zero.

varImp(Mboosted, numTrees = 100) %>% 
  arrange(-Overall)
              Overall
V4         4757.61808
V2         3658.76044
V1         2544.17548
duplicate1 1712.20947
V5         1511.66363
V3         1324.69227
V6           29.83765
V7            0.00000
V8            0.00000
V9            0.00000
V10           0.00000

The Cubist variable importance likewise shows that V6 - V10 are far less important than the informative predictors.

varImp(Mcube) %>% 
  arrange(-Overall)
           Overall
V2            59.5
V1            52.5
V4            46.0
V3            43.5
duplicate1    27.5
V5            27.0
V6            10.0
V8             4.0
V10            1.0
V7             0.0
V9             0.0

8.2

Use a simulation to show tree bias with different granularities

This simulation varies granularity through the number of decimal places in each predictor and uses the rpart function to measure feature importance.

# Four uniformly sampled predictors with decreasing granularity:
# var1 can take ~1e7 distinct values, var3 ~1e5, var2 ~1e3, and var4 only 11.
df <- data.frame(var1 = sample(0:10000000/10000000, 500, replace = TRUE),
                 var2 = sample(0:1000/1000, 500, replace = TRUE),
                 var3 = sample(0:100000/100000, 500, replace = TRUE),
                 var4 = sample(0:10/10, 500, replace = TRUE)) %>% 
  mutate(y = var1 + var4 + rnorm(500))   # only var1 and var4 drive the response


head(df)
       var1  var2    var3 var4          y
1 0.2260410 0.470 0.53561  0.3 -0.3199583
2 0.8513108 0.239 0.47088  0.0  1.8677621
3 0.9149679 0.617 0.21226  0.4  1.0459871
4 0.5671512 0.360 0.50202  0.0  0.5580564
5 0.2966322 0.335 0.53569  0.3  1.6417407
6 0.3721379 0.340 0.10817  0.4  1.2693237
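
As a quick check of granularity, the number of distinct values in each predictor can be counted (a small sketch using dplyr::n_distinct):

sapply(df[, c("var1", "var2", "var3", "var4")], dplyr::n_distinct)
# var1 and var3 are essentially continuous, var2 is coarser, and var4 takes at most 11 values.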

The importance scores below illustrate the bias toward more granular predictors: the noise variables var3 and var2, which take many distinct values, receive higher importance scores than the truly informative var1 and var4. The same effect is visible in the decision tree plot, where the most important predictors appear closest to the root node.

library(rpart)
rp_model <- rpart(y~., data=df)
varImp(rp_model) %>% 
  arrange(-Overall)
       Overall
var3 0.4861922
var2 0.4197022
var4 0.3600104
var1 0.3451787
rpart.plot::rpart.plot(rp_model)

8.3

a.

In stochastic gradient boosting, the bagging fraction and learning rate will govern the construction of the trees as they are guided by the gradient. Although the optimal values of these parameters should be obtained through the tuning process, it is helpful to understand how the magnitudes of these parameters affect the magnitudes of variable importance. Figure 8.24 provides the variable importance plots for boosting using two extreme values for the bagging fraction (0.1 and 0.9) and the learning rate (0.1 and 0.9) for the solubility data. The left-hand plot has both parameters set to 0.1, and the right-hand plot has both set to 0.9:

Why does the model on the right focus its importance on just the first few predictors, whereas the model on the left spreads importance across more predictors?

The model on the right uses a bagging fraction and learning rate of 0.9: each tree is fit to nearly all of the data and each iteration takes a large step toward the gradient, so the same few strong predictors are selected over and over and importance concentrates on them. The model on the left, with both parameters at 0.1, fits many small, diverse trees on different 10% subsamples and makes only small corrections at each step, which gives other predictors a chance to be used and spreads the importance across more of them.
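
The figure uses the solubility data, but the same effect can be sketched on the simulated data from 8.1 (gbm is loaded in part 8.1d; n.minobsinnode is lowered only because this data set is small):

gbm_left  <- gbm(y ~ ., data = simulated, distribution = "gaussian", n.trees = 100,
                 shrinkage = 0.1, bag.fraction = 0.1, n.minobsinnode = 3)
gbm_right <- gbm(y ~ ., data = simulated, distribution = "gaussian", n.trees = 100,
                 shrinkage = 0.9, bag.fraction = 0.9, n.minobsinnode = 3)
# The relative influence should concentrate on fewer predictors for the 0.9/0.9 model.
summary(gbm_left,  plotit = FALSE)
summary(gbm_right, plotit = FALSE)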

b.

Which model do you think would be more predictive of other samples?

I believe the model on the left would be more predictive of other samples: because it spreads importance across more predictors and learns more slowly, it is more generalized and less likely to overfit the training data.

c. 

How would increasing interaction depth affect the slope of predictor importance for either model in Fig. 8.24?

Increasing the interaction depth would result in more balanced variable importance: deeper trees use more predictors in their splits, so importance is spread across additional predictors and the slope of the importance curve flattens for both models.
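
A matching sketch on the simulated data, comparing stumps with deeper trees (object names are illustrative):

gbm_shallow <- gbm(y ~ ., data = simulated, distribution = "gaussian",
                   n.trees = 100, interaction.depth = 1)
gbm_deep    <- gbm(y ~ ., data = simulated, distribution = "gaussian",
                   n.trees = 100, interaction.depth = 10)
# With deeper trees more predictors enter the splits, so the drop-off between
# the top predictors and the rest should be less steep.
summary(gbm_shallow, plotit = FALSE)
summary(gbm_deep, plotit = FALSE)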

8.7

library(AppliedPredictiveModeling)
library(tidymodels)   # recipes, rsample, parsnip, workflows, tune, yardstick
data(ChemicalManufacturingProcess)

df <-
  ChemicalManufacturingProcess %>% 
  recipe(~.) %>% 
  step_impute_knn(all_predictors()) %>% 
  prep() %>% 
  bake(ChemicalManufacturingProcess)
parsnip::show_engines("decision_tree")
# A tibble: 5 x 2
  engine mode          
  <chr>  <chr>         
1 rpart  classification
2 rpart  regression    
3 C5.0   classification
4 spark  classification
5 spark  regression    
library(rules)

# setup decision tree engine
engine_dtree<-decision_tree() %>% 
  set_engine("rpart") %>% 
  set_mode("regression")

# Setup random forest regression
engine_forest<-rand_forest() %>% 
  set_engine("randomForest") %>% 
  set_mode("regression")

# Setup Cubist rules regression
engine_cubist<- cubist_rules() %>% 
  set_engine("Cubist") %>% 
  set_mode("regression")
set.seed(1)
dfsplit<-initial_split(df, prop = 0.75)

df_train<-training(dfsplit)

df_test<-testing(dfsplit)

rec <- recipe(Yield ~ ., data = df_train)

a.

Which tree-based regression model gives the optimal resampling and test set performance?

# Helper: fit a workflow with the supplied engine, then collect training and
# testing predictions labeled with the model name.
Q8_3 <-
  function(engine, model){
    dt_wf <-
      workflow() %>% 
      add_recipe(rec) %>% 
      add_model(engine) %>% 
      fit(data = df_train)
    res <-
      bind_rows(
        dt_wf %>% 
          predict(df_train) %>% 
          mutate(actual = df_train$Yield) %>% 
          mutate(type = "Training"),
        
        dt_wf %>% 
          predict(df_test) %>% 
          mutate(actual = df_test$Yield) %>% 
          mutate(type = "Testing")
      ) %>% 
      mutate(model = model) 
    # Return both the fitted workflow and the prediction data frame
    return(list(dt_wf, res))
  }

Results<-
  bind_rows(
  Q8_3(engine_dtree,"Decision Tree")[2],
  Q8_3(engine_forest,"Random Forest")[2],
  Q8_3(engine_cubist,"Cubist")[2]
  )
Results %>% 
  group_by(type, model) %>% 
  metrics( truth = actual,estimate = .pred) %>% 
  pivot_wider(names_from = .metric, values_from = .estimate) %>% 
  ungroup() %>% 
  group_by(type) %>% 
  arrange(-rsq, .by_group = T)
# A tibble: 6 x 6
# Groups:   type [2]
  type     model         .estimator  rmse   rsq   mae
  <chr>    <chr>         <chr>      <dbl> <dbl> <dbl>
1 Testing  Random Forest standard   1.19  0.592 0.891
2 Testing  Cubist        standard   1.48  0.396 1.24 
3 Testing  Decision Tree standard   1.68  0.238 1.23 
4 Training Random Forest standard   0.458 0.963 0.347
5 Training Cubist        standard   0.872 0.780 0.643
6 Training Decision Tree standard   0.885 0.772 0.702

As shown above, the random forest model performs best on both the training and testing sets.

library(ggpmisc)
Results %>% 
  ggplot(aes(y = .pred, x = actual,color = type )) +
  geom_point()+
  geom_smooth(method = "lm") +
  facet_grid(model~type, scales = "free")+
  stat_poly_eq(aes(label = paste(..adj.rr.label.., sep = "~~~")), 
               label.x.npc = "left", label.y.npc = .9,
               formula = y~x, parse = TRUE, size = 4)+
  theme(panel.background = element_blank(),
        panel.grid = element_blank(),
        panel.border = element_rect(color = "black", fill = NA),
        legend.position = "top",
        legend.title = element_blank())+
  scale_color_brewer( palette = "Dark2")+
  labs(x = "Actual Observations",
       y = "Predicted Observatins",
       title = "Comparison of Tree Based Models")

b.

Which predictors are most important in the optimal tree-based regression model? Do either the biological or process variables dominate the list? How do the top 10 important predictors compare to the top 10 predictors from the optimal linear and nonlinear models?

m1<-Q8_3(engine_dtree,"Decision Tree")[[1]]
m2<-Q8_3(engine_forest,"Random Forest")[[1]]
m3<-Q8_3(engine_cubist,"Cubist")[[1]]


bind_rows(
  m1$fit$fit$fit %>% varImp() %>% 
    top_n(10) %>% 
    arrange(-Overall) %>% 
    mutate(model = "Decition Tree") %>% 
    rownames_to_column()
  ,
  
  m2$fit$fit$fit %>% varImp() %>% 
    top_n(10) %>% 
    arrange(-Overall) %>% 
    mutate(model = "Random Forest") %>% 
    rownames_to_column()
  ,
  
  m3$fit$fit$fit %>% varImp() %>% 
    top_n(1) %>% 
    arrange(-Overall) %>% 
    mutate(model = "Cubist") %>% 
    rownames_to_column()
) %>%  
  mutate(rowname = tidytext::reorder_within(rowname, Overall, model)) %>%
  ggplot(aes(x = Overall, y = rowname, fill = model))+
  geom_col()+
  facet_wrap(~model, scales = "free", ncol = 1)+
  scale_color_brewer(palette = "Dark2")+
  scale_fill_brewer(palette = "Dark2")+
  tidytext::scale_y_reordered()+
  theme_bw()

Based on the variable importance plots above, ManufacturingProcess31 and ManufacturingProcess32 appear in every model, and ManufacturingProcess32 is the strongest predictor. The Cubist model was not able to distinguish among the importance of its top predictors, which likely contributes to its weaker performance. ManufacturingProcess32 was also a top predictor in the linear and nonlinear models fit in previous assignments.
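
To see how many of the random forest's top 10 predictors are process versus biological variables, the importance table can also be inspected directly (a small sketch using the fitted workflow m2 from above):

m2$fit$fit$fit %>% 
  varImp() %>% 
  rownames_to_column("predictor") %>% 
  arrange(-Overall) %>% 
  head(10)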

c. 

Plot the optimal single tree with the distribution of yield in the terminal nodes. Does this view of the data provide additional knowledge about the biological or process predictors and their relationship with yield?

To find the optimal parameters for the decision tree, we set up a tuning grid over cost_complexity (the minimum improvement in fit required at a node for a split to be kept), tree_depth (the maximum number of levels the tree may grow), and min_n (the minimum number of samples a node must contain before it can be split).

engine_dtree<-decision_tree(cost_complexity = tune(),
                            tree_depth = tune(),
                            min_n= tune()) %>% 
  set_engine("rpart") %>% 
  set_mode("regression")


datacv<- vfold_cv(df, v = 10)

tree_grid <- grid_regular(cost_complexity(), tree_depth(), min_n(), levels = 4)

tree_rs <- tune_grid(
  object =  engine_dtree,
  preprocessor = rec,
  resamples = datacv,
  grid = tree_grid,
  metrics = metric_set(rsq,rmse )
)

library(knitr)
library(kableExtra)
collect_metrics(tree_rs) %>% kable() %>% 
   kable_styling(
      full_width = F, position="center", bootstrap_options = c("hover")) %>% 
  scroll_box(height = "200px") %>% 
  kable_classic_2()
cost_complexity tree_depth min_n .metric .estimator mean n std_err .config
0e+00 1 2 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model01
0e+00 1 2 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model01
1e-07 1 2 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model02
1e-07 1 2 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model02
1e-04 1 2 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model03
1e-04 1 2 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model03
1e-01 1 2 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model04
1e-01 1 2 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model04
0e+00 5 2 rmse standard 1.4563516 10 0.1206577 Preprocessor1_Model05
0e+00 5 2 rsq standard 0.4243557 10 0.0526111 Preprocessor1_Model05
1e-07 5 2 rmse standard 1.4563516 10 0.1206577 Preprocessor1_Model06
1e-07 5 2 rsq standard 0.4243557 10 0.0526111 Preprocessor1_Model06
1e-04 5 2 rmse standard 1.4534933 10 0.1203006 Preprocessor1_Model07
1e-04 5 2 rsq standard 0.4251855 10 0.0525927 Preprocessor1_Model07
1e-01 5 2 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model08
1e-01 5 2 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model08
0e+00 10 2 rmse standard 1.4897137 10 0.1109299 Preprocessor1_Model09
0e+00 10 2 rsq standard 0.4178560 10 0.0524274 Preprocessor1_Model09
1e-07 10 2 rmse standard 1.4897633 10 0.1109309 Preprocessor1_Model10
1e-07 10 2 rsq standard 0.4178239 10 0.0524301 Preprocessor1_Model10
1e-04 10 2 rmse standard 1.4790640 10 0.1112472 Preprocessor1_Model11
1e-04 10 2 rsq standard 0.4202554 10 0.0530302 Preprocessor1_Model11
1e-01 10 2 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model12
1e-01 10 2 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model12
0e+00 15 2 rmse standard 1.4910580 10 0.1102626 Preprocessor1_Model13
0e+00 15 2 rsq standard 0.4169480 10 0.0521566 Preprocessor1_Model13
1e-07 15 2 rmse standard 1.4910625 10 0.1102452 Preprocessor1_Model14
1e-07 15 2 rsq standard 0.4169305 10 0.0521534 Preprocessor1_Model14
1e-04 15 2 rmse standard 1.4788006 10 0.1113068 Preprocessor1_Model15
1e-04 15 2 rsq standard 0.4206964 10 0.0531014 Preprocessor1_Model15
1e-01 15 2 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model16
1e-01 15 2 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model16
0e+00 1 14 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model17
0e+00 1 14 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model17
1e-07 1 14 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model18
1e-07 1 14 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model18
1e-04 1 14 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model19
1e-04 1 14 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model19
1e-01 1 14 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model20
1e-01 1 14 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model20
0e+00 5 14 rmse standard 1.4229090 10 0.0796394 Preprocessor1_Model21
0e+00 5 14 rsq standard 0.4685613 10 0.0493506 Preprocessor1_Model21
1e-07 5 14 rmse standard 1.4229090 10 0.0796394 Preprocessor1_Model22
1e-07 5 14 rsq standard 0.4685613 10 0.0493506 Preprocessor1_Model22
1e-04 5 14 rmse standard 1.4229090 10 0.0796394 Preprocessor1_Model23
1e-04 5 14 rsq standard 0.4685613 10 0.0493506 Preprocessor1_Model23
1e-01 5 14 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model24
1e-01 5 14 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model24
0e+00 10 14 rmse standard 1.4469313 10 0.0829300 Preprocessor1_Model25
0e+00 10 14 rsq standard 0.4580978 10 0.0503995 Preprocessor1_Model25
1e-07 10 14 rmse standard 1.4469313 10 0.0829300 Preprocessor1_Model26
1e-07 10 14 rsq standard 0.4580978 10 0.0503995 Preprocessor1_Model26
1e-04 10 14 rmse standard 1.4469313 10 0.0829300 Preprocessor1_Model27
1e-04 10 14 rsq standard 0.4580978 10 0.0503995 Preprocessor1_Model27
1e-01 10 14 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model28
1e-01 10 14 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model28
0e+00 15 14 rmse standard 1.4469313 10 0.0829300 Preprocessor1_Model29
0e+00 15 14 rsq standard 0.4580978 10 0.0503995 Preprocessor1_Model29
1e-07 15 14 rmse standard 1.4469313 10 0.0829300 Preprocessor1_Model30
1e-07 15 14 rsq standard 0.4580978 10 0.0503995 Preprocessor1_Model30
1e-04 15 14 rmse standard 1.4469313 10 0.0829300 Preprocessor1_Model31
1e-04 15 14 rsq standard 0.4580978 10 0.0503995 Preprocessor1_Model31
1e-01 15 14 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model32
1e-01 15 14 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model32
0e+00 1 27 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model33
0e+00 1 27 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model33
1e-07 1 27 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model34
1e-07 1 27 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model34
1e-04 1 27 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model35
1e-04 1 27 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model35
1e-01 1 27 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model36
1e-01 1 27 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model36
0e+00 5 27 rmse standard 1.4466793 10 0.0714133 Preprocessor1_Model37
0e+00 5 27 rsq standard 0.4005788 10 0.0442998 Preprocessor1_Model37
1e-07 5 27 rmse standard 1.4466793 10 0.0714133 Preprocessor1_Model38
1e-07 5 27 rsq standard 0.4005788 10 0.0442998 Preprocessor1_Model38
1e-04 5 27 rmse standard 1.4466793 10 0.0714133 Preprocessor1_Model39
1e-04 5 27 rsq standard 0.4005788 10 0.0442998 Preprocessor1_Model39
1e-01 5 27 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model40
1e-01 5 27 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model40
0e+00 10 27 rmse standard 1.4466793 10 0.0714133 Preprocessor1_Model41
0e+00 10 27 rsq standard 0.4005788 10 0.0442998 Preprocessor1_Model41
1e-07 10 27 rmse standard 1.4466793 10 0.0714133 Preprocessor1_Model42
1e-07 10 27 rsq standard 0.4005788 10 0.0442998 Preprocessor1_Model42
1e-04 10 27 rmse standard 1.4466793 10 0.0714133 Preprocessor1_Model43
1e-04 10 27 rsq standard 0.4005788 10 0.0442998 Preprocessor1_Model43
1e-01 10 27 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model44
1e-01 10 27 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model44
0e+00 15 27 rmse standard 1.4466793 10 0.0714133 Preprocessor1_Model45
0e+00 15 27 rsq standard 0.4005788 10 0.0442998 Preprocessor1_Model45
1e-07 15 27 rmse standard 1.4466793 10 0.0714133 Preprocessor1_Model46
1e-07 15 27 rsq standard 0.4005788 10 0.0442998 Preprocessor1_Model46
1e-04 15 27 rmse standard 1.4466793 10 0.0714133 Preprocessor1_Model47
1e-04 15 27 rsq standard 0.4005788 10 0.0442998 Preprocessor1_Model47
1e-01 15 27 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model48
1e-01 15 27 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model48
0e+00 1 40 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model49
0e+00 1 40 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model49
1e-07 1 40 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model50
1e-07 1 40 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model50
1e-04 1 40 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model51
1e-04 1 40 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model51
1e-01 1 40 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model52
1e-01 1 40 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model52
0e+00 5 40 rmse standard 1.4033259 10 0.0716159 Preprocessor1_Model53
0e+00 5 40 rsq standard 0.4264263 10 0.0397899 Preprocessor1_Model53
1e-07 5 40 rmse standard 1.4033259 10 0.0716159 Preprocessor1_Model54
1e-07 5 40 rsq standard 0.4264263 10 0.0397899 Preprocessor1_Model54
1e-04 5 40 rmse standard 1.4033259 10 0.0716159 Preprocessor1_Model55
1e-04 5 40 rsq standard 0.4264263 10 0.0397899 Preprocessor1_Model55
1e-01 5 40 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model56
1e-01 5 40 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model56
0e+00 10 40 rmse standard 1.4033259 10 0.0716159 Preprocessor1_Model57
0e+00 10 40 rsq standard 0.4264263 10 0.0397899 Preprocessor1_Model57
1e-07 10 40 rmse standard 1.4033259 10 0.0716159 Preprocessor1_Model58
1e-07 10 40 rsq standard 0.4264263 10 0.0397899 Preprocessor1_Model58
1e-04 10 40 rmse standard 1.4033259 10 0.0716159 Preprocessor1_Model59
1e-04 10 40 rsq standard 0.4264263 10 0.0397899 Preprocessor1_Model59
1e-01 10 40 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model60
1e-01 10 40 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model60
0e+00 15 40 rmse standard 1.4033259 10 0.0716159 Preprocessor1_Model61
0e+00 15 40 rsq standard 0.4264263 10 0.0397899 Preprocessor1_Model61
1e-07 15 40 rmse standard 1.4033259 10 0.0716159 Preprocessor1_Model62
1e-07 15 40 rsq standard 0.4264263 10 0.0397899 Preprocessor1_Model62
1e-04 15 40 rmse standard 1.4033259 10 0.0716159 Preprocessor1_Model63
1e-04 15 40 rsq standard 0.4264263 10 0.0397899 Preprocessor1_Model63
1e-01 15 40 rmse standard 1.4255120 10 0.0907007 Preprocessor1_Model64
1e-01 15 40 rsq standard 0.3908935 10 0.0483682 Preprocessor1_Model64
tree_rs %>% autoplot()+theme_light(base_family = "IBMPlexSans")

According to the tuning results, the optimal tree uses the parameters below.

final_tree <- finalize_model(engine_dtree, select_best(tree_rs, "rsq"))

final_tree
Decision Tree Model Specification (regression)

Main Arguments:
  cost_complexity = 1e-10
  tree_depth = 5
  min_n = 14

Computational engine: rpart 

A decision tree fit to the entire dataset using these tuning parameters is shown below:

final_wf<-
  workflow() %>% 
  add_model(final_tree) %>% 
  add_recipe(rec)

dtmodel<-
  final_wf %>% fit(data = df)

rpart.plot::rpart.plot(dtmodel$fit$fit$fit, roundint = F)
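
The question also asks for the distribution of Yield in the terminal nodes. One way to view this (a sketch, assuming the partykit package is installed) is to convert the fitted rpart object to a party object, whose default plot draws a boxplot of the response in each terminal node:

library(partykit)
dtparty <- as.party(dtmodel$fit$fit$fit)   # convert the rpart fit to a party object
plot(dtparty)                              # terminal nodes show boxplots of Yield

This view makes it easier to see which combinations of process and biological predictors lead to the highest and lowest yields.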