WY Accountability Random Forest Model

## # A tibble: 4 × 4
##   spl                                           enrollment_size     n percent
##   <fct>                                         <chr>           <int>   <dbl>
## 1 Exceeding or Meeting Expectations             large             200    24.4
## 2 Exceeding or Meeting Expectations             small             255    31.1
## 3 Partially Meeting or Not Meeting Expectations large             211    25.7
## 4 Partially Meeting or Not Meeting Expectations small             155    18.9
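
Each percentage in the table above is that cell's share of all 821 rows (200 + 255 + 211 + 155). The following base-R sketch recomputes them from the printed counts. The data frame name `counts` is an assumption made for illustration; the report's own code is not shown.

```r
# Counts copied from the table above; `counts` is a hypothetical name.
counts <- data.frame(
  spl = rep(c("Exceeding or Meeting Expectations",
              "Partially Meeting or Not Meeting Expectations"), each = 2),
  enrollment_size = rep(c("large", "small"), times = 2),
  n = c(200, 255, 211, 155)
)

# Each cell's share of the grand total, as a percentage.
counts$percent <- round(counts$n / sum(counts$n) * 100, 1)
print(counts)

# Share of each outcome class overall, ignoring enrollment size.
class_share <- tapply(counts$n, counts$spl, sum) / sum(counts$n) * 100
print(round(class_share, 1))
```

Summed over enrollment size, the two outcome classes split roughly 55% to 45%, so the outcome is fairly balanced.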

Model

## 25.866 sec elapsed

Metrics before Tuning

The initial model run, averaged over 10 resamples, yields a mean roc_auc of 0.678 (standard error 0.0213).

## # A tibble: 1 × 6
##   .metric .estimator  mean     n std_err .config             
##   <chr>   <chr>      <dbl> <int>   <dbl> <chr>               
## 1 roc_auc binary     0.678    10  0.0213 Preprocessor1_Model1
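
A table of this shape (one row, `n = 10`, a `std_err` column) is what `collect_metrics()` returns after `fit_resamples()` on 10 resamples. The sketch below shows that pattern on simulated data. The data, variable names, fold setup and tree count are all assumptions, since the report's code and data are not shown.

```r
library(tidymodels)

set.seed(123)
# Simulated stand-in data (hypothetical); only x1 carries signal.
n_rows <- 200
toy <- tibble(
  x1 = rnorm(n_rows),
  x2 = rnorm(n_rows),
  enrollment_size = factor(sample(c("large", "small"), n_rows, replace = TRUE))
) %>%
  mutate(spl = factor(if_else(x1 + rnorm(n_rows) > 0, "Meeting", "Not Meeting")))

# 10-fold cross-validation, stratified on the outcome (an assumption).
folds <- vfold_cv(toy, v = 10, strata = spl)

# Untuned random forest with default mtry and min_n.
rf_spec <- rand_forest(trees = 500) %>%
  set_engine("ranger") %>%
  set_mode("classification")

rf_wf <- workflow() %>%
  add_formula(spl ~ .) %>%
  add_model(rf_spec)

baseline <- fit_resamples(rf_wf, resamples = folds,
                          metrics = metric_set(roc_auc))

# One row: mean roc_auc over the folds and its standard error.
baseline_metrics <- collect_metrics(baseline)
print(baseline_metrics)
```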

Metrics after Tuning

## Random Forest Model Specification (classification)
## 
## Main Arguments:
##   mtry = tune()
##   trees = 10000
##   min_n = tune()
## 
## Engine-Specific Arguments:
##   num.threads = cores
##   importance = permutation
##   verbose = TRUE
## 
## Computational engine: ranger 
## 
## Model fit template:
## ranger::ranger(x = missing_arg(), y = missing_arg(), weights = missing_arg(), 
##     mtry = min_cols(~tune(), x), num.trees = 10000, min.node.size = min_rows(~tune(), 
##         x), num.threads = cores, importance = "permutation", 
##     verbose = TRUE, seed = sample.int(10^5, 1), probability = TRUE)

## 333.297 sec elapsed

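With tuning, the best mean roc_auc is 0.680 (mtry = 5, min_n = 40). This is only marginally above the untuned 0.678, and the difference is far smaller than the standard error (0.0242).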

## # A tibble: 5 × 8
##    mtry min_n .metric .estimator  mean     n std_err .config              
##   <int> <int> <chr>   <chr>      <dbl> <int>   <dbl> <chr>                
## 1     5    40 roc_auc binary     0.680     9  0.0242 Preprocessor1_Model05
## 2     6    20 roc_auc binary     0.679     9  0.0238 Preprocessor1_Model10
## 3    12    30 roc_auc binary     0.679     9  0.0245 Preprocessor1_Model01
## 4    17    33 roc_auc binary     0.675     9  0.0248 Preprocessor1_Model02
## 5     9    13 roc_auc binary     0.673     9  0.0230 Preprocessor1_Model06
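
The ranked table above has the shape of `show_best()` output after `tune_grid()`. The sketch below reproduces the pattern on simulated data, with `mtry` and `min_n` left as `tune()` placeholders as in the printed specification. The data, the explicit grid, the fold count and the reduced tree count (200 instead of 10000, for speed) are assumptions.

```r
library(tidymodels)

set.seed(2024)
# Simulated stand-in data (hypothetical); only x1 carries signal.
n_rows <- 300
toy <- tibble(
  x1 = rnorm(n_rows), x2 = rnorm(n_rows), x3 = rnorm(n_rows),
  enrollment_size = factor(sample(c("large", "small"), n_rows, replace = TRUE))
) %>%
  mutate(spl = factor(if_else(x1 + rnorm(n_rows) > 0, "Meeting", "Not Meeting")))

folds <- vfold_cv(toy, v = 5, strata = spl)

# mtry and min_n are tuned; trees is fixed.
rf_tune_spec <- rand_forest(mtry = tune(), trees = 200, min_n = tune()) %>%
  set_engine("ranger", importance = "permutation") %>%
  set_mode("classification")

rf_wf <- workflow() %>%
  add_formula(spl ~ .) %>%
  add_model(rf_tune_spec)

# An explicit 3 x 3 grid; the grid used in the report is not shown.
rf_grid <- crossing(mtry = 1:3, min_n = c(5, 20, 40))

tuned <- tune_grid(rf_wf, resamples = folds, grid = rf_grid,
                   metrics = metric_set(roc_auc))

# Top five candidates, ranked by mean roc_auc across folds.
best <- show_best(tuned, metric = "roc_auc", n = 5)
print(best)
```

When the top candidates differ by less than their standard errors, as they do in the table above, the ranking says little about which setting is actually better.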

Model with Workflow Added

## ══ Workflow ════════════════════════════════════════════════════════════════════
## Preprocessor: Recipe
## Model: rand_forest()
## 
## ── Preprocessor ────────────────────────────────────────────────────────────────
## 5 Recipe Steps
## 
## • step_novel()
## • step_unknown()
## • step_impute_median()
## • step_dummy()
## • step_nzv()
## 
## ── Model ───────────────────────────────────────────────────────────────────────
## Random Forest Model Specification (classification)
## 
## Main Arguments:
##   mtry = 5
##   trees = 10000
##   min_n = 40
## 
## Engine-Specific Arguments:
##   num.threads = cores
##   importance = permutation
##   verbose = TRUE
## 
## Computational engine: ranger

## 7.026 sec elapsed

With the finalized workflow (mtry = 5, min_n = 40), the roc_auc is 0.621 and the accuracy is also 0.621.

## [[1]]
## # A tibble: 2 × 4
##   .metric  .estimator .estimate .config             
##   <chr>    <chr>          <dbl> <chr>               
## 1 accuracy binary         0.621 Preprocessor1_Model1
## 2 roc_auc  binary         0.621 Preprocessor1_Model1
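
The workflow printout and the two-row metric table above are consistent with finalizing the tuned workflow and evaluating it once with `last_fit()`. The sketch below shows that sequence on simulated data, using the five recipe steps listed in the printout. The selectors passed to each step, the data, the split proportion and the chosen `mtry`/`min_n` values are assumptions.

```r
library(tidymodels)

set.seed(42)
# Simulated stand-in data (hypothetical); only x1 carries signal.
n_rows <- 400
toy <- tibble(
  x1 = rnorm(n_rows), x2 = rnorm(n_rows), x3 = rnorm(n_rows),
  enrollment_size = factor(sample(c("large", "small"), n_rows, replace = TRUE))
) %>%
  mutate(spl = factor(if_else(x1 + rnorm(n_rows) > 0, "Meeting", "Not Meeting")))
toy$x2[sample(n_rows, 20)] <- NA  # give the imputation step something to do

split <- initial_split(toy, prop = 0.75, strata = spl)

# Same five steps as the printout; the selectors are assumptions.
rec <- recipe(spl ~ ., data = training(split)) %>%
  step_novel(all_nominal_predictors()) %>%
  step_unknown(all_nominal_predictors()) %>%
  step_impute_median(all_numeric_predictors()) %>%
  step_dummy(all_nominal_predictors()) %>%
  step_nzv(all_predictors())

rf_spec <- rand_forest(mtry = tune(), trees = 200, min_n = tune()) %>%
  set_engine("ranger", importance = "permutation") %>%
  set_mode("classification")

wf <- workflow() %>% add_recipe(rec) %>% add_model(rf_spec)

# Replace the tune() placeholders with chosen values (illustrative here).
final_wf <- finalize_workflow(wf, tibble(mtry = 2, min_n = 20))

# Fit on the training portion, evaluate once on the testing portion.
final_fit <- last_fit(final_wf, split,
                      metrics = metric_set(accuracy, roc_auc))
test_metrics <- collect_metrics(final_fit)
print(test_metrics)
```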

Important Predictor Plot
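
Because the model specification sets `importance = permutation`, the fitted ranger object carries a permutation importance score for each predictor. The sketch below shows one way to rank and plot those scores with ranger and base graphics. The data are simulated, and the report may well have used a different plotting tool.

```r
library(ranger)

set.seed(7)
# Simulated stand-in data (hypothetical); only x1 carries signal.
n_rows <- 300
toy <- data.frame(
  x1 = rnorm(n_rows), x2 = rnorm(n_rows), x3 = rnorm(n_rows),
  enrollment_size = factor(sample(c("large", "small"), n_rows, replace = TRUE))
)
toy$spl <- factor(ifelse(toy$x1 + rnorm(n_rows, sd = 0.5) > 0,
                         "Meeting", "Not Meeting"))

# Probability forest with permutation importance, as in the printed spec.
fit <- ranger(spl ~ ., data = toy, num.trees = 500,
              importance = "permutation", probability = TRUE, seed = 7)

# Named numeric vector, one score per predictor, largest first.
imp <- sort(fit$variable.importance, decreasing = TRUE)
print(imp)

# Horizontal bar chart with the most important predictor on top.
barplot(rev(imp), horiz = TRUE, las = 1, xlab = "Permutation importance")
```

Scores near zero, or slightly negative, indicate predictors whose permutation does not hurt out-of-bag performance.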

Predictions

## # A tibble: 206 × 1
##    .pred_class                                  
##    <fct>                                        
##  1 Exceeding or Meeting Expectations            
##  2 Partially Meeting or Not Meeting Expectations
##  3 Exceeding or Meeting Expectations            
##  4 Partially Meeting or Not Meeting Expectations
##  5 Partially Meeting or Not Meeting Expectations
##  6 Exceeding or Meeting Expectations            
##  7 Partially Meeting or Not Meeting Expectations
##  8 Partially Meeting or Not Meeting Expectations
##  9 Exceeding or Meeting Expectations            
## 10 Exceeding or Meeting Expectations            
## # … with 196 more rows
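
A one-column tibble of `.pred_class` values like the one above is what `predict()` returns for a fitted classification workflow. The sketch below shows this on simulated data and also requests class probabilities, which are what roc_auc is computed from. The data and model settings are assumptions.

```r
library(tidymodels)

set.seed(99)
# Simulated stand-in data (hypothetical); only x1 carries signal.
n_rows <- 400
toy <- tibble(
  x1 = rnorm(n_rows), x2 = rnorm(n_rows),
  enrollment_size = factor(sample(c("large", "small"), n_rows, replace = TRUE))
) %>%
  mutate(spl = factor(if_else(x1 + rnorm(n_rows) > 0, "Meeting", "Not Meeting")))

split <- initial_split(toy, prop = 0.75, strata = spl)

rf_spec <- rand_forest(trees = 200) %>%
  set_engine("ranger") %>%
  set_mode("classification")

rf_fit <- workflow() %>%
  add_formula(spl ~ .) %>%
  add_model(rf_spec) %>%
  fit(data = training(split))

# Hard class predictions: a tibble with a single .pred_class column.
pred_class <- predict(rf_fit, new_data = testing(split))

# Class probabilities: one .pred_<level> column per outcome level.
pred_prob <- predict(rf_fit, new_data = testing(split), type = "prob")

print(head(bind_cols(pred_class, pred_prob)))
```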

Model with Condensed Recipe

## 6.925 sec elapsed

Metrics before Tuning

The initial model run, averaged over 10 resamples, yields a mean roc_auc of 0.594 (standard error 0.0157).

## # A tibble: 1 × 6
##   .metric .estimator  mean     n std_err .config             
##   <chr>   <chr>      <dbl> <int>   <dbl> <chr>               
## 1 roc_auc binary     0.594    10  0.0157 Preprocessor1_Model1

Metrics after Tuning

## Random Forest Model Specification (classification)
## 
## Main Arguments:
##   mtry = tune()
##   trees = 10000
##   min_n = tune()
## 
## Engine-Specific Arguments:
##   num.threads = cores
##   importance = permutation
##   verbose = TRUE
## 
## Computational engine: ranger 
## 
## Model fit template:
## ranger::ranger(x = missing_arg(), y = missing_arg(), weights = missing_arg(), 
##     mtry = min_cols(~tune(), x), num.trees = 10000, min.node.size = min_rows(~tune(), 
##         x), num.threads = cores, importance = "permutation", 
##     verbose = TRUE, seed = sample.int(10^5, 1), probability = TRUE)

## 219.46 sec elapsed

With tuning, the best mean roc_auc improves to 0.611 (mtry = 1, min_n = 6), up from 0.594 before tuning.

## # A tibble: 5 × 8
##    mtry min_n .metric .estimator  mean     n std_err .config              
##   <int> <int> <chr>   <chr>      <dbl> <int>   <dbl> <chr>                
## 1     1     6 roc_auc binary     0.611    10  0.0165 Preprocessor1_Model07
## 2     3    40 roc_auc binary     0.595    10  0.0160 Preprocessor1_Model05
## 3     4    20 roc_auc binary     0.584    10  0.0131 Preprocessor1_Model10
## 4     5    13 roc_auc binary     0.580    10  0.0120 Preprocessor1_Model06
## 5     7    30 roc_auc binary     0.574    10  0.0121 Preprocessor1_Model01
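
To put the tuning gains in context, the sketch below expresses each gain in units of the tuned model's standard error, using values copied from the metric tables in this report. The labels `first` and `condensed` are assumed names for the two recipes. This is a rough yardstick rather than a formal test, since the resampled estimates being compared are not independent.

```r
# Mean roc_auc values and standard errors copied from the report's tables.
tuning <- data.frame(
  recipe   = c("first", "condensed"),  # hypothetical labels
  untuned  = c(0.678, 0.594),
  tuned    = c(0.680, 0.611),
  tuned_se = c(0.0242, 0.0165)
)

# Absolute gain from tuning, and the same gain in standard-error units.
tuning$gain <- tuning$tuned - tuning$untuned
tuning$gain_in_se <- round(tuning$gain / tuning$tuned_se, 2)
print(tuning)
```

The gain for the first recipe (0.002) is under a tenth of a standard error, while the gain for the condensed recipe (0.017) is about one standard error.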

Model with Workflow Added

## ══ Workflow ════════════════════════════════════════════════════════════════════
## Preprocessor: Recipe
## Model: rand_forest()
## 
## ── Preprocessor ────────────────────────────────────────────────────────────────
## 5 Recipe Steps
## 
## • step_novel()
## • step_unknown()
## • step_impute_median()
## • step_dummy()
## • step_nzv()
## 
## ── Model ───────────────────────────────────────────────────────────────────────
## Random Forest Model Specification (classification)
## 
## Main Arguments:
##   mtry = 1
##   trees = 10000
##   min_n = 6
## 
## Engine-Specific Arguments:
##   num.threads = cores
##   importance = permutation
##   verbose = TRUE
## 
## Computational engine: ranger

## 2.266 sec elapsed

With the finalized workflow (mtry = 1, min_n = 6), the roc_auc is 0.553 and the accuracy is 0.578.

## [[1]]
## # A tibble: 2 × 4
##   .metric  .estimator .estimate .config             
##   <chr>    <chr>          <dbl> <chr>               
## 1 accuracy binary         0.578 Preprocessor1_Model1
## 2 roc_auc  binary         0.553 Preprocessor1_Model1
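
The sketch below sets the final metrics of both recipes next to their best resampled roc_auc, using values copied from the tables in this report. It assumes the final metrics were computed on the same 206 rows as the prediction tibbles, and it uses the binomial formula sqrt(p(1 - p) / n) as a rough standard error for accuracy. The labels `first` and `condensed` are assumed names.

```r
# Values copied from the report's tables; n_test is an assumption
# based on the 206-row prediction tibbles.
n_test <- 206
results <- data.frame(
  recipe        = c("first", "condensed"),  # hypothetical labels
  resampled_auc = c(0.680, 0.611),
  final_auc     = c(0.621, 0.553),
  final_acc     = c(0.621, 0.578)
)

# Drop from the best resampled roc_auc to the final roc_auc.
results$auc_drop <- results$resampled_auc - results$final_auc

# Binomial standard error of an accuracy estimated on n_test rows.
results$acc_se <- round(
  sqrt(results$final_acc * (1 - results$final_acc) / n_test), 3
)
print(results)
```

Both recipes lose about 0.06 in roc_auc relative to their resampled estimates. The accuracy gap between them (0.043) is a little over one standard error (about 0.034), so under these assumptions the final metrics alone do not clearly separate the two recipes.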

Important Predictor Plot

Predictions

## # A tibble: 206 × 1
##    .pred_class                                  
##    <fct>                                        
##  1 Exceeding or Meeting Expectations            
##  2 Partially Meeting or Not Meeting Expectations
##  3 Exceeding or Meeting Expectations            
##  4 Partially Meeting or Not Meeting Expectations
##  5 Partially Meeting or Not Meeting Expectations
##  6 Exceeding or Meeting Expectations            
##  7 Partially Meeting or Not Meeting Expectations
##  8 Partially Meeting or Not Meeting Expectations
##  9 Exceeding or Meeting Expectations            
## 10 Exceeding or Meeting Expectations            
## # … with 196 more rows