1 dlookr

1.1 Analysis

## [1] 1141   18
##  [1] "ACCESS_CNTL_CD"     "ALIGNMENT_CD"       "HWY_TYPE_CD"       
##  [4] "LIGHTING_CD"        "LOC_TYPE_CD"        "MAN_COLL_CD"       
##  [7] "PRI_CONTRIB_FAC_CD" "ROAD_COND_CD"       "ROAD_REL_CD"       
## [10] "ROAD_TYPE_CD"       "SEVERITY_CD"        "WEATHER_CD"        
## [13] "DAY_OF_WK"          "INTERSECTION"       "DR_COND_CD"        
## [16] "DR_DISTRACT_CD"     "DR_SEX"             "VehNum"
## # A tibble: 15 x 6
##    variables       types  missing_count missing_percent unique_count unique_rate
##    <chr>           <chr>          <int>           <dbl>        <int>       <dbl>
##  1 ACCESS_CNTL_CD  factor             1          0.0876            6     0.00526
##  2 ALIGNMENT_CD    factor             1          0.0876           13     0.0114 
##  3 HWY_TYPE_CD     factor             3          0.263             6     0.00526
##  4 LIGHTING_CD     factor             1          0.0876            9     0.00789
##  5 LOC_TYPE_CD     factor             1          0.0876            9     0.00789
##  6 MAN_COLL_CD     factor             3          0.263            13     0.0114 
##  7 PRI_CONTRIB_FA~ factor             1          0.0876            9     0.00789
##  8 ROAD_COND_CD    factor             1          0.0876            9     0.00789
##  9 ROAD_REL_CD     factor             1          0.0876            9     0.00789
## 10 ROAD_TYPE_CD    factor             1          0.0876            6     0.00526
## 11 SEVERITY_CD     factor             1          0.0876            5     0.00438
## 12 WEATHER_CD      factor             1          0.0876            7     0.00613
## 13 INTERSECTION    integ~             1          0.0876            3     0.00263
## 14 DR_COND_CD      factor             5          0.438            11     0.00964
## 15 DR_DISTRACT_CD  factor             9          0.789             7     0.00613
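
The `missing_percent` and `unique_rate` columns in the table above appear to be derived from the 1141 rows reported by the first dimension check. The following is a minimal base R sketch, assuming `missing_percent` is `missing_count / n * 100` and `unique_rate` is `unique_count / n`. The names `n_rows` and `diag_rows` are hypothetical, and the two rows are copied from the table for illustration.

```r
# Row count taken from the first dim() output above (1141 rows, 18 columns).
n_rows <- 1141

# Two rows copied from the diagnostic table, for illustration only.
diag_rows <- data.frame(
  variables     = c("ACCESS_CNTL_CD", "DR_DISTRACT_CD"),
  missing_count = c(1, 9),
  unique_count  = c(6, 7)
)

# Assumed definitions: percentage of missing rows and share of distinct values.
diag_rows$missing_percent <- diag_rows$missing_count / n_rows * 100
diag_rows$unique_rate     <- diag_rows$unique_count / n_rows

# Rounded to three significant digits, as in the tibble display.
print(signif(diag_rows$missing_percent, 3))
print(signif(diag_rows$unique_rate, 3))
```

Under these assumptions the rounded values agree with the corresponding rows of the table.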
## [1] 1127   18
## [1] 1141   18
## # A tibble: 0 x 6
## # ... with 6 variables: variables <chr>, types <chr>, missing_count <int>,
## #   missing_percent <dbl>, unique_count <int>, unique_rate <dbl>
## [1] "split_df"   "grouped_df" "tbl_df"     "tbl"        "data.frame"
## ** Split train/test set information **
##  + random seed        :  74341 
##  + split data            
##     - train set count :  789 
##     - test set count  :  338 
##  + target variable    :  VehNum 
##     - minority class  :  Single (0.496007)
##     - majority class  :  Multiple (0.503993)
## ** Split train/test set information **
##  + random seed        :  54666 
##  + split data            
##     - train set count :  676 
##     - test set count  :  451 
##  + target variable    :  VehNum 
##     - minority class  :  Single (0.496007)
##     - majority class  :  Multiple (0.503993)
## [1] "ROAD_COND_CD" "DR_COND_CD"
## 
## Multiple   Single 
##      568      559
## 
##  Multiple    Single 
## 0.5039929 0.4960071
## ** Split train/test set information **
##  + random seed        :  89902 
##  + split data            
##     - train set count :  789 
##     - test set count  :  338 
##  + target variable    :  VehNum 
##     - minority class  :  Single (0.496007)
##     - majority class  :  Multiple (0.503993)
## 
## Multiple   Single 
##      399      399
## 
## Multiple   Single 
##      390      390
## 
## Multiple   Single 
##     1560     1170
## -- Checking unique value --------------------------- unique value is one --
## No variables that unique value is one.
## 
## -- Checking unique rate ------------------------------- high unique rate --
## No variables that high unique rate.
## 
## -- Checking character variables ----------------------- categorical data --
## No character variables.
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## # A tibble: 5 x 5
##   step     model_id     target positive fitted_model
##   <chr>    <chr>        <chr>  <chr>    <list>      
## 1 1.Fitted logistic     VehNum Single   <glm>       
## 2 1.Fitted rpart        VehNum Single   <rpart>     
## 3 1.Fitted ctree        VehNum Single   <BinaryTr>  
## 4 1.Fitted randomForest VehNum Single   <rndmFrs.>  
## 5 1.Fitted ranger       VehNum Single   <ranger>
## # A tibble: 5 x 6
##   step        model_id     target positive fitted_model predicted  
##   <chr>       <chr>        <chr>  <chr>    <list>       <list>     
## 1 2.Predicted logistic     VehNum Single   <glm>        <fct [338]>
## 2 2.Predicted rpart        VehNum Single   <rpart>      <fct [338]>
## 3 2.Predicted ctree        VehNum Single   <BinaryTr>   <fct [338]>
## 4 2.Predicted randomForest VehNum Single   <rndmFrs.>   <fct [338]>
## 5 2.Predicted ranger       VehNum Single   <ranger>     <fct [338]>
## # A tibble: 5 x 7
##   step          model_id     target positive fitted_model predicted  performance
##   <chr>         <chr>        <chr>  <chr>    <list>       <list>     <list>     
## 1 3.Performanc~ logistic     VehNum Single   <glm>        <fct [338~ <dbl [15]> 
## 2 3.Performanc~ rpart        VehNum Single   <rpart>      <fct [338~ <dbl [15]> 
## 3 3.Performanc~ ctree        VehNum Single   <BinaryTr>   <fct [338~ <dbl [15]> 
## 4 3.Performanc~ randomForest VehNum Single   <rndmFrs.>   <fct [338~ <dbl [15]> 
## 5 3.Performanc~ ranger       VehNum Single   <ranger>     <fct [338~ <dbl [15]>
## $logistic
## ZeroOneLoss    Accuracy   Precision      Recall Sensitivity Specificity 
##  0.09171598  0.90828402  0.89204545  0.92899408  0.92899408  0.88757396 
##    F1_Score Fbeta_Score     LogLoss         AUC        Gini       PRAUC 
##  0.91014493  0.91014493  0.65357680  0.93890270  0.87780540  0.88463772 
##     LiftAUC     GainAUC     KS_Stat 
##  1.56643581  0.71945135 82.84023669 
## 
## $rpart
## ZeroOneLoss    Accuracy   Precision      Recall Sensitivity Specificity 
##  0.08284024  0.91715976  0.92727273  0.90532544  0.90532544  0.92899408 
##    F1_Score Fbeta_Score     LogLoss         AUC        Gini       PRAUC 
##  0.91616766  0.91616766  0.23800632  0.95490354  0.90182417  0.22951214 
##     LiftAUC     GainAUC     KS_Stat 
##  0.91375852  0.72745177 83.43195266 
## 
## $ctree
## ZeroOneLoss    Accuracy   Precision      Recall Sensitivity Specificity 
##  0.07692308  0.92307692  0.91812865  0.92899408  0.92899408  0.91715976 
##    F1_Score Fbeta_Score     LogLoss         AUC        Gini       PRAUC 
##  0.92352941  0.92352941  1.35425832  0.94531004  0.89012990  0.22652407 
##     LiftAUC     GainAUC     KS_Stat 
##  0.86700550  0.72265502 84.61538462 
## 
## $randomForest
## ZeroOneLoss    Accuracy   Precision      Recall Sensitivity Specificity 
##  0.07692308  0.92307692  0.88648649  0.97041420  0.97041420  0.87573964 
##    F1_Score Fbeta_Score     LogLoss         AUC        Gini       PRAUC 
##  0.92655367  0.92655367  0.19237258  0.98074297  0.96127587  0.84309355 
##     LiftAUC     GainAUC     KS_Stat 
##  1.53558963  0.74037149 86.39053254 
## 
## $ranger
## ZeroOneLoss    Accuracy   Precision      Recall Sensitivity Specificity 
##  0.07100592  0.92899408  0.89617486  0.97041420  0.97041420  0.88757396 
##    F1_Score Fbeta_Score     LogLoss         AUC        Gini       PRAUC 
##  0.93181818  0.93181818  0.19446190  0.97783691  0.95567382  0.96932139 
##     LiftAUC     GainAUC     KS_Stat 
##  1.66149062  0.73891846 86.98224852
##                logistic       rpart       ctree randomForest      ranger
## ZeroOneLoss  0.09171598  0.08284024  0.07692308   0.07692308  0.07100592
## Accuracy     0.90828402  0.91715976  0.92307692   0.92307692  0.92899408
## Precision    0.89204545  0.92727273  0.91812865   0.88648649  0.89617486
## Recall       0.92899408  0.90532544  0.92899408   0.97041420  0.97041420
## Sensitivity  0.92899408  0.90532544  0.92899408   0.97041420  0.97041420
## Specificity  0.88757396  0.92899408  0.91715976   0.87573964  0.88757396
## F1_Score     0.91014493  0.91616766  0.92352941   0.92655367  0.93181818
## Fbeta_Score  0.91014493  0.91616766  0.92352941   0.92655367  0.93181818
## LogLoss      0.65357680  0.23800632  1.35425832   0.19237258  0.19446190
## AUC          0.93890270  0.95490354  0.94531004   0.98074297  0.97783691
## Gini         0.87780540  0.90182417  0.89012990   0.96127587  0.95567382
## PRAUC        0.88463772  0.22951214  0.22652407   0.84309355  0.96932139
## LiftAUC      1.56643581  0.91375852  0.86700550   1.53558963  1.66149062
## GainAUC      0.71945135  0.72745177  0.72265502   0.74037149  0.73891846
## KS_Stat     82.84023669 83.43195266 84.61538462  86.39053254 86.98224852
## $recommend_model
## [1] "ranger"
## 
## $top_metric_count
##     logistic        rpart        ctree randomForest       ranger 
##            0            2            0            5            7 
## 
## $mean_rank
##     logistic        rpart        ctree randomForest       ranger 
##     4.153846     3.307692     3.500000     2.346154     1.692308 
## 
## $top_metric
## $top_metric$logistic
## NULL
## 
## $top_metric$rpart
## [1] "Precision"   "Specificity"
## 
## $top_metric$ctree
## NULL
## 
## $top_metric$randomForest
## [1] "Recall"  "LogLoss" "AUC"     "Gini"    "GainAUC"
## 
## $top_metric$ranger
## [1] "ZeroOneLoss" "Accuracy"    "Recall"      "F1_Score"    "PRAUC"      
## [6] "LiftAUC"     "KS_Stat"

## [1] 0.47

## [1] 0.728

## [1] 0.47
## [1] "ranger"
##   [1] 4.717460e-02 4.901754e-01 9.911071e-01 9.798024e-01 1.684743e-01
##   [6] 9.103888e-01 5.050000e-03 4.487179e-03 4.921930e-03 4.272894e-03
##  [11] 9.888698e-01 2.189738e-02 8.134489e-01 4.699519e-03 3.036190e-02
##  [16] 3.518355e-02 9.888991e-02 9.983571e-01 1.131217e-01 9.583333e-04
##  [21] 9.818898e-01 9.828571e-01 4.720962e-02 9.794294e-01 9.852429e-01
##  [26] 8.412807e-01 6.603730e-02 5.851481e-01 2.237766e-02 2.358277e-04
##  [31] 9.988000e-01 6.531253e-02 4.494579e-02 6.887684e-01 8.235737e-01
##  [36] 8.107428e-01 9.671976e-01 1.404627e-01 9.714286e-04 0.000000e+00
##  [41] 9.881095e-01 9.943079e-01 1.233452e-02 4.307550e-02 9.290619e-01
##  [46] 9.621492e-01 9.909278e-01 9.919000e-01 8.750899e-01 4.538261e-02
##  [51] 3.111111e-03 9.958738e-01 1.111664e-04 3.333886e-04 9.744092e-02
##  [56] 1.010557e-01 4.477463e-02 8.963199e-01 8.291403e-04 4.284885e-03
##  [61] 1.927357e-02 1.781524e-02 9.569548e-01 2.753846e-03 5.904679e-01
##  [66] 9.783778e-01 4.750762e-01 4.804214e-02 9.970286e-01 9.990000e-01
##  [71] 9.857111e-01 5.227437e-02 9.934365e-01 5.421733e-01 9.910198e-01
##  [76] 1.095191e-01 1.177096e-01 1.282383e-01 8.941010e-01 8.878652e-01
##  [81] 9.696397e-01 9.985714e-01 1.577352e-03 7.634061e-01 8.889730e-01
##  [86] 6.441521e-01 9.458462e-03 1.436433e-02 1.375521e-01 1.375521e-01
##  [91] 9.028528e-01 9.252062e-01 7.848032e-01 9.780135e-01 1.808050e-03
##  [96] 1.213605e-03 1.360544e-05 9.866589e-01 9.236298e-03 9.729889e-01
## [101] 1.009109e-03 3.444500e-03 7.315826e-01 9.209965e-01 3.131960e-01
## [106] 3.573800e-01 8.896296e-03 9.953063e-01 8.132540e-02 9.127569e-01
## [111] 9.856651e-01 1.546667e-02 9.237349e-01 9.173854e-01 9.617290e-01
## [116] 1.725714e-02 3.345922e-02 9.493056e-01 9.568690e-01 8.955099e-02
## [121] 1.538951e-02 7.955056e-01 8.190476e-01 3.916661e-02 1.960329e-02
## [126] 9.973681e-01 2.268786e-02 9.909000e-01 9.107260e-01 9.966538e-01
## [131] 2.966667e-03 2.649444e-02 2.400427e-03 1.112143e-02 1.074430e-01
## [136] 8.034373e-01 9.190762e-01 9.200813e-01 9.895968e-01 9.433253e-01
## [141] 1.540588e-01 9.725079e-01 9.975476e-01 9.192231e-01 9.279762e-02
## [146] 9.514262e-01 9.611734e-01 9.785333e-01 8.481143e-01 1.560145e-01
## [151] 9.756393e-01 9.980000e-01 8.605734e-01 9.785270e-01 1.949697e-02
## [156] 9.988889e-01 2.680292e-04 9.605956e-01 1.360544e-05 9.979167e-01
## [161] 2.983026e-02 2.358277e-04 1.610757e-02 9.891778e-01 1.563963e-03
## [166] 9.452242e-01 9.929405e-01 7.658016e-02 9.915905e-01 8.669864e-01
## [171] 9.428056e-01 9.091433e-01 5.948822e-01 9.944444e-01 9.719126e-01
## [176] 1.577352e-03 9.559390e-01 2.969997e-01 8.587786e-01 9.345663e-01
## [181] 1.521542e-03 9.483288e-01 6.086905e-02 9.917611e-01 9.452961e-01
## [186] 9.790135e-01 3.499127e-02 7.519048e-03 2.468714e-01 3.066522e-02
## [191] 9.997500e-01 2.196885e-01 8.519919e-01 9.981429e-01 5.839286e-02
## [196] 8.923016e-02 1.054709e-01 4.880159e-03 3.990548e-03 2.802384e-02
## [201] 9.914675e-01 9.973778e-01 8.165759e-01 5.619073e-02 9.743667e-01
## [206] 9.856111e-01 4.121296e-02 4.742916e-01 4.409634e-01 9.068944e-01
## [211] 8.377700e-01 6.854212e-01 2.358277e-04 2.358277e-04 1.397100e-01
## [216] 9.924167e-01 4.654524e-01 8.915498e-01 7.015118e-01 1.211978e-01
## [221] 9.146012e-01 8.300992e-01 8.816736e-01 3.394935e-02 9.924222e-01
## [226] 9.273842e-01 2.316825e-02 1.402369e-01 9.060472e-01 9.339618e-01
## [231] 9.984286e-01 1.000000e+00 9.707232e-01 5.712451e-03 9.945556e-01
## [236] 1.004660e-01 6.315808e-02 9.897246e-01 1.728014e-01 6.222183e-01
## [241] 9.891724e-01 9.735651e-01 2.672839e-02 9.697913e-01 1.029881e-01
## [246] 1.097302e-02 9.022197e-01 9.667777e-01 9.894524e-01 9.924405e-01
## [251] 2.377535e-01 5.689307e-01 8.876450e-01 9.992000e-01 9.582048e-01
## [256] 8.656084e-01 1.111664e-04 1.382633e-01 2.028949e-01 9.791535e-01
## [261] 9.236499e-01 8.996796e-01 3.882388e-02 3.882388e-02 9.547524e-01
## [266] 8.450178e-01 2.609912e-01 9.988000e-01 9.773190e-01 7.736065e-01
## [271] 9.335687e-01 9.863254e-01 9.671817e-01 2.312659e-02 8.932545e-01
## [276] 9.277675e-01 9.619553e-01 1.726032e-02 9.712571e-01 0.000000e+00
## [281] 1.786938e-01 1.737914e-01 9.452588e-01 9.960000e-01 3.254225e-02
## [286] 9.917761e-01 9.978889e-01 9.351703e-01 8.815211e-01 1.442547e-02
## [291] 8.105873e-02 8.105873e-02 8.105873e-02 4.842577e-02 2.828738e-01
## [296] 2.427265e-01 9.960500e-01 6.123563e-01 4.138889e-03 1.153475e-03
## [301] 4.366820e-02 5.842063e-01 5.842063e-01 2.772965e-04 4.547931e-02
## [306] 9.593230e-01 5.620915e-04 7.984429e-01 9.870437e-01 7.375418e-01
## [311] 9.948135e-01 9.495905e-01 4.073474e-01 9.279041e-01 1.910037e-02
## [316] 3.277772e-01 2.699981e-01 8.299181e-01 9.215278e-01 8.108277e-04
## [321] 9.983111e-01 5.680556e-03 2.666667e-03 7.981648e-01 1.285828e-03
## [326] 7.645923e-01 8.768716e-01 9.242766e-01 2.424277e-01 8.494795e-01
## [331] 9.725579e-01 5.018093e-01 6.350997e-03 5.714286e-04 1.298801e-02
## [336] 1.298801e-02 9.932405e-01 9.656508e-01
## [1] 0.9289941
## [1] 0.9319527
##           actual
## predict    Multiple Single
##   Multiple      150      5
##   Single         19    164
##           actual
## predict    Multiple Single
##   Multiple      149      3
##   Single         20    166
## [1] 0.9318182
## [1] 0.9352113
## [1] 0.9230769
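
The two confusion matrices above can be related to some of the scalar values printed around them. The sketch below is base R only and rests on assumptions: that `Single` is the positive class (as in the model tables), that 0.9289941 and 0.9319527 are accuracies, and that 0.9318182 and 0.9352113 are F1 scores. The helper names `accuracy` and `f1_score` are hypothetical and are not part of any package used here.

```r
# Confusion matrices copied from the output above
# (rows = predicted class, columns = actual class; filled column by column).
labels <- c("Multiple", "Single")
cm1 <- matrix(c(150, 19, 5, 164), nrow = 2,
              dimnames = list(predict = labels, actual = labels))
cm2 <- matrix(c(149, 20, 3, 166), nrow = 2,
              dimnames = list(predict = labels, actual = labels))

# Accuracy: correctly classified cases over all cases.
accuracy <- function(cm) sum(diag(cm)) / sum(cm)

# F1 score for the positive class: harmonic mean of precision and recall.
f1_score <- function(cm, positive = "Single") {
  tp        <- cm[positive, positive]
  precision <- tp / sum(cm[positive, ])  # share of predicted positives that are correct
  recall    <- tp / sum(cm[, positive])  # share of actual positives that are found
  2 * precision * recall / (precision + recall)
}

print(c(accuracy(cm1), accuracy(cm2)))
print(c(f1_score(cm1), f1_score(cm2)))
```

Under these assumptions the first matrix gives 314/338 for accuracy and 328/352 for F1, and the second gives 315/338 and 332/355, which match the printed values.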
## -- Checking unique value --------------------------- unique value is one --
## No variables that unique value is one.
## 
## -- Checking unique rate ------------------------------- high unique rate --
## No variables that high unique rate.
## 
## -- Checking character variables ----------------------- categorical data --
## No character variables.
##  [1] Single   Single   Single   Multiple Multiple Multiple Single   Multiple
##  [9] Multiple Single   Single   Single   Single   Single   Multiple Single  
## [17] Multiple Multiple Single   Single   Single   Single   Single   Single  
## [25] Multiple Multiple Multiple Single   Single   Single   Single   Multiple
## [33] Single   Single   Multiple Single   Multiple Multiple Single   Multiple
## [41] Multiple Multiple Single   Single   Multiple Multiple Single   Multiple
## [49] Multiple Multiple
## Levels: Multiple Single
##  [1] Single   Single   Single   Multiple Multiple Multiple Single   Multiple
##  [9] Multiple Single   Single   Single   Single   Single   Multiple Single  
## [17] Multiple Multiple Single   Single   Single   Single   Single   Single  
## [25] Multiple Multiple Multiple Single   Single   Single   Single   Multiple
## [33] Single   Single   Multiple Single   Multiple Multiple Single   Multiple
## [41] Multiple Multiple Single   Single   Multiple Multiple Single   Multiple
## [49] Multiple Multiple
## Levels: Multiple Single
## [1] 0