## Welcome to DALEX (version: 1.0).
## Find examples and detailed introduction at: https://pbiecek.github.io/ema/
## Additional features will be available after installation of: ggpubr.
## Use 'install_dependencies()' to get all suggested dependencies
## [1] "CRASH_NUM1" "NARRATIVE" "ACCESS_CNTL_CD"
## [4] "ALIGNMENT_CD" "HWY_TYPE_CD" "INVEST_AGENCY_CD"
## [7] "LIGHTING_CD" "LOC_TYPE_CD" "MAN_COLL_CD"
## [10] "PRI_CONTRIB_FAC_CD" "ROAD_COND_CD" "ROAD_REL_CD"
## [13] "ROAD_TYPE_CD" "SEC_CONTRIB_FAC_CD" "SEVERITY_CD"
## [16] "SURF_COND_CD" "SURF_TYPE_CD" "WEATHER_CD"
## [19] "CRASH_DATE" "CRASH_TIME" "CR_MONTH"
## [22] "CR_HOUR" "DAY_OF_WK" "INTERSECTION"
## [25] "NUM_VEH" "LAT" "LONG"
## [28] "PARISH_CD" "CITY_CD" "TIME_AMB_ARR"
## [31] "TIME_AMB_ARR_HOSP" "HIT_AND_RUN"
## 'data.frame': 338 obs. of 32 variables:
## $ CRASH_NUM1 : Factor w/ 338 levels "LA10_100109200922477",..: 1 2 3 4 8 5 6 7 13 9 ...
## $ NARRATIVE : Factor w/ 338 levels "-----on sunday august 4, 2012, corporal matthew cleland #9506 responded to a single vehicle crash with injuries"| __truncated__,..: 314 63 212 163 290 321 187 172 323 292 ...
## $ ACCESS_CNTL_CD : Factor w/ 4 levels "A","B","C","Z": 1 2 1 1 1 1 1 1 1 1 ...
## $ ALIGNMENT_CD : Factor w/ 9 levels "A","B","C","D",..: 1 1 1 1 1 1 1 4 1 1 ...
## $ HWY_TYPE_CD : Factor w/ 5 levels "A","B","C","D",..: 5 4 3 3 5 5 4 3 5 4 ...
## $ INVEST_AGENCY_CD : Factor w/ 3 levels "B","C","Z": 1 2 2 1 1 1 2 2 1 2 ...
## $ LIGHTING_CD : Factor w/ 7 levels "A","B","C","D",..: 3 1 1 1 1 1 1 4 3 1 ...
## $ LOC_TYPE_CD : Factor w/ 8 levels "A","B","C","D",..: 4 4 3 3 3 4 4 5 4 4 ...
## $ MAN_COLL_CD : Factor w/ 12 levels "A","B","C","D",..: 4 12 12 2 4 4 10 12 3 4 ...
## $ PRI_CONTRIB_FAC_CD: Factor w/ 10 levels "A","B","C","D",..: 1 2 2 1 1 2 1 1 4 10 ...
## $ ROAD_COND_CD : Factor w/ 9 levels "A","B","C","D",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ ROAD_REL_CD : Factor w/ 7 levels "A","B","C","D",..: 1 1 1 1 1 1 1 5 1 1 ...
## $ ROAD_TYPE_CD : Factor w/ 6 levels "A","B","C","D",..: 2 2 2 2 2 2 2 2 2 2 ...
## $ SEC_CONTRIB_FAC_CD: Factor w/ 10 levels "A","B","C","D",..: 4 1 2 8 4 2 2 2 10 10 ...
## $ SEVERITY_CD : Factor w/ 5 levels "A","B","C","D",..: 3 4 4 3 2 2 3 4 2 4 ...
## $ SURF_COND_CD : Factor w/ 3 levels "A","B","Y": 1 1 1 1 1 1 1 1 1 1 ...
## $ SURF_TYPE_CD : Factor w/ 5 levels "A","B","D","Y",..: 2 2 2 2 1 2 2 2 1 2 ...
## $ WEATHER_CD : Factor w/ 5 levels "A","B","C","D",..: 1 1 1 1 1 1 1 2 2 1 ...
## $ CRASH_DATE : int 40187 40225 40257 40296 40456 40379 40422 40429 40538 40475 ...
## $ CRASH_TIME : num 367 368 368 368 367 ...
## $ CR_MONTH : int 1 2 3 4 10 7 9 9 12 10 ...
## $ CR_HOUR : int 2 16 14 16 11 20 19 22 20 14 ...
## $ DAY_OF_WK : Factor w/ 7 levels "FR","MO","SA",..: 3 6 3 7 6 6 7 7 4 4 ...
## $ INTERSECTION : int 1 1 0 0 1 0 0 1 0 1 ...
## $ NUM_VEH : int 2 2 1 2 2 2 1 1 1 1 ...
## $ LAT : num 32.5 0 0 30.2 0 ...
## $ LONG : num 92.7 0 0 92.1 0 ...
## $ PARISH_CD : int 31 26 44 28 10 28 37 50 26 29 ...
## $ CITY_CD : int 4 0 0 5 4 4 0 0 5 0 ...
## $ TIME_AMB_ARR : num 367 368 368 368 367 ...
## $ TIME_AMB_ARR_HOSP : num 367 367 367 368 367 ...
## $ HIT_AND_RUN : int 0 0 0 0 0 0 0 0 1 0 ...
## [1] 332 7
## SEVERITY_CD DAY_OF_WK LIGHTING_CD HWY_TYPE_CD WEATHER_CD CR_HOUR NUM_VEH
## 1 C SA C E A 2 2
## 2 D TU A D A 16 2
## 3 D SA A C A 14 1
## 4 C WE A C A 16 2
## 5 B TU A E A 11 2
## 6 B TU A E A 20 2
## randomForest 4.6-14
## Type rfNews() to see new features/changes/bug fixes.
##
## Attaching package: 'randomForest'
## The following object is masked from 'package:ranger':
##
## importance
## Warning in randomForest.default(m, y, ...): The response has five or fewer
## unique values. Are you sure you want to do regression?
##
## Call:
## randomForest(formula = SEVERITY_CD == "A" ~ ., data = mn02)
## Type of random forest: regression
## Number of trees: 500
## No. of variables tried at each split: 2
##
## Mean of squared residuals: 0.003575175
## % Var explained: -19.05
## Preparation of a new explainer is initiated
## -> model label : Random Forest v7
## -> data : 332 rows 6 cols
## -> target variable : 332 values
## -> model_info : package randomForest , ver. 4.6.14 , task regression ( default )
## -> predict function : yhat.randomForest will be used ( default )
## -> predicted values : numerical, min = -8.673617e-18 , mean = 0.003478108 , max = 0.2283204
## -> residual function : difference between y and yhat ( default )
## -> residuals : numerical, min = -0.2283204 , mean = -0.0004660603 , max = 0.7716796
## A new explainer has been created!
## variable mean_dropout_loss label
## 1 _full_model_ 2.507805 Random Forest v7
## 2 LIGHTING_CD 2.578274 Random Forest v7
## 3 WEATHER_CD 2.665051 Random Forest v7
## 4 HWY_TYPE_CD 2.848549 Random Forest v7
## 5 CR_HOUR 3.199140 Random Forest v7
## 6 NUM_VEH 3.271387 Random Forest v7

## Top profiles :
## _vname_ _label_ _x_ _yhat_ _ids_
## 1 CR_HOUR Random Forest v7 1.00 0.002116324 0
## 2 CR_HOUR Random Forest v7 1.31 0.002116324 0
## 3 CR_HOUR Random Forest v7 2.00 0.002116324 0
## 4 CR_HOUR Random Forest v7 2.93 0.002116324 0
## 5 CR_HOUR Random Forest v7 3.72 0.002116324 0
## 6 CR_HOUR Random Forest v7 7.55 0.002116324 0

## 'variable_type' changed to 'categorical' due to lack of numerical variables.


## Loading required package: Hmisc
## Loading required package: lattice
## Loading required package: survival
## Loading required package: Formula
## Loading required package: ggplot2
##
## Attaching package: 'ggplot2'
## The following object is masked from 'package:randomForest':
##
## margin
##
## Attaching package: 'Hmisc'
## The following objects are masked from 'package:base':
##
## format.pval, units
## Loading required package: SparseM
##
## Attaching package: 'SparseM'
## The following object is masked from 'package:base':
##
## backsolve
## Preparation of a new explainer is initiated
## -> model label : Logistic regression
## -> data : 332 rows 7 cols
## -> target variable : 332 values
## -> model_info : package stats , ver. 3.6.2 , task regression ( default )
## -> predict function : function(m, x) predict(m, x, type = "fitted")
## -> predicted values : numerical, min = 1.478422e-21 , mean = 0.003017853 , max = 0.3538808
## -> residual function : difference between y and yhat ( default )
## -> residuals : numerical, min = -0.3538808 , mean = -5.804906e-06 , max = 0.7209833
## A new explainer has been created!
## Loaded gbm 2.1.5
## Distribution not specified, assuming multinomial ...
## Preparation of a new explainer is initiated
## -> model label : Generalized Boosted Models
## -> data : 332 rows 7 cols
## -> target variable : 332 values
## -> model_info : package gbm , ver. 2.1.5 , task regression ( default )
## -> predict function : function(m, x) predict(m, x, n.trees = 15000, type = "response")
## Warning in predict.gbm(m, x, n.trees = 15000, type = "response"): Number of
## trees not specified or exceeded number fit so far. Using 1500.
## -> predicted values : predict function returns multiple columns: 2 ( WARNING ) some of functionalities may not work
## -> residual function : difference between y and yhat ( default )
## Warning in predict.gbm(m, x, n.trees = 15000, type = "response"): Number of
## trees not specified or exceeded number fit so far. Using 1500.
## -> residuals : numerical, min = -1 , mean = -0.496988 , max = 0.538872
## A new explainer has been created!
##
## Attaching package: 'e1071'
## The following object is masked from 'package:Hmisc':
##
## impute
## Preparation of a new explainer is initiated
## -> model label : Support Vector Machines
## -> data : 332 rows 7 cols
## -> target variable : 332 values
## -> model_info : package e1071 , ver. 1.7.3 , task classification ( default )
## -> predict function : yhat.svm will be used ( default )
## -> predicted values : numerical, min = 0.0004821902 , mean = 0.003251029 , max = 0.00693297
## -> residual function : difference between y and yhat ( default )
## -> residuals : numerical, min = -0.00693297 , mean = -0.0002389809 , max = 0.993067
## A new explainer has been created!

