Model Explainability

0.1 Model Explainability

Model Explainability with H2O

[Reference 1] http://docs.h2o.ai/h2o/latest-stable/h2o-docs/explain.html

The h2o.explain() function in R

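The H2O explainability interface is a convenient wrapper around several of H2O's explainability methods and visualizations.

The main functions, h2o.explain() and h2o.explain_row() (local explanation), work on an individual H2O model, a list of models, or an H2O AutoML object.

h2o.explain() generates a list of explanations, where each explanation is an individual unit such as a partial dependence plot or a variable importance plot.

Each of these explanations can also be generated by an individual utility function outside of h2o.explain(). The R interface uses the ggplot2 package as its visualization engine; the Python interface uses matplotlib.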

0.1.1 Packages

library(h2o)

## Warning: package 'h2o' was built under R version 4.0.3

# Initial setup: start (or connect to) a local H2O cluster
h2o.init()

##  Connection successful!
## 
## R is connected to the H2O cluster: 
##     H2O cluster uptime:         49 seconds 650 milliseconds 
##     H2O cluster timezone:       Asia/Seoul 
##     H2O data parsing timezone:  UTC 
##     H2O cluster version:        3.32.0.1 
##     H2O cluster version age:    1 month and 2 days  
##     H2O cluster name:           H2O_started_from_R_user_jna970 
##     H2O cluster total nodes:    1 
##     H2O cluster total memory:   3.97 GB 
##     H2O cluster total cores:    4 
##     H2O cluster allowed cores:  4 
##     H2O cluster healthy:        TRUE 
##     H2O Connection ip:          localhost 
##     H2O Connection port:        54321 
##     H2O Connection proxy:       NA 
##     H2O Internal Security:      FALSE 
##     H2O API Extensions:         Amazon S3, Algos, AutoML, Core V3, TargetEncoder, Core V4 
##     R Version:                  R version 4.0.2 (2020-06-22)

0.1.2 Data Import

# Import wine quality dataset
f <- "http://h2o-public-test-data.s3.amazonaws.com/smalldata/wine/winequality-redwhite-no-BOM.csv"
df <- h2o.importFile(f)


df

##   fixed acidity volatile acidity citric acid residual sugar chlorides
## 1           7.0             0.27        0.36           20.7     0.045
## 2           6.3             0.30        0.34            1.6     0.049
## 3           8.1             0.28        0.40            6.9     0.050
## 4           7.2             0.23        0.32            8.5     0.058
## 5           7.2             0.23        0.32            8.5     0.058
## 6           8.1             0.28        0.40            6.9     0.050
##   free sulfur dioxide total sulfur dioxide density   pH sulphates alcohol
## 1                  45                  170  1.0010 3.00      0.45     8.8
## 2                  14                  132  0.9940 3.30      0.49     9.5
## 3                  30                   97  0.9951 3.26      0.44    10.1
## 4                  47                  186  0.9956 3.19      0.40     9.9
## 5                  47                  186  0.9956 3.19      0.40     9.9
## 6                  30                   97  0.9951 3.26      0.44    10.1
##   quality  type
## 1       6 white
## 2       6 white
## 3       6 white
## 4       6 white
## 5       6 white
## 6       6 white
## 
## [6497 rows x 13 columns]

0.1.3 Data Munging

# Response column
y <- "quality"

# Split into train & test
splits <- h2o.splitFrame(df, ratios = 0.8, seed = 1)
train <- splits[[1]]
test <- splits[[2]]

0.1.4 Modeling

# Run AutoML with a 30-second time budget (max_runtime_secs = 30)
aml <- h2o.automl(y = y, training_frame = train, max_runtime_secs = 30, seed = 1)

## 09:16:34.314: AutoML: XGBoost is not available; skipping it.

aml

## AutoML Details
## ==============
## Project Name: AutoML_20201111_91634283 
## Leader Model ID: StackedEnsemble_AllModels_AutoML_20201111_091634 
## Algorithm: stackedensemble 
## 
## Total Number of Models Trained: 21 
## Start Time: 2020-11-11 09:16:34 UTC 
## End Time: 2020-11-11 09:17:07 UTC 
## Duration: 33 s
## 
## Leaderboard
## ===========
##                                               model_id mean_residual_deviance
## 1     StackedEnsemble_AllModels_AutoML_20201111_091634              0.4099079
## 2  StackedEnsemble_BestOfFamily_AutoML_20201111_091634              0.4100041
## 3           GBM_grid__1_AutoML_20201111_091634_model_2              0.4270992
## 4           GBM_grid__1_AutoML_20201111_091634_model_4              0.4435748
## 5                         GBM_4_AutoML_20201111_091634              0.4475602
## 6           GBM_grid__1_AutoML_20201111_091634_model_5              0.4495944
## 7           GBM_grid__1_AutoML_20201111_091634_model_3              0.4632410
## 8                         GBM_3_AutoML_20201111_091634              0.4678701
## 9                         GBM_5_AutoML_20201111_091634              0.4756958
## 10          GBM_grid__1_AutoML_20201111_091634_model_1              0.4804408
##         rmse       mse       mae      rmsle
## 1  0.6402405 0.4099079 0.4786649 0.09658789
## 2  0.6403156 0.4100041 0.4759276 0.09664738
## 3  0.6535283 0.4270992 0.4995301 0.09849460
## 4  0.6660141 0.4435748 0.5104683 0.10012070
## 5  0.6689994 0.4475602 0.5167626 0.10086721
## 6  0.6705180 0.4495944 0.5197342 0.10082819
## 7  0.6806181 0.4632410 0.5284120 0.10227777
## 8  0.6840103 0.4678701 0.5318473 0.10284119
## 9  0.6897071 0.4756958 0.5365082 0.10358894
## 10 0.6931384 0.4804408 0.5407935 0.10387706
## 
## [21 rows x 6 columns]
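
The leaderboard printed above shows only the default metric columns. As a minimal sketch, assuming an H2O version in which h2o.get_leaderboard() accepts extra_columns = "ALL", the full leaderboard (including the training_time_ms and predict_time_per_row_ms columns) could be retrieved as follows. The max_models = 3 cap is an illustrative assumption to keep the run short and is not part of the original analysis.

```r
library(h2o)
h2o.init()

# Same wine quality data and 80/20 split as in the walkthrough
f <- "http://h2o-public-test-data.s3.amazonaws.com/smalldata/wine/winequality-redwhite-no-BOM.csv"
df <- h2o.importFile(f)
splits <- h2o.splitFrame(df, ratios = 0.8, seed = 1)
train <- splits[[1]]

# Assumption: cap the run at 3 base models so the example finishes quickly
aml <- h2o.automl(y = "quality", training_frame = train, max_models = 3, seed = 1)

# Leaderboard with the optional extra columns (e.g. timing information)
lb <- h2o.get_leaderboard(aml, extra_columns = "ALL")

# Print every row instead of only the first few
print(lb, n = nrow(lb))
```

The exact models and metric values depend on the H2O version and the hardware, so they will not match the output shown in this document.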

0.1.5 Explain Models

# Automatic visualization
#  Explain leader model & compare with all AutoML models
exa <- h2o.explain(aml, test)

## Warning: replacing previous import 'vctrs::data_frame' by 'tibble::data_frame'
## when loading 'dplyr'

exa

## 
## 
## Leaderboard
## ===========
## 
## > Leaderboard shows models with their metrics. When provided with H2OAutoML object, the leaderboard shows 5-fold cross-validated metrics by default (depending on the H2OAutoML settings), otherwise it shows metrics computed on the newdata. At most 20 models are shown by default.
## 
## 
## |  | model_id | mean_residual_deviance | rmse | mse | mae | rmsle | training_time_ms | predict_time_per_row_ms
## |:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
## | **1** |StackedEnsemble_AllModels_AutoML_20201111_091634 | 0.409907884519946 | 0.640240489597422 | 0.409907884519946 | 0.478664895626517 | 0.0965878936331133 | 295 | 0.024884 | 
## | **2** |StackedEnsemble_BestOfFamily_AutoML_20201111_091634 | 0.410004129284925 | 0.64031564816497 | 0.410004129284925 | 0.47592764733454 | 0.0966473789450107 | 177 | 0.009202 | 
## | **3** |GBM_grid__1_AutoML_20201111_091634_model_2 | 0.427099214937211 | 0.653528281665921 | 0.427099214937211 | 0.499530143087204 | 0.0984946022083631 | 261 | 0.012482 | 
## | **4** |GBM_grid__1_AutoML_20201111_091634_model_4 | 0.443574751311116 | 0.666014077412119 | 0.443574751311116 | 0.510468306535198 | 0.100120697383841 | 362 | 0.023849 | 
## | **5** |GBM_4_AutoML_20201111_091634 | 0.447560202898601 | 0.668999404258779 | 0.447560202898601 | 0.516762601368897 | 0.100867209765573 | 176 | 0.005541 | 
## | **6** |GBM_grid__1_AutoML_20201111_091634_model_5 | 0.449594354700766 | 0.670517974927418 | 0.449594354700766 | 0.519734221324602 | 0.100828191681073 | 161 | 0.005553 | 
## | **7** |GBM_grid__1_AutoML_20201111_091634_model_3 | 0.463241016444974 | 0.680618113515188 | 0.463241016444974 | 0.528412043962963 | 0.102277768127452 | 275 | 0.022162 | 
## | **8** |GBM_3_AutoML_20201111_091634 | 0.467870122993106 | 0.684010323747461 | 0.467870122993106 | 0.531847294781806 | 0.102841187536772 | 164 | 0.006041 | 
## | **9** |GBM_5_AutoML_20201111_091634 | 0.475695831459985 | 0.689707062063297 | 0.475695831459985 | 0.536508190817063 | 0.103588944840026 | 169 | 0.005337 | 
## | **10** |GBM_grid__1_AutoML_20201111_091634_model_1 | 0.480440811700469 | 0.693138378464552 | 0.480440811700469 | 0.540793546117954 | 0.103877064592823 | 161 | 0.007135 | 
## | **11** |XRT_1_AutoML_20201111_091634 | 0.492106775642297 | 0.701503225682033 | 0.492106775642297 | 0.498207060278435 | 0.105631785973352 | 194 | 0.001781 | 
## | **12** |GBM_2_AutoML_20201111_091634 | 0.498405261338071 | 0.705978230073754 | 0.498405261338071 | 0.554768438434374 | 0.106071044039449 | 167 | 0.002136 | 
## | **13** |GBM_1_AutoML_20201111_091634 | 0.535197866921301 | 0.731572188455316 | 0.535197866921301 | 0.573935980359852 | 0.109552741563892 | 167 | 0.001933 | 
## | **14** |GLM_1_AutoML_20201111_091634 | 0.547635852689973 | 0.740024224394021 | 0.547635852689973 | 0.575655090946566 | 0.110643664344652 | 310 | 0.000836 | 
## | **15** |DRF_1_AutoML_20201111_091634 | 0.568292941307505 | 0.753852068583422 | 0.568292941307505 | 0.519268797011166 | 0.113315310578338 | 160 | 0.001093 | 
## | **16** |DeepLearning_grid__3_AutoML_20201111_091634_model_1 | 0.578366430164745 | 0.76050406321383 | 0.578366430164745 | 0.594584346143362 | 0.114253666020708 | 549 | 0.020479 | 
## | **17** |DeepLearning_1_AutoML_20201111_091634 | 0.605931097337828 | 0.778415761234205 | 0.605931097337828 | 0.607318193052633 | 0.116833250144689 | 54 | 0.001595 | 
## | **18** |GBM_grid__1_AutoML_20201111_091634_model_6 | 0.646439021085477 | 0.804014316467982 | 0.646439021085477 | 0.624581902620733 | 0.120125034806472 | 23 | 0.000751 | 
## | **19** |DeepLearning_grid__2_AutoML_20201111_091634_model_1 | 0.710838412812453 | 0.843112337006435 | 0.710838412812453 | 0.654546700209717 | 0.129729790588691 | 337 | 0.00511 | 
## | **20** |DeepLearning_grid__1_AutoML_20201111_091634_model_1 | 0.767899982311447 | 0.87629902562507 | 0.767899982311447 | 0.670436811332022 | 0.138171553832188 | 172 | 0.001453 | 
## 
## 
## Residual Analysis
## =================
## 
## > Residual Analysis plots the fitted values vs residuals on a test dataset. Ideally, residuals should be randomly distributed. Patterns in this plot can indicate potential problems with the model selection, e.g., using simpler model than necessary, not accounting for heteroscedasticity, autocorrelation, etc. Note that if you see "striped" lines of residuals, that is an artifact of having an integer valued (vs a real valued) response variable.

## 
## 
## Variable Importance
## ===================
## 
## > The variable importance plot shows the relative importance of the most important variables in the model.

## 
## 
## Variable Importance Heatmap
## ===========================
## 
## > Variable importance heatmap shows variable importance across multiple models. Some models in H2O return variable importance for one-hot (binary indicator) encoded versions of categorical columns (e.g. Deep Learning, XGBoost). In order for the variable importance of categorical columns to be compared across all model types we compute a summarization of the the variable importance across all one-hot encoded features and return a single variable importance for the original categorical feature. By default, the models and variables are ordered by their similarity.

## 
## 
## Model Correlation
## =================
## 
## > This plot shows the correlation between the predictions of the models. For classification, frequency of identical predictions is used. By default, models are ordered by their similarity (as computed by hierarchical clustering).

## Interpretable models: GLM_1_AutoML_20201111_091634 
## 
## 
## SHAP Summary
## ============
## 
## > SHAP summary plot shows the contribution of the features for each instance (row of data). The sum of the feature contributions and the bias term is equal to the raw prediction of the model, i.e., prediction before applying inverse link function.

## 
## 
## Partial Dependence Plots
## ========================
## 
## > Partial dependence plot (PDP) gives a graphical depiction of the marginal effect of a variable on the response. The effect of a variable is measured in change in the mean response. PDP assumes independence between the feature for which is the PDP computed and the rest.

## 
## 
## Individual Conditional Expectations
## ===================================
## 
## > An Individual Conditional Expectation (ICE) plot gives a graphical depiction of the marginal effect of a variable on the response. ICE plots are similar to partial dependence plots (PDP); PDP shows the average effect of a feature while ICE plot shows the effect for a single instance. This function will plot the effect for each decile. In contrast to the PDP, ICE plots can provide more insight, especially when there is stronger feature interaction.

#  Explain a single H2O model (e.g. leader model from AutoML)
#  exm <- h2o.explain(aml@leader, test)
#  exm
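
h2o.explain_row() is mentioned earlier as the local counterpart of h2o.explain() but is not demonstrated in this walkthrough. The following is a minimal sketch of a local explanation for one test row. It assumes a small GBM (ntrees = 20, an arbitrary illustrative value) as a stand-in for the AutoML leader so that the example runs quickly.

```r
library(h2o)
h2o.init()

# Same wine quality data and 80/20 split as in the walkthrough
f <- "http://h2o-public-test-data.s3.amazonaws.com/smalldata/wine/winequality-redwhite-no-BOM.csv"
df <- h2o.importFile(f)
splits <- h2o.splitFrame(df, ratios = 0.8, seed = 1)
train <- splits[[1]]
test <- splits[[2]]

# Assumption: a small GBM stands in for aml@leader (x defaults to all other columns)
gbm <- h2o.gbm(y = "quality", training_frame = train, ntrees = 20, seed = 1)

# Local explanation for the first row of the test frame
exr <- h2o.explain_row(gbm, test, row_index = 1)
exr
```

Passing the AutoML object instead of a single model should work the same way, since h2o.explain_row() accepts the same kinds of objects as h2o.explain().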


# Explanation Plotting Functions
# Methods for an AutoML object
#h2o.varimp_heatmap() 
#h2o.model_correlation_heatmap()
#h2o.pd_multi_plot()

# Methods for an H2O model
#h2o.residual_analysis_plot()
#h2o.varimp_plot()
#h2o.shap_explain_row_plot()
#h2o.shap_summary_plot()
#h2o.pd_plot()
#h2o.ice_plot()
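
As a hedged sketch of how the single-model plotting functions listed above can be called directly, the following builds a few of the plots without going through h2o.explain(). The small GBM (ntrees = 20) and the choice of the "alcohol" column are illustrative assumptions.

```r
library(h2o)
h2o.init()

# Same wine quality data and 80/20 split as in the walkthrough
f <- "http://h2o-public-test-data.s3.amazonaws.com/smalldata/wine/winequality-redwhite-no-BOM.csv"
df <- h2o.importFile(f)
splits <- h2o.splitFrame(df, ratios = 0.8, seed = 1)
train <- splits[[1]]
test <- splits[[2]]

# Assumption: a small GBM keeps the example fast; SHAP plots need a tree-based model
gbm <- h2o.gbm(y = "quality", training_frame = train, ntrees = 20, seed = 1)

# Each function below returns a ggplot2 object
res <- h2o.residual_analysis_plot(gbm, test)          # fitted values vs residuals
shp <- h2o.shap_summary_plot(gbm, test)               # per-row feature contributions
pd  <- h2o.pd_plot(gbm, test, column = "alcohol")     # partial dependence for one column
ice <- h2o.ice_plot(gbm, test, column = "alcohol")    # ICE curves for one column

# Printing a ggplot2 object draws it
print(pd)
```

Because the results are ordinary ggplot2 objects, they can be customized further with ggplot2 layers or saved with ggplot2::ggsave().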


# Explanation names accepted by the include_explanations / exclude_explanations parameters
# "leaderboard" (AutoML and list of models only)
# "residual_analysis" (regression only)
# "confusion_matrix" (classification only)
# "varimp" (not currently available for Stacked Ensembles)
# "varimp_heatmap"
# "model_correlation_heatmap"
# "shap_summary" (single models only)
# "pdp"
# "ice"
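
The names listed above can be passed to h2o.explain() to limit which explanations are computed. The following is a minimal sketch, assuming a small GBM (ntrees = 20, illustrative) in place of the AutoML object; include_explanations and exclude_explanations are mutually exclusive, so only one of them is used per call.

```r
library(h2o)
h2o.init()

# Same wine quality data and 80/20 split as in the walkthrough
f <- "http://h2o-public-test-data.s3.amazonaws.com/smalldata/wine/winequality-redwhite-no-BOM.csv"
df <- h2o.importFile(f)
splits <- h2o.splitFrame(df, ratios = 0.8, seed = 1)
train <- splits[[1]]
test <- splits[[2]]

# Assumption: a small GBM stands in for the AutoML object
gbm <- h2o.gbm(y = "quality", training_frame = train, ntrees = 20, seed = 1)

# Compute only variable importance and partial dependence,
# and restrict the per-column plots to a single column
ex_subset <- h2o.explain(
  gbm, test,
  columns = c("alcohol"),
  include_explanations = c("varimp", "pdp")
)

# The result is a named list; the names show which explanations were produced
names(ex_subset)
```

Restricting the explanations this way is mainly useful on larger frames, where computing every plot for every model can take noticeably longer.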