In this section we introduce the privacy, discrimination, and explainability aspects of modelling. We will start with the different ways in which we can reconcile privacy with data science modelling.
Definition of Trust
Trusting the Predictions: Whether a user trusts an individual prediction sufficiently to take some action based on it.
Trusting the Model: Whether the user trusts a model to behave in reasonable ways if deployed.
How do we Measure Trust?
These are the two main levels at which we trust a model and its explanations. But is this a good idea?
Challenges
Interpretable models are not always possible in complex situations where relationships between features and outcomes are non-linear.
Even highly accurate models can be biased or make unfair decisions, especially when trained on historical data that includes biases.
Three Must-Haves for a Good Explanation
1. Interpretable: Humans can easily interpret the reasoning
2. Faithful: Describes how the model actually works
3. Model agnostic: Can explain any classifier
Definition (LIME, Local Interpretable Model-agnostic Explanations): An algorithm that can explain the predictions of any classifier or regressor in a faithful way, by approximating it locally with an interpretable model.
GOAL: To identify an interpretable model over the interpretable representation that is locally faithful to the classifier.
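Following the original LIME paper, this goal is usually written as

\[ \xi(x) = \underset{g \in G}{\arg\min}\; \mathcal{L}(f, g, \pi_x) + \Omega(g) \]

where f is the black-box classifier, G is a class of interpretable models, \pi_x is a proximity kernel that defines the locality around the observation x (the xi in the steps below), \mathcal{L} measures how unfaithfully g approximates f within that locality, and \Omega(g) penalises the complexity of g.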
Key Idea:
Pick a model class interpretable by humans. Drawback: it is not globally faithful :(
Locally approximate the global (black-box) model. The simple model is globally bad, but locally good :)
Sparse Linear Explanations
Step 1: Sample points around xi
Step 2: Use complex model to predict labels for each sample
Step 3: Weigh samples according to distance to xi
Step 4: Learn new simple model on weighted samples
Step 5: Use simple model to explain
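As a rough sketch of these five steps in plain R (not the internals of the lime package used below), assume a fitted black-box classifier bb_model that supports predict(..., type = "prob"), a single observation xi to explain, and the training features train_x; the helper explain_locally and its arguments n_samples and kernel_width are made up for illustration.

# Minimal sketch of the five steps above (illustrative only, not the
# lime package internals). `bb_model`, `xi`, and `train_x` are assumed:
# a fitted classifier, the single row to explain, and the training features.
explain_locally <- function(bb_model, xi, train_x,
                            n_samples = 500, kernel_width = 0.75) {
  xi_num <- unlist(xi)  # works for a 1-row data frame or a numeric vector
  # Step 1: sample points around xi by perturbing each feature
  sds <- apply(train_x, 2, sd)
  samples <- as.data.frame(mapply(
    function(mu, s) rnorm(n_samples, mean = mu, sd = s),
    xi_num, sds))
  names(samples) <- names(train_x)
  # Step 2: use the complex model to predict labels for each sample
  probs <- predict(bb_model, newdata = samples, type = "prob")[, 1]
  # Step 3: weigh samples according to their distance to xi
  dists <- sqrt(rowSums(sweep(as.matrix(samples), 2, xi_num)^2))
  wts <- exp(-(dists^2) / kernel_width^2)
  # Step 4: learn a new simple (linear) model on the weighted samples
  #         (LIME proper uses a sparse fit, e.g. the lasso, at this step)
  simple <- lm(probs ~ ., data = samples, weights = wts)
  # Step 5: use the simple model to explain: its coefficients are the
  #         local feature contributions
  coef(simple)
}

In practice the lime package handles categorical features, feature selection, and the sparse local fit for us, which is what the example below relies on.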
# Load the required packages
library(caret) # provides train()
library(lime)  # provides lime() and explain()
# Split up the data set
iris_test <- iris[1:5, 1:4]
iris_train <- iris[-(1:5), 1:4]
iris_lab <- iris[[5]][-(1:5)]
# Create Random Forest model on iris data
model <- train(iris_train, iris_lab, method = 'rf')
# Create an explainer object
explainer <- lime(iris_train, model)
# Explain new observation
explanation <- explain(iris_test, explainer, n_labels = 1, n_features = 2)
# The output is provided in a consistent tabular format and includes the
# output from the model.
explanation
## # A tibble: 10 × 13
## model_type case label label_prob model_r2 model_intercept model_prediction
## <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 classificat… 1 seto… 1 0.702 0.120 0.950
## 2 classificat… 1 seto… 1 0.702 0.120 0.950
## 3 classificat… 2 seto… 1 0.689 0.122 0.941
## 4 classificat… 2 seto… 1 0.689 0.122 0.941
## 5 classificat… 3 seto… 1 0.678 0.128 0.945
## 6 classificat… 3 seto… 1 0.678 0.128 0.945
## 7 classificat… 4 seto… 1 0.687 0.122 0.950
## 8 classificat… 4 seto… 1 0.687 0.122 0.950
## 9 classificat… 5 seto… 1 0.686 0.125 0.956
## 10 classificat… 5 seto… 1 0.686 0.125 0.956
## # ℹ 6 more variables: feature <chr>, feature_value <dbl>, feature_weight <dbl>,
## # feature_desc <chr>, data <list>, prediction <list>
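The tabular output can also be visualised; assuming the same explanation object from above, the lime package's plot_features() shows the weighted features behind each explained case.

# Visualise the feature weights behind each explained case
plot_features(explanation, ncol = 2)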
Summary