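This post is a translation of parts of Christoph Molnar's web book Interpretable Machine Learning. Where necessary, I have added explanations based on my own understanding.

Local surrogate models are used to explain individual predictions of black-box machine learning models. Local interpretable model-agnostic explanations (LIME) (Ribeiro, M.T., Singh, S. and Guestrin, C., 2016) is a paper that proposes a concrete implementation of local surrogate models. A surrogate model is trained to approximate the predictions of the black-box model. Instead of fitting a global surrogate model, LIME focuses on fitting local surrogate models that explain why a single prediction was made.

The idea is quite intuitive. First, forget about the training data and imagine you only have the black-box model, into which you can feed data points and get predictions back. You can probe the black box as often as you like. Your goal is to understand why the machine learning model made a certain prediction. LIME tests what happens to the model's predictions when you feed it variations of the original data. LIME generates a new dataset consisting of perturbed samples and the corresponding black-box predictions. On this new dataset it then trains an interpretable model, weighted by the proximity of the sampled instances (observations) to the instance of interest. The interpretable model can be, for example, a LASSO or a decision tree.

The learned model should approximate the machine learning model well locally, but it does not have to do so globally. This kind of accuracy is called local fidelity.

Mathematically, a local surrogate model with an interpretability constraint can be expressed as follows: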

\[\text{explanation}(x) = \underset{g \in G}{\arg\min}\; L(f, g, \pi_x) + \Omega(g)\]

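The explanation model for instance x is the model g (e.g. a linear regression) that minimizes the loss L (e.g. mean squared error), which measures how close the explanation's predictions are to those of the original model f (e.g. xgboost), while the model complexity \(\Omega(g)\) is kept low (e.g. fewer features). G is the family of possible explanations, for example all possible linear regression models. The proximity measure \(\pi_x\) defines how large the neighborhood around instance x is that we consider for the explanation. In practice, LIME only optimizes the loss part. The user has to determine the complexity, for example by choosing the maximum number of features the linear regression model may use.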
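
As one concrete instance of the loss term, the LIME paper (Ribeiro et al., 2016) uses a locally weighted squared loss together with an exponential kernel. The symbols below are taken from that paper and are not defined elsewhere in this post: \(\mathcal{Z}\) is the set of perturbed samples, \(z'\) is the interpretable representation of a sample \(z\), \(D\) is a distance function and \(\sigma\) is the kernel width.

\[L(f, g, \pi_x) = \sum_{z, z' \in \mathcal{Z}} \pi_x(z)\,\bigl(f(z) - g(z')\bigr)^2, \qquad \pi_x(z) = \exp\!\left(-\frac{D(x, z)^2}{\sigma^2}\right)\]

Samples close to x get a weight near 1 and dominate the fit, while samples far away contribute almost nothing.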

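The recipe for fitting a local surrogate model is as follows:

Select the instance of interest for which you want an explanation of the black-box prediction.
Perturb the dataset and get the black-box predictions for these new points.
Weight the new samples according to their proximity to the instance of interest.
Fit a weighted, interpretable model on the dataset with the variations.
Explain the prediction by interpreting the local model.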
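
The recipe above can be sketched in a few lines of base R. This is a minimal illustration under stated assumptions, not the code of the lime or iml packages: the black box `black_box`, the toy training data, the kernel width of 0.5 and the plain exponential kernel are all made up for the example.

```r
# Minimal LIME-style sketch for tabular data (illustrative only).
set.seed(1)

# Hypothetical black box: any function mapping features to a prediction.
black_box <- function(x1, x2) sin(x1) + x2^2

# Toy "training data"; only its feature means and sds are used for sampling.
train <- data.frame(x1 = runif(1000, -2, 2), x2 = runif(1000, -2, 2))
mu <- colMeans(train)
sigma <- sapply(train, sd)

# 1. Select the instance of interest.
x_interest <- c(x1 = 1, x2 = -0.5)

# 2. Perturb: draw new points from per-feature normal distributions
#    and query the black box for these new points.
n <- 500
samp <- data.frame(x1 = rnorm(n, mu["x1"], sigma["x1"]),
                   x2 = rnorm(n, mu["x2"], sigma["x2"]))
samp$pred <- black_box(samp$x1, samp$x2)

# 3. Weight the samples by proximity to the instance of interest
#    (Euclidean distance on standardized features, exponential kernel;
#    the width 0.5 is an arbitrary choice for this sketch).
z <- scale(samp[, c("x1", "x2")], center = mu, scale = sigma)
z0 <- (x_interest - mu) / sigma
d <- sqrt(rowSums(sweep(z, 2, z0)^2))
samp$w <- exp(-d^2 / 0.5^2)

# 4. Fit a weighted interpretable model (here: linear regression).
local_model <- lm(pred ~ x1 + x2, data = samp, weights = w)

# 5. Interpret the local model: the coefficients are the explanation.
coef(local_model)
```

Near the chosen instance the black box increases in x1 and decreases in x2, so the local coefficients should have those signs even though the global relationship is nonlinear.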

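In practice (in the R and Python implementations), linear regression can be chosen as the interpretable surrogate model. When using LIME you have to select K, the number of features the interpretable model should use. The lower K, the easier the model is to interpret; the higher K, the higher the fidelity of the model. There are several ways to select a suitable set of K features, and a solid choice is Lasso. A Lasso model with a large regularization parameter \(\lambda\) yields a model containing only the intercept. By refitting the Lasso model repeatedly with slowly decreasing \(\lambda\), the features get nonzero weight estimates one after another. Once K features are in the model (out of p > K features in total), you have found the K features you were looking for. Other strategies are forward selection and backward elimination: you repeat the forward or backward steps until K features remain. Other interpretable models, such as a decision tree, can be used as well.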
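
As an illustration of the forward selection strategy mentioned above, here is a small base R sketch. The helper name `select_k_features` and the toy data are hypothetical, and using the weighted residual sum of squares as the selection criterion is an assumption of this sketch rather than something prescribed by LIME.

```r
# Forward selection of K features for a weighted linear surrogate (sketch).
select_k_features <- function(data, response, weights, K) {
  candidates <- setdiff(names(data), response)
  selected <- character(0)
  while (length(selected) < K && length(candidates) > 0) {
    # Weighted residual sum of squares after adding each candidate in turn.
    rss <- sapply(candidates, function(v) {
      fit <- lm(reformulate(c(selected, v), response),
                data = data, weights = weights)
      sum(weights * residuals(fit)^2)
    })
    best <- names(which.min(rss))
    selected <- c(selected, best)
    candidates <- setdiff(candidates, best)
  }
  selected
}

# Toy data: only x1 and x3 influence the (hypothetical) black-box prediction.
set.seed(1)
n <- 300
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
toy$pred <- 3 * toy$x1 - 2 * toy$x3 + rnorm(n, sd = 0.1)
w <- rep(1, n)  # uniform weights here; in LIME these would be kernel weights

selected <- select_k_features(toy, "pred", w, K = 2)
selected
```

With K = 2 the procedure stops after two steps, so the surrogate stays small no matter how many candidate features the data contains.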

How do you get the variations of the data? That depends on whether the data is text, image or tabular. For text and images, individual words or superpixels can be masked. For tabular data, LIME creates new samples by perturbing each feature individually, drawing from a normal distribution whose mean and variance are taken from that feature.

1 LIME for Tabular Data

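Tabular data is data that comes in tables, where each row represents an instance and each column a feature. LIME samples are not taken around the instance of interest, but from the mass center of the training data.

No information about the data point of interest is used during sampling. The notion of a "neighborhood" of the data point of interest only enters afterwards, when a kernel function weights the sampled points according to their distance to the data point of interest.

How sampling and local model fitting work is best explained visually: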

## Creating dataset ###########################################################
library("dplyr")
library("ggplot2")

# Define range of set
lower_x1 = -2
upper_x1 = 2
lower_x2 = -2
upper_x2 = 1

# Size of the training set for the black box classifier
n_training  = 20000
# Size for the grid to plot the decision boundaries
n_grid = 100
# Number of samples for LIME explanations
n_sample = 500


# Simulate y ~ x1 + x2
set.seed(1)
x1 = runif(n_training, min = lower_x1, max = upper_x1)
x2 = runif(n_training, min = lower_x2, max = upper_x2)

# Function for creating y values
get_y = function(x1, x2, noise_prob = 0){
  y = sign(sign(x2-1+abs(x1*2))/3 - sign(x2-.5+abs(x1*3))/3) + 1
  y = y * (1 - rbinom(length(x1), 1, prob = noise_prob))
  # flip classes
  y = 1 - y
  y
}

y = get_y(x1, x2)
# Add noise
y_noisy = get_y(x1, x2, noise_prob = 0.01)
lime_training_df = data.frame(x1=x1, x2=x2, y=as.factor(y), y_noisy=as.factor(y_noisy))

# For scaling later on
x_means = c(mean(x1), mean(x2))
x_sd = c(sd(x1), sd(x2))


# Learn model
rf = randomForest::randomForest(y_noisy ~ x1 + x2, data = lime_training_df, ntree=100)
lime_training_df$predicted = predict(rf, lime_training_df)


# The decision boundaries
grid_x1 = seq(from=lower_x1, to=upper_x1, length.out=n_grid)
grid_x2 = seq(from=lower_x2, to=upper_x2, length.out=n_grid)
grid_df = expand.grid(x1 = grid_x1, x2 = grid_x2)
grid_df$predicted = as.numeric(as.character(predict(rf, newdata = grid_df)))


# The observation to be explained
explain_x1 = 1
explain_x2 = -0.5
explain_y_model = predict(rf, newdata = data.frame(x1=explain_x1, x2=explain_x2))
df_explain = data.frame(x1=explain_x1, x2=explain_x2, y_predicted=explain_y_model)

point_explain = c(explain_x1, explain_x2)
point_explain_scaled = (point_explain - x_means) / x_sd

# Drawing the samples for the LIME explanations
x1_sample = rnorm(n_sample, x_means[1], x_sd[1])
x2_sample = rnorm(n_sample, x_means[2], x_sd[2])
df_sample = data.frame(x1 = x1_sample, x2 = x2_sample)
# Scale the samples
points_sample = apply(df_sample, 1, function(x){
  (x - x_means) / x_sd
}) %>% t

#' Get Euclidean distances of samples to the instance to be explained
#' @param point_explain Vector of scaled features
#' @param points_sample data.frame of scaled features for the sample points
#' @return Vector with distances of samples to instance to be explained
get_distances = function(point_explain, points_sample){
  # Euclidean distance (square root of the summed squared differences)
  apply(points_sample, 1, function(x){
    sqrt(sum((point_explain - x)^2))
  })
}

#' @param d Distance between center and point
#' @param kernel_width Width of kernel
kernel = function(d, kernel_width){
  sqrt(exp(-(d^2) / kernel_width^2))
}

# Add weights to the samples
kernel_width = sqrt(dim(df_sample)[2]) * 0.15
distances = get_distances(point_explain_scaled, 
                          points_sample = points_sample)

df_sample$weights = kernel(distances, kernel_width=kernel_width)

df_sample$predicted = predict(rf, newdata = df_sample)


# Trees
# mod = rpart(predicted ~ x1 + x2, data = df_sample,  weights = df_sample$weights)
# grid_df$explained = predict(mod, newdata = grid_df, type='prob')[,2]

# Logistic regression model
mod = glm(predicted ~ x1 + x2, data = df_sample,  weights = df_sample$weights, family='binomial')
grid_df$explained = predict(mod, newdata = grid_df, type='response')

# logistic decision boundary
coefs = coefficients(mod)
logistic_boundary_x1 = grid_x1
logistic_boundary_x2 = -  (1/coefs['x2']) * (coefs['(Intercept)'] + coefs['x1'] * grid_x1) 
logistic_boundary_df = data.frame(x1 = logistic_boundary_x1, x2 = logistic_boundary_x2)  
logistic_boundary_df = filter(logistic_boundary_df, x2 <= upper_x2, x2 >= lower_x2)


# Create a smaller grid for visualization of local model boundaries
x1_steps = unique(grid_df$x1)[seq(from=1, to=n_grid, length.out = 20)]
x2_steps = unique(grid_df$x2)[seq(from=1, to=n_grid, length.out = 20)]
grid_df_small = grid_df[grid_df$x1 %in% x1_steps & grid_df$x2 %in% x2_steps,]
grid_df_small$explained_class = round(grid_df_small$explained)

# define graphics theme
my_theme = function(legend.position='right'){
  theme_bw() %+replace%
    theme(legend.position=legend.position)
}

colors = c('#132B43', '#56B1F7')
# Data with some noise
p_data = ggplot(lime_training_df) +
  geom_point(aes(x=x1,y=x2,fill=y_noisy, color=y_noisy), alpha =0.3, shape=21) +
  scale_fill_manual(values = colors) +
  scale_color_manual(values = colors) +
  my_theme(legend.position = 'none')

# The decision boundaries of the learned black box classifier
p_boundaries = ggplot(grid_df) +
  geom_raster(aes(x=x1,y=x2,fill=predicted), alpha = 0.3, interpolate=TRUE) +
  my_theme(legend.position='none') +
  ggtitle('A')


# Drawing some samples
p_samples = p_boundaries +
  geom_point(data = df_sample, aes(x=x1, y=x2)) +
  scale_x_continuous(limits = c(-2, 2)) +
  scale_y_continuous(limits = c(-2, 1))
# The point to be explained
p_explain = p_samples +
  geom_point(data = df_explain, aes(x=x1,y=x2), fill = 'yellow', shape = 21, size=4) +
  ggtitle('B')

p_weighted = p_boundaries +
  geom_point(data = df_sample, aes(x=x1, y=x2, size=weights)) +
  scale_x_continuous(limits = c(-2, 2)) +
  scale_y_continuous(limits = c(-2, 1)) +
  geom_point(data = df_explain, aes(x=x1,y=x2), fill = 'yellow', shape = 21, size=4) +
  ggtitle('C')

p_boundaries_lime = ggplot(grid_df)  +
  geom_raster(aes(x=x1,y=x2,fill=predicted), alpha = 0.3, interpolate=TRUE) +
  geom_point(aes(x=x1, y=x2, color=explained), size = 2, data = grid_df_small[grid_df_small$explained_class==1,], shape=3) +
  geom_point(aes(x=x1, y=x2, color=explained), size = 2, data = grid_df_small[grid_df_small$explained_class==0,], shape=95) +
  geom_point(data = df_explain, aes(x=x1,y=x2), fill = 'yellow', shape = 21, size=4) +
  geom_line(aes(x=x1, y=x2), data =logistic_boundary_df, color = 'white', size=2) +
  my_theme(legend.position='none') + ggtitle('D')


gridExtra::grid.arrange(p_boundaries, p_explain, p_weighted, p_boundaries_lime, ncol=2)

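How LIME sampling works:

A) The black-box model predicts one of two classes given the features x1 and x2. Most data points have class 0 (dark), and the data points with class 1 form an inverted V shape (light). The plot shows the decision boundaries learned by the machine learning model. Here it is a random forest, but since LIME is model-agnostic it could be any other model. We only care about the decision boundaries.

B) The yellow point is the instance of interest that we want to explain. The black dots are data sampled, per feature, from a normal distribution whose mean and variance are taken from the training data. This only needs to be done once, and the samples can be reused for other explanations.

C) Locality is introduced by giving higher weights to points near the instance of interest.

D) The colors and signs of the grid show the classifications of the model learned locally from the weighted samples. The white line marks the decision boundary (P(class) = 0.5), where the classification of the local model changes.

As always, the devil is in the details: defining a meaningful neighborhood around a point is difficult. LIME currently uses an exponential smoothing kernel to define the neighborhood. A smoothing kernel is a function that takes two data instances and returns a proximity measure. The kernel width determines how large the neighborhood is. A small kernel width means that an instance must be very close to influence the local model, while a larger width means that instances farther away also influence the model.

If you look at LIME's Python implementation (file lime/lime_tabular.py), you will see that it uses an exponential smoothing kernel on the normalized data and that the kernel width is \(0.75\times\sqrt{ncol(train)}\). This can be problematic, and the biggest problem is that we have no way of finding the best kernel or the optimal width. And where does the value 0.75 even come from? In certain scenarios you can easily turn the explanation around by changing the kernel width, as shown in the following figure.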
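
To get a feeling for how strongly the width changes the weights, here is a small sketch. The kernel has the same functional form as the `kernel()` helper in this post's plotting code, and `default_kernel_width` simply encodes the 0.75 rule quoted above; both function names and the example distances are chosen for illustration.

```r
# Exponential smoothing kernel: weight as a function of distance d.
lime_kernel <- function(d, kernel_width) {
  sqrt(exp(-(d^2) / kernel_width^2))
}

# Width rule quoted above: 0.75 * sqrt(number of features).
default_kernel_width <- function(n_features) 0.75 * sqrt(n_features)

d <- c(0, 0.5, 1, 2)  # example distances in standardized feature space
w_narrow  <- lime_kernel(d, 0.1)
w_default <- lime_kernel(d, default_kernel_width(2))
w_wide    <- lime_kernel(d, 5)

# One row per kernel width, one column per distance.
round(rbind(w_narrow, w_default, w_wide), 3)
```

With a narrow kernel only samples that almost coincide with the instance keep a noticeable weight, whereas a wide kernel weights all four distances nearly equally, which effectively turns the local model into a global one.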

set.seed(42)
df = data.frame(x = rnorm(200, mean = 0, sd = 3))
df$x[df$x < -5] = -5
df$y = (df$x + 2)^2
df$y[df$x > 1] = -df$x[df$x > 1] + 10 + - 0.05 * df$x[df$x > 1]^2
#df$y = df$y + rnorm(nrow(df), sd = 0.05)
explain.p = data.frame(x = 1.6, y = 8.5)
w1 = kernel(get_distances(data.frame(x = explain.p$x), df), 0.1)
w2 = kernel(get_distances(data.frame(x = explain.p$x), df), 0.75)
w3 = kernel(get_distances(data.frame(x = explain.p$x), df), 2)
lm.1 = lm(y ~ x, data = df, weights = w1)
lm.2 = lm(y ~ x, data = df, weights = w2)
lm.3 = lm(y ~ x, data = df, weights = w3)
df.all = rbind(df, df, df)
df.all$lime = c(predict(lm.1), predict(lm.2), predict(lm.3))
df.all$width = factor(c(rep(c(0.1, 0.75, 2), each = nrow(df))))
ggplot(df.all, aes(x = x, y = y)) + 
  geom_line(size = 2.5) + 
  geom_rug(sides = "b") + 
  geom_line(aes(x = x, y = lime, group = width, color = width)) + 
  geom_point(data = explain.p, aes(x = x, y = y), size = 12, shape = "x") + 
  scale_color_discrete("Kernel width") + 
  scale_y_continuous("Black Box prediction")

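Explanation of the prediction for the instance x = 1.6. The predictions of the black-box model, which depends on a single feature, are shown as a black line, and the distribution of the data is shown as a rug at the bottom of the plot. Three local surrogate models with different kernel widths are computed. The resulting linear regression model depends strongly on the kernel width: does the feature have a negative, a positive or no effect at x = 1.6?

This example showed only one feature. It gets much worse in high-dimensional feature spaces. It is also very unclear whether the distance measure should treat all features equally. Is one unit of feature x1 identical to one unit of feature x2? Distance measures are quite arbitrary, and distances in different dimensions (features) might not be comparable at all.

Q. Wouldn't it be enough to standardize all features before starting?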
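
Standardizing is indeed what the plotting code earlier in this post does (via `x_means` and `x_sd`). The following sketch, with two made-up features on very different scales, shows what standardization changes. Whether one standard deviation of one feature is really "as far" as one standard deviation of another remains a judgment call, so this illustrates the question rather than settling it.

```r
# Two hypothetical features on very different scales.
set.seed(1)
dat <- data.frame(temp_c = rnorm(200, mean = 15, sd = 8),         # degrees Celsius
                  income = rnorm(200, mean = 40000, sd = 12000))  # currency units

# Euclidean distance of every row of `data` to a single row `point`.
euclid <- function(data, point) {
  sqrt(rowSums(sweep(as.matrix(data), 2, unlist(point))^2))
}

# Raw distances to the first row are dominated by the large-scale feature.
d_raw <- euclid(dat, dat[1, ])
d_income_only <- abs(dat$income - dat$income[1])
cor(d_raw, d_income_only)

# After standardization both features can contribute to the distance.
dat_std <- as.data.frame(scale(dat))
d_std <- euclid(dat_std, dat_std[1, ])
d_temp_only_std <- abs(dat_std$temp_c - dat_std$temp_c[1])
cor(d_std, d_temp_only_std)
```

On the raw scale the distance is almost a function of income alone, so the temperature has practically no say in which samples count as neighbors.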

1.1 Example

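Let us look at a concrete example. We go back to the bike rental data and turn the prediction problem into a classification problem: after accounting for the trend that bike rental has become more popular over time, we want to know whether on a given day the number of rented bikes will be above or below the trend line. You can also interpret "above" as being above the average bike count, but adjusted for the trend.

The bike rental data can be obtained here.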

data("bike")
ntree = 100
bike.train.resid = factor(resid(lm(cnt ~ days_since_2011, data = bike)) > 0, levels = c(FALSE, TRUE), labels = c('below', 'above'))
bike.train.x = bike[names(bike) != 'cnt']
model <- caret::train(bike.train.x,
  bike.train.resid,
  method = 'rf', ntree=ntree, maximise = FALSE)
n_features_lime = 2

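First we train a random forest with 100 trees on the classification task. Given seasonal and weather information, on which days will the number of rented bikes be above the trend-free average?

The explanations are created with two features. The results of the sparse local linear models fitted for two instances with different predicted classes: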

library("iml")
library("gridExtra")
instance_indices = c(295, 8)
set.seed(44)
bike.train.x$temp = round(bike.train.x$temp, 2)
pred = Predictor$new(model, data = bike.train.x, class = "above", type = "prob")
lim = LocalModel$new(pred, x.interest = bike.train.x[instance_indices[1],], k = n_features_lime)
a = plot(lim)
lim = LocalModel$new(pred, x.interest = bike.train.x[instance_indices[2],], k = n_features_lime)
b = plot(lim)
grid.arrange(a, b, ncol = 2)

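LIME explanations for two instances of the bike rental dataset. Warmer temperatures and good weather conditions have a positive effect on the prediction. The x-axis shows the feature effect: the weight \(\times\) the actual feature value.

The figure makes it clear that categorical features are easier to interpret than numerical features. One solution is to categorize the numerical features into bins.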

2 LIME for Text

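LIME for text differs from LIME for tabular data. The variations of the data are created in a different way: new texts are created by randomly removing words from the original text. The dataset is represented with one binary feature per word. A feature is 1 if the corresponding word is included in the text and 0 if it has been removed.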

2.1 Example

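In this example we classify YouTube comments as spam or ham (not spam). Each comment is one document (= one row), and each column is the number of occurrences of a given word. The black-box model here is a decision tree trained on the document word matrix. Decision trees are usually easy to understand, but in this case the tree is very deep. Instead of the decision tree, one could also use a recurrent neural network or a support vector machine trained on word2vec embeddings. Two of the remaining comments were selected for explanation.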

data("ycomments")
example_indices = c(267, 173)
texts = ycomments$CONTENT[example_indices]

Let us look at the two comments and their corresponding classes:

ycomments[example_indices, c('CONTENT', 'CLASS')]

##                                     CONTENT CLASS
## 267                       PSY is a good guy     0
## 173 For Christmas Song visit my channel! ;)     1

In the next step we create the variations of the dataset that are used in the local model. For example, some of the variations of one of the comments might look like this:

prepare_data = function(comments, trained_corpus = NULL){

  corpus = Corpus(VectorSource(comments))
  dtm = DocumentTermMatrix(corpus, control = list(removePunctuation = TRUE,
                                                  stopwords=TRUE,
                                                  stemming = FALSE,
                                                  removeNumbers = TRUE
                                                  ))

  labeledTerms = as.data.frame(as.matrix(dtm))

  # Seems that columns called break or next cause trouble
  names(labeledTerms)[names(labeledTerms) %in% c('break')] <- 'break.'
  names(labeledTerms)[names(labeledTerms) %in% c('next')] <- 'next.'
  names(labeledTerms)[names(labeledTerms) %in% c('else')] <- 'else.'


  if(!is.null(trained_corpus)){
    # Make sure only overlapping features are used
    labeledTerms = labeledTerms[intersect(colnames(labeledTerms), colnames(trained_corpus))]

    empty_corpus = trained_corpus[1, ]
    labeledTerms = data.frame(data.table::rbindlist(list(empty_corpus, labeledTerms), fill=TRUE))
    labeledTerms = labeledTerms[2:nrow(labeledTerms),]
  }
  labeledTerms
}

get_predict_fun = function(model, train_corpus){
  function(comments){
    terms = prepare_data(comments, train_corpus)
    predict(model, newdata = terms, type='prob')
  }
}

#' Tokenize sentence into words
#'
#' @param x string with sentence
#' @return list of words
tokenize = function(x){
  unlist(strsplit(x, "\\s+"))
}

#' Get a subset from a text
#'
#' @param words List of words
#' @param prob Probability with which to keep a word
#' @return List with two objects. First object is the new text. Second object is a vector
#' of length number of words with 0s and 1s, indicating whether a word is in the new
#' sentence (1) or not (0)
draw_combination =  function(words, prob=0.5){
  # Create combination
  combi = rbinom(n = length(words), size = 1, prob = prob)
  names(combi) = words
  df = data.frame(t(combi))
  # Create text
  new_text = paste(words[which(combi==1)], collapse  = ' ')
  list(text = new_text,
       combi = df)
}

#'Create variations of a text
#'
#'@param text The text
#'@param pred_fun The prediction function from the machine learning model.
#'      It should contain the complete pipeline:  take the raw text, do all the pre-processing
#'      and do the prediction. Returned prediction should be a data.frame with one column per class
#'@param prob Probability with which to keep a word
#'@param n_variations Number of variations to create
#'@param class The class for which to create the predictions
#'@return data.frame for a local linear model, containing binary features for word occurrence,
#'weights for distance to original sentence and the predictions for the chosen class.
create_variations = function(text, pred_fun, prob=0.5, n_variations = 100, class, round.to = 2){
  tokenized = tokenize(text)
  df = data.frame(lapply(tokenized, function(x) 1))
  names(df) = tokenized

  combinations = lapply(1:n_variations, function(x){
    draw_combination(tokenized, prob=prob)
  })

  texts = as.vector(sapply(combinations, function(x) x['text']))

  features = data.frame(data.table::rbindlist(sapply(combinations, function(x) x['combi'])))
  weights = round(rowSums(features) / ncol(features), round.to)
  predictions = round(pred_fun(texts)[,class], round.to)

  cbind(features, pred=predictions, weights = weights)
}

library("tm")
labeledTerms = prepare_data(ycomments$CONTENT)
labeledTerms$class = factor(ycomments$CLASS, levels = c(0,1), labels = c('no spam', 'spam'))
labeledTerms2 = prepare_data(ycomments, trained_corpus = labeledTerms)
rp = rpart::rpart(class ~ ., data = labeledTerms)
predict_fun = get_predict_fun(rp, labeledTerms)
tokenized = tokenize(texts[2])
set.seed(2)
variations = create_variations(texts[2], predict_fun, prob=0.7, n_variations = 5, class='spam')
colnames(variations) = c(tokenized, 'prob', 'weight')
example_sentence = paste(colnames(variations)[variations[2, ] == 1], collapse = ' ')

library(DT)
datatable(variations, options = list(autoWidth = TRUE))

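Each column corresponds to one word in the sentence. Each row is a variation: 1 means that the word is part of this variation and 0 means that the word has been removed. The sentence corresponding to the first variation (the one with rowname 2) is "For Song visit ;)".

Below are the local weights estimated by the LIME algorithm for the two sentences "PSY is a good guy" and "For Christmas Song visit my channel! ;)".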

get.ycomments.classifier = function(ycomments){
  labeledTerms = prepare_data(ycomments$CONTENT)
  labeledTerms$class = factor(ycomments$CLASS, levels = c(0,1), labels = c('no spam', 'spam'))
  rp = rpart::rpart(class ~ ., data = labeledTerms)
  get_predict_fun(rp, labeledTerms)
}

#' Explain the classification of a text
#'
#'@param text The text for which to explain the classification
#'@param pred_fun The prediction function from the machine learning model.
#'      It should contain the complete pipeline:  take the raw text, do all the pre-processing
#'      and do the prediction. Returned prediction should be a data.frame with one column per class
#'@param prob The probability to keep a word in the variations
#'@param n_variations The number of text variations to create
#'@param K The number of features to use for the explanation
#'@param case The ID of the observation
#'@param class The class for which to create the explanations
explain_text = function(text, pred_fun, prob=0.9, n_variations=500, K = 3, case=1, class){
  stopifnot(K >= 1)
  df = create_variations(text, pred_fun = pred_fun, prob = prob, n_variations = n_variations, class=class)
  mod = glm(pred ~ . - weights, data =df , weights=df$weights, family = 'binomial')
  coefs = coef(mod)
  coefs = coefs[names(coefs) != '(Intercept)']
  names(coefs) = tokenize(text)
  coefs = coefs[base::order(abs(coefs), decreasing = TRUE)]
  coefs = coefs[1:K]
  # Create explanation compatible to R-LIME format
  tibble(case = case,
         label = class,
         label_prob = pred_fun(text)[, class],
         model_intercept = coef(mod)['(Intercept)'],
         feature = names(coefs),
         feature_value = names(coefs),
         feature_weight = coefs,
         feature_desc = names(coefs),
         data = text,
         prediction = list(pred_fun(text)))
}

set.seed(42)
ycomments.predict = get.ycomments.classifier(ycomments)
explanations  = data.table::rbindlist(lapply(seq_along(texts), function(i) {
  explain_text(texts[i], ycomments.predict, class='spam', case=i, prob = 0.5)
})
)
explanations = data.frame(explanations)
datatable(explanations[c("case", "label_prob", "feature", "feature_weight")], options = list(autoWidth = TRUE))

LIME explanations for text classification.

The word "channel" indicates a high probability of spam.

3 LIME for Images

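LIME for images works differently from LIME for tabular data and text. Intuitively, it does not make much sense to perturb individual pixels, since more than one pixel contributes to a class. Randomly changing individual pixels would probably not change the prediction by much. Therefore, variations of the sample (the image) are created by segmenting the image into superpixels and switching superpixels off (that is, masking them). Superpixels are interconnected pixels with similar colors, and they can be switched off by replacing each pixel with a user-defined color (gray, for example, is a reasonable choice). The user can also specify the probability of switching off a superpixel in each permutation.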
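
The masking step can be illustrated with a toy example in base R. Everything here is made up for the sketch: the 6x6 grayscale "image", the fixed segmentation into four square superpixels (a real implementation would run a superpixel segmentation algorithm) and the helper name `perturb_image`.

```r
# Toy 6x6 grayscale image and a hypothetical segmentation into 4 superpixels.
set.seed(1)
img <- matrix(runif(36), nrow = 6)
segments <- matrix(0L, nrow = 6, ncol = 6)
segments[1:3, 1:3] <- 1L
segments[1:3, 4:6] <- 2L
segments[4:6, 1:3] <- 3L
segments[4:6, 4:6] <- 4L

# Create one variation: switch each superpixel off with probability p_off
# and replace its pixels with the fill value (0.5 = mid gray).
perturb_image <- function(img, segments, p_off = 0.5, fill = 0.5) {
  ids <- sort(unique(as.vector(segments)))
  on <- rbinom(length(ids), size = 1, prob = 1 - p_off)  # 1 = kept, 0 = off
  names(on) <- ids
  out <- img
  out[segments %in% ids[on == 0]] <- fill
  list(image = out, features = on)
}

v <- perturb_image(img, segments)
v$features  # binary feature vector: one entry per superpixel
```

The binary vector plays the same role as the word indicators in the text case: the local model is fitted on these on/off features, not on the raw pixels.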

3.1 Example

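Since computing LIME explanations for images is slow, the lime R package includes a precomputed example, which we use here. Image data lends itself well to visualization, and the explanations can be displayed directly on the image samples. Since an image can have several predicted labels (ordered by probability), we can explain the top n labels. For the following image, the top 3 predictions were: strawberry; candle, taper, wax light; and Granny Smith (an apple variety). In the first case, the prediction and the explanation are very reasonable. For the second prediction, it is quite interesting to see which parts of the image contributed to this class. We can conclude that the training set contains objects labeled "candle, taper, wax light" that are shiny like tomatoes.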

# Having trouble installing ImageMagick version 6.8.8 or higher on TravisCI,
# which would be required for this code. So running only locally and added the
# image manually.
# For running locally, set eval = TRUE and make sure lime is installed.
library("lime")
explanation <- .load_image_example()
plot_image_explanation(explanation)

4 Advantages

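Even if you replace the underlying machine learning model, you can still use the same local interpretable model for the explanations. Suppose the people looking at the explanations understand decision trees best. Because you use local surrogate models, you can use decision trees as explanations without actually having to use a decision tree to make the predictions. For example, suppose you use an SVM as the prediction model. If it later turns out that an xgboost model works better, you can replace the SVM with xgboost and still use a decision tree to explain the predictions.
You can build on the existing literature and experience with the interpretable model that is used as the local surrogate model.
When LASSO or short decision trees are used as the local surrogate model, the resulting explanations are short (= selective) and can be contrastive. Therefore, they make explanations that are easy for humans to understand. This is why LIME is often used when the recipient of the explanation is a lay person or someone with very little time. However, the explanations are not sufficient for complete attributions, so LIME is not well suited to compliance scenarios where a complete explanation of a prediction is legally required. Likewise, for debugging machine learning models it is more useful to know all the reasons rather than just a few (LIME fits a local model for the sample of interest).
LIME is one of the few methods that works for tabular data, text and images.
The fidelity measure (how well the interpretable model approximates the black-box predictions) tells us how reliable the interpretable model is in explaining the black-box predictions in the neighborhood of the data instance of interest.
LIME is implemented in Python and R (lime package and iml package) and is very easy to use.
The explanations created with local surrogate models can use features other than those used by the original model. This can be a big advantage of LIME over other methods, especially when the features used by the original model cannot be interpreted. A text classifier can use abstract word embeddings as features, while the explanation can be based on the presence or absence of words in a sentence. A regression model can rely on a non-interpretable transformation of some attributes, while the explanation can be created with the original attributes.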
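
The fidelity measure mentioned in the list above can be made concrete in several ways. The sketch below uses a kernel-weighted R squared, which is one possible choice and an assumption of this example; the black box `f`, the kernel width and the helper name `weighted_r2` are likewise made up.

```r
# Weighted R^2 of surrogate predictions y_hat against black-box predictions y.
weighted_r2 <- function(y, y_hat, w) {
  y_bar <- sum(w * y) / sum(w)
  1 - sum(w * (y - y_hat)^2) / sum(w * (y - y_bar)^2)
}

set.seed(1)
x <- runif(500, -3, 3)
f <- function(x) sin(x)  # hypothetical black box
dat <- data.frame(x = x, y = f(x))

x0 <- 0                                # instance of interest
w_local  <- exp(-(x - x0)^2 / 0.5^2)   # narrow kernel around x0
w_global <- rep(1, length(x))          # no locality at all

fit_local  <- lm(y ~ x, data = dat, weights = w_local)
fit_global <- lm(y ~ x, data = dat, weights = w_global)

r2_local  <- weighted_r2(dat$y, fitted(fit_local), w_local)
r2_global <- weighted_r2(dat$y, fitted(fit_global), w_global)
c(local = r2_local, global = r2_global)
```

A straight line describes the sine curve well close to x0 but poorly over the whole range, which is the sense in which a surrogate can have high local fidelity without being a good global approximation.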

5 Disadvantages

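Correctly defining the neighborhood is a very big, unsolved problem when using LIME with tabular data. For each application you have to try different kernel settings and check for yourself whether the chosen setting is appropriate. Unfortunately, this is the best advice available for finding a good kernel width.
Sampling could be improved in the current implementation of LIME. Data points are sampled from a normal distribution, ignoring the correlation between features. This can lead to unlikely data points being used to train the local interpretable model.
Another really big problem is the instability of the explanations. In one paper, the authors showed that the explanations for two very close data points differed greatly in a simulated setting.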
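
The sampling issue described above is easy to reproduce. In this sketch two made-up features are strongly correlated in the "training data", but sampling each feature from its own normal distribution (as in the tabular sampling step) loses that correlation.

```r
set.seed(1)
n <- 1000

# Toy training data with two strongly correlated features.
x1 <- rnorm(n)
x2 <- 0.9 * x1 + sqrt(1 - 0.9^2) * rnorm(n)
train <- data.frame(x1 = x1, x2 = x2)

# Per-feature normal sampling: each feature is drawn independently.
samp <- data.frame(x1 = rnorm(n, mean(train$x1), sd(train$x1)),
                   x2 = rnorm(n, mean(train$x2), sd(train$x2)))

cor_train <- cor(train$x1, train$x2)
cor_samp  <- cor(samp$x1, samp$x2)
c(train = cor_train, sampled = cor_samp)
```

Sampled points such as a high x1 combined with a low x2 hardly ever occur in the training data, yet the black box is queried there and the local model is fitted to its answers.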

Local Surrogate Models(LIME)

stat17_hb @ korea.ac.kr