1 Explanation

1.1 Brief

We will learn to use classification models on a bank marketing dataset. We want to understand the relationships among the variables, especially between y (has the client subscribed a term deposit?) and the other predictors. We also want to predict future values of y based on the historical data.

1.2 Data’s Point of View

In the following analysis we will use the Bank Marketing dataset prepared by and available from the UCI Machine Learning Repository.

2 Data Preparation

2.1 Data Set Information

The data is related to the direct marketing campaigns of a Portuguese banking institution. The marketing campaigns were based on phone calls. Often, more than one contact with the same client was required in order to assess whether the product (a bank term deposit) would be subscribed (‘yes’) or not (‘no’).

Attribute Information:

  • age (numeric)
  • job : type of job (categorical: ‘admin.’,‘blue-collar’,‘entrepreneur’,‘housemaid’,‘management’,‘retired’,‘self-employed’,‘services’,‘student’,‘technician’,‘unemployed’,‘unknown’)
  • marital : marital status (categorical: ‘divorced’,‘married’,‘single’; note: ‘divorced’ means divorced or widowed)
  • education (categorical: ‘primary’,‘secondary’,‘tertiary’,‘unknown’)
  • default: has credit in default? (binary: ‘no’,‘yes’)
  • balance: average yearly balance, in euros (numeric)
  • housing: has housing loan? (binary: ‘no’,‘yes’)
  • loan: has personal loan? (binary: ‘no’,‘yes’)
  • contact: contact communication type (categorical: ‘cellular’,‘telephone’,‘unknown’)
  • day: last contact day of the month (numeric)
  • month: last contact month of year (categorical: ‘jan’, ‘feb’, ‘mar’, …, ‘nov’, ‘dec’)
  • duration: last contact duration, in seconds (numeric). Important note: this attribute highly affects the output target (e.g., if duration=0 then y=‘no’). Yet, the duration is not known before a call is performed, and after the end of the call y is obviously known. Thus, this input should only be included for benchmark purposes and should be discarded if the intention is to have a realistic predictive model (a sketch of dropping it follows this list).
  • campaign: number of contacts performed during this campaign and for this client (numeric, includes last contact)
  • pdays: number of days that passed by after the client was last contacted from a previous campaign (numeric; -1 means the client was not previously contacted)
  • previous: number of contacts performed before this campaign and for this client (numeric)
  • poutcome: outcome of the previous marketing campaign (categorical: ‘unknown’,‘other’,‘failure’,‘success’)
  • y: has the client subscribed a term deposit? (binary: ‘yes’,‘no’)
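As noted above, a realistic model would exclude duration. A minimal sketch of how it could be dropped once the data is loaded (this assumes the bank object created in section 2.3; the benchmark models below keep duration):

# hedged sketch: a leakage-free copy of the data without `duration`;
# the original `bank` object is kept intact for the benchmark models below
bank_realistic <- subset(bank, select = -duration)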

2.2 Load Library

Load the required packages:

library(tidyverse)  # data wrangling and plotting (includes dplyr and ggplot2)
library(GGally)     # correlation plot (ggcorr)
library(car)        # regression diagnostics
library(caret)      # model training and confusion matrix
library(class)      # k-nearest neighbours
library(dplyr)      # attached by tidyverse; loaded explicitly for clarity
library(ggplot2)    # attached by tidyverse; loaded explicitly for clarity
library(gridExtra)  # arranging multiple plots
library(inspectdf)  # categorical variable inspection (inspect_cat)
library(tidymodels) # recipes and rsample for preprocessing and splitting

options(scipen = 100, max.print = 1e+06)

2.3 Load Dataset

# Read the dataset in; the file uses ";" as the field separator
bank <- read.csv("data_input2/bank.csv", sep = ";")

2.4 Explore Data

glimpse(bank)
## Rows: 4,521
## Columns: 17
## $ age       <int> 30, 33, 35, 30, 59, 35, 36, 39, 41, 43, 39, 43, 36, 20, 31, …
## $ job       <chr> "unemployed", "services", "management", "management", "blue-…
## $ marital   <chr> "married", "married", "single", "married", "married", "singl…
## $ education <chr> "primary", "secondary", "tertiary", "tertiary", "secondary",…
## $ default   <chr> "no", "no", "no", "no", "no", "no", "no", "no", "no", "no", …
## $ balance   <int> 1787, 4789, 1350, 1476, 0, 747, 307, 147, 221, -88, 9374, 26…
## $ housing   <chr> "no", "yes", "yes", "yes", "yes", "no", "yes", "yes", "yes",…
## $ loan      <chr> "no", "yes", "no", "yes", "no", "no", "no", "no", "no", "yes…
## $ contact   <chr> "cellular", "cellular", "cellular", "unknown", "unknown", "c…
## $ day       <int> 19, 11, 16, 3, 5, 23, 14, 6, 14, 17, 20, 17, 13, 30, 29, 29,…
## $ month     <chr> "oct", "may", "apr", "jun", "may", "feb", "may", "may", "may…
## $ duration  <int> 79, 220, 185, 199, 226, 141, 341, 151, 57, 313, 273, 113, 32…
## $ campaign  <int> 1, 1, 1, 4, 1, 2, 1, 2, 2, 1, 1, 2, 2, 1, 1, 2, 5, 1, 1, 1, …
## $ pdays     <int> -1, 339, 330, -1, -1, 176, 330, -1, -1, 147, -1, -1, -1, -1,…
## $ previous  <int> 0, 4, 1, 0, 0, 3, 2, 0, 0, 2, 0, 0, 0, 0, 1, 0, 0, 2, 0, 1, …
## $ poutcome  <chr> "unknown", "failure", "failure", "unknown", "unknown", "fail…
## $ y         <chr> "no", "no", "no", "no", "no", "no", "no", "no", "no", "no", …

The data has 4,521 rows and 17 columns.

2.5 Check Missing Values

First, we check for missing values:

anyNA(bank)
## [1] FALSE

Our data has no explicit missing values.
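Note that anyNA() only detects explicit NA values. In this dataset, several categorical columns use the string "unknown" to encode missing information, so tallying those values is a useful extra check (a minimal sketch):

# count "unknown" entries per character column; these act as implicit missing values
sapply(bank[sapply(bank, is.character)], function(x) sum(x == "unknown"))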

3 Exploratory Data Analysis

Exploratory data analysis is the phase where we explore the data variables and check whether there are any patterns that might indicate correlations between variables.

3.1 Target Variable Proportion

x <- inspectdf::inspect_cat(bank)
show_plot(x)

Based on the plot, the levels of our target variable are imbalanced. To avoid losing variance, I will use upsampling (rather than downsampling) to balance the proportions.
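For a numeric view of the imbalance shown in the plot, we can tabulate the target directly (a quick check, not part of the modeling pipeline):

# proportion of each class in the target variable
prop.table(table(bank$y))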

3.2 Check Correlation

GGally::ggcorr(bank, hjust = 1, layout.exp = 2, label = T, label_size = 2.9)

Based on the plot above, none of the numeric predictor variables are highly correlated with one another.
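Note that ggcorr() silently drops non-numeric columns, so the plot only covers the numeric predictors. The same information can be computed directly (a minimal sketch):

# correlation matrix of the numeric predictors only, which is what ggcorr() plots
bank %>% 
  select(where(is.numeric)) %>% 
  cor() %>% 
  round(2)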

4 Modeling

4.1 Holdout: Train-Test Split

set.seed(417)
# 80/20 train-test split, stratified on the target variable
split <- initial_split(data = bank, prop = 0.8, strata = "y")

train <- training(split)
test <- testing(split)

# preprocessing recipe: upsample the minority class (on the training data only)
# and drop near-zero-variance predictors
bank_recipe <- recipe(y ~ ., train) %>% 
  themis::step_upsample(y, seed = 417) %>%
  step_nzv(all_predictors()) %>% 
  prep()

bank_train <- juice(bank_recipe)
bank_test <- bake(bank_recipe, test) 
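To confirm that step_upsample() balanced the training data while leaving the test set untouched, we can compare the class counts (a quick sanity check):

# after upsampling, both classes in the training set should have equal counts;
# the test set keeps its original, imbalanced distribution
table(bank_train$y)
table(bank_test$y)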

4.2 Modeling: Naive Bayes

There are certain characteristics of Naive Bayes that should be considered:

  • It assumes that all features of the dataset are equally important and independent. This allows Naive Bayes to perform fast computation (the algorithm is quite simple).
  • It is prone to bias due to data scarcity. In some cases, our data may have a distribution in which scarce observations lead to probabilities approaching 0 or 1, which introduces heavy bias into the model and can lead to poor performance on unseen data.
  • It is more appropriate for data with categorical predictors, because Naive Bayes is sensitive to data scarcity and a continuous variable may contain very scarce, or even single, observations for certain values.
  • The Laplace estimator/smoothing addresses the data scarcity problem by adding a small number (usually 1) to each count in the frequency table. This ensures that every class-feature combination has a non-zero probability of occurring.

We are still going to try Naive Bayes, and its results will be compared with those of the other models. While building the model, we will also apply the Laplace estimator.
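To see what the Laplace estimator does, consider a feature level that never occurs within a class: without smoothing, its conditional probability is exactly 0 and zeroes out the entire posterior product. A small illustration with hypothetical counts (not taken from the bank data):

# hypothetical frequency counts for one feature level within one class
count_level <- 0    # times the level was observed in this class
count_class <- 500  # total observations in this class
n_levels <- 12      # number of levels the feature can take

# without smoothing: a probability of 0 wipes out the whole Naive Bayes product
count_level / count_class

# with Laplace smoothing (add 1 to every cell): small but non-zero
(count_level + 1) / (count_class + n_levels)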

library(e1071)
# model building
naive <- naiveBayes(y~. , bank_train, laplace = 1)
# model fitting
naive_pred <- predict(naive, bank_test, type = "class") # for the class prediction
naive_prob <- predict(naive, bank_test, type = "raw") # for the probability

# result
naive_table <- select(bank_test, y) %>%
  bind_cols(y_pred = naive_pred) %>% 
  bind_cols(y_eprob = round(naive_prob[,1],4)) %>%  # P(y = "no")
  bind_cols(y_pprob = round(naive_prob[,2],4))      # P(y = "yes")

# performance evaluation - confusion matrix
naive_table %>% 
  conf_mat(y, y_pred) %>% 
  autoplot(type = "heatmap")

naive_table %>%
  summarise(
    accuracy = accuracy_vec(y, y_pred),
    sensitivity = sens_vec(y, y_pred),
    specificity = spec_vec(y, y_pred),
    precision = precision_vec(y, y_pred)
  )
## # A tibble: 1 × 4
##   accuracy sensitivity specificity precision
##      <dbl>       <dbl>       <dbl>     <dbl>
## 1    0.822        0.83       0.762     0.964
# ROC
naive_roc <- data.frame(prediction=round(naive_prob[,1],4),
                      trueclass=as.numeric(naive_table$y=="no"))
head(naive_roc)
##   prediction trueclass
## 1     0.2643         1
## 2     0.9636         1
## 3     0.9680         1
## 4     0.6033         1
## 5     0.9261         1
## 6     0.9803         1
library(ROCR)
naive_roc <- ROCR::prediction(naive_roc$prediction, naive_roc$trueclass) 

# ROC curve
plot(performance(naive_roc, "tpr", "fpr"),
     main = "ROC")
abline(a = 0, b = 1)

# AUC
auc_ROCR_n <- performance(naive_roc, measure = "auc")
auc_ROCR_n <- auc_ROCR_n@y.values[[1]]
auc_ROCR_n
## [1] 0.8688929

Below, I will demonstrate model tuning by changing the classification threshold (cutoff).

library(plotly)
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
# model tuning - metrics function
metrics <- function(cutoff, prob, ref, postarget, negtarget) 
{
  # classify with the given cutoff, keeping factor levels aligned with the reference
  # so confusionMatrix() does not warn about mismatched level order
  predict <- factor(ifelse(prob >= cutoff, postarget, negtarget),
                    levels = levels(ref))
  conf <- caret::confusionMatrix(predict, ref, positive = postarget)
  acc <- conf$overall[1]
  rec <- conf$byClass[1]
  prec <- conf$byClass[3]
  spec <- conf$byClass[2]
  mat <- t(as.matrix(c(rec, acc, prec, spec))) 
  colnames(mat) <- c("recall", "accuracy", "precision", "specificity")
  return(mat)
}
co <- seq(0.01, 0.99, length = 100)  # candidate cutoffs
result <- matrix(0, 100, 4)          # container for the four metrics at each cutoff

# apply function metrics
for(i in 1:100){
  result[i,] = metrics(cutoff = co[i], 
                     prob = naive_table$y_eprob, 
                     ref = as.factor(ifelse(naive_table$y == "no", 1, 0)), 
                     postarget = "1", 
                     negtarget = "0")
}

# visualize
ggplotly(tibble("Recall" = result[,1],
           "Accuracy" = result[,2],
           "Precision" = result[,3],
           "Specificity" = result[,4],
                   "Cutoff" = co) %>% 
  gather(key = "Metrics", value = "value", 1:4) %>% 
  ggplot(aes(x = Cutoff, y = value, col = Metrics)) +
  geom_line(lwd = 1.5) +
  scale_color_manual(values = c("darkred","darkgreen","orange", "blue")) +
  scale_y_continuous(breaks = seq(0,1,0.1), limits = c(0,1)) +
  scale_x_continuous(breaks = seq(0,1,0.1)) +
  labs(title = "Tradeoff Model Performance") +
  theme_minimal() +
  theme(legend.position = "top",
        panel.grid.minor.y = element_blank(),
        panel.grid.minor.x = element_blank()))
# tuning final model
naive_table <- naive_table %>% 
  mutate(tuning_pred = as.factor(ifelse(y_eprob >= 0.445, "no", "yes")))

# metrics result
final_n <- naive_table %>%
  summarise(
    accuracy = accuracy_vec(y, tuning_pred),
    sensitivity = sens_vec(y, tuning_pred),
    specificity = spec_vec(y, tuning_pred),
    precision = precision_vec(y, tuning_pred)
  ) %>% 
  cbind(AUC = auc_ROCR_n)
final_n
##    accuracy sensitivity specificity precision       AUC
## 1 0.8430939     0.85625   0.7428571 0.9620787 0.8688929

4.3 Modeling: Decision Tree

A decision tree is a tree-based model whose major benefit is interpretability. The algorithm produces a set of rules that can be visualized in a diagram resembling a tree.

To see what this looks like, let’s build a decision tree model on our bank_train data. We can use the rpart package to build the model and the rattle and rpart.plot packages to visualize it.

library(rpart)
## 
## Attaching package: 'rpart'
## The following object is masked from 'package:dials':
## 
##     prune
library(rattle)
## Warning: package 'rattle' was built under R version 4.2.3
## Loading required package: bitops
## Rattle: A free graphical interface for data science with R.
## Version 5.5.1 Copyright (c) 2006-2021 Togaware Pty Ltd.
## Type 'rattle()' to shake, rattle, and roll your data.
library(rpart.plot)
## Warning: package 'rpart.plot' was built under R version 4.2.3
# model building
dtree <- rpart(formula = y ~ ., data = bank_train, method = "class")

fancyRpartPlot(dtree, sub = NULL)

# model fitting
dtree_pred <- predict(dtree, bank_test, type = "class")
dtree_prob <- predict(dtree, bank_test, type = "prob")

# result
dtree_table <- select(bank_test, y) %>%
  bind_cols(y_pred = dtree_pred) %>% 
  bind_cols(y_eprob = round(dtree_prob[,1],4)) %>% 
  bind_cols(y_pprob = round(dtree_prob[,2],4))

# performance evaluation - confusion matrix
dtree_table %>% 
  conf_mat(y, y_pred) %>% 
  autoplot(type = "heatmap")

dtree_table %>%
  summarise(
    accuracy = accuracy_vec(y, y_pred),
    sensitivity = sens_vec(y, y_pred),
    specificity = spec_vec(y, y_pred),
    precision = precision_vec(y, y_pred)
  )
## # A tibble: 1 × 4
##   accuracy sensitivity specificity precision
##      <dbl>       <dbl>       <dbl>     <dbl>
## 1    0.797       0.789       0.857     0.977
# ROC
dtree_roc <- data.frame(prediction=round(dtree_prob[,1],4),
                      trueclass=as.numeric(dtree_table$y=="no"))
head(dtree_roc)
##   prediction trueclass
## 1     0.1859         1
## 2     0.9374         1
## 3     0.9443         1
## 4     0.3426         1
## 5     0.9374         1
## 6     0.9374         1
dtree_roc <- ROCR::prediction(dtree_roc$prediction, dtree_roc$trueclass) 

# ROC curve
plot(performance(dtree_roc, "tpr", "fpr"),
     main = "ROC")
abline(a = 0, b = 1)

# AUC
auc_ROCR_d <- performance(dtree_roc, measure = "auc")
auc_ROCR_d <- auc_ROCR_d@y.values[[1]]
auc_ROCR_d
## [1] 0.8462798
for(i in 1:100){
  result[i,] = metrics(cutoff = co[i], 
                     prob = dtree_table$y_eprob, 
                     ref = as.factor(ifelse(dtree_table$y == "no", 1, 0)), 
                     postarget = "1", 
                     negtarget = "0")
}
ggplotly(tibble("Recall" = result[,1],
           "Accuracy" = result[,2],
           "Precision" = result[,3],
           "Specificity" = result[,4],
                   "Cutoff" = co) %>% 
  gather(key = "Metrics", value = "value", 1:4) %>% 
  ggplot(aes(x = Cutoff, y = value, col = Metrics)) +
  geom_line(lwd = 1.5) +
  scale_color_manual(values = c("darkred","darkgreen","orange", "blue")) +
  scale_y_continuous(breaks = seq(0,1,0.1), limits = c(0,1)) +
  scale_x_continuous(breaks = seq(0,1,0.1)) +
  labs(title = "Tradeoff Model Performance") +
  theme_minimal() +
  theme(legend.position = "top",
        panel.grid.minor.y = element_blank(),
        panel.grid.minor.x = element_blank()))
# tuning final model
dtree_table <- dtree_table %>% 
  mutate(tuning_pred = as.factor(ifelse(y_eprob >= 0.7, "no", "yes")))

# metrics result
final_d <- dtree_table %>%
  summarise(
    accuracy = accuracy_vec(y, tuning_pred),
    sensitivity = sens_vec(y, tuning_pred),
    specificity = spec_vec(y, tuning_pred),
    precision = precision_vec(y, tuning_pred)
  ) %>% 
  cbind(AUC = auc_ROCR_d)
final_d
##    accuracy sensitivity specificity precision       AUC
## 1 0.7966851     0.78875   0.8571429 0.9767802 0.8462798

4.4 Modeling: Random Forest

To really understand how random forest works, let’s apply the algorithm to our bank data.

# model building
set.seed(417)
ctrl <- trainControl(method="repeatedcv", number=4, repeats=4) # k-fold cross validation
forest <- train(y ~ ., data=bank_train, method="rf", trControl = ctrl)
forest
## Random Forest 
## 
## 6400 samples
##   14 predictor
##    2 classes: 'no', 'yes' 
## 
## No pre-processing
## Resampling: Cross-Validated (4 fold, repeated 4 times) 
## Summary of sample sizes: 4800, 4800, 4800, 4800, 4800, 4800, ... 
## Resampling results across tuning parameters:
## 
##   mtry  Accuracy   Kappa    
##    2    0.8912500  0.7825000
##   21    0.9636328  0.9272656
##   40    0.9583594  0.9167188
## 
## Accuracy was used to select the optimal model using the largest value.
## The final value used for the model was mtry = 21.

From the model summary, we know that the optimal number of variables considered for splitting at each tree node (mtry) is 21. We can also inspect the importance of each variable used in our random forest with varImp().

varImp(forest)
## rf variable importance
## 
##   only 20 most important variables shown (out of 40)
## 
##                   Overall
## duration          100.000
## balance            19.652
## age                19.582
## day                17.927
## poutcomesuccess    13.776
## contactunknown     10.752
## campaign            7.365
## monthoct            6.415
## previous            5.604
## monthmar            4.847
## monthjul            4.728
## housingyes          2.881
## jobstudent          2.814
## monthjun            2.629
## monthmay            2.594
## poutcomeunknown     2.583
## jobblue-collar      2.391
## monthnov            2.353
## monthfeb            2.270
## educationtertiary   2.073
plot(forest$finalModel)
legend("topright", colnames(forest$finalModel$err.rate),col=1:6,cex=0.8,fill=1:6)

forest$finalModel
## 
## Call:
##  randomForest(x = x, y = y, mtry = param$mtry) 
##                Type of random forest: classification
##                      Number of trees: 500
## No. of variables tried at each split: 21
## 
##         OOB estimate of  error rate: 2.77%
## Confusion matrix:
##       no  yes class.error
## no  3023  177   0.0553125
## yes    0 3200   0.0000000

Let’s test our random forest model on the bank_test dataset:

# model fitting
forest_pred <- predict(forest, bank_test, type = "raw")
forest_prob <- predict(forest, bank_test, type = "prob")

# result
forest_table <- select(bank_test, y) %>%
  bind_cols(y_pred = forest_pred) %>% 
  bind_cols(y_eprob = round(forest_prob[,1],4)) %>% 
  bind_cols(y_pprob = round(forest_prob[,2],4))

# performance evaluation - confusion matrix 
forest_table %>% 
  conf_mat(y, y_pred) %>% 
  autoplot(type = "heatmap")

forest_table %>%
  summarise(
    accuracy = accuracy_vec(y, y_pred),
    sensitivity = sens_vec(y, y_pred),
    specificity = spec_vec(y, y_pred),
    precision = precision_vec(y, y_pred)
  )
## # A tibble: 1 × 4
##   accuracy sensitivity specificity precision
##      <dbl>       <dbl>       <dbl>     <dbl>
## 1    0.888       0.946       0.448     0.929
# ROC
forest_roc <- data.frame(prediction=round(forest_prob[,1],4),
                      trueclass=as.numeric(forest_table$y=="no"))
head(forest_roc)
##   prediction trueclass
## 1      0.700         1
## 2      0.944         1
## 3      0.998         1
## 4      0.752         1
## 5      0.994         1
## 6      1.000         1
forest_roc <- ROCR::prediction(forest_roc$prediction, forest_roc$trueclass) 

# ROC curve
plot(performance(forest_roc, "tpr", "fpr"),
     main = "ROC")
abline(a = 0, b = 1)

# AUC
auc_ROCR_f <- performance(forest_roc, measure = "auc")
auc_ROCR_f <- auc_ROCR_f@y.values[[1]]
auc_ROCR_f
## [1] 0.8945595
for(i in 1:100){
  result[i,] = metrics(cutoff = co[i], 
                     prob = forest_table$y_eprob, 
                     ref = as.factor(ifelse(forest_table$y == "no", 1, 0)), 
                     postarget = "1", 
                     negtarget = "0")
}
ggplotly(tibble("Recall" = result[,1],
           "Accuracy" = result[,2],
           "Precision" = result[,3],
           "Specificity" = result[,4],
                   "Cutoff" = co) %>% 
  gather(key = "Metrics", value = "value", 1:4) %>% 
  ggplot(aes(x = Cutoff, y = value, col = Metrics)) +
  geom_line(lwd = 1.5) +
  scale_color_manual(values = c("darkred","darkgreen","orange", "blue")) +
  scale_y_continuous(breaks = seq(0,1,0.1), limits = c(0,1)) +
  scale_x_continuous(breaks = seq(0,1,0.1)) +
  labs(title = "Tradeoff Model Performance") +
  theme_minimal() +
  theme(legend.position = "top",
        panel.grid.minor.y = element_blank(),
        panel.grid.minor.x = element_blank()))
# tuning final model
forest_table <- forest_table %>% 
  mutate(tuning_pred = as.factor(ifelse(y_eprob >= 0.5, "no", "yes")))

# metrics result
final_f <- forest_table %>%
  summarise(
    accuracy = accuracy_vec(y, tuning_pred),
    sensitivity = sens_vec(y, tuning_pred),
    specificity = spec_vec(y, tuning_pred),
    precision = precision_vec(y, tuning_pred)
  ) %>% 
  cbind(AUC = auc_ROCR_f)
final_f
##    accuracy sensitivity specificity precision       AUC
## 1 0.8872928     0.94625   0.4380952 0.9276961 0.8945595

5 Conclusion

rbind("Naive Bayes" = final_n, "Decision Tree" = final_d, "Random Forest" = final_f)
##                accuracy sensitivity specificity precision       AUC
## Naive Bayes   0.8430939     0.85625   0.7428571 0.9620787 0.8688929
## Decision Tree 0.7966851     0.78875   0.8571429 0.9767802 0.8462798
## Random Forest 0.8872928     0.94625   0.4380952 0.9276961 0.8945595

Based on the metrics table above, the predictive model built with the Random Forest algorithm gives the best overall result. It achieves the highest accuracy (88.73%) while maintaining sensitivity (94.63%) and precision (92.77%) above 90%, and it also gives the highest AUC (89.46%), although its specificity (43.81%) is the lowest of the three models. Overall, the best model for predicting whether a client will subscribe to a term deposit is the Random Forest model.