Introduction

Customer acquisition and retention are critical for any financial institution. This project analyzes data from a bank’s marketing campaign to predict whether a customer will subscribe to a term deposit. By applying machine learning algorithms such as Decision Tree, Random Forest, and AdaBoost, this analysis seeks to identify the most effective model for improving customer targeting and campaign performance.

Objective: The goal is to identify the best-performing model by evaluating key metrics such as accuracy and AUC, ultimately guiding the bank toward more targeted and effective marketing strategies.

Approach: Three machine learning algorithms (Decision Tree, Random Forest, and AdaBoost) will be tested in default and tuned configurations. Performance will be evaluated using accuracy, precision, recall, F1-score, and AUC. The best model will be selected based on predictive accuracy and generalization capability, and business recommendations will be provided to improve future marketing strategies.

Getting Started

Load packages

Let’s load the packages.
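The chunk below is a minimal sketch inferred from the functions used throughout this report; the original package list was not shown in the rendered output, so adjust to your environment.

# Load required packages (inferred from the functions used below)
library(dplyr)         # pipes and mutate_if
library(tibble)        # results table
library(ggplot2)       # visualization
library(knitr)         # kable() table rendering
library(caret)         # createDataPartition, trainControl, confusionMatrix
library(rpart)         # decision trees
library(randomForest)  # random forests
library(ada)           # AdaBoost
library(pROC)          # roc() and AUC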

The data

After completing EDA and preprocessing, we saved the cleaned data to a new file, cleaned_data.csv, and proceed here with the next step: Experimentation & Model Training.

# Read a CSV file
bank <- read.csv("https://raw.githubusercontent.com/waheeb123/Data-622/refs/heads/main/cleaned_data.csv")

# Preview the first few rows of the dataset
kable(head(bank, 10), caption = "Preview of the Bank Dataset")
Preview of the Bank Dataset
age job marital education default balance housing loan contact day month duration campaign pdays previous poutcome Subscription contact_success_rate age_group credit_risk
30 unemployed married primary no 1787 no no cellular 19 oct 79 1 -1 0 unknown no 0.0568071 Middle-aged Medium Risk
35 management single tertiary no 1350 yes no cellular 16 apr 185 1 330 1 failure no 0.1069652 Middle-aged Medium Risk
30 management married tertiary no 1476 yes yes unknown 3 jun 199 4 -1 0 unknown no 0.0568071 Middle-aged High Risk
59 blue-collar married secondary no 0 yes no unknown 5 may 226 1 -1 0 unknown no 0.0568071 Senior Medium Risk
35 management single tertiary no 747 no no cellular 23 feb 141 2 176 3 failure no 0.1069652 Middle-aged Medium Risk
36 self-employed married tertiary no 307 yes no cellular 14 may 341 1 330 2 other no 0.1428571 Middle-aged Medium Risk
39 technician married secondary no 147 yes no cellular 6 may 151 2 -1 0 unknown no 0.0568071 Middle-aged Medium Risk
41 entrepreneur married tertiary no 221 yes no unknown 14 may 57 2 -1 0 unknown no 0.0568071 Middle-aged Medium Risk
43 services married primary no -88 yes yes cellular 17 apr 313 1 147 2 failure no 0.1069652 Middle-aged High Risk
43 admin. married secondary no 264 yes no cellular 17 apr 113 2 -1 0 unknown no 0.0568071 Middle-aged Medium Risk

Experimentation & Model Training

To evaluate the models, we split the data into 70/30 training and test sets and defined a 10-fold cross-validation control for resampling. We also prepared scaled copies of the numeric features; the tree-based models used here are scale-invariant, so they are trained on the unscaled data. We then built a Decision Tree model using default settings.

# Split data into train and test sets
set.seed(123)
trainIndex <- createDataPartition(bank$Subscription, p = 0.7, list = FALSE)
data_train <- bank[trainIndex, ]
data_test <- bank[-trainIndex, ]

# Convert 'Subscription' column to factor for classification
data_train$Subscription <- as.factor(data_train$Subscription)
data_test$Subscription <- as.factor(data_test$Subscription)
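
# Added diagnostic (not part of the original analysis): check class balance.
# The target skews heavily toward "no", which inflates raw accuracy.
prop.table(table(data_train$Subscription))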

# Scale numeric features using parameters estimated on the training set
# (scaling the test set independently would leak test-set statistics;
# the scaled copies are kept for completeness, since tree-based models
# are scale-invariant)
pre_proc <- preProcess(data_train, method = c("center", "scale"))
data_train_scaled <- predict(pre_proc, data_train)
data_test_scaled <- predict(pre_proc, data_test)

# Set up cross-validation control (10-fold cross-validation)
# Note: the models below are fit directly with rpart/randomForest/ada,
# so this control object applies only when caret::train is used
train_control <- trainControl(method = "cv", number = 10, 
                              savePredictions = "all", 
                              classProbs = TRUE, 
                              summaryFunction = twoClassSummary)
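
As a sketch of how this control object could be consumed (hypothetical, not part of the reported results), a cross-validated decision tree fit via caret would look like this:

# Hypothetical: cross-validated decision tree via caret::train,
# optimizing ROC as enabled by twoClassSummary above
dt_cv <- train(Subscription ~ ., data = data_train,
               method = "rpart",
               trControl = train_control,
               metric = "ROC")
dt_cv$results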

Experiment 1: Decision Tree (Default)

Objective: Test the default decision tree model to evaluate its baseline performance on the classification task.

Variation: No tuning applied; default settings used for benchmarking.

Variation is meaningful as it sets a baseline to measure tuning effectiveness.

Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC will be computed.

# Build Decision Tree model with default settings
dt_model <- rpart(Subscription ~ ., data = data_train, method = "class")

# Predict using test data (column 2 of the probability matrix holds the
# probability of the "yes" class)
dt_probs <- predict(dt_model, data_test, type = "prob")[, 2]
dt_preds <- predict(dt_model, data_test, type = "class")

# Evaluate metrics
# Note: confusionMatrix treats the first factor level ("no") as the
# positive class by default, so precision and recall below describe the
# majority non-subscriber class; pass positive = "yes" to score subscribers.
dt_confusion <- confusionMatrix(dt_preds, data_test$Subscription)
dt_accuracy <- dt_confusion$overall['Accuracy']
dt_precision <- dt_confusion$byClass['Pos Pred Value']
dt_recall <- dt_confusion$byClass['Sensitivity']
dt_f1 <- 2 * (dt_precision * dt_recall) / (dt_precision + dt_recall)
dt_auc <- roc(data_test$Subscription, dt_probs)$auc

cat(sprintf("\nDecision Tree (Default) - Accuracy: %.4f, Precision: %.4f, Recall: %.4f, F1-score: %.4f, AUC: %.4f\n", 
            dt_accuracy, dt_precision, dt_recall, dt_f1, dt_auc))
## 
## Decision Tree (Default) - Accuracy: 0.9301, Precision: 0.9365, Recall: 0.9912, F1-score: 0.9631, AUC: 0.7676

Conclusion: The default decision tree posts high accuracy and recall, but both are inflated by class imbalance (recall here refers to the majority "no" class); the moderate AUC shows the tree ranks prospective subscribers weakly.

Recommendation: Further tuning of cp and maxdepth or switching to ensemble methods may improve AUC.

Result logged in results table for comparison.
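
The same evaluation block recurs in every experiment below. As an added convenience (not part of the original analysis), a small helper could compute all five metrics in one call:

# Hypothetical helper: accuracy, precision, recall, F1, and AUC from
# class predictions, "yes" probabilities, and the true labels
evaluate_model <- function(preds, probs, truth) {
  cm <- confusionMatrix(preds, truth)
  precision <- unname(cm$byClass['Pos Pred Value'])
  recall <- unname(cm$byClass['Sensitivity'])
  c(Accuracy = unname(cm$overall['Accuracy']),
    Precision = precision,
    Recall = recall,
    F1 = 2 * precision * recall / (precision + recall),
    AUC = as.numeric(roc(truth, probs)$auc))
}

# Usage: evaluate_model(dt_preds, dt_probs, data_test$Subscription)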

Experiment 2: Decision Tree (Tuned)

Objective: Optimize decision tree model performance by adjusting hyperparameters.

Variation: Tuning complexity parameter (cp) and maximum tree depth (maxdepth).

Variation is meaningful since tuning cp and maxdepth impacts overfitting and model complexity.

Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC will be computed.

# Build tuned Decision Tree model
dt_tuned <- rpart(Subscription ~ ., data = data_train, method = "class", 
                  control = rpart.control(cp = 0.01, maxdepth = 5))

# Predict using test data
dt_tuned_probs <- predict(dt_tuned, data_test, type = "prob")[, 2]
dt_tuned_preds <- predict(dt_tuned, data_test, type = "class")

# Evaluate metrics
dt_tuned_confusion <- confusionMatrix(dt_tuned_preds, data_test$Subscription)
dt_tuned_accuracy <- dt_tuned_confusion$overall['Accuracy']
dt_tuned_precision <- dt_tuned_confusion$byClass['Pos Pred Value']
dt_tuned_recall <- dt_tuned_confusion$byClass['Sensitivity']
dt_tuned_f1 <- 2 * (dt_tuned_precision * dt_tuned_recall) / (dt_tuned_precision + dt_tuned_recall)
dt_tuned_auc <- roc(data_test$Subscription, dt_tuned_probs)$auc

cat(sprintf("\nDecision Tree (Tuned) - Accuracy: %.4f, Precision: %.4f, Recall: %.4f, F1-score: %.4f, AUC: %.4f\n", 
            dt_tuned_accuracy, dt_tuned_precision, dt_tuned_recall, dt_tuned_f1, dt_tuned_auc))
## 
## Decision Tree (Tuned) - Accuracy: 0.9301, Precision: 0.9365, Recall: 0.9912, F1-score: 0.9631, AUC: 0.7676

Conclusion: The tuned settings produced metrics identical to the default model. This is expected: rpart’s default cp is already 0.01, and the default tree evidently never grows deeper than 5 levels, so the constraints did not change the fitted tree.

Recommendation: Try ensemble models like Random Forest or AdaBoost for improved performance.

Result logged in results table for comparison.
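
Rather than fixing cp = 0.01 by hand, rpart’s built-in cross-validation can suggest a value; the following sketch uses standard rpart tooling on the model fitted above.

# Inspect cross-validated error across candidate cp values
printcp(dt_tuned)

# Prune at the cp value minimizing cross-validated error (xerror)
best_cp <- dt_tuned$cptable[which.min(dt_tuned$cptable[, "xerror"]), "CP"]
dt_pruned <- prune(dt_tuned, cp = best_cp)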

Experiment 3: Random Forest (Default)

Objective: Evaluate the baseline performance of a Random Forest model on the classification task.

Variation: No tuning applied; ntree set to 100 for a faster baseline (randomForest’s own default is 500).

Variation is meaningful because it establishes a benchmark for comparison with tuned models.

Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC.

# Build Random Forest model (ntree lowered to 100 for speed; other
# hyperparameters left at their defaults)
rf_model <- randomForest(Subscription ~ ., data = data_train, ntree = 100)

# Predict using test data
rf_probs <- predict(rf_model, data_test, type = "prob")[, 2]
rf_preds <- predict(rf_model, data_test, type = "class")

# Evaluate metrics
rf_confusion <- confusionMatrix(rf_preds, data_test$Subscription)
rf_accuracy <- rf_confusion$overall['Accuracy']
rf_precision <- rf_confusion$byClass['Pos Pred Value']
rf_recall <- rf_confusion$byClass['Sensitivity']
rf_f1 <- 2 * (rf_precision * rf_recall) / (rf_precision + rf_recall)
rf_auc <- roc(data_test$Subscription, rf_probs)$auc

cat(sprintf("\nRandom Forest (Default) - Accuracy: %.4f, Precision: %.4f, Recall: %.4f, F1-score: %.4f, AUC: %.4f\n", 
            rf_accuracy, rf_precision, rf_recall, rf_f1, rf_auc))
## 
## Random Forest (Default) - Accuracy: 0.9283, Precision: 0.9363, Recall: 0.9893, F1-score: 0.9621, AUC: 0.9119

Conclusion: The Random Forest model with default settings matched the decision tree on accuracy but lifted AUC substantially (0.912 vs 0.768), indicating much stronger ranking of prospective subscribers.

Recommendation: Further tuning can focus on adjusting the number of trees and the number of variables per split.

Result logged in results table for comparison.

Experiment 4: Random Forest (Tuned)

Objective: Improve Random Forest performance by tuning hyperparameters.

Variation: Increased ntree to 200 and adjusted mtry to 4.

Variation is meaningful in principle because more trees reduce variance; note, however, that with the roughly 19 predictors here, mtry = 4 equals floor(sqrt(p)), randomForest’s classification default, so the effective change is only the larger forest.

Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC.

# Build tuned Random Forest model
rf_tuned <- randomForest(Subscription ~ ., data = data_train, ntree = 200, mtry = 4)

# Predict using test data
rf_tuned_probs <- predict(rf_tuned, data_test, type = "prob")[, 2]
rf_tuned_preds <- predict(rf_tuned, data_test, type = "class")

# Evaluate metrics
rf_tuned_confusion <- confusionMatrix(rf_tuned_preds, data_test$Subscription)
rf_tuned_accuracy <- rf_tuned_confusion$overall['Accuracy']
rf_tuned_precision <- rf_tuned_confusion$byClass['Pos Pred Value']
rf_tuned_recall <- rf_tuned_confusion$byClass['Sensitivity']
rf_tuned_f1 <- 2 * (rf_tuned_precision * rf_tuned_recall) / (rf_tuned_precision + rf_tuned_recall)
rf_tuned_auc <- roc(data_test$Subscription, rf_tuned_probs)$auc

cat(sprintf("\nRandom Forest (Tuned) - Accuracy: %.4f, Precision: %.4f, Recall: %.4f, F1-score: %.4f, AUC: %.4f\n", 
            rf_tuned_accuracy, rf_tuned_precision, rf_tuned_recall, rf_tuned_f1, rf_tuned_auc))
## 
## Random Forest (Tuned) - Accuracy: 0.9265, Precision: 0.9362, Recall: 0.9873, F1-score: 0.9611, AUC: 0.9077

Conclusion: Tuning produced marginal drops in both accuracy (0.9283 to 0.9265) and AUC (0.9119 to 0.9077). Since mtry = 4 matches the default, these differences most likely reflect run-to-run randomness between forests rather than a real change in generalization.

Recommendation: Further adjustments to mtry or implementing feature selection may improve performance.

Result logged in results table for comparison.
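
One data-driven way to pick mtry is randomForest’s tuneRF, which steps mtry up and down until the out-of-bag error stops improving. A sketch, assuming categorical columns are converted to factors first:

# Hypothetical mtry search driven by out-of-bag error
predictors <- data_train %>%
  mutate(across(where(is.character), as.factor)) %>%
  select(-Subscription)

set.seed(123)
tune_res <- tuneRF(x = predictors,
                   y = data_train$Subscription,
                   ntreeTry = 200,
                   stepFactor = 1.5,  # scale mtry by this factor each step
                   improve = 0.01,    # minimum relative OOB gain to continue
                   trace = TRUE, plot = FALSE)

# Variable importance from the tuned forest can guide feature selection
importance(rf_tuned)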

Experiment 5: AdaBoost (Default)

Objective: Evaluate the baseline performance of an AdaBoost model on the classification task.

Variation: No tuning applied; using default iter = 50.

Variation is meaningful because it establishes a benchmark for comparison with tuned models.

Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC.

# AdaBoost (Default) Model
library(ada)

# Train AdaBoost model
ada_model <- ada(Subscription ~ ., data = data_train, iter = 50)

# Predict using test data
ada_probs <- predict(ada_model, data_test, type = "prob")[, 2]
ada_preds <- predict(ada_model, data_test, type = "class")

# Evaluate metrics
ada_confusion <- confusionMatrix(ada_preds, data_test$Subscription)
ada_accuracy <- ada_confusion$overall['Accuracy']
ada_precision <- ada_confusion$byClass['Pos Pred Value']
ada_recall <- ada_confusion$byClass['Sensitivity']
ada_f1 <- 2 * (ada_precision * ada_recall) / (ada_precision + ada_recall)
ada_auc <- roc(data_test$Subscription, ada_probs)$auc

cat(sprintf("\nAdaBoost (Default) - Accuracy: %.4f, Precision: %.4f, Recall: %.4f, F1-score: %.4f, AUC: %.4f\n", 
            ada_accuracy, ada_precision, ada_recall, ada_f1, ada_auc))
## 
## AdaBoost (Default) - Accuracy: 0.9265, Precision: 0.9362, Recall: 0.9873, F1-score: 0.9611, AUC: 0.9034

Conclusion: The AdaBoost model with default settings achieved strong performance, with high accuracy and AUC.

Recommendation: Further tuning of the number of iterations may improve generalization and reduce variance.

Result logged in results table for comparison.

Experiment 6: AdaBoost (Tuned)

Objective: Improve AdaBoost performance by tuning hyperparameters.

Variation: Increased the number of boosting iterations from 50 to 100.

Variation is meaningful because increasing iterations can reduce bias and improve model performance.

Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC.

# AdaBoost (Tuned) Model
# Tuning AdaBoost model - for example, increase iterations
ada_tuned_model <- ada(Subscription ~ ., data = data_train, iter = 100)

# Predict using test data
ada_tuned_probs <- predict(ada_tuned_model, data_test, type = "prob")[, 2]
ada_tuned_preds <- predict(ada_tuned_model, data_test, type = "class")

# Evaluate metrics
ada_tuned_confusion <- confusionMatrix(ada_tuned_preds, data_test$Subscription)
ada_tuned_accuracy <- ada_tuned_confusion$overall['Accuracy']
ada_tuned_precision <- ada_tuned_confusion$byClass['Pos Pred Value']
ada_tuned_recall <- ada_tuned_confusion$byClass['Sensitivity']
ada_tuned_f1 <- 2 * (ada_tuned_precision * ada_tuned_recall) / (ada_tuned_precision + ada_tuned_recall)
ada_tuned_auc <- roc(data_test$Subscription, ada_tuned_probs)$auc

cat(sprintf("\nAdaBoost (Tuned) - Accuracy: %.4f, Precision: %.4f, Recall: %.4f, F1-score: %.4f, AUC: %.4f\n", 
            ada_tuned_accuracy, ada_tuned_precision, ada_tuned_recall, ada_tuned_f1, ada_tuned_auc))
## 
## AdaBoost (Tuned) - Accuracy: 0.9265, Precision: 0.9362, Recall: 0.9873, F1-score: 0.9611, AUC: 0.9038

Conclusion: Doubling the boosting rounds left accuracy, precision, recall, and F1-score unchanged and moved AUC only marginally (0.9034 to 0.9038), suggesting the ensemble had essentially converged by 50 iterations.

Recommendation: Further adjustments to learning rate or max depth may enhance performance.

Result logged in results table for comparison.
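
The ada package also exposes a learning rate (nu) and the depth of the base trees through an rpart control object. A hypothetical follow-up along the lines of the recommendation above (settings illustrative, not validated here):

# Hypothetical: shrink each boosting step and restrict base-tree depth
ada_shrunk <- ada(Subscription ~ ., data = data_train,
                  iter = 100, nu = 0.5,
                  control = rpart.control(maxdepth = 2))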

Results and Visualization

Print the results in a table

Objective: Compare model performance across Decision Tree, Random Forest, and AdaBoost variations.

Variation: Models were tuned to assess impact on accuracy and AUC.

Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC.

# Store results for all six experiments
results <- tibble(
  Model = c("Decision Tree (Default)", "Decision Tree (Tuned)", 
            "Random Forest (Default)", "Random Forest (Tuned)", 
            "AdaBoost (Default)", "AdaBoost (Tuned)"),
  Accuracy = c(dt_accuracy, dt_tuned_accuracy, rf_accuracy, rf_tuned_accuracy, 
               ada_accuracy, ada_tuned_accuracy),
  Precision = c(dt_precision, dt_tuned_precision, rf_precision, rf_tuned_precision, 
                ada_precision, ada_tuned_precision),
  Recall = c(dt_recall, dt_tuned_recall, rf_recall, rf_tuned_recall, 
             ada_recall, ada_tuned_recall),
  F1_Score = c(dt_f1, dt_tuned_f1, rf_f1, rf_tuned_f1, 
               ada_f1, ada_tuned_f1),
  AUC = c(dt_auc, dt_tuned_auc, rf_auc, rf_tuned_auc, 
          ada_auc, ada_tuned_auc)
)

# Plot AUC Comparison
ggplot(results, aes(x = reorder(Model, AUC), y = AUC, fill = Model)) +
  geom_col(color = "black") +
  coord_flip() +
  theme_minimal() +
  labs(title = "AUC Comparison Across Models", x = "Model", y = "AUC")

The table below summarizes model performance. Random Forest showed the highest AUC, indicating the strongest ability to distinguish subscribers from non-subscribers, while AdaBoost delivered comparably balanced performance.

# Display Results
print(results)
## # A tibble: 6 × 6
##   Model                   Accuracy Precision Recall F1_Score   AUC
##   <chr>                      <dbl>     <dbl>  <dbl>    <dbl> <dbl>
## 1 Decision Tree (Default)    0.930     0.936  0.991    0.963 0.768
## 2 Decision Tree (Tuned)      0.930     0.936  0.991    0.963 0.768
## 3 Random Forest (Default)    0.928     0.936  0.989    0.962 0.912
## 4 Random Forest (Tuned)      0.927     0.936  0.987    0.961 0.908
## 5 AdaBoost (Default)         0.927     0.936  0.987    0.961 0.903
## 6 AdaBoost (Tuned)           0.927     0.936  0.987    0.961 0.904
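
Beyond the bar chart, overlaying ROC curves makes the gap between the single tree and the ensembles visible at every threshold; a sketch reusing the pROC objects computed earlier:

# Overlay test-set ROC curves for the three default models
plot(roc(data_test$Subscription, rf_probs), col = "forestgreen",
     legacy.axes = TRUE, main = "Test-Set ROC Curves")
lines(roc(data_test$Subscription, ada_probs), col = "steelblue")
lines(roc(data_test$Subscription, dt_probs), col = "firebrick")
legend("bottomright",
       legend = c("Random Forest", "AdaBoost", "Decision Tree"),
       col = c("forestgreen", "steelblue", "firebrick"), lwd = 2)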

Based on the results, Random Forest is the most effective model for predicting customer subscription to term deposits; notably, the default configuration slightly outperformed the tuned one on AUC (0.912 vs 0.908). The bank should continue refining Random Forest hyperparameters and consider combining ensemble models to improve generalization.

Conclusion

The objective of this project was to analyze the effectiveness of a bank’s marketing campaign and predict customer subscription to term deposits using three machine learning models: Decision Tree, Random Forest, and AdaBoost. Through systematic experimentation and tuning, the models were evaluated based on key performance metrics, including accuracy, precision, recall, F1-score, and AUC.

Key Findings:

Decision Tree: The default Decision Tree model demonstrated high recall for the majority class but only a moderate AUC, reflecting weak ranking of prospective subscribers. Constraining the complexity parameter (cp) and tree depth left the test metrics unchanged, since the default tree already satisfied those constraints.

Random Forest: The Random Forest model exhibited the strongest predictive power, with the highest AUC and balanced accuracy, precision, and recall. Tuning (a larger forest with mtry = 4) left performance essentially unchanged.

AdaBoost: AdaBoost achieved competitive performance with high recall and AUC. Doubling the boosting rounds moved the metrics only marginally, suggesting the ensemble had already converged by 50 iterations.

Best Model: Random Forest emerged as the best-performing model, achieving the highest AUC (0.912, in its default configuration) with accuracy on par with the other models. Its ability to capture complex patterns while averaging away the overfitting of single trees makes it the most reliable choice for customer targeting.

Recommendations: The bank should deploy the Random Forest model in future marketing campaigns to improve customer targeting and conversion rates. Further gains may come from a systematic hyperparameter search (e.g., over ntree and mtry) and from feature selection guided by variable importance. Combining ensemble methods such as AdaBoost and Random Forest may further enhance predictive accuracy and generalization.