Introduction

Customer acquisition and retention are critical for any financial institution. This project analyzes the effectiveness of a bank’s marketing campaign by predicting customer subscription to term deposits. By applying machine learning algorithms (Decision Tree, Random Forest, and AdaBoost), the analysis seeks to identify the most effective model for improving customer targeting and campaign performance.

Objective: The goal is to identify the best-performing model by evaluating key metrics such as accuracy and AUC, ultimately guiding the bank toward more targeted and effective marketing strategies.

Approach: Three machine learning algorithms (Decision Tree, Random Forest, and AdaBoost) will be tested in default and tuned configurations. Performance will be evaluated using AUC, accuracy, sensitivity, and specificity. The best model will be selected based on predictive accuracy and generalization capability, and business recommendations will be provided to improve future marketing strategies.

Getting Started

Load packages

Let’s load the packages.
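
The chunk’s contents were not rendered above, so the list below is inferred from the function calls used later in this report; treat it as a reconstruction rather than the original chunk.

# Packages inferred from the calls used throughout this report
library(knitr)         # kable() tables
library(dplyr)         # pipes and data manipulation
library(tibble)        # tibble() for the results table
library(ggplot2)       # plots
library(caret)         # createDataPartition(), trainControl(), confusionMatrix(), preProcess()
library(rpart)         # decision trees
library(randomForest)  # random forests
library(pROC)          # roc() and AUC
library(ada)           # AdaBoost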

The data

After completing EDA and preprocessing, we saved the cleaned data to a new file, cleaned_data.csv, and now proceed with the next step: Experimentation & Model Training.

# Read a CSV file
bank <- read.csv("https://raw.githubusercontent.com/waheeb123/Data-622/refs/heads/main/cleaned_data.csv")

# Preview the first few rows of the dataset
kable(head(bank, 10), caption = "Preview of the Bank Dataset")
Preview of the Bank Dataset
age job marital education default balance housing loan contact day month duration campaign pdays previous poutcome Subscription contact_success_rate age_group credit_risk
30 unemployed married primary no 1787 no no cellular 19 oct 79 1 -1 0 unknown no 0.0568071 Middle-aged Medium Risk
35 management single tertiary no 1350 yes no cellular 16 apr 185 1 330 1 failure no 0.1069652 Middle-aged Medium Risk
30 management married tertiary no 1476 yes yes unknown 3 jun 199 4 -1 0 unknown no 0.0568071 Middle-aged High Risk
59 blue-collar married secondary no 0 yes no unknown 5 may 226 1 -1 0 unknown no 0.0568071 Senior Medium Risk
35 management single tertiary no 747 no no cellular 23 feb 141 2 176 3 failure no 0.1069652 Middle-aged Medium Risk
36 self-employed married tertiary no 307 yes no cellular 14 may 341 1 330 2 other no 0.1428571 Middle-aged Medium Risk
39 technician married secondary no 147 yes no cellular 6 may 151 2 -1 0 unknown no 0.0568071 Middle-aged Medium Risk
41 entrepreneur married tertiary no 221 yes no unknown 14 may 57 2 -1 0 unknown no 0.0568071 Middle-aged Medium Risk
43 services married primary no -88 yes yes cellular 17 apr 313 1 147 2 failure no 0.1069652 Middle-aged High Risk
43 admin. married secondary no 264 yes no cellular 17 apr 113 2 -1 0 unknown no 0.0568071 Middle-aged Medium Risk

Experimentation & Model Training

To evaluate the models, we split the data into training (70%) and test (30%) sets, converted the Subscription target to a factor, and defined a 10-fold cross-validation control for reliable model assessment. We also created scaled copies of the numeric features; scaling is optional for the tree-based models used here, which are insensitive to feature scale. We then built a Decision Tree model using default settings.

# Split data into train and test sets
set.seed(123)
trainIndex <- createDataPartition(bank$Subscription, p = 0.7, list = FALSE)
data_train <- bank[trainIndex, ]
data_test <- bank[-trainIndex, ]

# Convert 'Subscription' column to factor for classification
data_train$Subscription <- as.factor(data_train$Subscription)
data_test$Subscription <- as.factor(data_test$Subscription)

# Scale numeric features (optional for tree-based models).
# To avoid leakage, the test set is scaled with the training set's
# centering/scaling parameters rather than its own.
scaler <- preProcess(data_train, method = c("center", "scale"))
data_train_scaled <- predict(scaler, data_train)
data_test_scaled <- predict(scaler, data_test)

# Set up cross-validation control (10-fold cross-validation)
train_control <- trainControl(method = "cv", number = 10, 
                              savePredictions = "all", 
                              classProbs = TRUE, 
                              summaryFunction = twoClassSummary)
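
Note that train_control is defined here but the models below are fit directly with rpart(), randomForest(), and ada() rather than through caret. As a minimal sketch (illustrative, not part of the original pipeline), the control object could drive cross-validated training like this:

# Sketch: 10-fold CV decision tree via caret::train (illustrative only)
# twoClassSummary reports ROC/Sens/Spec, so we optimize the ROC metric
cv_dt <- train(Subscription ~ ., data = data_train,
               method = "rpart",
               trControl = train_control,
               metric = "ROC",
               tuneLength = 10)  # evaluates 10 candidate cp values
cv_dt$bestTune                   # cp selected by cross-validation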

Experiment 1: Decision Tree (Default)

Objective: Test the default decision tree model to evaluate its baseline performance on the classification task.

Variation: No tuning applied; default settings used for benchmarking.

Variation is meaningful as it sets a baseline to measure tuning effectiveness.

Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC will be computed.

# Build Decision Tree model with default settings
dt_model <- rpart(Subscription ~ ., data = data_train, method = "class")

# Predict using test data (column 2 of the probability matrix is the
# "yes" class, since columns follow the factor-level order "no", "yes")
dt_probs <- predict(dt_model, data_test, type = "prob")[, 2]
dt_preds <- predict(dt_model, data_test, type = "class")

# Evaluate metrics
# Note: confusionMatrix() treats the first factor level ("no") as the
# positive class by default, so Sensitivity and Pos Pred Value below
# describe the "no" class; pass positive = "yes" to focus on subscribers.
dt_confusion <- confusionMatrix(dt_preds, data_test$Subscription)
dt_accuracy <- dt_confusion$overall['Accuracy']
dt_precision <- dt_confusion$byClass['Pos Pred Value']
dt_recall <- dt_confusion$byClass['Sensitivity']
dt_f1 <- 2 * (dt_precision * dt_recall) / (dt_precision + dt_recall)
dt_auc <- roc(data_test$Subscription, dt_probs)$auc

cat(sprintf("\nDecision Tree (Default) - Accuracy: %.4f, Precision: %.4f, Recall: %.4f, F1-score: %.4f, AUC: %.4f\n", 
            dt_accuracy, dt_precision, dt_recall, dt_f1, dt_auc))
## 
## Decision Tree (Default) - Accuracy: 0.9301, Precision: 0.9365, Recall: 0.9912, F1-score: 0.9631, AUC: 0.7676
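
To see which variables drive the splits, the fitted tree can be drawn. A minimal sketch, assuming the rpart.plot package (not loaded in the original chunks) is available:

# Sketch: visualize the default tree's splits (assumes rpart.plot is installed)
library(rpart.plot)
rpart.plot(dt_model, main = "Decision Tree (Default)")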

Experiment 2: Decision Tree (Tuned)

Objective: Optimize decision tree model performance by adjusting hyperparameters.

Variation: Tuning complexity parameter (cp) and maximum tree depth (maxdepth).

Variation is meaningful since tuning cp and maxdepth impacts overfitting and model complexity.

Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC will be computed.

# Build tuned Decision Tree model
dt_tuned <- rpart(Subscription ~ ., data = data_train, method = "class", 
                  control = rpart.control(cp = 0.01, maxdepth = 5))

# Predict using test data
dt_tuned_probs <- predict(dt_tuned, data_test, type = "prob")[, 2]
dt_tuned_preds <- predict(dt_tuned, data_test, type = "class")

# Evaluate metrics
dt_tuned_confusion <- confusionMatrix(dt_tuned_preds, data_test$Subscription)
dt_tuned_accuracy <- dt_tuned_confusion$overall['Accuracy']
dt_tuned_precision <- dt_tuned_confusion$byClass['Pos Pred Value']
dt_tuned_recall <- dt_tuned_confusion$byClass['Sensitivity']
dt_tuned_f1 <- 2 * (dt_tuned_precision * dt_tuned_recall) / (dt_tuned_precision + dt_tuned_recall)
dt_tuned_auc <- roc(data_test$Subscription, dt_tuned_probs)$auc

cat(sprintf("\nDecision Tree (Tuned) - Accuracy: %.4f, Precision: %.4f, Recall: %.4f, F1-score: %.4f, AUC: %.4f\n", 
            dt_tuned_accuracy, dt_tuned_precision, dt_tuned_recall, dt_tuned_f1, dt_tuned_auc))
## 
## Decision Tree (Tuned) - Accuracy: 0.9301, Precision: 0.9365, Recall: 0.9912, F1-score: 0.9631, AUC: 0.7676

Experiment 1 vs Experiment 2

Model: Decision Tree (Default) and Decision Tree (Tuned)

I applied hyperparameter tuning to the decision tree using rpart.control(cp = 0.01, maxdepth = 5).

Hyperparameters: cp = 0.01 is a pruning parameter that controls complexity (smaller values allow deeper trees); maxdepth = 5 caps tree depth to guard against overfitting.

What I learned: surprisingly at first, tuning did not change performance; the metrics were identical to the default model. In hindsight this makes sense: cp = 0.01 is already rpart’s default, so the only effective change was the depth cap, and the identical metrics indicate the default tree never grew past five levels anyway.

Conclusion: Tuning a single decision tree may have limited impact given the model’s inherent simplicity; future steps should explore ensemble methods to better capture complex patterns.
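
A more meaningful single-tree tuning pass would grow a deliberately deep tree and then prune it at the cp value minimizing cross-validated error; a minimal sketch (illustrative, not part of the original analysis):

# Sketch: grow a deep tree, then prune at the cp minimizing CV error
dt_full <- rpart(Subscription ~ ., data = data_train, method = "class",
                 control = rpart.control(cp = 0.0001))
printcp(dt_full)  # cross-validated error for each candidate cp
best_cp <- dt_full$cptable[which.min(dt_full$cptable[, "xerror"]), "CP"]
dt_pruned <- prune(dt_full, cp = best_cp)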

We now move on to ensemble models, Random Forest and AdaBoost, in search of improved performance.

Experiment 3: Random Forest (Default)

Objective: Evaluate the baseline performance of a Random Forest model on the classification task.

Variation: No hyperparameter tuning; ntree = 100 with the default mtry. (Note that the randomForest package default is ntree = 500.)

Variation is meaningful because it establishes a benchmark for comparison with tuned models.

Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC.

# Build Random Forest model (ntree = 100; all other settings default)
rf_model <- randomForest(Subscription ~ ., data = data_train, ntree = 100)

# Predict using test data
rf_probs <- predict(rf_model, data_test, type = "prob")[, 2]
rf_preds <- predict(rf_model, data_test, type = "class")

# Evaluate metrics
rf_confusion <- confusionMatrix(rf_preds, data_test$Subscription)
rf_accuracy <- rf_confusion$overall['Accuracy']
rf_precision <- rf_confusion$byClass['Pos Pred Value']
rf_recall <- rf_confusion$byClass['Sensitivity']
rf_f1 <- 2 * (rf_precision * rf_recall) / (rf_precision + rf_recall)
rf_auc <- roc(data_test$Subscription, rf_probs)$auc

cat(sprintf("\nRandom Forest (Default) - Accuracy: %.4f, Precision: %.4f, Recall: %.4f, F1-score: %.4f, AUC: %.4f\n", 
            rf_accuracy, rf_precision, rf_recall, rf_f1, rf_auc))
## 
## Random Forest (Default) - Accuracy: 0.9283, Precision: 0.9363, Recall: 0.9893, F1-score: 0.9621, AUC: 0.9119
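
Beyond the aggregate metrics, the forest’s variable-importance scores indicate which customer attributes drive the predictions, which speaks directly to the targeting question motivating this project. A minimal sketch:

# Sketch: inspect which features the forest relies on most
importance(rf_model)  # mean decrease in Gini per feature
varImpPlot(rf_model, main = "Random Forest Variable Importance")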

Experiment 4: Random Forest (Tuned)

Objective: Improve Random Forest performance by tuning hyperparameters.

Variation: Increased ntree to 200 and adjusted mtry to 4.

Variation is meaningful because increasing ntree reduces variance, and adjusting mtry balances bias-variance tradeoff.

Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC.

# Build tuned Random Forest model
rf_tuned <- randomForest(Subscription ~ ., data = data_train, ntree = 200, mtry = 4)

# Predict using test data
rf_tuned_probs <- predict(rf_tuned, data_test, type = "prob")[, 2]
rf_tuned_preds <- predict(rf_tuned, data_test, type = "class")

# Evaluate metrics
rf_tuned_confusion <- confusionMatrix(rf_tuned_preds, data_test$Subscription)
rf_tuned_accuracy <- rf_tuned_confusion$overall['Accuracy']
rf_tuned_precision <- rf_tuned_confusion$byClass['Pos Pred Value']
rf_tuned_recall <- rf_tuned_confusion$byClass['Sensitivity']
rf_tuned_f1 <- 2 * (rf_tuned_precision * rf_tuned_recall) / (rf_tuned_precision + rf_tuned_recall)
rf_tuned_auc <- roc(data_test$Subscription, rf_tuned_probs)$auc

cat(sprintf("\nRandom Forest (Tuned) - Accuracy: %.4f, Precision: %.4f, Recall: %.4f, F1-score: %.4f, AUC: %.4f\n", 
            rf_tuned_accuracy, rf_tuned_precision, rf_tuned_recall, rf_tuned_f1, rf_tuned_auc))
## 
## Random Forest (Tuned) - Accuracy: 0.9265, Precision: 0.9362, Recall: 0.9873, F1-score: 0.9611, AUC: 0.9077

Experiment 3 vs Experiment 4

Model: Random Forest (Default) and Random Forest (Tuned)

What changed: I increased ntree from 100 to 200 and set mtry = 4, the number of features considered at each split, in place of the default of roughly sqrt(p) (where p is the number of predictors).

What I learned: the hand-picked configuration did not help; accuracy, recall, and AUC all dipped slightly relative to the default forest (AUC 0.912 vs. 0.908). The default mtry was evidently already well matched to this feature set, and adding trees alone cannot compensate for a poorly chosen split width.

Conclusion: Random Forest is strong out of the box; any real tuning gains will require a systematic search over mtry, for example with tuneRF() as sketched below, rather than a one-off adjustment.
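
A minimal sketch of that search, assuming data_train from above (tuneRF() needs a plain predictor frame, with character columns converted to factors):

# Sketch: stepwise mtry search with randomForest::tuneRF (illustrative)
x_train <- data_train[, setdiff(names(data_train), "Subscription")]
x_train[] <- lapply(x_train, function(v) if (is.character(v)) factor(v) else v)
set.seed(123)
rf_search <- tuneRF(x_train, data_train$Subscription,
                    ntreeTry = 200,    # trees grown per candidate mtry
                    stepFactor = 1.5,  # multiply/divide mtry by this each step
                    improve = 0.01,    # continue while OOB error improves by 1%
                    doBest = TRUE)     # refit a forest at the best mtry found
rf_search$mtry                         # mtry chosen by the search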

Experiment 5: AdaBoost (Default)

Objective: Evaluate the baseline performance of an AdaBoost model on the classification task.

Variation: No tuning applied; using default iter = 50.

Variation is meaningful because it establishes a benchmark for comparison with tuned models.

Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC.

# AdaBoost (Default) Model
library(ada)

# Train AdaBoost model
ada_model <- ada(Subscription ~ ., data = data_train, iter = 50)

# Predict using test data
ada_probs <- predict(ada_model, data_test, type = "prob")[, 2]
ada_preds <- predict(ada_model, data_test, type = "class")

# Evaluate metrics
ada_confusion <- confusionMatrix(ada_preds, data_test$Subscription)
ada_accuracy <- ada_confusion$overall['Accuracy']
ada_precision <- ada_confusion$byClass['Pos Pred Value']
ada_recall <- ada_confusion$byClass['Sensitivity']
ada_f1 <- 2 * (ada_precision * ada_recall) / (ada_precision + ada_recall)
ada_auc <- roc(data_test$Subscription, ada_probs)$auc

cat(sprintf("\nAdaBoost (Default) - Accuracy: %.4f, Precision: %.4f, Recall: %.4f, F1-score: %.4f, AUC: %.4f\n", 
            ada_accuracy, ada_precision, ada_recall, ada_f1, ada_auc))
## 
## AdaBoost (Default) - Accuracy: 0.9265, Precision: 0.9362, Recall: 0.9873, F1-score: 0.9611, AUC: 0.9034

Experiment 6: AdaBoost (Tuned)

Objective: Improve AdaBoost performance by tuning hyperparameters.

Variation: Increased the number of boosting iterations from 50 to 100.

Variation is meaningful because increasing iterations can reduce bias and improve model performance.

Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC.

# AdaBoost (Tuned) Model
# Tuning AdaBoost model - for example, increase iterations
ada_tuned_model <- ada(Subscription ~ ., data = data_train, iter = 100)

# Predict using test data
ada_tuned_probs <- predict(ada_tuned_model, data_test, type = "prob")[, 2]
ada_tuned_preds <- predict(ada_tuned_model, data_test, type = "class")

# Evaluate metrics
ada_tuned_confusion <- confusionMatrix(ada_tuned_preds, data_test$Subscription)
ada_tuned_accuracy <- ada_tuned_confusion$overall['Accuracy']
ada_tuned_precision <- ada_tuned_confusion$byClass['Pos Pred Value']
ada_tuned_recall <- ada_tuned_confusion$byClass['Sensitivity']
ada_tuned_f1 <- 2 * (ada_tuned_precision * ada_tuned_recall) / (ada_tuned_precision + ada_tuned_recall)
ada_tuned_auc <- roc(data_test$Subscription, ada_tuned_probs)$auc

cat(sprintf("\nAdaBoost (Tuned) - Accuracy: %.4f, Precision: %.4f, Recall: %.4f, F1-score: %.4f, AUC: %.4f\n", 
            ada_tuned_accuracy, ada_tuned_precision, ada_tuned_recall, ada_tuned_f1, ada_tuned_auc))
## 
## AdaBoost (Tuned) - Accuracy: 0.9265, Precision: 0.9362, Recall: 0.9873, F1-score: 0.9611, AUC: 0.9038

Experiment 5 vs Experiment 6

Model: AdaBoost (Default) and AdaBoost (Tuned)

What changed: I doubled the number of boosting iterations from the ada default of 50 to 100; the base-learner settings were left at their defaults.

Hyperparameters: iter controls the number of boosting rounds; more rounds can reduce bias but add computation and, eventually, overfitting risk.

What I learned: doubling the iterations left accuracy, precision, recall, and F1 unchanged and moved AUC only marginally (0.9034 vs. 0.9038). The default settings were already close to optimal for this data.

Conclusion: AdaBoost is highly performant even with default parameters. Going forward, I would experiment with the learning rate (nu) and base-learner regularization rather than iteration count alone; a sketch follows.
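
A minimal sketch of that direction, assuming the same training data (nu is ada’s shrinkage argument and control forwards settings to the rpart base learners):

# Sketch: smaller learning rate with shallow base learners (illustrative)
ada_slow <- ada(Subscription ~ ., data = data_train,
                iter = 200,                              # more rounds to offset shrinkage
                nu = 0.05,                               # below ada's default of 0.1
                control = rpart.control(maxdepth = 3))   # shallow base trees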

Results and Visualization

Print the result in a table

Objective: Compare model performance across Decision Tree, Random Forest, and AdaBoost variations.

Variation: Models were tuned to assess impact on accuracy and AUC.

Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC.

# Store results for all six experiments
results <- tibble(
  Model = c("Decision Tree (Default)", "Decision Tree (Tuned)", 
            "Random Forest (Default)", "Random Forest (Tuned)", 
            "AdaBoost (Default)", "AdaBoost (Tuned)"),
  Accuracy = c(dt_accuracy, dt_tuned_accuracy, rf_accuracy, rf_tuned_accuracy, 
               ada_accuracy, ada_tuned_accuracy),
  Precision = c(dt_precision, dt_tuned_precision, rf_precision, rf_tuned_precision, 
                ada_precision, ada_tuned_precision),
  Recall = c(dt_recall, dt_tuned_recall, rf_recall, rf_tuned_recall, 
             ada_recall, ada_tuned_recall),
  F1_Score = c(dt_f1, dt_tuned_f1, rf_f1, rf_tuned_f1, 
               ada_f1, ada_tuned_f1),
  AUC = c(dt_auc, dt_tuned_auc, rf_auc, rf_tuned_auc, 
          ada_auc, ada_tuned_auc)
)

# Plot AUC Comparison
ggplot(results, aes(x = reorder(Model, AUC), y = AUC, fill = Model)) +
  geom_bar(stat = "identity", color = "black") +
  coord_flip() +
  theme_minimal() +
  labs(title = "AUC Comparison Across Models", x = "Model", y = "AUC")

The table below summarizes model performance. Random Forest showed the highest AUC, indicating the strongest ability to rank likely subscribers, while AdaBoost demonstrated similarly balanced performance.

# Display Results
print(results)
## # A tibble: 6 × 6
##   Model                   Accuracy Precision Recall F1_Score   AUC
##   <chr>                      <dbl>     <dbl>  <dbl>    <dbl> <dbl>
## 1 Decision Tree (Default)    0.930     0.936  0.991    0.963 0.768
## 2 Decision Tree (Tuned)      0.930     0.936  0.991    0.963 0.768
## 3 Random Forest (Default)    0.928     0.936  0.989    0.962 0.912
## 4 Random Forest (Tuned)      0.927     0.936  0.987    0.961 0.908
## 5 AdaBoost (Default)         0.927     0.936  0.987    0.961 0.903
## 6 AdaBoost (Tuned)           0.927     0.936  0.987    0.961 0.904
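
For a report-quality table, the same kable() used for the data preview could format these results; a minimal sketch:

# Sketch: formatted results table, matching the kable style used earlier
kable(results, digits = 3, caption = "Model Performance Comparison")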

Based on the results, Random Forest is the most effective model family for predicting customer subscription to term deposits; notably, its default configuration achieved the highest AUC (0.912), slightly ahead of the hand-tuned variant (0.908). The bank should focus on a systematic Random Forest hyperparameter search and on combining ensemble models to improve generalization.

Final Takeaways: What I Learned Across All Experiments

Model            Tuning Impact      Insight
Decision Tree    None               Default tree already balanced; tuning (cp = 0.01, maxdepth = 5) reproduced the same tree.
Random Forest    Slightly negative  Hand-picked ntree = 200, mtry = 4 edged AUC and recall down; a systematic mtry search is warranted.
AdaBoost         Negligible         Strong out of the box; doubling iterations barely moved AUC.

Conclusion

The objective of this project was to analyze the effectiveness of a bank’s marketing campaign and predict customer subscription to term deposits using three machine learning models: Decision Tree, Random Forest, and AdaBoost. Through systematic experimentation and tuning, the models were evaluated based on key performance metrics, including accuracy, precision, recall, F1-score, and AUC.

Key Findings: Decision Tree: The default Decision Tree model demonstrated high recall but only moderate AUC (0.768), indicating weak ability to separate subscribers from non-subscribers. Tuning the complexity parameter (cp) and tree depth reproduced the default tree, leaving every metric unchanged.

Random Forest: The Random Forest model exhibited strong predictive power, with the highest AUC and balanced performance across accuracy, precision, and recall; the hand-tuned variant trailed the default configuration slightly. AdaBoost: AdaBoost achieved competitive performance with high recall and strong AUC. Doubling the boosting rounds left accuracy, precision, recall, and F1 unchanged and moved AUC only marginally, suggesting the defaults were already near-optimal.

Best Model: Random Forest emerged as the best-performing model, achieving the highest AUC (0.912, in its default configuration) and consistent predictive accuracy. Its ensemble structure captures complex patterns while damping overfitting, making it the most reliable choice for customer targeting.

Recommendations: The bank should deploy the Random Forest model for future marketing campaigns to improve customer targeting and conversion rates. Further improvements can be pursued through a systematic hyperparameter search (for example, selecting mtry by cross-validation) and through feature selection. Combining ensemble methods such as AdaBoost and Random Forest may further enhance predictive accuracy and generalization; a minimal combination sketch follows.
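
A minimal sketch of such a combination, assuming the probability vectors from the experiments above (simple soft voting with illustrative equal weights):

# Sketch: average the two ensembles' predicted probabilities (soft voting)
ens_probs <- 0.5 * rf_probs + 0.5 * ada_probs
ens_auc   <- roc(data_test$Subscription, ens_probs)$auc
cat(sprintf("Ensemble (RF + AdaBoost) AUC: %.4f\n", ens_auc))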