Customer acquisition and retention are critical for any financial institution. This project aims to analyze the effectiveness of a bank’s marketing campaign to predict customer subscription to term deposits. By applying machine learning algorithms such as Decision Tree, Random Forest, and Adaboost, this analysis seeks to identify the most effective model for improving customer targeting and campaign performance.
Objective: The goal is to identify the best-performing model by evaluating key metrics such as accuracy and AUC, ultimately guiding the bank toward more targeted and effective marketing strategies.
Approach: Three machine learning algorithms (Decision Tree, Random Forest, and AdaBoost) will be tested in default and tuned configurations. Performance will be evaluated using AUC, accuracy, sensitivity, and specificity. The best model will be selected based on predictive accuracy and generalization capability, and business recommendations will be provided to improve future marketing strategies.
Let’s load the packages.
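The original setup chunk is not shown, so here is a minimal sketch of the package list, inferred from the function calls that appear later in this report (the list is an assumption, not the author's original chunk):

# Packages inferred from the calls used in this report (assumed list)
library(knitr)         # kable()
library(dplyr)         # mutate_if(), pipes
library(caret)         # createDataPartition(), trainControl(), confusionMatrix()
library(rpart)         # decision trees
library(randomForest)  # random forests
library(ada)           # AdaBoost
library(pROC)          # roc() / AUC
library(tibble)        # tibble()
library(ggplot2)       # plotting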
After completing EDA and preprocessing, we saved the cleaned data to a new file, cleaned_data.csv, to proceed with the next step: Experimentation & Model Training.
# Read a CSV file
bank <- read.csv("https://raw.githubusercontent.com/waheeb123/Data-622/refs/heads/main/cleaned_data.csv")
# Preview the first few rows of the dataset
kable(head(bank, 10), caption = "Preview of the Bank Dataset")
| age | job | marital | education | default | balance | housing | loan | contact | day | month | duration | campaign | pdays | previous | poutcome | Subscription | contact_success_rate | age_group | credit_risk |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 30 | unemployed | married | primary | no | 1787 | no | no | cellular | 19 | oct | 79 | 1 | -1 | 0 | unknown | no | 0.0568071 | Middle-aged | Medium Risk |
| 35 | management | single | tertiary | no | 1350 | yes | no | cellular | 16 | apr | 185 | 1 | 330 | 1 | failure | no | 0.1069652 | Middle-aged | Medium Risk |
| 30 | management | married | tertiary | no | 1476 | yes | yes | unknown | 3 | jun | 199 | 4 | -1 | 0 | unknown | no | 0.0568071 | Middle-aged | High Risk |
| 59 | blue-collar | married | secondary | no | 0 | yes | no | unknown | 5 | may | 226 | 1 | -1 | 0 | unknown | no | 0.0568071 | Senior | Medium Risk |
| 35 | management | single | tertiary | no | 747 | no | no | cellular | 23 | feb | 141 | 2 | 176 | 3 | failure | no | 0.1069652 | Middle-aged | Medium Risk |
| 36 | self-employed | married | tertiary | no | 307 | yes | no | cellular | 14 | may | 341 | 1 | 330 | 2 | other | no | 0.1428571 | Middle-aged | Medium Risk |
| 39 | technician | married | secondary | no | 147 | yes | no | cellular | 6 | may | 151 | 2 | -1 | 0 | unknown | no | 0.0568071 | Middle-aged | Medium Risk |
| 41 | entrepreneur | married | tertiary | no | 221 | yes | no | unknown | 14 | may | 57 | 2 | -1 | 0 | unknown | no | 0.0568071 | Middle-aged | Medium Risk |
| 43 | services | married | primary | no | -88 | yes | yes | cellular | 17 | apr | 313 | 1 | 147 | 2 | failure | no | 0.1069652 | Middle-aged | High Risk |
| 43 | admin. | married | secondary | no | 264 | yes | no | cellular | 17 | apr | 113 | 2 | -1 | 0 | unknown | no | 0.0568071 | Middle-aged | Medium Risk |
To evaluate model performance, we split the data into 70/30 training and test sets and defined a 10-fold cross-validation control to improve reliability. We also created scaled copies of the numeric features, although tree-based models are insensitive to monotonic rescaling, so the models below are fit on the unscaled training set. The baseline Decision Tree uses default settings.
# Split data into train and test sets
set.seed(123)
trainIndex <- createDataPartition(bank$Subscription, p = 0.7, list = FALSE)
data_train <- bank[trainIndex, ]
data_test <- bank[-trainIndex, ]
# Convert 'Subscription' column to factor for classification
data_train$Subscription <- as.factor(data_train$Subscription)
data_test$Subscription <- as.factor(data_test$Subscription)
# Scale numeric features
data_train_scaled <- data_train %>%
mutate_if(is.numeric, scale)
data_test_scaled <- data_test %>%
mutate_if(is.numeric, scale)
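One caveat: scaling each split independently lets test-set statistics influence preprocessing. A common alternative, sketched here with caret::preProcess (an addition, not part of the original pipeline), learns the centering/scaling parameters on the training set only and applies the same transform to both splits:

# Learn center/scale parameters on the training split, then apply to both splits
# (preProcess transforms numeric columns and leaves factors/characters untouched)
pre <- preProcess(data_train, method = c("center", "scale"))
data_train_scaled <- predict(pre, data_train)
data_test_scaled  <- predict(pre, data_test)

In practice this matters little here, since the tree-based models below are invariant to monotonic rescaling of features.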
# Set up cross-validation control (10-fold cross-validation)
train_control <- trainControl(method = "cv", number = 10,
savePredictions = "all",
classProbs = TRUE,
summaryFunction = twoClassSummary)
Experiment 1: Decision Tree (Default)
Objective: Test the default decision tree model to evaluate its baseline performance on the classification task.
Variation: No tuning applied; default settings used for benchmarking.
Variation is meaningful as it sets a baseline to measure tuning effectiveness.
Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC will be computed.
# Build Decision Tree model with default settings
dt_model <- rpart(Subscription ~ ., data = data_train, method = "class")
# Predict using test data
dt_probs <- predict(dt_model, data_test, type = "prob")[, 2]
dt_preds <- predict(dt_model, data_test, type = "class")
# Evaluate metrics (confusionMatrix treats the first factor level, "no", as the
# positive class, so precision/recall below describe the majority "no" class)
dt_confusion <- confusionMatrix(dt_preds, data_test$Subscription)
dt_accuracy <- dt_confusion$overall['Accuracy']
dt_precision <- dt_confusion$byClass['Pos Pred Value']
dt_recall <- dt_confusion$byClass['Sensitivity']
dt_f1 <- 2 * (dt_precision * dt_recall) / (dt_precision + dt_recall)
dt_auc <- roc(data_test$Subscription, dt_probs)$auc
cat(sprintf("\nDecision Tree (Default) - Accuracy: %.4f, Precision: %.4f, Recall: %.4f, F1-score: %.4f, AUC: %.4f\n",
dt_accuracy, dt_precision, dt_recall, dt_f1, dt_auc))
## Decision Tree (Default) - Accuracy: 0.9301, Precision: 0.9365, Recall: 0.9912, F1-score: 0.9631, AUC: 0.7676
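Note that the 10-fold cross-validation control defined earlier is not consumed by rpart() directly. A hedged sketch of how it could be used via caret::train() to obtain resampled ROC estimates alongside the held-out ones (method and metric names are standard caret values, not part of the original code):

# Hypothetical use of the earlier trainControl with caret::train();
# metric = "ROC" matches the twoClassSummary summary function
set.seed(123)
dt_cv <- train(Subscription ~ ., data = data_train,
               method = "rpart",
               trControl = train_control,
               metric = "ROC")
dt_cv$results  # cross-validated ROC, sensitivity, specificity per cp value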
Experiment 2: Decision Tree (Tuned)
Objective: Optimize decision tree model performance by adjusting hyperparameters.
Variation: Tuning complexity parameter (cp) and maximum tree depth (maxdepth).
Variation is meaningful since tuning cp and maxdepth affects model complexity and the risk of overfitting.
Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC will be computed.
# Build tuned Decision Tree model
dt_tuned <- rpart(Subscription ~ ., data = data_train, method = "class",
control = rpart.control(cp = 0.01, maxdepth = 5))
# Predict using test data
dt_tuned_probs <- predict(dt_tuned, data_test, type = "prob")[, 2]
dt_tuned_preds <- predict(dt_tuned, data_test, type = "class")
# Evaluate metrics
dt_tuned_confusion <- confusionMatrix(dt_tuned_preds, data_test$Subscription)
dt_tuned_accuracy <- dt_tuned_confusion$overall['Accuracy']
dt_tuned_precision <- dt_tuned_confusion$byClass['Pos Pred Value']
dt_tuned_recall <- dt_tuned_confusion$byClass['Sensitivity']
dt_tuned_f1 <- 2 * (dt_tuned_precision * dt_tuned_recall) / (dt_tuned_precision + dt_tuned_recall)
dt_tuned_auc <- roc(data_test$Subscription, dt_tuned_probs)$auc
cat(sprintf("\nDecision Tree (Tuned) - Accuracy: %.4f, Precision: %.4f, Recall: %.4f, F1-score: %.4f, AUC: %.4f\n",
dt_tuned_accuracy, dt_tuned_precision, dt_tuned_recall, dt_tuned_f1, dt_tuned_auc))
## Decision Tree (Tuned) - Accuracy: 0.9301, Precision: 0.9365, Recall: 0.9912, F1-score: 0.9631, AUC: 0.7676
Experiment 1 vs Experiment 2
Model: Decision Tree (Default) and Decision Tree (Tuned)
I applied hyperparameter tuning to the decision tree using rpart.control(cp = 0.01, maxdepth = 5).
Hyperparameters Used: cp = 0.01, a pruning parameter that controls complexity (smaller values allow deeper trees), and maxdepth = 5, which caps tree depth to prevent overfitting.
What I Learned: Surprisingly, tuning did not improve performance; the metrics were identical to the default model. On reflection this makes sense: cp = 0.01 is also rpart's default, and the default tree evidently never grew deeper than five levels, so the "tuned" settings reproduced the default fit exactly. In other words, the default model had already found a solid balance between depth and splits within these constraints.
Conclusion: Tuning a single decision tree may have limited impact given the model's inherent simplicity. To improve on it, the next experiments turn to ensemble methods, Random Forest and AdaBoost, which can better capture complex patterns.
Experiment 3: Random Forest (Default)
Objective: Evaluate the baseline performance of a Random Forest model on the classification task.
Variation: No tuning applied; baseline configuration with ntree = 100 (all other hyperparameters at package defaults).
Variation is meaningful because it establishes a benchmark for comparison with tuned models.
Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC.
# Build Random Forest model with default settings
rf_model <- randomForest(Subscription ~ ., data = data_train, ntree = 100)
# Predict using test data
rf_probs <- predict(rf_model, data_test, type = "prob")[, 2]
rf_preds <- predict(rf_model, data_test, type = "class")
# Evaluate metrics
rf_confusion <- confusionMatrix(rf_preds, data_test$Subscription)
rf_accuracy <- rf_confusion$overall['Accuracy']
rf_precision <- rf_confusion$byClass['Pos Pred Value']
rf_recall <- rf_confusion$byClass['Sensitivity']
rf_f1 <- 2 * (rf_precision * rf_recall) / (rf_precision + rf_recall)
rf_auc <- roc(data_test$Subscription, rf_probs)$auc
cat(sprintf("\nRandom Forest (Default) - Accuracy: %.4f, Precision: %.4f, Recall: %.4f, F1-score: %.4f, AUC: %.4f\n",
rf_accuracy, rf_precision, rf_recall, rf_f1, rf_auc))
## Random Forest (Default) - Accuracy: 0.9283, Precision: 0.9363, Recall: 0.9893, F1-score: 0.9621, AUC: 0.9119
Experiment 4: Random Forest (Tuned)
Objective: Improve Random Forest performance by tuning hyperparameters.
Variation: Increased ntree to 200 and adjusted mtry to 4.
Variation is meaningful because increasing ntree reduces variance, while changing mtry shifts the bias-variance tradeoff.
Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC.
# Build tuned Random Forest model
rf_tuned <- randomForest(Subscription ~ ., data = data_train, ntree = 200, mtry = 4)
# Predict using test data
rf_tuned_probs <- predict(rf_tuned, data_test, type = "prob")[, 2]
rf_tuned_preds <- predict(rf_tuned, data_test, type = "class")
# Evaluate metrics
rf_tuned_confusion <- confusionMatrix(rf_tuned_preds, data_test$Subscription)
rf_tuned_accuracy <- rf_tuned_confusion$overall['Accuracy']
rf_tuned_precision <- rf_tuned_confusion$byClass['Pos Pred Value']
rf_tuned_recall <- rf_tuned_confusion$byClass['Sensitivity']
rf_tuned_f1 <- 2 * (rf_tuned_precision * rf_tuned_recall) / (rf_tuned_precision + rf_tuned_recall)
rf_tuned_auc <- roc(data_test$Subscription, rf_tuned_probs)$auc
cat(sprintf("\nRandom Forest (Tuned) - Accuracy: %.4f, Precision: %.4f, Recall: %.4f, F1-score: %.4f, AUC: %.4f\n",
rf_tuned_accuracy, rf_tuned_precision, rf_tuned_recall, rf_tuned_f1, rf_tuned_auc))
## Random Forest (Tuned) - Accuracy: 0.9265, Precision: 0.9362, Recall: 0.9873, F1-score: 0.9611, AUC: 0.9077
Experiment 3 vs Experiment 4
Model: Random Forest (Default) and Random Forest (Tuned)
What Changed: I increased ntree from 100 to 200 and set mtry = 4 (the number of features considered at each split).
Hyperparameters Used: mtry = 4 and ntree = 200. Note that with the 19 predictors here, the classification default is floor(sqrt(19)) = 4, so the mtry setting actually matches the default.
What I Learned: Tuning did not help; accuracy, recall, F1, and AUC all dipped slightly relative to the default forest (AUC 0.9077 vs. 0.9119). Since mtry was effectively unchanged, the small differences mostly reflect ensemble randomness and the larger ntree.
Conclusion: The default Random Forest remained the stronger configuration, so a genuine mtry search would be a sensible next step; a sketch with tuneRF() follows.
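A hedged sketch of that follow-up using randomForest::tuneRF(), which steps mtry up and down from the default while tracking out-of-bag error (this was not run in the original analysis; the step and improvement thresholds are illustrative):

# Hypothetical systematic mtry search with tuneRF()
set.seed(123)
x_train <- data_train %>%
  mutate(across(where(is.character), as.factor)) %>%  # randomForest needs factors, not characters
  select(-Subscription)
tune_res <- tuneRF(x_train, data_train$Subscription,
                   ntreeTry = 200,     # trees grown per candidate mtry
                   stepFactor = 1.5,   # multiplicative step for mtry
                   improve = 0.01)     # minimum relative OOB improvement to continue
best_mtry <- tune_res[which.min(tune_res[, "OOBError"]), "mtry"]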
Experiment 5: AdaBoost (Default)
Objective: Evaluate the baseline performance of an AdaBoost model on the classification task.
Variation: No tuning applied; using default iter = 50.
Variation is meaningful because it establishes a benchmark for comparison with tuned models.
Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC.
# AdaBoost (Default) Model
library(ada)
# Train AdaBoost model
ada_model <- ada(Subscription ~ ., data = data_train, iter = 50)
# Predict using test data
ada_probs <- predict(ada_model, data_test, type = "prob")[, 2]
ada_preds <- predict(ada_model, data_test, type = "class")
# Evaluate metrics
ada_confusion <- confusionMatrix(ada_preds, data_test$Subscription)
ada_accuracy <- ada_confusion$overall['Accuracy']
ada_precision <- ada_confusion$byClass['Pos Pred Value']
ada_recall <- ada_confusion$byClass['Sensitivity']
ada_f1 <- 2 * (ada_precision * ada_recall) / (ada_precision + ada_recall)
ada_auc <- roc(data_test$Subscription, ada_probs)$auc
cat(sprintf("\nAdaBoost (Default) - Accuracy: %.4f, Precision: %.4f, Recall: %.4f, F1-score: %.4f, AUC: %.4f\n",
ada_accuracy, ada_precision, ada_recall, ada_f1, ada_auc))
## AdaBoost (Default) - Accuracy: 0.9265, Precision: 0.9362, Recall: 0.9873, F1-score: 0.9611, AUC: 0.9034
Experiment 6: AdaBoost (Tuned)
Objective: Improve AdaBoost performance by tuning hyperparameters.
Variation: Increased the number of boosting iterations from 50 to 100.
Variation is meaningful because increasing iterations can reduce bias and improve model performance.
Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC.
# AdaBoost (Tuned) Model
# Tuning AdaBoost model - for example, increase iterations
ada_tuned_model <- ada(Subscription ~ ., data = data_train, iter = 100)
# Predict using test data
ada_tuned_probs <- predict(ada_tuned_model, data_test, type = "prob")[, 2]
ada_tuned_preds <- predict(ada_tuned_model, data_test, type = "class")
# Evaluate metrics
ada_tuned_confusion <- confusionMatrix(ada_tuned_preds, data_test$Subscription)
ada_tuned_accuracy <- ada_tuned_confusion$overall['Accuracy']
ada_tuned_precision <- ada_tuned_confusion$byClass['Pos Pred Value']
ada_tuned_recall <- ada_tuned_confusion$byClass['Sensitivity']
ada_tuned_f1 <- 2 * (ada_tuned_precision * ada_tuned_recall) / (ada_tuned_precision + ada_tuned_recall)
ada_tuned_auc <- roc(data_test$Subscription, ada_tuned_probs)$auc
cat(sprintf("\nAdaBoost (Tuned) - Accuracy: %.4f, Precision: %.4f, Recall: %.4f, F1-score: %.4f, AUC: %.4f\n",
ada_tuned_accuracy, ada_tuned_precision, ada_tuned_recall, ada_tuned_f1, ada_tuned_auc))
## AdaBoost (Tuned) - Accuracy: 0.9265, Precision: 0.9362, Recall: 0.9873, F1-score: 0.9611, AUC: 0.9038
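Before comparing the boosting runs, note that the same five-metric block is repeated verbatim for every model above; a small helper function (hypothetical, the names are mine) would compute it in one call:

# Hypothetical helper to compute the repeated metric block in one call
eval_model <- function(preds, probs, truth) {
  cm <- confusionMatrix(preds, truth)
  precision <- cm$byClass["Pos Pred Value"]
  recall    <- cm$byClass["Sensitivity"]
  c(Accuracy  = unname(cm$overall["Accuracy"]),
    Precision = unname(precision),
    Recall    = unname(recall),
    F1        = unname(2 * precision * recall / (precision + recall)),
    AUC       = as.numeric(roc(truth, probs)$auc))
}
# Example: eval_model(dt_preds, dt_probs, data_test$Subscription)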
Experiment 5 vs Experiment 6
Model: AdaBoost (Default) and AdaBoost (Tuned)
What Changed: I increased the number of boosting iterations from 50 to 100, leaving the base learners at the package defaults.
Hyperparameters Used: iter = 100, which controls the number of boosting rounds; more rounds can reduce bias but risk overfitting.
What I Learned: The tuned model was essentially indistinguishable from the default: accuracy, precision, recall, and F1 were identical, and AUC moved only from 0.9034 to 0.9038. This implies the ensemble had effectively converged within the first 50 rounds, so additional iterations added little.
Conclusion: AdaBoost performs well even with default parameters. Going forward, I would experiment with the learning rate (nu) and base-learner regularization, as sketched below.
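A hedged sketch of that follow-up, using ada()'s shrinkage parameter nu and an rpart.control() for the base learners (the values are illustrative, not tuned results from this analysis):

# Illustrative follow-up: lower the learning rate and constrain the base trees
ada_next <- ada(Subscription ~ ., data = data_train,
                iter = 200,                            # more rounds to compensate for shrinkage
                nu = 0.05,                             # learning rate (package default is 0.1)
                control = rpart.control(maxdepth = 3)) # shallow base learners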
Results and Visualization
Print the results in a table
Objective: Compare model performance across Decision Tree, Random Forest, and AdaBoost variations.
Variation: Models were tuned to assess impact on accuracy and AUC.
Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC.
# Store results for all six models
results <- tibble(
Model = c("Decision Tree (Default)", "Decision Tree (Tuned)",
"Random Forest (Default)", "Random Forest (Tuned)",
"AdaBoost (Default)", "AdaBoost (Tuned)"),
Accuracy = c(dt_accuracy, dt_tuned_accuracy, rf_accuracy, rf_tuned_accuracy,
ada_accuracy, ada_tuned_accuracy),
Precision = c(dt_precision, dt_tuned_precision, rf_precision, rf_tuned_precision,
ada_precision, ada_tuned_precision),
Recall = c(dt_recall, dt_tuned_recall, rf_recall, rf_tuned_recall,
ada_recall, ada_tuned_recall),
F1_Score = c(dt_f1, dt_tuned_f1, rf_f1, rf_tuned_f1,
ada_f1, ada_tuned_f1),
AUC = c(dt_auc, dt_tuned_auc, rf_auc, rf_tuned_auc,
ada_auc, ada_tuned_auc)
)
# Plot AUC Comparison
ggplot(results, aes(x = reorder(Model, AUC), y = AUC, fill = Model)) +
geom_bar(stat = "identity", color = "black") +
coord_flip() +
theme_minimal() +
labs(title = "AUC Comparison Across Models", x = "Model", y = "AUC")
The table below summarizes model performance. Random Forest showed the highest AUC, indicating the strongest ranking of likely subscribers, while AdaBoost delivered comparably balanced performance.
## # A tibble: 6 × 6
## Model Accuracy Precision Recall F1_Score AUC
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Decision Tree (Default) 0.930 0.936 0.991 0.963 0.768
## 2 Decision Tree (Tuned) 0.930 0.936 0.991 0.963 0.768
## 3 Random Forest (Default) 0.928 0.936 0.989 0.962 0.912
## 4 Random Forest (Tuned) 0.927 0.936 0.987 0.961 0.908
## 5 AdaBoost (Default) 0.927 0.936 0.987 0.961 0.903
## 6 AdaBoost (Tuned) 0.927 0.936 0.987 0.961 0.904
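To complement the AUC bar chart, the full ROC curves can be overlaid with pROC, reusing the probability vectors computed above (a sketch; the colors and layout are arbitrary):

# Overlay ROC curves for the three default models
roc_dt  <- roc(data_test$Subscription, dt_probs)
roc_rf  <- roc(data_test$Subscription, rf_probs)
roc_ada <- roc(data_test$Subscription, ada_probs)
plot(roc_dt, col = "steelblue", legacy.axes = TRUE,
     main = "ROC Curves: Default Models")
lines(roc_rf, col = "darkgreen")
lines(roc_ada, col = "firebrick")
legend("bottomright",
       legend = c("Decision Tree", "Random Forest", "AdaBoost"),
       col = c("steelblue", "darkgreen", "firebrick"), lwd = 2)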
Based on the results, Random Forest is the most effective model for predicting customer subscription to term deposits; its default configuration achieved the highest AUC (0.912). The bank should focus on a more systematic hyperparameter search for the Random Forest and on combining ensemble models to improve generalization.
Final Takeaways: What I Learned Across All Experiments
| Model | Tuning Impact | Insight |
|---|---|---|
| Decision Tree | None | Model already balanced; tuning didn’t improve metrics. |
| Random Forest | Slightly negative | Raising ntree to 200 and setting mtry = 4 marginally reduced recall/F1 and AUC; the default forest was already strong. |
| AdaBoost | Minimal | Performance was already high at 50 iterations; doubling the rounds left the metrics essentially unchanged. |
The objective of this project was to analyze the effectiveness of a bank’s marketing campaign and predict customer subscription to term deposits using three machine learning models: Decision Tree, Random Forest, and AdaBoost. Through systematic experimentation and tuning, the models were evaluated based on key performance metrics, including accuracy, precision, recall, F1-score, and AUC.
Key Findings: The default Decision Tree achieved high recall but only moderate AUC (0.768), and tuning the complexity parameter (cp) and tree depth left every metric unchanged, since the tuned values coincided with the defaults. Random Forest exhibited the strongest predictive power, with the highest AUC (0.912) and balanced accuracy, precision, and recall; the tuned variant performed marginally worse, suggesting the near-default settings were already well calibrated. AdaBoost achieved competitive performance with high recall and AUC (0.903), and doubling the boosting rounds left its metrics essentially unchanged.
Best Model: The Random Forest model, in its default configuration, emerged as the best performer, achieving the highest AUC alongside consistent predictive accuracy. Its ability to capture complex patterns while controlling overfitting makes it the most reliable model for customer targeting.
Recommendations: The bank should deploy the Random Forest model for future marketing campaigns to improve customer targeting and conversion rates. Further gains may come from a more systematic hyperparameter search and from feature selection to refine model performance. Combining ensemble methods such as AdaBoost and Random Forest (for example, by averaging their predicted probabilities) may further enhance predictive accuracy and generalization.