Introduction

Customer acquisition and retention are critical for any financial institution. This project analyzes data from a bank’s marketing campaign to predict whether a customer will subscribe to a term deposit. By applying machine learning algorithms such as Decision Tree, Random Forest, and AdaBoost, this analysis seeks to identify the most effective model for improving customer targeting and campaign performance.

Objective: The goal is to identify the best-performing model by evaluating key metrics such as accuracy and AUC, ultimately guiding the bank toward more targeted and effective marketing strategies.

Approach: Three machine learning algorithms (Decision Tree, Random Forest, and AdaBoost) will be tested in default and tuned configurations. Performance will be evaluated using accuracy, precision, recall, F1-score, and AUC. The best model will be selected based on predictive accuracy and generalization capability, and business recommendations will be provided to improve future marketing strategies.

Getting Started

Load packages

Let’s load the packages.
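The chunk below is a minimal sketch inferred from the functions used throughout this report; the original package list was not shown in the rendered output, so adjust to your environment.

# Load required packages (inferred from the functions used below)
library(dplyr)         # pipes and mutate_if
library(tibble)        # results table
library(ggplot2)       # visualization
library(knitr)         # kable() table rendering
library(caret)         # createDataPartition, trainControl, confusionMatrix
library(rpart)         # decision trees
library(randomForest)  # random forests
library(ada)           # AdaBoost
library(pROC)          # roc() and AUC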

The data

After completing EDA and preprocessing, we saved the cleaned data to a new file, cleaned_data.csv, and proceed here with the next step: Experimentation & Model Training.

# Read a CSV file
bank <- read.csv("https://raw.githubusercontent.com/waheeb123/Data-622/refs/heads/main/cleaned_data.csv")

# Preview the first few rows of the dataset
kable(head(bank, 10), caption = "Preview of the Bank Dataset")
Preview of the Bank Dataset
age job marital education default balance housing loan contact day month duration campaign pdays previous poutcome Subscription contact_success_rate age_group credit_risk
30 unemployed married primary no 1787 no no cellular 19 oct 79 1 -1 0 unknown no 0.0568071 Middle-aged Medium Risk
35 management single tertiary no 1350 yes no cellular 16 apr 185 1 330 1 failure no 0.1069652 Middle-aged Medium Risk
30 management married tertiary no 1476 yes yes unknown 3 jun 199 4 -1 0 unknown no 0.0568071 Middle-aged High Risk
59 blue-collar married secondary no 0 yes no unknown 5 may 226 1 -1 0 unknown no 0.0568071 Senior Medium Risk
35 management single tertiary no 747 no no cellular 23 feb 141 2 176 3 failure no 0.1069652 Middle-aged Medium Risk
36 self-employed married tertiary no 307 yes no cellular 14 may 341 1 330 2 other no 0.1428571 Middle-aged Medium Risk
39 technician married secondary no 147 yes no cellular 6 may 151 2 -1 0 unknown no 0.0568071 Middle-aged Medium Risk
41 entrepreneur married tertiary no 221 yes no unknown 14 may 57 2 -1 0 unknown no 0.0568071 Middle-aged Medium Risk
43 services married primary no -88 yes yes cellular 17 apr 313 1 147 2 failure no 0.1069652 Middle-aged High Risk
43 admin. married secondary no 264 yes no cellular 17 apr 113 2 -1 0 unknown no 0.0568071 Middle-aged Medium Risk

Experimentation & Model Training

To evaluate the models, we split the data into 70/30 training and test sets and defined a 10-fold cross-validation control for resampling. We also prepared scaled copies of the numeric features; the tree-based models used here are scale-invariant, so they are trained on the unscaled data. We then built a Decision Tree model using default settings.

# Split data into train and test sets
set.seed(123)
trainIndex <- createDataPartition(bank$Subscription, p = 0.7, list = FALSE)
data_train <- bank[trainIndex, ]
data_test <- bank[-trainIndex, ]

# Convert 'Subscription' column to factor for classification
data_train$Subscription <- as.factor(data_train$Subscription)
data_test$Subscription <- as.factor(data_test$Subscription)
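
# Added diagnostic (not part of the original analysis): check class balance.
# The target skews heavily toward "no", which inflates raw accuracy.
prop.table(table(data_train$Subscription))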

# Scale numeric features using parameters estimated on the training set
# (scaling the test set independently would leak test-set statistics;
# the scaled copies are kept for completeness, since tree-based models
# are scale-invariant)
pre_proc <- preProcess(data_train, method = c("center", "scale"))
data_train_scaled <- predict(pre_proc, data_train)
data_test_scaled <- predict(pre_proc, data_test)

# Set up cross-validation control (10-fold cross-validation)
# Note: the models below are fit directly with rpart/randomForest/ada,
# so this control object applies only when caret::train is used
train_control <- trainControl(method = "cv", number = 10, 
                              savePredictions = "all", 
                              classProbs = TRUE, 
                              summaryFunction = twoClassSummary)
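
As a sketch of how this control object could be consumed (hypothetical, not part of the reported results), a cross-validated decision tree fit via caret would look like this:

# Hypothetical: cross-validated decision tree via caret::train,
# optimizing ROC as enabled by twoClassSummary above
dt_cv <- train(Subscription ~ ., data = data_train,
               method = "rpart",
               trControl = train_control,
               metric = "ROC")
dt_cv$results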

Experiment 1: Decision Tree (Default)

Objective: Test the default decision tree model to evaluate its baseline performance on the classification task.

Variation: No tuning applied; default settings used for benchmarking.

Variation is meaningful as it sets a baseline to measure tuning effectiveness.

Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC will be computed.

# Build Decision Tree model with default settings
dt_model <- rpart(Subscription ~ ., data = data_train, method = "class")

# Predict using test data (column 2 of the probability matrix holds the
# probability of the "yes" class)
dt_probs <- predict(dt_model, data_test, type = "prob")[, 2]
dt_preds <- predict(dt_model, data_test, type = "class")

# Evaluate metrics
# Note: confusionMatrix treats the first factor level ("no") as the
# positive class by default, so precision and recall below describe the
# majority non-subscriber class; pass positive = "yes" to score subscribers.
dt_confusion <- confusionMatrix(dt_preds, data_test$Subscription)
dt_accuracy <- dt_confusion$overall['Accuracy']
dt_precision <- dt_confusion$byClass['Pos Pred Value']
dt_recall <- dt_confusion$byClass['Sensitivity']
dt_f1 <- 2 * (dt_precision * dt_recall) / (dt_precision + dt_recall)
dt_auc <- roc(data_test$Subscription, dt_probs)$auc

cat(sprintf("\nDecision Tree (Default) - Accuracy: %.4f, Precision: %.4f, Recall: %.4f, F1-score: %.4f, AUC: %.4f\n", 
            dt_accuracy, dt_precision, dt_recall, dt_f1, dt_auc))
## 
## Decision Tree (Default) - Accuracy: 0.9301, Precision: 0.9365, Recall: 0.9912, F1-score: 0.9631, AUC: 0.7676

Conclusion: The default decision tree posts high accuracy and recall, but both are inflated by class imbalance (recall here refers to the majority "no" class); the moderate AUC shows the tree ranks prospective subscribers weakly.

Recommendation: Further tuning of cp and maxdepth or switching to ensemble methods may improve AUC.

Result logged in results table for comparison.
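
The same evaluation block recurs in every experiment below. As an added convenience (not part of the original analysis), a small helper could compute all five metrics in one call:

# Hypothetical helper: accuracy, precision, recall, F1, and AUC from
# class predictions, "yes" probabilities, and the true labels
evaluate_model <- function(preds, probs, truth) {
  cm <- confusionMatrix(preds, truth)
  precision <- unname(cm$byClass['Pos Pred Value'])
  recall <- unname(cm$byClass['Sensitivity'])
  c(Accuracy = unname(cm$overall['Accuracy']),
    Precision = precision,
    Recall = recall,
    F1 = 2 * precision * recall / (precision + recall),
    AUC = as.numeric(roc(truth, probs)$auc))
}

# Usage: evaluate_model(dt_preds, dt_probs, data_test$Subscription)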

Experiment 2: Decision Tree (Tuned)

Objective: Optimize decision tree model performance by adjusting hyperparameters.

Variation: Tuning complexity parameter (cp) and maximum tree depth (maxdepth).

Variation is meaningful since tuning cp and maxdepth impacts overfitting and model complexity.

Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC will be computed.

# Build tuned Decision Tree model
dt_tuned <- rpart(Subscription ~ ., data = data_train, method = "class", 
                  control = rpart.control(cp = 0.01, maxdepth = 5))

# Predict using test data
dt_tuned_probs <- predict(dt_tuned, data_test, type = "prob")[, 2]
dt_tuned_preds <- predict(dt_tuned, data_test, type = "class")

# Evaluate metrics
dt_tuned_confusion <- confusionMatrix(dt_tuned_preds, data_test$Subscription)
dt_tuned_accuracy <- dt_tuned_confusion$overall['Accuracy']
dt_tuned_precision <- dt_tuned_confusion$byClass['Pos Pred Value']
dt_tuned_recall <- dt_tuned_confusion$byClass['Sensitivity']
dt_tuned_f1 <- 2 * (dt_tuned_precision * dt_tuned_recall) / (dt_tuned_precision + dt_tuned_recall)
dt_tuned_auc <- roc(data_test$Subscription, dt_tuned_probs)$auc

cat(sprintf("\nDecision Tree (Tuned) - Accuracy: %.4f, Precision: %.4f, Recall: %.4f, F1-score: %.4f, AUC: %.4f\n", 
            dt_tuned_accuracy, dt_tuned_precision, dt_tuned_recall, dt_tuned_f1, dt_tuned_auc))
## 
## Decision Tree (Tuned) - Accuracy: 0.9301, Precision: 0.9365, Recall: 0.9912, F1-score: 0.9631, AUC: 0.7676

Conclusion: The tuned settings produced metrics identical to the default model. This is expected: rpart’s default cp is already 0.01, and the default tree evidently never grows deeper than 5 levels, so the constraints did not change the fitted tree.

Recommendation: Try ensemble models like Random Forest or AdaBoost for improved performance.

Result logged in results table for comparison.
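
Rather than fixing cp = 0.01 by hand, rpart’s built-in cross-validation can suggest a value; the following sketch uses standard rpart tooling on the model fitted above.

# Inspect cross-validated error across candidate cp values
printcp(dt_tuned)

# Prune at the cp value minimizing cross-validated error (xerror)
best_cp <- dt_tuned$cptable[which.min(dt_tuned$cptable[, "xerror"]), "CP"]
dt_pruned <- prune(dt_tuned, cp = best_cp)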

Experiment 3: Random Forest (Default)

Objective: Evaluate the baseline performance of a Random Forest model on the classification task.

Variation: No tuning applied; ntree set to 100 for a faster baseline (randomForest’s own default is 500).

Variation is meaningful because it establishes a benchmark for comparison with tuned models.

Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC.

# Build Random Forest model (ntree lowered to 100 for speed; other
# hyperparameters left at their defaults)
rf_model <- randomForest(Subscription ~ ., data = data_train, ntree = 100)

# Predict using test data
rf_probs <- predict(rf_model, data_test, type = "prob")[, 2]
rf_preds <- predict(rf_model, data_test, type = "class")

# Evaluate metrics
rf_confusion <- confusionMatrix(rf_preds, data_test$Subscription)
rf_accuracy <- rf_confusion$overall['Accuracy']
rf_precision <- rf_confusion$byClass['Pos Pred Value']
rf_recall <- rf_confusion$byClass['Sensitivity']
rf_f1 <- 2 * (rf_precision * rf_recall) / (rf_precision + rf_recall)
rf_auc <- roc(data_test$Subscription, rf_probs)$auc

cat(sprintf("\nRandom Forest (Default) - Accuracy: %.4f, Precision: %.4f, Recall: %.4f, F1-score: %.4f, AUC: %.4f\n", 
            rf_accuracy, rf_precision, rf_recall, rf_f1, rf_auc))
## 
## Random Forest (Default) - Accuracy: 0.9283, Precision: 0.9363, Recall: 0.9893, F1-score: 0.9621, AUC: 0.9119

Conclusion: The Random Forest model with default settings matched the decision tree on accuracy but lifted AUC substantially (0.912 vs 0.768), indicating much stronger ranking of prospective subscribers.

Recommendation: Further tuning can focus on adjusting the number of trees and the number of variables per split.

Result logged in results table for comparison.

Experiment 4: Random Forest (Tuned)

Objective: Improve Random Forest performance by tuning hyperparameters.

Variation: Increased ntree to 200 and adjusted mtry to 4.

Variation is meaningful in principle because more trees reduce variance; note, however, that with the roughly 19 predictors here, mtry = 4 equals floor(sqrt(p)), randomForest’s classification default, so the effective change is only the larger forest.

Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC.

# Build tuned Random Forest model
rf_tuned <- randomForest(Subscription ~ ., data = data_train, ntree = 200, mtry = 4)

# Predict using test data
rf_tuned_probs <- predict(rf_tuned, data_test, type = "prob")[, 2]
rf_tuned_preds <- predict(rf_tuned, data_test, type = "class")

# Evaluate metrics
rf_tuned_confusion <- confusionMatrix(rf_tuned_preds, data_test$Subscription)
rf_tuned_accuracy <- rf_tuned_confusion$overall['Accuracy']
rf_tuned_precision <- rf_tuned_confusion$byClass['Pos Pred Value']
rf_tuned_recall <- rf_tuned_confusion$byClass['Sensitivity']
rf_tuned_f1 <- 2 * (rf_tuned_precision * rf_tuned_recall) / (rf_tuned_precision + rf_tuned_recall)
rf_tuned_auc <- roc(data_test$Subscription, rf_tuned_probs)$auc

cat(sprintf("\nRandom Forest (Tuned) - Accuracy: %.4f, Precision: %.4f, Recall: %.4f, F1-score: %.4f, AUC: %.4f\n", 
            rf_tuned_accuracy, rf_tuned_precision, rf_tuned_recall, rf_tuned_f1, rf_tuned_auc))
## 
## Random Forest (Tuned) - Accuracy: 0.9265, Precision: 0.9362, Recall: 0.9873, F1-score: 0.9611, AUC: 0.9077

Conclusion: Tuning produced marginal drops in both accuracy (0.9283 to 0.9265) and AUC (0.9119 to 0.9077). Since mtry = 4 matches the default, these differences most likely reflect run-to-run randomness between forests rather than a real change in generalization.

Recommendation: Further adjustments to mtry or implementing feature selection may improve performance.

Result logged in results table for comparison.
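
One data-driven way to pick mtry is randomForest’s tuneRF, which steps mtry up and down until the out-of-bag error stops improving. A sketch, assuming categorical columns are converted to factors first:

# Hypothetical mtry search driven by out-of-bag error
predictors <- data_train %>%
  mutate(across(where(is.character), as.factor)) %>%
  select(-Subscription)

set.seed(123)
tune_res <- tuneRF(x = predictors,
                   y = data_train$Subscription,
                   ntreeTry = 200,
                   stepFactor = 1.5,  # scale mtry by this factor each step
                   improve = 0.01,    # minimum relative OOB gain to continue
                   trace = TRUE, plot = FALSE)

# Variable importance from the tuned forest can guide feature selection
importance(rf_tuned)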

Experiment 5: AdaBoost (Default)

Objective: Evaluate the baseline performance of an AdaBoost model on the classification task.

Variation: No tuning applied; using default iter = 50.

Variation is meaningful because it establishes a benchmark for comparison with tuned models.

Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC.

# AdaBoost (Default) Model
library(ada)

# Train AdaBoost model
ada_model <- ada(Subscription ~ ., data = data_train, iter = 50)

# Predict using test data
ada_probs <- predict(ada_model, data_test, type = "prob")[, 2]
ada_preds <- predict(ada_model, data_test, type = "class")

# Evaluate metrics
ada_confusion <- confusionMatrix(ada_preds, data_test$Subscription)
ada_accuracy <- ada_confusion$overall['Accuracy']
ada_precision <- ada_confusion$byClass['Pos Pred Value']
ada_recall <- ada_confusion$byClass['Sensitivity']
ada_f1 <- 2 * (ada_precision * ada_recall) / (ada_precision + ada_recall)
ada_auc <- roc(data_test$Subscription, ada_probs)$auc

cat(sprintf("\nAdaBoost (Default) - Accuracy: %.4f, Precision: %.4f, Recall: %.4f, F1-score: %.4f, AUC: %.4f\n", 
            ada_accuracy, ada_precision, ada_recall, ada_f1, ada_auc))
## 
## AdaBoost (Default) - Accuracy: 0.9265, Precision: 0.9362, Recall: 0.9873, F1-score: 0.9611, AUC: 0.9034

Conclusion: The AdaBoost model with default settings achieved strong performance, with high accuracy and AUC.

Recommendation: Further tuning of the number of iterations may improve generalization and reduce variance.

Result logged in results table for comparison.

Experiment 6: AdaBoost (Tuned)

Objective: Improve AdaBoost performance by tuning hyperparameters.

Variation: Increased the number of boosting iterations from 50 to 100.

Variation is meaningful because increasing iterations can reduce bias and improve model performance.

Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC.

# AdaBoost (Tuned) Model
# Tuning AdaBoost model - for example, increase iterations
ada_tuned_model <- ada(Subscription ~ ., data = data_train, iter = 100)

# Predict using test data
ada_tuned_probs <- predict(ada_tuned_model, data_test, type = "prob")[, 2]
ada_tuned_preds <- predict(ada_tuned_model, data_test, type = "class")

# Evaluate metrics
ada_tuned_confusion <- confusionMatrix(ada_tuned_preds, data_test$Subscription)
ada_tuned_accuracy <- ada_tuned_confusion$overall['Accuracy']
ada_tuned_precision <- ada_tuned_confusion$byClass['Pos Pred Value']
ada_tuned_recall <- ada_tuned_confusion$byClass['Sensitivity']
ada_tuned_f1 <- 2 * (ada_tuned_precision * ada_tuned_recall) / (ada_tuned_precision + ada_tuned_recall)
ada_tuned_auc <- roc(data_test$Subscription, ada_tuned_probs)$auc

cat(sprintf("\nAdaBoost (Tuned) - Accuracy: %.4f, Precision: %.4f, Recall: %.4f, F1-score: %.4f, AUC: %.4f\n", 
            ada_tuned_accuracy, ada_tuned_precision, ada_tuned_recall, ada_tuned_f1, ada_tuned_auc))
## 
## AdaBoost (Tuned) - Accuracy: 0.9265, Precision: 0.9362, Recall: 0.9873, F1-score: 0.9611, AUC: 0.9038

Conclusion: Doubling the boosting rounds left accuracy, precision, recall, and F1-score unchanged and moved AUC only marginally (0.9034 to 0.9038), suggesting the ensemble had essentially converged by 50 iterations.

Recommendation: Further adjustments to learning rate or max depth may enhance performance.

Result logged in results table for comparison.
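
The ada package also exposes a learning rate (nu) and the depth of the base trees through an rpart control object. A hypothetical follow-up along the lines of the recommendation above (settings illustrative, not validated here):

# Hypothetical: shrink each boosting step and restrict base-tree depth
ada_shrunk <- ada(Subscription ~ ., data = data_train,
                  iter = 100, nu = 0.5,
                  control = rpart.control(maxdepth = 2))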

Results and Visualization

Print the results in a table

Objective: Compare model performance across Decision Tree, Random Forest, and AdaBoost variations.

Variation: Models were tuned to assess impact on accuracy and AUC.

Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and AUC.

# Store results for all six experiments
results <- tibble(
  Model = c("Decision Tree (Default)", "Decision Tree (Tuned)", 
            "Random Forest (Default)", "Random Forest (Tuned)", 
            "AdaBoost (Default)", "AdaBoost (Tuned)"),
  Accuracy = c(dt_accuracy, dt_tuned_accuracy, rf_accuracy, rf_tuned_accuracy, 
               ada_accuracy, ada_tuned_accuracy),
  Precision = c(dt_precision, dt_tuned_precision, rf_precision, rf_tuned_precision, 
                ada_precision, ada_tuned_precision),
  Recall = c(dt_recall, dt_tuned_recall, rf_recall, rf_tuned_recall, 
             ada_recall, ada_tuned_recall),
  F1_Score = c(dt_f1, dt_tuned_f1, rf_f1, rf_tuned_f1, 
               ada_f1, ada_tuned_f1),
  AUC = c(dt_auc, dt_tuned_auc, rf_auc, rf_tuned_auc, 
          ada_auc, ada_tuned_auc)
)

# Plot AUC Comparison
ggplot(results, aes(x = reorder(Model, AUC), y = AUC, fill = Model)) +
  geom_col(color = "black") +
  coord_flip() +
  theme_minimal() +
  labs(title = "AUC Comparison Across Models", x = "Model", y = "AUC")

The table below summarizes model performance. Random Forest showed the highest AUC, indicating the strongest ability to distinguish subscribers from non-subscribers, while AdaBoost delivered comparably balanced performance.

# Display Results
print(results)
## # A tibble: 6 × 6
##   Model                   Accuracy Precision Recall F1_Score   AUC
##   <chr>                      <dbl>     <dbl>  <dbl>    <dbl> <dbl>
## 1 Decision Tree (Default)    0.930     0.936  0.991    0.963 0.768
## 2 Decision Tree (Tuned)      0.930     0.936  0.991    0.963 0.768
## 3 Random Forest (Default)    0.928     0.936  0.989    0.962 0.912
## 4 Random Forest (Tuned)      0.927     0.936  0.987    0.961 0.908
## 5 AdaBoost (Default)         0.927     0.936  0.987    0.961 0.903
## 6 AdaBoost (Tuned)           0.927     0.936  0.987    0.961 0.904
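
Beyond the bar chart, overlaying ROC curves makes the gap between the single tree and the ensembles visible at every threshold; a sketch reusing the pROC objects computed earlier:

# Overlay test-set ROC curves for the three default models
plot(roc(data_test$Subscription, rf_probs), col = "forestgreen",
     legacy.axes = TRUE, main = "Test-Set ROC Curves")
lines(roc(data_test$Subscription, ada_probs), col = "steelblue")
lines(roc(data_test$Subscription, dt_probs), col = "firebrick")
legend("bottomright",
       legend = c("Random Forest", "AdaBoost", "Decision Tree"),
       col = c("forestgreen", "steelblue", "firebrick"), lwd = 2)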

Based on the results, Random Forest is the most effective model for predicting customer subscription to term deposits; notably, the default configuration slightly outperformed the tuned one on AUC (0.912 vs 0.908). The bank should continue refining Random Forest hyperparameters and consider combining ensemble models to improve generalization.

Conclusion

The objective of this project was to analyze the effectiveness of a bank’s marketing campaign and predict customer subscription to term deposits using three machine learning models: Decision Tree, Random Forest, and AdaBoost. Through systematic experimentation and tuning, the models were evaluated based on key performance metrics, including accuracy, precision, recall, F1-score, and AUC.

Key Findings:

Decision Tree: The default Decision Tree model demonstrated high recall for the majority class but only a moderate AUC, reflecting weak ranking of prospective subscribers. Constraining the complexity parameter (cp) and tree depth left the test metrics unchanged, since the default tree already satisfied those constraints.

Random Forest: The Random Forest model exhibited the strongest predictive power, with the highest AUC and balanced accuracy, precision, and recall. Tuning (a larger forest with mtry = 4) left performance essentially unchanged.

AdaBoost: AdaBoost achieved competitive performance with high recall and AUC. Doubling the boosting rounds moved the metrics only marginally, suggesting the ensemble had already converged by 50 iterations.

Best Model: Random Forest emerged as the best-performing model, achieving the highest AUC (0.912, in its default configuration) with accuracy on par with the other models. Its ability to capture complex patterns while averaging away the overfitting of single trees makes it the most reliable choice for customer targeting.

Recommendations: The bank should deploy the Random Forest model in future marketing campaigns to improve customer targeting and conversion rates. Further gains may come from a systematic hyperparameter search (e.g., over ntree and mtry) and from feature selection guided by variable importance. Combining ensemble methods such as AdaBoost and Random Forest may further enhance predictive accuracy and generalization.