Introduction

In machine learning, experimentation refers to the systematic process of designing, executing, and analyzing different configurations to identify the settings that perform best on a given task. Experimentation is learning by doing. It involves changing parameters in a controlled way, evaluating results with metrics, and comparing different approaches to find the best solution. In short, it is the practice of testing and refining machine learning models through controlled experiments to improve their performance.

The key is to modify only one or a few variables at a time to isolate the impact of each change and understand its effect on model performance. In the assignment you will conduct at least 6 experiments. In real life, data scientists run anywhere from a dozen to hundreds of experiments (depending on the dataset and problem domain).
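As a rough illustration of the "one variable at a time" idea, the sketch below builds a set of configurations that each differ from a baseline in exactly one parameter. The parameter names and values (`cp`, `maxdepth`, `minsplit`) and the helper `one_at_a_time()` are illustrative assumptions, not settings used later in this report.

```r
# Hypothetical baseline configuration (values are illustrative only)
baseline <- list(cp = 0.01, maxdepth = 30, minsplit = 20)

# Candidate values to try, one parameter at a time
candidates <- list(cp = c(0.001, 0.005), maxdepth = c(5, 10))

# Build one configuration per candidate value, changing a single parameter
# and keeping every other parameter at its baseline value
one_at_a_time <- function(baseline, candidates) {
  configs <- list()
  for (param in names(candidates)) {
    for (value in candidates[[param]]) {
      cfg <- baseline
      cfg[[param]] <- value
      configs[[length(configs) + 1]] <- cfg
    }
  }
  configs
}

configs <- one_at_a_time(baseline, candidates)

# Number of parameters that differ from the baseline in each configuration
n_changed <- sapply(configs, function(cfg) sum(unlist(cfg) != unlist(baseline)))
print(n_changed)
```

Because every configuration changes a single parameter, any difference in the evaluation metric can be attributed to that parameter rather than to an interaction between several changes.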

Libraries required for this project

library(tidyverse)
library(openintro)
library(reshape2)
library(infer)
library(dplyr)
library(knitr)
library(corrplot)
library(ggcorrplot)
library(ggthemes)
library(caret)
library(kableExtra)
library(ROSE)
library(rpart)
library(rpart.plot)
library(randomForest)
library(adabag)
library(pROC)

Assignment

This assignment consists of conducting at least two (2) experiments for different algorithms: Decision Trees, Random Forest and Adaboost. That is, at least six (6) experiments in total (3 algorithms x 2 experiments each). For each experiment you will define what you are trying to achieve (before each run), conduct the experiment, and at the end you will review how your experiment went. These experiments will allow you to compare algorithms and choose the optimal model.

Using the dataset and EDA from the previous assignment

# Load dataset
dataBank <- read.csv("C:/Users/vitug/OneDrive/Desktop/CUNY Masters/DATA_622/bank data.csv", stringsAsFactors = TRUE)
kable(head(dataBank, 10), caption = "Bank Dataset")
Bank Dataset

| X | age | job | marital | education | default | balance | housing | loan | contact | day | month | campaign | previous | term | age_group | credit_risk | Subscription |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 58 | management | married | tertiary | no | 2143 | yes | no | unknown | 5 | may | 1 | 0 | no | Senior | Medium Risk | no |
| 2 | 44 | technician | single | secondary | no | 29 | yes | no | unknown | 5 | may | 1 | 0 | no | Middle-aged | Medium Risk | no |
| 3 | 33 | entrepreneur | married | secondary | no | 2 | yes | yes | unknown | 5 | may | 1 | 0 | no | Middle-aged | High Risk | no |
| 4 | 47 | blue-collar | married | unknown | no | 1506 | yes | no | unknown | 5 | may | 1 | 0 | no | Middle-aged | Medium Risk | no |
| 5 | 33 | unknown | single | unknown | no | 1 | no | no | unknown | 5 | may | 1 | 0 | no | Middle-aged | Medium Risk | no |
| 6 | 35 | management | married | tertiary | no | 231 | yes | no | unknown | 5 | may | 1 | 0 | no | Middle-aged | Medium Risk | no |
| 7 | 28 | management | single | tertiary | no | 447 | yes | yes | unknown | 5 | may | 1 | 0 | no | Middle-aged | High Risk | no |
| 8 | 42 | entrepreneur | divorced | tertiary | yes | 2 | yes | no | unknown | 5 | may | 1 | 0 | no | Middle-aged | Medium Risk | no |
| 9 | 58 | retired | married | primary | no | 121 | yes | no | unknown | 5 | may | 1 | 0 | no | Senior | Medium Risk | no |
| 10 | 43 | technician | single | secondary | no | 593 | yes | no | unknown | 5 | may | 1 | 0 | no | Middle-aged | Medium Risk | no |

# Check target variable distribution
table(dataBank$term, dataBank$Subscription)
##      
##          no   yes
##   no  39922     0
##   yes     0  5289
# Check for missing values
sum(is.na(dataBank))
## [1] 0
Data Preprocessing

In the previous assignment we performed some EDA on the dataset, and saving the result added a new index column (X). I will remove that column. I will also remove the “Subscription” column, which duplicates the target variable “term” (the cross-table above shows the two always agree), to prevent data leakage.

#remove unnecessary columns
dataBank$X <- NULL
# Find relationship between 'term' and 'Subscription', remove if necessary.
if(all(dataBank$term == dataBank$Subscription) || 
   cor(as.numeric(dataBank$term), as.numeric(dataBank$Subscription)) > 0.9) {
  dataBank$Subscription <- NULL
  print("Removed Subscription feature due to high correlation with target variable")
}
## [1] "Removed Subscription feature due to high correlation with target variable"
I will split the data into training and test sets before fitting the algorithms and running the experiments.
# Create train/test split (70/30)
set.seed(123) 
trainIndex <- createDataPartition(dataBank$term, p = 0.7, list = FALSE)
trainData <- dataBank[trainIndex, ]
testData <- dataBank[-trainIndex, ]
# check the balance between classes
prop.table(table(trainData$term))
## 
##        no       yes 
## 0.8829979 0.1170021

perform the following:

Sections

1. Algorithm Selection

You will perform experiments using the following algorithms:

algorithms

Decision Trees

# Decision Tree Baseline
dt_baseline <- rpart(term ~ ., data = trainData, method = "class")
dt_pred <- predict(dt_baseline, testData, type = "class")
dt_cm <- confusionMatrix(dt_pred, testData$term, positive = "yes")
dt_roc <- roc(testData$term, predict(dt_baseline, testData, type = "prob")[, "yes"])
## Setting levels: control = no, case = yes
## Setting direction: controls < cases

Random Forest

# Random Forest Baseline
rf_baseline <- randomForest(term ~ ., data = trainData)
rf_pred <- predict(rf_baseline, testData)
rf_cm <- confusionMatrix(rf_pred, testData$term, positive = "yes")
rf_roc <- roc(testData$term, predict(rf_baseline, testData, type = "prob")[, "yes"])
## Setting levels: control = no, case = yes
## Setting direction: controls < cases

Adaboost

# AdaBoost Baseline
ada_baseline <- boosting(term ~ ., data = trainData, mfinal = 50)
ada_pred <- predict(ada_baseline, testData)
ada_cm <- confusionMatrix(as.factor(ada_pred$class), testData$term, positive = "yes")

Table with baseline model results

# Store baseline results
baseline_results <- data.frame(
  Algorithm = c("Decision Tree", "Random Forest", "AdaBoost"),
  Accuracy = c(dt_cm$overall["Accuracy"], rf_cm$overall["Accuracy"], ada_cm$overall["Accuracy"]),
  F1_Score = c(dt_cm$byClass["F1"], rf_cm$byClass["F1"], ada_cm$byClass["F1"]),
  Sensitivity = c(dt_cm$byClass["Sensitivity"], rf_cm$byClass["Sensitivity"], ada_cm$byClass["Sensitivity"]),
  Specificity = c(dt_cm$byClass["Specificity"], rf_cm$byClass["Specificity"], ada_cm$byClass["Specificity"])
)
print(baseline_results)
##       Algorithm  Accuracy  F1_Score Sensitivity Specificity
## 1 Decision Tree 0.8830556        NA   0.0000000   1.0000000
## 2 Random Forest 0.8868161 0.2817033   0.1897856   0.9791249
## 3      AdaBoost 0.8829819 0.2886598   0.2030265   0.9730294

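Based on the table above, the Random Forest model has the highest accuracy (0.887). The Decision Tree baseline predicts "no" for every observation, so its specificity of 1.000 and sensitivity of 0 are trivial and its F1 Score is undefined (NA). Among the models that predict any positives, Random Forest has the best specificity (0.979), while AdaBoost has the best F1 Score (0.289) and the best sensitivity (0.203).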
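To make the NA in the Decision Tree row concrete, the sketch below computes the same metrics by hand from confusion-matrix counts. The helper `binary_metrics()` is a hypothetical name introduced here for illustration. The test-set class counts (11,976 "no" and 1,586 "yes") are taken from the column totals of the confusion matrices printed later in this report.

```r
# Compute standard binary-classification metrics from confusion-matrix counts.
# tp/fp/fn/tn are counts with "yes" as the positive class.
binary_metrics <- function(tp, fp, fn, tn) {
  sensitivity <- tp / (tp + fn)          # recall on the "yes" class
  specificity <- tn / (tn + fp)          # recall on the "no" class
  precision   <- tp / (tp + fp)          # 0/0 yields NaN when nothing is predicted "yes"
  f1          <- 2 * precision * sensitivity / (precision + sensitivity)
  accuracy    <- (tp + tn) / (tp + fp + fn + tn)
  list(accuracy = accuracy, f1 = f1,
       sensitivity = sensitivity, specificity = specificity)
}

# A classifier that always predicts "no": no true or false positives at all
all_no <- binary_metrics(tp = 0, fp = 0, fn = 1586, tn = 11976)
print(all_no)
```

With no positive predictions, precision is 0/0, so F1 cannot be computed even though accuracy matches the majority-class share. This is why accuracy on its own is a weak yardstick for this dataset.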

2. Experiment

For each of the algorithms (above), perform at least two (2) experiments. In a typical experiment you should:
- Define the objective of the experiment (hypothesis)
- Decide what will change, and what will stay the same
- Select the evaluation metric (what you want to measure)
- Perform the experiment
- Document the experiment so you can compare results (track progress)
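One lightweight way to follow these steps is to record each run in a small table. The sketch below is only an illustration: the helper `log_experiment()` and the column names are assumptions, and the two AUC values are copied from the experiment output reported later in this document.

```r
# Empty experiment log: one row per experiment
experiment_log <- data.frame(
  id = character(), objective = character(), changed = character(),
  metric = character(), value = numeric(), stringsAsFactors = FALSE
)

# Append one experiment record to the log and return the updated log
log_experiment <- function(log, id, objective, changed, metric, value) {
  rbind(log, data.frame(id = id, objective = objective, changed = changed,
                        metric = metric, value = value,
                        stringsAsFactors = FALSE))
}

experiment_log <- log_experiment(experiment_log, "dt_exp1",
                                 "Tune cp to improve AUC", "cp", "AUC", 0.6512247)
experiment_log <- log_experiment(experiment_log, "rf_exp1",
                                 "Tune mtry to improve AUC", "mtry", "AUC", 0.7739928)

# Rank experiments by the chosen metric, best first
print(experiment_log[order(-experiment_log$value), ])
```

Keeping the objective and the changed parameter next to the metric makes it easier to review afterwards whether each hypothesis held.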

experiments

Decision tree Exp 1: Find optimal complexity parameter (cp)

In the first Decision Tree experiment, I will tune the complexity parameter (cp) to test whether it improves model performance.

# Experiment 1:
dt_exp1 <- function() {
  # Set up cross-validation
  ctrl <- trainControl(
    method = "cv",
    number = 5,
    classProbs = TRUE,
    summaryFunction = twoClassSummary
  )
  
  # Define parameter grid
  grid <- expand.grid(cp = seq(0.001, 0.05, by = 0.005))
  
  # Train with parameter tuning
  dt_tuned <- train(
    term ~ .,
    data = trainData,
    method = "rpart",
    trControl = ctrl,
    tuneGrid = grid,
    metric = "ROC"
  )
  
  # Best parameter
  best_cp <- dt_tuned$bestTune$cp
  print(paste("Best cp value:", best_cp))
  
  # Train final model with best parameter
  dt_final <- rpart(term ~ ., data = trainData, method = "class", cp = best_cp)
  
  # Evaluate
  dt_pred <- predict(dt_final, testData, type = "class")
  dt_cm <- confusionMatrix(dt_pred, testData$term, positive = "yes")
  dt_roc <- roc(testData$term, predict(dt_final, testData, type = "prob")[, "yes"])
  
  # Print results
  print("Decision Tree - Experiment 1 (CP Tuning) Results:")
  print(dt_cm)
  print(paste("AUC:", auc(dt_roc)))
  
  # Return model and metrics
  return(list(
    model = dt_final,
    confusion_matrix = dt_cm,
    roc = dt_roc,
    auc = auc(dt_roc),
    best_param = best_cp
  ))
}

Decision tree Exp 2: Feature selection

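In the second Decision Tree experiment, I will test whether using only the most important features improves model generalization. Note that in the run shown below, the initial tree produced no variable importance, so the function fell back to Experiment 1 and its results are identical to the CP-tuning run.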

# Experiment 2:
dt_exp2 <- function() {
  # Train initial model
  dt_init <- rpart(term ~ ., data = trainData, method = "class")
  
  # Check if variable importance exists
  if (!exists("variable.importance", dt_init) || length(dt_init$variable.importance) == 0) {
    print("No variable importance found. Using all features.")
    return(dt_exp1())  # Fall back to experiment 1
  }
  
  # Get variable importance
  importance <- dt_init$variable.importance
  
  # Determine how many features to select (min of 5 or what's available)
  n_features <- min(5, length(importance))
  
  # Make sure we have at least one feature
  if (n_features < 1) {
    print("Not enough important features found. Using all features.")
    return(dt_exp1())  # Fall back to experiment 1
  }
  
  # Select top important features
  top_features <- names(importance)[1:n_features]
  print("Top features for Decision Tree:")
  print(top_features)
  
  # Create formula with only important features
  formula_str <- paste("term ~", paste(top_features, collapse = " + "))
  print(paste("Formula:", formula_str))
  formula <- as.formula(formula_str)
  
  # Train model with selected features
  dt_features <- rpart(formula, data = trainData, method = "class")
  
  # Evaluate
  dt_pred <- predict(dt_features, testData, type = "class")
  dt_cm <- confusionMatrix(dt_pred, testData$term, positive = "yes")
  dt_roc <- roc(testData$term, predict(dt_features, testData, type = "prob")[, "yes"])
  
  # Print results
  print("Decision Tree - Experiment 2 (Feature Selection) Results:")
  print(dt_cm)
  print(paste("AUC:", auc(dt_roc)))
  
  # Return model and metrics
  return(list(
    model = dt_features,
    confusion_matrix = dt_cm,
    roc = dt_roc,
    auc = auc(dt_roc),
    features = top_features
  ))
}
# Run experiments
dt_result1 <- dt_exp1()
## [1] "Best cp value: 0.001"
## Setting levels: control = no, case = yes
## Setting direction: controls < cases
## [1] "Decision Tree - Experiment 1 (CP Tuning) Results:"
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction    no   yes
##        no  11767  1388
##        yes   209   198
##                                           
##                Accuracy : 0.8822          
##                  95% CI : (0.8767, 0.8876)
##     No Information Rate : 0.8831          
##     P-Value [Acc > NIR] : 0.6219          
##                                           
##                   Kappa : 0.1585          
##                                           
##  Mcnemar's Test P-Value : <2e-16          
##                                           
##             Sensitivity : 0.12484         
##             Specificity : 0.98255         
##          Pos Pred Value : 0.48649         
##          Neg Pred Value : 0.89449         
##              Prevalence : 0.11694         
##          Detection Rate : 0.01460         
##    Detection Prevalence : 0.03001         
##       Balanced Accuracy : 0.55370         
##                                           
##        'Positive' Class : yes             
##                                           
## [1] "AUC: 0.651224738253304"
dt_result2 <- dt_exp2()
## [1] "No variable importance found. Using all features."
## [1] "Best cp value: 0.001"
## Setting levels: control = no, case = yes
## Setting direction: controls < cases
## [1] "Decision Tree - Experiment 1 (CP Tuning) Results:"
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction    no   yes
##        no  11767  1388
##        yes   209   198
##                                           
##                Accuracy : 0.8822          
##                  95% CI : (0.8767, 0.8876)
##     No Information Rate : 0.8831          
##     P-Value [Acc > NIR] : 0.6219          
##                                           
##                   Kappa : 0.1585          
##                                           
##  Mcnemar's Test P-Value : <2e-16          
##                                           
##             Sensitivity : 0.12484         
##             Specificity : 0.98255         
##          Pos Pred Value : 0.48649         
##          Neg Pred Value : 0.89449         
##              Prevalence : 0.11694         
##          Detection Rate : 0.01460         
##    Detection Prevalence : 0.03001         
##       Balanced Accuracy : 0.55370         
##                                           
##        'Positive' Class : yes             
##                                           
## [1] "AUC: 0.651224738253304"

Random Forest Exp 1: Tune mtry parameter

In Random Forest experiment 1, I will tune the “mtry” parameter (the number of variables considered at each split) to test whether it improves performance.

# Experiment 1: 
rf_exp1 <- function() {
  # Set up cross-validation
  ctrl <- trainControl(
    method = "cv",
    number = 5,
    classProbs = TRUE,
    summaryFunction = twoClassSummary
  )
  
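  # Define parameter grid - try a range of mtry values
  # Note: sqrt() and the division yield fractional candidates (the run below reports 5.33);
  # mtry is a count of variables, so whole numbers (e.g. via floor()) would be cleaner
  mtry_values <- c(2, sqrt(ncol(trainData) - 1), ncol(trainData)/3)
  grid <- expand.grid(mtry = mtry_values)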
  
  # Train with parameter tuning
  rf_tuned <- train(
    term ~ .,
    data = trainData,
    method = "rf",
    trControl = ctrl,
    tuneGrid = grid,
    metric = "ROC",
    ntree = 200
  )
  
  # Best parameter
  best_mtry <- rf_tuned$bestTune$mtry
  print(paste("Best mtry value:", best_mtry))
  
  # Train final model with best parameter
  rf_final <- randomForest(term ~ ., data = trainData, mtry = best_mtry, ntree = 200)
  
  # Evaluate
  rf_pred <- predict(rf_final, testData)
  rf_cm <- confusionMatrix(rf_pred, testData$term, positive = "yes")
  rf_roc <- roc(testData$term, predict(rf_final, testData, type = "prob")[, "yes"])
  
  # Print results
  print("Random Forest - Experiment 1 (mtry Tuning) Results:")
  print(rf_cm)
  print(paste("AUC:", auc(rf_roc)))
  
  # Return model and metrics
  return(list(
    model = rf_final,
    confusion_matrix = rf_cm,
    roc = rf_roc,
    auc = auc(rf_roc),
    best_param = best_mtry
  ))
}
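Since mtry counts variables, the candidate grid can be restricted to whole numbers. The sketch below shows one way to do that. It assumes `trainData` has 16 columns (15 predictors plus the target `term`), which is consistent with the reported best value of 16/3 = 5.33 but is not stated explicitly in this report.

```r
# Assumption: 16 columns in trainData, i.e. 15 predictors plus the target
n_predictors <- 15

# Same three candidates as in the experiment, before rounding
raw_candidates <- c(2, sqrt(n_predictors), (n_predictors + 1) / 3)

# Round down to whole numbers and drop any duplicates
mtry_candidates <- unique(floor(raw_candidates))
print(mtry_candidates)
```

The resulting vector could be passed to `expand.grid(mtry = mtry_candidates)` in place of the fractional values.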

Random Forest Exp 2: Adjust number of trees (ntree)

In Random Forest experiment 2, I will vary the number of trees (ntree) to test whether more trees improve model stability and accuracy.

# Experiment 2: 
rf_exp2 <- function() {
  # Try different numbers of trees
  ntree_values <- c(50, 100, 200, 300, 500)
  results <- data.frame(ntree = integer(), accuracy = numeric(), f1 = numeric(), auc = numeric())
  
  for (n in ntree_values) {
    # Train model
    rf_model <- randomForest(term ~ ., data = trainData, ntree = n)
    
    # Evaluate
    rf_pred <- predict(rf_model, testData)
    rf_cm <- confusionMatrix(rf_pred, testData$term, positive = "yes")
    rf_roc <- roc(testData$term, predict(rf_model, testData, type = "prob")[, "yes"])
    
    # Store results
    results <- rbind(results, data.frame(
      ntree = n,
      accuracy = rf_cm$overall["Accuracy"],
      f1 = rf_cm$byClass["F1"],
      auc = auc(rf_roc)
    ))
  }
  
  # Find best ntree value based on F1 score
  best_ntree <- results$ntree[which.max(results$f1)]
  print(paste("Best ntree value:", best_ntree))
  
  # Train final model with best parameter
  rf_final <- randomForest(term ~ ., data = trainData, ntree = best_ntree)
  
  # Evaluate
  rf_pred <- predict(rf_final, testData)
  rf_cm <- confusionMatrix(rf_pred, testData$term, positive = "yes")
  rf_roc <- roc(testData$term, predict(rf_final, testData, type = "prob")[, "yes"])
  
  # Print results
  print("Random Forest - Experiment 2 (ntree Tuning) Results:")
  print(rf_cm)
  print(paste("AUC:", auc(rf_roc)))
  
  # Return model and metrics
  return(list(
    model = rf_final,
    confusion_matrix = rf_cm,
    roc = rf_roc,
    auc = auc(rf_roc),
    ntree_results = results,
    best_param = best_ntree
  ))
}
# Run experiments
rf_result1 <- rf_exp1()
## [1] "Best mtry value: 5.33333333333333"
## Setting levels: control = no, case = yes
## Setting direction: controls < cases
## [1] "Random Forest - Experiment 1 (mtry Tuning) Results:"
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction    no   yes
##        no  11653  1221
##        yes   323   365
##                                           
##                Accuracy : 0.8862          
##                  95% CI : (0.8807, 0.8915)
##     No Information Rate : 0.8831          
##     P-Value [Acc > NIR] : 0.1336          
##                                           
##                   Kappa : 0.2693          
##                                           
##  Mcnemar's Test P-Value : <2e-16          
##                                           
##             Sensitivity : 0.23014         
##             Specificity : 0.97303         
##          Pos Pred Value : 0.53052         
##          Neg Pred Value : 0.90516         
##              Prevalence : 0.11694         
##          Detection Rate : 0.02691         
##    Detection Prevalence : 0.05073         
##       Balanced Accuracy : 0.60158         
##                                           
##        'Positive' Class : yes             
##                                           
## [1] "AUC: 0.773992789066995"
rf_result2 <- rf_exp2()
## Setting levels: control = no, case = yes
## Setting direction: controls < cases
## Setting levels: control = no, case = yes
## Setting direction: controls < cases
## Setting levels: control = no, case = yes
## Setting direction: controls < cases
## Setting levels: control = no, case = yes
## Setting direction: controls < cases
## Setting levels: control = no, case = yes
## Setting direction: controls < cases
## [1] "Best ntree value: 300"
## Setting levels: control = no, case = yes
## Setting direction: controls < cases
## [1] "Random Forest - Experiment 2 (ntree Tuning) Results:"
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction    no   yes
##        no  11731  1274
##        yes   245   312
##                                           
##                Accuracy : 0.888           
##                  95% CI : (0.8826, 0.8933)
##     No Information Rate : 0.8831          
##     P-Value [Acc > NIR] : 0.03717         
##                                           
##                   Kappa : 0.2453          
##                                           
##  Mcnemar's Test P-Value : < 2e-16         
##                                           
##             Sensitivity : 0.19672         
##             Specificity : 0.97954         
##          Pos Pred Value : 0.56014         
##          Neg Pred Value : 0.90204         
##              Prevalence : 0.11694         
##          Detection Rate : 0.02301         
##    Detection Prevalence : 0.04107         
##       Balanced Accuracy : 0.58813         
##                                           
##        'Positive' Class : yes             
##                                           
## [1] "AUC: 0.776913036876612"

Adaboost Exp 1: Adjust mfinal (number of iterations)

In AdaBoost experiment 1, I will search for the number of boosting iterations (mfinal) that gives the best model performance.

# Experiment 1: 
ada_exp1 <- function() {
  # Try different numbers of iterations
  mfinal_values <- c(10, 30, 50, 100, 150)
  results <- data.frame(mfinal = integer(), accuracy = numeric(), f1 = numeric())
  
  for (m in mfinal_values) {
    # Train model
    ada_model <- boosting(term ~ ., data = trainData, mfinal = m)
    
    # Evaluate
    ada_pred <- predict(ada_model, testData)
    ada_cm <- confusionMatrix(as.factor(ada_pred$class), testData$term, positive = "yes")
    
    # Store results
    results <- rbind(results, data.frame(
      mfinal = m,
      accuracy = ada_cm$overall["Accuracy"],
      f1 = ada_cm$byClass["F1"]
    ))
  }
  
  # Find best mfinal value based on F1 score
  best_mfinal <- results$mfinal[which.max(results$f1)]
  print(paste("Best mfinal value:", best_mfinal))
  
  # Train final model with best parameter
  ada_final <- boosting(term ~ ., data = trainData, mfinal = best_mfinal)
  
  # Evaluate
  ada_pred <- predict(ada_final, testData)
  ada_cm <- confusionMatrix(as.factor(ada_pred$class), testData$term, positive = "yes")
  
  # Print results
  print("AdaBoost - Experiment 1 (mfinal Tuning) Results:")
  print(ada_cm)
  
  # Return model and metrics
  return(list(
    model = ada_final,
    confusion_matrix = ada_cm,
    mfinal_results = results,
    best_param = best_mfinal
  ))
}

Adaboost Exp 2: Handle class imbalance using weights

In AdaBoost experiment 2, I will adjust the weights for the minority class to test whether this improves classification of that class.

# Experiment 2:
ada_exp2 <- function() {
  # Calculate class weights inversely proportional to class frequencies
  class_weights <- 1 / table(trainData$term)
  class_weights <- class_weights / sum(class_weights)
  
  # Create weighted version of training data
  # We'll create a weight vector for boosting
  weights <- ifelse(trainData$term == "yes", 
                    class_weights["yes"], 
                    class_weights["no"])
  
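  # Train with adjusted weights
  # NOTE: `control` is passed on to rpart as control options, and `weights` is not an
  # rpart.control() setting, so these class weights most likely do not influence training
  ada_weighted <- boosting(term ~ ., data = trainData, mfinal = 50, control = list(weights = weights))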
  
  # Evaluate
  ada_pred <- predict(ada_weighted, testData)
  ada_cm <- confusionMatrix(as.factor(ada_pred$class), testData$term, positive = "yes")
  
  # Print results
  print("AdaBoost - Experiment 2 (Class Weighting) Results:")
  print(ada_cm)
  print(paste("Class weights used - yes:", class_weights["yes"], "no:", class_weights["no"]))
  
  # Return model and metrics
  return(list(
    model = ada_weighted,
    confusion_matrix = ada_cm,
    weights = class_weights
  ))
}
# Run experiments
ada_result1 <- ada_exp1()
## [1] "Best mfinal value: 50"
## [1] "AdaBoost - Experiment 1 (mfinal Tuning) Results:"
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction    no   yes
##        no  11683  1280
##        yes   293   306
##                                           
##                Accuracy : 0.884           
##                  95% CI : (0.8785, 0.8894)
##     No Information Rate : 0.8831          
##     P-Value [Acc > NIR] : 0.3703          
##                                           
##                   Kappa : 0.2308          
##                                           
##  Mcnemar's Test P-Value : <2e-16          
##                                           
##             Sensitivity : 0.19294         
##             Specificity : 0.97553         
##          Pos Pred Value : 0.51085         
##          Neg Pred Value : 0.90126         
##              Prevalence : 0.11694         
##          Detection Rate : 0.02256         
##    Detection Prevalence : 0.04417         
##       Balanced Accuracy : 0.58424         
##                                           
##        'Positive' Class : yes             
## 
ada_result2 <- ada_exp2()
## [1] "AdaBoost - Experiment 2 (Class Weighting) Results:"
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction    no   yes
##        no  11684  1285
##        yes   292   301
##                                           
##                Accuracy : 0.8837          
##                  95% CI : (0.8782, 0.8891)
##     No Information Rate : 0.8831          
##     P-Value [Acc > NIR] : 0.4114          
##                                           
##                   Kappa : 0.2271          
##                                           
##  Mcnemar's Test P-Value : <2e-16          
##                                           
##             Sensitivity : 0.18979         
##             Specificity : 0.97562         
##          Pos Pred Value : 0.50759         
##          Neg Pred Value : 0.90092         
##              Prevalence : 0.11694         
##          Detection Rate : 0.02219         
##    Detection Prevalence : 0.04373         
##       Balanced Accuracy : 0.58270         
##                                           
##        'Positive' Class : yes             
##                                           
## [1] "Class weights used - yes: 0.882997883029479 no: 0.11700211697052"
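The printed class weights can be reproduced by hand. The sketch below assumes the training-set class counts are the full-dataset counts (39,922 "no" and 5,289 "yes") minus the test-set counts read off the confusion matrices (11,976 and 1,586). Those derived counts are an inference from the outputs in this report, not values printed directly.

```r
# Training-set class counts, derived as full counts minus test counts (assumption)
train_counts <- c(no = 39922 - 11976, yes = 5289 - 1586)

# Inverse-frequency weights, normalized to sum to 1
inverse_freq <- 1 / train_counts
class_weights <- inverse_freq / sum(inverse_freq)
print(class_weights)

# With two classes, each normalized weight equals the other class's proportion
class_props <- train_counts / sum(train_counts)
print(class_props)
```

With only two classes, normalized inverse-frequency weighting simply swaps the class proportions, so the minority class "yes" receives the weight 0.883.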

Variations

Now that I have run all the experiments for the three algorithms, I will create tables and visualizations to compare the experiments across models and to find the best model for this project.

results

Performance Metrics Across Models

# Compile all results
all_results <- data.frame(
  Algorithm = c(
    "Decision Tree (Baseline)", "Decision Tree (CP Tuning)", "Decision Tree (Feature Selection)",
    "Random Forest (Baseline)", "Random Forest (mtry Tuning)", "Random Forest (ntree Tuning)",
    "AdaBoost (Baseline)", "AdaBoost (mfinal Tuning)", "AdaBoost (Class Weighting)"
  ),
  Accuracy = c(
    dt_cm$overall["Accuracy"], dt_result1$confusion_matrix$overall["Accuracy"], dt_result2$confusion_matrix$overall["Accuracy"],
    rf_cm$overall["Accuracy"], rf_result1$confusion_matrix$overall["Accuracy"], rf_result2$confusion_matrix$overall["Accuracy"],
    ada_cm$overall["Accuracy"], ada_result1$confusion_matrix$overall["Accuracy"], ada_result2$confusion_matrix$overall["Accuracy"]
  ),
  F1_Score = c(
    dt_cm$byClass["F1"], dt_result1$confusion_matrix$byClass["F1"], dt_result2$confusion_matrix$byClass["F1"],
    rf_cm$byClass["F1"], rf_result1$confusion_matrix$byClass["F1"], rf_result2$confusion_matrix$byClass["F1"],
    ada_cm$byClass["F1"], ada_result1$confusion_matrix$byClass["F1"], ada_result2$confusion_matrix$byClass["F1"]
  ),
  Sensitivity = c(
    dt_cm$byClass["Sensitivity"], dt_result1$confusion_matrix$byClass["Sensitivity"], dt_result2$confusion_matrix$byClass["Sensitivity"],
    rf_cm$byClass["Sensitivity"], rf_result1$confusion_matrix$byClass["Sensitivity"], rf_result2$confusion_matrix$byClass["Sensitivity"],
    ada_cm$byClass["Sensitivity"], ada_result1$confusion_matrix$byClass["Sensitivity"], ada_result2$confusion_matrix$byClass["Sensitivity"]
  ),
  Specificity = c(
    dt_cm$byClass["Specificity"], dt_result1$confusion_matrix$byClass["Specificity"], dt_result2$confusion_matrix$byClass["Specificity"],
    rf_cm$byClass["Specificity"], rf_result1$confusion_matrix$byClass["Specificity"], rf_result2$confusion_matrix$byClass["Specificity"],
    ada_cm$byClass["Specificity"], ada_result1$confusion_matrix$byClass["Specificity"], ada_result2$confusion_matrix$byClass["Specificity"]
  )
)

# Add AUC where available
all_results$AUC <- c(
  auc(dt_roc), dt_result1$auc, dt_result2$auc,
  auc(rf_roc), rf_result1$auc, rf_result2$auc,
  NA, NA, NA  # AUC not computed for AdaBoost here (predict() also returns $prob, which could be passed to roc())
)

# Display results
print(all_results)
##                           Algorithm  Accuracy  F1_Score Sensitivity Specificity
## 1          Decision Tree (Baseline) 0.8830556        NA   0.0000000   1.0000000
## 2         Decision Tree (CP Tuning) 0.8822445 0.1986954   0.1248424   0.9825484
## 3 Decision Tree (Feature Selection) 0.8822445 0.1986954   0.1248424   0.9825484
## 4          Random Forest (Baseline) 0.8868161 0.2817033   0.1897856   0.9791249
## 5       Random Forest (mtry Tuning) 0.8861525 0.3210202   0.2301387   0.9730294
## 6      Random Forest (ntree Tuning) 0.8879959 0.2911806   0.1967213   0.9795424
## 7               AdaBoost (Baseline) 0.8829819 0.2886598   0.2030265   0.9730294
## 8          AdaBoost (mfinal Tuning) 0.8840142 0.2800915   0.1929382   0.9755344
## 9        AdaBoost (Class Weighting) 0.8837192 0.2762735   0.1897856   0.9756179
##         AUC
## 1 0.5000000
## 2 0.6512247
## 3 0.6512247
## 4 0.7748837
## 5 0.7739928
## 6 0.7769130
## 7        NA
## 8        NA
## 9        NA
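The AUC column is empty for AdaBoost because no ROC curve was computed for those models. As a sketch of what AUC measures, the function below computes it from predicted scores with the rank-sum (Mann-Whitney) formula in base R. The function name `auc_rank()` and the toy scores are hypothetical and are not results from this report.

```r
# AUC via the rank-sum formula: the probability that a randomly chosen
# positive case receives a higher score than a randomly chosen negative case
auc_rank <- function(scores, labels, positive = "yes") {
  is_pos <- labels == positive
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  ranks <- rank(scores)  # ties receive average ranks
  (sum(ranks[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy example (hypothetical scores): 8 of the 9 positive/negative pairs are ordered correctly
toy_scores <- c(0.9, 0.8, 0.35, 0.6, 0.2, 0.1)
toy_labels <- c("yes", "yes", "yes", "no", "no", "no")
toy_auc <- auc_rank(toy_scores, toy_labels)
print(toy_auc)
```

In principle the same calculation could be applied to the "yes" column of the probability matrix that `predict()` returns for a boosting model, which would fill the missing AdaBoost entries.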
# Create performance comparison visualization
all_results_long <- tidyr::pivot_longer(
  all_results,
  cols = c("Accuracy", "F1_Score", "Sensitivity", "Specificity", "AUC"),
  names_to = "Metric",
  values_to = "Value"
)

Graph of Metrics Comparison

# Plot metrics comparison
ggplot(all_results_long, aes(x = reorder(Algorithm, Value), y = Value, fill = Metric)) +
  geom_bar(stat = "identity", position = "dodge") +
  coord_flip() +
  theme_minimal() +
  labs(title = "Performance Metrics Comparison Across Models",
       x = "Algorithm",
       y = "Metric Value") +
  theme(legend.position = "bottom")
## Warning: Removed 4 rows containing missing values or values outside the scale range
## (`geom_bar()`).

Graph of ROC curves

# Plot ROC curves for models where available
plot(dt_roc, col = "blue", main = "ROC Curves Comparison")
plot(dt_result1$roc, add = TRUE, col = "lightblue")
plot(dt_result2$roc, add = TRUE, col = "darkblue")
plot(rf_roc, add = TRUE, col = "red")
plot(rf_result1$roc, add = TRUE, col = "pink")
plot(rf_result2$roc, add = TRUE, col = "darkred")
legend("bottomright", 
       legend = c("DT Baseline", "DT CP Tuned", "DT Feature Selection", 
                  "RF Baseline", "RF mtry Tuned", "RF ntree Tuned"),
       col = c("blue", "lightblue", "darkblue", "red", "pink", "darkred"),
       lwd = 2)

Table with the best model based on F1 Score

# Find the best model based on F1 score (good for imbalanced data)
best_model_index <- which.max(all_results$F1_Score)
best_model_name <- all_results$Algorithm[best_model_index]

cat("The best performing model based on F1 Score is:", best_model_name, 
    "with F1 Score of", all_results$F1_Score[best_model_index], "\n")
## The best performing model based on F1 Score is: Random Forest (mtry Tuning) with F1 Score of 0.3210202
# If you want to consider multiple metrics, create a weighted average
# For example: 0.4*Accuracy + 0.4*F1_Score + 0.2*AUC
all_results$Combined_Score <- 0.4 * all_results$Accuracy + 
                             0.4 * all_results$F1_Score + 
                             0.2 * all_results$AUC
all_results$Combined_Score[is.na(all_results$Combined_Score)] <- 
  0.5 * all_results$Accuracy[is.na(all_results$Combined_Score)] + 
  0.5 * all_results$F1_Score[is.na(all_results$Combined_Score)]

best_overall_index <- which.max(all_results$Combined_Score)
best_overall_name <- all_results$Algorithm[best_overall_index]

cat("The best overall model based on combined metrics is:", best_overall_name,
    "with Combined Score of", all_results$Combined_Score[best_overall_index], "\n")
## The best overall model based on combined metrics is: Random Forest (mtry Tuning) with Combined Score of 0.6376676
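The combined score can be checked by hand. The sketch below mirrors the weighting used above (0.4/0.4/0.2, falling back to 0.5/0.5 when AUC is missing) in a small helper. The name `combined_score()` is introduced here only for illustration, and the inputs are rows from the results table.

```r
# Weighted combination of metrics; falls back to accuracy and F1 only
# when no AUC is available
combined_score <- function(accuracy, f1, auc = NA) {
  if (is.na(auc)) {
    0.5 * accuracy + 0.5 * f1
  } else {
    0.4 * accuracy + 0.4 * f1 + 0.2 * auc
  }
}

# Random Forest (mtry Tuning) row from the results table
rf_mtry_score <- combined_score(0.8861525, 0.3210202, 0.7739928)

# AdaBoost (Baseline) row, which has no AUC
ada_base_score <- combined_score(0.8829819, 0.2886598)

print(c(rf_mtry = rf_mtry_score, ada_baseline = ada_base_score))
```

Because the fallback drops the AUC term and reweights the other two metrics, scores for models with and without AUC are not on exactly the same footing, which is worth keeping in mind when ranking AdaBoost against the other models.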

Graph with variable importance RF “mtry” Model

# For the best model, analyze feature importance
if(grepl("Random Forest", best_overall_name)) {
  # For Random Forest
  if(best_overall_name == "Random Forest (Baseline)") {
    importance_plot <- varImpPlot(rf_baseline, main = "Variable Importance - Random Forest Baseline")
  } else if(best_overall_name == "Random Forest (mtry Tuning)") {
    importance_plot <- varImpPlot(rf_result1$model, main = "Variable Importance - RF mtry Tuned")
  } else if(best_overall_name == "Random Forest (ntree Tuning)") {
    importance_plot <- varImpPlot(rf_result2$model, main = "Variable Importance - RF ntree Tuned")
  }
} else if(grepl("Decision Tree", best_overall_name)) {
  # For Decision Tree
  if(best_overall_name == "Decision Tree (Baseline)") {
    importance <- dt_baseline$variable.importance
  } else if(best_overall_name == "Decision Tree (CP Tuning)") {
    importance <- dt_result1$model$variable.importance
  } else if(best_overall_name == "Decision Tree (Feature Selection)") {
    importance <- dt_result2$model$variable.importance
  }
  
  # Plot importance
  barplot(sort(importance, decreasing = TRUE),
          main = "Variable Importance - Decision Tree",
          col = "skyblue",
          las = 2,
          cex.names = 0.7)
}

Conclusion

Based on the tables and graphs above, the best-performing model by F1 Score is Random Forest (mtry Tuning), with an F1 Score of 0.3210202. The Decision Tree models have the lowest F1 Scores: the baseline tree predicts "no" for every observation, so its F1 is undefined (NA), and the tuned tree reaches only 0.1986954. Because the feature-selection experiment fell back to CP tuning, those two Decision Tree rows are identical. Random Forest (ntree Tuning) has the highest accuracy, 0.8879959, while both Decision Tree experiments have the lowest, 0.8822445. Random Forest (ntree Tuning) also has the highest AUC, 0.7769130, while the Decision Tree baseline has the lowest, 0.5000000. Every accuracy is close to the no-information rate of 0.8831, so accuracy alone says little on this imbalanced dataset.
In conclusion, the best overall model based on combined metrics is Random Forest (mtry Tuning), with a Combined Score of 0.6376676.