Support Vector Machines

Getting Started

Introduction

Customer behavior prediction serves as a critical analytical tool across industries, from financial services to hospitality. While this analysis focuses on banking customer data to predict term deposit subscriptions, my expertise as a restaurant manager provides valuable perspective on applying predictive modeling to service-oriented businesses. In both domains, we face similar challenges: identifying key customer segments, optimizing marketing outreach, and maximizing conversion opportunities.

In banking marketing, we prioritize high recall to minimize missed opportunities for term deposit subscriptions - much like in restaurant management, where we aim to accurately predict customer preferences and dining patterns to optimize inventory and staffing. The non-linear relationships in banking data (such as age versus income affecting subscription likelihood) mirror the complex patterns we analyze in restaurants, where factors like time-of-day, weather, and local events influence customer behavior.

This assignment evaluates Support Vector Machines (SVM) against previous models (Decision Trees, AdaBoost, and Random Forest) to determine the optimal approach for subscription prediction. The insights gained have direct parallels to restaurant management applications, where we might predict:

Peak dining times (analogous to identifying prime marketing windows)
Menu item popularity (similar to product uptake prediction)
Customer lifetime value (comparable to banking customer profitability)

By examining these machine learning approaches through both a banking and hospitality lens, we can identify transferable strategies for customer behavior prediction across service industries. The SVM’s ability to handle complex, non-linear relationships proves particularly valuable in both contexts, whether modeling financial product adoption or dining preferences.

This dual perspective enriches our analysis, demonstrating how predictive analytics techniques can be adapted across domains while maintaining focus on our primary goal: optimizing term deposit subscription predictions for the banking sector.

Let’s begin by reviewing the results we obtained in the previous assignment so that we can continue working on Assignment 3.

# Display Results
kable(results)

Model	Accuracy	Precision	Recall	F1_Score	AUC
Decision Tree (Default)	0.9301075	0.9364641	0.9912281	0.9630682	0.7676467
Decision Tree (Tuned)	0.9301075	0.9364641	0.9912281	0.9630682	0.7676467
Random Forest (Default)	0.9283154	0.9363469	0.9892788	0.9620853	0.9119071
Random Forest (Tuned)	0.9265233	0.9362292	0.9873294	0.9611006	0.9076944
AdaBoost (Default)	0.9265233	0.9362292	0.9873294	0.9611006	0.9034492
AdaBoost (Tuned)	0.9265233	0.9362292	0.9873294	0.9611006	0.9037579

The ensemble methods, particularly the Random Forest (Default) model, achieved the highest AUC (0.912), while all models demonstrated very high recall and accuracy, with minimal performance differences between tuned and default versions.

SVM Modeling and Evaluation Using Linear, RBF, and Polynomial Kernels

We chose moderate tuning ranges for the SVM parameters (\(C \in \{1, 5, 10\}\), \(\sigma\) from 0.01 to 0.1, and polynomial degrees 2 to 4) based on empirical best practices, to balance model complexity, limit overfitting, and keep computation manageable while still capturing the data’s key patterns.
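As a rough sketch of the search space described above, the grids could be built with base R's `expand.grid()`. The column names (`C`, `sigma`, `degree`, `scale`) follow the tuning-parameter names caret uses for its `svmLinear`, `svmRadial`, and `svmPoly` methods. The individual sigma steps, the fixed `scale = 1`, and the object names are assumptions for illustration, not the settings used in this analysis.

```r
# Hypothetical tuning grids matching the ranges described in the text.
# The sigma steps and the scale value are assumptions, not the report's settings.
grid_linear <- expand.grid(C = c(1, 5, 10))

grid_rbf <- expand.grid(
  sigma = c(0.01, 0.05, 0.1),  # assumed steps inside the 0.01 to 0.1 range
  C     = c(1, 5, 10)
)

grid_poly <- expand.grid(
  degree = 2:4,
  scale  = 1,                  # assumed; held fixed to keep the grid small
  C      = c(1, 5, 10)
)

# Number of candidate models each kernel would fit per resample
sapply(list(linear = grid_linear, rbf = grid_rbf, poly = grid_poly), nrow)
```

Each grid could then be passed to `caret::train()` through its `tuneGrid` argument.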

# Store Results
# Confusion matrices, in the same order as the model labels below
svm_confusions <- list(
  svm_linear_default_confusion, svm_linear_tune_confusion,
  svm_rbf_default_confusion, svm_rbf_tune_confusion,
  svm_poly_default_confusion, svm_poly_tune_confusion
)

# Pull one named metric (e.g. "Accuracy") from every confusion matrix
get_metric <- function(slot, name) {
  unname(sapply(svm_confusions, function(cm) cm[[slot]][name]))
}

svm_precision <- get_metric("byClass", "Pos Pred Value")
svm_recall <- get_metric("byClass", "Sensitivity")

results_svm <- tibble(
  Model = c("SVM (Linear) - Default", "SVM (Linear) - Tuned",
            "SVM (RBF) - Default", "SVM (RBF) - Tuned",
            "SVM (Polynomial) - Default", "SVM (Polynomial) - Tuned"),
  Accuracy = get_metric("overall", "Accuracy"),
  Precision = svm_precision,
  Recall = svm_recall,
  # F1 is the harmonic mean of precision and recall
  F1_Score = 2 * svm_precision * svm_recall / (svm_precision + svm_recall),
  AUC = c(svm_linear_default_auc, svm_linear_tune_auc,
          svm_rbf_default_auc, svm_rbf_tune_auc,
          svm_poly_default_auc, svm_poly_tune_auc)
)

# View Results
kable(results_svm)

Model	Accuracy	Precision	Recall	F1_Score	AUC
SVM (Linear) - Default	0.9283154	0.9331502	0.9931774	0.9622285	0.7870587
SVM (Linear) - Tuned	0.9283154	0.9331502	0.9931774	0.9622285	0.7870587
SVM (RBF) - Default	0.9247312	0.9297445	0.9931774	0.9604147	0.8836149
SVM (RBF) - Tuned	0.9211470	0.9310662	0.9873294	0.9583728	0.8794672
SVM (Polynomial) - Default	0.9283154	0.9331502	0.9931774	0.9622285	0.8601581
SVM (Polynomial) - Tuned	0.9202509	0.9278539	0.9902534	0.9580387	0.8241391
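
The F1_Score column is the harmonic mean of precision and recall. As a quick sanity check, the value for one row can be recomputed by hand from the table; the helper name `f1_score` is introduced here only for illustration.

```r
# Harmonic mean of precision and recall
f1_score <- function(precision, recall) {
  2 * precision * recall / (precision + recall)
}

# Precision and recall for SVM (Linear) - Default, copied from the table above
f1_linear <- f1_score(0.9331502, 0.9931774)
round(f1_linear, 4)
```

The result should agree with the 0.9622 reported for that model.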

What I learned from comparing SVM kernels:

Linear Kernel: The Linear kernel performed consistently well, with very high recall (0.993) and F1-score (0.962), making it a reliable choice when the data is linearly separable or nearly so. However, its AUC (0.787) was the lowest of the three kernels, suggesting that it struggled to capture complex decision boundaries in the data.

RBF Kernel: The RBF kernel achieved the highest AUC among the SVMs (0.884 with default parameters), indicating that it models complex, non-linear relationships more effectively than the Linear kernel. This suggests that RBF is well suited to this type of customer prediction task, where interactions between features are likely. Despite a slight drop in accuracy when tuned (0.925 to 0.921), the RBF kernel’s AUC and recall make it a strong contender for this problem.

Polynomial Kernel: With default parameters, the Polynomial kernel matched the Linear kernel on accuracy, precision, recall, and F1-score while reaching a higher AUC (0.860 vs. 0.787), though still below RBF. Tuning did not help: accuracy fell from 0.928 to 0.920 and AUC from 0.860 to 0.824, which may indicate that the tuned model overfit. While the Polynomial kernel can model non-linear relationships, it may not be the best choice for this particular dataset.

Tuning Impact: Tuning did not improve test-set performance for any kernel. The tuned Linear model was identical to the default. Tuning the RBF kernel raised precision marginally (0.930 to 0.931) but lowered recall, accuracy, and AUC (0.884 to 0.879), and tuning the Polynomial kernel lowered every metric. For this dataset, the default hyperparameters were already close to the best settings within the ranges searched.
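
The three kernels compared above differ only in how they measure similarity between two observations. A minimal sketch in base R is shown below, using the same parameterization as kernlab's `rbfdot` and `polydot` kernels; the default parameter values and the two feature vectors are made up for illustration.

```r
# Kernel functions written out in base R (sketch; parameter defaults are assumptions)
linear_kernel <- function(x, y) sum(x * y)

# Similarity decays with squared distance; larger sigma means a more local fit
rbf_kernel <- function(x, y, sigma = 0.05) exp(-sigma * sum((x - y)^2))

# Inner product raised to a power, which introduces feature interactions
poly_kernel <- function(x, y, degree = 2, scale = 1, offset = 1) {
  (scale * sum(x * y) + offset)^degree
}

# Two hypothetical, already-scaled customer feature vectors
a <- c(0.2, -1.0, 0.5)
b <- c(0.1, -0.8, 0.9)

c(linear = linear_kernel(a, b),
  rbf    = rbf_kernel(a, b),
  poly   = poly_kernel(a, b))
```

The linear kernel can only produce a flat decision boundary in the original feature space, which is one plausible reason its AUC trails the other two here.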

In conclusion, changing the SVM kernel had a small impact on most metrics but a noticeable one on AUC (0.787 for Linear vs. 0.884 for RBF), showing that kernel choice affects how well the model separates the classes. The RBF kernel adapted better to non-linear relationships in the data. This experiment showed me that while linear kernels are fast and effective, non-linear kernels like RBF rank customers more reliably by predicted probability, as reflected in the higher AUC.
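
To see why AUC can move while accuracy stays put, it helps to compute AUC directly as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one. The sketch below uses the rank-based (Mann-Whitney) formulation in base R; the function name `auc_rank` and the scores are hypothetical, not outputs of the models in this report.

```r
# AUC as the probability that a random positive outranks a random negative
# (Mann-Whitney formulation; tied scores receive average ranks)
auc_rank <- function(scores, labels) {
  r <- rank(scores)
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Hypothetical predicted probabilities for three positives and three negatives
labels  <- c(1, 1, 1, 0, 0, 0)
model_a <- c(0.9, 0.8, 0.40, 0.3, 0.2, 0.1)  # missed positive still outranks all negatives
model_b <- c(0.9, 0.8, 0.15, 0.3, 0.2, 0.1)  # missed positive falls below two negatives

# Accuracy at a 0.5 cut-off is the same for both models
acc_a <- mean((model_a > 0.5) == labels)
acc_b <- mean((model_b > 0.5) == labels)

# AUC differs because it looks at the full ranking, not one threshold
auc_a <- auc_rank(model_a, labels)
auc_b <- auc_rank(model_b, labels)
c(acc_a = acc_a, acc_b = acc_b, auc_a = auc_a, auc_b = auc_b)
```

Both hypothetical models make the same single error at the 0.5 threshold, yet only the first keeps every positive ahead of every negative.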

We can see that all SVM models achieved excellent recall (0.987 to 0.993), meaning they were very effective at identifying cases in the positive class. Since recall is reported for whichever class the confusion matrix treats as positive, it is worth confirming that this is “yes” (subscription). The RBF kernel with default parameters produced the best AUC among the SVMs (0.884) while matching the top recall (0.993), making it the most effective SVM for this classification task.
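
Because recall is always measured for one class, the choice of positive class matters a great deal on imbalanced data. The sketch below uses made-up predictions (not this report's data) and base R only; `recall_for` is a hypothetical helper. In caret, `confusionMatrix()` takes a `positive` argument for the same purpose and otherwise uses the first factor level.

```r
# Hypothetical predictions on an imbalanced sample (90 "no", 10 "yes")
actual    <- factor(c(rep("no", 90), rep("yes", 10)), levels = c("no", "yes"))
predicted <- factor(c(rep("no", 89), "yes",           # 89 of 90 "no" cases correct
                      rep("no", 7), rep("yes", 3)),   # only 3 of 10 "yes" cases found
                    levels = c("no", "yes"))

cm <- table(Predicted = predicted, Actual = actual)

# Recall for a chosen positive class: correct hits / all actual members of that class
recall_for <- function(cm, positive) cm[positive, positive] / sum(cm[, positive])

recall_no  <- recall_for(cm, "no")   # 89 / 90
recall_yes <- recall_for(cm, "yes")  # 3 / 10
c(no = recall_no, yes = recall_yes)
```

The same set of predictions gives a recall near 0.99 or of 0.30 depending on which class is counted as positive, so the class should be stated alongside any recall figure.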

Literature Review

1. Article 1: SVM vs. Decision Trees in Healthcare

Key Insight: The study compares SVM and Decision Trees for medical diagnostics, finding that SVMs outperform Decision Trees in non-linear, high-dimensional datasets due to their ability to model complex boundaries using kernels (e.g., RBF). However, Decision Trees are more interpretable for clinical decision-making.
Relevance to my assignment: Our bank marketing data similarly involves non-linear relationships (e.g., age vs. income impacting subscriptions). The article supports our finding that SVM-RBF achieved a higher AUC (0.884) than the Decision Tree (0.768), validating its use for complex customer behavior prediction.

2. Article 2: Algorithm Comparison in Biomedical Data

Key Insight: This paper evaluates multiple algorithms (including SVM, Random Forest, and Decision Trees) on biomedical datasets. It concludes that Random Forest often outperforms SVM in structured tabular data due to its ensemble approach, while SVM excels in high-dimensional but smaller datasets.
Relevance to my assignment: Our results align with this finding: Random Forest (Default) achieved the highest AUC overall (0.912) with an accuracy of 0.928, while SVM-RBF achieved the best AUC among the SVM kernels (0.884). This suggests a trade-off: Random Forest may be preferable for generalizable marketing campaigns, while the SVMs’ slightly higher recall (0.993 vs. 0.989) may suit targeted outreach where missed prospects are costly.

3. Additional Articles

Article 3: Predicting the Helpfulness of Online Restaurant Reviews Using Different Machine Learning Algorithms: A Case Study of Yelp → This study analyzed 294,034 Yelp restaurant reviews to predict review helpfulness. It compared the performance of SVM combined with Fuzzy Domain Ontology (FDO) against Naïve Bayes (NB) and a hybrid NB-SVM approach. The findings indicated that the SVM with FDO outperformed the other models, enhancing F1-score, recall, and precision metrics by 11.91%, 13.31%, and 10.23%, respectively, compared to NB.
Article 4: Restaurants Rating Prediction using Machine Learning Algorithms → This research aimed to predict restaurant ratings based on factors like reviews, location, average cost, votes, cuisines, and restaurant type. Various machine learning models, including SVM, Random Forest, Linear Regression, XGBoost, and Decision Tree, were evaluated. The study achieved 83% accuracy using the AdaBoost algorithm, highlighting the effectiveness of ensemble methods in rating prediction.
Article 5: Opinion Mining of Restaurant Reviews and Comparison of Different Classifiers → This paper focused on sentiment analysis of restaurant reviews, comparing classifiers such as Decision Trees, K-Nearest Neighbors (KNN), and SVM. The study found that the Decision Tree classifier was more effective, achieving 63.5% accuracy, compared to Random Forest’s 56%.

Comparison with Previous Models

# Combine previous results with the SVM results
results_combined <- bind_rows(results, results_svm)

# Display combined results
kable(results_combined)

Model	Accuracy	Precision	Recall	F1_Score	AUC
Decision Tree (Default)	0.9301075	0.9364641	0.9912281	0.9630682	0.7676467
Decision Tree (Tuned)	0.9301075	0.9364641	0.9912281	0.9630682	0.7676467
Random Forest (Default)	0.9283154	0.9363469	0.9892788	0.9620853	0.9119071
Random Forest (Tuned)	0.9265233	0.9362292	0.9873294	0.9611006	0.9076944
AdaBoost (Default)	0.9265233	0.9362292	0.9873294	0.9611006	0.9034492
AdaBoost (Tuned)	0.9265233	0.9362292	0.9873294	0.9611006	0.9037579
SVM (Linear) - Default	0.9283154	0.9331502	0.9931774	0.9622285	0.7870587
SVM (Linear) - Tuned	0.9283154	0.9331502	0.9931774	0.9622285	0.7870587
SVM (RBF) - Default	0.9247312	0.9297445	0.9931774	0.9604147	0.8836149
SVM (RBF) - Tuned	0.9211470	0.9310662	0.9873294	0.9583728	0.8794672
SVM (Polynomial) - Default	0.9283154	0.9331502	0.9931774	0.9622285	0.8601581
SVM (Polynomial) - Tuned	0.9202509	0.9278539	0.9902534	0.9580387	0.8241391

# Reuse the combined results from above
all_results <- results_combined

# Plot AUC Comparison
ggplot(all_results, aes(x = reorder(Model, AUC), y = AUC, fill = Model)) +
  geom_col(color = "black") +
  coord_flip() +
  theme_minimal() +
  labs(title = "AUC Comparison Across Models", x = "Model", y = "AUC")

# Load required library
library(fmsb)

# Prepare data (assuming `results_combined` already exists and looks like the table above)

# Drop the Model column; radarchart() expects a plain data frame of numeric columns
radar_data <- as.data.frame(results_combined[, -1])

# fmsb reads the first row as each axis maximum and the second row as each axis minimum.
# Fixing them at 1 and 0 keeps every axis consistent with the 0-1 labels used below.
radar_scaled <- rbind(
  rep(1, ncol(radar_data)),
  rep(0, ncol(radar_data)),
  radar_data
)

# Add rownames for clarity
rownames(radar_scaled) <- c("Max", "Min", results_combined$Model)

# Plot radar chart
# One distinct color per model
colors_border <- rainbow(nrow(results_combined))

radarchart(radar_scaled,
           axistype = 1,
           pcol = colors_border,
           plwd = 2,
           plty = 1,
           cglcol = "grey",
           cglty = 1,
           axislabcol = "grey",
           caxislabels = seq(0, 1, 0.25),
           vlcex = 0.8,
           title = "Model Performance Comparison")

legend("topright", legend = results_combined$Model,
       col = colors_border, lty = 1, lwd = 2, cex = 0.7)

Which algorithm is recommended to get more accurate results?

The Decision Tree models have the highest accuracy in the table (0.930), but only marginally: Random Forest (Default) is close behind at 0.928 and achieves by far the highest AUC (0.912 vs. 0.768 for the Decision Tree). Because AUC reflects how well a model ranks positive cases above negative ones across all thresholds, Random Forest (Default) is the recommended model overall. It is a strong choice for this classification task because it handles both linear and non-linear relationships in the data.
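
A quick way to check which model leads on each metric is to rank a subset of the combined table. The sketch below copies accuracy and AUC for six representative models from the table (rounded to four decimals); the data frame name `scores` is introduced only for this illustration.

```r
# Accuracy and AUC copied from the combined results table (rounded to 4 decimals)
scores <- data.frame(
  Model = c("Decision Tree (Default)", "Random Forest (Default)",
            "AdaBoost (Tuned)", "SVM (Linear) - Default",
            "SVM (RBF) - Default", "SVM (Polynomial) - Default"),
  Accuracy = c(0.9301, 0.9283, 0.9265, 0.9283, 0.9247, 0.9283),
  AUC      = c(0.7676, 0.9119, 0.9038, 0.7871, 0.8836, 0.8602),
  stringsAsFactors = FALSE
)

# Leader on each metric
best_accuracy <- scores$Model[which.max(scores$Accuracy)]
best_auc      <- scores$Model[which.max(scores$AUC)]

# Full ranking by AUC, highest first
scores[order(-scores$AUC), ]
```

The accuracy gap between the leaders is under 0.002, while the AUC gap is far larger, which is why AUC carries more weight in the recommendation.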

Is it better for classification or regression scenarios?

All of the models tested here (Decision Tree, Random Forest, AdaBoost, and SVM) were used as classifiers. The goal in this case is to predict whether a customer will subscribe to a term deposit, which is a binary outcome (yes/no). While these algorithms also have regression variants, here they are used to classify customers based on various input features.

Do you agree with the recommendations? Why?

Yes, I agree with the recommendations. The Random Forest model provides the best balance of accuracy and AUC, which makes it a reliable choice for classification problems, especially in cases where the data exhibits complex relationships that may require ensemble learning. Although the SVM with the RBF kernel achieved slightly higher recall (0.993 vs. 0.989), Random Forest has the higher AUC (0.912 vs. 0.884) and accuracy (0.928 vs. 0.925), and as an ensemble method it is less prone to overfitting and tends to generalize better on unseen data.

The Linear SVM performed very well in terms of recall and F1-score but lagged behind in AUC, suggesting that it might not be as effective in capturing the more complex decision boundaries that exist in the data. Thus, while the SVM models offer good performance, Random Forest remains a stronger recommendation overall for achieving both high accuracy and AUC in this classification problem.

Conclusion

This study confirms that while ensemble methods like Random Forest offer the most consistent and high-performing results across general use cases, Support Vector Machines remain highly valuable in specialized contexts—particularly where nuanced, non-linear relationships or high-recall scenarios are critical. In the banking sector, Random Forest should be the go-to model for broad customer targeting, while SVM-RBF is better suited for identifying hard-to-reach segments. In restaurant management, the strength of SVM-Linear in achieving near-perfect recall makes it ideal for avoiding service failures, whereas Random Forest supports strategic planning through accurate demand forecasting.

Ultimately, no single model is universally superior. The effectiveness of a predictive model depends not only on its performance metrics but also on its alignment with the specific goals and operational demands of the industry in which it’s applied. By thoughtfully matching model capabilities to business needs, organizations can leverage machine learning to drive smarter, data-informed decisions across diverse service domains.