2024-11-01

Introduction to Non-Linear Models

  • Purpose: Capture complex patterns in data.
  • Methods Covered: Support Vector Machines (SVM) for regression and classification.
  • Why SVM? Known for flexibility and robustness against outliers.

Overview of Support Vector Machines (SVM)

  • Origins: Foundations laid by Vladimir Vapnik and colleagues in the 1960s; the modern kernel-based formulation followed in the 1990s (Cortes & Vapnik, 1995).
  • Applications: Originally for classification; extended to regression.
  • Core Idea: Maximizing margin to improve model stability and generalization.

Support Vector Machines for Regression

  • Goal: Limit the influence of outliers on the fitted model.
  • Key Method: The ε-insensitive loss function.
  • Impact: Only points falling outside the ε zone contribute to the model; residuals inside it incur no penalty.
  • Outcome: Models that are less sensitive to noise in the data (see the loss sketch below).
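
A minimal sketch of the ε-insensitive loss in Python (the function name and the ε value are illustrative, not from the source):

```python
import numpy as np

def eps_insensitive_loss(residuals, eps=0.1):
    """epsilon-insensitive loss: zero inside the +/-eps tube,
    linear in |residual| - eps outside it."""
    return np.maximum(0.0, np.abs(residuals) - eps)

residuals = np.array([-0.5, -0.05, 0.0, 0.08, 0.3])
print(eps_insensitive_loss(residuals, eps=0.1))
# -> [0.4 0.  0.  0.  0.2]  (only the two large residuals are penalized)
```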

Mathematics of SVM Regression

  • Objective Function: \[ \text{Minimize: } C \sum_{i=1}^{n} L_{\epsilon}(y_i - \hat{y}_i) + \sum_{j=1}^{P} \beta_j^2 \]
  • Key Components:
    • \(L_{\epsilon}\): The ε-insensitive loss; errors smaller than ε contribute nothing.
    • Regularization (\(\sum \beta_j^2\)): Penalizes large coefficients to help avoid overfitting.
    • Support Vectors: The training points outside the ε band; only these define the regression function (see the sketch below).
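
The role of ε can be shown with a scikit-learn sketch; the synthetic sine data and parameter values here are assumptions for illustration. Widening the tube leaves fewer points outside it, so fewer support vectors remain:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, size=(60, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=60)

# Larger epsilon -> wider tube -> fewer points left outside it
for eps in (0.01, 0.1, 0.5):
    model = SVR(kernel="rbf", C=1.0, epsilon=eps).fit(X, y)
    print(f"epsilon={eps}: {len(model.support_)} support vectors")
```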

Non-Linear Classification with SVM

  • Challenges: Non-linear boundaries often needed for classification.
  • Kernel Trick: Computes inner products in a transformed space, allowing non-linear boundaries without explicitly constructing the transformation (verified numerically below).
  • Kernel Types: Polynomial, RBF, Hyperbolic Tangent.
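
The kernel trick can be checked directly for a degree-2 polynomial kernel: the kernel value equals an inner product under an explicit feature map, even though the model never builds that map. A small numeric check, using the standard degree-2 map for 2-D inputs:

```python
import numpy as np

def poly_kernel(x, z, degree=2):
    """Polynomial kernel K(x, z) = (x . z + 1)^degree."""
    return (x @ z + 1.0) ** degree

def explicit_features(x):
    """Explicit degree-2 feature map for 2-D input:
    phi(x) = (1, sqrt(2)x1, sqrt(2)x2, x1^2, x2^2, sqrt(2)x1x2)."""
    x1, x2 = x
    return np.array([1.0, np.sqrt(2) * x1, np.sqrt(2) * x2,
                     x1 ** 2, x2 ** 2, np.sqrt(2) * x1 * x2])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])
print(poly_kernel(x, z))                            # 4.0
print(explicit_features(x) @ explicit_features(z))  # 4.0, same value
```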

Margin and Support Vectors in Classification

  • Concept of Margin: Distance between decision boundary and nearest data points.
  • Support Vectors: Only these points (on the margin) influence the boundary.
  • Result: Large-margin classifiers are less likely to overfit (see the sketch below).
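
A short scikit-learn sketch (the synthetic blobs data is assumed for illustration) showing that a fitted classifier exposes exactly these boundary-defining points:

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=100, centers=2, random_state=0)
clf = SVC(kernel="linear", C=1.0).fit(X, y)

# Only these points determine the decision boundary;
# moving any other point (outside the margin) leaves it unchanged.
print("support vectors per class:", clf.n_support_)
print(clf.support_vectors_[:3])
```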

Mathematics of SVM Classification

  • Decision Function: \[ D(u) = \beta_0 + \sum_{i=1}^{n} y_i \alpha_i K(x_i, u) \]
  • Terms:
    • \(y_i\): Class labels, coded as ±1
    • \(\alpha_i\): Dual parameters, nonzero only for the support vectors
    • \(K(x_i, u)\): Kernel function, which gives the boundary its flexibility (reconstructed by hand below)
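
As a sanity check, the decision function above can be rebuilt by hand from a fitted scikit-learn model, whose dual_coef_ attribute stores the products \(y_i \alpha_i\) for the support vectors (the two-moons data and γ value are illustrative):

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

X, y = make_moons(noise=0.2, random_state=0)
clf = SVC(kernel="rbf", gamma=0.5, C=1.0).fit(X, y)

u = X[:5]  # a few query points
# dual_coef_ holds y_i * alpha_i for each support vector
K = rbf_kernel(clf.support_vectors_, u, gamma=0.5)
manual = clf.dual_coef_ @ K + clf.intercept_
print(np.allclose(manual.ravel(), clf.decision_function(u)))  # True
```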

Kernel Functions in Detail

  • Linear Kernel: Best for linear relationships.
  • Polynomial Kernel: Captures polynomial relationships.
  • RBF Kernel: Effective for highly non-linear data.
  • Choosing a Kernel: Guided by the complexity of the data; cross-validation gives an empirical comparison (sketch below).
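
One practical way to choose is to compare kernels by cross-validation; a sketch on assumed synthetic two-moons data:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.25, random_state=1)

# On this non-linear data the RBF kernel should score highest
for kernel in ("linear", "poly", "rbf"):
    scores = cross_val_score(SVC(kernel=kernel), X, y, cv=5)
    print(f"{kernel}: mean CV accuracy = {scores.mean():.3f}")
```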

Hyperparameter Tuning in SVM

  • Key Parameters:
    • Cost (C): Controls the trade-off between bias and variance; larger values penalize training errors more heavily.
    • Kernel Parameters: E.g., σ in the RBF kernel controls the scale over which points influence one another.
  • Tuning Strategy: Use cross-validation to find values that balance under- and overfitting (see the grid-search sketch below).
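
A grid-search sketch over C and the RBF scale. Note that scikit-learn parameterizes the RBF kernel with gamma rather than σ (gamma = 1/(2σ²)); the grid values and data are illustrative:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.3, random_state=2)

param_grid = {
    "C": [0.1, 1, 10, 100],       # bias/variance trade-off
    "gamma": [0.01, 0.1, 1, 10],  # RBF scale (sigma analog)
}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5).fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```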

Practical Example: Classification Boundaries with RBF Kernel

  • Scenario: How decision-boundary complexity changes as the cost parameter varies.
  • Cost:
    • Low cost: Underfitting; the boundary is too simple.
    • High cost: Overfitting, with overly complex boundaries.
  • Outcome: An intermediate cost achieves a balanced boundary (demonstrated below).
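
The under/overfitting pattern can be reproduced on synthetic data (the dataset and parameter values are assumptions for illustration): a very low cost underfits, while a very high cost memorizes the training set at the expense of test accuracy:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=400, noise=0.35, random_state=3)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=3)

# Watch the train/test gap widen as C grows
for C in (0.01, 1, 10_000):
    clf = SVC(kernel="rbf", gamma=1.0, C=C).fit(X_tr, y_tr)
    print(f"C={C}: train={clf.score(X_tr, y_tr):.2f}, "
          f"test={clf.score(X_te, y_te):.2f}")
```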

Case Study: SVM for Grant Data Classification

  • Data Setup: Predict grant success with RBF kernel.
  • Steps:
    • Tune kernel and cost parameters.
    • Evaluate with reduced and full predictor sets.
  • Results: The reduced predictor set achieved ROC AUC = 0.898 (a stand-in version of the workflow is sketched below).
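
The grant data itself is not included here, so this sketch uses a hypothetical stand-in dataset to mirror the workflow: scale the predictors, tune C and the kernel parameter, and select by cross-validated ROC AUC:

```python
# Hypothetical stand-in for the grant data; X, y would come from
# the actual reduced or full predictor sets in the case study.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=20, random_state=4)

pipe = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
grid = {"svc__C": [0.25, 1, 4, 16], "svc__gamma": ["scale", 0.01, 0.1]}
search = GridSearchCV(pipe, grid, scoring="roc_auc", cv=5).fit(X, y)
print("best ROC AUC:", round(search.best_score_, 3))
```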

Extensions of SVM

  • Least Squares SVM: Simplifies optimization.
  • Relevance Vector Machines (RVM): Bayesian analog with fewer vectors.
  • Graph Kernels: For chemistry and text mining applications.

Conclusion and Best Practices

  • Strengths of SVM:
    • Robust to noise, highly flexible for non-linear patterns.
  • Limitations:
    • Computationally intensive with large datasets.
  • Final Tip: Center and scale predictors for best performance (illustrated below).
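
To illustrate the final tip, a quick comparison on a standard dataset (the dataset choice is mine, not the source's); wrapping the SVM in a pipeline ensures the scaling is learned from the training folds only:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

raw = SVC(kernel="rbf")
scaled = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
# Centering and scaling typically improves SVM accuracy noticeably
print("raw:   ", cross_val_score(raw, X, y, cv=5).mean().round(3))
print("scaled:", cross_val_score(scaled, X, y, cv=5).mean().round(3))
```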