2024-11-01
Introduction to Non-Linear Models
- Purpose: Capture complex patterns in data.
- Methods Covered: Support Vector Machines (SVM) for regression and classification.
- Why SVM? Known for flexibility (via kernel functions) and robustness to outliers (via the ε-insensitive loss in regression).
Overview of Support Vector Machines (SVM)
- Developed by: Vladimir Vapnik; the theoretical foundations date to the 1960s, while the modern kernel and soft-margin formulation appeared in the 1990s (Cortes and Vapnik, 1995).
- Applications: Originally for classification; extended to regression.
- Core Idea: Maximizing margin to improve model stability and generalization.
Support Vector Machines for Regression
- Goal: Minimize impact of outliers on model.
- Key Method: ε-Insensitive Loss Function (written out below).
- Impact: Only points falling outside the ε zone contribute to the fit.
- Outcome: Models that are less sensitive to data noise.
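Explicitly, for a residual \(r = y_i - \hat{y}_i\), the ε-insensitive loss is: \[
L_{\epsilon}(r) = \begin{cases} 0 & \text{if } |r| \le \epsilon \\ |r| - \epsilon & \text{otherwise} \end{cases}
\]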
Mathematics of SVM Regression
- Objective Function: \[
\text{Minimize: } \text{Cost} \sum_{i=1}^{n} L_{\epsilon}(y_i - \hat{y}_i) + \sum_{j=1}^{P} \beta_j^2
\]
- Key Components:
- \(L_{\epsilon}\): The ε-insensitive loss; residuals within ±ε contribute nothing to the fit.
- Regularization: The \(\sum \beta_j^2\) penalty helps avoid overfitting; the Cost parameter sets the trade-off.
- Support Vectors: Training points with residuals outside the ε tube; only these appear in the final prediction equation.
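As a concrete illustration, a minimal sketch with scikit-learn's SVR (the data and parameter values are illustrative, not from the source):

```python
import numpy as np
from sklearn.svm import SVR

# Illustrative 1-D data: a noisy sine curve
rng = np.random.RandomState(0)
X = np.sort(rng.uniform(0, 5, 80)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=80)

# epsilon sets the width of the insensitive tube;
# C is the cost penalty on errors outside the tube
model = SVR(kernel="rbf", C=1.0, epsilon=0.1).fit(X, y)

# Only points outside the epsilon tube become support vectors
print(f"{len(model.support_)} of {len(X)} points are support vectors")
```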
Non-Linear Classification with SVM
- Challenges: Non-linear boundaries often needed for classification.
- Kernel Trick: Allows non-linear boundaries without explicit transformation.
- Kernel Types: Polynomial, RBF, Hyperbolic Tangent.
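To see why no explicit transformation is needed, a small numerical check (a sketch with a hand-rolled degree-2 feature map):

```python
import numpy as np

def phi(v):
    # Explicit degree-2 feature map for 2-D input:
    # (x1, x2) -> (x1^2, sqrt(2)*x1*x2, x2^2)
    x1, x2 = v
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

x = np.array([1.0, 2.0])
u = np.array([3.0, 0.5])

# Kernel trick: (x . u)^2 computed directly in input space ...
k_direct = np.dot(x, u) ** 2
# ... equals the inner product after the explicit transformation
k_explicit = np.dot(phi(x), phi(u))

print(k_direct, k_explicit)  # both 16.0
```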
Margin and Support Vectors in Classification
- Concept of Margin: Distance between decision boundary and nearest data points.
- Support Vectors: Only the points that lie on the margin, or violate it, influence the boundary.
- Result: Large-margin classifiers are less likely to overfit.
Mathematics of SVM Classification
- Decision Function: \[
D(u) = \beta_0 + \sum_{i=1}^{n} y_i \alpha_i K(x_i, u)
\]
- Terms:
- \(y_i\): Class labels, coded as ±1
- \(\alpha_i\): Dual parameters; nonzero only for the support vectors
- \(K(x_i, u)\): Kernel function; its choice controls the flexibility of the boundary
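As a sanity check on this formula, a sketch that reconstructs \(D(u)\) from a fitted scikit-learn SVC, whose dual_coef_ attribute stores the products \(y_i \alpha_i\):

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

X, y = make_moons(noise=0.2, random_state=0)
clf = SVC(kernel="rbf", gamma=0.5, C=1.0).fit(X, y)

u = X[:5]  # a few query points

# D(u) = beta_0 + sum_i y_i * alpha_i * K(x_i, u)
K = rbf_kernel(clf.support_vectors_, u, gamma=0.5)
manual = clf.intercept_ + clf.dual_coef_ @ K

print(np.allclose(manual.ravel(), clf.decision_function(u)))  # True
```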
Kernel Functions in Detail
- Linear Kernel: Best for linear relationships.
- Polynomial Kernel: Captures polynomial relationships.
- RBF Kernel: Effective for highly non-linear data.
- Choosing a Kernel: Based on data complexity.
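Common parameterizations of these kernels (conventions and parameter names differ slightly across libraries): \[
\begin{aligned}
\text{linear: } & K(x, u) = x^{\top}u \\
\text{polynomial: } & K(x, u) = \left(\text{scale}\,(x^{\top}u) + 1\right)^{\text{degree}} \\
\text{RBF: } & K(x, u) = \exp\!\left(-\sigma \lVert x - u \rVert^{2}\right) \\
\text{hyperbolic tangent: } & K(x, u) = \tanh\!\left(\text{scale}\,(x^{\top}u) + 1\right)
\end{aligned}
\]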
Hyperparameter Tuning in SVM
- Key Parameters:
- Cost (C): Penalty for margin violations; small values give smoother, higher-bias boundaries, large values more complex, higher-variance ones.
- Kernel Parameters: E.g., σ in the RBF kernel controls how far each support vector's influence extends.
- Tuning Strategy: Use cross-validation to balance under- and overfitting, as in the sketch below.
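A minimal tuning sketch with scikit-learn (the grid values are illustrative; scikit-learn's gamma plays the role of σ):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)

# Search cost (C) and kernel scale (gamma) jointly;
# both are typically explored on a log scale
grid = GridSearchCV(
    SVC(kernel="rbf"),
    param_grid={"C": [0.01, 0.1, 1, 10, 100],
                "gamma": [0.001, 0.01, 0.1, 1]},
    cv=5,
    scoring="roc_auc",
)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```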
Practical Example: Classification Boundaries with RBF Kernel
- Scenario: How the decision boundary changes as the cost parameter varies.
- Cost:
- Low cost: Underfitting.
- High cost: Overfitting with complex boundaries.
- Outcome: Optimal cost achieves balanced boundary.
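A sketch of the same experiment on illustrative data: training accuracy and cross-validated accuracy diverge as cost grows.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.3, random_state=0)

for C in [0.001, 1, 1000]:
    clf = SVC(kernel="rbf", gamma=1.0, C=C).fit(X, y)
    cv_acc = cross_val_score(
        SVC(kernel="rbf", gamma=1.0, C=C), X, y, cv=5
    ).mean()
    # Typical pattern: very low C underfits (both scores low);
    # very high C overfits (training score high, CV score lower)
    print(f"C={C}: train={clf.score(X, y):.2f}, cv={cv_acc:.2f}")
```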
Case Study: SVM for Grant Data Classification
- Data Setup: Predict grant success with RBF kernel.
- Steps:
- Tune kernel and cost parameters.
- Evaluate with reduced and full predictor sets.
- Results: Reduced set achieved ROC AUC = 0.898.
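The grant data itself isn't reproduced here; this sketch shows the shape of the evaluation step with a stand-in dataset (reduced_cols is a hypothetical predictor subset):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Stand-in for the grant data: 40 predictors, 10 informative
X, y = make_classification(n_samples=500, n_features=40,
                           n_informative=10, random_state=0)
reduced_cols = list(range(10))  # hypothetical reduced predictor set

for name, Xs in [("full", X), ("reduced", X[:, reduced_cols])]:
    auc = cross_val_score(SVC(kernel="rbf", C=1.0), Xs, y,
                          cv=5, scoring="roc_auc").mean()
    print(f"{name}: ROC AUC = {auc:.3f}")
```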
Extensions of SVM
- Least Squares SVM: Replaces the loss with squared error, reducing the optimization to solving a linear system.
- Relevance Vector Machines (RVM): Bayesian analog that typically retains far fewer vectors.
- Specialized Kernels: E.g., graph kernels for chemistry and string kernels for text mining.
Conclusion and Best Practices
- Strengths of SVM:
- Robust to noise, highly flexible for non-linear patterns.
- Limitations:
- Computationally intensive with large datasets.
- Final Tip: Center and scale predictors before fitting; see the sketch below for one way to build this in.
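One way to make that preprocessing automatic (a minimal sketch):

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# StandardScaler centers and scales each predictor; putting it in a
# Pipeline means it is re-fit on the training folds during resampling,
# avoiding information leakage from held-out data
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
```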