\[ \begin{split} D(u) &=\beta_0 + \sum^P_{j=1}\beta_ju_j\\ &=\beta_0+\sum^n_{i=1}y_i\alpha_ix^{'}_iu \end{split} \]
\(\alpha_i=0\) for samples not on the margin, so only the support vectors contribute to the sum.
The prediction is therefore supported by the data points predicted with the largest uncertainty: those on the margin, i.e. the support vectors.
The decision value is a summation, over the support vectors, of the product of the following (a numeric sketch appears after the list):

* the sign of the class, \(y_i\)
* the model parameter, \(\alpha_i\)
* the dot product between the new sample and the support vector, \(x_i^{'}u\), which is itself the product of:
    + the distance of \(x_i\) from the origin
    + the distance of \(u\) from the origin
    + the cosine of the angle between \(x_i\) and \(u\)
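A toy sketch of this summation; every number below is hypothetical, chosen only to illustrate the arithmetic:

```r
beta0 <- 0.5                           # hypothetical intercept
y     <- c(1, -1, 1)                   # class signs of the support vectors
alpha <- c(0.7, 0.4, 0.3)              # alpha_i; zero for non-support points
x     <- rbind(c(1, 2),                # support vectors, one per row
               c(2, 0.5),
               c(0, 1))
u     <- c(1.5, 1)                     # new sample to classify
D_u   <- beta0 + sum(y * alpha * (x %*% u))
sign(D_u)                              # predicted class
```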
Tuning by resampling.
The cost parameter penalises data points on the wrong side of the boundary or inside the margin.
Higher cost values allow a more complex boundary; values that are too high lead to over-fitting.
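A minimal sketch of cost tuning by cross-validation, using caret's stock `svmLinear` method on the built-in `iris` data; the grid of cost values is illustrative:

```r
library(caret)
set.seed(100)
# Resample each candidate cost with 10-fold CV and keep the best one
svmLinTuned <- train(Species ~ ., data = iris,
                     method = "svmLinear",
                     preProc = c("center", "scale"),
                     tuneGrid = data.frame(C = 2^(-2:7)),
                     trControl = trainControl(method = "cv"))
svmLinTuned$bestTune
```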
Substituting the linear dot product \(x_i^{'}u\) with a kernel function produces flexible, non-linear decision boundaries:
\[ \begin{split} D(u)&=\beta_0 + \sum^P_{j=1}\beta_ju_j\\&=\beta_0+\sum^n_{i=1}y_i\alpha_iK(x_i,u) \end{split} \]
Centering and scaling the predictors beforehand prevents variables that are large in magnitude from dominating the kernel calculations.
Common kernel functions include the polynomial, radial basis, and hyperbolic tangent kernels; specialized kernels also exist for particular types of data.
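For instance, kernlab's radial basis kernel \(K(x,u)=\exp(-\sigma\lVert x-u\rVert^2)\) can be evaluated directly; a quick sketch checking it against the formula:

```r
library(kernlab)
x   <- c(1, 2, 3)
u   <- c(2, 2, 2)
rbf <- rbfdot(sigma = 0.1)             # Gaussian radial basis kernel
rbf(x, u)                              # kernel evaluation
exp(-0.1 * sum((x - u)^2))             # manual check, same value
```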
The regression line is supported by the poorly predicted points, i.e. those located outside \(\pm\epsilon\) of the line; samples within the \(\epsilon\)-tube do not contribute to the fit.
SVM regression coefficients minimize the \(\epsilon\)-insensitive loss function plus a quadratic penalty: \[ \text{Cost}\sum^n_{i=1}L_\epsilon(y_i-\hat{y}_i)+\sum^P_{j=1}\beta^2_j \]
The cost parameter is set by the user and penalises large residuals.
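A minimal sketch of the \(\epsilon\)-insensitive loss \(L_\epsilon\): residuals within \(\pm\epsilon\) cost nothing, and anything beyond the tube is charged linearly:

```r
# Zero inside the epsilon-tube, linear outside it
eps_loss <- function(resid, eps = 0.1) pmax(0, abs(resid) - eps)
eps_loss(c(-0.05, 0.08, 0.3, -0.6))    # 0.0 0.0 0.2 0.5
```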
```r
library(AppliedPredictiveModeling)
data(solubility)
library(caret)
## Loading required package: lattice
## Loading required package: ggplot2
```
```r
svmRTuned <- train(solTrainXtrans, solTrainY,
                   method = "svmRadial",
                   preProc = c("center", "scale"),
                   tuneLength = 14,
                   trControl = trainControl(method = "cv"))
## Loading required package: kernlab
```
```r
svmRTuned
## Support Vector Machines with Radial Basis Function Kernel
##
## 951 samples
## 228 predictors
##
## Pre-processing: centered (228), scaled (228)
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 857, 856, 856, 855, 855, 856, ...
## Resampling results across tuning parameters:
##
## C RMSE Rsquared RMSE SD Rsquared SD
## 0.25 0.7987222 0.8679683 0.08522893 0.01588388
## 0.50 0.7042621 0.8900392 0.06991161 0.01407636
## 1.00 0.6533926 0.9018439 0.06163992 0.01386187
## 2.00 0.6213222 0.9095591 0.06079800 0.01457232
## 4.00 0.6061262 0.9134779 0.06108246 0.01603500
## 8.00 0.5920402 0.9171835 0.05957932 0.01606659
## 16.00 0.5899165 0.9178490 0.05610869 0.01575665
## 32.00 0.5895256 0.9179679 0.05434215 0.01542801
## 64.00 0.5911698 0.9176429 0.05053791 0.01451370
## 128.00 0.5928217 0.9172239 0.04642479 0.01363293
## 256.00 0.5958985 0.9163522 0.04313798 0.01336025
## 512.00 0.5978189 0.9157150 0.04313953 0.01402077
## 1024.00 0.5993385 0.9153275 0.04324524 0.01409860
## 2048.00 0.6028148 0.9143679 0.04378952 0.01412931
##
## Tuning parameter 'sigma' was held constant at a value of 0.00268803
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were sigma = 0.00268803 and C = 32.
```
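caret holds \(\sigma\) fixed at an analytical estimate rather than tuning it; `kernlab::sigest()` produces such an estimate (the exact internal call caret makes may differ):

```r
library(kernlab)
# Quantiles of plausible sigma values; a single value drawn from this
# range can be held constant while only the cost is tuned
sigest(as.matrix(solTrainXtrans))
```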
```r
svmRTuned$finalModel
## Support Vector Machine object of class "ksvm"
##
## SV type: eps-svr (regression)
## parameter : epsilon = 0.1 cost C = 32
##
## Gaussian Radial Basis kernel function.
## Hyperparameter : sigma = 0.00268802992487251
##
## Number of Support Vectors : 625
##
## Objective Function Value : -378.2904
## Training error : 0.009743
```
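As a sanity check, the tuned model can be applied to the test portion of the solubility data; a minimal sketch:

```r
svmPred <- predict(svmRTuned, solTestXtrans)
postResample(pred = svmPred, obs = solTestY)   # test-set RMSE, R^2, MAE
```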
Kuhn, M., & Johnson, K. (2013). *Applied Predictive Modeling*. New York: Springer.