0.1 SVM for two classes

0.1.1 SVM

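The two classes are not linearly separable, so we fit a support vector classifier (a soft-margin SVM with a linear kernel), first with cost = 10 and then with cost = 0.1. The indices printed before each summary identify the support vectors.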

## [1]  1  2  5  7 14 16 17
## 
## Call:
## svm(formula = y ~ ., data = dat, kernel = "linear", cost = 10, 
##     scale = FALSE)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  linear 
##        cost:  10 
## 
## Number of Support Vectors:  7
## 
##  ( 4 3 )
## 
## 
## Number of Classes:  2 
## 
## Levels: 
##  -1 1

##  [1]  1  2  3  4  5  7  9 10 12 13 14 15 16 17 18 20
## 
## Call:
## svm(formula = y ~ ., data = dat, kernel = "linear", cost = 0.1, 
##     scale = FALSE)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  linear 
##        cost:  0.1 
## 
## Number of Support Vectors:  16
## 
##  ( 8 8 )
## 
## 
## Number of Classes:  2 
## 
## Levels: 
##  -1 1
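
The two summaries above differ only in `cost`: 7 support vectors at cost = 10 against 16 at cost = 0.1. A minimal sketch of calls that would produce output of this shape, assuming the `e1071` package; the simulated data (seed, sample size, class shift of 1) and the object names `fit_c10` and `fit_c01` are assumptions, not taken from the original:

```r
library(e1071)

# Assumed simulated data: 20 points in 2 dimensions, two overlapping classes.
set.seed(1)
x <- matrix(rnorm(20 * 2), ncol = 2)
y <- c(rep(-1, 10), rep(1, 10))
x[y == 1, ] <- x[y == 1, ] + 1
dat <- data.frame(x = x, y = factor(y, levels = c(-1, 1)))

# Large cost: margin violations are penalised heavily (narrow margin).
fit_c10 <- svm(y ~ ., data = dat, kernel = "linear", cost = 10, scale = FALSE)

# Small cost: margin violations are cheap (wide margin).
fit_c01 <- svm(y ~ ., data = dat, kernel = "linear", cost = 0.1, scale = FALSE)

fit_c10$index    # row indices of the support vectors
summary(fit_c10)
fit_c01$index
summary(fit_c01)
```

A smaller cost tolerates more margin violations, so the margin widens and more observations usually end up as support vectors, which is the pattern in the output above.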

0.2 Cross-Validation to Select cost

## 
## Parameter tuning of 'svm':
## 
## - sampling method: 10-fold cross validation 
## 
## - best parameters:
##  cost
##   0.1
## 
## - best performance: 0.05 
## 
## - Detailed performance results:
##    cost error dispersion
## 1 1e-02  0.55  0.4377975
## 2 1e-01  0.05  0.1581139
## 3 1e+00  0.15  0.2415229
## 4 5e+00  0.15  0.2415229
## 5 1e+01  0.15  0.2415229
## 6 1e+02  0.15  0.2415229

## 
## Call:
## best.tune(method = svm, train.x = y ~ ., data = dat, ranges = list(cost = c(0.01, 
##     0.1, 1, 5, 10, 100)), kernel = "linear")
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  linear 
##        cost:  0.1 
## 
## Number of Support Vectors:  16
## 
##  ( 8 8 )
## 
## 
## Number of Classes:  2 
## 
## Levels: 
##  -1 1
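
The tuning table and the best model above are consistent with a call to `tune()` from `e1071`, which runs 10-fold cross-validation by default. The following is a sketch under assumptions: the data are simulated here (seed and class shift are not from the original), and `tune_out` and `best_mod` are illustrative names.

```r
library(e1071)

# Assumed simulated data, same shape as the two-class example.
set.seed(1)
x <- matrix(rnorm(20 * 2), ncol = 2)
y <- c(rep(-1, 10), rep(1, 10))
x[y == 1, ] <- x[y == 1, ] + 1
dat <- data.frame(x = x, y = factor(y, levels = c(-1, 1)))

# Grid of cost values taken from the output above.
costs <- c(0.01, 0.1, 1, 5, 10, 100)

# Cross-validate a linear SVM over the cost grid.
tune_out <- tune(svm, y ~ ., data = dat, kernel = "linear",
                 ranges = list(cost = costs))
summary(tune_out)

# The model refit on all data with the best cost.
best_mod <- tune_out$best.model
summary(best_mod)
```

In the output above, cost = 0.1 gives the lowest cross-validation error (0.05), so the best model matches the cost = 0.1 fit with 16 support vectors.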

0.3 Test Set for Prediction

##        truth
## predict -1 1
##      -1  9 1
##      1   2 8
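
In the confusion table above, 9 + 8 = 17 of the 20 test observations are classified correctly (85% accuracy). A table of this form can be sketched as follows; the test set is simulated here under assumed settings, and `testdat`, `conf` and `accuracy` are illustrative names.

```r
library(e1071)

# Assumed simulated training data.
set.seed(1)
x <- matrix(rnorm(20 * 2), ncol = 2)
y <- c(rep(-1, 10), rep(1, 10))
x[y == 1, ] <- x[y == 1, ] + 1
dat <- data.frame(x = x, y = factor(y, levels = c(-1, 1)))

# Linear SVM at the cost selected by cross-validation in the text.
fit <- svm(y ~ ., data = dat, kernel = "linear", cost = 0.1, scale = FALSE)

# Assumed test set generated the same way as the training data.
xtest <- matrix(rnorm(20 * 2), ncol = 2)
ytest <- sample(c(-1, 1), 20, replace = TRUE)
xtest[ytest == 1, ] <- xtest[ytest == 1, ] + 1
testdat <- data.frame(x = xtest, y = factor(ytest, levels = c(-1, 1)))

# Predicted labels against true labels.
ypred <- predict(fit, testdat)
conf <- table(predict = ypred, truth = testdat$y)
conf

# Share of test observations on the diagonal.
accuracy <- sum(diag(conf)) / sum(conf)
accuracy
```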

0.3.2 Linearly Separable Case

## 
## Call:
## svm(formula = y ~ ., data = dat1, kernel = "linear", cost = 1, 
##     scale = FALSE)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  linear 
##        cost:  1 
## 
## Number of Support Vectors:  7
## 
##  ( 3 4 )
## 
## 
## Number of Classes:  2 
## 
## Levels: 
##  -1 1
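
For the separable case, one way to obtain a data set like `dat1` is to push the two classes further apart before fitting. This is only a sketch: the shift of 1.5 and the seed are assumptions, and the original construction of `dat1` is not shown.

```r
library(e1071)

# Assumed simulated data with the classes moved further apart.
set.seed(1)
x <- matrix(rnorm(20 * 2), ncol = 2)
y <- c(rep(-1, 10), rep(1, 10))
x[y == 1, ] <- x[y == 1, ] + 1.5
dat1 <- data.frame(x = x, y = factor(y, levels = c(-1, 1)))

# Same call as in the summary above.
fit1 <- svm(y ~ ., data = dat1, kernel = "linear", cost = 1, scale = FALSE)
summary(fit1)

# Training confusion table: off-diagonal counts are training errors.
train_conf <- table(fitted = fit1$fitted, truth = dat1$y)
train_conf
```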

0.3.3 Non-Linear SVM

## 
## Call:
## svm(formula = y ~ ., data = datn[train, ], kernel = "radial", 
##     gamma = 1, cost = 1e-04)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  radial 
##        cost:  1e-04 
## 
## Number of Support Vectors:  54
## 
##  ( 27 27 )
## 
## 
## Number of Classes:  2 
## 
## Levels: 
##  1 2
## 
## Parameter tuning of 'svm':
## 
## - sampling method: 10-fold cross validation 
## 
## - best parameters:
##  cost gamma
##     1   0.5
## 
## - best performance: 0.07 
## 
## - Detailed performance results:
##     cost gamma error dispersion
## 1  1e-02   0.5  0.27 0.15670212
## 2  1e-01   0.5  0.26 0.15776213
## 3  1e+00   0.5  0.07 0.08232726
## 4  5e+00   0.5  0.07 0.08232726
## 5  1e+01   0.5  0.07 0.08232726
## 6  1e+02   0.5  0.14 0.15055453
## 7  1e-02   1.0  0.27 0.15670212
## 8  1e-01   1.0  0.22 0.16193277
## 9  1e+00   1.0  0.07 0.08232726
## 10 5e+00   1.0  0.08 0.07888106
## 11 1e+01   1.0  0.09 0.07378648
## 12 1e+02   1.0  0.12 0.12292726
## 13 1e-02   2.0  0.27 0.15670212
## 14 1e-01   2.0  0.27 0.15670212
## 15 1e+00   2.0  0.07 0.08232726
## 16 5e+00   2.0  0.09 0.07378648
## 17 1e+01   2.0  0.11 0.07378648
## 18 1e+02   2.0  0.12 0.13165612
## 19 1e-02   3.0  0.27 0.15670212
## 20 1e-01   3.0  0.27 0.15670212
## 21 1e+00   3.0  0.07 0.08232726
## 22 5e+00   3.0  0.11 0.07378648
## 23 1e+01   3.0  0.08 0.07888106
## 24 1e+02   3.0  0.13 0.14181365
## 25 1e-02   4.0  0.27 0.15670212
## 26 1e-01   4.0  0.27 0.15670212
## 27 1e+00   4.0  0.07 0.08232726
## 28 5e+00   4.0  0.10 0.06666667
## 29 1e+01   4.0  0.09 0.07378648
## 30 1e+02   4.0  0.13 0.14181365
##     pred
## true  1  2
##    1 67 10
##    2  2 21
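
The grid above crosses 6 values of cost with 5 values of gamma (30 combinations), and the final table shows 10 + 2 = 12 of 100 test observations misclassified. A sketch of the tuning and evaluation steps, assuming simulated data with a non-linear class boundary (the data-generating code and the seed are assumptions; only the `svm` arguments and the grid values come from the output):

```r
library(e1071)

# Assumed simulated data: class 1 sits on both sides of class 2.
set.seed(1)
x <- matrix(rnorm(200 * 2), ncol = 2)
x[1:100, ] <- x[1:100, ] + 2
x[101:150, ] <- x[101:150, ] - 2
y <- c(rep(1, 150), rep(2, 50))
datn <- data.frame(x = x, y = as.factor(y))

# Random half of the rows for training, the rest for testing.
train <- sample(200, 100)

# Cross-validate a radial-kernel SVM over cost and gamma.
tune_out <- tune(svm, y ~ ., data = datn[train, ], kernel = "radial",
                 ranges = list(cost = c(0.01, 0.1, 1, 5, 10, 100),
                               gamma = c(0.5, 1, 2, 3, 4)))
summary(tune_out)

# Evaluate the best model on the held-out rows.
conf <- table(true = datn[-train, "y"],
              pred = predict(tune_out$best.model, newdata = datn[-train, ]))
conf
```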

0.4 ROC Curves

0.5 SVM with Multiple Classes

## 
## Call:
## svm(formula = y ~ ., data = datm, kernel = "radial", cost = 10, 
##     gamma = 1)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  radial 
##        cost:  10 
## 
## Number of Support Vectors:  105
## 
##  ( 38 37 30 )
## 
## 
## Number of Classes:  3 
## 
## Levels: 
##  0 1 2
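
With more than two classes, `svm()` in `e1071` uses the one-against-one approach, fitting a classifier for each pair of classes. The summary above reports support vectors per class (38, 37, 30). A sketch with an assumed three-class simulated data set; the construction of `datm` is not shown in the original, so the data below are illustrative only:

```r
library(e1071)

# Assumed simulated data: three classes labelled 0, 1 and 2.
set.seed(1)
x <- matrix(rnorm(250 * 2), ncol = 2)
x[1:100, ] <- x[1:100, ] + 2
x[101:150, ] <- x[101:150, ] - 2
y <- c(rep(1, 150), rep(2, 50), rep(0, 50))
x[y == 0, 2] <- x[y == 0, 2] + 2
datm <- data.frame(x = x, y = as.factor(y))

# Same call as in the summary above.
fit_m <- svm(y ~ ., data = datm, kernel = "radial", cost = 10, gamma = 1)
summary(fit_m)

# Support vectors per class, in the order of the factor levels.
fit_m$nSV
```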

0.5.1 ISLR Example

## [1] "xtrain" "xtest"  "ytrain" "ytest"
## [1]   63 2308
## [1]   20 2308
## [1] 63
## [1] 20
## 
##  1  2  3  4 
##  8 23 12 20
## 
## 1 2 3 4 
## 3 6 6 5
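
The output above matches an inspection of the `Khan` gene expression data from the `ISLR` package: 63 training and 20 test observations on 2308 genes, with four class labels. Assuming that package is installed, the inspection can be sketched as:

```r
library(ISLR)

# Components of the Khan list.
names(Khan)

# Dimensions of the training and test predictor matrices.
dim(Khan$xtrain)
dim(Khan$xtest)

# Number of training and test labels.
length(Khan$ytrain)
length(Khan$ytest)

# Class counts in the training and test sets.
table(Khan$ytrain)
table(Khan$ytest)
```

With far more predictors than observations, a linear kernel is a natural first choice, since extra flexibility from a non-linear kernel is rarely needed.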

0.5.2 SVM to predict

## 
## Call:
## svm(formula = y ~ ., data = dat, kernel = "linear", cost = 10)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  linear 
##        cost:  10 
## 
## Number of Support Vectors:  58
## 
##  ( 20 20 11 7 )
## 
## 
## Number of Classes:  4 
## 
## Levels: 
##  1 2 3 4
##    
##      1  2  3  4
##   1  8  0  0  0
##   2  0 23  0  0
##   3  0  0 12  0
##   4  0  0  0 20
##        
## pred.te 1 2 3 4
##       1 3 0 0 0
##       2 0 6 2 0
##       3 0 0 4 0
##       4 0 0 0 5
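
The first table above shows no training errors. The second shows 2 errors on the 20 test observations, both class 3 observations predicted as class 2. A sketch of the fit and prediction steps, assuming the `ISLR` and `e1071` packages; `dat_te` and `conf_te` are illustrative names:

```r
library(ISLR)
library(e1071)

# Training data frame: expression values plus the class label as a factor.
dat <- data.frame(x = Khan$xtrain, y = as.factor(Khan$ytrain))

# Linear-kernel SVM, as in the summary above.
out <- svm(y ~ ., data = dat, kernel = "linear", cost = 10)
summary(out)

# Fitted against true labels on the training set.
table(out$fitted, dat$y)

# Test data frame built the same way, so the column names match.
dat_te <- data.frame(x = Khan$xtest, y = as.factor(Khan$ytest))

# Predicted against true labels on the test set.
pred.te <- predict(out, newdata = dat_te)
conf_te <- table(pred.te, dat_te$y)
conf_te
```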