SVM in R

Now we apply SVM to the fish data.

First, read the data from the file “fish.data.txt”:

fish <- read.table("fish.data.txt", header = TRUE)
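
It is worth a quick look at the data first (the formula calls later in this section show that the class label sits in a column named Species):

str(fish)            # one factor column (Species) plus six numeric measurements

table(fish$Species)  # class sizes of the seven species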

library(e1071)

Then apply SVM to the data using the R function svm():

s <- svm(fish[, 2:7], fish[, 1])

Note that the default settings are cost = 1, kernel = RBF (radial basis function), and gamma = 1/(number of variables).
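
For reference, here is a minimal sketch with those defaults written out explicitly (gamma = 1/6 here, since there are six predictor columns):

s0 <- svm(fish[, 2:7], fish[, 1], type = "C-classification", kernel = "radial", cost = 1, gamma = 1/6)

Check out the summary of the output: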

summary(s)
## 
## Call:
## svm.default(x = fish[, 2:7], y = fish[, 1])
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  radial 
##        cost:  1 
##       gamma:  0.1666667 
## 
## Number of Support Vectors:  85
## 
##  ( 12 5 18 8 4 8 30 )
## 
## 
## Number of Classes:  7 
## 
## Levels: 
##  bream parki perch pike roach smelt white

Use the SVM classifier to predict the classes of the original data (e1071 uses the “one-against-one” strategy for multi-class problems):

pred <- predict(s, fish[, 2:7])

table(fish[, 1], pred)  # not very good
##        pred
##         bream parki perch pike roach smelt white
##   bream    33     0     0    0     0     0     0
##   parki     1     9     0    0     0     0     0
##   perch     0     0    54    0     0     0     0
##   pike      0     0     0   16     0     0     0
##   roach     0     0    18    0     0     0     0
##   smelt     0     0     1    0     0    11     0
##   white     0     0     5    0     0     0     0
# The apparent error rate is 25/148
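
The apparent error rate can also be computed directly, rather than counted from the table:

sum(pred != fish[, 1])   # 25 misclassified observations

mean(pred != fish[, 1])  # apparent error rate, 25/148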

We can change gamma to obtain a new (and possibly better) classifier:

s1 <- svm(fish[, 2:7], fish[, 1], gamma = 1)

pred1 <- predict(s1, fish[, 2:7])

table(fish[, 1], pred1)
##        pred1
##         bream parki perch pike roach smelt white
##   bream    33     0     0    0     0     0     0
##   parki     0    10     0    0     0     0     0
##   perch     0     0    53    0     1     0     0
##   pike      0     0     0   16     0     0     0
##   roach     0     0    11    0     7     0     0
##   smelt     0     0     0    0     0    12     0
##   white     0     0     4    0     1     0     0
# The new error rate is 17/148, noticeably smaller than before.

# Choose an even larger gamma:
s5 <- svm(fish[, 2:7], fish[, 1], gamma = 5)

pred5 <- predict(s5, fish[, 2:7])

table(fish[, 1], pred5)
##        pred5
##         bream parki perch pike roach smelt white
##   bream    33     0     0    0     0     0     0
##   parki     0    10     0    0     0     0     0
##   perch     0     0    54    0     0     0     0
##   pike      0     0     0   16     0     0     0
##   roach     0     0     7    0    11     0     0
##   smelt     0     0     0    0     0    12     0
##   white     0     0     1    0     1     0     3
# The new error rate is 9/148, smaller still.

# In theory, increasing gamma further will drive the apparent error rate to 0.

# However, this can cause over-fitting, which harms the “true error rate”.
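
To see this effect, here is a small sketch that recomputes the apparent error over a grid of increasing gamma values (the values are arbitrary, chosen only for illustration):

gammas <- c(0.2, 1, 5, 20, 100)

sapply(gammas, function(g) {
  m <- svm(fish[, 2:7], fish[, 1], gamma = g)
  mean(predict(m, fish[, 2:7]) != fish[, 1])  # apparent error rate
})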

Now we use 20-fold CV to compare a few combinations of (cost, gamma) by hand, before running a full grid search:

(1) Let’s start with (cost = 0.1, gamma = 0.1):

c1 <- svm(fish[, 2:7], fish[, 1], cost = 0.1, gamma = 0.1, cross = 20)

summary(c1)
## 
## Call:
## svm.default(x = fish[, 2:7], y = fish[, 1], gamma = 0.1, cost = 0.1, 
##     cross = 20)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  radial 
##        cost:  0.1 
##       gamma:  0.1 
## 
## Number of Support Vectors:  137
## 
##  ( 33 5 18 10 12 16 43 )
## 
## 
## Number of Classes:  7 
## 
## Levels: 
##  bream parki perch pike roach smelt white
## 
## 20-fold cross-validation on training data:
## 
## Total Accuracy: 63.51351 
## Single Accuracies:
##  100 42.85714 50 71.42857 62.5 57.14286 57.14286 62.5 100 37.5 42.85714 85.71429 62.5 57.14286 87.5 57.14286 71.42857 62.5 57.14286 50
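
The CV accuracies printed by summary() are also stored on the fitted object (fields returned by e1071’s svm whenever cross > 0), which is convenient for comparing settings programmatically:

c1$tot.accuracy  # overall 20-fold CV accuracy (about 63.5 here)

c1$accuracies    # the 20 per-fold accuracies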

(2) Change to (cost = 0.5, gamma = 0.1):

c1 <- svm(fish[, 2:7], fish[, 1], cost = 0.5, gamma = 0.1, cross = 20)

summary(c1)
## 
## Call:
## svm.default(x = fish[, 2:7], y = fish[, 1], gamma = 0.1, cost = 0.5, 
##     cross = 20)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  radial 
##        cost:  0.5 
##       gamma:  0.1 
## 
## Number of Support Vectors:  101
## 
##  ( 17 5 18 10 10 9 32 )
## 
## 
## Number of Classes:  7 
## 
## Levels: 
##  bream parki perch pike roach smelt white
## 
## 20-fold cross-validation on training data:
## 
## Total Accuracy: 82.43243 
## Single Accuracies:
##  85.71429 85.71429 62.5 71.42857 87.5 100 71.42857 100 71.42857 75 85.71429 71.42857 100 57.14286 100 71.42857 100 87.5 71.42857 87.5

(3) Change to (cost = 130, gamma = 0.2):

c1 <- svm(fish[, 2:7], fish[, 1], cost = 130, gamma = 0.2, cross = 20)

summary(c1)
## 
## Call:
## svm.default(x = fish[, 2:7], y = fish[, 1], gamma = 0.2, cost = 130, 
##     cross = 20)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  radial 
##        cost:  130 
##       gamma:  0.2 
## 
## Number of Support Vectors:  65
## 
##  ( 6 5 14 4 2 9 25 )
## 
## 
## Number of Classes:  7 
## 
## Levels: 
##  bream parki perch pike roach smelt white
## 
## 20-fold cross-validation on training data:
## 
## Total Accuracy: 91.21622 
## Single Accuracies:
##  85.71429 85.71429 100 100 87.5 100 85.71429 100 85.71429 100 71.42857 85.71429 100 85.71429 87.5 85.71429 100 87.5 100 87.5

Question: how can we find the combination of (cost, gamma) that minimizes the prediction error?

To find the best combination of the tuning parameters (cost, gamma), one can perform a grid search over a prespecified parameter range. Suppose we now search for the best (cost, gamma) over the region [100, 1000] × [0.5, 5], with 10 equally spaced points in each dimension (thus 100 grid points to compare).

The result, based on 10-fold cross-validation (the default), can be produced by:

tobj <- tune.svm(Species ~ ., data = fish, cost = 100 * (1:10), gamma = 0.5 * (1:10))

summary(tobj)
## 
## Parameter tuning of 'svm':
## 
## - sampling method: 10-fold cross validation 
## 
## - best parameters:
##  gamma cost
##    0.5  400
## 
## - best performance: 0.0947619 
## 
## - Detailed performance results:
##     gamma cost     error dispersion
## 1     0.5  100 0.1223810 0.09486833
## 2     1.0  100 0.1157143 0.07949355
## 3     1.5  100 0.1147619 0.06327144
## 4     2.0  100 0.1280952 0.06600007
## 5     2.5  100 0.1485714 0.06855967
## 6     3.0  100 0.1752381 0.07058764
## 7     3.5  100 0.1819048 0.07611769
## 8     4.0  100 0.1819048 0.07611769
## 9     4.5  100 0.2023810 0.08824169
## 10    5.0  100 0.2023810 0.08824169
## 11    0.5  200 0.1085714 0.09150900
## 12    1.0  200 0.1085714 0.07999118
## 13    1.5  200 0.1147619 0.06327144
## 14    2.0  200 0.1280952 0.06600007
## 15    2.5  200 0.1485714 0.06855967
## 16    3.0  200 0.1752381 0.07058764
## 17    3.5  200 0.1819048 0.07611769
## 18    4.0  200 0.1819048 0.07611769
## 19    4.5  200 0.2023810 0.08824169
## 20    5.0  200 0.2023810 0.08824169
## 21    0.5  300 0.1085714 0.09150900
## 22    1.0  300 0.1085714 0.07999118
## 23    1.5  300 0.1147619 0.06327144
## 24    2.0  300 0.1280952 0.06600007
## 25    2.5  300 0.1485714 0.06855967
## 26    3.0  300 0.1752381 0.07058764
## 27    3.5  300 0.1819048 0.07611769
## 28    4.0  300 0.1819048 0.07611769
## 29    4.5  300 0.2023810 0.08824169
## 30    5.0  300 0.2023810 0.08824169
## 31    0.5  400 0.0947619 0.10214258
## 32    1.0  400 0.1085714 0.07999118
## 33    1.5  400 0.1147619 0.06327144
## 34    2.0  400 0.1280952 0.06600007
## 35    2.5  400 0.1485714 0.06855967
## 36    3.0  400 0.1752381 0.07058764
## 37    3.5  400 0.1819048 0.07611769
## 38    4.0  400 0.1819048 0.07611769
## 39    4.5  400 0.2023810 0.08824169
## 40    5.0  400 0.2023810 0.08824169
## 41    0.5  500 0.0947619 0.10214258
## 42    1.0  500 0.1085714 0.07999118
## 43    1.5  500 0.1147619 0.06327144
## 44    2.0  500 0.1280952 0.06600007
## 45    2.5  500 0.1485714 0.06855967
## 46    3.0  500 0.1752381 0.07058764
## 47    3.5  500 0.1819048 0.07611769
## 48    4.0  500 0.1819048 0.07611769
## 49    4.5  500 0.2023810 0.08824169
## 50    5.0  500 0.2023810 0.08824169
## 51    0.5  600 0.0947619 0.10214258
## 52    1.0  600 0.1085714 0.07999118
## 53    1.5  600 0.1147619 0.06327144
## 54    2.0  600 0.1280952 0.06600007
## 55    2.5  600 0.1485714 0.06855967
## 56    3.0  600 0.1752381 0.07058764
## 57    3.5  600 0.1819048 0.07611769
## 58    4.0  600 0.1819048 0.07611769
## 59    4.5  600 0.2023810 0.08824169
## 60    5.0  600 0.2023810 0.08824169
## 61    0.5  700 0.0947619 0.10214258
## 62    1.0  700 0.1085714 0.07999118
## 63    1.5  700 0.1147619 0.06327144
## 64    2.0  700 0.1280952 0.06600007
## 65    2.5  700 0.1485714 0.06855967
## 66    3.0  700 0.1752381 0.07058764
## 67    3.5  700 0.1819048 0.07611769
## 68    4.0  700 0.1819048 0.07611769
## 69    4.5  700 0.2023810 0.08824169
## 70    5.0  700 0.2023810 0.08824169
## 71    0.5  800 0.0947619 0.10214258
## 72    1.0  800 0.1085714 0.07999118
## 73    1.5  800 0.1147619 0.06327144
## 74    2.0  800 0.1280952 0.06600007
## 75    2.5  800 0.1485714 0.06855967
## 76    3.0  800 0.1752381 0.07058764
## 77    3.5  800 0.1819048 0.07611769
## 78    4.0  800 0.1819048 0.07611769
## 79    4.5  800 0.2023810 0.08824169
## 80    5.0  800 0.2023810 0.08824169
## 81    0.5  900 0.0947619 0.10214258
## 82    1.0  900 0.1085714 0.07999118
## 83    1.5  900 0.1147619 0.06327144
## 84    2.0  900 0.1280952 0.06600007
## 85    2.5  900 0.1485714 0.06855967
## 86    3.0  900 0.1752381 0.07058764
## 87    3.5  900 0.1819048 0.07611769
## 88    4.0  900 0.1819048 0.07611769
## 89    4.5  900 0.2023810 0.08824169
## 90    5.0  900 0.2023810 0.08824169
## 91    0.5 1000 0.0947619 0.10214258
## 92    1.0 1000 0.1085714 0.07999118
## 93    1.5 1000 0.1147619 0.06327144
## 94    2.0 1000 0.1280952 0.06600007
## 95    2.5 1000 0.1485714 0.06855967
## 96    3.0 1000 0.1752381 0.07058764
## 97    3.5 1000 0.1819048 0.07611769
## 98    4.0 1000 0.1819048 0.07611769
## 99    4.5 1000 0.2023810 0.08824169
## 100   5.0 1000 0.2023810 0.08824169
# The tuning output identifies the best (cost, gamma) as (400, 0.5), which attains the minimum CV error of about 0.095.
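
The winning combination can also be extracted from the tune object directly:

tobj$best.parameters   # data frame holding the best (gamma, cost)

tobj$best.performance  # the corresponding CV error rate

tobj$best.model        # an svm refit on all the data with the best parameters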

# For an overall comparison, one can produce a contour plot of the prediction errors over the search range of (cost, gamma) by:

plot(tobj, xlab = "gamma", ylab = "C")
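
The same error surface can also be drawn in 3D; type and theta are documented arguments of e1071’s plot method for tune objects:

plot(tobj, type = "perspective", theta = 120)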

Based on the tuning results, we now refit an SVM on all the training data; here we use (cost = 300, gamma = 0.5), whose CV error (about 0.109) is close to the optimum:

c1 <- svm(fish[, 2:7], fish[, 1], cost = 300, gamma = 0.5)

pred <- predict(c1, fish[, 2:7])

table(fish[, 1], pred)
##        pred
##         bream parki perch pike roach smelt white
##   bream    33     0     0    0     0     0     0
##   parki     0    10     0    0     0     0     0
##   perch     0     0    54    0     0     0     0
##   pike      0     0     0   16     0     0     0
##   roach     0     0     0    0    18     0     0
##   smelt     0     0     0    0     0    12     0
##   white     0     0     0    0     0     0     5
# Note that this classifier has a “zero apparent error rate”.

# More importantly, it should also be a better classifier for prediction: its estimated true error from the cross-validation above is about 0.109.
# Now we use this classifier to predict the test data:

test <- read.table("fish_test.data.txt", header = TRUE)

predict(c1, test)
##     1     2     3     4     5     6     7     8     9    10    11 
## parki bream perch perch  pike smelt smelt parki roach roach perch 
## Levels: bream parki perch pike roach smelt white
# Check whether these predictions agree with those made by other approaches.
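
As one such comparison, here is a minimal sketch using linear discriminant analysis from the MASS package (this assumes the test file carries the same predictor columns as fish):

library(MASS)

l <- lda(Species ~ ., data = fish)

predict(l, newdata = test)$class  # compare with the SVM predictions above
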
As a final check, the true error rate for a different parameter choice can be estimated by leave-one-out CV (cross = 148, the number of observations):

summary(svm(Species ~ ., data = fish, cost = 1800, gamma = 0.04, cross = 148))
## 
## Call:
## svm(formula = Species ~ ., data = fish, cost = 1800, gamma = 0.04, 
##     cross = 148)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  radial 
##        cost:  1800 
##       gamma:  0.04 
## 
## Number of Support Vectors:  56
## 
##  ( 7 5 12 3 4 5 20 )
## 
## 
## Number of Classes:  7 
## 
## Levels: 
##  bream parki perch pike roach smelt white
## 
## 148-fold cross-validation on training data:
## 
## Total Accuracy: 95.94595 
## Single Accuracies:
##  100 100 100 100 100 100 100 100 100 100 100 100 100 100 0 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 0 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 0 0 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 0 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 0 100 100 100 100
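
With cross = 148 every fold holds a single observation (leave-one-out), so each single accuracy is either 0 or 100, and the result no longer depends on the random fold assignment. A sketch that stores the model so the accuracy fields can be read off the object:

m <- svm(Species ~ ., data = fish, cost = 1800, gamma = 0.04, cross = 148)

1 - m$tot.accuracy / 100  # LOOCV estimate of the true error rate, about 0.041

sum(m$accuracies == 0)    # number of left-out observations misclassified (6 here)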