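library(e1071)  # provides svm()
data(PlantGrowth)
x <- PlantGrowth$weight  # predictor
y <- PlantGrowth$group   # class label
# Fit an SVM classifier with the defaults (radial kernel, cost = 1)
model_svm <- svm(group ~ ., data = PlantGrowth)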
summary(model_svm)
##
## Call:
## svm(formula = group ~ ., data = PlantGrowth)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: radial
## cost: 1
##
## Number of Support Vectors: 29
##
## ( 10 9 10 )
##
##
## Number of Classes: 3
##
## Levels:
## ctrl trt1 trt2
There are 29 support vectors, so 29 of the 30 observations in PlantGrowth (about 97%) lie on or inside the margin and influence where the decision boundaries go. There are 3 classes, so the model fits a boundary for each pair of classes rather than a single hyperplane. Nearly every data point acts as a support vector, which suggests the groups are hard to separate on weight alone.
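The support-vector counts can also be read directly from the fitted object instead of the printed summary. This is a minimal sketch, assuming the e1071 package is installed; `sv_share` and `non_sv_rows` are names introduced here for illustration.

```r
library(e1071)

data(PlantGrowth)
model_svm <- svm(group ~ ., data = PlantGrowth)

# tot.nSV is the total number of support vectors; nSV splits it by class
sv_share <- model_svm$tot.nSV / nrow(PlantGrowth)
sv_share
model_svm$nSV

# index holds the row numbers of the support vectors, so the complement
# lists the observations that do not influence the boundaries
non_sv_rows <- setdiff(seq_len(nrow(PlantGrowth)), model_svm$index)
PlantGrowth[non_sv_rows, ]
```

Printing the rows that are not support vectors shows which observations sit comfortably away from every class boundary.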
data(iris)
x <- iris[1:4]
y <- iris[5]
iris_svm <- svm(Species ~ ., data = iris)
summary(iris_svm)
##
## Call:
## svm(formula = Species ~ ., data = iris)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: radial
## cost: 1
##
## Number of Support Vectors: 51
##
## ( 8 22 21 )
##
##
## Number of Classes: 3
##
## Levels:
## setosa versicolor virginica
There are 51 support vectors here out of 150 data points (34%), a much smaller share than the 29 of 30 in PlantGrowth. Setosa contributes only 8 of them, compared with 22 for versicolor and 21 for virginica. This model also has 3 classes.
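To put the two datasets side by side, the share of rows that become support vectors can be computed with one small helper. This is a sketch under the same default settings as above; `sv_fraction` is a hypothetical helper name, not part of e1071.

```r
library(e1071)

# Hypothetical helper: fit a default SVM and return the fraction of
# rows that end up as support vectors
sv_fraction <- function(formula, data) {
  fit <- svm(formula, data = data)
  fit$tot.nSV / nrow(data)
}

shares <- c(
  PlantGrowth = sv_fraction(group ~ ., PlantGrowth),
  iris        = sv_fraction(Species ~ ., iris)
)
round(shares, 3)
```

A lower share generally means more points lie well outside the margin, which fits the four iris measurements separating the species more cleanly than weight separates the PlantGrowth groups.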
set.seed(123)
samples <- sample(nrow(iris), nrow(iris)*0.80)
train <- iris[samples,]
test <- iris[-samples,]
iris_svm_train <- svm(Species ~., data = train)
summary(iris_svm_train)
##
## Call:
## svm(formula = Species ~ ., data = train)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: radial
## cost: 1
##
## Number of Support Vectors: 46
##
## ( 8 19 19 )
##
##
## Number of Classes: 3
##
## Levels:
## setosa versicolor virginica
pred_test <- predict(iris_svm_train,test)
table(pred_test,test$Species)
##
## pred_test    setosa versicolor virginica
##   setosa         10          0         0
##   versicolor      0         14         0
##   virginica       0          1         5
# Overall accuracy: share of test rows whose predicted class matches the true class
accuracy <- sum(pred_test == test$Species)/length(test$Species)
print(paste("The accuracy is",accuracy))
## [1] "The accuracy is 0.966666666666667"
Here we see that 1 of the 30 test observations (3.3%) was misclassified: one versicolor flower was predicted as virginica. The model fitted on the 120 training rows had 46 support vectors.
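The single number printed above is the overall accuracy. A true positive rate is defined per class, and both can be derived from the confusion matrix. The sketch below re-enters the counts from the test-set table by hand so it runs on its own; `cm`, `accuracy`, and `recall` are names chosen for this example.

```r
# Confusion matrix copied from the test-set output above
# (rows = predicted class, columns = actual class)
classes <- c("setosa", "versicolor", "virginica")
cm <- matrix(
  c(10,  0, 0,
     0, 14, 0,
     0,  1, 5),
  nrow = 3, byrow = TRUE,
  dimnames = list(predicted = classes, actual = classes)
)

# Overall accuracy: correct predictions over all predictions (29/30)
accuracy <- sum(diag(cm)) / sum(cm)

# Per-class true positive rate (recall): correct predictions for a class
# divided by the number of test rows that truly belong to it
recall <- diag(cm) / colSums(cm)

accuracy
recall
```

Versicolor is the only class with a true positive rate below 1 (14 of 15), because its one misclassified flower was assigned to virginica.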