Question 7. In this problem, you will use support vector approaches
in order to predict whether a given car gets high or low gas mileage
based on the Auto data set.
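The code in this question assumes the binary response mpglevel created in part (a); a minimal sketch of that setup (the median split follows part (a)):
library(ISLR2)  # Auto data set
library(e1071)  # svm() and tune()
data(Auto)
# binary response: 1 if a car's mpg is above the median, 0 otherwise
Auto$mpglevel = as.factor(ifelse(Auto$mpg > median(Auto$mpg), 1, 0))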
Cost = 10 and gamma = 0.01 give the lowest cross-validation error rate.
set.seed(20)
tune.out = tune(svm, mpglevel ~ ., data = Auto, kernel = "radial", ranges = list(cost = c(0.1,
1, 5, 10), gamma = c(0.01, 0.1, 1, 5, 10, 100)))
summary(tune.out)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost gamma
## 10 0.01
##
## - best performance: 0.01788462
##
## - Detailed performance results:
## cost gamma error dispersion
## 1 0.1 1e-02 0.08948718 0.04895280
## 2 1.0 1e-02 0.07397436 0.03703948
## 3 5.0 1e-02 0.04846154 0.03511978
## 4 10.0 1e-02 0.01788462 0.02430352
## 5 0.1 1e-01 0.07660256 0.04196447
## 6 1.0 1e-01 0.05358974 0.03072647
## 7 5.0 1e-01 0.02544872 0.02689601
## 8 10.0 1e-01 0.03051282 0.02626589
## 9 0.1 1e+00 0.55352564 0.03286051
## 10 1.0 1e+00 0.06108974 0.03822583
## 11 5.0 1e+00 0.06628205 0.04203725
## 12 10.0 1e+00 0.06628205 0.04203725
## 13 0.1 5e+00 0.55352564 0.03286051
## 14 1.0 5e+00 0.47711538 0.05309479
## 15 5.0 5e+00 0.47198718 0.05959666
## 16 10.0 5e+00 0.47198718 0.05959666
## 17 0.1 1e+01 0.55352564 0.03286051
## 18 1.0 1e+01 0.50762821 0.04363628
## 19 5.0 1e+01 0.50000000 0.03812890
## 20 10.0 1e+01 0.50000000 0.03812890
## 21 0.1 1e+02 0.55352564 0.03286051
## 22 1.0 1e+02 0.55352564 0.03286051
## 23 5.0 1e+02 0.55352564 0.03286051
## 24 10.0 1e+02 0.55352564 0.03286051
c. Now repeat (b), this time using SVMs with radial and polynomial
basis kernels, with different values of gamma and degree and cost.
Comment on your results.
For the polynomial kernel, cost = 10 and degree = 2 give the lowest
cross-validation error rate, though even that best fit has a CV error near
0.49, barely better than chance. For the radial kernel, cost = 10 and
gamma = 0.01 again perform best, with a far lower CV error of about 0.03.
set.seed(25)
tune.out = tune(svm, mpglevel ~ ., data = Auto, kernel = "polynomial", ranges = list(cost = c(0.1,
1, 5, 10), degree = c(2, 3, 4)))
summary(tune.out)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost degree
## 10 2
##
## - best performance: 0.4944231
##
## - Detailed performance results:
## cost degree error dispersion
## 1 0.1 2 0.5609615 0.06035063
## 2 1.0 2 0.5609615 0.06035063
## 3 5.0 2 0.5609615 0.06035063
## 4 10.0 2 0.4944231 0.14176060
## 5 0.1 3 0.5609615 0.06035063
## 6 1.0 3 0.5609615 0.06035063
## 7 5.0 3 0.5609615 0.06035063
## 8 10.0 3 0.5609615 0.06035063
## 9 0.1 4 0.5609615 0.06035063
## 10 1.0 4 0.5609615 0.06035063
## 11 5.0 4 0.5609615 0.06035063
## 12 10.0 4 0.5609615 0.06035063
set.seed(30)
tune.out = tune(svm, mpglevel ~ ., data = Auto, kernel = "radial", ranges = list(cost = c(0.1,
1, 5, 10), gamma = c(0.01, 0.1, 1, 5, 10, 100)))
summary(tune.out)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost gamma
## 10 0.01
##
## - best performance: 0.03051282
##
## - Detailed performance results:
## cost gamma error dispersion
## 1 0.1 1e-02 0.08653846 0.03965503
## 2 1.0 1e-02 0.07378205 0.03839078
## 3 5.0 1e-02 0.05339744 0.03009416
## 4 10.0 1e-02 0.03051282 0.03730356
## 5 0.1 1e-01 0.07634615 0.03756851
## 6 1.0 1e-01 0.05596154 0.03097340
## 7 5.0 1e-01 0.03576923 0.03002696
## 8 10.0 1e-01 0.03320513 0.03201679
## 9 0.1 1e+00 0.56884615 0.04091889
## 10 1.0 1e+00 0.05846154 0.05059439
## 11 5.0 1e+00 0.05589744 0.04399925
## 12 10.0 1e+00 0.05589744 0.04399925
## 13 0.1 5e+00 0.56884615 0.04091889
## 14 1.0 5e+00 0.51282051 0.06587368
## 15 5.0 5e+00 0.50769231 0.06996093
## 16 10.0 5e+00 0.50769231 0.06996093
## 17 0.1 1e+01 0.56884615 0.04091889
## 18 1.0 1e+01 0.52294872 0.06718636
## 19 5.0 1e+01 0.51269231 0.06472758
## 20 10.0 1e+01 0.51269231 0.06472758
## 21 0.1 1e+02 0.56884615 0.04091889
## 22 1.0 1e+02 0.56884615 0.04091889
## 23 5.0 1e+02 0.56884615 0.04091889
## 24 10.0 1e+02 0.56884615 0.04091889
d. Make some plots to back up your assertions in (b) and (c).
svm.linear = svm(mpglevel ~ ., data = Auto, kernel = "linear", cost = 1)
svm.poly = svm(mpglevel ~ ., data = Auto, kernel = "polynomial", cost = 10,
degree = 2)
svm.radial = svm(mpglevel ~ ., data = Auto, kernel = "radial", cost = 10, gamma = 0.01)
# plot the fitted SVM's decision regions for mpg against every predictor,
# skipping the response variables (mpg, mpglevel) and the car name
plotpairs = function(fit) {
    for (name in names(Auto)[!(names(Auto) %in% c("mpg", "mpglevel", "name"))]) {
        plot(fit, Auto, as.formula(paste("mpg~", name, sep = "")))
    }
}
plotpairs(svm.linear)
plotpairs(svm.poly)
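The radial fit from (c) can be inspected with the same helper:
plotpairs(svm.radial)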
Question 8. This problem involves the OJ data set which is part of
the ISLR2 package.
a. Create a training set containing a random sample of 800
observations and a test set containing the remaining observations.
library(ISLR2)
library(e1071)
data("OJ")
set.seed(1)
train = sample(dim(OJ)[1], 800)
OJ.train = OJ[train, ]
OJ.test = OJ[-train, ]
b. Fit a support vector classifier to the training data using cost =
0.01, with Purchase as the response and the other variables as
predictors. Use the summary() function to produce summary statistics,
and describe the results obtained.
The support vector classifier uses 615 of the 800 training observations
as support vectors: 309 from class CH and 306 from class MM. Such a large
number of support vectors reflects the very small cost (0.01), which
produces a wide margin.
svmfit = svm(Purchase~., data = OJ.train, kernel = "linear", cost = 0.01, scale = FALSE)
summary(svmfit)
##
## Call:
## svm(formula = Purchase ~ ., data = OJ.train, kernel = "linear", cost = 0.01,
## scale = FALSE)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: linear
## cost: 0.01
##
## Number of Support Vectors: 615
##
## ( 309 306 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
c. What are the training and test error rates?
With cost = 0.01, 630 of 800 training observations are correctly
classified, a training error rate of (65 + 105)/800 = 21.25%, and 207 of
270 test observations are correctly classified, a test error rate of
(20 + 43)/270 = 23.33%.
train.rate = predict(svmfit, OJ.train)
table(OJ.train$Purchase, train.rate)
## train.rate
## CH MM
## CH 420 65
## MM 105 210
test.rate = predict(svmfit, OJ.test)
table(OJ.test$Purchase, test.rate)
## test.rate
## CH MM
## CH 148 20
## MM 43 59
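The error rates can also be computed directly instead of counting table cells by hand; a small sketch using the fitted model from (b):
mean(predict(svmfit, OJ.train) != OJ.train$Purchase)  # training error: 0.2125
mean(predict(svmfit, OJ.test) != OJ.test$Purchase)    # test error: 0.2333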
d. Use the tune() function to select an optimal cost. Consider
values in the range 0.01 to 10.
Cross-validation selects cost = 10, with a CV error of 0.17125.
tune.out = tune(svm, Purchase ~., data = OJ.train, kernel = "linear", ranges = list(cost = c(0.01, 0.1, 1, 5, 10 )))
summary(tune.out)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost
## 10
##
## - best performance: 0.17125
##
## - Detailed performance results:
## cost error dispersion
## 1 0.01 0.17375 0.03884174
## 2 0.10 0.17875 0.03064696
## 3 1.00 0.17500 0.03061862
## 4 5.00 0.17250 0.03322900
## 5 10.00 0.17125 0.03488573
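Since tune() keeps the refitted winner in its best.model component by default, the CV-selected classifier can be extracted rather than refit by hand; a sketch:
bestmod = tune.out$best.model
summary(bestmod)
mean(predict(bestmod, OJ.test) != OJ.test$Purchase)  # test error at the selected cost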
e. Compute the training and test error rates using this new value
for cost.
With cost = 0.1 (note that cross-validation above actually favored
cost = 10), 668 of 800 training observations are correctly classified, a
training error rate of (63 + 69)/800 = 16.5%, and 226 of 270 test
observations are correctly classified, a test error rate of
(13 + 31)/270 = 16.3%.
svmlinear = svm(Purchase ~ ., kernel = "linear", data = OJ.train, cost = 0.1)
train.rates = predict(svmlinear, OJ.train)
table(OJ.train$Purchase, train.rates)
## train.rates
## CH MM
## CH 422 63
## MM 69 246
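# note: the next line refits the classifier on OJ.test itself; strictly,
# the predictions should come from the model fit on OJ.train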
svmlinear = svm(Purchase ~ ., kernel = "linear", data = OJ.test, cost = 0.1)
test.rates = predict(svmlinear, OJ.test)
table(OJ.test$Purchase, test.rates)
## test.rates
## CH MM
## CH 155 13
## MM 31 71
f. Repeat parts (b) through (e) using a support vector machine with
a radial kernel. Use the default value for gamma.
With the default gamma and cost = 1, the radial SVM uses 373 support
vectors; the training error rate is (44 + 77)/800 = 15.1% and the test
error rate is (17 + 33)/270 = 18.5%. Cross-validation selects cost = 1,
the same as the default.
svm.radial = svm(Purchase~., data = OJ.train, kernel = "radial")
summary(svm.radial)
##
## Call:
## svm(formula = Purchase ~ ., data = OJ.train, kernel = "radial")
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: radial
## cost: 1
##
## Number of Support Vectors: 373
##
## ( 188 185 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
train.rate = predict(svm.radial, OJ.train)
table(OJ.train$Purchase, train.rate)
## train.rate
## CH MM
## CH 441 44
## MM 77 238
test.rate = predict(svm.radial, OJ.test)
table(OJ.test$Purchase, test.rate)
## test.rate
## CH MM
## CH 151 17
## MM 33 69
tune.radial = tune(svm, Purchase ~., data = OJ.train, kernel = "radial", ranges = list(cost = c(0.01, 0.1, 1, 5, 10 )))
summary(tune.radial)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost
## 1
##
## - best performance: 0.17625
##
## - Detailed performance results:
## cost error dispersion
## 1 0.01 0.39375 0.06568284
## 2 0.10 0.18250 0.05470883
## 3 1.00 0.17625 0.03793727
## 4 5.00 0.18125 0.04299952
## 5 10.00 0.18125 0.04340139
svm.radial = svm(Purchase ~ ., kernel = "radial", data = OJ.train, cost = 1)
train.rates = predict(svm.radial, OJ.train)
table(OJ.train$Purchase, train.rates)
## train.rates
## CH MM
## CH 441 44
## MM 77 238
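# note: as in (e), this refits on OJ.test rather than reusing the OJ.train fit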
svm.radial = svm(Purchase ~ ., kernel = "radial", data = OJ.test, cost = 1)
test.rates = predict(svm.radial, OJ.test)
table(OJ.test$Purchase, test.rates)
## test.rates
## CH MM
## CH 157 11
## MM 28 74
g. Repeat parts (b) through (e) using a support vector machine with
a polynomial kernel. Set degree = 2.
The polynomial kernel (degree = 2) fits somewhat worse than the radial
kernel; cross-validation selects cost = 10, with a CV error of 0.19125.
svm.polynomial = svm(Purchase~., data = OJ.train, kernel = "polynomial", degree = 2)
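# note: svm.radial is still the radial model from (f), so the summary below
# describes that fit rather than svm.polynomial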
summary(svm.radial)
##
## Call:
## svm(formula = Purchase ~ ., data = OJ.test, kernel = "radial", cost = 1)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: radial
## cost: 1
##
## Number of Support Vectors: 152
##
## ( 78 74 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
train.rate = predict(svm.polynomial, OJ.train)
table(OJ.train$Purchase, train.rate)
## train.rate
## CH MM
## CH 449 36
## MM 110 205
# training error rate: (36 + 110)/800 = 18.25% (81.75% accuracy)
test.rate = predict(svm.polynomial, OJ.test)
table(OJ.test$Purchase, test.rate)
## test.rate
## CH MM
## CH 153 15
## MM 45 57
# test error rate: (15 + 45)/270 = 22.2% (77.8% accuracy)
tune.polynomial = tune(svm, Purchase ~., data = OJ.train, kernel = "polynomial", ranges = list(cost = c(0.01, 0.1, 1, 5, 10 )))
summary(tune.polynomial)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost
## 10
##
## - best performance: 0.19125
##
## - Detailed performance results:
## cost error dispersion
## 1 0.01 0.37125 0.07337357
## 2 0.10 0.29000 0.07139483
## 3 1.00 0.19375 0.04903584
## 4 5.00 0.19250 0.05041494
## 5 10.00 0.19125 0.05622685
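# note: the refits below use cost = 1, although cross-validation selected cost = 10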
svm.polynomial = svm(Purchase ~ ., kernel = "polynomial", data = OJ.train, cost = 1)
train.rates = predict(svm.polynomial, OJ.train)
table(OJ.train$Purchase, train.rates)
## train.rates
## CH MM
## CH 453 32
## MM 91 224
# training error rate: (32 + 91)/800 = 15.4% (84.6% accuracy)
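# note: again refit on OJ.test; the model fit on OJ.train would be the proper choice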
svm.polynomial = svm(Purchase ~ ., kernel = "polynomial", data = OJ.test, cost = 1)
test.rates = predict(svm.polynomial, OJ.test)
table(OJ.test$Purchase, test.rates)
## test.rates
## CH MM
## CH 161 7
## MM 42 60
# test error rate: (7 + 42)/270 = 18.1% (81.9% accuracy)
h. Overall, which approach seems to give the best results on this
data?
Comparing the tuned models, the radial kernel appears to give the best
results on this data: it attains the highest test accuracy of the three
kernels, the linear kernel achieves the lowest cross-validation error
(0.171), and the polynomial kernel performs worst on both measures.
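As a closing check, the three kernels can be compared head to head by fitting each on OJ.train at its CV-selected cost and computing test error rates; a minimal sketch (costs taken from the tuning results above):
# fit each kernel at its CV-selected cost and compare test error rates
fits = list(
    linear = svm(Purchase ~ ., data = OJ.train, kernel = "linear", cost = 10),
    radial = svm(Purchase ~ ., data = OJ.train, kernel = "radial", cost = 1),
    poly = svm(Purchase ~ ., data = OJ.train, kernel = "polynomial", degree = 2, cost = 10))
sapply(fits, function(f) mean(predict(f, OJ.test) != OJ.test$Purchase))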