We have seen that we can fit an SVM with a non-linear kernel in order to perform classification using a non-linear decision boundary. We will now see that we can also obtain a non-linear decision boundary by performing logistic regression using non-linear transformations of the features.
set.seed(12)
x1 = runif(500) - 0.5
x2 = runif(500) - 0.5
y = 1 * (x1^2 - x2^2 > 0)  # class 1 when |x1| > |x2|: a quadratic decision boundary
plot(x1[y==0], x2[y==0], col="red", xlab="X1", ylab="X2", pch=18)
points(x1[y==1], x2[y==1], col="blue", pch=16)
dat = data.frame(x1 = x1, x2 = x2, y = as.factor(y))
glm.fit = glm(y~., data = dat, family = "binomial")
summary(glm.fit)
##
## Call:
## glm(formula = y ~ ., family = "binomial", data = dat)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.350 -1.165 1.050 1.151 1.291
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.04927 0.08978 0.549 0.583
## x1 -0.23002 0.31534 -0.729 0.466
## x2 0.51072 0.31560 1.618 0.106
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 692.86 on 499 degrees of freedom
## Residual deviance: 689.58 on 497 degrees of freedom
## AIC: 695.58
##
## Number of Fisher Scoring iterations: 3
glm.prob = predict(glm.fit, newdata = dat, type = "response")
glm.pred = ifelse(glm.prob >= 0.5, 1, 0)
data.positive = dat[glm.pred == 1,]
data.negative = dat[glm.pred == 0,]
plot(data.positive$x1, data.positive$x2, col="red", xlab="X1", ylab="X2", pch=18)
points(data.negative$x1, data.negative$x2, col="blue", pch=16)
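To quantify how little this linear fit captures, we can compute its training error rate (a check added here, not part of the original output):
mean(glm.pred != y)  # expect an error rate near 0.5: the linear boundary captures almost nothing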
The logistic model fitted below uses non-linear transformations of the features: $\log\left(\frac{p(y=1)}{1-p(y=1)}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \beta_3 \log(x_2) + \beta_4 x_1 x_2$. In the resulting summary, the quadratic term in $x_1$ and $\log(x_2)$ are the significant predictors of $y$. Note that $\log(x_2)$ is undefined for $x_2 \le 0$, which is why roughly half of the observations are dropped from this fit.
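A quick check of how many points fall in the undefined region (this line is not in the original):
sum(dat$x2 <= 0)  # log(x2) is NaN for these rows; compare with the "252 observations deleted" note below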
glm.fit2 = glm(y~poly(x1,2) + log(x2) + I(x1*x2), data=dat, family = "binomial")
summary(glm.fit2)
##
## Call:
## glm(formula = y ~ poly(x1, 2) + log(x2) + I(x1 * x2), family = "binomial",
## data = dat)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -3.02566 -0.12847 0.00022 0.14337 1.60700
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -8.817 1.504 -5.864 4.53e-09 ***
## poly(x1, 2)1 -10.434 20.032 -0.521 0.602
## poly(x1, 2)2 90.385 14.996 6.027 1.67e-09 ***
## log(x2) -6.237 1.025 -6.085 1.16e-09 ***
## I(x1 * x2) 7.625 9.572 0.797 0.426
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 341.847 on 247 degrees of freedom
## Residual deviance: 90.027 on 243 degrees of freedom
## (252 observations deleted due to missingness)
## AIC: 100.03
##
## Number of Fisher Scoring iterations: 8
The decision boundary is now completely different from the one above: the predicted classes in the plot below cannot be separated by any linear boundary.
glm.probs2 = predict(glm.fit2, newdata = dat, type = "response")  # NaN wherever x2 <= 0
glm.pred2 = ifelse(glm.probs2 >= 0.5, 1, 0)
data.positive2 = dat[glm.pred2 == 1,]
data.negative2 = dat[glm.pred2 == 0,]
plot(data.positive2$x1, data.positive2$x2, col="red", xlab="X1", ylab="X2", pch=18)
points(data.negative2$x1, data.negative2$x2, col="blue", pch=16)
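On the rows where $\log(x_2)$ is defined, the non-linear fit separates the classes well; a hedged check (not in the original output), dropping the NA predictions:
mean(glm.pred2 != y, na.rm = TRUE)  # training error on the complete cases only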
A support vector classifier with a linear kernel, by contrast, can only produce a linear boundary, so it misclassifies many observations: the predicted classes blur together toward the middle of the plot.
library(e1071)
svm.fit = svm(as.factor(y) ~ x1 + x2, data = dat, kernel = "linear", cost = 0.1)
svm.pred = predict(svm.fit, dat)
svm.positive = dat[svm.pred == 1,]
svm.negative = dat[svm.pred == 0,]
plot(svm.positive$x1, svm.positive$x2, col="red", xlab="X1", ylab="X2", pch=18)
points(svm.negative$x1, svm.negative$x2, col="blue", pch=16)
svm.fit2 = svm(as.factor(y) ~ x1 + x2, data = dat, kernel = "radial", gamma = 1, cost = 1)
svm.pred2 = predict(svm.fit2, dat)
svm.positive2 = dat[svm.pred2 == 1,]
svm.negative2 = dat[svm.pred2 == 0,]
plot(svm.positive2$x1, svm.positive2$x2, col="red", xlab="X1", ylab="X2", pch=18)
points(svm.negative2$x1, svm.negative2$x2, col="blue", pch=16)
Of the approaches tried, the radial-kernel SVM recovers the quadratic class boundary best: the non-linear SVM is the clear winner.
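A compact way to back this up is to compare training error rates side by side (a sketch added here; NA rows are dropped for the logistic fit that uses $\log(x_2)$):
mean(glm.pred != y)                 # logistic regression, linear features
mean(glm.pred2 != y, na.rm = TRUE)  # logistic regression, non-linear features
mean(svm.pred != dat$y)             # support vector classifier, linear kernel
mean(svm.pred2 != dat$y)            # SVM, radial kernel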
In this problem, you will use support vector approaches in order to predict whether a given car gets high or low gas mileage based on the Auto data set.
library(ISLR)  # provides the Auto and OJ data sets used below
data(Auto)
summary(Auto)
## mpg cylinders displacement horsepower weight
## Min. : 9.00 Min. :3.000 Min. : 68.0 Min. : 46.0 Min. :1613
## 1st Qu.:17.00 1st Qu.:4.000 1st Qu.:105.0 1st Qu.: 75.0 1st Qu.:2225
## Median :22.75 Median :4.000 Median :151.0 Median : 93.5 Median :2804
## Mean :23.45 Mean :5.472 Mean :194.4 Mean :104.5 Mean :2978
## 3rd Qu.:29.00 3rd Qu.:8.000 3rd Qu.:275.8 3rd Qu.:126.0 3rd Qu.:3615
## Max. :46.60 Max. :8.000 Max. :455.0 Max. :230.0 Max. :5140
##
## acceleration year origin name
## Min. : 8.00 Min. :70.00 Min. :1.000 amc matador : 5
## 1st Qu.:13.78 1st Qu.:73.00 1st Qu.:1.000 ford pinto : 5
## Median :15.50 Median :76.00 Median :1.000 toyota corolla : 5
## Mean :15.54 Mean :75.98 Mean :1.577 amc gremlin : 4
## 3rd Qu.:17.02 3rd Qu.:79.00 3rd Qu.:2.000 amc hornet : 4
## Max. :24.80 Max. :82.00 Max. :3.000 chevrolet chevette: 4
## (Other) :365
gas.median = median(Auto$mpg)
gas.class = ifelse(Auto$mpg > gas.median, 1, 0)  # 1 = above-median gas mileage
Auto$mpglevel = as.factor(gas.class)
str(Auto$mpglevel)
## Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
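Splitting at the median should produce roughly balanced classes; a quick sanity check (not part of the original output):
table(Auto$mpglevel)  # expect the two classes to have nearly equal counts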
set.seed(10)
tune.out = tune(svm, mpglevel ~ ., data = Auto, kernel = "linear", ranges = list(cost = c(0.001, 0.01, 0.1, 1, 5, 10, 100)))
summary(tune.out)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost
## 100
##
## - best performance: 0.01262821
##
## - Detailed performance results:
## cost error dispersion
## 1 1e-03 0.55365385 0.03306661
## 2 1e-02 0.55365385 0.03306661
## 3 1e-01 0.09942308 0.04714670
## 4 1e+00 0.07897436 0.03260883
## 5 5e+00 0.06878205 0.03175943
## 6 1e+01 0.05352564 0.03055668
## 7 1e+02 0.01262821 0.02437031
tune.out$best.parameters
## cost
## 7 100
best.svmLinear = tune.out$best.model
summary(best.svmLinear)
##
## Call:
## best.tune(METHOD = svm, train.x = mpglevel ~ ., data = Auto, ranges = list(cost = c(0.001,
## 0.01, 0.1, 1, 5, 10, 100)), kernal = "linear")
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: radial
## cost: 100
##
## Number of Support Vectors: 63
##
## ( 30 33 )
##
##
## Number of Classes: 2
##
## Levels:
## 0 1
(c) Now repeat (b), this time using SVMs with radial and polynomial basis kernels, with different values of gamma, degree, and cost. Comment on your results.
set.seed(123)
tune.out.rad = tune(svm, mpglevel ~ ., data = Auto, kernel = "radial", ranges = list(cost = c(0.001, 0.01, 0.1, 1, 5, 10, 100), gamma = c(0.001, 0.01, 0.1, 1, 5, 10, 100)))
summary(tune.out.rad)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost gamma
## 100 0.01
##
## - best performance: 0.01025641
##
## - Detailed performance results:
## cost gamma error dispersion
## 1 1e-03 1e-03 0.58173077 0.04740051
## 2 1e-02 1e-03 0.58173077 0.04740051
## 3 1e-01 1e-03 0.56891026 0.06627739
## 4 1e+00 1e-03 0.09173077 0.03990003
## 5 5e+00 1e-03 0.07634615 0.03928191
## 6 1e+01 1e-03 0.07121795 0.04410874
## 7 1e+02 1e-03 0.02288462 0.01427008
## 8 1e-03 1e-02 0.58173077 0.04740051
## 9 1e-02 1e-02 0.58173077 0.04740051
## 10 1e-01 1e-02 0.08916667 0.04345384
## 11 1e+00 1e-02 0.07378205 0.04185248
## 12 5e+00 1e-02 0.04589744 0.03136327
## 13 1e+01 1e-02 0.02032051 0.02305327
## 14 1e+02 1e-02 0.01025641 0.01792836
## 15 1e-03 1e-01 0.58173077 0.04740051
## 16 1e-02 1e-01 0.21391026 0.09431095
## 17 1e-01 1e-01 0.07634615 0.03928191
## 18 1e+00 1e-01 0.05852564 0.03960325
## 19 5e+00 1e-01 0.03057692 0.02611396
## 20 1e+01 1e-01 0.03314103 0.02942215
## 21 1e+02 1e-01 0.03326923 0.02434857
## 22 1e-03 1e+00 0.58173077 0.04740051
## 23 1e-02 1e+00 0.58173077 0.04740051
## 24 1e-01 1e+00 0.58173077 0.04740051
## 25 1e+00 1e+00 0.05865385 0.04942437
## 26 5e+00 1e+00 0.05608974 0.04595880
## 27 1e+01 1e+00 0.05608974 0.04595880
## 28 1e+02 1e+00 0.05608974 0.04595880
## 29 1e-03 5e+00 0.58173077 0.04740051
## 30 1e-02 5e+00 0.58173077 0.04740051
## 31 1e-01 5e+00 0.58173077 0.04740051
## 32 1e+00 5e+00 0.51544872 0.06790600
## 33 5e+00 5e+00 0.51544872 0.06790600
## 34 1e+01 5e+00 0.51544872 0.06790600
## 35 1e+02 5e+00 0.51544872 0.06790600
## 36 1e-03 1e+01 0.58173077 0.04740051
## 37 1e-02 1e+01 0.58173077 0.04740051
## 38 1e-01 1e+01 0.58173077 0.04740051
## 39 1e+00 1e+01 0.54602564 0.06355090
## 40 5e+00 1e+01 0.54102564 0.06959451
## 41 1e+01 1e+01 0.54102564 0.06959451
## 42 1e+02 1e+01 0.54102564 0.06959451
## 43 1e-03 1e+02 0.58173077 0.04740051
## 44 1e-02 1e+02 0.58173077 0.04740051
## 45 1e-01 1e+02 0.58173077 0.04740051
## 46 1e+00 1e+02 0.58173077 0.04740051
## 47 5e+00 1e+02 0.58173077 0.04740051
## 48 1e+01 1e+02 0.58173077 0.04740051
## 49 1e+02 1e+02 0.58173077 0.04740051
tune.out.rad$best.performance
## [1] 0.01025641
best.rad.model = tune.out.rad$best.model
summary(best.rad.model)
##
## Call:
## best.tune(METHOD = svm, train.x = mpglevel ~ ., data = Auto, ranges = list(cost = c(0.001,
## 0.01, 0.1, 1, 5, 10, 100), gamma = c(0.001, 0.01, 0.1, 1, 5,
## 10, 100)), kernal = "radial")
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: radial
## cost: 100
##
## Number of Support Vectors: 57
##
## ( 27 30 )
##
##
## Number of Classes: 2
##
## Levels:
## 0 1
set.seed(123)
tune.out.poly = tune(svm, mpglevel ~ ., data = Auto, kernel = "polynomial", ranges = list(cost = c(0.001, 0.01, 0.1, 1, 5, 10, 100), degree = c(2, 3, 4, 5)))
summary(tune.out.poly)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost degree
## 100 2
##
## - best performance: 0.01282051
##
## - Detailed performance results:
## cost degree error dispersion
## 1 1e-03 2 0.58173077 0.04740051
## 2 1e-02 2 0.58173077 0.04740051
## 3 1e-01 2 0.10692308 0.05900981
## 4 1e+00 2 0.07891026 0.03828837
## 5 5e+00 2 0.06608974 0.04785032
## 6 1e+01 2 0.05602564 0.03551922
## 7 1e+02 2 0.01282051 0.01813094
## 8 1e-03 3 0.58173077 0.04740051
## 9 1e-02 3 0.58173077 0.04740051
## 10 1e-01 3 0.10692308 0.05900981
## 11 1e+00 3 0.07891026 0.03828837
## 12 5e+00 3 0.06608974 0.04785032
## 13 1e+01 3 0.05602564 0.03551922
## 14 1e+02 3 0.01282051 0.01813094
## 15 1e-03 4 0.58173077 0.04740051
## 16 1e-02 4 0.58173077 0.04740051
## 17 1e-01 4 0.10692308 0.05900981
## 18 1e+00 4 0.07891026 0.03828837
## 19 5e+00 4 0.06608974 0.04785032
## 20 1e+01 4 0.05602564 0.03551922
## 21 1e+02 4 0.01282051 0.01813094
## 22 1e-03 5 0.58173077 0.04740051
## 23 1e-02 5 0.58173077 0.04740051
## 24 1e-01 5 0.10692308 0.05900981
## 25 1e+00 5 0.07891026 0.03828837
## 26 5e+00 5 0.06608974 0.04785032
## 27 1e+01 5 0.05602564 0.03551922
## 28 1e+02 5 0.01282051 0.01813094
tune.out.poly$best.performance
## [1] 0.01282051
best.poly.model = tune.out.poly$best.model
summary(best.poly.model)
##
## Call:
## best.tune(METHOD = svm, train.x = mpglevel ~ ., data = Auto, ranges = list(cost = c(0.001,
## 0.01, 0.1, 1, 5, 10, 100), degree = c(2, 3, 4, 5)), kernal = "polynomial")
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: radial
## cost: 100
##
## Number of Support Vectors: 63
##
## ( 30 33 )
##
##
## Number of Classes: 2
##
## Levels:
## 0 1
svm.linear = svm(mpglevel ~ ., data = Auto, kernel = "linear", cost = 100)
svm.rad = svm(mpglevel ~ ., data = Auto, kernel = "radial", cost = 100, gamma = 0.01)
svm.poly = svm(mpglevel ~ ., data = Auto, kernel = "polynomial", cost = 100, degree = 2)
# plot the fitted SVM against mpg paired with every predictor except mpg, mpglevel, and name
plotpairs = function(autofit) {
  for (name in names(Auto)[!(names(Auto) %in% c("mpg", "mpglevel", "name"))]) {
    plot(autofit, Auto, as.formula(paste("mpg~", name, sep = "")))
  }
}
plotpairs(svm.linear)
plotpairs(svm.rad)
plotpairs(svm.poly)
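For a single pair of predictors the underlying call looks like this (an illustrative example using the radial fit above, mirroring what plotpairs does internally):
plot(svm.rad, Auto, mpg ~ weight)  # fitted class regions over the mpg/weight plane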
This problem involves the OJ data set which is part of the ISLR package.
data(OJ)
summary(OJ)
## Purchase WeekofPurchase StoreID PriceCH PriceMM
## CH:653 Min. :227.0 Min. :1.00 Min. :1.690 Min. :1.690
## MM:417 1st Qu.:240.0 1st Qu.:2.00 1st Qu.:1.790 1st Qu.:1.990
## Median :257.0 Median :3.00 Median :1.860 Median :2.090
## Mean :254.4 Mean :3.96 Mean :1.867 Mean :2.085
## 3rd Qu.:268.0 3rd Qu.:7.00 3rd Qu.:1.990 3rd Qu.:2.180
## Max. :278.0 Max. :7.00 Max. :2.090 Max. :2.290
## DiscCH DiscMM SpecialCH SpecialMM
## Min. :0.00000 Min. :0.0000 Min. :0.0000 Min. :0.0000
## 1st Qu.:0.00000 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.0000
## Median :0.00000 Median :0.0000 Median :0.0000 Median :0.0000
## Mean :0.05186 Mean :0.1234 Mean :0.1477 Mean :0.1617
## 3rd Qu.:0.00000 3rd Qu.:0.2300 3rd Qu.:0.0000 3rd Qu.:0.0000
## Max. :0.50000 Max. :0.8000 Max. :1.0000 Max. :1.0000
## LoyalCH SalePriceMM SalePriceCH PriceDiff Store7
## Min. :0.000011 Min. :1.190 Min. :1.390 Min. :-0.6700 No :714
## 1st Qu.:0.325257 1st Qu.:1.690 1st Qu.:1.750 1st Qu.: 0.0000 Yes:356
## Median :0.600000 Median :2.090 Median :1.860 Median : 0.2300
## Mean :0.565782 Mean :1.962 Mean :1.816 Mean : 0.1465
## 3rd Qu.:0.850873 3rd Qu.:2.130 3rd Qu.:1.890 3rd Qu.: 0.3200
## Max. :0.999947 Max. :2.290 Max. :2.090 Max. : 0.6400
## PctDiscMM PctDiscCH ListPriceDiff STORE
## Min. :0.0000 Min. :0.00000 Min. :0.000 Min. :0.000
## 1st Qu.:0.0000 1st Qu.:0.00000 1st Qu.:0.140 1st Qu.:0.000
## Median :0.0000 Median :0.00000 Median :0.240 Median :2.000
## Mean :0.0593 Mean :0.02731 Mean :0.218 Mean :1.631
## 3rd Qu.:0.1127 3rd Qu.:0.00000 3rd Qu.:0.300 3rd Qu.:3.000
## Max. :0.4020 Max. :0.25269 Max. :0.440 Max. :4.000
str(OJ)
## 'data.frame': 1070 obs. of 18 variables:
## $ Purchase : Factor w/ 2 levels "CH","MM": 1 1 1 2 1 1 1 1 1 1 ...
## $ WeekofPurchase: num 237 239 245 227 228 230 232 234 235 238 ...
## $ StoreID : num 1 1 1 1 7 7 7 7 7 7 ...
## $ PriceCH : num 1.75 1.75 1.86 1.69 1.69 1.69 1.69 1.75 1.75 1.75 ...
## $ PriceMM : num 1.99 1.99 2.09 1.69 1.69 1.99 1.99 1.99 1.99 1.99 ...
## $ DiscCH : num 0 0 0.17 0 0 0 0 0 0 0 ...
## $ DiscMM : num 0 0.3 0 0 0 0 0.4 0.4 0.4 0.4 ...
## $ SpecialCH : num 0 0 0 0 0 0 1 1 0 0 ...
## $ SpecialMM : num 0 1 0 0 0 1 1 0 0 0 ...
## $ LoyalCH : num 0.5 0.6 0.68 0.4 0.957 ...
## $ SalePriceMM : num 1.99 1.69 2.09 1.69 1.69 1.99 1.59 1.59 1.59 1.59 ...
## $ SalePriceCH : num 1.75 1.75 1.69 1.69 1.69 1.69 1.69 1.75 1.75 1.75 ...
## $ PriceDiff : num 0.24 -0.06 0.4 0 0 0.3 -0.1 -0.16 -0.16 -0.16 ...
## $ Store7 : Factor w/ 2 levels "No","Yes": 1 1 1 1 2 2 2 2 2 2 ...
## $ PctDiscMM : num 0 0.151 0 0 0 ...
## $ PctDiscCH : num 0 0 0.0914 0 0 ...
## $ ListPriceDiff : num 0.24 0.24 0.23 0 0 0.3 0.3 0.24 0.24 0.24 ...
## $ STORE : num 1 1 1 1 0 0 0 0 0 0 ...
library(caret)  # provides createDataPartition
set.seed(1234)
oj.intrain <- createDataPartition(OJ$Purchase, p = 0.746, list = FALSE)  # ~800 of the 1070 rows
oj.train <- OJ[oj.intrain,]
oj.test <- OJ[-oj.intrain,]
dim(oj.train)
## [1] 800 18
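The remaining observations form the test set; a quick check added here (not in the original output):
dim(oj.test)  # expect 270 rows and the same 18 columns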
oj.svm <- svm(Purchase~., data = oj.train, kernel = "linear", cost = 0.01)
summary(oj.svm)
##
## Call:
## svm(formula = Purchase ~ ., data = oj.train, kernel = "linear", cost = 0.01)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: linear
## cost: 0.01
##
## Number of Support Vectors: 439
##
## ( 221 218 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
The training error rate is 17.63%.
oj.train.pred <- predict(oj.svm, oj.train)
table(oj.train$Purchase, oj.train.pred)
## oj.train.pred
## CH MM
## CH 430 58
## MM 83 229
(83+58)/800
## [1] 0.17625
The test error rate is 14.44%.
oj.test.pred <- predict(oj.svm, oj.test)
table(oj.test$Purchase, oj.test.pred)
## oj.test.pred
## CH MM
## CH 147 18
## MM 21 84
(21+18)/270
## [1] 0.1444444
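Since the same train/test bookkeeping repeats for every kernel below, a small helper keeps it consistent. This is a sketch added here; err_rate is a name introduced for illustration, not part of the original analysis:
# misclassification rate of a fitted SVM on a data frame with a Purchase column
err_rate = function(fit, data) mean(predict(fit, data) != data$Purchase)
err_rate(oj.svm, oj.train)  # 0.17625, matching the table above
err_rate(oj.svm, oj.test)   # 0.1444444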
set.seed(1234)
oj.tune.out = tune(svm, Purchase ~., data = oj.train, kernel = "linear", ranges = list(cost=c(0.001, 0.01, 0.1, 1, 5, 10)))
summary(oj.tune.out)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost
## 0.1
##
## - best performance: 0.17625
##
## - Detailed performance results:
## cost error dispersion
## 1 1e-03 0.33750 0.07264832
## 2 1e-02 0.18000 0.04937104
## 3 1e-01 0.17625 0.05816941
## 4 1e+00 0.18000 0.05986095
## 5 5e+00 0.17625 0.05478810
## 6 1e+01 0.17750 0.05583955
oj.tune.out$best.parameters
## cost
## 3 0.1
oj.new.svm = svm(Purchase ~., kernel = "linear", data = oj.train, cost = oj.tune.out$best.parameters$cost)
oj.new.train.pred = predict(oj.new.svm, oj.train)
table(oj.train$Purchase, oj.new.train.pred)
## oj.new.train.pred
## CH MM
## CH 428 60
## MM 75 237
(75 + 60)/800
## [1] 0.16875
oj.new.test.pred = predict(oj.new.svm, oj.test)
table(oj.test$Purchase, oj.new.test.pred)
## oj.new.test.pred
## CH MM
## CH 147 18
## MM 20 85
(20 + 18)/270
## [1] 0.1407407
Tuning the cost (to 0.1) lowers the training error rate to 16.88% and the test error rate to 14.07%.
oj.svm.rad = svm(Purchase~., data = oj.train, kernel = "radial", cost = 0.01)
summary(oj.svm.rad)
##
## Call:
## svm(formula = Purchase ~ ., data = oj.train, kernel = "radial", cost = 0.01)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: radial
## cost: 0.01
##
## Number of Support Vectors: 625
##
## ( 313 312 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
oj.rad.train.pred = predict(oj.svm.rad, oj.train)
table(oj.train$Purchase, oj.rad.train.pred)
## oj.rad.train.pred
## CH MM
## CH 488 0
## MM 312 0
312/800
## [1] 0.39
oj.rad.test.pred = predict(oj.svm.rad, oj.test)
table(oj.test$Purchase, oj.rad.test.pred)
## oj.rad.test.pred
## CH MM
## CH 165 0
## MM 105 0
105/270
## [1] 0.3888889
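With cost = 0.01 the radial-kernel machine is too heavily regularized to fit anything: it predicts the majority class CH for every observation, so its error rate is simply the proportion of MM buyers. A quick confirmation (not in the original output):
mean(oj.train$Purchase == "MM")  # 312/800 = 0.39, the error of always predicting CH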
set.seed(1234)
oj.rad.tune = tune(svm, Purchase ~ ., data = oj.train, kernel = "radial", ranges = list(cost = c(0.001, 0.01, 0.1, 1, 5, 10)))
summary(oj.rad.tune)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost
## 1
##
## - best performance: 0.1775
##
## - Detailed performance results:
## cost error dispersion
## 1 1e-03 0.39000 0.04706674
## 2 1e-02 0.39000 0.04706674
## 3 1e-01 0.18750 0.03632416
## 4 1e+00 0.17750 0.04440971
## 5 5e+00 0.18250 0.03446012
## 6 1e+01 0.18875 0.03747684
oj.svm.rad2 = svm(Purchase~., data = oj.train, kernel = "radial", cost = oj.rad.tune$best.parameters$cost)
summary(oj.svm.rad2)
##
## Call:
## svm(formula = Purchase ~ ., data = oj.train, kernel = "radial", cost = oj.rad.tune$best.parameters$cost)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: radial
## cost: 1
##
## Number of Support Vectors: 375
##
## ( 189 186 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
oj.train.rad.pred2 = predict(oj.svm.rad2, oj.train)
table(oj.train$Purchase, oj.train.rad.pred2)
## oj.train.rad.pred2
## CH MM
## CH 445 43
## MM 79 233
(79 + 43)/800
## [1] 0.1525
oj.test.rad.pred2 = predict(oj.svm.rad2, oj.test)
table(oj.test$Purchase, oj.test.rad.pred2)
## oj.test.rad.pred2
## CH MM
## CH 150 15
## MM 26 79
(26+15)/270
## [1] 0.1518519
Tuning helps the radial kernel dramatically: its test error rate falls from 38.89% to 15.19%, though it remains slightly above the tuned linear fit.
oj.svm.poly = svm(Purchase~., data = oj.train, kernel = "poly", cost = 0.01, degree = 2)
summary(oj.svm.poly)
##
## Call:
## svm(formula = Purchase ~ ., data = oj.train, kernel = "poly", cost = 0.01,
## degree = 2)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: polynomial
## cost: 0.01
## degree: 2
## coef.0: 0
##
## Number of Support Vectors: 629
##
## ( 317 312 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
oj.poly.train.pred = predict(oj.svm.poly, oj.train)
table(oj.train$Purchase, oj.poly.train.pred)
## oj.poly.train.pred
## CH MM
## CH 488 0
## MM 312 0
312/800
## [1] 0.39
oj.poly.test.pred = predict(oj.svm.poly, oj.test)
table(oj.test$Purchase, oj.poly.test.pred)
## oj.poly.test.pred
## CH MM
## CH 165 0
## MM 105 0
105/270
## [1] 0.3888889
set.seed(1234)
oj.poly.tune = tune(svm, Purchase~., data = oj.train, kernel = "poly", ranges = list(cost=c(0.001, 0.01, 0.1, 1, 5, 10)), degree = 2)
summary(oj.poly.tune)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost
## 10
##
## - best performance: 0.18625
##
## - Detailed performance results:
## cost error dispersion
## 1 1e-03 0.39000 0.04706674
## 2 1e-02 0.39000 0.04706674
## 3 1e-01 0.30875 0.05804991
## 4 1e+00 0.20500 0.05109903
## 5 5e+00 0.19125 0.03998698
## 6 1e+01 0.18625 0.04767147
oj.poly.tune$best.parameters
## cost
## 6 10
oj.svm.poly2 = svm(Purchase~., data = oj.train, kernel = "poly", cost = oj.poly.tune$best.parameters$cost, degree = 2)
summary(oj.svm.poly2)
##
## Call:
## svm(formula = Purchase ~ ., data = oj.train, kernel = "poly", cost = oj.poly.tune$best.parameters$cost,
## degree = 2)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: polynomial
## cost: 10
## degree: 2
## coef.0: 0
##
## Number of Support Vectors: 344
##
## ( 175 169 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
oj.train.poly.pred2 = predict(oj.svm.poly2, oj.train)
table(oj.train$Purchase, oj.train.poly.pred2)
## oj.train.poly.pred2
## CH MM
## CH 447 41
## MM 79 233
(79+41)/800
## [1] 0.15
oj.test.poly.pred2 = predict(oj.svm.poly2, oj.test)
table(oj.test$Purchase, oj.test.poly.pred2)
## oj.test.poly.pred2
## CH MM
## CH 152 13
## MM 35 70
(35+13)/270
## [1] 0.1777778
Comparing the tuned models, the linear SVM (cost = 0.1) achieves the lowest test error rate on the OJ data at 14.07%, with the tuned radial SVM (cost = 1) close behind at 15.19% and the tuned polynomial SVM (cost = 10) last at 17.78%.
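Collecting the tuned test error rates reported above makes the comparison explicit (the counts are transcribed from the confusion matrices; nothing new is computed):
data.frame(kernel = c("linear", "radial", "polynomial"),
           test.error = c(38/270, 41/270, 48/270))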