5. We have seen that we can fit an SVM with a non-linear kernel in order to perform classification using a non-linear decision boundary. We will now see that we can also obtain a non-linear decision boundary by performing logistic regression using non-linear transformations of the features.
(a) Generate a data set with n = 500 and p = 2, such that the observations belong to two classes with a quadratic decision boundary between them.
set.seed(1)
x1 = runif(500) - 0.5
x2 = runif(500) - 0.5
y = 1 * (x1^2 - x2^2 > 0)
(b) Plot the observations, colored according to their class labels. Your plot should display X1 on the x-axis, and X2 on the y-axis.
set.seed(1)
plot(x1[y == 0],x2[y == 0],col = "red",xlab = "X1",ylab = "X2",pch = "+")
points(x1[y == 1],x2[ y == 1], col = "blue", pch = 4)
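Since y was defined by the sign of x1^2 - x2^2, the true boundary is the pair of lines x2 = ±x1; they can be overlaid on the plot as an optional check (a small sketch):
# Overlay the true quadratic boundary x1^2 = x2^2, i.e. the lines x2 = +/- x1
abline(a = 0, b = 1, lty = 2)
abline(a = 0, b = -1, lty = 2)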
(c) Fit a logistic regression model to the data, using X1 and X2 as predictors.
set.seed(1)
logfit <- glm(y ~ x1+ x2, family = binomial)
summary(logfit)
##
## Call:
## glm(formula = y ~ x1 + x2, family = binomial)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.179 -1.139 -1.112 1.206 1.257
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.087260 0.089579 -0.974 0.330
## x1 0.196199 0.316864 0.619 0.536
## x2 -0.002854 0.305712 -0.009 0.993
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 692.18 on 499 degrees of freedom
## Residual deviance: 691.79 on 497 degrees of freedom
## AIC: 697.79
##
## Number of Fisher Scoring iterations: 3
(d) Apply this model to the training data in order to obtain a predicted class label for each training observation. Plot the observations, colored according to the predicted class labels. The decision boundary should be linear.
data = data.frame(x1 = x1, x2 = x2, y = as.factor(y))
lm.prob = predict(logfit, data, type="response")
lm.pred = ifelse(lm.prob > 0.50, 1, 0)
data.pos = data[lm.pred == 1, ]
data.neg = data[lm.pred == 0, ]
plot(data.pos$x1, data.pos$x2, col = "blue", xlab = "X1", ylab = "X2", pch = "+")
points(data.neg$x1, data.neg$x2, col = "red", pch = 4)
(e) Now fit a logistic regression model to the data using non-linear functions of X1 and X2 as predictors (e.g. X1^2, X1 × X2, log(X2), and so forth).
set.seed(1)
lmfit <- glm(y ~ poly(x1, 2) + poly(x2, 2) + I(x1 * x2), family = binomial)
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
summary(lmfit)
##
## Call:
## glm(formula = y ~ poly(x1, 2) + poly(x2, 2) + I(x1 * x2), family = binomial)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -8.240e-04 -2.000e-08 -2.000e-08 2.000e-08 1.163e-03
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -102.2 4302.0 -0.024 0.981
## poly(x1, 2)1 2715.3 141109.5 0.019 0.985
## poly(x1, 2)2 27218.5 842987.2 0.032 0.974
## poly(x2, 2)1 -279.7 97160.4 -0.003 0.998
## poly(x2, 2)2 -28693.0 875451.3 -0.033 0.974
## I(x1 * x2) -206.4 41802.8 -0.005 0.996
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 6.9218e+02 on 499 degrees of freedom
## Residual deviance: 3.5810e-06 on 494 degrees of freedom
## AIC: 12
##
## Number of Fisher Scoring iterations: 25
(f) Apply this model to the training data in order to obtain a predicted class label for each training observation. Plot the observations, colored according to the predicted class labels. The decision boundary should be obviously non-linear. If it is not, then repeat (a)-(e) until you come up with an example in which the predicted class labels are obviously non-linear.
set.seed(1)
lm.prob = predict(lmfit,data, type = "response")
lm.pred = ifelse(lm.prob > 0.5, 1, 0)
data.pos = data[lm.pred == 1, ]
data.neg = data[lm.pred == 0, ]
plot(data.pos$x1, data.pos$x2, col = "blue", xlab = "X1", ylab = "X2", pch = "+")
points(data.neg$x1, data.neg$x2, col = "green", pch = 4)
(g) Fit a support vector classifier to the data with X1 and X2 as predictors. Obtain a class prediction for each training observation. Plot the observations, colored according to the predicted class labels.
set.seed(1)
library(e1071)
## Warning: package 'e1071' was built under R version 4.1.3
svmfity = svm(as.factor(y) ~ x1 + x2, data, kernel = "linear", cost = 0.01)
svm.pred2 = predict(svmfity, data)
data.pos1 = data[svm.pred2 == 1, ]
data.neg1 = data[svm.pred2 == 0, ]
plot(data.neg1$x1, data.neg1$x2, col = "red", xlab = "X1", ylab = "X2", pch = "+")
points(data.pos1$x1, data.pos1$x2, col = "blue", pch = 4)
(h) Fit an SVM using a non-linear kernel to the data. Obtain a class prediction for each training observation. Plot the observations, colored according to the predicted class labels.
library(e1071)
svmfity1 = svm(as.factor(y) ~ x1 + x2, data, gamma=1)
svm.pred3 = predict(svmfity1,data)
data.pos2 = data[svm.pred3 == 1, ]
data.neg2 = data[svm.pred3 == 0, ]
plot(data.neg2$x1,data.neg2$x2,col = "green",xlab = "X1",ylab = "X2",pch = "+")
points(data.pos2$x1,data.pos2$x2,col = "brown",pch = 4)
(i) Comment on your results. Logistic regression with only linear terms in X1 and X2 and the support vector classifier with a linear kernel both produce linear decision boundaries, which cannot capture the true quadratic boundary. Logistic regression with non-linear transformations of the features and the SVM with a radial kernel both recover a clearly non-linear boundary that matches the class structure.
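A minimal sketch to quantify this with training error rates, assuming the fitted objects from parts (c)-(h) (logfit, lmfit, svmfity, svmfity1) and the data frame data are still in the workspace:
# Training error rates for the four fits
err = function(pred) mean(pred != y)
err(ifelse(predict(logfit, data, type = "response") > 0.5, 1, 0))   # linear logistic
err(ifelse(predict(lmfit,  data, type = "response") > 0.5, 1, 0))   # non-linear logistic
err(as.numeric(as.character(predict(svmfity,  data))))              # linear SVC
err(as.numeric(as.character(predict(svmfity1, data))))              # radial-kernel SVM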
7. In this problem, you will use support vector approaches in order to predict whether a given car gets high or low gas mileage based on the Auto data set.
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.1.3
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5 v purrr 0.3.4
## v tibble 3.1.6 v dplyr 1.0.7
## v tidyr 1.2.0 v stringr 1.4.0
## v readr 2.1.2 v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(caret)
## Warning: package 'caret' was built under R version 4.1.3
## Loading required package: lattice
##
## Attaching package: 'caret'
## The following object is masked from 'package:purrr':
##
## lift
library(ISLR)
#View(Auto)
dim(Auto)
## [1] 392 9
(a) Create a binary variable that takes on a 1 for cars with gas mileage above the median, and a 0 for cars with gas mileage below the median.
median(Auto$mpg)
## [1] 22.75
The median mpg is 22.75.
Auto$mpgbin<- ifelse(Auto$mpg > 22.75, 1,0)
View(Auto)
Auto2 <- Auto[, -c(1,9)] #delete column 1 and 9
View(Auto2)
(b) Fit a support vector classifier to the data with various values of cost, in order to predict whether a car gets high or low gas mileage. Report the cross-validation errors associated with different values of this parameter. Comment on your results.
train_control <- trainControl(method = "repeatedcv", number = 10, repeats = 3)
# Note: 'trcontrol' is a misspelling of trControl here and in the later train() calls,
# so the repeated-CV specification is not used and caret falls back to its default
# bootstrap resampling -- hence "Bootstrapped (25 reps)" in the output below.
svm1 <- train(mpgbin ~ ., data = Auto2, method = 'svmLinear', trcontrol = train_control,
              preProcess = c('center', 'scale'))
## Warning in train.default(x, y, weights = w, ...): You are trying to do
## regression and your outcome only has two possible values Are you trying to do
## classification? If so, use a 2 level factor as your outcome column.
svm1
## Support Vector Machines with Linear Kernel
##
## 392 samples
## 7 predictor
##
## Pre-processing: centered (7), scaled (7)
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 392, 392, 392, 392, 392, 392, ...
## Resampling results:
##
## RMSE Rsquared MAE
## 0.3302032 0.5985918 0.1965709
##
## Tuning parameter 'C' was held constant at a value of 1
svm2<-train(mpgbin~., data=Auto2, method='svmLinear', trcontrol=train_control,preProcess=c('center','scale'), tuneGrid=expand.grid(C = seq(0,2,length=20)))
## Warning in train.default(x, y, weights = w, ...): You are trying to do
## regression and your outcome only has two possible values Are you trying to do
## classification? If so, use a 2 level factor as your outcome column.
## Warning: model fit failed for Resample01: C=0.0000 Error in .local(x, ...) :
## No Support Vectors found. You may want to change your parameters
(The same "No Support Vectors found" warning is repeated for each of the 25 resamples at C = 0.0000; the remaining copies are omitted for brevity.)
## Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
## There were missing values in resampled performance measures.
## Warning in train.default(x, y, weights = w, ...): missing values found in
## aggregated results
svm2$bestTune
## C
## 2 0.1052632
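The warnings above arise because mpgbin is numeric, so caret fits a regression SVM and reports RMSE rather than a misclassification rate. A classification-mode alternative using e1071::tune, sketched under the assumption that we recode the response as a factor:
# Sketch: 10-fold CV misclassification error for a linear support vector
# classifier, with the response recoded as a factor so svm() does classification.
library(e1071)
set.seed(1)
Auto2f = Auto2
Auto2f$mpgbin = as.factor(Auto2f$mpgbin)
tune.lin = tune(svm, mpgbin ~ ., data = Auto2f, kernel = "linear",
                ranges = list(cost = c(0.01, 0.1, 1, 5, 10)))
summary(tune.lin)   # CV misclassification error for each value of cost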
Among the costs tried, C ≈ 0.105 yields the lowest RMSE. (Because mpgbin was left numeric, caret fit a regression SVM here, which is why RMSE is reported rather than a misclassification rate.)
(c) Now repeat (b), this time using SVMs with radial and polynomial basis kernels, with different values of gamma, degree, and cost. Comment on your results.
set.seed(1)
tune.out_poly = tune(svm, mpgbin~., data = Auto2, kernel = "polynomial", ranges = list(cost = c(0.1, 1, 5, 10), degree = c(2, 3, 4)))
summary(tune.out_poly)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost degree
## 10 3
##
## - best performance: 0.1092121
##
## - Detailed performance results:
## cost degree error dispersion
## 1 0.1 2 0.2239802 0.07304997
## 2 1.0 2 0.1574751 0.05592423
## 3 5.0 2 0.1516332 0.03804864
## 4 10.0 2 0.1540740 0.03973014
## 5 0.1 3 0.1366508 0.04330112
## 6 1.0 3 0.1221202 0.04297165
## 7 5.0 3 0.1108529 0.03519430
## 8 10.0 3 0.1092121 0.03417340
## 9 0.1 4 0.1894453 0.06727966
## 10 1.0 4 0.1374626 0.03881428
## 11 5.0 4 0.1390815 0.03978204
## 12 10.0 4 0.1364791 0.03701171
The best performance for the polynomial kernel occurs at cost = 10 and degree = 3, with a CV error of 0.109. (As in (b), mpgbin is numeric here, so tune() cross-validates a regression SVM and the reported error is a squared-error measure rather than a misclassification rate; the relative comparison across parameter values still stands.)
set.seed(1)
tune.out_rad = tune(svm, mpgbin~., data = Auto2, kernel = "radial", ranges = list(cost = c(0.1, 1, 5, 10), gamma = c(0.01,0.1,1,5,10,100)))
summary(tune.out_rad)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost gamma
## 1 1
##
## - best performance: 0.05852796
##
## - Detailed performance results:
## cost gamma error dispersion
## 1 0.1 1e-02 0.09902986 0.025488731
## 2 1.0 1e-02 0.08857701 0.029198514
## 3 5.0 1e-02 0.08524248 0.034827814
## 4 10.0 1e-02 0.08617197 0.036465230
## 5 0.1 1e-01 0.07262940 0.026661100
## 6 1.0 1e-01 0.07541273 0.029983134
## 7 5.0 1e-01 0.07161875 0.027027169
## 8 10.0 1e-01 0.06849272 0.025857267
## 9 0.1 1e+00 0.08069019 0.021176790
## 10 1.0 1e+00 0.05852796 0.019700180
## 11 5.0 1e+00 0.06724268 0.018999084
## 12 10.0 1e+00 0.07260378 0.023453751
## 13 0.1 5e+00 0.27250383 0.041777294
## 14 1.0 5e+00 0.10239908 0.023918049
## 15 5.0 5e+00 0.10580167 0.028598473
## 16 10.0 5e+00 0.10643383 0.028660883
## 17 0.1 1e+01 0.37817888 0.043686490
## 18 1.0 1e+01 0.15015684 0.021735121
## 19 5.0 1e+01 0.15179528 0.022782957
## 20 10.0 1e+01 0.15179528 0.022782957
## 21 0.1 1e+02 0.45119417 0.036134217
## 22 1.0 1e+02 0.24762989 0.002280045
## 23 5.0 1e+02 0.24763004 0.002280016
## 24 10.0 1e+02 0.24763004 0.002280016
The lowest CV error, 0.0585, is obtained with cost = 1 and gamma = 1.
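e1071 also provides a plot method for tune objects that draws the CV error over the tuning grid; a quick sketch for both tuning runs above:
# Visualize the CV error over the (cost, degree) and (cost, gamma) grids
plot(tune.out_poly)
plot(tune.out_rad)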
#Trying the same using the train function
svm4<-train(mpgbin~., data=Auto2, method='svmPoly', trcontrol=train_control,preProcess=c('center','scale'), tuneLength=4)
## Warning in train.default(x, y, weights = w, ...): You are trying to do
## regression and your outcome only has two possible values Are you trying to do
## classification? If so, use a 2 level factor as your outcome column.
svm4
## Support Vector Machines with Polynomial Kernel
##
## 392 samples
## 7 predictor
##
## Pre-processing: centered (7), scaled (7)
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 392, 392, 392, 392, 392, 392, ...
## Resampling results across tuning parameters:
##
## degree scale C RMSE Rsquared MAE
## 1 0.001 0.25 0.4437710 0.5903961 0.3845122
## 1 0.001 0.50 0.3507501 0.5922358 0.2877224
## 1 0.001 1.00 0.3334595 0.5986146 0.2503912
## 1 0.001 2.00 0.3287658 0.6021953 0.2331249
## 1 0.010 0.25 0.3285330 0.6024940 0.2293539
## 1 0.010 0.50 0.3289451 0.6025025 0.2174956
## 1 0.010 1.00 0.3312615 0.5993808 0.2076168
## 1 0.010 2.00 0.3339254 0.5939040 0.2020224
## 1 0.100 0.25 0.3347837 0.5918743 0.2011790
## 1 0.100 0.50 0.3372410 0.5865854 0.2008514
## 1 0.100 1.00 0.3393310 0.5825362 0.2012037
## 1 0.100 2.00 0.3409790 0.5788762 0.2019418
## 1 1.000 0.25 0.3413889 0.5779664 0.2021472
## 1 1.000 0.50 0.3424402 0.5755987 0.2029957
## 1 1.000 1.00 0.3428164 0.5746671 0.2033375
## 1 1.000 2.00 0.3430669 0.5740480 0.2034530
## 2 0.001 0.25 0.3508770 0.5920555 0.2877576
## 2 0.001 0.50 0.3334201 0.5987218 0.2503523
## 2 0.001 1.00 0.3284846 0.6027227 0.2329167
## 2 0.001 2.00 0.3279036 0.6039990 0.2205738
## 2 0.010 0.25 0.3215450 0.6160742 0.2120849
## 2 0.010 0.50 0.3198753 0.6204749 0.1996402
## 2 0.010 1.00 0.3189046 0.6215889 0.1921195
## 2 0.010 2.00 0.3175083 0.6234110 0.1861556
## 2 0.100 0.25 0.3024286 0.6513431 0.1592962
## 2 0.100 0.50 0.3069965 0.6458131 0.1523289
## 2 0.100 1.00 0.3106477 0.6415438 0.1473256
## 2 0.100 2.00 0.3142842 0.6370387 0.1439027
## 2 1.000 0.25 0.3120840 0.6426786 0.1419943
## 2 1.000 0.50 0.3112195 0.6444346 0.1428458
## 2 1.000 1.00 0.3117967 0.6433825 0.1445070
## 2 1.000 2.00 0.3127382 0.6419252 0.1455702
## 3 0.001 0.25 0.3378963 0.5961692 0.2617577
## 3 0.001 0.50 0.3296518 0.6017898 0.2387915
## 3 0.001 1.00 0.3272798 0.6043725 0.2254789
## 3 0.001 2.00 0.3274053 0.6057200 0.2131303
## 3 0.010 0.25 0.3150410 0.6287553 0.2004528
## 3 0.010 0.50 0.3131838 0.6317721 0.1911045
## 3 0.010 1.00 0.3113489 0.6343552 0.1833573
## 3 0.010 2.00 0.3103344 0.6362009 0.1753454
## 3 0.100 0.25 0.2933712 0.6740697 0.1464891
## 3 0.100 0.50 0.2932031 0.6761117 0.1429394
## 3 0.100 1.00 0.2913336 0.6804020 0.1405076
## 3 0.100 2.00 0.2904938 0.6819195 0.1416625
## 3 1.000 0.25 0.2990037 0.6777090 0.1545258
## 3 1.000 0.50 0.3079933 0.6650910 0.1600121
## 3 1.000 1.00 0.3226808 0.6456590 0.1671689
## 3 1.000 2.00 0.3461041 0.6154762 0.1752223
##
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were degree = 3, scale = 0.1 and C = 2.
The lowest RMSE is obtained with degree = 3, scale = 0.1, and C = 2.
(d) Make some plots to back up your assertions in (b) and (c). Hint: In the lab, we used the plot() function for svm objects only in cases with p = 2. When p > 2, you can use the plot() function to create plots displaying pairs of variables at a time. Essentially, instead of typing
plot(svmfit, dat), where svmfit contains your fitted model and dat is a data frame containing your data, you can type plot(svmfit, dat, x1 ~ x4) in order to plot just the first and fourth variables. However, you must replace x1 and x4 with the correct variable names. To find out more, type ?plot.svm.
Auto$med = ifelse(Auto$mpg > 22.75, 1, 0)
Auto$med = as.factor(Auto$med)
# Refit the three models for plotting.
# Note: Auto still contains mpg and mpgbin, so these also enter the fits as
# predictors (mpg alone determines med exactly).
svm.linear = svm(med ~ ., data = Auto, kernel = "linear", cost = 0.1)
svm.poly = svm(med ~ ., data = Auto, kernel = "polynomial", cost = 2, degree = 3)
svm.radial = svm(med ~ ., data = Auto, kernel = "radial", cost = 1, gamma = 1)
# Plot the fitted classifier against each (mpg, predictor) pair.
plotpairs = function(fit) {
  for (name in names(Auto)[!(names(Auto) %in% c("mpg", "med", "mpgbin", "name"))]) {
    plot(fit, Auto, as.formula(paste("mpg~", name, sep = "")))
  }
}
plotpairs(svm.linear)
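The same pairwise plots can be produced for the polynomial and radial fits defined above (a quick follow-up):
# Pairwise plots for the polynomial and radial kernels as well
plotpairs(svm.poly)
plotpairs(svm.radial)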
plot(svm2)
8. This problem involves the OJ data set which is part of the ISLR package.
(a) Create a training set containing a random sample of 800 observations, and a test set containing the remaining observations.
library(ISLR)
#View(OJ)
dim(OJ)
## [1] 1070 18
# Note: no seed is set before this sample(), so the split (and the exact error
# counts below) will differ from run to run.
train = sample(dim(OJ)[1], 800)
OJtrain = OJ[train, ]
OJtest = OJ[-train, ]
str(OJtrain)
## 'data.frame': 800 obs. of 18 variables:
## $ Purchase : Factor w/ 2 levels "CH","MM": 2 2 1 1 1 1 1 2 1 2 ...
## $ WeekofPurchase: num 262 275 237 268 251 253 230 254 270 240 ...
## $ StoreID : num 1 7 3 7 4 2 4 7 7 7 ...
## $ PriceCH : num 1.76 1.99 1.79 1.86 1.99 1.89 1.79 1.86 1.86 1.86 ...
## $ PriceMM : num 1.99 2.13 2.09 2.13 2.23 2.09 1.79 2.18 2.13 2.09 ...
## $ DiscCH : num 0 0 0 0 0 0.13 0 0 0.27 0 ...
## $ DiscMM : num 0.4 0.54 0 0 0 0 0 0 0 0 ...
## $ SpecialCH : num 0 1 0 0 0 0 0 0 1 0 ...
## $ SpecialMM : num 1 0 0 0 0 0 0 0 0 0 ...
## $ LoyalCH : num 0.616 0.995 0.541 0.966 0.946 ...
## $ SalePriceMM : num 1.59 1.59 2.09 2.13 2.23 2.09 1.79 2.18 2.13 2.09 ...
## $ SalePriceCH : num 1.76 1.99 1.79 1.86 1.99 1.76 1.79 1.86 1.59 1.86 ...
## $ PriceDiff : num -0.17 -0.4 0.3 0.27 0.24 0.33 0 0.32 0.54 0.23 ...
## $ Store7 : Factor w/ 2 levels "No","Yes": 1 2 1 2 1 1 1 2 2 2 ...
## $ PctDiscMM : num 0.201 0.254 0 0 0 ...
## $ PctDiscCH : num 0 0 0 0 0 ...
## $ ListPriceDiff : num 0.23 0.14 0.3 0.27 0.24 0.2 0 0.32 0.27 0.23 ...
## $ STORE : num 1 0 3 0 4 2 4 0 0 0 ...
(b) Fit a support vector classifier to the training data using cost=0.01, with Purchase as the response and the other variables as predictors. Use the summary() function to produce summary statistics, and describe the results obtained.
library(e1071)
svmOJ = svm(Purchase ~ ., kernel = "linear", data = OJtrain, cost = 0.01)
summary(svmOJ)
##
## Call:
## svm(formula = Purchase ~ ., data = OJtrain, kernel = "linear", cost = 0.01)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: linear
## cost: 0.01
##
## Number of Support Vectors: 426
##
## ( 212 214 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
The support vector classifier uses 426 support vectors, 212 from class CH and 214 from class MM.
(c) What are the training and test error rates?
library(e1071)
set.seed(1)
predOJtrain = predict(svmOJ, OJtrain, type = "class")
table(OJtrain$Purchase,predOJtrain)
## predOJtrain
## CH MM
## CH 446 53
## MM 75 226
The training error rate is (53 + 75)/800 = 128/800 = 16%.
library(e1071)
set.seed(1)
predOJ = predict(svmOJ, OJtest, type = "class")
table(OJtest$Purchase,predOJ)
## predOJ
## CH MM
## CH 134 20
## MM 32 84
The test error rate is (20 + 32)/270 = 52/270 ≈ 19.3%, which is higher than the training error rate.
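Rather than computing these by hand from the tables, a small helper gives the rates directly (a sketch; it assumes svmOJ, OJtrain, and OJtest from above are in the workspace):
# Misclassification-rate helper
errRate = function(fit, dat) mean(predict(fit, dat) != dat$Purchase)
errRate(svmOJ, OJtrain)   # training error
errRate(svmOJ, OJtest)    # test error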
(d) Use the tune() function to select an optimal cost. Consider values in the range 0.01 to 10.
library(e1071)
set.seed(1)
tune.out=tune(svm ,Purchase~.,data=OJtrain ,kernel ="linear",ranges=list(cost=c(0.01,0.1, 1,5,10) ))
summary(tune.out)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost
## 1
##
## - best performance: 0.16
##
## - Detailed performance results:
## cost error dispersion
## 1 0.01 0.16375 0.03508422
## 2 0.10 0.16250 0.03535534
## 3 1.00 0.16000 0.03809710
## 4 5.00 0.16250 0.03726780
## 5 10.00 0.16250 0.03726780
bestmod=tune.out$best.model
summary(bestmod)
##
## Call:
## best.tune(method = svm, train.x = Purchase ~ ., data = OJtrain, ranges = list(cost = c(0.01,
## 0.1, 1, 5, 10)), kernel = "linear")
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: linear
## cost: 1
##
## Number of Support Vectors: 317
##
## ( 158 159 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
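The cross-validation error as a function of cost can also be inspected graphically with e1071's plot method for tune objects (a one-line sketch):
# CV error versus cost for the linear support vector classifier
plot(tune.out)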
(e) Compute the training and test error rates using this new value for cost.
library(e1071)
set.seed(1)
predOJbest1 = predict(bestmod, OJtrain, type = "class")
table(OJtrain$Purchase,predOJbest1)
## predOJbest1
## CH MM
## CH 442 57
## MM 71 230
The training error rate is (57 + 71)/800 = 128/800 = 16%.
library(e1071)
set.seed(1)
predOJbest2 = predict(bestmod, OJtest, type = "class")
table(OJtest$Purchase,predOJbest2)
## predOJbest2
## CH MM
## CH 133 21
## MM 31 85
The test error rate is (21 + 31)/270 = 52/270 ≈ 19.3%, essentially the same as with cost = 0.01 on this split.
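The helper sketched in part (c) can be reused here (assuming errRate, bestmod, OJtrain, and OJtest are still in the workspace):
# Error rates for the tuned support vector classifier
errRate(bestmod, OJtrain)
errRate(bestmod, OJtest)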
(f) Repeat parts (b) through (e) using a support vector machine with a radial kernel. Use the default value for gamma.
library(e1071)
set.seed(1)
svmOJr=svm(Purchase~., data = OJtrain, kernel ="radial", gamma=1,cost=0.01)
summary(svmOJr)
##
## Call:
## svm(formula = Purchase ~ ., data = OJtrain, kernel = "radial", gamma = 1,
## cost = 0.01)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: radial
## cost: 0.01
##
## Number of Support Vectors: 634
##
## ( 301 333 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
The radial-kernel SVM uses 634 support vectors, 301 from class CH and 333 from class MM.
library(e1071)
set.seed(1)
predOJr=predict(svmOJr,OJtrain)
table(predict=predOJr,truth=OJtrain$Purchase)
## truth
## predict CH MM
## CH 499 301
## MM 0 0
With cost = 0.01 and gamma = 1, every training observation is predicted as CH, so the training error rate is 301/800 ≈ 37.6%.
library(e1071)
set.seed(1)
predOJr1=predict(svmOJr,OJtest)
table(predict=predOJr1,truth=OJtest$Purchase)
## truth
## predict CH MM
## CH 154 116
## MM 0 0
The test error rate is 116/270 ≈ 43.0%, again because every observation is predicted as CH.
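Part (f) asks for the default value of gamma, whereas the fit above uses gamma = 1 with cost = 0.01, which is why the classifier collapses to predicting CH everywhere. A sketch with e1071's defaults (gamma = 1/number of input dimensions and cost = 1), reusing the errRate helper from part (c):
# Radial kernel with the default gamma and default cost = 1
svmOJr.def = svm(Purchase ~ ., data = OJtrain, kernel = "radial")
summary(svmOJr.def)
errRate(svmOJr.def, OJtrain)   # training error
errRate(svmOJr.def, OJtest)    # test error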
(g) Repeat parts (b) through (e) using a support vector machine with a polynomial kernel. Set degree=2.
library(e1071)
set.seed(1)
svmOJpoly = svm(Purchase ~ ., data = OJtrain, kernel = "poly", degree = 2)
summary(svmOJpoly)
##
## Call:
## svm(formula = Purchase ~ ., data = OJtrain, kernel = "poly", degree = 2)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: polynomial
## cost: 1
## degree: 2
## coef.0: 0
##
## Number of Support Vectors: 441
##
## ( 218 223 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
The polynomial-kernel SVM uses 441 support vectors, 218 from class CH and 223 from class MM.
library(e1071)
set.seed(1)
predOJpoly=predict(svmOJpoly,OJtrain)
table(predict=predOJpoly,truth=OJtrain$Purchase)
## truth
## predict CH MM
## CH 472 112
## MM 27 189
The training error rate is (112 + 27)/800 = 139/800 ≈ 17.4%.
library(e1071)
set.seed(1)
predOJpoly1=predict(svmOJpoly,OJtest)
table(predict=predOJpoly1,truth=OJtest$Purchase)
## truth
## predict CH MM
## CH 142 43
## MM 12 73
The test error rate is (43 + 12)/270 = 55/270 ≈ 20.4%, which is higher than the training error rate.
(h) Overall, which approach seems to give the best results on this data?
On this split, the support vector classifier with a linear kernel gives the best results: its test error (about 19%) is slightly lower than that of the degree-2 polynomial kernel (about 20%) and far lower than that of the radial kernel fit with gamma = 1 and cost = 0.01 (about 43%).
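A compact way to put the three test error rates side by side, assuming the fitted objects and the errRate helper from part (c) are still available:
# Test error rates for the linear, radial, and polynomial fits (sketch)
sapply(list(linear = bestmod, radial = svmOJr, polynomial = svmOJpoly),
       errRate, dat = OJtest)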