Chapter 09 (page 368): 5, 7, 8

Problem 5

We have seen that we can fit an SVM with a non-linear kernel in order to perform classification using a non-linear decision boundary. We will now see that we can also obtain a non-linear decision boundary by performing logistic regression using non-linear transformations of the features.
Q5(a) Generate a data set with n = 500 and p = 2, such that the observations belong to two classes with a quadratic decision boundary between them.

A5(a)

set.seed(5)
x1 = runif(500) - .5
x2 = runif(500) -.5
y = 1 * (x1^2 - x2^2 > 0)
df=data.frame(x1,x2)
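
As a quick sanity check on the simulated data, the condition x1^2 - x2^2 > 0 is the same as |x1| > |x2|, so the true boundary is the pair of lines x2 = x1 and x2 = -x1. A minimal sketch that regenerates the data with the same seed and confirms this (the name `y_alt` is introduced here only for illustration):

```r
set.seed(5)
x1 = runif(500) - 0.5
x2 = runif(500) - 0.5
y = 1 * (x1^2 - x2^2 > 0)

# Same labels written with absolute values instead of squares
y_alt = 1 * (abs(x1) > abs(x2))

all(y == y_alt)   # do the two definitions agree on every observation?
table(y)          # class counts
```

Because both coordinates are uniform on the same interval, the two classes should be roughly balanced.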

Q5(b) Plot the observations, colored according to their class labels. Your plot should display X1 on the x-axis, and X2 on the y-axis.
A5(b)

plot(x1, x2, col = (y + 1), pch=19)

Q5(c) Fit a logistic regression model to the data, using X1 and X2 as predictors.
A5(c)

set.seed(5)
lr.fit=glm(y~x1+x2,family=binomial)
summary(lr.fit)
## 
## Call:
## glm(formula = y ~ x1 + x2, family = binomial)
## 
## Deviance Residuals: 
##    Min      1Q  Median      3Q     Max  
## -1.200  -1.161  -1.131   1.190   1.223  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.03150    0.08949  -0.352    0.725
## x1          -0.06176    0.30506  -0.202    0.840
## x2          -0.11509    0.31086  -0.370    0.711
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 693.02  on 499  degrees of freedom
## Residual deviance: 692.85  on 497  degrees of freedom
## AIC: 698.85
## 
## Number of Fisher Scoring iterations: 3

Q5(d) Apply this model to the training data in order to obtain a predicted class label for each training observation. Plot the observations, colored according to the predicted class labels. The decision boundary should be linear.
A5(d)

set.seed(5)
lr.prob = predict(lr.fit, df, type="response")
lr.pred = rep(0,500)
lr.pred[lr.prob > .5] = 1
plot(x1,x2, col=lr.pred+1, pch=19)

Q5(e) Now fit a logistic regression model to the data using non-linear functions of X1 and X2 as predictors.
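A5(e) The coefficients on the order of 1e14 to 1e16 and a residual deviance (2379) above the null deviance (693) indicate that the fit did not converge. The classes are perfectly separable in x1^2 and x2^2, so the maximum-likelihood estimates are not finite and the coefficient values should not be interpreted. Only the signs of the squared terms are informative: they match the true boundary x1^2 - x2^2 > 0.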

lr.fit2 = glm(y ~ x1 + x2+ I(x1^2) + I(x2^2) + I(x1 * x2), data = df, family = binomial)
lr.fit2
## 
## Call:  glm(formula = y ~ x1 + x2 + I(x1^2) + I(x2^2) + I(x1 * x2), family = binomial, 
##     data = df)
## 
## Coefficients:
## (Intercept)           x1           x2      I(x1^2)      I(x2^2)   I(x1 * x2)  
##   3.326e+14    2.028e+14    1.536e+13    2.349e+16   -2.714e+16   -5.007e+14  
## 
## Degrees of Freedom: 499 Total (i.e. Null);  494 Residual
## Null Deviance:       693 
## Residual Deviance: 2379  AIC: 2391

Q5(f) Apply this model to the training data in order to obtain a predicted class label for each training observation. Plot the observations, colored according to the predicted class labels. The decision boundary should be obviously non-linear. If it is not, then repeat (a)-(e) until you come up with an example in which the predicted class labels are obviously non-linear.
A5(f)

set.seed(5)
lr2.prob = predict(lr.fit2, df, type="response")
lr2.pred = rep(0,500)
lr2.pred[lr2.prob > .5] = 1
plot(x1,x2, col=lr2.pred+1, pch=19)
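
The plots in (d) and (f) can be backed up with training error rates. The following is a minimal sketch, assuming the same seed and data-generating code as in (a); the helper `train_err` and the names `fit_lin` and `fit_quad` are introduced here for illustration and are not part of the original solution.

```r
set.seed(5)
x1 = runif(500) - 0.5
x2 = runif(500) - 0.5
y = 1 * (x1^2 - x2^2 > 0)
df = data.frame(x1, x2, y)

# Linear logistic regression
fit_lin = glm(y ~ x1 + x2, data = df, family = binomial)

# Quadratic logistic regression; separable classes typically make glm() warn,
# so the warnings are suppressed here
fit_quad = suppressWarnings(
  glm(y ~ x1 + x2 + I(x1^2) + I(x2^2) + I(x1 * x2), data = df, family = binomial)
)

# Fraction of training observations whose thresholded prediction differs from y
train_err = function(fit) {
  pred = as.numeric(predict(fit, df, type = "response") > 0.5)
  mean(pred != df$y)
}

err_lin = train_err(fit_lin)
err_quad = train_err(fit_quad)
c(linear = err_lin, quadratic = err_quad)
```

The linear model should do little better than guessing, while the model with squared terms should misclassify far fewer training points.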

Q5(g) Fit a support vector classifier to the data with X1 and X2 as predictors. Obtain a class prediction for each training observation. Plot the observations, colored according to the predicted class labels.
A5(g) The support vector classifier (linear kernel) failed to find a useful decision boundary and assigned all of the points to one class.

library(e1071)
set.seed(5)
# y is numeric, so svm() fits a regression SVM here; its predictions are thresholded at 0.5 below
svm.fit = svm(y ~ x1 + x2, df, kernel = "linear", cost = 0.1)
svm.prob = predict(svm.fit, df)
svm.pred = rep(0,500)
svm.pred[svm.prob > .5] = 1
plot(x1,x2, col=svm.pred+1, pch=19)

Q5(h) Fit a SVM using a non-linear kernel to the data. Obtain a class prediction for each training observation. Plot the observations, colored according to the predicted class labels.
A5(h) The non-linear kernel results were similar to the logistic regression model that used non-linear functions of X1 and X2 as predictors.

set.seed(5)
svm2.fit = svm(y ~ x1 + x2, df, gamma=1)
svm2.prob = predict(svm2.fit, df)
svm2.pred = rep(0,500)
svm2.pred[svm2.prob > .5] = 1
plot(x1,x2, col=svm2.pred+1, pch=19)

Q5(i) Comment on your results.
A5(i) The SVM with the non-linear (radial) kernel was the most effective: its predicted classes most closely resembled the true classes. The logistic regression model with squared and interaction terms produced similar results, but it requires choosing the transformations by hand, so it is less straightforward. The linear logistic regression and the SVM with a linear kernel both failed to identify a decision boundary that reflects the true classes.

Problem 7

In this problem, you will use support vector approaches in order to predict whether a given car gets high or low gas mileage based on the Auto data set.

library(ISLR2)
attach(Auto)

Q7(a) Create a binary variable that takes on a 1 for cars with gas mileage above the median, and a 0 for cars with gas mileage below the median.
A7(a)

Auto$MPG01 = ifelse(Auto$mpg > median(Auto$mpg), 1, 0)
median(mpg)
## [1] 22.75
AutoDF = subset(Auto, select=-c(mpg))

Q7(b) Fit a support vector classifier to the data with various values of cost, in order to predict whether a car gets high or low gas mileage. Report the cross-validation errors associated with different values of this parameter. Comment on your results. Note you will need to fit the classifier without the gas mileage variable to produce sensible results.
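A7(b) cost = 1 is the best option, with the lowest cross-validation error (0.0960). Because MPG01 is stored as a numeric 0/1 variable, svm() fits a regression SVM (the bestR output in part (c) reports SVM-Type eps-regression), so the errors reported here and in part (c) are cross-validated mean squared errors rather than misclassification rates.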

set.seed(1)
svmL.cv = tune(svm, MPG01~., data=AutoDF, kernel="linear", ranges= list(cost=c(0.001, 0.01, 0.1, 1, 5, 10)))
summary(svmL.cv)
## 
## Parameter tuning of 'svm':
## 
## - sampling method: 10-fold cross validation 
## 
## - best parameters:
##  cost
##     1
## 
## - best performance: 0.09603609 
## 
## - Detailed performance results:
##    cost      error dispersion
## 1 1e-03 0.10881486 0.02537281
## 2 1e-02 0.10421950 0.03138085
## 3 1e-01 0.10227373 0.03634911
## 4 1e+00 0.09603609 0.03666741
## 5 5e+00 0.10034346 0.03612147
## 6 1e+01 0.10531309 0.03683207
bestL=svmL.cv$best.model
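
To make the "10-fold cross validation" line in the summary concrete, here is a minimal base-R sketch of the procedure. It is not the tune() implementation: it uses logistic regression on simulated data as a stand-in for the SVM, and it reports misclassification rates. All names (`dat`, `fold`, `fold_err`) are hypothetical. As far as the e1071 documentation indicates, the error and dispersion columns correspond to the mean and standard deviation of the per-fold errors under tune()'s defaults.

```r
set.seed(1)
n = 200
x = rnorm(n)
y = rbinom(n, 1, plogis(2 * x))   # simulated binary response
dat = data.frame(x, y)

k = 10
fold = sample(rep(1:k, length.out = n))   # random assignment of rows to folds

# For each fold: fit on the other nine folds, evaluate on the held-out fold
fold_err = sapply(1:k, function(i) {
  fit = glm(y ~ x, data = dat[fold != i, ], family = binomial)
  p = predict(fit, dat[fold == i, ], type = "response")
  mean((p > 0.5) != dat$y[fold == i])
})

cv_error = mean(fold_err)       # analogous to the "error" column
cv_dispersion = sd(fold_err)    # analogous to the "dispersion" column
c(error = cv_error, dispersion = cv_dispersion)
```

Repeating this loop for each candidate cost and keeping the value with the smallest mean error is the idea behind the grid search.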

Q7(c) Now repeat (b), this time using SVMs with radial and polynomial basis kernels, with different values of gamma and degree and cost. Comment on your results.
A7(c) With a polynomial kernel, the lowest cross-validation error (0.1039033) is obtained with cost = 10 and degree = 1. With a radial kernel, the lowest cross-validation error (0.0651571) is obtained with cost = 5 and gamma = 0.1, which is also lower than the best linear-kernel error of 0.09603609 from part (b).

set.seed(1)
svmP.cv = tune(svm, MPG01~., data=AutoDF, kernel="polynomial", 
               ranges= list(cost=c(0.001, 0.01, 0.1, 1, 5, 10),
                            degree = c(1:5)))
summary(svmP.cv)
## 
## Parameter tuning of 'svm':
## 
## - sampling method: 10-fold cross validation 
## 
## - best parameters:
##  cost degree
##    10      1
## 
## - best performance: 0.1039033 
## 
## - Detailed performance results:
##     cost degree     error dispersion
## 1  1e-03      1 0.4935679 0.03915358
## 2  1e-02      1 0.4508053 0.03828434
## 3  1e-01      1 0.1670639 0.03132832
## 4  1e+00      1 0.1039924 0.02744167
## 5  5e+00      1 0.1044447 0.03235532
## 6  1e+01      1 0.1039033 0.03360908
## 7  1e-03      2 0.4984497 0.03930707
## 8  1e-02      2 0.4982382 0.03934745
## 9  1e-01      2 0.4960905 0.03974005
## 10 1e+00      2 0.4752293 0.04525479
## 11 5e+00      2 0.3981761 0.07334314
## 12 1e+01      2 0.3375643 0.08311313
## 13 1e-03      3 0.4984673 0.03930311
## 14 1e-02      3 0.4984136 0.03930767
## 15 1e-01      3 0.4978768 0.03935396
## 16 1e+00      3 0.4924827 0.03986701
## 17 5e+00      3 0.4692575 0.04334894
## 18 1e+01      3 0.4418838 0.04913560
## 19 1e-03      4 0.4984731 0.03930263
## 20 1e-02      4 0.4984719 0.03930290
## 21 1e-01      4 0.4984602 0.03930559
## 22 1e+00      4 0.4983427 0.03933257
## 23 5e+00      4 0.4978212 0.03945572
## 24 1e+01      4 0.4971505 0.03961089
## 25 1e-03      5 0.4984732 0.03930261
## 26 1e-02      5 0.4984731 0.03930264
## 27 1e-01      5 0.4984718 0.03930295
## 28 1e+00      5 0.4984585 0.03930611
## 29 5e+00      5 0.4983996 0.03932013
## 30 1e+01      5 0.4983260 0.03933773
bestP=svmP.cv$best.model
set.seed(1)
svmR.cv = tune(svm, MPG01~., data=AutoDF, kernel="radial", 
               ranges= list(cost=c(0.001, 0.01, 0.1, 1, 5, 10, 100),
                            gamma = c(0.1, 1, 5, 10)))
summary(svmR.cv)
## 
## Parameter tuning of 'svm':
## 
## - sampling method: 10-fold cross validation 
## 
## - best parameters:
##  cost gamma
##     5   0.1
## 
## - best performance: 0.0651571 
## 
## - Detailed performance results:
##     cost gamma      error  dispersion
## 1  1e-03   0.1 0.44966460 0.036851866
## 2  1e-02   0.1 0.15038588 0.023971541
## 3  1e-01   0.1 0.07282379 0.025938178
## 4  1e+00   0.1 0.07186118 0.030174404
## 5  5e+00   0.1 0.06515710 0.029987281
## 6  1e+01   0.1 0.06878597 0.031960255
## 7  1e+02   0.1 0.08770589 0.033540507
## 8  1e-03   1.0 0.49585247 0.039308236
## 9  1e-02   1.0 0.47238675 0.039412960
## 10 1e-01   1.0 0.27951125 0.036343997
## 11 1e+00   1.0 0.09918732 0.020523476
## 12 5e+00   1.0 0.10353541 0.020801755
## 13 1e+01   1.0 0.10442706 0.020690277
## 14 1e+02   1.0 0.10442540 0.020692046
## 15 1e-03   5.0 0.49797109 0.039387310
## 16 1e-02   5.0 0.49255017 0.039569487
## 17 1e-01   5.0 0.44175647 0.040898107
## 18 1e+00   5.0 0.23810450 0.007973452
## 19 5e+00   5.0 0.23812027 0.007956301
## 20 1e+01   5.0 0.23812027 0.007956301
## 21 1e+02   5.0 0.23812027 0.007956301
## 22 1e-03  10.0 0.49808972 0.039349882
## 23 1e-02  10.0 0.49332769 0.039200376
## 24 1e-01  10.0 0.44838377 0.037776360
## 25 1e+00  10.0 0.24380724 0.004605136
## 26 5e+00  10.0 0.24380416 0.004607188
## 27 1e+01  10.0 0.24380416 0.004607188
## 28 1e+02  10.0 0.24380416 0.004607188
bestR = svmR.cv$best.model
bestR
## 
## Call:
## best.tune(METHOD = svm, train.x = MPG01 ~ ., data = AutoDF, ranges = list(cost = c(0.001, 
##     0.01, 0.1, 1, 5, 10, 100), gamma = c(0.1, 1, 5, 10)), kernel = "radial")
## 
## 
## Parameters:
##    SVM-Type:  eps-regression 
##  SVM-Kernel:  radial 
##        cost:  5 
##       gamma:  0.1 
##     epsilon:  0.1 
## 
## 
## Number of Support Vectors:  242
bestR$cost
## [1] 5
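
The bestR output above reports SVM-Type eps-regression because MPG01 is numeric. The following is a minimal sketch, on simulated data with hypothetical variable names, of how the type of the response changes what svm() fits: a numeric 0/1 response yields numeric predictions, while a factor response yields class labels.

```r
library(e1071)

set.seed(1)
n = 100
x = rnorm(n)
y_num = as.numeric(x + rnorm(n, sd = 0.5) > 0)   # 0/1 stored as numeric
toy = data.frame(x = x, y_num = y_num, y_fac = factor(y_num))

fit_reg = svm(y_num ~ x, data = toy, kernel = "linear", cost = 1)   # numeric response
fit_cls = svm(y_fac ~ x, data = toy, kernel = "linear", cost = 1)   # factor response

pred_reg = predict(fit_reg, toy)
pred_cls = predict(fit_cls, toy)

is.numeric(pred_reg)   # regression SVM returns numbers
is.factor(pred_cls)    # classification SVM returns class labels
```

Applied to this problem, converting the response with as.factor() before calling tune() would make the reported errors misclassification rates; the numbers in the tables above would then change.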

Q7(d) Make some plots to back up your assertions in (b) and (c).
A7(d)

svmL = svm(MPG01 ~ ., data = AutoDF, kernel = "linear", cost = 1)
svmP = svm(MPG01 ~ ., data = AutoDF, kernel = "polynomial", cost = 10, degree = 1)
svmR = svm(MPG01 ~ ., data = AutoDF, kernel = "radial", cost = 5, gamma = 0.1)

#Linear 
cost_list.cv = as.numeric(svmL.cv$performances[,'cost'])
error_list.cv = as.numeric(svmL.cv$performances[,'error'])
plot(x =cost_list.cv, y=error_list.cv, xlab = "Cost", ylab = "CV Error")
title("Linear SVM Cross-Validation Errors ")
points(bestL$cost, min(error_list.cv), col='red')
lines(x =cost_list.cv, y=error_list.cv)

#Polynomial
cost_list.cvP = as.numeric(svmP.cv$performances[,'cost'])
error_list.cvP = as.numeric(svmP.cv$performances[,'error'])
plot(x =cost_list.cvP, y=error_list.cvP, xlab = "Cost", ylab = "CV Error")
title("Polynomial SVM Cross-Validation Errors ")
lines(x =cost_list.cvP, y=error_list.cvP)
points(bestP$cost, min(error_list.cvP), col='red')

#Radial
cost_list.cvR = as.numeric(svmR.cv$performances[,'cost'])
error_list.cvR = as.numeric(svmR.cv$performances[,'error'])
plot(x =cost_list.cvR, y=error_list.cvR, xlab = "Cost", ylab = "CV Error")
title("Radial SVM Cross-Validation Errors ")
points(bestR$cost, min(error_list.cvR), col='red')
lines(x =cost_list.cvR, y=error_list.cvR)

Problem 8

This problem involves the OJ data set which is part of the ISLR2 package.

Q8(a) Create a training set containing a random sample of 800 observations, and a test set containing the remaining observations.
A8(a)

library(ISLR2)
set.seed(1)
inTrain = sample(nrow(OJ), 800)
trainOJ = OJ[inTrain,]
testOJ = OJ[-inTrain,]

Q8(b) Fit a support vector classifier to the training data using cost = 0.01, with Purchase as the response and the other variables as predictors. Use the summary() function to produce summary statistics, and describe the results obtained.
A8(b) With cost = 0.01 the classifier uses 435 support vectors out of 800 training observations: 219 belong to CH and 216 belong to MM.

library(e1071)
set.seed(1)
svmL = svm(Purchase ~ ., kernel = "linear", data = trainOJ, cost = 0.01)
summary(svmL) #435 Vectors
## 
## Call:
## svm(formula = Purchase ~ ., data = trainOJ, kernel = "linear", cost = 0.01)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  linear 
##        cost:  0.01 
## 
## Number of Support Vectors:  435
## 
##  ( 219 216 )
## 
## 
## Number of Classes:  2 
## 
## Levels: 
##  CH MM

Q8(c) What are the training and test error rates?
A8(c) The training error rate is 0.175 (17.5%) and the test error rate is 0.1777778 (about 17.8%).

set.seed(1)
train.pred = predict(svmL, trainOJ)
table(trainOJ$Purchase, train.pred)
##     train.pred
##       CH  MM
##   CH 420  65
##   MM  75 240
train.error = mean(trainOJ$Purchase != train.pred)
train.error
## [1] 0.175
test.pred = predict(svmL, testOJ)
table(testOJ$Purchase, test.pred)
##     test.pred
##       CH  MM
##   CH 153  15
##   MM  33  69
test.error = mean(testOJ$Purchase != test.pred)
test.error
## [1] 0.1777778
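
The error rates can also be read directly off the confusion matrices: they are the off-diagonal counts divided by the total. A short sketch with the counts copied from the tables above (the helper `err_rate` is introduced here for illustration):

```r
# Rows are the true class, columns the predicted class (filled column by column)
train_tab = matrix(c(420, 75, 65, 240), nrow = 2,
                   dimnames = list(truth = c("CH", "MM"), pred = c("CH", "MM")))
test_tab = matrix(c(153, 33, 15, 69), nrow = 2,
                  dimnames = list(truth = c("CH", "MM"), pred = c("CH", "MM")))

# Misclassification rate = 1 - (correct predictions on the diagonal) / (all predictions)
err_rate = function(tab) 1 - sum(diag(tab)) / sum(tab)

train_err = err_rate(train_tab)   # (65 + 75) / 800
test_err = err_rate(test_tab)     # (15 + 33) / 270
c(train = train_err, test = test_err)
```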

Q8(d) Use the tune() function to select an optimal cost. Consider values in the range 0.01 to 10.
A8(d) Optimal cost is 0.5145455.

set.seed(1)
tuneOJ = tune(svm, Purchase~., data=trainOJ, kernel="linear", ranges= list(cost=seq(0.01, 10, length.out=100)))

summary(tuneOJ)
## 
## Parameter tuning of 'svm':
## 
## - sampling method: 10-fold cross validation 
## 
## - best parameters:
##       cost
##  0.5145455
## 
## - best performance: 0.16875 
## 
## - Detailed performance results:
##           cost   error dispersion
## 1    0.0100000 0.17625 0.02853482
## 2    0.1109091 0.17125 0.02889757
## 3    0.2118182 0.17250 0.02751262
## 4    0.3127273 0.17125 0.02889757
## 5    0.4136364 0.17000 0.02713137
## 6    0.5145455 0.16875 0.02651650
## 7    0.6154545 0.17125 0.02703521
## 8    0.7163636 0.16875 0.02779513
## 9    0.8172727 0.17000 0.02648375
## 10   0.9181818 0.17375 0.02972676
## 11   1.0190909 0.17500 0.02946278
## 12   1.1200000 0.17375 0.02664713
## 13   1.2209091 0.17500 0.02763854
## 14   1.3218182 0.17500 0.02763854
## 15   1.4227273 0.17375 0.02853482
## 16   1.5236364 0.17375 0.02853482
## 17   1.6245455 0.17250 0.02813657
## 18   1.7254545 0.17375 0.02729087
## 19   1.8263636 0.17375 0.02729087
## 20   1.9272727 0.17250 0.02874698
## 21   2.0281818 0.17250 0.02874698
## 22   2.1290909 0.17125 0.03064696
## 23   2.2300000 0.17125 0.03064696
## 24   2.3309091 0.17125 0.03335936
## 25   2.4318182 0.17125 0.03283481
## 26   2.5327273 0.17000 0.03395258
## 27   2.6336364 0.17000 0.03291403
## 28   2.7345455 0.16875 0.03397814
## 29   2.8354545 0.16875 0.03397814
## 30   2.9363636 0.17000 0.03291403
## 31   3.0372727 0.16875 0.03019037
## 32   3.1381818 0.16875 0.03019037
## 33   3.2390909 0.16875 0.03019037
## 34   3.3400000 0.16875 0.02960973
## 35   3.4409091 0.16875 0.02960973
## 36   3.5418182 0.17000 0.02958040
## 37   3.6427273 0.17000 0.02958040
## 38   3.7436364 0.17000 0.02958040
## 39   3.8445455 0.17000 0.02958040
## 40   3.9454545 0.17000 0.02958040
## 41   4.0463636 0.17000 0.02958040
## 42   4.1472727 0.17000 0.02958040
## 43   4.2481818 0.17000 0.02958040
## 44   4.3490909 0.17125 0.03064696
## 45   4.4500000 0.17125 0.03064696
## 46   4.5509091 0.17125 0.03175973
## 47   4.6518182 0.17125 0.03175973
## 48   4.7527273 0.17125 0.03175973
## 49   4.8536364 0.17125 0.03175973
## 50   4.9545455 0.17125 0.03175973
## 51   5.0554545 0.17250 0.03162278
## 52   5.1563636 0.17250 0.03162278
## 53   5.2572727 0.17250 0.03162278
## 54   5.3581818 0.17250 0.03162278
## 55   5.4590909 0.17375 0.03304563
## 56   5.5600000 0.17375 0.03304563
## 57   5.6609091 0.17250 0.03425801
## 58   5.7618182 0.17250 0.03425801
## 59   5.8627273 0.17250 0.03425801
## 60   5.9636364 0.17500 0.03333333
## 61   6.0645455 0.17500 0.03333333
## 62   6.1654545 0.17500 0.03333333
## 63   6.2663636 0.17375 0.03197764
## 64   6.3672727 0.17375 0.03197764
## 65   6.4681818 0.17375 0.03197764
## 66   6.5690909 0.17375 0.03197764
## 67   6.6700000 0.17375 0.03197764
## 68   6.7709091 0.17375 0.03197764
## 69   6.8718182 0.17500 0.03333333
## 70   6.9727273 0.17375 0.03197764
## 71   7.0736364 0.17375 0.03197764
## 72   7.1745455 0.17375 0.03197764
## 73   7.2754545 0.17375 0.03197764
## 74   7.3763636 0.17375 0.03197764
## 75   7.4772727 0.17375 0.03197764
## 76   7.5781818 0.17375 0.03197764
## 77   7.6790909 0.17375 0.03197764
## 78   7.7800000 0.17375 0.03197764
## 79   7.8809091 0.17375 0.03197764
## 80   7.9818182 0.17375 0.03197764
## 81   8.0827273 0.17375 0.03197764
## 82   8.1836364 0.17375 0.03197764
## 83   8.2845455 0.17375 0.03197764
## 84   8.3854545 0.17375 0.03197764
## 85   8.4863636 0.17375 0.03197764
## 86   8.5872727 0.17375 0.03197764
## 87   8.6881818 0.17375 0.03197764
## 88   8.7890909 0.17250 0.03216710
## 89   8.8900000 0.17375 0.03197764
## 90   8.9909091 0.17375 0.03197764
## 91   9.0918182 0.17375 0.03197764
## 92   9.1927273 0.17375 0.03197764
## 93   9.2936364 0.17375 0.03197764
## 94   9.3945455 0.17375 0.03197764
## 95   9.4954545 0.17375 0.03197764
## 96   9.5963636 0.17375 0.03197764
## 97   9.6972727 0.17375 0.03197764
## 98   9.7981818 0.17375 0.03197764
## 99   9.8990909 0.17375 0.03197764
## 100 10.0000000 0.17375 0.03197764
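
Several cost values in the table tie for the lowest error of 0.16875 (rows 6, 8, 28, 29 and 31 to 35), and the reported best cost is the first of them. That is consistent with a first-minimum rule such as which.min(), although this sketch does not inspect how tune() breaks ties internally. Using the first ten rows of the table:

```r
cost = seq(0.01, 10, length.out = 100)

# CV errors for the first ten cost values, copied from the table above
err10 = c(0.17625, 0.17125, 0.17250, 0.17125, 0.17000,
          0.16875, 0.17125, 0.16875, 0.17000, 0.17375)

first_min = which.min(err10)              # index of the first minimum
all_min = which(err10 == min(err10))      # indices of every tied minimum

cost[first_min]
cost[all_min]
```

Since the tied values span costs from about 0.5 to 3.4, the cross-validation error is fairly flat over that range and the exact choice matters little.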

Q8(e) Compute the training and test error rates using this new value for cost.
A8(e) Using the new cost value of 0.5145455, the training error rate is 0.165 (16.5%) and the test error rate is 0.1555556 (about 15.6%).

set.seed(1)
OJcost.fit = svm(Purchase ~ ., data = trainOJ, kernel = 'linear', cost = tuneOJ$best.parameters$cost)
summary(OJcost.fit)
## 
## Call:
## svm(formula = Purchase ~ ., data = trainOJ, kernel = "linear", cost = tuneOJ$best.parameters$cost)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  linear 
##        cost:  0.5145455 
## 
## Number of Support Vectors:  332
## 
##  ( 166 166 )
## 
## 
## Number of Classes:  2 
## 
## Levels: 
##  CH MM
OJcost.fit
## 
## Call:
## svm(formula = Purchase ~ ., data = trainOJ, kernel = "linear", cost = tuneOJ$best.parameters$cost)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  linear 
##        cost:  0.5145455 
## 
## Number of Support Vectors:  332
svc.train.pred = predict(OJcost.fit, trainOJ)
table(svc.train.pred, trainOJ$Purchase)
##               
## svc.train.pred  CH  MM
##             CH 424  71
##             MM  61 244
traincost.errorL = mean(trainOJ$Purchase != svc.train.pred) 
traincost.errorL
## [1] 0.165
svc.test.pred=predict(OJcost.fit, testOJ)
table(svc.test.pred, testOJ$Purchase)
##              
## svc.test.pred  CH  MM
##            CH 155  29
##            MM  13  73
testcost.errorL = mean(testOJ$Purchase != svc.test.pred)
testcost.errorL
## [1] 0.1555556

Q8(f) Repeat parts (b) through (e) using a support vector machine with a radial kernel. Use the default value for gamma.
A8(f)
Parts (b) and (c): with a radial kernel and cost = 0.01, the fit has 634 support vectors (319 belonging to CH and 315 to MM), a training error rate of 0.39375 and a test error rate of 0.3777778. Every observation is predicted as CH.
Parts (d) and (e): the optimal cost selected by tune() is 0.5145455, giving a training error rate of 0.14875 and a test error rate of 0.1777778.

set.seed(1)
#part b
svmR = svm(Purchase ~ ., kernel = "radial", data = trainOJ, cost = 0.01)
summary(svmR) #634 support vectors
## 
## Call:
## svm(formula = Purchase ~ ., data = trainOJ, kernel = "radial", cost = 0.01)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  radial 
##        cost:  0.01 
## 
## Number of Support Vectors:  634
## 
##  ( 319 315 )
## 
## 
## Number of Classes:  2 
## 
## Levels: 
##  CH MM
#part c
set.seed(1)
train.pred = predict(svmR, trainOJ)
table(trainOJ$Purchase, train.pred)
##     train.pred
##       CH  MM
##   CH 485   0
##   MM 315   0
train.error = mean(trainOJ$Purchase != train.pred)
train.error #.39375 w/kernel = radial
## [1] 0.39375
test.pred = predict(svmR, testOJ)
table(testOJ$Purchase, test.pred)
##     test.pred
##       CH  MM
##   CH 168   0
##   MM 102   0
test.error = mean(testOJ$Purchase != test.pred)
test.error  #0.3777778 w/kernel = radial
## [1] 0.3777778
#part d
set.seed(1)
tuneOJR = tune(svm, Purchase~., data=trainOJ, kernel="radial", ranges = list(cost=seq(0.01, 10, length.out=100)))

summary(tuneOJR) #best tune is cost=0.5145455   
## 
## Parameter tuning of 'svm':
## 
## - sampling method: 10-fold cross validation 
## 
## - best parameters:
##       cost
##  0.5145455
## 
## - best performance: 0.16625 
## 
## - Detailed performance results:
##           cost   error dispersion
## 1    0.0100000 0.39375 0.04007372
## 2    0.1109091 0.18625 0.02853482
## 3    0.2118182 0.18250 0.03238227
## 4    0.3127273 0.17875 0.03230175
## 5    0.4136364 0.17625 0.02531057
## 6    0.5145455 0.16625 0.02433134
## 7    0.6154545 0.16875 0.02301117
## 8    0.7163636 0.16750 0.02776389
## 9    0.8172727 0.17000 0.02513851
## 10   0.9181818 0.16750 0.02220485
## 11   1.0190909 0.17000 0.02058182
## 12   1.1200000 0.17250 0.02188988
## 13   1.2209091 0.17250 0.02108185
## 14   1.3218182 0.17250 0.02266912
## 15   1.4227273 0.17375 0.02389938
## 16   1.5236364 0.17625 0.02161050
## 17   1.6245455 0.17625 0.02161050
## 18   1.7254545 0.17750 0.02188988
## 19   1.8263636 0.17625 0.02079162
## 20   1.9272727 0.17625 0.02079162
## 21   2.0281818 0.17750 0.02188988
## 22   2.1290909 0.17875 0.02128673
## 23   2.2300000 0.17875 0.02128673
## 24   2.3309091 0.17750 0.02266912
## 25   2.4318182 0.17750 0.02266912
## 26   2.5327273 0.17625 0.02239947
## 27   2.6336364 0.17625 0.02239947
## 28   2.7345455 0.17625 0.02239947
## 29   2.8354545 0.17625 0.02239947
## 30   2.9363636 0.17625 0.02239947
## 31   3.0372727 0.17625 0.02239947
## 32   3.1381818 0.17750 0.02266912
## 33   3.2390909 0.17750 0.02266912
## 34   3.3400000 0.17750 0.02266912
## 35   3.4409091 0.17875 0.02360703
## 36   3.5418182 0.17875 0.02360703
## 37   3.6427273 0.17875 0.02360703
## 38   3.7436364 0.17875 0.02360703
## 39   3.8445455 0.18000 0.02371708
## 40   3.9454545 0.18000 0.02371708
## 41   4.0463636 0.18125 0.02301117
## 42   4.1472727 0.18125 0.02301117
## 43   4.2481818 0.18125 0.02301117
## 44   4.3490909 0.18125 0.02301117
## 45   4.4500000 0.18125 0.02144923
## 46   4.5509091 0.18125 0.02144923
## 47   4.6518182 0.18125 0.02144923
## 48   4.7527273 0.18125 0.02144923
## 49   4.8536364 0.18125 0.02144923
## 50   4.9545455 0.18000 0.02220485
## 51   5.0554545 0.18000 0.02220485
## 52   5.1563636 0.18000 0.02220485
## 53   5.2572727 0.18000 0.02220485
## 54   5.3581818 0.18000 0.02220485
## 55   5.4590909 0.18000 0.02220485
## 56   5.5600000 0.18000 0.02220485
## 57   5.6609091 0.18000 0.02220485
## 58   5.7618182 0.18000 0.02220485
## 59   5.8627273 0.18000 0.02220485
## 60   5.9636364 0.18000 0.02220485
## 61   6.0645455 0.18000 0.02220485
## 62   6.1654545 0.18000 0.02220485
## 63   6.2663636 0.18125 0.02301117
## 64   6.3672727 0.18125 0.02301117
## 65   6.4681818 0.18125 0.02301117
## 66   6.5690909 0.18125 0.02301117
## 67   6.6700000 0.18125 0.02301117
## 68   6.7709091 0.18125 0.02301117
## 69   6.8718182 0.18125 0.02301117
## 70   6.9727273 0.18250 0.02371708
## 71   7.0736364 0.18375 0.02503470
## 72   7.1745455 0.18250 0.02443813
## 73   7.2754545 0.18250 0.02443813
## 74   7.3763636 0.18375 0.02638523
## 75   7.4772727 0.18375 0.02638523
## 76   7.5781818 0.18375 0.02638523
## 77   7.6790909 0.18375 0.02638523
## 78   7.7800000 0.18375 0.02638523
## 79   7.8809091 0.18375 0.02638523
## 80   7.9818182 0.18375 0.02638523
## 81   8.0827273 0.18250 0.02648375
## 82   8.1836364 0.18125 0.02447363
## 83   8.2845455 0.18000 0.02443813
## 84   8.3854545 0.18000 0.02443813
## 85   8.4863636 0.18000 0.02443813
## 86   8.5872727 0.18000 0.02443813
## 87   8.6881818 0.18000 0.02443813
## 88   8.7890909 0.18375 0.02703521
## 89   8.8900000 0.18375 0.02703521
## 90   8.9909091 0.18375 0.02703521
## 91   9.0918182 0.18375 0.02703521
## 92   9.1927273 0.18375 0.02703521
## 93   9.2936364 0.18625 0.02853482
## 94   9.3945455 0.18625 0.02853482
## 95   9.4954545 0.18625 0.02853482
## 96   9.5963636 0.18625 0.02853482
## 97   9.6972727 0.18625 0.02853482
## 98   9.7981818 0.18625 0.02853482
## 99   9.8990909 0.18625 0.02853482
## 100 10.0000000 0.18625 0.02853482
#part e

set.seed(1)
OJcost.fitR = svm(Purchase ~ ., data = trainOJ, kernel ="radial", cost = 0.5145455  )
summary(OJcost.fitR)
## 
## Call:
## svm(formula = Purchase ~ ., data = trainOJ, kernel = "radial", cost = 0.5145455)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  radial 
##        cost:  0.5145455 
## 
## Number of Support Vectors:  406
## 
##  ( 205 201 )
## 
## 
## Number of Classes:  2 
## 
## Levels: 
##  CH MM
OJcost.fitR #406 support vectors
## 
## Call:
## svm(formula = Purchase ~ ., data = trainOJ, kernel = "radial", cost = 0.5145455)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  radial 
##        cost:  0.5145455 
## 
## Number of Support Vectors:  406
svc.train.pred = predict(OJcost.fitR, trainOJ)
table(svc.train.pred, trainOJ$Purchase)
##               
## svc.train.pred  CH  MM
##             CH 438  72
##             MM  47 243
traincost.errorR = mean(trainOJ$Purchase != svc.train.pred) 
traincost.errorR # 0.14875 f/radial w/cost=0.5145455
## [1] 0.14875
svc.test.pred=predict(OJcost.fitR, testOJ)
table(svc.test.pred, testOJ$Purchase)
##              
## svc.test.pred  CH  MM
##            CH 150  30
##            MM  18  72
testcost.errorR = mean(testOJ$Purchase != svc.test.pred)
testcost.errorR # 0.1777778 f/radial w/cost=0.5145455
## [1] 0.1777778
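
With cost = 0.01 the radial fit predicts CH for every observation, so its error rates are simply the share of MM purchases in each set. A short sketch using the class counts from the confusion tables above (the helper `baseline_err` is introduced here for illustration):

```r
# True class counts taken from the row totals of the cost = 0.01 tables
train_counts = c(CH = 485, MM = 315)
test_counts = c(CH = 168, MM = 102)

# Error of a classifier that always predicts the majority class
baseline_err = function(counts) 1 - max(counts) / sum(counts)

train_base = baseline_err(train_counts)   # 315 / 800
test_base = baseline_err(test_counts)     # 102 / 270
c(train = train_base, test = test_base)
```

These values match the 0.39375 and 0.3777778 reported for cost = 0.01, which gives a baseline against which the tuned models can be judged.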

Q8(g) Repeat parts (b) through (e) using a support vector machine with a polynomial kernel. Set degree = 2.
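A8(g)
Parts (b) and (c): with a polynomial kernel of degree 2 and cost = 0.01, the fit has 636 support vectors (321 belonging to CH and 315 to MM), a training error rate of 0.3725 and a test error rate of 0.3666667.
Parts (d) and (e): tune() selects cost = 2.431818. The refit in the code uses cost = 2.633636 and gives a training error rate of 0.15625 and a test error rate of 0.2037037.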

set.seed(1)
#part b
svmP = svm(Purchase ~ ., kernel="polynomial", degree=2, data = trainOJ, cost = 0.01)
summary(svmP) #636 support vectors
## 
## Call:
## svm(formula = Purchase ~ ., data = trainOJ, kernel = "polynomial", 
##     degree = 2, cost = 0.01)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  polynomial 
##        cost:  0.01 
##      degree:  2 
##      coef.0:  0 
## 
## Number of Support Vectors:  636
## 
##  ( 321 315 )
## 
## 
## Number of Classes:  2 
## 
## Levels: 
##  CH MM
#part c
set.seed(1)
train.pred = predict(svmP, trainOJ)
table(trainOJ$Purchase, train.pred)
##     train.pred
##       CH  MM
##   CH 484   1
##   MM 297  18
train.error = mean(trainOJ$Purchase != train.pred)
train.error #.3725 w/kernel = Polynomial, degree=2
## [1] 0.3725
test.pred = predict(svmP, testOJ)
table(testOJ$Purchase, test.pred)
##     test.pred
##       CH  MM
##   CH 167   1
##   MM  98   4
test.error = mean(testOJ$Purchase != test.pred)
test.error  #0.3666667 w/kernel = Polynomial, degree=2
## [1] 0.3666667
#part d
set.seed(1)
tuneOJP = tune(svm, Purchase~., data=trainOJ, kernel="polynomial", degree=2, ranges = list(cost=seq(0.01, 10, length.out=100)))

summary(tuneOJP) #best tune is cost=2.431818
## 
## Parameter tuning of 'svm':
## 
## - sampling method: 10-fold cross validation 
## 
## - best parameters:
##      cost
##  2.431818
## 
## - best performance: 0.17125 
## 
## - Detailed performance results:
##           cost   error dispersion
## 1    0.0100000 0.39125 0.04210189
## 2    0.1109091 0.31750 0.04937104
## 3    0.2118182 0.22375 0.04016027
## 4    0.3127273 0.20000 0.04208127
## 5    0.4136364 0.20125 0.04185375
## 6    0.5145455 0.20500 0.04338138
## 7    0.6154545 0.20500 0.03961621
## 8    0.7163636 0.20250 0.04031129
## 9    0.8172727 0.20250 0.04322101
## 10   0.9181818 0.20250 0.04479893
## 11   1.0190909 0.20125 0.03928617
## 12   1.1200000 0.20125 0.04016027
## 13   1.2209091 0.19625 0.04411554
## 14   1.3218182 0.19250 0.04495368
## 15   1.4227273 0.19125 0.04411554
## 16   1.5236364 0.19000 0.04322101
## 17   1.6245455 0.18875 0.04308019
## 18   1.7254545 0.18500 0.04199868
## 19   1.8263636 0.18375 0.04411554
## 20   1.9272727 0.18125 0.04177070
## 21   2.0281818 0.18125 0.04177070
## 22   2.1290909 0.17875 0.04210189
## 23   2.2300000 0.17375 0.03793727
## 24   2.3309091 0.17375 0.03747684
## 25   2.4318182 0.17125 0.03729108
## 26   2.5327273 0.17375 0.03884174
## 27   2.6336364 0.17375 0.03884174
## 28   2.7345455 0.17500 0.03818813
## 29   2.8354545 0.17500 0.03818813
## 30   2.9363636 0.17625 0.03793727
## 31   3.0372727 0.17625 0.03793727
## 32   3.1381818 0.17750 0.03670453
## 33   3.2390909 0.18000 0.03545341
## 34   3.3400000 0.18000 0.03545341
## 35   3.4409091 0.17875 0.03586723
## 36   3.5418182 0.17875 0.03586723
## 37   3.6427273 0.17875 0.03537988
## 38   3.7436364 0.17875 0.03537988
## 39   3.8445455 0.18125 0.03498512
## 40   3.9454545 0.18250 0.03395258
## 41   4.0463636 0.18250 0.03395258
## 42   4.1472727 0.18375 0.03438447
## 43   4.2481818 0.18500 0.03425801
## 44   4.3490909 0.18500 0.03425801
## 45   4.4500000 0.18500 0.03425801
## 46   4.5509091 0.18375 0.03387579
## 47   4.6518182 0.18375 0.03387579
## 48   4.7527273 0.18250 0.03496029
## 49   4.8536364 0.18250 0.03496029
## 50   4.9545455 0.18250 0.03496029
## 51   5.0554545 0.18250 0.03496029
## 52   5.1563636 0.18250 0.03496029
## 53   5.2572727 0.18375 0.03537988
## 54   5.3581818 0.18625 0.03143004
## 55   5.4590909 0.18625 0.03143004
## 56   5.5600000 0.18375 0.03064696
## 57   5.6609091 0.18375 0.03064696
## 58   5.7618182 0.18500 0.03162278
## 59   5.8627273 0.18625 0.03304563
## 60   5.9636364 0.18625 0.03304563
## 61   6.0645455 0.18500 0.03162278
## 62   6.1654545 0.18500 0.03162278
## 63   6.2663636 0.18500 0.03162278
## 64   6.3672727 0.18500 0.03162278
## 65   6.4681818 0.18500 0.03162278
## 66   6.5690909 0.18500 0.03162278
## 67   6.6700000 0.18625 0.03356689
## 68   6.7709091 0.18500 0.03162278
## 69   6.8718182 0.18500 0.03162278
## 70   6.9727273 0.18500 0.03162278
## 71   7.0736364 0.18625 0.03251602
## 72   7.1745455 0.18500 0.03525699
## 73   7.2754545 0.18375 0.03387579
## 74   7.3763636 0.18375 0.03387579
## 75   7.4772727 0.18375 0.03387579
## 76   7.5781818 0.18250 0.03291403
## 77   7.6790909 0.18250 0.03291403
## 78   7.7800000 0.18250 0.03291403
## 79   7.8809091 0.18125 0.03448530
## 80   7.9818182 0.18125 0.03448530
## 81   8.0827273 0.18125 0.03240906
## 82   8.1836364 0.18000 0.03129164
## 83   8.2845455 0.18125 0.02841288
## 84   8.3854545 0.18250 0.02898755
## 85   8.4863636 0.18250 0.02898755
## 86   8.5872727 0.18000 0.02838231
## 87   8.6881818 0.17875 0.02766993
## 88   8.7890909 0.17875 0.02766993
## 89   8.8900000 0.17875 0.02766993
## 90   8.9909091 0.17875 0.02766993
## 91   9.0918182 0.17750 0.02751262
## 92   9.1927273 0.17750 0.02751262
## 93   9.2936364 0.17750 0.02751262
## 94   9.3945455 0.17750 0.02751262
## 95   9.4954545 0.17750 0.02751262
## 96   9.5963636 0.17875 0.02766993
## 97   9.6972727 0.17875 0.02766993
## 98   9.7981818 0.17875 0.02766993
## 99   9.8990909 0.17875 0.02766993
## 100 10.0000000 0.18125 0.02779513
#part e

set.seed(1)
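# Note: this fit uses cost = 2.633636 (CV error 0.17375), not the cost = 2.431818 selected by tune() above (CV error 0.17125)
OJcost.fitP = svm(Purchase ~ ., data = trainOJ, kernel="polynomial", degree=2, cost = 2.633636)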
summary(OJcost.fitP)
## 
## Call:
## svm(formula = Purchase ~ ., data = trainOJ, kernel = "polynomial", 
##     degree = 2, cost = 2.633636)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  polynomial 
##        cost:  2.633636 
##      degree:  2 
##      coef.0:  0 
## 
## Number of Support Vectors:  391
## 
##  ( 197 194 )
## 
## 
## Number of Classes:  2 
## 
## Levels: 
##  CH MM
OJcost.fitP #391 support vectors
## 
## Call:
## svm(formula = Purchase ~ ., data = trainOJ, kernel = "polynomial", 
##     degree = 2, cost = 2.633636)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  polynomial 
##        cost:  2.633636 
##      degree:  2 
##      coef.0:  0 
## 
## Number of Support Vectors:  391
svc.train.pred = predict(OJcost.fitP, trainOJ)
table(svc.train.pred, trainOJ$Purchase)
##               
## svc.train.pred  CH  MM
##             CH 452  92
##             MM  33 223
traincost.errorP = mean(trainOJ$Purchase != svc.train.pred) 
traincost.errorP # 0.15625 f/Polynomial w/degree =2 and  w/cost=2.633636
## [1] 0.15625
svc.test.pred=predict(OJcost.fitP, testOJ)
table(svc.test.pred, testOJ$Purchase)
##              
## svc.test.pred  CH  MM
##            CH 153  40
##            MM  15  62
testcost.errorP = mean(testOJ$Purchase != svc.test.pred)
testcost.errorP # 0.2037037 f/Polynomial w/degree =2 and  w/cost=2.633636
## [1] 0.2037037

Q8(h) Overall, which approach seems to give the best results on this data?
A8(h) The linear SVM gives the best results on this data, since it has the lowest test error of the three (0.1555556). The radial kernel has the lowest training error (0.14875) but a higher test error (0.1777778).

Comparison = data.frame(Type = c('Linear', 'Radial', 'Poly'),
                        Train_Error = c(traincost.errorL, traincost.errorR, traincost.errorP),
                        Test_Error = c(testcost.errorL, testcost.errorR, testcost.errorP))

Comparison
##     Type Train_Error Test_Error
## 1 Linear     0.16500  0.1555556
## 2 Radial     0.14875  0.1777778
## 3   Poly     0.15625  0.2037037
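
The differences in test error are small relative to the size of the test set. As a rough guide only, the sketch below computes a binomial standard error for each test error rate on the 270 test observations. This treats each rate in isolation and ignores that all three models are scored on the same observations, so it is an approximation rather than a formal comparison; the names `n_test`, `test_err` and `se` are introduced here for illustration.

```r
n_test = 270   # 168 CH + 102 MM test observations

# Test error rates copied from the Comparison table above
test_err = c(Linear = 0.1555556, Radial = 0.1777778, Poly = 0.2037037)

# Binomial standard error of each error rate
se = sqrt(test_err * (1 - test_err) / n_test)
round(se, 3)

# Number of misclassified test observations behind each rate
n_wrong = round(test_err * n_test)
n_wrong
```

The linear and radial fits differ by 6 misclassified test observations, which is about one standard error, so the ranking between them should be read with some caution.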