Assignment 8

Exercise 5

We have seen that we can fit an SVM with a non-linear kernel in order to perform classification using a non-linear decision boundary. We will now see that we can also obtain a non-linear decision boundary by performing logistic regression using non-linear transformations of the features.

(a) Generate a data set with n = 500 and p = 2, such that the observations belong to two classes with a quadratic decision boundary between them.

set.seed(1)
x1 = runif(500) - 0.5
x2 = runif(500) - 0.5
y = 1*(x1^2 - x2^2 > 0)

(b) Plot the observations, colored according to their class labels. Your plot should display X1 on the x-axis, and X2 on the yaxis.

plot(x1,x2, col = ifelse(y == 1,"red", "blue"))

(c) Fit a logistic regression model to the data, using X1 and X2 as predictors.

glm.fit1 = glm(y ~ x1+x2, family = "binomial")
summary(glm.fit1)

## 
## Call:
## glm(formula = y ~ x1 + x2, family = "binomial")
## 
## Deviance Residuals: 
##    Min      1Q  Median      3Q     Max  
## -1.179  -1.139  -1.112   1.206   1.257  
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.087260   0.089579  -0.974    0.330
## x1           0.196199   0.316864   0.619    0.536
## x2          -0.002854   0.305712  -0.009    0.993
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 692.18  on 499  degrees of freedom
## Residual deviance: 691.79  on 497  degrees of freedom
## AIC: 697.79
## 
## Number of Fisher Scoring iterations: 3

(d) Apply this model to the training data in order to obtain a predicted class label for each training observation. Plot the observations, colored according to the predicted class labels. The decision boundary should be linear.

data = data.frame(x1 = x1, x2 = x2, y = y)
glm.preds  = predict(glm.fit1,data = data ,type = "response")
glm.class = ifelse(glm.preds >= 0.47, 1, 0)
data.pos = data[glm.class == 1,]
data.neg = data[glm.class ==0,]
plot(data.pos$x1, data.pos$x2, col = "red", xlab = "x1", ylab = "x2", pch = 6)
points(data.neg$x1, data.neg$x2, col = "blue", pch = 4)

With the given model, if the cut-off is set to 0.5, there is no decision boundary observed, hence the decsion boundary is set to 0.47, to get clear decsion boundary between red and blue points and the boundary can observed as linear.

(e) Now fit a logistic regression model to the data using non-linear functions of X1 and X2 as predictors (e.g. X1^2 , X1×X2, log(X2), and so forth).

glm.fit2 = glm(y ~ x1 + poly(x2, 2) + I(x1^2) + I(x1*x2), data = data, family = "binomial" )

## Warning: glm.fit: algorithm did not converge

## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred

summary(glm.fit2)

## 
## Call:
## glm(formula = y ~ x1 + poly(x2, 2) + I(x1^2) + I(x1 * x2), family = "binomial", 
##     data = data)
## 
## Deviance Residuals: 
##        Min          1Q      Median          3Q         Max  
## -8.240e-04  -2.000e-08  -2.000e-08   2.000e-08   1.163e-03  
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)
## (Intercept)   -1444.8    44075.8  -0.033    0.974
## x1               42.1    15492.6   0.003    0.998
## poly(x2, 2)1   -279.7    97160.4  -0.003    0.998
## poly(x2, 2)2 -28693.0   875451.3  -0.033    0.974
## I(x1^2)       16758.0   519013.0   0.032    0.974
## I(x1 * x2)     -206.4    41802.8  -0.005    0.996
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 6.9218e+02  on 499  degrees of freedom
## Residual deviance: 3.5810e-06  on 494  degrees of freedom
## AIC: 12
## 
## Number of Fisher Scoring iterations: 25

(f) Apply this model to the training data in order to obtain a predicted class label for each training observation. Plot the observations, colored according to the predicted class labels. The decision boundary should be obviously non-linear. If it is not, then repeat (a)-(e) until you come up with an example in which the predicted class labels are obviously non-linear.

glm.preds2 = predict(glm.fit2, data = data, type = "response")
glm.class2 = ifelse(glm.preds2 >= 0.47, 1, 0)
data.pos2 = data[glm.class2 == 1,]
data.neg2 = data[glm.class2 == 0,]

plot(data.pos2$x1, data.pos2$x2, col = "red", xlab = "x1", ylab = "x2", pch = 6)
points(data.neg2$x1, data.neg2$x2, col = "blue", pch = 4)

(g) Fit a support vector classifier to the data with X1 and X2 as predictors. Obtain a class prediction for each training observation. Plot the observations, colored according to the predicted class labels.

data$y = as.factor(data$y)
svm.fit1 = svm(y ~ x1+x2, data = data, kernel = "linear", cost = 0.1, scale = FALSE)
summary(svm.fit1)

## 
## Call:
## svm(formula = y ~ x1 + x2, data = data, kernel = "linear", cost = 0.1, 
##     scale = FALSE)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  linear 
##        cost:  0.1 
## 
## Number of Support Vectors:  479
## 
##  ( 239 240 )
## 
## 
## Number of Classes:  2 
## 
## Levels: 
##  0 1

svm.preds = predict(svm.fit1, data = data)

plot(data[svm.preds == 0,]$x1, data[svm.preds == 0,]$x2, col = "blue", pch = 4, xlab = "x1", ylab ="x2")
points(data[svm.preds == 1,]$x1, data[svm.preds == 1,]$x2, col = "red", pch = 6)

In this case (with a linear kernel), the cost parameter defines the cost of violating the margin:

A small value of cost means the margins are wide and many support vectors can be on the margin or violate it
A large cost results in a narrow margin with few support vectors on the margin or violating it

But in this case irrespective of the cost parameter , the SVM seems to predict 0 for all teh observations.

Fit a SVM using a non-linear kernel to the data with X1 and X2 as predictors. Obtain a class prediction for each training observation. Plot the observations, colored according to the predicted class labels.

svm.fit2 = svm(y ~ x1+x2, data = data, kernal = "radial", gamma = 1, scale = FALSE)
svm.preds2 = predict(svm.fit2, data = data)
plot(data[svm.preds2 == 0, ]$x1, data[svm.preds2 == 0, ]$x2, col = "blue", xlab = "x1", ylab = "x2", pch = 4)
points(data[svm.preds2 == 1,]$x1, data[svm.preds2==1,]$x2, col = "red" , pch = 6)

We can observe that non-linear kernal of the SVM is very similar to the decsion boundary.

(i) Comment on your results.

We can notice that non-linear decision boundary of SVM and logistic regression with non-linear terms included perform better to predict closer to the true class value of the original data. Where as, SVM with linear kernel and logistic regression with out any non-linear terms do a bad job in the predicting the data by majorly predicting just one of the class. An advantage of SVM with non-linear kernel has to be noted that we needn’t take hassle of making the transformations as in logistic regression but the SVM with non-linear kernel handles that.

Exercise 7

In this problem, you will use support vector approaches in order to predict whether a given car gets high or low gas mileage based on the “Auto” data set.

(a) Create a binary variable that takes on a 1 for cars with gas mileage above the median, and a 0 for cars with gas mileage below the median.

attach(Auto)
med = median(Auto$mpg)
Auto$new.var = as.factor(ifelse(Auto$mpg > med, 1, 0))

(b) Fit a support vector classifier to the data with various values of “cost”, in order to predict whether a car gets high of low gas mileage. Report the cross-validation errors associated with different values of this parameter. Comment on your results

set.seed(123)
tune.out = tune(svm, new.var ~ . , data = Auto, kernel = "linear", ranges = list(cost = c(0.1, 1, 10, 100, 1000)))
summary(tune.out)

## 
## Parameter tuning of 'svm':
## 
## - sampling method: 10-fold cross validation 
## 
## - best parameters:
##  cost
##     1
## 
## - best performance: 0.01025641 
## 
## - Detailed performance results:
##    cost      error dispersion
## 1 1e-01 0.04333333 0.03191738
## 2 1e+00 0.01025641 0.01792836
## 3 1e+01 0.01788462 0.01727588
## 4 1e+02 0.03320513 0.02720447
## 5 1e+03 0.03320513 0.02720447

The cost parameter equals 1 performs better.

(c) Now repeat (b), this time using SVMs with radial and polynomial basis kernels, with different values of “gamma” and “degree” and “cost”. Comment on your results.

set.seed(123)
tune.out = tune(svm, new.var ~., data = Auto, kernal = "polynomial", ranges = list(cost = c(0.1, 1, 10, 100, 1000), gamma = c(0.5,1,2,3,4), degree = c(2,3,4)))
summary(tune.out)

## 
## Parameter tuning of 'svm':
## 
## - sampling method: 10-fold cross validation 
## 
## - best parameters:
##  cost gamma degree
##     1   0.5      2
## 
## - best performance: 0.04576923 
## 
## - Detailed performance results:
##     cost gamma degree      error dispersion
## 1  1e-01   0.5      2 0.08147436 0.03707182
## 2  1e+00   0.5      2 0.04576923 0.03903092
## 3  1e+01   0.5      2 0.05339744 0.03440111
## 4  1e+02   0.5      2 0.05339744 0.03440111
## 5  1e+03   0.5      2 0.05339744 0.03440111
## 6  1e-01   1.0      2 0.58173077 0.04740051
## 7  1e+00   1.0      2 0.05865385 0.04942437
## 8  1e+01   1.0      2 0.05608974 0.04595880
## 9  1e+02   1.0      2 0.05608974 0.04595880
## 10 1e+03   1.0      2 0.05608974 0.04595880
## 11 1e-01   2.0      2 0.58173077 0.04740051
## 12 1e+00   2.0      2 0.11474359 0.06630201
## 13 1e+01   2.0      2 0.11474359 0.06630201
## 14 1e+02   2.0      2 0.11474359 0.06630201
## 15 1e+03   2.0      2 0.11474359 0.06630201
## 16 1e-01   3.0      2 0.58173077 0.04740051
## 17 1e+00   3.0      2 0.42878205 0.17823496
## 18 1e+01   3.0      2 0.40839744 0.18573046
## 19 1e+02   3.0      2 0.40839744 0.18573046
## 20 1e+03   3.0      2 0.40839744 0.18573046
## 21 1e-01   4.0      2 0.58173077 0.04740051
## 22 1e+00   4.0      2 0.51538462 0.06959451
## 23 1e+01   4.0      2 0.50012821 0.07022396
## 24 1e+02   4.0      2 0.50012821 0.07022396
## 25 1e+03   4.0      2 0.50012821 0.07022396
## 26 1e-01   0.5      3 0.08147436 0.03707182
## 27 1e+00   0.5      3 0.04576923 0.03903092
## 28 1e+01   0.5      3 0.05339744 0.03440111
## 29 1e+02   0.5      3 0.05339744 0.03440111
## 30 1e+03   0.5      3 0.05339744 0.03440111
## 31 1e-01   1.0      3 0.58173077 0.04740051
## 32 1e+00   1.0      3 0.05865385 0.04942437
## 33 1e+01   1.0      3 0.05608974 0.04595880
## 34 1e+02   1.0      3 0.05608974 0.04595880
## 35 1e+03   1.0      3 0.05608974 0.04595880
## 36 1e-01   2.0      3 0.58173077 0.04740051
## 37 1e+00   2.0      3 0.11474359 0.06630201
## 38 1e+01   2.0      3 0.11474359 0.06630201
## 39 1e+02   2.0      3 0.11474359 0.06630201
## 40 1e+03   2.0      3 0.11474359 0.06630201
## 41 1e-01   3.0      3 0.58173077 0.04740051
## 42 1e+00   3.0      3 0.42878205 0.17823496
## 43 1e+01   3.0      3 0.40839744 0.18573046
## 44 1e+02   3.0      3 0.40839744 0.18573046
## 45 1e+03   3.0      3 0.40839744 0.18573046
## 46 1e-01   4.0      3 0.58173077 0.04740051
## 47 1e+00   4.0      3 0.51538462 0.06959451
## 48 1e+01   4.0      3 0.50012821 0.07022396
## 49 1e+02   4.0      3 0.50012821 0.07022396
## 50 1e+03   4.0      3 0.50012821 0.07022396
## 51 1e-01   0.5      4 0.08147436 0.03707182
## 52 1e+00   0.5      4 0.04576923 0.03903092
## 53 1e+01   0.5      4 0.05339744 0.03440111
## 54 1e+02   0.5      4 0.05339744 0.03440111
## 55 1e+03   0.5      4 0.05339744 0.03440111
## 56 1e-01   1.0      4 0.58173077 0.04740051
## 57 1e+00   1.0      4 0.05865385 0.04942437
## 58 1e+01   1.0      4 0.05608974 0.04595880
## 59 1e+02   1.0      4 0.05608974 0.04595880
## 60 1e+03   1.0      4 0.05608974 0.04595880
## 61 1e-01   2.0      4 0.58173077 0.04740051
## 62 1e+00   2.0      4 0.11474359 0.06630201
## 63 1e+01   2.0      4 0.11474359 0.06630201
## 64 1e+02   2.0      4 0.11474359 0.06630201
## 65 1e+03   2.0      4 0.11474359 0.06630201
## 66 1e-01   3.0      4 0.58173077 0.04740051
## 67 1e+00   3.0      4 0.42878205 0.17823496
## 68 1e+01   3.0      4 0.40839744 0.18573046
## 69 1e+02   3.0      4 0.40839744 0.18573046
## 70 1e+03   3.0      4 0.40839744 0.18573046
## 71 1e-01   4.0      4 0.58173077 0.04740051
## 72 1e+00   4.0      4 0.51538462 0.06959451
## 73 1e+01   4.0      4 0.50012821 0.07022396
## 74 1e+02   4.0      4 0.50012821 0.07022396
## 75 1e+03   4.0      4 0.50012821 0.07022396

the best parameters fro polynomial kernel are cost = 1, gamma = 0.5, degree = 2.

set.seed(123)
tune.out = tune(svm, new.var ~., data = Auto, kernal = "radial", ranges = list(cost = c(0.1, 1, 10, 100, 1000), gamma = c(0.01, 0.1, 0.5,1, 1.5, 2.0, 2.5, 3.0), degree = c(2,3,4)))
summary(tune.out)

## 
## Parameter tuning of 'svm':
## 
## - sampling method: 10-fold cross validation 
## 
## - best parameters:
##  cost gamma degree
##   100  0.01      2
## 
## - best performance: 0.01025641 
## 
## - Detailed performance results:
##      cost gamma degree      error dispersion
## 1   1e-01  0.01      2 0.08916667 0.04345384
## 2   1e+00  0.01      2 0.07378205 0.04185248
## 3   1e+01  0.01      2 0.02032051 0.02305327
## 4   1e+02  0.01      2 0.01025641 0.01792836
## 5   1e+03  0.01      2 0.02051282 0.02022591
## 6   1e-01  0.10      2 0.07634615 0.03928191
## 7   1e+00  0.10      2 0.05852564 0.03960325
## 8   1e+01  0.10      2 0.03314103 0.02942215
## 9   1e+02  0.10      2 0.03326923 0.02434857
## 10  1e+03  0.10      2 0.03326923 0.02434857
## 11  1e-01  0.50      2 0.08147436 0.03707182
## 12  1e+00  0.50      2 0.04576923 0.03903092
## 13  1e+01  0.50      2 0.05339744 0.03440111
## 14  1e+02  0.50      2 0.05339744 0.03440111
## 15  1e+03  0.50      2 0.05339744 0.03440111
## 16  1e-01  1.00      2 0.58173077 0.04740051
## 17  1e+00  1.00      2 0.05865385 0.04942437
## 18  1e+01  1.00      2 0.05608974 0.04595880
## 19  1e+02  1.00      2 0.05608974 0.04595880
## 20  1e+03  1.00      2 0.05608974 0.04595880
## 21  1e-01  1.50      2 0.58173077 0.04740051
## 22  1e+00  1.50      2 0.09179487 0.06035241
## 23  1e+01  1.50      2 0.08416667 0.06247218
## 24  1e+02  1.50      2 0.08416667 0.06247218
## 25  1e+03  1.50      2 0.08416667 0.06247218
## 26  1e-01  2.00      2 0.58173077 0.04740051
## 27  1e+00  2.00      2 0.11474359 0.06630201
## 28  1e+01  2.00      2 0.11474359 0.06630201
## 29  1e+02  2.00      2 0.11474359 0.06630201
## 30  1e+03  2.00      2 0.11474359 0.06630201
## 31  1e-01  2.50      2 0.58173077 0.04740051
## 32  1e+00  2.50      2 0.30897436 0.17999927
## 33  1e+01  2.50      2 0.28083333 0.16358214
## 34  1e+02  2.50      2 0.28083333 0.16358214
## 35  1e+03  2.50      2 0.28083333 0.16358214
## 36  1e-01  3.00      2 0.58173077 0.04740051
## 37  1e+00  3.00      2 0.42878205 0.17823496
## 38  1e+01  3.00      2 0.40839744 0.18573046
## 39  1e+02  3.00      2 0.40839744 0.18573046
## 40  1e+03  3.00      2 0.40839744 0.18573046
## 41  1e-01  0.01      3 0.08916667 0.04345384
## 42  1e+00  0.01      3 0.07378205 0.04185248
## 43  1e+01  0.01      3 0.02032051 0.02305327
## 44  1e+02  0.01      3 0.01025641 0.01792836
## 45  1e+03  0.01      3 0.02051282 0.02022591
## 46  1e-01  0.10      3 0.07634615 0.03928191
## 47  1e+00  0.10      3 0.05852564 0.03960325
## 48  1e+01  0.10      3 0.03314103 0.02942215
## 49  1e+02  0.10      3 0.03326923 0.02434857
## 50  1e+03  0.10      3 0.03326923 0.02434857
## 51  1e-01  0.50      3 0.08147436 0.03707182
## 52  1e+00  0.50      3 0.04576923 0.03903092
## 53  1e+01  0.50      3 0.05339744 0.03440111
## 54  1e+02  0.50      3 0.05339744 0.03440111
## 55  1e+03  0.50      3 0.05339744 0.03440111
## 56  1e-01  1.00      3 0.58173077 0.04740051
## 57  1e+00  1.00      3 0.05865385 0.04942437
## 58  1e+01  1.00      3 0.05608974 0.04595880
## 59  1e+02  1.00      3 0.05608974 0.04595880
## 60  1e+03  1.00      3 0.05608974 0.04595880
## 61  1e-01  1.50      3 0.58173077 0.04740051
## 62  1e+00  1.50      3 0.09179487 0.06035241
## 63  1e+01  1.50      3 0.08416667 0.06247218
## 64  1e+02  1.50      3 0.08416667 0.06247218
## 65  1e+03  1.50      3 0.08416667 0.06247218
## 66  1e-01  2.00      3 0.58173077 0.04740051
## 67  1e+00  2.00      3 0.11474359 0.06630201
## 68  1e+01  2.00      3 0.11474359 0.06630201
## 69  1e+02  2.00      3 0.11474359 0.06630201
## 70  1e+03  2.00      3 0.11474359 0.06630201
## 71  1e-01  2.50      3 0.58173077 0.04740051
## 72  1e+00  2.50      3 0.30897436 0.17999927
## 73  1e+01  2.50      3 0.28083333 0.16358214
## 74  1e+02  2.50      3 0.28083333 0.16358214
## 75  1e+03  2.50      3 0.28083333 0.16358214
## 76  1e-01  3.00      3 0.58173077 0.04740051
## 77  1e+00  3.00      3 0.42878205 0.17823496
## 78  1e+01  3.00      3 0.40839744 0.18573046
## 79  1e+02  3.00      3 0.40839744 0.18573046
## 80  1e+03  3.00      3 0.40839744 0.18573046
## 81  1e-01  0.01      4 0.08916667 0.04345384
## 82  1e+00  0.01      4 0.07378205 0.04185248
## 83  1e+01  0.01      4 0.02032051 0.02305327
## 84  1e+02  0.01      4 0.01025641 0.01792836
## 85  1e+03  0.01      4 0.02051282 0.02022591
## 86  1e-01  0.10      4 0.07634615 0.03928191
## 87  1e+00  0.10      4 0.05852564 0.03960325
## 88  1e+01  0.10      4 0.03314103 0.02942215
## 89  1e+02  0.10      4 0.03326923 0.02434857
## 90  1e+03  0.10      4 0.03326923 0.02434857
## 91  1e-01  0.50      4 0.08147436 0.03707182
## 92  1e+00  0.50      4 0.04576923 0.03903092
## 93  1e+01  0.50      4 0.05339744 0.03440111
## 94  1e+02  0.50      4 0.05339744 0.03440111
## 95  1e+03  0.50      4 0.05339744 0.03440111
## 96  1e-01  1.00      4 0.58173077 0.04740051
## 97  1e+00  1.00      4 0.05865385 0.04942437
## 98  1e+01  1.00      4 0.05608974 0.04595880
## 99  1e+02  1.00      4 0.05608974 0.04595880
## 100 1e+03  1.00      4 0.05608974 0.04595880
## 101 1e-01  1.50      4 0.58173077 0.04740051
## 102 1e+00  1.50      4 0.09179487 0.06035241
## 103 1e+01  1.50      4 0.08416667 0.06247218
## 104 1e+02  1.50      4 0.08416667 0.06247218
## 105 1e+03  1.50      4 0.08416667 0.06247218
## 106 1e-01  2.00      4 0.58173077 0.04740051
## 107 1e+00  2.00      4 0.11474359 0.06630201
## 108 1e+01  2.00      4 0.11474359 0.06630201
## 109 1e+02  2.00      4 0.11474359 0.06630201
## 110 1e+03  2.00      4 0.11474359 0.06630201
## 111 1e-01  2.50      4 0.58173077 0.04740051
## 112 1e+00  2.50      4 0.30897436 0.17999927
## 113 1e+01  2.50      4 0.28083333 0.16358214
## 114 1e+02  2.50      4 0.28083333 0.16358214
## 115 1e+03  2.50      4 0.28083333 0.16358214
## 116 1e-01  3.00      4 0.58173077 0.04740051
## 117 1e+00  3.00      4 0.42878205 0.17823496
## 118 1e+01  3.00      4 0.40839744 0.18573046
## 119 1e+02  3.00      4 0.40839744 0.18573046
## 120 1e+03  3.00      4 0.40839744 0.18573046

tune.out$best.parameters

##   cost gamma degree
## 4  100  0.01      2

which.min(tune.out$performances$error)

## [1] 4

the best parameters for cost = 100, gamma = 0.01, degree = 2 fro radial kernel using 10-fold cross validation.

(d) Make some plots to back up your assertions in (b) and (c).

svm.linear = svm(new.var ~., data = Auto, kernel = "linear", cost = 1)
svm.poly = svm(new.var ~., data = Auto, kernel = "polynomial", cost = 1, gamma = 0.5, degree = 2)
svm.radial = svm(new.var ~., data = Auto, kernel = "radial", cost = 100, gamma = 0.01, degree = 2)

plot(svm.linear, Auto, weight ~ horsepower)

plot(svm.linear, Auto, weight ~ acceleration)

plot(svm.linear, Auto, mpg ~ displacement)

plot(svm.poly, Auto, weight ~ horsepower)

plot(svm.poly, Auto, weight ~ acceleration)

plot(svm.poly, Auto, mpg ~ displacement)

plot(svm.radial, Auto, weight ~ horsepower)

plot(svm.radial, Auto, weight ~ acceleration)

plot(svm.radial, Auto, mpg ~ displacement)

Exercise 8

This problem involves the OJ data set which is part of the ISLR package.

(a) Create a training set containing a random sample of 800 observations, and a test set containing the remaining observations.

detach(Auto)
attach(OJ)
str(OJ)

## 'data.frame':    1070 obs. of  18 variables:
##  $ Purchase      : Factor w/ 2 levels "CH","MM": 1 1 1 2 1 1 1 1 1 1 ...
##  $ WeekofPurchase: num  237 239 245 227 228 230 232 234 235 238 ...
##  $ StoreID       : num  1 1 1 1 7 7 7 7 7 7 ...
##  $ PriceCH       : num  1.75 1.75 1.86 1.69 1.69 1.69 1.69 1.75 1.75 1.75 ...
##  $ PriceMM       : num  1.99 1.99 2.09 1.69 1.69 1.99 1.99 1.99 1.99 1.99 ...
##  $ DiscCH        : num  0 0 0.17 0 0 0 0 0 0 0 ...
##  $ DiscMM        : num  0 0.3 0 0 0 0 0.4 0.4 0.4 0.4 ...
##  $ SpecialCH     : num  0 0 0 0 0 0 1 1 0 0 ...
##  $ SpecialMM     : num  0 1 0 0 0 1 1 0 0 0 ...
##  $ LoyalCH       : num  0.5 0.6 0.68 0.4 0.957 ...
##  $ SalePriceMM   : num  1.99 1.69 2.09 1.69 1.69 1.99 1.59 1.59 1.59 1.59 ...
##  $ SalePriceCH   : num  1.75 1.75 1.69 1.69 1.69 1.69 1.69 1.75 1.75 1.75 ...
##  $ PriceDiff     : num  0.24 -0.06 0.4 0 0 0.3 -0.1 -0.16 -0.16 -0.16 ...
##  $ Store7        : Factor w/ 2 levels "No","Yes": 1 1 1 1 2 2 2 2 2 2 ...
##  $ PctDiscMM     : num  0 0.151 0 0 0 ...
##  $ PctDiscCH     : num  0 0 0.0914 0 0 ...
##  $ ListPriceDiff : num  0.24 0.24 0.23 0 0 0.3 0.3 0.24 0.24 0.24 ...
##  $ STORE         : num  1 1 1 1 0 0 0 0 0 0 ...

set.seed(123)
index = sample(nrow(OJ), 800)
train.OJ = OJ[index,]
test.OJ = OJ[-index,]

(b) Fit a support vector classifier to the training data using cost=0.01, with Purchase as the response and the other variables as predictors. Use the summary() function to produce summary statistics, and describe the results obtained.

svm.OJ.linear = svm(Purchase ~., data = OJ, kernel = "linear",cost = 0.01)
summary(svm.OJ.linear)

## 
## Call:
## svm(formula = Purchase ~ ., data = OJ, kernel = "linear", cost = 0.01)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  linear 
##        cost:  0.01 
## 
## Number of Support Vectors:  560
## 
##  ( 279 281 )
## 
## 
## Number of Classes:  2 
## 
## Levels: 
##  CH MM

The response variable in the above model is purchase, the type of Orange juice customers purchased. Linear kernel of SVM generated 560 support vectors of 800 observations of which 279 belong to CH class and 281 belong to MM class.

(c) What are the training and test error rates?

train.OJ.pred = predict(svm.OJ.linear, train.OJ)
table(train.OJ.pred, train.OJ$Purchase)

##              
## train.OJ.pred  CH  MM
##            CH 431  74
##            MM  56 239

linear.train.err = (56+74)/(431+74+56+239)
linear.train.err

## [1] 0.1625

test.OJ.pred = predict(svm.OJ.linear, test.OJ)
table(test.OJ.pred, test.OJ$Purchase)

##             
## test.OJ.pred  CH  MM
##           CH 145  26
##           MM  21  78

linear.test.err = (21+26)/(145+26+21+78)
linear.test.err

## [1] 0.1740741

The training and the testing errors for linear kernel with cost parameter = 0.01, is 16.25% and 17.4% respectively

(d) Use the tune() function to select an optimal cost. Consider values in the range 0.01 to 10.

OJ.linear.tune = tune.svm(Purchase ~., data = train.OJ, kernel = "linear", cost = seq(0.01,10, by = 0.1))
summary(OJ.linear.tune)

## 
## Parameter tuning of 'svm':
## 
## - sampling method: 10-fold cross validation 
## 
## - best parameters:
##  cost
##  2.21
## 
## - best performance: 0.1625 
## 
## - Detailed performance results:
##     cost   error dispersion
## 1   0.01 0.17625 0.03143004
## 2   0.11 0.17375 0.03408018
## 3   0.21 0.17500 0.03435921
## 4   0.31 0.17250 0.03574602
## 5   0.41 0.17125 0.03438447
## 6   0.51 0.17125 0.03438447
## 7   0.61 0.17000 0.03343734
## 8   0.71 0.17000 0.03496029
## 9   0.81 0.16875 0.03596391
## 10  0.91 0.16875 0.03596391
## 11  1.01 0.16875 0.03596391
## 12  1.11 0.16875 0.03596391
## 13  1.21 0.16750 0.03593976
## 14  1.31 0.16875 0.03596391
## 15  1.41 0.16875 0.03596391
## 16  1.51 0.16875 0.03596391
## 17  1.61 0.16875 0.03596391
## 18  1.71 0.16875 0.03596391
## 19  1.81 0.16750 0.03593976
## 20  1.91 0.16750 0.03593976
## 21  2.01 0.16625 0.03634805
## 22  2.11 0.16500 0.03809710
## 23  2.21 0.16250 0.03773077
## 24  2.31 0.16500 0.03670453
## 25  2.41 0.16375 0.03408018
## 26  2.51 0.16250 0.03118048
## 27  2.61 0.16375 0.03197764
## 28  2.71 0.16375 0.03197764
## 29  2.81 0.16375 0.03197764
## 30  2.91 0.16500 0.03162278
## 31  3.01 0.16500 0.02934469
## 32  3.11 0.16500 0.02934469
## 33  3.21 0.16500 0.02934469
## 34  3.31 0.16500 0.02934469
## 35  3.41 0.16500 0.02934469
## 36  3.51 0.16500 0.02934469
## 37  3.61 0.16500 0.02934469
## 38  3.71 0.16500 0.02934469
## 39  3.81 0.16500 0.02934469
## 40  3.91 0.16625 0.03007514
## 41  4.01 0.16625 0.03007514
## 42  4.11 0.16625 0.03007514
## 43  4.21 0.16750 0.02898755
## 44  4.31 0.16750 0.02898755
## 45  4.41 0.16750 0.02898755
## 46  4.51 0.16625 0.03007514
## 47  4.61 0.16500 0.03106892
## 48  4.71 0.16500 0.03106892
## 49  4.81 0.16500 0.03106892
## 50  4.91 0.16500 0.03106892
## 51  5.01 0.16625 0.03007514
## 52  5.11 0.16500 0.03106892
## 53  5.21 0.16500 0.03106892
## 54  5.31 0.16500 0.03106892
## 55  5.41 0.16500 0.03106892
## 56  5.51 0.16500 0.03106892
## 57  5.61 0.16375 0.02972676
## 58  5.71 0.16375 0.02972676
## 59  5.81 0.16375 0.02972676
## 60  5.91 0.16375 0.02972676
## 61  6.01 0.16375 0.02972676
## 62  6.11 0.16500 0.02874698
## 63  6.21 0.16625 0.02766993
## 64  6.31 0.16625 0.02949223
## 65  6.41 0.16625 0.02949223
## 66  6.51 0.16625 0.02949223
## 67  6.61 0.16625 0.02949223
## 68  6.71 0.16625 0.02949223
## 69  6.81 0.16750 0.02898755
## 70  6.91 0.16750 0.02898755
## 71  7.01 0.16750 0.02898755
## 72  7.11 0.17000 0.02776389
## 73  7.21 0.17000 0.02776389
## 74  7.31 0.17000 0.02776389
## 75  7.41 0.17000 0.02776389
## 76  7.51 0.17000 0.02776389
## 77  7.61 0.17000 0.02776389
## 78  7.71 0.17000 0.02776389
## 79  7.81 0.17125 0.02829041
## 80  7.91 0.17125 0.02829041
## 81  8.01 0.17125 0.02829041
## 82  8.11 0.17125 0.02829041
## 83  8.21 0.17125 0.02829041
## 84  8.31 0.17125 0.02829041
## 85  8.41 0.17125 0.02829041
## 86  8.51 0.17125 0.02829041
## 87  8.61 0.17125 0.02829041
## 88  8.71 0.17125 0.02829041
## 89  8.81 0.17125 0.02829041
## 90  8.91 0.17125 0.02829041
## 91  9.01 0.17125 0.02829041
## 92  9.11 0.17125 0.02829041
## 93  9.21 0.17125 0.02829041
## 94  9.31 0.17125 0.02829041
## 95  9.41 0.17125 0.02829041
## 96  9.51 0.17125 0.02829041
## 97  9.61 0.17250 0.02751262
## 98  9.71 0.17250 0.02751262
## 99  9.81 0.17250 0.02751262
## 100 9.91 0.17250 0.02751262

The optimal value for cost is 2.21 with lowest cross-validation error rate of 0.162

(e) Compute the training and test error rates using this new value for cost.

svm.OJ.linear.tuned = svm(Purchase ~., data = train.OJ, kernel = "linear", cost = 2.21)
train.OJ.pred = predict(svm.OJ.linear.tuned, train.OJ)
table(train.OJ.pred, train.OJ$Purchase)

##              
## train.OJ.pred  CH  MM
##            CH 428  68
##            MM  59 245

linear.tuned.train.err = (59+68)/(428+68+59+245)
linear.tuned.train.err

## [1] 0.15875

test.OJ.pred = predict(svm.OJ.linear.tuned, test.OJ)
table(test.OJ.pred, test.OJ$Purchase)

##             
## test.OJ.pred  CH  MM
##           CH 149  25
##           MM  17  79

linear.tuned.test.err = (17+25)/(149+25+17+79)
linear.tuned.test.err

## [1] 0.1555556

The train and test error for tuned Linear svm model are 15.8% and 15.5% that less than that of of the untuned model.

(f) Repeat parts (b) through (e) using a support vector machine with a radial kernel. Use the default value for gamma.

set.seed(1)
svm.OJ.radial = svm(Purchase ~., data = OJ, kernel = "radial")
summary(svm.OJ.radial)

## 
## Call:
## svm(formula = Purchase ~ ., data = OJ, kernel = "radial")
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  radial 
##        cost:  1 
## 
## Number of Support Vectors:  485
## 
##  ( 245 240 )
## 
## 
## Number of Classes:  2 
## 
## Levels: 
##  CH MM

train.OJ.pred = predict(svm.OJ.radial, train.OJ)
table(train.OJ.pred, train.OJ$Purchase)

##              
## train.OJ.pred  CH  MM
##            CH 448  74
##            MM  39 239

radial.train.err = (39+74)/(448+74+39+239)
radial.train.err

## [1] 0.14125

test.OJ.pred = predict(svm.OJ.radial, test.OJ)
table(test.OJ.pred, test.OJ$Purchase)

##             
## test.OJ.pred  CH  MM
##           CH 149  34
##           MM  17  70

radial.test.err = (17+34)/(149+34+17+70)
radial.test.err

## [1] 0.1888889

The radial kernel generated 485 support vectors with 245 to CH and 240 to MM class. The training error of untuned radial kernel are 14.12% which is an improvemnt from tuned linear kernel but the testing error is 18.8% that is drop from tuned linear svm model.

tuning of radial Kernel

set.seed(1)
radial.tune.out = tune.svm(Purchase~., data = train.OJ, kernel = "radial", cost = seq(.01, 10, by =.1))
summary(radial.tune.out)

## 
## Parameter tuning of 'svm':
## 
## - sampling method: 10-fold cross validation 
## 
## - best parameters:
##  cost
##  1.91
## 
## - best performance: 0.16 
## 
## - Detailed performance results:
##     cost   error dispersion
## 1   0.01 0.39125 0.04752558
## 2   0.11 0.18000 0.03129164
## 3   0.21 0.16750 0.03496029
## 4   0.31 0.16625 0.03120831
## 5   0.41 0.16625 0.03064696
## 6   0.51 0.16500 0.03106892
## 7   0.61 0.16375 0.03087272
## 8   0.71 0.16250 0.03227486
## 9   0.81 0.16250 0.03227486
## 10  0.91 0.16250 0.03280837
## 11  1.01 0.16250 0.03280837
## 12  1.11 0.16000 0.03425801
## 13  1.21 0.16000 0.03425801
## 14  1.31 0.16000 0.03162278
## 15  1.41 0.16125 0.03304563
## 16  1.51 0.16250 0.03227486
## 17  1.61 0.16250 0.03061862
## 18  1.71 0.16250 0.03061862
## 19  1.81 0.16250 0.03061862
## 20  1.91 0.16000 0.03050501
## 21  2.01 0.16250 0.02946278
## 22  2.11 0.16250 0.02946278
## 23  2.21 0.16250 0.02946278
## 24  2.31 0.16125 0.03030516
## 25  2.41 0.16125 0.03030516
## 26  2.51 0.16125 0.03143004
## 27  2.61 0.16125 0.03143004
## 28  2.71 0.16125 0.03356689
## 29  2.81 0.16375 0.03408018
## 30  2.91 0.16375 0.03408018
## 31  3.01 0.16375 0.03408018
## 32  3.11 0.16500 0.03476109
## 33  3.21 0.16500 0.03476109
## 34  3.31 0.16500 0.03476109
## 35  3.41 0.16500 0.03476109
## 36  3.51 0.16500 0.03476109
## 37  3.61 0.16375 0.03701070
## 38  3.71 0.16250 0.03818813
## 39  3.81 0.16250 0.03818813
## 40  3.91 0.16250 0.03818813
## 41  4.01 0.16250 0.03818813
## 42  4.11 0.16250 0.03818813
## 43  4.21 0.16250 0.03818813
## 44  4.31 0.16250 0.03818813
## 45  4.41 0.16125 0.03793727
## 46  4.51 0.16125 0.03793727
## 47  4.61 0.16125 0.03793727
## 48  4.71 0.16125 0.03793727
## 49  4.81 0.16125 0.03793727
## 50  4.91 0.16125 0.03793727
## 51  5.01 0.16125 0.03793727
## 52  5.11 0.16125 0.03793727
## 53  5.21 0.16375 0.03793727
## 54  5.31 0.16250 0.03908680
## 55  5.41 0.16250 0.03908680
## 56  5.51 0.16375 0.03793727
## 57  5.61 0.16500 0.03763863
## 58  5.71 0.16500 0.03763863
## 59  5.81 0.16500 0.03763863
## 60  5.91 0.16500 0.03717451
## 61  6.01 0.16500 0.03717451
## 62  6.11 0.16500 0.03717451
## 63  6.21 0.16500 0.03717451
## 64  6.31 0.16625 0.03537988
## 65  6.41 0.16625 0.03537988
## 66  6.51 0.16750 0.03689324
## 67  6.61 0.16750 0.03689324
## 68  6.71 0.16875 0.03691676
## 69  6.81 0.17000 0.03641962
## 70  6.91 0.17000 0.03641962
## 71  7.01 0.17000 0.03641962
## 72  7.11 0.17000 0.03641962
## 73  7.21 0.17000 0.03641962
## 74  7.31 0.17000 0.03641962
## 75  7.41 0.17125 0.03488573
## 76  7.51 0.17125 0.03488573
## 77  7.61 0.17125 0.03488573
## 78  7.71 0.17125 0.03488573
## 79  7.81 0.17125 0.03488573
## 80  7.91 0.17125 0.03488573
## 81  8.01 0.17125 0.03488573
## 82  8.11 0.17125 0.03488573
## 83  8.21 0.17000 0.03238227
## 84  8.31 0.17000 0.03238227
## 85  8.41 0.17000 0.03238227
## 86  8.51 0.17000 0.03238227
## 87  8.61 0.17000 0.03238227
## 88  8.71 0.17000 0.03238227
## 89  8.81 0.16875 0.03397814
## 90  8.91 0.16875 0.03397814
## 91  9.01 0.16875 0.03397814
## 92  9.11 0.16875 0.03397814
## 93  9.21 0.16875 0.03346329
## 94  9.31 0.16750 0.03545341
## 95  9.41 0.16750 0.03545341
## 96  9.51 0.16750 0.03545341
## 97  9.61 0.16750 0.03545341
## 98  9.71 0.16750 0.03545341
## 99  9.81 0.16625 0.03634805
## 100 9.91 0.16625 0.03634805

set.seed(1)
svm.OJ.radial.tuned = svm(Purchase~., data = train.OJ, kernel = "radial",cost = radial.tune.out$best.parameters$cost)
summary(svm.OJ.radial.tuned)

## 
## Call:
## svm(formula = Purchase ~ ., data = train.OJ, kernel = "radial", cost = radial.tune.out$best.parameters$cost)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  radial 
##        cost:  1.91 
## 
## Number of Support Vectors:  349
## 
##  ( 171 178 )
## 
## 
## Number of Classes:  2 
## 
## Levels: 
##  CH MM

train.OJ.pred = predict(svm.OJ.radial.tuned, train.OJ)
table(train.OJ.pred, train.OJ$Purchase)

##              
## train.OJ.pred  CH  MM
##            CH 446  69
##            MM  41 244

radial.tuned.train.err = (41+69)/(800)
radial.tuned.train.err

## [1] 0.1375

test.OJ.pred = predict(svm.OJ.radial, test.OJ)
table(test.OJ.pred, test.OJ$Purchase)

##             
## test.OJ.pred  CH  MM
##           CH 149  34
##           MM  17  70

radial.tuned.test.err = (17+34)/(270)
radial.tuned.test.err

## [1] 0.1888889

(g) Repeat parts (b) through (e) using a support vector machine with a polynomial kernel. Set degree=2.

set.seed(1)
svm.poly.OJ = svm(Purchase~., kernel = 'polynomial', degree=2, data = train.OJ)
summary(svm.poly.OJ)

## 
## Call:
## svm(formula = Purchase ~ ., data = train.OJ, kernel = "polynomial", 
##     degree = 2)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  polynomial 
##        cost:  1 
##      degree:  2 
##      coef.0:  0 
## 
## Number of Support Vectors:  445
## 
##  ( 219 226 )
## 
## 
## Number of Classes:  2 
## 
## Levels: 
##  CH MM

train.pred.OJ = predict(svm.poly.OJ, train.OJ)
table(train.OJ$Purchase, train.pred.OJ)

##     train.pred.OJ
##       CH  MM
##   CH 454  33
##   MM 105 208

poly.train.err = (105+33)/(800)
poly.train.err

## [1] 0.1725

test.OJ.pred = predict(svm.poly.OJ, test.OJ)
table(test.OJ.pred, test.OJ$Purchase)

##             
## test.OJ.pred  CH  MM
##           CH 153  47
##           MM  13  57

poly.test.err = (13+47)/(270)
poly.test.err

## [1] 0.2222222

set.seed(1)
tuned.svm.OJ = tune.svm(Purchase~., data = train.OJ, kernel = "polynomial", degree=2, cost = seq(.01, 10, by =.1))

summary(tuned.svm.OJ)

## 
## Parameter tuning of 'svm':
## 
## - sampling method: 10-fold cross validation 
## 
## - best parameters:
##  degree cost
##       2 8.81
## 
## - best performance: 0.1625 
## 
## - Detailed performance results:
##     degree cost   error dispersion
## 1        2 0.01 0.39250 0.04533824
## 2        2 0.11 0.30250 0.02993047
## 3        2 0.21 0.22250 0.03106892
## 4        2 0.31 0.21000 0.03763863
## 5        2 0.41 0.19625 0.03866254
## 6        2 0.51 0.19250 0.03961621
## 7        2 0.61 0.19125 0.03821086
## 8        2 0.71 0.18500 0.03162278
## 9        2 0.81 0.18750 0.03061862
## 10       2 0.91 0.18625 0.03606033
## 11       2 1.01 0.18500 0.03425801
## 12       2 1.11 0.18000 0.03446012
## 13       2 1.21 0.18000 0.03446012
## 14       2 1.31 0.17875 0.03283481
## 15       2 1.41 0.18000 0.03446012
## 16       2 1.51 0.18125 0.03346329
## 17       2 1.61 0.18125 0.03294039
## 18       2 1.71 0.17875 0.03634805
## 19       2 1.81 0.17750 0.03476109
## 20       2 1.91 0.17875 0.03634805
## 21       2 2.01 0.18000 0.03827895
## 22       2 2.11 0.18000 0.04174992
## 23       2 2.21 0.17625 0.03839216
## 24       2 2.31 0.17750 0.03670453
## 25       2 2.41 0.17500 0.03535534
## 26       2 2.51 0.17750 0.03322900
## 27       2 2.61 0.17375 0.03606033
## 28       2 2.71 0.17500 0.03486083
## 29       2 2.81 0.17375 0.03458584
## 30       2 2.91 0.17375 0.03458584
## 31       2 3.01 0.17250 0.03322900
## 32       2 3.11 0.17250 0.03322900
## 33       2 3.21 0.17250 0.03322900
## 34       2 3.31 0.17375 0.03458584
## 35       2 3.41 0.17375 0.03458584
## 36       2 3.51 0.17375 0.03458584
## 37       2 3.61 0.17375 0.03197764
## 38       2 3.71 0.17375 0.03356689
## 39       2 3.81 0.17375 0.03408018
## 40       2 3.91 0.17375 0.03408018
## 41       2 4.01 0.17500 0.03227486
## 42       2 4.11 0.17500 0.03227486
## 43       2 4.21 0.17500 0.03227486
## 44       2 4.31 0.17375 0.03143004
## 45       2 4.41 0.17375 0.03143004
## 46       2 4.51 0.17125 0.03120831
## 47       2 4.61 0.17125 0.03175973
## 48       2 4.71 0.17250 0.03216710
## 49       2 4.81 0.17375 0.03087272
## 50       2 4.91 0.17500 0.03173239
## 51       2 5.01 0.17000 0.03343734
## 52       2 5.11 0.17000 0.03343734
## 53       2 5.21 0.17000 0.03343734
## 54       2 5.31 0.17250 0.03162278
## 55       2 5.41 0.17250 0.03162278
## 56       2 5.51 0.17250 0.03162278
## 57       2 5.61 0.17250 0.03322900
## 58       2 5.71 0.17375 0.03251602
## 59       2 5.81 0.17250 0.03162278
## 60       2 5.91 0.17375 0.03251602
## 61       2 6.01 0.17125 0.03335936
## 62       2 6.11 0.17125 0.03335936
## 63       2 6.21 0.17125 0.03335936
## 64       2 6.31 0.17000 0.03291403
## 65       2 6.41 0.17000 0.03291403
## 66       2 6.51 0.17000 0.03291403
## 67       2 6.61 0.16750 0.03343734
## 68       2 6.71 0.16875 0.03186887
## 69       2 6.81 0.16875 0.03186887
## 70       2 6.91 0.16875 0.03186887
## 71       2 7.01 0.16750 0.03129164
## 72       2 7.11 0.16750 0.03129164
## 73       2 7.21 0.16750 0.03129164
## 74       2 7.31 0.16750 0.03129164
## 75       2 7.41 0.16625 0.02949223
## 76       2 7.51 0.16500 0.02934469
## 77       2 7.61 0.16500 0.02934469
## 78       2 7.71 0.16500 0.02934469
## 79       2 7.81 0.16625 0.03120831
## 80       2 7.91 0.16625 0.03120831
## 81       2 8.01 0.16625 0.03120831
## 82       2 8.11 0.16625 0.03120831
## 83       2 8.21 0.16625 0.03120831
## 84       2 8.31 0.16625 0.03120831
## 85       2 8.41 0.16625 0.03120831
## 86       2 8.51 0.16625 0.03120831
## 87       2 8.61 0.16500 0.02874698
## 88       2 8.71 0.16375 0.02853482
## 89       2 8.81 0.16250 0.02886751
## 90       2 8.91 0.16250 0.02886751
## 91       2 9.01 0.16250 0.02886751
## 92       2 9.11 0.16250 0.02886751
## 93       2 9.21 0.16375 0.02853482
## 94       2 9.31 0.16375 0.02853482
## 95       2 9.41 0.16375 0.02853482
## 96       2 9.51 0.16500 0.02874698
## 97       2 9.61 0.16500 0.02874698
## 98       2 9.71 0.16625 0.02829041
## 99       2 9.81 0.16625 0.02829041
## 100      2 9.91 0.16625 0.02829041

set.seed(1)
svm.OJ.poly.tuned = svm(Purchase~., data = train.OJ, kernel = "radial",cost = tuned.svm.OJ$best.parameters$cost, degree = tuned.svm.OJ$best.parameters$degree )
summary(svm.OJ.poly.tuned)

## 
## Call:
## svm(formula = Purchase ~ ., data = train.OJ, kernel = "radial", cost = tuned.svm.OJ$best.parameters$cost, 
##     degree = tuned.svm.OJ$best.parameters$degree)
## 
## 
## Parameters:
##    SVM-Type:  C-classification 
##  SVM-Kernel:  radial 
##        cost:  8.81 
## 
## Number of Support Vectors:  324
## 
##  ( 157 167 )
## 
## 
## Number of Classes:  2 
## 
## Levels: 
##  CH MM

train.OJ.pred = predict(svm.OJ.poly.tuned, train.OJ)
table(train.OJ.pred, train.OJ$Purchase)

##              
## train.OJ.pred  CH  MM
##            CH 451  70
##            MM  36 243

poly.tuned.train.err = (36+70)/(800)
poly.tuned.train.err

## [1] 0.1325

test.OJ.pred = predict(svm.OJ.poly.tuned, test.OJ)
table(test.OJ.pred, test.OJ$Purchase)

##             
## test.OJ.pred  CH  MM
##           CH 145  34
##           MM  21  70

poly.tuned.test.err = (21+34)/(270)
poly.tuned.test.err

## [1] 0.2037037

(h) Overall, which approach seems to give the best results on this data?

df = data.frame(matrix(c(linear.tuned.train.err, linear.tuned.test.err, radial.tuned.train.err,radial.tuned.test.err, poly.tuned.train.err,poly.tuned.test.err),
                       nrow = 2, ncol = 3))
rownames(df) = c("train error", "test error")
colnames(df) = c("Linear", "Radial", "Polynomial")
df

##                Linear    Radial Polynomial
## train error 0.1587500 0.1375000  0.1325000
## test error  0.1555556 0.1888889  0.2037037