set.seed(2021)
x1 <- runif(500) - 0.5
x2 <- runif(500) - 0.5
y <- 1*(x1^2 - x2^2 > 0)
df5 <- data.frame(y = as.factor(y), x1 = x1, x2 = x2)
We plot the observations, with X1 on the x-axis and X2 on the y-axis.
plot(x1[y==0], x2[y==0], col="blue", main = "X2 and X1", xlab="X1", ylab="X2", pch=20)
points(x1[y==1], x2[y==1], col="green", pch=18)
Next we fit a logistic regression model to the data, using X1 and X2 as predictors.
set.seed(2021)
glm.model <- glm(as.factor(y) ~ x1 + x2, data = df5, family = binomial)
summary(glm.model)
##
## Call:
## glm(formula = as.factor(y) ~ x1 + x2, family = binomial, data = df5)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.270 -1.155 -1.057 1.179 1.319
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.05235 0.08971 -0.584 0.560
## x1 -0.36740 0.30949 -1.187 0.235
## x2 -0.25117 0.30422 -0.826 0.409
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 692.86 on 499 degrees of freedom
## Residual deviance: 690.88 on 497 degrees of freedom
## AIC: 696.88
##
## Number of Fisher Scoring iterations: 3
We fail to reject the null hypothesis for either variable: there is not statistically significant evidence that the coefficients of x1 or x2 differ from zero.
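As a quick programmatic check, we can pull the p-values straight from the fitted model; a minimal sketch using the glm.model object above:
# Extract the p-value column from the coefficient table; none fall below 0.05
coef(summary(glm.model))[, "Pr(>|z|)"]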
set.seed(2021)
glm.probs <- predict(glm.model, newdata = df5, type = 'response')
glm.preds <- ifelse(glm.probs >= 0.5, 1, 0)
glm.pos <- df5[glm.preds == 1, ]
glm.neg <- df5[glm.preds == 0, ]
plot(glm.pos$x1, glm.pos$x2, main = "Predicted X1 and X2",
xlab = "X1", ylab = "X2", col = "blue", pch = 20)
points(glm.neg$x1, glm.neg$x2, col = "red", pch = 18)
I went with something like this: \(y = x_1x_2 + x_1^2 + \log{x_2}\)
glm.model2 <- glm(y ~ I(x1 * x2) + poly(x1,2) + log(x2), data = df5, family = 'binomial')
summary(glm.model2)
##
## Call:
## glm(formula = y ~ I(x1 * x2) + poly(x1, 2) + log(x2), family = "binomial",
## data = df5)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.77449 -0.13673 -0.01565 0.10648 1.83928
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -8.7374 1.4975 -5.835 5.39e-09 ***
## I(x1 * x2) 7.5721 9.0886 0.833 0.405
## poly(x1, 2)1 -24.9740 20.7148 -1.206 0.228
## poly(x1, 2)2 99.0604 15.7496 6.290 3.18e-10 ***
## log(x2) -5.6843 0.9657 -5.886 3.95e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 341.730 on 246 degrees of freedom
## Residual deviance: 81.423 on 242 degrees of freedom
## (253 observations deleted due to missingness)
## AIC: 91.423
##
## Number of Fisher Scoring iterations: 8
Looks like the 2nd-degree polynomial of X1 is statistically significant, as is the log-transformed X2. Note that \(\log{x_2}\) is undefined for \(x_2 \le 0\), which is why R reports 253 observations deleted due to missingness; roughly half the data is dropped, since x2 is uniform on (-0.5, 0.5).
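We can sanity-check that count directly; a one-liner against the df5 data frame above:
# x2 <= 0 makes log(x2) undefined (NaN or -Inf), so glm() drops those rows
sum(df5$x2 <= 0)  # should equal the 253 deleted observations reported above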
set.seed(2021)
glm.probs.2 <- predict(glm.model2, newdata = df5, type = 'response')
glm.preds.2 <- ifelse(glm.probs.2 >= 0.5, 1, 0)
glm.pos.2 <- df5[glm.preds.2 == 1, ]
glm.neg.2 <- df5[glm.preds.2 == 0, ]
plot(glm.pos.2$x1, glm.pos.2$x2, main = "Predicted X1 and X2",
xlab = "X1", ylab = "X2", col = "blue", pch = 20)
points(glm.neg.2$x1, glm.neg.2$x2, col = "red", pch = 18)
A clearly non-linear structure is apparent.
set.seed(2021)
train_control <- trainControl(method="repeatedcv", number=10, repeats=3)
svm.model <- train(y ~ ., data = df5, method = "svmLinear", trControl = train_control, preProcess = c("center","scale"))
svm.preds <- predict(svm.model, df5)
svm.pos <- df5[svm.preds == 1, ]
svm.neg <- df5[svm.preds == 0, ]
plot(svm.pos$x1, svm.pos$x2, main = "SVM X1 and X2", xlab = "X1", ylab = "X2", col = "blue", pch = 20)
points(svm.neg$x1, svm.neg$x2, col = "red", pch = 18)
The linear kernel has “opportunities for improvement” in this case. A nonlinear kernel would be more appropriate.
set.seed(2021)
svm.model.nl <- train(y ~ ., data = df5, method = "svmPoly", trControl = train_control, preProcess = c("center","scale"), tuneLength = 4)
svm.preds.nl <- predict(svm.model.nl, df5)
svm.pos.nl <- df5[svm.preds.nl == 1, ]
svm.neg.nl <- df5[svm.preds.nl == 0, ]
plot(svm.pos.nl$x1, svm.pos.nl$x2, main = "SVM-Poly X1 and X2", xlab = "X1", ylab = "X2", col = "blue", pch = 20)
points(svm.neg.nl$x1, svm.neg.nl$x2, col = "red", pch = 18)
The svmPoly kernel produces a predictions plot that looks much closer to the true boundary.
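To compare against the true boundary directly, one option is to classify a dense grid of points and overlay the lines where \(x_1^2 = x_2^2\); a sketch reusing svm.model.nl, with an arbitrary grid resolution:
# Classify a grid of (x1, x2) values with the polynomial SVM
grid <- expand.grid(x1 = seq(-0.5, 0.5, length.out = 100),
                    x2 = seq(-0.5, 0.5, length.out = 100))
grid.preds <- predict(svm.model.nl, newdata = grid)
plot(grid$x1, grid$x2, col = ifelse(grid.preds == 1, "blue", "red"),
     pch = ".", xlab = "X1", ylab = "X2", main = "svmPoly grid predictions")
abline(0, 1, lty = 2)   # true boundary: x2 = x1
abline(0, -1, lty = 2)  # true boundary: x2 = -x1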
We now turn to the Auto data set and create a binary variable indicating whether a car's gas mileage is above the median.
auto <- Auto
auto$mpg.bin <- as.factor(ifelse(auto$mpg > median(auto$mpg), 1, 0))
#str(auto$mpg.bin)
set.seed(2021)
tune.7b <- tune(svm, mpg.bin ~ ., data = auto, kernel = 'linear', ranges = list(cost = c(0.001, 0.01, 0.1, 1, 10, 50, 100)))
summary(tune.7b)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost
## 1
##
## - best performance: 0.01262821
##
## - Detailed performance results:
## cost error dispersion
## 1 1e-03 0.09429487 0.04769741
## 2 1e-02 0.07384615 0.05674660
## 3 1e-01 0.05333333 0.05252135
## 4 1e+00 0.01262821 0.01778017
## 5 1e+01 0.02025641 0.02319375
## 6 5e+01 0.03307692 0.02087291
## 7 1e+02 0.03307692 0.02087291
tune.7b$best.parameters
## cost
## 4 1
The best.parameters output indicates that a cost of 1 gives the best performance, i.e. the lowest cross-validation error, approximately 0.0126.
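tune() also stores the model refit at the winning cost; a quick way to inspect it:
# The tune object keeps the refit best model in $best.model
summary(tune.7b$best.model)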
set.seed(2021)
tune.7c.radial <- tune(svm, mpg.bin ~ ., data = auto, kernel = 'radial', ranges = list(cost = c(0.001, 0.01, 0.1, 1, 10, 50, 100), gamma = c(0.01, 0.1, 1, 5, 10, 20)))
summary(tune.7c.radial)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost gamma
## 50 0.01
##
## - best performance: 0.01519231
##
## - Detailed performance results:
## cost gamma error dispersion
## 1 1e-03 0.01 0.57153846 0.03170929
## 2 1e-02 0.01 0.57153846 0.03170929
## 3 1e-01 0.01 0.08147436 0.05823146
## 4 1e+00 0.01 0.07384615 0.05674660
## 5 1e+01 0.01 0.03038462 0.03886211
## 6 5e+01 0.01 0.01519231 0.01760469
## 7 1e+02 0.01 0.01519231 0.01760469
## 8 1e-03 0.10 0.57153846 0.03170929
## 9 1e-02 0.10 0.19615385 0.09005289
## 10 1e-01 0.10 0.07891026 0.06023856
## 11 1e+00 0.10 0.05339744 0.04850008
## 12 1e+01 0.10 0.02782051 0.03002300
## 13 5e+01 0.10 0.02782051 0.02748232
## 14 1e+02 0.10 0.02775641 0.03423881
## 15 1e-03 1.00 0.57153846 0.03170929
## 16 1e-02 1.00 0.57153846 0.03170929
## 17 1e-01 1.00 0.57153846 0.03170929
## 18 1e+00 1.00 0.05602564 0.06451378
## 19 1e+01 1.00 0.05858974 0.06481316
## 20 5e+01 1.00 0.05858974 0.06481316
## 21 1e+02 1.00 0.05858974 0.06481316
## 22 1e-03 5.00 0.57153846 0.03170929
## 23 1e-02 5.00 0.57153846 0.03170929
## 24 1e-01 5.00 0.57153846 0.03170929
## 25 1e+00 5.00 0.51288462 0.04866961
## 26 1e+01 5.00 0.50775641 0.04837604
## 27 5e+01 5.00 0.50775641 0.04837604
## 28 1e+02 5.00 0.50775641 0.04837604
## 29 1e-03 10.00 0.57153846 0.03170929
## 30 1e-02 10.00 0.57153846 0.03170929
## 31 1e-01 10.00 0.57153846 0.03170929
## 32 1e+00 10.00 0.53839744 0.04401964
## 33 1e+01 10.00 0.52564103 0.04349852
## 34 5e+01 10.00 0.52564103 0.04349852
## 35 1e+02 10.00 0.52564103 0.04349852
## 36 1e-03 20.00 0.57153846 0.03170929
## 37 1e-02 20.00 0.57153846 0.03170929
## 38 1e-01 20.00 0.57153846 0.03170929
## 39 1e+00 20.00 0.55878205 0.03841171
## 40 1e+01 20.00 0.55365385 0.03520658
## 41 5e+01 20.00 0.55365385 0.03520658
## 42 1e+02 20.00 0.55365385 0.03520658
The radial kernel achieves its lowest cross-validation error (approximately 0.0152) at a cost of 50 and a gamma of 0.01.
set.seed(2021)
tune.7c.poly <- tune(svm, mpg.bin ~ ., data = auto, kernel = 'polynomial', ranges = list(cost = c(0.001, 0.01, 0.1, 1, 10, 50, 100), degree = c(2,4,6)))
summary(tune.7c.poly)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost degree
## 100 2
##
## - best performance: 0.3058974
##
## - Detailed performance results:
## cost degree error dispersion
## 1 1e-03 2 0.5715385 0.03170929
## 2 1e-02 2 0.5715385 0.03170929
## 3 1e-01 2 0.5715385 0.03170929
## 4 1e+00 2 0.5715385 0.03170929
## 5 1e+01 2 0.5535897 0.04001972
## 6 5e+01 2 0.3544231 0.08982010
## 7 1e+02 2 0.3058974 0.09460085
## 8 1e-03 4 0.5715385 0.03170929
## 9 1e-02 4 0.5715385 0.03170929
## 10 1e-01 4 0.5715385 0.03170929
## 11 1e+00 4 0.5715385 0.03170929
## 12 1e+01 4 0.5715385 0.03170929
## 13 5e+01 4 0.5715385 0.03170929
## 14 1e+02 4 0.5715385 0.03170929
## 15 1e-03 6 0.5715385 0.03170929
## 16 1e-02 6 0.5715385 0.03170929
## 17 1e-01 6 0.5715385 0.03170929
## 18 1e+00 6 0.5715385 0.03170929
## 19 1e+01 6 0.5715385 0.03170929
## 20 5e+01 6 0.5715385 0.03170929
## 21 1e+02 6 0.5715385 0.03170929
The polynomial kernel achieves its lowest cross-validation error (approximately 0.3059) at a cost of 100 and a degree of 2.
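To line the three kernels up side by side, we can tabulate the best CV errors (values transcribed from the tune() summaries above):
# Best 10-fold CV error for each kernel, from the tuning runs above
data.frame(kernel = c("linear", "radial", "polynomial"),
           best.cv.error = c(0.01262821, 0.01519231, 0.3058974))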
The exercise hint notes that for a fitted model svmfit and a data frame dat, plot(svmfit, dat, x1 ~ x4) plots just the first and fourth variables, with x1 and x4 replaced by the correct variable names (see ?plot.svm). Let's set up the models with different kernels and hyper-parameters and build a function to plot the SVM model results from the different kernels.
First up is the linear kernel.
set.seed(2021)
svm.7d.linear = svm(mpg.bin ~ ., data = auto, kernel = "linear", cost = 1)
svm.7d.poly = svm(mpg.bin ~ ., data = auto, kernel = "polynomial", cost = 100, degree = 2)
svm.7d.radial = svm(mpg.bin ~ ., data = auto, kernel = "radial", cost = 50, gamma = 0.01)
plotpairs <- function(fit) {
  # plot.svm slices the fitted model along mpg and each remaining predictor;
  # skip mpg itself, the binary response, and the car-name factor
  for (name in names(auto)[!(names(auto) %in% c("mpg", "mpg.bin", "name"))]) {
    plot(fit, auto, as.formula(paste("mpg~", name, sep = "")))
  }
}
plotpairs(svm.7d.linear)
Up next is the polynomial kernel.
plotpairs(svm.7d.poly)
Finally comes the radial kernel.
plotpairs(svm.7d.radial)
We now move to the OJ data set, which is part of the ISLR package.
set.seed(2021)
oj <- OJ
inTrain <- createDataPartition(oj$Purchase, p=0.747, list=FALSE, times = 1)
oj.train <- oj[inTrain, ]
oj.test <- oj[-inTrain, ]
dim(oj.train)
## [1] 800 18
dim(oj.test)
## [1] 270 18
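Since createDataPartition stratifies on the response, the CH/MM proportions should be nearly identical in the two splits; a quick check:
# Class balance should be (approximately) preserved across the split
prop.table(table(oj.train$Purchase))
prop.table(table(oj.test$Purchase))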
We fit a support vector classifier to the training data with cost = 0.01, using Purchase as the response and the other variables as predictors, then use summary() to describe the results.
set.seed(2021)
train_control <- trainControl(method="repeatedcv", number=10, repeats=3)
svm.oj <- train(Purchase ~., data = oj.train, method = "svmLinear", trControl = train_control, preProcess = c("center","scale"), tuneGrid = expand.grid(C = 0.01))
svm.oj
## Support Vector Machines with Linear Kernel
##
## 800 samples
## 17 predictor
## 2 classes: 'CH', 'MM'
##
## Pre-processing: centered (17), scaled (17)
## Resampling: Cross-Validated (10 fold, repeated 3 times)
## Summary of sample sizes: 719, 720, 720, 720, 720, 720, ...
## Resampling results:
##
## Accuracy Kappa
## 0.8262872 0.6279929
##
## Tuning parameter 'C' was held constant at a value of 0.01
svm.oj$finalModel
## Support Vector Machine object of class "ksvm"
##
## SV type: C-svc (classification)
## parameter : cost C = 0.01
##
## Linear (vanilla) kernel function.
##
## Number of Support Vectors : 438
##
## Objective Function Value : -3.9063
## Training error : 0.16875
svm.oj.train.preds <- predict(svm.oj, oj.train)
table(svm.oj.train.preds, oj.train$Purchase)
##
## svm.oj.train.preds CH MM
## CH 436 83
## MM 52 229
The model uses 438 support vectors, a cost of 0.01, and 17 predictors, and has a training accuracy of 83.125%. It correctly predicts 436 CH and 229 MM in the training set.
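For fuller summary statistics (kappa, sensitivity, specificity) in one call, caret's confusionMatrix() can be pointed at the same predictions; a minimal sketch:
# Fuller accuracy statistics from the training confusion matrix
confusionMatrix(data = svm.oj.train.preds, reference = oj.train$Purchase)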
set.seed(2021)
svm.oj.test.preds <- predict(svm.oj, oj.test)
svm.conf.matrix <- table(svm.oj.test.preds, oj.test$Purchase)
svm.conf.matrix
##
## svm.oj.test.preds CH MM
## CH 145 22
## MM 20 83
svm.oj.acc <- sum(diag(svm.conf.matrix))/sum(svm.conf.matrix)
svm.oj.err <- 1 - svm.oj.acc
cat('Overall SVM test accuracy is: ', svm.oj.acc)
## Overall SVM test accuracy is: 0.8444444
cat('\nOverall SVM test error rate is: ', svm.oj.err)
##
## Overall SVM test error rate is: 0.1555556
The overall training error rate is (100 - 83.125)% = 16.875%. The overall test error rate is (100 - 84.444)% = 15.556%.
Next we use the tune() function to select an optimal cost, considering values in the range 0.01 to 10. Dangit, and I was hoping to keep on using caret on these things. Okie-doke.
set.seed(2021)
svm.oj.tune <- tune(svm, Purchase ~ . , data = oj.train, kernel = "linear", ranges = list(cost = c(0.01, 0.1, 1, 3, 5, 7, 10)))
summary(svm.oj.tune)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost
## 0.01
##
## - best performance: 0.1725
##
## - Detailed performance results:
## cost error dispersion
## 1 0.01 0.17250 0.05676462
## 2 0.10 0.17500 0.05303301
## 3 1.00 0.17625 0.05118390
## 4 3.00 0.17875 0.04966904
## 5 5.00 0.18000 0.04794383
## 6 7.00 0.17750 0.04669642
## 7 10.00 0.17750 0.04706674
Best performance is 0.1725 with a cost of 0.01. We’ll use that now.
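Before refitting, e1071's plot method for tune objects gives a quick visual of how CV error varies with cost:
# Cross-validation error as a function of cost for the linear kernel
plot(svm.oj.tune)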
We now compute the training and test error rates using this new value for cost.
set.seed(2021)
svm.oj.tune <- svm(Purchase ~. , kernel = "linear", data = oj.train, cost = 0.01)
svm.oj.tune.train.pred <- predict(svm.oj.tune, oj.train)
tune.train.conf.mat <- table(svm.oj.tune.train.pred, oj.train$Purchase)
tune.train.conf.mat
##
## svm.oj.tune.train.pred CH MM
## CH 435 83
## MM 53 229
svm.oj.tune.train.pred.acc <- sum(diag(tune.train.conf.mat))/sum(tune.train.conf.mat)
svm.oj.tune.train.pred.err <- 1 - svm.oj.tune.train.pred.acc
cat('Overall SVM training accuracy is: ', svm.oj.tune.train.pred.acc)
## Overall SVM training accuracy is: 0.83
cat('\nOverall SVM training error rate is: ', svm.oj.tune.train.pred.err)
##
## Overall SVM training error rate is: 0.17
The result is essentially the same as the SVM trained using caret, which is a superior package.
svm.oj.tune.test.pred <- predict(svm.oj.tune, oj.test)
tune.test.conf.mat <- table(svm.oj.tune.test.pred, oj.test$Purchase)
tune.test.conf.mat
##
## svm.oj.tune.test.pred CH MM
## CH 145 22
## MM 20 83
svm.oj.tune.test.pred.acc <- sum(diag(tune.test.conf.mat))/sum(tune.test.conf.mat)
svm.oj.tune.test.pred.err <- 1 - svm.oj.tune.test.pred.acc
cat('Overall SVM test accuracy is: ', svm.oj.tune.test.pred.acc)
## Overall SVM test accuracy is: 0.8444444
cat('\nOverall SVM test error rate is: ', svm.oj.tune.test.pred.err)
##
## Overall SVM test error rate is: 0.1555556
The result is the same as the SVM trained using caret, which is a superior package.
We repeat the process using a radial kernel, starting with cost = 0.01 and the default gamma.
svm.oj.rad <- svm(Purchase~., data = oj.train, kernel = "radial", cost = 0.01)
summary(svm.oj.rad)
##
## Call:
## svm(formula = Purchase ~ ., data = oj.train, kernel = "radial", cost = 0.01)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: radial
## cost: 0.01
##
## Number of Support Vectors: 627
##
## ( 315 312 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
svm.oj.rad.train.pred <- predict(svm.oj.rad, oj.train)
rad.confmat <- table(svm.oj.rad.train.pred, oj.train$Purchase)
rad.confmat
##
## svm.oj.rad.train.pred CH MM
## CH 488 312
## MM 0 0
svm.oj.rad.train.pred.acc <- sum(diag(rad.confmat))/sum(rad.confmat)
svm.oj.rad.train.pred.err <- 1 - svm.oj.rad.train.pred.acc
cat('Overall SVM training accuracy is: ', svm.oj.rad.train.pred.acc)
## Overall SVM training accuracy is: 0.61
cat('\nOverall SVM training error rate is: ', svm.oj.rad.train.pred.err)
##
## Overall SVM training error rate is: 0.39
The default radial SVM uses 627 support vectors and has an overall training accuracy of 61%, with an overall training error rate of 39%. The confusion matrix shows it never predicts MM; at this low cost the model collapses to the majority class.
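That 61% figure is exactly the CH share of the training set, which we can confirm:
# Proportion of CH in the training data: 488/800 = 0.61, matching the
# accuracy of a classifier that always predicts CH
mean(oj.train$Purchase == "CH")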
svm.oj.rad.test.pred <- predict(svm.oj.rad, oj.test)
rad.confmat <- table(svm.oj.rad.test.pred, oj.test$Purchase)
rad.confmat
##
## svm.oj.rad.test.pred CH MM
## CH 165 105
## MM 0 0
svm.oj.rad.test.pred.acc <- sum(diag(rad.confmat))/sum(rad.confmat)
svm.oj.rad.test.pred.err <- 1 - svm.oj.rad.test.pred.acc
cat('Overall SVM test accuracy is: ', svm.oj.rad.test.pred.acc)
## Overall SVM test accuracy is: 0.6111111
cat('\nOverall SVM test error rate is: ', svm.oj.rad.test.pred.err)
##
## Overall SVM test error rate is: 0.3888889
The un-tuned radial SVM has an overall testing accuracy of about 61.1%, with an overall testing error rate of about 38.9%.
We now tune the model.
set.seed(2021)
svm.rad.tune <- tune(svm, Purchase ~ . , data = oj.train, kernel = "radial", ranges = list(cost = c(0.01, 0.1, 1, 3, 5, 7, 10)))
summary(svm.rad.tune)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost
## 1
##
## - best performance: 0.18125
##
## - Detailed performance results:
## cost error dispersion
## 1 0.01 0.39000 0.05296750
## 2 0.10 0.19375 0.05008673
## 3 1.00 0.18125 0.05597929
## 4 3.00 0.18875 0.06022239
## 5 5.00 0.19000 0.05827378
## 6 7.00 0.19250 0.05470883
## 7 10.00 0.19750 0.05296750
Tuning reveals the optimal cost value is 1.0, so we proceed with this value.
set.seed(2021)
svm.rad.tune <- svm(Purchase ~. , kernel = "radial", data = oj.train, cost = 1)
svm.rad.tune.train.pred <- predict(svm.rad.tune, oj.train)
tune.train.conf.mat <- table(svm.rad.tune.train.pred, oj.train$Purchase)
tune.train.conf.mat
##
## svm.rad.tune.train.pred CH MM
## CH 446 85
## MM 42 227
svm.rad.tune.train.pred.acc <- sum(diag(tune.train.conf.mat))/sum(tune.train.conf.mat)
svm.rad.tune.train.pred.err <- 1 - svm.rad.tune.train.pred.acc
cat('Overall SVM training accuracy is: ', svm.rad.tune.train.pred.acc)
## Overall SVM training accuracy is: 0.84125
cat('\nOverall SVM training error rate is: ', svm.rad.tune.train.pred.err)
##
## Overall SVM training error rate is: 0.15875
The tuned radial SVM has an overall training accuracy of 84.125% and overall training error of 15.875%.
svm.oj.rad.test.pred <- predict(svm.rad.tune, oj.test)
rad.confmat <- table(svm.oj.rad.test.pred, oj.test$Purchase)
rad.confmat
##
## svm.oj.rad.test.pred CH MM
## CH 152 24
## MM 13 81
svm.rad.tune.test.pred.acc <- sum(diag(rad.confmat))/sum(rad.confmat)
svm.rad.tune.test.pred.err <- 1 - svm.rad.tune.test.pred.acc
cat('Overall SVM test accuracy is: ', svm.rad.tune.test.pred.acc)
## Overall SVM test accuracy is: 0.862963
cat('\nOverall SVM test error rate is: ', svm.rad.tune.test.pred.err)
##
## Overall SVM test error rate is: 0.137037
The tuned radial SVM has an overall testing accuracy of 86.296% and overall testing error of 13.704%.
We repeat the process once more with a polynomial kernel, setting degree = 2.
svm.oj.poly <- svm(Purchase~., data = oj.train, kernel = "poly", cost = 0.01, degree = 2)
summary(svm.oj.poly)
##
## Call:
## svm(formula = Purchase ~ ., data = oj.train, kernel = "poly", cost = 0.01,
## degree = 2)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: polynomial
## cost: 0.01
## degree: 2
## coef.0: 0
##
## Number of Support Vectors: 629
##
## ( 317 312 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
svm.oj.poly.train.pred <- predict(svm.oj.poly, oj.train)
rad.confmat <- table(svm.oj.poly.train.pred, oj.train$Purchase)
rad.confmat
##
## svm.oj.poly.train.pred CH MM
## CH 487 294
## MM 1 18
svm.oj.poly.train.pred.acc <- sum(diag(rad.confmat))/sum(rad.confmat)
svm.oj.poly.train.pred.err <- 1 - svm.oj.poly.train.pred.acc
cat('Overall SVM training accuracy is: ', svm.oj.poly.train.pred.acc)
## Overall SVM training accuracy is: 0.63125
cat('\nOverall SVM training error rate is: ', svm.oj.poly.train.pred.err)
##
## Overall SVM training error rate is: 0.36875
The default polynomial SVM uses 629 support vectors and has an overall training accuracy of 63.125%, with an overall training error rate of 36.875%.
svm.oj.poly.test.pred <- predict(svm.oj.poly, oj.test)
poly.confmat <- table(svm.oj.poly.test.pred, oj.test$Purchase)
poly.confmat
##
## svm.oj.poly.test.pred CH MM
## CH 165 104
## MM 0 1
svm.oj.poly.test.pred.acc <- sum(diag(poly.confmat))/sum(poly.confmat)
svm.oj.poly.test.pred.err <- 1 - svm.oj.poly.test.pred.acc
cat('Overall SVM test accuracy is: ', svm.oj.poly.test.pred.acc)
## Overall SVM test accuracy is: 0.6148148
cat('\nOverall SVM test error rate is: ', svm.oj.poly.test.pred.err)
##
## Overall SVM test error rate is: 0.3851852
The un-tuned svm polynomial model has an overall testing accuracy of about 61.5%, with an overall testing error rate of 38.5%.
We now tune the model.
set.seed(2021)
svm.poly.tune <- tune(svm, Purchase ~ . , data = oj.train, kernel = "poly", ranges = list(cost = c(0.01, 0.1, 1, 3, 5, 7, 10)), degree = 2)
summary(svm.poly.tune)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost
## 10
##
## - best performance: 0.19625
##
## - Detailed performance results:
## cost error dispersion
## 1 0.01 0.38750 0.05368374
## 2 0.10 0.32875 0.03007514
## 3 1.00 0.21500 0.04669642
## 4 3.00 0.20000 0.06400955
## 5 5.00 0.20000 0.05527708
## 6 7.00 0.19750 0.05163978
## 7 10.00 0.19625 0.06125142
Tuning reveals the optimal cost value is 10.0, so we proceed with this value.
set.seed(2021)
svm.poly.tune <- svm(Purchase ~. , kernel = "poly", data = oj.train, cost = 10, degree = 2)
svm.poly.tune.train.pred <- predict(svm.poly.tune, oj.train)
tune.train.conf.mat <- table(svm.poly.tune.train.pred, oj.train$Purchase)
tune.train.conf.mat
##
## svm.poly.tune.train.pred CH MM
## CH 446 84
## MM 42 228
svm.poly.tune.train.pred.acc <- sum(diag(tune.train.conf.mat))/sum(tune.train.conf.mat)
svm.poly.tune.train.pred.err <- 1 - svm.poly.tune.train.pred.acc
cat('Overall SVM training accuracy is: ', svm.poly.tune.train.pred.acc)
## Overall SVM training accuracy is: 0.8425
cat('\nOverall SVM training error rate is: ', svm.poly.tune.train.pred.err)
##
## Overall SVM training error rate is: 0.1575
The tuned polynomial SVM has an overall training accuracy of 84.25% and overall training error of 15.75%.
svm.poly.test.pred <- predict(svm.poly.tune, oj.test)
poly.confmat <- table(svm.poly.test.pred, oj.test$Purchase)
poly.confmat
##
## svm.poly.test.pred CH MM
## CH 149 28
## MM 16 77
svm.poly.tune.test.pred.acc <- sum(diag(poly.confmat))/sum(poly.confmat)
svm.poly.tune.test.pred.err <- 1 - svm.poly.tune.test.pred.acc
cat('Overall SVM test accuracy is: ', svm.poly.tune.test.pred.acc)
## Overall SVM test accuracy is: 0.837037
cat('\nOverall SVM test error rate is: ', svm.poly.tune.test.pred.err)
##
## Overall SVM test error rate is: 0.162963
The tuned polynomial SVM has an overall testing accuracy of 83.7% and overall testing error of 16.3%.
Let’s recap the results of tuned SVM performance on the test set:
The tuned polynomial SVM has an overall testing accuracy of 83.7% and overall testing error of 16.3%.
The tuned radial SVM has an overall testing accuracy of 86.296% and overall testing error of 13.704%.
The tuned linear SVM has an overall testing accuracy of 84.44% and overall testing error of 15.56%.
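The same comparison as a small data frame (values transcribed from the runs above):
# Test error rates for the tuned SVMs on the OJ test set
data.frame(kernel = c("linear", "radial", "polynomial"),
           test.error = c(0.1555556, 0.1370370, 0.1629630))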
Based upon these data, it looks like the approach with the best results is the tuned radial SVM.
To comment on the results overall: the svmPoly kernel showed the best performance in correctly identifying and establishing decision boundaries that reflect the real boundaries. The linear SVM kernel and the various logistic regression models did not perform well when it came to correctly identifying the classes.