Exercise 5: We have seen that we can fit an SVM with a non-linear kernel in order to perform classification using a non-linear decision boundary. We will now see that we can also obtain a non-linear decision boundary by performing logistic regression using non-linear transformations of the features.
library(e1071)  # svm() and tune()
set.seed(1)
x1 = runif(500) - 0.5
x2 = runif(500) - 0.5
y = 1 * (x1^2 - x2^2 > 0)  # true boundary: the lines x2 = +/- x1
plot(x1, x2, col = ifelse(y, 'navy', 'red'), xlab = 'X1', ylab = 'X2')  # class 1 in navy, matching the prediction plots below
dat = data.frame(x1, x2, y = as.factor(y))
glm.fit = glm(y~., data = dat, family = "binomial")
summary(glm.fit)
##
## Call:
## glm(formula = y ~ ., family = "binomial", data = dat)
##
## Deviance Residuals:
##    Min      1Q  Median      3Q     Max
## -1.179  -1.139  -1.112   1.206   1.257
##
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.087260   0.089579  -0.974    0.330
## x1           0.196199   0.316864   0.619    0.536
## x2          -0.002854   0.305712  -0.009    0.993
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 692.18 on 499 degrees of freedom
## Residual deviance: 691.79 on 497 degrees of freedom
## AIC: 697.79
##
## Number of Fisher Scoring iterations: 3
Neither coefficient is statistically significant, which makes sense: the true boundary depends on the squares of the features, so no linear function of x1 and x2 can separate the classes.
glm.preds = predict(glm.fit, newdata = dat, type = "response")
plot(x1, x2, col = ifelse(glm.preds >= 0.5, 'navy', 'red'), xlab = 'X1', ylab = 'X2')
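As a quick check (a sketch using the objects above), the training error rate of this linear fit should be close to chance:
# training error rate of the linear logistic fit
mean((glm.preds >= 0.5) != y)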
glm.fit2 = glm(y ~ I(x1 * x2) + poly(x2, 2) + poly(x1, 2), data = dat, family = "binomial")  # add quadratic and interaction terms
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
The warnings indicate (near-)perfect separation: with the quadratic terms included, the model can drive fitted probabilities to 0 or 1, and the resulting decision boundary closely matches the true one.
glm.preds2 = predict(glm.fit2, newdata = dat, type = "response")
plot(x1, x2, col = ifelse(glm.preds2 >= 0.5, 'navy', 'red'), xlab = 'X1', ylab = 'X2')
svm.fit = svm(y ~ ., data = dat, kernel = 'linear', cost = 0.01)
svm.preds = predict(svm.fit, newdata = dat)  # predict.svm returns class labels
plot(x1, x2, col = ifelse(svm.preds != 0, 'navy', 'red'), xlab = 'X1', ylab = 'X2')
svm.fit2 = svm(y ~ ., data = dat, kernel = 'radial', gamma = 1)
svm.preds2 = predict(svm.fit2, newdata = dat)
plot(x1, x2, col = ifelse(svm.preds2 != 0, 'navy', 'red'), xlab = 'X1', ylab = 'X2')
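To wrap up, here is a side-by-side look at the training error rates of the four fits (a sketch; predict.svm returns class labels, so the SVM predictions are compared as factors):
# training error rates of all four models
errs = c(
  logit_linear    = mean((glm.preds  >= 0.5) != y),
  logit_nonlinear = mean((glm.preds2 >= 0.5) != y),
  svm_linear      = mean(svm.preds  != dat$y),
  svm_radial      = mean(svm.preds2 != dat$y)
)
round(errs, 3)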
Exercise 7: In this problem, you will use support vector approaches in order to predict whether a given car gets high or low gas mileage based on the Auto data set.
library(ISLR)  # Auto and OJ data sets
df = Auto
df$y = as.factor(ifelse(df$mpg > median(df$mpg), 1, 0))  # 1 = above-median mpg
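Splitting at the median should give roughly balanced classes (quick check):
table(df$y)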
str(df)
## 'data.frame': 392 obs. of 10 variables:
## $ mpg : num 18 15 18 16 17 15 14 14 14 15 ...
## $ cylinders : num 8 8 8 8 8 8 8 8 8 8 ...
## $ displacement: num 307 350 318 304 302 429 454 440 455 390 ...
## $ horsepower : num 130 165 150 150 140 198 220 215 225 190 ...
## $ weight : num 3504 3693 3436 3433 3449 ...
## $ acceleration: num 12 11.5 11 12 10.5 10 9 8.5 10 8.5 ...
## $ year : num 70 70 70 70 70 70 70 70 70 70 ...
## $ origin : num 1 1 1 1 1 1 1 1 1 1 ...
## $ name : Factor w/ 304 levels "amc ambassador brougham",..: 49 36 231 14 161 141 54 223 241 2 ...
## $ y : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
set.seed(42)
tune.out = tune(svm, y ~ . -mpg -name, data=df, kernel="linear", ranges=list(cost=c(0.1, 1, 10)))
summary(tune.out)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost
## 10
##
## - best performance: 0.08653846
##
## - Detailed performance results:
## cost error dispersion
## 1 0.1 0.09673077 0.05699840
## 2 1.0 0.09423077 0.04632467
## 3 10.0 0.08653846 0.03776796
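Rather than refitting by hand, the cross-validated winner is already stored in the tune object (a sketch; best.model is part of e1071's tune return value):
best_linear = tune.out$best.model
best_linear$cost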
set.seed(42)
tune.out = tune(svm, y ~ . -mpg -name, data=df, kernel="radial", ranges=list(cost=c(0.1, 1, 10), gamma=c(0.5, 1, 2)))
summary(tune.out)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost gamma
## 1 0.5
##
## - best performance: 0.07628205
##
## - Detailed performance results:
## cost gamma error dispersion
## 1 0.1 0.5 0.08660256 0.04961909
## 2 1.0 0.5 0.07628205 0.04267196
## 3 10.0 0.5 0.08634615 0.04391746
## 4 0.1 1.0 0.09673077 0.05699840
## 5 1.0 1.0 0.07878205 0.04472958
## 6 10.0 1.0 0.09403846 0.04383004
## 7 0.1 2.0 0.15269231 0.10000813
## 8 1.0 2.0 0.08378205 0.04837755
## 9 10.0 2.0 0.10173077 0.05012408
df = df[, -c(1, 9)]  # drop mpg and name before fitting
set.seed(42)
svm_fit = svm(y ~ ., data=df, kernel='linear', cost=10)
plot(svm_fit, df, displacement~cylinders)
set.seed(42)
svm_fit = svm(y ~ ., data=df, kernel='radial', cost=1, gamma=0.5)
plot(svm_fit, df, weight~acceleration)
set.seed(42)
svm_fit = svm(y ~ ., data=df, kernel='polynomial', cost=10, degree=3)
plot(svm_fit, df, year~horsepower)
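There is no held-out set in this exercise, so as a rough (and optimistic) summary we can look at the training confusion table of the cross-validated radial winner (a sketch; svm_best is our own name):
svm_best = svm(y ~ ., data = df, kernel = 'radial', cost = 1, gamma = 0.5)
table(predicted = predict(svm_best, df), actual = df$y)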
Exercise 8: This problem involves the OJ data set which is part of the ISLR package.
df = OJ
df$Purchase = as.factor(df$Purchase)  # already a factor in OJ; kept as a safeguard
set.seed(42)
index = sample(nrow(df), 800)
train = df[index, ]
test = df[-index, ]
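A quick check on the split (OJ has 1070 rows, so this leaves 270 for testing):
c(train = nrow(train), test = nrow(test))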
svm_fit = svm(Purchase~., data=train, kernel='linear', cost=0.01)
summary(svm_fit)
##
## Call:
## svm(formula = Purchase ~ ., data = train, kernel = "linear", cost = 0.01)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: linear
## cost: 0.01
##
## Number of Support Vectors: 432
##
## ( 215 217 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
With cost this small the margin is wide, which is why more than half of the training observations (432 of 800) end up as support vectors.
library(caret)  # confusionMatrix()
svm_preds = predict(svm_fit, train)
confusionMatrix(data=svm_preds, reference=train$Purchase)
## Confusion Matrix and Statistics
##
## Reference
## Prediction CH MM
## CH 432 77
## MM 60 231
##
## Accuracy : 0.8288
## 95% CI : (0.8008, 0.8542)
## No Information Rate : 0.615
## P-Value [Acc > NIR] : <2e-16
##
## Kappa : 0.6346
##
## Mcnemar's Test P-Value : 0.1716
##
## Sensitivity : 0.8780
## Specificity : 0.7500
## Pos Pred Value : 0.8487
## Neg Pred Value : 0.7938
## Prevalence : 0.6150
## Detection Rate : 0.5400
## Detection Prevalence : 0.6362
## Balanced Accuracy : 0.8140
##
## 'Positive' Class : CH
##
svm_preds = predict(svm_fit, test)
confusionMatrix(data=svm_preds, reference=test$Purchase)
## Confusion Matrix and Statistics
##
## Reference
## Prediction CH MM
## CH 142 25
## MM 19 84
##
## Accuracy : 0.837
## 95% CI : (0.7875, 0.879)
## No Information Rate : 0.5963
## P-Value [Acc > NIR] : <2e-16
##
## Kappa : 0.6585
##
## Mcnemar's Test P-Value : 0.451
##
## Sensitivity : 0.8820
## Specificity : 0.7706
## Pos Pred Value : 0.8503
## Neg Pred Value : 0.8155
## Prevalence : 0.5963
## Detection Rate : 0.5259
## Detection Prevalence : 0.6185
## Balanced Accuracy : 0.8263
##
## 'Positive' Class : CH
##
set.seed(42)
tune.out = tune(svm, Purchase~., data=train, kernel="linear", ranges=list(cost=c(0.1, 1, 10)))
summary(tune.out)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost
## 1
##
## - best performance: 0.175
##
## - Detailed performance results:
## cost error dispersion
## 1 0.1 0.17625 0.03356689
## 2 1.0 0.17500 0.02886751
## 3 10.0 0.18625 0.02729087
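Since we will repeat the same train/test comparison for several kernels, a small helper keeps things compact (a sketch; err_rates is our own name, not part of e1071 or caret):
# overall error rates on the train and test sets for a fitted svm
err_rates = function(fit) {
  c(train = mean(predict(fit, train) != train$Purchase),
    test  = mean(predict(fit, test)  != test$Purchase))
}
round(err_rates(tune.out$best.model), 4)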
svm_fit = svm(Purchase~., data=train, kernel='linear', cost=1)
# train
svm_preds = predict(svm_fit, train)
confusionMatrix(data=svm_preds, reference=train$Purchase)
## Confusion Matrix and Statistics
##
## Reference
## Prediction CH MM
## CH 434 76
## MM 58 232
##
## Accuracy : 0.8325
## 95% CI : (0.8048, 0.8577)
## No Information Rate : 0.615
## P-Value [Acc > NIR] : <2e-16
##
## Kappa : 0.6424
##
## Mcnemar's Test P-Value : 0.1419
##
## Sensitivity : 0.8821
## Specificity : 0.7532
## Pos Pred Value : 0.8510
## Neg Pred Value : 0.8000
## Prevalence : 0.6150
## Detection Rate : 0.5425
## Detection Prevalence : 0.6375
## Balanced Accuracy : 0.8177
##
## 'Positive' Class : CH
##
# test
svm_preds = predict(svm_fit, test)
confusionMatrix(data=svm_preds, reference=test$Purchase)
## Confusion Matrix and Statistics
##
## Reference
## Prediction CH MM
## CH 140 23
## MM 21 86
##
## Accuracy : 0.837
## 95% CI : (0.7875, 0.879)
## No Information Rate : 0.5963
## P-Value [Acc > NIR] : <2e-16
##
## Kappa : 0.6605
##
## Mcnemar's Test P-Value : 0.8802
##
## Sensitivity : 0.8696
## Specificity : 0.7890
## Pos Pred Value : 0.8589
## Neg Pred Value : 0.8037
## Prevalence : 0.5963
## Detection Rate : 0.5185
## Detection Prevalence : 0.6037
## Balanced Accuracy : 0.8293
##
## 'Positive' Class : CH
##
svm_fit = svm(Purchase~., data=train, kernel='radial')
summary(svm_fit)
##
## Call:
## svm(formula = Purchase ~ ., data = train, kernel = "radial")
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: radial
## cost: 1
##
## Number of Support Vectors: 375
##
## ( 183 192 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
# train
svm_preds = predict(svm_fit, train)
confusionMatrix(data=svm_preds, reference=train$Purchase)
## Confusion Matrix and Statistics
##
## Reference
## Prediction CH MM
## CH 453 81
## MM 39 227
##
## Accuracy : 0.85
## 95% CI : (0.8233, 0.874)
## No Information Rate : 0.615
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.675
##
## Mcnemar's Test P-Value : 0.000182
##
## Sensitivity : 0.9207
## Specificity : 0.7370
## Pos Pred Value : 0.8483
## Neg Pred Value : 0.8534
## Prevalence : 0.6150
## Detection Rate : 0.5663
## Detection Prevalence : 0.6675
## Balanced Accuracy : 0.8289
##
## 'Positive' Class : CH
##
# test
svm_preds = predict(svm_fit, test)
confusionMatrix(data=svm_preds, reference=test$Purchase)
## Confusion Matrix and Statistics
##
## Reference
## Prediction CH MM
## CH 146 28
## MM 15 81
##
## Accuracy : 0.8407
## 95% CI : (0.7915, 0.8823)
## No Information Rate : 0.5963
## P-Value [Acc > NIR] : < 2e-16
##
## Kappa : 0.6627
##
## Mcnemar's Test P-Value : 0.06725
##
## Sensitivity : 0.9068
## Specificity : 0.7431
## Pos Pred Value : 0.8391
## Neg Pred Value : 0.8438
## Prevalence : 0.5963
## Detection Rate : 0.5407
## Detection Prevalence : 0.6444
## Balanced Accuracy : 0.8250
##
## 'Positive' Class : CH
##
set.seed(42)
tune.out = tune(svm, Purchase~., data=train, kernel="radial", ranges=list(gamma=c(0.5, 1, 2)))
summary(tune.out)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## gamma
## 0.5
##
## - best performance: 0.19375
##
## - Detailed performance results:
## gamma error dispersion
## 1 0.5 0.19375 0.04299952
## 2 1.0 0.20250 0.04158325
## 3 2.0 0.21750 0.04257347
svm_fit = svm(Purchase~., data=train, kernel='radial', gamma=0.5)
# train
svm_preds = predict(svm_fit, train)
confusionMatrix(data=svm_preds, reference=train$Purchase)
## Confusion Matrix and Statistics
##
## Reference
## Prediction CH MM
## CH 457 74
## MM 35 234
##
## Accuracy : 0.8638
## 95% CI : (0.838, 0.8868)
## No Information Rate : 0.615
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.7053
##
## Mcnemar's Test P-Value : 0.0002729
##
## Sensitivity : 0.9289
## Specificity : 0.7597
## Pos Pred Value : 0.8606
## Neg Pred Value : 0.8699
## Prevalence : 0.6150
## Detection Rate : 0.5713
## Detection Prevalence : 0.6637
## Balanced Accuracy : 0.8443
##
## 'Positive' Class : CH
##
# test
svm_preds = predict(svm_fit, test)
confusionMatrix(data=svm_preds, reference=test$Purchase)
## Confusion Matrix and Statistics
##
## Reference
## Prediction CH MM
## CH 144 32
## MM 17 77
##
## Accuracy : 0.8185
## 95% CI : (0.7673, 0.8626)
## No Information Rate : 0.5963
## P-Value [Acc > NIR] : 3.798e-15
##
## Kappa : 0.6145
##
## Mcnemar's Test P-Value : 0.0455
##
## Sensitivity : 0.8944
## Specificity : 0.7064
## Pos Pred Value : 0.8182
## Neg Pred Value : 0.8191
## Prevalence : 0.5963
## Detection Rate : 0.5333
## Detection Prevalence : 0.6519
## Balanced Accuracy : 0.8004
##
## 'Positive' Class : CH
##
svm_fit = svm(Purchase~., data=train, kernel='polynomial', degree=2)
summary(svm_fit)
##
## Call:
## svm(formula = Purchase ~ ., data = train, kernel = "polynomial",
## degree = 2)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: polynomial
## cost: 1
## degree: 2
## coef.0: 0
##
## Number of Support Vectors: 443
##
## ( 217 226 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
# train
svm_preds = predict(svm_fit, train)
confusionMatrix(data=svm_preds, reference=train$Purchase)
## Confusion Matrix and Statistics
##
## Reference
## Prediction CH MM
## CH 461 112
## MM 31 196
##
## Accuracy : 0.8212
## 95% CI : (0.7929, 0.8472)
## No Information Rate : 0.615
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.603
##
## Mcnemar's Test P-Value : 2.233e-11
##
## Sensitivity : 0.9370
## Specificity : 0.6364
## Pos Pred Value : 0.8045
## Neg Pred Value : 0.8634
## Prevalence : 0.6150
## Detection Rate : 0.5763
## Detection Prevalence : 0.7163
## Balanced Accuracy : 0.7867
##
## 'Positive' Class : CH
##
# test
svm_preds = predict(svm_fit, test)
confusionMatrix(data=svm_preds, reference=test$Purchase)
## Confusion Matrix and Statistics
##
## Reference
## Prediction CH MM
## CH 149 41
## MM 12 68
##
## Accuracy : 0.8037
## 95% CI : (0.7512, 0.8494)
## No Information Rate : 0.5963
## P-Value [Acc > NIR] : 2.768e-13
##
## Kappa : 0.574
##
## Mcnemar's Test P-Value : 0.00012
##
## Sensitivity : 0.9255
## Specificity : 0.6239
## Pos Pred Value : 0.7842
## Neg Pred Value : 0.8500
## Prevalence : 0.5963
## Detection Rate : 0.5519
## Detection Prevalence : 0.7037
## Balanced Accuracy : 0.7747
##
## 'Positive' Class : CH
##
set.seed(42)
tune.out = tune(svm, Purchase~., data=train, kernel="polynomial", ranges=list(degree=c(1, 2, 3)))
summary(tune.out)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## degree
## 1
##
## - best performance: 0.18125
##
## - Detailed performance results:
## degree error dispersion
## 1 1 0.18125 0.03498512
## 2 2 0.19250 0.04216370
## 3 3 0.19625 0.03586723
svm_fit = svm(Purchase~., data=train, kernel='polynomial', degree=1)
# train
svm_preds = predict(svm_fit, train)
confusionMatrix(data=svm_preds, reference=train$Purchase)
## Confusion Matrix and Statistics
##
## Reference
## Prediction CH MM
## CH 432 74
## MM 60 234
##
## Accuracy : 0.8325
## 95% CI : (0.8048, 0.8577)
## No Information Rate : 0.615
## P-Value [Acc > NIR] : <2e-16
##
## Kappa : 0.6433
##
## Mcnemar's Test P-Value : 0.2614
##
## Sensitivity : 0.8780
## Specificity : 0.7597
## Pos Pred Value : 0.8538
## Neg Pred Value : 0.7959
## Prevalence : 0.6150
## Detection Rate : 0.5400
## Detection Prevalence : 0.6325
## Balanced Accuracy : 0.8189
##
## 'Positive' Class : CH
##
# test
svm_preds = predict(svm_fit, test)
confusionMatrix(data=svm_preds, reference=test$Purchase)
## Confusion Matrix and Statistics
##
## Reference
## Prediction CH MM
## CH 141 23
## MM 20 86
##
## Accuracy : 0.8407
## 95% CI : (0.7915, 0.8823)
## No Information Rate : 0.5963
## P-Value [Acc > NIR] : <2e-16
##
## Kappa : 0.6677
##
## Mcnemar's Test P-Value : 0.7604
##
## Sensitivity : 0.8758
## Specificity : 0.7890
## Pos Pred Value : 0.8598
## Neg Pred Value : 0.8113
## Prevalence : 0.5963
## Detection Rate : 0.5222
## Detection Prevalence : 0.6074
## Balanced Accuracy : 0.8324
##
## 'Positive' Class : CH
##
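Pulling it together, the tuned models from each kernel family can be compared on the held-out set (a sketch reusing the err_rates helper defined above):
set.seed(42)
fits = list(
  linear     = svm(Purchase ~ ., data = train, kernel = 'linear', cost = 1),
  radial     = svm(Purchase ~ ., data = train, kernel = 'radial', gamma = 0.5),
  polynomial = svm(Purchase ~ ., data = train, kernel = 'polynomial', degree = 1)
)
round(sapply(fits, err_rates), 4)
On this split, the tuned linear and degree-1 polynomial fits edge out the radial kernel on test error, consistent with the confusion matrices above.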