Problem 5 - Part A
In this chunk, I am generating a data set with n = 500 observations and two predictors, in which a quadratic boundary separates the two classes.
set.seed(2)
x1 <- runif(500) - 0.5
x2 <- runif(500) - 0.5
y <- 1 * (x1^2 - x2^2 > 0)
Problem 5 - Part B
In this chunk, I am plotting the observations from Part A, colored by class label.
plot(x1, x2, col=ifelse(y, "red", "black"))
Problem 5 - Part C
In this chunk, I am fitting a logistic regression model to the data.
glmFit <- glm(y~x1+x2, data=data.frame(x1,x2,y), family="binomial")
summary(glmFit)
##
## Call:
## glm(formula = y ~ x1 + x2, family = "binomial", data = data.frame(x1,
## x2, y))
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.271 -1.193 1.097 1.147 1.209
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.07138 0.08959 0.797 0.426
## x1 -0.03532 0.29825 -0.118 0.906
## x2 0.27548 0.30762 0.896 0.370
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 692.50 on 499 degrees of freedom
## Residual deviance: 691.67 on 497 degrees of freedom
## AIC: 697.67
##
## Number of Fisher Scoring iterations: 3
Problem 5 - Part D
In the chunk below, I am applying the model to the training data to obtain a predicted class label for each training observation. Because the model is linear in x1 and x2, the resulting decision boundary is linear.
glmPred <- predict(glmFit, newdata=data.frame(x1, x2))
plot(x1, x2, col=ifelse(glmPred>0, "red", "black"), pch=ifelse(as.integer(glmPred>0) == y, 1, 6))
Problem 5 - Part E
In this chunk, I am fitting a logistic regression model using non-linear (cubic polynomial) functions of x1 and x2 as predictors. The warnings below occur because the squared terms can separate the two classes perfectly (the true boundary is x1^2 = x2^2), so the fitted probabilities are pushed to 0 or 1 and the algorithm does not converge.
glmFit.2 <- glm(y~poly(x1,3)+poly(x2,3), data=data.frame(x1,x2,y), family="binomial")
## Warning: glm.fit: algorithm did not converge
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
Problem 5 - Part F
In this chunk, I am plotting the model from part E to provide evidence that the decision boundary is non-linear.
glmPred.2 <- predict(glmFit.2, newdata=data.frame(x1, x2))
plot(x1, x2, col=ifelse(glmPred.2>0, "red", "black"), pch=ifelse(as.integer(glmPred.2>0) == y, 1, 6))
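To quantify the improvement, here is a minimal sketch (using glmPred and glmPred.2 from above, and the same threshold of 0 on the linear predictor) comparing the training error rates of the two logistic fits.
# Training misclassification rates: linear fit vs. cubic polynomial fit
c(linear = mean(as.integer(glmPred > 0) != y),
  polynomial = mean(as.integer(glmPred.2 > 0) != y))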
Problem 5 - Part G
In this chunk, I am using a support vector classifier to obtain class predictions for each training observation. The results seem to indicate that all class predictions were assigned to a single class.
library(e1071)  # provides svm()
svmFit <- svm(y~x1+x2, data=data.frame(x1,x2,y), cost=0.1, kernel="linear")
svmPred <- predict(svmFit, data.frame(x1,x2))
plot(x1, x2, col=ifelse(svmPred>0, "red", "black"), pch=ifelse(as.integer(svmPred>0) == y,1,6))
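Note that y is stored as a numeric 0/1 vector, so svm() above actually fits a regression and the plot thresholds its continuous predictions at 0. Below is a minimal sketch (object names are illustrative) of the same fit with y coded as a factor, so that svm() performs C-classification and predict() returns class labels directly.
# Refit with y as a factor so svm() performs classification rather than regression
datSVC <- data.frame(x1, x2, y = factor(y))
svmFit.c <- svm(y ~ x1 + x2, data = datSVC, cost = 0.1, kernel = "linear")
table(predicted = predict(svmFit.c, datSVC), actual = datSVC$y)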
Problem 5 - Part H
In this chunk, I am fitting two support vector machines (one with a polynomial kernel and one with a radial kernel).
svmFit.2 <- svm(y~x1+x2, data=data.frame(x1,x2,y), degree=1, kernel="polynomial")
svmPred.2 <- predict(svmFit.2, data.frame(x1,x2))
plot(x1, x2, col=ifelse(svmPred.2>0, "red", "black"), pch=ifelse(as.integer(svmPred.2>0) == y,1,6))
svmFit.3 <- svm(y~x1+x2, data=data.frame(x1,x2,y), cost=1, kernel="radial")
svmPred.3 <- predict(svmFit.3)
plot(x1, x2, col=ifelse(svmPred.3>0, "red", "black"), pch=ifelse(as.integer(svmPred.3>0) == y,1,6))
Problem 5 - Part I
Of the three models in Parts G and H, the radial fit appears to be the best: it has the fewest misclassified observations and it assigns predictions to both classes.
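A minimal sketch (using the predictions from Parts G and H and the same 0 threshold used in the plots) that quantifies this comparison:
# Training misclassification rates for the linear, polynomial, and radial fits
errRate <- function(pred) mean(as.integer(pred > 0) != y)
c(linear = errRate(svmPred), polynomial = errRate(svmPred.2), radial = errRate(svmPred.3))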
Problem 7 - Part A
library(ISLR)  # provides the Auto and OJ data sets
auto <- Auto
attach(auto)
In this chunk, I am creating a binary variable, mpgMed, that equals 1 when mpg is above its median and 0 otherwise, and storing it as a factor so that svm() treats the problem as classification.
mpgMed <- ifelse(mpg > median(mpg),1,0)
auto$mpgMed <- as.factor(mpgMed)
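As a quick sanity check (a sketch using the variable just created), the new factor should split the observations roughly in half, since it is defined by the median of mpg.
# Class balance of the binary response
table(auto$mpgMed)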
Problem 7 - Part B
set.seed(1)
svmTune <- tune(svm, mpgMed~., data=auto, ranges=list(cost = c(0.1, 0.2, 0.5, 1, 2, 10)), kernel="linear")
The results of the support vector classifier with cost = 0.1, 0.2, 0.5, 1, 2 and 10 indicate that the lowest cross-validation error is associated with the model containing cost = 1, which has an error rate of approximately 1.03%. The highest cross-validation error is associated with the model containing cost = 0.1, which has an error rate of approximately 4.60%.
summary(svmTune)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost
## 1
##
## - best performance: 0.01025641
##
## - Detailed performance results:
## cost error dispersion
## 1 0.1 0.04596154 0.03378238
## 2 0.2 0.02814103 0.01893035
## 3 0.5 0.01282051 0.01813094
## 4 1.0 0.01025641 0.01792836
## 5 2.0 0.01282051 0.02179068
## 6 10.0 0.02051282 0.02648194
svmTune$best.model
##
## Call:
## best.tune(method = svm, train.x = mpgMed ~ ., data = auto, ranges = list(cost = c(0.1,
## 0.2, 0.5, 1, 2, 10)), kernel = "linear")
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: linear
## cost: 1
##
## Number of Support Vectors: 56
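For reference, a minimal sketch (using the tuned object above) of the training confusion matrix for the best linear model:
# Training confusion matrix for the best linear support vector classifier
table(predicted = predict(svmTune$best.model, auto), actual = auto$mpgMed)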
Problem 7 - Part C
The results of the support vector machine with the polynomial kernel indicate that the lowest cross-validation error is associated with the model containing cost = 5 and degree = 1, which has an error rate of approximately 7.40%. The results of the support vector machine with the radial kernel indicate that the lowest cross-validation error is associated with the model containing cost = 3 and gamma = 1, which has an error rate of approximately 5.88%.
set.seed(1)
svmTune.pol <- tune(svm, mpgMed~., data=auto, ranges=list(cost=c(0.1, 0.4, 0.8, 1, 3, 5), degree=c(1,2,3)), kernel="polynomial")
summary(svmTune.pol)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost degree
## 5 1
##
## - best performance: 0.07403846
##
## - Detailed performance results:
## cost degree error dispersion
## 1 0.1 1 0.19673077 0.11502574
## 2 0.4 1 0.09185897 0.04376958
## 3 0.8 1 0.08673077 0.04846618
## 4 1.0 1 0.08416667 0.04343030
## 5 3.0 1 0.07653846 0.03617137
## 6 5.0 1 0.07403846 0.03522110
## 7 0.1 2 0.55115385 0.04366593
## 8 0.4 2 0.55115385 0.04366593
## 9 0.8 2 0.55115385 0.04366593
## 10 1.0 2 0.55115385 0.04366593
## 11 3.0 2 0.55115385 0.04366593
## 12 5.0 2 0.55115385 0.04366593
## 13 0.1 3 0.55115385 0.04366593
## 14 0.4 3 0.55115385 0.04366593
## 15 0.8 3 0.55115385 0.04366593
## 16 1.0 3 0.55115385 0.04366593
## 17 3.0 3 0.55115385 0.04366593
## 18 5.0 3 0.55115385 0.04366593
svmTune.pol$best.model
##
## Call:
## best.tune(method = svm, train.x = mpgMed ~ ., data = auto, ranges = list(cost = c(0.1,
## 0.4, 0.8, 1, 3, 5), degree = c(1, 2, 3)), kernel = "polynomial")
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: polynomial
## cost: 5
## degree: 1
## coef.0: 0
##
## Number of Support Vectors: 132
set.seed(1)
svmTune.rad <- tune(svm, mpgMed~., data=auto, ranges=list(cost=c(0.1, 0.4, 0.8, 1, 3, 5), gamma=c(1,2,3)), kernel="radial")
summary(svmTune.rad)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost gamma
## 3 1
##
## - best performance: 0.05884615
##
## - Detailed performance results:
## cost gamma error dispersion
## 1 0.1 1 0.55115385 0.04366593
## 2 0.4 1 0.09935897 0.06326807
## 3 0.8 1 0.07147436 0.04312562
## 4 1.0 1 0.06384615 0.04375618
## 5 3.0 1 0.05884615 0.04020934
## 6 5.0 1 0.05884615 0.04020934
## 7 0.1 2 0.55115385 0.04366593
## 8 0.4 2 0.54608974 0.04574092
## 9 0.8 2 0.36743590 0.13801791
## 10 1.0 2 0.14019231 0.07984711
## 11 3.0 2 0.13512821 0.08055403
## 12 5.0 2 0.13512821 0.08055403
## 13 0.1 3 0.55115385 0.04366593
## 14 0.4 3 0.55115385 0.04366593
## 15 0.8 3 0.50006410 0.05856451
## 16 1.0 3 0.41326923 0.14331350
## 17 3.0 3 0.38025641 0.14908523
## 18 5.0 3 0.38025641 0.14908523
svmTune.rad$best.model
##
## Call:
## best.tune(method = svm, train.x = mpgMed ~ ., data = auto, ranges = list(cost = c(0.1,
## 0.4, 0.8, 1, 3, 5), gamma = c(1, 2, 3)), kernel = "radial")
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: radial
## cost: 3
##
## Number of Support Vectors: 377
Problem 7 - Part D
The three plots below are generated for the best SVM model with a linear kernel.
plot(svmTune$best.model, data=auto, mpg~horsepower)
plot(svmTune$best.model, data=auto, mpg~year)
plot(svmTune$best.model, data=auto, mpg~displacement)
The three plots below are generated for the best SVM model with a polynomial kernel.
plot(svmTune.pol$best.model, data=auto, mpg~horsepower)
plot(svmTune.pol$best.model, data=auto, mpg~year)
plot(svmTune.pol$best.model, data=auto, mpg~displacement)
The three plots below are generated for the best SVM model with a radial basis kernel.
plot(svmTune.rad$best.model, data=auto, mpg~horsepower)
plot(svmTune.rad$best.model, data=auto, mpg~year)
plot(svmTune.rad$best.model, data=auto, mpg~displacement)
Problem 8 - Part A
oj <- OJ
attach(oj)
In this chunk, I am creating a training set of 800 observations and a test set of the remaining observations.
set.seed(1)
split <- sample(1:nrow(oj), 800)
ojTrain <- oj[split,]
ojTest <- oj[-split,]
Problem 8 - Part B
The results of the support vector classifier indicate that there are 435 support vectors out of the total 800 points in the training set, of which 219 belong to the class CH and 216 belong to the class MM.
ojSVM <- svm(Purchase~., data=ojTrain, cost=0.01, kernel="linear")
summary(ojSVM)
##
## Call:
## svm(formula = Purchase ~ ., data = ojTrain, cost = 0.01, kernel = "linear")
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: linear
## cost: 0.01
##
## Number of Support Vectors: 435
##
## ( 219 216 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
Problem 8 - Part C
The training error rate is 17.5% and the test error rate is approximately 17.8%.
ojPred.train <- predict(ojSVM, ojTrain)
table(ojTrain$Purchase, ojPred.train)
## ojPred.train
## CH MM
## CH 420 65
## MM 75 240
mean(ojPred.train != ojTrain$Purchase)
## [1] 0.175
ojPred.test <- predict(ojSVM, ojTest)
table(ojTest$Purchase, ojPred.test)
## ojPred.test
## CH MM
## CH 153 15
## MM 33 69
mean(ojPred.test != ojTest$Purchase)
## [1] 0.1777778
Problem 8 - Part D
The results of the support vector classifier with cost set to a range of values between 0.1 and 10 indicate that the lowest cross-validation error is associated with the model containing cost = 0.5, which has an error rate of approximately 16.88% (cost = 0.8 and cost = 3 achieve the same error).
set.seed(1)
ojTune <- tune(svm, Purchase~., data=ojTrain, ranges=list(cost=c(0.1,0.2,0.5,0.8,1,2,3,5,8,10)), kernel="linear")
summary(ojTune)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost
## 0.5
##
## - best performance: 0.16875
##
## - Detailed performance results:
## cost error dispersion
## 1 0.1 0.17250 0.03162278
## 2 0.2 0.17125 0.02829041
## 3 0.5 0.16875 0.02651650
## 4 0.8 0.16875 0.02779513
## 5 1.0 0.17500 0.02946278
## 6 2.0 0.17250 0.02874698
## 7 3.0 0.16875 0.03019037
## 8 5.0 0.17250 0.03162278
## 9 8.0 0.17375 0.03197764
## 10 10.0 0.17375 0.03197764
ojTune$best.model
##
## Call:
## best.tune(method = svm, train.x = Purchase ~ ., data = ojTrain, ranges = list(cost = c(0.1,
## 0.2, 0.5, 0.8, 1, 2, 3, 5, 8, 10)), kernel = "linear")
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: linear
## cost: 0.5
##
## Number of Support Vectors: 332
Problem 8 - Part E
The training error rate for the best model from part D is 16.5% and the test error rate is approximately 15.6%.
ojPred.train.2 <- predict(ojTune$best.model, ojTrain)
table(ojTrain$Purchase, ojPred.train.2)
## ojPred.train.2
## CH MM
## CH 424 61
## MM 71 244
mean(ojPred.train.2 != ojTrain$Purchase)
## [1] 0.165
ojPred.test.2 <- predict(ojTune$best.model, ojTest)
table(ojTest$Purchase, ojPred.test.2)
## ojPred.test.2
## CH MM
## CH 155 13
## MM 29 73
mean(ojPred.test.2 != ojTest$Purchase)
## [1] 0.1555556
Problem 8 - Part F
The results of the support vector machine with a radial kernel and cost set to 0.01 indicate that there are 634 support vectors out of the total 800 points in the training set, of which 319 belong to the class CH and 315 belong to the class MM.
ojSVM.rad <- svm(Purchase~., data=ojTrain, cost=0.01, kernel="radial")
summary(ojSVM.rad)
##
## Call:
## svm(formula = Purchase ~ ., data = ojTrain, cost = 0.01, kernel = "radial")
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: radial
## cost: 0.01
##
## Number of Support Vectors: 634
##
## ( 319 315 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
The training error rate is approximately 39.38% and the test error rate is approximately 37.78%.
ojPred.rad.train <- predict(ojSVM.rad, ojTrain)
table(ojTrain$Purchase, ojPred.rad.train)
## ojPred.rad.train
## CH MM
## CH 485 0
## MM 315 0
mean(ojPred.rad.train != ojTrain$Purchase)
## [1] 0.39375
ojPred.rad.test <- predict(ojSVM.rad, ojTest)
table(ojTest$Purchase, ojPred.rad.test)
## ojPred.rad.test
## CH MM
## CH 168 0
## MM 102 0
mean(ojPred.rad.test != ojTest$Purchase)
## [1] 0.3777778
The results of the support vector machine with a radial kernel and cost set to a range of values between 0.1 and 10 indicate that the lowest cross-validation error is associated with the model containing cost = 0.5, which has an error rate of 16.75%.
set.seed(1)
ojTune.rad <- tune(svm, Purchase~., data=ojTrain, ranges=list(cost=c(0.1,0.2,0.5,0.8,1,2,3,5,8,10)), kernel="radial")
summary(ojTune.rad)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost
## 0.5
##
## - best performance: 0.1675
##
## - Detailed performance results:
## cost error dispersion
## 1 0.1 0.18625 0.02853482
## 2 0.2 0.18250 0.03238227
## 3 0.5 0.16750 0.02443813
## 4 0.8 0.16875 0.02517301
## 5 1.0 0.17125 0.02128673
## 6 2.0 0.17750 0.02188988
## 7 3.0 0.17625 0.02239947
## 8 5.0 0.18000 0.02220485
## 9 8.0 0.18250 0.02648375
## 10 10.0 0.18625 0.02853482
ojTune.rad$best.model
##
## Call:
## best.tune(method = svm, train.x = Purchase ~ ., data = ojTrain, ranges = list(cost = c(0.1,
## 0.2, 0.5, 0.8, 1, 2, 3, 5, 8, 10)), kernel = "radial")
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: radial
## cost: 0.5
##
## Number of Support Vectors: 407
The training error rate for the best model is 14.75% and the test error rate is approximately 17.78%.
ojPred.rad.train.2 <- predict(ojTune.rad$best.model, ojTrain)
table(ojTrain$Purchase, ojPred.rad.train.2)
## ojPred.rad.train.2
## CH MM
## CH 438 47
## MM 71 244
mean(ojPred.rad.train.2 != ojTrain$Purchase)
## [1] 0.1475
ojPred.rad.test.2 <- predict(ojTune.rad$best.model, ojTest)
table(ojTest$Purchase, ojPred.rad.test.2)
## ojPred.rad.test.2
## CH MM
## CH 150 18
## MM 30 72
mean(ojPred.rad.test.2 != ojTest$Purchase)
## [1] 0.1777778
Problem 8 - Part G
The results of the support vector machine with a polynomial kernel, cost set to 0.01, and degree set to 2 indicate that there are 636 support vectors out of the total 800 points in the training set, of which 321 belong to the class CH and 315 belong to the class MM.
ojSVM.pol <- svm(Purchase~., data=ojTrain, cost=0.01, degree=2, kernel="polynomial")
summary(ojSVM.pol)
##
## Call:
## svm(formula = Purchase ~ ., data = ojTrain, cost = 0.01, degree = 2,
## kernel = "polynomial")
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: polynomial
## cost: 0.01
## degree: 2
## coef.0: 0
##
## Number of Support Vectors: 636
##
## ( 321 315 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
The training error rate is 37.25% and the test error rate is approximately 36.67%.
ojPred.pol.train <- predict(ojSVM.pol, ojTrain)
table(ojTrain$Purchase, ojPred.pol.train)
## ojPred.pol.train
## CH MM
## CH 484 1
## MM 297 18
mean(ojPred.pol.train != ojTrain$Purchase)
## [1] 0.3725
ojPred.pol.test <- predict(ojSVM.pol, ojTest)
table(ojTest$Purchase, ojPred.pol.test)
## ojPred.pol.test
## CH MM
## CH 167 1
## MM 98 4
mean(ojPred.pol.test != ojTest$Purchase)
## [1] 0.3666667
The results of the support vector machine with a polynomial kernel, degree set to 2, and cost set to a range of values between 0.1 and 10 indicate that the lowest cross-validation error is associated with the model containing cost = 3, which has an error rate of approximately 17.63%.
set.seed(1)
ojTune.pol <- tune(svm, Purchase~., data=ojTrain, ranges=list(cost=c(0.1,0.2,0.5,0.8,1,2,3,5,8,10)), degree=2, kernel="polynomial")
summary(ojTune.pol)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost
## 3
##
## - best performance: 0.17625
##
## - Detailed performance results:
## cost error dispersion
## 1 0.1 0.32125 0.05001736
## 2 0.2 0.22625 0.03557562
## 3 0.5 0.20625 0.04050463
## 4 0.8 0.20375 0.04251225
## 5 1.0 0.20250 0.04116363
## 6 2.0 0.18125 0.04177070
## 7 3.0 0.17625 0.03793727
## 8 5.0 0.18250 0.03496029
## 9 8.0 0.18000 0.03395258
## 10 10.0 0.18125 0.02779513
ojTune.pol$best.model
##
## Call:
## best.tune(method = svm, train.x = Purchase ~ ., data = ojTrain, ranges = list(cost = c(0.1,
## 0.2, 0.5, 0.8, 1, 2, 3, 5, 8, 10)), degree = 2, kernel = "polynomial")
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: polynomial
## cost: 3
## degree: 2
## coef.0: 0
##
## Number of Support Vectors: 384
The training error rate for the best model is approximately 15.38% and the test error rate is approximately 20.37%.
ojPred.pol.train.2 <- predict(ojTune.pol$best.model, ojTrain)
table(ojTrain$Purchase, ojPred.pol.train.2)
## ojPred.pol.train.2
## CH MM
## CH 452 33
## MM 90 225
mean(ojPred.pol.train.2 != ojTrain$Purchase)
## [1] 0.15375
ojPred.pol.test.2 <- predict(ojTune.pol$best.model, ojTest)
table(ojTest$Purchase, ojPred.pol.test.2)
## ojPred.pol.test.2
## CH MM
## CH 153 15
## MM 40 62
mean(ojPred.pol.test.2 != ojTest$Purchase)
## [1] 0.2037037
Problem 8 - Part H
Comparing the best model for each kernel, the radial kernel achieves the lowest training error rate (14.75%), but the linear kernel achieves the lowest test error rate (approximately 15.56%). Since test error is the better measure of how the classifiers generalize to new data, the linear kernel appears to give the best results for this problem.
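A minimal sketch (using the tuned models above) that collects the training and test error rates for the three best models side by side:
# Training and test error rates for the best linear, radial, and polynomial models
errRates <- function(fit) c(train = mean(predict(fit, ojTrain) != ojTrain$Purchase),
                            test = mean(predict(fit, ojTest) != ojTest$Purchase))
sapply(list(linear = ojTune$best.model,
            radial = ojTune.rad$best.model,
            polynomial = ojTune.pol$best.model), errRates)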