library(e1071)
library(ggplot2)
library(gridExtra)
set.seed(123)
n <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- ifelse(x1^2 + x2^2 > 1.5, 1, -1)
data <- data.frame(x1 = x1, x2 = x2, y = as.factor(y))
train_idx <- sample(1:n, n * 0.7)
train <- data[train_idx, ]
test <- data[-train_idx, ]
svm_linear <- svm(y ~ ., data = train, kernel = "linear", cost = 1)
svm_poly <- svm(y ~ ., data = train, kernel = "polynomial", cost = 1, degree = 3)
svm_rbf <- svm(y ~ ., data = train, kernel = "radial", cost = 1, gamma = 1)
pred_train_linear <- predict(svm_linear, train)
pred_test_linear <- predict(svm_linear, test)
pred_train_poly <- predict(svm_poly, train)
pred_test_poly <- predict(svm_poly, test)
pred_train_rbf <- predict(svm_rbf, train)
pred_test_rbf <- predict(svm_rbf, test)
train_error <- c(
  linear = mean(pred_train_linear != train$y),
  poly = mean(pred_train_poly != train$y),
  rbf = mean(pred_train_rbf != train$y)
)
test_error <- c(
  linear = mean(pred_test_linear != test$y),
  poly = mean(pred_test_poly != test$y),
  rbf = mean(pred_test_rbf != test$y)
)
print("Training Error Rates:")
## [1] "Training Error Rates:"
print(train_error)
## linear poly rbf
## 0.34285714 0.32857143 0.01428571
print("Test Error Rates:")
## [1] "Test Error Rates:"
print(test_error)
## linear poly rbf
## 0.33333333 0.16666667 0.06666667
# Plot the data colored by class, with the fitted decision boundary overlaid
plot_svm <- function(model, data, title) {
  # Evaluate the model on a dense grid covering the feature space
  grid <- expand.grid(
    x1 = seq(min(data$x1), max(data$x1), length = 100),
    x2 = seq(min(data$x2), max(data$x2), length = 100)
  )
  grid$pred <- predict(model, grid)
  ggplot(data, aes(x = x1, y = x2, color = y)) +
    geom_point(size = 2) +
    # Factor predictions become 1/2 under as.numeric(), so a contour at
    # 1.5 traces the decision boundary
    geom_contour(data = grid, aes(z = as.numeric(pred)), breaks = 1.5, color = "black") +
    ggtitle(title) +
    theme_minimal()
}
p1 <- plot_svm(svm_linear, train, "Linear SVM")
p2 <- plot_svm(svm_poly, train, "Polynomial SVM (degree=3)")
p3 <- plot_svm(svm_rbf, train, "RBF SVM (gamma=1)")
grid.arrange(p1, p2, p3, nrow = 1)
The RBF SVM performs best on both training and test data: the classes were generated with a circular boundary (x1^2 + x2^2 = 1.5), which a linear kernel cannot separate, while the radial kernel's flexible non-linear boundary matches it closely.
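The RBF fit above used fixed values cost = 1 and gamma = 1. As a sketch (the grid values below are illustrative assumptions, not tuned settings), e1071's tune() could cross-validate both hyperparameters on the simulated training set:
# Sketch: 10-fold CV over cost and gamma for the radial kernel (illustrative grid)
tune_rbf <- tune(svm, y ~ ., data = train,
                 kernel = "radial",
                 ranges = list(cost = c(0.1, 1, 10), gamma = c(0.5, 1, 2)))
summary(tune_rbf)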
library(ISLR2)
library(e1071)
library(caret)
## Loading required package: lattice
data(Auto)
Auto1 <- na.omit(Auto)
Auto1$hgm <- as.factor(ifelse(Auto1$mpg > median(Auto1$mpg), 1, 0))
The binary response hgm (high gas mileage, 1 if mpg exceeds its median) is created. This variable will now be used as the response in classification models such as SVM.
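As a quick sanity check (not part of the original output), the median split should give a near 50/50 class balance:
table(Auto1$hgm)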
Auto1 <- Auto1[, !(names(Auto1) %in% c("mpg", "name"))]
set.seed(1234)
ctrl <- trainControl(method = "cv", number = 10)
svm_linear <- train(
  hgm ~ ., data = Auto1,
  method = "svmLinear",
  preProcess = c("center", "scale"),
  trControl = ctrl,
  tuneGrid = expand.grid(C = 10^seq(-2, 3))
)
svm_linear$results
best_linear <- svm_linear$bestTune
results <- svm_linear$results
ggplot(results, aes(x = log10(C), y = Accuracy)) +
  geom_line(color = "blue") +
  geom_point(size = 3, color = "red") +
  labs(title = "SVM Linear Kernel: Cost vs Cross-Validation Accuracy",
       x = "log10(Cost)",
       y = "Cross-Validation Accuracy") +
  theme_minimal()
From this plot, the optimal cost lies around log10(Cost) = 2, i.e., C = 100. The SVM classifier was tuned over a range of cost values using 10-fold cross-validation. The best-performing models are at C = 100 and C = 1000, both achieving an accuracy of 91.3%. Lower values of C such as 0.01 and 0.1 resulted in slightly lower accuracy, indicating underfitting due to a softer margin. Higher C values penalize misclassification more, leading to tighter margins that fit the training data more closely.
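Since C = 100 and C = 1000 tie, a finer grid around that region could be worth checking; the sketch below uses an illustrative grid and reuses the ctrl object defined above:
# Sketch: refine the cost search between log10(C) = 1.5 and 3
svm_linear_fine <- train(hgm ~ ., data = Auto1,
                         method = "svmLinear",
                         preProcess = c("center", "scale"),
                         trControl = ctrl,
                         tuneGrid = expand.grid(C = 10^seq(1.5, 3, by = 0.25)))
svm_linear_fine$bestTune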
grid_radial <- expand.grid(
  sigma = c(0.01, 0.05, 0.1),
  C = c(0.1, 1, 10)
)
svm_radial <- train(
  hgm ~ ., data = Auto1,
  method = "svmRadial",
  preProcess = c("center", "scale"),
  trControl = ctrl,
  tuneGrid = grid_radial
)
grid_poly <- expand.grid(
  degree = c(2, 3),
  scale = 1,
  C = c(0.1, 1, 10)
)
svm_poly <- train(
  hgm ~ ., data = Auto1,
  method = "svmPoly",
  preProcess = c("center", "scale"),
  trControl = ctrl,
  tuneGrid = grid_poly
)
# Comparison
best_radial <- svm_radial$bestTune
best_poly <- svm_poly$bestTune
radial_error <- 1 - max(svm_radial$results$Accuracy)
poly_error <- 1 - max(svm_poly$results$Accuracy)
cat("Best Radial Kernel CV Error:", round(radial_error, 4), "\n")
## Best Radial Kernel CV Error: 0.0741
print(best_radial)
## sigma C
## 6 0.05 10
cat("Best Polynomial Kernel CV Error:", round(poly_error, 4), "\n")
## Best Polynomial Kernel CV Error: 0.074
print(best_poly)
## degree scale C
## 4 3 1 0.1
For the radial kernel, we tuned over multiple values of the cost parameter (C = 0.1, 1, 10) and sigma (0.01, 0.05, 0.1). The best-performing model was obtained with C = 10 and sigma = 0.05, achieving a cross-validation error of 0.0741, which corresponds to 92.59% accuracy. For the polynomial kernel, tuning was performed over degree = 2, 3; C = 0.1, 1, 10; and a fixed scale = 1. The best polynomial SVM used degree = 3, C = 0.1, and scale = 1, producing a slightly better cross-validation error of 0.074, or 92.6% accuracy. The polynomial kernel had a marginal edge, though the two kernels performed almost identically.
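For a side-by-side view, the best cross-validated error of each caret model can be tabulated (a convenience sketch using objects already defined above):
# Collect the best CV error of each tuned caret model into one table
cv_summary <- data.frame(kernel = c("linear", "radial", "polynomial"),
                         cv_error = c(1 - max(svm_linear$results$Accuracy),
                                      radial_error, poly_error))
cv_summary[order(cv_summary$cv_error), ]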
Hint: In the lab, we used the plot() function for svm objects only in cases with p = 2. When p > 2, you can use the plot() function to create plots displaying pairs of variables at a time. Essentially, instead of typing plot(svmfit, dat) where svmfit contains your fitted model and dat is a data frame containing your data, you can type plot(svmfit, dat, x1 ~ x4) in order to plot just the first and fourth variables. However, you must replace x1 and x4 with the correct variable names. To find out more, type ?plot.svm.
# Linear
svmfit_2d <- svm(hgm ~ horsepower + acceleration, data = Auto1,
                 kernel = "linear", cost = 1, scale = TRUE)
plot(svmfit_2d, Auto1, horsepower ~ acceleration,
     main = "Linear SVM Decision Boundary (Horsepower vs Acceleration)")
# Polynomial
svmfit_poly <- svm(hgm ~ horsepower + acceleration, data = Auto1,
                   kernel = "polynomial", degree = 3, cost = 0.1, scale = TRUE)
plot(svmfit_poly, Auto1, horsepower ~ acceleration,
     main = "Polynomial SVM (Degree = 3) Decision Boundary")
# Radial
svmfit_rbf <- svm(hgm ~ horsepower + acceleration, data = Auto1,
                  kernel = "radial", gamma = 0.05, cost = 10, scale = TRUE)
plot(svmfit_rbf, Auto1, horsepower ~ acceleration,
     main = "Radial SVM Decision Boundary")
As the plots above show, the three kernels tell a consistent story: the SVM decision boundary clearly separates high- and low-gas-mileage cars using just two features, horsepower and acceleration. The background shading shows the predicted class regions, while the data points reveal that most misclassifications occur near the boundary, where the feature values overlap.
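Per the hint, the same pairwise plot also works for a model fit on all predictors; the sketch below is illustrative (the variable pair is an arbitrary choice, and plot.svm holds the remaining predictors at its default slice values):
# Sketch: fit on all predictors, then view one pair of variables at a time
svmfit_full <- svm(hgm ~ ., data = Auto1, kernel = "radial",
                   gamma = 0.05, cost = 10, scale = TRUE)
plot(svmfit_full, Auto1, horsepower ~ weight)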
data(OJ)
set.seed(1234)
train_indices <- sample(1:nrow(OJ), 800)
train_data <- OJ[train_indices, ]
test_data <- OJ[-train_indices, ]
svm_oj <- svm(Purchase ~ ., data = train_data, kernel = "linear", cost = 0.01, scale = TRUE)
summary(svm_oj)
##
## Call:
## svm(formula = Purchase ~ ., data = train_data, kernel = "linear",
## cost = 0.01, scale = TRUE)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: linear
## cost: 0.01
##
## Number of Support Vectors: 437
##
## ( 219 218 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
The support vector classifier with a linear kernel and cost = 0.01 uses 437 support vectors out of 800 training observations, nearly evenly split between the two classes (CH and MM). The high number of support vectors suggests that the classes are not easily separable. The low cost encourages a wide margin but may lead to underfitting, which can reduce model accuracy.
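e1071 stores the row indices of the support vectors in the fitted object's index component, so the per-class split reported by summary() can be verified directly:
# Cross-check the per-class support vector counts
table(train_data$Purchase[svm_oj$index])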
train_pred <- predict(svm_oj, train_data)
train_error <- mean(train_pred != train_data$Purchase)
test_pred <- predict(svm_oj, test_data)
test_error <- mean(test_pred != test_data$Purchase)
cat("Training Error Rate:", round(train_error, 4), "\n")
## Training Error Rate: 0.1688
cat("Test Error Rate:", round(test_error, 4), "\n")
## Test Error Rate: 0.1593
The training error rate is 16.88% and the test error rate is 15.93%. The test error being slightly lower than the training error indicates the model is not overfitting; the soft margin imposed by the low cost (cost = 0.01) keeps the fit conservative, and overall performance is reasonable.
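A test-set confusion matrix (built from the predictions already computed) shows how the errors split across the two classes:
# Confusion matrix for the cost = 0.01 linear SVM on the test set
table(Predicted = test_pred, Actual = test_data$Purchase)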
tune_result <- tune(
  svm,
  Purchase ~ ., data = train_data,
  kernel = "linear",
  ranges = list(cost = c(0.01, 0.1, 1, 5, 10)),
  scale = TRUE
)
summary(tune_result)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost
## 0.1
##
## - best performance: 0.17
##
## - Detailed performance results:
## cost error dispersion
## 1 0.01 0.17125 0.03866254
## 2 0.10 0.17000 0.04297932
## 3 1.00 0.17250 0.04401704
## 4 5.00 0.17500 0.04124790
## 5 10.00 0.17375 0.04185375
The best-performing cost was 0.10, with a corresponding cross-validation error of 17.00%. The error rates across all tested values were fairly close, ranging from 17.00% to 17.50%, indicating that performance is not sensitive to the cost parameter within this range; cost = 0.10 slightly outperforms both lower and higher values.
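e1071 provides a plot method for tune objects, which gives a quick visual of this flat error profile across cost values:
# CV error versus cost for the linear-kernel tuning
plot(tune_result)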
best_svm <- tune_result$best.model
train_pred_best <- predict(best_svm, train_data)
train_error_best <- mean(train_pred_best != train_data$Purchase)
test_pred_best <- predict(best_svm, test_data)
test_error_best <- mean(test_pred_best != test_data$Purchase)
cat("Training Error Rate (for cost = 0.1):", round(train_error_best, 4), "\n")
## Training Error Rate (for cost = 0.1): 0.165
cat("Test Error Rate (for cost = 0.1):", round(test_error_best, 4), "\n")
## Test Error Rate (for cost = 0.1): 0.163
Using the model with cost = 0.1, the training error rate is 16.5% and the test error rate is 16.3%. These are very close, indicating that the model is not overfitting and generalizes well to unseen data. Compared to the model with cost = 0.01, the tuned model slightly reduced both training and test errors, demonstrating the benefit of tuning.
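The improvement can be tabulated directly from the error objects computed above (a small convenience sketch):
# Untuned (cost = 0.01) versus tuned (cost = 0.1) linear SVM
data.frame(cost = c(0.01, 0.1),
           train_error = c(train_error, train_error_best),
           test_error = c(test_error, test_error_best))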
svm_radial_01 <- svm(Purchase ~ ., data = train_data, kernel = "radial", cost = 0.01, scale = TRUE)
summary(svm_radial_01)
##
## Call:
## svm(formula = Purchase ~ ., data = train_data, kernel = "radial",
## cost = 0.01, scale = TRUE)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: radial
## cost: 0.01
##
## Number of Support Vectors: 636
##
## ( 319 317 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
train_pred_r01 <- predict(svm_radial_01, train_data)
test_pred_r01 <- predict(svm_radial_01, test_data)
train_err_r01 <- mean(train_pred_r01 != train_data$Purchase)
test_err_r01 <- mean(test_pred_r01 != test_data$Purchase)
cat("Training Error (Radial, cost=0.01):", round(train_err_r01, 4), "\n")
## Training Error (Radial, cost=0.01): 0.3962
cat("Test Error (Radial, cost=0.01):", round(test_err_r01, 4), "\n")
## Test Error (Radial, cost=0.01): 0.3704
set.seed(1234)
tune_radial <- tune(
  svm,
  Purchase ~ ., data = train_data,
  kernel = "radial",
  ranges = list(cost = c(0.01, 0.1, 1, 5, 10)),
  scale = TRUE
)
summary(tune_radial)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost
## 5
##
## - best performance: 0.1875
##
## - Detailed performance results:
## cost error dispersion
## 1 0.01 0.39625 0.05466120
## 2 0.10 0.20625 0.05212498
## 3 1.00 0.18875 0.04267529
## 4 5.00 0.18750 0.03118048
## 5 10.00 0.20000 0.03632416
best_radial_svm <- tune_radial$best.model
train_pred_best_r <- predict(best_radial_svm, train_data)
test_pred_best_r <- predict(best_radial_svm, test_data)
train_err_best_r <- mean(train_pred_best_r != train_data$Purchase)
test_err_best_r <- mean(test_pred_best_r != test_data$Purchase)
cat("Training Error (Best Radial):", round(train_err_best_r, 4), "\n")
## Training Error (Best Radial): 0.1475
cat("Test Error (Best Radial):", round(test_err_best_r, 4), "\n")
## Test Error (Best Radial): 0.163
Using an initial cost of 0.01, the model gave a training error of 39.62% and a test error of 37.04%, indicating underfitting due to an overly soft margin. After tuning, the best performance was achieved with cost = 5, yielding a cross-validation error of 18.75%.
The tuned model generalizes much better, with a training error of 14.75% and a test error of 16.3%. These results clearly beat the initial radial SVM and edge out the tuned linear SVM on training error while matching its test error, suggesting that the radial kernel's more flexible boundary captures non-linear structure without hurting generalization.
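One way to see the effect of the larger cost is the support vector count: the tuned model should use noticeably fewer than the 636 support vectors reported at cost = 0.01. Using the fitted object's index component:
# Support vector count for the tuned radial model (cost = 5)
length(best_radial_svm$index)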
svm_poly_01 <- svm(Purchase ~ ., data = train_data,
                   kernel = "polynomial", degree = 2,
                   cost = 0.01, scale = TRUE)
summary(svm_poly_01)
##
## Call:
## svm(formula = Purchase ~ ., data = train_data, kernel = "polynomial",
## degree = 2, cost = 0.01, scale = TRUE)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: polynomial
## cost: 0.01
## degree: 2
## coef.0: 0
##
## Number of Support Vectors: 640
##
## ( 323 317 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
train_pred_poly01 <- predict(svm_poly_01, train_data)
test_pred_poly01 <- predict(svm_poly_01, test_data)
train_err_poly01 <- mean(train_pred_poly01 != train_data$Purchase)
test_err_poly01 <- mean(test_pred_poly01 != test_data$Purchase)
cat("Training Error (Poly, cost=0.01):", round(train_err_poly01, 4), "\n")
## Training Error (Poly, cost=0.01): 0.3825
cat("Test Error (Poly, cost=0.01):", round(test_err_poly01, 4), "\n")
## Test Error (Poly, cost=0.01): 0.3407
set.seed(1234)
tune_poly <- tune(
  svm,
  Purchase ~ ., data = train_data,
  kernel = "polynomial",
  degree = 2,
  ranges = list(cost = c(0.01, 0.1, 1, 5, 10)),
  scale = TRUE
)
summary(tune_poly)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost
## 10
##
## - best performance: 0.18375
##
## - Detailed performance results:
## cost error dispersion
## 1 0.01 0.39625 0.06096732
## 2 0.10 0.34250 0.04901814
## 3 1.00 0.20250 0.03525699
## 4 5.00 0.18625 0.04185375
## 5 10.00 0.18375 0.03283481
best_poly <- tune_poly$best.model
train_pred_best_poly <- predict(best_poly, train_data)
test_pred_best_poly <- predict(best_poly, test_data)
train_err_best_poly <- mean(train_pred_best_poly != train_data$Purchase)
test_err_best_poly <- mean(test_pred_best_poly != test_data$Purchase)
cat("Training Error (Best Poly):", round(train_err_best_poly, 4), "\n")
## Training Error (Best Poly): 0.1562
cat("Test Error (Best Poly):", round(test_err_best_poly, 4), "\n")
## Test Error (Best Poly): 0.1556
With an initial cost of 0.01, the model resulted in a training error of 38.25% and a test error of 34.07%, indicating significant underfitting due to a soft margin and limited model flexibility. The optimal cost was found to be 10, yielding a cross-validation error of 18.38%. The tuned model achieved a training error of 15.62% and a test error of 15.56%, a substantial improvement over the initial polynomial model and slightly better than both the tuned linear and radial kernel SVMs.
A: Based on the test error rates after tuning each model, the polynomial kernel (degree = 2) gave the best overall performance. It achieved the lowest test error rate of 15.56%, slightly outperforming both the radial SVM (16.3%) and the linear SVM (16.3%).
While all three models performed similarly after tuning, the polynomial kernel offered a better balance between flexibility and generalization, suggesting that the relationship between predictors and the target (Purchase) is non-linear, but does not require overly complex boundaries. Therefore, the polynomial SVM (degree = 2, cost = 10) seems to be the most effective approach.
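As a closing summary, the tuned test errors can be collected into one table (a convenience sketch using objects defined above):
# Final test-error comparison of the three tuned SVMs on the OJ data
data.frame(kernel = c("linear", "radial", "polynomial (degree 2)"),
           test_error = c(test_error_best, test_err_best_r, test_err_best_poly))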