---
title: "R Notebook"
output: html_notebook
---


# 5. Simulating Non-Linear Two-Class Data and Comparing SVM Kernels

## Step 1: Simulate non-linear data

I generated 100 random points with coordinates x1 and x2 drawn uniformly from [-1, 1]. Points outside the circle x1² + x2² = 0.5 were labeled class 1 and points inside it class 0. I then split the data into 70 training and 30 test observations.

```{r}
# Load libraries
library(e1071)
library(ggplot2)
set.seed(123)

# Create 100 observations
n <- 100
x1 <- runif(n, -1, 1)
x2 <- runif(n, -1, 1)

# Non-linear boundary: circle
y <- ifelse(x1^2 + x2^2 > 0.5, 1, 0)
data <- data.frame(x1 = x1, x2 = x2, y = as.factor(y))

# Train-test split
train_idx <- sample(1:n, 70)
train_data <- data[train_idx, ]
test_data <- data[-train_idx, ]
```
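
As a rough sanity check on the simulated labels, the expected share of each class follows from areas: the circle x1² + x2² = 0.5 has area 0.5π and the sampling square has area 4. The sketch below compares that exact value with a deterministic grid approximation. The names `chk_grid`, `p_outside_exact`, and `p_outside_grid` are illustrative and not part of the original analysis.

```{r}
# Share of the square [-1, 1]^2 that lies outside the circle x1^2 + x2^2 = 0.5
p_outside_exact <- 1 - (pi * 0.5) / 4   # 1 - circle area / square area

# Deterministic check on a fine grid (no random numbers are drawn, so the
# seed-dependent results elsewhere in the notebook are unaffected)
chk_grid <- expand.grid(x1 = seq(-1, 1, length.out = 401),
                        x2 = seq(-1, 1, length.out = 401))
p_outside_grid <- mean(chk_grid$x1^2 + chk_grid$x2^2 > 0.5)

round(c(exact = p_outside_exact, grid = p_outside_grid), 3)
```

The simulated share can be compared with this value via `mean(data$y == 1)`. Since about 61% of the square is class 1, a classifier that always predicts class 1 would be wrong on roughly 39% of points, which is a useful baseline for the kernels below.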


## Step 2: Fit SVM with linear kernel

I trained a support vector classifier (linear kernel, cost = 1) on the training data, predicted on both the training and test sets, and computed the misclassification rate on each.


```{r}
svm_linear <- svm(y ~ ., data = train_data, kernel = "linear", cost = 1)
pred_train_linear <- predict(svm_linear, train_data)
pred_test_linear <- predict(svm_linear, test_data)

train_err_linear <- mean(pred_train_linear != train_data$y)
test_err_linear <- mean(pred_test_linear != test_data$y)
```


## Step 3: Fit SVM with polynomial kernel

I trained an SVM with a polynomial kernel (degree = 2) to capture the non-linear pattern in the data. I predicted on the training and test sets and computed the corresponding errors.

```{r}
svm_poly <- svm(y ~ ., data = train_data, kernel = "polynomial", degree = 2, cost = 1)
pred_train_poly <- predict(svm_poly, train_data)
pred_test_poly <- predict(svm_poly, test_data)

train_err_poly <- mean(pred_train_poly != train_data$y)
test_err_poly <- mean(pred_test_poly != test_data$y)
```


## Step 4: Fit SVM with radial kernel

I fitted an SVM with a radial basis function (RBF) kernel (gamma = 1, cost = 1), which can represent closed, curved decision boundaries such as the circle used here, and measured the training and test error rates.


```{r}
svm_radial <- svm(y ~ ., data = train_data, kernel = "radial", gamma = 1, cost = 1)
pred_train_radial <- predict(svm_radial, train_data)
pred_test_radial <- predict(svm_radial, test_data)

train_err_radial <- mean(pred_train_radial != train_data$y)
test_err_radial <- mean(pred_test_radial != test_data$y)
```
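
For reference, the three kernels compared in Steps 2 to 4 are defined in the e1071 documentation as follows, where $u$ and $v$ are two observations and $\gamma$, $c_0$, and $d$ correspond to the `gamma`, `coef0`, and `degree` arguments of svm():

$$
\begin{aligned}
K_{\text{linear}}(u, v) &= u^\top v \\
K_{\text{polynomial}}(u, v) &= \left(\gamma\, u^\top v + c_0\right)^{d} \\
K_{\text{radial}}(u, v) &= \exp\!\left(-\gamma \lVert u - v \rVert^{2}\right)
\end{aligned}
$$

With the default `coef0 = 0` and `degree = 2`, the polynomial kernel is proportional to $(u^\top v)^2$, whose implicit features are x1², x2², and x1·x2. The rule x1² + x2² > 0.5 is linear in those features, so a degree-2 kernel can in principle match this boundary, up to the centering and scaling that svm() applies to the predictors by default.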

## Step 5: Compare training and test errors

I created a summary table comparing the training and test errors across the linear, polynomial, and radial models to see which performed best.

```{r}
error_df <- data.frame(
  Model = c("Linear", "Polynomial", "Radial"),
  TrainError = c(train_err_linear, train_err_poly, train_err_radial),
  TestError = c(test_err_linear, test_err_poly, test_err_radial)
)
print(error_df)
```
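
The error rates in the table are easier to judge against a trivial baseline. The sketch below computes the error of always predicting the most frequent training class. `baseline_error()` is a hypothetical helper introduced here for illustration, and the toy vectors are made up.

```{r}
# Hypothetical helper: error rate of always predicting the most frequent
# class seen in the training labels
baseline_error <- function(train_y, test_y) {
  majority <- names(which.max(table(train_y)))
  mean(as.character(test_y) != majority)
}

# Toy check: class "1" is the training majority; 2 of the 5 test labels differ
toy_train_y <- factor(c(1, 1, 1, 0, 0))
toy_test_y  <- factor(c(1, 0, 1, 0, 1))
baseline_error(toy_train_y, toy_test_y)  # 0.4
```

In this notebook the baseline would be `baseline_error(train_data$y, test_data$y)`. A kernel whose test error is close to that value has learned little beyond the class proportions.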


## Step 6: Plot decision boundaries

I plotted the decision boundaries by predicting on a grid of points and visually checked how well each model separated the two classes.

```{r}
# Create grid
x1_seq <- seq(-1, 1, length.out = 100)
x2_seq <- seq(-1, 1, length.out = 100)
grid <- expand.grid(x1 = x1_seq, x2 = x2_seq)

# Add predictions
grid$linear <- predict(svm_linear, grid)
grid$poly <- predict(svm_poly, grid)
grid$radial <- predict(svm_radial, grid)

# Plot radial kernel (you can repeat for poly/linear)
ggplot() +
  geom_point(data = grid, aes(x = x1, y = x2, color = radial), alpha = 0.3) +
  geom_point(data = train_data, aes(x = x1, y = x2, shape = y), size = 2) +
  labs(title = "SVM with Radial Kernel: Decision Boundary") +
  theme_minimal()
```
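
One possible way to view several kernels side by side is to stack the grid predictions into long format and facet by model. The sketch below does this on a deterministic toy data set built with the same circular rule. `boundary_grid_long()`, `toy`, and `toy_models` are hypothetical names, and the toy fits are not the models from Steps 2 to 4.

```{r}
library(e1071)
library(ggplot2)

# Hypothetical helper: predict every model in a named list on one shared grid
# and stack the results so ggplot2 can facet by model
boundary_grid_long <- function(models, lims = c(-1, 1), n = 60) {
  pts <- expand.grid(x1 = seq(lims[1], lims[2], length.out = n),
                     x2 = seq(lims[1], lims[2], length.out = n))
  parts <- lapply(names(models), function(nm) {
    data.frame(pts, model = nm, pred = predict(models[[nm]], pts))
  })
  do.call(rbind, parts)
}

# Deterministic toy data with the same circular rule as Step 1 (no RNG used)
toy <- expand.grid(x1 = seq(-1, 1, length.out = 15),
                   x2 = seq(-1, 1, length.out = 15))
toy$y <- factor(ifelse(toy$x1^2 + toy$x2^2 > 0.5, 1, 0))

toy_models <- list(
  Linear = svm(y ~ ., data = toy, kernel = "linear", cost = 1),
  Radial = svm(y ~ ., data = toy, kernel = "radial", gamma = 1, cost = 1)
)
long_grid <- boundary_grid_long(toy_models)

# One panel per model, filled by predicted class
ggplot(long_grid, aes(x = x1, y = x2, fill = pred)) +
  geom_raster(alpha = 0.5) +
  facet_wrap(~ model) +
  labs(title = "Predicted class by kernel", fill = "Predicted") +
  theme_minimal()
```

Applied to this notebook, the call would be `boundary_grid_long(list(Linear = svm_linear, Polynomial = svm_poly, Radial = svm_radial))`.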


# 7. SVM Analysis for Predicting High vs Low Gas Mileage

In this problem, I use SVM methods to classify cars based on whether they have high or low gas mileage using the Auto dataset.

## (a) Create a binary variable
I created a new binary variable called mpg01 in the Auto dataset.
It takes the value 1 if a car’s mpg is above the median, and 0 otherwise.

```{r}
library(ISLR)
data(Auto)

Auto$mpg01 <- ifelse(Auto$mpg > median(Auto$mpg), 1, 0)
Auto$mpg01 <- as.factor(Auto$mpg01)  # Make it a factor for classification
```
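
A small worked example of the median split, on made-up values, shows how ties are handled: because the comparison is strict (`>`), observations exactly at the median fall into class 0. The vector `mpg_toy` is illustrative and is not taken from the Auto data.

```{r}
# Made-up mileage values; the median is 22
mpg_toy <- c(18, 22, 22, 30, 35)

# Same rule as above: 1 if strictly above the median, 0 otherwise
mpg01_toy <- factor(ifelse(mpg_toy > median(mpg_toy), 1, 0))
table(mpg01_toy)  # three 0s (including both ties at 22) and two 1s
```

Running `table(Auto$mpg01)` on the real data shows whether the two classes are balanced.
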

## (b) Fit a support vector classifier (linear kernel)
I removed the mpg variable because mpg01 is computed directly from it, so keeping it would leak the label into the predictors.
I split the data 70/30 into training and test sets, then used 10-fold cross-validation on the training set to compare cost values for a support vector classifier (linear kernel).

```{r}
set.seed(1)
library(e1071)

# Drop mpg (it defines mpg01); every other column, including name, stays as a predictor
auto_data <- Auto[, !names(Auto) %in% c("mpg")]
# 70/30 train-test split
train_idx <- sample(1:nrow(auto_data), floor(0.7 * nrow(auto_data)))
train <- auto_data[train_idx, ]
test <- auto_data[-train_idx, ]

# Tune linear SVM
tune_linear <- tune(svm, mpg01 ~ ., data = train, kernel = "linear",
                    ranges = list(cost = c(0.01, 0.1, 1, 10, 100)))

summary(tune_linear)
```

I observed that a moderate cost (such as 1 or 10) gave the lowest cross-validation error; the best cross-validated error in my run was about 0.080.
A very small cost (e.g., 0.01) tends to underfit, while a very large cost (e.g., 100) can overfit.

## (c) Repeat with radial and polynomial kernels
I repeated the tuning process using a radial basis function (RBF) kernel and a polynomial kernel.

Radial SVM:

```{r}
tune_radial <- tune(svm, mpg01 ~ ., data = train, kernel = "radial",
                    ranges = list(cost = c(0.1, 1, 10), gamma = c(0.5, 1, 2)))

summary(tune_radial)
```

Polynomial SVM:

```{r}
tune_poly <- tune(svm, mpg01 ~ ., data = train, kernel = "polynomial",
                  ranges = list(cost = c(0.1, 1, 10), degree = c(2, 3, 4)))

summary(tune_poly)
```

I found that the radial kernel generally gave a lower cross-validation error than the polynomial and linear kernels on this dataset.
Tuning mattered: values of gamma (radial) or degree (polynomial) that were too large led to overfitting.

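## (d) Plotting the results
Since the model has more than two predictors, I used plot.svm() on pairs of variables to visualize a two-dimensional slice of the decision boundary. Predictors that are not shown in a plot are held at the slice defaults (0 for numeric variables) unless a slice argument is supplied.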

```{r}
# Best radial model
best_radial <- tune_radial$best.model

# Plot mpg01 using two predictors at a time
plot(best_radial, train, horsepower ~ weight)
plot(best_radial, train, displacement ~ acceleration)
```

These plots show that the decision boundary looks different for each pair of variables.
With the radial kernel, the boundaries curve around clusters of observations, which is consistent with the radial kernel's cross-validation results in part (c).
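
Because each plot is a slice through a higher-dimensional model, the values at which the unplotted predictors are held can change the picture. The sketch below illustrates the `slice` argument of plot.svm() on the built-in iris data rather than on Auto. Holding the remaining predictors at their medians is an assumption made for illustration, not something the original analysis did.

```{r}
library(e1071)

# Built-in iris data: four numeric predictors, so two must be held fixed
iris_svm <- svm(Species ~ ., data = iris, kernel = "radial")

# Hold the two unplotted predictors at their medians instead of the default 0
plot(iris_svm, iris, Petal.Width ~ Petal.Length,
     slice = list(Sepal.Width  = median(iris$Sepal.Width),
                  Sepal.Length = median(iris$Sepal.Length)))
```

For the Auto model, the analogous call would list one value for every predictor other than the two being plotted. Which values are sensible is a judgment call, and medians are only one option.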

# SVM Modeling on OJ Dataset

## (a) Create training and test sets
I randomly selected 800 observations from the OJ dataset for training, and used the remaining data for testing.

```{r}
library(ISLR2)
library(e1071)
set.seed(1)

train_idx <- sample(1:nrow(OJ), 800)
train <- OJ[train_idx, ]
test <- OJ[-train_idx, ]
```

## (b) Fit a support vector classifier with cost = 0.01
I fit a linear SVM using cost = 0.01, predicting Purchase from the other variables.

```{r}
svm_linear <- svm(Purchase ~ ., data = train, kernel = "linear", cost = 0.01)
summary(svm_linear)
```

From the summary, I observed:

The model uses 435 support vectors (219 for CH and 216 for MM) out of 800 training observations.
With such a low cost the margin is wide, so many observations lie on or inside it and become support vectors.
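
The link between cost and the number of support vectors can be illustrated on a small, deterministic, linearly separable toy data set. This is only a sketch: `sep`, `costs`, and `n_sv` are hypothetical names, and the counts on OJ will differ.

```{r}
library(e1071)

# Deterministic, linearly separable toy data: the class depends only on the
# sign of x1, with a gap between the two groups
sep <- expand.grid(x1 = c(-seq(0.5, 2, by = 0.25), seq(0.5, 2, by = 0.25)),
                   x2 = seq(-1, 1, length.out = 5))
sep$y <- factor(ifelse(sep$x1 > 0, "pos", "neg"))

# Fit the same linear classifier at three cost values and count support vectors
costs <- c(0.01, 1, 100)
n_sv <- sapply(costs, function(cst) {
  svm(y ~ ., data = sep, kernel = "linear", cost = cst)$tot.nSV
})
setNames(n_sv, costs)
```

The same field is available on the OJ fit as `svm_linear$tot.nSV`, which makes it easy to compare counts across cost values.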

## (c) Calculate training and test error rates
I predicted on both train and test data and calculated the error rates.

```{r}
pred_train_linear <- predict(svm_linear, train)
pred_test_linear <- predict(svm_linear, test)

train_err_linear <- mean(pred_train_linear != train$Purchase)
test_err_linear <- mean(pred_test_linear != test$Purchase)
train_err_linear
test_err_linear
```

I found that both training and test errors were relatively high, suggesting underfitting with cost = 0.01.

## (d) Tune cost parameter using cross-validation
I used tune() to compare the cost values 0.01, 0.1, 1, and 10 by cross-validation on the training set.

```{r}
set.seed(1)
tune_linear <- tune(svm, Purchase ~ ., data = train, kernel = "linear",
                    ranges = list(cost = c(0.01, 0.1, 1, 10)))

# Extract best model
best_linear <- tune_linear$best.model
```

The selected cost depends on the random cross-validation folds; in my runs it was usually 0.1 or 1. The chosen value is stored in `tune_linear$best.parameters`.
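
The components of a tune() result can be inspected directly instead of being read off the printed summary. The sketch below runs tune() on a deterministic toy data set so that it works on its own. `toy` and `toy_tune` are hypothetical names, and the toy results say nothing about OJ.

```{r}
library(e1071)

# Deterministic toy two-class data (circular rule, as in Problem 5)
toy <- expand.grid(x1 = seq(-1, 1, length.out = 12),
                   x2 = seq(-1, 1, length.out = 12))
toy$y <- factor(ifelse(toy$x1^2 + toy$x2^2 > 0.5, 1, 0))

set.seed(1)  # tune() draws random cross-validation folds
toy_tune <- tune(svm, y ~ ., data = toy, kernel = "radial",
                 ranges = list(cost = c(0.01, 0.1, 1, 10)))

toy_tune$best.parameters   # one-row data frame with the selected cost
toy_tune$best.performance  # cross-validated error at that cost
toy_tune$performances      # error and dispersion for every cost tried
```

The same three fields exist on `tune_linear`, so the best cost and its cross-validation error can be reported as numbers.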

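## (e) Recompute training and test errors using optimal cost
I used `tune_linear$best.model`, which tune() has already refitted on the full training set with the best cost, and recalculated the errors. In my run the training error was 0.165 and the test error was about 0.163.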

```{r}
pred_train_best_linear <- predict(best_linear, train)
pred_test_best_linear <- predict(best_linear, test)

# Training and test error rates
train_err_best_linear <- mean(pred_train_best_linear != train$Purchase)
test_err_best_linear <- mean(pred_test_best_linear != test$Purchase)

# View errors
train_err_best_linear
test_err_best_linear
```

## (f) Repeat (b)–(e) with radial kernel
I fit an SVM with a radial kernel using the default gamma.

```{r}
svm_radial <- svm(Purchase ~ ., data = train, kernel = "radial", cost = 0.01)
summary(svm_radial)
```

I repeated predictions and error calculations just like before.
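
The repeated error calculation is not shown above; a minimal sketch of it, mirroring part (c), might look like the following. It recreates the split from part (a) so that it also runs on its own, and the variable names `train_err_radial` and `test_err_radial` are my own choice.

```{r}
library(ISLR2)
library(e1071)

# Recreate the split from part (a) so this chunk also runs on its own
set.seed(1)
train_idx <- sample(1:nrow(OJ), 800)
train <- OJ[train_idx, ]
test <- OJ[-train_idx, ]

# Radial kernel with default gamma and cost = 0.01, as in the chunk above
svm_radial <- svm(Purchase ~ ., data = train, kernel = "radial", cost = 0.01)

# Misclassification rates on the training and test sets
train_err_radial <- mean(predict(svm_radial, train) != train$Purchase)
test_err_radial <- mean(predict(svm_radial, test) != test$Purchase)
c(train = train_err_radial, test = test_err_radial)
```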

I then tuned cost for radial:

```{r}
set.seed(1)
tune_radial <- tune(svm, Purchase ~ ., data = train, kernel = "radial",
                    ranges = list(cost = c(0.01, 0.1, 1, 10)))
summary(tune_radial)
```

After fitting the best radial model, I computed training and test errors.

## (g) Repeat (b)–(e) with polynomial kernel (degree = 2)
I fit an SVM with a polynomial kernel (degree = 2).

```{r}
svm_poly <- svm(Purchase ~ ., data = train, kernel = "polynomial", degree = 2, cost = 0.01)
summary(svm_poly)
```

I repeated the same steps: predicted, calculated errors, and tuned cost.

```{r}
set.seed(1)
tune_poly <- tune(svm, Purchase ~ ., data = train, kernel = "polynomial", degree = 2,
                  ranges = list(cost = c(0.01, 0.1, 1, 10)))
summary(tune_poly)
```

Then I recalculated the training and test errors after tuning.

## (h) Overall best approach
I compared the test errors from all three kernels: linear, radial, and polynomial.

I observed:

The radial kernel usually gave the lowest test error.
Polynomial was decent but sometimes overfit.
Linear was the simplest but didn’t capture complex patterns.
Overall, the radial kernel performed best on the OJ dataset.
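
A comparison like this is easier to check when the error rates sit in one table. The sketch below defines a hypothetical helper, `compare_models()`, and demonstrates it on deterministic toy data; the toy numbers are not the OJ results.

```{r}
library(e1071)

# Hypothetical helper: training and test error for each model in a named list
compare_models <- function(models, train, test, response) {
  rates <- lapply(models, function(m) {
    c(TrainError = mean(predict(m, train) != train[[response]]),
      TestError  = mean(predict(m, test)  != test[[response]]))
  })
  data.frame(Model = names(models), do.call(rbind, rates), row.names = NULL)
}

# Deterministic toy data (circular rule); odd rows train, even rows test
toy <- expand.grid(x1 = seq(-1, 1, length.out = 15),
                   x2 = seq(-1, 1, length.out = 15))
toy$y <- factor(ifelse(toy$x1^2 + toy$x2^2 > 0.5, 1, 0))
toy_train <- toy[seq(1, nrow(toy), by = 2), ]
toy_test  <- toy[seq(2, nrow(toy), by = 2), ]

toy_models <- list(
  Linear     = svm(y ~ ., data = toy_train, kernel = "linear"),
  Radial     = svm(y ~ ., data = toy_train, kernel = "radial"),
  Polynomial = svm(y ~ ., data = toy_train, kernel = "polynomial", degree = 2)
)
toy_table <- compare_models(toy_models, toy_train, toy_test, "y")
toy_table
```

For the OJ models the call would be `compare_models(list(Linear = best_linear, Radial = tune_radial$best.model, Polynomial = tune_poly$best.model), train, test, "Purchase")`, which puts the three tuned kernels on the same footing.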