This problem involves hyperplanes in two dimensions.
Sketch the hyperplane \(1 + 3X_1 - X_2 = 0\). Indicate the set of points for which \(1 + 3X_1 - X_2 > 0\), as well as the set of points for which \(1 + 3X_1 - X_2 < 0\).
library(ggplot2)
# Create a grid of (X1, X2) values
x1 <- seq(-5, 5, length.out = 200)
x2 <- seq(-5, 5, length.out = 200)
grid <- expand.grid(X1 = x1, X2 = x2)
# Compute the hyperplane function and classify regions
# f(X1,X2) = 1 + 3*X1 - X2
grid$value <- 1 + 3 * grid$X1 - grid$X2
grid$region <- factor(ifelse(grid$value > 0, "> 0", "< 0"))
# Plot with ggplot2
ggplot(grid, aes(x = X1, y = X2, fill = region)) +
  # Shade the two half-planes
  geom_raster(alpha = 0.4, interpolate = FALSE) +
  # Draw the hyperplane X2 = 1 + 3*X1: intercept = 1, slope = 3
  geom_abline(intercept = 1, slope = 3, color = "black", size = 1) +
  # Label the regions
  annotate("text", x = 2, y = -4, label = "1+3X1−X2 > 0", hjust = 0) +
  annotate("text", x = -4, y = 4, label = "1+3X1−X2 < 0", hjust = 0) +
  # Custom fill colors
  scale_fill_manual(values = c("> 0" = "steelblue", "< 0" = "tomato")) +
  # Axis labels with subscripts
  labs(
    title = "Hyperplane: 1 + 3 X₁ − X₂ = 0",
    x = expression(X[1]),
    y = expression(X[2]),
    fill = "Sign of\n1+3X1−X2"
  ) +
  theme_minimal()
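A quick way to decide which half-plane is which is to evaluate the left-hand side at a point that is clearly off the line, such as the origin. The sketch below does this in base R; `f1` is a helper name introduced here for illustration and is not part of the code above.

```r
# Left-hand side of the first hyperplane: f1(X1, X2) = 1 + 3*X1 - X2
# (f1 is an illustrative helper name, not from the original solution)
f1 <- function(x1, x2) 1 + 3 * x1 - x2

# The origin gives f1 = 1 > 0, so it lies on the positive side
origin_sign <- sign(f1(0, 0))

# Signs at the two label positions used in the plot: (2, -4) and (-4, 4)
label_signs <- sign(c(f1(2, -4), f1(-4, 4)))

# A point on the line X2 = 1 + 3*X1 evaluates to exactly zero
on_line <- f1(1, 4)
```

Because \(X_2\) enters with a negative coefficient, the positive region is the half-plane below the line \(X_2 = 1 + 3X_1\), and the negative region is the half-plane above it.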
On the same plot, sketch the hyperplane \(-2 + X_1 + 2X_2 = 0\). Indicate the set of points for which \(-2 + X_1 + 2X_2 > 0\), as well as the set of points for which \(-2 + X_1 + 2X_2 < 0\).
# install.packages("ggplot2") # if not already installed
library(ggplot2)
# 1) build a grid of (X1,X2) values
x1 <- seq(-5, 5, length.out = 200)
x2 <- seq(-5, 5, length.out = 200)
grid <- expand.grid(X1 = x1, X2 = x2)
# 2) compute the hyperplane value and region label
grid$value <- -2 + grid$X1 + 2 * grid$X2
grid$region <- factor(ifelse(grid$value > 0, "> 0", "< 0"))
# 3) plot
ggplot(grid, aes(x = X1, y = X2, fill = region)) +
  # raster for shading regions
  geom_raster(alpha = 0.4, interpolate = FALSE) +
  # hyperplane X2 = 1 - X1/2: intercept = 1, slope = -1/2
  geom_abline(intercept = 1, slope = -0.5, color = "black", size = 1) +
  # annotations (the upper-right label is right-aligned so it stays inside the panel)
  annotate("text", x = 4, y = 4, label = "-2+X1+2X2 > 0", hjust = 1) +
  annotate("text", x = -4, y = -4, label = "-2+X1+2X2 < 0", hjust = 0) +
  # custom fills
  scale_fill_manual(values = c("> 0" = "steelblue", "< 0" = "tomato")) +
  labs(
    title = "Hyperplane: -2 + X1 + 2 X2 = 0",
    x = expression(X[1]),
    y = expression(X[2]),
    fill = "Sign of\n-2+X1+2X2"
  ) +
  theme_minimal()
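Since the exercise asks for both hyperplanes on the same plot, the two lines can also be overlaid on one set of axes. The following is a minimal sketch, assuming ggplot2 is installed; the names `lines_df`, `A`, `crossing`, and `both_plot` are introduced here for illustration only.

```r
library(ggplot2)

# Both hyperplanes written as X2 = intercept + slope * X1
lines_df <- data.frame(
  hyperplane = c("1 + 3*X1 - X2 = 0", "-2 + X1 + 2*X2 = 0"),
  intercept  = c(1, 1),
  slope      = c(3, -0.5)
)

# Crossing point: solve 3*X1 - X2 = -1 and X1 + 2*X2 = 2
A <- matrix(c(3, -1,
              1,  2), nrow = 2, byrow = TRUE)
crossing <- solve(A, c(-1, 2))

# Draw both lines on one set of axes and mark where they cross
both_plot <- ggplot() +
  geom_abline(
    data = lines_df,
    aes(intercept = intercept, slope = slope, color = hyperplane)
  ) +
  annotate("point", x = crossing[1], y = crossing[2], size = 3) +
  xlim(-5, 5) +
  ylim(-5, 5) +
  labs(x = expression(X[1]), y = expression(X[2]), color = "Hyperplane") +
  theme_minimal()

print(both_plot)
```

Both lines share the intercept 1, so they cross on the \(X_2\) axis, and together they split the plane into four regions, one for each combination of signs.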
We have seen that in \(p = 2\) dimensions, a linear decision boundary takes the form \(\beta_0 + \beta_1 X_1 + \beta_2 X_2 = 0\). We now investigate a non-linear decision boundary.
Sketch the curve
\[ (1 + X_1)^2 + (2 - X_2)^2 = 4. \]
# Grid of (X1, X2)
x1_vals <- seq(-5, 5, length.out = 200)
x2_vals <- seq(-2, 10, length.out = 200)
grid_a <- expand.grid(X1 = x1_vals, X2 = x2_vals)
grid_a$Z <- (1 + grid_a$X1)^2 + (2 - grid_a$X2)^2
# Contour at Z = 4 (the circle boundary)
ggplot(grid_a, aes(x = X1, y = X2)) +
  geom_contour(aes(z = Z), breaks = 4, color = "black", size = 0.7) +
  coord_equal() +
  theme_minimal() +
  labs(
    title = "Part a: (1 + X1)^2 + (2 − X2)^2 = 4",
    x = expression(X[1]),
    y = expression(X[2])
  )
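The curve is a circle: rewriting the equation as \((X_1 - (-1))^2 + (X_2 - 2)^2 = 2^2\) gives center \((-1, 2)\) and radius 2. As a cross-check on the contour plot, the sketch below generates points on that circle parametrically and confirms they satisfy the original equation. The names `center`, `radius`, `theta`, `circle`, and `on_curve` are illustrative and do not appear in the code above.

```r
# Center and radius read off from (X1 - (-1))^2 + (X2 - 2)^2 = 2^2
center <- c(X1 = -1, X2 = 2)
radius <- 2

# Parametric points on the circle
theta <- seq(0, 2 * pi, length.out = 100)
circle <- data.frame(
  X1 = center[["X1"]] + radius * cos(theta),
  X2 = center[["X2"]] + radius * sin(theta)
)

# Every generated point should satisfy the original equation
lhs <- (1 + circle$X1)^2 + (2 - circle$X2)^2
on_curve <- isTRUE(all.equal(lhs, rep(4, length(lhs))))
```

The `circle` data frame could be drawn with `geom_path()` as an alternative to the contour approach, which avoids evaluating the function on a grid.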
On your sketch, indicate the set of points for which
\[ (1 + X_1)^2 + (2 - X_2)^2 > 4, \]
as well as the set of points for which
\[ (1 + X_1)^2 + (2 - X_2)^2 \le 4. \]
ggplot(grid_a, aes(x = X1, y = X2)) +
  geom_raster(aes(fill = Z <= 4), interpolate = TRUE) +
  # The legend keys for a logical fill are ordered FALSE, TRUE,
  # so the labels are listed in that order
  scale_fill_manual(
    values = c("TRUE" = "lightblue", "FALSE" = "lightgray"),
    labels = c("Outside (> 4)", "Inside (≤ 4)"),
    name = "Region"
  ) +
  geom_contour(aes(z = Z), breaks = 4, color = "black", size = 0.7) +
  coord_equal() +
  theme_minimal() +
  labs(
    title = "Regions Inside vs. Outside",
    x = expression(X[1]),
    y = expression(X[2])
  )
Suppose that a classifier assigns an observation to the blue class if
\[ (1 + X_1)^2 + (2 - X_2)^2 > 4, \]
and to the red class otherwise. To what class is the observation \((0, 0)\) classified? \((-1, 1)\)? \((2, 2)\)? \((3, 8)\)?
# Define test points
test_points <- data.frame(
  X1 = c(0, -1, 2, 3),
  X2 = c(0, 1, 2, 8)
)
# Compute Z and Class
test_points$Z <- (1 + test_points$X1)^2 + (2 - test_points$X2)^2
test_points$Class <- ifelse(test_points$Z > 4, "Blue", "Red")
# Plot boundary + points
ggplot(grid_a, aes(x = X1, y = X2)) +
  geom_raster(aes(fill = Z <= 4), interpolate = TRUE) +
  geom_contour(aes(z = Z), breaks = 4, color = "black", size = 0.7) +
  geom_point(
    data = test_points,
    aes(color = Class, shape = Class),
    size = 3
  ) +
  scale_color_manual(values = c("Blue" = "blue", "Red" = "red")) +
  # guide = "none" hides the fill legend (guide = FALSE is deprecated since ggplot2 3.3.4)
  scale_fill_manual(
    values = c("TRUE" = "lightblue", "FALSE" = "white"),
    guide = "none"
  ) +
  coord_equal() +
  theme_minimal() +
  labs(
    title = "Part c: Classification of Test Points",
    x = expression(X[1]),
    y = expression(X[2])
  )
# Print the results
print(test_points)
##   X1 X2  Z Class
## 1  0  0  5  Blue
## 2 -1  1  1   Red
## 3  2  2  9  Blue
## 4  3  8 52  Blue
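Worked by hand, the same values follow directly from substituting each point into the left-hand side of the boundary equation and comparing with 4:
\[ \begin{aligned} (0, 0):&\quad (1 + 0)^2 + (2 - 0)^2 = 1 + 4 = 5 > 4 &&\Rightarrow \text{blue},\\ (-1, 1):&\quad (1 - 1)^2 + (2 - 1)^2 = 0 + 1 = 1 \le 4 &&\Rightarrow \text{red},\\ (2, 2):&\quad (1 + 2)^2 + (2 - 2)^2 = 9 + 0 = 9 > 4 &&\Rightarrow \text{blue},\\ (3, 8):&\quad (1 + 3)^2 + (2 - 8)^2 = 16 + 36 = 52 > 4 &&\Rightarrow \text{blue}. \end{aligned} \]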
Argue that while the decision boundary in (c) is not linear in terms of \(X_1\) and \(X_2\), it is linear in terms of \(X_1\), \(X_1^2\), \(X_2\), and \(X_2^2\).
grid_d <- grid_a
# Create transformed features
grid_d$X1_sq <- grid_d$X1^2
grid_d$X2_sq <- grid_d$X2^2
# Compute the linear combination: should equal Z - 4
grid_d$linComb <- grid_d$X1_sq + grid_d$X2_sq + 2*grid_d$X1 - 4*grid_d$X2 + 1
# Verify equality
all_equal <- all.equal(grid_d$linComb, grid_d$Z - 4)
print(paste("linComb == Z - 4 everywhere? ", all_equal))
## [1] "linComb == Z - 4 everywhere? TRUE"
# View first few rows
head(grid_d[, c("X1","X2","Z","linComb")])
##          X1 X2        Z  linComb
## 1 -5.000000 -2 32.00000 28.00000
## 2 -4.949749 -2 31.60052 27.60052
## 3 -4.899497 -2 31.20608 27.20608
## 4 -4.849246 -2 30.81670 26.81670
## 5 -4.798995 -2 30.43236 26.43236
## 6 -4.748744 -2 30.05308 26.05308
We start with the non-linear boundary in the original inputs: \[ (1 + X_1)^2 + (2 - X_2)^2 = 4. \]
Expanding and simplifying gives: \[ \begin{aligned} (1 + X_1)^2 + (2 - X_2)^2 &= X_1^2 + 2X_1 + 1 \;+\; X_2^2 - 4X_2 + 4 = 4,\\ \Longrightarrow\quad X_1^2 + X_2^2 + 2X_1 - 4X_2 + 1 &= 0. \end{aligned} \]
Now define new features \[ Z_0 = 1,\quad Z_1 = X_1,\quad Z_2 = X_2,\quad Z_3 = X_1^2,\quad Z_4 = X_2^2. \] In terms of \(\mathbf{Z} = (Z_0,Z_1,Z_2,Z_3,Z_4)\), the boundary becomes the hyperplane \[ 1\cdot Z_3 \;+\; 1\cdot Z_4 \;+\; 2\cdot Z_1 \;+\;(-4)\cdot Z_2 \;+\;1\cdot Z_0 \;=\;0, \] which is linear in \(\mathbf{Z}\), even though it was non-linear in \((X_1,X_2)\).
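Thus, by augmenting our feature space with \(X_1^2\) and \(X_2^2\), the circle in \((X_1, X_2)\) becomes a linear decision boundary (a hyperplane) in the transformed space: the classifier only needs to check the sign of a weighted sum of the features \(X_1\), \(X_2\), \(X_1^2\), and \(X_2^2\) plus a constant.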
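To make this concrete, the sketch below classifies the four test points from part (c) using only a dot product in the expanded feature space. The coefficient vector is read off from the expanded equation above; the names `beta`, `pts`, `Zmat`, `score`, and `class_linear` are introduced here for illustration.

```r
# Coefficients of the hyperplane in the order (1, X1, X2, X1^2, X2^2)
beta <- c(1, 2, -4, 1, 1)

# The four test points from part (c)
pts <- data.frame(X1 = c(0, -1, 2, 3), X2 = c(0, 1, 2, 8))

# Expanded feature matrix: one row per point, one column per feature
Zmat <- cbind(1, pts$X1, pts$X2, pts$X1^2, pts$X2^2)

# Linear score; a positive score means outside the circle (blue class)
score <- as.vector(Zmat %*% beta)
class_linear <- ifelse(score > 0, "Blue", "Red")
```

Each score equals the earlier value of `Z` minus 4, so the linear rule `score > 0` assigns the same classes as the original rule `Z > 4`.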