1. This problem involves hyperplanes in two dimensions.

a). Sketch the hyperplane \(1 + 3X_1 - X_2 = 0\). Indicate the set of points for which \(1 + 3X_1 - X_2 > 0\), as well as the set of points for which \(1 + 3X_1 - X_2 < 0\).

b). On the same plot, sketch the hyperplane \(-2 + X_1 + 2X_2 = 0\). Indicate the set of points for which \(-2 + X_1 + 2X_2 > 0\), as well as the set of points for which \(-2 + X_1 + 2X_2 < 0\).
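To plot each hyperplane (a line, since \(p = 2\)), solve each equation for \(X_2\):

\[ 1 + 3X_1 - X_2 = 0 \;\Rightarrow\; X_2 = 1 + 3X_1 \]

\[ -2 + X_1 + 2X_2 = 0 \;\Rightarrow\; X_2 = \frac{2 - X_1}{2} \]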

# Set up an empty plotting window large enough to show both lines
plot(NULL, xlim = c(-2, 3), ylim = c(-2, 5), xlab = "X1", ylab = "X2",
     main = "Exercise 1: Hyperplanes and Regions")
grid()

# Draw the line X2 = 1 + 3*X1, i.e. the hyperplane 1 + 3X1 - X2 = 0
curve(3 * x + 1, from = -2, to = 3, col = "red", lwd = 2, add = TRUE)
text(2, 3, "1 + 3X1 - X2 = 0", col = "red", pos = 4)

# Shade the region below the line, where 1 + 3X1 - X2 > 0. The vertices
# follow the boundary clipped to the window: the line crosses y = -2 at
# x = -1 and y = 5 at x = 4/3, which avoids a self-intersecting polygon.
polygon(x = c(-1, 4/3, 3, 3), y = c(-2, 5, 5, -2), border = NA, col = rgb(1, 0, 0, 0.1))

text(0, -1.5, "1 + 3X1 - X2 > 0", col = "red")
text(0, 4.5, "1 + 3X1 - X2 < 0", col = "red")

# Draw the line X2 = (2 - X1)/2, i.e. the hyperplane -2 + X1 + 2X2 = 0
curve((2 - x) / 2, from = -2, to = 3, col = "blue", lwd = 2, add = TRUE)
text(-1, 2.5, "-2 + X1 + 2X2 = 0", col = "blue", pos = 4)

# Shade the region below the second line, where -2 + X1 + 2X2 < 0
polygon(x = c(-2, 3, 3, -2), y = c(-2, -2, (2 - 3)/2, (2 - (-2))/2), border = NA, col = rgb(0, 0, 1, 0.1))

text(-1.5, 0, "-2 + X1 + 2X2 < 0", col = "blue")
text(1.5, 4.5, "-2 + X1 + 2X2 > 0", col = "blue")
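
As a quick sanity check on the region labels (a small verification sketch added here, not part of the original plot code), we can evaluate each expression at the coordinates where the labels are drawn and confirm the signs match:

h1 <- function(X1, X2) 1 + 3 * X1 - X2   # first hyperplane
h2 <- function(X1, X2) -2 + X1 + 2 * X2  # second hyperplane
h1(0, -1.5)   # 2.5  > 0, matching the "> 0" label below the red line
h1(0, 4.5)    # -3.5 < 0, matching the "< 0" label above it
h2(-1.5, 0)   # -3.5 < 0, matching the "< 0" label below the blue line
h2(1.5, 4.5)  # 8.5  > 0, matching the "> 0" label above it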

2. We have seen that in \(p = 2\) dimensions, a linear decision boundary takes the form \(\beta_0 + \beta_1 X_1 + \beta_2 X_2 = 0\). We now investigate a non-linear decision boundary.

a). Sketch the curve \((1 + X_1)^2 + (2 - X_2)^2 = 4\).
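
Before plotting, note that this equation is the standard form of a circle centered at \((-1, 2)\) with radius \(2\), since \((2 - X_2)^2 = (X_2 - 2)^2\):

\[ (X_1 - (-1))^2 + (X_2 - 2)^2 = 2^2 \]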

library(ggplot2)
# Evaluate the decision function on a fine grid of (X1, X2) values
x1 <- seq(-4, 3, length.out = 200)
x2 <- seq(-2, 6, length.out = 200)
grid <- expand.grid(X1 = x1, X2 = x2)

grid$decision <- (1 + grid$X1)^2 + (2 - grid$X2)^2

ggplot(grid, aes(x = X1, y = X2)) +
  geom_contour(aes(z = decision), breaks = 4, color = "black") +
  labs(title = "Circle: (1 + X1)^2 + (2 - X2)^2 = 4",
       x = "X1", y = "X2") +
  theme_minimal()
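
One optional refinement (a presentation choice, not required by the exercise): adding `coord_fixed()` forces equal scaling on both axes, so the circle is drawn round rather than as an ellipse.

last_plot() + coord_fixed()  # re-render the previous plot with equal axis units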

b). On your sketch, indicate the set of points for which \((1 + X_1)^2 + (2 - X_2)^2 > 4\), as well as the set of points for which \((1 + X_1)^2 + (2 - X_2)^2 \le 4\).

grid$class <- ifelse(grid$decision > 4, "Blue Region (> 4)", "Red Region (≤ 4)")

ggplot(grid, aes(x = X1, y = X2)) +
  geom_tile(aes(fill = class), alpha = 0.3) +
  geom_contour(aes(z = decision), breaks = 4, color = "black") +
  scale_fill_manual(values = c("Red Region (≤ 4)" = "red", "Blue Region (> 4)" = "skyblue")) +
  labs(title = "Shaded Regions Based on Circle Inequality",
       x = "X1", y = "X2", fill = "Region") +
  theme_minimal()

c). Suppose that a classifier assigns an observation to the blue class if \((1 + X_1)^2 + (2 - X_2)^2 > 4\), and to the red class otherwise. To what class is the observation \((0, 0)\) classified? \((-1, 1)\)? \((2, 2)\)? \((3, 8)\)?

points <- data.frame(
  X1 = c(0, -1, 2, 3),
  X2 = c(0, 1, 2, 8)
)

# Compute the decision value for each point and apply the classification rule
points$decision_value <- (1 + points$X1)^2 + (2 - points$X2)^2
points$class <- ifelse(points$decision_value > 4, "Blue", "Red")
points
##   X1 X2 decision_value class
## 1  0  0              5  Blue
## 2 -1  1              1   Red
## 3  2  2              9  Blue
## 4  3  8             52  Blue
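The classifications can be verified by hand:

\[ (1+0)^2 + (2-0)^2 = 1 + 4 = 5 > 4 \Rightarrow \text{Blue} \]

\[ (1-1)^2 + (2-1)^2 = 0 + 1 = 1 \le 4 \Rightarrow \text{Red} \]

\[ (1+2)^2 + (2-2)^2 = 9 + 0 = 9 > 4 \Rightarrow \text{Blue} \]

\[ (1+3)^2 + (2-8)^2 = 16 + 36 = 52 > 4 \Rightarrow \text{Blue} \]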
# Rebuild the grid over a slightly wider range so that the point (3, 8)
# falls inside the shaded region rather than above it
x1 <- seq(-4, 4, length.out = 200)
x2 <- seq(-2, 9, length.out = 200)
grid <- expand.grid(X1 = x1, X2 = x2)

grid$decision <- (1 + grid$X1)^2 + (2 - grid$X2)^2

grid$class <- ifelse(grid$decision > 4, "Blue Region (> 4)", "Red Region (≤ 4)")

# Attach display labels to the points classified above
points$label <- c("(0,0)", "(-1,1)", "(2,2)", "(3,8)")

ggplot(grid, aes(x = X1, y = X2)) +
  geom_tile(aes(fill = class), alpha = 0.3) +
  geom_contour(aes(z = decision), breaks = 4, color = "black") +
  geom_point(data = points, aes(x = X1, y = X2), color = "black", size = 3) +
  geom_text(data = points, aes(x = X1, y = X2, label = label), vjust = -1, color = "black") +
  scale_fill_manual(values = c("Red Region (≤ 4)" = "red", "Blue Region (> 4)" = "skyblue")) +
  labs(title = "Classified Points on Nonlinear Decision Boundary",
       x = "X1", y = "X2", fill = "Region") +
  theme_minimal()

d). Argue that while the decision boundary in (c) is not linear in terms of \(X_1\) and \(X_2\), it is linear in terms of \(X_1\), \(X_2\), \(X_1^2\), and \(X_2^2\).

Although the decision boundary in part (c) appears non-linear in \(X_1\) and \(X_2\), we can expand the expression:

\[ (1 + X_1)^2 + (2 - X_2)^2 = 4 \]

Expanding both squared terms:

\[ (1 + X_1)^2 = X_1^2 + 2X_1 + 1 \]

\[ (2 - X_2)^2 = X_2^2 - 4X_2 + 4 \]

Combining terms:

\[ X_1^2 + X_2^2 + 2X_1 - 4X_2 + 1 + 4 = 4 \]

\[ \Rightarrow X_1^2 + X_2^2 + 2X_1 - 4X_2 + 5 = 4 \]

\[ \Rightarrow X_1^2 + X_2^2 + 2X_1 - 4X_2 + 1 = 0 \]

This final equation is linear in terms of the variables:

  • \(X_1\)
  • \(X_2\)
  • \(X_1^2\)
  • \(X_2^2\)

Therefore, while the decision boundary is non-linear in the original feature space, it is linear in an expanded feature space that includes the quadratic terms \(X_1^2\) and \(X_2^2\). This principle underlies many nonlinear learning methods, such as support vector machines with polynomial kernels.
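
As a small numerical check (a sketch added here, not part of the exercise), we can confirm that the expanded form is a linear combination of the features \(X_1\), \(X_2\), \(X_1^2\), and \(X_2^2\) with coefficients \((2, -4, 1, 1)\) and intercept \(1\), and that it agrees with the original circle equation:

# Decision function written as a linear model in the expanded features:
# f(X1, X2) = 1 + 2*X1 - 4*X2 + 1*X1^2 + 1*X2^2
f_linear <- function(X1, X2) 1 + 2 * X1 - 4 * X2 + X1^2 + X2^2

# Original (non-linear) form, shifted so the boundary sits at zero
f_circle <- function(X1, X2) (1 + X1)^2 + (2 - X2)^2 - 4

# The two forms agree everywhere, so the boundary f = 0 is identical
all.equal(f_linear(0.7, -1.3), f_circle(0.7, -1.3))  # TRUE
all.equal(f_linear(3, 8), f_circle(3, 8))            # TRUE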