Question 1

This problem involves hyperplanes in two dimensions.

  1. Sketch the hyperplane \(1 + 3X_1 - X_2 = 0\). Indicate the set of points for which \(1 + 3X_1 - X_2 > 0\), as well as the set of points for which \(1 + 3X_1 - X_2 < 0\).
library(ggplot2)

# Grid of points covering the plotting window
xlim <- c(-10, 10)
ylim <- c(-30, 30)
points <- expand.grid(
  X1 = seq(xlim[1], xlim[2], length.out = 200),
  X2 = seq(ylim[1], ylim[2], length.out = 200)
)
# Label each grid point by the sign of 1 + 3*X1 - X2
points$region <- ifelse(1 + 3 * points$X1 - points$X2 > 0,
                        "> 0", "< 0")

# The hyperplane is the line X2 = 1 + 3*X1 (intercept 1, slope 3)
p1 <- ggplot(points, aes(x = X1, y = X2)) +
  geom_point(aes(color = region), size = 0.5, alpha = 0.6) +
  geom_abline(intercept = 1, slope = 3, color = "black", linewidth = 1.2) +
  # Named values pin each color to its label so the plot matches the subtitle
  scale_color_manual(values = c("> 0" = "#1f78b4", "< 0" = "#e31a1c")) +
  theme_minimal() +
  labs(title = "Hyperplane: 1 + 3X1 - X2 = 0",
       subtitle = "Blue: Region > 0, Red: Region < 0",
       color = "Region")
p1
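
To confirm which side of the line is which, it is enough to evaluate the left-hand side at a point off the line, such as the origin. The sketch below does this; the helper name `f1` and the test points are illustrative choices, not part of the exercise.

```r
# Hypothetical helper: left-hand side of the hyperplane 1 + 3*X1 - X2 = 0
f1 <- function(x1, x2) 1 + 3 * x1 - x2

# The origin lies below the line X2 = 1 + 3*X1, so the expression is positive there
f1(0, 0)

# A point above the line, e.g. (0, 5), gives a negative value
f1(0, 5)

# A point on the line, e.g. (1, 4), gives exactly zero
f1(1, 4)
```

The region below the line (smaller \(X_2\)) is therefore the set where \(1 + 3X_1 - X_2 > 0\).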

  2. On the same plot, sketch the hyperplane \(-2 + X_1 + 2X_2 = 0\). Indicate the set of points for which \(-2 + X_1 + 2X_2 > 0\), as well as the set of points for which \(-2 + X_1 + 2X_2 < 0\).
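# Cross the signs of the two expressions into one four-level factor.
# Levels read "R1.R2"; e.g. "TRUE.FALSE" means 1 + 3*X1 - X2 > 0 and -2 + X1 + 2*X2 <= 0.
points$region_combined <- interaction(
  region1 = (1 + 3 * points$X1 - points$X2 > 0),
  region2 = (-2 + points$X1 + 2 * points$X2 > 0)
)

p2 <- ggplot(points, aes(x = X1, y = X2)) +
  geom_point(aes(color = region_combined), size = 0.5, alpha = 0.7) +
  # First hyperplane: X2 = 1 + 3*X1
  geom_abline(intercept = 1, slope = 3, color = "black", linewidth = 1) +
  # Second hyperplane: X2 = 1 - 0.5*X1
  geom_abline(intercept = 1, slope = -0.5, color = "darkgreen", linewidth = 1) +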
  scale_color_manual(values = c("#a6cee3", "#fb9a99", "#1f78b4", "#e31a1c")) +
  theme_minimal() +
  labs(title = "Combined Hyperplanes",
       subtitle = "1 + 3X1 - X2 = 0 (Black), -2 + X1 + 2X2 = 0 (Green)",
       color = "Region (R1 . R2)")

p2
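
The point where the two hyperplanes cross can be found by solving the pair of equations as a 2x2 linear system. This is a minimal sketch; the names `A`, `b`, and `intersection` are introduced here for illustration.

```r
# Rewrite both hyperplanes as A %*% c(X1, X2) = b:
#   3*X1 -   X2 = -1
#     X1 + 2*X2 =  2
A <- matrix(c(3, -1,
              1,  2), nrow = 2, byrow = TRUE)
b <- c(-1, 2)

# Solve the linear system for the intersection point (X1, X2)
intersection <- solve(A, b)
intersection
```

The solution is \((X_1, X_2) = (0, 1)\), which is consistent with both `geom_abline()` calls using `intercept = 1`.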

Question 2

We have seen that in \(p = 2\) dimensions, a linear decision boundary takes the form \(\beta_0 + \beta_1X_1 + \beta_2X_2 = 0\). We now investigate a non-linear decision boundary.

  1. Sketch the curve \[(1 + X_1)^2 + (2 - X_2)^2 = 4.\]
points <- expand.grid(
  X1 = seq(-4, 2, length.out = 300),
  X2 = seq(-1, 5, length.out = 300)
)
points$z <- (1 + points$X1)^2 + (2 - points$X2)^2 - 4

p3 <- ggplot(points, aes(x = X1, y = X2, z = z)) +
  geom_contour(breaks = 0, colour = "blue", linewidth = 1.1) +
  theme_minimal() +
  labs(title = "Non-linear Boundary: (1 + X1)^2 + (2 - X2)^2 = 4")

p3
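
Since \((1 + X_1)^2 = (X_1 - (-1))^2\) and \((2 - X_2)^2 = (X_2 - 2)^2\), the curve is a circle with center \((-1, 2)\) and radius \(2\). As an alternative to contouring a grid, the circle can be generated parametrically. The sketch below does so and checks the points against the original equation; the names `center`, `radius`, `theta`, and `circle` are illustrative.

```r
# Center and radius read off from (X1 - (-1))^2 + (X2 - 2)^2 = 2^2
center <- c(-1, 2)
radius <- 2

# Parametrize the circle by the angle theta
theta <- seq(0, 2 * pi, length.out = 200)
circle <- data.frame(
  X1 = center[1] + radius * cos(theta),
  X2 = center[2] + radius * sin(theta)
)

# Every generated point should satisfy the original equation up to rounding error
residual <- (1 + circle$X1)^2 + (2 - circle$X2)^2 - 4
max(abs(residual))
```

The resulting data frame could be drawn with `geom_path()`, and adding `coord_equal()` keeps the circle from appearing stretched.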

  2. On your sketch, indicate the set of points for which \[(1 + X_1)^2 + (2 - X_2)^2 > 4,\] as well as the set of points for which \[(1 + X_1)^2 + (2 - X_2)^2 \leq 4.\]
points$region_circle <- ifelse(points$z > 0, "> 4", "<= 4")

p4 <- ggplot(points, aes(x = X1, y = X2)) +
  geom_point(aes(color = region_circle), size = 0.3, alpha = 0.6) +
  geom_contour(aes(z = z), breaks = 0, colour = "black", linewidth = 1) +
  scale_color_manual(values = c("#33a02c", "#ff7f00")) +
  theme_minimal() +
  labs(title = "Region Classification by Circle",
       subtitle = "(1 + X1)^2 + (2 - X2)^2 compared to 4",
       color = "Region")

p4

  3. Suppose that a classifier assigns an observation to the blue class if \[(1 + X_1)^2 + (2 - X_2)^2 > 4,\] and to the red class otherwise. To what class is the observation \((0, 0)\) classified? \((-1, 1)\)? \((2, 2)\)? \((3, 8)\)?
new_points <- data.frame(
  X1 = c(0, -1, 2, 3),
  X2 = c(0, 1, 2, 8)
)
new_points$class <- ifelse((1 + new_points$X1)^2 + (2 - new_points$X2)^2 > 4,
                            "blue", "red")
new_points
##   X1 X2 class
## 1  0  0  blue
## 2 -1  1   red
## 3  2  2  blue
## 4  3  8  blue

Explanation:

- \((0, 0)\): \((1 + 0)^2 + (2 - 0)^2 = 1 + 4 = 5 > 4 \Rightarrow\) Blue
- \((-1, 1)\): \((1 - 1)^2 + (2 - 1)^2 = 0 + 1 = 1 \leq 4 \Rightarrow\) Red
- \((2, 2)\): \((1 + 2)^2 + (2 - 2)^2 = 9 + 0 = 9 > 4 \Rightarrow\) Blue
- \((3, 8)\): \((1 + 3)^2 + (2 - 8)^2 = 16 + 36 = 52 > 4 \Rightarrow\) Blue

  4. Argue that while the decision boundary in (c) is not linear in terms of \(X_1\) and \(X_2\), it is linear in terms of \(X_1\), \(X_1^2\), \(X_2\), and \(X_2^2\).

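Explanation: The decision boundary is given by \[(1 + X_1)^2 + (2 - X_2)^2 - 4 = 0.\]

Expanding the left-hand side: \[\begin{aligned} (1 + X_1)^2 + (2 - X_2)^2 - 4 &= 1 + 2X_1 + X_1^2 + 4 - 4X_2 + X_2^2 - 4 \\ &= 1 + 2X_1 + X_1^2 - 4X_2 + X_2^2 \end{aligned}\]

The boundary therefore has the form \(\beta_0 + \beta_1 X_1 + \beta_2 X_1^2 + \beta_3 X_2 + \beta_4 X_2^2 = 0\) with \((\beta_0, \beta_1, \beta_2, \beta_3, \beta_4) = (1, 2, 1, -4, 1)\). Each of the four features \(X_1\), \(X_1^2\), \(X_2\), and \(X_2^2\) enters only through a constant coefficient, so the boundary is linear in these features even though it is non-linear in the original feature space \((X_1, X_2)\). This is the idea behind enlarging the feature space with polynomial terms, as is done in polynomial models and in support vector machines with polynomial kernels.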
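
The expansion can be checked numerically by building the expanded features by hand and comparing a linear combination of them with the original quadratic expression. This is a minimal sketch: the test points and the names `features`, `beta`, `linear_form`, and `original` are illustrative, and the coefficients \((1, 2, 1, -4, 1)\) are read off from the expansion for the feature order \((1, X_1, X_1^2, X_2, X_2^2)\).

```r
# A few arbitrary test points
X1 <- c(0, -1, 2, 3, -3)
X2 <- c(0, 1, 2, 8, 2)

# Expanded feature matrix with columns 1, X1, X1^2, X2, X2^2
features <- cbind(1, X1, X1^2, X2, X2^2)

# Coefficients from the expansion: 1 + 2*X1 + X1^2 - 4*X2 + X2^2
beta <- c(1, 2, 1, -4, 1)

# Linear function of the expanded features
linear_form <- as.vector(features %*% beta)

# Original quadratic expression in (X1, X2)
original <- (1 + X1)^2 + (2 - X2)^2 - 4

# TRUE when the two agree at every test point
all.equal(linear_form, original)
```

A classifier that thresholds `linear_form` at zero reproduces the circular boundary exactly, even though it is an ordinary linear rule in the five-column feature matrix.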