Question 1

This problem involves hyperplanes in two dimensions.

Part a

Sketch the hyperplane \(1 + 3X_1 - X_2 = 0\). Indicate the set of points for which \(1 + 3X_1 - X_2 > 0\), as well as the set of points for which \(1 + 3X_1 - X_2 < 0\).

library(ggplot2)

# Create a grid of (X1, X2) values
x1 <- seq(-5, 5, length.out = 200)
x2 <- seq(-5, 5, length.out = 200)
grid <- expand.grid(X1 = x1, X2 = x2)

# Compute the hyperplane function and classify regions
#    f(X1,X2) = 1 + 3*X1 - X2
grid$value  <- 1 + 3 * grid$X1 - grid$X2
grid$region <- factor(ifelse(grid$value > 0, "> 0", "< 0"))

# Plot with ggplot2
ggplot(grid, aes(x = X1, y = X2, fill = region)) +
  # Shade the two half‐planes
  geom_raster(alpha = 0.4, interpolate = FALSE) +
  # Draw the hyperplane: intercept = 1, slope = 3
  geom_abline(intercept = 1, slope = 3, color = "black", size = 1) +
  # Label the regions
  annotate("text", x =  2, y = -4, label = "1+3X1−X2 > 0", hjust = 0) +
  annotate("text", x = -4, y =  4, label = "1+3X1−X2 < 0", hjust = 0) +
  # Custom fill colors
  scale_fill_manual(values = c("> 0" = "steelblue", "< 0" = "tomato")) +
  # Axis labels with subscripts
  labs(
    title = "Hyperplane: 1 + 3 X₁ − X₂ = 0",
    x     = expression(X[1]),
    y     = expression(X[2]),
    fill  = "Sign of\n1+3X1−X2"
  ) +
  theme_minimal()
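
As a quick numerical check on the shading in the plot above, the sign of \(1 + 3X_1 - X_2\) can be evaluated at a few points. This is a minimal base-R sketch: the helper name `f1` and the choice of test points (the two label positions plus one point on the line) are illustrative assumptions, not part of the exercise.

```r
# Hypothetical helper: left-hand side of the hyperplane 1 + 3*X1 - X2 = 0
f1 <- function(x1, x2) 1 + 3 * x1 - x2

# Illustrative test points: the two label positions used in the plot,
# plus one point chosen to lie on the line (X2 = 1 + 3*X1 at X1 = 1)
pts <- data.frame(X1 = c(2, -4, 1), X2 = c(-4, 4, 4))
pts$value <- f1(pts$X1, pts$X2)

# Label each point by the side of the hyperplane it falls on
pts$side <- ifelse(pts$value > 0, "> 0",
            ifelse(pts$value < 0, "< 0", "on the hyperplane"))
print(pts)
```

For a fixed \(X_1\), lowering \(X_2\) increases the value, so the positive half-plane lies below the line \(X_2 = 1 + 3X_1\) and the negative half-plane lies above it.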

Part b

On the same plot, sketch the hyperplane \(-2 + X_1 + 2X_2 = 0\). Indicate the set of points for which \(-2 + X_1 + 2X_2 > 0\), as well as the set of points for which \(-2 + X_1 + 2X_2 < 0\).

# install.packages("ggplot2")  # if not already installed
library(ggplot2)

# 1) build a grid of (X1,X2) values
x1 <- seq(-5, 5, length.out = 200)
x2 <- seq(-5, 5, length.out = 200)
grid <- expand.grid(X1 = x1, X2 = x2)

# 2) compute the hyperplane value and region label
grid$value  <- -2 + grid$X1 + 2 * grid$X2
grid$region <- factor(ifelse(grid$value > 0, "> 0", "< 0"))

# 3) plot
ggplot(grid, aes(x = X1, y = X2, fill = region)) +
  # raster for shading regions
  geom_raster(alpha = 0.4, interpolate = FALSE) +
  # hyperplane: intercept = 1, slope = -1/2
  geom_abline(intercept = 1, slope = -0.5, color = "black", size = 1) +
  # annotations
  annotate("text", x =  4, y =  4, label = "-2+X1+2X2 > 0", hjust = 0) +
  annotate("text", x = -4, y = -4, label = "-2+X1+2X2 < 0", hjust = 0) +
  # custom fills
  scale_fill_manual(values = c("> 0" = "steelblue", "< 0" = "tomato")) +
  labs(
    title    = "Hyperplane: -2 + X1 + 2 X2 = 0",
    x        = expression(X[1]),
    y        = expression(X[2]),
    fill     = "Sign of\n-2+X1+2X2"
  ) +
  theme_minimal()
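
Since the exercise asks for both hyperplanes on one plot, it may help to see how the two lines relate to each other. The sketch below, in base R, finds the point where they cross and tabulates the sign combinations on a coarse grid. The names `A`, `b`, `intersection`, and `g`, and the grid spacing, are illustrative assumptions.

```r
# Write the two hyperplanes as the linear system A %*% c(X1, X2) = b:
#   1 + 3*X1 -   X2 = 0   ->   3*X1 -   X2 = -1
#  -2 +   X1 + 2*X2 = 0   ->     X1 + 2*X2 =  2
A <- matrix(c(3, -1,
              1,  2), nrow = 2, byrow = TRUE)
b <- c(-1, 2)

# The two lines cross where both equations hold simultaneously
intersection <- solve(A, b)
print(intersection)

# Sign of each function on a coarse grid; a sign of 0 marks grid points
# that fall exactly on one of the lines
g <- expand.grid(X1 = seq(-5, 5, by = 0.5), X2 = seq(-5, 5, by = 0.5))
g$sign1 <- sign(1 + 3 * g$X1 - g$X2)
g$sign2 <- sign(-2 + g$X1 + 2 * g$X2)
print(table(hyperplane1 = g$sign1, hyperplane2 = g$sign2))
```

To draw both hyperplanes on a single set of axes, one option is to add `geom_abline(intercept = 1, slope = 3)` to the ggplot call for this part.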

Question 2

We have seen that in \(p = 2\) dimensions, a linear decision boundary takes the form \(\beta_0 + \beta_1 X_1 + \beta_2 X_2 = 0\). We now investigate a non-linear decision boundary.

Part a

Sketch the curve

\[ (1 + X_1)^2 + (2 - X_2)^2 = 4. \]

# Grid of (X1, X2)
x1_vals <- seq(-5, 5, length.out = 200)
x2_vals <- seq(-2, 10, length.out = 200)
grid_a <- expand.grid(X1 = x1_vals, X2 = x2_vals)
grid_a$Z <- (1 + grid_a$X1)^2 + (2 - grid_a$X2)^2

# Contour at Z = 4 (the circle boundary)
ggplot(grid_a, aes(x = X1, y = X2)) +
  geom_contour(aes(z = Z), breaks = 4, color = "black", size = 0.7) +
  coord_equal() +
  theme_minimal() +
  labs(
    title = "Part a: (1 + X1)^2 + (2 − X2)^2 = 4",
    x = expression(X[1]),
    y = expression(X[2])
  )
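
The curve is a circle with center \((-1, 2)\) and radius 2, because the equation can be read as \((X_1 - (-1))^2 + (X_2 - 2)^2 = 2^2\). As an alternative to contouring a grid, the sketch below traces the circle parametrically and checks that the traced points satisfy the original equation. The names `center`, `radius`, `theta`, and `circle`, and the number of angles, are illustrative assumptions.

```r
# Center and radius read off from (1 + X1)^2 + (2 - X2)^2 = 4
center <- c(X1 = -1, X2 = 2)
radius <- 2

# Parametrize the circle by an angle theta in [0, 2*pi]
theta  <- seq(0, 2 * pi, length.out = 361)
circle <- data.frame(
  X1 = center[["X1"]] + radius * cos(theta),
  X2 = center[["X2"]] + radius * sin(theta)
)

# Every traced point should give a left-hand side of 4 (up to rounding)
lhs <- (1 + circle$X1)^2 + (2 - circle$X2)^2
print(range(lhs))
```

The resulting `circle` data frame could be passed to `geom_path()` together with `coord_equal()` to draw the same curve without computing a grid.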

Part b

On your sketch, indicate the set of points for which

\[ (1 + X_1)^2 + (2 - X_2)^2 > 4, \]

as well as the set of points for which

\[ (1 + X_1)^2 + (2 - X_2)^2 \le 4. \]

ggplot(grid_a, aes(x = X1, y = X2)) +
  geom_raster(aes(fill = Z <= 4), interpolate = TRUE) +
  scale_fill_manual(
    values = c("TRUE" = "lightblue", "FALSE" = "lightgray"),
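    # Name the labels so each one is matched to the right level; unnamed
    # labels are assigned in level order (FALSE, TRUE) and would be swapped
    labels = c("FALSE" = "Outside (> 4)", "TRUE" = "Inside (≤ 4)"),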
    name = "Region"
  ) +
  geom_contour(aes(z = Z), breaks = 4, color = "black", size = 0.7) +
  coord_equal() +
  theme_minimal() +
  labs(
    title = "Regions Inside vs. Outside",
    x = expression(X[1]),
    y = expression(X[2])
  )

Part c

Suppose that a classifier assigns an observation to the blue class if

\[ (1 + X_1)^2 + (2 - X_2)^2 > 4, \]

and to the red class otherwise. To what class is the observation (0,0) classified? (−1,1)? (2,2)? (3,8)?

# Define test points
test_points <- data.frame(
  X1 = c(0, -1, 2, 3),
  X2 = c(0,  1, 2, 8)
)
# Compute Z and Class
test_points$Z <- (1 + test_points$X1)^2 + (2 - test_points$X2)^2
test_points$Class <- ifelse(test_points$Z > 4, "Blue", "Red")

# Plot boundary + points
ggplot(grid_a, aes(x = X1, y = X2)) +
  geom_raster(aes(fill = Z <= 4), interpolate = TRUE) +
  geom_contour(aes(z = Z), breaks = 4, color = "black", size = 0.7) +
  geom_point(
    data = test_points,
    aes(color = Class, shape = Class),
    size = 3
  ) +
  scale_color_manual(values = c("Blue" = "blue", "Red" = "red")) +
  scale_fill_manual(
    values = c("TRUE" = "lightblue", "FALSE" = "white"),
    guide = "none"  # `guide = FALSE` is deprecated as of ggplot2 3.3.4
  ) +
  coord_equal() +
  theme_minimal() +
  labs(
    title = "Part c: Classification of Test Points",
    x = expression(X[1]),
    y = expression(X[2])
  )

# Print the results
print(test_points)
##   X1 X2  Z Class
## 1  0  0  5  Blue
## 2 -1  1  1   Red
## 3  2  2  9  Blue
## 4  3  8 52  Blue
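
Written out by hand, the four evaluations behind the printed table are as follows, using the rule that a value greater than 4 means blue and anything else means red:

\[
\begin{aligned}
(0,0):&\quad (1+0)^2 + (2-0)^2 = 1 + 4 = 5 > 4 \;\Rightarrow\; \text{blue},\\
(-1,1):&\quad (1-1)^2 + (2-1)^2 = 0 + 1 = 1 \le 4 \;\Rightarrow\; \text{red},\\
(2,2):&\quad (1+2)^2 + (2-2)^2 = 9 + 0 = 9 > 4 \;\Rightarrow\; \text{blue},\\
(3,8):&\quad (1+3)^2 + (2-8)^2 = 16 + 36 = 52 > 4 \;\Rightarrow\; \text{blue}.
\end{aligned}
\]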

Part d

Argue that while the decision boundary in (c) is not linear in terms of \(X_1\) and \(X_2\), it is linear in terms of \(X_1\), \(X_1^2\), \(X_2\), and \(X_2^2\).

grid_d <- grid_a
# Create transformed features
grid_d$X1_sq <- grid_d$X1^2
grid_d$X2_sq <- grid_d$X2^2
# Compute the linear combination: should equal Z - 4
grid_d$linComb <- grid_d$X1_sq + grid_d$X2_sq + 2*grid_d$X1 - 4*grid_d$X2 + 1

# Verify equality
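# all.equal() returns TRUE or a character description of the differences,
# so wrap it in isTRUE() to always get a single logical value
all_equal <- isTRUE(all.equal(grid_d$linComb, grid_d$Z - 4))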
print(paste("linComb == Z - 4 everywhere? ", all_equal))
## [1] "linComb == Z - 4 everywhere?  TRUE"
# View first few rows
head(grid_d[, c("X1","X2","Z","linComb")])
##          X1 X2        Z  linComb
## 1 -5.000000 -2 32.00000 28.00000
## 2 -4.949749 -2 31.60052 27.60052
## 3 -4.899497 -2 31.20608 27.20608
## 4 -4.849246 -2 30.81670 26.81670
## 5 -4.798995 -2 30.43236 26.43236
## 6 -4.748744 -2 30.05308 26.05308

We start with the non-linear boundary in the original inputs: \[ (1 + X_1)^2 + (2 - X_2)^2 = 4. \]

Expanding and simplifying gives: \[ \begin{aligned} (1 + X_1)^2 + (2 - X_2)^2 &= X_1^2 + 2X_1 + 1 \;+\; X_2^2 - 4X_2 + 4 = 4,\\ \Longrightarrow\quad X_1^2 + X_2^2 + 2X_1 - 4X_2 + 1 &= 0. \end{aligned} \]

Now define new features \[ Z_0 = 1,\quad Z_1 = X_1,\quad Z_2 = X_2,\quad Z_3 = X_1^2,\quad Z_4 = X_2^2. \] In terms of \(\mathbf{Z} = (Z_0,Z_1,Z_2,Z_3,Z_4)\), the boundary becomes the hyperplane \[ 1\cdot Z_3 \;+\; 1\cdot Z_4 \;+\; 2\cdot Z_1 \;+\;(-4)\cdot Z_2 \;+\;1\cdot Z_0 \;=\;0, \] which is linear in \(\mathbf{Z}\), even though it was non-linear in \((X_1,X_2)\).

Thus, by augmenting the feature space with \(X_1^2\) and \(X_2^2\), the circle in \((X_1, X_2)\) becomes a hyperplane, that is, a linear decision boundary, in the transformed feature space \((X_1, X_2, X_1^2, X_2^2)\).
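
One way to see this linearity numerically is to regress the quadratic form on the expanded features and check that least squares recovers the coefficients of the expansion exactly. The sketch below uses `lm()` purely as a check, not as a classifier. The grid size and the names `g` and `fit` are illustrative assumptions.

```r
# Rebuild a small grid so the example is self-contained
g <- expand.grid(X1 = seq(-5, 5, length.out = 25),
                 X2 = seq(-2, 10, length.out = 25))
g$Z <- (1 + g$X1)^2 + (2 - g$X2)^2

# Regress Z on the expanded features X1, X2, X1^2, X2^2.
# Because Z is exactly linear in these features, least squares should
# recover the expansion Z = 5 + 2*X1 - 4*X2 + X1^2 + X2^2.
fit <- lm(Z ~ X1 + X2 + I(X1^2) + I(X2^2), data = g)
print(round(coef(fit), 6))

# The residuals should vanish up to floating-point error
print(max(abs(resid(fit))))
```

Setting \(Z = 4\) in the recovered expansion gives \(1 + 2X_1 - 4X_2 + X_1^2 + X_2^2 = 0\), which matches the boundary derived by hand.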