Exercise 7

This problem involves hyperplanes in two dimensions.

Sketch the hyperplane 1 + 3X1 − X2 = 0. Indicate the set of points for which 1 + 3X1 − X2 > 0, as well as the set of points for which 1 + 3X1 − X2 < 0.

Solving for X2 gives X2 = 3X1 + 1.

x1_vals <- seq(-10, 10, by = 0.1)

x2_vals <- 1 + 3 * x1_vals

plot(x1_vals, x2_vals, type = "l", lwd = 2, col = "blue",
     xlab = "X1", ylab = "X2", main = "Hyperplane 1 : 1 + 3X1 - X2 = 0")

This is a straight line with slope 3 and intercept 1 on the X2 axis. Points above the line have X2 > 3X1 + 1, so they satisfy 1 + 3X₁ − X₂ < 0. Points below the line have X2 < 3X1 + 1, so they satisfy 1 + 3X₁ − X₂ > 0.
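The two regions can be confirmed numerically by evaluating the left-hand side at one point on each side of the line. This is a minimal sketch: the helper name `f1` and the two test points are illustrative choices, not part of the exercise.

```r
# Hypothetical helper: left-hand side of the hyperplane 1 + 3*X1 - X2 = 0
f1 <- function(x1, x2) 1 + 3 * x1 - x2

# (0, 5) lies above the line X2 = 3*X1 + 1, and (0, -5) lies below it
above <- f1(0, 5)    # 1 + 0 - 5 = -4
below <- f1(0, -5)   # 1 + 0 + 5 =  6

sign(c(above = above, below = below))
```

A negative sign for the point above the line and a positive sign for the point below it agree with the description of the two regions.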

On the same plot, sketch the hyperplane −2 + X1 + 2X2 = 0. Indicate the set of points for which −2 + X1 + 2X2 > 0, as well as the set of points for which −2 + X1 + 2X2 < 0.

Solving −2 + X₁ + 2X₂ = 0 for X₂ gives X₂ = (2 − X₁)/2.

x2_vals_2 <- (2 - x1_vals) / 2

plot(x1_vals, x2_vals, type = "l", col = "blue", lwd = 2,
     xlab = "X1", ylab = "X2", main = "Two Hyperplanes")

lines(x1_vals, x2_vals_2, col = "green", lwd = 2)

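# Sample points, labelled by the sign of the second hyperplane -2 + X1 + 2X2
points(c(-5, -2, 0), c(25, 20, 30), col = "red", pch = 19)   # -2 + X1 + 2X2 > 0
points(c(2, 5, 0), c(-10, -5, -5), col = "blue", pch = 19)   # -2 + X1 + 2X2 < 0

legend("topleft", legend = c("1 + 3X1 - X2 = 0", "-2 + X1 + 2X2 = 0"),
       col = c("blue", "green"), lty = 1, lwd = 2)

Red dots lie in the region where −2 + X₁ + 2X₂ > 0 (above the green line). Blue dots lie in the region where −2 + X₁ + 2X₂ < 0 (below the green line). For the first hyperplane the signs are reversed: the red dots satisfy 1 + 3X₁ − X₂ < 0 and the blue dots satisfy 1 + 3X₁ − X₂ > 0.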
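Since a point can be on the positive side of one hyperplane and the negative side of the other, it may help to evaluate both left-hand sides at the same points. The sketch below does this; the helper names `f1` and `f2` and the sample points are illustrative assumptions made here, not part of the exercise.

```r
# Hypothetical helpers: left-hand sides of the two hyperplanes
f1 <- function(x1, x2) 1 + 3 * x1 - x2
f2 <- function(x1, x2) -2 + x1 + 2 * x2

# Illustrative sample points (assumed for this sketch)
pts <- data.frame(x1 = c(0, -5, 2, 0), x2 = c(30, 25, -10, -5))

# Sign of each hyperplane's left-hand side at every point
pts$sign_f1 <- sign(f1(pts$x1, pts$x2))
pts$sign_f2 <- sign(f2(pts$x1, pts$x2))
pts
```

For these four points the two sign columns are opposite in every row, which is why a statement about the "positive region" needs to name the hyperplane it refers to.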

We have seen that in p = 2 dimensions, a linear decision boundary takes the form β0 + β1X1 + β2X2 = 0. We now investigate a non-linear decision boundary.

Sketch the curve (1 + X1)^2 + (2 − X2)^2 = 4. It is a circle centered at (−1, 2) with radius 2.

x1_vals <- seq(-3, 1, by = 0.01)
x2_top <- 2 + sqrt(4 - (1 + x1_vals)^2)
x2_bottom <- 2 - sqrt(4 - (1 + x1_vals)^2)

plot(x1_vals, x2_top, type = "l", ylim = c(-1, 5), col = "black",
     xlab = "X1", ylab = "X2", main = "Circle Decision Boundary")
lines(x1_vals, x2_bottom, col = "black")

On your sketch, indicate the set of points for which (1 + X1)^2 + (2 − X2)^2 > 4, as well as the set of points for which (1 + X1)^2 + (2 − X2)^2 ≤ 4.

plot(x1_vals, x2_top, type = "l", ylim = c(-1, 5), col = "black",
     xlab = "X1", ylab = "X2", main = "Circle Decision Boundary")
lines(x1_vals, x2_bottom, col = "black")

points(x = c(-2, 0, -1, 0.6, -2.4), 
       y = c(2, 1, 0.5, 2.8, 2), col = "red", pch = 19)

points(x = c(-3, -1, -2, -1, 1), 
       y = c(4, 5, 0, -0.5, 4), col = "blue", pch = 19)

Red points lie inside or on the circle, where (1 + X1)^2 + (2 − X2)^2 ≤ 4. Blue points lie outside the circle, where (1 + X1)^2 + (2 − X2)^2 > 4.
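The colouring of the plotted points can be checked against the circle inequality directly. This is a minimal sketch: `circle_lhs` is a hypothetical helper name introduced here, and the coordinates are the ones passed to `points()` above.

```r
# Hypothetical helper: left-hand side of the circle equation
circle_lhs <- function(x1, x2) (1 + x1)^2 + (2 - x2)^2

# Coordinates of the red and blue points drawn on the plot
red_x  <- c(-2, 0, -1, 0.6, -2.4)
red_y  <- c(2, 1, 0.5, 2.8, 2)
blue_x <- c(-3, -1, -2, -1, 1)
blue_y <- c(4, 5, 0, -0.5, 4)

# Red points should satisfy <= 4, blue points should satisfy > 4
red_ok  <- all(circle_lhs(red_x, red_y) <= 4)
blue_ok <- all(circle_lhs(blue_x, blue_y) > 4)
c(red_ok = red_ok, blue_ok = blue_ok)
```

Both checks come out `TRUE` for these coordinates, so every red point is inside or on the circle and every blue point is outside it.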

Suppose that a classifier assigns an observation to the blue class if (1 + X1)^2 + (2 − X2)^2 > 4, and to the red class otherwise. To what class is the observation (0, 0) classified? (−1, 1)? (2, 2)? (3, 8)?

(1 + X1)^2 + (2 − X2)^2 > 4

classify <- function(x1, x2) {
  val <- (1 + x1)^2 + (2 - x2)^2
  if (val > 4) return("blue") else return("red")
}

classify(0, 0)    # (1+0)^2 + (2-0)^2 = 1 + 4 = 5 so blue

## [1] "blue"

classify(-1, 1)   # (1-1)^2 + (2-1)^2 = 0 + 1 = 1 so red

## [1] "red"

classify(2, 2)    # (1+2)^2 + (2-2)^2 = 9 + 0 = 9 so blue

## [1] "blue"

classify(3, 8)    # (1+3)^2 + (2-8)^2 = 16 + 36 = 52 so blue

## [1] "blue"
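The four calls can also be done in one step with a vectorised version of the same rule. This is a sketch rather than a replacement for `classify()`: the name `classify_vec` and the data frame `obs` are introduced here for illustration.

```r
# Hypothetical vectorised variant of the rule: blue if outside the circle, red otherwise
classify_vec <- function(x1, x2) {
  ifelse((1 + x1)^2 + (2 - x2)^2 > 4, "blue", "red")
}

# The four observations from the exercise
obs <- data.frame(x1 = c(0, -1, 2, 3), x2 = c(0, 1, 2, 8))
obs$class <- classify_vec(obs$x1, obs$x2)
obs
```

Using `ifelse()` instead of `if ... else` lets the function accept whole vectors of coordinates, so the table of observations and classes is built in a single call.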

Argue that while the decision boundary in (c) is not linear in terms of X1 and X2, it is linear in terms of X1, X1^2, X2, and X2^2.

The boundary (1 + X1)^2 + (2 − X2)^2 = 4 expands to 1 + 2X1 + X1^2 + 4 − 4X2 + X2^2 = 4, which simplifies to 1 + 2X1 − 4X2 + X1^2 + X2^2 = 0. It defines a circle, which is nonlinear in the original feature space of X1 and X2: because of the squared terms, it cannot be written in the form β0 + β1X1 + β2X2 = 0.

Let's define new variables Z1 = X1, Z2 = X2, Z3 = X1^2, Z4 = X2^2. The boundary then becomes 1 + 2Z1 − 4Z2 + Z3 + Z4 = 0, which is a linear equation in Z1, Z2, Z3, and Z4.

So the original boundary is nonlinear in (X1, X2) but linear in (X1, X2, X1^2, X2^2). This is the core idea behind basis expansion in models such as SVMs and polynomial regression.
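The equivalence between the circle form and the linear form in the expanded features can be checked numerically. The sketch below is illustrative only: the random test points, the seed, and the names `z`, `beta0`, and `beta` are assumptions made here, with the coefficients read off from the expansion 1 + 2X1 − 4X2 + X1^2 + X2^2 = 0.

```r
set.seed(1)  # arbitrary seed so the random test points are reproducible

# Random test points in the original feature space
x1 <- runif(100, -5, 5)
x2 <- runif(100, -5, 5)

# Original (nonlinear) form, moved to one side: (1 + X1)^2 + (2 - X2)^2 - 4
original <- (1 + x1)^2 + (2 - x2)^2 - 4

# Expanded features Z1 = X1, Z2 = X2, Z3 = X1^2, Z4 = X2^2
z <- cbind(Z1 = x1, Z2 = x2, Z3 = x1^2, Z4 = x2^2)

# Linear form in the expanded features: 1 + 2*Z1 - 4*Z2 + Z3 + Z4
beta0 <- 1
beta  <- c(2, -4, 1, 1)
linear <- beta0 + as.vector(z %*% beta)

# The two forms should agree up to floating-point error
same <- isTRUE(all.equal(original, linear))
same
```

Because the two expressions agree at every test point, they also have the same sign everywhere, so a linear classifier on (Z1, Z2, Z3, Z4) reproduces the circular boundary in (X1, X2).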