Problem set 1

In this problem set we’ll work out some properties of the least squares solution that wereviewed in the weekly readings. Consider the unsolvable system Ax = b as given below:

\[ \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 3 \\ 1 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 8 \\ 8 \\ 20 \end{bmatrix} \]

  • Write R markdown script to compute \(A^TA\) and \(A^Tb\).

*Sol

A <- matrix(c(1,0,1,1,1,3,1,4), byrow=TRUE, nrow=4)
b <- matrix(c(0,8,8,20), nrow=4)
(ATA <- t(A) %*% A)
##      [,1] [,2]
## [1,]    4    8
## [2,]    8   26
(ATb <- t(A) %*% b)
##      [,1]
## [1,]   36
## [2,]  112
  • Solve for \(\hat{x}\) in R using the above two computed matrices.

Sol

(x_hat <- solve(ATA) %*% ATb)
##      [,1]
## [1,]    1
## [2,]    4
# check work with lsfit()(sanity test)
x_hat_lsfit <- lsfit(A,b,intercept=FALSE)
x_hat_lsfit$coef
## X1 X2 
##  1  4
  • What is the squared error of this solution?

Sol

p <- A %*% x_hat
# compute the error vector e = b - p
(e <- b-p)
##      [,1]
## [1,]   -1
## [2,]    3
## [3,]   -5
## [4,]    3
# check work with lsfit()
round(e) == round(x_hat_lsfit$residuals)
##      [,1]
## [1,] TRUE
## [2,] TRUE
## [3,] TRUE
## [4,] TRUE
# squared error
(sq_e <- sum(e^2))
## [1] 44
  • Instead of \(b = [0; 8; 8; 20]\), start with \(p = [1; 5; 13; 17]\) and find the exact solution (i.e. show that this system is solvable as all equations are consistent with each other. This should result in an error vector e = 0).

Sol

p <- matrix(c(1,5,13,17), nrow=4)
(ATp <- t(A) %*% p)
##      [,1]
## [1,]   36
## [2,]  112
# solve x_hat using using p instead of b
(x_hat_p <- solve(ATA) %*% ATp)
##      [,1]
## [1,]    1
## [2,]    4
# calculate the predicted values 
y_hat <- A %*% x_hat_p

# calculate error
(ep <- round(y_hat - p))
##      [,1]
## [1,]    0
## [2,]    0
## [3,]    0
## [4,]    0
  • Show that the error \(e = b - p = [-1; 3;-5; 3]\).

Sol

#residual error
(b-p)
##      [,1]
## [1,]   -1
## [2,]    3
## [3,]   -5
## [4,]    3
#check with lsfit
(lsfit(A, b,intercept=FALSE)$residuals)
## [1] -1  3 -5  3
  • Show that the error e is orthogonal to p and to each of the columns of A.
# e*p == 0
(round(sum(e*p)))
## [1] 0
#Show that e and column 1 of matrix A are orthogonal:
(round(sum(e*A[,1])))
## [1] 0
#Show that e and column 2 of matrix A are orthogonal:
(round(sum(e*A[,2])))
## [1] 0

Problem set 2

Consider the modified auto-mpg data (obtained from the UC Irvine Machine Learningdataset). This dataset contains 5 columns: displacement, horsepower, weight, acceleration, mpg. We are going to model mpg as a function of the other four variables.

Write an R markdown script that takes in the auto-mpg data, extracts an A matrix from the first 4 columns and b vector from the fifth (mpg) column. Using the least squares approach, your code should compute the best fitting solution. That is, find the best fitting equation that expresses mpg in terms of the other 4 variables. Finally, calculate the fitting error between the predicted mpg of your model and the actual mpg. Your script should be able to load in the 5 column data set, extract A and b, and perform the rest of the calculations. Please have adequate comments in your code to make it easy to follow your work.

SOL

# Setting working directory
# Note: Both .rmd and .data needs to be in same directory
set_directory <- function(directory) {
if (!is.null(directory))
   setwd(directory)
}

# Convert data into matrix format and give column name
auto_data<-as.matrix(read.table("auto-mpg.data", header=FALSE,as.is=TRUE))
colnames(auto_data) <- c("displacement", "horsepower","weight","acceleration","mpg")

# check data
head(auto_data)
##      displacement horsepower weight acceleration mpg
## [1,]          307        130   3504         12.0  18
## [2,]          350        165   3693         11.5  15
## [3,]          318        150   3436         11.0  18
## [4,]          304        150   3433         12.0  16
## [5,]          302        140   3449         10.5  17
## [6,]          429        198   4341         10.0  15
# Extract an A matrix from the first four columns and vector b from the fifth.
A <- as.matrix(auto_data[,1:4])
b <- as.matrix(auto_data[,5])

# Adding a column of repeating ones for the Intercepts
A <- cbind(rep(1,nrow(A)),A)

Using the least squares approach, find best fitting equation that expresses mpg in terms of the other 4 variables:

\[\hat{mpg} = \hat{x}[1] + displacement * \hat{x}[2] + horsepower * \hat{x}[3] + weight * \hat{x}[4] + acceleration * \hat{x}[5]\]

ATA <- t(A) %*% A
ATb <- t(A) %*% b

# calculate x_hat for best fitting solution
(x_hat <- solve(ATA) %*% ATb)
##                      [,1]
##              45.251139699
## displacement -0.006000871
## horsepower   -0.043607731
## weight       -0.005280508
## acceleration -0.023147999
# check it with lsfit()
(ls_check <- lsfit(A, b, intercept=FALSE)$coefficients)
##              displacement   horsepower       weight acceleration 
## 45.251139699 -0.006000871 -0.043607731 -0.005280508 -0.023147999

calculate the fitting error between the predicted mpg of your model and the actual mpg.

p <- A %*% x_hat
e <- b-p
(sq_e <- sum(e^2))
## [1] 6979.413