Consider the unsolvable system Ax = b as given below:
\(\left[ \begin{array}{ccc} 1 & 0 \\ 1 & 1 \\ 1 & 3 \\ 1 & 4 \\ \end{array}\right]\) \(\left[ \begin{array}{ccc} x_1 \\ x_2 \\ \end{array}\right] =\) \(\left[ \begin{array}{ccc} 0 \\ 8 \\ 8 \\ 20 \\ \end{array}\right]\)
A <- matrix(c(1, 1, 1, 1, 0, 1, 3, 4), nrow=4)
b <- matrix(c(0, 8, 8, 20), nrow=4)
ata <- t(A) %*% A
ata## [,1] [,2]
## [1,] 4 8
## [2,] 8 26
atb <- t(A) %*% b
atb## [,1]
## [1,] 36
## [2,] 112
Since \(A^TA\hat{x} = A^Tb\) this give us \(\left[ \begin{array}{ccc} 4 & 8 \\ 8 & 26 \\ \end{array}\right] \hat{x} =\) \(\left[ \begin{array}{ccc} 36 \\ 112 \\ \end{array}\right] b\)
You can use elimination and solve \(4x_1 + 8x_2 = 36\) and \(8x_1 + 26x_2 = 112\) to find \(\hat{x} =\) \(\left[ \begin{array}{ccc} 1 \\ 4 \\ \end{array}\right]\)
We can also use \(A^TA\hat{x} = A^Tb\) and algebra to get \(\hat{x} = A^Tb / A^TA = A^Tb (A^TA)^{-1}\), which won’t work because the matrices don’t align, however \(\hat{x} = (A^TA)^{-1} A^Tb\) does give \(\left[ \begin{array}{ccc} 1 \\ 4 \\ \end{array}\right]\) (see below). I’m not sure why.
Also, if we use the R function solve on \(A^TA\) and \(A^Tb\) we get \(\hat{x} =\) \(\left[ \begin{array}{ccc} 1 \\ 4 \\ \end{array}\right]\) (see below).
invata <- solve(ata)
xhat <- invata %*% atb
xhat## [,1]
## [1,] 1
## [2,] 4
solve(ata, atb)## [,1]
## [1,] 1
## [2,] 4
\(E = ||Ax - b||^2 = \left[ \begin{array}{ccc} 1 \\ -3 \\ 5 \\ -3 \\ \end{array}\right]\) for \(x = \left[ \begin{array}{ccc} 1 \\ 4 \\ \end{array}\right]\)
The system Ax = p is given below:
\(\left[ \begin{array}{ccc} 1 & 0 \\ 1 & 1 \\ 1 & 3 \\ 1 & 4 \\ \end{array}\right]\) \(\left[ \begin{array}{ccc} x_1 \\ x_2 \\ \end{array}\right] =\) \(\left[ \begin{array}{ccc} 1 \\ 5 \\ 13 \\ 17 \\ \end{array}\right]\)
which is solved by \(x_1 = 1, x_2 =4\)
p <- matrix(c(1, 5, 13, 17), nrow=4)
atp <- t(A) %*% p
atp## [,1]
## [1,] 36
## [2,] 112
solve(ata, atp)## [,1]
## [1,] 1
## [2,] 4
\(b = \left[ \begin{array}{ccc} 0 \\ 8 \\ 8 \\ 20 \\ \end{array}\right]\)
\(p = \left[ \begin{array}{ccc} 1 \\ 5 \\ 13 \\ 17 \\ \end{array}\right]\)
\(b - p = \left[ \begin{array}{ccc} 0 - 1 \\ 8 - 5 \\ 8 - 13 \\ 20 - 17 \\ \end{array}\right] =\) \(\left[ \begin{array}{ccc} -1 \\ 3 \\ -5 \\ 3 \\ \end{array}\right]\)
e = b - p
e## [,1]
## [1,] -1
## [2,] 3
## [3,] -5
## [4,] 3
Two vectors are orthogonal when their dot product is zero: \(\mathbf{v} \cdot \mathbf{w} = 0\) or \(\mathbf{v}^T \mathbf{w} = 0\)
t(e) %*% p## [,1]
## [1,] 0
t(e) %*% A[,1]## [,1]
## [1,] 0
t(e) %*% A[,2]## [,1]
## [1,] 0
# dfrm <- read.table("C:/Users/robgo/Documents/CUNY/2016-2-Fall-IS605-Comp-Math/Week5/auto-mpg.data")
dfrm <- read.table("https://raw.githubusercontent.com/Godbero/IS605/master/auto-mpg.data")
A <- matrix(0, 392, 4)
A[,1] <- dfrm$V1
A[,2] <- dfrm$V2
A[,3] <- dfrm$V3
A[,4] <- dfrm$V4
b <- dfrm$V5
head(A)## [,1] [,2] [,3] [,4]
## [1,] 307 130 3504 12.0
## [2,] 350 165 3693 11.5
## [3,] 318 150 3436 11.0
## [4,] 304 150 3433 12.0
## [5,] 302 140 3449 10.5
## [6,] 429 198 4341 10.0
tail(A)## [,1] [,2] [,3] [,4]
## [387,] 151 90 2950 17.3
## [388,] 140 86 2790 15.6
## [389,] 97 52 2130 24.6
## [390,] 135 84 2295 11.6
## [391,] 120 79 2625 18.6
## [392,] 119 82 2720 19.4
head(b)## [1] 18 15 18 16 17 15
tail(b)## [1] 27 27 44 32 28 31
ata <- t(A) %*% A
atb <- t(A) %*% b
solve(ata, atb)## [,1]
## [1,] -0.030037938
## [2,] 0.157115685
## [3,] -0.006217883
## [4,] 1.997320955