Problem Set 1


Part 1

Show that \(A^TA \neq AA^T\) in general. (Proof and demonstration.)


Proof

Here we take a generalized 2x2 matrices with elements (a,b,c,d) and show how changing the order of multiplication with it’s transpose provides different results unless (a,b,c,d) meet specific criteria.


Transposition

Here we transpose A.

\[ A= \left[ {\begin{array}{cc} a & b \\ c & d \\ \end{array} } \right] \]

\[ A^T= \left[ {\begin{array}{cc} a & c \\ b & d \\ \end{array} } \right] \]


Multiplication

Here we multiply \(AA^T\) and \(A^TA\)

\[ AA^T= \left[ {\begin{array}{cc} a^2+b^2 & ac+bd \\ ac+bd & c^2+d^2 \\ \end{array} } \right] \]

\[ A^TA= \left[ {\begin{array}{cc} a^2+c^2 & ab+cd \\ ab+cd & b^2+d^2 \\ \end{array} } \right] \]


Criteria

When we set the matrices equal to each other element-wise we get three equations that must be true for the two orders of multiplication to be true. An obvious case is if b=c. However not in all cases will they be equal, so we have proved \(AA^T\neq A^TA\)

\(a^2+b^2=a^2+c^2\)
\(c^2+d^2=b^2+d^2\)
\(ac+bd=ab+cd\)


Answer

Since not for all values of (a,b,c,d) will the three equations above be true, we have proved \(AA^T\neq A^TA\)


Demonstration

We’re going to demonstrate this with a simple matrix.


Transposition

Here we transpose B.

\[ B= \left[ {\begin{array}{cc} 1 & 2 \\ 3 & 4 \\ \end{array} } \right] \]

\[ B^T= \left[ {\begin{array}{cc} 1 & 3 \\ 2 & 4 \\ \end{array} } \right] \]


Multiplication

Here we multiply \(BB^T\) and \(B^TB\)

\[ BB^T= \left[ {\begin{array}{cc} 1^2+2^2 & 1*3+2*4 \\ 1*3+2*4 & 3^2+4^2 \\ \end{array} } \right] \]

\[ B^TB= \left[ {\begin{array}{cc} 1^2+3^2 & 1*2+3*4 \\ 1*2+3*4 & 2^2+4^2 \\ \end{array} } \right] \]


Simplify

Here we multiply \(BB^T\) and \(B^TB\)

\[ BB^T= \left[ {\begin{array}{cc} 5 & 11 \\ 11 & 25 \\ \end{array} } \right] \]

\[ B^TB= \left[ {\begin{array}{cc} 10 & 14 \\ 14 & 20 \\ \end{array} } \right] \]


Answer

Above you can see that \(BB^T\neq B^TB\)


Verification in R

Here we input the matrix B into R and calculate both \(BB^T\) and \(B^TB\) to show they equal what we calculated by hand above.

B <- matrix(c(1,3,2,4),ncol=2)
B
##      [,1] [,2]
## [1,]    1    2
## [2,]    3    4
B%*%t(B)
##      [,1] [,2]
## [1,]    5   11
## [2,]   11   25
t(B)%*%B
##      [,1] [,2]
## [1,]   10   14
## [2,]   14   20


Part 2

For a special type of square matrix A, we get \(A^TA=AA^T\). Under what conditions could this be true?

We incidentally solved this in completing Part 1. If a 2x2 matrix satisfies the following criteria then in fact \(A^TA=AA^T\).

\(a^2+b^2=a^2+c^2\)
\(c^2+d^2=b^2+d^2\)
\(ac+bd=ab+cd\)


Symmetric Matrices

Any 2x2 matrix where b=c is symmetric and would satisfy the criteria. An example of this is the Identity matrix.

\[ Symmetric\ Matrix= \left[ {\begin{array}{cc} a & b \\ b & d \\ \end{array} } \right] \]


Answer

When a matrix satisfies the property that it commutes with it’s transpose then necessarily it is symmetric. I tried coming up with a 2x2 matrix that wasn’t symmetric but did commute with it’s transpose but the answer is in the criteria; I don’t think you can satisfy all three equations unless b=c.

It looks like there are complex matrices that are normal when they commute with their transpose, and they might have additional properties, but another trail for another day. From the wikipedia article for ‘Normal matrix’, “a complex square matrix \(A\) is normal if it commutes with its conjugate transpose \(A*\)”.




Problem Set 2

Matrix factorization is a very important problem. There are supercomputers built just to do matrix factorizations. Every second you are on an airplane, matrices are being factorized. Radars that track flights use a technique called Kalman filtering. At the heart of Kalman Filtering is a Matrix Factorization operation. Kalman Filters are solving linear systems of equations when they track your flight using radars.

Write an R function to factorize a square matrix A (4x4 3x3 or 2x2) into LU. You don’t have to worry about permuting rows of A.


Library

To verify our work we’re going to use the lu function from the pracma library.

library("pracma")


Data

Here are the matrices we’ve used so far to test our functions.

#A <- matrix(c(1,3, 2,4),ncol=2)
A <- matrix(c(2,4, 3,5),ncol=2)
B <- matrix(c(1,2,-2, 2,3,3, 3,1,-2),ncol=3)
#B <- matrix(c(1,-2,3, 4,8,4, -3,5,7),ncol=3)
C <- matrix(c(1,3,-1,-3, -2,-9,2,-6, -2,0,4,26, -3,-9,7,2),ncol=4)


2x2 Matrix LU

Here is the function we built to do LU factorization on a 2x2 matrix. The return code I found from typing lu in the console to read the underlying code for that function.

lu_2by2 <- function(A) {
  
  L <- diag(2)
  U <- A
  
  L[2,1] <- A[2,1]/A[1,1]
  U[2,1] <- U[2,1]-L[2,1]*U[1,1]
  U[2,2] <- U[2,2]-L[2,1]*U[1,2]
  
  return(list(L = L, U = U))
}


Test

The function we wrote should return the same output as the pracma lu function for any given 2x2 matrix.

lu_2by2(A)
## $L
##      [,1] [,2]
## [1,]    1    0
## [2,]    2    1
## 
## $U
##      [,1] [,2]
## [1,]    2    3
## [2,]    0   -1
lu(A)
## $L
##      [,1] [,2]
## [1,]    1    0
## [2,]    2    1
## 
## $U
##      [,1] [,2]
## [1,]    2    3
## [2,]    0   -1


3x3 Matrix LU

Here is the function we built to do LU factorization on a 3x3 matrix. The line doing the row operation on U[3,1] for the second time is redundant since U[2,1] is already zero however I left it in to look for patterns.

lu_3by3 <- function(B) {
  
  L <- diag(3)
  U <- B
  
  L[2,1] <- U[2,1]/U[1,1]
  U[2,1] <- U[2,1]-L[2,1]*U[1,1]
  U[2,2] <- U[2,2]-L[2,1]*U[1,2]
  U[2,3] <- U[2,3]-L[2,1]*U[1,3]
  
  L[3,1] <- U[3,1]/U[1,1]
  U[3,1] <- U[3,1]-L[3,1]*U[1,1]
  U[3,2] <- U[3,2]-L[3,1]*U[1,2]
  U[3,3] <- U[3,3]-L[3,1]*U[1,3]
  
  L[3,2] <- U[3,2]/U[2,2]
  U[3,1] <- U[3,1]-L[3,2]*U[2,1]
  U[3,2] <- U[3,2]-L[3,2]*U[2,2]
  U[3,3] <- U[3,3]-L[3,2]*U[2,3]
  
  return(list(L = L, U = U))
}


Test

The function we wrote should return the same output as the pracma lu function for any given 3x3 matrix.

lu_3by3(B)
## $L
##      [,1] [,2] [,3]
## [1,]    1    0    0
## [2,]    2    1    0
## [3,]   -2   -7    1
## 
## $U
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    0   -1   -5
## [3,]    0    0  -31
lu(B)
## $L
##      [,1] [,2] [,3]
## [1,]    1    0    0
## [2,]    2    1    0
## [3,]   -2   -7    1
## 
## $U
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    0   -1   -5
## [3,]    0    0  -31


4x4 Matrix LU

Here is the function we built to do LU factorization on a 4x4 matrix. There are a lot of redundant row operations since progressively we have more zeros; however I was interested in perserving the pattern. Next steps would be to remove unnecessary lines of code, switch to a vectorized programming solution, and to generalize the pattern for all sizes of matrix.

lu_4by4 <- function(C) {
  
  L <- diag(4)
  U <- C
  
  L[2,1] <- U[2,1]/U[1,1]
  U[2,1] <- U[2,1]-L[2,1]*U[1,1]
  U[2,2] <- U[2,2]-L[2,1]*U[1,2]
  U[2,3] <- U[2,3]-L[2,1]*U[1,3]
  U[2,4] <- U[2,4]-L[2,1]*U[1,4]
  
  L[3,1] <- U[3,1]/U[1,1]
  U[3,1] <- U[3,1]-L[3,1]*U[1,1]
  U[3,2] <- U[3,2]-L[3,1]*U[1,2]
  U[3,3] <- U[3,3]-L[3,1]*U[1,3]
  U[3,4] <- U[3,4]-L[3,1]*U[1,4]

  L[4,1] <- U[4,1]/U[1,1]
  U[4,1] <- U[4,1]-L[4,1]*U[1,1]
  U[4,2] <- U[4,2]-L[4,1]*U[1,2]
  U[4,3] <- U[4,3]-L[4,1]*U[1,3]
  U[4,4] <- U[4,4]-L[4,1]*U[1,4]
  
  L[3,2] <- U[3,2]/U[2,2]
  U[3,1] <- U[3,1]-L[3,2]*U[2,1]
  U[3,2] <- U[3,2]-L[3,2]*U[2,2]
  U[3,3] <- U[3,3]-L[3,2]*U[2,3]
  U[3,4] <- U[3,4]-L[3,2]*U[2,4]

  L[4,2] <- U[4,2]/U[2,2]
  U[4,1] <- U[4,1]-L[4,2]*U[2,1]
  U[4,2] <- U[4,2]-L[4,2]*U[2,2]
  U[4,3] <- U[4,3]-L[4,2]*U[2,3]
  U[4,4] <- U[4,4]-L[4,2]*U[2,4]
  
  L[4,3] <- U[4,3]/U[3,3]
  U[4,1] <- U[4,1]-L[4,3]*U[3,1]
  U[4,2] <- U[4,2]-L[4,3]*U[3,2]
  U[4,3] <- U[4,3]-L[4,3]*U[3,3]
  U[4,4] <- U[4,4]-L[4,3]*U[3,4]
  
  return(list(L = L, U = U))
}


Test

The function we wrote should return the same output as the pracma lu function for any given 4x4 matrix.

lu_4by4(C)
## $L
##      [,1] [,2] [,3] [,4]
## [1,]    1    0    0    0
## [2,]    3    1    0    0
## [3,]   -1    0    1    0
## [4,]   -3    4   -2    1
## 
## $U
##      [,1] [,2] [,3] [,4]
## [1,]    1   -2   -2   -3
## [2,]    0   -3    6    0
## [3,]    0    0    2    4
## [4,]    0    0    0    1
lu(C)
## $L
##      [,1] [,2] [,3] [,4]
## [1,]    1    0    0    0
## [2,]    3    1    0    0
## [3,]   -1    0    1    0
## [4,]   -3    4   -2    1
## 
## $U
##      [,1] [,2] [,3] [,4]
## [1,]    1   -2   -2   -3
## [2,]    0   -3    6    0
## [3,]    0    0    2    4
## [4,]    0    0    0    1