Problem Set 1

(1)

Show that \(A^{T}A \neq AA^{T}\) in general.

To prove this, there are two scenarios that needed to be considered. One is when A is a square matrix. The other is when A is a non-square matrix.

When A is a square matrix:

\[\begin{multline*} \begin{split} Let \space A & = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \\ then \space A^{T} &= \begin{bmatrix} a & c \\ b & d \end{bmatrix} \\ AA^{T} &= \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} a & c \\ b & d \end{bmatrix} \\ AA^{T} &= \begin{bmatrix} (a^2 + b^2) & (ac + bd) \\ (ac + bd) & (c^2 + d^2) \end{bmatrix} \\ and \space A^{T}A & = \begin{bmatrix} a & c \\ b & d \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} \\ A^{T}A & = \begin{bmatrix} (a^2 + c^2) & (ab + cd) \\ (ab + cd) & (b^2 + d^2) \end{bmatrix} \\ \therefore \space A^{T}A \neq AA^{T} \end{split} \end{multline*}\]

When A is a non-square matrix, let \(A = [X_{ij}]\) where \(i \neq j\). The product of \(AA^{T}\) is going to be of dimension \([Z_{ii}]\) where \(A^{T}A\) is of dimension \([Y_{jj}]\). In conclusion, we prove that \(A^{T}A \neq AA^{T}\) for any i,j \(\in \mathbb{I}, i \neq j\)

To demonstrate with an example on

\[\begin{equation} A^{T}A \neq AA^{T} \hspace{35pt} \hspace{45pt} (1) \end{equation}\]

holds.

Square Matrix E.g.

# Let matrix A be a 2x2
mat_A <- matrix(c(1,4,3,2), byrow=T, nrow=2, ncol=2)
mat_A
##      [,1] [,2]
## [1,]    1    4
## [2,]    3    2
# Transpose of matrix A
trans_A <- t(mat_A)
trans_A
##      [,1] [,2]
## [1,]    1    3
## [2,]    4    2
mat_A_A_trans <- mat_A %*% trans_A
mat_A_A_trans
##      [,1] [,2]
## [1,]   17   11
## [2,]   11   13
mat_A_trans_A <- trans_A %*% mat_A
mat_A_trans_A
##      [,1] [,2]
## [1,]   10   10
## [2,]   10   20
cat("Equation (1) is a true statement:", identical(mat_A_A_trans, mat_A_trans_A))
## Equation (1) is a true statement: FALSE

Non-square Matrix E.g.

# Let matrix C be a 3x2
mat_C <- matrix(c(1,4,3,2,5,1), byrow=T, nrow=3, ncol=2)
mat_C
##      [,1] [,2]
## [1,]    1    4
## [2,]    3    2
## [3,]    5    1
# Transpose of matrix C
trans_C <- t(mat_C)
trans_C
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    4    2    1
mat_C_C_trans <- mat_C %*% trans_C
mat_C_C_trans
##      [,1] [,2] [,3]
## [1,]   17   11    9
## [2,]   11   13   17
## [3,]    9   17   26
mat_C_trans_C <- trans_C %*% mat_C
mat_C_trans_C
##      [,1] [,2]
## [1,]   35   15
## [2,]   15   21
cat("Equation (1) is a true statement:", identical(mat_C_C_trans, mat_C_trans_C))
## Equation (1) is a true statement: FALSE

The 2 examples complementarily illustrate why equation (1) doesnโ€™t hold.

(2)

For a special type of square matrix A, we get \(A^TA=AA^T\) . Under what conditions could this be true?

Ans: \(A^TA=AA^T\) could only be true if \(A^T\) is the inverse of A such that \(A^TA = AA^T = I\)

\[\begin{multline*} \begin{split} A^TA & = I \\ Multiplying\space both\space sides\space with\space A,\space we\space get\hspace{35pt} AA^TA & = AI \\ Multiplying\space both\space sides\space with\space A^{-1},\space we \space get \hspace{35pt} AA^TAA^{-1} & = AIA^{-1} \\ AA^TI & = AA^{-1} \\ AA^T & = I \end{split} \end{multline*}\]

Problem Set 2

Matrix factorization is a very important problem. There are supercomputers built just to do matrix factorizations. Every second you are on an airplane, matrices are being factorized. Radars that track flights use a technique called Kalman filtering. At the heart of Kalman Filtering is a Matrix Factorization operation. Kalman Filters are solving linear systems of equations when they track your flight using radars. Write an R function to factorize a square matrix A into LU or LDU

Function LU_factorize defined

The LU_factorize function is based on the algorithm presented at https://rdrr.io/cran/matrixcalc/src/R/lu.decomposition.R

LU_factorize <- function(A){
  U = A # let the upper matrix be A
  L = diag(x = 1, nrow(A), ncol(A)) # construct a diagonal matrix using a baseR function
  
  # Creating Lower trainge (L) and Upper triangle (U)
  for (i in 1:(nrow(U) - 1)){
    for (j in (i + 1):nrow(U)){
      if (U[i, i] != 0){
        m = U[j, i] / U[i, i] # m is the multiplier
        L[j, i] = m
        U[j, ] = U[j, ] - m * U[i, ]
      }
    }
  }
  return(list("U"=U, "L"=L))
}

QA

n <- 5
x <- 99

test_sample_A = matrix(sample.int(x, n^2, replace=TRUE), nrow = n, ncol=n)
test_sample_B = matrix(c(1:16), nrow = 4, ncol = 4)
test_sample_c = matrix(c(1,5,8,-7,6,2,9,3,4), nrow=3)
test_sample_d = matrix(c(8,8,9,8), nrow=2, ncol=2)
all.equal(LU_factorize(test_sample_A)$L %*% LU_factorize(test_sample_A)$U, test_sample_A )
## [1] TRUE
all.equal(LU_factorize(test_sample_B)$L %*% LU_factorize(test_sample_B)$U, test_sample_B )
## [1] TRUE
all.equal(LU_factorize(test_sample_c)$L %*% LU_factorize(test_sample_c)$U, test_sample_c )
## [1] TRUE
all.equal(LU_factorize(test_sample_d)$L %*% LU_factorize(test_sample_d)$U, test_sample_d )
## [1] TRUE