Problem Set 1

  1. Prove that \(A^TA \ne AA^T\) in general.
    Assume \(A\) is an \(m x n\) matrix, with \(m \in \mathbb{R}\) and \(n \in \mathbb{R}\). Let’s also assum \(m \ne n\). Since A is \(m x n\), \(A^T\) is then \(n x m\). If we just compare dimensions when we multiply each term of the equation, we see that each side could result in differing dimensions.

\[ \begin{align} _{(m x n)}AA^T_{(nxm)} = B_{(m x m)} \end{align} \]

\[ \begin{align} _{(n x m)}A^TA_{(mxn)} = C_{(n x n)} \end{align} \] The subscripts in parentheses above denote the dimensions of the corresponding matrices.Because \(C\) is \(n x n\), and \(B\) is \(m x m\), and \(m\) is not necessarily equal to \(n\), then \(B \ne C\) and \(AA^T \ne A^TA\).

  1. For a special type of square matrix A, we get \(A^T A = AA^T\) . Under what conditions could this be true? (Hint: The Identity matrix I is an example of such a matrix).

To solve this, let’s take an (educated) guess that \(A\) is diagonal, which would include the identity matrix \(\mathbb{I}\). Then, we can check our assumption and tha \(A\) meets the condition

A diagonal matrix \(A\) takes the form:

\[\begin{align} A = \begin{bmatrix} a_{11} & 0 & \dots \\ \vdots & \ddots & \\ 0 & & a_{KK} \end{bmatrix} \end{align}\]

with \(a_{ii} \ne 0\) and \(a_{ij} = 0\) when \(i \ne j\). In simpler terms, only the diagonal terms of the matrix are non-zero. When this is the case, \(A = A^T\) and we can rewrite the original equation substituting \(A\) for \(A^T\):

\[\begin{align} AA^T = AA = A^2 \end{align}\]

\[\begin{align} A^TA = AA = A^2 \end{align}\]

Since both sides of the equation are equal to \(A^2\), they are equal to each other, and \(AA^T = A^TA\) in the case that \(A\) is a diagonal matrix.

A = LU Factorization

First, let’s define our function(s) to factorize a given square matrix A into lower and upper triangular matrices: \(A= LU\):

# create test 3x3 matrix
sample_matrix <- matrix(data=c(1,2,3,5,5,6,7,8,9), nrow=3, ncol=3)

Adding a helper functions that will be called in lu_factorization to construct the initial \(L\) for any beginning matrix size \(n\). \(L\) starts as the identity matrix then becomes populated with “row multipliers” below the diagonal.

baseL <- function(N){
  # Returns "base" lower-triangular matrix: L
  L <- matrix(data=NA, nrow=N, ncol=N)
  # setting diagonal vals to 1 and other vals to 0
  for (i in seq(N)){
    for (j in seq(N))
      # Set diagonal vals to 1
      if (i == j){
        L[i, j] = 1
      }
      # Set other vals to 0
      else{
        L[i, j] = 0
      }
  }
  return(L)
}

Now, let’s write out our function to perform LU decomposition of a given matrix

lu_factorization <- function(A){
  # A is square, so n x n by definition
  n <- nrow(A)
  L <- baseL(n)

  # We will derive U from A, so let's set them as equal to start
  U <- A
  # First, iterate over rows
  for (i in 1:n){
    if (i > 1) { # skip first row in matrix - we won't need to row-reduce it!
      
      # nowmiterate over column vals
      for (j in 1:n){
        multiplier <- U[i, j] / U[j, j]
        if (j < i){ # only need to operate/row-reduce on index pairs below the diagonal
          
          # Set L in proper position to have multiplier (below diagonal)
          L[i, j] = multiplier
          
          # Perform row reduction on U
          U[i, ] = U[i, ] - L[i,j] * U[j, ]
        }
      }
    }
  }
  return(list(L, U))
}

# Sample test call of function
result <- lu_factorization(sample_matrix)
L <- result[[1]]
U <- result[[2]]

# print out our results
print(L)
##      [,1] [,2] [,3]
## [1,]    1  0.0    0
## [2,]    2  1.0    0
## [3,]    3  1.8    1
print(U)
##      [,1] [,2] [,3]
## [1,]    1    5  7.0
## [2,]    0   -5 -6.0
## [3,]    0    0 -1.2
print(L %*% U)
##      [,1] [,2] [,3]
## [1,]    1    5    7
## [2,]    2    5    8
## [3,]    3    6    9
# Testing out above LU function on sample 4x4 matrix
matrix4x4 <- matrix(c(2,3,1,5,7,2,3,4,1,0,5,4,3,3,4,5), nrow=4, ncol=4)

result4 <- lu_factorization(matrix4x4)

L4 <- result4[[1]]
U4 <- result4[[2]]

print(L4)
##      [,1]       [,2]      [,3] [,4]
## [1,]  1.0 0.00000000 0.0000000    0
## [2,]  1.5 1.00000000 0.0000000    0
## [3,]  0.5 0.05882353 1.0000000    0
## [4,]  2.5 1.58823529 0.8461538    1
print(U4)
##      [,1] [,2]          [,3]      [,4]
## [1,]    2  7.0  1.000000e+00  3.000000
## [2,]    0 -8.5 -1.500000e+00 -1.500000
## [3,]    0  0.0  4.588235e+00  2.588235
## [4,]    0  0.0 -4.440892e-16 -2.307692
print(L4 %*% U4)
##      [,1] [,2] [,3] [,4]
## [1,]    2    7    1    3
## [2,]    3    2    0    3
## [3,]    1    3    5    4
## [4,]    5    4    4    5