Problem Set 1

(1) Show that \((\mathbf{A}^T\mathbf{A})\neq(\mathbf{A}\mathbf{A}^T)\) in general. (Proof and demonstration.)

If the matrix A is not square, then \((\mathbf{A}^T\mathbf{A})\neq(\mathbf{A}\mathbf{A}^T)\). The reason is that two matrices can only be multiplied when the number of columns in the first equals the number of rows in the second. So, for \((\mathbf{A}\mathbf{B})\), matrix A must have dimensions \(m \times n\) and matrix B must have dimensions \(n \times s\). In our case B is \(\mathbf{A}^T\), so its dimensions are \(n \times m\).

The product of two matrices has the same number of rows as the first matrix and the same number of columns as the second, so \((\mathbf{A}\mathbf{B})\) has dimensions \(m \times s\). For \((\mathbf{A}^T\mathbf{A})\), where \(\mathbf{A}^T\) is \(n \times m\) and A is \(m \times n\), the product is \(n \times n\). For \((\mathbf{A}\mathbf{A}^T)\), the multiplication is between an \(m \times n\) and an \(n \times m\) matrix, so the product is \(m \times m\). Since A is not square, \(n \neq m\), the two products have different dimensions, and therefore \((\mathbf{A}^T\mathbf{A})\neq(\mathbf{A}\mathbf{A}^T)\).
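The dimension argument can be checked numerically. The following is a minimal R sketch using a 2 x 3 matrix (so m = 2 and n = 3); the variable names `AtA` and `AAt` are chosen here for illustration only.

```r
# A 2 x 3 matrix (m = 2 rows, n = 3 columns), filled row by row
A <- matrix(1:6, nrow = 2, byrow = TRUE)

AtA <- t(A) %*% A   # (n x m) times (m x n) gives n x n
AAt <- A %*% t(A)   # (m x n) times (n x m) gives m x m

dim(AtA)  # 3 3
dim(AAt)  # 2 2
```

Because the two products do not even have the same shape, they cannot be equal.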

\[\mathbf{A} = \left[\begin{array} {rrr} 1 & 2 & 3 \\ 4 & 5 & 6 \end{array}\right] , \quad \mathbf{A}^T = \left[\begin{array} {rr} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{array}\right] \]

\[\mathbf{A}\mathbf{A}^T = \left[\begin{array} {rrr} 1 & 2 & 3 \\ 4 & 5 & 6 \end{array}\right] \left[\begin{array} {rr} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{array}\right] = \left[\begin{array} {rr} 14 & 32 \\ 32 & 77 \end{array}\right] \]

\[\mathbf{A}^T\mathbf{A} = \left[\begin{array} {rr} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{array}\right] \left[\begin{array} {rrr} 1 & 2 & 3 \\ 4 & 5 & 6 \end{array}\right]= \left[\begin{array} {rrr} 17 & 22 & 27 \\ 22 & 29 & 36 \\ 27 & 36 & 45 \end{array}\right] \]

\[\left[\begin{array} {rr} 14 & 32 \\ 32 & 77 \end{array}\right] \neq \left[\begin{array} {rrr} 17 & 22 & 27 \\ 22 & 29 & 36 \\ 27 & 36 & 45 \end{array}\right] \]

Even if matrix A is square, the two products generally differ. Multiplying two 2 x 2 matrices gives:

\[\left[\begin{array} {rr} a & b \\ c & d \end{array}\right] \left[\begin{array} {rr} e & f \\ g & h \end{array}\right]= \left[\begin{array} {rr} (ae+bg) & (af+bh) \\ (ce+dg) & (cf+dh) \end{array}\right] \]

So if we take a matrix A and multiply it by its transpose:

\[\mathbf{A}\mathbf{A}^T =\left[\begin{array} {rr} a & b \\ c & d \end{array}\right] \left[\begin{array} {rr} a & c \\ b & d \end{array}\right]= \left[\begin{array} {rr} (a^2+b^2) & (ac+bd) \\ (ca+db) & (c^2+d^2) \end{array}\right] \]

and if we take the transpose and multiply it by the matrix:

\[\mathbf{A}^T\mathbf{A} =\left[\begin{array} {rr} a & c \\ b & d \end{array}\right] \left[\begin{array} {rr} a & b \\ c & d \end{array}\right]= \left[\begin{array} {rr} (a^2+c^2) & (ab+cd) \\ (ba+dc) & (b^2+d^2) \end{array}\right] \]

we see that, unless special conditions hold (for example \(b = c\), which makes A symmetric), generally:

\[\left[\begin{array} {rr} (a^2+b^2) & (ac+bd) \\ (ca+db) & (c^2+d^2) \end{array}\right] \neq \left[\begin{array} {rr} (a^2+c^2) & (ab+cd) \\ (ba+dc) & (b^2+d^2) \end{array}\right] \]

An example of that would be:

\[\mathbf{A}\mathbf{A}^T =\left[\begin{array} {rr} 1 & 2 \\ 3 & 4 \end{array}\right] \left[\begin{array} {rr} 1 & 3 \\ 2 & 4 \end{array}\right]= \left[\begin{array} {rr} 5 & 11 \\ 11 & 25 \end{array}\right] \]

\[\mathbf{A}^T\mathbf{A} =\left[\begin{array} {rr} 1 & 3 \\ 2 & 4 \end{array}\right]\left[\begin{array} {rr} 1 & 2 \\ 3 & 4 \end{array}\right]= \left[\begin{array} {rr} 10 & 14 \\ 14 & 20 \end{array}\right] \]

\[ \left[\begin{array} {rr} 5 & 11 \\ 11 & 25 \end{array}\right] \neq \left[\begin{array} {rr} 10 & 14 \\ 14 & 20 \end{array}\right] \]

(2) For a special type of square matrix A, we get \((\mathbf{A}^T\mathbf{A})=(\mathbf{A}\mathbf{A}^T)\). Under what conditions could this be true? (Hint: The Identity matrix I is an example of such a matrix).

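\((\mathbf{A}^T\mathbf{A})=(\mathbf{A}\mathbf{A}^T)\) whenever \((\mathbf{A}^T)=(\mathbf{A})\), that is, when A is a symmetric matrix (the identity matrix is one such case). Symmetry is sufficient but not necessary: the equality also holds, for example, for orthogonal matrices, where both products equal I.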
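As a quick numerical check, the sketch below tests the equality on a symmetric matrix, a rotation matrix (orthogonal but not symmetric), and the 2 x 2 example from part (1). The helper name `commutes_with_transpose` and the matrices `S`, `Rot`, and `B` are hypothetical, introduced only for this illustration.

```r
# Hypothetical helper: TRUE when t(A) %*% A equals A %*% t(A) up to rounding
commutes_with_transpose <- function(A) {
  isTRUE(all.equal(t(A) %*% A, A %*% t(A)))
}

# Symmetric matrix: t(S) equals S
S <- matrix(c(2, 1, 1, 3), nrow = 2)

# Rotation by 30 degrees: orthogonal, so both products are the identity
theta <- pi / 6
Rot <- matrix(c(cos(theta), sin(theta),
                -sin(theta), cos(theta)), nrow = 2)

# The matrix [1 2; 3 4] from part (1), filled column by column
B <- matrix(c(1, 3, 2, 4), nrow = 2)

commutes_with_transpose(S)    # TRUE
commutes_with_transpose(Rot)  # TRUE
commutes_with_transpose(B)    # FALSE
```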


Problem Set 2

Write an R function to factorize a square matrix A into LU or LDU. Don’t worry about permuting rows of A and you can assume that A is less than 5x5.

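LU = function(A){
  # Factor a square matrix A (2x2, 3x3 or 4x4) into L %*% U without row exchanges.
  # Assumes every pivot (A[1,1], then A[2,2], ...) is nonzero.
  n = dim(A)[1]
  stopifnot(dim(A)[2] == n, n >= 2, n <= 4)
  I = diag(n)
  # One elimination matrix per entry below the diagonal, each starting as I
  Isub2 = diag(n)
  Isub3 = diag(n)
  Isub4 = diag(n)
  Isub5 = diag(n)
  Isub6 = diag(n)
  Isub7 = diag(n)
  
  # Eliminate A[2,1]: subtract (A[2,1]/A[1,1]) times row 1 from row 2,
  # recording the same row operation in Isub2
  Isub2[2,] = -(A[2,1]/A[1,1])*I[1,]+I[2,]
  A[2,] = -(A[2,1]/A[1,1])*A[1,]+A[2,]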
  
  if (n==3){
    Isub3[3,] = -(A[3,1]/A[1,1])*I[1,]+I[3,]
    A[3,] = -(A[3,1]/A[1,1])*A[1,]+A[3,]
    Isub4[3,] = -(A[3,2]/A[2,2])*I[2,]+I[3,]
    A[3,] = -(A[3,2]/A[2,2])*A[2,]+A[3,]
    # L
    L = solve(Isub2) %*% solve(Isub3) %*% solve(Isub4)
    
    } else if (n==4){
      Isub3[3,] = -(A[3,1]/A[1,1])*I[1,]+I[3,]
      A[3,] = -(A[3,1]/A[1,1])*A[1,]+A[3,]
      Isub5[4,] = -(A[4,1]/A[1,1])*I[1,]+I[4,]
      A[4,] = -(A[4,1]/A[1,1])*A[1,]+A[4,]
      Isub4[3,] = -(A[3,2]/A[2,2])*I[2,]+I[3,]
      A[3,] = -(A[3,2]/A[2,2])*A[2,]+A[3,]
      # adding on to n=3
      Isub6[4,] = -(A[4,2]/A[2,2])*I[2,]+I[4,]
      A[4,] = -(A[4,2]/A[2,2])*A[2,]+A[4,]
      Isub7[4,] = -(A[4,3]/A[3,3])*I[3,]+I[4,]
      A[4,] = -(A[4,3]/A[3,3])*A[3,]+A[4,]
      # L
      L = solve(Isub2) %*% solve(Isub3) %*% solve(Isub4) %*% solve(Isub5) %*% solve(Isub6) %*% solve(Isub7)
      
    } else {
      # n == 2: only the single elimination step on row 2 was applied
      L = solve(Isub2)
    }
  
  U = A
  output = list(L,U)
  return(output)
}
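The same elimination can be written with loops so that it is not tied to a fixed size. The following is a minimal sketch, not a replacement for the function above: the name `lu_loop` and the test matrix are assumptions made for illustration, and, like the original, it attempts no row exchanges and so requires every pivot to be nonzero.

```r
# Sketch: LU factorization without row exchanges for any n x n matrix.
# Assumes every pivot U[j, j] is nonzero (no permutation is attempted).
lu_loop <- function(A) {
  n <- nrow(A)
  stopifnot(n == ncol(A))
  L <- diag(n)   # multipliers are stored below the diagonal
  U <- A         # reduced step by step to upper triangular form
  if (n > 1) {
    for (j in 1:(n - 1)) {          # pivot column
      if (U[j, j] == 0) stop("zero pivot: a row exchange would be needed")
      for (i in (j + 1):n) {        # rows below the pivot
        L[i, j] <- U[i, j] / U[j, j]
        U[i, ] <- U[i, ] - L[i, j] * U[j, ]
      }
    }
  }
  list(L = L, U = U)
}

# Example matrix [2 1 1; 4 3 5; -2 1 4], filled column by column
A <- matrix(c(2, 4, -2,
              1, 3,  1,
              1, 5,  4), nrow = 3)
f <- lu_loop(A)
all.equal(f$L %*% f$U, A)   # checks that the factors multiply back to A
```

Storing each multiplier directly in `L` avoids building and inverting a separate elimination matrix for every step.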

Testing

# Generating random matrices and testing to see if we get a lower and an upper triangular matrix
# 2x2 matrix
A = matrix(sample(20,4,T),2)
LU(A)
## [[1]]
##      [,1] [,2]
## [1,]    1    0
## [2,]    0    1
## 
## [[2]]
##      [,1]     [,2]
## [1,]   14  4.00000
## [2,]    0 15.85714
# 3x3 matrix
A = matrix(sample(20,9,T),3)
LU(A)
## [[1]]
##        [,1]      [,2] [,3]
## [1,] 1.0000  0.000000    0
## [2,] 1.0625  1.000000    0
## [3,] 0.5000 -3.076923    1
## 
## [[2]]
##      [,1]  [,2]   [,3]
## [1,]   16 20.00  2.000
## [2,]    0 -3.25  4.875
## [3,]    0  0.00 26.000
# 4x4 matrix
A = matrix(sample(20,16,T),4)
LU(A)
## [[1]]
##      [,1]     [,2]     [,3] [,4]
## [1,]  1.0 0.000000 0.000000    0
## [2,]  1.0 1.000000 0.000000    0
## [3,]  2.0 2.111111 1.000000    0
## [4,]  2.8 2.611111 1.542411    1
## 
## [[2]]
##      [,1] [,2]      [,3]       [,4]
## [1,]    5   20  15.00000  18.000000
## [2,]    0  -18  -1.00000 -17.000000
## [3,]    0    0 -24.88889  12.888889
## [4,]    0    0   0.00000  -7.891071
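One way to sanity-check printed factors is to multiply them back together, and the same factors can be taken one step further to LDU. The sketch below uses the 3x3 factors printed above; treating the printed -3.076923 as exactly -40/13 is an assumption, and the names `A_rebuilt`, `D`, and `U1` are introduced here for illustration.

```r
# Factors copied from the 3x3 output; -40/13 is assumed to be the exact
# value behind the printed -3.076923
L <- matrix(c(1, 1.0625, 0.5,
              0, 1, -40/13,
              0, 0, 1), nrow = 3)       # filled column by column
U <- matrix(c(16, 0, 0,
              20, -3.25, 0,
              2, 4.875, 26), nrow = 3)

# Multiplying the factors should give back the sampled matrix A
A_rebuilt <- L %*% U

# LDU: pull the pivots out of U so the upper factor has a unit diagonal
D  <- diag(diag(U))
U1 <- solve(D) %*% U

all.equal(L %*% D %*% U1, A_rebuilt)
```

Under that assumption, `A_rebuilt` comes out with whole-number entries between 1 and 20, which is consistent with a matrix drawn by `sample(20,9,T)`.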