1. Problem set 1

  1. Show that \(\mathbf{A}^T\mathbf{A} \neq \mathbf{A}\mathbf{A}^T\) in general.

   Let \(\mathbf A = \left[{a_{ij}}\right], \mathbf{A}^T = \mathbf B = \left[{b_{ji}}\right]\) where \(i,j<2\). Then \(\mathbf{A} \mathbf{B}\) and \(\mathbf{B} \mathbf{A}\) are trvially defined because the number of columns in the left matrix equals the number of rows in the right matrix, and \(i=0\) represents the the empty set while \(i=1\) represents scalars. For the general case of \(i,j \geq 2\), consider cases where \(i = j\), and so \(\mathbf A = \left[{a_{ij}}\right]\), \(\mathbf{A}^T = \mathbf B = \left[{b_{jk}}\right]\) are square matrices. Then proceed with proof by contradiction. Suppose \(\mathbf A \mathbf B = \mathbf B \mathbf A\) and take the example where \(i,j = 2\). Get

\[\mathbf A \mathbf B = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} a_{11} b_{11} + a_{12} b_{21} & a_{11} b_{12} + a_{12} b_{22} \\ a_{21} b_{11} + a_{22} b_{21} & a_{21} b_{12} + a_{22} b_{22} \end{bmatrix}\]

   and

\[\mathbf B \mathbf A = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = \begin{bmatrix} b_{11} a_{11} + b_{12} a_{21} & b_{11} a_{12} + b_{12} a_{22} \\ b_{21} a_{11} + b_{22} a_{21} & b_{21} a_{12} + b_{22} a_{22} \end{bmatrix}\]

   which negates the claim that \(\mathbf A \mathbf B = \mathbf B \mathbf A\) in general. Therefore, it is true that in general, \(\mathbf A \mathbf B \ne \mathbf B \mathbf A\) \(\blacksquare\) i


  1. For a special type of square matrix \(\mathbf{A}\), we get \(\mathbf{A}^T\mathbf{A} = \mathbf{A}\mathbf{A}^T\). Under what conditions could this be true? (Hint: The Identity matrix \(\mathbf{I}\) is an example of such a matrix).

    A symmetric matrix with \(a_{ij} = a_{ji}\) has a transpose that is equal to itself such that \(\mathbf{A}^T = \mathbf{A}\). Therefore all matrices of this form yield \(\mathbf{A}^T\mathbf{A} = \mathbf{A}\mathbf{A} = \mathbf{A}\mathbf{A}^T\). ii


2. Problem set 2

Matrix factorization is a very important problem. There are supercomputers built just to do matrix factorizations. Every second you are on an airplane, matrices are being factorized. Radars that track flights use a technique called Kalman filtering. At the heart of Kalman Filtering is a Matrix Factorization operation. Kalman Filters are solving linear systems of equations when they track your flight using radars. Write an R function to factorize a square matrix \(\mathbf{A}\) into \(\mathbf{LU}\) or \(\mathbf{LDU}\), whichever you prefer. You don’t have to worry about permuting rows of \(\mathbf{A}\) and you can assume that \(\mathbf{A}\) is less than \(5\times5\), if you need to hard-code any variables in your code.

factorization = function(A){
  r <- c <- dim(A)[1]
  L <- D <- matrix(diag(r),nrow=r, ncol=c)
  U <- A
  for (j in 1:(c - 1)) {
    for (i in (j + 1):r) {
      L[i,j] <- U[i,j] / U[j,j]
      U[i,] <- U[i,] - U[j,] * L[i,j]
    }
  }
  diag(D) <- diag(U)
  for (k in 1:r) {
    U[k,] <- U[k,] / U[k,k]
  }
  LDU <- list("L" = L, "D" = D, "U" = U)
return(LDU)
}

The function creates three matrices: two identity matrices \(\mathbf{L}, \mathbf{D}\) and the matrix \(\mathbf{U} = \mathbf{A}\). It then loops through \(\mathbf{L}\) placing the multipliers in the diagonal and through \(\mathbf{U}\) applying the pivot and multiply procedure. The diagonal of \(\mathbf{U}\) is then transferred to the diagonal of \(\mathbf{D}\), and then the rows of \(\mathbf{U}\) are looped through and divided by their respective pivots. A list is then returned with all the values for the three matrices. To test the function, a matrix \(\mathbf{A}\) of rank \(n\) is created and a test is run to see if its factorization \(\mathbf{LDU}\) is equivalent.

rank = 5
set.seed(10014)
x <- rnorm(rank^2)
A <- matrix(x, nrow=rank); A
##             [,1]       [,2]        [,3]       [,4]       [,5]
## [1,]  0.29510599 -0.7054909  0.47833482  0.4346340  1.8069750
## [2,] -1.76077985  1.1386157 -0.96048538  0.2011658 -0.5593332
## [3,]  2.15850481 -0.7107617 -1.09883393  1.2091517  1.3123407
## [4,]  1.35322735 -1.2040571 -0.07013568 -2.1669726  0.9981520
## [5,] -0.01176667  0.4352981  0.40104154  0.3656437  0.5578555
LDU <- factorization(A); LDU
## $L
##             [,1]       [,2]       [,3]       [,4] [,5]
## [1,]  1.00000000  0.0000000  0.0000000  0.0000000    0
## [2,] -5.96660160  1.0000000  0.0000000  0.0000000    0
## [3,]  7.31433760 -1.4489657  1.0000000  0.0000000    0
## [4,]  4.58556388 -0.6614036  0.5454438  1.0000000    0
## [5,] -0.03987271 -0.1325949 -0.3620513 -0.4371289    1
## 
## $D
##          [,1]      [,2]     [,3]      [,4]     [,5]
## [1,] 0.295106  0.000000  0.00000  0.000000 0.000000
## [2,] 0.000000 -3.070768  0.00000  0.000000 0.000000
## [3,] 0.000000  0.000000 -1.85385  0.000000 0.000000
## [4,] 0.000000  0.000000  0.00000 -3.445818 0.000000
## [5,] 0.000000  0.000000  0.00000  0.000000 2.114382
## 
## $U
##      [,1]      [,2]       [,3]       [,4]       [,5]
## [1,]    1 -2.390636  1.6208916  1.4728066  6.1231391
## [2,]    0  1.000000 -0.6166367 -0.9100181 -3.3288636
## [3,]    0  0.000000  1.0000000 -1.1215359 -1.5681336
## [4,]    0  0.000000  0.0000000  1.0000000  0.6130682
## [5,]    0  0.000000  0.0000000  0.0000000  1.0000000
L <- LDU[[1]] #or as.matrix(LDU$L)
D <- LDU[[2]] #or as.matrix(LDU$D)
U <- LDU[[3]] #or as.matrix(LDU$U)
all.equal(L %*% D %*% U, A)
## [1] TRUE