HW2_605

jbrnbrg

September 6, 2017

1.1:

Show \(A^{T}A \not=AA^{T}\):

:JB

Let’s assume it’s true for the moment: \(A^{T}A = AA^{T}\)

If we substitute \(B\) in for \(A^{T}\) in the above equation, i.e. let \(B=A^{T}\), we can see that it would result in a form that defies the basic law of matrices: \(BA \not=AB\). Therefore, \(A^{T}A \not=AA^{T}\).

Below, I’ll demonstrate in R to provide further evidence:

A <- matrix(c(2, -1, -2, -4, 6, 3, -4, -2, 8), nrow=3, byrow=TRUE )
At <- t(A) # to get inverse of A

(At %*% A) == (A %*% At)
##       [,1]  [,2]  [,3]
## [1,] FALSE FALSE FALSE
## [2,] FALSE FALSE FALSE
## [3,] FALSE FALSE FALSE

1.2:

Under what conditions could it be true that: \(A^{-1} =A^{T}\)

:JB

The special type of matrices are orthogonal matrices. They are matrices that have the property:\(A^{-1} =A^{T}\).

Therefore, for orthogonal matrices: \(A^{T}A =A^{-1}A=I\) - which also shows that they must be invertable. For invertable matrices we know the following truth: \(A^{-1}A =AA^{-1}\) which allows us to show that for orthogonal matrices, \(A^{T}A =AA^{T}=I\).

2.0:

For LU decomposition, I’ve created a small function for creating a sort of pivot matrix that allows me to break down row operations:

pivot_maker <- function(n, i, j, element) {
  p <- diag(n)
  p[i,j] <- element
  return(p)    
}

For illustration, here’s an example matrix \(A\) that I’ll use to show it works:

(my_A <- matrix(c(2, -1, -2, -4, 6, 3, -4, -2, 8 ), nrow=3, byrow=TRUE ))
##      [,1] [,2] [,3]
## [1,]    2   -1   -2
## [2,]   -4    6    3
## [3,]   -4   -2    8

If we take the dimensions of \(A\) and use it as input in the function, we can define our \(E_1\) - note that reciprocal of \(A_{11}\) at position \((1,1)\):

(E1 <- pivot_maker(nrow(my_A), 1, 1, 1/my_A[1,1]))
##      [,1] [,2] [,3]
## [1,]  0.5    0    0
## [2,]  0.0    1    0
## [3,]  0.0    0    1

Left-hand multiplying: \(E_{1}A\) begins the row-reduction process and yields:

E1 %*% my_A
##      [,1] [,2] [,3]
## [1,]    1 -0.5   -1
## [2,]   -4  6.0    3
## [3,]   -4 -2.0    8

My solution algorithm begins from top-left (position (1,1)) to bottom-right (position (n, n)). It is based on the pseudo-code found on this Georgia Institute of Technology pdf and this Illinois Institute of Technology pdf as well - both discuss Partial Pivoting and LU factorization.

Basically, If I set up \(A\) next to an identity matrix \(I\) to create an augmented matrix and complete the row-reduction process, the right-side matrix will capture the permutations as \(L^{-1}\) and the left-hand side will become \(U\) and provide the LU decomposition of \(A\).

# input matrix must be scquare!
my_lu_decomp <- function(matrix_a){
  
  n <- nrow(matrix_a)                 
  matrix_e <- cbind(matrix_a, diag(n)) # aug_matrix:
  
  
  for(j in 1:n){
    matrix_e <- pivot_maker(n, j, j, 1/matrix_e[j,j]) %*% matrix_e
    
    if(j == n){                        # end at diagnal
      break()
    }
    
    for(i in (j+1):n){
      matrix_e <- pivot_maker(n, i, j, -matrix_e[i,j]) %*% matrix_e
    }
  }
  l<-solve(matrix_e[,4:6])             # inverse of e provides L
  u<-matrix_e[,1:3]
  
  lu_deco <- cbind(l,u)
  return(lu_deco)
}

Using the following example \(A\), I’ll demonstrate the function at work:

my_A <- matrix(c(2, -1, -2, -4, 6, 3, -4, -2, 8 ), nrow=3, byrow=TRUE )

Note that the function outputs an n-by-2n augmented matrix that contains both the \(L\) and \(U\) that must be extracted:

full <- my_lu_decomp(my_A)
(L <- full[,1:3])
##      [,1] [,2] [,3]
## [1,]    2    0    0
## [2,]   -4    4    0
## [3,]   -4   -4    3
(U <- full[,4:6])
##      [,1] [,2]  [,3]
## [1,]    1 -0.5 -1.00
## [2,]    0  1.0 -0.25
## [3,]    0  0.0  1.00
L%*%U 
##      [,1] [,2] [,3]
## [1,]    2   -1   -2
## [2,]   -4    6    3
## [3,]   -4   -2    8
L%*%U == my_A
##      [,1] [,2] [,3]
## [1,] TRUE TRUE TRUE
## [2,] TRUE TRUE TRUE
## [3,] TRUE TRUE TRUE

Notes: Although inefficient and limited, this algorithm is pretty compact for handling so many calculations. It will not work if we have a zero in position 1,1 and it will also fail or become unstable for small numbers.

Here is a step-by-step example to see the two types of row operations at work:

my_A<-(cbind(my_A, diag(nrow(my_A))))

# divide through by position 1,1
E1 <- pivot_maker(nrow(my_A), 1, 1, 1/my_A[1,1]) %*% my_A

E1
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    1 -0.5   -1  0.5    0    0
## [2,]   -4  6.0    3  0.0    1    0
## [3,]   -4 -2.0    8  0.0    0    1
E2 <- pivot_maker(nrow(my_A), 2, 1, -E1[2,1]) %*% E1

E2
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    1 -0.5   -1  0.5    0    0
## [2,]    0  4.0   -1  2.0    1    0
## [3,]   -4 -2.0    8  0.0    0    1
E3 <- pivot_maker(nrow(my_A), 3, 1, -E2[3,1]) %*% E2 

E3
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    1 -0.5   -1  0.5    0    0
## [2,]    0  4.0   -1  2.0    1    0
## [3,]    0 -4.0    4  2.0    0    1
E4 <- pivot_maker(nrow(my_A), 2, 2, 1/E3[2,2]) %*% E3

E4
##      [,1] [,2]  [,3] [,4] [,5] [,6]
## [1,]    1 -0.5 -1.00  0.5 0.00    0
## [2,]    0  1.0 -0.25  0.5 0.25    0
## [3,]    0 -4.0  4.00  2.0 0.00    1
E5 <- pivot_maker(nrow(my_A), 3, 2, -E4[3,2]) %*% E4

E5
##      [,1] [,2]  [,3] [,4] [,5] [,6]
## [1,]    1 -0.5 -1.00  0.5 0.00    0
## [2,]    0  1.0 -0.25  0.5 0.25    0
## [3,]    0  0.0  3.00  4.0 1.00    1
E6 <- pivot_maker(nrow(my_A), 3, 3, 1/E5[3,3]) %*% E5

E6
##      [,1] [,2]  [,3]     [,4]      [,5]      [,6]
## [1,]    1 -0.5 -1.00 0.500000 0.0000000 0.0000000
## [2,]    0  1.0 -0.25 0.500000 0.2500000 0.0000000
## [3,]    0  0.0  1.00 1.333333 0.3333333 0.3333333
L <- solve(E6[,4:6])
U <- E6[,1:3]

L%*%U
##      [,1] [,2] [,3]
## [1,]    2   -1   -2
## [2,]   -4    6    3
## [3,]   -4   -2    8