1.1:
Show \(A^{T}A \not=AA^{T}\):
:JB
Let’s assume it’s true for the moment: \(A^{T}A = AA^{T}\)
If we substitute \(B\) in for \(A^{T}\) in the above equation, i.e. let \(B=A^{T}\), we can see that it would result in a form that defies the basic law of matrices: \(BA \not=AB\). Therefore, \(A^{T}A \not=AA^{T}\).
Below, I’ll demonstrate in R to provide further evidence:
A <- matrix(c(2, -1, -2, -4, 6, 3, -4, -2, 8), nrow=3, byrow=TRUE )
At <- t(A) # to get inverse of A
(At %*% A) == (A %*% At)
## [,1] [,2] [,3]
## [1,] FALSE FALSE FALSE
## [2,] FALSE FALSE FALSE
## [3,] FALSE FALSE FALSE
1.2:
Under what conditions could it be true that: \(A^{-1} =A^{T}\)
:JB
The special type of matrices are orthogonal matrices. They are matrices that have the property:\(A^{-1} =A^{T}\).
Therefore, for orthogonal matrices: \(A^{T}A =A^{-1}A=I\) - which also shows that they must be invertable. For invertable matrices we know the following truth: \(A^{-1}A =AA^{-1}\) which allows us to show that for orthogonal matrices, \(A^{T}A =AA^{T}=I\).
2.0:
For LU decomposition, I’ve created a small function for creating a sort of pivot matrix that allows me to break down row operations:
pivot_maker <- function(n, i, j, element) {
p <- diag(n)
p[i,j] <- element
return(p)
}
For illustration, here’s an example matrix \(A\) that I’ll use to show it works:
(my_A <- matrix(c(2, -1, -2, -4, 6, 3, -4, -2, 8 ), nrow=3, byrow=TRUE ))
## [,1] [,2] [,3]
## [1,] 2 -1 -2
## [2,] -4 6 3
## [3,] -4 -2 8
If we take the dimensions of \(A\) and use it as input in the function, we can define our \(E_1\) - note that reciprocal of \(A_{11}\) at position \((1,1)\):
(E1 <- pivot_maker(nrow(my_A), 1, 1, 1/my_A[1,1]))
## [,1] [,2] [,3]
## [1,] 0.5 0 0
## [2,] 0.0 1 0
## [3,] 0.0 0 1
Left-hand multiplying: \(E_{1}A\) begins the row-reduction process and yields:
E1 %*% my_A
## [,1] [,2] [,3]
## [1,] 1 -0.5 -1
## [2,] -4 6.0 3
## [3,] -4 -2.0 8
My solution algorithm begins from top-left (position (1,1)) to bottom-right (position (n, n)). It is based on the pseudo-code found on this Georgia Institute of Technology pdf and this Illinois Institute of Technology pdf as well - both discuss Partial Pivoting and LU factorization.
Basically, If I set up \(A\) next to an identity matrix \(I\) to create an augmented matrix and complete the row-reduction process, the right-side matrix will capture the permutations as \(L^{-1}\) and the left-hand side will become \(U\) and provide the LU decomposition of \(A\).
# input matrix must be scquare!
my_lu_decomp <- function(matrix_a){
n <- nrow(matrix_a)
matrix_e <- cbind(matrix_a, diag(n)) # aug_matrix:
for(j in 1:n){
matrix_e <- pivot_maker(n, j, j, 1/matrix_e[j,j]) %*% matrix_e
if(j == n){ # end at diagnal
break()
}
for(i in (j+1):n){
matrix_e <- pivot_maker(n, i, j, -matrix_e[i,j]) %*% matrix_e
}
}
l<-solve(matrix_e[,4:6]) # inverse of e provides L
u<-matrix_e[,1:3]
lu_deco <- cbind(l,u)
return(lu_deco)
}
Using the following example \(A\), I’ll demonstrate the function at work:
my_A <- matrix(c(2, -1, -2, -4, 6, 3, -4, -2, 8 ), nrow=3, byrow=TRUE )
Note that the function outputs an n-by-2n augmented matrix that contains both the \(L\) and \(U\) that must be extracted:
full <- my_lu_decomp(my_A)
(L <- full[,1:3])
## [,1] [,2] [,3]
## [1,] 2 0 0
## [2,] -4 4 0
## [3,] -4 -4 3
(U <- full[,4:6])
## [,1] [,2] [,3]
## [1,] 1 -0.5 -1.00
## [2,] 0 1.0 -0.25
## [3,] 0 0.0 1.00
L%*%U
## [,1] [,2] [,3]
## [1,] 2 -1 -2
## [2,] -4 6 3
## [3,] -4 -2 8
L%*%U == my_A
## [,1] [,2] [,3]
## [1,] TRUE TRUE TRUE
## [2,] TRUE TRUE TRUE
## [3,] TRUE TRUE TRUE
Notes: Although inefficient and limited, this algorithm is pretty compact for handling so many calculations. It will not work if we have a zero in position 1,1 and it will also fail or become unstable for small numbers.
Here is a step-by-step example to see the two types of row operations at work:
my_A<-(cbind(my_A, diag(nrow(my_A))))
# divide through by position 1,1
E1 <- pivot_maker(nrow(my_A), 1, 1, 1/my_A[1,1]) %*% my_A
E1
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 1 -0.5 -1 0.5 0 0
## [2,] -4 6.0 3 0.0 1 0
## [3,] -4 -2.0 8 0.0 0 1
E2 <- pivot_maker(nrow(my_A), 2, 1, -E1[2,1]) %*% E1
E2
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 1 -0.5 -1 0.5 0 0
## [2,] 0 4.0 -1 2.0 1 0
## [3,] -4 -2.0 8 0.0 0 1
E3 <- pivot_maker(nrow(my_A), 3, 1, -E2[3,1]) %*% E2
E3
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 1 -0.5 -1 0.5 0 0
## [2,] 0 4.0 -1 2.0 1 0
## [3,] 0 -4.0 4 2.0 0 1
E4 <- pivot_maker(nrow(my_A), 2, 2, 1/E3[2,2]) %*% E3
E4
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 1 -0.5 -1.00 0.5 0.00 0
## [2,] 0 1.0 -0.25 0.5 0.25 0
## [3,] 0 -4.0 4.00 2.0 0.00 1
E5 <- pivot_maker(nrow(my_A), 3, 2, -E4[3,2]) %*% E4
E5
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 1 -0.5 -1.00 0.5 0.00 0
## [2,] 0 1.0 -0.25 0.5 0.25 0
## [3,] 0 0.0 3.00 4.0 1.00 1
E6 <- pivot_maker(nrow(my_A), 3, 3, 1/E5[3,3]) %*% E5
E6
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 1 -0.5 -1.00 0.500000 0.0000000 0.0000000
## [2,] 0 1.0 -0.25 0.500000 0.2500000 0.0000000
## [3,] 0 0.0 1.00 1.333333 0.3333333 0.3333333
L <- solve(E6[,4:6])
U <- E6[,1:3]
L%*%U
## [,1] [,2] [,3]
## [1,] 2 -1 -2
## [2,] -4 6 3
## [3,] -4 -2 8