That \(A^TA \neq AA^T\) in general can be established with a single counterexample. Still, let's work through the generalized form to build intuition for why \(A^TA = AA^T\) holds only in special circumstances.
Let’s define matrix A and its transpose:
\(A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14}\\ a_{21} & a_{22} & a_{23} & a_{24}\\ a_{31} & a_{32} & a_{33} & a_{34} \end{bmatrix}\)
\(A^T = \begin{bmatrix} a_{11} & a_{21} & a_{31}\\ a_{12} & a_{22} & a_{32}\\ a_{13} & a_{23} & a_{33}\\ a_{14} & a_{24} & a_{34} \end{bmatrix}\)
Consider the general form of \(A^TA\):
\(\begin{bmatrix} a_{11} & a_{21} & a_{31}\\ a_{12} & a_{22} & a_{32}\\ a_{13} & a_{23} & a_{33}\\ a_{14} & a_{24} & a_{34} \end{bmatrix}\begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14}\\ a_{21} & a_{22} & a_{23} & a_{24}\\ a_{31} & a_{32} & a_{33} & a_{34} \end{bmatrix}\)
Carrying out the multiplication gives:
\(\begin{bmatrix} (a_{11} a_{11} + a_{21}a_{21} + a_{31} a_{31}) & (a_{11} a_{12} + a_{21}a_{22} + a_{31} a_{32}) & (a_{11} a_{13} + a_{21} a_{23} + a_{31} a_{33}) & (a_{11} a_{14} + a_{21} a_{24} + a_{31} a_{34}) \\ (a_{12} a_{11} + a_{22}a_{21} + a_{32} a_{31}) & (a_{12} a_{12} + a_{22}a_{22} + a_{32} a_{32}) & (a_{12} a_{13} + a_{22} a_{23} + a_{32} a_{33}) & (a_{12} a_{14} + a_{22} a_{24} + a_{32} a_{34})\\ (a_{13} a_{11} + a_{23}a_{21} + a_{33} a_{31}) & (a_{13} a_{12} + a_{23}a_{22} + a_{33} a_{32}) & (a_{13} a_{13} + a_{23} a_{23} + a_{33} a_{33}) & (a_{13} a_{14} + a_{23} a_{24} + a_{33} a_{34})\\ (a_{14} a_{11} + a_{24}a_{21} + a_{34} a_{31}) & (a_{14} a_{12} + a_{24}a_{22} + a_{34} a_{32}) & (a_{14} a_{13} + a_{24} a_{23} + a_{34} a_{33}) & (a_{14} a_{14} + a_{24} a_{24} + a_{34} a_{34}) \end{bmatrix}\)
Now let's look at \(AA^T\):
\(\begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14}\\ a_{21} & a_{22} & a_{23} & a_{24}\\ a_{31} & a_{32} & a_{33} & a_{34} \end{bmatrix} \begin{bmatrix} a_{11} & a_{21} & a_{31}\\ a_{12} & a_{22} & a_{32}\\ a_{13} & a_{23} & a_{33}\\ a_{14} & a_{24} & a_{34} \end{bmatrix}\)
which results in:
\(\begin{bmatrix} (a_{11} a_{11} + a_{12}a_{12} + a_{13} a_{13} + a_{14} a_{14}) & (a_{11} a_{21} + a_{12}a_{22} + a_{13} a_{23} + a_{14} a_{24}) & (a_{11} a_{31} + a_{12} a_{32} + a_{13} a_{33} + a_{14} a_{34}) \\ (a_{21} a_{11} + a_{22} a_{12} + a_{23} a_{13} + a_{24} a_{14}) & (a_{21} a_{21} + a_{22} a_{22} + a_{23} a_{23} + a_{24} a_{24}) & (a_{21} a_{31} + a_{22} a_{32} + a_{23} a_{33} + a_{24} a_{34}) \\ (a_{31} a_{11} + a_{32} a_{12} + a_{33} a_{13} + a_{34} a_{14}) & (a_{31} a_{21} + a_{32} a_{22} + a_{33} a_{23} + a_{34} a_{24}) & (a_{31} a_{31} + a_{32} a_{32} + a_{33} a_{33} + a_{34} a_{34}) \end{bmatrix}\)
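Before contrasting the two products, note a property they share: each is symmetric. This follows directly from the transpose rule \((BC)^T = C^TB^T\):
\((A^TA)^T = A^T(A^T)^T = A^TA \qquad\qquad (AA^T)^T = (A^T)^TA^T = AA^T\)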
Beyond that shared structure, however, the two results are not equivalent. In fact they cannot be equal here: \(A^TA\) is \(4 \times 4\) while \(AA^T\) is \(3 \times 3\), so whenever A is not square the two products do not even have the same dimensions. Let's make this concrete with a numeric example.
# Define matrix A
A <- matrix(1:12, nrow=3, ncol=4)
# Generate transpose of A
At <- t(A)
A
## [,1] [,2] [,3] [,4]
## [1,] 1 4 7 10
## [2,] 2 5 8 11
## [3,] 3 6 9 12
At
## [,1] [,2] [,3]
## [1,] 1 2 3
## [2,] 4 5 6
## [3,] 7 8 9
## [4,] 10 11 12
At %*% A
## [,1] [,2] [,3] [,4]
## [1,] 14 32 50 68
## [2,] 32 77 122 167
## [3,] 50 122 194 266
## [4,] 68 167 266 365
A %*% At
## [,1] [,2] [,3]
## [1,] 166 188 210
## [2,] 188 214 240
## [3,] 210 240 270
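Consistent with the transpose identity shown earlier, both products are symmetric even though they are unequal. A quick sanity check with base R's isSymmetric() confirms this:
# Both products are symmetric matrices, despite being unequal
isSymmetric(At %*% A)
## [1] TRUE
isSymmetric(A %*% At)
## [1] TRUE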
Since we know matrix multiplication is not commutative (Beezer, page 180), a simple way to guarantee the equality above is for A to equal \(A^T\). When this occurs, A is known as a symmetric matrix. Additionally, it is worth noting that since the transpose of an m × n matrix is an n × m matrix, a symmetric matrix must be square (Beezer, page 166). An example is demonstrated below.
# Define symmetric matrix A
A <- matrix(c(1, 1, -1, 1, 2, 0, -1, 0, 5), nrow=3, ncol=3)
# Generate transpose of A
At <- t(A)
A
## [,1] [,2] [,3]
## [1,] 1 1 -1
## [2,] 1 2 0
## [3,] -1 0 5
At
## [,1] [,2] [,3]
## [1,] 1 1 -1
## [2,] 1 2 0
## [3,] -1 0 5
A == At
## [,1] [,2] [,3]
## [1,] TRUE TRUE TRUE
## [2,] TRUE TRUE TRUE
## [3,] TRUE TRUE TRUE
res_1 <- At %*% A
res_2 <- A %*% At
res_1 == res_2
## [,1] [,2] [,3]
## [1,] TRUE TRUE TRUE
## [2,] TRUE TRUE TRUE
## [3,] TRUE TRUE TRUE
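One caveat on the comparison style: elementwise == works here because every entry is an exact integer, but matrices produced by floating-point arithmetic can differ by tiny rounding errors that == flags as FALSE. Base R's all.equal() compares with a numeric tolerance and is generally the safer check:
# Tolerance-based equality check, robust to floating-point rounding
isTRUE(all.equal(res_1, res_2))
## [1] TRUE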
Turning to LU factorization: a known limitation of the elimination algorithm implemented below is its behavior at a zero pivot. Because it performs no row exchanges, a zero (or very small) pivot makes the factorization break down or become numerically unstable; a demonstration follows at the end of this section.
# Define matrix A
A <- matrix(c(1, 2, 4, 1, 3, 6, 1, 5, 8), nrow=3, ncol=3)
A
## [,1] [,2] [,3]
## [1,] 1 1 1
## [2,] 2 3 5
## [3,] 4 6 8
# LU factorization via Gaussian elimination, without row pivoting
lu.decomposition.new <- function(A) {
  m <- nrow(A)
  L <- diag(m)  # unit lower-triangular factor
  U <- A        # reduced to upper-triangular form in place
  for (k in 1:(m - 1)) {        # pivot column
    for (j in (k + 1):m) {      # rows below the pivot
      # Multiplier that zeroes out U[j,k]; undefined if U[k,k] == 0
      L[j, k] <- U[j, k] / U[k, k]
      U[j, k:m] <- U[j, k:m] - L[j, k] * U[k, k:m]
    }
  }
  list(L = L, U = U)
}
# Find LU matrices
lu.decomposition.new(A)
## $L
## [,1] [,2] [,3]
## [1,] 1 0 0
## [2,] 2 1 0
## [3,] 4 2 1
##
## $U
## [,1] [,2] [,3]
## [1,] 1 1 1
## [2,] 0 1 3
## [3,] 0 0 -2
# Compare with lu.decomposition() from the R package matrixcalc
library(matrixcalc)
lu.decomposition(A)
## $L
## [,1] [,2] [,3]
## [1,] 1 0 0
## [2,] 2 1 0
## [3,] 4 2 1
##
## $U
## [,1] [,2] [,3]
## [1,] 1 1 1
## [2,] 0 1 3
## [3,] 0 0 -2
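To make the zero-pivot failure concrete, here is a small matrix chosen for illustration (it is nonsingular, so an LU factorization with row exchanges would exist) whose leading entry is 0. The very first elimination step divides by U[1,1] = 0, filling the factors with Inf and NaN values:
# Nonsingular matrix with a zero in the (1,1) pivot position
B <- matrix(c(0, 1, 1, 2, 1, 0, 1, 1, 1), nrow=3, ncol=3)
# First multiplier is B[2,1]/B[1,1] = 1/0, so L and U fill with Inf/NaN
lu.decomposition.new(B)
The standard remedy is partial pivoting: at each step, swap the row with the largest-magnitude candidate pivot into the pivot position and factor \(PA = LU\) instead, which is how production LU routines handle this case.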
Beezer, Robert A. A First Course in Linear Algebra, Subsection TSM: Transposes and Symmetric Matrices, page 166.
Beezer, Robert A. A First Course in Linear Algebra, Subsection MM: Matrix Multiplication, page 180.
Gaussian Elimination and LU Factorization.
lu.decomposition: LU Decomposition of Square Matrix (documentation for the R package matrixcalc).
LU Decomposition - Shortcut Method.