Below, I define a function that gets the dimensions of a matrix, whether it is square, whether it is symmetric, and stores those properties, as well as a copy of the matrix, its transpose, and the dimensions of its transpose all in a list. It will be useful in answering the questions throughout.
get_properties <- function(A){
m = nrow(A)
n = ncol(A)
if (m != n){
square = FALSE
}
else{
square = TRUE
}
equality <- c()
if (square){
for (i in 1:m){
for (j in 1:n){
if (A[i, j] == A[j, i]){
append(equality, TRUE)
}else{
append(equality, FALSE)
}
}
}
symmetric <- isTRUE(all(equality == TRUE))
}else{
symmetric <- FALSE
}
t_A = t(A)
t_A_m = nrow(t_A)
t_A_n = ncol(t_A)
properties <- list("A" = A, "A_rows" = m, "A_cols" = n, "square" =
square, "symmetric" = symmetric, "AT" = t_A,
"AT_rows" = t_A_m, "AT_cols" = t_A_n)
}
I also adapted the below function for formatting a matrix in LaTeX from here: https://www.r-bloggers.com/2020/08/matrix-to-latex/.
to_LaTeX <- function(A){
rows <- apply(A, MARGIN=1, paste, collapse = " & ")
matrix_string <- paste(rows, collapse = " \\\\ ")
return(paste("\\begin{bmatrix}", matrix_string, "\\end{bmatrix}"))
}
Show that ATA≠AAT in general. (Proof and demonstration.)
If A is a matrix with m rows and n columns, then AT is a matrix with n rows and m columns.
If m≠n, both A and AT are rectangular matrices, and their products are square matrices. ATA is an nxn matrix, and AAT is an mxm matrix. Since m≠n, ATA≠AAT because matrices with different dimensions are never equal.
In the below example, I demonstrate:
set.seed(1)
m <- 3
n <- 5
s <- sample(1:5, m*n, replace = TRUE)
A <- matrix(s, m, n)
properties <- get_properties(A)
print_A <- to_LaTeX(properties$A)
print_AT <- to_LaTeX(properties$AT)
product1 <- t(A) %*% A
print_ATA <- to_LaTeX(product1)
product2 <- A %*% t(A)
print_AAT <- to_LaTeX(product2)
A=[122124535213351],AT=[141253233155221]
ATA=[1825172611253828421717282232132642325117111713179]≠[142920297955205545]=AAT
Therefore, ATA≠AAT when m≠n. The products will always have different dimensions.
For a special type of square matrix A, ATA=AAT. Under what conditions could this be true? (Hint: The identity matrix I is an example of such a matrix.)
The square matrix A must be symmetric in order for ATA=AAT to be true. Symmetry is defined as A=AT, meaning a square matrix must be equal to its transpose in order for it to be symmetric.
Each [row i, column j] entry of a symmetric matrix is symmetric in respect to the main diagonal. So if A is symmetric, then Ai,j=Aj,i.
The identity matrix I is square, and it is symmetric because I=IT. Symmetry is also a property of all square diagonal matrices, not just I.
Below, I confirm that I is symmetric and that ITI=IIT.
I <- diag(5)
properties <- get_properties(I)
print_I <- to_LaTeX(properties$A)
print_IT <- to_LaTeX(properties$AT)
product1 <- t(I) %*% I
print_ITI <- to_LaTeX(product1)
product2 <- I %*% t(I)
print_IIT <- to_LaTeX(product2)
I=[1000001000001000001000001]=[1000001000001000001000001]=IT
ITI=[1000001000001000001000001]=[1000001000001000001000001]=IIT
I also confirm this for a square scalar matrix B, which is simply a multiple of I.
B <- I * 3
properties <- get_properties(B)
print_B <- to_LaTeX(properties$A)
print_BT <- to_LaTeX(properties$AT)
product1 <- t(B) %*% B
print_BTB <- to_LaTeX(product1)
product2 <- B %*% t(B)
print_BBT <- to_LaTeX(product2)
B=[3000003000003000003000003]=[3000003000003000003000003]=BT
BTB=[9000009000009000009000009]=[9000009000009000009000009]=BBT
I also confirm this for a square diagonal matrix C.
C <- matrix(0, 3, 3)
diag(C) <- c(3, 2, 1)
properties <- get_properties(C)
print_C <- to_LaTeX(properties$A)
print_CT <- to_LaTeX(properties$AT)
product1 <- t(C) %*% C
print_CTC <- to_LaTeX(product1)
product2 <- C %*% t(C)
print_CCT <- to_LaTeX(product2)
C=[300020001]=[300020001]=CT
CTC=[900040001]=[900040001]=CCT
Lastly, I confirm this for a symmetric non-diagonal matrix D.
D <- rbind(c(1, 2, -1, 5),
c(2, 1, 3, 0),
c(-1, 3, 0, 4),
c(5, 0, 4, 2))
properties <- get_properties(D)
print_D <- to_LaTeX(properties$A)
print_DT <- to_LaTeX(properties$AT)
product1 <- t(D) %*% D
print_DTD <- to_LaTeX(product1)
product2 <- D %*% t(D)
print_DDT <- to_LaTeX(product2)
D=[12−152130−13045042]=[12−152130−13045042]=DT
DTD=[31125111141222512631122345]=[31125111141222512631122345]=DDT
Write an R function to factorize a square matrix A into LU or LDU, whichever you prefer. You don’t have to worry about permuting rows of A, and you can assume that A is less than 5x5 if you need to hard-code any variables in your code.
First, I define a function below to move any all-zero rows of a matrix to the bottom of that matrix so Gauss-Jordan elimination can then be performed to get the matrix into reduced row-echelon form. (I originally thought I needed reduced row-echelon form in order to get the Upper triangle, but I was mistaken. I’m leaving the code in since it’s the basis for more of my work below.)
move_all_zero_rows <- function(A){
properties <- get_properties(A)
copy <- properties$A
m <- properties$A_rows
n <- properties$A_cols
zero_row_count <- 0
checking <- TRUE
while (checking){
row_sums <- c()
for (r in 1:(m - 1)){
s <- sum(copy[r, ])
if (s == 0){
copy <- copy[-r, ]
zero_row_count <- zero_row_count + 1
break
}else{
next
}
row_sums <- append(row_sums, s)
}
if (all(row_sums > 0)){
checking <- FALSE
}
}
if (zero_row_count > 0){
for (z in 1:zero_row_count){
copy <- rbind(copy, rep(0, n))
}
}
result <- list("copy" = copy, "zero_row_count" = zero_row_count)
}
Below is the Gauss-Jordan elimination function I don’t really need anymore for this particular assignment. I started with a function based on the pseudo-code from pg. 31 of A First Course in Linear Algebra. However, my iterators were never right. So I looked for an alternative basis for developing the function and found one here: https://gist.github.com/ZPears/0583aae73aa06d8abd9e. The below function can safely be called a mishmash of my code, the book’s pseudo-code, and the code found at that link.
perform_gauss_jordan_elimination <- function(A){
properties <- get_properties(A)
copy <- properties$A
m <- properties$A_rows
n <- properties$A_cols
result <- move_all_zero_rows(copy)
copy <- result$copy
col <- 1
row <- 0
while (col < (n + 1) & (row + 1) <= m){
if (sum(copy[(row + 1):m, col]) == 0){
col <- col + 1
}else{
ind <- 0
for (i in (row + 1):m){
if (copy[i, col] != 0){
ind <- i
break
}
}
row <- row + 1
#swap row_i and row_r
row_i <- copy[ind, ]
row_r <- copy[row, ]
copy[ind, ] <- row_r
copy[row, ] <- row_i
#make leading 1
scalar <- 1 / copy[row, col]
copy[row, ] <- copy[row, ] * scalar
#make 0 entries
for (k in 1:m){
if (copy[k, col] != 0 & k != row){
scalar <- copy[k, col] / copy[row, col]
copy[k, ] <- -1 * scalar * copy[row, ] + copy[k, ]
}
}
col <- col + 1
}
}
copy
}
E <- rbind(c(1, 2, 3),
c(4, 5, 6),
c(7, 8, 9))
print_E <- to_LaTeX(E)
E_RREF <- perform_gauss_jordan_elimination(E)
print_E_RREF <- to_LaTeX(E_RREF)
I include an example using the matrix E to confirm the Gauss-Jordan elimination function does correctly produce the expected reduced row-echelon form matrix ERREF.
E=[123456789],ERREF=[10−1012000]
Since I finally realized I actually need row-echelon form rather than reduced row-echelon form to get the Upper and Lower triangle decomposition, I define another function below that returns the LU triangle after performing Gaussian (rather than Gauss-Jordan) elimination.
decompose_LU <- function(A){
properties <- get_properties(A)
copy <- properties$A
m <- properties$A_rows
n <- properties$A_cols
result <- move_all_zero_rows(copy)
copy <- result$copy
col <- 1
row <- 0
L <- matrix(0, nrow = m, ncol = n)
diag(L) <- 1
while (col < (n + 1) & (row + 1) <= m){
if (sum(copy[(row + 1):m, col]) == 0){
col <- col + 1
}else{
ind <- 0
for (i in (row + 1):m){
if (copy[i, col] != 0){
ind <- i
break
}
}
row <- row + 1
#make 0 entries
for (k in (row + 1):m){
if (copy[k, col] != 0){
scalar <- copy[k, col] / copy[row, col]
copy[k, ] <- -1 * scalar * copy[row, ] + copy[k, ]
L_entry <- list("row" = k, "col" = col, "val" =
scalar)
L[L_entry$row, L_entry$col] <- L_entry$val
}
}
col <- col + 1
}
}
U <- copy
LU <- list("L" = L, "U" = U)
LU
}
triangles <- decompose_LU(E)
L <- triangles$L
print_L <- to_LaTeX(L)
U <- triangles$U
print_U <- to_LaTeX(U)
The matrix E has now been decomposed successfully into LU:
E=[123456789]=[100410721][1230−3−6000]=LU