Loading [MathJax]/jax/output/HTML-CSS/jax.js

Before We Begin:

Below, I define a function that gets the dimensions of a matrix, whether it is square, whether it is symmetric, and stores those properties, as well as a copy of the matrix, its transpose, and the dimensions of its transpose all in a list. It will be useful in answering the questions throughout.

get_properties <- function(A){
    m = nrow(A)
    n = ncol(A)
    if (m != n){
        square = FALSE
    }
    else{
        square = TRUE
    }
    equality <- c()
    if (square){
        for (i in 1:m){
            for (j in 1:n){
                if (A[i, j] == A[j, i]){
                    append(equality, TRUE)
                }else{
                    append(equality, FALSE)
                }
            }
        }
        symmetric <- isTRUE(all(equality == TRUE))
    }else{
        symmetric <- FALSE
    }
    t_A = t(A)
    t_A_m = nrow(t_A)
    t_A_n = ncol(t_A)
    properties <- list("A" = A, "A_rows" = m, "A_cols" = n, "square" =
                       square, "symmetric" = symmetric, "AT" = t_A,
                       "AT_rows" = t_A_m, "AT_cols" = t_A_n)
}

I also adapted the below function for formatting a matrix in LaTeX from here: https://www.r-bloggers.com/2020/08/matrix-to-latex/.

to_LaTeX <- function(A){
    rows <- apply(A, MARGIN=1, paste, collapse = " & ")
    matrix_string <- paste(rows, collapse = " \\\\ ")
    return(paste("\\begin{bmatrix}", matrix_string, "\\end{bmatrix}"))
}

Problem Set 1:

Question 1:

Show that ATAAAT in general. (Proof and demonstration.)

Answer 1:

If A is a matrix with m rows and n columns, then AT is a matrix with n rows and m columns.

If mn, both A and AT are rectangular matrices, and their products are square matrices. ATA is an nxn matrix, and AAT is an mxm matrix. Since mn, ATAAAT because matrices with different dimensions are never equal.

In the below example, I demonstrate:

  • that the transpose AT of a 3x5 matrix A is indeed 5x3
  • that the product ATA is indeed 5x5
  • that the product AAT is indeed 3x3
set.seed(1)
m <- 3
n <- 5
s <- sample(1:5, m*n, replace = TRUE)
A <- matrix(s, m, n)
properties <- get_properties(A)
print_A <- to_LaTeX(properties$A)
print_AT <- to_LaTeX(properties$AT)
product1 <- t(A) %*% A
print_ATA <- to_LaTeX(product1)
product2 <- A %*% t(A)
print_AAT <- to_LaTeX(product2)

A=[122124535213351],AT=[141253233155221]

ATA=[1825172611253828421717282232132642325117111713179][142920297955205545]=AAT

Therefore, ATAAAT when mn. The products will always have different dimensions.

Question 2:

For a special type of square matrix A, ATA=AAT. Under what conditions could this be true? (Hint: The identity matrix I is an example of such a matrix.)

Answer 2:

The square matrix A must be symmetric in order for ATA=AAT to be true. Symmetry is defined as A=AT, meaning a square matrix must be equal to its transpose in order for it to be symmetric.

Each [row i, column j] entry of a symmetric matrix is symmetric in respect to the main diagonal. So if A is symmetric, then Ai,j=Aj,i.

The identity matrix I is square, and it is symmetric because I=IT. Symmetry is also a property of all square diagonal matrices, not just I.

Below, I confirm that I is symmetric and that ITI=IIT.

I <- diag(5)
properties <- get_properties(I)
print_I <- to_LaTeX(properties$A)
print_IT <- to_LaTeX(properties$AT)
product1 <- t(I) %*% I
print_ITI <- to_LaTeX(product1)
product2 <- I %*% t(I)
print_IIT <- to_LaTeX(product2)

I=[1000001000001000001000001]=[1000001000001000001000001]=IT

ITI=[1000001000001000001000001]=[1000001000001000001000001]=IIT

I also confirm this for a square scalar matrix B, which is simply a multiple of I.

B <- I * 3
properties <- get_properties(B)
print_B <- to_LaTeX(properties$A)
print_BT <- to_LaTeX(properties$AT)
product1 <- t(B) %*% B
print_BTB <- to_LaTeX(product1)
product2 <- B %*% t(B)
print_BBT <- to_LaTeX(product2)

B=[3000003000003000003000003]=[3000003000003000003000003]=BT

BTB=[9000009000009000009000009]=[9000009000009000009000009]=BBT

I also confirm this for a square diagonal matrix C.

C <- matrix(0, 3, 3)
diag(C) <- c(3, 2, 1)
properties <- get_properties(C)
print_C <- to_LaTeX(properties$A)
print_CT <- to_LaTeX(properties$AT)
product1 <- t(C) %*% C
print_CTC <- to_LaTeX(product1)
product2 <- C %*% t(C)
print_CCT <- to_LaTeX(product2)

C=[300020001]=[300020001]=CT

CTC=[900040001]=[900040001]=CCT

Lastly, I confirm this for a symmetric non-diagonal matrix D.

D <- rbind(c(1, 2, -1, 5),
            c(2, 1, 3, 0),
            c(-1, 3, 0, 4),
            c(5, 0, 4, 2))
properties <- get_properties(D)
print_D <- to_LaTeX(properties$A)
print_DT <- to_LaTeX(properties$AT)
product1 <- t(D) %*% D
print_DTD <- to_LaTeX(product1)
product2 <- D %*% t(D)
print_DDT <- to_LaTeX(product2)

D=[1215213013045042]=[1215213013045042]=DT

DTD=[31125111141222512631122345]=[31125111141222512631122345]=DDT

Problem Set 2:

Question 1:

Write an R function to factorize a square matrix A into LU or LDU, whichever you prefer. You don’t have to worry about permuting rows of A, and you can assume that A is less than 5x5 if you need to hard-code any variables in your code.

Answer 1:

First, I define a function below to move any all-zero rows of a matrix to the bottom of that matrix so Gauss-Jordan elimination can then be performed to get the matrix into reduced row-echelon form. (I originally thought I needed reduced row-echelon form in order to get the Upper triangle, but I was mistaken. I’m leaving the code in since it’s the basis for more of my work below.)

move_all_zero_rows <- function(A){
    properties <- get_properties(A)
    copy <- properties$A
    m <- properties$A_rows
    n <- properties$A_cols
    zero_row_count <- 0
    checking <- TRUE
    while (checking){
        row_sums <- c()
        for (r in 1:(m - 1)){
            s <- sum(copy[r, ])
            if (s == 0){
                copy <- copy[-r, ]
                zero_row_count <- zero_row_count + 1
                break
            }else{
                next
            }
            row_sums <- append(row_sums, s)
        }
        if (all(row_sums > 0)){
            checking <- FALSE
        }
    }
    if (zero_row_count > 0){
        for (z in 1:zero_row_count){
            copy <- rbind(copy, rep(0, n))
        }
    }
    result <- list("copy" = copy, "zero_row_count" = zero_row_count)
}

Below is the Gauss-Jordan elimination function I don’t really need anymore for this particular assignment. I started with a function based on the pseudo-code from pg. 31 of A First Course in Linear Algebra. However, my iterators were never right. So I looked for an alternative basis for developing the function and found one here: https://gist.github.com/ZPears/0583aae73aa06d8abd9e. The below function can safely be called a mishmash of my code, the book’s pseudo-code, and the code found at that link.

perform_gauss_jordan_elimination <- function(A){
    properties <- get_properties(A)
    copy <- properties$A
    m <- properties$A_rows
    n <- properties$A_cols
    result <- move_all_zero_rows(copy)
    copy <- result$copy
    col <- 1
    row <- 0
    while (col < (n + 1) & (row + 1) <= m){
        if (sum(copy[(row + 1):m, col]) == 0){
            col <- col + 1
        }else{
            ind <- 0
            for (i in (row + 1):m){
                if (copy[i, col] != 0){
                    ind <- i
                    break
                }
            }
            row <- row + 1
            #swap row_i and row_r
            row_i <- copy[ind, ]
            row_r <- copy[row, ]
            copy[ind, ] <- row_r
            copy[row, ] <- row_i
            #make leading 1
            scalar <- 1 / copy[row, col]
            copy[row, ] <- copy[row, ] * scalar
            #make 0 entries
            for (k in 1:m){
                if (copy[k, col] != 0 & k != row){
                    scalar <- copy[k, col] / copy[row, col] 
                    copy[k, ] <- -1 * scalar * copy[row, ] + copy[k, ]
                }
            }
        col <- col + 1
        }
    }
    copy
}

E <- rbind(c(1, 2, 3),
           c(4, 5, 6),
           c(7, 8, 9))
print_E <- to_LaTeX(E)
E_RREF <- perform_gauss_jordan_elimination(E)
print_E_RREF <- to_LaTeX(E_RREF)

I include an example using the matrix E to confirm the Gauss-Jordan elimination function does correctly produce the expected reduced row-echelon form matrix ERREF.

E=[123456789],ERREF=[101012000]

Since I finally realized I actually need row-echelon form rather than reduced row-echelon form to get the Upper and Lower triangle decomposition, I define another function below that returns the LU triangle after performing Gaussian (rather than Gauss-Jordan) elimination.

decompose_LU <- function(A){
    properties <- get_properties(A)
    copy <- properties$A
    m <- properties$A_rows
    n <- properties$A_cols
    result <- move_all_zero_rows(copy)
    copy <- result$copy
    col <- 1
    row <- 0
    L <- matrix(0, nrow = m, ncol = n)
    diag(L) <- 1
    while (col < (n + 1) & (row + 1) <= m){
        if (sum(copy[(row + 1):m, col]) == 0){
            col <- col + 1
        }else{
            ind <- 0
            for (i in (row + 1):m){
                if (copy[i, col] != 0){
                    ind <- i
                    break
                }
            }
            row <- row + 1
            #make 0 entries
            for (k in (row + 1):m){
                if (copy[k, col] != 0){
                    scalar <- copy[k, col] / copy[row, col]
                    copy[k, ] <- -1 * scalar * copy[row, ] + copy[k, ]
                    L_entry <- list("row" = k, "col" = col, "val" =
                                    scalar)
                    L[L_entry$row, L_entry$col] <- L_entry$val
                }
            }
        col <- col + 1
        }
    }
    U <- copy
    LU <- list("L" = L, "U" = U)
    LU
}

triangles <- decompose_LU(E)
L <- triangles$L
print_L <- to_LaTeX(L)
U <- triangles$U
print_U <- to_LaTeX(U)

The matrix E has now been decomposed successfully into LU:

E=[123456789]=[100410721][123036000]=LU