Before We Begin:

Below, I define a function that gets the dimensions of a matrix, whether it is square, whether it is symmetric, and stores those properties, as well as a copy of the matrix, its transpose, and the dimensions of its transpose all in a list. It will be useful in answering the questions throughout.

get_properties <- function(A){
    m = nrow(A)
    n = ncol(A)
    if (m != n){
        square = FALSE
    }
    else{
        square = TRUE
    }
    equality <- c()
    if (square){
        for (i in 1:m){
            for (j in 1:n){
                if (A[i, j] == A[j, i]){
                    append(equality, TRUE)
                }else{
                    append(equality, FALSE)
                }
            }
        }
        symmetric <- isTRUE(all(equality == TRUE))
    }else{
        symmetric <- FALSE
    }
    t_A = t(A)
    t_A_m = nrow(t_A)
    t_A_n = ncol(t_A)
    properties <- list("A" = A, "A_rows" = m, "A_cols" = n, "square" =
                       square, "symmetric" = symmetric, "AT" = t_A,
                       "AT_rows" = t_A_m, "AT_cols" = t_A_n)
}

I also adapted the below function for formatting a matrix in LaTeX from here: https://www.r-bloggers.com/2020/08/matrix-to-latex/.

to_LaTeX <- function(A){
    rows <- apply(A, MARGIN=1, paste, collapse = " & ")
    matrix_string <- paste(rows, collapse = " \\\\ ")
    return(paste("\\begin{bmatrix}", matrix_string, "\\end{bmatrix}"))
}

Problem Set 1:

Question 1:

Show that $A^TA \neq AA^T$ in general. (Proof and demonstration.)

Answer 1:

If $A$ is a matrix with $m$ rows and $n$ columns, then $A^T$ is a matrix with $n$ rows and $m$ columns.

If $m \neq n$ , both $A$ and $A^T$ are rectangular matrices, and their products are square matrices. $A^TA$ is an $nxn$ matrix, and $AA^T$ is an $mxm$ matrix. Since $m \neq n$ , $A^TA \neq AA^T$ because matrices with different dimensions are never equal.

In the below example, I demonstrate:

that the transpose $A^T$ of a 3x5 matrix $A$ is indeed 5x3
that the product $A^TA$ is indeed 5x5
that the product $AA^T$ is indeed 3x3

set.seed(1)
m <- 3
n <- 5
s <- sample(1:5, m*n, replace = TRUE)
A <- matrix(s, m, n)
properties <- get_properties(A)
print_A <- to_LaTeX(properties$A)
print_AT <- to_LaTeX(properties$AT)
product1 <- t(A) %*% A
print_ATA <- to_LaTeX(product1)
product2 <- A %*% t(A)
print_AAT <- to_LaTeX(product2)

$A = \begin{bmatrix} 1 & 2 & 2 & 1 & 2 \\ 4 & 5 & 3 & 5 & 2 \\ 1 & 3 & 3 & 5 & 1 \end{bmatrix}, A^T = \begin{bmatrix} 1 & 4 & 1 \\ 2 & 5 & 3 \\ 2 & 3 & 3 \\ 1 & 5 & 5 \\ 2 & 2 & 1 \end{bmatrix}$

$A^TA = \begin{bmatrix} 18 & 25 & 17 & 26 & 11 \\ 25 & 38 & 28 & 42 & 17 \\ 17 & 28 & 22 & 32 & 13 \\ 26 & 42 & 32 & 51 & 17 \\ 11 & 17 & 13 & 17 & 9 \end{bmatrix} \neq \begin{bmatrix} 14 & 29 & 20 \\ 29 & 79 & 55 \\ 20 & 55 & 45 \end{bmatrix} = AA^T$

Therefore, $A^TA \neq AA^T$ when $m \neq n$ . The products will always have different dimensions.

Question 2:

For a special type of square matrix $A$ , $A^TA = AA^T$ . Under what conditions could this be true? (Hint: The identity matrix $I$ is an example of such a matrix.)

Answer 2:

The square matrix $A$ must be symmetric in order for $A^TA = AA^T$ to be true. Symmetry is defined as $A = A^T$ , meaning a square matrix must be equal to its transpose in order for it to be symmetric.

Each [row $i$ , column $j$ ] entry of a symmetric matrix is symmetric in respect to the main diagonal. So if $A$ is symmetric, then $A_{i,j} = A_{j,i}$ .

The identity matrix $I$ is square, and it is symmetric because $I = I^T$ . Symmetry is also a property of all square diagonal matrices, not just $I$ .

Below, I confirm that $I$ is symmetric and that $I^TI = II^T$ .

I <- diag(5)
properties <- get_properties(I)
print_I <- to_LaTeX(properties$A)
print_IT <- to_LaTeX(properties$AT)
product1 <- t(I) %*% I
print_ITI <- to_LaTeX(product1)
product2 <- I %*% t(I)
print_IIT <- to_LaTeX(product2)

$I = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} = I^T$

$I^TI = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} = II^T$

I also confirm this for a square scalar matrix $B$ , which is simply a multiple of $I$ .

B <- I * 3
properties <- get_properties(B)
print_B <- to_LaTeX(properties$A)
print_BT <- to_LaTeX(properties$AT)
product1 <- t(B) %*% B
print_BTB <- to_LaTeX(product1)
product2 <- B %*% t(B)
print_BBT <- to_LaTeX(product2)

$B = \begin{bmatrix} 3 & 0 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 & 0 \\ 0 & 0 & 3 & 0 & 0 \\ 0 & 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 0 & 3 \end{bmatrix} = \begin{bmatrix} 3 & 0 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 & 0 \\ 0 & 0 & 3 & 0 & 0 \\ 0 & 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 0 & 3 \end{bmatrix} = B^T$

$B^TB = \begin{bmatrix} 9 & 0 & 0 & 0 & 0 \\ 0 & 9 & 0 & 0 & 0 \\ 0 & 0 & 9 & 0 & 0 \\ 0 & 0 & 0 & 9 & 0 \\ 0 & 0 & 0 & 0 & 9 \end{bmatrix} = \begin{bmatrix} 9 & 0 & 0 & 0 & 0 \\ 0 & 9 & 0 & 0 & 0 \\ 0 & 0 & 9 & 0 & 0 \\ 0 & 0 & 0 & 9 & 0 \\ 0 & 0 & 0 & 0 & 9 \end{bmatrix} = BB^T$

I also confirm this for a square diagonal matrix $C$ .

C <- matrix(0, 3, 3)
diag(C) <- c(3, 2, 1)
properties <- get_properties(C)
print_C <- to_LaTeX(properties$A)
print_CT <- to_LaTeX(properties$AT)
product1 <- t(C) %*% C
print_CTC <- to_LaTeX(product1)
product2 <- C %*% t(C)
print_CCT <- to_LaTeX(product2)

$C = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix} = C^T$

$C^TC = \begin{bmatrix} 9 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 9 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 1 \end{bmatrix} = CC^T$

Lastly, I confirm this for a symmetric non-diagonal matrix $D$ .

D <- rbind(c(1, 2, -1, 5),
            c(2, 1, 3, 0),
            c(-1, 3, 0, 4),
            c(5, 0, 4, 2))
properties <- get_properties(D)
print_D <- to_LaTeX(properties$A)
print_DT <- to_LaTeX(properties$AT)
product1 <- t(D) %*% D
print_DTD <- to_LaTeX(product1)
product2 <- D %*% t(D)
print_DDT <- to_LaTeX(product2)

$D = \begin{bmatrix} 1 & 2 & -1 & 5 \\ 2 & 1 & 3 & 0 \\ -1 & 3 & 0 & 4 \\ 5 & 0 & 4 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 2 & -1 & 5 \\ 2 & 1 & 3 & 0 \\ -1 & 3 & 0 & 4 \\ 5 & 0 & 4 & 2 \end{bmatrix} = D^T$

$D^TD = \begin{bmatrix} 31 & 1 & 25 & 11 \\ 1 & 14 & 1 & 22 \\ 25 & 1 & 26 & 3 \\ 11 & 22 & 3 & 45 \end{bmatrix} = \begin{bmatrix} 31 & 1 & 25 & 11 \\ 1 & 14 & 1 & 22 \\ 25 & 1 & 26 & 3 \\ 11 & 22 & 3 & 45 \end{bmatrix} = DD^T$

Problem Set 2:

Question 1:

Write an R function to factorize a square matrix $A$ into $LU$ or $LDU$ , whichever you prefer. You don’t have to worry about permuting rows of $A$ , and you can assume that $A$ is less than 5x5 if you need to hard-code any variables in your code.

Answer 1:

First, I define a function below to move any all-zero rows of a matrix to the bottom of that matrix so Gauss-Jordan elimination can then be performed to get the matrix into reduced row-echelon form. (I originally thought I needed reduced row-echelon form in order to get the Upper triangle, but I was mistaken. I’m leaving the code in since it’s the basis for more of my work below.)

move_all_zero_rows <- function(A){
    properties <- get_properties(A)
    copy <- properties$A
    m <- properties$A_rows
    n <- properties$A_cols
    zero_row_count <- 0
    checking <- TRUE
    while (checking){
        row_sums <- c()
        for (r in 1:(m - 1)){
            s <- sum(copy[r, ])
            if (s == 0){
                copy <- copy[-r, ]
                zero_row_count <- zero_row_count + 1
                break
            }else{
                next
            }
            row_sums <- append(row_sums, s)
        }
        if (all(row_sums > 0)){
            checking <- FALSE
        }
    }
    if (zero_row_count > 0){
        for (z in 1:zero_row_count){
            copy <- rbind(copy, rep(0, n))
        }
    }
    result <- list("copy" = copy, "zero_row_count" = zero_row_count)
}

Below is the Gauss-Jordan elimination function I don’t really need anymore for this particular assignment. I started with a function based on the pseudo-code from pg. 31 of A First Course in Linear Algebra. However, my iterators were never right. So I looked for an alternative basis for developing the function and found one here: https://gist.github.com/ZPears/0583aae73aa06d8abd9e. The below function can safely be called a mishmash of my code, the book’s pseudo-code, and the code found at that link.

perform_gauss_jordan_elimination <- function(A){
    properties <- get_properties(A)
    copy <- properties$A
    m <- properties$A_rows
    n <- properties$A_cols
    result <- move_all_zero_rows(copy)
    copy <- result$copy
    col <- 1
    row <- 0
    while (col < (n + 1) & (row + 1) <= m){
        if (sum(copy[(row + 1):m, col]) == 0){
            col <- col + 1
        }else{
            ind <- 0
            for (i in (row + 1):m){
                if (copy[i, col] != 0){
                    ind <- i
                    break
                }
            }
            row <- row + 1
            #swap row_i and row_r
            row_i <- copy[ind, ]
            row_r <- copy[row, ]
            copy[ind, ] <- row_r
            copy[row, ] <- row_i
            #make leading 1
            scalar <- 1 / copy[row, col]
            copy[row, ] <- copy[row, ] * scalar
            #make 0 entries
            for (k in 1:m){
                if (copy[k, col] != 0 & k != row){
                    scalar <- copy[k, col] / copy[row, col] 
                    copy[k, ] <- -1 * scalar * copy[row, ] + copy[k, ]
                }
            }
        col <- col + 1
        }
    }
    copy
}

E <- rbind(c(1, 2, 3),
           c(4, 5, 6),
           c(7, 8, 9))
print_E <- to_LaTeX(E)
E_RREF <- perform_gauss_jordan_elimination(E)
print_E_RREF <- to_LaTeX(E_RREF)

I include an example using the matrix $E$ to confirm the Gauss-Jordan elimination function does correctly produce the expected reduced row-echelon form matrix $ERREF$ .

$E = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}, ERREF = \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix}$

Since I finally realized I actually need row-echelon form rather than reduced row-echelon form to get the Upper and Lower triangle decomposition, I define another function below that returns the LU triangle after performing Gaussian (rather than Gauss-Jordan) elimination.

decompose_LU <- function(A){
    properties <- get_properties(A)
    copy <- properties$A
    m <- properties$A_rows
    n <- properties$A_cols
    result <- move_all_zero_rows(copy)
    copy <- result$copy
    col <- 1
    row <- 0
    L <- matrix(0, nrow = m, ncol = n)
    diag(L) <- 1
    while (col < (n + 1) & (row + 1) <= m){
        if (sum(copy[(row + 1):m, col]) == 0){
            col <- col + 1
        }else{
            ind <- 0
            for (i in (row + 1):m){
                if (copy[i, col] != 0){
                    ind <- i
                    break
                }
            }
            row <- row + 1
            #make 0 entries
            for (k in (row + 1):m){
                if (copy[k, col] != 0){
                    scalar <- copy[k, col] / copy[row, col]
                    copy[k, ] <- -1 * scalar * copy[row, ] + copy[k, ]
                    L_entry <- list("row" = k, "col" = col, "val" =
                                    scalar)
                    L[L_entry$row, L_entry$col] <- L_entry$val
                }
            }
        col <- col + 1
        }
    }
    U <- copy
    LU <- list("L" = L, "U" = U)
    LU
}

triangles <- decompose_LU(E)
L <- triangles$L
print_L <- to_LaTeX(L)
U <- triangles$U
print_U <- to_LaTeX(U)

The matrix $E$ has now been decomposed successfully into $LU$ :

$E = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 4 & 1 & 0 \\ 7 & 2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 \\ 0 & -3 & -6 \\ 0 & 0 & 0 \end{bmatrix} = LU$

Data 605 - HW2

Glen Dale Davis

2023-02-01

Before We Begin:

Problem Set 1:

Question 1:

Answer 1:

Question 2:

Answer 2:

Problem Set 2:

Question 1:

Answer 1: