We define two types of elementary row-operations on matrices. The first one, an \(S\)-op, multiplies a row by a constant. The second, a \(T\)-op, adds a multiple of a row to a different row. Applying an elementary row operation to a matrix is equivalent to premultiplying the matrix by an \(S\)-matrix or a \(T\)-matrix, which are computed in R by
sfunc <- function (n, i, x) {
s <- diag (n)
s [i, i] <- x
return (s)
}
tfunc <- function (n, i, j, x) {
t <- diag (n)
t [i, j] <- x
return (t)
}
An \(S\)-matrix has determinant \(x\), and is consequently non-singular is \(x\not= 0\). A \(T\)-matrix has determinant 1, not matter what \(x\) is, and is consequently always non-singular. If \(i>j\) a \(T\) matrix is unit lower triangular, if \(i<j\) it is unit upper triangular.
Now suppose \(A\) is a nonsingular matrix of order \(n\). Then we can use a sequence of elementary operations, i.e. premultiplications by \(S\)-matrices and \(T\)-matrices, to transform \(A\) to unit upper triangular form. We will only use lower triangular \(T\)-matrices, and as a consequence the product of the elementary operations we use will be a lower triangular matrix.
Let us illustrate this with
set.seed (12345)
print (a <- matrix (sample (16), 4, 4))
## [,1] [,2] [,3] [,4]
## [1,] 12 6 13 3
## [2,] 14 2 7 9
## [3,] 11 4 1 10
## [4,] 16 5 8 15
We shall apply the elementary row operations to the matrix
cbind (a, diag (4))
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 12 6 13 3 1 0 0 0
## [2,] 14 2 7 9 0 1 0 0
## [3,] 11 4 1 10 0 0 1 0
## [4,] 16 5 8 15 0 0 0 1
After we are done the first four columns will contain the transformed matrix, which should be unit upper triangular, and the last four columns will be the product of the \(S\)-matrices and \(T\)-matrices we used, which should be lower triangular.
The function we loop forward over columns. If we work on column \(i\) we use an \(S\)-op to make diagonal element \((i,i)\) equal to one, and then we use \(n-i-1\) different \(T\)-ops to make the elements \((j,i)\) with \(j>i\) equal to zero. Thus we use row elementary operations to bring the columns into the required shape.
gaussMatrixForward <- function (a, verbose = TRUE) {
n <- nrow (a)
for (i in 1 : n) {
a <- sfunc (n, i, 1 / a[i, i]) %*% a
if (verbose) {
print (noquote (formatC (a, digits = 4, width = 7, format = "f")))
}
if (i == n) {
break ()
}
for (j in (i + 1) : n) {
a <- tfunc (n, j, i, - a[j, i]) %*% a
if (verbose) {
print (noquote (formatC (a, digits = 4, width = 7, format = "f")))
}
}
}
return (a)
}
fa <- gaussMatrixForward (cbind(a,diag(4)))
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 1.0000 0.5000 1.0833 0.2500 0.0833 0.0000 0.0000 0.0000
## [2,] 14.0000 2.0000 7.0000 9.0000 0.0000 1.0000 0.0000 0.0000
## [3,] 11.0000 4.0000 1.0000 10.0000 0.0000 0.0000 1.0000 0.0000
## [4,] 16.0000 5.0000 8.0000 15.0000 0.0000 0.0000 0.0000 1.0000
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 1.0000 0.5000 1.0833 0.2500 0.0833 0.0000 0.0000 0.0000
## [2,] 0.0000 -5.0000 -8.1667 5.5000 -1.1667 1.0000 0.0000 0.0000
## [3,] 11.0000 4.0000 1.0000 10.0000 0.0000 0.0000 1.0000 0.0000
## [4,] 16.0000 5.0000 8.0000 15.0000 0.0000 0.0000 0.0000 1.0000
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 1.0000 0.5000 1.0833 0.2500 0.0833 0.0000 0.0000 0.0000
## [2,] 0.0000 -5.0000 -8.1667 5.5000 -1.1667 1.0000 0.0000 0.0000
## [3,] 0.0000 -1.5000 -10.9167 7.2500 -0.9167 0.0000 1.0000 0.0000
## [4,] 16.0000 5.0000 8.0000 15.0000 0.0000 0.0000 0.0000 1.0000
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 1.0000 0.5000 1.0833 0.2500 0.0833 0.0000 0.0000 0.0000
## [2,] 0.0000 -5.0000 -8.1667 5.5000 -1.1667 1.0000 0.0000 0.0000
## [3,] 0.0000 -1.5000 -10.9167 7.2500 -0.9167 0.0000 1.0000 0.0000
## [4,] 0.0000 -3.0000 -9.3333 11.0000 -1.3333 0.0000 0.0000 1.0000
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 1.0000 0.5000 1.0833 0.2500 0.0833 0.0000 0.0000 0.0000
## [2,] 0.0000 1.0000 1.6333 -1.1000 0.2333 -0.2000 0.0000 0.0000
## [3,] 0.0000 -1.5000 -10.9167 7.2500 -0.9167 0.0000 1.0000 0.0000
## [4,] 0.0000 -3.0000 -9.3333 11.0000 -1.3333 0.0000 0.0000 1.0000
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 1.0000 0.5000 1.0833 0.2500 0.0833 0.0000 0.0000 0.0000
## [2,] 0.0000 1.0000 1.6333 -1.1000 0.2333 -0.2000 0.0000 0.0000
## [3,] 0.0000 0.0000 -8.4667 5.6000 -0.5667 -0.3000 1.0000 0.0000
## [4,] 0.0000 -3.0000 -9.3333 11.0000 -1.3333 0.0000 0.0000 1.0000
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 1.0000 0.5000 1.0833 0.2500 0.0833 0.0000 0.0000 0.0000
## [2,] 0.0000 1.0000 1.6333 -1.1000 0.2333 -0.2000 0.0000 0.0000
## [3,] 0.0000 0.0000 -8.4667 5.6000 -0.5667 -0.3000 1.0000 0.0000
## [4,] 0.0000 0.0000 -4.4333 7.7000 -0.6333 -0.6000 0.0000 1.0000
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 1.0000 0.5000 1.0833 0.2500 0.0833 0.0000 0.0000 0.0000
## [2,] 0.0000 1.0000 1.6333 -1.1000 0.2333 -0.2000 0.0000 0.0000
## [3,] 0.0000 0.0000 1.0000 -0.6614 0.0669 0.0354 -0.1181 0.0000
## [4,] 0.0000 0.0000 -4.4333 7.7000 -0.6333 -0.6000 0.0000 1.0000
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 1.0000 0.5000 1.0833 0.2500 0.0833 0.0000 0.0000 0.0000
## [2,] 0.0000 1.0000 1.6333 -1.1000 0.2333 -0.2000 0.0000 0.0000
## [3,] 0.0000 0.0000 1.0000 -0.6614 0.0669 0.0354 -0.1181 0.0000
## [4,] 0.0000 0.0000 0.0000 4.7677 -0.3366 -0.4429 -0.5236 1.0000
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 1.0000 0.5000 1.0833 0.2500 0.0833 0.0000 0.0000 0.0000
## [2,] 0.0000 1.0000 1.6333 -1.1000 0.2333 -0.2000 0.0000 0.0000
## [3,] 0.0000 0.0000 1.0000 -0.6614 0.0669 0.0354 -0.1181 0.0000
## [4,] 0.0000 0.0000 0.0000 1.0000 -0.0706 -0.0929 -0.1098 0.2097
Let us call the first four columns of the result \(U\), i.e.
## [,1] [,2] [,3] [,4]
## [1,] 1 0.5 1.083333 0.2500000
## [2,] 0 1.0 1.633333 -1.1000000
## [3,] 0 0.0 1.000000 -0.6614173
## [4,] 0 0.0 0.000000 1.0000000
and the last four columns we call \(L\). This is
## [,1] [,2] [,3] [,4]
## [1,] 0.08333333 0.00000000 0.0000000 0.000000
## [2,] 0.23333333 -0.20000000 0.0000000 0.000000
## [3,] 0.06692913 0.03543307 -0.1181102 0.000000
## [4,] -0.07060281 -0.09289843 -0.1098266 0.209744
Thus \(LA=U\), or \(A=L^{-1}U\), and thus we have written \(A\) as the product of a lower triangular and unit upper triangular matrix. This is the LU decomposition.
But we are not yet done with our elementary row operations. We can use additional \(T\)-ops on our transformed matrix
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,] 1 0.5 1.083333 0.2500000 0.08333333 0.00000000 0.0000000
## [2,] 0 1.0 1.633333 -1.1000000 0.23333333 -0.20000000 0.0000000
## [3,] 0 0.0 1.000000 -0.6614173 0.06692913 0.03543307 -0.1181102
## [4,] 0 0.0 0.000000 1.0000000 -0.07060281 -0.09289843 -0.1098266
## [,8]
## [1,] 0.000000
## [2,] 0.000000
## [3,] 0.000000
## [4,] 0.209744
to zero out all upper-diagonal elements in the first four columns. If we do this, the first four columns will have been transformed to the identity matrix. We loop backward over columns, and while working on column \(i\) we use \(T\)-ops to zero out all elements \((j,i)\) with \(j < i\). There is no need for \(S\)-ops because the diagonal elements are already equal to one.
gaussMatrixBackward <- function (a, verbose = TRUE) {
n <- nrow (a)
for (i in n : 2) {
for (j in (i - 1) : 1) {
a <- tfunc (n, j, i, - a[j, i]) %*% a
if (verbose) {
print (noquote (formatC (a, digits = 4, width = 7, format = "f")))
}
}
}
return (a)
}
ga <- gaussMatrixBackward (fa)
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 1.0000 0.5000 1.0833 0.2500 0.0833 0.0000 0.0000 0.0000
## [2,] 0.0000 1.0000 1.6333 -1.1000 0.2333 -0.2000 0.0000 0.0000
## [3,] 0.0000 0.0000 1.0000 0.0000 0.0202 -0.0260 -0.1908 0.1387
## [4,] 0.0000 0.0000 0.0000 1.0000 -0.0706 -0.0929 -0.1098 0.2097
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 1.0000 0.5000 1.0833 0.2500 0.0833 0.0000 0.0000 0.0000
## [2,] 0.0000 1.0000 1.6333 0.0000 0.1557 -0.3022 -0.1208 0.2307
## [3,] 0.0000 0.0000 1.0000 0.0000 0.0202 -0.0260 -0.1908 0.1387
## [4,] 0.0000 0.0000 0.0000 1.0000 -0.0706 -0.0929 -0.1098 0.2097
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 1.0000 0.5000 1.0833 0.0000 0.1010 0.0232 0.0275 -0.0524
## [2,] 0.0000 1.0000 1.6333 0.0000 0.1557 -0.3022 -0.1208 0.2307
## [3,] 0.0000 0.0000 1.0000 0.0000 0.0202 -0.0260 -0.1908 0.1387
## [4,] 0.0000 0.0000 0.0000 1.0000 -0.0706 -0.0929 -0.1098 0.2097
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 1.0000 0.5000 1.0833 0.0000 0.1010 0.0232 0.0275 -0.0524
## [2,] 0.0000 1.0000 0.0000 0.0000 0.1226 -0.2597 0.1908 0.0041
## [3,] 0.0000 0.0000 1.0000 0.0000 0.0202 -0.0260 -0.1908 0.1387
## [4,] 0.0000 0.0000 0.0000 1.0000 -0.0706 -0.0929 -0.1098 0.2097
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 1.0000 0.5000 0.0000 0.0000 0.0791 0.0514 0.2341 -0.2027
## [2,] 0.0000 1.0000 0.0000 0.0000 0.1226 -0.2597 0.1908 0.0041
## [3,] 0.0000 0.0000 1.0000 0.0000 0.0202 -0.0260 -0.1908 0.1387
## [4,] 0.0000 0.0000 0.0000 1.0000 -0.0706 -0.0929 -0.1098 0.2097
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 1.0000 0.0000 0.0000 0.0000 0.0178 0.1813 0.1387 -0.2048
## [2,] 0.0000 1.0000 0.0000 0.0000 0.1226 -0.2597 0.1908 0.0041
## [3,] 0.0000 0.0000 1.0000 0.0000 0.0202 -0.0260 -0.1908 0.1387
## [4,] 0.0000 0.0000 0.0000 1.0000 -0.0706 -0.0929 -0.1098 0.2097
If we use \(B\) for the last four columns of the result, i.e.
print (b <- ga[, 5:8])
## [,1] [,2] [,3] [,4]
## [1,] 0.01775392 0.18125516 0.1387283 -0.204789430
## [2,] 0.12262593 -0.25970273 0.1907514 0.004128819
## [3,] 0.02023121 -0.02601156 -0.1907514 0.138728324
## [4,] -0.07060281 -0.09289843 -0.1098266 0.209744013
then \(B\) is the product of all elementary operations over both the forward and backward phase. And we have \(BA=I\), in other words \(B=A^{-1}\), and we have computed the inverse of \(A\).
Suppose \(Ax=b\) is a system of linear equations with square non-singular \(A\). Choose \(b\) equal to 1, 2, 3, 4, for example. We can find the unique solution \(x\) by using
print (fx <- gaussMatrixBackward (gaussMatrixForward (cbind (a, 1:4), verbose = FALSE), verbose = FALSE))
## [,1] [,2] [,3] [,4] [,5]
## [1,] 1 0 0 0 -0.02270851
## [2,] 0 1 0 0 0.19199009
## [3,] 0 0 1 0 -0.04913295
## [4,] 0 0 0 1 0.25309661
The last column contains the solution, which we can verify by
a %*% fx[,5]
## [,1]
## [1,] 1
## [2,] 2
## [3,] 3
## [4,] 4
We can compute the LU-decomposition of a matrix by using forward operations twice. The code is
LU_decomp <- function (a) {
n <- nrow (a)
g <- gaussMatrixForward (cbind (a, diag (n)), verbose = FALSE)
h <- gaussMatrixForward (cbind (g[, n + 1 : n], diag (n)), verbose = FALSE)
return (list (L = h[, n + 1 : n], U = g [, 1 : n]))
}
which gives
print (lu <- LU_decomp (a))
## $L
## [,1] [,2] [,3] [,4]
## [1,] 12 0.0 0.000000 0.000000
## [2,] 14 -5.0 0.000000 0.000000
## [3,] 11 -1.5 -8.466667 0.000000
## [4,] 16 -3.0 -4.433333 4.767717
##
## $U
## [,1] [,2] [,3] [,4]
## [1,] 1 0.5 1.083333 0.2500000
## [2,] 0 1.0 1.633333 -1.1000000
## [3,] 0 0.0 1.000000 -0.6614173
## [4,] 0 0.0 0.000000 1.0000000
Veryfying
lu $ L %*% lu $ U
## [,1] [,2] [,3] [,4]
## [1,] 12 6 13 3
## [2,] 14 2 7 9
## [3,] 11 4 1 10
## [4,] 16 5 8 15
After forward elimination we have \(LA=U\). Thus \(\text{det}(L)\text{det}(A)=\text{det}(U)\), and since \(\text{det}(U)=1\) we have \(\text{det}(A)=\text{det}(L)^{-1}\), where \(\text{det}(L)\) is the product of the diagonal elements of \(L\).
myDet <- function (a) {
n <- nrow (a)
h <- gaussMatrixForward (cbind (a, diag (n)), verbose = FALSE)
return (1 / prod (diag (h[, n + (1 : n)])))
}
det(a)
## [1] 2422
myDet(a)
## [1] 2422
In our R programs we applied our elementary row operations by actually premultiplying with \(S\) and \(T\)-matrices. Because these matrices are sparse, this is very inefficient. In addition we can vectorize the inner loop and get rid of some multiplications by zero. This leads, for example, to a much more efficient R function for forward elimination.
gauss.forward<-function(a, verbose = FALSE) {
n <- nrow (a)
m <- ncol (a)
for (i in 1 : n) {
a[i, ]<-a[i, ] / a[i, i]
if (verbose) {
print (noquote (formatC (a, digits = 4, width = 7, format = "f")))
}
if (i == n) break()
ii <- (i + 1) : n
ij <- i : m
a[ii, ij ] <- a[ii, ij] - outer (a[ii, i], a[i, ij])
if (verbose) {
print (noquote (formatC (a, digits = 4, width = 7, format = "f")))
}
}
return (a)
}
gf <- gauss.forward (cbind(a, diag(4)), verbose = TRUE)
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 1.0000 0.5000 1.0833 0.2500 0.0833 0.0000 0.0000 0.0000
## [2,] 14.0000 2.0000 7.0000 9.0000 0.0000 1.0000 0.0000 0.0000
## [3,] 11.0000 4.0000 1.0000 10.0000 0.0000 0.0000 1.0000 0.0000
## [4,] 16.0000 5.0000 8.0000 15.0000 0.0000 0.0000 0.0000 1.0000
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 1.0000 0.5000 1.0833 0.2500 0.0833 0.0000 0.0000 0.0000
## [2,] 0.0000 -5.0000 -8.1667 5.5000 -1.1667 1.0000 0.0000 0.0000
## [3,] 0.0000 -1.5000 -10.9167 7.2500 -0.9167 0.0000 1.0000 0.0000
## [4,] 0.0000 -3.0000 -9.3333 11.0000 -1.3333 0.0000 0.0000 1.0000
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 1.0000 0.5000 1.0833 0.2500 0.0833 0.0000 0.0000 0.0000
## [2,] -0.0000 1.0000 1.6333 -1.1000 0.2333 -0.2000 -0.0000 -0.0000
## [3,] 0.0000 -1.5000 -10.9167 7.2500 -0.9167 0.0000 1.0000 0.0000
## [4,] 0.0000 -3.0000 -9.3333 11.0000 -1.3333 0.0000 0.0000 1.0000
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 1.0000 0.5000 1.0833 0.2500 0.0833 0.0000 0.0000 0.0000
## [2,] -0.0000 1.0000 1.6333 -1.1000 0.2333 -0.2000 -0.0000 -0.0000
## [3,] 0.0000 0.0000 -8.4667 5.6000 -0.5667 -0.3000 1.0000 0.0000
## [4,] 0.0000 0.0000 -4.4333 7.7000 -0.6333 -0.6000 0.0000 1.0000
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 1.0000 0.5000 1.0833 0.2500 0.0833 0.0000 0.0000 0.0000
## [2,] -0.0000 1.0000 1.6333 -1.1000 0.2333 -0.2000 -0.0000 -0.0000
## [3,] -0.0000 -0.0000 1.0000 -0.6614 0.0669 0.0354 -0.1181 -0.0000
## [4,] 0.0000 0.0000 -4.4333 7.7000 -0.6333 -0.6000 0.0000 1.0000
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 1.0000 0.5000 1.0833 0.2500 0.0833 0.0000 0.0000 0.0000
## [2,] -0.0000 1.0000 1.6333 -1.1000 0.2333 -0.2000 -0.0000 -0.0000
## [3,] -0.0000 -0.0000 1.0000 -0.6614 0.0669 0.0354 -0.1181 -0.0000
## [4,] 0.0000 0.0000 0.0000 4.7677 -0.3366 -0.4429 -0.5236 1.0000
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 1.0000 0.5000 1.0833 0.2500 0.0833 0.0000 0.0000 0.0000
## [2,] -0.0000 1.0000 1.6333 -1.1000 0.2333 -0.2000 -0.0000 -0.0000
## [3,] -0.0000 -0.0000 1.0000 -0.6614 0.0669 0.0354 -0.1181 -0.0000
## [4,] 0.0000 0.0000 0.0000 1.0000 -0.0706 -0.0929 -0.1098 0.2097