Ken Jinkun Xiao
March 19, 2014
Vectors in R can be constructed using the c() function. Patterned vectors can be constructed using seq(), rep() and \( : \). Matrices can be constructed using the functions matrix(), cbind() or rbind().
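For example, here are a few of these constructors in action (the values are arbitrary, chosen only for illustration):
c(2, 5, 9)                # combine values into a vector
seq(1, 9, by = 2)         # 1 3 5 7 9
rep(1:3, times = 2)       # 1 2 3 1 2 3
1:4                       # same as seq(1, 4)
matrix(1:6, nrow = 2)     # fill a 2 x 3 matrix column by column
cbind(1:3, 4:6)           # bind two columns into a 3 x 2 matrix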
The class of Hilbert matrices is useful when studying the stability of numerical linear algebra techniques: \[ H_{ij} = 1/(i+j - 1) \]
Here is the \( 4 \times 4 \) Hilbert matrix:
[,1] [,2] [,3] [,4]
[1,] 1.0000000 0.5000000 0.3333333 0.2500000
[2,] 0.5000000 0.3333333 0.2500000 0.2000000
[3,] 0.3333333 0.2500000 0.2000000 0.1666667
[4,] 0.2500000 0.2000000 0.1666667 0.1428571
We could construct the Hilbert matrix by binding columns together as follows:
1/cbind(seq(1, 4), seq(2, 5), seq(3, 6), seq(4,7))
[,1] [,2] [,3] [,4]
[1,] 1.0000000 0.5000000 0.3333333 0.2500000
[2,] 0.5000000 0.3333333 0.2500000 0.2000000
[3,] 0.3333333 0.2500000 0.2000000 0.1666667
[4,] 0.2500000 0.2000000 0.1666667 0.1428571
In this example, rbind() would give the same result, because of symmetry.
Note that
rep(1:4,rep(4,4))+rep(0:3,4)
[1] 1 2 3 4 2 3 4 5 3 4 5 6 4 5 6 7
So
matrix(rep(1:4,rep(4,4))+rep(0:3,4),nrow=4)
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 2 3 4 5
[3,] 3 4 5 6
[4,] 4 5 6 7
We can get the Hilbert matrix using:
H4 <- 1/matrix(rep(1:4,rep(4,4))+rep(0:3,4),nrow=4)
H4
[,1] [,2] [,3] [,4]
[1,] 1.0000000 0.5000000 0.3333333 0.2500000
[2,] 0.5000000 0.3333333 0.2500000 0.2000000
[3,] 0.3333333 0.2500000 0.2000000 0.1666667
[4,] 0.2500000 0.2000000 0.1666667 0.1428571
This is a better way to construct Hilbert matrices, since we can now use the idea for any size of matrix:
n <- 4 # this gives a 4 x 4 Hilbert Matrix
H4 <- 1/matrix(rep(1:n,rep(n,n))+rep(0:(n-1),n),nrow=n)
H4
[,1] [,2] [,3] [,4]
[1,] 1.0000000 0.5000000 0.3333333 0.2500000
[2,] 0.5000000 0.3333333 0.2500000 0.2000000
[3,] 0.3333333 0.2500000 0.2000000 0.1666667
[4,] 0.2500000 0.2000000 0.1666667 0.1428571
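We can wrap the same idea in a small helper function (hilbert() is just an illustrative name, not a built-in function), so that any \( n \times n \) Hilbert matrix can be produced with one call:
hilbert <- function(n) {
  # entry (i, j) is 1/(i + j - 1)
  1/matrix(rep(1:n, rep(n, n)) + rep(0:(n - 1), n), nrow = n)
}
hilbert(4)   # reproduces H4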
Bonus question: create an Excel file that takes the dimension as a cell input and produces the corresponding Hilbert matrix.
Example:
matrix(seq(1, 6), nrow=2)
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
Example 2:
cbind(H4, 1/5:8)
[,1] [,2] [,3] [,4] [,5]
[1,] 1.0000000 0.5000000 0.3333333 0.2500000 0.2000000
[2,] 0.5000000 0.3333333 0.2500000 0.2000000 0.1666667
[3,] 0.3333333 0.2500000 0.2000000 0.1666667 0.1428571
[4,] 0.2500000 0.2000000 0.1666667 0.1428571 0.1250000
We access the \( (i, j) \) element of a matrix \( X \) using \( X[i,j] \).
H4[3, 2]
[1] 0.25
We can access the ith row using \( X[i,] \), and the jth column using \( X[, j] \).
H4[3,]
[1] 0.3333333 0.2500000 0.2000000 0.1666667
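Similarly, we can extract the third column:
H4[, 3]
[1] 0.3333333 0.2500000 0.2000000 0.1666667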
Usually, the rows and columns of a matrix do not have names:
colnames(H4)
NULL
rownames(H4)
NULL
We can add names to columns and/or rows:
colnames(H4) <- c("h1", "h2", "h3", "h4")
Example:
colnames(H4)
[1] "h1" "h2" "h3" "h4"
We can now access the second column of H4 with
H4[,"h2"]
[1] 0.5000000 0.3333333 0.2500000 0.2000000
We can create a matrix with rownames by applying rbind() to some vectors.
colors <- c("red", "blue", "yellow")
numbers <- c("one", "two", "three")
charmat <- rbind(colors, numbers)
charmat
[,1] [,2] [,3]
colors "red" "blue" "yellow"
numbers "one" "two" "three"
The dimension of a matrix is its number of rows and its number of columns. For example,
dim(charmat)
[1] 2 3
We can count the numbers of rows and columns with nrow() and ncol():
nrow(charmat)
[1] 2
ncol(charmat)
[1] 3
The determinant can be found in R using the det() function, as in
H4
h1 h2 h3 h4
[1,] 1.0000000 0.5000000 0.3333333 0.2500000
[2,] 0.5000000 0.3333333 0.2500000 0.2000000
[3,] 0.3333333 0.2500000 0.2000000 0.1666667
[4,] 0.2500000 0.2000000 0.1666667 0.1428571
det(H4)
[1] 1.653439e-07
A diagonal matrix can be obtained using the diag() function applied to a vector (which will contain the elements of the diagonal):
diag(1:4)
[,1] [,2] [,3] [,4]
[1,] 1 0 0 0
[2,] 0 2 0 0
[3,] 0 0 3 0
[4,] 0 0 0 4
The diagonal elements of a matrix can be obtained using the diag() function, as in
diag(H4)
[1] 1.0000000 0.3333333 0.2000000 0.1428571
Exercise: What is the result of the following?
diag(diag(H4))
The outer product of column vectors x and y is \( xy^T \).
The outer() function computes this product, by default, but can produce other forms as well.
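For example, using two short, arbitrary vectors:
u <- 1:3
v <- c(2, 5)
outer(u, v)   # the outer product u v^T: a 3 x 2 matrix with rows (2, 5), (4, 10), (6, 15)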
The function outer() can be used to perform an operation on all possible pairs of elements coming from two vectors. For example, we will compute all quotients among pairs of elements of the sequence running from 1 through 5.
x1 <- seq(1, 5)
outer(x1, x1, "/")
[,1] [,2] [,3] [,4] [,5]
[1,] 1 0.5 0.3333333 0.25 0.2
[2,] 2 1.0 0.6666667 0.50 0.4
[3,] 3 1.5 1.0000000 0.75 0.6
[4,] 4 2.0 1.3333333 1.00 0.8
[5,] 5 2.5 1.6666667 1.25 1.0
Replacing the division operation with the subtraction operator gives all pairwise differences.
outer(x1, x1, "-")
[,1] [,2] [,3] [,4] [,5]
[1,] 0 -1 -2 -3 -4
[2,] 1 0 -1 -2 -3
[3,] 2 1 0 -1 -2
[4,] 3 2 1 0 -1
[5,] 4 3 2 1 0
The third argument can be any function that takes two vector arguments. The second argument can differ from the first. For example,
y <- seq(5, 10)
outer(x1, y, "+")
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 6 7 8 9 10 11
[2,] 7 8 9 10 11 12
[3,] 8 9 10 11 12 13
[4,] 9 10 11 12 13 14
[5,] 10 11 12 13 14 15
The apply() function is very useful and is often more efficient than a for() loop. For example, we can compute the product of the elements in each column of H4 using
apply(H4, 2, prod)
h1 h2 h3 h4
0.041666667 0.008333333 0.002777778 0.001190476
The second argument specifies whether the operation is to be applied to rows (1) or columns (2).
The third argument specifies the function which should be applied. In this case, the prod(x) function multiplies the elements of x.
test <- matrix(1:6, nrow=3)
test
[,1] [,2]
[1,] 1 4
[2,] 2 5
[3,] 3 6
apply(test, 1, sum)
[1] 5 7 9
apply(test, 2, sum)
[1] 6 15
The \( (i, j) \) element of the following outer product matrix is of the form \( x_i^{y_j} \)
X <- outer(1:3, 1:2, function(x,y) x^y)
X
[,1] [,2]
[1,] 1 1
[2,] 2 4
[3,] 3 9
We can compute the trace of a matrix \( X \) using
diag(X)
[1] 1 4
sum(diag(X))
[1] 5
sum(diag(H4))
[1] 1.67619
The t() function is used to calculate the matrix transpose \( X^T \):
t(charmat)
colors numbers
[1,] "red" "one"
[2,] "blue" "two"
[3,] "yellow" "three"
The functions lower.tri() and upper.tri() can be used to obtain the lower and upper triangular parts of matrices.
The output of the functions is a matrix of logical elements, with TRUE representing the relevant triangular elements. For example,
lower.tri(H4)
[,1] [,2] [,3] [,4]
[1,] FALSE FALSE FALSE FALSE
[2,] TRUE FALSE FALSE FALSE
[3,] TRUE TRUE FALSE FALSE
[4,] TRUE TRUE TRUE FALSE
We can obtain the lower triangular matrix whose nonzero elements match the lower triangular elements of H4 by using
Hnew <- H4
Hnew[upper.tri(H4, diag=TRUE)] <- 0
Hnew
h1 h2 h3 h4
[1,] 0.0000000 0.00 0.0000000 0
[2,] 0.5000000 0.00 0.0000000 0
[3,] 0.3333333 0.25 0.0000000 0
[4,] 0.2500000 0.20 0.1666667 0
Multiplication of a matrix by a scalar constant is the same as multiplication of a vector by a constant.
For example, using the X matrix from the previous section, we can multiply each element by 2 as in
Y <- 2 * X
Y
[,1] [,2]
[1,] 2 2
[2,] 4 8
[3,] 6 18
Elementwise addition of matrices also proceeds as for vectors. For example,
Y + X
[,1] [,2]
[1,] 3 3
[2,] 6 12
[3,] 9 27
When adding matrices, always ensure that the dimensions match. If they do not, an error message will appear, as in
t(Y) + X
The command X * Y performs elementwise multiplication. Note that this differs from the usual form of matrix multiplication that we will discuss below. For example,
X * Y
[,1] [,2]
[1,] 2 2
[2,] 8 32
[3,] 18 162
Again, in order for this kind of multiplication to work, the dimensions of the matrices must match.
If A and B are matrices, then the matrix product AB is the matrix representing the composition of the two operations: first apply B, then apply A to the result.
For matrix multiplication to be a properly defined operation, the matrices to be multiplied must conform.
That is, the number of columns of the first matrix must match the number of rows of the second matrix.
The resulting matrix AB will have its row dimension taken from A and its column dimension taken from B.
In R, this form of matrix multiplication can be performed using the operator %*%, for example
t(Y) %*% X
[,1] [,2]
[1,] 28 72
[2,] 72 196
We saw that t(Y) has 3 columns and X has 3 rows, so we can perform the multiplication \( Y^TX \).
The result is a \( 2\times2 \) matrix, since t(Y) has 2 rows and X has 2 columns.
If we failed to transpose Y, we would obtain an error, as in
Y %*% X
The crossprod() function is a somewhat more efficient way to calculate \( Y^TX \):
crossprod(Y,X)
[,1] [,2]
[1,] 28 72
[2,] 72 196
The first argument of crossprod() is transposed automatically.
t(Y) %*% X needs to make a new object t(Y) before performing the multiplication. If Y is a large matrix, this will consume a lot of memory and noticeable computation time.
The crossprod(Y, X) function call can access the elements of Y directly, since the \( (i, j) \) element of \( Y^T \) is simply the \( (j, i) \) element of Y
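A rough way to see this is to time both forms on larger matrices (a sketch only: the matrices are filled with arbitrary random values, and timings will vary by machine):
set.seed(1)
bigA <- matrix(rnorm(1000 * 1000), nrow = 1000)
bigB <- matrix(rnorm(1000 * 1000), nrow = 1000)
system.time(t(bigA) %*% bigB)       # explicitly forms t(bigA) before multiplying
system.time(crossprod(bigA, bigB))  # computes t(bigA) %*% bigB directly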
The inverse of a square \( n \times n \) matrix A, denoted by \( A^{-1} \), is the solution to the matrix equation \( AA^{-1} = I \), where \( I \) is the \( n \times n \) identity matrix.
We can view this as n separate systems of linear equations in \( n \) unknowns, whose solutions are the columns of \( A^{-1} \).
For example, with
\[ A=\begin{pmatrix} 3 & -4 \\ 1 & 2 \\ \end{pmatrix} \]
the two columns of \( A^{-1} \) are the solutions of the two systems \( Ax = [1, 0]^T \) and \( Ax = [0, 1]^T \).
The usual computational approach to finding \( A^{-1} \) involves solving these two systems of equations.
However, this is not always a good idea!
Often the reason we are trying to find \( A^{-1} \) is so that we can solve a system \( Ax = b \) with the solution \( x = A^{-1}b \).
It doesn't make sense from a computational point of view to solve n systems of linear equations in order to obtain a result which will be used as the solution to one system.
If we know how to solve systems, we should use that knowledge to solve \( Ax = b \) directly.
Furthermore, using \( A^{-1} \) may give a worse approximation to the final result than the direct approach, because many more operations are involved, giving opportunities for much more rounding error to creep into our results.
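In R terms, this means preferring solve(A, b) over forming the inverse and multiplying. A small sketch using the \( 2 \times 2 \) matrix above, with an arbitrary right-hand side:
A <- matrix(c(3, 1, -4, 2), nrow = 2)
b <- c(1, 2)           # an arbitrary right-hand side, for illustration only
solve(A, b)            # solves Ax = b directly (preferred)
solve(A) %*% b         # forms the inverse first, then multiplies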
The general strategy for solving a system of equations \( Ax = b \) is to break down the problem into simpler ones.
This often involves rewriting the matrix A in a special form; one such form is called the LU decomposition.
In the LU decomposition, we write A as a product of two matrices L and U.
The matrix L is lower triangular with unit values on the diagonal, \[ L=\begin{pmatrix} 1 & 0 & \cdots & 0 \\ l_{21} & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ l_{n1} & l_{n2} & \cdots & 1 \\ \end{pmatrix} \]
U is upper triangular, i.e.
\[ U=\begin{pmatrix} u_{11} & u_{12} & \cdots & u_{1n} \\ 0 & u_{22} & \cdots & u_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & \cdots & 0 & u_{nn} \\ \end{pmatrix} \]
Consider the matrix
\[ A=\begin{pmatrix} 2 & 4 & 3 \\ 6 & 16 & 10 \\ 4&12 & 9 \\ \end{pmatrix} \]
Writing the entries of A as \( a_{ij} \), we can use the relation
\[ a_{ij}=\sum_{k=1}^3l_{ik}u_{kj} \]
to solve for the unknown entries of \( L \) and \( U \) one at a time. For this A, the result is
\[ L=\begin{pmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 2 & 1 & 1 \\ \end{pmatrix}, \qquad U=\begin{pmatrix} 2 & 4 & 3 \\ 0 & 4 & 1 \\ 0 & 0 & 2 \\ \end{pmatrix} \]
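As a quick check in R that these factors reproduce A (the matrices are typed in by hand from the calculation above):
A <- matrix(c(2, 6, 4, 4, 16, 12, 3, 10, 9), nrow = 3)  # the example matrix
L <- matrix(c(1, 3, 2, 0, 1, 1, 0, 0, 1), nrow = 3)     # unit lower triangular factor
U <- matrix(c(2, 0, 0, 4, 4, 0, 3, 1, 2), nrow = 3)     # upper triangular factor
L %*% U   # reproduces A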
Once we have L and U in hand, solving the system of equations \( Ax = b \) is easy.
We write the system as \( L(Ux) = b \), set \( y = Ux \) and solve \( Ly = b \) for \( y \) first.
Because \( L \) is lower triangular, this is straightforward using a procedure known as forward elimination.
Continuing the example above, with \( b = [-1, -2, -7]^T \), and setting \( y = [y_1, y_2, y_3]^T \), we make use of the relation
\[ b_i=\sum_{j=1}^{3}l_{ij}y_j \]
to calculate \( y_1 = b_1 = -1 \), then \( y_2 = b_2 - l_{21}y_1 = -2 - 3(-1) = 1 \), and finally \( y_3 = b_3 - l_{31}y_1 - l_{32}y_2 = -7 - 2(-1) - 1(1) = -6 \).
Finally, we solve Ux = y. This time the fact that U is upper triangular means solving for the entries in reverse order is easy, using a procedure called back substitution: \( x_3 = y_3/u_{33} = -3 \), then \( x_2 = (y_2 - u_{23}x_3)/u_{22} = 1 \), and \( x_1 = (y_1 - u_{12}x_2 - u_{13}x_3)/u_{11} = 2 \).
By working through these steps successively, the problem of solving Ax = b has been reduced to solving 15 successive linear equations, each with just one unknown.
The procedure is easily automated.
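For the example above, the two triangular solves can be carried out with the base R functions forwardsolve() and backsolve() (discussed again below), using the L and U matrices entered earlier:
b <- c(-1, -2, -7)
y <- forwardsolve(L, b)   # forward elimination: solves L y = b, giving -1, 1, -6
x <- backsolve(U, y)      # back substitution: solves U x = y, giving 2, 1, -3
x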
In fact, the default method used in R for solving linear equations is based on this technique; the only substantial difference is that the rows are rearranged (pivoted) before factoring so that rounding error is minimized.
In R, matrices are inverted and linear systems of equations are solved using the solve() or qr.solve() functions.
solve() uses a method based on the LU decomposition; qr.solve() is based on the QR decomposition that is described later.
As an example, we compute the inverse of the \( 4 \times 4 \) Hilbert matrix.
H4inv <- solve(H4)
H4inv
[,1] [,2] [,3] [,4]
h1 16 -120 240 -140
h2 -120 1200 -2700 1680
h3 240 -2700 6480 -4200
h4 -140 1680 -4200 2800
To verify that this is the inverse of H4, we can check that the product of H4inv and H4 is the \( 4 \times 4 \) identity.
H4inv %*% H4
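Because of rounding error, the off-diagonal entries of this product may print as very small nonzero numbers rather than exact zeros; the base function zapsmall() rounds them away for display:
zapsmall(H4inv %*% H4)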
The function solve(A, b) gives the solution to systems of equa- tions of the form \( Ax = b \).
For example, let us find x such that \( H_4x = b \) where \( H_4 \) is the \( 4 \times 4 \) Hilbert matrix and \( b = [1, 2, 3, 2]^T \).
b <- c(1, 2, 3, 2)
x <- solve(H4, b)
x
h1 h2 h3 h4
216 -2460 5880 -3780
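We can check the solution by multiplying back; up to rounding error, this reproduces b:
H4 %*% x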
Eigenvalues and eigenvectors can be computed using the function eigen(). For example,
eigen(H4)
$values
[1] 1.5002142801 0.1691412202 0.0067382736 0.0000967023
$vectors
[,1] [,2] [,3] [,4]
[1,] 0.7926083 0.5820757 0.1791863 -0.02919332
[2,] 0.4519231 -0.3705022 -0.7419178 0.32871206
[3,] 0.3224164 -0.5095786 0.1002281 -0.79141115
[4,] 0.2521612 -0.5140483 0.6382825 0.51455275
Let \( x_1 \) denote the first column of the $vectors output, i.e. \( [0.793, 0.452, 0.322, 0.252]^T \). This is the first eigenvector, corresponding to the eigenvalue \( 1.50 \). Thus,
\[ H_4x_1 = 1.50x_1 \]
Denoting the second and third columns of $vectors by \( x_2 \) and \( x_3 \), we have
\[ H_4x_2 = 0.169x_2 \] \[ H_4x_3 = 0.00674x_3 \]
and
\[ H_4x_4 = 0.0000967x_4. \]
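We can verify one of these relations numerically; the difference below should be (numerically) zero:
e <- eigen(H4)
H4 %*% e$vectors[, 1] - e$values[1] * e$vectors[, 1]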
Additional matrix decompositions and related linear algebra topics are covered in the rest of this chapter.
Matrix operations become increasingly expensive as matrices grow large. Therefore, whenever possible we work with matrices that have special structure in order to reduce the amount of computation required.
The singular value decomposition of a square matrix A consists of three square matrices, \( U \), \( D \) and \( V \) . The matrix \( D \) is a diagonal matrix. The relation among these matrices is \[ A=UDV^T \]
The matrices \( U \) and \( V \) are said to be orthogonal, which means that \( U^{-1} = U^T \) and \( V^{-1} = V^T \).
The singular value decomposition of a matrix is often used to obtain accurate solutions to linear systems of equations.
The elements of D are called the singular values of A. Note that
\[ A^TA=(UDV^T)^TUDV^T=VD^TU^TUDV^T=VD^2V^{-1} \]
where \( D \) is a diagonal matrix with non-negative numbers. This is a similarity transformation which tells us that the squares of the singular values of \( A \) are the eigenvalues of \( A^TA \).
The singular value decomposition can be obtained using the function svd().
For example, the singular value decomposition of the \( 4 \times 4 \) Hilbert matrix \( H_4 \) is
H4.svd<-svd(H4)
H4.svd
$d
[1] 1.5002142801 0.1691412202 0.0067382736 0.0000967023
$u
[,1] [,2] [,3] [,4]
[1,] -0.7926083 0.5820757 -0.1791863 -0.02919332
[2,] -0.4519231 -0.3705022 0.7419178 0.32871206
[3,] -0.3224164 -0.5095786 -0.1002281 -0.79141115
[4,] -0.2521612 -0.5140483 -0.6382825 0.51455275
$v
[,1] [,2] [,3] [,4]
[1,] -0.7926083 0.5820757 -0.1791863 -0.02919332
[2,] -0.4519231 -0.3705022 0.7419178 0.32871206
[3,] -0.3224164 -0.5095786 -0.1002281 -0.79141115
[4,] -0.2521612 -0.5140483 -0.6382825 0.51455275
We can verify that these components can be multiplied in the appropriate way to reconstruct \( H4 \).
H4.svd$u %*% diag(H4.svd$d) %*% t(H4.svd$v)
[,1] [,2] [,3] [,4]
[1,] 1.0000000 0.5000000 0.3333333 0.2500000
[2,] 0.5000000 0.3333333 0.2500000 0.2000000
[3,] 0.3333333 0.2500000 0.2000000 0.1666667
[4,] 0.2500000 0.2000000 0.1666667 0.1428571
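We can also confirm that the squares of the singular values of \( H_4 \) match the eigenvalues of \( H_4^TH_4 \), as claimed above:
H4.svd$d^2
eigen(crossprod(H4))$values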
Because of the properties of the U, V and D matrices, the singular value decomposition provides a simple way to compute a matrix inverse. For example, \( H_4^{-1}=VD^{-1}U^T \) and can be recalculated as
H4.svd$v %*% diag(1/H4.svd$d) %*% t(H4.svd$u)
[,1] [,2] [,3] [,4]
[1,] 16 -120 240 -140
[2,] -120 1200 -2700 1680
[3,] 240 -2700 6480 -4200
[4,] -140 1680 -4200 2800
If a matrix A is positive definite, it possesses a square root. In fact, there are usually several matrices B such that \( B^2 = A \). The Choleski decomposition is similar, but the idea is to find an upper triangular matrix U such that \( U^TU = A \).
The function chol() accomplishes this task.
For example, we can compute the Choleski decomposition of the \( 4 \times 4 \) Hilbert matrix.
H4.chol <- chol(H4)
H4.chol
h1 h2 h3 h4
[1,] 1 0.5000000 0.3333333 0.25000000
[2,] 0 0.2886751 0.2886751 0.25980762
[3,] 0 0.0000000 0.0745356 0.11180340
[4,] 0 0.0000000 0.0000000 0.01889822
crossprod(H4.chol, H4.chol) # Multiplying U^T U
h1 h2 h3 h4
h1 1.0000000 0.5000000 0.3333333 0.2500000
h2 0.5000000 0.3333333 0.2500000 0.2000000
h3 0.3333333 0.2500000 0.2000000 0.1666667
h4 0.2500000 0.2000000 0.1666667 0.1428571
Once the Choleski decomposition of a matrix A = UTU has been obtained, we can compute the inverse of A using the fact that \( A^{-1}=U^{-1}U^{-T} \) (where \( U^{-T} \) is a short way to write the transpose of \( U^{-1} \)).
This computation is much more stable than direct calculation of \( A^{-1} \) by Gaussian elimination.
The function chol2inv() does this calculation.
For example, we can compute the inverse of \( H4 \) as
chol2inv(H4.chol)
[,1] [,2] [,3] [,4]
[1,] 16 -120 240 -140
[2,] -120 1200 -2700 1680
[3,] 240 -2700 6480 -4200
[4,] -140 1680 -4200 2800
Once the Choleski decomposition has been obtained, we can compute solutions to linear systems of the form
\[ Ax=b \]
If \( A = U^TU \), then we see that \( Ux = U^{-T}b \). Therefore, the solution x can be obtained in a two step procedure:
Solve \( U^Ty = b \) for y. The solution will satisfy \( y = U^{-T}b \).
Solve \( Ux = y \).
The first system is lower triangular, so forward elimination can be used to solve it.
The function forwardsolve() can be used for this.
The second system is upper triangular, so back substitution using function backsolve() can be used.
For the problem \( H_4x = b \), where \( b = [1 2 3 2]^T \) , we can proceed as follows:
b <- c(1, 2, 3, 2)
y <- forwardsolve(t(H4.chol), b)
backsolve(H4.chol, y)
[1] 216 -2460 5880 -3780
Lastly, we will learn about an important decomposition used in regression analysis.
This time, \[ A=QR \]
where \( Q \) is an orthogonal matrix, and \( R \) is an upper triangular matrix.
This decomposition can be applied even if \( A \) is not square.
Again, this decomposition can be used to obtain accurate solutions to linear systems of equations.
For example, suppose we want to solve
\[ Ax=b \]
for \( x \), given the \( n \times n \) matrix \( A \) and the n-vector \( b \). If we compute the QR decomposition of \( A \) first, we can write
\[ QRx = b \]
Multiplying through by \( Q^T \) on the left gives
\[ Rx=Q^Tb \]
This is an easier system to solve, because R is an upper triangular matrix.
To obtain the decomposition, we use the qr() function.
H4.qr <- qr(H4)
The functions qr.Q() and qr.R() can be applied to this object to obtain the explicit Q and R matrices.
For our example, we have
Q <- qr.Q(H4.qr)
Q
[,1] [,2] [,3] [,4]
[1,] -0.8381164 0.5226484 -0.1539728 -0.02630668
[2,] -0.4190582 -0.4417133 0.7277538 0.31568019
[3,] -0.2793721 -0.5288214 -0.1395055 -0.78920046
[4,] -0.2095291 -0.5020717 -0.6536092 0.52613364
R <- qr.R(H4.qr)
R
h1 h2 h3 h4
[1,] -1.193152 -0.6704931 -0.474932601 -0.3698354709
[2,] 0.000000 -0.1185333 -0.125655095 -0.1175419928
[3,] 0.000000 0.0000000 -0.006221774 -0.0095660929
[4,] 0.000000 0.0000000 0.000000000 0.0001879049
We can recover \( H_4 \) by multiplying \( Q \) by \( R \):
Q %*% R
h1 h2 h3 h4
[1,] 1.0000000 0.5000000 0.3333333 0.2500000
[2,] 0.5000000 0.3333333 0.2500000 0.2000000
[3,] 0.3333333 0.2500000 0.2000000 0.1666667
[4,] 0.2500000 0.2000000 0.1666667 0.1428571
The inverse of \( H4 \) can be obtained quickly from \( R^{-1}Q^T \) , since R is upper triangular.
qr.solve(R) %*% t(Q)
[,1] [,2] [,3] [,4]
h1 16 -120 240 -140
h2 -120 1200 -2700 1680
h3 240 -2700 6480 -4200
h4 -140 1680 -4200 2800
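Similarly, we can solve \( H_4x = b \) directly from the QR factors, since only a back substitution is required (qr.solve(H4, b) does the same thing in one step):
b <- c(1, 2, 3, 2)
backsolve(R, crossprod(Q, b))   # solves R x = Q^T b; reproduces 216, -2460, 5880, -3780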