\(\def\a{\boldsymbol{a}}\) \(\def\b{\boldsymbol{b}}\) \(\def\c{\boldsymbol{c}}\) \(\def\f{\boldsymbol{f}}\) \(\def\g{\boldsymbol{g}}\) \(\def\h{\boldsymbol{h}}\) \(\def\j{\boldsymbol{j}}\) \(\def\u{\boldsymbol{u}}\) \(\def\x{\boldsymbol{x}}\) \(\def\y{\boldsymbol{y}}\) \(\def\A{\boldsymbol{\mathrm{A}}}\) \(\def\B{\boldsymbol{\mathrm{B}}}\) \(\def\C{\boldsymbol{\mathrm{C}}}\) \(\def\D{\boldsymbol{\mathrm{D}}}\) \(\def\E{\boldsymbol{\mathrm{E}}}\) \(\def\I{\boldsymbol{\mathrm{I}}}\) \(\def\J{\boldsymbol{\mathrm{J}}}\) \(\def\M{\boldsymbol{\mathrm{M}}}\) \(\def\O{\boldsymbol{\mathrm{O}}}\) \(\def\P{\boldsymbol{\mathrm{P}}}\) \(\def\Q{\boldsymbol{\mathrm{Q}}}\) \(\def\T{\boldsymbol{\mathrm{T}}}\) \(\def\U{\boldsymbol{\mathrm{U}}}\) \(\def\X{\boldsymbol{\mathrm{X}}}\) \(\def\zeros{\boldsymbol{0}}\) \(\def\diag{\mathrm{diag}}\) \(\def\rank{\mathrm{rank}}\) \(\def\trace{\mathrm{tr}}\) \(\def\tr{^\top}\) \(\def\ds{\displaystyle}\) \(\def\bea{\begin{eqnarray}}\) \(\def\nnn{\nonumber}\) \(\def\eea{\nnn\end{eqnarray}}\)

2.1 Matrix and Vector Notation

Consider the \(2\times 3\) matrix \(\A=\left(\begin{array}{ccc} 4 & 6 & -1 \\ 3 & 0 & 1\end{array}\right)\). In R, \(\A\) can be created in several equivalent ways: by binding columns or rows together with cbind or rbind, or by filling a matrix by column (the default) or by row.

A=cbind(c(4,3),c(6,0),c(-1,1))
A=matrix(c(4,3,6,0,-1,1),2,3)
A=rbind(c(4,6,-1),c(3,0,1))
A=matrix(c(4,6,-1,3,0,1),2,3,byrow=TRUE)
A
##      [,1] [,2] [,3]
## [1,]    4    6   -1
## [2,]    3    0    1

Technically, vectors and matrices are different types of objects in R, as shown below.

x=c(2,-5,3.4,0)
x
## [1]  2.0 -5.0  3.4  0.0
is.vector(x)
## [1] TRUE
is.matrix(x)
## [1] FALSE
X=matrix(x)
X
##      [,1]
## [1,]  2.0
## [2,] -5.0
## [3,]  3.4
## [4,]  0.0
is.vector(X)
## [1] FALSE
is.matrix(X)
## [1] TRUE

Now, we illustrate syntax in R for computing and working with some of the definitions. The transpose of A can be computed using the function t.

t(A)
##      [,1] [,2]
## [1,]    4    3
## [2,]    6    0
## [3,]   -1    1

Here is the diagonal of A, obtained with the function diag.

diag(A)
## [1] 4 0

The function diag can also be used to create a square diagonal matrix with the diagonal elements specified by an input vector.

diag(c(1,2))
##      [,1] [,2]
## [1,]    1    0
## [2,]    0    2

So combining these two uses of diag yields a square matrix whose diagonal is the diagonal of \(\A\) and whose off-diagonal elements are zero.

diag(diag(A))
##      [,1] [,2]
## [1,]    4    0
## [2,]    0    0

The function rep is an easy way to create \(\j\).

j=rep(1,3);j
## [1] 1 1 1

The function diag also provides a convenient way to create \(\I\) because, when its input is a single number \(n\), it returns the \(n\times n\) identity matrix.

I=diag(4);I
##      [,1] [,2] [,3] [,4]
## [1,]    1    0    0    0
## [2,]    0    1    0    0
## [3,]    0    0    1    0
## [4,]    0    0    0    1

If we actually want to create a \(1\times 1\) matrix with entry \(4\), we can do so directly.

matrix(4,1,1)
##      [,1]
## [1,]    4

The function matrix can also be used to create \(\O\).

O=matrix(0,2,3);O
##      [,1] [,2] [,3]
## [1,]    0    0    0
## [2,]    0    0    0

2.2 Operations

The sum of two \(m\times n\) matrices is defined by \(\C=\A+\B\) where \(\C=(c_{ij})=(a_{ij}+b_{ij})\), and the difference is defined by \(\D=\A-\B\) where \(\D=(d_{ij})=(a_{ij}-b_{ij})\). Here we say that \(\A\) and \(\B\) are conformal for addition.

A=rbind(c(6,-1),c(0,4))
B=rbind(c(1,0),c(2,1))
A+B
##      [,1] [,2]
## [1,]    7   -1
## [2,]    2    5

So, we see that \(\A+\B=\left(\begin{array}{cc} 7 & -1 \\ 2 & 5 \end{array}\right)\).
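
Subtraction works the same way; as a quick added check (this example is not in the original text), here is \(\A-\B\).

A-B
##      [,1] [,2]
## [1,]    5   -1
## [2,]   -2    3

So, \(\A-\B=\left(\begin{array}{cc} 5 & -1 \\ -2 & 3 \end{array}\right)\).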

The product of a scalar and a matrix can be computed using the * operator, since * performs elementwise operations.

3*A
##      [,1] [,2]
## [1,]   18   -3
## [2,]    0   12

So, we see that \(3\A=\left(\begin{array}{cc} 18 & -3 \\ 0 & 12 \end{array}\right)\).

A matrix product is computed in R with the %*% operator.

A%*%B
##      [,1] [,2]
## [1,]    4   -1
## [2,]    8    4

So, we see that \(\A\B=\left(\begin{array}{cc} 4 & -1 \\ 8 & 4 \end{array}\right)\).

It is important to note that the * operator instead computes the elementwise product.

A*B
##      [,1] [,2]
## [1,]    6    0
## [2,]    0    4

The elementwise product is often useful for products involving vectors. For instance, suppose \(\x=\left(\begin{array}{c} 1 \\ 2 \\ -2\end{array}\right)\). Here are two different ways to compute the square of the length of \(\x\).

x=c(1,2,-2)
t(x)%*%x
##      [,1]
## [1,]    9
x*x
## [1] 1 4 4
sum(x*x)
## [1] 9

So, we see that \(\x'\x=\ds{\sum_{i=1}^3 x_i^2}=9\). Note that, in R, the first expression is treated as a matrix while the second is treated as a vector.

sqrt(sum(x^2))
## [1] 3

So, the length of \(\x\) is \(\sqrt{9}=3\).

2.3 Partitioned Matrices

2.4 Rank

Using these types of operations, we can obtain a convenient “row equivalent” matrix. \[ \rank(\A)=\rank(\T_1\A), \ \ \ \ \ \T_1\A= \left(\begin{array}{ccc}0 & 1 & 0\\ 1 & 0 & 0 \\ 0 & 0 & 1\end{array}\right) \left(\begin{array}{ccc}0 & 3 & -6\\ 1 & -1 & 3 \\ 3 & -1 & 5\end{array}\right) = \left(\begin{array}{ccc}1 & -1 & 3\\ 0 & 3 & -6 \\ 3 & -1 & 5\end{array}\right) \] \[ \rank(\T_1\A)=\rank(\T_2\T_1\A), \ \ \ \ \ \T_2\T_1\A= \left(\begin{array}{ccc}1 & 0 & 0\\ 0 & 1 & 0 \\ -3 & 0 & 1\end{array}\right) \left(\begin{array}{ccc}1 & -1 & 3\\ 0 & 3 & -6 \\ 3 & -1 & 5\end{array}\right) = \left(\begin{array}{ccc}1 & -1 & 3\\ 0 & 3 & -6 \\ 0 & 2 & -4\end{array}\right) \] \[ \rank(\T_2\T_1\A)=\rank(\T_3\T_2\T_1\A), \ \T_3\T_2\T_1\A= \left(\begin{array}{ccc}1 & 0 & 0\\ 0 & \frac{1}{3} & 0 \\ 0 & 0 & 1\end{array}\right) \left(\begin{array}{ccc}1 & -1 & 3\\ 0 & 3 & -6 \\ 0 & 2 & -4\end{array}\right) = \left(\begin{array}{ccc}1 & -1 & 3\\ 0 & 1 & -2 \\ 0 & 2 & -4\end{array}\right) \] \[ \rank(\T_3\T_2\T_1\A)=\rank(\T_4\T_3\T_2\T_1\A), \ \T_4\T_3\T_2\T_1\A= \left(\begin{array}{ccc}1 & 0 & 0\\ 0 & 1 & 0 \\ 0 & 0 & \frac{1}{2}\end{array}\right) \left(\begin{array}{ccc}1 & -1 & 3\\ 0 & 1 & -2 \\ 0 & 2 & -4\end{array}\right) = \left(\begin{array}{ccc}1 & -1 & 3\\ 0 & 1 & -2 \\ 0 & 1 & -2\end{array}\right) \] \[ \rank(\T_4\T_3\T_2\T_1\A)=\rank(\T_5\T_4\T_3\T_2\T_1\A), \ \T_5\T_4\T_3\T_2\T_1\A= \left(\begin{array}{ccc}1 & 0 & 0\\ 0 & 1 & 0 \\ 0 & -1 & 1\end{array}\right) \left(\begin{array}{ccc}1 & -1 & 3\\ 0 & 1 & -2 \\ 0 & 1 & -2\end{array}\right) = \left(\begin{array}{ccc}1 & -1 & 3\\ 0 & 1 & -2 \\ 0 & 0 & 0\end{array}\right) \] \[ \rank(\T_5\T_4\T_3\T_2\T_1\A)=\rank(\T_6\T_5\T_4\T_3\T_2\T_1\A), \ \T_6\T_5\T_4\T_3\T_2\T_1\A= \left(\begin{array}{ccc}1 & 1 & 0\\ 0 & 1 & 0 \\ 0 & 0 & 1\end{array}\right) \left(\begin{array}{ccc}1 & -1 & 3\\ 0 & 1 & -2 \\ 0 & 0 & 0\end{array}\right) = \left(\begin{array}{ccc}1 & 0 & 1\\ 0 & 1 & -2 \\ 0 & 0 & 0\end{array}\right) \] Since \(\T_6\T_5\T_4\T_3\T_2\T_1\A\) has two rows with leading 1’s, \(\rank(\A)=\rank(\T_6\T_5\T_4\T_3\T_2\T_1\A)=2\).
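
These elementary row operations can also be carried out numerically. The following sketch (the object names T1 through T6 are only illustrative) builds the elementary matrices in R and reproduces the row-equivalent form obtained above.

A=matrix(c(0,3,-6, 1,-1,3, 3,-1,5),3,3,byrow=TRUE)
T1=rbind(c(0,1,0),c(1,0,0),c(0,0,1))   # interchange rows 1 and 2
T2=rbind(c(1,0,0),c(0,1,0),c(-3,0,1))  # subtract 3 times row 1 from row 3
T3=diag(c(1,1/3,1))                    # divide row 2 by 3
T4=diag(c(1,1,1/2))                    # divide row 3 by 2
T5=rbind(c(1,0,0),c(0,1,0),c(0,-1,1))  # subtract row 2 from row 3
T6=rbind(c(1,1,0),c(0,1,0),c(0,0,1))   # add row 2 to row 1
T6%*%T5%*%T4%*%T3%*%T2%*%T1%*%A
##      [,1] [,2] [,3]
## [1,]    1    0    1
## [2,]    0    1   -2
## [3,]    0    0    0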

library(Matrix)
A_2_4_1=matrix(c(0, 3, -6, 1, -1, 3, 3, -1, 5),3,3,byrow=TRUE)
A_2_4_1
##      [,1] [,2] [,3]
## [1,]    0    3   -6
## [2,]    1   -1    3
## [3,]    3   -1    5
rankMatrix(A_2_4_1)[1]
## [1] 2

2.5 Inverse

So, we have the result \(\A^{-1}=\ds{\left(\begin{array}{cc} \frac{4}{3} & -\frac{5}{3} \\ -\frac{1}{3} & \frac{2}{3}\end{array}\right)}\).

For \(2\times 2\) matrices, there is a shortcut formula \(\left(\begin{array}{cc} a_{11} & a_{12} \\ a_{21} & a_{22}\end{array}\right)^{-1}=\frac{1}{a_{11}a_{22}-a_{12}a_{21}}\left(\begin{array}{cc} a_{22} & -a_{12} \\ -a_{21} & a_{11}\end{array}\right)\). So, we see that \(\A^{-1}=\frac{1}{3}\left(\begin{array}{cc} 4 & -5 \\ -1 & 2\end{array}\right)\). In R, the inverse of a nonsingular matrix can be computed with the function solve.

A=cbind(c(2,1),c(5,4))
solve(A)
##            [,1]       [,2]
## [1,]  1.3333333 -1.6666667
## [2,] -0.3333333  0.6666667
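
As an added check (not in the original text), multiplying the computed inverse by \(\A\) should recover the identity matrix (rounding removes floating-point error), and the \(2\times 2\) shortcut gives the same matrix as solve.

round(solve(A)%*%A, 10)                # should be the identity matrix
##      [,1] [,2]
## [1,]    1    0
## [2,]    0    1
(1/(2*4-5*1))*rbind(c(4,-5),c(-1,2))   # the 2x2 shortcut formula
##            [,1]       [,2]
## [1,]  1.3333333 -1.6666667
## [2,] -0.3333333  0.6666667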

2.6 Positive Definite Matrices

The Cholesky factorization of a positive definite matrix can be computed with the function chol, which returns an upper triangular matrix \(\T\) such that \(\T'\T=\A\).

A=cbind(c(5,0,1),c(0,3,0),c(1,0,1))
T=chol(A); T
##          [,1]     [,2]      [,3]
## [1,] 2.236068 0.000000 0.4472136
## [2,] 0.000000 1.732051 0.0000000
## [3,] 0.000000 0.000000 0.8944272

So let \(\T=\left(\begin{array}{ccc} \sqrt{5} & 0 & \frac{1}{\sqrt{5}} \\ 0 & \sqrt{3} & 0 \\ 0 & 0 & \frac{2}{\sqrt{5}}\end{array}\right)\). The following command checks that \(\T'\T=\A\).

t(T)%*%T
##      [,1] [,2] [,3]
## [1,]    5    0    1
## [2,]    0    3    0
## [3,]    1    0    1
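
Because \(\A\) is symmetric, an added way (not shown in the original text) to confirm that it is positive definite is to check that all of its eigenvalues are positive.

all(eigen(A)$values > 0)   # TRUE if and only if the symmetric matrix A is positive definite
## [1] TRUE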

2.7 Systems of Equations

plot(-5:5,-5:5,type="n",xlab=expression(italic(x)[1]),ylab=expression(italic(x)[2]),family="serif",cex.lab=1.5)
abline(h=-2,col="red") #this option draws a horizontal line at x2=-2
abline(a=-3,b=1,col="blue") #a is the y-intercept and b is the slope of the line x2=x1-3
x1=c(-5,5); x2=-5+3*x1; points(x1,x2,type="l",col="#FFA200") #this shows how to plot the line directly with two points 
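
Reading the plotted lines as the equations \(x_2=-2\), \(-x_1+x_2=-3\), and \(-3x_1+x_2=-5\) (an interpretation of the code above, not stated in the original text), all three pass through a single point. The sketch below finds it by solving the first two equations with the function solve.

M=rbind(c(0,1),c(-1,1))   # coefficients of x1 and x2 in the first two equations
b=c(-2,-3)                # right-hand sides
solve(M,b)                # the common point (x1, x2)
## [1]  1 -2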

2.8 Generalized Inverse

There are many other generalized inverses of \(\A\) such as \(\A_2^{-}=\left(\begin{array}{ccc} -1 & 0 & 4\\ 0 & 0 & 0 \\ 1 & 0 & -3\end{array}\right)\) and \(\A_3^{-}=\left(\begin{array}{ccc} 0 & 0 & 0\\ 0 & 0.5 & -3.5 \\ 0 & 0 & 1\end{array}\right)\).
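
Each of these satisfies the defining property \(\A\A^{-}\A=\A\). As a separate, hedged sketch (the matrix below is only an illustration and is not the \(\A\) of this section), the Moore-Penrose generalized inverse of a singular matrix can be computed with the function ginv from the MASS package and checked against this property.

library(MASS)
B=rbind(c(1,2),c(2,4))   # a singular (rank 1) matrix used only for illustration
G=ginv(B)                # Moore-Penrose generalized inverse of B
round(B%*%G%*%B, 10)     # reproduces B, as required of a generalized inverse
##      [,1] [,2]
## [1,]    1    2
## [2,]    2    4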

2.9 Determinants

The determinant of an \(n\times n\) matrix \(\A=(a_{ij})\) is \(\ds{\det(\A)=\sum_{j=1}^n (-1)^{i+j} a_{ij}\det(\M_{ij})}\) where \(i\) is any integer between \(1\) and \(n\) and \(\M_{ij}\) is the submatrix formed by eliminating the \(i\)th row and \(j\)th column of \(\A\).
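
This recursive definition translates directly into R. The following is a minimal sketch (the function name cofactor_det is only illustrative), expanding along the first row (\(i=1\)); it reproduces the value obtained below with det.

cofactor_det=function(A){
  n=nrow(A)
  if(n==1) return(A[1,1])        # determinant of a 1 x 1 matrix is its single entry
  total=0
  for(j in 1:n){
    Mij=A[-1,-j,drop=FALSE]      # submatrix with row 1 and column j removed
    total=total+(-1)^(1+j)*A[1,j]*cofactor_det(Mij)
  }
  total
}
cofactor_det(rbind(c(2,4,6),c(2,-1,4),c(0,1,5)))
## [1] -46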

The notation \(\left| \A \right|\) is sometimes also used to represent the determinant of \(\A\).

\[\bea \nnn \left|\begin{array}{ccc} 2 & 4 & 6\\ 2 & -1 & 4 \\ 0 & 1 & 5\end{array}\right|&=& 0 \left|\begin{array}{cc} 4 & 6\\ -1 & 4\end{array}\right| -1 \left|\begin{array}{cc} 2 & 6\\ 2 & 4\end{array}\right| +5 \left|\begin{array}{cc} 2 & 4\\ 2 & -1\end{array}\right| \\ \nnn &=& 0 - 1(8-12) + 5(-2-8) \\ \nnn &=& 0+4-50 = -46. \eea\]

\[\bea \nnn \det(\T_4\T_3\T_2\T_1\A)&=&\det(\T_4)\det(\T_3)\det(\T_2)\det(\T_1)\det(\A) \\ \nnn 23 &=& (1)(-1)(1)\left(\frac{1}{2}\right)\det(\A) \\ \nnn \det(\A) &=& 23(-2)=-46. \eea\]

A=rbind(c(2,4,6),c(2,-1,4),c(0,1,5))
det(A)
## [1] -46

2.10 Orthogonal Vectors and Matrices

2.11 Trace

2.12 Eigenvalues and Eigenvectors

A=rbind(c(5,2),c(2,2))
eigen.A=eigen(A)
lambda=eigen.A$values;lambda
## [1] 6 1
U=eigen.A$vectors;U
##            [,1]       [,2]
## [1,] -0.8944272  0.4472136
## [2,] -0.4472136 -0.8944272

The eigenvalues are stored in the vector lambda and the respective eigenvectors are stored in the columns of U. The following command verifies that the spectral decomposition equals A.

U%*%diag(lambda)%*%t(U)
##      [,1] [,2]
## [1,]    5    2
## [2,]    2    2
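
Since \(\A\) is symmetric, the eigenvectors returned by eigen are orthonormal, so \(\U\) is an orthogonal matrix. A quick added check (rounding removes floating-point error):

round(t(U)%*%U, 10)   # should be the identity matrix
##      [,1] [,2]
## [1,]    1    0
## [2,]    0    1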

2.13 Idempotent Matrices

2.14 Vector and Matrix Calculus
