Matrix Trace

Although simple to define, the matrix trace has many properties that relate it to other matrix operations, and it appears often in statistical methods. For example, it is used in maximum likelihood estimation of the covariance matrix of a multivariate normal distribution, where it simplifies the derivatives of quadratic forms. Another useful property is that the trace of the sample covariance matrix equals the total sample variance. This post introduces and explores several properties of the trace of a matrix.

The trace of an \(n \times n\) matrix (square matrix) is the sum of the diagonal elements of the matrix. The trace is typically denoted \(tr(A)\), where \(A\) is an \(n \times n\) matrix. Thus we can write the matrix trace as \(tr(A) = \sum^n_{i=1} a_{ii}\).

The following example matrices were taken from problem 2.15 in the book Methods of Multivariate Analysis by Alvin Rencher.

An Example of the Matrix Trace

Consider the matrix:

\[A = \begin{bmatrix} 5 & 4 & 4 \\ 2 & -3 & 1 \\ 3 & 7 & 2 \end{bmatrix}\]

A <- matrix(c(5, 4, 4,
              2, -3, 1,
              3, 7, 2), nrow = 3, byrow = TRUE)
A

##      [,1] [,2] [,3]
## [1,]    5    4    4
## [2,]    2   -3    1
## [3,]    3    7    2

The trace is the sum of the diagonal elements of the matrix and thus equals \(5 + (-3) + 2 = 4\).

Computing the Matrix Trace in R

Base R has no dedicated function for calculating the trace of a matrix; however, the trace can be found by summing the diagonal elements returned by diag().

sum(diag(A))

## [1] 4
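The same idiom can be used to check the property mentioned in the introduction, that the trace of a sample covariance matrix equals the total sample variance. The sketch below is only illustrative: the data matrix X and the names S and total_var are hypothetical and do not come from the Rencher example.

```r
# Hypothetical data: 5 observations on 3 variables (values chosen only for illustration)
X <- matrix(c(2, 4, 4, 5, 7,
              1, 3, 2, 6, 8,
              9, 7, 8, 5, 4),
            nrow = 5)

S <- cov(X)                         # sample covariance matrix
total_var <- sum(apply(X, 2, var))  # sum of the per-variable sample variances

# Compare with a tolerance rather than ==, since these are floating-point values
all.equal(sum(diag(S)), total_var)
## [1] TRUE
```

The diagonal of the covariance matrix holds the variance of each variable, so summing it gives the total variance directly.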

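We can also write our own simple function to compute the trace of a matrix. Note that defining a function named trace masks base R's debugging function of the same name for the rest of the session.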

trace <- function(A) {
  # The trace is only defined for square matrices
  stopifnot(is.matrix(A), nrow(A) == ncol(A))
  n <- nrow(A) # dimension of the matrix
  tr <- 0 # initialize trace value

  # Loop over the diagonal elements of the supplied matrix and add each one to tr
  for (k in seq_len(n)) {
    tr <- tr + A[k, k]
  }
  return(tr)
}

trace(A)

## [1] 4
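Because base R already ships a debugging utility called trace(), a shorter alternative under a different name may be preferable in practice. The following is a minimal sketch, not a standard function: the name mat_trace is hypothetical, and the squareness check is an added assumption about how invalid input should be handled.

```r
# mat_trace is a hypothetical name, chosen to avoid masking base::trace()
mat_trace <- function(A) {
  # The trace is only defined for square matrices
  stopifnot(is.matrix(A), nrow(A) == ncol(A))
  sum(diag(A))
}

# The example matrix from above, entered column by column
A <- matrix(c(5, 2, 3, 4, -3, 7, 4, 1, 2), nrow = 3)
mat_trace(A)
## [1] 4
```

This version relies on the vectorized sum(diag(A)) instead of an explicit loop and stops with an error when given a non-square matrix.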

Some Properties of the Matrix Trace

The trace has several interesting properties that are often leveraged in a variety of statistical methods. Consider the previous matrix \(A\) and a new matrix \(B\).

\[A = \begin{bmatrix} 5 & 4 & 4 \\ 2 & -3 & 1 \\ 3 & 7 & 2 \end{bmatrix}\]

\[B = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 2 & 3 \end{bmatrix}\]

B <- matrix(c(1, 0, 1,
              0, 1, 0,
              1, 2, 3), nrow = 3, byrow = TRUE)
B

##      [,1] [,2] [,3]
## [1,]    1    0    1
## [2,]    0    1    0
## [3,]    1    2    3

The trace of the sum of two square matrices of the same size is the sum of the traces of the two matrices.

\[ tr(A + B) = tr(A) + tr(B) \]

trace(A + B) == trace(A) + trace(B)

## [1] TRUE

The trace of a matrix equals the trace of the matrix’s transpose.

\[ tr(A) = tr(A^T) \]

trace(A) == trace(t(A))

## [1] TRUE

The trace of a matrix multiplied by a scalar equals the scalar multiplied by the trace of the matrix.

\[ tr(\alpha A) = \alpha tr(A) \]

alpha <- 3
trace(alpha * A) == alpha * trace(A)

## [1] TRUE

The trace is cyclic: the trace of a product is unchanged when the order of the two factors is swapped.

\[ tr(AB) = tr(BA) \]

trace(A %*% B) == trace(B %*% A)

## [1] TRUE
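The cyclic property extends to products of more than two matrices, but only for cyclic shifts of the factors, not for arbitrary reorderings. The sketch below illustrates this with three small matrices; P, Q, R and the helper tr are hypothetical names introduced only for this example.

```r
# Hypothetical 2 x 2 matrices, chosen only for illustration
P <- matrix(c(1, 2,
              3, 4), nrow = 2, byrow = TRUE)
Q <- matrix(c(0, 1,
              1, 0), nrow = 2, byrow = TRUE)
R <- matrix(c(1, 0,
              0, 2), nrow = 2, byrow = TRUE)

tr <- function(M) sum(diag(M))  # local helper so the block runs on its own

# Cyclic shifts of the product leave the trace unchanged
c(tr(P %*% Q %*% R), tr(Q %*% R %*% P), tr(R %*% P %*% Q))
## [1] 8 8 8

# Swapping two factors is not a cyclic shift, and the trace can change
tr(P %*% R %*% Q)
## [1] 7
```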

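The trace is also invariant under a similarity transformation by an invertible matrix \(B\).

\[ tr(A) = tr(BAB^{-1}) \]

# Compare with a tolerance, since solve() introduces floating-point rounding
all.equal(trace(A), trace(B %*% A %*% solve(B)))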

## [1] TRUE
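One way to see why this invariance holds, assuming \(B\) is invertible, is to apply the cyclic property with \(BA\) and \(B^{-1}\) as the two factors:

\[ tr(BAB^{-1}) = tr\big((BA)B^{-1}\big) = tr\big(B^{-1}(BA)\big) = tr\big((B^{-1}B)A\big) = tr(A) \]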

Summary

The matrix trace is simple to define but underlies many core statistical methods, so it is worth exploring briefly. The trace will appear again in future posts on eigenvalues and eigenvectors as well as on maximum likelihood estimation for the multivariate normal distribution. More information on the matrix trace can be found in this Quora discussion.

References

The trace of a matrix. (2012). Retrieved from http://www.public.iastate.edu/~dnett/S611/15Trace.pdf

Trace (linear algebra). (2016). In Wikipedia. Retrieved from https://en.wikipedia.org/wiki/Trace_(linear_algebra)

Weisstein, E. W. (2002, November 16). Matrix trace. Retrieved from http://mathworld.wolfram.com/MatrixTrace.html