[,1] [,2]
[1,] 1 2
[2,] 3 4
det(A)
[1] -2
Either look at the math camp notes or at some simplified online blog posts.
Thus, you should always check your variables' values and look for duplicate rows or linearly dependent columns in your data, so you know whether you can estimate the “beta” coefficients of your linear regression: finding them requires inverting a matrix built from your data. Most statistical software will drop duplicates automatically, but it will give you an error if the matrix it needs to invert is not invertible.
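A minimal sketch of such a check in R, using a hypothetical data frame df in which the second column is a multiple of the first and the last row duplicates the one before it:
df <- data.frame(x1 = c(1, 2, 3, 3), x2 = c(2, 4, 6, 6))
any(duplicated(df))      # TRUE: row 4 is a duplicate of row 3
X <- as.matrix(df)
qr(crossprod(X))$rank    # rank of X'X is 1 (< 2 columns), so it cannot be inverted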
Will you be able to invert these four matrices?
\[ \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ \end{bmatrix} \]
Yes, you can invert matrix A, as its determinant is not 0 (det(A) = -2): it is a non-singular matrix.
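The chunk that defines A is not shown in these notes; reconstructed from the printed output above, it would look like this (a sketch):
A <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2, byrow = TRUE)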
solve(A) # inverse of A
[,1] [,2]
[1,] -2.0 1.0
[2,] 1.5 -0.5
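The two near-identity matrices printed below are presumably the result of multiplying A by its inverse in both orders (the exact calls are not shown); the off-diagonal entries of order 1e-16 are floating-point noise, i.e. numerically zero.
A %*% solve(A)   # should return the 2 x 2 identity matrix
solve(A) %*% A   # should also return the identity matrix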
[,1] [,2]
[1,] 1 1.110223e-16
[2,] 0 1.000000e+00
[,1] [,2]
[1,] 1.000000e+00 4.440892e-16
[2,] -5.551115e-17 1.000000e+00
Combine rows under each other: rbind(). Combine columns after each other: cbind().
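A minimal sketch of these two operations with two small example matrices:
m1 <- matrix(1:4, nrow = 2)   # 2 x 2
m2 <- matrix(5:8, nrow = 2)   # 2 x 2
rbind(m1, m2)   # 4 x 2: rows of m2 stacked under the rows of m1
cbind(m1, m2)   # 2 x 4: columns of m2 placed after the columns of m1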
Loading required package: RConics
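The RConics package is loaded for its adjoint() function. The chunk defining the matrix R is not shown; reconstructed from the printed output below, it would be (a sketch):
R <- matrix(c(1, 0, 1,
              2, 2, 1,
              0, 0, 3), nrow = 3, byrow = TRUE)
R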
[,1] [,2] [,3]
[1,] 1 0 1
[2,] 2 2 1
[3,] 0 0 3
adj_R <- adjoint(R) # Compute the classical adjoint (also called adjugate) of a square matrix. The adjoint is the transpose of the cofactor matrix.
adj_R
[,1] [,2] [,3]
[1,] 6 0 -2
[2,] -6 3 1
[3,] 0 0 2
det(R)
[1] 6
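The inverse is then recovered from the standard identity relating the inverse to the adjugate and the determinant:
\[ R^{-1} = \frac{1}{\det(R)} \, \operatorname{adj}(R) \]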
R_inverse <- adj_R/det(R)
R_inverse
[,1] [,2] [,3]
[1,] 1 0.0 -0.3333333
[2,] -1 0.5 0.1666667
[3,] 0 0.0 0.3333333
solve(R)
[,1] [,2] [,3]
[1,] 1 0.0 -0.3333333
[2,] -1 0.5 0.1666667
[3,] 0 0.0 0.3333333
For matrix B below, row 1 and row 3 are identical!
\[ \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 1 & 2 & 3 \\ \end{bmatrix} \]
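The chunk defining B is not shown either; reconstructed from the output below, it would be (a sketch):
B <- matrix(c(1, 2, 3,
              4, 5, 6,
              1, 2, 3), nrow = 3, byrow = TRUE)
B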
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
[3,] 1 2 3
typeof(B)
[1] "double"
?det        # open the help page for det()
det(B)
[1] 0
??inverse   # search all help pages for "inverse"
# Use the solve() function to calculate the inverse.
# solve(B)
# Using tryCatch to catch and print the error message
result <- tryCatch({
  solve(B)
}, error = function(e) {
  cat("Error: ", conditionMessage(e), "\n")
})
Error: Lapack routine dgesv: system is exactly singular: U[3,3] = 0
\[ \begin{bmatrix} 1 & 2 & 2 \\ -2 & 0.5 & 0 \\ -2 & 5 & 4 \\ \end{bmatrix} \]
You may not be able to visually eyeball that the third row is a linear combination of the first two rows, with an equal weight of 2 on row 1 and row 2. But calculating the determinant identifies this (det(C) = 0), and thus you know you will not be able to calculate the inverse of matrix C: it does not exist. The solve(C) command gives us an error, as expected.
# Row 1
m11 <- 1
m12 <- 2
m13 <- 2
# Row 2
m21 <- -2
m22 <- .5
m23 <- 0
# Row 3 is a linear combination of Row 1 and Row 2
C <- matrix(data = c(m11, m12, m13,
                     m21, m22, m23,
                     2*m11 + 2*m21, 2*m12 + 2*m22, 2*m13 + 2*m23),
            nrow = 3,
            ncol = 3,
            byrow = TRUE)
C
[,1] [,2] [,3]
[1,] 1 2.0 2
[2,] -2 0.5 0
[3,] -2 5.0 4
det(C)
[1] 0
# solve(C)
# Using tryCatch to catch and print the error message
result <- tryCatch({
  solve(C)
}, error = function(e) {
  cat("Error: ", conditionMessage(e), "\n")
})
Error: Lapack routine dgesv: system is exactly singular: U[3,3] = 0
You may not be able to visually eyeball that the third column is a linear combination of the first two columns, with a weight of -2 on column 1 and a weight of 3 on column 2; the pattern is harder to spot here than for matrix C above. But again, calculating the determinant identifies this (det(D) = 0 in exact arithmetic; R reports a value on the order of 1e-14, which is pure floating-point error), and thus you know you will not be able to calculate the inverse of matrix D: it does not exist. The solve(D) command gives us an error, as expected.
\[ \begin{bmatrix} 1 & -6 & -20 \\ 2 & -1 & -7 \\ 3 & -3 & 3 \\ \end{bmatrix} \]
# Col 1
m11 <- 1
m21 <- 2
m31 <- 3
# Col 2
m12 <- -6
m22 <- -1
m32 <- 3
w_col1 <- -2
w_col2 <- 3
D <- matrix(data = c(m11, m21, m31,
                     m12, m22, m32,
                     w_col1*m11 + w_col2*m12, w_col1*m21 + w_col2*m22, w_col1*m31 + w_col2*m32),
            nrow = 3,
            ncol = 3,
            byrow = FALSE)
D
[,1] [,2] [,3]
[1,] 1 -6 -20
[2,] 2 -1 -7
[3,] 3 3 3
det(D)
[1] 1.049161e-14
# solve(D)
# Using tryCatch to catch and print the error message
result <- tryCatch({
  solve(D)
}, error = function(e) {
  cat("Error: ", conditionMessage(e), "\n")
})
Error: system is computationally singular: reciprocal condition number = 2.77556e-18
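In practice, rather than relying on det() alone, you can also check the rank or the reciprocal condition number before attempting an inversion; a minimal sketch using base R functions:
qr(D)$rank   # 2 (< 3): D does not have full rank, so it cannot be inverted
rcond(D)     # a reciprocal condition number near machine precision signals (near-)singularity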
The determinant of E is not defined if the matrix contains NA, non-numeric, or otherwise missing values. Do not simply replace NA values with 0; think instead about how to impute the missing values.
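As a hypothetical illustration (the matrix E itself is not shown in these notes), assume E has one missing entry; det() should then fail (or at best return NA), and the same tryCatch pattern as above captures the message:
E <- matrix(c(1, 2,
              NA, 4), nrow = 2, byrow = TRUE)
result <- tryCatch({
  det(E)
}, error = function(e) {
  cat("Error: ", conditionMessage(e), "\n")
})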
Let \(a\) and \(b\) be \(K \times 1\) column vectors. Consider the scalar product:
\[ f(b) = a^\top b = \sum_{i=1}^K a_i b_i \]
Then the derivative with respect to \(b\) is:
\[ \frac{\partial (a^\top b)}{\partial b} = a \]
This is because \(a^\top b\) is a linear combination of the \(b_i\), and when differentiating, each \(a_i\) becomes the partial derivative with respect to \(b_i\).
Since \(a^\top b = b^\top a\) (a scalar), we also have:
\[ \frac{\partial (b^\top a)}{\partial b} = a \]
The expression \(a^\top b\) is simply the dot product between two vectors:
\[ a^\top b = \sum_{i=1}^K a_i b_i \]
This is a scalar, a weighted sum of the components of \(b\), where the weights are given by the elements of \(a\).
When we take the derivative of this scalar with respect to the vector \(b\), we are effectively collecting the coefficients \(a_i\):
\[ \frac{\partial}{\partial b} \begin{bmatrix} a_1 b_1 + a_2 b_2 + \dots + a_K b_K \end{bmatrix} = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_K \end{bmatrix} = a \]
Thus, the gradient of \(a^\top b\) with respect to \(b\) is simply \(a\).
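For instance, with \(K = 2\) and \(a = (3, 5)^\top\):
\[ f(b) = 3 b_1 + 5 b_2, \qquad \frac{\partial f}{\partial b} = \begin{bmatrix} 3 \\ 5 \end{bmatrix} = a \]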
Let \(A\) be a symmetric \(K \times K\) matrix, and \(b\) a \(K \times 1\) vector. Then:
\[ f(b) = b^\top A b \]
This is a quadratic form, which expands to:
\[ \sum_{i=1}^K \sum_{j=1}^K b_i A_{ij} b_j \]
Taking the derivative with respect to \(b\), we get:
\[ \frac{\partial (b^\top A b)}{\partial b} = 2 A b \quad \text{(if } A = A^\top \text{)} \]
This is the vector version of the familiar scalar derivative \(\frac{d}{dx}(ax^2) = 2ax\).
If \(A\) is not symmetric, the derivative becomes:
\[ \frac{\partial (b^\top A b)}{\partial b} = A^\top b + A b = (A + A^\top) b \]
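Both formulas are easy to sanity-check numerically in R with a finite-difference gradient; a minimal sketch with arbitrary example values (num_grad is a hypothetical helper written here, not part of any package, and A_sym stands for a symmetric \(A\)):
a <- c(1, 2, 3)
b <- c(0.5, -1, 2)
A_sym <- matrix(c(2, 1, 0,
                  1, 3, 1,
                  0, 1, 4), nrow = 3, byrow = TRUE)   # symmetric 3 x 3 example

# central finite-difference gradient of a scalar-valued function f at x
num_grad <- function(f, x, h = 1e-6) {
  sapply(seq_along(x), function(i) {
    e <- rep(0, length(x)); e[i] <- h
    (f(x + e) - f(x - e)) / (2 * h)
  })
}

num_grad(function(b) sum(a * b), b)                          # approximately a
num_grad(function(b) as.numeric(t(b) %*% A_sym %*% b), b)    # approximately 2 * A_sym %*% b
as.numeric(2 * A_sym %*% b)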
| Expression | Derivative w.r.t. \(b\) | Condition |
|---|---|---|
| \(a^\top b\) or \(b^\top a\) | \(a\) | Always |
| \(b^\top A b\) | \(2Ab\) | If \(A\) is symmetric |
| \(b^\top A b\) | \((A + A^\top)b\) | If \(A\) is not symmetric |
The appendix below might be of particular interest.