What is presented in this blog

Although Matrix Factorization is a side topic, it is required for Linear Regression related computations, so I thought of writing a blog post on this important topic. I would like to introduce the algorithms for Matrix Factorization. There are a couple of well-known techniques, e.g. singular value decomposition (SVD) and Alternating Least Squares (ALS); this post focuses on SVD.

Mathematical steps for SVD computation

  1. There is a well-known theorem: if \(A\) is any \(m×n\) matrix, then there exist three matrices \(U\), \(Σ\) and \(V\) s.t. \(A=U.Σ.V^T\), where \(U\) is an \(m×m\) orthogonal matrix, \(Σ\) is an \(m×n\) matrix of the form given below, and \(V\) is an \(n×n\) orthogonal matrix. Refer to Theorem 10 (page 417) in Linear Algebra and its Applications, 4th Edition, by David C. Lay.

     

  2. What is \(Σ\)?
    \(Σ = \begin{bmatrix} D & 0 \\ 0 & 0 \end{bmatrix}\), where \(D=\begin{bmatrix} { \sigma }_{ 1 } & ... & 0 \\ ... & ... & ... \\ 0 & ... & { \sigma }_{ r } \end{bmatrix}\), and \({ \sigma }_{ 1 },{ \sigma }_{ 2 },{ \sigma }_{ 3 },...,{ \sigma }_{ r }\) are the positive square roots of the nonzero eigenvalues of \({ A }^{ T }.A\) (the singular values of \(A\)), with \(r\) the rank of \(A\).

     

  3. What are \(U\) and \(V\)?
    The columns of \(U\) are the left-singular vectors of \(A\), and the columns of \(V\) are the right-singular vectors of \(A\).

     

  4. So, first compute \({ A }^{ T }.A\).

     

  5. Next, compute the eigenvalues of \({ A }^{ T }.A\): \(\lambda _{ 1 },{ \lambda }_{ 2 },\lambda _{ 3 },\dots,\lambda _{ r }\)

     

  6. Next, take the square root of each eigenvalue \(\lambda _{ i }\) and assign it to \(\sigma _{ i }\):
    \(\sigma _{ 1 }=\sqrt { { \lambda }_{ 1 } } ,\quad \sigma _{ 2 }=\sqrt { { \lambda }_{ 2 } } ,\quad \sigma _{ 3 }=\sqrt { { \lambda }_{ 3 } } .....\quad \sigma _{ r }=\sqrt { { \lambda }_{ r } }\)

     

  7. Also, compute the eigenvectors of \({ A }^{ T }.A\): \(V=[{ v }_{ 1 },{ v }_{ 2 },\dots,{ v }_{ n }]\). These will be the right-singular vectors of \(A\).

     

  8. Now, compute: \({ u }_{ 1 }=\frac { 1 }{ { \sigma }_{ 1 } } A{ v }_{ 1 },{ u }_{ 2 }=\frac { 1 }{ { \sigma }_{ 2 } } A{ v }_{ 2 },\dots,{ u }_{ r }=\frac { 1 }{ { \sigma }_{ r } } A{ v }_{ r }\). If \(m>r\), extend \({ u }_{ 1 },\dots,{ u }_{ r }\) to an orthonormal basis \({ u }_{ 1 },\dots,{ u }_{ m }\) of \(\mathbb{R}^{m}\) to obtain the remaining columns of \(U\).

     

  9. Then, \(D=\begin{bmatrix} \sqrt { { \lambda }_{ 1 } } & ... & 0 \\ ... & ... & ... \\ 0 & ... & \sqrt { { \lambda }_{ r } } \end{bmatrix}\)

     

  10. Finally, \(A=[{ u }_{ 1 }...{ u }_{ m }].Σ.{ [v_{ 1 }...{ v }_{ n }] }^{ T }=U.Σ.{ V }^{ T }\)

     

Worked example implementing the above outline in R

  1. We’ll perform SVD of matrix \(A=\begin{bmatrix} 1 & 2 & 3 \\ -1 & 0 & 4 \end{bmatrix}\)

     

  2. First, I’ll code matrix \(A\) and its transpose; the output appears below, followed by a sketch of the code.
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]   -1    0    4
##      [,1] [,2]
## [1,]    1   -1
## [2,]    2    0
## [3,]    3    4
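The code chunk that produced this output is not shown in the post; a minimal sketch that reproduces it could be:

A <- matrix(c( 1, 2, 3,
              -1, 0, 4), nrow = 2, byrow = TRUE)   # the 2 x 3 matrix A
print(A)      # A, as shown above
print(t(A))   # its 3 x 2 transpose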
  3. Now, in order to compute \(X\) = \(A.{ A }^{ T }\) and \(Y\) = \({ A }^{ T }.A\), I'll code a general-purpose matrix multiplication function, my_mat_mult(), sketched in the code chunk below.
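The original chunk defining my_mat_mult() is not reproduced here, so the following is only a sketch of what a general-purpose, loop-based multiplication function could look like:

my_mat_mult <- function(a, b) {
  # entry (i, j) of the product is the dot product of row i of a and column j of b
  stopifnot(ncol(a) == nrow(b))
  out <- matrix(0, nrow = nrow(a), ncol = ncol(b))
  for (i in 1:nrow(a)) {
    for (j in 1:ncol(b)) {
      out[i, j] <- sum(a[i, ] * b[, j])
    }
  }
  out
}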
  4. Computing \(X\) and \(Y\) by calling my_mat_mult(); the output appears below, followed by a sketch of the calls.
##      [,1] [,2]
## [1,]   14   11
## [2,]   11   17
##      [,1] [,2] [,3]
## [1,]    2    2   -1
## [2,]    2    4    6
## [3,]   -1    6   25
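Assuming the my_mat_mult() sketch above, the calls that produce the two matrices shown would be:

X <- my_mat_mult(A, t(A))   # 2 x 2 product A.A^T
Y <- my_mat_mult(t(A), A)   # 3 x 3 product A^T.A
print(X)
print(Y)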
  5. Computing the eigenvectors and eigenvalues of \(X\) and \(Y\), using R's built-in function eigen(); a sketch of the calls follows the output.
## eigen() decomposition
## $values
## [1] 26.601802  4.398198
## 
## $vectors
##           [,1]       [,2]
## [1,] 0.6576043 -0.7533635
## [2,] 0.7533635  0.6576043
## eigen() decomposition
## $values
## [1] 2.660180e+01 4.398198e+00 1.058982e-16
## 
## $vectors
##             [,1]       [,2]       [,3]
## [1,] -0.01856629 -0.6727903  0.7396003
## [2,]  0.25499937 -0.7184510 -0.6471502
## [3,]  0.96676296  0.1765824  0.1849001
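A sketch of the corresponding calls (the object names e_X and e_Y are my own, chosen to match the column labels used in the comparison tables further down):

e_X <- eigen(X)   # eigendecomposition of A.A^T (2 x 2)
e_Y <- eigen(Y)   # eigendecomposition of A^T.A (3 x 3)
print(e_X)
print(e_Y)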
  6. Now, I'll perform the Singular Value Decomposition (SVD), using R's built-in function svd(); a sketch of the call follows the output.
## $d
## [1] 5.157693 2.097188
## 
## $u
##            [,1]       [,2]
## [1,] -0.6576043 -0.7533635
## [2,] -0.7533635  0.6576043
## 
## $v
##             [,1]       [,2]
## [1,]  0.01856629 -0.6727903
## [2,] -0.25499937 -0.7184510
## [3,] -0.96676296  0.1765824
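The decomposition itself is a single call; a sketch, using the SVD_A name referenced in the explanation below:

SVD_A <- svd(A)   # returns a list with components $d, $u and $v
print(SVD_A)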

Explanation: SVD_A contains the three components of the decomposition of \(A\): $d, the singular values (the diagonal entries of \(D\)); $u, the matrix \(U\); and $v, the matrix \(V\). Sigma (\(Σ\)) contains the diagonal matrix \(D\) in its top-left block and zeros in the rest of the locations.

The column vectors of \(U\) are the left-singular vectors of \(A\). In the above results, observe the two column vectors under $u.

The diagonal elements of \(D\) (the top-left block of \(Σ\)) are the singular values of \(A\). Note that they are not vectors; they are simply the entries of the diagonal matrix \(D\). In the above results, observe the two numbers under $d.
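As a quick sanity check (my own addition, assuming the e_X object from the eigen() sketch above), the singular values in $d are exactly the square roots of the eigenvalues of \(X\):

sqrt(e_X$values)   # 5.157693 2.097188, matching SVD_A$d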

The column vectors of \(V\) are the right-singular vectors of \(A\). In the above results, observe the two column vectors under $v.

Now, I'll compare the left-singular vectors of \(A\) (i.e. \(U\), or $u) with the eigenvectors of \(X\). In comparison_table1 below, we observe that the two sets of vectors are equal, column by column, up to sign. A sketch of the code that builds the table follows it.

##      SVD_A_U_Col1 SVD_A_U_Col2  e_X_Col1   e_X_Col2
## [1,]   -0.6576043   -0.7533635 0.6576043 -0.7533635
## [2,]   -0.7533635    0.6576043 0.7533635  0.6576043
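A sketch of how comparison_table1 could be assembled, assuming the SVD_A and e_X objects from the sketches above:

comparison_table1 <- cbind(SVD_A$u, e_X$vectors)   # 2 x 4: U next to the eigenvectors of X
colnames(comparison_table1) <- c("SVD_A_U_Col1", "SVD_A_U_Col2",
                                 "e_X_Col1", "e_X_Col2")
print(comparison_table1)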
  7. Next, I'll compare the right-singular vectors of \(A\) (i.e. \(V\), or $v) with the eigenvectors of \(Y\). In this case \(Y\) has 3 eigenvectors, while $v has only 2 columns (by default, R's svd() returns only the first \(\min(m, n)\) right-singular vectors). So, if we ignore the 3rd eigenvector (column heading 'IGNORE'), the two sets of vectors are equal, column by column, up to sign. The table is shown below, followed by a sketch of the code.
##      SVD_A_V_Col1 SVD_A_V_Col2    e_Y_Col1   e_Y_Col2     IGNORE
## [1,]   0.01856629   -0.6727903 -0.01856629 -0.6727903  0.7396003
## [2,]  -0.25499937   -0.7184510  0.25499937 -0.7184510 -0.6471502
## [3,]  -0.96676296    0.1765824  0.96676296  0.1765824  0.1849001
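The second comparison table can be built the same way (the name comparison_table2 is my own; only the first table is named in the text above):

comparison_table2 <- cbind(SVD_A$v, e_Y$vectors)   # 3 x 5: V next to the eigenvectors of Y
colnames(comparison_table2) <- c("SVD_A_V_Col1", "SVD_A_V_Col2",
                                 "e_Y_Col1", "e_Y_Col2", "IGNORE")
print(comparison_table2)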

Citation

  1. Linear Algebra and its Applications, 4th Edition, by David C. Lay