Project 3 - Matrix Factorization Methods

The goal of this assignment is give you practice working with Matrix Factorization techniques.

Your task is implement a matrix factorization method—such as singular value decomposition (SVD) or Alternating Least Squares (ALS)—in the context of a recommender system.

You may approach this assignment in a number of ways. You are welcome to start with an existing recommender system written by yourself or someone else. Remember as always to cite your sources, so that you can be graded on what you added, not what you found.

SVD can be thought of as a pre-processing step for feature engineering. You might easily start with thousands or millions of items, and use SVD to create a much smaller set of “k” items (e.g. 20 or 70).

library(recommenderlab)
library(dplyr)
library(tidyr)
library(kableExtra)
library(data.table)
data("MovieLense")

Create Data Subset for SVD Decomposition

In this example to see how SVD decomposition recommender systems work, we will take a small subset of popular movies; Toy Story, Aladdin, Lion King, The Rock, Twister (1996), The Abyss, Alien, Psycho, and The Shining.

Note that these 9 movies are in a specific genre, respectively Animation, Action, and Horror.

To highlight some of the features of SVD decomposition, the users have been selected to have a preference for one of the 3 genres:

Animation preference: 18, 664, 771, 821
Action preference: 430, 453
Horror preference 270, 372, 617, 833

# Get subset of movie Lense dataset of most rated movies and most actively rating users
movie_ratings <- getData.frame(MovieLense[rowCounts(MovieLense) > 60,  colCounts(MovieLense) > 150])

# Determine 3 movies for each genre of Animation, Action, and Horror
animation <- MovieLenseMeta %>% filter(Animation == 1 & `Children's`== 1 & title %in% movie_ratings$item) 
action <- MovieLenseMeta %>% filter(Thriller == 1 & Adventure == 1 & Musical == 0 & title %in% movie_ratings$item)
horror <- MovieLenseMeta %>% filter(Horror == 1 & title %in% movie_ratings$item)


# Create dataframe of the selected 9 movies 
df2 <- spread(movie_ratings %>% filter(item %in% animation$title[1:3] | item %in% action$title[1:3] | item %in% horror$title[1:3]), key = item, value = rating)
col.order <- c("user","Toy Story (1995)","Aladdin (1992)", "Lion King, The (1994)", "Rock, The (1996)", "Twister (1996)", "Abyss, The (1989)", "Alien (1979)", "Psycho (1960)", "Shining, The (1980)")
df2 <- df2[,col.order]
df3 <- df2[2:length(df2)]
row.names(df3) <- as.character(df2$user)
df3[is.na(df3)] <- 0


# Determine users that have a visible preference of a specific genre based on ratings
animation_fans <- df3 %>% filter(`Toy Story (1995)` > 3  & `Lion King, The (1994)` > 3 & `Aladdin (1992)` > 3 &
                                 `Rock, The (1996)` <= 3 & `Twister (1996)`  <= 3  &  `Abyss, The (1989)` <= 3 &
                                 `Alien (1979)` <= 3 & `Psycho (1960)` <= 3 & `Shining, The (1980)` <= 3)

action_fans <- df3 %>% filter(`Toy Story (1995)` <= 3  & `Lion King, The (1994)` <= 2 & `Aladdin (1992)` <= 2 &
                                 `Rock, The (1996)` >= 3 & `Twister (1996)`  >= 0  &  `Abyss, The (1989)` >= 3 &
                                 `Alien (1979)` <= 3 & `Psycho (1960)` <= 3 & `Shining, The (1980)` <= 3)
  

horror_fans <- df3 %>% filter(`Toy Story (1995)` %in% c(0)  & `Lion King, The (1994)` < 3 & `Aladdin (1992)` < 3 &
                              `Rock, The (1996)` <= 3 & `Twister (1996)`  <= 3  &  `Abyss, The (1989)` != 0 &
                              `Alien (1979)` >= 3 & `Psycho (1960)` >= 3 & `Shining, The (1980)` >= 3)


# Combine users of specific genre preference 
m2 <- rbind(animation_fans, action_fans)
m2 <- rbind(m2, horror_fans)
m <- as.matrix(m2)

m %>% kable() %>% kable_styling(bootstrap_options = "striped", full_width = F)  %>% add_header_above(c(" ", "Animation" = 3, "Action" = 3, "Horror" = 3))%>% pack_rows("Animation Fans", 1, 4) %>% pack_rows("Action Fans", 5, 6) %>% pack_rows("Horror Fans", 7, 10)

	Animation			Action			Horror
	Toy Story (1995)	Aladdin (1992)	Lion King, The (1994)	Rock, The (1996)	Twister (1996)	Abyss, The (1989)	Alien (1979)	Psycho (1960)	Shining, The (1980)
Animation Fans
18	5	4	4	0	0	0	0	3	3
664	4	4	4	0	3	0	3	0	0
771	5	4	5	0	0	2	0	0	0
821	5	5	5	3	3	0	0	0	0
Action Fans
430	0	0	0	3	0	3	0	0	0
453	0	0	0	4	0	3	0	0	0
Horror Fans
270	0	0	0	0	3	5	5	5	5
372	0	0	0	0	0	4	5	5	5
617	0	0	0	0	0	1	4	5	5
833	0	0	0	0	2	2	5	5	4

Singular Value Decomposition

we will decompose out matrix M into 3 matrices where \(M=U\sum V^T\) such that:

\(U\) and \(V\) are column orthonormal
\(V^T\) has orthonormal rows
\(\sum\) is a diagonal matrix with singular values

#Determine rank of current matrix
rankMatrix(m)[1]

## [1] 9

svd_m<-svd(m)

#U
#round(s$u, 2) %>% kable() %>% kable_styling(bootstrap_options = "striped", full_width = F)
svd_m$u

##              [,1]         [,2]        [,3]       [,4]         [,5]        [,6]
##  [1,] -0.30980303  0.346022876  0.27923899  0.5587444  0.198941909  0.26999978
##  [2,] -0.24798789  0.380454840  0.11496702 -0.5913655 -0.134210073 -0.39014688
##  [3,] -0.20260356  0.455247906 -0.04693749  0.2673050 -0.628730047 -0.10806889
##  [4,] -0.21925320  0.546067913 -0.19711604 -0.1941723  0.483582175  0.16055517
##  [5,] -0.06003199 -0.005796787 -0.57338346  0.1276974 -0.003855702 -0.06663823
##  [6,] -0.06301899  0.001149358 -0.67461043  0.1437329  0.180275120 -0.19852322
##  [7,] -0.49354250 -0.252315727 -0.16698190 -0.2424484 -0.247073513  0.66657110
##  [8,] -0.44612275 -0.263793774 -0.04262905  0.1861726 -0.236944063 -0.35506315
##  [9,] -0.37049344 -0.222558541  0.21487165  0.2087524  0.357377159 -0.34491078
## [10,] -0.40737402 -0.217402513  0.08951243 -0.2344180  0.189304919 -0.06500960
##               [,7]        [,8]        [,9]
##  [1,] -0.008713349  0.52472726  0.07891848
##  [2,]  0.209845438  0.39003902  0.24138032
##  [3,] -0.286197537 -0.41145047  0.14094383
##  [4,]  0.107914317 -0.38275183 -0.39428187
##  [5,] -0.037888031  0.18748759 -0.09860938
##  [6,] -0.054112429  0.14476842  0.37314312
##  [7,]  0.226203779 -0.05585837  0.21492089
##  [8,]  0.280207160  0.10428242 -0.61232733
##  [9,]  0.237390292 -0.42695789  0.42732698
## [10,] -0.819714846  0.06590076 -0.08648909

#S
#round(s$d, 2) %>% kable() %>% kable_styling(bootstrap_options = "striped", full_width = F)
svd_m$d

## [1] 19.1021537 15.2971192  7.0350655  4.8391352  3.4252460  1.9709804  0.6266352
## [8]  0.3965897  0.1716448

#V*
#t(round(s$v, 2))%>% kable() %>% kable_styling(bootstrap_options = "striped", full_width = F)
t(svd_m$v)

##              [,1]         [,2]        [,3]        [,4]        [,5]         [,6]
##  [1,] -0.24344116 -0.216616618 -0.22722294 -0.05705804 -0.19354356 -0.325188725
##  [2,]  0.53987373  0.487493231  0.51725360  0.10625601  0.10379837 -0.135814054
##  [3,]  0.09037519  0.057354675  0.05068274 -0.71213839 -0.08079070 -0.632458333
##  [4,]  0.16406311 -0.006638758  0.04859942  0.07759799 -0.73417965  0.128383737
##  [5,] -0.07820463  0.047271942 -0.13628569  0.63069336  0.20013325 -0.635095603
##  [6,]  0.02630303 -0.055854498 -0.11068452 -0.25994275  0.59915308  0.216161866
##  [7,] -0.15256259  0.318063495 -0.13865761 -0.01016678 -0.01203107  0.002252788
##  [8,]  0.53652646  0.250899295 -0.78657218 -0.01694198 -0.03507913  0.041727195
##  [9,]  0.54428354 -0.736630677  0.08450579  0.08097388  0.07620290 -0.086615810
##              [,7]        [,8]       [,9]
##  [1,] -0.46911641 -0.49821987 -0.4768938
##  [2,] -0.22333811 -0.24463980 -0.2304278
##  [3,]  0.08584072  0.18643503  0.1737113
##  [4,] -0.49441795  0.26172588  0.3101680
##  [5,] -0.13040957  0.26571763  0.2104501
##  [6,] -0.66849572  0.16130917  0.1942926
##  [7,]  0.02007220 -0.64743906  0.6606822
##  [8,]  0.08550283  0.02777775 -0.1383909
##  [9,]  0.08139614 -0.26850405  0.2353800

Latent Factors

The SUV decomposition helps us see some patterns in the underlying data or latent factors.

The \(U\) matrix maps users to genres and can be interpreted as a user to latent factor similarity matrix

The \(V^T\) matrix maps movies to latent factors per user.

The \(\sum\) matrix measure the strength of each of these latent factors within our dataset where the larger the numeber on the diagonal, the stronger the factor. Factors which are very low can be considered noise within the data

#U
u <- round(svd_m$u, 2)  
colnames(u) <- c("Animation Latent Factor", "Horror Latent Factor", "Action Latent Factor", " ", " ", " ", " ", " ", " ")
rownames(u) <- rownames(m)
u %>% kable() %>% kable_styling(bootstrap_options = "striped", full_width = F)

	Animation Latent Factor	Horror Latent Factor	Action Latent Factor
18	-0.31	0.35	0.28	0.56	0.20	0.27	-0.01	0.52	0.08
664	-0.25	0.38	0.11	-0.59	-0.13	-0.39	0.21	0.39	0.24
771	-0.20	0.46	-0.05	0.27	-0.63	-0.11	-0.29	-0.41	0.14
821	-0.22	0.55	-0.20	-0.19	0.48	0.16	0.11	-0.38	-0.39
430	-0.06	-0.01	-0.57	0.13	0.00	-0.07	-0.04	0.19	-0.10
453	-0.06	0.00	-0.67	0.14	0.18	-0.20	-0.05	0.14	0.37
270	-0.49	-0.25	-0.17	-0.24	-0.25	0.67	0.23	-0.06	0.21
372	-0.45	-0.26	-0.04	0.19	-0.24	-0.36	0.28	0.10	-0.61
617	-0.37	-0.22	0.21	0.21	0.36	-0.34	0.24	-0.43	0.43
833	-0.41	-0.22	0.09	-0.23	0.19	-0.07	-0.82	0.07	-0.09

#S
s <- diag(round(svd_m$d, 2), length(round(svd_m$d, 2)))
colnames(s) <- c("Animation Latent Factor Strength", "Horror Latent Factor Strength", "Action Latent Factor Strength", " ", " ", " ", " ", " ", " ")
s %>% kable() %>% kable_styling(bootstrap_options = "striped", full_width = F)

Animation Latent Factor Strength	Horror Latent Factor Strength	Action Latent Factor Strength
19.1	0.0	0.00	0.00	0.00	0.00	0.00	0.0	0.00
0.0	15.3	0.00	0.00	0.00	0.00	0.00	0.0	0.00
0.0	0.0	7.04	0.00	0.00	0.00	0.00	0.0	0.00
0.0	0.0	0.00	4.84	0.00	0.00	0.00	0.0	0.00
0.0	0.0	0.00	0.00	3.43	0.00	0.00	0.0	0.00
0.0	0.0	0.00	0.00	0.00	1.97	0.00	0.0	0.00
0.0	0.0	0.00	0.00	0.00	0.00	0.63	0.0	0.00
0.0	0.0	0.00	0.00	0.00	0.00	0.00	0.4	0.00
0.0	0.0	0.00	0.00	0.00	0.00	0.00	0.0	0.17

#V*
v <- t(round(svd_m$v, 2))
v  %>% kable() %>% kable_styling(bootstrap_options = "striped", full_width = F)

-0.24	-0.22	-0.23	-0.06	-0.19	-0.33	-0.47	-0.50	-0.48
0.54	0.49	0.52	0.11	0.10	-0.14	-0.22	-0.24	-0.23
0.09	0.06	0.05	-0.71	-0.08	-0.63	0.09	0.19	0.17
0.16	-0.01	0.05	0.08	-0.73	0.13	-0.49	0.26	0.31
-0.08	0.05	-0.14	0.63	0.20	-0.64	-0.13	0.27	0.21
0.03	-0.06	-0.11	-0.26	0.60	0.22	-0.67	0.16	0.19
-0.15	0.32	-0.14	-0.01	-0.01	0.00	0.02	-0.65	0.66
0.54	0.25	-0.79	-0.02	-0.04	0.04	0.09	0.03	-0.14
0.54	-0.74	0.08	0.08	0.08	-0.09	0.08	-0.27	0.24

Dimensionality Reduction

Since we know in our dataset we have 3 genres, thus we anticipate 3 latent factors, the other factors can be thought of as noise. We can determine which level of noise to remove and reduce the dimensionality of our dataset.

We start by looking at the \(\sum\) matrix and computing the sum of the squares along the diagonal, the total energy. We can drop 10% of the energy as a general rule of thumb. All 3 matrices can be reduced by the same factor, thus reducing the rank of our utility matrix

sum(svd_m$d^2)

## [1] 688

energy_pct <- round(((svd_m$d^2)/sum(svd_m$d^2))*100,2)
reduce_by <- 0
for(i in length(energy_pct):1)
{
  if(sum(energy_pct[i:length(energy_pct)]) < 10)
    reduce_by = i
}


#U
u_reduced <- round(svd_m$u[,1:1:reduce_by-1],2)
u_reduced %>% kable() %>% kable_styling(bootstrap_options = "striped", full_width = F)

-0.31	0.35	0.28
-0.25	0.38	0.11
-0.20	0.46	-0.05
-0.22	0.55	-0.20
-0.06	-0.01	-0.57
-0.06	0.00	-0.67
-0.49	-0.25	-0.17
-0.45	-0.26	-0.04
-0.37	-0.22	0.21
-0.41	-0.22	0.09

#S
s_reduced <- round(diag(c(svd_m$d[1:reduce_by-1]), length(svd_m$d[1:reduce_by-1])),4) 
s_reduced %>% kable() %>% kable_styling(bootstrap_options = "striped", full_width = F)

19.1022	0.0000	0.0000
0.0000	15.2971	0.0000
0.0000	0.0000	7.0351

#V*
v_reduced <- round(t(svd_m$v)[1:reduce_by-1,],2) 
v_reduced %>% kable() %>% kable_styling(bootstrap_options = "striped", full_width = F)

-0.24	-0.22	-0.23	-0.06	-0.19	-0.33	-0.47	-0.50	-0.48
0.54	0.49	0.52	0.11	0.10	-0.14	-0.22	-0.24	-0.23
0.09	0.06	0.05	-0.71	-0.08	-0.63	0.09	0.19	0.17

Applying Recommendations

The most simple recommendation that can be made is to use the dot product of the SVD matrices to reconstruct as close as possible the original matrix while reducing the Root Mean Squared Error

r <- u_reduced %*% s_reduced %*% v_reduced

r[r < 0] <- 0
r[r > 5] <- 5
r <- round(r)

# Original matrix
m %>% kable() %>% kable_styling(bootstrap_options = "striped", full_width = F)

	Toy Story (1995)	Aladdin (1992)	Lion King, The (1994)	Rock, The (1996)	Twister (1996)	Abyss, The (1989)	Alien (1979)	Psycho (1960)	Shining, The (1980)
18	5	4	4	0	0	0	0	3	3
664	4	4	4	0	3	0	3	0	0
771	5	4	5	0	0	2	0	0	0
821	5	5	5	3	3	0	0	0	0
430	0	0	0	3	0	3	0	0	0
453	0	0	0	4	0	3	0	0	0
270	0	0	0	0	3	5	5	5	5
372	0	0	0	0	0	4	5	5	5
617	0	0	0	0	0	1	4	5	5
833	0	0	0	0	2	2	5	5	4

# Reconstructed matrix
r %>% kable() %>% kable_styling(bootstrap_options = "striped", full_width = F)

4	4	4	0	2	0	2	2	2
4	4	4	0	1	0	1	1	1
5	4	5	1	1	0	0	0	0
5	5	5	2	2	1	0	0	0
0	0	0	3	1	3	0	0	0
0	0	0	3	1	3	0	0	0
0	0	0	1	1	4	5	5	5
0	0	0	0	1	4	5	5	5
0	0	0	0	1	2	4	5	4
0	0	0	0	1	3	4	5	5

#RMSE
RMSE(m,r)

## [1] 0.7226494

Conclusion

While SVD factorization methods alleviate issues with content based filtering algorithms and help see some underlying patterns within the dataset. However, interpreting latent factors can be difficult.