Objective

• Your task is to implement a matrix factorization method, such as singular value decomposition (SVD) or alternating least squares (ALS), in the context of a recommender system.

• You may approach this assignment in a number of ways. You are welcome to start with an existing recommender system written by yourself or someone else. Remember as always to cite your sources, so that you can be graded on what you added, not what you found.

Introduction

SVD can be thought of as a pre-processing step for feature engineering. You might easily start with thousands or millions of items, and use SVD to create a much smaller set of “k” items (e.g. 20 or 70).
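As a minimal sketch of this idea, the base R snippet below factorizes a small made-up ratings matrix and rebuilds it from only k = 2 concepts. The matrix values and the choice of k are assumptions for illustration and are not taken from the MovieLens data.

```r
# Toy ratings matrix (hypothetical data): 4 users x 5 items
R <- matrix(c(5, 4, 1, 1, 2,
              4, 5, 1, 2, 1,
              1, 1, 5, 4, 4,
              2, 1, 4, 5, 5), nrow = 4, byrow = TRUE)

s <- svd(R)   # R = U %*% diag(d) %*% t(V)
k <- 2        # number of concepts to keep (illustrative choice)

# Rank-k approximation built from the k largest singular values
R_k <- s$u[, 1:k] %*% diag(s$d[1:k]) %*% t(s$v[, 1:k])

# The Frobenius error equals the root of the discarded squared singular values
err <- sqrt(sum((R - R_k)^2))
discarded <- sqrt(sum(s$d[-(1:k)]^2))
```

Here `err` and `discarded` agree up to floating-point error, which is why dropping the smallest singular values loses the least information.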

• This project builds on the work done in Project 2.

• In this project we add SVD to further explore the recommender system, using the recommenderlab package.

Data Preparation

Load the libraries

library(recommenderlab)  
library(dplyr)           
library(tidyr)           
library(ggplot2)         
library(ggrepel)         
library(tictoc)

Load the Data

The data set is from the MovieLens project and was downloaded from https://grouplens.org/datasets/movielens/

ratings <- read.csv("https://raw.githubusercontent.com/PriyaShaji/Data612/master/Project_2/ratings.csv") 

movies <- read.csv("https://raw.githubusercontent.com/PriyaShaji/Data612/master/Project_2/movies.csv")

Convert to Matrix

Movie_Matrix <- ratings %>%
  select(-timestamp) %>%
  spread(movieId, rating)

row.names(Movie_Matrix) <- Movie_Matrix[,1]

Movie_Matrix <- Movie_Matrix[-c(1)]
Movie_Matrix <- as(as.matrix(Movie_Matrix), "realRatingMatrix")

Movie_Matrix

## 610 x 9724 rating matrix of class 'realRatingMatrix' with 100836 ratings.

Our movie matrix contains 610 users and 9,724 items/movies.
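The dimensions reported above also show how sparse the matrix is. The arithmetic below is a small sketch that uses only the three numbers printed in the output; the variable names are made up for illustration.

```r
n_users   <- 610      # rows of the rating matrix
n_items   <- 9724     # columns of the rating matrix
n_ratings <- 100836   # observed ratings

# Share of user-item cells that actually hold a rating
density <- n_ratings / (n_users * n_items)
density_pct <- round(100 * density, 2)
```

Roughly 1.7% of the cells are filled, so more than 98% of the user-item pairs have no rating.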

Train and Test Sets

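Now we split our data into train and test sets: 80% of the users go to the training set, and for each test user 20 ratings are given to the model while the rest are held out.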

set.seed(88)
eval <- evaluationScheme(Movie_Matrix, method = "split",
                         train = 0.8, given = 20, goodRating = 3)
train <- getData(eval, "train")
known <- getData(eval, "known")
unknown <- getData(eval, "unknown")
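To make the roles of `known` and `unknown` concrete, here is a base R sketch of the "given 20" idea for a single test user. It only imitates the protocol and is not recommenderlab's internal code; the item names and the count of 35 rated items are hypothetical.

```r
set.seed(88)

# Hypothetical test user who rated 35 items (item ids are made up)
rated_items <- paste0("item", 1:35)
given <- 20

# 20 randomly chosen ratings are shown to the model ...
known_items <- sample(rated_items, given)

# ... and the remaining ratings are held out to score the predictions
unknown_items <- setdiff(rated_items, known_items)
```

The model sees only `known_items` when predicting, and accuracy is measured on `unknown_items`.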

Algorithms Used

User-Based Collaborative Filtering

First, we build a user-based collaborative filtering (UBCF) model.

tic("UBCF Model - Training")
modelUBCF <- Recommender(train, method = "UBCF")
toc(log = TRUE, quiet = TRUE)

tic("UBCF Model - Predicting")
predUBCF <- predict(modelUBCF, newdata = known, type = "ratings")
toc(log = TRUE, quiet = TRUE)

( accUBCF <- calcPredictionAccuracy(predUBCF, unknown) )

##      RMSE       MSE       MAE 
## 0.9320803 0.8687737 0.7174840
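The three metrics in this output follow from simple formulas. The sketch below computes them in base R on made-up vectors; `actual` and `predicted` are hypothetical and are not taken from the model above.

```r
actual    <- c(4.0, 3.5, 5.0, 2.0)   # hypothetical held-out ratings
predicted <- c(3.5, 3.0, 4.0, 3.0)   # hypothetical model predictions

err  <- predicted - actual
mse  <- mean(err^2)      # mean squared error
rmse <- sqrt(mse)        # root mean squared error
mae  <- mean(abs(err))   # mean absolute error
```

RMSE is the square root of MSE, which matches the reported values: 0.9320803 squared is approximately 0.8687737.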

Singular Value Decomposition Model (SVD Model)

Next we build an SVD model to compare against the UBCF model. We use 50 concepts/categories (k = 50), which retains enough information to keep RMSE low while maintaining a reasonable processing time.

tic("SVD Model - Training")
modelSVD <- Recommender(train, method = "SVD", parameter = list(k = 50))
toc(log = TRUE, quiet = TRUE)

tic("SVD Model - Predicting")
predSVD <- predict(modelSVD, newdata = known, type = "ratings")
toc(log = TRUE, quiet = TRUE)

( accSVD <- calcPredictionAccuracy(predSVD, unknown) )

##      RMSE       MSE       MAE 
## 0.9361910 0.8764536 0.7210165

The RMSE (0.936) is very close to that of the UBCF model (0.932), so on accuracy alone the two models appear similar.

Run-Time

One major difference between the SVD and UBCF models is their run-times.

The timing log below shows the run-time of each training and prediction step.

log <- as.data.frame(unlist(tic.log(format = TRUE)))
colnames(log) <- c("Run Time")
knitr::kable(log, format = "html") %>%
  kableExtra::kable_styling(bootstrap_options = c("striped", "hover"))

Run Time
UBCF Model - Training: 0.015 sec elapsed
UBCF Model - Predicting: 2.556 sec elapsed
SVD Model - Training: 2.172 sec elapsed
SVD Model - Predicting: 0.542 sec elapsed

The log shows a clear trade-off between the two models:

UBCF trains almost instantly (0.015 sec) but is slow to predict (2.556 sec), whereas SVD is the opposite: expensive to train (2.172 sec) but quick to predict (0.542 sec).

Evaluation

Now let us evaluate our predictions by comparing one user's actual ratings with the movies predicted for that user.

Here we look at user 400.

mov_rated <- as.data.frame(Movie_Matrix@data[c("400"), ]) 
colnames(mov_rated) <- c("Rating")
mov_rated$movieId <- as.integer(rownames(mov_rated))
mov_rated <- mov_rated %>% filter(Rating != 0) %>% 
  inner_join (movies, by="movieId") %>%
  arrange(Rating) %>%
  select(Movie = "title", Rating)
knitr::kable(mov_rated, format = "html") %>%
  kableExtra::kable_styling(bootstrap_options = c("striped", "hover"))

Movie	Rating
Spider-Man 3 (2007)	2.5
Indiana Jones and the Kingdom of the Crystal Skull (2008)	2.5
Back to the Future (1985)	4.0
Gladiator (2000)	4.0
Lord of the Rings: The Fellowship of the Ring, The (2001)	4.0
Lord of the Rings: The Return of the King, The (2003)	4.0
Lucky Number Slevin (2006)	4.0
Pursuit of Happyness, The (2006)	4.0
Departed, The (2006)	4.0
The Martian (2015)	4.0
Logan (2017)	4.0
Forrest Gump (1994)	4.5
Blade Runner (1982)	4.5
Die Hard (1988)	4.5
One Flew Over the Cuckoo’s Nest (1975)	4.5
Princess Bride, The (1987)	4.5
Raiders of the Lost Ark (Indiana Jones and the Raiders of the Lost Ark) (1981)	4.5
Goodfellas (1990)	4.5
Godfather: Part II, The (1974)	4.5
Shining, The (1980)	4.5
Donnie Darko (2001)	4.5
Dark Knight, The (2008)	4.5
How to Train Your Dragon (2010)	4.5
Star Wars: Episode VII - The Force Awakens (2015)	4.5
Arrival (2016)	4.5
Heat (1995)	5.0
Seven (a.k.a. Se7en) (1995)	5.0
Usual Suspects, The (1995)	5.0
Star Wars: Episode IV - A New Hope (1977)	5.0
Léon: The Professional (a.k.a. The Professional) (Léon) (1994)	5.0
Pulp Fiction (1994)	5.0
Shawshank Redemption, The (1994)	5.0
Silence of the Lambs, The (1991)	5.0
Fargo (1996)	5.0
Trainspotting (1996)	5.0
Godfather, The (1972)	5.0
Star Wars: Episode V - The Empire Strikes Back (1980)	5.0
Star Wars: Episode VI - Return of the Jedi (1983)	5.0
Matrix, The (1999)	5.0
Fight Club (1999)	5.0
Requiem for a Dream (2000)	5.0
Inside Man (2006)	5.0
Inception (2010)	5.0

• User 400's favorite movies fall mostly into the action and drama genres, with very few romantic titles.

• Next, we look at the movies the SVD model suggests for user 400.

mov_recommend <- as.data.frame(predSVD@data[c("400"), ]) 
colnames(mov_recommend) <- c("Rating")
mov_recommend$movieId <- as.integer(rownames(mov_recommend))
mov_recommend <- mov_recommend %>% arrange(desc(Rating)) %>% head(6) %>% 
  inner_join (movies, by="movieId") %>%
  select(Movie = "title")
knitr::kable(mov_recommend, format = "html") %>%
  kableExtra::kable_styling(bootstrap_options = c("striped", "hover"))

Movie
American Beauty (1999)
Pulp Fiction (1994)
Schindler’s List (1993)
Sixth Sense, The (1999)
Raiders of the Lost Ark (Indiana Jones and the Raiders of the Lost Ark) (1981)
Saving Private Ryan (1998)

The top 6 movies recommended to user 400 also fall into the action and drama genres, which is consistent with the user's rating history.
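Selecting the top recommendations amounts to sorting the predicted ratings and keeping the first few. A base R sketch with hypothetical movie ids and scores, where `NA` marks items with no prediction, could look like this. recommenderlab's `predict()` also accepts `type = "topNList"` for the same purpose.

```r
# Hypothetical predicted ratings for one user; NA = no prediction available
pred <- c(m1 = 4.2, m2 = NA, m3 = 3.1, m4 = 4.8, m5 = NA, m6 = 3.9)
top_n <- 3

# Sort descending, drop NAs, and keep the names of the best top_n items
ranked <- sort(pred, decreasing = TRUE, na.last = NA)
recs <- names(ranked)[seq_len(top_n)]
```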

Singular Value Decomposition (Manual)

Normalize Matrix

Let us normalize the ratings matrix.

# Normalize matrix
movieMatrix <- as.matrix(normalize(Movie_Matrix)@data)

# Perform SVD
movieSVD <- svd(movieMatrix)
rownames(movieSVD$u) <- rownames(movieMatrix)
rownames(movieSVD$v) <- colnames(movieMatrix)

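As we saw earlier, our data has 610 users, so the decomposition yields 610 singular values. To make the model usable, we reduce the number of dimensions/concepts by setting the smallest singular values in the diagonal matrix Σ to 0, keeping just enough of them to retain 90% of the total energy (the sum of the squared singular values).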

# Reduce dimensions
n <- length(movieSVD$d)
total_energy <- sum(movieSVD$d^2)
for (i in (n-1):1) {
  energy <- sum(movieSVD$d[1:i]^2)
  if (energy/total_energy<0.9) {
    n_dims <- i+1
    break
  }
}

trim_mov_D <- movieSVD$d[1:n_dims]
trim_mov_U <- movieSVD$u[, 1:n_dims]
trim_mov_V <- movieSVD$v[, 1:n_dims]

The ratings matrix had 610 users, and therefore 610 singular values. After reducing the dimensionality of the diagonal matrix Σ, 251 dimensions/concepts remain.
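The 90% energy rule can also be written without a loop by using a cumulative sum. The sketch below applies it to a short vector of hypothetical singular values rather than to `movieSVD$d`.

```r
d <- c(10, 6, 4, 2, 1)   # hypothetical singular values, largest first

# Cumulative share of total energy retained by the first i concepts
energy_share <- cumsum(d^2) / sum(d^2)

# Smallest number of concepts that keeps at least 90% of the energy
n_dims <- which(energy_share >= 0.9)[1]
```

For these values the first two concepts retain about 87% of the energy and the first three about 97%, so `n_dims` is 3.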

head(trim_mov_D)

## [1] 76.20047 43.62240 41.77917 39.37051 37.95619 36.54896

Consider the first two concepts, with singular values 76.2 and 43.6. We pick the 5 movies with the highest values and the 5 with the lowest values in each concept and display them.

mov_count <- 5

movies_df <- as.data.frame(trim_mov_V) %>% select(V1, V2)
movies_df$movieId <- as.integer(rownames(movies_df))

mov_sample <- movies_df %>% arrange(V1) %>% head(mov_count)
mov_sample <- rbind(mov_sample, movies_df %>% arrange(desc(V1)) %>% head(mov_count))
mov_sample <- rbind(mov_sample, movies_df %>% arrange(V2) %>% head(mov_count))
mov_sample <- rbind(mov_sample, movies_df %>% arrange(desc(V2)) %>% head(mov_count))
mov_sample <- mov_sample %>% inner_join(movies, by = "movieId") %>% 
  select(Movie = "title", Concept1 = "V1", Concept2 = "V2")
mov_sample$Concept1 <- round(mov_sample$Concept1, 4)
mov_sample$Concept2 <- round(mov_sample$Concept2, 4)

knitr::kable(mov_sample, format = "html") %>%
  kableExtra::kable_styling(bootstrap_options = c("striped", "hover"))

Movie	Concept1	Concept2
Pulp Fiction (1994)	-0.1353	-0.0097
Star Wars: Episode IV - A New Hope (1977)	-0.1182	0.0268
Star Wars: Episode V - The Empire Strikes Back (1980)	-0.1156	0.0233
Godfather, The (1972)	-0.1093	-0.0118
Fight Club (1999)	-0.1068	-0.0065
Batman & Robin (1997)	0.0599	-0.0060
Batman Forever (1995)	0.0584	0.0255
Wild Wild West (1999)	0.0580	0.0242
Hollow Man (2000)	0.0564	0.0003
Nutty Professor, The (1996)	0.0542	0.0157
Charlie’s Angels: Full Throttle (2003)	0.0366	-0.0589
Transformers: Dark of the Moon (2011)	0.0175	-0.0537
Battlefield Earth (2000)	0.0337	-0.0524
Schindler’s List (1993)	-0.0757	-0.0522
Shawshank Redemption, The (1994)	-0.1057	-0.0500
Cannonball Run, The (1981)	0.0061	0.0630
Naked Gun: From the Files of Police Squad!, The (1988)	-0.0104	0.0588
Blazing Saddles (1974)	-0.0309	0.0585
Ace Ventura: Pet Detective (1994)	0.0255	0.0585
Beverly Hills Cop (1984)	-0.0090	0.0563

Summary

Collaborative Filtering:

• Item-based CF avoids the problem posed by dynamic user preferences, because item-to-item similarities are more static than the preferences of individual users.

• However, several problems remain for this method. The main issue is scalability: the computation grows with both the number of customers and the number of products, and the worst-case complexity is O(mn) for m users and n items.

Singular Value Decomposition:

• SVD decreases the dimension of the utility matrix by extracting its latent factors.

• SVD handles the scalability and sparsity problems posed by CF. However, SVD is not without flaws. Its main drawback is that it offers little to no explanation of why an item is recommended to a user. This can be a serious problem if users want to know why a specific item was recommended to them.
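The lack of explanation is easier to see in a small sketch. In a latent-factor model the score for a user-item pair is a dot product of two vectors in concept space; the numbers below are hypothetical and the concepts carry no labels.

```r
user_factors <- c(0.8, -0.3)   # hypothetical user position on 2 concepts
item_factors <- c(0.5,  0.9)   # hypothetical item position on the same concepts

# Predicted affinity: contribution of each concept, then their sum
contrib <- user_factors * item_factors
score <- sum(contrib)
```

The score is a sum of per-concept contributions, but because the concepts are unnamed numerical directions, the contributions do not translate into a reason a user would recognize.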

Reference

Introduction to Recommender System

DATA 612 Project 3 | Matrix Factorization methods

Priya Shaji

2019-06-24