project 3

Introduction

The purpose of this project is to implement a latent factor approach to collaborative filtering (CF).

The latent factor approach uses matrix factorization techniques to arrive at recommendations. A latent factor is not something we observe in the data; rather, we infer it from the values (ratings) that users give to items. Matrix factorization (MF), or matrix decomposition, splits a matrix into multiple matrices whose product reproduces, or for a low-rank factorization approximates, the original matrix. Matrix factorization is one of the most popular ways to solve the co-clustering problem.
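As a minimal sketch of the idea, the base R function svd() can factor a small matrix and rebuild it from the factors. The ratings below are hypothetical values chosen for illustration, and the rank k = 2 is an arbitrary choice.

```r
# Toy ratings matrix (hypothetical values): 4 users x 3 items, no missing entries
R <- matrix(c(5, 4, 1,
              4, 5, 1,
              1, 1, 5,
              2, 1, 4),
            nrow = 4, byrow = TRUE)

# Full SVD: R = U %*% diag(d) %*% t(V)
s <- svd(R)
R_full <- s$u %*% diag(s$d) %*% t(s$v)

# Rank-2 approximation: keep only the two largest singular values
k <- 2
R_k <- s$u[, 1:k] %*% diag(s$d[1:k]) %*% t(s$v[, 1:k])

max(abs(R - R_full))  # effectively zero: the full product recovers R
max(abs(R - R_k))     # non-zero: the rank-2 product only approximates R
```

The full product recovers the matrix up to floating-point error, while the truncated product trades some reconstruction error for a more compact representation.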

The following latent factor methods are considered in this project:

Singular Value Decomposition (SVD)
Funk SVD

In addition to the methods listed above, user-based CF is evaluated as a baseline against which the latent factor methods are compared.

Singular Value Decomposition (SVD) is a matrix factorization technique for reducing the dimensionality of the input data. The idea is to find the directions of maximum variance and retain only those that explain a considerable share of the variation in the data. While SVD can achieve very good results on non-sparse data, it works less well on real-life rating data, which is typically very sparse, because the decomposition requires a complete matrix. The SVD method in the recommenderlab package imputes missing values with column means by default. This usually works acceptably, but the results tend to be biased.
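The column-mean imputation followed by a truncated SVD can be sketched in base R as follows. This illustrates the general idea only and is not recommenderlab's internal code; the ratings are hypothetical and the rank k = 2 is an assumption.

```r
# Toy ratings with missing entries (hypothetical values); NA = not rated
R <- matrix(c( 5, NA,  1,
               4,  5, NA,
              NA,  1,  5,
               2,  1,  4),
            nrow = 4, byrow = TRUE)

# Column-mean imputation: replace each NA with the mean of its item column
col_means <- colMeans(R, na.rm = TRUE)
missing   <- which(is.na(R), arr.ind = TRUE)
R_imp     <- R
R_imp[missing] <- col_means[missing[, "col"]]

# Rank-2 truncated SVD of the imputed matrix
k <- 2
s <- svd(R_imp)
R_hat <- s$u[, 1:k] %*% diag(s$d[1:k]) %*% t(s$v[, 1:k])

# Predicted ratings for the cells that were originally missing
R_hat[missing]
```

Because every missing cell starts from the item's average rating, the predictions for those cells are pulled toward the column means, which is one way the bias mentioned above can arise.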

Funk SVD implements matrix decomposition via stochastic gradient descent, an optimization approach popularized by Simon Funk, to minimize the error on the known values. Funk SVD ignores the missing values and computes the latent factors using only the ratings we know. Conceptually, it is a simple iterative optimization process that assumes a cost function and arbitrary initial values for the optimization variables. Gradient descent has been shown to work well for optimizing MF models. However, it is not a popular choice of optimizer when the dimensionality of the original rating matrix is high. In practice, that case calls for parallelization and for better exploitation of the particular structure of the matrix factorization cost function.
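The iterative process can be sketched in base R as below. This is a simplified illustration under stated assumptions, not the recommenderlab implementation: it updates all k factors at once, runs a fixed number of epochs with no convergence check, and uses hypothetical ratings. The names P, Q and rmse_known are introduced here for illustration only.

```r
set.seed(1)

# Toy ratings (hypothetical values); NA = not rated
R <- matrix(c( 5, NA,  1,
               4,  5, NA,
              NA,  1,  5,
               2,  1,  4),
            nrow = 4, byrow = TRUE)

k      <- 2      # number of latent factors
gamma  <- 0.015  # learning rate
lambda <- 0.001  # L2 regularization weight
epochs <- 500    # fixed number of passes (assumption)

# Small random starting values for user factors P and item factors Q
P <- matrix(rnorm(nrow(R) * k, sd = 0.1), nrow(R), k)
Q <- matrix(rnorm(ncol(R) * k, sd = 0.1), ncol(R), k)

known <- which(!is.na(R), arr.ind = TRUE)

# RMSE measured on the known ratings only
rmse_known <- function(P, Q) {
  pred <- P %*% t(Q)
  sqrt(mean((R[known] - pred[known])^2))
}
rmse_start <- rmse_known(P, Q)

for (epoch in seq_len(epochs)) {
  # Visit the known ratings in random order; missing cells are never touched
  for (idx in sample(nrow(known))) {
    u <- known[idx, "row"]
    i <- known[idx, "col"]
    err   <- R[u, i] - sum(P[u, ] * Q[i, ])
    p_old <- P[u, ]
    P[u, ] <- P[u, ] + gamma * (err * Q[i, ] - lambda * P[u, ])
    Q[i, ] <- Q[i, ] + gamma * (err * p_old  - lambda * Q[i, ])
  }
}

rmse_end <- rmse_known(P, Q)
c(start = rmse_start, end = rmse_end)
```

After training, P %*% t(Q) gives a predicted rating for every user-item pair, including the cells that were never rated.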

# loading the recommenderlab package and reading the data

library(recommenderlab)
data(MovieLense)
movie_matrix <- MovieLense

# creating evaluation scheme

set.seed(123)

# 5-fold CV; ratings of 3 or higher are considered good; given = 5 means that 5 ratings per test user are given to the recommender and the rest are withheld for evaluation
es <- evaluationScheme(movie_matrix, method = "cross", train = 0.9, given = 5, goodRating = 3, k = 5)

Evaluating the user-based CF approach, which demonstrated the best results in project 2. The run below uses centered data with cosine similarity (method = "Cosine") as the similarity measure.

# testing user-based CF using centered data and cosine similarity to find neighbours

param_ubcf <- list(normalize = "center", method = "Cosine")
result_1 <- evaluate(es, method = "UBCF", type = "ratings", param = param_ubcf)

## UBCF run fold/sample [model time/prediction time]
##   1  [0.015sec/0.901sec] 
##   2  [0.006sec/0.726sec] 
##   3  [0.007sec/0.569sec] 
##   4  [0.006sec/0.783sec] 
##   5  [0.006sec/0.601sec]

avg(result_1)

##         RMSE      MSE       MAE
## res 1.122067 1.259213 0.8906996

# testing the SVD method using the following parameters: k = 100, maxiter = 100, normalize = center

param_svd <- list(normalize = "center", maxiter = 100, k = 100)
result_2 <- evaluate(es, method = "SVD", type = "ratings", param = param_svd)

## SVD run fold/sample [model time/prediction time]
##   1  [1.69sec/0.14sec] 
##   2  [1.746sec/0.155sec] 
##   3  [1.832sec/0.17sec] 
##   4  [1.779sec/0.165sec] 
##   5  [1.68sec/0.178sec]

avg(result_2)

##         RMSE      MSE       MAE
## res 1.144679 1.310516 0.9100549

# testing the Funk SVD method using the following parameters: k = 10, gamma = 0.015, lambda = 0.001, normalize = center, min_epochs = 50, max_epochs = 200
param_svdf <- list(normalize = "center", k = 10, gamma = 0.015, lambda = 0.001, min_epochs = 50, max_epochs = 200)
result_3 <- evaluate(es, method = "SVDF", type = "ratings", param = param_svdf)

## SVDF run fold/sample [model time/prediction time]
##   1  [93.983sec/13.64sec] 
##   2  [116.089sec/17.06sec] 
##   3  [92.758sec/14.478sec] 
##   4  [95.522sec/14.067sec] 
##   5  [111.908sec/16.809sec]

avg(result_3)

##         RMSE      MSE       MAE
## res 1.103049 1.217039 0.8697382

The models’ performance is summarized below:

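# avg() already returns a one-row matrix with RMSE, MSE and MAE columns,
# so the rows can be stacked directly; the table is named summary_tbl
# to avoid masking base::summary()
summary_tbl <- rbind(avg(result_1), avg(result_2), avg(result_3))
rownames(summary_tbl) <- c("UBCF", "SVD", "FUNK SVD")
summary_tbl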

##              RMSE      MSE       MAE
## UBCF     1.122067 1.259213 0.8906996
## SVD      1.144679 1.310516 0.9100549
## FUNK SVD 1.103049 1.217039 0.8697382

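Funk SVD has the lowest RMSE (1.103) compared to the user-based (1.122) and SVD (1.145) methods, though the difference is very small and comes at a much higher training cost (roughly 93 to 116 seconds per fold, versus under 2 seconds for the other two methods).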
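For reference, the three error measures in the table can be computed by hand as sketched below. The actual and predicted vectors are hypothetical values, not ratings from MovieLense.

```r
# Hypothetical actual and predicted ratings for five user-item pairs
actual    <- c(4, 3, 5, 2, 4)
predicted <- c(3.5, 3.0, 4.0, 3.0, 4.5)

err  <- actual - predicted
mse  <- mean(err^2)      # mean squared error
rmse <- sqrt(mse)        # root mean squared error
mae  <- mean(abs(err))   # mean absolute error

c(RMSE = rmse, MSE = mse, MAE = mae)
```

Within a single evaluation run MSE is exactly RMSE squared. In the summary table the two differ slightly (1.122067 squared is about 1.2590, not 1.259213), presumably because each measure is averaged over the five folds separately.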

Let’s look at the ROC curves of all three methods for 5, 10, 15 and 20 recommendations.

algorithms <- list(
  "user-based" = list(name = "UBCF", param = list(normalize = "center", method = "Cosine")),
  "SVD" = list(name = "SVD", param = list(normalize = "center", maxiter = 100, k = 100)),
  "Funk SVD" = list(name = "SVDF", param = list(normalize = "center", k = 10, gamma = 0.015, lambda = 0.001, min_epochs = 50, max_epochs = 200))
)

results <- evaluate(es, algorithms, n = c(5, 10, 15, 20))

## UBCF run fold/sample [model time/prediction time]
##   1  [0.006sec/0.609sec] 
##   2  [0.007sec/0.761sec] 
##   3  [0.007sec/0.597sec] 
##   4  [0.007sec/0.592sec] 
##   5  [0.164sec/0.6sec] 
## SVD run fold/sample [model time/prediction time]
##   1  [1.558sec/0.23sec] 
##   2  [1.595sec/0.219sec] 
##   3  [1.632sec/0.366sec] 
##   4  [1.46sec/0.371sec] 
##   5  [1.574sec/0.349sec] 
## SVDF run fold/sample [model time/prediction time]
##   1  [94.077sec/14.154sec] 
##   2  [93.422sec/16.88sec] 
##   3  [78.988sec/14.313sec] 
##   4  [76.705sec/14.232sec] 
##   5  [90.8sec/17.148sec]

# annotate indexes the curves to label; there are three algorithms here
plot(results, annotate = 1:3, legend = "topleft", main = "ROC")

As the ROC plot shows, user-based CF outperformed both latent factor methods for recommendation lists longer than 5 items.
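To make the ROC comparison concrete, the sketch below computes one (FPR, TPR) point per list length for a single hypothetical test user. All values and the helper roc_point are illustrative assumptions; recommenderlab's own bookkeeping may differ, for example in how unrated items are counted.

```r
# Hypothetical withheld ratings for one test user (10 items)
withheld <- c(i1 = 5, i2 = 2, i3 = 4, i4 = 1, i5 = 3,
              i6 = 2, i7 = 5, i8 = 1, i9 = 4, i10 = 2)
good_rating <- 3  # same threshold as goodRating in the evaluation scheme

# Hypothetical ranking produced by a recommender, best first
ranking <- c("i1", "i7", "i2", "i3", "i9", "i4", "i5", "i6", "i8", "i10")

# One ROC point for a top-n list
roc_point <- function(n) {
  top_n    <- ranking[seq_len(n)]
  relevant <- names(withheld)[withheld >= good_rating]
  tp <- sum(top_n %in% relevant)  # recommended and rated as good
  fp <- n - tp                    # recommended but not rated as good
  c(n   = n,
    TPR = tp / length(relevant),
    FPR = fp / (length(withheld) - length(relevant)))
}

roc <- t(sapply(c(2, 4, 6), roc_point))
roc
```

Longer lists move the point up and to the right: more good items are recovered, at the price of more items the user would not rate as good. A method whose curve lies above another's recovers more good items at the same false positive rate.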

Let’s build the complete model using the SVD method and make recommendations.

# splitting data into train and test sets
esf <- evaluationScheme(movie_matrix, method = "split", train = 0.9, given = 5, goodRating = 3)
train <- getData(esf, "train")
test <- getData(esf, "unknown")
test_known <- getData(esf, "known")
# building the SVD recommendation model
param_f <- list(normalize = "center", maxiter = 100, k = 100)
final_model <- Recommender(train, method = "SVD", param = param_f)
final_model

## Recommender of type 'SVD' for 'realRatingMatrix' 
## learned using 848 users.

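# getting recommendations (top 10)
# note: `test` holds the withheld ("unknown") ratings; the usual evaluation workflow predicts from test_known and scores the result against test
final_prediction <- predict(final_model, test, n = 10, type = "topNList")
final_prediction@items[1]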

## $`2`
##  [1] 181 275 420 744 187 249 268 504 564 188

final_prediction@ratings[1]

## $`2`
##  [1] 4.021754 3.973405 3.963762 3.932934 3.932772 3.931617 3.916135
##  [8] 3.900941 3.891237 3.869463

SVD is faster than user-based CF at making predictions (roughly 0.14 to 0.37 seconds per fold versus 0.57 to 0.90 seconds in the runs above). SVD addresses the scalability and sparsity problems of neighbourhood-based CF. However, SVD is not without flaws. Its main drawback is that it offers little to no explanation of why an item is recommended to a user.


Olga Shiligin

22/06/2019
