The purpose of the project is to implement a latent factor approach to collaborative filtering.
The latent factor approach leverages matrix factorization techniques to arrive at recommendations. A latent factor is not something that we observe in our data; rather, we infer it from the values (ratings) given by users to items. Matrix Factorization (MF), or matrix decomposition, is a method of splitting a matrix into multiple matrices whose product reproduces the original matrix (or approximates it when only a few factors are kept). Matrix Factorization is one of the most popular algorithms for solving the co-clustering problem.
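As an illustration of the factorization idea, the following minimal sketch uses base R's `svd()` on a small made-up rating matrix. The matrix values, the choice of `k = 2`, and the names `P` and `Q` are assumptions for the example only, not part of the project code.

```r
# Toy 4 x 3 rating matrix (users in rows, items in columns); values are made up
R <- matrix(c(5, 4, 1,
              4, 5, 1,
              1, 1, 5,
              2, 1, 4), nrow = 4, byrow = TRUE)

# Full decomposition: R = U %*% diag(d) %*% t(V)
s <- svd(R)
R_full <- s$u %*% diag(s$d) %*% t(s$v)

# Keep only k = 2 latent factors: user factors P (4 x 2), item factors Q (3 x 2)
k <- 2
P <- s$u[, 1:k] %*% diag(sqrt(s$d[1:k]))
Q <- s$v[, 1:k] %*% diag(sqrt(s$d[1:k]))
R_approx <- P %*% t(Q)

max(abs(R - R_full))            # reconstruction error of the full product
sqrt(mean((R - R_approx)^2))    # RMSE of the rank-2 approximation
```

With all factors the product recovers the matrix up to floating-point error; with fewer factors it is only an approximation, which is the trade-off latent factor models rely on.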
The following latent factor methods will be considered in this project: Singular Value Decomposition (SVD) and Funk SVD.
Apart from these methods, user-based CF will be evaluated as well, so that the performance of all the methods can be compared.
Singular Value Decomposition (SVD) is a matrix factorization technique that reduces the dimensionality of the input data. The idea is to find the directions of maximum variance and retain only those that explain a considerable share of the variation in the data. While SVD can achieve very good results on non-sparse data, it does not work as well on real-life data, which is typically very sparse. The SVD function in the recommenderlab package uses column-mean imputation as the default method for missing values. This usually works acceptably, but the results tend to be biased.
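The idea of column-mean imputation followed by a truncated SVD can be sketched in base R as follows. This is an illustration on made-up data under simplifying assumptions (no normalization, `k = 2`), not the recommenderlab implementation.

```r
# Toy rating matrix with missing values (NA); values are made up
R <- matrix(c( 5, NA,  1,  2,
               4,  5, NA,  1,
              NA,  1,  5,  4,
               1,  2,  4, NA), nrow = 4, byrow = TRUE)

# Replace each missing rating with the mean of the observed ratings in its column
col_means <- colMeans(R, na.rm = TRUE)
missing   <- which(is.na(R), arr.ind = TRUE)
R_filled  <- R
R_filled[missing] <- col_means[missing[, "col"]]

# Truncated SVD: keep only the k largest singular values
k <- 2
s <- svd(R_filled)
R_hat <- s$u[, 1:k] %*% diag(s$d[1:k]) %*% t(s$v[, 1:k])

# Predicted ratings for the cells that were originally missing
R_hat[missing]
```

Because the imputed cells are treated as if they were real ratings, they pull the factorization toward the column means, which is one way to see where the bias mentioned above comes from.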
Funk SVD implements matrix decomposition via the stochastic gradient descent optimization popularized by Simon Funk, minimizing the error on the known values. Funk SVD ignores the missing values and computes the latent factors using only the values we know. Conceptually, it is a simple iterative optimization process that assumes the existence of a cost function and arbitrary initial values for the optimization variables. Gradient descent has been shown to work well for optimizing MF models. However, it is not a popular choice of optimizer for MF when the dimensionality of the original rating matrix is high. In real life, this issue calls for a parallelization mechanism and better exploitation of the particular structure of the matrix factorization cost function.
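The optimization described above can be sketched in base R. This is a simplified variant on made-up data, not recommenderlab's `SVDF` implementation: the toy matrix, the fixed number of epochs, the joint update of all factors, and the helper `rmse_known()` are assumptions for illustration. The learning rate and regularization mirror the `gamma` and `lambda` values used later in this report.

```r
set.seed(1)

# Toy rating matrix; NA marks unknown ratings (values are made up)
R <- matrix(c( 5,  4, NA,  1,
               4, NA,  1,  1,
               1,  1, NA,  5,
              NA,  1,  5,  4), nrow = 4, byrow = TRUE)

k      <- 2      # number of latent factors
gamma  <- 0.015  # learning rate
lambda <- 0.001  # regularization strength
epochs <- 500    # hypothetical fixed number of passes over the known ratings

# Small random initial values for user factors P and item factors Q
P <- matrix(rnorm(nrow(R) * k, sd = 0.1), nrow(R), k)
Q <- matrix(rnorm(ncol(R) * k, sd = 0.1), ncol(R), k)

# Only the known cells take part in training
known <- which(!is.na(R), arr.ind = TRUE)

# Hypothetical helper: RMSE on the known ratings only
rmse_known <- function(P, Q) {
  pred <- P %*% t(Q)
  sqrt(mean((R[known] - pred[known])^2))
}

rmse_start <- rmse_known(P, Q)

for (epoch in seq_len(epochs)) {
  for (idx in sample(nrow(known))) {        # visit known ratings in random order
    u <- known[idx, "row"]
    i <- known[idx, "col"]
    err   <- R[u, i] - sum(P[u, ] * Q[i, ]) # prediction error on one rating
    p_old <- P[u, ]
    P[u, ] <- P[u, ] + gamma * (err * Q[i, ] - lambda * P[u, ])
    Q[i, ] <- Q[i, ] + gamma * (err * p_old  - lambda * Q[i, ])
  }
}

rmse_end <- rmse_known(P, Q)
c(start = rmse_start, end = rmse_end)
round(P %*% t(Q), 2)   # filled-in matrix, including the previously unknown cells
```

Each update touches one user row and one item row, so the cost per epoch grows with the number of known ratings, which is consistent with the long model times reported for SVDF below.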
# loading the recommenderlab package and reading the data
library(recommenderlab)
data(MovieLense)
movie_matrix <- MovieLense
# creating evaluation scheme
set.seed(123)
# 5-fold CV; ratings of 3 and above are considered good; 5 ratings per test user are given to the recommender and the rest are withheld for evaluation
es<- evaluationScheme(movie_matrix, method = "cross", train = 0.9, given = 5, goodRating = 3, k = 5)
Evaluating the user-based CF approach with cosine similarity as the similarity measure. User-based CF demonstrated the best results in project 2.
# testing user-based CF using centered data and cosine similarity to find neighbours
param_ubcf <- list(normalize = "center", method = "Cosine")
result_1<- evaluate(es, method = "UBCF", type = "ratings", param = param_ubcf)
## UBCF run fold/sample [model time/prediction time]
## 1 [0.015sec/0.901sec]
## 2 [0.006sec/0.726sec]
## 3 [0.007sec/0.569sec]
## 4 [0.006sec/0.783sec]
## 5 [0.006sec/0.601sec]
avg(result_1)
## RMSE MSE MAE
## res 1.122067 1.259213 0.8906996
# testing SVD method using the following parameters: k = 100, maxiter = 100, normalize = center
param_svd <- list(normalize = "center", maxiter = 100, k = 100)
result_2<- evaluate(es, method = "SVD", param = param_svd, type = "ratings")
## SVD run fold/sample [model time/prediction time]
## 1 [1.69sec/0.14sec]
## 2 [1.746sec/0.155sec]
## 3 [1.832sec/0.17sec]
## 4 [1.779sec/0.165sec]
## 5 [1.68sec/0.178sec]
avg(result_2)
## RMSE MSE MAE
## res 1.144679 1.310516 0.9100549
# testing Funk SVD method using the following parameters: k = 10, gamma = 0.015, lambda = 0.001, normalize = center, min_epochs = 50, max_epochs = 200
param_svdf<- list(normalize="center", k = 10, gamma = 0.015,lambda = 0.001, min_epochs = 50, max_epochs = 200)
result_3<- evaluate(es, method = "SVDF", type = "ratings", param = param_svdf)
## SVDF run fold/sample [model time/prediction time]
## 1 [93.983sec/13.64sec]
## 2 [116.089sec/17.06sec]
## 3 [92.758sec/14.478sec]
## 4 [95.522sec/14.067sec]
## 5 [111.908sec/16.809sec]
avg(result_3)
## RMSE MSE MAE
## res 1.103049 1.217039 0.8697382
The models' performance is summarized below:
# stacking the averaged error measures (RMSE, MSE, MAE) of the three models
summary_tbl <- rbind(avg(result_1), avg(result_2), avg(result_3))
rownames(summary_tbl) <- c("UBCF", "SVD", "FUNK SVD")
summary_tbl
## RMSE MSE MAE
## UBCF 1.122067 1.259213 0.8906996
## SVD 1.144679 1.310516 0.9100549
## FUNK SVD 1.103049 1.217039 0.8697382
Funk SVD has the lowest RMSE compared to the user-based and SVD methods, although the difference is very small.
Let's look at the ROC curves of all three methods for 5, 10, 15 and 20 recommendations.
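For reference, the three error measures in the table can be computed from held-out and predicted ratings as in the sketch below. The vectors are made-up values for illustration only. Note that the figures reported above are averages over the five folds, so the squared average RMSE need not match the average MSE exactly.

```r
# Hypothetical held-out ratings and model predictions (made-up values)
actual    <- c(4, 3, 5, 2, 1)
predicted <- c(3.5, 3.0, 4.0, 2.5, 2.0)

err  <- actual - predicted
mse  <- mean(err^2)      # mean squared error
rmse <- sqrt(mse)        # root mean squared error
mae  <- mean(abs(err))   # mean absolute error

c(RMSE = rmse, MSE = mse, MAE = mae)
```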
algorithms <- list(
"user-based" = list(name="UBCF", param=list(normalize="center", method = "Cosine")),
"SVD" = list(name="SVD", param=list(normalize="center", maxiter = 100, k =100)),
"Funk SVD" = list(name = "SVDF", param = list(normalize="center", k = 10, gamma = 0.015,lambda = 0.001, min_epochs = 50, max_epochs = 200))
)
results <- evaluate(es, algorithms, n=c(5, 10, 15, 20))
## UBCF run fold/sample [model time/prediction time]
## 1 [0.006sec/0.609sec]
## 2 [0.007sec/0.761sec]
## 3 [0.007sec/0.597sec]
## 4 [0.007sec/0.592sec]
## 5 [0.164sec/0.6sec]
## SVD run fold/sample [model time/prediction time]
## 1 [1.558sec/0.23sec]
## 2 [1.595sec/0.219sec]
## 3 [1.632sec/0.366sec]
## 4 [1.46sec/0.371sec]
## 5 [1.574sec/0.349sec]
## SVDF run fold/sample [model time/prediction time]
## 1 [94.077sec/14.154sec]
## 2 [93.422sec/16.88sec]
## 3 [78.988sec/14.313sec]
## 4 [76.705sec/14.232sec]
## 5 [90.8sec/17.148sec]
plot(results, annotate = 1:4, legend="topleft", main = "ROC")
As we can see, user-based CF outperformed the latent factor methods in terms of accuracy when the number of recommendations is greater than 5.
Let's build the complete model using the SVD method and make recommendations.
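To make the ROC comparison concrete, the sketch below shows how a single point (true positive rate against false positive rate) can be derived for one user's top-n list. The item labels, the set of relevant items, and the helper `roc_point()` are hypothetical. The curves plotted above aggregate such counts over test users and folds.

```r
# Hypothetical single test user: items ranked by predicted rating (best first)
ranked_items <- c("A", "B", "C", "D", "E", "F", "G", "H")
# Items the user actually rated at or above the goodRating threshold (made up)
relevant <- c("A", "C", "D", "G")

# Hypothetical helper: TPR and FPR when the top n items are recommended
roc_point <- function(n) {
  recommended <- head(ranked_items, n)
  tp <- sum(recommended %in% relevant)          # good items that were recommended
  fp <- n - tp                                  # recommended but not good
  fn <- length(relevant) - tp                   # good items that were missed
  tn <- length(ranked_items) - tp - fp - fn     # correctly left out
  c(n = n, TPR = tp / (tp + fn), FPR = fp / (fp + tn))
}

t(sapply(c(2, 4, 6), roc_point))
```

A method whose curve lies closer to the top-left corner recovers more of the good items for the same share of false recommendations.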
# splitting data into train and test sets
esf<- evaluationScheme(movie_matrix, method = "split", train = 0.9, given = 5, goodRating = 3)
train <-getData(esf, "train")
test <-getData(esf, "unknown")
test_known <- getData(esf, "known")
# building the SVD recommendation model
param_f<- list(normalize = "center", maxiter = 100, k =100)
final_model <- Recommender(train, method = "SVD", param = param_f)
final_model
## Recommender of type 'SVD' for 'realRatingMatrix'
## learned using 848 users.
# getting recommendations (top 10)
final_prediction<- predict (final_model, test, n = 10, type = "topNList")
final_prediction@items[1]
## $`2`
## [1] 181 275 420 744 187 249 268 504 564 188
final_prediction@ratings[1]
## $`2`
## [1] 4.021754 3.973405 3.963762 3.932934 3.932772 3.931617 3.916135
## [8] 3.900941 3.891237 3.869463
SVD is faster than user-based collaborative filtering at making predictions. SVD successfully handles the scalability and sparsity problems posed by CF. However, SVD is not without flaws. Its main drawback is that it offers little to no explanation of why an item is recommended to a user.