library(tidyverse)       # attaches dplyr and ggplot2, among others
library(kableExtra)
library(knitr)
library(recommenderlab)
library(ggrepel)
library(tictoc)

Goal

The goal of this assignment is to give you practice working with accuracy and other recommender system metrics.

In this assignment you’re asked to do at least one or (if you like) both of the following:

  • Work in a small group, and/or

  • Choose a dataset different from the ones used in your previous projects.

Deliverables

  1. As in your previous assignments, compare the accuracy of at least two recommender system algorithms against your offline data.

  2. Implement support for at least one business or user experience goal such as increased serendipity, novelty, or diversity.

  3. Compare and report on any change in accuracy before and after you’ve made the change in #2.

  4. As part of your textual conclusion, discuss one or more additional experiments that could be performed, and/or metrics that could be evaluated, only if online evaluation were possible. Also, briefly propose how you would design a reasonable online evaluation environment.

Data Load

The dataset I chose for the project is the MovieLense data set from the CRAN recommenderlab package.

This is the 100k MovieLense ratings data set. The data was collected through the MovieLens web site (movielens.umn.edu) during the seven-month period from September 19th, 1997 through April 22nd, 1998. The data set contains about 100,000 ratings (1-5) from 943 users on 1664 movies. Movie metadata is also provided in MovieLenseMeta.

The data set originates from the GroupLens MovieLens project.

We use the recommenderlab package to develop and test our recommender system using different recommendation algorithms.

The MovieLense 100k dataset is included with recommenderlab.

data("MovieLense")

Data Preparation

Dropping users and movies with few ratings: we keep users with at least 50 ratings and movies with at least 100 ratings.

r <- MovieLense[rowCounts(MovieLense) >= 50, colCounts(MovieLense) >= 100]
r
## 565 x 336 rating matrix of class 'realRatingMatrix' with 55832 ratings.
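An illustrative check of how much data the reduction keeps (the retention ratio below is our own sanity check, not part of the original analysis):

# Dimensions before and after the reduction
dim(MovieLense)
dim(r)
# Share of the original ratings retained
sum(rowCounts(r)) / sum(rowCounts(MovieLense))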

Data Exploration

ratings <- as.vector(MovieLense@data)  # includes 0s for unrated user/movie pairs
cat("Table of Movie Lense Ratings")
## Table of Movie Lense Ratings
kable((table_ratings <- table(ratings))) %>%
  kable_styling("striped")
ratings     Freq
      0  1469760
      1     6059
      2    11307
      3    27002
      4    33947
      5    21077
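Note that the 1,469,760 zeros are not real ratings: in the sparse rating matrix, a 0 marks a user/movie pair with no rating. A minimal sketch of excluding them before summarizing (the name ratings_nonzero is ours):

# Drop the 0 entries, which encode "not rated" rather than a score
ratings_nonzero <- ratings[ratings != 0]
table(ratings_nonzero)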
mldf <- as(MovieLense, 'data.frame')
head(mldf)
summary(rowCounts(MovieLense))
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    19.0    32.0    64.0   105.4   147.5   735.0
hist(rowCounts(MovieLense), breaks = 40, main = "Distribution of Rating Count per User")

hist(colCounts(MovieLense), breaks = 40, main = "Distribution of Rating Count per Movie")

hist(rowCounts(r), breaks = 40, main = "Distribution of Rating Count per User - Reduced")

hist(colCounts(r), breaks = 40, main = "Distribution of Rating Count per Movie - Reduced")

Using the dplyr function glimpse() to take a closer look at the data:

glimpse(mldf)
## Observations: 99,392
## Variables: 3
## $ user   <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
## $ item   <fct> "Toy Story (1995)", "GoldenEye (1995)", "Four Rooms (1995)",...
## $ rating <dbl> 5, 3, 4, 3, 3, 5, 4, 1, 5, 3, 2, 5, 5, 5, 5, 5, 3, 4, 5, 4, ...

Summarize the rating variable using the base R function summary():

summary(mldf$rating)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1.00    3.00    4.00    3.53    4.00    5.00

Based on the above findings, the MovieLense Dataset has 99,392 observations with 3 variables.

In addition, the ratings column has a mean of 3.53 and a median of 4.00.

Since the mean falls below the median, the ratings, scored on a scale from 1 to 5, are slightly skewed to the left.

Similarity Matrix

We compute the cosine similarity between the first four users:

similarity_users <- similarity(MovieLense[1:4,],
                               method = "cosine",
                               which = "users")
cat("Similarity Matrix User's Output","\n")
## Similarity Matrix User's Output
(similarityMatrix <- as.matrix(similarity_users))
##           1         2         3         4
## 1 0.0000000 0.9605820 0.8339504 0.9192637
## 2 0.9605820 0.0000000 0.9268716 0.9370341
## 3 0.8339504 0.9268716 0.0000000 0.9130323
## 4 0.9192637 0.9370341 0.9130323 0.0000000
image(as.matrix(similarity_users), main = "MovieLense: A Similarity of Users")
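The same function can compute item-to-item similarity by setting which = "items". A brief sketch for the first four movies (not part of the original analysis):

# Cosine similarity between the first four movies
similarity_items <- similarity(MovieLense[, 1:4], method = "cosine", which = "items")
as.matrix(similarity_items)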

getRatingMatrix(MovieLense)[1:10, 1:30] 
## 10 x 30 sparse Matrix of class "dgCMatrix"
##    [[ suppressing 30 column names 'Toy Story (1995)', 'GoldenEye (1995)', 'Four Rooms (1995)' ... ]]
##                                                               
## 1  5 3 4 3 3 5 4 1 5 3 2 5 5 5 5 5 3 4 5 4 1 4 4 3 4 3 2 4 1 3
## 2  4 . . . . . . . . 2 . . 4 4 . . . . 3 . . . . . 4 . . . . .
## 3  . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
## 4  . . . . . . . . . . 4 . . . . . . . . . . . . . . . . . . .
## 5  4 3 . . . . . . . . . . . . . . 4 . . . 3 . . 4 3 . . . 4 .
## 6  4 . . . . . 2 4 4 . . 4 2 5 3 . . . 4 . 3 3 4 . . . . 2 . .
## 7  . . . 5 . . 5 5 5 4 3 5 . . . . . . . . . 5 3 . 3 . 4 5 3 .
## 8  . . . . . . 3 . . . 3 . . . . . . . . . . 5 . . . . . . . .
## 9  . . . . . 5 4 . . . . . . . . . . . . . . . . . . . . . . .
## 10 4 . . 4 . . 4 . 4 . 4 5 3 . . 4 . . . . . 5 5 . . . . . . .

Recommender Models

recommender_models <- recommenderRegistry$get_entries(
    dataType = "realRatingMatrix")
names(recommender_models)
##  [1] "ALS_realRatingMatrix"          "ALS_implicit_realRatingMatrix"
##  [3] "IBCF_realRatingMatrix"         "LIBMF_realRatingMatrix"       
##  [5] "POPULAR_realRatingMatrix"      "RANDOM_realRatingMatrix"      
##  [7] "RERECOMMEND_realRatingMatrix"  "SVD_realRatingMatrix"         
##  [9] "SVDF_realRatingMatrix"         "UBCF_realRatingMatrix"

For this analysis of MovieLense ratings accuracy, we will explore several recommender models (default parameters for each can be inspected as sketched after the list):

  • RANDOM: randomly chosen items
  • SVD: singular value decomposition
  • SVDF: singular value decomposition with stochastic gradient descent optimization
  • ALS: alternating least squares algorithm for latent factors
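As a hedged sketch, the registry entries retrieved above also expose each method's default parameters, which is useful to review before tuning:

# Default parameters recorded in the recommender registry
recommender_models$SVD_realRatingMatrix$parameters
recommender_models$ALS_realRatingMatrix$parameters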

Development

# 90/10 train/test split, 15 given ratings per test user, ratings >= 3.5 count as "good"
e <- evaluationScheme(r, method = "split", train = 0.9, given = 15, goodRating = 3.5)
a <- list(
    "Random" = list(name="RANDOM", param = NULL),
    "SVD" = list(name="SVD", param = NULL),
    "SVDF" = list(name="SVDF", param = NULL),
    "ALS" = list(name="ALS", param = NULL)
)
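A quick sanity check of the scheme's partitions, sketched with recommenderlab's getData accessor:

dim(getData(e, "train"))    # users the models learn from
dim(getData(e, "known"))    # the 15 "given" ratings per test user
dim(getData(e, "unknown"))  # held-out ratings used for scoring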

The goal of this section is to first train the models on the training set, then use the test set to predict ratings and evaluate the prediction accuracy of the various recommender models.

We measure the accuracy of the predicted vs. the actual ratings using several error metrics (defined in the sketch after this list):

  • RMSE: root mean squared error
  • MSE: mean squared error
  • MAE: mean absolute error
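For reference, here are illustrative definitions over an error vector err = predicted - actual; the helper names are ours, not recommenderlab's:

rmse <- function(err) sqrt(mean(err^2, na.rm = TRUE))  # root mean squared error
mse  <- function(err) mean(err^2, na.rm = TRUE)        # mean squared error
mae  <- function(err) mean(abs(err), na.rm = TRUE)     # mean absolute error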

Run Models for Predicted Ratings

results <- evaluate(e, a, type = "ratings")
## RANDOM run fold/sample [model time/prediction time]
##   1  [0sec/0.05sec] 
## SVD run fold/sample [model time/prediction time]
##   1  [0.06sec/0sec] 
## SVDF run fold/sample [model time/prediction time]
##   1  [15.03sec/1.65sec] 
## ALS run fold/sample [model time/prediction time]
##   1  [0sec/9.4sec]
plot(results)
title(main = "Predicted vs. Actual Ratings")
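The numbers behind the plot can be pulled out with recommenderlab's avg(); with a single split, the average is just the one run (a sketch, output omitted):

# Averaged error metrics per algorithm
lapply(avg(results), round, 3)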

Evaluation of Initial Models

The SVD, SVDF, and ALS Models all achieve higher accuracy (lower error) than the RANDOM Selection Model.

On the other hand, while the SVDF method has slightly lower error metrics than the ALS method, it takes significantly longer to evaluate the model.

Combined/Hybrid Recommender Model

In previous projects, when we compared UBCF (User-Based Collaborative Filtering) and IBCF (Item-Based Collaborative Filtering) Models, the recommended movies tended to be dominated by the most popular movies.

This result stems from the strong correlation between ratings and popularity.

To counteract this effect, we create a combined, or hybrid, model that adds variety to the recommended movies.

SVDF + RANDOM Hybrid

We build a Hybrid Model from the most accurate model explored above, SVDF, adding variety through the RANDOM Selection Model.

Specifically, we create a hybrid recommender based on a 50/50 ratio of the SVDF and RANDOM Models.

svdf_model <- Recommender(getData(e, "train"), "SVDF")
svdf_model
## Recommender of type 'SVDF' for 'realRatingMatrix' 
## learned using 508 users.
random_model <- Recommender(getData(e, "train"), "RANDOM")
random_model
## Recommender of type 'RANDOM' for 'realRatingMatrix' 
## learned using 508 users.
hybrid_model <- HybridRecommender(
    svdf_model,
    random_model,
    weights = c(0.5, 0.5)
)
hybrid_model
## Recommender of type 'HYBRID' for 'ratingMatrix' 
## learned using NA users.

Evaluation of Combined Recommender Model

For the hybrid model, we use the same evaluation scheme e as before (90/10 train/test split with the given-15 protocol) to predict ratings in the test set and then evaluate prediction accuracy.

svdf_predict <- predict(svdf_model, getData(e, "known"), type = "ratings")
svdf_predict
## 57 x 336 rating matrix of class 'realRatingMatrix' with 18297 ratings.
random_predict <- predict(random_model, getData(e, "known"), type = "ratings")
random_predict
## 57 x 336 rating matrix of class 'realRatingMatrix' with 18297 ratings.
hybrid_predict <- predict(hybrid_model, getData(e, "known"), type = "ratings")
hybrid_predict
## 57 x 336 rating matrix of class 'realRatingMatrix' with 18297 ratings.
error_metrics <- rbind(
    SVDF = calcPredictionAccuracy(svdf_predict, getData(e, "unknown")),
    RANDOM = calcPredictionAccuracy(random_predict, getData(e, "unknown")),
    HYBRID = calcPredictionAccuracy(hybrid_predict, getData(e, "unknown"))
    )
error_metrics %>% round(3) %>% kable(caption = "Error Metrics for Hybrid Model of SVDF & RANDOM")
Error Metrics for Hybrid Model of SVDF & RANDOM

           RMSE    MSE    MAE
SVDF      0.956  0.913  0.746
RANDOM    1.342  1.801  1.049
HYBRID    1.052  1.107  0.827

The prediction accuracy of the hybrid recommender falls between those of the SVDF and RANDOM Models, though closer to the SVDF Model.

Using the 50/50 split of the SVDF and RANDOM Models, the RMSE of the hybrid model weakened by roughly 10%, from 0.956 to 1.052, when compared to the SVDF model.

In exchange for this slight decrease in rating accuracy, movie viewers may enjoy a wider array of movie recommendations.
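A natural follow-up, sketched below but not run here, is to sweep the SVDF/RANDOM weighting and watch RMSE rise as the random share grows (weights_grid and w are our own names):

# Trade accuracy for variety by varying the SVDF share w
weights_grid <- c(0.9, 0.7, 0.5)
sapply(weights_grid, function(w) {
  h <- HybridRecommender(svdf_model, random_model, weights = c(w, 1 - w))
  p <- predict(h, getData(e, "known"), type = "ratings")
  calcPredictionAccuracy(p, getData(e, "unknown"))["RMSE"]
})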

Conclusion

Exploring the RANDOM, SVD, SVDF, and ALS Models on error metrics such as RMSE, MSE, and MAE, the SVDF Model had the highest prediction accuracy.

In this project, we created a hybrid recommender using a 50/50 blend of the SVDF and RANDOM Models in order to add variety to the recommendations.

The error metrics of the hybrid model fell between the error metrics of the SVDF and RANDOM models.

Compared to the SVDF model, the RMSE of the hybrid model increased by roughly 10% when adding diversity from the RANDOM model.