Project Instructions

The goal of this assignment is for you to try out different ways of implementing and configuring a recommender, and to evaluate your different approaches. For assignment 2, start with an existing dataset of user-item ratings, such as MovieLens. Implement at least two of the suggested recommendation algorithms.

You should evaluate and compare different approaches, using different algorithms, normalization techniques, similarity methods, neighborhood sizes, etc. You don’t need to be exhaustive—these are just some suggested possibilities.

You may use the course text’s recommenderlab or any other library that you want. Please provide at least one graph, and a textual summary of your findings and recommendations.

Introduction and motivation

In this project we will use the built-in MovieLense dataset from the recommenderlab library. We will implement and compare results from several recommendation algorithms. Our recommenders will predict movie ratings given user and movie profiles as well as users’ past ratings.

Load Data

Our data is stored as a realRatingMatrix, where each row represents a user and each column represents a movie. We can see that there are a total of 99,392 ratings, so not every user has watched every movie. This means that our matrix is sparse – most cells have no information.
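A minimal sketch of how the data can be loaded and inspected with recommenderlab (the print method reports the dimensions and total number of ratings shown below):

```r
library(recommenderlab)

# MovieLense ships with recommenderlab as a realRatingMatrix (users x movies)
data(MovieLense)

# Printing the object reports its dimensions and total number of ratings
MovieLense
```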

## 943 x 1664 rating matrix of class 'realRatingMatrix' with 99392 ratings.

Explore Data Set

With reference to code by Hahsler (2019), we explored the dataset to get a sense of how ratings are distributed across users and items. This will inform how we subset the data for modeling.

Distribution of ratings by user and item

First, we can take a look at the number of ratings per user.

The data is severely right skewed, which indicates that most users have only rated a few movies.
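A short sketch of one way to compute and plot this, assuming MovieLense is loaded as above:

```r
# Number of ratings per user (row counts of the rating matrix)
ratings_per_user <- rowCounts(MovieLense)
hist(ratings_per_user, breaks = 50,
     main = "Ratings per user", xlab = "Number of movies rated")
```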

Next, we can take a look at the number of ratings per movie.

This distribution is also severely right skewed, which means that most movies only have a few ratings.
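The item-side counterpart uses colCounts():

```r
# Number of ratings per movie (column counts of the rating matrix)
ratings_per_movie <- colCounts(MovieLense)
hist(ratings_per_movie, breaks = 50,
     main = "Ratings per movie", xlab = "Number of ratings received")
```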

This aligns with our original intuition: given the skewness of both distributions, it is clear that the dataset is quite sparse.

Overall distribution of ratings

Next, we can look at the distribution of ratings.

The overall distribution of ratings is left skewed, indicating that people tend to rate movies more positively. The mean rating is confirmed below:
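A sketch of how the rating distribution and its mean can be obtained (getRatings() extracts only the non-missing cells):

```r
# Extract all observed ratings from the sparse matrix
all_ratings <- getRatings(MovieLense)

hist(all_ratings, breaks = 10, main = "Distribution of ratings", xlab = "Rating")
mean(all_ratings)  # should match the value reported below
```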

## [1] 3.587565

View of metadata and heatmap of ratings

The format of MovieLenseMeta is a data frame with movie title, year, IMDb URL and indicator variables for 19 genres.
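This structure can be inspected directly; str() produces the summary shown below:

```r
# Movie metadata: title, year, IMDb URL and 19 genre indicator columns
str(MovieLenseMeta)
```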

## 'data.frame':    1664 obs. of  22 variables:
##  $ title      : chr  "Toy Story (1995)" "GoldenEye (1995)" "Four Rooms (1995)" "Get Shorty (1995)" ...
##  $ year       : num  1995 1995 1995 1995 1995 ...
##  $ url        : chr  "http://us.imdb.com/M/title-exact?Toy%20Story%20(1995)" "http://us.imdb.com/M/title-exact?GoldenEye%20(1995)" "http://us.imdb.com/M/title-exact?Four%20Rooms%20(1995)" "http://us.imdb.com/M/title-exact?Get%20Shorty%20(1995)" ...
##  $ unknown    : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Action     : int  0 1 0 1 0 0 0 0 0 0 ...
##  $ Adventure  : int  0 1 0 0 0 0 0 0 0 0 ...
##  $ Animation  : int  1 0 0 0 0 0 0 0 0 0 ...
##  $ Children's : int  1 0 0 0 0 0 0 1 0 0 ...
##  $ Comedy     : int  1 0 0 1 0 0 0 1 0 0 ...
##  $ Crime      : int  0 0 0 0 1 0 0 0 0 0 ...
##  $ Documentary: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Drama      : int  0 0 0 1 1 1 1 1 1 1 ...
##  $ Fantasy    : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Film-Noir  : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Horror     : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Musical    : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Mystery    : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Romance    : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Sci-Fi     : int  0 0 0 0 0 0 1 0 0 0 ...
##  $ Thriller   : int  0 1 1 0 1 0 0 0 0 0 ...
##  $ War        : int  0 0 0 0 0 0 0 0 0 1 ...
##  $ Western    : int  0 0 0 0 0 0 0 0 0 0 ...

We can also build a heatmap of the most relevant users/movies, i.e. the users that rated the most movies and the movies that have the most ratings. Dark columns represent highly rated movies and dark rows represent users that give high ratings. This indicates the need for normalization of the data – we don’t want users who consistently rate high or low to skew the results.
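A hedged sketch of how such a heatmap can be drawn; the 99th-percentile cutoffs are illustrative choices, not necessarily the ones used for the figure:

```r
# Keep only the heaviest raters and the most-rated movies
min_n_movies <- quantile(rowCounts(MovieLense), 0.99)  # illustrative cutoff
min_n_users  <- quantile(colCounts(MovieLense), 0.99)  # illustrative cutoff

image(MovieLense[rowCounts(MovieLense) > min_n_movies,
                 colCounts(MovieLense) > min_n_users],
      main = "Heatmap of the top users and movies")
```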

Prepare Data For Modeling

Our data prep will involve: selecting relevant data (movies that have a lot of ratings and users that have seen a lot of movies) and splitting into training and test sets. Once we’ve cleaned things up, we can start to build our recommender models.

Select relevant data

It is obvious that the dataset is very sparse: many users rated very few movies, and many movies have very few ratings. In order to have a healthy baseline on which to build our recommendation models, we will only consider users who have rated at least 50 movies and movies that have been rated by at least 100 users. A look at the distribution of average rating per user shows large variation across the subset.
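A sketch of this filtering step and the per-user average (object names are our own):

```r
# Users with >= 50 ratings and movies with >= 100 ratings
ratings_movies <- MovieLense[rowCounts(MovieLense) >= 50,
                             colCounts(MovieLense) >= 100]

# Distribution of average rating per user across the subset
avg_rating_per_user <- rowMeans(ratings_movies)
hist(avg_rating_per_user, breaks = 30,
     main = "Average rating per user", xlab = "Average rating")
```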


Split into training and testing

Next, we can split the data into train and test sets. We are left with 462 records in the training set and 98 records in the test set.
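One way to produce such a split; the exact proportion and seed used here are assumptions, so the counts may differ slightly from those reported below:

```r
set.seed(1)  # assumed seed for reproducibility

train_rows <- sample(seq_len(nrow(ratings_movies)),
                     size = round(0.8 * nrow(ratings_movies)))
train <- ratings_movies[train_rows, ]
test  <- ratings_movies[-train_rows, ]

paste0("Training Set: ", nrow(train), " records | Test Set: ", nrow(test), " records")
```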

## [1] "Training Set: 462 records | Test Set: 98 records"

Recommendation Algorithms

Now that we’ve cleaned up our data, we can configure and implement our recommender systems. There are 3 main filtering approaches: content-based, user-user collaborative, and item-item collaborative.

In content-based filtering, a profile is built for each user from the attributes of the items they have rated, and the algorithm recommends new items that are similar to those previously rated items.

In collaborative filtering, recommendations are based on information about similar users (user-user) or similar items (item-item). The recommenderlab library contains 3 similarity functions that we will explore: cosine, pearson, and jaccard.
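For illustration, these similarity measures can be computed directly with similarity(); the small slices below are purely illustrative:

```r
# Cosine similarity between the first five users
similarity(MovieLense[1:5, ], method = "cosine", which = "users")

# Pearson similarity between the first five movies
similarity(MovieLense[, 1:5], method = "pearson", which = "items")
```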

For this project, we will focus on user-user and item-item based collaborative filtering.

User-User Collaborative Filtering

User-based collaborative filtering: This recommends to a user the items that are the most preferred by similar users. Our approach will be to:

  1. Measure the similarity between every pair of users.
  2. Identify the most similar users.
  3. Aggregate the ratings from the movies watched by similar users.
  4. Select the top rated movies for recommendation.

The parameters for this approach are similar to those of the item-item filtering: the similarity method (cosine, pearson, and jaccard), the number of nearest neighbors, and the normalization technique.

We’ll start off with a simple, non-optimized user-user model to get a feel for the functions. We’ll fit the model on our training set and generate recommendations for the users in our test set. Here, we’ve set the number of recommendations to 5, the similarity method to cosine, and the number of neighbors to 5. We’ll plot the distribution of recommendations.
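A hedged sketch of this first model, assuming the train/test split created earlier (object names and the plotting approach are our own):

```r
# Simple UBCF model: cosine similarity, 5 nearest neighbours
ubcf_simple <- Recommender(train, method = "UBCF",
                           parameter = list(method = "cosine", nn = 5))

# Top-5 recommendations for every user in the test set
ubcf_top5 <- predict(ubcf_simple, test, n = 5)
rec_list  <- as(ubcf_top5, "list")

# Distribution of how many times each movie is recommended
rec_counts <- table(unlist(rec_list))
hist(as.numeric(rec_counts), breaks = 30,
     main = "Times each movie is recommended", xlab = "Number of recommendations")
```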

We can see that most movies are recommended only a few times.

Now, we can optimize our model by tweaking the similarity method, number of neighbors, and normalization type. We’ll create a function to calculate and record the RMSE, MSE, and MAE for each set of parameters.

In this case, we’ll define an evaluation scheme consisting of 10-fold cross-validation, with a “good rating” defined as 3. We will then cycle through all parameter combinations to identify the optimal one.
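A sketch of how one cell of this parameter grid can be evaluated; the `given = 15` argument and the helper’s name are assumptions:

```r
# 10-fold cross-validation, ratings >= 3 treated as "good"
scheme <- evaluationScheme(ratings_movies, method = "cross-validation",
                           k = 10, given = 15, goodRating = 3)

# Error for one UBCF parameter combination (similarity, neighbours, normalization)
ubcf_error <- function(sim, nn, norm) {
  rec  <- Recommender(getData(scheme, "train"), method = "UBCF",
                      parameter = list(method = sim, nn = nn, normalize = norm))
  pred <- predict(rec, getData(scheme, "known"), type = "ratings")
  calcPredictionAccuracy(pred, getData(scheme, "unknown"))  # RMSE, MSE, MAE
}

# The full grid loops over every combination; one call looks like this:
ubcf_error("pearson", 100, "center")
```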

We can now visualize our errors:

To identify which set of parameters to use, we can identify the combination that has the lowest error.

##        RMSE       MSE      MAE similarity neighbors   norm ERROR_MEAN
## 1 0.9588247 0.9193448 0.748171    pearson       100 center  0.8754469

This is the set of parameters we will use for the final comparison.

Item-Item Collaborative Filtering

Item-based collaborative filtering: This recommends to a user the items that are most similar to the items the user has previously rated. We will follow this process:

  1. Measure the similarity in rating between every pair of movies.
  2. Identify the k-most similar movies for each movie title.
  3. Identify the movies most similar to previously rated movies for each user.

Here, we mimic our prior approach to user-based collaborative filtering.
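A minimal sketch of the analogous IBCF model, mirroring the earlier UBCF setup (using cosine similarity and keeping 5 similar items per movie is our assumption for this quick pass):

```r
# Simple IBCF model: cosine similarity, 5 most-similar items kept per movie
ibcf_simple <- Recommender(train, method = "IBCF",
                           parameter = list(method = "cosine", k = 5))
ibcf_top5 <- predict(ibcf_simple, test, n = 5)

# Distribution of how often each movie is recommended
ibcf_counts <- table(unlist(as(ibcf_top5, "list")))
hist(as.numeric(ibcf_counts), breaks = 30,
     main = "Times each movie is recommended (IBCF)")
```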

The results are very similar to UBCF – there are a large number of movies that are only recommended once.

Next, we’ll define an IBCF-specific function that will calculate the error associated with differing parameters.

Finally, we’ll cycle through all combinations of our parameters and calculate the error associated with each.
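A hedged IBCF analogue of the tuning helper above, reusing the same evaluation scheme (the helper’s name is our own):

```r
# Error for one IBCF parameter combination (similarity, k-closest items, normalization)
ibcf_error <- function(sim, k, norm) {
  rec  <- Recommender(getData(scheme, "train"), method = "IBCF",
                      parameter = list(method = sim, k = k, normalize = norm))
  pred <- predict(rec, getData(scheme, "known"), type = "ratings")
  calcPredictionAccuracy(pred, getData(scheme, "unknown"))
}

# One cell of the grid; the full loop covers every combination
ibcf_error("pearson", 100, "Z-score")
```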

We can identify which set of parameters yields the lowest average error. We will use this as the basis of comparison against the UBCF methodology.

The combination of parameters that minimizes the mean error uses Pearson correlation, the 100 closest items, and Z-score normalization. Already, we can see that the error under the IBCF methodology is noticeably higher than under the UBCF methodology.

##       RMSE      MSE       MAE similarity k-closest items    norm
## 1 1.150618 1.323922 0.8954775    pearson             100 Z-score
##   ERROR_MEAN
## 1   1.123339

Evaluation

For our final evaluation, we compare the user-user and item-item models, each optimized with the parameter combination that minimizes mean error. We plot an ROC curve and a precision-recall curve for each model, and then examine the differences.

User-based collaborative filtering
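A sketch of how the tuned UBCF model can be evaluated on its own; the list of n values is an assumption, and the per-fold timing log below comes from such a call:

```r
# Top-N evaluation of the tuned UBCF model (pearson, 100 neighbours, centering)
ubcf_eval <- evaluate(scheme, method = "UBCF",
                      parameter = list(method = "pearson", nn = 100,
                                       normalize = "center"),
                      type = "topNList", n = c(1, 5, 10, 20))

plot(ubcf_eval, annotate = TRUE)              # ROC curve
plot(ubcf_eval, "prec/rec", annotate = TRUE)  # precision-recall curve
```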

## UBCF run fold/sample [model time/prediction time]
##   1  [0sec/0.17sec] 
##   2  [0.02sec/0.17sec] 
##   3  [0.02sec/0.15sec] 
##   4  [0sec/0.22sec] 
##   5  [0.02sec/0.22sec] 
##   6  [0sec/0.19sec] 
##   7  [0.02sec/0.17sec] 
##   8  [0sec/0.15sec] 
##   9  [0sec/0.17sec] 
##   10  [0.02sec/0.37sec]

Item-based collaborative filtering

## IBCF run fold/sample [model time/prediction time]
##   1  [0.88sec/0.03sec] 
##   2  [0.68sec/0.05sec] 
##   3  [0.65sec/0.04sec] 
##   4  [0.66sec/0.03sec] 
##   5  [0.68sec/0.03sec] 
##   6  [0.68sec/0.03sec] 
##   7  [0.91sec/0.03sec] 
##   8  [0.66sec/0.05sec] 
##   9  [0.58sec/0.01sec] 
##   10  [0.51sec/0.02sec]

Comparison

Finally, we can compare the UBCF and IBCF models to each other.
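A sketch of the head-to-head comparison, passing both tuned configurations to evaluate() at once (the list of n values is an assumption; the per-fold logs below come from such a run):

```r
algorithms <- list(
  "UBCF (pearson, 100 nn, center)"   = list(name = "UBCF",
      param = list(method = "pearson", nn = 100, normalize = "center")),
  "IBCF (pearson, k = 100, Z-score)" = list(name = "IBCF",
      param = list(method = "pearson", k = 100, normalize = "Z-score"))
)

comparison <- evaluate(scheme, algorithms, type = "topNList", n = c(1, 5, 10, 20))

plot(comparison, annotate = TRUE)              # ROC curves, both models
plot(comparison, "prec/rec", annotate = TRUE)  # precision-recall, both models
```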

## UBCF run fold/sample [model time/prediction time]
##   1  [0.02sec/0.12sec] 
##   2  [0sec/0.11sec] 
##   3  [0sec/0.12sec] 
##   4  [0.02sec/0.29sec] 
##   5  [0sec/0.12sec] 
##   6  [0sec/0.14sec] 
##   7  [0sec/0.16sec] 
##   8  [0sec/0.38sec] 
##   9  [0.01sec/0.18sec] 
##   10  [0.02sec/0.17sec] 
## IBCF run fold/sample [model time/prediction time]
##   1  [0.76sec/0.07sec] 
##   2  [0.84sec/0.03sec] 
##   3  [0.65sec/0.05sec] 
##   4  [0.72sec/0.03sec] 
##   5  [0.63sec/0.05sec] 
##   6  [1.05sec/0.03sec] 
##   7  [0.75sec/0.03sec] 
##   8  [0.75sec/0.05sec] 
##   9  [0.72sec/0.05sec] 
##   10  [0.68sec/0.03sec]

Discussion

This analysis suggests that, for our subset of the MovieLense dataset, a user-based collaborative filtering model with optimal model parameters outperforms its item-based collaborative filtering counterpart.

References

Hahsler, M. (2019, August 27). MovieLense Dataset (100k). rdrr.io. https://rdrr.io/cran/recommenderlab/man/MovieLense.html

R-bloggers. (2014, December 24). Recommender systems 101 – a step by step practical example in R. https://www.r-bloggers.com/recommender-systems-101-a-step-by-step-practical-example-in-r/