Instruction

In this assignment, we will work in a small group and choose a dataset different from the ones used in our previous projects. We will compare the accuracy of at least two recommender system algorithms against our offline data, implement support for at least one business or user experience goal by increasing diversity, serendipity, or novelty, and then compare and report the change in accuracy before and after the increase.

Introduction

In this project, we will use a new dataset to implement at least two recommender system algorithms, such as the User-Based Collaborative Filtering (UBCF) model, the Item-Based Collaborative Filtering (IBCF) model, the Singular Value Decomposition (SVD) model, and a random model. We will then support a user experience goal by increasing serendipity, and compare the model accuracies before and after the increase. The dataset we will use for this project is MovieLense, which we introduce in detail below.

Load Packages

library(recommenderlab)
library(ggplot2)
library(tidyverse)
library(dplyr)
library(kableExtra)
library(gridExtra)
library(rmdformats)
library(formattable)
library(scales)

Read Data

set.seed(3)
data(MovieLense)
y <- as.matrix(MovieLense@data[1:10, 1:100])
y  %>% kable(caption = "Showing Part of the Dataset") %>% kable_styling("striped", full_width = TRUE)
Showing Part of the Dataset

(Rendered table: the ratings given by the first 10 users to the first 100 movies, from Toy Story (1995) through Fargo (1996); a value of 0 means the user has not rated that movie.)

Data Exploration

The dataset we will use for this project is MovieLense. It contains the ratings that users gave to movies, collected through the MovieLens website (movielens.umn.edu) from 943 users on 1664 movies during the seven-month period from September 19th, 1997 through April 22nd, 1998.

It is a 943 x 1664 rating matrix of class 'realRatingMatrix' with 99,392 ratings. Each row of MovieLense corresponds to a user, and each column corresponds to a movie. There are 943 x 1664 = 1,569,152 possible user-movie combinations, so storing the complete matrix would require more than 1.5 million cells. However, not every user has watched every movie: there are fewer than 100,000 actual ratings, so the matrix is sparse.
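
As a quick check of that sparsity (a minimal sketch, assuming recommenderlab and the MovieLense data are loaded as above):

# Fraction of possible user-movie cells that actually hold a rating
nratings(MovieLense) / (nrow(MovieLense) * ncol(MovieLense))  # about 0.063, i.e. roughly 6% of cells are rated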

It is also included in our textbook, Building a Recommendation System with R, by Suresh K. Gorakala and Michele Usuelli.

dim(MovieLense)
## [1]  943 1664
table(MovieLense@data %>% as.vector()) %>%
  data.frame() %>%
  mutate(Pcnt = percent(Freq/sum(Freq))) %>%
  rename(Rating = Var1) %>%
  kable() %>%
  kable_styling(bootstrap_options = c('striped', 'bordered'), full_width = FALSE) %>%
  add_header_above(c('Rating Frequency' = 3))
Rating Frequency
Rating Freq Pcnt
0 1469760 93.67%
1 6059 0.39%
2 11307 0.72%
3 27002 1.72%
4 33947 2.16%
5 21077 1.34%

Data Preparation

As not every user has watched every movie, the dataset is large and sparse, so we may not want to use the whole dataset to build our models. Let's first take a look at the number of ratings per user.

MovieLense %>%
rowCounts() %>%
  data.frame() %>%
  rename(Rating_Per_User = '.') %>%
    ggplot(aes(x=Rating_Per_User)) +
    geom_histogram(color = 'grey', fill = 'deeppink4') +
    ggtitle('Number of Rating Per User')

To build our recommendation models, we will select the most relevant data: users who have rated more than 50 movies and movies that have been rated more than 100 times.

The plot shown below describes the distribution of the average rating per user (Row_Mean) from our filtered dataset.

rating_movie <- MovieLense[rowCounts(MovieLense) > 50,
                             colCounts(MovieLense) > 100]
rating_movie
## 560 x 332 rating matrix of class 'realRatingMatrix' with 55298 ratings.
rating_movie %>%
  rowMeans() %>%
  data.frame() %>%
  rename(Row_Mean = '.') %>%
  ggplot(aes(x = Row_Mean)) +
  geom_histogram(color = 'grey', fill = 'deeppink4') +
  ggtitle("Distribution of the average rating per user")

Building Recommendation Models

We will split the dataset into a training set (80%) and a testing set (20%), then apply the User-Based Collaborative Filtering (UBCF) model, Item-Based Collaborative Filtering (IBCF) model, Singular Value Decomposition (SVD) model, and Random model with different normalization techniques and similarity measures. In the evaluation scheme below, each test user has 15 ratings given to the recommender (given = 15), and ratings of 3 or higher are treated as good ratings (goodRating = 3).

set.seed(3)
eval_sets <- evaluationScheme(data = rating_movie, method = "split", train = 0.8, given = 15, goodRating = 3, k = 1)
eval_sets
## Evaluation scheme with 15 items given
## Method: 'split' with 1 run(s).
## Training set proportion: 0.800
## Good ratings: >=3.000000
## Data set: 560 x 332 rating matrix of class 'realRatingMatrix' with 55298 ratings.
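
For reference, the scheme exposes the three pieces of data used throughout the rest of this report: the training users, the "known" ratings shown to the recommender for each test user, and the "unknown" ratings held back to measure accuracy. A quick look at their dimensions (a small sketch, assuming eval_sets as defined above):

dim(getData(eval_sets, "train"))    # the 80% of the 560 users used for training
dim(getData(eval_sets, "known"))    # test users, containing only the 15 given ratings per user
dim(getData(eval_sets, "unknown"))  # test users, containing the remaining held-out ratings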

UBCF Models

We will evaluate three User-Based Collaborative Filtering (UBCF) models using the recommenderlab package, all with mean-centering normalization and each with one of three similarity measures (Pearson correlation, Euclidean distance, and cosine distance).

ubcf_models <- list(
  ubcf_prs_center = list(name = "UBCF", param = list(method = "pearson", normalize = "center")),
  ubcf_euc_center = list(name = "UBCF", param = list(method = "euclidean", normalize = "center")),
  ubcf_cos_center = list(name = "UBCF", param = list(method = "cosine", normalize = "center"))
)

ubcf_eval_results <- evaluate(eval_sets, 
                              method = ubcf_models, 
                              n = seq(10, 100, 10)
                              )
## UBCF run fold/sample [model time/prediction time]
##   1  [0.03sec/0.54sec] 
## UBCF run fold/sample [model time/prediction time]
##   1  [0.02sec/0.48sec] 
## UBCF run fold/sample [model time/prediction time]
##   1  [0.02sec/0.32sec]

The results of the three UBCF models are plotted below as ROC and precision-recall curves.

plot(ubcf_eval_results, annotate = TRUE, legend="topleft")
title("UBCF_ROC Curve")

plot(ubcf_eval_results, "prec/rec", annotate = TRUE, legend="bottomleft")
title("UBCF_Precision-Recall")

IBCF Models

We will evaluate three Item-Based Collaborative Filtering (IBCF) models using the recommenderlab package, all with mean-centering normalization and each with one of three similarity measures (Pearson correlation, Euclidean distance, and cosine distance).

ibcf_models <- list(
  ibcf_prs_center = list(name = "IBCF", param = list(method = "pearson", normalize = "center")),
  ibcf_euc_center = list(name = "IBCF", param = list(method = "euclidean", normalize = "center")),
  ibcf_cos_center = list(name = "IBCF", param = list(method = "cosine", normalize = "center"))
)
ibcf_eval_results <- evaluate(eval_sets, 
                              method = ibcf_models, 
                              n = seq(10, 100, 10)
                              )
## IBCF run fold/sample [model time/prediction time]
##   1  [0.7sec/0.09sec] 
## IBCF run fold/sample [model time/prediction time]
##   1  [0.54sec/0.08sec] 
## IBCF run fold/sample [model time/prediction time]
##   1  [0.55sec/0.08sec]

The results of the three IBCF models are plotted below as ROC and precision-recall curves.

plot(ibcf_eval_results, annotate = TRUE, legend="topleft")
title("IBCF_ROC Curve")

plot(ibcf_eval_results, "prec/rec", annotate = TRUE, legend="bottomleft")
title("IBCF_Precision-Recall")

SVD Models

We will evaluate two Singular Value Decomposition (SVD) models using the recommenderlab package, one with mean-centering normalization and one with Z-score normalization.

svd_models <- list(
  svd_center = list(name = "SVD", param = list(normalize = "center")),
  svd_z = list(name = "SVD", param = list(normalize = "Z-score"))
)
svd_eval_results <- evaluate(x = eval_sets, 
                                 method = svd_models, 
                                 n = seq(10, 100, 10))
## SVD run fold/sample [model time/prediction time]
##   1  [0.06sec/0.08sec] 
## SVD run fold/sample [model time/prediction time]
##   1  [0.1sec/0.07sec]

The results of the two SVD models are plotted below as ROC and precision-recall curves.

plot(svd_eval_results, annotate = TRUE, legend="topleft")
title("SVD_ROC Curve")

plot(svd_eval_results, "prec/rec", annotate = TRUE, legend="bottomleft")
title("SVD_Precision-Recall")

Random Models

We will evaluate three baseline models using the RANDOM algorithm from the recommenderlab package with mean-centering normalization and the same three similarity measures (Pearson correlation, Euclidean distance, and cosine distance) as above.

random_models <- list(
  random_prs_center = list(name = "RANDOM", param = list(method = "pearson", normalize = "center")),
  random_euc_center = list(name = "RANDOM", param = list(method = "Euclidean", normalize = "center")),
  random_cos_center = list(name = "RANDOM", param = list(method = "Cosine", normalize = "center"))
)
random_eval_results <- evaluate(x = eval_sets, 
                              method = random_models, 
                              n = seq(10, 100, 10)
                              )
## RANDOM run fold/sample [model time/prediction time]
##   1  [0sec/0.09sec] 
## RANDOM run fold/sample [model time/prediction time]
##   1  [0sec/0.09sec] 
## RANDOM run fold/sample [model time/prediction time]
##   1  [0sec/0.06sec]

The results of the three random models are plotted below as ROC and precision-recall curves.

plot(random_eval_results, annotate = TRUE, legend="topleft")
title("Random ROC curve")

plot(random_eval_results, "prec/rec", annotate = TRUE, legend="bottomleft")
title("Random Precision-recall")

Increasing Serendipity

To add serendipity to our recommendations, we will add an element of chance to the model. We will create hybrid recommendation models that combine the best algorithm from each of the previous steps with the RANDOM algorithm, using a 20/80 weighting of the original algorithm and the random algorithm.

train <- getData(eval_sets, 'train')
known <- getData(eval_sets, 'known')
unknown <- getData(eval_sets, 'unknown')

UBCF vs HYBRID

The ubcf_prs_center model was evaluated as the best among the UBCF models above, so it is used for comparison with the hybrid model.
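
One way to sanity-check that choice (a small sketch, assuming the ubcf_eval_results object produced above) is to inspect the averaged confusion-matrix statistics, including precision and recall at each top-N cutoff, for each variant; the same check applies to the IBCF and SVD results below.

# Averaged TP/FP/FN/TN, precision and recall at each top-N cutoff for each UBCF variant
avg(ubcf_eval_results[["ubcf_prs_center"]])
avg(ubcf_eval_results[["ubcf_euc_center"]])
avg(ubcf_eval_results[["ubcf_cos_center"]])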

# UBCF Model
UBCF_train <- Recommender(getData(eval_sets, "train"), "UBCF", parameter = list(method = "pearson", normalize = "center"))

# Random Model
RANDOM_train <- Recommender(getData(eval_sets, "train"), "RANDOM")

# Hybrid Model
Hybrid_1_train <- HybridRecommender(
    UBCF_train,
    RANDOM_train,
    weights = c(0.2, 0.8)
)

# Accuracy Metrics of UBCF Model
UBCF_pred <- predict(UBCF_train,getData(eval_sets,'known'), type = 'ratings')
UBCF_error <- calcPredictionAccuracy(UBCF_pred, getData(eval_sets, "unknown"))

# Accuracy Metrics of Random Model
RANDOM_pred <- predict(RANDOM_train,getData(eval_sets,'known'), type = 'ratings')
RANDOM_error <- calcPredictionAccuracy(RANDOM_pred, getData(eval_sets, "unknown"))

# Accuracy Metrics of Hybrid Model
Hybrid_1_pred <- predict(Hybrid_1_train,getData(eval_sets,'known'), type = 'ratings')
Hybrid_1_error <- calcPredictionAccuracy(Hybrid_1_pred, getData(eval_sets, "unknown"))

rbind(UBCF_error, RANDOM_error, Hybrid_1_error) %>%
  kable() %>%
  kable_styling(bootstrap_options = c('striped','bordered'), full_width = FALSE) %>%
  add_header_above(c('UBCF vs HYBRID'=4))
UBCF vs HYBRID
Model RMSE MSE MAE
UBCF_error 1.224255 1.498801 0.9178247
RANDOM_error 1.354651 1.835081 1.0610072
Hybrid_1_error 1.301450 1.693771 1.0344655

IBCF vs HYBRID

The ibcf_euc_center model was evaluated as the best among the IBCF models above, so it is used for comparison with the hybrid model.

#IBCF Model
IBCF_train <- Recommender(getData(eval_sets, "train"), "IBCF", parameter = list(method = "Euclidean", normalize = "center"))

#Random Model
RANDOM_train <- Recommender(getData(eval_sets, "train"), "RANDOM")

#Hybrid Model
Hybrid_2_train <- HybridRecommender(
    IBCF_train,
    RANDOM_train,
    weights = c(0.2, 0.8)
)

# Accuracy Metrics of IBCF Model
IBCF_pred <- predict(IBCF_train,getData(eval_sets,'known'), type = 'ratings')
IBCF_error <- calcPredictionAccuracy(IBCF_pred, getData(eval_sets, "unknown"))

# Accuracy Metrics of Random Model
RANDOM_pred <- predict(RANDOM_train,getData(eval_sets,'known'), type = 'ratings')
RANDOM_error <- calcPredictionAccuracy(RANDOM_pred, getData(eval_sets, "unknown"))

# Accuracy Metrics of Hybrid Model
Hybrid_2_pred <- predict(Hybrid_2_train,getData(eval_sets,'known'), type = 'ratings')
Hybrid_2_error <- calcPredictionAccuracy(Hybrid_2_pred, getData(eval_sets, "unknown"))

rbind(IBCF_error, RANDOM_error, Hybrid_2_error) %>%
  kable() %>%
  kable_styling(bootstrap_options = c('striped','bordered'), full_width = FALSE) %>%
  add_header_above(c('IBCF vs HYBRID'=4))
IBCF vs HYBRID
Model RMSE MSE MAE
IBCF_error 1.134691 1.287524 0.8317443
RANDOM_error 1.366850 1.868280 1.0689180
Hybrid_2_error 1.280364 1.639333 1.0120783

SVD vs HYBRID

The svd_center model was evaluated as the best among the SVD models above, so it is used for comparison with the hybrid model.

#SVD Model
SVD_train <- Recommender(getData(eval_sets, "train"), "SVD", parameter = list(normalize = "center"))

#Random Model
RANDOM_train <- Recommender(getData(eval_sets, "train"), "RANDOM")

#Hybrid Model
Hybrid_3_train <- HybridRecommender(
    SVD_train,
    RANDOM_train,
    weights = c(0.2, 0.8)
)

# Accuracy Metrics of SVD Model
SVD_pred <- predict(SVD_train,getData(eval_sets,'known'), type = 'ratings')
SVD_error <- calcPredictionAccuracy(SVD_pred, getData(eval_sets, "unknown"))

# Accuracy Metrics of Random Model
RANDOM_pred <- predict(RANDOM_train,getData(eval_sets,'known'), type = 'ratings')
RANDOM_error <- calcPredictionAccuracy(RANDOM_pred, getData(eval_sets, "unknown"))

# Accuracy Metrics of Hybrid Model
Hybrid_3_pred <- predict(Hybrid_3_train,getData(eval_sets,'known'), type = 'ratings')
Hybrid_3_error <- calcPredictionAccuracy(Hybrid_3_pred, getData(eval_sets, "unknown"))

rbind(SVD_error, RANDOM_error, Hybrid_3_error) %>%
  kable() %>%
  kable_styling(bootstrap_options = c('striped','bordered'), full_width = FALSE) %>%
  add_header_above(c('SVD vs HYBRID'=4))
SVD vs HYBRID
Model RMSE MSE MAE
SVD_error 1.013785 1.027761 0.8053420
RANDOM_error 1.353671 1.832426 1.0622231
Hybrid_3_error 1.224399 1.499152 0.9737575

Conclusion

From the three tables of performance metrics comparing the UBCF, IBCF, and SVD models with their hybrid counterparts, we can see that the prediction accuracy of each hybrid recommendation model falls between that of its original model and the random model. With the 20/80 weighting of the original algorithm and the random algorithm, the RMSE of each hybrid model is somewhat worse than that of its original model, since all the random models have higher RMSE than the original models.
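
If we wanted finer control over how much serendipity we trade for accuracy, we could sweep the weight given to the original algorithm and watch how the RMSE responds (a minimal sketch, assuming the UBCF_train and RANDOM_train recommenders and the eval_sets scheme defined above; the weight grid is illustrative):

# Sweep the share of the hybrid given to the original UBCF component
weights_ubcf <- c(0.8, 0.5, 0.2)
rmse_by_weight <- sapply(weights_ubcf, function(w) {
  hybrid <- HybridRecommender(UBCF_train, RANDOM_train, weights = c(w, 1 - w))
  pred <- predict(hybrid, getData(eval_sets, "known"), type = "ratings")
  calcPredictionAccuracy(pred, getData(eval_sets, "unknown"))["RMSE"]
})
data.frame(UBCF_Weight = weights_ubcf, RMSE = rmse_by_weight)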

Recommender systems can be evaluated offline and online; the main difference lies in how accuracy is tested. Offline evaluation is what we did with the MovieLense dataset above: we test the effectiveness of the system by calculating RMSE and other metrics, and each recommendation is checked against the "unknown" part of the test dataset. Once our recommender system goes online, the "unknown" part is instead a newly registered or live user who is looking for recommendations.

For an online system, we would evaluate users' click-through rate and conversion rate. Click-through rate is the percentage of people who clicked on a recommended item, while conversion rate is the percentage of visitors, out of the total number of visitors, who completed an action on the item. We could also add a user-preference survey to the user profile section, letting users record their preferences for items, such as movie genres, actors/actresses, and locations. Adding these two metrics and the user-preference design to an online recommender system would provide more user-item information for the analysis, and could therefore further improve model accuracy and produce more interesting recommendations.
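
As a small illustration of how those two online metrics would be computed (a minimal sketch; the counts below are purely hypothetical):

# Hypothetical counts from an online experiment (illustrative values only)
impressions <- 20000   # times a recommended movie was shown
clicks      <- 1500    # times a shown recommendation was clicked
visitors    <- 5000    # total visitors to the site
conversions <- 300     # visitors who completed an action on a recommended item

click_through_rate <- clicks / impressions    # share of shown recommendations that were clicked (0.075)
conversion_rate    <- conversions / visitors  # share of visitors who completed an action (0.06)
percent(c(click_through_rate, conversion_rate))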