Libraries

library("recommenderlab")
library("ggplot2")

I will begin by loading the MovieLense data set from the recommenderlab package into the environment. I will also subset the data to users who have rated more than 50 movies and to movies that have been rated more than 100 times.

Data

data("MovieLense")
ratings_movies = MovieLense[rowCounts(MovieLense)>50,colCounts(MovieLense)>100]
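As a quick sanity check, the dimensions of the filtered matrix can be compared with the original (the exact counts depend on the thresholds chosen above):

dim(MovieLense)       # full MovieLense rating matrix (users x movies)
dim(ratings_movies)   # users who rated > 50 movies, movies rated > 100 times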

The distribution of average movie ratings is right-skewed, with a median of

quantile(colMeans(ratings_movies, na.rm = T), .5)
##      50% 
## 3.684294
movie_mean_rating = colMeans(ratings_movies, na.rm = T)

qplot(movie_mean_rating) + ggtitle("Distribution of Movie Ratings")

rm(movie_mean_rating)

The distribution of average user ratings appears roughly normal, with a median of

quantile(rowMeans(ratings_movies, na.rm = T), .5)
##      50% 
## 3.740624
user_mean_rating = rowMeans(ratings_movies, na.rm = T)
qplot(user_mean_rating) + ggtitle("Distribution of User Ratings")

rm(user_mean_rating)

A heat map of the first 100 users and movies reveals sparsity, but also some very active users.

image(ratings_movies[1:100, 1:100], main = "Heatmap of First 100 Users & Movies")
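To put a number on that sparsity, the share of user-movie pairs that actually have a rating can be computed directly from the matrix (a rough check; the exact figure depends on the filtering above):

# Fraction of possible user-movie pairs that have an observed rating
nratings(ratings_movies) / (nrow(ratings_movies) * ncol(ratings_movies))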

To create training and test sets, an evaluation scheme is needed. The evaluationScheme() function takes the rating data and uses 5-fold cross-validation, so each fold trains on 80% of the users and tests on the remaining 20%. For each test user, 15 ratings are given to the recommender, and ratings of 3 or above are treated as good.

set.seed(143)
eval_sets = evaluationScheme(data = ratings_movies, method = "cross-validation", train = .8, given = 15, goodRating = 3, k = 5)

rm(MovieLense,MovieLenseMeta,ratings_movies)
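For reference, the evaluation scheme exposes three slices of the data through getData(), all of which are used by the models below:

getData(eval_sets, "train")    # ratings used to fit each recommender
getData(eval_sets, "known")    # the 15 ratings per test user passed to predict()
getData(eval_sets, "unknown")  # held-back ratings used to score the predictions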

IBCF

IBCF Model 1

The first model is an item-based collaborative filtering (IBCF) model. The ratings are normalized by centering, and cosine similarity is used as the distance measure. The model is evaluated against the unknown data set, and the accuracy measures are stored in a data frame for comparison with the other models. Finally, the top predictions for the first user are printed.

IBCF_cosine_model = Recommender(data = getData(eval_sets, "train"), method = "IBCF", parameter = list(normalize = "center", method = "cosine"))

pred_IBCF_cos = predict(object = IBCF_cosine_model, newdata = getData(eval_sets, "known"), n = 10, type = "ratings")

model_stats = data.frame(IBCF_cosine_model = calcPredictionAccuracy(x = pred_IBCF_cos, data = getData(eval_sets, "unknown")))

pred_IBCF_cos_top = predict(object = IBCF_cosine_model, newdata = getData(eval_sets, "known"), n = 10, type = "topNList")

pred_IBCF_cos_top@itemLabels[pred_IBCF_cos_top@items[[1]]]
##  [1] "Toy Story (1995)"           "GoldenEye (1995)"          
##  [3] "Twelve Monkeys (1995)"      "Babe (1995)"               
##  [5] "Dead Man Walking (1995)"    "Seven (Se7en) (1995)"      
##  [7] "Usual Suspects, The (1995)" "Taxi Driver (1976)"        
##  [9] "Rumble in the Bronx (1995)" "Birdcage, The (1996)"
rm(IBCF_cosine_model,pred_IBCF_cos,pred_IBCF_cos_top)

IBCF Model 2

IBCF model 2 uses Pearson correlation as the similarity measure; otherwise it is identical to IBCF model 1.

IBCF_pearson_model = Recommender(data = getData(eval_sets, "train"), method = "IBCF", parameter = list(normalize = "center", method = "pearson"))

pred_IBCF_pear = predict(object = IBCF_pearson_model, newdata = getData(eval_sets, "known"), n = 10, type = "ratings")

model_stats$IBCF_pearson_model = calcPredictionAccuracy(x = pred_IBCF_pear, data = getData(eval_sets, "unknown"))

pred_IBCF_pear_top = predict(object = IBCF_pearson_model, newdata = getData(eval_sets, "known"), n = 10, type = "topNList")

pred_IBCF_pear_top@itemLabels[pred_IBCF_pear_top@items[[1]]]
##  [1] "GoldenEye (1995)"           "Get Shorty (1995)"         
##  [3] "Babe (1995)"                "Dead Man Walking (1995)"   
##  [5] "Seven (Se7en) (1995)"       "Usual Suspects, The (1995)"
##  [7] "Taxi Driver (1976)"         "Rumble in the Bronx (1995)"
##  [9] "Apollo 13 (1995)"           "Batman Forever (1995)"
rm(IBCF_pearson_model,pred_IBCF_pear,pred_IBCF_pear_top)

UBCF

UBCF Model 1

The user-based collaborative filtering (UBCF) model follows the same methodology as above. Ratings are normalized by centering, and cosine similarity is used as the distance measure.

UBCF_cosine_model = Recommender(data = getData(eval_sets, "train"), method = "UBCF", parameter = list(normalize = "center", method = "cosine"))

pred_UBCF_cos = predict(object = UBCF_cosine_model, newdata = getData(eval_sets, "known"), n = 10, type = "ratings")

model_stats$UBCF_cosine_model = calcPredictionAccuracy(x = pred_UBCF_cos, data = getData(eval_sets, "unknown"))

pred_UBCF_cos_top = predict(object = UBCF_cosine_model, newdata = getData(eval_sets, "known"), n = 10, type = "topNList")

pred_UBCF_cos_top@itemLabels[pred_UBCF_cos_top@items[[1]]]
##  [1] "Silence of the Lambs, The (1991)" "Godfather, The (1972)"           
##  [3] "Fargo (1996)"                     "Shawshank Redemption, The (1994)"
##  [5] "Wizard of Oz, The (1939)"         "Twelve Monkeys (1995)"           
##  [7] "Usual Suspects, The (1995)"       "Fugitive, The (1993)"            
##  [9] "Toy Story (1995)"                 "Star Wars (1977)"
rm(UBCF_cosine_model,pred_UBCF_cos,pred_UBCF_cos_top)

UBCF Model 2

UBCF model 2 uses Z-score normalization and Pearson correlation as the similarity measure.

UBCF_pearson_modelz = Recommender(data = getData(eval_sets, "train"), method = "UBCF", parameter = list(normalize = "Z-score", method = "pearson"))

pred_UBCF_pear = predict(object = UBCF_pearson_modelz, newdata = getData(eval_sets, "known"), n = 10, type = "ratings")

model_stats$UBCF_pearson_modelz = calcPredictionAccuracy(x = pred_UBCF_pear, data = getData(eval_sets, "unknown"))

pred_UBCF_pear_top = predict(object = UBCF_pearson_modelz, newdata = getData(eval_sets, "known"), n = 10, type = "topNList")

pred_UBCF_pear_top@itemLabels[pred_UBCF_pear_top@items[[1]]]
##  [1] "Silence of the Lambs, The (1991)" "Titanic (1997)"                  
##  [3] "Contact (1997)"                   "Fargo (1996)"                    
##  [5] "Princess Bride, The (1987)"       "Fugitive, The (1993)"            
##  [7] "Raiders of the Lost Ark (1981)"   "Godfather, The (1972)"           
##  [9] "Shawshank Redemption, The (1994)" "Usual Suspects, The (1995)"
rm(UBCF_pearson_modelz,pred_UBCF_pear,pred_UBCF_pear_top)

Conclusions: Project 2

After evaluating all four models, the user-based collaborative filtering models performed better than the IBCF models in terms of RMSE. In my next build, I would like to explore techniques for evaluating multiple models using ROC/AUC, as well as implementations of hybrid methods for collaborative filtering (a sketch of one such hybrid follows the table below).

knitr::kable(model_stats,caption="Model Statistics")
Model Statistics
       IBCF_cosine_model  IBCF_pearson_model  UBCF_cosine_model  UBCF_pearson_modelz
RMSE           1.0874451           1.2054091          0.9639279            0.9866335
MSE            1.1825369           1.4530111          0.9291571            0.9734457
MAE            0.7935224           0.8880822          0.7572562            0.7812307
rm(model_stats)
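As a pointer for the hybrid methods mentioned above, recommenderlab provides HybridRecommender(), which combines several fitted recommenders with weights. The sketch below is illustrative only; the choice of component models and the 0.7/0.3 weights are assumptions, not tuned values.

# Sketch of a weighted hybrid: blend a UBCF model with a popularity baseline.
hybrid_model = HybridRecommender(
  Recommender(getData(eval_sets, "train"), method = "UBCF"),
  Recommender(getData(eval_sets, "train"), method = "POPULAR"),
  weights = c(0.7, 0.3)   # arbitrary illustrative weights
)
predict(hybrid_model, getData(eval_sets, "known"), n = 10)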

Project 4

I will start by defining the models I would like to evaluate. Seven models will be evaluated and plotted. After the evaluation runs, the performance metrics will be plotted and compared.

evaluation_models = list(
  IBCF_cos = list(name = "IBCF", param = list(method = "cosine")),
  IBCF_pea = list(name = "IBCF", param = list(method = "pearson")),
  IBCF_cos_ctr = list(name = "IBCF", param = list(normalize = "center",method = "cosine")),
  UBCF_cos = list(name = "UBCF", param = list(method = "cosine")),
  UBCF_pea = list(name = "UBCF", param = list(method = "pearson")),
  UBCF_pea_Z = list(name = "UBCF", param = list(normalize = "Z-score",method = "pearson")),
  random = list(name = "RANDOM", param=NULL)
)

n_recommendations = c(1, 3, 5, 10, 15, 25, 40, 50, 60, 70, 80, 90, 100)

results = evaluate(x = eval_sets,
                    method = evaluation_models,
                    n = n_recommendations)
## IBCF run fold/sample [model time/prediction time]
##   1  [0.4sec/0.04sec] 
##   2  [0.41sec/0.04sec] 
##   3  [0.43sec/0.04sec] 
##   4  [0.43sec/0.03sec] 
##   5  [0.41sec/0.05sec] 
## IBCF run fold/sample [model time/prediction time]
##   1  [0.56sec/0.04sec] 
##   2  [0.47sec/0.04sec] 
##   3  [0.5sec/0.05sec] 
##   4  [0.45sec/0.05sec] 
##   5  [0.43sec/0.08sec] 
## IBCF run fold/sample [model time/prediction time]
##   1  [0.46sec/0.03sec] 
##   2  [0.43sec/0.05sec] 
##   3  [0.41sec/0.05sec] 
##   4  [0.31sec/0.14sec] 
##   5  [0.41sec/0.04sec] 
## UBCF run fold/sample [model time/prediction time]
##   1  [0.02sec/0.19sec] 
##   2  [0sec/0.2sec] 
##   3  [0sec/0.23sec] 
##   4  [0sec/0.23sec] 
##   5  [0sec/0.31sec] 
## UBCF run fold/sample [model time/prediction time]
##   1  [0sec/0.17sec] 
##   2  [0sec/0.19sec] 
##   3  [0sec/0.28sec] 
##   4  [0sec/0.28sec] 
##   5  [0sec/0.17sec] 
## UBCF run fold/sample [model time/prediction time]
##   1  [0.02sec/0.2sec] 
##   2  [0.01sec/0.3sec] 
##   3  [0sec/0.3sec] 
##   4  [0.02sec/0.17sec] 
##   5  [0.01sec/0.19sec] 
## RANDOM run fold/sample [model time/prediction time]
##   1  [0sec/0.17sec] 
##   2  [0sec/0.05sec] 
##   3  [0sec/0.06sec] 
##   4  [0.02sec/0.05sec] 
##   5  [0sec/0.04sec]

After running the recommendation models, it is possible to extract the performance measures, which I have stored in a data frame named eval_metrics. I will derive additional measures from the underlying confusion-matrix data: accuracy, sensitivity, and F1 score. The F1 score summarizes a test's accuracy as the harmonic mean of precision and sensitivity (recall); a score near zero indicates poor precision and/or recall.

\[Accuracy=\frac{TP+TN}{TP+FP+TN+FN}\]

eval_metrics = as.data.frame(getConfusionMatrix(results[["UBCF_cos"]])[[1]][,1:8])

eval_metrics$Accuracy = (eval_metrics$TP+eval_metrics$TN)/(eval_metrics$TP+eval_metrics$FP+eval_metrics$TN+eval_metrics$FN)

\[Sensitivity=\frac{TP}{TP+FN}\]

eval_metrics$Sensitivity = eval_metrics$TP/(eval_metrics$TP+eval_metrics$FN)

\[F_1\ Score=\frac{2 \cdot Precision \cdot Sensitivity}{Precision+Sensitivity}\]

eval_metrics$F1Score = (2*eval_metrics$precision*eval_metrics$Sensitivity)/(eval_metrics$precision+eval_metrics$Sensitivity)


knitr::kable(eval_metrics)
n TP FP FN TN precision recall TPR FPR Accuracy Sensitivity F1Score
1 0.6696429 0.3303571 71.94643 244.0536 0.6696429 0.0113003 0.0113003 0.0012857 0.7719975 0.0092217 0.0181928
3 1.9107143 1.0892857 70.70536 243.2946 0.6369048 0.0312770 0.0312770 0.0042346 0.7735185 0.0263126 0.0505373
5 2.8660714 2.1339286 69.75000 242.2500 0.5732143 0.0467346 0.0467346 0.0084264 0.7732368 0.0394688 0.0738525
10 5.4285714 4.5714286 67.18750 239.8125 0.5428571 0.0850203 0.0850203 0.0179160 0.7736311 0.0747572 0.1314168
15 7.4821429 7.5178571 65.13393 236.8661 0.4988095 0.1155771 0.1155771 0.0296455 0.7708146 0.1030370 0.1707938
25 11.1607143 13.8392857 61.45536 230.5446 0.4464286 0.1674012 0.1674012 0.0548011 0.7624775 0.1536948 0.2286655
40 15.7857143 24.2142857 56.83036 220.1696 0.3946429 0.2340940 0.2340940 0.0966304 0.7443387 0.2173860 0.2803457
50 18.7321429 31.2678571 53.88393 213.1161 0.3746429 0.2728021 0.2728021 0.1249642 0.7313824 0.2579614 0.3055414
60 21.5000000 38.5000000 51.11607 205.8839 0.3583333 0.3122786 0.3122786 0.1543660 0.7172995 0.2960777 0.3242443
70 24.2857143 45.7142857 48.33036 198.6696 0.3469388 0.3496372 0.3496372 0.1835282 0.7033292 0.3344399 0.3405747
80 26.7678571 53.2321429 45.84821 191.1518 0.3345982 0.3836178 0.3836178 0.2141774 0.6874437 0.3686217 0.3507869
90 28.9732143 61.0267857 43.64286 183.3571 0.3219246 0.4126958 0.4126958 0.2459086 0.6698119 0.3989918 0.3563389
100 31.0803571 68.9196429 41.53571 175.4643 0.3108036 0.4381393 0.4381393 0.2779551 0.6515604 0.4280093 0.3601097
plot(results, annotate=1, legend = "bottomright") 
title("ROC curve")

plot(results, "prec/rec", annotate = 1, legend = "topright")
title("Precision Recall")

results_ratings = evaluate(x = eval_sets, 
               method = evaluation_models, 
               type="ratings")
## IBCF run fold/sample [model time/prediction time]
##   1  [0.4sec/0.02sec] 
##   2  [0.4sec/0.01sec] 
##   3  [0.39sec/0.02sec] 
##   4  [0.39sec/0.02sec] 
##   5  [0.39sec/0.02sec] 
## IBCF run fold/sample [model time/prediction time]
##   1  [0.43sec/0.02sec] 
##   2  [0.42sec/0.01sec] 
##   3  [0.43sec/0.01sec] 
##   4  [0.42sec/0.01sec] 
##   5  [0.53sec/0.02sec] 
## IBCF run fold/sample [model time/prediction time]
##   1  [0.38sec/0.02sec] 
##   2  [0.41sec/0.01sec] 
##   3  [0.41sec/0.01sec] 
##   4  [0.42sec/0.01sec] 
##   5  [0.41sec/0.02sec] 
## UBCF run fold/sample [model time/prediction time]
##   1  [0.01sec/0.14sec] 
##   2  [0sec/0.26sec] 
##   3  [0.02sec/0.14sec] 
##   4  [0sec/0.15sec] 
##   5  [0sec/0.15sec] 
## UBCF run fold/sample [model time/prediction time]
##   1  [0sec/0.27sec] 
##   2  [0sec/0.15sec] 
##   3  [0.02sec/0.16sec] 
##   4  [0sec/0.16sec] 
##   5  [0.02sec/0.15sec] 
## UBCF run fold/sample [model time/prediction time]
##   1  [0.02sec/0.28sec] 
##   2  [0.02sec/0.26sec] 
##   3  [0.02sec/0.14sec] 
##   4  [0.02sec/0.14sec] 
##   5  [0.01sec/0.15sec] 
## RANDOM run fold/sample [model time/prediction time]
##   1  [0sec/0.02sec] 
##   2  [0sec/0.13sec] 
##   3  [0sec/0.01sec] 
##   4  [0sec/0.01sec] 
##   5  [0sec/0.02sec]
table =  as.data.frame( t(sapply(avg(results_ratings), rbind)))
colnames(table) = c("RMSE", "MSE", "MAE")
knitr::kable(table)
                  RMSE       MSE       MAE
IBCF_cos     1.1021610 1.2149475 0.8137186
IBCF_pea     1.2102673 1.4649911 0.8996828
IBCF_cos_ctr 1.1021610 1.2149475 0.8137186
UBCF_cos     0.9707491 0.9423672 0.7650323
UBCF_pea     0.9930580 0.9861776 0.7863773
UBCF_pea_Z   0.9931664 0.9863920 0.7865486
random       1.3345242 1.7810893 1.0461225

Summary

Based on the prior exercises, various item-based and user-based recommendation models were built. The new approach allowed many models to be evaluated with a simpler workflow.

The best model based on AUC was UBCF with cosine similarity. The same model also yielded the lowest RMSE, at roughly 0.97. To increase serendipity, I would add a random selection drawn from one of the user's favorite genres (a rough sketch follows below). Analyzing performance with offline data is relatively straightforward, but if the system were online, we could separate a portion of the user base into an A/B test and compare the results. The system's resource requirements should also be assessed before implementation.
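As a rough sketch of the serendipity idea, a random title could be drawn from one of the user's favorite genres using the MovieLenseMeta table that ships with the data set. The genre below is hard-coded purely for illustration; a real implementation would derive it from the user's rating history.

# Sketch only: "Comedy" is an assumed favorite genre, not derived from the data.
data("MovieLense")                    # reloads MovieLense and MovieLenseMeta
favorite_genre = "Comedy"
genre_titles = MovieLenseMeta$title[MovieLenseMeta[[favorite_genre]] == 1]
sample(genre_titles, 1)               # one serendipitous recommendation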