library("recommenderlab")
library("ggplot2")
I will begin by loading the MovieLense data set from the recommenderlab package into the environment. I will also subset the data, keeping users who have rated more than 50 movies and movies that have been rated more than 100 times.
data("MovieLense")
ratings_movies = MovieLense[rowCounts(MovieLense) > 50, colCounts(MovieLense) > 100]
The distribution of mean movie ratings is right-skewed, with a median of
quantile(colMeans(ratings_movies, na.rm = T), .5)
## 50%
## 3.684294
movie_mean_rating = colMeans(ratings_movies, na.rm = T)
qplot(movie_mean_rating) + ggtitle("Distribution of Mean Movie Ratings")
rm(movie_mean_rating)
The distribution of mean user ratings appears roughly normal, with a median of
quantile(rowMeans(ratings_movies, na.rm = T), .5)
## 50%
## 3.740624
user_mean_rating = rowMeans(ratings_movies, na.rm = T)
qplot(user_mean_rating) + ggtitle("Distribution of Mean User Ratings")
rm(user_mean_rating)
A heatmap of the first 100 users and movies reveals the sparsity of the ratings, as well as a few very active users.
image(ratings_movies[1:100, 1:100], main = "Heatmap of the First 100 Users & Movies")
To create training and test sets, an evaluation scheme is needed. The evaluationScheme() function takes the rating data and applies 5-fold cross-validation, so each fold uses roughly 80% of the users for training and 20% for testing; 15 items are given per test user, and a rating of 3 or higher counts as a good rating.
set.seed(143)
eval_sets = evaluationScheme(data = ratings_movies, method = "cross-validation",
                             train = .8, given = 15, goodRating = 3, k = 5)
rm(MovieLense,MovieLenseMeta,ratings_movies)
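Before fitting the first model, here is a quick check of what center normalization does: each user's mean rating is subtracted from that user's ratings, so a value of zero corresponds to the user's personal average. getRatings() and normalize() are recommenderlab functions; showing only the first five ratings is an arbitrary choice.
# Center normalization subtracts each user's mean rating from their ratings.
train_raw = getData(eval_sets, "train")
head(getRatings(train_raw), 5)                                # raw ratings
head(getRatings(normalize(train_raw, method = "center")), 5)  # centered ratings
rm(train_raw)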
The first model is an item-based collaborative filtering (IBCF) model. The data is normalized by centering, and cosine similarity is used as the distance measure. The model is evaluated against the unknown ratings, and the accuracy measures are stored in a data frame for comparison with the other models. Finally, the top-10 recommendations for the first test user are printed.
IBCF_cosine_model = Recommender(data = getData(eval_sets, "train"), method = "IBCF",
                                parameter = list(normalize = "center", method = "cosine"))
pred_IBCF_cos = predict(object = IBCF_cosine_model, newdata = getData(eval_sets, "known"),
                        n = 10, type = "ratings")
model_stats = data.frame(
  IBCF_cosine_model = calcPredictionAccuracy(x = pred_IBCF_cos, data = getData(eval_sets, "unknown")))
pred_IBCF_cos_top = predict(object = IBCF_cosine_model, newdata = getData(eval_sets, "known"),
                            n = 10, type = "topNList")
pred_IBCF_cos_top@itemLabels[pred_IBCF_cos_top@items[[1]]]
## [1] "Toy Story (1995)" "GoldenEye (1995)"
## [3] "Twelve Monkeys (1995)" "Babe (1995)"
## [5] "Dead Man Walking (1995)" "Seven (Se7en) (1995)"
## [7] "Usual Suspects, The (1995)" "Taxi Driver (1976)"
## [9] "Rumble in the Bronx (1995)" "Birdcage, The (1996)"
rm(IBCF_cosine_model,pred_IBCF_cos,pred_IBCF_cos_top)
IBCF model 2 uses the Pearson correlation as its similarity measure; otherwise it is identical to IBCF model 1.
IBCF_pearson_model = Recommender(data = getData(eval_sets, "train"), method = "IBCF",
                                 parameter = list(normalize = "center", method = "pearson"))
pred_IBCF_pear = predict(object = IBCF_pearson_model, newdata = getData(eval_sets, "known"),
                         n = 10, type = "ratings")
model_stats$IBCF_pearson_model = cbind(calcPredictionAccuracy(x = pred_IBCF_pear, data = getData(eval_sets, "unknown")))
pred_IBCF_pear_top = predict(object = IBCF_pearson_model, newdata = getData(eval_sets, "known"),
                             n = 10, type = "topNList")
pred_IBCF_pear_top@itemLabels[pred_IBCF_pear_top@items[[1]]]
## [1] "GoldenEye (1995)" "Get Shorty (1995)"
## [3] "Babe (1995)" "Dead Man Walking (1995)"
## [5] "Seven (Se7en) (1995)" "Usual Suspects, The (1995)"
## [7] "Taxi Driver (1976)" "Rumble in the Bronx (1995)"
## [9] "Apollo 13 (1995)" "Batman Forever (1995)"
rm(IBCF_pearson_model,pred_IBCF_pear,pred_IBCF_pear_top)
The user-based collaborative filtering (UBCF) model follows the same methodology as before: ratings are normalized by centering, and cosine similarity is used as the distance measure.
UBCF_cosine_model = Recommender(data = getData(eval_sets, "train"), method = "UBCF",
                                parameter = list(normalize = "center", method = "cosine"))
pred_UBCF_cos = predict(object = UBCF_cosine_model, newdata = getData(eval_sets, "known"),
                        n = 10, type = "ratings")
model_stats$UBCF_cosine_model = cbind(calcPredictionAccuracy(x = pred_UBCF_cos, data = getData(eval_sets, "unknown")))
pred_UBCF_cos_top = predict(object = UBCF_cosine_model, newdata = getData(eval_sets, "known"),
                            n = 10, type = "topNList")
pred_UBCF_cos_top@itemLabels[pred_UBCF_cos_top@items[[1]]]
## [1] "Silence of the Lambs, The (1991)" "Godfather, The (1972)"
## [3] "Fargo (1996)" "Shawshank Redemption, The (1994)"
## [5] "Wizard of Oz, The (1939)" "Twelve Monkeys (1995)"
## [7] "Usual Suspects, The (1995)" "Fugitive, The (1993)"
## [9] "Toy Story (1995)" "Star Wars (1977)"
rm(UBCF_cosine_model,pred_UBCF_cos,pred_UBCF_cos_top)
UBCF model 2 uses Z-score normalization and the Pearson correlation as its similarity measure.
UBCF_pearson_modelz = Recommender(data = getData(eval_sets, "train"), method = "UBCF",
                                  parameter = list(normalize = "Z-score", method = "pearson"))
pred_UBCF_pear = predict(object = UBCF_pearson_modelz, newdata = getData(eval_sets, "known"),
                         n = 10, type = "ratings")
model_stats$UBCF_pearson_modelz = cbind(calcPredictionAccuracy(x = pred_UBCF_pear, data = getData(eval_sets, "unknown")))
pred_UBCF_pear_top = predict(object = UBCF_pearson_modelz, newdata = getData(eval_sets, "known"),
                             n = 10, type = "topNList")
pred_UBCF_pear_top@itemLabels[pred_UBCF_pear_top@items[[1]]]
## [1] "Silence of the Lambs, The (1991)" "Titanic (1997)"
## [3] "Contact (1997)" "Fargo (1996)"
## [5] "Princess Bride, The (1987)" "Fugitive, The (1993)"
## [7] "Raiders of the Lost Ark (1981)" "Godfather, The (1972)"
## [9] "Shawshank Redemption, The (1994)" "Usual Suspects, The (1995)"
rm(UBCF_pearson_modelz,pred_UBCF_pear,pred_UBCF_pear_top)
After evaluating all four models, the user-based collaborative filtering models outperformed the IBCF models on RMSE. In my next build, I would like to explore techniques for evaluating many models at once with ROC curves and AUC, as well as hybrid collaborative filtering methods (a small sketch of the latter follows the comparison table below).
knitr::kable(model_stats,caption="Model Statistics")
|      | IBCF_cosine_model | IBCF_pearson_model | UBCF_cosine_model | UBCF_pearson_modelz |
|------|-------------------|--------------------|-------------------|---------------------|
| RMSE | 1.0874451         | 1.2054091          | 0.9639279         | 0.9866335           |
| MSE  | 1.1825369         | 1.4530111          | 0.9291571         | 0.9734457           |
| MAE  | 0.7935224         | 0.8880822          | 0.7572562         | 0.7812307           |
rm(model_stats)
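As a first look at the hybrid idea mentioned above, recommenderlab provides HybridRecommender(), which blends the predictions of several recommenders using weights. The sketch below combines a UBCF model with a popularity baseline; the choice of component models and the 0.7/0.3 weights are illustrative assumptions, not tuned values.
# Sketch of a weighted hybrid: UBCF blended with a popularity baseline.
hybrid_model = HybridRecommender(
  Recommender(getData(eval_sets, "train"), method = "UBCF"),
  Recommender(getData(eval_sets, "train"), method = "POPULAR"),
  weights = c(0.7, 0.3))  # illustrative weights, not tuned
pred_hybrid = predict(hybrid_model, newdata = getData(eval_sets, "known"), n = 10, type = "topNList")
pred_hybrid@itemLabels[pred_hybrid@items[[1]]]
rm(hybrid_model, pred_hybrid)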
I will start by defining the list of models I would like to evaluate. Seven models will be evaluated against the evaluation scheme, and their performance metrics will then be plotted and compared.
evaluation_models = list(
IBCF_cos = list(name = "IBCF", param = list(method = "cosine")),
IBCF_pea = list(name = "IBCF", param = list(method = "pearson")),
IBCF_cos_ctr = list(name = "IBCF", param = list(normalize = "center",method = "cosine")),
UBCF_cos = list(name = "UBCF", param = list(method = "cosine")),
UBCF_pea = list(name = "UBCF", param = list(method = "pearson")),
UBCF_pea_Z = list(name = "UBCF", param = list(normalize = "Z-score",method = "pearson")),
random = list(name = "RANDOM", param=NULL)
)
n_recommendations = c(1, 3, 5, 10, 15, 25, 40, 50, 60, 70, 80, 90, 100)
results = evaluate(x = eval_sets,
method = evaluation_models,
n = n_recommendations)
## IBCF run fold/sample [model time/prediction time]
## 1 [0.4sec/0.04sec]
## 2 [0.41sec/0.04sec]
## 3 [0.43sec/0.04sec]
## 4 [0.43sec/0.03sec]
## 5 [0.41sec/0.05sec]
## IBCF run fold/sample [model time/prediction time]
## 1 [0.56sec/0.04sec]
## 2 [0.47sec/0.04sec]
## 3 [0.5sec/0.05sec]
## 4 [0.45sec/0.05sec]
## 5 [0.43sec/0.08sec]
## IBCF run fold/sample [model time/prediction time]
## 1 [0.46sec/0.03sec]
## 2 [0.43sec/0.05sec]
## 3 [0.41sec/0.05sec]
## 4 [0.31sec/0.14sec]
## 5 [0.41sec/0.04sec]
## UBCF run fold/sample [model time/prediction time]
## 1 [0.02sec/0.19sec]
## 2 [0sec/0.2sec]
## 3 [0sec/0.23sec]
## 4 [0sec/0.23sec]
## 5 [0sec/0.31sec]
## UBCF run fold/sample [model time/prediction time]
## 1 [0sec/0.17sec]
## 2 [0sec/0.19sec]
## 3 [0sec/0.28sec]
## 4 [0sec/0.28sec]
## 5 [0sec/0.17sec]
## UBCF run fold/sample [model time/prediction time]
## 1 [0.02sec/0.2sec]
## 2 [0.01sec/0.3sec]
## 3 [0sec/0.3sec]
## 4 [0.02sec/0.17sec]
## 5 [0.01sec/0.19sec]
## RANDOM run fold/sample [model time/prediction time]
## 1 [0sec/0.17sec]
## 2 [0sec/0.05sec]
## 3 [0sec/0.06sec]
## 4 [0.02sec/0.05sec]
## 5 [0sec/0.04sec]
After running the recommendation models, it is possible to extract the confusion matrix measures, which I store in eval_metrics. From the underlying counts I will derive additional measures: accuracy, sensitivity, and the F1 score. The F1 score summarizes a test's accuracy by combining precision and recall; an F1 score near zero indicates poor precision and/or recall.
\[Accuracy=\frac{TP+TN}{TP+FP+TN+FN}\]
eval_metrics = as.data.frame(getConfusionMatrix(results[["UBCF_cos"]])[[1]][,1:8])
eval_metrics$Accuracy = (eval_metrics$TP+eval_metrics$TN)/(eval_metrics$TP+eval_metrics$FP+eval_metrics$TN+eval_metrics$FN)
\[Sensitivity=\frac{TP}{TP+FN}\]
eval_metrics$Sensitivity = eval_metrics$TP/(eval_metrics$TP+eval_metrics$FN)
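For reference, the precision already reported by recommenderlab in the confusion matrix, and used in the F1 score below, is defined as
\[Precision=\frac{TP}{TP+FP}\]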
\[F_1\ Score=\frac{2 \cdot Precision \cdot Sensitivity}{Precision+Sensitivity}\]
eval_metrics$F1Score = (2*eval_metrics$precision*eval_metrics$Sensitivity)/(eval_metrics$precision+eval_metrics$Sensitivity)
knitr::kable(eval_metrics)
| n   | TP         | FP         | FN       | TN       | precision | recall    | TPR       | FPR       | Accuracy  | Sensitivity | F1Score   |
|-----|------------|------------|----------|----------|-----------|-----------|-----------|-----------|-----------|-------------|-----------|
| 1   | 0.6696429  | 0.3303571  | 71.94643 | 244.0536 | 0.6696429 | 0.0113003 | 0.0113003 | 0.0012857 | 0.7719975 | 0.0092217   | 0.0181928 |
| 3   | 1.9107143  | 1.0892857  | 70.70536 | 243.2946 | 0.6369048 | 0.0312770 | 0.0312770 | 0.0042346 | 0.7735185 | 0.0263126   | 0.0505373 |
| 5   | 2.8660714  | 2.1339286  | 69.75000 | 242.2500 | 0.5732143 | 0.0467346 | 0.0467346 | 0.0084264 | 0.7732368 | 0.0394688   | 0.0738525 |
| 10  | 5.4285714  | 4.5714286  | 67.18750 | 239.8125 | 0.5428571 | 0.0850203 | 0.0850203 | 0.0179160 | 0.7736311 | 0.0747572   | 0.1314168 |
| 15  | 7.4821429  | 7.5178571  | 65.13393 | 236.8661 | 0.4988095 | 0.1155771 | 0.1155771 | 0.0296455 | 0.7708146 | 0.1030370   | 0.1707938 |
| 25  | 11.1607143 | 13.8392857 | 61.45536 | 230.5446 | 0.4464286 | 0.1674012 | 0.1674012 | 0.0548011 | 0.7624775 | 0.1536948   | 0.2286655 |
| 40  | 15.7857143 | 24.2142857 | 56.83036 | 220.1696 | 0.3946429 | 0.2340940 | 0.2340940 | 0.0966304 | 0.7443387 | 0.2173860   | 0.2803457 |
| 50  | 18.7321429 | 31.2678571 | 53.88393 | 213.1161 | 0.3746429 | 0.2728021 | 0.2728021 | 0.1249642 | 0.7313824 | 0.2579614   | 0.3055414 |
| 60  | 21.5000000 | 38.5000000 | 51.11607 | 205.8839 | 0.3583333 | 0.3122786 | 0.3122786 | 0.1543660 | 0.7172995 | 0.2960777   | 0.3242443 |
| 70  | 24.2857143 | 45.7142857 | 48.33036 | 198.6696 | 0.3469388 | 0.3496372 | 0.3496372 | 0.1835282 | 0.7033292 | 0.3344399   | 0.3405747 |
| 80  | 26.7678571 | 53.2321429 | 45.84821 | 191.1518 | 0.3345982 | 0.3836178 | 0.3836178 | 0.2141774 | 0.6874437 | 0.3686217   | 0.3507869 |
| 90  | 28.9732143 | 61.0267857 | 43.64286 | 183.3571 | 0.3219246 | 0.4126958 | 0.4126958 | 0.2459086 | 0.6698119 | 0.3989918   | 0.3563389 |
| 100 | 31.0803571 | 68.9196429 | 41.53571 | 175.4643 | 0.3108036 | 0.4381393 | 0.4381393 | 0.2779551 | 0.6515604 | 0.4280093   | 0.3601097 |
plot(results, annotate=1, legend = "bottomright")
title("ROC curve")
plot(results, "prec/rec", annotate = 1, legend = "topright")
title("Precision Recall")
results_ratings = evaluate(x = eval_sets,
method = evaluation_models,
type="ratings")
## IBCF run fold/sample [model time/prediction time]
## 1 [0.4sec/0.02sec]
## 2 [0.4sec/0.01sec]
## 3 [0.39sec/0.02sec]
## 4 [0.39sec/0.02sec]
## 5 [0.39sec/0.02sec]
## IBCF run fold/sample [model time/prediction time]
## 1 [0.43sec/0.02sec]
## 2 [0.42sec/0.01sec]
## 3 [0.43sec/0.01sec]
## 4 [0.42sec/0.01sec]
## 5 [0.53sec/0.02sec]
## IBCF run fold/sample [model time/prediction time]
## 1 [0.38sec/0.02sec]
## 2 [0.41sec/0.01sec]
## 3 [0.41sec/0.01sec]
## 4 [0.42sec/0.01sec]
## 5 [0.41sec/0.02sec]
## UBCF run fold/sample [model time/prediction time]
## 1 [0.01sec/0.14sec]
## 2 [0sec/0.26sec]
## 3 [0.02sec/0.14sec]
## 4 [0sec/0.15sec]
## 5 [0sec/0.15sec]
## UBCF run fold/sample [model time/prediction time]
## 1 [0sec/0.27sec]
## 2 [0sec/0.15sec]
## 3 [0.02sec/0.16sec]
## 4 [0sec/0.16sec]
## 5 [0.02sec/0.15sec]
## UBCF run fold/sample [model time/prediction time]
## 1 [0.02sec/0.28sec]
## 2 [0.02sec/0.26sec]
## 3 [0.02sec/0.14sec]
## 4 [0.02sec/0.14sec]
## 5 [0.01sec/0.15sec]
## RANDOM run fold/sample [model time/prediction time]
## 1 [0sec/0.02sec]
## 2 [0sec/0.13sec]
## 3 [0sec/0.01sec]
## 4 [0sec/0.01sec]
## 5 [0sec/0.02sec]
table = as.data.frame(t(sapply(avg(results_ratings), rbind)))
colnames(table) = c("RMSE", "MSE", "MAE")
knitr::kable(table)
|              | RMSE      | MSE       | MAE       |
|--------------|-----------|-----------|-----------|
| IBCF_cos     | 1.1021610 | 1.2149475 | 0.8137186 |
| IBCF_pea     | 1.2102673 | 1.4649911 | 0.8996828 |
| IBCF_cos_ctr | 1.1021610 | 1.2149475 | 0.8137186 |
| UBCF_cos     | 0.9707491 | 0.9423672 | 0.7650323 |
| UBCF_pea     | 0.9930580 | 0.9861776 | 0.7863773 |
| UBCF_pea_Z   | 0.9931664 | 0.9863920 | 0.7865486 |
| random       | 1.3345242 | 1.7810893 | 1.0461225 |
In the prior exercises, various item-based and user-based recommendation models were built. The new approach allows many models to be evaluated in a single, more streamlined workflow.
The best model based on AUC was the UBCF model with cosine similarity, which also yielded the lowest RMSE at roughly 0.97. To increase serendipity, I would mix in a random selection from one of the user's favorite genres. Analyzing performance with offline data is straightforward, but if the system were live, we could set aside a portion of the user base for A/B testing and compare the results. The system's resource requirements must also be evaluated before implementation.
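As a rough illustration of the serendipity idea, the sketch below finds a user's favorite genre among their highly rated titles and samples one unseen movie from that genre. It assumes the 0/1 genre indicator columns of MovieLenseMeta; the rating threshold of 4 and the single random draw are arbitrary choices.
# Serendipity sketch: sample an unseen title from the first user's favorite genre.
data("MovieLense")                                    # reload the metadata removed earlier
user_ratings = as(MovieLense[1, ], "data.frame")      # columns: user, item, rating
liked = user_ratings$item[user_ratings$rating >= 4]   # titles rated 4 or higher (arbitrary cutoff)
genre_cols = setdiff(colnames(MovieLenseMeta), c("title", "year", "url"))
fav_genre = names(which.max(colSums(MovieLenseMeta[MovieLenseMeta$title %in% liked, genre_cols])))
unseen = !(MovieLenseMeta$title %in% user_ratings$item)
sample(MovieLenseMeta$title[MovieLenseMeta[[fav_genre]] == 1 & unseen], 1)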