Joke Recommender:

Collaborative Filters

…

Employer to applicant: “In this job we need someone who is responsible.”
Applicant: “I’m the one you want. On my last job, every time anything went wrong, they said I was responsible.”

## Formal class 'realRatingMatrix' [package "recommenderlab"] with 2 slots
##   ..@ data     :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
##   .. .. ..@ i       : int [1:1708993] 1 3 6 7 8 9 10 11 12 14 ...
##   .. .. ..@ p       : int [1:102] 0 0 15507 32461 48216 63117 86615 105769 129266 152763 ...
##   .. .. ..@ Dim     : int [1:2] 23500 101
##   .. .. ..@ Dimnames:List of 2
##   .. .. .. ..$ : NULL
##   .. .. .. ..$ : chr [1:101] "V1" "V2" "V3" "V4" ...
##   .. .. ..@ x       : num [1:1708993] -4.37 0.34 3.5 -7.67 1.02 3.64 9.27 0.39 0.1 -0.15 ...
##   .. .. ..@ factors : list()
##   ..@ normalize: NULL

## 1 x 101 rating matrix of class 'realRatingMatrix' with 50 ratings.

Using the Jester dataset (source at end), we built a ratings-based recommender system for 101 jokes and 23,500 users. We converted the data to a realRatingMatrix and then separated training data from testing data. We were able to put up to 36 jokes into the training set for each user.

The distribution of ratings shows that more jokes received high ratings than low ones, resulting in a left-skewed rating set.

## Formal class 'evaluationScheme' [package "recommenderlab"] with 9 slots
##   ..@ method     : chr "split"
##   ..@ given      : int 36
##   ..@ k          : int 1
##   ..@ train      : num 0.85
##   ..@ runsTrain  :List of 1
##   .. ..$ : int [1:19975] 12917 1853 15246 11675 16889 19724 9033 8272 4787 3460 ...
##   ..@ data       :Formal class 'realRatingMatrix' [package "recommenderlab"] with 2 slots
##   .. .. ..@ data     :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
##   .. .. .. .. ..@ i       : int [1:1708993] 1 3 6 7 8 9 10 11 12 14 ...
##   .. .. .. .. ..@ p       : int [1:102] 0 0 15507 32461 48216 63117 86615 105769 129266 152763 ...
##   .. .. .. .. ..@ Dim     : int [1:2] 23500 101
##   .. .. .. .. ..@ Dimnames:List of 2
##   .. .. .. .. .. ..$ : NULL
##   .. .. .. .. .. ..$ : chr [1:101] "V1" "V2" "V3" "V4" ...
##   .. .. .. .. ..@ x       : num [1:1708993] -4.37 0.34 3.5 -7.67 1.02 3.64 9.27 0.39 0.1 -0.15 ...
##   .. .. .. .. ..@ factors : list()
##   .. .. ..@ normalize: NULL
##   ..@ knownData  :Formal class 'realRatingMatrix' [package "recommenderlab"] with 2 slots
##   .. .. ..@ data     :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
##   .. .. .. .. ..@ i       : int [1:846000] 10 11 15 16 20 22 31 35 36 42 ...
##   .. .. .. .. ..@ p       : int [1:102] 0 0 6649 14281 21155 27572 40628 49789 62713 75640 ...
##   .. .. .. .. ..@ Dim     : int [1:2] 23500 101
##   .. .. .. .. ..@ Dimnames:List of 2
##   .. .. .. .. .. ..$ : NULL
##   .. .. .. .. .. ..$ : chr [1:101] "V1" "V2" "V3" "V4" ...
##   .. .. .. .. ..@ x       : num [1:846000] 9.27 0.39 4.51 -4.22 -9.13 -6.7 -1.46 -8.35 -6.6 -9.61 ...
##   .. .. .. .. ..@ factors : list()
##   .. .. ..@ normalize: NULL
##   ..@ unknownData:Formal class 'realRatingMatrix' [package "recommenderlab"] with 2 slots
##   .. .. ..@ data     :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
##   .. .. .. .. ..@ i       : int [1:862993] 1 3 6 7 8 9 12 14 18 21 ...
##   .. .. .. .. ..@ p       : int [1:102] 0 0 8858 18180 27061 35545 45987 55980 66553 77123 ...
##   .. .. .. .. ..@ Dim     : int [1:2] 23500 101
##   .. .. .. .. ..@ Dimnames:List of 2
##   .. .. .. .. .. ..$ : NULL
##   .. .. .. .. .. ..$ : chr [1:101] "V1" "V2" "V3" "V4" ...
##   .. .. .. .. ..@ x       : num [1:862993] -4.37 0.34 3.5 -7.67 1.02 3.64 0.1 -0.15 0.29 1.21 ...
##   .. .. .. .. ..@ factors : list()
##   .. .. ..@ normalize: NULL
##   ..@ goodRating : num 5

The number of test-set jokes per user has a mode at 63 that dwarfs all of the other sizes. The next most common size is in the mid-30s. The mode is consistent with the many users who rated 99 jokes: with 36 ratings kept as known data, 63 remain for testing.
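The way a fixed number of known ratings per user determines the test-set sizes can be sketched in base R. This is a toy illustration, not the code behind the report: the matrix `ratings`, the helper `split_given`, and the value `given = 3` are all hypothetical, and the real split was produced by recommenderlab's evaluation scheme with `given = 36`.

```r
# Toy ratings matrix: rows are users, columns are jokes, NA means "not rated".
set.seed(1)
ratings <- matrix(NA_real_, nrow = 4, ncol = 10)
for (u in 1:4) {
  rated <- sample(10, size = 5 + u)            # user u rated 5 + u jokes
  ratings[u, rated] <- round(runif(length(rated), -10, 10), 2)
}

# Hypothetical helper: keep `given` ratings per user as known data and
# withhold every other rating of that user for testing.
split_given <- function(r, given) {
  known <- unknown <- matrix(NA_real_, nrow(r), ncol(r))
  for (u in seq_len(nrow(r))) {
    rated <- which(!is.na(r[u, ]))
    keep  <- rated[sample.int(length(rated), given)]
    hold  <- setdiff(rated, keep)
    known[u, keep]   <- r[u, keep]
    unknown[u, hold] <- r[u, hold]
  }
  list(known = known, unknown = unknown)
}

parts <- split_given(ratings, given = 3)
rowSums(!is.na(parts$known))     # 3 known ratings for every user
rowSums(!is.na(parts$unknown))   # ratings made minus 3: 3 4 5 6
```

By the same arithmetic, a user with 99 ratings and 36 known ratings is left with 63 withheld ratings.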

To look at the similarity between jokes, we create a matrix of the similarity between jokes for the first ten users. The later jokes, shown in a lighter color along the right and top, are less similar to the other jokes. Pockets of the earlier jokes are quite alike.

Looking at jokes 70-100, and their ratings among all users, we see less similarity. These jokes appear to be more different from each other and from the earlier jokes.

…

For our first model, we entertain an item-based collaborative filtering method. We use the default cosine distance method.

## Recommender of type 'IBCF' for 'realRatingMatrix'
## learned using 19975 users.
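To make the cosine measure behind the item-based model concrete, here is a minimal base R sketch on a toy matrix. The data and the helper `cosine_sim` are hypothetical, and restricting each comparison to users who rated both jokes is an assumption of this sketch, not a statement about how recommenderlab computes it internally.

```r
# Toy ratings: 4 users (rows) by 3 jokes (columns); NA means "not rated".
r <- rbind(
  c( 5,  4, NA),
  c( 3,  2, -6),
  c(-4, -3,  7),
  c(NA,  1, -2)
)
colnames(r) <- c("j1", "j2", "j3")

# Cosine similarity between two joke columns, using only the users who
# rated both jokes (an assumption of this sketch).
cosine_sim <- function(a, b) {
  both <- !is.na(a) & !is.na(b)
  sum(a[both] * b[both]) / (sqrt(sum(a[both]^2)) * sqrt(sum(b[both]^2)))
}

# Full joke-by-joke similarity matrix.
items <- colnames(r)
sim <- outer(seq_along(items), seq_along(items),
             Vectorize(function(i, j) cosine_sim(r[, i], r[, j])))
dimnames(sim) <- list(items, items)
round(sim, 3)
```

In this toy data, j1 and j2 are rated alike by the same users and score close to 1, while j3 is rated in the opposite direction and scores close to -1. An item-based filter recommends jokes that are similar to the ones a user already rated highly.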

##          RMSE       MSE      MAE
## [1,] 2.052864  4.214252 2.052864
## [2,] 4.118384 16.961090 3.353863
## [3,] 5.541133 30.704160 4.380510
## [4,] 4.514304 20.378942 3.386728
## [5,] 5.052953 25.532337 4.188887
## [6,] 5.246133 27.521909 4.472954

## [1] 4.54378

After calculating the root mean square error (RMSE) for our model, and displaying it for the first six users along with the average (4.54), we became concerned that the training set, with up to 36 jokes per user, was too sparse. We fit a separate model restricted to users with at least 70 ratings. Its RMSE of 4.3468 was not meaningfully better and did not justify eliminating users. (It also didn’t lead to quicker runtime.) In the following plot, we can see that the per-user RMSE is roughly bell-shaped but right-skewed. Many users have an RMSE close to 4.5, so the quality of their recommendations should be similar. The users in the right tail of the distribution should experience worse recommendations.
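The three error columns in the per-user table can be reproduced by hand. The sketch below uses made-up `actual` and `predicted` matrices (hypothetical values, not the report's data) to show how RMSE, MSE and MAE are computed over each user's withheld ratings.

```r
# Toy example: actual and predicted ratings for 3 users (rows) on 3 jokes.
# NA marks jokes that are not in a user's test set.
actual <- rbind(c( 2, NA, -4),
                c( 5,  1, NA),
                c(NA, -3,  6))
predicted <- rbind(c( 1, NA, -1),
                   c( 3,  2, NA),
                   c(NA,  1,  4))

err  <- predicted - actual
mse  <- rowMeans(err^2, na.rm = TRUE)     # mean squared error per user
rmse <- sqrt(mse)                         # root mean square error per user
mae  <- rowMeans(abs(err), na.rm = TRUE)  # mean absolute error per user

per_user <- cbind(RMSE = rmse, MSE = mse, MAE = mae)
per_user
mean(per_user[, "RMSE"])                  # average of the per-user RMSE values
```

RMSE is never smaller than MAE, and the two coincide only when all of a user's absolute errors are equal, for instance when a single rating was withheld. That is one way to read the first row of the table above, where RMSE and MAE are both 2.052864.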

## IBCF run fold/sample [model time/prediction time]
##   1  [2.18sec/0.72sec]

##           TP        FP        FN        TN precision     recall        TPR        FPR
## 2  0.2309220  1.769078 8.9750355 54.024965 0.1154610 0.02235561 0.02235561 0.03161449
## 4  0.4709220  3.529078 8.7350355 52.264965 0.1177305 0.04469941 0.04469941 0.06294149
## 6  0.7129078  5.287092 8.4930496 50.506950 0.1188180 0.06665448 0.06665448 0.09433529
## 8  0.9676596  7.032340 8.2382979 48.761702 0.1209574 0.09016481 0.09016481 0.12537475
## 10 1.2252482  8.774752 7.9807092 47.019291 0.1225248 0.11657771 0.11657771 0.15650882
## 12 1.4913475 10.508652 7.7146099 45.285390 0.1242790 0.14532526 0.14532526 0.18741902
## 14 1.7557447 12.244255 7.4502128 43.549787 0.1254103 0.17243659 0.17243659 0.21843529
## 16 2.0394326 13.960567 7.1665248 41.833475 0.1274645 0.20374856 0.20374856 0.24910570
## 18 2.3412766 15.658723 6.8646809 40.135319 0.1300709 0.23857265 0.23857265 0.27942431
## 20 2.6371631 17.362837 6.5687943 38.431206 0.1318582 0.27105806 0.27105806 0.30984080
## 22 2.9472340 19.052766 6.2587234 36.741277 0.1339652 0.30735911 0.30735911 0.34002362
## 24 3.2536170 20.746383 5.9523404 35.047660 0.1355674 0.34136081 0.34136081 0.37024487
## 26 3.5747518 22.425248 5.6312057 33.368794 0.1374905 0.37840066 0.37840066 0.40026568
## 28 3.9001418 24.099858 5.3058156 31.694184 0.1392908 0.41772123 0.41772123 0.43030696
## 30 4.2278014 25.772199 4.9781560 30.021844 0.1409267 0.45771084 0.45771084 0.46018631
## 32 4.5648227 27.435177 4.6411348 28.358865 0.1426507 0.49765726 0.49765726 0.48980380
## 34 4.8964539 29.103546 4.3095035 26.690496 0.1440134 0.53668152 0.53668152 0.51949900
## 36 5.2371631 30.762837 3.9687943 25.031206 0.1454768 0.57697052 0.57697052 0.54910648
## 38 5.5727660 32.427234 3.6331915 23.366809 0.1466517 0.61581105 0.61581105 0.57876397
## 40 5.9083688 34.091631 3.2975887 21.702411 0.1477092 0.65564382 0.65564382 0.60843777
## 42 6.2391489 35.760851 2.9668085 20.033191 0.1485512 0.69086138 0.69086138 0.63805976
## 44 6.5537589 37.446241 2.6521986 18.347801 0.1489491 0.72838633 0.72838633 0.66833370
## 46 6.8706383 39.129362 2.3353191 16.664681 0.1493617 0.76474777 0.76474777 0.69835287
## 48 7.1724823 40.827518 2.0334752 14.966525 0.1494267 0.79430214 0.79430214 0.72867914
## 50 7.4660993 42.533901 1.7398582 13.260142 0.1493220 0.82499196 0.82499196 0.75942063
## 52 7.7546099 44.245390 1.4513475 11.548652 0.1491271 0.85745072 0.85745072 0.79014379
## 54 8.0309220 45.969078 1.1750355  9.824965 0.1487208 0.88688933 0.88688933 0.82113534
## 56 8.2964539 47.703546 0.9095035  8.090496 0.1481510 0.91234265 0.91234265 0.85229229
## 58 8.5387234 49.461277 0.6672340  6.332766 0.1472194 0.93685168 0.93685168 0.88400776
## 60 8.7673759 51.232624 0.4385816  4.561418 0.1461229 0.95997600 0.95997600 0.91593936
## 62 8.9883688 53.011631 0.2175887  2.782411 0.1449737 0.97921354 0.97921354 0.94809871
## 64 9.2059574 54.794043 0.0000000  1.000000 0.1438431 1.00000000 1.00000000 0.98031466

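To evaluate our model, we look at the confusion matrix for detecting good and bad jokes in the first (and only) evaluation run, where a rating of 5 or higher counts as good. The counts are fractional because they are averaged over the test users. At a list size of 64, our true negatives go to 1 and our false negatives go to 0. We can see the tradeoff between sensitivity and specificity: each gain in the true positive rate comes with a nearly equal gain in the false positive rate. When we plot the ROC curve, it runs nearly straight along the y=x line, which is what random guessing would produce. Using item-based CF and cosine distance, our model does not perform very well.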
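For a single user, the columns of such a table can be derived as in the following sketch. The helper `topn_confusion` and the six-joke rating vectors are hypothetical. The threshold of 5 mirrors the `goodRating` value shown in the evaluation scheme output, and the counts are for one user rather than an average over users.

```r
# Hypothetical helper: confusion counts for one user's top-n list.
# `predicted` and `actual` are named rating vectors over the withheld jokes.
# A joke counts as "good" when its actual rating is >= good_rating.
topn_confusion <- function(predicted, actual, n, good_rating = 5) {
  recommended <- names(sort(predicted, decreasing = TRUE))[seq_len(n)]
  good <- names(actual)[actual >= good_rating]
  TP <- length(intersect(recommended, good))   # recommended and good
  FP <- n - TP                                 # recommended but not good
  FN <- length(good) - TP                      # good but not recommended
  TN <- length(actual) - TP - FP - FN          # neither
  c(TP = TP, FP = FP, FN = FN, TN = TN,
    precision = TP / (TP + FP),
    recall    = TP / (TP + FN),                # same as TPR
    FPR       = FP / (FP + TN))
}

actual    <- c(j1 = 8, j2 = -3, j3 = 6, j4 = 1, j5 = 7, j6 = -9)
predicted <- c(j1 = 4, j2 = 5,  j3 = 1, j4 = 0, j5 = 3, j6 = -2)

# One row per list size n = 2, 4, 6.
t(sapply(c(2, 4, 6), function(n) topn_confusion(predicted, actual, n)))
```

As the list grows to cover every withheld joke, false negatives reach 0 and recall reaches 1, but the false positive rate also reaches 1. The same pattern appears in the last row of the table above.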

When we look at the precision-recall curve, precision peaks at a list size of 48 (about 0.149). According to this model and this measure, a recommendation list of 48 jokes is the optimum. On a practical basis, this may be too many, since it represents almost half of the jokes. It may still make a good “joke-of-the-week” set.

#------------------------
# This set could include a UBCF model. To present more models would require dimension reduction methods to speed up model creation.
#------------------------
rec_user <- Recommender(training_set, method = "UBCF")
rec_user
model_info <- getModel(rec_user)

To look for a better model, we create a list of 5 models: Item-Based Collaborative Filtering and User-Based Collaborative Filtering, each with Pearson Correlation and Cosine distance. A random model is the last of the five.
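A list of this kind could be written as follows. This is a sketch under assumptions, not the report's original code: the labels, the order of the entries, and the list sizes `seq(5, 70, 5)` are illustrative. The evaluation itself only runs when recommenderlab is installed and an evaluation scheme named `eval_sets` (the name used in the prediction code later in this report) exists in the session.

```r
# Five candidate models as a named list in the shape that recommenderlab's
# evaluate() accepts. Labels and ordering are illustrative choices.
models_to_evaluate <- list(
  IBCF_cosine  = list(name = "IBCF",   param = list(method = "cosine")),
  IBCF_pearson = list(name = "IBCF",   param = list(method = "pearson")),
  UBCF_cosine  = list(name = "UBCF",   param = list(method = "cosine")),
  UBCF_pearson = list(name = "UBCF",   param = list(method = "pearson")),
  random       = list(name = "RANDOM", param = NULL)
)

# Guarded so the block also runs where the package or the scheme is missing.
if (requireNamespace("recommenderlab", quietly = TRUE) && exists("eval_sets")) {
  library(recommenderlab)
  results <- evaluate(eval_sets, method = models_to_evaluate,
                      type = "topNList", n = seq(5, 70, 5))
  plot(results, annotate = TRUE)               # ROC curves, one per model
  plot(results, "prec/rec", annotate = TRUE)   # precision-recall curves
}
```

Keeping the random recommender in the list gives a baseline: any model whose curve sits on top of the random one adds little value.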

## IBCF run fold/sample [model time/prediction time]
##   1  [2.24sec/0.7sec]
## IBCF run fold/sample [model time/prediction time]
##   1  [1.75sec/0.73sec]
## UBCF run fold/sample [model time/prediction time]
##   1  [0.15sec/81.88sec]
## UBCF run fold/sample [model time/prediction time]
##   1  [0.16sec/61.12sec]
## RANDOM run fold/sample [model time/prediction time]
##   1  [0sec/0.72sec]

For list sizes from 5 to 70, we look at mean precision, recall, true positive rate and false positive rate. Then we plot a ROC curve and a precision-recall curve to compare all of the models. As before, we see that our first model was nearly as bad as a random guess. An item-based CF model using Pearson correlation works much better, especially if our goal is high recall. That model would give us a decent joke-a-week calendar, with 52 suggestions. For shorter lists, a user-based CF model with Pearson correlation works much better. If we want a top-5 or top-10 list, this should be the appropriate model. In our last graphs, we can see the superiority of both user-based models, which deliver better recall and precision. For a small list, the Pearson version is the clear winner.

##    precision     recall        TPR        FPR
## 5  0.1174468 0.05526907 0.05526907 0.07875388
## 10 0.1225248 0.11657771 0.11657771 0.15650882
## 15 0.1267329 0.18833777 0.18833777 0.23369951
## 20 0.1318582 0.27105806 0.27105806 0.30984080
## 25 0.1367262 0.35904765 0.35904765 0.38513087
## 30 0.1409267 0.45771084 0.45771084 0.46018631

## IBCF run fold/sample [model time/prediction time]
##   1  [2.35sec/0.55sec]
## IBCF run fold/sample [model time/prediction time]
##   1  [1.58sec/0.56sec]
## UBCF run fold/sample [model time/prediction time]
##   1  [0.13sec/80.85sec]
## UBCF run fold/sample [model time/prediction time]
##   1  [0.15sec/60.73sec]
## RANDOM run fold/sample [model time/prediction time]
##   1  [0sec/0.75sec]

With our best model, we create a top-5 list for user 1000. It suggests jokes 70, 69, 11, 55 and 67. Among those jokes: Employer to applicant: “In this job we need someone who is responsible.” Applicant: “I’m the one you want. On my last job, every time anything went wrong, they said I was responsible.”

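library(recommenderlab)

# eval_sets is the evaluationScheme created earlier in the report.
testing_set <- getData(eval_sets, "unknown")

# UBCF with Pearson similarity. As run here, the model is fit on the "unknown" split.
rec_user <- Recommender(testing_set, method = "UBCF", parameter = list(method = "pearson"))

# Top-5 recommendations for every user in the "unknown" split.
items_to_recommend <- 5
eval_prediction_user <- predict(object = rec_user, newdata = testing_set, n = items_to_recommend)

# Item indices recommended to the 1000th user.
eval_prediction_user@items[[1000]]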

Source for dataset: Eigentaste: A Constant Time Collaborative Filtering Algorithm. Ken Goldberg, Theresa Roeder, Dhruv Gupta, and Chris Perkins. Information Retrieval, 4(2), 133-151. July 2001.

Data 643 project 2

Dan Wigodsky

June 14, 2018
