The goal of this assignment is to try out different ways of implementing and configuring a recommender, and to evaluate different approaches. Start with an existing dataset of user-item ratings, and implement at least two recommendation algorithms.
Evaluate and compare different approaches, and provide at least one graph and a summary of findings and recommendations.
For this project, we use dataset 1_2 from the Jester database of joke ratings. The dataset includes ratings of 100 jokes from 23,500 users, each of whom has rated 36 or more jokes. Ratings are real values ranging from -10.00 (worst) to +10.00 (best).
We use the recommenderlab package to develop and test our recommender system using different recommendation algorithms. In addition, we use some of the ideas and code suggestions from the recommenderlab vignette, which I also used in a project for DATA 607 last fall.
library(tidyverse)
library(recommenderlab)
library(knitr)
We start by preparing the data for the recommender system. We note from the data description for the Jester database that the rating value “99” corresponds to “null” or “not rated”, and that the first column in the dataset is the number of jokes rated by each user. So our data preparation includes the following steps:
- Check that the first column (the count of jokes rated per user) matches the number of non-99 entries in each row, then drop it.
- Replace the placeholder value 99 with NA.
- Assign row names for users (u0, u1, …) and column names for jokes (j0, j1, …).
- Convert the data frame to a realRatingMatrix object for use with recommenderlab.
After conversion to a realRatingMatrix, the data form a user-item utility matrix in which the “users” are the raters and the “items” are the jokes. We see that the ratings matrix includes 1.71 million ratings in total, with 23,500 users and 100 jokes. This means the ratings matrix is roughly 73% populated with non-blank ratings (or conversely, 27% sparse), out of a total of 2.35 million possible ratings.
# load data from github or from disk
# github repo
file <- "https://raw.githubusercontent.com/kecbenson/DATA612/master/jester-data-2.csv"
# local directory
#file <- "C:/Users/Kevin/Documents/CUNY - MSDS/DATA 612/jester_dataset_1_2/jester-data-2.csv"
df <- read.csv(file, header = FALSE)
# check that rowcounts match, then discard first column
all.equal(rowSums(df[ , 2:ncol(df)] != 99), df[ , 1])
## [1] TRUE
df <- df[-1]
# set values of 99 to NA and set row & column names
df[df == 99] <- NA
rownames(df) <- paste0("u", 0:(nrow(df)-1))
colnames(df) <- paste0("j", 0:(ncol(df)-1))
glimpse(df)
## Observations: 23,500
## Variables: 100
## $ j0 <dbl> NA, -4.37, NA, 0.34, NA, NA, 3.50, -7.67, 1.02, 3.64, 9.27...
## $ j1 <dbl> 8.11, -3.88, NA, -6.55, NA, NA, 2.28, 7.14, 6.07, 1.60, 6....
## $ j2 <dbl> NA, 0.73, NA, 2.86, NA, NA, 3.01, -6.31, 6.65, -0.39, -9.8...
## $ j3 <dbl> NA, -3.20, NA, NA, NA, NA, -0.63, NA, -0.87, 3.74, -8.83, ...
## $ j4 <dbl> -2.28, -6.41, 0.73, -3.64, 9.13, -1.46, 4.95, -9.32, 6.80,...
## $ j5 <dbl> -4.22, 1.17, NA, 1.12, NA, NA, -0.49, 6.07, 0.68, -9.61, 0...
## $ j6 <dbl> 5.49, 7.82, 5.53, 5.34, -9.32, 2.72, -0.49, 3.59, 5.34, 8....
## $ j7 <dbl> -2.62, -4.76, 3.25, 2.33, -2.04, -3.83, -0.68, -2.72, 4.56...
## $ j8 <dbl> NA, -6.41, NA, NA, NA, NA, 3.30, NA, -0.39, -7.09, -6.17, ...
## $ j9 <dbl> -2.28, 0.73, NA, 2.33, NA, NA, -1.41, -8.01, 8.25, 3.88, -...
## $ j10 <dbl> 5.34, 1.99, NA, 3.54, NA, NA, 4.71, -9.13, 8.93, -6.36, 9....
## $ j11 <dbl> 3.54, 0.63, NA, 0.92, 2.04, NA, 3.30, -8.06, 6.21, -9.76, ...
## $ j12 <dbl> -9.71, -1.41, -2.04, -5.87, -2.04, -3.83, 1.26, 2.67, 0.10...
## $ j13 <dbl> 3.30, -4.32, 2.91, 2.77, NA, 0.97, -4.03, 7.43, 9.08, -7.9...
## $ j14 <dbl> -5.58, 0.63, 2.62, 4.76, -2.23, -3.83, -5.49, -7.23, -0.73...
## $ j15 <dbl> -6.07, -9.22, -1.70, -3.54, -8.50, -1.26, -5.44, -7.23, 2....
## $ j16 <dbl> -0.29, 0.87, 2.28, -7.38, 3.16, 0.00, 2.86, -0.53, 6.46, -...
## $ j17 <dbl> 2.33, 0.68, 3.11, -4.37, 2.18, -4.90, 1.17, 1.60, -5.34, -...
## $ j18 <dbl> 3.93, 0.29, 2.91, -1.75, 3.93, -2.72, 1.80, -7.91, 0.97, 2...
## $ j19 <dbl> -0.34, -4.51, 6.75, -4.81, 2.96, -0.39, 1.07, -8.11, 3.83,...
## $ j20 <dbl> -6.89, -6.65, 2.72, -1.75, 6.41, 1.50, 2.62, -7.23, 8.59, ...
## $ j21 <dbl> -2.09, -4.03, 2.82, 2.72, 2.62, NA, 0.73, -9.13, 4.71, 4.8...
## $ j22 <dbl> -8.83, 1.94, NA, 2.96, NA, NA, -0.39, -7.77, 8.45, -9.13, ...
## $ j23 <dbl> NA, -3.11, NA, NA, NA, NA, -1.94, NA, -0.44, 6.12, -8.98, ...
## $ j24 <dbl> 5.15, -4.32, NA, 1.02, NA, NA, 1.99, -8.88, 8.59, 1.60, -6...
## $ j25 <dbl> 3.98, -0.63, 4.22, 2.96, 5.10, 2.62, 1.21, 7.38, 7.52, -9....
## $ j26 <dbl> 4.47, 5.83, 2.77, 3.25, 0.05, 3.74, 2.09, 0.00, 6.26, -9.3...
## $ j27 <dbl> 3.88, 3.64, 3.35, 0.10, 6.12, NA, 3.74, -8.59, 9.08, -3.83...
## $ j28 <dbl> 5.29, 0.39, 4.22, 2.18, 7.18, 3.06, 0.63, 1.65, 7.28, -0.4...
## $ j29 <dbl> NA, -0.44, NA, 0.97, NA, NA, 3.30, -8.16, 6.17, -6.17, -6....
## $ j30 <dbl> 7.09, 0.73, 3.20, -6.89, 8.35, 2.04, 2.04, 8.40, 6.99, -8....
## $ j31 <dbl> -2.23, 1.07, 4.51, -3.93, 7.43, 4.03, -2.48, -6.55, 8.30, ...
## $ j32 <dbl> NA, 3.06, NA, NA, NA, NA, 3.35, -7.86, -7.91, -2.67, -0.49...
## $ j33 <dbl> NA, -4.51, NA, -2.43, NA, NA, 2.18, -8.01, 6.80, -4.90, 4....
## $ j34 <dbl> 3.11, -0.39, 5.24, 4.17, 7.57, 4.27, 4.32, 0.44, 2.82, 3.2...
## $ j35 <dbl> -4.42, 3.11, 2.33, -4.47, 5.49, 5.49, 1.89, -1.50, 3.50, 4...
## $ j36 <dbl> NA, 1.02, NA, NA, NA, NA, -1.99, NA, 0.15, 3.30, -9.08, 4....
## $ j37 <dbl> -5.87, -2.09, 0.00, 0.63, 8.98, 3.06, 4.17, -8.88, 1.99, 0...
## $ j38 <dbl> NA, -6.21, 1.84, 1.50, NA, NA, 2.43, 8.20, 9.17, -5.87, -0...
## $ j39 <dbl> -7.28, -4.71, 4.42, 3.45, NA, NA, 2.86, -9.13, 8.20, -4.17...
## $ j40 <dbl> NA, 1.07, NA, 2.91, NA, NA, 4.47, -8.45, 8.40, -3.54, -9.6...
## $ j41 <dbl> -4.71, 2.77, 2.82, 1.65, 5.63, 2.72, -1.02, 7.38, 6.80, -9...
## $ j42 <dbl> -8.93, 2.04, NA, NA, NA, NA, 2.38, -7.48, 8.30, -2.62, -0....
## $ j43 <dbl> NA, -5.97, NA, 3.06, NA, NA, -0.63, NA, -1.26, 1.75, -8.83...
## $ j44 <dbl> 3.40, -0.58, NA, 0.58, 4.66, NA, -4.76, 7.86, 8.01, -2.57,...
## $ j45 <dbl> -7.18, -3.74, 4.17, 3.93, NA, 0.97, -2.09, 7.67, 7.48, -1....
## $ j46 <dbl> 2.28, 3.30, NA, 3.69, NA, NA, 3.16, 7.48, 8.11, -8.88, -8....
## $ j47 <dbl> 7.48, -1.46, 3.69, 5.15, NA, 0.97, 3.79, -8.69, 7.91, 1.31...
## $ j48 <dbl> 5.15, 6.02, 3.79, 3.30, 4.17, 4.08, 4.27, -7.57, 9.03, -3....
## $ j49 <dbl> 5.73, 0.44, 2.48, 1.55, 8.11, 1.99, 1.99, 1.60, 8.88, -3.5...
## $ j50 <dbl> NA, -4.22, NA, -4.90, 7.33, NA, 1.70, 8.20, 7.28, -9.66, -...
## $ j51 <dbl> NA, -0.44, NA, -6.99, NA, NA, 4.08, 7.82, 8.11, -9.66, -7....
## $ j52 <dbl> 5.78, -1.94, 3.64, 3.06, 6.02, 4.90, 0.87, 0.68, 9.08, -4....
## $ j53 <dbl> 3.20, 8.45, 3.01, 3.64, 4.71, 0.97, -2.62, 7.38, 8.88, -2....
## $ j54 <dbl> -0.44, -3.01, NA, 1.21, -0.58, NA, 2.04, -8.69, 6.21, -3.3...
## $ j55 <dbl> 5.73, -4.51, 2.52, 0.68, 7.09, 1.50, 3.30, -7.23, 8.83, -8...
## $ j56 <dbl> NA, 3.54, NA, NA, -2.23, 0.68, 3.06, NA, 3.54, -8.01, -9.8...
## $ j57 <dbl> NA, -0.44, 4.71, NA, NA, NA, -3.74, NA, 0.19, -5.78, -4.13...
## $ j58 <dbl> NA, 4.56, NA, 1.02, NA, NA, 0.34, -8.16, 6.65, -9.76, -4.1...
## $ j59 <dbl> NA, 2.04, NA, 2.28, NA, NA, -0.92, -7.67, 6.07, -0.34, -0....
## $ j60 <dbl> -8.25, 5.78, -0.34, 1.41, 5.78, 2.09, 1.75, 8.35, 8.93, -3...
## $ j61 <dbl> 4.61, 5.63, 3.35, 1.21, 1.94, 3.06, 0.78, -7.23, 3.40, -1....
## $ j62 <dbl> NA, 6.02, NA, 0.83, NA, NA, 3.69, -7.72, 8.11, -9.71, -6.8...
## $ j63 <dbl> NA, -4.37, NA, -0.10, NA, NA, -0.68, -7.67, 8.54, 6.75, -0...
## $ j64 <dbl> -3.98, -8.98, 3.88, 4.71, 5.24, 3.20, 3.35, -1.80, 7.82, 1...
## $ j65 <dbl> 4.90, 0.87, -0.15, 1.12, 6.36, 1.89, 1.65, -8.01, 7.67, -5...
## $ j66 <dbl> NA, -4.66, NA, 3.01, NA, NA, 0.92, -6.07, 5.29, 0.24, -0.3...
## $ j67 <dbl> -4.32, 2.82, 3.06, 2.67, NA, 1.50, 2.52, -7.23, -0.29, -0....
## $ j68 <dbl> -2.72, 2.86, 2.96, 3.06, NA, 3.69, 1.70, -8.01, 0.78, -6.0...
## $ j69 <dbl> NA, -4.03, NA, 1.31, NA, NA, 5.00, -8.01, 8.06, -9.71, -3....
## $ j70 <dbl> NA, 3.06, NA, -3.64, NA, NA, 7.18, NA, -0.05, 0.44, NA, NA...
## $ j71 <dbl> NA, 2.91, NA, NA, NA, NA, 6.75, NA, 8.50, -2.14, NA, NA, N...
## $ j72 <dbl> NA, 5.44, 4.56, 2.43, NA, NA, 4.71, NA, 8.88, -9.56, NA, N...
## $ j73 <dbl> NA, -4.76, NA, NA, NA, NA, 3.20, NA, 4.27, -9.71, NA, NA, ...
## $ j74 <dbl> NA, -8.93, NA, NA, NA, NA, 2.23, NA, 9.08, -9.27, NA, NA, ...
## $ j75 <dbl> NA, -0.49, NA, NA, NA, NA, 4.47, NA, 8.54, -9.08, NA, NA, ...
## $ j76 <dbl> NA, 8.16, NA, NA, -0.05, NA, 3.50, -4.32, 6.17, -8.01, NA,...
## $ j77 <dbl> NA, -3.16, NA, NA, NA, NA, 5.97, NA, 8.74, -9.61, NA, NA, ...
## $ j78 <dbl> NA, 0.83, NA, NA, NA, NA, 0.44, NA, 3.20, -9.61, NA, NA, N...
## $ j79 <dbl> NA, 1.94, NA, -4.66, NA, NA, 4.51, NA, 8.50, -9.56, NA, NA...
## $ j80 <dbl> NA, -0.49, NA, NA, NA, 1.80, 7.14, NA, 0.92, 4.61, 9.13, 8...
## $ j81 <dbl> NA, 2.62, NA, NA, NA, NA, 5.00, -6.21, 9.13, -0.29, -9.71,...
## $ j82 <dbl> NA, 1.99, NA, NA, NA, NA, 8.98, NA, 8.54, -9.56, NA, NA, N...
## $ j83 <dbl> NA, -3.11, NA, NA, NA, NA, 5.39, NA, 6.36, -3.45, NA, NA, ...
## $ j84 <dbl> NA, -0.44, NA, NA, NA, NA, 4.27, NA, 9.03, -9.71, NA, NA, ...
## $ j85 <dbl> NA, -6.50, NA, NA, NA, NA, 4.47, NA, 9.03, 1.12, NA, NA, N...
## $ j86 <dbl> NA, 2.04, NA, NA, NA, NA, 3.20, NA, 3.54, -6.50, 9.03, NA,...
## $ j87 <dbl> NA, -3.16, NA, NA, NA, NA, 4.56, NA, 4.66, -6.94, -5.83, N...
## $ j88 <dbl> 1.21, 2.09, NA, NA, NA, NA, 4.22, 1.65, 8.93, -5.10, NA, N...
## $ j89 <dbl> NA, 5.34, 2.38, NA, NA, NA, 3.83, NA, 6.46, -6.46, NA, NA,...
## $ j90 <dbl> NA, 5.73, NA, NA, NA, NA, 1.89, NA, 6.75, 8.59, NA, NA, NA...
## $ j91 <dbl> NA, -6.70, NA, NA, NA, NA, 3.54, NA, 8.74, -3.69, NA, NA, ...
## $ j92 <dbl> NA, 1.99, NA, NA, NA, NA, 2.38, NA, 9.17, -1.84, NA, NA, N...
## $ j93 <dbl> NA, 2.62, NA, NA, NA, 1.89, 2.77, NA, 7.52, 3.11, NA, 7.04...
## $ j94 <dbl> NA, -0.49, 3.16, NA, NA, NA, -0.24, NA, 5.73, 0.87, NA, NA...
## $ j95 <dbl> -5.92, 3.45, NA, NA, NA, NA, 2.28, NA, 5.92, 1.02, NA, NA,...
## $ j96 <dbl> NA, 3.20, NA, NA, NA, NA, 5.05, 1.60, -6.60, 4.17, NA, NA,...
## $ j97 <dbl> NA, -0.53, NA, NA, NA, NA, 4.51, NA, 0.24, -6.55, NA, NA, ...
## $ j98 <dbl> NA, -0.53, NA, NA, NA, NA, 4.08, NA, 9.08, -8.93, NA, NA, ...
## $ j99 <dbl> NA, -2.96, NA, NA, NA, NA, 2.96, NA, 8.98, -9.42, NA, NA, ...
# create raw ratings matrix
r <- as.matrix(df) %>% as("realRatingMatrix")
r
## 23500 x 100 rating matrix of class 'realRatingMatrix' with 1708993 ratings.
getRatingMatrix(r)[1:10, 1:10]
## 10 x 10 sparse Matrix of class "dgCMatrix"
## [[ suppressing 10 column names 'j0', 'j1', 'j2' ... ]]
##
## u0 . 8.11 . . -2.28 -4.22 5.49 -2.62 . -2.28
## u1 -4.37 -3.88 0.73 -3.20 -6.41 1.17 7.82 -4.76 -6.41 0.73
## u2 . . . . 0.73 . 5.53 3.25 . .
## u3 0.34 -6.55 2.86 . -3.64 1.12 5.34 2.33 . 2.33
## u4 . . . . 9.13 . -9.32 -2.04 . .
## u5 . . . . -1.46 . 2.72 -3.83 . .
## u6 3.50 2.28 3.01 -0.63 4.95 -0.49 -0.49 -0.68 3.30 -1.41
## u7 -7.67 7.14 -6.31 . -9.32 6.07 3.59 -2.72 . -8.01
## u8 1.02 6.07 6.65 -0.87 6.80 0.68 5.34 4.56 -0.39 8.25
## u9 3.64 1.60 -0.39 3.74 -6.36 -9.61 8.88 -4.08 -7.09 3.88
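As a quick sanity check on the density figure quoted above, we can compute it directly from the realRatingMatrix (a minimal sketch using the objects already defined):
# sketch: overall density / sparsity of the rating matrix
n_ratings  <- length(getRatings(r))   # number of non-missing ratings (~1.71M)
n_possible <- nrow(r) * ncol(r)       # 23,500 users x 100 jokes = 2.35M cells
round(n_ratings / n_possible, 3)      # share of populated cells (~0.73)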
For comparison, we also create the normalized forms of the rating matrix, using centering and Z-score methods for normalization.
# create normalized ratings matrix
r_n1 <- normalize(r)
r_n1
## 23500 x 100 rating matrix of class 'realRatingMatrix' with 1708993 ratings.
## Normalized using center on rows.
getRatingMatrix(r_n1)[1:10, 1:10] %>% round(2)
## 10 x 10 sparse Matrix of class "dgCMatrix"
## [[ suppressing 10 column names 'j0', 'j1', 'j2' ... ]]
##
## u0 . 8.17 . . -2.22 -4.16 5.55 -2.56 . -2.22
## u1 -4.09 -3.60 1.01 -2.92 -6.13 1.45 8.10 -4.48 -6.13 1.01
## u2 . . . . -2.17 . 2.63 0.35 . .
## u3 -0.18 -7.07 2.34 . -4.16 0.60 4.82 1.81 . 1.81
## u4 . . . . 5.65 . -12.80 -5.52 . .
## u5 . . . . -2.79 . 1.39 -5.16 . .
## u6 1.39 0.17 0.90 -2.74 2.84 -2.60 -2.60 -2.79 1.19 -3.52
## u7 -4.87 9.94 -3.51 . -6.52 8.87 6.39 0.08 . -5.21
## u8 -4.57 0.48 1.06 -6.46 1.21 -4.91 -0.25 -1.03 -5.98 2.66
## u9 7.07 5.03 3.04 7.17 -2.93 -6.18 12.31 -0.65 -3.66 7.31
r_n2 <- normalize(r, method = "Z-score")
r_n2
## 23500 x 100 rating matrix of class 'realRatingMatrix' with 1708993 ratings.
## Normalized using z-score on rows.
getRatingMatrix(r_n2)[1:10, 1:10] %>% round(2)
## 10 x 10 sparse Matrix of class "dgCMatrix"
## [[ suppressing 10 column names 'j0', 'j1', 'j2' ... ]]
##
## u0 . 1.57 . . -0.43 -0.80 1.07 -0.49 . -0.43
## u1 -1.02 -0.90 0.25 -0.73 -1.54 0.36 2.03 -1.12 -1.54 0.25
## u2 . . . . -1.23 . 1.49 0.20 . .
## u3 -0.05 -2.09 0.69 . -1.23 0.18 1.42 0.53 . 0.53
## u4 . . . . 1.27 . -2.87 -1.24 . .
## u5 . . . . -1.08 . 0.54 -2.00 . .
## u6 0.51 0.06 0.33 -1.01 1.05 -0.96 -0.96 -1.03 0.44 -1.30
## u7 -0.76 1.54 -0.54 . -1.01 1.38 0.99 0.01 . -0.81
## u8 -1.20 0.13 0.28 -1.70 0.32 -1.29 -0.07 -0.27 -1.57 0.70
## u9 1.41 1.00 0.61 1.43 -0.59 -1.23 2.45 -0.13 -0.73 1.46
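As a quick check of what the normalization does (a minimal sketch using the objects above): centering subtracts each user's mean rating from that user's ratings, and the Z-score method additionally divides by the user's rating standard deviation.
# sketch: confirm row-centering for a single user (u1)
u1_raw      <- as(r["u1", ], "matrix")
u1_centered <- as(r_n1["u1", ], "matrix")
all.equal(u1_centered, u1_raw - mean(u1_raw, na.rm = TRUE))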
We use image to visually represent portions of the raw and normalized ratings matrices.
# show part of ratings matrices
image(r[1:25, 1:50], main = "Raw Ratings")
image(r_n1[1:25, 1:50], main = "Normalized Ratings - Centered")
image(r_n2[1:25, 1:50], main = "Normalized Ratings - Z-Score")
Next we do some preliminary data analysis. Reviewing the distributions of the raw and normalized ratings, we see that the mean and median raw ratings are about 0.8 and 1.4, respectively. As expected, the normalized ratings are centered at 0, and they are approximately normally distributed, as seen in the histogram of rating Z-scores. Interestingly, the ratings distribution is shifted toward positive values, with a higher frequency of ratings above 0 than below 0. This remains true even after normalizing the ratings, which suggests a broad user bias toward positive ratings. Perhaps some people are too polite to give bad jokes a negative rating?
# distribution of ratings
par(mfrow = c(1, 3))
hist(getRatings(r), main = "Raw Ratings")
hist(getRatings(r_n1), main = "Norm. Ratings - Centered")
hist(getRatings(r_n2), prob = TRUE, xlim = c(-5, 5), breaks = 40,
main = "Norm. Ratings - Z-Score")
curve(dnorm(x), add = TRUE, col = "blue")
par(mfrow = c(1,1))
summary(getRatings(r))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -9.9500 -3.2000 1.3600 0.7573 5.1000 10.0000
summary(getRatings(r_n1))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -18.7149 -2.9704 0.4446 0.0000 3.1672 18.2654
summary(getRatings(r_n2))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -9.3382 -0.7244 0.1254 0.0000 0.7520 8.8608
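To quantify the positive bias noted above, we can look at the share of ratings above zero before and after centering (a quick sketch):
# sketch: proportion of ratings above zero, raw vs. row-centered
mean(getRatings(r) > 0)     # raw ratings
mean(getRatings(r_n1) > 0)  # centered ratings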
Turning to ratings per user, we see that the number of jokes rated per user averages about 72-73. The distribution has two modes: users who rated all 100 jokes and users who rated 70-74 jokes. There is also a slight mode at the minimum of 36, which probably arises from the cutoff applied when the dataset was created (the Jester data description indicates that only users with at least 36 ratings are included). In addition, the average raw rating per user is about 0.8 and ranges from roughly -10 to +9; as expected, the average normalized rating per user is 0.
# distribution of rating count & avg rating per user
hist(rowCounts(r), breaks = 40, main = "Distribution of Rating Count per User")
summary(rowCounts(r))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 36.00 53.00 72.00 72.72 100.00 100.00
par(mfrow = c(1,3))
hist(rowMeans(r), main = "Avg. Rating per User")
hist(rowMeans(r_n1), main = "Avg. Rating (Centered) per User")
hist(rowMeans(r_n2), main = "Avg. Rating (Z-Score) per User")
par(mfrow = c(1,1))
summary(rowMeans(r))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -9.9474 -0.8841 0.8597 0.7576 2.5702 9.3133
summary(rowMeans(r_n1))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -1.460e-14 -2.842e-16 0.000e+00 4.248e-18 2.961e-16 1.296e-14
summary(rowMeans(r_n2))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -6.399e-13 -6.550e-17 2.000e-19 -2.500e-17 6.990e-17 8.982e-14
Reviewing the rating matrix along the second dimension (by joke), we see that the number of ratings per joke has a mean of 17,090 and a median of 18,648. The distribution appears bimodal, with many jokes receiving under 11,000 ratings and others receiving well over 20,000 (including a spike from jokes rated by virtually all 23,500 users); no jokes fall in the range of 11,000 to 14,000 ratings. Further, the average rating per joke is about 0.7, with a high of 3.5 and a low of -3.6. As expected, the average normalized rating per joke is approximately 0.
# distribution of rating count & avg rating per joke
hist(colCounts(r), breaks = 40, main = "Distribution of Rating Count per Joke")
summary(colCounts(r))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 8164 9732 18648 17090 23329 23499
par(mfrow = c(1,3))
hist(colMeans(r), main = "Avg. Rating per Joke")
hist(colMeans(r_n1), main = "Avg. Rating (Centered) per Joke")
hist(colMeans(r_n2), main = "Avg. Rating (Z-Score) per Joke")
par(mfrow = c(1,1))
summary(colMeans(r))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -3.5738 -0.4375 0.9445 0.7163 1.8245 3.4593
summary(colMeans(r_n1))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -4.25191 -1.19483 0.20238 -0.06176 0.97356 2.69231
summary(colMeans(r_n2))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.95941 -0.25569 0.05488 -0.01312 0.22596 0.60609
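As a quick check of the gap noted above, we can count the jokes with rating counts between 11,000 and 14,000 (a sketch; we expect this to be zero):
# sketch: how many jokes fall in the 11,000-14,000 rating-count range?
sum(colCounts(r) > 11000 & colCounts(r) < 14000)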
We can see that the most frequently rated jokes were rated by virtually all 23,500 users.
# top N most frequently rated jokes
N <- 10
t <- sort(colCounts(r), decreasing = TRUE) %>% head(N)
topN <- cbind(Rank = 1:N,
Joke = names(t),
Frequency = t
)
rownames(topN) <- NULL
kable(topN, caption = "Top 10 Most Frequently Rated Jokes")
| Rank | Joke | Frequency |
|---|---|---|
| 1 | j12 | 23499 |
| 2 | j14 | 23499 |
| 3 | j16 | 23499 |
| 4 | j35 | 23499 |
| 5 | j49 | 23499 |
| 6 | j4 | 23498 |
| 7 | j19 | 23498 |
| 8 | j52 | 23498 |
| 9 | j6 | 23497 |
| 10 | j7 | 23497 |
On the other hand, the least frequently rated jokes were rated by under 9,000 users.
# top N least frequently rated jokes
N <- 10
t <- sort(colCounts(r), decreasing = FALSE) %>% head(N)
topN <- cbind(Rank = 1:N,
Joke = names(t),
Frequency = t
)
rownames(topN) <- NULL
kable(topN, caption = "10 Least Frequently Rated Jokes")
| Rank | Joke | Frequency |
|---|---|---|
| 1 | j70 | 8164 |
| 2 | j72 | 8231 |
| 3 | j71 | 8288 |
| 4 | j73 | 8392 |
| 5 | j74 | 8393 |
| 6 | j77 | 8494 |
| 7 | j75 | 8513 |
| 8 | j76 | 8551 |
| 9 | j78 | 8586 |
| 10 | j79 | 8643 |
Finally, we summarize the top 10 most highly rated jokes based on the raw ratings as well as the normalized ratings (for both centering and Z-scores). Note that the top 10 lists are almost identical regardless of rating method; only a few jokes move slightly in position.
# top N most highly rated jokes
N <- 10
t <- sort(colMeans(r), decreasing = TRUE) %>% head(N)
t1 <- sort(colMeans(r_n1), decreasing = TRUE) %>% head(N)
t2 <- sort(colMeans(r_n2), decreasing = TRUE) %>% head(N)
topN <- cbind(Rank = 1:N,
Joke = names(t), Raw = round(t, 2),
Joke1 = names(t1), Centered = round(t1, 2),
Joke2 = names(t2), "Z-Score" = round(t2, 2)
)
rownames(topN) <- NULL
kable(topN, caption = "Top 10 Most Highly Rated Jokes (Based on Raw, Centered, and Z-Score Ratings)")
| Rank | Joke | Raw | Joke1 | Centered | Joke2 | Z-Score |
|---|---|---|---|---|---|---|
| 1 | j88 | 3.46 | j49 | 2.69 | j49 | 0.61 |
| 2 | j49 | 3.45 | j88 | 2.55 | j88 | 0.58 |
| 3 | j31 | 3 | j31 | 2.24 | j31 | 0.5 |
| 4 | j26 | 2.95 | j26 | 2.19 | j26 | 0.5 |
| 5 | j34 | 2.94 | j34 | 2.18 | j35 | 0.49 |
| 6 | j35 | 2.92 | j35 | 2.16 | j34 | 0.49 |
| 7 | j61 | 2.66 | j61 | 1.9 | j28 | 0.43 |
| 8 | j71 | 2.66 | j28 | 1.9 | j61 | 0.43 |
| 9 | j28 | 2.65 | j52 | 1.78 | j52 | 0.4 |
| 10 | j52 | 2.53 | j71 | 1.75 | j48 | 0.39 |
Here are the top three jokes:
Joke 88: “A Czechoslovakian man felt his eyesight was growing steadily worse, and felt it was time to go see an optometrist. The doctor started with some simple testing, and showed him a standard eye chart with letters of diminishing size: CRKBNWXSKZY … ‘Can you read this?’ the doctor asked. ‘Read it?’ the Czech answered. ‘Doc, I know him!’”
Joke 49: “Three engineering students were gathered together discussing the possible designers of the human body. One said, ‘It was a mechanical engineer. Just look at all the joints.’ Another said, ‘No, it was an electrical engineer. The nervous system’s many thousands of electrical connections.’ The last said, ‘Actually it was a civil engineer. Who else would run a toxic waste pipeline through a recreational area?’”
Joke 31: “President Clinton looks up from his desk in the Oval Office to see one of his aides nervously approach him. ‘What is it?’ exclaims the President. ‘It’s this Abortion Bill Mr. President, what do you want to do about it?’ the aide replies. ‘Just go ahead and pay it.’ responds the President.”
Likewise, the lowest rated jokes are consistent regardless of rating method.
# top N lowest-rated jokes
N <- 10
t <- sort(colMeans(r), decreasing = FALSE) %>% head(N)
t1 <- sort(colMeans(r_n1), decreasing = FALSE) %>% head(N)
t2 <- sort(colMeans(r_n2), decreasing = FALSE) %>% head(N)
topN <- cbind(Rank = 1:N,
Joke = names(t), Raw = round(t, 2),
Joke1 = names(t1), Centered = round(t1, 2),
Joke2 = names(t2), "Z-Score" = round(t2, 2)
)
rownames(topN) <- NULL
kable(topN, caption = "10 Lowest-Rated Jokes (Based on Raw, Centered, and Z-Score Ratings)")
| Rank | Joke | Raw | Joke1 | Centered | Joke2 | Z-Score |
|---|---|---|---|---|---|---|
| 1 | j57 | -3.57 | j57 | -4.25 | j57 | -0.96 |
| 2 | j15 | -2.89 | j15 | -3.64 | j15 | -0.87 |
| 3 | j14 | -2.18 | j14 | -2.94 | j14 | -0.7 |
| 4 | j43 | -1.87 | j43 | -2.54 | j12 | -0.58 |
| 5 | j56 | -1.81 | j56 | -2.5 | j43 | -0.56 |
| 6 | j23 | -1.78 | j12 | -2.5 | j56 | -0.55 |
| 7 | j12 | -1.74 | j23 | -2.41 | j23 | -0.52 |
| 8 | j36 | -1.41 | j73 | -2.3 | j73 | -0.52 |
| 9 | j32 | -1.41 | j32 | -2.1 | j16 | -0.49 |
| 10 | j73 | -1.4 | j36 | -2.1 | j19 | -0.48 |
Here are the three worst-rated jokes in the dataset:
Joke 57: “Why are there so many Jones’s in the phone book? Because they all have phones.”
Joke 15: “Q: What did the blind person say when given some matzah? A: Who the hell wrote this?”
Joke 14: “The father was very anxious to marry off his only daughter so he wanted to impress her date. ‘Do you like to screw,’ he says. ‘Huh’ replied the surprised first date. ‘My daughter she loves to screw and she’s good at it, you and her should go screw,’ carefully explained the father. Now very interested the boy replied, ‘Yes, sir.’ Minutes later the girl came down the stairs, kissed her father goodbye and the couple left. After only a few minutes she reappeared, furious, dress torn, hair a mess and screamed ‘Dammit, Daddy, it’s the TWIST, get it straight!’”
Because of the large size of the dataset, we create a smaller, more manageable dataset by random sampling, which allows more efficient model development on my PC. We randomly sample 2,000 users from the dataset, resulting in a 2,000-by-100 rating matrix with 145,509 ratings. Again the rating matrix is roughly 73% dense (27% sparse), out of 200,000 possible ratings.
# random sample of dataset
set.seed(42)
n_samp <- 2000
idx <- sort(sample(1:nrow(r), n_samp))
rs <- r[idx]
rs
## 2000 x 100 rating matrix of class 'realRatingMatrix' with 145509 ratings.
The distributions of rating count per user and per joke are similar to the prior distributions (seen above in the exploratory data analysis), other than the fact that we have a smaller set of 2,000 users rather than 23,500.
hist(rowCounts(rs), breaks = 40, main = "Distribution of Rating Count per User")
summary(rowCounts(rs))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 36.00 53.00 72.00 72.75 100.00 100.00
hist(colCounts(rs), breaks = 20, main = "Distribution of Rating Count per Joke")
summary(colCounts(rs))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 674 827 1597 1455 1988 2000
Now we test the recommenderlab infrastructure by creating and evaluating some sample recommender algorithms.
We start by creating a sample recommender system using the “POPULAR” algorithm (see the list of prediction algorithms later in this report). First we show the top 5 and top 3 recommended jokes for several users. Note that the top-N lists are returned already sorted in descending order of predicted preference.
## create recommender, show sample outputs
rec <- Recommender(rs, method = "POPULAR")
rec
## Recommender of type 'POPULAR' for 'realRatingMatrix'
## learned using 2000 users.
names(getModel(rec))
## [1] "topN" "ratings" "normalize"
## [4] "aggregationRatings" "aggregationPopularity" "verbose"
# sample predicted top5 list, for 5 users
rec_top5 <- predict(rec, rs[51:55], n = 5)
rec_top5
## Recommendations as 'topNList' with n = 5 for 5 users.
as(rec_top5, "list")
## $u529
## [1] "j88" "j71" "j92" "j75" "j27"
##
## $u556
## [1] "j88" "j71" "j92" "j75" "j87"
##
## $u563
## [1] "j88" "j71" "j92" "j75" "j87"
##
## $u583
## [1] "j88" "j11" "j71" "j37" "j92"
##
## $u586
## [1] "j88" "j71" "j92" "j75" "j87"
# sample predicted top3 list, for 5 users
rec_top3 <- predict(rec, rs[51:55], n = 3)
rec_top3
## Recommendations as 'topNList' with n = 3 for 5 users.
as(rec_top3, "list")
## $u529
## [1] "j88" "j71" "j92"
##
## $u556
## [1] "j88" "j71" "j92"
##
## $u563
## [1] "j88" "j71" "j92"
##
## $u583
## [1] "j88" "j11" "j71"
##
## $u586
## [1] "j88" "j71" "j92"
Second, we show predicted ratings for several users. In the ratings version, items a user has already rated are returned as NA (only unrated items receive predictions), while the ratingMatrix version includes the users’ existing ratings as well. Also note that the returned ratings are normalized using centering on rows.
# sample predicted ratings (NA for actual ratings), for 5 users
rec_rtg <- predict(rec, rs[51:55], type = "ratings")
rec_rtg
## 5 x 100 rating matrix of class 'realRatingMatrix' with 244 ratings.
as(rec_rtg, "matrix")[ , 1:10] %>% round(2)
## j0 j1 j2 j3 j4 j5 j6 j7 j8 j9
## u529 -0.34 -1.13 -1.25 -2.73 NA -0.26 NA NA -1.93 NA
## u556 NA NA NA NA NA NA NA NA NA NA
## u563 -1.31 NA -2.23 -3.71 NA NA NA NA -2.91 -1.28
## u583 -3.48 -4.27 -4.40 -5.88 NA -3.40 NA NA -5.08 -3.45
## u586 1.31 NA NA -1.08 NA 1.40 NA NA -0.28 NA
# sample predicted ratings (with actual ratings), for 5 users
rec_rtg1 <- predict(rec, rs[51:55], type = "ratingMatrix")
rec_rtg1
## 5 x 100 rating matrix of class 'realRatingMatrix' with 500 ratings.
as(rec_rtg1, "matrix")[ , 1:10] %>% round(2)
## j0 j1 j2 j3 j4 j5 j6 j7 j8 j9
## u529 -0.34 -1.13 -1.25 -2.73 -0.27 -0.26 -2.17 -3.53 -1.93 2.54
## u556 1.24 -0.46 0.36 -3.71 -6.92 -3.96 -3.52 -2.31 0.51 -4.59
## u563 -1.31 -6.73 -2.23 -3.71 3.80 7.93 10.26 -2.22 -2.91 -1.28
## u583 -3.48 -4.27 -4.40 -5.88 0.34 -3.40 -2.91 6.46 -5.08 -3.45
## u586 1.31 -5.90 -8.77 -1.08 5.21 1.40 -10.81 3.66 -0.28 -7.94
Third, we measure and compare error metrics of predicted ratings from competing models. Below we compare the UBCF (user-based collaborative filtering) and IBCF (item-based collaborative filtering) algorithms. To do this, we set up an evaluation scheme with a 90%/10% train/test split, 20 given ratings per test user, and a “good rating” threshold of 5.
Calculated error metrics include the root mean squared error (RMSE), the mean squared error (MSE), and the mean absolute error (MAE).
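For reference, these metrics are computed over the set $\mathcal{K}$ of held-out (user, item) pairs, where $\hat{r}_{ui}$ is the predicted rating and $r_{ui}$ the actual rating:

$$\mathrm{RMSE} = \sqrt{\frac{1}{|\mathcal{K}|}\sum_{(u,i)\in\mathcal{K}}\left(\hat{r}_{ui} - r_{ui}\right)^2}, \qquad \mathrm{MSE} = \mathrm{RMSE}^2, \qquad \mathrm{MAE} = \frac{1}{|\mathcal{K}|}\sum_{(u,i)\in\mathcal{K}}\left|\hat{r}_{ui} - r_{ui}\right|$$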
# sample errors of predicted ratings: UBCF vs IBCF
es <- evaluationScheme(rs, method = "split", train = 0.9, given = 20, goodRating = 5)
rec1 <- Recommender(getData(es, "train"), "UBCF")
rec1
## Recommender of type 'UBCF' for 'realRatingMatrix'
## learned using 1800 users.
rec2 <- Recommender(getData(es, "train"), "IBCF")
rec2
## Recommender of type 'IBCF' for 'realRatingMatrix'
## learned using 1800 users.
pred1 <- predict(rec1, getData(es, "known"), type = "ratings")
pred1
## 200 x 100 rating matrix of class 'realRatingMatrix' with 16000 ratings.
pred2 <- predict(rec2, getData(es, "known"), type = "ratings")
pred2
## 200 x 100 rating matrix of class 'realRatingMatrix' with 15994 ratings.
error <- rbind(
UBCF = calcPredictionAccuracy(pred1, getData(es, "unknown")),
IBCF = calcPredictionAccuracy(pred2, getData(es, "unknown"))
)
error %>% round(3) %>% kable(caption = "Evaluation of Rating Prediction Errors: UBCF vs. IBCF")
| RMSE | MSE | MAE | |
|---|---|---|---|
| UBCF | 4.523 | 20.454 | 3.584 |
| IBCF | 5.107 | 26.079 | 4.018 |
Finally, we evaluate a recommender system with 3 measures of model performance: the confusion matrix for the predicted top-N lists (true/false positives and negatives, with precision and recall), the ROC curve (true positive rate vs. false positive rate), and the precision-recall plot.
We illustrate with a top-N recommender system using the UBCF algorithm under the following evaluation scheme:
# sample evaluation of top-N recommender algorithm: UBCF
es1 <- evaluationScheme(rs, method = "cross", k = 4, given = 20, goodRating = 5)
es1
## Evaluation scheme with 20 items given
## Method: 'cross-validation' with 4 run(s).
## Good ratings: >=5.000000
## Data set: 2000 x 100 rating matrix of class 'realRatingMatrix' with 145509 ratings.
results <- evaluate(es1, method = "UBCF", type = "topNList",
n = c(1, 5, 10, 20, 30, 40, 50, 60, 80))
## UBCF run fold/sample [model time/prediction time]
## 1 [0.02sec/1.03sec]
## 2 [0.01sec/1.27sec]
## 3 [0.02sec/1.05sec]
## 4 [0.01sec/1.05sec]
results
## Evaluation results for 4 folds/samples using method 'UBCF'.
avg(results) %>% round(3) %>%
kable(caption = "Confusion Matrix (k=4): Top-N Lists for N = 1 to 80")
| TP | FP | FN | TN | precision | recall | TPR | FPR | |
|---|---|---|---|---|---|---|---|---|
| 1 | 0.414 | 0.586 | 13.160 | 65.839 | 0.414 | 0.042 | 0.042 | 0.008 |
| 5 | 1.924 | 3.075 | 11.650 | 63.350 | 0.385 | 0.193 | 0.193 | 0.045 |
| 10 | 3.543 | 6.457 | 10.032 | 59.968 | 0.354 | 0.339 | 0.339 | 0.094 |
| 20 | 5.878 | 14.122 | 7.696 | 52.303 | 0.294 | 0.512 | 0.512 | 0.207 |
| 30 | 7.468 | 22.532 | 6.107 | 43.894 | 0.249 | 0.613 | 0.613 | 0.333 |
| 40 | 8.834 | 31.166 | 4.741 | 35.259 | 0.221 | 0.698 | 0.698 | 0.462 |
| 50 | 10.076 | 39.924 | 3.498 | 26.502 | 0.202 | 0.770 | 0.770 | 0.594 |
| 60 | 11.293 | 48.707 | 2.281 | 17.719 | 0.188 | 0.840 | 0.840 | 0.727 |
| 80 | 13.575 | 66.425 | 0.000 | 0.000 | 0.170 | 1.000 | 1.000 | 1.000 |
plot(results, annotate = TRUE, main = "ROC Curve for UBCF Method")
plot(results, "prec/rec", annotate = TRUE, main = "Precision-Recall Plot for UBCF Method")
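For reference, the precision, recall, TPR, and FPR columns are computed per test user from the confusion-matrix counts as follows, and then averaged across users and folds:

$$\mathrm{precision} = \frac{TP}{TP + FP}, \qquad \mathrm{recall} = \mathrm{TPR} = \frac{TP}{TP + FN}, \qquad \mathrm{FPR} = \frac{FP}{FP + TN}$$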
In order to develop a recommender system for the Jester dataset, we need to decide which recommendation algorithm to use. The recommenderlab package offers a variety of algorithms to choose from. For this project, we consider the 5 algorithms below, some of which depend on certain parameters:
- RANDOM: random recommendations (a baseline)
- POPULAR: recommendations based on item popularity
- IBCF: item-based collaborative filtering
- UBCF: user-based collaborative filtering
- SVD: recommendations based on a rank-k SVD approximation of the rating matrix
Note that the IBCF, UBCF, and SVD algorithms each depend on one or more input parameters. What are the optimal parameter choices for each algorithm? To determine this, we evaluate the models using ROC curves and precision-recall plots for predicted top-N lists under a range of values for the k and nn parameters and different choices of similarity measure; we did not test the normalization method (centering vs. Z-score). Based on this analysis, we choose the following parameters:
- IBCF: k = 5 nearest items, with Pearson similarity
- UBCF: nn = 50 nearest neighbors, with Pearson similarity
- SVD: k = 10 latent factors
For development of the recommender models, we use an evaluation scheme similar to those used previously:
## compare recommender algorithms
scheme <- evaluationScheme(rs, method = "split", train = 0.9, given = 20, goodRating = 5)
scheme
## Evaluation scheme with 20 items given
## Method: 'split' with 1 run(s).
## Training set proportion: 0.900
## Good ratings: >=5.000000
## Data set: 2000 x 100 rating matrix of class 'realRatingMatrix' with 145509 ratings.
# compare IBCF for various k, similarity=cosine (default)
alg_ic <- list(
"ib5" = list(name="IBCF", param = list(k = 5)),
"ib15" = list(name="IBCF", param = list(k = 15)),
"ib30" = list(name="IBCF", param = list(k = 30)),
"ib50" = list(name="IBCF", param = list(k = 50))
)
results_ic <- evaluate(scheme, alg_ic, type = "topNList",
n = c(1, 5, 10, 20, 30, 40, 50, 60, 80))
## IBCF run fold/sample [model time/prediction time]
## 1 [0.14sec/0.03sec]
## IBCF run fold/sample [model time/prediction time]
## 1 [0.14sec/0.06sec]
## IBCF run fold/sample [model time/prediction time]
## 1 [0.13sec/0.04sec]
## IBCF run fold/sample [model time/prediction time]
## 1 [0.13sec/0.04sec]
# compare IBCF for various k, similarity=pearson
alg_ip <- list(
"ib5" = list(name="IBCF", param = list(k = 5, method="pearson")),
"ib15" = list(name="IBCF", param = list(k = 15, method="pearson")),
"ib30" = list(name="IBCF", param = list(k = 30, method="pearson")),
"ib50" = list(name="IBCF", param = list(k = 50, method="pearson"))
)
results_ip <- evaluate(scheme, alg_ip, type = "topNList",
n = c(1, 5, 10, 20, 30, 40, 50, 60, 80))
## IBCF run fold/sample [model time/prediction time]
## 1 [0.17sec/0.03sec]
## IBCF run fold/sample [model time/prediction time]
## 1 [0.17sec/0.04sec]
## IBCF run fold/sample [model time/prediction time]
## 1 [0.18sec/0.04sec]
## IBCF run fold/sample [model time/prediction time]
## 1 [0.17sec/0.04sec]
par(mfrow = c(1, 2))
# plot ROC curves
plot(results_ic, annotate = 2, legend = "bottomright")
title(main = "ROC: IBCF-Cosine")
plot(results_ip, annotate = 2, legend = "bottomright")
title(main = "ROC: IBCF-Pearson")
# plot precision-recall plots
plot(results_ic, "prec/rec", annotate = 2, legend = "bottomright", ylim = c(0, 0.3))
title(main = "Prec-Recall: IBCF-Cosine")
plot(results_ip, "prec/rec", annotate = 2, legend = "bottomright", ylim = c(0, 0.3))
title(main = "Prec-Recall: IBCF-Pearson")
# compare UBCF for various nearest neighbors, similarity=cosine (default)
alg_uc <- list(
"ub5" = list(name="UBCF", param = list(nn = 5)),
"ub15" = list(name="UBCF", param = list(nn = 15)),
"ub25" = list(name="UBCF", param = list(nn = 25)),
"ub50" = list(name="UBCF", param = list(nn = 50))
)
results_uc <- evaluate(scheme, alg_uc, type = "topNList",
n = c(1, 5, 10, 20, 30, 40, 50, 60, 80))
## UBCF run fold/sample [model time/prediction time]
## 1 [0.02sec/0.43sec]
## UBCF run fold/sample [model time/prediction time]
## 1 [0.02sec/0.44sec]
## UBCF run fold/sample [model time/prediction time]
## 1 [0.01sec/0.44sec]
## UBCF run fold/sample [model time/prediction time]
## 1 [0.02sec/0.46sec]
# compare UBCF for various nearest neighbors, similarity=pearson
alg_up <- list(
"ub5" = list(name="UBCF", param = list(nn = 5, method="pearson")),
"ub15" = list(name="UBCF", param = list(nn = 15, method="pearson")),
"ub25" = list(name="UBCF", param = list(nn = 25, method="pearson")),
"ub50" = list(name="UBCF", param = list(nn = 50, method="pearson"))
)
results_up <- evaluate(scheme, alg_up, type = "topNList",
n = c(1, 5, 10, 20, 30, 40, 50, 60, 80))
## UBCF run fold/sample [model time/prediction time]
## 1 [0.02sec/0.48sec]
## UBCF run fold/sample [model time/prediction time]
## 1 [0.03sec/0.45sec]
## UBCF run fold/sample [model time/prediction time]
## 1 [0.02sec/0.49sec]
## UBCF run fold/sample [model time/prediction time]
## 1 [0.02sec/0.51sec]
# plot ROC curves
plot(results_uc, annotate = 4, legend = "bottomright")
title(main = "ROC: UBCF-Cosine")
plot(results_up, annotate = 4, legend = "bottomright")
title(main = "ROC: UBCF-Pearson")
# plot precision-recall plots
plot(results_uc, "prec/rec", annotate = 4, legend = "bottomleft", ylim = c(0, 0.4))
title(main = "Prec-Recall: UBCF-Cosine")
plot(results_up, "prec/rec", annotate = 4, legend = "bottomleft", ylim = c(0, 0.4))
title(main = "Prec-Recall: UBCF-Pearson")
# compare SVD for various k
alg_s <- list(
"svd3" = list(name="SVD", param = list(k = 3)),
"svd5" = list(name="SVD", param = list(k = 5)),
"svd10" = list(name="SVD", param = list(k = 10)),
"svd20" = list(name="SVD", param = list(k = 20))
)
results_s <- evaluate(scheme, alg_s, type = "topNList",
n = c(1, 5, 10, 20, 30, 40, 50, 60, 80))
## SVD run fold/sample [model time/prediction time]
## 1 [0.05sec/0.03sec]
## SVD run fold/sample [model time/prediction time]
## 1 [0.05sec/0.03sec]
## SVD run fold/sample [model time/prediction time]
## 1 [0.04sec/0.03sec]
## SVD run fold/sample [model time/prediction time]
## 1 [0.08sec/0.04sec]
par(mfrow = c(1, 1))
plot(results_s, annotate = 3, legend = "bottomright")
title(main = "ROC Curve: SVD")
plot(results_s, "prec/rec", annotate = 3, legend = "bottomleft", ylim = c(0, 0.4))
title(main = "Precision-Recall Plot: SVD")
Having chosen the algorithm parameters, we can now compare model performance across the 5 algorithms. In our case, recommenders based on the UBCF and POPULAR algorithms produce the best performance, as measured by ROC curves, precision-recall plots, and error metrics.
# now compare different algorithms with optimal parameters
algorithms <- list(
"random items" = list(name="RANDOM", param = NULL),
"popular items" = list(name="POPULAR", param = NULL),
"item-based CF" = list(name="IBCF", param = list(k = 5, method = "pearson")),
"user-based CF" = list(name="UBCF", param = list(nn = 50, method = "pearson")),
"SVD approximation" = list(name="SVD", param = list(k = 10))
)
# run algorithms for top-N list
results <- evaluate(scheme, algorithms, type = "topNList",
n = c(1, 5, 10, 20, 30, 40, 50, 60, 80))
## RANDOM run fold/sample [model time/prediction time]
## 1 [0sec/0.06sec]
## POPULAR run fold/sample [model time/prediction time]
## 1 [0.02sec/0.31sec]
## IBCF run fold/sample [model time/prediction time]
## 1 [0.16sec/0.03sec]
## UBCF run fold/sample [model time/prediction time]
## 1 [0.02sec/0.45sec]
## SVD run fold/sample [model time/prediction time]
## 1 [0.04sec/0.04sec]
results
## List of evaluation results for 5 recommenders:
## Evaluation results for 1 folds/samples using method 'RANDOM'.
## Evaluation results for 1 folds/samples using method 'POPULAR'.
## Evaluation results for 1 folds/samples using method 'IBCF'.
## Evaluation results for 1 folds/samples using method 'UBCF'.
## Evaluation results for 1 folds/samples using method 'SVD'.
plot(results, annotate = 4, legend = "bottomright")
title(main = "Comparison of ROC Curves")
plot(results, "prec/rec", annotate = 4, legend = "topright", ylim = c(0, 0.4))
title(main = "Comparison of Precision-Recall Plots")
# run algorithms for predicted ratings
results <- evaluate(scheme, algorithms, type = "ratings")
## RANDOM run fold/sample [model time/prediction time]
## 1 [0sec/0.01sec]
## POPULAR run fold/sample [model time/prediction time]
## 1 [0.02sec/0.01sec]
## IBCF run fold/sample [model time/prediction time]
## 1 [0.16sec/0.01sec]
## UBCF run fold/sample [model time/prediction time]
## 1 [0.02sec/0.44sec]
## SVD run fold/sample [model time/prediction time]
## 1 [0.05sec/0.02sec]
results
## List of evaluation results for 5 recommenders:
## Evaluation results for 1 folds/samples using method 'RANDOM'.
## Evaluation results for 1 folds/samples using method 'POPULAR'.
## Evaluation results for 1 folds/samples using method 'IBCF'.
## Evaluation results for 1 folds/samples using method 'UBCF'.
## Evaluation results for 1 folds/samples using method 'SVD'.
plot(results)
title(main = "Comparison of Model Error vs. Actual Ratings")
plot(results, ylim = c(0, 10))
title(main = "Same Comparison: Close-up")
The recommenderlab package offers a wide variety of tools and algorithms to test and develop recommender models. In this project we explored some of these features using a subset of the Jester dataset. Some conclusions from our work include:
- Recommenders based on the UBCF and POPULAR algorithms produced the best overall performance on this dataset, as measured by ROC curves, precision-recall plots, and rating-prediction errors.
- In the head-to-head rating-prediction comparison, UBCF outperformed IBCF (RMSE of roughly 4.5 vs. 5.1).
- Parameter tuning matters: among the settings tested, we selected Pearson similarity with k = 5 for IBCF, nn = 50 for UBCF, and k = 10 latent factors for SVD.
- The Jester ratings show a consistent positive bias, with more ratings above 0 than below, even after normalization.
Some suggestions for future work include:
- Explore other normalization approaches, although the recommenderlab documentation indicates that centering and Z-score are the only methods currently implemented.
- Build hybrid recommenders with the HybridRecommender function in recommenderlab, and evaluate their performance vs. the standalone algorithms. For instance, try combinations of UBCF, IBCF, and SVD to see if a hybrid model can out-perform POPULAR.
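For example, a weighted hybrid of the three personalized algorithms might look something like the sketch below; the weights are arbitrary placeholders, and the code assumes the scheme split defined earlier and the HybridRecommender interface described in the recommenderlab documentation.
# sketch: weighted hybrid of UBCF, IBCF, and SVD (illustrative weights only)
rec_hybrid <- HybridRecommender(
    Recommender(getData(scheme, "train"), "UBCF", parameter = list(nn = 50, method = "pearson")),
    Recommender(getData(scheme, "train"), "IBCF", parameter = list(k = 5, method = "pearson")),
    Recommender(getData(scheme, "train"), "SVD",  parameter = list(k = 10)),
    weights = c(0.5, 0.2, 0.3)
)
pred_hybrid <- predict(rec_hybrid, getData(scheme, "known"), type = "ratings")
calcPredictionAccuracy(pred_hybrid, getData(scheme, "unknown"))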