For this simple exploration of recommender systems using recommenderlab in R, I focused on getting the techniques down properly and on tuning the algorithms through `params = list(normalize = "some-method", method = "some-method")`, so it was neither advantageous nor necessary to bring in more than a single ratings predictor. With that in mind, I chose `rating`, the overall rating for each restaurant in the data set. I hope to revisit this model in the next assignment with more variables to see how they improve predictions for such a small set.
To see whether different dissimilarity metrics and normalization methods enhanced performance, I tested across both as follows.

- Dissimilarity methods: Pearson, Cosine, Jaccard
- Normalization methods: center, z-score

Three separate error metrics were calculated for each dissimilarity/normalization pair to see if they varied in magnitude or direction of change as the different models were explored.

Error metrics: RMSE, MSE, MAE
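For reference, the three metrics relate to one another as follows; a minimal sketch with hypothetical predicted and observed rating vectors (not from this dataset):

# Hypothetical predictions and observed ratings, purely for illustration
pred   <- c(1.2, 1.8, 0.4, 1.9)
actual <- c(1,   2,   1,   2)

err  <- pred - actual
MAE  <- mean(abs(err))  # average miss in rating points, direction ignored
MSE  <- mean(err^2)     # squaring penalizes large misses more heavily
RMSE <- sqrt(MSE)       # back on the original rating scale; RMSE = sqrt(MSE)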
Observations: 1161
Features: 5

| Field | Type | Levels | Possible Values |
|---|---|---|---|
| userID | character | 138 | |
| placeID | character | 130 | |
| rating | numeric | 3 | 0, 1, 2 |
| food_rating | numeric | 3 | 0, 1, 2 |
| service_rating | numeric | 3 | 0, 1, 2 |
# Reading data
library(dplyr)
library(tidyr)
library(recommenderlab)

food <- read.csv("data/rating_final.csv")

# Subsetting for items, users and overall rating
food_for_rec <- food[, c("userID", "placeID", "rating")]

# Renaming for simplicity (and agreement with `recommenderlab` conventions)
colnames(food_for_rec) <- c("user", "item", "rating")

# Going wide: one row per user, one column per restaurant
food_for_rec <- food_for_rec %>% spread(key = item, value = rating)

# Extracting user IDs for indexing, then dropping the ID column so the
# matrix stays numeric
mat_names <- food_for_rec$user
restaurant_matrix <- as.matrix(food_for_rec[, -1])

# Adding index to matrix (user IDs)
rownames(restaurant_matrix) <- mat_names

# Making the `recommenderlab` matrix
restaurants <- as(restaurant_matrix, "realRatingMatrix")
This is a very sparse matrix: the users are spread across different geographic areas, so very few of our reviewers have reviewed more than 5-8 restaurants. Accordingly, only about 5% of the fields hold a review of 1 or 2. There may be challenges developing a useful predictor with so few reviewers and reviews for 130 locations, so it will be important to test the minimum review count to see if it significantly affects the results.
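As a quick check on that claim, a minimal sketch using recommenderlab's accessors on the `restaurants` matrix built above:

# Proportion of user x restaurant cells that hold any rating at all
length(getRatings(restaurants)) / prod(dim(restaurants))

# Distribution of reviews per reviewer
summary(rowCounts(restaurants))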
For the following comparisons of item-based and user-based recommenders using recommenderlab, I tested effectiveness using 5, 6, and 7 as the minimum reviews per reviewer, to see whether the density of reviews or the number of reviewers was more influential on the results.
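To see how much of the reviewer pool each threshold retains, a short sketch (the evaluation function below keeps rows with more than `min_reviews` ratings):

# Reviewers remaining at each minimum-review cutoff
sapply(5:7, function(k) sum(rowCounts(restaurants) > k))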
| Rating | Percent of Matrix |
|---|---|
| 0 - No Review | 94.90 |
| 1 | 2.36 |
| 2 | 2.73 |
# Distribution plot, zeros removed
plot_ratings <- spread(food[, c("placeID", "userID", "rating")],
    key = placeID, value = rating, fill = 0)
plot_ratings <- as.numeric(as.matrix(plot_ratings[, 2:ncol(plot_ratings)]))

# Share of each rating value and the count of zero-filled (empty) cells
props <- prop.table(table(plot_ratings))
zero_ratings <- length(plot_ratings[plot_ratings == 0])

hist(plot_ratings[plot_ratings > 0], breaks = c(0.75, 1.25, 1.75, 2.25),
    main = "Distribution of Ratings", col = "maroon", xlab = "Ratings",
    xlim = c(0, 3))
# Heatmap of the first 50 reviewers and restaurants (normalized ratings)
image(normalize(restaurants)[1:50, 1:50],
    main = "First 50 Reviewers & First 50 Restaurants",
    xlab = "Restaurants", ylab = "Reviewers")
# Function which subsets to reviewers with more than `min_reviews` ratings,
# splits the data, fits IBCF and UBCF models with the given normalization
# and similarity method, and returns the prediction errors for both
test_ibcf_ubcf_params <- function(matrix, train_ratio, min_reviews,
                                  normal = "center", method = "pearson") {
    set.seed(11)
    matrix <- matrix[rowCounts(matrix) > min_reviews, ]
    eval <- evaluationScheme(matrix, method = "split", train = train_ratio,
        given = min_reviews, goodRating = 2)
    params <- list(normalize = normal, method = method)
    item_eval <- Recommender(getData(eval, "train"), "IBCF", param = params)
    user_eval <- Recommender(getData(eval, "train"), "UBCF", param = params)
    item_pred <- predict(item_eval, getData(eval, "known"), type = "ratings")
    user_pred <- predict(user_eval, getData(eval, "known"), type = "ratings")
    error <- rbind(UBCF = calcPredictionAccuracy(user_pred, getData(eval, "unknown")),
        IBCF = calcPredictionAccuracy(item_pred, getData(eval, "unknown")))
    error
}
# Calling function with z-score normalization, Pearson distance metric,
# 80% train / 20% test, and 7 minimum reviews per user
(example <- test_ibcf_ubcf_params(restaurants, 0.8, 7, "z-score", "pearson"))
RMSE MSE MAE
UBCF 0.6659594 0.4435019 0.6527636
IBCF 0.4642387 0.2155176 0.3356630
A model was built using a full 80% training and 20% testing split on the matrix, subset to 5, 6, or 7 minimum reviews per reviewer (passed as `given` to the evaluation scheme). After subsetting the matrix by reviews per reviewer, models were built comparing the center and z-score scaling methods, using both user-based and item-based recommenders with the Pearson, Cosine, and Jaccard similarity metrics.
Pearson, 5-Reviewer, Center Scaled

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.5420853 | 0.2938565 | 0.4582473 |
| IBCF | 0.5904402 | 0.3486197 | 0.4426859 |

Pearson, 6-Reviewer, Center Scaled

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.5627917 | 0.3167345 | 0.4945736 |
| IBCF | 0.3700895 | 0.1369662 | 0.2510929 |

Pearson, 7-Reviewer, Center Scaled

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.6601833 | 0.4358419 | 0.645840 |
| IBCF | 0.4642387 | 0.2155176 | 0.335663 |

Pearson, 5-Reviewer, Z-Score Scaled

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.5307169 | 0.2816605 | 0.4506577 |
| IBCF | 0.5904402 | 0.3486197 | 0.4426859 |

Pearson, 6-Reviewer, Z-Score Scaled

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.5013587 | 0.2513605 | 0.4764868 |
| IBCF | 0.3700895 | 0.1369662 | 0.2510929 |

Pearson, 7-Reviewer, Z-Score Scaled

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.6659594 | 0.4435019 | 0.6527636 |
| IBCF | 0.4642387 | 0.2155176 | 0.3356630 |
The best of the center-scaled Pearson models:

Pearson, 6-Reviewer, Center Scaled, Item-Based

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.5627917 | 0.3167345 | 0.4945736 |
| IBCF | 0.3700895 | 0.1369662 | 0.2510929 |
The best of the z-score-scaled Pearson models:

Pearson, 6-Reviewer, Z-Score Scaled, Item-Based

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.5013587 | 0.2513605 | 0.4764868 |
| IBCF | 0.3700895 | 0.1369662 | 0.2510929 |
Cosine, 5-Reviewer, Center Scaled

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.4603170 | 0.2118917 | 0.3549880 |
| IBCF | 0.6635908 | 0.4403528 | 0.5162437 |

Cosine, 6-Reviewer, Center Scaled

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.4389073 | 0.1926396 | 0.3809020 |
| IBCF | 0.4825190 | 0.2328246 | 0.3351627 |

Cosine, 7-Reviewer, Center Scaled

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.4783140 | 0.2287843 | 0.4396375 |
| IBCF | 0.5612189 | 0.3149666 | 0.4478474 |

Cosine, 5-Reviewer, Z-Score Scaled

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.4576094 | 0.2094063 | 0.3538655 |
| IBCF | 0.6635908 | 0.4403528 | 0.5162437 |

Cosine, 6-Reviewer, Z-Score Scaled

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.4392227 | 0.1929165 | 0.3866922 |
| IBCF | 0.4825190 | 0.2328246 | 0.3351627 |

Cosine, 7-Reviewer, Z-Score Scaled

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.4774789 | 0.2279861 | 0.4393303 |
| IBCF | 0.5612189 | 0.3149666 | 0.4478474 |
The best of the center-scaled cosine models:

Cosine, 6-Reviewer, Center Scaled, User-Based

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.4389073 | 0.1926396 | 0.3809020 |
| IBCF | 0.4825190 | 0.2328246 | 0.3351627 |
The best of the z-score-scaled cosine models:

Cosine, 6-Reviewer, Z-Score Scaled, User-Based

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.4392227 | 0.1929165 | 0.3866922 |
| IBCF | 0.4825190 | 0.2328246 | 0.3351627 |
Jaccard, 5-Reviewer, Center Scaled

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.4586139 | 0.2103268 | 0.3568262 |
| IBCF | 0.6102080 | 0.3723538 | 0.4634503 |

Jaccard, 6-Reviewer, Center Scaled

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.4428941 | 0.1961551 | 0.3827466 |
| IBCF | 0.4944291 | 0.2444602 | 0.3713836 |

Jaccard, 7-Reviewer, Center Scaled

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.4804685 | 0.2308499 | 0.4432246 |
| IBCF | 0.4101490 | 0.1682222 | 0.3228571 |

Jaccard, 5-Reviewer, Z-Score Scaled

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.454166 | 0.2062668 | 0.3527740 |
| IBCF | 0.610208 | 0.3723538 | 0.4634503 |

Jaccard, 6-Reviewer, Z-Score Scaled

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.4408108 | 0.1943141 | 0.3877180 |
| IBCF | 0.4944291 | 0.2444602 | 0.3713836 |

Jaccard, 7-Reviewer, Z-Score Scaled

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.4789744 | 0.2294165 | 0.4417319 |
| IBCF | 0.4101490 | 0.1682222 | 0.3228571 |
The Jaccard methods were universally the worst. With center scaling, error was lowest for the user-based model at 6 reviewers and lowest for the item-based model at 7 reviewers. With z-score scaling, 6 reviews gave the best user-based results, slightly outperforming item-based in RMSE and MSE but slightly worse, though not far off, in MAE.
Jaccard, 7-Reviewer, Center Scaled, Item-Based

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.4804685 | 0.2308499 | 0.4432246 |
| IBCF | 0.4101490 | 0.1682222 | 0.3228571 |

Jaccard, 7-Reviewer, Z-Score Scaled, Item-Based

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.4789744 | 0.2294165 | 0.4417319 |
| IBCF | 0.4101490 | 0.1682222 | 0.3228571 |
This comparison was interesting. Comparing the Pearson, Cosine, and Jaccard metrics, it is clear that you could build a decent recommender with sufficient data using any of the three. The errors ranged from about .1 to .65 points on a 2-point scale. I would not say that .65 is trivial error when predicting between 1 and 2, but the .2 range seems reasonable given 138 reviewers and 130 restaurants with 907 ratings and 16897 empty slots. This seems like an honorable start which could be improved upon with more reviews.

For the final selection, choosing for a low mean absolute error seems to make the most sense, as it is the average error without regard for over- or under-shooting. This is more interpretable than trying to back-calculate from the mean squared error or root mean squared error. It also makes sense that, if several models are very close, the one requiring the fewest reviews to train should win, since more observations are preserved for training and testing, which should make the model more generalizable.

In this case, the Pearson item-based models with 6 reviewers (center and z-score scaled) are equally good and are our first choice over the Jaccard models with similar scores, because the latter required a minimum of 7 reviews, reducing the number of observations in the model.
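Given that selection, here is a minimal sketch of how the chosen model might be trained on the full subset and used for top-N recommendations; the choice of center normalization and the use of the first reviewer are illustrative assumptions, not part of the original analysis:

# Train the selected model: IBCF, Pearson similarity, center normalization,
# on reviewers with more than 6 ratings
final_data <- restaurants[rowCounts(restaurants) > 6, ]
final_rec <- Recommender(final_data, "IBCF",
    param = list(normalize = "center", method = "pearson"))

# Top-5 restaurant recommendations for the first reviewer in the subset
top5 <- predict(final_rec, final_data[1, ], n = 5)
as(top5, "list")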
The final candidates, for reference:

Pearson, 6-Reviewer, Center Scaled, Item-Based

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.5627917 | 0.3167345 | 0.4945736 |
| IBCF | 0.3700895 | 0.1369662 | 0.2510929 |

Pearson, 6-Reviewer, Z-Score Scaled, Item-Based

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.5013587 | 0.2513605 | 0.4764868 |
| IBCF | 0.3700895 | 0.1369662 | 0.2510929 |
Cosine, 6-Reviewer, Center Scaled, User-Based

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.4389073 | 0.1926396 | 0.3809020 |
| IBCF | 0.4825190 | 0.2328246 | 0.3351627 |

Cosine, 6-Reviewer, Z-Score Scaled, User-Based

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.4392227 | 0.1929165 | 0.3866922 |
| IBCF | 0.4825190 | 0.2328246 | 0.3351627 |
Jaccard, 7-Reviewer, Center Scaled, Item-Based

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.4804685 | 0.2308499 | 0.4432246 |
| IBCF | 0.4101490 | 0.1682222 | 0.3228571 |

Jaccard, 7-Reviewer, Z-Score Scaled, Item-Based

| | RMSE | MSE | MAE |
|---|---|---|---|
| UBCF | 0.4789744 | 0.2294165 | 0.4417319 |
| IBCF | 0.4101490 | 0.1682222 | 0.3228571 |
Having created a function up front that took all the parameters, the comparison was just a matter of passing parameters into it. I could have automated this, but I wanted the ability to evaluate each configuration one at a time and check my results repeatedly, without running a whole list and having to parse out a single set of errors. So this is just a long series of function calls (a sketch of an automated version follows the calls below); everything else used here was incorporated into that function up front.
test_ibcf_ubcf_params(restaurants, 0.8, 6, "center", "pearson")
test_ibcf_ubcf_params(restaurants, 0.8, 6, "z-score", "pearson")
test_ibcf_ubcf_params(restaurants, 0.8, 6, "center", "cosine")
test_ibcf_ubcf_params(restaurants, 0.8, 6, "z-score", "cosine")
test_ibcf_ubcf_params(restaurants, 0.8, 6, "center", "jaccard")
test_ibcf_ubcf_params(restaurants, 0.8, 6, "z-score", "jaccard")
test_ibcf_ubcf_params(restaurants, 0.8, 5, "center", "pearson")
test_ibcf_ubcf_params(restaurants, 0.8, 5, "z-score", "pearson")
test_ibcf_ubcf_params(restaurants, 0.8, 5, "center", "cosine")
test_ibcf_ubcf_params(restaurants, 0.8, 5, "z-score", "cosine")
test_ibcf_ubcf_params(restaurants, 0.8, 5, "center", "jaccard")
test_ibcf_ubcf_params(restaurants, 0.8, 5, "z-score", "jaccard")
test_ibcf_ubcf_params(restaurants, 0.8, 7, "center", "pearson")
test_ibcf_ubcf_params(restaurants, 0.8, 7, "z-score", "pearson")
test_ibcf_ubcf_params(restaurants, 0.8, 7, "center", "cosine")
test_ibcf_ubcf_params(restaurants, 0.8, 7, "z-score", "cosine")
test_ibcf_ubcf_params(restaurants, 0.8, 7, "center", "jaccard")
test_ibcf_ubcf_params(restaurants, 0.8, 7, "z-score", "jaccard")
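Had I automated it, one possible sketch of the loop, using base R's `expand.grid` and `Map` over the same parameter grid (the `grid` and `results` names are my own, not from the original run):

# Every (min_reviews, normalization, similarity) combination tested above
grid <- expand.grid(min_reviews = 5:7,
    normal = c("center", "z-score"),
    method = c("pearson", "cosine", "jaccard"),
    stringsAsFactors = FALSE)

# Collect all error tables in one list instead of printing them one at a time
results <- Map(function(k, n, m) test_ibcf_ubcf_params(restaurants, 0.8, k, n, m),
    grid$min_reviews, grid$normal, grid$method)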
The full code is available with this .Rmd file in the GitHub repository should you choose to repeat any of this. I did change my seed at some point and did not note the original one, so if I did not guess correctly you will likely get similar, but not exactly the same, results in your run, as I worked from the R Markdown cache for most of this site development.
This dataset was originally downloaded from the UCI Machine Learning Repository: Restaurant & Consumer Data Sets.

Creators: Rafael Ponce Medellín and Juan Gabriel González Serna (rafaponce@cenidet.edu.mx, gabriel@cenidet.edu.mx), Department of Computer Science, National Center for Research and Technological Development (CENIDET), México.

Blanca Vargas-Govea, Juan Gabriel González-Serna, Rafael Ponce-Medellín. Effects of relevant contextual features in the performance of a restaurant recommender system. In RecSys'11: Workshop on Context Aware Recommender Systems (CARS-2011), Chicago, IL, USA, October 23, 2011.