library(tidyverse)
library(kableExtra)
library(knitr)
library(recommenderlab)
library(dplyr)
library(ggplot2)
library(ggrepel)
library(tictoc)

The goal of this assignment is to give you practice working with accuracy and other recommender system metrics.
In this assignment you're asked to do at least one or (if you like) both of the following:

- Work in a small group, and/or
- Choose a different dataset to work with from your previous projects.

Then:

1. As in your previous assignments, compare the accuracy of at least two recommender system algorithms against your offline data.
2. Implement support for at least one business or user experience goal such as increased serendipity, novelty, or diversity.
3. Compare and report on any change in accuracy before and after you've made the change in #2.
4. As part of your textual conclusion, discuss one or more additional experiments that could be performed and/or metrics that could be evaluated only if online evaluation was possible. Also, briefly propose how you would design a reasonable online evaluation environment.
For this project, I will be using a subset of the data produced by Amazon Open Data.
> Amazon Customer Reviews (a.k.a. Product Reviews) is one of Amazon's iconic products. In a period of over two decades since the first review in 1995, millions of Amazon customers have contributed over a hundred million reviews to express opinions and describe their experiences regarding products on the Amazon.com website. This makes Amazon Customer Reviews a rich source of information for academic researchers in the fields of Natural Language Processing (NLP), Information Retrieval (IR), and Machine Learning (ML), amongst others. Accordingly, we are releasing this data to further research in multiple disciplines related to understanding customer product experiences. Specifically, this dataset was constructed to represent a sample of customer evaluations and opinions, variation in the perception of a product across geographical regions, and promotional intent or bias in reviews.
Since there are tons of products on Amazon, I decided to focus on Video Game Reviews.
I posted my partial extract of the Amazon Video Game Product Reviews dataset to my GitHub.
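The loading step isn't reproduced in this extract. Assuming the posted file is a CSV with `UserId`, `ProductId`, and `Rating` columns (the URL below is a placeholder, not the real repository path), reading it would look something like:

```r
# Placeholder URL -- substitute the actual raw CSV path from the repository
vg_ratings <- read_csv("https://raw.githubusercontent.com/<user>/<repo>/main/vg_reviews.csv")

# UserId and ProductId need to be factors: the sparse-matrix step below
# relies on their as.integer() codes and on levels()
vg_ratings <- vg_ratings %>%
  mutate(UserId = as.factor(UserId), ProductId = as.factor(ProductId))
```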
# Sparse Matrix
vg_ratings_matrix <- sparseMatrix(as.integer(vg_ratings$UserId), as.integer(vg_ratings$ProductId), x = vg_ratings$Rating)
colnames(vg_ratings_matrix) <- levels(vg_ratings$ProductId)
rownames(vg_ratings_matrix) <- levels(vg_ratings$UserId)

In order for the recommendation system to work, I had to build the sparse matrix manually. The raw data was far too sparse, so I also filtered the ratings to keep only users with a "sufficient" number of reviews.
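The `evaluationScheme()` call below expects `vg_real_matrix`, a recommenderlab `realRatingMatrix`, while the code above produces a plain `dgCMatrix`; the intermediate conversion isn't shown in the extract, but it would amount to:

```r
# Coerce the sparse dgCMatrix into recommenderlab's rating-matrix class
vg_real_matrix <- as(vg_ratings_matrix, "realRatingMatrix")
```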
# Setup
set.seed(123)
e <- evaluationScheme(vg_real_matrix, method = "split", train = 0.8, given=3, goodRating = 3)
train <- getData(e, "train")
known <- getData(e, "known")
unknown <- getData(e, "unknown")

Next, I set up a data frame to hold the training and prediction times of each model.
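That data frame isn't shown in the extract; a minimal version consistent with the `timer` object used later would be:

```r
# Holds elapsed seconds per model; filled in from each tic()/toc() pair below
timer <- data.frame(
  Model      = c("UBCF", "RANDOM", "SVD"),
  Training   = NA_real_,
  Predicting = NA_real_
)
```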
# Training: UBCF
tic()
UBCF_model <- Recommender(train, method = "UBCF")
t <- toc(quiet = TRUE)
tt <- round(t$toc - t$tic, 2)

# Training: Random
tic()
Random_model <- Recommender(train, method = "RANDOM")
t <- toc(quiet = TRUE)
tt <- round(t$toc - t$tic, 2)

# Training: SVD
tic()
SVD_model <- Recommender(train, method = "SVD", parameter = list(k = 50))
t <- toc(quiet = TRUE)
tt <- round(t$toc - t$tic, 2)
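The `UBCF_accuracy`, `Random_accuracy`, and `SVD_accuracy` objects combined below aren't defined anywhere in this extract; presumably each was produced by predicting ratings for the `known` portion of the test users and scoring against the `unknown` ratings, along these lines:

```r
# Predict ratings for held-out users, then compute RMSE/MSE/MAE against
# the ratings that were withheld from the model
UBCF_pred       <- predict(UBCF_model, known, type = "ratings")
UBCF_accuracy   <- calcPredictionAccuracy(UBCF_pred, unknown)

Random_pred     <- predict(Random_model, known, type = "ratings")
Random_accuracy <- calcPredictionAccuracy(Random_pred, unknown)

SVD_pred        <- predict(SVD_model, known, type = "ratings")
SVD_accuracy    <- calcPredictionAccuracy(SVD_pred, unknown)
```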
# Accuracy of Models
model_accuracy <- rbind(UBCF_accuracy, Random_accuracy, SVD_accuracy)
rownames(model_accuracy) <- c("UBCF", "Random", "SVD")
knitr::kable(model_accuracy, format = "html") %>%
  kableExtra::kable_styling(bootstrap_options = c("striped", "hover"))

|        |     RMSE |      MSE |       MAE |
|:-------|---------:|---------:|----------:|
| UBCF   | 1.145340 | 1.311803 | 0.8212842 |
| Random | 1.390219 | 1.932708 | 0.9893071 |
| SVD    | 1.152275 | 1.327738 | 0.8095514 |
When reviewing the accuracy results for the three models, UBCF and SVD are very close: UBCF has slightly lower RMSE and MSE, while SVD has a slightly lower MAE. The Random model, unsurprisingly, performs noticeably worse than the other two.
# Runtimes
mls <- list(
"UBCF" = list(name = "UBCF", param = NULL),
"Random" = list(name = "RANDOM", param = NULL),
"SVD" = list(name = "SVD", param = list(k = 50))
)
results_eval <- evaluate(x = e, method = mls, n = c(1, 5, 10, 30, 60))

## UBCF run fold/sample [model time/prediction time]
## 1 [0sec/155.14sec]
## RANDOM run fold/sample [model time/prediction time]
## 1 [0sec/5.04sec]
## SVD run fold/sample [model time/prediction time]
## 1 [11.62sec/5.52sec]
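The ROC plot itself didn't survive in this extract; with recommenderlab it would be drawn directly from the evaluation results, roughly:

```r
# ROC curves (TPR vs. FPR) for the three algorithms at n = 1, 5, 10, 30, 60
plot(results_eval, "ROC", annotate = TRUE, legend = "topleft")
```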
The ROC curves paint a different picture regarding the UBCF model. While UBCF sits at the top of the curves, this recommender demands a lot of computational horsepower. For Amazon, computation cost might not be much of a worry these days; for everyone else, it may not be the most suitable solution.
# Recorded training and prediction times
rownames(timer) <- timer$Model
knitr::kable(timer[, 2:3], format = "html") %>%
  kableExtra::kable_styling(bootstrap_options = c("striped", "hover"))

|        | Training (sec) | Predicting (sec) |
|:-------|---------------:|-----------------:|
| UBCF   |           0.01 |           160.21 |
| RANDOM |           0.00 |             2.96 |
| SVD    |          12.69 |             3.61 |
It is always important to consider training and prediction times. According to the table above:

- The UBCF model can be trained almost instantly; however, its predictions take a long time.
- The Random model is also efficient, but it is outperformed on accuracy by the other two.
- The SVD model takes longer to train than to predict, and end-to-end it is far faster than UBCF.
In exploring product reviews of video games listed (or once listed) on Amazon, we looked at a few models and measured several performance metrics, such as precision and processing time. One theme that emerged is the trade-off between the speed of a model and the resources an organization can devote to generating predictions. Everyone wants better accuracy, but for most organizations the challenge is balancing that goal against the cost of computation.
In certain circumstances the environment may require an API that continuously updates the training data. In a previous project I worked with the MovieLens dataset and considered how it pertains to the Netflix recommendation system. As an avid user myself, I can't help but wonder whether some sort of hybrid recommendation system could be created to get the "best of both" models; a rough sketch follows.
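recommenderlab's `HybridRecommender()` makes it easy to prototype that idea by combining the two strongest models above; the 50/50 weights here are illustrative guesses, not tuned values:

```r
# Weighted hybrid of UBCF and SVD; weights are untuned, illustrative guesses
hybrid_model <- HybridRecommender(
  Recommender(train, method = "UBCF"),
  Recommender(train, method = "SVD", parameter = list(k = 50)),
  weights = c(0.5, 0.5)
)

# Evaluate it the same way as the individual models
hybrid_pred <- predict(hybrid_model, known, type = "ratings")
calcPredictionAccuracy(hybrid_pred, unknown)
```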