0.1 Introduction

The goal of this assignment is to provide practice working with Matrix Factorization techniques.

The task is to implement a matrix factorization method—such as singular value decomposition (SVD) or Alternating Least Squares (ALS)—in the context of a recommender system.

You may approach this assignment in a number of ways. You are welcome to start with an existing recommender system written by yourself or someone else. Remember as always to cite your sources, so that you can be graded on what you added, not what you found.

SVD can be thought of as a pre-processing step for feature engineering. You might easily start with thousands or millions of items, and use SVD to create a much smaller set of “k” items (e.g. 20 or 70).

0.1.1 Assignment Highlights

SVD builds features that may or may not map neatly to items (such as movie genres or news topics). As in many areas of machine learning, the lack of explainability can be an issue.

SVD requires that there are no missing values. There are various ways to handle this, including (1) imputation of missing values, (2) mean-centering values around 0, or (3) using a more advanced technique, such as stochastic gradient descent to simulate SVD in populating the factored matrices.

Calculating the SVD matrices can be computationally expensive, although calculating ratings once the factorization is completed is very fast. You may need to create a subset of your data for SVD calculations to be successfully performed, especially on a machine with a small RAM footprint.

0.1.2 About the Data

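For Project 3, I will build on Project 2 using the same MovieLens data, and evaluate the models using RMSE, MSE and MAE. Note that the analysis below loads the MovieLense dataset bundled with recommenderlab (99,392 ratings of 1,664 movies by 943 users), not the MovieLens Latest Small dataset (about 100,000 ratings and 3,600 tag applications applied to 9,000 movies by 600 users).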

0.1.3 Loading the MovieLense Data

## [1] "data"      "normalize"
## [1] "realRatingMatrix"
## attr(,"package")
## [1] "recommenderlab"
## [1]  943 1664

0.1.4 Data structure

## Formal class 'realRatingMatrix' [package "recommenderlab"] with 2 slots
##   ..@ data     :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
##   .. .. ..@ i       : int [1:99392] 0 1 4 5 9 12 14 15 16 17 ...
##   .. .. ..@ p       : int [1:1665] 0 452 583 673 882 968 994 1386 1605 1904 ...
##   .. .. ..@ Dim     : int [1:2] 943 1664
##   .. .. ..@ Dimnames:List of 2
##   .. .. .. ..$ : chr [1:943] "1" "2" "3" "4" ...
##   .. .. .. ..$ : chr [1:1664] "Toy Story (1995)" "GoldenEye (1995)" "Four Rooms (1995)" "Get Shorty (1995)" ...
##   .. .. ..@ x       : num [1:99392] 5 4 4 4 4 3 1 5 4 5 ...
##   .. .. ..@ factors : list()
##   ..@ normalize: NULL

0.1.5 Exploring the ratings values

## [1] 5 4 0 3 1 2
## ratingvalues
##       0       1       2       3       4       5 
## 1469760    6059   11307   27002   33947   21077

0.1.6 Excluding the missing values

0.2 Create Training and Testing sets

Use 80% for training the model

Prepare training dataset

Prepare testing set

Prepare evaluation set

0.3 User-User Collaborative Filtering

User-based collaborative filtering algorithms measure the similarity between users. The “Recommender” object is fitted with the “UBCF” (user-based collaborative filtering) method, using center normalization, cosine similarity, and 25 nearest neighbors. But first, I will compute the similarity matrix.

0.3.1 Similarity Matrix

The recommenderlab function similarity() takes a “realRatingMatrix” and computes pairwise cosine similarities, which helps in exploring the data before building the model.

0.3.2 The User Based Model

I build the user-based collaborative filtering model using the 25 nearest neighbors. It recommends movies to a user based on the ratings of the users most similar to them.

0.3.3 Making Predictions with newdata = test

0.4 Model Evaluation - User Based

##      RMSE       MSE       MAE 
## 0.9420550 0.8874676 0.7463825

0.5 Item-Item Collaborative Filtering

Item-based collaborative filtering algorithms measure the similarity between items. The “Recommender” object is fitted with the “IBCF” (item-based collaborative filtering) method, using center normalization, cosine similarity, and k = 250. But first, I will compute the similarity matrix.

0.5.1 Similarity Matrix

The recommenderlab function similarity() takes a “realRatingMatrix” and computes pairwise cosine similarities, which helps in exploring the data before building the model.

The diagonal is yellow because each item is compared with itself there.

0.5.2 The Item Based Model

I build the item-item collaborative filtering model, which recommends movies whose rating patterns are similar to those of movies the user has already rated.

0.5.3 Making Predictions with newdata = test

0.6 Model Evaluation - Item Based

##      RMSE       MSE       MAE 
## 1.1347000 1.2875440 0.8856329

0.7 Singular Value Decomposition (SVD) Model

Singular Value Decomposition (SVD) will enable us to uncover hidden features in a ratings matrix which are generated from the collective behavior of users. The SVD performs dimensionality reduction, which can reduce the problem of overfitting and provide robust and compact representations of the data.

0.7.1 SVD Model

I will fit an SVD model with a much smaller set of k = 70 latent features. It should retain most of the information in the ratings, and I will compare it with the item-based and user-based models.

0.7.2 Making Predictions with newdata = test

0.8 Model Evaluation - SVD

##      RMSE       MSE       MAE 
## 0.9648359 0.9309084 0.7649963

0.9 Model Performance

0.9.1 Comparing the Recommender Systems

##         ModelName Model_RMSE Model_MSE Model_MAE
## 1 UserBased Model      0.942     0.887     0.746
## 2 ItemBased Model      1.135     1.288     0.886
## 3       SVD Model      0.965     0.931     0.765

0.10 Conclusion

The RMSE is a measure of the differences between values predicted by a model and the values observed. The RMSE is a good measure to use if we want to estimate the standard deviation of a typical observed value from the model’s prediction, assuming that the observed data can be decomposed into the model’s prediction plus random noise with mean zero.

If the noise is small, as estimated by RMSE, this generally means that the model is good at predicting the observed data, and if RMSE is large, this generally means the model is failing to account for important features underlying the data.

The metrics show that the user-based recommender system performs best. Its RMSE (0.942) is slightly lower than that of the SVD system (0.965) and clearly lower than that of the item-based system (1.135), and its MSE and MAE are also the lowest of the three.

The lower RMSE suggests that user-based collaborative filtering may be more effective at predicting ratings than the SVD on this data. Although using a larger value of k may improve the performance, it is also important to identify other possible applications of the SVD.

The disadvantage of SVD is that it offers little explanation of why an item was recommended, since its recommendations come from latent features rather than from an explicit similarity matrix. This can be problematic if users are eager to know why a specific item was recommended.

0.10.1 References

Breese JS, Heckerman D, Kadie C (1998). “Empirical Analysis of Predictive Algorithms for Collaborative Filtering.” In Uncertainty in Artificial Intelligence. Proceedings of the Fourteenth Conference, pp. 43-52.

Kohavi, Ron (1995). “A study of cross-validation and bootstrap for accuracy estimation and model selection”. Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, pp. 1137-1143.

Koren, Y., Bell, R., & Volinsky, C. (2009). Matrix Factorization Techniques for Recommender Systems. Computer, 42(8), 30–37. https://doi.org/10.1109/MC.2009.263

---
title: "Data 612 - Project 3"
author: Emmanuel Hayble-Gomes
date: "06/22/2020"
output:
  html_document:
    code_download: yes
    code_folding: hide
    highlight: pygments
    number_sections: yes
    theme: flatly
    toc: yes
    toc_float: yes
  pdf_document:
    toc: yes
---

## Introduction

The goal of this assignment is to provide practice working with Matrix Factorization techniques.

The task is to implement a matrix factorization method—such as singular value decomposition (SVD) or Alternating Least Squares (ALS)—in the context of a recommender system.

You may approach this assignment in a number of ways. You are welcome to start with an existing recommender system written by yourself or someone else. Remember as always to cite your sources, so that you can be graded on what you added, not what you found.

SVD can be thought of as a pre-processing step for feature engineering. You might easily start with thousands or millions of items, and use SVD to create a much smaller set of “k” items (e.g. 20 or 70).
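As a rough illustration of this idea, the sketch below applies base R's `svd()` to a small, fully observed toy matrix and keeps only the `k` largest singular values. The matrix values and the choice `k = 2` are made up for the example and are not taken from the MovieLense data.

```r
# Toy ratings matrix: 4 users x 3 items, fully observed (illustrative values).
R <- matrix(c(5, 4, 1,
              4, 5, 2,
              1, 2, 5,
              2, 1, 4), nrow = 4, byrow = TRUE)

s <- svd(R)   # R = U diag(d) V^T
k <- 2        # number of latent features to keep (assumed for the example)

# Rank-k approximation built from the k largest singular values.
U_k <- s$u[, 1:k, drop = FALSE]
V_k <- s$v[, 1:k, drop = FALSE]
R_k <- U_k %*% diag(s$d[1:k], k, k) %*% t(V_k)

# Frobenius-norm error of the approximation.
err <- sqrt(sum((R - R_k)^2))
err
```

The error of the rank-`k` approximation equals the square root of the sum of the squared singular values that were dropped, which is why a small `k` can still describe the matrix well when the remaining singular values are small.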

### Assignment Highlights

SVD builds features that may or may not map neatly to items (such as movie genres or news topics). As in many areas of machine learning, the lack of explainability can be an issue.

SVD requires that there are no missing values. There are various ways to handle this, including (1) imputation of missing values, (2) mean-centering values around 0, or (3) using a more advanced technique, such as stochastic gradient descent to simulate SVD in populating the factored matrices.
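Option (3) can be sketched in a few lines of base R. The example below is a minimal, hypothetical illustration of stochastic gradient descent on the observed ratings only. The toy matrix, the learning rate `lr`, the penalty `lambda`, and the number of epochs are all assumed values, and this is not the method used by recommenderlab later in this report.

```r
set.seed(1)
# Toy ratings with NA for unobserved entries (illustrative values).
R <- matrix(c(5, 4, NA, 1,
              4, NA, 1, 1,
              1, 1, NA, 5,
              NA, 1, 5, 4), nrow = 4, byrow = TRUE)

k <- 2; lr <- 0.02; lambda <- 0.02; epochs <- 2000     # assumed settings
P <- matrix(rnorm(nrow(R) * k, sd = 0.1), nrow(R), k)  # user factors
Q <- matrix(rnorm(ncol(R) * k, sd = 0.1), ncol(R), k)  # item factors
obs <- which(!is.na(R), arr.ind = TRUE)                # observed (user, item) pairs

for (epoch in seq_len(epochs)) {
  for (idx in sample(nrow(obs))) {
    u <- obs[idx, 1]; i <- obs[idx, 2]
    e <- R[u, i] - sum(P[u, ] * Q[i, ])   # error on this observed rating
    pu <- P[u, ]                          # keep the old user factors for the item update
    P[u, ] <- pu + lr * (e * Q[i, ] - lambda * pu)
    Q[i, ] <- Q[i, ] + lr * (e * pu - lambda * Q[i, ])
  }
}

R_hat <- P %*% t(Q)   # dense matrix of predicted ratings
train_rmse <- sqrt(mean((R - R_hat)[!is.na(R)]^2))
train_rmse
```

Because the updates only touch observed entries, no imputation is needed, and the cells that were `NA` in `R` are filled by the product of the learned factors.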

Calculating the SVD matrices can be computationally expensive, although calculating ratings once the factorization is completed is very fast. You may need to create a subset of your data for SVD calculations to be successfully performed, especially on a machine with a small RAM footprint.

### About the Data

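For Project 3, I will build on Project 2 using the same MovieLens data, and evaluate the models using RMSE, MSE and MAE. Note that the code below loads the MovieLense dataset bundled with recommenderlab (99,392 ratings of 1,664 movies by 943 users), not the MovieLens Latest Small dataset (about 100,000 ratings and 3,600 tag applications applied to 9,000 movies by 600 users).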

```{r, message=FALSE, warning=FALSE}
library(dplyr)
library(ggplot2)
library(kableExtra)
library(recommenderlab)
```

### Loading the MovieLense Data

```{r}
data(MovieLense)
```


```{r}
slotNames(MovieLense)
class(MovieLense)
dim(MovieLense)
```

### Data structure

```{r}
str(MovieLense)
```

### Exploring the ratings values

```{r}
ratingvalues <- as.vector(MovieLense@data)
unique(ratingvalues) # Ratings are integers from 1 to 5; 0 marks a missing (unrated) entry of the sparse matrix.

tablerating <- table(ratingvalues)
tablerating
```

### Excluding the missing values

```{r}
ratingvalues <- ratingvalues[ratingvalues != 0]
```


### Distribution of the Ratings

```{r}
hist(ratingvalues, 
     breaks = 6, 
     main="Distribution of Ratings",
     xlab="Ratings",
     col="blue",
     freq=TRUE
     )
```

### Pre-processing

**Convert data to numeric**

```{r}
movies <- as(MovieLense, 'data.frame')
movies$user <- as.numeric(movies$user)
movies$item <- as.numeric(movies$item)
```

### Create the sparse Matrix

```{r}
n_users <- length(unique(movies$user))
n_items <- length(unique(movies$item))

ratingsMat <- sparseMatrix(
  i = movies$user, j = movies$item, x = movies$rating,
  dims = c(n_users, n_items),
  dimnames = list(paste0("u", seq_len(n_users)), paste0("m", seq_len(n_items)))
)

ratingsReal <- new("realRatingMatrix", data = ratingsMat)
ratingsReal
```

**I select the users who have rated more than 100 movies and the movies that have been rated more than 150 times**

```{r}
ratings <- MovieLense[rowCounts(MovieLense) > 100, colCounts(MovieLense) > 150]
```
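For readers without recommenderlab at hand, the same kind of filter can be sketched on a plain matrix in base R. The toy matrix and the thresholds `min_user` and `min_item` below are hypothetical. `rowSums(!is.na(m))` is used here as a stand-in for counting ratings per user, and `colSums(!is.na(m))` for counting ratings per item.

```r
set.seed(42)
# Toy matrix: 8 users x 6 movies, NA = not rated (illustrative values).
m <- matrix(sample(c(NA, 1:5), 8 * 6, replace = TRUE,
                   prob = c(0.5, rep(0.1, 5))),
            nrow = 8,
            dimnames = list(paste0("u", 1:8), paste0("m", 1:6)))

user_counts <- rowSums(!is.na(m))   # number of ratings per user
item_counts <- colSums(!is.na(m))   # number of ratings per movie

min_user <- 2; min_item <- 3        # hypothetical thresholds
dense <- m[user_counts > min_user, item_counts > min_item, drop = FALSE]
dim(dense)
```

Both counts are computed on the full matrix before subsetting, so a user kept by the filter can end up with fewer ratings inside the reduced matrix than the threshold suggests.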

## Create Training and Testing sets

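Use 80% of the users for training the model. For each test user, 20 ratings are given to the recommender and the rest are held out for evaluation; ratings of 4 or higher count as good.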

```{r}
set.seed(100)
Scheme <- evaluationScheme(ratings, method = "split",
                           train = 0.8, given = 20, goodRating = 4)
```

**Prepare training dataset**

```{r}
train <- getData(Scheme, "train")
```

**Prepare testing set**

```{r}
test <- getData(Scheme, "known")
```

**Prepare evaluation set**

```{r}
evaluation <- getData(Scheme, "unknown")
```
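The three data sets above follow a "given-x" protocol: test users reveal a fixed number of ratings to the model and the rest are withheld for scoring. The base R sketch below imitates that protocol on a toy matrix. The names `train_toy`, `known_toy`, and `unknown_toy` and the value `given = 3` are made up for the illustration, and the details may differ from what `evaluationScheme()` does internally.

```r
set.seed(100)
# Toy matrix: 10 users x 8 items; each user leaves 2 items unrated (NA).
m <- matrix(sample(1:5, 80, replace = TRUE), nrow = 10)
for (u in 1:10) m[u, sample(8, 2)] <- NA

train_frac <- 0.8; given <- 3   # assumed settings
train_idx <- sample(nrow(m), size = floor(train_frac * nrow(m)))
train_toy <- m[train_idx, , drop = FALSE]
test_toy  <- m[-train_idx, , drop = FALSE]

known_toy <- unknown_toy <- test_toy
for (u in seq_len(nrow(test_toy))) {
  rated <- which(!is.na(test_toy[u, ]))
  keep  <- rated[sample.int(length(rated), given)]
  known_toy[u, setdiff(rated, keep)] <- NA   # the model only sees `given` ratings
  unknown_toy[u, keep] <- NA                 # the remaining ratings are held out
}
```

Predictions are then made from `known_toy` and scored against `unknown_toy`, so no rating is used both as input and as ground truth.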

## User-User Collaborative Filtering

User-based collaborative filtering algorithms measure the similarity between users. The `Recommender` object is fitted with the “UBCF” (user-based collaborative filtering) method, using center normalization, cosine similarity, and 25 nearest neighbors. But first, I will compute the similarity matrix.

### Similarity Matrix

The recommenderlab function `similarity()` takes a “realRatingMatrix” and computes pairwise cosine similarities, which helps in exploring the data before building the model.

```{r}
similarityUsers <- similarity(MovieLense[1:4, ], method = "cosine", which = "users")

image(as.matrix(similarityUsers), main = "Users similarity")
```
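To make the measure concrete, here is a small base R sketch of cosine similarity between users, computed on the items both users have rated. The helper `cosine_sim()` and the toy ratings are hypothetical, and the treatment of missing values is an assumption that may differ from recommenderlab's implementation.

```r
# Cosine similarity of two rating vectors, restricted to co-rated items.
cosine_sim <- function(a, b) {
  both <- !is.na(a) & !is.na(b)
  if (!any(both)) return(NA_real_)
  sum(a[both] * b[both]) / (sqrt(sum(a[both]^2)) * sqrt(sum(b[both]^2)))
}

# Toy ratings: 3 users x 4 items, NA = not rated (illustrative values).
users <- rbind(u1 = c(5, 3, NA, 1),
               u2 = c(4, NA, 2, 1),
               u3 = c(1, 1, 5, 5))

n <- nrow(users)
sim <- matrix(NA_real_, n, n,
              dimnames = list(rownames(users), rownames(users)))
for (i in 1:n) for (j in 1:n) sim[i, j] <- cosine_sim(users[i, ], users[j, ])
round(sim, 3)
```

Each user has similarity 1 with themselves, which is why the diagonal of a similarity image stands out from the other cells.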

### The User Based Model

I build the user-based collaborative filtering model using the 25 nearest neighbors. It recommends movies to a user based on the ratings of the users most similar to them.

```{r}
Usermodel <- Recommender(train, method = "UBCF", 
                     param=list(normalize = "center", method="Cosine", nn=25))
```
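The idea behind the model can be sketched in base R. The function `predict_ubcf()` below is a hypothetical, simplified version that centers each user's ratings, picks the `nn` most similar users who rated the item, and averages their centered ratings without weights. The toy matrix is made up, and recommenderlab's UBCF may differ in its details.

```r
# Toy ratings: rows are users, columns are items, NA = not rated.
toy <- rbind(c(5, 4, 1, NA),
             c(5, 5, 1, 2),
             c(1, 2, 5, 5),
             c(4, 4, 2, 1))

predict_ubcf <- function(ratings, user, item, nn = 2) {
  centered <- ratings - rowMeans(ratings, na.rm = TRUE)    # subtract each user's mean
  target <- centered[user, ]
  others <- setdiff(which(!is.na(ratings[, item])), user)  # users who rated the item
  sims <- sapply(others, function(v) {
    both <- !is.na(target) & !is.na(centered[v, ])
    den <- sqrt(sum(target[both]^2)) * sqrt(sum(centered[v, both]^2))
    if (den == 0) NA_real_ else sum(target[both] * centered[v, both]) / den
  })
  keep <- !is.na(sims)
  others <- others[keep]; sims <- sims[keep]
  top <- order(sims, decreasing = TRUE)[seq_len(min(nn, length(sims)))]
  # Neighbors' average centered rating, added back to the target user's mean.
  mean(ratings[user, ], na.rm = TRUE) + mean(centered[others[top], item])
}

ubcf_pred <- predict_ubcf(toy, user = 1, item = 4, nn = 2)
ubcf_pred
```

In this toy case users 2 and 4 share user 1's tastes and both rated item 4 below their own average, so the prediction for user 1 falls below that user's mean rating.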

### Making Predictions with newdata = test

```{r}
Userpred <- predict(Usermodel, newdata = test, type = "ratings")
```

## Model Evaluation - User Based

```{r}
Useraccuracy <- calcPredictionAccuracy(Userpred, evaluation)
Useraccuracy
```

## Item-Item Collaborative Filtering

Item-based collaborative filtering algorithms measure the similarity between items. The `Recommender` object is fitted with the “IBCF” (item-based collaborative filtering) method, using center normalization, cosine similarity, and k = 250. But first, I will compute the similarity matrix.

### Similarity Matrix

The recommenderlab function `similarity()` takes a “realRatingMatrix” and computes pairwise cosine similarities, which helps in exploring the data before building the model.

```{r}
similarityitems <- similarity(MovieLense[, 1:4], method = "cosine", which = "items")

image(as.matrix(similarityitems), main = "Items similarity")
```

The diagonal is yellow because each item is compared with itself there.

### The Item Based Model

I build the item-item collaborative filtering model, which recommends movies whose rating patterns are similar to those of movies the user has already rated.

```{r}
Itemmodel <- Recommender(train, method = "IBCF", 
                     param=list(normalize = "center", method="Cosine", k=250))
```
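A simplified view of the item-based prediction step can be written in base R. The function `predict_ibcf()` below is a hypothetical sketch: it computes raw cosine similarity between the target item and the items the user has rated, keeps the `k` most similar ones, and returns a similarity-weighted average of the user's own ratings. The toy matrix is made up, and the normalization used by recommenderlab's IBCF is not reproduced here.

```r
# Toy ratings: rows are users, columns are items, NA = not rated.
toy <- rbind(c(5, 4, 1, NA),
             c(5, 5, 1, 2),
             c(1, 2, 5, 5),
             c(4, 4, 2, 1))

predict_ibcf <- function(ratings, user, item, k = 2) {
  rated <- setdiff(which(!is.na(ratings[user, ])), item)   # items the user has rated
  sims <- sapply(rated, function(j) {
    both <- !is.na(ratings[, item]) & !is.na(ratings[, j]) # users who rated both items
    a <- ratings[both, item]; b <- ratings[both, j]
    sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
  })
  top <- order(sims, decreasing = TRUE)[seq_len(min(k, length(sims)))]
  # Similarity-weighted average of the user's ratings on the k nearest items.
  sum(sims[top] * ratings[user, rated[top]]) / sum(abs(sims[top]))
}

ibcf_pred <- predict_ibcf(toy, user = 1, item = 4, k = 2)
ibcf_pred
```

Unlike the user-based approach, the item similarities depend only on the rating matrix and not on the target user, so they can be computed once and reused for every prediction.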

### Making Predictions with newdata = test

```{r}
Itempred <- predict(Itemmodel, newdata = test, type = "ratings")
```

## Model Evaluation - Item Based

```{r}

Itemaccuracy <- calcPredictionAccuracy(Itempred, evaluation)
Itemaccuracy
```

## Singular Value Decomposition (SVD) Model

Singular Value Decomposition (SVD) will enable us to uncover hidden features in a ratings matrix which are generated from the collective behavior of users. The SVD performs dimensionality reduction, which can reduce the problem of overfitting and provide robust and compact representations of the data.

### SVD Model

I will fit an SVD model with a much smaller set of k = 70 latent features. It should retain most of the information in the ratings, and I will compare it with the item-based and user-based models.

```{r}
SVDmodel <- Recommender(train, method = "SVD", parameter = list(k = 70))
```
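One simple way to apply SVD to a matrix with gaps is to fill each missing rating with its column mean, center the columns, take a truncated SVD, and read the predictions off the reconstruction. The base R sketch below illustrates that recipe on a toy matrix with an assumed `k = 2`. It is meant as an intuition aid and is not a description of how the `Recommender` call above works internally.

```r
# Toy ratings with NA for unobserved entries (illustrative values).
R <- matrix(c(5, 4, NA, 1,
              4, NA, 1, 1,
              1, 1, NA, 5,
              NA, 1, 5, 4), nrow = 4, byrow = TRUE)

# Impute each missing rating with its column (item) mean.
col_means <- colMeans(R, na.rm = TRUE)
filled <- R
for (j in seq_len(ncol(R))) filled[is.na(R[, j]), j] <- col_means[j]

# Center the columns, then keep the k largest singular values.
centered <- sweep(filled, 2, col_means)
s <- svd(centered)
k <- 2   # assumed number of latent features
approx_k <- s$u[, 1:k, drop = FALSE] %*% diag(s$d[1:k], k, k) %*%
  t(s$v[, 1:k, drop = FALSE])

# Add the column means back to return to the rating scale.
R_hat <- sweep(approx_k, 2, col_means, "+")
svd_pred <- R_hat[is.na(R)]   # predictions for the originally missing cells
svd_pred
```

The imputed cells start at the item mean and are then pulled toward the pattern shared by similar users and items, which is where the predictive signal comes from.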

### Making Predictions with newdata = test

```{r}
SVDpred <- predict(SVDmodel, newdata = test, type = "ratings")
```

## Model Evaluation - SVD

```{r}
SVDaccuracy <- calcPredictionAccuracy(SVDpred, evaluation)
SVDaccuracy
```

## Model Performance

### **Comparing the Recommender Systems**

```{r}
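# Build the comparison table from the accuracy vectors computed above,
# so the figures stay in sync with the models instead of being retyped.
accuracy <- rbind(Useraccuracy, Itemaccuracy, SVDaccuracy)

Model_Performance <- data.frame(
  ModelName  = c("UserBased Model", "ItemBased Model", "SVD Model"),
  Model_RMSE = unname(round(accuracy[, "RMSE"], 3)),
  Model_MSE  = unname(round(accuracy[, "MSE"], 3)),
  Model_MAE  = unname(round(accuracy[, "MAE"], 3))
)
Model_Performance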
```

## Conclusion

The RMSE is a measure of the differences between values predicted by a model and the values observed. The RMSE is a good measure to use if we want to estimate the standard deviation of a typical observed value from the model’s prediction, assuming that the observed data can be decomposed into the model’s prediction plus random noise with mean zero.

If the noise is small, as estimated by RMSE, this generally means that the model is good at predicting the observed data, and if RMSE is large, this generally means the model is failing to account for important features underlying the data.
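For reference, the three metrics can be computed by hand in base R. The vectors `pred` and `truth` below are made-up values used only to show the arithmetic.

```r
pred  <- c(3.5, 4.0, 2.0, 5.0)   # hypothetical predicted ratings
truth <- c(4.0, 4.0, 1.0, 3.0)   # hypothetical observed ratings

err  <- pred - truth             # -0.5, 0, 1, 2
mse  <- mean(err^2)              # mean squared error
rmse <- sqrt(mse)                # root mean squared error
mae  <- mean(abs(err))           # mean absolute error

c(RMSE = rmse, MSE = mse, MAE = mae)
```

MSE is always the square of RMSE, and RMSE is never smaller than MAE because squaring gives large errors more weight.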

The metrics show that the user-based recommender system performs best. Its RMSE (0.942) is slightly lower than that of the SVD system (0.965) and clearly lower than that of the item-based system (1.135), and its MSE and MAE are also the lowest of the three.

The lower RMSE suggests that user-based collaborative filtering may be more effective at predicting ratings than the SVD on this data. Although using a larger value of k may improve the performance, it is also important to identify other possible applications of the SVD.

The disadvantage of SVD is that it offers little explanation of why an item was recommended, since its recommendations come from latent features rather than from an explicit similarity matrix. This can be problematic if users are eager to know why a specific item was recommended.


### **References**

Breese JS, Heckerman D, Kadie C (1998). "Empirical Analysis of Predictive Algorithms for Collaborative Filtering." In Uncertainty in Artificial Intelligence. Proceedings of the Fourteenth Conference, pp. 43-52.

Kohavi, Ron (1995). "A study of cross-validation and bootstrap for accuracy estimation and model selection". Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, pp. 1137-1143.

Koren, Y., Bell, R., & Volinsky, C. (2009). Matrix Factorization Techniques for Recommender Systems. Computer, 42(8), 30–37. https://doi.org/10.1109/MC.2009.263
