# Install any package that is missing, then attach it.
# (The original pattern left a freshly installed package unattached.)
packages <- c("knitr", "tidyverse", "kableExtra", "dplyr",
              "ggrepel", "recommenderlab", "tictoc", "sparklyr")
for (pkg in packages) {
  if (!requireNamespace(pkg, quietly = TRUE)) install.packages(pkg)
  library(pkg, character.only = TRUE)
}

 

Objective

Data

Load your data

I used the MovieLens small dataset: roughly 100,000 ratings and 3,600 tag applications applied to 9,000 movies by 600 users.
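
As a minimal sketch of the loading step, the ratings file can be read into a data frame with base R. The path `ml-latest-small/ratings.csv` is an assumption about where the archive was unpacked, so that call is left commented out; the inlined text is a stand-in holding the first six rows of the ratings file as displayed in this report, so the block runs on its own.

```r
# Assumption: the unpacked archive lives in "ml-latest-small/" (hypothetical path).
# ratings <- read.csv("ml-latest-small/ratings.csv")

# Stand-in so the block is self-contained: the first six rows of the ratings file.
ratings <- read.csv(text = "userId,movieId,rating,timestamp
1,1,4,964982703
1,3,4,964981247
1,6,4,964982224
1,47,5,964983815
1,50,5,964982931
1,70,3,964982400")

# Inspect column types and the first rows.
str(ratings)
head(ratings)
```

Each row is one rating event: who rated (`userId`), what was rated (`movieId`), the score, and a Unix timestamp.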

Display your data

userId movieId rating timestamp
1 1 4 964982703
1 3 4 964981247
1 6 4 964982224
1 47 5 964983815
1 50 5 964982931
1 70 3 964982400

Transform Data

I used realRatingMatrix from ‘recommenderlab’ to store the ratings as a sparse user-by-movie matrix. Its dimensions are 610 users by 9,724 movies:

## [1]  610 9724
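
The conversion can be sketched as below. This is an illustration, not necessarily the exact code behind the output above: it assumes the ratings are held as (user, item, rating) triples in a data frame and relies on the data-frame coercion that recommenderlab provides. The data frame here is a stand-in built from the six ratings displayed earlier, so it yields a 1 x 6 matrix rather than the full 610 x 9724.

```r
library(recommenderlab)

# Stand-in data: the six ratings displayed earlier, all from userId 1.
ratings <- data.frame(
  userId  = c(1, 1, 1, 1, 1, 1),
  movieId = c(1, 3, 6, 47, 50, 70),
  rating  = c(4, 4, 4, 5, 5, 3)
)

# Coerce the (user, item, rating) triples to a sparse realRatingMatrix.
# The first three columns are read as user, item, and rating, in that order.
rating_matrix <- as(ratings, "realRatingMatrix")

# Rows are users, columns are movies.
dim(rating_matrix)
```

Dropping the `timestamp` column before coercion keeps only the three columns the coercion expects.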

Centralized Recommender System - recommenderlab

I used ‘recommenderlab’ to train and evaluate the centralized model. Its prediction errors are:

##      RMSE       MSE       MAE 
## 0.9266040 0.8585949 0.7170423
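
To make the three metrics concrete, here is a base-R sketch of how RMSE, MSE, and MAE are computed from actual and predicted ratings. The function name `prediction_errors` and the example vectors are hypothetical; the report's own numbers were presumably produced by the library's evaluation routine rather than by this helper.

```r
# Hypothetical helper: error metrics for numeric rating predictions.
prediction_errors <- function(actual, predicted) {
  err <- predicted - actual
  mse <- mean(err^2)               # mean squared error
  c(RMSE = sqrt(mse),              # root mean squared error
    MSE  = mse,
    MAE  = mean(abs(err)))         # mean absolute error
}

# Illustrative vectors only, not data from this report.
errors <- prediction_errors(actual = c(4, 4, 5), predicted = c(3, 4, 5))
print(errors)
```

Because RMSE is the square root of MSE, the reported values can be sanity-checked against each other: 0.9266040 squared is approximately 0.8585949.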

Distributed Recommender System - SparkR

System              RMSE       MSE        MAE
Centralized System  0.9266040  0.8585949  0.7170423
Distributed System  0.6656635  0.4431079  0.4978245

System              train_time  predict_time
Centralized System  0.02        225.89
Distributed System  5.63        3.19
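
Timings like `train_time` and `predict_time` can be collected with the ‘tictoc’ package loaded at the top of this report. The sketch below is one way to do it; the helper `time_it` is hypothetical, `Sys.sleep()` stands in for a real training or prediction call, and whether the numbers above came from this exact pattern is an assumption.

```r
library(tictoc)

# Hypothetical helper: evaluate an expression and return elapsed seconds.
time_it <- function(expr) {
  tic()                          # start the timer
  force(expr)                    # evaluate the lazily passed expression now
  timing <- toc(quiet = TRUE)    # stop the timer without printing
  unname(timing$toc - timing$tic)
}

# Stand-in workload; replace with the model training or prediction call.
train_time <- time_it(Sys.sleep(0.1))
print(train_time)
```

Passing the work as an argument relies on R's lazy evaluation: the expression is not run until `force(expr)`, so it falls between `tic()` and `toc()`.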

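Summary results: the distributed system had lower error on every metric (RMSE 0.67 against 0.93) and predicted far faster (3.19 against 225.89), while the centralized system trained faster (0.02 against 5.63).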