The goal of this assignment is to build a Global Baseline Estimate (GBE) recommendation system in R and to see how personalization changes the predicted ratings. Using the movie rating data collected in the previous assignment (Assignment_2A_Movie ratings), I aim to predict ratings for movies that reviewers have not yet seen, using the GBE formula. The dataset has missing values (NA/NULL), and these are the targets for the prediction algorithm. The model accounts for two effects: reviewer bias (how generously or harshly a reviewer tends to rate) and item bias (how a movie's ratings compare with the overall average).
The dataset contains ratings from 6 reviewers (Liz, Jed, Brenda, Jamie, Justice, Theresa) for 6 horror/thriller movies (The Substance, Nosferatu, Frankenstein, 28 Years Later, Sinners, The Conjuring 4).
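For reference, the GBE formula is commonly written as below. The symbols are notation assumed here for illustration, not taken from the assignment: \mu is the mean of all observed ratings, \bar{r}_u is the mean rating given by reviewer u, and \bar{r}_i is the mean rating received by movie i.

```latex
\hat{r}_{ui} = \mu + b_u + b_i,
\qquad b_u = \bar{r}_u - \mu,
\qquad b_i = \bar{r}_i - \mu
```

A reviewer who rates above the overall average (b_u > 0) pushes the prediction up, and a movie rated below the overall average (b_i < 0) pulls it down.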
Source: local PostgreSQL database, populated via Assigment2A_movie_rating.sql
Connecting/Loading Data
Connect to the PostgreSQL database with DBI and RPostgres, import the raw ratings table into an R data frame, and convert SQL NULL values into R NA values for calculation.
Bias Calculation and Applying the GBE Formula
With dplyr, use the group_by function to calculate each reviewer's bias and each movie's bias relative to the global mean rating.
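The loading and bias steps can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the assignment's actual code: the database name, the table name `movie_ratings`, the column names (`reviewer`, `movie`, `rating`), and the helper `load_ratings` are all hypothetical, and the ratings in the small data frame are made-up values used only so the example runs without a database.

```r
library(dplyr)

# Hypothetical loader: the dbname and table name are placeholders, not the
# assignment's real settings. It is defined here but not called below.
load_ratings <- function(dbname = "movies", table = "movie_ratings") {
  con <- DBI::dbConnect(RPostgres::Postgres(), dbname = dbname)
  on.exit(DBI::dbDisconnect(con))
  # SQL NULL values are expected to arrive in R as NA
  DBI::dbReadTable(con, table)
}

# Made-up illustrative ratings in long format (NOT the assignment's data).
ratings <- data.frame(
  reviewer = rep(c("Liz", "Jed", "Brenda"), each = 3),
  movie    = rep(c("Nosferatu", "Sinners", "Frankenstein"), times = 3),
  rating   = c(4, 5, NA,  3, 4, 3,  5, NA, 4)
)

# Global mean over all observed (non-NA) ratings
global_mean <- mean(ratings$rating, na.rm = TRUE)

# Reviewer bias: reviewer's mean rating minus the global mean
user_biases <- ratings %>%
  group_by(reviewer) %>%
  summarise(user_bias = mean(rating, na.rm = TRUE) - global_mean,
            .groups = "drop")

# Movie bias: movie's mean rating minus the global mean
movie_biases <- ratings %>%
  group_by(movie) %>%
  summarise(movie_bias = mean(rating, na.rm = TRUE) - global_mean,
            .groups = "drop")

print(global_mean)
print(user_biases)
print(movie_biases)
```

Passing `na.rm = TRUE` keeps the unseen movies out of every mean, so the missing ratings do not distort the biases they will later be predicted from.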
Comparison and Data Visualization
Compare the raw average with the GBE predictions and examine the distribution of user biases. Then find the predicted score for a single reviewer; I will use Liz as an example.
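One way to set up this comparison is sketched below. It is an illustration under assumptions rather than the assignment's implementation: the column names are hypothetical and the ratings are made-up values, not the collected data. For every missing rating it puts the movie's raw average next to the GBE prediction, then pulls out Liz's row.

```r
library(dplyr)

# Made-up illustrative ratings in long format (NOT the assignment's data).
ratings <- data.frame(
  reviewer = rep(c("Liz", "Jed", "Brenda"), each = 3),
  movie    = rep(c("Nosferatu", "Sinners", "Frankenstein"), times = 3),
  rating   = c(4, 5, NA,  3, 4, 3,  5, NA, 4)
)

global_mean <- mean(ratings$rating, na.rm = TRUE)

# Reviewer bias relative to the global mean
user_biases <- ratings %>%
  group_by(reviewer) %>%
  summarise(user_bias = mean(rating, na.rm = TRUE) - global_mean,
            .groups = "drop")

# Raw (unpersonalized) movie average and the movie bias derived from it
movie_stats <- ratings %>%
  group_by(movie) %>%
  summarise(raw_avg    = mean(rating, na.rm = TRUE),
            movie_bias = raw_avg - global_mean,
            .groups = "drop")

# Keep only the unseen movies and predict them with the GBE formula
comparison <- ratings %>%
  filter(is.na(rating)) %>%
  left_join(user_biases, by = "reviewer") %>%
  left_join(movie_stats, by = "movie") %>%
  mutate(gbe = global_mean + user_bias + movie_bias) %>%
  select(reviewer, movie, raw_avg, gbe)

# Liz x Frankenstein: raw average 3.5, GBE = 4 + 0.5 + (-0.5) = 4.0
liz <- comparison %>% filter(reviewer == "Liz")
print(liz)
```

In this toy data the raw average for the movie is 3.5, while the GBE prediction for Liz is 4.0 because she tends to rate half a point above the global mean.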