This project implements a Global Baseline Estimate recommendation model using movie ratings data. The goal is to estimate ratings using overall averages and systematic user-level and movie-level effects.
The dataset contains movie ratings provided by users. Each observation represents a user rating for a specific movie.
The Global Baseline Estimate begins with the overall mean rating and then incorporates movie-specific and user-specific deviations from that mean.
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
ratings <- read.csv("movie_ratings.csv")
head(ratings)
## name title rating
## 1 Alex Black Panther 5
## 2 Alex Avengers: Endgame 4
## 3 Alex The Dark Knight 3
## 4 Chris Black Panther 3
## 5 Chris Get Out 2
## 6 Jordan Get Out 4
str(ratings)
## 'data.frame': 12 obs. of 3 variables:
## $ name : chr "Alex" "Alex" "Alex" "Chris" ...
## $ title : chr "Black Panther" "Avengers: Endgame" "The Dark Knight" "Black Panther" ...
## $ rating: int 5 4 3 3 2 4 4 5 5 4 ...