Introduction

This project implements a Global Baseline Estimate recommendation model using movie ratings data. The goal is to estimate ratings using overall averages and systematic user-level and movie-level effects.

Data Description

The dataset contains movie ratings provided by users. Each observation represents a user rating for a specific movie.

Methodology

The Global Baseline Estimate begins with the overall mean rating and then incorporates movie-specific and user-specific deviations from that mean.

Model Implementation

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
ratings <- read.csv("movie_ratings.csv")

head(ratings)
##     name             title rating
## 1   Alex     Black Panther      5
## 2   Alex Avengers: Endgame      4
## 3   Alex   The Dark Knight      3
## 4  Chris     Black Panther      3
## 5  Chris           Get Out      2
## 6 Jordan           Get Out      4
str(ratings)
## 'data.frame':    12 obs. of  3 variables:
##  $ name  : chr  "Alex" "Alex" "Alex" "Chris" ...
##  $ title : chr  "Black Panther" "Avengers: Endgame" "The Dark Knight" "Black Panther" ...
##  $ rating: int  5 4 3 3 2 4 4 5 5 4 ...