The purpose of this project is to implement a Global Baseline Estimate recommendation system using movie rating data. This approach provides a non-personalized baseline model that estimates ratings based on overall averages and systematic item-level and user-level effects.
The project will begin by importing the provided movie ratings dataset into R and reviewing its structure to understand the available variables. The global average rating, computed over all observed ratings, will serve as the foundation of the baseline estimate. Next, each movie's deviation (its average rating minus the global mean) and each user's deviation (that user's average rating minus the global mean) will be computed. The final baseline prediction for a user-movie pair is the global mean plus the movie's deviation plus the user's deviation.
This structured approach allows for incremental model building and helps separate general rating tendencies from individual preferences.
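The steps described above can be sketched in base R as follows. This is only a minimal illustration under stated assumptions: the toy data, the column names `user`, `movie`, and `rating`, the helper `baseline_estimate()`, and the choice to treat an unseen user or movie as having zero deviation are all hypothetical and not taken from the provided dataset.

```r
# Toy ratings in long format (hypothetical data); NA marks an unrated movie.
ratings <- data.frame(
  user   = c("A", "A", "A", "B", "B", "C", "C", "C"),
  movie  = c("M1", "M2", "M3", "M1", "M3", "M1", "M2", "M3"),
  rating = c(5, 3, NA, 4, 2, 3, 1, 2),
  stringsAsFactors = FALSE
)

# Use only observed ratings for every average.
observed <- ratings[!is.na(ratings$rating), ]

# Step 1: global average rating.
global_mean <- mean(observed$rating)

# Step 2: movie and user deviations from the global mean,
# stored as named numeric vectors keyed by identifier.
movie_bias <- vapply(split(observed$rating, observed$movie), mean, numeric(1)) - global_mean
user_bias  <- vapply(split(observed$rating, observed$user),  mean, numeric(1)) - global_mean

# Step 3: baseline prediction = global mean + user deviation + movie deviation.
baseline_estimate <- function(user, movie) {
  b_user  <- user_bias[user]
  b_movie <- movie_bias[movie]
  # Assumption: an identifier with no observed ratings gets a deviation of 0.
  b_user[is.na(b_user)]   <- 0
  b_movie[is.na(b_movie)] <- 0
  unname(global_mean + b_user + b_movie)
}

# Predict the rating user A would give movie M3, which A has not rated.
pred <- baseline_estimate("A", "M3")
print(pred)  # 22/7, about 3.14
```

Looking deviations up by name rather than by position is one way to keep user and movie identifiers aligned, since each value stays attached to the identifier it was computed for.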
Potential challenges include missing or incomplete ratings, uneven numbers of ratings per user or movie (averages based on only a few ratings are less reliable), and the need to correctly align user and movie identifiers when combining the deviations. The calculations must also run efficiently and reproducibly across the full dataset.
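A few quick checks can surface these issues before any averages are computed. The sketch below is illustrative only and again assumes hypothetical toy data in long format with columns named `user`, `movie`, and `rating`.

```r
# Hypothetical toy data; NA marks a missing rating.
ratings <- data.frame(
  user   = c("A", "A", "A", "B", "B", "C"),
  movie  = c("M1", "M2", "M3", "M1", "M3", "M1"),
  rating = c(5, 3, NA, 4, 2, 3),
  stringsAsFactors = FALSE
)

# How many ratings are missing?
n_missing <- sum(is.na(ratings$rating))

# How many observed ratings does each user and each movie have?
observed <- ratings[!is.na(ratings$rating), ]
ratings_per_user  <- table(observed$user)
ratings_per_movie <- table(observed$movie)

# Are there repeated user-movie pairs that could distort the averages?
n_duplicates <- sum(duplicated(ratings[, c("user", "movie")]))

print(n_missing)
print(ratings_per_user)
print(ratings_per_movie)
print(n_duplicates)
```

Users or movies with very low counts are the ones whose deviations deserve the most caution when interpreting the baseline predictions.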
OpenAI. (2026). ChatGPT (Version 5.2) [Large language model]. https://chat.openai.com. Accessed February 12, 2026.