In the previous assignment, I used a Global Baseline Estimate (GBE) to predict movie ratings. The GBE is non-personalized: it uses the overall mean, a user bias, and a movie bias to make predictions, but it does not consider which specific users share similar tastes.
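As a reminder of what the baseline computes, here is a minimal sketch of a Global Baseline Estimate. It assumes the usual form (overall mean plus a user bias plus a movie bias, with each bias taken as the difference between that user's or movie's mean and the overall mean). The three-critic subset and the names `toy`, `mu`, `user_bias`, `movie_bias`, and `gbe` are illustrative, so the numbers will not match results computed on the full survey.

```r
# Toy subset of the survey (3 critics x 3 movies); NA = not rated.
toy <- matrix(
  c( 4, 5,  4,
     4, 4,  1,
    NA, 5, NA),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("Charley", "Param", "Dan"),
                  c("CaptainAmerica", "Deadpool", "Frozen"))
)

mu         <- mean(toy, na.rm = TRUE)            # overall mean rating
user_bias  <- rowMeans(toy, na.rm = TRUE) - mu   # each critic's deviation from mu
movie_bias <- colMeans(toy, na.rm = TRUE) - mu   # each movie's deviation from mu

# GBE for every user-movie pair: mu + user bias + movie bias.
gbe <- mu + outer(user_bias, movie_bias, "+")

round(gbe["Dan", "Frozen"], 2)   # 3.64 on this toy subset
```

Every critic with the same mean rating gets the same GBE for a given movie, which is the sense in which the baseline is non-personalized.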
In this project, I build a personalized recommender system using User-to-User Collaborative Filtering. This algorithm identifies users who rate movies similarly to a target user (the target's neighbors) and uses their ratings to predict how the target user would rate an unseen movie.
What the recommender outputs:
Predicted ratings for every unrated user–movie pair
A top-N ranked recommendation list per user
RMSE evaluation via leave-one-out cross-validation, compared against the GBE baseline
          CaptainAmerica Deadpool Frozen JungleBook PitchPerfect2 StarWarsForce
Burton                NA       NA     NA          4            NA             4
Charley                4        5      4          3             2             3
Dan                   NA        5     NA         NA            NA             5
Dieudonne              5        4     NA         NA            NA             5
Matt                   4       NA      2         NA             2             5
Mauricio               4       NA      3          3             4            NA
Max                    4        4      4          2             2             4
Nathan                NA       NA     NA         NA            NA             4
Param                  4        4      1         NA            NA             5
Parshu                 4        3      5          5             2             3
Prashanth              5        5      5          5            NA             4
Shipra                NA       NA      4          5            NA             3
Sreejaya               5        5      5          4             4             5
Steve                  4       NA     NA         NA            NA             4
Vuthy                  4        5      3          3             3            NA
Xingjia               NA       NA      5          5            NA            NA
          CaptainAmerica Deadpool Frozen JungleBook PitchPerfect2 StarWarsForce
Burton                NA       NA     NA       0.00            NA          0.00
Charley             0.50     1.50   0.50      -0.50         -1.50         -0.50
Dan                   NA     0.00     NA         NA            NA          0.00
Dieudonne           0.33    -0.67     NA         NA            NA          0.33
Matt                0.75       NA  -1.25         NA         -1.25          1.75
Mauricio            0.50       NA  -0.50      -0.50          0.50            NA
Max                 0.67     0.67   0.67      -1.33         -1.33          0.67
Nathan                NA       NA     NA         NA            NA          0.00
Param               0.50     0.50  -2.50         NA            NA          1.50
Parshu              0.33    -0.67   1.33       1.33         -1.67         -0.67
Prashanth           0.20     0.20   0.20       0.20            NA         -0.80
Shipra                NA       NA   0.00       1.00            NA         -1.00
Sreejaya            0.33     0.33   0.33      -0.67         -0.67          0.33
Steve               0.00       NA     NA         NA            NA          0.00
Vuthy               0.40     1.40  -0.60      -0.60         -0.60            NA
Xingjia               NA       NA   0.00       0.00            NA            NA
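The centered values above are each critic's ratings minus that critic's own mean. A minimal sketch of that step, using two rows copied from the ratings matrix (the names `ratings`, `user_means`, and `centered` are my own, and `sweep()` is one of several ways to do the row-wise subtraction):

```r
ratings <- matrix(
  c(4, 5,  4,  3,  2, 3,    # Charley
    5, 4, NA, NA, NA, 5),   # Dieudonne
  nrow = 2, byrow = TRUE,
  dimnames = list(c("Charley", "Dieudonne"),
                  c("CaptainAmerica", "Deadpool", "Frozen",
                    "JungleBook", "PitchPerfect2", "StarWarsForce"))
)

# Each critic's mean over the movies they actually rated.
user_means <- rowMeans(ratings, na.rm = TRUE)

# Subtract the row mean from every rating in that row; NA stays NA.
centered <- sweep(ratings, 1, user_means, "-")

round(centered, 2)
```

Charley's mean is 3.5 and Dieudonne's is about 4.67, so a 4 from Charley is above average for him while a 4 from Dieudonne is below average for her.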
          Burton Charley Dan Dieudonne  Matt Mauricio   Max Nathan Param Parshu
Burton         1    0.00  NA        NA    NA       NA  0.00     NA    NA   0.00
Charley        0    1.00   0     -0.74  0.17    -0.29  0.74     NA -0.19   0.31
Dan           NA    0.00   1      0.00    NA       NA  0.00     NA  0.00   0.00
Dieudonne     NA   -0.74   0      1.00  0.93       NA  0.00     NA  0.25   0.41
Matt          NA    0.17  NA      0.93  1.00     0.23  0.55     NA  0.91  -0.09
Mauricio      NA   -0.29  NA        NA  0.23     1.00  0.00     NA  0.83  -0.79
Max            0    0.74   0      0.00  0.55     0.00  1.00     NA  0.00   0.11
Nathan        NA      NA  NA        NA    NA       NA    NA      1    NA     NA
Param         NA   -0.19   0      0.25  0.91     0.83  0.00     NA  1.00  -0.90
Parshu         0    0.31   0      0.41 -0.09    -0.79  0.11     NA -0.90   1.00
Prashanth      0    0.50   0     -0.48 -0.78    -0.33 -0.24     NA -0.57   0.52
Shipra         0    0.00  NA        NA -0.81    -0.71 -0.87     NA -0.51   0.71
Sreejaya       0    0.74   0      0.00  0.55     0.00  1.00     NA  0.00   0.11
Steve         NA    0.00  NA      0.00  0.00       NA  0.00     NA  0.00   0.00
Vuthy         NA    0.78  NA     -0.74  1.00     0.45  0.61     NA  0.59  -0.30
Xingjia       NA    0.00  NA        NA    NA     0.00  0.00     NA    NA   0.00
          Prashanth Shipra Sreejaya Steve Vuthy Xingjia
Burton         0.00   0.00     0.00    NA    NA      NA
Charley        0.50   0.00     0.74     0  0.78       0
Dan            0.00     NA     0.00    NA    NA      NA
Dieudonne     -0.48     NA     0.00     0 -0.74      NA
Matt          -0.78  -0.81     0.55     0  1.00      NA
Mauricio      -0.33  -0.71     0.00    NA  0.45       0
Max           -0.24  -0.87     1.00     0  0.61       0
Nathan           NA     NA       NA    NA    NA      NA
Param         -0.57  -0.51     0.00     0  0.59      NA
Parshu         0.52   0.71     0.11     0 -0.30       0
Prashanth      1.00   0.83    -0.24     0  0.18       0
Shipra         0.83   1.00    -0.87    NA -0.71       0
Sreejaya      -0.24  -0.87     1.00     0  0.61       0
Steve          0.00     NA     0.00     1    NA      NA
Vuthy          0.18  -0.71     0.61    NA  1.00       0
Xingjia        0.00   0.00     0.00    NA  0.00       1
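The similarity values above can be reproduced, at least for the pairs I checked by hand, with a Pearson-style similarity computed on the centered ratings: each user is centered on their overall mean, and the cosine is taken over the movies both users rated. The sketch below is my reconstruction, not the original code. The function name `user_similarity` and its `min_overlap` argument are hypothetical, and three conventions are inferred from the printed matrix: the diagonal is set to 1, pairs with fewer than 2 co-rated movies stay `NA`, and a zero denominator (a user whose co-rated ratings are all equal to their mean) yields 0.

```r
user_similarity <- function(ratings, min_overlap = 2) {
  centered <- sweep(ratings, 1, rowMeans(ratings, na.rm = TRUE), "-")
  users <- rownames(ratings)
  sim <- matrix(NA_real_, length(users), length(users),
                dimnames = list(users, users))
  for (a in users) {
    for (b in users) {
      if (a == b) { sim[a, b] <- 1; next }
      both <- !is.na(centered[a, ]) & !is.na(centered[b, ])
      if (sum(both) < min_overlap) next        # too few co-rated movies: leave NA
      x <- centered[a, both]
      y <- centered[b, both]
      denom <- sqrt(sum(x^2)) * sqrt(sum(y^2))
      # Assumed convention: no variation on the shared movies means similarity 0.
      sim[a, b] <- if (denom == 0) 0 else sum(x * y) / denom
    }
  }
  sim
}

# Five rows copied from the ratings matrix.
ratings <- matrix(
  c(NA, NA, NA,  4, NA, 4,   # Burton
     4,  5,  4,  3,  2, 3,   # Charley
    NA,  5, NA, NA, NA, 5,   # Dan
     5,  4, NA, NA, NA, 5,   # Dieudonne
     4,  4,  4,  2,  2, 4),  # Max
  nrow = 5, byrow = TRUE,
  dimnames = list(c("Burton", "Charley", "Dan", "Dieudonne", "Max"),
                  c("CaptainAmerica", "Deadpool", "Frozen",
                    "JungleBook", "PitchPerfect2", "StarWarsForce"))
)

sim <- user_similarity(ratings)
round(sim, 2)
```

On this subset the sketch gives 0.74 for Charley and Max, -0.74 for Charley and Dieudonne, 0 for Burton and Charley, and `NA` for Burton and Dan, in line with the matrix above.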
if (cf_rmse < gbe_rmse) {
  improvement <- round((1 - cf_rmse / gbe_rmse) * 100, 1)
  cat("Collaborative Filtering improves over GBE by", improvement, "%\n")
} else {
  cat("Global Baseline Estimate performed better on this dataset.\n")
  cat("This is not unusual with very sparse, small datasets where\n")
  cat("similarity estimates are unreliable.\n")
}
Collaborative Filtering improves over GBE by 7.7 %
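The `cf_rmse` and `gbe_rmse` values compared here come from leave-one-out cross-validation: each known rating is hidden in turn, predicted from the remaining data, and compared with the true value. A minimal sketch of that loop follows. The helper `loocv_rmse` and its `predict_fn(ratings, u, i)` interface are hypothetical, and the global-mean predictor is only a stand-in to make the example runnable, not either of the two models evaluated in this report.

```r
loocv_rmse <- function(ratings, predict_fn) {
  observed <- which(!is.na(ratings), arr.ind = TRUE)   # (row, col) of known ratings
  errors <- numeric(nrow(observed))
  for (n in seq_len(nrow(observed))) {
    u <- observed[n, 1]
    i <- observed[n, 2]
    held_out <- ratings
    held_out[u, i] <- NA                               # hide one rating
    errors[n] <- predict_fn(held_out, u, i) - ratings[u, i]
  }
  # na.rm drops held-out cells the predictor could not score (an assumption).
  sqrt(mean(errors^2, na.rm = TRUE))
}

# Stand-in predictor: the mean of all remaining ratings.
global_mean <- function(ratings, u, i) mean(ratings, na.rm = TRUE)

toy <- matrix(c(1, 3, NA, 5), nrow = 2)
rmse <- loocv_rmse(toy, global_mean)
rmse   # sqrt(6), about 2.45
```

Passing the predictor in as a function lets the same loop score both models, so the two RMSE values are computed on identical held-out cells.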
8 Conclusion
8.1 What Was Built
For this project, I implemented a User-to-User Collaborative Filtering recommender system using the same 16-critic, 6-movie survey dataset as the previous Global Baseline Estimate assignment. The system:
Centers each user’s ratings by subtracting that user’s mean rating, which removes differences in rating scale
Computes the Pearson correlation between every pair of users (requiring at least 2 co-rated movies)
Predicts missing ratings from a similarity-weighted average of the most similar neighbors’ centered ratings (up to k = 5 neighbors with positive similarity)
Falls back to the Global Baseline Estimate when no valid neighbors are available
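The prediction step in the list above can be sketched as follows. This is an illustration under stated assumptions rather than the original code: the function `predict_cf` and its arguments are hypothetical, neighbors are limited to those who rated the movie in question, and the weighted average is normalized by the sum of the similarities used. The inputs are Dieudonne's similarities and her neighbors' centered JungleBook ratings, taken from the tables earlier in the report (only neighbors with a nonzero similarity are listed).

```r
predict_cf <- function(user_mean, sims, neighbor_centered,
                       k = 5, fallback = NA_real_) {
  # Usable neighbors: known positive similarity and a rating for this movie.
  usable <- which(!is.na(sims) & sims > 0 & !is.na(neighbor_centered))
  if (length(usable) == 0) return(fallback)    # e.g. the GBE prediction
  top <- usable[order(sims[usable], decreasing = TRUE)]
  top <- head(top, k)                          # keep at most k neighbors
  user_mean + sum(sims[top] * neighbor_centered[top]) / sum(sims[top])
}

# Dieudonne's similarity to each neighbor (two-decimal values from the matrix).
sims <- c(Charley = -0.74, Matt = 0.93, Param = 0.25,
          Parshu = 0.41, Prashanth = -0.48)
# Each neighbor's centered JungleBook rating (NA = neighbor has not rated it).
jungle_centered <- c(Charley = -0.5, Matt = NA, Param = NA,
                     Parshu = 5 - 22/6, Prashanth = 0.2)

pred <- predict_cf(user_mean = 14/3, sims = sims,
                   neighbor_centered = jungle_centered)
pred
```

Only Parshu is both positively similar and has rated JungleBook, so the prediction is Dieudonne's mean (about 4.67) plus Parshu's centered rating (about 1.33), which lands at 6.0 and exceeds the 1–5 scale.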
8.2 What It Outputs
Predicted ratings for all 35 unrated user–movie pairs (the NA cells in the ratings matrix)
Ranked recommendation lists for each user, showing unseen movies ordered by predicted rating
RMSE scores for both models via leave-one-out cross-validation
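Building a ranked list from the predictions is a matter of keeping only the movies a user has not rated and sorting them by predicted rating. A minimal sketch, where `top_n` is a hypothetical helper and the predicted values are made up for illustration, apart from the 6.0 for JungleBook reported for Dieudonne:

```r
top_n <- function(predicted, observed, n = 3) {
  unseen <- predicted[is.na(observed)]        # drop movies already rated
  head(sort(unseen, decreasing = TRUE), n)    # highest predicted rating first
}

# Dieudonne's actual ratings (NA = unseen).
observed <- c(CaptainAmerica = 5, Deadpool = 4, Frozen = NA,
              JungleBook = NA, PitchPerfect2 = NA, StarWarsForce = 5)
# Illustrative predictions for every movie.
predicted <- c(CaptainAmerica = 4.9, Deadpool = 4.2, Frozen = 4.1,
               JungleBook = 6.0, PitchPerfect2 = 3.2, StarWarsForce = 5.1)

recs <- top_n(predicted, observed, n = 2)
recs
```

With these inputs the list is JungleBook followed by Frozen. StarWarsForce has a higher predicted value than Frozen but is excluded because it has already been rated.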
The collaborative filter reduced leave-one-out RMSE by 7.7% relative to the non-personalized baseline. This suggests that user similarity can improve prediction accuracy even on a small dataset.
8.4 Limitations
Small and sparse data: With only 16 users and 6 movies, similarity estimates are based on very few co-rated items, making them inherently noisy.
Out-of-range predictions: The model can predict ratings outside the 1–5 scale (e.g., Dieudonne’s predicted 6.0 for JungleBook). Clamping predictions to the valid range would address this.
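Clamping is a one-line change. The sketch below uses a hypothetical helper, `clamp_rating`, built on base R's `pmin()` and `pmax()`, with the 1–5 bounds of this survey as defaults:

```r
clamp_rating <- function(x, lo = 1, hi = 5) {
  pmin(pmax(x, lo), hi)   # raise values below lo, lower values above hi
}

clamped <- clamp_rating(c(0.2, 3.4, 6.0, NA))
clamped   # 1.0 3.4 5.0 NA
```

Missing predictions stay `NA`, so clamping can be applied to the whole prediction matrix without disturbing cells that were never scored.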
Cold-start users: Critics with very few ratings, such as Burton and Nathan, give collaborative filtering too little overlap to find reliable neighbors, so their predictions rely on the baseline fallback.