Music Recommendations at Scale with Spark by Christopher Johnson (Spotify)

Synopsis:

This was a great high-level video on how Spotify makes recommendations to its end users.

Recommendation at Spotify utilized the following:

While Netflix, one of the best recommender systems in the business, uses Collaborative Filtering with Explicit Factorization, Spotify uses an Implicit Factorization approach, which has subtle differences.

In explicit factorization:
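
A common way to write the explicit objective is sketched below. The notation is an assumption made for illustration: \(r_{ui}\) is the rating user \(u\) gave item \(i\), \(x_u\) and \(y_i\) are the latent user and item vectors, and \(\lambda\) is the regularization factor. Bias terms are omitted for brevity, and the sum runs only over ratings that were actually observed.

\[
\min_{x,\,y} \sum_{(u,i)\,\text{observed}} \left( r_{ui} - x_u^\top y_i \right)^2 + \lambda \left( \sum_u \lVert x_u \rVert^2 + \sum_i \lVert y_i \rVert^2 \right)
\]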


Discussion I

In Spotify’s case, however, an Implicit Factorization is applied instead.

Instead of explicitly capturing users’ ratings as Netflix does, Spotify simply uses a binary signal: 1 = streamed, 0 = never streamed. The structure looks similar to the explicit case and the motivation is still the same: minimization of the RMSE, but on a weighted basis and constrained by a regularization factor, \(\lambda\). Regularization is simply a constraint on the minimization problem that prevents overfitting.
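
The weighted objective described above can be sketched as follows. The symbols are assumptions for illustration, following the widely used implicit-feedback formulation of Hu, Koren and Volinsky: \(p_{ui}\) is the binary preference (1 = streamed, 0 = never streamed), \(c_{ui}\) is a confidence weight, often taken as \(1 + \alpha\, r_{ui}\) where \(r_{ui}\) is a play count, and \(x_u\), \(y_i\) are the user and song vectors. Unlike the explicit case, the sum runs over every user and song pair, not just the observed ones.

\[
\min_{x,\,y} \sum_{u,\,i} c_{ui} \left( p_{ui} - x_u^\top y_i \right)^2 + \lambda \left( \sum_u \lVert x_u \rVert^2 + \sum_i \lVert y_i \rVert^2 \right)
\]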

In the YouTube video, Alternating Least Squares (ALS) was shown to solve this problem. The main idea is to alternate between holding the song vectors fixed and solving for the user vectors, and holding the user vectors fixed and solving for the song vectors. This back and forth is why it is called the Alternating Least Squares method.
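
The alternation can be illustrated with a minimal pure-Python sketch. It is not Spotify's Spark implementation. It assumes a small dense play-count matrix, a confidence weight of `1 + alpha * count`, and hypothetical names (`implicit_als`, `update`, `solve`) chosen for this example. With one side held fixed, each user (or song) vector is the solution of a small weighted ridge regression, which the code solves exactly.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def update(fixed, prefs, weights, lam):
    """Hold `fixed` constant and solve a weighted ridge regression per row."""
    k = len(fixed[0])
    out = []
    for p_row, c_row in zip(prefs, weights):
        # Normal equations: (F^T C F + lam * I) x = F^T C p
        A = [[lam if a == b else 0.0 for b in range(k)] for a in range(k)]
        rhs = [0.0] * k
        for f, p, c in zip(fixed, p_row, c_row):
            for a in range(k):
                rhs[a] += c * p * f[a]
                for b in range(k):
                    A[a][b] += c * f[a] * f[b]
        out.append(solve(A, rhs))
    return out


def loss(X, Y, P, C, lam):
    """Weighted squared error plus the L2 regularization term."""
    err = sum(C[u][i] * (P[u][i] - dot(X[u], Y[i])) ** 2
              for u in range(len(X)) for i in range(len(Y)))
    reg = lam * (sum(dot(x, x) for x in X) + sum(dot(y, y) for y in Y))
    return err + reg


def implicit_als(counts, k=2, lam=0.1, alpha=10.0, iters=10, seed=0):
    """Alternate between solving for user vectors X and song vectors Y."""
    rng = random.Random(seed)
    P = [[1.0 if r > 0 else 0.0 for r in row] for row in counts]  # streamed?
    C = [[1.0 + alpha * r for r in row] for row in counts]        # assumed weights
    Pt = [list(col) for col in zip(*P)]
    Ct = [list(col) for col in zip(*C)]
    Y = [[rng.random() for _ in range(k)] for _ in Pt]  # random song vectors
    history = []
    for _ in range(iters):
        X = update(Y, P, C, lam)    # songs fixed, solve for users
        Y = update(X, Pt, Ct, lam)  # users fixed, solve for songs
        history.append(loss(X, Y, P, C, lam))
    return X, Y, history


if __name__ == "__main__":
    # Rows are users, columns are songs, entries are made-up play counts.
    plays = [[5, 3, 0, 0], [4, 0, 0, 0], [0, 0, 2, 6], [0, 0, 1, 4]]
    X, Y, history = implicit_als(plays)
    print("loss after first and last sweep:", history[0], history[-1])
    print("user 1 scores:", [round(dot(X[1], y), 3) for y in Y])
```

Because each half-step minimizes the objective exactly with the other side fixed, the loss recorded in `history` should never increase from one sweep to the next. At real scale the same per-row solves are distributed across a cluster, which is where a framework such as Spark comes in.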


Summary & Recommendation