Final Project Proposal

Objective

My intention is to use the tools and methods learned throughout this course to build a fully functioning recommender system on a complex dataset.
I’ll explore various models to settle on the best approach, but given the size of the data and my experience with Python and R crashing from far smaller matrices, my expectation is that a Spark implementation will be the way to go.

Dataset

The article “9 Must-Have Datasets for Investigating Recommender Systems” on KDnuggets brought my attention to the last.fm dataset. As a music lover with a wide range of tastes, a music based recommender seems like a good one to investigate and build up. It contains ~2K users and ~18000 artists. The data is provided by members of the Information Retrieval group at Universidad Autonoma de Madrid.