MovieLens 1M movie ratings: a stable benchmark dataset with 1 million ratings from 6000 users on 4000 movies, released 2/2003. The dataset can be downloaded from https://grouplens.org/datasets/movielens/1m/. This dataset is chosen for the analysis.
I will perform a collaborative filtering analysis on the existing MovieLens dataset of user-item ratings, and also analyse the predictions produced by Spark ALS.
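To make the data layout concrete, here is a minimal sketch of reading the files into a sparse structure. It assumes the layout described in the dataset's README, as I understand it: ratings.dat holds UserID::MovieID::Rating::Timestamp records and movies.dat holds MovieID::Title::Genres records with genres separated by "|". The function names and the sample records are hypothetical and not copied from the dataset.

```python
from collections import defaultdict


def parse_rating_line(line):
    """Parse one 'UserID::MovieID::Rating::Timestamp' record (assumed layout)."""
    user_id, movie_id, rating, timestamp = line.strip().split("::")
    return int(user_id), int(movie_id), float(rating), int(timestamp)


def parse_movie_line(line):
    """Parse one 'MovieID::Title::Genres' record and split the genre list."""
    movie_id, title, genres = line.strip().split("::")
    return int(movie_id), title, genres.split("|")


def build_sparse_ratings(lines):
    """Build a user -> {movie: rating} mapping that stores only observed ratings."""
    ratings = defaultdict(dict)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        user_id, movie_id, rating, _ = parse_rating_line(line)
        ratings[user_id][movie_id] = rating
    return dict(ratings)


# Hypothetical sample records in the assumed ratings.dat layout.
sample_lines = [
    "1::10::5::978300000",
    "1::20::3::978300100",
    "2::10::4::978300200",
]
sparse = build_sparse_ratings(sample_lines)
movie = parse_movie_line("1::Example Movie (1995)::Animation|Comedy")
```

The nested dictionary is one simple sparse representation: with about 1 million ratings spread over 6000 users and 4000 movies, most user-movie pairs are unrated, so only the observed entries are stored.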
Collaborative filtering is a method of making automatic predictions (filtering) about the interests of a user by collecting preferences or taste information from many users (collaborating).
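As a rough illustration of the idea, the sketch below predicts one user's missing rating from the ratings of other users, weighted by how similar their tastes are. It is only one simple variant (user-based, cosine similarity over co-rated items); the function names and the toy ratings are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_rating(ratings, user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    numerator = 0.0
    denominator = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        numerator += sim * other_ratings[item]
        denominator += abs(sim)
    # None signals that no other user has rated the item.
    return numerator / denominator if denominator else None


# Toy data (hypothetical): alice has not rated item "C".
toy_ratings = {
    "alice": {"A": 5.0, "B": 3.0},
    "bob": {"A": 5.0, "B": 3.0, "C": 4.0},
    "carol": {"A": 1.0, "B": 5.0, "C": 2.0},
}
prediction = predict_rating(toy_ratings, "alice", "C")
```

Because bob's ratings match alice's exactly while carol's differ, the prediction lands closer to bob's rating of "C" than to carol's. In practice ratings are often mean-centred per user before computing similarity; that step is left out here to keep the sketch short.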
Spark ALS collaborative filtering is a model in which users and products are described by a small set of latent factors that can be used to predict missing entries.
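The latent-factor idea can be sketched with a deliberately tiny alternating least squares loop. This is not Spark's implementation: it uses a single latent factor per user and per item (rank 1) so that each update has a closed form, and the function name, parameters, and toy ratings are all assumptions made for illustration.

```python
import random


def als_rank1(ratings, n_users, n_items, lam=0.1, iterations=20, seed=0):
    """Rank-1 ALS: approximate each rating r_ij by u[i] * v[j].

    `ratings` maps (user_index, item_index) -> rating for observed entries only.
    """
    rng = random.Random(seed)
    u = [rng.uniform(0.5, 1.5) for _ in range(n_users)]
    v = [rng.uniform(0.5, 1.5) for _ in range(n_items)]

    # Index observed ratings by user and by item for the two half-steps.
    by_user = {i: [] for i in range(n_users)}
    by_item = {j: [] for j in range(n_items)}
    for (i, j), r in ratings.items():
        by_user[i].append((j, r))
        by_item[j].append((i, r))

    def objective():
        # Squared error on observed entries plus L2 regularisation.
        err = sum((r - u[i] * v[j]) ** 2 for (i, j), r in ratings.items())
        reg = lam * (sum(x * x for x in u) + sum(x * x for x in v))
        return err + reg

    history = [objective()]
    for _ in range(iterations):
        # Fix item factors, solve the least-squares problem for each user.
        for i in range(n_users):
            num = sum(r * v[j] for j, r in by_user[i])
            den = lam + sum(v[j] ** 2 for j, _ in by_user[i])
            u[i] = num / den
        # Fix user factors, solve the least-squares problem for each item.
        for j in range(n_items):
            num = sum(r * u[i] for i, r in by_item[j])
            den = lam + sum(u[i] ** 2 for i, _ in by_item[j])
            v[j] = num / den
        history.append(objective())
    return u, v, history


# Toy observed ratings (hypothetical); entry (0, 2) is missing.
toy = {(0, 0): 5.0, (0, 1): 3.0, (1, 0): 4.0, (1, 2): 2.0, (2, 1): 2.0, (2, 2): 1.0}
u, v, history = als_rank1(toy, n_users=3, n_items=3)
predicted_missing = u[0] * v[2]  # prediction for the unobserved entry (0, 2)
```

Each half-step exactly minimises the regularised objective with the other side held fixed, so the values in `history` never increase. A real model would use several latent factors per user and item, which turns each update into a small linear system instead of a single division.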
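Study the data and perform some pre-processing for further model building, such as sparse matrix conversion of the data, separating the genres of movies, etc.
IBCF (item-based collaborative filtering) and UBCF (user-based collaborative filtering) models are used for comparison and performance evaluation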
Identifying the algorithms and recommendation model
Evaluation of model
Calculate Accuracy measures
Identifying the probability thresholds (ROC / precision-recall)
Comparison of models and picking the ideal one (which is the best one, IBCF or UBCF?)
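The evaluation steps listed above could be sketched as follows: an accuracy measure (RMSE) on held-out ratings, plus precision and recall at a chosen decision threshold. The function names, the relevance cut-off of 4.0, and the toy numbers are assumptions for illustration and are not results from the MovieLens data.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between true and predicted ratings."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def precision_recall(actual, predicted, relevant_at, threshold):
    """Precision and recall when items with predicted rating >= threshold
    are recommended and items with true rating >= relevant_at are relevant."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a >= relevant_at and p >= threshold)
    fp = sum(1 for a, p in pairs if a < relevant_at and p >= threshold)
    fn = sum(1 for a, p in pairs if a >= relevant_at and p < threshold)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy held-out ratings and model predictions (hypothetical values).
actual = [5.0, 4.0, 2.0, 1.0, 4.0]
predicted = [4.5, 3.0, 3.5, 1.5, 4.0]

error = rmse(actual, predicted)
precision, recall = precision_recall(actual, predicted, relevant_at=4.0, threshold=3.5)

# Sweeping the decision threshold gives the points of a precision-recall curve.
curve = [
    (t, precision_recall(actual, predicted, relevant_at=4.0, threshold=t))
    for t in (2.0, 3.0, 3.5, 4.0)
]
```

Running the same functions on the predictions of the IBCF and UBCF models over the same held-out ratings gives one straightforward basis for the comparison: the model with the lower RMSE and the better precision-recall trade-off at the chosen threshold would be preferred.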