2017-12-12

Motivation

Recommendation is one of the hottest topics in data science and machine learning. It can be used for recommending:

  • movies on Netflix;

  • products on Amazon; and

  • songs on online music services (Spotify or WSDM).

We wanted to explore this further.

Process

The data was taken from a Kaggle competition for the Asian music app WSDM. The files were in .csv format; we loaded them into R and then wrote them to a database.

Challenges:

  • Loading non-Latin characters into the database.
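
The non-Latin character issue usually comes down to declaring the encoding when the CSV is read and checking it before the data reaches the database. The sketch below is one way that step might look in base R. It assumes the files are UTF-8 encoded; the file contents and the column names `song_id` and `artist_name` are hypothetical sample values, not the competition data.

```r
# Hypothetical sample file: two rows with non-Latin artist names, written as raw UTF-8 bytes.
csv_path <- tempfile(fileext = ".csv")
sample_lines <- c(
  "song_id,artist_name",
  "1,\u5468\u6770\u502b",
  "2,\uc544\uc774\uc720"
)
con <- file(csv_path, open = "wb")
writeLines(enc2utf8(sample_lines), con, useBytes = TRUE)
close(con)

# Read the file and mark the strings as UTF-8 (assumption: the source files are UTF-8).
songs <- read.csv(csv_path, encoding = "UTF-8", stringsAsFactors = FALSE)

# Normalise and verify the text column before handing it to a database driver.
songs$artist_name <- enc2utf8(songs$artist_name)
all_valid <- all(validUTF8(songs$artist_name))
```

If `all_valid` is `TRUE`, the data frame can be passed to a database writer such as `DBI::dbWriteTable()`. The database itself also needs a Unicode-capable character set on the target table, which depends on the database used and is not shown here.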

EDA

We ran some statistics on sign-ups over time. Most sign-ups happened over the weekend, with a slight uptick on Friday leading into it, while the rest of the week looked relatively flat.
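
A day-of-week comparison like this can be sketched in base R as follows. The dates are made-up illustrative values, and storing them as `yyyymmdd` integers is an assumption about the input format rather than something taken from the data.

```r
# Hypothetical sign-up dates stored as yyyymmdd integers (illustrative only).
signup_raw <- c(20170106L, 20170107L, 20170107L, 20170107L, 20170108L, 20170109L)
signup_date <- as.Date(as.character(signup_raw), format = "%Y%m%d")

# ISO weekday number: 1 = Monday ... 7 = Sunday. Unlike weekdays(), this is locale independent.
weekday_num <- as.integer(format(signup_date, "%u"))
day_labels <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
weekday <- factor(day_labels[weekday_num], levels = day_labels)

# Count sign-ups per weekday and compute the share that falls on the weekend.
signups_by_day <- table(weekday)
weekend_share <- sum(signups_by_day[c("Sat", "Sun")]) / sum(signups_by_day)
```

On real data, the date range should cover whole weeks, or the counts should be divided by the number of times each weekday occurs, so that no day is over-represented.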

Statistical Analysis

summary(cars)
##      speed           dist       
##  Min.   : 4.0   Min.   :  2.00  
##  1st Qu.:12.0   1st Qu.: 26.00  
##  Median :15.0   Median : 36.00  
##  Mean   :15.4   Mean   : 42.98  
##  3rd Qu.:19.0   3rd Qu.: 56.00  
##  Max.   :25.0   Max.   :120.00

XGBoost Graph