Dataset

We use a TidyTuesday data set comprised of the best hip hop songs off all time. The data set was compile by bbc news and includes a file for polls and a file for rankings.

Our Game Plan

We will use the Poll data set to create recommenders that given a list of favorite songs the recommender with suggest other songs that the user might/should like. We will execute the gameplan by training three alternative models UBCF, IBCF and Random.

EDA

We will perform a some EDA to develop a better understanding of the data.

title n
Juicy 18
Nuthin’ But A ‘G’ Thang 14
The Message 14
Shook Ones (Part II) 13
Fight The Power 11
C.R.E.A.M. 10
93 ’Til Infinity 7
N.Y. State Of Mind 7
Dear Mama 6
Jesus Walks 6

Pools over Time

It appears the majority of pools took place during the 1990s.

## # A tibble: 10 x 7
##    artist                                   n    n1    n2    n3    n4    n5
##    <chr>                                <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
##  1 The Notorious B.I.G.                    22    10     3     4     2     3
##  2 Public Enemy                            18     9     4     2     2     1
##  3 Wu-Tang Clan                            20     6     1     3     7     3
##  4 Grandmaster Flash & The Furious Five    14     5     3     1     0     5
##  5 2Pac                                    13     4     3     1     4     1
##  6 Eric B & Rakim                          11     4     1     5     0     1
##  7 Mobb Deep                               14     4     5     1     2     2
##  8 Nas                                     13     3     5     2     1     2
##  9 The Pharcyde                             6     3     2     0     0     1
## 10 Clipse                                   5     2     0     1     1     1
## # A tibble: 535 x 9
##     rank title artist gender  year critic_name critic_rols critic_country
##    <dbl> <chr> <chr>  <chr>  <dbl> <chr>       <chr>       <chr>         
##  1     1 Term~ Publi~ male    1998 Joseph Aba~ Fat Beats   US            
##  2     2 4th ~ Gza f~ male    1995 Joseph Aba~ Fat Beats   US            
##  3     3 Pete~ Run D~ male    1986 Joseph Aba~ Fat Beats   US            
##  4     4 Play~ GLOBE~ male    2001 Joseph Aba~ Fat Beats   US            
##  5     5 Time~ O.C.   male    1994 Joseph Aba~ Fat Beats   US            
##  6     1 Play~ Slum ~ male    1997 Biba Adams  Critic      US            
##  7     2 Self~ Stop ~ mixed   1989 Biba Adams  Critic      US            
##  8     3 Push~ Salt-~ female  1986 Biba Adams  Critic      US            
##  9     4 Ambi~ 2Pac   male    1996 Biba Adams  Critic      US            
## 10     5 Big ~ JAY-Z~ male    1999 Biba Adams  Critic      US            
## # ... with 525 more rows, and 1 more variable: critic_country2 <chr>
## [1] 0.1551402

Create Binanry Hip Hop Matrix

Create a Training Schema

We create a training schema that utilized 80% of the data.

## Evaluation scheme using all-but-1 items
## Method: 'split' with 1 run(s).
## Training set proportion: 0.800
## Good ratings: NA
## Data set: 107 x 309 rating matrix of class 'binaryRatingMatrix' with 535 ratings.

Train Models

Per our gameplan, we train UBCF, IBCF and Random models.

User-based Filtering Model

## UBCF run fold/sample [model time/prediction time]
##   1  [0.03sec/0.35sec]
## IBCF run fold/sample [model time/prediction time]
##   1  [0.54sec/0.11sec]
## RANDOM run fold/sample [model time/prediction time]
##   1  [0sec/0.02sec]

Evaluate Models

## UBCF run fold/sample [model time/prediction time]
##   1  [0sec/0.25sec] 
## IBCF run fold/sample [model time/prediction time]
##   1  [0.36sec/0.01sec] 
## RANDOM run fold/sample [model time/prediction time]
##   1  [0sec/0.01sec]

ROC Curve Plots

Create Final Models Using Optimized Parameters

We create final models using the best parameter values as derived from the ROC curves.

## Available parameter (with default values):
## k     =  30
## method    =  Jaccard
## normalize_sim_matrix  =  FALSE
## alpha     =  0.5
## verbose   =  FALSE

Make Prediction and Evaluate Accuracy

Let take a look at the accuracy of our three models. Overall, UBCF remains the best model.

TP FP FN TN precision recall TPR FPR
UBCF 0.1818182 9.363636 0.8181818 299.6364 0.0190476 0.1818182 0.1818182 0.0303030
IBCF 0.0909091 9.909091 0.9090909 299.0909 0.0090909 0.0909091 0.0909091 0.0320683
Random 0.0000000 10.000000 1.0000000 299.0000 0.0000000 0.0000000 0.0000000 0.0323625

Create Recommender Engines

Create a List of Favorite Songs

Apply Favorite Song List to Recommendation Engines

Once the Recommendation Engines are given five favorite songs, it recommends a list of song that the user might / should like. I’m not a big hip hop fan, but my son gave the recommender said the UBCF did in fact do the best job.

UBCF Recommendatations

##                         X1
## 1                    Juicy
## 2       N.Y. State Of Mind
## 3          Bring Da Ruckus
## 4                Dear Mama
## 5          Fuck Tha Police
## 6              Jesus Walks
## 7            Lose Yourself
## 8  Nuthin’ But A ‘G’ Thang
## 9     Shook Ones (Part II)
## 10  America’s Most Blunted

IBCF Recommendations

##                                        X1
## 1                                     DNA
## 2         Quiet Storm (Remix ft. Lil Kim)
## 3                       I've Seen Footage
## 4                     Concrete Schoolyard
## 5                        Guess Who’s Back
## 6                           Puppet Master
## 7  Wu-Tang Clan Ain’t Nuthing Ta Fuck Wit
## 8                         Bring Da Ruckus
## 9                           m.A.A.d. city
## 10                 Straight Outta Compton

Random Recommendations

##                                                      X1
## 1                               Nuthin’ But A ‘G’ Thang
## 2                Wu-Tang Clan Ain’t Nuthing Ta Fuck Wit
## 3                                               Push It
## 4  B.I.B.L.E. (Basic Instructions Before Leaving Earth)
## 5                                     Bible On The Dash
## 6                                         m.A.A.d. city
## 7                                           Look At Me!
## 8                                           White Lines
## 9                                           Mass Appeal
## 10                       Da Art of Storytellin’ (Pt. 2)

Business User Experience Goal

As the President of the Hip Hop Artist Organization, one of my objectives is to promote all of our artistS. Market research shows that most Hip Hop fans have 3 to 5 artistS that they mainly follow and support. In an attempt to increae that number and champion all hip hop music, I have asked my crackerjack data scientist to create a Recommender Engine that provides the user with some new content suggestion. The data scientist has accomplished this by creating a Hybrid Recommender that combines two models - UBCF and Random. What’s even better, the Hybrid model has weights that allow me to determine how strongly one model influences the recommendations relative to the other modeling approach. The higher the UBCF weighting the more the recommendations will be like the UBCF model and vice versa.

Hybrid Model with Weight Set to 90% (UBCF) to 10% (RANDOM)

## $`1`
##  [1] "Juicy"                            "Bring Da Ruckus"                 
##  [3] "Dear Mama"                        "Fuck Tha Police"                 
##  [5] "Jesus Walks"                      "Lose Yourself"                   
##  [7] "N.Y. State Of Mind"               "America’s Most Blunted"          
##  [9] "B.O.B."                           "Black Steel In The Hour Of Chaos"
## $`1`
##  [1] "Juicy"              "N.Y. State Of Mind" "Dear Mama"         
##  [4] "Lose Yourself"      "Bring Da Ruckus"    "Ready Or Not"      
##  [7] "California Love"    "I Am I Be"          "Fuck Tha Police"   
## [10] "Jesus Walks"

Hybrid Model with Weight Set to 50% (UBCF) to 50% (RANDOM)

## $`1`
##  [1] "Juicy"                            "Bring Da Ruckus"                 
##  [3] "Dear Mama"                        "Fuck Tha Police"                 
##  [5] "Jesus Walks"                      "Lose Yourself"                   
##  [7] "N.Y. State Of Mind"               "America’s Most Blunted"          
##  [9] "B.O.B."                           "Black Steel In The Hour Of Chaos"
## $`1`
##  [1] "N.Y. State Of Mind"                 "Dear Mama"                         
##  [3] "Double Trouble At The Amphitheatre" "Passin’ Me By"                     
##  [5] "Follow The Leader"                  "C.R.E.A.M."                        
##  [7] "Ready Or Not"                       "Walk This Way"                     
##  [9] "Quiet Storm (Remix)"                "Quiet Storm (Remix ft. Lil Kim)"