Raw Data

Column

Unique Users & Profiles

Below are a few samples of the most reviewed profiles with their review counts, as well as the top users for reviewing profiles along with the number of reviews per user.

Number of Unique Users: 220970

Number of Reviews Per Profile:

profile_id reviews
NA 85611
156148 33389
31116 28398
193687 23649
121859 23639
83773 23113

Number of Reviews per User:

user_id reviews
1 345
2 97
3 20
4 101
5 105
6 96

Distribution of Reviews

Looking at the distributions of both profiles reviewed and reviews by users, there is clearly heavy skew: a handful of profiles receive tens of thousands of reviews, while most receive far fewer.
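
A minimal sketch of how these distributions might be plotted, assuming the counts live in data frames named profile_counts and user_counts (hypothetical names), each with a reviews column:

```r
# Assumed: profile_counts and user_counts each have a `reviews` column
# holding the number of reviews per profile / per user
par(mfrow = c(1, 2))
hist(profile_counts$reviews, breaks = 50,
     main = "Reviews per Profile", xlab = "Reviews")
hist(user_counts$reviews, breaks = 50,
     main = "Reviews per User", xlab = "Reviews")
```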

Smaller Data

Column

Creating Subset of Most Reviewed Profiles

In order to make the data more manageable for local evaluation, I reduced it to the 100 most reviewed profiles.
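
A minimal sketch of the reduction, assuming the raw data sits in a data frame called reviews with user_id, profile_id, and rating columns (all hypothetical names):

```r
library(dplyr)

# Count reviews per profile, dropping the NA profile IDs
top_profiles <- reviews %>%
  filter(!is.na(profile_id)) %>%
  count(profile_id, name = "reviews", sort = TRUE) %>%
  slice_head(n = 100)

# Keep only the reviews that belong to the 100 most reviewed profiles
small_reviews <- reviews %>%
  semi_join(top_profiles, by = "profile_id")
```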

Summary of Most Reviewed Profiles

NA Values for User_ID: 0

Number of Unique User_IDs: 130910

NA Values for Profile_ID: 0

Number of Unique Profile_IDs: 100

NA Values for Rating: 0

Column

Top 12 Profiles & User Reviews

A quick look at the most reviewed profiles in the smaller data set.

profile_id reviews
156148 33389
31116 28398
193687 23649
121859 23639
83773 23113
22319 21387
71636 21284
89855 20634
20737 18550
162707 18224
68989 16591
60983 16253

Total Possible Number of Reviews: 13,091,000 (130,910 users × 100 profiles)

realRatingMatrix

Column

Distribution of Ratings

To get an understanding of the data, it is important to evaluate the distribution of rating values in our final ratings matrix.

The NA values were removed so as not to skew the data. The most common rating is 10, with solid showings at 6 and 8 as well.
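
A sketch of how the matrix and plot might be built with recommenderlab, continuing with the hypothetical small_reviews data frame from the previous section:

```r
library(recommenderlab)

# Drop rows with NA ratings so they do not skew the distribution
small_reviews <- small_reviews[!is.na(small_reviews$rating), ]

# Coerce the (user, item, rating) data frame into a realRatingMatrix
rating_matrix <- as(small_reviews[, c("user_id", "profile_id", "rating")],
                    "realRatingMatrix")

# Plot the distribution of rating values
hist(getRatings(rating_matrix), breaks = 10,
     main = "Distribution of Ratings", xlab = "Rating")
```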

Comparing Models

Column

Setting Scheme & Evaluating Models

As before, I set up an evaluation scheme to compare models prior to building and testing the diversity component.

Two models are compared: Item-Based Collaborative Filtering (IBCF) and Singular Value Decomposition (SVD).

An evaluation results list is created and plotted. Clearly, neither model is particularly good.
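
A sketch of the setup, using the rating_matrix from the previous section; the split ratio, given, and goodRating values here are illustrative assumptions, not the original settings:

```r
# Hold out 20% of users for testing; `given` and `goodRating` are assumptions
scheme <- evaluationScheme(rating_matrix, method = "split", train = 0.8,
                           given = 5, goodRating = 7)

algorithms <- list(
  "IBCF" = list(name = "IBCF", param = NULL),
  "SVD"  = list(name = "SVD",  param = NULL)
)

# Evaluate top-N recommendations for both models at several list lengths
results <- evaluate(scheme, algorithms, type = "topNList",
                    n = c(1, 3, 5, 10, 15, 20))

plot(results, annotate = TRUE)              # ROC curve
plot(results, "prec/rec", annotate = TRUE)  # Precision & Recall curve
```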

Both the ROC curve and the Precision & Recall curve clearly show that the Item-Based recommender is of little use, with all of the different settings producing the same results.

The SVD model is not much different; however, it does show a better baseline performance that improves slightly as n increases.

Based on these graphs, I will be using the Singular Value Decomposition model for the diversity test.

Column

ROC Curve

Precision & Recall Curve

SVD Wins

Column

Make Predictions

Sixty predictions are made from the test data using the Singular Value Decomposition model and saved into predictions and ratings.
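
A sketch of how these might be generated, assuming the scheme from the previous section; the output below shows the first five profile IDs and their predicted ratings for one test user:

```r
# Train SVD on the training portion of the scheme
svd_rec <- Recommender(getData(scheme, "train"), method = "SVD")

# Predict the top 60 profiles for each user in the held-out (known) data
pred <- predict(svd_rec, getData(scheme, "known"), n = 60)

predictions <- getList(pred)   # profile IDs per test user
ratings     <- pred@ratings    # predicted ratings per test user

head(predictions[[1]], 5)
head(ratings[[1]], 5)
```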
[1] "102328" "179192" "61157"  "34328"  "54929" 
[1] 9.012046 8.890652 8.565071 8.538497 8.535509

Adding Diversity

Column

Augmented Sampling

In order to add some diversity to the recommender system, the top three profiles are presented along with two more that are randomly selected from the remaining 57 predictions. The hope is that, by choosing people a bit further down the list, this will improve the odds of finding potential dates who are appealing in ways not covered by the site's questionnaire.
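
A minimal sketch of the sampling, continuing with the hypothetical predictions and ratings lists from the previous section:

```r
# Keep the top 3 predictions and add 2 profiles sampled at random
# from the remaining 57
diversify <- function(profiles, ratings, top = 3, extra = 2) {
  pick <- c(seq_len(top), sample((top + 1):length(profiles), extra))
  list(profiles = profiles[pick], ratings = ratings[pick])
}

# Example: diversified recommendations for test observation 41
div_41 <- diversify(predictions[[41]], ratings[[41]])
```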

Column

Sample Predictions

Both profiles and ratings are generated, and based on the ratings, the diversified predictions seem to be in the ballpark of the originals (although slightly lower).

Test Observation 41 Profiles:

75169, 93891, 54929, 61157, 45992

Test Observation 41 Ratings:

8.372086, 8.3496249, 8.3268807, 8.1493106, 8.2385773

Performance

Column

Diversity Estimation

In order to estimate the difference in variation due to the new recommendation selection algorithm, the mean of each row of the original predictions and of the diversity predictions is calculated. The row-wise differences are then summed and divided by the number of rows to give the mean difference in variation.
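
A sketch of the calculation, assuming the predicted ratings sit in matrices named org_ratings and div_ratings with one row per test user (both names are assumptions):

```r
# Magnitude of the change in mean predicted rating per test user
row_diff <- abs(rowMeans(org_ratings) - rowMeans(div_ratings))

# Mean difference in variation: sum the row-wise differences and
# divide by the number of rows (equivalently, mean(row_diff))
diversity_change <- sum(row_diff) / nrow(org_ratings)
```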

Prediction Variance Increase Due to Diversity: 0.1426675

Based on the method used to select recommendations in the diversity algorithm, there is an average change of 0.14 in the recommended ratings.

While there is no guarantee that this small change in recommendations will have a pragmatic effect on results, clearly some amount of variety has been added by the new selection model.

Column

Online Performance

In the grander scheme of things, all of this background work is a decent starting place; however, the real value of recommender systems is in applied performance.

Whether or not this approach is useful can only be ascertained through extensive monitoring and live, online performance testing.

Were I to use a recommender system like this one live, I would build a number of features into the site to gather data for estimating the utility of the diversity we believe we added. Three features would be useful:

- A post-viewing survey asking whether the recommendation was good (or whether the profile was desirable)
- A timer that tracks how long users spend on each profile, comparing the organic recommendations to the diversity-enhanced recommendations
- A counter that tracks how many times a member views a profile, again comparing the organic recommendations to the diversity recommendations

I think that with these sorts of measurements, we could use A/B testing to figure out whether or not the new algorithm is providing reasonably meaningful suggestions relative to the standard model.