Expand Friend Circle Using Yelp Data

Ganesh Sethuraman
Nov 22, 2015

An approach to expand the friend circle and thereby improve business reach

Introduction

Correlation Pairs of critical user Data

  • The prediction task is to find and recommend new friends using the power of data and the social graph
    • Based on each user's profile, current friends, reviews, connections, etc.
  • This increases the user base of a business (and hence its reach)
  • Friend prediction thereby helps improve the business top line
  • The data lends itself to modelling; the correlations between pairs of key user variables illustrate this.

Data & Graph Data

Yelp: A User Community Graph

  • UserA & UserB: The pair of users, who are either friends or not friends.
  • shortestPathAB: The shortest path between the two users, computed on the friend graph.
  • TotalFriendsCountA & B: A measure of each user's extroversion.
  • Review Count A & B: A measure of each user's extroversion.
  • isFriend: The outcome variable for the classification algorithms
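As a rough illustration of how one row of this data could be derived, the sketch below computes the graph-based shortest path with a breadth-first search and assembles the remaining variables. The function names, the dictionary layout of `friends` and `review_counts`, and the `skip_direct_edge` option are assumptions made for the example; the slides do not say how the direct edge between existing friends was handled when computing shortestPathAB.

```python
from collections import deque


def shortest_path_length(friends, start, goal, skip_direct_edge=False):
    """Breadth-first search over an undirected friend graph.

    `friends` maps a user id to the set of that user's friends.
    Returns the number of hops from start to goal, or None if unreachable.
    With skip_direct_edge=True the start-goal edge itself is ignored
    (an assumption, so that existing friends do not trivially get length 1).
    """
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        user, dist = queue.popleft()
        for neighbor in friends.get(user, ()):
            if skip_direct_edge and user == start and neighbor == goal:
                continue
            if neighbor == goal:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def pair_features(friends, review_counts, user_a, user_b):
    """Build one row of the modelling data for a user pair."""
    return {
        "shortestPathAB": shortest_path_length(
            friends, user_a, user_b, skip_direct_edge=True
        ),
        "totalFriendsCountA": len(friends.get(user_a, ())),
        "totalFriendsCountB": len(friends.get(user_b, ())),
        "reviewCountA": review_counts.get(user_a, 0),
        "reviewCountB": review_counts.get(user_b, 0),
        "isFriend": "Y" if user_b in friends.get(user_a, ()) else "N",
    }


# Toy graph, invented for illustration only (not Yelp data).
friends = {
    "u1": {"u2", "u3"},
    "u2": {"u1", "u3"},
    "u3": {"u1", "u2", "u4"},
    "u4": {"u3"},
}
review_counts = {"u1": 12, "u2": 3, "u3": 40, "u4": 1}
row = pair_features(friends, review_counts, "u1", "u4")
```

For a real data set the same function would be applied to every sampled friend and non-friend pair to build the table that feeds the classifiers.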

Methods & Evaluation

Model Data Preparation

  • Pairs of users who are not friends (isFriend = “N”) are randomly sampled from the available user data for the city
  • The data is split into a training set and a test set
  • Training set accuracy and test set accuracy are used to evaluate the models.
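The preparation steps above can be sketched as follows. This is a minimal illustration, not the author's pipeline: the helper names, the fixed seeds, and the 30% test fraction are assumptions, since the slides do not state the split ratio or the sampling procedure in detail.

```python
import random


def sample_non_friend_pairs(friends, n_pairs, seed=0):
    """Randomly draw distinct user pairs that are not friends (isFriend = "N")."""
    rng = random.Random(seed)
    users = sorted(friends)
    pairs = set()
    attempts = 0
    max_attempts = 100 * n_pairs  # guard against graphs with too few non-friend pairs
    while len(pairs) < n_pairs and attempts < max_attempts:
        attempts += 1
        a, b = rng.sample(users, 2)
        if b in friends.get(a, ()) or a in friends.get(b, ()):
            continue  # already friends, so not a negative example
        pairs.add(tuple(sorted((a, b))))
    return sorted(pairs)


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle the rows and split them into (training set, test set)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true isFriend label."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


# Toy graph, invented for illustration only (not Yelp data).
friends = {
    "u1": {"u2", "u3"},
    "u2": {"u1", "u3"},
    "u3": {"u1", "u2", "u4"},
    "u4": {"u3"},
}
negatives = sample_non_friend_pairs(friends, n_pairs=2)
train_rows, test_rows = train_test_split(list(range(10)))
```

The same `accuracy` function is applied once to the training set predictions and once to the test set predictions; a large gap between the two numbers would suggest overfitting.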

Methods Used

  • Random Forest: A tree-based ensemble method, run with 3 iterations and cross-validation.
  • Gradient Boosted Tree Model: A boosted tree-based ensemble method, run with the same 3 iterations and cross-validation.
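To make the cross-validation step concrete, here is a small sketch of k-fold cross-validation written against a generic `fit`/`predict` pair. It assumes that "3 iterations" corresponds to 3 folds, which the slides do not confirm, and it uses a majority-class placeholder where a random forest or gradient boosted tree would be plugged in; the placeholder is not one of the models described above.

```python
import random


def k_fold_indices(n_rows, k=3, seed=0):
    """Yield (train_indices, held_out_indices) for each of k folds."""
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def cross_validate(rows, labels, fit, predict, k=3, seed=0):
    """Return the mean held-out accuracy across the k folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(rows), k, seed):
        model = fit([rows[i] for i in train_idx], [labels[i] for i in train_idx])
        correct = sum(predict(model, rows[i]) == labels[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


# Placeholder model (assumption): always predicts the most common training label.
def fit_majority(rows, labels):
    return max(sorted(set(labels)), key=labels.count)


def predict_majority(model, row):
    return model


# Invented toy rows: (shortestPathAB, isFriend) style data, all labelled "Y".
rows = [[2], [3], [2], [4], [2], [3]]
labels = ["Y"] * 6
mean_accuracy = cross_validate(rows, labels, fit_majority, predict_majority, k=3)
```

Swapping `fit_majority` and `predict_majority` for wrappers around a tree ensemble keeps the evaluation loop unchanged, which makes the two methods directly comparable on the same folds.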

Results & Discussion

Metric             | Random Forest | Gradient Boosted Tree
Training Accuracy  | 0.7412235     | 0.7782541
Test Accuracy      | 0.7782541     | 0.7661692
  • Test Set Accuracy: The test set accuracy is better with the gradient boosted tree, at 76.6%.
  • Observation 1: Whether a user pair could be friends can be identified from the available and derived data.
    • Tips and check-ins would help in identifying potential pairs
  • Observation 2: This helps businesses and Yelp build cohesive communities.