Elite Yelp Users: Checklist Attributes from Data

Max K. Goff
21 November 2015

Yelp Elite Users

Fewer than 1% of Yelp users have ever achieved elite status

  • Yelp blog: “… there is no set check list …” to achieve elite status.
  • Can text mining and analysis of user activity create a check list?
  • Can a Yelp Elite User predictive model yield a 95% Confidence Interval?

Project Paper:

GitHub project assets: https://github.com/maxgoff/YelpDataScienceProject.git

Methods and Data

The Yelp Academic Data Sets were ingested, flattened, tidied, combined, and modified, resulting in a test set to facilitate predictive modeling:

FieldName         RType      VariableType
isElite           factor     Response
Review Count      numeric    Independent
Votes Count       integer    Independent
Fans              integer    Independent
Ave Review Len    integer    Independent
Average Stars     numeric    Independent
Flesch Kincaid    numeric    Independent
Friends Count     numeric    Independent
Days Yelping      integer    Independent
Compliments       integer    Independent
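As an illustration of the flattening step, here is a minimal Python sketch that turns one nested user record into the flat fields listed above. It is not the project's own code. The input key names (`yelping_since`, `elite`, `votes`, `friends`, `compliments`) are assumptions about the raw record layout, the output names are hypothetical, and the example record is invented. Ave Review Len and Flesch Kincaid are left out because they would have to be computed from review text rather than from the user record.

```python
from datetime import date

def flatten_user(user, as_of):
    """Flatten one nested user record into flat modeling fields.

    The input key names are assumptions about the raw data layout,
    not confirmed by the slides.
    """
    year, month = (int(part) for part in user["yelping_since"].split("-")[:2])
    return {
        "isElite": len(user["elite"]) > 0,              # any elite year on record
        "review_count": user["review_count"],
        "votes_count": sum(user["votes"].values()),     # all vote types combined
        "fans": user["fans"],
        "average_stars": user["average_stars"],
        "friends_count": len(user["friends"]),
        "days_yelping": (as_of - date(year, month, 1)).days,
        "compliments": sum(user["compliments"].values()),
    }

# Hypothetical record, invented for illustration only.
example = {
    "yelping_since": "2012-02",
    "elite": [2013, 2014],
    "review_count": 120,
    "votes": {"funny": 10, "useful": 40, "cool": 15},
    "fans": 7,
    "average_stars": 3.8,
    "friends": ["a", "b", "c"],
    "compliments": {"cool": 4, "note": 2},
}

row = flatten_user(example, as_of=date(2015, 11, 21))
print(row)
```

Summing the vote and compliment sub-counts is one possible way to arrive at single "Votes Count" and "Compliments" columns; the slides do not say how those totals were formed.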

Elite Users Do Better

Elite over Regular Ratios

  • Post 6.3 times as many reviews as regular users
  • Receive 12.47 times as many Compliments
  • Have 10.57 times as many Fans
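A ratio of this kind can be sketched as the elite group's average divided by the regular group's average. The slides do not state whether means or medians were used, so the use of means here is an assumption, and the four users below are invented purely to show the arithmetic.

```python
from statistics import mean

def elite_over_regular(users, field):
    """Ratio of the elite group's mean to the regular group's mean for one field."""
    elite = [u[field] for u in users if u["is_elite"]]
    regular = [u[field] for u in users if not u["is_elite"]]
    return mean(elite) / mean(regular)

# Hypothetical users, not taken from the Yelp data.
users = [
    {"is_elite": True, "review_count": 120, "fans": 10},
    {"is_elite": True, "review_count": 80, "fans": 6},
    {"is_elite": False, "review_count": 10, "fans": 1},
    {"is_elite": False, "review_count": 30, "fans": 3},
]

ratio_reviews = elite_over_regular(users, "review_count")  # 100 / 20
ratio_fans = elite_over_regular(users, "fans")             # 8 / 2
print(ratio_reviews, ratio_fans)
```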

Derived Check List for Elite Status

Activity                 Performance Level
Write a review           5 or more per month
Characters per review    at least 1095
Target reading level     5th grade
Vote on other reviews    at least 16.2 per month
Get a compliment         at least 2.5 per month
Make a friend            at least 1 per month
Get a fan                at least 1 per quarter
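The check list above can be expressed as a small lookup of minimums. The following Python sketch uses the thresholds from the table; the key names and the example user are hypothetical. The reading-level row is omitted because the table gives it as a target rather than a minimum.

```python
# Minimums taken from the check list above; key names are hypothetical.
CHECKLIST = {
    "reviews_per_month": 5,
    "chars_per_review": 1095,
    "votes_per_month": 16.2,
    "compliments_per_month": 2.5,
    "friends_per_month": 1,
    "fans_per_quarter": 1,
}

def unmet_items(activity):
    """Return the check list items whose minimum the user has not reached."""
    return [
        item
        for item, minimum in CHECKLIST.items()
        if activity.get(item, 0) < minimum
    ]

# Hypothetical user activity, invented for illustration.
activity = {
    "reviews_per_month": 6,
    "chars_per_review": 900,
    "votes_per_month": 20,
    "compliments_per_month": 2.5,
    "friends_per_month": 1,
    "fans_per_quarter": 0,
}

unmet = unmet_items(activity)
print(unmet)
```

Meeting every item does not guarantee elite status; the list only describes activity levels typical of elite users in the data.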

13 classification models tested; selected: caret::xgbTree

  • Model Confidence interval: 95%
  • Sensitivity: 98.9% (User is not Elite)
  • Specificity: 83.95% (User is Elite)
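For reference, sensitivity and specificity come from the four cells of a confusion matrix. The parenthetical labels above suggest that "not Elite" was treated as the positive class, which this Python sketch assumes. The counts are invented and are not the project's results.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Compute sensitivity and specificity from confusion matrix counts.

    tp, fn: positive-class cases predicted correctly / incorrectly
    tn, fp: negative-class cases predicted correctly / incorrectly
    """
    sensitivity = tp / (tp + fn)  # share of positives correctly identified
    specificity = tn / (tn + fp)  # share of negatives correctly identified
    return sensitivity, specificity

# Hypothetical counts, assuming "not Elite" is the positive class.
sens, spec = sensitivity_specificity(tp=900, fn=100, tn=80, fp=20)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Because elite users are rare, the class treated as "positive" matters when reading these two numbers: under this assumption, the lower specificity figure is the one that describes how well elite users are detected.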