Characterizing Yelp Users

Data Science
Capstone Project

Question

Users are grouped by the average number of stars they have given to the businesses they reviewed: A-users (4.5 to 5.0), B-users (3.5 to 4.5), C-users (3.0 to 3.5), and D-users (0.0 to 3.0).

  • What characteristics do the groups share, as measured by their number of reviews, votes (funny/useful/cool), compliments, friends, and fans?
  • How do the groups differ?
  • How do these characteristics change by business category?

The answers are of interest to businesses that want to target customers based on these characteristics.
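The grouping rule above can be sketched in a few lines of Python. This is only an illustration: the function name `user_group` and the sample ratings are hypothetical. The slide's ranges share their endpoints (4.5, 3.5, 3.0) and do not say which group a boundary value belongs to, so the sketch assumes it goes to the higher group.

```python
from statistics import mean


def user_group(avg_stars: float) -> str:
    """Map a user's average star rating to a group label (A, B, C, or D).

    Assumption: a value on a shared boundary (4.5, 3.5, 3.0) is assigned
    to the higher group; the source does not specify tie handling.
    """
    if not 0.0 <= avg_stars <= 5.0:
        raise ValueError("average stars must lie between 0.0 and 5.0")
    if avg_stars >= 4.5:
        return "A"
    if avg_stars >= 3.5:
        return "B"
    if avg_stars >= 3.0:
        return "C"
    return "D"


# Hypothetical star ratings per user (not taken from the Yelp data).
ratings = {
    "u1": [5, 5, 4],
    "u2": [4, 3, 4],
    "u3": [3, 3, 4],
    "u4": [1, 2, 3],
}

# Average each user's ratings, then assign the group.
groups = {user: user_group(mean(stars)) for user, stars in ratings.items()}
print(groups)
```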

Methodology & Results (1 of 2)

Statistical modeling used: quantile regression and LOESS regression.

Quantile regression: models conditional quantiles (such as the median) rather than the mean, so it suits skewed data.
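To illustrate why quantiles cope with skew, here is a minimal sketch of the loss that quantile regression minimizes (the "pinball" loss), reduced to the simplest case of fitting a single constant with no predictors. It is not the model used in the project; the review counts and function names are hypothetical.

```python
from statistics import mean, median


def pinball_loss(residual: float, tau: float) -> float:
    """Quantile (pinball) loss for one residual at quantile level tau."""
    return tau * residual if residual >= 0 else (tau - 1) * residual


def total_loss(values, q, tau):
    """Sum of pinball losses when every value is predicted by the constant q."""
    return sum(pinball_loss(v - q, tau) for v in values)


def best_constant(values, tau):
    """Constant that minimizes the total pinball loss.

    The loss is piecewise linear in q, so a minimizer always lies at one of
    the observed values; checking those candidates is enough.
    """
    return min(sorted(set(values)), key=lambda q: total_loss(values, q, tau))


# Hypothetical, heavily right-skewed review counts with one extreme user.
review_counts = [1, 2, 2, 3, 4, 5, 8, 12, 950]

median_fit = best_constant(review_counts, tau=0.5)   # tau = 0.5 gives the median
q25_fit = best_constant(review_counts, tau=0.25)     # lower quartile

print("mean:", mean(review_counts))   # pulled far upward by the outlier
print("median fit:", median_fit)      # stays with the bulk of the users
print("0.25-quantile fit:", q25_fit)
```

A full quantile regression replaces the constant `q` with a linear function of the predictors and minimizes the same loss.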

[Figure: quantile regression results]

Methodology & Results (2 of 2)

LOESS regression: fits smooth local curves, so it suits data exhibiting non-linearity.
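The idea behind LOESS can be sketched as a locally weighted linear fit: for each target point, nearby observations get tricube weights and a weighted straight line is fitted to them. The version below is a simplified, standard-library-only illustration (no robustness iterations), not the routine used in the project; the `span` default, the function names, and the sample data are assumptions.

```python
def tricube(u: float) -> float:
    """Tricube kernel: weight decays smoothly to 0 as |u| approaches 1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def loess_point(xs, ys, x0, span=0.5):
    """Estimate y at x0 with a locally weighted linear fit.

    span is the fraction of points that defines the local neighbourhood.
    """
    n = len(xs)
    k = max(3, int(round(span * n)))           # neighbourhood size
    dists = sorted(abs(x - x0) for x in xs)
    h = max(dists[min(k, n) - 1], 1e-12)       # bandwidth: k-th nearest distance

    # Weighted sums, with x centred on x0 so the intercept is the estimate.
    sw = sdx = sdxx = sy = sdxy = 0.0
    for x, y in zip(xs, ys):
        w = tricube((x - x0) / h)
        dx = x - x0
        sw += w
        sdx += w * dx
        sdxx += w * dx * dx
        sy += w * y
        sdxy += w * dx * y

    if sw == 0:
        raise ValueError("span too small: no points received weight")
    denom = sw * sdxx - sdx * sdx
    if abs(denom) < 1e-12:                     # degenerate: fall back to weighted mean
        return sy / sw
    return (sdxx * sy - sdx * sdxy) / denom    # intercept of the local line


def loess(xs, ys, span=0.5):
    """Smooth ys by evaluating the local fit at every observed x."""
    return [loess_point(xs, ys, x0, span) for x0 in xs]


# Hypothetical non-linear relationship: growth that levels off.
xs = list(range(20))
ys = [x ** 0.5 for x in xs]
smoothed = loess(xs, ys, span=0.5)
print([round(v, 2) for v in smoothed])
```

Because each local fit is a straight line, exactly linear data is reproduced unchanged, while curved data is followed without choosing a global functional form.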

[Figure: LOESS regression results]

Conclusion

  1. What are the user groups' common characteristics?
    • User groups B and C are quite similar to each other, as are groups A and D, in terms of number of reviews, friends, fans, votes received, and compliments received.
  2. What are their differences?
    • User group D is hardly popular or influential on any of the characteristics mentioned above.
    • User group C is slightly less influential than user group B.
  3. How do these characteristics change by business category?
    • These characteristics do not change across the business categories examined (food and service businesses).