Capstone Project Presentation

Marty Gaupp
Nov 2015

Problem Statement: Are bad ratings (those receiving 1 star) more useful than good ratings (those receiving 5 stars), or vice versa?

I will analyze the Yelp reviews dataset to answer this question so that I can help users of Yelp determine what types of reviews they should trust more - good ratings or bad ratings.

Methods and Data - Exploratory Analysis

Results of exploratory analysis on the votes.useful variable in the Reviews dataset

Figure: Star Box Plot (votes.useful by star rating)

Too hard to tell which rating is more useful, so turn to statistics…

Methods and Data - Aggregate Level

Count useful votes for each star rating - determine relative usefulness:

\[ \mbox{Relative_Usefulness} = \frac{\mbox{Useful_Vote_Count}}{\mbox{Rating_Count}} \]

The resulting data and hypothesis test results:

stars	Rating_Count	Useful_Vote_Count	Relative_Usefulness
1	159,811	210,546	1.32
5	579,527	564,130	0.97
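As a quick arithmetic check, the Relative_Usefulness column can be recomputed from the two counts in the table above. This is a minimal Python sketch; the dictionary layout and the function name `relative_usefulness` are illustrative choices, not part of the original analysis.

```python
# Counts copied from the table above (1 star and 5 star rows).
counts = {
    1: {"rating_count": 159_811, "useful_vote_count": 210_546},
    5: {"rating_count": 579_527, "useful_vote_count": 564_130},
}


def relative_usefulness(useful_vote_count, rating_count):
    """Useful votes received per rating (illustrative helper)."""
    return useful_vote_count / rating_count


ratios = {
    stars: relative_usefulness(c["useful_vote_count"], c["rating_count"])
    for stars, c in counts.items()
}

for stars, ratio in ratios.items():
    print(f"{stars} star: {ratio:.2f}")
```

Rounded to two decimals, the ratios match the table: 1.32 for 1 star and 0.97 for 5 stars.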

\[ \begin{array}{l} \mbox{H}_0: \mbox{Relative Usefulness}_1 \leq \mbox{Relative Usefulness}_5 \\ \mbox{H}_1: \mbox{Relative Usefulness}_1 > \mbox{Relative Usefulness}_5 \\ \mbox{test stat: } 55.647 \\ \mbox{p-value: } \approx 0 \mbox{ therefore reject H}_0 \mbox{ and conclude H}_1 \\ \end{array} \]
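The slides do not say which test produced the statistic above. One plausible form, shown here only as an assumption, is a one-sided large-sample z-test on the mean number of useful votes per review. The function name `one_sided_z_test` and the two short vote lists are hypothetical and exist only to make the sketch runnable; the normal approximation is reasonable only for large samples such as the hundreds of thousands of reviews in the dataset.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def one_sided_z_test(sample_a, sample_b):
    """Test H1: mean(sample_a) > mean(sample_b) using a large-sample z statistic.

    Assumed form of the test; the original analysis may have used another one.
    """
    standard_error = sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    z = (mean(sample_a) - mean(sample_b)) / standard_error
    # Upper-tail probability under the standard normal distribution.
    p_value = 1.0 - NormalDist().cdf(z)
    return z, p_value


# Hypothetical useful-vote counts per review, for illustration only.
one_star_votes = [3, 0, 2, 5, 1, 4, 2, 3]
five_star_votes = [1, 0, 1, 2, 0, 1, 0, 1]

z, p_value = one_sided_z_test(one_star_votes, five_star_votes)
print(f"z = {z:.3f}, one-sided p-value = {p_value:.4f}")
```

A large positive z with a small p-value leads to rejecting H0, which is the same decision rule the slide applies.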

In aggregate, 1 star ratings receive more useful votes per rating than 5 star ratings

Methods and Data - Business Level

Determine usefulness counts/percents at the business level - results:

OneCntBetter	OnePercBetter	FiveCntBetter	FivePercBetter	NumOfBusinesses
13,530	18,901	30,724	26,067	60,785
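A tally like the one above could be built along the following lines. This is a sketch under stated assumptions: "Cnt" is taken to compare raw useful-vote counts per business, "Perc" is taken to compare useful votes per rating, and ties or businesses missing one of the two ratings are not credited to either side. The review tuples and business ids are hypothetical.

```python
from collections import defaultdict

# Hypothetical review records: (business_id, stars, useful_votes).
reviews = [
    ("biz_a", 1, 4), ("biz_a", 5, 1), ("biz_a", 5, 1),
    ("biz_b", 1, 0), ("biz_b", 5, 3),
    ("biz_c", 1, 2), ("biz_c", 1, 2), ("biz_c", 5, 3),
]

# Per business and star rating: [rating count, useful vote count].
totals = defaultdict(lambda: {1: [0, 0], 5: [0, 0]})
for business_id, stars, useful_votes in reviews:
    if stars in (1, 5):
        totals[business_id][stars][0] += 1
        totals[business_id][stars][1] += useful_votes

tally = {"OneCntBetter": 0, "OnePercBetter": 0,
         "FiveCntBetter": 0, "FivePercBetter": 0}

for per_star in totals.values():
    (n1, u1), (n5, u5) = per_star[1], per_star[5]

    # Count comparison: raw number of useful votes (ties credit neither side).
    if u1 > u5:
        tally["OneCntBetter"] += 1
    elif u5 > u1:
        tally["FiveCntBetter"] += 1

    # Percent comparison: useful votes per rating, only when both ratings exist.
    if n1 and n5:
        if u1 / n1 > u5 / n5:
            tally["OnePercBetter"] += 1
        elif u5 / n5 > u1 / n1:
            tally["FivePercBetter"] += 1

tally["NumOfBusinesses"] = len(totals)
print(tally)
```

In this toy data, biz_c favors 1 star by count but 5 star by rate, which shows why the count and percent columns need not agree.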

Conduct hypothesis tests on counts & percents:

\[ \begin{array}{l} \mbox{H}_0: \mbox{# 1 Star Counts/Percents More Useful} \geq \mbox{# 5 Star Counts/Percents More Useful} \\ \mbox{H}_1: \mbox{# 1 Star Counts/Percents More Useful} < \mbox{# 5 Star Counts/Percents More Useful} \\ \mbox{test stat: } -125.443 \mbox{ and } -48.408 \\ \mbox{p-value: } \approx 0 \mbox{ and } \approx 0 \mbox{ therefore reject H}_0 \mbox{ and conclude H}_1 \\ \end{array} \]

In both cases, 5 star ratings are more useful than 1 star ratings

Results and Discussion

Contradictory results
- In aggregate: 1 star ratings are more useful than 5 star ratings
- At business level: 5 star ratings are more useful than 1 star ratings
Contradiction due to Simpson's paradox
- Statistical result that appears within the individual groups of data but then reverses when the groups are combined
Overall conclusion
- Best to trust ratings at the individual business level
  - 5 star ratings tend to be more useful than 1 star ratings
- But… if it's a close result, might still have to take a gamble
  - Look at the text of the reviews
  - Look at the recency of the reviews - trust more current ones
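
The reversal described under Simpson's paradox can be reproduced with a small made-up example. In the sketch below, both businesses and all of their counts are hypothetical: 5 star ratings earn more useful votes per rating at each business, yet 1 star ratings come out ahead once the businesses are pooled, because most 1 star ratings sit at the business where every review attracts many votes.

```python
# Hypothetical (rating_count, useful_vote_count) per business and star rating.
businesses = {
    "busy_biz": {1: (90, 270), 5: (10, 32)},
    "quiet_biz": {1: (10, 2), 5: (90, 27)},
}


def rate(rating_count, useful_vote_count):
    """Useful votes per rating."""
    return useful_vote_count / rating_count


# Which star rating is more useful at each individual business?
per_business_winner = {
    name: 5 if rate(*by_star[5]) > rate(*by_star[1]) else 1
    for name, by_star in businesses.items()
}

# Pool rating counts and useful votes across all businesses.
aggregate = {
    stars: tuple(
        sum(by_star[stars][i] for by_star in businesses.values())
        for i in (0, 1)
    )
    for stars in (1, 5)
}
aggregate_winner = 5 if rate(*aggregate[5]) > rate(*aggregate[1]) else 1

print("Per business:", per_business_winner)
print("Aggregate:", aggregate, "winner:", aggregate_winner)
```

Here the per-business rates are 3.2 vs 3.0 and 0.3 vs 0.2 in favor of 5 stars, while the pooled rates are 2.72 for 1 star and 0.59 for 5 stars.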