George Liu
November 2015
This study uses the Yelp dataset to answer the following questions:
What are the lowest-rated categories? Can Yelp data be used to explore business opportunities? If so, what are the top pain points in the lowest-rated industries from a consumer's standpoint?
A 5-step process is used:
1. Get the data: use the “jsonlite” package to read in and restructure the data.
2. Clean the data: decide on a list of categories to use and filter out the other categories.
3. Analyze the data: calculate group means and medians.
4. Prepare the data: extract review text data for the real estate category.
5. Perform text mining: find term frequency and correlation to infer hot topics.
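Steps 3 and 5 can be illustrated with a minimal base R sketch. This is not the study's actual code: the data frame `businesses`, the vector `reviews`, the stop-word list, and all values in them are hypothetical toy data invented for illustration, and the tokenization is a simple base R stand-in for whatever text-mining tooling the study used.

```r
# Hypothetical toy data standing in for the restructured Yelp business table.
businesses <- data.frame(
  category = c("Real Estate", "Real Estate", "Restaurants",
               "Restaurants", "Auto Repair"),
  stars    = c(2.0, 3.0, 4.0, 4.5, 3.5),
  stringsAsFactors = FALSE
)

# Step 3: group means and medians of star ratings per category.
mean_stars   <- tapply(businesses$stars, businesses$category, mean)
median_stars <- tapply(businesses$stars, businesses$category, median)
rating_summary <- data.frame(
  category = names(mean_stars),
  mean     = as.vector(mean_stars),
  median   = as.vector(median_stars),
  stringsAsFactors = FALSE
)

# Sort ascending so the lowest-rated category comes first.
rating_summary  <- rating_summary[order(rating_summary$mean), ]
lowest_category <- rating_summary$category[1]

# Hypothetical review texts for the lowest-rated category.
reviews <- c(
  "The agent never returned my calls",
  "Hidden fees and the agent was rude",
  "Slow maintenance and hidden fees"
)

# Step 5a: term frequency. Lower-case, split on non-letters, drop stop words.
tokens <- lapply(strsplit(tolower(reviews), "[^a-z]+"),
                 function(w) w[nzchar(w)])
stopwords <- c("the", "my", "and", "was")  # illustrative list only
words <- unlist(tokens)
words <- words[!(words %in% stopwords)]
term_freq <- sort(table(words), decreasing = TRUE)

# Step 5b: term correlation, computed on per-review presence indicators.
has_term <- function(term) {
  as.numeric(sapply(tokens, function(w) term %in% w))
}
hidden_fees_cor <- cor(has_term("hidden"), has_term("fees"))

print(rating_summary)
print(term_freq)
print(hidden_fees_cor)
```

Terms that are both frequent and strongly correlated with each other (here the toy pair "hidden" and "fees") are the kind of signal that can be read as a candidate hot topic. On real review data, a proper stop-word list and stemming would likely be needed before the counts become meaningful.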
The study shows that Yelp data can be used to gather insights about business opportunities, and that the real estate category has numerous pain points. Some major ones include:
These can then be considered by industry players to improve service quality, design new services or create business strategies.