Paul Lim Min Chim
20 Nov 2015
Sentiment Analysis of Yelp Restaurant Reviews
Introduction
Sentiment analysis is the computational study of opinions, sentiments and emotions expressed in text. It has many applications and is useful for social media monitoring, tracking of business reviews and business analytics.
Problem Statement
Using sentiment analysis, I hope to answer the following questions about the Yelp Dataset:
Who will find this interesting?
This may be of interest to business owners who want to identify their business strengths and weaknesses based on customer reviews. Such analysis can also be used to predict how key review phrases may influence the review rating, and ultimately impact the business.
Methods
For this capstone project, the focus is on restaurant review data from the Yelp Dataset. A sample of 1,000 restaurant reviews is taken, from which I build the corpus and term-document matrix:
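The original code for this step is not reproduced here. As a rough illustration only, the sampling and term-document matrix step could be sketched in Python as follows. The function names (`sample_reviews`, `tokenize`, `term_document_matrix`), the fixed random seed and the simple regex tokenizer are all assumptions made for this sketch, not the tooling used in the project.

```python
import random
import re
from collections import Counter


def sample_reviews(reviews, k=1000, seed=0):
    """Draw up to k reviews at random (seed is fixed only for repeatability)."""
    rng = random.Random(seed)
    return rng.sample(reviews, min(k, len(reviews)))


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes (no stemming)."""
    return re.findall(r"[a-z']+", text.lower())


def term_document_matrix(docs):
    """Return (terms, matrix), where matrix[term][j] is the count of term in docs[j]."""
    counts = [Counter(tokenize(doc)) for doc in docs]
    terms = sorted(set().union(*counts))
    # A Counter returns 0 for missing keys, so absent terms get a zero count.
    matrix = {term: [c[term] for c in counts] for term in terms}
    return terms, matrix


if __name__ == "__main__":
    # Hypothetical review texts, used only to show the shape of the output.
    docs = ["The food was great", "Great service, great food"]
    terms, matrix = term_document_matrix(docs)
    for term in terms:
        print(term, matrix[term])
```

Every term is kept in the matrix, which matches the choice in this project to consider all words rather than drop sparse terms.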
I skip stemming and the removal of sparse terms in order to consider all words from the reviews. I then build and sort data frames of 4-gram tokens, and plot bar plots and wordclouds for visualization and analysis. The most frequently used phrases, compared across reviews with different star ratings, are used to answer the questions.
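To make the 4-gram step concrete, here is a minimal Python sketch of counting and ranking 4-word phrases per star rating. It is an illustration under stated assumptions rather than the project's own code: the function names, the `(stars, text)` input format and the regex tokenizer are hypothetical, and the plotting step is left out.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes (no stemming)."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n=4):
    """Return every run of n consecutive tokens, joined into one phrase."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams_by_stars(rated_reviews, n=4, k=10):
    """Map each star rating to its k most frequent n-grams, sorted by count.

    rated_reviews is an iterable of (stars, text) pairs (an assumed format).
    """
    by_stars = defaultdict(Counter)
    for stars, text in rated_reviews:
        by_stars[stars].update(ngrams(tokenize(text), n))
    return {stars: counts.most_common(k) for stars, counts in by_stars.items()}


if __name__ == "__main__":
    # Hypothetical reviews, used only to show how the function is called.
    reviews = [
        (5, "I will definitely be back soon"),
        (5, "We will definitely be back"),
        (1, "I will never go back again"),
    ]
    for stars, phrases in sorted(top_ngrams_by_stars(reviews, k=3).items()):
        print(stars, phrases)
```

The sorted `(phrase, count)` pairs for each rating are the kind of table that can then be fed to a bar plot or wordcloud.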
Exploratory Analysis & Results
The full report and plots can be viewed here.
[Figures: plots not preserved in this copy (labeled Figure 1 to Figure 4 in the source)]
Discussion & Conclusion
Most-frequently used phrases:
Motivation for writing reviews: