In today’s competitive business world, gathering and using customer feedback efficiently is essential to giving users the products they need. Timely, useful feedback informs decision-making and the product roadmap. Traditional questionnaires and surveys cost too much time and money. In the big data era, with social media platforms such as Twitter, Facebook, and Yelp, analyzing and visualizing the sentiment of people’s everyday comments about products is fast and intuitive. We can not only gauge their attitude toward a product but also learn the specific problems they are facing. We can further gain insight into how reviews are distributed by customers’ geolocation.
The ultimate goal of the project is to suggest how to improve products by building a sentiment analysis model on real-time data gathered from the Twitter Streaming API and the Yelp API. The first experiment starts with a set of tweets, where people talk about anything and everything. Pre-processing steps remove noise from the text data. Sentiment analysis then classifies statements as positive, negative, or neutral, and identifies the emotions people express about the product. Finally, an interactive visualization is built to show the results of the exploratory data analysis.
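To make the positive, negative, and neutral labels concrete, here is a minimal sketch of a lexicon-based classifier. It is a toy stand-in, not the model used in this project: the word lists `POSITIVE` and `NEGATIVE` and the function `polarity_label` are hypothetical names invented for illustration, and a real model would use a much larger lexicon or a trained classifier.

```python
# Toy word lists, assumed for illustration only; not the project's lexicon.
POSITIVE = {"good", "great", "love", "fast", "happy"}
NEGATIVE = {"bad", "slow", "hate", "broken", "angry"}


def polarity_label(text: str) -> str:
    """Label text by counting positive words minus negative words."""
    tokens = text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    # No sentiment words, or an equal number of each.
    return "neutral"


if __name__ == "__main__":
    for tweet in ("i love the great camera",
                  "battery is bad and slow",
                  "it arrived on tuesday"):
        print(tweet, "->", polarity_label(tweet))
```

A simple count like this ignores negation ("not good") and sarcasm, which is one reason noise removal and a stronger model matter in practice.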
(Data is here)
Tweets contain noise, such as retweet entities, @mentions, punctuation, numbers, and HTML links. To eliminate its impact, pre-processing steps are needed before the text is passed to the sentiment analysis model. Besides positive and negative polarity scores, text analysis also gives insight into people’s emotions. Here are some visualizations of what I classified from the tweets. (Note: some of the negative reviews cannot yet be categorized into any of the emotions.)
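The pre-processing steps described above can be sketched with regular expressions from the Python standard library. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the function name `clean_tweet`, the order of the steps, and the choice to lowercase and keep only letters are all assumptions.

```python
import html
import re

# Patterns for the noise types listed above (all assumed for illustration).
URL_RE = re.compile(r"(?:https?://|www\.)\S+")                 # links
RT_RE = re.compile(r"^\s*RT\s+@\w+:?\s*", re.IGNORECASE)       # leading "RT @user:"
MENTION_RE = re.compile(r"@\w+")                               # @mentions
NON_LETTER_RE = re.compile(r"[^a-z\s]")                        # punctuation, digits, symbols
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text: str) -> str:
    """Return a lowercased tweet with links, retweet markers, mentions,
    punctuation, and numbers removed."""
    text = html.unescape(text)          # "&amp;" -> "&"
    text = URL_RE.sub(" ", text)        # remove links before stripping punctuation
    text = RT_RE.sub(" ", text)         # remove the retweet prefix
    text = MENTION_RE.sub(" ", text)    # remove remaining @mentions
    text = text.lower()
    text = NON_LETTER_RE.sub(" ", text) # also strips "#", keeping the hashtag word
    return WHITESPACE_RE.sub(" ", text).strip()


if __name__ == "__main__":
    raw = "RT @bob: Loving my new phone!!! 100% http://t.co/abc123 &amp; @alice agrees"
    print(clean_tweet(raw))
```

Links are removed first because stripping punctuation earlier would break a URL into word-like fragments that survive the later steps. Note that dropping every non-letter also splits contractions ("don't" becomes "don t"), so a real pipeline may want to handle apostrophes separately.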
A much fancier interactive visualization is here