Sentiment Analysis: Cincinnati Zoo vs. Newport Aquarium
Introduction
Cincinnati is a city with many attractions, venues, and restaurants that offer plenty of ways to entertain families. Two of the most visited places in the greater Cincinnati area are the Cincinnati Zoo and the Newport Aquarium in northern Kentucky. This report examines the sentiment surrounding these two similar attractions to see how they differ in customer perception online. The report intends to answer three main questions:
Key Questions:
How do the general attitudes differ between the Cincinnati Zoo and the Newport Aquarium?
What emotions are associated with each attraction?
How do sentiments for each attraction change by day of the week or time of the year?
Data Collection
The data used for this sentiment analysis comes from Yelp reviews of both the Cincinnati Zoo and the Newport Aquarium. About 240 reviews were collected for each attraction. This helps us see exactly what customers are saying about each venue and whether the sentiments and emotions associated with each differ.
The scraped Yelp reviews form an unstructured data set of text, so some data cleansing was needed before any analysis could be performed. The pages were first saved to a .csv file, and then the following data cleansing steps were performed:
The review dates were parsed using the lubridate package so chronological analysis could be performed.
Data sets for both places were combined into one data frame, with “Venue” as a new column specifying which venue each review belongs to.
The data was then broken down to the word level, and stop words were removed to avoid skewing the analysis.
The NRC and Bing lexicons were both used in this analysis. For the Bing analysis, further transformations were made to create sentiment counts.
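The cleansing steps above can be sketched roughly as follows. This is only an illustration in plain Python, not the report's actual workflow (which appears to rely on R packages such as lubridate). The sample rows, the column names, the date format, and the short stop-word list are all assumptions made for the example.

```python
import re
from datetime import datetime

# Hypothetical raw rows standing in for the scraped .csv files; the real
# column names and date format are not given in the report.
zoo_raw = [{"date": "7/14/2019",
            "text": "The giraffes were amazing and the kids loved it"}]
aquarium_raw = [{"date": "1/4/2020",
                 "text": "The shark tank was crowded but fun"}]

# Tiny illustrative stop-word list; a real analysis would use a full one.
STOP_WORDS = {"the", "and", "was", "were", "but", "it", "a", "of", "to"}


def tidy_reviews(rows, venue):
    """Parse dates, tag the venue, and emit one record per non-stop word."""
    tokens = []
    for row in rows:
        # Step 1: turn the date string into a real date object.
        review_date = datetime.strptime(row["date"], "%m/%d/%Y").date()
        # Step 3: break the text into lowercase words and drop stop words.
        for word in re.findall(r"[a-z']+", row["text"].lower()):
            if word not in STOP_WORDS:
                tokens.append(
                    {"venue": venue, "date": review_date, "word": word}
                )
    return tokens


# Step 2: combine both venues into one collection with a "venue" field.
tokens = tidy_reviews(zoo_raw, "Zoo") + tidy_reviews(aquarium_raw, "Aquarium")
```

Each record in `tokens` then carries the venue, the parsed review date, and a single word, which is the one-word-per-row shape that the lexicon steps work from.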
Question 1: General Attitudes
Question: How do the general attitudes differ between the Cincinnati Zoo and the Newport Aquarium?
By answering this question, we can view the public perception of each attraction based on which words are used most often in its Yelp reviews and on whether those reviews are positive or negative. To do so, we use the Bing lexicon to create sentiment counts for each word that appears. We can then look into the positive and negative words associated with each attraction as well as the general sentiment.
The above bar graph shows the positive and negative words that appeared at least 20 times in the reviews for the zoo and the aquarium, along with the number of times each appears. Many words appear in the reviews for both venues, and the majority of them are positive. The aquarium has somewhat more negative words, although some of these may not truly be negative. Words like “shark” and “tank” are classified as negative in the Bing lexicon, but they are words you would expect to see in a review of an aquarium. Based on this result, these words are excluded so that they do not skew further analysis.
Even after those words are excluded, the aquarium still has more negative words, such as “expensive” and “crowded”, appearing in the data. The volume of these negative words could point to the aquarium having a slightly worse public perception.
Above is a bar graph that shows the total number of positive and negative words that appear for each venue. Note that words like “shark”, “tank”, and “tanks” have been excluded based on the results from the previous graph. Both venues clearly have similar, generally good public perceptions, with far more positive words than negative ones. However, the aquarium has both more negative words and fewer positive ones. This could point to a slightly more negative public perception of the aquarium than of the zoo.
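To make the counting in this section concrete, here is a minimal sketch in plain Python of how sentiment totals change once the domain words are excluded. The small lexicon is a hypothetical stand-in for the Bing lexicon, and the token list is invented toy data; only the exclusion of “shark”, “tank”, and “tanks” comes from the report itself.

```python
from collections import Counter

# Hypothetical stand-in for the Bing lexicon: word -> sentiment label.
TOY_LEXICON = {
    "amazing": "positive",
    "fun": "positive",
    "expensive": "negative",
    "crowded": "negative",
    "shark": "negative",
    "tank": "negative",
    "tanks": "negative",
}

# Words the report excludes because they are expected in aquarium reviews.
DOMAIN_WORDS = frozenset({"shark", "tank", "tanks"})

# Invented (venue, word) pairs, as produced by a tokenizing step.
tokens = [
    ("Zoo", "amazing"), ("Zoo", "fun"), ("Zoo", "crowded"),
    ("Zoo", "giraffe"),
    ("Aquarium", "shark"), ("Aquarium", "tank"), ("Aquarium", "expensive"),
    ("Aquarium", "crowded"), ("Aquarium", "fun"),
]


def sentiment_totals(tokens, lexicon, exclude=frozenset()):
    """Count positive and negative words per venue, skipping excluded words."""
    totals = Counter()
    for venue, word in tokens:
        if word in exclude:
            continue
        sentiment = lexicon.get(word)  # None when the word is not scored
        if sentiment is not None:
            totals[(venue, sentiment)] += 1
    return totals


raw_totals = sentiment_totals(tokens, TOY_LEXICON)
filtered_totals = sentiment_totals(tokens, TOY_LEXICON, exclude=DOMAIN_WORDS)
```

In this toy data, the aquarium's negative count is inflated by “shark” and “tank” in `raw_totals` and drops in `filtered_totals`, which mirrors the reason the report gives for removing those words.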
Question 2: Emotional Sentiments
Question: What emotions are associated with each attraction?
To answer this question, the NRC lexicon will be used to determine which emotions are associated with each word in the reviews. This will give insight into the emotions each venue brings out in people, identifying strong and weak points for each and how they compare.
This column chart shows the total number of words associated with each emotion for each attraction. Again, the zoo and the aquarium are quite similar overall, but there are some slight differences. The biggest difference is in the joy and positive categories. The zoo is higher in both, pointing to more positive feelings about the zoo in the public eye. Once again, the aquarium is slightly more negative, and it has more fear and anticipation than the zoo. This suggests that the public feels a little more stressed and worried when it comes to the aquarium. These findings highlight something the aquarium could aim to improve.
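The emotion tally described above differs from the positive/negative count in one respect: a single word can belong to several categories at once. The sketch below, in plain Python, shows that one-to-many counting. The lexicon entries and tokens are invented for illustration and are not actual NRC entries.

```python
from collections import Counter

# Hypothetical stand-in for the NRC lexicon: each word maps to a set of
# categories. These entries are invented, not taken from the real lexicon.
TOY_EMOTIONS = {
    "love": {"joy", "positive"},
    "fun": {"joy", "positive", "anticipation"},
    "shark": {"fear", "negative"},
    "wait": {"anticipation"},
}

# Invented (venue, word) pairs.
tokens = [
    ("Zoo", "love"), ("Zoo", "fun"), ("Zoo", "giraffe"),
    ("Aquarium", "shark"), ("Aquarium", "wait"), ("Aquarium", "fun"),
]


def emotion_counts(tokens, lexicon):
    """Count words per (venue, emotion); one word may add to several."""
    counts = Counter()
    for venue, word in tokens:
        for emotion in lexicon.get(word, ()):  # unscored words add nothing
            counts[(venue, emotion)] += 1
    return counts


counts = emotion_counts(tokens, TOY_EMOTIONS)
```

Because one word can add to several categories, the per-emotion totals can sum to more than the number of scored words, so the columns in such a chart are best compared between venues rather than added together.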
Question 3: Sentiments During Different Times
Question: How do sentiments for each attraction change by day of the week or time of the year?
To answer this question, the reviews will be grouped by the day of the week and the month in which they were posted. Assuming that a reviewer posts the review on the day of the visit or a day or two afterward, we can use this analysis to determine whether there are certain days or times of the year when each attraction is successful or could improve.
Above is a column chart of the total positivity scores for each venue by day of the week. The positivity score is the total number of positive words minus the total number of negative words. Here we can see a few more differences between the venues. The zoo seems to have higher positivity overall, but there is one day, Saturday, on which the aquarium has more positivity. This could highlight mid-week success for the aquarium, as people leave reviews in the days following their visit. The zoo seems to excel on Wednesday, which points to early-week and possibly weekend success. Higher positivity on Sunday points to success on days like Friday and Saturday for both venues. Wednesday shows the biggest difference: the zoo seems to be successful while the aquarium struggles. This could point to weekday struggles for the aquarium and a need for improvement there.
The above column chart shows the same analysis as the previous one, but by month instead of day of the week. The aquarium seems to dominate the winter months in the early part of the year, which is expected since it is an indoor venue and the zoo is primarily outdoors. The inverse is true for the summer, when the zoo seems to excel, although the aquarium still has a higher score in June. This could highlight a shortcoming for the zoo, as the summer should be a great time for it.
For the latter half of the year, both venues have a low in September. This highlights an area of improvement for both venues, as schools are going back into session but the weather is still nice. The winter months are even, except for December. This again makes sense, since the zoo hosts its Festival of Lights for Christmas.
This graph shows how the zoo peaks in the summer months and in December, while the aquarium peaks in January and again toward the summer. This showcases a need for both venues to increase positivity while school is in session and to build on the periods in which they are already excelling.
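The positivity score used throughout this section (positive words minus negative words) can be grouped by weekday or by month with the same small function. The following plain-Python sketch assumes each scored word already carries its venue, review date, and sentiment label. The records are invented toy data and the function name is hypothetical.

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ("Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday")

# Invented (venue, review date, sentiment) records for illustration.
scored_words = [
    ("Zoo", date(2019, 7, 14), "positive"),       # a Sunday
    ("Zoo", date(2019, 7, 14), "positive"),
    ("Zoo", date(2019, 7, 17), "negative"),       # a Wednesday
    ("Aquarium", date(2020, 1, 4), "positive"),   # a Saturday
    ("Aquarium", date(2020, 1, 4), "negative"),
    ("Aquarium", date(2020, 1, 4), "positive"),
]


def positivity(records, period):
    """Return positive-minus-negative word counts per (venue, period)."""
    scores = defaultdict(int)
    for venue, review_date, sentiment in records:
        key = (venue, period(review_date))
        if sentiment == "positive":
            scores[key] += 1
        elif sentiment == "negative":
            scores[key] -= 1
    return dict(scores)


# Group by weekday name and by calendar month number (1 = January).
by_weekday = positivity(scored_words, lambda d: WEEKDAYS[d.weekday()])
by_month = positivity(scored_words, lambda d: d.month)
```

Passing the grouping rule in as a function keeps the score definition in one place, so the weekday and month views are guaranteed to use the same arithmetic.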
Conclusion
Overall, the sentiments and emotions associated with the Cincinnati Zoo and the Newport Aquarium are quite similar. The zoo seems to be more positive overall, and it has more positive emotions associated with it. The fear that some people associate with the aquarium seems to lower its public perception. While both venues perform similarly throughout the year, the Cincinnati Zoo seems to be perceived slightly better than the aquarium.