The data used for this analysis were gathered from the hotel review site TripAdvisor. The data set contains information about individuals who left reviews on the TripAdvisor website for the Ritz-Carlton hotel at Central Park in New York City. The variables include the reviewer name, the date the review was written, the rating, the content of the review, and the date of stay at the hotel.
The data can be found using the link provided below:
The goal of this analysis is to better understand what distinguishes a good review from a bad one. Do factors such as the date of stay or the date of review affect the overall rating? I will also conduct a sentiment analysis to get a sense of the general feeling toward this particular hotel. It is important to note that, on review sites like this one, studies have shown that individuals are more likely to leave negative reviews than positive ones. Guests who enjoyed their experience appear less likely to leave a review than those who did not.
At first glance, April and December appear to have only extremely high ratings. This is due to the limited number of observations in each month: only 1 for April and 1 for December. February and March have more observations (3 and 5, respectively), so we expect to see a little more variation there. There is more variation in February than in March, and February also has a lower average rating.
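The monthly comparison above comes down to grouping ratings by month of stay and reporting the count, mean, and spread for each group. A minimal sketch of that summary follows. The ratings are made up for illustration (only the per-month counts mirror the ones quoted above), and the function name `summarize_by_month` is hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical (month_of_stay, rating) pairs. These are NOT the actual
# review data; only the number of rows per month matches the text.
reviews = [
    ("February", 5), ("February", 2), ("February", 4),
    ("March", 5), ("March", 4), ("March", 5), ("March", 5), ("March", 4),
    ("April", 5),
    ("December", 5),
]

def summarize_by_month(rows):
    """Return {month: (count, mean rating, population std dev)}."""
    grouped = defaultdict(list)
    for month, rating in rows:
        grouped[month].append(rating)
    return {
        month: (len(ratings), mean(ratings), pstdev(ratings))
        for month, ratings in grouped.items()
    }

summary = summarize_by_month(reviews)
for month, (n, avg, spread) in summary.items():
    print(f"{month:<9} n={n}  mean={avg:.2f}  sd={spread:.2f}")
```

Printing the count next to the mean makes it obvious when a month's "perfect" average rests on a single review, which is the caveat that applies to April and December.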
When looking at the month in which the reviewer left the review, we expect to see a similar trend. Most individuals do not write a review months after their stay. However, some individuals may have stayed late in the month and written the review after arriving home, which could push the review into the following month. The results in this visual approximately match the previous visual, with February having the largest variation and March having a higher average score.
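One way to check the assumption that reviews are written in the same month as the stay, or shortly after, is to compute the lag in calendar months between the two dates. The sketch below assumes the date of stay is recorded as a month (represented here by the first day of that month). The dates are invented for illustration and `month_lag` is a hypothetical helper.

```python
from datetime import date

# Hypothetical (stay_month, review_date) pairs; NOT the actual review data.
pairs = [
    (date(2019, 2, 1), date(2019, 2, 20)),  # reviewed in the month of stay
    (date(2019, 2, 1), date(2019, 3, 2)),   # late stay, reviewed next month
    (date(2019, 3, 1), date(2019, 3, 15)),
]

def month_lag(stay, review):
    """Number of calendar months from the stay month to the review month."""
    return (review.year - stay.year) * 12 + (review.month - stay.month)

lags = [month_lag(stay, review) for stay, review in pairs]
same_month_share = sum(lag == 0 for lag in lags) / len(lags)
print(lags, f"{same_month_share:.0%} reviewed in the month of stay")
```

If most lags are 0 or 1, the two monthly charts should look alike, with small shifts from reviews that spill into the next month.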
The visual below shows which words appeared most often in our reviews and whether they are positive or negative. Using this package, it is clear that all of the words that appeared more than three times are positive. The reviewers seem to have enjoyed how clean, nice, comfortable, and quiet their stay was. This is to be expected for a hotel that charges $1,200 a night per room on the low end.
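The idea behind this kind of chart is to match each word in the reviews against a sentiment lexicon and count the matches. The sketch below is a stand-in for that process, not the package used in this analysis. The tiny lexicon and the review snippets are invented for illustration, and `sentiment_word_counts` is a hypothetical name.

```python
import re
from collections import Counter

# Tiny hand-made lexicon for illustration only. A real analysis would rely
# on a published sentiment lexicon rather than this list.
LEXICON = {
    "clean": "positive", "nice": "positive",
    "comfortable": "positive", "quiet": "positive",
    "dirty": "negative", "rude": "negative",
}

# Hypothetical review snippets; NOT taken from the actual reviews.
reviews = [
    "The room was clean and quiet, and the bed was comfortable.",
    "Nice staff, clean lobby, quiet floor.",
    "Clean, comfortable and nice. One rude guest at the bar.",
]

def sentiment_word_counts(texts, lexicon):
    """Count (word, sentiment) pairs for every lexicon word in the texts."""
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word in lexicon:
                counts[(word, lexicon[word])] += 1
    return counts

counts = sentiment_word_counts(reviews, LEXICON)
min_count = 3  # keep only words seen at least this many times in the toy data
frequent = {key: n for key, n in counts.items() if n >= min_count}
print(frequent)
```

The frequency cutoff matters: with a high threshold, rare negative words drop out of the chart, so an all-positive chart does not by itself mean that no negative words were used.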
In the visual below, the largest words are those that appear most often throughout the reviews. Once again, most of them are positive. Most guests seem to appreciate how close the hotel is to Central Park, along with all of the great things around the location. Staff is another word that appears frequently. Notice the name Eric: coupled with the word staff, I would imagine that the hotel has a staff member named Eric whom some guests really appreciated.
Similar to the visual above, this chart provides the counts of all words that were mentioned more than 4 times. The top four words do not provide much insight, but the words that follow, such as service, bar, location, and view, all have a relatively positive association. These words can help us better understand what people enjoy and value, and how companies can better provide for their customers.
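A word-count chart like this one can be sketched as tokenizing the reviews, dropping common stopwords, and keeping the words above a frequency cutoff. The snippet below is a minimal illustration under those assumptions. The stopword list and review snippets are invented, and `frequent_words` is a hypothetical helper rather than the function used here.

```python
import re
from collections import Counter

# Small illustrative stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "and", "was", "is", "of", "to", "in", "at",
             "we", "our", "with", "from"}

# Hypothetical review snippets; NOT taken from the actual reviews.
reviews = [
    "The service at the bar was great and the view of the park was great.",
    "Great location, great service.",
    "Service with a view.",
]

def frequent_words(texts, min_count, stopwords=STOPWORDS):
    """Return (word, count) pairs with count > min_count, most common first."""
    words = (
        word
        for text in texts
        for word in re.findall(r"[a-z']+", text.lower())
        if word not in stopwords
    )
    counts = Counter(words)
    return [(word, n) for word, n in counts.most_common() if n > min_count]

top_words = frequent_words(reviews, min_count=2)
print(top_words)
```

If the top of the list is dominated by generic words, adding them to the stopword set is one possible way to bring more informative words such as service or location to the top.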