Load libraries

# Add any libraries and general settings up here.
# I suggest you start with these two libraries, since you'll probably use them:
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(here)
## here() starts at /Users/heeseoyun/Downloads/Final_example (1)
library(dplyr)
library(readr)  
library(ggplot2)
library(wordcloud)
## Loading required package: RColorBrewer
library(tidytext)
library(tidyr)
library(stringr)
library(RColorBrewer)

Executive summary

What is (are) your main question(s)? What is your story? What does the final graphic show?

This research seeks to answer the following question: “What factors are most influential in enhancing customer satisfaction for tourists staying in hotels in Paris, France?”

The study uncovers these factors using the Booking.com dataset of 515,000 customer reviews of European hotels, restricted to hotels in Paris. By examining textual data (positive and negative reviews) and numerical data (ratings and word counts), it identifies the key drivers of satisfaction and dissatisfaction. The insights aim to support hoteliers' strategic decisions by highlighting areas that need improvement and reinforcing the aspects of their service that already work well.

Techniques like keyword extraction, sentiment analysis, and visualization tools are employed. Word clouds provide a high-level view of frequently mentioned keywords, while bar graphs and scatter plots analyze relationships between satisfaction factors and average scores.

By addressing this question, the research contributes to the sustainable growth of the hospitality industry in Paris, especially post-pandemic. Hotels can use the findings to enhance customer satisfaction, strengthen market competitiveness, and meet evolving expectations for service quality.

Data background

Explain where the data came from, what agency or company made it, how it is structured, what it shows, etc.

This study uses the “515K Hotel Reviews Data in Europe” dataset from Kaggle, whose reviews were sourced from Booking.com. The dataset contains over 515,000 reviews and ratings for 1,493 luxury hotels, offering both numerical and textual insights into customer satisfaction.

Selected Variables and Their Relevance: The analysis draws on the numerical scores and review word counts together with the text of the positive and negative reviews, a selection meant to balance quantitative and qualitative insight. Numerical scores quantify satisfaction levels, while the review text uncovers the specific reasons behind customer opinions. Together, these elements let the analysis address the research question from both sides.
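As a rough sketch of how such a subset could be pulled out, the chunk below keeps only score, word-count, and review-text columns. The column names come from the dataset itself, but the exact selection is an assumption made for illustration, and `toy_reviews` is a hypothetical two-row stand-in built from the dataset preview (one review string is shortened).

```r
library(dplyr)

# Hypothetical stand-in for the full data: two Hotel Arena rows from the
# dataset preview, with the long positive review shortened for display.
toy_reviews <- data.frame(
  Hotel_Name = c("Hotel Arena", "Hotel Arena"),
  Reviewer_Nationality = c(" Ireland ", " Poland "),
  Average_Score = c(7.7, 7.7),
  Reviewer_Score = c(7.5, 6.7),
  Negative_Review = c(
    "No Negative",
    " Backyard of the hotel is total mess shouldn t happen in hotel with 4 stars "
  ),
  Review_Total_Negative_Word_Counts = c(0L, 17L),
  Positive_Review = c(
    " No real complaints the hotel was great great location surroundings rooms amenities and service",
    " Good restaurant with modern design great chill out place Great park nearby the hotel and awesome main stairs "
  ),
  Review_Total_Positive_Word_Counts = c(105L, 20L),
  stringsAsFactors = FALSE
)

# Keep the score, word-count, and review-text columns; drop everything else.
# This particular selection is an assumption, not the report's confirmed list.
selected_reviews <- toy_reviews %>%
  select(
    Hotel_Name,
    Average_Score,
    Reviewer_Score,
    Positive_Review,
    Negative_Review,
    Review_Total_Positive_Word_Counts,
    Review_Total_Negative_Word_Counts
  )

print(names(selected_reviews))
```

The same `select()` call would apply unchanged to the full data frame once it is loaded.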

Data Cleaning

Describe and show how you cleaned and reshaped the data

Load Data

Load the “515K Hotel Reviews Data in Europe” file. The data is imported for analysis and verified by displaying the first few rows.

# Load the data using relative path
hotel_reviews <- read.csv("Hotel_Reviews.csv", stringsAsFactors = FALSE)

# View the first few rows to verify
head(hotel_reviews)
##                                               Hotel_Address
## 1  s Gravesandestraat 55 Oost 1092 AA Amsterdam Netherlands
## 2  s Gravesandestraat 55 Oost 1092 AA Amsterdam Netherlands
## 3  s Gravesandestraat 55 Oost 1092 AA Amsterdam Netherlands
## 4  s Gravesandestraat 55 Oost 1092 AA Amsterdam Netherlands
## 5  s Gravesandestraat 55 Oost 1092 AA Amsterdam Netherlands
## 6  s Gravesandestraat 55 Oost 1092 AA Amsterdam Netherlands
##   Additional_Number_of_Scoring Review_Date Average_Score  Hotel_Name
## 1                          194    8/3/2017           7.7 Hotel Arena
## 2                          194    8/3/2017           7.7 Hotel Arena
## 3                          194   7/31/2017           7.7 Hotel Arena
## 4                          194   7/31/2017           7.7 Hotel Arena
## 5                          194   7/24/2017           7.7 Hotel Arena
## 6                          194   7/24/2017           7.7 Hotel Arena
##   Reviewer_Nationality
## 1              Russia 
## 2             Ireland 
## 3           Australia 
## 4      United Kingdom 
## 5         New Zealand 
## 6              Poland 
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           Negative_Review
## 1  I am so angry that i made this post available via all possible sites i use when planing my trips so no one will make the mistake of booking this place I made my booking via booking com We stayed for 6 nights in this hotel from 11 to 17 July Upon arrival we were placed in a small room on the 2nd floor of the hotel It turned out that this was not the room we booked I had specially reserved the 2 level duplex room so that we would have a big windows and high ceilings The room itself was ok if you don t mind the broken window that can not be closed hello rain and a mini fridge that contained some sort of a bio weapon at least i guessed so by the smell of it I intimately asked to change the room and after explaining 2 times that i booked a duplex btw it costs the same as a simple double but got way more volume due to the high ceiling was offered a room but only the next day SO i had to check out the next day before 11 o clock in order to get the room i waned to Not the best way to begin your holiday So we had to wait till 13 00 in order to check in my new room what a wonderful waist of my time The room 023 i got was just as i wanted to peaceful internal garden view big window We were tired from waiting the room so we placed our belongings and rushed to the city In the evening it turned out that there was a constant noise in the room i guess it was made by vibrating vent tubes or something it was constant and annoying as hell AND it did not stop even at 2 am making it hard to fall asleep for me and my wife I have an audio recording that i can not attach here but if you want i can send it via e mail The next day the technician came but was not able to determine the cause of the disturbing sound so i was offered to change the room once again the hotel was fully booked and they had only 1 room left the one that was smaller but seems newer 
## 2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             No Negative
## 3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           Rooms are nice but for elderly a bit difficult as most rooms are two story with narrow steps So ask for single level Inside the rooms are very very basic just tea coffee and boiler and no bar empty fridge 
## 4                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   My room was dirty and I was afraid to walk barefoot on the floor which looked as if it was not cleaned in weeks White furniture which looked nice in pictures was dirty too and the door looked like it was attacked by an angry dog My shower drain was clogged and the staff did not respond to my request to clean it On a day with heavy rainfall a pretty common occurrence in Amsterdam the roof in my room was leaking luckily not on the bed you could also see signs of earlier water damage I also saw insects running on the floor Overall the second floor of the property looked dirty and badly kept On top of all of this a repairman who came to fix something in a room next door at midnight was very noisy as were many of the guests I understand the challenges of running a hotel in an old building but this negligence is inconsistent with prices demanded by the hotel On the last night after I complained about water damage the night shift manager offered to move me to a different room but that offer came pretty late around midnight when I was already in bed and ready to sleep 
## 5                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   You When I booked with your company on line you showed me pictures of a room I thought I was getting and paying for and then when we arrived that s room was booked and the staff told me we could only book the villa suite theough them directly Which was completely false advertising After being there we realised that you have grouped lots of rooms on the photos together leaving me the consumer confused and extreamly disgruntled especially as its my my wife s 40th birthday present Please make your website more clear through pricing and photos as again I didn t really know what I was paying for and how much it had wnded up being Your photos told me I was getting something I wasn t Not happy and won t be using you again 
## 6                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             Backyard of the hotel is total mess shouldn t happen in hotel with 4 stars 
##   Review_Total_Negative_Word_Counts Total_Number_of_Reviews
## 1                               397                    1403
## 2                                 0                    1403
## 3                                42                    1403
## 4                               210                    1403
## 5                               140                    1403
## 6                                17                    1403
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       Positive_Review
## 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   Only the park outside of the hotel was beautiful 
## 2  No real complaints the hotel was great great location surroundings rooms amenities and service Two recommendations however firstly the staff upon check in are very confusing regarding deposit payments and the staff offer you upon checkout to refund your original payment and you can make a new one Bit confusing Secondly the on site restaurant is a bit lacking very well thought out and excellent quality food for anyone of a vegetarian or vegan background but even a wrap or toasted sandwich option would be great Aside from those minor minor things fantastic spot and will be back when i return to Amsterdam 
## 3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      Location was good and staff were ok It is cute hotel the breakfast range is nice Will go back 
## 4                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      Great location in nice surroundings the bar and restaurant are nice and have a lovely outdoor area The building also has quite some character 
## 5                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     Amazing location and building Romantic setting 
## 6                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       Good restaurant with modern design great chill out place Great park nearby the hotel and awesome main stairs 
##   Review_Total_Positive_Word_Counts Total_Number_of_Reviews_Reviewer_Has_Given
## 1                                11                                          7
## 2                               105                                          7
## 3                                21                                          9
## 4                                26                                          1
## 5                                 8                                          3
## 6                                20                                          1
##   Reviewer_Score
## 1            2.9
## 2            7.5
## 3            7.1
## 4            3.8
## 5            6.7
## 6            6.7
##                                                                                                                                  Tags
## 1                                                         [' Leisure trip ', ' Couple ', ' Duplex Double Room ', ' Stayed 6 nights ']
## 2                                                         [' Leisure trip ', ' Couple ', ' Duplex Double Room ', ' Stayed 4 nights ']
## 3 [' Leisure trip ', ' Family with young children ', ' Duplex Double Room ', ' Stayed 3 nights ', ' Submitted from a mobile device ']
## 4                                                  [' Leisure trip ', ' Solo traveler ', ' Duplex Double Room ', ' Stayed 3 nights ']
## 5                                  [' Leisure trip ', ' Couple ', ' Suite ', ' Stayed 2 nights ', ' Submitted from a mobile device ']
## 6                                                           [' Leisure trip ', ' Group ', ' Duplex Double Room ', ' Stayed 1 night ']
##   days_since_review      lat      lng
## 1            0 days 52.36058 4.915968
## 2            0 days 52.36058 4.915968
## 3            3 days 52.36058 4.915968
## 4            3 days 52.36058 4.915968
## 5           10 days 52.36058 4.915968
## 6           10 days 52.36058 4.915968
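Because the review columns hold long free text, the `head()` printout above is padded to the width of the longest review and is hard to scan. A minimal sketch of a more compact preview follows, assuming it is acceptable to truncate the text for display only. `toy_reviews` is a hypothetical three-row stand-in copied from rows 4 to 6 of the preview, and the 40-character width is an arbitrary choice.

```r
library(dplyr)
library(stringr)

# Hypothetical stand-in: rows 4 to 6 of the preview, three columns only.
toy_reviews <- data.frame(
  Hotel_Name = c("Hotel Arena", "Hotel Arena", "Hotel Arena"),
  Reviewer_Score = c(3.8, 6.7, 6.7),
  Positive_Review = c(
    " Great location in nice surroundings the bar and restaurant are nice and have a lovely outdoor area The building also has quite some character ",
    " Amazing location and building Romantic setting ",
    " Good restaurant with modern design great chill out place Great park nearby the hotel and awesome main stairs "
  ),
  stringsAsFactors = FALSE
)

# Collapse stray whitespace, then cut each review to at most 40 characters
# (the ellipsis counts toward the width). The original data is left untouched.
preview <- toy_reviews %>%
  mutate(Positive_Review = str_trunc(str_squish(Positive_Review), width = 40))

print(preview)
```

For a quick structural check of the full data frame, `glimpse(hotel_reviews)` is another option, since it prints one line per column instead of one block per group of columns.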

Filtering Data 1: Paris Hotels

To focus the analysis on Paris, the dataset is filtered to include only hotels located in the city. Rows where the Hotel_Address column contains the word “Paris” are extracted. This shrinks the dataset, which speeds up later computation and keeps the analysis relevant to the research question.
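One caveat with plain substring matching is that it would also keep any address that merely mentions “Paris”, for example a street named after the city in another country. Whether the dataset contains such addresses is not known from the preview, so the sketch below is only a sanity check. It compares the substring match with a stricter match that requires the address to end in “Paris France”, the pattern seen in the Paris rows of the preview. The third address is invented purely for illustration.

```r
library(dplyr)
library(stringr)

# Hypothetical stand-in: two addresses from the preview plus one INVENTED
# address that contains "Paris" only as a street name.
toy_addresses <- data.frame(
  Hotel_Address = c(
    "s Gravesandestraat 55 Oost 1092 AA Amsterdam Netherlands",
    "1 3 Rue d Argentine 16th arr 75116 Paris France",
    "10 Rue de Paris 1000 Vienna Austria"  # invented for illustration
  ),
  stringsAsFactors = FALSE
)

# Substring match: keeps any address mentioning "Paris" anywhere.
loose_match <- toy_addresses %>%
  filter(grepl("Paris", Hotel_Address, ignore.case = TRUE))

# Stricter match: the address must end in "Paris France" after trimming
# and collapsing whitespace.
strict_match <- toy_addresses %>%
  filter(str_detect(str_squish(Hotel_Address), "Paris France$"))

print(nrow(loose_match))
print(nrow(strict_match))
```

If both approaches return the same number of rows on the real data, the simpler substring filter is sufficient.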

# Filter the data for rows where the Hotel_Address contains "Paris"
paris_hotels <- hotel_reviews %>%
  filter(grepl("Paris", Hotel_Address, ignore.case = TRUE))

# View the first few rows of the filtered data
head(paris_hotels)
##                                     Hotel_Address Additional_Number_of_Scoring
## 1 1 3 Rue d Argentine 16th arr 75116 Paris France                           26
## 2 1 3 Rue d Argentine 16th arr 75116 Paris France                           26
## 3 1 3 Rue d Argentine 16th arr 75116 Paris France                           26
## 4 1 3 Rue d Argentine 16th arr 75116 Paris France                           26
## 5 1 3 Rue d Argentine 16th arr 75116 Paris France                           26
## 6 1 3 Rue d Argentine 16th arr 75116 Paris France                           26
##   Review_Date Average_Score          Hotel_Name Reviewer_Nationality
## 1   6/29/2017           8.4 Monhotel Lounge SPA              Brazil 
## 2   4/25/2017           8.4 Monhotel Lounge SPA          Luxembourg 
## 3   4/18/2017           8.4 Monhotel Lounge SPA      United Kingdom 
## 4  11/11/2016           8.4 Monhotel Lounge SPA             Belgium 
## 5   8/28/2016           8.4 Monhotel Lounge SPA               Qatar 
## 6   8/28/2016           8.4 Monhotel Lounge SPA        Saudi Arabia 
##                                                                                                                                                                                                                                                                                                                                                                                             Negative_Review
## 1                                                                                                                                                                                                                                                                                                                                                                                               No Negative
## 2  Not only did the staff on arrival ask to copy the details of my credit card this is OK but they also noted down the CVV number from the back of the credit card This means that the hotel has all the information to use my credit card for internet purchases Since this is unacceptable I asked the receptionist to delete the CVV but she refused This policy should be changed Otherwise good hotel 
## 3                                                                                                                                                                                                                                                                                                                                                                                                       N A
## 4                                                                                          The sauna is a wonderfull addition it s a shame I had to slalom in the cleaning staff halway almost filled with cleaning products Also the proximity to the cleaning staff s room added all the noise to what was supposed to be a quiet moment maybe it would be a good idea to relocate their general quarter 
## 5                                                                                                                                                                                                                                                                                                                                              They have maintenance and I couldn t enjoy the balcony view 
## 6                                                                                                                                                                                                                                                                                             Reception not impressive building was under construction Room passage was way to small Room has also no space
##   Review_Total_Negative_Word_Counts Total_Number_of_Reviews
## 1                                 0                     171
## 2                                76                     171
## 3                                 3                     171
## 4                                60                     171
## 5                                13                     171
## 6                                19                     171
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            Positive_Review
## 1  Nice hotel Room was beautiful and bed very comfortable Did not expect big rooms in Paris so size was really ok with 2 windows Bathroom modern and new with a GREAT shower Staff was wonderfull since reservation They sent me an email prior to my arrival asking if they could help me in anyway and also asking if I had any special needs such as extra beds I had a last minute health problem in the family and had to arrive one day earlier and stay for 1 night less Audrey was wonderful and managed to receive me before and dealt herself with Booking to change my reservation I sent a note to her when I was entering my flight from Cannes to Paris and when I left the airplane she had already sorted everything out They even charged me a lower rate as on the day I arrived the room costed a little less In this same day Saturday June 25 I went out with friends and arrived really hungry at the hotel at 1 30AM and the guy from reception can t believe I forgot his name Maybe Hadesmi managed to open the restaurant and get to my room a nice aspargus risoto bread yougurt and a warm smile even thought riom service finishes at midnight Really had a very nice experience at Monhotel and recomend it 
## 2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              No Positive
## 3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         Fantastic weekend with my partner We Would definitely stay here again Location was excellent and couldn t have asked for a better area to stay in Was so close to the metro and all the shops bars and restaurants Room was lovely and cosey and very modern with a classy edge to the facility 
## 4                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               Good location really good breakfast maybe the best I ever had in Paris in a confortable and stylish setting Everything you need is provided on site you can literally arrive with your hands in your pockets Helpfull and attentive staff 
## 5                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       I asked to be downgraded because I was paying for a balcony room while I couldn t enjoy the balcony Instead they upgraded me with the biggest room in the hotel for the same rate 
## 6                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         The room has everything inside as a full package
##   Review_Total_Positive_Word_Counts Total_Number_of_Reviews_Reviewer_Has_Given
## 1                               231                                          1
## 2                                 0                                          1
## 3                                56                                          4
## 4                                42                                         14
## 5                                37                                          4
## 6                                10                                          5
##   Reviewer_Score
## 1            9.2
## 2            8.8
## 3            7.9
## 4            9.6
## 5            8.8
## 6            7.5
##                                                                                                                                   Tags
## 1                       [' Leisure trip ', ' Group ', ' Comfort Double Room ', ' Stayed 1 night ', ' Submitted from a mobile device ']
## 2                                                [' Business trip ', ' Solo traveler ', ' Superior Double Room ', ' Stayed 4 nights ']
## 3                     [' Leisure trip ', ' Couple ', ' Premium Double Room ', ' Stayed 2 nights ', ' Submitted from a mobile device ']
## 4            [' Business trip ', ' Solo traveler ', ' Superior Double Room ', ' Stayed 4 nights ', ' Submitted from a mobile device ']
## 5 [' Business trip ', ' Solo traveler ', ' Deluxe Double Room with Balcony ', ' Stayed 5 nights ', ' Submitted from a mobile device ']
## 6               [' Leisure trip ', ' Solo traveler ', ' Premium Double Room ', ' Stayed 1 night ', ' Submitted from a mobile device ']
##   days_since_review      lat      lng
## 1           35 days 48.87435 2.289733
## 2           100 day 48.87435 2.289733
## 3           107 day 48.87435 2.289733
## 4           265 day 48.87435 2.289733
## 5           340 day 48.87435 2.289733
## 6           340 day 48.87435 2.289733
# Optional: Check the number of rows in the filtered data
nrow(paris_hotels)
## [1] 59928

Filtering Data 2: Select and reorder data

From the filtered dataset, only the variables necessary for the study are selected. This includes variables representing average scores, hotel names, and review texts. Redundant variables are removed to streamline the dataset, and columns are reordered for convenience.

# Create a new data frame with only the required variables
selected_data <- paris_hotels %>%
  select(Average_Score, Hotel_Name, Positive_Review, Negative_Review)

# View the first few rows of the new data frame
head(selected_data)
##   Average_Score          Hotel_Name
## 1           8.4 Monhotel Lounge SPA
## 2           8.4 Monhotel Lounge SPA
## 3           8.4 Monhotel Lounge SPA
## 4           8.4 Monhotel Lounge SPA
## 5           8.4 Monhotel Lounge SPA
## 6           8.4 Monhotel Lounge SPA
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            Positive_Review
## 1  Nice hotel Room was beautiful and bed very comfortable Did not expect big rooms in Paris so size was really ok with 2 windows Bathroom modern and new with a GREAT shower Staff was wonderfull since reservation They sent me an email prior to my arrival asking if they could help me in anyway and also asking if I had any special needs such as extra beds I had a last minute health problem in the family and had to arrive one day earlier and stay for 1 night less Audrey was wonderful and managed to receive me before and dealt herself with Booking to change my reservation I sent a note to her when I was entering my flight from Cannes to Paris and when I left the airplane she had already sorted everything out They even charged me a lower rate as on the day I arrived the room costed a little less In this same day Saturday June 25 I went out with friends and arrived really hungry at the hotel at 1 30AM and the guy from reception can t believe I forgot his name Maybe Hadesmi managed to open the restaurant and get to my room a nice aspargus risoto bread yougurt and a warm smile even thought riom service finishes at midnight Really had a very nice experience at Monhotel and recomend it 
## 2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              No Positive
## 3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         Fantastic weekend with my partner We Would definitely stay here again Location was excellent and couldn t have asked for a better area to stay in Was so close to the metro and all the shops bars and restaurants Room was lovely and cosey and very modern with a classy edge to the facility 
## 4                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               Good location really good breakfast maybe the best I ever had in Paris in a confortable and stylish setting Everything you need is provided on site you can literally arrive with your hands in your pockets Helpfull and attentive staff 
## 5                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       I asked to be downgraded because I was paying for a balcony room while I couldn t enjoy the balcony Instead they upgraded me with the biggest room in the hotel for the same rate 
## 6                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         The room has everything inside as a full package
##                                                                                                                                                                                                                                                                                                                                                                                             Negative_Review
## 1                                                                                                                                                                                                                                                                                                                                                                                               No Negative
## 2  Not only did the staff on arrival ask to copy the details of my credit card this is OK but they also noted down the CVV number from the back of the credit card This means that the hotel has all the information to use my credit card for internet purchases Since this is unacceptable I asked the receptionist to delete the CVV but she refused This policy should be changed Otherwise good hotel 
## 3                                                                                                                                                                                                                                                                                                                                                                                                       N A
## 4                                                                                          The sauna is a wonderfull addition it s a shame I had to slalom in the cleaning staff halway almost filled with cleaning products Also the proximity to the cleaning staff s room added all the noise to what was supposed to be a quiet moment maybe it would be a good idea to relocate their general quarter 
## 5                                                                                                                                                                                                                                                                                                                                              They have maintenance and I couldn t enjoy the balcony view 
## 6                                                                                                                                                                                                                                                                                             Reception not impressive building was under construction Room passage was way to small Room has also no space
# Reorder columns in the specified order
ordered_data <- paris_hotels %>%
  select(Hotel_Name, Average_Score, Positive_Review, Negative_Review)

# View the first few rows of the reordered data frame
head(ordered_data)
##            Hotel_Name Average_Score
## 1 Monhotel Lounge SPA           8.4
## 2 Monhotel Lounge SPA           8.4
## 3 Monhotel Lounge SPA           8.4
## 4 Monhotel Lounge SPA           8.4
## 5 Monhotel Lounge SPA           8.4
## 6 Monhotel Lounge SPA           8.4
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            Positive_Review
## 1  Nice hotel Room was beautiful and bed very comfortable Did not expect big rooms in Paris so size was really ok with 2 windows Bathroom modern and new with a GREAT shower Staff was wonderfull since reservation They sent me an email prior to my arrival asking if they could help me in anyway and also asking if I had any special needs such as extra beds I had a last minute health problem in the family and had to arrive one day earlier and stay for 1 night less Audrey was wonderful and managed to receive me before and dealt herself with Booking to change my reservation I sent a note to her when I was entering my flight from Cannes to Paris and when I left the airplane she had already sorted everything out They even charged me a lower rate as on the day I arrived the room costed a little less In this same day Saturday June 25 I went out with friends and arrived really hungry at the hotel at 1 30AM and the guy from reception can t believe I forgot his name Maybe Hadesmi managed to open the restaurant and get to my room a nice aspargus risoto bread yougurt and a warm smile even thought riom service finishes at midnight Really had a very nice experience at Monhotel and recomend it 
## 2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              No Positive
## 3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         Fantastic weekend with my partner We Would definitely stay here again Location was excellent and couldn t have asked for a better area to stay in Was so close to the metro and all the shops bars and restaurants Room was lovely and cosey and very modern with a classy edge to the facility 
## 4                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               Good location really good breakfast maybe the best I ever had in Paris in a confortable and stylish setting Everything you need is provided on site you can literally arrive with your hands in your pockets Helpfull and attentive staff 
## 5                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       I asked to be downgraded because I was paying for a balcony room while I couldn t enjoy the balcony Instead they upgraded me with the biggest room in the hotel for the same rate 
## 6                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         The room has everything inside as a full package
##                                                                                                                                                                                                                                                                                                                                                                                             Negative_Review
## 1                                                                                                                                                                                                                                                                                                                                                                                               No Negative
## 2  Not only did the staff on arrival ask to copy the details of my credit card this is OK but they also noted down the CVV number from the back of the credit card This means that the hotel has all the information to use my credit card for internet purchases Since this is unacceptable I asked the receptionist to delete the CVV but she refused This policy should be changed Otherwise good hotel 
## 3                                                                                                                                                                                                                                                                                                                                                                                                       N A
## 4                                                                                          The sauna is a wonderfull addition it s a shame I had to slalom in the cleaning staff halway almost filled with cleaning products Also the proximity to the cleaning staff s room added all the noise to what was supposed to be a quiet moment maybe it would be a good idea to relocate their general quarter 
## 5                                                                                                                                                                                                                                                                                                                                              They have maintenance and I couldn t enjoy the balcony view 
## 6                                                                                                                                                                                                                                                                                             Reception not impressive building was under construction Room passage was way to small Room has also no space
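The selection and the reordering above can also be done in a single call, because `select()` returns columns in the order they are listed. The following is a minimal sketch on a toy data frame; `toy_hotels` and its values are hypothetical stand-ins, not rows from the dataset.

```r
library(dplyr)

# Toy stand-in for paris_hotels (hypothetical values, not from the dataset)
toy_hotels <- data.frame(
  Hotel_Address   = c("1 Rue A Paris France", "2 Rue B Paris France"),
  Average_Score   = c(8.4, 9.1),
  Hotel_Name      = c("Hotel A", "Hotel B"),
  Negative_Review = c("No Negative", "Small room"),
  Positive_Review = c("Friendly staff", "No Positive"),
  stringsAsFactors = FALSE
)

# select() keeps only the named columns and returns them in the order listed,
# so one call both drops unused variables and reorders the rest
toy_ordered <- toy_hotels %>%
  select(Hotel_Name, Average_Score, Positive_Review, Negative_Review)

names(toy_ordered)
```

Keeping a separate intermediate object is still reasonable when the unordered selection is needed elsewhere.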

Summary stats

Basic statistical measures, such as mean and median, are calculated for the Average_Score variable. These measures help establish benchmarks for distinguishing high-rated and low-rated hotels, which will later be correlated with review keywords.

# Calculate the summary statistics for Average_Score
summary_stats <- summary(ordered_data$Average_Score)
print(summary_stats)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   6.800   8.100   8.500   8.409   8.800   9.800
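Because the data contain one row per review, the summary above weights each hotel by its number of reviews. If a hotel-level benchmark is wanted instead, one possible approach is to keep a single row per hotel before summarizing. The sketch below uses hypothetical toy values, and the `rating_group` column and the median cutoff are illustrative assumptions rather than the method used in this report.

```r
library(dplyr)

# Toy review-level data: Hotel A has three reviews, Hotel B has one
toy_reviews <- data.frame(
  Hotel_Name    = c("Hotel A", "Hotel A", "Hotel A", "Hotel B"),
  Average_Score = c(8.0, 8.0, 8.0, 9.0),
  stringsAsFactors = FALSE
)

# Review-level mean: each hotel counts once per review
review_level_mean <- mean(toy_reviews$Average_Score)

# Hotel-level mean: keep one row per hotel first
toy_per_hotel <- toy_reviews %>%
  distinct(Hotel_Name, Average_Score)
hotel_level_mean <- mean(toy_per_hotel$Average_Score)

# Label hotels relative to the hotel-level median (one possible benchmark)
toy_per_hotel <- toy_per_hotel %>%
  mutate(rating_group = if_else(Average_Score >= median(Average_Score),
                                "high", "low"))

toy_per_hotel
```

In the toy data the two means differ (8.25 at review level, 8.5 at hotel level), which illustrates why the choice of unit matters when setting a threshold.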

Visualization: Box Plot

To visualize the distribution of hotel ratings, a box plot is created. The plot shows the median, the interquartile range, and any outliers, giving a clear picture of how ratings are spread across hotels in Paris.

# Create the boxplot
boxplot_graph <- ggplot(ordered_data, aes(x = "", y = Average_Score)) +
  geom_boxplot(fill = "blue", color = "black") +
  labs(title = "Boxplot of Average Hotel Ratings", y = "Average Score") +
  theme_minimal()

boxplot_graph

ggsave("Boxplot_Average_Hotel_Ratings.png", boxplot_graph, width = 10, height = 6)
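By default, `geom_boxplot()` draws the median, quartiles, whiskers, and outliers, but not the mean. If the mean should appear on the plot as well, one option is to overlay it with `stat_summary()`. This is a sketch on hypothetical toy scores; the marker shape, size, and colors are arbitrary choices.

```r
library(ggplot2)

# Hypothetical scores used only to illustrate the layer
toy_scores <- data.frame(Average_Score = c(6.8, 8.1, 8.4, 8.5, 8.8, 9.0, 9.8))

boxplot_with_mean <- ggplot(toy_scores, aes(x = "", y = Average_Score)) +
  geom_boxplot(fill = "lightblue", color = "black") +
  # stat_summary() computes the mean of y and draws it as a single point
  stat_summary(fun = mean, geom = "point", shape = 18, size = 4, color = "red") +
  labs(title = "Boxplot of Average Hotel Ratings", x = NULL, y = "Average Score") +
  theme_minimal()

boxplot_with_mean
```

A lighter fill is used here so that the median line stays visible against the box.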

Individual figures

Describe and show how you created the first figure. Why did you choose this figure type?

In showing the figures that you created, describe why you designed it the way you did. Why did you choose those colors, fonts, and other design elements? Does it convey truth?

Figure 1: Core Keywords - Wordcloud & Bar plot

By analyzing positive and negative reviews of various hotels in Paris, frequently mentioned words are identified as core elements and visualized using a wordcloud for intuitive understanding. The size of each word corresponds to its frequency, making it effortless to pinpoint the most significant factors contributing to customer satisfaction and dissatisfaction. The approach allows hotels to easily prioritize areas for improvement while reinforcing their strengths to better meet customer expectations.

Each positive and negative review is tokenized into individual words, word frequencies are counted, and the most frequent keywords are extracted. Words that are uninformative or carry no meaning for the analysis are then excluded at the analyst’s discretion, and the extraction is repeated until the remaining top keywords are meaningful.

Wordclouds were chosen for keyword analysis because they visually represent keyword frequency, with larger words indicating higher importance. This makes it easy to identify dominant themes in positive and negative reviews at a glance. By filtering out unnecessary words and focusing on meaningful ones, wordclouds provide a clear, intuitive summary, making them ideal for both exploratory analysis and presenting insights effectively.
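The tokenize, count, exclude, and repeat cycle described above can be sketched on a few toy sentences. This is a minimal illustration, not the report's actual pipeline: `toy_reviews` and the `custom_stop` list are hypothetical, and keeping the analyst's exclusions in their own tibble is just one way to make each iteration easy to rerun.

```r
library(dplyr)
library(tidytext)

# Hypothetical reviews standing in for ordered_data$Positive_Review
toy_reviews <- tibble(
  Positive_Review = c("Great location and friendly staff",
                      "Nice hotel, great location",
                      "Friendly staff, nice hotel")
)

# Words the analyst judged uninformative (hypothetical list, extended on each pass)
custom_stop <- tibble(word = c("hotel", "nice", "great"))

toy_counts <- toy_reviews %>%
  unnest_tokens(word, Positive_Review) %>%   # one lowercase word per row
  anti_join(stop_words, by = "word") %>%     # drop generic English stop words
  anti_join(custom_stop, by = "word") %>%    # drop analyst-chosen words
  count(word, sort = TRUE)

toy_counts
```

After inspecting `toy_counts`, further words can be appended to `custom_stop` and the pipeline rerun until the top of the table contains only meaningful keywords.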

Data processing for Wordcloud Visualization 1: Tokenize

# Extract key keywords from Positive_Reviews
# Tokenize words from positive reviews
p_token <- ordered_data %>%
  unnest_tokens(word, Positive_Review)

# Remove stop words (common words with no meaningful context)
p_tidyToken <- p_token %>%
  anti_join(stop_words)
## Joining with `by = join_by(word)`
# Count word frequencies and sort in descending order
p_tidyToken <- p_tidyToken %>%
  count(word, sort = TRUE)

# Display the top 20 most frequent keywords
head(p_tidyToken, 20)
##           word     n
## 1     location 26845
## 2        staff 24461
## 3        hotel 16355
## 4     friendly 10853
## 5      helpful 10514
## 6         nice  8710
## 7        clean  7847
## 8    breakfast  7452
## 9  comfortable  6815
## 10   excellent  6727
## 11       paris  5490
## 12         bed  5385
## 13       metro  5140
## 14       close  4324
## 15    positive  3876
## 16     perfect  3840
## 17        stay  3809
## 18      lovely  3196
## 19       quiet  3087
## 20     service  2883
# Tokenize words from negative reviews
n_token <- ordered_data %>%
  unnest_tokens(word, Negative_Review)

# Remove stop words (common words with no meaningful context)
n_tidyToken <- n_token %>%
  anti_join(stop_words)
## Joining with `by = join_by(word)`
# Count word frequencies and sort in descending order
n_tidyToken <- n_tidyToken %>%
  count(word, sort = TRUE)

# Display the top 20 most frequent keywords
head(n_tidyToken, 20)
##         word     n
## 1   negative 16826
## 2      hotel  8715
## 3  breakfast  6723
## 4      staff  4054
## 5        bit  3099
## 6   bathroom  2920
## 7       didn  2476
## 8     shower  2400
## 9        bed  2385
## 10     night  2298
## 11   service  2094
## 12      time  1959
## 13 expensive  1852
## 14      stay  1847
## 15     paris  1828
## 16       day  1776
## 17      poor  1737
## 18      wifi  1724
## 19     price  1696
## 20     water  1693

Data processing for Wordcloud Visualization 2: Data cleaning

#positive
p_tidyToken <- p_token %>%
  anti_join(stop_words) %>%  # Remove stop words
  filter(!word %in% c("hotel", "paris", "positive", "nice", "comfortable", "excellent", "perfect", "stay", "lovely")) %>%  # Exclude specified words
  count(word, sort = TRUE)  # Count word frequencies and sort in descending order
## Joining with `by = join_by(word)`
# Display the top 20 most frequent keywords
head(p_tidyToken, 20)
##           word     n
## 1     location 26845
## 2        staff 24461
## 3     friendly 10853
## 4      helpful 10514
## 5        clean  7847
## 6    breakfast  7452
## 7          bed  5385
## 8        metro  5140
## 9        close  4324
## 10       quiet  3087
## 11     service  2883
## 12     station  2860
## 13 restaurants  2689
## 14        walk  2654
## 15     amazing  2548
## 16        view  2425
## 17    bathroom  2298
## 18       tower  2250
## 19      eiffel  2091
## 20   beautiful  1996
# negative
n_tidyToken <- n_token %>%
  anti_join(stop_words) %>%  # Remove common stop words
  filter(!word %in% c("negative", "hotel", "bit", "didn", "paris", "2", "4", "night", "time", "stay", "day", "poor")) %>%  # Exclude specified words
  count(word, sort = TRUE)  # Count word frequencies and sort in descending order
## Joining with `by = join_by(word)`
# Display the top 20 most frequent keywords
head(n_tidyToken, 20)
##         word    n
## 1  breakfast 6723
## 2      staff 4054
## 3   bathroom 2920
## 4     shower 2400
## 5        bed 2385
## 6    service 2094
## 7  expensive 1852
## 8       wifi 1724
## 9      price 1696
## 10     water 1693
## 11    coffee 1639
## 12 reception 1551
## 13     floor 1521
## 14       bar 1506
## 15     check 1434
## 16       bad 1414
## 17  location 1313
## 18      wasn 1304
## 19      door 1302
## 20       air 1184

Visualization of Wordclouds

The Wordcloud visualization highlights the most frequently mentioned keywords from positive and negative reviews, providing a clear view of customer priorities and concerns.

  • Positive review keywords: Positive reviews often feature words like “clean,” “friendly,” and “location,” indicating that cleanliness, service quality, and accessibility are primary drivers of customer satisfaction in Parisian hotels.

  • Negative review keywords: Negative reviews frequently include terms such as “breakfast,” “price,” and “noise,” revealing common pain points.

Word cloud for positive keywords

# Open a PNG device to save the wordcloud
png("Wordcloud_Positive_Keywords.png", width = 800, height = 600)

# Activate the graphics device and generate the wordcloud
wordcloud(
  words = p_tidyToken$word,
  freq = p_tidyToken$n,
  min.freq = 1000,
  max.words = 100,
  random.order = FALSE,
  rot.per = 0.35,
  colors = brewer.pal(8, "Set1")
)

# Close the graphics device to save the image
dev.off()
## quartz_off_screen 
##                 2
# Draw the same wordcloud again so that it appears in the rendered report
wordcloud(
  words = p_tidyToken$word,
  freq = p_tidyToken$n,
  min.freq = 1000,
  max.words = 100,
  random.order = FALSE,
  rot.per = 0.35,
  colors = brewer.pal(8, "Set1")
)

Word cloud for negative keywords

# Open a PNG device to save the word cloud
png("Wordcloud_Negative_Keywords.png", width = 800, height = 600)

# Generate the word cloud
wordcloud(
  words = n_tidyToken$word, 
  freq = n_tidyToken$n, 
  min.freq = 1000, 
  max.words = 100, 
  random.order = FALSE, 
  rot.per = 0.35, 
  colors = brewer.pal(8, "Set1")
)

# Close the graphics device to save the image
dev.off()
## quartz_off_screen 
##                 2
# Draw the same word cloud again so that it appears in the rendered report
wordcloud(
  words = n_tidyToken$word, 
  freq = n_tidyToken$n, 
  min.freq = 1000, 
  max.words = 100, 
  random.order = FALSE, 
  rot.per = 0.35, 
  colors = brewer.pal(8, "Set1")
)
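Each word cloud above is drawn twice with identical arguments, once into the PNG device and once for display. Because the layout in `wordcloud()` involves random placement, the saved and displayed images may not match exactly. A small helper that fixes the seed is one way to reduce the duplication. The sketch below is an assumption-laden illustration: `draw_wordcloud()` and its `min_freq` and `seed` arguments are hypothetical names, and the toy counts are the top five positive keywords printed earlier.

```r
library(wordcloud)
library(RColorBrewer)

# Hypothetical helper: draws one word cloud with the settings used in this report
draw_wordcloud <- function(tokens, min_freq = 1000, seed = 42) {
  set.seed(seed)  # same seed gives the same layout on every call
  wordcloud(
    words = tokens$word,
    freq = tokens$n,
    min.freq = min_freq,
    max.words = 100,
    random.order = FALSE,
    rot.per = 0.35,
    colors = brewer.pal(8, "Set1")
  )
}

# Top five positive keywords and counts from the cleaned output above
toy_tokens <- data.frame(
  word = c("location", "staff", "friendly", "helpful", "clean"),
  n = c(26845, 24461, 10853, 10514, 7847)
)

# Save to a temporary PNG, then draw again for on-screen display
out_file <- file.path(tempdir(), "toy_wordcloud.png")
png(out_file, width = 800, height = 600)
draw_wordcloud(toy_tokens)
dev.off()

draw_wordcloud(toy_tokens)
```

With the real data, `draw_wordcloud(p_tidyToken)` and `draw_wordcloud(n_tidyToken)` would replace the four repeated calls.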

Figure 2: Bar Plot - Keyword Frequency Analysis

While word clouds provide an engaging, high-level overview of text data by visually emphasizing frequently mentioned keywords, they lack the precision needed for detailed analysis. To address this limitation, bar graphs were used as a complementary tool to provide a more analytical and structured approach.

Bar graphs effectively display frequency distributions, allowing for clear and precise comparisons between keywords. This approach helps prioritize the most impactful factors by showcasing exact values and trends, making them an ideal choice for gaining deeper insights into the patterns present in both positive and negative review content.

This figure visualizes the most frequently mentioned positive and negative keywords from customer reviews in high-rated and low-rated hotels, respectively. The primary objective is to identify the key factors contributing to customer satisfaction and dissatisfaction, based on review content.

Data processing for Bar plot visualization

# Predefined positive/negative keywords from reviews
positive_keywords <- c("location", "staff", "friendly", "helpful", "clean", "breakfast", "bed", "metro", "quiet", "service")
negative_keywords <- c("breakfast", "staff", "bathroom", "shower", "bed", 
                       "service", "expensive", "wifi", "price", "water")

# Setting rating criteria (High-rating vs Low-rating hotels)
high_rating_hotels <- ordered_data %>%
  filter(Average_Score > median(ordered_data$Average_Score))

low_rating_hotels <- ordered_data %>%
  filter(Average_Score <= median(ordered_data$Average_Score))

# Analyze the frequency of positive keywords in high-rating hotels
high_positive_token <- high_rating_hotels %>%
  unnest_tokens(word, Positive_Review) %>%
  filter(word %in% positive_keywords) %>%  # Filter predefined positive keywords from reviews
  anti_join(stop_words) %>%
  count(word, sort = TRUE)
## Joining with `by = join_by(word)`
# Analyze the frequency of negative keywords in low-rating hotels
low_negative_token <- low_rating_hotels %>%
  unnest_tokens(word, Negative_Review) %>%
  filter(word %in% negative_keywords) %>%  # Filter predefined negative keywords from reviews
  anti_join(stop_words) %>%
  count(word, sort = TRUE)
## Joining with `by = join_by(word)`

Bar Plot Visualization - keywords

The analysis reveals that excellent service, cleanliness, and convenient locations are the primary drivers of customer satisfaction in high-rated hotels, as reflected in positive keywords like staff, location, and clean. On the other hand, dissatisfaction in low-rated hotels stems from issues such as poor breakfast quality, inadequate bathroom conditions, and perceived high costs, highlighted by keywords like breakfast, bathroom, and expensive.

To enhance guest experiences, high-rated hotels should continue prioritizing their strengths while addressing secondary concerns like breakfast quality. Conversely, low-rated hotels must focus on improving basic amenities and staff service to resolve key pain points and boost overall satisfaction.

  • Positive keywords from review data: The bar graph for positive keywords highlights terms like “location,” “staff,” and “clean,” which frequently appear in high-rated hotel reviews. These keywords suggest that convenient locations, excellent staff service, and cleanliness are crucial for achieving high customer satisfaction. Hotels with high ratings often prioritize cleanliness, service quality, and prime locations. These factors are consistently valued by customers and directly influence satisfaction levels.

  • Negative keywords from review data: The bar graph for negative keywords shows terms like “breakfast,” “bathroom,” and “expensive,” which are commonly mentioned in low-rated hotel reviews. These keywords indicate areas where hotels may underperform, leading to dissatisfaction among guests. Guests at low-rated hotels often express dissatisfaction with breakfast quality, bathroom conditions, and perceived high costs. Addressing these areas could help improve customer satisfaction and overall ratings.

Keywords frequency from Positive reviews in High-Rating Hotels

bar_chart_high_positive <- ggplot(high_positive_token, aes(x = reorder(word, n), y = n)) +
  geom_col(fill = "steelblue") +  # Single color for simplicity
  labs(title = "Keyword Frequency in High-Rating Hotels (Positive Reviews)",
       x = "Keyword",
       y = "Frequency") +
  theme_minimal() +
  coord_flip()  # Swap the x and y axes so the keyword labels are easier to read

bar_chart_high_positive

# Save the bar chart as an image
ggsave("Keyword_Frequency_High_Rating_Hotels.png", bar_chart_high_positive, width = 10, height = 6)

Keywords frequency from Negative reviews in Low-Rating Hotels

bar_chart_low_negative <- ggplot(low_negative_token, aes(x = reorder(word, n), y = n)) +
  geom_col(fill = "red") +  # Single color for simplicity
  labs(title = "Keyword Frequency in Low-Rating Hotels (Negative Reviews)",
       x = "Keyword",
       y = "Frequency") +
  theme_minimal() +
  coord_flip()  # Swap the x and y axes so the keyword labels are easier to read

bar_chart_low_negative

# Save the bar chart as an image
ggsave("Keyword_Frequency_Low_Rating_Hotels.png", bar_chart_low_negative, width = 10, height = 6)

Figure 3: Bar Plot - Sentiment Analysis

To complement keyword analysis from another perspective, sentiment analysis provides a more nuanced understanding of guest feedback by categorizing it into distinct positive and negative sentiments. Unlike simple keyword frequency visualization, sentiment analysis captures the emotional tone of reviews, offering deeper insights into the factors that drive satisfaction or dissatisfaction. By emphasizing the polarity of sentiments, this approach allows hotels to identify not only the most frequently mentioned aspects but also the emotional weight behind them. This makes sentiment analysis a powerful tool for prioritizing actionable improvements and strategically enhancing guest experiences.

Data processing for Bar plot visualization

# Attach sentiment lexicon for analysis (using bing)
bing_sentiments <- get_sentiments("bing")

# Analyze sentiment in Positive_Review
positive_sentiment <- ordered_data %>%
  unnest_tokens(word, Positive_Review) %>%
  inner_join(bing_sentiments, by = "word") %>%
  count(sentiment, sort = TRUE) %>%
  mutate(review_type = "Positive")

# Analyze sentiment in Negative_Review
negative_sentiment <- ordered_data %>%
  unnest_tokens(word, Negative_Review) %>%
  inner_join(bing_sentiments, by = "word") %>%
  count(sentiment, sort = TRUE) %>%
  mutate(review_type = "Negative")

# Combine both datasets
combined_sentiments <- bind_rows(positive_sentiment, negative_sentiment)

Bar Plot Visualization - sentiment comparison

The bar plot highlights the frequency of positive and negative sentiments in customer reviews. Positive sentiments, represented by words like clean and friendly, dominate in positive reviews, while negative sentiments, including expensive and uncomfortable, are prominent in negative reviews.

This figure demonstrates the importance of enhancing positive elements such as cleanliness and staff service while addressing recurring issues related to pricing and amenities. The sentiment comparison provides a balanced view of customer feedback, helping hotels prioritize efforts to enhance guest satisfaction effectively.

  • Positive Sentiments: Keywords like staff, cleanliness, and friendly frequently appear in positive reviews, reinforcing their role in customer satisfaction. These aspects are consistently associated with high ratings and positive guest experiences.

  • Negative Sentiments: Keywords like breakfast, price, and uncomfortable dominate negative reviews, indicating areas where hotels underperform and need improvement.

Bar Plot Visualization

sentiment_comparison_chart <- ggplot(combined_sentiments, aes(x = sentiment, y = n, fill = review_type)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(
    title = "Sentiment Comparison for Positive and Negative Reviews",
    x = "Sentiment",
    y = "Frequency",
    fill = "Review Type"
  ) +
  theme_minimal()

sentiment_comparison_chart

# Save the bar chart as an image
ggsave("Sentiment_Comparison_Positive_Negative_Reviews.png", sentiment_comparison_chart, width = 10, height = 6)
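The chart above plots raw sentiment counts. Positive and negative review texts can differ in total length, so raw counts are not always directly comparable between the two review types. One possible complement is to convert the counts to within-type shares. The sketch below uses hypothetical numbers in the same shape as `combined_sentiments`; the `share` column name is an assumption.

```r
library(dplyr)

# Hypothetical counts with the same columns as combined_sentiments
toy_sentiments <- tibble(
  sentiment   = c("positive", "negative", "negative", "positive"),
  n           = c(900, 100, 300, 200),
  review_type = c("Positive", "Positive", "Negative", "Negative")
)

# Share of each sentiment within its review type (each type sums to 1)
sentiment_shares <- toy_sentiments %>%
  group_by(review_type) %>%
  mutate(share = n / sum(n)) %>%
  ungroup()

sentiment_shares
```

Mapping `y = share` instead of `y = n` in the bar chart would then show what fraction of sentiment words in each review type is positive or negative.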

Figure 4: Scatter Plot with Trend Line - Keyword Ratios vs. Ratings

This figure visualizes the relationship between keyword ratios (both from positive and negative reviews) and hotel ratings. By plotting keyword ratios against the average ratings of hotels, we aim to uncover patterns that indicate how strongly the presence of specific keywords in reviews correlates with customer satisfaction.

The scatter plot was chosen because it effectively displays correlations and distributions of data points, enabling us to analyze trends and relationships between variables. The addition of a trend line highlights overarching patterns, such as whether positive keyword ratios are directly proportional to higher ratings or whether negative keyword ratios are inversely related to ratings. This design ensures the visualization is both intuitive and data-driven.

Data processing for Scatter plot visualization 1

# Define keywords from positive and negative reviews
positive_keywords <- c("location", "staff", "friendly", "helpful", "clean",
                      "breakfast", "bed", "metro", "quiet", "service")
negative_keywords <- c("breakfast", "staff", "bathroom", "shower", "bed", 
                       "service", "expensive", "wifi", "price", "water")

# Define a unified list of keywords
all_keywords <- c("location", "staff", "breakfast", "service", "clean", 
                  "bed", "metro", "bathroom", "close")

# Group data by hotel and prepare the initial summary
hotel_keywords_summary <- ordered_data %>%
  group_by(Hotel_Name) %>%
  summarise(
    # Calculate the average score
    average_score = mean(Average_Score, na.rm = TRUE),
    
    # Count the number of positive reviews
    num_positive_reviews = sum(!is.na(Positive_Review)),
    
    # Count the number of negative reviews
    num_negative_reviews = sum(!is.na(Negative_Review)),
    
    # Calculate the total frequency of positive keywords in positive reviews
    p_keywords_num = sum(str_count(tolower(Positive_Review), paste(positive_keywords, collapse = "|"))),
    
    # Calculate the total frequency of negative keywords in negative reviews
    n_keywords_num = sum(str_count(tolower(Negative_Review), paste(negative_keywords, collapse = "|")))
  ) %>%
  ungroup() %>%
  # Calculate positive and negative keyword ratios
  mutate(
    positive_keyword_ratio = p_keywords_num / num_positive_reviews,
    negative_keyword_ratio = n_keywords_num / num_negative_reviews
  )

# Use a for loop to calculate the frequency of each keyword in positive/negative reviews and convert to ratios
for (keyword in all_keywords) {
  # Calculate the frequency of the keyword in positive reviews and convert to ratio
  hotel_keywords_summary <- hotel_keywords_summary %>%
    mutate(!!paste0("p_", keyword, "_ratio") :=
             ordered_data %>%
               group_by(Hotel_Name) %>%
               summarise(keyword_count = sum(str_count(tolower(Positive_Review), fixed(keyword)), na.rm = TRUE)) %>%
               pull(keyword_count) / num_positive_reviews)
  
  # Calculate the frequency of the keyword in negative reviews and convert to ratio
  hotel_keywords_summary <- hotel_keywords_summary %>%
    mutate(!!paste0("n_", keyword, "_ratio") :=
             ordered_data %>%
               group_by(Hotel_Name) %>%
               summarise(keyword_count = sum(str_count(tolower(Negative_Review), fixed(keyword)), na.rm = TRUE)) %>%
               pull(keyword_count) / num_negative_reviews)
}

# View the full dataset sorted by average score
hotel_keywords_summary <- hotel_keywords_summary %>%
  arrange(desc(average_score))
head(hotel_keywords_summary, 1000)
## # A tibble: 458 × 26
##    Hotel_Name            average_score num_positive_reviews num_negative_reviews
##    <chr>                         <dbl>                <int>                <int>
##  1 Ritz Paris                      9.8                   28                   28
##  2 H tel de La Tamise E…           9.6                   61                   61
##  3 Hotel The Peninsula …           9.5                   58                   58
##  4 Le Narcisse Blanc Spa           9.5                   57                   57
##  5 Goralska R sidences …           9.4                   24                   24
##  6 H tel D Aubusson                9.4                  294                  294
##  7 Hotel Eiffel Blomet             9.4                   15                   15
##  8 Hotel Monge                     9.4                  115                  115
##  9 La Chambre du Marais            9.4                   88                   88
## 10 Nolinski Paris                  9.4                  113                  113
## # ℹ 448 more rows
## # ℹ 22 more variables: p_keywords_num <int>, n_keywords_num <int>,
## #   positive_keyword_ratio <dbl>, negative_keyword_ratio <dbl>,
## #   p_location_ratio <dbl>, n_location_ratio <dbl>, p_staff_ratio <dbl>,
## #   n_staff_ratio <dbl>, p_breakfast_ratio <dbl>, n_breakfast_ratio <dbl>,
## #   p_service_ratio <dbl>, n_service_ratio <dbl>, p_clean_ratio <dbl>,
## #   n_clean_ratio <dbl>, p_bed_ratio <dbl>, n_bed_ratio <dbl>, …
# Create and view a smaller dataset with key columns only
visible_columns <- c("Hotel_Name", "average_score", "positive_keyword_ratio", "negative_keyword_ratio")
hidden_columns <- setdiff(names(hotel_keywords_summary), visible_columns)
hotel_keywords_summary_visible <- hotel_keywords_summary %>%
  select(all_of(visible_columns))
head(hotel_keywords_summary_visible, 1000)
## # A tibble: 458 × 4
##    Hotel_Name        average_score positive_keyword_ratio negative_keyword_ratio
##    <chr>                     <dbl>                  <dbl>                  <dbl>
##  1 Ritz Paris                  9.8                   1.04                  0.25 
##  2 H tel de La Tami…           9.6                   2.46                  0.148
##  3 Hotel The Penins…           9.5                   1.24                  0.328
##  4 Le Narcisse Blan…           9.5                   1.81                  0.667
##  5 Goralska R siden…           9.4                   1.79                  0.542
##  6 H tel D Aubusson            9.4                   1.90                  0.340
##  7 Hotel Eiffel Blo…           9.4                   2.8                   0.4  
##  8 Hotel Monge                 9.4                   2.35                  0.287
##  9 La Chambre du Ma…           9.4                   2.25                  0.216
## 10 Nolinski Paris              9.4                   1.53                  0.310
## # ℹ 448 more rows
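The `str_count()` calls above match substrings, so a keyword such as bed is also counted inside bedroom, and price inside overpriced. The report does not say whether this is intended. If whole-word matches are preferred, which is how the tokenized counts in Figures 1 and 2 behave, a minimal sketch using regex word boundaries could look like this. The review sentence and the `loose_pattern` and `strict_pattern` names are illustrative assumptions.

```r
library(stringr)

keywords <- c("bed", "staff", "price")
review <- "The bedroom was fine but the bed was hard and overpriced"

# Pattern as built in the report: matches keywords anywhere, even inside longer words
loose_pattern <- paste(keywords, collapse = "|")

# Same keywords wrapped in \\b word boundaries: matches whole words only
strict_pattern <- paste0("\\b(", paste(keywords, collapse = "|"), ")\\b")

loose_hits  <- str_count(tolower(review), loose_pattern)   # bedroom, bed, overpriced
strict_hits <- str_count(tolower(review), strict_pattern)  # bed only

c(loose = loose_hits, strict = strict_hits)
```

The trade-off is that strict matching also skips plural forms such as beds, so the choice depends on which kind of miscount matters more for the analysis.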

Scatter Plot Visualization - Positive vs. Negative

The scatter plot shows the correlation between average hotel ratings (x-axis) and the frequency ratios of positive (blue points) and negative (red points) keywords (y-axis) in customer reviews. The trend lines for keywords from positive and negative reviews highlight the overall patterns, offering a balanced perspective on the factors driving satisfaction and dissatisfaction.

Keyword counts are expressed as ratios rather than raw frequencies. Because the number of reviews differs from hotel to hotel, each hotel's total keyword count is divided by its number of reviews so that hotels can be compared fairly.
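As a worked example of this normalization, the values printed for Ritz Paris in this report (28 reviews, 29 positive-keyword matches, 7 negative-keyword matches) reproduce its two ratios. The variable names mirror the columns of `hotel_keywords_summary`; the snippet is only an arithmetic illustration.

```r
# Values for Ritz Paris as printed in this report's output
num_reviews    <- 28  # num_positive_reviews and num_negative_reviews
p_keywords_num <- 29  # positive-keyword matches in positive reviews
n_keywords_num <- 7   # negative-keyword matches in negative reviews

# Ratio = keyword matches per review
positive_keyword_ratio <- p_keywords_num / num_reviews
negative_keyword_ratio <- n_keywords_num / num_reviews

round(positive_keyword_ratio, 2)  # 1.04
negative_keyword_ratio            # 0.25
```

A ratio above 1 therefore means that, on average, each review contains more than one keyword match.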

By visualizing the relationship between keyword ratios and ratings, this analysis provides actionable insights for hotels to strategically enhance their services. Leveraging strengths like cleanliness and service quality while prioritizing improvements in key pain points, such as pricing and amenities, can help hotels achieve sustainable growth and greater customer satisfaction.

  • Keywords from Positive reviews (Blue Points): The upward-sloping trend line for positive keywords illustrates a clear positive correlation between the frequency of positive terms in reviews and higher hotel ratings. The key drivers here are the 10 keywords previously obtained through the word cloud: [location, staff, friendly, helpful, clean, breakfast, bed, metro, quiet, service]. Hotels achieving higher ratings often receive reviews highlighting aspects such as staff, clean, and location, underscoring the importance of these factors in guest satisfaction. The keywords staff, service, bed, bathroom, and breakfast show the steepest upward slopes in positive reviews.

  • Keywords from Negative reviews (Red Points): The downward-sloping trend line for negative keywords indicates a negative correlation between the frequency of negative terms and hotel ratings. The key drivers here are the 10 keywords previously obtained through the word cloud: [breakfast, staff, bathroom, shower, bed, service, expensive, wifi, price, water]. Lower-rated hotels frequently receive reviews with terms such as breakfast, bathroom, and expensive, pointing to critical areas where improvements are needed to mitigate dissatisfaction and boost ratings. The keywords staff, service, bed, bathroom, and breakfast show the steepest downward slopes in negative reviews.

# Visualizing the relationship between ratings and the ratio of positive/negative keywords
rating_keyword_ratio_plot <- ggplot(hotel_keywords_summary, aes(x = average_score)) +
  
  # Scatter plot for the ratio of positive keywords
  geom_point(aes(y = positive_keyword_ratio, color = "Positive Reviews"), alpha = 0.6) +
  geom_smooth(aes(y = positive_keyword_ratio, color = "Positive Reviews"), 
              method = "lm", se = FALSE, linetype = "solid") +
  
  # Scatter plot for the ratio of negative keywords
  geom_point(aes(y = negative_keyword_ratio, color = "Negative Reviews"), alpha = 0.6) +
  geom_smooth(aes(y = negative_keyword_ratio, color = "Negative Reviews"), 
              method = "lm", se = FALSE, linetype = "solid") +
  
  # Title and axis labels
  labs(
    title = "Correlation Between Rating and Keyword Frequency Ratio",
    x = "Average Score",
    y = "Keyword Frequency Ratio",
    subtitle = "Blue points: Positive Keywords Ratio, Red points: Negative Keywords Ratio"
  ) +
  
  # Color legend
  scale_color_manual(values = c("Positive Reviews" = "blue", "Negative Reviews" = "red")) +
  
  # Minimal theme for clean visuals
  theme_minimal()

rating_keyword_ratio_plot
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'

# Save the scatter plot as an image
ggsave("Rating_Keyword_Frequency_Ratio.png", rating_keyword_ratio_plot, width = 12, height = 8)
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'

Data processing for Facet scatter plot visualization 2

# Select relevant columns and transform the data
facet_data <- hotel_keywords_summary %>%
  pivot_longer(
    cols = matches("^(p_|n_).+_ratio$"),  # Select columns starting with p_ or n_ and ending with _ratio
    names_to = "keyword_type",
    values_to = "ratio"
  ) %>%
  mutate(
    sentiment = ifelse(grepl("^p_", keyword_type), "Positive", "Negative"),  # Separate Positive/Negative sentiments
    keyword = gsub("^(p_|n_)", "", keyword_type) %>% gsub("_ratio$", "", .)  # Remove p_/n_ and _ratio from column names
  )

# Filter data: Use only keywords included in all_keywords
facet_data <- facet_data %>%
  filter(keyword %in% all_keywords)

# Check the data structure and preview
str(facet_data)
## tibble [8,244 × 12] (S3: tbl_df/tbl/data.frame)
##  $ Hotel_Name            : chr [1:8244] "Ritz Paris" "Ritz Paris" "Ritz Paris" "Ritz Paris" ...
##  $ average_score         : num [1:8244] 9.8 9.8 9.8 9.8 9.8 9.8 9.8 9.8 9.8 9.8 ...
##  $ num_positive_reviews  : int [1:8244] 28 28 28 28 28 28 28 28 28 28 ...
##  $ num_negative_reviews  : int [1:8244] 28 28 28 28 28 28 28 28 28 28 ...
##  $ p_keywords_num        : int [1:8244] 29 29 29 29 29 29 29 29 29 29 ...
##  $ n_keywords_num        : int [1:8244] 7 7 7 7 7 7 7 7 7 7 ...
##  $ positive_keyword_ratio: num [1:8244] 1.04 1.04 1.04 1.04 1.04 ...
##  $ negative_keyword_ratio: num [1:8244] 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25 ...
##  $ keyword_type          : chr [1:8244] "p_location_ratio" "n_location_ratio" "p_staff_ratio" "n_staff_ratio" ...
##  $ ratio                 : num [1:8244] 0.0357 0 0.4286 0.0357 0.0714 ...
##  $ sentiment             : chr [1:8244] "Positive" "Negative" "Positive" "Negative" ...
##  $ keyword               : chr [1:8244] "location" "location" "staff" "staff" ...
head(facet_data)
## # A tibble: 6 × 12
##   Hotel_Name average_score num_positive_reviews num_negative_reviews
##   <chr>              <dbl>                <int>                <int>
## 1 Ritz Paris           9.8                   28                   28
## 2 Ritz Paris           9.8                   28                   28
## 3 Ritz Paris           9.8                   28                   28
## 4 Ritz Paris           9.8                   28                   28
## 5 Ritz Paris           9.8                   28                   28
## 6 Ritz Paris           9.8                   28                   28
## # ℹ 8 more variables: p_keywords_num <int>, n_keywords_num <int>,
## #   positive_keyword_ratio <dbl>, negative_keyword_ratio <dbl>,
## #   keyword_type <chr>, ratio <dbl>, sentiment <chr>, keyword <chr>
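The per-keyword ratio columns pivoted above are created by a loop that assigns values by row position, which works only while both tables stay sorted by `Hotel_Name`. As a possible alternative, the long table can be computed directly with a grouped summary. The sketch below is an illustration under stated assumptions: `toy` is hypothetical data, `long_ratios` is a hypothetical name, and `n()` counts all rows per hotel, which equals the number of non-missing reviews only when no review is `NA`.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Hypothetical reviews standing in for ordered_data
toy <- tibble(
  Hotel_Name      = c("A", "A", "B"),
  Positive_Review = c("Great staff", "Clean room, helpful staff", "Clean"),
  Negative_Review = c("Small bed", "No Negative", "Bed too soft, staff slow")
)
keywords <- c("staff", "clean", "bed")

long_ratios <- toy %>%
  # One row per hotel, review, and review type
  pivot_longer(c(Positive_Review, Negative_Review),
               names_to = "sentiment", values_to = "text") %>%
  mutate(sentiment = str_remove(sentiment, "_Review")) %>%
  # Repeat every review once per keyword (expand_grid keeps duplicate rows)
  expand_grid(keyword = keywords) %>%
  group_by(Hotel_Name, sentiment, keyword) %>%
  # Matches of the keyword divided by the number of reviews in the group
  summarise(ratio = sum(str_count(tolower(text), fixed(keyword))) / n(),
            .groups = "drop")

long_ratios
```

Because the grouping keys travel with each row, the result does not depend on row order and already has the `keyword`, `sentiment`, and `ratio` columns used for faceting.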

Facet Scatter Plot Visualization - Keywords (Positive vs. Negative)

The earlier scatter plot provided a general overview of the correlation between positive and negative keyword ratios and hotel ratings, highlighting collective trends. However, this aggregated view lacked the granularity required to evaluate the individual impact of specific keywords. To address this limitation, a facet scatter plot was created to break down the relationships between specific keywords (e.g., bathroom, bed, breakfast, etc.) and average hotel ratings. By visualizing the keyword ratios in positive (blue trend lines) and negative (red trend lines) reviews across individual panels, the facet plot allows for a more detailed and targeted analysis of satisfaction and dissatisfaction drivers.

Facet scatter plots were specifically chosen because they enable side-by-side comparisons of multiple variables, helping to uncover nuanced patterns for each keyword. This design ensures that each keyword’s unique contribution to ratings is clear and actionable. For instance, keywords like clean, staff, and service exhibit strong positive trends, indicating their frequent mentions in positive reviews are closely tied to higher ratings. Conversely, keywords like breakfast and bathroom display flat or slightly negative trends in negative reviews, reinforcing their association with lower ratings. This granular analysis enables hotels to focus on addressing specific weaknesses while maintaining key strengths to improve overall guest satisfaction and ratings.

  • Positive Keywords (Blue Trend Lines): Keywords like clean, staff, and service exhibit a strong positive trend, indicating that frequent mentions of these terms in positive reviews are strongly associated with higher ratings. Other keywords like location and bed show a moderate positive trend, suggesting they contribute positively but less strongly compared to clean and staff.

  • Negative Keywords (Red Trend Lines): Keywords like bathroom and breakfast show a flat or slightly negative trend, highlighting that their frequent mentions in negative reviews are associated with lower ratings. The red lines for most keywords remain flat or negatively sloped, reinforcing the negative impact of these aspects on customer satisfaction.

# Create Facet Scatter Plot
facet_scatter_plot <- ggplot(facet_data, aes(x = average_score, y = ratio, color = sentiment)) +
  geom_point(alpha = 0.15) +
  geom_smooth(method = "lm", se = FALSE, linetype = "solid", alpha = 0.7) +
  facet_wrap(~ keyword, scales = "free_y") +  # Create facets for each keyword
  labs(
    title = "Keyword Ratio vs Average Score by Sentiment",
    x = "Average Score",
    y = "Keyword Ratio",
    color = "Reviews"
  ) +
  scale_color_manual(values = c("Positive" = "blue", "Negative" = "red")) +
  theme_minimal() +
  theme(
    strip.text = element_text(size = 10, face = "bold"),
    legend.position = "top"
  )

facet_scatter_plot
## `geom_smooth()` using formula = 'y ~ x'

# Save the Facet Scatter Plot as an image
ggsave("Facet_Scatter_Keyword_Ratio.png", facet_scatter_plot, width = 14, height = 10)
## `geom_smooth()` using formula = 'y ~ x'
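The strong, moderate, and flat trends described above are read from the plot by eye. To put numbers on them, one option is to compute the fitted slope and the correlation for each keyword and review type. The sketch below uses hypothetical values in the same shape as `facet_data`; `keyword_slopes` is an assumed name, and the linear model matches the `method = "lm"` trend lines.

```r
library(dplyr)

# Hypothetical data with the same columns used in the facet plot
toy_facet <- tibble(
  average_score = rep(c(7, 8, 9), times = 2),
  keyword       = "staff",
  sentiment     = rep(c("Positive", "Negative"), each = 3),
  ratio         = c(0.20, 0.30, 0.40, 0.30, 0.25, 0.20)
)

keyword_slopes <- toy_facet %>%
  group_by(keyword, sentiment) %>%
  summarise(
    # Slope of the least-squares line ratio ~ average_score
    slope = unname(coef(lm(ratio ~ average_score))[2]),
    # Pearson correlation between rating and keyword ratio
    correlation = cor(average_score, ratio),
    .groups = "drop"
  )

keyword_slopes
```

Sorting the resulting table by `slope` would give an explicit ranking of which keywords rise or fall most steeply with the rating.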

Conclusion

The analysis of customer reviews using various visualizations highlights the key factors influencing hotel ratings and provides actionable insights for improving guest satisfaction. Across all figures, common themes such as cleanliness, staff service, location, and pricing emerge as the most significant contributors to customer experiences.

The sentiment analysis reinforces these findings by providing a balanced view of customer feedback. Positive sentiments dominate high-rated reviews, reflecting the satisfaction derived from clean facilities and friendly staff. Conversely, negative sentiments in low-rated reviews highlight dissatisfaction with price and comfort-related issues. By leveraging these insights, hotels can strategically align their resources to enhance strengths, address weaknesses, and create superior guest experiences.

In conclusion, this analysis provides a roadmap for hoteliers to optimize their services and prioritize operational improvements effectively. By focusing on both strengths and areas for improvement, hotels can drive customer satisfaction, improve their competitive positioning, and achieve sustainable success in the hospitality industry.