Does offering an enological experience impact Airbnb ratings, and is this moderated by Superhost status?
This report analyzes how offering an enological experience (e.g., wine tasting, offering a glass of wine) relates to an Airbnb listing’s review score ratings in Florence, and whether this relationship is moderated by Superhost status.
``` r
library(readxl)
library(dplyr)
library(stringr)
library(ggplot2)
library(car)
library(jpeg)
```

``` r
# Set working directory
setwd("/Users/djemkasahinpasic/Documents/LUISS/Uni/Market Data Analysis/Project 1 - Florence")

# Load Data
reviews <- read_excel("reviews.xlsx")
listings <- read_excel("listings.xlsx")
```
- listing_id, listing_url, last_scraped, name, description
- host_id, host_name, host_since, host_location
- neighbourhood_cleansed (renamed to *`neighborhood`*)
- price
- review_scores_rating, review_scores_accuracy, review_scores_cleanliness, review_scores_checkin, review_scores_communication, review_scores_location, review_scores_value
- selected_amenities
- host_is_superhost
- listing_id
- id
- date
``` r
# Rename the neighbourhood column to the spelling used in the rest of the report
listings <- listings %>% rename(neighborhood = neighbourhood_cleansed)
```
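``` r
# Flag reviews whose text mentions wine in one of several languages.
# str_detect() matches substrings, so "vin" also matches words such as "living".
reviews <- reviews %>%
  mutate(
    enological_experience = ifelse(
      str_detect(tolower(comments), "vino|wine|vin|wein|vinho|wino|şarap|wijn|viini"),
      1,
      0
    )
  )
```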
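Because the pattern above is matched as a substring, short alternatives such as `vin` can also fire inside unrelated words. The following is a minimal sketch of a stricter variant, assuming whole-word matching is what is wanted. The names `wine_terms`, `wine_pattern` and `flag_wine()` are hypothetical and not part of the original analysis, and switching to this rule would change the counts reported later in the report.

``` r
# Hypothetical helper: flag wine mentions only when a term appears as a whole word.
wine_terms <- c("vino", "wine", "vin", "wein", "vinho", "wino",
                "\u015farap",  # the Turkish term from the original pattern, written as an escape
                "wijn", "viini")

# (?<!\p{L}) and (?!\p{L}) require that no letter precedes or follows the term.
wine_pattern <- paste0("(?<!\\p{L})(", paste(wine_terms, collapse = "|"), ")(?!\\p{L})")

flag_wine <- function(comments) {
  as.integer(grepl(wine_pattern, tolower(comments), perl = TRUE))
}

example_comments <- c("Great wine on arrival", "Lovely living room", "Un ottimo vino!")
wine_flags <- flag_wine(example_comments)
```

In this toy input only the first and third comments are flagged. "living" no longer counts as a wine mention because the `vin` inside it is surrounded by letters.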
``` r
# Attach listing-level variables to every review (left join on listing_id)
reviews_listings <- merge(reviews, listings, by = "listing_id", all.x = TRUE)

# Keep only the variables needed for the analysis
final_dataset <- reviews_listings %>%
  select(listing_id, review_scores_rating, enological_experience, host_is_superhost)

# Recode Superhost status ("t" becomes 1, any other value 0) and drop incomplete rows
final_dataset$host_is_superhost <- ifelse(final_dataset$host_is_superhost == "t", 1, 0)
final_dataset <- na.omit(final_dataset)
str(final_dataset)
```

    ## 'data.frame':    209792 obs. of  4 variables:
    ##  $ listing_id           : num  31840 31840 31840 31840 31840 ...
    ##  $ review_scores_rating : num  4.66 4.66 4.66 4.66 4.66 4.66 4.88 4.74 4.74 4.74 ...
    ##  $ enological_experience: num  0 0 0 0 0 0 0 0 0 0 ...
    ##  $ host_is_superhost    : num  0 0 0 0 0 0 0 0 0 0 ...
    ##  - attr(*, "na.action")= 'omit' Named int [1:17119] 403 404 405 406 407 408 409 410 411 412 ...
    ##   ..- attr(*, "names")= chr [1:17119] "403" "404" "405" "406" ...
``` r
head(final_dataset)
```

    ##   listing_id review_scores_rating enological_experience host_is_superhost
    ## 1      31840                 4.66                     0                 0
    ## 2      31840                 4.66                     0                 0
    ## 3      31840                 4.66                     0                 0
    ## 4      31840                 4.66                     0                 0
    ## 5      31840                 4.66                     0                 0
    ## 6      31840                 4.66                     0                 0
``` r
table(final_dataset$host_is_superhost)
```

    ## 
    ##      0      1 
    ##  72494 137298
``` r
host_counts <- table(final_dataset$host_is_superhost)
summary(final_dataset$review_scores_rating)
```

    ##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
    ##    1.00    4.72    4.85    4.80    4.93    5.00
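The report later refers to the IQR (Interquartile Range) method for spotting unusual ratings. As a rough illustration, the conventional 1.5 × IQR fences can be worked out from the rounded quartiles printed above. This is a sketch under that assumption, not necessarily the exact rule the analysis applied, and `is_iqr_outlier()` is a hypothetical helper.

``` r
# Quartiles copied (rounded) from the summary() output above
q1 <- 4.72
q3 <- 4.93
iqr <- q3 - q1

# Conventional 1.5 * IQR fences
lower_fence <- q1 - 1.5 * iqr
upper_fence <- q3 + 1.5 * iqr

# Hypothetical helper: TRUE where a rating falls outside the fences
is_iqr_outlier <- function(x) x < lower_fence | x > upper_fence

outlier_flags <- is_iqr_outlier(c(1.00, 4.40, 4.85, 5.00))
```

With these quartiles the lower fence is about 4.405 and the upper fence about 5.245. Since ratings cannot exceed 5, only low ratings can be flagged under this rule.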
``` r
# Strip currency symbols and thousands separators, then convert to numeric
reviews_listings$price <- as.numeric(gsub("[^0-9.]", "", reviews_listings$price))
summary(reviews_listings$price)
```

    ##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
    ##    10.0   101.0   142.0   197.3   201.0 92324.0    5324

``` r
# Note: prices are summarised over review rows, so listings with more reviews weigh more
price_summary <- summary(reviews_listings$price)
cat("The average price in Florence is", round(mean(reviews_listings$price, na.rm = TRUE), 2),
    ", the maximum price is", max(reviews_listings$price, na.rm = TRUE),
    ", the minimum price is", min(reviews_listings$price, na.rm = TRUE), "\n")
```

    ## The average price in Florence is 197.3 , the maximum price is 92324 , the minimum price is 10
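The price figures above are computed on the merged review-level table, where each listing's price is repeated once per review. The sketch below uses made-up toy data (not the Florence data) to show how a review-level mean can differ from a listing-level mean. The names `toy_reviews` and `toy_listings` are hypothetical.

``` r
# Toy data standing in for reviews_listings: one row per review, price repeated per listing
toy_reviews <- data.frame(
  listing_id = c(1, 1, 1, 2, 3, 3),
  price      = c(100, 100, 100, 250, 80, 80)
)

# Review-level mean: listings with many reviews count several times
mean_per_review <- mean(toy_reviews$price, na.rm = TRUE)

# Listing-level mean: keep one row per listing first
toy_listings <- toy_reviews[!duplicated(toy_reviews$listing_id), ]
mean_per_listing <- mean(toy_listings$price, na.rm = TRUE)
```

If the intent is a typical listing price, deduplicating by `listing_id` before summarising may be the more appropriate choice.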
``` r
ggplot(final_dataset, aes(x = review_scores_rating)) +
  geom_histogram(binwidth = 0.5, fill = "blue", alpha = 0.7, color = "black") +
  theme_minimal() +
  labs(title = "Distribution of Review Scores", x = "Review Score", y = "Frequency")
```
The chart displays the distribution of Airbnb listing review scores. Most reviews are concentrated at higher ratings, indicating an overall trend toward positive evaluations. However, some variation is present, with a limited number of lower scores. This suggests that while perceived quality is generally high, some exceptions exist, possibly influenced by factors such as the offered experience or service quality.
``` r
final_dataset$enological_experience <- as.numeric(final_dataset$enological_experience)

# Count reviews with and without a wine mention
total_comments <- nrow(final_dataset)
wine_comments <- sum(final_dataset$enological_experience == 1)
non_wine_comments <- sum(final_dataset$enological_experience == 0)
percentage_wine <- wine_comments / total_comments * 100
cat("The total number of comments is", total_comments, "\n")
```

    ## The total number of comments is 209792

``` r
cat("The number of comments mentioning 'wine' is", wine_comments, "\n")
```

    ## The number of comments mentioning 'wine' is 12362

``` r
cat("The number of comments that do not mention an enological experience is", non_wine_comments, "\n")
```

    ## The number of comments that do not mention an enological experience is 197430

``` r
cat("The percentage of comments mentioning 'wine' is", round(percentage_wine, 2), "%\n")
```

    ## The percentage of comments mentioning 'wine' is 5.89 %
``` r
ggplot(data.frame(category = c("Contains 'wine'", "Does not contain 'wine'"),
                  count = c(wine_comments, total_comments - wine_comments)),
       aes(x = category, y = count, fill = category)) +
  geom_col(alpha = 0.7) +
  theme_minimal() +
  labs(title = "Proportion of Comments Mentioning 'Wine'", x = "Comment Type", y = "Count")
```
``` r
# Each row of final_dataset is a review, so the bars count reviews by host type
ggplot(final_dataset, aes(x = factor(host_is_superhost, labels = c("Host", "Superhost")),
                          fill = factor(host_is_superhost))) +
  geom_bar(alpha = 0.7) +
  theme_minimal() +
  labs(title = "Number of Hosts vs Superhosts", x = "Superhost Status", y = "Count") +
  scale_fill_manual(values = c("red", "blue"), labels = c("Host", "Superhost"))

# The totals below are review counts per host type, not counts of distinct hosts
cat("The total number of hosts is", host_counts["0"], ", the total number of Superhosts is", host_counts["1"], "\n")
```

    ## The total number of hosts is 72494 , the total number of Superhosts is 137298
``` r
# Cross-tabulate wine mentions (rows) against Superhost status (columns)
experience_counts <- table(final_dataset$enological_experience, final_dataset$host_is_superhost)

ggplot(final_dataset, aes(x = factor(enological_experience, labels = c("Does Not Offer Experience", "Offers Experience")),
                          fill = factor(host_is_superhost, labels = c("Host", "Superhost")))) +
  geom_bar(position = "dodge", alpha = 0.7) +
  theme_minimal() +
  labs(title = "Enological Experience by Host Type",
       x = "Enological Experience",
       y = "Count",
       fill = "Host Type") +
  scale_fill_manual(values = c("red", "blue"))

# The counts below are reviews (rows of final_dataset), not distinct listings
cat("The number of listings that do not offer an enological experience:\n",
    "- Hosts:", experience_counts["0", "0"], "\n",
    "- Superhosts:", experience_counts["0", "1"], "\n")
```

    ## The number of listings that do not offer an enological experience:
    ##  - Hosts: 68982 
    ##  - Superhosts: 128448

``` r
cat("The number of listings that offer an enological experience:\n",
    "- Hosts:", experience_counts["1", "0"], "\n",
    "- Superhosts:", experience_counts["1", "1"], "\n")
```

    ## The number of listings that offer an enological experience:
    ##  - Hosts: 3512 
    ##  - Superhosts: 8850
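The raw counts above are hard to compare because the two host groups differ in size. A small sketch, using only the four counts printed above, converts them into within-group shares. The object name `counts_from_report` is hypothetical, and the rows counted are reviews rather than distinct listings.

``` r
# Counts copied from the output above (rows: wine mention, columns: Superhost status)
counts_from_report <- matrix(
  c(68982, 3512, 128448, 8850),
  nrow = 2,
  dimnames = list(enological_experience = c("0", "1"),
                  host_is_superhost = c("0", "1"))
)

# Column-wise proportions: share of reviews with a wine mention within each host type
shares <- prop.table(counts_from_report, margin = 2)
wine_share_pct <- round(100 * shares["1", ], 2)
```

On these counts, roughly 4.84% of reviews for non-Superhost listings and 6.45% of reviews for Superhost listings mention wine.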
How does offering an enological experience (e.g., wine tasting, offering a glass of wine) impact an Airbnb listing’s review score ratings, and how is this relationship moderated by Superhost status?
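A moderation question of this kind is conventionally tested with an interaction term in a regression. The sketch below illustrates the idea on synthetic data with made-up coefficients. It is not the report's fitted model and not the Florence dataset, and `toy` and `moderation_fit` are hypothetical names.

``` r
set.seed(42)  # synthetic data only; not the Florence dataset
n <- 500
toy <- data.frame(
  enological_experience = rbinom(n, 1, 0.1),
  host_is_superhost     = rbinom(n, 1, 0.6)
)

# Arbitrary, made-up coefficients used to simulate ratings (not capped at 5)
toy$review_scores_rating <- 4.6 +
  0.05 * toy$enological_experience +
  0.20 * toy$host_is_superhost -
  0.03 * toy$enological_experience * toy$host_is_superhost +
  rnorm(n, sd = 0.2)

# `a * b` expands to a + b + a:b; the a:b coefficient is the moderation term
moderation_fit <- lm(
  review_scores_rating ~ enological_experience * host_is_superhost,
  data = toy
)
coef_names <- names(coef(moderation_fit))
```

The coefficient named `enological_experience:host_is_superhost` estimates how the association between a wine mention and the rating differs for Superhosts. Calling `summary(moderation_fit)` would show its estimate and standard error.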
[...] (`review_scores_rating`) facilitates benchmarking across listings.

Outliers in `review_scores_rating` were identified using the IQR (Interquartile Range) method. These could represent unusually high or low ratings that may impact analysis.

This analysis aims to determine how offering an enological experience (e.g., wine tasting, offering a glass of wine) impacts an Airbnb listing’s review scores, and whether this relationship is moderated by the host’s Superhost status. By analyzing these factors, we aim to understand how hosts (both Superhosts and non-Superhosts) can leverage wine-related experiences to optimize guest satisfaction and ratings.
Findings suggest that Superhost status has a stronger impact on ratings than wine offerings, highlighting the importance of consistent high-quality service. However, non-Superhosts can use enological experiences to enhance guest satisfaction, improve ratings, and increase their chances of becoming Superhosts.
Airbnb hosts can use enological experiences as a key differentiator to boost their ratings and attract more guests. Highlighting wine-related offerings in listing descriptions can make properties more appealing and improve booking rates. While Superhosts already have a credibility advantage, they can further enhance their guest experience by incorporating wine-related experiences, reinforcing their status and maintaining a competitive edge.