# Set the working directory and load the hotel bookings data.
setwd("C:/Users/anger/OneDrive - University of Cincinnati/BANA 7025/BANA 7025")
data.df <- read.csv("C:/Users/anger/OneDrive - University of Cincinnati/BANA 7025/hotels.csv", stringsAsFactors = FALSE)
library(ggplot2)
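# Stacked proportion bars: share of each hotel type within each cancellation status.
# factor() makes the 0/1 indicator a discrete axis with only the labels 0 and 1.
ggplot(data.df,
       aes(x = factor(is_canceled),
           fill = hotel)) +
  geom_bar(position = "fill") +
  labs(y = "Proportion", x = "Cancellation (0 = no, 1 = yes)", title = "Cancellations by Hotel Type")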
For this assignment I used hotel booking data collected by Antonio, Almeida, and Nunes (2019). I was interested in how specific factors affect the cancellation rate, and this plot examines the relationship between hotel type and cancellation. The left bar shows the bookings that were not canceled, and the right bar shows the bookings that were canceled. Within each bar, bookings made at city hotels are shown in red and bookings made at resort hotels are shown in blue. As the graph shows, city hotels account for a larger share of bookings than resort hotels in both bars. However, the gap is wider among canceled bookings. Resort hotels make up approximately 40% of bookings that were not canceled but only about 25% of bookings that were canceled, while city hotels make up approximately 60% of bookings that were not canceled and about 75% of bookings that were canceled. Because city hotels are over-represented among cancellations relative to their share of non-canceled bookings, there is a large discrepancy between the two hotel types. Overall, I believe there is a relationship between hotel type and the cancellation rate: bookings made at city hotels are more likely to result in a cancellation.
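
The plot shows the mix of hotel types within each cancellation status, while the conclusion is about the cancellation rate within each hotel type. A minimal sketch of how both quantities could be computed directly is given below. It assumes `is_canceled` is coded 0/1 and `hotel` holds the hotel type, as in the plot; `cancel_rate_by_hotel` is a hypothetical helper name, and the toy rows are made up purely to show the shape of the output, not taken from the real data.

```r
# Sketch: cancellation rate within each hotel type.
# Assumes `is_canceled` is coded 0/1 and `hotel` holds the hotel type.
# `cancel_rate_by_hotel` is a hypothetical helper name, not part of any package.
cancel_rate_by_hotel <- function(df) {
  # The mean of a 0/1 indicator is the proportion of 1s in each group.
  tapply(df$is_canceled, df$hotel, mean)
}

# Made-up toy rows, only to illustrate the output (not the real data).
toy.df <- data.frame(
  hotel = c("City Hotel", "City Hotel", "City Hotel", "City Hotel",
            "Resort Hotel", "Resort Hotel", "Resort Hotel", "Resort Hotel"),
  is_canceled = c(1, 1, 0, 0, 1, 0, 0, 0),
  stringsAsFactors = FALSE
)

# Cancellation rate per hotel type.
toy.rates <- cancel_rate_by_hotel(toy.df)
print(toy.rates)

# Share of each hotel type within each cancellation status,
# i.e. the quantities the stacked bars display (each column sums to 1).
toy.shares <- prop.table(table(toy.df$hotel, toy.df$is_canceled), margin = 2)
print(toy.shares)
```

Calling `cancel_rate_by_hotel(data.df)` on the loaded bookings data would give the per-hotel cancellation rates that the final sentence of the write-up refers to, provided the columns are coded as assumed here.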