Here we analyze how hotel bookings are made. By the end, you will know the most popular payment type and how the frequency of City Hotel bookings compares with Resort Hotel bookings.
Use read.csv() from base R to load the hotel bookings data.
hotel_bookings <- read.csv("hotel_bookings.csv")
colnames(hotel_bookings)
## [1] "hotel" "is_canceled"
## [3] "lead_time" "arrival_date_year"
## [5] "arrival_date_month" "arrival_date_week_number"
## [7] "arrival_date_day_of_month" "stays_in_weekend_nights"
## [9] "stays_in_week_nights" "adults"
## [11] "children" "babies"
## [13] "meal" "country"
## [15] "market_segment" "distribution_channel"
## [17] "is_repeated_guest" "previous_cancellations"
## [19] "previous_bookings_not_canceled" "reserved_room_type"
## [21] "assigned_room_type" "booking_changes"
## [23] "deposit_type" "agent"
## [25] "company" "days_in_waiting_list"
## [27] "customer_type" "adr"
## [29] "required_car_parking_spaces" "total_of_special_requests"
## [31] "reservation_status" "reservation_status_date"
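With the column names in view, a quick tabulation can answer the City Hotel versus Resort Hotel question before any plotting. The sketch below uses a small made-up data frame (`toy_bookings` is hypothetical, not the real CSV) so that it runs on its own; with the real data you would tabulate `hotel_bookings$hotel` instead.

```r
# Hypothetical toy data standing in for hotel_bookings.csv
toy_bookings <- data.frame(
  hotel = c("City Hotel", "City Hotel", "Resort Hotel",
            "City Hotel", "Resort Hotel"),
  stringsAsFactors = FALSE
)

# Count bookings per hotel type
hotel_counts <- table(toy_bookings$hotel)
print(hotel_counts)

# Convert the counts to shares of all bookings
hotel_share <- prop.table(hotel_counts)
print(round(hotel_share, 2))
```

The counts are the same numbers that geom_bar() draws as bar heights, so this is a cheap way to sanity-check a chart.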
Set up the environment by loading ‘ggplot2’ and ‘tidyverse’.
library(ggplot2)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ lubridate 1.9.4 ✔ tibble 3.3.0
## ✔ purrr 1.1.0 ✔ tidyr 1.3.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Analyze payment type popularity by hotel type.
ggplot(data = hotel_bookings) +
  geom_bar(mapping = aes(x = market_segment)) +
  facet_wrap(~hotel) +
  labs(title = "Payment Type Popularity by Hotel Type")
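The bar chart shows the counts visually; to read off the top category for each hotel as a value, a cross-tabulation is one option. This is a minimal sketch on made-up data (`toy_bookings` and its rows are hypothetical), not a result from the real file.

```r
# Hypothetical toy data standing in for hotel_bookings
toy_bookings <- data.frame(
  hotel = c("City Hotel", "City Hotel", "City Hotel",
            "Resort Hotel", "Resort Hotel", "Resort Hotel"),
  market_segment = c("Online TA", "Online TA", "Direct",
                     "Direct", "Direct", "Groups"),
  stringsAsFactors = FALSE
)

# Rows are hotels, columns are market segments, cells are booking counts
segment_counts <- table(toy_bookings$hotel, toy_bookings$market_segment)
print(segment_counts)

# For each hotel (row), find the column with the highest count
top_index <- apply(segment_counts, 1, which.max)
top_segment <- setNames(colnames(segment_counts)[top_index],
                        rownames(segment_counts))
print(top_segment)
```

Note that which.max() returns the first maximum, so ties are resolved in favor of the alphabetically earlier segment.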
Here we add the first and last arrival years in the data as a subtitle. We also tilt the market segment names by 45 degrees for readability.
min(hotel_bookings$arrival_date_year)
## [1] 2015
max(hotel_bookings$arrival_date_year)
## [1] 2017
mindate <- min(hotel_bookings$arrival_date_year)
maxdate <- max(hotel_bookings$arrival_date_year)
ggplot(data = hotel_bookings) +
  geom_bar(mapping = aes(x = market_segment)) +
  facet_wrap(~hotel) +
  theme(axis.text.x = element_text(angle = 45)) +
  labs(title = "Payment Type Popularity by hotel type for hotel bookings",
       subtitle = paste0("Data from: ", mindate, " to ", maxdate))
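The separate min() and max() calls can also be collapsed into one call to range(), which returns both values together. The sketch below uses a made-up vector (`arrival_years` is a hypothetical stand-in for `hotel_bookings$arrival_date_year`) to build the same label text.

```r
# Hypothetical stand-in for hotel_bookings$arrival_date_year
arrival_years <- c(2016, 2015, 2017, 2016, 2015)

# range() returns c(minimum, maximum)
year_range <- range(arrival_years)

# Build the label text used for the subtitle
subtitle_text <- paste0("Data from: ", year_range[1], " to ", year_range[2])
print(subtitle_text)
```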
Here we move the date range from the subtitle to a caption, which makes it smaller and less distracting, and we label both axes.
ggplot(data = hotel_bookings) +
  geom_bar(mapping = aes(x = market_segment)) +
  facet_wrap(~hotel) +
  theme(axis.text.x = element_text(angle = 45)) +
  labs(title = "Payment Type Popularity by hotel type for hotel bookings",
       caption = paste0("Data from: ", mindate, " to ", maxdate),
       x = "Market Segment",
       y = "Number of Bookings")
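One optional refinement to the tilted labels: with angle = 45 alone, each label is centered on its tick mark and can run into the axis. Setting hjust = 1 right-aligns the rotated text so that it ends at the tick. The sketch below assumes ggplot2 is installed and uses made-up data (`toy_bookings` and `segment_plot` are hypothetical names).

```r
library(ggplot2)

# Hypothetical toy data standing in for hotel_bookings
toy_bookings <- data.frame(
  hotel = c("City Hotel", "City Hotel", "Resort Hotel", "Resort Hotel"),
  market_segment = c("Online TA", "Offline TA/TO", "Direct", "Online TA")
)

segment_plot <- ggplot(data = toy_bookings) +
  geom_bar(mapping = aes(x = market_segment)) +
  facet_wrap(~hotel) +
  # hjust = 1 right-aligns each rotated label so it ends at its tick mark
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
```

Call print(segment_plot) to draw the chart and compare it with the version without hjust.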
You can save your plot using ggsave(), which writes the most recently displayed plot by default. Adjust the width and height (in inches by default) to your preference.
ggsave("hotel_booking_chart.png",
       width = 7,
       height = 7)
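Relying on the last displayed plot can save the wrong chart if another plot is drawn in between. A slightly more defensive sketch stores the plot in a variable and passes it through the plot argument of ggsave(). The data, the name `segment_plot`, and the temporary output path are assumptions for illustration only.

```r
library(ggplot2)

# Hypothetical toy data standing in for hotel_bookings
toy_bookings <- data.frame(
  market_segment = c("Direct", "Online TA", "Online TA")
)

segment_plot <- ggplot(data = toy_bookings) +
  geom_bar(mapping = aes(x = market_segment))

# Write to a temporary directory so the sketch does not touch the project folder
out_path <- file.path(tempdir(), "hotel_booking_chart.png")

# Name the plot explicitly and state the units and resolution
ggsave(out_path, plot = segment_plot,
       width = 7, height = 7, units = "in", dpi = 300)
```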
This was a required step.
## *R Markdown Paragraph*
## **Bold is Useful**