Hotel Bookings Plot Analysis

Here we analyze hotel bookings by market segment. By the end you will know which market segment is most popular and how the frequency of City Hotel bookings compares with Resort Hotel bookings.

Import Data

Use base R's read.csv() to load the hotel bookings data, then list its column names.

hotel_bookings <- read.csv("hotel_bookings.csv")
colnames(hotel_bookings)
##  [1] "hotel"                          "is_canceled"                   
##  [3] "lead_time"                      "arrival_date_year"             
##  [5] "arrival_date_month"             "arrival_date_week_number"      
##  [7] "arrival_date_day_of_month"      "stays_in_weekend_nights"       
##  [9] "stays_in_week_nights"           "adults"                        
## [11] "children"                       "babies"                        
## [13] "meal"                           "country"                       
## [15] "market_segment"                 "distribution_channel"          
## [17] "is_repeated_guest"              "previous_cancellations"        
## [19] "previous_bookings_not_canceled" "reserved_room_type"            
## [21] "assigned_room_type"             "booking_changes"               
## [23] "deposit_type"                   "agent"                         
## [25] "company"                        "days_in_waiting_list"          
## [27] "customer_type"                  "adr"                           
## [29] "required_car_parking_spaces"    "total_of_special_requests"     
## [31] "reservation_status"             "reservation_status_date"
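As a rough sketch of the import step, the snippet below writes a tiny CSV to a temporary file and reads it back with read.csv(). The three-row sample and the names toy_path and toy_bookings are hypothetical stand-ins, not part of the real hotel_bookings.csv.

```r
# Hypothetical three-row sample; the real file has 32 columns.
toy_path <- tempfile(fileext = ".csv")
writeLines(c(
  "hotel,market_segment,arrival_date_year",
  "City Hotel,Online TA,2015",
  "Resort Hotel,Direct,2016",
  "City Hotel,Groups,2017"
), toy_path)

# read.csv() returns a data frame with one column per CSV field.
toy_bookings <- read.csv(toy_path, stringsAsFactors = FALSE)

# Check the shape and column types before plotting.
dim(toy_bookings)
str(toy_bookings)
```

Running dim() and str() right after the import is a cheap way to catch a wrong path or a misread column before any plotting starts.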

Load Packages

Set up the environment by loading ‘ggplot2’ and ‘tidyverse’.

library(ggplot2)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ lubridate 1.9.4     ✔ tibble    3.3.0
## ✔ purrr     1.1.0     ✔ tidyr     1.3.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
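library() fails if a package has not been installed yet. A minimal sketch of a pre-check, assuming the two packages named above are the only ones needed (needed, is_installed, and missing_pkgs are illustrative names):

```r
# Packages this analysis loads.
needed <- c("ggplot2", "tidyverse")

# requireNamespace() returns TRUE when a package is installed,
# without attaching it to the search path.
is_installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!is_installed]

# Report anything that still has to be installed.
if (length(missing_pkgs) > 0) {
  message("Install first with install.packages(): ",
          paste(missing_pkgs, collapse = ", "))
}
```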

Annotating Your Chart

Compare market segment popularity across hotel types.

ggplot(data = hotel_bookings) +
  geom_bar(mapping = aes(x = market_segment)) +
  facet_wrap(~hotel) +
  labs(title = "Market Segment Popularity by Hotel Type")
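The bar heights in a chart like this are row counts, so the same comparison can be read off a table. The sketch below uses a hypothetical six-row data frame (toy_bookings) rather than the real data, so the counts are illustrative only.

```r
# Hypothetical sample standing in for hotel_bookings.
toy_bookings <- data.frame(
  hotel = c("City Hotel", "City Hotel", "City Hotel",
            "Resort Hotel", "Resort Hotel", "Resort Hotel"),
  market_segment = c("Online TA", "Online TA", "Direct",
                     "Direct", "Direct", "Groups")
)

# Bookings per hotel type.
hotel_counts <- table(toy_bookings$hotel)

# Bookings per market segment within each hotel type:
# the same numbers geom_bar() + facet_wrap(~hotel) would draw.
segment_counts <- table(toy_bookings$hotel, toy_bookings$market_segment)

# Most frequent segment for each hotel (first one wins on a tie).
top_index <- apply(segment_counts, 1, which.max)
top_segment <- colnames(segment_counts)[top_index]
names(top_segment) <- rownames(segment_counts)
top_segment
```

Applied to the real data frame, the same calls would give the exact counts behind each bar, which is useful when two bars look similar in height.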

Add More Details

Here we add the first and last arrival years covered by the data as a subtitle, and tilt the market segment labels for readability.

min(hotel_bookings$arrival_date_year)
## [1] 2015
max(hotel_bookings$arrival_date_year)
## [1] 2017
mindate <- min(hotel_bookings$arrival_date_year)
maxdate <- max(hotel_bookings$arrival_date_year)
ggplot(data = hotel_bookings) +
  geom_bar(mapping = aes(x = market_segment)) +
  facet_wrap(~hotel) +
  # hjust = 1 anchors the end of each tilted label at its tick mark
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  labs(title = "Market Segment Popularity by Hotel Type",
       subtitle = paste0("Data from: ", mindate, " to ", maxdate))
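The date label is an ordinary string built with paste0(), so it can be checked on its own before it goes into labs(). A small sketch, assuming a hypothetical vector of years in place of the arrival_date_year column:

```r
# Hypothetical arrival years; the real column comes from hotel_bookings.
years <- c(2015L, 2016L, 2016L, 2017L)

# range() returns the minimum and maximum in one call.
year_range <- range(years)

# paste0() concatenates its arguments with no separator.
date_label <- paste0("Data from: ", year_range[1], " to ", year_range[2])
date_label
# [1] "Data from: 2015 to 2017"
```

Using range() is a design choice rather than a requirement; separate min() and max() calls produce the same label.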

Switch Subtitle to Caption

Here we move the date range from the subtitle to a caption, which is smaller and less distracting, and label both axes.

ggplot(data = hotel_bookings) +
  geom_bar(mapping = aes(x = market_segment)) +
  facet_wrap(~hotel) +
  # hjust = 1 anchors the end of each tilted label at its tick mark
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  labs(title = "Market Segment Popularity by Hotel Type",
       caption = paste0("Data from: ", mindate, " to ", maxdate),
       x = "Market Segment",
       y = "Number of Bookings")

Save Your Visualization

You can save your plot with ggsave(), which by default writes the most recently displayed plot. Adjust width and height (in inches by default) to your preference.

ggsave("hotel_booking_chart.png",
       width=7,
       height=7)
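Relying on the last displayed plot can save the wrong chart if another plot was drawn in between. One possible alternative is to store the plot in a variable and pass it to ggsave() explicitly. The sketch below assumes a hypothetical three-row data frame and writes to a temporary directory; p, toy_bookings, and out_path are illustrative names.

```r
library(ggplot2)

# Hypothetical sample standing in for hotel_bookings.
toy_bookings <- data.frame(
  hotel = c("City Hotel", "City Hotel", "Resort Hotel"),
  market_segment = c("Online TA", "Direct", "Direct")
)

# Keep the plot in a variable instead of only printing it.
p <- ggplot(data = toy_bookings) +
  geom_bar(mapping = aes(x = market_segment)) +
  facet_wrap(~hotel)

# Pass the plot explicitly; units and dpi make the output size unambiguous.
out_path <- file.path(tempdir(), "toy_booking_chart.png")
ggsave(out_path, plot = p, width = 7, height = 7, units = "in", dpi = 300)
```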

Getting Started with R Markdown

This was a required step.

## *R Markdown Paragraph*
## **Bold is Useful**
