The data were sourced from an article published in the journal
Data in Brief (Elsevier), authored by Antonio
et al. (2019), and available on ScienceDirect. The dataset encompasses a
total of 119,390 booking transactions from two hotels: an unnamed resort
hotel in the Algarve, Portugal, with 40,060 observations (H1), and a
city hotel in Lisbon, Portugal, with 79,330 observations (H2). Both
datasets share the same structure. The dataset includes bookings
scheduled to arrive between July 1, 2015, and August 31, 2017, covering
both completed stays and cancellations.
hotel_bookings <- read.csv("hotel_bookings.csv")
Setting up the environment for analysis.
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.4.4 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggplot2)
library(dplyr)
library(lubridate)
library(ggthemes)
library(forecast)
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
library(tseries)
library(padr)
library(rsample)
library(scales)
##
## Attaching package: 'scales'
##
## The following object is masked from 'package:purrr':
##
## discard
##
## The following object is masked from 'package:readr':
##
## col_factor
library(recipes)
##
## Attaching package: 'recipes'
##
## The following object is masked from 'package:stringr':
##
## fixed
##
## The following object is masked from 'package:stats':
##
## step
View(hotel_bookings)
hotel_bookings %>%
  count(hotel, market_segment, is_canceled) %>%
  group_by(hotel) %>%
  mutate(total = sum(n),
         ratio = n / total,
         is_canceled = ifelse(is_canceled == 0, "No", "Yes")) %>%
  ungroup() %>%
  mutate(market_segment = tidytext::reorder_within(market_segment, ratio, hotel)) %>%
  ggplot(aes(ratio, market_segment, fill = is_canceled)) +
  geom_col(position = "dodge") +
  labs(x = "Percentage", y = "Market Segment",
       title = "Hotel Demand by Market Segment",
       fill = "Booking Cancelled") +
  facet_wrap(~hotel, scales = "free_y") +
  scale_x_continuous(labels = percent_format(accuracy = 1)) +
  scale_fill_manual(values = c("darkgreen", "darkred")) +
  tidytext::scale_y_reordered() +
  theme_pander() +
  theme(legend.position = "top")

Based on the results, the majority of bookings are made through travel
agents (TA), both online and offline, which together account for more
than 40% of the total non-cancelled transactions. The other segments do
not have as many transactions, but it might be beneficial to examine the
revenue generated by each market segment by looking at the Average Daily
Rate (ADR).
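The 40% figure quoted above can be checked numerically rather than read off the chart. The following is a minimal sketch on a small made-up data frame (`toy_bookings` is hypothetical, and the segment label "Offline TA/TO" is assumed to match the dataset's coding); the same pipeline could be pointed at `hotel_bookings`.

```r
library(dplyr)

# Toy data for illustration only; these rows are NOT from the real dataset.
toy_bookings <- data.frame(
  hotel          = c(rep("City Hotel", 6), rep("Resort Hotel", 4)),
  market_segment = c("Online TA", "Online TA", "Offline TA/TO", "Direct",
                     "Online TA", "Groups",
                     "Online TA", "Direct", "Offline TA/TO", "Online TA"),
  is_canceled    = c(0, 0, 0, 0, 1, 1, 0, 0, 0, 1)
)

# Share of each (segment, cancellation status) pair within each hotel,
# mirroring the ratio used for the plot.
segment_shares <- toy_bookings %>%
  count(hotel, market_segment, is_canceled) %>%
  group_by(hotel) %>%
  mutate(ratio = n / sum(n)) %>%
  ungroup()

# Combined share of non-cancelled travel-agent bookings per hotel.
ta_share <- segment_shares %>%
  filter(is_canceled == 0,
         market_segment %in% c("Online TA", "Offline TA/TO")) %>%
  group_by(hotel) %>%
  summarise(ta_ratio = sum(ratio), .groups = "drop")

print(ta_share)
```

Because the ratio is computed within each hotel, the combined share is reported per hotel rather than for the pooled dataset.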
hotel_bookings %>%
  filter(is_canceled == 0) %>% # keep non-cancelled bookings only
  group_by(hotel, market_segment) %>%
  summarise(adr = sum(adr)) %>%
  mutate(total = sum(adr),
         ratio = adr / total) %>%
  ungroup() %>%
  mutate(market_segment = tidytext::reorder_within(market_segment, ratio, hotel)) %>%
  ggplot(aes(ratio, market_segment)) +
  geom_col(fill = "blue") +
  facet_wrap(~hotel, scales = "free_y") +
  tidytext::scale_y_reordered() +
  scale_x_continuous(labels = percent_format(accuracy = 1)) +
  theme_pander() +
  labs(x = NULL, y = NULL,
       title = "ADR Contributions by Market Segments")
## `summarise()` has grouped output by 'hotel'. You can override using the
## `.groups` argument.

The Average Daily Rate (ADR) is calculated by dividing the sum of
all lodging transactions by the total number of nights stayed. Thus, a
higher ADR signifies more revenue generated per night. The total ADR
generated from online travel agents is the highest for both city hotels
and resort hotels. The segments of offline travel agents/tour operators
and direct bookings have a narrow margin between them, with the direct
booking segment contributing more to resort hotels, despite having a
lower number of transactions. This identifies the leading three market
segments, both in booking volume and in ADR contribution.
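The ADR definition can be made concrete with a small worked example. The sketch below uses made-up values (`toy_stays` and its `nights` column are hypothetical). In the real data the number of nights would have to be derived, for instance from `stays_in_weekend_nights + stays_in_week_nights`, which is an assumption about the column names. It also shows why summing `adr` across bookings is not the same as total revenue: a plain sum counts every booking once, regardless of how long the stay was.

```r
library(dplyr)

# Toy stays for illustration only; not taken from the real dataset.
toy_stays <- data.frame(
  market_segment = c("Online TA", "Online TA", "Direct"),
  adr            = c(100, 80, 120),  # average daily rate of each booking
  nights         = c(2, 5, 1)        # hypothetical column: nights stayed
)

segment_adr <- toy_stays %>%
  mutate(revenue = adr * nights) %>%            # approximate lodging revenue
  group_by(market_segment) %>%
  summarise(total_revenue = sum(revenue),
            total_nights  = sum(nights),
            sum_of_adr    = sum(adr),           # what a plain sum(adr) gives
            .groups = "drop") %>%
  mutate(segment_adr = total_revenue / total_nights)  # revenue per night

print(segment_adr)
```

In this toy case the two "Online TA" bookings sum to an `adr` of 180, while their estimated revenue is 600 over 7 nights.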
ggplot(data = hotel_bookings, aes(x = lead_time, y = children)) +
  geom_point(aes(color = children)) + # Color by 'children'
  scale_color_gradient(low = "skyblue", high = "darkblue") + # Gradient color
  labs(title = "Relationship between Booking Lead Time and Guests Traveling with Children",
       x = "Booking Lead Time",
       y = "Number of Children",
       color = "Children") +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5)) # Center the plot title
## Warning: Removed 4 rows containing missing values (`geom_point()`).

The plot explores the relationship between booking lead time and guests
travelling with children. Guests who book far in advance form a distinct
group, and most of them are travelling without children.
Suppose the marketing department aims to launch a family-friendly
promotion targeting key market segments. It may want to identify which
segments generate the highest number of bookings and where those
bookings are made, whether in city hotels or resort hotels.
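A scatter plot with this many overlapping points can hide how the two groups actually compare, so a numeric summary may help. The following is a minimal sketch on made-up values (`toy_bookings` is hypothetical); it drops rows where `children` is missing, in line with the warning printed above.

```r
library(dplyr)

# Toy data for illustration only; not taken from the real dataset.
toy_bookings <- data.frame(
  lead_time = c(10, 40, 200, 350, 15, 60, 90, 5),
  children  = c(0, 0, 0, 0, 1, 2, NA, 3)
)

lead_time_summary <- toy_bookings %>%
  filter(!is.na(children)) %>%              # skip bookings with no child count
  mutate(has_children = children > 0) %>%
  group_by(has_children) %>%
  summarise(bookings         = n(),
            median_lead_time = median(lead_time),
            max_lead_time    = max(lead_time),
            .groups = "drop")

print(lead_time_summary)
```

Reporting the number of bookings alongside the medians makes it easier to judge whether a difference between the groups rests on enough observations.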
A bar chart may be useful to display each hotel type alongside the
market segments, utilizing distinct colors to represent each
segment.
ggplot(data = hotel_bookings) +
  geom_bar(mapping = aes(x = hotel, fill = market_segment)) +
  labs(title = "Hotel Type and Market Segment")

The stacked bar chart makes it hard to compare the sizes of the market
segments, particularly those near the top of the bars. A separate plot
for each market segment allows a clearer, more detailed comparison.
ggplot(data = hotel_bookings) +
  geom_bar(mapping = aes(x = hotel), fill = "steelblue") + # Adding color to the bars
  facet_wrap(~market_segment) +
  labs(title = "Comparison of the Size of Each Market Segment") +
  theme_minimal() + # Using a minimal theme for a cleaner look
  theme(plot.title = element_text(hjust = 0.5)) # Centering the plot title

Now that there is a separate bar chart for each market segment, we have
a clearer idea of the size of each market segment, as well as the
corresponding data for each hotel type.
After considering all the data, the stakeholders may decide to send
the promotion to families that make online bookings for city hotels. The
online segment is the fastest growing segment, and families tend to
spend more at city hotels than other types of guests.
In addition, it would help to plot the relationship between lead time
and guests travelling with children for online bookings at city hotels.
This will give a better idea of the specific timing for the promotion.
The task can be broken down into the following two steps:
1) filtering the data;
2) plotting the filtered data.
onlineta_city_hotels <- filter(hotel_bookings, hotel == "City Hotel" & market_segment == "Online TA")
View the assigned data
View(onlineta_city_hotels)
ggplot(data = onlineta_city_hotels, aes(x = lead_time, y = children, color = as.factor(is_canceled))) +
  geom_point() + # Keep the original point aesthetics
  labs(title = "Online Bookings for City Hotels",
       x = "Lead Time",
       y = "Number of Children",
       color = "Booking Cancelled") +
  scale_color_manual(values = c("0" = "deepskyblue", "1" = "coral"),
                     labels = c("No", "Yes")) +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5)) +
  guides(color = guide_legend(override.aes = list(shape = 15, size = 6))) # Use square shape for legend keys, adjust size as needed
## Warning: Removed 1 rows containing missing values (`geom_point()`).

Based on the filter applied, the scatter plot displays data for
online bookings at city hotels. This visualization indicates that
bookings with three children have notably shorter lead times (less
than 200 days), whereas bookings with two or fewer children extend to
much longer lead times. Therefore, promotions targeting families with
three children can be strategically scheduled closer to the relevant
booking dates for greater effectiveness.
(The filtered data will enable the creation of varied views and
facilitate a deeper investigation into more specific relationships
within the dataset.)
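One such view is a count of bookings and the longest lead time for each number of children, which helps judge how many observations sit behind the three-children pattern. The following is a minimal sketch on made-up rows (`toy_bookings` is hypothetical); with the real data, the same summary could be applied to `onlineta_city_hotels` directly.

```r
library(dplyr)

# Toy data for illustration only; not taken from the real dataset.
toy_bookings <- data.frame(
  hotel          = c("City Hotel", "City Hotel", "City Hotel", "City Hotel",
                     "Resort Hotel", "City Hotel"),
  market_segment = c("Online TA", "Online TA", "Online TA", "Direct",
                     "Online TA", "Online TA"),
  lead_time      = c(300, 120, 30, 45, 200, 60),
  children       = c(0, 2, 3, 1, 0, 3)
)

children_view <- toy_bookings %>%
  filter(hotel == "City Hotel", market_segment == "Online TA") %>%
  group_by(children) %>%
  summarise(bookings      = n(),
            max_lead_time = max(lead_time),
            .groups = "drop")

print(children_view)
```

If the three-children group turns out to contain only a handful of bookings, the shorter lead time seen in the plot should be treated as tentative.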
Recommendations
If booking lead times are longer, meaning guests plan their
trips well in advance, a hotel can take several strategic marketing
actions to make the most of the situation:
1. Early Bird Discounts
Offer discounts or special packages to guests who book their stay
several months ahead of time. This encourages guests to lock in their
reservations early, securing revenue for the company ahead of time.
2. Flexible Cancellation Policies
To attract early bookings, companies can offer more flexible
cancellation policies. Knowing they can cancel without penalty makes
customers more likely to book early.
3. Personalized Marketing
Use the lead time to engage with customers through personalized
marketing. This could include sending them information about the
destination, tips for their upcoming stay, or offers on upgrades and
additional services.
4. Revenue Management
Longer lead times give companies more data to optimize their pricing
strategies. By analyzing booking patterns, they can adjust prices to
maximize revenue, perhaps increasing prices as the arrival date gets
closer and demand goes up.
5. Improved Resource Planning
With a good idea of future occupancy rates, a company can better
plan its resources, including staffing, inventory, and maintenance work.
This helps in providing a better guest experience and managing costs
efficiently.
6. Loyalty Programs
Encourage early bookings by offering points or rewards through a
loyalty program. Guests could earn more points for booking early, which
they can redeem for perks like free nights, room upgrades, or other
services.
7. Special Experiences
Offer exclusive experiences or access to special events for guests
who book early. This could be a dinner at a top restaurant, tickets to
local attractions, or a unique tour that’s only available to early
bookers.
8. Forecasting and Strategy Adjustments
Use the data from booking patterns to improve demand forecasting and
strategic planning. Understanding why lead times are longer can help a
company adjust its marketing and operational strategies to meet customer
needs better.
Adapting to longer booking lead times is all about understanding
customer behavior and preferences, then aligning your business
strategies to meet those expectations while optimizing your revenue and
operational efficiency.
Shorter booking lead time
If a hotel faces shorter booking lead times, meaning guests make
their plans closer to their travel date, this suggests a shift towards
more last-minute bookings. It can adopt several strategies:
1. Last-Minute Deals
Offer attractive last-minute deals to encourage spur-of-the-moment
bookings. This can help fill up unsold inventory and ensure rooms or
services don’t go unused.
2. Flexible Pricing Strategies
Implement dynamic pricing strategies that adjust based on demand.
When the lead time is short, and demand is high, prices can be adjusted
upward. Conversely, if there’s a lot of available inventory, lowering
prices can help increase occupancy.
3. Targeted Marketing Campaigns
Launch targeted marketing campaigns aimed at last-minute bookers.
Use social media, email newsletters, and other direct communication
channels to reach potential customers with special offers and incentives
that encourage immediate booking.
4. Streamline the Booking Process
Make sure the booking process is as easy and fast as possible. A
streamlined, hassle-free booking experience is crucial for capturing
last-minute bookings, especially through mobile devices.
5. Package Deals and Extras
Create package deals or offer added extras, like free breakfast,
parking, or Wi-Fi, to make last-minute offers more attractive compared
to competitors. This can also enhance guest satisfaction and perceived
value.
6. Partnerships and Cross-Promotions
Collaborate with local attractions, restaurants, and event
organizers to create comprehensive last-minute packages. This not only
adds value for guests but also promotes local tourism.
7. Efficient Resource Management
With shorter lead times, companies need to be agile in managing
resources, including staffing and inventory. Quick adjustments based on
the latest booking trends can help manage costs and ensure guests
receive the best experience.
8. Utilize Technology for Real-Time Insights
Invest in technology solutions that provide real-time insights into
booking patterns and guest preferences. This data can inform decisions
on pricing, marketing, and resource allocation.
9. Encourage Reviews and Social Proof
Positive reviews and recommendations can be particularly persuasive
for last-minute bookers. Encourage satisfied guests to leave reviews and
share their experiences on social media to attract more last-minute
bookings.
10. Loyalty Programs
Adjust loyalty program rewards to offer immediate benefits or perks
for last-minute bookings. This can encourage loyalty members to choose
your company even when booking late.
Companies facing shorter booking lead times should focus on
flexibility, efficiency, and targeted marketing to attract last-minute
bookers while optimizing their revenue and ensuring a great guest
experience.
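To judge which of the two sets of recommendations is more relevant, one option is to band `lead_time` and look at the share of bookings in each band. The following is a minimal sketch in base R; the values in `toy_lead_times` are made up and the 30- and 90-day cut-offs are arbitrary assumptions, not thresholds taken from the analysis.

```r
# Toy lead times in days, for illustration only.
toy_lead_times <- c(3, 12, 45, 95, 180, 400)

# Band the lead times; the 30- and 90-day cut-offs are hypothetical.
lead_time_band <- cut(
  toy_lead_times,
  breaks = c(-Inf, 30, 90, Inf),
  labels = c("short (<= 30 days)", "medium (31-90 days)", "long (> 90 days)")
)

# Share of bookings falling in each band.
band_share <- prop.table(table(lead_time_band))
print(round(band_share, 2))
```

A large share in the long band would point towards the early-booking strategies, while a large share in the short band would favour the last-minute ones.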