This is an R Markdown Notebook. When you execute code within the notebook, the results appear beneath the code.

Try executing this chunk by clicking the Run button within the chunk or by placing your cursor inside it and pressing Cmd+Shift+Enter.

Numeric summary for two columns

numeric_summary <- summary(hotel_data[c('lead_time', 'stays_in_weekend_nights')])

Display the numeric summary

numeric_summary

Display unique values and counts for categorical columns

categorical_summary1 <- table(hotel_data$meal)
categorical_summary2 <- table(hotel_data$market_segment)

Display the categorical summary for “meal”

categorical_summary1

Display the categorical summary for “market_segment”

categorical_summary2

Aggregating lead time impact on cancellations

lead_time_cancellation_aggregate <- aggregate(is_canceled ~ lead_time, data = hotel_data, FUN = function(x) mean(x == 1))

Display the result

lead_time_cancellation_aggregate
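
Because lead_time takes many distinct values, a per-day cancellation rate can rest on only a handful of bookings per value. One possible variant, sketched here on an invented toy data frame, groups lead time into bins before aggregating. The name toy_bookings, its values, and the break points are assumptions for illustration and are not taken from hotel_data.

```r
# Hypothetical toy data standing in for hotel_data; all values are invented.
toy_bookings <- data.frame(
  lead_time   = c(3, 10, 45, 60, 120, 150, 200, 320),
  is_canceled = c(0, 0, 0, 1, 1, 0, 1, 1)
)

# Group lead time into bins; the break points are an arbitrary choice.
toy_bookings$lead_time_bin <- cut(
  toy_bookings$lead_time,
  breaks = c(0, 30, 90, 180, Inf),
  include.lowest = TRUE,
  right = FALSE
)

# Share of canceled bookings within each bin.
binned_cancellation_rate <- aggregate(
  is_canceled ~ lead_time_bin,
  data = toy_bookings,
  FUN = mean
)

binned_cancellation_rate
```

The same pattern should carry over to hotel_data by swapping in the real data frame, with break points chosen to suit the observed range of lead_time.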

Aggregating effect of meal type on average daily rate (adr), used here as a rough proxy for customer satisfaction

meal_satisfaction_aggregate <- aggregate(adr ~ meal, data = hotel_data, FUN = mean)

Display the result

meal_satisfaction_aggregate

Lead Time Distribution

library(ggplot2)  # provides ggplot(), geom_histogram(), labs(), theme_minimal()

lead_time_plot <- ggplot(hotel_data, aes(x = lead_time)) +
  geom_histogram(binwidth = 50, fill = "#66c2a5", color = "#1f78b4", alpha = 0.7) +
  labs(title = "Lead Time Distribution", x = "Lead Time (days)", y = "Frequency") +
  theme_minimal()

Display the plot

lead_time_plot

Correlation between Lead Time and Cancellations

lead_time_cancellation_plot <- ggplot(hotel_data, aes(x = lead_time, fill = factor(is_canceled))) +
  geom_density(alpha = 0.7) +
  labs(title = "Correlation between Lead Time and Cancellations", x = "Lead Time (days)", y = "Density", fill = "Cancellation") +
  theme_minimal()

Display the plot

lead_time_cancellation_plot

QUESTION 2

A set of at least 3 novel questions to investigate informed by the following:

column summaries (i.e., the above bullet)

data documentation

your project’s goals/purpose

Booking Pattern and Lead Time

Question: How does lead time (the number of days between booking and arrival) vary across meal types (meal column) and market segments (market_segment column)?

Reasoning: Knowing how lead time patterns relate to meal type and market segment allows marketing and operational planning to be tailored accordingly.
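
One way to start on this question is to summarise lead time for each combination of meal and market segment. The sketch below uses an invented toy data frame; toy_bookings and its values are assumptions for illustration, and the median is just one reasonable choice of summary.

```r
# Hypothetical toy data standing in for hotel_data; all values are invented.
toy_bookings <- data.frame(
  lead_time      = c(5, 15, 40, 80, 100, 200),
  meal           = c("BB", "BB", "HB", "HB", "BB", "HB"),
  market_segment = c("Corporate", "Corporate", "Online TA",
                     "Online TA", "Online TA", "Corporate")
)

# Median lead time for every meal / market segment combination.
lead_time_by_group <- aggregate(
  lead_time ~ meal + market_segment,
  data = toy_bookings,
  FUN = median
)

lead_time_by_group
```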

Lead Time Impact on cancellation

Question: How does lead time (the number of days between booking and arrival) correlate with the likelihood of cancellation?

Reasoning: Understanding the relationship between lead time and cancellation shows whether customers are more likely to cancel reservations made well in advance or those made closer to the arrival date, so that marketing and operational planning can be adjusted accordingly.
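
A possible first pass at this question, sketched on an invented toy data frame, is a Pearson correlation between lead time and the 0/1 cancellation flag, followed by a simple logistic regression. Both are suggestions rather than analyses taken from the notebook, and toy_bookings and its values are assumptions for illustration.

```r
# Hypothetical toy data standing in for hotel_data; all values are invented.
toy_bookings <- data.frame(
  lead_time   = c(3, 10, 45, 60, 120, 150, 200, 320),
  is_canceled = c(0, 0, 0, 1, 1, 0, 1, 1)
)

# Pearson correlation with a 0/1 variable (the point-biserial correlation).
lead_time_cancel_cor <- cor(toy_bookings$lead_time, toy_bookings$is_canceled)

# Logistic regression: how the log-odds of cancellation change per extra day.
cancel_fit <- glm(is_canceled ~ lead_time, data = toy_bookings, family = binomial)
lead_time_slope <- coef(cancel_fit)["lead_time"]

lead_time_cancel_cor
lead_time_slope
```

A positive correlation or slope would suggest that bookings made further in advance are canceled more often, although neither figure establishes a causal link.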

Customer preference across market segment

Question: How do customer preferences, such as booking changes and special requests, vary across market segments such as online travel agencies and corporate bookings?

Reasoning: Exploring how customer behaviour differs among market segments can guide which segments to target and help improve services based on the unique needs and preferences of each segment.
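
As a starting point for this question, the average number of booking changes and special requests could be compared per market segment. The sketch below uses an invented toy data frame; the column names booking_changes and total_of_special_requests are assumptions about how these fields are named in hotel_data, and all values are made up.

```r
# Hypothetical toy data standing in for hotel_data; all values are invented,
# and the two preference column names are assumed, not confirmed by the notebook.
toy_bookings <- data.frame(
  market_segment            = c("Online TA", "Online TA", "Corporate", "Corporate"),
  booking_changes           = c(0, 2, 1, 1),
  total_of_special_requests = c(2, 0, 0, 0)
)

# Mean booking changes and special requests for each market segment.
segment_preferences <- aggregate(
  cbind(booking_changes, total_of_special_requests) ~ market_segment,
  data = toy_bookings,
  FUN = mean
)

segment_preferences
```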