#Grouping 1: Based on lead_time
1. Converts the reservation_status_date column to a Date by parsing its mm/dd/YYYY text.
2. Groups the data by the market_segment column using the group_by function.
3. Calculates the mean of the lead_time variable within each market segment using the summarize function.
4. Computes the count of observations in each group using the n function.
5. Filters the grouped data to identify the group with the minimum count (n).
6. Assigns the special tag "Lowest Probability Group" to the smallest group.
7. Merges the data frame with the special tag back to the original data frame using the market_segment column.
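The date conversion in step 1 can be checked on a couple of strings before running it on the full column. This is a minimal sketch with made-up values (raw_dates and mismatch are hypothetical names, not part of the analysis); it shows that the format argument describes the layout of the input text, while the resulting Date prints in R's standard YYYY-MM-DD form.

```r
# Toy strings in month/day/year order (hypothetical values, not from the data set)
raw_dates <- c("7/1/2015", "12/25/2016")

# format describes the INPUT layout; the result is a Date object
parsed <- as.Date(raw_dates, format = "%m/%d/%Y")
print(parsed)

# Text that does not match the format becomes NA instead of raising an error
mismatch <- as.Date("2015-07-01", format = "%m/%d/%Y")
print(is.na(mismatch))
```

Counting the NA values after the conversion, for example with sum(is.na(...)), is a quick way to spot rows whose text was not in the expected layout.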
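# Load dplyr for %>%, group_by, summarize, filter, mutate, and left_join.
library(dplyr)
# Parse reservation_status_date (mm/dd/YYYY text) into an R Date.
h_data$reservation_status_date <- as.Date(h_data$reservation_status_date, format = "%m/%d/%Y")
# Average lead time and number of bookings per market segment.
grouped_market_segment <- h_data %>%
  group_by(market_segment) %>%
  summarize(average_lead_time = mean(lead_time),
            n = n()) %>%
  ungroup()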
# Assign a special tag to the smallest group.
# n is compared across the whole summary table (no regrouping), so only the
# segment(s) with the fewest bookings are kept.
smallest_group_market_segment <- grouped_market_segment %>%
  filter(n == min(n)) %>%
  mutate(special_tag = "Lowest Probability Group")
# Merge back to the original data frame
hotel_data_with_tag_market_segment <- h_data %>%
  left_join(smallest_group_market_segment, by = "market_segment")
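The group, count, tag, and merge pattern can be tried end to end on a small table. The sketch below uses a hypothetical toy data frame (toy, segment_summary, smallest, and toy_tagged are illustrative names, and the values are made up). It also adds a share column, n divided by the total, as one possible way to read "lowest probability": the smallest group is the one with the lowest empirical share of bookings.

```r
library(dplyr)

# Hypothetical toy bookings; the real h_data has many more rows and columns
toy <- tibble(
  market_segment = c("Online TA", "Online TA", "Online TA", "Direct", "Direct", "Aviation"),
  lead_time      = c(100, 60, 20, 30, 10, 4)
)

# One row per segment: mean lead time, count, and share of all bookings
segment_summary <- toy %>%
  group_by(market_segment) %>%
  summarize(average_lead_time = mean(lead_time), n = n()) %>%
  mutate(share = n / sum(n))

# Keep only the segment(s) with the minimum count and tag them
smallest <- segment_summary %>%
  filter(n == min(n)) %>%
  mutate(special_tag = "Lowest Probability Group")

# Rows from other segments get NA in the joined columns
toy_tagged <- toy %>%
  left_join(smallest, by = "market_segment")

print(segment_summary)
print(toy_tagged)
```

Because left_join keeps every row of the left table, the row count is unchanged and only bookings from the smallest segment carry a non-missing special_tag.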
Insight: This analysis identifies the market segment with the lowest probability, that is, the segment with the fewest bookings, and reports the average lead time for each segment. The bar plot shows the average lead time per market segment, and the special tag highlights the lowest-probability group.
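The plotting code for this grouping is not shown, so the following is only a sketch of how such a bar plot might be drawn with ggplot2. The summary table is hypothetical toy data with the same column names as grouped_market_segment, and mapping fill to the tag is an assumed way of highlighting the smallest group.

```r
library(dplyr)
library(ggplot2)

# Hypothetical summary table (made-up values) shaped like grouped_market_segment
segment_summary <- tibble(
  market_segment    = c("Online TA", "Direct", "Aviation"),
  average_lead_time = c(60, 20, 4),
  n                 = c(3L, 2L, 1L)
) %>%
  mutate(special_tag = if_else(n == min(n), "Lowest Probability Group", "Other groups"))

# One bar per segment; the fill colour separates the tagged group from the rest
lead_time_plot <- ggplot(segment_summary,
                         aes(x = market_segment, y = average_lead_time, fill = special_tag)) +
  geom_col() +
  labs(x = "Market Segment", y = "Average Lead Time", fill = NULL) +
  theme_minimal()

print(lead_time_plot)
```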
#Grouping 2: Based on meal column
1. Groups the data by the meal column using the group_by function.
2. Calculates the mean of the stays_in_weekend_nights variable within each meal group using the summarize function.
3. Computes the count of observations in each group using the n function.
4. Filters the grouped data to identify the group with the minimum count (n).
5. Assigns the special tag "Lowest Probability Group" to the smallest group.
6. Merges the data frame with the special tags back to the original data frame using the meal column.
grouped_meal <- h_data %>%
  group_by(meal) %>%
  summarize(average_stays_in_weekend_nights = mean(stays_in_weekend_nights),
            n = n()) %>%
  ungroup()
# Assign a special tag to the smallest group.
# n is compared across the whole summary table (no regrouping), so only the
# meal type(s) with the fewest bookings are kept.
smallest_group_meal <- grouped_meal %>%
  filter(n == min(n)) %>%
  mutate(special_tag = "Lowest Probability Group")
# Merge back to the original data frame
hotel_data_with_tag_meal <- h_data %>%
  left_join(smallest_group_meal, by = "meal")
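One detail worth checking in a summary like this is how mean treats missing values. Whether stays_in_weekend_nights actually contains any NA in h_data is not stated, so the sketch below is an assumption-driven illustration on hypothetical toy data (toy_meals and meal_summary are made-up names and values): a single NA turns a group's plain mean into NA, while na.rm = TRUE averages the remaining values.

```r
library(dplyr)

# Hypothetical toy data with one missing value in the "BB" group
toy_meals <- tibble(
  meal = c("BB", "BB", "HB", "HB", "SC"),
  stays_in_weekend_nights = c(2, NA, 1, 3, 0)
)

meal_summary <- toy_meals %>%
  group_by(meal) %>%
  summarize(
    average_plain = mean(stays_in_weekend_nights),                # NA if any value is missing
    average_na_rm = mean(stays_in_weekend_nights, na.rm = TRUE),  # ignores missing values
    n = n()                                                       # counts rows, including NA rows
  )

print(meal_summary)
```

Note that n() still counts the row with the missing value, so the group sizes used for the tag are unaffected by na.rm.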
Insight: This analysis compares the average number of weekend nights stayed across meal types. The visualization places the meal types side by side, making it easy to see whether the length of weekend stays varies with meal choice.
#Grouping 3: Group by deposit_type and summarize total_of_special_requests
1. Groups the data by the deposit_type column using the group_by function.
2. Calculates the mean of the total_of_special_requests variable within each deposit_type group using the summarize function.
3. Computes the count of observations in each group using the n function.
4. Filters the grouped data to identify the group with the minimum count (n).
5. Assigns the special tag "Lowest Probability Group" to the smallest group.
6. Merges the data frame with the special tags back to the original data frame using the deposit_type column.
grouped_deposit_type <- h_data %>%
  group_by(deposit_type) %>%
  summarize(average_total_of_special_requests = mean(total_of_special_requests),
            n = n()) %>%
  ungroup()
#Number_total_of_special_requests <-mean(total_of_special_requests)
# Assign a special tag to the smallest group.
# n is compared across the whole summary table (no regrouping), so only the
# deposit type(s) with the fewest bookings are kept.
smallest_group_deposit_type <- grouped_deposit_type %>%
  filter(n == min(n)) %>%
  mutate(special_tag = "Lowest Probability Group")
# Merge back to the original data frame
hotel_data_with_tag_deposit_type <- h_data %>%
  left_join(smallest_group_deposit_type, by = "deposit_type")
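All three groupings follow the same shape: group, average, count, tag the smallest group, and join back. As a sketch only, that shape could be wrapped in one function. tag_smallest_group is a hypothetical helper that is not part of the original analysis; it takes the column names as strings, and the toy data below is made up.

```r
library(dplyr)

# Hypothetical helper: tags the smallest group of `group_col` and joins the
# tag (plus the group's average of `value_col` and its count) back onto `data`.
tag_smallest_group <- function(data, group_col, value_col) {
  smallest <- data %>%
    group_by(across(all_of(group_col))) %>%
    summarize(average_value = mean(.data[[value_col]]), n = n()) %>%
    filter(n == min(n)) %>%
    mutate(special_tag = "Lowest Probability Group")

  left_join(data, smallest, by = group_col)
}

# Toy bookings (made-up values)
toy <- tibble(
  deposit_type = c("No Deposit", "No Deposit", "No Deposit",
                   "Non Refund", "Non Refund", "Refundable"),
  total_of_special_requests = c(1, 0, 2, 0, 0, 1)
)

tagged <- tag_smallest_group(toy, "deposit_type", "total_of_special_requests")
print(tagged)
```

If two groups tie for the minimum count, both are kept by filter(n == min(n)) and both receive the tag.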
1. Creates a bar plot using ggplot to visualize the average total special requests for each deposit type.
2. The x-axis represents the ‘Deposit Type’, and the y-axis represents the ‘Average Total Special Requests’.
3. Bars are colored with the fill “#66c2a5”.
4. The plot has a minimal theme.
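The four plot steps above can be sketched as follows. The summary table here is hypothetical toy data with made-up averages, standing in for grouped_deposit_type, and geom_col is an assumed choice for drawing bars from pre-computed values; the axis labels, fill colour, and theme follow the description.

```r
library(ggplot2)

# Hypothetical stand-in for grouped_deposit_type (made-up averages)
deposit_summary <- data.frame(
  deposit_type = c("No Deposit", "Non Refund", "Refundable"),
  average_total_of_special_requests = c(0.5, 0.2, 0.8)
)

deposit_plot <- ggplot(deposit_summary,
                       aes(x = deposit_type, y = average_total_of_special_requests)) +
  geom_col(fill = "#66c2a5") +                                    # one bar per deposit type
  labs(x = "Deposit Type", y = "Average Total Special Requests") +
  theme_minimal()

print(deposit_plot)
```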
Insight: This analysis explores the relationship between deposit type and the average number of special requests made by guests. The visualization compares that average across deposit types, showing whether particular deposit types are associated with more or fewer special requests. This can help hotel management and other decision-makers understand how deposit preferences relate to guest requests.
Thank You