10/13/2024Regork, a national grocery chain, is seeking to identify which age demographic makes the highest frequency of travel-related purchases. Understanding this will help the company create targeted marketing campaigns and promotions to drive sales.
# Methodology
We used the Complete Journey dataset, which contains detailed transactions and demographic information for households.
library(dplyr) library(ggplot2) library(readr) library(tidyr) library(completejourney)
Replace with actual dataset loading code transactions <- get_transactions() demographics <- get_demographics()
transactions <- completejourney::get_transactions() products <- completejourney::products head(products)
travel_products <- products %>% filter(grepl(“travel|toiletries|luggage”, product_description, ignore.case = TRUE))
head(travel_products)
colnames(products)
head(products)
travel_products <- products %>% filter(grepl(“travel|toiletries|luggage”, product_category, ignore.case = TRUE) | grepl(“travel|toiletries|luggage”, department, ignore.case = TRUE))
head(travel_products) travel_related_transactions <- transactions %>% inner_join(travel_products, by = “product_id”) %>% mutate(travel_flag = 1)
merged_data <- travel_related_transactions %>% inner_join(demographics, by = “household_id”)
head(demographics)
merged_data <- travel_related_transactions %>% inner_join(demographics, by = “household_id”)
head(merged_data)
travel_by_age <- merged_data %>% group_by(age) %>%
summarize(travel_count = sum(travel_flag),
total_transactions = n())
travel_by_age <- merged_data %>% group_by(age) %>% # Use the
correct age column summarize(travel_count = sum(travel_flag),
total_transactions = n())
sum(is.na(merged_data$age))
merged_data <- merged_data %>% filter(!is.na(age))
str(merged_data$age)
merged_data\(age <- as.numeric(merged_data\)age)
merged_data\(age <- factor(merged_data\)age, ordered = TRUE, levels = c(“19-24”, “25-34”, “35-44”, “45-54”, “55-64”, “65+”))
merged_data\(age <- factor(merged_data\)age)
str(merged_data) unique(merged_data$age)
unique(demographics$age)
head(demographics)
unique(transactions\(household_id) unique(demographics\)household_id)
transactions\(household_id <- as.character(transactions\)household_id) demographics\(household_id <- as.character(demographics\)household_id)
str(transactions$household_id)
str(demographics$household_id)
merged_data <- travel_related_transactions %>% inner_join(demographics, by = “household_id”)
head(merged_data)
unique(merged_data$age)
travel_by_age <- merged_data %>% group_by(age) %>%
summarize(travel_count = sum(travel_flag),
total_transactions = n())
ggplot(travel_by_age, aes(x = age, y = travel_count)) + geom_bar(stat = “identity”, fill = “steelblue”) + labs(title = “Travel-Related Purchases by Age”, x = “Age Group”, y = “Number of Travel-Related Purchases”) + theme_minimal()
install.packages(“ggplot2”) library(ggplot2)
ggplot(travel_by_age, aes(x = age, y = travel_count)) + geom_bar(stat = “identity”, fill = “steelblue”) + labs(title = “Travel-Related Purchases by Age”, x = “Age Group”, y = “Number of Travel-Related Purchases”) + theme_minimal()
travel_by_age <- travel_by_age %>% mutate(travel_frequency = (travel_count / total_transactions) * 100)
print(travel_by_age)
Below is the bar chart showing the number of travel-related purchases by age group:
Based on the analysis, the 45-54 age group has the highest frequency of travel-related purchases. We recommend the following strategies for Regork:
The analysis reveals that older age groups, particularly those between 45-54, are the most frequent purchasers of travel-related products. By focusing on these demographics with tailored marketing strategies, Regork can increase revenue and drive more sales.