10/13/2024
Regork, a national grocery chain, is looking to increase revenue and profits by identifying potential areas for growth. Our task was to determine which age demographic makes travel-related purchases most frequently. This information is critical for Regork’s marketing team to create targeted promotions and capture additional market share in travel-related products.
To address this, we analyzed transaction and demographic data from the Complete Journey dataset. Using this data, we filtered for travel-related purchases and grouped them by age demographics. Our analysis identified the most frequent age group for these purchases, providing actionable insights for marketing strategies.
We used the Complete Journey dataset, which contains detailed transaction and demographic information for households. We use a bar chart, a pie chart, and a scatter plot to visualize the data in this section.
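library(completejourney) # For the Complete Journey data
library(dplyr) # For data wrangling
library(ggplot2) # For plotting
library(knitr) # For knitting report
library(tidyr) # For data manipulation
transactions <- get_transactions()
# Assumption: demographics ships as a dataset in completejourney rather than through a get_demographics() function
demographics <- completejourney::demographics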
transactions <- completejourney::get_transactions()
products <- completejourney::products
head(products)
travel_products <- products %>%
  filter(grepl("travel|toiletries|luggage", product_description, ignore.case = TRUE))
head(travel_products)
colnames(products)
head(products)
travel_products <- products %>%
  filter(grepl("travel|toiletries|luggage", product_category, ignore.case = TRUE) |
         grepl("travel|toiletries|luggage", department, ignore.case = TRUE))
head(travel_products)
travel_related_transactions <- transactions %>%
  inner_join(travel_products, by = "product_id") %>%
  mutate(travel_flag = 1)
merged_data <- travel_related_transactions %>%
  inner_join(demographics, by = "household_id")
head(demographics)
merged_data <- travel_related_transactions %>%
  inner_join(demographics, by = "household_id")
head(merged_data)
travel_by_age <- merged_data %>% group_by(age) %>%
summarize(travel_count = sum(travel_flag),
total_transactions = n())
travel_by_age <- merged_data %>%
  group_by(age) %>% # Use the correct age column
  summarize(travel_count = sum(travel_flag),
            total_transactions = n())
sum(is.na(merged_data$age))
merged_data <- merged_data %>% filter(!is.na(age))
str(merged_data$age)
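# Caution: as.numeric() discards the bracket labels (a factor becomes integer codes, a character vector becomes NA)
merged_data$age <- as.numeric(merged_data$age)
# After the numeric conversion, no value matches these labels, so every entry becomes NA
merged_data$age <- factor(merged_data$age, ordered = TRUE, levels = c("19-24", "25-34", "35-44", "45-54", "55-64", "65+"))
merged_data$age <- factor(merged_data$age)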
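The conversions above are easy to get wrong, so a small base-R sketch on made-up values may help show what each call returns. The vector `age_raw` is hypothetical and is not taken from the dataset.

```r
# Hypothetical age brackets, stored as text the way a raw export might hold them
age_raw <- c("25-34", "45-54", "19-24", "45-54", "65+")
age_levels <- c("19-24", "25-34", "35-44", "45-54", "55-64", "65+")

# Build the ordered factor directly from the labels
age_ord <- factor(age_raw, levels = age_levels, ordered = TRUE)

# as.numeric() on a factor returns the position of each level, not an age
age_codes <- as.numeric(age_ord)
print(age_codes)  # 2 4 1 4 6

# Re-applying factor() to those codes with the label levels matches nothing
age_lost <- factor(age_codes, levels = age_levels, ordered = TRUE)
print(sum(is.na(age_lost)))  # 5
```

Building the factor straight from the labels, without a numeric step in between, keeps the brackets intact.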
str(merged_data)
unique(merged_data$age)
unique(demographics$age)
head(demographics)
unique(transactions$household_id)
unique(demographics$household_id)
transactions$household_id <- as.character(transactions$household_id)
demographics$household_id <- as.character(demographics$household_id)
str(transactions$household_id)
str(demographics$household_id)
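Before joining, it can help to check how many transaction households actually appear in the demographics table, since an inner join silently drops the rest. A minimal base-R sketch on invented IDs follows; the data frames `tx` and `demo` are illustrative stand-ins, not the real tables.

```r
# Toy stand-ins for the transactions and demographics tables
tx <- data.frame(household_id = c(1, 1, 2, 3, 4))
demo <- data.frame(household_id = c("1", "3", "5"),
                   age = c("25-34", "45-54", "65+"),
                   stringsAsFactors = FALSE)

# Put both keys on the same type before comparing
tx$household_id <- as.character(tx$household_id)

tx_ids <- unique(tx$household_id)
matched <- tx_ids %in% demo$household_id

# Share of transaction households that have a demographic record
coverage <- mean(matched)
print(coverage)          # 0.5
print(tx_ids[!matched])  # "2" "4"
```

A low coverage value would mean the age breakdown describes only a subset of shoppers.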
merged_data <- travel_related_transactions %>%
  inner_join(demographics, by = "household_id")
head(merged_data)
unique(merged_data$age)
travel_by_age <- merged_data %>% group_by(age) %>%
summarize(travel_count = sum(travel_flag),
total_transactions = n())
ggplot(travel_by_age, aes(x = age, y = travel_count)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  labs(title = "Travel-Related Purchases by Age", x = "Age Group", y = "Number of Travel-Related Purchases") +
  theme_minimal()
# Note: merged_data holds only travel-flagged rows, so travel_count equals total_transactions and this ratio is 100 for every age group
travel_by_age <- travel_by_age %>%
  mutate(travel_frequency = (travel_count / total_transactions) * 100)
print(travel_by_age)
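Because `merged_data` contains only travel-flagged rows, a purchase rate needs a denominator drawn from all transactions. One possible way to compute it is sketched below in base R on invented data; the tables `tx` and `demo`, the product IDs, and `travel_ids` are all illustrative assumptions.

```r
# Toy transactions: every purchase, travel-related or not
tx <- data.frame(
  household_id = c("1", "1", "1", "2", "2", "3", "3", "3", "3"),
  product_id   = c("a", "b", "t1", "a", "t2", "b", "b", "t1", "t2"),
  stringsAsFactors = FALSE
)
# Toy demographics: one age bracket per household
demo <- data.frame(
  household_id = c("1", "2", "3"),
  age = c("25-34", "45-54", "45-54"),
  stringsAsFactors = FALSE
)
travel_ids <- c("t1", "t2")  # products matched by the travel filter

# Flag each transaction instead of filtering, so non-travel rows stay
tx$travel_flag <- as.integer(tx$product_id %in% travel_ids)

# Attach age, then count travel rows and all rows per bracket
joined <- merge(tx, demo, by = "household_id")
travel_count <- tapply(joined$travel_flag, joined$age, sum)
total <- tapply(joined$travel_flag, joined$age, length)

travel_by_age <- data.frame(
  age = names(total),
  travel_count = as.vector(travel_count),
  total_transactions = as.vector(total)
)
travel_by_age$travel_frequency <-
  100 * travel_by_age$travel_count / travel_by_age$total_transactions
print(travel_by_age)
```

With this layout the rate differs by bracket (here one in three for 25-34 and three in six for 45-54) instead of collapsing to 100.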
Below is the bar chart showing the number of travel-related purchases by age group:
To further illustrate the
distribution of travel-related purchases across age groups, the
following pie chart shows the proportion of total travel-related
purchases by age group:
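The code for this figure is not shown in the report, so the following is only a sketch of how such a chart could be drawn. It uses base R's `pie()` to stay dependency-free, and the counts are hypothetical placeholders, not the report's results.

```r
# Hypothetical counts per age bracket (placeholders, not the report's results)
travel_by_age <- data.frame(
  age = c("19-24", "25-34", "35-44", "45-54", "55-64", "65+"),
  travel_count = c(10, 30, 20, 40, 15, 5),
  stringsAsFactors = FALSE
)

# Each bracket's share of all travel-related purchases
travel_by_age$share <- travel_by_age$travel_count / sum(travel_by_age$travel_count)

# Label each slice with its bracket and percentage
slice_labels <- sprintf("%s (%.1f%%)", travel_by_age$age, 100 * travel_by_age$share)

pie(travel_by_age$travel_count, labels = slice_labels,
    main = "Share of Travel-Related Purchases by Age Group")
```

In ggplot2 a comparable figure is usually built from a stacked `geom_col()` combined with `coord_polar(theta = "y")`.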
To explore whether there is a relationship between the number of total
transactions and travel-related purchases by age group, we present the
scatter plot below:
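The plotting code is not included in the report; a rough base-R sketch of the comparison might look like the following. The figures are hypothetical, and the sketch assumes `total_transactions` counts all purchases in each bracket, not only travel-flagged ones.

```r
# Hypothetical per-bracket totals (placeholders, not the report's results)
by_age <- data.frame(
  age = c("19-24", "25-34", "35-44", "45-54", "55-64", "65+"),
  total_transactions = c(1000, 3000, 2500, 3500, 1500, 500),
  travel_count = c(10, 30, 20, 40, 15, 5),
  stringsAsFactors = FALSE
)

# Pearson correlation between overall activity and travel purchases
r <- cor(by_age$total_transactions, by_age$travel_count)
print(round(r, 3))

# One point per age bracket, labelled with the bracket name
plot(by_age$total_transactions, by_age$travel_count,
     xlab = "Total Transactions", ylab = "Travel-Related Purchases",
     main = "Total vs. Travel-Related Purchases by Age Group")
text(by_age$total_transactions, by_age$travel_count,
     labels = by_age$age, pos = 3)
```

With only six age brackets there are only six points, so any correlation here is a rough indication at best.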
In this analysis, we aimed to identify the age demographic with the highest frequency of travel-related purchases for Regork. By using the Complete Journey dataset, we filtered transactions related to travel products and grouped them by age.
The age group 45-54 has the highest frequency and proportion of travel-related purchases, as shown in both the bar chart and pie chart. Additionally, the scatter plot shows that the number of transactions correlates with travel purchases, suggesting that age groups that shop more overall also make more travel-related purchases.
Key Findings:
- The age group 45-54 has the highest frequency of travel-related purchases.
- The age group 25-34 also shows a significant number of purchases, suggesting potential for targeting this younger audience.
Implications for Regork:
- Targeted Marketing: Focus on the 45-54 age group with promotions for travel essentials.
- Cross-Promotion: Bundle travel products with complementary items like personal care products.
- Expand to Other Age Groups: Consider targeted promotions for the 65+ demographic to increase travel-related purchases in underrepresented segments.
Limitations: This analysis identifies travel-related purchases by keyword matches on product category and department, which may not capture all relevant products. Further refinement of the product filtering criteria could improve accuracy.
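One way the filter could be refined is to search an additional descriptive column as well. The base-R sketch below uses invented product rows and assumes a `product_type` column is available; it illustrates the idea and is not the filter used in this report.

```r
# Hypothetical product rows for illustration only
products <- data.frame(
  product_id = c("p1", "p2", "p3", "p4"),
  product_category = c("TRAVEL SIZE TOILETRIES", "SOFT DRINKS", "SUN CARE", "LUGGAGE"),
  product_type = c("SHAMPOO TRIAL SIZE", "COLA", "SUNSCREEN TRAVEL PACK", "DUFFEL BAG"),
  stringsAsFactors = FALSE
)

pattern <- "travel|toiletries|luggage"

# Match on the category alone, then on the finer-grained type as well
hit_category <- grepl(pattern, products$product_category, ignore.case = TRUE)
hit_type <- grepl(pattern, products$product_type, ignore.case = TRUE)

travel_products <- products[hit_category | hit_type, ]
print(travel_products$product_id)              # "p1" "p3" "p4"

# Rows that only the extra column catches
print(products$product_id[hit_type & !hit_category])  # "p3"
```

Inspecting the rows caught only by the extra column is a quick check on whether a broader filter adds relevant products or noise.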
Based on the analysis, the 45-54 age group has the highest frequency of travel-related purchases. We recommend that Regork pursue the strategies listed under Implications for Regork above.
The analysis reveals that older age groups, particularly shoppers aged 45-54, are the most frequent purchasers of travel-related products. By focusing on these demographics with tailored marketing strategies, Regork can increase revenue.