Currently, the consumption trends of customer groups are completely different, fluctuating based on many different demographics. It is completely practical and urgent for grocery stores to grasp and orient their sales strategies, both bringing profits and outlining the long-term effectiveness of the strategy.
Recognizing and evaluating the trends of big names like Walmart, Tesco, grocery stores are tending to sell items together, also known as bundles, bringing in a lot of revenue, especially when sold at the right times. Based on the connection of data in the completejourney package, Regork can find suitable factors to deploy bundles and increase its sales.
Using the differences in customer demographics, Regork can analyze which age group will spend the most on its products, thereby focusing and building strategies, stimulating supply for the items that this age group needs. From there, we can combine the purchasing density of each product, deduce which products are “famous”, and then combine the “famous” products together, creating bundles with close connections. In addition, taking advantage of the times (holidays) and customer loyalty, Regork can completely promote the bundle trading strategy and bring in huge revenue.
In this report, we will analyze and evaluate each data that Regork needs and combine them together, creating a detailed and vivid statistical picture. From there, develop strategies for each product set. The most important thing is that the methods proposed through each statistical data will contribute to sales growth and enhance Regork’s potential development.
library(tidyverse)
library(dplyr)
library(ggplot2)
library(tidyr)
library(lubridate)
library(arules)
library(completejourney)
library(widyr)
library(kableExtra)
Our analytical approach and data:
We use the join function to combine the get_transactions and product data to get more detailed product data.
Then we combine the demographics data and the get_promotions big data to get a more comprehensive view of consumer shopping behavior.
Finally, we choose to join the promotions dataset and transactions to get more information and assess the impact of marketing campaigns and coupons on purchases based on product category.
# Load the datasets
transactions <- get_transactions()
products <- products
demographics <- demographics
promotions <- get_promotions()
# Data Frame
# Join transactions with products
transactions_products <- transactions %>%
left_join(products, by = "product_id")
# Join transactions with demographics
transactions_demographics <- transactions %>%
left_join(demographics, by = "household_id") %>%
filter(!is.na(age))
# Join transactions with promotions
transactions_promotions <- promotions %>%
inner_join(transactions, by = c('product_id', 'store_id', 'week'))
How can demographic groups under-purchasing certain products (age, income, household size) generate different levels of revenue and sales for specific product categories?
Which demographic groups under-purchasing certain products, indicating potential untapped markets?
# Sales by demographic group
revenue_by_demographic <- transactions_demographics %>%
inner_join(products, by = 'product_id') %>%
group_by(age, income, household_size, product_category) %>%
summarise(total_revenue = sum(sales_value), .groups = 'drop')
# Plot revenue by demographic group
ggplot(revenue_by_demographic, aes(x = age, y = total_revenue, fill = income)) +
geom_bar(stat = "identity", position = "dodge") +
labs(title = "Sales by Demographic Group", x = "Age Group", y = "Total Sales")
find the demographic group that generates the highest revenue for each product category.
# Find the highest revenue-generating demographic groups for each product category
high_revenue_demographics <- revenue_by_demographic %>%
group_by(product_category) %>%
top_n(1, total_revenue) %>% # Select the demographic group with the highest revenue for each category
arrange(product_category, desc(total_revenue))
# View the highest revenue-generating demographic groups per product category
high_revenue_demographics %>%
head() %>%
kable("html", caption = "High Revenue - Generating Demographics by Product Category") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"), full_width = FALSE)
| age | income | household_size | product_category | total_revenue |
|---|---|---|---|---|
| 65+ | 35-49K | 1 | ADULT INCONTINENCE | 489.56 |
| 45-54 | 50-74K | 1 | AIR CARE | 427.67 |
| 45-54 | 50-74K | 1 | ANALGESICS | 319.09 |
| 19-24 | 50-74K | 3 | ANTACIDS | 364.42 |
| 45-54 | 50-74K | 1 | APPAREL | 112.23 |
| 45-54 | 75-99K | 2 | APPLES | 561.83 |
# Calculate the average revenue per product category
avg_revenue_by_category <- revenue_by_demographic %>%
group_by(product_category) %>%
summarise(avg_revenue = mean(total_revenue, na.rm = TRUE), .groups = 'drop')
# Join the average revenue with the original data to find under-purchasing groups
under_purchasing_groups <- revenue_by_demographic %>%
left_join(avg_revenue_by_category, by = "product_category") %>%
filter(total_revenue < avg_revenue) %>% # Identify groups with less than average revenue
arrange(product_category, total_revenue)
# View the under-purchasing demographic groups
under_purchasing_groups %>%
head() %>%
kable("html", caption = "Under-Purchasing Demographic Groups") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"),
full_width = FALSE, position = "center") %>%
column_spec(1, bold = TRUE, color = "black", background = "#f2f2f2") %>%
column_spec(2, width = "20em")
| age | income | household_size | product_category | total_revenue | avg_revenue |
|---|---|---|---|---|---|
| 25-34 | 50-74K | 2 | ADULT INCONTINENCE | 2.79 | 52.05714 |
| 45-54 | 50-74K | 1 | ADULT INCONTINENCE | 2.79 | 52.05714 |
| 65+ | 50-74K | 2 | ADULT INCONTINENCE | 2.79 | 52.05714 |
| 35-44 | 15-24K | 3 | ADULT INCONTINENCE | 2.95 | 52.05714 |
| 35-44 | 15-24K | 4 | ADULT INCONTINENCE | 2.99 | 52.05714 |
| 55-64 | 25-34K | 3 | ADULT INCONTINENCE | 3.18 | 52.05714 |
We will analyze which products are frequently purchased together. Furthermore, we will also show how Regork can leverage this data to develop strategies such as cross-selling or bundling strategies to contribute to increasing revenue.
And most importantly, we will try to find products that are closely linked to customer consumption behavior (e.g., frozen pizzas and beer) to analyze their behavior for long-term projects.
# Ensure distinct transactions with relevant columns
transactions_products_pairing <- transactions_products %>%
select(basket_id, product_category) %>%
distinct()
# Find product pairs frequently purchased together
product_pairs <- transactions_products_pairing %>%
group_by(basket_id) %>%
summarise(products = list(sort(unique(product_category)))) %>% # Sort product names to ensure consistent order
unnest(products) %>%
pairwise_count(products, basket_id, sort = TRUE) %>%
mutate(pair = ifelse(item1 < item2,
paste(item1, "&", item2),
paste(item2, "&", item1))) %>% # Standardize pair order
group_by(pair) %>%
summarise(n = sum(n)) %>% # Aggregate counts for duplicate pairs
arrange(desc(n))
# Get top 10 pairs
top_pairs <- top_n(product_pairs, 10, n)
# Plot the bar chart
ggplot(top_pairs, aes(x = reorder(pair, n), y = n)) +
geom_col(fill = "steelblue") +
coord_flip() +
labs(title = "Top 10 Frequently Bought Together Product Pairs",
x = "Product Pair",
y = "Frequency") +
theme_minimal()
We will first research and evaluate which products have appropriate and relevant categories that experience sales spikes, especially for special holidays of the year such as Halloween or Christmas.
We will also make many hypotheses about what factors influence consumer trends. To do that, we analyze data of shopping trends for items that change over the year to gain more insights into this factor and propose business strategies.
Seasonal Product Categories for Plots: Halloween → CANDY, CHOCOLATE, PUMPKIN Thanksgiving → TURKEY, STUFFING, PIES, CRANBERRY SAUCE Christmas → CHRISTMAS DECOR, HOLIDAY TREATS Valentine’s Day → VALENTINE GIFTS, CHOCOLATE Summer BBQ → BBQ MEATS, WATERMELON
# Add date information to transactions
transactions_products_seasonal <- transactions_products %>%
mutate(date = as.Date(transaction_timestamp))
unique_product_types <- transactions_products_seasonal %>%
distinct(product_type) %>%
arrange(product_type)
# Simplified product category mapping
transactions_products_seasonal <- transactions_products_seasonal %>%
mutate(product_group = case_when(
str_detect(product_type, "CANDY|CHOCOLATE") ~ "Candy & Chocolate",
str_detect(product_type, "PUMPKIN") ~ "Pumpkin",
str_detect(product_type, "TURKEY") ~ "Turkey",
str_detect(product_type, "HOLIDAY|CHRISTMAS") ~ "Holiday Treats",
str_detect(product_type, "VALENTINE") ~ "Valentine Gifts",
str_detect(product_type, "BBQ|HOT DOGS|BURGERS") ~ "BBQ Meats",
str_detect(product_type, "WATERMELON") ~ "Watermelon",
TRUE ~ NA_character_
)) %>%
filter(!is.na(product_group))
# Function to create sales trend plots for each holiday
plot_holiday_trends <- function(holiday_name, categories) {
holiday_sales <- transactions_products_seasonal %>%
filter(product_group %in% categories) %>%
group_by(date, product_group) %>%
summarise(total_sales = sum(quantity, na.rm = TRUE), .groups = 'drop')
ggplot(holiday_sales, aes(x = date, y = total_sales, color = product_group)) +
geom_line(size = 1) +
labs(title = paste("Sales Trend for", holiday_name),
x = "Date",
y = "Total Sales",
color = "Product Category") +
theme_minimal()
}
# Generate plots for each holiday
plot_halloween <- plot_holiday_trends("Halloween", c("Candy & Chocolate", "Pumpkin"))
plot_thanksgiving <- plot_holiday_trends("Thanksgiving", "Turkey")
plot_christmas <- plot_holiday_trends("Christmas", c("Holiday Treats"))
plot_valentine <- plot_holiday_trends("Valentine's Day", c("Candy & Chocolate", "Valentine Gifts"))
plot_summer_bbq <- plot_holiday_trends("Summer BBQ", c("BBQ Meats", "Watermelon"))
print(plot_halloween)
print(plot_thanksgiving)
print(plot_christmas)
print(plot_valentine)
print(plot_summer_bbq)
To analyze the marketing effectiveness of Regork, we base it on the profitability and revenue of marketing strategies related to promotions.
In addition, we also consider the influence of coupons or discounts on consumer purchasing behavior.
# Create two subsets: with promotions and without promotions
sales_with_promotions <- transactions_promotions %>%
filter(retail_disc > 0 | coupon_disc > 0 | coupon_match_disc > 0)
sales_without_promotions <- transactions_promotions %>%
filter(retail_disc == 0 & coupon_disc == 0 & coupon_match_disc == 0)
# Summarize sales with and without promotions
summary_comparison <- data.frame(
category = c("With Promotions", "Without Promotions"),
total_sales_value = c(
sum(sales_with_promotions$sales_value, na.rm = TRUE),
sum(sales_without_promotions$sales_value, na.rm = TRUE)
),
total_quantity = c(
sum(sales_with_promotions$quantity, na.rm = TRUE),
sum(sales_without_promotions$quantity, na.rm = TRUE)
),
avg_sales_value = c(
mean(sales_with_promotions$sales_value, na.rm = TRUE),
mean(sales_without_promotions$sales_value, na.rm = TRUE)
),
avg_quantity = c(
mean(sales_with_promotions$quantity, na.rm = TRUE),
mean(sales_without_promotions$quantity, na.rm = TRUE)
)
)
# Print the summary comparison
summary_comparison %>%
kable("html", caption = "Comparison of Sales and Quantity with and without Promotions") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"),
full_width = FALSE, position = "center") %>%
column_spec(1, bold = TRUE) %>%
column_spec(2:5, width = "10em") %>%
row_spec(0, bold = TRUE, font_size = 14)
| category | total_sales_value | total_quantity | avg_sales_value | avg_quantity |
|---|---|---|---|---|
| With Promotions | 739529.5 | 397202 | 2.670977 | 1.434584 |
| Without Promotions | 166196.5 | 67380 | 3.267724 | 1.324813 |
# Reshape data for line plot
summary_comparison_long <- na.omit(summary_comparison) %>%
pivot_longer(cols = c(total_sales_value, total_quantity),
names_to = "metric", values_to = "value")
# Plot: Stacked bar plot to compare all four metrics in one plot
ggplot(summary_comparison_long, aes(x = category, y = value, fill = metric)) +
geom_bar(stat = "identity") +
labs(title = "Comparison of Sales and Quantity with and without Promotions",
x = "Promotion Type", y = "Value") +
scale_fill_manual(values = c( "lightgreen","lightsalmon")) +
scale_y_continuous(labels = scales::comma) +
theme_minimal()
Finally, loyalty and retention analysis is also a campaign that Regork should pay attention to. Can Regork enhance customer retention, through our analysis of the shopping habits of loyal customers and occasional shoppers?
In addition, we also analyze the repeatability of certain products and their categories specifically among those repeat customers.
First, we prioritize the analysis by taking the number of transactions per customer and then aggregating that number, inferring the loyalty of customers and then classifying them into occasional groups based on this.
# Calculate the number of transactions per household_id (representing the customer)
customer_transactions <- transactions_products %>%
group_by(household_id) %>%
summarise(transaction_count = n_distinct(basket_id)) # basket_id represents distinct purchases
# Classify customers as loyal or occasional based on the number of transactions
customer_transactions <- customer_transactions %>%
mutate(customer_type = ifelse(transaction_count > 5, "Loyal", "Occasional"))
# Merge this classification back with the transactions_products dataset
transactions_products <- transactions_products %>%
left_join(customer_transactions, by = "household_id")
# View the updated transactions_products dataset to confirm
transactions_products %>%
head() %>%
kable("html", caption = "Customer Transactions and Loyalty Classification") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"),
full_width = FALSE, position = "center") %>%
column_spec(1, bold = TRUE) %>%
column_spec(2) %>%
column_spec(3, width = "20em") %>%
column_spec(4, bold = TRUE) %>%
column_spec(5, bold = TRUE) %>%
row_spec(0, bold = TRUE, font_size = 14)
| household_id | store_id | basket_id | product_id | quantity | sales_value | retail_disc | coupon_disc | coupon_match_disc | week | transaction_timestamp | manufacturer_id | department | brand | product_category | product_type | package_size | transaction_count | customer_type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 900 | 330 | 31198570044 | 1095275 | 1 | 0.50 | 0.00 | 0 | 0 | 1 | 2017-01-01 06:53:26 | 2 | PASTRY | National | ROLLS | ROLLS: BAGELS | 4 OZ | 674 | Loyal |
| 900 | 330 | 31198570047 | 9878513 | 1 | 0.99 | 0.10 | 0 | 0 | 1 | 2017-01-01 07:10:28 | 69 | GROCERY | Private | FACIAL TISS/DNR NAPKIN | FACIAL TISSUE & PAPER HANDKE | 85 CT | 674 | Loyal |
| 1228 | 406 | 31198655051 | 1041453 | 1 | 1.43 | 0.15 | 0 | 0 | 1 | 2017-01-01 07:26:30 | 69 | GROCERY | Private | BAG SNACKS | POTATO CHIPS | 11.5 OZ | 403 | Loyal |
| 906 | 319 | 31198705046 | 1020156 | 1 | 1.50 | 0.29 | 0 | 0 | 1 | 2017-01-01 07:30:27 | 2142 | GROCERY | National | REFRGRATD DOUGH PRODUCTS | REFRIGERATED BAGELS | 17.1 OZ | 211 | Loyal |
| 906 | 319 | 31198705046 | 1053875 | 2 | 2.78 | 0.80 | 0 | 0 | 1 | 2017-01-01 07:30:27 | 2326 | GROCERY | National | SEAFOOD - SHELF STABLE | TUNA | 5.0 OZ | 211 | Loyal |
| 906 | 319 | 31198705046 | 1060312 | 1 | 5.49 | 0.50 | 0 | 0 | 1 | 2017-01-01 07:30:27 | 608 | GROCERY | National | FRZN MEAT/MEAT DINNERS | FRZN BREADED PREPARED CHICK | 30 OZ | 211 | Loyal |
We implemented and compared purchasing behavior (purchase quantity, average transaction value) between loyal and occasional customers.
# Summarize purchasing patterns for loyal vs. occasional customers
purchase_summary <- transactions_products %>%
group_by(customer_type) %>%
summarise(
avg_transaction_value = mean(sales_value, na.rm = TRUE),
total_quantity = sum(quantity, na.rm = TRUE),
total_sales_value = sum(sales_value, na.rm = TRUE),
avg_quantity_per_purchase = mean(quantity, na.rm = TRUE)
)
# Print the summary
purchase_summary %>%
kable("html", caption = "Purchasing Patterns of Loyal vs. Occasional Customers") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"),
full_width = FALSE, position = "center") %>%
column_spec(1, bold = TRUE) %>%
column_spec(2) %>%
column_spec(3) %>%
column_spec(4) %>%
column_spec(5) %>%
row_spec(0, bold = TRUE, font_size = 14)
| customer_type | avg_transaction_value | total_quantity | total_sales_value | avg_quantity_per_purchase |
|---|---|---|---|---|
| Loyal | 3.128353 | 152835591 | 4584329.75 | 104.29523 |
| Occasional | 3.007147 | 87517 | 11709.83 | 22.47483 |
# Filter data for loyal customers only
loyal_customers_data <- transactions_products %>%
filter(customer_type == "Loyal")
# Count the number of purchases per product category
popular_categories <- loyal_customers_data %>%
group_by(product_category, na.rm = TRUE) %>%
summarise(
total_quantity = sum(quantity, na.rm = TRUE),
total_sales_value = sum(sales_value, na.rm = TRUE)
) %>%
arrange(desc(total_quantity))
# Display the most popular product categories among loyal customers
popular_categories %>%
kable("html", caption = "Most Popular Product Categories Among Loyal Customers") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"),
full_width = FALSE, position = "center") %>%
column_spec(1, bold = TRUE) %>%
column_spec(2, width = "10em") %>%
column_spec(3, width = "12em") %>%
row_spec(0, bold = TRUE, font_size = 14)
| product_category | na.rm | total_quantity | total_sales_value |
|---|---|---|---|
| COUPON/MISC ITEMS | TRUE | 150839909 | 385723.92 |
| FUEL | TRUE | 101364 | 19926.95 |
| SOFT DRINKS | TRUE | 89232 | 181442.94 |
| FLUID MILK PRODUCTS | TRUE | 66290 | 116067.42 |
| BAKED BREAD/BUNS/ROLLS | TRUE | 56541 | 82189.83 |
| CHEESE | TRUE | 54873 | 106729.26 |
| SOUP | TRUE | 48441 | 48854.73 |
| FRZN MEAT/MEAT DINNERS | TRUE | 45731 | 92799.55 |
| BAG SNACKS | TRUE | 45631 | 84682.53 |
| VEGETABLES - SHELF STABLE | TRUE | 43166 | 31930.61 |
| YOGURT | TRUE | 42820 | 34365.00 |
| BEEF | TRUE | 37267 | 176314.12 |
| FROZEN PIZZA | TRUE | 37265 | 80876.54 |
| CANDY - CHECKLANE | TRUE | 32378 | 16053.13 |
| LUNCHMEAT | TRUE | 25730 | 51737.72 |
| CANDY - PACKAGED | TRUE | 25281 | 43554.09 |
| MEAT - SHELF STABLE | TRUE | 24005 | 30049.66 |
| COLD CEREAL | TRUE | 23346 | 62850.74 |
| TROPICAL FRUIT | TRUE | 22725 | 23672.08 |
| REFRGRATD JUICES/DRNKS | TRUE | 20675 | 40629.86 |
| CANNED JUICES | TRUE | 19667 | 41848.78 |
| EGGS | TRUE | 19533 | 22287.33 |
| CRACKERS/MISC BKD FD | TRUE | 18658 | 36216.72 |
| FRZN VEGETABLE/VEG DSH | TRUE | 18653 | 25659.68 |
| HISPANIC | TRUE | 18171 | 27156.35 |
| BAKED SWEET GOODS | TRUE | 18074 | 31617.70 |
| FRUIT - SHELF STABLE | TRUE | 17858 | 23488.63 |
| ICE CREAM/MILK/SHERBTS | TRUE | 17720 | 46723.77 |
| CAT FOOD | TRUE | 17598 | 19314.31 |
| BABY FOODS | TRUE | 17033 | 13535.03 |
| DINNER MXS:DRY | TRUE | 16978 | 26300.19 |
| WATER - CARBONATED/FLVRD DRINK | TRUE | 16592 | 35187.91 |
| PWDR/CRYSTL DRNK MX | TRUE | 16554 | 9346.44 |
| DRY BN/VEG/POTATO/RICE | TRUE | 16095 | 21385.11 |
| CITRUS | TRUE | 15968 | 14316.11 |
| CONDIMENTS/SAUCES | TRUE | 15144 | 25420.89 |
| MILK BY-PRODUCTS | TRUE | 15049 | 23649.00 |
| CHICKEN | TRUE | 14812 | 52577.23 |
| CONVENIENT BRKFST/WHLSM SNACKS | TRUE | 14682 | 31631.17 |
| DELI MEATS | TRUE | 14657 | 60051.20 |
| VEGETABLES - ALL OTHERS | TRUE | 14388 | 18846.59 |
| REFRGRATD DOUGH PRODUCTS | TRUE | 13966 | 23781.90 |
| COOKIES/CONES | TRUE | 13911 | 29852.09 |
| BAKING MIXES | TRUE | 13512 | 17469.37 |
| SALD DRSNG/SNDWCH SPRD | TRUE | 13058 | 26671.79 |
| SEAFOOD - SHELF STABLE | TRUE | 12326 | 13874.37 |
| DRY NOODLES/PASTA | TRUE | 12244 | 13121.82 |
| BEANS - CANNED GLASS & MW | TRUE | 12192 | 10020.95 |
| SALAD MIX | TRUE | 12032 | 26021.87 |
| DOG FOODS | TRUE | 11749 | 31075.06 |
| BEERS/ALES | TRUE | 10942 | 81663.06 |
| POTATOES | TRUE | 10906 | 26418.87 |
| PASTA SAUCE | TRUE | 10708 | 19263.42 |
| HOT DOGS | TRUE | 10652 | 19315.15 |
| ONIONS | TRUE | 10625 | 12795.56 |
| APPLES | TRUE | 10255 | 22221.03 |
| PORK | TRUE | 10106 | 50682.82 |
| BATH TISSUES | TRUE | 9791 | 34811.02 |
| TOMATOES | TRUE | 9688 | 25168.33 |
| BAKING NEEDS | TRUE | 9671 | 16778.23 |
| MARGARINES | TRUE | 9646 | 13102.66 |
| ISOTONIC DRINKS | TRUE | 9566 | 14708.88 |
| BREAKFAST SAUSAGE/SANDWICHES | TRUE | 9481 | 25287.39 |
| FRZN NOVELTIES/WTR ICE | TRUE | 9463 | 23291.11 |
| CIGARETTES | TRUE | 9324 | 54305.99 |
| BERRIES | TRUE | 9106 | 20864.33 |
| DINNER SAUSAGE | TRUE | 9069 | 24014.36 |
| PAPER TOWELS | TRUE | 8959 | 17477.23 |
| BACON | TRUE | 8903 | 25156.60 |
| FD WRAPS/BAGS/TRSH BG | TRUE | 8792 | 24968.23 |
| COFFEE | TRUE | 8645 | 36764.73 |
| CHEESES | TRUE | 8258 | 29937.88 |
| BREAKFAST SWEETS | TRUE | 8214 | 11015.86 |
| DRY MIX DESSERTS | TRUE | 8195 | 6940.14 |
| CORN | TRUE | 8091 | 2522.82 |
| GREETING CARDS/WRAP/PARTY SPLY | TRUE | 7957 | 19361.22 |
| PNT BTR/JELLY/JAMS | TRUE | 7848 | 17810.40 |
| SPICES & EXTRACTS | TRUE | 7781 | 15753.08 |
| PEPPERS-ALL | TRUE | 7660 | 9280.99 |
| DRY SAUCES/GRAVY | TRUE | 7488 | 7009.80 |
| FRZN BREAKFAST FOODS | TRUE | 7447 | 14610.61 |
| HAIR CARE PRODUCTS | TRUE | 7373 | 25745.35 |
| PAPER HOUSEWARES | TRUE | 7097 | 14563.75 |
| FROZEN MEAT | TRUE | 7083 | 29231.83 |
| ORAL HYGIENE PRODUCTS | TRUE | 7062 | 21068.84 |
| VEGETABLES SALAD | TRUE | 6958 | 7827.85 |
| SUGARS/SWEETNERS | TRUE | 6881 | 15272.12 |
| MEAT - MISC | TRUE | 6514 | 37560.37 |
| HOUSEHOLD CLEANG NEEDS | TRUE | 6484 | 21683.43 |
| SEAFOOD - FROZEN | TRUE | 6467 | 34510.99 |
| FRZN POTATOES | TRUE | 6260 | 12075.08 |
| SALAD BAR | TRUE | 6165 | 17748.60 |
| GRAPES | TRUE | 6110 | 19337.15 |
| REFRIGERATED | TRUE | 5922 | 14921.39 |
| BREAD | TRUE | 5874 | 10177.93 |
| FROZEN PIE/DESSERTS | TRUE | 5770 | 12384.97 |
| WAREHOUSE SNACKS | TRUE | 5760 | 8703.41 |
| MISC. DAIRY | TRUE | 5756 | 12722.48 |
| FACIAL TISS/DNR NAPKIN | TRUE | 5732 | 10461.15 |
| MAGAZINE | TRUE | 5454 | 18185.23 |
| LAUNDRY DETERGENTS | TRUE | 5365 | 32142.57 |
| SHORTENING/OIL | TRUE | 5364 | 15981.28 |
| FROZEN BREAD/DOUGH | TRUE | 5268 | 10898.10 |
| CARROTS | TRUE | 5162 | 9037.26 |
| STATIONERY & SCHOOL SUPPLIES | TRUE | 4862 | 10092.71 |
| SALADS/DIPS | TRUE | 4815 | 14335.31 |
| SOAP - LIQUID & BAR | TRUE | 4656 | 12514.56 |
| ROLLS | TRUE | 4540 | 5955.69 |
| ORGANICS FRUIT & VEGETABLES | TRUE | 4521 | 10933.10 |
| HEAT/SERVE | TRUE | 4354 | 16227.50 |
| PICKLE/RELISH/PKLD VEG | TRUE | 4285 | 8956.96 |
| VALUE ADDED FRUIT | TRUE | 4055 | 10898.35 |
| HOT CEREAL | TRUE | 4053 | 9824.09 |
| DISHWASH DETERGENTS | TRUE | 4005 | 10605.25 |
| AIR CARE | TRUE | 3953 | 10265.91 |
| DOMESTIC WINE | TRUE | 3951 | 35878.76 |
| COLD AND FLU | TRUE | 3799 | 14581.42 |
| CHICKEN/POULTRY | TRUE | 3769 | 18035.31 |
| FITNESS&DIET | TRUE | 3733 | 5931.26 |
| MAKEUP AND TREATMENT | TRUE | 3555 | 13667.33 |
| STONE FRUIT | TRUE | 3546 | 9319.26 |
| SNACK NUTS | TRUE | 3535 | 9639.56 |
| BUTTER | TRUE | 3507 | 9309.44 |
| PROCESSED | TRUE | 3451 | 9128.84 |
| CAKES | TRUE | 3312 | 22376.84 |
| PREPARED FOOD | TRUE | 3267 | 12423.86 |
| MOLASSES/SYRUP/PANCAKE MIXS | TRUE | 3240 | 7444.11 |
| DEODORANTS | TRUE | 3234 | 9084.74 |
| MUSHROOMS | TRUE | 3231 | 6802.86 |
| ANALGESICS | TRUE | 3221 | 14876.61 |
| BROCCOLI/CAULIFLOWER | TRUE | 3192 | 5732.18 |
| INFANT FORMULA | TRUE | 3147 | 26825.79 |
| NEWSPAPER | TRUE | 3142 | 3483.51 |
| NA | TRUE | 3116 | 7599.67 |
| FEMININE HYGIENE | TRUE | 3101 | 11421.80 |
| COOKIES | TRUE | 3050 | 8521.25 |
| MELONS | TRUE | 3038 | 8401.09 |
| POPCORN | TRUE | 2999 | 6909.89 |
| DIAPERS & DISPOSABLES | TRUE | 2990 | 33821.37 |
| LAUNDRY ADDITIVES | TRUE | 2912 | 11523.83 |
| BABY HBC | TRUE | 2877 | 8876.11 |
| FRZN JCE CONC/DRNKS | TRUE | 2865 | 3335.22 |
| SEAFOOD-FRESH | TRUE | 2661 | 13593.48 |
| SMOKED MEATS | TRUE | 2630 | 15274.27 |
| SHAVING CARE PRODUCTS | TRUE | 2495 | 13482.30 |
| COCOA MIXES | TRUE | 2478 | 5056.32 |
| CHRISTMAS SEASONAL | TRUE | 2433 | 5698.90 |
| BLEACH | TRUE | 2402 | 5758.13 |
| TEAS | TRUE | 2379 | 6350.87 |
| ELECTRICAL SUPPPLIES | TRUE | 2293 | 5775.99 |
| DRIED FRUIT | TRUE | 2228 | 5494.60 |
| BROOMS AND MOPS | TRUE | 2219 | 5652.12 |
| WATER | TRUE | 2170 | 2275.44 |
| CAT LITTER | TRUE | 2124 | 9917.37 |
| TURKEY | TRUE | 2098 | 11060.53 |
| HAND/BODY/FACIAL PRODUCTS | TRUE | 2078 | 11742.28 |
| PET CARE SUPPLIES | TRUE | 1986 | 7112.19 |
| HERBS | TRUE | 1974 | 2200.09 |
| FIRST AID PRODUCTS | TRUE | 1970 | 6873.07 |
| OLIVES | TRUE | 1932 | 3428.85 |
| FROZEN | TRUE | 1910 | 6149.18 |
| DIETARY AID PRODUCTS | TRUE | 1898 | 9744.51 |
| FLOUR & MEALS | TRUE | 1895 | 3110.68 |
| CANNED MILK | TRUE | 1841 | 2164.11 |
| PLASTIC HOUSEWARES | TRUE | 1826 | 4570.57 |
| VITAMINS | TRUE | 1788 | 11690.92 |
| PIES | TRUE | 1772 | 7004.26 |
| TOYS AND GAMES | TRUE | 1772 | 5953.16 |
| FLORAL-FRESH CUT | TRUE | 1763 | 11121.83 |
| LIQUOR | TRUE | 1678 | 13597.76 |
| VALUE ADDED VEGETABLES | TRUE | 1660 | 4175.38 |
| BATTERIES | TRUE | 1653 | 7467.49 |
| FRZN ICE | TRUE | 1605 | 3630.88 |
| MISCELLANEOUS | TRUE | 1447 | 3196.46 |
| SPRING/SUMMER SEASONAL | TRUE | 1424 | 9737.21 |
| KITCHEN GADGETS | TRUE | 1417 | 4127.47 |
| CHARCOAL AND LIGHTER FLUID | TRUE | 1408 | 6336.26 |
| SQUASH | TRUE | 1374 | 1846.26 |
| ANTACIDS | TRUE | 1293 | 6731.27 |
| NUTS | TRUE | 1290 | 5108.60 |
| IMPORTED WINE | TRUE | 1281 | 11336.72 |
| CEREAL/BREAKFAST | TRUE | 1195 | 4015.54 |
| CHIPS&SNACKS | TRUE | 1180 | 3219.01 |
| PEARS | TRUE | 1162 | 2564.97 |
| SNKS/CKYS/CRKR/CNDY | TRUE | 1120 | 2164.60 |
| LAXATIVES | TRUE | 1097 | 5211.51 |
| RICE CAKES | TRUE | 1068 | 2083.77 |
| CANDLES/ACCESSORIES | TRUE | 1065 | 2972.08 |
| SWEET GOODS & SNACKS | TRUE | 1046 | 2912.63 |
| EASTER | TRUE | 1038 | 1576.94 |
| FRZN FRUITS | TRUE | 1028 | 3403.85 |
| BOOKSTORE | TRUE | 1027 | 8071.00 |
| SINUS AND ALLERGY | TRUE | 1027 | 6834.01 |
| COFFEE FILTERS | TRUE | 993 | 2719.80 |
| SNACKS | TRUE | 947 | 2388.19 |
| HAIR CARE ACCESSORIES | TRUE | 923 | 2543.32 |
| JUICE | TRUE | 922 | 2192.03 |
| HARDWARE SUPPLIES | TRUE | 920 | 2836.26 |
| DOMESTIC GOODS | TRUE | 853 | 2076.79 |
| INFANT CARE PRODUCTS | TRUE | 844 | 3591.33 |
| SANDWICHES | TRUE | 834 | 2956.66 |
| GARDEN CENTER | TRUE | 824 | 5055.42 |
| MISC WINE | TRUE | 814 | 4873.54 |
| BATH | TRUE | 807 | 2552.69 |
| SYRUPS/TOPPINGS | TRUE | 769 | 1487.46 |
| SEAFOOD - MISC | TRUE | 755 | 1341.83 |
| PREPARED/PKGD FOODS | TRUE | 745 | 1691.21 |
| IN-STORE PHOTOFINISHING | TRUE | 729 | 4761.44 |
| EYE AND EAR CARE PRODUCTS | TRUE | 715 | 4139.34 |
| COOKWARE & BAKEWARE | TRUE | 692 | 3743.97 |
| DISPOSIBLE FOILWARE | TRUE | 692 | 1943.48 |
| FROZEN - BOXED(GROCERY) | TRUE | 663 | 3225.70 |
| TOBACCO OTHER | TRUE | 649 | 2047.51 |
| FILM AND CAMERA PRODUCTS | TRUE | 635 | 3857.33 |
| PREPAID WIRELESS&ACCESSORIES | TRUE | 625 | 11459.27 |
| HOSIERY/SOCKS | TRUE | 621 | 2124.78 |
| NON-DAIRY BEVERAGES | TRUE | 603 | 1281.31 |
| AUDIO/VIDEO PRODUCTS | TRUE | 580 | 7457.78 |
| AUTOMOTIVE PRODUCTS | TRUE | 576 | 1755.20 |
| J-HOOKS | TRUE | 555 | 1725.56 |
| COUPONS/STORE & MFG | TRUE | 528 | 649.52 |
| INSECTICIDES | TRUE | 527 | 1990.25 |
| HALLOWEEN | TRUE | 519 | 1962.50 |
| SUSHI | TRUE | 503 | 2859.04 |
| TICKETS | TRUE | 478 | 4391.23 |
| BABYFOOD | TRUE | 451 | 363.95 |
| NEW AGE | TRUE | 424 | 1109.92 |
| ADULT INCONTINENCE | TRUE | 396 | 3202.63 |
| LAWN AND GARDEN SHOP | TRUE | 350 | 2092.42 |
| FLORAL-FLOWERING PLANTS | TRUE | 346 | 2774.50 |
| FOOT CARE PRODUCTS | TRUE | 346 | 2013.92 |
| DELI SUPPLIES | TRUE | 345 | 999.99 |
| FROZEN CHICKEN | TRUE | 337 | 2309.63 |
| VALENTINE | TRUE | 334 | 985.21 |
| ROSES | TRUE | 329 | 4060.88 |
| FAMILY PLANNING | TRUE | 322 | 2624.94 |
| APPAREL | TRUE | 313 | 2086.84 |
| BAKING | TRUE | 310 | 980.40 |
| FLORAL BALLOONS | TRUE | 285 | 1213.33 |
| IRONING AND CHEMICALS | TRUE | 285 | 1702.03 |
| PARTY TRAYS | TRUE | 283 | 3805.75 |
| PACKAGED NATURAL SNACKS | TRUE | 281 | 843.10 |
| SUNTAN | TRUE | 280 | 1869.26 |
| CONDIMENTS | TRUE | 266 | 967.46 |
| BEVERAGE | TRUE | 255 | 596.67 |
| BIRD SEED | TRUE | 230 | 726.73 |
| DELI SPECIALTIES (RETAIL PK) | TRUE | 208 | 528.71 |
| ETHNIC PERSONAL CARE | TRUE | 208 | 644.49 |
| SEASONAL | TRUE | 203 | 834.47 |
| DOLLAR VALUE PRODUCTS | TRUE | 201 | 208.52 |
| COFFEE SHOP | TRUE | 200 | 571.16 |
| LAMB | TRUE | 187 | 829.27 |
| FLORAL- HARD GOODS | TRUE | 185 | 1043.10 |
| SERVICE BEVERAGE | TRUE | 185 | 201.32 |
| HOME FREEZING & CANNING SUPPLY | TRUE | 183 | 568.99 |
| FIREWORKS | TRUE | 166 | 877.12 |
| SPORTS MEMORABLILIA | TRUE | 166 | 1521.69 |
| FLORAL-FOLIAGE PLANTS | TRUE | 164 | 1493.31 |
| SHOE CARE | TRUE | 164 | 354.83 |
| EXOTIC GAME/FOWL | TRUE | 163 | 860.42 |
| COSMETIC ACCESSORIES | TRUE | 161 | 956.12 |
| GLASSWARE & DINNERWARE | TRUE | 158 | 581.16 |
| FALL AND WINTER SEASONAL | TRUE | 138 | 749.47 |
| DRY TEA/COFFEE/COCO MIX | TRUE | 116 | 408.98 |
| OVERNIGHT PHOTOFINISHING | TRUE | 115 | 1029.48 |
| SEWING | TRUE | 109 | 182.85 |
| PERSONAL CARE APPLIANCES | TRUE | 101 | 1341.33 |
| BAKERY PARTY TRAYS | TRUE | 97 | 613.87 |
| CIGARS | TRUE | 97 | 354.82 |
| BOTTLE DEPOSITS | TRUE | 95 | 68.90 |
| QUICK SERVICE | TRUE | 90 | 388.42 |
| GLASSES/VISION AIDS | TRUE | 81 | 665.68 |
| CONTINUITIES | TRUE | 76 | 269.36 |
| FRAGRANCES | TRUE | 76 | 1548.00 |
| PORTABLE ELECTRIC APPLIANCES | TRUE | 72 | 1027.11 |
| NON EDIBLE PRODUCTS | TRUE | 60 | 309.43 |
| FLORAL-ACCESSORIES | TRUE | 59 | 320.99 |
| COFFEE SHOP SWEET GOODS&RETAIL | TRUE | 57 | 301.39 |
| COUPON | TRUE | 47 | 309.46 |
| VEAL | TRUE | 47 | 167.13 |
| HOME FURNISHINGS | TRUE | 41 | 692.18 |
| LONG DISTANCE CALLING CARDS | TRUE | 40 | 451.59 |
| MEAT SUPPLIES | TRUE | 40 | 194.37 |
| SMOKING CESSATIONS | TRUE | 39 | 1539.57 |
| FRZN SEAFOOD | TRUE | 34 | 149.76 |
| PROPANE | TRUE | 33 | 673.68 |
| RESTRICTED DIET | TRUE | 33 | 64.59 |
| NATURAL HBC | TRUE | 22 | 84.08 |
| NATURAL VITAMINS | TRUE | 21 | 194.46 |
| RW FRESH PROCESSED MEAT | TRUE | 17 | 30.84 |
| WATCHES/CALCULATORS/LOBBY | TRUE | 16 | 66.10 |
| GIFT & FRUIT BASKETS | TRUE | 13 | 159.39 |
| PROD SUPPLIES | TRUE | 12 | 19.69 |
| MISCELLANEOUS HBC | TRUE | 11 | 30.49 |
| PKG.SEAFOOD MISC | TRUE | 8 | 27.92 |
| BOUQUET (NON ROSE) | TRUE | 7 | 53.93 |
| PHARMACY | TRUE | 6 | 50.94 |
| BULK FOODS | TRUE | 4 | 20.06 |
| HOME HEALTH CARE | TRUE | 4 | 49.52 |
| TOYS | TRUE | 3 | 4.17 |
| EASTER LILY | TRUE | 2 | 31.98 |
| NDAIRY/TEAS/JUICE/SOD | TRUE | 2 | 9.40 |
| MISCELLANEOUS CROUTONS | TRUE | 1 | 6.39 |
We define the Target High-Revenue Demographics: We based on the demographic analysis including age, income, and household size. Then we identify which groups have the most revenue (from demographics)of each product category.
Recommendation:
-> Focus on personalized ads and promotions for customers within the age and income groups that generate the highest revenue for a specific product category. For instance, if air care products are generating higher revenue for the 45-54 age group with 50-74k income, we should continue to target our promotion and advertisement to this segment. -> For under-purchasing groups, consider pairing it with popular items with a slight discount to increase product visibility. For example, “Buy product A ( under-purchasing product), with product B ( popular product) with only $X more”. Also, offer loyalty rewards to convert occasional customers to loyal ones.
We try to connect and define frequent product pairs from the Market Basket Analysis. We also identify the co-purchased product from the transaction data. Then lastly, we propose bundling opportunities for these products.
Recommendation:
-> Mixed Bundling: Use the mixed bundling technique to offer a slight discount for any top-frequent pair of items bought together as a bundle. For instance, “Buy a Bottle of Chocolate Milk, Get 10% off Coke.” -> Tiered Discounts: Offer tiered discounts based on quantity: “Buy 2 get 5% off, Buy 3 get 10% off, Buy 4 get 15% off”. This fosters sales and customer loyalty within that product. Cross-selling: Promoting frequent-paired products as recommendations for customers at checkout or online. If a customer buys a frozen pizza, for instance, we can suggest pairing it with Coke or Beer, as this is a common combination among other shoppers. This personalized approach improves customer experience while increasing sales.
Based on the Seasonal and Trend Analysis, we will identify which products experience the sales peak during special times of the year such as Halloween, Christmas, and Valentine’s. By leveraging this advantageous information to suggest promotions for these times.
Recommendation: -> Leverage holiday sales trends by promoting high-demand products during peak seasons at 2-3 weeks before -> Boost sales with seasonal flash sales and time-sensitive offers to drive urgency
We demonstrate the kind of Discount from Coupon and Discount Analysis. To do that, we try to identify various types of promotions ranging from retail to coupon and match, then determine which one is the most effective among the rest.
Recommendation:
Multi-Discount Promotions: . Offering a combination of retail discounts and coupon discounts at a time, such as 20% off retail price + $5 off coupon Loyalty Program Enhancements: Focus on the loyalty program with milestone-based incentives for loyal customers. For instance, “10% off on the 10th purchase at Regork”
We classify shoppers as loyal and Occasional, and then we can target the loyal segments with different promotional strategies to maximize their engagement.
Recommendation:
-> Loyal customers will have better promotions compared to normal customers, such as 10% extra on top of the 15% discount that normal customers get. -> Offering exclusive weekly deals for loyal customers to increase purchase frequency and higher customer retention. Customers will be returning every week if they know there will be good deals and stop looking elsewhere for similar products.