2023-10-05In today’s ever evolving business landscape, companies and organizations like Regork Inc. must continue to seek new opportunities for growth and expansion. For Regork Inc this is a strategic goal to thrive in a competitive marketplace. As a data engineer entrusted with this task, our prime objective is clear: we first need to identify untapped growth possibilities that can push Regork to newer and better heights.
“How can Regork strategically manage its inventory and position its products in stores to boost both revenue and profitability”
While there are endless possibilities to focus on we chose to focus on the strategic growth initiatives that can deliver tangible results. We will explore various data to answer our business question above, such as which products drive the most revenue, which customer segments should we utilize for targeting, and how marketing campaigns impact the overall sales.
Our analysis involved data processing, data joins, and advanced visualization techniques to derive meaningful insights. The R code provided in the report forms the basis of our analysis.
For our analysis, we began by identifying total sales per product_id in the transactions datasets, then we filtered out rows in the promotions data set for each unique product_id, but same product_id can be placed in different stores in the different display locations. Therefore, we grouped the merged data set by display location, this gave us the data set for each product_id and display locations
How we get our solution?
As we code and dive more into the Data we can translate our insights into actionable recommendations that Regrok Inc. can leverage to boost revenue and enhance customer satisfaction, as well as operational efficiency and market share.
Our examination of product sales across different age groups unveils valuable insights for targeting specific demographics effectively.
By pinpointing product categories with lower sales and untapped growth potential, we can develop strategies to revitalize these areas.
The creation of a visual representation showcasing the top 10 product categories with the highest sales in transactions involving coupon discounts enables us to identify critical areas for growth. This chart guides marketing efforts towards the most successful product categories, optimizing resource allocation.
Our analysis identifies the most popular product categories in the highest-performing display locations. This information helps us identify opportunities for product improvement and diversification within these categories. To arrive at these insights, we calculated total sales by product and store, merged this data with promotions, removed outliers in total sales at display locations, and filtered for the top three locations with the highest sales. This data informs us about popular products, which we can visualize based on total sales by product category.
We offer data-driven recommendations aimed at enhancing revenue and profitability, ensuring that our insights translate into tangible business improvements.
CompleteJourney found here: https://bradleyboehmke.github.io/completejourney/index.html
The R data package CompleteJourney provides us information about the company Regork Inc. that we can use to perform analysis and give our recommendation.
The chart below illustrates product sales by age group. Each bar represents a product category, and the colors indicate different age groups. The height of the bars represents the total sales in dollars.
## # A tibble: 1,469,307 × 11
## household_id store_id basket_id product_id quantity sales_value retail_disc
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl>
## 1 900 330 31198570044 1095275 1 0.5 0
## 2 900 330 31198570047 9878513 1 0.99 0.1
## 3 1228 406 31198655051 1041453 1 1.43 0.15
## 4 906 319 31198705046 1020156 1 1.5 0.29
## 5 906 319 31198705046 1053875 2 2.78 0.8
## 6 906 319 31198705046 1060312 1 5.49 0.5
## 7 906 319 31198705046 1075313 1 1.5 0.29
## 8 1058 381 31198676055 985893 1 1.88 0.21
## 9 1058 381 31198676055 988791 1 1.5 1.29
## 10 1058 381 31198676055 9297106 1 2.69 0
## # ℹ 1,469,297 more rows
## # ℹ 4 more variables: coupon_disc <dbl>, coupon_match_disc <dbl>, week <int>,
## # transaction_timestamp <dttm>
## # A tibble: 20,940,529 × 5
## product_id store_id display_location mailer_location week
## <chr> <chr> <fct> <fct> <int>
## 1 1000050 316 9 0 1
## 2 1000050 337 3 0 1
## 3 1000050 441 5 0 1
## 4 1000092 292 0 A 1
## 5 1000092 293 0 A 1
## 6 1000092 295 0 A 1
## 7 1000092 298 0 A 1
## 8 1000092 299 0 A 1
## 9 1000092 304 0 A 1
## 10 1000092 306 0 A 1
## # ℹ 20,940,519 more rows
## # A tibble: 92,331 × 7
## product_id manufacturer_id department brand product_category product_type
## <chr> <chr> <chr> <fct> <chr> <chr>
## 1 25671 2 GROCERY Natio… FRZN ICE ICE - CRUSH…
## 2 26081 2 MISCELLANEOUS Natio… <NA> <NA>
## 3 26093 69 PASTRY Priva… BREAD BREAD:ITALI…
## 4 26190 69 GROCERY Priva… FRUIT - SHELF S… APPLE SAUCE
## 5 26355 69 GROCERY Priva… COOKIES/CONES SPECIALTY C…
## 6 26426 69 GROCERY Priva… SPICES & EXTRAC… SPICES & SE…
## 7 26540 69 GROCERY Priva… COOKIES/CONES TRAY PACK/C…
## 8 26601 69 DRUG GM Priva… VITAMINS VITAMIN - M…
## 9 26636 69 PASTRY Priva… BREAKFAST SWEETS SW GDS: SW …
## 10 26691 16 GROCERY Priva… PNT BTR/JELLY/J… HONEY
## # ℹ 92,321 more rows
## # ℹ 1 more variable: package_size <chr>
## # A tibble: 801 × 8
## household_id age income home_ownership marital_status household_size
## <chr> <ord> <ord> <ord> <ord> <ord>
## 1 1 65+ 35-49K Homeowner Married 2
## 2 1001 45-54 50-74K Homeowner Unmarried 1
## 3 1003 35-44 25-34K <NA> Unmarried 1
## 4 1004 25-34 15-24K <NA> Unmarried 1
## 5 101 45-54 Under 15K Homeowner Married 4
## 6 1012 35-44 35-49K <NA> Married 5+
## 7 1014 45-54 15-24K <NA> Married 4
## 8 1015 45-54 50-74K Homeowner Unmarried 1
## 9 1018 45-54 35-49K Homeowner Married 5+
## 10 1020 45-54 25-34K Homeowner Married 2
## # ℹ 791 more rows
## # ℹ 2 more variables: household_comp <ord>, kids_count <ord>
# Products Sales by age group
product_sales_age_group <- transactions %>%
inner_join(prods, by = 'product_id') %>%
inner_join(demo, by = 'household_id') %>%
group_by(product_category, age) %>%
summarize(sales = sum(sales_value)) %>%
arrange(desc(sales)) %>%
head(75)
product_sales_age_group
## # A tibble: 75 × 3
## # Groups: product_category [37]
## product_category age sales
## <chr> <ord> <dbl>
## 1 COUPON/MISC ITEMS 45-54 89171.
## 2 COUPON/MISC ITEMS 35-44 72982.
## 3 COUPON/MISC ITEMS 25-34 43606.
## 4 SOFT DRINKS 45-54 38801.
## 5 BEEF 45-54 36485.
## 6 SOFT DRINKS 35-44 27337.
## 7 BEEF 35-44 27124.
## 8 FLUID MILK PRODUCTS 45-54 25563.
## 9 CHEESE 45-54 21791.
## 10 FRZN MEAT/MEAT DINNERS 45-54 18892.
## # ℹ 65 more rows
product_sales_age_group %>%
ggplot(aes(x = reorder(product_category, sales, FUN = sum),
y = sales,
fill = age
)
) +
geom_col() +
coord_flip() +
labs(
title = 'Top Most Product Categories Sold Items by age Groups',
y = 'Total Net Sales $$$',
x = 'Product Categories',
subtitle =
'The data here shows the age groups for the most popular product
categories sold.'
) +
guides(fill = guide_legend(title = "Age Group")) +
theme(plot.title = element_text(hjust = 0.5,face = "bold", size = 14),
plot.subtitle = element_text(size = 9),
axis.text = element_text(size = 8)
)
The bubble chart below visualizes the total sales for the 10 most sold product categories when campaign is used. The size of each bubble represents the total sales, and the color indicates the product categories.
# Create a bubble chart
transactions %>%
inner_join(prods, by = "product_id") %>%
filter(coupon_disc != 0) %>% # Filter transactions with coupon discounts
group_by(product_category) %>%
summarize(total_sales = sum(sales_value)) %>% #calculate total sales by product category
arrange(desc(total_sales)) %>%
top_n(10) %>% #Select the top 10 product categories
ggplot(aes(x = product_category, y = total_sales, size = total_sales, color = product_category)) +
geom_point() +
theme_minimal() + # Use a minimal theme for cleaner look
theme(
axis.text.x = element_text(angle = 45, hjust = 1),
plot.title = element_text(hjust = 0.5, size = 16, face = "bold"),
legend.position = "none"
) +
labs(
title = "Top 10 Product Categories Using Coupons",
subtitle = "Total Sales Grouped by Product Category for Coupon Transactions",
x = "Product Category",
y = "Total Sales"
) +
scale_size_continuous(range = c(5, 20)) + # Adjust bubble size
scale_y_continuous(limits = c(0, 8000)) # Set y-axis limits
The bubble chart below visualizes the average total sales on each display location. The size of each bubble represents the average total sales, and the color indicates the display location.
# Calculate total sales by product and store
total_sales_by_product <- transactions %>%
group_by(product_id, store_id) %>%
summarise(total_sales = sum(sales_value)) %>%
na.omit()
# Merge transactions and promotions datasets
promotion_sales <- total_sales_by_product %>%
left_join(promotions, by = c("product_id", "store_id")) %>%
group_by(display_location) %>%
summarise(total_display_sales = sum(total_sales, na.rm = TRUE)) %>%
na.omit()
# Remove Outliers for display location
# Calculate the IQR for total_display_sales
Q1 <- quantile(promotion_sales$total_display_sales, 0.25)
Q3 <- quantile(promotion_sales$total_display_sales, 0.75)
IQR <- Q3 - Q1
# Define a threshold to identify outliers
threshold <- 1.5 * IQR
# Filter promotion_sales to remove outliers
promotion_sales <- promotion_sales %>%
filter(!(total_display_sales > (Q3 + threshold) | total_display_sales < (Q1 - threshold)))
# Create a bubble chart to visualize average total sales by display location
ggplot(promotion_sales, aes(x = display_location, y = total_display_sales, size = total_display_sales, fill = display_location)) +
geom_point(shape = 21, color = "black", alpha = 0.7) +
scale_size_continuous(range = c(5, 15)) + # Adjust size range for better visibility
labs(title = "Total Sales by Display Location",
x = "Display Location",
y = "Total Sales",
size = "Total Sales") + # Updated y-axis label
theme_minimal() +
scale_y_continuous(limits = c(40000, 3000000), labels = comma_format(scale = 1e-3, suffix = "K")) +
theme(plot.title = element_text(hjust = 0.5, size = 14, face = "bold")) + # Center the title
guides(size = "none")
# Filter for the top-performing display locations
top_display_locations <- promotion_sales %>%
top_n(3, wt = total_display_sales) # Select the top 3 display locations
# filter promotion data for the top display location
promotions_display_data <- promotions %>%
filter(display_location %in% top_display_locations$display_location)
# Merge with promotions and transactions data to get details of the top display locations
top_display_data <- transactions %>% select(household_id, product_id, store_id, sales_value, week) %>%
inner_join(promotions_display_data, by = c("product_id", "store_id", "week"))
The bar chart above shows the total sales by product category in the top 3 display locations. Each bar represents a product category, and the fill color indicates the display location.
# most popular products in the top display locations
top_products <- prods %>% inner_join(top_display_data, by = "product_id") %>%
group_by(product_id, product_category, display_location)%>%
summarise(total_sales = sum(sales_value))%>% arrange(desc(total_sales))%>%
na.omit()
top_100_products <- head(top_products, 100)# Select the top 100 products
# Bar chart of the most popular products in the top display locations
ggplot(top_100_products, aes(x = product_category, y = total_sales, fill = display_location)) +
geom_bar(stat = "identity") +
labs(title = "Total Sales by Product Category in Top 3 Display Locations",
x = "Product Category",
y = "Total Sales",
fill = "Display Location") +
theme(axis.text.x = element_text(angle = 45, hjust = 1), plot.title = element_text(hjust = 0.5, size = 14, face = "bold")) +
scale_fill_manual(values = c("blue", "green", "red")) # Customize fill colors as needed
Our analysis uncovered several key findings:
By analyzing product sales by age group, we uncovered distinct preferences among different age groups.Some age groups exhibited a strong affinity for specific product categories. From this visualization we identified opportunities to target specific demographics effectively. This personalized approach can lead to increased sales and customer satisfaction.
Our data reveals distinct sales patterns among different age groups. The “45-54” age group emerges as the most significant contributor to our sales, particularly in categories like “COUPON/MISC ITEMS” and “SOFT DRINKS.”
Among our product categories, “SOFT DRINKS” stands out as a top performer, with substantial total sales. This category is closely followed by “FLUID MILK PRODUCTS” and “BAKED BREAD/BUNS/ROLLS.”
We identified underperforming product categories with significant growth potential. Strategies to enhance these categories can unlock latent profitability.
We identified the top display locations with the highest average total sales. Notably, “2-store rear,” “7-in-aisle,” and “9-secondary location display” emerged as the frontrunners in this regard. Additionally, “0-display” and “3-front end cap” also demonstrated significant sales potential. These information should be considered for optimizing product placement strategies and to make informed inventory planning decision to elevate customer satisfaction.
Within these top-performing display locations, we also identified the most popular product categories. This finding opens the door for recommendations to expand and diversify the product variety within these categories.
Based on our findings, we propose the following recommendations to enhance Regork Inc.’s revenue and profitability:
Utilize customer insights to improve product offerings, refine store layouts, and optimize product placement strategies. Tailoring these aspects to match customer preferences can result in increased revenue and elevated customer satisfaction.
Leverage the insights gained from our data analysis to make informed inventory decisions. This includes selecting the right products to stock and when to introduce them, aligning inventory more closely with customer demand.
Focus marketing efforts on age groups 19-24 and 25-34, especially for soft drinks, as these segments exhibit the highest propensity to purchase such products. Consider introducing introductory items in strategic locations to attract more customers.
Concentrate efforts on display locations like “2-store rear,” “7-in-aisle,” “9-secondary” and “3 - frontend.” These locations have demonstrated strong sales potential and should be prioritized for key product placements.
Emphasize product categories such as produce, soft drinks, meat, baked bread, and cheese. These categories have shown promise for revenue growth and warrant special attention in marketing and inventory planning.
Launch targeted marketing campaigns tailored to age groups that exhibit growth potential. Craft messages and promotions that resonate with these demographics to maximize their engagement.
Utilize customer demographics to personalize marketing messages and promotions further. Tailoring your marketing approach can lead to a more personalized and engaging customer experience, which can drive increased sales and loyalty.
In conclusion, this report provides valuable insights and recommendations to guide Regork’s strategic decisions. By optimizing inventory management, implementing targeted marketing, analyzing profitability, and prioritizing customer satisfaction, Regork can position itself for continued growth, increased revenue, and enhanced profitability.
At Regork Inc., one of our major challenges we wanted to tackle: how can we boost revenue and profitability in a highly competitive market? We needed to identify the key drivers of revenue, optimize product placements, and enhance customer targeting to achieve sustainable growth.
In our quest to tackle this challenge, we embarked on a data-driven journey that encompassed a comprehensive analysis of product sales by age group, enabling us to uncover hidden demographics with immense growth potential. We delved deep into underperforming product categories, devising strategies to unlock their latent profitability. Our recommendations were solidly grounded in data, providing Regork Inc. with a clear path towards informed decision-making and highly personalized marketing efforts.
Moreover, we harnessed advanced visualization techniques, including the creation of a captivating bubble chart and a revealing bar chart showcasing the top-performing products in the top three display locations. These visualizations transformed raw data into actionable insights, equipping Regork Inc. with a robust strategy for revenue enhancement, improved profitability, and an elevated customer experience.