Regork is a popular grocery chain in the U.S. that provides a variety of products to meet the needs of different customers. To help increase sales and better serve customer demands, I’m diving into customer data to uncover opportunities for growth. By identifying these key areas, the company can strategically invest its resources to drive more revenue and boost profits in the future.
There are a number of products in our grocery stores that aren’t selling well, and if this continues, we might have to phase them out. Instead of focusing on these low-performing items, we’re planning to introduce products from other countries, like China and Japan, which have the potential to boost our sales. By importing and promoting these new items, we aim to not only diversify our offerings but also attract more customers and improve overall sales.
This report analyzes the 10 product categories with the lowest sales. We also examine data from two high-potential markets, Japan and China, to identify products that could succeed if imported. To strengthen the case, we include visualizations such as bar charts and a pie chart that show how these items sell in their respective countries and in what volumes. With this data, we aim to give Regork a clearer picture of why importing these products could be a smart move for the American market.
- completejourney: data sets characterizing household-level transactions
- tidyverse: collection of packages for data manipulation, exploration, and visualization
- lubridate: functions for working with dates and times
- tidyr: functions for tidying or cleaning up messy data
- ggplot2: data visualization system based on the “Grammar of Graphics”
- knitr: dynamic report generation in R
- dplyr: data manipulation (filtering, grouping, summarising, and joining)
- stringr: text (string) manipulation
- DT: formatting interactive tables
- scales: formatting values for display, such as adding the dollar sign
#Load Packages
library(tidyr)
library(ggplot2)
library(knitr)
library(dplyr)
library(stringr)
library(lubridate)
library(DT)
library(completejourney)
library(tidyverse)
library(scales)
# Load transaction data; the demographics and products data sets ship with completejourney
transactions <- get_transactions()
# Attach product details (including product_category) to every transaction
joined_data <- inner_join(transactions, products, by = "product_id")
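Because `inner_join()` silently drops rows that have no match, it can be worth checking how much of the transaction data survives each join. The following is a minimal sketch on small made-up tables, not the real completejourney data; the `toy_*` objects and the `coverage` vector are hypothetical names used only for illustration.

```r
library(dplyr)

# Hypothetical toy tables standing in for transactions, products, and demographics
toy_transactions <- tibble(
  household_id = c("1", "2", "3", "4"),
  product_id   = c("p1", "p2", "p2", "p9"),
  sales_value  = c(2.00, 3.50, 1.25, 4.00)
)
toy_products <- tibble(
  product_id       = c("p1", "p2"),
  product_category = c("SNACKS", "TOYS")
)
toy_demographics <- tibble(
  household_id = c("1", "2"),
  income       = c("15-24K", "50-74K")
)

# anti_join() returns the rows an inner_join() would drop
unmatched_products   <- anti_join(toy_transactions, toy_products, by = "product_id")
unmatched_households <- anti_join(toy_transactions, toy_demographics, by = "household_id")

# Share of transactions that each join would keep
coverage <- c(
  products     = 1 - nrow(unmatched_products) / nrow(toy_transactions),
  demographics = 1 - nrow(unmatched_households) / nrow(toy_transactions)
)
print(coverage)
```

The same two `anti_join()` calls could be pointed at `transactions`, `products`, and `demographics` to see how many rows the report's joins leave out.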
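# Summarize sales by product category
sales_by_category <- joined_data %>%
  group_by(product_category) %>%
  # sales_value is multiplied by 10,000 before summing
  summarise(total_sales = sum(sales_value * 10000, na.rm = TRUE)) %>%
  arrange(total_sales) # Ascending order, so the worst sellers come first
# Format totals as dollar strings for display (the column becomes character)
sales_by_category$total_sales <- dollar(sales_by_category$total_sales)
sales_by_category <- sales_by_category %>%
  rename(products = product_category)
# Keep the 10 lowest-selling categories and show them in an interactive table
worst_sales <- head(sales_by_category, 10)
datatable(worst_sales, options = list(pageLength = 10))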
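Formatting the totals with `dollar()` turns the column into text, so any later sorting or arithmetic on it would no longer behave numerically. One possible alternative, sketched here on made-up data, keeps the numeric total and adds a separate display column. The `toy_sales` values and the `total_sales_label` column name are assumptions for illustration only.

```r
library(dplyr)
library(scales)

# Hypothetical toy transactions; the real analysis would use joined_data
toy_sales <- tibble(
  product_category = c("A", "A", "B", "C", "C", "C"),
  sales_value      = c(1.50, 2.25, 0.99, 10.00, 4.00, 6.50)
)

toy_ranked <- toy_sales %>%
  group_by(product_category) %>%
  summarise(total_sales = sum(sales_value, na.rm = TRUE), .groups = "drop") %>%
  arrange(total_sales) %>%                          # sort on the numeric column
  mutate(total_sales_label = dollar(total_sales))   # text column for display only

# The two lowest-selling categories in the toy data
toy_worst <- slice_head(toy_ranked, n = 2)
print(toy_worst)
```

With this layout the table can show `total_sales_label` while ranking and filtering continue to use `total_sales`.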
After identifying the top 10 items with the worst sales, the next step is to understand which income groups are purchasing these products. This will allow us to analyze customer behavior and identify patterns related to the purchasing of low-selling items. By examining the income ranges, we can gain valuable insights into who is buying these products and potentially use this information to adjust marketing strategies or product offerings.
# Join the transaction data with demographics data to include income group
joined_data_with_income <- joined_data %>%
  inner_join(demographics, by = "household_id") # Keeps only households that have demographic data
# 1. Filter the transactions to only include the worst-selling product categories
worst_products <- worst_sales$products # Names of the worst-selling product categories
# Keep transactions for the worst-selling categories, along with each household's income group
sales_with_income <- joined_data_with_income %>%
  filter(product_category %in% worst_products) %>% # Filter transactions for worst-selling categories
  select(household_id, product_category, income) # 'income' is the income-range column from demographics
# 2. Summarize the purchasing behavior by income group for the worst-selling items
income_sales_summary <- sales_with_income %>%
  group_by(income, product_category) %>%
  summarise(purchases = n(), .groups = "drop") # Count how many times each income group purchased each category
# 3. Create a bar chart to visualize the number of purchases by income group for each category
ggplot(income_sales_summary, aes(x = product_category, y = purchases, fill = income)) +
  geom_bar(stat = "identity", position = "dodge") + # Separate (dodged) bars for each income group
  labs(title = "Purchasing Behavior by Income Group for Worst-Selling Products",
       x = "Product Category", y = "Number of Purchases") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) # Rotate x-axis labels for better readability
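Raw purchase counts depend partly on how many households fall into each income bracket, so a bracket with many households can look like the heaviest buyer even if each household buys little. As a hedged cross-check, the counts could be divided by the number of households per bracket. The sketch below uses made-up data; the `toy_*` tables and the `purchases_per_household` column are hypothetical.

```r
library(dplyr)

# Hypothetical households and their income brackets
toy_households <- tibble(
  household_id = c("1", "2", "3", "4", "5"),
  income       = c("15-24K", "15-24K", "15-24K", "15-24K", "150K+")
)

# Hypothetical purchases of one low-selling category
toy_purchases <- tibble(
  household_id     = c("1", "2", "3", "5", "5"),
  product_category = "TOYS"
)

# Number of households in each income bracket
households_per_income <- toy_households %>%
  count(income, name = "households")

# Purchases per bracket, normalized by bracket size
purchase_rate <- toy_purchases %>%
  inner_join(toy_households, by = "household_id") %>%
  count(income, product_category, name = "purchases") %>%
  inner_join(households_per_income, by = "income") %>%
  mutate(purchases_per_household = purchases / households)

print(purchase_rate)
```

In this toy example the lower bracket has more purchases in total, but the higher bracket buys more per household, which is the kind of difference the raw-count chart cannot show.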
Based on the chart above, households earning between 15K and 75K account for most purchases of these items, while the higher-income groups buy them far less often. With this in mind, I believe we should focus on importing items that appeal to the lower and middle-income groups. At the same time, we could also target the higher-income bracket by introducing more advanced smart technologies at a higher price point. The idea is to appeal to their interest in sophisticated, high-tech products, as well as their desire to purchase more expensive items to showcase their wealth.
# Create the data frame with product names and sales values
product_sales <- data.frame(
  product = c("Air Fryers", "Modern Vacuum Cleaners", "Self-Heating Food Packaging",
              "Meat Clamps", "Sensor Trash Bins", "Thermal Flask",
              "Smart Mops", "Frozen Dumplings", "Chinese Snacks", "Clothes"),
  sales = c(2000300, 3419450, 5623800, 1856700, 4721000, 6752000,
            5500000, 7800000, 2700000, 6532000)
)
# Calculate the percentage of each product's sales out of total sales
product_sales <- product_sales %>%
  mutate(percentage = sales / sum(sales) * 100)
# Create the pie chart with percentage labels
ggplot(product_sales, aes(x = "", y = percentage, fill = product)) +
  geom_bar(stat = "identity", width = 1) + # Single stacked bar that becomes the pie
  coord_polar(theta = "y") + # Convert the stacked bar into a circle (pie chart)
  geom_text(aes(label = paste0(round(percentage, 1), "%")), # Add labels with percentage
            position = position_stack(vjust = 0.5), size = 4) + # Center each label in its slice
  labs(title = "Sales Distribution by Product") +
  theme_void() + # Remove axes and gridlines for a clean pie chart look
  scale_fill_brewer(palette = "Set3") # Apply color palette
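With ten slices of similar size, a pie chart can be hard to compare by eye. A sorted horizontal bar chart is one alternative way to show the same shares. The sketch below is self-contained and reuses only three of the product and sales pairs listed above, so its percentages are shares of that three-item subset, not of the full list; `toy_product_sales` and `share_plot` are hypothetical names.

```r
library(dplyr)
library(ggplot2)

# Three product/sales pairs copied from the data frame above (subset for illustration)
toy_product_sales <- data.frame(
  product = c("Air Fryers", "Frozen Dumplings", "Thermal Flask"),
  sales   = c(2000300, 7800000, 6752000)
) %>%
  mutate(percentage = sales / sum(sales) * 100)

# Horizontal bars ordered by share, with the percentage printed next to each bar
share_plot <- ggplot(toy_product_sales,
                     aes(x = reorder(product, percentage), y = percentage)) +
  geom_col() +
  geom_text(aes(label = paste0(round(percentage, 1), "%")), hjust = -0.1) +
  coord_flip() +
  expand_limits(y = max(toy_product_sales$percentage) * 1.15) + # room for labels
  labs(title = "Sales Share by Product (illustrative subset)",
       x = "Product", y = "Share of sales (%)") +
  theme_minimal()

print(share_plot)
```

Swapping `toy_product_sales` for the full `product_sales` data frame would produce the same chart for all ten products.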
Regork, a popular grocery chain in the U.S., wanted to boost sales by better understanding customer behavior and sales trends. The goal of this analysis was to identify opportunities to improve sales by finding the worst-selling product categories and the income groups that buy them. Additionally, there was a focus on diversifying Regork’s product mix by exploring international products from markets like China and Japan.
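To tackle this challenge, we used a combination of customer transaction data, product information, and demographic insights. The approach involved ranking product categories by total sales, examining which income groups purchase the 10 lowest-selling categories, and comparing the sales shares of candidate import products.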
This analysis provides Regork with a roadmap for improving its sales and customer engagement. By targeting specific consumer segments, diversifying the product mix with international items, and tailoring marketing efforts, Regork can boost revenue and build stronger customer relationships. Addressing data integrity issues and exploring logistical improvements will further enhance Regork’s ability to meet customer demands and stay competitive in the market.