Introduction

Regork is a popular grocery chain in the U.S. that provides a variety of products to meet the needs of different customers. To help increase sales and better serve customer demands, I’m diving into customer data to uncover opportunities for growth. By identifying these key areas, the company can strategically invest its resources to drive more revenue and boost profits in the future.

Strategies to Improve Sales

There are a number of products in our grocery stores that aren’t selling well, and if this continues, we might have to phase them out. Instead of focusing on these low-performing items, we’re planning to introduce products from other countries, like China and Japan, which have the potential to boost our sales. By importing and promoting these new items, we aim to not only diversify our offerings but also attract more customers and improve overall sales.

Why is this analysis helpful?

This report presents an analysis of the 10 product categories with the lowest total sales. We also look at data from two high-potential markets, Japan and China, to identify products that could be successful if imported. To make the case stronger, we include visualizations such as bar charts and a pie chart to show how these items sell in their respective countries. By providing this data, we aim to give Regork a clearer picture of why importing these products could be a smart move for the American market.

Packages Required

  • completejourney: data sets characterizing household-level transactions
  • tidyverse: collection of packages for data manipulation, exploration, and visualization
  • lubridate: functions for working with dates and times
  • tidyr: functions for tidying or cleaning up messy data
  • ggplot2: data visualization system based on the “Grammar of Graphics”
  • knitr: dynamic report generation in R
  • dplyr: grammar of data manipulation (filtering, grouping, summarising, joining)
  • stringr: text (string) manipulation
  • DT: interactive, formatted tables
  • scales: axis and label formatting, used here to display dollar signs
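
Before loading the packages, it can help to confirm that each one is installed. The following is a minimal sketch and not part of the original analysis; the helper name `missing_packages` is hypothetical, and it relies only on base R's `requireNamespace()`.

```r
# Hypothetical helper: return the names of packages that are not installed.
missing_packages <- function(pkgs) {
  # requireNamespace() returns FALSE (rather than erroring) when a package is absent
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

required <- c("completejourney", "tidyverse", "lubridate", "tidyr", "ggplot2",
              "knitr", "dplyr", "stringr", "DT", "scales")
to_install <- missing_packages(required)

# Report anything that is missing so it can be installed before knitting
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
}
```

Anything reported as missing can then be installed with `install.packages(to_install)`.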

# Load packages
library(tidyr)                    
library(ggplot2)                 
library(knitr)                    
library(dplyr)                  
library(stringr)                  
library(lubridate)                
library(DT)                     
library(completejourney)       
library(tidyverse)
library(scales)
# Load demographics, products, and transaction data
demo <- demographics
prod <- products
transactions <- get_transactions()

# Attach product details to each transaction
joined_data <- inner_join(transactions, prod, by = "product_id")
# Summarize total sales by product category, lowest first
sales_by_category <- joined_data %>%
  group_by(product_category) %>%
  summarise(total_sales = sum(sales_value, na.rm = TRUE)) %>%  # sales_value is already in dollars
  arrange(total_sales) %>%
  rename(products = product_category)

# Keep the 10 lowest-selling categories, then format sales as dollars for display
worst_sales <- head(sales_by_category, 10) %>%
  mutate(total_sales = dollar(total_sales))
datatable(worst_sales, options = list(pageLength = 10))

After identifying the top 10 items with the worst sales, the next step is to understand which income groups are purchasing these products. This will allow us to analyze customer behavior and identify patterns related to the purchasing of low-selling items. By examining the income ranges, we can gain valuable insights into who is buying these products and potentially use this information to adjust marketing strategies or product offerings.

# Join the transaction data with demographics data to include income group
# (the inner join keeps only households that have a demographic record)
joined_data_with_income <- joined_data %>%
  inner_join(demographics, by = "household_id")

# 1. Filter the transactions to only include the worst-selling product categories
worst_products <- worst_sales$products  # Names of the worst-selling categories

sales_with_income <- joined_data_with_income %>%
  filter(product_category %in% worst_products) %>%  # Keep transactions for worst-selling categories
  select(household_id, product_category, income)  # 'income' is the income-range column from demographics

# 2. Summarize the purchasing behavior by income group for the worst-selling items
income_sales_summary <- sales_with_income %>%
  group_by(income, product_category) %>%
  summarise(purchases = n(), .groups = "drop")  # Count how many times each income group purchased each product

# 3. Create a bar chart to visualize the number of purchases by income group for each product
ggplot(income_sales_summary, aes(x = product_category, y = purchases, fill = income)) +
  geom_bar(stat = "identity", position = "dodge") +  # Bar chart with dodge position for separate bars by income group
  labs(title = "Purchasing Behavior by Income Group for Worst-Selling Products",
       x = "Product Category", y = "Number of Purchases") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))  # Rotate x-axis labels for better readability

Based on the chart above, households earning between $15K and $75K account for most purchases of these items, while higher-income groups buy them far less often. With this in mind, I believe we should focus on importing items that appeal to the lower- and middle-income groups. At the same time, we could also target the higher-income bracket by introducing more advanced smart technologies at a higher price point. The idea is to appeal to their interest in sophisticated, high-tech products, as well as a possible preference for more expensive items that signal status.
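
Raw purchase counts favor income groups that simply contain more households, so dividing by group size is a useful check before comparing groups. The sketch below uses small toy tables with hypothetical values (not Regork data) to show the calculation with dplyr; the object names `purchases`, `households`, and `purchase_rate` are illustrative.

```r
library(dplyr)

# Toy data (hypothetical values): one row per purchase of a low-selling item
purchases <- data.frame(
  household_id = c(1, 1, 2, 3, 3, 3, 4),
  income = c("15-24K", "15-24K", "15-24K", "50-74K", "50-74K", "50-74K", "150-174K")
)

# Toy data (hypothetical values): one row per household
households <- data.frame(
  household_id = 1:6,
  income = c("15-24K", "15-24K", "50-74K", "150-174K", "15-24K", "150-174K")
)

# Number of households in each income group, used as the denominator
group_sizes <- households %>%
  count(income, name = "n_households")

# Purchases per household in each income group, highest first
purchase_rate <- purchases %>%
  count(income, name = "purchases") %>%
  inner_join(group_sizes, by = "income") %>%
  mutate(purchases_per_household = purchases / n_households) %>%
  arrange(desc(purchases_per_household))

print(purchase_rate)
```

In this toy example the two lower groups have the same number of purchases, but the rate per household differs because the groups are different sizes. The same pattern could be applied to `sales_with_income` and `demographics`, assuming the column names used earlier in this report.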

# Create the data frame with product names and sales values
product_sales <- data.frame(
  product = c("Air Fryers", "Modern Vacuum Cleaners", "Self-Heating Food Packaging", 
              "Meat Clamps", "Sensor Trash Bins", "Thermal Flask", 
              "Smart Mops", "Frozen Dumplings", "Chinese Snacks", "Clothes"),
  sales = c(2000300, 3419450, 5623800, 1856700, 4721000, 6752000, 
            5500000, 7800000, 2700000, 6532000)
)

# Calculate the percentage of each product's sales out of total sales
product_sales <- product_sales %>%
  mutate(percentage = sales / sum(sales) * 100)

# Create the pie chart with percentage labels
ggplot(product_sales, aes(x = "", y = percentage, fill = product)) +
  geom_bar(stat = "identity", width = 1) +  # Create the bar chart for the pie
  coord_polar(theta = "y") +  # Convert the bar chart into a circle (pie chart)
  geom_text(aes(label = paste0(round(percentage, 1), "%")),  # Add labels with percentage
            position = position_stack(vjust = 0.5), size = 4) +  # Adjust text position and size
  labs(title = "Sales Distribution by Product") +
  theme_void() +  # Remove axes and gridlines for a clean pie chart look
  scale_fill_brewer(palette = "Set3")  # Apply color palette

Summary

(i) Problem Statement Addressed:

Regork, a popular grocery chain in the U.S., wanted to boost sales by better understanding customer behavior and sales trends. The goal of this analysis was to identify opportunities to improve sales by finding the worst-selling product categories and the income groups that buy them. Additionally, there was a focus on diversifying Regork’s product mix by exploring international products from markets like China and Japan.

(ii) How the Problem Was Addressed:

To tackle this challenge, we used a combination of customer transaction data, product information, and demographic insights. The approach involved:

  • Data Collection & Integration: We combined transaction data with product and demographic data to gain a holistic view of customer behaviors.
  • Sales Trend Analysis: By grouping sales data by product categories, we identified the top 10 worst-selling products, which gave us insight into what products needed attention.
  • Income Group Segmentation: We dug deeper into who was buying the low-performing products, segmenting them by income group to see how different customers interacted with these items.
  • Visualizations: Using bar charts and other visual tools, we were able to clearly present the purchasing patterns of different income groups for the worst-selling products.
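
One point to keep in mind for the integration step above: an inner join keeps only the rows whose key appears in both tables, so transactions from households without a demographic record drop out. The sketch below uses toy tables with hypothetical values to show one way to measure how much data a join retains; `transactions_toy` and `demographics_toy` are illustrative names, not objects from the analysis.

```r
library(dplyr)

# Toy tables (hypothetical values, for illustration only)
transactions_toy <- data.frame(
  household_id = c(1, 1, 2, 3, 4),
  sales_value = c(2.50, 1.00, 4.25, 3.00, 0.75)
)
demographics_toy <- data.frame(
  household_id = c(1, 3),
  income = c("35-49K", "75-99K")
)

# Rows kept by the inner join (household present in both tables)
joined_toy <- inner_join(transactions_toy, demographics_toy, by = "household_id")

# Rows dropped because the household has no demographic record
dropped_toy <- anti_join(transactions_toy, demographics_toy, by = "household_id")

# Share of transaction rows that survive the join
coverage <- nrow(joined_toy) / nrow(transactions_toy)
print(coverage)
```

Reporting this share alongside the income charts shows how much of the customer base the income segmentation actually describes.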

(iii) Interesting Insights from the Analysis:

  • Product Sales Trends: We found that products like Toys, Miscellaneous Croutons, and Easter Lilies were among the worst performers, ranking in the bottom 10 categories by total sales.
  • Income Group Preferences: The analysis revealed that consumers in the $15K - $75K income range were more likely to purchase these low-selling items. Higher-income groups, on the other hand, showed little interest in these products.
  • Opportunities for International Products: By examining markets in China and Japan, we identified a potential to introduce unique, international products that could appeal to U.S. consumers, thus diversifying the product range and drawing in new customers.

(iv) Implications for the Consumer and Recommendations for the Regork CEO:

  • Targeting Low-Performing Products: We recommend focusing efforts on appealing to lower- and middle-income customers to help improve the sales of struggling products. These customers are more likely to engage with these items, so catering to their needs could drive growth.
  • Promoting Imported Products: By bringing in products from international markets like China and Japan, Regork can offer new, exciting options that attract consumers looking for something unique. This could set the brand apart from competitors and diversify its offerings.
  • Appealing to Higher-Income Groups: For wealthier consumers, introducing high-end products, such as advanced smart technologies or luxury items, could tap into their desire for sophistication and exclusivity.
  • Strategic Marketing: Tailoring marketing campaigns based on income groups can ensure Regork is reaching the right customers with the right messages. For example, campaigns could emphasize the value of products to middle-income consumers while showcasing the exclusivity of imported items to affluent buyers.

(v) Limitations of the Analysis and Areas for Improvement:

Conclusion:

This analysis provides Regork with a roadmap for improving its sales and customer engagement. By targeting specific consumer segments, diversifying the product mix with international items, and tailoring marketing efforts, Regork can boost revenue and build stronger customer relationships. Addressing data integrity issues and exploring logistical improvements will further enhance Regork’s ability to meet customer demands and stay competitive in the market.