In any organization, profitability can be determined by two
metrics–revenue and expenses. There are two ways to maximize
profitability. The first is to increase revenue given that expenses
remain the same. The second is to try to reduce expenses while
maintaining the same revenue. Regarding the latter, there are many
approaches to how this can be achieved. You can negotiate better
contracts from suppliers, streamline operations and distributions, or
reduce any unnecessary waste or spend. For this analysis, I decided to
evaluate the impact of reducing overall inventory by removing products
that yield very small sales revenue. Reducing our inventory would help
us reduce expenses for several reasons. Firstly, it eliminates the
initial expense of purchasing these products that do not yield many
sales to recover that cost. Secondly, it helps streamline the sourcing
process as it would potentially reduce the number of suppliers we need
to engage with. Additionally, reducing inventory could potentially free
up shelf space to be used by products that yield higher revenue which
could also allow us to generate more profit. This would be a secondary
effect of this conclusion, but a positive one. If we can conclude that
reducing products in our inventory will reduce expenses without
significantly reducing revenue, this would result in higher
profitability for Regork. As a result, this increased profitability
would enable regork to evaluate a number of different strategies
including, but not limited to expansion, new investments, philanthropy,
etc.
The above thoughts can be summarized as follows. Are there certain
products that are not producing enough sales? Can we remove these
products from our inventory to help us save on expenses and free up
shelf space for higher yielding products? Could reducing inventory help
Regork increase profitability?
I addressed this question using the data provided in the completejourney
package–more specifically the transactions and products datasets. I
evaluated the top 10 product categories (by total sales) and found a
threshold sales value as the minimum requirement for a product to remain
in inventory. Any products that earn below this threshold are removed
from inventory and total sales are calculated again. I then compared the
original sales value to the new sales value with products removed. I
then compared the number of products for each category before and after
removal. The sales value decreased minimally in each category while the
number of products decreased considerably. I would recommend
consolidating our inventory. Some products do not generate enough
revenue to help the company be profitable. The expenses associated with
these products likely mean these products contribute negatively to our
profitability. A deeper analysis with more detail data about the
sourcing and supplier contracts is likely needed to truly determine the
impact to expenses, however I feel confident in saying that we would be
better served removing products from our inventory.
For this evaluation, the following packages are required:
library(tidyverse)
library(completejourney)
library(knitr)
My theory is that if we remove lower revenue products from our inventory, we can cut expenses and increase overall profitability. To do this, I evaluated the 10 product categories that generated the most sales. After identifying these categories (and removing the COUPON/MISC ITEMS category for being too generic), I looked at the sales value for each product within these groups and evaluated how the total sales for each category would change if we removed a certain number of products.
I start by loading the full transactions dataset.
# Load the full transactions dataset into the 'transactions' object
transactions <- get_transactions()
After loading the full dataset, I identify the top 10 product categories
with the highest revenue.
# Find the product_category values that yield the top 10 sales amount
# Excluding COUPON/MISC ITEMS
top_10_catgs <- transactions %>%
inner_join(products, by = 'product_id') %>%
group_by(product_category) %>%
summarize(
total_sales = sum(sales_value),
total_quantity = sum(quantity)
) %>%
arrange(desc(total_sales)) %>%
filter(product_category != 'COUPON/MISC ITEMS') %>%
slice_head(n = 10)
The barchart below shows the total revenue for each fo the top 10
categories.
# Creating a bar chart for sales amount for the top 10 product_category values
top_10_catgs %>%
ggplot(aes(
fct_reorder(factor(product_category), total_sales),
total_sales,
fill = product_category)
) +
geom_bar(stat = 'identity') +
scale_y_continuous(labels = scales::dollar) +
scale_fill_brewer(palette = 'Set3') +
labs(
title = 'Top 10 Product Categories by Total Sales',
subtitle = 'NOTE: COUPON/MISC ITEMS category removed.',
x = 'Product Category',
y = 'Total Sales',
fill = 'Product Category'
) +
coord_flip() +
theme(legend.position = 'none')
These categories generate a lot of sales revenue for Regork.
Now I evaluate the number of products in each of these categories.
# Counting the number of product_ids within each of the top 10 product_category
# values
catg_n_products <- top_10_catgs %>%
inner_join(products, by = 'product_category') %>%
group_by(product_category, total_sales) %>%
summarize(
n_products = n_distinct(product_id)
)
# Creating a bar chart for count of products for the top 10 product_category
# values
catg_n_products %>%
ggplot(aes(fct_reorder(factor(product_category), n_products), n_products, fill = product_category)) +
geom_bar(stat = 'identity') +
coord_flip() +
scale_fill_brewer(palette = 'Set3') +
scale_y_continuous(labels = scales::comma) +
labs(
title = 'Number of Products in Top 10 Product Categories',
x = 'Product Category',
y = 'Number of Products',
) +
theme(legend.position = 'none')
Each category has a lot of unique products within them. The theory is
that some of these products will generate a lot of revenue while others
will product a very small revenue. Assuming a lot of these products
could be very similar to one another, it’s reasonable to conclude that
some products are more popular than other products of the same type. For
example, in SOFT DRINKS, we could have many different varieties. Some of
these products could be limited releases, but also some of them cold be
regular products that simply do not get purchased very frequently.
The below charts break the total revenue for each product into different
groupings to assess if there is a large number of these prooducts that
generate lower sales.
top_10_catgs %>%
inner_join(products, by = 'product_category') %>%
inner_join(transactions, by = 'product_id') %>%
group_by(product_category, product_id) %>%
summarize(
total_sales = sum(sales_value)
) %>%
mutate(total_sales_group = case_when(
total_sales <= 50 ~ '$0 - $50',
total_sales <= 100 ~ '$50 - $100',
total_sales <= 150 ~ '$100 - $150',
total_sales <= 200 ~ '$150 - $200',
total_sales <= 250 ~ '$200 - $250',
total_sales <= 300 ~ '$250 - $300',
total_sales <= 350 ~ '$300 - $350',
total_sales <= 400 ~ '$350 - $400',
total_sales <= 450 ~ '$400 - $450',
total_sales <= 500 ~ '$450 - $500',
TRUE ~ '$500 +'
)) %>%
ggplot(aes(total_sales_group)) +
geom_bar(color = 'darkslategray3', fill = 'darkslategray3') +
facet_wrap(~ product_category, ncol = 5) +
coord_flip() +
labs(
title = 'Number of Products by Total Sales Bin',
subtitle = 'Broken out by product category',
x = 'Total Sales Bin',
y = 'Number of Products'
)
The charts above indicate that for each category, the overwhelming
majority of the products total between $0 and $50 of sales revenue. Most
products seem to generate very little overall revenue. It is worth
looking into the lowest sales group further to see the distribution of
sales value across this group for each product category. This will help
us get a better idea of just how many really low revenue products we
have. This will help us set a threshold of minimum revenue for our
evaluation.
top_10_catgs %>%
inner_join(products, by = 'product_category') %>%
inner_join(transactions, by = 'product_id') %>%
group_by(product_category, product_id) %>%
summarize(
total_sales = sum(sales_value)
) %>%
filter(total_sales <= 50) %>%
ggplot(aes(total_sales)) +
geom_histogram(bins = 20, color = 'darkslategray3', fill = 'darkslategray3') +
geom_vline(xintercept = 5, linetype = 'dashed', color = 'darkslategrey') +
facet_wrap(~ product_category, ncol = 5) +
labs(
title = 'Histograms of Products with $0-$50 of Total Sales',
subtitle = 'Broken out by product category',
x = 'Total Sales',
y = 'Number of Products'
)
These histograms show that for every product category, there is a
considerable number of products that generate $5 or less in sales
revenue. The dashed vertical line is placed at $5 and we can see some
peaks generally to the left of this line or at least around this line.
The $5 threshold is somewhat arbitrary. It is a low enough amount to
figure that cutting any products that do not reach this threshold likely
would not overly impact the overall revenue. Additionally, the value is
low enough to not inadvertently cut out products that are less popular,
but still potentially some shoppers’ favorite.
So let’s suppose that we remove all of the products that have generated
$5 in sales or less. How does this impact revenue for these products? If
it has very low impact to the overall revenue for these groups, it’s
possible that the cost to procure these products and shelve them could
outweigh their revenue contribution. The below chart illustrates
this.
all_products <- top_10_catgs %>%
inner_join(products, by = 'product_category') %>%
inner_join(transactions, by = 'product_id') %>%
group_by(product_category, product_id) %>%
summarize(
total_sales = sum(sales_value)
) %>%
group_by(product_category) %>%
summarize(
total_sales = sum(total_sales),
n_products = n_distinct(product_id)
)
# Calculate total sales and number of products per category for only products
# that generate > $5 in sales
gt5_products <- top_10_catgs %>%
inner_join(products, by = 'product_category') %>%
inner_join(transactions, by = 'product_id') %>%
group_by(product_category, product_id) %>%
summarize(
total_sales = sum(sales_value)
) %>%
filter(total_sales > 5) %>%
group_by(product_category) %>%
summarize(
total_sales = sum(total_sales),
n_products = n_distinct(product_id)
)
product_compare <- all_products %>%
inner_join(gt5_products, by = 'product_category')
names(product_compare) <- c('product_category', 'total_sales_all',
'n_products_all', 'total_sales_trim',
'n_products_trim')
product_compare %>%
select(product_category, total_sales_all, total_sales_trim) %>%
pivot_longer(cols = 2:3, names_to = 'type', values_to = 'sales_amount') %>%
mutate(type = if_else(type == 'total_sales_all', 'Before', 'After')) %>%
ggplot(aes(product_category, sales_amount, fill = type)) +
geom_col(position = 'dodge') +
scale_y_continuous(labels = scales::dollar) +
coord_flip() +
labs(
title = 'Total Sales Comparison',
subtitle = 'Broken out by product category',
x = 'Product Category',
y = 'Total Sales',
fill = 'Before/After Product Removal'
)
We can see that for each category, both bars are very close. While we do
lose some revenue from removing products, the impact appears to be
minimal.
On the other hand, let’s look at the difference in the number of unique
products remaining after we remove those that generate $5 of revenue or
less.
product_compare %>%
select(product_category, n_products_all, n_products_trim) %>%
pivot_longer(cols = 2:3, names_to = 'type', values_to = 'sales_amount') %>%
mutate(type = if_else(type == 'n_products_all', 'Before', 'After')) %>%
ggplot(aes(product_category, sales_amount, fill = type)) +
geom_col(position = 'dodge') +
coord_flip() +
scale_y_continuous(labels = scales::comma) +
labs(
title = 'Number of Products Comparison',
subtitle = 'Broken out by product category',
x = 'Product Category',
y = 'Number of Products',
fill = 'Before/After Product Removal'
)
In this case, the difference between the original number of products and
the new reduced number of products is drastic. By removing these low
revenue products, we are able to cut out a significant portion of our
inventory. It appears that removing these low revenue products has very
minimal overall impact to revenue for each group, but reduces the number
of products significantly. This is encouraging as it could mean that
there is an overall trend for every department.
Below is a table indicating the difference in revenue and product count
for each category.
product_compare_diffs <- product_compare %>%
mutate(
sales_diff = total_sales_all - total_sales_trim,
pct_sales_diff =
paste0(
round((total_sales_all - total_sales_trim) / total_sales_all, 4) * 100,
'%'
),
products_diff = n_products_all - n_products_trim,
pct_products_diff =
paste0(
round((n_products_all - n_products_trim) / n_products_all, 4) * 100,
'%'
)
) %>%
select(product_category, sales_diff, pct_sales_diff, products_diff, pct_products_diff)
kable(
product_compare_diffs,
caption = 'Comparison of Revenue and Products (By Category)',
col.names = c('Product Category', 'Sales',
'Sales (%)', 'Number of Products',
'Number of Products (%)'),
align = c('l', 'r', 'r', 'r', 'r')
)
| Product Category | Sales | Sales (%) | Number of Products | Number of Products (%) |
|---|---|---|---|---|
| BAG SNACKS | 1159.54 | 1.37% | 468 | 40.91% |
| BAKED BREAD/BUNS/ROLLS | 844.01 | 1.02% | 382 | 43.56% |
| BEEF | 405.74 | 0.23% | 120 | 16.06% |
| BEERS/ALES | 174.13 | 0.21% | 51 | 8.1% |
| CHEESE | 819.10 | 0.77% | 297 | 36.76% |
| COLD CEREAL | 418.55 | 0.66% | 155 | 34.52% |
| FLUID MILK PRODUCTS | 313.51 | 0.27% | 131 | 38.08% |
| FROZEN PIZZA | 486.72 | 0.6% | 181 | 26.93% |
| FRZN MEAT/MEAT DINNERS | 637.20 | 0.68% | 266 | 28.88% |
| SOFT DRINKS | 1060.28 | 0.58% | 468 | 35.89% |
What can we get from this table? The sales revenue for each group only
drops by 0.21% to 1.37% while the number of products reduced drops by
8.1% to 43.56%. If we exclude the two lowest changes of 8.1% and 16.06%,
the percentage decrease in number of products range from 26.93% to
43.56%. This means that for most groups at least a quarter of the
products are reduced and potentially up to almost half. This does not
guarantee cutting half of the expense to procure these items, but it
still could be a considerable cost saving.
But what if we apply this logic to the all of our products? How many
products would we remove from our inventory and how much would sales
reduce?
total_inventory <- transactions %>%
summarize(
total_sales_all = sum(sales_value),
n_products_all = n_distinct(product_id)
)
trimmed_inventory <- transactions %>%
group_by(product_id) %>%
summarize(
total_sales = sum(sales_value)
) %>%
filter(total_sales > 5) %>%
summarize(
total_sales_trim = sum(total_sales),
n_products_trim = n_distinct(product_id)
)
total_diffs <- total_inventory %>%
cbind(trimmed_inventory) %>%
mutate(
sales_diff = total_sales_all - total_sales_trim,
pct_sales_diff =
paste0(
round((total_sales_all - total_sales_trim) / total_sales_all, 4) * 100,
'%'
),
products_diff = n_products_all - n_products_trim,
pct_products_diff =
paste0(
round((n_products_all - n_products_trim) / n_products_all, 4) * 100,
'%'
)
) %>%
select(sales_diff, pct_sales_diff, products_diff, pct_products_diff)
kable(
total_diffs,
caption = 'Comparison of Revenue and Products (Overall)',
col.names = c('Sales',
'Sales (%)', 'Number of Products',
'Number of Products (%)'),
align = c('r', 'r', 'r', 'r')
)
| Sales | Sales (%) | Number of Products | Number of Products (%) |
|---|---|---|---|
| 58107.32 | 1.26% | 22089 | 32.24% |
By removing all products that generated $5 of revenue or less, we only
decrease sales by 1.26% while decreasing the number of products by
32.24%. Of course there is a small reduction in the total company
revenue, but this is quite likely more than offset by the reduction of
products.
I wanted to evaluate if reducing product inventory could help
increase profitability from a number of different angles. The theory
being that it would reduce procurement expenses, help streamline
operations, and potentially allow for more profitable products to
reclaim some shelf space. This is entirely contingent on whether or not
removing these products from our inventory have a minimal effect on
revenue. This problem was addressed by using the transactions and
products datasets. I combined these datasets to tie transaction
information to product categories and then explored to find a good
threshold value for product revenue. Once this was determined, I
evaluated the impact of removing all products with $5 or less in sales
revenue. The analysis revealed very minimal impact to the overall
revenue of Regork–only a 1.26% reduction–while cutting nearly one-third
(32.24%) of products. Removing these poorer performing products could
result in many savings. It is worth considering that, for some
customers, we could be cutting some of their favorite products. I would
ultimately recommend to consolidate the inventory for the reasons
mentioned above–lower procurement costs, streamline operations, and free
up shelf space.
It is important to consider the limitations of this analysis. Some of
the limitations are as follows: