Using data from a small business, I analyze their strenghts and weaknesses, and make recommendations of how they should proceed in the future based on their sales and revenue.
library(sqldf)
library(tidyverse)
library(dplyr)
library(fpp2)
library(readxl)
This business sells a range of items ranging from jewelry and furniture to Christmas items and Skin Care goods. After downloading the CSV of the business’s sales data from Kaggle, I loaded it into RStudio and used SQL to data mine. In total, this business sells 19 distinct kinds of products. The business’s top 6 performing products were in baskets, arts & sculpture, jewelry, home decor, kitchen, and Christmas. Each of these categories generate a revenue of over $14,000 per year, with baskets leading the way with $134,791.
b <- ggplot(df_6, aes(Product_Type,Total_Net_Sales)) + geom_bar(stat="identity",color="#FF6666") + xlab("Product Type") + ylab("Revenue") + theme_minimal()
plot(b)
Next, I plotted a time series of total sales for the business from 2017 to 2019 in order to visualize the data better. One thing that stuck out to me immediately was the seasonality of the data; the holiday season (November and December) consistently had higher sales than the other months. To dive deeper into this, I looked at the top 10 months with the best sales during this three-year span. Six out of the top ten months occurred during November or December, confirming the trend. Given that one of the best-selling products that this business sells is Christmas items, this makes sense. The holiday months averaged $19,142 in revenue while non-holiday months averaged just $8,937. In other words, that’s a 214% increase from non-holiday to holiday months. Given these statistics, I believe this business should invest even more into its Christmas items in order to maximize its revenue.
myts <- ggplot(business_retailsales2, aes(x=Date, y=Total_Sales)) + geom_line(color='red')
myts <- myts + ylab("Revenue") + labs(title="Monthly Revenue") + theme_classic()
myts
As you can see, towards the end of 2019 sales shot up to record numbers. In November and December of that year alone, revenue was over 60,000. Based on the whole of the data, these numbers seem like outliers.
I also took a look at the Average Order Value (AOV) in order to better guage where this business stands and what decisions need to be made going forward. The AOV for this business is 68.86. Twelve out of the nineteen unique categories of items this business sells have an average price of less than this AOV. This lends credence to the idea that items with a lower average price such as Christmas and kitchen items are the main rivers of the business’s revenue. With many of the other categories also having an average price of at or under $30, it’s also possible that customers are buying from multiple categories at a time.
This business should look to raise its AOV in order to get more ROI on its marketing efforts.
f <- ts(business_retailsales2[,10],start=c(2017,1),end=c(2019,10),frequency=12)
autoplot(f)
When we look at the AOV over time (above), we see that it has fallen by a notable amount. This is a potentially worrying sign for the business that they are not getting the most out of their inputs. Given that the more expensive items have higher margins, I suggest they ramp up their marketing and outreach efforts in order to increase their AOV again.
I looked at the rate at which each product type was returned after purchase by dividing the value of the returns by the value of gross sales. It turns out that Christmas items have the highest rate of return at 4.3%, followed by arts & sculpture, baskets, kitchenware, jewelry, and home decor, respectively. This may be because consumers buy Christmas items for one particular season, and once that season ends there is no need for the items for at least another eleven months, prompting them to return the item if eligible. It is unclear if anything could be done about that. However, this business should strive to improve the quality of the other top-returned products, especially in high-margin categories like arts & sculpture and baskets, in order to improve its net profit.
Next, I sought to forecast future revenue based on the existing data, starting by creating a time series.
#creating a time series of the business's sales
y <- ts(business_retailsales2[,8],start=c(2017,1),end=c(2019,10), frequency=12)
I tried out several models before settling on the ARIMA model since it had a lower standard deviation than the ETS model. In order to avoid the potential of November and December 2019 numbers to skew the projections, I removed them from the inputs.
fit_arima <- auto.arima(y,d=1,D=1,stepwise=FALSE,approximation=FALSE,trace=TRUE)
##
## ARIMA(0,1,0)(0,1,0)[12] : 327.5845
## ARIMA(0,1,1)(0,1,0)[12] : 319.9557
## ARIMA(0,1,2)(0,1,0)[12] : 321.334
## ARIMA(0,1,3)(0,1,0)[12] : 324.0979
## ARIMA(0,1,4)(0,1,0)[12] : Inf
## ARIMA(0,1,5)(0,1,0)[12] : Inf
## ARIMA(1,1,0)(0,1,0)[12] : 321.88
## ARIMA(1,1,1)(0,1,0)[12] : 321.8154
## ARIMA(1,1,2)(0,1,0)[12] : 324.173
## ARIMA(1,1,3)(0,1,0)[12] : Inf
## ARIMA(1,1,4)(0,1,0)[12] : Inf
## ARIMA(2,1,0)(0,1,0)[12] : 320.9962
## ARIMA(2,1,1)(0,1,0)[12] : 323.4308
## ARIMA(2,1,2)(0,1,0)[12] : Inf
## ARIMA(2,1,3)(0,1,0)[12] : 330.4297
## ARIMA(3,1,0)(0,1,0)[12] : 324.0395
## ARIMA(3,1,1)(0,1,0)[12] : Inf
## ARIMA(3,1,2)(0,1,0)[12] : Inf
## ARIMA(4,1,0)(0,1,0)[12] : 326.0175
## ARIMA(4,1,1)(0,1,0)[12] : 329.4831
## ARIMA(5,1,0)(0,1,0)[12] : 328.5458
##
##
##
## Best model: ARIMA(0,1,1)(0,1,0)[12]
fcst <- forecast(fit_arima,h=38)
autoplot(fcst)
According to the model, revenue will continue to fluctuate between lower numbers in the spring and summer and higher numbers during the holiday season. I believe there is a lot of room for growth, however, if this business manages to improve its AOV and products with high rates of returns. These steps, in addition to investing more into its top-selling products, will help grow its net profit and ensure greater prosperity in the future.