1 Adidas Sales Dataset

1.1 About Dataset

An Adidas sales dataset is a collection of data that includes information on the sales of Adidas products. This type of dataset may include details such as the number of units sold, the total sales revenue, the location of the sales, the type of product sold, and any other relevant information.

Adidas sales data can be useful for a variety of purposes, such as analyzing sales trends, identifying successful products or marketing campaigns, and developing strategies for future sales. It can also be used to compare Adidas sales to those of competitors, or to analyze the effectiveness of different marketing or sales channels.

There are a variety of sources that could potentially provide an Adidas sales dataset, including Adidas itself, market research firms, government agencies, or other organizations that track sales data. The specific data points included in an Adidas sales dataset may vary depending on the source and the purpose for which it is being used.

1.2 Libraries and Setup

We’ll set up caching for this notebook, since some of the code we will write can be computationally expensive.
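The notebook does not show how caching is enabled. One common way, assuming the document is rendered with knitr (an assumption; the source does not say), is to turn on the chunk cache in a setup chunk:

```r
# Assumption: this notebook is an R Markdown document rendered with knitr.
# cache = TRUE stores chunk results on disk and reuses them on re-render
# as long as the chunk code is unchanged.
knitr::opts_chunk$set(cache = TRUE)
```

Cached chunks are not re-run when only upstream data files change, so the cache may need clearing after the input CSV is updated.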

We load the required packages into this workspace using the library() function:

library(dplyr)
library(lubridate)
library(ggplot2)
library(plotly)
library(glue)
library(ggpubr)
library(scales)

The libraries loaded above are used throughout the analysis.

2 Dataset

data <- read.csv("data input/Adidas Sales.csv", header = TRUE)
head(data)

Description of the dataset columns:

  • ‘Retailer’ : Retailer where the product was purchased
  • ‘Retailer.ID’ : ID of the store
  • ‘Invoice.Date’ : Purchase date
  • ‘Region’ : Purchase region in the US
  • ‘State’ : Purchase state in the US
  • ‘City’ : Purchase city in the US
  • ‘Product’ : Product category
  • ‘Price.per.Unit’ : Product price per unit
  • ‘Units.Sold’ : Number of units sold
  • ‘Total.Sales’ : Total sales revenue
  • ‘Operating.Profit’ : Operating profit from the sales
  • ‘Operating.Margin’ : Operating margin earned on the sales
  • ‘Sales.Method’ : Sales channel used for the purchase

2.1 Data Preparation

The data types before any changes are made:

glimpse(data)
## Rows: 9,648
## Columns: 13
## $ Retailer         <chr> "Foot Locker", "Foot Locker", "Foot Locker", "Foot Lo…
## $ Retailer.ID      <int> 1185732, 1185732, 1185732, 1185732, 1185732, 1185732,…
## $ Invoice.Date     <chr> "2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04…
## $ Region           <chr> "Northeast", "Northeast", "Northeast", "Northeast", "…
## $ State            <chr> "New York", "New York", "New York", "New York", "New …
## $ City             <chr> "New York", "New York", "New York", "New York", "New …
## $ Product          <chr> "Men's Street Footwear", "Men's Athletic Footwear", "…
## $ Price.per.Unit   <dbl> 50, 50, 40, 45, 60, 50, 50, 50, 40, 45, 60, 50, 50, 5…
## $ Units.Sold       <dbl> 1200, 1000, 1000, 850, 900, 1000, 1250, 900, 950, 825…
## $ Total.Sales      <dbl> 600000, 500000, 400000, 382500, 540000, 500000, 62500…
## $ Operating.Profit <dbl> 300000.0, 150000.0, 140000.0, 133875.0, 162000.0, 125…
## $ Operating.Margin <dbl> 0.50, 0.30, 0.35, 0.35, 0.30, 0.25, 0.50, 0.30, 0.35,…
## $ Sales.Method     <chr> "In-store", "In-store", "In-store", "In-store", "In-s…

Select the variables to use and convert each one to an appropriate data type:

data_clean <- data %>% 
  select(-c(Retailer.ID)) %>%
  mutate(
    Retailer = as.factor(Retailer),
    Invoice.Date = ymd(Invoice.Date),
    Region = as.factor(Region),
    State = as.factor(State),
    City = as.factor(City),
    Product = as.factor(Product),
    Sales.Method = as.factor(Sales.Method)
  )
glimpse(data_clean)
## Rows: 9,648
## Columns: 12
## $ Retailer         <fct> Foot Locker, Foot Locker, Foot Locker, Foot Locker, F…
## $ Invoice.Date     <date> 2020-01-01, 2020-01-02, 2020-01-03, 2020-01-04, 2020…
## $ Region           <fct> Northeast, Northeast, Northeast, Northeast, Northeast…
## $ State            <fct> New York, New York, New York, New York, New York, New…
## $ City             <fct> New York, New York, New York, New York, New York, New…
## $ Product          <fct> Men's Street Footwear, Men's Athletic Footwear, Women…
## $ Price.per.Unit   <dbl> 50, 50, 40, 45, 60, 50, 50, 50, 40, 45, 60, 50, 50, 5…
## $ Units.Sold       <dbl> 1200, 1000, 1000, 850, 900, 1000, 1250, 900, 950, 825…
## $ Total.Sales      <dbl> 600000, 500000, 400000, 382500, 540000, 500000, 62500…
## $ Operating.Profit <dbl> 300000.0, 150000.0, 140000.0, 133875.0, 162000.0, 125…
## $ Operating.Margin <dbl> 0.50, 0.30, 0.35, 0.35, 0.30, 0.25, 0.50, 0.30, 0.35,…
## $ Sales.Method     <fct> In-store, In-store, In-store, In-store, In-store, In-…
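To make the effect of these conversions concrete, here is a minimal sketch on a made-up three-row data frame (the rows are hypothetical, not taken from the file). It uses base R equivalents of the mutate() calls above:

```r
# Toy data frame (made-up rows) mirroring three of the columns above.
toy <- data.frame(
  Retailer     = c("Foot Locker", "Walmart", "Foot Locker"),
  Invoice.Date = c("2020-01-01", "2020-01-02", "2021-12-31"),
  Units.Sold   = c(1200, 0, 850),
  stringsAsFactors = FALSE
)

# Base-R equivalents of the conversions done with mutate() above.
toy$Retailer     <- as.factor(toy$Retailer)
toy$Invoice.Date <- as.Date(toy$Invoice.Date, format = "%Y-%m-%d")

str(toy)

# Factors expose their distinct levels; Dates support arithmetic.
levels(toy$Retailer)
diff(range(toy$Invoice.Date))
```

Storing categories as factors makes table() and group summaries straightforward, and storing dates as Date objects allows range and comparison operations.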

2.2 Checking for Missing Values

colSums(is.na(data_clean))
##         Retailer     Invoice.Date           Region            State 
##                0                0                0                0 
##             City          Product   Price.per.Unit       Units.Sold 
##                0                0                0                0 
##      Total.Sales Operating.Profit Operating.Margin     Sales.Method 
##                0                0                0                0

The dataset has no missing values, so no missing-value handling is required and the analysis can proceed.
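For comparison, this is what the same check looks like when values are missing. The data frame below is made up for illustration:

```r
# Made-up data with a few missing values.
toy <- data.frame(
  Units.Sold  = c(10, NA, 30, NA),
  Total.Sales = c(100, 200, NA, 400),
  Region      = c("West", "South", "West", "West")
)

# Number of NA values per column, as in the check above.
na_counts <- colSums(is.na(toy))
na_counts

# Keep only the rows that have no missing value in any column.
complete <- toy[complete.cases(toy), ]
nrow(complete)
```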

2.3 Dimensions of the Dataset

dim(data_clean)
## [1] 9648   12

The cleaned data has 9,648 rows and 12 variables.

3 Descriptive Statistics

sort(table(data_clean$Retailer), decreasing = T)
## 
##   Foot Locker     West Gear Sports Direct        Kohl's        Amazon 
##          2637          2374          2032          1030           949 
##       Walmart 
##           626

Foot Locker has the most sales records (2,637), followed by West Gear (2,374). Walmart has the fewest sales records (626).

c(min(data_clean$Invoice.Date), max(data_clean$Invoice.Date))
## [1] "2020-01-01" "2021-12-31"

The data covers 1 January 2020 to 31 December 2021.

sort(table(data_clean$Region), decreasing = T)
## 
##      West Northeast   Midwest     South Southeast 
##      2448      2376      1872      1728      1224

The West region has the most Adidas sales records (2,448) and the Southeast region has the fewest (1,224).

The states recorded in the Adidas sales dataset:

unique(data_clean$State)
##  [1] New York       Texas          California     Illinois       Pennsylvania  
##  [6] Nevada         Colorado       Washington     Florida        Minnesota     
## [11] Montana        Tennessee      Nebraska       Alabama        Maine         
## [16] Alaska         Hawaii         Wyoming        Virginia       Michigan      
## [21] Missouri       Utah           Oregon         Louisiana      Idaho         
## [26] Arizona        New Mexico     Georgia        South Carolina North Carolina
## [31] Ohio           Kentucky       Mississippi    Arkansas       Oklahoma      
## [36] Kansas         South Dakota   North Dakota   Iowa           Wisconsin     
## [41] Indiana        West Virginia  Maryland       Delaware       New Jersey    
## [46] Connecticut    Rhode Island   Massachusetts  Vermont        New Hampshire 
## 50 Levels: Alabama Alaska Arizona Arkansas California Colorado ... Wyoming

The cities recorded in the Adidas sales dataset:

unique(data_clean$City)
##  [1] New York       Houston        San Francisco  Los Angeles    Chicago       
##  [6] Dallas         Philadelphia   Las Vegas      Denver         Seattle       
## [11] Miami          Minneapolis    Billings       Knoxville      Omaha         
## [16] Birmingham     Portland       Anchorage      Honolulu       Orlando       
## [21] Albany         Cheyenne       Richmond       Detroit        St. Louis     
## [26] Salt Lake City New Orleans    Boise          Phoenix        Albuquerque   
## [31] Atlanta        Charleston     Charlotte      Columbus       Louisville    
## [36] Jackson        Little Rock    Oklahoma City  Wichita        Sioux Falls   
## [41] Fargo          Des Moines     Milwaukee      Indianapolis   Baltimore     
## [46] Wilmington     Newark         Hartford       Providence     Boston        
## [51] Burlington     Manchester    
## 52 Levels: Albany Albuquerque Anchorage Atlanta Baltimore ... Wilmington

sort(table(data_clean$Product), decreasing = T)
## 
##   Men's Athletic Footwear     Men's Street Footwear           Women's Apparel 
##                      1610                      1610                      1608 
##   Women's Street Footwear             Men's Apparel Women's Athletic Footwear 
##                      1608                      1606                      1606

Men’s Athletic Footwear and Men’s Street Footwear have the most sales records, with 1,610 each, while Men’s Apparel and Women’s Athletic Footwear have the fewest, with 1,606 each. These figures count rows in the dataset, not units sold, and the six categories are almost evenly represented.
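Note that table() counts rows rather than summing Units.Sold. The sketch below, on made-up data, shows how the two measures can rank categories differently:

```r
# Made-up records: two small sales and one large sale.
toy <- data.frame(
  Product    = c("Men's Apparel", "Men's Apparel", "Women's Apparel"),
  Units.Sold = c(100, 50, 400)
)

# Number of records (rows) per product, as table() reports.
record_counts <- table(toy$Product)
record_counts

# Total units sold per product, which is a different quantity.
units_by_product <- tapply(toy$Units.Sold, toy$Product, sum)
units_by_product
```

In this toy example Men's Apparel has more records, but Women's Apparel has more units sold.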

summary(data_clean$Price.per.Unit)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    7.00   35.00   45.00   45.22   55.00  110.00

The lowest price per unit is $7 and the highest is $110. The mean price of an Adidas product is $45.22.

summary(data_clean$Units.Sold)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     0.0   106.0   176.0   256.9   350.0  1275.0

The mean is about 257 units sold per record. Some records show no units sold at all, and the largest single record is 1,275 units.

summary(data_clean$Total.Sales)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       0    4254    9576   93273  150000  825000

The mean total sales is $93,273 per record. Some records show no sales revenue at all, and the largest single record is $825,000.

summary(data_clean$Operating.Profit)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       0    1922    4371   34425   52063  390000

The mean operating profit is $34,425 per record. Some records show no profit at all, and the largest single record is $390,000.

summary(data_clean$Operating.Margin)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.100   0.350   0.410   0.423   0.490   0.800

The mean operating margin is 42.3%. The lowest margin is 10%, so no record has a zero margin, and the highest is 80%.
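As a quick consistency check, the first five rows printed in the glimpse() output can be used to see how these columns relate. This is a sketch on those five rows only; it says nothing about the rest of the file:

```r
# First five rows as printed in the glimpse() output above.
first_rows <- data.frame(
  Price.per.Unit   = c(50, 50, 40, 45, 60),
  Units.Sold       = c(1200, 1000, 1000, 850, 900),
  Total.Sales      = c(600000, 500000, 400000, 382500, 540000),
  Operating.Profit = c(300000, 150000, 140000, 133875, 162000),
  Operating.Margin = c(0.50, 0.30, 0.35, 0.35, 0.30)
)

# Margin implied by profit divided by sales, compared with the recorded margin.
implied_margin <- first_rows$Operating.Profit / first_rows$Total.Sales
all.equal(implied_margin, first_rows$Operating.Margin)

# Ratio of recorded sales to price times units for the same rows.
sales_ratio <- first_rows$Total.Sales /
  (first_rows$Price.per.Unit * first_rows$Units.Sold)
sales_ratio
```

For these five rows the recorded margin matches profit divided by sales, while Total.Sales is ten times Price.per.Unit multiplied by Units.Sold. Whether that factor holds across the whole file, and why, is not addressed in the source, so it is worth checking before relying on the revenue figures.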

sort(table(data_clean$Sales.Method), decreasing = T)
## 
##   Online   Outlet In-store 
##     4889     3019     1740

Online is the most common sales method with 4,889 records, and In-store is the least common with 1,740 records.

3.1 Zero Sales

zero_sales <- data_clean %>%
    filter(Units.Sold == 0)  # Units.Sold is numeric, so compare with a number
zero_sales

On June 5, 2021 and June 11, 2021, Foot Locker in Omaha (Midwest region) recorded no sales of Women’s Athletic Footwear. One possible explanation is that Foot Locker was out of stock of this product on those days. The Adidas sales team should therefore make sure that products remain available in every offline and online store.
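When filtering for zero sales, comparing the numeric Units.Sold column with a number rather than a string avoids R's implicit conversion of the number to text. A small sketch with made-up values:

```r
# Comparing a number with a string makes R convert the number to text first.
0 == "0"            # TRUE, because 0 is converted to "0"
100000 == "100000"  # FALSE, because 100000 is converted to "1e+05"

# Made-up units sold; compare against numeric 0 instead.
units <- c(120, 0, 35, 0)
zero_rows <- which(units == 0)
zero_rows
```

The string comparison happens to work for zero, but it can fail silently for other values, so the numeric comparison is the safer habit.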

3.2 Correlation Between Units Sold and Profit

cor(data_clean$Units.Sold, data_clean$Operating.Profit)
## [1] 0.8923794

The correlation between units sold and operating profit is 0.892. Because this value is close to 1, the two variables have a strong positive correlation: when units sold go up, profit tends to go up, and when units sold go down, profit tends to go down. A marketing strategy that increases sales is therefore likely to help maintain high profits.
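By default cor() computes the Pearson correlation coefficient. The sketch below reproduces it from its definition on made-up numbers (the vectors are hypothetical, not taken from the dataset):

```r
# Hypothetical units sold and operating profit for five records.
x <- c(100, 200, 300, 400, 500)
y <- c(40, 90, 110, 170, 190)

# Pearson r: covariation of x and y divided by the product of their spreads.
r_manual <- sum((x - mean(x)) * (y - mean(y))) /
  sqrt(sum((x - mean(x))^2) * sum((y - mean(y))^2))

# Built-in equivalent (method = "pearson" is the default).
r_builtin <- cor(x, y)

c(manual = r_manual, builtin = r_builtin)
```

A value near 1 means the points lie close to an upward-sloping straight line. It describes association and does not by itself show that one variable causes the other.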

4 Plot Data

The following plots visualize the Adidas sales data.

4.1 Pie Chart

Pie charts show each category as a percentage of a whole at a single point in time.

The pie charts below show the sales methods used in the regions with the most and the fewest sales records.

4.1.1 Pie Chart of West Region

pie_data <- data_clean %>% 
  filter(Region == "West") %>%
  group_by(Sales.Method) %>% 
  summarise(amnt = n()) %>% 
  ungroup() %>% 
  arrange(desc(Sales.Method)) %>%
  mutate(prop = amnt / sum(amnt) *100) %>%
  mutate(ypos = cumsum(prop)- 0.5*prop )

pie_data <- pie_data %>%
  mutate(label = glue(
    "Method Purchasing : {Sales.Method}
     Amount : {comma(amnt)}"))


pie=plot_ly(pie_data, labels = ~Sales.Method, values = ~prop ,type = "pie",
            textposition = 'inside',
            textinfo = 'label+percent',
            hoverinfo = 'text',
            text = ~paste(label),
            showlegend = TRUE)
pie <- pie %>% layout(title = 'Purchasing Method in the West Region',
                 xaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE),
                 yaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE))

pie

Online sales in the West region have the largest share at 47.1% (1,152 records), and In-store sales have the smallest share at 16.3% (398 records).
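The percentages can be checked by hand from the counts. In the sketch below the Online and In-store counts come from the text, and the Outlet count is derived as the remainder of the 2,448 West records reported earlier:

```r
# Counts for the West region. Online and In-store come from the text;
# Outlet is derived as the remainder of the 2,448 West records.
west_counts <- c(Online = 1152, Outlet = 2448 - 1152 - 398, `In-store` = 398)

# Share of each sales method as a percentage of all West records.
west_pct <- round(100 * west_counts / sum(west_counts), 1)
west_pct
```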

4.1.2 Pie Chart of Southeast Region

pie_data1 <- data_clean %>% 
  filter(Region == "Southeast") %>%
  group_by(Sales.Method) %>% 
  summarise(amnt = n()) %>% 
  ungroup() %>% 
  arrange(desc(Sales.Method)) %>%
  mutate(prop = amnt / sum(amnt) *100) %>%
  mutate(ypos = cumsum(prop)- 0.5*prop )

pie_data1 <- pie_data1 %>%
  mutate(label = glue(
    "Method Purchasing : {Sales.Method}
     Amount : {comma(amnt)}"))


pie1=plot_ly(pie_data1, labels = ~Sales.Method, values = ~prop ,type = "pie",
            textposition = 'inside',
            textinfo = 'label+percent',
            hoverinfo = 'text',
            text = ~paste(label),
            showlegend = TRUE)
pie1 <- pie1 %>% layout(title = 'Purchasing Method in the Southeast Region',
                 xaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE),
                 yaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE))

pie1

Online sales in the Southeast region have the largest share at 64.7% (792 records), and In-store sales have the smallest share at 15.8% (194 records).

In both samples the most popular method is online, probably because buying online takes little effort: customers only need a phone or another internet-connected device. The company can support these customers with a well-designed online shop and detailed product descriptions, so that customers can choose products more easily and online sales can grow.

In-store has the fewest records. One possible reason is that the stores are too far from where customers live. To increase in-store sales, the company could offer special promotions for purchases made in certain stores.

4.2 Plot Trend

Trend plots illustrate how data change over a period of time. They plot two variables (x, y), where x is time and y is the value measured at that time.

trend_Product <- data_clean %>% 
  # Note: the strict inequalities exclude 1 October and 31 December 2021
  filter(Invoice.Date > "2021-10-01" & Invoice.Date < "2021-12-31") %>% 
  group_by(Invoice.Date) %>% 
  summarise(sum = sum(Units.Sold)) %>% 
  ungroup()%>%
  mutate(
    label3 = glue(
      "Invoice.Date: {Invoice.Date}
      Product Count: {comma(sum)}")
  )

plot_trend <- ggplot(data = trend_Product, mapping =  aes(x = Invoice.Date, 
                                         y = sum)) +
  geom_line(color = 'blue') +
  geom_point(aes(text = label3), color='blue')+
  scale_x_date(date_labels = "%b %Y")+
  labs(
    title = "Units Sold Trend, Last 3 Months",
    x = "Invoice.Date",
    y = "Product Count")+
  theme_minimal()


ggplotly(plot_trend, tooltip = "text")

From the trend plot, the last three months follow a similar pattern. Sales peaked on October 17, 2021, November 17, 2021, and December 16, 2021. A likely explanation is that many people in the US had just received their salary around these dates, so sales rose sharply. To maintain or increase sales, the company could run a payday promotion.

Sales toward the end of each month are comparatively low, which suggests a recurring monthly pattern in the data. To increase sales at the end of the month, the author suggests that the company run a month-end promotion.
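The date filter used for this plot relies on strict inequalities, so the boundary dates themselves are left out. The sketch below, on a handful of made-up dates, shows the difference between strict and inclusive bounds:

```r
# Made-up invoice dates around the window boundaries.
dates <- as.Date(c("2021-09-30", "2021-10-01", "2021-10-02",
                   "2021-12-30", "2021-12-31"))
start <- as.Date("2021-10-01")
end   <- as.Date("2021-12-31")

# Strict bounds drop records dated exactly on start or end.
strict <- dates[dates > start & dates < end]

# Inclusive bounds keep the full October to December window.
inclusive <- dates[dates >= start & dates <= end]

length(strict)
length(inclusive)
```

Which version is appropriate depends on whether 1 October and 31 December are meant to be part of the "last 3 months"; the figures quoted in this notebook come from the strict version.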

4.3 Profit Plot

data_profit <- data_clean %>% 
  # Note: the strict inequalities exclude 1 October and 31 December 2021
  filter(Invoice.Date > "2021-10-01" & Invoice.Date < "2021-12-31") %>% 
  group_by(Product) %>% 
  summarise(sum = sum(Operating.Profit)) %>% 
  ungroup() %>% 
  arrange(-sum) %>% 
  mutate(label3 = glue("Category: {Product}
                       Total Profit: {comma(sum)}")) 

plot3 <- ggplot(data_profit,
                aes(x = sum,
                    y = reorder(Product, sum),
                    color = sum,
                    text = label3)) +
  scale_color_continuous(low = "red",
                         high = "black") +
  geom_point(size = 3) + 
  geom_segment(aes(x = 0,
                   xend = sum,
                   yend = Product),
               size = 1.5) +
  labs(title = "Total Profit by Adidas Product in the Last 3 Months",
       x = "Total Profit",
       y = NULL)+
  theme_minimal()+
  theme(plot.title = element_text(face="bold"), # make the plot title bold
        legend.position = "none"
  )

ggplotly(plot3, tooltip = "text")

Over the last 3 months, the highest profit was in the Men’s Street Footwear category, with a total profit of $17,602,900, and the lowest was in Women’s Athletic Footwear. The marketing team could promote the low-profit products to improve their returns.

4.4 Sales Plot

data_retailer <- data_clean %>% 
  # Note: the strict inequalities exclude 1 October and 31 December 2021
  filter(Invoice.Date > "2021-10-01" & Invoice.Date < "2021-12-31") %>% 
  group_by(Retailer) %>% 
  summarise(sum = sum(Units.Sold)) %>% 
  ungroup() %>%
  mutate(label4 = glue(
    "Retailer : {Retailer}
    Product Count: {comma(sum)}"
  ))


plot4 <- ggplot(data = data_retailer, aes(x = sum, 
                                       y = reorder(Retailer, sum), 
                                       text = label4)) +
  geom_col(aes(fill = sum)) +
  scale_fill_gradient(low="pink", high="maroon") +
  labs(title = "Units Sold by Retailer in the Last 3 Months",
       x = "Units Sold",
       y = "Retailer") +
  theme_minimal() +
  theme(legend.position = "none") 

ggplotly(plot4, tooltip = "text")

Over the last 3 months, Foot Locker had the highest retail sales with 132,001 units sold, and Walmart had the lowest with 26,617 units sold. The Adidas marketing team needs to analyze further why Walmart sells so few Adidas products. Perhaps competing brands at Walmart are more attractive to customers. One possible response is a campaign to convince the public of Adidas’s quality, up-to-date designs, and prices that match that quality.