An Adidas sales dataset is a collection of data that includes information on the sales of Adidas products. This type of dataset may include details such as the number of units sold, the total sales revenue, the location of the sales, the type of product sold, and any other relevant information.
Adidas sales data can be useful for a variety of purposes, such as analyzing sales trends, identifying successful products or marketing campaigns, and developing strategies for future sales. It can also be used to compare Adidas sales to those of competitors, or to analyze the effectiveness of different marketing or sales channels.
There are a variety of sources that could potentially provide an Adidas sales dataset, including Adidas itself, market research firms, government agencies, or other organizations that track sales data. The specific data points included in an Adidas sales dataset may vary depending on the source and the purpose for which it is being used.
We’ll set up caching for this notebook, since some of the code we will write can be computationally expensive.
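The setup chunk itself is not shown here. As a minimal sketch, assuming the notebook is an R Markdown document rendered with knitr, caching could be switched on for all chunks like this:

```r
# Assumption: the notebook is rendered with knitr (R Markdown).
library(knitr)

# Cache the results of every chunk so unchanged chunks are not re-run
opts_chunk$set(cache = TRUE)
```

With this option set, a chunk is only re-evaluated when its code changes.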
The required packages are loaded into this workspace using the library() function:
library(dplyr)
library(lubridate)
library(ggplot2)
library(plotly)
library(glue)
library(ggpubr)
library(scales)
The libraries loaded above are used throughout the analysis.
data <- read.csv("data input/Adidas Sales.csv", header = TRUE)
head(data)
Description of the dataset:
State : State in the US where the purchase was made
Price.per.Unit : Product price per unit
Units.Sold : Total units sold
Total.Sales : Total sales revenue of each category
Operating.Profit : Profit from sales
Operating.Margin : Margin earned in each category
Sales.Method : Product purchase method
The data types before any changes are made are as follows:
glimpse(data)
## Rows: 9,648
## Columns: 13
## $ Retailer <chr> "Foot Locker", "Foot Locker", "Foot Locker", "Foot Lo…
## $ Retailer.ID <int> 1185732, 1185732, 1185732, 1185732, 1185732, 1185732,…
## $ Invoice.Date <chr> "2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04…
## $ Region <chr> "Northeast", "Northeast", "Northeast", "Northeast", "…
## $ State <chr> "New York", "New York", "New York", "New York", "New …
## $ City <chr> "New York", "New York", "New York", "New York", "New …
## $ Product <chr> "Men's Street Footwear", "Men's Athletic Footwear", "…
## $ Price.per.Unit <dbl> 50, 50, 40, 45, 60, 50, 50, 50, 40, 45, 60, 50, 50, 5…
## $ Units.Sold <dbl> 1200, 1000, 1000, 850, 900, 1000, 1250, 900, 950, 825…
## $ Total.Sales <dbl> 600000, 500000, 400000, 382500, 540000, 500000, 62500…
## $ Operating.Profit <dbl> 300000.0, 150000.0, 140000.0, 133875.0, 162000.0, 125…
## $ Operating.Margin <dbl> 0.50, 0.30, 0.35, 0.35, 0.30, 0.25, 0.50, 0.30, 0.35,…
## $ Sales.Method <chr> "In-store", "In-store", "In-store", "In-store", "In-s…
Select the variables to be used and convert them to appropriate data types:
data_clean <- data %>%
  select(-c(Retailer.ID)) %>%
  mutate(
    Retailer = as.factor(Retailer),
    Invoice.Date = ymd(Invoice.Date),
    Region = as.factor(Region),
    State = as.factor(State),
    City = as.factor(City),
    Product = as.factor(Product),
    Sales.Method = as.factor(Sales.Method)
  )
glimpse(data_clean)
## Rows: 9,648
## Columns: 12
## $ Retailer <fct> Foot Locker, Foot Locker, Foot Locker, Foot Locker, F…
## $ Invoice.Date <date> 2020-01-01, 2020-01-02, 2020-01-03, 2020-01-04, 2020…
## $ Region <fct> Northeast, Northeast, Northeast, Northeast, Northeast…
## $ State <fct> New York, New York, New York, New York, New York, New…
## $ City <fct> New York, New York, New York, New York, New York, New…
## $ Product <fct> Men's Street Footwear, Men's Athletic Footwear, Women…
## $ Price.per.Unit <dbl> 50, 50, 40, 45, 60, 50, 50, 50, 40, 45, 60, 50, 50, 5…
## $ Units.Sold <dbl> 1200, 1000, 1000, 850, 900, 1000, 1250, 900, 950, 825…
## $ Total.Sales <dbl> 600000, 500000, 400000, 382500, 540000, 500000, 62500…
## $ Operating.Profit <dbl> 300000.0, 150000.0, 140000.0, 133875.0, 162000.0, 125…
## $ Operating.Margin <dbl> 0.50, 0.30, 0.35, 0.35, 0.30, 0.25, 0.50, 0.30, 0.35,…
## $ Sales.Method <fct> In-store, In-store, In-store, In-store, In-store, In-…
colSums(is.na(data_clean))
##         Retailer     Invoice.Date           Region            State
##                0                0                0                0
##             City          Product   Price.per.Unit       Units.Sold
##                0                0                0                0
##      Total.Sales Operating.Profit Operating.Margin     Sales.Method
##                0                0                0                0
The dataset has no missing values, so the analysis can proceed without any missing-value handling.
dim(data_clean)
## [1] 9648 12
The dataset contains 9,648 rows and 12 variables.
sort(table(data_clean$Retailer), decreasing = T)
##
##   Foot Locker     West Gear Sports Direct        Kohl's        Amazon
##          2637          2374          2032          1030           949
##       Walmart
##           626
The retailer with the most sales transactions is Foot Locker, followed by West Gear. The retailer with the fewest sales transactions is Walmart.
c(min(data_clean$Invoice.Date), max(data_clean$Invoice.Date))
## [1] "2020-01-01" "2021-12-31"
The data covers the period from 1 January 2020 to 31 December 2021.
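The minimum and maximum dates only give the endpoints of the period. As a small sketch on hypothetical toy dates (in the notebook the input would be data_clean$Invoice.Date), one way to check whether every calendar day in between has at least one record is:

```r
# Toy invoice dates (hypothetical values, not taken from the dataset)
invoice_dates <- as.Date(c("2020-01-01", "2020-01-02", "2020-01-04", "2020-01-04"))

# Every calendar day between the first and the last invoice
all_days <- seq(min(invoice_dates), max(invoice_dates), by = "day")

# Days inside the range that have no recorded invoice
missing_days <- all_days[!all_days %in% invoice_dates]
print(missing_days)
```

An empty result would mean that the range is covered without gaps.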
sort(table(data_clean$Region), decreasing = T)
##
##      West Northeast   Midwest     South Southeast
##      2448      2376      1872      1728      1224
The West region has the most Adidas purchase transactions and the Southeast region has the fewest.
The following states are recorded in the Adidas sales dataset:
unique(data_clean$State)
## [1] New York Texas California Illinois Pennsylvania
## [6] Nevada Colorado Washington Florida Minnesota
## [11] Montana Tennessee Nebraska Alabama Maine
## [16] Alaska Hawaii Wyoming Virginia Michigan
## [21] Missouri Utah Oregon Louisiana Idaho
## [26] Arizona New Mexico Georgia South Carolina North Carolina
## [31] Ohio Kentucky Mississippi Arkansas Oklahoma
## [36] Kansas South Dakota North Dakota Iowa Wisconsin
## [41] Indiana West Virginia Maryland Delaware New Jersey
## [46] Connecticut Rhode Island Massachusetts Vermont New Hampshire
## 50 Levels: Alabama Alaska Arizona Arkansas California Colorado ... Wyoming
The following cities are recorded in the Adidas sales dataset:
unique(data_clean$City)
## [1] New York Houston San Francisco Los Angeles Chicago
## [6] Dallas Philadelphia Las Vegas Denver Seattle
## [11] Miami Minneapolis Billings Knoxville Omaha
## [16] Birmingham Portland Anchorage Honolulu Orlando
## [21] Albany Cheyenne Richmond Detroit St. Louis
## [26] Salt Lake City New Orleans Boise Phoenix Albuquerque
## [31] Atlanta Charleston Charlotte Columbus Louisville
## [36] Jackson Little Rock Oklahoma City Wichita Sioux Falls
## [41] Fargo Des Moines Milwaukee Indianapolis Baltimore
## [46] Wilmington Newark Hartford Providence Boston
## [51] Burlington Manchester
## 52 Levels: Albany Albuquerque Anchorage Atlanta Baltimore ... Wilmington
sort(table(data_clean$Product), decreasing = T)
##
##   Men's Athletic Footwear     Men's Street Footwear           Women's Apparel
##                      1610                      1610                      1608
##   Women's Street Footwear             Men's Apparel Women's Athletic Footwear
##                      1608                      1606                      1606
Men’s Athletic Footwear and Men’s Street Footwear are the most frequently recorded Adidas product types in the US, with 1,610 sales records each, while Men’s Apparel and Women’s Athletic Footwear have the fewest, with 1,606 records each. Note that table() counts rows (sales records), not units sold.
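Because table() counts rows, a product with many records is not necessarily the one with the most units sold. The following sketch uses a hypothetical toy data frame whose column names mirror data_clean to show both figures side by side:

```r
library(dplyr)

# Toy records (hypothetical values); column names mirror data_clean
toy <- data.frame(
  Product    = c("Men's Apparel", "Men's Apparel", "Women's Apparel"),
  Units.Sold = c(100, 250, 900)
)

product_summary <- toy %>%
  group_by(Product) %>%
  summarise(
    records     = n(),             # what table() counts
    total_units = sum(Units.Sold)  # units actually sold
  ) %>%
  arrange(desc(total_units))

print(product_summary)
```

In this toy example the product with fewer records has more units sold, which is why the two measures should be reported separately.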
summary(data_clean$Price.per.Unit)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##    7.00   35.00   45.00   45.22   55.00  110.00
The cheapest Adidas product costs $7 and the most expensive costs $110. The average price of an Adidas product is $45.22.
summary(data_clean$Units.Sold)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##     0.0   106.0   176.0   256.9   350.0  1275.0
On average, 257 units are sold per sales record. Some records show no units sold at all, and the largest single record is 1,275 units.
summary(data_clean$Total.Sales)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##       0    4254    9576   93273  150000  825000
The average total sales is $93,273 per sales record. Some records show no revenue at all, and the highest total sales in a single record is $825,000.
summary(data_clean$Operating.Profit)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##       0    1922    4371   34425   52063  390000
The average operating profit is $34,425 per sales record. Some records show no profit at all, and the largest profit in a single record is $390,000.
summary(data_clean$Operating.Margin)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   0.100   0.350   0.410   0.423   0.490   0.800
The average operating margin is 42.3%. The lowest margin is 10% and the highest is 80%.
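In the first five rows shown in the glimpse() output, the operating margin equals operating profit divided by total sales. The sketch below checks this on those five rows only; whether the relation holds for every row of the dataset is an assumption that would need to be verified on data_clean.

```r
# First five values of each column, copied from the glimpse() output
total_sales      <- c(600000, 500000, 400000, 382500, 540000)
operating_profit <- c(300000, 150000, 140000, 133875, 162000)
reported_margin  <- c(0.50, 0.30, 0.35, 0.35, 0.30)

# Margin recomputed as profit divided by sales
computed_margin <- operating_profit / total_sales

# Compare with a numeric tolerance rather than exact equality
print(all.equal(computed_margin, reported_margin))
```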
sort(table(data_clean$Sales.Method), decreasing = T)
##
##   Online   Outlet In-store
##     4889     3019     1740
Most Adidas purchases are made online, with 4,889 transactions. In-store purchases are the least common, with 1,740 transactions.
zero_sales <- data_clean %>%
  filter(Units.Sold == 0)
zero_sales
On June 5, 2021 and June 11, 2021, Women’s Athletic Footwear did not sell at Foot Locker in Omaha (Midwest region). Foot Locker may have been out of stock for this product on those days. The Adidas sales team should therefore monitor product availability in every offline and online store.
cor(data_clean$Units.Sold, data_clean$Operating.Profit)
## [1] 0.8923794
The correlation between units sold and operating profit is 0.892. This is a strong positive correlation because the value is close to 1: when sales go up, profits tend to go up, and when sales decrease, profits tend to decrease. Maintaining high profits therefore calls for a marketing strategy that increases sales.
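To make the correlation value more concrete, the sketch below computes the Pearson coefficient by hand on hypothetical toy values and compares it with cor(). It also calls cor.test(), which reports a p-value and confidence interval alongside the coefficient. The numbers are illustrative and not taken from the dataset.

```r
# Toy values (hypothetical); in the notebook these would be
# data_clean$Units.Sold and data_clean$Operating.Profit
units  <- c(100, 200, 300, 400, 500)
profit <- c(1200, 1900, 3100, 4200, 4800)

# Built-in Pearson correlation
r_builtin <- cor(units, profit)

# Same coefficient from its definition:
# covariance term divided by the product of the spread terms
r_manual <- sum((units - mean(units)) * (profit - mean(profit))) /
  sqrt(sum((units - mean(units))^2) * sum((profit - mean(profit))^2))

# Significance test for the correlation
ct <- cor.test(units, profit)

print(c(builtin = r_builtin, manual = r_manual))
print(ct$p.value)
```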
The following plots present the Adidas sales data.
Pie charts show how a whole is divided into parts, with each slice representing a percentage at a single point in time.
The pie charts below show the purchasing methods for the regions with the most and the fewest sales transactions.
pie_data <- data_clean %>%
  filter(Region == "West") %>%
  group_by(Sales.Method) %>%
  summarise(amnt = n()) %>%
  ungroup() %>%
  arrange(desc(Sales.Method)) %>%
  mutate(prop = amnt / sum(amnt) * 100) %>%
  mutate(ypos = cumsum(prop) - 0.5 * prop)
pie_data <- pie_data %>%
  mutate(label = glue(
    "Method Purchasing : {Sales.Method}
    Amount : {comma(amnt)}"))
pie <- plot_ly(pie_data, labels = ~Sales.Method, values = ~prop, type = "pie",
               textposition = 'inside',
               textinfo = 'label+percent',
               hoverinfo = 'text',
               text = ~paste(label),
               showlegend = TRUE)
pie <- pie %>% layout(title = 'Method Purchasing on West Region',
                      xaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE),
                      yaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE))
pie
Online sales in the West region have the largest share, 47.1% or 1,152 transactions. In-store sales have the smallest share, 16.3% or 398 transactions.
pie_data1 <- data_clean %>%
  filter(Region == "Southeast") %>%
  group_by(Sales.Method) %>%
  summarise(amnt = n()) %>%
  ungroup() %>%
  arrange(desc(Sales.Method)) %>%
  mutate(prop = amnt / sum(amnt) * 100) %>%
  mutate(ypos = cumsum(prop) - 0.5 * prop)
pie_data1 <- pie_data1 %>%
  mutate(label = glue(
    "Method Purchasing : {Sales.Method}
    Amount : {comma(amnt)}"))
pie1 <- plot_ly(pie_data1, labels = ~Sales.Method, values = ~prop, type = "pie",
                textposition = 'inside',
                textinfo = 'label+percent',
                hoverinfo = 'text',
                text = ~paste(label),
                showlegend = TRUE)
pie1 <- pie1 %>% layout(title = 'Method Purchasing on Southeast Region',
                        xaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE),
                        yaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE))
pie1
Online sales in the Southeast region have the largest share, 64.7% or 792 transactions. In-store sales have the smallest share, 15.8% or 194 transactions.
In both regions the most popular method was online, probably because buying online takes little effort: customers only need a phone or another internet-connected device. The company can therefore help customers by presenting its online shop well and describing each product in detail, so that customers can more easily choose the product they want, which may also increase online sales.
In-store purchases have the fewest transactions; one possible reason is that the stores are too far from where customers live. To increase in-store sales, the company could offer special promotions for in-store purchases.
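The West and Southeast summaries above repeat the same steps with a different region filter. As a sketch, the shared logic could be wrapped in a small helper; method_share() is a hypothetical name, not part of the original notebook, and the toy data frame only mimics the Region and Sales.Method columns of data_clean.

```r
library(dplyr)

# Hypothetical helper: share of each sales method within one region
method_share <- function(df, region) {
  df %>%
    filter(Region == region) %>%
    count(Sales.Method, name = "amnt") %>%   # transactions per method
    mutate(prop = amnt / sum(amnt) * 100)    # percentage of the region total
}

# Toy data (hypothetical values); column names mirror data_clean
toy <- data.frame(
  Region       = c("West", "West", "West", "West", "Southeast", "Southeast"),
  Sales.Method = c("Online", "Online", "Outlet", "In-store", "Online", "Outlet")
)

west <- method_share(toy, "West")
print(west)
```

Calling method_share(data_clean, "West") and method_share(data_clean, "Southeast") would then produce the inputs for both pie charts without duplicating the pipeline.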
Trend plots illustrate how data changes over a period of time. They relate two variables (x, y), where x is time and y is the value observed at that time.
trend_Product <- data_clean %>%
  filter(Invoice.Date > "2021-10-01" & Invoice.Date < "2021-12-31") %>%
  group_by(Invoice.Date) %>%
  summarise(sum = sum(Units.Sold)) %>%
  ungroup() %>%
  mutate(
    label3 = glue(
      "Invoice.Date: {Invoice.Date}
      Product Count: {comma(sum)}")
  )
plot_trend <- ggplot(data = trend_Product, mapping = aes(x = Invoice.Date,
                                                         y = sum)) +
  geom_line(color = 'blue') +
  geom_point(aes(text = label3), color = 'blue') +
  scale_x_date(date_labels = "%b %Y") +
  labs(
    title = "Trend Last 3 Month",
    x = "Invoice.Date",
    y = "Product Count") +
  theme_minimal()
ggplotly(plot_trend, tooltip = "text")
From the trend plot, the last 3 months follow almost the same pattern. October 17, 2021, November 17, 2021 and December 16, 2021 had very high purchase volumes. Most likely many people in the US had received their salary by these dates, so sales increased sharply. To maintain or increase sales, the company could run a payday promotion.
Sales at the end of each month are relatively low, which suggests that the data is seasonal. To increase sales at the end of the month, the author suggests that the company run a month-end promotion.
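One way to examine the mid-month peak and the end-of-month dip more directly is to aggregate units sold by day of the month. The sketch below does this on hypothetical toy values; in the notebook the input would be data_clean filtered to the last 3 months.

```r
library(dplyr)
library(lubridate)

# Toy records (hypothetical values); column names mirror data_clean
toy <- data.frame(
  Invoice.Date = as.Date(c("2021-10-17", "2021-10-30", "2021-11-17",
                           "2021-11-30", "2021-12-16")),
  Units.Sold   = c(900, 50, 950, 40, 1000)
)

by_day <- toy %>%
  mutate(day_of_month = day(Invoice.Date)) %>%   # 1..31
  group_by(day_of_month) %>%
  summarise(total_units = sum(Units.Sold)) %>%
  arrange(desc(total_units))

print(by_day)
```

If the same days of the month rank at the top across several months, that supports the monthly pattern described above.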
data_profit <- data_clean %>%
  filter(Invoice.Date > "2021-10-01" & Invoice.Date < "2021-12-31") %>%
  group_by(Product) %>%
  summarise(sum = sum(Operating.Profit)) %>%
  ungroup() %>%
  arrange(-sum) %>%
  mutate(label3 = glue("Category: {Product}
                       Total Profit: {comma(sum)}"))
plot3 <- ggplot(data_profit,
                aes(x = sum,
                    y = reorder(Product, sum),
                    color = sum,
                    text = label3)) +
  scale_color_continuous(low = "red",
                         high = "black") +
  geom_point(size = 3) +
  geom_segment(aes(x = 0,
                   xend = sum,
                   yend = Product),
               size = 1.5) +
  labs(title = "Total Profit of Product Adidas in Last 3 Month",
       x = "Total Profit",
       y = NULL) +
  theme_minimal() +
  theme(plot.title = element_text(face = "bold"), # make the plot title bold
        legend.position = "none"
  )
ggplotly(plot3, tooltip = "text")
The highest profit for the last 3 months was in the Men’s Street Footwear category, with a total profit of $17,602,900, and the lowest was in Women’s Athletic Footwear. The marketing team could promote the products with low profits to improve their returns.
data_retailer <- data_clean %>%
  filter(Invoice.Date > "2021-10-01" & Invoice.Date < "2021-12-31") %>%
  group_by(Retailer) %>%
  summarise(sum = sum(Units.Sold)) %>%
  ungroup() %>%
  mutate(label4 = glue(
    "Retailer : {Retailer}
    Product Count: {comma(sum)}"
  ))
plot4 <- ggplot(data = data_retailer, aes(x = sum,
                                          y = reorder(Retailer, sum),
                                          text = label4)) +
  geom_col(aes(fill = sum)) +
  scale_fill_gradient(low = "pink", high = "maroon") +
  labs(title = "Amount Sales from Retailer in Last 3 Month",
       x = "Amount Sales",
       y = "Retailer") +
  theme_minimal() +
  theme(legend.position = "none")
ggplotly(plot4, tooltip = "text")
The highest retail sales for the last 3 months were at Foot Locker with 132,001 units, and the lowest were at Walmart with 26,617 units. The Adidas marketing team needs to analyze further why Walmart has low sales of Adidas products. Competing brands may be more desirable at Walmart; one possible response is a campaign to convince the public of the quality of Adidas products, their up-to-date designs, prices that match the quality, and so on.