Introduction

Nike and Adidas have led the global athletic shoe market for several decades, owing to their designs, quality, and distinctive product features. Besides athletic footwear, both now produce apparel, accessories, and equipment under their competing brands. The objective of this report is to compare the cost price of Nike and Adidas products over a selected range, to see which brand offers better value with regard to purchasing power.

Problem Statement

Data

Data Source: https://www.kaggle.com/australiastats/australia-top-sport-chains

About data file: The dataset covers two of Australia's top sports brands, Nike and Adidas. It contains postcode-level sales statistics for Australia, including the cost price and sale price, the number of units sold under each category (men, women, kids, etc.), and the daily transactions for the period 2016-2018.

The original dataset includes seven variables and 76,466 rows.

Descriptive Statistics and Visualisation

Description of Variables

The variables are as follows:

Date: sales dates, from 01/01/2016 to 01/08/2017
Chain: the two brand names, Nike and Adidas
Post code: postcode numbers for the Australian states
Category: all products are classified under ten categories (Accessories, Equipment, Groceries, Home, Hosiery, Juniors, Kids, Men, Shoes, Women)
Total unit: number of units sold each day under each category and each postcode
Sale price: discounted price for both brands, in Australian dollars
Cost price: unit cost of each product

head(VICKids)
##        Date  Chain Postcode Category Units Sale_Price Cost_Price State
## 1 1/01/2016   Nike     3550     Kids    65       2.19       3.92   VIC
## 2 1/01/2016   Nike     3018     Kids    68       1.95       2.75   VIC
## 3 1/01/2016   Nike     3550     Kids    50       2.74       3.18   VIC
## 4 1/01/2016   Nike     3018     Kids    17       2.47       3.28   VIC
## 5 1/01/2016 Adidas     3429     Kids    19       3.47       3.95   VIC
## 6 1/01/2016 Adidas     3630     Kids     1       1.00       3.25   VIC
##       Suburb
## 1    Bendigo
## 2     Altona
## 3    Bendigo
## 4     Altona
## 5    Sunbury
## 6 Shepparton
VICKids %>% group_by(Chain) %>% summarise(Min = min(Cost_Price, na.rm = TRUE),
                                               Q1 = quantile(Cost_Price, probs = 0.25, na.rm = TRUE),
                                               Median = median(Cost_Price, na.rm = TRUE),
                                               Q3 = quantile(Cost_Price, probs = 0.75, na.rm = TRUE),
                                               Max = max(Cost_Price, na.rm = TRUE),
                                               IQR = Q3 - Q1,
                                               Mean = mean(Cost_Price, na.rm = TRUE),
                                               SD = sd(Cost_Price, na.rm = TRUE),
                                               n = n(),
                                               Missing = sum(is.na(Cost_Price)))
## # A tibble: 2 x 11
##   Chain    Min    Q1 Median    Q3   Max   IQR  Mean    SD     n Missing
##   <chr>  <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int>   <int>
## 1 Adidas  0.5   2.52   3.3   4.11    15  1.58  3.44  1.47   955       0
## 2 Nike    0.35  1.79   2.58  3.2     12  1.41  2.63  1.19  1524       0
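# Histogram of cost price (density scale); labels name the plotted variable
VICKids$Cost_Price %>% hist(col = "blue",
                            ylim = c(0,0.4),
                            xlim = c(0,20),
                            xlab = "Cost Price ($)",
                            main = "Victoria Kids Cost Price Distribution",
                            breaks = 10,
                            density = 20,
                            prob = TRUE)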
lines(density(VICKids$Cost_Price, adjust = 2), col = "red", lwd = 2)

summary(VICKids$Cost_Price)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.350   2.000   2.840   2.939   3.600  15.000
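
The boxplots that follow use two per-chain subsets, NikeVICKids and AdidasVICKids, whose construction is not shown in this report. A minimal sketch of how such subsets could be built, assuming a plain filter on the Chain column, is given below. It uses only the six rows printed by head(VICKids) as a stand-in, and the names vic_kids_head, nike_head and adidas_head are hypothetical.

```r
# Stand-in data: the six rows printed by head(VICKids) earlier in the report.
vic_kids_head <- data.frame(
  Date       = rep("1/01/2016", 6),
  Chain      = c("Nike", "Nike", "Nike", "Nike", "Adidas", "Adidas"),
  Postcode   = c(3550, 3018, 3550, 3018, 3429, 3630),
  Category   = rep("Kids", 6),
  Units      = c(65, 68, 50, 17, 19, 1),
  Sale_Price = c(2.19, 1.95, 2.74, 2.47, 3.47, 1.00),
  Cost_Price = c(3.92, 2.75, 3.18, 3.28, 3.95, 3.25),
  State      = rep("VIC", 6),
  Suburb     = c("Bendigo", "Altona", "Bendigo", "Altona", "Sunbury", "Shepparton"),
  stringsAsFactors = FALSE
)

# Assumed construction: one data frame per chain, filtered on Chain.
nike_head   <- subset(vic_kids_head, Chain == "Nike")
adidas_head <- subset(vic_kids_head, Chain == "Adidas")

# Row counts per chain in the stand-in data.
nrow(nike_head)
nrow(adidas_head)
```

Applied to the full VICKids data frame, the same subset() calls would presumably give the NikeVICKids and AdidasVICKids objects used in the boxplots.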
boxplot(NikeVICKids["Cost_Price"], las=2, main = "Victoria Kids Nike Cost price", ylab = "Cost Price", xlab = "Nike")

boxplot(AdidasVICKids["Cost_Price"], las=2, main = "Victoria Kids Adidas Cost Price", ylab = "Cost Price", xlab = "Adidas")

boxplot(VICKids["Cost_Price"], las=2, main = "Victoria Kids Cost Price", ylab = "Cost", xlab = "VIC Kids")

ggplot(VICKids,aes(x=Cost_Price)) + 
  geom_density(aes(group=Chain, colour=Chain, fill=Chain),alpha=0.3) +
  theme_bw()

VICKids %>% histogram(~Cost_Price | Chain, data = .,layout=c(1,2), main= 'Comparison of Price Distribution')

Hypothesis Testing

In the Q-Q plot, the dotted arcs mark the 95% confidence envelope for the normal quantiles. If the data are approximately normal, the points should fall inside this envelope, including in the tails of the distribution.
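
The code that produced the Q-Q plot is not included in this report. The sketch below shows one way such a plot could be drawn, assuming qqPlot() from the car package (the same package that provides leveneTest()). The data are synthetic and only stand in for a cost-price column; they are not the report's data.

```r
library(car)  # provides qqPlot(); assumed here, as the report's plotting call is not shown

set.seed(1)
# Synthetic, right-skewed stand-in for a cost-price column (NOT the report's data).
cost <- rlnorm(200, meanlog = 1, sdlog = 0.4)

# Sample quantiles against normal quantiles, with a pointwise 95% envelope.
# Points leaving the envelope in the tails indicate departure from normality.
qqPlot(cost, distribution = "norm", envelope = 0.95,
       ylab = "Cost price (synthetic)")
```

On the report's data, the same call could be made once per chain, for example on NikeVICKids$Cost_Price and AdidasVICKids$Cost_Price.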

leveneTest(Cost_Price ~ Chain, data = VICKids)
## Levene's Test for Homogeneity of Variance (center = median)
##         Df F value    Pr(>F)    
## group    1  29.544 6.002e-08 ***
##       2477                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
t.test(
  Cost_Price ~ Chain,
  data = VICKids,
  var.equal = FALSE,
  alternative = "two.sided"
)
## 
##  Welch Two Sample t-test
## 
## data:  Cost_Price by Chain
## t = 14.361, df = 1714.6, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.7007888 0.9224804
## sample estimates:
## mean in group Adidas   mean in group Nike 
##             3.438209             2.626575
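
As a cross-check, the Welch statistic can be recomputed by hand from the summary statistics reported earlier (group means from the t-test output, SD and n from the summary table). This is only a sketch: the SDs are rounded to two decimals in the table, so the results should land close to, but not exactly on, the values printed by t.test(). The variable names are illustrative.

```r
# Group summaries taken from the output shown in this report.
mean_adidas <- 3.438209; sd_adidas <- 1.47; n_adidas <- 955
mean_nike   <- 2.626575; sd_nike   <- 1.19; n_nike   <- 1524

# Per-group variance of the sample mean (s^2 / n).
v_adidas <- sd_adidas^2 / n_adidas
v_nike   <- sd_nike^2 / n_nike

# Welch t statistic: difference in means over its unpooled standard error.
se     <- sqrt(v_adidas + v_nike)
t_stat <- (mean_adidas - mean_nike) / se

# Welch-Satterthwaite approximation to the degrees of freedom.
df_welch <- (v_adidas + v_nike)^2 /
  (v_adidas^2 / (n_adidas - 1) + v_nike^2 / (n_nike - 1))

# Two-sided p-value and 95% confidence interval for the mean difference.
p_value <- 2 * pt(-abs(t_stat), df = df_welch)
ci <- (mean_adidas - mean_nike) + c(-1, 1) * qt(0.975, df = df_welch) * se

t_stat
df_welch
p_value
ci
```

Because Levene's test rejects equal variances, the unpooled standard error is used here, matching var.equal = FALSE in the call above.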

Hypothesis Testing Cont.

\[H_0: \mu_1 = \mu_2 \]

\[H_A: \mu_1 \ne \mu_2\]
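
The report does not label the two means. Assuming \(\mu_1\) is the mean cost price for Adidas and \(\mu_2\) the mean for Nike (the order in which t.test() lists the groups), the standard Welch test statistic for these hypotheses is

\[t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}\]

with degrees of freedom approximated by the Welch-Satterthwaite formula

\[\nu \approx \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}\]

where \(\bar{x}_i\), \(s_i\) and \(n_i\) are the sample mean, sample standard deviation and sample size of group \(i\).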

\[S = \sum^n_{i = 1}d^2_i\]

Discussion

Analysing the whole dataset, across the different categories and products collected, does not give enough evidence to conclude that either chain is consistently more expensive or cheaper. From the boxplots of the data we collected, Adidas appears slightly more expensive than Nike. When a t-test is performed on the complete dataset without any classification, Adidas is likewise found to be slightly more expensive than Nike. These results are based on the dataset collected for this investigation and may vary if more categories and products are included in the report.

References

https://www.kaggle.com/australiastats/australia-top-sport-chains