In the global athletic shoe market, Nike and Adidas have led the sports shoe segment for several decades thanks to their designs, quality and product features. Today, both produce apparel, accessories and equipment in addition to athletic footwear under their competing brands. The objective of this report is to compare the cost prices of Nike and Adidas products over a selected range to see which brand offers better value in terms of purchasing power.
Data source: https://www.kaggle.com/australiastats/australia-top-sport-chains
About the data file: The dataset covers Australia's top sports brands, Nike and Adidas. It contains postcode-level sales statistics for Australia, including the cost price and sale price, the number of units sold in each category (men, women, kids, etc.), and the day-by-day transactions for the period 2016-2018.
The original dataset contains seven variables and 76466 rows.
Description of Variables
The variables are as follows:
Date: the sales date, from 01/01/2016 to 01/08/2017
Chain: the brand name, Nike or Adidas
Post code: post code numbers for the states in Australia
Category: each product is classified under one of ten categories (Accessories, Equipment, Groceries, Home, Hosiery, Juniors, Kids, Men, Shoes, Women)
Total unit: number of units sold each day in each category and postcode
Sale price: discounted price for both brands, in Australian dollars
Cost price: unit cost price of each product
head(VICKids)
##        Date  Chain Postcode Category Units Sale_Price Cost_Price State     Suburb
## 1 1/01/2016   Nike     3550     Kids    65       2.19       3.92   VIC    Bendigo
## 2 1/01/2016   Nike     3018     Kids    68       1.95       2.75   VIC     Altona
## 3 1/01/2016   Nike     3550     Kids    50       2.74       3.18   VIC    Bendigo
## 4 1/01/2016   Nike     3018     Kids    17       2.47       3.28   VIC     Altona
## 5 1/01/2016 Adidas     3429     Kids    19       3.47       3.95   VIC    Sunbury
## 6 1/01/2016 Adidas     3630     Kids     1       1.00       3.25   VIC Shepparton
# Descriptive statistics of cost price for each chain
VICKids %>%
  group_by(Chain) %>%
  summarise(Min = min(Cost_Price, na.rm = TRUE),
            Q1 = quantile(Cost_Price, probs = 0.25, na.rm = TRUE),
            Median = median(Cost_Price, na.rm = TRUE),
            Q3 = quantile(Cost_Price, probs = 0.75, na.rm = TRUE),
            Max = max(Cost_Price, na.rm = TRUE),
            IQR = Q3 - Q1,
            Mean = mean(Cost_Price, na.rm = TRUE),
            SD = sd(Cost_Price, na.rm = TRUE),
            n = n(),
            Missing = sum(is.na(Cost_Price)))
## # A tibble: 2 x 11
##   Chain    Min    Q1 Median    Q3   Max   IQR  Mean    SD     n Missing
##   <chr>  <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int>   <int>
## 1 Adidas  0.5   2.52   3.3   4.11    15  1.58  3.44  1.47   955       0
## 2 Nike    0.35  1.79   2.58  3.2     12  1.41  2.63  1.19  1524       0
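# Histogram of cost price for both chains combined, on a density scale
# (axis label and title name the plotted variable, Cost_Price)
VICKids$Cost_Price %>% hist(col = "blue",
                            ylim = c(0, 0.4),
                            xlim = c(0, 20),
                            xlab = "Cost Price ($)",
                            main = "Cost Price, Victoria Kids",
                            breaks = 10,
                            density = 20,
                            prob = TRUE)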
lines(density(VICKids$Cost_Price, adjust = 2), col = "red", lwd = 2)
summary(VICKids$Cost_Price)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   0.350   2.000   2.840   2.939   3.600  15.000
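As a quick consistency check, the overall mean should equal the size-weighted average of the two chain means from the grouped summary above. The sketch below uses the rounded group means (3.44 and 2.63), so agreement is only expected up to rounding:

\[\bar{x} = \frac{n_{A}\,\bar{x}_{A} + n_{N}\,\bar{x}_{N}}{n_{A} + n_{N}} = \frac{955 \times 3.44 + 1524 \times 2.63}{955 + 1524} = \frac{7293.32}{2479} \approx 2.94\]

Here the subscripts A and N (notation introduced for this check) stand for Adidas and Nike. The result agrees with the reported overall mean of 2.939.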
boxplot(NikeVICKids["Cost_Price"], las=2, main = "Victoria Kids Nike Cost price", ylab = "Cost Price", xlab = "Nike")
boxplot(AdidasVICKids["Cost_Price"], las=2, main = "Victoria Kids Adidas Cost Price", ylab = "Cost Price", xlab = "Adidas")
boxplot(VICKids["Cost_Price"], las=2, main = "Victoria Kids Cost Price", ylab = "Cost", xlab = "VIC Kids")
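The boxplots can be read against the 1.5 x IQR rule, sketched here with the quartiles from the grouped summary above. This assumes the default whisker setting of boxplot() (range = 1.5); R draws the box from hinges, which can differ slightly from the quantile() values, so the fences below are approximate:

\[\text{Adidas: } Q_3 + 1.5 \times \mathrm{IQR} = 4.11 + 1.5 \times 1.58 = 6.48\]
\[\text{Nike: } Q_3 + 1.5 \times \mathrm{IQR} = 3.20 + 1.5 \times 1.41 = 5.315\]

The maxima of 15 (Adidas) and 12 (Nike) lie well above these upper fences, so both chains have high-end outliers. The lower fences (2.52 - 2.37 = 0.15 for Adidas and 1.79 - 2.115 < 0 for Nike) sit below the observed minima of 0.5 and 0.35, so no low-end outliers are expected.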
ggplot(VICKids,aes(x=Cost_Price)) +
geom_density(aes(group=Chain, colour=Chain, fill=Chain),alpha=0.3) +
theme_bw()
VICKids %>% histogram(~Cost_Price | Chain, data = .,layout=c(1,2), main= 'Comparison of Price Distribution')
In the Q-Q plot, the dotted arcs mark the 95% confidence interval for the normal quantiles. If the data are approximately normal, the points should fall inside this band, including in the tails of the distribution.
leveneTest(Cost_Price ~ Chain, data = VICKids)
## Levene's Test for Homogeneity of Variance (center = median)
##         Df F value    Pr(>F)
## group    1  29.544 6.002e-08 ***
##       2477
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
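For reference, the median-centred Levene statistic (the Brown-Forsythe variant, matching center = median in the output above) is usually written as follows. The notation is introduced here for illustration and does not come from the original analysis:

\[Z_{ij} = \left| Y_{ij} - \tilde{Y}_{i} \right|, \qquad W = \frac{N - k}{k - 1} \cdot \frac{\sum_{i=1}^{k} n_i \left( \bar{Z}_{i\cdot} - \bar{Z}_{\cdot\cdot} \right)^2}{\sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( Z_{ij} - \bar{Z}_{i\cdot} \right)^2}\]

Here \(Y_{ij}\) is the j-th cost price in chain i, \(\tilde{Y}_{i}\) is that chain's median, k = 2 chains and N = 955 + 1524 = 2479, which gives the degrees of freedom (k - 1, N - k) = (1, 2477) shown in the output. Since the p-value of 6.002e-08 is far below 0.05, the assumption of equal variances is rejected, which is consistent with setting var.equal = FALSE in the t-test.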
t.test(
Cost_Price ~ Chain,
data = VICKids,
var.equal = FALSE,
alternative = "two.sided"
)
##
## Welch Two Sample t-test
##
## data: Cost_Price by Chain
## t = 14.361, df = 1714.6, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.7007888 0.9224804
## sample estimates:
## mean in group Adidas mean in group Nike
## 3.438209 2.626575
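The hypotheses for the two-sample t-test, where \(\mu_1\) and \(\mu_2\) are the population mean cost prices of Adidas and Nike respectively:
\[H_0: \mu_1 = \mu_2 \]
\[H_A: \mu_1 \ne \mu_2\]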
\[S = \sum^n_{i = 1}d^2_i\]
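As a rough check on the Welch output above, the test statistic can be rebuilt from the grouped summary, taking group 1 as Adidas and group 2 as Nike (the order used in the t.test output). The sketch uses the rounded standard deviations 1.47 and 1.19, so it should only approximately match the reported t = 14.361:

\[t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{3.438 - 2.627}{\sqrt{\frac{1.47^2}{955} + \frac{1.19^2}{1524}}} \approx \frac{0.812}{0.0565} \approx 14.4\]

The same standard error reproduces the confidence interval, assuming a critical value of about 1.96 for roughly 1715 degrees of freedom:

\[(\bar{x}_1 - \bar{x}_2) \pm t^{*} \times \mathrm{SE} \approx 0.812 \pm 1.96 \times 0.0565 \approx (0.70,\ 0.92)\]

Both values agree with the reported output (t = 14.361 and a 95 percent interval of 0.7008 to 0.9225) up to rounding.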
From the analysis of the whole dataset that we collected, with its different categories and products, there is not enough evidence to prove that either one of the chains is costlier or cheaper. From the boxplots of the data we collected, Adidas appears slightly more expensive than Nike. When a t-test is performed on the complete dataset without any classification, Adidas is likewise found to be slightly more expensive than Nike. These results are based on the dataset that we collected for this investigation and may vary if more categories and products are included in the report.