Author: Yaqi Hu
Are there differences in pricing between women’s and men’s clothing items at Zara?
The analysis uses two variables: section, which indicates whether an item belongs to the men's or the women's range, and price, the price of each item at Zara.
mydata <- read.table("./zara.csv", header = TRUE, sep = ";", dec = ".")
head(mydata)
## Product.ID Product.Position Promotion Product.Category Seasonal Sales.Volume
## 1 185102 Aisle No Clothing No 2823
## 2 188771 Aisle No Clothing No 654
## 3 180176 End-cap Yes Clothing Yes 2220
## 4 112917 Aisle Yes Clothing Yes 1568
## 5 192936 End-cap No Clothing Yes 2942
## 6 117590 End-cap No Clothing No 2968
## brand url
## 1 Zara https://www.zara.com/us/en/basic-puffer-jacket-p06985450.html
## 2 Zara https://www.zara.com/us/en/tuxedo-jacket-p08896675.html
## 3 Zara https://www.zara.com/us/en/slim-fit-suit-jacket-p01564520.html
## 4 Zara https://www.zara.com/us/en/stretch-suit-jacket-p01564300.html
## 5 Zara https://www.zara.com/us/en/double-faced-jacket-p08281477.html
## 6 Zara https://www.zara.com/us/en/contrasting-collar-jacket-p06987331.html
## sku name
## 1 272145190-250-2 BASIC PUFFER JACKET
## 2 324052738-800-46 TUXEDO JACKET
## 3 335342680-800-44 SLIM FIT SUIT JACKET
## 4 328303236-420-44 STRETCH SUIT JACKET
## 5 312368260-800-2 DOUBLE FACED JACKET
## 6 320298385-807-2 CONTRASTING COLLAR JACKET
## description
## 1 Puffer jacket made of tear-resistant ripstop fabric. High collar and adjustable long sleeves with adhesive straps. Welt pockets at hip. Adjustable hem with side elastics. Front zip closure.
## 2 Straight fit blazer. Pointed lapel collar and long sleeves with buttoned cuffs. Welt pockets at hip and interior pocket. Central back vent at hem. Front button closure.
## 3 Slim fit jacket. Notched lapel collar. Long sleeves with buttoned cuffs. Welt pocket at chest and flap pockets at hip. Interior pocket. Back vents. Front button closure.
## 4 Slim fit jacket made of viscose blend fabric. Notched lapel collar. Long sleeves with buttoned cuffs. Welt pocket at chest and flap pockets at hip. Interior pocket. Back vents. Front button closure.
## 5 Jacket made of faux leather faux shearling with fleece interior. Tabbed lapel collar. Long sleeves. Zip pockets at hip. Front zip closure.
## 6 Relaxed fit jacket. Contrasting lapel collar and long sleeves with buttoned cuffs. Front pouch pockets. Interior pocket. Washed effect. Front zip closure.
## price currency scraped_at terms section
## 1 19.99 USD 2024-02-19T08:50:05.654618 jackets MAN
## 2 169.00 USD 2024-02-19T08:50:06.590930 jackets MAN
## 3 129.00 USD 2024-02-19T08:50:07.301419 jackets MAN
## 4 129.00 USD 2024-02-19T08:50:07.882922 jackets MAN
## 5 139.00 USD 2024-02-19T08:50:08.453847 jackets MAN
## 6 79.90 USD 2024-02-19T08:50:09.140497 jackets MAN
Unit of observation: an item sold at Zara
Sample size: 252
Used variables:
Price: Price of the product in USD
Section: Specifies whether the product is intended for men or women
Other variables:
Product.ID: Identification number for each product
Product.Position: Location of the product
Promotion: Indicates whether the product is currently being offered at a promotional price.
Product Category: Broad product group of an item
Seasonal: Product sold seasonally
Sales Volume: The number of units sold for an item.
Brand: Brand of the item
URL: web link to the item
SKU: Stock Keeping Unit, identification number to manage the inventory for the product
Name: Name of the product
Description: Description of the product
Currency: Currency of the product price.
Scraped_at: The time when the data was scraped
Terms: Subcategory of the product
Dataset from: https://www.kaggle.com/datasets/xontoloyo/data-penjualan-zara
The data shows product sales from Zara stores.
#Converting categorical variable into factors
mydata$section <- factor(mydata$section)
#Summary
summary(mydata$price)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 7.99 49.90 79.90 86.25 109.00 439.00
summary(mydata$section)
## MAN WOMAN
## 218 34
library(psych)
describe(mydata$price)
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 252 86.25 52.08 79.9 80.92 43.14 7.99 439 431.01 2.36 10.99 3.28
The cheapest item costs $7.99 and the most expensive costs $439.00, a range of $431.01. The mean price is $86.25. The median is $79.90: half of the items are priced below that and half above. The mean lies above the median, which is consistent with the right skew (2.36). The sample contains 218 men's items and 34 women's items.
library(psych)
describeBy(x = mydata$price,
group = mydata$section)
##
## Descriptive statistics by group
## group: MAN
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 218 91.82 53.01 89.9 86.99 29.65 9.99 439 429.01 2.34 10.91 3.59
## ------------------------------------------------------------
## group: WOMAN
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 34 50.53 25.25 48.9 48.4 4.45 7.99 169 161.01 2.79 11.83 4.33
On average, men's clothing ($91.82) is more expensive than women's clothing ($50.53) in this sample. The medians point the same way ($89.90 vs. $48.90).
An independent samples t-test is the natural first choice, because we want to compare the means of two groups, and men's and women's items form separate groups that are independent of each other.
H0: Mean price of men's clothing = Mean price of women's clothing
H1: Mean price of men's clothing ≠ Mean price of women's clothing
# Draw a plot to check the normality
library(ggplot2)
##
## Attaching package: 'ggplot2'
## The following objects are masked from 'package:psych':
##
## %+%, alpha
ggplot(mydata[mydata$section == "WOMAN",], aes(x= price))+
theme_linedraw()+
geom_bar(fill = "darkred")+
ylab("Frequency")+
ggtitle("Woman clothing")
ggplot(mydata[mydata$section == "MAN",], aes(x= price))+
theme_linedraw()+
geom_bar(fill = "darkblue")+
ylab("Frequency")+
ggtitle("Man clothing")
# Check with Shapiro test
#shapiro.test(mydata$price[mydata$section == "MAN"])
#shapiro.test(mydata$price[mydata$section == "WOMAN"])
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(rstatix)
##
## Attaching package: 'rstatix'
## The following object is masked from 'package:stats':
##
## filter
mydata %>%
group_by(section) %>%
shapiro_test(price)
## # A tibble: 2 × 4
## section variable statistic p
## <fct> <chr> <dbl> <dbl>
## 1 MAN price 0.829 9.51e-15
## 2 WOMAN price 0.662 1.33e- 7
The dependent variable is numeric. The plots show a few outliers. Both Shapiro-Wilk tests indicate that prices are not normally distributed in either group (p < 0.001).
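The plotting code above uses geom_bar(), which draws one bar per distinct price value. For a continuous variable such as price, a binned histogram is often easier to read when judging shape and outliers. The following is only a sketch of that alternative: the data are simulated (they are not the Zara prices), and the names toy and p as well as the bin width of 20 are assumptions made for illustration.

```r
library(ggplot2)

# Simulated right-skewed prices, for illustration only (NOT the Zara data).
set.seed(42)
toy <- data.frame(price = round(rlnorm(200, meanlog = 4.2, sdlog = 0.5), 2))

# geom_histogram() groups the continuous variable into bins;
# binwidth = 20 is an arbitrary choice and should be tuned to the data.
p <- ggplot(toy, aes(x = price)) +
  theme_linedraw() +
  geom_histogram(binwidth = 20, fill = "darkblue", colour = "white") +
  ylab("Frequency") +
  ggtitle("Simulated prices")
p
```

With the real data, the same layer could replace geom_bar() in the two plots, one per section.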
The normality assumption is not met, so a non-parametric alternative is used: the Wilcoxon rank sum test replaces the independent samples t-test.
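To make clear what the rank sum test measures, the sketch below computes the W statistic by hand on a small made-up example and compares it with wilcox.test(). The vectors x and y are hypothetical toy values without ties, not prices from the dataset.

```r
# Toy data (hypothetical values, no ties), two independent groups.
x <- c(12.5, 30.1, 45.0, 60.2, 80.9)
y <- c(10.0, 20.3, 25.7, 33.3)

# Rank all observations together, then sum the ranks of the first group.
r <- rank(c(x, y))
rank_sum_x <- sum(r[seq_along(x)])

# W as reported by R: rank sum of group 1 minus its minimum possible value.
n1 <- length(x)
w_manual <- rank_sum_x - n1 * (n1 + 1) / 2

# The same statistic from the built-in test.
w_builtin <- unname(wilcox.test(x, y)$statistic)

w_manual
w_builtin
```

W counts the pairs (one value from each group) in which the first group's value is larger, so the test depends only on the ordering of the prices and not on their distribution being normal.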
wilcox.test(mydata$price ~ mydata$section,
paired = FALSE,
correct = FALSE,
exact = FALSE,
alternative = "two.sided")
##
## Wilcoxon rank sum test
##
## data: mydata$price by mydata$section
## W = 5996, p-value = 5.917e-09
## alternative hypothesis: true location shift is not equal to 0
The Wilcoxon rank sum test shows a significant result (W = 5996, p < 0.001), so the null hypothesis is rejected. The true location shift is not equal to 0: the price distributions of men's and women's clothing differ.
#install.packages("effectsize")
library(effectsize)
##
## Attaching package: 'effectsize'
## The following objects are masked from 'package:rstatix':
##
## cohens_d, eta_squared
## The following object is masked from 'package:psych':
##
## phi
effectsize(wilcox.test(mydata$price ~ mydata$section),
paired = FALSE,
correct = FALSE,
exact = FALSE,
alternative ="two.sided")
## r (rank biserial) | 95% CI
## --------------------------------
## 0.62 | [0.47, 0.73]
interpret_rank_biserial(0.62)
## [1] "very large"
## (Rules: funder2019)
The effect size shows a very large difference between the two price distributions (rank-biserial r = 0.62, 95% CI [0.47, 0.73]).
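As a plausibility check, the rank-biserial correlation can be recovered from the reported test statistic and the group sizes. The sketch assumes the standard relationship r = 2W / (n1 * n2) - 1; the variable names are chosen here for illustration, and the numbers are taken from the output above.

```r
# Values reported earlier in this document.
W       <- 5996   # Wilcoxon rank sum statistic (MAN is the first factor level)
n_man   <- 218    # number of men's items
n_woman <- 34     # number of women's items

# Share of (men's item, women's item) pairs in which the men's item is
# pricier, with tied pairs counted as one half.
prop_man_higher <- W / (n_man * n_woman)

# Rank-biserial correlation: rescales that share from [0, 1] to [-1, 1].
r_rb <- 2 * prop_man_higher - 1

round(prop_man_higher, 3)
round(r_rb, 2)
```

The result rounds to 0.62, in line with the effectsize output. The intermediate share of about 0.81 means that in roughly 81% of all possible pairs the men's item is the more expensive one.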
Are there differences in pricing between women’s and men’s clothing items at Zara? The Wilcoxon rank sum test indicates that the prices of women’s and men’s clothing differ significantly (W = 5996, p < 0.001), with a very large effect (rank-biserial r = 0.62). In this sample, men’s clothing at Zara is more expensive than women’s clothing.