Wardrobe Economics: Examining Sales, Categories, and Geography for Products (Clothes and Accessories)
This comprehensive collection of data opens the door to a wealth of insights waiting to be discovered. Gain a profound understanding of consumer preferences and buying behavior as you immerse yourself in the intricate details of sales transactions, diverse product categories, and the geographical spread of fashion trends.
Now we explain, step by step, how to make visualisations in R with the Wardrobe Product Detail data set.
Let’s inspect our data using head() and str().
From the inspection above, we can see that the data consists of:
#> 'data.frame': 30 obs. of 13 variables:
#> $ Date : int 20210601 20210602 20210603 20210604 20210605 20210606 20210607 20210608 20210609 20210610 ...
#> $ Customer_ID : int 98 92 92 99 66 97 45 81 47 24 ...
#> $ Product_ID : int 321 261 264 251 251 304 357 258 260 263 ...
#> $ Quantity : int 1 4 1 3 1 3 2 1 3 3 ...
#> $ Unit_Price : num 117.3 32.3 36.2 29.9 41.8 ...
#> $ Sales_Revenue : num 117.3 129.1 36.2 89.7 41.8 ...
#> $ Product_Description: chr "Cycling Jerseys" "Casual Shirts" "Casual Shirts" "Jeans" ...
#> $ Product_Category : chr "Sports" "Menswear" "Menswear" "Menswear" ...
#> $ Product_Line : chr "Tops" "Tops" "Tops" "Trousers" ...
#> $ Raw_Material : chr "Fabrics" "Cotton" "Cotton" "Cotton" ...
#> $ Region : chr "York" "Worcester" "Worcester" "Winchester" ...
#> $ Latitude : num 54 52.2 52.2 51.1 51.1 ...
#> $ Longitude : num -1.08 -2.22 -2.22 -1.31 -1.31 ...
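The loading step itself is not shown above, so here is a minimal sketch of it. The file name wardrobe_product_detail.csv is a hypothetical placeholder, and the small data frame is only a stand-in built from the first four rows visible in the str() output (values as rounded there), so that the block runs on its own.

```r
# Hypothetical file name (an assumption); point this at the real file.
# wardrobe <- read.csv("wardrobe_product_detail.csv")

# Stand-in: first four rows as displayed (rounded) in the str() output above
wardrobe <- data.frame(
  Date                = c(20210601L, 20210602L, 20210603L, 20210604L),
  Customer_ID         = c(98L, 92L, 92L, 99L),
  Product_ID          = c(321L, 261L, 264L, 251L),
  Quantity            = c(1L, 4L, 1L, 3L),
  Unit_Price          = c(117.3, 32.3, 36.2, 29.9),
  Sales_Revenue       = c(117.3, 129.1, 36.2, 89.7),
  Product_Description = c("Cycling Jerseys", "Casual Shirts", "Casual Shirts", "Jeans"),
  Product_Category    = c("Sports", "Menswear", "Menswear", "Menswear"),
  Product_Line        = c("Tops", "Tops", "Tops", "Trousers"),
  Raw_Material        = c("Fabrics", "Cotton", "Cotton", "Cotton"),
  Region              = c("York", "Worcester", "Worcester", "Winchester"),
  Latitude            = c(54, 52.2, 52.2, 51.1),
  Longitude           = c(-1.08, -2.22, -2.22, -1.31),
  stringsAsFactors    = FALSE
)

head(wardrobe)  # first rows (at most six)
str(wardrobe)   # column names, types, and a preview of the values
```

head() shows the rows themselves, while str() shows the type of every column, which is what the next step relies on.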
Data processing starts by converting columns that have the wrong data type into the correct type, using the lubridate and dplyr libraries, and saving the result under a new name.
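library(dplyr)      # mutate() and the %>% pipe
library(lubridate)  # ymd()

wardrobe_clean <- wardrobe %>%
  mutate(Date = ymd(Date),                                # integer yyyymmdd -> Date
         Customer_ID = as.character(Customer_ID),         # IDs are labels, not quantities
         Product_ID = as.character(Product_ID),
         Product_Category = as.factor(Product_Category),  # repeated labels -> factor
         Product_Line = as.factor(Product_Line),
         Raw_Material = as.factor(Raw_Material))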
head(wardrobe_clean)
Then make sure there are no duplicates or missing values.
#> [1] 0
There are no duplicate rows.
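The call that produced the 0 above is not shown. One common way to count duplicate rows in base R, offered here as a sketch rather than as the original code, is sum(duplicated(...)). The toy data frame is made up purely for illustration.

```r
# Toy data (illustrative only): three distinct rows
toy <- data.frame(
  Customer_ID = c("98", "92", "92"),
  Product_ID  = c("321", "261", "264"),
  stringsAsFactors = FALSE
)

# duplicated() flags every row that repeats an earlier row
n_dup <- sum(duplicated(toy))
n_dup

# Appending a copy of the first row creates exactly one duplicate
toy_dup   <- rbind(toy, toy[1, ])
n_dup_new <- sum(duplicated(toy_dup))
n_dup_new
```

On the full data this would be sum(duplicated(wardrobe_clean)), where a result of 0 means every row is unique.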
Then check for missing values.
#> Date Customer_ID Product_ID Quantity
#> 0 0 0 0
#> Unit_Price Sales_Revenue Product_Description Product_Category
#> 0 0 0 0
#> Product_Line Raw_Material Region Latitude
#> 0 0 0 0
#> Longitude
#> 0
Our data has no missing values.
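The per-column counts above have the shape that colSums(is.na(...)) returns, although the original call is not shown. A minimal sketch on a made-up toy data frame that deliberately contains missing values:

```r
# Toy data (illustrative only) with some NA values
toy <- data.frame(
  Quantity = c(1L, 4L, NA),
  Region   = c("York", NA, NA),
  stringsAsFactors = FALSE
)

# is.na() gives a logical matrix; colSums() counts the TRUE values per column
na_counts <- colSums(is.na(toy))
na_counts

# anyNA() answers the yes/no question for the whole data frame
anyNA(toy)
```

On the full data this would be colSums(is.na(wardrobe_clean)), and a row of zeros means no column has missing values.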
Check the data again with summary() to see how the values are distributed.
#> Date Customer_ID Product_ID Quantity
#> Min. :2021-06-01 Length:30 Length:30 Min. :1.000
#> 1st Qu.:2021-06-08 Class :character Class :character 1st Qu.:1.000
#> Median :2021-06-15 Mode :character Mode :character Median :2.000
#> Mean :2021-06-15 Mean :2.067
#> 3rd Qu.:2021-06-22 3rd Qu.:3.000
#> Max. :2021-06-30 Max. :4.000
#> Unit_Price Sales_Revenue Product_Description Product_Category
#> Min. : 21.97 Min. : 21.97 Length:30 Accessories: 2
#> 1st Qu.: 32.39 1st Qu.: 36.77 Class :character Menswear :13
#> Median : 36.19 Median : 79.26 Mode :character Sports : 2
#> Mean : 40.50 Mean : 79.69 Womenswear :13
#> 3rd Qu.: 44.34 3rd Qu.:113.76
#> Max. :117.31 Max. :175.49
#> Product_Line Raw_Material Region Latitude
#> Leathers: 1 Cashmere : 4 Length:30 Min. :50.26
#> Shoes : 1 Cotton :15 Class :character 1st Qu.:51.06
#> Tops :23 Fabrics : 1 Mode :character Median :52.19
#> Trousers: 5 Leather : 4 Mean :52.24
#> Polyester: 2 3rd Qu.:53.68
#> Wool : 4 Max. :53.96
#> Longitude
#> Min. :-5.051
#> 1st Qu.:-2.647
#> Median :-1.490
#> Mean :-2.270
#> 3rd Qu.:-1.353
#> Max. :-1.080
According to the summary above, our data contains wardrobe sales from 1 to 30 June 2021. We can look at frequencies through exploratory visualization, which means visualizing the data in order to get to know it. To see a frequency distribution, we can make a histogram with the function hist().
The histogram shows that sales revenue most often falls below 50.
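The hist() call is not reproduced above, so the following is a sketch of what such a call might look like. It uses only the five Sales_Revenue values visible in the str() output, and the breaks of width 50 are an assumption chosen to match the "below 50" reading.

```r
# Five Sales_Revenue values as displayed (rounded) in the str() output
sales_revenue <- c(117.3, 129.1, 36.2, 89.7, 41.8)

# hist() draws the plot and invisibly returns the bin counts
h <- hist(sales_revenue,
          breaks = seq(0, 150, by = 50),  # assumed bin width of 50
          main   = "Distribution of Sales Revenue",
          xlab   = "Sales Revenue")

h$counts  # number of values in each bin
```

On the full data the first argument would be wardrobe_clean$Sales_Revenue, and the tallest bar marks the most frequent revenue range.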
Let’s look at the count of each product category.
#>
#> Accessories Menswear Sports Womenswear
#> 2 13 2 13
From the chart, Womenswear and Menswear have the same count (13 each).
From the chart, cotton is the most common raw material.
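The charts referred to here are not reproduced, so this is a sketch of how such bar charts could be drawn in base R. The counts are copied from the table() and summary() output above; with the full data they would come from table(wardrobe_clean$Product_Category) and table(wardrobe_clean$Raw_Material).

```r
# Counts copied from the outputs above
category_counts <- c(Accessories = 2, Menswear = 13, Sports = 2, Womenswear = 13)
material_counts <- c(Cashmere = 4, Cotton = 15, Fabrics = 1,
                     Leather = 4, Polyester = 2, Wool = 4)

# One bar per product category
barplot(category_counts,
        main = "Transactions per Product Category",
        ylab = "Count")

# Sorting first makes the most common raw material the first bar
barplot(sort(material_counts, decreasing = TRUE),
        main = "Transactions per Raw Material",
        ylab = "Count")

# Label of the largest count
most_common_material <- names(which.max(material_counts))
most_common_material
```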
Next comes the stage of creating visualizations to present our data. At this stage we will create visualizations that are both attractive and informative.
Let’s create a neater and more interesting visualization using the ggplot2 library. First, let’s try to improve the barchart in the Exploratory Data Analysis section above.
Let’s create a data frame of the average sales revenue for each product category. We can use the functions group_by(), summarise(), and ungroup(), which give the same result as the function aggregate().
library(ggplot2)

# Average sales revenue per product category
wardrobe_revenue <- wardrobe_clean %>%
  group_by(Product_Category) %>%
  summarise(avg_revenue = mean(Sales_Revenue)) %>%
  ungroup()
wardrobe_revenue

# Bars ordered by average revenue, with darker fill for higher values
ggplot(data = wardrobe_revenue, mapping = aes(x = reorder(Product_Category, avg_revenue), y = avg_revenue)) +
  geom_col(mapping = aes(fill = avg_revenue)) +
  scale_fill_gradient(low = "#b8d5e6", high = "#0a7e8c") +
  labs(title = "Average Revenue Product Category",
       x = "Product Category",
       y = "Product Revenue") +
  theme_minimal() +
  theme(legend.position = "none")
The chart above shows that, by average sales revenue, the Accessories product category ranks highest.
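The remark that group_by(), summarise(), and ungroup() give the same result as aggregate() can be illustrated with a base R sketch. The toy data frame holds only the first four rows visible in the str() output (rounded values), so the averages here are not those of the full data.

```r
# First four rows as displayed (rounded) in the str() output
toy <- data.frame(
  Product_Category = c("Sports", "Menswear", "Menswear", "Menswear"),
  Sales_Revenue    = c(117.3, 129.1, 36.2, 89.7),
  stringsAsFactors = FALSE
)

# Formula interface: mean of Sales_Revenue within each Product_Category
avg_base <- aggregate(Sales_Revenue ~ Product_Category, data = toy, FUN = mean)

# Rename the value column to match the dplyr version
names(avg_base)[2] <- "avg_revenue"
avg_base
```

The dplyr pipeline returns a tibble and aggregate() returns a plain data frame, but both contain one row per category with the group mean.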
Next, let’s create a data frame of the average unit price for each product category, again using group_by(), summarise(), and ungroup().
# Average unit price per product category
wardrobe_price <- wardrobe_clean %>%
  group_by(Product_Category) %>%
  summarise(avg_price = mean(Unit_Price)) %>%
  ungroup()
wardrobe_price
Then we can make the chart.
ggplot(data = wardrobe_price, mapping = aes(x = reorder(Product_Category, avg_price), y = avg_price)) +
  geom_col(mapping = aes(fill = avg_price)) +
  scale_fill_gradient(low = "#b8d5e6", high = "#0a7e8c") +
  labs(title = "Average Price of Product Category",
       x = "Product Category",
       y = "Average Price") +
  theme_minimal() +
  theme(legend.position = "none")
The chart above shows that, by average price, the Sports product category ranks highest.
Now we can make a chart of the average price of every product.
# Average unit price per product
wardrobe_product <- wardrobe_clean %>%
  group_by(Product_Description) %>%
  summarise(avg_price = mean(Unit_Price)) %>%
  ungroup()
wardrobe_product

# Horizontal bars so that long product names stay readable
ggplot(data = wardrobe_product, mapping = aes(x = avg_price, y = reorder(Product_Description, avg_price))) +
  geom_col(mapping = aes(fill = avg_price)) +
  scale_fill_gradient(low = "#b8d5e6", high = "#0a7e8c") +
  labs(title = "Average Price of Product",
       x = "Average Price",
       y = NULL) +
  theme_minimal() +
  theme(legend.position = "none")
The chart above shows that, by average price, Cycling Jerseys rank highest.
Based on the explanation above, we can conclude:
- By average sales revenue, the leading product category is Accessories, which has an average price of 42.65. By comparison, the product category with the highest average price is Sports, at 69.63.
- The top 5 products by average price are:
1. Cycling Jerseys
2. Underwear
3. Sweats
4. Belts
5. Pyjamas
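A ranking like this top 5 can be read off the chart, or computed directly. The sketch below shows one way in base R, using only the first four rows visible in the str() output (rounded values), so it returns three products rather than five and does not reproduce the list above.

```r
# First four rows as displayed (rounded) in the str() output
toy <- data.frame(
  Product_Description = c("Cycling Jerseys", "Casual Shirts", "Casual Shirts", "Jeans"),
  Unit_Price          = c(117.3, 32.3, 36.2, 29.9),
  stringsAsFactors    = FALSE
)

# Average unit price per product
avg_price <- aggregate(Unit_Price ~ Product_Description, data = toy, FUN = mean)

# Sort from highest to lowest average price and keep at most five rows
ranked <- avg_price[order(avg_price$Unit_Price, decreasing = TRUE), ]
top5   <- head(ranked, 5)
top5
```

With the wardrobe_product data frame from the previous step, the same idea would be head(wardrobe_product[order(wardrobe_product$avg_price, decreasing = TRUE), ], 5).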