Management of PT. Elips (a retail company) want to know about their constumer and their current product sales. The management expect more sales if they know the demanded product and the effect about discount to their sales. Hopefully, there is any insight from discounted product so they know which product effected from discount. Hence, the management could give better product sales decision, which product sales will increased by discount and which are not
elips <- read.csv("datainput/retail.csv")
head(elips)tail(elips)dim(elips)## [1] 9994 15
names(elips)## [1] "Row.ID" "Order.ID" "Order.Date" "Ship.Date" "Ship.Mode"
## [6] "Customer.ID" "Segment" "Product.ID" "Category" "Sub.Category"
## [11] "Product.Name" "Sales" "Quantity" "Discount" "Profit"
str(elips)## 'data.frame': 9994 obs. of 15 variables:
## $ Row.ID : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Order.ID : chr "CA-2016-152156" "CA-2016-152156" "CA-2016-138688" "US-2015-108966" ...
## $ Order.Date : chr "11/8/16" "11/8/16" "6/12/16" "10/11/15" ...
## $ Ship.Date : chr "11/11/16" "11/11/16" "6/16/16" "10/18/15" ...
## $ Ship.Mode : chr "Second Class" "Second Class" "Second Class" "Standard Class" ...
## $ Customer.ID : chr "CG-12520" "CG-12520" "DV-13045" "SO-20335" ...
## $ Segment : chr "Consumer" "Consumer" "Corporate" "Consumer" ...
## $ Product.ID : chr "FUR-BO-10001798" "FUR-CH-10000454" "OFF-LA-10000240" "FUR-TA-10000577" ...
## $ Category : chr "Furniture" "Furniture" "Office Supplies" "Furniture" ...
## $ Sub.Category: chr "Bookcases" "Chairs" "Labels" "Tables" ...
## $ Product.Name: chr "Bush Somerset Collection Bookcase" "Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back" "Self-Adhesive Address Labels for Typewriters by Universal" "Bretford CR4500 Series Slim Rectangular Table" ...
## $ Sales : num 262 731.9 14.6 957.6 22.4 ...
## $ Quantity : int 2 3 2 5 2 7 4 6 3 5 ...
## $ Discount : num 0 0 0 0.45 0.2 0 0 0.2 0.2 0 ...
## $ Profit : num 41.91 219.58 6.87 -383.03 2.52 ...
elips$Row.ID <- as.character(elips$Row.ID)
elips$Order.ID <- as.character(elips$Order.ID)
elips$Order.Date <- as.Date(elips$Order.Date,"%m/%d/%y")
elips$Ship.Date <- as.Date(elips$Ship.Date,"%m/%d/%y")
elips$Customer.ID <- as.character(elips$Customer.ID)
elips$Product.ID <- as.character(elips$Product.ID)
elips$Product.Name <- as.character(elips$Product.Name)
elips$Ship.Mode <- as.factor(elips$Ship.Mode)
elips$Segment <- as.factor(elips$Segment)
elips$Category <- as.factor(elips$Category)
elips$Sub.Category <- as.factor(elips$Sub.Category)
str(elips)## 'data.frame': 9994 obs. of 15 variables:
## $ Row.ID : chr "1" "2" "3" "4" ...
## $ Order.ID : chr "CA-2016-152156" "CA-2016-152156" "CA-2016-138688" "US-2015-108966" ...
## $ Order.Date : Date, format: "2016-11-08" "2016-11-08" ...
## $ Ship.Date : Date, format: "2016-11-11" "2016-11-11" ...
## $ Ship.Mode : Factor w/ 4 levels "First Class",..: 3 3 3 4 4 4 4 4 4 4 ...
## $ Customer.ID : chr "CG-12520" "CG-12520" "DV-13045" "SO-20335" ...
## $ Segment : Factor w/ 3 levels "Consumer","Corporate",..: 1 1 2 1 1 1 1 1 1 1 ...
## $ Product.ID : chr "FUR-BO-10001798" "FUR-CH-10000454" "OFF-LA-10000240" "FUR-TA-10000577" ...
## $ Category : Factor w/ 3 levels "Furniture","Office Supplies",..: 1 1 2 1 2 1 2 3 2 2 ...
## $ Sub.Category: Factor w/ 17 levels "Accessories",..: 5 6 11 17 15 10 3 14 4 2 ...
## $ Product.Name: chr "Bush Somerset Collection Bookcase" "Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back" "Self-Adhesive Address Labels for Typewriters by Universal" "Bretford CR4500 Series Slim Rectangular Table" ...
## $ Sales : num 262 731.9 14.6 957.6 22.4 ...
## $ Quantity : int 2 3 2 5 2 7 4 6 3 5 ...
## $ Discount : num 0 0 0 0.45 0.2 0 0 0.2 0.2 0 ...
## $ Profit : num 41.91 219.58 6.87 -383.03 2.52 ...
colSums(is.na(elips))## Row.ID Order.ID Order.Date Ship.Date Ship.Mode Customer.ID
## 0 0 0 0 0 0
## Segment Product.ID Category Sub.Category Product.Name Sales
## 0 0 0 0 0 0
## Quantity Discount Profit
## 0 0 0
anyNA(elips)## [1] FALSE
Elips <- elips[,c(3:15)]
head(Elips)summary(Elips)## Order.Date Ship.Date Ship.Mode
## Min. :2014-01-03 Min. :2014-01-07 First Class :1538
## 1st Qu.:2015-05-23 1st Qu.:2015-05-27 Same Day : 543
## Median :2016-06-26 Median :2016-06-29 Second Class :1945
## Mean :2016-04-30 Mean :2016-05-03 Standard Class:5968
## 3rd Qu.:2017-05-14 3rd Qu.:2017-05-18
## Max. :2017-12-30 Max. :2018-01-05
##
## Customer.ID Segment Product.ID
## Length:9994 Consumer :5191 Length:9994
## Class :character Corporate :3020 Class :character
## Mode :character Home Office:1783 Mode :character
##
##
##
##
## Category Sub.Category Product.Name
## Furniture :2121 Binders :1523 Length:9994
## Office Supplies:6026 Paper :1370 Class :character
## Technology :1847 Furnishings: 957 Mode :character
## Phones : 889
## Storage : 846
## Art : 796
## (Other) :3613
## Sales Quantity Discount Profit
## Min. : 0.444 Min. : 1.00 Min. :0.0000 Min. :-6599.978
## 1st Qu.: 17.280 1st Qu.: 2.00 1st Qu.:0.0000 1st Qu.: 1.729
## Median : 54.490 Median : 3.00 Median :0.2000 Median : 8.666
## Mean : 229.858 Mean : 3.79 Mean :0.1562 Mean : 28.657
## 3rd Qu.: 209.940 3rd Qu.: 5.00 3rd Qu.:0.2000 3rd Qu.: 29.364
## Max. :22638.480 Max. :14.00 Max. :0.8000 Max. : 8399.976
##
sum(Elips$Quantity)## [1] 37873
sum(Elips$Profit)## [1] 286397
Initial order date is on 3/1/2014 and the first delivery is on 7/1/2014. Classes with the highest to lowest number of delivery, consecutively, are Standard Class, Second Class, First Class, and Same Day. Product Segment with the highest number of orders/purchases is Consumer Goods, followed with Corporate Goods and Home Office Goods. Categories of goods most purchased from highest to lowest are Office Supply, Furniture, and Technology. The most purchased goods are of the sub-category of Binders, followed by Paper, and Furnishing. The total number of sales is at 37873. The maximum sales is at 22638.4; minimum at 0.4, with an average sales of 229.858. Maximum Quantity is at 14; minimum at 1, with an average of 3.79. Maximum discount rate is set at 80% with an average of 20%. Maximum profit reaches 8399.9 with a minimum profit at -6599.9, yielding an average profit of 28.65. The total profit sums up to 286397.
aggregate(Elips$Profit~Elips$Segment,Elips,mean)aggregate(Elips$Profit~Elips$Segment,Elips,var)aggregate(Elips$Profit~Elips$Segment,Elips,sd)boxplot(Elips$Profit)Elips[Elips$Sales== min(Elips$Sales),]Lowest sales in the Consumer segment comes from Office Supplies in which the profit ratio is at 1.11.
os <- Elips[Elips$Category == "Office Supplies",]
round(prop.table(table(os$Segment))*100,2)##
## Consumer Corporate Home Office
## 51.89 30.20 17.91
Highest number of buyers of Office Supplies come from the Consumer segment
dis <- Elips[Elips$Discount == max(Elips$Discount),]
dis_date <- sort(table(dis$Order.Date),decreasing = T)[1]
dis_date## 2014-07-26
## 4
An 80% Discount occurs 4 times in July 2014.
dis2 <- dis[dis$Order.Date == "2014-07-26",]
table(dis2$Segment)##
## Consumer Corporate Home Office
## 4 0 0
Buyers of 80%-discounted goods are consumers.
xtabs(Quantity~Segment+Category,Elips)## Category
## Segment Furniture Office Supplies Technology
## Consumer 4166 11758 3597
## Corporate 2495 7018 2095
## Home Office 1367 4130 1247
heatmap(xtabs(Quantity~Segment+Category,Elips),cexRow = 0.8,cexCol = 0.8,scale = "column",Colv = NA, Rowv = NA)aggregate(Quantity~Segment+Category,Elips,median)sum(Elips$Quantity)## [1] 37873
sum(Elips$Profit)## [1] 286397
nol <- Elips[Elips$Discount == 0,]
knitr::kable(table(nol$Discount))| Var1 | Freq |
|---|---|
| 0 | 4798 |
More0 <- Elips[Elips$Discount > 0 & Elips$Discount <=0.8,]
knitr::kable(table(More0$Discount))| Var1 | Freq |
|---|---|
| 0.1 | 94 |
| 0.15 | 52 |
| 0.2 | 3657 |
| 0.3 | 227 |
| 0.32 | 27 |
| 0.4 | 206 |
| 0.45 | 11 |
| 0.5 | 66 |
| 0.6 | 138 |
| 0.7 | 418 |
| 0.8 | 300 |
sort(xtabs(Profit~Category,Elips),decreasing = T)## Category
## Technology Office Supplies Furniture
## 145454.95 122490.80 18451.27
The most profitable category is Technology.
Elips[Elips$Profit== min(Elips$Profit),]The biggest loss occurs on order date 2016-11-26 in which 5 items of Technology-Maxhines-Cubify CubeX 3D Printer Double Head Print are discounted 70%.
sort(xtabs(Profit ~ Segment, Elips),decreasing = T)## Segment
## Consumer Corporate Home Office
## 134119.21 91979.13 60298.68
Profit based on segment is consumer.
sort(xtabs(Sales ~ Category, Elips),decreasing = T)## Category
## Technology Furniture Office Supplies
## 836154.0 741999.8 719047.0
Highest selling price / highest selling margin of goods is Technology.
xtabs(Profit ~ Segment+Ship.Mode,Elips)## Ship.Mode
## Segment First Class Same Day Second Class Standard Class
## Consumer 21374.044 9874.205 24946.911 77924.049
## Corporate 14464.472 1818.142 18225.713 57470.807
## Home Office 13131.324 4199.412 14274.011 28693.932
plot(xtabs(Profit ~ Segment+Ship.Mode,Elips))Every segment prefers delivery by Standard Class the most, followed by Second Class, First Class, and Same Day, and consumer provides the highest profit.
sort(xtabs(Quantity ~ Category, Elips),decreasing=T)## Category
## Office Supplies Furniture Technology
## 22906 8028 6939
Based on item sold, office supply was the highest folloewd by furniture and technology
Tech_Category <- Elips[Elips$Category == "Technology",]
Off_Sup_Category <- Elips[Elips$Category == "Office Supplies",]
Furniture_Category <- Elips[Elips$Category == "Furniture",]sum(Tech_Category$Profit)## [1] 145454.9
sum(Off_Sup_Category$Profit)## [1] 122490.8
sum(Furniture_Category$Profit)## [1] 18451.27
cor(Tech_Category$Quantity,Tech_Category$Discount)## [1] -0.01413123
cor(Tech_Category$Profit,Tech_Category$Discount)## [1] -0.2688532
The number of purchases on goods of the Technology category has a contrast correlation toward discounts, which means that discounts do not contribute to sales of Technology goods.
The profit of Technology is inversely proportional to discounts, meaning that discounts contribute negatively towards profit of the sales of Technology goods.
since PT. Elips started their company to sell Furniture, Office supplies and Technology goods in 2014 to 2017, there were 9994 transaction recorded, sold 37873 items. Buyer from this company derived in three segment, they were consumer, Corporate and Home Office. Based on all transaction transaction we can conclude that product categorty that oftenly brought was consumer goods followed by Corporate goods then Home office, where Office supply goods found highly consumed then followed by Furniture and Technology goods. Binders, paper and furnishing are the top three item consumed from office supply. Office supply buyer mostly came from Consumer . Shipping mode that usually used for each segment was standard class followed by second class, first class then same day. Maximum sales achieved per single transaction was 22638.4 and maximum quantity sold per single transaction was 14 but oftenly 3 items per transaction. Margin of product profit maximum per transaction was 8399.9, and the average was 28.65. Among those three category, technoloy given high margin, but due to high discount about 80% from technoloy caused a big loss about -6599.9.