This data is contains of sales record included the profit or loss from a retail company (i named it PT. Elips). PT.Elips is a global retail company who sells Office Supplies, Furniture and Technology stuff. The buyers are coming from different segment such as Corporate, Home Office until individual consumer. This company also could provide the best option for your delivery which suitable with your needs.
Note : Detail explanatory will be given at the end of content
elips<- read.csv("data_input/retail.csv")
head(elips)
## Row.ID Order.ID Order.Date Ship.Date Ship.Mode Customer.ID
## 1 1 CA-2016-152156 11/8/16 11/11/16 Second Class CG-12520
## 2 2 CA-2016-152156 11/8/16 11/11/16 Second Class CG-12520
## 3 3 CA-2016-138688 6/12/16 6/16/16 Second Class DV-13045
## 4 4 US-2015-108966 10/11/15 10/18/15 Standard Class SO-20335
## 5 5 US-2015-108966 10/11/15 10/18/15 Standard Class SO-20335
## 6 6 CA-2014-115812 6/9/14 6/14/14 Standard Class BH-11710
## Segment Product.ID Category Sub.Category
## 1 Consumer FUR-BO-10001798 Furniture Bookcases
## 2 Consumer FUR-CH-10000454 Furniture Chairs
## 3 Corporate OFF-LA-10000240 Office Supplies Labels
## 4 Consumer FUR-TA-10000577 Furniture Tables
## 5 Consumer OFF-ST-10000760 Office Supplies Storage
## 6 Consumer FUR-FU-10001487 Furniture Furnishings
## Product.Name Sales
## 1 Bush Somerset Collection Bookcase 261.9600
## 2 Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back 731.9400
## 3 Self-Adhesive Address Labels for Typewriters by Universal 14.6200
## 4 Bretford CR4500 Series Slim Rectangular Table 957.5775
## 5 Eldon Fold 'N Roll Cart System 22.3680
## 6 Eldon Expressions Wood and Plastic Desk Accessories, Cherry Wood 48.8600
## Quantity Discount Profit
## 1 2 0.00 41.9136
## 2 3 0.00 219.5820
## 3 2 0.00 6.8714
## 4 5 0.45 -383.0310
## 5 2 0.20 2.5164
## 6 7 0.00 14.1694
tail(elips)
## Row.ID Order.ID Order.Date Ship.Date Ship.Mode Customer.ID
## 9989 9989 CA-2017-163629 11/17/17 11/21/17 Standard Class RA-19885
## 9990 9990 CA-2014-110422 1/21/14 1/23/14 Second Class TB-21400
## 9991 9991 CA-2017-121258 2/26/17 3/3/17 Standard Class DB-13060
## 9992 9992 CA-2017-121258 2/26/17 3/3/17 Standard Class DB-13060
## 9993 9993 CA-2017-121258 2/26/17 3/3/17 Standard Class DB-13060
## 9994 9994 CA-2017-119914 5/4/17 5/9/17 Second Class CC-12220
## Segment Product.ID Category Sub.Category
## 9989 Corporate TEC-PH-10004006 Technology Phones
## 9990 Consumer FUR-FU-10001889 Furniture Furnishings
## 9991 Consumer FUR-FU-10000747 Furniture Furnishings
## 9992 Consumer TEC-PH-10003645 Technology Phones
## 9993 Consumer OFF-PA-10004041 Office Supplies Paper
## 9994 Consumer OFF-AP-10002684 Office Supplies Appliances
## Product.Name
## 9989 Panasonic KX - TS880B Telephone
## 9990 Ultra Door Pull Handle
## 9991 Tenex B1-RE Series Chair Mats for Low Pile Carpets
## 9992 Aastra 57i VoIP phone
## 9993 It's Hot Message Books with Stickers, 2 3/4" x 5"
## 9994 Acco 7-Outlet Masterpiece Power Center, Wihtout Fax/Phone Line Protection
## Sales Quantity Discount Profit
## 9989 206.100 5 0.0 55.6470
## 9990 25.248 3 0.2 4.1028
## 9991 91.960 2 0.0 15.6332
## 9992 258.576 2 0.2 19.3932
## 9993 29.600 4 0.0 13.3200
## 9994 243.160 2 0.0 72.9480
dim(elips)
## [1] 9994 15
names(elips)
## [1] "Row.ID" "Order.ID" "Order.Date" "Ship.Date" "Ship.Mode"
## [6] "Customer.ID" "Segment" "Product.ID" "Category" "Sub.Category"
## [11] "Product.Name" "Sales" "Quantity" "Discount" "Profit"
From our inspection we can conclude : * elips data contain 9994 of rows and 15 of coloumns * Each of column name : “Row.ID”, “Order.ID”, “Order.Date”, “Ship.Date”, “Ship.Mode”, “Customer.ID”, “Segment”,“Product.ID”, “Category”, “Sub.Category”, “Product.Name”, “Sales”, “Quantity”, “Discount”, “Profit”
Check data type for each coloum
str(elips)
## 'data.frame': 9994 obs. of 15 variables:
## $ Row.ID : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Order.ID : chr "CA-2016-152156" "CA-2016-152156" "CA-2016-138688" "US-2015-108966" ...
## $ Order.Date : chr "11/8/16" "11/8/16" "6/12/16" "10/11/15" ...
## $ Ship.Date : chr "11/11/16" "11/11/16" "6/16/16" "10/18/15" ...
## $ Ship.Mode : chr "Second Class" "Second Class" "Second Class" "Standard Class" ...
## $ Customer.ID : chr "CG-12520" "CG-12520" "DV-13045" "SO-20335" ...
## $ Segment : chr "Consumer" "Consumer" "Corporate" "Consumer" ...
## $ Product.ID : chr "FUR-BO-10001798" "FUR-CH-10000454" "OFF-LA-10000240" "FUR-TA-10000577" ...
## $ Category : chr "Furniture" "Furniture" "Office Supplies" "Furniture" ...
## $ Sub.Category: chr "Bookcases" "Chairs" "Labels" "Tables" ...
## $ Product.Name: chr "Bush Somerset Collection Bookcase" "Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back" "Self-Adhesive Address Labels for Typewriters by Universal" "Bretford CR4500 Series Slim Rectangular Table" ...
## $ Sales : num 262 731.9 14.6 957.6 22.4 ...
## $ Quantity : int 2 3 2 5 2 7 4 6 3 5 ...
## $ Discount : num 0 0 0 0.45 0.2 0 0 0.2 0.2 0 ...
## $ Profit : num 41.91 219.58 6.87 -383.03 2.52 ...
From this result, we find some of data type not in the corect type. we need to convert it into corect type (data coertion)
elips$Row.ID <- as.character(elips$Row.ID)
elips$Order.ID<- as.character (elips$Order.ID)
elips$Order.Date<- as.Date(elips$Order.Date, "%m/%d/%y")
elips$Ship.Date <- as.Date(elips$Ship.Date, "%m/%d/%y")
elips$Customer.ID<-as.character(elips$Customer.ID)
elips$Product.ID<- as.character(elips$Product.ID)
elips$Product.Name<-as.character(elips$Product.Name)
str(elips)
## 'data.frame': 9994 obs. of 15 variables:
## $ Row.ID : chr "1" "2" "3" "4" ...
## $ Order.ID : chr "CA-2016-152156" "CA-2016-152156" "CA-2016-138688" "US-2015-108966" ...
## $ Order.Date : Date, format: "2016-11-08" "2016-11-08" ...
## $ Ship.Date : Date, format: "2016-11-11" "2016-11-11" ...
## $ Ship.Mode : chr "Second Class" "Second Class" "Second Class" "Standard Class" ...
## $ Customer.ID : chr "CG-12520" "CG-12520" "DV-13045" "SO-20335" ...
## $ Segment : chr "Consumer" "Consumer" "Corporate" "Consumer" ...
## $ Product.ID : chr "FUR-BO-10001798" "FUR-CH-10000454" "OFF-LA-10000240" "FUR-TA-10000577" ...
## $ Category : chr "Furniture" "Furniture" "Office Supplies" "Furniture" ...
## $ Sub.Category: chr "Bookcases" "Chairs" "Labels" "Tables" ...
## $ Product.Name: chr "Bush Somerset Collection Bookcase" "Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back" "Self-Adhesive Address Labels for Typewriters by Universal" "Bretford CR4500 Series Slim Rectangular Table" ...
## $ Sales : num 262 731.9 14.6 957.6 22.4 ...
## $ Quantity : int 2 3 2 5 2 7 4 6 3 5 ...
## $ Discount : num 0 0 0 0.45 0.2 0 0 0.2 0.2 0 ...
## $ Profit : num 41.91 219.58 6.87 -383.03 2.52 ...
Each of column already changed into desired data type
Cek for missing value
colSums(is.na(elips))
## Row.ID Order.ID Order.Date Ship.Date Ship.Mode Customer.ID
## 0 0 0 0 0 0
## Segment Product.ID Category Sub.Category Product.Name Sales
## 0 0 0 0 0 0
## Quantity Discount Profit
## 0 0 0
anyNA(elips)
## [1] FALSE
Elips<- elips[,c(3:15)]
head(Elips)
## Order.Date Ship.Date Ship.Mode Customer.ID Segment Product.ID
## 1 2016-11-08 2016-11-11 Second Class CG-12520 Consumer FUR-BO-10001798
## 2 2016-11-08 2016-11-11 Second Class CG-12520 Consumer FUR-CH-10000454
## 3 2016-06-12 2016-06-16 Second Class DV-13045 Corporate OFF-LA-10000240
## 4 2015-10-11 2015-10-18 Standard Class SO-20335 Consumer FUR-TA-10000577
## 5 2015-10-11 2015-10-18 Standard Class SO-20335 Consumer OFF-ST-10000760
## 6 2014-06-09 2014-06-14 Standard Class BH-11710 Consumer FUR-FU-10001487
## Category Sub.Category
## 1 Furniture Bookcases
## 2 Furniture Chairs
## 3 Office Supplies Labels
## 4 Furniture Tables
## 5 Office Supplies Storage
## 6 Furniture Furnishings
## Product.Name Sales
## 1 Bush Somerset Collection Bookcase 261.9600
## 2 Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back 731.9400
## 3 Self-Adhesive Address Labels for Typewriters by Universal 14.6200
## 4 Bretford CR4500 Series Slim Rectangular Table 957.5775
## 5 Eldon Fold 'N Roll Cart System 22.3680
## 6 Eldon Expressions Wood and Plastic Desk Accessories, Cherry Wood 48.8600
## Quantity Discount Profit
## 1 2 0.00 41.9136
## 2 3 0.00 219.5820
## 3 2 0.00 6.8714
## 4 5 0.45 -383.0310
## 5 2 0.20 2.5164
## 6 7 0.00 14.1694
Brief explanation
summary(Elips)
## Order.Date Ship.Date Ship.Mode
## Min. :2014-01-03 Min. :2014-01-07 Length:9994
## 1st Qu.:2015-05-23 1st Qu.:2015-05-27 Class :character
## Median :2016-06-26 Median :2016-06-29 Mode :character
## Mean :2016-04-30 Mean :2016-05-03
## 3rd Qu.:2017-05-14 3rd Qu.:2017-05-18
## Max. :2017-12-30 Max. :2018-01-05
## Customer.ID Segment Product.ID Category
## Length:9994 Length:9994 Length:9994 Length:9994
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## Sub.Category Product.Name Sales Quantity
## Length:9994 Length:9994 Min. : 0.444 Min. : 1.00
## Class :character Class :character 1st Qu.: 17.280 1st Qu.: 2.00
## Mode :character Mode :character Median : 54.490 Median : 3.00
## Mean : 229.858 Mean : 3.79
## 3rd Qu.: 209.940 3rd Qu.: 5.00
## Max. :22638.480 Max. :14.00
## Discount Profit
## Min. :0.0000 Min. :-6599.978
## 1st Qu.:0.0000 1st Qu.: 1.729
## Median :0.2000 Median : 8.666
## Mean :0.1562 Mean : 28.657
## 3rd Qu.:0.2000 3rd Qu.: 29.364
## Max. :0.8000 Max. : 8399.976
Summary : 1. First order occured in Jan 2014 2. Standart Class was the most popular one for shipping option and Same day delivery was the lowest one 3. PT. Elips has the most buyers coming from cunsumer segment and home office segment was the lowest 4. Office Supply category was the best seller compare to others 5. In Sub category, Other stuff was the most favourite one then followed by Binders and paper. Art is the lowest 6. PT. Elips gained the average sales at 229.858; with max sales value at 22638.480 and minumum sales value at 0.444 7. Max quantity order was 14 within 4 years but average at 3.79 8. PT. Elips max amount of discount was 80%, but mean number of discount around 15% 9. PT. Elips ever had loss at 6599 and gained 8399 as their profit with average profit at 28.657
Check the Outlier within profit
aggregate(Profit ~ Segment, Elips, mean)
## Segment Profit
## 1 Consumer 25.83687
## 2 Corporate 30.45667
## 3 Home Office 33.81866
aggregate(Profit~Segment,Elips,var)
## Segment Profit
## 1 Consumer 58851.86
## 2 Corporate 53817.63
## 3 Home Office 45105.89
aggregate(Profit~Segment,Elips,sd)
## Segment Profit
## 1 Consumer 242.5940
## 2 Corporate 231.9863
## 3 Home Office 212.3815
boxplot(Elips$Profit)
From result above, we find posibilities for the outliers, but from our calculation, Sd value is around 200 ( my oppinion its still be tolerated), so the process may continue.
1.Which category gives the lowest sales number? how much profit?
Elips[Elips$Sales == 0.444,]
## Order.Date Ship.Date Ship.Mode Customer.ID Segment Product.ID
## 4102 2017-06-19 2017-06-23 Standard Class ZC-21910 Consumer OFF-AP-10002906
## Category Sub.Category
## 4102 Office Supplies Appliances
## Product.Name
## 4102 Hoover Replacement Belt for Commercial Guardsman Heavy-Duty Upright Vacuum
## Sales Quantity Discount Profit
## 4102 0.444 1 0.8 -1.11
Answer : Sales 0.444 comes from segment ‘consumer’, office supplies, and got loss 1.11
2.Which segment buying the most office suppies?
os <- Elips[Elips$Category=="Office Supplies",]
round(prop.table(table(os$Segment))*100,2)
##
## Consumer Corporate Home Office
## 51.89 30.20 17.91
Answer : consumer segment –> 51.86%
3.How many times buyers get the highest discount 80% and when it happened?
dis<-Elips[Elips$Discount==0.8,]
(sort(table(dis$Order.Date),decreasing = T)[1])
## 2014-07-26
## 4
Answer : 80% discount was on July 2014, happened 4 times
4.Related to number 3, who was the buyers?
dis2<-dis[dis$Order.Date=="2014-07-26",]
(table(dis2$Segment))
##
## Consumer
## 4
Answer : all buyer was coming from consumer segment
xtabs(Quantity~Segment+Category,Elips)
## Category
## Segment Furniture Office Supplies Technology
## Consumer 4166 11758 3597
## Corporate 2495 7018 2095
## Home Office 1367 4130 1247
plot(xtabs(Quantity~Segment+Category,Elips))
heatmap(xtabs(Quantity~Segment+Category, Elips), cexRow = 0.8, cexCol = 0.8, scale = "column", Colv = NA, Rowv = NA)
Answer : Based on result above: segmen consumer with category office supply make the highest quantity order with sum 11758
6.Median of quantity for every segment and category?
aggregate(Quantity~Segment+Category,Elips,median)
## Segment Category Quantity
## 1 Consumer Furniture 3
## 2 Corporate Furniture 3
## 3 Home Office Furniture 3
## 4 Consumer Office Supplies 3
## 5 Corporate Office Supplies 3
## 6 Home Office Office Supplies 3
## 7 Consumer Technology 3
## 8 Corporate Technology 3
## 9 Home Office Technology 3
Answer : Median of quantity = 3
7.How much Total quantity order since PT. Elips start selling the product?
sum(Elips$Quantity)
## [1] 37873
8.How much profit in 4 years?
sum(Elips$Profit)
## [1] 286397
9.Discount Percentage (0%) (0-20%) (>20%)
nol<- Elips[Elips$Discount==0,]
knitr::kable(table(nol$Discount))
| Var1 | Freq |
|---|---|
| 0 | 4798 |
more0 <- Elips[Elips$Discount>0 & Elips$Discount<=0.8,]
knitr::kable(table(more0$Discount))
| Var1 | Freq |
|---|---|
| 0.1 | 94 |
| 0.15 | 52 |
| 0.2 | 3657 |
| 0.3 | 227 |
| 0.32 | 27 |
| 0.4 | 206 |
| 0.45 | 11 |
| 0.5 | 66 |
| 0.6 | 138 |
| 0.7 | 418 |
| 0.8 | 300 |
Answer : based on this data, half ammount of data was seeling without any discount given
10.Category which generate the highest profit?
xtabs(Profit~ Category, Elips)
## Category
## Furniture Office Supplies Technology
## 18451.27 122490.80 145454.95
Answer : Technology
Elips[Elips$Profit==-6599.978,]
## Order.Date Ship.Date Ship.Mode Customer.ID Segment Product.ID
## 7773 2016-11-25 2016-12-02 Standard Class CS-12505 Consumer TEC-MA-10000418
## Category Sub.Category Product.Name Sales
## 7773 Technology Machines Cubify CubeX 3D Printer Double Head Print 4499.985
## Quantity Discount Profit
## 7773 5 0.7 -6599.978
Answer :happened in 2016, was selling 5 items of machines, discount 70%
12.Highest Profit in Segment?
xtabs(Profit~ Segment, Elips)
## Segment
## Consumer Corporate Home Office
## 134119.21 91979.13 60298.68
Answer : Answer : Consumer was given the higest profit compare to other segment
13.Sales number based on category ?
xtabs(Sales~ Category, Elips)
## Category
## Furniture Office Supplies Technology
## 741999.8 719047.0 836154.0
Answer : Technology have the highest selling price compare to other
xtabs(Profit~Segment+Ship.Mode,Elips)
## Ship.Mode
## Segment First Class Same Day Second Class Standard Class
## Consumer 21374.044 9874.205 24946.911 77924.049
## Corporate 14464.472 1818.142 18225.713 57470.807
## Home Office 13131.324 4199.412 14274.011 28693.932
plot(xtabs(Profit~Segment+Ship.Mode,Elips))
Answer : Most of buyer was choosing Standar Class for shipping options. Among them consumer was the higest one
15.The highest quantity seller based on Category
xtabs(Quantity~Category,Elips)
## Category
## Furniture Office Supplies Technology
## 8028 22906 6939
Answer : Office Supply
In 4 years, since bigining 2014-2017 PT. Ellips has sold a total of 37873 items with a total profit of 286397.
The company sells 3 main categories: Furniture, Office Supplies and Technology. From this category, ‘Technology’ generated the highest profit with a value of 145454.95 but according to the data recorded the highest loss with a total of 6599,987 which occurred in 2016 also came from the ‘Technology’ category.
There are 3 different segments, such as: Consumer, Corporate and Home Office. ‘Consumer’ gives the highest amount of profit compared to others with 134119.21.
Delivery options consist of 4 types: First Class, Same day, Second class, and Standard class.According to the data, the choice of Standard Class is large enough to be used as a delivery service by all segments, besides that the consumer segment is the highest segment that chooses the Standard class mode.
Recommendation: