Hi !! Welcome to my Rmd :)
in this LBB i will use previous data which is Retail.csv
I hope you enjoy it !
This data is contains of sales record included the profit or loss from a retail company (i named it PT. Elips). PT.Elips is a global retail company who sells Office Supplies, Furniture and Technology stuff.
The buyers are coming from different segment such as Corporate, Home Office until individual consumer. This company also could provide the best option for your delivery which suitable with your needs.
Note : Detail explanatory will be given at the end of content
Make sure our data placed in the same folder our R project data.
Input data is DONE ! then let’s get started
## [1] 9994 15
## [1] "Row.ID" "Order.ID" "Order.Date" "Ship.Date"
## [5] "Ship.Mode" "Customer.ID" "Segment" "Product.ID"
## [9] "Category" "Sub.Category" "Product.Name" "Sales"
## [13] "Quantity" "Discount" "Profit"
From our inspection we can conclude :
* elips data contain 9994 of rows and 15 of coloumns
* Each of column name : “Row.ID”, “Order.ID”, “Order.Date”, “Ship.Date”, “Ship.Mode”, “Customer.ID”, “Segment”,“Product.ID”, “Category”, “Sub.Category”, “Product.Name”, “Sales”, “Quantity”, “Discount”, “Profit”
Check data type for each column
## 'data.frame': 9994 obs. of 15 variables:
## $ Row.ID : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Order.ID : Factor w/ 5009 levels "CA-2014-100006",..: 2501 2501 2297 4373 4373 202 202 202 202 202 ...
## $ Order.Date : Factor w/ 1237 levels "1/1/17","1/10/14",..: 305 305 836 94 94 922 922 922 922 922 ...
## $ Ship.Date : Factor w/ 1334 levels "1/1/15","1/1/16",..: 220 220 907 129 129 897 897 897 897 897 ...
## $ Ship.Mode : Factor w/ 4 levels "First Class",..: 3 3 3 4 4 4 4 4 4 4 ...
## $ Customer.ID : Factor w/ 793 levels "AA-10315","AA-10375",..: 144 144 240 706 706 89 89 89 89 89 ...
## $ Segment : Factor w/ 3 levels "Consumer","Corporate",..: 1 1 2 1 1 1 1 1 1 1 ...
## $ Product.ID : Factor w/ 1862 levels "FUR-BO-10000112",..: 13 56 947 320 1317 186 563 1762 795 438 ...
## $ Category : Factor w/ 3 levels "Furniture","Office Supplies",..: 1 1 2 1 2 1 2 3 2 2 ...
## $ Sub.Category: Factor w/ 17 levels "Accessories",..: 5 6 11 17 15 10 3 14 4 2 ...
## $ Product.Name: Factor w/ 1850 levels "\"While you Were Out\" Message Book, One Form per Page",..: 387 833 1440 367 574 570 1137 1099 535 295 ...
## $ Sales : num 262 731.9 14.6 957.6 22.4 ...
## $ Quantity : int 2 3 2 5 2 7 4 6 3 5 ...
## $ Discount : num 0 0 0 0.45 0.2 0 0 0.2 0.2 0 ...
## $ Profit : num 41.91 219.58 6.87 -383.03 2.52 ...
From this result, we find some of data type not in the corect type. we need to convert it into corect type (data coertion)
elips$Row.ID <- as.character(elips$Row.ID)
elips$Order.ID<- as.character (elips$Order.ID)
elips$Order.Date<- as.Date(elips$Order.Date, "%m/%d/%y")
elips$Ship.Date <- as.Date(elips$Ship.Date, "%m/%d/%y")
elips$Customer.ID<-as.character(elips$Customer.ID)
elips$Product.ID<- as.character(elips$Product.ID)
elips$Product.Name<-as.character(elips$Product.Name)
str(elips)
## 'data.frame': 9994 obs. of 15 variables:
## $ Row.ID : chr "1" "2" "3" "4" ...
## $ Order.ID : chr "CA-2016-152156" "CA-2016-152156" "CA-2016-138688" "US-2015-108966" ...
## $ Order.Date : Date, format: "2016-11-08" "2016-11-08" ...
## $ Ship.Date : Date, format: "2016-11-11" "2016-11-11" ...
## $ Ship.Mode : Factor w/ 4 levels "First Class",..: 3 3 3 4 4 4 4 4 4 4 ...
## $ Customer.ID : chr "CG-12520" "CG-12520" "DV-13045" "SO-20335" ...
## $ Segment : Factor w/ 3 levels "Consumer","Corporate",..: 1 1 2 1 1 1 1 1 1 1 ...
## $ Product.ID : chr "FUR-BO-10001798" "FUR-CH-10000454" "OFF-LA-10000240" "FUR-TA-10000577" ...
## $ Category : Factor w/ 3 levels "Furniture","Office Supplies",..: 1 1 2 1 2 1 2 3 2 2 ...
## $ Sub.Category: Factor w/ 17 levels "Accessories",..: 5 6 11 17 15 10 3 14 4 2 ...
## $ Product.Name: chr "Bush Somerset Collection Bookcase" "Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back" "Self-Adhesive Address Labels for Typewriters by Universal" "Bretford CR4500 Series Slim Rectangular Table" ...
## $ Sales : num 262 731.9 14.6 957.6 22.4 ...
## $ Quantity : int 2 3 2 5 2 7 4 6 3 5 ...
## $ Discount : num 0 0 0 0.45 0.2 0 0 0.2 0.2 0 ...
## $ Profit : num 41.91 219.58 6.87 -383.03 2.52 ...
Each of column already changed into desired data type
Cek for missing value
## Row.ID Order.ID Order.Date Ship.Date Ship.Mode
## 0 0 0 0 0
## Customer.ID Segment Product.ID Category Sub.Category
## 0 0 0 0 0
## Product.Name Sales Quantity Discount Profit
## 0 0 0 0 0
## [1] FALSE
Great!! No missing value
Do subset do delete some column (1&2 because we dont need the informations). then save it into ‘Elips’ variable
Now, Elips dataset is ready to be processed and analyzed
Brief explanation
## Order.Date Ship.Date Ship.Mode
## Min. :2014-01-03 Min. :2014-01-07 First Class :1538
## 1st Qu.:2015-05-23 1st Qu.:2015-05-27 Same Day : 543
## Median :2016-06-26 Median :2016-06-29 Second Class :1945
## Mean :2016-04-30 Mean :2016-05-03 Standard Class:5968
## 3rd Qu.:2017-05-14 3rd Qu.:2017-05-18
## Max. :2017-12-30 Max. :2018-01-05
##
## Customer.ID Segment Product.ID
## Length:9994 Consumer :5191 Length:9994
## Class :character Corporate :3020 Class :character
## Mode :character Home Office:1783 Mode :character
##
##
##
##
## Category Sub.Category Product.Name
## Furniture :2121 Binders :1523 Length:9994
## Office Supplies:6026 Paper :1370 Class :character
## Technology :1847 Furnishings: 957 Mode :character
## Phones : 889
## Storage : 846
## Art : 796
## (Other) :3613
## Sales Quantity Discount Profit
## Min. : 0.444 Min. : 1.00 Min. :0.0000 Min. :-6599.978
## 1st Qu.: 17.280 1st Qu.: 2.00 1st Qu.:0.0000 1st Qu.: 1.729
## Median : 54.490 Median : 3.00 Median :0.2000 Median : 8.666
## Mean : 229.858 Mean : 3.79 Mean :0.1562 Mean : 28.657
## 3rd Qu.: 209.940 3rd Qu.: 5.00 3rd Qu.:0.2000 3rd Qu.: 29.364
## Max. :22638.480 Max. :14.00 Max. :0.8000 Max. : 8399.976
##
Summary :
1. First order occured in Jan 2014
2. Standart Class was the most popular one for shipping option and Same day delivery was the lowest one
3. PT. Elips has the most buyers coming from cunsumer segment and home office segment was the lowest
4. Office Supply category was the best seller compare to others
5. In Sub category, Other stuff was the most favourite one then followed by Binders and paper. Art is the lowest
6. PT. Elips gained the average sales at 229.858; with max sales value at 22638.480 and minumum sales value at 0.444
7. Max quantity order was 14 within 4 years but average at 3.79
8. PT. Elips max amount of discount was 80%, but mean number of discount around 15%
9. PT. Elips ever had loss at 6599 and gained 8399 as their profit with average profit at 28.657
Check the Outlier within profit
From result above, we find posibilities for the outliers, but from our calculation, Sd value is around 200 ( my oppinion its still be tolerated), so the process may continue.
Answer : Sales 0.444 comes from segment ‘consumer’, office supplies, and got loss 1.11
##
## Consumer Corporate Home Office
## 51.89 30.20 17.91
Answer : consumer segment –> 51.89%
## 2014-07-26
## 4
Answer : 80% discount was on July 2014, happened 4 times
##
## Consumer Corporate Home Office
## 4 0 0
Answer : all buyer was coming from consumer segment
## Category
## Segment Furniture Office Supplies Technology
## Consumer 4166 11758 3597
## Corporate 2495 7018 2095
## Home Office 1367 4130 1247
Answer : Based on result above: segmen consumer with category office supply make the highest quantity order with sum 11758
Answer : Median of quantity = 3
## [1] 37873
Answer : 37873 items
## [1] 286397
Answer : 286397
Var1 | Freq |
---|---|
0 | 4798 |
Var1 | Freq |
---|---|
0.1 | 94 |
0.15 | 52 |
0.2 | 3657 |
0.3 | 227 |
0.32 | 27 |
0.4 | 206 |
0.45 | 11 |
0.5 | 66 |
0.6 | 138 |
0.7 | 418 |
0.8 | 300 |
Answer : based on this data, half ammount of data was seeling without any discount given
## Category
## Furniture Office Supplies Technology
## 18451.27 122490.80 145454.95
Answer : Technology
Answer : happened in 2016, was selling 5 items of machines, discount 70%
## Segment
## Consumer Corporate Home Office
## 134119.21 91979.13 60298.68
Answer : Consumer was given the higest profit compare to other segment
## Category
## Furniture Office Supplies Technology
## 741999.8 719047.0 836154.0
Answer : Technology have the highest selling price compare to other
## Ship.Mode
## Segment First Class Same Day Second Class Standard Class
## Consumer 21374.044 9874.205 24946.911 77924.049
## Corporate 14464.472 1818.142 18225.713 57470.807
## Home Office 13131.324 4199.412 14274.011 28693.932
Answer : Most of buyer was choosing Standar Class for shipping options. Among them consumer was the higest one
## Category
## Furniture Office Supplies Technology
## 8028 22906 6939
Answer : Office Supply
In 4 years, since bigining of 2014-2017 PT. Elips have sold total 37873 items with total profit 286397. This company selling 3 main category : Furniture, Office supplies and Technology. From this category, ‘Technology’ generate the highest profit with value 145454.95 compare to others but ironically, Highest loss with ammount 6599.987 happened in 2016 was coming from ‘Technology’ category as well. Apart from that, Technology categoru has the highest selling price.
Buyer who buying the product was coming from different segment, such as : Consumer, Corporate and Home Office. ‘Consumer’ was giving the higest ammount of profit compare to other with 134119.21. Shipping option consist of 4 types : First Class, Same day, Second class, and Standart class. Based on data, from all segment, most of them are choosing Standart Class as their shipping option. Consumer segment is the most highest segment who choosing Standart class mode.
Recomendations :