A template on how data analysis work
Explain the data in short paragraph
This data is contains of sales record included the profit or loss from a retail company (called PT. retail). PT.retail is a global retail company who sells Office Supplies, Furniture and Technology stuff.
The buyers are coming from different segment such as Corporate, Home Office until individual consumer. This company also could provide the best option for your delivery which suitable with your needs.
List severall business questions to be answered using the data
Make sure our data placed in the same folder our R project data.
Check the structure and type of data
## 'data.frame': 9994 obs. of 15 variables:
## $ Row.ID : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Order.ID : Factor w/ 5009 levels "CA-2014-100006",..: 2501 2501 2297 4373 4373 202 202 202 202 202 ...
## $ Order.Date : Factor w/ 1237 levels "1/1/17","1/10/14",..: 305 305 836 94 94 922 922 922 922 922 ...
## $ Ship.Date : Factor w/ 1334 levels "1/1/15","1/1/16",..: 220 220 907 129 129 897 897 897 897 897 ...
## $ Ship.Mode : Factor w/ 4 levels "First Class",..: 3 3 3 4 4 4 4 4 4 4 ...
## $ Customer.ID : Factor w/ 793 levels "AA-10315","AA-10375",..: 144 144 240 706 706 89 89 89 89 89 ...
## $ Segment : Factor w/ 3 levels "Consumer","Corporate",..: 1 1 2 1 1 1 1 1 1 1 ...
## $ Product.ID : Factor w/ 1862 levels "FUR-BO-10000112",..: 13 56 947 320 1317 186 563 1762 795 438 ...
## $ Category : Factor w/ 3 levels "Furniture","Office Supplies",..: 1 1 2 1 2 1 2 3 2 2 ...
## $ Sub.Category: Factor w/ 17 levels "Accessories",..: 5 6 11 17 15 10 3 14 4 2 ...
## $ Product.Name: Factor w/ 1850 levels "\"While you Were Out\" Message Book, One Form per Page",..: 387 833 1440 367 574 570 1137 1099 535 295 ...
## $ Sales : num 262 731.9 14.6 957.6 22.4 ...
## $ Quantity : int 2 3 2 5 2 7 4 6 3 5 ...
## $ Discount : num 0 0 0 0.45 0.2 0 0 0.2 0.2 0 ...
## $ Profit : num 41.91 219.58 6.87 -383.03 2.52 ...
From our inspection we can conclude :
* retail data contain 9994 of rows and 15 of columns
* From this result, we find some of data type not in the corect type. we need to convert it into corect type (data coertion)
Change any variable that has improper data type
retail$Row.ID <- as.character(retail$Row.ID)
retail$Order.ID<- as.character (retail$Order.ID)
retail$Customer.ID<-as.character(retail$Customer.ID)
retail$Product.ID<- as.character(retail$Product.ID)
retail$Product.Name<-as.character(retail$Product.Name)
str(retail)## 'data.frame': 9994 obs. of 15 variables:
## $ Row.ID : chr "1" "2" "3" "4" ...
## $ Order.ID : chr "CA-2016-152156" "CA-2016-152156" "CA-2016-138688" "US-2015-108966" ...
## $ Order.Date : Factor w/ 1237 levels "1/1/17","1/10/14",..: 305 305 836 94 94 922 922 922 922 922 ...
## $ Ship.Date : Factor w/ 1334 levels "1/1/15","1/1/16",..: 220 220 907 129 129 897 897 897 897 897 ...
## $ Ship.Mode : Factor w/ 4 levels "First Class",..: 3 3 3 4 4 4 4 4 4 4 ...
## $ Customer.ID : chr "CG-12520" "CG-12520" "DV-13045" "SO-20335" ...
## $ Segment : Factor w/ 3 levels "Consumer","Corporate",..: 1 1 2 1 1 1 1 1 1 1 ...
## $ Product.ID : chr "FUR-BO-10001798" "FUR-CH-10000454" "OFF-LA-10000240" "FUR-TA-10000577" ...
## $ Category : Factor w/ 3 levels "Furniture","Office Supplies",..: 1 1 2 1 2 1 2 3 2 2 ...
## $ Sub.Category: Factor w/ 17 levels "Accessories",..: 5 6 11 17 15 10 3 14 4 2 ...
## $ Product.Name: chr "Bush Somerset Collection Bookcase" "Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back" "Self-Adhesive Address Labels for Typewriters by Universal" "Bretford CR4500 Series Slim Rectangular Table" ...
## $ Sales : num 262 731.9 14.6 957.6 22.4 ...
## $ Quantity : int 2 3 2 5 2 7 4 6 3 5 ...
## $ Discount : num 0 0 0 0.45 0.2 0 0 0.2 0.2 0 ...
## $ Profit : num 41.91 219.58 6.87 -383.03 2.52 ...
Each of column already changed into desired data type
Do subset to delete some column (1&2 because we dont need the informations). then save it into ‘retail’ variable
Now, retail dataset is ready to be processed and analyzed
Let’s summarise the data
## Order.Date Ship.Date Ship.Mode Customer.ID
## 9/5/16 : 38 12/16/15: 35 First Class :1538 Length:9994
## 9/2/17 : 36 9/26/17 : 34 Same Day : 543 Class :character
## 11/10/16: 35 11/21/17: 32 Second Class :1945 Mode :character
## 12/1/17 : 34 12/6/17 : 32 Standard Class:5968
## 12/2/17 : 34 12/12/17: 30
## 12/9/17 : 33 9/15/17 : 30
## (Other) :9784 (Other) :9801
## Segment Product.ID Category
## Consumer :5191 Length:9994 Furniture :2121
## Corporate :3020 Class :character Office Supplies:6026
## Home Office:1783 Mode :character Technology :1847
##
##
##
##
## Sub.Category Product.Name Sales Quantity
## Binders :1523 Length:9994 Min. : 0.444 Min. : 1.00
## Paper :1370 Class :character 1st Qu.: 17.280 1st Qu.: 2.00
## Furnishings: 957 Mode :character Median : 54.490 Median : 3.00
## Phones : 889 Mean : 229.858 Mean : 3.79
## Storage : 846 3rd Qu.: 209.940 3rd Qu.: 5.00
## Art : 796 Max. :22638.480 Max. :14.00
## (Other) :3613
## Discount Profit
## Min. :0.0000 Min. :-6599.978
## 1st Qu.:0.0000 1st Qu.: 1.729
## Median :0.2000 Median : 8.666
## Mean :0.1562 Mean : 28.657
## 3rd Qu.:0.2000 3rd Qu.: 29.364
## Max. :0.8000 Max. : 8399.976
##
Summary :
Answer : Lowest sales comes from segment ‘consumer’, office supplies, and got loss 1.11
##
## Consumer Corporate Home Office
## 0.5189180 0.3020246 0.1790574
Answer : consumer segment –> 51.89%
## Category
## Segment Furniture Office Supplies Technology
## Consumer 4166 11758 3597
## Corporate 2495 7018 2095
## Home Office 1367 4130 1247
Answer : Based on result above: segmen consumer with category office supply make the highest quantity order with sum 11758
Answer : Median of quantity = 3
## [1] 37873
Answer : 37873 items
## [1] 286397
Answer : 286397
## Category
## Furniture Office Supplies Technology
## 18451.27 122490.80 145454.95
Answer : Technology
Answer : happened in 2016, was selling 5 items of machines, discount 70%
## Segment
## Consumer Corporate Home Office
## 134119.21 91979.13 60298.68
Answer : Consumer was given the higest profit compare to other segment
## Category
## Furniture Office Supplies Technology
## 741999.8 719047.0 836154.0
Answer : Technology have the highest selling price compare to other
## Ship.Mode
## Segment First Class Same Day Second Class Standard Class
## Consumer 21374.044 9874.205 24946.911 77924.049
## Corporate 14464.472 1818.142 18225.713 57470.807
## Home Office 13131.324 4199.412 14274.011 28693.932
Answer : Most of buyer was choosing Standar Class for shipping options. Among them consumer was the higest one
## Category
## Furniture Office Supplies Technology
## 8028 22906 6939
Answer : Office Supply
In 4 years, since bigining of 2014-2017 PT. retail have sold total 37873 items with total profit 286397. This company selling 3 main category : Furniture, Office supplies and Technology. From this category, ‘Technology’ generate the highest profit with value 145454.95 compare to others but ironically, Highest loss with ammount 6599.987 happened in 2016 was coming from ‘Technology’ category as well. Apart from that, Technology category has the highest selling price.
Buyer who bought the product was coming from different segment, such as : Consumer, Corporate and Home Office. ‘Consumer’ was giving the higest ammount of profit compare to other with 134119.21. Shipping option consist of 4 types : First Class, Same day, Second class, and Standart class. Based on data, from all segment, most of them are choosing Standart Class as their shipping option. Consumer segment is the most highest segment who choosing Standart class mode.
Recomendations :
From our calculation above, showing that Technology was selling with highest sales value but contradictory max loss was coming from Technology section as well. This is caused because discound given was too hight (80%) without think about the provit.Consider it, No discount given without making some profit!! Calculate twice before deciding how many percentage of discount would be given
Need to increase market target at Corporate segment and home office segment, we may give them more discount with taking back the profit to company
Make sure all category was safely supplied on the wareshouse specially for ffice supply items.Based on our calculation, buyers mostly buyed office supply category with 22906, it almost 2-3 times higer from other category