Explanation

This data is contains of sales record included the profit or loss from a retail company (i named it PT. Elips). PT.Elips is a global retail company who sells Office Supplies, Furniture and Technology stuff. The buyers are coming from different segment such as Corporate, Home Office until individual consumer. This company also could provide the best option for your delivery which suitable with your needs.

Note : Detail explanatory will be given at the end of content

Input Data

elips<- read.csv("data_input/retail.csv")

Data Inspection

head(elips)
##   Row.ID       Order.ID Order.Date Ship.Date      Ship.Mode Customer.ID
## 1      1 CA-2016-152156    11/8/16  11/11/16   Second Class    CG-12520
## 2      2 CA-2016-152156    11/8/16  11/11/16   Second Class    CG-12520
## 3      3 CA-2016-138688    6/12/16   6/16/16   Second Class    DV-13045
## 4      4 US-2015-108966   10/11/15  10/18/15 Standard Class    SO-20335
## 5      5 US-2015-108966   10/11/15  10/18/15 Standard Class    SO-20335
## 6      6 CA-2014-115812     6/9/14   6/14/14 Standard Class    BH-11710
##     Segment      Product.ID        Category Sub.Category
## 1  Consumer FUR-BO-10001798       Furniture    Bookcases
## 2  Consumer FUR-CH-10000454       Furniture       Chairs
## 3 Corporate OFF-LA-10000240 Office Supplies       Labels
## 4  Consumer FUR-TA-10000577       Furniture       Tables
## 5  Consumer OFF-ST-10000760 Office Supplies      Storage
## 6  Consumer FUR-FU-10001487       Furniture  Furnishings
##                                                       Product.Name    Sales
## 1                                Bush Somerset Collection Bookcase 261.9600
## 2      Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back 731.9400
## 3        Self-Adhesive Address Labels for Typewriters by Universal  14.6200
## 4                    Bretford CR4500 Series Slim Rectangular Table 957.5775
## 5                                   Eldon Fold 'N Roll Cart System  22.3680
## 6 Eldon Expressions Wood and Plastic Desk Accessories, Cherry Wood  48.8600
##   Quantity Discount    Profit
## 1        2     0.00   41.9136
## 2        3     0.00  219.5820
## 3        2     0.00    6.8714
## 4        5     0.45 -383.0310
## 5        2     0.20    2.5164
## 6        7     0.00   14.1694
tail(elips)
##      Row.ID       Order.ID Order.Date Ship.Date      Ship.Mode Customer.ID
## 9989   9989 CA-2017-163629   11/17/17  11/21/17 Standard Class    RA-19885
## 9990   9990 CA-2014-110422    1/21/14   1/23/14   Second Class    TB-21400
## 9991   9991 CA-2017-121258    2/26/17    3/3/17 Standard Class    DB-13060
## 9992   9992 CA-2017-121258    2/26/17    3/3/17 Standard Class    DB-13060
## 9993   9993 CA-2017-121258    2/26/17    3/3/17 Standard Class    DB-13060
## 9994   9994 CA-2017-119914     5/4/17    5/9/17   Second Class    CC-12220
##        Segment      Product.ID        Category Sub.Category
## 9989 Corporate TEC-PH-10004006      Technology       Phones
## 9990  Consumer FUR-FU-10001889       Furniture  Furnishings
## 9991  Consumer FUR-FU-10000747       Furniture  Furnishings
## 9992  Consumer TEC-PH-10003645      Technology       Phones
## 9993  Consumer OFF-PA-10004041 Office Supplies        Paper
## 9994  Consumer OFF-AP-10002684 Office Supplies   Appliances
##                                                                   Product.Name
## 9989                                           Panasonic KX - TS880B Telephone
## 9990                                                    Ultra Door Pull Handle
## 9991                        Tenex B1-RE Series Chair Mats for Low Pile Carpets
## 9992                                                     Aastra 57i VoIP phone
## 9993                         It's Hot Message Books with Stickers, 2 3/4" x 5"
## 9994 Acco 7-Outlet Masterpiece Power Center, Wihtout Fax/Phone Line Protection
##        Sales Quantity Discount  Profit
## 9989 206.100        5      0.0 55.6470
## 9990  25.248        3      0.2  4.1028
## 9991  91.960        2      0.0 15.6332
## 9992 258.576        2      0.2 19.3932
## 9993  29.600        4      0.0 13.3200
## 9994 243.160        2      0.0 72.9480
dim(elips)
## [1] 9994   15
names(elips)
##  [1] "Row.ID"       "Order.ID"     "Order.Date"   "Ship.Date"    "Ship.Mode"   
##  [6] "Customer.ID"  "Segment"      "Product.ID"   "Category"     "Sub.Category"
## [11] "Product.Name" "Sales"        "Quantity"     "Discount"     "Profit"

From our inspection we can conclude : * elips data contain 9994 of rows and 15 of coloumns * Each of column name : “Row.ID”, “Order.ID”, “Order.Date”, “Ship.Date”, “Ship.Mode”, “Customer.ID”, “Segment”,“Product.ID”, “Category”, “Sub.Category”, “Product.Name”, “Sales”, “Quantity”, “Discount”, “Profit”

Data Cleansing&Coertions

Check data type for each coloum

str(elips)
## 'data.frame':    9994 obs. of  15 variables:
##  $ Row.ID      : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Order.ID    : chr  "CA-2016-152156" "CA-2016-152156" "CA-2016-138688" "US-2015-108966" ...
##  $ Order.Date  : chr  "11/8/16" "11/8/16" "6/12/16" "10/11/15" ...
##  $ Ship.Date   : chr  "11/11/16" "11/11/16" "6/16/16" "10/18/15" ...
##  $ Ship.Mode   : chr  "Second Class" "Second Class" "Second Class" "Standard Class" ...
##  $ Customer.ID : chr  "CG-12520" "CG-12520" "DV-13045" "SO-20335" ...
##  $ Segment     : chr  "Consumer" "Consumer" "Corporate" "Consumer" ...
##  $ Product.ID  : chr  "FUR-BO-10001798" "FUR-CH-10000454" "OFF-LA-10000240" "FUR-TA-10000577" ...
##  $ Category    : chr  "Furniture" "Furniture" "Office Supplies" "Furniture" ...
##  $ Sub.Category: chr  "Bookcases" "Chairs" "Labels" "Tables" ...
##  $ Product.Name: chr  "Bush Somerset Collection Bookcase" "Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back" "Self-Adhesive Address Labels for Typewriters by Universal" "Bretford CR4500 Series Slim Rectangular Table" ...
##  $ Sales       : num  262 731.9 14.6 957.6 22.4 ...
##  $ Quantity    : int  2 3 2 5 2 7 4 6 3 5 ...
##  $ Discount    : num  0 0 0 0.45 0.2 0 0 0.2 0.2 0 ...
##  $ Profit      : num  41.91 219.58 6.87 -383.03 2.52 ...

From this result, we find some of data type not in the corect type. we need to convert it into corect type (data coertion)

elips$Row.ID <- as.character(elips$Row.ID)
elips$Order.ID<- as.character (elips$Order.ID)
elips$Order.Date<- as.Date(elips$Order.Date, "%m/%d/%y")
elips$Ship.Date <- as.Date(elips$Ship.Date, "%m/%d/%y")
elips$Customer.ID<-as.character(elips$Customer.ID)
elips$Product.ID<- as.character(elips$Product.ID)
elips$Product.Name<-as.character(elips$Product.Name)

str(elips)
## 'data.frame':    9994 obs. of  15 variables:
##  $ Row.ID      : chr  "1" "2" "3" "4" ...
##  $ Order.ID    : chr  "CA-2016-152156" "CA-2016-152156" "CA-2016-138688" "US-2015-108966" ...
##  $ Order.Date  : Date, format: "2016-11-08" "2016-11-08" ...
##  $ Ship.Date   : Date, format: "2016-11-11" "2016-11-11" ...
##  $ Ship.Mode   : chr  "Second Class" "Second Class" "Second Class" "Standard Class" ...
##  $ Customer.ID : chr  "CG-12520" "CG-12520" "DV-13045" "SO-20335" ...
##  $ Segment     : chr  "Consumer" "Consumer" "Corporate" "Consumer" ...
##  $ Product.ID  : chr  "FUR-BO-10001798" "FUR-CH-10000454" "OFF-LA-10000240" "FUR-TA-10000577" ...
##  $ Category    : chr  "Furniture" "Furniture" "Office Supplies" "Furniture" ...
##  $ Sub.Category: chr  "Bookcases" "Chairs" "Labels" "Tables" ...
##  $ Product.Name: chr  "Bush Somerset Collection Bookcase" "Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back" "Self-Adhesive Address Labels for Typewriters by Universal" "Bretford CR4500 Series Slim Rectangular Table" ...
##  $ Sales       : num  262 731.9 14.6 957.6 22.4 ...
##  $ Quantity    : int  2 3 2 5 2 7 4 6 3 5 ...
##  $ Discount    : num  0 0 0 0.45 0.2 0 0 0.2 0.2 0 ...
##  $ Profit      : num  41.91 219.58 6.87 -383.03 2.52 ...

Each of column already changed into desired data type

Data Explanation

Cek for missing value

colSums(is.na(elips))
##       Row.ID     Order.ID   Order.Date    Ship.Date    Ship.Mode  Customer.ID 
##            0            0            0            0            0            0 
##      Segment   Product.ID     Category Sub.Category Product.Name        Sales 
##            0            0            0            0            0            0 
##     Quantity     Discount       Profit 
##            0            0            0
anyNA(elips)
## [1] FALSE
Elips<- elips[,c(3:15)]
head(Elips)
##   Order.Date  Ship.Date      Ship.Mode Customer.ID   Segment      Product.ID
## 1 2016-11-08 2016-11-11   Second Class    CG-12520  Consumer FUR-BO-10001798
## 2 2016-11-08 2016-11-11   Second Class    CG-12520  Consumer FUR-CH-10000454
## 3 2016-06-12 2016-06-16   Second Class    DV-13045 Corporate OFF-LA-10000240
## 4 2015-10-11 2015-10-18 Standard Class    SO-20335  Consumer FUR-TA-10000577
## 5 2015-10-11 2015-10-18 Standard Class    SO-20335  Consumer OFF-ST-10000760
## 6 2014-06-09 2014-06-14 Standard Class    BH-11710  Consumer FUR-FU-10001487
##          Category Sub.Category
## 1       Furniture    Bookcases
## 2       Furniture       Chairs
## 3 Office Supplies       Labels
## 4       Furniture       Tables
## 5 Office Supplies      Storage
## 6       Furniture  Furnishings
##                                                       Product.Name    Sales
## 1                                Bush Somerset Collection Bookcase 261.9600
## 2      Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back 731.9400
## 3        Self-Adhesive Address Labels for Typewriters by Universal  14.6200
## 4                    Bretford CR4500 Series Slim Rectangular Table 957.5775
## 5                                   Eldon Fold 'N Roll Cart System  22.3680
## 6 Eldon Expressions Wood and Plastic Desk Accessories, Cherry Wood  48.8600
##   Quantity Discount    Profit
## 1        2     0.00   41.9136
## 2        3     0.00  219.5820
## 3        2     0.00    6.8714
## 4        5     0.45 -383.0310
## 5        2     0.20    2.5164
## 6        7     0.00   14.1694

Data Explanation

Brief explanation

summary(Elips)
##    Order.Date           Ship.Date           Ship.Mode        
##  Min.   :2014-01-03   Min.   :2014-01-07   Length:9994       
##  1st Qu.:2015-05-23   1st Qu.:2015-05-27   Class :character  
##  Median :2016-06-26   Median :2016-06-29   Mode  :character  
##  Mean   :2016-04-30   Mean   :2016-05-03                     
##  3rd Qu.:2017-05-14   3rd Qu.:2017-05-18                     
##  Max.   :2017-12-30   Max.   :2018-01-05                     
##  Customer.ID          Segment           Product.ID          Category        
##  Length:9994        Length:9994        Length:9994        Length:9994       
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##  Sub.Category       Product.Name           Sales              Quantity    
##  Length:9994        Length:9994        Min.   :    0.444   Min.   : 1.00  
##  Class :character   Class :character   1st Qu.:   17.280   1st Qu.: 2.00  
##  Mode  :character   Mode  :character   Median :   54.490   Median : 3.00  
##                                        Mean   :  229.858   Mean   : 3.79  
##                                        3rd Qu.:  209.940   3rd Qu.: 5.00  
##                                        Max.   :22638.480   Max.   :14.00  
##     Discount          Profit         
##  Min.   :0.0000   Min.   :-6599.978  
##  1st Qu.:0.0000   1st Qu.:    1.729  
##  Median :0.2000   Median :    8.666  
##  Mean   :0.1562   Mean   :   28.657  
##  3rd Qu.:0.2000   3rd Qu.:   29.364  
##  Max.   :0.8000   Max.   : 8399.976

Summary : 1. First order occured in Jan 2014 2. Standart Class was the most popular one for shipping option and Same day delivery was the lowest one 3. PT. Elips has the most buyers coming from cunsumer segment and home office segment was the lowest 4. Office Supply category was the best seller compare to others 5. In Sub category, Other stuff was the most favourite one then followed by Binders and paper. Art is the lowest 6. PT. Elips gained the average sales at 229.858; with max sales value at 22638.480 and minumum sales value at 0.444 7. Max quantity order was 14 within 4 years but average at 3.79 8. PT. Elips max amount of discount was 80%, but mean number of discount around 15% 9. PT. Elips ever had loss at 6599 and gained 8399 as their profit with average profit at 28.657

Check the Outlier within profit

aggregate(Profit ~ Segment, Elips, mean)
##       Segment   Profit
## 1    Consumer 25.83687
## 2   Corporate 30.45667
## 3 Home Office 33.81866
aggregate(Profit~Segment,Elips,var)
##       Segment   Profit
## 1    Consumer 58851.86
## 2   Corporate 53817.63
## 3 Home Office 45105.89
aggregate(Profit~Segment,Elips,sd)
##       Segment   Profit
## 1    Consumer 242.5940
## 2   Corporate 231.9863
## 3 Home Office 212.3815
boxplot(Elips$Profit)

From result above, we find posibilities for the outliers, but from our calculation, Sd value is around 200 ( my oppinion its still be tolerated), so the process may continue.

Data Manipulation & Transformation

1.Which category gives the lowest sales number? how much profit?

Elips[Elips$Sales == 0.444,]
##      Order.Date  Ship.Date      Ship.Mode Customer.ID  Segment      Product.ID
## 4102 2017-06-19 2017-06-23 Standard Class    ZC-21910 Consumer OFF-AP-10002906
##             Category Sub.Category
## 4102 Office Supplies   Appliances
##                                                                    Product.Name
## 4102 Hoover Replacement Belt for Commercial Guardsman Heavy-Duty Upright Vacuum
##      Sales Quantity Discount Profit
## 4102 0.444        1      0.8  -1.11

Answer : Sales 0.444 comes from segment ‘consumer’, office supplies, and got loss 1.11

2.Which segment buying the most office suppies?

os <- Elips[Elips$Category=="Office Supplies",]
round(prop.table(table(os$Segment))*100,2)
## 
##    Consumer   Corporate Home Office 
##       51.89       30.20       17.91

Answer : consumer segment –> 51.86%

3.How many times buyers get the highest discount 80% and when it happened?

dis<-Elips[Elips$Discount==0.8,]
(sort(table(dis$Order.Date),decreasing = T)[1])
## 2014-07-26 
##          4

Answer : 80% discount was on July 2014, happened 4 times

4.Related to number 3, who was the buyers?

dis2<-dis[dis$Order.Date=="2014-07-26",]
(table(dis2$Segment))
## 
## Consumer 
##        4

Answer : all buyer was coming from consumer segment

  1. How much total of quantity order from each segment and category, and which is the highest?
xtabs(Quantity~Segment+Category,Elips)
##              Category
## Segment       Furniture Office Supplies Technology
##   Consumer         4166           11758       3597
##   Corporate        2495            7018       2095
##   Home Office      1367            4130       1247
plot(xtabs(Quantity~Segment+Category,Elips))

heatmap(xtabs(Quantity~Segment+Category, Elips), cexRow = 0.8, cexCol = 0.8, scale = "column", Colv = NA, Rowv = NA)

Answer : Based on result above: segmen consumer with category office supply make the highest quantity order with sum 11758

6.Median of quantity for every segment and category?

aggregate(Quantity~Segment+Category,Elips,median)
##       Segment        Category Quantity
## 1    Consumer       Furniture        3
## 2   Corporate       Furniture        3
## 3 Home Office       Furniture        3
## 4    Consumer Office Supplies        3
## 5   Corporate Office Supplies        3
## 6 Home Office Office Supplies        3
## 7    Consumer      Technology        3
## 8   Corporate      Technology        3
## 9 Home Office      Technology        3

Answer : Median of quantity = 3

7.How much Total quantity order since PT. Elips start selling the product?

sum(Elips$Quantity)
## [1] 37873

8.How much profit in 4 years?

sum(Elips$Profit)
## [1] 286397

9.Discount Percentage (0%) (0-20%) (>20%)

nol<- Elips[Elips$Discount==0,]
knitr::kable(table(nol$Discount))
Var1 Freq
0 4798
more0 <- Elips[Elips$Discount>0 & Elips$Discount<=0.8,]
knitr::kable(table(more0$Discount))
Var1 Freq
0.1 94
0.15 52
0.2 3657
0.3 227
0.32 27
0.4 206
0.45 11
0.5 66
0.6 138
0.7 418
0.8 300

Answer : based on this data, half ammount of data was seeling without any discount given

10.Category which generate the highest profit?

xtabs(Profit~ Category, Elips)
## Category
##       Furniture Office Supplies      Technology 
##        18451.27       122490.80       145454.95

Answer : Technology

  1. Highest Loss in which transaction?
Elips[Elips$Profit==-6599.978,]
##      Order.Date  Ship.Date      Ship.Mode Customer.ID  Segment      Product.ID
## 7773 2016-11-25 2016-12-02 Standard Class    CS-12505 Consumer TEC-MA-10000418
##        Category Sub.Category                              Product.Name    Sales
## 7773 Technology     Machines Cubify CubeX 3D Printer Double Head Print 4499.985
##      Quantity Discount    Profit
## 7773        5      0.7 -6599.978

Answer :happened in 2016, was selling 5 items of machines, discount 70%

12.Highest Profit in Segment?

xtabs(Profit~ Segment, Elips)
## Segment
##    Consumer   Corporate Home Office 
##   134119.21    91979.13    60298.68

Answer : Answer : Consumer was given the higest profit compare to other segment

13.Sales number based on category ?

xtabs(Sales~ Category, Elips)
## Category
##       Furniture Office Supplies      Technology 
##        741999.8        719047.0        836154.0

Answer : Technology have the highest selling price compare to other

  1. Profit based on Ship Mode + segment?
xtabs(Profit~Segment+Ship.Mode,Elips)
##              Ship.Mode
## Segment       First Class  Same Day Second Class Standard Class
##   Consumer      21374.044  9874.205    24946.911      77924.049
##   Corporate     14464.472  1818.142    18225.713      57470.807
##   Home Office   13131.324  4199.412    14274.011      28693.932
plot(xtabs(Profit~Segment+Ship.Mode,Elips))

Answer : Most of buyer was choosing Standar Class for shipping options. Among them consumer was the higest one

15.The highest quantity seller based on Category

xtabs(Quantity~Category,Elips)
## Category
##       Furniture Office Supplies      Technology 
##            8028           22906            6939

Answer : Office Supply

Explanatory Text & Business Recomendation

Recommendation:

  1. From our calculations above, it shows that Technology sells with the highest sales value but the contradictory maximum loss also comes from the Technology section. This is because the discount given is too high (80%) without thinking about the benefits. Need to re-evaluate and recalculate carefully before deciding on giving discounts in the future.
  2. Big opportunity is in target market in Corporate segment and home office segment, we can give them more discount by taking back profit for company
  3. Ensure all categories are safely supplied in a dedicated warehouse for office supplies. Based on our calculations, the purchase of office supplies category with 22906, almost 2-3 times higher than other categories