LBB 2 REV

BACKGROUND

LBB 2 Requirements

In making a report, don’t forget to cover the following:

  • Stages of pre-processing data that are carried out before making a visualization
  • Insight to be conveyed
  • Appropriate type of plot used
  • Aesthetic visualization displayed:
    1. The use of color (fill / color) and theme adjustment (theme)
    2. Providing labels in accordance with information:
      • Label of the x / y axis
      • Title / Subtitle
      • Legend, etc.

Case Study

ABC.com is a marketplace that loves to give out amazing deals. It offers various discounts in every category, wants its customers to get great discounts, and expects the quantity sold to increase as a result. ABC.com works to satisfy its members. It wants fast-moving products, so that members buy in larger quantities while ABC.com stays profitable. This would be a win-win situation for both.

Insight

Discount strategies

  • How much profit does each category earn?
  • How much quantity does each category sell?
  • What is the average discount in each category?
  • How are average discount, quantity, and profit related?

INPUT DATA

Chunk Commentary :

  • read the retail data and store it in the retail table
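
A minimal sketch of this step in base R, assuming the data ships as a CSV file. The temporary file and the toy rows below are hypothetical stand-ins, not the real data. `stringsAsFactors = TRUE` is set explicitly because the structure printed under DATA INSPECTION lists the text columns as factors.

```r
# Hypothetical stand-in for the real file: a tiny CSV with made-up rows.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Row ID,Order ID,Category,Quantity,Discount,Profit",
  "1,T-001,Furniture,2,0,10.5",
  "2,T-001,Furniture,3,0.2,-4",
  "3,T-002,Office Supplies,2,0,6.87"
), csv_path)

# read.csv() turns headers such as "Row ID" into syntactic names ("Row.ID").
retail <- read.csv(csv_path, stringsAsFactors = TRUE)

# Inspect the structure, as the next section does.
str(retail)
```

With the real data, `csv_path` would point at the actual retail file instead of a temporary one.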

DATA INSPECTION

## 'data.frame':    9994 obs. of  15 variables:
##  $ Row.ID      : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Order.ID    : Factor w/ 5009 levels "CA-2014-100006",..: 2501 2501 2297 4373 4373 202 202 202 202 202 ...
##  $ Order.Date  : Factor w/ 1237 levels "1/1/17","1/10/14",..: 305 305 836 94 94 922 922 922 922 922 ...
##  $ Ship.Date   : Factor w/ 1334 levels "1/1/15","1/1/16",..: 220 220 907 129 129 897 897 897 897 897 ...
##  $ Ship.Mode   : Factor w/ 4 levels "First Class",..: 3 3 3 4 4 4 4 4 4 4 ...
##  $ Customer.ID : Factor w/ 793 levels "AA-10315","AA-10375",..: 144 144 240 706 706 89 89 89 89 89 ...
##  $ Segment     : Factor w/ 3 levels "Consumer","Corporate",..: 1 1 2 1 1 1 1 1 1 1 ...
##  $ Product.ID  : Factor w/ 1862 levels "FUR-BO-10000112",..: 13 56 947 320 1317 186 563 1762 795 438 ...
##  $ Category    : Factor w/ 3 levels "Furniture","Office Supplies",..: 1 1 2 1 2 1 2 3 2 2 ...
##  $ Sub.Category: Factor w/ 17 levels "Accessories",..: 5 6 11 17 15 10 3 14 4 2 ...
##  $ Product.Name: Factor w/ 1850 levels "\"While you Were Out\" Message Book, One Form per Page",..: 387 833 1440 367 574 570 1137 1099 535 295 ...
##  $ Sales       : num  262 731.9 14.6 957.6 22.4 ...
##  $ Quantity    : int  2 3 2 5 2 7 4 6 3 5 ...
##  $ Discount    : num  0 0 0 0.45 0.2 0 0 0.2 0.2 0 ...
##  $ Profit      : num  41.91 219.58 6.87 -383.03 2.52 ...

Chunk Commentary : check the structure of the retail table

From our inspection we can conclude :

  • The retail data contains 9994 rows and 15 columns
  • The column names are :
    • Row.ID
    • Order.ID
    • Order.Date
    • Ship.Date
    • Ship.Mode
    • Customer.ID
    • Segment
    • Product.ID
    • Category
    • Sub.Category
    • Product.Name
    • Sales
    • Quantity
    • Discount
    • Profit

DATA CLEANSING AND COERCION

Chunk Commentary :

  • convert the ID columns and Product.Name to character, and the date columns to Date, in the retail table

## 'data.frame':    9994 obs. of  15 variables:
##  $ Row.ID      : chr  "1" "2" "3" "4" ...
##  $ Order.ID    : chr  "CA-2016-152156" "CA-2016-152156" "CA-2016-138688" "US-2015-108966" ...
##  $ Order.Date  : Date, format: "2016-11-08" "2016-11-08" ...
##  $ Ship.Date   : Date, format: "2016-11-11" "2016-11-11" ...
##  $ Ship.Mode   : Factor w/ 4 levels "First Class",..: 3 3 3 4 4 4 4 4 4 4 ...
##  $ Customer.ID : chr  "CG-12520" "CG-12520" "DV-13045" "SO-20335" ...
##  $ Segment     : Factor w/ 3 levels "Consumer","Corporate",..: 1 1 2 1 1 1 1 1 1 1 ...
##  $ Product.ID  : chr  "FUR-BO-10001798" "FUR-CH-10000454" "OFF-LA-10000240" "FUR-TA-10000577" ...
##  $ Category    : Factor w/ 3 levels "Furniture","Office Supplies",..: 1 1 2 1 2 1 2 3 2 2 ...
##  $ Sub.Category: Factor w/ 17 levels "Accessories",..: 5 6 11 17 15 10 3 14 4 2 ...
##  $ Product.Name: chr  "Bush Somerset Collection Bookcase" "Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back" "Self-Adhesive Address Labels for Typewriters by Universal" "Bretford CR4500 Series Slim Rectangular Table" ...
##  $ Sales       : num  262 731.9 14.6 957.6 22.4 ...
##  $ Quantity    : int  2 3 2 5 2 7 4 6 3 5 ...
##  $ Discount    : num  0 0 0 0.45 0.2 0 0 0.2 0.2 0 ...
##  $ Profit      : num  41.91 219.58 6.87 -383.03 2.52 ...

Chunk Commentary :

  • check the new structure of the retail table
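
The coercion step could look roughly like the sketch below. The toy data frame is made up for illustration, and the date format `"%m/%d/%y"` is an assumption based on raw levels such as "1/10/14" in the first structure listing; the original chunk may have used a different helper.

```r
# Toy stand-in for the retail table (made-up rows, factors as in the raw data).
retail <- data.frame(
  Row.ID       = 1:2,
  Order.ID     = factor(c("T-001", "T-002")),
  Order.Date   = factor(c("11/8/16", "6/12/16")),
  Ship.Date    = factor(c("11/11/16", "6/16/16")),
  Customer.ID  = factor(c("C-01", "C-02")),
  Product.ID   = factor(c("P-01", "P-02")),
  Product.Name = factor(c("Toy bookcase", "Toy labels"))
)

# Identifier-like columns carry no category meaning, so store them as text.
id_cols <- c("Row.ID", "Order.ID", "Customer.ID", "Product.ID", "Product.Name")
retail[id_cols] <- lapply(retail[id_cols], as.character)

# Parse the dates; month/day/two-digit-year is assumed here.
retail$Order.Date <- as.Date(as.character(retail$Order.Date), format = "%m/%d/%y")
retail$Ship.Date  <- as.Date(as.character(retail$Ship.Date),  format = "%m/%d/%y")

# Check the new structure.
str(retail)
```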

Check Missing Value

##       Row.ID     Order.ID   Order.Date    Ship.Date    Ship.Mode  Customer.ID 
##            0            0            0            0            0            0 
##      Segment   Product.ID     Category Sub.Category Product.Name        Sales 
##            0            0            0            0            0            0 
##     Quantity     Discount       Profit 
##            0            0            0
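
Per-column counts like the ones above can be produced with `colSums(is.na(...))`. This is a sketch on a made-up toy table, not the original chunk.

```r
# Toy stand-in for the retail table (made-up rows).
retail <- data.frame(
  Category = c("Furniture", "Technology"),
  Quantity = c(2L, 6L),
  Profit   = c(10.5, 20)
)

# is.na() gives a logical matrix; colSums() counts the TRUEs in each column.
na_counts <- colSums(is.na(retail))
print(na_counts)

# Single TRUE/FALSE answer for the whole table.
anyNA(retail)
```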

Chunk Commentary :

  • there are no missing values in the retail table

##  [1] "Order.Date"   "Ship.Date"    "Ship.Mode"    "Customer.ID"  "Segment"     
##  [6] "Product.ID"   "Category"     "Sub.Category" "Product.Name" "Sales"       
## [11] "Quantity"     "Discount"     "Profit"

Chunk Commentary :

  • we drop two columns, Row.ID and Order.ID, because we do not use them
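
One way to drop the two columns in base R is sketched below on a made-up toy table; the original chunk may have used a different approach.

```r
# Toy stand-in for the retail table (made-up rows).
retail <- data.frame(
  Row.ID   = c("1", "2"),
  Order.ID = c("T-001", "T-002"),
  Category = c("Furniture", "Technology"),
  Profit   = c(10.5, 20)
)

# Keep every column whose name is not in the drop list.
drop_cols <- c("Row.ID", "Order.ID")
retail <- retail[, !(names(retail) %in% drop_cols), drop = FALSE]

# List the remaining column names.
names(retail)
```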

DATA MANIPULATION

Chunk Commentary :

  • aggregate Discount by Category using the mean, and store the result in a data frame
  • aggregate Profit by Category using the sum, and store the result in a data frame
  • aggregate Quantity by Category using the sum, and store the result in a data frame
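
The three aggregations can be sketched with base R's `aggregate()` and its formula interface. The toy rows and the result names (`disc_cat`, `profit_cat`, `qty_cat`) are hypothetical; only the column names come from the report.

```r
# Toy stand-in for the retail table (made-up rows).
retail <- data.frame(
  Category = c("Furniture", "Furniture", "Technology"),
  Quantity = c(2, 3, 6),
  Discount = c(0, 0.2, 0.2),
  Profit   = c(10, -4, 20)
)

# Mean discount per category.
disc_cat <- aggregate(Discount ~ Category, data = retail, FUN = mean)

# Total profit per category.
profit_cat <- aggregate(Profit ~ Category, data = retail, FUN = sum)

# Total quantity per category.
qty_cat <- aggregate(Quantity ~ Category, data = retail, FUN = sum)

print(qty_cat)
```

Each result is a two-column data frame with one row per category, which is the shape of the tables printed in the following section.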

DATA EXPLANATION

How much quantity does each category sell?

##          Category Quantity
## 1       Furniture     8028
## 2 Office Supplies    22906
## 3      Technology     6939

Chunk Commentary :

Quantity per category:

  • Furniture has a quantity sold of 8,028
  • Office Supplies has a quantity sold of 22,906
  • Technology has a quantity sold of 6,939
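
A bar chart is one suitable plot for these totals. The sketch below assumes the ggplot2 package is installed, since the requirements mention fill / color and theme; the title and axis wording are placeholders, and the numbers are copied from the table above.

```r
library(ggplot2)

# Quantity totals copied from the table above.
qty_cat <- data.frame(
  Category = c("Furniture", "Office Supplies", "Technology"),
  Quantity = c(8028, 22906, 6939)
)

# Bars ordered by quantity, one fill color per category, labeled axes and title.
p <- ggplot(qty_cat, aes(x = reorder(Category, Quantity), y = Quantity, fill = Category)) +
  geom_col(show.legend = FALSE) +
  coord_flip() +
  labs(
    title = "Quantity sold per category",  # placeholder wording
    x = NULL,
    y = "Quantity sold"
  ) +
  theme_minimal()

print(p)
```

The legend is hidden because the category names already appear on the axis; the same pattern would apply to the profit and average discount tables.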

How much profit does each category earn?

##          Category    Profit
## 1       Furniture  18451.27
## 2 Office Supplies 122490.80
## 3      Technology 145454.95

Chunk Commentary :

Profit per category:

  • Furniture has a profit of 18451.27
  • Office Supplies has a profit of 122490.80
  • Technology has a profit of 145454.95

What is the average discount in each category?

##          Category  Discount
## 1       Furniture 0.1739227
## 2 Office Supplies 0.1572851
## 3      Technology 0.1323227

Chunk Commentary :

Average discount per category:

  • Furniture has an average discount of 0.1739227
  • Office Supplies has an average discount of 0.1572851
  • Technology has an average discount of 0.1323227

BUSINESS RECOMMENDATION

  • Even though Furniture gives the biggest average discount of all categories (about 17%), it earns the least profit
  • Meanwhile, Technology gives the smallest average discount (about 13%) and turns out to be the category with the biggest profit
  • Office Supplies offers an average discount of about 15%, contributes the second-highest profit, and sells the largest quantity of all categories
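
The comparison behind these points can be checked by putting the three summary tables side by side. This is a sketch using the figures reported above; the `Profit.per.Unit` column is an extra illustration that is not part of the original analysis, and with only three categories it describes a pattern rather than a statistical relationship.

```r
# Figures copied from the three summary tables in this report.
summary_cat <- data.frame(
  Category = c("Furniture", "Office Supplies", "Technology"),
  Quantity = c(8028, 22906, 6939),
  Profit   = c(18451.27, 122490.80, 145454.95),
  Discount = c(0.1739227, 0.1572851, 0.1323227)
)

# Illustrative extra column: profit earned per unit sold.
summary_cat$Profit.per.Unit <- summary_cat$Profit / summary_cat$Quantity

# Sort from the highest to the lowest average discount.
summary_cat <- summary_cat[order(-summary_cat$Discount), ]
print(summary_cat)
```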

Lucky Putranto

15/3/2020