1 RETAIL

2 Penjelasan

Data Retail merupakan data tahun 2014-2017 yang berasal dari sebuah perusahaan retail di mana datanya berfokus pada 3 kategori produk yaitu furnitur, peralatan kantor, dan teknologi. Berikut penjelasan masing-masing kolomnya:

  • Order.ID ID transaksi
  • Order.Date Tanggal transaksi
  • Ship.Date Tanggal pengiriman
  • Ship.Mode Jenis pengiriman
  • Customer.ID ID pelanggan
  • Segment Segmen pelanggan
  • Product.ID ID produk
  • Category Kategori produk (furnitur, peralatan kantor, teknologi)
  • Sub-Category Jenis dari tiap kategory produk
  • Product.Name Nama produk
  • Sales Revenue dari transaksi
  • Quantity Jumlah produk terjual
  • Discount Diskon tiap produk tiap transaksi
  • Profit Keuntungan perusahaan tiap transaksi per produk

3 Eksplorasi Data

  • “Install Packages”
library(readr)
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.2.2
library(scales)
## 
## Attaching package: 'scales'
## The following object is masked from 'package:readr':
## 
##     col_factor
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(lubridate)
## 
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
## 
##     date, intersect, setdiff, union
library(skimr)
library(tidyr)
library(purrr)
## 
## Attaching package: 'purrr'
## The following object is masked from 'package:scales':
## 
##     discard
library(tibble)
library(stringr)

3.1 Pemanggilan Data

retail <- read.csv("retail.csv")

#menampilkan 6 data teratas (6 data adalah default "head" di R)
head(retail)
##   Row.ID       Order.ID Order.Date Ship.Date      Ship.Mode Customer.ID
## 1      1 CA-2016-152156    11/8/16  11/11/16   Second Class    CG-12520
## 2      2 CA-2016-152156    11/8/16  11/11/16   Second Class    CG-12520
## 3      3 CA-2016-138688    6/12/16   6/16/16   Second Class    DV-13045
## 4      4 US-2015-108966   10/11/15  10/18/15 Standard Class    SO-20335
## 5      5 US-2015-108966   10/11/15  10/18/15 Standard Class    SO-20335
## 6      6 CA-2014-115812     6/9/14   6/14/14 Standard Class    BH-11710
##     Segment      Product.ID        Category Sub.Category
## 1  Consumer FUR-BO-10001798       Furniture    Bookcases
## 2  Consumer FUR-CH-10000454       Furniture       Chairs
## 3 Corporate OFF-LA-10000240 Office Supplies       Labels
## 4  Consumer FUR-TA-10000577       Furniture       Tables
## 5  Consumer OFF-ST-10000760 Office Supplies      Storage
## 6  Consumer FUR-FU-10001487       Furniture  Furnishings
##                                                       Product.Name    Sales
## 1                                Bush Somerset Collection Bookcase 261.9600
## 2      Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back 731.9400
## 3        Self-Adhesive Address Labels for Typewriters by Universal  14.6200
## 4                    Bretford CR4500 Series Slim Rectangular Table 957.5775
## 5                                   Eldon Fold 'N Roll Cart System  22.3680
## 6 Eldon Expressions Wood and Plastic Desk Accessories, Cherry Wood  48.8600
##   Quantity Discount    Profit
## 1        2     0.00   41.9136
## 2        3     0.00  219.5820
## 3        2     0.00    6.8714
## 4        5     0.45 -383.0310
## 5        2     0.20    2.5164
## 6        7     0.00   14.1694

3.2 Cek Struktur Data

str(retail)
## 'data.frame':    9994 obs. of  15 variables:
##  $ Row.ID      : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Order.ID    : chr  "CA-2016-152156" "CA-2016-152156" "CA-2016-138688" "US-2015-108966" ...
##  $ Order.Date  : chr  "11/8/16" "11/8/16" "6/12/16" "10/11/15" ...
##  $ Ship.Date   : chr  "11/11/16" "11/11/16" "6/16/16" "10/18/15" ...
##  $ Ship.Mode   : chr  "Second Class" "Second Class" "Second Class" "Standard Class" ...
##  $ Customer.ID : chr  "CG-12520" "CG-12520" "DV-13045" "SO-20335" ...
##  $ Segment     : chr  "Consumer" "Consumer" "Corporate" "Consumer" ...
##  $ Product.ID  : chr  "FUR-BO-10001798" "FUR-CH-10000454" "OFF-LA-10000240" "FUR-TA-10000577" ...
##  $ Category    : chr  "Furniture" "Furniture" "Office Supplies" "Furniture" ...
##  $ Sub.Category: chr  "Bookcases" "Chairs" "Labels" "Tables" ...
##  $ Product.Name: chr  "Bush Somerset Collection Bookcase" "Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back" "Self-Adhesive Address Labels for Typewriters by Universal" "Bretford CR4500 Series Slim Rectangular Table" ...
##  $ Sales       : num  262 731.9 14.6 957.6 22.4 ...
##  $ Quantity    : int  2 3 2 5 2 7 4 6 3 5 ...
##  $ Discount    : num  0 0 0 0.45 0.2 0 0 0.2 0.2 0 ...
##  $ Profit      : num  41.91 219.58 6.87 -383.03 2.52 ...

3.3 Mengubah Tipe Data

Mengubah tipe data untuk kolom Order.Date, Ship.Date, month, dan year menjadi data “datetime”, sementara Ship.Mode, Segment, Category, dan Sub.Category menjadi data “factor”.

retail$Order.Date <- mdy(retail$Order.Date)
retail$Ship.Date <- mdy(retail$Ship.Date)
retail$Ship.Mode <- as.factor(retail$Ship.Mode)
retail$Segment <- as.factor(retail$Segment)
retail$Category <- as.factor(retail$Category)
retail$Sub.Category <- as.factor(retail$Sub.Category)
retail$month <- month(retail$Order.Date, label = T, abbr = T)
retail$year <- year(retail$Order.Date)

Mengecek apakah tipe data sudah berubah

str(retail)
## 'data.frame':    9994 obs. of  17 variables:
##  $ Row.ID      : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Order.ID    : chr  "CA-2016-152156" "CA-2016-152156" "CA-2016-138688" "US-2015-108966" ...
##  $ Order.Date  : Date, format: "2016-11-08" "2016-11-08" ...
##  $ Ship.Date   : Date, format: "2016-11-11" "2016-11-11" ...
##  $ Ship.Mode   : Factor w/ 4 levels "First Class",..: 3 3 3 4 4 4 4 4 4 4 ...
##  $ Customer.ID : chr  "CG-12520" "CG-12520" "DV-13045" "SO-20335" ...
##  $ Segment     : Factor w/ 3 levels "Consumer","Corporate",..: 1 1 2 1 1 1 1 1 1 1 ...
##  $ Product.ID  : chr  "FUR-BO-10001798" "FUR-CH-10000454" "OFF-LA-10000240" "FUR-TA-10000577" ...
##  $ Category    : Factor w/ 3 levels "Furniture","Office Supplies",..: 1 1 2 1 2 1 2 3 2 2 ...
##  $ Sub.Category: Factor w/ 17 levels "Accessories",..: 5 6 11 17 15 10 3 14 4 2 ...
##  $ Product.Name: chr  "Bush Somerset Collection Bookcase" "Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back" "Self-Adhesive Address Labels for Typewriters by Universal" "Bretford CR4500 Series Slim Rectangular Table" ...
##  $ Sales       : num  262 731.9 14.6 957.6 22.4 ...
##  $ Quantity    : int  2 3 2 5 2 7 4 6 3 5 ...
##  $ Discount    : num  0 0 0 0.45 0.2 0 0 0.2 0.2 0 ...
##  $ Profit      : num  41.91 219.58 6.87 -383.03 2.52 ...
##  $ month       : Ord.factor w/ 12 levels "Jan"<"Feb"<"Mar"<..: 11 11 6 10 10 6 6 6 6 6 ...
##  $ year        : num  2016 2016 2016 2015 2015 ...
retail_clean <- retail %>% 
  arrange(retail$Order.Date)

head(retail_clean)
##   Row.ID       Order.ID Order.Date  Ship.Date      Ship.Mode Customer.ID
## 1   7981 CA-2014-103800 2014-01-03 2014-01-07 Standard Class    DP-13000
## 2    740 CA-2014-112326 2014-01-04 2014-01-08 Standard Class    PO-19195
## 3    741 CA-2014-112326 2014-01-04 2014-01-08 Standard Class    PO-19195
## 4    742 CA-2014-112326 2014-01-04 2014-01-08 Standard Class    PO-19195
## 5   1760 CA-2014-141817 2014-01-05 2014-01-12 Standard Class    MB-18085
## 6   5328 CA-2014-130813 2014-01-06 2014-01-08   Second Class    LS-17230
##       Segment      Product.ID        Category Sub.Category
## 1    Consumer OFF-PA-10000174 Office Supplies        Paper
## 2 Home Office OFF-LA-10003223 Office Supplies       Labels
## 3 Home Office OFF-ST-10002743 Office Supplies      Storage
## 4 Home Office OFF-BI-10004094 Office Supplies      Binders
## 5    Consumer OFF-AR-10003478 Office Supplies          Art
## 6    Consumer OFF-PA-10002005 Office Supplies        Paper
##                                                               Product.Name
## 1 Message Book, Wirebound, Four 5 1/2" X 4" Forms/Pg., 200 Dupl. Sets/Book
## 2                                                                Avery 508
## 3                                            SAFCO Boltless Steel Shelving
## 4                               GBC Standard Plastic Binding Systems Combs
## 5       Avery Hi-Liter EverBold Pen Style Fluorescent Highlighters, 4/Pack
## 6                                                                Xerox 225
##     Sales Quantity Discount   Profit month year
## 1  16.448        2      0.2   5.5512   Jan 2014
## 2  11.784        3      0.2   4.2717   Jan 2014
## 3 272.736        3      0.2 -64.7748   Jan 2014
## 4   3.540        2      0.8  -5.4870   Jan 2014
## 5  19.536        3      0.2   4.8840   Jan 2014
## 6  19.440        3      0.0   9.3312   Jan 2014
tail(retail_clean)
##      Row.ID       Order.ID Order.Date  Ship.Date      Ship.Mode Customer.ID
## 9989    907 CA-2017-143259 2017-12-30 2018-01-03 Standard Class    PO-18865
## 9990    908 CA-2017-143259 2017-12-30 2018-01-03 Standard Class    PO-18865
## 9991    909 CA-2017-143259 2017-12-30 2018-01-03 Standard Class    PO-18865
## 9992   1297 CA-2017-115427 2017-12-30 2018-01-03 Standard Class    EB-13975
## 9993   1298 CA-2017-115427 2017-12-30 2018-01-03 Standard Class    EB-13975
## 9994   5092 CA-2017-156720 2017-12-30 2018-01-03 Standard Class    JM-15580
##        Segment      Product.ID        Category Sub.Category
## 9989  Consumer FUR-BO-10003441       Furniture    Bookcases
## 9990  Consumer TEC-PH-10004774      Technology       Phones
## 9991  Consumer OFF-BI-10003684 Office Supplies      Binders
## 9992 Corporate OFF-BI-10002103 Office Supplies      Binders
## 9993 Corporate OFF-BI-10004632 Office Supplies      Binders
## 9994  Consumer OFF-FA-10003472 Office Supplies    Fasteners
##                                              Product.Name   Sales Quantity
## 9989 Bush Westfield Collection Bookcases, Fully Assembled 323.136        4
## 9990                            Gear Head AU3700S Headset  90.930        7
## 9991                 Wilson Jones Legal Size Ring Binders  52.776        3
## 9992      Cardinal Slant-D Ring Binder, Heavy Gauge Vinyl  13.904        2
## 9993                                   GBC Binding covers  20.720        2
## 9994                                  Bagged Rubber Bands   3.024        3
##      Discount  Profit month year
## 9989      0.2 12.1176   Dec 2017
## 9990      0.0  2.7279   Dec 2017
## 9991      0.2 19.7910   Dec 2017
## 9992      0.2  4.5188   Dec 2017
## 9993      0.2  6.4750   Dec 2017
## 9994      0.2 -0.6048   Dec 2017

4 Analisis Data

Ada beberapa hal yang akan kita analisis dari data retail, yaitu: - Kategori dengan penjualan terbesar pada tahun 2017 - Pada bulan apa kategori tersebut memiliki penjualan terbesar di tahun 2017 - Sub kategori dari kategori tersebur dengan penjualan terbesar di tahun 2017

4.1 Penjualan Terbesar 2017 Berdasar Kategori

category_2017 <-
  retail_clean %>% 
  select(Category, Sales, year) %>% 
  filter(year == 2017)

ggplot(category_2017, aes(Category,Sales)) + geom_col() + theme_minimal() +
  labs(title = "Penjualan Terbesar Tahun 2017 per Kategori")

Pada tahun 2017, kategori teknologi memiliki nilai penjualan terbesar.

4.2 Penjualan Kategori Teknologi Tiap Bulan di Tahun 2017

tech_2017 <-
  retail_clean %>% 
  select(Category, Sales, month, year) %>% 
  filter(Category == "Technology" & year == 2017)

tech_month <- aggregate( Sales ~ month , tech_2017 , sum )

ggplot(tech_2017, aes(month,Sales)) + 
  geom_col() +
  geom_hline(yintercept = mean(tech_month$Sales, na.rm=TRUE), color='red', lty='dashed', lwd=2) +
  theme_minimal() +
  labs(title = "Penjualan Teknologi Selama 2017")
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.

Penjualan teknologi terbesar pada tahun 2017 terjadi di Bulan November dengan penjualan hampir $50,000. Pada tahun 2017, rata-rata penjualan produk teknologi berada di sekitar $22,500.

4.3 Penjualan Sub-Kategori Teknologi Tahun 2017

tech <-
  retail_clean %>% 
  select(Category, Sales, month, year, Sub.Category) %>% 
  filter(Category == "Technology" & year == 2017)

ggplot(tech, aes(Sub.Category,Sales)) + 
  geom_col() +
  theme_minimal() +
  labs(title = "Penjualan Sub-Kategori Teknologi Tahun 2017")

Sub-kategori teknologi yang paling besar menyumbang penjualan adalah Phones.