Description

This report provides forecast demand of materials using hierarchical clustering.

The data set using in this report for modeling is real demand forecast data in one of the company in Indonesia.

The report is structured as follows:
1. Data Extraction
2. Time Point Clustering
3. Data Preparation
4. Modeling
5. Recommendation

1. Data Extraction

library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5     v purrr   0.3.4
## v tibble  3.1.2     v dplyr   1.0.7
## v tidyr   1.1.3     v stringr 1.4.0
## v readr   1.4.0     v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(readxl)
library(flexclust)
## Loading required package: grid
## Loading required package: lattice
## Loading required package: modeltools
## Loading required package: stats4
library(gridExtra)
## 
## Attaching package: 'gridExtra'
## The following object is masked from 'package:dplyr':
## 
##     combine
library(TSdist)
## Loading required package: proxy
## 
## Attaching package: 'proxy'
## The following objects are masked from 'package:stats':
## 
##     as.dist, dist
## The following object is masked from 'package:base':
## 
##     as.matrix
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
## Loaded TSdist v3.7. See ?TSdist for help, citation("TSdist") for use in publication.
library(ggplot2)
library(fpp3)
## -- Attaching packages -------------------------------------------- fpp3 0.4.0 --
## v lubridate   1.7.10     v feasts      0.2.2 
## v tsibble     1.0.1      v fable       0.3.1 
## v tsibbledata 0.3.0
## -- Conflicts ------------------------------------------------- fpp3_conflicts --
## x gridExtra::combine() masks dplyr::combine()
## x lubridate::date()    masks base::date()
## x dplyr::filter()      masks stats::filter()
## x tsibble::intersect() masks base::intersect()
## x tsibble::interval()  masks lubridate::interval()
## x dplyr::lag()         masks stats::lag()
## x fabletools::refit()  masks modeltools::refit()
## x tsibble::setdiff()   masks base::setdiff()
## x tsibble::union()     masks base::union()

Library tidyverse to knowing data structures
Library readxl to read file of data
Library flexclust for flexible cluster algorithms
Library gridExtra for grid plit diagram
Library TSdist for find better distance
Library ggplot2 for plotting data

Read house data set from .XLSX file to R data frame. then, see the dataframe’s structure.

# Load and Transform ----
dt_rutin <- read_excel("dt_rutin.xlsx")
 
 # Structure of data frame
 str(dt_rutin)
## tibble [87 x 8] (S3: tbl_df/tbl/data.frame)
##  $ material       : chr [1:87] "LUBE;S;SHELL OMALA 220" "GREASE;N;SHELL ALVANIA EP 2" "GREASE;S;OMEGA 57" "GREASE;S;OMEGA 65" ...
##  $ 2014           : num [1:87] 996 18 30 15 45 ...
##  $ 2015           : num [1:87] 40 162 45 30 0 ...
##  $ 2016           : num [1:87] 209 20 30 30 15 209 209 209 209 208 ...
##  $ 2017           : num [1:87] 0 140 75 15 30 ...
##  $ 2018           : num [1:87] 1254 0 45 15 0 ...
##  $ 2019           : num [1:87] 627 0 30 30 15 808 627 209 418 0 ...
##  $ Rutin/Non Rutin: chr [1:87] "RUTIN" "RUTIN" "RUTIN" "RUTIN" ...
 head(dt_rutin)
## # A tibble: 6 x 8
##   material            `2014` `2015` `2016` `2017` `2018` `2019` `Rutin/Non Ruti~
##   <chr>                <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl> <chr>           
## 1 LUBE;S;SHELL OMALA~  996       40    209     0    1254    627 RUTIN           
## 2 GREASE;N;SHELL ALV~   18.0    162     20   140       0      0 RUTIN           
## 3 GREASE;S;OMEGA 57     30       45     30    75      45     30 RUTIN           
## 4 GREASE;S;OMEGA 65     15       30     30    15      15     30 RUTIN           
## 5 GREASE;S;OMEGA 85     45        0     15    30       0     15 RUTIN           
## 6 LUBE;N;NYNAS           0     1045    209  1006.   1010    808 RUTIN

The data set has 1877 observations and 8 variables. The target. It means the data have 1877 material with time series from 2014-2019.

## statistical summary
summary(dt_rutin)
##    material              2014             2015             2016      
##  Length:87          Min.   :   0.0   Min.   :   0.0   Min.   :    0  
##  Class :character   1st Qu.:   1.0   1st Qu.:   1.0   1st Qu.:    1  
##  Mode  :character   Median :   3.0   Median :   2.0   Median :    3  
##                     Mean   : 266.9   Mean   : 118.7   Mean   :  276  
##                     3rd Qu.:  25.0   3rd Qu.:  30.0   3rd Qu.:   30  
##                     Max.   :9000.0   Max.   :3000.0   Max.   :16020  
##       2017              2018              2019        Rutin/Non Rutin   
##  Min.   :    0.0   Min.   :    0.0   Min.   :  0.00   Length:87         
##  1st Qu.:    1.0   1st Qu.:    0.0   1st Qu.:  0.00   Class :character  
##  Median :    3.0   Median :    3.0   Median :  2.00   Mode  :character  
##  Mean   :  284.1   Mean   :  526.5   Mean   : 94.14                     
##  3rd Qu.:   64.5   3rd Qu.:   80.0   3rd Qu.: 25.00                     
##  Max.   :10410.0   Max.   :20250.0   Max.   :975.00

We can see minimum, median, mean, and maximum values of each numeric variable.

Based on this data, we can see the different material use every year. also the data have a high fluctuatuin in every year. It means we need to cluster material depends on pattern.

3. Time Point Clustering

Plot total demand in 2014-2019 every material.

yearly <- dt_rutin %>%
  select(material, `2014`:`2019`) %>%
  pivot_longer(`2014`:`2019`, names_to = "tahun", values_to = "n", ) %>%
  mutate(tahun = as.integer(tahun), n = ceiling(n)) %>%
  arrange(tahun, material) %>%
  distinct(tahun, material, .keep_all = T) %>%
  as_tsibble(index = tahun, key = material)

yearly %>%
  ggplot(aes(x = tahun, y = n, color = material)) +
  geom_line(show.legend = FALSE) + 
  scale_y_continuous(labels = scales::comma) +
  facet_wrap(~ material, ncol = 10, scales = "free_y")

There are different pattern of data in every material. In the data, we can see pattern like intermittent, trend, lumphy, and erratic.

3.1 Data Preparation

We can filter with only material rutin because in inventory control, they just predict materials rutin.

In this data, we can see 87 observations after we cleaning the data and we remove the data with NA values.

3.2 Feature Scaling

We need to remove variable with character type and we can scale the variable data. We can also set number of digits in data = 0.

dt <- round(dt_rutin[ ,2:7], digits = 0)

scale_dt <- apply(dt, MARGIN = 1, FUN = scale )
x <- t(scale_dt)

4. Modeling

Find distance from this data using Dynamic Time Warping and then we can use hierarchical clustering with Average Method and Ward.D Method.

dist_dt_rutin <- TSDatabaseDistances(x, distance="dtw")
hc.ave <- hclust(dist_dt_rutin, method = "average")
hc.ward <- hclust(dist_dt_rutin, method = "ward.D")

4.1 Plot Dendogram

We can plot dendogram every method. Plot dendogram with Average method and Ward.D method.

plot(hc.ave, hang = -1, cex = 0.8,
     main = 'Average Linken')
rect.hclust(hc.ave, k=6)

plot(hc.ward, hang = -1, cex = 0.8,
     main = 'Ward D Linken')
rect.hclust(hc.ward, k=5)

From dendogram, we can see the most suitable division. We decide using 6 clustering with average method and 5 clustering using Ward.D method.

k <- 6
cluster.average <- cutree(hc.ave, k)
table(cluster.average)
## cluster.average
##  1  2  3  4  5  6 
## 12 25 23  6  6 15
k2 <- 5
cluster.ward <- cutree(hc.ward, k2)
table(cluster.ward)
## cluster.ward
##  1  2  3  4  5 
## 27 18 16  5 21

We can create new data frame for each cluster and also plot materials demand for every cluster.

## Dynamic Time Warping method average
# Cluster 1
cluster1 <- which(cluster.average %in% 1)
cluster1_dtw <- dt_rutin[cluster1, ]
cluster1_dtw
## # A tibble: 12 x 8
##    material           `2014` `2015` `2016` `2017` `2018` `2019` `Rutin/Non Ruti~
##    <chr>               <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl> <chr>           
##  1 "LUBE;S;SHELL OMA~    996     40    209     0    1254    627 RUTIN           
##  2 "GREASE;S;OMEGA 6~     15     30     30    15      15     30 RUTIN           
##  3 "LUBE;N;NYNAS"          0   1045    209  1006.   1010    808 RUTIN           
##  4 "UNIV ACC;BAN MOB~      0      2     19    12       0     12 RUTIN           
##  5 "UNIV ACC;BAN MOB~      2      0      1     4       0      4 RUTIN           
##  6 "BEARING;BALL;620~      2      0     20    40       0     50 RUTIN           
##  7 "BEARING;BALL;630~      2      0      0     1       4      4 RUTIN           
##  8 "CONVEYOR ACC;PSK~     50     30      0     0     100     40 RUTIN           
##  9 "UNIV ACC;BOLT&NU~      4      0     50     0       9     40 RUTIN           
## 10 "SEAL;OIL;RUBBER;~      4      1      4     0       0      4 RUTIN           
## 11 "SWG;8\";CLASS 15~      0      1      5     2       0      4 RUTIN           
## 12 "THERMOWELL;NiCrF~      6     14      3     0       0     12 RUTIN
## Plot cluster 1

cluster1_dtw <- cluster1_dtw %>%
        select(material, `2014`:`2019`) %>%
        pivot_longer(`2014`:`2019`, names_to = "tahun", values_to = "n", ) %>%
        mutate(tahun = as.integer(tahun)) %>%
        arrange(tahun, material) %>%
        distinct(tahun, material, .keep_all = T) %>%
        as_tsibble(index = tahun, key = material)

cluster1_dtw %>%
      ggplot(aes(x = tahun, y = n, color = material)) +
      geom_line(show.legend = FALSE) + 
      scale_y_continuous(labels = scales::comma) +
      facet_wrap(~ material, ncol = 4, scales = "free_y")

# Cluster 2
cluster2 <- which(cluster.average %in% 2)
cluster2_dtw <- dt_rutin[cluster2, ]
cluster2_dtw
## # A tibble: 25 x 8
##    material           `2014` `2015` `2016` `2017` `2018` `2019` `Rutin/Non Ruti~
##    <chr>               <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl> <chr>           
##  1 "GREASE;N;SHELL A~   18.0    162     20    140      0      0 RUTIN           
##  2 "GREASE;S;OMEGA 5~   30       45     30     75     45     30 RUTIN           
##  3 "LUBE;N;SHELL RIM~  209      209    209    209    836    418 RUTIN           
##  4 "LUBE;S;MOBIL SHC~  208      208    208    624      0      0 RUTIN           
##  5 "LUBE;S;ROTO Z"      40        1    209      0    209      0 RUTIN           
##  6 "FILTER;AIR;16546~    0        1      2      2      8      0 RUTIN           
##  7 "FILTER;AIR;CCF6\~    1        2      1      5      0      0 RUTIN           
##  8 "FILTER;OIL;15274~    1        2      2      0      8      0 RUTIN           
##  9 "BEARING;BALL;641~    2        2     10      2      2      0 RUTIN           
## 10 "CONVEYOR ACC;CAR~   20       47     25      0    100      0 RUTIN           
## # ... with 15 more rows
## Plot cluster 2

cluster2_dtw <- cluster2_dtw %>%
        select(material, `2014`:`2019`) %>%
        pivot_longer(`2014`:`2019`, names_to = "tahun", values_to = "n", ) %>%
        mutate(tahun = as.integer(tahun)) %>%
        arrange(tahun, material) %>%
        distinct(tahun, material, .keep_all = T) %>%
        as_tsibble(index = tahun, key = material)

cluster2_dtw %>%
        ggplot(aes(x = tahun, y = n, color = material)) +
        geom_line(show.legend = FALSE) + 
        scale_y_continuous(labels = scales::comma) +
        facet_wrap(~ material, ncol = 4, scales = "free_y")

# Cluster 3
cluster3 <- which(cluster.average %in% 3)
cluster3_dtw <- dt_rutin[cluster3, ]
cluster3_dtw
## # A tibble: 23 x 8
##    material           `2014` `2015` `2016` `2017` `2018` `2019` `Rutin/Non Ruti~
##    <chr>               <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl> <chr>           
##  1 GREASE;S;OMEGA 85      45      0     15     30      0     15 RUTIN           
##  2 LUBE;S;SHELL RIMU~    418    209    209      0    100      0 RUTIN           
##  3 LUBE;T;SHELL TURB~   1254    209    209    418    836    418 RUTIN           
##  4 UNIV ACC;BAN MOBI~      2      0      2      2      0      2 RUTIN           
##  5 OIL FTR;DYN;BHP30~      2      1      0      1      0      1 RUTIN           
##  6 FILTER;WATER;68X2~     12     24     24      6      6      0 RUTIN           
##  7 FILTER;AIR;PPS-80~    512      4    517    516      0      0 RUTIN           
##  8 BEAR;DOD;MERRICK4~      6      6      2      6      2      0 RUTIN           
##  9 BEAR;DOD;MERRICK4~      6      6      2      1      2      0 RUTIN           
## 10 BEAR;DOD;MERRICK4~      6      6      2      0      2      2 RUTIN           
## # ... with 13 more rows
## Plot cluster 3

cluster3_dtw <- cluster3_dtw %>%
        select(material, `2014`:`2019`) %>%
        pivot_longer(`2014`:`2019`, names_to = "tahun", values_to = "n", ) %>%
        mutate(tahun = as.integer(tahun)) %>%
        arrange(tahun, material) %>%
        distinct(tahun, material, .keep_all = T) %>%
        as_tsibble(index = tahun, key = material)

 cluster3_dtw %>%
        ggplot(aes(x = tahun, y = n, color = material)) +
        geom_line(show.legend = FALSE) + 
        scale_y_continuous(labels = scales::comma) +
        facet_wrap(~ material, ncol = 4, scales = "free_y")

# Cluster 4
cluster4 <- which(cluster.average %in% 4)
cluster4_dtw <- dt_rutin[cluster4, ]
cluster4_dtw
## # A tibble: 6 x 8
##   material            `2014` `2015` `2016` `2017` `2018` `2019` `Rutin/Non Ruti~
##   <chr>                <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl> <chr>           
## 1 LUBE;N;SHELL COREN~    209    209    209    209    209    627 RUTIN           
## 2 LUBE;S;OMEGA 690 S~     20      0     20    120     60    340 RUTIN           
## 3 O-SEAL;NFM;6300060~      2      0      1      1      0      4 RUTIN           
## 4 PACKING ;TOMBO1995~      0      0      2      5      6     20 RUTIN           
## 5 GAS;COMPRESSED AIR       0      1      1      5      0     15 RUTIN           
## 6 CHEM;SODIUM SULPHI~      0     25     25      0     25    375 RUTIN
## Plot cluster 4

cluster4_dtw <- cluster4_dtw %>%
        select(material, `2014`:`2019`) %>%
        pivot_longer(`2014`:`2019`, names_to = "tahun", values_to = "n", ) %>%
        mutate(tahun = as.integer(tahun)) %>%
        arrange(tahun, material) %>%
        distinct(tahun, material, .keep_all = T) %>%
        as_tsibble(index = tahun, key = material)

 cluster4_dtw %>%
        ggplot(aes(x = tahun, y = n, color = material)) +
        geom_line(show.legend = FALSE) + 
        scale_y_continuous(labels = scales::comma) +
        facet_wrap(~ material, ncol = 4, scales = "free_y")

# Cluster 5
cluster5 <- which(cluster.average %in% 5)
cluster5_dtw <- dt_rutin[cluster5, ]
cluster5_dtw
## # A tibble: 6 x 8
##   material            `2014` `2015` `2016` `2017` `2018` `2019` `Rutin/Non Ruti~
##   <chr>                <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl> <chr>           
## 1 "LUBE;N;SHELL RIMU~      0      0    209    209    418    209 RUTIN           
## 2 "CONVEYOR ACC;BELT~      0      1      1      2      1      1 RUTIN           
## 3 "BOILER ACC;FURNAC~      0     30     27     50     55     40 RUTIN           
## 4 "FLEXIBLE HOSE;SS3~      0      0      8      4      8      3 RUTIN           
## 5 "PACKING;SHEET;ASB~      0      1      0      3      3      2 RUTIN           
## 6 "PACKING ;TOMBO199~      0      0      2      1      3      1 RUTIN
## Plot cluster 5

cluster5_dtw <- cluster5_dtw %>%
        select(material, `2014`:`2019`) %>%
        pivot_longer(`2014`:`2019`, names_to = "tahun", values_to = "n", ) %>%
        mutate(tahun = as.integer(tahun)) %>%
        arrange(tahun, material) %>%
        distinct(tahun, material, .keep_all = T) %>%
        as_tsibble(index = tahun, key = material)

 cluster5_dtw %>%
        ggplot(aes(x = tahun, y = n, color = material)) +
        geom_line(show.legend = FALSE) + 
        scale_y_continuous(labels = scales::comma) +
        facet_wrap(~ material, ncol = 4, scales = "free_y")

# Cluster 6
cluster6 <- which(cluster.average %in% 6)
cluster6_dtw <- dt_rutin[cluster6, ]
cluster6_dtw
## # A tibble: 15 x 8
##    material           `2014` `2015` `2016` `2017` `2018` `2019` `Rutin/Non Ruti~
##    <chr>               <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl> <chr>           
##  1 "LUBE;T;SHELL TEL~    200    209    209    206    209    104 RUTIN           
##  2 "FTR;PAL;A10V028D~      3      3      2      3      1      2 RUTIN           
##  3 "SEPARATOR;DYN;BH~      1      1      1      1      2      0 RUTIN           
##  4 "FILTER;OIL;49305~      2      1      1      2      0      0 RUTIN           
##  5 "FILTER;AIR;91E61~      0      2      1      2      2      0 RUTIN           
##  6 "BEARING;BALL;631~      0      2      1      1      1      0 RUTIN           
##  7 "BEARING;BALL;631~      2      1      1      2      0      0 RUTIN           
##  8 "VALVE;BUTTERFLY;~      2      3      1      3      0      0 RUTIN           
##  9 "VALVE;CHECK;4\";~      1      2      1      2      0      0 RUTIN           
## 10 "OIL SEAL;TRS;CDM~      2      1      0      3      3      0 RUTIN           
## 11 "PWR DRIVE;JCL;63~      2      1      2      2      0      0 RUTIN           
## 12 "LOGIC BRD;JCL;63~      2      1      2      2      0      0 RUTIN           
## 13 "UNIV ACC;BATTERY~      2      2      2      2      4      0 RUTIN           
## 14 "CHEM;CAUSTIC SOD~   8000   3000  16020  10410  10050    676 RUTIN           
## 15 "UNIV ACC;FREON R~      1      1      2      1      0      0 RUTIN
## Plot cluster 6

cluster6_dtw <- cluster6_dtw %>%
        select(material, `2014`:`2019`) %>%
        pivot_longer(`2014`:`2019`, names_to = "tahun", values_to = "n", ) %>%
        mutate(tahun = as.integer(tahun)) %>%
        arrange(tahun, material) %>%
        distinct(tahun, material, .keep_all = T) %>%
        as_tsibble(index = tahun, key = material)

 cluster6_dtw %>%
        ggplot(aes(x = tahun, y = n, color = material)) +
        geom_line(show.legend = FALSE) + 
        scale_y_continuous(labels = scales::comma) +
        facet_wrap(~ material, ncol = 4, scales = "free_y")

Analyze cluster

We can see the different pattern from this data and we can also analyze the different pattern in every cluster. In first cluster, we have 12 materials and type of data is intermittend demand. In second cluster, we have 25 materials and they have the same pattern type. For the example, in this cluster in general demand of material have a high increased and in the next year this material have 0 demand. In third cluster, we have 23 materials and establish a data trend where the number of requests always decreases. In fourth cluster, we have 6 materials with the same patterns, where the data have a significantly increased in the last year. In the last cluster, we have 15 materials, and we can see the pattern from this data its always have periodically decreases year after year.

## Dynamic Time Warping with method Ward D
# Cluster 1
cluster1B <- which(cluster.ward %in% 1)
cluster1_dtw2 <- dt_rutin[cluster1B, ]
cluster1_dtw2
## # A tibble: 27 x 8
##    material           `2014` `2015` `2016` `2017` `2018` `2019` `Rutin/Non Ruti~
##    <chr>               <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl> <chr>           
##  1 LUBE;S;SHELL OMAL~  996       40    209     0    1254    627 RUTIN           
##  2 GREASE;N;SHELL AL~   18.0    162     20   140       0      0 RUTIN           
##  3 GREASE;S;OMEGA 65    15       30     30    15      15     30 RUTIN           
##  4 LUBE;N;NYNAS          0     1045    209  1006.   1010    808 RUTIN           
##  5 LUBE;N;SHELL RIMU~    0        0    209   209     418    209 RUTIN           
##  6 LUBE;S;ROTO Z        40        1    209     0     209      0 RUTIN           
##  7 UNIV ACC;BAN MOBI~    0        2     19    12       0     12 RUTIN           
##  8 UNIV ACC;BAN MOBI~    2        0      1     4       0      4 RUTIN           
##  9 UNIV ACC;BAN MOBI~    2        0      2     2       0      2 RUTIN           
## 10 O-SEAL;NFM;630006~    2        0      1     1       0      4 RUTIN           
## # ... with 17 more rows
## Plot cluster 1

cluster1_dtw2 <- cluster1_dtw2 %>%
        select(material, `2014`:`2019`) %>%
        pivot_longer(`2014`:`2019`, names_to = "tahun", values_to = "n", ) %>%
        mutate(tahun = as.integer(tahun)) %>%
        arrange(tahun, material) %>%
        distinct(tahun, material, .keep_all = T) %>%
        as_tsibble(index = tahun, key = material)

 cluster1_dtw %>%
        ggplot(aes(x = tahun, y = n, color = material)) +
        geom_line(show.legend = FALSE) + 
        scale_y_continuous(labels = scales::comma) +
        facet_wrap(~ material, ncol = 4, scales = "free_y")

# Cluster 2
cluster2B <- which(cluster.ward %in% 2)
cluster2_dtw2 <- dt_rutin[cluster2B, ]
cluster2_dtw2
## # A tibble: 18 x 8
##    material           `2014` `2015` `2016` `2017` `2018` `2019` `Rutin/Non Ruti~
##    <chr>               <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl> <chr>           
##  1 "GREASE;S;OMEGA 5~     30     45   30       75     45     30 RUTIN           
##  2 "LUBE;N;SHELL RIM~    209    209  209      209    836    418 RUTIN           
##  3 "LUBE;S;MOBIL SHC~    208    208  208      624      0      0 RUTIN           
##  4 "FILTER;AIR;16546~      0      1    2        2      8      0 RUTIN           
##  5 "FILTER;AIR;CCF6\~      1      2    1        5      0      0 RUTIN           
##  6 "FILTER;OIL;15274~      1      2    2        0      8      0 RUTIN           
##  7 "BEARING;BALL;641~      2      2   10        2      2      0 RUTIN           
##  8 "CONVEYOR ACC;CAR~     20     47   25        0    100      0 RUTIN           
##  9 "UNIV ACC;MALE & ~      3     51    4        0      0     10 RUTIN           
## 10 "VALVE;BALL;2\";D~      2      1    2       20      0      0 RUTIN           
## 11 "UNIV ACC;BOLT&NU~      0     20    0      100     50     18 RUTIN           
## 12 "VALVE;BALL;2\";D~      8      5    4       16      0      0 RUTIN           
## 13 "TOOL M;CLAMP ALL~      1      3    1        1      1      1 RUTIN           
## 14 "CHEM;AMMONIA HYD~    400    360  180      660   1390    625 RUTIN           
## 15 "CHEM;HYDROCHLORI~   9000   2500 1750     2750  20250    960 RUTIN           
## 16 "CHEM;POLYELECTRO~      0      0    0.3     15    150     25 RUTIN           
## 17 "CHEM;SCALE INHIB~    450    625  750     2400   6250    975 RUTIN           
## 18 "UNIV ACC;SEMEN T~    150    350 1000     2000    100     25 RUTIN
## Plot cluster 2

cluster2_dtw2 <- cluster2_dtw2 %>%
        select(material, `2014`:`2019`) %>%
        pivot_longer(`2014`:`2019`, names_to = "tahun", values_to = "n", ) %>%
        mutate(tahun = as.integer(tahun)) %>%
        arrange(tahun, material) %>%
        distinct(tahun, material, .keep_all = T) %>%
        as_tsibble(index = tahun, key = material)

 cluster2_dtw2 %>%
        ggplot(aes(x = tahun, y = n, color = material)) +
        geom_line(show.legend = FALSE) + 
        scale_y_continuous(labels = scales::comma) +
        facet_wrap(~ material, ncol = 4, scales = "free_y")

# Cluster 3
cluster3B <- which(cluster.ward %in% 3)
cluster3_dtw2 <- dt_rutin[cluster3B, ]
cluster3_dtw2
## # A tibble: 16 x 8
##    material           `2014` `2015` `2016` `2017` `2018` `2019` `Rutin/Non Ruti~
##    <chr>               <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl> <chr>           
##  1 "GREASE;S;OMEGA 8~     45      0     15     30      0     15 RUTIN           
##  2 "LUBE;S;SHELL RIM~    418    209    209      0    100      0 RUTIN           
##  3 "LUBE;T;SHELL TUR~   1254    209    209    418    836    418 RUTIN           
##  4 "OIL FTR;DYN;BHP3~      2      1      0      1      0      1 RUTIN           
##  5 "FILTER;WATER;68X~     12     24     24      6      6      0 RUTIN           
##  6 "BEAR;DOD;MERRICK~      6      6      2      1      2      0 RUTIN           
##  7 "BEAR;DOD;MERRICK~      6      6      2      0      2      2 RUTIN           
##  8 "BEARING;BALL;620~     20      3      2      0      5      2 RUTIN           
##  9 "BEARING;BALL;620~      6      2      1      0      0      2 RUTIN           
## 10 "BEARING;CYL ROLL~      2      2      0      0      1      1 RUTIN           
## 11 "SEAL;OIL;RUBBER;~      8      6      0      5      5      0 RUTIN           
## 12 "ACTUATOR ACC;POS~     12      1      0      1      0      7 RUTIN           
## 13 "CONVEYOR ACC;IMP~     50     18      4      0     12      0 RUTIN           
## 14 "FLANGE;PVC;4\";1~      4      2      2      0      0      2 RUTIN           
## 15 "VALVE;GLOBE;1/2\~      0      2      2      1      0      1 RUTIN           
## 16 "EXCT ACC;CARBON ~     24      3     15     16     16      5 RUTIN
## Plot cluster 3

cluster3_dtw2 <- cluster3_dtw2 %>%
        select(material, `2014`:`2019`) %>%
        pivot_longer(`2014`:`2019`, names_to = "tahun", values_to = "n", ) %>%
        mutate(tahun = as.integer(tahun)) %>%
        arrange(tahun, material) %>%
        distinct(tahun, material, .keep_all = T) %>%
        as_tsibble(index = tahun, key = material)

 cluster3_dtw2 %>%
        ggplot(aes(x = tahun, y = n, color = material)) +
        geom_line(show.legend = FALSE) + 
        scale_y_continuous(labels = scales::comma) +
        facet_wrap(~ material, ncol = 4, scales = "free_y")

# Cluster 4
cluster4B <- which(cluster.ward %in% 4)
cluster4_dtw2 <- dt_rutin[cluster4B, ]
cluster4_dtw2
## # A tibble: 5 x 8
##   material            `2014` `2015` `2016` `2017` `2018` `2019` `Rutin/Non Ruti~
##   <chr>                <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl> <chr>           
## 1 LUBE;N;SHELL COREN~    209    209    209    209    209    627 RUTIN           
## 2 LUBE;S;OMEGA 690 S~     20      0     20    120     60    340 RUTIN           
## 3 PACKING ;TOMBO1995~      0      0      2      5      6     20 RUTIN           
## 4 GAS;COMPRESSED AIR       0      1      1      5      0     15 RUTIN           
## 5 CHEM;SODIUM SULPHI~      0     25     25      0     25    375 RUTIN
## Plot cluster 4

cluster4_dtw2 <- cluster4_dtw2 %>%
        select(material, `2014`:`2019`) %>%
        pivot_longer(`2014`:`2019`, names_to = "tahun", values_to = "n", ) %>%
        mutate(tahun = as.integer(tahun)) %>%
        arrange(tahun, material) %>%
        distinct(tahun, material, .keep_all = T) %>%
        as_tsibble(index = tahun, key = material)

 cluster4_dtw2 %>%
        ggplot(aes(x = tahun, y = n, color = material)) +
        geom_line(show.legend = FALSE) + 
        scale_y_continuous(labels = scales::comma) +
        facet_wrap(~ material, ncol = 4, scales = "free_y")

# Cluster 5
cluster5B <- which(cluster.ward %in% 5)
cluster5_dtw2 <- dt_rutin[cluster5B, ]
cluster5_dtw2
## # A tibble: 21 x 8
##    material           `2014` `2015` `2016` `2017` `2018` `2019` `Rutin/Non Ruti~
##    <chr>               <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl> <chr>           
##  1 LUBE;T;SHELL TELL~    200    209    209    206    209    104 RUTIN           
##  2 FTR;PAL;A10V028DR~      3      3      2      3      1      2 RUTIN           
##  3 SEPARATOR;DYN;BHP~      1      1      1      1      2      0 RUTIN           
##  4 FILTER;OIL;493055~      2      1      1      2      0      0 RUTIN           
##  5 FILTER;AIR;PPS-80~    512      4    517    516      0      0 RUTIN           
##  6 FILTER;AIR;91E61-~      0      2      1      2      2      0 RUTIN           
##  7 BEAR;DOD;MERRICK4~      6      6      2      6      2      0 RUTIN           
##  8 BEAR;DOD;MERRICK4~      6      6      2      8      2      0 RUTIN           
##  9 BEARING;BALL;6310~      0      2      1      1      1      0 RUTIN           
## 10 BEARING;BALL;6313~      2      1      1      2      0      0 RUTIN           
## # ... with 11 more rows
## Plot cluster 5

cluster5_dtw2 <- cluster5_dtw2 %>%
        select(material, `2014`:`2019`) %>%
        pivot_longer(`2014`:`2019`, names_to = "tahun", values_to = "n", ) %>%
        mutate(tahun = as.integer(tahun)) %>%
        arrange(tahun, material) %>%
        distinct(tahun, material, .keep_all = T) %>%
        as_tsibble(index = tahun, key = material)

 cluster5_dtw2 %>%
        ggplot(aes(x = tahun, y = n, color = material)) +
        geom_line(show.legend = FALSE) + 
        scale_y_continuous(labels = scales::comma) +
        facet_wrap(~ material, ncol = 4, scales = "free_y")

Analyze cluster

Using method ward.D, we can see 5 different type in every cluster. In first cluster, demand rate from material establish intermittent type demand, where in the data has a high fluctuation. In second cluster, rate of demand in a year can be very high and in the next year can be very low or can reach 0 value. In third cluster, rate of demand indicate a periodic decline year by year and in the last year it can have 0 demand. In fourth cluster, rate of demand shows that demand has a significantly increase in the last year and it can show demand of material in the next year have a higher demand from the last year. In last cluster, rate of demand in this cluster indicates significantly decline in the last two years. It can show the materials could be not used in the next year. Division of inventory control can ask to user if they want change this material. Because this company always using a different type of material and it could be the materials with 0 demand will not be used again in the next year. By doing clustering in every pattern, it can make a quick decision to remove materials that are not longer to used.

5. Recommendations

  1. We can use Dynamic Time Warping to get the optimal distance from the data.
  2. Using Hierarchical Clustering with 2 different method : Ward.D and Average.
  3. For the next step, this data can be improved by build a model in every cluster so we can know which the better cluster depends a error values.