Supply chain disruptions caused by the COVID-19 pandemic pose a threat to the stability of ABC’s market leadership in the personal computer industry. Exploration and analysis by Prestige Worldwide Consulting can help ABC identify problems in the supply chain, anticipate and mitigate liabilities, optimize manufacturing and shipping throughput, and increase ABC’s resiliency to supply chain volatility in the uncertain post-COVID world.
To begin, we calculated the Quarter and Year of each Receipt Date in order to facilitate further data analysis.
We then calculated manufacturing lead time as the elapsed time between the date the purchase order was downloaded at the ODM site and the date the product was shipped from the site. We also calculated in-transit lead time as the elapsed time between the date the product was shipped from the original design manufacturers (ODM) site and the date it was received at the warehouse. Both values are given in days. Graphical exploration of these calculated values revealed impossible negative values as well as some suspiciously high values that we addressed later in the analysis.
With these additional variables created, we then inspected the data for missingness.
## [1] 145
## [1] 1.589215
Because these observations are missing ship and receipt dates, we were unable to calculate lead times for these observations. The number of rows missing all of these values is 145, representing 1.6% of the entire 9124-observation data set, so we felt comfortable excluding them from our analysis leaving us with 8979 observations available for further analysis.
We then turned our attention to the impossible and suspicious manufacturing and in-transit lead times contaminating the data. After identifying and excluding impossible values less than zero, we noticed the distribution of manufacturing lead times was right-skewed, so we applied a cube-root transformation to normalize the data before identifying and excluding suspicious outliers. The total of negative and unusual values was 348. Removing these along with the original 145 missing observations left us with 8631 out of 9124 observations, representing 94.6% of the original data set. We then visualized the distribution of acceptable manufacturing lead times, annotated with the mode, median, and mean.
We applied the same approach as above to the in-transit lead times. Due to the selection of a logarithmic transformation to normalize, we first excluded all values less than or equal to zero, then excluded outliers after applying the transformation. Removing 118 variables beyond the originally-removed 145 missing values left us with 8861 out of 9124 observations, representing 97.1% of the original data set. We then visualized the distribution of acceptable in-transit lead times, annotated with the mode, median, and mean.
Note: In this case, mode and median were equal to each other.
With the data cleaned, we estimated lead times according to each unique combination of Product (LOB), Origin, and Quarter (as well as Shipping Mode, for in-transit lead times) in order to more accurately reflect the unique circumstances and challenges facing each piece of the supply chain.
## , , = Q1
##
##
## Site A Site B Site C Site D
## Product A 1428 0 0 0
## Product B 193 0 754 1301
## Product C 82 0 0 0
##
## , , = Q2
##
##
## Site A Site B Site C Site D
## Product A 1116 0 0 0
## Product B 56 661 690 752
## Product C 156 0 0 0
##
## , , = Q3
##
##
## Site A Site B Site C Site D
## Product A 243 0 0 0
## Product B 0 819 122 245
## Product C 13 0 0 0
## # A tibble: 16 × 4
## # Groups: LOB, Origin [6]
## LOB Origin Quarter mean_value
## <chr> <chr> <chr> <dbl>
## 1 Product A Site A Q1 6.65
## 2 Product A Site A Q2 9.97
## 3 Product A Site A Q3 4.35
## 4 Product B Site A Q1 10.4
## 5 Product B Site A Q2 2.12
## 6 Product B Site B Q2 12.5
## 7 Product B Site B Q3 12.5
## 8 Product B Site C Q1 6.95
## 9 Product B Site C Q2 4.51
## 10 Product B Site C Q3 4.11
## 11 Product B Site D Q1 6.89
## 12 Product B Site D Q2 5.93
## 13 Product B Site D Q3 4.19
## 14 Product C Site A Q1 13.1
## 15 Product C Site A Q2 11.8
## 16 Product C Site A Q3 7.46
## # A tibble: 6 × 3
## # Groups: LOB [3]
## LOB Origin mean_value
## <chr> <chr> <dbl>
## 1 Product A Site A 7.78
## 2 Product B Site A 8.52
## 3 Product B Site B 12.5
## 4 Product B Site C 5.65
## 5 Product B Site D 6.29
## 6 Product C Site A 12.0
## # A tibble: 31 × 5
## # Groups: LOB, Origin, Ship Mode [11]
## LOB Origin `Ship Mode` Quarter mean_value
## <chr> <chr> <chr> <chr> <dbl>
## 1 Product A Site A AIR Q1 8.20
## 2 Product A Site A AIR Q2 8.02
## 3 Product A Site A AIR Q3 7.55
## 4 Product A Site A AIR Q4 16.8
## 5 Product A Site A FASTBOAT Q1 23.9
## 6 Product A Site A FASTBOAT Q2 31.2
## 7 Product A Site A FASTBOAT Q3 21.9
## 8 Product A Site A FASTBOAT Q4 34
## 9 Product A Site A OCEAN Q1 29.6
## 10 Product A Site A OCEAN Q2 34.6
## # ℹ 21 more rows
We then replaced irregular values with corrected values based on the average of the subgroup with the same characteristics as the irregular value. We have now replaced all out-of-range values with appropriate estimations, and our data set once again has 8979 rows, or 98.4%, of the original 9124 observations.
We explored the relationships in the data by producing pairwise plots between each of the variables.