Data Preparation

# Load required packages
library(readr)
library(ggplot2)
library(dplyr)

# Load inventory dataset
inventory_data <- read_csv("hospital_supply_chain/inventory_data.csv")

# Convert Date column to Date format
inventory_data$Date <- as.Date(inventory_data$Date)

# Check the first few rows
head(inventory_data)

# Check structure and types of data
str(inventory_data)

Research Question

What strategies should hospitals implement to mitigate product shortages caused by global supply chain delays?

The goal of this study is to identify potential shortages and supply chain risks by analyzing stock levels, usage rates, and restock lead times, and to recommend strategies that keep inventory stable.
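One way to combine stock levels, usage rates, and lead times into a single risk signal is "days of supply". The sketch below is only an illustration: the toy values are invented, and the rule "at risk when days of supply is shorter than the restock lead time" is an assumption, not a method taken from the dataset or its documentation.

```r
# Hypothetical toy records; column names mirror the dataset, values are invented.
toy <- data.frame(
  Item_Name         = c("Item A", "Item B", "Item C"),
  Current_Stock     = c(1200, 300, 900),
  Avg_Usage_Per_Day = c(100, 50, 30),
  Restock_Lead_Time = c(7, 10, 14)
)

# Days of supply: how long the current stock lasts at the average usage rate.
toy$Days_Of_Supply <- toy$Current_Stock / toy$Avg_Usage_Per_Day

# Assumed risk rule: stock would run out before a new order could arrive.
toy$At_Risk <- toy$Days_Of_Supply < toy$Restock_Lead_Time

toy[, c("Item_Name", "Days_Of_Supply", "Restock_Lead_Time", "At_Risk")]
```

In this toy example only Item B is flagged, because its 6 days of supply are shorter than its 10-day lead time.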

Cases

What are the cases, and how many are there?

Each case is an inventory record for a specific hospital supply item.

# Confirm the number of cases
nrow(inventory_data)
## [1] 500

The dataset includes 500 observations (inventory records).

Data Collection

Describe the method of data collection.

The data was gathered from hospital inventory management systems that monitor:

Stock levels

Restock lead times

Vendor details

Usage rates of medical equipment and consumables

The data contains daily records of medical equipment and consumables.
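The claim that the records are daily can be checked by looking at the gaps between consecutive dates. This is a minimal sketch on invented dates; in the report the vector would presumably come from inventory_data$Date.

```r
# Hypothetical dates for illustration; one day (2024-10-05) is deliberately missing.
dates <- as.Date("2024-10-01") + c(0, 1, 2, 3, 5)

# Gaps between consecutive distinct dates, in days.
gaps <- as.numeric(diff(sort(unique(dates))))

# TRUE only if every gap is exactly one day.
is_daily <- all(gaps == 1)
is_daily

# Dates that follow a gap of more than one day.
sort(unique(dates))[-1][gaps > 1]
```

Here is_daily is FALSE because of the skipped day; on a truly daily series every gap would equal 1.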

Type of Study

What type of study is this (observational/experiment)?

This is an observational study: it examines existing hospital inventory records without any experimental intervention.

Data Source

If you collected the data, state self-collected. If not, provide a citation/link.

The dataset is publicly available on Kaggle and was uploaded to Posit Cloud for analysis.

Source: https://www.kaggle.com/datasets/vanpatangan/hospital-supply-chain

Response and Explanatory Variables

Are they quantitative or qualitative?

Response Variable (Dependent Variable)

If performing a regression analysis, the dependent variable could be:

Restock_Lead_Time (numerical) - Predicting supply delays.

Current_Stock (numerical) - Predicting stock shortages.

Explanatory Variables

Quantitative variables

Avg_Usage_Per_Day (numerical) - How quickly an item is used.

Min_Required (numerical) - The lowest stock level required.

Max_Capacity (numerical) - Maximum inventory capacity.

Qualitative (categorical) Variables

Item_Type (categorical) - Indicates whether the item is a consumable or equipment.

Vendor_ID (categorical) - Identifies the supplier of the item.
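If a regression is pursued, a linear model is one possible starting point. The sketch below is an illustration under assumptions: the data are randomly generated stand-ins rather than the Kaggle records, and the choice of Restock_Lead_Time as the response with these three predictors is just one of the options listed above.

```r
# Simulated stand-in data; ranges loosely mimic the dataset, values are random.
set.seed(1)
n <- 40
toy <- data.frame(
  Avg_Usage_Per_Day = runif(n, 2, 499),
  Min_Required      = runif(n, 10, 995),
  Item_Type         = rep(c("Consumable", "Equipment"), length.out = n),
  Restock_Lead_Time = sample(1:29, n, replace = TRUE)
)

# Linear model: two quantitative predictors plus one categorical predictor.
# lm() converts the character column Item_Type into a factor automatically.
fit <- lm(Restock_Lead_Time ~ Avg_Usage_Per_Day + Min_Required + Item_Type,
          data = toy)

# Coefficient table, R-squared, and residual summary.
summary(fit)
```

Because the response here is random noise, the coefficients carry no meaning; the point is only the model formula and how the categorical variable enters as a dummy term.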

Relevant Summary Statistics

Provide summary statistics and visualization.

# Summary statistics for all variables
summary(inventory_data)
##       Date               Item_ID       Item_Type          Item_Name        
##  Min.   :2024-10-01   Min.   :100.0   Length:500         Length:500        
##  1st Qu.:2025-02-02   1st Qu.:102.0   Class :character   Class :character  
##  Median :2025-06-07   Median :104.0   Mode  :character   Mode  :character  
##  Mean   :2025-06-07   Mean   :104.5                                        
##  3rd Qu.:2025-10-10   3rd Qu.:107.0                                        
##  Max.   :2026-02-12   Max.   :109.0                                        
##  Current_Stock   Min_Required    Max_Capacity    Unit_Cost       
##  Min.   :  69   Min.   : 10.0   Min.   : 500   Min.   :    4.23  
##  1st Qu.:1308   1st Qu.:215.8   1st Qu.:1848   1st Qu.: 5422.46  
##  Median :2412   Median :496.5   Median :3311   Median :10129.96  
##  Mean   :2459   Mean   :486.0   Mean   :3289   Mean   :10277.33  
##  3rd Qu.:3719   3rd Qu.:734.2   3rd Qu.:4696   3rd Qu.:15206.32  
##  Max.   :4976   Max.   :995.0   Max.   :5992   Max.   :19984.16  
##  Avg_Usage_Per_Day Restock_Lead_Time  Vendor_ID        
##  Min.   :  2.0     Min.   : 1.00     Length:500        
##  1st Qu.:150.5     1st Qu.: 7.00     Class :character  
##  Median :257.0     Median :16.00     Mode  :character  
##  Mean   :261.8     Mean   :15.12                       
##  3rd Qu.:392.0     3rd Qu.:23.00                       
##  Max.   :499.0     Max.   :29.00
# Check missing values

colSums(is.na(inventory_data))
##              Date           Item_ID         Item_Type         Item_Name 
##                 0                 0                 0                 0 
##     Current_Stock      Min_Required      Max_Capacity         Unit_Cost 
##                 0                 0                 0                 0 
## Avg_Usage_Per_Day Restock_Lead_Time         Vendor_ID 
##                 0                 0                 0
# Number of records by item type
table(inventory_data$Item_Type)
## 
## Consumable  Equipment 
##        266        234
# Number of records by vendor
table(inventory_data$Vendor_ID)
## 
## V001 V002 V003 
##  188  156  156
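Beyond overall summaries and counts, group-wise means can show whether consumables and equipment differ in stock or lead time. This is a minimal base R sketch on invented values, not output from the actual dataset.

```r
# Hypothetical toy records; column names mirror the dataset, values are invented.
toy <- data.frame(
  Item_Type         = c("Consumable", "Consumable", "Equipment", "Equipment"),
  Current_Stock     = c(1000, 2000, 500, 1500),
  Restock_Lead_Time = c(5, 15, 10, 20)
)

# Mean stock level and mean lead time for each item type.
by_type <- aggregate(cbind(Current_Stock, Restock_Lead_Time) ~ Item_Type,
                     data = toy, FUN = mean)
by_type
```

The same comparison could be made by Vendor_ID by swapping the grouping variable in the formula.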

Visualizations

(A) Stock Levels by Item Type

ggplot(inventory_data, aes(x = Item_Type, y = Current_Stock, fill = Item_Type)) +
  geom_boxplot() +
  labs(title = "Stock Levels of Equipment vs Consumables",
       x = "Item Type", y = "Current Stock") +
  theme_minimal()

(B) Restock Lead Time Distribution

ggplot(inventory_data, aes(x = Restock_Lead_Time)) +
  geom_histogram(fill = "lightblue", bins = 20) +
  labs(title = "Distribution of Restock Lead Times",
       x = "Lead Time (Days)", y = "Frequency") +
  theme_minimal()

(C) Items at Risk of Shortage

# Identify items below the minimum required stock

shortages <- inventory_data %>%
  filter(Current_Stock < Min_Required)

ggplot(shortages, aes(x = Item_Name, y = Current_Stock, fill = Item_Type)) +
  geom_col() +
  coord_flip() +
  labs(title = "Items Below Minimum Stock Requirement",
       x = "Medical Supply", y = "Current Stock") +
  theme_minimal()
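The same filter can be extended to rank shortages by how far each record sits below its minimum. The Shortfall column below is a hypothetical helper introduced for illustration, and the toy values are invented.

```r
# Hypothetical toy records; column names mirror the dataset, values are invented.
toy <- data.frame(
  Item_Name     = c("Item A", "Item B", "Item C", "Item D"),
  Current_Stock = c(120, 800, 40, 300),
  Min_Required  = c(200, 500, 400, 300)
)

# Keep only records strictly below the minimum required stock.
short <- toy[toy$Current_Stock < toy$Min_Required, ]

# Shortfall (hypothetical helper column): units needed to reach the minimum.
short$Shortfall <- short$Min_Required - short$Current_Stock

# Largest shortfall first.
short <- short[order(-short$Shortfall), ]
short
```

Item D is not flagged because its stock equals the minimum exactly; whether that boundary case should count as a shortage is a judgment call.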