Blood Product Wastage - Interrupted Time Series

Loading and Preparing the Data

The data used for this R markdown document is blood product wastage data from a hospital. The data lists every blood product that has gone to waste over a stretch of 4 years. Decision staff would like to know if their efforts to reduce wastage throughout 2023 have made a difference.

The first thing we need to do is load a few packages that we will need and then load up the datafile which is in xlsx format. An image of the excel spreadsheet is included for reference.

library(readxl)
library(lubridate)

## 
## Attaching package: 'lubridate'

## The following objects are masked from 'package:base':
## 
##     date, intersect, setdiff, union

library(stargazer)

## 
## Please cite as:

##  Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.

##  R package version 5.2.3. https://CRAN.R-project.org/package=stargazer

Wastage_Working <- read_excel("Wastage_Working.xlsx")
waste = Wastage_Working
knitr::include_graphics("Wastage_Working_Image.png")

An image of the excel spreadsheet used for this analysis.

Creating new variables

Decision staff has asked for the data to be presented monthly. So we will create a new dataframe from the existing data that will contain a monthly sum of the blood product wastage. Additionally, we will add a year column to this dataframe that extracts the year value from each data entry as a separate column. This will help us with the time series analysis later on.

monthly_total = aggregate(waste['AdjustedCost'], list(month=substr(waste$DiscardDate, 1, 7)), FUN = sum, na.rm = TRUE)
monthly_total$month= ym(monthly_total$month)

monthly_total$year = as.POSIXct(monthly_total$month, format = "%Y/%m/%d")
monthly_total$year = format(monthly_total$year, format = "%Y")

Time Series Analysis

When preparing data for time series analysis we need to create three new variables that are part of the time series equation. 1. A variable indicating the time 2. A variable indicating if the observation occured before or after the event 3. A variable indicating the time passed since the event (In our case, since the start of 2023)

monthly_total$time = rep(1:nrow(monthly_total)) 
monthly_total$treatment = ifelse(monthly_total$year < 2023, 0,1) 
monthly_total$timesince = c(rep(0,38),rep(1:10))

Calculating the coefficents

Now we can feed the data into a linear regression formula. The results can be visualized in a table using the stargazer package.

reg_waste = lm(AdjustedCost ~ time + treatment + timesince, data = monthly_total)

stargazer( reg_waste, 
           type = "text", 
           dep.var.labels = ("Adjusted Monthly Wastage Costs"),
           column.labels = "",
           covariate.labels = c("Time", "Treatment", "Time Since Treatment"),
           omit.stat = "all", 
           digits = 3,
           out = "waste_lm.html")

## 
## ===================================================
##                           Dependent variable:      
##                      ------------------------------
##                      Adjusted Monthly Wastage Costs
##                                                    
## ---------------------------------------------------
## Time                             44.095            
##                                (363.425)           
##                                                    
## Treatment                     -12,093.860          
##                               (18,513.040)         
##                                                    
## Time Since Treatment            753.891            
##                               (2,729.025)          
##                                                    
## Constant                      15,014.920*          
##                               (8,130.488)          
##                                                    
## ===================================================
## ===================================================
## Note:                   *p<0.1; **p<0.05; ***p<0.01

It appears as though the treatment effect (measures implemented in 2023 to reduce wastage) were immediately effective, observed by the -12,000 in the treatment line (albeit this was not statistically significant).

Plotting the results

Is is easier to visualize the results using a plot. The next code chunk will make a prediction of the monthly waste based on our regression from the previous code chunk and the monthly total data. Then we create a plot displaying the data, with the red line displaying the start of 2023 (our “treatment” variable).

pred_waste <- predict(reg_waste, monthly_total)

plot(monthly_total$AdjustedCost,
      col = "black",
      xlim = c( 1, nrow (monthly_total) ),
      xlab = "Time (months)", 
      ylab = "Adjusted Monthly Cost",
     main = "Monthly Wastage Costs 2019-2023")

lines(pred_waste, col="blue", lwd = 3 )

# Line marking the interruption
abline( v=38, col="red", lty=2 )