# Business Scenario: Emergency Department Volumes Analysis

Case 1: How many patients arrive at the emergency department (ED) in each hour and each minute? (arrival volume forecast)

Case 2: How many patients are sitting in the ED at any given minute? (patient census)

Import and preview the patient data.

library(tidyverse)  # dplyr, ggplot2, tidyr, tibble
library(lubridate)  # mdy_hm(), floor_date()
ed_data <- read.csv("Simulated_ed_data.csv")
head(ed_data)

##   arrival_times depart_times
## 1  10/1/18 0:05 10/1/18 2:34
## 2  10/1/18 0:15 10/1/18 3:04
## 3  10/1/18 0:16 10/1/18 2:36
## 4  10/1/18 0:19 10/1/18 2:45
## 5  10/1/18 0:26 10/1/18 3:15
## 6  10/1/18 0:35 10/1/18 3:02

Convert the date strings to date-time values.

ed_data$arrival_times <- mdy_hm(ed_data$arrival_times)
ed_data$depart_times <- mdy_hm(ed_data$depart_times)
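The conversion step can be checked on a couple of strings before it is applied to the whole column. The sketch below is a minimal illustration; `raw` and `parsed` are names introduced here, and the two strings are copied from the preview above.

```r
library(lubridate)

# Two sample strings in the same month/day/year hour:minute layout as the CSV preview
raw <- c("10/1/18 0:05", "10/1/18 2:34")

# mdy_hm() returns POSIXct date-times; the time zone defaults to UTC
parsed <- mdy_hm(raw)
class(parsed)
parsed

# Strings that do not match the expected layout become NA (with a warning),
# so counting NAs is a cheap sanity check after parsing
sum(is.na(parsed))
```

If the NA count is not zero on the real data, the affected rows are worth inspecting before any volumes are calculated.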

Case 1: Calculate the arrival volumes per hour

Round each arrival time down to the start of its hour to put the patients into one-hour brackets.

volumes_per_hour <- ed_data %>% 
  mutate(timestamp=floor_date(arrival_times,unit='hour')) %>%
  count(timestamp)
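Note that floor_date() assigns an arrival to the hour it falls in, which is not the same as rounding to the nearest hour. A small sketch of the difference, using a single made-up arrival time (`arrival` is an illustrative name):

```r
library(lubridate)

# A made-up arrival 35 minutes past midnight
arrival <- ymd_hm("2018-10-01 00:35")

# Start of the hour the arrival falls in: 00:00
floor_date(arrival, unit = "hour")

# Nearest hour, which would move the arrival into the 01:00 bracket
round_date(arrival, unit = "hour")
```

With floor_date(), each bracket covers one whole clock hour, so the count for 00:00 means "arrivals between 00:00 and 00:59".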

Visualize the arrival volumes for each hour

volumes_per_hour%>%
  ggplot(mapping=aes(x=timestamp,y=n))+geom_line()+
  labs(title="Emergency Department Arrival Volumes per hour",
       subtitle='simulated data for three days',
       caption='Data Source:500 Patients')+
  theme(
    plot.title = element_text(color = "red", size = 12, face = "bold",hjust=0.5),
    plot.subtitle = element_text(color = "blue",hjust=0.5),
    plot.caption = element_text(color = "green", face = "italic",hjust=1)
  )+
  xlab('Date')+
  ylab('Volume')

Next, we calculate the patient arrival volumes for each minute.

Our objective is to calculate the ED arrival volume for each minute.
If no patients arrive in a given minute, we fill the volume with zero.

volumes_per_minute <- ed_data %>% 
  mutate(timestamp=floor_date(arrival_times,unit='minute')) %>%
  count(timestamp)%>%
  select(timestamp,volume=n)

# Create a sequence of times from the start to the end of the available data.

start<- min(volumes_per_minute$timestamp)
end <-  max(volumes_per_minute$timestamp)
complete_window <- tibble(timestamp=seq(start,end,by='mins'))

# do a left join to get all the timestamps

(total_volumes_minute <- complete_window %>% left_join(volumes_per_minute,by='timestamp')%>%
  mutate(volume=ifelse(is.na(volume),0,volume)))

## # A tibble: 3,051 x 2
##    timestamp           volume
##    <dttm>               <dbl>
##  1 2018-10-01 00:05:00      1
##  2 2018-10-01 00:06:00      0
##  3 2018-10-01 00:07:00      0
##  4 2018-10-01 00:08:00      0
##  5 2018-10-01 00:09:00      0
##  6 2018-10-01 00:10:00      0
##  7 2018-10-01 00:11:00      0
##  8 2018-10-01 00:12:00      0
##  9 2018-10-01 00:13:00      0
## 10 2018-10-01 00:14:00      0
## # ... with 3,041 more rows
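As an alternative to building the sequence and joining by hand, tidyr's complete() can fill the gaps in one step. The sketch below is a minimal illustration on made-up counts; `toy` and `filled` are names introduced here, not part of the original analysis.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Made-up per-minute counts with gaps at 00:06 and 00:08
toy <- tibble(
  timestamp = ymd_hm(c("2018-10-01 00:05", "2018-10-01 00:07", "2018-10-01 00:09")),
  volume    = c(1L, 2L, 1L)
)

# Expand to every minute between the first and last timestamp,
# and set the volume of the newly created rows to zero
filled <- toy %>%
  complete(
    timestamp = seq(min(timestamp), max(timestamp), by = "min"),
    fill = list(volume = 0L)
  )
filled
```

Both approaches should give one row per minute; the explicit join keeps each step visible, while complete() is shorter.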

Visualize the ED arrival volumes each minute

total_volumes_minute%>%
  ggplot(mapping=aes(x=timestamp,y=volume))+geom_line()+
  labs(title="Emergency Department Arrival Volumes per minute",
       subtitle='simulated data for three days',
       caption='Data Source:500 Patients')+
  theme(
    plot.title = element_text(color = "red", size = 12, face = "bold",hjust=0.5),
    plot.subtitle = element_text(color = "blue",hjust=0.5),
    plot.caption = element_text(color = "green", face = "italic",hjust=1)
  )+
  xlab('Date')+
  ylab('Volume')

# In this plot, the denser, darker areas mark the times with high volume.
# To see patterns and trends more clearly, we would need several years of data.
# Seasonality and trend must be addressed before forecasting.
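One simple first look at daily seasonality is an hour-of-day profile: the average number of arrivals for each clock hour across all days. The sketch below uses made-up hourly counts purely for illustration (`toy_hourly` and `hourly_profile` are hypothetical names), and it is only a starting point, not a full seasonality or trend analysis.

```r
library(dplyr)
library(lubridate)

# Made-up hourly arrival counts for two days (values are illustrative only)
toy_hourly <- tibble(
  timestamp = ymd_hm(c("2018-10-01 00:00", "2018-10-01 13:00",
                       "2018-10-02 00:00", "2018-10-02 13:00")),
  n = c(8, 12, 10, 14)
)

# Average arrivals for each hour of the day, across all days
hourly_profile <- toy_hourly %>%
  mutate(hour_of_day = hour(timestamp)) %>%
  group_by(hour_of_day) %>%
  summarise(mean_arrivals = mean(n), .groups = "drop")
hourly_profile
```

With only a few days of data, such a profile is very noisy, which is the reason a longer history is needed.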

Case 2: Calculate census volumes

The census is the number of patients in the ED at any given time.
We have to consider both the patients who have arrived and the patients who have left the ED.
To accomplish this, we keep a counter that tracks every time a patient enters or leaves the ED.
When a patient walks in the door we add one to the overall count,
and when a patient leaves we subtract one.
Since we have the timestamps of when patients enter and leave, this is a simple task.
We split our data into two chunks, one for arrival times and one for departure times.
For each chunk, we create a counter variable.
This variable takes the value 1 for the arrival chunk and -1 for the departure chunk.
We then bind the two chunks back together, arrange them by time,
and take a cumulative sum of the counter variable.

As in the previous example, we need to fill the gaps where no arrivals or departures occur. This time, instead of filling the gaps with zeros, we carry the last observation forward, because in a minute where nothing changes the census stays at the previous minute's count.
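The counter idea can be checked by hand on a few patients. The sketch below uses three made-up patients (`toy` and `events` are illustrative names) and follows the same split, bind, arrange, and cumulative-sum steps described above.

```r
library(dplyr)
library(lubridate)

# Three made-up patients with arrival and departure times
toy <- tibble(
  arrival_times = ymd_hm(c("2018-10-01 00:05", "2018-10-01 00:15", "2018-10-01 00:16")),
  depart_times  = ymd_hm(c("2018-10-01 00:20", "2018-10-01 00:40", "2018-10-01 00:30"))
)

# +1 for every arrival, -1 for every departure, then a running total over time
events <- bind_rows(
  toy %>% transmute(timestamp = arrival_times, counter = 1),
  toy %>% transmute(timestamp = depart_times,  counter = -1)
) %>%
  arrange(timestamp, counter) %>%
  mutate(volume = cumsum(counter))
events
```

The running total rises to 3 after the third arrival at 00:16 and returns to 0 after the last departure at 00:40, which matches counting the patients by hand.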

Emergency department census volumes per minute

ed_data<- ed_data %>%
  mutate(arrival_times=floor_date(arrival_times,unit='minute'),
         depart_times=floor_date(depart_times,unit='minute'))

# Arrivals
arrivals <- ed_data%>% 
  select(timestamp=arrival_times)%>%
  mutate(counter=1)

# Departures
departures <- ed_data%>% 
  select(timestamp=depart_times)%>%
  mutate(counter=-1)

#ED census volumes per minute
census_volumes <- arrivals %>%
  bind_rows(departures)%>%
  arrange(timestamp,counter)%>%   #arrange by time
  mutate(volume=cumsum(counter)) #cumsum of counters to get the exact volumes at that point.

# create a sequence of times from the start to end of your available data.

start <- min(census_volumes$timestamp)
end   <- max(census_volumes$timestamp)
full_time_window <- tibble(timestamp=seq(start,end,by='mins'))

#right join to get the missing time intervals.

census_volumes <- census_volumes%>% 
  right_join(full_time_window,by='timestamp')%>%
  arrange(timestamp)%>%
  fill(volume,.direction='down') #take last observation carried forward.

# This is because, even when no patients arrive at or leave the ED in a given minute,
# the patients counted at the previous timestamp are still in the ED.
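When several arrivals or departures share a minute, the event table holds several rows for that timestamp, each with an intermediate running total. If exactly one census value per minute is wanted, one option is to keep the last running total of each minute before filling the gaps. This is an optional refinement sketched on made-up events (`events` and `per_minute` are illustrative names), not a step from the original analysis.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Made-up event log: two arrivals share the 00:05 minute, one departure at 00:08
events <- tibble(
  timestamp = ymd_hm(c("2018-10-01 00:05", "2018-10-01 00:05", "2018-10-01 00:08")),
  counter   = c(1, 1, -1)
) %>%
  arrange(timestamp, counter) %>%
  mutate(volume = cumsum(counter))

per_minute <- events %>%
  group_by(timestamp) %>%
  summarise(volume = last(volume), .groups = "drop") %>%  # census at the end of each minute
  complete(timestamp = seq(min(timestamp), max(timestamp), by = "min")) %>%
  fill(volume, .direction = "down")                       # carry the last count forward
per_minute
```

The result has one row per minute from 00:05 to 00:08, with the count of 2 carried through the two minutes in which nothing happened.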

Visualize the ED census volumes each minute

census_volumes%>%
  ggplot(mapping=aes(x=timestamp,y=volume))+
  geom_line()+
  labs(title="Emergency Department Census Volumes per minute",
       subtitle='simulated data for three days',
       caption='Data Source:500 Patients')+
  theme(
    plot.title = element_text(color = "red", size = 12, face = "bold",hjust=0.5),
    plot.subtitle = element_text(color = "blue",hjust=0.5),
    plot.caption = element_text(color = "green", face = "italic",hjust=1)
  )+
  xlab('Date')+
  ylab('Volume')

Conclusion:

Emergency Department Arrival Volumes:

With the simulated data, the arrival volume graphs suggest that most patient visits to the emergency department occur around midnight and in the afternoon of each day.
Still, to understand the patterns clearly we need patient data covering a longer period.

Emergency Department Census Volumes per minute:

The patient census fluctuates at different points in time on each day.
There are some patterns, but they are not conclusive enough to use for planning doctors' or nurses' schedules.
The fluctuations we see are expected in an emergency department, but to identify a pattern we must include at least two seasons of patient data.
That would let us check whether seasonal effects, such as temperature changes, pandemics, or natural calamities, are causing patient volumes to increase.
The data set is also too small for forecasting; planning based on so little data is not advisable and might adversely affect the planning and operations of the ED.
However, the same volume calculations can be applied to much larger data sets, even terabytes of data, to understand ED patterns, project outcomes as forecasts, and use them for future planning.

Emergency Department Volumes

Sasidhar Maddipatla

11/17/2021