# Business Scenario: Emergency Department Volume Analysis

Case 1: How many patients arrive at the emergency department (ED) in each hour and in each minute? (Arrival volumes forecast)

Case 2: How many patients are in the ED at any given minute? (Patient census)

Import and preview the patient data

library(tidyverse)  # dplyr, ggplot2, tidyr, tibble
library(lubridate)  # mdy_hm(), floor_date()

ed_data <- read.csv("Simulated_ed_data.csv")
head(ed_data)
##   arrival_times depart_times
## 1  10/1/18 0:05 10/1/18 2:34
## 2  10/1/18 0:15 10/1/18 3:04
## 3  10/1/18 0:16 10/1/18 2:36
## 4  10/1/18 0:19 10/1/18 2:45
## 5  10/1/18 0:26 10/1/18 3:15
## 6  10/1/18 0:35 10/1/18 3:02

Convert the date strings to date-time values.

ed_data$arrival_times <- mdy_hm(ed_data$arrival_times)
ed_data$depart_times <- mdy_hm(ed_data$depart_times)
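
Parsing can fail quietly: `mdy_hm()` returns `NA` (with a warning) for strings it cannot read. A minimal sketch of a sanity check follows. The strings in `raw_times` are made up for illustration and only mimic the layout of the CSV; the names `raw_times`, `parsed`, and `n_failed` are not part of the original analysis.

```r
library(lubridate)

# Illustrative strings in the same month/day/year hour:minute layout as the CSV;
# the last one is deliberately malformed.
raw_times <- c("10/1/18 0:05", "10/1/18 2:34", "not a date")

# mdy_hm() returns POSIXct values in UTC by default; unparseable input becomes NA.
parsed <- suppressWarnings(mdy_hm(raw_times))

# Count parse failures before doing any further work.
n_failed <- sum(is.na(parsed))
print(parsed)
print(n_failed)
```

Running the same `sum(is.na(...))` check on `ed_data$arrival_times` and `ed_data$depart_times` after conversion would show whether any rows were lost to parsing.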

Case 1: Calculate the arrival volumes per hour

volumes_per_hour <- ed_data %>% 
  mutate(timestamp=floor_date(arrival_times,unit='hour')) %>%
  count(timestamp)
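
To make the bucketing concrete, here is a small sketch of the same `floor_date()` plus `count()` step on three made-up arrivals (the `toy` data is an assumption for illustration, not taken from the file).

```r
library(dplyr)
library(tibble)
library(lubridate)

# Made-up arrivals: two in the 00:00 hour, one exactly at 01:00.
toy <- tibble(arrival_times = ymd_hm(c("2018-10-01 00:05",
                                       "2018-10-01 00:59",
                                       "2018-10-01 01:00")))

toy_hourly <- toy %>%
  mutate(timestamp = floor_date(arrival_times, unit = "hour")) %>%  # round down to the hour
  count(timestamp)                                                  # one row per hour, n = arrivals

print(toy_hourly)
```

Note that `count()` only returns hours that contain at least one arrival, so an hour with no patients would be missing from the result rather than shown as zero.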

Visualize the arrival volumes for each hour

volumes_per_hour%>%
  ggplot(mapping=aes(x=timestamp,y=n))+geom_line()+
  labs(title="Emergency Department Arrival Volumes per hour",
       subtitle='simulated data for three days',
       caption='Data Source:500 Patients')+
  theme(
    plot.title = element_text(color = "red", size = 12, face = "bold",hjust=0.5),
    plot.subtitle = element_text(color = "blue",hjust=0.5),
    plot.caption = element_text(color = "green", face = "italic",hjust=1)
  )+
  xlab('Date')+
  ylab('Volume')

Next, we calculate the patient arrival volumes for each minute

volumes_per_minute <- ed_data %>% 
  mutate(timestamp=floor_date(arrival_times,unit='minute')) %>%
  count(timestamp)%>%
  select(timestamp,volume=n)

# Create a sequence of minutes from the start to the end of the available data.

start<- min(volumes_per_minute$timestamp)
end <-  max(volumes_per_minute$timestamp)
complete_window <- tibble(timestamp=seq(start,end,by='mins'))

# Left join onto the complete window so every minute appears; minutes with no arrivals get NA, which is replaced by 0.

(total_volumes_minute <- complete_window %>% left_join(volumes_per_minute,by='timestamp')%>%
  mutate(volume=ifelse(is.na(volume),0,volume)))
## # A tibble: 3,051 x 2
##    timestamp           volume
##    <dttm>               <dbl>
##  1 2018-10-01 00:05:00      1
##  2 2018-10-01 00:06:00      0
##  3 2018-10-01 00:07:00      0
##  4 2018-10-01 00:08:00      0
##  5 2018-10-01 00:09:00      0
##  6 2018-10-01 00:10:00      0
##  7 2018-10-01 00:11:00      0
##  8 2018-10-01 00:12:00      0
##  9 2018-10-01 00:13:00      0
## 10 2018-10-01 00:14:00      0
## # ... with 3,041 more rows
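
As a possible alternative to building the window and joining by hand, `tidyr::complete()` can expand the timestamps and fill the gaps in one step. The sketch below uses made-up counts (`toy_counts` is an assumption for illustration) and is not the method used above, only an equivalent formulation.

```r
library(dplyr)
library(tidyr)
library(tibble)
library(lubridate)

# Made-up per-minute counts with a gap between 00:05 and 00:08.
toy_counts <- tibble(
  timestamp = ymd_hm(c("2018-10-01 00:05", "2018-10-01 00:08")),
  volume    = c(1L, 2L)
)

# Expand to every minute in the observed range and set missing volumes to 0.
toy_filled <- toy_counts %>%
  complete(
    timestamp = seq(min(timestamp), max(timestamp), by = "min"),
    fill = list(volume = 0L)
  )

print(toy_filled)
```

Either way, the number of rows should equal the number of minutes between the first and last arrival, inclusive, which is a quick check that no minute was dropped.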

Visualize the ED arrival volumes each minute

total_volumes_minute%>%
  ggplot(mapping=aes(x=timestamp,y=volume))+geom_line()+
  labs(title="Emergency Department Arrival Volumes per minute",
       subtitle='simulated data for three days',
       caption='Data Source:500 Patients')+
  theme(
    plot.title = element_text(color = "red", size = 12, face = "bold",hjust=0.5),
    plot.subtitle = element_text(color = "blue",hjust=0.5),
    plot.caption = element_text(color = "green", face = "italic",hjust=1)
  )+
  xlab('Date')+
  ylab('Volume')

# In this plot, the denser, darker regions are the periods with high arrival volume.
# To see patterns and trend clearly, we need data covering several years.
# Seasonality and trend must be addressed before forecasting.
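
One simple first look at daily seasonality is to average the hourly counts by hour of day. The sketch below assumes a table shaped like `volumes_per_hour` (columns `timestamp` and `n`); the values in `toy_hourly` are made up for illustration and say nothing about the real data.

```r
library(dplyr)
library(tibble)
library(lubridate)

# Made-up hourly arrival counts over two days (not the real data).
toy_hourly <- tibble(
  timestamp = ymd_h(c("2018-10-01 00", "2018-10-01 14",
                      "2018-10-02 00", "2018-10-02 14")),
  n         = c(6, 9, 4, 11)
)

# Average arrivals for each hour of the day across all days.
hour_profile <- toy_hourly %>%
  mutate(hour_of_day = hour(timestamp)) %>%
  group_by(hour_of_day) %>%
  summarise(mean_arrivals = mean(n))

print(hour_profile)
```

With only a few days of data, such a profile is an exploratory summary at best; it does not replace a proper treatment of trend and seasonality.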

Case 2: Calculate census volumes.

As in the previous example, we need to fill in the minutes where no arrival or departure occurred. This time, instead of filling the gaps with zeros, we carry the last observation forward, because the patients who were in the ED in the previous minute are still there when nothing changes.

Emergency department census volumes per minute

ed_data<- ed_data %>%
  mutate(arrival_times=floor_date(arrival_times,unit='minute'),
         depart_times=floor_date(depart_times,unit='minute'))

# Arrivals
arrivals <- ed_data%>% 
  select(timestamp=arrival_times)%>%
  mutate(counter=1)

# Departures
departures <- ed_data%>% 
  select(timestamp=depart_times)%>%
  mutate(counter=-1)

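# ED census volumes per minute
census_volumes <- arrivals %>%
  bind_rows(departures) %>%
  arrange(timestamp, counter) %>%       # sort by time; departures before arrivals within a minute
  mutate(volume = cumsum(counter)) %>%  # running total of +1/-1 events gives the census after each event
  group_by(timestamp) %>%
  summarise(volume = last(volume))      # keep one row per minute: the census after all events in that minute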
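
The running total can be cross-checked against a direct count of patients who have arrived but not yet departed. The sketch below does this on three made-up stays; `toy_ed` and the helper `census_at()` are hypothetical names introduced only for this check.

```r
library(dplyr)
library(tibble)
library(lubridate)

# Made-up stays for three patients.
toy_ed <- tibble(
  arrival_times = ymd_hm(c("2018-10-01 00:05", "2018-10-01 00:07", "2018-10-01 00:10")),
  depart_times  = ymd_hm(c("2018-10-01 00:09", "2018-10-01 00:12", "2018-10-01 00:11"))
)

# Event-based census: +1 per arrival, -1 per departure, then a running total.
toy_events <- bind_rows(
  toy_ed %>% transmute(timestamp = arrival_times, counter = 1),
  toy_ed %>% transmute(timestamp = depart_times,  counter = -1)
) %>%
  arrange(timestamp, counter) %>%
  mutate(volume = cumsum(counter))

# Direct count: patients with arrival <= t and departure > t.
census_at <- function(t) sum(toy_ed$arrival_times <= t & toy_ed$depart_times > t)

direct <- vapply(
  seq_along(toy_events$timestamp),
  function(i) census_at(toy_events$timestamp[i]),
  integer(1)
)

print(cbind(event_based = toy_events$volume, direct = direct))
```

The two columns agree here because every event in the toy data has its own minute. When several events share a minute, only the last running total for that minute matches the direct count.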

# Create a sequence of minutes from the start to the end of the available data.

start <- min(census_volumes$timestamp)
end   <- max(census_volumes$timestamp)
full_time_window <- tibble(timestamp=seq(start,end,by='mins'))

# Right join onto the full window to add the minutes that have no arrival or departure.

census_volumes <- census_volumes%>% 
  right_join(full_time_window,by='timestamp')%>%
  arrange(timestamp)%>%
  fill(volume,.direction='down') #take last observation carried forward.

# This is because, even when nobody arrives at or leaves the ED in a given minute,
# the patients counted at the previous timestamp are still present.
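
A small sketch of the carry-forward step on made-up values shows the difference from zero filling. The `toy_census` table is an assumption for illustration: a census of 1 at 00:05 and 2 at 00:08, with nothing recorded in between.

```r
library(dplyr)
library(tidyr)
library(tibble)
library(lubridate)

# Made-up census values observed only at event times.
toy_census <- tibble(
  timestamp = ymd_hm(c("2018-10-01 00:05", "2018-10-01 00:08")),
  volume    = c(1, 2)
)

# Every minute between the first and last event.
toy_window <- tibble(
  timestamp = seq(min(toy_census$timestamp), max(toy_census$timestamp), by = "min")
)

toy_locf <- toy_census %>%
  right_join(toy_window, by = "timestamp") %>%  # adds 00:06 and 00:07 with NA volume
  arrange(timestamp) %>%
  fill(volume, .direction = "down")             # carry the last known census forward

print(toy_locf)
```

Filling 00:06 and 00:07 with zero instead would claim the ED was empty in those minutes, even though the patient who arrived at 00:05 had not left.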

Visualize the ED census volumes each minute

census_volumes%>%
  ggplot(mapping=aes(x=timestamp,y=volume))+
  geom_line()+
  labs(title="Emergency Department Census Volumes per minute",
       subtitle='simulated data for three days',
       caption='Data Source:500 Patients')+
  theme(
    plot.title = element_text(color = "red", size = 12, face = "bold",hjust=0.5),
    plot.subtitle = element_text(color = "blue",hjust=0.5),
    plot.caption = element_text(color = "green", face = "italic",hjust=1)
  )+
  xlab('Date')+
  ylab('Volume')

Conclusion:

Emergency Department Arrival Volumes:

  • With the simulated data, the arrival volume graphs suggest that most patient visits to the emergency department occur around midnight and in the afternoon each day.

  • However, to understand the patterns clearly, we need patient data covering a longer period.

Emergency Department Census Volumes per Minute:

  • The patient census fluctuates throughout each day.

  • There are some patterns, but they are not conclusive enough to plan doctors' or nurses' schedules.

  • The fluctuations we see are expected in an emergency department. To identify a pattern, we need at least two seasons of patient data.

  • That would let us check whether seasonal effects, such as temperature changes, pandemics, or natural disasters, are driving patient volumes up.

  • The data set is also too small for forecasting; planning based on so little data is not advisable and could harm ED planning and operations.

  • The same volume calculations can, however, be applied to much larger data sets, even terabytes, to understand ED patterns, produce forecasts, and support future planning.