Let’s say you have just created a schedule for your workforce. If your workforce has the same shift start and shift end in all days, it should be easy to find how many heads you will have per interval. What if that was not the case? Most WFM schedulers have to create much more dynamic schedules with not only multiple shift start and shift end but also with shifts going across days. It is also possible that agents do not work all the days, and the shifts are of varying length. Once you have created such schedule, how do you convert it to coverage data to check your efficiency per interval? I have seen some excel solutions for this but let’s see how we can do this in much easier way using R. At the end of this, we will plot the coverage against the predetermined requirement and calculate the schedule efficiency.
Let’s import a sample schedule. Note that it is 24/7 and has some spanning across 2 days.
(schedule <- read.csv("schedule.csv"))
## Agent Dates start end
## 1 a1 2022-03-16 1 7
## 2 a1 2022-03-17 3 9
## 3 a2 2022-03-16 5 13
## 4 a2 2022-03-17 18 4
## 5 a3 2022-03-16 13 21
## 6 a3 2022-03-17 NA NA
We need to handle shifts that cross to the next day and shifts those which do not separately, and combine the results later. Let’s assume nobody works more than 12 hours a day, so shift end should be less than shift start if the shift crosses to the next day. This will work only if the time start and time end are in 24-hr notation.
# create two copies of df
schedule1<- schedule
schedule2<- schedule
# if end is less than start, end = end +24.
schedule1$end <- ifelse(schedule1$start > schedule1$end, schedule1$end + 24, schedule1$end)
schedule1$Dates <- as.Date(schedule1$Dates)
# 1 if working in that interval, 0 otherwise.
for (i in c(0:23))
{
schedule1[as.character(i)] <- ifelse(schedule1$start <= i & schedule1$end > i, 1,0)
}
Similarly for the parts of the schedule overflowing to the next day:
# schedule start is midnight.
schedule2$start2 <- 0
# take only those shifts that overflow. end = start =0 otherwise
schedule2$end2 <- ifelse(schedule2$start > schedule2$end, schedule2$end, 0)
# the next day
schedule2$Dates <- as.Date(schedule2$Dates) + 1
# 1 if working in that interval, 0 otherwise.
for (i in c(0:23))
{
schedule2[as.character(i)] <- ifelse(schedule2$start2 <= i & schedule2$end2 > i, 1,0)
}
Join the two parts horizontally
schedule_12 <- rbind(schedule1, schedule2[,c(-5,-6)]) # remove start2 and end2 columns to match to schedule1
Next we find the number of heads in every interval per day. We will create a dataframe with one column for intervals 0 to 23, each representing the 1 hour interval until the next hour.
library(tidyverse)
coverage <- data.frame(Interval = c(0:23)) # column interval
for (datee in unique(schedule_12$Dates))
{
coverage[paste0("Date_",as.Date(datee,origin="1970-01-01"))] <- schedule_12 %>% filter(Dates==datee) %>% select(!(Agent:end)) %>% colSums(na.rm = T)
}
head(coverage)
## Interval Date_2022-03-16 Date_2022-03-17 Date_2022-03-18
## 1 0 0 0 1
## 2 1 1 0 1
## 3 2 1 0 1
## 4 3 1 1 1
## 5 4 1 1 0
## 6 5 2 1 0
Melting the table from wide to long format, easier for plotting.
coverage_final <- pivot_longer(coverage, -c(Interval), values_to = "Heads", names_to = "Date")
coverage_final <- coverage_final[order(coverage_final$Date,coverage_final$Interval), ] # order by date and then by interval
library(DT)
datatable(coverage_final, rownames = F)
Plotting the coverage per interval. ggplot2’s facet_wrap function allows plotting by group. We will use this to show coverage of each date in a separate plot.
ggplot(coverage_final, aes(x = Interval, y = Heads)) +
geom_col(fill="tomato4")+
facet_wrap(~Date,ncol = 2)+
scale_x_continuous(breaks = seq(0,23,2))
It is often important to check the coverage against the requirement. We will add a column to our coverage_final dataframe and plot together.
coverage_final$Required <- round((runif(n=72,min = 0.5,max = 1.7))*coverage_final$Heads) # let's assume the requirement is somewhere between 50% and 170% of the scheduled heads
head(coverage_final)
## # A tibble: 6 x 4
## Interval Date Heads Required
## <int> <chr> <dbl> <dbl>
## 1 0 Date_2022-03-16 0 0
## 2 1 Date_2022-03-16 1 1
## 3 2 Date_2022-03-16 1 1
## 4 3 Date_2022-03-16 1 1
## 5 4 Date_2022-03-16 1 1
## 6 5 Date_2022-03-16 2 3
Let’s plot it against the scheduled heads.
ggplot(coverage_final, aes(x = Interval)) +
geom_col(aes(y = Heads),fill="tomato4")+
geom_line(aes(y = Required),fill="darkgreen", size = 1.5)+
facet_wrap(~Date,ncol = 2)+
scale_x_continuous(breaks = seq(0,23,2))
The honest efficiency calculation should be weighed average of per interval efficiency values. The efficiency is one minus the error. The error is the absolute difference of the schedule from the required. You may treat the understaffing and overstaffing differently if their business interpretations are different.
coverage_final$delta <- abs(coverage_final$Heads - coverage_final$Required) # NA if required = 0.
(Efficiency <- paste0(100*(1-sum(coverage_final$delta)/sum(coverage_final$Required)),"%"))
## [1] "82.6086956521739%"
I am crazy about data science and applying data science skills to workforce management. Reach me at LinkedIn if you wish to connect :)
You may as well enjoy my other blogs at RPubs