A function for filling in missing dates

I run into the problem of missing dates often enough that I decided to write a function for it and include it in the denodoExtractor package. That way, the function is available whenever you load the package, and you don’t have to keep copying it from one folder or script to another.

Example data

Let’s say we have some admits data from a small clinic, for January 2019.

df <- data.frame(admit_date = c("2019-01-01", 
                                "2019-01-04", 
                                "2019-01-10", 
                                "2019-01-23"), 
                 admits = c(25, 40, 12, 50))


library(dplyr)       # %>% 
library(knitr)       # kable()
library(kableExtra)  # kable_styling()

df %>% 
  kable() %>% 
  kable_styling(bootstrap_options = c("striped","condensed","responsive"), 
                full_width = FALSE)

admit_date	admits
2019-01-01	25
2019-01-04	40
2019-01-10	12
2019-01-23	50

Problem statement

How do I get a dataframe with 1 row for each day in January, regardless of whether or not there was an admit on that date? This is not hard to do, but it is a little tedious, and I want to remove as much friction as possible, so that it’s easy to focus on the big-picture goals of whatever analysis I’m doing.
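For reference, here is a minimal sketch of what the manual approach can look like in base R. It is not the package's implementation, just one common way to do it: build a full calendar, then left join the observed data onto it. The names `all_days` and `df_filled` are chosen here for illustration, and the dates are parsed with `as.Date()` so that the join keys have the same type.

```r
# Example data from above, with the dates parsed as Date objects
df <- data.frame(admit_date = as.Date(c("2019-01-01", "2019-01-04",
                                        "2019-01-10", "2019-01-23")),
                 admits = c(25, 40, 12, 50))

# Step 1: one row per calendar day in the range of interest
all_days <- data.frame(admit_date = seq(as.Date("2019-01-01"),
                                        as.Date("2019-01-31"),
                                        by = "day"))

# Step 2: left join the observed data onto the full calendar;
# days with no admits get NA in the admits column
df_filled <- merge(all_days, df, by = "admit_date", all.x = TRUE)

nrow(df_filled)  # 31
```

Two steps is not much, but repeating them in every script is exactly the kind of friction worth removing.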

Why?

Most obviously, to plot a time series or to calculate a rate with the correct denominator.
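To make the denominator point concrete, here is a small sketch using the four rows of example data. The variable names are made up for illustration. Dividing by the number of rows gives the average over days that had admits, while dividing by the number of days in the month gives the average daily admits for January.

```r
admits <- c(25, 40, 12, 50)     # the four recorded days from the example data

days_with_admits <- length(admits)  # 4 rows in the raw data
days_in_january  <- 31              # rows after filling in the missing dates

# Average over recorded days only: the denominator ignores the empty days
rate_observed_days <- sum(admits) / days_with_admits   # 31.75

# Average over every day in the month: the denominator counts the empty days
rate_all_days <- sum(admits) / days_in_january         # about 4.1
```

The two numbers differ by almost a factor of eight, which is why the filled-in data matters.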

Solution

Install/Load the package first:

# install the package first: 
# devtools::install_github("nayefahmad/denodoExtractor")
# library(denodoExtractor)

Check function documentation:

# ?fill_dates

Run the function:

You only need to supply three arguments:

the name of the column that has dates in it
the start date of the range that you want to “fill in”
the end date of the range that you want to “fill in”

df %>% fill_dates(admit_date, 
                  "2019-01-01", 
                  "2019-01-31") %>% 
  
  # datatable() (from the DT package) only displays the result as an 
  # interactive table; it is not needed to use fill_dates(): 
  datatable()

## Joining, by = "dates_fill"

Quick plot

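library(tidyr)    # replace_na()
library(ggplot2)  # ggplot(), geom_point(), geom_line()

df %>% fill_dates(admit_date, 
                  "2019-01-01", 
                  "2019-01-31") %>%
  # filled-in days have NA admits; treat them as zero admits for plotting
  replace_na(replace = list(admits = 0)) %>% 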
  
  ggplot(aes(x = dates_fill, 
             y = admits)) + 
  geom_point() + 
  geom_line()

## Joining, by = "dates_fill"