Introduction to Time Maps

This visualization of car towing events timing in the city of Baltimore was inspired by Max Watson’s post Time Maps: Visualizing Discrete Events Across Many Timescales. Please follow the link for more details on the logic and interpretation of time map graphs.

Loading R packages and raw data

The data was obtained from Open Baltimore data portal and imported into a CSV file named DOT_Towing.csv.

library(readr)
library(lubridate)
library(ggplot2)
library(ggthemes)
library(scales)
library(dplyr)

# markdown options
knitr::opts_chunk$set(tidy=FALSE, fig.align='center')

# raw data from Baltimore portal
raw <- read_csv('../data/raw/DOT_Towing.csv')

Intraday patterns

Before plotting the time maps of tow events, let’s check the intraday pattern of towing frequency. First, we need to strop the dates from the data/time variable.

# convert from text to date
towedDateTime <- mdy_hms(raw$towedDateTime)
towedTime <- towedDateTime

# change the date to a fixed day in a year (keeps time only)
year(towedTime) <- 2015
month(towedTime) <- 1
day(towedTime) <- 1

Next, we add a day of the week and a weekend variable and remove tow companies with less than 3,000 records.

df_day <- 
    data_frame(
        towCompany = raw$towCompany, 
        towedDateTime,
        towedTime) %>% 
    # remove missing
    filter(!is.na(towCompany) & !is.na(towedDateTime)) %>% 
    # week day variables
    mutate(
        week_day = wday(towedDateTime, label = T, abbr = T),
        week_end = ifelse(week_day %in% c('Sun','Sat'), 
                          'Weekend','Weekday') %>% factor) %>% 
    # remove tow companies with few records
    group_by(towCompany) %>% 
    mutate(company_n = n()) %>% 
    filter(company_n>=3000)

head(df_day)
## Source: local data frame [6 x 6]
## Groups: towCompany [2]
## 
##               towCompany       towedDateTime           towedTime week_day
##                    (chr)              (time)              (time)   (fctr)
## 1                   City 2015-11-13 13:05:00 2015-01-01 13:05:00      Fri
## 2                   City 2015-11-13 12:41:00 2015-01-01 12:41:00      Fri
## 3                   City 2015-11-13 12:33:00 2015-01-01 12:33:00      Fri
## 4 McDels Enterprise Inc. 2015-11-13 12:33:00 2015-01-01 12:33:00      Fri
## 5 McDels Enterprise Inc. 2015-11-13 12:24:00 2015-01-01 12:24:00      Fri
## 6 McDels Enterprise Inc. 2015-11-13 12:23:00 2015-01-01 12:23:00      Fri
## Variables not shown: week_end (fctr), company_n (int)

We use the data above to plot a histogram of all towing events in the dataset by time of day. The plot reveals a spike in towing activity around 8AM and even bigger spike around 5PM.

p <- 
    ggplot(df_day, aes(towedTime))+
    geom_histogram(binwidth=60*60, fill='white', color='black')+
    scale_x_datetime(breaks = date_breaks("3 hours"), 
                     labels = date_format("%I%p"))+
    theme_few(base_size = 14) + 
    theme(axis.text.x = element_text(angle = 90, vjust = 0))

p

If we plot Saturday and Sunday separately, we can see that the weekends are much more evenly distributed and do not exhibit the same spikes in towing as the regular weekdays.

p + facet_wrap(~week_end)

Finally, we can compare the timing patterns between the tow companies. Note that the plot scale is different for each company, because we’re interested timing not volume.

p + facet_grid(towCompany~week_end, scales = 'free_y')+
    theme(strip.text.y = element_text(angle = 0, hjust = 0),
          axis.text.y = element_text(size = 8))

Prepare Time Maps data

The calculations for making the time maps are straight-forward and are shown below.

df_tm <- 
    raw %>% 
    # remove missing companies
    filter(!is.na(towCompany)) %>% 
    select(towCompany, towedDateTime) %>%
    # convert date time
    mutate(towedDateTime = mdy_hms(towedDateTime)) %>% 
    # remove companies with few records
    group_by(towCompany) %>%
    filter(n()>=3000) %>% 
    # get before and after times
    arrange(towedDateTime) %>% 
    mutate(prv = lag(towedDateTime),
           nxt = lead(towedDateTime)) %>% 
    filter(!is.na(prv) & !is.na(nxt)) %>% 
    ungroup() %>% 
    # calculate x and y mapping
    mutate(x = as.numeric(towedDateTime - prv), 
           y = as.numeric(nxt - towedDateTime),
           towCompany = factor(towCompany)) %>% 
    # change 0's to 1 seconds (for log scale plot) and convert to hours
    mutate(x = pmax(1, x)/3600, 
           y = pmax(1, y)/3600)

head(df_tm %>% select(-prv, -nxt))
## Source: local data frame [6 x 4]
## 
##   towCompany       towedDateTime            x            y
##       (fctr)              (time)        (dbl)        (dbl)
## 1       City 2010-10-25 14:49:00 2.618333e+01 1.389767e+03
## 2       City 2010-12-22 12:35:00 1.389767e+03 9.166667e-01
## 3       City 2010-12-22 13:30:00 9.166667e-01 7.500000e-01
## 4       City 2010-12-22 14:15:00 7.500000e-01 3.500000e-01
## 5       City 2010-12-22 14:36:00 3.500000e-01 6.666667e-02
## 6       City 2010-12-22 14:40:00 6.666667e-02 2.777778e-04

Plot Time Maps

Below is the time map for the entire dataset.

p <- 
    ggplot(
        data = df_tm,
        mapping = aes(x,y)
    )+
    geom_density2d()+
    stat_density2d(aes(fill = ..level..), geom="polygon")+
    scale_fill_gradient(low="greenyellow", high="red")+
    scale_x_log10()+ scale_y_log10()+
    geom_abline(linetype=2)+
    theme_few()+
    theme(text=element_text(size=18),
          plot.title = element_text(lineheight=1.75, face="bold"),
          strip.text = element_text(size = 12, face="bold"),
          legend.position="none",
          axis.text.x=element_blank(), 
          axis.text.y=element_blank())+
    labs(x='Time from Previous Towing',
         y="Time untill Next Towing",
         title='Baltimore Time Maps of Tow Events')

p + theme(plot.title = element_text(size=16))

Finally, we can also plot the time map of each individual tow company.

p + 
    facet_wrap(~towCompany)+
    labs(title='Baltimore Time Maps of Tow Events by Company')