Required Libraries

library(tidyverse)
library(lubridate)
library(httr)
library(jsonlite)

Tidyverse Packages

The Tidyverse bundles many packages that allow users to work with strings, mutate and rearrange data frames, and access data through APIs or websites. The packages it includes are listed below.

tidyverse_packages()
##  [1] "broom"         "cli"           "crayon"        "dbplyr"       
##  [5] "dplyr"         "dtplyr"        "forcats"       "ggplot2"      
##  [9] "googledrive"   "googlesheets4" "haven"         "hms"          
## [13] "httr"          "jsonlite"      "lubridate"     "magrittr"     
## [17] "modelr"        "pillar"        "purrr"         "readr"        
## [21] "readxl"        "reprex"        "rlang"         "rstudioapi"   
## [25] "rvest"         "stringr"       "tibble"        "tidyr"        
## [29] "xml2"          "tidyverse"

Import Data

To demonstrate the capabilities of the lubridate package within the Tidyverse, I will be using a data set on requested film permits in NYC. As per its documentation, “Lubridate provides tools that make it easier to parse and manipulate dates.”

Description of Data: The Film Office issues permits to productions filming on location in the City of New York and provides free police assistance, free parking privileges and access to most exterior locations free of charge. Not all filming activity requires a permit. These permits are generally required when asserting the exclusive use of city property, like a sidewalk, a street, or a park.

Source: NYC Filming Permits

import_data <- as.data.frame(fromJSON('https://data.cityofnewyork.us/resource/tg4x-b46p.json', simplifyVector = TRUE))
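As a minimal offline sketch of what fromJSON() does with a response like this, the snippet below parses a small inline JSON array instead of calling the API. The two records are hypothetical stand-ins that reuse a few field names from the data set; the point is only that an array of JSON objects simplifies to a data frame whose datetime fields arrive as plain character strings.

```r
library(jsonlite)

# Hypothetical records shaped like the API response (most fields omitted;
# the second eventid is made up for illustration)
json_txt <- '[
  {"eventid": "677277", "startdatetime": "2022-10-18T07:00:00.000", "borough": "Brooklyn"},
  {"eventid": "677278", "startdatetime": "2022-10-18T09:00:00.000", "borough": "Manhattan"}
]'

# An array of objects is simplified into a data frame, one row per object
sample_df <- as.data.frame(fromJSON(json_txt, simplifyVector = TRUE))

# Every column is character at this point, including the datetime column
str(sample_df)
```

Because the datetimes come through as text, they need to be parsed before any date arithmetic is possible.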

Brief Glance of the Raw Dataset

After importing the data using jsonlite, we are left with nineteen (19) columns, which include the start and end dates of permits within different zip codes in New York City. The dates use the ISO 8601 standard format, in which the date and time are separated by the literal character “T”.

knitr::kable(head(import_data, 1))
| eventid | eventtype | startdatetime | enddatetime | enteredon | eventagency | parkingheld | borough | communityboard_s | policeprecinct_s | category | subcategoryname | country | zipcode_s |
|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|
| 677277 | Theater Load in and Load Outs | 2022-10-18T07:00:00.000 | 2022-10-18T12:00:00.000 | 2022-10-17T15:40:31.000 | Mayor’s Office of Film, Theatre & Broadcasting | ASHLAND PLACE between HANSON PLACE and LAFAYETTE AVENUE | Brooklyn | 2 | 88 | Theater | Theater | United States of America | 11217 |

Convert ISO to POSIXct

Lubridate has fantastic documentation. To make our date columns much more useful, I will start with the ymd_hms() function, which parses the original datetime columns from ISO 8601 strings into POSIXct values.

update_df <- 
  import_data |> 
  mutate(start_date = ymd_hms(startdatetime),
         end_date = ymd_hms(enddatetime))
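To see the conversion in isolation, here is a small sketch that parses the startdatetime value from the preview row. Note that ymd_hms() assigns the UTC time zone unless told otherwise; whether the permit times are actually local New York times is an assumption on my part, so the tz argument is shown only as an option.

```r
library(lubridate)

# startdatetime value taken from the preview row above
iso_string <- "2022-10-18T07:00:00.000"

# Default behaviour: the clock time is kept and labelled as UTC
parsed_utc <- ymd_hms(iso_string)

# Assumption: if the timestamps are local NYC times, tz can say so
parsed_nyc <- ymd_hms(iso_string, tz = "America/New_York")

# Unparseable input becomes NA; quiet = TRUE suppresses the warning
parsed_bad <- ymd_hms("not a date", quiet = TRUE)

class(parsed_utc)  # "POSIXct" "POSIXt"
```

Both versions keep 07:00 as the clock time; only the time zone attached to it differs.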

Create Day of Week and Hour Columns

Now that we have dates in a workable datetime format, I can extract the day of the week and the hour from the timestamps to perform some exploratory analysis. To keep it simple, only the start date will be used.

update_df <-
  update_df |> 
  mutate(start_day_of_week = wday(start_date, label = TRUE, abbr = TRUE),
         start_hour = hour(start_date))
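The two accessors can be tried on a single timestamp, as in the sketch below. It uses the start time from the preview row and assumes lubridate's default week start (Sunday) has not been changed through the lubridate.week.start option.

```r
library(lubridate)

start_date <- ymd_hms("2022-10-18T07:00:00.000")

# Numeric day of week; with the default week start, Sunday is day 1
day_num <- wday(start_date)                  # 3 (a Tuesday)

# Same date with Monday counted as day 1
day_num_monday_first <- wday(start_date, week_start = 1)  # 2

# Labelled version: an ordered factor, abbreviated to three letters
# (the label text depends on the session locale)
day_label <- wday(start_date, label = TRUE, abbr = TRUE)

# Hour component of the timestamp
start_hour <- hour(start_date)               # 7
```

Using label = TRUE returns an ordered factor, which keeps the days in calendar order when they are later counted or plotted.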

Preview Columns of Interest

Let's reduce the number of columns to look at for our exploration. The preview contains the borough, category, event type, start date, start day of the week, and start hour.

| borough | category | eventtype | start_date | start_day_of_week | start_hour |
|:--|:--|:--|:--|:--|--:|
| Brooklyn | Theater | Theater Load in and Load Outs | 2022-10-18 07:00:00 | Tue | 7 |
| Manhattan | Television | Shooting Permit | 2022-10-18 09:00:00 | Tue | 9 |
| Brooklyn | Television | Shooting Permit | 2022-10-18 07:00:00 | Tue | 7 |
| Manhattan | Still Photography | Shooting Permit | 2022-10-18 08:00:00 | Tue | 8 |
| Bronx | Commercial | Shooting Permit | 2022-10-18 06:00:00 | Tue | 6 |
| Queens | Television | Shooting Permit | 2022-10-17 07:00:00 | Mon | 7 |
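The code that produced this preview is not shown, so the following is only one plausible way to get it, using dplyr's select(). The small update_df built here is a hypothetical stand-in with two rows based on the preview so that the snippet runs on its own.

```r
library(dplyr)
library(lubridate)

# Hypothetical stand-in for update_df (the second eventid is made up)
update_df <- tibble(
  eventid       = c("677277", "677278"),
  eventtype     = c("Theater Load in and Load Outs", "Shooting Permit"),
  startdatetime = c("2022-10-18T07:00:00.000", "2022-10-18T09:00:00.000"),
  borough       = c("Brooklyn", "Manhattan"),
  category      = c("Theater", "Television")
) |>
  mutate(start_date = ymd_hms(startdatetime),
         start_day_of_week = wday(start_date, label = TRUE, abbr = TRUE),
         start_hour = hour(start_date))

# Keep only the columns of interest, in the order used by the preview
preview_df <- update_df |>
  select(borough, category, eventtype,
         start_date, start_day_of_week, start_hour)

head(preview_df)
```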

Visualize Dates

Lastly, two plots will be created using only the “Shooting Permit” events: the first shows the day of the week filming begins within each borough, and the second shows the hour filming begins for each category, such as television or commercial.

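| borough | Mon | Tue | Wed | Thu | Fri | Sat | Sun |
|:--|--:|--:|--:|--:|--:|--:|--:|
| Bronx | 7 | 10 | 9 | 6 | 2 | 2 | 0 |
| Brooklyn | 53 | 60 | 60 | 65 | 64 | 7 | 6 |
| Manhattan | 48 | 63 | 52 | 59 | 45 | 13 | 21 |
| Queens | 28 | 39 | 42 | 38 | 38 | 2 | 1 |
| Staten Island | 1 | 3 | 3 | 2 | 2 | 0 | 0 |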
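A borough-by-weekday table of this shape can be built by filtering, counting, and pivoting. The sketch below is an assumption about how such counts might be produced, not the original code; the permits data frame is a tiny made-up sample so the example runs without the API.

```r
library(dplyr)
library(tidyr)

# Hypothetical sample; the real data would come from update_df
permits <- tibble(
  borough   = c("Brooklyn", "Brooklyn", "Manhattan", "Queens"),
  eventtype = c("Shooting Permit", "Shooting Permit",
                "Shooting Permit", "Theater Load in and Load Outs"),
  start_day_of_week = factor(c("Mon", "Tue", "Tue", "Mon"),
                             levels = c("Mon", "Tue"))
)

day_counts <- permits |>
  filter(eventtype == "Shooting Permit") |>   # keep shooting permits only
  count(borough, start_day_of_week) |>        # one row per borough and day
  pivot_wider(names_from = start_day_of_week, # one column per day
              values_from = n,
              values_fill = 0)                # days with no permits become 0

day_counts
```

The values_fill argument is what turns missing borough and day combinations into zeros rather than NA.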

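| start_hour | Commercial | Documentary | Film | Music Video | Still Photography | Student | Television | Theater | WEB |
|--:|--:|--:|--:|--:|--:|--:|--:|--:|--:|
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5 | 0 | 1 |
| 1 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 |
| 2 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 |
| 3 | 0 | 0 | 0 | 0 | 0 | 0 | 10 | 0 | 0 |
| 4 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 |
| 5 | 0 | 0 | 1 | 0 | 1 | 0 | 4 | 0 | 0 |
| 6 | 57 | 1 | 10 | 4 | 8 | 0 | 192 | 1 | 9 |
| 7 | 30 | 0 | 28 | 0 | 17 | 0 | 209 | 0 | 19 |
| 8 | 4 | 0 | 9 | 0 | 10 | 1 | 43 | 0 | 4 |
| 9 | 0 | 0 | 3 | 0 | 0 | 0 | 33 | 5 | 2 |
| 10 | 2 | 0 | 4 | 0 | 1 | 0 | 25 | 0 | 1 |
| 11 | 2 | 0 | 4 | 0 | 2 | 0 | 12 | 0 | 2 |
| 12 | 1 | 0 | 3 | 0 | 0 | 0 | 10 | 0 | 2 |
| 13 | 1 | 0 | 2 | 0 | 0 | 0 | 4 | 0 | 1 |
| 14 | 0 | 0 | 2 | 0 | 0 | 0 | 4 | 0 | 0 |
| 15 | 0 | 0 | 0 | 0 | 0 | 0 | 12 | 2 | 0 |
| 16 | 0 | 0 | 1 | 0 | 0 | 0 | 6 | 0 | 0 |
| 17 | 0 | 0 | 1 | 0 | 0 | 0 | 6 | 0 | 0 |
| 18 | 0 | 0 | 1 | 0 | 0 | 0 | 4 | 0 | 0 |
| 19 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| 20 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| 21 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |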
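The plotting code itself is not included, so here is a rough sketch of how the hour-by-category counts could be drawn with ggplot2. The choice of a stacked column chart is my assumption, and the permits data frame is a small made-up sample rather than the real permit data.

```r
library(dplyr)
library(ggplot2)

# Hypothetical sample; the real data would come from update_df
permits <- tibble(
  category   = c("Television", "Television", "Commercial", "Theater"),
  eventtype  = c("Shooting Permit", "Shooting Permit",
                 "Shooting Permit", "Theater Load in and Load Outs"),
  start_hour = c(6L, 7L, 6L, 9L)
)

# Count shooting permits per category and start hour
hour_counts <- permits |>
  filter(eventtype == "Shooting Permit") |>
  count(category, start_hour)

# Stacked columns: one bar per hour, coloured by category
hour_plot <- ggplot(hour_counts,
                    aes(x = start_hour, y = n, fill = category)) +
  geom_col() +
  labs(x = "Start hour", y = "Number of shooting permits",
       fill = "Category")

hour_plot
```

Keeping the counting step separate from the plot makes it easy to print hour_counts as a table alongside the chart.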