library(tidyverse)
library(lubridate)
library(httr)
library(jsonlite)
The tidyverse is a collection of packages that allow users to work with strings, mutate and rearrange data frames, and access data through APIs or websites. The packages it includes are listed below.
tidyverse_packages()
## [1] "broom" "cli" "crayon" "dbplyr"
## [5] "dplyr" "dtplyr" "forcats" "ggplot2"
## [9] "googledrive" "googlesheets4" "haven" "hms"
## [13] "httr" "jsonlite" "lubridate" "magrittr"
## [17] "modelr" "pillar" "purrr" "readr"
## [21] "readxl" "reprex" "rlang" "rstudioapi"
## [25] "rvest" "stringr" "tibble" "tidyr"
## [29] "xml2" "tidyverse"
To demonstrate the capabilities of the lubridate package within the tidyverse, I will be using a data set on requested film permits in NYC. As its documentation puts it, “Lubridate provides tools that make it easier to parse and manipulate dates.”
Description of Data: The Film Office issues permits to productions filming on location in the City of New York and provides free police assistance, free parking privileges and access to most exterior locations free of charge. Not all filming activity requires a permit. These permits are generally required when asserting the exclusive use of city property, like a sidewalk, a street, or a park.
Source: NYC Filming Permits
import_data <- as.data.frame(fromJSON('https://data.cityofnewyork.us/resource/tg4x-b46p.json', simplifyVector = TRUE))
After importing the data using jsonlite, we are left with fourteen (14) columns, which include the start and end dates of each permit along with location fields such as zip codes in New York City. The datetime columns use the ISO 8601 standard format, in which the date and the time are separated by the literal character “T”.
knitr::kable(head(import_data, 1))
eventid | eventtype | startdatetime | enddatetime | enteredon | eventagency | parkingheld | borough | communityboard_s | policeprecinct_s | category | subcategoryname | country | zipcode_s |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
677277 | Theater Load in and Load Outs | 2022-10-18T07:00:00.000 | 2022-10-18T12:00:00.000 | 2022-10-17T15:40:31.000 | Mayor’s Office of Film, Theatre & Broadcasting | ASHLAND PLACE between HANSON PLACE and LAFAYETTE AVENUE | Brooklyn | 2 | 88 | Theater | Theater | United States of America | 11217 |
Lubridate has fantastic documentation, which you can find here. To make the date columns more useful, I will start with the ymd_hms() function. Here, it parses the ISO 8601 character strings in the original datetime columns into POSIXct values.
update_df <-
import_data |>
mutate(start_date = ymd_hms(startdatetime),
end_date = ymd_hms(enddatetime))
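To see what this step does to a single value, here is a minimal sketch that parses the start timestamp from the preview row above. The variable names are illustrative only, and the `America/New_York` zone is an assumption about how the permit times should be read; the data set itself does not state a time zone.

```r
library(lubridate)

# One raw value copied from the startdatetime column in the preview above
raw_start <- "2022-10-18T07:00:00.000"

# ymd_hms() accepts the ISO 8601 "T" separator and fractional seconds
parsed_utc <- ymd_hms(raw_start)
class(parsed_utc)  # a POSIXct date-time
tz(parsed_utc)     # "UTC", the default when tz is not supplied

# Assumption: the permit times are New York local times
parsed_nyc <- ymd_hms(raw_start, tz = "America/New_York")
hour(parsed_nyc)   # the clock hour is still 7; only the attached zone differs
```

For the day-of-week and hour summaries in this post the default zone is sufficient, since the clock time is read as written either way. The zone would matter if these timestamps were compared against times recorded elsewhere.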
Now that the dates are in a workable datetime format, I can extract the day of the week and the hour from the timestamps to perform some exploratory analysis. To keep it simple, only the start date will be used.
update_df <-
update_df |>
mutate(start_day_of_week = wday(start_date, label = TRUE, abbr = TRUE),
start_hour = hour(start_date))
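As a quick check on a single value, the sketch below applies the same two accessors to the start timestamp from the preview row. It assumes lubridate's default week start (Sunday), which is what makes the integer form return 3 for a Tuesday; the printed label also depends on the session locale.

```r
library(lubridate)

start_date <- ymd_hms("2022-10-18T07:00:00.000")

# label = TRUE returns an ordered factor; abbr = TRUE gives the short day name
day_label <- wday(start_date, label = TRUE, abbr = TRUE)

# Without label, wday() returns a number; with the default week start
# (Sunday = 1), a Tuesday is 3
day_number <- wday(start_date)

# hour() returns the clock hour as a number (7 here)
start_hour <- hour(start_date)
```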
Let's reduce the number of columns to look at for our exploration. The result will contain the borough, category, event type, start date, start day of the week, and start hour.
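The post does not show the code for this step, so the following is only a sketch of how it could be done with `dplyr::select()`. The two-row `demo_df` is a hypothetical stand-in for `update_df`, built from values shown in this post, and `explore_df` is an illustrative name.

```r
library(dplyr)
library(lubridate)

# Hypothetical stand-in for update_df, using two rows shown in this post
demo_df <- tibble(
  borough       = c("Brooklyn", "Manhattan"),
  category      = c("Theater", "Television"),
  eventtype     = c("Theater Load in and Load Outs", "Shooting Permit"),
  startdatetime = c("2022-10-18T07:00:00.000", "2022-10-18T09:00:00.000"),
  country       = "United States of America"
) |>
  mutate(start_date = ymd_hms(startdatetime),
         start_day_of_week = wday(start_date, label = TRUE, abbr = TRUE),
         start_hour = hour(start_date))

# Keep only the columns needed for the exploration, in display order
explore_df <- demo_df |>
  select(borough, category, eventtype,
         start_date, start_day_of_week, start_hour)
```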
borough | category | eventtype | start_date | start_day_of_week | start_hour |
---|---|---|---|---|---|
Brooklyn | Theater | Theater Load in and Load Outs | 2022-10-18 07:00:00 | Tue | 7 |
Manhattan | Television | Shooting Permit | 2022-10-18 09:00:00 | Tue | 9 |
Brooklyn | Television | Shooting Permit | 2022-10-18 07:00:00 | Tue | 7 |
Manhattan | Still Photography | Shooting Permit | 2022-10-18 08:00:00 | Tue | 8 |
Bronx | Commercial | Shooting Permit | 2022-10-18 06:00:00 | Tue | 6 |
Queens | Television | Shooting Permit | 2022-10-17 07:00:00 | Mon | 7 |
Lastly, two plots are created after filtering for only “Shooting Permit” events. The first shows the day of the week on which filming begins within each borough, and the second shows the hour at which filming begins for each category, such as Television or Commercial. The corresponding counts are tabulated below.
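The code behind these summaries is not shown, so here is one possible sketch of the filter-and-count step using `dplyr::count()` and `tidyr::pivot_wider()`. The `permits` data frame is a hypothetical six-row stand-in built from rows displayed earlier, with the day of the week stored as plain text rather than the ordered factor used in the post.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the reduced data frame (six rows shown earlier)
permits <- tibble(
  borough = c("Brooklyn", "Manhattan", "Brooklyn",
              "Manhattan", "Bronx", "Queens"),
  eventtype = c("Theater Load in and Load Outs", rep("Shooting Permit", 5)),
  start_day_of_week = c("Tue", "Tue", "Tue", "Tue", "Tue", "Mon")
)

day_counts <- permits |>
  filter(eventtype == "Shooting Permit") |>   # drop non-shooting events
  count(borough, start_day_of_week) |>        # one row per borough/day pair
  pivot_wider(names_from = start_day_of_week, # one column per weekday
              values_from = n,
              values_fill = 0)                # absent combinations become 0

day_counts
```

Swapping `borough` for `start_hour` and `start_day_of_week` for `category` in the same pipeline would give the hour-by-category layout.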
borough | Mon | Tue | Wed | Thu | Fri | Sat | Sun |
---|---|---|---|---|---|---|---|
Bronx | 7 | 10 | 9 | 6 | 2 | 2 | 0 |
Brooklyn | 53 | 60 | 60 | 65 | 64 | 7 | 6 |
Manhattan | 48 | 63 | 52 | 59 | 45 | 13 | 21 |
Queens | 28 | 39 | 42 | 38 | 38 | 2 | 1 |
Staten Island | 1 | 3 | 3 | 2 | 2 | 0 | 0 |
start_hour | Commercial | Documentary | Film | Music Video | Still Photography | Student | Television | Theater | WEB |
---|---|---|---|---|---|---|---|---|---|
0 | 0 | 0 | 0 | 0 | 0 | 0 | 5 | 0 | 1 |
1 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 |
2 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 |
3 | 0 | 0 | 0 | 0 | 0 | 0 | 10 | 0 | 0 |
4 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 |
5 | 0 | 0 | 1 | 0 | 1 | 0 | 4 | 0 | 0 |
6 | 57 | 1 | 10 | 4 | 8 | 0 | 192 | 1 | 9 |
7 | 30 | 0 | 28 | 0 | 17 | 0 | 209 | 0 | 19 |
8 | 4 | 0 | 9 | 0 | 10 | 1 | 43 | 0 | 4 |
9 | 0 | 0 | 3 | 0 | 0 | 0 | 33 | 5 | 2 |
10 | 2 | 0 | 4 | 0 | 1 | 0 | 25 | 0 | 1 |
11 | 2 | 0 | 4 | 0 | 2 | 0 | 12 | 0 | 2 |
12 | 1 | 0 | 3 | 0 | 0 | 0 | 10 | 0 | 2 |
13 | 1 | 0 | 2 | 0 | 0 | 0 | 4 | 0 | 1 |
14 | 0 | 0 | 2 | 0 | 0 | 0 | 4 | 0 | 0 |
15 | 0 | 0 | 0 | 0 | 0 | 0 | 12 | 2 | 0 |
16 | 0 | 0 | 1 | 0 | 0 | 0 | 6 | 0 | 0 |
17 | 0 | 0 | 1 | 0 | 0 | 0 | 6 | 0 | 0 |
18 | 0 | 0 | 1 | 0 | 0 | 0 | 4 | 0 | 0 |
19 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
20 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
21 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
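The plotting code is not included in the post. As a rough sketch of how the hour-by-category counts could be drawn with ggplot2, the example below uses a handful of values copied from the table above (hours 6 to 8 for Commercial and Television). The choice of dodged bars and the axis labels are assumptions, not the original chart.

```r
library(ggplot2)

# A few counts copied from the table above (hours 6 to 8, two categories)
hour_counts <- data.frame(
  start_hour = rep(6:8, times = 2),
  category   = rep(c("Commercial", "Television"), each = 3),
  n          = c(57, 30, 4, 192, 209, 43)
)

# Side-by-side bars: one bar per category within each start hour
p <- ggplot(hour_counts, aes(x = start_hour, y = n, fill = category)) +
  geom_col(position = "dodge") +
  labs(x = "Start hour", y = "Shooting permits", fill = "Category")

p
```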