Introduction

The data is publicly available at https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page.

Analysis is provided by Amy Liu Pathak and Joshua Curley. The dataset was trimmed down to trips with a pickup (PU) or a drop-off (DO), or both, in zone 186, where Penn Station is located.
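The trimming step described above can be sketched roughly as follows. This is a minimal illustration on made-up rows, not the authors' actual preprocessing code; the `trips` table and the `penn_zone` variable are hypothetical names, while `PULocationID` and `DOLocationID` are the column names used later in this report.

```r
library(dplyr)

# Toy stand-in for the raw trip records (made-up rows; the real files
# have many more columns).
trips <- tibble(
  PULocationID = c(186, 79, 230, 186, 161),
  DOLocationID = c(230, 186, 48, 186, 162)
)

penn_zone <- 186  # zone containing Penn Station

# Keep a trip if its pickup OR its drop-off is in the Penn Station zone.
penn_trips <- trips %>%
  filter(PULocationID == penn_zone | DOLocationID == penn_zone)
```

Trips that both start and end in zone 186 satisfy either condition and are kept once.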

Preview data

The tables below show a random sample of 20 rows each for FHVs and taxis. There are 2,351,254 observations for FHVs, 1,441,332 for yellow taxis, and 2,263 for green taxis. Yellow and green taxis are combined in the “taxi” dataset, which is used to analyze all taxis.
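One way the combined “taxi” table and a 20-row preview could be built is sketched below. This is an assumption about the approach, not the code used for this report: the `yellow` and `green` tables are made-up stand-ins, and the `service` label column is a hypothetical addition. The real yellow and green files may also differ in some column names, which would need harmonizing before stacking.

```r
library(dplyr)

# Toy stand-ins for the trimmed yellow and green taxi tables (made-up rows).
yellow <- tibble(PULocationID = rep(186, 30),
                 DOLocationID = rep(c(230, 161, 79), 10))
green  <- tibble(PULocationID = rep(79, 5),
                 DOLocationID = rep(186, 5))

# Stack the two services into one table, labelling each row with its source.
taxi <- bind_rows(yellow = yellow, green = green, .id = "service")

set.seed(1)  # fixed seed so the preview is reproducible
taxi_sample <- taxi %>% slice_sample(n = 20)
```

Setting a seed is optional; without it, each render of the report would show a different sample.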

library(DT)

# Interactive preview of the FHV sample with CSV/Excel/PDF export buttons
datatable(fhvhv_sample,
          class = 'cell-border stripe',
          extensions = 'Buttons',
          fillContainer = FALSE,
          selection = "multiple",  # argument of datatable(), not a DataTables option
          options = list(pageLength = 10,
                         dom = 'Bfrtip',
                         buttons = c('csv', 'excel', 'pdf'),
                         scrollX = TRUE))

# Same preview for the combined yellow and green taxi sample
datatable(taxi_sample,
          class = 'cell-border stripe',
          extensions = 'Buttons',
          fillContainer = FALSE,
          selection = "multiple",
          options = list(pageLength = 10,
                         dom = 'Bfrtip',
                         buttons = c('csv', 'excel', 'pdf'),
                         scrollX = TRUE))

Top origins and destinations

Where are people going? Where are people coming from? The tables below (which can be sorted from highest to lowest) show the top pickup and drop-off zones for trips ending or beginning in zone 186. The top locations are:

Zones 79 (East Village), 230 (Times Sq), 161 (Midtown Center), 162 (Midtown East), 170 (Murray Hill), and 48 (Clinton East).
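A ranking like the one above could be produced by attaching zone names to the per-zone counts and sorting in descending order. The sketch below assumes a small hand-built lookup, `zone_names`, holding only the names mentioned in this section; the counts are made up for illustration and are not results from the dataset.

```r
library(dplyr)

# Made-up counts, for illustration only (not results from the dataset).
counts <- tibble(
  PULocationID = c(161, 79, 230, 48),
  n            = c(120, 300, 250, 90)
)

# Zone names as given in the text above (hypothetical lookup table).
zone_names <- tibble(
  LocationID = c(79, 230, 161, 162, 170, 48),
  Zone = c("East Village", "Times Sq", "Midtown Center",
           "Midtown East", "Murray Hill", "Clinton East")
)

# Attach the names, then rank zones from most to fewest rides.
top_zones <- counts %>%
  left_join(zone_names, by = c("PULocationID" = "LocationID")) %>%
  arrange(desc(n))
```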

Zone 265 covers all points outside of NYC (excluding Newark Airport). This likely explains the imbalance between origins and destinations for NYC taxis, since there can’t be legitimate out-of-NYC TLC taxi pickups. Conversely, FHVs that do not have a TLC license cannot pick up in NYC (i.e., if someone takes a non-TLC Uber from NJ to NYC, the Uber must leave the city before making its next pickup).
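The origin/destination imbalance for zone 265 can be checked by counting how often it appears on each side of a trip. The sketch below uses made-up rows and a hypothetical `outside_zone` variable; it only illustrates the comparison, not the actual figures.

```r
library(dplyr)

# Made-up trips for illustration; 265 is the out-of-NYC zone.
taxi <- tibble(
  PULocationID = c(186, 186, 186, 265, 79),
  DOLocationID = c(265, 265, 230, 186, 186)
)

outside_zone <- 265

# Count trips that start outside NYC versus trips that end outside NYC.
imbalance <- taxi %>%
  summarise(
    picked_up_outside   = sum(PULocationID == outside_zone),
    dropped_off_outside = sum(DOLocationID == outside_zone)
  )
```

A large gap between the two columns for taxis would be consistent with the explanation given above.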

There are many short-distance trips in this dataset, particularly taxi trips from zone 186 to zone 186 (a trip around the block?). Perhaps better wayfinding would let people know that their nearby destination is only a 5-minute walk away, which could eliminate about 20K taxi and FHV trips that never even leave the zone. Many more trips go to adjacent zones (e.g., Times Square).
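Trips that never leave the zone are those whose pickup and drop-off are both in zone 186. A minimal sketch of that count, using made-up rows and a hypothetical helper `count_intra_zone()`, might look like this:

```r
library(dplyr)

# Made-up rows for illustration only.
taxi  <- tibble(PULocationID = c(186, 186, 79),
                DOLocationID = c(186, 230, 186))
fhvhv <- tibble(PULocationID = c(186, 161),
                DOLocationID = c(186, 186))

# Hypothetical helper: number of trips starting and ending in the same zone.
count_intra_zone <- function(trips, zone = 186) {
  trips %>%
    filter(PULocationID == zone, DOLocationID == zone) %>%
    nrow()
}

# Total across taxis and FHVs.
intra_zone_total <- count_intra_zone(taxi) + count_intra_zone(fhvhv)
```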

FHVHV

library(dplyr)

# Pickup zones of FHV trips that end in zone 186 (Penn Station)
fhvhv_origins <- fhvhv %>%
  filter(DOLocationID == 186) %>%
  count(PULocationID) %>%
  rename(
    "Pickup.Zone" = "PULocationID",
    "Total.Number.of.Rides" = "n"
  )

datatable(fhvhv_origins,
          class = 'cell-border stripe',
          extensions = 'Buttons',
          fillContainer = FALSE,
          selection = "multiple",  # argument of datatable(), not a DataTables option
          options = list(pageLength = 10,
                         dom = 'Bfrtip',
                         buttons = c('csv', 'excel', 'pdf'),
                         scrollX = TRUE))

# Drop-off zones of FHV trips that start in zone 186
fhvhv_destinations <- fhvhv %>%
  filter(PULocationID == 186) %>%
  count(DOLocationID) %>%
  rename(
    "Dropoff.Zone" = "DOLocationID",
    "Total.Number.of.Rides" = "n"
  )

datatable(fhvhv_destinations,
          class = 'cell-border stripe',
          extensions = 'Buttons',
          fillContainer = FALSE,
          selection = "multiple",
          options = list(pageLength = 10,
                         dom = 'Bfrtip',
                         buttons = c('csv', 'excel', 'pdf'),
                         scrollX = TRUE))

Taxi

# Pickup zones of taxi trips that end in zone 186 (Penn Station)
taxi_origins <- taxi %>%
  filter(DOLocationID == 186) %>%
  count(PULocationID) %>%
  rename(
    "Pickup.Zone" = "PULocationID",
    "Total.Number.of.Rides" = "n"
  )

datatable(taxi_origins,
          class = 'cell-border stripe',
          extensions = 'Buttons',
          fillContainer = FALSE,
          selection = "multiple",  # argument of datatable(), not a DataTables option
          options = list(pageLength = 10,
                         dom = 'Bfrtip',
                         buttons = c('csv', 'excel', 'pdf'),
                         scrollX = TRUE))

# Drop-off zones of taxi trips that start in zone 186
taxi_destinations <- taxi %>%
  filter(PULocationID == 186) %>%
  count(DOLocationID) %>%
  rename(
    "Dropoff.Zone" = "DOLocationID",
    "Total.Number.of.Rides" = "n"
  )

datatable(taxi_destinations,
          class = 'cell-border stripe',
          extensions = 'Buttons',
          fillContainer = FALSE,
          selection = "multiple",
          options = list(pageLength = 10,
                         dom = 'Bfrtip',
                         buttons = c('csv', 'excel', 'pdf'),
                         scrollX = TRUE))