On Kaggle, I found a dataset of Uber pickups in New York City with detailed location information for April through September 2014. It contains over 4.5 million pickups, and the location of each pickup is given as a longitude and a latitude. The motivation of this study is to see how the number of pickups is distributed across New York City areas or neighborhoods. I plan to map the longitude and latitude of each pickup to an NYC zip code, which can be done with reverse geocoding. There is an R package called revgeo that would allow us to do reverse geocoding.
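To make the idea of reverse geocoding concrete, here is a minimal offline sketch in Python. It is not how a geocoding service or R package works internally; it swaps in a simple nearest-centroid lookup, assigning each pickup to the zip code whose centre point is closest. The centroid table, the coordinate values in it, and the function names are all hypothetical and for illustration only; a real run would need a complete table of NYC zip code centroids or boundaries.

```python
import math

# Hypothetical zip code centroids as (latitude, longitude).
# Illustrative placeholder values only; replace with a real, complete table.
ZIP_CENTROIDS = {
    "10001": (40.7506, -73.9972),
    "10028": (40.7763, -73.9529),
    "11201": (40.6940, -73.9903),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest_zip(lat, lon, centroids=ZIP_CENTROIDS):
    """Return the zip code whose centroid is closest to (lat, lon)."""
    return min(centroids, key=lambda z: haversine_km(lat, lon, *centroids[z]))


# Example: a pickup given as (latitude, longitude)
print(nearest_zip(40.69, -73.99))
```

A nearest-centroid lookup is only an approximation, since zip code areas are irregular polygons, but it avoids sending 4.5 million requests to an online geocoding service.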
I would then map the zip codes to neighborhood names. For this, I will use a web page (https://www.health.ny.gov/statistics/cancer/registry/appendix/neighborhoods.htm) that lists New York City zip codes by neighborhood.
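The zip-code-to-neighborhood step can be sketched as a dictionary lookup followed by a count, again in Python. The lookup table below holds only a few sample entries that I am assuming for illustration and that should be checked against the web page; the function name is hypothetical as well.

```python
from collections import Counter

# Sample zip-to-neighborhood entries, assumed for illustration.
# Build the full table from the neighborhood web page before real use.
ZIP_TO_NEIGHBORHOOD = {
    "10001": "Chelsea and Clinton",
    "10011": "Chelsea and Clinton",
    "10028": "Upper East Side",
    "10128": "Upper East Side",
}


def count_pickups_by_neighborhood(zip_codes, lookup=ZIP_TO_NEIGHBORHOOD):
    """Count pickups per neighborhood; unmapped zip codes go to 'Unknown'."""
    counts = Counter()
    for zip_code in zip_codes:
        counts[lookup.get(zip_code, "Unknown")] += 1
    return counts


# Example: zip codes produced by the reverse geocoding step
pickup_zips = ["10001", "10011", "10028", "99999"]
print(count_pickups_by_neighborhood(pickup_zips))
```

Keeping an explicit "Unknown" bucket makes it easy to see how many pickups fall outside the zip codes listed on the page.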
Uber dataset: https://www.kaggle.com/fivethirtyeight/uber-pickups-in-new-york-city/version/2
Example of Reverse Geocoding: https://developers.google.com/maps/documentation/javascript/examples/geocoding-reverse