Mapping the pigeon complaints of New York City (2010 - early 2016)

I came across a dataset in Jeremy Singer-Vine’s weekly email comprising all reported rat sightings in New York City. Digging further, I found a wealth of data on complaints made to various NYC agencies, all filed through the 311 Service Requests system. NYC Open Data is a beautiful thing.

I filtered the data at the source using Socrata’s API, limiting the results to 311 Service Requests that contain ‘pigeon’. That leaves 3893 observations of 30 variables. Since the rats of New York have been well analyzed, why not see what we can learn about pigeons?
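For anyone wanting to repeat the filter, a query like this can be written as a URL against Socrata’s SODA endpoint. The sketch below only builds the URL. The dataset id `erm2-nwe9` (311 Service Requests from 2010 to Present), the `$q` full-text search parameter, and the row limit are my assumptions about how such a filter could look, not a record of the exact query used for this post.

```r
# Build a SODA query URL for rows of the 311 dataset that mention a search term.
# Assumption: erm2-nwe9 is the id of the 311 Service Requests dataset.
soda_url <- function(term, limit = 5000,
                     base = "https://data.cityofnewyork.us/resource/erm2-nwe9.csv") {
  paste0(base,
         "?$q=", URLencode(term, reserved = TRUE),        # full-text search
         "&$limit=", format(limit, scientific = FALSE))   # cap on rows returned
}

pigeon_url <- soda_url("pigeon")
pigeon_url

# With a network connection, the result could be read straight into a data frame:
# pigeon <- read.csv(pigeon_url)
```

The explicit limit is there because, as far as I know, the endpoint returns only a small default page of rows unless asked for more.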

New York isn’t the only city with this problem.

Sidenote: Before learning dplyr, I made more of these filtered sets, and they’re on my NYC Open Data profile page. They’re mostly filtered by animals or the prefix ‘-Un’. The API can be a little slow given the size of the dataset, so it’s going to be dplyr from here on out. If that NYC profile link doesn’t work and you’re really curious about those filtered sets, my handle is mozzarella.

Anyways! Here is just some spatial EDA with the data, mostly to learn a bit about the maptools library and RMarkdown. It is still a work in progress, and I am hoping to refine the spatial parameters and my understanding in the coming weeks.

And a quick note on Coordinate Reference Systems (CRS):

**WGS84 EPSG 4326** - used by Google Earth, Open Street Map, most GPS Systems

**WGS84 EPSG 3857** - used by Google Maps, Open Street Map. Mercator Projection.

**NAD83 EPSG 4269** - used by nyc.gov and many other state/federal agencies.
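To make the difference between the first two concrete: EPSG 4326 stores plain longitude/latitude in degrees, while EPSG 3857 stores metres on a spherical Mercator projection. Below is a minimal base R sketch of the standard spherical Mercator formulas. The function name `lonlat_to_3857` and the sample point are made up for illustration.

```r
# Convert longitude/latitude in degrees (EPSG 4326) to Web Mercator metres (EPSG 3857)
lonlat_to_3857 <- function(lon, lat) {
  R <- 6378137  # sphere radius used by Web Mercator, in metres
  x <- R * lon * pi / 180
  y <- R * log(tan(pi / 4 + lat * pi / 360))
  data.frame(x = x, y = y)
}

# hypothetical point roughly in the middle of New York City
lonlat_to_3857(-73.95, 40.70)
```

Points in different systems will not line up on one plot, so it pays to know which one a shapefile or data column is in before layering them.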

Again, a work in progress here! Just a little EDA on the side of my class, Foundations of Data Science.

load it up

pigeon <- read.csv("~/Documents/R6/data/contains_pigeon.csv")

taking a look at descriptions (of pigeon complaints)

summary(pigeon$Descriptor)
##          N/A  Pigeon Odor Pigeon Waste 
##            1          312         3580
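The counts above rely on `Descriptor` being a factor. Since R 4.0, `read.csv()` reads strings as character vectors by default, and `summary()` on a character column only reports its length and class. A small sketch on made-up rows shows `table()` giving counts either way.

```r
# toy stand-in for the Descriptor column (made-up rows, not the real data)
descriptor <- c("Pigeon Waste", "Pigeon Odor", "Pigeon Waste", "N/A", "Pigeon Waste")

# table() counts values whether the column is character or factor
counts <- table(descriptor)
counts

# converting to a factor restores the summary() behaviour shown above
summary(factor(descriptor))
```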

Odor and Waste, eh? It’d be nice to make a map of these points, wouldn’t it? Maybe we can find where the highest density of pigeon shit in New York is. The dataset provides latitude and longitude coordinates for the location of each complaint (service request) filed. It also provides columns for complaint types and descriptions of these complaints, which is pretty fine by my (still learning) standards.
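One rough way to look for the densest spot, before any mapping, is to round coordinates to a coarse grid and count complaints per cell. This is a sketch on made-up coordinates. The 0.01 degree cell size is an arbitrary choice of mine, and `Longitude` / `Latitude` follow the column names in the dataset.

```r
# made-up coordinates standing in for the complaint locations
pts <- data.frame(
  Longitude = c(-73.9251, -73.9253, -73.9262, -73.8146, -74.0029),
  Latitude  = c( 40.7381,  40.7379,  40.7382,  40.6752,  40.6246)
)

# snap each point to a 0.01 degree grid cell (roughly 1 km in latitude)
cell <- paste(round(pts$Longitude, 2), round(pts$Latitude, 2))

# count complaints per cell and sort from busiest to quietest
cell_counts <- sort(table(cell), decreasing = TRUE)
head(cell_counts, 3)
```

On the real data the same idea would use `pigeon$Longitude` and `pigeon$Latitude`, with rows that have missing coordinates dropped first.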

I went on to break down the complaint description variables in order to assign lat/long coordinates to each type.

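library(dplyr)
# keep only the complaint description and its coordinates
pigeonID <- select(pigeon, Descriptor, Longitude, Latitude)
# note: excluding with != also leaves the single "N/A" row in both subsets
odor_points <- filter(pigeonID, Descriptor != "Pigeon Waste")
waste_points <- filter(pigeonID, Descriptor != "Pigeon Odor")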

To check:

head(odor_points)
##    Descriptor Longitude Latitude
## 1 Pigeon Odor -73.83707 40.70811
## 2 Pigeon Odor -73.82505 40.84501
## 3 Pigeon Odor -73.92229 40.69525
## 4 Pigeon Odor -73.94684 40.59589
## 5 Pigeon Odor -73.90589 40.70226
## 6 Pigeon Odor -74.00293 40.62462
glimpse(waste_points)
## Observations: 3,581
## Variables: 3
## $ Descriptor (fctr) Pigeon Waste, Pigeon Waste, Pigeon Waste, Pigeon W...
## $ Longitude  (dbl) -73.81459, -73.90734, -73.92513, -73.90734, -73.925...
## $ Latitude   (dbl) 40.67520, 40.77890, 40.73807, 40.77890, 40.73807, 4...
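The 3,581 waste rows above are one more than the 3,580 in the summary, because excluding one category with `!=` keeps the single `N/A` row. A small base R sketch on made-up rows shows the difference between excluding a category and selecting one. With dplyr, the selecting form would be `filter(pigeonID, Descriptor == "Pigeon Waste")`.

```r
# made-up rows standing in for pigeonID
toy <- data.frame(
  Descriptor = c("Pigeon Waste", "Pigeon Odor", "N/A", "Pigeon Waste"),
  Longitude  = c(-73.81, -73.84, -73.90, -73.93),
  Latitude   = c( 40.68,  40.71,  40.70,  40.74)
)

# excluding odor keeps the "N/A" row along with the waste rows
not_odor <- toy[toy$Descriptor != "Pigeon Odor", ]
nrow(not_odor)

# selecting waste directly drops it
waste_only <- toy[toy$Descriptor == "Pigeon Waste", ]
nrow(waste_only)
```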

Time to convert latitude and longitude into projected coordinates using mapproject() from the mapproj library:

library(mapproj)
pigeon_waste <- mapproject(waste_points$Longitude, waste_points$Latitude, 
                           projection = "albers", parameters = c(39, 45))

pigeon_odor <- mapproject(odor_points$Longitude, odor_points$Latitude, 
                          projection = "albers", parameters = c(39, 45))
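One thing to watch: as I read the mapproj documentation, when no `orientation` is given the projection is centred on the midrange of the longitudes passed in, so two separate calls can end up on slightly different centres. A hedged alternative is to project all points in one call and split the result afterwards. The toy rows below are made up.

```r
library(mapproj)

# made-up rows standing in for pigeonID
toy <- data.frame(
  Descriptor = c("Pigeon Waste", "Pigeon Odor", "Pigeon Waste", "Pigeon Odor"),
  Longitude  = c(-73.81, -73.84, -74.00, -73.92),
  Latitude   = c( 40.68,  40.71,  40.62,  40.70)
)

# one call, so every point shares the same projection centre
proj <- mapproject(toy$Longitude, toy$Latitude,
                   projection = "albers", parameters = c(39, 45))

# split the projected coordinates by complaint type
is_odor  <- toy$Descriptor == "Pigeon Odor"
odor_xy  <- list(x = proj$x[is_odor],  y = proj$y[is_odor])
waste_xy <- list(x = proj$x[!is_odor], y = proj$y[!is_odor])
```

Lists with `x` and `y` elements can be passed to `plot()` and `points()` the same way the `mapproject()` results are.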

For now I just want to see the points, without any map lines yet:

# Plot data ---------------------------
par(mar=c(0, 0 , 0, 0), family="HersheySans", las = 1)
plot(pigeon_waste, asp = 1, type = "n", bty = "n", 
     xlab="", ylab="", axes=FALSE)

points(pigeon_waste, pch = 4, 
       col = "goldenrod3", cex = 0.2)
points(pigeon_odor, pch = 1, 
       col = "olivedrab4", cex = 1)
points(pigeon_odor, pch = 20, 
       col = "olivedrab", cex = 0.6)

Plotted the pigeon odor points twice for two reasons:

  1. Visually I just wanted to see it.
  2. Odor travels a more nebulous path through the atmosphere than waste does.

Those color choices make me a little queasy so maybe a black and white plot:

par(mar=c(0, 0 , 0, 0), family="HersheySans", las = 1)
plot(pigeon_waste, asp = 1, type = "n", bty = "n", 
     xlab="", ylab="", axes=FALSE)
points(pigeon_waste, pch = 20, col = "black", cex = 0.2)
points(pigeon_odor, pch = 1, col = "black", cex = 1)
points(pigeon_odor, pch = 1, col = "black", cex = 0.6)

OK! Maybe time to bring in some shapefiles. There are some great ones from nyc.gov with great parameters, but they do take quite some time to load. For this I used simpler ones; once I have a more serious project in mind, I will explore using the nyc.gov shapefiles.

library(maptools)
NYC_admin <- readShapeLines("data-spatial/NYS_cruzin/new_york_administrative/new_york_administrative.shp")
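A note for anyone following along today: maptools has since been retired from CRAN, so `readShapeLines()` may not be installable. The sf package’s `st_read()` is a common replacement. The sketch below reads the North Carolina sample shapefile that ships with sf, purely as a stand-in for the New York files used here.

```r
library(sf)

# sample shapefile bundled with sf, used only as a stand-in
shp_path <- system.file("shape/nc.shp", package = "sf")
nc <- st_read(shp_path, quiet = TRUE)

# draw just the outlines, much like lines() on a readShapeLines() object
plot(st_geometry(nc), border = "grey40", lwd = 0.6)
```

To layer a shapefile over an existing base plot, `plot(st_geometry(nc), add = TRUE)` plays the role that `lines()` plays in the rest of this post.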

This is actually a shapefile of the administrative boundaries for all of New York State. For the specific plots, I will set the xlim and ylim to a bounding box of latitude and longitude coordinates, first for New York City and then borough by borough. I did one in color and one in black & white to test out graphics and legibility.

par(mar=c(0, 0, 0, 0), las = 1, family="HersheySans")

plot(0, 0, type="n", axes=FALSE, 
     xlim=c(-74.255735, -73.700272), 
     ylim=c(40.496044, 40.915256), 
     xlab=NA, ylab=NA)

lines(NYC_admin, lwd=1.2, col="papayawhip")

points(waste_points$Longitude, waste_points$Latitude, 
       pch = 1, col = "firebrick4", cex = 0.4)
points(odor_points$Longitude, odor_points$Latitude, 
       pch = 10, col = "cadetblue4", cex = 1.4)

Five Borough Breakdown

Going to start by layering the map a bit - adding in shapefiles for administrative, natural, and coastline boundaries.

One thing I need to do is filter out coordinates so that only those of each borough show up.
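A simple first pass at that filter is to keep only the points inside a borough’s bounding box. It is only a sketch: a box also catches points from neighbouring boroughs, the helper name `in_bbox` is made up, and the toy coordinates are invented. The limits are the Brooklyn `xlim` / `ylim` values used in the Brooklyn plot further down.

```r
# keep rows whose coordinates fall inside a longitude/latitude box
in_bbox <- function(df, xlim, ylim) {
  keep <- !is.na(df$Longitude) & !is.na(df$Latitude) &
    df$Longitude >= xlim[1] & df$Longitude <= xlim[2] &
    df$Latitude  >= ylim[1] & df$Latitude  <= ylim[2]
  df[keep, ]
}

# made-up points: two inside the Brooklyn box, one in the Bronx
toy <- data.frame(
  Longitude = c(-73.95, -73.99, -73.87),
  Latitude  = c( 40.65,  40.69,  40.85)
)

brooklyn <- in_bbox(toy,
                    xlim = c(-74.05663, -73.833365),
                    ylim = c(40.551042, 40.739446))
nrow(brooklyn)
```

If the export kept a borough column (an assumption, it is not shown in this post), filtering on that would be more precise than a box.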

admin <- readShapeLines("data-spatial/NYS_cruzin/new_york_administrative/new_york_administrative.shp")
natural <- readShapeLines("data-spatial/NYS_cruzin/new_york_natural/new_york_natural.shp")
coastline <- readShapeLines("data-spatial/NYS_cruzin/new_york_coastline/new_york_coastline.shp")

Brooklyn

Plotted the points on a borough-specific map.

par(family="HersheySans", bty="o", mar=c(2, 2, 2, 2))

plot(0, 0, type="n", asp=1, las = 0, axes=TRUE, 
     xlim=c(-74.05663, -73.833365), 
     ylim=c(40.551042, 40.739446), 
     xlab="longitude", ylab="latitude", 
     font.lab=2, font.axis=2,
     cex.axis=0.6, cex.lab=0.8)

lines(admin, lwd=0.2, col="grey10")
lines(natural, lwd = 0.4, col = "peachpuff3")
lines(coastline, lwd = 0.6, col = "cadetblue")

points(waste_points$Longitude, waste_points$Latitude, 
       pch = 20, col = "orange3", cex = 0.2)
points(odor_points$Longitude, odor_points$Latitude, 
       pch = 1, col = "firebrick3", cex = 1.175)
points(odor_points$Longitude, odor_points$Latitude, 
       pch = 1, col = "firebrick3", cex = 0.675)

Queens

Queens’ bounding box could use some tightening up.

Manhattan

Bronx

Staten Island

Will be tuning this up over time!