While there is a package for working with Socrata / SODA, here we’re going to show you the “manual” method… because learning it this way will help you interact with other APIs as well. But if you’re interested, check out https://cran.r-project.org/web/packages/RSocrata/index.html.
You’ll almost certainly need to install some packages, see (and
uncomment, and run) the first line of code in the chunk below (e.g.,
please run install.packages(c("rgdal", "leaflet"))).
You may also have lower level dependencies… this depends a lot on
your computer setup. I, for example, had to
brew install gdal on my homebrew-enabled Mac running Big
Sur. If you get weird error messages, don’t panic! Just try to soldier
through by trying to figure out what the main complaint of the error
message is, and using Google to help.
# install.packages(c("rgdal", "leaflet"))
library(dplyr)
library(rgdal)
library(leaflet)
GeoJSON is a file format that includes geographic features (for example, counties as lists of lat/long points that form closed polygons) and other data (like crime counts or Covid prevalence).
Below, I’m using GeoJSON endpoints from Socrata data API endpoints. You can replace one or both of the urls below with your own endpoint, as long as it ends in .geojson!
Here is an example points based data endpoint: https://data.cityofnewyork.us/resource/5uac-w243.geojson (See https://data.cityofnewyork.us/Public-Safety/NYPD-Complaint-Data-Current-Year-To-Date-/5uac-w243 for more information)
And here is an example polygon based data endpoint: https://data.cityofnewyork.us/resource/imfq-nf3j.geojson (See https://data.cityofnewyork.us/City-Government/2020-Census-Tracts-water-areas-included-Tabular/imfq-nf3j for more information)
Note that we’re using readOGR, which we will explain
more about later in the workshop. We’ll pull in the polygon based
endpoint first.
my_geo_data_1 <- readOGR("https://data.cityofnewyork.us/resource/imfq-nf3j.geojson")
## OGR data source with driver: GeoJSON
## Source: "https://data.cityofnewyork.us/resource/imfq-nf3j.geojson", layer: "imfq-nf3j"
## with 1000 features
## It has 13 fields
You’ll need to know if you have points or polygons in your data. Now,
I intentionally brought in polygon data, so this seems silly. But maybe
you’re not sure whether your data has points or polygons. Let’s find out
what we have in your case. Look at the top line. It might say something
like “SpatialPointsDataFrame” or “SpatialPolygonsDataFrame.” I’m
limiting here how deeply we look into the structure by using
max.level.
str(my_geo_data_1, max.level = 2)
## Formal class 'SpatialPolygonsDataFrame' [package "sp"] with 5 slots
## ..@ data :'data.frame': 1000 obs. of 13 variables:
## ..@ polygons :List of 1000
## ..@ plotOrder : int [1:1000] 530 531 509 308 297 126 423 542 532 318 ...
## ..@ bbox : num [1:2, 1:2] -74.1 40.6 -73.7 40.9
## .. ..- attr(*, "dimnames")=List of 2
## ..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slot
## ..$ comment: chr "TRUE"
Great, we have a polygon data frame – this is a good place to start!
Let’s first do a basemap of your area of interest. I’m guessing at
the zoom amount, so you can try a larger or smaller number. The latitude
and longitude I’m taking from the mean of the bounding box values (the
bounding box is standard for an rgdal object). This just
shows the major geographic features, thanks to the
addTiles() layer. We can remove that line if we later
decide we don’t want the basemap.
my_basemap_1 <- leaflet(my_geo_data_1) %>%
setView(lng = mean(my_geo_data_1@bbox[1,], na.rm=TRUE),
lat = mean(my_geo_data_1@bbox[2,], na.rm=TRUE), zoom = 10) %>%
addTiles()
my_basemap_1
Once you’re happy with your zoom, you can add to your map! Spatial polygons are pretty easy to work with. I’ll just add one additional layer to our map!
my_polygon_map <- leaflet(my_geo_data_1) %>%
setView(lng = mean(my_geo_data_1@bbox[1,], na.rm=TRUE),
lat = mean(my_geo_data_1@bbox[2,], na.rm=TRUE), zoom = 10) %>%
addTiles() %>%
addPolygons()
my_polygon_map
OK, that’s not the best looking polygon map. Let’s do a bit of tweaking to our polygon layer!
my_polygon_map <- leaflet(my_geo_data_1) %>%
setView(lng = mean(my_geo_data_1@bbox[1,], na.rm=TRUE),
lat = mean(my_geo_data_1@bbox[2,], na.rm=TRUE), zoom = 10) %>%
addTiles() %>%
addPolygons(
weight = 1, # border thickness
opacity = 0.5, # border opacity
color = "grey", # border color
fillColor = "white",
fillOpacity = 1
)
my_polygon_map
What if you have point data? You can add circle markers
my_geo_data_2 <- readOGR("https://data.cityofnewyork.us/resource/5uac-w243.geojson")
## OGR data source with driver: GeoJSON
## Source: "https://data.cityofnewyork.us/resource/5uac-w243.geojson", layer: "5uac-w243"
## with 1000 features
## It has 39 fields
my_point_map <- leaflet(my_geo_data_2) %>%
setView(lng = mean(my_geo_data_2@bbox[1,], na.rm=TRUE),
lat = mean(my_geo_data_2@bbox[2,], na.rm=TRUE), zoom = 10) %>%
addTiles() %>%
addCircleMarkers()
my_point_map
Ugh once again we struggle with the appearance. Let’s revamp!
my_point_map <- leaflet(my_geo_data_2) %>%
setView(lng = mean(my_geo_data_2@bbox[1,], na.rm=TRUE),
lat = mean(my_geo_data_2@bbox[2,], na.rm=TRUE), zoom = 10) %>%
addTiles() %>%
addCircleMarkers(
radius = 3, # pixel size
weight = 1, # edge thickness
opacity = 0.5, # edge opacity
color = "grey", # edge color
fillColor = "white",
fillOpacity = 1
)
my_point_map