This document covers the following items for working with spatial data in R:
over() for polygon overlays searchingFirst, install R of course. Even better, download Rstudio http://www.rstudio.com/, an incredibly efficient and easy to use interface for working with R.
R does not support working with spatial data straight out of the box so there are a couple of packages that need to be downloaded to get R working with spatial data. The two packages required are sp and maptools, and leaflet for cool visualization. Use the following scripts to get the packages installed and loaded:
install.packages(c("sp","maptools","leaflet"))
library(sp)
library(maptools)
library(leaflet)
Now it’s time to load the data. Setting up a working directory where the raw data sits is always the first step. Here we have two spatial data files that needed to be loaded into the memory from two different locations, so function setwd() is called twice.
# load rain fall data of 20160101
RainShapeFiles <- readShapePoly("VClimateRainfall20160101_CONUS.shp")
# load US county data
CountyShapeFiles <- readShapePoly("cb_2014_us_county_500k.shp")
CountyShapeFiles$NAME <- as.character(CountyShapeFiles$NAME)
RainShapeFiles and CountyShapeFiles are SpatialPolygonsDataFrame objects. You can interpret this as the combination of polygon shape files and a data frame. Each polygon shape file contains the boundary coordinates, and the data frame contains other attributes attached to each polygon. RainShapeFiles has 926 polygons and 9 attributes attached to each polygon.
# dimension check
dim(RainShapeFiles)
## [1] 926 9
# attribute names
names(RainShapeFiles)
## [1] "startHrGMT" "Rainfall" "endHrGMT" "endDate" "startDate"
## [6] "descript" "dataRights" "units" "Poly_Id"
As a demo, here we see a region in the state of NY where it is raining.
# selecting row 736 to 742 of RainShapeFiles
NYRainShapeFiles <- RainShapeFiles[736:742,]
# selecting counties from NY state
NYCountyShapeFiles <- CountyShapeFiles[CountyShapeFiles$STATEFP==36, ]
Then we can have a quick visualization of this region in the state of NY.
V3 = leaflet() %>%
addProviderTiles("Stamen.TonerLite") %>%
addPolygons(data = NYCountyShapeFiles, popup=NYCountyShapeFiles$NAME,
fillColor = "green", fillOpacity = 0.2,
stroke = T, color="Black", weight=2, opacity=0.2) %>%
addPolygons(data = NYRainShapeFiles,
fillColor = "red", fillOpacity = 0.6,
stroke = T, color="Black", weight=1, opacity=0.1)
V3
Numerical “map overlay” combines spatial features from one map layer with the attribute (numerical) properties of another. This vignette explains the R method over(), which provides a consistent way to retrieve indices or attributes from a given spatial object (map layer) at the locations of another spatial object.
Given two geometries, A and B, command over(A,B) retrieves the geometry (location) indices of B at the locations of A. In particular, an integer vector of length ‘length(A)’ is returned, with NA values for locations in A not matching with locations in B (e.g. those points outside a set of polygons). With this function, we can have a list of counties that interact with all raining polygons.
OverlayIndex = !is.na(over(NYCountyShapeFiles, NYRainShapeFiles)[,1])
NYCountyShapeFiles[["NAME"]][OverlayIndex]
## [1] "Oneida" "Cattaraugus" "Chautauqua" "Erie" "Oswego"
## [6] "Jefferson" "Genesee" "Lewis" "Wyoming"
Finally we can have a more detailed list of which counties are interacted with each polygon. The data frame shown in between [[1]] and [[2]] contains attributes of the counties that interact with the first raining polygon. And since the 4th polygon is in Lake Erie, there is no interaction with any county.
over(NYRainShapeFiles, NYCountyShapeFiles, returnList = TRUE)
## [[1]]
## STATEFP COUNTYFP COUNTYNS AFFGEOID GEOID NAME LSAD
## 550 36 009 00974103 0500000US36009 36009 Cattaraugus 06
## 551 36 013 00974105 0500000US36013 36013 Chautauqua 06
## 553 36 029 00974113 0500000US36029 36029 Erie 06
## 1578 36 037 00974117 0500000US36037 36037 Genesee 06
## 2712 36 121 00974158 0500000US36121 36121 Wyoming 06
## ALAND AWATER
## 550 3388617683 36604676
## 551 2745973888 1139483730
## 553 2700574231 476963092
## 1578 1276697652 6142592
## 2712 1535218807 9158501
##
## [[2]]
## STATEFP COUNTYFP COUNTYNS AFFGEOID GEOID NAME LSAD
## 551 36 013 00974105 0500000US36013 36013 Chautauqua 06
## ALAND AWATER
## 551 2745973888 1139483730
##
## [[3]]
## STATEFP COUNTYFP COUNTYNS AFFGEOID GEOID NAME LSAD
## 550 36 009 00974103 0500000US36009 36009 Cattaraugus 06
## ALAND AWATER
## 550 3388617683 36604676
##
## [[4]]
## [1] STATEFP COUNTYFP COUNTYNS AFFGEOID GEOID NAME LSAD ALAND
## [9] AWATER
## <0 rows> (or 0-length row.names)
##
## [[5]]
## STATEFP COUNTYFP COUNTYNS AFFGEOID GEOID NAME LSAD
## 196 36 065 00974131 0500000US36065 36065 Oneida 06
## 988 36 075 00974136 0500000US36075 36075 Oswego 06
## 1110 36 045 00974121 0500000US36045 36045 Jefferson 06
## 1579 36 049 00974123 0500000US36049 36049 Lewis 06
## ALAND AWATER
## 196 3140111003 117137923
## 988 2464742025 933506722
## 1110 3285857552 1524359927
## 1579 3301311295 39609925
##
## [[6]]
## STATEFP COUNTYFP COUNTYNS AFFGEOID GEOID NAME LSAD ALAND
## 196 36 065 00974131 0500000US36065 36065 Oneida 06 3140111003
## AWATER
## 196 117137923
##
## [[7]]
## STATEFP COUNTYFP COUNTYNS AFFGEOID GEOID NAME LSAD ALAND
## 196 36 065 00974131 0500000US36065 36065 Oneida 06 3140111003
## AWATER
## 196 117137923