This document covers the following items for working with spatial data in R:


Get Machine and Data Ready

First, install R of course. Even better, download Rstudio http://www.rstudio.com/, an incredibly efficient and easy to use interface for working with R.

R does not support working with spatial data straight out of the box so there are a couple of packages that need to be downloaded to get R working with spatial data. The two packages required are sp and maptools, and leaflet for cool visualization. Use the following scripts to get the packages installed and loaded:

install.packages(c("sp","maptools","leaflet"))
library(sp)
library(maptools)
library(leaflet)

Now it’s time to load the data. Setting up a working directory where the raw data sits is always the first step. Here we have two spatial data files that needed to be loaded into the memory from two different locations, so function setwd() is called twice.

# load rain fall data of 20160101
RainShapeFiles <- readShapePoly("VClimateRainfall20160101_CONUS.shp")
# load US county data
CountyShapeFiles <- readShapePoly("cb_2014_us_county_500k.shp")
CountyShapeFiles$NAME <- as.character(CountyShapeFiles$NAME)

Data Structure of Spatial Polygons

RainShapeFiles and CountyShapeFiles are SpatialPolygonsDataFrame objects. You can interpret this as the combination of polygon shape files and a data frame. Each polygon shape file contains the boundary coordinates, and the data frame contains other attributes attached to each polygon. RainShapeFiles has 926 polygons and 9 attributes attached to each polygon.

# dimension check
dim(RainShapeFiles)
## [1] 926   9
# attribute names
names(RainShapeFiles)
## [1] "startHrGMT" "Rainfall"   "endHrGMT"   "endDate"    "startDate" 
## [6] "descript"   "dataRights" "units"      "Poly_Id"

As a demo, here we see a region in the state of NY where it is raining.

# selecting row 736 to 742 of RainShapeFiles
NYRainShapeFiles <- RainShapeFiles[736:742,]
# selecting counties from NY state
NYCountyShapeFiles <- CountyShapeFiles[CountyShapeFiles$STATEFP==36, ]

Then we can have a quick visualization of this region in the state of NY.

V3 = leaflet() %>% 
  addProviderTiles("Stamen.TonerLite") %>%
  addPolygons(data = NYCountyShapeFiles, popup=NYCountyShapeFiles$NAME,
              fillColor = "green", fillOpacity = 0.2,
              stroke = T, color="Black", weight=2, opacity=0.2) %>%
  addPolygons(data = NYRainShapeFiles,
              fillColor = "red", fillOpacity = 0.6,
              stroke = T, color="Black", weight=1, opacity=0.1)
V3

Overlay Searching

Numerical “map overlay” combines spatial features from one map layer with the attribute (numerical) properties of another. This vignette explains the R method over(), which provides a consistent way to retrieve indices or attributes from a given spatial object (map layer) at the locations of another spatial object.

Given two geometries, A and B, command over(A,B) retrieves the geometry (location) indices of B at the locations of A. In particular, an integer vector of length ‘length(A)’ is returned, with NA values for locations in A not matching with locations in B (e.g. those points outside a set of polygons). With this function, we can have a list of counties that interact with all raining polygons.

OverlayIndex = !is.na(over(NYCountyShapeFiles, NYRainShapeFiles)[,1])
NYCountyShapeFiles[["NAME"]][OverlayIndex]
## [1] "Oneida"      "Cattaraugus" "Chautauqua"  "Erie"        "Oswego"     
## [6] "Jefferson"   "Genesee"     "Lewis"       "Wyoming"

Finally we can have a more detailed list of which counties are interacted with each polygon. The data frame shown in between [[1]] and [[2]] contains attributes of the counties that interact with the first raining polygon. And since the 4th polygon is in Lake Erie, there is no interaction with any county.

over(NYRainShapeFiles, NYCountyShapeFiles, returnList = TRUE)
## [[1]]
##      STATEFP COUNTYFP COUNTYNS       AFFGEOID GEOID        NAME LSAD
## 550       36      009 00974103 0500000US36009 36009 Cattaraugus   06
## 551       36      013 00974105 0500000US36013 36013  Chautauqua   06
## 553       36      029 00974113 0500000US36029 36029        Erie   06
## 1578      36      037 00974117 0500000US36037 36037     Genesee   06
## 2712      36      121 00974158 0500000US36121 36121     Wyoming   06
##           ALAND     AWATER
## 550  3388617683   36604676
## 551  2745973888 1139483730
## 553  2700574231  476963092
## 1578 1276697652    6142592
## 2712 1535218807    9158501
## 
## [[2]]
##     STATEFP COUNTYFP COUNTYNS       AFFGEOID GEOID       NAME LSAD
## 551      36      013 00974105 0500000US36013 36013 Chautauqua   06
##          ALAND     AWATER
## 551 2745973888 1139483730
## 
## [[3]]
##     STATEFP COUNTYFP COUNTYNS       AFFGEOID GEOID        NAME LSAD
## 550      36      009 00974103 0500000US36009 36009 Cattaraugus   06
##          ALAND   AWATER
## 550 3388617683 36604676
## 
## [[4]]
## [1] STATEFP  COUNTYFP COUNTYNS AFFGEOID GEOID    NAME     LSAD     ALAND   
## [9] AWATER  
## <0 rows> (or 0-length row.names)
## 
## [[5]]
##      STATEFP COUNTYFP COUNTYNS       AFFGEOID GEOID      NAME LSAD
## 196       36      065 00974131 0500000US36065 36065    Oneida   06
## 988       36      075 00974136 0500000US36075 36075    Oswego   06
## 1110      36      045 00974121 0500000US36045 36045 Jefferson   06
## 1579      36      049 00974123 0500000US36049 36049     Lewis   06
##           ALAND     AWATER
## 196  3140111003  117137923
## 988  2464742025  933506722
## 1110 3285857552 1524359927
## 1579 3301311295   39609925
## 
## [[6]]
##     STATEFP COUNTYFP COUNTYNS       AFFGEOID GEOID   NAME LSAD      ALAND
## 196      36      065 00974131 0500000US36065 36065 Oneida   06 3140111003
##        AWATER
## 196 117137923
## 
## [[7]]
##     STATEFP COUNTYFP COUNTYNS       AFFGEOID GEOID   NAME LSAD      ALAND
## 196      36      065 00974131 0500000US36065 36065 Oneida   06 3140111003
##        AWATER
## 196 117137923