Open RStudio

  1. go to www.github.com/collnell/GWU-visual to access workshop materials
  2. download the GWU_maps.R script
  3. open it in RStudio
  4. make sure you have the following packages installed and loaded:

Spatial analyses

vector data

## Observations: 1
## Variables: 13
## $ statefp           <chr> "24"
## $ statens           <chr> "01714934"
## $ affgeoid          <chr> "0400000US24"
## $ geoid             <chr> "24"
## $ stusps            <chr> "MD"
## $ name              <chr> "Maryland"
## $ lsad              <chr> "00"
## $ aland             <dbl> 25147754905
## $ awater            <dbl> 6983312282
## $ state_name        <chr> "Maryland"
## $ state_abbr        <chr> "MD"
## $ jurisdiction_type <chr> "state"
## $ geometry          <MULTIPOLYGON [°]> MULTIPOLYGON (((-76.05015 3...
## Geometry set for 1 feature 
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -79.48765 ymin: 37.91172 xmax: -75.04894 ymax: 39.72304
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
##  [1] "Anne Arundel"    "Calvert"         "Caroline"       
##  [4] "Garrett"         "Harford"         "Howard"         
##  [7] "Washington"      "Baltimore"       "Cecil"          
## [10] "Frederick"       "Montgomery"      "Somerset"       
## [13] "Talbot"          "Allegany"        "Prince George's"
## [16] "Carroll"         "Charles"         "Wicomico"       
## [19] "Kent"            "Queen Anne's"    "St. Mary's"     
## [22] "Worcester"       "Dorchester"

#### CRS - coordinate reference system

(http://spatialreference.org/)
(https://www.nceas.ucsb.edu/~frazier/RSpatialGuides/OverviewCoordinateReferenceSystems.pdf)

Wikipedia: A coordinate system is a set of mathematical rules for specifying how coordinates are to be assigned to points, such as: affine, cylindrical, Cartesian, ellipsoidal, linear, polar, spherical, vertical, etc.

Look at CRS:

## Coordinate Reference System:
##   EPSG: 4326 
##   proj4string: "+proj=longlat +datum=WGS84 +no_defs"

WGS84 ESPG:4326 - common, most shapefiles in this crs
WGS84 is a datum, not a real projection, for flat longitude and latitude
A datum is a set of parameters that define the position of the origin, the scale, and the orientation of a coordinate system.

Assign CRS to shapefiles

## Coordinate Reference System:
##   EPSG: 2248 
##   proj4string: "+proj=lcc +lat_1=39.45 +lat_2=38.3 +lat_0=37.66666666666666 +lon_0=-77 +x_0=399999.9998983998 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=us-ft +no_defs"

Convert coordinates to sf object

If we have longitude and latitude for our sites, we can project these into a spatial format to work with

## Observations: 3
## Variables: 2
## $ site     <fct> Blackwater, Elmwood, Anderson
## $ geometry <POINT [US_survey_foot]> POINT (1536969 281612.6), POINT (165…

Plot coordiantes on map

#### Find coordiantes
To add labels, need to give coordinates of where to place

##         X        Y
## 1 1536969 281612.6
## 2 1658494 182251.6
## 3 1664430 207727.2
## Observations: 3
## Variables: 4
## $ site     <fct> Blackwater, Elmwood, Anderson
## $ geometry <POINT [US_survey_foot]> POINT (1536969 281612.6), POINT (165…
## $ lon      <dbl> 1536969, 1658494, 1664430
## $ lat      <dbl> 281612.6, 182251.6, 207727.2

Add labels to map

longitude is x axis, latitude is y axis

## mapping: x = ~lon, y = ~lat, label = ~site 
## geom_label_repel: parse = FALSE, box.padding = 0.25, label.padding = 0.25, point.padding = 1e-06, label.r = 0.15, label.size = 0.25, segment.colour = NULL, segment.size = 0.5, segment.alpha = NULL, min.segment.length = 0.5, arrow = NULL, na.rm = FALSE, force = 1, max.iter = 2000, nudge_x = 0, nudge_y = 0, xlim = c(NA, NA), ylim = c(NA, NA), direction = both, seed = NA
## stat_identity: na.rm = FALSE
## position_identity

Calculating distances

st_distance() calculates the distance between points

## Units: [US_survey_foot]
##          [,1]      [,2]      [,3]
## [1,]      0.0 156974.35 147327.79
## [2,] 156974.4      0.00  26158.06
## [3,] 147327.8  26158.06      0.00

Determine nearest distance to county boundary:

## 5832.274 [US_survey_foot]

st_cast converts geometries to other types

In a similar manner we can add labels to each county polygon:

## Observations: 24
## Variables: 13
## $ statefp           <chr> "24", "24", "24", "24", "24", "24", "24", "24"…
## $ countyfp          <chr> "003", "009", "011", "023", "025", "027", "043…
## $ countyns          <chr> "01710958", "01676636", "00595737", "01711058"…
## $ affgeoid          <chr> "0500000US24003", "0500000US24009", "0500000US…
## $ geoid             <chr> "24003", "24009", "24011", "24023", "24025", "…
## $ name              <chr> "Anne Arundel", "Calvert", "Caroline", "Garret…
## $ lsad              <chr> "06", "06", "06", "06", "06", "06", "06", "06"…
## $ aland             <dbl> 1074553083, 552178304, 827350236, 1676036320, …
## $ awater            <dbl> 447833714, 341560888, 16777064, 22315761, 2319…
## $ state_name        <chr> "Maryland", "Maryland", "Maryland", "Maryland"…
## $ state_abbr        <chr> "MD", "MD", "MD", "MD", "MD", "MD", "MD", "MD"…
## $ jurisdiction_type <chr> "state", "state", "state", "state", "state", "…
## $ geometry          <MULTIPOLYGON [US_survey_foot]> MULTIPOLYGON (((1357…

Crop plot to region of interest

subsetting extent

st_crop clips vector data based on given limits (bounding box)

## Observations: 3
## Variables: 15
## $ statefp           <chr> "24", "24", "24"
## $ countyfp          <chr> "039", "045", "019"
## $ countyns          <chr> "00596907", "01668606", "00596495"
## $ affgeoid          <chr> "0500000US24039", "0500000US24045", "0500000US…
## $ geoid             <chr> "24039", "24045", "24019"
## $ name              <chr> "Somerset", "Wicomico", "Dorchester"
## $ lsad              <chr> "06", "06", "06"
## $ aland             <dbl> 828090979, 969736590, 1400542033
## $ awater            <dbl> 752714478, 66797770, 1145384766
## $ state_name        <chr> "Maryland", "Maryland", "Maryland"
## $ state_abbr        <chr> "MD", "MD", "MD"
## $ jurisdiction_type <chr> "state", "state", "state"
## $ lon               <dbl> 1671442, 1707727, 1594945
## $ lat               <dbl> 165995.8, 260349.9, 298767.1
## $ geometry          <GEOMETRY [US_survey_foot]> POLYGON ((1613938 182251…

Raster data

Raster or “gridded” data are data that are saved in pixels. In the spatial world, each pixel represents an area on the Earth’s surface

Read in DEM raster

DEM = digital elevation model
data source: (http://data.imap.maryland.gov/datasets/1c6ce663eb3b499b9495010cb8c89df6v)

## class       : RasterLayer 
## dimensions  : 400, 400, 160000  (nrow, ncol, ncell)
## resolution  : 3252.954, 3252.954  (x, y)
## extent      : 585182.1, 1886364, -226561.9, 1074620  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=lcc +lat_1=39.45 +lat_2=38.3 +lat_0=37.66666666666666 +lon_0=-77 +x_0=399999.9998983998 +y_0=0 +ellps=GRS80 +units=us-ft +no_defs 
## data source : /Users/collnell/Dropbox/rstats/data viz/MD_DEM.tiff 
## names       : MD_DEM
## An object of class ".SingleLayerData"
## Slot "values":
## logical(0)
## 
## Slot "offset":
## [1] 0
## 
## Slot "gain":
## [1] 1
## 
## Slot "inmemory":
## [1] FALSE
## 
## Slot "fromdisk":
## [1] TRUE
## 
## Slot "isfactor":
## [1] FALSE
## 
## Slot "attributes":
## list()
## 
## Slot "haveminmax":
## [1] FALSE
## 
## Slot "min":
## [1] Inf
## 
## Slot "max":
## [1] -Inf
## 
## Slot "band":
## [1] 1
## 
## Slot "unit":
## [1] ""
## 
## Slot "names":
## [1] "MD_DEM"

set crs

## [1] "+proj=lcc +lat_1=39.45 +lat_2=38.3 +lat_0=37.66666666666666 +lon_0=-77 +x_0=399999.9998983998 +y_0=0 +ellps=GRS80 +units=us-ft +no_defs"

convert to dataframe to use with ggplot2

## 'data.frame':    26858 obs. of  3 variables:
##  $ x     : num  648615 651868 655121 658374 661626 ...
##  $ y     : num  760710 760710 760710 760710 760710 ...
##  $ MD_DEM: num  580 538 621 600 695 ...

extract DEM data for sites

extract pulls data values from raster layers that correspond to vector objects

## Observations: 3
## Variables: 19
## $ site              <fct> Elmwood, Anderson, Blackwater
## $ lon               <dbl> 1658494, 1664430, 1536969
## $ lat               <dbl> 182251.6, 207727.2, 281612.6
## $ statefp           <chr> "24", "24", "24"
## $ countyfp          <chr> "039", "039", "019"
## $ countyns          <chr> "00596907", "00596907", "00596495"
## $ affgeoid          <chr> "0500000US24039", "0500000US24039", "0500000US…
## $ geoid             <chr> "24039", "24039", "24019"
## $ name              <chr> "Somerset", "Somerset", "Dorchester"
## $ lsad              <chr> "06", "06", "06"
## $ aland             <dbl> 828090979, 828090979, 1400542033
## $ awater            <dbl> 752714478, 752714478, 1145384766
## $ state_name        <chr> "Maryland", "Maryland", "Maryland"
## $ state_abbr        <chr> "MD", "MD", "MD"
## $ jurisdiction_type <chr> "state", "state", "state"
## $ lon.1             <dbl> 1671442, 1671442, 1594945
## $ lat.1             <dbl> 165995.8, 165995.8, 298767.1
## $ geometry          <POINT [US_survey_foot]> POINT (1658494 182251.6), P…
## $ DEM               <dbl> 0.7807876, 1.1911737, 0.1216580
## Reading layer `MD_endangeredspecies' from data source `/Users/collnell/Dropbox/rstats/data viz/MD_endangeredspecies' using driver `ESRI Shapefile'
## Simple feature collection with 252 features and 5 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: 608948.3 ymin: 153422.9 xmax: 1867328 ymax: 754753.5
## epsg (SRID):    NA
## proj4string:    +proj=lcc +lat_1=39.45 +lat_2=38.3 +lat_0=37.66666666666666 +lon_0=-77 +x_0=399999.9998983999 +y_0=0 +ellps=GRS80 +units=us-ft +no_defs
## Observations: 252
## Variables: 6
## $ species  <fct> Alasmidonta heterodon, Betula lenta, Betula lenta, Betu…
## $ kingdom  <fct> Animalia, Plantae, Plantae, Plantae, Plantae, Plantae, …
## $ phylum   <fct> Mollusca, Tracheophyta, Tracheophyta, Tracheophyta, Tra…
## $ order    <fct> Unionida, Fagales, Fagales, Fagales, Fagales, Fagales, …
## $ family   <fct> Unionidae, Betulaceae, Betulaceae, Betulaceae, Betulace…
## $ geometry <POINT [US_survey_foot]> POINT (1597053 438583.1), POINT (137…
## Coordinate Reference System:
##   No EPSG code
##   proj4string: "+proj=lcc +lat_1=39.45 +lat_2=38.3 +lat_0=37.66666666666666 +lon_0=-77 +x_0=399999.9998983999 +y_0=0 +ellps=GRS80 +units=us-ft +no_defs"

## Observations: 5
## Variables: 6
## $ species  <fct> Pseudemys rubriventris, Pseudemys rubriventris, Pseudem…
## $ kingdom  <fct> Animalia, Animalia, Animalia, Animalia, Animalia
## $ phylum   <fct> Chordata, Chordata, Chordata, Chordata, Chordata
## $ order    <fct> Testudines, Testudines, Testudines, Testudines, Rodentia
## $ family   <fct> Emydidae, Emydidae, Emydidae, Emydidae, Sciuridae
## $ geometry <POINT [US_survey_foot]> POINT (1553412 301594.2), POINT (156…

rnaturalearth

## Observations: 241
## Variables: 64
## $ scalerank  <int> 3, 1, 1, 1, 1, 3, 3, 1, 1, 1, 3, 1, 5, 3, 1, 1, 1, 1,…
## $ featurecla <chr> "Admin-0 country", "Admin-0 country", "Admin-0 countr…
## $ labelrank  <dbl> 5, 3, 3, 6, 6, 6, 6, 4, 2, 6, 4, 4, 5, 6, 6, 2, 4, 5,…
## $ sovereignt <chr> "Netherlands", "Afghanistan", "Angola", "United Kingd…
## $ sov_a3     <chr> "NL1", "AFG", "AGO", "GB1", "ALB", "FI1", "AND", "ARE…
## $ adm0_dif   <dbl> 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0,…
## $ level      <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,…
## $ type       <chr> "Country", "Sovereign country", "Sovereign country", …
## $ admin      <chr> "Aruba", "Afghanistan", "Angola", "Anguilla", "Albani…
## $ adm0_a3    <chr> "ABW", "AFG", "AGO", "AIA", "ALB", "ALD", "AND", "ARE…
## $ geou_dif   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ geounit    <chr> "Aruba", "Afghanistan", "Angola", "Anguilla", "Albani…
## $ gu_a3      <chr> "ABW", "AFG", "AGO", "AIA", "ALB", "ALD", "AND", "ARE…
## $ su_dif     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ subunit    <chr> "Aruba", "Afghanistan", "Angola", "Anguilla", "Albani…
## $ su_a3      <chr> "ABW", "AFG", "AGO", "AIA", "ALB", "ALD", "AND", "ARE…
## $ brk_diff   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ name       <chr> "Aruba", "Afghanistan", "Angola", "Anguilla", "Albani…
## $ name_long  <chr> "Aruba", "Afghanistan", "Angola", "Anguilla", "Albani…
## $ brk_a3     <chr> "ABW", "AFG", "AGO", "AIA", "ALB", "ALD", "AND", "ARE…
## $ brk_name   <chr> "Aruba", "Afghanistan", "Angola", "Anguilla", "Albani…
## $ brk_group  <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ abbrev     <chr> "Aruba", "Afg.", "Ang.", "Ang.", "Alb.", "Aland", "An…
## $ postal     <chr> "AW", "AF", "AO", "AI", "AL", "AI", "AND", "AE", "AR"…
## $ formal_en  <chr> "Aruba", "Islamic State of Afghanistan", "People's Re…
## $ formal_fr  <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ note_adm0  <chr> "Neth.", NA, NA, "U.K.", NA, "Fin.", NA, NA, NA, NA, …
## $ note_brk   <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "Multiple…
## $ name_sort  <chr> "Aruba", "Afghanistan", "Angola", "Anguilla", "Albani…
## $ name_alt   <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ mapcolor7  <dbl> 4, 5, 3, 6, 1, 4, 1, 2, 3, 3, 4, 4, 1, 7, 2, 1, 3, 1,…
## $ mapcolor8  <dbl> 2, 6, 2, 6, 4, 1, 4, 1, 1, 1, 5, 5, 2, 5, 2, 2, 1, 6,…
## $ mapcolor9  <dbl> 2, 8, 6, 6, 1, 4, 1, 3, 3, 2, 1, 1, 2, 9, 5, 2, 3, 5,…
## $ mapcolor13 <dbl> 9, 7, 1, 3, 6, 6, 8, 3, 13, 10, 1, NA, 7, 11, 5, 7, 4…
## $ pop_est    <dbl> 103065, 28400000, 12799293, 14436, 3639453, 27153, 83…
## $ gdp_md_est <dbl> 2258.0, 22270.0, 110300.0, 108.9, 21810.0, 1563.0, 36…
## $ pop_year   <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ lastcensus <dbl> 2010, 1979, 1970, NA, 2001, NA, 1989, 2010, 2010, 200…
## $ gdp_year   <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ economy    <chr> "6. Developing region", "7. Least developed region", …
## $ income_grp <chr> "2. High income: nonOECD", "5. Low income", "3. Upper…
## $ wikipedia  <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ fips_10    <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ iso_a2     <chr> "AW", "AF", "AO", "AI", "AL", "AX", "AD", "AE", "AR",…
## $ iso_a3     <chr> "ABW", "AFG", "AGO", "AIA", "ALB", "ALA", "AND", "ARE…
## $ iso_n3     <chr> "533", "004", "024", "660", "008", "248", "020", "784…
## $ un_a3      <chr> "533", "004", "024", "660", "008", "248", "020", "784…
## $ wb_a2      <chr> "AW", "AF", "AO", NA, "AL", NA, "AD", "AE", "AR", "AM…
## $ wb_a3      <chr> "ABW", "AFG", "AGO", NA, "ALB", NA, "ADO", "ARE", "AR…
## $ woe_id     <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ adm0_a3_is <chr> "ABW", "AFG", "AGO", "AIA", "ALB", "ALA", "AND", "ARE…
## $ adm0_a3_us <chr> "ABW", "AFG", "AGO", "AIA", "ALB", "ALD", "AND", "ARE…
## $ adm0_a3_un <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ adm0_a3_wb <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ continent  <chr> "North America", "Asia", "Africa", "North America", "…
## $ region_un  <chr> "Americas", "Asia", "Africa", "Americas", "Europe", "…
## $ subregion  <chr> "Caribbean", "Southern Asia", "Middle Africa", "Carib…
## $ region_wb  <chr> "Latin America & Caribbean", "South Asia", "Sub-Sahar…
## $ name_len   <dbl> 5, 11, 6, 8, 7, 5, 7, 20, 9, 7, 14, 10, 23, 22, 17, 9…
## $ long_len   <dbl> 5, 11, 6, 8, 7, 13, 7, 20, 9, 7, 14, 10, 27, 35, 19, …
## $ abbrev_len <dbl> 5, 4, 4, 4, 4, 5, 4, 6, 4, 4, 9, 4, 7, 10, 6, 4, 5, 4…
## $ tiny       <dbl> 4, NA, NA, NA, NA, 5, 5, NA, NA, NA, 3, NA, NA, 2, 4,…
## $ homepart   <dbl> NA, 1, 1, NA, 1, NA, 1, 1, 1, 1, NA, 1, NA, NA, 1, 1,…
## $ geometry   <MULTIPOLYGON [°]> MULTIPOLYGON (((-69.89912 1..., MULTIPOL…