Now that we know how to pull in data, check and transform Coordinate Reference Systems (CRS), and plot sf data.frames together - let’s practice doing the same thing with other geometry types. In this notebook we’ll be bringing in bike boulevards and schools, which will get us primed to think about spatial relationship queries.
tmap
Instructor Notes
library(sf)
## Warning: package 'sf' was built under R version 4.1.2
library(here)
counties = st_read(here("notebook_data",
"california_counties",
"CaliforniaCounties.shp"))
## Reading layer `CaliforniaCounties' from data source
## `/Users/ada/Projects/Events/2023/2023-02-16_21_DLab-GIS_in_R/github/R-Geospatial-Fundamentals-master/notebook_data/california_counties/CaliforniaCounties.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 58 features and 24 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -374445.4 ymin: -604500.7 xmax: 540038.5 ymax: 450022
## Projected CRS: NAD83 / California Albers
ca_places = st_read(here("notebook_data",
"census",
"Places",
"cb_2019_06_place_500k.shp"))
## Reading layer `cb_2019_06_place_500k' from data source
## `/Users/ada/Projects/Events/2023/2023-02-16_21_DLab-GIS_in_R/github/R-Geospatial-Fundamentals-master/notebook_data/census/Places/cb_2019_06_place_500k.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 1521 features and 9 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -124.2694 ymin: 32.53416 xmax: -114.229 ymax: 41.99314
## Geodetic CRS: NAD83
sf objectsJust like with other plots we can make in R, we can customize our maps’ colors, title, etc.
The most basic way to do this would be to use the sf::plot function.
# Make a map with plot
plot(counties$geometry, col='tan', border='darkgreen', main="CA counties")
However, we’ll get much more functionality and customizability if we use a special-purpose mapping package, rather than just relying on sf methods of base R functions.
sf objectsThere are many different packages you can use to make maps of sf data, including:
ggplot2: which is fantastic if you are already a ggplot usermapview: for a quick and easy interactive maptmap: for great custom static and interactive mapsleaflet: for highly custom interactive maps that you can output and host on a websiteshiny: for interactive R based applications that use leaflet mapsWe won’t cover leaflet or shiny, but we will present an quick look at ggplot2 and mapview and then dive into tmap.
Here is a simple example of a ggplot map syntax.
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.1.2
ggplot(counties, aes(fill = MED_AGE)) +
geom_sf() + # tells ggplot that geographic data are being plotted
scale_fill_viridis_c() +
theme_minimal() +
labs(title = "2010 Median Age by County")
mapviewThe mapview package is the easiest way to make a basic interactive map with very little code.
library(mapview)
# Interactive map of counties
mapview(counties)
# with areas colored by data values
mapview(counties, zcol = 'MED_AGE')
# and overlaying CA Places data
mapview(counties, zcol = 'MED_AGE') + mapview(ca_places)
Mapview is great for exploring your data. Less great for making pretty custom maps.
tmapOur go-to mapping package of choice is tmap. Its name stands for “thematic maps”, i.e. maps in which you can use the attribute values of the features to control the style properties of the geometry, thus creating effective data visualizations.
tmap is great for creating both static and interactive maps.
You’ll get plenty of introduction here in the workshop, but for additional support you can check out the tmap vignette or Google other tutorials and references.
The Geocomputation with R book also has a great chapter on making maps with tmap and other R libraries.
Let’s start by loading the package and creating a ‘quick tmap’.
# load tmap
library(tmap)
## Warning: package 'tmap' was built under R version 4.1.2
# make tmap fix invalid polygons
tmap_options(check.and.fix = TRUE)
# plot a 'quick tmap'
qtm(counties)
## Warning: The shape counties is invalid. See sf::st_is_valid
Nice!
That’s the quickest, simplest example of a static map that tmap can make.
tmap_options(check.and.fix = TRUE)tmap_option check.and.fix and repair invalid geometry so that it can render an interactive or static map properly. You can also use the sf function st_make_valid to repair invalid geometry. See the function documentation for more information.# save fixed geom
counties <- st_make_valid(counties)
tmap modestmap has 2 modes:
Use tmap_mode('plot') to set the mapping mode for static maps
Use tmap_mode('view') to set the mapping mode for interactive maps
use ttm() to quickly toggle between modes
tmap loads up in ‘plot’ mode. Let’s switch it to ‘view’ mode and then take a look at that same map.
# toggle the mode (or ttm!)
ttm()
## tmap mode set to interactive viewing
# then make our quick tmap again
qtm(counties)
That’s outstanding! We get a clickable, scrollable, zoomable map built with the Leaflet Javascript library… right out of the box!
Layer Control under the Zoom Controls.And to create thematic maps, we can use tmap’s more verbose mapping functions to create a new tmap object and then add geometry layers to it, setting different aesthetic aspects of those layers.
For now, let’s recreate that same map we made above with plot, but this time using tmap syntax:
tm_shape(counties) + # use the `tm_shape` function to create a tmap object
tm_polygons(col = 'tan', # add `tm_polygons` layer, coloring as before
border.col = 'darkgreen',
alpha = 0.5) # & making transparent
Above, tm_shape creates the new tmap object by identifying the data source for the first map layer (counties). Then, tm_polygons provides the aesthetics for how the data should be displayed.
tm_borders instead of tm_polygons:tm_shape(counties) + # use the `tm_shape` function to create a tmap object
tm_borders(col = 'darkgreen',
alpha = 0.85)
Now we have two primary mapping tools:
- `plot`: nice for simple, pared down plotting tasks
- `tmap`: quick maps, both static and interactive, with greater flexibility
In this section we are going to bring in some new data for mapping
First, we will read in the Berkeley bike boulevards in data. Note two things that are different from our previous data:
We’re reading in data from a GeoJSON file and not a shapefile
The data have line geometry (our county and states data had polygon geometries)
bike_blvds = st_read(here("notebook_data",
"transportation",
"BerkeleyBikeBlvds.geojson"))
## Reading layer `BerkeleyBikeBlvds' from data source
## `/Users/ada/Projects/Events/2023/2023-02-16_21_DLab-GIS_in_R/github/R-Geospatial-Fundamentals-master/notebook_data/transportation/BerkeleyBikeBlvds.geojson'
## using driver `GeoJSON'
## Simple feature collection with 211 features and 10 fields
## Geometry type: MULTILINESTRING
## Dimension: XY
## Bounding box: xmin: 561541.2 ymin: 4189007 xmax: 566451.6 ymax: 4193483
## Projected CRS: WGS 84 / UTM zone 10N
plot(bike_blvds$geometry)
Of course, we could also use tmap to plot our lines:
# set to interactive view mode
tmap_mode('view')
## tmap mode set to interactive viewing
tm_shape(bike_blvds) +
tm_lines()
As usual, we’ll want to do a bit of data exploration…
head(bike_blvds)
## Simple feature collection with 6 features and 10 fields
## Geometry type: MULTILINESTRING
## Dimension: XY
## Bounding box: xmin: 562293.8 ymin: 4189795 xmax: 562786.5 ymax: 4189899
## Projected CRS: WGS 84 / UTM zone 10N
## BB_STRNAM BB_STRID BB_FRO BB_TO BB_SECID DIR_ Status ALT_bikeCA
## 1 Heinz/Russell RUS 7th 8th RUS01 E/W Existing No
## 2 Heinz/Russell RUS 8th 9th RUS02 E/W Ezisting No
## 3 Heinz/Russell RUS 9th 10th RUS03 E/W Existing No
## 4 Heinz/Russell RUS 10th San Pablo RUS04 E/W Existing No
## 5 San Pablo RUS Heinz Russell RUS05 N/S Existing No
## 6 Russell RUS San Pablo Wallace RUS06 E/W Exisitng 3
## Shape_len len_km geometry
## 1 101.12817 0.101 MULTILINESTRING ((562293.8 ...
## 2 100.81407 0.101 MULTILINESTRING ((562391.6 ...
## 3 100.03740 0.100 MULTILINESTRING ((562489 41...
## 4 106.59288 0.107 MULTILINESTRING ((562585.7 ...
## 5 89.56348 0.090 MULTILINESTRING ((562688.9 ...
## 6 76.95699 0.077 MULTILINESTRING ((562711.4 ...
dim(bike_blvds)
## [1] 211 11
colnames(bike_blvds)
## [1] "BB_STRNAM" "BB_STRID" "BB_FRO" "BB_TO" "BB_SECID"
## [6] "DIR_" "Status" "ALT_bikeCA" "Shape_len" "len_km"
## [11] "geometry"
Our bike boulevard data includes the following information:
BB_STRNAM - bike boulevard StreetnameBB_STRID - bike boulevard Street IDBB_FRO - bike boulevard origin streetBB_TO - bike boulevard end streetBB_SECID- bike boulevard section idDIR_ - cardinal directions the bike boulevard runsStatus - status on whether the bike boulevard existsALT_bikeCA - ?Shape_len - length of the boulevard in meterslen_km - length of the boulevard in kilometersgeometry Question
Why are there 211 features when we only have 8 bike boulevards?
And now take a look at our CRS…
st_crs(bike_blvds)
## Coordinate Reference System:
## User input: WGS 84 / UTM zone 10N
## wkt:
## PROJCRS["WGS 84 / UTM zone 10N",
## BASEGEOGCRS["WGS 84",
## ENSEMBLE["World Geodetic System 1984 ensemble",
## MEMBER["World Geodetic System 1984 (Transit)"],
## MEMBER["World Geodetic System 1984 (G730)"],
## MEMBER["World Geodetic System 1984 (G873)"],
## MEMBER["World Geodetic System 1984 (G1150)"],
## MEMBER["World Geodetic System 1984 (G1674)"],
## MEMBER["World Geodetic System 1984 (G1762)"],
## MEMBER["World Geodetic System 1984 (G2139)"],
## ELLIPSOID["WGS 84",6378137,298.257223563,
## LENGTHUNIT["metre",1]],
## ENSEMBLEACCURACY[2.0]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4326]],
## CONVERSION["UTM zone 10N",
## METHOD["Transverse Mercator",
## ID["EPSG",9807]],
## PARAMETER["Latitude of natural origin",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",-123,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["Scale factor at natural origin",0.9996,
## SCALEUNIT["unity",1],
## ID["EPSG",8805]],
## PARAMETER["False easting",500000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",0,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["(E)",east,
## ORDER[1],
## LENGTHUNIT["metre",1]],
## AXIS["(N)",north,
## ORDER[2],
## LENGTHUNIT["metre",1]],
## USAGE[
## SCOPE["Engineering survey, topographic mapping."],
## AREA["Between 126°W and 120°W, northern hemisphere between equator and 84°N, onshore and offshore. Canada - British Columbia (BC); Northwest Territories (NWT); Nunavut; Yukon. United States (USA) - Alaska (AK)."],
## BBOX[0,-126,84,-120]],
## ID["EPSG",32610]]
We can see that the CRS of the bike blvds dataframe is UTM Zone 10N, NAD83. This is a common CRS for locations in Northern CA.
Alright! Now that we have reviewed the bike boulevard data, we’re going to bring in Alameda County school data.
schools_df = read.csv(here("notebook_data",
"alco_schools.csv"))
head(schools_df)
## X Y Site Address City
## 1 -122.2388 37.74476 Amelia Earhart Elementary 400 Packet Landing Rd Alameda
## 2 -122.2519 37.73900 Bay Farm Elementary 200 Aughinbaugh Way Alameda
## 3 -122.2589 37.76206 Donald D. Lum Elementary 1801 Sandcreek Way Alameda
## 4 -122.2348 37.76525 Edison Elementary 2700 Buena Vista Ave Alameda
## 5 -122.2381 37.75396 Frank Otis Elementary 3010 Fillmore St Alameda
## 6 -122.2616 37.76911 Franklin Elementary 1433 San Antonio Ave Alameda
## State Type API Org
## 1 CA ES 933 Public
## 2 CA ES 932 Public
## 3 CA ES 853 Public
## 4 CA ES 927 Public
## 5 CA ES 894 Public
## 6 CA ES 893 Public
dim(schools_df)
## [1] 550 9
Questions Without looking ahead:
This is not an sf data.frame! A couple of clues to figure that out are..
Comma Separated Value (CSV) file, which is not a geospatial data file format.We can also check the object class of the dataframe.
class(schools_df)
## [1] "data.frame"
Although we are loading the school data as a regular, non-spatial data.frame, we can use the function st_as_sf to transform it into an sf spatial data.frame by:
specifying the columns that contain the point coordinates
and the identifying the CRS of the data
Let’s take another look at the dataframe.
head(schools_df, 2)
## X Y Site Address City
## 1 -122.2388 37.74476 Amelia Earhart Elementary 400 Packet Landing Rd Alameda
## 2 -122.2519 37.73900 Bay Farm Elementary 200 Aughinbaugh Way Alameda
## State Type API Org
## 1 CA ES 933 Public
## 2 CA ES 932 Public
Now lets use the st_as_sf function and what we know about the data to convert the dataframe to an sf spatial dataframe.
schools_sf = st_as_sf(schools_df, # the name of the dataframe to make spatial
coords = c('X', 'Y'), # the cols in that dataframe that contain the X & Y coordinate values
crs = 4326) # the CRS, expressed as an EPSG code
# Take a look at the output
head(schools_sf)
## Simple feature collection with 6 features and 7 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -122.2616 ymin: 37.739 xmax: -122.2348 ymax: 37.76911
## Geodetic CRS: WGS 84
## Site Address City State Type API Org
## 1 Amelia Earhart Elementary 400 Packet Landing Rd Alameda CA ES 933 Public
## 2 Bay Farm Elementary 200 Aughinbaugh Way Alameda CA ES 932 Public
## 3 Donald D. Lum Elementary 1801 Sandcreek Way Alameda CA ES 853 Public
## 4 Edison Elementary 2700 Buena Vista Ave Alameda CA ES 927 Public
## 5 Frank Otis Elementary 3010 Fillmore St Alameda CA ES 894 Public
## 6 Franklin Elementary 1433 San Antonio Ave Alameda CA ES 893 Public
## geometry
## 1 POINT (-122.2388 37.74476)
## 2 POINT (-122.2519 37.739)
## 3 POINT (-122.2589 37.76206)
## 4 POINT (-122.2348 37.76525)
## 5 POINT (-122.2381 37.75396)
## 6 POINT (-122.2616 37.76911)
You’ll notice that the spatial dataframe is almost the same as the regular data.frame, except with one less column (because the two coordinate columns, X, and Y, were consumed into a single geometry column.
We can also check the class of the output spatial dataframe…
class(schools_sf)
## [1] "sf" "data.frame"
Check the dimensions (number of rows and columns) - is it the same as for schools_df?
dim(schools_sf)
## [1] 550 8
Now that it’s an sf data.frame, we can make a map of the schools with the plot function, just as we did for our other data sets. Notice that this is our first point dataset.
plot(schools_sf)
Question
# plot the school points only
#plot(...)
plot to overlay the schools on the Berkeley Bike Boulevards?# Hint
st_crs(schools_sf) == st_crs(bike_blvds)
## [1] FALSE
So, we’ll want to transform the CRS to match that of our bike boulevard data.
# What are we setting the CRS to?
schools_utm10 = st_transform(schools_sf, st_crs(bike_blvds))
And make a map overlay of the two layers
plot(bike_blvds$geometry)
plot(schools_utm10$geometry, col= "red", add=T)
tmapWe can also use tmap to plot any of our spatial dataframes.
Here, we can use the tmap tm_dots function for point data:
tm_shape(schools_utm10) +
tm_dots(col = 'purple',
border.col = 'white',
border.lwd = 2,
size = 0.2)
In Lesson 2 we reviewed how to save sf data.frames to multiple file formats, such as the GeoJSON or shapefile formats, etc. However, point data is also often saved to a CSV file.
st_write(obj = schools_utm10,
dsn = here("outdata",
"schools_utm10.csv"),
layer_options = "geometry=AS_XY",
delete_dsn = T)
Use the RStudio File browser to take a look at the output.
tmapWe can combine multiple spatial dataframes of different geometry types to create overlay maps.
Let’s take a look at how we do this in tmap with the schools and bike boulevard dataframes.
tmap with the tm_lines function.tm_shape(schools_utm10) +
tm_dots(size = 0.1) +
tm_shape(bike_blvds) +
tm_lines(col = 'red')
Questions What
tmap functions do we use to
Now try remaking the previous map with the schools_sf dataframe instead of the schools_utm10 dataframe. What does the output tell you about tmap?
## Your code here!
Yeh,
tmapwill dynamically transform (reproject) spatial dataframes so that they can overlay in the same map if the CRS of each dataframe is defined. If it is not defined,tmapwill assume the WGS84 CRS (EPSG code 4326).
If we want to answer questions like “What schools are close to bike boulevards in Berkeley?”, the above plot isn’t super helpful, since the extent covers all of Alameda county.
Let’s practice reading in and mapping some additional spatial data.
In the code cell provided below, write code to:
tmap.# YOUR CODE HERE
In this lesson we learned several new skills:
tmap and interactive mappingsf spatial points data.frame - a very common work flow!