In today’s class we’ll download the Congressional maps that we created, and create maps and charts to display them.
Download Congressional Maps
Spatial data in R
- Spatial data in R, the sf package
- Process your spatial data for mapping and analysis
Creating maps in R
ggplot2 geoms
Building a map
In-class assignment
Final Projects
Assignment 8
Creating district maps that balance many criteria is hard to do! Many criteria conflict with each other and the map drawers are required to decide between them. As we have seen through our readings and examples, the way the districts are drawn can have large impacts on whose voices are heard.
Let’s explore your Congressional maps using R to view, measure, and compare. Go to your map in Dave’s Redistricting App and download the geographic file using the instructions below. If you haven’t completed your map, download another published map from DRA here.
There are special file types necessary for adding a spatial dimension to your data. The two most common are:
- shapefiles, a collection of files that together contain the location, data, and projection
- GeoJSON files, a single file that contains all the same information
Both formats contain geographic information that describes the location of each observation. For a point file, that is most commonly the latitude and longitude of the point. For example, a spatial file of all schools in New York City would include the latitude and longitude coordinates of each school. In the file below, the coordinates are in the geometry column, in (longitude, latitude) order.
sf package
There are many R packages that you can use to work with spatial data. sf is one of the easiest to use because it treats spatial data like a regular data frame, with the geometry stored in the last column.
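To make the "regular data frame plus a geometry column" idea concrete, here is a minimal sketch. The school names, enrollment numbers, and coordinates are made up for illustration; they are not real data, and the object names are hypothetical.

```r
library(sf)

# Hypothetical example data: three made-up points, not real school locations
schools <- data.frame(
  name      = c("School A", "School B", "School C"),
  students  = c(450, 720, 310),
  longitude = c(-73.99, -73.95, -73.90),
  latitude  = c(40.73, 40.78, 40.85)
)

# Convert to an sf object; crs = 4326 is the EPSG code for WGS 84
schools_sf <- st_as_sf(schools, coords = c("longitude", "latitude"), crs = 4326)

# It is still a data frame, now with a geometry column at the end
class(schools_sf)
names(schools_sf)

# Regular data frame operations work, and the geometry column comes along
large_schools <- schools_sf[schools_sf$students > 400, ]
nrow(large_schools)
```

Note that st_as_sf() replaces the longitude and latitude columns with a single geometry column, which is why the filtered result can still be mapped.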
Let’s import the official Congressional map for your state into our R environment to learn about spatial data in R. We will use the sf function st_read() to import our geojson file. To learn more about the basics of sf, take a look at the documentation and vignettes:
In your class8 project, create a new R script called “congressional_maps_exploration.R”
First let’s look at the help section about the function st_read in R by typing into the console:
# exploring the official Congressional map in Georgia, and comparing it to the map that I drew in DRA
library(tidyverse)
library(sf)
library(scales)
library(viridis)
?st_read
# You will see something like the image below in your "Help" window in R Studio:
………..
Now let’s use it
raw_congress_off <- st_read("data/raw/ga_cong_official.geojson")
## Reading layer `ga_cong_official' from data source
## `/Users/sarahodges/spatial/NewSchool/methods1-materials-fall2021/class8/data/raw/ga_cong_official.geojson'
## using driver `GeoJSON'
## Simple feature collection with 14 features and 17 fields
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: -85.60516 ymin: 30.356 xmax: -80.78296 ymax: 35.0013
## Geodetic CRS: WGS 84
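You’ll see a message similar to the one above in your console. It describes the data: the file contains 14 features (rows, one per congressional district) and 17 fields (columns), the geometry type is POLYGON, and the coordinate reference system is WGS 84.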
We are not going to cover map projections. In a nutshell, a projection is the mathematical equation used to represent the round Earth on a flat map. The coordinates in this file are based on the World Geodetic System from 1984 (WGS 84). If you decide to do more work with mapping and spatial data, you will learn a lot more about projections, but for now we’ll skip right over them.
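If you ever want to check which coordinate reference system an object uses without rereading the import message, sf can report it directly. This is a small sketch using a single hypothetical point rather than the congressional file.

```r
library(sf)

# Hypothetical point, built only to illustrate the CRS lookup
pt <- st_sfc(st_point(c(-84.39, 33.75)), crs = 4326)

st_crs(pt)$epsg      # EPSG code of the coordinate reference system
st_is_longlat(pt)    # TRUE when coordinates are longitude/latitude
```

The same two calls should work on an imported object such as raw_congress_off.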
Let’s look at the geometry column
print(raw_congress_off$geometry)
## Geometry set for 14 features
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: -85.60516 ymin: 30.356 xmax: -80.78296 ymax: 35.0013
## Geodetic CRS: WGS 84
## First 5 geometries:
## POLYGON ((-81.92944 31.8484, -81.93218 31.84604...
## POLYGON ((-84.28625 32.74763, -84.28733 32.7496...
## POLYGON ((-84.13301 33.47467, -84.13318 33.4746...
## POLYGON ((-83.91482 33.7442, -83.91562 33.7447,...
## POLYGON ((-84.44718 33.5912, -84.44623 33.59066...
You can treat a spatial data frame just like a regular data frame. Just be aware that the geometry column can be very large so it can take a long time to open. We’ll process the official congressional map so that it is ready to analyze and map:
congress_off <- raw_congress_off %>%
  select(id, TotalPop, DemPct:Margin, geometry) %>%
  mutate(MinorityVAP = round(MinorityPct * TotalVAP, 0),
         BlackVAP = round(BlackPct * TotalVAP, 0),
         DemocratVoters = round(DemPct * TotalVAP, 0),
         RepublicanVoters = round(RepPct * TotalVAP, 0))
We’ll calculate some quick statistics about the official map. The fastest way to do that is to drop the geometry column using the st_drop_geometry() function. We’ll make the table look nice and readable as we create it by using descriptive column names and the comma() and percent() functions from the scales package:
congress_off_stats <- st_drop_geometry(congress_off) %>%
  summarise(map = "official congressional map",
            `total population` = comma(sum(TotalPop)),
            `total voting-age population` = comma(sum(TotalVAP)),
            `percent Minority voting-age population (MVAP)` = percent(sum(MinorityVAP)/sum(TotalVAP)),
            `percent Black voting-age population (BVAP)` = percent(sum(BlackVAP)/sum(TotalVAP)),
            `percent Democratic voters` = percent(sum(DemocratVoters)/sum(TotalVAP)),
            `percent Republican voters` = percent(sum(RepublicanVoters)/sum(TotalVAP)),
            districts = n(),
            `number of Minority-majority districts` = length(id[MinorityPct >= .5]), # count the number of times the conditional is true
            `number of Black-majority districts` = length(id[BlackPct >= .5]),
            `number of Democrat districts` = length(id[DemPct > RepPct]),
            `number of Republican districts` = length(id[RepPct > DemPct]),
            `number of competitive districts` = length(id[RepPct >= .45 & RepPct <= .55 & DemPct >= .45 & DemPct <= .55]))
congress_off_stats
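To see the drop-the-geometry-then-summarise pattern on something small enough to check by hand, here is a sketch with three made-up square "districts". The data, the make_square() helper, and the column values are all hypothetical. It also uses sum() on a logical condition, which is an alternative way to count how many times a condition is true.

```r
library(sf)
library(dplyr)

# Hypothetical helper: a 1 x 1 square polygon starting at x = x0
make_square <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0))))
}

# Hypothetical toy data: three "districts" with made-up populations and vote shares
toy_districts <- st_sf(
  id       = 1:3,
  TotalPop = c(100, 120, 80),
  DemPct   = c(0.62, 0.48, 0.35),
  RepPct   = c(0.38, 0.52, 0.65),
  geometry = st_sfc(make_square(0), make_square(1), make_square(2), crs = 4326)
)

toy_stats <- toy_districts %>%
  st_drop_geometry() %>%                           # plain data frame from here on
  summarise(
    districts            = n(),
    total_population     = sum(TotalPop),
    democratic_districts = sum(DemPct > RepPct),   # each TRUE counts as 1
    competitive          = sum(DemPct >= .45 & DemPct <= .55)
  )
toy_stats
```

With these made-up numbers you can verify the result by eye: three districts, 300 people, one district where Democrats lead, and one competitive district.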
You can use ggplot2 to create maps in R with the geom_sf() geom.
Just like other plots, you build a map by adding instructions to the ggplot. Let’s start with a simple plot of the official congressional map.
ggplot() +
  geom_sf(data = congress_off)
We’re actually just plotting the latitude and longitude from the geometry column. But I think it’s confusing to show the lat/long. Let’s remove the grid lines and labels, and create a choropleth map where we color each district by Percent Minority voters.
ggplot() +
  geom_sf(data = congress_off, mapping = aes(fill = MinorityPct)) +
  theme_void()
A choropleth map uses graduated color or patterns to show the range of a statistic.
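If you want to experiment with choropleths before your own file is ready, the sketch below builds one from three made-up squares. The polygons, the pct values, and the make_square() helper are hypothetical and exist only so the example runs on its own.

```r
library(sf)
library(ggplot2)

# Hypothetical helper: a 1 x 1 square polygon starting at x = x0
make_square <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0))))
}

# Hypothetical toy data: one made-up statistic per polygon
toy <- st_sf(
  pct      = c(0.2, 0.5, 0.8),
  geometry = st_sfc(make_square(0), make_square(1), make_square(2), crs = 4326)
)

# Mapping the statistic to fill is what turns the plot into a choropleth
toy_map <- ggplot() +
  geom_sf(data = toy, mapping = aes(fill = pct)) +
  theme_void()
toy_map
```

Each square is shaded according to its pct value, the same way each district is shaded by MinorityPct in the map above.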
Let’s select a color scheme (we’ll use viridis) and format the legend title and labels.
ggplot() +
  geom_sf(data = congress_off, mapping = aes(fill = MinorityPct)) +
  theme_void() +
  scale_fill_viridis(name="Minority Voting-Age Population (%)", labels=percent_format(accuracy = 1L))
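I prefer for the color to get darker when concentrations increase. In the next iteration I’m going to reverse the color scale with direction = -1, outline the districts in white, and add a title, subtitle, and caption: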
ggplot() +
  geom_sf(data = congress_off, mapping = aes(fill = MinorityPct), color = "#ffffff") +
  theme_void() +
  scale_fill_viridis(name="Minority Voting-Age Population (%)", direction=-1, labels=percent_format(accuracy = 1L)) +
  labs(
    title = "Georgia, Official Congressional Districts",
    subtitle = "Percent Minority Voting-Age Population",
    caption = "Source: DRA | Voting: 2016-2020, Demographics: US Census, 2020"
  )
Now we’ll create a map of partisan lean where we define the breaks specifically to show competitiveness as well as winner:
- Use scale_fill_stepsn to define specific breaks and colors manually
# define the partisan colors
partisan_colors <- c('#bc131e', '#eb4956', '#c36e9e', '#7279db', '#3c6ebf', '#1f4bae')
ggplot() +
  geom_sf(data = congress_off, mapping = aes(fill = DemPct), color = "#ffffff") +
  theme_void() +
  scale_fill_stepsn(breaks=c(0, .4, .45, .5, .55, .6, 1),
                    colors = partisan_colors,
                    name="Percent Democratic Votes (%)",
                    labels=percent_format(accuracy = 1L)) +
  labs(
    title = "Georgia, Official Congressional Districts",
    subtitle = "Percent Democratic Voters (2016-2020)",
    caption = "Source: Dave's Redistricting App"
  )
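To get a feel for what the breaks in the map above mean, it can help to bin a few values by hand. The sketch below uses base R's cut() with the same breaks on made-up Democratic vote shares. It only illustrates which interval each value falls in. It does not reproduce how ggplot2 assigns colors internally.

```r
# Same breaks as in the scale_fill_stepsn() call above
breaks <- c(0, .4, .45, .5, .55, .6, 1)

# Hypothetical vote shares, made up for illustration
dem_share <- c(0.38, 0.47, 0.52, 0.63)

# cut() assigns each value to the interval between two neighboring breaks
bins <- cut(dem_share, breaks = breaks, include.lowest = TRUE)

data.frame(dem_share, bin = bins)
nlevels(bins)   # seven breaks define six intervals
```

Districts in the two middle intervals (45% to 55%) are the ones the summary table earlier counts as competitive.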
Now that I have finalized two maps, let’s name them so that I can call them as objects and save them to my output folder
minority_official <- ggplot() +
  geom_sf(data = congress_off, mapping = aes(fill = MinorityPct), color = "#ffffff") +
  theme_void() +
  scale_fill_viridis(name="Minority Voting-Age Population (%)", direction=-1, labels=percent_format(accuracy = 1L)) +
  labs(
    title = "Georgia, Official Congressional Districts",
    subtitle = "Percent Minority Voting-Age Population",
    caption = "Source: DRA | Voting: 2016-2020, Demographics: US Census, 2020"
  )
partisan_official <- ggplot() +
  geom_sf(data = congress_off, mapping = aes(fill = DemPct), color = "#ffffff") +
  theme_void() +
  scale_fill_stepsn(breaks=c(0, .4, .45, .5, .55, .6, 1),
                    colors = partisan_colors,
                    name="Percent Democratic Votes (%)",
                    labels=percent_format(accuracy = 1L)) +
  labs(
    title = "Georgia, Official Congressional Districts",
    subtitle = "Percent Democratic Voters (2016-2020)",
    caption = "Source: Dave's Redistricting App"
  )
# save ggplot object using ggsave
ggsave("figures/congress_official_percent_minority.png",
       plot = minority_official, # specify the ggplot object you stored
       units = "in",
       height = 5, width = 7)
ggsave("figures/congress_official_percent_dem.png",
       plot = partisan_official, # specify the ggplot object you stored
       units = "in",
       height = 5, width = 7)
Make maps and a summary table of your DRA maps (or pick another published map)
- Use the scale_fill_gradient() function to define a color scale by the lowest color and the highest color
- See the scale_fill_gradient() help section in R Studio
The final project for this course is a research paper that uses R to answer a research question and visualize the results. The project can be on a topic of your choosing, and can be a small group project or an individual one. The deliverables will include:
The research paper should be 3 pages, not counting graphics and methods.
If you are working with a group, your research paper should include at least one research question per person. Each person should have their own analysis and visualization scripts and their own methods appendix.
Read chapter 2: Collect, Analyze, Imagine, Teach. from Data Feminism by Catherine D’Ignazio and Lauren F. Klein
Did you know that you can also import census data together with its geometry using the tidycensus package?! Just add the parameter geometry = TRUE to your get_acs() function. See example below:
library(tidyverse)
library(tidycensus)
### load all the variables for the ACS
# acs201519 <- load_variables(2019, "acs5", cache = T)
raw_income <- get_acs(geography = "state",
                      variables = "B19013_001",
                      year = 2019,
                      geometry = TRUE) # this parameter imports the geometry
## Getting data from the 2015-2019 5-year ACS
## Downloading feature geometry from the Census website. To cache shapefiles for use in future sessions, set `options(tigris_use_cache = TRUE)`.
Create 2 maps using data you download from the 2015-19 American Community Survey with the tidycensus package. You can create any maps you like. You can even use this assignment to start thinking about your final project if you are ready for that.
Here are some examples of maps you could make if you want some inspiration:
Idea 1. Download the median rent for every county in New York (The variable is called MEDIAN CONTRACT RENT in the ACS).
Map ideas:
Idea 2. Download the PEOPLE REPORTING ANCESTRY table for every census tract in Kings County (Brooklyn)
Map ideas:
Idea 3. Download the table for LANGUAGE SPOKEN AT HOME FOR THE POPULATION 5 YEARS AND OVER for every census tract in all 5 counties in New York City
Map ideas:
When you have finished your maps, save them to the figures folder in your class8 folder. Then go back through your script and clean it up: delete any lines of code that are not pertinent to your final analysis, and write comments to explain all of your steps. Make this a script that you could open in a year and rerun.
At the bottom of the script, write a commented-out paragraph about your experience with this assignment: what was hard, what you learned, and how long it took you.
Upload your finalized script.