For more practice with maps and plotting, let’s work with election returns from 2012 and 2016. The material in this course builds on itself, so while some of these tasks may be easy for you, it’s important to keep practicing base skills with dplyr and ggplot so that we can move on to more complex manipulation in R. Since we now know how to make plots with axis labels, titles, and other proper conventions, you should make all of your plots with these features.
• First, load in county_returns.csv, which contains the 2012 and 2016 presidential election returns by county. It also includes a variable for FIPS codes, which will help us later.
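# tidyverse supplies read_csv(), dplyr, tidyr, and ggplot2; maps supplies the map data and county.fips
library(tidyverse)
library(maps)
county_returns <- read_csv("Data/Raw/county_returns.csv")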
## Warning: Missing column names filled in: 'X1' [1]
##
## ── Column specification ────────────────────────────────────────────────────────
## cols(
## X1 = col_double(),
## fips = col_double(),
## trump = col_double(),
## clinton = col_double(),
## total_votes_2016 = col_double(),
## obama = col_double(),
## romney = col_double(),
## total_votes_2012 = col_double(),
## trump_clinton_margin = col_double(),
## romney_obama_margin = col_double(),
## Geographic.Name = col_character(),
## state.name = col_character()
## )
• Merge the county-level mapping data with this electoral data. Think carefully about how we should merge it!
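# Build a "state,county" key that matches the polyname column in county.fips
counties <- map_data("county") %>%
  unite(col = polyname, region, subregion, sep = ",")
# Attach FIPS codes to the polygons, then attach the polygons to the returns
counties_fips <- left_join(counties, county.fips, by = "polyname")
counties_electoral <- left_join(county_returns, counties_fips, by = "fips")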
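One way to think carefully about the merge is to check which keys fail to match in each direction. The sketch below uses two small made-up tables (toy_returns and toy_polygons are hypothetical stand-ins, not the real data) and dplyr's anti_join() to list the FIPS codes that appear in one table but not the other.

```r
library(dplyr)

# Hypothetical stand-ins for county_returns and counties_fips
toy_returns <- data.frame(
  fips = c(1001, 1003, 2013),
  clinton = c(10, 20, 30)
)
toy_polygons <- data.frame(
  fips = c(1001, 1001, 1003, 46113),
  long = c(-86.5, -86.4, -87.7, -102.5),
  lat = c(32.4, 32.5, 30.7, 43.3)
)

# Returns with no polygon: these rows get NA coordinates after a left_join
returns_without_shape <- anti_join(toy_returns, toy_polygons, by = "fips")

# Polygons with no returns: these are dropped when returns are the left table
shapes_without_returns <- anti_join(toy_polygons, toy_returns, by = "fips")

print(returns_without_shape$fips)
print(unique(shapes_without_returns$fips))
```

Running the same two checks on the real tables would show whether the choice of left table matters for the map.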
• Create new variables: (1) clinton_prop (the proportion of the vote won by Hillary Clinton in the 2016 election), (2) trump_prop (the proportion of the vote won by Donald Trump in the 2016 election), and (3) clinton_won (an indicator variable telling you whether Hillary Clinton won a majority of the votes cast in this county).
counties_electoral <- counties_electoral %>%
  mutate(
    clinton_prop = clinton / total_votes_2016,
    trump_prop = trump / total_votes_2016,
    # 1 if Clinton received more votes than Trump in the county, 0 otherwise
    clinton_won = ifelse(clinton > trump, 1, 0)
  )
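The indicator above compares Clinton's votes to Trump's, which is a plurality test; a strict majority test would compare her share of all votes cast to 0.5. The two can disagree when third-party votes are present. A small sketch with made-up numbers (the toy data frame and the won_plurality and won_majority names are hypothetical):

```r
library(dplyr)

# Hypothetical counties; county B has enough third-party votes to matter
toy <- data.frame(
  county = c("A", "B", "C"),
  clinton = c(60, 48, 30),
  trump = c(35, 45, 65),
  total_votes_2016 = c(100, 100, 100)
)

toy <- toy %>%
  mutate(
    clinton_prop = clinton / total_votes_2016,
    won_plurality = ifelse(clinton > trump, 1, 0),      # more votes than Trump
    won_majority = ifelse(clinton_prop > 0.5, 1, 0)     # more than half of all votes
  )

print(toy)
```

In county B, Clinton leads Trump but falls short of half the total vote, so the two indicators differ.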
• Next, make two simple maps that show (1) counties that Hillary Clinton won in blue, and counties that Donald Trump won in red, and (2) states that Clinton won in blue and Trump won in red (so a county-level and state-level electoral choropleth map). You’ll often see these maps on election night, and now you can make them! You will need to do some additional work to get this data at the state level from your county-level data.
ggplot(data = counties_electoral) +
  geom_polygon(
    mapping = aes(x = long, y = lat, group = group, fill = clinton_won),
    color = "white", lwd = 0.25
  ) +
  coord_quickmap() +
  scale_fill_gradient(high = "blue", low = "red") +
  theme_void() +
  theme(legend.position = "none") +
  labs(title = "2016 Election Results by County", subtitle = "Blue represents Clinton, Red represents Trump")
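Because clinton_won takes only two values, one alternative to a continuous gradient is to treat it as a factor and assign the two colors directly. The following is a minimal sketch on two made-up square "counties" (toy_map and its coordinates are hypothetical), not the map data used above.

```r
library(ggplot2)

# Two hypothetical unit squares standing in for county polygons
toy_map <- data.frame(
  long = c(0, 1, 1, 0, 1, 2, 2, 1),
  lat = c(0, 0, 1, 1, 0, 0, 1, 1),
  group = rep(c(1, 2), each = 4),
  clinton_won = rep(c(1, 0), each = 4)
)

# Label the indicator so the fill scale is discrete rather than continuous
toy_map$winner <- factor(toy_map$clinton_won, levels = c(0, 1), labels = c("Trump", "Clinton"))

p <- ggplot(toy_map) +
  geom_polygon(aes(x = long, y = lat, group = group, fill = winner), color = "white") +
  scale_fill_manual(values = c(Trump = "red", Clinton = "blue"), name = "Winner") +
  coord_equal() +
  theme_void()

print(p)
```

A discrete scale also produces a two-entry legend, which can replace the explanatory subtitle if the legend is left visible.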
states <- map_data("state")
state_returns <- county_returns %>%
  group_by(state.name) %>%
  summarise(clinton_prop = sum(clinton) / (sum(clinton) + sum(trump))) %>%
  mutate(clinton_won = ifelse(clinton_prop > 0.5, 1, 0))
states_electoral <- left_join(state_returns, states, by = c("state.name" = "region"))
ggplot(data = states_electoral) +
  geom_polygon(
    mapping = aes(x = long, y = lat, group = group, fill = clinton_won),
    color = "white", lwd = 0.25
  ) +
  coord_quickmap() +
  scale_fill_gradient2(low = "red", mid = "white", high = "blue", midpoint = 0.5) +
  theme_void() +
  theme(legend.position = "none") +
  labs(title = "2016 Election Results by State", subtitle = "Blue represents Clinton, Red represents Trump")
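The state-level step above pools county vote counts before computing a share, rather than averaging county-level proportions. The sketch below, on made-up numbers (toy_counties and the column names pooled_share and mean_of_shares are hypothetical), illustrates why the distinction can matter: averaging gives a tiny county the same weight as a large one.

```r
library(dplyr)

# Hypothetical counties in two states; state "y" has one large and one tiny county
toy_counties <- data.frame(
  state.name = c("x", "x", "y", "y"),
  clinton = c(900, 10, 40, 8),
  trump = c(100, 40, 60, 2)
)

toy_states <- toy_counties %>%
  group_by(state.name) %>%
  summarise(
    # Two-party share from pooled votes (what the state map needs)
    pooled_share = sum(clinton) / (sum(clinton) + sum(trump)),
    # Unweighted average of county shares (ignores county size)
    mean_of_shares = mean(clinton / (clinton + trump))
  )

print(toy_states)
```

In state "y" the pooled share is below 0.5 while the average of county shares is above it, so the two approaches would color the state differently.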
• Now make a county-level map that shows the proportion of votes won by Hillary Clinton in the 2016 election. Counties should be colored on a sliding scale, with counties where Clinton won 0% of the vote in the brightest red, and counties where Clinton won 100% of the vote in the brightest blue. Make these counties fade to white in the middle, so that the color fades as the margin of victory decreases. Hint: Check out the scale_fill_gradient2 option in ggplot.
ggplot(data = counties_electoral) +
  geom_polygon(
    mapping = aes(x = long, y = lat, group = group, fill = clinton_prop),
    color = "white", lwd = 0.25
  ) +
  coord_quickmap() +
  scale_fill_gradient2(low = "red", mid = "white", high = "blue", midpoint = 0.5) +
  theme_void() +
  theme(legend.position = "none") +
  labs(title = "2016 Election Results by County", subtitle = "Blue represents Clinton Votes, Red represents Trump Votes")
• Now draw the same map, except use the more familiar red -> purple -> blue scale that you often see in election analyses. So heavily Trump areas are bright red, politically mixed areas are purple, and heavily Democratic areas are deep blue. Which of these maps do you prefer, and why?
I prefer the red -> white -> blue map. It is easier for me to see the distinctions within the gradients of red and blue, whereas in the other map the blue is muddied by the purple and it’s harder for me to make distinctions. The purple map does show, however, that the US as a whole is more moderate than I perceive.
ggplot(data = counties_electoral) +
  geom_polygon(
    mapping = aes(x = long, y = lat, group = group, fill = clinton_prop),
    color = "white", lwd = 0.25
  ) +
  coord_quickmap() +
  scale_fill_gradient2(low = "red", mid = "purple", high = "blue", midpoint = 0.5) +
  theme_void() +
  theme(legend.position = "none") +
  labs(title = "2016 Election Results by County", subtitle = "Bluer represents more Clinton, Purple is moderate, Redder represents more Trump")
• Now use the 2012 data we included to calculate the swing from 2012 to 2016. Where did Clinton improve on Obama’s performance, and where did it fall off? Places where Clinton did better than Obama should be more blue and places where she did worse should be more red. Comment on your results and where you see trends. Note: You might see a gray county in South Dakota where the data didn’t map right. Just ignore this, there’s an issue with the underlying data that we won’t worry about here.
Clinton improved on Obama’s performance in parts of the South and West, in states such as Texas, Georgia, Arizona, Utah, and California. Her performance fell off in the Midwest and Northeast, in states such as Wisconsin, Michigan, Iowa, New York, and Maine. Although Clinton gained Democratic support in some parts of the country, she lost it in key places such as Iowa, Ohio, Michigan, Wisconsin, Pennsylvania, and Maine.
counties_electoral <- counties_electoral %>%
  mutate(
    obama_prop = obama / total_votes_2012,
    # Positive swing: Clinton's 2016 share exceeds Obama's 2012 share
    swing = clinton_prop - obama_prop
  )
ggplot(data = counties_electoral) +
  geom_polygon(
    mapping = aes(x = long, y = lat, group = group, fill = swing),
    color = "white", lwd = 0.25
  ) +
  coord_quickmap() +
  # Diverging scale centered on zero so that no change is white
  scale_fill_gradient2(low = "red", mid = "white", high = "blue", midpoint = 0) +
  theme_void() +
  theme(legend.position = "none") +
  labs(title = "2012 to 2016 Election Swing", subtitle = "Blue: Clinton did better than Obama, Red: Clinton did worse than Obama")
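To make the swing arithmetic concrete, here is a small worked sketch on made-up counties (the toy data frame and the direction column are hypothetical). The swing is the 2016 Clinton share minus the 2012 Obama share, so its sign says which way a county moved and zero means no change.

```r
library(dplyr)

# Hypothetical counties with 2012 and 2016 vote counts
toy <- data.frame(
  county = c("A", "B"),
  clinton = c(45, 30),
  total_votes_2016 = c(100, 60),
  obama = c(50, 24),
  total_votes_2012 = c(100, 60)
)

toy <- toy %>%
  mutate(
    clinton_prop = clinton / total_votes_2016,
    obama_prop = obama / total_votes_2012,
    swing = clinton_prop - obama_prop,
    direction = case_when(
      swing > 0 ~ "toward Clinton",
      swing < 0 ~ "away from Clinton",
      TRUE ~ "no change"
    )
  )

# Largest absolute swing, useful for legend limits that are symmetric around zero
max_abs <- max(abs(toy$swing), na.rm = TRUE)
swing_limits <- c(-max_abs, max_abs)

print(toy)
print(swing_limits)
```

County A moves about five points away from Clinton and county B about ten points toward her. If a legend is shown, swing_limits could be passed as the limits argument of a fill scale so that the legend range is symmetric around zero.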
BONUS QUESTION!
I pulled data from the 2020 Virginia Democratic Presidential Primary from the Virginia Department of Elections website: https://historical.elections.virginia.gov/elections/view/139527/
I put a cleaned-up version of this data on Canvas as va_2020_dem_primary.csv. Use this data to make at least two different county-level maps of Virginia, showing things you find interesting about the results and commenting on what you find. These might be maps showing who won each county, or maps of how specific candidates did, or how much better one candidate did than another. In each map, use something other than the default color scheme.
Notes: If you’re familiar with the geography of Virginia, you might notice that this map is missing some of the cities that are actually independent and not part of any county, so not all your data will be displayed here (Charlottesville, Harrisonburg, etc.). Real-world data is messy and we can’t easily fix this, so just ignore this issue for the purposes of this assignment and work with what you have. You will also see that the FIPS code for Accomack County doesn’t join in nicely (another very real-world data issue). To fix this, you can use the following code to set it manually after you’ve joined in the rest of your FIPS codes:
va.fips <- va.fips %>% mutate(fips = ifelse(polyname == "virginia,accomack", 51001, fips))
counties <- map_data("county")
va_counties <- filter(counties, region == "virginia")
va_counties <- va_counties %>%
  unite(col = polyname, c(region, subregion), sep = ",")
va_fips <- left_join(va_counties, county.fips, by = "polyname")
# Accomack County does not pick up a FIPS code from the join, so set it manually
va_fips <- va_fips %>%
  mutate(fips = ifelse(polyname == "virginia,accomack", 51001, fips))
va_dem_primary <- read_csv("Data/Raw/va_2020_dem_primary.csv")
##
## ── Column specification ────────────────────────────────────────────────────────
## cols(
## fips = col_double(),
## county_name = col_character(),
## Biden = col_double(),
## Sanders = col_double(),
## Warren = col_double(),
## Bloomberg = col_double(),
## Gabbard = col_double(),
## Buttigieg = col_double(),
## Klobuchar = col_double(),
## Yang = col_double(),
## Booker = col_double(),
## Steyer = col_double(),
## Bennet = col_double(),
## Williamson = col_double(),
## Castro = col_double(),
## Patrick = col_double(),
## Others = col_double(),
## total_votes = col_double()
## )
va_mapping_data <- left_join(va_fips, va_dem_primary, by = "fips")
va_mapping_data %>%
  mutate(biden_prop = Biden / total_votes) %>%
  ggplot() +
  geom_polygon(
    aes(x = long, y = lat, group = group, fill = biden_prop),
    color = "white", lwd = 0.25
  ) +
  coord_quickmap() +
  scale_fill_gradient2(low = "red", mid = "white", high = "yellow", midpoint = 0.5) +
  theme_void() +
  theme(legend.position = "none") +
  labs(title = "Biden's Share of the 2020 Virginia Democratic Primary Vote by County", subtitle = "Yellower counties gave Biden a higher share of the vote")
va_mapping_data %>%
  mutate(
    warren_prop = Warren / total_votes,
    sanders_prop = Sanders / total_votes,
    # 1 if Sanders received more votes than Warren in the county, 0 otherwise
    sanders_won = ifelse(Sanders > Warren, 1, 0)
  ) %>%
  ggplot() +
  geom_polygon(
    aes(x = long, y = lat, group = group, fill = sanders_won),
    color = "white", lwd = 0.25
  ) +
  coord_quickmap() +
  scale_fill_gradient2(low = "orange", mid = "white", high = "yellow", midpoint = 0.5) +
  theme_void() +
  theme(legend.position = "none") +
  labs(title = "2020 Virginia Primary Election Results by County", subtitle = "Yellow represents counties where Sanders beat Warren")
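winners <- va_mapping_data %>%
  # Select the candidate columns by name rather than by position
  pivot_longer(Biden:Others, names_to = "candidate", values_to = "votes") %>%
  group_by(fips) %>%
  filter(votes != 0) %>%
  # Keep only the rows for the top vote-getter in each county
  filter(votes == max(votes)) %>%
  mutate(winning_percent = round(votes / total_votes * 100, 1))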
winners %>%
  ggplot() +
  geom_polygon(
    aes(x = long, y = lat, group = group, fill = candidate),
    color = "white", lwd = 0.25
  ) +
  coord_quickmap() +
  scale_fill_brewer(palette = "YlGn") +
  theme_void() +
  labs(title = "2020 Virginia Primary Election Overall Winners by County")
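The winner calculation above reshapes the polygon-level table, so every map vertex is repeated once per candidate. An alternative, sketched here on made-up numbers, is to find each county's winner in the one-row-per-county results first and join the polygons afterwards. The toy_primary table is hypothetical, and slice_max() assumes dplyr 1.0.0 or later.

```r
library(dplyr)
library(tidyr)

# Hypothetical one-row-per-county results with three candidates
toy_primary <- data.frame(
  fips = c(51001, 51003),
  Biden = c(500, 300),
  Sanders = c(300, 450),
  Warren = c(200, 250),
  total_votes = c(1000, 1000)
)

toy_winners <- toy_primary %>%
  pivot_longer(Biden:Warren, names_to = "candidate", values_to = "votes") %>%
  group_by(fips) %>%
  # Keep the single top vote-getter per county (ties broken by row order)
  slice_max(votes, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  mutate(winning_percent = round(votes / total_votes * 100, 1))

print(toy_winners)
```

The resulting table has one row per county and could then be joined to the polygon data by fips, in the same way as the other maps.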