knitr::opts_chunk$set(echo = FALSE,
                      fig.align = "center")

## Load the libraries we will be using
pacman::p_load(gapminder, socviz, tidyverse, grid, ggthemes,
               usmap, maps, statebins, viridis, leaflet)

# Creating a vector for dem/rep colors
party_colors <- c("Democratic" = "#2E74C0",
                  "Republican" = "#CB454A")

# Will display the election data set in the global environment
election <- election
This state-level mapping data comes from the maps package, and ggplot2 can convert it into a data frame for plotting. We can use the map_data() function along with the map = ... argument to get the data set that has the outlines for certain maps:

- map = "state" will be a data set for each state in the US
- map = "county" will be a data set for each county in the US
- map = "world" or map = "world2" returns a data set for each country
- map = "usa" returns a map of just the United States border (no state borders)

For other countries, you can give map = "italy", "france", "nz", etc.
We’ll keep it simple and just look at the US states data for now.
Let’s see what it looks like…
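The chunk that produced the output below is hidden, so here is a minimal sketch of what it might look like. The object name us_states is taken from the rest of this walkthrough; the exact calls used by the original chunk are an assumption.

```r
library(ggplot2)  # provides map_data(); the maps package must also be installed

# Convert the state outlines from the maps package into a data frame
us_states <- map_data(map = "state")

# Peek at the first rows and check the dimensions (rows, columns)
head(us_states)
dim(us_states)
```

Each row is one point on a state's outline; the order column records the sequence in which the points of a group are connected.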
## long lat group order region subregion
## 1 -87.46201 30.38968 1 1 alabama <NA>
## 2 -87.48493 30.37249 1 2 alabama <NA>
## 3 -87.52503 30.37249 1 3 alabama <NA>
## 4 -87.53076 30.33239 1 4 alabama <NA>
## 5 -87.57087 30.32665 1 5 alabama <NA>
## 6 -87.58806 30.32665 1 6 alabama <NA>
## [1] 15537 6
Let’s create a map that shows which candidate each state voted for in the 2016 election.
To do that, we need to add who won each state to the us_states data set. But how do we do that?
We need a column in both data sets we can use to ID which row of the election data goes with which rows in the us_states data.
Fortunately, both data sets have a column named state!
Unfortunately, in election the state names are capitalized (“Alabama”), and in the us_states data they are all lower case (“alabama”).
So we need to fix that first using the tolower() (to lower) function!
Since the state column in us_states is named region, let’s name the new column in election region as well!
To make joining the data sets easier, create a new data set called election2 that has only the region, st, winner, party, and pct_trump columns.
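One way this step could be written with dplyr is sketched below. It assumes the election data set comes from the socviz package loaded in the setup chunk; the hidden chunk may have been written differently.

```r
library(dplyr)
library(socviz)  # assumed source of the election data set

election2 <- election %>%
  # Lower-case copy of the state names, named region to match us_states
  mutate(region = tolower(state)) %>%
  # Keep only the columns needed for the maps
  select(region, st, winner, party, pct_trump)

election2
```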
## # A tibble: 51 × 5
## region st winner party pct_trump
## <chr> <chr> <chr> <chr> <dbl>
## 1 alabama AL Trump Republican 62.1
## 2 alaska AK Trump Republican 51.3
## 3 arizona AZ Trump Republican 48.1
## 4 arkansas AR Trump Republican 60.6
## 5 california CA Clinton Democratic 31.5
## 6 colorado CO Clinton Democratic 43.2
## 7 connecticut CT Clinton Democratic 40.9
## 8 delaware DE Clinton Democratic 41.7
## 9 district of columbia DC Clinton Democratic 4.09
## 10 florida FL Trump Republican 48.6
## # ℹ 41 more rows
Now that the two data sets share a column name and have matching cases, join the election2 and us_states data sets by region and name the result us_states2!
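A sketch of the join follows. Using left_join() with us_states as the first argument is an assumption; it keeps every outline point in its original row order, which matters later because the polygons are drawn in row order. The chunk rebuilds both inputs so it runs on its own.

```r
library(dplyr)
library(ggplot2)  # map_data()
library(socviz)   # assumed source of the election data set

us_states <- map_data("state")
election2 <- election %>%
  mutate(region = tolower(state)) %>%
  select(region, st, winner, party, pct_trump)

# Every outline point picks up the election results for its state
us_states2 <- left_join(us_states, election2, by = "region")

# Print as a tibble so only the first ten rows are shown
as_tibble(us_states2)
```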
## # A tibble: 15,537 × 10
## long lat group order region subregion st winner party pct_trump
## <dbl> <dbl> <dbl> <int> <chr> <chr> <chr> <chr> <chr> <dbl>
## 1 -87.5 30.4 1 1 alabama <NA> AL Trump Republican 62.1
## 2 -87.5 30.4 1 2 alabama <NA> AL Trump Republican 62.1
## 3 -87.5 30.4 1 3 alabama <NA> AL Trump Republican 62.1
## 4 -87.5 30.3 1 4 alabama <NA> AL Trump Republican 62.1
## 5 -87.6 30.3 1 5 alabama <NA> AL Trump Republican 62.1
## 6 -87.6 30.3 1 6 alabama <NA> AL Trump Republican 62.1
## 7 -87.6 30.3 1 7 alabama <NA> AL Trump Republican 62.1
## 8 -87.6 30.3 1 8 alabama <NA> AL Trump Republican 62.1
## 9 -87.7 30.3 1 9 alabama <NA> AL Trump Republican 62.1
## 10 -87.8 30.3 1 10 alabama <NA> AL Trump Republican 62.1
## # ℹ 15,527 more rows
Let’s create a map of the lower 48 states using ggplot().

In ggplot(), we use 4 main aesthetics:

- x = the longitude of each line (long)
- y = the latitude of each line (lat)
- group = the column with the states’ group numbers (group)
- fill = the column used to shade each state

The geom_ you want to use is geom_polygon(), which will draw lines between the x, y coordinates in order of the rows presented for each group. Include color = "black" to draw a black outline around each state.

Make sure to use the party colors from the party_colors vector with the appropriate scale function.
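The description above might translate into the following sketch. Filling by the party column and using scale_fill_manual() are assumptions based on the names in party_colors; the plot is saved as gg_elect2016 because that is the name the next step refers to.

```r
library(dplyr)
library(ggplot2)
library(socviz)  # assumed source of the election data set

party_colors <- c("Democratic" = "#2E74C0", "Republican" = "#CB454A")

# Rebuild the joined data so this chunk runs on its own
election2 <- election %>%
  mutate(region = tolower(state)) %>%
  select(region, st, winner, party, pct_trump)
us_states2 <- left_join(map_data("state"), election2, by = "region")

gg_elect2016 <- ggplot(
  data = us_states2,
  mapping = aes(x = long, y = lat, group = group, fill = party)
) +
  # One polygon per group, outlined in black
  geom_polygon(color = "black") +
  # Match the values of party to the named colors
  scale_fill_manual(values = party_colors)

gg_elect2016
```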
To improve the look of the map, add the following options to gg_elect2016:

- Add theme_map() from the ggthemes package to use a more suitable theme for the graph
- Add a projection using coord_map(projection = "albers", lat0 = 39, lat1 = 45) to make the plot look like a map
- Add scale_x_continuous(expand = c(0, 0)) and scale_y_continuous(expand = c(0, 0)) to remove the buffer space ggplot() typically creates
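The three options above can be chained onto the saved plot, as in this sketch. The name gg_map is hypothetical, and the base plot is rebuilt under the same assumptions as the description of gg_elect2016 (fill by party, manual color scale). Note that coord_map() relies on the mapproj package, which is not in the list loaded at the top of this document.

```r
library(dplyr)
library(ggplot2)
library(ggthemes)  # theme_map()
library(socviz)    # assumed source of the election data set

party_colors <- c("Democratic" = "#2E74C0", "Republican" = "#CB454A")
election2 <- election %>%
  mutate(region = tolower(state)) %>%
  select(region, st, winner, party, pct_trump)
us_states2 <- left_join(map_data("state"), election2, by = "region")

gg_elect2016 <- ggplot(us_states2,
                       aes(x = long, y = lat, group = group, fill = party)) +
  geom_polygon(color = "black") +
  scale_fill_manual(values = party_colors)

# gg_map is a hypothetical name for the polished version
gg_map <- gg_elect2016 +
  theme_map() +                                             # drop axes and grid
  coord_map(projection = "albers", lat0 = 39, lat1 = 45) +  # needs mapproj
  scale_x_continuous(expand = c(0, 0)) +                    # no horizontal padding
  scale_y_continuous(expand = c(0, 0))                      # no vertical padding

gg_map
```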
Instead of displaying the binary option of Republican or Democratic, change the map to display the percent that voted for Trump (pct_trump).
To have the colors run from dem blue to rep red, use scale_fill_gradient2() with

- low = "#2E74C0"
- mid = scales::muted("purple")
- high = "#CB454A"
- midpoint = 50
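A sketch of the continuous version, using the scale settings listed above. The name gg_pct is hypothetical, and the theme and projection options from the previous step are left out here to keep the example short; they can be added in the same way.

```r
library(dplyr)
library(ggplot2)
library(socviz)  # assumed source of the election data set

election2 <- election %>%
  mutate(region = tolower(state)) %>%
  select(region, st, winner, party, pct_trump)
us_states2 <- left_join(map_data("state"), election2, by = "region")

# gg_pct is a hypothetical name
gg_pct <- ggplot(us_states2,
                 aes(x = long, y = lat, group = group, fill = pct_trump)) +
  geom_polygon(color = "black") +
  # Diverging scale: blue below 50%, muted purple at 50%, red above 50%
  scale_fill_gradient2(
    low = "#2E74C0",
    mid = scales::muted("purple"),
    high = "#CB454A",
    midpoint = 50
  )

gg_pct
```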
The colors are more red and purple than we’d expect. Why?
Remove the rows corresponding to Washington DC (only 4% for Trump).
In region it is “district of columbia”, in st it is just “DC”, so let’s use st to remove all the rows for DC.
Then pipe the resulting data set into the same code as the previous code chunk, but remove the data = argument in ggplot(), since the piped data set takes its place.
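The filter-then-pipe step might look like the sketch below. The name gg_no_dc is hypothetical, and the plotting code repeats the assumptions made for the pct_trump map (same aesthetics and fill scale).

```r
library(dplyr)
library(ggplot2)
library(socviz)  # assumed source of the election data set

election2 <- election %>%
  mutate(region = tolower(state)) %>%
  select(region, st, winner, party, pct_trump)
us_states2 <- left_join(map_data("state"), election2, by = "region")

# gg_no_dc is a hypothetical name
gg_no_dc <- us_states2 %>%
  # Drop every outline point that belongs to Washington DC
  filter(st != "DC") %>%
  # The piped data frame becomes the data argument of ggplot()
  ggplot(mapping = aes(x = long, y = lat, group = group, fill = pct_trump)) +
  geom_polygon(color = "black") +
  scale_fill_gradient2(
    low = "#2E74C0",
    mid = scales::muted("purple"),
    high = "#CB454A",
    midpoint = 50
  )

gg_no_dc
```

With DC gone, the lowest value on the fill scale is no longer 4%, so the blue end of the gradient is spread over the states that remain.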
Now we’ll switch back to the opiates data set for our next example.
## Warning in right_join(mutate(opiates, state = tolower(state)), y = us_states, : Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 1 of `x` matches multiple rows in `y`.
## ℹ Row 1 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
## "many-to-many"` to silence this warning.
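The warning above appears because neither side of the join has one row per state: opiates has a row for each state in each year, and us_states has many outline points per state. The sketch below reproduces the join visible in the warning text and sets relationship = "many-to-many", as the warning suggests, to state that this is expected. The by = argument and the name opiates_map are assumptions, since the original call is truncated, and the relationship argument needs dplyr 1.1.0 or later.

```r
library(dplyr)
library(ggplot2)  # map_data()
library(socviz)   # assumed source of the opiates data set

us_states <- map_data("state")

# opiates_map is a hypothetical name; the join columns are an assumption
opiates_map <- right_join(
  mutate(opiates, state = tolower(state)),
  y = us_states,
  by = c("state" = "region"),
  relationship = "many-to-many"  # each state-year row matches many outline points
)

dim(opiates_map)
```

The result has one copy of each state's outline per year of data, which is what a map faceted or filtered by year needs.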