knitr::opts_chunk$set(echo = FALSE,
                      fig.align = "center")
## Set the default size of figures (I only use it for knitting)
# knitr::opts_chunk$set(fig.width=8, fig.height=5)
## Load the libraries we will be using
pacman::p_load(gapminder, socviz, tidyverse,
usmap, maps, viridis, ggthemes)
When working with maps at the county scale, we can use the map_data() function to get the outlines of the counties in the 48 contiguous states.
The code below will create the county_lines data set and add a column called state_county that we can use to get each county’s ID number (called FIPS) in the next code chunk:
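Since the chunk itself is hidden in the knitted output, here is a minimal sketch of what it might look like. It assumes the maps package is installed (map_data() needs it) and that state_county is simply region and subregion pasted together with a comma, which is what the printed rows suggest:

```r
library(ggplot2)  # map_data()
library(dplyr)    # mutate(), as_tibble(), %>%

# County outlines for the lower 48 states: one row per polygon vertex
county_lines <- map_data("county") %>%
  # Assumption: state_county is "region,subregion", matching the output shown
  mutate(state_county = paste(region, subregion, sep = ",")) %>%
  as_tibble()

county_lines
```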
## # A tibble: 87,949 × 7
## long lat group order region subregion state_county
## <dbl> <dbl> <dbl> <int> <chr> <chr> <chr>
## 1 -86.5 32.3 1 1 alabama autauga alabama,autauga
## 2 -86.5 32.4 1 2 alabama autauga alabama,autauga
## 3 -86.5 32.4 1 3 alabama autauga alabama,autauga
## 4 -86.6 32.4 1 4 alabama autauga alabama,autauga
## 5 -86.6 32.4 1 5 alabama autauga alabama,autauga
## 6 -86.6 32.4 1 6 alabama autauga alabama,autauga
## 7 -86.6 32.4 1 7 alabama autauga alabama,autauga
## 8 -86.6 32.4 1 8 alabama autauga alabama,autauga
## 9 -86.6 32.4 1 9 alabama autauga alabama,autauga
## 10 -86.6 32.4 1 10 alabama autauga alabama,autauga
## # ℹ 87,939 more rows
Next, we need to add the FIPS column to our county_lines data set. The county.fips data set in the maps package has the fips column along with a column called polyname that holds the state,county pair:
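One way to take a look at it, sketched under the assumption that the maps package is installed. The name county_fips is just a label chosen for this example, and the anyDuplicated() check is an extra step to confirm that polyname can safely be used as a join key:

```r
library(dplyr)

# Load the lookup table that ships with the maps package
data(county.fips, package = "maps")

# county_fips is a name made up for this sketch
county_fips <- as_tibble(county.fips)
county_fips

# A result of 0 means no polyname is repeated, so a join on it
# will not duplicate rows
anyDuplicated(county_fips$polyname)
```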
## # A tibble: 3,085 × 2
## fips polyname
## <int> <chr>
## 1 1001 alabama,autauga
## 2 1003 alabama,baldwin
## 3 1005 alabama,barbour
## 4 1007 alabama,bibb
## 5 1009 alabama,blount
## 6 1011 alabama,bullock
## 7 1013 alabama,butler
## 8 1015 alabama,calhoun
## 9 1017 alabama,chambers
## 10 1019 alabama,cherokee
## # ℹ 3,075 more rows
The code below uses left_join() to add the fips column from county.fips to the county_lines data set.
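A possible version of that join is sketched here. Because the key column has a different name in each table, the by argument maps state_county to polyname. The state_county construction is the same assumption as before, and the NA count at the end is an extra sanity check rather than part of the original chunk:

```r
library(ggplot2)
library(dplyr)

data(county.fips, package = "maps")

county_lines <- map_data("county") %>%
  # Assumption: state_county is "region,subregion"
  mutate(state_county = paste(region, subregion, sep = ",")) %>%
  # Keep every outline row; pull in fips where the names match
  left_join(county.fips, by = c("state_county" = "polyname")) %>%
  as_tibble()

county_lines

# How many outline rows found no matching FIPS code?
sum(is.na(county_lines$fips))
```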
## # A tibble: 87,949 × 8
## long lat group order region subregion state_county fips
## <dbl> <dbl> <dbl> <int> <chr> <chr> <chr> <int>
## 1 -86.5 32.3 1 1 alabama autauga alabama,autauga 1001
## 2 -86.5 32.4 1 2 alabama autauga alabama,autauga 1001
## 3 -86.5 32.4 1 3 alabama autauga alabama,autauga 1001
## 4 -86.6 32.4 1 4 alabama autauga alabama,autauga 1001
## 5 -86.6 32.4 1 5 alabama autauga alabama,autauga 1001
## 6 -86.6 32.4 1 6 alabama autauga alabama,autauga 1001
## 7 -86.6 32.4 1 7 alabama autauga alabama,autauga 1001
## 8 -86.6 32.4 1 8 alabama autauga alabama,autauga 1001
## 9 -86.6 32.4 1 9 alabama autauga alabama,autauga 1001
## 10 -86.6 32.4 1 10 alabama autauga alabama,autauga 1001
## # ℹ 87,939 more rows
Why did we need to add the fips column? Because most data sets on the county level have a fips ID for each county that we can use to easily merge our county_lines data set with the info we want to display in our map!
For instance, check county_data (from the socviz package), which holds the info we want to plot on the map:
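A short sketch of how those columns could be pulled out, assuming county_data is the data set that ships with socviz and that it carries the fips column shown in the output:

```r
library(socviz)
library(dplyr)

# First few rows of the four columns displayed in the output
county_preview <- county_data %>%
  select(fips, name, state, pop_dens) %>%
  head()

county_preview
```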
## fips name state pop_dens
## 1 0 <NA> <NA> [ 50, 100)
## 2 1000 1 AL [ 50, 100)
## 3 1001 Autauga County AL [ 50, 100)
## 4 1003 Baldwin County AL [ 100, 500)
## 5 1005 Barbour County AL [ 10, 50)
## 6 1007 Bibb County AL [ 10, 50)
In order to plot the population density (pop_dens) on a map, we need to merge county_lines and county_data together using the correct by column.
Once you’ve identified the by column, join them together in a new data set named county_lines2
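One way this could go is sketched below, assuming fips is the shared column (it appears in both tables above) and that county_data is the socviz data set. The slice_sample() call at the end is only a guess at how a random handful of rows might be displayed:

```r
library(ggplot2)
library(dplyr)
library(socviz)

data(county.fips, package = "maps")

county_lines <- map_data("county") %>%
  # Assumption: state_county is "region,subregion"
  mutate(state_county = paste(region, subregion, sep = ",")) %>%
  left_join(county.fips, by = c("state_county" = "polyname"))

# Assumption: fips is the key column present in both data sets
county_lines2 <- left_join(county_lines, county_data, by = "fips")

# Peek at 10 random rows
slice_sample(county_lines2, n = 10)
```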
## long lat group order region subregion state_county
## 1 -97.61481 27.25560 2623 76928 texas kenedy texas,kenedy
## 2 -87.08958 32.48098 24 1155 alabama dallas alabama,dallas
## 3 -85.98951 32.28044 6 353 alabama bullock alabama,bullock
## 4 -119.28981 36.66357 166 7447 california fresno california,fresno
## 5 -124.12557 40.14715 168 7538 california humboldt california,humboldt
## 6 -97.93568 32.24606 2603 76578 texas hood texas,hood
## 7 -108.93074 47.76749 1578 46847 montana fergus montana,fergus
## 8 -83.28515 31.02566 449 17699 georgia lowndes georgia,lowndes
## 9 -83.30806 29.45576 304 12743 florida dixie florida,dixie
## 10 -119.63931 37.82094 178 8028 california mariposa california,mariposa
## fips id name state census_region pop_dens pop_dens4
## 1 48261 48261 Kenedy County TX South [ 0, 10) [ 0, 17)
## 2 1047 01047 Dallas County AL South [ 10, 50) [ 17, 45)
## 3 1011 01011 Bullock County AL South [ 10, 50) [ 17, 45)
## 4 6019 06019 Fresno County CA West [ 100, 500) [118,71672]
## 5 6023 06023 Humboldt County CA West [ 10, 50) [ 17, 45)
## 6 48221 48221 Hood County TX South [ 100, 500) [118,71672]
## 7 30027 30027 Fergus County MT West [ 0, 10) [ 0, 17)
## 8 13185 13185 Lowndes County GA South [ 100, 500) [118,71672]
## 9 12029 12029 Dixie County FL South [ 10, 50) [ 17, 45)
## 10 6043 06043 Mariposa County CA West [ 10, 50) [ 0, 17)
## pop_dens6 pct_black pop female white black travel_time land_area
## 1 [ 0, 9) [ 2.0, 5.0) 400 49.3 93.7 4.1 17.7 1458.33
## 2 [ 25, 45) [50.0,85.3] 41711 53.5 29.0 69.6 22.0 978.70
## 3 [ 9, 25) [50.0,85.3] 10764 45.2 27.2 69.9 26.9 622.81
## 4 [ 82, 215) [ 5.0,10.0) 965974 50.0 77.4 5.9 22.1 5957.99
## 5 [ 25, 45) [ 0.0, 2.0) 134809 50.0 84.4 1.3 17.5 3567.99
## 6 [ 82, 215) [ 0.0, 2.0) 53921 50.7 96.3 0.8 32.7 420.64
## 7 [ 0, 9) [ 0.0, 2.0) 11442 49.7 96.3 0.4 14.3 4339.80
## 8 [215,71672] [25.0,50.0) 113523 50.9 59.0 36.8 18.0 496.07
## 9 [ 9, 25) [ 5.0,10.0) 15907 46.4 88.8 8.9 25.7 705.05
## 10 [ 9, 25) [ 0.0, 2.0) 17682 49.2 90.2 1.1 32.0 1448.82
## hh_income su_gun4 su_gun6 votes_dem_2016 votes_gop_2016 total_votes_2016
## 1 43438 [ 0, 5) [ 0, 4) 99 84 186
## 2 26519 [ 5, 8) [ 4, 7) 12826 5784 18730
## 3 32033 [ 0, 5) [ 0, 4) 3530 1139 4701
## 4 45563 [ 0, 5) [ 4, 7) 123660 113949 250264
## 5 41426 [ 8,11) [10,12) 19596 10883 33636
## 6 55754 [ 8,11) [10,12) 4001 21367 26120
## 7 38344 [11,54] [12,54] 1196 4235 5782
## 8 37365 [ 5, 8) [ 7, 8) 14614 21308 36813
## 9 33981 [11,54] [12,54] 1270 5822 7202
## 10 49820 [11,54] [12,54] 3122 5185 8877
## per_dem_2016 per_gop_2016 diff_2016 per_dem_2012 per_gop_2012 diff_2012
## 1 0.5322581 0.4516129 15 0.4939759 0.5000000 1
## 2 0.6847838 0.3088094 7042 0.6973156 0.3001528 8315
## 3 0.7509041 0.2422889 2391 0.7630688 0.2350508 2808
## 4 0.4941182 0.4553152 9711 0.4763654 0.5056645 5281
## 5 0.5825901 0.3235521 8713 0.5999049 0.3383043 12104
## 6 0.1531776 0.8180322 17366 0.1705948 0.8171530 14512
## 7 0.2068488 0.7324455 3039 0.2693580 0.7020185 2615
## 8 0.3969793 0.5788173 6694 0.4433566 0.5484589 4058
## 9 0.1763399 0.8083866 4552 0.2590405 0.7278490 3254
## 10 0.3516954 0.5840937 2063 0.3946188 0.5680974 1354
## winner partywinner16 winner12 partywinner12 flipped
## 1 Clinton Democrat Romney Republican Yes
## 2 Clinton Democrat Obama Democrat No
## 3 Clinton Democrat Obama Democrat No
## 4 Clinton Democrat Romney Republican Yes
## 5 Clinton Democrat Obama Democrat No
## 6 Trump Republican Romney Republican No
## 7 Trump Republican Romney Republican No
## 8 Trump Republican Romney Republican No
## 9 Trump Republican Romney Republican No
## 10 Trump Republican Romney Republican No
With the data merged, make a map of the population density per square mile (pop_dens).
Make sure to remove Washington DC again (state column, “DC” value)
Since most counties are small, include size = 0.05 and color = "gray90" inside geom_polygon() (in ggplot2 3.4.0 and later, use linewidth = 0.05 in place of size)
Save the map as gg_county_density
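The steps above can be sketched as follows. This is one possible solution, not necessarily the original chunk. It uses linewidth rather than size (the replacement ggplot2 3.4.0 asks for), and coord_quickmap() and theme_void() are stand-ins chosen for this example because they need no extra packages:

```r
library(ggplot2)
library(dplyr)
library(socviz)

data(county.fips, package = "maps")

# Rebuild the merged data so this sketch runs on its own
county_lines2 <- map_data("county") %>%
  mutate(state_county = paste(region, subregion, sep = ",")) %>%
  left_join(county.fips, by = c("state_county" = "polyname")) %>%
  left_join(county_data, by = "fips")

gg_county_density <- county_lines2 %>%
  # Drop Washington DC; this also drops rows whose state is NA
  filter(state != "DC") %>%
  ggplot(aes(x = long, y = lat, group = group, fill = pop_dens)) +
  # Thin, light borders so small counties stay visible
  geom_polygon(color = "gray90", linewidth = 0.05) +
  coord_quickmap() +   # stand-in projection (assumption)
  theme_void()         # stand-in theme (assumption)

gg_county_density
```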
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
The default color choice isn’t appropriate, since the colors don’t follow a pattern from low to high population density.
Because the counties are colored with the fill aesthetic, we need to either change the color scheme manually using scale_fill_manual(), or use a pre-designed palette with scale_fill_brewer(palette = "palette name"):
The code below uses "Blues", but try using different palettes to see what you get!
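A minimal sketch of the palette swap, rebuilt from scratch so it runs on its own. The map construction repeats the assumptions made earlier (fips as the join key, linewidth in place of size, coord_quickmap() and theme_void() as stand-ins), and gg_county_blues is a name made up for this example:

```r
library(ggplot2)
library(dplyr)
library(socviz)

data(county.fips, package = "maps")

county_lines2 <- map_data("county") %>%
  mutate(state_county = paste(region, subregion, sep = ",")) %>%
  left_join(county.fips, by = c("state_county" = "polyname")) %>%
  left_join(county_data, by = "fips")

gg_county_density <- county_lines2 %>%
  filter(state != "DC") %>%
  ggplot(aes(x = long, y = lat, group = group, fill = pop_dens)) +
  geom_polygon(color = "gray90", linewidth = 0.05) +
  coord_quickmap() +
  theme_void()

# Sequential palette: light blue = low density, dark blue = high density
gg_county_blues <- gg_county_density +
  scale_fill_brewer(palette = "Blues")

gg_county_blues
```

Other sequential palettes such as "Greens", "Oranges", or "YlOrRd" can be dropped in the same way.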
It isn’t apparent where the states are in the map. Let’s add the border for the states as well:
Create the data set for the state lines, like you did in part 1
Use geom_polygon() to add the state borders, setting the following arguments outside aes():
color = "black"
size = 0.1 (linewidth = 0.1 in ggplot2 3.4.0 and later)
fill = NA
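Putting these steps together, one possible sketch is shown below. The names state_lines and gg_county_states are chosen for this example, linewidth is used in place of size, and the county map is rebuilt under the same assumptions as before so the block is self-contained:

```r
library(ggplot2)
library(dplyr)
library(socviz)

data(county.fips, package = "maps")

county_lines2 <- map_data("county") %>%
  mutate(state_county = paste(region, subregion, sep = ",")) %>%
  left_join(county.fips, by = c("state_county" = "polyname")) %>%
  left_join(county_data, by = "fips")

# State outlines (state_lines is a name made up for this sketch)
state_lines <- map_data("state")

gg_county_states <- county_lines2 %>%
  filter(state != "DC") %>%
  ggplot(aes(x = long, y = lat, group = group, fill = pop_dens)) +
  geom_polygon(color = "gray90", linewidth = 0.05) +
  # Second layer: unfilled state polygons drawn on top of the counties.
  # inherit.aes = FALSE keeps this layer from looking for pop_dens
  # in state_lines.
  geom_polygon(data = state_lines,
               mapping = aes(x = long, y = lat, group = group),
               inherit.aes = FALSE,
               color = "black", linewidth = 0.1, fill = NA) +
  scale_fill_brewer(palette = "Blues") +
  coord_quickmap() +   # stand-in projection (assumption)
  theme_void()         # stand-in theme (assumption)

gg_county_states
```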