After completing this worksheet, you should be able to use ggplot2 to create basic maps of points, lines, and colored polygons (called choropleth maps).
For this worksheet we will work with the Catholic dioceses data and the Paulist missions data in the historydata package, as well as U.S. county and state boundaries in the USAboundaries package. We will also work with religion data from the 1850 federal Census as compiled by NHGIS.
library(ggplot2)
library(historydata)
library(USAboundaries)
library(dplyr)
library(lubridate)
A map of points is essentially just a scatterplot where the x-axis and y-axis are longitude and latitude.
Suppose we want to make a map of Catholic dioceses in North America in 1850. We can get a data frame of them from the historydata package.
data("catholic_dioceses")
dioceses_1850 <- catholic_dioceses %>%
filter(as.Date(date) <= as.Date("1850-12-31"))
dioceses_1850
## Source: local data frame [63 x 6]
##
## diocese rite lat long event date
## (chr) (chr) (dbl) (dbl) (fctr) (time)
## 1 Baltimore, Maryland Latin 39.29038 -76.61219 erected 1789-04-06
## 2 New Orleans, Louisiana Latin 29.95107 -90.07153 erected 1793-04-25
## 3 Boston, Massachusetts Latin 42.35843 -71.05977 erected 1808-04-08
## 4 Louisville, Kentucky Latin 38.25266 -85.75846 erected 1808-04-08
## 5 New York, New York Latin 40.71435 -74.00597 erected 1808-04-08
## 6 Philadelphia, Pennsylvania Latin 39.95233 -75.16379 erected 1808-04-08
## 7 Richmond, Virginia Latin 37.54072 -77.43605 erected 1820-06-19
## 8 Charleston, South Carolina Latin 32.77657 -79.93092 erected 1820-07-11
## 9 Cincinnati, Ohio Latin 39.10312 -84.51202 erected 1821-06-19
## 10 St. Louis, Missouri Latin 38.62700 -90.19940 erected 1826-07-18
## .. ... ... ... ... ... ...
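The date filter above works because as.Date() turns text into values that can be compared with <=. As a minimal sketch in base R, here is the same comparison on a tiny data frame. The first two rows are copied from the output above; the third row is hypothetical and exists only to show a row being dropped.

```r
# Two dioceses from the output above, plus one hypothetical later row.
dioceses <- data.frame(
  diocese = c("Baltimore, Maryland", "Boston, Massachusetts",
              "Hypothetical later diocese"),
  date = c("1789-04-06", "1808-04-08", "1853-07-29"),
  stringsAsFactors = FALSE
)

# Convert the cutoff once, then keep rows on or before it.
cutoff <- as.Date("1850-12-31")
erected_by_1850 <- dioceses[as.Date(dioceses$date) <= cutoff, ]
erected_by_1850
```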
The longitude and latitude are x and y coordinates, and we can plot them in ggplot the same as any other scatterplot.
ggplot(dioceses_1850, aes(x = long, y = lat)) +
geom_point()
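One reason the raw scatterplot looks stretched is that a degree of longitude covers less ground than a degree of latitude everywhere except at the equator, by a factor of the cosine of the latitude. The sketch below, in base R, computes the correction one might apply as a plot aspect ratio. The three latitudes are taken from the output above; using the midpoint of their range as the reference latitude is an assumption made for illustration.

```r
# Latitudes of Baltimore, New Orleans, and Boston from the output above.
lat <- c(39.29038, 29.95107, 42.35843)

# Assumption: use the middle of the latitude range as the reference latitude.
mid_lat <- mean(range(lat))

# Ratio by which the y-axis should be stretched relative to the x-axis
# so that a degree east-west and a degree north-south cover similar ground.
aspect <- 1 / cos(mid_lat * pi / 180)
aspect
```

This is only a rough fix for small regions. A real projection, as used later in this worksheet, handles the distortion properly.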
If you squint hard, you can make out the shape of North America in that map. But what we really want is to draw some geographic boundaries. The USAboundaries package lets us get the boundaries of U.S. states (as well as counties and other kinds of boundaries) on any arbitrary date in U.S. history. Let’s get the boundaries for the same date we used earlier.
states_1850 <- us_states("1850-12-31")
class(states_1850)
## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"
states_1850@data
## id_num name id version start_date
## 7 7 Alabama al_state 3 1820/12/19
## 11 11 Arkansas ar_state 2 1840/05/21
## 18 18 California ca_state 1 1850/09/09
## 25 25 Connecticut ct_state 4 1804/12/31
## 27 27 District of Columbia dc 2 1846/09/07
## 28 28 Delaware de_state 1 1783/09/03
## 37 37 Deseret ds_deseret 1 1849/07/02
## 39 39 Florida fl_state 1 1845/03/03
## 44 44 Georgia ga_state 3 1802/04/24
## 49 49 Iowa ia_state 1 1846/12/28
## 55 55 Illinois il_state 1 1818/12/03
## 58 58 Indiana in_state 1 1816/12/11
## 64 64 Indian Territory it_indianterr 2 1828/05/06
## 70 70 Kentucky ky_state 1 1792/06/01
## 74 74 Louisiana la_state 1 1812/04/30
## 78 78 Massachusetts ma_state 3 1820/03/15
## 82 82 Maryland md_state 2 1791/03/30
## 83 83 Maine me_state 1 1820/03/15
## 84 84 Michigan mi_state 1 1837/01/26
## 94 94 Minnesota Territory mn_terr 1 1849/03/03
## 98 98 Missouri mo_state 4 1849/02/13
## 106 106 Mississippi ms_state 2 1820/05/29
## 117 117 North Carolina nc_state 2 1790/04/02
## 128 128 New Hampshire nh_state 1 1783/09/03
## 129 129 New Jersey nj_state 1 1783/09/03
## 131 131 New Mexico Territory nm_terr 1 1850/12/13
## 143 143 New York ny_state 1 1783/09/03
## 145 145 Ohio oh_state 1 1803/03/01
## 152 152 Oregon Territory or_terr 1 1848/08/14
## 156 156 Pennsylvania pa_state 2 1792/03/03
## 157 157 Rhode Island ri_state 1 1783/09/03
## 159 159 South Carolina sc_state 1 1783/09/03
## 163 163 Tennessee tn_state 1 1796/06/01
## 167 167 Texas tx_state 2 1850/12/13
## 182 182 Unorg. Fed. Terr. uf_terr 13 1850/12/13
## 189 189 Utah Territory ut_terr 1 1850/09/09
## 199 199 Virginia va_state 5 1846/09/07
## 204 204 Vermont vt_state 1 1791/03/04
## 210 210 Wisconsin wi_state 1 1848/05/29
## end_date
## 7 2000/12/31
## 11 2000/12/31
## 18 1959/12/30
## 25 2000/12/31
## 27 2000/12/31
## 28 2000/12/31
## 37 1851/04/04
## 39 2000/12/31
## 44 2000/12/31
## 49 2000/12/31
## 55 2000/12/31
## 58 2000/12/31
## 64 1890/05/01
## 70 1859/11/29
## 74 2000/12/31
## 78 1855/01/10
## 82 2000/12/31
## 83 2000/12/31
## 84 1908/12/31
## 94 1858/05/10
## 98 1950/08/02
## 106 2000/12/31
## 117 2000/12/31
## 128 2000/12/31
## 129 2000/12/31
## 131 1854/08/03
## 143 1855/01/10
## 145 2000/12/31
## 152 1853/03/01
## 156 2000/12/31
## 157 1862/02/28
## 159 2000/12/31
## 163 1859/11/29
## 167 1896/05/03
## 182 1854/05/29
## 189 1861/02/27
## 199 1863/06/19
## 204 2000/12/31
## 210 1926/11/21
## change
## 7 MARION's overlap of the state of Mississippi and TUSCALOOSA's overlap of the state of Mississippi ended.
## 11 Survey of boundary between the Republic of Texas and the United States began. MILLER (original) officially became extinct and LAFAYETTE was eliminated from Texas when Texas claims to the area were upheld.
## 18 The state of California was admitted to the Union.
## 25 HARTFORD lost part of the town of Southwick (the "Southwick Jog") to HAMPSHIRE (Mass.) when the state boundary was adjusted.
## 27 The federal government retroceded to Virginia all of the District of Columbia west of the Potomac River, including all of ALEXANDRIA (now ARLINGTON, Va.). ALEXANDRIA eliminated from the District of Columbia.
## 28 The three Lower Counties, of KENT, NEW CASTLE, and SUSSEX became an independent state on 4 July 1776. The name Delaware was formally adopted on 20 September 1776. The map depicts state boundaries as of 3 September 1783.
## 37 The General Assembly of the newly proposed state of Deseret met on 2 Jul 1849. The state was to include almost all of Utah and parts of present Ariz., Cal., Colo., Ida., Nev., N.Mex., Ore., and Wyo., but proposal never gained support in U.S. Congress.
## 39 The state of Florida was created from Florida Territory, with boundaries the same as those set in 1822; Florida Territory eliminated.
## 44 Georgia ceded to the United States much of present Mississippi and Alabama. This area became unorganized federal territory and Georgia Western Lands were eliminated.
## 49 The state of Iowa was created from Iowa Territory; Iowa Territory eliminated.
## 55 The state of Illinois was created from Illinois Territory; Illinois Territory eliminated.
## 58 The state of Indiana was created from Indiana Territory and small parts of Illinois and Michigan Territories.
## 64 Indian Territory gained from Arkansas Territory when the Treaty of Washington between the United States and Choctaw Indians definitively established the eastern line of the Choctaw Cession and affirmed Choctaw control of the area west of the line.
## 70 The state of Kentucky was created from the Kentucky District of Virginia.
## 74 The State of Louisiana was created from Orleans Territory; Orleans Territory eliminated.
## 78 Maine was separated from Massachusetts and admitted to the Union.
## 82 The United States created an unnamed district from land ceded by Maryland and Virginia to be the seat of national government.
## 83 Maine was separated from Massachusetts and admitted to the Union.
## 84 The state of Michigan was created from Michigan Territory; boundary dispute with Ohio over the strip running from Indiana to Lake Erie was settled in favor of Ohio; Michigan Territory eliminated.
## 94 The United States created Minnesota Territory from unorganized federal territory west of the Mississippi River (formerly part of Iowa Territory) and de facto Wisconsin Territory.
## 98 U.S. Supreme Court settled boundary dispute between Iowa and Missouri by rejecting both states' claims, choosing instead the commonly accepted boundary before the dispute arose and the present boundary between the two states.
## 106 Surveyors demarcated the southern segment of the Alabama-Mississippi line, fixing the end point ten miles east of the mouth of the Pascagoula River.
## 117 Congress accepted North Carolina's cession of its western lands (i.e. present Tennessee).
## 128 New Hampshire became an independent state on 4 July 1776. The map depicts state boundaries as of 3 September 1783.
## 129 New Jersey became an independent state on 4 July 1776. The map depicts state boundaries as of 3 September 1783.
## 131 The United States created New Mexico Territory from Unorganized Federal Territory (land ceded by Mexico to the United States in the Treaty of Guadalupe-Hidalgo) and from territory purchased from the state of Texas.
## 143 New York became an independent state on 4 July 1776, and Vermont declared its independence from New York on 15 January 1777. The map depicts state boundaries as of 3 September 1783.
## 145 The state of Ohio was created from the Northwest Territory and Indiana Territory.
## 152 The United States created the Territory of Oregon, encompassing all of present Oregon, Washington, and Idaho, and parts of present Montana and Wyoming.
## 156 Pennsylvania purchased from the federal government the Erie Triangle, a small area immediately north of Pennsylvania and west of New York, giving Pennsylvania an outlet to Lake Erie.
## 157 Rhode Island became an independent state on 4 July 1776. The map depicts state boundaries as of 3 September 1783.
## 159 South Carolina became an independent state on 4 July 1776. The map depicts state boundaries as of 3 September 1783.
## 163 The state of Tennessee was created from the Territory of the United States South of the River Ohio (Southwest Territory); Southwest Territory eliminated.
## 167 State of Texas sold land in present Colorado, Kansas, New Mexico, Oklahoma, and Wyoming to the United States.
## 182 The state of Texas sold land in present Kansas, Colorado, New Mexico, Oklahoma, and Wyoming to the United States.
## 189 The United States created Utah Territory from the territory ceded by Mexico in the Treaty of Guadalupe-Hidalgo (1848), from territory to be purchased from the state of Texas in December 1850, and from unorganized federal territory.
## 199 The District of Columbia ceded ALEXANDRIA (now ARLINGTON) to Virginia.
## 204 The state of Vermont was admitted to the Union. The boundary with New York was settled by a joint commission before statehood.
## 210 The state of Wisconsin was created from Wisconsin Territory. The area in present Minnesota between the Mississippi River and the state of Wisconsin continued as de facto Wisconsin Territory until the creation of Minnesota Territory on 3 March 1849.
## citation
## 7 (Ala. Acts 1820, 2d sess., secs. 1, 9/pp. 90, 92)
## 11 (U.S. Stat., vol. 5, ch. 75 [1844]/p. 674; Marshall, 235-236)
## 18 (U.S. Stat., vol. 9, ch. 50[1850], pp. 452-453; Van Zandt, 151)
## 25 (Hooker, 25-26; Van Zandt, 69)
## 27 (U.S. Stat., vol. 9, ch. 35 [1846]/pp. 35-37, and appendix 3/p. 1000)
## 28 (Declaration of Independence; Swindler, 2:197)
## 37 (Atlas of Utah, 160-161; Brown, Cannon, and Jackson, 90-91; Swindler, 9:375-381)
## 39 (Swindler, 2:332; U.S. Stat., vol. 5, ch. 48 [1845], secs. 1, 5/pp. 742-743)
## 44 (Paullin, 83; Van Zandt, 100; Terr. Papers U.S., 5:142-143)
## 49 (U.S. Stat., vol. 9, ch. 1[1847]/p. 117)
## 55 (Terr. Papers U.S., 10:803; U.S. Stat., vol. 3, ch. 67 [1818], secs. 1-2, 7/pp. 428-429, 431, and res. 1 [1818]/p. 536)
## 58 (U.S. Stat., vol. 3, ch. 57 [1816], secs. 1-2/p. 289 and res. 1 [1816]/p. 399; Van Zandt, 115)
## 64 (U.S. Stat., vol. 7, p. 311; Gabler, 37-39; Royce, 720-721; Van Zandt, 119)
## 70 (Hening, 11:85; U.S. Stat., 1:189)
## 74 (U.S. Stat., vol. 2, ch. 50 [1812]/pp. 701-704; Van Zandt, 107)
## 78 (U.S. Stat., vol. 3, ch. 19 [1820]/p. 544)
## 82 (Richardson, 1:102; Van Zandt, 90)
## 83 (U.S. Stat., vol. 3, ch. 19 [1820]/p. 544)
## 84 (U.S. Stat., vol. 5, ch. 99 [1836], secs. 1-2/p. 49 and ch. 6 [1837]/p. 144; Van Zandt, 127)
## 94 (U.S. Stat., vol. 9, ch. 121 [1849]/pp. 403-409)
## 98 (Landers. 647-648; Thomas 269-270)
## 106 (Van Zandt, 108-109)
## 117 (Terr. Papers U.S., 4:13-17; U.S. Stat., vol. 1, ch. 6[1790], pp. 106-109)
## 128 (Declaration of Independence)
## 129 (Declaration of Independence)
## 131 (U.S. Stat., vol. 9, ch. 49[1850]/pp. 446-452; Baldwin, 117-137; Coffey, 145-164; Van Zandt, 28-29, 162-165)
## 143 (Declaration of Independence)
## 145 (Ind. Terr., Exec. Journal, 114-115; Pence and Armstrong, 218; U.S. Stat., vol. 2, ch. 40 [1802], secs. 1-3/pp. 173-174; Van Zandt, 112)
## 152 (U.S. Stat., vol. 9, ch. 67[1848]/pp.323-331; Van Zandt, p. 153)
## 156 (Van Zandt, 83-84)
## 157 (Declaration of Independence)
## 159 (Declaration of Independence)
## 163 (U.S. Stat., vol. 1, ch. 47[1796], pp. 491-492; Folmsbee, Corlew, and Mitchell 110)
## 167 (U.S. Stat., vol. 9, ch. 49 [1850]/pp. 446-452 and appendix, sec. 10/pp. 1005-1006; Texas Laws 1850, 3d leg., 3d sess., ch. 2/p. 4; Van Zandt, 122)
## 182 (U.S. Stat., vol. 9, ch. 49 [1850]/pp. 446-452 and appendix, sec. 10/pp. 1005-1006; Texas Laws 1850, 3d leg., 3d sess., ch. 2/p. 4; Van Zandt, 122)
## 189 (U.S. Stat., vol. 9, ch. 49[1850]/pp. 446-452 and ch. 51[1850]/pp. 453-458; Van Zandt, 159)
## 199 (U.S. Stat., vol. 9, ch. 35[1846]/pp. 35-37 and appendix 3/p. 1000)
## 204 (Slade, 193; U.S. Stat., vol. 1, ch. 7 [1791]/p. 191)
## 210 (U.S. Stat., vol. 9, ch. 89 [1846]/pp. 56-58, and ch. 50 [1848]/pp. 233-235; Van Zandt, 128-130)
## start_n end_n area_sqmi terr_type
## 7 18201219 20001231 51656 State
## 11 18400521 20001231 53179 State
## 18 18500909 19591230 158097 State
## 25 18041231 20001231 4975 State
## 27 18460907 20001231 68 District of Columbia
## 28 17830903 20001231 2013 State
## 37 18490702 18510404 436387 Other
## 39 18450303 20001231 56618 State
## 44 18020424 20001231 58781 State
## 49 18461228 20001231 56271 State
## 55 18181203 20001231 56339 State
## 58 18161211 20001231 36182 State
## 64 18280506 18900501 64191 Unorganized Territory
## 70 17920601 18591129 40411 State
## 74 18120430 20001231 46740 State
## 78 18200315 18550110 8126 State
## 82 17910330 20001231 9996 State
## 83 18200315 20001231 32564 State
## 84 18370126 19081231 58121 State
## 94 18490303 18580510 167599 Territory
## 98 18490213 19500802 69708 State
## 106 18200529 20001231 47666 State
## 117 17900402 20001231 49365 State
## 128 17830903 20001231 9266 State
## 129 17830903 20001231 7545 State
## 131 18501213 18540803 232959 Territory
## 143 17830903 18550110 48621 State
## 145 18030301 20001231 41261 State
## 152 18480814 18530301 285404 Territory
## 156 17920303 20001231 45301 State
## 157 17830903 18620228 1083 State
## 159 17830903 20001231 30951 State
## 163 17960601 18591129 42142 State
## 167 18501213 18960503 267088 State
## 182 18501213 18540529 476439 Unorganized Territory
## 189 18500909 18610227 233768 Territory
## 199 18460907 18630619 64208 State
## 204 17910304 20001231 9614 State
## 210 18480529 19261121 56050 State
## full_name abbr_name
## 7 Alabama AL
## 11 Arkansas AR
## 18 California CA
## 25 Connecticut CT
## 27 District of Columbia DC
## 28 Delaware DE
## 37 Deseret Deseret
## 39 Florida FL
## 44 Georgia GA
## 49 Iowa IA
## 55 Illinois IL
## 58 Indiana IN
## 64 Indian Territory Indian Terr
## 70 Kentucky KY
## 74 Louisiana LA
## 78 Massachusetts MA
## 82 Maryland MD
## 83 Maine ME
## 84 Michigan MI
## 94 Minnesota Territory MN Terr
## 98 Missouri MO
## 106 Mississippi MS
## 117 North Carolina NC
## 128 New Hampshire NH
## 129 New Jersey NJ
## 131 New Mexico Territory NM Terr
## 143 New York NY
## 145 Ohio OH
## 152 Oregon Territory OR Terr
## 156 Pennsylvania PA
## 157 Rhode Island RI
## 159 South Carolina SC
## 163 Tennessee TN
## 167 Texas TX
## 182 Unorganized Federal Territory UFT
## 189 Utah Territory UT Terr
## 199 Virginia VA
## 204 Vermont VT
## 210 Wisconsin WI
## name_start start_posix end_posix
## 7 Alabama (1820-12-19) 1820-12-19 2000-12-31
## 11 Arkansas (1840-05-21) 1840-05-21 2000-12-31
## 18 California (1850-09-09) 1850-09-09 1959-12-30
## 25 Connecticut (1804-12-31) 1804-12-31 2000-12-31
## 27 District of Columbia (1846-09-07) 1846-09-07 2000-12-31
## 28 Delaware (1783-09-03) 1783-09-03 2000-12-31
## 37 Deseret (1849-07-02) 1849-07-02 1851-04-04
## 39 Florida (1845-03-03) 1845-03-03 2000-12-31
## 44 Georgia (1802-04-24) 1802-04-24 2000-12-31
## 49 Iowa (1846-12-28) 1846-12-28 2000-12-31
## 55 Illinois (1818-12-03) 1818-12-03 2000-12-31
## 58 Indiana (1816-12-11) 1816-12-11 2000-12-31
## 64 Indian Territory (1828-05-06) 1828-05-06 1890-05-01
## 70 Kentucky (1792-06-01) 1792-06-01 1859-11-29
## 74 Louisiana (1812-04-30) 1812-04-30 2000-12-31
## 78 Massachusetts (1820-03-15) 1820-03-15 1855-01-10
## 82 Maryland (1791-03-30) 1791-03-30 2000-12-31
## 83 Maine (1820-03-15) 1820-03-15 2000-12-31
## 84 Michigan (1837-01-26) 1837-01-26 1908-12-31
## 94 Minnesota Territory (1849-03-03) 1849-03-03 1858-05-10
## 98 Missouri (1849-02-13) 1849-02-13 1950-08-02
## 106 Mississippi (1820-05-29) 1820-05-29 2000-12-31
## 117 North Carolina (1790-04-02) 1790-04-02 2000-12-31
## 128 New Hampshire (1783-09-03) 1783-09-03 2000-12-31
## 129 New Jersey (1783-09-03) 1783-09-03 2000-12-31
## 131 New Mexico Territory (1850-12-13) 1850-12-13 1854-08-03
## 143 New York (1783-09-03) 1783-09-03 1855-01-10
## 145 Ohio (1803-03-01) 1803-03-01 2000-12-31
## 152 Oregon Territory (1848-08-14) 1848-08-14 1853-03-01
## 156 Pennsylvania (1792-03-03) 1792-03-03 2000-12-31
## 157 Rhode Island (1783-09-03) 1783-09-03 1862-02-28
## 159 South Carolina (1783-09-03) 1783-09-03 2000-12-31
## 163 Tennessee (1796-06-01) 1796-06-01 1859-11-29
## 167 Texas (1850-12-13) 1850-12-13 1896-05-03
## 182 Unorg. Fed. Terr. (1850-12-13) 1850-12-13 1854-05-29
## 189 Utah Territory (1850-09-09) 1850-09-09 1861-02-27
## 199 Virginia (1846-09-07) 1846-09-07 1863-06-19
## 204 Vermont (1791-03-04) 1791-03-04 2000-12-31
## 210 Wisconsin (1848-05-29) 1848-05-29 1926-11-21
This kind of data is a SpatialPolygonsDataFrame. As the name implies, it contains both the polygons which define geographic boundaries and a data frame with one row for each boundary. By loading the sp package we can get a quick look at what the boundaries look like. (To see the data frame associated with it, try states_1850@data.)
library(sp)
plot(states_1850)
That looks pretty good. (The overlapping lines come from the State of Deseret.) But in order to plot this in ggplot2, we need the spatial data in a data frame as well. The broom and maptools packages let us get the data frame we need. We figure out the region = parameter by looking for the ID variable in the SpatialPolygonsDataFrame.
library(broom)
library(maptools)
## Checking rgeos availability: TRUE
states_df <- tidy(states_1850, region = "id")
head(states_df)
## long lat order hole piece group id
## 1 -87.61574 35.00355 1 FALSE 1 al_state.1 al_state
## 2 -87.60613 35.00347 2 FALSE 1 al_state.1 al_state
## 3 -87.58733 35.00352 3 FALSE 1 al_state.1 al_state
## 4 -87.57727 35.00355 4 FALSE 1 al_state.1 al_state
## 5 -87.45083 35.00275 5 FALSE 1 al_state.1 al_state
## 6 -87.34387 35.00158 6 FALSE 1 al_state.1 al_state
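In the tidied output, each row is one vertex of a boundary. The order column gives the sequence in which the vertices are connected, and the group column keeps separate rings apart. As a minimal sketch of why those columns matter, the base R code below builds two hand-made rings in the same layout and computes each ring's area with the shoelace formula. The rings, their names, and the shoelace_area() helper are all hypothetical, and the areas are in plain coordinate units, not square miles.

```r
# Two hypothetical rings in the same long/lat/order/group/id layout.
rings <- data.frame(
  long  = c(0, 1, 1, 0, 0,   2, 4, 4, 2, 2),
  lat   = c(0, 0, 1, 1, 0,   0, 0, 1, 1, 0),
  order = 1:10,
  group = rep(c("sq.1", "rect.1"), each = 5),
  id    = rep(c("sq", "rect"), each = 5),
  stringsAsFactors = FALSE
)

# Shoelace formula for one closed ring (first vertex repeated at the end).
# The rows are sorted by `order` first, because vertex sequence defines the shape.
shoelace_area <- function(df) {
  df <- df[order(df$order), ]
  x <- df$long
  y <- df$lat
  n <- nrow(df)
  abs(sum(x[-n] * y[-1] - x[-1] * y[-n])) / 2
}

# One area per group.
areas <- sapply(split(rings, rings$group), shoelace_area)
areas
```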
Now we can use geom_map() to add the polygons to our map, and coord_map() to make it look something like a map in an Albers conical projection suitable for the United States. (The Albers conical projection preserves the areas shown on the map, but not distances or angles. It is generally a good choice for mapping the continental United States. In a later worksheet, we will cover more advanced ways to project the map.)
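To make the projection less of a black box, here is a minimal sketch of the standard spherical Albers equal-area conic formulas on a unit sphere, written in base R. The function name albers_xy() and the choice of origin (37.5 degrees north, 96 degrees west) are assumptions for illustration; the two standard parallels match the lat0 and lat1 values used in this worksheet. It is not meant to reproduce exactly what coord_map() draws.

```r
# Spherical Albers equal-area conic projection on a unit sphere.
# lat1 and lat2 are the standard parallels; the origin is an assumed choice.
albers_xy <- function(long, lat, lat1 = 29.5, lat2 = 45.5,
                      lat_origin = 37.5, long_origin = -96) {
  rad <- pi / 180
  phi <- lat * rad
  lambda <- long * rad
  phi1 <- lat1 * rad
  phi2 <- lat2 * rad
  phi0 <- lat_origin * rad
  lambda0 <- long_origin * rad

  n <- (sin(phi1) + sin(phi2)) / 2          # cone constant
  C <- cos(phi1)^2 + 2 * n * sin(phi1)
  rho <- sqrt(C - 2 * n * sin(phi)) / n     # radius for each point
  rho0 <- sqrt(C - 2 * n * sin(phi0)) / n   # radius for the origin
  theta <- n * (lambda - lambda0)

  data.frame(x = rho * sin(theta), y = rho0 - rho * cos(theta))
}

# Baltimore and New Orleans, using coordinates from the dioceses output.
albers_xy(long = c(-76.61219, -90.07153), lat = c(39.29038, 29.95107))
```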
ggplot() +
geom_map(data = states_df, map = states_df,
aes(x = long, y = lat, map_id = id, group = group),
fill = "white", color = "gray", size = 0.25) +
coord_map(projection = "albers", lat0 = 29.5, lat1 = 45.5)
This is not a beautiful map, at least not yet. But the beauty of ggplot is that now that we understand the rather long aesthetic mapping that goes into the map, we can re-use that same pattern to make almost any kind of map.
Let’s put all the pieces together to make a map of Catholic dioceses in 1850.
ggplot() +
geom_map(data = states_df, map = states_df,
aes(x = long, y = lat, map_id = id, group = group),
fill = "white", color = "gray", size = 0.25) +
geom_point(data = dioceses_1850, aes(x = long, y = lat),
color = "red") +
coord_map(projection = "albers", lat0 = 29.5, lat1 = 45.5)
Again this is not a beautiful map. At a very minimum, we would want to remove the axis labels and get boundary information for the rest of North America or filter out the non-US dioceses. But we have made a map in only a few lines of code, and we can re-use the basic pattern for virtually any map that involves points.
One other ggplot2 trick. It is possible to save plots or parts of plots, then add on to them. Here we save our plot with the polygons to a base_map variable.
base_map <- ggplot() +
geom_map(data = states_df, map = states_df,
aes(x = long, y = lat, map_id = id, group = group),
fill = "white", color = "gray", size = 0.25) +
coord_map(projection = "albers", lat0 = 29.5, lat1 = 45.5)
base_map
Now instead of copying and pasting the code each time, we can use the same base_map over and over.
base_map +
geom_point(data = dioceses_1850, aes(x = long, y = lat),
color = "red")
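The same save-and-extend pattern can be tried without any boundary data. The sketch below uses a hypothetical square in place of the state polygons and a single hypothetical point, and it uses geom_polygon() rather than geom_map() purely to keep the example small. The names square, places, base_sketch, and with_points are made up for illustration.

```r
library(ggplot2)

# A hypothetical square standing in for a state boundary.
square <- data.frame(
  long  = c(0, 1, 1, 0),
  lat   = c(0, 0, 1, 1),
  group = "square.1"
)

# A hypothetical place to draw on top of the polygon.
places <- data.frame(long = 0.5, lat = 0.5)

# Build the polygon layer once and keep it in a variable.
base_sketch <- ggplot() +
  geom_polygon(data = square,
               aes(x = long, y = lat, group = group),
               fill = "white", color = "gray")

# Adding a layer returns a new plot object; base_sketch is left as it was.
with_points <- base_sketch +
  geom_point(data = places, aes(x = long, y = lat), color = "red")

with_points
```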
Try mapping the Paulist missions with geom_point() and geom_count() to see which gives you a more useful map. You can decide what boundaries you want to show beneath the points. (You can also try loading the ggmap package and adding + theme_nothing(legend = TRUE) to remove the grid lines and axis labels.)
data("paulist_missions")
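The difference between the two geoms comes down to counting: geom_point() draws one point per row even when rows share a location, while geom_count() counts the rows at each location and sizes the point accordingly. As a rough sketch of that counting step in base R, using made-up towns and coordinates rather than the real missions data:

```r
# Hypothetical missions, several of them held at the same coordinates.
missions <- data.frame(
  city = c("Town A", "Town A", "Town B", "Town A", "Town C", "Town B"),
  long = c(-74.0, -74.0, -71.1, -74.0, -76.6, -71.1),
  lat  = c(40.7, 40.7, 42.4, 40.7, 39.3, 42.4),
  stringsAsFactors = FALSE
)

# Give every row a weight of 1, then sum the weights per unique location.
missions$n <- 1
location_counts <- aggregate(n ~ long + lat, data = missions, FUN = sum)
location_counts
```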
Make a map of the Paulist missions in which the size of each point reflects the number of confessions or converts.
continental_us_1890 <- c(
  "Maine", "New Hampshire", "Vermont", "Massachusetts", "Rhode Island",
  "Connecticut", "New York", "New Jersey", "Pennsylvania", "Maryland",
  "Delaware", "Virginia", "West Virginia", "North Carolina", "South Carolina",
  "Georgia", "Florida", "Ohio", "Kentucky", "Tennessee", "Indiana", "Michigan",
  "Mississippi", "Alabama", "Iowa", "Wisconsin", "Louisiana", "Arkansas",
  "Missouri", "Illinois", "Minnesota", "North Dakota", "South Dakota",
  "Nebraska", "Kansas", "Texas", "Montana", "Colorado", "Wyoming",
  "New Mexico Territory", "Arizona Territory", "Utah Territory", "Idaho",
  "Washington", "Oregon", "California", "Nevada", "Oklahoma Territory",
  "Indian Territory"
)
map_1890_df <- us_states("1890-12-31", states = continental_us_1890)
states_1890_df <- tidy(map_1890_df, region = "id")
paulist_missions <- paulist_missions %>% rename(mission_order = order)
paulist_missions_filtered <- paulist_missions %>%
arrange(desc(converts)) %>%
head(n=20L)
base_map_1890 <- ggplot(paulist_missions_filtered) +
geom_map(data = states_1890_df, map = states_1890_df,
aes(x = long, y = lat, map_id = id, group = group),
fill = "white", color = "gray", size = 0.25) +
coord_map(projection = "albers", lat0 = 29.5, lat1 = 45.5)
base_map_1890 +
geom_point(aes(x = long, y = lat, size = converts), shape = 1)+
facet_wrap(~mission_order)
Filter the paulist_missions data frame by date. (I’ve given you a head start by creating a year column.) From 1866 to 1870 the Paulists did not hold any missions. Can you create maps which show the difference in the Paulist missions before and after that gap? What was the difference?
paulist_missions <- paulist_missions %>%
mutate(year = year(mdy(start_date)))
paulist_missions_first <- paulist_missions %>%
  filter(year <= 1865)
paulist_missions_second <- paulist_missions %>%
  filter(year >= 1870)
base_map +
  geom_point(data = paulist_missions_first,
             aes(x = long, y = lat, color = mission_order, size = confessions))
base_map_1890 +
  geom_point(data = paulist_missions_second,
             aes(x = long, y = lat, color = mission_order, size = confessions))
## Warning: Removed 6 rows containing missing values (geom_point).
After the gap, the missions expanded westward beyond the Mississippi River. Also, Redemptorist missions were held only before the gap in the record keeping.
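The before-and-after split can also be sketched in base R without lubridate. The dates below are hypothetical, and the month/day/year text format is an assumption based on the use of mdy() in the code above.

```r
# Hypothetical mission start dates in month/day/year form.
start_date <- c("4/6/1851", "11/20/1858", "3/12/1865", "1/15/1871", "10/2/1885")

# Parse the text and pull out the year as an integer.
parsed <- as.Date(start_date, format = "%m/%d/%Y")
year <- as.integer(format(parsed, "%Y"))

# Label each mission by which side of the gap it falls on.
period <- ifelse(year <= 1865, "before gap", "after gap")
table(period)
```

A single period column like this could also be passed to facet_wrap() to draw both maps side by side, provided one base map is acceptable for both periods.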
Another common kind of a map is a choropleth, which is a map where an area (such as a state or a county) is shaded according to some value. We are going to use religion data from the federal Census as compiled by NHGIS. This will require us to load in the spatial data directly from a shapefile instead of loading it from a package.
You can use this code to download the data.
dir.create("data/", showWarnings = FALSE)
get_nhgis_data <- function(x) {
download.file(paste0("http://lincolnmullen.com/projects/worksheets/data/", x),
paste0("data/", x))
unzip(paste0("data/", x), exdir = "data/")
}
get_nhgis_data("nhgis-religion.zip")
get_nhgis_data("county-1850.zip")
We are going to use the rgdal package to load the shapefile. Then we are also going to load in a CSV of data about religion in the 1850 Census. Both the shapefile and the census CSV have a column GISJOIN which lets us join the two together. The Census CSV lists the data variables as codes rather than meaningful names. So we are also going to load the codebook. The codebook is a human-readable text file, but we can parse it to get a data frame.
library(rgdal)
## rgdal: version: 1.1-3, (SVN revision 594)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 1.9.2, released 2012/10/08
## Path to GDAL shared files: /usr/share/gdal
## Loaded PROJ.4 runtime: Rel. 4.8.0, 6 March 2012, [PJ_VERSION: 480]
## Path to PROJ.4 shared files: (autodetected)
## Linking to sp version: 1.2-1
library(readr)
library(mullenMisc)
library(ggmap)
religion_1850 <- read_csv("data/nhgis0044_ds10_1850_county.csv")
codebook_1850 <- parse_nhgis_codebook("data/nhgis0044_ds10_1850_county_codebook.txt")
counties_1850 <- readOGR("data", layer = "US_county_1850")
## OGR data source with driver: ESRI Shapefile
## Source: "data", layer: "US_county_1850"
## with 1632 features
## It has 20 fields
Now we can proceed as usual and tidy our shapefile into a data frame. This data frame has an id column with the GISJOIN codes. So we can do a left join to our religion data and make it a part of the shapefile.
counties_df <- tidy(counties_1850, region = "GISJOIN")
counties_df <- counties_df %>%
left_join(religion_1850, by = c("id" = "GISJOIN"))
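A left join keeps every row of the polygon data frame and looks up the matching census row by key. The base R sketch below shows the same lookup with match(), on tiny hand-made tables. The GISJOIN-style codes and the counts are hypothetical. One reason to prefer left_join() or match() over base merge() here is that merge() sorts its result by the key by default, which would scramble the vertex order that the polygons depend on.

```r
# Hypothetical polygon vertices: several rows per county id, in drawing order.
shape_rows <- data.frame(
  id    = c("G001", "G001", "G002", "G003"),
  order = 1:4,
  stringsAsFactors = FALSE
)

# Hypothetical census table: one row per county, and no row for G003.
religion <- data.frame(
  GISJOIN = c("G002", "G001"),
  AET001  = c(7, 12),
  stringsAsFactors = FALSE
)

# For each vertex row, find the position of its county in the census table.
idx <- match(shape_rows$id, religion$GISJOIN)

# Copy the value across; counties with no census row get NA.
shape_rows$AET001 <- religion$AET001[idx]
shape_rows
```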
Now we are ready to make our map. We will fill the county boundaries based on the number of churches in each county by denomination. Looking at our code book, we find that the first entry is AET001 for Baptists. We will start with that.
ggplot() +
geom_map(data = counties_df, map = counties_df,
aes(x = long, y = lat, map_id = id, group = group,
fill = AET001),
color = "gray", size = 0.25) +
coord_map(projection = "albers", lat0 = 29.5, lat1 = 45.5) +
theme_nothing(legend = TRUE)
That map worked, but it is terrible. Colors almost never make sense as a continuous variable. It is almost always better to put them into bins. We can do that with the function cut() that R provides. We have to decide what breaks we want. The classInt package has a number of useful ways of determining breaks from data. But for now we are going to pick breaks that make sense for our data. Notice that we start with 0, so that counties without any Baptists are not colored. Then we add a column to our data frame, taking this opportunity to name it something sensible. And we will use the ColorBrewer scales, provided in R by the RColorBrewer package.
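To see how cut() treats the break points, here is a minimal sketch using made-up church counts. The counts vector is hypothetical; the breaks are the ones used in this worksheet. By default the intervals are open on the left and closed on the right, so a value of exactly 0 falls outside the first interval and becomes NA, as does anything above the last break.

```r
breaks <- c(0, 1, 5, 10, 15, 25, 100)

# Hypothetical numbers of churches in six counties.
counts <- c(0, 1, 3, 12, 40, 150)

# Each count is assigned to the interval (lower, upper] that contains it.
bins <- cut(counts, breaks)
data.frame(counts, bins)

# Show how many counties fall in each bin, including the NA group.
table(bins, useNA = "ifany")
```

If counties with more than 100 churches should be colored too, the last break would need to be raised, for example to the maximum of the data.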
breaks <- c(0, 1, 5, 10, 15, 25, 100)
counties_df2 <- counties_df %>%
mutate(baptists = cut(AET001, breaks))
library(RColorBrewer)
ggplot() +
geom_map(data = counties_df2, map = counties_df2,
aes(x = long, y = lat, map_id = id, group = group,
fill = baptists),
color = "gray", size = 0.25) +
coord_map(projection = "albers", lat0 = 29.5, lat1 = 45.5) +
theme_nothing(legend = TRUE) +
scale_fill_brewer(palette = "OrRd", name = "Baptists in 1850")
breaks <- c(0, 1, 5, 10, 15, 25, 100)
counties_df2 <- counties_df %>%
mutate(baptists = cut(AET001, breaks)) %>%
mutate(rom_cath = cut(AET017, breaks)) %>%
mutate(jewish = cut(AET010, breaks))
ggplot() +
geom_map(data = counties_df2, map = counties_df2,
aes(x = long, y = lat, map_id = id, group = group,
fill = baptists),
color = "gray", size = 0.25) +
coord_map(projection = "albers", lat0 = 29.5, lat1 = 45.5) +
theme_nothing(legend = TRUE) +
scale_fill_brewer(palette = "OrRd", name = "Baptists in 1850")
ggplot() +
geom_map(data = counties_df2, map = counties_df2,
aes(x = long, y = lat, map_id = id, group = group,
fill = rom_cath),
color = "gray", size = 0.25) +
coord_map(projection = "albers", lat0 = 29.5, lat1 = 45.5) +
theme_nothing(legend = TRUE) +
scale_fill_brewer(palette = "OrRd", name = "Roman Catholics in 1850")
ggplot() +
geom_map(data = counties_df2, map = counties_df2,
aes(x = long, y = lat, map_id = id, group = group,
fill = jewish),
color = "gray", size = 0.25) +
coord_map(projection = "albers", lat0 = 29.5, lat1 = 45.5) +
theme_nothing(legend = TRUE) +
scale_fill_brewer(palette = "OrRd", name = "Jews in 1850")
breaks <- c(0, 1, 5, 10, 15, 25, 100)
counties_df2 <- counties_df %>%
mutate(unitarian = cut(AET021, breaks))
ggplot() +
geom_map(data = counties_df2, map = counties_df2,
aes(x = long, y = lat, map_id = id, group = group,
fill = unitarian),
color = "gray", size = 0.25) +
coord_map(projection = "albers", lat0 = 29.5, lat1 = 45.5) +
theme_nothing(legend = TRUE) +
scale_fill_brewer(palette = "OrRd", name = "Unitarian in 1850")
library(classInt)
population_1850 <- read_csv("nhgis0004_ds10_1850_county.csv")
religion_pop <- religion_1850 %>%
left_join(population_1850, by = c("GISJOIN"))
religion_pop <- religion_pop %>%
mutate(cong_per_pop = AET003/ADQ001)
breaks_jenks <- classIntervals(religion_pop$cong_per_pop, style = "jenks", n = 6)
## Warning in classIntervals(religion_pop$cong_per_pop, style = "jenks", n =
## 6): var has missing values, omitted in finding classes
religion_pop_divided <- religion_pop %>%
mutate(congregational = cut(cong_per_pop, breaks_jenks$brks))
county_pop_df<- counties_df %>%
left_join(religion_pop_divided, by = c("id" = "GISJOIN"))
ggplot() +
geom_map(data = county_pop_df, map = county_pop_df,
aes(x = long, y = lat, map_id = id, group = group,
fill = congregational),
color = "gray", size = 0.25) +
coord_map(projection = "albers", lat0 = 29.5, lat1 = 45.5) +
theme_nothing(legend = TRUE) +
scale_fill_brewer(palette = "OrRd", name = "Congregational per capita in 1850")
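As a point of comparison with the Jenks breaks used above, quantile breaks can be computed in base R alone. The sketch below uses hypothetical church and population counts to build a per-capita rate and then bins it into four groups of roughly equal size. Passing include.lowest = TRUE keeps the smallest value inside the first bin instead of turning it into NA.

```r
# Hypothetical counties: number of churches and total population.
churches   <- c(2, 0, 5, 1, 8, 3, NA, 6)
population <- c(1000, 2000, 2000, 4000, 1600, 3000, 5000, 1000)

# Churches per person; the county with a missing count stays NA.
rate <- churches / population

# Four quantile bins need five break points, from the minimum to the maximum.
breaks_q <- quantile(rate, probs = seq(0, 1, length.out = 5), na.rm = TRUE)

# include.lowest = TRUE closes the first interval on the left.
rate_bin <- cut(rate, breaks_q, include.lowest = TRUE)
table(rate_bin, useNA = "ifany")
```

Quantile breaks put about the same number of counties in each color, while Jenks breaks try to group similar values together, so the two can produce quite different-looking maps from the same data.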