This is a brief introduction to mapping in R using both ggplot2 (static maps) and leaflet (interactive maps).
# install and load packages ----
#install.packages(c("rgdal","leaflet","ggplot2","acs","tigris","ggmap","stringr"))
library(rgdal)
library(leaflet)
library(ggplot2)
library(acs)
library(rgeos)
library(tigris)
library(ggmap)
library(stringr)
library(raster)
# set working directory ----
setwd('[DIRECTORY_WHERE_DATA_IS_STORED]')
# prep ----
#Sys.setenv(GOOGLE_MAP_GEOCODING_KEY="AIzaSyDobnUnB0yf5LOK8sJmIBU8-fuzKfeYmYs") # need Google geocoding key to geocode addresses
register_google(key = "[YOUR_GOOGLE_MAPS_API_KEY]")
options(tigris_use_cache=TRUE) # will temporarily save downloaded data so it won't need to re-download
# - **only use if you're downloading data through R**
state <- c(26) # also for downloading data - Michigan's FIPS code is 26
api.key.install(key="[YOUR_CENSUS_API_KEY]") # need for downloading ACS data in R
We’ve installed and loaded packages that contain functions we’ll need to map. We’ve also set the working directory, loaded keys for geocoding using Google Maps API and for downloading American Community Survey data directly in R.
Next, we’ll initialize different boundaries that we can use when creating static maps in ggplot2.
# set map boundaries ----
# State
statex <- c(-90.41, -81.5)
statey <- c(41.63,47.54)
# Detroit
detroitx <- c(-84, -81.5)
detroity <- c(42.04,43)
# Dearborn/Dearborn Heights
ddhx <- c(-83.315,-83.137)
ddhy <- c(42.267,42.353)
# Flint
flintx <- c(-83.830,-83.557)
flinty <- c(42.913,43.124)
# Tecumseh
tecumsehx <- c(-84.024,-83.862)
tecumsehy <- c(41.965,42.050)
We’ll only use the state of Michigan boundaries, but you can play around and zoom in on Detroit, Dearborn/Dearborn Heights, Flint and Tecumseh. These are all areas of interest according to Leigh and Rafael.
Next, we’ll load a basic CSV file and explore some basic functions in R for exploring the data.
# loading data ----
# CSV
ruralclinics <- read.csv("ruralclinics.csv",header=T,stringsAsFactors = F)
# some basic functions
dim(ruralclinics) # gives dimensions of dataframe [rows columns]
names(ruralclinics) # gives variable names of dataframe
head(ruralclinics,n=5) # prints first 5 observations
View(ruralclinics) # shows dataframe as a spreadsheet
class(ruralclinics) #tells you what kind of object ruralclinics is
class(ruralclinics$physician) # tells you the type of data the physician variable is
class(ruralclinics$zip) # tells you the type of data the zip variable is
class(ruralclinics$system) # character variable
ruralclinics$system # prints the values of the variable
ruralclinics$system <- as.factor(ruralclinics$system) # convert to factor
class(ruralclinics$system) # factor variable
levels(ruralclinics$system)
The class function is particularly useful. A lot of times errors will pop-up when variables are not in the form you think they are (sometimes numbers can be character variables or variables you think are factors will be character variables).
The following chunk will load a KML (can be seen as .kml.xml
).
This next chunk covers downloading American Community Survey (ACS) data directly into R.
# downloading data ----
# downloading ACS data
geo<-geo.make(state='MI', # we want to map census tracts in Michigan
county="*",
tract="*")
endyr=2015 # we want 2015 ACS data
sp=5 # we want the 5-year estimate data
tabnum = "B04006" # we want the ancestry data table
ancestry <- acs.fetch(endyear=endyr,span=sp,geography=geo, # we can now pull the 2015 ACS 5-year Ancestry table by census tract
table.number=tabnum,col.names="pretty") # "pretty" keeps the long variable names
View(ancestry)
# see quick_acs.R file for coding on manipulating and mapping ACS data
A more detailed code exploring and mapping ACS data can be found in the quick_acs.R
file in the CODE folder.
The next section shows how to download maps with different geographic subdivisions. It also demonstrates how to save these files for future use.
# downloading geodata
pumas <- pumas(state='MI',cb=TRUE) # will download Public Use Microdata Survey areas
plot(pumas) # plot these for fun
zipcodes <- read.csv("zip codes.csv",header=T)
zips <- zctas(starts_with = c('48','49'),cb=TRUE) # will download zip codes map of Michigan
counties <- counties <- counties(state='MI',cb=T) # will download counties map of Michigan
# saving geodata ----
writeOGR(obj=zips,dsn=".",layer="testzips",driver="ESRI Shapefile") # save zip code map for future use
testzips <- readOGR(dsn=".",layer="testzips") # load new shapefile
plot(testzips) # see what it looks like
rm(testzips,counties,pumas) # removes objects from environment
Similarly, if you already have shapefiles that were saved previously or were downloaded (for example, via TIGER from the US Census), you can load the files as follows.
# loading Shapefiles ----
# Shapefiles
catch.zips <- readOGR(dsn=".",layer="catch.zips") # loads outline of UMRCC catchment area
rcount <- readOGR(dsn=".",layer="rcount") # loads outline of rural counties according to Leigh
counties <- readOGR(dsn=".",layer="counties") # loads county map of Michigan
This example will demonstrate how to geocode addresses (turn them into coordinates). You will need a Google Maps API key that was run earlier in the code to access the geocoding feature.
# geocoding addresses ----
# we have the rural clinics CSV file loaded earlier
View(ruralclinics)
# it shows different variables, including street addresses, cities, state, and ZIP code
# this is all we need to geocode addresses
# Read in the CSV data and store it in a variable
ruralclinics$fulladdress <- paste0(ruralclinics$address,", ",ruralclinics$city,", ",ruralclinics$state)
# Initialize the data frame
geocoded <- data.frame(stringsAsFactors = FALSE)
# Loop through the addresses to get the latitude and longitude of each address and add it to the
# origAddress data frame in new columns lat and lon
for(i in 1:nrow(ruralclinics)){
# Print("Working...")
result <- geocode(ruralclinics$fulladdress[i], output = "latlona", source = "google")
ruralclinics$lon[i] <- as.numeric(result[1])
ruralclinics$lat[i] <- as.numeric(result[2])
ruralclinics$geoAddress[i] <- as.character(result[3])
}
View(ruralclinics)
write.csv(ruralclinics,file="test_geocoded.csv") # saves geocoded dataframe as CSV for future use
Now for the fun part. This next code will create a map of Michigan with county names, an outline of the UM Rogel Cancer Center catchment area (defined using ZIP codes), and a shaded area representing rural counties in the state (according to Leigh).
# mapping using ggplot ---- map of Michigan counties with names, UMRCC catchment area and rural counties
# Find a center point for each region
centers <- data.frame(gCentroid(counties, byid = TRUE, id=counties$NAME))
centers$NAME <- rownames(centers)
# plot
ggplot() + # starts ggplot
geom_polygon(data = counties , aes(x=long, y=lat, group = group),color="black",fill=NA) + # plot counties
geom_polygon(data = rcount , aes(x=long, y=lat, group = group),color="red",fill="red",alpha=0.3) + # plots rural county outline
geom_polygon(data = catch.zips , aes(x=long, y=lat, group = group),color="blue",fill=NA) + # plots UMRCC outline
coord_map(xlim = statex, ylim = statey) + # sets map zoom
geom_text(data = centers, aes(x=x,y=y,label = paste(as.character(NAME), sep="")), hjust = 0.5, color = "black",size=2) + # plots county names
theme(axis.line=element_blank(), # these are just visual options for the map - basically removes axes, lines, background, legend, etc
axis.text.x=element_blank(),
axis.text.y=element_blank(),
axis.ticks=element_blank(),
axis.title.x=element_blank(),
axis.title.y=element_blank(),
legend.position="none",
panel.background=element_blank(),
panel.border=element_blank(),
panel.grid.major=element_blank(),
panel.grid.minor=element_blank(),
plot.background=element_blank())
Lastly, here is an example of mapping using leaflet - a package that creates interactive maps. It will map Michigan with points for pharmacies, hospitals, and lung cancer screening centers. It will also contain the outline for the UM Rogel Cancer Center catchment center and a shaded area representing rural counties.
# mapping using leaflet ---- map of Michigan with pharmacies, hospitals, lung cancer screening centers,
# UMRCC catchment area and rural counties
pharms <- read.csv("pharmacy_geocoded.csv",header=T,stringsAsFactors = F)
lung <- read.csv("acrlcc_geocoded.csv",header=T,stringsAsFactors = F)
r <- 3 # marker radius
w <- 2 # marker width
o <- 1 # opacity
# specify pop-up info for clicking on markers
popup.pharms <- paste0("Brand: ", pharms$Brand,"<br>",
"Address: ",pharms$Address,"<br>",
"City: ",pharms$City,"<br>",
"Phone: ",pharms$Phone)
popup.rhcs <- paste0("Name: ",rhcs$name,"<br>",
ifelse(rhcs$description=="","",paste0("Phone: ",rhcs$description)))
popup.lung <- paste0("Name: ",lung$facility.name,"<br>",
"Address: ",lung$address,"<br>",
"City: ",lung$city,"<br>",
"Status: ",lung$status,"<br>",
"Expiration: ",lung$expiration)
leaflet() %>%
# Base groups
addProviderTiles("CartoDB.Positron") %>%
# Overlay outlines
addPolygons(data=rcount,
color='#FF0000',
fillOpacity=0.3,
weight=.01,
smoothFactor=0.2
#popup = paste0("County: ",counties$GEOID,"<br>","rural: ",counties$rural)
) %>%
addPolygons(data=catch.zips,
opacity=1,
fill=0,
color='#000000',
weight=1.5,
smoothFactor=0.5) %>%
addPolygons(data=counties,
opacity=.5,
fill=0,
color='#cccce5',
weight=1,
smoothFactor=0.5) %>%
# Overlay groups
addCircleMarkers(data=pharms, lat=~lat,lng=~lon,color="#f44336",radius=r,weight=w,opacity=o,group="Pharmacies",
popup=popup.pharms) %>%
addCircleMarkers(data=rhcs, lat=~lat,lng=~long,color="#4caf50",radius=r,weight=w,opacity=o,group="RHCs",
popup=popup.rhcs) %>%
addCircleMarkers(data=lung, lat=~lat,lng=~lon,color="#3f51b5",radius=r,weight=w,opacity=o,group="LCS Centers",
popup=popup.lung) %>%
addLabelOnlyMarkers(data=centers,~x, ~y, label = ~as.character(NAME),
labelOptions = labelOptions(noHide = T, direction = 'top', textOnly = T,style=list('font-weight'='normal'),padding='3px 8px',textsize='8px')) %>%
#Layers control
addLayersControl(
overlayGroups=c("Pharmacies","RHCs","LCS Centers"),
options=layersControlOptions(collapsed=F)
) %>%
addLegend(position="bottomright",
colors=c("#f44336","#4caf50","#3f51b5"),
values=c("Pharmacies","RHCs","LCS Centers"),
labels=c("Pharmacies","RHCs","LCS Centers"))
And there you have it! You’ve learned how to map static maps using ggplot2 and dynamic maps using leaflet.
sessionInfo()
## R version 3.4.0 (2017-04-21)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS 10.14
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] raster_2.6-7 ggmap_2.7.900 tigris_0.7 rgeos_0.3-26 acs_2.1.3
## [6] XML_3.98-1.11 stringr_1.3.0 ggplot2_3.0.0 leaflet_1.1.0 rgdal_1.2-18
## [11] sp_1.2-7
##
## loaded via a namespace (and not attached):
## [1] Rcpp_0.12.17 lattice_0.20-35 png_0.1-7
## [4] class_7.3-14 assertthat_0.2.0 rprojroot_1.3-2
## [7] digest_0.6.15 mime_0.5 R6_2.2.2
## [10] plyr_1.8.4 backports_1.1.2 evaluate_0.10.1
## [13] e1071_1.6-8 httr_1.3.1 pillar_1.2.1
## [16] RgoogleMaps_1.4.2 rlang_0.2.1 lazyeval_0.2.1
## [19] uuid_0.1-2 rmarkdown_1.9 labeling_0.3
## [22] udunits2_0.13 foreign_0.8-69 htmlwidgets_1.2
## [25] munsell_0.4.3 shiny_1.0.5 compiler_3.4.0
## [28] httpuv_1.3.6.2 pkgconfig_2.0.1 htmltools_0.3.6
## [31] tidyselect_0.2.4 tibble_1.4.2 dplyr_0.7.6
## [34] withr_2.1.2 sf_0.6-1 bitops_1.0-6
## [37] rappdirs_0.3.1 grid_3.4.0 jsonlite_1.5
## [40] xtable_1.8-2 gtable_0.2.0 DBI_0.8
## [43] magrittr_1.5 units_0.5-1 scales_0.5.0
## [46] stringi_1.2.2 mapproj_1.2.6 bindrcpp_0.2.2
## [49] rjson_0.2.20 tools_3.4.0 glue_1.3.0
## [52] markdown_0.8 purrr_0.2.5 maps_3.3.0
## [55] crosstalk_1.0.0 jpeg_0.1-8 yaml_2.1.19
## [58] colorspace_1.3-2 maptools_0.9-2 classInt_0.1-24
## [61] knitr_1.20 bindr_0.1.1