Learning Objectives

In this lesson students will learn how to

  • Create a map of the United States
  • Create a map of the World
  • Try different map projections
  • Color a map based on variables

Example 1: Dutch Bros

Dutch Bros Coffee is a drive-through coffee chain headquartered in Grants Pass, Oregon, with company-owned and franchise locations throughout the United States.

STEP 0: Library Tidyverse

library(tidyverse)

STEP 1: Dutch Bros Data

These data represent a list of store locations. There are variables for city, state, and zipcode.

dutch<- read.csv("https://raw.githubusercontent.com/kitadasmalley/DATA151/main/Data/dutch_bros_locations.csv", 
                 header=TRUE)

We will also want to wrangle our data so that we count the number of locations in a given state.

dutchSt<-dutch%>%
  group_by(state)%>%
  summarise(n=n())

head(dutchSt)
## # A tibble: 6 × 2
##   state     n
##   <fct> <int>
## 1 AZ       61
## 2 CA       96
## 3 CO       30
## 4 ID       35
## 5 NM        7
## 6 NV       25

STEP 2: usmap Package

#install.packages("usmap")
library(usmap)
## Warning: package 'usmap' was built under R version 3.6.2
states <- usmap::us_map()

## What variables in the state object?
head(states)
##         x        y order  hole piece group fips abbr    full
## 1 1091779 -1380695     1 FALSE     1  01.1   01   AL Alabama
## 2 1091268 -1376372     2 FALSE     1  01.1   01   AL Alabama
## 3 1091140 -1362998     3 FALSE     1  01.1   01   AL Alabama
## 4 1090940 -1343517     4 FALSE     1  01.1   01   AL Alabama
## 5 1090913 -1341006     5 FALSE     1  01.1   01   AL Alabama
## 6 1090796 -1334480     6 FALSE     1  01.1   01   AL Alabama

Let’s investigate the data for Oregon.

Points
oregon<-states%>%
  filter(full=="Oregon")

ggplot(oregon, aes(x, y))+
  geom_point()

These data allow us to play “connect the dots” to draw the shape of the state of Oregon.

Connect the dots

Oh no, what happened?

ggplot(oregon, aes(x, y))+
  geom_line()

We need to tell R what order to connect the dots.

  • geom_path() connects the observations in the order in which they appear in the data.

  • geom_line() connects them in order of the variable on the x axis.

ggplot(oregon, aes(x, y, group=group))+
  geom_path() 

Filling in the space

We can actually think of geographies as generalized polygons!

ggplot(oregon, aes(x, y, group=group))+
  geom_polygon(fill="forestgreen") 

STEP 3: Joining Data to the Map

When joining the data to the map we need to have the same variable name in both. Let’s create a new column named state in our states data, that contains the state abbreviations, since the dutch data set uses state abbreviations.

mapDutch<-states%>%
  mutate(state=abbr)%>%
  left_join(dutchSt)
## Joining, by = "state"

STEP 4: Choropleth

mapDutch%>%
  ggplot(aes(x, y, group = group)) +
  geom_polygon(aes(fill = n),color="black")+
  theme_bw()+
  coord_fixed()

STEP 5: Changing Color Palette

Viridis is a colorblind friendly color palette that can be used to create accessible heatmaps.

#install.packages("viridis")
library(viridis)

mapDutch%>%
  ggplot(aes(x, y, group = group)) +
  geom_polygon(aes(fill = n),color="black")+
  theme_bw()+
  coord_fixed()+
  scale_fill_viridis(option="viridis", direction = 1)+
  ggtitle("Oregon - Home of Dutch Bros")

STEP 6: Interactivity

Using the package plotly we can add some basic interactivity, like hover overs, zooms, and box selects.

#install.packages("plotly")
library(plotly)

p<-mapDutch%>%
  ggplot(aes(x, y, group = group)) +
  geom_polygon(aes(fill = n),color="black")+
  theme_bw()+
  coord_fixed()+
  scale_fill_viridis(option="viridis", direction = 1)

ggplotly(p)

Example 2: Starbucks

STEP 1: Starbucks Store Data

starbuck <- read_csv("https://raw.githubusercontent.com/kitadasmalley/DATA151/main/Data/starbucks_directory.csv")
## Rows: 25600 Columns: 13
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (11): Brand, Store Number, Store Name, Ownership Type, Street Address, C...
## dbl  (2): Longitude, Latitude
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#str(starbuck)
How many counties is Starbucks in?

Let’s look at the format of the column Country:

head(unique(starbuck$Country))
## [1] "AD" "AE" "AR" "AT" "AU" "AW"
length(unique(starbuck$Country))
## [1] 73

STEP 2: Using map_data in ggplot2

A. Load the data
# Load map data
world_map = map_data("world")
B. Basic Map
world_map%>% 
  ggplot(aes(map_id = region)) +
  geom_map(map = world_map)+
  expand_limits(x = world_map$long, y = world_map$lat)

C. Projections

Since the globe is a 3D object, we must project it onto a 2D surface to be able to visualize it on paper; however, this affects the presentation of land masses. There are five properties that can be distorted in projections: shape, area, angles, distance, and direction. Map makers are able to preserve two of these properties, but the other three are sacrificed.

The Mercator projection is one of the most common projections. It has equally spaced straight meridians, conformal, straight compass courses. However, this projection is known to distort areas. Thus, the further from the equator the larger a land mass may appear.

For instance, lets look at Greenland and Algeria:

world_map%>%
  filter(region%in%c("Greenland", "Algeria"))%>%
  ggplot(aes(map_id = region)) +
  geom_map(map = world_map)+
  expand_limits(x = c(-100, 100), y =  c(0, 100))+
  coord_map("mercator") 

TRUTH: Algeria is 2.4 million \(km^2\) and Greenland is 2.2 million \(km^2\)

Another commonly used project is the Mollweide, which is considered to be homalographic (equal-area) where the hemisphere is a circle.

world_map%>%
  filter(region%in%c("Greenland", "Algeria"))%>%
  ggplot(aes(map_id = region)) +
  geom_map(map = world_map)+
  expand_limits(x = c(-100, 100), y =  c(0, 100))+
  coord_map("mollweide") 

D. High Longitude

The longitude ranges from -180 to 180. You will notice that lines get strange when we plot the whole world. This is because there are parts of the US and Russia that extend past the 180th meridian.

world_map%>% 
  ggplot(aes(map_id = region)) +
  geom_map(map = world_map)+
  expand_limits(x = world_map$long, y = world_map$lat)+
  coord_map("mollweide") 

We can fix this by taking out longitudes greater than 180.

world_map%>%
  ggplot(aes(map_id = region)) +
  geom_map(map = world_map)+
  expand_limits(x = world_map$long, y = world_map$lat)+
  coord_map("mollweide", xlim=c(-180,180))

Try it!

Read about different types of projections:

help(mapproject)

The following are the different types of projections available in R. Some of these projections require additional parameter(s) to be specified.

## PROJECTIONS
projlist <- c("aitoff", "albers", "azequalarea", "azequidist", "bicentric",
              "bonne", "conic", "cylequalarea", "cylindrical", "eisenlohr", "elliptic",
              "fisheye", "gall", "gilbert", "guyou", "harrison", "hex", "homing",
              "lagrange", "lambert", "laue", "lune", "mercator", "mollweide", "newyorker",
              "orthographic", "perspective", "polyconic", "rectangular", "simpleconic",
              "sinusoidal", "tetra", "trapezoidal")

Task: Try three different projections and make observations for how the map of the Earth changes.

world_map%>%
  ggplot(aes(map_id = region)) +
  geom_map(map = world_map)+
  expand_limits(x = world_map$long, y = world_map$lat)+
  coord_map("mercator", xlim=c(-180,180))

STEP 3: Joining the Data to the Map

Let’s look at the map data. Note that region is used to specify country, but it has the whole name written out. This might be a problem for us because the Starbucks data uses a two-letter code of each country.

head(world_map)
##        long      lat group order region subregion
## 1 -69.89912 12.45200     1     1  Aruba      <NA>
## 2 -69.89571 12.42300     1     2  Aruba      <NA>
## 3 -69.94219 12.43853     1     3  Aruba      <NA>
## 4 -70.00415 12.50049     1     4  Aruba      <NA>
## 5 -70.06612 12.54697     1     5  Aruba      <NA>
## 6 -70.05088 12.59707     1     6  Aruba      <NA>
Use the countrycode Package
#install.packages("countrycode")
library(countrycode)

starStores<-starbuck%>%
  group_by(Country)%>%
  summarise(stores=n())%>%
  mutate(region=countrycode(Country, origin = 'iso2c', destination = 'country.name'))%>%
  arrange(desc(stores))

head(starStores)
## # A tibble: 6 × 3
##   Country stores region        
##   <chr>    <int> <chr>         
## 1 US       13608 United States 
## 2 CN        2734 China         
## 3 CA        1468 Canada        
## 4 JP        1237 Japan         
## 5 KR         993 South Korea   
## 6 GB         901 United Kingdom
Join
worldCoffee<-world_map%>%
  left_join(starStores, by="region")

STEP 4: Choropleth

#install.packages("ggthemes")
library(ggthemes)
## Warning: package 'ggthemes' was built under R version 3.6.2
worldCoffee%>% 
  ggplot(aes(map_id = region, fill=stores)) +
  geom_map(map = world_map) +
  expand_limits(x = world_map$long, y = world_map$lat) +
  coord_map("mollweide", xlim=c(-180,180))+
  theme_map()

Oh no!

Something went wrong? Can you see it?

It turns out that only country name in maps that doesn’t use the whole country name is the United States. In the map packages the name is USA. ##### Back to STEP 3

starStores<-starbuck%>%
  group_by(Country)%>%
  summarise(stores=n())%>%
  mutate(region=as.character(lapply(Country, function(x){
    if(x!="US"){
      out=countrycode(x, origin = 'iso2c', destination = 'country.name')
    }
    if(x=="US"){
      out="USA"
      }
    out})))

worldCoffee<-world_map%>%
  left_join(starStores, by="region")
Step 4…Again
worldCoffee%>% 
  ggplot(aes(map_id = region, fill=stores)) +
  geom_map(map = world_map) +
  expand_limits(x = world_map$long, y = world_map$lat) +
  coord_map("mollweide", xlim=c(-180,180))+
  theme_map()

STEP 5: Colors

Since the United States has so many Starbucks locations, its hard to see how many stores are in other countries.

Log transformation
worldCoffee%>% 
  ggplot(aes(map_id = region, fill=stores)) +
  geom_map(map = world_map) +
  expand_limits(x = world_map$long, y = world_map$lat) +
  coord_map("mollweide", xlim=c(-180,180))+
  theme_map()+
  scale_fill_gradient(trans = "log10")

Viridis
worldCoffee%>% 
  ggplot(aes(map_id = region, fill=stores)) +
  geom_map(map = world_map) +
  expand_limits(x = world_map$long, y = world_map$lat) +
  coord_map("mollweide", xlim=c(-180,180))+
  theme_map()+
  scale_fill_gradient(trans = "log10")+
  scale_fill_viridis(option="viridis", direction = 1)
## Scale for 'fill' is already present. Adding another scale for 'fill', which
## will replace the existing scale.

Binning Colors
# SPLIT THE RANGE INTO 5 EQUAL SECTIONS
split5<-quantile(starStores$stores, prob=c(0, .2, .4, .6, .8, 1))
split5
##      0%     20%     40%     60%     80%    100% 
##     1.0     7.4    18.0    54.2   139.2 13608.0
str(split5)
##  Named num [1:6] 1 7.4 18 54.2 139.2 ...
##  - attr(*, "names")= chr [1:6] "0%" "20%" "40%" "60%" ...
# CUT INTO FIVE SECTIONS
starStores<-starStores%>%
  mutate(store_cut = cut(stores, breaks = split5))

#install.packages("viridisLite")
library(viridisLite)

# CREATE A VIRIDIS COLOR PALETTE WITH 5 COLORS
viridis5<-viridis(n=5, begin = 0, end = 1, direction = 1)
viridis5
## [1] "#440154FF" "#3B528BFF" "#21908CFF" "#5DC863FF" "#FDE725FF"
# JOIN
worldCoffee<-world_map%>%
  left_join(starStores, by="region")

# MAP
worldCoffee%>% 
  ggplot(aes(map_id = region, fill=store_cut)) +
  geom_map(map = world_map) +
  expand_limits(x = world_map$long, y = world_map$lat) +
  coord_map("mollweide", xlim=c(-180,180))+
  theme_map()+
  scale_fill_manual(values=viridis5)

STEP 6: Add Points

Number of locations within the same city:

## CITIES
starCities<-starbuck%>%
  group_by(Country, City)%>%
  summarise(lat=mean(Latitude), 
            long=mean(Longitude), 
            cityStores=n())
## `summarise()` has grouped output by 'Country'. You can override using the `.groups` argument.

Add points:

worldCoffee%>% 
  ggplot() +
  geom_map(aes(map_id = region, fill=store_cut), map = world_map) +
   geom_point(data=starCities, aes(x=long, y=lat, size=cityStores), alpha=.3)+
  expand_limits(x = world_map$long, y = world_map$lat) +
  coord_map("mollweide", xlim=c(-180,180))+
  theme_map()+
  scale_fill_manual(values=viridis5)+
  guides(color = guide_legend(order=1),
          size = guide_legend(order=2))+
  theme(legend.position="bottom", legend.box = "vertical")
## Warning: Removed 1 rows containing missing values (geom_point).