First we are going to need to install the necessary packages. Since creating an interactive map is a major portion of this, we’ll be needing the necessary packages to do so.
# install.packages("devtools")
# install.packages("stringr")
# install.packages(c("maps", "mapdata"))
Then we load them.
library(ggplot2)
library(maps)
library(mapdata)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(stringr)
library(viridis)
## Loading required package: viridisLite
##
## Attaching package: 'viridis'
## The following object is masked from 'package:maps':
##
## unemp
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ tibble 3.1.2 ✓ purrr 0.3.4
## ✓ tidyr 1.1.3 ✓ forcats 0.5.1
## ✓ readr 1.4.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
## x purrr::map() masks maps::map()
We load up a map of the US to start with. We get the map from the “maps” pacakge.
usa <- map_data("usa")
dim(usa)
## [1] 7243 6
# head(usa)
# tail(usa)
Then we load a different map, “world2Hires” from mapdata, which is centered around the Pacific Ocean.
w2hr <- map_data("world2Hires")
dim(w2hr)
## [1] 2274539 6
head(w2hr)
## long lat group order region subregion
## 1 226.6336 58.42416 1 1 Canada <NA>
## 2 226.6314 58.42336 1 2 Canada <NA>
## 3 226.6122 58.41196 1 3 Canada <NA>
## 4 226.5911 58.40027 1 4 Canada <NA>
## 5 226.5719 58.38864 1 5 Canada <NA>
## 6 226.5528 58.37724 1 6 Canada <NA>
tail(w2hr)
## long lat group order region subregion
## 2276817 125.0258 11.18471 2284 2276817 Philippines Leyte
## 2276818 125.0172 11.17142 2284 2276818 Philippines Leyte
## 2276819 125.0114 11.16110 2284 2276819 Philippines Leyte
## 2276820 125.0100 11.15555 2284 2276820 Philippines Leyte
## 2276821 125.0111 11.14861 2284 2276821 Philippines Leyte
## 2276822 125.0155 11.13887 2284 2276822 Philippines Leyte
From here let’s create a basic map of the USA. It’ll be filled with black and nothing else.
ggplot() +
geom_polygon(data = usa, aes(x = long, y = lat, group = group)) +
coord_quickmap()
Next, let’s make it a bit more advanced by coloring the border red.
ggplot() +
geom_polygon(data = usa, aes(x = long, y = lat, group = group), fill = NA, color = "red") +
coord_quickmap()
Then let’s fill the map with a violet color.
gg1 <- ggplot() +
geom_polygon(data = usa, aes(x = long, y = lat, group = group), fill = "violet", color = "blue") +
coord_quickmap()
gg1
Let’s add black and yellow points at the NMFS lab in Santa Cruz and at the Northwest Fisheries Science Center lab in Seattle.
labs <- tibble(
long = c(-122.064873, -122.306417),
lat = c(36.951968, 47.644855),
names = c("SWFSC-FED", "NWFSC"))
gg1 +
geom_point(data = labs, aes(x = long, y = lat), shape = 21, color = "black", fill = "yellow", size = 5) +
geom_text(data = labs, aes(x = long, y = lat, label = names), hjust = 0, nudge_x = 1)
The group aesthetics importance can be seen here as there will be lines connecting certain points.
ggplot() +
geom_polygon(data = usa, aes(x = long, y = lat), fill = "violet", color = "blue") +
geom_point(data = labs, aes(x = long, y = lat), shape = 21, color = "black", fill = "yellow", size = 5) +
geom_text(data = labs, aes(x = long, y = lat, label = names), hjust = 0, nudge_x = 1) +
coord_quickmap()
Now that we know how to make a basic graph, let’s make it a bit more advanced. This time we’ll be looking at California specifically. So first we need to load the data for states.
states <- map_data("state")
dim(states)
## [1] 15537 6
head(states)
## long lat group order region subregion
## 1 -87.46201 30.38968 1 1 alabama <NA>
## 2 -87.48493 30.37249 1 2 alabama <NA>
## 3 -87.52503 30.37249 1 3 alabama <NA>
## 4 -87.53076 30.33239 1 4 alabama <NA>
## 5 -87.57087 30.32665 1 5 alabama <NA>
## 6 -87.58806 30.32665 1 6 alabama <NA>
tail(states)
## long lat group order region subregion
## 15594 -106.3295 41.00659 63 15594 wyoming <NA>
## 15595 -106.8566 41.01232 63 15595 wyoming <NA>
## 15596 -107.3093 41.01805 63 15596 wyoming <NA>
## 15597 -107.9223 41.01805 63 15597 wyoming <NA>
## 15598 -109.0568 40.98940 63 15598 wyoming <NA>
## 15599 -109.0511 40.99513 63 15599 wyoming <NA>
Now we just fill in the USA with the different states.
ggplot(data = states) +
geom_polygon(aes(x = long, y = lat, fill = region, group = group), color = "white") +
coord_quickmap() +
guides(fill = FALSE) # do this to leave off the color legend
## Warning: `guides(<scale> = FALSE)` is deprecated. Please use `guides(<scale> =
## "none")` instead.
We can even choose to include certain states. Let’s graph the 3 states California, Oregon, and Washington.
west_coast <- states %>%
filter(region %in% c("california", "oregon", "washington"))
ggplot(data = west_coast) +
geom_polygon(aes(x = long, y = lat), fill = "palegreen", color = "black")
Something seems off, let’s include “group()” and “quickmap()” to see if there’s a difference.
ggplot(data = west_coast) +
geom_polygon(aes(x = long, y = lat, group = group), fill = "palegreen", color = "black") +
coord_quickmap()
Now it’s much better to look at over the previous map which was elongated a bit too much.
Now let’s focus on a specific state - California.
ca_df <- states %>%
filter(region == "california")
head(ca_df)
## long lat group order region subregion
## 1 -120.0060 42.00927 4 667 california <NA>
## 2 -120.0060 41.20139 4 668 california <NA>
## 3 -120.0060 39.70024 4 669 california <NA>
## 4 -119.9946 39.44241 4 670 california <NA>
## 5 -120.0060 39.31636 4 671 california <NA>
## 6 -120.0060 39.16166 4 672 california <NA>
Now let’s add the country lines to the map of Califronia.
counties <- map_data("county")
ca_county <- counties %>%
filter(region == "california")
head(ca_county)
## long lat group order region subregion
## 1 -121.4785 37.48290 157 6965 california alameda
## 2 -121.5129 37.48290 157 6966 california alameda
## 3 -121.8853 37.48290 157 6967 california alameda
## 4 -121.8968 37.46571 157 6968 california alameda
## 5 -121.9254 37.45998 157 6969 california alameda
## 6 -121.9483 37.47717 157 6970 california alameda
Before we add any counties, let’s see if the basic map does indeed display Califronia.
ca_base <- ggplot(data = ca_df, mapping = aes(x = long, y = lat, group = group)) +
coord_quickmap() +
geom_polygon(color = "black", fill = "gray")
ca_base + theme_void()
Now that we’re set let’s fill in the counties.
ca_base + theme_void() +
geom_polygon(data = ca_county, fill = NA, color = "white") +
geom_polygon(color = "black", fill = NA) # get the state border back on top
Since we were able to divide the country into counties, I wonder if we could add other data to the map. Let’s try to color the map by population. First, we need to pull the population data of each country first.
# make a data frame
x <- readLines("/Users/justinpark/Desktop/DATA 110/Unit 5/HW/CA Counties Text.txt")
## Warning in readLines("/Users/justinpark/Desktop/DATA 110/Unit 5/HW/CA Counties
## Text.txt"): incomplete final line found on '/Users/justinpark/Desktop/DATA 110/
## Unit 5/HW/CA Counties Text.txt'
pop_and_area <- str_match(x, "^([a-zA-Z ]+)County\t.*\t([0-9,]{2,10})\t([0-9,]{2,10}) sq mi$")[, -1] %>%
na.omit() %>%
as.data.frame(stringsAsFactors = FALSE) %>%
mutate(subregion = str_trim(V1) %>% tolower(),
population = as.numeric(str_replace_all(V2, ",", "")),
area = as.numeric(str_replace_all(V3, ",", ""))
) %>%
dplyr::select(subregion, population, area) %>%
tbl_df()
## Warning: `tbl_df()` was deprecated in dplyr 1.0.0.
## Please use `tibble::as_tibble()` instead.
head(pop_and_area)
## # A tibble: 6 x 3
## subregion population area
## <chr> <dbl> <dbl>
## 1 alameda 1578891 738
## 2 alpine 1159 739
## 3 amador 36519 593
## 4 butte 222090 1640
## 5 calaveras 44515 1020
## 6 colusa 21358 1151
Now we need to attach the population numbers to each county area. We do this using “left_join” and we’ll also add a new variable “people_per_mile”.
cacopa <- left_join(ca_county, pop_and_area, by = "subregion") %>%
mutate(people_per_mile = population / area)
head(cacopa)
## long lat group order region subregion population area
## 1 -121.4785 37.48290 157 6965 california alameda 1578891 738
## 2 -121.5129 37.48290 157 6966 california alameda 1578891 738
## 3 -121.8853 37.48290 157 6967 california alameda 1578891 738
## 4 -121.8968 37.46571 157 6968 california alameda 1578891 738
## 5 -121.9254 37.45998 157 6969 california alameda 1578891 738
## 6 -121.9483 37.47717 157 6970 california alameda 1578891 738
## people_per_mile
## 1 2139.419
## 2 2139.419
## 3 2139.419
## 4 2139.419
## 5 2139.419
## 6 2139.419
Now let’s create the modified map.
elbow_room1 <- ca_base +
geom_polygon(data = cacopa, aes(fill = people_per_mile), color = "white") +
geom_polygon(color = "black", fill = NA) +
theme_void()
elbow_room1
The population density in Califronia is too great of a number. We need to change the scaling to get a better understanding.
elbow_room1 + scale_fill_gradient(trans = "log10")
There’s contrast among the county densities. However, the colors should still be more distinguishable.
eb2 <- elbow_room1 +
scale_fill_gradientn(colours = rev(rainbow(7)),
breaks = c(2, 4, 10, 100, 1000, 10000),
trans = "log10")
eb2
Very colorful. But perhaps a bit too colorfu. This could be unpleasant to look at for some people. Let’s use a different color package “viridis” to get better colors.
eb2 <- elbow_room1 +
scale_fill_viridis(breaks = c(2, 4, 10, 100, 1000, 10000),
trans = "log10")
eb2
So we’ve mapped the entire country, specific states, and even counties. But can we zoom in further to include only the Bay Area?
eb2 + xlim(-123, -121.0) + ylim(36, 38)
Unfortunately, this was unsuccessful. Parts of the map got cutout and even a random line appeared connecting 2 points (the same line from before). Let’s modify this to get rid of the line and include the full Bay Area.
eb2 + coord_quickmap(xlim = c(-123, -121.0), ylim = c(36, 38))
## Coordinate system already present. Adding new coordinate system, which will replace the existing one.
Great, now the map is complete. We were able to map and observe the entirety of the USA, specific states, counties, and even a specific region of a state.
eb2 + coord_quickmap(xlim = c(-123, -121.0), ylim = c(36, 38))
## Coordinate system already present. Adding new coordinate system, which will replace the existing one.