Install the following libraries (ggplot2, sf, dplyr, tidylog, and ggiraph) and load them.
library(ggplot2)
library(sf)
library(dplyr)
library(tidylog)
library(ggiraph)
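If some of these packages are not installed yet, a small helper can report which ones are missing before library() is called. This is a minimal sketch: missing_packages is a hypothetical helper name that does not come from any of these libraries, and the install step is assumed to run only in an interactive session.

```r
# Packages used in this walkthrough
pkgs <- c("ggplot2", "sf", "dplyr", "tidylog", "ggiraph")

# Hypothetical helper: return the packages that cannot be loaded
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(pkgs)

# Install only what is missing, and only when run interactively
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install)
}
```

Running install.packages() only for the missing packages avoids reinstalling libraries that are already available.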
Download the county boundary Shapefile from the Minnesota Geospatial Commons (the page turns up in a Google search for "Minnesota Shapefile"). Use the read_sf function from the sf library to read the data into a table, which we will call mn.
mn <- read_sf("./mn_shapefile_2/", "mn_county_boundaries_500")
head(mn)
## Simple feature collection with 6 features and 12 fields
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: 190012.2 ymin: 5166465 xmax: 591752.2 ymax: 5472428
## Projected CRS: NAD83 / UTM zone 15N
## # A tibble: 6 × 13
## AREA PERIMETER CTYONLY_ CTYONLY_ID COUN CTY_NAME CTY_ABBR CTY_FIPS
## <dbl> <dbl> <dbl> <dbl> <int> <chr> <chr> <int>
## 1 4608320924. 388250. 2 1 39 Lake of th… LOTW 77
## 2 2862183702. 263017. 3 2 35 Kittson KITT 69
## 3 4347098503. 302591. 4 3 68 Roseau ROSE 135
## 4 8167237871. 412897. 5 4 36 Koochiching KOOC 71
## 5 4698732288. 374208. 6 5 45 Marshall MARS 89
## 6 17451037319. 682518. 7 6 69 St. Louis STLO 137
## # ℹ 5 more variables: MaxSimpTol <dbl>, MinSimpTol <dbl>, Shape_Leng <dbl>,
## # Shape_Area <dbl>, geometry <POLYGON [m]>
Next, load the population data from mn_county_bigdata.csv. Clean the data by keeping only the rows with the County geography type and the year 2010.
mn_clean <- read.csv("mn_county_bigdata.csv")
mn_clean <- mn_clean %>%
  filter(Year == 2010 & Geography.Type == "County")
head(mn_clean)
## Geography.Type Geography.Name Year Population Households
## 1 County Aitkin 2010 16,202 7,299
## 2 County Anoka 2010 330,844 121,227
## 3 County Becker 2010 32,504 13,224
## 4 County Beltrami 2010 44,442 16,846
## 5 County Benton 2010 38,451 15,079
## 6 County Big Stone 2010 5,269 2,293
## Persons.Per.Household..PPH.
## 1 2.18
## 2 2.70
## 3 2.42
## 4 2.51
## 5 2.48
## 6 2.24
Perform a left join onto our main data mn so that it gains the additional features.
# Add Households and Persons Per Household variables to the main data table
mn <- mn %>% left_join(mn_clean, by = c("CTY_NAME" = "Geography.Name"))
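A left join keeps every county from the shapefile, but any county whose name is spelled differently in the two sources ends up with NA in the joined columns. The sketch below shows one way to spot such names with setdiff(). The vectors shapefile_names and csv_names and the spelling mismatch in them are invented for illustration; they are not taken from the actual files.

```r
# Hypothetical county names from the two sources (made up for illustration)
shapefile_names <- c("Lake of the Woods", "St. Louis", "Kittson")
csv_names <- c("Lake of the Woods", "Saint Louis", "Kittson")

# Names in the shapefile that have no exact match in the CSV
unmatched <- setdiff(shapefile_names, csv_names)
print(unmatched)

# With the real tables, the same check would be:
# setdiff(mn$CTY_NAME, mn_clean$Geography.Name)
```

If the check returns any names, they can be harmonized in one of the tables before joining.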
Now, let’s create categories for the population from very small to very large.
# Define the breakpoints for the categories for population
breakpoints <- c(0, 5000, 10000, 50000, 100000, Inf)
# Strip the thousands separators and convert Population to integer
mn$Pop_int <- gsub(",", "", mn$Population) %>%
  as.integer()
# Create categories (intervals are closed on the right, e.g. 5,000 falls in "very small")
mn$pop_category <- cut(
  mn$Pop_int,
  breaks = breakpoints,
  labels = c(
    "very small (0-5,000)",
    "small (5,000 - 10,000)",
    "medium (10,000 - 50,000)",
    "large (50,000 - 100,000)",
    "very large (> 100,000)"
  ),
  include.lowest = TRUE
)
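To see how the conversion and binning behave at the category boundaries, here is a minimal sketch on a few made-up population strings. The toy_ variables are illustrative only; the breakpoints and labels are the ones used above.

```r
breakpoints <- c(0, 5000, 10000, 50000, 100000, Inf)
category_labels <- c(
  "very small (0-5,000)",
  "small (5,000 - 10,000)",
  "medium (10,000 - 50,000)",
  "large (50,000 - 100,000)",
  "very large (> 100,000)"
)

# Made-up values chosen to sit on and around a boundary
toy_population <- c("4,999", "5,000", "5,001", "330,844")

# Remove the commas, then convert the strings to integers
toy_int <- as.integer(gsub(",", "", toy_population))

# cut() uses right-closed intervals by default, so 5000 lands in the first bin
toy_category <- cut(toy_int, breaks = breakpoints,
                    labels = category_labels, include.lowest = TRUE)

print(data.frame(toy_population, toy_int, toy_category))
```

Because the intervals are closed on the right, a value exactly equal to a breakpoint is assigned to the lower category.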
The code chunk below combines the features we want to show for each county into a single character string. When you hover over a county on the map, this string is displayed as the tooltip, giving the county's name, population category, population, households, and persons per household.
mn <- mn %>%
  # paste0() joins the pieces without adding extra spaces between them
  mutate(info = paste0(
    "Name: ", CTY_NAME,
    "\nPopulation Category: ", pop_category,
    "\nPopulation: ", Population,
    "\nHouseholds: ", Households,
    "\nPersons Per Household: ", Persons.Per.Household..PPH.
  ))
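Because paste() separates its arguments with a space by default while paste0() does not, the choice between them affects how the tooltip text is spaced. A minimal sketch with made-up values (the county name and number are placeholders, not real data):

```r
# paste() inserts its default separator " " between every argument
with_paste <- paste("Name: ", "Example County", "\nPopulation: ", "12,345")

# paste0() concatenates the arguments exactly as given
with_paste0 <- paste0("Name: ", "Example County", "\nPopulation: ", "12,345")

cat(with_paste, "\n\n")
cat(with_paste0, "\n")
```

The paste() version carries a doubled space after each label and a trailing space before each line break, whereas the paste0() version does not.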
Let’s first plot a static choropleth map of Minnesota counties by population category in 2010. We build it with ggplot, using the layers and settings below.
plot <- ggplot(mn, aes(fill = pop_category)) +
  geom_sf() + # base layer: draw the county polygons
  geom_sf_interactive(aes(geometry = geometry, tooltip = info)) + # same polygons, with hover tooltips
  scale_fill_manual(
    values = c(
      "very small (0-5,000)" = "#eff3ff",
      "small (5,000 - 10,000)" = "#bdd7e7",
      "medium (10,000 - 50,000)" = "#6baed6",
      "large (50,000 - 100,000)" = "#3182bd",
      "very large (> 100,000)" = "#08519c")) +
  labs(title = "Minnesota counties by population category in 2010",
       subtitle = "The most populous counties are in eastern MN",
       fill = "",
       caption = "Long Truong (5/12/2023)\nSource: Minnesota Geospatial Commons and Demographic Center") +
  theme(plot.caption = element_text(hjust = 0)) +
  theme(legend.position = "right") +
  # remove the grid lines and the grey background
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        panel.background = element_blank()) +
  # hide the coordinate labels and ticks on both axes
  theme(axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks.y = element_blank())
plot
Finally, we pass the plot to girafe() to produce the interactive map. girafe() replaces the older ggiraph() function, which is deprecated.
final <- girafe(ggobj = plot)
final
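The hover and tooltip behaviour of the widget can be adjusted with girafe_options(). The sketch below uses a tiny made-up data frame with points instead of the county polygons so that it runs without the shapefile; the toy data and the CSS values are assumptions chosen for illustration, not settings from the original map.

```r
library(ggplot2)
library(ggiraph)

# Made-up data standing in for the map layer
toy <- data.frame(
  x = c(1, 2, 3),
  y = c(2, 1, 3),
  label = c("A: 10", "B: 20", "C: 30")
)

# data_id lets ggiraph know which element is being hovered
p <- ggplot(toy, aes(x, y)) +
  geom_point_interactive(aes(tooltip = label, data_id = label), size = 4)

widget <- girafe(ggobj = p)

# Highlight the hovered element and make the tooltip slightly transparent
widget <- girafe_options(
  widget,
  opts_hover(css = "fill:orange;"),
  opts_tooltip(opacity = 0.9)
)

widget
```

The same girafe_options() call should work on the map widget, provided the interactive layer also maps data_id (for example to the county name) so that hover highlighting has an element to target.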