Assignment 4

Improving an existing choropleth map

I’m interested in environmental health data in Maine, so I visit the Maine Tracking Network site. The Maine Center for Disease Control and Prevention operates the site, which contains data about environmental exposures and their health effects in Maine. I notice a new category titled “Cold-Related Illnesses” and look for downloadable data on Emergency Department (ED) visits for cold-related illnesses (e.g., hypothermia and frostbite). I choose the 2018 surveillance data and notice that the site offers a static Maine county choropleth map in addition to a table. I want to build an interactive version of this map, so I import the data and organize it to work with the mapping functions in leaflet.

devtools::install_github("dkahle/ggmap")
library(ggplot2)
library(maps)
library(ggmap)
library(leaflet)
library(sf)
library(tidyverse)
library(readxl)
library(htmlwidgets)
library(htmltools)
library(shiny)

state_maine <- map_data(map = "state") %>% filter(region == "maine")
str(state_maine)

## 'data.frame':    399 obs. of  6 variables:
##  $ long     : num  -70.7 -70.8 -70.8 -70.8 -70.8 ...
##  $ lat      : num  43.1 43.1 43.1 43.1 43.2 ...
##  $ group    : num  18 18 18 18 18 18 18 18 18 18 ...
##  $ order    : int  4959 4960 4961 4962 4963 4964 4965 4966 4967 4968 ...
##  $ region   : chr  "maine" "maine" "maine" "maine" ...
##  $ subregion: chr  NA NA NA NA ...

counties <- map_data("county")
# Keep Maine counties and title-case the names so they match the Excel file
maine_counties <- subset(counties, region == "maine") %>%
  rename(County = subregion, Long = long, Lat = lat,
         Group = group, Order = order, Region = region) %>%
  mutate(County = str_to_title(County))

# Read the ED visit table and make sure the rate column is numeric
maine_cold_illness <- read_excel("~/Downloads/ED_visits_cold_related_illnesses18.xlsx") %>%
  rename(County = Location) %>%
  mutate_at("Rate_per_100000", as.numeric)

maine_cold_illness %>% ggplot(aes(Rate_per_100000)) +
geom_histogram(color = "black", fill = "orange", bins = nclass.FD(maine_cold_illness$Rate_per_100000)) + theme_minimal()

Normality check

Before I map the data, I make a histogram to see how close to normal the distribution of rates is. The distribution is sharply peaked, with many observations clustered near the mean. This may affect how the colors are mapped, leading the viewer to think there isn’t much difference in ED visits by county. Given the non-normal distribution, I split the color gradient for my map into four bins.
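A rough numeric check can back up the visual impression from the histogram. The sketch below computes sample skewness and excess kurtosis in base R. The helper names and the `rates` vector are illustrative placeholders, not the Maine Tracking Network values; with the real data, `maine_cold_illness$Rate_per_100000` would be passed in instead.

```r
# Moment-based shape statistics (no small-sample correction).
sample_skewness <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

excess_kurtosis <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2 - 3   # 0 for a normal distribution
}

# Placeholder values for illustration only (NOT the real county rates).
rates <- c(10, 12, 12, 13, 13, 13, 14, 14, 15, 30)

sample_skewness(rates)   # > 0 suggests a long right tail
excess_kurtosis(rates)   # > 0 suggests a sharper peak or heavier tails than normal
shapiro.test(rates)      # a small p-value is evidence against normality
```

With only one observation per county, these statistics are noisy, so they are best read alongside the histogram rather than in place of it.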

Choosing color and features

I decide to use the yellow-orange-red (YlOrRd) ColorBrewer palette to display the county rates because the scale is coded so that red indicates the highest rate of ED visits, and I believe people associate red with elevated rates. I add some interactivity to the map so that when the viewer mouses over a county, they see the ED visit rate per 100,000 people and the 95% confidence interval. I include the 95% confidence interval because some intervals are quite wide (e.g., Piscataquis County), meaning the rate is less precise and viewers should interpret it with caution.

The interactive labels let viewers see the exact data value for each county without overloading the map with ink.
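To see how a four-class color scale divides the data, it can help to compute the class breaks directly. The sketch below uses base R only and compares equal-width breaks (which, as I understand it, is what `colorBin()` produces when `bins = 4` and `pretty = FALSE`) with quartile breaks (the approach `colorQuantile()` takes). The `rates` vector is a made-up placeholder, not the real county rates.

```r
# Placeholder values for illustration only (NOT the real county rates).
rates <- c(10, 12, 12, 13, 13, 13, 14, 14, 15, 30)

# Equal-width breaks: 4 classes spanning the minimum to the maximum.
equal_breaks <- seq(min(rates), max(rates), length.out = 5)

# Quartile breaks: roughly the same number of observations per class.
quartile_breaks <- quantile(rates, probs = seq(0, 1, by = 0.25), names = FALSE)

# Count how many observations fall in each class under both schemes.
table(cut(rates, equal_breaks, include.lowest = TRUE))
table(cut(rates, unique(quartile_breaks), include.lowest = TRUE))
```

When most observations land in a single equal-width class, as in this toy vector, quantile breaks spread the colors more evenly, at the cost of classes that cover unequal ranges.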

Maine ED visit rates for cold-related illnesses (by county), 2018:

maine_cold_illness_county <- maine_counties %>% inner_join(maine_cold_illness, by = "County")
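Because `inner_join()` silently drops rows that have no match, a quick comparison of the county names on both sides can catch spelling or capitalization mismatches before mapping. This is a sketch with placeholder name vectors; with the real data, the inputs would be `maine_counties$County` and `maine_cold_illness$County`.

```r
# Placeholder name vectors for illustration; the mismatch is deliberate.
map_names  <- c("Oxford", "Somerset", "Washington", "Piscataquis")
data_names <- c("Oxford", "Somerset", "Washington", "Piscataquis County")

# Names present on one side of the join but not the other.
missing_from_data <- setdiff(unique(map_names), unique(data_names))
missing_from_map  <- setdiff(unique(data_names), unique(map_names))

missing_from_data
missing_from_map
```

Two empty results would indicate that every county in the map data has a matching row in the ED visit table, and vice versa.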

maine_sp <- maine_counties %>%
  bind_rows(.id = "df_id") %>%
  st_as_sf(coords = c("Long", "Lat"), crs = 4326) %>%
  group_by(Group) %>%
  summarise(geometry = st_combine(geometry)) %>%
  st_cast("POLYGON") %>%
  inner_join(maine_cold_illness_county %>% distinct(Group, Rate_per_100000, County, CI), by = "Group")

tag.map.title <- tags$style(HTML("
  .leaflet-control.map-title { 
    transform: translate(-50%,20%);
    position: fixed !important;
    left: 50%;
    text-align: center;
    padding-left: 10px; 
    padding-right: 10px; 
    background: rgba(255,255,255,0.75);
    font-weight: bold;
    font-size: 28px;
  }
"))

title <- tags$div(
  tag.map.title, HTML("ED visit rate for cold-related illnesses in 2018")
)  

maine_cold_illness_2  <- leaflet(maine_sp) %>%
  setView(-69, 45.3, 6.5) %>%
  addProviderTiles("MapBox", options = providerTileOptions(
    id = "mapbox.light",
    # Sys.getenv() takes the NAME of an environment variable, not the token itself.
    # MAPBOX_ACCESS_TOKEN is an assumed name; set it in .Renviron before running.
    accessToken = Sys.getenv("MAPBOX_ACCESS_TOKEN"))) %>%
  # className must match the ".map-title" selector in the CSS defined above
  addControl(title, position = "bottomleft", className = "map-title")

qpal <- colorBin(palette = "YlOrRd", 
                 domain = maine_sp$Rate_per_100000, 
                 bins = 4, 
                 pretty = FALSE)

labels <- sprintf(
  "<strong>%s</strong><br/><strong>%g visits</strong> per 100,000 people<br/>95 percent CI: %s", maine_sp$County, maine_sp$Rate_per_100000, maine_sp$CI) %>% lapply(htmltools::HTML)

maine_cold_illness_2 %>%
  addPolygons(
    fillColor = ~qpal(Rate_per_100000),
    weight = 1,
    opacity = 1,
    color = "#800000",
    dashArray = "",
    fillOpacity = 0.7,
    # Style applied while the pointer hovers over a county
    highlightOptions = highlightOptions(
      weight = 3,
      color = "white",
      dashArray = "",
      fillOpacity = 1.0,
      bringToFront = TRUE),
    label = labels,
    labelOptions = labelOptions(
      style = list("font-weight" = "normal", padding = "3px 8px"),
      textsize = "15px",
      direction = "auto")) %>%
  addLegend(pal = qpal, values = ~Rate_per_100000, opacity = 0.7,
            title = "ED visit rate per 100,000", position = "bottomright")

High rates in Oxford, Somerset, and Washington counties

It appears that the counties with the highest rates of ED visits for cold-related illnesses are Oxford, Somerset, and Washington counties. I’m curious what factors may be involved in these high rates: occupation (workers exposed to cold conditions), poverty, housing, or outdoor recreation (snowmobiling, hiking).
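The ranking read off the map can also be checked against the table itself by sorting on the rate column. The sketch below uses a placeholder data frame with made-up county labels and rates, not the real values; with the real data, `maine_cold_illness` would take the place of `county_rates`.

```r
# Placeholder table for illustration only (NOT the real county rates).
county_rates <- data.frame(
  County = c("County A", "County B", "County C", "County D"),
  Rate_per_100000 = c(40, 55, 25, 70)
)

# Sort from highest to lowest rate and keep the top three rows.
top3 <- head(county_rates[order(-county_rates$Rate_per_100000), ], 3)
top3
```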

Jess DiBiase

02/24/2021
