Choropleth of HPAI in Birds in the US

Author

Dr Andrew Dalby

The first thing that you need to do is include the libraries that you are going to use for the analysis. This includes the mapping tools as well as the US census data that creates maps down to the county level. There are other available map libraries for the US but these are not available through CRAN.

library("maps")
library("ggplot2")
library("sf")
Linking to GEOS 3.13.0, GDAL 3.10.1, PROJ 9.5.1; sf_use_s2() is TRUE
library("tigris")
To enable caching of data, set `options(tigris_use_cache = TRUE)`
in your R script or .Rprofile.
library("dplyr")

Adjuntando el paquete: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
library("viridis")
Cargando paquete requerido: viridisLite

Adjuntando el paquete: 'viridis'
The following object is masked from 'package:maps':

    unemp
options(tigris_use_cache = TRUE)

The data for infections of commercial and backyard flocks comes from the CDC website. There are also wild bird datasets available. This data has counties by name but some are ambiguous as there are counties with the same name in multiple states. I created a joint name with the state and county in order to assign a unique fips number to each county. The fips file had to be edited as the existing list contained ambiguous spellings and so I used my own personal version for this project. This contains some cases where a fips code is assigned to multiple names and these were removed to make them unique.

The last step is to combine all of the outbreaks from a county into a single total number of birds affected but using summarise.

#Read in the data from the CDC csv file
avian_flu_data <- read.csv("commercial-backyard-flocks.csv")

county.fips <- read.csv("county_fips.csv")
county.fips <- county.fips |> distinct(fips, .keep_all=TRUE)

#Change the case of the County and State names to match the county.fips format
avian_flu <- avian_flu_data %>% mutate(County = tolower(County),
                                       State =tolower(State))

#Join the County and State names to create a county name the same as in the county.fips data
avian_flu <- avian_flu %>% mutate(county_name = paste(State,",",County,sep=""))

#Add the fips column to the data
avian_flu <- avian_flu %>% left_join(county.fips, by = c("county_name" = "polyname"))

#Collapse the rows to give a single total for each county
avian_flu1 <- avian_flu |> summarise(.by=fips, Deaths=sum(Flock.Size))

This assigns the geometric data for the shapes of each of the counties to the outbreak data, using the common terms fips and GEOID to match the files.

counties <- counties(cb = TRUE, year = 2020)
counties <- counties %>% mutate(GEOID = as.numeric(GEOID))
merged_data <- left_join(counties,avian_flu1, by = c("GEOID"="fips"))

GGPlot2 is used to plot the simple shapefiles and to shade them by an exponential gradient of the number of bird deaths for each county.

ggplot(data = merged_data) +
  geom_sf(aes(fill = Deaths), color = "grey70", size = 0.2) +
  coord_sf(xlim = c(-125, -65), ylim = c(24, 50)) +
  scale_fill_viridis(option = "plasma", na.value = "white", name = "Bird Deaths") +
  labs(
    title = "Avian Influenza Outbreaks by U.S. County",
    subtitle = "Data source: Commerical and Backyard Flocks",
    caption = "Source: CDC"
  ) 

The map for Alaska was created separately as it flattens out the map.

ggplot(data = merged_data) +
  geom_sf(aes(fill = Deaths), color = "gray70", size = 0.2) +
  coord_sf(xlim = c(-180, -120), ylim = c(52, 72)) +
  scale_fill_viridis(option = "plasma", na.value = "white", name = "Bird Deaths") +
  labs(
    title = "Avian Influenza Outbreaks in Alaska",
    subtitle = "Data source: Commerical and Backyard Flocks",
    caption = "Source: CDC"
  )

Data was also available for Puerto Rico and an additional map was created for the territory.

ggplot(data = merged_data) +
  geom_sf(aes(fill = Deaths), color = "gray70", size = 0.2) +
  coord_sf(xlim = c(-68, -64.5), ylim = c(17.5, 18.7)) +
  scale_fill_viridis(option = "plasma", na.value = "white", name = "Bird Deaths") +
  labs(
    title = "Avian Influenza Outbreaks in Puerto Rico",
    subtitle = "Data source: Commerical and Backyard Flocks",
    caption = "Source: CDC"
  )