Ohio County Food Insecurity Map

David Selvaraj

2022-03-14

Setup

library(knitr)
library(rmdformats)
library(broom)

## Global options

opts_chunk$set(comment=NA)
opts_knit$set(width=75)

Package Loading

library(Rcpp)
library(tmap)
library(tmaptools)
library(sf)
library(biscale)
library(insurancerating)
library(stringr)
library(cowplot)
library(cartogram)
library(Hmisc)
library(magrittr)
library(janitor)
# add other packages here as desired
library(tibble)
library(tidyverse)

Import the `oh_counties_2020` Data and apply clean_names()

Here, I imported the data from the csv file, used clean_names() from the janitor package to remove “-” from the variable names , replaced them with “_” and ensured all of the names are in snake case, which is the default option of the clean_names function. I also set fips as a character data type because it an id string and not a numeric value. Finally, I selected the variables county,food_insecure, african_am, population and displayed the tibble, which has 88 observations and 4 variable.

oh20 <- read_csv("/Users/ddmsel/Desktop/PQHS432/data/oh_counties_2020.csv") 
oh20 <- oh20 %>%
    mutate(fips = as.character(fips))%>%
    clean_names()%>% 
    arrange(county)%>%
    select(county,food_insecure, african_am,population)

oh20

1 Map

1.1 Import and Filter County Map Data (.shp) from census.gov

Below, is the code used to make a county level map comparing race and income in Ohio. First, I imported the lines or county borders for the entire US in 2020 from the .shp file I downloaded from census.gov.[1] I then filtered the dataframe by the STATEFP variable == to “39” to get only the county borders for Ohio. Finally, I applied the clean names function to the dateframe to convert all of the variable names to snake_case.

oh_county_shp <- st_read(dsn ="/Users/ddmsel/Desktop/PQHS432/data/cb_2020_us_county_500k.shp")

Reading layer `cb_2020_us_county_500k' from data source 
  `/Users/ddmsel/Desktop/PQHS432/data/cb_2020_us_county_500k.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 3234 features and 12 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -179.1489 ymin: -14.5487 xmax: 179.7785 ymax: 71.36516
Geodetic CRS:  NAD83

oh_county_shp <- oh_county_shp %>%
              filter(., STATEFP == "39") %>%
              clean_names()%>% 
              arrange(name)%>%
              select(name,geometry)

Below is the dataframe with 88 rows corresponing to the 88 counties in Ohio and 18 variables including the geometry variable, which shows the boundaries for the counties.

oh_county_shp

Here, I did a sanity check to see if the counties were in the same order, had the same amount of elements and that the elements in each dataframe were identical.

str(oh_county_shp$name)

 chr [1:88] "Adams" "Allen" "Ashland" "Ashtabula" "Athens" "Auglaize" ...

str(oh20$county)

 chr [1:88] "Adams" "Allen" "Ashland" "Ashtabula" "Athens" "Auglaize" ...

identical(oh_county_shp$name,oh20$county)

[1] TRUE

1.2 Join oh_county_shp and oh20

Next, I joined the oh20, County Health Data, to the oh_county_shp using the dplyr package lefft_join. Note in this case left_join was needed so that the resulting mergeded dataframe was not a tibble becuase tibbles do not support the geometry column needed for mapping.

oh20 <- oh_county_shp %>% 
        left_join(oh20, by = c("name"="county"))
oh20

Using the describe fuction of the Hmisc package, I generated some summary statistics of the food_insecure population and the african-am varaible. Thre is no missing data for any variable and the rages of values seem reasonable for each percentage.

describe(oh20$food_insecure)

oh20$food_insecure 
       n  missing distinct     Info     Mean      Gmd      .05      .10 
      88        0       55    0.999    13.46    2.722     9.60    10.40 
     .25      .50      .75      .90      .95 
   11.80    13.50    15.03    16.43    17.20 

lowest :  8.1  8.9  9.2  9.6  9.7, highest: 17.2 17.4 17.5 18.4 19.3

describe(oh20$african_am)

oh20$african_am 
       n  missing distinct     Info     Mean      Gmd      .05      .10 
      88        0       83        1    4.254    5.078   0.4775   0.6040 
     .25      .50      .75      .90      .95 
  0.9075   2.1200   4.4150   8.9120  18.0075 

lowest :  0.34  0.37  0.38  0.44  0.46, highest: 19.60 21.09 22.90 26.21 29.61

describe(oh20$population)

oh20$population 
       n  missing distinct     Info     Mean      Gmd      .05      .10 
      88        0       88        1   132835   160359    16429    27084 
     .25      .50      .75      .90      .95 
   36760    58295   123498   255359   496480 

lowest :   13139   13790   14354   14604   15174
highest:  532331  541918  816684 1243857 1310300

Below, I use the bi_class fuction of the biscale package to convert the percentiles of food_insecure and african_am to a biscale ranging from 1-1 indicating a low percentage of African Americans (Black) and a low percentage of food insecurity and 3-3 indicating a high percentage of food insecurity and a high level of African Americans. Breaks for the classification can be specificied with the style parameter as “quintile” or “fisher”. To better approximate the natural clustering in race and food insecurity I used fisher breaks.

data <- bi_class(oh20, x = african_am, y = food_insecure, style = "fisher", dim = 3)
data <- st_transform(data, 2163)
data%>% arrange(desc(bi_class))

The frequncy of each biclass is listed below.

describe(data$bi_class)

data$bi_class 
       n  missing distinct 
      88        0        7 

lowest : 1-1 1-2 1-3 2-1 2-2, highest: 1-3 2-1 2-2 2-3 3-3
                                                    
Value        1-1   1-2   1-3   2-1   2-2   2-3   3-3
Frequency     32    23    12     2    11     3     5
Proportion 0.364 0.261 0.136 0.023 0.125 0.034 0.057

The levels of the breaks for food_insecure are [8.1,12.7] (12.7,15.3] (15.3,19.3] and are [0.34,4.91] (4.91,17.3] (17.3,29.6] for african_am.

fisher(oh20$food_insecure, n = 3, diglab = 3)[[1]]

[1] (15.3,19.3]
Levels: [8.1,12.7] (12.7,15.3] (15.3,19.3]

fisher(oh20$african_am, n = 3, diglab = 3)[[1]]

[1] [0.34,4.91]
Levels: [0.34,4.91] (4.91,17.3] (17.3,29.6]

1.3 Map of Race and Food Insecurity in 2020 by County

Below is the plot of the bivarite legend and the map using the the geom_sf package of ggplot.

legend <- bi_legend(pal = "DkBlue",
                    dim = 3,
                    xlab = "% Black",
                    ylab = "% Food Insecure",
                    size = 7)


# create map
map <- ggplot() +
  geom_sf(data = data, mapping = aes(fill = bi_class), color = "white", size = 0.1, show.legend = FALSE) +
  bi_scale_fill(pal = "DkBlue", dim = 3) +
  
  bi_theme()

To draw the “final plot” the cowplot package was used to combine the plot and the legend and the the first final plot was displayed below. While this map did show the relationship among race, food insecurity and the location of the county, population of the county also seemed to be an important variable and was not conveyed by this plot.

# combine map with legend
finalPlot1 <- ggdraw() +
  draw_plot(map,  x = 0, y = .1, width = .9, height = .9, scale = 1) +
  draw_plot(legend, 0.2, 0.1, 1, .2)



finalPlot1

1.4 Map of Race, Food Insecurity and Polulation

The plot above conveyed three variables: percentage of Blacks, percentage of food insecurity and the spatial location of the county. Population seemed to be important to the trend among these variables. To add population to the map I decided to resize the shapes of the county bondaries based on population using the cartogram pacakge. Below is the code to transform the map polygons from a geographic to a projected coordinate system using st_transform(), which is needed for the cartogram. Then I reassigned the bi_class to the new dataframe as I did above. Finally, I used the cartogram_ncont() package to resize the county polygons proportionally based on the “population” of the county.

oh_map = st_transform(oh20, 2163)
oh_map <- bi_class(oh_map, x = african_am, y = food_insecure, style = "fisher", dim = 3)
oh_map_cart = cartogram_ncont(oh_map, "population", k = 1, inplace = TRUE)

The plot of the new map is created using ggplot in a method similar to above.

map2 <- ggplot() +
  geom_sf(data = oh_map_cart, mapping = aes(fill = bi_class), color = "white", size = 0.1, show.legend = FALSE) +
  bi_scale_fill(pal = "DkBlue", dim = 3) +
  
  bi_theme() 


# combine map with legend
finalPlot2 <- ggdraw() +
  draw_plot(map2,  x = 0, y = .1, width = .9, height = .9, scale = 1) +
  draw_plot(legend, 0.2, 0.1, 1, 0.2)



finalPlot2

1.5 Figure1 Final Combined and Labeled Map

The code below usese the cowplot() package to combine the plots, legend and add a title to the plot. The final combined plot is displayed below and labled Figure1.

plot_grid<- plot_grid(map, map2, legend, ncol = 3, scale = 1,  labels = c('A', 'B', ''), rel_widths= c(1, 1, .5))

# now add the title
title <- ggdraw() + 
  draw_label(
    "Figure 1: Spatial Relationship among Race, Food Insecurity and Population Level\nfrom Ohio County Health Ranking Data 2020 ",
    fontface = 'bold',
    x = 0,vjust=.5,
    hjust = 0,
    size = 13
  ) +
  theme(
    # add margin on the left of the drawing canvas,
    # so title is aligned with left edge of first plot
    plot.margin = margin(0, 0, 0, 7)
    

  )

caption <- ggdraw() + 
  draw_label(str_wrap(
    "Percentage Black and percentage food insecure are shown as a biscale. Blue indicates high Black % and high % of food insecurity. Population of the county is shown by the relative size of the county outline in B. High percentages of food insecurity and high percentages Black residents are in populous counties. Additionally, there is a cluster of largely white counties with high food insecurity South-East Ohio along the West Virginia border. (Breaks for % Black are 4.9% and 17.3% and breaks for % food insecure are 12.7% and 15.3%)",width = 140),
    
    x = 0,
    hjust = 0,
    size = 8
  ) 
plot_grid(
  title,  plot_grid,caption, 
  ncol = 1,
  # rel_heights values control vertical title margins
  rel_heights = c(0.3, 1)
)

Figure 1: Percentage Black and percentage food insecure are shown as a biscale. Blue indicates high Black % and high % of food insecurity. Population of the county is shown by the relative size of the county outline in B. High percentages of food insecurity and high percentages Black residents are in populous counties. Additionally, there is a cluster of largely white counties with high food insecurity South-East Ohio along the West Virginia border. (Fisher Breaks for % Black are 4.9% and 17.3% and breaks for % food insecure are 12.7% and 15.3%)

#References

[1] https://www.census.gov/geographies/mapping-files/time-series/geo/cartographic-boundary.html

Session Information

xfun::session_info()

R version 4.1.3 (2022-03-10)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.2.1

Locale: en_US.UTF-8 / en_US.UTF-8 / en_US.UTF-8 / C / en_US.UTF-8 / en_US.UTF-8

Package version:
  abind_1.4-5             ape_5.6.2               arm_1.12-2             
  askpass_1.1             assertthat_0.2.1        backports_1.4.1        
  base64enc_0.1-3         biscale_0.2.0           bit_4.0.4              
  bit64_4.0.5             blob_1.2.2              bookdown_0.24          
  boot_1.3-28             brio_1.1.3              broom_0.7.12           
  bslib_0.3.1             cachem_1.0.6            callr_3.7.0            
  cartogram_0.2.2         cellranger_1.1.0        checkmate_2.0.0        
  ciTools_0.6.1           class_7.3-20            classInt_0.4-3         
  cli_3.2.0               clipr_0.8.0             cluster_2.1.2          
  coda_0.19-4             codetools_0.2-18        colorspace_2.0-3       
  commonmark_1.8.0        compiler_4.1.3          cowplot_1.1.1          
  cpp11_0.4.2             crayon_1.5.0            crosstalk_1.2.0        
  curl_4.3.2              data.table_1.14.2       DBI_1.1.2              
  dbplyr_2.1.1            desc_1.4.1              DHARMa_0.4.5           
  dichromat_2.0-0         diffobj_0.3.5           digest_0.6.29          
  doParallel_1.0.17       dplyr_1.0.8             dtplyr_1.2.1           
  e1071_1.7-9             ellipsis_0.3.2          evaluate_0.15          
  evtree_1.0-8            fansi_1.0.2             farver_2.1.0           
  fastmap_1.1.0           fitdistrplus_1.1-8      fontawesome_0.2.2      
  forcats_0.5.1           foreach_1.5.2           foreign_0.8-82         
  Formula_1.2-4           fs_1.5.2                gap_1.2.3.1            
  gargle_1.2.0            generics_0.1.2          geojsonsf_2.0.2        
  geometries_0.2.0        ggplot2_3.3.5           glue_1.6.2             
  googledrive_2.0.0       googlesheets4_1.0.0     graphics_4.1.3         
  grDevices_4.1.3         grid_4.1.3              gridExtra_2.3          
  gtable_0.3.0            haven_2.4.3             highr_0.9              
  Hmisc_4.6-0             hms_1.1.1               htmlTable_2.4.0        
  htmltools_0.5.2         htmlwidgets_1.5.4       httpuv_1.6.5           
  httr_1.4.2              ids_1.0.1               insight_0.16.0         
  insurancerating_0.6.9   inum_1.0-4              isoband_0.2.5          
  iterators_1.0.14        janitor_2.1.0           jpeg_0.1-9             
  jquerylib_0.1.4         jsonify_1.2.1           jsonlite_1.8.0         
  KernSmooth_2.23-20      knitr_1.37              labeling_0.4.2         
  later_1.3.0             lattice_0.20-45         latticeExtra_0.6-29    
  lazyeval_0.2.2          leafem_0.1.6            leaflet_2.1.0          
  leaflet.providers_1.9.0 leafsync_0.1.0          libcoin_1.0-9          
  lifecycle_1.0.1         lme4_1.1-28             lmtest_0.9.39          
  lubridate_1.8.0         lwgeom_0.2-8            magrittr_2.0.2         
  markdown_1.1            MASS_7.3-55             Matrix_1.4-0           
  methods_4.1.3           mgcv_1.8-39             mime_0.12              
  minqa_1.2.4             modelr_0.1.8            munsell_0.5.0          
  mvtnorm_1.1-3           nlme_3.1-155            nloptr_2.0.0           
  nnet_7.3-17             openssl_2.0.0           packcircles_0.3.4      
  parallel_4.1.3          partykit_1.2-15         patchwork_1.1.1        
  pillar_1.7.0            pkgconfig_2.0.3         pkgload_1.2.4          
  plyr_1.8.6              png_0.1-7               praise_1.0.0           
  prettyunits_1.1.1       processx_3.5.2          progress_1.2.2         
  promises_1.2.0.1        proxy_0.4-26            ps_1.6.0               
  purrr_0.3.4             qgam_1.3.4              R6_2.5.1               
  rapidjsonr_1.2.0        rappdirs_0.3.3          raster_3.5-15          
  RColorBrewer_1.1-2      Rcpp_1.0.8              RcppEigen_0.3.3.9.1    
  readr_2.1.2             readxl_1.3.1            rematch_1.0.1          
  rematch2_2.1.2          reprex_2.0.1            rlang_1.0.2            
  rmarkdown_2.13          rmdformats_1.0.3        rpart_4.1.16           
  rprojroot_2.0.2         rstudioapi_0.13         rvest_1.0.2            
  s2_1.0.7                sass_0.4.0              scales_1.1.1           
  selectr_0.4.2           sf_1.0-7                sfheaders_0.4.0        
  shiny_1.7.1             snakecase_0.11.0        sourcetools_0.1.7      
  sp_1.4-6                splines_4.1.3           stars_0.5-5            
  stats_4.1.3             stringi_1.7.6           stringr_1.4.0          
  survival_3.3-1          sys_3.4                 terra_1.5-21           
  testthat_3.1.2          tibble_3.1.6            tidyr_1.2.0            
  tidyselect_1.1.2        tidyverse_1.3.1         tinytex_0.37           
  tmap_3.3-3              tmaptools_3.1-1         tools_4.1.3            
  tzdb_0.2.0              units_0.8-0             utf8_1.2.2             
  utils_4.1.3             uuid_1.0.3              vctrs_0.3.8            
  viridis_0.6.2           viridisLite_0.4.0       vroom_1.5.7            
  waldo_0.3.1             widgetframe_0.3.1       withr_2.5.0            
  wk_0.6.0                xfun_0.30               XML_3.99-0.9           
  xml2_1.3.3              xtable_1.8.4            yaml_2.3.5             
  zoo_1.8.9