Introduction

In this document I display a map of example Howard County Council districts produced by running the Auto-Redistrict application to create fifteen districts, each electing one member using ranked choice voting. I also show racial/ethnic population percentages and (estimated) party vote shares by district.

For readers unfamiliar with the R statistical software and the Tidyverse packages I use to manipulate and plot data, I’ve included some additional explanation of various steps. For more information, see the various ways to learn more about the Tidyverse.

Setup and data preparation

Libraries

I use the following packages:

  • tidyverse: do general data manipulation.
  • sf: manipulate geospatial data.
  • tigris: get data on roads.
  • tools: compute MD5 checksums.
  • knitr: display data in tabular format.

library(tidyverse)
library(sf)
library(tigris)
library(tools)
library(knitr)

Data sources

I use data from the following sources; see the References section of part 1 for more information:

  • Boundaries for Howard County precincts are from the shapefile produced by part 1 of this analysis. These are the precincts used in the 2018 general election for county council.
  • Assignment of precincts to new districts is from the Auto-Redistrict program.

Reading in and preparing the data

I first read in the shapefile containing Howard County precinct boundaries and related data, as produced by part 1 of this analysis.

redistricting_sf <- st_read("redistricting-input.shp")
## Reading layer `redistricting-input' from data source 
##   `/home/fhecker/src/hocodata/redistricting/redistricting-input.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 118 features and 12 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: 1259411 ymin: 523188.7 xmax: 1398344 ymax: 620119.3
## Projected CRS: NAD83 / Maryland (ftUS)

I then read in the data on new districts produced by the Auto-Redistrict application. I create a new factor variable District based on the original numeric variable Distrct. I leave the Precnct variable as is because it will be used as a common field with the geospatial data table. I retain only the District and Precnct fields, since all the other fields are duplicated in the geospatial data table originally used as input to the Auto-Redistrict application.

new_districts_tb <- read_csv("redistricting-output-15-districts.csv",
                             show_col_types = FALSE) %>%
  mutate(District = as.factor(Distrct)) %>%
  select(District, Precnct)
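Because Precnct is the only field linking the two tables, it can be worth checking the key before joining. The following is a minimal sketch of such a check, not part of the original analysis; the precinct identifiers and table contents are made up for illustration, and only the column names Distrct, District, and Precnct come from the code above.

```r
library(dplyr)

# Hypothetical toy tables standing in for the shapefile data and the
# Auto-Redistrict output; the precinct IDs are invented for illustration.
precincts <- tibble(Precnct = c("01-001", "01-002", "02-001"))

assignments <- tibble(Distrct = c(1, 2),
                      Precnct = c("01-001", "02-001")) %>%
  mutate(District = as.factor(Distrct)) %>%
  select(District, Precnct)

# Precincts with no district assignment; these would get District = NA
# after a left join.
unassigned <- anti_join(precincts, assignments, by = "Precnct")

# Precincts assigned more than once; these would produce duplicate rows
# after a left join.
duplicated_precincts <- assignments %>%
  count(Precnct) %>%
  filter(n > 1)

print(unassigned)
print(duplicated_precincts)
```

If both results are empty for the real data, every precinct is assigned to exactly one district.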

To help orient readers to the locations of the district boundaries, I also want the maps to display major roads in Howard County that correspond in whole or in part to district or precinct boundaries. I use the tigris function roads() to return geometry for all Howard County roads.

Because I don’t need or want to display each and every Howard County road, I use the RTTYP and FULLNAME variables to filter the results to retain only major roads (interstate and U.S. highways) and significant minor roads (Maryland state routes and roads with “Parkway” in their names, matched in the code by the abbreviation “Pkwy”). I store the geometry for each in separate variables so that I can plot them at different widths.

all_roads <- roads(state = "MD", county = "Howard County", class = "sf", progress_bar = FALSE)

major_roads_geo <- all_roads %>%
  filter(RTTYP %in% c("I", "U")) %>%
  st_geometry()

minor_roads_geo <- all_roads %>%
  filter(RTTYP == "S" | str_detect(FULLNAME, "Pkwy")) %>%
  st_geometry()
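To show how the two filters behave, here is a small sketch on an ordinary data table rather than real geospatial data. The road names and their formatting are illustrative assumptions, not rows taken from the tigris results; only the RTTYP and FULLNAME column names and the filter conditions come from the code above.

```r
library(dplyr)
library(stringr)

# Hypothetical rows mimicking the RTTYP and FULLNAME columns; the last row
# stands in for an unnamed road segment with missing values.
toy_roads <- tibble(
  FULLNAME = c("I- 95", "US Hwy 29", "State Rte 32",
               "Little Patuxent Pkwy", "Main St", NA),
  RTTYP    = c("I", "U", "S", "M", "M", NA)
)

# Major roads: interstate ("I") and U.S. ("U") highways.
major <- toy_roads %>%
  filter(RTTYP %in% c("I", "U"))

# Minor roads: state routes ("S") or any road whose name contains "Pkwy".
# Rows where the condition evaluates to NA are dropped by filter().
minor <- toy_roads %>%
  filter(RTTYP == "S" | str_detect(FULLNAME, "Pkwy"))

print(major$FULLNAME)
print(minor$FULLNAME)
```

In this toy table the parkway is kept even though its route type is not "S", while the unnamed segment is dropped from both results.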

Analysis

I plot the new districts along with the major (and some minor) roads to show how the new districts would relate to existing communities in Howard County.

palette <- c("#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2",
             "#D55E00", "#CC79A7", "#999999", "#E69F00", "#56B4E9",
             "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")

redistricting_sf %>%
  left_join(new_districts_tb, by = "Precnct") %>%
  ggplot(aes(fill = District, geometry = geometry)) +
  geom_sf(size = 0) +
  scale_fill_manual(values = palette) +
  geom_sf(data = major_roads_geo, color = "white", size = 1.0, fill = NA) +
  geom_sf(data = minor_roads_geo, color = "white", size = 0.5, fill = NA) +
  labs(title="Example 15-District Howard County Council",
       subtitle = "Automatically Generated using 2020 Census and 2018 Election Data",
       caption = paste0(
         "Data sources:",
         "\n  Maryland Department of Planning, 2020 Redistricting Data",
         "\n  Maryland Board of Elections, 2018 General Election Results",
         "\n  Howard County GIS Division, Precinct Boundaries",
         "\nCreated using the Auto-Redistrict application and tidyverse R package"
       )
  ) +
  theme(plot.caption = element_text(margin = margin(t = 15), hjust = 0)) +
  theme(axis.ticks = element_blank(), axis.text = element_blank()) +
  theme(panel.background = element_blank()) +
  theme(legend.position = "none")
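Note that the fill palette in the map code lists eight distinct colors and then repeats the first seven to reach fifteen entries. As a small sketch (my own restatement, assuming the District factor levels are in numeric order), the same vector can be built by recycling the eight base colors, which also makes it easy to see which districts share a color:

```r
# The eight distinct colors that appear in the map palette.
base_colors <- c("#E69F00", "#56B4E9", "#009E73", "#F0E442",
                 "#0072B2", "#D55E00", "#CC79A7", "#999999")

# Recycle them to cover all 15 districts.
district_palette <- rep_len(base_colors, 15)

# Positions that reuse a color already assigned to an earlier district.
reused <- which(duplicated(district_palette))
print(reused)
# → 9 10 11 12 13 14 15
```

Districts with the same fill color are distinguishable on the map only where they do not border each other.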

The precinct-level data in redistricting_sf contains population figures by race. I combine the precinct-level data (in redistricting_sf) with the assignment of precincts to council districts produced by the Auto-Redistrict application (in new_districts_tb), and then drop the geometry for the precinct boundaries.

district_data <- redistricting_sf %>%
  left_join(new_districts_tb, by = "Precnct") %>%
  st_drop_geometry()

I compute the racial/ethnic breakdowns for each district.

racial_breakdown <- district_data %>%
  group_by(District) %>%
  summarize(Adj_Ppl = sum(Adj_Ppl),
            Ad_NH_W = sum(Ad_NH_W),
            Ad_NH_B = sum(Ad_NH_B),
            Ad_NH_A = sum(Ad_NH_A),
            Ad_NH_R = sum(Ad_NH_R),
            Adj_H_O = sum(Adj_H_O)) %>%
  mutate(`NH White` = round(100 * Ad_NH_W / Adj_Ppl),
         `NH Black` = round(100 * Ad_NH_B / Adj_Ppl),
         `NH Asian` = round(100 * Ad_NH_A / Adj_Ppl),
         `NH Multiracial` = round(100 * Ad_NH_R / Adj_Ppl),
         Hispanic = round(100 * Adj_H_O / Adj_Ppl)) %>%
  select(District, `NH White`, `NH Black`, `NH Asian`, `NH Multiracial`, Hispanic)
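The code above sums the population counts within each district first and only then converts to percentages. The sketch below uses made-up counts (only the column names come from the data above) to show why that order matters: averaging precinct-level percentages would give small precincts the same weight as large ones.

```r
library(dplyr)

# Hypothetical precinct counts; district 1 has one small and one large precinct.
toy <- tibble(
  District = factor(c(1, 1, 2)),
  Adj_Ppl  = c(1000, 3000, 2000),
  Ad_NH_W  = c(500, 700, 1000)
)

# Sum counts per district, then compute the percentage (as in the analysis).
from_sums <- toy %>%
  group_by(District) %>%
  summarize(Adj_Ppl = sum(Adj_Ppl),
            Ad_NH_W = sum(Ad_NH_W)) %>%
  mutate(`NH White` = round(100 * Ad_NH_W / Adj_Ppl))

# For contrast: the unweighted mean of precinct percentages.
from_means <- toy %>%
  mutate(Pct = 100 * Ad_NH_W / Adj_Ppl) %>%
  group_by(District) %>%
  summarize(`NH White` = round(mean(Pct)))

print(from_sums)
print(from_means)
```

For district 1 in this toy data, the summed counts give 30 percent (1,200 of 4,000), while the unweighted mean of the two precinct percentages gives 37 percent.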

I plot the racial/ethnic breakdowns by district.

palette <- c("#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2")

racial_breakdown %>%
  gather(key = "Race", value = Pop_Pct, `NH White`:Hispanic) %>%
  mutate(Race = fct_relevel(Race, c("NH White", "NH Black", "NH Asian", "NH Multiracial", "Hispanic"))) %>%
  ggplot(mapping = aes(x = District, y = Pop_Pct, fill = Race)) +
  geom_col(position = "dodge") +
  scale_fill_manual(values = palette) +
  labs(title="15-District Howard County Council Population Percentage By Race/Ethnicity",
       subtitle = "Automatically Generated using 2020 Census and 2018 Election Data",
       caption = paste0(
         "Data sources:",
         "\n  Maryland Department of Planning, 2020 Redistricting Data",
         "\n  Maryland Board of Elections, 2018 General Election Results",
         "\n  Howard County GIS Division, Precinct Boundaries",
         "\nCreated using the Auto-Redistrict application and tidyverse R package"
       ),
       fill = "Race/Ethnicity"
  ) +
  ylab("Population (%)") +
  theme_bw() +
  theme(plot.caption = element_text(margin = margin(t = 15), hjust = 0))
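The gather() call above reshapes the table from one column per group to one row per district and group. The tidyr documentation now recommends pivot_longer() for new code; the following sketch shows what I believe is the equivalent call, using two rows copied from the racial/ethnic table in this document. Row order differs from gather(), which does not affect the plot.

```r
library(dplyr)
library(tidyr)

# Districts 1 and 2 from the racial/ethnic percentage table.
toy_breakdown <- tibble(
  District         = factor(c(1, 2)),
  `NH White`       = c(30, 57),
  `NH Black`       = c(28, 7),
  `NH Asian`       = c(21, 29),
  `NH Multiracial` = c(6, 4),
  Hispanic         = c(16, 3)
)

# One row per district and group: column names go to Race, values to Pop_Pct.
long <- toy_breakdown %>%
  pivot_longer(cols = `NH White`:Hispanic,
               names_to = "Race",
               values_to = "Pop_Pct")

print(long)
```

The result has ten rows (two districts times five groups) with the columns District, Race, and Pop_Pct, which is the shape the plotting code expects.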

I also print the same data in the form of a table:

racial_breakdown %>%
  kable(caption = "15-District Howard County Council Population Percentage By Race/Ethnicity")

15-District Howard County Council Population Percentage By Race/Ethnicity

District  NH White  NH Black  NH Asian  NH Multiracial  Hispanic
--------  --------  --------  --------  --------------  --------
1               30        28        21               6        16
2               57         7        29               4         3
3               51        16        20               6         7
4               73         5        11               6         4
5               56        11        24               5         4
6               43        28        13               6         9
7               47        20        18               6         9
8               34        15        41               4         6
9               48        20        21               5         6
10              58         6        26               6         4
11              45        25        16               7         8
12              53        11        24               6         5
13              26        38        14               6        16
14              39        33         8               7        13
15              42        25        15               7        12

I compute the party breakdown for each district, using the estimated precinct-level votes by party in the 2018 general election for Howard County Council. The percentages are shares of the combined Democratic and Republican vote.

party_breakdown <- district_data %>%
  mutate(Votes = CCn.DEM + CCn.REP) %>%
  group_by(District) %>%
  summarize(Votes = sum(Votes),
            Dem_Votes = sum(CCn.DEM),
            Rep_Votes = sum(CCn.REP)) %>%
  mutate(Democratic = round(100 * Dem_Votes / Votes),
         Republican = round(100 * Rep_Votes / Votes)) %>%
  select(District, Democratic, Republican)
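Because the denominator is the sum of the two parties' votes, the two shares for a district add up to 100 percent before rounding. Here is a minimal sketch with made-up vote counts (only the column names CCn.DEM and CCn.REP come from the data above):

```r
library(dplyr)

# Hypothetical precinct-level vote estimates for two districts.
toy_votes <- tibble(
  District = factor(c(1, 1, 2)),
  CCn.DEM  = c(600, 900, 400),
  CCn.REP  = c(400, 100, 600)
)

# Sum votes per district, then compute each party's share of the
# two-party total.
toy_shares <- toy_votes %>%
  group_by(District) %>%
  summarize(Dem_Votes = sum(CCn.DEM),
            Rep_Votes = sum(CCn.REP)) %>%
  mutate(Votes      = Dem_Votes + Rep_Votes,
         Democratic = round(100 * Dem_Votes / Votes),
         Republican = round(100 * Rep_Votes / Votes)) %>%
  select(District, Democratic, Republican)

print(toy_shares)
```

In this toy data district 1 comes out at 75 to 25 and district 2 at 40 to 60.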

I plot the party breakdowns by district.

palette <- c("#0072B2", "#D55E00")

party_breakdown %>%
  gather(key = "Party", value = Vote_Pct, Democratic:Republican) %>%
  mutate(Party = fct_relevel(Party, c("Democratic", "Republican"))) %>%
  ggplot(mapping = aes(x = District, y = Vote_Pct, fill = Party)) +
  geom_col(position = "dodge") +
  scale_fill_manual(values = palette) +
  ylab("Estimated Vote (%)") +
  labs(title="15-District Howard County Council Estimated Party Vote Share",
       subtitle = "Automatically Generated using 2020 Census and 2018 Election Data",
       caption = paste0(
         "Data sources:",
         "\n  Maryland Department of Planning, 2020 Redistricting Data",
         "\n  Maryland Board of Elections, 2018 General Election Results",
         "\nCreated using the Auto-Redistrict application and tidyverse R package"
       )
  ) +
  theme_bw() +
  theme(plot.caption = element_text(margin = margin(t = 15), hjust = 0))

I also print the same data in the form of a table:

party_breakdown %>%
  kable(caption = "15-District Howard County Council Estimated Party Vote Share")

15-District Howard County Council Estimated Party Vote Share

District  Democratic  Republican
--------  ----------  ----------
1                 74          26
2                 49          51
3                 58          42
4                 39          61
5                 60          40
6                 76          24
7                 61          39
8                 67          33
9                 71          29
10                62          38
11                74          26
12                58          42
13                80          20
14                80          20
15                72          28

Appendix

Caveats

See part 1.

References

See part 1.

Suggestions for others

See part 1.

Environment

I used the following R environment in doing the analysis above:

sessionInfo()
## R version 4.1.1 (2021-08-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.3 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] tools     stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] knitr_1.33      tigris_1.4.1    sf_1.0-2        forcats_0.5.1  
##  [5] stringr_1.4.0   dplyr_1.0.7     purrr_0.3.4     readr_2.0.1    
##  [9] tidyr_1.1.3     tibble_3.1.4    ggplot2_3.3.5   tidyverse_1.3.1
## 
## loaded via a namespace (and not attached):
##  [1] httr_1.4.2         sass_0.4.0         bit64_4.0.5        vroom_1.5.4       
##  [5] jsonlite_1.7.2     modelr_0.1.8       bslib_0.3.0        assertthat_0.2.1  
##  [9] highr_0.9          sp_1.4-5           cellranger_1.1.0   yaml_2.2.1        
## [13] pillar_1.6.2       backports_1.2.1    lattice_0.20-44    glue_1.4.2        
## [17] uuid_0.1-4         digest_0.6.27      rvest_1.0.1        colorspace_2.0-2  
## [21] htmltools_0.5.2    pkgconfig_2.0.3    broom_0.7.9        haven_2.4.3       
## [25] scales_1.1.1       tzdb_0.1.2         proxy_0.4-26       farver_2.1.0      
## [29] generics_0.1.0     ellipsis_0.3.2     withr_2.4.2        cli_3.0.1         
## [33] magrittr_2.0.1     crayon_1.4.1       readxl_1.3.1       maptools_1.1-1    
## [37] evaluate_0.14      fs_1.5.0           fansi_0.5.0        xml2_1.3.2        
## [41] foreign_0.8-81     class_7.3-19       hms_1.1.0          lifecycle_1.0.0   
## [45] munsell_0.5.0      reprex_2.0.1       compiler_4.1.1     jquerylib_0.1.4   
## [49] e1071_1.7-8        rlang_0.4.11       classInt_0.4-3     units_0.7-2       
## [53] grid_4.1.1         rstudioapi_0.13    rappdirs_0.3.3     labeling_0.4.2    
## [57] rmarkdown_2.10     gtable_0.3.0       curl_4.3.2         DBI_1.1.1         
## [61] R6_2.5.1           lubridate_1.7.10   rgdal_1.5-23       fastmap_1.1.0     
## [65] bit_4.0.4          utf8_1.2.2         KernSmooth_2.23-20 stringi_1.7.4     
## [69] parallel_4.1.1     Rcpp_1.0.7         vctrs_0.3.8        dbplyr_2.1.1      
## [73] tidyselect_1.1.1   xfun_0.25

Source code

You can find the source code for this analysis and others at my hocodata public code repository. This document and its source code are available for unrestricted use, distribution and modification under the terms of the Creative Commons CC0 1.0 Universal (CC0 1.0) Public Domain Dedication. Stated more simply, you’re free to do whatever you’d like with it.