Introduction

In this document I display a map of the Howard County Council districts produced by running the Auto-Redistrict application to create five council districts electing three members per district using ranked choice voting. I also show racial/ethnic breakdowns for each district and (estimated) party vote share by district.

For readers unfamiliar with the R statistical software and the Tidyverse packages I use to manipulate and plot data, I’ve included additional explanation of various steps. For more information, see the various ways to learn more about the Tidyverse.

Setup and data preparation

Libraries

I use the following packages:

  • tidyverse: do general data manipulation.
  • sf: manipulate geospatial data.
  • tigris: get data on roads.
  • tools: compute MD5 checksums.
  • knitr: display data in tabular format.
library(tidyverse)
library(sf)
library(tigris)
library(tools)
library(knitr)

Data sources

I use data from the following sources; see the References section of part 1 for more information:

  • Boundaries for Howard County precincts are from the shapefile produced by part 1 of this analysis. These are the precincts used in the 2018 general election for county council.
  • Assignment of precincts to new districts is from the Auto-Redistrict program, run for over 8,000 iterations.

Reading in and preparing the data

I first read in the shapefile containing Howard County precinct boundaries and related data, as produced by part 1 of this analysis.

redistricting_sf <- st_read("redistricting-input.shp")
## Reading layer `redistricting-input' from data source 
##   `/home/fhecker/src/hocodata/redistricting/redistricting-input.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 118 features and 12 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: 1259411 ymin: 523188.7 xmax: 1398344 ymax: 620119.3
## Projected CRS: NAD83 / Maryland (ftUS)

The district numbers generated by the Auto-Redistrict application are arbitrary and bear no relationship to the current council district numbering scheme. To reduce confusion, I create a table district_name_tb that maps the district numbers from Auto-Redistrict to district numbers matching the current Howard County Council districts.

I then read in the data on new districts produced by the Auto-Redistrict application. I use the table district_name_tb to add a new variable District containing the district names. (I make that variable into a so-called “factor” variable to simplify creating the graphs below.)

I leave the Precnct variable as is because it will be used as a common field with the geospatial data table. I retain only the District and Precnct fields, since all the other fields are duplicated in the geospatial data table originally used as input to the Auto-Redistrict application.

district_name_tb <- tribble(
  ~Distrct, ~District,
  1, "2",
  2, "1",
  3, "4",
  4, "3",
  5, "5"
)

new_districts_tb <- read_csv("redistricting-output-5-districts.csv",
                             show_col_types = FALSE) %>%
  left_join(district_name_tb, by = "Distrct") %>%
  mutate(District = as.factor(District)) %>%
  select(District, Precnct)
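
The relabeling step can be tried out in isolation. The following is a minimal sketch, assuming made-up precinct IDs and a two-row lookup table (assignments_tb and its values are hypothetical, not taken from the Auto-Redistrict output). It also shows a cheap check that every precinct received a district label after the join.

```r
library(dplyr)
library(tibble)

# Hypothetical lookup table: Auto-Redistrict number -> council district label.
district_name_tb <- tribble(
  ~Distrct, ~District,
  1, "2",
  2, "1"
)

# Made-up precinct assignments standing in for the Auto-Redistrict CSV output.
assignments_tb <- tribble(
  ~Precnct, ~Distrct,
  "01-001", 1,
  "01-002", 2,
  "02-001", 1
)

relabeled_tb <- assignments_tb %>%
  left_join(district_name_tb, by = "Distrct") %>%
  mutate(District = as.factor(District)) %>%
  select(District, Precnct)

# A left join keeps rows that have no match and fills District with NA,
# so counting NAs is a quick check that every precinct was relabeled.
unmatched <- sum(is.na(relabeled_tb$District))
```

If unmatched is greater than zero, some district number in the input is missing from the lookup table.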

To help orient readers as to the locations of the district boundaries, I also want the maps to display major roads in Howard County that correspond in whole or in part to district or precinct boundaries. I use the tigris function roads() to return geometry for all Howard County roads.

Because I don’t need or want to display every Howard County road, I use the RTTYP and FULLNAME variables to filter the results to retain only major roads (interstate and U.S. highways) and significant minor roads (Maryland state routes and roads with “Pkwy”, the abbreviation of “Parkway” matched by the code, in their names). I store the geometry for each in separate variables so that I can plot them at different widths.

all_roads <- roads(state = "MD", county = "Howard County", class = "sf", progress_bar = FALSE)

major_roads_geo <- all_roads %>%
  filter(RTTYP %in% c("I", "U")) %>%
  st_geometry()

minor_roads_geo <- all_roads %>%
  filter(RTTYP == "S" | str_detect(FULLNAME, "Pkwy")) %>%
  st_geometry()
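
The two filters can be illustrated without downloading any road data. This is a sketch on a made-up table that only imitates the RTTYP and FULLNAME columns; the road names and the "M" code used for the non-matching rows are assumptions chosen for illustration.

```r
library(dplyr)
library(stringr)
library(tibble)

# Made-up rows imitating the RTTYP and FULLNAME columns returned by roads().
# "M" stands in here for any route type other than I, U, or S.
toy_roads <- tribble(
  ~FULLNAME,              ~RTTYP,
  "I- 95",                "I",
  "US Hwy 29",            "U",
  "State Rte 32",         "S",
  "Little Patuxent Pkwy", "M",
  "Main St",              "M",
  NA,                     NA
)

# Major roads: interstate (I) and U.S. (U) routes.
major_names <- toy_roads %>%
  filter(RTTYP %in% c("I", "U")) %>%
  pull(FULLNAME)

# Minor roads: state routes (S) or any name containing "Pkwy".
# Rows where both columns are NA evaluate to NA and are dropped by filter().
minor_names <- toy_roads %>%
  filter(RTTYP == "S" | str_detect(FULLNAME, "Pkwy")) %>%
  pull(FULLNAME)
```

Note that roads with a missing name and route type fall out of both sets, which is the desired behavior here.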

Analysis

I plot the new districts along with the major (and some minor) roads to show how the new districts would relate to existing communities in Howard County.

palette <- c("#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2")

redistricting_sf %>%
  left_join(new_districts_tb, by = "Precnct") %>%
  ggplot(aes(fill = District, geometry = geometry)) +
  geom_sf(size = 0) +
  scale_fill_manual(values = palette) +
  geom_sf(data = major_roads_geo, color = "white", size = 1.0, fill = NA) +
  geom_sf(data = minor_roads_geo, color = "white", size = 0.5, fill = NA) +
  labs(title="Example 5-District Map for 15-Member Howard County Council",
       subtitle = "Automatically Generated using 2020 Census and 2018 Election Data",
       caption = paste0(
         "Data sources:",
         "\n  Maryland Department of Planning, 2020 Redistricting Data",
         "\n  Maryland Board of Elections, 2018 General Election Results",
         "\n  Howard County GIS Division, Precinct Boundaries",
         "\nCreated using the Auto-Redistrict application and tidyverse R package"
       )
  ) +
  theme(plot.caption = element_text(margin = margin(t = 15), hjust = 0)) +
  theme(axis.ticks = element_blank(), axis.text = element_blank()) +
  theme(panel.background = element_blank())

The new districts divide the county into five “communities of interest”:

  • District 1. Ellicott City.
  • District 2. Northeastern Howard County, including Elkridge.
  • District 3. Southeastern Howard County, including North Laurel, Savage, Maple Lawn, and Fulton, as well as the Village of Kings Contrivance in Columbia.
  • District 4. Central Columbia.
  • District 5. Western Howard County, including the Villages of River Hill and Harper’s Choice in Columbia.

The precinct-level data in redistricting_sf contains population figures by race and ethnicity. I combine the precinct-level data (in redistricting_sf) with the assignment of precincts to council districts produced by the Auto-Redistrict application (in new_districts_tb), and then drop the geometry for the precinct boundaries.

district_data <- redistricting_sf %>%
  left_join(new_districts_tb, by = "Precnct") %>%
  st_drop_geometry()

I compute the racial/ethnic breakdowns for each district.

racial_breakdown <- district_data %>%
  group_by(District) %>%
  summarize(Adj_Ppl = sum(Adj_Ppl),
            Ad_NH_W = sum(Ad_NH_W),
            Ad_NH_B = sum(Ad_NH_B),
            Ad_NH_A = sum(Ad_NH_A),
            Ad_NH_R = sum(Ad_NH_R),
            Adj_H_O = sum(Adj_H_O)) %>%
  mutate(`NH White` = round(100 * Ad_NH_W / Adj_Ppl),
         `NH Black` = round(100 * Ad_NH_B / Adj_Ppl),
         `NH Asian` = round(100 * Ad_NH_A / Adj_Ppl),
         `NH Multiracial` = round(100 * Ad_NH_R / Adj_Ppl),
         Hispanic = round(100 * Adj_H_O / Adj_Ppl)) %>%
  select(District, `NH White`, `NH Black`, `NH Asian`, `NH Multiracial`, Hispanic)

I plot the racial/ethnic breakdowns by district. I also draw horizontal dashed lines representing the quota levels in a three-member district using ranked choice voting. (For example, if a candidate can get 25% plus one of the first preference votes in a three-member district then they will automatically be elected. Here 25% plus one is the quota.)
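
The “25% plus one” threshold described above is commonly called the Droop quota. As a worked example, here is a small sketch of the arithmetic; the function name droop_quota and the figure of 10,000 valid votes are hypothetical and chosen only for illustration.

```r
# Droop quota: the smallest whole number of votes that at most `seats`
# candidates can each reach at the same time.
droop_quota <- function(valid_votes, seats) {
  floor(valid_votes / (seats + 1)) + 1
}

# Hypothetical district with 10,000 valid first-preference votes and 3 seats.
quota_votes <- droop_quota(10000, 3)

# The same threshold expressed as a percentage of the vote,
# which is where the 25% level for a three-member district comes from.
quota_pct <- 100 / (3 + 1)
```

With these inputs quota_votes is 2,501 and quota_pct is 25, matching the “25% plus one” description.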

palette <- c("#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2")

# Quota as a percentage of the vote in a three-member district: 100 / (3 + 1).
quota <- 100 / 4

racial_breakdown %>%
  gather(key = "Race", value = Pop_Pct, `NH White`:Hispanic) %>%
  mutate(Race = fct_relevel(Race, c("NH White", "NH Black", "NH Asian", "NH Multiracial", "Hispanic"))) %>%
  ggplot(mapping = aes(x = District, y = Pop_Pct, fill = Race)) +
  geom_col(position = "dodge") +
  scale_fill_manual(values = palette) +
  geom_hline(mapping = aes(yintercept = quota), color = "gray", linetype = "dashed") +
  geom_hline(mapping = aes(yintercept = 2 * quota), color = "gray", linetype = "dashed") +
  geom_hline(mapping = aes(yintercept = 3 * quota), color = "gray", linetype = "dashed") +
  geom_hline(mapping = aes(yintercept = 4 * quota), color = "gray", linetype = "dashed") +
  labs(title="5-District Howard County Council Population Percentage By Race/Ethnicity",
       subtitle = "Automatically Generated using 2020 Census and 2018 Election Data",
       caption = paste0(
         "Data sources:",
         "\n  Maryland Department of Planning, 2020 Redistricting Data",
         "\n  Maryland Board of Elections, 2018 General Election Results",
         "\n  Howard County GIS Division, Precinct Boundaries",
         "\nCreated using the Auto-Redistrict application and tidyverse R package"
       ),
       fill = "Race/Ethnicity"
  ) +
  ylab("Population (%)") +
  theme_bw() +
  theme(plot.caption = element_text(margin = margin(t = 15), hjust = 0))
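
The gather() call used to reshape the data is still supported, but the tidyr documentation marks it as superseded in favor of pivot_longer(). The sketch below shows the equivalent call on a small table whose values are copied from the rounded percentages reported in this document (only two districts and three categories, to keep it short); the names toy_breakdown, long_gather, and long_pivot are hypothetical.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Two districts and three categories, copied from the rounded percentages
# reported in this document.
toy_breakdown <- tribble(
  ~District, ~`NH White`, ~`NH Black`, ~Hispanic,
  "1",       49,          10,          5,
  "4",       42,          30,          12
)

# Reshape to long format with gather(), as in the plotting code.
long_gather <- toy_breakdown %>%
  gather(key = "Race", value = Pop_Pct, `NH White`:Hispanic)

# The same reshape with pivot_longer().
long_pivot <- toy_breakdown %>%
  pivot_longer(cols = `NH White`:Hispanic,
               names_to = "Race", values_to = "Pop_Pct")

# The two results hold the same rows; only the row order differs,
# so sort both before comparing.
same_rows <- isTRUE(all.equal(
  as.data.frame(arrange(long_gather, District, Race)),
  as.data.frame(arrange(long_pivot, District, Race))
))
```

Because the plot orders the categories with fct_relevel(), the difference in row order does not affect the chart.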

I also print the same data in the form of a table:

racial_breakdown %>%
  kable(caption = "5-District Howard County Council Population Percentage By Race/Ethnicity")
5-District Howard County Council Population Percentage By Race/Ethnicity

District  NH White  NH Black  NH Asian  NH Multiracial  Hispanic
1               49        10        31               5         5
2               44        20        21               6        10
3               41        25        18               6        10
4               42        30         9               7        12
5               58        11        21               5         5
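
Because each percentage is rounded separately, the five figures for a district need not add up to exactly 100. A quick check, sketched here in base R with the values copied from the table above (the data frame name breakdown and the underscore column names are hypothetical):

```r
# Rounded percentages copied from the table above, one row per district.
breakdown <- data.frame(
  District       = c("1", "2", "3", "4", "5"),
  NH_White       = c(49, 44, 41, 42, 58),
  NH_Black       = c(10, 20, 25, 30, 11),
  NH_Asian       = c(31, 21, 18, 9, 21),
  NH_Multiracial = c(5, 6, 6, 7, 5),
  Hispanic       = c(5, 10, 10, 12, 5)
)

# Sum the five percentage columns for each district.
row_totals <- rowSums(breakdown[, -1])
names(row_totals) <- breakdown$District
```

Here District 2 sums to 101 and the others to 100, a discrepancy small enough to be explained by rounding.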

I compute the party breakdown for each district, using the estimated precinct-level votes by party in the 2018 general election for Howard County Council.

party_breakdown <- district_data %>%
  mutate(Votes = CCn.DEM + CCn.REP) %>%
  group_by(District) %>%
  summarize(Votes = sum(Votes),
            Dem_Votes = sum(CCn.DEM),
            Rep_Votes = sum(CCn.REP)) %>%
  mutate(Democratic = round(100 * Dem_Votes / Votes),
         Republican = round(100 * Rep_Votes / Votes)) %>%
  select(District, Democratic, Republican)

I plot the party breakdowns by district, again adding horizontal dashed lines to represent the quotas. (For example, suppose a party’s total vote share exceeds 50%, or two quotas, and the party fields two candidates who split its first preference votes equally. Then both candidates will automatically be elected.)

palette <- c("#0072B2", "#D55E00")

party_breakdown %>%
  gather(key = "Party", value = Vote_Pct, Democratic:Republican) %>%
  mutate(Party = fct_relevel(Party, c("Democratic", "Republican"))) %>%
  ggplot(mapping = aes(x = District, y = Vote_Pct, fill = Party)) +
  geom_col(position = "dodge") +
  scale_fill_manual(values = palette) +
  geom_hline(mapping = aes(yintercept = quota), color = "gray", linetype = "dashed") +
  geom_hline(mapping = aes(yintercept = 2 * quota), color = "gray", linetype = "dashed") +
  geom_hline(mapping = aes(yintercept = 3 * quota), color = "gray", linetype = "dashed") +
  geom_hline(mapping = aes(yintercept = 4 * quota), color = "gray", linetype = "dashed") +
  ylab("Estimated Vote (%)") +
  labs(title="5-District Howard County Council Estimated Party Vote Share",
       subtitle = "Automatically Generated using 2020 Census and 2018 Election Data",
       caption = paste0(
         "Data sources:",
         "\n  Maryland Department of Planning, 2020 Redistricting Data",
         "\n  Maryland Board of Elections, 2018 General Election Results",
         "\nCreated using the Auto-Redistrict application and tidyverse R package"
       )
  ) +
  theme_bw() +
  theme(plot.caption = element_text(margin = margin(t = 15), hjust = 0))

I also print the same data in the form of a table:

party_breakdown %>%
  kable(caption = "5-District Howard County Council Estimated Party Vote Share")
5-District Howard County Council Estimated Party Vote Share

District  Democratic  Republican
1                 62          38
2                 64          36
3                 71          29
4                 77          23
5                 51          49
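
The quota lines can also be read numerically by counting how many whole quotas each party’s estimated share covers. This is only a rough sketch: it assumes voters rank candidates along party lines and that the 2018 vote shares carry over, and the helper full_quotas is hypothetical. The shares are copied from the table above.

```r
# Rounded vote shares copied from the table above.
party_breakdown <- data.frame(
  District   = c("1", "2", "3", "4", "5"),
  Democratic = c(62, 64, 71, 77, 51),
  Republican = c(38, 36, 29, 23, 49)
)

seats <- 3
quota <- 100 / (seats + 1)  # 25% in a three-member district

# Number of whole quotas a vote share covers, capped at the number of seats.
full_quotas <- function(share) pmin(floor(share / quota), seats)

party_breakdown$Dem_Quotas <- full_quotas(party_breakdown$Democratic)
party_breakdown$Rep_Quotas <- full_quotas(party_breakdown$Republican)
```

Under these assumptions the Democratic share covers two quotas in Districts 1, 2, 3, and 5 and three in District 4, while the Republican share covers one quota in every district except District 4.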

Appendix

Caveats

See part 1.

References

See part 1.

Suggestions for others

See part 1.

Environment

I used the following R environment in doing the analysis above:

sessionInfo()
## R version 4.1.1 (2021-08-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.3 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] tools     stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] knitr_1.33      tigris_1.4.1    sf_1.0-2        forcats_0.5.1  
##  [5] stringr_1.4.0   dplyr_1.0.7     purrr_0.3.4     readr_2.0.1    
##  [9] tidyr_1.1.3     tibble_3.1.4    ggplot2_3.3.5   tidyverse_1.3.1
## 
## loaded via a namespace (and not attached):
##  [1] httr_1.4.2         sass_0.4.0         bit64_4.0.5        vroom_1.5.4       
##  [5] jsonlite_1.7.2     modelr_0.1.8       bslib_0.3.0        assertthat_0.2.1  
##  [9] highr_0.9          sp_1.4-5           cellranger_1.1.0   yaml_2.2.1        
## [13] pillar_1.6.2       backports_1.2.1    lattice_0.20-44    glue_1.4.2        
## [17] uuid_0.1-4         digest_0.6.27      rvest_1.0.1        colorspace_2.0-2  
## [21] htmltools_0.5.2    pkgconfig_2.0.3    broom_0.7.9        haven_2.4.3       
## [25] scales_1.1.1       tzdb_0.1.2         proxy_0.4-26       farver_2.1.0      
## [29] generics_0.1.0     ellipsis_0.3.2     withr_2.4.2        cli_3.0.1         
## [33] magrittr_2.0.1     crayon_1.4.1       readxl_1.3.1       maptools_1.1-1    
## [37] evaluate_0.14      fs_1.5.0           fansi_0.5.0        xml2_1.3.2        
## [41] foreign_0.8-81     class_7.3-19       hms_1.1.0          lifecycle_1.0.0   
## [45] munsell_0.5.0      reprex_2.0.1       compiler_4.1.1     jquerylib_0.1.4   
## [49] e1071_1.7-8        rlang_0.4.11       classInt_0.4-3     units_0.7-2       
## [53] grid_4.1.1         rstudioapi_0.13    rappdirs_0.3.3     labeling_0.4.2    
## [57] rmarkdown_2.10     gtable_0.3.0       curl_4.3.2         DBI_1.1.1         
## [61] R6_2.5.1           lubridate_1.7.10   rgdal_1.5-23       fastmap_1.1.0     
## [65] bit_4.0.4          utf8_1.2.2         KernSmooth_2.23-20 stringi_1.7.4     
## [69] parallel_4.1.1     Rcpp_1.0.7         vctrs_0.3.8        dbplyr_2.1.1      
## [73] tidyselect_1.1.1   xfun_0.25

Source code

You can find the source code for this analysis and others at my hocodata public code repository. This document and its source code are available for unrestricted use, distribution and modification under the terms of the Creative Commons CC0 1.0 Universal (CC0 1.0) Public Domain Dedication. Stated more simply, you’re free to do whatever you’d like with it.