Introduction

In this document I display an example map of Howard County Council districts produced by the Auto-Redistrict application, configured to create three council districts that each elect five members using ranked choice voting. I also show racial/ethnic breakdowns for each district and (estimated) party vote share by district.

For readers unfamiliar with the R statistical software and the Tidyverse packages I use to manipulate and plot data, I’ve included some additional explanation of various steps. For more information, see the various ways to learn more about the Tidyverse.

Setup and data preparation

Libraries

I use the following packages:

  • tidyverse: do general data manipulation.
  • sf: manipulate geospatial data.
  • tigris: get data on roads.
  • tools: compute MD5 checksums.
  • knitr: display data in tabular format.
library(tidyverse)
library(sf)
library(tigris)
library(tools)
library(knitr)

Data sources

I use data from the following sources; see the References section of part 1 for more information:

  • Boundaries for Howard County precincts are from the shapefile produced by part 1 of this analysis. These are the precincts used in the 2018 general election for county council.
  • Assignment of precincts to new districts is from the Auto-Redistrict program.

Reading in and preparing the data

I first read in the shapefile containing Howard County precinct boundaries and related data, as produced by part 1 of this analysis.

redistricting_sf <- st_read("redistricting-input.shp")
## Reading layer `redistricting-input' from data source 
##   `/home/fhecker/src/hocodata/redistricting/redistricting-input.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 118 features and 12 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: 1259411 ymin: 523188.7 xmax: 1398344 ymax: 620119.3
## Projected CRS: NAD83 / Maryland (ftUS)

The district numbers generated by the Auto-Redistrict application are arbitrary and bear no relationship to the current council district numbering scheme. To reduce confusion I create a table district_name_tb that maps the district numbers to the locations of the districts within Howard County.

I then read in the data on new districts produced by the Auto-Redistrict application. I use the table district_name_tb to add a new variable District containing the district names. (I make that variable into a so-called “factor” variable to simplify creating the graphs below.)

I leave the Precnct variable as is because it will be used as a common field with the geospatial data table. I retain only the District and Precnct fields, since all the other fields are duplicated in the geospatial data table originally used as input to the Auto-Redistrict application.

district_name_tb <- tribble(
  ~Distrct, ~District,
  1, "Northeast",
  2, "Southeast",
  3, "West"
)

new_districts_tb <- read_csv("redistricting-output-3-districts.csv",
                             show_col_types = FALSE) %>%
  left_join(district_name_tb, by = "Distrct") %>%
  mutate(District = as.factor(District)) %>%
  select(District, Precnct)
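
A left join silently produces NA for any district number that is missing from the lookup table, so it may be worth checking for unmatched codes before plotting. The following is a minimal sketch on made-up rows: toy_output_tb and its precinct codes are hypothetical stand-ins, not rows from the actual Auto-Redistrict output, and district 4 is a deliberate mismatch.

```r
library(dplyr)
library(tibble)

# Lookup table mapping district numbers to names, as in the analysis.
district_name_tb <- tribble(
  ~Distrct, ~District,
  1, "Northeast",
  2, "Southeast",
  3, "West"
)

# Hypothetical stand-in for the Auto-Redistrict output; the precinct
# codes are invented, and district 4 has no name on purpose.
toy_output_tb <- tribble(
  ~Precnct, ~Distrct,
  "01-001", 1,
  "02-001", 2,
  "03-001", 3,
  "04-001", 4
)

# Rows whose district number has no entry in the lookup table.
unmatched <- toy_output_tb %>%
  anti_join(district_name_tb, by = "Distrct")

# After the left join those same rows get District = NA.
joined <- toy_output_tb %>%
  left_join(district_name_tb, by = "Distrct") %>%
  mutate(District = as.factor(District))

print(unmatched)
print(sum(is.na(joined$District)))
```

If unmatched has zero rows, every precinct in the output received a district name.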

To help orient readers as to the locations of the district boundaries, I want any maps generated to also display major roads in Howard County that correspond in whole or in part to district or precinct boundaries. I use the tigris function roads() to return geometry for all Howard County roads.

Because I don’t need or want to display each and every Howard County road, I use the RTTYP and FULLNAME variables to filter the results to retain only major roads (interstate and U.S. highways, route types “I” and “U”) and significant minor roads (Maryland state routes, route type “S”, and roads whose names contain “Pkwy”, the abbreviation for “Parkway”). I store the geometry for each in separate variables so that I can plot them at different widths.

all_roads <- roads(state = "MD", county = "Howard County", class = "sf", progress_bar = FALSE)

major_roads_geo <- all_roads %>%
  filter(RTTYP %in% c("I", "U")) %>%
  st_geometry()

minor_roads_geo <- all_roads %>%
  filter(RTTYP == "S" | str_detect(FULLNAME, "Pkwy")) %>%
  st_geometry()
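
One detail of the minor-roads filter is how it treats roads with a missing name. The sketch below runs the same two filter conditions on a small hypothetical table (toy_roads and all of its road names are placeholders, not actual tigris output) to show which rows survive.

```r
library(dplyr)
library(stringr)
library(tibble)

# Hypothetical road records; the names are placeholders.
toy_roads <- tribble(
  ~FULLNAME,       ~RTTYP,
  "I- 95",         "I",
  "US Hwy 29",     "U",
  "State Rte 32",  "S",
  NA_character_,   "S",
  "Example Pkwy",  "M",
  "Sample St",     "M",
  NA_character_,   "M"
)

# Major roads: interstate and U.S. highways.
major <- toy_roads %>%
  filter(RTTYP %in% c("I", "U"))

# Minor roads: state routes, plus anything with "Pkwy" in its name.
# For an unnamed road str_detect() returns NA; TRUE | NA is TRUE, so an
# unnamed state route is kept, while FALSE | NA is NA and filter() drops it.
minor <- toy_roads %>%
  filter(RTTYP == "S" | str_detect(FULLNAME, "Pkwy"))

print(major)
print(minor)
```

In this toy table the unnamed state route is retained and the unnamed local road is dropped, which is the behavior the map relies on.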

Analysis

I plot the new districts along with the major (and some minor) roads to show how the new districts would relate to existing communities in Howard County.

palette <- c("#E69F00", "#56B4E9", "#009E73")

redistricting_sf %>%
  left_join(new_districts_tb, by = "Precnct") %>%
  ggplot(aes(fill = District, geometry = geometry)) +
  geom_sf(size = 0) +
  scale_fill_manual(values = palette) +
  geom_sf(data = major_roads_geo, color = "white", size = 1.0, fill = NA) +
  geom_sf(data = minor_roads_geo, color = "white", size = 0.5, fill = NA) +
  labs(title="Example 3-District Map for 15-Member Howard County Council",
       subtitle = "Automatically Generated using 2020 Census and 2018 Election Data",
       caption = paste0(
         "Data sources:",
         "\n  Maryland Department of Planning, 2020 Redistricting Data",
         "\n  Maryland Board of Elections, 2018 General Election Results",
         "\n  Howard County GIS Division, Precinct Boundaries",
         "\nCreated using the Auto-Redistrict application and tidyverse R package"
       )
  ) +
  theme(plot.caption = element_text(margin = margin(t = 15), hjust = 0)) +
  theme(axis.ticks = element_blank(), axis.text = element_blank()) +
  theme(panel.background = element_blank())

The new districts divide the county into three “communities of interest”:

  • Southeastern Howard County, including Savage, North Laurel, most of east Columbia south of Route 175, and most of west Columbia except the Village of River Hill.
  • Northeastern Howard County, including Elkridge, Ellicott City south of Route 40, and east Columbia north of Route 175.
  • Western Howard County, including Ellicott City north of Route 40, Maple Lawn, and the Village of River Hill in Columbia.

The precinct-level data in redistricting_sf contains population figures by race and ethnicity. I combine the precinct-level data (in redistricting_sf) with the assignment of precincts to council districts produced by the Auto-Redistrict application (in new_districts_tb), and then drop the geometry for the precinct boundaries.

district_data <- redistricting_sf %>%
  left_join(new_districts_tb, by = "Precnct") %>%
  st_drop_geometry()

I compute the racial/ethnic breakdowns for each district.

racial_breakdown <- district_data %>%
  group_by(District) %>%
  summarize(Adj_Ppl = sum(Adj_Ppl),
            Ad_NH_W = sum(Ad_NH_W),
            Ad_NH_B = sum(Ad_NH_B),
            Ad_NH_A = sum(Ad_NH_A),
            Ad_NH_R = sum(Ad_NH_R),
            Adj_H_O = sum(Adj_H_O)) %>%
  mutate(`NH White` = round(100 * Ad_NH_W / Adj_Ppl),
         `NH Black` = round(100 * Ad_NH_B / Adj_Ppl),
         `NH Asian` = round(100 * Ad_NH_A / Adj_Ppl),
         `NH Multiracial` = round(100 * Ad_NH_R / Adj_Ppl),
         Hispanic = round(100 * Adj_H_O / Adj_Ppl)) %>%
  select(District, `NH White`, `NH Black`, `NH Asian`, `NH Multiracial`, Hispanic)
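
The same group-then-sum pattern can be written more compactly with dplyr's across(). The sketch below is one possible alternative, shown on invented precinct counts (toy_data is hypothetical and uses only two of the race/ethnicity columns). Note that the percentages are computed from the summed counts per district, not by averaging precinct-level percentages.

```r
library(dplyr)
library(tibble)

# Made-up precinct rows; the counts are invented for illustration only.
toy_data <- tribble(
  ~District,   ~Adj_Ppl, ~Ad_NH_W, ~Ad_NH_B,
  "Northeast", 1000,     500,      200,
  "Northeast", 3000,     1300,     600,
  "West",      2000,     1100,     200
)

toy_breakdown <- toy_data %>%
  group_by(District) %>%
  # Sum every count column in one step instead of naming each one.
  summarize(across(c(Adj_Ppl, Ad_NH_W, Ad_NH_B), sum)) %>%
  # Convert the summed counts to whole-number percentages of the total.
  mutate(`NH White` = round(100 * Ad_NH_W / Adj_Ppl),
         `NH Black` = round(100 * Ad_NH_B / Adj_Ppl)) %>%
  select(District, `NH White`, `NH Black`)

print(toy_breakdown)
```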

I plot the racial/ethnic breakdowns by district. I also draw horizontal dashed lines at multiples of the quota for a five-member district using ranked choice voting. (For example, a candidate who receives more than 16.7% of the first preference votes in a five-member district will automatically be elected. Here 16.7%, or 100/6, is the quota.)

palette <- c("#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2")

# Quota for a five-member district, as a percentage: 100 / (5 + 1)
quota <- 100 / 6

racial_breakdown %>%
  gather(key = "Race", value = Pop_Pct, `NH White`:Hispanic) %>%
  mutate(Race = fct_relevel(Race, c("NH White", "NH Black", "NH Asian", "NH Multiracial", "Hispanic"))) %>%
  ggplot(mapping = aes(x = District, y = Pop_Pct, fill = Race)) +
  geom_col(position = "dodge") +
  scale_fill_manual(values = palette) +
  geom_hline(mapping = aes(yintercept = quota), color = "gray", linetype = "dashed") +
  geom_hline(mapping = aes(yintercept = 2 * quota), color = "gray", linetype = "dashed") +
  geom_hline(mapping = aes(yintercept = 3 * quota), color = "gray", linetype = "dashed") +
  geom_hline(mapping = aes(yintercept = 4 * quota), color = "gray", linetype = "dashed") +
  labs(title="3-District Howard County Council Population Percentage By Race/Ethnicity",
       subtitle = "Automatically Generated using 2020 Census and 2018 Election Data",
       caption = paste0(
         "Data sources:",
         "\n  Maryland Department of Planning, 2020 Redistricting Data",
         "\n  Maryland Board of Elections, 2018 General Election Results",
         "\n  Howard County GIS Division, Precinct Boundaries",
         "\nCreated using the Auto-Redistrict application and tidyverse R package"
       ),
       fill = "Race/Ethnicity"
  ) +
  ylab("Population (%)") +
  theme_bw() +
  theme(plot.caption = element_text(margin = margin(t = 15), hjust = 0))
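
The dashed lines sit at one through four quotas. As a minimal sketch of where those levels come from, assuming the quota meant is the Droop-style value of 100 / (seats + 1) percent (which matches the 16.7% given for five seats), the thresholds can be computed as follows. The helper quota_pct() is a hypothetical name introduced here for illustration.

```r
# Quota as a percentage of votes for a given number of seats.
# Assumption: a Droop-style quota, 100 / (seats + 1).
quota_pct <- function(seats) 100 / (seats + 1)

seats <- 5
quota <- quota_pct(seats)

# Levels for one through four quotas, matching the four dashed lines.
thresholds <- quota * 1:4

print(round(thresholds, 1))
```

Because geom_hline() accepts a vector for yintercept, a vector like thresholds could also be passed in a single call rather than adding one layer per line.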

I also print the same data in the form of a table:

racial_breakdown %>%
  kable(caption = "3-District Howard County Council Population Percentage By Race/Ethnicity")
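3-District Howard County Council Population Percentage By Race/Ethnicity

| District  | NH White | NH Black | NH Asian | NH Multiracial | Hispanic |
|:----------|---------:|---------:|---------:|---------------:|---------:|
| Northeast |       46 |       18 |       21 |              6 |        9 |
| Southeast |       40 |       29 |       13 |              7 |       11 |
| West      |       54 |       10 |       26 |              5 |        5 |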

I compute the party breakdown for each district, using the estimated precinct-level votes by party in the 2018 general election for Howard County Council.

party_breakdown <- district_data %>%
  mutate(Votes = CCn.DEM + CCn.REP) %>%
  group_by(District) %>%
  summarize(Votes = sum(Votes),
            Dem_Votes = sum(CCn.DEM),
            Rep_Votes = sum(CCn.REP)) %>%
  mutate(Democratic = round(100 * Dem_Votes / Votes),
         Republican = round(100 * Rep_Votes / Votes)) %>%
  select(District, Democratic, Republican)

I plot the party breakdowns by district, again adding horizontal dashed lines to represent the quotas. (For example, if a party’s total vote share exceeds two quotas, or about 33.3%, the party fields two candidates, and those two candidates receive equal shares of first preference votes, then both candidates will automatically be elected.)

palette <- c("#0072B2", "#D55E00")

party_breakdown %>%
  gather(key = "Party", value = Vote_Pct, Democratic:Republican) %>%
  mutate(Party = fct_relevel(Party, c("Democratic", "Republican"))) %>%
  ggplot(mapping = aes(x = District, y = Vote_Pct, fill = Party)) +
  geom_col(position = "dodge") +
  scale_fill_manual(values = palette) +
  geom_hline(mapping = aes(yintercept = quota), color = "gray", linetype = "dashed") +
  geom_hline(mapping = aes(yintercept = 2 * quota), color = "gray", linetype = "dashed") +
  geom_hline(mapping = aes(yintercept = 3 * quota), color = "gray", linetype = "dashed") +
  geom_hline(mapping = aes(yintercept = 4 * quota), color = "gray", linetype = "dashed") +
  ylab("Estimated Vote (%)") +
  labs(title="3-District Howard County Council Estimated Party Vote Share",
       subtitle = "Automatically Generated using 2020 Census and 2018 Election Data",
       caption = paste0(
         "Data sources:",
         "\n  Maryland Department of Planning, 2020 Redistricting Data",
         "\n  Maryland Board of Elections, 2018 General Election Results",
         "\nCreated using the Auto-Redistrict application and tidyverse R package"
       )
  ) +
  theme_bw() +
  theme(plot.caption = element_text(margin = margin(t = 15), hjust = 0))

I also print the same data in the form of a table:

party_breakdown %>%
  kable(caption = "3-District Howard County Council Estimated Party Vote Share")
3-District Howard County Council Estimated Party Vote Share

| District  | Democratic | Republican |
|:----------|-----------:|-----------:|
| Northeast |         66 |         34 |
| Southeast |         75 |         25 |
| West      |         53 |         47 |
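
To relate these shares to the quota lines, the sketch below counts how many whole quotas each party's estimated share contains in each district, using the rounded percentages from the table above. This is only arithmetic on rounded 2018 estimates, not a forecast: it assumes a party's first preference votes are split evenly among its candidates, and the names party_share and full_quotas are introduced here for illustration.

```r
# Rounded vote shares (percent) copied from the table above.
party_share <- data.frame(
  District   = c("Northeast", "Southeast", "West"),
  Democratic = c(66, 75, 53),
  Republican = c(34, 25, 47)
)

# Quota for a five-member district, as a percentage.
quota <- 100 / 6

# Whole quotas contained in each share (integer division).
full_quotas <- data.frame(
  District   = party_share$District,
  Democratic = party_share$Democratic %/% quota,
  Republican = party_share$Republican %/% quota
)

print(full_quotas)
```

With these rounded figures, the whole quotas come to 3 and 2 in the Northeast, 4 and 1 in the Southeast, and 3 and 2 in the West.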

Appendix

Caveats

See part 1.

References

See part 1.

Suggestions for others

See part 1.

Environment

I used the following R environment in doing the analysis above:

sessionInfo()
## R version 4.1.1 (2021-08-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.3 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] tools     stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] knitr_1.33      tigris_1.4.1    sf_1.0-2        forcats_0.5.1  
##  [5] stringr_1.4.0   dplyr_1.0.7     purrr_0.3.4     readr_2.0.1    
##  [9] tidyr_1.1.3     tibble_3.1.4    ggplot2_3.3.5   tidyverse_1.3.1
## 
## loaded via a namespace (and not attached):
##  [1] httr_1.4.2         sass_0.4.0         bit64_4.0.5        vroom_1.5.4       
##  [5] jsonlite_1.7.2     modelr_0.1.8       bslib_0.3.0        assertthat_0.2.1  
##  [9] highr_0.9          sp_1.4-5           cellranger_1.1.0   yaml_2.2.1        
## [13] pillar_1.6.2       backports_1.2.1    lattice_0.20-44    glue_1.4.2        
## [17] uuid_0.1-4         digest_0.6.27      rvest_1.0.1        colorspace_2.0-2  
## [21] htmltools_0.5.2    pkgconfig_2.0.3    broom_0.7.9        haven_2.4.3       
## [25] scales_1.1.1       tzdb_0.1.2         proxy_0.4-26       farver_2.1.0      
## [29] generics_0.1.0     ellipsis_0.3.2     withr_2.4.2        cli_3.0.1         
## [33] magrittr_2.0.1     crayon_1.4.1       readxl_1.3.1       maptools_1.1-1    
## [37] evaluate_0.14      fs_1.5.0           fansi_0.5.0        xml2_1.3.2        
## [41] foreign_0.8-81     class_7.3-19       hms_1.1.0          lifecycle_1.0.0   
## [45] munsell_0.5.0      reprex_2.0.1       compiler_4.1.1     jquerylib_0.1.4   
## [49] e1071_1.7-8        rlang_0.4.11       classInt_0.4-3     units_0.7-2       
## [53] grid_4.1.1         rstudioapi_0.13    rappdirs_0.3.3     labeling_0.4.2    
## [57] rmarkdown_2.10     gtable_0.3.0       curl_4.3.2         DBI_1.1.1         
## [61] R6_2.5.1           lubridate_1.7.10   rgdal_1.5-23       fastmap_1.1.0     
## [65] bit_4.0.4          utf8_1.2.2         KernSmooth_2.23-20 stringi_1.7.4     
## [69] parallel_4.1.1     Rcpp_1.0.7         vctrs_0.3.8        dbplyr_2.1.1      
## [73] tidyselect_1.1.1   xfun_0.25

Source code

You can find the source code for this analysis and others at my hocodata public code repository. This document and its source code are available for unrestricted use, distribution and modification under the terms of the Creative Commons CC0 1.0 Universal (CC0 1.0) Public Domain Dedication. Stated more simply, you’re free to do whatever you’d like with it.