Introduction

This is part 3 in a series. See also part 1 and part 2.

In this analysis I use precinct-level results from the Howard County 2014 general election (courtesy of the Maryland State Board of Elections) to look at Allan Kittleman’s margin of victory across the county on election day in the race for Howard County Executive. I’m interested in the general question of whether there was an “enthusiasm gap” in which Kittleman’s election-day results were particularly lopsided, e.g., due to increased turnout of Republican voters or unusually high support for Kittleman from Democrats and unaffiliated voters.

I present the data as a map of Howard County with the precincts colored according to Kittleman’s absolute and relative margins of victory, and with county council boundaries added. The map is based on precinct and council boundaries made available by the Howard County GIS division on the data.howardcountymd.gov site.

Unlike my previous analysis in part 2 of this series, here the map is deliberately distorted to have the area covered by each precinct be proportional to the number of registered voters in that precinct. For more information on the process used to create such a map (referred to as a “cartogram”) see “Creating Howard County Precinct Cartograms Based on 2014 Registered Voters”, part 1 and part 2.

Load packages

For this analysis I use the R language for statistical computing, run from the RStudio development environment, along with the dplyr and tidyr packages for data manipulation and the ggplot2 package to draw the maps.

library("dplyr", warn.conflicts = FALSE)
library("tidyr")
library("ggplot2")

I also need to load packages for manipulating spatial data in R. I first load the sp package, a prerequisite for the other spatial data packages, and then the rgdal package, which I use to load the boundary data for the precincts and council districts.

library("sp")
library("rgdal")
## rgdal: version: 0.9-1, (SVN revision 518)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 1.11.1, released 2014/09/24
## Path to GDAL shared files: /Library/Frameworks/GDAL.framework/Versions/1.11/Resources/gdal
## Loaded PROJ.4 runtime: Rel. 4.8.0, 6 March 2012, [PJ_VERSION: 480]
## Path to PROJ.4 shared files: (autodetected)

The rgdal package also requires installing the GDAL mapping library on the underlying operating system.

General approach

How would one best measure relative voter enthusiasm for Allan Kittleman vs. Courtney Watson? One measure would be how each candidate outperformed their “expected” vote, for example, how many votes Kittleman attracted in a given precinct vs. the number of registered Republicans in that precinct, and ditto for Watson vis-a-vis the number of registered Democrats. A related measure would look at Republican turnout (i.e., as a percentage of registered Republicans) in a given precinct vs. Democratic turnout.
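
As a rough illustration of the first measure, the sketch below computes each candidate’s votes per registered voter of their own party. Every number here is made up, and the column names (REP.Votes, REP.Reg, and so on) are assumptions for the example rather than fields in the Board of Elections data.

```r
# Hypothetical figures for two precincts; none of these numbers are real.
toy <- data.frame(
  Precinct  = c("001-001", "001-002"),
  REP.Votes = c(330, 380),
  DEM.Votes = c(300, 420),
  REP.Reg   = c(600, 500),
  DEM.Reg   = c(750, 1000),
  stringsAsFactors = FALSE
)

# Votes received per registered voter of the candidate's own party.
toy$REP.Ratio <- round(toy$REP.Votes / toy$REP.Reg, 2)
toy$DEM.Ratio <- round(toy$DEM.Votes / toy$DEM.Reg, 2)

# A positive gap means the Republican candidate drew more votes per
# registered party member than the Democratic candidate did.
toy$Ratio.Gap <- toy$REP.Ratio - toy$DEM.Ratio
print(toy)
```

A ratio like this mixes together party turnout and crossover voting, so it could only hint at an enthusiasm gap, not explain one.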

In this document I confine myself to looking at simple margins of victory in each precinct. The Maryland State Board of Elections has now made available turnout data (in Microsoft Excel format) giving turnout by party by precinct in the 2014 general election. I’ll take a look at that data in a later analysis.

As I mentioned above, this analysis is for election day voting only. Absentee ballots and votes cast at early voting centers are not included in the per-precinct totals as reported by the Maryland State Board of Elections. I’m not aware of any good method to assign absentee and early voting results to individual precincts.

Loading the data

First I download the CSV-format data file from the Maryland State Board of Elections containing Howard County 2014 general election results by precinct, and store a copy of the data in the local file Howard_By_Precinct_2014_General.csv.

download.file("http://elections.state.md.us/elections/2014/election_data/Howard_By_Precinct_2014_General.csv",
              "Howard_By_Precinct_2014_General.csv",
              method = "curl")
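
When re-running the analysis it can be convenient to skip a download if the file is already present. The helper below is a minimal sketch of that idea; download_if_missing and its fetch parameter are hypothetical names introduced for this example and are not part of the original analysis.

```r
# Download `url` to `dest` only when `dest` is not already present.
# `fetch` defaults to download.file; it is a parameter so that the logic
# can be exercised without network access.
download_if_missing <- function(url, dest, fetch = download.file, ...) {
  if (file.exists(dest)) {
    return(invisible(FALSE))  # file already there, nothing downloaded
  }
  fetch(url, dest, ...)
  invisible(TRUE)             # a download was attempted
}

# Usage (commented out because it needs network access):
# download_if_missing(
#   "http://elections.state.md.us/elections/2014/election_data/Howard_By_Precinct_2014_General.csv",
#   "Howard_By_Precinct_2014_General.csv",
#   method = "curl")
```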

Then I download spatial data specifying the boundaries of Howard County election precincts and county council districts, using the new council boundaries in effect for the 2014 primary and general elections. The boundaries are deliberately distorted to have the size of each precinct match the number of registered voters in each precinct as of the 2014 general election. The boundaries of the council districts are similarly distorted to match the transformed precinct boundaries. (These maps are available in the datasets section of my hocodata GitHub repository.)

download.file("https://raw.githubusercontent.com/frankhecker/hocodata/master/datasets/Voting_Precincts_Cartogram.zip",
              "Voting_Precincts_Cartogram.zip",
              method = "curl")
download.file("https://raw.githubusercontent.com/frankhecker/hocodata/master/datasets/Council_Districts_Cartogram.zip",
              "Council_Districts_Cartogram.zip",
              method = "curl")

Since the boundary data is in .zip files I need to unzip the files to extract the actual shapefiles.

unzip("Voting_Precincts_Cartogram.zip", overwrite = TRUE)
unzip("Council_Districts_Cartogram.zip", overwrite = TRUE)

I then read in the CSV file for election results. I remove trailing spaces from the office names so that filtering the results by office is easier.

hoco_ge14_df <- read.csv("Howard_By_Precinct_2014_General.csv", stringsAsFactors = FALSE)
hoco_ge14_df$Office.Name <- gsub("  *$", "", hoco_ge14_df$Office.Name)
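
To see why this matters, consider a few office names padded with trailing spaces: an exact comparison such as Office.Name == "County Executive" would fail until the padding is removed. The office names below are hypothetical stand-ins, not values copied from the data file.

```r
# Hypothetical office names, two of them padded with trailing spaces.
offices <- c("County Executive   ", "Governor / Lt. Governor ", "County Council")

# Before cleaning, an exact match on the first name fails.
offices[1] == "County Executive"

# The pattern "  *$" matches one or more spaces at the end of the string.
cleaned <- gsub("  *$", "", offices)
cleaned[1] == "County Executive"

# Base R's trimws() (R 3.2.0 and later) gives the same result here.
identical(cleaned, trimws(offices, which = "right"))
```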

Finally I read in the precinct and council district boundary shapefile data.

precinct_map <- readOGR(dsn = ".",
                        layer = "Voting_Precincts_CartogramPolygon")
## OGR data source with driver: ESRI Shapefile 
## Source: ".", layer: "Voting_Precincts_CartogramPolygon"
## with 118 features and 12 fields
## Feature type: wkbPolygon with 2 dimensions
council_map <- readOGR(dsn = ".",
                       layer = "Council_Districts_CartogramPolygon")
## OGR data source with driver: ESRI Shapefile 
## Source: ".", layer: "Council_Districts_CartogramPolygon"
## with 5 features and 5 fields
## Feature type: wkbPolygon with 2 dimensions

I add a new Precinct field to the precinct map data to match the 000-000 format of the precinct designators I’ll be using in the data frame showing Kittleman’s victory margins.

precinct_map@data <- precinct_map@data %>%
    mutate(Precinct = gsub("^([0-9]+)-([0-9]+)$", "00\\1-0\\2", PRECINCT20))
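
The sketch below shows what this substitution does on a few sample values. It assumes the PRECINCT20 field holds a single-digit district and a two-digit precinct (for example "1-01"); that format is inferred from the replacement pattern rather than stated anywhere, and the sample values are made up. The pad_precinct helper is a hypothetical alternative that pads each part numerically instead of prepending a fixed number of zeros.

```r
# Assumed input format: single-digit district, two-digit precinct.
ids <- c("1-01", "2-13", "6-35")

# The substitution used above prepends fixed zeros to each part.
fixed <- gsub("^([0-9]+)-([0-9]+)$", "00\\1-0\\2", ids)

# Hypothetical alternative: pad each part to three digits numerically,
# which would also cope with inputs such as "1-1" or "10-05".
pad_precinct <- function(x) {
  parts <- strsplit(x, "-", fixed = TRUE)
  vapply(parts,
         function(p) sprintf("%03d-%03d", as.integer(p[1]), as.integer(p[2])),
         character(1))
}

print(fixed)
print(pad_precinct(ids))
```

Both approaches agree on inputs in the assumed format; they would differ only if the digit counts varied.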

Data processing

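The processing of the data is almost identical to that done for part 2; the only change is to the precinct designator format. For brevity I again consolidate everything into a single pipeline, which does the following in order: keeps only the rows for the County Executive race; selects the district, precinct, party, and election-night vote columns; drops rows with the party code BOT; builds a Precinct designator in 000-000 format from the district and precinct numbers, then removes those two columns; spreads the per-party vote counts into separate DEM and REP columns; and computes Kittleman’s margin in votes (REP.Margin) and as a percentage of the combined REP and DEM vote (Pct.REP.Margin).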

ak_margins_df <- hoco_ge14_df %>%
    filter(Office.Name == "County Executive") %>%
    select(Election.District, Election.Precinct, Party, Election.Night.Votes) %>%
    filter(Party != "BOT") %>%
    mutate(Precinct = paste(formatC(Election.District, width = 3, flag = 0),
                            "-",
                            formatC(Election.Precinct, width = 3, flag = 0),
                            sep = "")) %>%
    select(-Election.District, -Election.Precinct) %>%
    spread(Party, Election.Night.Votes) %>%
    mutate(REP.Margin = REP - DEM,
           Pct.REP.Margin = round(100 * (REP - DEM) / (REP + DEM), 1))
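
The last two steps of the pipeline can be illustrated on a toy table. The sketch below uses base R’s reshape() in place of tidyr’s spread() so that it runs without extra packages; the vote counts are invented for the example.

```r
# Toy long-format results: one row per precinct and party (made-up numbers).
toy <- data.frame(
  Precinct = c("001-001", "001-001", "001-002", "001-002"),
  Party    = c("DEM", "REP", "DEM", "REP"),
  Votes    = c(300, 330, 420, 380),
  stringsAsFactors = FALSE
)

# Long to wide: one row per precinct, one column per party.
wide <- reshape(toy, idvar = "Precinct", timevar = "Party", direction = "wide")
names(wide) <- sub("^Votes\\.", "", names(wide))

# Same margin formulas as in the pipeline above.
wide$REP.Margin <- wide$REP - wide$DEM
wide$Pct.REP.Margin <- round(100 * (wide$REP - wide$DEM) / (wide$REP + wide$DEM), 1)
print(wide)
```

Note that the percentage margin is taken over the combined REP and DEM vote only, so it ignores any other votes cast in the precinct.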

I print summary statistics for the entire data set:

summary(ak_margins_df)
##    Precinct              DEM             REP          REP.Margin     
##  Length:118         Min.   : 12.0   Min.   : 10.0   Min.   :-316.00  
##  Class :character   1st Qu.:209.2   1st Qu.:197.0   1st Qu.:-112.00  
##  Mode  :character   Median :303.0   Median :304.5   Median :  18.00  
##                     Mean   :313.8   Mean   :355.5   Mean   :  41.74  
##                     3rd Qu.:424.5   3rd Qu.:480.0   3rd Qu.: 141.00  
##                     Max.   :757.0   Max.   :936.0   Max.   : 655.00  
##  Pct.REP.Margin  
##  Min.   :-47.00  
##  1st Qu.:-19.98  
##  Median :  4.15  
##  Mean   :  3.22  
##  3rd Qu.: 21.43  
##  Max.   : 65.60

Among other things this gives ranges for Kittleman’s margins in terms of votes (-316 to 655) and percentages (-47% to 65.6%). I use these later when assigning colors to the precincts based on Kittleman’s absolute or percentage margins of victory.
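
One way to turn such ranges into color scale limits is to round the largest absolute value up to a convenient step and use limits symmetric around zero, so that a tied precinct falls at the midpoint of the gradient. The helper below is a sketch of that approach; symmetric_limits is a hypothetical name, and the step sizes are assumptions chosen for this example.

```r
# Round the largest absolute value up to the next multiple of `step`
# and return limits symmetric around zero.
symmetric_limits <- function(values, step) {
  bound <- ceiling(max(abs(values), na.rm = TRUE) / step) * step
  c(-bound, bound)
}

votes_limits <- symmetric_limits(c(-316, 655), step = 100)  # vote margins
pct_limits   <- symmetric_limits(c(-47, 65.6), step = 10)   # percentage margins
print(votes_limits)
print(pct_limits)
```

With these step sizes the results happen to coincide with the limits of 700 votes and 70% used in the plots below.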

Creating precinct-level maps

Now comes the fun part: actually mapping the data. First I convert the cartogram map data to normal data frames usable with the ggplot() function.

precinct_map@data$id <- rownames(precinct_map@data)
precinct_points <- fortify(precinct_map, region = "id")
precinct_df <- full_join(precinct_points, precinct_map@data, by = "id")
council_map@data$id <- rownames(council_map@data)
council_points <- fortify(council_map, region = "id")
council_df <- full_join(council_points, council_map@data, by = "id")

Then I add the margins data to the precinct map data.

precinct_df <- precinct_df %>%
    left_join(ak_margins_df, by = "Precinct")
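
Because the join relies on the Precinct strings matching exactly, it can be worth checking for identifiers that appear on one side only. The sketch below does this with setdiff() on made-up identifiers; in the real analysis the two vectors would presumably be unique(precinct_df$Precinct) and ak_margins_df$Precinct.

```r
# Hypothetical precinct identifiers from the map and from the margins table.
map_ids     <- c("001-001", "001-002", "002-001")
margins_ids <- c("001-001", "001-002", "002-002")

# Precincts on the map with no margins row: after a left join these would
# have NA margins and be drawn in the scale's NA colour.
missing_margins <- setdiff(map_ids, margins_ids)

# Margins rows with no matching polygon: a left join from the map drops them.
missing_polygons <- setdiff(margins_ids, map_ids)

print(missing_margins)
print(missing_polygons)
```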

Since I want to label the county council districts I next compute the centroids of the districts in order to position the labels on the map.

council_centers <- coordinates(council_map)
council_centers_df <- as.data.frame(council_centers)
names(council_centers_df) <- c("long", "lat")
council_centers_df$District <- as.character(council_map@data$DISTRICT20)

The cartogram’s shapes for districts 1 and 5 are so distorted that the centroids are almost outside the district boundaries. I therefore tweak the locations for the labels for those districts so that they’ll appear in more suitable locations, moving the label for district 1 a bit north and the label for district 5 a bit west.

council_centers_df$lat[1] <- council_centers_df$lat[1] + 9000
council_centers_df$long[5] <- council_centers_df$long[5] - 9000
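
To see how a centroid can end up at or beyond the edge of a distorted shape, the sketch below computes the area centroid of a simple polygon with the standard shoelace formula and applies it to a rectangle and to a U-shaped polygon. This assumes the label points returned by coordinates() behave like area centroids; polygon_centroid and both test shapes are hypothetical and are not taken from the council district data.

```r
# Area centroid of a simple (non-self-intersecting) polygon via the
# shoelace formula. x and y are the vertex coordinates in order.
polygon_centroid <- function(x, y) {
  n <- length(x)
  # Close the ring if the first vertex is not repeated at the end.
  if (x[1] != x[n] || y[1] != y[n]) {
    x <- c(x, x[1])
    y <- c(y, y[1])
    n <- n + 1
  }
  i <- 1:(n - 1)
  j <- 2:n
  cross <- x[i] * y[j] - x[j] * y[i]
  area <- sum(cross) / 2
  c(long = sum((x[i] + x[j]) * cross) / (6 * area),
    lat  = sum((y[i] + y[j]) * cross) / (6 * area))
}

# A 4 x 2 rectangle: the centroid is its middle.
rect_center <- polygon_centroid(c(0, 4, 4, 0), c(0, 0, 2, 2))

# A U shape: a 3 x 3 square with a notch cut from the top
# (x between 1 and 2, y between 1 and 3).
u_center <- polygon_centroid(c(0, 3, 3, 2, 2, 1, 1, 0),
                             c(0, 0, 3, 3, 1, 1, 3, 3))
print(rect_center)
print(u_center)
```

For the U shape the centroid lands inside the notch, that is, outside the polygon itself, which is the kind of situation that calls for nudging a label by hand.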

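Next I plot Allan Kittleman’s victory margins by precinct, starting with the absolute margins in votes. This plot contains three layers: the precinct polygons, filled according to REP.Margin; the council district boundaries, drawn as white outlines with no fill; and the council district labels, placed at the adjusted center points computed above.

I also tweak the plot as follows: I use coord_equal() so that distances are scaled equally in both directions; I use a blue-to-red gradient with limits of -700 and 700 votes; I remove the axis titles, text, and ticks, the grid lines, and the panel background; and I add a title.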

g <- ggplot() +
    geom_polygon(data = precinct_df,
                 aes(x = long, y = lat, group = group, fill = REP.Margin)) +
    geom_polygon(data = council_df,
                 aes(x = long, y = lat, group = group),
                 fill = NA,
                 colour = "white") +
    geom_text(data = council_centers_df,
              aes(x = long, y = lat, label = District),
              size = 7,
              colour = "white",
              show_guide = FALSE) +
    coord_equal() +
    scale_fill_gradient("Margin (Votes)",
                        limits = c(-700, 700),
                        low = "blue",
                        high = "red",
                        space = "Lab",
                        guide = "colourbar") +
    theme(axis.title = element_blank(),
          axis.text = element_blank(),
          axis.ticks = element_blank(),
          panel.grid.major = element_blank(),
          panel.grid.minor = element_blank(),
          panel.background = element_blank()) +
    ggtitle("Allan Kittleman 2014 Margins by Precinct (Votes)\nPrecinct Sizes Based on Registered Voters on Election Day")
print(g)

The second plot shows Allan Kittleman’s victory margins in terms of percentage of votes in each precinct. This graph is produced identically to the previous one, except that I use the Pct.REP.Margin variable to color the precincts (instead of REP.Margin) and I set the maximum red color to be used for a 70% winning margin for Kittleman and the maximum blue color to be used for a 70% winning margin for Courtney Watson.

g <- ggplot() +
    geom_polygon(data = precinct_df,
                 aes(x = long, y = lat, group = group, fill = Pct.REP.Margin)) +
    geom_polygon(data = council_df,
                 aes(x = long, y = lat, group = group),
                 fill = NA,
                 colour = "white") +
    geom_text(data = council_centers_df,
              aes(x = long, y = lat, label = District),
              size = 7,
              colour = "white",
              show_guide = FALSE) +
    coord_equal() +
    scale_fill_gradient("Margin (% of Vote)",
                        limits = c(-70, 70),
                        low = "blue",
                        high = "red",
                        space = "Lab",
                        guide = "colourbar") +
    theme(axis.title = element_blank(),
          axis.text = element_blank(),
          axis.ticks = element_blank(),
          panel.grid.major = element_blank(),
          panel.grid.minor = element_blank(),
          panel.background = element_blank()) +
    ggtitle("Allan Kittleman 2014 Margins by Precinct (% of Votes)\nPrecinct Sizes Based on Registered Voters on Election Day")
print(g)
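
The effect of the limits on the coloring can be sketched as a simple rescaling: each margin is mapped to a position between 0 (the low, blue end) and 1 (the high, red end) of the gradient. The helper below illustrates this under the assumption that values outside the limits are replaced with NA, which is my understanding of the default behavior of continuous ggplot2 scales; gradient_position is a hypothetical name, and the sketch does not reproduce the actual color interpolation.

```r
# Position of `value` along a gradient with the given limits:
# 0 is the low colour, 1 is the high colour, NA is out of range.
gradient_position <- function(value, limits) {
  pos <- (value - limits[1]) / (limits[2] - limits[1])
  pos[value < limits[1] | value > limits[2]] <- NA
  pos
}

# Extremes of the vote and percentage margins, plus a tied precinct.
votes_pos <- gradient_position(c(-316, 0, 655), limits = c(-700, 700))
pct_pos   <- gradient_position(c(-47, 0, 65.6), limits = c(-70, 70))
print(round(votes_pos, 3))
print(round(pct_pos, 3))
```

Because both sets of limits are symmetric around zero, a tied precinct sits at position 0.5 in either map, halfway between blue and red.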