One problem in analyzing precinct-level voting data from the Howard County 2014 general election (using data from the Maryland State Board of Elections and the Howard County Board of Elections) is that maps drawn from that data are inherently misleading: precincts in western Howard County are much larger than precincts in Columbia and eastern Howard County, so they visually dominate the maps.
One way to reduce this effect is to create cartograms, maps in which the visual sizes of the geographic subdivisions are based not on their actual geographic area but on some other variable associated with them. In particular, for political maps of Howard County I would like to display precincts sized according to the number of registered voters in each precinct.
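To make the idea concrete, here is a minimal sketch in base R of how far apart a precinct's share of land area and its share of registered voters can be. The three precincts and their numbers are invented for illustration; they are not Howard County data.

```r
# Hypothetical precincts: land area (square miles) and registered voters.
# These values are made up for illustration only.
precincts <- data.frame(
  name   = c("west", "central", "east"),
  area   = c(40, 5, 5),
  voters = c(2000, 3000, 3000)
)

# Share of the map each precinct occupies on an ordinary map
precincts$area_share <- precincts$area / sum(precincts$area)

# Share of the map each precinct should occupy on a voter-based cartogram
precincts$voter_share <- precincts$voters / sum(precincts$voters)

# Factor by which each precinct's area must grow (> 1) or shrink (< 1)
precincts$scale <- precincts$voter_share / precincts$area_share

print(precincts)
```

In this toy example the large "west" precinct covers 80% of the ordinary map but holds only 25% of the voters, so a cartogram would shrink it to less than a third of its drawn size.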
Creating a voter-based cartogram of Howard County precincts requires distorting the precinct boundaries so as to change the area of each precinct while still preserving the overall shape of the precinct as much as possible, and also preserving its relationship to its neighboring precincts. This requires some relatively sophisticated mathematics, and unfortunately there is no existing R package that can do it well. Instead I use scapetoad, a Java-based application available for Microsoft Windows, Mac OS X, and Linux.
In a previous document (part 1) I described the overall process of preparing Howard County map data for use with the scapetoad application. In this document I describe the process of running the scapetoad application to produce the desired cartogram(s) and checking the results.
I use the R statistical package run from the RStudio development environment, along with the sp and rgdal packages to read mapping data.
library("sp")
library("rgdal")
## rgdal: version: 0.9-1, (SVN revision 518)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 1.11.1, released 2014/09/24
## Path to GDAL shared files: /Library/Frameworks/GDAL.framework/Versions/1.11/Resources/gdal
## Loaded PROJ.4 runtime: Rel. 4.8.0, 6 March 2012, [PJ_VERSION: 480]
## Path to PROJ.4 shared files: (autodetected)
The rgdal package also requires installing the GDAL mapping library on the underlying operating system.
The scapetoad application works only with map data in ESRI shapefile format, and requires that the variable used to determine precinct sizes be included with that data. The scapetoad application also has a handy feature whereby you can include other map layers and distort them in tandem with the original map. I use this to create maps of the Howard County council districts, Maryland state legislative districts, and US Congressional districts that match the precinct cartogram.
In the previous document (part 1) I downloaded the shapefile for the boundaries of Howard County precincts and created a new shapefile Precinct_VotersPolygon.shp that has an added field Reg.Voters specifying the number of registered voters in each precinct as of the 2014 general election. This is the shapefile I use to create the main cartogram.
I also downloaded shapefiles for the boundaries of Howard County council districts and for the Howard County portions of Maryland state legislative districts and US congressional districts. I use these shapefiles to create additional cartograms.
I then invoke the scapetoad application and use the ‘Add a Layer’ function multiple times to add individual layers for the following shapefiles:
Precinct_VotersPolygon.shp
Council_DistrictsPolygon.shp
Legislative_DistrictsPolygon.shp
Congressional_DistrictsPolygon.shp
I next use the ‘Create Cartogram’ function to create the various cartograms. I do the following steps in the resulting wizard:
Finally I use the ‘Export to shape’ function multiple times to export each of the transformed layers to a shapefile:
Each export operation produces three files, with suffixes .dbf, .shp, and .shx.
At this point I am finished running the scapetoad application.
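Before reading the exported layers back into R, it can help to confirm that every component file was actually written. The following is a minimal sketch, assuming the exported layers use the `_CartogramPolygon` base names that appear later in this document and the standard .dbf, .shp, and .shx shapefile components; the helper `missing_shapefile_parts()` is a hypothetical name introduced here, not part of any package.

```r
# Base names of the four exported cartogram layers (as used later on)
layers <- c("Voting_Precincts", "Council_Districts",
            "Legislative_Districts", "Congressional_Districts")

# Hypothetical helper: return the names of expected component files
# that are not present in directory `dir`.
missing_shapefile_parts <- function(layers, dir = ".",
                                    suffixes = c(".dbf", ".shp", ".shx")) {
  # Every combination of layer base name and suffix
  expected <- as.vector(outer(paste0(layers, "_CartogramPolygon"),
                              suffixes, paste0))
  expected[!file.exists(file.path(dir, expected))]
}

# An empty result means all twelve files are in the working directory
missing_shapefile_parts(layers)
```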
I now check the newly-created cartograms against the original maps. I read in the shapefile data for the cartograms for the precincts, county council districts, Maryland state legislative districts, and US Congressional districts, along with the shapefiles for the original maps as downloaded from data.howardcountymd.gov.
precinct_cg <- readOGR(dsn = ".",
layer = "Voting_Precincts_CartogramPolygon")
## OGR data source with driver: ESRI Shapefile
## Source: ".", layer: "Voting_Precincts_CartogramPolygon"
## with 118 features and 12 fields
## Feature type: wkbPolygon with 2 dimensions
precinct_map <- readOGR(dsn = ".",
layer = "Voting_PrecinctsPolygon")
## OGR data source with driver: ESRI Shapefile
## Source: ".", layer: "Voting_PrecinctsPolygon"
## with 118 features and 9 fields
## Feature type: wkbPolygon with 2 dimensions
council_cg <- readOGR(dsn = ".",
layer = "Council_Districts_CartogramPolygon")
## OGR data source with driver: ESRI Shapefile
## Source: ".", layer: "Council_Districts_CartogramPolygon"
## with 5 features and 5 fields
## Feature type: wkbPolygon with 2 dimensions
council_map <- readOGR(dsn = ".",
layer = "Council_DistrictsPolygon")
## OGR data source with driver: ESRI Shapefile
## Source: ".", layer: "Council_DistrictsPolygon"
## with 5 features and 5 fields
## Feature type: wkbPolygon with 2 dimensions
leg_cg <- readOGR(dsn = ".",
layer = "Legislative_Districts_CartogramPolygon")
## OGR data source with driver: ESRI Shapefile
## Source: ".", layer: "Legislative_Districts_CartogramPolygon"
## with 4 features and 5 fields
## Feature type: wkbPolygon with 2 dimensions
leg_map <- readOGR(dsn = ".",
layer = "Legislative_DistrictsPolygon")
## OGR data source with driver: ESRI Shapefile
## Source: ".", layer: "Legislative_DistrictsPolygon"
## with 4 features and 5 fields
## Feature type: wkbPolygon with 2 dimensions
cong_cg <- readOGR(dsn = ".",
layer = "Congressional_Districts_CartogramPolygon")
## OGR data source with driver: ESRI Shapefile
## Source: ".", layer: "Congressional_Districts_CartogramPolygon"
## with 3 features and 2 fields
## Feature type: wkbPolygon with 2 dimensions
cong_map <- readOGR(dsn = ".",
layer = "Congressional_DistrictsPolygon")
## OGR data source with driver: ESRI Shapefile
## Source: ".", layer: "Congressional_DistrictsPolygon"
## with 3 features and 2 fields
## Feature type: wkbPolygon with 2 dimensions
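The eight reads above follow one naming pattern, so they could also be driven from a single vector of base names. This is only a sketch of that alternative: the helper `layer_names()` and the list name `maps` are hypothetical, and the read itself is guarded because rgdal has since been retired from CRAN and the shapefiles may not be in the working directory.

```r
# Short labels mapped to the base names of the four layers
bases <- c(precinct = "Voting_Precincts",
           council  = "Council_Districts",
           leg      = "Legislative_Districts",
           cong     = "Congressional_Districts")

# Hypothetical helper: build the original-map and cartogram layer names,
# labeled like the variables used above (precinct_map, precinct_cg, ...)
layer_names <- function(bases) {
  c(setNames(paste0(bases, "Polygon"),
             paste0(names(bases), "_map")),
    setNames(paste0(bases, "_CartogramPolygon"),
             paste0(names(bases), "_cg")))
}

layers <- layer_names(bases)
print(layers)

# Read all layers into one named list, but only if rgdal is installed
# and every .shp file is present in the working directory
if (requireNamespace("rgdal", quietly = TRUE) &&
    all(file.exists(paste0(layers, ".shp")))) {
  maps <- lapply(layers, function(l) rgdal::readOGR(dsn = ".", layer = l))
}
```

With this approach the precinct cartogram would be available as `maps$precinct_cg` rather than as a separate variable.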
Then I check the fields in the precinct cartogram vs. the fields in the original map:
str(precinct_map@data)
## 'data.frame': 118 obs. of 9 variables:
## $ PRECINCT20: Factor w/ 118 levels "1-01","1-02",..: 1 2 3 4 5 6 7 8 9 10 ...
## $ CNGDISTRIC: Factor w/ 3 levels "2","3","7": 2 2 1 3 3 1 2 2 1 3 ...
## $ LEGDISTRIC: Factor w/ 4 levels "12","13","9A",..: 1 2 1 4 1 2 1 2 2 1 ...
## $ CODISTRICT: Factor w/ 5 levels "1","2","3","4",..: 1 1 2 1 2 2 2 1 2 2 ...
## $ POLLINGPLA: Factor w/ 76 levels "ALTHOLTON HIGH SCHOOL",..: 62 16 44 34 76 12 44 15 44 33 ...
## $ ADDRESS_2 : Factor w/ 77 levels "10220 WETHERBURN RD",..: 37 53 55 28 25 50 55 51 55 59 ...
## $ CITY : Factor w/ 16 levels "Clarksville",..: 4 4 4 5 5 4 4 4 4 5 ...
## $ ZIP : Factor w/ 19 levels "20723","20759",..: 13 13 13 9 9 13 13 13 13 9 ...
## $ LOCATION_2: Factor w/ 12 levels "ALL PURPOSE ROOM",..: 7 7 4 4 7 4 4 4 7 3 ...
str(precinct_cg@data)
## 'data.frame': 118 obs. of 12 variables:
## $ PRECINCT20 : Factor w/ 118 levels "1-01","1-02",..: 1 2 3 4 5 6 7 8 9 10 ...
## $ CNGDISTRIC : Factor w/ 3 levels "2","3","7": 2 2 1 3 3 1 2 2 1 3 ...
## $ LEGDISTRIC : Factor w/ 4 levels "12","13","9A",..: 1 2 1 4 1 2 1 2 2 1 ...
## $ CODISTRICT : Factor w/ 5 levels "1","2","3","4",..: 1 1 2 1 2 2 2 1 2 2 ...
## $ POLLINGPLA : Factor w/ 76 levels "ALTHOLTON HIGH SCHOOL",..: 62 16 44 34 76 12 44 15 44 33 ...
## $ ADDRESS_2 : Factor w/ 77 levels "10220 WETHERBURN RD",..: 37 53 55 28 25 50 55 51 55 59 ...
## $ CITY : Factor w/ 16 levels "Clarksville",..: 4 4 4 5 5 4 4 4 4 5 ...
## $ ZIP : Factor w/ 19 levels "20723","20759",..: 13 13 13 9 9 13 13 13 13 9 ...
## $ LOCATION_2 : Factor w/ 12 levels "ALL PURPOSE ROOM",..: 7 7 4 4 7 4 4 4 7 3 ...
## $ Reg_Voters : num 2051 524 1803 2738 664 ...
## $ Reg_VotersD: num 4.35e-05 4.47e-05 9.33e-05 5.59e-05 9.22e-05 ...
## $ SizeError : num 101.6 98.8 100.1 103.2 104.2 ...
Note the added Reg_Voters field from the turnout statistics. (The other two fields, Reg_VotersD and SizeError, are created by the scapetoad application.)
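The same comparison can be done programmatically rather than by eye. The sketch below uses small invented data frames as stand-ins for `precinct_map@data` and `precinct_cg@data`; the helper `compare_fields()` is a hypothetical name. The last line assumes that SizeError is a percentage in which 100 means a precinct reached its target area, which the values near 100 above suggest but which I have not confirmed against the scapetoad documentation.

```r
# Toy stand-ins for precinct_map@data and precinct_cg@data (values invented)
orig <- data.frame(PRECINCT20 = c("1-01", "1-02", "1-03"),
                   CODISTRICT = c("1", "1", "2"))
cg <- cbind(orig,
            Reg_Voters = c(2051, 524, 1803),
            SizeError  = c(101.6, 98.8, 100.1))

# Hypothetical helper: report fields dropped or added by the cartogram,
# and whether every shared field still holds identical values
compare_fields <- function(orig, cg) {
  shared <- intersect(names(orig), names(cg))
  list(
    dropped   = setdiff(names(orig), names(cg)),
    added     = setdiff(names(cg), names(orig)),
    unchanged = all(vapply(shared,
                           function(f) identical(orig[[f]], cg[[f]]),
                           logical(1)))
  )
}

result <- compare_fields(orig, cg)
str(result)

# Largest deviation from 100 (assumption: 100 = exact target area)
max(abs(cg$SizeError - 100))
```

With the real data the call would be `compare_fields(precinct_map@data, precinct_cg@data)`.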
Now I do quick plots of the original maps and the cartograms. I use the base plot() function in R to avoid the need to create dataframes for the ggplot() function.
plot(precinct_map)
plot(precinct_cg)
plot(council_map)
plot(council_cg)
plot(leg_map)
plot(leg_cg)
plot(cong_map)
plot(cong_cg)
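To compare each original map with its cartogram directly, the two plots can be drawn side by side on one device. This is a minimal sketch using only base graphics; `plot_side_by_side()` is a hypothetical helper, and the final call runs only if the objects read in earlier exist in the session.

```r
# Hypothetical helper: draw an original map and its cartogram in two panels
plot_side_by_side <- function(map, cg, label = "") {
  # Split the device into one row of two panels and trim the margins
  old <- par(mfrow = c(1, 2), mar = c(1, 1, 2, 1))
  # Restore the previous graphics settings when the function exits
  on.exit(par(old))
  plot(map)
  title(main = paste(label, "(original)"))
  plot(cg)
  title(main = paste(label, "(cartogram)"))
  invisible(NULL)
}

# Run only when the layers from the earlier steps are available
if (exists("precinct_map") && exists("precinct_cg")) {
  plot_side_by_side(precinct_map, precinct_cg, "Precincts")
}
```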
As a final step I create new .zip files from the shapefiles created by the scapetoad application.
# Base names of the four layers exported from scapetoad
files <- c("Voting_Precincts",
           "Council_Districts",
           "Legislative_Districts",
           "Congressional_Districts")
# Component files that make up each exported shapefile
suffixes <- c(".dbf", ".shp", ".shx")
for (file in files) {
    zip_file <- paste(file, "_Cartogram.zip", sep = "")
    map_files <- paste(file, "_CartogramPolygon", suffixes, sep = "")
    # Remove any existing archive so each .zip file is created fresh
    if (file.exists(zip_file)) file.remove(zip_file)
    zip(zip_file, map_files)
}
I can now use these .zip files in place of the original shapefile .zip files downloaded from data.howardcountymd.gov.
I used the following R environment in doing the analysis for this example:
sessionInfo()
## R version 3.1.2 (2014-10-31)
## Platform: x86_64-apple-darwin13.4.0 (64-bit)
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] rgdal_0.9-1 sp_1.0-17 RCurl_1.95-4.3 bitops_1.0-6
##
## loaded via a namespace (and not attached):
## [1] digest_0.6.4 evaluate_0.5.5 formatR_1.0 grid_3.1.2
## [5] htmltools_0.2.6 knitr_1.7 lattice_0.20-29 rmarkdown_0.5.1
## [9] stringr_0.6.2 tools_3.1.2 yaml_2.1.13
The underlying GDAL library for the rgdal package is from the KyngChaos GDAL Complete distribution version 1.11 for Mac OS X.
You can find the source code for this analysis and others at my HoCoData repository on GitHub. This document and its source code are available for unrestricted use, distribution and modification under the terms of the Creative Commons CC0 1.0 Universal (CC0 1.0) Public Domain Dedication. Stated more simply, you’re free to do whatever you’d like with it.