Overview

Our team of students in DPI-663, a Harvard Kennedy School class on Technology and Innovation in Government, is working with the City of Boston’s Elections Commission and the Department of Innovation and Technology to tackle low voter turnout in Boston’s municipal elections. Our goal is to develop policy recommendations and product prototypes that Boston can implement to improve turnout in future municipal elections. To better understand the problem, our first step was to dig into the data!

The data

I used 4 datasets in this analysis:

  1. precincts.geojson: a geospatial dataset of the City of Boston’s voting precincts. I obtained this data from Analyze Boston.
  2. wards.geojson: a geospatial dataset of the City of Boston’s voting wards. I obtained this data from Analyze Boston.
  3. dem_data_by_precincts.xlsx: a dataset containing demographic data (e.g. population, poverty, school enrollment, geographic mobility, etc.) on Boston’s precincts. I requested this dataset from Boston’s Planning and Development Agency (BPDA), which sourced the data from U.S. Census Bureau and American Community Survey 2014-2018 5-year Estimates and BPDA Research Division analysis.
  4. elections_data.xlsx: a dataset containing the number of residents, registered voters, votes cast, and voter turnout for Boston’s precincts in municipal elections from November 2005 to November 2017 (excluding special municipal elections and preliminary municipal elections). I created this dataset using the election data available on the City of Boston website. Two important notes: (1) I calculated voter turnout for each precinct according to the City’s definition: the number of votes cast divided by the number of registered voters. (2) I did not include data from the most recent municipal election in November 2019 because I could only find unofficial election data on the City of Boston website, and the City did not respond to my request for official results or confirmation of the unofficial results.

Temporal analysis

Average voter turnout in Boston’s municipal elections from 2005 to 2017, excluding special and preliminary municipal elections, was 0.2545008 (about 25%). Note that including all municipal elections would result in a lower average voter turnout.

Here’s a breakdown of average voter turnout (Mean) in each municipal election. I’ve also included standard deviation (Sd - a measure of how spread out the turnout among precincts is in a given election year), minimum (Min) and maximum (Max) turnout, and range (Range - the difference between the maximum and minimum turnout):

Voter Turnout (%) in Boston's Municipal Elections (2005-2017)
Election    Mean   Sd     Min   Max    Range
2005-11-08  35.58  10.03  2.01  65.66  63.65
2007-11-06  13.79  6.70   0.00  42.60  42.60
2009-11-03  31.60  10.42  0.45  63.57  63.12
2011-11-08  18.30  8.66   0.00  61.26  61.26
2013-11-05  38.46  11.98  4.27  73.81  69.54
2015-11-03  13.85  6.20   1.05  39.31  38.26
2017-11-07  28.08  8.16   0.00  58.75  58.75

There’s one caveat to the above table: Ward 1 Precinct 15 is a mysterious outlier. Here are the rows corresponding to Ward 1 Precinct 15 from my elections dataset (where all data is directly from the City of Boston):

Municipal Election Turnout in Boston's Ward 1 Precinct 15
Election    Registered Voters  Votes Cast  Residents  Turnout (%)
2017-11-07  18                 0           2          0.00
2015-11-03  95                 1           290        1.05
2013-11-05  148                10          358        6.76
2011-11-08  168                0           1225       0.00
2009-11-03  223                1           397        0.45
2007-11-06  250                0           224        0.00
2005-11-08  51                 3           337        5.88

The first and penultimate rows of this table seem to be inaccurate: they list fewer residents than registered voters (2 residents vs. 18 registered voters in 2017, and 224 vs. 250 in 2007). Moreover, the number of residents in the precinct is abnormally small in several years. The City of Boston did not respond to my request for more information, so I can’t solve the mystery of Ward 1 Precinct 15.
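The two checks described here (recomputing turnout from the City’s definition and spotting rows with more registered voters than residents) can be sketched in a few lines of Python. This is only an illustration: the rows are copied from the Ward 1 Precinct 15 table, and the names `rows`, `turnout_pct`, and `suspect` are my own, not part of the original analysis.

```python
# Rows for Ward 1 Precinct 15, copied from the table above:
# (election date, registered voters, votes cast, residents)
rows = [
    ("2017-11-07", 18, 0, 2),
    ("2015-11-03", 95, 1, 290),
    ("2013-11-05", 148, 10, 358),
    ("2011-11-08", 168, 0, 1225),
    ("2009-11-03", 223, 1, 397),
    ("2007-11-06", 250, 0, 224),
    ("2005-11-08", 51, 3, 337),
]


def turnout_pct(votes_cast, registered):
    """Turnout as the City defines it: votes cast / registered voters, in percent."""
    return round(100 * votes_cast / registered, 2)


# Recompute turnout for each election.
turnout = {date: turnout_pct(votes, reg) for date, reg, votes, _ in rows}

# Flag elections where registered voters outnumber residents.
suspect = [date for date, reg, _, residents in rows if reg > residents]

print(turnout)
print(suspect)  # ['2017-11-07', '2007-11-06']
```

The recomputed turnout values match the Turnout column, so the oddity lies in the resident and registration counts rather than in the turnout arithmetic.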

I filtered out Ward 1 Precinct 15 from the data, which slightly changed the table of voter turnout by election year. The mean turnout essentially stayed the same. The minimum turnout increased in every year except 2005 and 2013, which narrowed the range in those five years, and the standard deviation fell slightly in all seven elections.

Voter Turnout (%) in Boston's Municipal Elections (2005-2017)*
Election    Mean   Sd     Min   Max    Range
2005-11-08  35.70  9.87   2.01  65.66  63.65
2007-11-06  13.84  6.66   0.55  42.60  42.05
2009-11-03  31.72  10.26  2.15  63.57  61.42
2011-11-08  18.37  8.60   0.95  61.26  60.31
2013-11-05  38.59  11.84  4.27  73.81  69.54
2015-11-03  13.91  6.16   1.40  39.31  37.91
2017-11-07  28.19  7.98   4.31  58.75  54.44
*Calculations exclude Ward 1 Precinct 15
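For readers who want to reproduce this kind of summary, here is a minimal Python sketch of the five statistics and of the effect of dropping one outlier. The precinct labels and turnout values are hypothetical, not taken from the elections dataset, and I am assuming that Sd is a sample standard deviation (dividing by n - 1).

```python
from statistics import mean, stdev


def summarize(turnouts):
    """Return mean, sample standard deviation, min, max and range (2 decimals)."""
    lo, hi = min(turnouts), max(turnouts)
    return {
        "Mean": round(mean(turnouts), 2),
        "Sd": round(stdev(turnouts), 2),  # sample SD (n - 1); an assumption
        "Min": lo,
        "Max": hi,
        "Range": round(hi - lo, 2),
    }


# Hypothetical turnout (%) for five precincts in one election; not real data.
example = {"A": 0.0, "B": 20.0, "C": 25.0, "D": 30.0, "E": 45.0}

with_outlier = summarize(list(example.values()))
# Drop precinct "A", a stand-in for an outlier such as Ward 1 Precinct 15.
without_outlier = summarize([t for p, t in example.items() if p != "A"])

print(with_outlier)
print(without_outlier)
```

With only five precincts the mean moves a lot when one is dropped. With roughly 255 precincts, as in the real data, removing a single precinct barely moves the mean but can still change the minimum and the range noticeably.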

I illustrated this yearly turnout breakdown in two ways…

An animated map of voter turnout in Boston:

A series of static maps corresponding to each municipal election:

Key take-aways:

  1. Election turnout is much higher in years with mayoral elections (2005, 2009, 2013, 2017) than in years with only other municipal races, like city council races. In mayoral election years, average turnout ranged from 28% to 38%, whereas in other years it ranged from about 14% to 18%. That’s a gap of anywhere from 10 to 25 percentage points.
  2. Turnout varies widely across Boston’s precincts. There’s anywhere from a 38 to 70 percentage point gap between the lowest turnout precinct and the highest turnout precinct in a given election year.
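The size of the mayoral-year gap can be checked directly from the Mean column of the first summary table (the one that still includes Ward 1 Precinct 15). A small Python sketch, with variable names of my own choosing:

```python
# Mean turnout (%) per election year, copied from the first summary table.
mean_turnout = {
    2005: 35.58, 2007: 13.79, 2009: 31.60, 2011: 18.30,
    2013: 38.46, 2015: 13.85, 2017: 28.08,
}
mayoral_years = {2005, 2009, 2013, 2017}

mayoral = [v for y, v in mean_turnout.items() if y in mayoral_years]
other = [v for y, v in mean_turnout.items() if y not in mayoral_years]

# Smallest gap: lowest mayoral year vs. highest non-mayoral year.
smallest_gap = round(min(mayoral) - max(other), 2)
# Largest gap: highest mayoral year vs. lowest non-mayoral year.
largest_gap = round(max(mayoral) - min(other), 2)

print(smallest_gap, largest_gap)  # 9.78 24.67
```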

Geographic analysis

Since voter turnout varies significantly across Boston’s precincts, two questions follow:

  1. Are precincts consistently low or high turnout, or does their turnout vary significantly each election year?
  2. If precincts tend to have consistent turnout levels, where are the low turnout and high turnout areas?

To answer the first question, I found the 10 precincts with the lowest turnout (which I’ll refer to as the “bottom 10”) and the 10 precincts with the highest turnout (which I’ll refer to as the “top 10”) in each election. Note that I exclude Ward 1 Precinct 15 for the reasons explained earlier in this analysis; if we wanted to include this precinct, it would consistently be in the bottom 10.
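As a rough illustration of this step, the Python sketch below ranks precincts by turnout for a single election and returns the lowest and highest n after removing any excluded precincts. The function name `bottom_and_top` and the sample values are hypothetical, and how ties should be broken is left unspecified here.

```python
def bottom_and_top(turnout_by_precinct, n=10, exclude=()):
    """Return the n lowest- and n highest-turnout precincts for one election."""
    kept = {p: t for p, t in turnout_by_precinct.items() if p not in exclude}
    ranked = sorted(kept, key=kept.get)  # ascending by turnout
    return ranked[:n], ranked[-n:][::-1]  # top list starts with the highest


# Hypothetical (ward, precinct) -> turnout (%) for one election; not real data.
election = {
    (1, 15): 0.0, (21, 2): 5.1, (21, 3): 6.4, (4, 10): 7.9,
    (16, 12): 58.8, (20, 14): 55.2, (7, 1): 41.0,
}

bottom, top = bottom_and_top(election, n=2, exclude={(1, 15)})
print(bottom)  # [(21, 2), (21, 3)]
print(top)     # [(16, 12), (20, 14)]
```

Running this once per election and counting how often each precinct appears gives the kind of tallies reported in the tables that follow.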

There are 14 precincts that have made the bottom 10 at least once in the 7 elections from 2005 to 2017:

Precincts in the Bottom 10 Turnout (2005-2017)
Ward  Precinct  # of Times in Bottom 10
4     10        7
21    2         7
21    3         7
21    4         7
21    5         7
21    8         7
4     9         6
21    1         5
21    15        4
5     10        3
21    7         3
21    9         3
21    14        3
5     9         1

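Only 3 wards have ever had a precinct in the bottom 10 from 2005 to 2017. The last column shows each ward’s share of the 70 bottom 10 slots (10 precincts in each of 7 elections):

Wards in the Bottom 10 Turnout (2005-2017)
Ward  # of Times in Bottom 10  % of Bottom 10 Appearances
21    53                       75.71
4     13                       18.57
5     4                        5.71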

6 of 14 precincts have made the bottom 10 in 7 out of 7 elections from 2005 to 2017. Only 1 made the bottom 10 just once. Here’s a table showing, on the left, the number of times in the bottom 10 (out of 7 past elections), and on the right, the number of precincts (out of 14) that have been in the bottom 10 that many times:

Frequency of Appearance in the Bottom 10 Turnout (2005-2017)
# of Times in Bottom 10  # of Precincts
1                        1
3                        4
4                        1
5                        1
6                        1
7                        6
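The ward-level and frequency tables both follow from the precinct-level bottom 10 table earlier in this section. The Python sketch below redoes that aggregation from the 14 precinct rows; the variable names are mine, and the percentages are taken over the 70 bottom 10 slots (7 elections times 10 precincts).

```python
from collections import Counter

# (ward, precinct) -> number of elections in the bottom 10,
# copied from the precinct-level table earlier in this section.
bottom10_counts = {
    (4, 10): 7, (21, 2): 7, (21, 3): 7, (21, 4): 7, (21, 5): 7, (21, 8): 7,
    (4, 9): 6, (21, 1): 5, (21, 15): 4, (5, 10): 3, (21, 7): 3, (21, 9): 3,
    (21, 14): 3, (5, 9): 1,
}

# Ward totals: sum the appearances of every precinct in the ward.
ward_totals = Counter()
for (ward, _), times in bottom10_counts.items():
    ward_totals[ward] += times

# Each ward's share of all bottom 10 slots, in percent.
total_slots = sum(ward_totals.values())
ward_share = {w: round(100 * t / total_slots, 2) for w, t in ward_totals.items()}

# How many precincts appeared exactly k times?
frequency = Counter(bottom10_counts.values())

print(total_slots)                 # 70
print(ward_totals[21], ward_share[21])  # 53 75.71
print(sorted(frequency.items()))
```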

Next, I turn to the top 10. There are 30 precincts that have made the top 10 at least once in the elections from 2005 to 2017:

Precincts in the Top 10 Turnout (2005-2017)
Ward  Precinct  # of Times in Top 10
16    12        7
20    14        7
16    9         6
7     1         4
13    10        4
20    12        4
20    19        4
16    7         3
18    16        3
20    11        3
16    8         2
18    20        2
19    2         2
20    17        2
20    18        2
2     6         1
6     6         1
7     3         1
16    10        1
16    11        1
17    9         1
17    12        1
17    14        1
19    6         1
19    12        1
20    4         1
20    6         1
20    13        1
20    16        1
21    13        1

There are 10 wards that have ever had a precinct in the top 10 from 2005 to 2017. The last column shows each ward’s share of the 70 top 10 slots (10 precincts in each of 7 elections):

Wards in the Top 10 Turnout (2005-2017)
Ward  # of Times in Top 10  % of Top 10 Appearances
20    26                    37.14
16    20                    28.57
7     5                     7.14
18    5                     7.14
13    4                     5.71
19    4                     5.71
17    3                     4.29
2     1                     1.43
6     1                     1.43
21    1                     1.43

2 of 30 precincts have made the top 10 in 7 out of 7 elections from 2005 to 2017, while 15 (half) made the top 10 only once. Here’s a table showing, on the left, the number of times in the top 10 (out of 7 past elections), and on the right, the number of precincts (out of 30) that have been in the top 10 that many times:

Frequency of Appearance in the Top 10 Turnout (2005-2017)
# of Times in Top 10  # of Precincts
1                     15
2                     5
3                     3
4                     4
6                     1
7                     2

So which precincts and wards are consistently low or high turnout?

These are the precincts that have been in the bottom 10 in all of the past elections:

Precincts in the Bottom 10 Turnout Every Election From 2005-2017
Ward  Precinct  # of Times in Bottom 10
4     10        7
21    2         7
21    3         7
21    4         7
21    5         7
21    8         7

These are the precincts that have been in the top 10 in all of the past elections:

Precincts in the Top 10 Turnout Every Election From 2005-2017
Ward  Precinct  # of Times in Top 10
16    12        7
20    14        7

I visualized this data by mapping average voter turnout over 7 elections from 2005-2017. The darkest blue areas correspond to the lowest turnout precincts:

In terms of wards, this chart shows the wards home to the largest share of bottom 10 precincts:

Wards with a Large Number of Precincts in the Bottom 10 Turnout (2005-2017)
Ward  # of Times in Bottom 10  % of Bottom 10 Appearances
21    53                       75.71
4     13                       18.57

Here’s a similar table for the top 10:

Wards with a Large Number of Precincts in the Top 10 Turnout (2005-2017)
Ward  # of Times in Top 10  % of Top 10 Appearances
20    26                    37.14
16    20                    28.57

Once again, I visualized this data by mapping average voter turnout. This time, in addition to averaging turnout across elections, I also averaged across precincts to obtain a ward-level average:
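The two-step averaging can be sketched as follows in Python. The turnout values are hypothetical, and I am assuming a simple unweighted mean at both steps; weighting precincts by their number of registered voters would give somewhat different ward averages.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (ward, precinct) -> turnout (%) in successive elections; not real data.
turnout = {
    (21, 2): [10.0, 4.0, 7.0],
    (21, 3): [12.0, 6.0, 9.0],
    (20, 14): [60.0, 40.0, 50.0],
}

# Step 1: average each precinct across elections.
precinct_avg = {p: mean(values) for p, values in turnout.items()}

# Step 2: average the precinct averages within each ward (unweighted).
by_ward = defaultdict(list)
for (ward, _), avg in precinct_avg.items():
    by_ward[ward].append(avg)
ward_avg = {ward: mean(avgs) for ward, avgs in by_ward.items()}

print(ward_avg)  # {21: 8.0, 20: 50.0}
```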

Key take-away:

Low turnout precincts in Boston are consistently low turnout, whereas high turnout precincts are, for the most part, NOT consistently high turnout. We know this for two reasons:

  1. The pool of precincts and wards that have ever been in the bottom 10 is much smaller than the pool of precincts and wards that have made the top 10: 14 precincts and 3 wards in the bottom 10 vs. 30 precincts and 10 wards in the top 10.
  2. 6 of 14, or 42.86%, of precincts in the bottom 10 have been among the lowest turnout precincts in 7 out of 7 elections. Very few (only 1) have made it only once. The statistics are flipped for the top turnout precincts: only 2 of 30, or 6.67%, of precincts in the top 10 have been among the highest turnout precincts in 7 out of 7 elections, whereas 15 (50%) have made it only once.
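These shares follow directly from the two frequency tables. A short Python sketch of the comparison, with the frequencies copied from those tables and a helper name (`share`) of my own:

```python
# Frequency tables copied from above: times in the list -> number of precincts.
bottom_freq = {1: 1, 3: 4, 4: 1, 5: 1, 6: 1, 7: 6}
top_freq = {1: 15, 2: 5, 3: 3, 4: 4, 6: 1, 7: 2}


def share(freq, times):
    """Percent of precincts in the pool that appeared exactly `times` times."""
    return round(100 * freq.get(times, 0) / sum(freq.values()), 2)


for name, freq in (("bottom 10", bottom_freq), ("top 10", top_freq)):
    # pool size, % always in the list, % in the list only once
    print(name, sum(freq.values()), share(freq, 7), share(freq, 1))
# bottom 10 14 42.86 7.14
# top 10 30 6.67 50.0
```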

Demographic analysis

I plotted (the upper triangle of) a correlation matrix, which shows how strongly pairs of variables are linearly related. Correlation values fall somewhere from -1 to 1. If the correlation is close to 0, there is little or no linear relationship between the variables. If the correlation is positive, it means that the variables move in the same direction (i.e. as one variable gets larger, the other also gets larger). If the correlation is negative, it means that the variables move in opposite directions (i.e. as one variable gets larger, the other gets smaller).
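To make the definition concrete, here is a minimal Python sketch of the Pearson correlation coefficient, which is the usual default for a correlation matrix (I am assuming that is what was plotted). The data are hypothetical: the second series is constructed as 100 minus the first, so the two move in exactly opposite directions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical precinct-level values; not taken from the dataset.
pct_male = [46.0, 49.0, 51.0, 54.0]
pct_female = [100 - m for m in pct_male]  # moves exactly opposite to pct_male
turnout = [30.0, 22.0, 27.0, 18.0]

print(round(pearson(pct_male, pct_female), 2))  # -1.0
print(round(pearson(pct_male, turnout), 2))     # negative, but weaker than -1
```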

We’re most interested in the top row of this matrix, which shows the correlation between turnout and the other variables in the demographic dataset. We could simply look at the strongest correlation values (i.e. furthest from 0, either positive or negative) but this would give us no sense of whether the correlations are statistically significant.

One solution would be to run a regression, which would help us measure the effect of each demographic variable on turnout. However, we can’t use the demographic dataset as is because of high multicollinearity, which means that many of the covariates (the demographic variables) are correlated with each other. Some of these correlations are obvious (for example, the percentage of males in a precinct is clearly going to be perfectly negatively correlated with the percentage of females), and some are more subtle (for example, the matrix shows a strong positive correlation between the percentage of the population that lived in the same house in the past year and the percentage of the population that has a high school-level education or less). These relationships among our demographic variables would make the regression coefficients unstable and hard to interpret.

In order to address the multicollinearity of the dataset, I selected only a few of the demographic variables. I selected roughly one per demographic category (i.e. geographic mobility, population, gender, school enrollment, and poverty), and I chose variables that had a strong correlation with turnout (top row of the correlation matrix) and relatively low correlation with other chosen variables, where possible.

Finally, I ran a regression on this smaller dataset, using turnout in 2017 as the dependent variable. I also ran this regression multiple times, each time using turnout in a different election year as the dependent variable. The results were the same in terms of the signs of the coefficients and which variables were statistically significant. As a result, I only include the 2017 results here for illustrative purposes.

## 
## Call:
## lm(formula = turnout ~ same_house + abroad + male_adult + total_adult + 
##     uni_student + below_poverty, data = condensed_dem_data_2017)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -12.4891  -4.0608  -0.2411   3.7513  24.2119 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   18.4481623  8.2342406   2.240  0.02595 *  
## same_house     0.2440069  0.0616525   3.958 9.88e-05 ***
## abroad        12.2599859 30.1673783   0.406  0.68480    
## male_adult    -0.0268688  0.0862229  -0.312  0.75559    
## total_adult   -0.0010528  0.0003702  -2.844  0.00482 ** 
## uni_student   -0.0220483  0.0459435  -0.480  0.63172    
## below_poverty -0.3257261  0.0348396  -9.349  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.825 on 248 degrees of freedom
## Multiple R-squared:  0.5021, Adjusted R-squared:  0.4901 
## F-statistic: 41.69 on 6 and 248 DF,  p-value: < 2.2e-16

From this output, we can see that three variables have a statistically significant association with voter turnout (p < 0.01 for each), holding the other variables constant:

  1. same_house (which corresponds to the proportion of the population that lived in the same home in Boston over the past year) - this variable has a positive association with turnout.
  2. total_adult (which corresponds to the size of the population) - this variable has a negative association with turnout.
  3. below_poverty (which corresponds to the proportion of the population below the poverty line) - this variable has a negative association with turnout.
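For readers less familiar with regression tables, the Python sketch below shows how a few of the printed numbers relate to one another: each t value is the estimate divided by its standard error, and the adjusted R-squared follows from the R-squared, the number of predictors, and the residual degrees of freedom. All inputs are copied from the output above. The last calculation assumes that below_poverty is measured in percentage points, which the output itself does not confirm.

```python
# Estimates and standard errors copied from the lm() output above.
coefs = {
    "same_house": (0.2440069, 0.0616525),
    "total_adult": (-0.0010528, 0.0003702),
    "below_poverty": (-0.3257261, 0.0348396),
}

# t value = estimate / standard error.
t_values = {name: round(est / se, 3) for name, (est, se) in coefs.items()}

# Adjusted R-squared from R-squared, with k = 6 predictors and
# 248 residual degrees of freedom, so n = 248 + 6 + 1 = 255 precincts.
r2, k, resid_df = 0.5021, 6, 248
n = resid_df + k + 1
adj_r2 = round(1 - (1 - r2) * (n - 1) / resid_df, 4)

# Holding the other variables fixed, a 10-unit rise in below_poverty shifts
# predicted turnout by 10 * coefficient (units assumed to be percentage points).
poverty_effect = round(10 * coefs["below_poverty"][0], 2)

print(t_values)
print(n, adj_r2, poverty_effect)  # 255 0.4901 -3.26
```

Under that units assumption, a precinct with a poverty rate 10 points higher would be predicted to have turnout about 3.3 points lower, all else equal.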

Key take-away:

The demographic variables with a statistically significant relationship to voter turnout are the proportion of people who have lived in the same home for at least a year (positive), the size of the population (negative), and the proportion of the population below the poverty line (negative). Another demographic feature worth mentioning is the proportion of students enrolled in university, which has a strong negative correlation with turnout in the correlation matrix, although it is not statistically significant in the regression once the other variables are included (p = 0.63). The correlation makes sense: Ward 21, which has had the largest share of precincts in the bottom 10 over the past 7 elections, has a very large percentage of university students among its residents.

Summary

The key take-aways from this analysis are:

  1. Election turnout is much higher in years with mayoral elections (2005, 2009, 2013, 2017) than those with other types of municipal elections.
  2. Turnout varies widely across Boston’s precincts: there’s anywhere from a 38 to 70 percentage point gap between the lowest turnout precinct and the highest turnout precinct in a given election year.
  3. Low turnout precincts in Boston are consistently low turnout, whereas high turnout precincts are, for the most part, not consistently high turnout. Therefore, from a policy perspective, we should examine what low turnout areas have in common rather than focusing on features of high turnout areas.
  4. Low turnout areas tend to have any or all of the following features: few people who have lived in the same home for at least a year, a large population, a high poverty rate, and many university students (the last of these shows up in the raw correlations but is not statistically significant in the regression). Note that all of these features are relative (to the average Boston precinct).