Research Question: Does the presence of a female candidate on the ballot affect turnout?

The goal of this research is to determine whether having at least one female candidate on the ticket affects turnout, negatively or positively. Specifically, I will analyze turnout in down-ballot races (Congressional, State Senate, and State House) in Florida in 2016.

I have decided to look at the 2016 general election because 2016 was a historic election year for women and feminism. For one, Hillary Clinton was the first woman to win a major party's presidential nomination. Beyond that, Clinton was one of the most unlikeable, controversial candidates in history, and so was her opponent.

However, while Donald Trump is not considered the face of men in politics, Hillary Clinton’s unprecedented run means her actions may be associated with female candidates for years to come. Her name at the top of the ticket may have driven voters to the polls, but it could have been for negative reasons. Given this political environment, combined with the fact that men have historically dominated public office, we would expect to see that female candidates performed poorly with regard to GOTV when compared to their male counterparts.

On the other hand, Donald Trump’s “grab her by the pussy” comments and multiple allegations of sexual assault likely drove women to the polls in a symbolic act of resistance to the abuse and mistreatment of women. If this is the case, we would expect to see higher levels of turnout in districts with more women candidates on the ballot.

I am particularly interested in Florida because Florida is one of the most diverse states in the country. Florida’s diverse landscapes, cultures, demographics, and large population make for interesting analysis. More obviously, Florida is extremely purple. Understanding Florida’s 2016 election results will not only be beneficial for the 2020 campaign cycle in Florida, but the knowledge gained from this analysis could be applied to other areas across the country with similar levels of diversity.

Understanding voter behavior, particularly turnout, with regard to female candidates is important for campaigns and society as a whole. If women are to be fairly represented in office, more women need to be elected. And, as a student of political campaigning, I find it important to understand what mobilizes and de-mobilizes voters, as well as how to overcome these obstacles, in order to win elections.

Libraries and Packages

library(tidyverse)
## ── Attaching packages ───────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.2.1     ✔ purrr   0.3.3
## ✔ tibble  2.1.3     ✔ dplyr   0.8.3
## ✔ tidyr   0.8.3     ✔ stringr 1.4.0
## ✔ readr   1.3.1     ✔ forcats 0.4.0
## ── Conflicts ──────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(sp)
library(raster)
## 
## Attaching package: 'raster'
## The following object is masked from 'package:dplyr':
## 
##     select
## The following object is masked from 'package:tidyr':
## 
##     extract
library(rgdal)
## rgdal: version: 1.4-4, (SVN revision 833)
##  Geospatial Data Abstraction Library extensions to R successfully loaded
##  Loaded GDAL runtime: GDAL 2.4.2, released 2019/06/28
##  Path to GDAL shared files: /Library/Frameworks/R.framework/Versions/3.6/Resources/library/rgdal/gdal
##  GDAL binary built with GEOS: FALSE 
##  Loaded PROJ.4 runtime: Rel. 5.2.0, September 15th, 2018, [PJ_VERSION: 520]
##  Path to PROJ.4 shared files: /Library/Frameworks/R.framework/Versions/3.6/Resources/library/rgdal/proj
##  Linking to sp version: 1.3-1
library(hexbin)
library(readxl)
library(RColorBrewer)
library(sf)
## Linking to GEOS 3.7.2, GDAL 2.4.2, PROJ 5.2.0
library(tmap)
library(colorspace)
## 
## Attaching package: 'colorspace'
## The following object is masked from 'package:raster':
## 
##     RGB

Data Used

library(readxl)

# Turnout data for all races combined and for each level of analysis
overall_turnout_2016 <- read_excel("~/Documents/overall_turnout_2016.xlsx")
fed_house_turnout <- read_excel("~/Documents/fed_house_turnout.xlsx")
state_senate_turnout <- read_excel("~/Documents/state_senate_turnout.xlsx")
state_house_turnout <- read_excel("~/Documents/state_house_turnout.xlsx")

Geo_FLHouse <- st_read("~/Downloads/tl_2019_12_sldl/tl_2019_12_sldl.shp")
## Reading layer `tl_2019_12_sldl' from data source `/Users/madison/Downloads/tl_2019_12_sldl/tl_2019_12_sldl.shp' using driver `ESRI Shapefile'
## Simple feature collection with 120 features and 12 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -87.6349 ymin: 24.39631 xmax: -79.97431 ymax: 31.00097
## epsg (SRID):    4269
## proj4string:    +proj=longlat +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +no_defs

For the purposes of this analysis, I am using four different data sets from the 2016 Florida Turnout data:
1. Overall Turnout, which includes 127 contested races across all three levels of analysis
2. Federal House Turnout, which includes a total of 26 contested races
3. State Senate Turnout, which includes a total of 23 contested races
4. State House Turnout, which includes a total of 78 contested races

Analysis of Overall Turnout

First, we will look at overall turnout, and how turnout is affected by female candidates down the ballot.

summary(overall_turnout_2016)
##    district             votes         f.candidate       total.reg     
##  Length:127         Min.   : 41784   Min.   :0.0000   Min.   : 69776  
##  Class :character   1st Qu.: 68590   1st Qu.:0.0000   1st Qu.:105829  
##  Mode  :character   Median : 83975   Median :0.0000   Median :119753  
##                     Mean   :152490   Mean   :0.4409   Mean   :222632  
##                     3rd Qu.:235310   3rd Qu.:1.0000   3rd Qu.:344176  
##                     Max.   :409651   Max.   :1.0000   Max.   :547011  
##     turnout      
##  Min.   :0.5280  
##  1st Qu.:0.6535  
##  Median :0.6860  
##  Mean   :0.6745  
##  3rd Qu.:0.7090  
##  Max.   :0.8070

The mean of the variable “f.candidate” was 0.4409, meaning that about 44% of races included at least one female candidate. There were therefore somewhat more all-male races than male-female or female-female races.
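Because “f.candidate” is a 0/1 indicator, its mean is a proportion, and multiplying by the number of races recovers the underlying counts. The sketch below does this arithmetic using only the race counts and means reported by the summary() calls in this report; the vector names are my own, and the counts are back-calculated from rounded means rather than read from the data.

```r
# Race counts and f.candidate means as reported by summary() in this report
n_races <- c(overall = 127, fed_house = 26, state_senate = 23, state_house = 78)
f_mean  <- c(overall = 0.4409, fed_house = 0.4615,
             state_senate = 0.5652, state_house = 0.3974)

# The mean of a 0/1 indicator is a proportion, so mean * n recovers the count
with_female <- round(f_mean * n_races)
all_male    <- n_races - with_female

print(data.frame(n_races, with_female, all_male))
```

This is also a quick consistency check: the counts for the three levels should add up to the overall count.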

The median turnout was 68.6%, and there was one State House District (5) with a turnout rate of 80.7%.

summary(lm(turnout~f.candidate, data=overall_turnout_2016))
## 
## Call:
## lm(formula = turnout ~ f.candidate, data = overall_turnout_2016)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.14445 -0.02025  0.01155  0.03425  0.13455 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 0.672451   0.006412 104.869   <2e-16 ***
## f.candidate 0.004603   0.009657   0.477    0.634    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.05403 on 125 degrees of freedom
## Multiple R-squared:  0.001814,   Adjusted R-squared:  -0.006171 
## F-statistic: 0.2272 on 1 and 125 DF,  p-value: 0.6344
cor.test(overall_turnout_2016$turnout, overall_turnout_2016$f.candidate)
## 
##  Pearson's product-moment correlation
## 
## data:  overall_turnout_2016$turnout and overall_turnout_2016$f.candidate
## t = 0.47666, df = 125, p-value = 0.6344
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.1326038  0.2152125
## sample estimates:
##        cor 
## 0.04259489

This regression suggests that races with a female candidate on the ticket had turnout about 0.5 percentage points higher. The estimate is not statistically significant (p = 0.634), so it cannot be distinguished from zero. Still, even a small shift could make a difference, especially in Florida, a state whose last election included two historic recounts.
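To give a sense of how uncertain this estimate is, a 95% confidence interval can be computed from the coefficient, standard error, and degrees of freedom printed in the regression output above. This is a minimal sketch that works from those reported numbers (the variable names are my own); calling confint() on the fitted model should give the same interval.

```r
# Values copied from the lm() output for the overall data
est <- 0.004603   # f.candidate coefficient
se  <- 0.009657   # its standard error
df  <- 125        # residual degrees of freedom

# 95% confidence interval: estimate +/- critical t value * standard error
t_crit <- qt(0.975, df)
ci <- est + c(-1, 1) * t_crit * se

# Express the interval in percentage points of turnout
print(round(100 * ci, 2))
```

The interval runs from roughly -1.5 to +2.4 percentage points, so the data are consistent with a small decrease, no effect, or a modest increase.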

The Pearson correlation test suggests that there is a very weak, positive correlation (r = 0.04) between female candidates on the ballot and turnout. Like the regression estimate, it is not statistically significant.
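With a single predictor, the correlation test and the regression are two views of the same result: the squared correlation equals the regression's R-squared, and both share the same t statistic and p-value. A small sketch using the numbers reported above (variable names are my own):

```r
# Values copied from the cor.test() output for the overall data
r  <- 0.04259489  # Pearson correlation
df <- 125         # degrees of freedom (n - 2)

# Squared correlation; compare with "Multiple R-squared" in the lm() output
r_squared <- r^2

# t statistic implied by r; compare with the t value reported by both tests
t_stat <- r * sqrt(df / (1 - r^2))

print(c(r_squared = r_squared, t_stat = t_stat))
```

This is why the two tests report the same p-value (0.6344) in every section of this analysis.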

Visualizing Overall Turnout and the Effects of Female Candidates

hist(overall_turnout_2016$turnout, main="Distribution of Overall Turnout", xlab="Turnout", ylim=c(0,55), col=brewer.pal(n = 7, name = "RdPu"))

As you can see, more than 55 races had a turnout rate between 65-70%. A few races had turnout rates below 55%.

ggplot(overall_turnout_2016, aes(f.candidate=="1", turnout)) + geom_hex(bins = 10)

Based on this hexagonal plot, races with female candidates appear to have more consistent turnout numbers. These races are more evenly distributed, and their turnout is steady between 65% and 75%.
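The "consistency" read off the plot can also be checked numerically by comparing the spread of turnout in the two groups. The sketch below uses simulated data, not the Florida data: the data frame `races` and its values are made up for illustration, with only the group sizes and column names mirroring this report. It shows one way to compare standard deviations and run an F test for equal variances.

```r
set.seed(1)

# Simulated stand-in for the turnout data (values are NOT real)
races <- data.frame(
  f.candidate = rep(c(0, 1), times = c(71, 56)),
  turnout     = c(rnorm(71, mean = 0.67, sd = 0.06),
                  rnorm(56, mean = 0.68, sd = 0.04))
)

# Standard deviation of turnout within each group
spread <- aggregate(turnout ~ f.candidate, data = races, FUN = sd)
print(spread)

# F test of the null hypothesis that the two group variances are equal
f_test <- var.test(turnout ~ f.candidate, data = races)
print(f_test$p.value)

# A boxplot shows the same comparison visually
boxplot(turnout ~ f.candidate, data = races,
        names = c("All-male", "Female candidate"), ylab = "Turnout")
```

Running the same calls on overall_turnout_2016 would show whether the difference in spread seen in the plot is larger than chance alone would produce.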

Analysis of Federal House Turnout

Now, we will break down turnout by the level of the race and see how turnout is affected by female candidates.

summary(fed_house_turnout)
##     district         votes         f.candidate       total.reg     
##  Min.   : 1.00   Min.   :253240   Min.   :0.0000   Min.   :358768  
##  1st Qu.: 7.25   1st Qu.:317276   1st Qu.:0.0000   1st Qu.:471326  
##  Median :13.50   Median :347458   Median :0.0000   Median :480992  
##  Mean   :13.62   Mean   :341261   Mean   :0.4615   Mean   :479727  
##  3rd Qu.:19.75   3rd Qu.:368032   3rd Qu.:1.0000   3rd Qu.:512044  
##  Max.   :27.00   Max.   :409651   Max.   :1.0000   Max.   :547011  
##     turnout      
##  Min.   :0.6540  
##  1st Qu.:0.6885  
##  Median :0.7185  
##  Mean   :0.7120  
##  3rd Qu.:0.7378  
##  Max.   :0.7560

The mean of the variable “f.candidate” was 0.4615, meaning that about 46% of races included at least one female candidate. There were therefore slightly more all-male races than male-female or female-female races.

The median turnout was 71.85%, which is about 3 percentage points higher than the overall median turnout. This is expected, as there is down-ballot dropoff in every election. District 19 had the highest level of turnout at 75.6%.

summary(lm(turnout~f.candidate, data=fed_house_turnout))
## 
## Call:
## lm(formula = turnout ~ f.candidate, data = fed_house_turnout)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.060929 -0.020000  0.007786  0.023071  0.041071 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.714929   0.007933  90.120   <2e-16 ***
## f.candidate -0.006429   0.011677  -0.551    0.587    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.02968 on 24 degrees of freedom
## Multiple R-squared:  0.01247,    Adjusted R-squared:  -0.02868 
## F-statistic: 0.3031 on 1 and 24 DF,  p-value: 0.587
cor.test(fed_house_turnout$turnout, fed_house_turnout$f.candidate)
## 
##  Pearson's product-moment correlation
## 
## data:  fed_house_turnout$turnout and fed_house_turnout$f.candidate
## t = -0.55052, df = 24, p-value = 0.587
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.4783329  0.2881441
## sample estimates:
##        cor 
## -0.1116719

This regression suggests that having a female candidate on the ticket may actually decrease turnout by about 0.6 percentage points. This is not a steep decrease, and it is not statistically significant (p = 0.587), but any factor that may change election turnout should be taken into consideration.

The Pearson correlation test suggests that there is a weak, negative correlation between female candidates on the ballot and Federal House turnout. At -0.11, this correlation is larger in magnitude than the correlation between overall turnout and female candidates, though it is also not statistically significant.

Visualizing Federal House Turnout and the Effects of Female Candidates

hist(fed_house_turnout$turnout, main="Distribution of Federal Turnout", xlab="Turnout", ylim=c(0,10), col=brewer.pal(n = 7, name = "RdPu"))

As you can see, the distribution of Federal House turnout is slightly negatively skewed, and more than 11 races had turnout above 72%.

ggplot(fed_house_turnout, aes(f.candidate=="1", turnout)) + geom_hex(bins = 10)

Based on this hexagonal plot, Federal House races with female candidates have more consistent turnout numbers. These races are more evenly distributed, and their turnout is clustered between 69-74%.

Analysis of State Senate Turnout

Now, we will look at the 2016 State Senate races.

summary(state_senate_turnout)
##    district             votes         f.candidate       total.reg     
##  Length:23          Min.   :148975   Min.   :0.0000   Min.   :231048  
##  Class :character   1st Qu.:185648   1st Qu.:0.0000   1st Qu.:306034  
##  Mode  :character   Median :211180   Median :1.0000   Median :329126  
##                     Mean   :211855   Mean   :0.5652   Mean   :322595  
##                     3rd Qu.:235310   3rd Qu.:1.0000   3rd Qu.:344176  
##                     Max.   :272698   Max.   :1.0000   Max.   :386535  
##     turnout      
##  Min.   :0.5560  
##  1st Qu.:0.6195  
##  Median :0.6630  
##  Mean   :0.6559  
##  3rd Qu.:0.7015  
##  Max.   :0.7230

The mean of the variable “f.candidate” was 0.5652, meaning that about 57% of races included at least one female candidate. There were therefore slightly more races with a female candidate than all-male races.

The median turnout was 66.3%, which is a dropoff of about 5 percentage points from the Federal House median turnout. This is expected, as there is down-ballot dropoff in every election. District 17 had the highest level of turnout at 72.3%.

summary(lm(turnout~f.candidate, data=state_senate_turnout))
## 
## Call:
## lm(formula = turnout ~ f.candidate, data = state_senate_turnout)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.099462 -0.036431  0.007538  0.046038  0.067538 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.6564000  0.0170796  38.432   <2e-16 ***
## f.candidate -0.0009385  0.0227180  -0.041    0.967    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.05401 on 21 degrees of freedom
## Multiple R-squared:  8.125e-05,  Adjusted R-squared:  -0.04753 
## F-statistic: 0.001706 on 1 and 21 DF,  p-value: 0.9674
cor.test(state_senate_turnout$turnout, state_senate_turnout$f.candidate)
## 
##  Pearson's product-moment correlation
## 
## data:  state_senate_turnout$turnout and state_senate_turnout$f.candidate
## t = -0.041309, df = 21, p-value = 0.9674
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.4196569  0.4046918
## sample estimates:
##          cor 
## -0.009014038

This regression suggests that having a female candidate on the ticket may decrease turnout by about 0.09 percentage points. This is a very slight, statistically insignificant decrease (p = 0.967).

The Pearson correlation test suggests that there is a very weak, negative correlation between female candidates on the ballot and State Senate turnout. At -0.009, this correlation is weaker than the correlation between Federal House turnout and female candidates.

Visualizing State Senate Turnout and the Effects of Female Candidates

hist(state_senate_turnout$turnout, main="Distribution of FL Senate Turnout", xlab="Turnout", ylim=c(0,10), xlim=c(0.52, 0.75), col=brewer.pal(n = 7, name = "RdPu"))

The distribution of State Senate turnout is slightly negatively skewed, and more than 16 races had turnout at or above 65%.

ggplot(state_senate_turnout, aes(f.candidate=="1", turnout)) + geom_hex(bins = 10)

The State Senate turnout distribution is different from that of the Federal or Overall data. Both races with a female candidate and races with all-male candidates are scattered when it comes to turnout. Races with a female candidate have both the highest and the lowest turnout rates at the State Senate level.

Analysis of State House Turnout

Now, we will look at the 2016 State House races.

summary(state_house_turnout)
##     SLDLST              votes         f.candidate       total.reg     
##  Length:120         Min.   : 41784   Min.   :0.0000   Min.   : 69776  
##  Class :character   1st Qu.: 64117   1st Qu.:0.0000   1st Qu.: 99078  
##  Mode  :character   Median : 72498   Median :0.0000   Median :108596  
##                     Mean   : 72062   Mean   :0.3974   Mean   :107458  
##                     3rd Qu.: 81627   3rd Qu.:1.0000   3rd Qu.:117792  
##                     Max.   :105021   Max.   :1.0000   Max.   :136223  
##                     NA's   :42       NA's   :42       NA's   :42      
##     turnout      
##  Min.   :0.5280  
##  1st Qu.:0.6485  
##  Median :0.6785  
##  Mean   :0.6675  
##  3rd Qu.:0.7027  
##  Max.   :0.8070  
##  NA's   :42

The mean of the variable “f.candidate” was 0.3974, meaning that about 40% of contested races included at least one female candidate. There were many more races with all-male candidates than races with at least one female candidate.

The median turnout was 67.85%, which is slightly higher than the State Senate median turnout. Notably, the State House sample size is considerably larger than that of the State Senate or Federal House. District 5 had the highest level of turnout at 80.7%.
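The summary above reports 42 NA's among the 120 State House districts, which leaves the 78 contested races analyzed here; the NA rows are presumably the districts without a contested race. lm() and ggplot drop those rows automatically, but it can be clearer to remove them explicitly and confirm the count. A minimal sketch on a toy data frame (the data frame `house` and its values are made up; only the column names follow state_house_turnout):

```r
# Toy stand-in for state_house_turnout: some districts carry NA values
house <- data.frame(
  SLDLST      = c("001", "002", "003", "004", "005"),
  f.candidate = c(0, NA, 1, NA, 0),
  turnout     = c(0.70, NA, 0.66, NA, 0.81)
)

# Keep only the rows with an observed turnout value
contested <- house[!is.na(house$turnout), ]

# Count how many districts were dropped
n_dropped <- nrow(house) - nrow(contested)
print(c(kept = nrow(contested), dropped = n_dropped))
```

Applied to the real data, the same check should report 78 rows kept and 42 dropped, matching the "42 observations deleted due to missingness" note that lm() prints.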

summary(lm(turnout~f.candidate, data=state_house_turnout))
## 
## Call:
## lm(formula = turnout ~ f.candidate, data = state_house_turnout)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.135213 -0.020255  0.009065  0.033065  0.143787 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 0.663213   0.008067  82.216   <2e-16 ***
## f.candidate 0.010723   0.012796   0.838    0.405    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0553 on 76 degrees of freedom
##   (42 observations deleted due to missingness)
## Multiple R-squared:  0.009155,   Adjusted R-squared:  -0.003882 
## F-statistic: 0.7022 on 1 and 76 DF,  p-value: 0.4047
cor.test(state_house_turnout$turnout, state_house_turnout$f.candidate)
## 
##  Pearson's product-moment correlation
## 
## data:  state_house_turnout$turnout and state_house_turnout$f.candidate
## t = 0.838, df = 76, p-value = 0.4047
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.1296066  0.3115803
## sample estimates:
##        cor 
## 0.09568386

This regression suggests that having a female candidate on the ticket may increase turnout by about 1.1 percentage points. While this is not statistically significant (p = 0.405), as previously mentioned, a 1% difference in turnout could decide an election in Florida.

The Pearson correlation test suggests that there is a weak, positive correlation between female candidates on the ballot and State House turnout. This correlation is 0.09. Notably, this correlation is stronger than that of the State Senate races.

Visualizing State House Turnout and the Effects of Female Candidates

hist(state_house_turnout$turnout, main="Distribution of FL House Turnout", xlab="Turnout", ylim=c(0,35), xlim=c(0.45, 0.90), col=brewer.pal(n = 7, name = "RdPu"))

The distribution of State House turnout is slightly positively skewed. More than 55 State House races had turnout at or above 65%.

ggplot(state_house_turnout, aes(f.candidate=="1", turnout)) + geom_hex(bins = 10)
## Warning: Removed 42 rows containing non-finite values (stat_binhex).

The State House race turnout distribution is on-par with the Overall and Federal House data in that races with female candidates had more consistent turnout. With that being said, turnout still varied greatly.

Mapping out the FL House Race, 2016

FL_House_Turnout<- merge(Geo_FLHouse,state_house_turnout,by="SLDLST")

tm_shape(FL_House_Turnout) + tm_fill(col = "turnout", palette = brewer.pal(n = 7, name = "Blues"), title = "FL House Turnout 2016") + tm_style("cobalt")

This map demonstrates the varying levels of turnout in the contested Florida House races in 2016.
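A merge like the one above only lines up districts whose key values match exactly, so it is worth checking the keys before mapping. The sketch below uses two toy data frames (both made up for illustration) and assumes the shapefile stores SLDLST as three-character, zero-padded codes such as "001", which is my understanding of the TIGER/Line format rather than something shown in this report.

```r
# Toy stand-ins for the shapefile attributes and the turnout table
shapes <- data.frame(SLDLST = c("001", "002", "003"))
turnout_tbl <- data.frame(district = c(1L, 3L), turnout = c(0.70, 0.65))

# Build a zero-padded key so that district 1 becomes "001"
turnout_tbl$SLDLST <- sprintf("%03d", turnout_tbl$district)

# Districts in the shapefile with no matching turnout row
unmatched <- setdiff(shapes$SLDLST, turnout_tbl$SLDLST)
print(unmatched)

# all.x = TRUE keeps every district; unmatched ones get NA turnout
merged <- merge(shapes, turnout_tbl, by = "SLDLST", all.x = TRUE)
print(merged)
```

Districts that end up with NA turnout are drawn in the map's missing-value color, so a long list of unmatched keys usually points to a formatting mismatch rather than missing data.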

tm_shape(FL_House_Turnout) + tm_fill(col = "f.candidate", palette = brewer.pal(n = 3, name = "PuRd"), title = "Female House Cand., '16") + tm_style("cobalt")

Going along with the above turnout map, the FL House districts shaded pink had a female candidate on the ticket. Looking at these maps side by side, there does not seem to be a strong correlation between the presence of a female candidate and turnout levels. Notably, H.D. 5 in the Panhandle had the highest turnout, above 80%, and there was no female House candidate on the ticket. However, this high turnout may be attributable to demographic factors in the region, so the lack of a female candidate in H.D. 5 likely did not have an effect on turnout.

Concluding Thoughts

While this analysis did not produce any shocking or statistically significant results, there were some interesting findings. For one, there seems to be a relationship between female candidates and consistency in turnout statewide. This suggests that female candidates may do a better job of coalition-building or creating movements that resonate with voters across district lines. However, the estimated effect of female candidates on turnout was small and mixed in direction: slightly negative in the Federal House and State Senate races, and slightly positive in the State House races and in the overall data. These differences of roughly 1 percentage point or less are statistically insignificant, but they should not be completely ignored; in Florida, we know a 1% difference can cost candidates the election. In particular, female candidates should factor this possibility into their GOTV efforts.

This data set is not perfect, and I encountered many challenges in my analysis. For my own part, my R skillset is amateur at best. Additionally, aggregate-level turnout data is limited, and my research question may have been too narrow.
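One limitation of the pooled "overall" regression is that it mixes three levels of office with different baseline turnout, which can blur or even reverse the estimated effect of a female candidate. A possible extension, sketched here on simulated data only, is to add the office level as a control. The `level` column, the baseline values, and the data frame `pooled` are all hypothetical and not part of the original data sets; only the race counts per level come from this report.

```r
set.seed(42)

# Simulated races: counts per level match this report, values do NOT
level <- rep(c("fed_house", "state_senate", "state_house"),
             times = c(26, 23, 78))
f.candidate <- rbinom(length(level), size = 1, prob = 0.44)

# Hypothetical baseline turnout for each level, plus random noise
baseline <- c(fed_house = 0.71, state_senate = 0.66, state_house = 0.67)
turnout  <- unname(baseline[level]) + rnorm(length(level), sd = 0.05)

pooled <- data.frame(level = factor(level), f.candidate, turnout)

# Model as used in this report: female candidate indicator only
fit_simple <- lm(turnout ~ f.candidate, data = pooled)

# Extension: compare races within the same level of office
fit_level <- lm(turnout ~ f.candidate + level, data = pooled)

print(coef(summary(fit_level))["f.candidate", ])
```

With the real data, this would require adding a column that records each race's level before combining the three level-specific data sets.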

At first glance, this analysis produced disappointing results, because nothing extraordinary came from it. Yet, given the subject matter at hand, this nothing is probably the best result! The lack of impact that female candidates have on turnout may suggest that the “gender gap” that is so prevalent in other aspects of society is not significant with regard to candidates’ power to GOTV. While men still run for office more often and dominate public office, female candidates can GOTV at levels similar to those of male candidates.

More research should be done on how female candidates in Florida perform against their male counterparts. As I mentioned in my introduction, female candidates’ power in turning out voters could be adverse: female candidates may drive more men or anti-feminist women to the polls.