The goal of this research is to determine whether having more than one female candidate on the ticket affects turnout negatively or positively. Specifically, I will analyze turnout in down-ballot races with female candidates (Congressional, State Senate, and State House races) in Florida in 2016.
I have decided to look at the 2016 general election because 2016 was a historic election year for women and feminism. For one, Hillary Clinton was the first woman to win a major party's presidential nomination. At the same time, Hillary Clinton was one of the most disliked and controversial candidates in history, and so was her opponent.
However, while Donald Trump is not considered the face of men in politics, Hillary Clinton’s unprecedented run means her actions may be associated with female candidates for years to come. Her name at the top of the ticket may have driven voters to the polls, but it could have been for negative reasons. Given this political environment, combined with the fact that men have historically dominated public office, we would expect to see that female candidates performed poorly with regard to GOTV when compared to their male counterparts.
On the other hand, Donald Trump’s “grab her by the pussy” comments and multiple allegations of sexual assault may have driven women to the polls in a symbolic act of resistance to the abuse and mistreatment of women. If this is the case, we would expect to see higher levels of turnout in districts with more women candidates on the ballot.
I am particularly interested in Florida because Florida is one of the most diverse states in the country. Florida’s diverse landscapes, cultures, demographics, and large population make for interesting analysis. More obviously, Florida is extremely purple. Understanding Florida’s 2016 election results will not only be beneficial for the 2020 campaign cycle in Florida, but the knowledge gained from this analysis could be applied to other areas across the country with similar levels of diversity.
Understanding voter behavior, particularly turnout, with regard to female candidates is important for campaigns and for society as a whole. If women are to be fairly represented in office, more women need to be elected. And, as a student of political campaigning, I find it important to understand what mobilizes and demobilizes voters, as well as how to overcome these obstacles, in order to win elections.
library(tidyverse)
## ── Attaching packages ───────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.2.1 ✔ purrr 0.3.3
## ✔ tibble 2.1.3 ✔ dplyr 0.8.3
## ✔ tidyr 0.8.3 ✔ stringr 1.4.0
## ✔ readr 1.3.1 ✔ forcats 0.4.0
## ── Conflicts ──────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(sp)
library(raster)
##
## Attaching package: 'raster'
## The following object is masked from 'package:dplyr':
##
## select
## The following object is masked from 'package:tidyr':
##
## extract
library(rgdal)
## rgdal: version: 1.4-4, (SVN revision 833)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 2.4.2, released 2019/06/28
## Path to GDAL shared files: /Library/Frameworks/R.framework/Versions/3.6/Resources/library/rgdal/gdal
## GDAL binary built with GEOS: FALSE
## Loaded PROJ.4 runtime: Rel. 5.2.0, September 15th, 2018, [PJ_VERSION: 520]
## Path to PROJ.4 shared files: /Library/Frameworks/R.framework/Versions/3.6/Resources/library/rgdal/proj
## Linking to sp version: 1.3-1
library(hexbin)
library(readxl)
library(RColorBrewer)
library(sf)
## Linking to GEOS 3.7.2, GDAL 2.4.2, PROJ 5.2.0
library(tmap)
library(colorspace)
##
## Attaching package: 'colorspace'
## The following object is masked from 'package:raster':
##
## RGB
# readxl is already attached above; read the four turnout workbooks
overall_turnout_2016 <- read_excel("~/Documents/overall_turnout_2016.xlsx")
fed_house_turnout <- read_excel("~/Documents/fed_house_turnout.xlsx")
state_senate_turnout <- read_excel("~/Documents/state_senate_turnout.xlsx")
state_house_turnout <- read_excel("~/Documents/state_house_turnout.xlsx")
Geo_FLHouse <- st_read("~/Downloads/tl_2019_12_sldl/tl_2019_12_sldl.shp")
## Reading layer `tl_2019_12_sldl' from data source `/Users/madison/Downloads/tl_2019_12_sldl/tl_2019_12_sldl.shp' using driver `ESRI Shapefile'
## Simple feature collection with 120 features and 12 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -87.6349 ymin: 24.39631 xmax: -79.97431 ymax: 31.00097
## epsg (SRID): 4269
## proj4string: +proj=longlat +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +no_defs
For the purposes of this analysis, I am using four different data sets from the 2016 Florida turnout data:
1. Overall Turnout, which includes 127 contested races across all three levels of analysis
2. Federal House Turnout, which includes a total of 26 contested races
3. State Senate Turnout, which includes a total of 23 contested races
4. State House Turnout, which includes a total of 78 contested races
First, we will look at overall turnout, and how turnout is affected by female candidates down the ballot.
summary(overall_turnout_2016)
## district votes f.candidate total.reg
## Length:127 Min. : 41784 Min. :0.0000 Min. : 69776
## Class :character 1st Qu.: 68590 1st Qu.:0.0000 1st Qu.:105829
## Mode :character Median : 83975 Median :0.0000 Median :119753
## Mean :152490 Mean :0.4409 Mean :222632
## 3rd Qu.:235310 3rd Qu.:1.0000 3rd Qu.:344176
## Max. :409651 Max. :1.0000 Max. :547011
## turnout
## Min. :0.5280
## 1st Qu.:0.6535
## Median :0.6860
## Mean :0.6745
## 3rd Qu.:0.7090
## Max. :0.8070
The mean of the variable “f.candidate” was 0.4409. Because the variable is coded 0/1, this means that about 44% of races included at least one female candidate, so there were slightly more all-male races than male-female or female-female races.
The median turnout was 68.6%, and there was one State House District (5) with a turnout rate of 80.7%.
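Because f.candidate is a 0/1 indicator, its mean can be turned back into a count of races. The short sketch below does that arithmetic using only the race counts and means printed in the summary() outputs of this write-up; the vector names are my own and are not objects from the analysis.

```r
# Race counts and f.candidate means copied from the summary() outputs in this report.
n_races <- c(overall = 127, fed_house = 26, state_senate = 23, state_house = 78)
share_female <- c(overall = 0.4409, fed_house = 0.4615,
                  state_senate = 0.5652, state_house = 0.3974)

# The mean of a 0/1 variable is a proportion, so n * mean recovers the count.
n_female <- round(n_races * share_female)
n_all_male <- n_races - n_female

data.frame(n_races, n_female, n_all_male)

# Consistency check: the three levels should add up to the overall file.
sum(n_female[c("fed_house", "state_senate", "state_house")]) == n_female[["overall"]]
```

Working in counts rather than proportions makes it easier to see how few races sit behind each comparison, particularly at the Federal House and State Senate levels.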
summary(lm(turnout~f.candidate, data=overall_turnout_2016))
##
## Call:
## lm(formula = turnout ~ f.candidate, data = overall_turnout_2016)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.14445 -0.02025 0.01155 0.03425 0.13455
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.672451 0.006412 104.869 <2e-16 ***
## f.candidate 0.004603 0.009657 0.477 0.634
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.05403 on 125 degrees of freedom
## Multiple R-squared: 0.001814, Adjusted R-squared: -0.006171
## F-statistic: 0.2272 on 1 and 125 DF, p-value: 0.6344
cor.test(overall_turnout_2016$turnout, overall_turnout_2016$f.candidate)
##
## Pearson's product-moment correlation
##
## data: overall_turnout_2016$turnout and overall_turnout_2016$f.candidate
## t = 0.47666, df = 125, p-value = 0.6344
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.1326038 0.2152125
## sample estimates:
## cor
## 0.04259489
This regression analysis suggests that having a female candidate on the ticket may raise turnout by about 0.5 percentage points (coefficient 0.0046, p = 0.634). While this is neither statistically significant nor a steep increase, a difference of this size could matter, especially in Florida, a state whose last election included two historic recounts.
The Pearson correlation test suggests that there is a very weak, positive correlation (r = 0.04) between female candidates on the ballot and turnout. The 95% confidence interval (-0.13 to 0.22) includes zero.
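The regression and the correlation test above report the same t statistic and p-value, which is not a coincidence. With a single 0/1 predictor, the slope is the difference between the two group means, and the two tests are equivalent. Here is a minimal sketch of that relationship on simulated data (hypothetical values, not the Florida files; only the group sizes of 71 and 56 mirror the overall data set).

```r
set.seed(2016)

# Hypothetical data, NOT the Florida turnout files.
f_candidate <- rep(c(0, 1), times = c(71, 56))
turnout <- rnorm(127, mean = 0.67, sd = 0.05)

fit <- lm(turnout ~ f_candidate)
slope <- unname(coef(fit)["f_candidate"])
t_lm <- summary(fit)$coefficients["f_candidate", "t value"]

# Difference between the mean turnout of the two groups of races.
diff_means <- mean(turnout[f_candidate == 1]) - mean(turnout[f_candidate == 0])

ct <- cor.test(turnout, f_candidate)

all.equal(slope, diff_means)                               # slope = gap in group means
all.equal(t_lm, unname(ct$statistic))                      # identical t statistic
all.equal(summary(fit)$r.squared, unname(ct$estimate)^2)   # R-squared = r squared
```

The same identity shows up in the real output: squaring the reported correlation of 0.0426 gives roughly 0.0018, the Multiple R-squared from the regression. The two tests are therefore one piece of evidence, not two independent confirmations.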
#Visualizing Overall Turnout and the Effects of Female Candidates
hist(overall_turnout_2016$turnout, main="Distribution of Overall Turnout", xlab="Turnout", ylim=c(0,55), col=brewer.pal(n = 7, name = "RdPu"))
As you can see, more than 55 races had a turnout rate between 65% and 70%. A few races had turnout rates below 55%.
ggplot(overall_turnout_2016, aes(f.candidate=="1", turnout)) + geom_hex(bins = 10)
Based on this hexagonal plot, races with female candidates appear to have more consistent turnout numbers. These races are more evenly distributed, and their turnout stays between 65% and 75%.
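The claim about consistency is read off the plot by eye. One way it could be checked numerically is to compare the spread of turnout in the two groups. The sketch below uses simulated data with made-up means and standard deviations (an assumption for illustration, not the Florida results) and base R's var.test(), an F test that assumes roughly normal turnout within each group.

```r
set.seed(1)

# Hypothetical races: 71 all-male, 56 with a female candidate.
races <- data.frame(
  f.candidate = rep(c(0, 1), times = c(71, 56)),
  turnout = c(rnorm(71, mean = 0.67, sd = 0.06),
              rnorm(56, mean = 0.68, sd = 0.04))
)

# Standard deviation of turnout within each group.
spread <- tapply(races$turnout, races$f.candidate, sd)
spread

# F test for equal variances between the two groups.
vt <- var.test(turnout ~ f.candidate, data = races)
vt$p.value

# A boxplot shows the same comparison more directly than hexagonal bins.
boxplot(turnout ~ f.candidate, data = races,
        names = c("All-male", "Female candidate"), ylab = "Turnout")
```

On the real data, the same calls could be pointed at overall_turnout_2016. A smaller standard deviation in the female-candidate group, together with a small p-value, would support the reading of the plot.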
Now, we will break down turnout by the level of the race and see how turnout is affected by female candidates.
summary(fed_house_turnout)
## district votes f.candidate total.reg
## Min. : 1.00 Min. :253240 Min. :0.0000 Min. :358768
## 1st Qu.: 7.25 1st Qu.:317276 1st Qu.:0.0000 1st Qu.:471326
## Median :13.50 Median :347458 Median :0.0000 Median :480992
## Mean :13.62 Mean :341261 Mean :0.4615 Mean :479727
## 3rd Qu.:19.75 3rd Qu.:368032 3rd Qu.:1.0000 3rd Qu.:512044
## Max. :27.00 Max. :409651 Max. :1.0000 Max. :547011
## turnout
## Min. :0.6540
## 1st Qu.:0.6885
## Median :0.7185
## Mean :0.7120
## 3rd Qu.:0.7378
## Max. :0.7560
The mean of the variable “f.candidate” was 0.4615. This indicates that there were slightly more all-male races than male-female or female-female races.
The median turnout was 71.85%, which is about 3 percentage points higher than the overall median turnout. This is expected, as there is down-ballot dropoff in every election. District 19 had the highest level of turnout at 75.6%.
summary(lm(turnout~f.candidate, data=fed_house_turnout))
##
## Call:
## lm(formula = turnout ~ f.candidate, data = fed_house_turnout)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.060929 -0.020000 0.007786 0.023071 0.041071
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.714929 0.007933 90.120 <2e-16 ***
## f.candidate -0.006429 0.011677 -0.551 0.587
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.02968 on 24 degrees of freedom
## Multiple R-squared: 0.01247, Adjusted R-squared: -0.02868
## F-statistic: 0.3031 on 1 and 24 DF, p-value: 0.587
cor.test(fed_house_turnout$turnout, fed_house_turnout$f.candidate)
##
## Pearson's product-moment correlation
##
## data: fed_house_turnout$turnout and fed_house_turnout$f.candidate
## t = -0.55052, df = 24, p-value = 0.587
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.4783329 0.2881441
## sample estimates:
## cor
## -0.1116719
While not statistically significant (p = 0.587), this regression analysis suggests that having a female candidate on the ticket may actually decrease turnout by about 0.6 percentage points. This is not a steep decrease, but any factor that may change election turnout should be taken into consideration.
The Pearson correlation test suggests that there is a weak, negative correlation (r = -0.11) between female candidates on the ballot and Federal House turnout. In magnitude, this is stronger than the correlation between overall turnout and female candidates, although the confidence interval again includes zero.
hist(fed_house_turnout$turnout, main="Distribution of Federal Turnout", xlab="Turnout", ylim=c(0,10), col=brewer.pal(n = 7, name = "RdPu"))
As you can see, the distribution of Federal House turnout is slightly negatively skewed, and more than 11 races had turnout above 72%.
ggplot(fed_house_turnout, aes(f.candidate=="1", turnout)) + geom_hex(bins = 10)
Based on this hexagonal plot, Federal House races with female candidates have more consistent turnout numbers. These races are more evenly distributed, and their turnout is clustered between 69-74%.
Now, we will look at the 2016 State Senate races.
summary(state_senate_turnout)
## district votes f.candidate total.reg
## Length:23 Min. :148975 Min. :0.0000 Min. :231048
## Class :character 1st Qu.:185648 1st Qu.:0.0000 1st Qu.:306034
## Mode :character Median :211180 Median :1.0000 Median :329126
## Mean :211855 Mean :0.5652 Mean :322595
## 3rd Qu.:235310 3rd Qu.:1.0000 3rd Qu.:344176
## Max. :272698 Max. :1.0000 Max. :386535
## turnout
## Min. :0.5560
## 1st Qu.:0.6195
## Median :0.6630
## Mean :0.6559
## 3rd Qu.:0.7015
## Max. :0.7230
The mean of the variable “f.candidate” was 0.5652. This indicates that there were slightly more races with a female candidate than all-male races.
The median turnout was 66.3%, which is about 5.5 percentage points lower than the Federal House median turnout. This is expected, as there is down-ballot dropoff in every election. District 17 had the highest level of turnout at 72.3%.
summary(lm(turnout~f.candidate, data=state_senate_turnout))
##
## Call:
## lm(formula = turnout ~ f.candidate, data = state_senate_turnout)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.099462 -0.036431 0.007538 0.046038 0.067538
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.6564000 0.0170796 38.432 <2e-16 ***
## f.candidate -0.0009385 0.0227180 -0.041 0.967
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.05401 on 21 degrees of freedom
## Multiple R-squared: 8.125e-05, Adjusted R-squared: -0.04753
## F-statistic: 0.001706 on 1 and 21 DF, p-value: 0.9674
cor.test(state_senate_turnout$turnout, state_senate_turnout$f.candidate)
##
## Pearson's product-moment correlation
##
## data: state_senate_turnout$turnout and state_senate_turnout$f.candidate
## t = -0.041309, df = 21, p-value = 0.9674
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.4196569 0.4046918
## sample estimates:
## cor
## -0.009014038
This regression analysis suggests that having a female candidate on the ticket may decrease turnout by 0.094 percentage points. This is a very slight, statistically insignificant decrease (p = 0.967).
The Pearson correlation test suggests that there is a very weak, negative correlation between female candidates on the ballot and State Senate turnout. This correlation is -0.009, which is weaker than the correlation between Federal House turnout and female candidates.
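With only 23 State Senate races, a non-significant result says little on its own, because a small sample can only detect large effects. A rough power calculation makes this concrete. The sketch below uses base R's power.t.test() and assumes two equal groups of about 11 races each and a standard deviation of 0.054 (the residual standard error reported above). Both inputs are simplifying assumptions, since the real groups are not exactly equal in size.

```r
# Smallest true difference in mean turnout that a two-sample t test would
# detect 80% of the time, under the assumptions stated above.
mde <- power.t.test(n = 11,            # assumed races per group
                    sd = 0.054,        # residual standard error from the regression
                    sig.level = 0.05,
                    power = 0.80)

# Minimum detectable difference, on the same 0-1 scale as turnout.
mde$delta
```

Comparing mde$delta with the estimated State Senate coefficient of -0.0009 shows how far below the detectable range an effect of that size would be at this sample size.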
hist(state_senate_turnout$turnout, main="Distribution of FL Senate Turnout", xlab="Turnout", ylim=c(0,10), xlim=c(0.52, 0.75), col=brewer.pal(n = 7, name = "RdPu"))
The distribution of State Senate turnout is slightly negatively skewed, and more than 16 races had turnout at or above 65%.
ggplot(state_senate_turnout, aes(f.candidate=="1", turnout)) + geom_hex(bins = 10)
The State Senate turnout distribution is different from that of the Federal House or Overall data. Races with a female candidate and races with all-male candidates are both scattered when it comes to turnout. Races with a female candidate have both the highest and the lowest turnout rates at the State Senate level.
Now, we will look at the 2016 State House races.
summary(state_house_turnout)
## SLDLST votes f.candidate total.reg
## Length:120 Min. : 41784 Min. :0.0000 Min. : 69776
## Class :character 1st Qu.: 64117 1st Qu.:0.0000 1st Qu.: 99078
## Mode :character Median : 72498 Median :0.0000 Median :108596
## Mean : 72062 Mean :0.3974 Mean :107458
## 3rd Qu.: 81627 3rd Qu.:1.0000 3rd Qu.:117792
## Max. :105021 Max. :1.0000 Max. :136223
## NA's :42 NA's :42 NA's :42
## turnout
## Min. :0.5280
## 1st Qu.:0.6485
## Median :0.6785
## Mean :0.6675
## 3rd Qu.:0.7027
## Max. :0.8070
## NA's :42
The mean of the variable “f.candidate” was 0.3974. This indicates that there were many more races with all-male candidates than races with at least one female candidate.
The median turnout was 67.85%, which is slightly higher than the State Senate median turnout. Notably, the State House sample size is considerably larger than that of the State Senate or Federal House. District 5 had the highest level of turnout at 80.7%.
summary(lm(turnout~f.candidate, data=state_house_turnout))
##
## Call:
## lm(formula = turnout ~ f.candidate, data = state_house_turnout)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.135213 -0.020255 0.009065 0.033065 0.143787
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.663213 0.008067 82.216 <2e-16 ***
## f.candidate 0.010723 0.012796 0.838 0.405
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.0553 on 76 degrees of freedom
## (42 observations deleted due to missingness)
## Multiple R-squared: 0.009155, Adjusted R-squared: -0.003882
## F-statistic: 0.7022 on 1 and 76 DF, p-value: 0.4047
cor.test(state_house_turnout$turnout, state_house_turnout$f.candidate)
##
## Pearson's product-moment correlation
##
## data: state_house_turnout$turnout and state_house_turnout$f.candidate
## t = 0.838, df = 76, p-value = 0.4047
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.1296066 0.3115803
## sample estimates:
## cor
## 0.09568386
While not statistically significant (p = 0.405), this regression analysis suggests that having a female candidate on the ticket may increase turnout by about 1.1 percentage points. As previously mentioned, a 1-point difference in turnout could decide an election in Florida.
The Pearson correlation test suggests that there is a weak, positive correlation between female candidates on the ballot and State House turnout. This correlation is 0.096. Notably, this correlation is stronger than that of the State Senate races, although its confidence interval (-0.13 to 0.31) includes zero.
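To give the State House estimate a more concrete scale, the coefficient can be translated into votes for a district of typical size. The sketch below uses only numbers printed in the outputs above (the estimate 0.010723, its standard error 0.012796, 76 residual degrees of freedom, and the median registration of 108,596). The variable names are mine, and applying the estimate to the median district is an illustrative assumption.

```r
coef_f <- 0.010723      # f.candidate estimate, State House regression
se_f <- 0.012796        # its standard error
df_resid <- 76          # residual degrees of freedom
median_reg <- 108596    # median registered voters per State House district

# Point estimate expressed in votes for a median-sized district.
votes <- coef_f * median_reg
round(votes)

# 95% confidence interval for the coefficient, then converted to votes.
ci <- coef_f + c(-1, 1) * qt(0.975, df = df_resid) * se_f
ci_votes <- ci * median_reg
round(ci_votes)
```

The point estimate works out to roughly 1,160 votes in a median district, but the interval spans both negative and positive values, which is another way of seeing that the data cannot tell a small gain from a small loss.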
hist(state_house_turnout$turnout, main="Distribution of FL House Turnout", xlab="Turnout", ylim=c(0,35), xlim=c(0.45, 0.90), col=brewer.pal(n = 7, name = "RdPu"))
The distribution of State House turnout is slightly positively skewed. More than 55 State House races had turnout at or above 65%.
ggplot(state_house_turnout, aes(f.candidate=="1", turnout)) + geom_hex(bins = 10)
## Warning: Removed 42 rows containing non-finite values (stat_binhex).
The State House turnout distribution is on par with the Overall and Federal House data in that races with female candidates had more consistent turnout. That said, turnout still varied greatly.
FL_House_Turnout<- merge(Geo_FLHouse,state_house_turnout,by="SLDLST")
tm_shape(FL_House_Turnout) + tm_fill(col = "turnout", palette = brewer.pal(n = 7, name = "Blues"), title = "FL House Turnout 2016") + tm_style("cobalt")
This map demonstrates the varying levels of turnout in the contested Florida House races in 2016.
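One thing worth checking before mapping is the join itself. By default, merge() keeps only rows whose key appears in both tables, so a district code that is formatted differently in the spreadsheet (for example "3" instead of "003") would silently drop out of the map. The sketch below shows the check on two small made-up data frames; the district codes and turnout values are hypothetical.

```r
# Hypothetical stand-ins for the shapefile attributes and the turnout sheet.
geo <- data.frame(SLDLST = sprintf("%03d", 1:5))
turnout_tbl <- data.frame(
  SLDLST = c("001", "002", "3", "005"),   # "3" is deliberately mis-formatted
  turnout = c(0.70, 0.65, 0.68, 0.72)
)

# Default merge is an inner join: unmatched districts disappear.
merged_inner <- merge(geo, turnout_tbl, by = "SLDLST")
nrow(merged_inner)

# Keys in the turnout table that have no match in the geometry.
unmatched <- setdiff(turnout_tbl$SLDLST, geo$SLDLST)
unmatched

# all.x = TRUE keeps every district and leaves NA where turnout is missing.
merged_left <- merge(geo, turnout_tbl, by = "SLDLST", all.x = TRUE)
nrow(merged_left)
```

In this report the turnout file already has 120 rows with NA for uncontested districts, so the join most likely keeps every district, but comparing row counts before and after the merge is a cheap safeguard.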
tm_shape(FL_House_Turnout) + tm_fill(col = "f.candidate", palette = brewer.pal(n = 3, name = "PuRd"), title = "Female House Cand., '16") + tm_style("cobalt")
Going along with the above turnout map, the FL House districts shaded pink had a female candidate on the ticket. Looking at these maps side by side, there does not seem to be a strong correlation between the presence of a female candidate and turnout levels. Notably, H.D. 5 in the Panhandle had the highest turnout level, above 80%, and there was no female House candidate on the ticket. However, this high turnout rate may be attributable to demographic factors in the region; if so, the lack of a female candidate in H.D. 5 likely did not have an effect on turnout.
While this analysis did not produce any shocking or statistically significant results, there were some interesting findings. For one, there seems to be a relationship between female candidates and consistency in turnout statewide. This suggests that female candidates may do a better job of coalition-building or creating movements that resonate with voters across district lines. However, the estimated effect of female candidates on turnout is small and inconsistent in direction: slightly positive overall and in State House races, and slightly negative in Federal House and State Senate races. These differences of roughly 1 percentage point or less are statistically insignificant, but they should not be completely ignored; in Florida, we know a 1% difference can cost candidates the election. In particular, female candidates should treat a possible 1-point swing as an obstacle to factor into their GOTV efforts.
This data set is not perfect, and I encountered many challenges in my analysis. For my part, my R skillset is amateur at best. Additionally, aggregate-level turnout data is limited, and my research question may have been too narrow.
At first glance, this analysis produced disappointing results, because nothing extraordinary came from it. Yet, given the subject matter at hand, this nothing is probably the best result! The lack of impact that female candidates have on turnout may suggest that the “gender gap” that is so prevalent in other aspects of society is not significant with regard to candidates’ power to GOTV. While men still run for office more often and dominate public office, female candidates can GOTV at levels similar to those of male candidates.
More research should be done on how female candidates in Florida perform against their male counterparts. As I mentioned in my introduction, female candidates’ power in turning out voters could be adverse: female candidates may drive more men or anti-feminist women to the polls.