Final Project Proposal

On the night of August 28th, 2018, an upset took place in Florida’s Democratic primary for governor. Gwen Graham, the daughter of a popular former Democratic governor of Florida, had long been seen as the establishment favorite, given her name recognition and centrist political tendencies. Every single poll in the weeks leading up to election night had her in first place, and by wide margins. However, as the results rolled in through the night, it became clear that Andrew Gillum, a dark-horse candidate who was far outspent by his competitors and had relatively little name recognition, would win the nomination. Running as a more progressive alternative to the other Democratic candidates, Gillum won when nobody expected it. It was a stunning result in Florida, a state known for generally electing more moderate Democrats, such as Senator Bill Nelson and Representative Debbie Wasserman Schultz. He now faces Ron DeSantis, a staunchly conservative, Trump-backed candidate, in the general election for governor.

The aim of this project is to analyze the results of Florida’s August 28th Democratic primary for governor in order to understand why Gillum won. Many media outlets reported that this year’s Democratic primary in Florida saw its highest turnout in years, subtly implying that this is what propelled Gillum to victory. My first goal is to test this hypothesis using county-level data provided by the State of Florida, as well as precinct-level data provided by six or seven individual counties, including those that contain Miami, West Palm Beach, Orlando, Tampa, Jacksonville, and Tallahassee. I would most likely test the hypothesis with a regression analysis, and I hope to use the Leaflet package to make choropleth maps that show the change in turnout by county relative to previous years. Other packages and tools may prove useful as the analysis continues and it becomes clearer what would contribute most to a deeper understanding.
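
As a rough illustration of the regression step, the sketch below fits a simple linear model of Gillum’s county-level vote share on the change in Democratic primary turnout. The data frame `county_df`, its column names, and all of its values are made-up placeholders rather than real results. The actual model would use the state’s county-level data and might well need additional controls.

```r
# Hypothetical county-level data: every value here is a placeholder, not a real result.
county_df <- data.frame(
  county         = c("A", "B", "C", "D", "E", "F"),
  turnout_change = c(0.02, 0.05, 0.08, 0.03, 0.10, 0.06),  # 2018 turnout minus turnout in an earlier primary
  gillum_share   = c(0.20, 0.28, 0.37, 0.22, 0.45, 0.30)   # Gillum votes / all Democratic primary votes
)

# Simple linear regression: does a larger turnout increase go with a larger Gillum share?
fit <- lm(gillum_share ~ turnout_change, data = county_df)

# Estimate, standard error, t value, and p-value for the intercept and slope
summary(fit)$coefficients
```

A positive and statistically significant coefficient on `turnout_change` would be consistent with the turnout hypothesis, although a county-level association on its own would not show that higher turnout caused Gillum’s win.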

Data

The data for this analysis would have to come from multiple sources. First, I would need voter registration statistics over time, so that I can compare total registration numbers to the number of votes cast, calculate the 2018 turnout level, and graph how it compares to past gubernatorial election years in Florida. The Florida Department of State keeps these “bookclosing” statistics here: https://dos.myflorida.com/elections/data-statistics/voter-registration-statistics/bookclosing/

The output below shows 2018 voter registration statistics as of July 30, 2018, the last day to register to vote in the governor’s primary.

library(readxl)  # read_excel()
library(dplyr)   # glimpse()
voterstats2018 <- read_excel('2018pri_party.xlsx')
glimpse(voterstats2018)
## Observations: 68
## Variables: 13
## $ `County Name`                      <chr> "Alachua", "Baker", "Bay", ...
## $ `Republican Party of Florida`      <dbl> 49066, 7885, 61943, 7622, 1...
## $ `Florida Democratic Party`         <dbl> 83302, 5301, 30745, 6034, 1...
## $ `Constitution Party`               <dbl> 26, 3, 15, 3, 116, 112, 1, ...
## $ `Ecology Party`                    <dbl> 7, 0, 3, 1, 27, 47, 0, 7, 6...
## $ `Green Party`                      <dbl> 195, 3, 98, 7, 314, 467, 3,...
## $ `Independent Party`                <dbl> 401, 13, 370, 16, 1929, 375...
## $ `Libertarian Party`                <dbl> 730, 26, 530, 48, 1349, 156...
## $ `Party for Socialism & Liberation` <dbl> 10, 1, 4, 0, 22, 39, 0, 3, ...
## $ `Reform Party of Florida`          <dbl> 23, 1, 15, 3, 49, 85, 1, 9,...
## $ `No Party Affiliation`             <dbl> 40935, 1579, 25664, 2463, 1...
## $ Total                              <dbl> 174695, 14812, 119387, 1619...
## $ Precincts                          <dbl> 63, 9, 44, 14, 163, 577, 15...

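The goal is to use the website linked above to track these numbers over time; the publicly available data there goes back to 1994. Note that the file has 68 rows even though Florida has 67 counties; the extra row is most likely a statewide total and would need to be dropped before any county-level analysis.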

Of course, I would want to compare these numbers to the results of the election itself and use ‘mutate’ to generate turnout numbers (total number of votes in the Democratic primary / registered Democratic Party voters). The election results are also available through the State of Florida’s website.

library(readr)  # read_csv()
electionstats2018 <- read_csv('20180828_ElecResultsFL.csv')
## Parsed with column specification:
## cols(
##   ElectionDate = col_character(),
##   PartyCode = col_character(),
##   PartyName = col_character(),
##   RaceCode = col_character(),
##   RaceName = col_character(),
##   CountyCode = col_character(),
##   CountyName = col_character(),
##   Juris1num = col_integer(),
##   Juris2num = col_character(),
##   Precincts = col_integer(),
##   PrecinctsReporting = col_integer(),
##   CanNameLast = col_character(),
##   CanNameFirst = col_character(),
##   CanNameMiddle = col_character(),
##   CanVotes = col_integer()
## )
glimpse(electionstats2018)
## Observations: 2,598
## Variables: 15
## $ ElectionDate       <chr> "8/28/18", "8/28/18", "8/28/18", "8/28/18",...
## $ PartyCode          <chr> "REP", "REP", "REP", "REP", "REP", "REP", "...
## $ PartyName          <chr> "Republican Party", "Republican Party", "Re...
## $ RaceCode           <chr> "USS", "USS", "USS", "USS", "USS", "USS", "...
## $ RaceName           <chr> "United States Senator", "United States Sen...
## $ CountyCode         <chr> "ALA", "BAK", "BAY", "BRA", "BRE", "BRO", "...
## $ CountyName         <chr> "Alachua", "Baker", "Bay", "Bradford", "Bre...
## $ Juris1num          <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ Juris2num          <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ Precincts          <int> 63, 9, 44, 14, 174, 577, 11, 67, 31, 47, 59...
## $ PrecinctsReporting <int> 63, 9, 44, 14, 174, 577, 11, 67, 31, 47, 59...
## $ CanNameLast        <chr> "De La Fuente", "De La Fuente", "De La Fuen...
## $ CanNameFirst       <chr> "Roque", "Roque", "Roque", "Roque", "Roque"...
## $ CanNameMiddle      <chr> "Rocky", "Rocky", "Rocky", "Rocky", "Rocky"...
## $ CanVotes           <int> 2185, 244, 1771, 306, 7774, 9038, 113, 2938...
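
The turnout calculation described earlier (votes cast in the Democratic primary for governor divided by registered Democrats) could be sketched roughly as follows. The two toy tables reuse the column names from the outputs above, but every value in them is a placeholder. The race code "GOV" is an assumption that would need to be checked against the actual `RaceCode` values in the results file.

```r
library(dplyr)

# Toy stand-ins for voterstats2018 and electionstats2018; all values are placeholders.
voterstats_toy <- tibble(
  `County Name`              = c("Alachua", "Baker"),
  `Florida Democratic Party` = c(1000, 500)
)
electionstats_toy <- tibble(
  CountyName  = c("Alachua", "Alachua", "Baker", "Baker", "Alachua"),
  PartyCode   = c("DEM", "DEM", "DEM", "DEM", "REP"),
  RaceCode    = c("GOV", "GOV", "GOV", "GOV", "GOV"),  # "GOV" is an assumed code
  CanNameLast = c("Gillum", "Graham", "Gillum", "Graham", "DeSantis"),
  CanVotes    = c(200, 150, 50, 100, 300)
)

turnout <- electionstats_toy %>%
  # Keep only the Democratic primary for governor
  filter(PartyCode == "DEM", RaceCode == "GOV") %>%
  # Total Democratic votes cast in each county
  group_by(CountyName) %>%
  summarise(dem_votes = sum(CanVotes)) %>%
  # Attach registered Democrats by matching on county name
  inner_join(voterstats_toy, by = c("CountyName" = "County Name")) %>%
  # Turnout = votes cast / registered Democratic Party voters
  mutate(turnout = dem_votes / `Florida Democratic Party`)

turnout
```

Joining on county name assumes the two files spell every county identically, which should be verified on the real data (for example with `anti_join`) before trusting the result.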

Where I’m going from here

My next few steps are to find precinct-level datasets for the specific counties I plan to look at. Within these counties, I want to identify which precincts saw the largest change in turnout and see how that change correlates with the share of the vote that went to Andrew Gillum. I am sampling only these counties because the number of precincts in Florida as a whole is too large to handle; Miami-Dade County alone, for example, has 783 precincts.
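
Once the precinct-level files are in hand, the comparison could look something like the sketch below. The table `precinct_df`, its column names, the choice of 2014 as the comparison year, and all values are hypothetical placeholders, since the real precinct files have not been collected yet and each county may publish them in a different layout.

```r
library(dplyr)

# Hypothetical precinct-level data for two unnamed counties; all values are placeholders.
precinct_df <- tibble(
  county       = rep(c("County X", "County Y"), each = 4),
  precinct     = c(1:4, 1:4),
  turnout_2014 = c(0.20, 0.25, 0.30, 0.22, 0.15, 0.18, 0.21, 0.24),
  turnout_2018 = c(0.25, 0.35, 0.33, 0.30, 0.19, 0.20, 0.27, 0.25),
  gillum_share = c(0.30, 0.45, 0.25, 0.40, 0.35, 0.28, 0.42, 0.26)
)

by_county <- precinct_df %>%
  # Change in turnout between the two primaries, per precinct
  mutate(turnout_change = turnout_2018 - turnout_2014) %>%
  group_by(county) %>%
  summarise(
    n_precincts      = n(),
    # Pearson correlation between turnout change and Gillum's vote share
    cor_change_share = cor(turnout_change, gillum_share)
  )

by_county
```

Computing the correlation separately for each county keeps large counties such as Miami-Dade from dominating a pooled estimate.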