The FiveThirtyEight article “Voter Registrations Are Way, Way Down During The Pandemic” (https://fivethirtyeight.com/features/voter-registrations-are-way-way-down-during-the-pandemic/), published June 26, 2020, by Kaleigh Rogers and Nathaniel Rakich, compares new voter registrations in the spring of 2020 with the same period in 2016. Spring is typically a busy registration period in the run-up to a presidential election, yet new registrations in 2020 appear to have declined relative to 2016.
The raw data is available on GitHub (https://github.com/fivethirtyeight/data/tree/master/voter-registration) and was imported into R for further analysis. The data covers new registrations from January through April (and through May for some jurisdictions) in both 2016 and 2020 for 11 states and the District of Columbia. The structure of the dataset is shown below:
gitURL <- "https://raw.githubusercontent.com/fivethirtyeight/data/master/voter-registration/new-voter-registrations.csv"
voterReg <- read.csv(gitURL)
str(voterReg)
## 'data.frame': 106 obs. of 4 variables:
## $ Jurisdiction : chr "Arizona" "Arizona" "Arizona" "Arizona" ...
## $ Year : int 2016 2016 2016 2016 2020 2020 2020 2020 2016 2016 ...
## $ Month : chr "Jan" "Feb" "Mar" "Apr" ...
## $ New.registered.voters: int 25852 51155 48614 30668 33229 50853 31872 10249 87574 103377 ...
To sort the data by month properly, each month name is first converted to an integer. Next, since the Year variable holds both 2016 and 2020 data, the new registrations are split into two variables, one for 2016 and one for 2020. This makes side-by-side comparisons easier to read in the data table and allows a simple calculated variable, the delta between 2016 and 2020 new registrations, for each jurisdiction and month reported. Note that subtracting the two vectors position by position assumes the 2016 and 2020 rows appear in the same jurisdiction and month order, which holds for this dataset. The results of these transformations are stored in a new data frame.
library(dplyr)
# Map month abbreviations to integers so the months sort chronologically
monthOrder <- c("Jan", "Feb", "Mar", "Apr", "May")
voterReg$numMonth <- match(voterReg$Month, monthOrder)
# New registrations for 2016, in the row order of the original data
newReg2016 <- voterReg$New.registered.voters[voterReg$Year == 2016]
newReg2016
## [1] 25852 51155 48614 30668 87574 103377 174278 185478 17024 20707
## [11] 25627 22204 3007 3629 5124 3818 2840 2954 4706 4157
## [21] 5714 50231 87351 73627 52508 34952 40976 44150 37028 44040
## [31] 99674 52782 76098 19580 29122 40497 26655 5828 35213 84357
## [41] 58272 73341 29374 132860 143795 170607 143199 91205 20032 36911
## [51] 44171 20460 26239
# New registrations for 2020, in the row order of the original data
newReg2020 <- voterReg$New.registered.voters[voterReg$Year == 2020]
newReg2020
## [1] 33229 50853 31872 10249 151595 238281 176810 38970 20260 33374
## [11] 18990 6034 3276 3353 2535 589 3334 3348 2225 1281
## [21] 1925 77466 109859 54872 21031 38573 55386 26284 15484 44443
## [31] 68455 47899 21332 21532 20708 23864 10061 23488 111990 54053
## [41] 54807 35484 23517 134559 130080 129424 34694 35678 25934 29507
## [51] 31492 5467 8239
newRegDelta <- newReg2020 - newReg2016
newRegDelta
## [1] 7377 -302 -16742 -20419 64021 134904 2532 -146508 3236
## [10] 12667 -6637 -16170 269 -276 -2589 -3229 494 394
## [19] -2481 -2876 -3789 27235 22508 -18755 -31477 3621 14410
## [28] -17866 -21544 403 -31219 -4883 -54766 1952 -8414 -16633
## [37] -16594 17660 76777 -30304 -3465 -37857 -5857 1699 -13715
## [46] -41183 -108505 -55527 5902 -7404 -12679 -14993 -18000
# One row per jurisdiction and month, with both years and their difference side by side
subsetVoterReg <- data.frame(distinct(voterReg, Jurisdiction, Month, numMonth), newReg2016, newReg2020, newRegDelta)
subsetVoterReg
## Jurisdiction Month numMonth newReg2016 newReg2020 newRegDelta
## 1 Arizona Jan 1 25852 33229 7377
## 2 Arizona Feb 2 51155 50853 -302
## 3 Arizona Mar 3 48614 31872 -16742
## 4 Arizona Apr 4 30668 10249 -20419
## 5 California Jan 1 87574 151595 64021
## 6 California Feb 2 103377 238281 134904
## 7 California Mar 3 174278 176810 2532
## 8 California Apr 4 185478 38970 -146508
## 9 Colorado Jan 1 17024 20260 3236
## 10 Colorado Feb 2 20707 33374 12667
## 11 Colorado Mar 3 25627 18990 -6637
## 12 Colorado Apr 4 22204 6034 -16170
## 13 Delaware Jan 1 3007 3276 269
## 14 Delaware Feb 2 3629 3353 -276
## 15 Delaware Mar 3 5124 2535 -2589
## 16 Delaware Apr 4 3818 589 -3229
## 17 District of Columbia Jan 1 2840 3334 494
## 18 District of Columbia Feb 2 2954 3348 394
## 19 District of Columbia Mar 3 4706 2225 -2481
## 20 District of Columbia Apr 4 4157 1281 -2876
## 21 District of Columbia May 5 5714 1925 -3789
## 22 Florida Jan 1 50231 77466 27235
## 23 Florida Feb 2 87351 109859 22508
## 24 Florida Mar 3 73627 54872 -18755
## 25 Florida Apr 4 52508 21031 -31477
## 26 Georgia Jan 1 34952 38573 3621
## 27 Georgia Feb 2 40976 55386 14410
## 28 Georgia Mar 3 44150 26284 -17866
## 29 Georgia Apr 4 37028 15484 -21544
## 30 Illinois Jan 1 44040 44443 403
## 31 Illinois Feb 2 99674 68455 -31219
## 32 Illinois Mar 3 52782 47899 -4883
## 33 Illinois Apr 4 76098 21332 -54766
## 34 Maryland Jan 1 19580 21532 1952
## 35 Maryland Feb 2 29122 20708 -8414
## 36 Maryland Mar 3 40497 23864 -16633
## 37 Maryland Apr 4 26655 10061 -16594
## 38 Maryland May 5 5828 23488 17660
## 39 North Carolina Jan 1 35213 111990 76777
## 40 North Carolina Feb 2 84357 54053 -30304
## 41 North Carolina Mar 3 58272 54807 -3465
## 42 North Carolina Apr 4 73341 35484 -37857
## 43 North Carolina May 5 29374 23517 -5857
## 44 Texas Jan 1 132860 134559 1699
## 45 Texas Feb 2 143795 130080 -13715
## 46 Texas Mar 3 170607 129424 -41183
## 47 Texas Apr 4 143199 34694 -108505
## 48 Texas May 5 91205 35678 -55527
## 49 Virginia Jan 1 20032 25934 5902
## 50 Virginia Feb 2 36911 29507 -7404
## 51 Virginia Mar 3 44171 31492 -12679
## 52 Virginia Apr 4 20460 5467 -14993
## 53 Virginia May 5 26239 8239 -18000
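The side-by-side table above depends on the 2016 and 2020 rows lining up by position. As a hedged alternative, the sketch below joins the two years explicitly on jurisdiction and month using base R `merge()`, so the result does not depend on row order. It is not the method used in this analysis; `voterRegSample`, `monthLevels`, and `wideReg` are hypothetical names, and the sample holds only the Arizona and Delaware values copied from the table above.

```r
# Illustrative excerpt of the long-format data (Arizona and Delaware only)
voterRegSample <- data.frame(
  Jurisdiction = rep(c("Arizona", "Delaware"), each = 8),
  Year = rep(rep(c(2016, 2020), each = 4), times = 2),
  Month = rep(c("Jan", "Feb", "Mar", "Apr"), times = 4),
  New.registered.voters = c(25852, 51155, 48614, 30668,  # Arizona 2016
                            33229, 50853, 31872, 10249,  # Arizona 2020
                            3007, 3629, 5124, 3818,      # Delaware 2016
                            3276, 3353, 2535, 589)       # Delaware 2020
)

monthLevels <- c("Jan", "Feb", "Mar", "Apr", "May")

# Split by year and give each count column a year-specific name
reg2016 <- voterRegSample[voterRegSample$Year == 2016,
                          c("Jurisdiction", "Month", "New.registered.voters")]
reg2020 <- voterRegSample[voterRegSample$Year == 2020,
                          c("Jurisdiction", "Month", "New.registered.voters")]
names(reg2016)[3] <- "newReg2016"
names(reg2020)[3] <- "newReg2020"

# Join on the keys; by default merge() keeps only key pairs present in both years
wideReg <- merge(reg2016, reg2020, by = c("Jurisdiction", "Month"))
wideReg$numMonth <- match(wideReg$Month, monthLevels)
wideReg$newRegDelta <- wideReg$newReg2020 - wideReg$newReg2016

# Restore chronological order within each jurisdiction
wideReg <- wideReg[order(wideReg$Jurisdiction, wideReg$numMonth), ]
rownames(wideReg) <- NULL
print(wideReg)
```

Because the join is keyed, a jurisdiction that reported a month in only one of the two years would be dropped rather than silently misaligned; passing `all = TRUE` to `merge()` would keep such rows with `NA` in the missing year.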
Visually explore the data by faceting graphs for each jurisdiction. Display 2016 data points in red and 2020 data points in blue over a numeric representation of each month, which keeps the months in proper order.
library(ggplot2)
# 2016 in red, 2020 in blue, one panel per jurisdiction
Cfgraph <- ggplot(subsetVoterReg) +
  geom_line(aes(x = numMonth, y = newReg2016), color = "red") +
  geom_line(aes(x = numMonth, y = newReg2020), color = "blue") +
  xlab('Month') +
  ylab('New.Registrations') +
  facet_wrap(~Jurisdiction) +
  labs(title = "Comparison of 2016 to 2020 New Voter Registrations")
print(Cfgraph)
Also graph the net change in new voter registrations for each jurisdiction over each month.
# Monthly difference (2020 minus 2016), one panel per jurisdiction
Netgraph <- ggplot(subsetVoterReg, aes(x = numMonth, y = newRegDelta)) +
  geom_line() +
  xlab('Month') +
  ylab('Net.Change.in.New.Registrations') +
  facet_wrap(~Jurisdiction) +
  labs(title = "Net Change of 2016 to 2020 New Voter Registrations")
print(Netgraph)
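The comparison plot hard-codes the line colors, so it has no legend identifying which year is which. One possible variation, sketched below under stated assumptions, keeps the data in long format and maps `Year` to color so that ggplot2 draws the legend automatically. The names `sampleLong` and `legendGraph` are hypothetical, the sample contains only the Arizona and Delaware values from the table above, and the free y-axis is a design choice that is not part of the original plots.

```r
library(ggplot2)

# Illustrative long-format excerpt (Arizona and Delaware only)
sampleLong <- data.frame(
  Jurisdiction = rep(c("Arizona", "Delaware"), each = 8),
  Year = rep(rep(c(2016, 2020), each = 4), times = 2),
  numMonth = rep(1:4, times = 4),
  newReg = c(25852, 51155, 48614, 30668,  # Arizona 2016
             33229, 50853, 31872, 10249,  # Arizona 2020
             3007, 3629, 5124, 3818,      # Delaware 2016
             3276, 3353, 2535, 589)       # Delaware 2020
)

# Mapping Year to color (as a factor) yields one line per year plus a legend
legendGraph <- ggplot(sampleLong,
                      aes(x = numMonth, y = newReg, color = factor(Year))) +
  geom_line() +
  scale_color_manual(values = c("2016" = "red", "2020" = "blue"), name = "Year") +
  scale_x_continuous(breaks = 1:4, labels = c("Jan", "Feb", "Mar", "Apr")) +
  facet_wrap(~Jurisdiction, scales = "free_y") +
  labs(x = "Month", y = "New registrations",
       title = "Comparison of 2016 to 2020 New Voter Registrations")
print(legendGraph)
```

Setting `scales = "free_y"` lets small jurisdictions such as Delaware use their own vertical range, which makes their trends visible but means panel heights are no longer directly comparable.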
Based on the plots comparing 2016 with 2020, California, Florida, Illinois, and Texas appear to have seen the largest drops in new voter registrations during the spring before the presidential election. Because the data report discrete monthly counts of new registrations, we can verify which jurisdictions had the largest net change by summing the monthly deltas for each jurisdiction over the period:
# Sum the monthly deltas for each jurisdiction (base R; no additional packages needed)
aggregate(newRegDelta ~ Jurisdiction, subsetVoterReg, sum)
## Jurisdiction newRegDelta
## 1 Arizona -30086
## 2 California 54949
## 3 Colorado -6904
## 4 Delaware -5825
## 5 District of Columbia -8258
## 6 Florida -489
## 7 Georgia -21379
## 8 Illinois -90465
## 9 Maryland -22029
## 10 North Carolina -706
## 11 Texas -217231
## 12 Virginia -47174
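To make the ranking explicit, the totals can be sorted from the largest net decline to the largest net gain. The sketch below is a minimal illustration that re-enters the twelve totals printed above by hand; `netDelta` and `ranked` are hypothetical names, not objects from the analysis.

```r
# Net change per jurisdiction, copied from the aggregate() output above
netDelta <- data.frame(
  Jurisdiction = c("Arizona", "California", "Colorado", "Delaware",
                   "District of Columbia", "Florida", "Georgia", "Illinois",
                   "Maryland", "North Carolina", "Texas", "Virginia"),
  newRegDelta = c(-30086, 54949, -6904, -5825, -8258, -489,
                  -21379, -90465, -22029, -706, -217231, -47174)
)

# Sort ascending so the largest declines come first
ranked <- netDelta[order(netDelta$newRegDelta), ]
rownames(ranked) <- NULL
print(head(ranked, 4))

# Count how many jurisdictions had a net decline over the period
numDeclined <- sum(netDelta$newRegDelta < 0)
print(numDeclined)
```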
Here we can see that, summed across the four or five months of data, the largest net declines in new voter registrations occurred in Texas, Illinois, Virginia, and Arizona. California’s large net gain reflects far more registrations in January and February of 2020 than in the same months of 2016, which more than offset the sharp drop in April; this can be seen in the first faceted visualization. Overall, new registrations fell in 11 of the 12 jurisdictions studied, which is consistent with the pandemic having reduced new registrations. It would be interesting to see whether the size of each state’s decline correlates with the extent of its social-distancing efforts to control the virus.