Will Bruton
For this project, I wanted to explore a bit of everything: swing states, pollster data, Trump’s polling in 2020 versus 2024 as he reappears as a candidate, and how these factors will influence an election many are calling “The Most Important Election Ever.” Based on my initial research, this race is close. My prediction, however, leans toward Kamala Harris, largely due to the incumbency advantage she carries as the sitting vice president.
Swing States: Swing states will be critical in this election; whoever secures the most is likely to win. As a Pennsylvania native, I see firsthand the constant coverage and intense focus here. While I believe incumbency gives Kamala Harris a slight edge, I also acknowledge the possibility of a late swing toward Trump.
Pollsters: I expect that pollster rankings will correlate with polling accuracy and consistency. Higher-ranked pollsters should provide the most reliable data, which will be essential in a close race like this one.
Trump in 2020 vs. 2024: I’m examining Trump’s polling across both elections. While he has maintained strong support, I’m interested to see if there are any significant shifts or if his base remains as solid.
##
## Call:
## lm(formula = pct ~ numeric_grade + rank + candidate_name, data = t_PresidentPolls2024_with_ratings)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -7.0383 -2.0383 -0.0225  1.9486  8.8009
##
## Coefficients:
##                             Estimate Std. Error t value Pr(>|t|)
## (Intercept)                 52.62559    3.56768  14.751  < 2e-16 ***
## numeric_grade               -2.13483    1.20454  -1.772   0.0767 .
## rank                        -0.02203    0.01160  -1.900   0.0578 .
## candidate_nameKamala Harris  0.83924    0.19085   4.397 1.23e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.803 on 864 degrees of freedom
## (103 observations deleted due to missingness)
## Multiple R-squared: 0.02686, Adjusted R-squared: 0.02348
## F-statistic: 7.948 on 3 and 864 DF, p-value: 3.124e-05
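To make the first model easier to read, here is a minimal sketch that plugs its reported coefficients into the regression equation. The pollster grade and rank used here are hypothetical values picked only for illustration, and the sketch assumes the baseline candidate (the level without its own coefficient) is Trump.

```r
# Coefficients copied from the model summary directly above.
b0       <- 52.62559   # (Intercept)
b_grade  <- -2.13483   # numeric_grade
b_rank   <- -0.02203   # rank
b_harris <-  0.83924   # candidate_nameKamala Harris

# Hypothetical pollster (assumption, not a row from the data).
grade         <- 2.5
pollster_rank <- 50

# Predicted poll share for the baseline candidate (assumed to be Trump).
pred_trump <- b0 + b_grade * grade + b_rank * pollster_rank

# The Harris coefficient shifts the prediction by a fixed amount.
pred_harris <- pred_trump + b_harris

round(c(trump = pred_trump, harris = pred_harris), 2)
```

With these inputs the predictions come out near 46.2 and 47.0. The grade and rank coefficients have p-values of 0.0767 and 0.0578, so neither clears the usual 0.05 cutoff, and the R-squared of about 0.027 means the model explains very little of the poll-to-poll variation.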
##
## Call:
## lm(formula = pct_2024 ~ pct_2020, data = t_PresidentPolls_Trump)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -6.6965 -1.6658  0.0765  1.5612  9.5918
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 34.84407    0.88115   39.54   <2e-16 ***
## pct_2020     0.25766    0.02031   12.69   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.075 on 2700 degrees of freedom
## Multiple R-squared: 0.05625, Adjusted R-squared: 0.0559
## F-statistic: 160.9 on 1 and 2700 DF, p-value: < 2.2e-16
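As a rough way to read the 2020 versus 2024 model, the sketch below applies its reported intercept and slope to a few 2020 poll shares. The three input shares are hypothetical, chosen only to show how the fitted line behaves.

```r
# Coefficients copied from the model summary directly above.
intercept <- 34.84407   # (Intercept)
slope     <- 0.25766    # pct_2020

# Illustrative 2020 poll shares (assumption, not rows from the data).
pct_2020  <- c(40, 45, 50)
pred_2024 <- intercept + slope * pct_2020
data.frame(pct_2020, pred_2024)

# 2020 share at which the fitted line predicts the same share in 2024.
break_even <- intercept / (1 - slope)
round(break_even, 2)
```

The break-even point lands near 46.9: below that 2020 share the line predicts a higher 2024 number, and above it a lower one. The R-squared of about 0.056 is a reminder that 2020 polling explains only a small part of the 2024 variation.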
##
## Call:
## lm(formula = pct ~ candidate_name + methodology, data = t_PresidentPolls2024_with_ratings)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -7.1065 -1.8956  0.1044  1.8935  8.9024
##
## Coefficients:
##                             Estimate Std. Error t value Pr(>|t|)
## (Intercept)                  46.0976     0.1765 261.169  < 2e-16 ***
## candidate_nameKamala Harris   0.7981     0.1796   4.444 9.86e-06 ***
## methodologyOnline Panel       0.2109     0.1895   1.113    0.266
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.791 on 968 degrees of freedom
## Multiple R-squared: 0.02105, Adjusted R-squared: 0.01903
## F-statistic: 10.41 on 2 and 968 DF, p-value: 3.373e-05
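To put the Harris coefficient from this last model in context, here is a small sketch that builds an approximate 95% confidence interval from the reported estimate and standard error, then compares the estimate with the residual standard error. It only reuses numbers printed in the summary and is a back-of-the-envelope check, not a rerun of the model.

```r
# Values copied from the model summary directly above.
est      <- 0.7981   # candidate_nameKamala Harris estimate
se       <- 0.1796   # its standard error
df       <- 968      # residual degrees of freedom
resid_se <- 2.791    # residual standard error

# 95% confidence interval for the average Harris-minus-baseline gap.
t_crit <- qt(0.975, df)
ci     <- est + c(-1, 1) * t_crit * se
round(ci, 2)

# Size of the gap relative to the typical poll-to-poll spread.
round(est / resid_se, 2)
```

The interval stays above zero, which matches the small p-value in the summary, but the gap is under a third of the residual standard error. In other words, the average lead in these polls was real but small compared with how much individual polls bounce around.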
After looking at swing state polling, I had Harris and Trump splitting the swing states roughly evenly: Trump ahead in four and Harris in three, with Nevada possibly going either way. In the end, I was shocked by the results. While Nevada remained up in the air well after the election was called, Trump secured more than enough swing states early on to lock up the votes he needed, and he ultimately won every single swing state. This was not at all what I predicted. I assumed it would be a battle down to the last swing state to decide either candidate’s victory.
I think this result happened for a couple of different reasons. I found that Trump’s support had either stayed the same or grown stronger since the previous election, which could help explain his jump in numbers, but I still thought the race was very close given the other numbers that came back. What I think got overlooked is that Republican support often looks lower in polls because people in the party can be very reluctant to answer them. I also think the silent majority played a part in this. This is another group that is reluctant to answer polls or talk about their politics. It is a mix of both parties, and it is often hard to guess which way its members are leaning.
At the same time, my numbers showed Trump’s polling holding steady this time around, if not growing, since 2020. Who’s to say the same growth in support didn’t happen within the wider population, among people who simply didn’t answer polls? Overall, I was definitely surprised by what occurred and was certainly expecting a closer race.
Sites used for Data: Pollster Data | Polling Data
Swing State Info: NPR
ChatGPT: Major help with cleaning up my code and finding errors in it. Helped streamline my pipes using %in% and other functions. Also helpful for formatting my text in markdown and rewording bits and pieces.