*https://rpubs.com/hrh00009/1232925
##
## Call:
## lm(formula = candidatevotes ~ MEHOINUSPAA646N, data = IncomeXvotes_rep,
## na.action = na.exclude)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -527027 -175868   -2603  202620  593865
##
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)
## (Intercept)     1.518e+06  3.402e+05   4.462  0.00211 **
## MEHOINUSPAA646N 2.321e+01  7.420e+00   3.129  0.01405 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 352800 on 8 degrees of freedom
## (4 observations deleted due to missingness)
## Multiple R-squared: 0.5503, Adjusted R-squared: 0.494
## F-statistic: 9.788 on 1 and 8 DF, p-value: 0.01405
##
## Pearson's product-moment correlation
##
## data: IncomeXvotes_rep$MEHOINUSPAA646N and IncomeXvotes_rep$candidatevotes
## t = 3.1285, df = 8, p-value = 0.01405
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.2104571 0.9348118
## sample estimates:
## cor
## 0.7417894
cor = 0.7417894
p-value = 0.01405
R-squared = 0.5503
The correlation test and linear model tell us that the correlation between median household income in PA and votes for a Republican candidate is strong, and we are confident that it reflects a meaningful relationship rather than random chance (p-value less than 0.05).
But the model explains just over half of the variance (R-squared = 0.5503), meaning other factors also play significant roles in predicting vote counts.
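The three summary figures for the Republican model are linked: in a regression with a single predictor, R-squared is the square of the Pearson correlation, and the t statistic follows from the correlation and the degrees of freedom. A minimal R sketch of that relationship, using only values printed in the output above (the variable names are illustrative and not taken from the original analysis):

```r
# Values copied from the Republican-model output
r  <- 0.7417894   # Pearson correlation reported by cor.test()
df <- 8           # residual degrees of freedom (n - 2)

r_squared <- r^2                             # compare with Multiple R-squared
t_stat    <- r * sqrt(df) / sqrt(1 - r^2)    # compare with the reported t value
p_value   <- 2 * pt(-abs(t_stat), df = df)   # two-sided p-value from the t distribution

round(r_squared, 4)
round(t_stat, 4)
round(p_value, 5)
```

The results should agree with the printed R-squared, t value, and p-value up to rounding, which is why the slope's p-value, the F-test p-value, and the correlation test's p-value are all identical.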
##
## Call:
## lm(formula = candidatevotes ~ MEHOINUSPAA646N, data = IncomeXvotes_dem,
## subset = !is.na(candidatevotes))
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -252152 -155635  -17697  136196  359963
##
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)
## (Intercept)     1.510e+06  2.090e+05   7.225 9.03e-05 ***
## MEHOINUSPAA646N 2.737e+01  4.557e+00   6.006 0.000321 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 216700 on 8 degrees of freedom
## Multiple R-squared: 0.8185, Adjusted R-squared: 0.7958
## F-statistic: 36.08 on 1 and 8 DF, p-value: 0.0003211
##
## Pearson's product-moment correlation
##
## data: IncomeXvotes_dem$MEHOINUSPAA646N and IncomeXvotes_dem$candidatevotes
## t = 6.0065, df = 8, p-value = 0.0003211
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.6391842 0.9775156
## sample estimates:
## cor
## 0.9047118
cor = 0.9047118
p-value = 0.0003211
R-squared = 0.8185
The correlation test and linear model tell us that the correlation between median household income in PA and votes for a Democratic candidate is very strong (0.9047118), which is 0.1629224 higher than the correlation for Republican votes (0.7417894). With only 10 observations per party and overlapping 95 percent confidence intervals, that gap should not be read as statistically significant on its own. We are very confident that the Democratic correlation is meaningful rather than random chance (p-value = 0.0003211).
The model explains a substantial portion of the variance in the voting data (R-squared = 0.8185), indicating a good fit.
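To compare the two correlations, it helps to look at their confidence intervals side by side. The sketch below rebuilds the 95 percent intervals with Fisher's z-transformation and checks whether they overlap. The helper `fisher_ci` is a hypothetical name introduced here, and n = 10 is inferred from df = 8 in both outputs.

```r
# Rebuild the 95% confidence intervals for a Pearson correlation via Fisher's z
fisher_ci <- function(r, n, level = 0.95) {
  z    <- atanh(r)                      # Fisher z-transform of r
  se   <- 1 / sqrt(n - 3)               # approximate standard error of z
  crit <- qnorm(1 - (1 - level) / 2)    # normal critical value
  tanh(c(lower = z - crit * se, upper = z + crit * se))  # back-transform to r scale
}

n <- 10  # assumption: df = 8 in both outputs implies 10 observations
ci_rep <- fisher_ci(0.7417894, n)   # Republican correlation
ci_dem <- fisher_ci(0.9047118, n)   # Democratic correlation

# TRUE when the two intervals share some range of values
overlap <- ci_rep["lower"] <= ci_dem["upper"] && ci_dem["lower"] <= ci_rep["upper"]

ci_rep
ci_dem
overlap
```

Overlapping intervals do not prove the two correlations are equal, but with so few observations they are a reason to treat the difference between them cautiously.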
Post-election, we can see that Pennsylvania ultimately swung red: Trump received 50.4% (3,539,563 votes) and Kamala Harris received 48.6% (3,416,992 votes), with the remaining 1% split between Jill Stein and Chase Oliver.
With the predictions above … Trump’s votes were underestimated by 420,829 and Harris’s by 20,078.
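The size of each miss is easier to judge relative to the actual counts. The sketch below uses only the vote totals and shortfalls quoted above, and assumes that "under" means the prediction fell short of the actual count, so that the implied prediction is the actual count minus the shortfall. The variable names are illustrative.

```r
# Actual 2024 PA vote counts and prediction shortfalls quoted in the text
actual    <- c(Trump = 3539563, Harris = 3416992)
shortfall <- c(Trump = 420829,  Harris = 20078)

predicted   <- actual - shortfall         # implied predictions (assumption stated above)
pct_error   <- 100 * shortfall / actual   # shortfall as a percentage of actual votes
total_short <- sum(shortfall)             # combined underestimate across both candidates

predicted
round(pct_error, 1)
total_short
```

On these figures the Republican estimate missed by roughly 12% of the actual count, while the Democratic estimate missed by under 1%.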
The largest contributing factor to Trump’s triumph in PA appears to be the last-minute “Amish for Trump” movement, as “unprecedented numbers” (NYP) of the Amish community registered to vote and voted for Trump. This was reportedly in response to a federal Department of Agriculture raid on a farm in PA on January 4th.
Overall, the total number of votes was underestimated by 440,907. Although many people are upset by the 2024 election outcome, and wrong predictions are not ideal, this case can be seen as a win for the importance of voting. Voting is a civic duty, and this year PA showed the US how much every vote matters.
However, with respect to the correlation between median household income and votes, my thesis held up. The Amish turnout demonstrated the strong correlation between lower income and voting red. The predictions above were based on historical income, where the Amish community was not accounted for. Had their numbers been present in the historical vote counts, the predictions might have been more accurate.