Created by Riley Kearney. Updated 10/29/2024

Thesis

Several factors can bias pollsters' results, leading to inaccurate predictions.

Data

Most of the data used in this project was historical presidential election polling data from 2020. To simplify the analysis, I focused on surveys conducted in October and limited the scope to Trump-specific data rather than including information on other past candidates.

Key variables included:

Methods

1.

The visualization above shows a negative correlation between pct_error and sample size: as the sample size increases, the percent error decreases. The number of sponsors and the length of the survey also show a slight negative correlation with pct_error.
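
As a rough illustration of how a correlation like this can be computed, here is a minimal sketch in Python. The numbers are made-up toy values, not the project's data, and `pearson_r` is a hypothetical helper name chosen for this example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Made-up toy values for illustration only (NOT the project's data).
sample_size = [400, 600, 800, 1000, 1500, 2000]
pct_error = [5.1, 4.6, 4.8, 3.9, 3.5, 3.0]

r = pearson_r(sample_size, pct_error)
print(f"r = {r:.2f}")  # a negative r means error tends to fall as size grows
```

A value of r near -1 would indicate a strong negative relationship, while a value near 0 would indicate little linear relationship.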

2.

The graph above is another visualization of the relationship between sample size and percent error. The regression line has a slight negative slope, which indicates that percent error tends to decrease as sample size increases. This suggests that larger sample sizes are associated with more accurate polling results.
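
A regression line of this kind can be fit with ordinary least squares. The sketch below assumes a simple one-variable fit, which may not match exactly how the graph was produced; the toy values are made up and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up toy values for illustration only (NOT the project's data).
sample_size = [400, 600, 800, 1000, 1500, 2000]
pct_error = [5.1, 4.6, 4.8, 3.9, 3.5, 3.0]

slope, intercept = fit_line(sample_size, pct_error)
# A negative slope means predicted error drops as sample size rises.
print(f"pct_error ~ {slope:.5f} * sample_size + {intercept:.2f}")
```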

3.

This box-and-whisker plot shows that pollsters with a Republican bias had a lower percent error for Trump than pollsters with a Democratic or N/A bias. This suggests that Republican-biased pollsters tend to be more accurate in predicting Trump’s results.

4.

This box-and-whisker plot shows that pollsters with an N/A bias have the lowest median percent error, but they are less consistent. Republican-biased pollsters have a slightly higher median percent error than N/A pollsters but are more consistent.
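
The two quantities a box plot compares here, the median and the spread of the box, can also be computed directly per group. This is a minimal sketch using the interquartile range (IQR) as the measure of consistency; the rows are made-up toy values, not the project's data, and `summarize_by_group` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import median, quantiles

# Made-up toy rows of (bias label, pct_error); NOT the project's data.
polls = [
    ("Republican", 3.2), ("Republican", 3.4), ("Republican", 3.6), ("Republican", 3.8),
    ("Democrat", 4.0), ("Democrat", 4.5), ("Democrat", 5.0), ("Democrat", 5.5),
    ("N/A", 1.5), ("N/A", 2.5), ("N/A", 3.5), ("N/A", 6.0),
]


def summarize_by_group(rows):
    """Return {label: {"median": ..., "iqr": ...}} for each bias label."""
    groups = defaultdict(list)
    for label, err in rows:
        groups[label].append(err)
    summary = {}
    for label, errs in groups.items():
        # quantiles(n=4) returns the three quartile cut points.
        q1, _, q3 = quantiles(errs, n=4)
        summary[label] = {"median": median(errs), "iqr": q3 - q1}
    return summary


summary = summarize_by_group(polls)
for label, stats in summary.items():
    print(f"{label}: median={stats['median']:.2f}, IQR={stats['iqr']:.2f}")
```

A lower median points to better typical accuracy, and a smaller IQR points to more consistent results. Note that `statistics.quantiles` uses its "exclusive" method by default, so the quartiles may differ slightly from the ones a plotting library draws.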

Prediction

I focused on polls with larger sample sizes and those conducted by pollsters who tend to have a Republican bias, as these factors generally correlate with lower percent error. Applying these insights to the current polls, I concentrated my analysis on the swing states. As shown in the graph above, all the swing states are projected to swing blue, with predicted percentages for Trump below 50%. Based on these findings, I predict Kamala Harris will win the 2024 presidential election.

Limitations

The findings in this project are based on data from the 2020 presidential election, which occurred under unique social, political, and public health conditions. The COVID-19 pandemic was a major factor, leading to a significant increase in mail-in voting. This shift may have influenced voter behavior and polling methodologies in ways that are unlikely to be relevant for the 2024 election.

Discussion

When creating my predictions, I looked only at the swing states, which resulted in me predicting that Kamala Harris would win each of them. For example, I predicted Trump would receive 47.7% of the vote in Arizona, and he actually received 52.2%, a miss of 4.5 percentage points. The pattern was similar in the rest of the swing states, since Trump won all of them. I’m not happy with the results, since I did not predict a single swing state correctly. I think the error most likely comes from basing my analysis on data from the 2020 election.
