Data was mostly sourced from the 538 website on 10/18/2024.



 

If the election happened today…

Below is the projected outcome of the election if the polls were 100% accurate, which they are not.

Party Electoral Votes
Dem 276
Rep 262




Assigning the probability of winning the state

The code below defines a function that creates a probability distribution (think of a bell curve) centered on the voting margin reported by 538, with a margin of error that can be adjusted if needed.

Essentially, if the margin is 3% in favor of Harris, the function creates a bell curve centered on 3%, and I then check what percentage of the curve lies above 0%, which would indicate a Harris victory in that state. That percentage is the probability that Harris wins the state. Below is an example of this logic.

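# Simulate a state's win percentage from its polling margin.
# Relies on two globals set before the call: n (number of samples)
# and sd (margin of error, used as the standard deviation).
simulate_win_percentage <- function(win_prob) {
  # win_prob is the polling margin in percentage points, despite its name
  # Generate random samples from a normal distribution centered on the margin
  samples <- rnorm(n = n, mean = win_prob, sd = sd)
  # Calculate the percentage of samples greater than 0
  percent_above_zero <- mean(samples > 0) * 100
  return(percent_above_zero)
}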

x <- rnorm(500, mean = 3, sd = 3)
hist(x, breaks = 10, main = "Example distribution given margin of victory is 3% \n and margin of error is 3% ... roughly \n 84% chance of winning the state")
abline(v=0)
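
The "roughly 84%" in the histogram title can be checked without simulation, because the share of a normal curve above 0 has a closed form. The sketch below is an illustration, not part of the original analysis: `exact_win_percentage` is a hypothetical helper, and it assumes the margin of error is used as the standard deviation, as in the function above.

```r
# Hypothetical helper: share of a normal(margin, sd) curve that lies above 0,
# expressed as a percentage. P(X > 0) = pnorm(margin / sd).
exact_win_percentage <- function(margin, sd) {
  pnorm(margin / sd) * 100
}

# Closed-form value for a 3% margin with a 3% margin of error
exact_pct <- exact_win_percentage(3, 3)

# Simulated value for the same inputs, using many more samples than the histogram
set.seed(1)
simulated_pct <- mean(rnorm(100000, mean = 3, sd = 3) > 0) * 100

print(c(exact = exact_pct, simulated = simulated_pct))
```

The two numbers should agree to within simulation noise, which shrinks as the number of samples grows.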



Once the probability for each state is determined, I sample from that state to determine its winner. I do this by drawing a random number between 0 and 100. If the number is above the probability of Harris winning the state, Trump wins the state. For example, New Jersey has a 98% chance of being won by Harris, so the random number would have to be above 98 for Trump to win it. Another example is Pennsylvania, a swing state that is essentially 50/50: if the random number is above 50, the state goes to Trump, and if it is below 50, it goes to Harris.
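
As a minimal sketch of this draw, the snippet below repeats it many times for the two states mentioned above and checks that Harris wins each one about as often as its probability says. The 98% and 50% figures come from the text; the variable names and the number of draws are assumptions made for illustration.

```r
set.seed(42)

# Harris win probabilities (percent) quoted in the text
dem_prob <- c(NewJersey = 98, Pennsylvania = 50)
n_draws <- 10000  # illustrative number of repeated draws per state

# For each state, draw uniform numbers on 0-100; Harris wins when the draw
# is at or below her probability, Trump wins when it is above
harris_share <- sapply(dem_prob, function(p) {
  draws <- runif(n_draws, 0, 100)
  mean(draws <= p) * 100
})

print(round(harris_share, 1))
```

Each state gets its own draws here, so a lucky number for one state does not carry over to another.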


I repeat this sampling thousands of times to get different, but mostly similar, outcomes, and this is how the entire presidential election is simulated. Of course, the winner is the candidate with the most electoral votes (at least 270 of 538)!


library(dplyr)  # provides %>% and mutate()

# Assumes `voting` is already loaded with one row per state and the columns
# Percent2024, Party_2024, and ElectoralVotes.
dems_won_p_list <- c()
for (k in 1:15) {
  n <- 5000  # samples drawn per state inside simulate_win_percentage()
  sd <- k    # margin of error (standard deviation) for this pass
  # Probability (0-100) that the currently leading party wins each state
  voting$simulated_win_percentage <- sapply(voting$Percent2024, simulate_win_percentage)

  dems_won <- 0
  n_sim <- 5000
  for (j in 1:n_sim) {
    voting2 <- voting %>%
      mutate(dem_prob = ifelse(Party_2024 == "Dem", simulated_win_percentage, 100 - simulated_win_percentage)) %>%
      # One independent draw per state; a draw above dem_prob sends the state to Trump
      mutate(dem_state = ifelse(runif(nrow(voting), 0, 100) > dem_prob, 0, 1))

    # Tally electoral votes for this simulated election
    dem_electorals <- sum(voting2$ElectoralVotes[voting2$dem_state == 1])
    rep_electorals <- sum(voting2$ElectoralVotes[voting2$dem_state == 0])

    if (dem_electorals >= 270) {
      dems_won <- dems_won + 1
    }
  }
  # Percentage of simulated elections won by the Democrats at this margin of error
  dems_won_p_list[k] <- dems_won / n_sim * 100
}
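
Since the `voting` data frame is not shown here, the following self-contained sketch runs the same kind of simulation on a made-up three-state map. Everything in `toy` (state names, probabilities, electoral votes) and the `votes_to_win` threshold is hypothetical and chosen only to make the mechanics easy to follow; it uses base R only.

```r
set.seed(2024)

# Hypothetical three-state map; all values are illustrative, not real data
toy <- data.frame(
  state = c("A", "B", "C"),
  dem_prob = c(90, 50, 10),      # Democratic win probability, in percent
  ElectoralVotes = c(4, 3, 4)
)
votes_to_win <- 6  # majority of the 11 toy electoral votes
n_sim <- 5000

# Each replicate is one simulated election: one uniform draw per state,
# then the Democratic electoral votes are summed
dem_totals <- replicate(n_sim, {
  dem_state <- runif(nrow(toy), 0, 100) <= toy$dem_prob
  sum(toy$ElectoralVotes[dem_state])
})

# Share of simulated elections in which the Democrats reach the threshold
dem_win_pct <- mean(dem_totals >= votes_to_win) * 100
print(dem_win_pct)
```

In this toy map the Democrats need any two of the three states, which works out to an exact 50% chance, so the printed value should land close to 50.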

Below are figures showing each candidate's estimated probability of winning at margins of error from 3% to 8%.

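library(ggplot2)

# Trump's win percentage is the complement of Harris's
reps_won_p_list <- 100 - dems_won_p_list
margin_of_error <- 1:15
dems_by_moe <- data.frame(margin_of_error, dems_won_p_list, reps_won_p_list)
# Keep margins of error 3 through 8 for plotting
dems_by_moe_3_8 <- dems_by_moe[3:8, ]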
p<-ggplot(dems_by_moe_3_8, aes(x=margin_of_error)) +
  geom_line(aes(y = dems_won_p_list),color = "darkblue", linewidth = 2) + 
  geom_line(aes(y = reps_won_p_list), color = "darkred", linewidth = 2) +
  ggtitle("Estimated % of Harris Victory by Margin of Error") +
  theme_bw() +
  ylim(c(0,100)) +
  xlab("Margin of Error") +
  ylab("Estimated % chance of winning") + 
  annotate("text", x = 7, y = 75, label = "Harris", color = "darkblue") +
  annotate("text", x = 7, y = 80, label = "Trump", color = "darkred")
p

p<-ggplot(dems_by_moe_3_8, aes(x=margin_of_error)) +
  geom_line(aes(y = dems_won_p_list),color = "darkblue", linewidth = 2) + 
  geom_line(aes(y = reps_won_p_list), color = "darkred", linewidth = 2) +
  ggtitle("Estimated % of Harris Victory by Margin of Error - Zoomed In") +
  theme_bw() +
  ylim(c(35,65)) +
  xlab("Margin of Error") +
  ylab("Estimated % chance of winning") + 
  annotate("text", x = 7, y = 60, label = "Harris", color = "darkblue") +
  annotate("text", x = 7, y = 62, label = "Trump", color = "darkred")
p
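
A single-state sketch can help show how the margin of error feeds into these curves. Holding a hypothetical 3% polling margin fixed, the closed-form win probability `pnorm(margin / sd)` drifts toward 50% as the margin of error grows from 1 to 15. The margin value and variable names below are assumptions for illustration only.

```r
margin <- 3                # hypothetical polling margin, in percentage points
margin_of_error <- 1:15    # same range as the outer loop over k

# Probability (percent) that the leading candidate wins this one state
state_win_pct <- pnorm(margin / margin_of_error) * 100

print(data.frame(margin_of_error, state_win_pct = round(state_win_pct, 1)))
```

A larger margin of error makes every state less certain, so safe states start to flip more often in the simulated elections.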

# Use the results for a margin of error of 3 (index 3 of 1:15)
dems_won_p <- dems_won_p_list[3]
reps_won_p <- reps_won_p_list[3]
# Build the data frame directly so the percentages stay numeric
totals <- data.frame(
  Election_win_percentage = c(dems_won_p, reps_won_p),
  party = c("Harris", "Trump")
)
p<-ggplot(totals, aes(x=party, y = Election_win_percentage)) +
  geom_bar(stat="identity") + 
  ggtitle("Rates of Election Victory Based on Simulated Results \n Using Latest Poll Data, with Margin of Error 3%") +
  theme_bw() +
  scale_y_continuous(limits = c(0, 100))
p