This project aims to forecast the outcome of a Canadian federal election if it were held today. Much of the code was inspired by Tomorrow’s Westminster, whose creator kindly made their code available on GitHub.
The first step of this model is estimating the likely vote share of each of the main federal parties in Canada. Polling data is scraped from Wikipedia, and a Bayesian poll aggregation model (taken from Tomorrow’s Westminster) aggregates the polls, adjusting for pollster effects and time. I added my own touch by evaluating the average over- or under-polling of each party in past Canadian elections. That is, if pollsters underestimate the Conservative vote share by around 2 percentage points on average (which is roughly true), the model adds 2 points to the Conservative total once the polls are aggregated. It is a touch more complicated than that: rather than applying a fixed correction, the model randomly draws an adjustment from a probability distribution of past polling errors.
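As a rough illustration of that adjustment step, here is a minimal Python sketch. It is not the project's actual code: the function name, the choice of a normal distribution, and the past error values are all assumptions made for the example.

```python
import random
import statistics


def sample_adjusted_share(aggregated_share, past_errors, rng):
    """Add one random polling-error draw to an aggregated vote share.

    Assumption: past errors (actual result minus polled share, in
    percentage points) are modelled as normally distributed.
    """
    mean_error = statistics.mean(past_errors)
    sd_error = statistics.stdev(past_errors)
    return aggregated_share + rng.gauss(mean_error, sd_error)


# Hypothetical past errors for one party, in percentage points.
past_conservative_errors = [2.5, 1.0, 3.0, 1.5]

rng = random.Random(42)  # fixed seed so the sketch is reproducible
draws = [
    sample_adjusted_share(33.0, past_conservative_errors, rng)
    for _ in range(1000)
]
mean_adjusted = statistics.mean(draws)
```

Averaged over many draws, the adjusted share sits about 2 points above the aggregated 33%, but each individual simulation gets its own adjustment, so the uncertainty in past polling errors carries through to the final projection.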
The following table shows the most likely predicted outcome of national vote shares for each of the major federal parties.
You can also see below the smoothing and aggregation of all the polls. The lines and shading around them represent the aggregated vote share and confidence intervals, respectively. Individual dots represent individual poll results. You can see the error adjustment well in this graph. As the Conservatives often poll below their actual election results, the dots tend to be below the aggregated line. The NDP is the opposite. They tend to poll above their actual results, so the poll dots tend to be above the aggregated line.
However, as we’ve seen from recent federal elections, the vote share isn’t as important as how those votes translate into seats.
To determine that, I developed a modified version of the uniform swing model. Simply put, uniform swing models assume that if a party gains 5 percentage points of the popular vote at the national level compared to the last election, it will gain 5 points in each riding as well. That is obviously quite simplistic. What I did was develop a more complicated linear regression model for each party for every election since 2008. The model uses “riding swing” as the outcome variable and the party’s national swing as a predictor variable. It also adds three binary variables (1=Yes and 0=No) for whether the party won that riding in each of the last 3 elections.
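The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration of the model's form: the class name, the intercept, and every coefficient value below are hypothetical, not fitted values from the project.

```python
from dataclasses import dataclass


def uniform_swing(last_riding_share, national_swing):
    """Plain uniform swing: every riding moves by the national swing."""
    return last_riding_share + national_swing


@dataclass
class SwingModel:
    """Linear model of riding swing for one party (hypothetical values)."""
    intercept: float
    national_swing_coef: float
    won_coefs: tuple  # one coefficient per past election, most recent first

    def predict_riding_swing(self, national_swing, won_last_three):
        # won_last_three holds 1 (won) or 0 (lost) for the last 3 elections.
        history_effect = sum(
            coef * won for coef, won in zip(self.won_coefs, won_last_three)
        )
        return (
            self.intercept
            + self.national_swing_coef * national_swing
            + history_effect
        )


# Made-up coefficients, purely for illustration.
model = SwingModel(intercept=0.2, national_swing_coef=0.9,
                   won_coefs=(0.5, 0.3, 0.1))

# A riding where the party took 40% last time and won 2 of the last 3 races.
riding_swing = model.predict_riding_swing(national_swing=5.0,
                                          won_last_three=(1, 1, 0))
projected_share = 40.0 + riding_swing
```

With these made-up numbers the regression projects a swing of about 5.5 points in this riding, compared with exactly 5 points under plain uniform swing, because the party's past wins there shift the estimate.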
The table below shows the most likely predicted seat outcome if an election were held today. I did not backtest the model, because all of its inputs are based on known data, and backtesting on data the model has already seen risks overestimating its success. We shall see how it does in the next election.
Since the results were simulated 1,000 times, you can see the range of simulated outcomes below. The taller a peak is, the more likely that particular number of seats is.
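To make the link between simulations and peak height concrete, here is a small Python sketch that tallies how often each seat total occurs. The function name and the ten seat totals are invented for the example; the real model uses 1,000 simulations.

```python
from collections import Counter


def seat_distribution(simulated_seat_totals):
    """Return the share of simulations that produced each seat total."""
    counts = Counter(simulated_seat_totals)
    n = len(simulated_seat_totals)
    return {seats: count / n for seats, count in sorted(counts.items())}


# Hypothetical seat totals for one party across ten simulations.
simulated = [150, 152, 150, 148, 150, 152, 149, 150, 151, 150]

distribution = seat_distribution(simulated)
most_likely = max(distribution, key=distribution.get)
```

Here 150 seats appears in half of the simulations, so it would be the tallest peak in the plot.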
The following output shows the most likely outcomes in terms of the final makeup of Parliament.
Below you can see the riding-by-riding projections in two ways. The first is the table below. Below that is an interactive map, which you can pan and zoom to any riding you like. It shows the projected result, past result, and a riding rating. For both, I used the same rating system as Philippe Fournier at 338Canada. A “Safe” riding is one where the stated party wins in >99% of simulations, “Likely” is 90-98.9% of the time, “Leaning” is 70-89.9% of the time, and “Toss-up” is when no party wins at least 70% of the time.
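The rating thresholds above can be written as a short Python function. This is a sketch, not the project's code: the function name and party labels are hypothetical, and it assumes that a win rate of exactly 99% counts as “Likely”, since “Safe” requires more than 99%.

```python
def riding_rating(win_probabilities):
    """Return (leading party, rating) from each party's simulated win share.

    win_probabilities maps a party label to the fraction of simulations
    in which that party won the riding.
    """
    leader, p = max(win_probabilities.items(), key=lambda item: item[1])
    if p > 0.99:
        return leader, "Safe"
    if p >= 0.90:
        return leader, "Likely"
    if p >= 0.70:
        return leader, "Leaning"
    return leader, "Toss-up"


# Hypothetical win shares for a single riding.
example = riding_rating({"CPC": 0.75, "LPC": 0.20, "NDP": 0.05})
```

In this example the leading party wins 75% of simulations, so the riding is rated “Leaning” for that party.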