9 December 2021

2012 US Presidential election: “The Triumph of the Quants”


Outline

  • poll averaging model: theory and methods
  • house effects, house-specific biases
  • extensions
  • criticism

Polls are like WW2-era radar

  • noisy sensor (sampling error)
  • likely a biased sensor (“house effects”)
  • limited resolution (coarse reporting of published polls)
  • snapshots of dynamic target (discrete field period)
  • target’s law of motion is unknown (not ballistic)
  • dependencies among multiple targets (vote shares sum to 100%)

Model for poll averaging: setup & notation, scalar target

  • Let \(t\) index campaign days.
  • Poll \(p\) fielded on day \(t\) by polling company \(j\) yields an estimated voting intention, a proportion \(\color{cyan}{y_p} \in [0,1]\), with sample size \(\color{cyan}{n_p}\).
  • Variance of \(\color{cyan}{y_p}\) is approximately \(\color{cyan}{V_p = y_p (1-y_p)/n_p}\)
  • True, latent voting intentions on day \(t\) are \(\color{orange}{\xi_t} \in [0,1]\). These are observed exactly only on the two election days that bracket the series, giving \(\color{orange}{\xi_1}\) and \(\color{orange}{\xi_T}\).
  • Polling company \(j\) has a time-invariant “house effect” \(\color{orange}{\delta_j}\), such that \(E(\color{cyan}{y_p}) = \color{orange}{\xi_{t(p)}} + \color{orange}{\delta_{j(p)}}\)
  • That is, the expected value of each poll is truth + bias, both unobserved.
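The variance approximation above can be checked with a few lines of arithmetic. This is a minimal sketch; the function name `poll_variance` and the example poll (52% from 1,000 respondents) are hypothetical and chosen only for illustration.

```python
import math

def poll_variance(y_p: float, n_p: int) -> float:
    """Approximate sampling variance V_p = y_p (1 - y_p) / n_p of a poll proportion."""
    if not 0.0 <= y_p <= 1.0:
        raise ValueError("y_p must be a proportion in [0, 1]")
    if n_p <= 0:
        raise ValueError("n_p must be a positive sample size")
    return y_p * (1.0 - y_p) / n_p

# Hypothetical poll: 52% support from 1,000 respondents.
v = poll_variance(0.52, 1000)
se = math.sqrt(v)          # standard error of the poll estimate
half_width = 1.96 * se     # approximate 95% margin of error
print(f"V_p = {v:.7f}, se = {se:.4f}, margin = {half_width:.4f}")
```

Note that this is the simple-random-sampling variance; weighting and design effects in real polls would typically make the true variance larger.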

“House” effects \(\color{orange}{\delta}\), biases specific to a polling company:

Arise from:

  • sampling methodology (e.g., RDD, landline/mobile mix; quotas from web panels)
  • weighting procedures and selection of weighting variables (raking; propensity score matching)
  • survey mode (live interviewer, IVR, web self-complete)
  • question wording and question ordering effects
  • response options (are minor parties or DK offered or volunteered?; are DKs pushed?)
  • field operations (time of day, day of week)
  • reporting conventions (DKs reported or not)
  • compounded in low or uncertain voter turnout environments

State-space model for poll averaging

  • Measurement model: \(\color{cyan}{y_{p}} \sim N(\color{orange}{\xi_{t(p)}} + \color{orange}{\delta_{j(p)}} \, , \, \color{cyan}{V_p})\)

  • Dynamic model: \(\color{orange}{\xi_t} \sim N(\color{orange}{\xi_{t-1}}, \color{orange}{\omega^2})\), with the endpoint constraints from election results observed on \(\color{orange}{\xi_1}\) and \(\color{orange}{\xi_T}\).

  • Given published polls, \(\color{cyan}{\boldsymbol{Y}}\), sample sizes, field dates and identity of polling companies — and the model — we seek

  1. trajectory of latent voting intentions \(\color{orange}{\boldsymbol{\xi}} = (\color{orange}{\xi_1}, \ldots, \color{orange}{\xi_T})'\)
  2. house effects: \(\color{orange}{\boldsymbol{\delta}} = (\color{orange}{\delta_1}, \ldots, \color{orange}{\delta_J})'\)
  3. “pace of change” parameter (innovation variance), \(\color{orange}{\omega^2}\).
  • Augment model with unknown steps/discontinuities \(\color{orange}{\gamma}\) in the \(\color{orange}{\boldsymbol{\xi}}\) trajectory on “event” days (e.g., leadership changes).
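One way to build intuition for the measurement and dynamic models is to simulate from them. The sketch below is illustrative only: the function name, house names, house effects and poll schedule are all hypothetical, the latent track is clamped to [0, 1] as a convenience, and the end-point constraint on \(\xi_T\) and the discontinuity \(\gamma\) are not imposed.

```python
import random

def simulate_campaign(T, xi_1, omega, house_effects, poll_schedule, seed=0):
    """Forward-simulate latent voting intentions and polls.

    poll_schedule: list of (day, house, n) tuples, with day in 1..T.
    Returns (xi, polls): xi[t-1] is the latent value on day t and
    polls is a list of (day, house, n, y) tuples.
    """
    rng = random.Random(seed)
    xi = [xi_1]
    for _ in range(T - 1):
        # Dynamic model: xi_t ~ N(xi_{t-1}, omega^2). The clamp to [0, 1]
        # is an assumption of this sketch, not part of the model.
        xi.append(min(1.0, max(0.0, rng.gauss(xi[-1], omega))))
    polls = []
    for day, house, n in poll_schedule:
        mean = xi[day - 1] + house_effects[house]
        # Measurement model: y_p ~ N(xi + delta, V_p). V_p is evaluated at
        # the mean because y_p is not yet known when simulating.
        sd = (max(mean * (1.0 - mean), 0.0) / n) ** 0.5
        polls.append((day, house, n, rng.gauss(mean, sd)))
    return xi, polls

# Hypothetical inputs for illustration only.
house_effects = {"House A": 0.01, "House B": -0.015}
schedule = [(10, "House A", 1000), (10, "House B", 1500), (45, "House A", 1200)]
xi, polls = simulate_campaign(T=60, xi_1=0.50, omega=0.0025,
                              house_effects=house_effects,
                              poll_schedule=schedule)
for day, house, n, y in polls:
    print(day, house, n, round(y, 3))
```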

Model as a directed acyclic graph


Identification of model parameters

  • as written, the model is over-parameterised
  • \(E(\color{cyan}{y_p}) = \color{orange}{\xi_{t(p)}} + \color{orange}{\delta_{j(p)}}\).
  • Indistinguishable from \(E(\color{cyan}{y_p}) = [\color{orange}{\xi_{t(p)}} + \color{red}{c}] + [\color{orange}{\delta_{j(p)}} - \color{red}{c}], \quad \forall\ \color{red}{c} \neq 0\).
  • Post-election, end-point constraints: anchor \(\xi_T\) to known election result, and/or \(\xi_1\) to past election result as may be appropriate.
  • Sum-to-zero normalisation of house effects \(\color{orange}{\delta}\); i.e., set \(\color{red}{c} = \color{orange}{\bar{\delta}}\), such that \(\xi_t\) are identified up to a translation equal to the average bias of all pollsters.
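The identification problem can be seen numerically: shifting every \(\xi_t\) up by \(c\) and every \(\delta_j\) down by \(c\) leaves the expected poll results unchanged, and the sum-to-zero normalisation picks one representative from that family. A small sketch, with made-up values and hypothetical function names:

```python
def expected_polls(xi, delta, polls):
    """E(y_p) = xi_{t(p)} + delta_{j(p)} for each poll p = (t, j)."""
    return [xi[t] + delta[j] for t, j in polls]

def sum_to_zero(xi, delta):
    """Re-centre house effects to average zero, absorbing the mean into xi."""
    d_bar = sum(delta.values()) / len(delta)
    return [x + d_bar for x in xi], {j: d - d_bar for j, d in delta.items()}

# Hypothetical latent values, house effects and poll index (day, house).
xi = [0.50, 0.51, 0.49]
delta = {"A": 0.02, "B": -0.01, "C": 0.01}
polls = [(0, "A"), (1, "B"), (2, "C"), (2, "A")]

# Shift by an arbitrary constant c: the data cannot tell the two apart.
c = 0.03
xi_shift = [x + c for x in xi]
delta_shift = {j: d - c for j, d in delta.items()}
same = all(abs(a - b) < 1e-12 for a, b in
           zip(expected_polls(xi, delta, polls),
               expected_polls(xi_shift, delta_shift, polls)))
print("expected polls identical:", same)

# Both parameterisations normalise to the same identified quantities.
xi_n, delta_n = sum_to_zero(xi, delta)
xi_n2, delta_n2 = sum_to_zero(xi_shift, delta_shift)
```

After normalisation the recovered \(\xi_t\) equal the truth plus the average house effect, which is why the end-point constraints are needed to pin down the level itself.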

Estimation and inference

  • Gaussian law of motion: Kalman filter.
  • in Bayesian statistics: dynamic linear model (West & Harrison).
  • house effects and partially observed polling data make the model slightly non-standard for off-the-shelf Kalman filtering (many packages in R)
  • EM or MCMC via R and C/C++
  • jags via rjags (Plummer 2019), see Jackman (2009).
  • Stan via RStan (Stan Development Team 2020)
  • nimble (de Valpine et al. 2017)
  • pomp (King et al. 2016)
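To make the Kalman filter step concrete, here is a deliberately simplified forward filter for the random-walk model. It is a sketch under strong assumptions, not the estimation approach of the packages listed above: the house effects `delta` and innovation variance `omega2` are treated as known, only the forward pass is shown (no smoothing), and the end-point constraint on \(\xi_T\) is not imposed. All names and numbers are hypothetical.

```python
def kalman_filter(T, polls, omega2, m0, c0, delta=None):
    """Forward Kalman filter for xi_t ~ N(xi_{t-1}, omega2).

    polls: dict mapping day (1..T) to a list of (y, V, house) tuples.
    m0, c0: mean and variance of xi_1 (c0 = 0 anchors xi_1 exactly).
    delta: dict of known house effects (assumed known in this sketch).
    Returns per-day filtered means and variances.
    """
    delta = delta or {}
    means, variances = [], []
    m, c = m0, c0
    for t in range(1, T + 1):
        if t > 1:
            c += omega2  # predict: the random walk adds innovation variance
        # Update once per poll; days without polls are prediction only.
        for y, v, house in polls.get(t, []):
            resid = (y - delta.get(house, 0.0)) - m  # bias-corrected surprise
            k = c / (c + v)                          # Kalman gain
            m += k * resid
            c *= (1.0 - k)
        means.append(m)
        variances.append(c)
    return means, variances

# Hypothetical example: xi_1 fixed at 0.50, one poll on day 3.
omega2 = 0.0025 ** 2
polls_by_day = {3: [(0.53, 0.53 * 0.47 / 1000, "House A")]}
means, variances = kalman_filter(T=5, polls=polls_by_day, omega2=omega2,
                                 m0=0.50, c0=0.0, delta={"House A": 0.01})
for t, (m, c) in enumerate(zip(means, variances), start=1):
    print(t, round(m, 5), f"{c:.2e}")
```

Because most days have no poll, uncertainty grows linearly between polls and shrinks at each update, which is the "partially observed" feature noted above.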

Example, Australia 2019 Federal election

  • 226 polls, fielded between the 2016 and 2019 elections
  • 1,051 days, inclusive.
  • 6 distinct polling companies
  • augment model with a discontinuity when Morrison replaces Turnbull as PM.
  • estimate separately for different scalar targets.
Polling company    n
Essential        108
Ipsos             18
Morgan F2F        12
Newspoll          61
ReachTEL          15
YouGov            12

Australian Labor Party, first preferences

Greens, first preferences

LNP two-party preferred

Trajectories of latent voting intentions recovered with credible intervals

Rate of change parameter, \(\omega\), posterior densities

House effects, first preferences

Two-party preferred

Did the polls “herd”?

What is herding and how can we detect it?

  • for election polling, the truth will out; commercial implications
  • survey houses make choices about weighting etc., typically after data collection and before publishing results.
  • better to be wrong with others than wrong on your own, but apparently not for Ipsos.

from Nobel laureates…

…to 538…

Theory: herding manifests as underdispersion

  • suppose true level of voting intentions is \(\color{orange}{\pi} \in (0,1)\)
  • special case of the Central Limit Theorem: under unbiased, simple random sampling (SRS), survey-based estimates of \(\color{orange}{\pi}\) will be distributed normally around \(\color{orange}{\pi}\) with variance \(V(\color{orange}{\pi}) = \color{orange}{\pi}(1-\color{orange}{\pi})/\color{cyan}{n}\).
  • put the question of bias to one side (dealt with previously with house effects estimates)
  • we compare observed dispersion of the polls with theoretically expected dispersion, given (1) stated sample sizes \(\color{cyan}{n}\); (2) assumption about \(\color{orange}{\pi}\)

Herding manifests as underdispersion

A simulation-based test for herding

  • assume true voting intentions are \(\color{orange}{\pi}\) (e.g., the level observed on Election Day)
  • assume no change in voting intentions for \(d\) days prior to the election
  • \(\color{cyan}{\mathcal{D}}\) are polls fielded within \(d\) days of the election, with standard deviation \(\color{cyan}{s_\mathcal{D}}\). Repeat the following over a range of values of \(d\).
  • simulate poll results for each \(\color{cyan}{p} \in \color{cyan}{\mathcal{D}}\), \(\color{red}{y_p} \sim N(\color{orange}{\pi}, \color{cyan}{V_p})\), \(\color{cyan}{V_p} = \color{orange}{\pi} (1 - \color{orange}{\pi})/\color{cyan}{n_p}\). Round \(\color{red}{y_p}\) to the same degree of precision as in reported polls. Let \(\color{red}{s^*}\) be the standard deviation of the \(\color{red}{y_p}\).
  • Over many simulations, how often do we observe \(\color{red}{s^*} > \color{cyan}{s_\mathcal{D}}\)? If this happens in nearly every simulation, the actual polls \(\color{cyan}{\mathcal{D}}\) are underdispersed relative to what we should see under SRS.
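The steps above can be sketched in a few lines. The function name, the poll values, the sample sizes and the assumed \(\pi\) below are all hypothetical; rounding to two decimal places of a proportion stands in for polls reported to the nearest whole percentage point.

```python
import random
import statistics

def herding_share(sample_sizes, pi, s_obs, n_sims=5000, digits=2, seed=1):
    """Share of simulations in which the simulated SD s* exceeds s_obs.

    Each simulated poll is drawn from N(pi, pi (1 - pi) / n_p) and rounded
    to `digits` decimal places to mimic coarse reporting.
    """
    if len(sample_sizes) < 2:
        raise ValueError("need at least two polls to compute a standard deviation")
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_sims):
        sims = [round(rng.gauss(pi, (pi * (1.0 - pi) / n) ** 0.5), digits)
                for n in sample_sizes]
        if statistics.stdev(sims) > s_obs:
            exceed += 1
    return exceed / n_sims

# Hypothetical final-fortnight polls (proportions) and their sample sizes.
observed = [0.51, 0.51, 0.52, 0.51, 0.51, 0.52, 0.51, 0.51]
sizes = [1000, 1200, 1500, 1000, 2000, 1100, 1600, 1000]
s_obs = statistics.stdev(observed)

share = herding_share(sizes, pi=0.515, s_obs=s_obs)
print(f"s_D = {s_obs:.4f}; share of simulations with s* > s_D: {share:.3f}")
```

A share close to 1 says the observed polls are more tightly clustered than sampling error alone would allow. In practice the calculation would be repeated over a range of windows \(d\), as described above.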

Strong evidence of underdispersion in Coalition & 2PP polling

Extensions

  • latent \(\xi_t\) is a \(K\)-vector, subject to the restriction that \(\sum_{k=1}^K \xi_{tk} = 1 \ \forall\ t\).

  • 2 approaches:

    • Dirichlet model for transitions and multinomial for polls.
    • \((K-1)\)-dimensional Gaussian model for log-odds of \(\xi_t\) and poll estimates (with \(K-1\) by \(K-1\) covariance matrices)
  • examples: multi-candidate elections (e.g., Iowa caucuses), with drop-out and drop-in.

  • multiple jurisdictions: e.g., each state in US presidential elections, \(M \approx 50\) filters running in parallel, with high dependencies in trajectories (voters in different states consuming same information, the “nationalisation” of politics and campaigns)
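For the Gaussian log-odds approach, one common way to move between a \(K\)-vector of shares and \(K-1\) unconstrained quantities is the additive log-ratio transform. The sketch below assumes that choice, with the last category as the reference; the slides do not specify which log-odds parameterisation is used, and the shares are made up.

```python
import math

def to_log_ratios(shares):
    """Map K positive shares summing to 1 to K-1 log-ratios vs the last share."""
    base = shares[-1]
    return [math.log(s / base) for s in shares[:-1]]

def to_shares(log_ratios):
    """Inverse map: K-1 log-ratios back to K shares that sum to 1."""
    expd = [math.exp(z) for z in log_ratios] + [1.0]  # reference category
    total = sum(expd)
    return [e / total for e in expd]

# Hypothetical four-party vote shares.
shares = [0.40, 0.35, 0.10, 0.15]
z = to_log_ratios(shares)   # unconstrained, suitable for a Gaussian model
back = to_shares(z)         # always lands on the simplex
print([round(v, 3) for v in z], [round(s, 3) for s in back])
```

Working on the transformed scale means any Gaussian random walk on `z` maps back to shares that automatically respect the sum-to-one restriction.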

Conclusion

  • not just a “poll average”, but a forecast
  • with great public interest, comes great power, and …
  • participant, not observer
  • communication of results to lay public of vital importance