Evaluating the performance of IPF

new tests for an old technique

Robin Lovelace
RSAI-BIS August 2013, Cambridge
Wednesday 21st 11 - 13, early careers session

Introduction

  • Iterative proportional fitting is an established statistical technique (Deming, 1940)
  • It estimates the values of internal cells, based on marginal totals:
    Marginal totals table
  • Used in spatial microsimulation for allocating individuals to zones (Lovelace et al., 2013)

How IPF works 1 - visually

Visually, this can be seen as follows:

Selection of variables used to sample individuals

People most representative of the target area selected

(Optional) process of integerisation converts weights into whole individuals

How IPF works 2 - in maths

\[ w(n+1) = \frac{w(n) \times sT_{i}}{mT(n)_{i}} \]

  • where, \( w(n+1) \) is the individual's new weight,
  • \( w(n)_{ij} \) is the original weight,
  • \( sT_{i} \) is the marginal total of the small area constraint
  • \( mT(n)_{i} \) is the aggregate results of the weighted microdataset

Apply this algorithm, one constraint at a time, to every area in the case study

How IPF works 3 - in code

Main algorithm:

for (j in 1:nrow(all.msim)){
  for(i in 1:ncol(con1)){
 weights[which(ind.cat[,i] == 1),j,1] <- con1[j,i] /ind.agg[j,i,1]}}

Or in English:

  • for each zone, set the new weight of individuals in each category equal to their true number divided by their current number in the simulation
  • May need worked example to 'get it' (took me 3 months!)

The need for testing

It is well-known that IPF works:

  • converges towards a single result
  • robust and computationally efficient
  • has been used in many spatial microsimulation studies

Much less is know about the factors influencing its performance.
Are there ways IPF should or should-not be set-up?

Baseline scenarios

Three baseline scenarios were used:

  • a simplest possible case, with 5 areas, 10 individuals and 2 constraints
  • 'small area' constraints: 24 'OA' zones, ~1,000 individuals and 3 constraints
  • 'Sheffield', containing the 71 'MSOA', ~5,000 individuals and 4 constraints

Most tests were done on the 'small area' scenario

Baseline result

  • Correlation rapidly approaches 1
  • Beyond 5 iterations, result is indistinguishable from 1
  • perfect convergence (no empty cells)

  • But are we using the right metric of model fit?

Baseline result - visual

Convergence between constraints

Correlation over time - baseline

Evaluating model fit

Commonly used options include:

  • Pearson's coefficient of correlation ®
  • Total and Standardised Absolute Error (TAE and SAE)
  • Root mean squared (RMS)
  • Z-scores
  • Standard Error Around Identity (SEI)
  • other metrics do exist!

Model experiments

The impact of the following changes was tested:

  • number of iterations
  • number/order of constraints
  • initial weights
  • ratio of survey size:zone areas
  • empty cells
  • integerisation

Results - Iterations and constraints

  • After 4 iterations all models had near-perfect fit
  • The order of constraints had some impact, but not a lot
  • Fewer constraints > faster convergence (dur!)

Results - Initial weights I

Doubling initial weight has some impact after 1 iteration, tends rapidly to 0

Results - Initial weights II

Influence of weights

Effects most pronounced within each iteration

Knock-on effects on other individuals

Summary of findings

  • The number of lines of code to perform IPF has been reduced
  • IPF converges rapidly to a single result, if set-up correctly
  • Supports previous work suggesting convergence after 10 iterations (Ballas et al. 2005)
  • Five is probably sufficient for 4 or fewer constraints
  • Initial weights seem to have very little impact on the results - will have no impact on the model
  • Integerisation has a slight negative impact on fit

Conclusions and further work

  • IPF is a useful procedure for various applications, but its utility can be extended and enhanced in various ways (Pritchard and Miller, 2012)
  • Before 'trying to run', however, researchers should master walking
  • Therefore these basic tests on the performance of IPF should be useful in informing future work
  • Reproducible code and example data should be useful to others ('fork me' on Github !)
  • More model experiments: missing cells and interactions between variables
  • Is it possible for IPF to be made even faster?
  • Methods for grouping individuals (e.g. into families)

Key references (see links to others)

Deming, W. (1940). On a least squares adjustment of a sampled frequency table when the expected marginal totals are known. The Annals of Mathematical Statistics

Lovelace, R., & Ballas, D. (2013). “Truncate, replicate, sample”: A method for creating integer weights for spatial microsimulation. CEUS, 41, 1–11

IPF-performance-testing” github repository - please 'clone' this and contribute! + this presentation at www.rpubs.com/RobinLovelace

Pritchard, D. R., & Miller, E. J. (2012). Advances in population synthesis: fitting many attributes per agent and fitting to household and person margins simultaneously. Transportation, 39(3)

Thanks for listening r.lovelace at leeds.ac.uk