Robin Lovelace
RSAI-BIS August 2013, Cambridge
Wednesday 21st 11 - 13, early careers session
Visually, this can be seen as follows:
Selection of variables used to sample individuals
People most representative of the target area selected
(Optional) process of integerisation converts weights into whole individuals
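The integerisation step can be sketched in R as follows. This is a rough illustration of the "truncate, replicate, sample" idea from Lovelace & Ballas (2013), not the published implementation. The weights and all object names (`w`, `w_int`, `topup`, `ids`) are hypothetical.

```r
set.seed(1)                      # make the sampling step reproducible

# Hypothetical fractional weights for four individuals in one area
w <- c(0.3, 1.7, 2.5, 0.5)       # sums to 5

# Truncate: keep the whole-number part of each weight
w_int <- floor(w)

# Number of individuals still missing from the area population
deficit <- round(sum(w) - sum(w_int))

# Sample: top up the deficit, favouring large decimal remainders
topup <- sample(length(w), size = deficit, prob = w - w_int)
w_int[topup] <- w_int[topup] + 1

# Replicate: repeat each individual's row index by its integer weight
ids <- rep(seq_along(w), times = w_int)
```

`ids` then indexes the rows of the survey data that make up the synthetic population for the area.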
\[ w(n+1) = \frac{w(n) \times sT_{i}}{mT(n)_{i}} \]
Apply this algorithm, one constraint at a time, to every area in the case study
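A minimal worked example of the update rule for a single area, assuming a made-up survey of five individuals and two constraints (age and sex). The data and the names `ind`, `cons`, `w` and `mT` are hypothetical and are not taken from the talk's code.

```r
# Hypothetical survey microdata: five individuals
ind <- data.frame(
  age = c("young", "young", "old", "old", "old"),
  sex = c("m", "f", "m", "f", "f"),
  stringsAsFactors = FALSE
)

# Hypothetical census totals (sT) for one area, one vector per constraint
cons <- list(
  age = c(young = 8, old = 4),
  sex = c(m = 6, f = 6)
)

w <- rep(1, nrow(ind))           # initial weights, w(0) = 1

for (iter in 1:20) {             # repeat the whole cycle of constraints
  for (v in names(cons)) {       # one constraint at a time
    # mT(n)_i: weighted microdata total for each category of constraint v
    mT <- tapply(w, ind[[v]], sum)
    # w(n+1) = w(n) * sT_i / mT(n)_i for the category i of each individual
    w <- as.numeric(w * cons[[v]][ind[[v]]] / mT[ind[[v]]])
  }
}
```

After the loop, `sum(w)` equals the area population (12 here) and the weighted age and sex totals match the hypothetical census totals to within rounding error.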
Main algorithm:
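for (j in 1:nrow(all.msim)) {   # each area
  for (i in 1:ncol(con1)) {     # each category of the first constraint
    # new weight = census total / aggregated microdata total for category i
    weights[which(ind.cat[, i] == 1), j, 1] <- con1[j, i] / ind.agg[j, i, 1]
  }
}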
Or in English:
It is well known that IPF works:
Much less is known about the factors influencing its performance.
Are there ways IPF should or should not be set up?
Three baseline scenarios were used:
Most tests were done on the 'small area' scenario
perfect convergence (no empty cells)
But are we using the right metric of model fit?
Commonly used options include:
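As an illustration, three measures of fit between census and simulated counts can be computed in R as below. The choice of measures (total absolute error, standardised absolute error, Pearson's r) and the numbers in `obs` and `sim` are assumptions for this sketch, not necessarily the options or data used in the talk. Definitions of standardised absolute error also vary; here it is taken as TAE divided by the sum of the census counts.

```r
# Hypothetical census counts (rows = areas, columns = constraint categories)
obs <- matrix(c(8, 4, 6, 6,
                5, 7, 6, 6), nrow = 2, byrow = TRUE)

# Hypothetical simulated counts, aggregated from weighted microdata
sim <- matrix(c(7.5, 4.5, 6.0, 6.0,
                5.0, 7.0, 6.5, 5.5), nrow = 2, byrow = TRUE)

tae <- sum(abs(obs - sim))                   # total absolute error
sae <- tae / sum(obs)                        # standardised absolute error
r   <- cor(as.vector(obs), as.vector(sim))   # Pearson's correlation
```

A perfect fit gives `tae = 0`, `sae = 0` and `r = 1`.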
The impact of the following changes was tested:
Doubling an initial weight has some impact after 1 iteration, but the impact tends rapidly to 0
Effects most pronounced within each iteration
Knock-on effects on other individuals
Deming, W. (1940). On a least squares adjustment of a sampled frequency table when the expected marginal totals are known. The Annals of Mathematical Statistics
Lovelace, R., & Ballas, D. (2013). “Truncate, replicate, sample”: A method for creating integer weights for spatial microsimulation. CEUS, 41, 1–11
“IPF-performance-testing” GitHub repository: please 'clone' this and contribute! This presentation is available at www.rpubs.com/RobinLovelace
Pritchard, D. R., & Miller, E. J. (2012). Advances in population synthesis: fitting many attributes per agent and fitting to household and person margins simultaneously. Transportation, 39(3)
Thanks for listening r.lovelace at leeds.ac.uk