Counter-weight integerisation

Ballas et al. (2005) presented, in verbal form, the following algorithm to integerise the non-integer results of iterative proportional fitting (IPF). We'll implement this algorithm, which we label “counter-weight integerisation”, in R, based on the results of a simple worked example.

The first stage is therefore to run the simple example “microsimulation-eg-final.Rmd” (find the code here). Once we have the resulting weights, we can proceed to integerise them.

weights4 <- read.table(text = "1.5162332883 0.53624505 0.6152425 0.49064783 0.6425895\n2.1842240150 1.37207617 1.0589942 1.84845317 0.5436713\n1.0000000000 3.00000000 2.0000000 1.00000000 0.0010000\n0.0005241171 0.35678987 0.3217955 1.84996569 0.8211398\n1.0000000000 1.00000000 2.0000000 1.00000000 0.0010000\n1.5162332883 0.53624505 0.6152425 0.49064783 0.6425895\n1.9832112662 2.47353799 2.2318972 0.14991164 0.1184222\n0.0010000000 0.00100000 1.0000000 2.00000000 7.0000000\n0.0004758829 0.64321013 0.6782045 0.15003431 0.1788602\n0.8000981423 0.08189573 0.4786236 0.02033953 0.0527275")
weights4
##           V1     V2     V3      V4      V5
## 1  1.5162333 0.5362 0.6152 0.49065 0.64259
## 2  2.1842240 1.3721 1.0590 1.84845 0.54367
## 3  1.0000000 3.0000 2.0000 1.00000 0.00100
## 4  0.0005241 0.3568 0.3218 1.84997 0.82114
## 5  1.0000000 1.0000 2.0000 1.00000 0.00100
## 6  1.5162333 0.5362 0.6152 0.49065 0.64259
## 7  1.9832113 2.4735 2.2319 0.14991 0.11842
## 8  0.0010000 0.0010 1.0000 2.00000 7.00000
## 9  0.0004759 0.6432 0.6782 0.15003 0.17886
## 10 0.8000981 0.0819 0.4786 0.02034 0.05273

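The challenge is to convert this matrix of real numbers into integers (0s, 1s, 2s, etc.) while keeping each column total, i.e. the population of each area, as close as possible to the original.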
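
To see why this is not trivial, consider simply rounding each weight independently. The sketch below uses the area 1 weights copied from the first column of weights4; the names w1, target and naive are introduced here for illustration only.

```r
# Area 1 weights, copied from column V1 of weights4
w1 <- c(1.5162332883, 2.1842240150, 1.0000000000, 0.0005241171, 1.0000000000,
        1.5162332883, 1.9832112662, 0.0010000000, 0.0004758829, 0.8000981423)

target <- round(sum(w1))  # population the integer weights should sum to
naive <- round(w1)        # round each weight on its own

target      # 10
sum(naive)  # 11: independent rounding overshoots the target by one
```

Because rounding errors do not cancel out in general, some rule is needed to decide which households receive the extra units.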

Setting up the reweighting model

“Define two variables named counter and weight, set them to zero and then do the following.

Sort all households into ascending order of probability of living in the small area (which were calculated using the method described above) being populated.” Here the IPF weights play the role of those probabilities. Focussing solely on area 1 (column V1):

sweights <- sort(weights4[, 1])
iweights <- floor(sweights)  # integer weights
dweights <- sweights - iweights  # decimal weights
dweights
##  [1] 0.0004759 0.0005241 0.0010000 0.8000981 0.0000000 0.0000000 0.5162333
##  [8] 0.5162333 0.9832113 0.1842240
iweights
##  [1] 0 0 0 0 1 1 1 1 1 2
sum(iweights)
## [1] 7
round(sum(sweights))  # Note the difference: floor() only rounds down, so the integer weights sum to less than the target
## [1] 10
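for (i in 1:nrow(weights4)) {
    # Only keep adding while the integer weights fall short of the target
    if (sum(iweights) < round(sum(sweights))) {
        # Add the rounded sum of this decimal weight and the next one.
        # Caution: dweights[i + 1] is NA if i reaches the last row.
        iweights[i] <- iweights[i] + round(dweights[i] + dweights[i + 1])
        e <- i  # last row at which an update was attempted
    }
}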
sum(iweights)
## [1] 10
e  # end counter (what if sum(iweights) never reaches round(sum(sweights))?)
## [1] 6
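
The question raised in the comment above can be explored with a small wrapper. This is only a sketch: counter_weight() is a hypothetical helper, not part of the original example. It assumes that a missing "next" decimal weight can be treated as zero, and it returns the weights in sorted order together with a flag saying whether the target was hit exactly.

```r
# Hypothetical helper: the area 1 loop wrapped in a function, with a guard
# for the last row (assumption: a missing next decimal weight counts as 0).
counter_weight <- function(w) {
    sw <- sort(w)             # weights in ascending order
    iw <- floor(sw)           # integer parts
    dw <- c(sw - iw, 0)       # decimal parts, padded so dw[i + 1] always exists
    target <- round(sum(sw))  # population to reach
    e <- 0                    # last row at which an update was attempted
    for (i in seq_along(sw)) {
        if (sum(iw) >= target) break
        iw[i] <- iw[i] + round(dw[i] + dw[i + 1])
        e <- i
    }
    list(weights = iw, end = e, reached = sum(iw) == target)
}

# Area 1 weights, copied from column V1 of weights4
w1 <- c(1.5162332883, 2.1842240150, 1.0000000000, 0.0005241171, 1.0000000000,
        1.5162332883, 1.9832112662, 0.0010000000, 0.0004758829, 0.8000981423)
res <- counter_weight(w1)
res$end           # 6, as in the loop above
sum(res$weights)  # 10

# A made-up case in which the target is never reached
counter_weight(c(0.2, 0.2, 0.2))$reached  # FALSE
```

Checking the reached flag makes a shortfall (or an overshoot, since a single step can add 2) visible instead of passing silently.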

We have successfully created integer weights for area 1 using a counter-weight algorithm. Now iterate over all areas:

e <- 1:ncol(weights4)
cweights <- weights4
for (j in 1:ncol(weights4)) {
    sweights <- sort(weights4[, j])  # weights in ascending order
    ord <- rank(weights4[, j], ties.method = "first")  # position of each row in the sorted vector
    iweights <- floor(sweights)  # integer weights
    dweights <- sweights - iweights  # decimal weights
    for (i in 1:nrow(weights4)) {
        if (sum(iweights) < round(sum(sweights))) {
            iweights[i] <- iweights[i] + round(dweights[i] + dweights[i + 1])
            e[j] <- i
        }
    }
    cweights[, j] <- iweights  # still in sorted order at this point
    cweights[, j] <- cweights[ord, j]  # restore the original row order
}
cweights
##    V1 V2 V3 V4 V5
## 1   1  1  1  1  1
## 2   2  1  1  1  1
## 3   1  3  2  1  0
## 4   0  1  1  1  0
## 5   2  1  2  2  0
## 6   1  1  0  0  0
## 7   1  2  2  0  0
## 8   1  0  1  2  7
## 9   0  0  0  1  1
## 10  1  0  1  0  0
weights4
##           V1     V2     V3      V4      V5
## 1  1.5162333 0.5362 0.6152 0.49065 0.64259
## 2  2.1842240 1.3721 1.0590 1.84845 0.54367
## 3  1.0000000 3.0000 2.0000 1.00000 0.00100
## 4  0.0005241 0.3568 0.3218 1.84997 0.82114
## 5  1.0000000 1.0000 2.0000 1.00000 0.00100
## 6  1.5162333 0.5362 0.6152 0.49065 0.64259
## 7  1.9832113 2.4735 2.2319 0.14991 0.11842
## 8  0.0010000 0.0010 1.0000 2.00000 7.00000
## 9  0.0004759 0.6432 0.6782 0.15003 0.17886
## 10 0.8000981 0.0819 0.4786 0.02034 0.05273
colSums(cweights) - colSums(weights4)  # sums of counter-weight weights essentially same as IPF weights
##     V1     V2     V3     V4     V5 
## -0.002 -0.001  0.000  0.000 -0.002
plot(as.vector(as.matrix(weights4)), as.vector(as.matrix(cweights)))

Figure: integerised counter-weights (y-axis) plotted against the original IPF weights (x-axis).

e  # the value that i reached in order to reach the desired population
## [1] 6 5 3 7 7
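
The reordering step inside the loop relies on the fact that indexing a sorted vector by the ranks of the original vector restores the original order. A small illustration with a made-up vector x (not taken from the example data):

```r
x <- c(1.5, 0.2, 1.5, 0.8)             # made-up weights, including a tie
sx <- sort(x)                          # 0.2 0.8 1.5 1.5
ord <- rank(x, ties.method = "first")  # 3 1 4 2: position of each element of x in sx
identical(sx[ord], x)                  # TRUE: the original order is recovered
```

Using ties.method = "first" gives tied values distinct ranks, so every row maps to exactly one position in the sorted vector.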

References

Ballas, D., Clarke, G., Dorling, D., Eyre, H., Thomas, B., & Rossiter, D. (2005). SimBritain: a spatial microsimulation approach to population dynamics. Population, Space and Place, 11(1), 13–34. doi:10.1002/psp.351