Insights from the buildings of Mutendere

The buildings of Mutendere

Mutendere is a suburb of Lusaka, the capital of Zambia. Zambia's population is growing rapidly, and many suburbs of the capital are becoming denser as a result. It is difficult to estimate the population of these smaller areas, but looking at the buildings may give a sense of it. This map visualizes the buildings in Mutendere. When official administrative boundaries are not available, one must be created to extract OSM data and produce local estimates. This can be done in a number of ways in QGIS or JOSM. The data is extracted from OpenStreetMap using the custom boundary and then converted to GeoJSON.

Estimating population with areal interpolation

Accurate information on population density is tied to the space people occupy, which is challenging to capture given how human activity changes over space and time. Choropleth maps reveal only part of the story because they apply a homogeneous value over each area. Dasymetric mapping can be used to gain insights at greater granularity. The general idea is to use ancillary information to define new areas that are more representative of the changes or particular attributes of smaller spaces.

In this case, an areal interpolation can be attempted. The idea is to take the information we have and disaggregate it to new areas. Lusaka is a city of 1,747,152 people as of the 2010 Census and covers 360 km², for a density of 4,853.2 people per km². This is the starting point. The other information we have is the area of each building in Mutendere.

Borrowing from Lwin and Murayama (2010), we can use the formula:

\({BP_i} = (\frac{CP}{\sum_{k=1}^{n}{BA_k}}){BA_i}\)

\({BP_i}\) = Building \(_i\) population

\({BA_i}\) = Building \(_i\) area

\({CP}\) = Population of the census tract

\(n\) = Number of buildings in the census tract

We will use this to disaggregate the population estimate to the buildings. First we must build the one piece of information we need from the buildings in Mutendere: their size.
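The formula above can be sketched in a few lines of R. This is a minimal illustration, not the authors' implementation: the function name `areametric` and the toy tract (120 people, three footprints) are hypothetical and chosen only to make the arithmetic easy to follow.

```r
# Areametric allocation: BP_i = (CP / sum(BA)) * BA_i
# CP: population of the tract, BA: vector of building footprint areas
areametric <- function(CP, BA) {
  stopifnot(is.numeric(BA), all(BA > 0))
  CP / sum(BA) * BA
}

# HYPOTHETICAL tract: 120 people, three building footprints in m2
toy_BA <- c(50, 100, 150)
toy_BP <- areametric(CP = 120, BA = toy_BA)
toy_BP

## [1] 20 40 60
```

Each building receives a share of the tract population proportional to its footprint, so the building populations always add up to \(CP\).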

Population of Lusaka

Lusaka had a population of 1,747,152 in 2010 and 1,084,703 in 2000 (Zambia Open data Africa). Measured against the 2000 population, this is a growth of 61% over 10 years, or about 6.1% per year as a simple average.

\({PG} = (\frac{P_{t2}-P_{t1}}{P_{t1}}) \times 100\)

LuPop2010 <- 1747152
LuPop2000 <- 1084703

# Growth is measured relative to the starting (2000) population
LuPopGrowth <- (LuPop2010 - LuPop2000) / LuPop2000 * 100
LuAnnualPopGrowth <- LuPopGrowth / 10

LuPopGrowth

## [1] 61.07192

LuAnnualPopGrowth

## [1] 6.107192

As a comparison, Zambia had a population of 13,718,722 in 2010 and a projected population of 17,885,422 for 2020. That is a population growth of 30%, or about 3.0% every year.

ZaPop2010 <- 13718722
ZaPop2020 <- 17885422

ZaPopGrowth <- (ZaPop2020 - ZaPop2010) / ZaPop2010 * 100
ZaAnnualPopGrowth <- ZaPopGrowth / 10

ZaPopGrowth

## [1] 30.37236

ZaAnnualPopGrowth

## [1] 3.037236

Using the data and projections from the Central Statistical Office of Zambia, we can estimate the population in 2017. We could use the exponential growth law, which uses Euler's number \(e\), to project the population for the current year.

\({N} = {N_0}e^{rt}\)

Here \(r\) is the continuous annual growth rate, \(r = \ln(P_{t2}/P_{t1})/(t2 - t1)\). It is lower than the simple average above because it compounds.

options(scipen=999)
# Continuous annual growth rate between the 2000 and 2010 counts
Lu <- log(LuPop2010 / LuPop2000) / 10
Lu

## [1] 0.04766808

LuPop2010 * exp(Lu * 7)

## [1] 2439184

This is what we would get under purely exponential growth, which does not take fertility or death rates into account. With a fertility rate that is not falling as fast as in other countries (from 7.2 to 6.2 over the last 30 years), Zambia is projected to grow exceptionally fast compared with other African countries. The Central Statistical Office of Zambia has published projections for 2011 to 2035. We will use these to establish the projected population of Lusaka for 2017.

luPop2017 <- 2426898
luPop2017 / 360

## [1] 6741.383

So the projected population of Lusaka in 2017 is 2,426,898, which gives a population density of 6,741.4 inhabitants per km².
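As a quick cross-check, the annual growth rate implied by the projection can be backed out of the same exponential growth law. This is a minimal sketch using only the two figures quoted above (the 2010 count and the 2017 projection); the variable names are ours.

```r
# Figures quoted in the text
pop_2010 <- 1747152   # Lusaka, 2010
pop_2017 <- 2426898   # Lusaka, projected for 2017
years    <- 2017 - 2010

# Continuous annual rate r such that pop_2017 = pop_2010 * exp(r * years)
r_implied <- log(pop_2017 / pop_2010) / years

# Express as a percentage per year
round(r_implied * 100, 1)
```

Under these assumptions the implied rate comes out at a little under 5% per year.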

Prepare the data for areametric estimates

We must first process the spatial data. The first thing to do is to verify which projection we have. Projection determines the units for the data.

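# Load packages: rgdal (readOGR, spTransform), rgeos (gArea), tidyverse tools, plotly
library(rgdal); library(rgeos); library(dplyr); library(forcats); library(ggplot2); library(plotly)

# Import data
data <- readOGR("MutBuildW.geojson", 
                    "OGRGeoJSON", require_geomType="wkbPolygon")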

## OGR data source with driver: GeoJSON 
## Source: "MutBuildW.geojson", layer: "OGRGeoJSON"
## with 7014 features;
## Selected wkbPolygon feature type, with 7014 rows
## It has 10 fields

# list the slots 
slotNames(data)

## [1] "data"        "polygons"    "plotOrder"   "bbox"        "proj4string"

# check the projection info
data@proj4string

## CRS arguments:
##  +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0

The data is in geographic longitude/latitude coordinates (WGS84), which means the units are degrees. We want to measure in meters, so we need to change the projection. We are going to use Web Mercator (EPSG:3857). We can then calculate the area in square meters of each building for later use.

# Transform to web mercator from geodesic longlat
data2 <- spTransform(data, CRS("+init=epsg:3857"))

# Calculate area for each polygon feature (buildings) and store the result in a new variable "area_m2"
data2@data$area_m2 <- gArea(data2, byid = TRUE)

data2@data$area_m2 <- round(data2@data$area_m2, digits = 2)

#Results
summary(data2@data$area_m2)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    4.98   68.91  100.80  110.20  136.70 1483.00

So, we now have the area of every building. In Mutendere, the average building size is 110 m², the smallest building is 5 m² and the largest is 1,483 m².
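One caveat worth checking is that Web Mercator is not an equal-area projection: it stretches lengths by \(1/\cos(\varphi)\) at latitude \(\varphi\), and therefore areas by \(1/\cos^2(\varphi)\). The sketch below estimates that factor in base R. It uses a spherical approximation, and the latitude of -15.4 degrees for Mutendere is an assumed approximate value, not one read from the data.

```r
# Area scale factor of Web Mercator at a given latitude (spherical approximation)
mercator_area_factor <- function(lat_deg) {
  1 / cos(lat_deg * pi / 180)^2
}

lat_mutendere <- -15.4                    # ASSUMED approximate latitude
k <- mercator_area_factor(lat_mutendere)  # factor by which areas are inflated
round(k, 3)

# Deflate a Web Mercator area (here the mean of 110.2 m2) to an approximate ground area
corrected_mean_m2 <- 110.2 / k
```

At this latitude the factor is roughly 1.07 to 1.08, so the areas above are likely overstated by 7 to 8%. A projected CRS suited to the area, such as UTM zone 35S (EPSG:32735), would avoid most of this distortion.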

# Split the buildings into five equal-count size classes (quintiles of area).
# No group_by(area_m2) here: grouping first would rank each area value on its own.
data3 <- data2@data %>% 
    mutate(group = ntile(area_m2, 5)) %>% 
    mutate(group = factor(group)) %>% 
    mutate(group = fct_recode(group, "Very small" = "1",
                              "Small" = "2",
                              "Medium" = "3",
                              "Large" = "4",
                              "Very large" = "5"))
# Count the buildings in each class
data3 <- data3 %>% 
  group_by(group) %>% 
  tally
  
gg <- ggplot(data3, aes(x=n, y=reorder(group, n), text = n))
gg <- gg + geom_segment(aes(xend = 0, yend=group), color="#ececec")
gg <- gg + geom_point(color = "#3282bd", size = 2)
gg <- gg + scale_x_continuous(expand = c(0.1,0))
gg <- gg + labs(x = NULL, y = NULL)
# theme_bw() goes first so it does not reset the theme() tweaks that follow
gg <- gg + theme_bw(base_family = "Helvetica")
gg <- gg + theme(strip.background=element_blank())
gg <- gg + theme(panel.border = element_blank())
gg <- gg + theme(panel.grid.major = element_blank())
gg <- gg + theme(panel.grid.minor = element_blank())
gg <- gg + theme(axis.ticks.y=element_blank())
gg <- gg + theme(axis.text.x=element_text(size=9))
gg <- gg + theme(axis.text.y=element_text(size=12))
ggplotly(gg, tooltip = "text")

Estimating the population in Mutendere using its buildings

We can now combine the building areas in Mutendere with the known population of the city of Lusaka and its area to start the disaggregation. Lusaka has an area of 360 km². With the Web Mercator projection, units are in meters, so we also need to convert the city's area to square meters.

Then, we calculate the population per square meter for Lusaka and apply it to Mutendere, treating the suburb as a share of the total city population. This step would give much better results if population data were available at a smaller level, but we don’t have it.

We calculate the population density using the projected population of Lusaka for 2017 and the area of the capital in m².

#Convert Lusaka area from km2 to m2
LusArea <- 360 * 1000000

#Calculate the population density per m2 for Lusaka
LusDensM2 <- luPop2017 / LusArea
LusDensM2

## [1] 0.006741383

Going back to Lwin and Murayama’s formula:

\({BP_i} = (\frac{CP}{\sum_{k=1}^{n}{BA_k}}){BA_i}\)

CP is the population of the census tract. Since we don’t have data at this level of granularity, we use the capital’s population and density instead: the city-wide density is multiplied by the building areas in Mutendere, first for the sum of the footprints and then for each building.

# Remove schools (non-residential). Subset the spatial object itself so that
# the polygons and the attribute table stay in sync.
data2 <- data2[!(data2@data$id %in% c("way/420281958", "way/420282448")), ]

# Total area of buildings footprint
BuildFootArea <- data2@data %>% 
  tally(area_m2)

# Approximate the population using the area of buildings
MutPopM2 <- LusDensM2 * BuildFootArea
MutPopM2

##          n
## 1 5210.239

# Disaggregate the population of the proxy census tract standing in for Mutendere to all buildings using their area
data2@data <- data2@data %>% 
  mutate(PopBuild = LusDensM2 * area_m2)

data2@data$PopBuild <- round(data2@data$PopBuild, digits = 2)

summary(data2@data$PopBuild)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0300  0.4600  0.6800  0.7428  0.9200 10.0000
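To see why the choice of \(CP\) matters, here is a minimal sketch contrasting two readings of the formula on toy numbers. The city figures are the ones used above; the land area of the tract (`tract_area_m2`) and the three footprints are hypothetical values chosen only for illustration, not measurements of Mutendere.

```r
# City-wide figures taken from the text
city_pop_2017 <- 2426898
city_area_m2  <- 360 * 1e6
dens_m2       <- city_pop_2017 / city_area_m2   # people per m2 of land

# HYPOTHETICAL inputs, for illustration only
tract_area_m2 <- 2 * 1e6           # assumed land area inside the boundary
footprints_m2 <- c(60, 100, 140)   # toy building footprints

# (a) Density applied directly to the building footprints
pop_footprint <- dens_m2 * footprints_m2

# (b) Tract population derived from the land area, then shared out
#     across buildings in proportion to their footprint
CP        <- dens_m2 * tract_area_m2
pop_alloc <- CP * footprints_m2 / sum(footprints_m2)

sum(pop_footprint)
sum(pop_alloc)
```

In (a) the people assigned to the buildings add up to the density times the footprint area only. In (b) they add up to the whole tract population, which is what the formula takes \(CP\) to be.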

Results

Using an areametric approach, we estimate the population of Mutendere at about 5,210 in 2017, based on the sum of the buildings’ areas. The average number of persons per building is 0.74. This is a very low estimate and must be treated with caution. One likely reason is that the city-wide density counts people per square meter of land, while here it is applied only to building footprints, which cover a fraction of the land. The result is meant as a proxy in the absence of other information and is likely not accurate, but the exercise offers an estimate where there was none.

As a final step, we can visualize the disaggregation to each building.

Closing remarks

Areal interpolation is based on the assumption that the population is uniformly distributed. However, populations rarely present a homogeneous distribution within a zone such as a city, so estimates are very likely to differ from reality. The usefulness of the method is as a proxy where information is not available.

As an exercise, this is of some interest for micro analysis. A number of things could be done to improve the estimates. First, any information that could identify the non-residential buildings would allow their removal from the calculation. Second, obtaining the number of floors or the height of the buildings would allow for a volumetric approach, which would potentially be more accurate. Calculating a density from the total residential building area in the city, instead of from the city's whole land area, might also increase the estimates.
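The volumetric idea can be sketched by weighting each footprint by its number of floors before sharing out the tract population. This is an illustrative variant in the spirit of the formula used above, not a reproduction of a published method; the function name `volumetric` and the toy inputs are hypothetical.

```r
# Floor-weighted allocation:
# BP_i = CP * (BA_i * F_i) / sum(BA_k * F_k), with F the number of floors
volumetric <- function(CP, BA, floors) {
  stopifnot(length(BA) == length(floors), all(BA > 0), all(floors >= 1))
  w <- BA * floors        # total floor space of each building
  CP * w / sum(w)
}

# HYPOTHETICAL tract: 120 people, three buildings, the second with two floors
toy_vol <- volumetric(CP = 120, BA = c(100, 100, 200), floors = c(1, 2, 1))
toy_vol

## [1] 24 48 48
```

With floors taken into account, the two-storey building receives as many people as the single-storey building with twice its footprint.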

There are documented methods to address some of the shortcomings like target density weighting, multi-class dasymetric, and more. We intend to revisit the current test with a more refined interpolation.

A report by the Millennium Challenge Account Zambia (MCA) presented a census of Mutendere carried out in 2015-2016 that found 106,447 people. The area studied is larger than the one we used, but the gap is still large: roughly twenty times our estimate. We might redo this exercise with an area that is closer to the one used by MCA Zambia but, at this time, many of the buildings in the area covered by the report have not been mapped.

Could the population have grown faster than projected? Lusaka has one of the fastest growing populations in Africa. Without more recent data, it is difficult to make this approach more accurate. An areametric approach is a useful way to disaggregate a population for micro analysis at a very small scale, but in our test, which used a city-wide population count as a proxy, it produced low estimates.