Visualising the scales of ethnic diversity in London using a multilevel model of entropy

Introduction
Measuring ethnic diversity
A mutilevel index of ethnic diversity
Mapping the residuals
Additional Notes

Introduction

Within the segregation literature, there has been growing interest in understanding the scales of segregation not only as a simple index quantifying the amount of segregation but also as a geographical measure capturing more of the spatial pattern of segregation and of the geographical scales at which population groups are clustered together with their peers and apart from others. Such approaches feature in two special journal editions on measuring and modelling segregation - one in Environment and Planning B, and one in Tijdschrift voor economische en sociale geografie - and include the Multilevel Index of Dissimilarity, which is available as a package for R.

Given the input into these methods are geographically referenced data, the outputs should be mappable, visualizing how the overall pattern of segregation is an outcome of other patterns (and processes) at multiple geographical scales. They rarely are. A notable exception is below, reproduced from Harris and Owen (2018). Having fitted a Multilevel Index of Dissimilarity looking at the geographical scales of White British – Bangladeshi residential segregation in London, density surfaces were used to highlight the places where (a) differences at the small area, census Output Area (OA) scale, and (b) where broader scale, Local Authority (LA) differences are contributing to the patterns of segregation. The approach is, however, a crude simplification of how the strength of segregation varies at each scale across the study region, and of how those variations combine to give the overall pattern of segregation.

This question is, can we do better? The answer is yes and it is shown in Figure 2 (but with an emphasis on diversity instead of segregation). The rest of this document explains how it was created.

Figure 1: Mapping the scales of White British – Bangladeshi residential segregation in London in 2011. Darker shading indicates the places where (a) the OA, or (b) local authority effects act to increase the measured amount of segregation (Reproduced from Harris & Owen, 2018)

Figure 2: Mapping the scales of ethnic diversity in London

Measuring ethnic diversity

Although the same principles can be applied to visualizing the scales of segregation, here, instead, we will look at its counterpart: ethnic diversity. The ethnicity data used are from the National Pupil Database, which describe pupils who were in the middle years of state primary or secondary schools in either of the years 2015 or 2017 are used to calculate the ethnic mix of pupils within each of the 4835 Lower Level Super Output Areas in London. To quantify the mix, the entropy measure, \(E\), is calculated for each LSOA, defined as,

\(E = -\sum_{1}^{n_g}\text{ }{p_g\times\text{log}(p_g)}\)

where the summation is over \(n_g = 9\) ethnic categories, and where \(p_g\) is the proportion of all the pupils in the LSOA that are of ethnic group \(g\). Those groups are Asian Bangladeshi, Asian Indian, Asian Pakistani, Black African, Black Caribbean, White British, White Other, Mixed ethnicity, and Other. As well as calculating the entropy measure, each LSOA (at the ‘micro-scale’, \(i\)) is matched to an electoral ward (at the ‘meso-scale’, \(j\)) and to a local authority (at the ‘macro-scale’,\(k\)), to give a three-level, geographic hierarchy.

head(slot(map, "data"))

## # A tibble: 6 x 7
##   SP_ID LSOA11CD  WD16CD    LAD16CD   LAD16NM                  N     E
##   <fct> <fct>     <fct>     <fct>     <fct>                <dbl> <dbl>
## 1 1     E01000001 E05009288 E09000001 City of London          20  1.75
## 2 2     E01000002 E05009302 E09000001 City of London          19  1.76
## 3 3     E01000003 E05009302 E09000001 City of London          42  1.51
## 4 4     E01000005 E05009308 E09000001 City of London          66  1.78
## 5 5     E01000006 E05000026 E09000002 Barking and Dagenham   169  1.90
## 6 6     E01000007 E05000026 E09000002 Barking and Dagenham   236  1.92

A mutilevel index of ethnic diversity

In much the same way at the standard Index of Dissimilarity can be expressed as a multilevel index that measures the amount of segregation at each level, net of the others, the standard entropy model can be expressed as a multilevel model too. For example, a three-level model of LSOAs, Wards and LAs, may be defined as,

\(E_{ijk}=\beta_0 +\epsilon_{ijk}\)

where \(\epsilon_{ijk}\) is the (residual) difference between the measured value entropy for each LSOA (\(E_{ijk}\)) and the constant, \(\beta_0\) (which is the average diversity score across all LSOAs); and where

\(\epsilon_{ijk}=C_i+M_j+Y_k\)

decomposes the residuals into three levels - those at the LSOA level (\(C_i\)), those at the Ward level (\(M_j\)) and those at the LAD level (\(Y_k\)) (the reason for denoting them as C, M and Y will become evident presently).

Prior to fitting the model, the raw entropy scores are subjected to a Box-Cox transformation to deal with any problem of skew,

require(MASS)
bxcx <- boxcox(E ~ 1, data = map, lambda = seq(-5,5,0.01), plotit = FALSE)
lambda <- bxcx$x[bxcx$y == max(bxcx$y)]
print(lambda)

## [1] 2.95

map$E <- (map$E ^ lambda - 1)/lambda
head(slot(map, "data"))

## # A tibble: 6 x 7
##   SP_ID LSOA11CD  WD16CD    LAD16CD   LAD16NM                  N     E
##   <fct> <fct>     <fct>     <fct>     <fct>                <dbl> <dbl>
## 1 1     E01000001 E05009288 E09000001 City of London          20 1.43 
## 2 2     E01000002 E05009302 E09000001 City of London          19 1.46 
## 3 3     E01000003 E05009302 E09000001 City of London          42 0.812
## 4 4     E01000005 E05009308 E09000001 City of London          66 1.53 
## 5 5     E01000006 E05000026 E09000002 Barking and Dagenham   169 1.93 
## 6 6     E01000007 E05000026 E09000002 Barking and Dagenham   236 1.98

The model is then fitted with ‘prior weights’ proportional to the number of pupils in each LSOA (and therefore Ward and LA also) (see Additional Notes, below):

require(lme4)
w <- map$w <- map$N / sum(map$N) * nrow(map)
mlm <- lmer(E ~ 1 + (1 | WD16CD) + (1 | LAD16CD), weights = w, data = slot(map, "data"))

print(mlm)

## Linear mixed model fit by REML ['lmerMod']
## Formula: E ~ 1 + (1 | WD16CD) + (1 | LAD16CD)
##    Data: slot(map, "data")
## Weights: w
## REML criterion at convergence: 5661.88
## Random effects:
##  Groups   Name        Std.Dev.
##  WD16CD   (Intercept) 0.3645  
##  LAD16CD  (Intercept) 0.4958  
##  Residual             0.3487  
## Number of obs: 4835, groups:  WD16CD, 634; LAD16CD, 33
## Fixed Effects:
## (Intercept)  
##       1.224

The results from the model suggest that there is greatest variance (shown as Std. Dev. in the model output) at the local authority scale; that is, there is greatest patterning of the most and least diverse places at the LA level. Specifically, the percentage of the variance due to each level, in order of WD16CD (Wards), LAD16CD (LAs) and Residual (LSOAs) is,

vv <- data.frame(VarCorr(mlm))$vcov
names(vv) <- data.frame(VarCorr(mlm))$grp
vv <- round(vv / sum(vv) * 100, 1)
print(vv)

##   WD16CD  LAD16CD Residual 
##     26.6     49.1     24.3

… although this does rather exaggerate the difference between LAs and the rest (because the variance is in the measurement units, squared).

The next step is to visualize that patterning by mapping the residuals from the model.

Mapping the residuals

To map the scales of ethnic diversity in London, what is required are the residuals from the multilevel model. The process is one of extracting the residuals, standardizing them and assigning them a colour value for the map.

Extracting the residuals and range standardising them

Residuals at the base (LSOA) level are easily extracted and matched into the data,

map$C <- residuals(mlm)

The others take just a little more work:

rr <- ranef(mlm)
names(rr)

## [1] "WD16CD"  "LAD16CD"

mch <- match(map$WD16CD, rownames(rr[[1]]))
map$M <- rr[[1]][mch,1]
mch <- match(map$LAD16CD, rownames(rr[[2]]))
map$Y <- rr[[2]][mch,1]
slot(map, "data") %>%
  dplyr::select(LSOA11CD, WD16CD, LAD16CD, C, M, Y) %>%
  print(n=12)

## # A tibble: 4,835 x 6
##    LSOA11CD  WD16CD    LAD16CD         C       M      Y
##  * <fct>     <fct>     <fct>       <dbl>   <dbl>  <dbl>
##  1 E01000001 E05009288 E09000001  0.134   0.0204 0.0543
##  2 E01000002 E05009302 E09000001  0.263  -0.0842 0.0543
##  3 E01000003 E05009302 E09000001 -0.382  -0.0842 0.0543
##  4 E01000005 E05009308 E09000001  0.167   0.0842 0.0543
##  5 E01000006 E05000026 E09000002 -0.161   0.425  0.442 
##  6 E01000007 E05000026 E09000002 -0.107   0.425  0.442 
##  7 E01000008 E05000026 E09000002 -0.435   0.425  0.442 
##  8 E01000009 E05000026 E09000002 -0.183   0.425  0.442 
##  9 E01000010 E05000026 E09000002  0.477   0.425  0.442 
## 10 E01000011 E05000026 E09000002  0.0283  0.425  0.442 
## 11 E01000012 E05000026 E09000002  0.372   0.425  0.442 
## 12 E01000013 E05000027 E09000002  0.317  -0.445  0.442 
## # ... with 4,823 more rows

For ease of conversion into a colour value, and for comparability across the levels, the residuals are then standardized into the range \(\text[0,1]\):

mx <- max(map$C, map$M, map$Y)
mn <- min(map$C, map$M, map$Y)
rng <- mx - mn
map$C <- (map$C - mn) / rng
map$M <- (map$M - mn) / rng
map$Y <- (map$Y - mn) / rng
slot(map, "data") %>%
  summarise_at(8:10, c("min", "max"))

## # A tibble: 1 x 6
##   C_min M_min Y_min C_max M_max Y_max
##   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1     0 0.171 0.229     1 0.911 0.807

Assigning the residuals a colour value

Recall that C represents the micro-level or LSOA scale. It will be given a Cyan colour palette. M represents the meso-level or Ward scale. It will be given a Magenta colour. Finally, Y represents the macro-level of LA scale. It will be coloured Yellow.

To achieve this, first, we need a function to convert what will be a CMY(K) colour palette into a hex code:

hexcode <- function(C, M, Y, K = 0) {
  R <- 255 * (1-C) * (1-K)
  G <- 255 * (1-M) * (1-K)
  B <- 255 * (1-Y) * (1-K)
  rgb(R, G, B, maxColorValue = 255)
}

The patterns of diversity at each of the levels can now be plotted both individually (the first three panels of the graphic) and when mixed together (bottom-right plot).

par(mfrow = c(2,2))
par(mai = c(0,0,0,0))
cols <- hexcode(map$C, 0, 0)
plot(map, col=cols, border=cols)
cols <- hexcode(0, map$M, 0)
plot(map, col=cols, border=cols)
cols <- hexcode(0, 0, map$Y)
plot(map, col=cols, border=cols)
cols <- hexcode(map$C, map$M, map$Y)
plot(map, col=cols, border=cols)

Figure 3: Visualizing the scales of diversity - an initial mapping

Completing the maps

The maps above are broadly effective but there is room for improvement through some careful use of an alpha (transparency) parameter, adding an outer boundary line, and some other tidying-up and labelling.

require(rgeos)

wards <- gUnaryUnion(map, id = map$WD16CD)
wards.df <- slot(map, "data") %>%
  dplyr::select(WD16CD, M) %>%
  unique(.)
wards <- SpatialPolygonsDataFrame(wards, wards.df, match.ID = FALSE)

las <- gUnaryUnion(map, id = map$LAD16CD)
las.df <- slot(map, "data") %>%
  dplyr::select(LAD16CD, Y) %>%
  unique(.)
las <- SpatialPolygonsDataFrame(las, las.df, match.ID = FALSE)

outline <- rgeos::gBuffer(map)
outline <- as(outline, "SpatialLines")

par(mfrow = c(2,2))
par(mai = c(0,0,0,0))

cols <- alpha(hexcode(map$C, 0, 0), map$C)
plot(map, col=cols, border="transparent")
plot(outline, col="grey90", add=T)
title(main=paste0("LSOA level (",vv[3],"% of variance)"), col.main=hexcode(1,0,0,0.5), line=-2)

cols <- alpha(hexcode(0, wards$M, 0), wards$M)
plot(wards, col=cols, border="transparent")
plot(outline, col="grey70", add=T)
title(main=paste0("Ward level (",vv[1],"% of variance)"), col.main=hexcode(0,1,0,0.5), line=-2)

cols <- alpha(hexcode(0, 0, las$Y), las$Y)
plot(las, col=cols, border="transparent")
plot(outline, col="grey90", add=T)
title(main=paste0("LA level (",vv[2],"% of variance)"),col.main=hexcode(0,0,1,0.5), line=-2)

alpha <- (map$E - min(map$E)) / (max(map$E) - min(map$E))
cols <- alpha(hexcode(map$C, map$M, map$Y), alpha)
plot(map, col=cols, border="transparent")
plot(outline, col="grey80", add=T)
title(main="All levels combined", col.main=hexcode(0,0,0,1), line=-2)

Figure 4: The final maps

Additional Notes

The problem of invisibility

Looking at the final maps, there are some areas that are too small to be visible. One way to address this problem is to use the ‘visually balanced’ or minimum area cartogram method outlined by Harrist et al. (2017). Doing so, using R’s cartogram package, produces the following maps - I am not persuaded they are that much better as ‘invisibility’ in some parts of the initial map are replaced by invisibility and distortion in other parts of the revised version. A little more trial-and-error is needed, probably by changing the smallest interpretable unit in the code below (source). Unfortunately, with so many LSOAs, the code takes a very long time to run.

Figure 5: Using a visually balanced / minimum area cartogram

## Code to create a minimum area cartogram
siu <- 0.02 # the smallest interpretable unit
height <- 5
bb <- sp::bbox(map)
width <- (bb[1,2] - bb[1,1]) / (bb[2,2] - bb[2,1]) * height
bbA <- (bb[1,2] - bb[1,1]) * (bb[2,2] - bb[2, 1])
mapA <- rgeos::gArea(map)
minA <- (siu * bbA) / (height * width)
map$scaleby <- rgeos::gArea(map, byid = TRUE)
map$scaleby[map$scaleby < minA] <- minA
map <- cartogram::cartogram(map, "scaleby", maxSizeError = 1.1, prepare = "none")

The effects of the ‘prior weights’

Earlier, the multilevel model of entropy was fitted with ‘prior weights’ proportional to the number of pupils in the data per LSOA. It is informative to see what happens if those weights are not used.

The charts below consider the consequences at the LA level. A greater LA level effect (i.e. a greater measured clustering of ethnic diversity at the LA level) appears to occur when their are more pupils in the LA and/or when the entropy values within the LA exhibit less variance. Reciprocally, there is ‘shrinkage’ of the values when there are fewer pupils or more internal heterogeneity. Further analysis of these weights would be helpful and, of course, they can be dropped entirely.

Figure 6: Exploring the effects of the weights on the LA level residuals

Reproducibility

The full code needed to reproduce the final maps is as below.

install.packages("tidyverse")
ip <- installed.packages()[,1]
if(!length(grep("lme4", ip))) install.packages("lme4")
if(!length(grep("MASS", ip))) install.packages("MASS")
if(!length(grep("rgdal", ip))) install.packages("rgdal")
if(!length(grep("rgeos", ip))) install.packages("rgeos")
if(!length(grep("sp", ip))) install.packages("sp")

require(lme4)
require(MASS)
require(rgdal)
require(rgeos)
require(sp)
require(tidyverse)

download.file("https://www.dropbox.com/s/kqj93hog396ve8p/maps.zip?dl=1", "map.zip", mode="wb")
unzip("map.zip")

map <- readOGR("map.shp")
slot(map, "data") <- as_tibble(map)?download.zip

bxcx <- boxcox(E ~ 1, data = map, lambda = seq(-5,5,0.01), plotit = FALSE)
lambda <- bxcx$x[bxcx$y == max(bxcx$y)]
print(lambda)
map$E <- (map$E ^ lambda - 1)/lambda

w <- map$N / sum(map$N) * nrow(map)
w <- ifelse(w == 0, w + 0.01, w) # This is a fudge because one of the LSOAs is missing data
mlm <- lmer(E ~ 1 + (1 | WD16CD) + (1 | LAD16CD), data = data.frame(map))

vv <- data.frame(VarCorr(mlm))$vcov
names(vv) <- data.frame(VarCorr(mlm))$grp
vv <- round(vv / sum(vv) * 100, 1)

map$C <- residuals(mlm)
rr <- ranef(mlm)
names(rr)
mch <- match(map$WD16CD, rownames(rr[[1]]))
map$M <- rr[[1]][mch,1]
mch <- match(map$LAD16CD, rownames(rr[[2]]))
map$Y <- rr[[2]][mch,1]

mx <- max(map$C, map$M, map$Y)
mn <- min(map$C, map$M, map$Y)
rng <- mx - mn
map$C <- (map$C - mn) / rng
map$M <- (map$M - mn) / rng
map$Y <- (map$Y - mn) / rng

hexcode <- function(C, M, Y, K = 0) {
  R <- 255 * (1-C) * (1-K)
  G <- 255 * (1-M) * (1-K)
  B <- 255 * (1-Y) * (1-K)
  rgb(R, G, B, maxColorValue = 255)
}

wards <- gUnaryUnion(map, id = map$WD16CD)
wards.df <- slot(map, "data") %>%
  dplyr::select(WD16CD, M) %>%
  unique(.)
wards <- SpatialPolygonsDataFrame(wards, wards.df, match.ID = FALSE)

las <- gUnaryUnion(map, id = map$LAD16CD)
las.df <- slot(map, "data") %>%
  dplyr::select(LAD16CD, Y) %>%
  unique(.)
las <- SpatialPolygonsDataFrame(las, las.df, match.ID = FALSE)

outline <- rgeos::gBuffer(map)
outline <- as(outline, "SpatialLines")

if(.Platform$OS.type == "windows") windows("figure2.png", height=7, width=7)
if(.Platform$OS.type == "unix") quartz("figure2.png", height=7, width=7)

par(mfrow = c(2,2))
par(mai = c(0,0,0,0))

cols <- alpha(hexcode(map$C, 0, 0), map$C)
plot(map, col=cols, border="transparent")
plot(outline, col="grey90", add=T)
title(main=paste0("LSOA level (",vv[3],"% of variance)"), col.main=hexcode(1,0,0,0.5), line=-2)

cols <- alpha(hexcode(0, wards$M, 0), wards$M)
plot(wards, col=cols, border="transparent")
plot(outline, col="grey70", add=T)
title(main=paste0("Ward level (",vv[1],"% of variance)"), col.main=hexcode(0,1,0,0.5), line=-2)

cols <- alpha(hexcode(0, 0, las$Y), las$Y)
plot(las, col=cols, border="transparent")
plot(outline, col="grey90", add=T)
title(main=paste0("LA level (",vv[2],"% of variance)"),col.main=hexcode(0,0,1,0.5), line=-2)

alpha <- (map$E - min(map$E)) / (max(map$E) - min(map$E))
cols <- alpha(hexcode(map$C, map$M, map$Y), alpha)
plot(map, col=cols, border="transparent")
plot(outline, col="grey80", add=T)
title(main="All levels combined", col.main=hexcode(0,0,0,1), line=-2)

dev.off()

And finally

I am grateful to persons including Chris Brunsdon and David O’Sullivan who gave feedback on an earlier version of the map. As a ‘footnote’, I should probably be using more of the Simple Features for R to do more of the computational work. I will, one day. But, for now, I thoroughly recommend looking at Geocomputation with R, which is available online.