Let \(X\) and \(Y\) denote two ethnic groups (e.g. White British and Bangladeshi), and \(j\) a location (e.g. a school, neighbourhood or local authority). The number of \(X\) in \(j\) is \(n_{xj}\) and the number of \(Y\) is \(n_{yj}\). The total number of \(X\), summed for all locations in a study region or part thereof is \(n_{x+}\) and the total number of \(Y\) is \(n_{y+}\). The share of all \(X\) in location \(j\) is \(\frac{n_{xj}}{n_{x+}}\), and the share of all \(Y\) is \(\frac{n_{yj}}{n_{y+}}\). The total number of all groups in location \(j\) (e.g. the total number of pupils) is \(n_{+j}\), which means the proportion that are of group \(X\) is \(\frac{n_{xj}}{n_{+j}}\), and the proportion of \(Y\) is \(\frac{n_{yj}}{n_{+j}}\).
The Index of Dissimilarity (ID) measures how unevenly two groups are spread out geographically, relative to their overall size. It can be defined in terms of probability. Imagine that a pupil is selected at random from all those that belong to ethnic group \(X\). The probability that this pupil goes to school in location \(j\) is
\(P\big(j|X\big)=\frac{n_{xj}}{n_{x+}}\)
Similarly, the probability that a randomly selected member of ethnic group \(Y\) is in location \(j\) is
\(P\big(j|Y\big)=\frac{n_{yj}}{n_{y+}}\)
The ID defines the absence of segregation as the case where those probabilities are equal for all locations in the study region: when \(P\big(j|X\big) = P\big(j|Y\big)\) and, equivalently, \(P\big(j|X\big) - P\big(j|Y\big) = 0\) for all \(j\). Hence \(\sum_{j}\left|\frac{n_{xj}}{n_{x+}}-\frac{n_{yj}}{n_{y+}}\right|= 0\). The ID simply multiplies the left-hand side of that expression by a scaling constant, usually 0.5, to ensure that the index ranges from 0 (no segregation, when the share of \(X\) is everywhere equal to the share of \(Y\)) to 1 (total separation, wherever \(X\) is, \(Y\) is not):
\(\text{ID}=0.5\sum_{j}\left|\frac{n_{xj}}{n_{x+}}-\frac{n_{yj}}{n_{y+}}\right|\)
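As a minimal sketch of the calculation, the ID can be computed directly from two lists of counts, one per group, ordered by location. The function name `index_of_dissimilarity` and the counts for three schools are illustrative assumptions, not data from the study.

```python
def index_of_dissimilarity(n_x, n_y):
    """Index of Dissimilarity for two groups counted over the same locations."""
    total_x = sum(n_x)  # n_{x+}
    total_y = sum(n_y)  # n_{y+}
    if total_x == 0 or total_y == 0:
        raise ValueError("each group needs at least one member")
    # Half the sum of absolute differences between the two groups' location shares.
    return 0.5 * sum(abs(x / total_x - y / total_y) for x, y in zip(n_x, n_y))


# Hypothetical counts of X and Y in three schools (illustrative only).
n_x = [80, 15, 5]
n_y = [10, 30, 60]
id_value = index_of_dissimilarity(n_x, n_y)
print(round(id_value, 3))
```

With these made-up counts the shares of \(X\) are 0.8, 0.15 and 0.05 and the shares of \(Y\) are 0.1, 0.3 and 0.6, so the absolute differences sum to 1.4 and the ID is 0.7.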
The ID can be disaggregated to identify the locations contributing most to it. Any individual location’s contribution to the overall index value can be determined from
\(\text{id}_j\propto\left|\frac{n_{xj}}{n_{x+}}-\frac{n_{yj}}{n_{y+}}\right|\)
If there are more than two ethnic groups then the location’s contribution to the total segregation across all the groups can be calculated from the sum of all the pairwise calculations,
\(\text{id}_{+j}\propto\sum_x\sum_y\left|\frac{n_{xj}}{n_{x+}}-\frac{n_{yj}}{n_{y+}}\right|\)
In Chapter 3, the above is weighted, giving
\(\text{id}_{+j}\propto\sum_x\sum_y\frac{n_{xj} + n_{yj}}{n_{+j}}\left|\frac{n_{xj}}{n_{x+}}-\frac{n_{yj}}{n_{y+}}\right|\)
This places most weight on the segregation of the groups that are most prevalent in each location.
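The local contributions can be sketched as follows. This is an illustration under stated assumptions rather than the code used in Chapter 3: it assumes the weight \(\frac{n_{xj} + n_{yj}}{n_{+j}}\) is applied to each pair of groups inside the double sum, it counts each unordered pair once (which changes the result only by a constant factor, and the contribution is defined only up to proportionality), and the function name and group counts are hypothetical.

```python
from itertools import combinations


def local_contributions(counts, weighted=False):
    """Unscaled contribution of each location to multi-group dissimilarity.

    counts maps a group label to a list of counts, one per location.
    """
    groups = list(counts)
    n_locations = len(counts[groups[0]])
    group_totals = {g: sum(counts[g]) for g in groups}  # n_{x+}
    location_totals = [sum(counts[g][j] for g in groups)
                       for j in range(n_locations)]     # n_{+j}
    contributions = []
    for j in range(n_locations):
        value = 0.0
        for x, y in combinations(groups, 2):  # each pair of groups once
            diff = abs(counts[x][j] / group_totals[x]
                       - counts[y][j] / group_totals[y])
            weight = 1.0
            if weighted and location_totals[j] > 0:
                # Share of the location's population made up by the pair.
                weight = (counts[x][j] + counts[y][j]) / location_totals[j]
            value += weight * diff
        contributions.append(value)
    return contributions


# Hypothetical counts for three groups in two locations (illustrative only).
counts = {"A": [80, 20], "B": [20, 80], "C": [50, 50]}
unweighted = local_contributions(counts)
weighted = local_contributions(counts, weighted=True)
print(unweighted, weighted)
```

In this example the weighting shrinks every pairwise term, but least for the pair that makes up most of each location's population.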
The Indices of Exposure and of Isolation are both related to how prevalent an ethnic group is in the place attended by the average member of that group - they measure how ‘exposed’ the group is to other groups or how concentrated it is with itself.
The probability that a person selected at random from group \(X\) is in location \(j\) is
\(\text{P}\big(j|X\big)=\frac{n_{xj}}{n_{x+}}\)
Having selected that person, the probability of selecting, from the same location, another person of the same ethnicity is
\(\text{P}\big(X|j\big)=\frac{n_{xj}-1}{n_{+j}-1}\simeq\frac{n_{xj}}{n_{+j}}\)
The probability of that second person being from a different ethnic group is
\(1-\text{P}\big(X|j\big)\simeq1-\frac{n_{xj}}{n_{+j}}\)
Multiplying the first and third of these probabilities together, and summing the result over all locations, produces an index measuring one ethnic group’s average exposure to other groups,
\(\text{IE}=\sum_{j}\left(\frac{n_{xj}}{n_{x+}}\right)\left(1-\frac{n_{xj}}{n_{+j}}\right)\)
Or, it can be modified to calculate the exposure to any other specific group,
\(\text{IE}=\sum_{j}\left(\frac{n_{xj}}{n_{x+}}\right)\left(\frac{n_{yj}}{n_{+j}}\right)\)
The Index of Isolation simply calculates the average proportion of pupils who are of group \(X\) in the locations where group \(X\) is found:
\(\text{II}=\sum_{j}\left(\frac{n_{xj}}{n_{x+}}\right)\left(\frac{n_{xj}}{n_{+j}}\right)\)
In probabilistic terms, it is \(\sum_j\text{P}\big(j|X\big).\text{P}\big(X|j\big)\), where \(\text{P}\big(j|X\big).\text{P}\big(X|j\big)\) is the joint probability of randomly selecting, from group \(X\), a pupil that lives in location \(j\), and then a second pupil, from location \(j\), who also is a member of group \(X\).
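The Index of Exposure ranges from zero to (asymptotically) one, and the Index of Isolation from (asymptotically) zero to one. The limits are asymptotic because group \(X\) always forms some part of the population of any location in which it is found, so its exposure to others can approach but never reach one, and its isolation can approach but never reach zero.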
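The three formulas above can be sketched in a few lines, using the large-sample approximation \(\text{P}\big(X|j\big)\simeq\frac{n_{xj}}{n_{+j}}\). The function names and the counts for two schools are hypothetical and chosen only for illustration.

```python
def exposure_and_isolation(n_x, n_total):
    """Return (IE, II) for group X given its counts and the location totals."""
    total_x = sum(n_x)  # n_{x+}
    exposure = 0.0
    isolation = 0.0
    for x, t in zip(n_x, n_total):
        if t == 0:
            continue  # an empty location contributes nothing
        share = x / total_x  # P(j|X)
        prop = x / t         # approximately P(X|j)
        exposure += share * (1 - prop)
        isolation += share * prop
    return exposure, isolation


def exposure_to(n_x, n_y, n_total):
    """Average exposure of group X to one specific group Y."""
    total_x = sum(n_x)
    return sum((x / total_x) * (y / t)
               for x, y, t in zip(n_x, n_y, n_total) if t > 0)


# Hypothetical counts for two schools of 100 pupils each (illustrative only).
n_x = [80, 20]
n_y = [20, 80]
n_total = [100, 100]
ie, ii = exposure_and_isolation(n_x, n_total)
ie_xy = exposure_to(n_x, n_y, n_total)
print(round(ie, 2), round(ii, 2), round(ie_xy, 2))
```

Because the shares \(\frac{n_{xj}}{n_{x+}}\) sum to one over the locations, the exposure of \(X\) to all other groups and the isolation of \(X\) always sum to one, as they do here (0.32 and 0.68).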
Chapter 3 introduces a new index - the potential for Equal Cross-Exposure between two groups - that is at its maximum for any location when (a) \(X\) and \(Y\) are of equal number (\(n_{xj} : n_{yj} = 1\)), and (b) \(X\) and \(Y\) comprise the entire population of \(j\) (so \(n_{xj} + n_{yj} = n_{+j}\)).
The formal specification of the index is
\(\text{PECI}_j = \frac{n_{xj} + n_{yj}}{n_{+j}}\left(\frac{\text{tan}^{-1}r_j}{\text{tan}^{-1}1}\right)\)
where, if \(n_{xj} > n_{yj}\), then \(r_j = \frac{n_{yj}}{n_{xj}}\), else \(r_j = \frac{n_{xj}}{n_{yj}}\).
The index ranges from zero (there is none of one or both groups so there is no cross-exposure) to one (each group forms half of the total population).
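A minimal sketch of the index for a single location is given below. The function name `peci` is hypothetical, and returning zero when either group is absent is an assumption made to avoid dividing by zero, in line with the stated lower bound of the index.

```python
import math


def peci(n_xj, n_yj, n_total_j):
    """Potential for Equal Cross-Exposure between X and Y in one location."""
    if n_xj == 0 or n_yj == 0 or n_total_j == 0:
        return 0.0  # assumed: no cross-exposure if either group is absent
    # Ratio of the smaller group to the larger, so 0 < r <= 1.
    r = min(n_xj, n_yj) / max(n_xj, n_yj)
    pair_share = (n_xj + n_yj) / n_total_j
    return pair_share * math.atan(r) / math.atan(1.0)


# Hypothetical locations (illustrative only).
equal_and_complete = peci(50, 50, 100)  # equal numbers, no other groups
equal_but_half = peci(50, 50, 200)      # equal numbers, half the population
unequal = peci(25, 75, 100)             # a 1:3 ratio
print(equal_and_complete, equal_but_half, round(unequal, 3))
```

The first case meets both conditions and returns one; the second meets only the equal-numbers condition and is halved by the population share; the third is reduced by the arctangent term alone.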
The entropy index, \(h_j\), measures the (ethnic) diversity of a location, and is calculated as
\(h_j = -\sum_kp_{kj}\text{log}\big(p_{kj}\big)\)
where \(k\) is one of \(N_k\) different ethnic groups and
\(p_{kj}=\frac{n_{kj}}{n_{+j}}\)
with \(\text{log}\big(p_{kj}\big)\) set to \(0\) if \(p_{kj} = 0\).
The index ranges from 0 (when the location is fully populated by only one group) to \(\text{log}\big(N_k\big)\) (when there is an equal proportion of each group). To standardise it into the range from \(0\) to \(1\) the following scaling is applied,
\(h_j \leftarrow h_j/\text{log}\big(N_k\big)\)
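The entropy calculation and its standardisation can be sketched as below. The function name `entropy_index` is hypothetical, the natural logarithm is assumed (the base cancels once the index is standardised), and \(N_k\) is taken to be the length of the list of counts supplied.

```python
import math


def entropy_index(counts_j, standardise=True):
    """Entropy (diversity) of one location from its counts per ethnic group."""
    n_groups = len(counts_j)  # N_k
    total = sum(counts_j)     # n_{+j}
    if total == 0:
        raise ValueError("the location has no population")
    h = 0.0
    for n in counts_j:
        p = n / total
        if p > 0:  # a group with p = 0 contributes nothing
            h -= p * math.log(p)
    if standardise and n_groups > 1:
        h /= math.log(n_groups)
    return h


# Hypothetical locations with four and three groups (illustrative only).
even = entropy_index([25, 25, 25, 25])
single = entropy_index([100, 0, 0])
raw_two_groups = entropy_index([50, 50], standardise=False)
print(round(even, 3), single, round(raw_two_groups, 3))
```

An even split across all groups gives the maximum of one after standardisation, a single-group location gives zero, and the unstandardised value for two equal groups is \(\text{log}(2)\).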
Chapter 4 looks at how concentrated pupils are in schools with their own or other ethnic groups. To achieve this a typological approach is used (rather than an index) that classifies schools according to the percentage of their pupils that are from one or more specific groups at particular thresholds of interest - for example, the percentage of White British pupils that are in a school that is majority White British. If \(n_{x(p_{xj}>0.5)}\) is the number of \(X\) in schools where that group forms a majority (with \(p_{xj}=\frac{n_{xj}}{n_{+j}}\)) then, as a percentage of all \(X\), it is simply \(\frac{n_{x(p_{xj}>0.5)}}{n_{x+}}\times100\). The higher that value, the more concentrated, in the example, White British pupils are in majority White British schools.
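As an illustration of the typological measure, the sketch below computes the percentage of a group's pupils who attend schools where that group exceeds a chosen threshold. The function name, the `threshold` parameter and the counts for three schools are assumptions made for the example.

```python
def percent_in_majority_schools(n_x, n_total, threshold=0.5):
    """Percentage of all X who are in schools where X exceeds the threshold."""
    total_x = sum(n_x)  # n_{x+}
    if total_x == 0:
        raise ValueError("group X has no members")
    # Count the X pupils in schools where the proportion of X passes the threshold.
    in_majority = sum(x for x, t in zip(n_x, n_total)
                      if t > 0 and x / t > threshold)
    return 100 * in_majority / total_x


# Hypothetical counts for three schools of 100 pupils each (illustrative only).
n_x = [90, 40, 10]
n_total = [100, 100, 100]
pct = percent_in_majority_schools(n_x, n_total)
print(round(pct, 1))
```

Only the first school is majority \(X\) here, so 90 of the 140 pupils in the group, about 64.3 per cent, are in a school where their group forms the majority.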