If we could precisely geolocate every individual in British Columbia, it would be relatively straightforward to classify each as living in either an urban or a rural setting.
Statistics Canada defines an urban area as one with a population of at least 1,000 and a density of 400 or more people per square kilometre; all other areas are rural.
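As a minimal sketch, the two thresholds quoted above can be written as a single rule. The function name `is_urban` and its arguments are hypothetical, chosen only for illustration; the sketch assumes we already know the population and land area of the area being classified.

```python
def is_urban(population: int, land_area_km2: float) -> bool:
    """Apply the two thresholds quoted above: a population of at least
    1,000 and a density of at least 400 people per square kilometre."""
    if land_area_km2 <= 0:
        raise ValueError("land_area_km2 must be positive")
    density = population / land_area_km2
    return population >= 1000 and density >= 400


# Hypothetical areas, not real data.
print(is_urban(5000, 10.0))   # 500 people per km^2
print(is_urban(5000, 100.0))  # 50 people per km^2
print(is_urban(800, 1.0))     # dense, but fewer than 1,000 people
```

Both conditions must hold, so a small but dense settlement and a populous but sparse one are both treated as rural under this rule.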
But what if we do not have precise location data for individuals? What if all we have is the first three characters of their postal code, the Forward Sortation Area (FSA)?
Canada Post is the operational authority that defines and uses FSAs for mail sorting and delivery.
FSA boundaries are determined by:
Locations of mail processing plants and delivery depots.
Major transportation routes (highways, ferries, flight routes to the North).
Growth and demand: if a region’s addresses expand rapidly, Canada Post may split or reassign FSAs to keep sorting manageable.
Canada Post aims for FSAs to represent roughly comparable units of mail-processing workload, not comparable numbers of people or addresses.
Canada Post categorizes FSAs as either rural (second character is 0) or urban (second character is not 0), but this is an operational categorization, not one intended for analysis.
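The operational rule above is simple enough to sketch directly. The function name `canada_post_class` is hypothetical, and the sketch assumes each FSA is supplied as a three-character string in letter-digit-letter form.

```python
def canada_post_class(fsa: str) -> str:
    """Return Canada Post's operational category for an FSA:
    'rural' if the second character is 0, otherwise 'urban'."""
    fsa = fsa.strip().upper()
    well_formed = (
        len(fsa) == 3
        and fsa[0].isalpha()
        and fsa[1].isdigit()
        and fsa[2].isalpha()
    )
    if not well_formed:
        raise ValueError(f"not a letter-digit-letter FSA: {fsa!r}")
    return "rural" if fsa[1] == "0" else "urban"


# Illustrative FSA codes.
print(canada_post_class("V0N"))
print(canada_post_class("V6B"))
```

Note that the rule looks only at the code itself and uses no information about population or land area.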
Note that the largest rural FSA is 278 times larger than the smallest rural FSA (plot to right), which makes it difficult to compare across FSAs that are mostly empty.
For example, consider two identical towns located in two different FSAs that are otherwise uninhabited.
If the size of the FSA “container” differs by a factor of 278, then so do their population densities, even though the “contents” are identical.
Thus, we need a way to characterize how urban or rural the areas that are actually inhabited are.
We use night-time light pollution data, aka “All Angle Composite Snow Free”, sourced from NASA (2018).
Doing so allows us to calculate:
Proportion Lit: the proportion of LANDAREA lit at night, aka inhabited.
Inhabited Area = Proportion Lit \(\times\) LANDAREA
Effective Density = \(\frac{\mbox{Population}}{\mbox{Inhabited Area}}\)
Conditional Median = median(All Angle Composite Snow Free, na.rm=TRUE)
Going back to our previous example, our two differently sized FSAs that share identical “contents” would have identical values for Inhabited Area, Effective Density, and Conditional Median, even though their Proportion Lit values differ by a factor of 278.
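The measures defined above can be sketched for the two-town example as follows. This is a rough illustration under stated assumptions, not the analysis code: the function name `fsa_measures` is hypothetical, each pixel is assumed to cover an equal share of the FSA's land area, and unlit pixels are assumed to be stored as NaN (mirroring the `na.rm=TRUE` in the definition) or as non-positive values.

```python
import math
from statistics import median


def fsa_measures(population, land_area_km2, pixel_radiance):
    """Compute the four measures for one FSA from its night-light pixels.

    Assumption: a pixel counts as lit if its value is not NaN and is
    greater than 0; every pixel covers an equal share of the land area.
    """
    if not pixel_radiance:
        raise ValueError("pixel_radiance must not be empty")
    lit = [v for v in pixel_radiance if not math.isnan(v) and v > 0]
    proportion_lit = len(lit) / len(pixel_radiance)
    inhabited_area = proportion_lit * land_area_km2
    return {
        "proportion_lit": proportion_lit,
        "inhabited_area": inhabited_area,
        # Undefined when nothing in the FSA is lit.
        "effective_density": population / inhabited_area if lit else math.nan,
        "conditional_median": median(lit) if lit else math.nan,
    }


# Two identical hypothetical towns (2,000 people, four lit 1 km^2 pixels)
# placed in FSAs whose land areas differ by a factor of 278.
town = [10.0, 20.0, 30.0, 40.0]
small = fsa_measures(2000, 10.0, town + [math.nan] * 6)
large = fsa_measures(2000, 2780.0, town + [math.nan] * 2776)

print("raw density:", 2000 / 10.0, "vs", 2000 / 2780.0)
print("effective density:", small["effective_density"], "vs", large["effective_density"])
print("conditional median:", small["conditional_median"], "vs", large["conditional_median"])
```

In this made-up example the raw densities differ by a factor of 278, while the effective density and conditional median of the two FSAs agree, because both depend only on the lit pixels.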
We then model Canada Post’s classification in terms of these measures:
\[\mbox{Class} \sim \mbox{Proportion Lit} + \mbox{Conditional Median} + \mbox{Effective Density} + \epsilon\]
where Class is Canada Post’s urban/rural classification of an FSA.
Canada Post’s classification is based mainly on how empty the FSAs are (as measured by Proportion Lit), and ignores how tightly packed people are in inhabited areas (as measured by Effective Density).
This calls into question whether Canada Post’s classification is useful for analysis.
Next we create an alternative classification based on our measures of development:
Effective Density is an FSA’s population divided by its lit area.
Conditional Median is the FSA’s median light intensity, ignoring all unlit areas.