Online Companion to Occupational Mobility Is Not One Thing: Evidence from the Canadian Labour Market

Author

Richard Martin

Overview

This companion is designed to ensure transparency and reproducibility for the paper Occupational Mobility Is Not One Thing: Evidence from the Canadian Labour Market. It provides technical detail on distance metrics, normalization, and specificity measures.

Contents

Skill distance: O*NET-based occupational skill vectors, dimensionality reduction via PCA, and the resulting continuous distance matrix.
Hierarchical distance: Taxonomy-based discrete distances derived from the NOC structure, resulting in labour market silos with internal TEER platforms.
Cost matrix normalization: Percentile-based anchoring across distance metrics.
Occupation’s education specificity: A size-adjusted measure of concentration in educational inflows into occupations.
Education’s occupation specificity: A size-adjusted measure of concentration in occupational outcomes across fields of study and attainment levels.

Skill Distance

We use 161 occupational characteristics (Skills, Abilities, Knowledge, Work Activities) from O*NET to measure the distance between occupations. Because occupational skill vectors are high-dimensional and highly correlated, Euclidean distances computed in the full feature space may suffer from distance concentration. We therefore project the data onto the 12 leading principal components. Pairwise distances in this 12-dimensional space have a Spearman rank correlation of at least 0.99 with pairwise distances in the full 161-dimensional space. The projection reduces noise from low-variance directions while preserving the dominant geometry of the skill space.

PCA Rank Preservation

The figure below shows the Spearman rank correlation between pairwise distances in the \(k\)-dimensional PCA space and the full 161-dimensional space, as a function of \(k\). The dashed line marks the 0.99 threshold; 12 components suffice.

Code

ggplot(k_vs_spearman, aes(x = k, y = spearman)) +
  geom_hline(yintercept = .99, lty = 2) +
  geom_line() +
  geom_point() +
  scale_x_continuous(limits = c(0, 50)) +
  labs(
    title = "First 12 PCs achieve ≥0.99 rank preservation of pairwise distances.",
    y = "Spearman correlation (distance ranks)",
    x = "Number of PCs"
  ) +
  theme(text = element_text(size = 10))
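The rank-preservation curve plotted above could be computed along the following lines. This is a minimal sketch on simulated data, not the paper's pipeline: the toy matrix `X`, the helper `rank_preservation()`, the table name `k_vs_spearman_toy`, and the choice to standardize features before PCA are all assumptions made for illustration.

```r
# Simulated stand-in for the occupation-by-characteristic matrix (assumption):
# 60 "occupations", 20 correlated "characteristics" driven by 3 latent factors.
set.seed(42)
n_occ <- 60
n_feat <- 20
n_latent <- 3
latent <- matrix(rnorm(n_occ * n_latent), n_occ, n_latent)
loadings <- matrix(rnorm(n_latent * n_feat), n_latent, n_feat)
X <- latent %*% loadings + matrix(rnorm(n_occ * n_feat, sd = 0.1), n_occ, n_feat)

# PCA on standardized features (standardizing is an assumption of this sketch)
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Pairwise Euclidean distances in the full standardized feature space
d_full <- as.vector(dist(scale(X)))

# Spearman correlation between distances in the first k PCs and the full space
rank_preservation <- function(k) {
  d_k <- as.vector(dist(pca$x[, seq_len(k), drop = FALSE]))
  cor(d_k, d_full, method = "spearman")
}

ks <- seq_len(n_feat)
k_vs_spearman_toy <- data.frame(
  k = ks,
  spearman = vapply(ks, rank_preservation, numeric(1))
)

# Smallest number of components reaching the 0.99 threshold
k_min <- min(k_vs_spearman_toy$k[k_vs_spearman_toy$spearman >= 0.99])
```

Because a full-rank rotation preserves Euclidean distances, the curve reaches 1 when all components are retained, so `k_min` is always defined.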

Skill Distance Heatmap

Occupations are ordered by hierarchical clustering (average linkage). Diagonal entries are self-distances (zero). Dark clusters along the diagonal indicate groups of occupations with similar skill profiles. Drag with the left mouse button to zoom.

Code

plotly::plot_ly(
  x = skills_xlab,
  y = skills_ylab,
  z = skills_mat_ord,
  type = "heatmap",
  hovertemplate = paste(
    "<b>Origin:</b> %{y}<br>",
    "<b>Destination:</b> %{x}<br>",
    "<b>Distance:</b> %{z:.3f}",
    "<extra></extra>"
  )
) |>
  plotly::layout(
    xaxis = list(type = "category", showticklabels = FALSE, ticks = ""),
    yaxis = list(type = "category", showticklabels = FALSE, ticks = "")
  )

Hierarchical Distance

Hierarchical distances encode institutional and career barriers implied by the NOC taxonomy. Distance between any two occupations is defined as follows:

All five digits match: distance = 0
First four digits match: distance = 1
First three digits match: distance = 2
First digit matches: distance = 3 + |ΔTEER|
Otherwise: distance = 9

These distances are normalized by dividing by 4, as described in Cost Matrix Normalization. The Spearman correlation between skill and hierarchical distances is 0.289, indicating that institutional proximity and skill similarity are positively but only modestly related. The two metrics encode overlapping yet distinct occupational geometries.
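The distance rules listed above, and the rank comparison with skill distance, can be sketched as follows. This is an illustration under stated assumptions rather than the paper's code: it assumes 5-digit NOC 2021 codes stored as character strings with the TEER category in the second digit, and the function `hier_distance()`, the example codes, and the random `skill_toy` matrix are hypothetical.

```r
# Hierarchical distance between two 5-digit NOC codes (character strings).
# Returns the raw distance, before any normalization.
hier_distance <- function(a, b) {
  stopifnot(nchar(a) == 5, nchar(b) == 5)
  if (a == b) return(0)                                  # all five digits match
  if (substr(a, 1, 4) == substr(b, 1, 4)) return(1)      # first four digits match
  if (substr(a, 1, 3) == substr(b, 1, 3)) return(2)      # first three digits match
  if (substr(a, 1, 1) == substr(b, 1, 1)) {              # same broad category
    teer_gap <- abs(as.integer(substr(a, 2, 2)) - as.integer(substr(b, 2, 2)))
    return(3 + teer_gap)
  }
  9                                                      # different broad category
}

# Illustrative codes only (assumption); any 5-digit codes would do
codes <- c("21231", "21232", "21211", "22220", "31102")
hier_toy <- outer(codes, codes, Vectorize(hier_distance))
dimnames(hier_toy) <- list(codes, codes)

# Toy skill distances: random 2-D points stand in for PCA scores (assumption)
set.seed(1)
scores <- matrix(rnorm(length(codes) * 2), ncol = 2,
                 dimnames = list(codes, NULL))
skill_toy <- as.matrix(dist(scores))

# Spearman correlation over unique occupation pairs (upper triangle only)
upper <- upper.tri(hier_toy)
rho <- cor(skill_toy[upper], hier_toy[upper], method = "spearman")
```

Restricting the correlation to the upper triangle avoids double-counting symmetric pairs and excludes the zero diagonal.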

Distance Counts

Code

hier_count_plot <- table(hier) |>
  enframe() |>
  ggplot(aes(name, value)) +
  geom_col(alpha = .5) +
  scale_y_continuous(trans = "log10", labels = scales::comma) +
  theme_minimal() +
  labs(x = "Normalized Distance", y = "Counts")

plotly::ggplotly(hier_count_plot)

Hierarchical Distance Heatmap

Occupations are ordered by hierarchical clustering (average linkage). The block-diagonal structure reflects the ten NOC broad occupational categories (silos); yellow off-diagonal entries are cross-silo pairs assigned the sentinel distance of 9. Drag with the left mouse button to zoom.

Code

hier_hc      <- hclust(as.dist(hier), method = "average")
hier_ord     <- hier_hc$order
hier_mat_ord <- hier[hier_ord, hier_ord]
hier_xlab    <- as.character(colnames(hier_mat_ord))
hier_ylab    <- as.character(rownames(hier_mat_ord))

plotly::plot_ly(
  x = hier_xlab,
  y = hier_ylab,
  z = hier_mat_ord,
  type = "heatmap",
  hovertemplate = paste(
    "<b>Origin:</b> %{y}<br>",
    "<b>Destination:</b> %{x}<br>",
    "<b>Distance:</b> %{z:.3f}",
    "<extra></extra>"
  )
) |>
  plotly::layout(
    xaxis = list(type = "category", showticklabels = FALSE, ticks = ""),
    yaxis = list(type = "category", showticklabels = FALSE, ticks = "")
  )

Cost Matrix Normalization

The hierarchical distance contains a large mass of maximally distant pairs (distance = 9), representing transitions the taxonomy treats as categorically distant. Because these pairs provide little information about substitution intensity among plausible transitions, we define the informative region as the set of non-maximal, non-zero hierarchical distances.

Within this region, we compute the 25th, 50th, and 75th percentiles of hierarchical cost. These conditional quantiles represent increasingly broad but still economically meaningful transition margins. Each anchor value is then mapped to its unconditional percentile in the full hierarchical distribution. The skill distance matrix is calibrated by selecting cost values at these same unconditional percentiles. This procedure ensures that calibration aligns comparable substitution margins across metrics, rather than matching arbitrary numerical magnitudes.

Percentile-based anchoring is invariant to monotonic transformations of the cost scale and therefore preserves the rank ordering of transition costs in each metric.

Code

calibration |>
  gt() |>
  fmt_number(columns = c(h_anchor, p_uncond, s_anchor), decimals = 3) |>
  cols_label(
    q_cond   = "Conditional Quantile",
    h_anchor = "Hierarchical Anchor",
    p_uncond = "Unconditional Percentile",
    s_anchor = "Skill Anchor"
  ) |>
  tab_options(table.font.size = "90%", data_row.padding = px(4)) |>
  tab_header(title = md("**Calibration Anchors Across Distance Metrics**"))

Calibration Anchors Across Distance Metrics
Conditional Quantile	Hierarchical Anchor	Unconditional Percentile	Skill Anchor
0.25	3.000	0.040	7.145
0.50	4.000	0.081	8.590
0.75	5.000	0.105	9.253

The median non-maximal hierarchical transition lies at approximately the 8th percentile of the full hierarchical distribution; the skill anchor is defined at this same percentile to ensure comparable substitution intensity.
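The anchoring procedure described in this section can be sketched as below. This is a sketch under assumptions, not the authors' implementation: the function `calibrate_anchors()` and the toy vectors are hypothetical, the conditional quantiles use the inverse-ECDF definition (`type = 1`) because hierarchical distances are discrete, and the unconditional percentile is taken as the share of pairs at or below the anchor.

```r
# h: vector of hierarchical distances over occupation pairs
# s: vector of skill distances over occupation pairs
calibrate_anchors <- function(h, s, probs = c(0.25, 0.50, 0.75)) {
  # Informative region: non-zero, non-maximal hierarchical distances
  informative <- h[h > 0 & h < max(h)]

  # Conditional quantiles within the informative region
  h_anchor <- unname(quantile(informative, probs = probs, type = 1))

  # Position of each anchor in the full hierarchical distribution
  p_uncond <- ecdf(h)(h_anchor)

  # Skill distances at those same unconditional percentiles
  s_anchor <- unname(quantile(s, probs = p_uncond))

  data.frame(q_cond = probs, h_anchor, p_uncond, s_anchor)
}

# Toy inputs (assumption): most pairs sit at the maximal distance of 9
h_toy <- c(1, 2, 3, 3, 4, 5, rep(9, 14))
s_toy <- 1:20
calibration_toy <- calibrate_anchors(h_toy, s_toy)
```

The output columns are named after those in the calibration table above so the toy result can be read the same way.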

Occupation’s Education Specificity

Construction

Step 1 — Overall Educational Distribution. We compute the overall educational distribution across all NOCs. This represents the expected education mix for a worker drawn from the educated workforce without conditioning on occupation.

Step 2 — Distribution Within Each NOC. For each NOC we compute total workers \(T\), occupation-specific education shares \(p\), and compare to the baseline distribution \(p_0\) using KL divergence.

Step 3 — Remove Size Effect. Finite samples mechanically inflate divergence for small occupations. We remove this artifact by residualizing \(\log(\text{KL})\) on \(\log(T)\):

Code

fit_kl <- lm(log(KL) ~ log(T), data = noc_specificity)

noc_specificity <- noc_specificity |>
  mutate(
    specificity = log(KL) - predict(fit_kl),
    TEER = str_sub(noc, 2, 2)
  )

The resulting specificity index measures how distinct an occupation’s education mix is relative to what is statistically expected given its size. KL divergence captures where probability mass is concentrated relative to the aggregate baseline, rather than how much mass lies in very small cells. Residualizing removes predictable finite-sample divergence, isolating economically meaningful concentration. The index is orthogonal to log occupation size by construction, ensuring that specificity reflects structural concentration in educational pathways rather than equilibrium scale effects.
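Steps 1 to 3 can be illustrated end to end on toy counts. The sketch below is hypothetical: the count matrix, the occupation labels, and the helper `kl_divergence()` are invented for illustration, and the divergence is assumed to be taken from the occupation-specific shares \(p\) to the baseline \(p_0\), with empty cells contributing zero.

```r
# Toy counts (assumption): rows are occupations, columns are education groups
counts <- matrix(
  c( 90,   5,   5,
     30,  40,  30,
     10,  10,  80,
    200, 250, 150),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("A", "B", "C", "D"), c("edu1", "edu2", "edu3"))
)

# KL(p || p0), treating 0 * log(0) as 0
kl_divergence <- function(p, p0) {
  keep <- p > 0
  sum(p[keep] * log(p[keep] / p0[keep]))
}

# Step 1: overall educational distribution across all occupations
p0 <- colSums(counts) / sum(counts)

# Step 2: occupation size, within-occupation shares, divergence from baseline
T_occ <- rowSums(counts)
shares <- counts / T_occ
KL <- apply(shares, 1, kl_divergence, p0 = p0)

# Step 3: residualize log(KL) on log(T)
toy <- data.frame(noc = rownames(counts), T = T_occ, KL = KL)
fit_toy <- lm(log(KL) ~ log(T), data = toy)
toy$specificity <- residuals(fit_toy)

# OLS residuals are uncorrelated with the regressor in-sample
size_cor <- cor(toy$specificity, log(toy$T))
```

In-sample, `residuals(fit_toy)` equals `log(KL) - predict(fit_toy)`, so the last step matches the form used in the code above.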

Specificity vs. Occupation Size

High specificity indicates that few educational pathways lead into the occupation. Residualized specificity is essentially uncorrelated with log occupation size (Pearson = -0.03, Spearman = 0.03).

Code

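# Convert the static ggplot into an interactive plotly widget
p <- ggplotly(dest_plt)
# Reverse the trace order, which flips the drawing and legend order
p$x$data <- rev(p$x$data)
p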

Occupation Specificity Rankings

Code

noc_specificity |>
  select(noc, specificity, sub_regime) |>
  arrange(desc(specificity)) |>
  my_dt()

Education’s Occupation Specificity

Construction

Education’s occupation specificity is constructed symmetrically to occupation’s education specificity, conditioning on education rather than occupation.

Step 1 — Overall Occupational Distribution. We compute the aggregate occupational distribution across all educated workers, serving as the baseline absent conditioning on field of study.

Step 2 — Distribution Within Each Education. For each (CIP × attainment) cell we compute total graduates \(T\), occupation shares \(p\), and compare to the aggregate occupational distribution \(p_0\) using KL divergence.

Step 3 — Remove Size Effect. As with occupations, small cohorts mechanically inflate divergence. We residualize \(\log(\text{KL})\) on \(\log(T)\):

Code

fit_kl <- lm(log(KL) ~ log(T), data = educ_specificity)

educ_specificity <- educ_specificity |>
  mutate(specificity = log(KL) - predict(fit_kl))

The resulting index measures how narrowly a field of study feeds into occupations, independent of cohort size. Professional and doctoral fields exhibit consistently high occupational concentration; at lower attainment levels, specificity varies substantially within each level, confirming that educational level alone does not determine pipeline concentration.
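In symbols, the construction above can be summarized as follows. This is a notational sketch: the direction of the divergence (from \(p\) to \(p_0\)), the occupation index \(j\), and the coefficient names \(\hat{\alpha}\) and \(\hat{\beta}\) are assumptions not stated in the text.

```latex
\[
\mathrm{KL}(p \,\|\, p_0) = \sum_{j} p_j \log \frac{p_j}{p_{0,j}},
\qquad
\text{specificity} = \log \mathrm{KL}(p \,\|\, p_0)
  - \bigl(\hat{\alpha} + \hat{\beta} \log T\bigr)
\]
```

Here \(p_j\) is the share of a (CIP × attainment) cell's graduates working in occupation \(j\), \(p_{0,j}\) is the corresponding aggregate share, and \(\hat{\alpha}\), \(\hat{\beta}\) are the OLS coefficients from regressing \(\log(\text{KL})\) on \(\log(T)\) across cells.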

Specificity vs. Cohort Size

High specificity indicates that few occupational pathways lead out of the education cell. Residualized specificity is essentially uncorrelated with log cohort size (Pearson = 0.07, Spearman = -0.01).

Code

spec_plotly <- plotly::ggplotly(spec_plt)

# Tidy the legend names generated by ggplotly: drop the ",1" suffix and
# the enclosing parentheses
for (i in seq_along(spec_plotly$x$data)) {
  spec_plotly$x$data[[i]]$name <- gsub(",1", "", spec_plotly$x$data[[i]]$name)
  spec_plotly$x$data[[i]]$name <- gsub("^\\(|\\)$", "", spec_plotly$x$data[[i]]$name)
}

spec_plotly |>
  plotly::layout(
    legend = list(orientation = "h", x = 0.5, xanchor = "center", y = -0.2)
  )

Education Specificity Rankings

Code

educ_specificity |>
  select(highest, cip, specificity) |>
  arrange(desc(specificity)) |>
  my_dt()