Where Skills vs Institutions Shape Occupational Mobility

Methodology and Simulation Evidence



Richard Martin
Ministry of Post-Secondary Education and Future Skills



Disclaimer: The views expressed are those of the author and do not necessarily reflect those of the Government of British Columbia.

Labour markets continually reallocate workers

  • Demand for occupations shifts due to technological change, policy, and shocks
  • Entry and exit absorb part of the adjustment, but some workers must move across occupations
  • Some occupational transitions are harder than others

How does occupational distance shape mobility?

\[ P_{ij} \downarrow \text{ as } C_{ij} \uparrow \]

where \(P_{ij}\) is the probability of moving from occupation \(i\) to \(j\), and \(C_{ij}\) represents the mobility cost derived from occupational distance

  • We therefore ask:

Which notion of occupational distance best rationalizes local mobility?

What Defines Occupational Distance?

  • Mobility costs are latent → we evaluate alternative specifications

Three candidates:

  1. Skill similarity
    • Distance based on O*NET skills, abilities, knowledge, work activities mapped to NOC 2021.
    • Why not OaSIS?
      • The ability to falsify the model depends on observing where measurement error enters the data.
      • OaSIS hides that structure (black box).
  2. Institutional hierarchy
    • Distance based on NOC (Canada’s occupational classification)
  3. Binary switching cost
    • (degenerate distance; stay vs move only)

Comparing distance metrics

  • Model mobility using entropy-regularized optimal transport (EROT)

  • Allows distance metrics to be compared in a clean horse race

  • We compare cost structures, holding mobility rates fixed


Data: Census 2016–2021 Longitudinal

  1. Core sample: Observations with reliable measurement

  2. Robustness / Falsification samples (measurement challenges)

    • 2016 Occupation mismeasurement
      • Self-employed
      • TEER–highest attainment mismatch
    • Mapping misspecification
      • Bottom 10% by (Herfindahl × similarity)
    • Joint mismeasurement
      • Both sources of noise

Mobility regimes

Local mobility (EROT)

  • No change in highest educational attainment
  • Distance-governed transitions

Non-local mobility

  • Increase in highest educational attainment
  • Education-enabled jumps

Local Mobility

  • Flows increase with size and decline with mobility cost

\[ P_{ij} = \exp(\alpha_i + \beta_j - \gamma C_{ij}) \]

  • Our objective: isolate the role of the cost structure

Gravity is the standard empirical framework

  • Anderson and van Wincoop (2003): fixed effects (multilateral resistance)

  • Santos Silva and Tenreyro (2006): PPML (estimation)

  • Fit alone does not validate the cost matrix

  • Fixed effects can induce fit even with the wrong cost matrix

Entropy-Regularized Optimal Transport (EROT)

We impose observed marginals:

\[ \sum_j P_{ij} = a_i, \qquad \sum_i P_{ij} = b_j \]

→ Flows are determined by costs conditional on these constraints

We compare cost structures given fixed marginals

Mobility determined by cost–temperature ratio

\[ \frac{C_{ij}}{\varepsilon} \]

  • \(\varepsilon\): scale of idiosyncratic utility shocks (logit/RUM interpretation)
  • Low \(\varepsilon\) → cost-minimizing reallocation
  • High \(\varepsilon\) → diffuse mobility (approaches independence: \(P_{ij} = a_i b_j\))
  • Only relative cost differences (scaled by \(\varepsilon\)) shape mobility
  • Results evaluated across a range of \(\varepsilon\) (no single scale assumed)

Skill distance: Based on 161 O*NET measures

Hierarchical distance: derived from NOC taxonomy

  1. All five digits match:
    • distance = 0
  2. First four digits match:
    • distance = 1
  3. First three digits match:
    • distance = 2
  4. First digit matches:
    • distance = 3 + |ΔTEER|
  5. Otherwise:
    • distance = 9
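
The five rules above can be written as a short function. This is a minimal sketch, not the author's code: the names `teer` and `hier_distance` are hypothetical, codes are assumed to be five-character strings, and TEER is read from the second digit, which is how NOC 2021 encodes it.

```python
def teer(code: str) -> int:
    """TEER category, read from the second digit of a five-digit NOC 2021 code."""
    return int(code[1])


def hier_distance(code_i: str, code_j: str) -> int:
    """Hierarchical distance between two five-digit NOC 2021 codes (illustrative)."""
    if code_i == code_j:              # all five digits match
        return 0
    if code_i[:4] == code_j[:4]:      # first four digits match
        return 1
    if code_i[:3] == code_j[:3]:      # first three digits match
        return 2
    if code_i[0] == code_j[0]:        # first digit matches
        return 3 + abs(teer(code_i) - teer(code_j))
    return 9                          # no shared first digit


print(hier_distance("21231", "21232"))  # → 1 (first four digits match)
print(hier_distance("21231", "22310"))  # → 4 (same first digit, TEER 1 vs 2)
```

Because TEER runs from 0 to 5, the fourth rule yields at most 8, so every pair sharing a first digit stays closer than a pair that shares none.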

Empirical CDFs

The Labour Market as a Whole

We simulate data from a mixture cost structure:

\[ C^{\text{true}} = (1 - w)\,C_{\text{skill}} + w\,C_{\text{hier}} \]

We then estimate using:

  • \(C_{\text{skill}}\)
  • \(C_{\text{hier}}\)
  • \(C_{\text{binary}}\)

All models are misspecified (via mixture)
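
A toy version of this simulation design, as a sketch under stated assumptions: the three cost matrices below are small hypothetical stand-ins (random points for skill, a three-group taxonomy for hierarchy), the marginals are random, and the solver is a plain Sinkhorn rescaling rather than the log-domain solver used in the actual work. Data are generated at \(\varepsilon = 1\) from the mixture, then each candidate cost is fitted over a small grid of \(\varepsilon\).

```python
import numpy as np


def sinkhorn(C, a, b, eps, n_iter=5000):
    """Rescale the kernel exp(-C/eps) until its margins equal a and b."""
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)   # fix destination totals
        u = a / (K @ v)     # fix origin totals
    return u[:, None] * K * v[None, :]


def kl(P, Q):
    """KL(P || Q) for strictly positive matrices."""
    return float(np.sum(P * np.log(P / Q)))


rng = np.random.default_rng(0)
n = 6

# Hypothetical stand-ins for the three candidate cost matrices.
xy = rng.random((n, 2))
C_skill = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
group = np.array([0, 0, 1, 1, 2, 2])
C_hier = np.where(group[:, None] == group[None, :], 1.0, 3.0)
np.fill_diagonal(C_hier, 0.0)
C_binary = 1.0 - np.eye(n)

# Observed marginals (random here) and the true mixture cost.
a = rng.dirichlet(np.ones(n))
b = rng.dirichlet(np.ones(n))
w = 0.5
C_true = (1 - w) * C_skill + w * C_hier
P_true = sinkhorn(C_true, a, b, eps=1.0)

# Horse race: best KL(P_true || P_hat) over the eps grid for each candidate.
eps_grid = [0.5, 1.0, 2.0]
candidates = {"skill": C_skill, "hier": C_hier, "binary": C_binary}
best_kl = {
    name: min(kl(P_true, sinkhorn(C, a, b, eps)) for eps in eps_grid)
    for name, C in candidates.items()
}
print(best_kl)
```

Every candidate reproduces the marginals by construction, so any difference in KL comes from the cost structure alone.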

Evaluating the Horse Race

  • KL divergence: how poorly the model predicts outcomes that occur.
  • We measure fit using relative KL improvement (vs. independence)

\[ \frac{KL(P \parallel P^{\text{ind}}) - KL(P \parallel \hat P)}{KL(P \parallel P^{\text{ind}})} \]

  • 0 → no improvement over independence
  • 1 → perfect fit
  • < 0 → worse than independence (typically over-confidence at low temperature)

  • As \(\varepsilon \rightarrow 0\) the solution becomes too sharp: “false negatives” (observed moves to which the model assigns near-zero probability) are heavily penalized.
  • Best fit occurs at a temperature above its true value (\(\varepsilon = 1\)).
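
The fit measure can be sketched in a few lines. The function names and the 2×2 flow matrices are hypothetical, chosen only to show the three reference cases (1, 0, and below 0).

```python
import numpy as np


def kl(P, Q):
    """KL(P || Q), summing only over cells where P > 0."""
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / Q[mask])))


def relative_kl_improvement(P, P_hat):
    """Share of the independence model's KL that P_hat removes."""
    P_ind = np.outer(P.sum(axis=1), P.sum(axis=0))  # independence benchmark
    base = kl(P, P_ind)
    return (base - kl(P, P_hat)) / base


# Toy observed flows and three candidate predictions (illustrative numbers).
P = np.array([[0.3, 0.2], [0.2, 0.3]])
P_ind = np.outer(P.sum(axis=1), P.sum(axis=0))
P_sharp = np.array([[0.499, 0.001], [0.001, 0.499]])  # over-confident model

print(relative_kl_improvement(P, P))        # perfect fit
print(relative_kl_improvement(P, P_ind))    # no better than independence
print(relative_kl_improvement(P, P_sharp))  # negative: too sharp
```

The over-confident prediction scores far below zero because it puts almost no mass on off-diagonal moves that do occur.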

Specificity Tertiles

Some occupations draw from very specific educational paths.

  • Education concentration: \[KL(p \parallel p_0)= \sum_{k} p_k \, \log\!\left(\frac{p_k}{p_{0k}}\right)\] where \(p_0\) is the overall workforce education distribution

  • KL measures average log-surprise (relative to the workforce baseline)

  • We remove the mechanical relationship with occupation size \(T\): \[\text{specificity}= \log(KL) - \widehat{E}[\log(KL)\mid \log(T)]\]
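
A sketch of the specificity calculation on made-up counts. Two assumptions are not in the slides: \(T\) is taken to be the occupation's headcount, and \(\widehat{E}[\log(KL)\mid \log(T)]\) is estimated with a simple linear fit (the actual estimator may differ). All names and numbers are hypothetical.

```python
import numpy as np


def kl_divergence(p, p0):
    """KL(p || p0) = sum_k p_k log(p_k / p0_k), skipping cells with p_k = 0."""
    p = np.asarray(p, dtype=float)
    p0 = np.asarray(p0, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / p0[mask])))


def specificity(kl_values, sizes):
    """Residual of log(KL) after a linear fit on log(T) (assumed estimator)."""
    log_kl = np.log(kl_values)
    log_t = np.log(sizes)
    slope, intercept = np.polyfit(log_t, log_kl, 1)
    return log_kl - (intercept + slope * log_t)


# Rows: occupations; columns: education categories (toy counts).
counts = np.array([
    [90, 5, 5],
    [30, 40, 30],
    [10, 20, 170],
    [400, 350, 250],
], dtype=float)

T = counts.sum(axis=1)                      # occupation size
p0 = counts.sum(axis=0) / counts.sum()      # workforce education distribution
kl_values = np.array([kl_divergence(row / row.sum(), p0) for row in counts])

print(specificity(kl_values, T))
```

Positive residuals mark occupations whose education mix is more concentrated than their size alone would predict.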

Local mobility regimes

| Mobility | Description | Cost matrix |
|---|---|---|
| Limited | Top tertile & TEERs 1, 2 | Binary |
| Vertical | Middle tertile + (top tertile & TEERs 3, 4) | Hierarchical |
| Horizontal | Bottom tertile + all TEERs 0, 5 | Skill |

Hypothesized market segments

Mobility varies by segment

Non-local model

Distance ~ specificity measures + origin fixed effects

  • Specificity measures (CIP–NOC KL):
    • Education specificity → CIP exit concentration (wormhole entrance)
    • Destination gating → NOC entry concentration (wormhole exit)
  • Hypotheses
    • New attainment: distance ↑ with specificity
    • No new attainment: no systematic relationship

Priors–Not Conclusions

  • Mobility is heterogeneous (local vs non-local)

    • Local mobility: also heterogeneous!

      • limited movement → binary
      • horizontal movement → skill
      • vertical movement → hierarchy
    • Non-local mobility

      • Distance travelled ~ education specificity and destination gating

Policy implication

  • Labour shortages in different segments likely require different adjustment mechanisms

    • Horizontal: reskilling and short-cycle training can accelerate adjustment
    • Vertical: direct workers towards “first rung” of career ladders
    • Limited: adjustment depends on expanding formal training pipelines

Thank you!

Questions?

Co-op term in the Ministry of Post-Secondary Education and Future Skills:

contact Nicole.Bruce@gov.bc.ca

Data: Census 2016–2021 Longitudinal File

  • Both Censuses linked via the central Derived Record Depository (DRD)
  • Probabilistic linkage (name, sex, DOB, phone, postal code)
  • High linkage quality, but not perfect
  • \(\therefore\) population estimates would be biased, but population estimation is not our objective.

Which notion of occupational distance best rationalizes local mobility?

  • Robust to selection unless sample favors one distance metric

Entropy-regularized optimal transport

\[ \mathcal{L} = \underbrace{\sum_{ij} P_{ij} C_{ij}}_{\text{mass}\times\text{distance}} + \varepsilon\times \underbrace{\sum_{ij} P_{ij}(\log P_{ij}-1)}_{\text{negative entropy}} + \sum_i f_i \underbrace{\left(a_i - \sum_j P_{ij}\right)}_{\text{origin constraint}} + \sum_j g_j \underbrace{\left(b_j - \sum_i P_{ij}\right)}_{\text{destination constraint}} \]

First-order conditions:

\[ \frac{\partial \mathcal{L}}{\partial P_{ij}} = C_{ij} + \varepsilon \log P_{ij} - f_i - g_j = 0 \]

Rearranging:

\[ \log P_{ij} = \frac{f_i}{\varepsilon} + \frac{g_j}{\varepsilon} - \frac{C_{ij}}{\varepsilon} \]

Structure of the Solution

Exponentiating:

\[ P_{ij} = u_i\, e^{-C_{ij}/\varepsilon}\, v_j, \qquad u_i = e^{f_i/\varepsilon}, \quad v_j = e^{g_j/\varepsilon} \]

  • Thus the optimal solution must be a row- and column-rescaling of the kernel \(e^{-C_{ij}/\varepsilon}\)
  • Given \(v\), there is a unique \(u\) that fixes the rows
  • Given \(u\), there is a unique \(v\) that fixes the columns
  • Sinkhorn alternates these corrections until both margins are satisfied
  • Implemented using a benchmark-tested log-domain Sinkhorn solver that ensures numerical stability and dimensional alignment
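
For illustration only, a minimal log-domain Sinkhorn in NumPy. This is not the benchmark-tested solver referred to above; the function names, the iteration cap, and the tolerance are assumptions. It updates the dual potentials \(f\) and \(g\) directly, so that \(\log P_{ij} = (f_i + g_j - C_{ij})/\varepsilon\).

```python
import numpy as np


def logsumexp(M, axis):
    """Numerically stable log(sum(exp(M))) along one axis."""
    m = M.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(M - m).sum(axis=axis, keepdims=True))).squeeze(axis)


def sinkhorn_log(C, a, b, eps, n_iter=1000, tol=1e-10):
    """Alternate row and column corrections on the potentials f and g."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    log_a, log_b = np.log(a), np.log(b)
    f = np.zeros_like(a)
    g = np.zeros_like(b)
    for _ in range(n_iter):
        # Given g, the unique f that makes every row sum to a_i.
        f = eps * (log_a - logsumexp((g[None, :] - C) / eps, axis=1))
        # Given f, the unique g that makes every column sum to b_j.
        g = eps * (log_b - logsumexp((f[:, None] - C) / eps, axis=0))
        P = np.exp((f[:, None] + g[None, :] - C) / eps)
        if np.abs(P.sum(axis=1) - a).max() < tol:  # columns are exact here
            break
    return P, f, g


# Toy example: three occupations, cost = |i - j|.
C = np.abs(np.subtract.outer(np.arange(3), np.arange(3))).astype(float)
a = np.array([0.5, 0.3, 0.2])
b = np.array([0.2, 0.3, 0.5])
P, f, g = sinkhorn_log(C, a, b, eps=0.5)
print(P.round(4))
```

Working with potentials avoids forming \(e^{-C_{ij}/\varepsilon}\) directly, which underflows to zero in double precision once \(C_{ij}/\varepsilon\) exceeds roughly 745.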


Result

  • Because the objective is strictly convex, the optimal solution \(P^{\star}\) is unique

  • If the Sinkhorn algorithm converges, it must converge to \(P^{\star}\)

The optimal solution:

  • matches the observed origin and destination totals
  • respects the relative friction structure implied by \(C_{ij}\)

Diagnostic Plots

  • If distance rationalizes mobility, log excess mobility will decline linearly with distance.

  • If the distance metric is mis-specified, it will not.

2016 TEER mismatch

  • Apparent distance traveled might be implausible if 2016 occupation is inconsistent with highest educational attainment

  • Admissible TEERs are the target TEER \(\pm 1\), plus TEER 0 (management)

| Highest attainment | Target TEER | Admissible TEERs |
|---|---|---|
| No certificate, diploma or degree | 5 | 4, 5, 0 |
| Secondary (high) school diploma | 4 | 3, 4, 5, 0 |
| Apprenticeship or trades certificate or diploma | 3 | 2, 3, 4, 0 |
| College / CEGEP / other non-university | 2 | 1, 2, 3, 0 |
| University certificate or diploma below bachelor | 2 | 1, 2, 3, 0 |
| Bachelor’s degree | 1 | 1, 2, 0 |
| Above bachelor (Master’s, PhD, MD, etc.) | 1 | 1, 2, 0 |
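
The admissibility rule can be checked mechanically. In this sketch the attainment labels are copied from the table, while the function names and the dictionary are hypothetical and not taken from the author's code.

```python
# Target TEER for each highest-attainment category (from the table above).
TARGET_TEER = {
    "No certificate, diploma or degree": 5,
    "Secondary (high) school diploma": 4,
    "Apprenticeship or trades certificate or diploma": 3,
    "College / CEGEP / other non-university": 2,
    "University certificate or diploma below bachelor": 2,
    "Bachelor's degree": 1,
    "Above bachelor (Master's, PhD, MD, etc.)": 1,
}


def admissible_teers(target: int) -> set[int]:
    """Target TEER plus or minus one (within 0..5), plus TEER 0 (management)."""
    neighbours = {t for t in (target - 1, target, target + 1) if 0 <= t <= 5}
    return neighbours | {0}


def is_teer_mismatch(attainment: str, teer_2016: int) -> bool:
    """True when the 2016 occupation's TEER is outside the admissible set."""
    return teer_2016 not in admissible_teers(TARGET_TEER[attainment])


print(sorted(admissible_teers(4)))                   # → [0, 3, 4, 5]
print(is_teer_mismatch("Bachelor's degree", 4))      # → True
```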

References

Anderson, James E., and Eric van Wincoop. 2003. “Gravity with Gravitas: A Solution to the Border Puzzle.” American Economic Review 93 (1): 170–92. https://doi.org/10.1257/000282803321455214.
Gathmann, Christina, and Uta Schönberg. 2010. “How General Is Human Capital? A Task-Based Approach.” Journal of Labor Economics 28 (1): 1–49. https://doi.org/10.1086/649786.
Santos Silva, J. M. C., and Silvana Tenreyro. 2006. “The Log of Gravity.” The Review of Economics and Statistics 88 (4): 641–58.