
Where Skills vs Institutions Shape Occupational Mobility

Methodology and Simulation Evidence

Richard Martin
Ministry of Post Secondary Education and Future Skills

Disclaimer: The views expressed are those of the author and do not necessarily reflect those of the Government of British Columbia.

Labour markets continually reallocate workers

Demand for occupations shifts due to technological change, policy, and shocks
Entry and exit absorb part of the adjustment, but some workers must move across occupations
Some occupational transitions are harder than others

How does occupational distance shape mobility?

Workers tend to move between similar occupations (Gathmann and Schönberg 2010)

\[ P_{ij} \downarrow \text{ as } C_{ij} \uparrow \]

where \(P_{ij}\) is the probability of moving from occupation \(i\) to \(j\), and \(C_{ij}\) represents the mobility cost derived from occupational distance

We therefore ask:

Which notion of occupational distance best rationalizes local mobility?

What Defines Occupational Distance?

Mobility costs are latent → we evaluate alternative specifications

Three candidates:

Skill similarity
- Distance based on O*NET skills, abilities, knowledge, and work activities, mapped to NOC 2021.
- Why not OaSIS?
  - The ability to falsify the model depends on observing where measurement error enters the data.
  - OaSIS hides that structure (black box).
Institutional hierarchy
- Distance based on NOC (Canada’s occupational classification)
Binary switching cost
- (degenerate distance; stay vs move only)

Comparing distance metrics

Model mobility using entropy-regularized optimal transport (EROT)
Allows distance metrics to be compared in a clean horse race
We compare cost structures, holding mobility rates fixed

Data: Census 2016–2021 Longitudinal

Core sample: Observations with reliable measurement
Robustness / Falsification samples (measurement challenges)
- 2016 Occupation mismeasurement
  - Self-employed
  - TEER–highest attainment mismatch
- Mapping misspecification
  - Bottom 10% by (Herfindahl × similarity)
- Joint mismeasurement
  - Both sources of noise

Mobility regimes

Local mobility (EROT)

No change in highest educational attainment
Distance-governed transitions

Non-local mobility

Increase in highest educational attainment
Education-enabled jumps

Local Mobility

Flows increase with size and decline with mobility cost

\[ P_{ij} = \exp(\alpha_i + \beta_j - \gamma C_{ij}) \]

Our objective: isolate the role of the cost structure

Gravity is the standard empirical framework

Anderson and van Wincoop (2003): fixed effects (multilateral resistance)
Santos Silva and Tenreyro (2006): PPML (estimation)
Fit alone does not validate the cost matrix
Fixed effects can induce fit even with the wrong cost matrix

Entropy-Regularized Optimal Transport (EROT)

We impose observed marginals:

\[ \sum_j P_{ij} = a_i, \qquad \sum_i P_{ij} = b_j \]

→ Flows are determined by costs conditional on these constraints

We compare cost structures given fixed marginals

Gravity: estimates origin/destination attractiveness (fixed effects) jointly with costs

EROT: holds marginal structure fixed and evaluates cost models conditional on it

Mobility determined by cost–temperature ratio

\[ \frac{C_{ij}}{\varepsilon} \]

\(\varepsilon\): scale of idiosyncratic utility shocks (logit/RUM interpretation)
Low \(\varepsilon\) → cost-minimizing reallocation
High \(\varepsilon\) → diffuse mobility (approaches independence: \(P_{ij} = a_i b_j\))
Only relative cost differences (scaled by \(\varepsilon\)) shape mobility
Results evaluated across a range of \(\varepsilon\) (no single scale assumed)
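The role of the cost–temperature ratio can be illustrated with a minimal sketch of the logit/RUM interpretation. It looks at a single origin occupation and ignores the marginal constraints that EROT imposes, so it shows only how \(\varepsilon\) sharpens or flattens the kernel \(e^{-C_{ij}/\varepsilon}\). The cost values and the function name `choice_probabilities` are hypothetical.

```python
import math

def choice_probabilities(costs, eps):
    """Logit probabilities proportional to exp(-C_j / eps) for one origin."""
    # Subtract the minimum cost before exponentiating for numerical stability;
    # this does not change the normalized probabilities.
    c_min = min(costs)
    weights = [math.exp(-(c - c_min) / eps) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical mobility costs from one origin to four destinations.
costs = [0.0, 1.0, 2.0, 9.0]

for eps in (0.1, 1.0, 100.0):
    probs = choice_probabilities(costs, eps)
    print(f"eps={eps}: {[round(p, 3) for p in probs]}")

low_temp = choice_probabilities(costs, 0.1)     # mass piles onto the cheapest move
high_temp = choice_probabilities(costs, 100.0)  # close to uniform (diffuse)
```

Because only \(C_{ij}/\varepsilon\) enters, doubling every cost and doubling \(\varepsilon\) leaves the probabilities unchanged.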

Skill distance: Based on 161 O*NET measures

Hierarchical distance: derived from NOC taxonomy

All five digits match:
- distance = 0
First four digits match:
- distance = 1
First three digits match:
- distance = 2
First digit matches:
- distance = 3 + |ΔTEER|
Otherwise:
- distance = 9
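The rule above can be sketched as a small function. It assumes that NOC 2021 codes are passed as five-character strings and that the second digit encodes the TEER category; the function name and the example codes are illustrative only.

```python
def hierarchical_distance(noc_a: str, noc_b: str) -> int:
    """Distance between two 5-digit NOC 2021 codes, following the slide's rule."""
    if len(noc_a) != 5 or len(noc_b) != 5:
        raise ValueError("expected 5-digit NOC codes as strings")
    if noc_a == noc_b:
        return 0                      # all five digits match
    if noc_a[:4] == noc_b[:4]:
        return 1                      # first four digits match
    if noc_a[:3] == noc_b[:3]:
        return 2                      # first three digits match
    if noc_a[0] == noc_b[0]:
        # Assumption: the second digit of a NOC 2021 code is the TEER category.
        return 3 + abs(int(noc_a[1]) - int(noc_b[1]))
    return 9                          # different broad category

# Illustrative code strings (not a claim about specific occupations).
print(hierarchical_distance("21231", "21231"))  # 0
print(hierarchical_distance("21231", "21232"))  # 1
print(hierarchical_distance("21231", "21211"))  # 2
print(hierarchical_distance("21231", "22220"))  # 4 (same first digit, TEER 1 vs 2)
print(hierarchical_distance("21231", "63200"))  # 9
```

With TEER ranging from 0 to 5, the within-category distance never exceeds 8, so moves across broad categories (distance 9) are always the most costly.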

Empirical CDFs

The Labour Market as a Whole

We simulate data from a mixture cost structure:

\[ C^{\text{true}} = (1 - w)\,C_{\text{skill}} + w\,C_{\text{hier}} \]

We then estimate using:

\(C_{\text{skill}}\)
\(C_{\text{hier}}\)
\(C_{\text{binary}}\)

All models are misspecified (via mixture)
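A minimal sketch of how the three cost matrices in this exercise could be built. The slides do not say whether the skill and hierarchical distances are rescaled before mixing, so this sketch does not rescale; the matrices and the names `mix_costs` and `binary_cost` are hypothetical.

```python
def binary_cost(n):
    """Degenerate distance: 0 for staying, 1 for any move."""
    return [[0.0 if i == j else 1.0 for j in range(n)] for i in range(n)]

def mix_costs(c_skill, c_hier, w):
    """C_true = (1 - w) * C_skill + w * C_hier, element by element."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return [[(1.0 - w) * s + w * h for s, h in zip(row_s, row_h)]
            for row_s, row_h in zip(c_skill, c_hier)]

# Hypothetical 3-occupation example.
c_skill = [[0.0, 0.2, 0.9],
           [0.2, 0.0, 0.5],
           [0.9, 0.5, 0.0]]
c_hier = [[0.0, 1.0, 9.0],
          [1.0, 0.0, 9.0],
          [9.0, 9.0, 0.0]]

c_true = mix_costs(c_skill, c_hier, w=0.5)
c_binary = binary_cost(3)
print(c_true[0][2])
```

For any \(w\) strictly between 0 and 1, none of the three candidate matrices equals `c_true`, which is the sense in which every model in the horse race is misspecified.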

Evaluating the Horse Race

KL divergence: how poorly the model predicts outcomes that occur.
We measure fit using relative KL improvement (vs. independence)

\[ \frac{KL(P \parallel P^{\text{ind}}) - KL(P \parallel \hat P)}{KL(P \parallel P^{\text{ind}})} \]

0 → no improvement over independence
1 → perfect fit
< 0 → worse than independence (typically over-confidence at low temperature)

As \(\varepsilon \rightarrow 0\) the solution is too sharp: “false negatives” are heavily penalized.
The best fit occurs at a temperature above its true value (\(\varepsilon = 1\)).
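The fit measure can be sketched as follows. The example assumes a strictly positive predicted matrix (which holds for EROT solutions with finite costs); the matrices and function names are hypothetical.

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) summed over cells; cells with P_ij = 0 contribute nothing."""
    total = 0.0
    for row_p, row_q in zip(p, q):
        for p_ij, q_ij in zip(row_p, row_q):
            if p_ij > 0.0:
                total += p_ij * math.log(p_ij / q_ij)
    return total

def independence(p):
    """Independence benchmark P_ind[i][j] = a_i * b_j built from P's margins."""
    a = [sum(row) for row in p]
    b = [sum(col) for col in zip(*p)]
    return [[a_i * b_j for b_j in b] for a_i in a]

def relative_kl_improvement(p, p_hat):
    base = kl_divergence(p, independence(p))
    return (base - kl_divergence(p, p_hat)) / base

# Hypothetical observed flows and an over-sharp prediction with the same margins.
p_obs = [[0.4, 0.1],
         [0.1, 0.4]]
p_sharp = [[0.499, 0.001],
           [0.001, 0.499]]

print(relative_kl_improvement(p_obs, p_obs))                # 1.0 (perfect fit)
print(relative_kl_improvement(p_obs, independence(p_obs)))  # 0.0 (no improvement)
print(relative_kl_improvement(p_obs, p_sharp))              # negative
```

The over-sharp prediction assigns almost no mass to off-diagonal moves that do occur, so the \(\log(P_{ij}/\hat P_{ij})\) terms on those cells dominate and the score falls below zero.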

Specificity Tertiles

Some occupations draw from very specific educational paths.

Education concentration: \[KL(p \parallel p_0)= \sum_{k} p_k \, \log\!\left(\frac{p_k}{p_{0k}}\right)\] where \(p_0\) is the overall workforce education distribution
KL measures average log-surprise (relative to the workforce baseline)
We remove the mechanical relationship with occupation size \(T\): \[\text{specificity}= \log(KL) - \widehat{E}[\log(KL)\mid \log(T)]\]
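A minimal sketch of the two steps. It assumes that \(T\) is occupation size and that the conditional expectation is estimated by a simple linear regression of \(\log(KL)\) on \(\log(T)\); the slides do not specify the estimator, so the OLS step is an assumption. All data and names are hypothetical.

```python
import math

def education_kl(p, p0):
    """KL(p || p0): concentration of an occupation's education mix vs. baseline."""
    return sum(pk * math.log(pk / p0k) for pk, p0k in zip(p, p0) if pk > 0.0)

def ols_residuals(x, y):
    """Residuals from a one-variable least-squares fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Hypothetical workforce baseline over three education categories.
baseline = [0.5, 0.3, 0.2]

# Hypothetical occupations: (education shares, occupation size T).
occupations = {
    "occ_a": ([0.1, 0.1, 0.8], 200),
    "occ_b": ([0.4, 0.4, 0.2], 5000),
    "occ_c": ([0.6, 0.3, 0.1], 20000),
    "occ_d": ([0.2, 0.2, 0.6], 800),
}

kl = {name: education_kl(p, baseline) for name, (p, _) in occupations.items()}
log_kl = [math.log(kl[name]) for name in occupations]   # requires KL > 0
log_t = [math.log(size) for _, size in occupations.values()]

# Specificity = log(KL) minus its fitted value given log(T).
specificity = dict(zip(occupations, ols_residuals(log_t, log_kl)))
print(specificity)
```

The residualization matters because small occupations tend to show high measured KL simply from sampling noise; the residual compares each occupation with others of similar size.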

Local mobility regimes

Mobility	Description	Cost Matrix
Limited	Top tertile & TEERs 1,2	Binary
Vertical	Middle tertile + (Top tertile & TEERs 3,4)	Hierarchical
Horizontal	Bottom tertile + all TEERs 0,5	Skill
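The assignment in the table can be sketched as a lookup. The sketch reads "all TEERs 0,5" as taking precedence over the tertile, which is an interpretation of the table rather than something the slides state; the function name and tertile labels are hypothetical.

```python
def assign_regime(tertile: str, teer: int) -> tuple[str, str]:
    """Return (mobility regime, cost matrix) for a specificity tertile and TEER."""
    if tertile not in {"bottom", "middle", "top"}:
        raise ValueError("tertile must be 'bottom', 'middle', or 'top'")
    if teer not in range(6):
        raise ValueError("TEER must be an integer from 0 to 5")
    # Assumption: TEERs 0 and 5 are horizontal regardless of tertile.
    if teer in (0, 5) or tertile == "bottom":
        return ("Horizontal", "Skill")
    if tertile == "top" and teer in (1, 2):
        return ("Limited", "Binary")
    # Remaining cases: middle tertile, or top tertile with TEERs 3 and 4.
    return ("Vertical", "Hierarchical")

print(assign_regime("top", 1))     # ('Limited', 'Binary')
print(assign_regime("top", 3))     # ('Vertical', 'Hierarchical')
print(assign_regime("middle", 2))  # ('Vertical', 'Hierarchical')
print(assign_regime("top", 0))     # ('Horizontal', 'Skill')
```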

Hypothesized market segments

Mobility varies by segment

Non-local model

Distance ~ specificity measures + origin fixed effects

Specificity measures (CIP–NOC KL):
- Education specificity → CIP exit concentration (wormhole entrance)
- Destination gating → NOC entry concentration (wormhole exit)
Hypotheses
- New attainment: distance ↑ with specificity
- No new attainment: no systematic relationship

Priors–Not Conclusions

Mobility is heterogeneous (local vs non-local)
- Local mobility: also heterogeneous!
  - limited movement → binary
  - horizontal movement → skill
  - vertical movement → hierarchy
- Non-local mobility
  - Distance travelled ~ education specificity and destination gating

Policy implication

Labour shortages in different segments likely require different adjustment mechanisms
- Horizontal: reskilling and short-cycle training can accelerate adjustment
- Vertical: direct workers towards “first rung” of career ladders
- Limited: adjustment depends on expanding formal training pipelines

Thank you!

Questions?

Coop Term in Ministry of Post Secondary Education and Future Skills:

contact Nicole.Bruce@gov.bc.ca

Data: Census 2016–2021 Longitudinal File

Both Censuses linked via the central Derived Record Depository (DRD)
Probabilistic linkage (name, sex, DOB, phone, postal code)
High linkage quality, but not perfect
\(\therefore\) population estimates would be biased, but estimating population totals is not our objective.

Which notion of occupational distance best rationalizes local mobility?

Robust to selection unless sample favors one distance metric

Entropy-regularized optimal transport

\[ \mathcal{L} = \underbrace{\sum_{ij} P_{ij} C_{ij}}_{\text{mass}\times\text{distance}} + \varepsilon\times \underbrace{\sum_{ij} P_{ij}(\log P_{ij}-1)}_{\text{negative entropy}} + \sum_i f_i \underbrace{\left(a_i - \sum_j P_{ij}\right)}_{\text{origin constraint}} + \sum_j g_j \underbrace{\left(b_j - \sum_i P_{ij}\right)}_{\text{destination constraint}} \]

First-order conditions:

\[ \frac{\partial \mathcal{L}}{\partial P_{ij}} = C_{ij} + \varepsilon \log P_{ij} - f_i - g_j = 0 \]

Rearranging:

\[ \log P_{ij} = \frac{f_i}{\varepsilon} + \frac{g_j}{\varepsilon} - \frac{C_{ij}}{\varepsilon} \]

Structure of the Solution

Exponentiating:

\[ P_{ij} = u_i\, e^{-C_{ij}/\varepsilon}\, v_j, \qquad u_i = e^{f_i/\varepsilon}, \quad v_j = e^{g_j/\varepsilon} \]

Thus the optimal solution must be a row- and column-rescaling of the kernel \(e^{-C_{ij}/\varepsilon}\)
Given \(v\), there is a unique \(u\) that fixes the rows
Given \(u\), there is a unique \(v\) that fixes the columns
Sinkhorn alternates these corrections until both margins are satisfied
Implemented using a benchmark-tested log-domain Sinkhorn solver that ensures numerical stability and dimensional alignment
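The alternating corrections can be sketched in the log domain as follows. This is a small illustrative implementation in plain Python, not the benchmark-tested solver used for the results; it assumes strictly positive marginals, and the cost matrix, marginals, and function names are hypothetical.

```python
import math

def logsumexp(values):
    """Stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def sinkhorn_log(cost, a, b, eps, max_iter=1000, tol=1e-10):
    """Return P_ij = exp((f_i + g_j - C_ij) / eps) with row sums a, column sums b."""
    n, m = len(a), len(b)
    log_a = [math.log(x) for x in a]   # requires a_i > 0
    log_b = [math.log(x) for x in b]   # requires b_j > 0
    f = [0.0] * n                      # origin potentials
    g = [0.0] * m                      # destination potentials
    plan = []
    for _ in range(max_iter):
        # Given g, pick f so that every row sums to a_i.
        f = [eps * (log_a[i] - logsumexp([(g[j] - cost[i][j]) / eps
                                          for j in range(m)]))
             for i in range(n)]
        # Given f, pick g so that every column sums to b_j.
        g = [eps * (log_b[j] - logsumexp([(f[i] - cost[i][j]) / eps
                                          for i in range(n)]))
             for j in range(m)]
        plan = [[math.exp((f[i] + g[j] - cost[i][j]) / eps) for j in range(m)]
                for i in range(n)]
        # Columns are exact after the g update, so only rows need checking.
        row_error = max(abs(sum(plan[i]) - a[i]) for i in range(n))
        if row_error < tol:
            break
    return plan

cost = [[0.0, 1.0, 4.0],
        [1.0, 0.0, 2.0],
        [4.0, 2.0, 0.0]]
a = [0.5, 0.3, 0.2]   # origin shares
b = [0.2, 0.3, 0.5]   # destination shares

plan = sinkhorn_log(cost, a, b, eps=1.0)
row_sums = [sum(row) for row in plan]
col_sums = [sum(col) for col in zip(*plan)]
print(row_sums, col_sums)
```

Working with the potentials \(f\) and \(g\) rather than with \(u\) and \(v\) directly avoids underflow in \(e^{-C_{ij}/\varepsilon}\) when \(\varepsilon\) is small relative to the costs.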

Result

Because the objective is strictly convex, the optimal solution \(P^{\star}\) is unique
If the Sinkhorn algorithm converges, it must converge to \(P^{\star}\)

The optimal solution:

matches the observed origin and destination totals
respects the relative friction structure implied by \(C_{ij}\)

Diagnostic Plots

If distance rationalizes mobility, log excess mobility will decline linearly with distance.
If the distance metric is mis-specified, it will not.

2016 TEER mismatch

Apparent distance traveled might be implausible if 2016 occupation is inconsistent with highest educational attainment
Admissible TEERs are the Target TEER \(\pm1\) plus TEER 0 (management)

Highest attainment	Target TEER	Admissible TEERs
No certificate, diploma or degree	5	4, 5, 0
Secondary (high) school diploma	4	3, 4, 5, 0
Apprenticeship or trades certificate or diploma	3	2, 3, 4, 0
College / CEGEP / other non-university	2	1, 2, 3, 0
University certificate or diploma below bachelor	2	1, 2, 3, 0
Bachelor’s degree	1	1, 2, 0
Above bachelor (Master’s, PhD, MD, etc.)	1	1, 2, 0
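The admissibility rule in the table can be sketched as a simple check. The dictionary keys are shorthand labels for the attainment categories above, and the function names are hypothetical.

```python
# Target TEER by highest attainment (shorthand keys for the table's rows).
TARGET_TEER = {
    "no_certificate": 5,
    "secondary": 4,
    "apprenticeship_or_trades": 3,
    "college": 2,
    "university_below_bachelor": 2,
    "bachelor": 1,
    "above_bachelor": 1,
}

def admissible_teers(target: int) -> set[int]:
    """Target TEER plus or minus one (within 1..5), plus TEER 0 (management)."""
    neighbours = {t for t in (target - 1, target, target + 1) if 1 <= t <= 5}
    return neighbours | {0}

def is_teer_mismatch(attainment: str, teer_2016: int) -> bool:
    """True when the 2016 occupation's TEER is outside the admissible set."""
    return teer_2016 not in admissible_teers(TARGET_TEER[attainment])

print(sorted(admissible_teers(5)))      # [0, 4, 5]
print(sorted(admissible_teers(1)))      # [0, 1, 2]
print(is_teer_mismatch("bachelor", 4))  # True
print(is_teer_mismatch("secondary", 5)) # False
```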

References

Anderson, James E., and Eric van Wincoop. 2003. “Gravity with Gravitas: A Solution to the Border Puzzle.” American Economic Review 93 (1): 170–92. https://doi.org/10.1257/000282803321455214.

Gathmann, Christina, and Uta Schönberg. 2010. “How General Is Human Capital? A Task-Based Approach.” Journal of Labor Economics 28 (1): 1–49. https://doi.org/10.1086/649786.

Santos Silva, J. M. C., and Silvana Tenreyro. 2006. “The Log of Gravity.” The Review of Economics and Statistics 88 (4): 641–58. https://doi.org/10.1162/rest.88.4.641.