Supply of new workers to BC

Author

Richard Martin

Streams of people

There are three streams of people that augment British Columbia’s supply of labour: the young, working age¹ people from other provinces/territories, and working age² people from other countries. In this note we forecast each of these three streams of people, as well as their respective participation rates. When the stream and participation rate are multiplied, we get a forecast of these supply additions. Only by divine miracle would these supply additions exactly equal the sum of expansion and replacement demand: any difference is assumed to be balanced by changes in unemployment and participation in working age British Columbians.

The young

For the young, we use demographic projections from https://bcstats.shinyapps.io/popApp/. To avoid double counting working age³ immigrants and inter-provincial migrants, we use BC population counts of individuals aged 10-14 from 6 years prior. i.e. those currently ages 16-20. We apply a kernel smoother to model the transition from being young to being working age. i.e. in any given year, 10% of 16 year-olds become working age, 20% of 17 year-olds, etc.

Code

kernel_weights <- tibble(age=16:20, `kernel weights`=c(.1,.2,.4,.2,.1))

ggplot(kernel_weights, aes(age, `kernel weights`))+
  geom_col(alpha=.5)

Code

entrant_stream_data <- vroom(here("data",
                        "supply_side",
                        "put_bc_stats_demographics_here",
                        "Population_Projections.csv"),
                   skip=6)|>
  clean_names()|>
  filter(age %in% 10:14)|>
  mutate(year=year+6,
         age=as.numeric(age)+6)|>
  group_by(year)|>
  nest()|>
  mutate(smoothed=map_dbl(data, kernel_smooth))|>
  unnest(cols = c(data))

entrant_stream_data|>
  filter(year<=max(forecast_years))|>
  ggplot()+
  geom_rect(aes(xmin = min(forecast_years), xmax = max(forecast_years), ymin = -Inf, ymax = Inf),
            fill = "grey", colour="grey") +
  geom_line(mapping=aes(year, pop, colour=age, group=age))+
  geom_line(mapping=aes(year, smoothed), lwd=1.5)+
  labs(title=paste("Kernel Smoothing of 6-year-shifted demographics (black line)"))+
  scale_colour_viridis_c()+
  theme_minimal()

Code

entrant_stream <- entrant_stream_data|>
  filter(year %in% forecast_years)|>
  rename(ref_date=year)|>
  ungroup()|>
  select(ref_date, smoothed)|>
  distinct()

Net inter-provincial migrants

For net inter-provincial migrants, we use data from Statistics Canada table 17-10-0015-01. We filter geo==“British Columbia”, gender==“Total - gender”, migrants==“Net-migration”, and age==“15 to 64 years”. We create an ensemble forecast using ETS and TSLM, plotted in blue below.

Code

inter_prov_data <- vroom(here("data", "supply_side", "17100015.csv"))|>
  clean_names()|>
  filter(geo=="British Columbia",
         gender=="Total - gender",
         migrants=="Net-migration",
         age_group=="15 to 64 years")|>
  mutate(ref_date=as.numeric(str_sub(ref_date, 1, 4)))|>
  select(ref_date, value)|>
  as_tsibble(index=ref_date)

inter_prov_stream <- inter_prov_data|>
  model(ets=ETS(value),
        tslm=TSLM(value~trend())
        )|> 
  forecast(h = forecast_horizon+1)|>
  as_tibble()|>
  group_by(ref_date)|>
  summarize(interprovincial_people=mean(.mean))

ggplot()+
    geom_line(data=inter_prov_data, mapping=aes(ref_date, value))+
    geom_line(data=inter_prov_stream, mapping = aes(ref_date, interprovincial_people), colour="blue")

Immigrants

For immigrants, we use data from Statistics Canada table 17-10-0014-01 where we filter geo==“British Columbia”, gender==“Total-gender”, type of migrant \(\in\) (Immigrants, Net emigration, net non-permanent residents), and age_group==“15 to 64 years”. We create a series permanent_residents which is the difference between immigrants and net_emigration. We create an ensemble forecast using ETS and TSLM, plotted in blue below. Note that the immigrants component of permanent_residents is consistent with the levels plan for 2026-2028, but we can not use the levels plan to inform our forecast of net non-permanent residents.

Code

immigrant_data <- vroom(here("data", "supply_side", "17100014.csv"))|>
  clean_names()|>
  filter(geo=="British Columbia",
         gender=="Total - gender",
         type_of_migrant %in% c("Immigrants", "Net emigration", "Net non-permanent residents"),
         age_group=="15 to 64 years")|>
  mutate(ref_date=as.numeric(str_sub(ref_date, 1,4)))|>
  select(ref_date, type_of_migrant, value)|>
  as_tsibble(index=ref_date, key=type_of_migrant)

immigrant_stream <- immigrant_data|>
  model(ets=ETS(value),
        tslm=TSLM(value~trend())
        )|> 
  forecast(h = forecast_horizon+1)|>
  as_tibble()|>
  group_by(ref_date, type_of_migrant)|>
  summarize(value=mean(.mean))|>
  pivot_wider(id_cols = ref_date, names_from = type_of_migrant, values_from = value)|>
  clean_names()|>
  mutate(permanent_residents=immigrants-net_emigration)|>
  select(-immigrants, -net_emigration)|>
  pivot_longer(cols=-ref_date)

immigrant_data <- immigrant_data|>
  as_tibble()|>
  pivot_wider(names_from = type_of_migrant, values_from=value)|>
  clean_names()|>
  mutate(permanent_residents=immigrants-net_emigration)|>
  select(-immigrants, -net_emigration)|>
  pivot_longer(cols=-ref_date)

ggplot()+
    geom_line(data=immigrant_data, mapping=aes(ref_date, value))+
    geom_line(data=immigrant_stream, mapping = aes(ref_date, value), colour="blue")+
  facet_wrap(~name)

Participation rates:

For the participation rate for new entrants we use Statistics Canada table 14-10-0327-01, where we filter geo==“British Columbia”, gender==“Total - Gender”, and age_group==“15 to 64 years”. We create an ensemble forecast (ETS and TSLM) for the participation rate of working age British Columbians.

Code

can_bc_part_rates <- vroom(here("data", "supply_side", "14100327.csv"))

bc_part_rate_historic <- can_bc_part_rates|>
  clean_names()|>
  filter(geo=="British Columbia",
         gender=="Total - Gender",
         age_group=="15 to 64 years",
         labour_force_characteristics=="Participation rate"
         )|>
    mutate(part_rate=value/100,
           ref_date=as.numeric(ref_date))|>
    select(ref_date, part_rate)|>
    as_tsibble(index=ref_date)

  bc_part_rate <- bc_part_rate_historic|>
    model(ets=ETS(part_rate~trend()),
          tslm=TSLM(part_rate~trend()))|>
    forecast(h=forecast_horizon+1)|>
    as_tibble()|>
    group_by(ref_date)|>
    summarize(part_rate=mean(.mean))

  ggplot()+
    geom_line(data=bc_part_rate_historic, mapping=aes(ref_date, part_rate))+
    geom_line(data=bc_part_rate, mapping = aes(ref_date, part_rate), colour="blue")

For the participation rate for inter-provincial migrants, we use Statistics Canada table 14-10-0327-01 where we filter geo==“Canada”, gender==“Total-gender”, and age \(\in\) [15,64]. We create an ensemble forecast (ETS and TSLM) for the participation rate of working age Canadians.

Code

canada_part_rate_historic <- can_bc_part_rates|>
  clean_names()|>
  filter(geo=="Canada",
         gender=="Total - Gender",
         age_group=="15 to 64 years",
         labour_force_characteristics=="Participation rate"
         )|>
    mutate(part_rate=value/100,
           ref_date=as.numeric(ref_date))|>
    select(ref_date, part_rate)|>
    as_tsibble(index=ref_date)

  canada_part_rate <- canada_part_rate_historic|>
    model(ets=ETS(part_rate~trend()),
          tslm=TSLM(part_rate~trend()))|>
    forecast(h=forecast_horizon+1)|>
    as_tibble()|>
    group_by(ref_date)|>
    summarize(part_rate=mean(.mean))

  ggplot()+
    geom_line(data=canada_part_rate_historic, mapping=aes(ref_date, part_rate))+
    geom_line(data=canada_part_rate, mapping = aes(ref_date, part_rate), colour="blue")

For the participation rates for the two streams (Permanent and Non-permanent residents) of immigrant labour we use Statistics Canada table 98-10-0446-01. This is 2021 census data (not LFS time series), so the participation rates are constant over the forecast horizon. We filter geo==“British Columbia”, Immigrant status and period of immigration \(\in\) (2016-2021, Non-permanent residents), highest certificate==“Total - Highest certificate, diploma or degree”, age \(\in\) (15-24, 25-64), Gender==“Total”, Visible minority==“Total”, Statistics==“Count”. For permanent residents, we only consider those who landed between 2016-2021. Because the working age group is split into young/old age brackets, we need to aggregate before calculating working age (15-64) participation rates.

Code

immigrant_part_rate <- open_dataset(here("data", "supply_side", "98100446_parquet"))|>
  dplyr::filter(GEO == "British Columbia",
                `Immigrant status and period of immigration (11)`%in% c("2016 to 2021","Non-permanent residents"),
                `Highest certificate, diploma or degree (7)`=="Total - Highest certificate, diploma or degree",
                `Age (15A)`%in% c("15 to 24 years","25 to 64 years"),
                `Gender (3)`=="Total - Gender",
                `Visible minority (15)`=="Total - Visible minority",
                `Statistics (3)`=="Count"
                )|>
  select(contains("status"), contains("age"))|>
  collect()|>
  clean_names()|>
  pivot_longer(cols=-c("immigrant_status_and_period_of_immigration_11", "age_15a"))|>
  rename(immigrant_class=contains("immigrant_status"))|>
  mutate(value=as.numeric(value),
         name=str_remove_all(name, "labour_force_status_8_"),
         name=str_sub(name, end=-3))|>
  group_by(immigrant_class, name)|>
  summarize(value=sum(value))|>
  pivot_wider(id_cols = immigrant_class)|>
  mutate(immigrant_part_rate=(unemployed+employed)/(unemployed+employed+not_in_the_labour_force))|>
    select(immigrant_class, immigrant_part_rate)

immigrant_part_rate$immigrant_class[1] <- "Permanent residents"
pr_part_rate <- immigrant_part_rate$immigrant_part_rate[immigrant_part_rate$immigrant_class=="Permanent residents"]
npr_part_rate <- immigrant_part_rate$immigrant_part_rate[immigrant_part_rate$immigrant_class=="Non-permanent residents"]

kable(immigrant_part_rate)

immigrant_class	immigrant_part_rate
Permanent residents	0.7917766
Non-permanent residents	0.7481337

New Workers:

To get the flow of young workers entering the labour force, we multiply the forecasts of new working age British Columbians and BC’s working age (15-64) participation rate.

Code

entrant_participants <- entrant_stream|>
  full_join(bc_part_rate)|>
  mutate(new_entrants=smoothed*part_rate)|>
  select(ref_date, new_entrants)

To get the flow of inter-provincial migrants entering the BC labour force, we multiply the forecasts of net-interprovincial migrants and Canada’s working age (15-64) participation rate.

Code

inter_prov_participants <- inter_prov_stream|>
  inner_join(canada_part_rate, by = join_by(ref_date))|>
  mutate(net_new_interprovincial=interprovincial_people*part_rate)|>
  select(ref_date, net_new_interprovincial)

Regarding sources of new labour supply from outside Canada, one might worry about double counting additions when a new permanent resident was formerly a non-permanent resident. However, Statistics Canada notes that

Non-permanent residents, outflow refers to the number of work and study permit holders leaving Canada, along with their family members who are not Canadian citizens, landed immigrants (permanent residents) or non-permanent residents themselves. This also includes permit holders and asylum claimants, protected persons and related groups staying in Canada who have been granted permanent resident status. In this case, these persons will be added to the immigrant component. Asylum claimants and related groups who have been deported are also counted as outflows.

Thus, to get new permanent resident’s contribution to new supply, we subtract net emigration from new permanent residents (“immigrants”), and then multiply the difference by the permanent resident participation rate. Similarly, we multiply net non-permanent residents by the non-permanent participation rate. Finally, we add the two streams together.

Code

immigrant_participants <- immigrant_stream|>
  pivot_wider()|>
  mutate(permanent=permanent_residents*pr_part_rate,
         non_permanent=net_non_permanent_residents*npr_part_rate,
         immigrant_participants=permanent+non_permanent)|>
  select(ref_date, immigrant_participants)

Noc Proportions:

In order to “inject” the new supply into the labour market, we need to know where, historically, these sources of new supply typically land (in terms of occupations.)

For young workers, we make use of Statistics Canada Census table 98-10-0593-01, where we apply filters geo==“British Columbia”, labour force status==“Employed”, age==“15 to 19 years”, gender==“Total”, class of worker==“Employee”. We then calculate proportions for each of the NOCs, to give us an indication of first occupations for young British Columbians.

Code

entrant_noc_props <- open_dataset(here("data",
                                      "supply_side",
                              "98100593_parquet"))|>
  dplyr::filter(GEO == "British Columbia",
                `Statistics (3)`=="Count",
                `Labour force status (3)`=="Employed",
                `Age (15A)`=="15 to 19 years",
                `Gender (3)`=="Total - Gender"
                )|>
  dplyr::select(noc_desc=`Occupation - Unit group - National Occupational Classification (NOC) 2021 (821A)`,
                count=`Class of worker (7A):Total - Class of worker[1]`)|>
  collect()|>
  filter(str_detect(noc_desc, "\\b\\d{5}\\b"))|>
  mutate(count=as.numeric(count),
         noc=str_sub(noc_desc, 1, 5),
           desc=str_sub(noc_desc, 7),
           prop=count/sum(count, na.rm=TRUE)
    )|>
    select(noc, desc, prop)|>
    arrange(desc(prop))

For inter-provincial migrants, we make use of Statistics Canada Census table 98-10-0449-01, and apply filters geo==“Canada”, highest certificate==“Total”, age==“Total, gender==”Total”. We make use of the “Employed” counts to calculate proportions for each of the NOCs, to give us an indication of first occupations for inter-provincial migrants.

Code

canada_noc_props <- open_dataset(here("data",
                                      "supply_side",
                              "98100449_parquet"))|>
  filter(GEO == "Canada",
                `Highest certificate, diploma or degree (16)`=="Total - Highest certificate, diploma or degree",
                `Age (15A)`=="Total - Age",
                `Gender (3)`=="Total - Gender")|>
  select(noc_desc=`Occupation - Unit group - National Occupational Classification (NOC) 2021 (821A)`,
         count=`Labour force status (3):Employed[2]`
         )|>
  collect()|>
  filter(str_detect(noc_desc, "\\b\\d{5}\\b"))|>
  mutate(count=as.numeric(count),
         noc=str_sub(noc_desc, 1, 5),
           desc=str_sub(noc_desc, 7),
           prop=count/sum(count, na.rm=TRUE)
    )|>
    select(noc, desc, prop)|>
    arrange(desc(prop))

For immigrants, we make use of Statistics Canada Census table 98-10-0316-01, and apply filters GEO==“Canada”, admission category==“total”, highest certificate==“total”, age==“total”, gender==“total”, period of immigration==“2016-2021”. We then calculate proportions for each of the NOCs, to give us an indication of first occupations for immigrants.

Code

immigrant_noc_props <- open_dataset(here("data",
                                      "supply_side",
                              "98100316_parquet"))|>
  filter(GEO=="Canada",
         `Admission category and applicant type (35)`=="Total - Admission category and applicant type",
         `Highest certificate, diploma or degree (7)`=="Total - Highest certificate, diploma or degree",
         `Age (8C)`=="Total - Age",
         `Gender (3)`=="Total - Gender")|>
  select(noc_desc=`Occupation - Unit group - National Occupational Classification (NOC) 2021 (821A)`,
         count=`Period of immigration (8A):2016 to 2021[8]`)|>
  collect()|>
  filter(str_detect(noc_desc, "\\b\\d{5}\\b"))|>
  mutate(count=as.numeric(count),
         noc=str_sub(noc_desc, 1, 5),
         desc=str_sub(noc_desc, 7),
         prop=count/sum(count, na.rm=TRUE)
    )|>
    select(noc, desc, prop)|>
    arrange(desc(prop))

Assign new workers to NOCs:

Code

new_entrants <- crossing(entrant_participants, entrant_noc_props)|>
  mutate(`New Entrants`=new_entrants*prop)|>
  select(-prop, -desc, -new_entrants)|>
  filter(ref_date>min(ref_date))

immigrants <- crossing(immigrant_participants, immigrant_noc_props)|>
  mutate(`Net International In-Migration`=immigrant_participants*prop)|>
  select(-immigrant_participants, -prop, -desc)|>
  filter(ref_date>min(ref_date))

interprovincial <- crossing(inter_prov_participants, canada_noc_props)|>
  mutate(`Net Interregional In-Migration`=net_new_interprovincial*prop)|>
  select(ref_date, noc, desc, `Net Interregional In-Migration`)|>
  filter(ref_date>min(ref_date))

Code

by_teer <- alluvium_prep(2, teer_names)
by_broad <- alluvium_prep(1, broad_names)

Code

alluvium_plot(by_teer)

Code

alluvium_plot(by_broad)

Inter-occupational mobility.

From the above alluvium plots it should? be obvious that a disproportionate share of new supply is being “injected” into the labour market in TEER 5 Sales and services occupations, necessitating inter-occupational mobility. The first step to derive inter-occupational mobility is to calculate supply_no_mobility, which is the sum of the 3 streams of new supply.

Code

supply_no_mobility <- full_join(immigrants, new_entrants, by = join_by(ref_date, noc))|>
  full_join(interprovincial, by = join_by(ref_date, noc))|>
  mutate(across(where(is.numeric), ~ replace_na(.x, 0)))|>
  select(ref_date, noc, desc, everything())|>
  mutate(supply_no_mobility=`Net International In-Migration`+`New Entrants`+`Net Interregional In-Migration`)

Inter-occupational mobility is driven by the difference between supply_no_mobility and demand, so lets load our demand forecast.

Code

demand <- read_rds(here("out","richs_forecast.rds"))|>
  group_by(noc_5, year)|>
  summarize(expansion_demand=sum(expansion_demand, na.rm=TRUE),
            replacement_demand=sum(replacement_demand, na.rm=TRUE))|>
  filter(year>min(year))|>
  rename(ref_date=year)|>
  mutate(`Job Openings`=expansion_demand+replacement_demand)|>
  select(noc=noc_5, ref_date, `Job Openings`)

We assume that inter-occupational mobility balances supply and demand to the extent possible. Aggregate supply and aggregate demand will never be exactly equal, so we normalize demand so that it is exactly equal to supply. Inter-occupational mobility is the difference between normalized demand and supply_no_mobility. Supply is the sum of supply_no_mobility and inter-occupational mobility. Finally, the residual is the difference between demand and supply, assumed to be balanced by changes to unemployment and participation among working age British Columbians.

Code

supply_and_demand <- full_join(supply_no_mobility, demand)|>
  mutate(normalized_demand=`Job Openings`*sum(supply_no_mobility, na.rm=TRUE)/sum(`Job Openings`, na.rm=TRUE),
         interoccupation_mobility=normalized_demand-supply_no_mobility,
         `Total Supply Change`=supply_no_mobility+interoccupation_mobility,
         `Change in labour market participation`=`Job Openings`-`Total Supply Change`
         )

my_dt(supply_and_demand)

Compare with Stokes

Note there are a couple definition mis-matches in the table below.
Rich: Total Supply Change just covers the 3 streams of new workers modeled above, it does not include Change in labour market participation:

\[ \mbox{Total Supply Change} = \mbox{New Entrants} + \mbox{Net International In-Migration} + \mbox{Net Interregional In-Migration} \]

\[ \mbox{Job Openings} = \mbox{Total Supply Change} + \mbox{Change in labour market participation} \]

I.e. any imbalance between the demand and the supply model is balanced by Change in labour market participation, conflating changes in participation and unemployment.
Stokes: Total Supply Change does include Change in labour market participation, but nets out changes in Unemployment.

\[ \begin{aligned} \text{Total Supply Change} =\;& \text{New Entrants} \\ &+ \text{Net International In-Migration} \\ &+ \text{Net Interregional In-Migration} \\ &+ \text{Change in labour market participation} \\ &- \text{Change in Unemployment} \end{aligned} \]

\[ \mbox{Job Openings} = \mbox{Total Supply Change} + \mbox{Change in Unemployment} \]

Code

sd5 <- read_lengthen_clean("supply_demand.xlsx")

stokes_total <- sd5%>%
  filter(noc=="#T",
         industry=="All Industries",
         geographic_area=="British Columbia",
         variable %in% c("Job Openings",
                         "New Entrants",
                         "Net International In-Migration",
                         "Net Interregional In-Migration",
                         "Net Other Mobility",
                         "Total Supply Change"),
         year>report_year)%>%
  group_by(name=variable)%>%
  summarise(value=sum(value))|>
  pivot_wider()|>
  mutate(`Decline in unemployment`=`Job Openings`-`Total Supply Change`,
         `Change in labour market participation`=`Net Other Mobility`+`Decline in unemployment`)|>
  pivot_longer(cols=everything(), values_to = "stokes")

rich_total <- supply_and_demand|>
  summarise(across(-c(ref_date, noc, desc), ~ sum(.x, na.rm = TRUE)))|>
  pivot_longer(cols=everything(), values_to = "rich")

inner_join(rich_total, stokes_total)|>
  DT::datatable(rownames = FALSE)|>
  DT::formatRound(c("rich", "stokes"), 2)

Footnotes

15-64 years old↩︎
15-64 years old↩︎
15-64 years old↩︎