Code
kernel_weights <- tibble(age=16:20, `kernel weights`=c(.1,.2,.4,.2,.1))
ggplot(kernel_weights, aes(age, `kernel weights`))+
geom_col(alpha=.5)There are three streams of people that augment British Columbia’s supply of labour: the young, working age1 people from other provinces/territories, and working age2 people from other countries. In this note we forecast each of these three streams of people, as well as their respective participation rates. When the stream and participation rate are multiplied, we get a forecast of these supply additions. Only by divine miracle would these supply additions exactly equal the sum of expansion and replacement demand: any difference is assumed to be balanced by changes in unemployment and participation in working age British Columbians.
For the young, we use demographic projections from https://bcstats.shinyapps.io/popApp/. To avoid double counting working age3 immigrants and inter-provincial migrants, we use BC population counts of individuals aged 10-14 from 6 years prior. i.e. those currently ages 16-20. We apply a kernel smoother to model the transition from being young to being working age. i.e. in any given year, 10% of 16 year-olds become working age, 20% of 17 year-olds, etc.
kernel_weights <- tibble(age=16:20, `kernel weights`=c(.1,.2,.4,.2,.1))
ggplot(kernel_weights, aes(age, `kernel weights`))+
geom_col(alpha=.5)entrant_stream_data <- vroom(here("data",
"supply_side",
"put_bc_stats_demographics_here",
"Population_Projections.csv"),
skip=6)|>
clean_names()|>
filter(age %in% 10:14)|>
mutate(year=year+6,
age=as.numeric(age)+6)|>
group_by(year)|>
nest()|>
mutate(smoothed=map_dbl(data, kernel_smooth))|>
unnest(cols = c(data))
entrant_stream_data|>
filter(year<=max(forecast_years))|>
ggplot()+
geom_rect(aes(xmin = min(forecast_years), xmax = max(forecast_years), ymin = -Inf, ymax = Inf),
fill = "grey", colour="grey") +
geom_line(mapping=aes(year, pop, colour=age, group=age))+
geom_line(mapping=aes(year, smoothed), lwd=1.5)+
labs(title=paste("Kernel Smoothing of 6-year-shifted demographics (black line)"))+
scale_colour_viridis_c()+
theme_minimal()entrant_stream <- entrant_stream_data|>
filter(year %in% forecast_years)|>
rename(ref_date=year)|>
ungroup()|>
select(ref_date, smoothed)|>
distinct()For net inter-provincial migrants, we use data from Statistics Canada table 17-10-0015-01. We filter geo==“British Columbia”, gender==“Total - gender”, migrants==“Net-migration”, and age==“15 to 64 years”. We create an ensemble forecast using ETS and TSLM, plotted in blue below.
inter_prov_data <- vroom(here("data", "supply_side", "17100015.csv"))|>
clean_names()|>
filter(geo=="British Columbia",
gender=="Total - gender",
migrants=="Net-migration",
age_group=="15 to 64 years")|>
mutate(ref_date=as.numeric(str_sub(ref_date, 1, 4)))|>
select(ref_date, value)|>
as_tsibble(index=ref_date)
inter_prov_stream <- inter_prov_data|>
model(ets=ETS(value),
tslm=TSLM(value~trend())
)|>
forecast(h = forecast_horizon+1)|>
as_tibble()|>
group_by(ref_date)|>
summarize(interprovincial_people=mean(.mean))
ggplot()+
geom_line(data=inter_prov_data, mapping=aes(ref_date, value))+
geom_line(data=inter_prov_stream, mapping = aes(ref_date, interprovincial_people), colour="blue")For immigrants, we use data from Statistics Canada table 17-10-0014-01 where we filter geo==“British Columbia”, gender==“Total-gender”, type of migrant \(\in\) (Immigrants, Net emigration, net non-permanent residents), and age_group==“15 to 64 years”. We create a series permanent_residents which is the difference between immigrants and net_emigration. We create an ensemble forecast using ETS and TSLM, plotted in blue below. Note that the immigrants component of permanent_residents is consistent with the levels plan for 2026-2028, but we can not use the levels plan to inform our forecast of net non-permanent residents.
immigrant_data <- vroom(here("data", "supply_side", "17100014.csv"))|>
clean_names()|>
filter(geo=="British Columbia",
gender=="Total - gender",
type_of_migrant %in% c("Immigrants", "Net emigration", "Net non-permanent residents"),
age_group=="15 to 64 years")|>
mutate(ref_date=as.numeric(str_sub(ref_date, 1,4)))|>
select(ref_date, type_of_migrant, value)|>
as_tsibble(index=ref_date, key=type_of_migrant)
immigrant_stream <- immigrant_data|>
model(ets=ETS(value),
tslm=TSLM(value~trend())
)|>
forecast(h = forecast_horizon+1)|>
as_tibble()|>
group_by(ref_date, type_of_migrant)|>
summarize(value=mean(.mean))|>
pivot_wider(id_cols = ref_date, names_from = type_of_migrant, values_from = value)|>
clean_names()|>
mutate(permanent_residents=immigrants-net_emigration)|>
select(-immigrants, -net_emigration)|>
pivot_longer(cols=-ref_date)
immigrant_data <- immigrant_data|>
as_tibble()|>
pivot_wider(names_from = type_of_migrant, values_from=value)|>
clean_names()|>
mutate(permanent_residents=immigrants-net_emigration)|>
select(-immigrants, -net_emigration)|>
pivot_longer(cols=-ref_date)
ggplot()+
geom_line(data=immigrant_data, mapping=aes(ref_date, value))+
geom_line(data=immigrant_stream, mapping = aes(ref_date, value), colour="blue")+
facet_wrap(~name)For the participation rate for new entrants we use Statistics Canada table 14-10-0327-01, where we filter geo==“British Columbia”, gender==“Total - Gender”, and age_group==“15 to 64 years”. We create an ensemble forecast (ETS and TSLM) for the participation rate of working age British Columbians.
can_bc_part_rates <- vroom(here("data", "supply_side", "14100327.csv"))
bc_part_rate_historic <- can_bc_part_rates|>
clean_names()|>
filter(geo=="British Columbia",
gender=="Total - Gender",
age_group=="15 to 64 years",
labour_force_characteristics=="Participation rate"
)|>
mutate(part_rate=value/100,
ref_date=as.numeric(ref_date))|>
select(ref_date, part_rate)|>
as_tsibble(index=ref_date)
bc_part_rate <- bc_part_rate_historic|>
model(ets=ETS(part_rate~trend()),
tslm=TSLM(part_rate~trend()))|>
forecast(h=forecast_horizon+1)|>
as_tibble()|>
group_by(ref_date)|>
summarize(part_rate=mean(.mean))
ggplot()+
geom_line(data=bc_part_rate_historic, mapping=aes(ref_date, part_rate))+
geom_line(data=bc_part_rate, mapping = aes(ref_date, part_rate), colour="blue")For the participation rate for inter-provincial migrants, we use Statistics Canada table 14-10-0327-01 where we filter geo==“Canada”, gender==“Total-gender”, and age \(\in\) [15,64]. We create an ensemble forecast (ETS and TSLM) for the participation rate of working age Canadians.
canada_part_rate_historic <- can_bc_part_rates|>
clean_names()|>
filter(geo=="Canada",
gender=="Total - Gender",
age_group=="15 to 64 years",
labour_force_characteristics=="Participation rate"
)|>
mutate(part_rate=value/100,
ref_date=as.numeric(ref_date))|>
select(ref_date, part_rate)|>
as_tsibble(index=ref_date)
canada_part_rate <- canada_part_rate_historic|>
model(ets=ETS(part_rate~trend()),
tslm=TSLM(part_rate~trend()))|>
forecast(h=forecast_horizon+1)|>
as_tibble()|>
group_by(ref_date)|>
summarize(part_rate=mean(.mean))
ggplot()+
geom_line(data=canada_part_rate_historic, mapping=aes(ref_date, part_rate))+
geom_line(data=canada_part_rate, mapping = aes(ref_date, part_rate), colour="blue")For the participation rates for the two streams (Permanent and Non-permanent residents) of immigrant labour we use Statistics Canada table 98-10-0446-01. This is 2021 census data (not LFS time series), so the participation rates are constant over the forecast horizon. We filter geo==“British Columbia”, Immigrant status and period of immigration \(\in\) (2016-2021, Non-permanent residents), highest certificate==“Total - Highest certificate, diploma or degree”, age \(\in\) (15-24, 25-64), Gender==“Total”, Visible minority==“Total”, Statistics==“Count”. For permanent residents, we only consider those who landed between 2016-2021. Because the working age group is split into young/old age brackets, we need to aggregate before calculating working age (15-64) participation rates.
immigrant_part_rate <- open_dataset(here("data", "supply_side", "98100446_parquet"))|>
dplyr::filter(GEO == "British Columbia",
`Immigrant status and period of immigration (11)`%in% c("2016 to 2021","Non-permanent residents"),
`Highest certificate, diploma or degree (7)`=="Total - Highest certificate, diploma or degree",
`Age (15A)`%in% c("15 to 24 years","25 to 64 years"),
`Gender (3)`=="Total - Gender",
`Visible minority (15)`=="Total - Visible minority",
`Statistics (3)`=="Count"
)|>
select(contains("status"), contains("age"))|>
collect()|>
clean_names()|>
pivot_longer(cols=-c("immigrant_status_and_period_of_immigration_11", "age_15a"))|>
rename(immigrant_class=contains("immigrant_status"))|>
mutate(value=as.numeric(value),
name=str_remove_all(name, "labour_force_status_8_"),
name=str_sub(name, end=-3))|>
group_by(immigrant_class, name)|>
summarize(value=sum(value))|>
pivot_wider(id_cols = immigrant_class)|>
mutate(immigrant_part_rate=(unemployed+employed)/(unemployed+employed+not_in_the_labour_force))|>
select(immigrant_class, immigrant_part_rate)
immigrant_part_rate$immigrant_class[1] <- "Permanent residents"
pr_part_rate <- immigrant_part_rate$immigrant_part_rate[immigrant_part_rate$immigrant_class=="Permanent residents"]
npr_part_rate <- immigrant_part_rate$immigrant_part_rate[immigrant_part_rate$immigrant_class=="Non-permanent residents"]
kable(immigrant_part_rate)| immigrant_class | immigrant_part_rate |
|---|---|
| Permanent residents | 0.7917766 |
| Non-permanent residents | 0.7481337 |
To get the flow of young workers entering the labour force, we multiply the forecasts of new working age British Columbians and BC’s working age (15-64) participation rate.
entrant_participants <- entrant_stream|>
full_join(bc_part_rate)|>
mutate(new_entrants=smoothed*part_rate)|>
select(ref_date, new_entrants)To get the flow of inter-provincial migrants entering the BC labour force, we multiply the forecasts of net-interprovincial migrants and Canada’s working age (15-64) participation rate.
inter_prov_participants <- inter_prov_stream|>
inner_join(canada_part_rate, by = join_by(ref_date))|>
mutate(net_new_interprovincial=interprovincial_people*part_rate)|>
select(ref_date, net_new_interprovincial)Regarding sources of new labour supply from outside Canada, one might worry about double counting additions when a new permanent resident was formerly a non-permanent resident. However, Statistics Canada notes that
Non-permanent residents, outflow refers to the number of work and study permit holders leaving Canada, along with their family members who are not Canadian citizens, landed immigrants (permanent residents) or non-permanent residents themselves. This also includes permit holders and asylum claimants, protected persons and related groups staying in Canada who have been granted permanent resident status. In this case, these persons will be added to the immigrant component. Asylum claimants and related groups who have been deported are also counted as outflows.
Thus, to get new permanent resident’s contribution to new supply, we subtract net emigration from new permanent residents (“immigrants”), and then multiply the difference by the permanent resident participation rate. Similarly, we multiply net non-permanent residents by the non-permanent participation rate. Finally, we add the two streams together.
immigrant_participants <- immigrant_stream|>
pivot_wider()|>
mutate(permanent=permanent_residents*pr_part_rate,
non_permanent=net_non_permanent_residents*npr_part_rate,
immigrant_participants=permanent+non_permanent)|>
select(ref_date, immigrant_participants)In order to “inject” the new supply into the labour market, we need to know where, historically, these sources of new supply typically land (in terms of occupations.)
For young workers, we make use of Statistics Canada Census table 98-10-0593-01, where we apply filters geo==“British Columbia”, labour force status==“Employed”, age==“15 to 19 years”, gender==“Total”, class of worker==“Employee”. We then calculate proportions for each of the NOCs, to give us an indication of first occupations for young British Columbians.
entrant_noc_props <- open_dataset(here("data",
"supply_side",
"98100593_parquet"))|>
dplyr::filter(GEO == "British Columbia",
`Statistics (3)`=="Count",
`Labour force status (3)`=="Employed",
`Age (15A)`=="15 to 19 years",
`Gender (3)`=="Total - Gender"
)|>
dplyr::select(noc_desc=`Occupation - Unit group - National Occupational Classification (NOC) 2021 (821A)`,
count=`Class of worker (7A):Total - Class of worker[1]`)|>
collect()|>
filter(str_detect(noc_desc, "\\b\\d{5}\\b"))|>
mutate(count=as.numeric(count),
noc=str_sub(noc_desc, 1, 5),
desc=str_sub(noc_desc, 7),
prop=count/sum(count, na.rm=TRUE)
)|>
select(noc, desc, prop)|>
arrange(desc(prop))For inter-provincial migrants, we make use of Statistics Canada Census table 98-10-0449-01, and apply filters geo==“Canada”, highest certificate==“Total”, age==“Total, gender==”Total”. We make use of the “Employed” counts to calculate proportions for each of the NOCs, to give us an indication of first occupations for inter-provincial migrants.
canada_noc_props <- open_dataset(here("data",
"supply_side",
"98100449_parquet"))|>
filter(GEO == "Canada",
`Highest certificate, diploma or degree (16)`=="Total - Highest certificate, diploma or degree",
`Age (15A)`=="Total - Age",
`Gender (3)`=="Total - Gender")|>
select(noc_desc=`Occupation - Unit group - National Occupational Classification (NOC) 2021 (821A)`,
count=`Labour force status (3):Employed[2]`
)|>
collect()|>
filter(str_detect(noc_desc, "\\b\\d{5}\\b"))|>
mutate(count=as.numeric(count),
noc=str_sub(noc_desc, 1, 5),
desc=str_sub(noc_desc, 7),
prop=count/sum(count, na.rm=TRUE)
)|>
select(noc, desc, prop)|>
arrange(desc(prop))For immigrants, we make use of Statistics Canada Census table 98-10-0316-01, and apply filters GEO==“Canada”, admission category==“total”, highest certificate==“total”, age==“total”, gender==“total”, period of immigration==“2016-2021”. We then calculate proportions for each of the NOCs, to give us an indication of first occupations for immigrants.
immigrant_noc_props <- open_dataset(here("data",
"supply_side",
"98100316_parquet"))|>
filter(GEO=="Canada",
`Admission category and applicant type (35)`=="Total - Admission category and applicant type",
`Highest certificate, diploma or degree (7)`=="Total - Highest certificate, diploma or degree",
`Age (8C)`=="Total - Age",
`Gender (3)`=="Total - Gender")|>
select(noc_desc=`Occupation - Unit group - National Occupational Classification (NOC) 2021 (821A)`,
count=`Period of immigration (8A):2016 to 2021[8]`)|>
collect()|>
filter(str_detect(noc_desc, "\\b\\d{5}\\b"))|>
mutate(count=as.numeric(count),
noc=str_sub(noc_desc, 1, 5),
desc=str_sub(noc_desc, 7),
prop=count/sum(count, na.rm=TRUE)
)|>
select(noc, desc, prop)|>
arrange(desc(prop))new_entrants <- crossing(entrant_participants, entrant_noc_props)|>
mutate(`New Entrants`=new_entrants*prop)|>
select(-prop, -desc, -new_entrants)|>
filter(ref_date>min(ref_date))
immigrants <- crossing(immigrant_participants, immigrant_noc_props)|>
mutate(`Net International In-Migration`=immigrant_participants*prop)|>
select(-immigrant_participants, -prop, -desc)|>
filter(ref_date>min(ref_date))
interprovincial <- crossing(inter_prov_participants, canada_noc_props)|>
mutate(`Net Interregional In-Migration`=net_new_interprovincial*prop)|>
select(ref_date, noc, desc, `Net Interregional In-Migration`)|>
filter(ref_date>min(ref_date))by_teer <- alluvium_prep(2, teer_names)
by_broad <- alluvium_prep(1, broad_names)alluvium_plot(by_teer)alluvium_plot(by_broad)From the above alluvium plots it should? be obvious that a disproportionate share of new supply is being “injected” into the labour market in TEER 5 Sales and services occupations, necessitating inter-occupational mobility. The first step to derive inter-occupational mobility is to calculate supply_no_mobility, which is the sum of the 3 streams of new supply.
supply_no_mobility <- full_join(immigrants, new_entrants, by = join_by(ref_date, noc))|>
full_join(interprovincial, by = join_by(ref_date, noc))|>
mutate(across(where(is.numeric), ~ replace_na(.x, 0)))|>
select(ref_date, noc, desc, everything())|>
mutate(supply_no_mobility=`Net International In-Migration`+`New Entrants`+`Net Interregional In-Migration`)Inter-occupational mobility is driven by the difference between supply_no_mobility and demand, so lets load our demand forecast.
demand <- read_rds(here("out","richs_forecast.rds"))|>
group_by(noc_5, year)|>
summarize(expansion_demand=sum(expansion_demand, na.rm=TRUE),
replacement_demand=sum(replacement_demand, na.rm=TRUE))|>
filter(year>min(year))|>
rename(ref_date=year)|>
mutate(`Job Openings`=expansion_demand+replacement_demand)|>
select(noc=noc_5, ref_date, `Job Openings`)We assume that inter-occupational mobility balances supply and demand to the extent possible. Aggregate supply and aggregate demand will never be exactly equal, so we normalize demand so that it is exactly equal to supply. Inter-occupational mobility is the difference between normalized demand and supply_no_mobility. Supply is the sum of supply_no_mobility and inter-occupational mobility. Finally, the residual is the difference between demand and supply, assumed to be balanced by changes to unemployment and participation among working age British Columbians.
supply_and_demand <- full_join(supply_no_mobility, demand)|>
mutate(normalized_demand=`Job Openings`*sum(supply_no_mobility, na.rm=TRUE)/sum(`Job Openings`, na.rm=TRUE),
interoccupation_mobility=normalized_demand-supply_no_mobility,
`Total Supply Change`=supply_no_mobility+interoccupation_mobility,
`Change in labour market participation`=`Job Openings`-`Total Supply Change`
)
my_dt(supply_and_demand) Note there are a couple definition mis-matches in the table below.
Rich: Total Supply Change just covers the 3 streams of new workers modeled above, it does not include Change in labour market participation:
\[ \mbox{Total Supply Change} = \mbox{New Entrants} + \mbox{Net International In-Migration} + \mbox{Net Interregional In-Migration} \]
\[ \mbox{Job Openings} = \mbox{Total Supply Change} + \mbox{Change in labour market participation} \]
I.e. any imbalance between the demand and the supply model is balanced by Change in labour market participation, conflating changes in participation and unemployment.
Stokes: Total Supply Change does include Change in labour market participation, but nets out changes in Unemployment.
\[ \begin{aligned} \text{Total Supply Change} =\;& \text{New Entrants} \\ &+ \text{Net International In-Migration} \\ &+ \text{Net Interregional In-Migration} \\ &+ \text{Change in labour market participation} \\ &- \text{Change in Unemployment} \end{aligned} \]
\[ \mbox{Job Openings} = \mbox{Total Supply Change} + \mbox{Change in Unemployment} \]
sd5 <- read_lengthen_clean("supply_demand.xlsx")
stokes_total <- sd5%>%
filter(noc=="#T",
industry=="All Industries",
geographic_area=="British Columbia",
variable %in% c("Job Openings",
"New Entrants",
"Net International In-Migration",
"Net Interregional In-Migration",
"Net Other Mobility",
"Total Supply Change"),
year>report_year)%>%
group_by(name=variable)%>%
summarise(value=sum(value))|>
pivot_wider()|>
mutate(`Decline in unemployment`=`Job Openings`-`Total Supply Change`,
`Change in labour market participation`=`Net Other Mobility`+`Decline in unemployment`)|>
pivot_longer(cols=everything(), values_to = "stokes")
rich_total <- supply_and_demand|>
summarise(across(-c(ref_date, noc, desc), ~ sum(.x, na.rm = TRUE)))|>
pivot_longer(cols=everything(), values_to = "rich")
inner_join(rich_total, stokes_total)|>
DT::datatable(rownames = FALSE)|>
DT::formatRound(c("rich", "stokes"), 2)