Post Secondary Graduates vs. Job openings that require post secondary education.

Author

Richard Martin

Intro

The goal of this exercise is to compare LMO job openings (by NOC) to a forecast of the supply of new post secondary graduates (by CIP). A couple of subtleties:

  1. Occupations differ in the proportion of jobs that require post secondary education. We deflate the LMO job openings by these proportions (which roughly align with the TEER, see Figure 1).

  2. Job openings are by occupation, whereas post secondary graduates are by field of study. We make use of historic proportions based on the census CIP-NOC table to predict in which occupations the graduates will end up.
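The two adjustments above can be sketched in a few lines of base R. This is only an illustration: the occupations, fields of study, counts and proportions are made up, the object names are hypothetical, and the sign convention (positive means excess demand) is our assumption.

```r
# Hypothetical inputs: none of these numbers come from the LMO or the census.
job_openings  <- c(nurse = 1000, cook = 800)   # job openings by NOC
prop_post_sec <- c(nurse = 0.95, cook = 0.40)  # share of workers with post secondary education

# Subtlety 1: deflate openings to those expected to require post secondary education.
openings_post_sec <- job_openings * prop_post_sec

# Subtlety 2: allocate graduates (by CIP) to occupations (by NOC) using
# historic CIP-NOC proportions; each row sums to 1.
graduates <- c(health = 900, culinary = 300)
cip_noc <- matrix(c(0.9, 0.1,
                    0.2, 0.8),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(c("health", "culinary"), c("nurse", "cook")))
supply_by_noc <- drop(graduates %*% cip_noc)

# Assumed convention: positive = excess demand, negative = excess supply.
imbalance <- openings_post_sec - supply_by_noc[names(openings_post_sec)]
imbalance
```

In this toy case the deflated openings exceed the predicted supply for one occupation and fall short for the other, which is the kind of imbalance plotted later in the document.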

library(ggplot2) # ggplot(), aes(), geoms and scales
library(plotly)  # ggplotly()

# One point per NOC: total employment vs. employment with post secondary education.
plt <- ggplot(counts_by_noc_post_sec, aes(total,
                                          post_secondary,
                                          colour = teer,
                                          text=paste0(
                                            "NOC: ",
                                            NOC,
                                            "\n Total Employed: ",
                                            scales::comma(total),
                                            "\n With Post Secondary Education: ",
                                            scales::comma(post_secondary),
                                            "\n % with Post Secondary Education: ",
                                            scales::percent(prop_post_sec, accuracy = .1))))+
  # White 45-degree line: every worker has some post secondary education.
  geom_abline(slope = 1, intercept = 0, colour="white", lwd=2)+
  scale_x_continuous(trans="log10", labels=scales::comma)+
  scale_y_continuous(trans="log10", labels=scales::comma)+
  geom_point(alpha=.75)+
  labs(title="Canadian employment counts by NOC and post secondary education",
       x="Total",
       y="Some post secondary education")

# Convert to an interactive plot; the hover label is the `text` aesthetic.
ggplotly(plt, tooltip="text")
Figure 1: Points on the white diagonal are occupations where 100% of workers have some post secondary education; the further a point lies below the diagonal, the lower the proportion of workers with post secondary education.

Our forecast for graduates in year \(t\) is

\[graduates_t=base_{\bar{t}} \times (1+CAGR)^{(t-\bar{t})}\]

In words, we take a base level of graduates and let it grow (or shrink) at a constant rate \(CAGR\). Note that the post secondary completion data is quite noisy at the field of study level, especially for more niche fields. To mitigate the effect of this noise we

  1. average across the most recent 5 years of data (2017:2021) to derive the base level at time \(\bar{t}=2019\), and
  2. use a weighted average for the growth rate over the most recent 5 years of data.

Specifically, the weighted average puts 90% of the weight on the (relatively stable) aggregate growth rate, and the remaining 10% on the noisy field of study growth rate, i.e. we shrink the noisy growth rates towards the overall growth rate.
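Written as a formula, the weighting described above is as follows, where the subscripts \(aggregate\) and \(field\) are labels we introduce here for the overall and field of study growth rates:

\[CAGR=0.9\times CAGR_{aggregate}+0.1\times CAGR_{field}\]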

Example:

Suppose that we had the following data for two hypothetical CIPs that we want to forecast out to the midpoint of the LMO forecast, the year 2029.

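library(tibble) # tibble()
# Two hypothetical fields of study: CIP1 grows, CIP2 shrinks and has no 2021 observation.
kableExtra::kable(tibble(year=2017:2021, grads=seq(100, 500, 100)), caption = "CIP1")
kableExtra::kable(tibble(year=2017:2020, grads=seq(400,100, -100)), caption = "CIP2")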
CIP1
year  grads
2017  100
2018  200
2019  300
2020  400
2021  500

CIP2
year  grads
2017  400
2018  300
2019  200
2020  100

CIP1

\[\begin{align} & mean~grads=300\\ & mean~year=2019\\ & raw~cagr=\left(\frac{500}{100}\right)^{\frac{1}{4}}-1\approx 50\% \end{align}\]

CIP2

\[\begin{align} & mean~grads=250\\ & mean~year=2018.5\\ & raw~cagr=\left(\frac{100}{400}\right)^{\frac{1}{3}}-1\approx -37\% \end{align}\]

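If these were the only two CIPs, the aggregate growth rate would be \(mean~cagr=0\%\), because the total number of graduates is 500 in every year.

Shrunken cagrs (90% weight on the aggregate growth rate, 10% on the raw growth rate):

CIP1: \(cagr=.9\times 0\%+.1\times 50\%=5\%\)

CIP2: \(cagr=.9\times 0\%+.1\times (-37\%)=-3.7\%\)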

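Forecasts (the exponent is the number of years from the mean year to 2029):

CIP1: \(grads_{2029}=300\times 1.05^{(2029-2019)}=300\times 1.05^{10}\approx 490\)

CIP2: \(grads_{2029}=250\times .963^{(2029-2018.5)}=250\times .963^{10.5}\approx 170\)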
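The worked example can be reproduced with a small helper. This is a minimal sketch rather than the code behind the report: `forecast_grads()` and its arguments are hypothetical names, the raw growth rate is computed from the first and last observations as in the example, and the aggregate growth rate is passed in by hand (0% here).

```r
# Hypothetical helper (not the report's code): shrink a field's raw CAGR
# towards the aggregate CAGR, then project the base level forward.
forecast_grads <- function(years, grads, aggregate_cagr, target_year,
                           weight_field = 0.1) {
  n <- length(grads)
  base <- mean(grads)       # base level of graduates
  base_year <- mean(years)  # year to which the base level applies
  # Raw CAGR from the first and last observations.
  raw_cagr <- (grads[n] / grads[1])^(1 / (years[n] - years[1])) - 1
  # Weighted average: 90% aggregate, 10% field of study by default.
  cagr <- (1 - weight_field) * aggregate_cagr + weight_field * raw_cagr
  base * (1 + cagr)^(target_year - base_year)
}

cip1 <- forecast_grads(2017:2021, seq(100, 500, 100),
                       aggregate_cagr = 0, target_year = 2029)
cip2 <- forecast_grads(2017:2020, seq(400, 100, -100),
                       aggregate_cagr = 0, target_year = 2029)
round(c(CIP1 = cip1, CIP2 = cip2))
```

Because the helper does not round the growth rates before compounding, its results differ by a few graduates from the hand calculation above.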

Aggregate imbalances for some broad occupational groups (over the LMO 10 year horizon)

Note that the following plots only include:

  1. occupations where the 10 year excess demand or supply exceeds 200, and
  2. occupations with TEERs 1, 2 or 3.
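As a rough sketch of this filter in base R: the data frame, its column names and its values are hypothetical, and we assume imbalances are stored as a signed number where negative means excess supply.

```r
# Hypothetical imbalances; column names and values are illustrative only.
imbalances <- data.frame(
  noc = c("A", "B", "C", "D"),
  teer = c(1, 2, 4, 3),
  excess_demand = c(350, -150, 900, -420)  # negative = excess supply
)

min_imbalance <- 200  # threshold on the 10 year excess demand or supply

# Keep occupations with a large imbalance in either direction and TEER 1, 2 or 3.
shown <- subset(imbalances,
                abs(excess_demand) > min_imbalance & teer %in% 1:3)
shown
```

Taking the absolute value keeps both large shortages and large surpluses, so occupation B (small imbalance) and occupation C (TEER 4) are dropped.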
plt <- col_plot(2, "Natural and applied sciences and related occupations", imbalance_greater)
ggplotly(plt, tooltip = "text")
plt <- col_plot(3, "Health Occupations", imbalance_greater)+
  labs(x="Excess Supply                     |                                                           Excess Demand")
ggplotly(plt, tooltip = "text")
plt <- col_plot(4, "Education, law and social, community and government services", imbalance_greater)+
  labs(x="Excess Supply  |                              Excess Demand")
ggplotly(plt, tooltip = "text")
plt <- col_plot(7, "Trades, transport and equipment operators and related", imbalance_greater)+
  labs(x="Excess Supply        |                                    Excess Demand")
ggplotly(plt, tooltip = "text")

The data:

my_dt(tbbl)