Post Secondary Graduates vs. Job openings that require post secondary education.

Author

Richard Martin

Intro

The goal of this exercise is to compare LMO job openings (by NOC) to a forecast of the supply of new post secondary graduates (by CIP). Two subtleties:

Occupations differ in the proportion of jobs that require post secondary education. We deflate the LMO job openings by these proportions, which roughly align with the TEER (see Figure 1).
Job openings are by occupation, whereas post secondary graduates are by field of study. We use historic proportions from the census CIP-NOC table to predict which occupations the graduates will end up in.
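
The two adjustments above can be sketched on toy data. This is a minimal base R illustration, not the code behind the analysis: every object, column name and number below (`openings`, `cip_noc`, `grads`, the CIP and NOC labels) is hypothetical, and defining excess demand as deflated openings minus allocated graduates is an assumption.

```r
# Toy data: hypothetical values, not taken from the LMO or the census.
openings <- data.frame(
  noc = c("NOC_A", "NOC_B"),
  job_openings = c(1000, 500),
  prop_post_sec = c(0.9, 0.4)  # share of workers with post secondary education
)

# Step 1: deflate job openings by the post secondary proportion.
openings$post_sec_openings <- openings$job_openings * openings$prop_post_sec

# Historic CIP-NOC proportions: for each CIP, the share of graduates
# who end up in each NOC (shares sum to 1 within a CIP).
cip_noc <- data.frame(
  cip  = c("CIP_1", "CIP_1", "CIP_2", "CIP_2"),
  noc  = c("NOC_A", "NOC_B", "NOC_A", "NOC_B"),
  prop = c(0.75, 0.25, 0.5, 0.5)
)
grads <- data.frame(cip = c("CIP_1", "CIP_2"), grads = c(400, 200))

# Step 2: allocate forecast graduates to occupations, then sum by NOC.
alloc <- merge(cip_noc, grads, by = "cip")
alloc$new_supply <- alloc$grads * alloc$prop
supply <- aggregate(new_supply ~ noc, data = alloc, FUN = sum)

# Compare deflated demand with allocated supply (assumed definition).
comparison <- merge(openings, supply, by = "noc")
comparison$excess_demand <- comparison$post_sec_openings - comparison$new_supply
print(comparison)
```

Because the shares sum to 1 within each CIP, the allocation step preserves the total number of graduates; only their distribution across occupations changes.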

library(ggplot2)
library(plotly)

# One point per NOC: total employment vs. employment with some post secondary
# education, on log-log axes. The `text` aesthetic feeds the plotly tooltip.
plt <- ggplot(counts_by_noc_post_sec, aes(total,
                                          post_secondary,
                                          colour = teer,
                                          text = paste0(
                                            "NOC: ",
                                            NOC,
                                            "\n Total Employed: ",
                                            scales::comma(total),
                                            "\n With Post Secondary Education: ",
                                            scales::comma(post_secondary),
                                            "\n % with Post Secondary Education: ",
                                            scales::percent(prop_post_sec, accuracy = .1)))) +
  # White reference line where post_secondary equals total (100%).
  geom_abline(slope = 1, intercept = 0, colour = "white", lwd = 2) +
  scale_x_continuous(trans = "log10", labels = scales::comma) +
  scale_y_continuous(trans = "log10", labels = scales::comma) +
  geom_point(alpha = .75) +
  labs(title = "Canadian Employment counts by NOC and Post secondary Education",
       x = "Total",
       y = "Some post secondary education")

ggplotly(plt, tooltip="text")

Figure 1: Points on the white diagonal are occupations where 100% of workers have some post secondary education; the further a point lies below the diagonal, the lower the proportion of workers with post secondary education.

Our forecast for graduates in year \(t\) is

\[graduates_t=base_{\bar{t}} \times (1+CAGR)^{(t-\bar{t})}\]

In words, we take a base level of graduates and let it grow (or shrink) at a constant compound annual growth rate \(CAGR\). Note that the post secondary completion data is quite noisy at the field of study level, especially for more niche fields. To mitigate the effect of this noise, we

average across the most recent 5 years of data (2017:2021) to derive the base level at time \(\bar{t}=2019\), and
we use a weighted average for the growth rate over the most recent 5 years of data.

Specifically, the weighted average puts 90% of the weight on the (relatively stable) aggregate growth rate and the remaining 10% on the noisy field of study growth rate, i.e. we shrink the noisy growth rates towards the overall growth rate.
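
Written out, the weighting described above is as follows; the subscripts \(aggregate\) and \(field\) are labels introduced here for the two growth rates, not notation from the analysis code:

\[CAGR=0.9\times CAGR_{aggregate}+0.1\times CAGR_{field}\]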

Example:

Suppose that we had the following data for two hypothetical CIPs, which we want to forecast out to the midpoint of the LMO forecast, the year 2029.

kableExtra::kable(tibble::tibble(year = 2017:2021, grads = seq(100, 500, 100)), caption = "CIP1")
kableExtra::kable(tibble::tibble(year = 2017:2020, grads = seq(400, 100, -100)), caption = "CIP2")

CIP1
year	grads
2017	100
2018	200
2019	300
2020	400
2021	500

CIP2
year	grads
2017	400
2018	300
2019	200
2020	100

CIP1

\[\begin{align} & mean~grads=300\\ & mean~year=2019\\ & raw~cagr=\left(\frac{500}{100}\right)^{\frac{1}{4}}-1=50\%\\ \end{align}\]

CIP2

\[\begin{align} & mean~grads=250\\ &mean~year=2018.5\\ &raw~cagr=\left(\frac{100}{400}\right)^{\frac{1}{3}}-1=-37\%\\ \end{align}\]

If these were the only two CIPs, the aggregate growth rate would be \(mean~cagr=0\%\): the total number of graduates is constant at 500 in every year.

Shrunken cagrs:

CIP1: \(cagr=.9\times 0\%+.1\times 50\%=5\%\)

CIP2: \(cagr=.9\times 0\%+.1\times (-37\%)=-3.7\%\)

Forecasts:

CIP1: \(grads_{2029}=300\times 1.05^{(2029-2019)}=300\times 1.05^{10}\approx 490\)

CIP2: \(grads_{2029}=250\times .963^{(2029-2018.5)}=250\times .963^{10.5}\approx 170\)
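
The worked example can be checked with a few lines of base R. This is a sketch under stated assumptions, not the code used for the forecast: `forecast_grads()` and its arguments are hypothetical, the raw CAGR is taken from the first and last observations as in the example, and the aggregate growth rate is set to 0% as above.

```r
# Hypothetical helper that mirrors the worked example; not the author's code.
forecast_grads <- function(year, grads, agg_cagr, target_year, weight_agg = 0.9) {
  base <- mean(grads)      # base level: average over the available years
  base_year <- mean(year)  # point in time the base level refers to
  n <- length(grads)
  # Raw field-level CAGR from the first to the last observation.
  raw_cagr <- (grads[n] / grads[1])^(1 / (year[n] - year[1])) - 1
  # Shrink the noisy field-level rate towards the aggregate rate.
  cagr <- weight_agg * agg_cagr + (1 - weight_agg) * raw_cagr
  base * (1 + cagr)^(target_year - base_year)
}

cip1 <- forecast_grads(2017:2021, seq(100, 500, 100), agg_cagr = 0, target_year = 2029)
cip2 <- forecast_grads(2017:2020, seq(400, 100, -100), agg_cagr = 0, target_year = 2029)
print(round(c(CIP1 = cip1, CIP2 = cip2)))
```

Because the helper carries the growth rates at full precision rather than rounding them to 5% and -3.7%, its results land a few graduates away from the rounded figures in the text.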

Aggregate imbalances for some broad occupational groups (over LMO 10 year horizon)

Note that in the following plots, we only include:

occupations where the 10 year excess demand or supply exceeds 200 and
occupations with TEERs 1, 2 or 3
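
The two inclusion rules above amount to a simple filter. A minimal base R sketch, assuming a hypothetical table with columns `teer` and `excess_demand` (negative values meaning excess supply); the names and values are illustrative and are not taken from the analysis:

```r
# Hypothetical table; column names and values are illustrative only.
imbalance <- data.frame(
  noc = c("NOC_A", "NOC_B", "NOC_C", "NOC_D"),
  teer = c(1, 2, 4, 3),
  excess_demand = c(1500, -150, 900, -450)  # negative = excess supply
)

# Keep TEER 1-3 occupations whose imbalance exceeds 200 in either direction.
imbalance_plotted <- subset(imbalance, teer %in% 1:3 & abs(excess_demand) > 200)
print(imbalance_plotted)
```

Here NOC_B drops out because its imbalance is too small and NOC_C because of its TEER.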

plt <- col_plot(2, "Natural and applied sciences and related occupations", imbalance_greater)
ggplotly(plt, tooltip = "text")

plt <- col_plot(3, "Health Occupations", imbalance_greater)+
  labs(x="Excess Supply                     |                                                           Excess Demand")
ggplotly(plt, tooltip = "text")

plt <- col_plot(4, "Education, law and social, community and government services", imbalance_greater)+
  labs(x="Excess Supply  |                              Excess Demand")
ggplotly(plt, tooltip = "text")

plt <- col_plot(7, "Trades, transport and equipment operators and related", imbalance_greater)+
  labs(x="Excess Supply        |                                    Excess Demand")
ggplotly(plt, tooltip = "text")

The data:

my_dt(tbbl)