
Why is skill data important?

  1. Career Planning: Job seekers can use this information to understand the skills needed for different careers and assess whether their current skill set aligns with their desired occupation.

  2. Workforce Development: Educators and training providers can use skill data to design relevant curricula and programs that address the specific skill needs of industries and occupations.

  3. Employee Training: Employers can identify the skills required for specific job roles and develop targeted training programs to enhance the skills of their workforce.

  4. Labor Market Analysis: Policymakers and researchers can use skill data to analyze the labor market and identify trends in demand and supply.

In a nutshell, we should care about skills because labour is not fungible: we all differ in the skills, knowledge and experience we possess. Likewise, employers are not simply looking for someone with a pulse: for every position there is a unique bundle of skills, knowledge and experience that would maximize the value of the match. The better the match, the greater the surplus that can be shared between the employer and the employee.

Outline:

We attempt to answer two questions:

  1. How do BC’s future skill needs differ from the present?
  2. What is the relationship between skills and wages?

O*NET Skills

The Occupational Information Network (O*NET) is a comprehensive database developed by the U.S. Department of Labor that provides detailed information about various occupations. One of the key components of O*NET is the Skills section, which categorizes the skills required for different job roles. O*NET’s Skills data includes a wide range of skills that are relevant to specific occupations, helping individuals, employers, educators, and policymakers understand the skill requirements of different jobs and industries.

O*NET Skills data is organized into several categories:

  1. Basic Skills: These are fundamental skills that are applicable to many different jobs. Examples include reading comprehension, active listening, writing, mathematics, and critical thinking.

  2. Cross-Functional Skills: These skills are more specific than basic skills and may be applicable across various occupations. They include skills like communication, teamwork, problem-solving, adaptability, and creativity.

  3. Technical Skills: These skills are job-specific and relate directly to the tasks and functions of a particular occupation. Technical skills can encompass a wide range of abilities, such as programming, operating machinery, using specialized software, and conducting scientific research.

  4. Job-Related Skills: These skills are directly related to the tasks and responsibilities of a specific job. For instance, a surgeon would require skills like surgical procedures, medical diagnosis, and patient care.

  5. Tools and Technology Skills: This category lists the tools, equipment, and technology that individuals in a particular occupation are expected to use. It includes skills related to operating machinery, software programs, and specialized instruments.

  6. Knowledge Areas: While not skills per se, knowledge areas are also covered in O*NET data. These include subjects like mathematics and science that are essential for performing various job functions.

O*NET skill data is collected through a combination of methods, including expert analysis, job analysis surveys, occupational research, and data aggregation. The process involves input from various stakeholders, including industry experts, workers, employers, and training providers. Here’s an overview of how O*NET skill data is collected:

  1. Job Analysis Surveys: Job analysis surveys are a primary method for collecting data on the skills required for different occupations. O*NET relies on surveys administered to workers, supervisors, and subject matter experts who are familiar with the tasks, responsibilities, and skill requirements of specific jobs. These surveys gather information about the skills that are critical to performing job tasks effectively.

  2. Subject Matter Experts (SMEs): O*NET collaborates with subject matter experts who have in-depth knowledge and experience in various industries and occupations. These experts provide valuable insights into the skills that are most relevant and important for specific job roles.

  3. Occupational Research: O*NET researchers study various industries and occupations to identify emerging trends, technological advancements, and changes in skill requirements. This research helps ensure that O*NET’s skill data remains current and aligned with the evolving demands of the labor market.

  4. Existing Data Sources: O*NET also aggregates information from existing sources, such as industry reports, job postings, and training programs. This data helps supplement the skill information gathered through surveys and expert input.

Job Analysis Survey

Workers, supervisors and subject matter experts are asked to rate both the importance and the level of each of the 35 skills.

Future skill needs

The future demand for skills may differ from the current demand for skills in 3 ways:

  1. The skill profile of an occupation may change over time.
  2. The set of occupations may change over time (e.g. a new occupation is introduced).
  3. The occupation mix (i.e. proportion of people in each occupation) may change over time.

There is not much that we can do to predict skill changes due to 1) or 2), but we can investigate 3) using data from the BC Labour Market Outlook (LMO), which forecasts job openings over the next 10 years. In what follows, we take the skill data for the roughly 500 occupations and compute a weighted average level for each skill. The weights are employment for the current year, and forecast job openings for each of the following 10 years.

In the plot above it appears that the average skill profile is changing rapidly over time for many skills, with most skills showing an increase in their weighted average level. However, note the scale of the y-axis. Here is another view in which all the plots share the same y-axis, which includes the origin.

The stability in the skill profile over time is due to the fact that current employment and future job openings are highly correlated, implying that the weights are not changing much over time.
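
The weighting described above can be sketched in a few lines of Python. This is only an illustration: the occupation labels and all numbers below are made up, and `weighted_mean` is a hypothetical helper, not the code used to produce the plots. The toy job openings are chosen to be roughly proportional to current employment, so the weighted average barely moves.

```python
# Hypothetical toy data: one skill's level per occupation, plus two sets of weights.
# Occupation labels and all numbers are invented for illustration.
skill_level = {"A": 4.0, "B": 2.0, "C": 3.0}
employment_now = {"A": 100, "B": 300, "C": 100}   # weights for the current year
openings_next = {"A": 22, "B": 58, "C": 20}       # weights for one forecast year


def weighted_mean(levels, weights):
    """Weighted average of skill levels across occupations."""
    total = sum(weights[occ] for occ in levels)
    return sum(levels[occ] * weights[occ] for occ in levels) / total


current = weighted_mean(skill_level, employment_now)
future = weighted_mean(skill_level, openings_next)
print(round(current, 2), round(future, 2))
```

Because the two sets of weights have nearly the same proportions, the two averages are close, which is the stability described above.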

Relationship between skill level and wage rate (income for now; wages to come in November)

One way to think about workers is that we are just a bundle of attributes (skills, knowledge, experience) that employers want to rent. If the goal of a job seeker is to maximize their wages, then it makes sense to invest in the skills that are most highly valued by employers. First, we plot the distribution of skill levels:

From this we can see that skill levels differ substantially in both their mean and their dispersion. Next, let’s look at how these skill levels are related to income.

Takeaways:

  1. There are two main clusters of skill levels, which can be broadly characterized by whether their mean value is below or above 2.
    • High skill levels have a fairly consistent colour of green: these skill levels are high across most occupations (generic skills)
    • Low skill levels have a more variable blueish colour: these skill levels are lower in general, but are high for certain occupations. (specific skills)
  2. There appears to be a relationship between skill levels and income:
    • The occupations are organized by income, with high income to the left, where skill levels are higher as well.

The above suggests that we might want to look more closely at the pairwise correlations between the skill levels and income. In the plot below the skill levels are organized in terms of their correlation with income.

Takeaways:

  1. Consistent with the previous heatmap of the raw data, we see two distinct groups of skill levels, where scores are highly correlated within a group (yellow), but negatively correlated between groups (blue).
  2. The generic skills are more highly correlated with income than the specific skills.
  3. Certain skill measures might be redundant: e.g. the correlation between repairing and equipment maintenance is 0.99.
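
The ordering by correlation with income, and the check for near-redundant skill pairs, can be sketched as follows. The data are invented, the 0.95 threshold is an arbitrary assumption, and `pearson` is a small helper written for this example rather than the code behind the plot.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: one entry per occupation (all values are made up).
income = [40, 55, 70, 90, 120]
skills = {
    "Critical Thinking": [2.0, 2.5, 3.0, 3.5, 4.0],
    "Repairing": [3.0, 1.0, 2.5, 0.5, 0.0],
    "Equipment Maintenance": [3.1, 1.1, 2.4, 0.6, 0.1],
}

# Rank skills by their correlation with income, highest first.
by_income = sorted(skills, key=lambda s: pearson(skills[s], income), reverse=True)

# Flag pairs of skills whose levels are almost perfectly correlated.
names = list(skills)
redundant = [
    (a, b)
    for i, a in enumerate(names)
    for b in names[i + 1:]
    if abs(pearson(skills[a], skills[b])) > 0.95
]
print(by_income[0], redundant)
```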

Hedonic estimation:

If we are willing to assume that skill interactions (see note 1) are not an important determinant of wages, we can use hedonic estimation to estimate the wage semi-elasticities with the model:

\[\log(wage)=\beta_0+\sum^{35}_{i=1}\beta_i \times level_i + \epsilon\]

i.e. holding all other skills constant, a 1 unit increase in skill level \(i\) is associated with a \(100\times (e^{\hat{\beta}_i}-1)\)% change in wage rate, where \(\hat{\beta}_i\) is the estimate of \(\beta_i\). In order to interpret these as causal effects, one must be willing to assume that for a given occupation the associated skill profile is “as good as” randomly assigned, which precludes:

  1. Confounding variables that influence both skills and wages.
  2. Reverse causality, where the wage rate of an occupation influences its skill profile.
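
As a small worked example of the semi-elasticity formula above, the conversion from an estimated coefficient to a percent change in wage can be written as below. The coefficient values 0.10 and -0.10 are chosen purely for illustration, and `percent_change` is a hypothetical helper name.

```python
from math import exp


def percent_change(beta_hat):
    """Percent change in wage associated with a 1 unit increase in a skill level."""
    return 100 * (exp(beta_hat) - 1)


# Illustrative coefficients only: note the asymmetry around zero.
up = percent_change(0.10)      # about +10.52%
down = percent_change(-0.10)   # about -9.52%
print(round(up, 2), round(down, 2))
```

For small coefficients the percent change is close to 100 times the coefficient, which is why coefficients in a log-wage regression are often read directly as approximate percentage effects.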

Regression results

We run two models: first, an unweighted plain vanilla OLS regression, in which every occupation is equally important; and second, a weighted least squares (WLS) regression, where the weights are the current employment levels. WLS puts a large weight on occupations that employ many people, and a small weight on occupations that employ few.
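
To make the difference between the two estimators concrete, here is a minimal sketch with a single skill as the only regressor (the models reported here use all 35 skills). The data are invented and `fit_line` is a hypothetical helper; passing no weights gives OLS, passing employment gives WLS.

```python
def fit_line(x, y, w=None):
    """Least squares fit of y = b0 + b1 * x, optionally weighted by w."""
    if w is None:
        w = [1.0] * len(x)              # equal weights reproduce plain OLS
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1       # (intercept, slope)


# Hypothetical data: one skill level, log wage and employment per occupation.
level = [1.0, 2.0, 3.0, 4.0]
log_wage = [10.0, 10.2, 10.3, 11.0]
employment = [5000, 4000, 3000, 50]     # the last occupation is very small

ols = fit_line(level, log_wage)
wls = fit_line(level, log_wage, employment)
print(round(ols[1], 3), round(wls[1], 3))
```

In this toy data the small, highly paid occupation pulls the OLS slope up; WLS largely ignores it because so few people work in it.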

Regression Results
                                   OLS               WLS
(Intercept)                        9.20*** (0.18)    8.85*** (0.16)
Active Learning                    -0.02 (0.05)      -0.07 (0.04)
Active Listening                   0.09 (0.05)       0.09 (0.05)
Complex Problem Solving            0.12* (0.05)      0.12* (0.05)
Coordination                       0.07 (0.04)       0.10** (0.04)
Critical Thinking                  0.01 (0.05)       0.01 (0.05)
Equipment Maintenance              -0.05 (0.05)      -0.02 (0.05)
Equipment Selection                -0.11*** (0.03)   -0.10*** (0.02)
Installation                       0.03 (0.02)       0.00 (0.02)
Instructing                        0.10* (0.04)      0.10** (0.04)
Judgment and Decision Making       0.03 (0.05)       0.11* (0.04)
Learning Strategies                -0.07 (0.04)      0.01 (0.04)
Management of Financial Resources  -0.03 (0.03)      0.01 (0.03)
Management of Material Resources   0.00 (0.03)       -0.04 (0.03)
Management of Personnel Resources  0.01 (0.04)       -0.05 (0.03)
Mathematics                        0.01 (0.02)       0.01 (0.02)
Monitoring                         0.04 (0.05)       0.03 (0.04)
Negotiation                        0.09* (0.04)      0.05 (0.04)
Operation and Control              0.04 (0.02)       0.05* (0.02)
Operations Analysis                -0.01 (0.02)      -0.02 (0.02)
Operations Monitoring              0.06 (0.03)       0.08** (0.03)
Persuasion                         -0.08 (0.05)      -0.10** (0.04)
Programming                        0.05* (0.02)      0.06*** (0.02)
Quality Control Analysis           -0.10*** (0.02)   -0.09*** (0.02)
Reading Comprehension              0.08 (0.05)       0.03 (0.05)
Repairing                          0.10 (0.05)       0.11* (0.05)
Science                            0.02 (0.01)       0.02 (0.01)
Service Orientation                -0.09** (0.03)    -0.09** (0.03)
Social Perceptiveness              -0.08* (0.04)     -0.00 (0.04)
Speaking                           -0.05 (0.05)      -0.06 (0.05)
Systems Analysis                   0.07 (0.04)       -0.01 (0.04)
Systems Evaluation                 -0.08 (0.05)      -0.09 (0.05)
Technology Design                  -0.02 (0.03)      0.03 (0.03)
Time Management                    0.13** (0.05)     0.13** (0.04)
Troubleshooting                    0.04 (0.03)       0.02 (0.03)
Writing                            -0.05 (0.04)      0.03 (0.04)
R2                                 0.57              0.72
Adj. R2                            0.54              0.70
Num. obs.                          497               497
Standard errors in parentheses. ***p < 0.001; **p < 0.01; *p < 0.05

Diagnostic plots of weighted regression

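Breusch-Pagan test for heteroskedasticity on the weighted regression: BP = 0.11733, df = 35, p-value = 1. The test does not reject the null hypothesis of constant error variance.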

Effect size

Below are the effect sizes, with the regression p-values determining the shade (darker means a lower p-value).

Residuals analysis:

Skill is not the only factor that determines wages, and any omitted factors get captured in the residual. Note that the adjusted \(R^2=\) 0.7 for the weighted regression, implying 30% of the variation in wage is not explained by skill.
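
As a quick arithmetic check, the adjusted \(R^2\) values in the regression table can be recovered from the reported \(R^2\), the 497 observations and the 35 skill regressors, using the standard adjustment formula. The inputs are the rounded values from the table, so the match is only to two decimals; `adjusted_r2` is a helper name introduced here.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Standard adjusted R-squared: penalizes R-squared for the number of regressors."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


adj_ols = adjusted_r2(0.57, 497, 35)
adj_wls = adjusted_r2(0.72, 497, 35)
unexplained_wls = 1 - adj_wls          # share of variation not explained by skill
print(round(adj_ols, 2), round(adj_wls, 2), round(unexplained_wls, 2))
```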

One interpretation of the residuals is that they capture what are known as compensating wage differentials (Smith 1776). Unpleasant or dangerous jobs require a greater wage than skill alone would predict, with the opposite occurring for fun jobs. The next plot shows the occupations with the largest absolute residuals from the weighted regression.

Another possibility is that educational attainment influences wages even once we control for skill: i.e. the piece of paper matters. To investigate this possibility we plot the residuals from the skill regression vs TEER.

There is no apparent relationship between TEER and the residuals for TEERs 1-5, but TEER 0 is an outlier: wages in TEER 0 occupations are systematically higher than their skill profiles would suggest. Some possible explanations:

Skill interactions:

As alluded to above, the hedonic analysis assumes that skill interactions are not important: it only captures the direct effect of skill changes. One could easily imagine complementarities between skills: e.g. increasing two skills in tandem yields a greater increase in wage than if the skills increased individually. To allow for these types of interactions we move to a new modeling technique, decision trees.

Conditional inference trees

The conditional inference tree algorithm used below is a statistical machine learning method for building decision trees. It is designed to create unbiased decision trees by using statistical significance as the decision criterion. Here’s how the algorithm works:

  1. Select a splitting variable: The algorithm chooses the skill that is most closely related to income.

  2. Split the data: The algorithm splits the data (by the variable above) in such a way that maximizes the differences in wages between the two resulting groups.

  3. Test the Split: The algorithm then tests if the split is statistically significant. In other words, it checks if the division of the data into these groups provides meaningful information for making predictions.

  4. Repeat the Process: If the split is significant, the algorithm continues to divide the data further, selecting the “best” variable to split on at each step. This process is repeated recursively until it decides that further splits are not significantly different.
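
The four steps above can be mirrored in a simplified sketch of a single splitting step. This is an assumption-laden illustration, not the exact procedure of any particular library: it uses absolute correlation to pick the variable and a permutation test on the best achievable gap in mean wage as the significance check. The helper names, the toy data, `alpha` and `n_perm` are all hypothetical.

```python
import random
from math import sqrt


def abs_corr(x, y):
    """Absolute Pearson correlation between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return abs(cov / (sx * sy))


def best_split(x, y):
    """Threshold on x that maximizes the gap in mean y between the two groups."""
    best_gap, best_cut = 0.0, None
    for cut in sorted(set(x))[:-1]:               # candidate thresholds
        left = [yi for xi, yi in zip(x, y) if xi <= cut]
        right = [yi for xi, yi in zip(x, y) if xi > cut]
        gap = abs(sum(left) / len(left) - sum(right) / len(right))
        if gap > best_gap:
            best_gap, best_cut = gap, cut
    return best_gap, best_cut


def grow_one_step(skills, wage, alpha=0.05, n_perm=500, seed=1):
    rng = random.Random(seed)
    # Step 1: pick the skill most closely related to wage.
    name = max(skills, key=lambda s: abs_corr(skills[s], wage))
    # Step 2: split on that skill to maximize the difference in mean wage.
    gap, cut = best_split(skills[name], wage)
    # Step 3: permutation test. Shuffle wages and redo the split search each time.
    shuffled = list(wage)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if best_split(skills[name], shuffled)[0] >= gap:
            hits += 1
    p_value = (hits + 1) / (n_perm + 1)
    # Step 4: the caller would recurse into each group only if the split is significant.
    return name, cut, p_value, p_value < alpha


# Hypothetical data: skill levels and wages for ten occupations (values are made up).
skills = {
    "Complex Problem Solving": [1, 1, 2, 2, 3, 3, 4, 4, 5, 5],
    "Installation": [2, 4, 1, 5, 3, 3, 5, 1, 4, 2],
}
wage = [20, 22, 21, 23, 30, 32, 40, 42, 41, 43]
name, cut, p_value, significant = grow_one_step(skills, wage)
print(name, cut, significant)
```

Re-running the split search inside each permutation matters: comparing the observed best gap against the best gap under shuffled wages accounts for the fact that the threshold was chosen to look as good as possible.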

Suppose that we had only two explanatory variables, \(X1\) and \(X2\), and the response is \(R\). A decision tree might look something like this:

The fact that decision trees allow for interactions between the explanatory variables can be seen most easily in the 3-D representation of the model:
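
The same point can be made in code. A fitted tree is just a set of nested conditions, and the sketch below uses a made-up tree (the thresholds and leaf values are invented) to show that the effect of raising \(X2\) depends on the value of \(X1\).

```python
def tree_predict(x1, x2):
    """Hypothetical two-variable tree; thresholds and leaf values are made up."""
    if x1 <= 0.5:
        return 10.0          # x2 is irrelevant in this branch
    if x2 <= 0.5:
        return 12.0
    return 20.0              # high x1 and high x2 together


# The effect of raising x2 from 0.1 to 0.9 depends on x1: that is an interaction.
effect_low_x1 = tree_predict(0.2, 0.9) - tree_predict(0.2, 0.1)
effect_high_x1 = tree_predict(0.9, 0.9) - tree_predict(0.9, 0.1)
print(effect_low_x1, effect_high_x1)
```

An additive model such as the hedonic regression would force these two effects to be equal.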

The key idea behind the conditional inference tree algorithm is that it uses statistical tests at each step to ensure that the splits are based on meaningful patterns in the data. This leads to more accurate decision trees, with less bias and less variance (over-fitting). Here is a conditional inference tree based on our data:

Takeaways:

  1. Complex problem solving is the skill that is the “best” predictor of wage.
  2. Operations monitoring, systems analysis, and operation and control are also significant predictors of wage.

A problem with decision trees is that they are not very robust: a small change in the data can lead to a very different tree. To overcome this issue, we move on to our next modelling technique:

Random forests

Rather than utilizing a single decision tree, we can create a forest of trees. For each tree in the forest, we introduce some randomness:

  1. Each tree is based on a different random subset of occupations.
  2. For each decision node, only a random subset of skills are considered.

Once we have created a forest of trees we can use this “ensemble” to predict wages. Averaging across many trees reduces the variance compared to a single decision tree. One of the outputs of a random forest is a measure of variable importance: how important each skill was (on average) in predicting wage.
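
The two sources of randomness and the averaging step can be sketched with a deliberately tiny forest. Several simplifications are assumptions made for brevity: each "tree" is a single split (a stump), the random subset of occupations is drawn as a bootstrap sample, and variable importance is not computed. The data and the function names are hypothetical.

```python
import random


def fit_stump(rows, wage, features):
    """Best single split (feature, cut, left mean, right mean) by squared error."""
    best = None
    for f in features:
        for cut in sorted({r[f] for r in rows})[:-1]:
            left = [w for r, w in zip(rows, wage) if r[f] <= cut]
            right = [w for r, w in zip(rows, wage) if r[f] > cut]
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((w - ml) ** 2 for w in left) + sum((w - mr) ** 2 for w in right)
            if best is None or sse < best[0]:
                best = (sse, f, cut, ml, mr)
    if best is None:                     # no valid split: predict the sample mean
        m = sum(wage) / len(wage)
        return (None, None, m, m)
    return best[1:]


def fit_forest(rows, wage, n_trees=50, n_features=1, seed=0):
    rng = random.Random(seed)
    names = sorted(rows[0])
    forest = []
    for _ in range(n_trees):
        # Randomness 1: each tree sees a bootstrap sample of occupations.
        idx = [rng.randrange(len(rows)) for _ in rows]
        # Randomness 2: only a random subset of skills is considered at the node.
        feats = rng.sample(names, n_features)
        forest.append(fit_stump([rows[i] for i in idx], [wage[i] for i in idx], feats))
    return forest


def predict(forest, row):
    """Average the predictions of all trees in the ensemble."""
    preds = [ml if f is None or row[f] <= cut else mr for f, cut, ml, mr in forest]
    return sum(preds) / len(preds)


# Hypothetical data: two skills and a wage for six occupations (values are made up).
rows = [
    {"Programming": 1, "Repairing": 4},
    {"Programming": 2, "Repairing": 1},
    {"Programming": 1, "Repairing": 2},
    {"Programming": 4, "Repairing": 3},
    {"Programming": 5, "Repairing": 1},
    {"Programming": 4, "Repairing": 4},
]
wage = [20, 24, 21, 40, 45, 41]
forest = fit_forest(rows, wage)
low = predict(forest, {"Programming": 1, "Repairing": 2})
high = predict(forest, {"Programming": 5, "Repairing": 2})
print(round(low, 1), round(high, 1))
```

Any individual stump here is a crude predictor, and different bootstrap samples give different stumps; the ensemble prediction is smoother because it averages over all of them.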

References

Schmidt, Amand F, and Chris Finan. 2018. “Linear Regression and the Normality Assumption.” Journal of Clinical Epidemiology 98: 146–51.
Smith, Adam. 1776. An Inquiry into the Nature and Causes of the Wealth of Nations: Volume One. London: W. Strahan and T. Cadell.

  1. An interaction occurs if, for example, increasing two skills in tandem yields a greater increase in wage than if the skills increased individually.