2024-11-13

Introduction

The labor market in the US is one of the most complex and dynamic markets in the world. As we enter the “real world,” understanding it matters both for our careers and for being informed voters.

Table of Contents

  • Data Overview
  • Population Gender
  • Gender and Income
  • Education by Gender
  • Earnings by Education
  • Employment Earnings
  • Employment by Sex and Age

Data Overview

We use data from IPUMS (Integrated Public Use Microdata Series), which gives us a representative sample of the US population.

Data Overview: Key Variables and Descriptions
Variable | Description | Data Type
empstat | Employment status | Categorical
labforce | Labor force status | Categorical
occ2010 | Occupation (2010 basis) | Categorical
wkswork1 | Weeks worked last year | Numeric
hrswork1 | Hours worked last week | Numeric
uhrswork | Usual hours worked per week | Numeric
wksunemp | Weeks unemployed last year | Numeric
looking | Looking for work | Binary
availble | Available for work | Binary
ftotinc | Total family income | Numeric
incwage | Wage and salary income | Numeric
poverty | Poverty status | Binary
educ | Educational attainment | Categorical
degfield | Field of degree | Categorical
age | Age | Numeric
statefip | State FIPS code | Categorical
sex | Sex | Categorical
race | Race | Categorical
speakeng | Speaks English | Binary
citizen | Citizenship status | Categorical
yrsusa1 | Years in the United States | Numeric
marst | Marital status | Categorical
pwmetro | Place of work: metropolitan area | Categorical
trantime | Travel time to work | Numeric
year | Year | Numeric

Gender on a Population Level

Gender and Income

Education by Gender

Employment By Sex And Age

Earnings By Education

Employment Earnings

What gives?

Are there underlying factors that are driving the differences in income? We already saw that education does not explain the differences in income. Leaving the workforce also impacts earnings.

Compensation By Hours

Compensation By Hours Pt. 2

Top Ten Occupations with Highest Average Hours Worked
occ2010 | avg_hours_worked | total_workers
general and operations managers | 45.16918 | 1989
chief executives and legislators/public administration | 44.66231 | 2758
driver/sales workers and truck drivers | 43.46639 | 5730
managers in marketing, advertising, and public relations | 42.90429 | 2103
lawyers, and judges, magistrates, and other judicial workers | 42.71511 | 2247
managers, nec (including postmasters) | 42.45275 | 8902
financial managers | 42.40746 | 2257
first-line supervisors of sales workers | 42.15471 | 6000
sales representatives, wholesale and manufacturing | 41.36543 | 1922
software developers, applications and systems software | 41.04188 | 3249

Pay Per Hour

Labor Market In States (Fixed Effects)

Fixed Effects Model: Predicting Wages
Dependent variable: Wage

Term | Estimate (Std. Error)
Education Level | 10,609.170*** (68.676)
Gender | 15,759.470*** (270.836)
Usual Hours Worked | 1,623.007*** (10.492)

Observations | 256,540
R2 | 0.191
Adjusted R2 | 0.191
F Statistic | 20,214.690*** (df = 3; 256486)
Note: *p<0.1; **p<0.05; ***p<0.01
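
One way to write the model behind this table, assuming the one-way within specification indexed by statefip that the plm call in this section reports. The subscripts i (person) and s (state) and the state intercept \(\alpha_s\) are notation introduced here for illustration, not taken from the slides:

\[ \text{Wage}_{is} = \alpha_s + \beta_1 \cdot \text{Education}_{is} + \beta_2 \cdot \text{Male}_{is} + \beta_3 \cdot \text{Hours}_{is} + \epsilon_{is} \]

Under that reading, the within estimator removes \(\alpha_s\) by subtracting each state's mean from every variable, so the coefficients are identified only from variation inside states:

\[ \begin{align*} \text{Wage}_{is} - \overline{\text{Wage}}_{s} = & \ \beta_1 \cdot (\text{Education}_{is} - \overline{\text{Education}}_{s}) \\ & + \beta_2 \cdot (\text{Male}_{is} - \overline{\text{Male}}_{s}) \\ & + \beta_3 \cdot (\text{Hours}_{is} - \overline{\text{Hours}}_{s}) \\ & + (\epsilon_{is} - \bar{\epsilon}_{s}) \end{align*} \]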

Oneway (individual) effect Within Model

Call:
plm(formula = incwage ~ educ_num + sex + uhrswork, data = SmallDF_FE, model = "within", index = c("statefip"))

Unbalanced Panel: n = 51, T = 482-30087, N = 256540

Residuals:
     Min.   1st Qu.    Median   3rd Qu.      Max.
-174284.2  -29999.8   -9631.7   13407.7  748384.0

Coefficients:
           Estimate  Std. Error  t-value   Pr(>|t|)
educ_num  10609.169      68.676  154.481  < 2.2e-16 ***
sexmale   15759.469     270.836   58.188  < 2.2e-16 ***
uhrswork   1623.007      10.492  154.683  < 2.2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Total Sum of Squares: 1.4287e+15
Residual Sum of Squares: 1.1555e+15
R-Squared: 0.19123
Adj. R-Squared: 0.19106
F-statistic: 20214.7 on 3 and 256486 DF, p-value: < 2.22e-16
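
The summary statistics in this output can be cross-checked from the reported numbers alone. The sketch below is a sanity check rather than part of the original analysis. It recomputes the R-squared from the two sums of squares, the residual degrees of freedom from the panel dimensions, and each t-value as estimate divided by standard error. All variable names are illustrative, and small differences from the printed values come from rounding in the inputs.

```python
# Values copied from the plm summary in this section (rounded as printed).
tss = 1.4287e15   # total sum of squares (of the within-transformed wage)
rss = 1.1555e15   # residual sum of squares
n_obs, n_states, n_coefs = 256540, 51, 3

# (estimate, standard error) pairs for each slope coefficient.
coefs = {
    "educ_num": (10609.169, 68.676),
    "sexmale": (15759.469, 270.836),
    "uhrswork": (1623.007, 10.492),
}

# R-squared: share of the (within-state) variation that the model explains.
r_squared = 1 - rss / tss

# Residual degrees of freedom: observations minus state intercepts minus slopes.
df_resid = n_obs - n_states - n_coefs

# A t-value is the estimate divided by its standard error.
t_values = {name: est / se for name, (est, se) in coefs.items()}

print(f"R-squared: {r_squared:.4f}, residual df: {df_resid}")
for name, t in t_values.items():
    print(f"{name}: t = {t:.2f}")
```

The recomputed degrees of freedom match the 256486 shown next to the F-statistic, which is consistent with 51 state intercepts being absorbed by the within transformation.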

New Immigrants

Occupation Analysis

Top Ten Highest Paid Occupations
occ2010 | mean_income | total_count | male_percentage
chief executives and legislators/public administration | 165729.33 | 3098 | 69.59329
lawyers, and judges, magistrates, and other judicial workers | 145167.04 | 2436 | 60.83744
software developers, applications and systems software | 125936.92 | 3449 | 78.89243
financial managers | 106309.78 | 2466 | 42.98459
managers in marketing, advertising, and public relations | 104637.25 | 2334 | 49.95716
managers, nec (including postmasters) | 90850.31 | 9932 | 58.38703
general and operations managers | 88261.43 | 2175 | 63.63218
sales representatives, wholesale and manufacturing | 78582.52 | 2166 | 71.46814
computer scientists and systems analysts/network systems analysts/web developers | 78068.61 | 3336 | 70.08393
accountants and auditors | 77346.86 | 3108 | 39.92921

Occupation Analysis

Top Ten Highest Paid Occupations by Education Level
educ | occ2010 | mean_income | total_count | male_percentage
College | chief executives and legislators/public administration | $176,143 | 2706 | 69.88174
College | lawyers, and judges, magistrates, and other judicial workers | $146,195 | 2402 | 61.28226
College | software developers, applications and systems software | $128,235 | 3269 | 79.07617
College | financial managers | $114,727 | 2051 | 46.51390
College | managers in marketing, advertising, and public relations | $109,762 | 2051 | 49.14676
High School | chief executives and legislators/public administration | $93,460.84 | 369 | 67.47967
High School | general and operations managers | $68,535.38 | 585 | 66.15385
High School | managers in marketing, advertising, and public relations | $66,786.69 | 275 | 56.00000
High School | financial managers | $65,319.11 | 404 | 25.24752
High School | managers, nec (including postmasters) | $55,596.80 | 2062 | 64.16101
Middle School | driver/sales workers and truck drivers | $34,273.75 | 328 | 93.29268
Middle School | construction laborers | $29,663.22 | 422 | 94.54976
Middle School | carpenters | $28,151.67 | 222 | 98.64865
Middle School | laborers and freight, stock, and material movers, hand | $27,602.50 | 240 | 71.25000
Middle School | grounds maintenance workers | $20,208.07 | 301 | 94.35216

Top Field Analysis (Interaction Effects)

\[ \begin{align*} \text{Income}_i = & \ \beta_0 + \sum_{j=1}^{k-1} \beta_{j} \cdot \text{Field}_{j,i} \\ & + \beta_{\text{sex}} \cdot \text{Male}_i \\ & + \sum_{j=1}^{k-1} \beta_{j, \text{sex}} \cdot (\text{Field}_{j,i} \cdot \text{Male}_i) \\ & + \epsilon_i \end{align*} \]

Top Field Analysis (Interaction Effects)

\[ \begin{align*} \text{Income}_i = & \ \beta_0 + \beta_1 \cdot \text{Business}_i + \beta_2 \cdot \text{Education}_i \\ & + \beta_3 \cdot \text{Engineering}_i + \beta_4 \cdot \text{Health}_i \\ & + \beta_5 \cdot \text{Social Sciences}_i + \beta_{\text{sex}} \cdot \text{Male}_i \\ & + \beta_{1, \text{sex}} \cdot (\text{Business}_i \cdot \text{Male}_i) \\ & + \beta_{2, \text{sex}} \cdot (\text{Education}_i \cdot \text{Male}_i) \\ & + \beta_{3, \text{sex}} \cdot (\text{Engineering}_i \cdot \text{Male}_i) \\ & + \beta_{4, \text{sex}} \cdot (\text{Health}_i \cdot \text{Male}_i) \\ & + \beta_{5, \text{sex}} \cdot (\text{Social Sciences}_i \cdot \text{Male}_i) \\ & + \epsilon_i \end{align*} \]
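
A worked reading of this specification, offered as a sketch: if the error term has mean zero given field and sex (an assumption not stated in the slides), the expected male-female income gap within field j is the sex coefficient plus that field's interaction coefficient, and in the omitted reference field it is the sex coefficient alone. The shorthand \(\text{Gap}_j\) is introduced here for illustration.

\[ \begin{align*} \text{Gap}_j = & \ E[\text{Income}_i \mid \text{Field}_{j,i} = 1, \text{Male}_i = 1] - E[\text{Income}_i \mid \text{Field}_{j,i} = 1, \text{Male}_i = 0] \\ = & \ (\beta_0 + \beta_j + \beta_{\text{sex}} + \beta_{j, \text{sex}}) - (\beta_0 + \beta_j) \\ = & \ \beta_{\text{sex}} + \beta_{j, \text{sex}} \\ \text{Gap}_{\text{ref}} = & \ \beta_{\text{sex}} \end{align*} \]

A positive \(\beta_{j, \text{sex}}\) therefore means the gap in field j is more favorable to men than in the reference field, and a negative one means it is less favorable.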

Top Field Analysis (Interaction Effects)

Interaction Effect of Field and Gender on Income
Dependent variable: Income

Term | Estimate (Std. Error)
Field: Business | -30,572.460*** (1,169.165)
teaching | 21,873.110*** (2,260.687)
Field: Engineering | -2,959.658** (1,286.971)
and services | -940.951 (1,588.095)
Field: Social Sciences | 28,132.050*** (1,141.476)
Sex (Male) | -15,108.360*** (2,032.589)
Interaction: Field * Sex | -11,961.240*** (2,576.279)
and services:sexmale | 2,712.847 (2,594.801)
degfieldSocial Sciences:sexmale | 4,481.716** (2,176.646)
Constant | 58,182.220*** (844.883)

Observations | 74,974
R2 | 0.062
Adjusted R2 | 0.062
Residual Std. Error | 92,119.360 (df = 74964)
F Statistic | 550.546*** (df = 9; 74964)
Note: *p<0.1; **p<0.05; ***p<0.01
Model explores the interaction between gender and field of degree.

Gender and Education Interaction Effects

\[ \ln(\text{Income}) = \beta_0 + \beta_1 \text{Gender} + \beta_2 \text{Education} + \beta_3 (\text{Gender} \times \text{Education}) + \beta_4 \text{Age} + \epsilon \]

\[ \text{Interaction Term: } \beta_3 (\text{Gender} \times \text{Education}) \]

Gender and Education Interaction Effects

Regression Results: Gender × Education Interaction
Dependent variable: Log Total Personal Income

Term | Estimate (Std. Error)
Female | 1.283*** (0.073)
Education: High School | 0.779*** (0.055)
Education: Any College | 3.051*** (0.054)
Female × Education: High School | -0.098*** (0.0004)
Female × Education: Any College | -0.449*** (0.077)
Age | -0.402*** (0.076)
Constant | 8.553*** (0.056)

Observations | 416,189
R2 | 0.196
Adjusted R2 | 0.196
Residual Std. Error | 4.690 (df = 416182)
F Statistic | 16,894.480*** (df = 6; 416182)
Note: *p<0.1; **p<0.05; ***p<0.01
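
Because the dependent variable is in logs, a coefficient on a dummy or interaction term corresponds to a proportional change of exp(β) − 1 rather than β itself, and the two readings diverge as coefficients get larger. The sketch below shows the conversion for the reported Female × Education: Any College coefficient, taken at face value from the table. The function name pct_effect is illustrative and not from the original analysis.

```python
import math


def pct_effect(beta: float) -> float:
    """Percent change in the outcome implied by a coefficient in a log-linear model."""
    return 100.0 * (math.exp(beta) - 1.0)


# Coefficient copied from the table (Female x Education: Any College).
beta_female_college = -0.449

approx = 100.0 * beta_female_college     # naive reading: beta treated as a percent
exact = pct_effect(beta_female_college)  # exact conversion via exp(beta) - 1

print(f"naive: {approx:.1f}%  exact: {exact:.1f}%")
```

The exact figure is roughly −36%, noticeably smaller in magnitude than the naive −44.9%. It describes the interaction term alone, not the total difference for college-educated women, which would also involve the Female main effect.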

PCA Analysis

We use Principal Component Analysis (PCA) to investigate the relationships among key variables influencing income. The selected variables include:

  • Income Variables: incwage_log (log-transformed wage income).
  • Demographic and Work Variables: age, educ_num (numerical education level), wkswork1 (weeks worked in the previous year), hrswork1 (hours worked last week), and uhrswork (usual hours worked per week).

PCA Summary: Variance Explained by Components
Component | Standard Deviation | Proportion of Variance | Cumulative Proportion
PC1 | 1.3108035 | 0.34364 | 0.34364
PC2 | 1.0114532 | 0.20461 | 0.54825
PC3 | 0.9824254 | 0.19303 | 0.74128
PC4 | 0.8253283 | 0.13623 | 0.87751
PC5 | 0.7825791 | 0.12249 | 1.00000
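
The proportion-of-variance column follows directly from the standard deviations: each component's variance is its standard deviation squared, divided by the sum of all the variances. A minimal sketch of that arithmetic, using only the numbers in the table (variable names are illustrative):

```python
# Standard deviations of the five components, copied from the table.
sdev = [1.3108035, 1.0114532, 0.9824254, 0.8253283, 0.7825791]

# Component variances; these are eigenvalues of the correlation matrix
# if the inputs were standardized (an assumption, see the note below).
variances = [s ** 2 for s in sdev]
total = sum(variances)

# Share of total variance carried by each component, and the running sum.
proportion = [v / total for v in variances]
cumulative = []
running = 0.0
for p in proportion:
    running += p
    cumulative.append(running)

for i, (p, c) in enumerate(zip(proportion, cumulative), start=1):
    print(f"PC{i}: proportion={p:.5f} cumulative={c:.5f}")
```

The variances sum to almost exactly 5, which is what one would expect if five variables were standardized before the PCA. The first two components together account for about 55% of the variance.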

PCA Pt2

PCA Pt3

PCA Loadings for Key Variables
Variable | PC1 | PC2
wkswork1 | -0.5852694 | -0.1228604
uhrswork | -0.5630263 | -0.0343595
total_income_log | -0.5333753 | 0.2356820
age | -0.1911235 | -0.6789585
educ_num | 0.1394403 | -0.6835160

Regression of Education, Gender, and Family Income to predict hours worked

Regression Results: Predicting Hours Worked
Dependent variable: Hours Worked per Week

Term | Estimate (Std. Error)
Education Level | 0.000 (0.000)
Female | 0.000 (0.000)
Family Income | 0.000 (0.000)
Age | 0.000 (0.000)
Employed | 0.000 (0.000)
Constant | 0.000 (0.000)

Observations | 416,189
Residual Std. Error | 0.000 (df = 416183)
Note: *p<0.1; **p<0.05; ***p<0.01

Further Research:

  • Transit Times
  • Poverty
  • Degree Field
  • English & Citizenship
  • Race
  • Marriage
  • State Variations
  • Within occupation trends