Gender Wage Inequality in STEM

Lydia Gibson, Sara Hatter & Ken Vu

April 28, 2022

Introduction

Do we choose our career path based on gender-based social roles or based on top salary? Although many countries, such as China, have brought women into their labor force on the way to becoming powerful economies\(^1\), women still tend to choose careers that are more in line with gender stereotypes.

The personality characteristics stereotypically associated with women are sympathy, kindness, and warmth, which reflect a concern for other people. The traits associated with men, by contrast, are achievement orientation and ambitiousness, which reflect a concern with accomplishing tasks. These characteristics are very noticeable in the stereotypical association of men with the worker role and women with the family role\(^2\).

More schools are encouraging girls to enter STEM programs and providing them with many resources to succeed in these types of careers. Despite these efforts, women tend to choose careers where the median pay is lower.

Data Description

The data were obtained from the American Community Survey 2010-2012 Public Use Microdata Series and have already been subset to STEM majors, with a particular interest in women majoring in STEM. Each row of the data set represents one major and records details and statistics about it, such as the type of major (e.g., Engineering, Health Science), the proportion of women in the sample of individuals working in that particular field, and other relevant pieces of information.

Data set

The data set has 76 rows (one per major) and 9 columns.

Variables

Research Question and Goals

Our research question tries to find associations within STEM college majors that influence median wages. Our goals are to explore the data for STEM college majors and to create a predictive model for median wages.

Research Question:

What associations exist within STEM college majors that have an effect on median wages?

Goals:

Stacked Bar chart: Gender Proportions per Major Category

Exploratory Data Analysis

The median wage of the individual majors ranged from \(\$26,000\) for Zoology to \(\$110,000\) for Petroleum Engineering (\(Mdn = \$44,350\), \(M = \$46,118\)).

We have set Major_category as a factor with the following levels:

so that we can further distinguish the variation in the share of women within major categories and the median wage of each major category.

Box Plot: Median Wage by Major Category

Test differences between major categories

Our boxplot suggested that median wage may differ significantly across major categories, so we ran a one-way ANOVA to test the following hypotheses:

\(H_0:\alpha_1=\alpha_2=\alpha_3=\alpha_4=\alpha_5= 0\)

\(H_A:\alpha_i\ne 0 \text{ for at least one } i \in \{1,2,\ldots,5\}\)

Based on our one-way ANOVA, we reject the null hypothesis and conclude that there are statistically significant differences in median wages between major categories (\(F(4, 71) = 16.71\), \(p = 1.013 \times 10^{-8}\)).
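The test above can be sketched in R as follows. This is a minimal illustration on simulated data, not the analysis script behind the reported result: the data frame `toy`, its three categories, and its wage values are hypothetical stand-ins, and only the column names `Major_category` and `Median` are taken from the data set.

```r
# Hypothetical toy data: three made-up categories, NOT the ACS values.
set.seed(1)
toy <- data.frame(
  Major_category = factor(rep(c("A", "B", "C"), each = 10)),
  Median = c(rnorm(10, mean = 40000, sd = 3000),
             rnorm(10, mean = 50000, sd = 3000),
             rnorm(10, mean = 65000, sd = 3000))
)

# One-way ANOVA: does the mean of Median differ across categories?
fit_aov   <- aov(Median ~ Major_category, data = toy)
aov_table <- summary(fit_aov)[[1]]

# Pull out the pieces that are usually reported as F(df1, df2) and p.
df_between <- aov_table[["Df"]][1]       # number of groups minus 1
df_within  <- aov_table[["Df"]][2]       # observations minus number of groups
f_stat     <- aov_table[["F value"]][1]
p_value    <- aov_table[["Pr(>F)"]][1]
```

With five major categories and 76 majors, the same call gives the degrees of freedom \((4, 71)\) reported above.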

Jitter plot: Median Wage by Major Category

Further Cleaning

##   Major_category Total   Men Women ShareWomen Median
## 1    Engineering  2339  2057   282  0.1205643 110000
## 2    Engineering   756   679    77  0.1018519  75000
## 3    Engineering   856   725   131  0.1530374  73000
## 4    Engineering  1258  1123   135  0.1073132  70000
## 5    Engineering  2573  2200   373  0.1449670  65000
## 6    Engineering 32260 21239 11021  0.3416305  65000

Scatterplot Matrix

Scatterplot Matrix Insights

Methods and Results: Checking Assumptions

Before fitting a model, we examined the normality of our response variable, Median.

Box Cox

We noticed that the response was somewhat skewed, so we ran a Box-Cox test to see whether a transformation was necessary.

Box-Cox Summary output

## bcPower Transformation to Normality 
##    Est Power Rounded Pwr Wald Lwr Bnd Wald Upr Bnd
## Y1   -0.8569          -1      -1.4598       -0.254
## 
## Likelihood ratio test that transformation parameter is equal to 0
##  (log transformation)
##                            LRT df      pval
## LR test, lambda = (0) 8.338064  1 0.0038823
## 
## Likelihood ratio test that no transformation is needed
##                            LRT df             pval
## LR test, lambda = (1) 41.68169  1 0.00000000010741

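Both likelihood ratio tests reject (no transformation, \(\lambda = 1\), and the log transformation, \(\lambda = 0\)), and the Wald interval \((-1.4598, -0.254)\) contains \(-1\). Our rounded power is \(-1\), so we apply an inverse transformation to the response, Median. However, this may make the model harder to interpret.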
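For reference, the Box-Cox power family that the bcPower output refers to is usually written as below; the notation \(y^{(\lambda)}\) is introduced here for illustration and does not appear elsewhere in this report.

```latex
\[
y^{(\lambda)} =
\begin{cases}
\dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \neq 0,\\[6pt]
\log y, & \lambda = 0,
\end{cases}
\qquad\text{so that}\qquad
y^{(-1)} = \frac{y^{-1} - 1}{-1} = 1 - \frac{1}{y}.
\]
```

Since \(1 - 1/y\) is a decreasing linear function of \(1/y\), modeling \(Y^{-1}\) directly gives an equivalent fit with the coefficient signs flipped. In a model for \(Y^{-1}\), a term that lowers \(Y^{-1}\) therefore raises the predicted median wage.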

Building Predictive Model

We started with the full additive model, but too many variables were removed from it, so we switched to a model with interactions.

Building Predictive Model w/ Interaction

Since the additive model retained only one predictor, we refit the model with interactions.

Running step-wise to reduce the model’s AIC
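A stepwise search of this kind might look like the sketch below, which uses base R's `step()` on simulated data. The variables `x1`, `x2`, `noise`, and `y` are hypothetical and chosen only to show the mechanics; the actual starting model and search direction used for this report are not shown here, so `direction = "both"` is an assumption.

```r
# Hypothetical toy data with a genuine x1:x2 interaction.
set.seed(42)
n <- 60
toy <- data.frame(
  x1    = rnorm(n),
  x2    = rnorm(n),
  noise = rnorm(n)   # unrelated to the response by construction
)
toy$y <- 1 + 2 * toy$x1 + 1.5 * toy$x2 + toy$x1 * toy$x2 + rnorm(n, sd = 0.5)

# Start from a model with main effects, the interaction, and the extra term.
full <- lm(y ~ x1 * x2 + noise, data = toy)

# step() adds or drops one term at a time and keeps a change only
# if it lowers the AIC; trace = 0 suppresses the step-by-step log.
reduced <- step(full, direction = "both", trace = 0)

kept_terms <- attr(terms(reduced), "term.labels")
aic_full    <- AIC(full)
aic_reduced <- AIC(reduced)
```

Comparing `aic_full` with `aic_reduced` and inspecting `kept_terms` shows what the search retained.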

Test significance of predictor Women

Given \(p=0.7394>\alpha=0.05\), we fail to reject \(H_0\) (Women is not a significant predictor). Thus, we can remove the predictor Women.
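One common way to test whether a single predictor can be dropped is to compare nested models with a partial F-test. The sketch below assumes that approach; the report does not state which test produced the p-value above, and the variables `x`, `z`, and `y` are hypothetical toy data.

```r
# Hypothetical toy data: z has no true effect on y by construction.
set.seed(7)
n <- 80
toy <- data.frame(x = rnorm(n), z = rnorm(n))
toy$y <- 3 + 2 * toy$x + rnorm(n)

with_z    <- lm(y ~ x + z, data = toy)
without_z <- lm(y ~ x, data = toy)

# Partial F-test: H0 says the coefficient of z is zero.
cmp <- anova(without_z, with_z)
p_z <- cmp[["Pr(>F)"]][2]

# For a single coefficient, the F-test agrees with the t-test in summary().
p_t <- summary(with_z)$coefficients["z", "Pr(>|t|)"]
```

If `p_z` exceeds the chosen \(\alpha\), the smaller model `without_z` is preferred, which mirrors the decision to drop Women.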

Getting the reduced final model

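\(Y^{-1} = 2.71 \cdot 10^{-5} - 3.441 \cdot 10^{-6} x_1 - 8.87 \cdot 10^{-6} x_2 - 3.991 \cdot 10^{-7} x_3 - 3.09 \cdot 10^{-6} x_4 - 4.14 \cdot 10^{-11} x_5 + 1.08 \cdot 10^{-6} x_6 + 8.97 \cdot 10^{-11} x_5 \cdot x_6\)

where, following the coefficient order of the fitted model, \(x_1, \ldots, x_4\) are indicator variables for the Computers & Mathematics, Engineering, Health, and Physical Sciences major categories (the remaining category is the baseline), \(x_5\) is Men, and \(x_6\) is ShareWomen.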

Predictive power

Here we compute a prediction interval for Median\(^{-1}\) for Statistics & Decision Sciences, then take the inverse so that the result is in our original units.

##        fit      lwr      upr
## 1 41240.31 61595.08 30997.01

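Because taking the inverse reverses the order of the bounds, the lwr and upr columns above appear swapped. Looking at the actual Median for Statistics & Decision Sciences, we see that the actual response is within our prediction interval of (30997, 61595).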
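The back-transformation can be sketched in R as follows. This is an illustration on simulated data under the assumption that the model is fit to \(1/y\); the data frame `toy`, the predictor `x`, and the model `fit_inv` are hypothetical and unrelated to the fitted model in this report.

```r
# Hypothetical toy data: a positive response whose inverse is linear in x.
set.seed(3)
n <- 50
toy <- data.frame(x = runif(n, min = 0, max = 1))
toy$y <- 1 / (2e-5 + 1e-5 * toy$x + rnorm(n, sd = 1e-6))

# Fit on the inverse scale, as with Median^(-1).
fit_inv <- lm(I(1 / y) ~ x, data = toy)

# 95% prediction interval on the inverse scale for one new observation.
pi_inv <- predict(fit_inv, newdata = data.frame(x = 0.5),
                  interval = "prediction", level = 0.95)

# Invert to return to the original units; the column names are unchanged.
pi_raw <- 1 / pi_inv

# Inversion reverses order, so the column labelled "lwr" now holds the
# larger bound. Sorting gives the interval in the usual (low, high) order.
interval <- sort(c(pi_raw[, "lwr"], pi_raw[, "upr"]))
```

Reporting `interval` rather than the raw `lwr` and `upr` columns avoids the apparent swap.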

| Major | Major Category | Men | Share Women | Median |
|---|---|---|---|---|
| STATISTICS AND DECISION SCIENCE | Computers & Mathematics | 2960 | 0.5265 | 45000 |
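As a check, the point prediction can be reproduced by hand from the reduced model. The calculation below assumes that \(x_1\) is the indicator for Computers & Mathematics, \(x_5\) is Men, and \(x_6\) is ShareWomen; this mapping is inferred from the coefficient order, not stated in the model output.

```latex
\[
\begin{aligned}
\widehat{Y^{-1}}
&= 2.71\cdot 10^{-5} - 3.441\cdot 10^{-6}(1) - 4.14\cdot 10^{-11}(2960)
   + 1.08\cdot 10^{-6}(0.5265) + 8.97\cdot 10^{-11}(2960)(0.5265) \\
&\approx 2.4245\cdot 10^{-5}, \\[4pt]
\hat{Y} &= \frac{1}{\widehat{Y^{-1}}} \approx 41{,}246 .
\end{aligned}
\]
```

This agrees with the reported fit of 41240.31 up to the rounding of the printed coefficients.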

Model Diagnostics

Model Diagnostics (Numeric Tests)

## 
##  studentized Breusch-Pagan test
## 
## data:  lm_reduced
## BP = 3.2776, df = 7, p-value = 0.8582
## 
##  Shapiro-Wilk normality test
## 
## data:  rstandard(lm_reduced)
## W = 0.98673, p-value = 0.6165
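Both p-values above exceed 0.05, so neither test gives evidence against constant variance or normality of the residuals. The sketch below shows how such tests can be computed in base R on simulated data. The studentized Breusch-Pagan statistic is calculated by hand as \(n R^2\) from an auxiliary regression (Koenker's version), which is assumed here to correspond to the output above; `toy`, `x1`, `x2`, and `fit` are hypothetical names.

```r
# Hypothetical toy data with constant error variance.
set.seed(11)
n <- 76
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- 1 + 0.5 * toy$x1 - 0.3 * toy$x2 + rnorm(n)
fit <- lm(y ~ x1 + x2, data = toy)

# Studentized Breusch-Pagan: regress the squared residuals on the
# model's predictors; n * R^2 is compared with a chi-square distribution
# whose degrees of freedom equal the number of non-intercept coefficients.
toy$e2 <- resid(fit)^2
aux <- lm(e2 ~ x1 + x2, data = toy)
bp_stat <- n * summary(aux)$r.squared
bp_df   <- length(coef(aux)) - 1
bp_p    <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)

# Shapiro-Wilk normality test on the standardized residuals.
sw <- shapiro.test(rstandard(fit))
```

In the output above, df = 7 matches the seven non-intercept coefficients of the reduced model.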

Multicollinearity (VIF)

## Major_categoryComputers & Mathematics 
##                                  2.41 
##             Major_categoryEngineering 
##                                  4.58 
##                  Major_categoryHealth 
##                                  2.14 
##       Major_categoryPhysical Sciences 
##                                  1.57 
##                                   Men 
##                                  4.20 
##                            ShareWomen 
##                                  5.21 
##                        Men:ShareWomen 
##                                  4.19
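Per-coefficient variance inflation factors like those above can be computed directly from the model matrix, since \(VIF_j = 1/(1 - R_j^2)\) equals the \(j\)-th diagonal entry of the inverse correlation matrix of the predictors. The sketch below uses simulated data with hypothetical predictors `a`, `b`, and `c`; it is not necessarily the function used to produce the output above. A common rule of thumb treats values above 5 or 10 as a sign of problematic collinearity, so ShareWomen at 5.21 is borderline under the stricter cutoff.

```r
# Hypothetical toy data: b is built to be correlated with a.
set.seed(5)
n <- 100
toy <- data.frame(a = rnorm(n))
toy$b <- toy$a + rnorm(n, sd = 0.5)
toy$c <- rnorm(n)
toy$y <- 1 + toy$a + toy$b + toy$c + rnorm(n)
fit <- lm(y ~ a + b + c, data = toy)

# Drop the intercept column, then invert the predictor correlation matrix.
X <- model.matrix(fit)[, -1]
vif_all <- setNames(diag(solve(cor(X))), colnames(X))

# Cross-check for predictor a using the definition 1 / (1 - R^2),
# where R^2 comes from regressing a on the other predictors.
r2_a  <- summary(lm(a ~ b + c, data = toy))$r.squared
vif_a <- 1 / (1 - r2_a)
```

The two routes give the same value for `a`, and the correlated pair `a` and `b` show larger values than `c`.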

Conclusion

In conclusion

Further Research

Bibliography

Etaugh, Claire A., and Judith S. Bridges. Women’s Lives: A Psychological Exploration. 3rd ed., Pearson, 2013.

Kristof, Nicholas D. Half the Sky: Turning Oppression into Opportunity for Women Worldwide. Three Rivers Press, 2010.

Code Appendix

For supplementary R script, visit