Lydia Gibson, Sara Hatter & Ken Vu
April 28, 2022
Do we choose our career path based on gender-based social roles or based on top salary? Although many countries, such as China, have brought women into their labor force to become powerful economies\(^1\), women still tend to choose careers that are in line with gender stereotypes.
The personality characteristics stereotypically associated with women are sympathy, kindness, and warmth, all of which reflect a concern for other people. The traits associated with men, by contrast, are achievement orientation, ambitiousness, and a concern with accomplishing tasks. These characteristics are very noticeable in the stereotypical association of men with the worker role and women with the family role\(^2\).
More schools are encouraging girls to enter STEM programs and providing them with resources to succeed in these types of careers. Despite these efforts, women tend to choose careers where the median pay is lower.
The data were obtained from the American Community Survey 2010-2012 Public Use Microdata Series and have already been subset to STEM majors, with a particular interest in women majoring in STEM. Each row in the data set represents one major and holds a collection of details and statistics about it, such as the type of major (e.g. Engineering, Health Science, etc.), the proportion of women in the sample of individuals working in that particular field, and other relevant pieces of information.
The dimensions of the data set are 76 rows (one per Major) by 9 columns.
Median: Median earnings of full-time, year-round workers
Rank: Rank by median earnings
Major_code: Major code, FO1DP in ACS PUMS
Major: Major description
Major_category: Category of major from Carnevale et al.
Total: Total number of people with major
Men: Male graduates
Women: Female graduates
ShareWomen: Women as share of total
Our research question asks which associations within STEM college majors influence median wages. Our goals are to explore the data for STEM college majors and to build a predictive model for median wages.
What associations exist within STEM college majors that have an effect on median wages?
The median wage of the individual majors ranged from \(\$26,000\) for Zoology to \(\$110,000\) for Petroleum Engineering (\(Mdn = \$44,350\), \(M = \$46,118\)).
We have set Major_category as a factor with five levels, one per major category, so that we can compare how the share of women and the median wage vary across major categories.
Our boxplot suggested that median wage may differ between major categories, so we ran a one-way ANOVA to test the following hypotheses:
\(H_0:\alpha_1=\alpha_2=\alpha_3=\alpha_4=\alpha_5= 0\)
\(H_A:\alpha_i\ne 0\) for at least one \(i \in \{1,2,\dots,5\}\)
Based on our one-way ANOVA, we reject the null hypothesis and conclude that there are statistically significant differences in median wage between major categories \((F(4, 71) = 16.71, p = 1.013 \times 10^{-8})\).
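To make the test statistic concrete, here is a minimal Python sketch of how a one-way ANOVA F statistic and its degrees of freedom are computed. The toy wage values and the helper names (`mean`, `one_way_anova`) are hypothetical and are not the study's data or code; only the design size (5 categories, 76 majors) comes from the text above.

```python
def mean(values):
    """Arithmetic mean of any iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = mean(x for g in groups for x in g)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical toy data: three made-up categories, three wages each.
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova(toy_groups)
print(f_stat, df_b, df_w)

# Degrees of freedom implied by the study's design: 5 categories, 76 majors.
study_df = (5 - 1, 76 - 5)
print(study_df)
```

The last two lines only reproduce the degrees of freedom reported with the F statistic; the F value itself would require the full data set.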
We dropped Major, Major_code and Rank, as they aren't relevant predictors for our purposes. The first six rows of the remaining columns are:
##   Major_category Total   Men Women ShareWomen Median
## 1    Engineering  2339  2057   282  0.1205643 110000
## 2    Engineering   756   679    77  0.1018519  75000
## 3    Engineering   856   725   131  0.1530374  73000
## 4    Engineering  1258  1123   135  0.1073132  70000
## 5    Engineering  2573  2200   373  0.1449670  65000
## 6    Engineering 32260 21239 11021  0.3416305  65000
As expected, there seems to be a negative association between
ShareWomen and Median. This is one of the main
motivators for our research.
There may be multicollinearity among Total, Men, Women and
ShareWomen, so we will assess which of these predictors could be
removed from our model. To do this, we will compute a correlation matrix.
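The concern can be illustrated with the six rows printed in the data preview. The Python sketch below checks that Total equals Men plus Women and that ShareWomen equals Women divided by Total, then computes one Pearson correlation by hand. The `pearson` helper is written for this illustration, and six Engineering rows are far too few to stand in for the full correlation matrix.

```python
from math import sqrt

# First six rows of the data preview: (Total, Men, Women, ShareWomen).
rows = [
    (2339, 2057, 282, 0.1205643),
    (756, 679, 77, 0.1018519),
    (856, 725, 131, 0.1530374),
    (1258, 1123, 135, 0.1073132),
    (2573, 2200, 373, 0.1449670),
    (32260, 21239, 11021, 0.3416305),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Total is an exact linear combination of Men and Women in these rows.
sum_holds = all(total == men + women for total, men, women, _ in rows)

# ShareWomen matches Women / Total up to the printed rounding.
share_holds = all(
    abs(women / total - share) < 1e-6 for total, _, women, share in rows
)

# Correlation between Men and Total on the preview rows only.
r_men_total = pearson([r[1] for r in rows], [r[0] for r in rows])
print(sum_holds, share_holds, round(r_men_total, 4))
```

Because Total is fully determined by Men and Women, a model cannot hold all three as separate predictors, which is why at least one of them has to go.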
Before fitting any models, we examined the normality of our response
variable, Median.
We noticed some skew, so we ran a Box-Cox test to see whether a transformation was necessary.
## bcPower Transformation to Normality
## Est Power Rounded Pwr Wald Lwr Bnd Wald Upr Bnd
## Y1 -0.8569 -1 -1.4598 -0.254
##
## Likelihood ratio test that transformation parameter is equal to 0
## (log transformation)
## LRT df pval
## LR test, lambda = (0) 8.338064 1 0.0038823
##
## Likelihood ratio test that no transformation is needed
## LRT df pval
## LR test, lambda = (1) 41.68169 1 0.00000000010741
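The rounded power is -1, and the Wald interval (-1.4598, -0.254) excludes
both 0 (log transformation) and 1 (no transformation), so we apply an inverse
transformation to the response Median. The trade-off is that the model becomes
harder to interpret.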
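For readers unfamiliar with the Box-Cox family, the following Python sketch shows the standard one-parameter form and what a power of -1 does to wages. The function name `box_cox` is an assumption made for this illustration; the analysis itself models the plain reciprocal of Median, which differs from the scaled Box-Cox form only by a shift and a sign.

```python
from math import log


def box_cox(y, lam):
    """Standard Box-Cox transform: (y**lam - 1) / lam, or log(y) when lam == 0."""
    if y <= 0:
        raise ValueError("Box-Cox requires a strictly positive response")
    if lam == 0:
        return log(y)
    return (y ** lam - 1) / lam


# Lowest and highest median wages reported in the text above.
low, high = 26000, 110000

# With lam = -1 the scaled form is 1 - 1/y, which keeps the original ordering.
scaled_low, scaled_high = box_cox(low, -1), box_cox(high, -1)

# The plain reciprocal used for the response reverses the ordering instead:
# the highest wage gets the smallest transformed value.
inv_low, inv_high = 1 / low, 1 / high
order_reversed = inv_low > inv_high
print(scaled_low < scaled_high, order_reversed)
```

The reversed ordering is worth keeping in mind when reading coefficient signs: a negative coefficient on the reciprocal scale corresponds to a higher predicted wage.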
We started with the full additive model, but model selection removed all but one predictor, so we refit the model with interactions.
Women: given \(p=0.7394>\alpha=0.05\),
we fail to reject \(H_0\)
(Women is not a significant predictor). Thus, we can remove
the predictor Women.
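\(Y^{-1} = 2.71 \cdot 10^{-5} -3.441 \cdot 10^{-6}x_1 - 8.87 \cdot 10^{-6} x_2 -3.991 \cdot 10^{-7} x_3 -3.09\cdot 10^{-6}x_4 -4.14 \cdot 10^{-11}x_5 +1.08 \cdot 10^{-6}x_6 +8.97\cdot 10^{-11} x_5 \cdot x_6\)
Here \(Y\) is Median, \(x_1, \dots, x_4\) are indicator variables for the major categories Computers & Mathematics, Engineering, Health and Physical Sciences (the remaining category is the baseline), \(x_5\) is Men, and \(x_6\) is ShareWomen.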
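As a worked check of the fitted equation, the Python sketch below plugs in the values for Statistics and Decision Science (Computers & Mathematics, Men = 2960, ShareWomen = 0.5265) and inverts the result. It assumes that \(x_1\) to \(x_4\) are indicators for Computers & Mathematics, Engineering, Health and Physical Sciences in that order, that \(x_5\) is Men and that \(x_6\) is ShareWomen; the dictionary keys and function name are hypothetical. The coefficients are the rounded ones printed in the equation, so the result only approximates the reported fit.

```python
INTERCEPT = 2.71e-5

# Assumed mapping of x1..x4 to major categories; the baseline category gets 0.
CATEGORY_EFFECT = {
    "Computers & Mathematics": -3.441e-6,  # x1
    "Engineering": -8.87e-6,               # x2
    "Health": -3.991e-7,                   # x3
    "Physical Sciences": -3.09e-6,         # x4
}

B_MEN = -4.14e-11          # x5
B_SHARE_WOMEN = 1.08e-6    # x6
B_INTERACTION = 8.97e-11   # x5 * x6


def predict_inverse_median(category, men, share_women):
    """Evaluate the fitted equation on the 1/Median scale."""
    return (
        INTERCEPT
        + CATEGORY_EFFECT.get(category, 0.0)
        + B_MEN * men
        + B_SHARE_WOMEN * share_women
        + B_INTERACTION * men * share_women
    )


inv_pred = predict_inverse_median("Computers & Mathematics", 2960, 0.5265)
predicted_median = 1 / inv_pred  # back-transform to dollars
print(round(predicted_median, 2))
```

With the rounded coefficients this lands within a few dollars of the reported fit of 41240.31, which supports the assumed variable mapping.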
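Here we compute a prediction interval for Median\(^{-1}\) for Statistics and Decision
Science, then take the reciprocal so that the result is in our original
units. Because taking the reciprocal reverses order, the value printed under lwr is the upper bound in dollars and the value under upr is the lower bound.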
## fit lwr upr
## 1 41240.31 61595.08 30997.01
The actual Median for Statistics and Decision Science, 45000, falls
within our prediction interval of (30997, 61595).
| Major | Major Category | Men | Share Women | Median |
|---|---|---|---|---|
| STATISTICS AND DECISION SCIENCE | Computers & Mathematics | 2960 | 0.5265 | 45000 |
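The back-transformation of the prediction interval can be sketched in Python as follows. The reciprocal-scale values are reconstructed here from the printed dollar figures, on the assumption that the printed fit, lwr and upr are simply reciprocals of the model-scale values; `back_transform` is a hypothetical helper, not part of the original analysis.

```python
def back_transform(fit_inv, lwr_inv, upr_inv):
    """Map a prediction interval on the 1/Median scale back to dollars.

    The reciprocal is decreasing for positive values, so the bounds are
    sorted after inversion to keep lower <= upper.
    """
    fit = 1 / fit_inv
    lower, upper = sorted((1 / lwr_inv, 1 / upr_inv))
    return fit, lower, upper


# Reciprocal-scale interval, reconstructed from the printed output (assumption).
fit_inv = 1 / 41240.31
lwr_inv = 1 / 61595.08   # lower bound on the 1/Median scale
upr_inv = 1 / 30997.01   # upper bound on the 1/Median scale

fit, lower, upper = back_transform(fit_inv, lwr_inv, upr_inv)

actual_median = 45000    # from the table above
covered = lower <= actual_median <= upper
print(round(lower, 2), round(upper, 2), covered)
```

Sorting the bounds inside the helper avoids the swapped lwr and upr columns seen in the raw output.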
##
## studentized Breusch-Pagan test
##
## data: lm_reduced
## BP = 3.2776, df = 7, p-value = 0.8582
##
## Shapiro-Wilk normality test
##
## data: rstandard(lm_reduced)
## W = 0.98673, p-value = 0.6165
## Major_categoryComputers & Mathematics  2.41
## Major_categoryEngineering              4.58
## Major_categoryHealth                   2.14
## Major_categoryPhysical Sciences        1.57
## Men                                    4.20
## ShareWomen                             5.21
## Men:ShareWomen                         4.19
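Assuming the values above are variance inflation factors (the output is not labelled, so this is an interpretation), each one can be converted into the \(R^2\) obtained when that term is regressed on the other terms, using \(VIF = 1/(1-R^2)\). The Python sketch below does this conversion; the function name is hypothetical.

```python
def r_squared_from_vif(vif):
    """Invert VIF = 1 / (1 - R^2) to recover R^2 for one model term."""
    if vif < 1:
        raise ValueError("a variance inflation factor cannot be below 1")
    return 1 - 1 / vif


# Values copied from the output above, assumed to be VIFs.
reported = {
    "Major_categoryComputers & Mathematics": 2.41,
    "Major_categoryEngineering": 4.58,
    "Major_categoryHealth": 2.14,
    "Major_categoryPhysical Sciences": 1.57,
    "Men": 4.20,
    "ShareWomen": 5.21,
    "Men:ShareWomen": 4.19,
}

implied_r2 = {
    name: round(r_squared_from_vif(v), 3) for name, v in reported.items()
}
largest = max(reported, key=reported.get)
print(largest, implied_r2[largest])
```

Under this reading, the term with the largest value is the one most strongly explained by the other terms in the model.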
In conclusion
There is an association between the gender composition of a STEM major and its median wage.
We can predict the median wage of a STEM major from its major category, the number of men in the major, and the proportion of women in the major.
We all should have majored in Petroleum Engineering!
If we had sex-disaggregated data for median wage, we could see the difference in median wage by gender for each major.
If we had time series data, we could then see how median wage changes with an influx of women and/or exodus of men from a given major.
Since we only looked at STEM majors, it would be interesting to
see if these same variables (Major_category,
Men, ShareWomen) are associated with median
wage for all majors.
1. Kristof, Nicholas D. Half the Sky: Turning Oppression into Opportunity for Women Worldwide. Three Rivers Press, 2010.
2. Etaugh, Claire A., and Judith S. Bridges. Women’s Lives: A Psychological Exploration. 3rd ed., Pearson, 2013.
For supplementary R script, visit