Here are the key claims in the two papers:

STOET AND GEARY (2018) CLAIM: At the individual level, boys tend to be relatively better at STEM than girls. This generally makes boys more likely to pursue STEM jobs. But, by expectancy-value theory, when there’s less economic opportunity, girls are more likely to go against their relative strengths and pursue STEM degrees (because STEM jobs are high paying). As a result, in countries with less economic opportunity, more girls pursue STEM jobs, and thus there is more STEM equality.

FALK AND HERMLE (2018) CLAIM: Boys and girls have different preferences. When there’s more economic opportunity (GDP), people are freer to express their preferences, leading to greater divergence in gender preferences in countries with lots of economic opportunity.

Correlations between all measures at the country level

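# packages used throughout (assumed; the original setup chunk is not shown)
library(tidyverse)  # %>%, read_csv, ggplot2, forcats, tidyr, tibble
library(ggrepel)    # geom_text_repel
library(robmed)     # fit_mediation, test_mediation, p_value

# get GDP 2017 data from World Bank API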
gdp_data <- wbstats::wb(indicator = "NY.GDP.PCAP.CD", 
                        startdate = 2017, 
                        enddate = 2017) %>%
            select(iso2c, value) %>%
            rename(gdp_2017 = value)

pref_data <- readstata13::read.dta13("genderdifferences.dta") %>%
  mutate(country_code = countrycode::countrycode(ison, "iso3n", "iso2c")) %>%
  select(country_code, genderdif) 

STOET_PATH <- "/Users/mollylewis/Documents/research/Projects/1_in_progress/IATLANG/exploratory_studies/7_age_controls/stoet_data.csv"
stoet_data <- read_csv(STOET_PATH) %>%
  mutate(country_code = countrycode::countrycode(country_name, "country.name", "iso2c")) %>%
  select(country_code, everything()) 

# data from Bill von Hippel
INPATH <- "data/Molly data2.csv" 
country_raw <- read_csv(INPATH) %>%
  janitor::clean_names()  %>%
  left_join(gdp_data, by = c("country_code" = "iso2c"))   %>%
  left_join(stoet_data) %>%
  left_join(pref_data)

# save country data with GDP 2017 data merged in (unscaled)
# OUTPATH <- "country_level_data_with_GDP.csv"
# write_csv(country_raw, OUTPATH)

# scale variables
country_level <- country_raw %>%
  mutate_if(is.numeric, base::scale) 
plot_data <- country_level %>%
  select_if(is.numeric) %>%
  select(-n_participants, -gdp_2013)

long_corr <- cor(plot_data, 
                use = "pairwise.complete.obs") %>%
  as.data.frame() %>%
  rownames_to_column("v2") %>%
  gather("v1", "estimate", -v2)

long_p <- corrplot::cor.mtest(plot_data, 
                             use = "pairwise.complete.obs")$p %>%
  as.data.frame(row.names = names(plot_data)) %>%
  do(setNames(.,names(plot_data))) %>%
      rownames_to_column("v2") %>%
  gather("v1", "p", -v2)

corr_df <- full_join(long_corr, long_p) %>%
  mutate(estimate_char = case_when(v1 == v2 ~ "", 
                              TRUE ~ as.character(round(estimate,2))),
         estimate = case_when(v1 == v2 ~ as.numeric(NA), 
                              TRUE ~ estimate),
         estimate_color = case_when(p < .05 ~ estimate, TRUE ~ 0),
         v1 = fct_relevel(v1, "lang_es_sub", "lang_es_wiki", "subt_occu_semantics_fm",
                          "wiki_occu_semantics_fm", "mean_prop_distinct_occs", "implicit_resid", "explicit_resid", "median_country_age", "gdp_2017", "per_women_stem", "gender_inequality_index_value", "science_literacy_diff", "intra_indv_diff", "self_efficacy_diff", "intrest_diff", "enjoy_diff", "satisfaction"),
           v2 = fct_relevel(v2, "lang_es_sub", "lang_es_wiki", "subt_occu_semantics_fm",
                          "wiki_occu_semantics_fm", "mean_prop_distinct_occs", "implicit_resid", "explicit_resid", "median_country_age", "gdp_2017", "per_women_stem", "gender_inequality_index_value", "science_literacy_diff", "intra_indv_diff", "self_efficacy_diff", "intrest_diff", "enjoy_diff","satisfaction"))

ggplot(corr_df, aes(v1, fct_rev(v2), fill = estimate_color)) + 
  geom_tile() + #rectangles for each correlation
  #add actual correlation value in the rectangle
  geom_text(aes(label = estimate_char), size = 3) + 
  scale_fill_gradient2(low ="blue", mid = "white", high = "red", 
                       midpoint = 0, space = "Lab", guide = "colourbar",
                       name = "Pearson's r") +
  ggtitle("Pairwise Correlation Coefficients") +
  theme_classic(base_size = 12) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1), #, hjust = .95, vjust = .2), 
        axis.title.x = element_blank(), 
        axis.title.y = element_blank(),
        axis.ticks = element_blank(),
        legend.position = "none")
Pairwise correlation between all country-level measures. Red and blue correspond to positive and negative correlations, respectively. Non-significant correlations (p >= .05) are indicated with white squares.

STEM Self Efficacy Gender Difference (Stoet & Geary, 2018)

Note that this uses their measure (per_women_stem); the mediation isn’t significant with our women-in-STEM measure. It is not clear why, other than that the data are newer.

Target correlations

country_level %>%
  ggplot(aes(x = per_women_stem, y = self_efficacy_diff, label = country))+
  geom_point() +
  geom_text_repel(size = 3) +
  ylab("Gender difference in STEM Self Efficacy (Stoet & Geary, 2018)") +
  xlab("Per. Women in STEM (SG measure)") +
  ggtitle("STEM measure vs. Gender Dif. in STEM Self Efficacy  ") +
  geom_smooth(method = "lm", alpha = .2) +
  theme_classic()

country_level %>%
  ggplot(aes(x = lang_es_sub, y = self_efficacy_diff, label = country))+
  geom_point() +
  geom_text_repel(size = 3) +
  ylab("Gender difference in STEM Self Efficacy (Stoet & Geary, 2018)") +
  xlab("Linguistic Gender Bias\n(effect size)") +
  ggtitle("Language Bias vs. Gender Dif. in STEM Self Efficacy  ") +
  geom_smooth(method = "lm", alpha = .2) +
  theme_classic()

Regressions

lm(per_women_stem ~ self_efficacy_diff + lang_es_sub, data = country_level) %>%
  summary()
## 
## Call:
## lm(formula = per_women_stem ~ self_efficacy_diff + lang_es_sub, 
##     data = country_level)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.2426 -0.6839 -0.1395  0.5759  1.5049 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)  
## (Intercept)         0.008111   0.161698   0.050   0.9604  
## self_efficacy_diff -0.514323   0.191907  -2.680   0.0134 *
## lang_es_sub        -0.130940   0.212744  -0.615   0.5443  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.8215 on 23 degrees of freedom
##   (13 observations deleted due to missingness)
## Multiple R-squared:  0.4027, Adjusted R-squared:  0.3508 
## F-statistic: 7.755 on 2 and 23 DF,  p-value: 0.002666

Mediation Models

psych::mediate(x = "lang_es_sub", y = "per_women_stem", m = "self_efficacy_diff",
               data = country_level, plot = T) %>%
  summary()

## Call: psych::mediate(y = "per_women_stem", x = "lang_es_sub", m = "self_efficacy_diff", 
##     data = country_level, plot = T)
## 
##  Total effect estimates (c) 
##             per_women_stem   se    t df    Prob
## lang_es_sub          -0.47 0.15 -3.2 36 0.00287
## 
## Direct effect estimates     (c') 
##                    per_women_stem   se     t df     Prob
## lang_es_sub                 -0.09 0.16 -0.57 36 0.572000
## self_efficacy_diff          -0.61 0.16 -3.93 36 0.000367
## 
## R = 0.67 R2 = 0.45   F = 14.85 on 2 and 36 DF   p-value:  1.98e-05 
## 
##  'a'  effect estimates 
##             self_efficacy_diff   se    t df     Prob
## lang_es_sub               0.61 0.13 4.72 37 3.34e-05
## 
##  'b'  effect estimates 
##                    per_women_stem   se     t df     Prob
## self_efficacy_diff          -0.61 0.16 -3.93 36 0.000367
## 
##  'ab'  effect estimates 
##             per_women_stem  boot   sd lower upper
## lang_es_sub          -0.38 -0.38 0.22 -0.81 -0.07
country_level %>%
  fit_mediation(
    x = "lang_es_sub",
    y = "per_women_stem",
    m = "self_efficacy_diff") %>%
   test_mediation() %>%
   p_value()
## [1] 0.0074
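
The path estimates in a simple mediation model are tied together by the decomposition c = c' + ab: the total effect of x on y equals the direct effect plus the product of the 'a' (x -> m) and 'b' (m -> y, controlling for x) paths. The sketch below checks this with plain lm() on simulated data. The variable names and effect sizes are illustrative assumptions, not the country-level data, and the bootstrap interval that psych::mediate reports for 'ab' is not reproduced here.

```r
# Simulated, standardized data: x -> m -> y with no direct x -> y path.
# All names and effect sizes are illustrative assumptions.
set.seed(1)
n <- 200
x <- rnorm(n)
m <- 0.6 * x + rnorm(n)
y <- -0.6 * m + rnorm(n)
d <- as.data.frame(scale(data.frame(x, m, y)))

c_total  <- coef(lm(y ~ x, data = d))[["x"]]   # total effect (c)
a_path   <- coef(lm(m ~ x, data = d))[["x"]]   # 'a' path: x -> m
fit_full <- lm(y ~ x + m, data = d)
c_direct <- coef(fit_full)[["x"]]              # direct effect (c')
b_path   <- coef(fit_full)[["m"]]              # 'b' path: m -> y given x

indirect <- a_path * b_path                    # 'ab' indirect effect

# With OLS fit on the same complete cases, the decomposition is exact
# up to floating-point error.
stopifnot(isTRUE(all.equal(c_total, c_direct + indirect)))
```

Up to rounding, the same relation can be read off the psych::mediate output above: the direct effect (-0.09) plus the indirect effect (-0.38) gives the total effect (-0.47).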

This also holds for the Wikipedia model. It also holds for three other measures of gender inequality (hdi, gini, ggi).
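
A robustness check of this kind can be run by looping one mediation test over several outcome columns. The sketch below is a minimal base-R version, assuming the alternative measures enter as outcomes. `boot_indirect()` is a hypothetical helper (not part of psych or robmed), the data are simulated, and `outcome_a` / `outcome_b` are placeholder column names rather than the real inequality measures.

```r
# Hypothetical helper: point estimate and percentile-bootstrap CI for the
# indirect effect (a * b) of x on y through m, using plain lm().
boot_indirect <- function(data, x, y, m, n_boot = 1000) {
  # keep only rows complete on the three variables involved
  data <- data[complete.cases(data[, c(x, y, m)]), ]
  ab <- function(d) {
    a <- coef(lm(reformulate(x, response = m), data = d))[[x]]
    b <- coef(lm(reformulate(c(x, m), response = y), data = d))[[m]]
    a * b
  }
  # resample rows with replacement and recompute a * b each time
  boots <- replicate(n_boot, ab(data[sample(nrow(data), replace = TRUE), ]))
  ci <- quantile(boots, c(.025, .975))
  data.frame(outcome = y, ab = ab(data), lower = ci[[1]], upper = ci[[2]])
}

# Simulated stand-in for country_level (assumption: not the real data).
set.seed(2)
n <- 40
sim <- data.frame(lang_es_sub = rnorm(n))
sim$self_efficacy_diff <- 0.6 * sim$lang_es_sub + rnorm(n)
sim$outcome_a <- -0.6 * sim$self_efficacy_diff + rnorm(n)
sim$outcome_b <- -0.5 * sim$self_efficacy_diff + rnorm(n)

# One row of results per outcome measure
results <- do.call(rbind, lapply(
  c("outcome_a", "outcome_b"),
  function(y) boot_indirect(sim, x = "lang_es_sub", y = y,
                            m = "self_efficacy_diff")
))
results
```

An interval that excludes zero for a given outcome would correspond to the mediation "holding" for that measure. Swapping `x` for the Wikipedia-based predictor would repeat the check for the other language model.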

Composite Preference Differences (Falk & Hermle, 2018)

Target correlations

country_level %>%
  ggplot(aes(x = gdp_2017, y = genderdif, label = country))+
  ylab("Gender Differences in Preferences (Falk & Hermle, 2018)") +
  xlab("GDP") +
  ggtitle("GDP vs. Gender Differences in Preferences ") +
  geom_smooth(method = "lm", alpha = .2) +
  geom_point() +
  geom_text_repel(size = 3) +
  theme_classic(base_size = 12)

country_level %>%
  ggplot(aes(x = lang_es_sub, y = genderdif, label = country))+
  ylab("Gender Differences in Preferences (Falk & Hermle, 2018)") +
  xlab("Linguistic Gender Bias\n(effect size)") +
  ggtitle("Language Bias vs. Dif. in Gender Preferences ") +
  geom_smooth(method = "lm", alpha = .2) +
  geom_point() +
  geom_text_repel(size = 3) +
  theme_classic(base_size = 12)

Regressions

lm(genderdif ~ gdp_2017 + lang_es_sub, data = country_level) %>%
  summary()
## 
## Call:
## lm(formula = genderdif ~ gdp_2017 + lang_es_sub, data = country_level)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.18022 -0.59530 -0.03532  0.47677  1.06880 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   0.1418     0.1414   1.003 0.326756    
## gdp_2017      0.7827     0.1772   4.417 0.000218 ***
## lang_es_sub   0.1859     0.1814   1.025 0.316548    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7036 on 22 degrees of freedom
##   (14 observations deleted due to missingness)
## Multiple R-squared:  0.5903, Adjusted R-squared:  0.553 
## F-statistic: 15.85 on 2 and 22 DF,  p-value: 5.462e-05
lm(gdp_2017 ~ genderdif + lang_es_sub, data = country_level) %>%
  summary()
## 
## Call:
## lm(formula = gdp_2017 ~ genderdif + lang_es_sub, data = country_level)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.84907 -0.44668 -0.03616  0.25844  1.39812 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.06978    0.12576  -0.555 0.584576    
## genderdif    0.60052    0.13595   4.417 0.000218 ***
## lang_es_sub  0.14158    0.15979   0.886 0.385167    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.6163 on 22 degrees of freedom
##   (14 observations deleted due to missingness)
## Multiple R-squared:  0.5855, Adjusted R-squared:  0.5478 
## F-statistic: 15.54 on 2 and 22 DF,  p-value: 6.204e-05

Mediation Models

Their causal model is GDP -> preferences. A mediation model suggests that GDP does mediate between language and preferences:

psych::mediate(x = "lang_es_sub", y = "genderdif", m = "gdp_2017",
               data = country_level, plot = T) %>%
  summary()

## Call: psych::mediate(y = "genderdif", x = "lang_es_sub", m = "gdp_2017", 
##     data = country_level, plot = T)
## 
##  Total effect estimates (c) 
##             genderdif   se    t df    Prob
## lang_es_sub      0.45 0.15 3.06 36 0.00422
## 
## Direct effect estimates     (c') 
##             genderdif   se    t df     Prob
## lang_es_sub      0.07 0.14 0.53 36 5.99e-01
## gdp_2017         0.68 0.14 4.85 36 2.35e-05
## 
## R = 0.72 R2 = 0.52   F = 19.29 on 2 and 36 DF   p-value:  2.03e-06 
## 
##  'a'  effect estimates 
##             gdp_2017   se    t df     Prob
## lang_es_sub     0.56 0.14 4.06 37 0.000245
## 
##  'b'  effect estimates 
##          genderdif   se    t df     Prob
## gdp_2017      0.68 0.14 4.85 36 2.35e-05
## 
##  'ab'  effect estimates 
##             genderdif boot   sd lower upper
## lang_es_sub      0.37  0.4 0.16  0.13  0.76

But a more plausible explanation, for which there is also evidence, is: language -> preferences -> GDP

psych::mediate(x = "lang_es_sub", y = "gdp_2017", m = "genderdif",
               data = country_level, plot = T) %>%
  summary()

## Call: psych::mediate(y = "gdp_2017", x = "lang_es_sub", m = "genderdif", 
##     data = country_level, plot = T)
## 
##  Total effect estimates (c) 
##             gdp_2017   se    t df     Prob
## lang_es_sub     0.56 0.14 4.06 36 0.000254
## 
## Direct effect estimates     (c') 
##             gdp_2017   se    t df     Prob
## lang_es_sub     0.29 0.12 2.42 36 2.05e-02
## genderdif       0.59 0.12 4.85 36 2.35e-05
## 
## R = 0.76 R2 = 0.58   F = 25.03 on 2 and 36 DF   p-value:  1.54e-07 
## 
##  'a'  effect estimates 
##             genderdif   se    t df    Prob
## lang_es_sub      0.45 0.15 3.06 37 0.00416
## 
##  'b'  effect estimates 
##           gdp_2017   se    t df     Prob
## genderdif     0.59 0.12 4.85 36 2.35e-05
## 
##  'ab'  effect estimates 
##             gdp_2017 boot   sd lower upper
## lang_es_sub     0.26 0.27 0.11  0.11  0.51
country_level %>%
  fit_mediation(
    x = "lang_es_sub",
    y = "per_women_stem",
    m = "genderdif") %>%
   test_mediation() %>%
   p_value()
## [1] 0.0177

This also holds for the Wikipedia model. It also holds for ggi, hdi_value and per_women_stem.