Here we test two hypotheses:

  • As predicted by Caliskan et al., cross-cultural variability in implicit bias will be correlated with bias measured from word embedding models.
  • Bias (for both the language and behavioral IAT measures) will be larger for languages that encode gender grammatically than for those that do not.

Both of these predictions pan out. The relationships are particularly strong for male IAT participants; this could be because there is a male bias among Wikipedia editors.

1 Implicit measures of bias (IAT score)

Here we look at whether behavioral IAT scores (from Project Implicit) correlate with language bias across a range of different languages. Note that this analysis is at the language level (averaging the IAT behavioral data across countries that speak the same language).

Though I explored a range of different measures, the pre-calculated D-score appears to be the best measure of bias. Higher D-scores indicate more male-career bias. I look at D-score by participant gender, and with all participants.
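
Roughly, the D-score is the difference in mean response latency between the incongruent and congruent IAT blocks, scaled by the standard deviation of all latencies. Here is a simplified sketch (ignoring the trial-level exclusions and practice/test block structure of Greenwald et al.'s full scoring algorithm; the latencies below are made up):

# Hypothetical latencies (ms) from one participant, simplified for illustration:
# congruent block pairs male + career / female + family; incongruent pairs the reverse.
congruent_rt   <- c(650, 700, 720, 680, 640)
incongruent_rt <- c(820, 790, 900, 760, 850)

# D = (mean incongruent latency - mean congruent latency) / SD of all latencies;
# positive values indicate a stronger male-career association.
d_score <- (mean(incongruent_rt) - mean(congruent_rt)) /
  sd(c(congruent_rt, incongruent_rt))
d_score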

IAT scores calculated from the embedding models are based on the Wikipedia fastText vectors (https://github.com/facebookresearch/fastText/blob/master/pretrained-vectors.md). I use the same measure of effect size as Caliskan et al.
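
For concreteness, here is a minimal sketch of that effect size (the WEAT measure from Caliskan et al.), under the assumption that the word vectors are available as a matrix with one row per word; the function and variable names are illustrative, not the code that produced career_effect_sizes.csv:

# `vecs`: matrix of word embeddings, one row per word (rownames = words);
# X, Y: target word sets (e.g. male vs. female names); A, B: attribute word sets
# (e.g. career vs. family words). These names are illustrative assumptions.
cosine_sim <- function(a, b) sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))

# s(w, A, B): mean cosine similarity of word w to set A minus its mean similarity to set B
assoc <- function(w, A, B, vecs) {
  mean(sapply(A, function(a) cosine_sim(vecs[w, ], vecs[a, ]))) -
    mean(sapply(B, function(b) cosine_sim(vecs[w, ], vecs[b, ])))
}

# Effect size: difference in mean association between the two target sets,
# standardized by the SD of associations across all target words
weat_effect_size <- function(X, Y, A, B, vecs) {
  s_x <- sapply(X, assoc, A = A, B = B, vecs = vecs)
  s_y <- sapply(Y, assoc, A = A, B = B, vecs = vecs)
  (mean(s_x) - mean(s_y)) / sd(c(s_x, s_y))
}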

# Read in behavior and language biases
countries_langs <- read_csv("../../data/other/countries_lang.csv") %>%
  mutate(language_name = ifelse(language_name == "Spanish; Castilian", "Spanish", language_name),
         language_name = ifelse(language_name == "Dutch; Flemish", "Dutch", language_name))

IAT_behavior_measures_imp <- read_csv("../behavior_IAT/IAT_behavior_measures.csv") %>%
  left_join(countries_langs %>% select(country_name, language_name, language_code)) %>%
  filter(type %in% c("country_means_D_score", 
             "country_gender_D_score")) %>%
  group_by(type, language_code, sex) %>%
  summarize(mean = mean(mean)) 

lang_bias <- read_csv("career_effect_sizes.csv") 

d <- left_join(IAT_behavior_measures_imp, lang_bias) %>%
  mutate_if(is.character, as.factor) %>%
  filter(!is.na(lang_bias))

There are 22 languages represented here.
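
This count can be confirmed directly from the merged data (a quick check, not part of the original code):

# Number of distinct languages after merging the behavioral and language measures
n_distinct(d$language_code)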

1.1 Measures

1.1.1 Behavior

ggplot(d, aes(x = mean)) +
  geom_histogram() +
  facet_wrap(~ type + sex, scales = "free_x") +
  theme_bw()

1.1.2 Language

ggplot(lang_bias, aes(x = lang_bias)) +
  geom_histogram() +
  theme_bw()

1.2 Correlation between measures

Each point below is a language. We find that male IAT bias is reliably correlated with bias from the embedding models.

d %>%
  ggplot(aes(x = mean, y = lang_bias, group = sex, color = sex)) +
  geom_point() +
  geom_smooth(method = "lm") +
  facet_grid(. ~ type, scales = "free_x", drop = T) +
  theme_bw()

d %>%
  group_by(type, sex) %>%
  do(tidy(cor.test(.$mean, .$lang_bias))) %>%
  arrange(p.value) %>%
  select(-parameter, -method, -alternative) %>%
  kable()
type                     sex   estimate    statistic   p.value     conf.low     conf.high
country_gender_D_score   m     0.4477692   2.2395419   0.0366428    0.0322487   0.7313172
country_means_D_score    NA    0.3787269   1.8300407   0.0821900   -0.0510297   0.6901379
country_gender_D_score   f     0.1534496   0.6944725   0.4953750   -0.2867076   0.5401150

2 Explicit measures of bias

Next, we ask whether responses to the explicit questions asked alongside the IAT correlate with the language IAT measure.

We look at two measures (explained here: https://mfr.osf.io/render?url=https://osf.io/thvzf/?action=download%26mode=render):

assocareer: "How strongly do you associate the following with males and females? Career" (1 = female, 7 = male)

assofamily: "How strongly do you associate the following with males and females? Family" (1 = female, 7 = male)

# Read in behavior and language biases
countries_langs <- read_csv("../../data/other/countries_lang.csv") %>%
  mutate(language_name = ifelse(language_name == "Spanish; Castilian", "Spanish", language_name),
         language_name = ifelse(language_name == "Dutch; Flemish", "Dutch", language_name))


IAT_behavior_measures_exp <- read_csv("../behavior_IAT/IAT_behavior_measures.csv") %>%
  left_join(countries_langs %>% select(country_name, language_name, language_code)) %>%
  filter(type %in% c("country_fam_male", 
             "country_career_male")) %>%
  group_by(type, language_code, sex) %>%
  summarize(mean = mean(mean)) 

d <- left_join(IAT_behavior_measures_exp, lang_bias) %>%
  mutate_if(is.character, as.factor) %>%
  filter(!is.na(lang_bias))

2.1 Measure

ggplot(d, aes(x = mean)) +
  geom_histogram() +
  facet_wrap(sex ~ type, scales = "free_x") +
  theme_bw()

2.2 Correlation between measures

I find that the degree to which male participants explicitly associate family with males is negatively correlated with language bias. The career question is not predictive.

d %>%
  ggplot(aes(x = mean, y = lang_bias, group = sex, color = sex)) +
  geom_point() +
  geom_smooth(method = "lm") +
  facet_grid(. ~ type, scales = "free_x", drop = T) +
  theme_bw()

d %>%
  group_by(type, sex) %>%
  do(tidy(cor.test(.$mean, .$lang_bias))) %>%
  arrange(p.value) %>%
  select(-parameter, -method, -alternative) %>%
  kable()
type                  sex   estimate     statistic    p.value     conf.low     conf.high
country_fam_male      m     -0.4425174   -2.2068315   0.0391791   -0.7282558   -0.0257047
country_fam_male      all   -0.3651060   -1.7538813   0.0947633   -0.6817686    0.0667824
country_career_male   m     -0.2716986   -1.2625678   0.2212680   -0.6220509    0.1693037
country_fam_male      f     -0.2011662   -0.9184176   0.3693442   -0.5740846    0.2408714
country_career_male   f     -0.0035605   -0.0159230   0.9874536   -0.4245316    0.4186765

3 Relationship between grammar type and bias measures

Finally, we ask whether a language's grammatical encoding of gender predicts its bias. WALS, surprisingly, only has this coded for about half of the languages in our sample; I coded the rest from Wikipedia (https://en.wikipedia.org/wiki/List_of_languages_by_type_of_grammatical_genders). For languages coded in both sources, the two codings always agreed. Here are the distributions of languages across grammar types (CN = common/neuter, MF = masculine/feminine, MFN = masculine/feminine/neuter).
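
The agreement between the two codings can be checked directly; a quick sketch, assuming the coding file also contains a column named WALS (an assumption, since only the Wikipedia column is used in the analysis below):

# Check that WALS and Wikipedia codings agree wherever both are present.
# The column name `WALS` is an assumption; only `Wikipedia` appears in this report.
read_csv("../language_bias/gender_grammar.csv") %>%
  filter(!is.na(WALS), !is.na(Wikipedia)) %>%
  summarize(all_agree = all(WALS == Wikipedia))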

gender_data <- read_csv("../language_bias/gender_grammar.csv") %>%
  rename(wikipedia_grammar_type = Wikipedia) %>%
  select(language_code, language_name, wikipedia_grammar_type) %>%
  filter(!is.na(wikipedia_grammar_type)) %>%
  mutate(wikipedia_grammar_type2 = ifelse(wikipedia_grammar_type == "none", "none", "MF"))
kable(arrange(gender_data, wikipedia_grammar_type2))
language_code   language_name        wikipedia_grammar_type   wikipedia_grammar_type2
ar              arabic               MF                       MF
nl              dutch; flemish       CN                       MF
pt              portuguese           MF                       MF
da              danish               CN                       MF
it              italian              MF                       MF
no              norwegian            CN                       MF
pl              polish               MFN                      MF
sv              swedish              MFN                      MF
es              spanish; castilian   MF                       MF
de              german               MFN                      MF
ru              russian              MFN                      MF
fr              french               MF                       MF
el              greek                MFN                      MF
he              hebrew               MF                       MF
hr              croatian             MFN                      MF
ro              romanian             MFN                      MF
ja              japanese             none                     none
ko              korean               none                     none
ms              malay                none                     none
fa              persian              none                     none
zh              chinese              none                     none
fi              finnish              none                     none
id              indonesian           none                     none
tr              turkish              none                     none
en              english              none                     none
tl              tagalog              none                     none
th              thai                 none                     none
count(gender_data, wikipedia_grammar_type2)  %>%
  kable()
wikipedia_grammar_type2    n
MF                        16
none                      11

3.1 Implicit measures

For both IAT measures, languages with grammatical gender show larger implicit bias than languages without it, and this effect is larger for male participants.

3.1.1 Behavioral IAT measure

full_d <- IAT_behavior_measures_imp %>%
  full_join(lang_bias, by = "language_code") %>%
  full_join(gender_data, by = "language_code") %>%
  select(-contains("test"))

iat_means <-  full_d %>%
  filter(!is.na(wikipedia_grammar_type2)) %>%
  mutate(sex = ifelse(is.na(sex), "all", sex)) %>%
  group_by(type, sex, wikipedia_grammar_type2) %>%
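  # multi_boot_standard() (from the langcog package) computes bootstrapped 95% CIs for the mean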
  multi_boot_standard(col = "mean", na.rm = T)

ggplot(iat_means, aes(x = sex, y = mean, fill = wikipedia_grammar_type2)) +
  geom_bar(stat = "identity", position = "dodge") +
  facet_wrap(~type, scales = "free") +
  geom_linerange(aes(ymin = ci_lower, ymax = ci_upper),
                 position = position_dodge(width = .9)) +
  theme_bw()
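
As an illustrative check of this difference (not part of the original analysis), one could compare mean D-scores across the two grammar types, e.g. using the all-participant country means:

# Illustrative two-sample comparison (an added sketch, not a test run in the
# original report): mean D-scores by grammar type, all participants.
t.test(mean ~ wikipedia_grammar_type2,
       data = filter(full_d,
                     !is.na(wikipedia_grammar_type2),
                     type == "country_means_D_score"))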

3.1.2 Language IAT measures

lang_means <-  full_d %>%
  filter(!is.na(wikipedia_grammar_type2)) %>%
  mutate(sex = ifelse(is.na(sex), "all", sex)) %>%
  filter(sex == "all") %>%
  group_by(wikipedia_grammar_type2) %>%
  multi_boot_standard(col = "lang_bias", na.rm = T)

ggplot(lang_means, aes(x = wikipedia_grammar_type2, y = mean, fill = wikipedia_grammar_type2)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_linerange(aes(ymin = ci_lower, ymax = ci_upper),
                 position = position_dodge(width = .9)) +
  theme_bw()

3.2 Explicit measure

This effect is also in the predicted direction: languages that do not have grammatical gender are more likely to associate males with family (though, puzzlingly, the reverse appears to hold for career).

full_d <- IAT_behavior_measures_exp %>%
  full_join(lang_bias, by = "language_code") %>%
  full_join(gender_data, by = "language_code") %>%
  select(-contains("test"))

iat_means <-  full_d %>%
  filter(!is.na(wikipedia_grammar_type2)) %>%
  mutate(sex = ifelse(is.na(sex), "all", sex)) %>%
  group_by(type, sex, wikipedia_grammar_type2) %>%
  multi_boot_standard(col = "mean", na.rm = T)

ggplot(iat_means, aes(x = sex, y = mean, fill = wikipedia_grammar_type2)) +
  geom_bar(stat = "identity", position = "dodge") +
  facet_wrap(~type, scales = "free") +
  geom_linerange(aes(ymin = ci_lower, ymax = ci_upper),
                 position = position_dodge(width = .9)) +
  theme_bw()