Here we test two hypotheses:

  • As predicted by Caliskan et al., cross-cultural variability in implicit bias will be correlated with bias measured from word embedding models.
  • Bias (for both the language and behavioral IAT measures) will be larger for languages that encode gender grammatically than for those that do not.

Both of these predictions pan out. The relationships are particularly strong for male IAT participants; this could be because there is a male bias among Wikipedia editors.

1 Implicit measures of bias (IAT score)

Here we look at whether behavioral IAT scores (from Project Implicit) correlate with language bias across a range of different languages. Note that this analysis is at the language level (averaging the IAT behavioral data across countries that speak the same language).

Though I explored a range of different measures, the pre-calculated D-score appears to be the best measure of bias. Higher D-scores indicate more male-career bias. I look at D-score by participant gender, and with all participants.
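
Roughly, the D-score is the difference in mean response latency between the incongruent and congruent IAT blocks, scaled by the standard deviation of all latencies. Here is a simplified sketch (ignoring the trial-level exclusions and practice/test block structure of Greenwald et al.'s full scoring algorithm; the latencies below are made up):

# Hypothetical latencies (ms) from one participant, simplified for illustration:
# congruent block pairs male + career / female + family; incongruent pairs the reverse.
congruent_rt   <- c(650, 700, 720, 680, 640)
incongruent_rt <- c(820, 790, 900, 760, 850)

# D = (mean incongruent latency - mean congruent latency) / SD of all latencies;
# positive values indicate a stronger male-career association.
d_score <- (mean(incongruent_rt) - mean(congruent_rt)) /
  sd(c(congruent_rt, incongruent_rt))
d_score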

IAT scores calculated from the embedding models are based on the Wikipedia fastText vectors (https://github.com/facebookresearch/fastText/blob/master/pretrained-vectors.md). I use the same measure of effect size as Caliskan et al.
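
For concreteness, here is a minimal sketch of that effect size (the WEAT measure from Caliskan et al.), under the assumption that the word vectors are available as a matrix with one row per word; the function and variable names are illustrative, not the code that produced career_effect_sizes.csv:

# `vecs`: matrix of word embeddings, one row per word (rownames = words);
# X, Y: target word sets (e.g. male vs. female names); A, B: attribute word sets
# (e.g. career vs. family words). These names are illustrative assumptions.
cosine_sim <- function(a, b) sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))

# s(w, A, B): mean cosine similarity of word w to set A minus its mean similarity to set B
assoc <- function(w, A, B, vecs) {
  mean(sapply(A, function(a) cosine_sim(vecs[w, ], vecs[a, ]))) -
    mean(sapply(B, function(b) cosine_sim(vecs[w, ], vecs[b, ])))
}

# Effect size: difference in mean association between the two target sets,
# standardized by the SD of associations across all target words
weat_effect_size <- function(X, Y, A, B, vecs) {
  s_x <- sapply(X, assoc, A = A, B = B, vecs = vecs)
  s_y <- sapply(Y, assoc, A = A, B = B, vecs = vecs)
  (mean(s_x) - mean(s_y)) / sd(c(s_x, s_y))
}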

# Read in behavior and language biases
countries_langs <- read_csv("../../data/other/countries_lang.csv") %>%
  mutate(language_name = ifelse(language_name == "Spanish; Castilian", "Spanish", language_name),
         language_name = ifelse(language_name == "Dutch; Flemish", "Dutch", language_name))

IAT_behavior_measures_imp <- read_csv("../behavior_IAT/IAT_behavior_measures.csv") %>%
  left_join(countries_langs %>% select(country_name, language_name, language_code)) %>%
  filter(type %in% c("country_means_D_score", 
             "country_gender_D_score")) %>%
  group_by(type, language_code, sex) %>%
  summarize(mean = mean(mean)) 

lang_bias <- read_csv("career_effect_sizes.csv") 

d <- left_join(IAT_behavior_measures_imp, lang_bias) %>%
  mutate_if(is.character, as.factor) %>%
  filter(!is.na(lang_bias))

There are 22 languages represented here.
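
This count can be confirmed directly from the merged data (a quick check, not part of the original code):

# Number of distinct languages after merging the behavioral and language measures
n_distinct(d$language_code)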

1.1 Measures

1.1.1 Behavior

ggplot(d, aes(x = mean)) +
  geom_histogram() +
  facet_wrap(~ type + sex, scales = "free_x") +
  theme_bw()

1.1.2 Language

ggplot(lang_bias, aes(x = lang_bias)) +
  geom_histogram() +
  theme_bw()

1.2 Correlation between measures

Each point below is a language. We find that male IAT bias is reliably correlated with bias from the embedding models.

d %>%
  ggplot(aes(x = mean, y = lang_bias, group = sex, color = sex)) +
  geom_point() +
  geom_smooth(method = "lm") +
  facet_grid(. ~ type, scales = "free_x", drop = T) +
  theme_bw()

d %>%
  group_by(type, sex) %>%
  do(tidy(cor.test(.$mean, .$lang_bias))) %>%
  arrange(p.value) %>%
  select(-parameter, -method, -alternative) %>%
  kable()
type                     sex   estimate    statistic   p.value     conf.low     conf.high
country_gender_D_score   m     0.4477692   2.2395419   0.0366428    0.0322487   0.7313172
country_means_D_score    NA    0.3787269   1.8300407   0.0821900   -0.0510297   0.6901379
country_gender_D_score   f     0.1534496   0.6944725   0.4953750   -0.2867076   0.5401150

2 Explicit measures of bias

Next, we ask whether responses to the explicit questions asked alongside the IAT correlate with the language IAT measure.

We look at two measures (explained here: https://mfr.osf.io/render?url=https://osf.io/thvzf/?action=download%26mode=render):

assocareer: "How strongly do you associate the following with males and females? Career" (1 = female, 7 = male)

assofamily: "How strongly do you associate the following with males and females? Family" (1 = female, 7 = male)

# Read in behavior and language biases
countries_langs <- read_csv("../../data/other/countries_lang.csv") %>%
  mutate(language_name = ifelse(language_name == "Spanish; Castilian", "Spanish", language_name),
         language_name = ifelse(language_name == "Dutch; Flemish", "Dutch", language_name))


IAT_behavior_measures_exp <- read_csv("../behavior_IAT/IAT_behavior_measures.csv") %>%
  left_join(countries_langs %>% select(country_name, language_name, language_code)) %>%
  filter(type %in% c("country_fam_male", 
             "country_career_male")) %>%
  group_by(type, language_code, sex) %>%
  summarize(mean = mean(mean)) 

d <- left_join(IAT_behavior_measures_exp, lang_bias) %>%
  mutate_if(is.character, as.factor) %>%
  filter(!is.na(lang_bias))

2.1 Measure

ggplot(d, aes(x = mean)) +
  geom_histogram() +
  facet_wrap(sex ~ type, scales = "free_x") +
  theme_bw()

2.2 Correlation between measures

I find that the degree to which male participants explicitly associate family with males is negatively correlated with language bias. The career question is not predictive.

d %>%
  ggplot(aes(x = mean, y = lang_bias, group = sex, color = sex)) +
  geom_point() +
  geom_smooth(method = "lm") +
  facet_grid(. ~ type, scales = "free_x", drop = T) +
  theme_bw()

d %>%
  group_by(type, sex) %>%
  do(tidy(cor.test(.$mean, .$lang_bias))) %>%
  arrange(p.value) %>%
  select(-parameter, -method, -alternative) %>%
  kable()
type                  sex   estimate     statistic    p.value     conf.low     conf.high
country_fam_male      m     -0.4425174   -2.2068315   0.0391791   -0.7282558   -0.0257047
country_fam_male      all   -0.3651060   -1.7538813   0.0947633   -0.6817686    0.0667824
country_career_male   m     -0.2716986   -1.2625678   0.2212680   -0.6220509    0.1693037
country_fam_male      f     -0.2011662   -0.9184176   0.3693442   -0.5740846    0.2408714
country_career_male   f     -0.0035605   -0.0159230   0.9874536   -0.4245316    0.4186765

3 Relationship between grammar type and bias measures

Finally, we ask whether a language's grammatical encoding of gender predicts its bias. WALS, surprisingly, only has this coded for about half of the languages in our sample; I coded the rest from Wikipedia (https://en.wikipedia.org/wiki/List_of_languages_by_type_of_grammatical_genders). For languages coded in both sources, the two codings always agreed. Here are the distributions of languages across grammar types (CN = common/neuter, MF = masculine/feminine, MFN = masculine/feminine/neuter).
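
The agreement between the two codings can be checked directly; a quick sketch, assuming the coding file also contains a column named WALS (an assumption, since only the Wikipedia column is used in the analysis below):

# Check that WALS and Wikipedia codings agree wherever both are present.
# The column name `WALS` is an assumption; only `Wikipedia` appears in this report.
read_csv("../language_bias/gender_grammar.csv") %>%
  filter(!is.na(WALS), !is.na(Wikipedia)) %>%
  summarize(all_agree = all(WALS == Wikipedia))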

gender_data <- read_csv("../language_bias/gender_grammar.csv") %>%
  rename(wikipedia_grammar_type = Wikipedia) %>%
  select(language_code, language_name, wikipedia_grammar_type) %>%
  filter(!is.na(wikipedia_grammar_type)) %>%
  mutate(wikipedia_grammar_type2 = ifelse(wikipedia_grammar_type == "none", "none", "MF"))
kable(arrange(gender_data, wikipedia_grammar_type2))
language_code   language_name        wikipedia_grammar_type   wikipedia_grammar_type2
ar              arabic               MF                       MF
nl              dutch; flemish       CN                       MF
pt              portuguese           MF                       MF
da              danish               CN                       MF
it              italian              MF                       MF
no              norwegian            CN                       MF
pl              polish               MFN                      MF
sv              swedish              MFN                      MF
es              spanish; castilian   MF                       MF
de              german               MFN                      MF
ru              russian              MFN                      MF
fr              french               MF                       MF
el              greek                MFN                      MF
he              hebrew               MF                       MF
hr              croatian             MFN                      MF
ro              romanian             MFN                      MF
ja              japanese             none                     none
ko              korean               none                     none
ms              malay                none                     none
fa              persian              none                     none
zh              chinese              none                     none
fi              finnish              none                     none
id              indonesian           none                     none
tr              turkish              none                     none
en              english              none                     none
tl              tagalog              none                     none
th              thai                 none                     none
count(gender_data, wikipedia_grammar_type2)  %>%
  kable()
wikipedia_grammar_type2    n
MF                        16
none                      11

3.1 Implicit measures

For both IAT measures, languages with grammatical gender show larger implicit bias than languages without it, and this effect is larger for male participants.

3.1.1 Behavioral IAT measure

full_d <- IAT_behavior_measures_imp %>%
  full_join(lang_bias, by = "language_code") %>%
  full_join(gender_data, by = "language_code") %>%
  select(-contains("test"))

iat_means <-  full_d %>%
  filter(!is.na(wikipedia_grammar_type2)) %>%
  mutate(sex = ifelse(is.na(sex), "all", sex)) %>%
  group_by(type, sex, wikipedia_grammar_type2) %>%
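  # multi_boot_standard() (from the langcog package) computes bootstrapped 95% CIs for the mean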
  multi_boot_standard(col = "mean", na.rm = T)

ggplot(iat_means, aes(x = sex, y = mean, fill = wikipedia_grammar_type2)) +
  geom_bar(stat = "identity", position = "dodge") +
  facet_wrap(~type, scales = "free") +
  geom_linerange(aes(ymin = ci_lower, ymax = ci_upper),
                 position = position_dodge(width = .9)) +
  theme_bw()
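
As an illustrative check of this difference (not part of the original analysis), one could compare mean D-scores across the two grammar types, e.g. using the all-participant country means:

# Illustrative two-sample comparison (an added sketch, not a test run in the
# original report): mean D-scores by grammar type, all participants.
t.test(mean ~ wikipedia_grammar_type2,
       data = filter(full_d,
                     !is.na(wikipedia_grammar_type2),
                     type == "country_means_D_score"))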

3.1.2 Language IAT measures

lang_means <-  full_d %>%
  filter(!is.na(wikipedia_grammar_type2)) %>%
  mutate(sex = ifelse(is.na(sex), "all", sex)) %>%
  filter(sex == "all") %>%
  group_by(wikipedia_grammar_type2) %>%
  multi_boot_standard(col = "lang_bias", na.rm = T)

ggplot(lang_means, aes(x = wikipedia_grammar_type2, y = mean, fill = wikipedia_grammar_type2)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_linerange(aes(ymin = ci_lower, ymax = ci_upper),
                 position = position_dodge(width = .9)) +
  theme_bw()

3.2 Explicit measure

This effect is also in the predicted direction: languages that do not have grammatical gender are more likely to associate males with family (though, puzzlingly, the reverse appears to hold for career).

full_d <- IAT_behavior_measures_exp %>%
  full_join(lang_bias, by = "language_code") %>%
  full_join(gender_data, by = "language_code") %>%
  select(-contains("test"))

iat_means <-  full_d %>%
  filter(!is.na(wikipedia_grammar_type2)) %>%
  mutate(sex = ifelse(is.na(sex), "all", sex)) %>%
  group_by(type, sex, wikipedia_grammar_type2) %>%
  multi_boot_standard(col = "mean", na.rm = T)

ggplot(iat_means, aes(x = sex, y = mean, fill = wikipedia_grammar_type2)) +
  geom_bar(stat = "identity", position = "dodge") +
  facet_wrap(~type, scales = "free") +
  geom_linerange(aes(ymin = ci_lower, ymax = ci_upper),
                 position = position_dodge(width = .9)) +
  theme_bw()