All words

Here are all the words we normed, projected onto a two-dimensional semantic space. The semantic space comes from a word2vec model trained on English Wikipedia. Word color indicates the quartile of girl-relatedness (higher quartiles are more strongly associated with girls).

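library(tidyverse)

# Load the precomputed 2-D coordinates; treat the quartile as a categorical variable
coordinates <- read_csv("data/tsne_book_words.csv") %>%
  mutate(gender_tile = as.factor(gender_tile))

# Draw each word at its coordinates, colored by quartile of girl-relatedness
coordinates %>%
  ggplot(aes(x = tsne_X, y = tsne_Y, color = gender_tile)) +
  geom_text(aes(label = word), size = 2) +
  theme_void() +
  guides(color = guide_legend(title = "Quartile of \ngirl relatedness"))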
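The `gender_tile` column arrives precomputed in the CSV, and how it was derived is not shown here. As a minimal sketch of one way such quartiles could be obtained, the example below bins a numeric rating into four equal-sized groups with `dplyr::ntile()`. The column name `girl_score`, the placeholder words, and the scores are all hypothetical and serve only to illustrate the binning.

```r
library(dplyr)

# Toy input: `girl_score` is an assumed column name and the values are
# illustrative placeholders, not data from the norming study.
ratings <- tibble(
  word = paste0("word", 1:8),
  girl_score = c(1.2, 2.5, 3.0, 3.1, 4.6, 4.4, 4.9, 1.8)
)

# ntile() ranks the scores and splits them into 4 equal-sized bins,
# so bin 4 holds the words with the highest scores.
with_tiles <- ratings %>%
  mutate(gender_tile = as.factor(ntile(girl_score, 4)))

print(with_tiles)
```

Because `ntile()` forces bins of (nearly) equal size, tied scores can land in different quartiles; cutting at `quantile()` breakpoints would be an alternative if ties matter.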