Frequency of (non)-colexification against WordNet relationship types
In general, this coarse pattern is also as expected. It is much more frequent for the two members of a pair not to be expressed by the same form in a language than for them to colexify. I’m not sure there’s much more to be said. The colexification counts cluster very close to 0 (medians of 0 to 2 across relationship types, against medians of 223 to 584 for non-colexification).
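Proportions against relationship type. This looks pretty much like what you hypothesized, right? The proportion of colexification is lower for antonyms than for hypernyms and meronyms, which are roughly on a par (antonyms < hypernyms ~ meronyms): mean proportions of roughly 0.017 for antonyms and 0.04 for hypernyms and meronyms. All three are higher than for pairs that stand in no relationship, where the mean is under 0.001.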
Proportions against cosine similarity, per relationship type. Again, I find this one less informative.
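If a number is easier to read than the plot, one option is a rank correlation between cosine similarity and colexification proportion within each relationship type. This is only a sketch on toy data: `cos.sim` is a hypothetical column name (the source does not show how cosine similarity is stored), and Spearman's rho is my choice here, not something the original analysis used.

```r
# Toy data; cos.sim is a hypothetical column name for the cosine similarity of a pair.
df_prop <- data.frame(
  wn.relation = rep(c("hypernymy", "no.rel"), each = 5),
  cos.sim     = c(0.2, 0.4, 0.5, 0.7, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5),
  colex.prop  = c(0.00, 0.01, 0.03, 0.02, 0.10, 0, 0, 0.001, 0, 0)
)

# Spearman's rho within each relation type; it is rank-based, so it does not
# assume a linear relationship between similarity and proportion.
rho_by_relation <- sapply(
  split(df_prop, df_prop$wn.relation),
  function(d) cor(d$cos.sim, d$colex.prop, method = "spearman")
)
print(rho_by_relation)
```

With so many proportions at exactly 0, the ranks will contain many ties, so the coefficients should be read as a rough indication only.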
library(dplyr)
# Counts of colexification (colex_pos), summarised per WordNet relationship type
df_freq_pos <- df_freq %>% filter(type == 'colex_pos')
tapply(df_freq_pos$freq, df_freq_pos$wn.relation, summary)
## $antonymy
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00 0.00 0.00 8.35 4.00 102.00
##
## $hypernymy
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 0.000 1.000 9.438 6.000 235.000
##
## $meronymy
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00 0.00 2.00 23.45 19.50 300.00
##
## $no.rel
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0000 0.0000 0.0000 0.1412 0.0000 263.0000
# Counts of non-colexification (colex_neg), summarised per WordNet relationship type
df_freq_neg <- df_freq %>% filter(type == 'colex_neg')
tapply(df_freq_neg$freq, df_freq_neg$wn.relation, summary)
## $antonymy
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 36.0 284.8 452.0 580.8 700.2 1830.0
##
## $hypernymy
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 10.0 158.2 245.0 294.8 332.5 1924.0
##
## $meronymy
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 30.0 288.5 584.0 618.5 774.5 1971.0
##
## $no.rel
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.0 129.0 223.0 245.2 303.0 2076.0
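The two count summaries above can also be put side by side in a single table, which makes the gap between colexification and non-colexification easier to see. A minimal sketch on toy data, assuming only the three columns already used above (`type`, `freq`, `wn.relation`); the values are made up.

```r
# Toy stand-in for df_freq; the real data frame is assumed to have these three columns.
df_freq <- data.frame(
  wn.relation = rep(c("antonymy", "no.rel"), each = 4),
  type        = rep(c("colex_pos", "colex_neg"), times = 4),
  freq        = c(0, 300, 4, 500, 0, 200, 0, 250)
)

# Median and mean frequency for every relation x type combination in one table.
side_by_side <- aggregate(
  freq ~ wn.relation + type,
  data = df_freq,
  FUN  = function(x) c(median = median(x), mean = mean(x))
)
print(side_by_side)
```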
# Proportion of colexification per pair, summarised per WordNet relationship type
tapply(df_prop$colex.prop, df_prop$wn.relation, summary)
## $antonymy
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000000 0.000000 0.000000 0.017144 0.004884 0.227692
##
## $hypernymy
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000000 0.000000 0.003098 0.040312 0.036972 0.545454
##
## $meronymy
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000000 0.000000 0.003802 0.037331 0.032796 0.372294
##
## $no.rel
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0000000 0.0000000 0.0000000 0.0004481 0.0000000 0.8694030
This pretty much mirrors the plot above.
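For reference, here is how proportions like these could be derived from the counts. The source does not show how `colex.prop` was computed, so the definition below (colexification count divided by the total of both counts for a pair) is an assumption, and `pair.id` is a hypothetical identifier column; the numbers are toy values.

```r
library(dplyr)

# Toy counts; pair.id is a hypothetical identifier for a concept pair.
df_freq <- data.frame(
  pair.id     = rep(1:3, each = 2),
  wn.relation = rep(c("antonymy", "meronymy", "no.rel"), each = 2),
  type        = rep(c("colex_pos", "colex_neg"), times = 3),
  freq        = c(4, 396, 20, 180, 0, 250)
)

# Assumed definition: colex_pos / (colex_pos + colex_neg) for each pair.
df_prop <- df_freq %>%
  group_by(pair.id, wn.relation) %>%
  summarise(
    colex.prop = sum(freq[type == "colex_pos"]) / sum(freq),
    .groups = "drop"
  )

tapply(df_prop$colex.prop, df_prop$wn.relation, summary)
```

Under this definition the mean of the per-pair proportions need not equal the ratio of the mean counts, which would be consistent with the summaries above, where the two do not match exactly.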
tapply(df_prop$colex.prop, df_prop$wn.relation, summary)
## $antonymy
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000000 0.000000 0.000000 0.017144 0.004884 0.227692
##
## $hypernymy
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000000 0.000000 0.002918 0.035569 0.027221 0.545454
##
## $meronymy
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000000 0.000000 0.003802 0.037331 0.032796 0.372294
##
## $no.rel
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0000000 0.0000000 0.0000000 0.0004708 0.0000000 0.8694030