## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.4.0     ✔ purrr   0.3.4
## ✔ tibble  3.1.7     ✔ dplyr   1.0.9
## ✔ tidyr   1.2.0     ✔ stringr 1.5.0
## ✔ readr   2.1.2     ✔ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## here() starts at /home/vboyce/Research/mia23

These visualizations are based on 139 submissions.

dat <- semantic |> select(-Timestamp, -`Email Address`, -Name, -`SUNet ID`) |> 
  pivot_longer(everything()) |> separate(name, c("type", "instance")) 



ggplot(dat, aes(x=reorder(instance, value), y=value, color=type))+
  geom_jitter(height=.1, width=.2, alpha=.1)+
  stat_summary(fun.data = "mean_cl_boot", color="black")+
  facet_wrap(~type, scales="free_x")+
  theme(legend.position = "none")+
  labs(y="Goodness as example of class", x="")

ggsave(here("images", "dist.png"), device="png")

There looks to be a correlation between the average ratings and the spread of the ratings. The prototypical examples like apple get almost uniformly rated as a 1 (“good example of the class”) whereas less prototypical examples have less agreement – some people think wrestling is a good example of a sport, but others don’t.
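One rough way to check that impression is to compute a mean and a standard deviation per item and correlate the two. The sketch below uses a made-up `toy` table with the same columns as `dat` (`type`, `instance`, `value`); the numbers are invented for illustration and are not the survey data, and `item_summary` is a name introduced here.

```r
library(dplyr)

# Made-up ratings standing in for `dat` (NOT the survey data)
toy <- tibble(
  type     = "fruit",
  instance = rep(c("apple", "fig", "olive"), each = 4),
  value    = c(1, 1, 1, 2,  2, 3, 4, 3,  3, 5, 7, 6)
)

# One row per item: average rating and spread of the ratings
item_summary <- toy |>
  group_by(type, instance) |>
  summarise(
    mean_rating = mean(value),
    sd_rating   = sd(value),
    .groups = "drop"
  )

# Rank correlation between how typical an item is rated and how much raters disagree
cor(item_summary$mean_rating, item_summary$sd_rating, method = "spearman")
```

Some of any such correlation is built into a bounded scale: an item whose mean sits near 1 cannot have much spread, so a positive value here would not by itself show that people disagree more about atypical items.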

This might have something to do with how people are using the scale. Maybe some people are thinking “well, an ostrich isn’t that great of a bird, but at least it’s still a bird” and would only use 6 or 7 on the scale for, say, lizard (it’s such a bad example of a bird, it’s not even a bird). I’m surprised by the ostrich / wren ordering.
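If scale use differs between people, it should show up in how much of the scale each person uses. Since `dat` no longer carries a participant identifier, the sketch below adds a hypothetical `rater` index from the row number before reshaping. The `toy_wide` table and its column names are invented for illustration; the real column names in `semantic` may differ.

```r
library(dplyr)
library(tidyr)

# Made-up wide responses, one row per submission (NOT the survey data)
toy_wide <- tibble(
  fruit_apple  = c(1, 1, 2),
  bird_ostrich = c(3, 5, 4),
  bird_wren    = c(2, 1, 7)
)

# Hypothetical rater id taken from the row number, then lowest and highest rating per rater
rater_range <- toy_wide |>
  mutate(rater = row_number()) |>
  pivot_longer(-rater) |>
  group_by(rater) |>
  summarise(lowest = min(value), highest = max(value), .groups = "drop")

rater_range
```

Raters whose `highest` never reaches 6 or 7 would fit the idea that some people reserve the far end of the scale for non-members of the class.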