Load libraries
library(ggplot2)
library(dplyr)
library(tidyr)
library(purrr)
theme_set(theme_bw(base_size = 16))Generate two hypothetical parents that differ in their distributions of naming event qualities.
easy <- rbeta(50, 4, 3)
hard <- rbeta(50, 2, 3)
parents <- data.frame(event = 1:50, easy = easy, hard = hard) %>%
gather(type, size, -event)
ggplot(parents, aes(x = size)) +
facet_wrap(~ type) +
geom_histogram(bins = 10, color = "black", fill = "white") +
theme(panel.grid = element_blank())parents %>%
group_by(type) %>%
summarise(mean = mean(size))## # A tibble: 2 × 2
## type mean
## <chr> <dbl>
## 1 easy 0.576
## 2 hard 0.428
Suppose that learning works like this: Children observe a bunch of naming events. For each one, they have a probability of learning that is equal to the the 1/100 * size of the object. If they learn from at least one event, we say they learned it–there’s no aggregation.
test <- function(vals) {
samples <- runif(length(vals), 0, 100) < vals
sum(samples) >= 1
}Now let’s see what this looks like for the easy and hard events. Simulate each one 1000 times and see what proportion of kids learn from each parent.
test_kids <- parents %>%
split(.$type) %>%
map(function(df) mean(replicate(1000, test(df$size)))) %>%
bind_rows(.$type)
test_kids## # A tibble: 1 × 2
## easy hard
## <dbl> <dbl>
## 1 0.271 0.19
Ok, so the number of kids is higher for the easier parent, as expected
Now let’s try to sub-sample the naming events using the moving threshold technique.
# Subsample to the data to be less than or greater than a threshold, or alternatively just subset to a random proportion of the trails equal to threshold
test_prop <- function(vals, thresh = 1, type = "greater") {
if(type == "less") {
filtered_vals <- vals[vals <= thresh]
}
else if(type == "greater") {
filtered_vals <- vals[vals >= thresh]
}
else {
filtered_vals <- sample(vals, thresh * length(vals))
}
mean(replicate(10000,
test(filtered_vals)))
}threshes <- seq(0, 1, .1)
simulations <- expand.grid(thresh_type = c("greater", "less", "prop"),
parent_type = c("easy", "hard"),
thresh = seq(0, 1, .1)) %>%
rowwise() %>%
mutate(prop_learned = test_prop(filter(parents, type == parent_type)$size,
thresh = thresh,
type = thresh_type))ggplot(simulations, aes(x = thresh, y = prop_learned, fill = parent_type)) +
facet_grid(. ~ thresh_type) +
geom_bar(stat = "identity", position = "dodge")diffs <- simulations %>%
spread(parent_type, prop_learned) %>%
mutate(diff = easy - hard)
ggplot(diffs, aes(x = thresh, y = diff)) +
facet_grid(. ~ thresh_type) +
geom_point() +
scale_x_continuous(breaks = threshes) +
geom_hline(aes(yintercept = 0), linetype = "dashed") +
theme(panel.grid = element_blank())```