Participants will be 120 children (after exclusions) with 60 infants from Stanford, CA, United States and 60 infants from the Chinese University of Hong Kong in Hong Kong, China. We will recruit infants between 12-24 months of age, stratified such that 30 children will be between 12;0 - 17;31 months of age and 30 children will be between 18;0 - 23;31 at each site. For our Bayesian power analysis, we will generate datasets corresponding to this design. We will estimate power to detect a moderate effect size of culture (Cohen’s d = 0.5).
n = 60 # participants per group (culture)
effect_size = .5 # assume a medium effect size
sim_d <- function(seed, n) {
# define the means for standardized DVs
mu_t <- effect_size
mu_c <- 0
set.seed(seed)
young = runif(n, -1, 0)
older = runif(n, 0, 1)
d <- tibble(id = seq(1:(2*n)),
age = c(young[1:(n/2)], older[1:(n/2)], young[((n/2)+1):n], older[((n/2)+1):n]),
group = rep(c("culture1", "culture2"), each = n)) %>%
mutate(culture = ifelse(group == "culture1", 0, 1),
y = ifelse(group == "culture1",
rnorm(n, mean = mu_c, sd = 1),
rnorm(n, mean = mu_t, sd = 1)))
return(d)
}
d <- sim_d(123, 60)
# get default brms prior
get_prior(data = d, family = gaussian,
y ~ 0 + intercept + age*culture)
## prior class coef group resp dpar nlpar bound
## 1 b
## 2 b age
## 3 b age:culture
## 4 b culture
## 5 b intercept
## 6 student_t(3, 0, 10) sigma
# random intercepts per subject or not?
#get_prior(data = d, family = gaussian,
# y ~ 0 + intercept + age*culture + (1|id))
The main analysis will entail six separate regressions, analyzing the proportion of relational play, frequency of joint attention, and level of parental support separately for the individual vs. guided play phases. The independent variables variables are age and culture. Let’s first run one Bayesian regression on our first generated dataset and see how it looks.
fit <- brm(data = d,
family = gaussian,
y ~ 0 + intercept + age * culture,
prior = c(prior(normal(0, 10), class = b),
prior(student_t(3, 0, 10), class = sigma)),
seed = 1, silent=T)
## Compiling the C++ model
## Start sampling
tidy(fit, prob = .95)
## term estimate std.error lower upper
## 1 b_intercept -0.03889714 0.12604184 -0.2830970 0.2074505
## 2 b_age -0.01879263 0.23470800 -0.4747172 0.4424618
## 3 b_culture 0.53419878 0.17957605 0.1920023 0.8949431
## 4 b_age:culture 0.37511153 0.31737778 -0.2469116 0.9906630
## 5 sigma 0.97806839 0.06439057 0.8652806 1.1164673
## 6 lp__ -182.48986615 1.62878138 -186.5515612 -180.3693312
plot(fit)
Now simulate more experiments.
#s %>%
# mutate(treatment = map(fit, tidy, prob = .95)) %>%
# unnest(treatment) %>%
# filter(term == "b_culture") %>%
# ggplot(aes(x = seed, y = estimate, ymin = lower, ymax = upper)) +
# geom_hline(yintercept = c(0, .5), color = "white") +
# geom_pointrange(fatten = 1/2) +
# labs(x = "seed (i.e., simulation index)",
# y = expression(beta[culture]))
s %>%
unnest(tidy) %>%
ggplot(aes(x = seed, y = estimate, ymin = lower, ymax = upper)) +
geom_pointrange(fatten = 1/2, alpha=.7) +
geom_hline(yintercept = c(0, .5), color = "white") +
labs(x = "seed (i.e., simulation index)",
y = expression(beta[culture]))
power = s %>%
#mutate(group = map(fit, tidy, prob = .95)) %>%
unnest(tidy) %>%
#filter(term == "b_culture") %>%
mutate(check = ifelse(lower > 0, 1, 0)) %>%
summarise(power = mean(check))
powerpct = unlist(power*100)
In 77.9% of our 1000 simulations, a group size of 60 per culture was sufficient to produce a 95% Bayesian credible interval that did not straddle 0.