Primary results
Induction task
Unless otherwise specified, analyses of the induction task were beta regressions with a logit link, predicting prevalence (.01-.99) with random intercepts for participant and test feature. Test feature (“can snap with their toes”, etc.) is technically nested within test feature type (physical, diet, personality), but because each test feature belongs to exactly one test feature type, a model with the nesting term is analytically equivalent to one without it, so the nesting term was omitted for simplicity of specification.
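The argument for dropping the nesting term depends on every test feature appearing under exactly one test feature type. A minimal sketch of how that premise could be checked is below. The data frame `toy` and the labels `feature_b` to `feature_d` are hypothetical placeholders, not the study's stimulus list.

```r
# Toy lookup table: hypothetical labels, NOT the study's stimulus list.
toy <- data.frame(
  test_feature = c("can snap with their toes", "feature_b",
                   "feature_c", "feature_d"),
  test_feature_type = c("physical", "physical", "diet", "personality"),
  stringsAsFactors = TRUE
)

# Each test feature should map to exactly one test feature type.
types_per_feature <- tapply(toy$test_feature_type, toy$test_feature,
                            function(x) length(unique(x)))
all_unique <- all(types_per_feature == 1)

# If so, the nested grouping factor (type:feature) splits the rows into
# the same groups as test_feature on its own.
nested <- interaction(toy$test_feature_type, toy$test_feature, drop = TRUE)
same_partition <- nlevels(nested) == nlevels(toy$test_feature)
```

When both `all_unique` and `same_partition` are `TRUE`, the feature-within-type grouping adds no groups beyond those already defined by `test_feature`.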
Overall
We first test whether prevalence estimates differ by condition, collapsing across test features.
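# effect of condition on prevalence (beta regression, logit link)
glmm_condition <-
  glmmTMB(prevalence ~ condition + (1 | participant) + (1 | test_feature_type),
          data = data_tidy,
          family = beta_family(link = "logit"))

# Type II Wald chi-square test (Anova() is from the car package)
glmm_condition %>%
  Anova()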
## Analysis of Deviance Table (Type II Wald chisquare tests)
##
## Response: prevalence
##            Chisq Df     Pr(>Chisq)
## condition 40.046  2 0.000000002014 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# pairwise condition contrasts with FDR-adjusted p-values
glmm_condition %>%
  emmeans("condition") %>%
  contrast(method = "pairwise") %>%
  summary(adjust = "FDR")
##  contrast                                       estimate    SE  df z.ratio p.value
##  heterogeneous generic - baseline                  0.468 0.102 Inf   4.607  <.0001
##  heterogeneous generic - heterogeneous specific    0.626 0.103 Inf   6.085  <.0001
##  baseline - heterogeneous specific                 0.158 0.101 Inf   1.562  0.1183
##
## Results are given on the log odds ratio (not the response) scale.
## P value adjustment: fdr method for 3 tests
There is a significant effect of condition on prevalence (\(\chi^2\)(2) = 40.05, p < .001), based on a Type II Wald chi-square test on a beta regression (logit link) with condition as a fixed effect and random intercepts for participant and test feature type.
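The heterogeneous generic condition led to greater generalization than baseline (estimate = 0.468 log odds, z = 4.607, p < .001) and than the heterogeneous specific condition (estimate = 0.626 log odds, z = 6.085, p < .001); p-values are FDR-adjusted.
The heterogeneous specific condition yielded numerically lower generalization than baseline, but this difference was not significant (estimate = 0.158 log odds, z = 1.562, p = .118).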
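Because the contrasts are reported on the log odds ratio scale, it can help to exponentiate them and to see where the adjusted p-values come from. The sketch below copies the estimates and z ratios from the contrast output above. It assumes the p-values are two-sided normal-theory values (df = Inf) adjusted with the Benjamini-Hochberg method, which is what `p.adjust(method = "fdr")` implements. The data frame name `contrasts_tbl` is introduced here for illustration.

```r
# Values copied from the emmeans contrast table above.
contrasts_tbl <- data.frame(
  contrast = c("heterogeneous generic - baseline",
               "heterogeneous generic - heterogeneous specific",
               "baseline - heterogeneous specific"),
  estimate = c(0.468, 0.626, 0.158),  # log odds ratios
  z_ratio  = c(4.607, 6.085, 1.562)
)

# Exponentiate the log odds ratios to obtain odds ratios.
contrasts_tbl$odds_ratio <- exp(contrasts_tbl$estimate)

# Two-sided p-values from the z ratios (assumes a normal reference
# distribution), then Benjamini-Hochberg (FDR) adjustment for 3 tests.
contrasts_tbl$p_raw <- 2 * pnorm(-abs(contrasts_tbl$z_ratio))
contrasts_tbl$p_fdr <- p.adjust(contrasts_tbl$p_raw, method = "fdr")

print(contrasts_tbl)
```

Under these assumptions, the adjusted p-value for the baseline versus heterogeneous specific contrast equals its unadjusted value, since it is the largest of the three.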
# # make contrast matrix for condition
# C <- matrix(
# c(
# # physical diet pers hetero
# 1, -1/3, -1/3, -1/3, # Contrast 1: physical vs others
# 0, 1, -1, 0, # Contrast 2: diet vs personality
# 0, 1, 0, -1, # Contrast 3: diet vs heterogeneous
# 1, 1, 1, 1 # Overall mean (intercept)
# ),
# nrow = 4,
# byrow = TRUE
# )
#
# # transpose so rows are condition levels and columns are contrasts
# C <- t(C)
# rownames(C) <- levels(data_tidy$condition)
#
# # apply the three true contrasts (columns 1-3; column 4 is the intercept)
# contrasts(data_tidy$condition) <- C[, 1:3]
#
#
# # condition * test feature type
# glmm_condition_testfeaturetype_phys <-
# glmmTMB(prevalence ~ condition * test_feature_type + (1|participant) + (1|test_feature),
# data = data_tidy,
# family = beta_family(link = "logit"))
#
# glmm_condition_testfeaturetype_phys %>%
# summary()
By test feature type
Next, we examine how prevalence judgments vary by condition and test feature type (i.e., physical, diet, or personality).
If the chosen clusters capture some systematicity in how people generalize, the physical condition should yield the highest prevalence estimates for physical test features, the diet condition for diet test features, and the personality condition for personality test features. This appears to hold for the physical and personality conditions, but not for the diet condition.
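The prediction above amounts to asking, for each test feature type, which condition has the highest mean prevalence. A minimal sketch of that comparison is below. The data frame `toy` and its prevalence values are invented for illustration and are NOT the study's data. With the real data, `data_tidy` would take the place of `toy`.

```r
# Toy cell values: invented for illustration, NOT the study's data.
toy <- expand.grid(
  condition = c("physical", "diet", "personality"),
  test_feature_type = c("physical", "diet", "personality"),
  stringsAsFactors = FALSE
)
toy$prevalence <- c(0.70, 0.40, 0.35,   # physical test features
                    0.45, 0.50, 0.55,   # diet test features
                    0.30, 0.35, 0.65)   # personality test features

# Mean prevalence per condition x test feature type cell.
cell_means <- aggregate(prevalence ~ condition + test_feature_type,
                        data = toy, FUN = mean)

# For each test feature type, find the condition with the highest mean.
top_condition <- sapply(split(cell_means, cell_means$test_feature_type),
                        function(d) d$condition[which.max(d$prevalence)])

# TRUE where the matching condition gives the highest estimate.
matches <- top_condition == names(top_condition)
```

This is only a descriptive check on cell means. A formal test would compare conditions within each test feature type in a model with a condition by test feature type interaction.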
By test feature
We can look at how prevalence judgments vary by condition and individual test feature.
Group characterization
Participants were asked to describe what characterizes Zarpies as a group. TBD