Primary results
Induction task
Unless otherwise specified, analyses of the induction task were mixed-effects beta regressions with a logit link (referred to below as logistic regressions), predicting prevalence (.01-.99) with participant and test feature as random intercepts. Test feature (“can snap with their toes”, etc.) is technically nested within test feature type (physical, diet, personality), but since each test feature belongs to exactly one test feature type, a model with the nesting term is analytically equivalent to the model without it, so the nesting term was omitted for simplicity of specification.
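The claim about nesting can be checked on the design itself: when every test feature label occurs under exactly one type, the grouping factor for "feature within type" splits the rows in exactly the same way as the feature factor alone. A minimal sketch in base R, using a hypothetical toy design (all feature labels below are invented for illustration and are not the study's items):

```r
# Hypothetical toy design: each test feature belongs to exactly one type.
design <- data.frame(
  test_feature = c("toy physical A", "toy physical B",
                   "toy diet A", "toy diet B",
                   "toy personality A", "toy personality B"),
  test_feature_type = c("physical", "physical", "diet", "diet",
                        "personality", "personality")
)

# Grouping factor for "feature nested within type" (type:feature).
nested_group <- interaction(design$test_feature_type, design$test_feature,
                            drop = TRUE)
# Grouping factor for feature alone.
feature_group <- factor(design$test_feature)

# The two factors induce the same partition of the rows exactly when their
# cross-tabulation has a single non-zero cell in every row and every column.
cross_tab <- table(nested_group, feature_group)
same_partition <- all(rowSums(cross_tab > 0) == 1) &&
  all(colSums(cross_tab > 0) == 1)
print(same_partition)
```

Note that a full nesting term written as `(1 | test_feature_type / test_feature)` would also add a random intercept for type itself; in models that already include test feature type as a fixed effect, that extra term is presumably redundant.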
By test feature
We can look at how prevalence judgments vary by condition and individual test feature.
By test feature type
We can look at how prevalence judgments vary by condition and test feature type (i.e., physical, diet, or personality).
If the chosen clusters capture some systematicity in how people generalize, the physical condition should produce the highest prevalence estimates for physical test features, the diet condition for diet test features, and the personality condition for personality test features. This appears to be true for the physical and personality conditions, but not for the diet condition.
# condition * test feature type
glmm_condition_testfeaturetype <-
glmmTMB(prevalence ~ condition * test_feature_type + (1|participant) + (1|test_feature),
data = data_tidy,
family = beta_family(link = "logit"))
glmm_condition_testfeaturetype %>%
Anova()
## Analysis of Deviance Table (Type II Wald chisquare tests)
##
## Response: prevalence
## Chisq Df Pr(>Chisq)
## condition 6.4863 3 0.09020 .
## test_feature_type 6.6197 2 0.03652 *
## condition:test_feature_type 86.1700 6 < 0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
glmm_condition_testfeaturetype %>%
emmeans(~ condition * test_feature_type) %>%
contrast(method = "pairwise") %>%
summary(adjust = "FDR")
## contrast estimate SE df z.ratio
## physical physical - diet physical 0.57177 0.134 Inf 4.258
## physical physical - personality physical 0.44917 0.135 Inf 3.334
## physical physical - heterogeneous physical 0.30804 0.135 Inf 2.284
## physical physical - physical diet 0.06678 0.203 Inf 0.329
## physical physical - diet diet 0.18582 0.236 Inf 0.786
## physical physical - personality diet 0.35124 0.236 Inf 1.485
## physical physical - heterogeneous diet 0.32310 0.237 Inf 1.364
## physical physical - physical personality -0.20105 0.203 Inf -0.989
## physical physical - diet personality -0.00323 0.237 Inf -0.014
## physical physical - personality personality -0.34012 0.237 Inf -1.437
## physical physical - heterogeneous personality -0.04994 0.237 Inf -0.211
## diet physical - personality physical -0.12261 0.135 Inf -0.910
## diet physical - heterogeneous physical -0.26374 0.135 Inf -1.956
## diet physical - physical diet -0.50499 0.236 Inf -2.137
## diet physical - diet diet -0.38595 0.203 Inf -1.900
## diet physical - personality diet -0.22054 0.236 Inf -0.933
## diet physical - heterogeneous diet -0.24868 0.237 Inf -1.050
## diet physical - physical personality -0.77282 0.237 Inf -3.267
## diet physical - diet personality -0.57501 0.203 Inf -2.828
## diet physical - personality personality -0.91189 0.237 Inf -3.853
## diet physical - heterogeneous personality -0.62171 0.237 Inf -2.625
## personality physical - heterogeneous physical -0.14113 0.135 Inf -1.043
## personality physical - physical diet -0.38239 0.237 Inf -1.616
## personality physical - diet diet -0.26334 0.237 Inf -1.113
## personality physical - personality diet -0.09793 0.203 Inf -0.482
## personality physical - heterogeneous diet -0.12607 0.237 Inf -0.532
## personality physical - physical personality -0.65021 0.237 Inf -2.746
## personality physical - diet personality -0.45240 0.237 Inf -1.911
## personality physical - personality personality -0.78929 0.203 Inf -3.880
## personality physical - heterogeneous personality -0.49910 0.237 Inf -2.105
## heterogeneous physical - physical diet -0.24126 0.237 Inf -1.019
## heterogeneous physical - diet diet -0.12221 0.237 Inf -0.516
## heterogeneous physical - personality diet 0.04320 0.237 Inf 0.182
## heterogeneous physical - heterogeneous diet 0.01506 0.203 Inf 0.074
## heterogeneous physical - physical personality -0.50908 0.237 Inf -2.149
## heterogeneous physical - diet personality -0.31127 0.237 Inf -1.314
## heterogeneous physical - personality personality -0.64816 0.237 Inf -2.735
## heterogeneous physical - heterogeneous personality -0.35798 0.203 Inf -1.760
## physical diet - diet diet 0.11904 0.135 Inf 0.883
## physical diet - personality diet 0.28445 0.135 Inf 2.106
## physical diet - heterogeneous diet 0.25632 0.136 Inf 1.889
## physical diet - physical personality -0.26783 0.204 Inf -1.316
## physical diet - diet personality -0.07001 0.237 Inf -0.296
## physical diet - personality personality -0.40690 0.237 Inf -1.718
## physical diet - heterogeneous personality -0.11672 0.237 Inf -0.493
## diet diet - personality diet 0.16541 0.135 Inf 1.225
## diet diet - heterogeneous diet 0.13727 0.136 Inf 1.012
## diet diet - physical personality -0.38687 0.237 Inf -1.635
## diet diet - diet personality -0.18906 0.203 Inf -0.929
## diet diet - personality personality -0.52594 0.237 Inf -2.221
## diet diet - heterogeneous personality -0.23576 0.237 Inf -0.995
## personality diet - heterogeneous diet -0.02814 0.136 Inf -0.207
## personality diet - physical personality -0.55228 0.237 Inf -2.332
## personality diet - diet personality -0.35447 0.237 Inf -1.497
## personality diet - personality personality -0.69136 0.204 Inf -3.397
## personality diet - heterogeneous personality -0.40117 0.237 Inf -1.692
## heterogeneous diet - physical personality -0.52414 0.237 Inf -2.210
## heterogeneous diet - diet personality -0.32633 0.237 Inf -1.376
## heterogeneous diet - personality personality -0.66322 0.237 Inf -2.795
## heterogeneous diet - heterogeneous personality -0.37304 0.204 Inf -1.831
## physical personality - diet personality 0.19781 0.136 Inf 1.460
## physical personality - personality personality -0.13907 0.136 Inf -1.025
## physical personality - heterogeneous personality 0.15111 0.136 Inf 1.111
## diet personality - personality personality -0.33689 0.136 Inf -2.482
## diet personality - heterogeneous personality -0.04670 0.136 Inf -0.343
## personality personality - heterogeneous personality 0.29018 0.136 Inf 2.130
## p.value
## 0.0014
## 0.0113
## 0.1055
## 0.8165
## 0.5275
## 0.2749
## 0.3076
## 0.4349
## 0.9891
## 0.2841
## 0.8758
## 0.4602
## 0.1514
## 0.1108
## 0.1554
## 0.4566
## 0.4349
## 0.0120
## 0.0412
## 0.0026
## 0.0520
## 0.4349
## 0.2257
## 0.4191
## 0.7167
## 0.7136
## 0.0412
## 0.1554
## 0.0026
## 0.1108
## 0.4349
## 0.7136
## 0.8820
## 0.9554
## 0.1108
## 0.3195
## 0.0412
## 0.1915
## 0.4697
## 0.1108
## 0.1554
## 0.3195
## 0.8303
## 0.2021
## 0.7167
## 0.3642
## 0.4349
## 0.2247
## 0.4566
## 0.1108
## 0.4349
## 0.8758
## 0.1001
## 0.2749
## 0.0112
## 0.2064
## 0.1108
## 0.3076
## 0.0412
## 0.1702
## 0.2802
## 0.4349
## 0.4191
## 0.0718
## 0.8165
## 0.1108
##
## Results are given on the log odds ratio (not the response) scale.
## P value adjustment: fdr method for 66 tests
Indeed, there is a significant interaction between condition and test feature type in an ANOVA conducted on a logistic regression with condition, test feature type, and their interaction as fixed effects, and with participant and test feature as random intercepts (\(\chi^2\)(6) = 86.17, p < .001). There is also a main effect of test feature type (\(\chi^2\)(2) = 6.62, p = .037) and a marginal effect of condition (\(\chi^2\)(3) = 6.49, p = .090).
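As a quick check on the reported statistics, each Wald test's p-value is the upper-tail probability of a chi-square distribution with the listed degrees of freedom. A minimal sketch in base R, with the statistics copied from the Anova() table above (the data frame `wald` is just an illustrative container):

```r
# Wald chi-square statistics and degrees of freedom from the Anova() table.
wald <- data.frame(
  term  = c("condition", "test_feature_type", "condition:test_feature_type"),
  chisq = c(6.4863, 6.6197, 86.1700),
  df    = c(3, 2, 6)
)

# Upper-tail probability of the chi-square distribution for each term.
wald$p_value <- pchisq(wald$chisq, df = wald$df, lower.tail = FALSE)
print(wald)
```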
When rating the prevalence of physical features, the physical condition produced significantly higher prevalence estimates than the diet condition (z = 4.26, FDR-corrected p = .0014) or personality condition (z = 3.33, p = .011), but did not differ significantly from the heterogeneous condition (z = 2.28, p = .11). All p-values for pairwise comparisons are FDR-corrected.
When rating the prevalence of diet features, the diet condition did not produce different prevalence estimates than the physical condition (z = 0.88, p = .47), personality condition (z = 1.23, p = .36), or heterogeneous condition (z = 1.01, p = .43).
When rating the prevalence of personality features, the personality condition produced only marginally higher prevalence estimates than the diet condition (z = 2.48, p = .072), and did not differ significantly from the heterogeneous condition (z = 2.13, p = .11) or the physical condition (z = 1.03, p = .43).
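The FDR correction used for these comparisons is the Benjamini-Hochberg procedure, which base R exposes as `p.adjust(method = "fdr")`. The sketch below recovers unadjusted two-sided p-values from three of the z-ratios quoted above and adjusts them. It is only an illustration of the mechanics: it adjusts over three tests, whereas the reported p-values were adjusted over all 66 pairwise comparisons, so the adjusted values here will not match the table.

```r
# z-ratios for the physical test features, copied from the contrast table.
z <- c(physical_vs_diet          = 4.258,
       physical_vs_personality   = 3.334,
       physical_vs_heterogeneous = 2.284)

# Unadjusted two-sided p-values from the standard normal distribution.
p_raw <- 2 * pnorm(-abs(z))

# Benjamini-Hochberg adjustment ("fdr" is an alias for "BH" in p.adjust).
# Assumption: only these three tests form the family, unlike the 66 above.
p_fdr <- p.adjust(p_raw, method = "fdr")

print(round(cbind(p_raw, p_fdr), 4))
```

Because the adjustment depends on the number and ranking of all tests in the family, the same z-ratio can yield a noticeably larger adjusted p-value when it is one of 66 comparisons rather than one of three.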
# make contrast matrix for condition
C <- matrix(
c(
# physical diet pers hetero
1, -1/3, -1/3, -1/3, # Contrast 1: physical vs others
0, 1, -1, 0, # Contrast 2: diet vs personality
0, 1, 0, -1, # Contrast 3: diet vs heterogeneous
1, 1, 1, 1 # Overall mean (intercept)
),
nrow = 4,
byrow = TRUE
)
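# assign row names: rows of a coding matrix correspond to the factor levels
rownames(C) <- levels(data_tidy$condition)
# apply the first 3 columns of C as the coding matrix for condition
# (C was filled by row, so the contrasts written above sit in its rows, not in these columns)
contrasts(data_tidy$condition) <- C[,1:3]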
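The comments in the contrast setup describe three planned comparisons (physical vs. the other conditions, diet vs. personality, diet vs. heterogeneous). One common way to make model coefficients estimate such comparisons is to write them as rows of a hypothesis matrix and use its inverse as the coding matrix, whose rows are factor levels and whose columns are coefficients. The following is a sketch of that approach under stated assumptions, not necessarily the coding used for the output below: the names `H`, `coding`, `toy_condition`, and `cell_means` are hypothetical, the intercept row uses weights of 1/4 so that the intercept is the grand mean, and the cell means are invented.

```r
condition_levels <- c("physical", "diet", "personality", "heterogeneous")

# Hypothesis matrix: one row per comparison, one column per factor level.
H <- rbind(
  intercept          = c(1/4,  1/4,  1/4,  1/4),
  physical_vs_others = c(1,   -1/3, -1/3, -1/3),
  diet_vs_pers       = c(0,    1,   -1,    0),
  diet_vs_hetero     = c(0,    1,    0,   -1)
)
colnames(H) <- condition_levels

# Coding matrix: inverse of the hypothesis matrix
# (rows = factor levels, columns = coefficients).
coding <- solve(H)
dimnames(coding) <- list(condition_levels, rownames(H))

# Toy factor standing in for the condition variable (hypothetical data).
toy_condition <- factor(condition_levels, levels = condition_levels)
contrasts(toy_condition) <- coding[, -1]  # drop the intercept column

# Check with invented cell means: the coefficients implied by this coding
# equal the comparisons specified in the rows of H.
cell_means <- c(physical = 0.6, diet = 0.1,
                personality = 0.3, heterogeneous = 0.2)
X <- cbind(intercept = 1, contrasts(toy_condition))
coefs <- solve(X, cell_means)

print(round(drop(coefs), 3))
print(round(drop(H %*% cell_means), 3))
```

Inverting the hypothesis matrix matters here because the second and third comparisons are not orthogonal; simply transposing the weights would not, in general, give coefficients with the intended interpretation.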
# condition * test feature type
glmm_condition_testfeaturetype_phys <-
glmmTMB(prevalence ~ condition * test_feature_type + (1|participant) + (1|test_feature),
data = data_tidy,
family = beta_family(link = "logit"))
glmm_condition_testfeaturetype_phys %>%
summary()
## Family: beta ( logit )
## Formula:
## prevalence ~ condition * test_feature_type + (1 | participant) +
## (1 | test_feature)
## Data: data_tidy
##
## AIC BIC logLik -2*log(L) df.resid
## -3517.8 -3417.4 1773.9 -3547.8 5940
##
## Random effects:
##
## Conditional model:
## Groups Name Variance Std.Dev.
## participant (Intercept) 0.72984 0.8543
## test_feature (Intercept) 0.09435 0.3072
## Number of obs: 5955, groups: participant, 397; test_feature, 15
##
## Dispersion parameter for beta family (): 3.6
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.31645 0.27317 1.158 0.246683
## condition1 0.01851 0.23386 0.079 0.936905
## condition2 -0.35364 0.16843 -2.100 0.035759
## condition3 0.12262 0.13467 0.911 0.362553
## test_feature_typediet -0.22892 0.24380 -0.939 0.347752
## test_feature_typepersonality 0.88587 0.24477 3.619 0.000295
## condition1:test_feature_typediet 0.17504 0.14596 1.199 0.230423
## condition2:test_feature_typediet 0.32682 0.10515 3.108 0.001882
## condition3:test_feature_typediet -0.28803 0.08395 -3.431 0.000602
## condition1:test_feature_typepersonality -0.64558 0.14731 -4.382 0.0000117
## condition2:test_feature_typepersonality -0.09657 0.10618 -0.910 0.363073
## condition3:test_feature_typepersonality 0.21427 0.08495 2.522 0.011658
##
## (Intercept)
## condition1
## condition2 *
## condition3
## test_feature_typediet
## test_feature_typepersonality ***
## condition1:test_feature_typediet
## condition2:test_feature_typediet **
## condition3:test_feature_typediet ***
## condition1:test_feature_typepersonality ***
## condition2:test_feature_typepersonality
## condition3:test_feature_typepersonality *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
By test feature type match
Another way to look at the data is to code responses by whether the test feature type matched the training condition. If they match (e.g., diet condition responding to a diet test question), we code the response as a match; if they differ (e.g., diet condition responding to a personality test question), we code it as a mismatch. We leave the heterogeneous condition as its own category, since it’s a semi-match to every test feature type.
If the chosen clusters capture some systematicity in how people generalize, matches should result in higher prevalence estimates than mismatches. Indeed, that’s what we find.
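The three-level coding described above can be sketched as follows. This is a minimal illustration in base R on a hypothetical toy grid of condition and test feature type combinations; only the column names `condition`, `test_feature_type`, and `condition_test_match` are taken from the analysis code, and the actual derivation of `condition_test_match` in `data_tidy` is not shown in this report.

```r
# Hypothetical toy grid: every condition crossed with every test feature type.
toy <- expand.grid(
  condition = c("physical", "diet", "personality", "heterogeneous"),
  test_feature_type = c("physical", "diet", "personality"),
  stringsAsFactors = FALSE
)

# Heterogeneous stays its own category; otherwise compare condition and type.
toy$condition_test_match <- ifelse(
  toy$condition == "heterogeneous",
  "heterogeneous",
  ifelse(toy$condition == toy$test_feature_type, "match", "mismatch")
)

print(table(toy$condition_test_match))
```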
# condition-test match
glmm_condition_test_match <-
glmmTMB(prevalence ~ condition_test_match + (1|participant) + (1|test_feature),
data = data_tidy,
family = beta_family(link = "logit"))
glmm_condition_test_match %>%
Anova()
## Analysis of Deviance Table (Type II Wald chisquare tests)
##
## Response: prevalence
## Chisq Df Pr(>Chisq)
## condition_test_match 74.026 2 < 0.00000000000000022 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
glmm_condition_test_match %>%
emmeans(~ condition_test_match) %>%
contrast(method = "pairwise") %>%
summary(adjust = "FDR")
## contrast estimate SE df z.ratio p.value
## match - heterogeneous 0.2422 0.106 Inf 2.285 0.0335
## match - mismatch 0.2570 0.030 Inf 8.577 <0.0001
## heterogeneous - mismatch 0.0148 0.105 Inf 0.142 0.8874
##
## Results are given on the log odds ratio (not the response) scale.
## P value adjustment: fdr method for 3 tests
Indeed, there is a main effect of whether condition and test variables match (match, heterogeneous, or mismatch) on prevalence, in an ANOVA conducted on a logistic regression with match as a main effect, and with participant and test feature as random intercepts (\(\chi^2\)(2) = 74.03, p < .001). Post-hoc FDR-corrected pairwise comparisons reveal that the matching condition results in higher prevalence estimates of test features than the mismatching conditions (z = 8.58, p < .001) or the heterogeneous condition (z = 2.29, p = .034). The heterogeneous and mismatching conditions did not differ (z = 0.14, p = .89).
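Because the contrast estimates are on the log odds ratio scale, it can help to translate them. The sketch below exponentiates the estimates from the contrast table to obtain odds ratios, and then shows what the match vs. mismatch difference would imply for a mean prevalence of .50. That baseline is a purely hypothetical value chosen for illustration, not an estimate from the data.

```r
# Estimates copied from the pairwise contrast table (log odds ratio scale).
log_odds_diff <- c(match_vs_heterogeneous    = 0.2422,
                   match_vs_mismatch         = 0.2570,
                   heterogeneous_vs_mismatch = 0.0148)

# Odds ratios: exponentiate the log odds ratios.
odds_ratio <- exp(log_odds_diff)
print(round(odds_ratio, 3))

# Hypothetical baseline: suppose mean prevalence for mismatches were .50.
baseline_mismatch <- 0.50
implied_match <- plogis(qlogis(baseline_mismatch) +
                          log_odds_diff[["match_vs_mismatch"]])
print(round(implied_match, 3))
```

The implied shift on the response scale depends on the baseline, since equal steps in log odds correspond to smaller changes in prevalence near 0 or 1 than near .50.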
By cosine similarity
Instead of grouping features into discrete types, we can also look at the cosine similarity between each individual test feature and the training features presented in each condition, in the multidimensional embedding space.
For each test feature, we can calculate its average cosine similarity to the training features in each condition, and see if that metric predicts prevalence judgments.
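To make the predictor concrete, here is a minimal base R sketch of how an average cosine similarity (and the maximum cosine similarity that is used further below) could be computed for one test feature. The three-dimensional embeddings are invented for illustration; the actual embedding space and the code that produced `cosine_similarity_avg` and `cosine_similarity_max` are not shown in this report.

```r
# Cosine similarity between two numeric vectors.
cosine_similarity <- function(a, b) {
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Hypothetical embeddings: one test feature and three training features.
test_embedding <- c(1, 0, 1)
training_embeddings <- rbind(
  train_a = c(1, 0, 1),
  train_b = c(0, 1, 0),
  train_c = c(1, 1, 0)
)

# Similarity of the test feature to each training feature in the condition.
sims <- apply(training_embeddings, 1, cosine_similarity, b = test_embedding)

# Summaries used as predictors: average and maximum similarity.
cosine_similarity_avg <- mean(sims)
cosine_similarity_max <- max(sims)

print(sims)
print(c(avg = cosine_similarity_avg, max = cosine_similarity_max))
```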
# average cosine similarity of the test feature, to the training features in that condition
glmm_cosine_similarity_avg <-
glmmTMB(prevalence ~ cosine_similarity_avg + (1|participant) + (1|test_feature),
data = data_tidy,
family = beta_family(link = "logit"))
glmm_cosine_similarity_avg %>%
summary()
Indeed, there is a significant effect of average cosine similarity of the test feature to the various training features in the condition (\(z\) = 7.38, p < .001), such that higher average cosine similarity predicts higher prevalence estimates, in a logistic model with random intercepts per participant and test feature.
We can also focus on maximum cosine similarity, i.e., the similarity between the test feature and the most similar training feature in a given condition, and see if that metric predicts prevalence judgments.
# max cosine similarity of the test feature, to the most similar training feature in that condition
glmm_cosine_similarity_max <-
glmmTMB(prevalence ~ cosine_similarity_max + (1|participant) + (1|test_feature),
data = data_tidy,
family = beta_family(link = "logit"))
glmm_cosine_similarity_max %>%
summary()
Indeed, maximum cosine similarity is also a significant predictor of prevalence estimates (\(z\) = 7.24, p < .001), such that higher maximum cosine similarity predicts higher prevalence estimates, in a logistic model with random intercepts per participant and test feature.
glmmTMB(prevalence ~ cosine_similarity_avg + cosine_similarity_max + (1|participant) + (1|test_feature),
data = data_tidy,
family = beta_family(link = "logit")) %>%
summary()
However, when including both average and max cosine similarity as predictors, only average cosine similarity remains a significant predictor of prevalence estimates, suggesting that people are integrating over all training features in the condition, rather than just attending to the most similar training feature.
Overall
We can look at prevalence estimates overall. If the heterogeneous condition leads to the highest overall coherence, we should see the highest prevalence estimates in that condition overall. However, that’s not what we find.
# condition only
glmm_condition <-
glmmTMB(prevalence ~ condition + (1|participant) + (1|test_feature_type),
data = data_tidy,
family = beta_family(link = "logit"))
glmm_condition %>%
Anova()
## Analysis of Deviance Table (Type II Wald chisquare tests)
##
## Response: prevalence
## Chisq Df Pr(>Chisq)
## condition 6.475 3 0.09065 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
glmm_condition %>%
emmeans("condition") %>%
contrast(method = "pairwise") %>%
summary(adjust = "FDR")
## contrast estimate SE df z.ratio p.value
## physical - diet 0.2907 0.121 Inf 2.404 0.0973
## physical - personality 0.1925 0.121 Inf 1.588 0.2244
## physical - heterogeneous 0.2310 0.122 Inf 1.901 0.1719
## diet - personality -0.0982 0.121 Inf -0.810 0.6267
## diet - heterogeneous -0.0596 0.122 Inf -0.491 0.7483
## personality - heterogeneous 0.0385 0.122 Inf 0.316 0.7516
##
## Results are given on the log odds ratio (not the response) scale.
## P value adjustment: fdr method for 6 tests
There is only a marginal effect of condition on prevalence (\(\chi^2\)(3) = 6.48, p = .091). Post-hoc FDR-corrected pairwise comparisons reveal no significant differences between any conditions (all ps > .09).
By test feature, vs model
We can get the model’s predictions and compare those to people’s ratings of prevalence. For now, we get the model’s “kind score” for each test feature, which is a measure of the expected value of the Gaussian function at that location in feature space.
Group characterization
Participants were asked to describe what characterizes Zarpies as a group, with responses coded by Marianna blind to condition.
Eyeballing the plot below, participants in the diet and personality conditions often characterized Zarpies in terms of their diet or personality, seemingly more so than participants in the other conditions.
In the physical condition, participants appeared more likely to describe Zarpies in terms of physical characteristics than participants in the other conditions, but this effect seems less pronounced than in the diet and personality conditions: physical descriptions remained a minority of descriptions in the physical condition (somewhat more when merged with appearance descriptions, but still below a majority).
TBD: analyses of the frequency of these codes