Summary

This version of Study 8 (n = 35/condition * 15 conditions = 525 participants) was an exploratory study assessing how the count versus the proportion of generics relative to specifics affects inductive potential.

Unlike the previous pilot, we find an effect of proportion of generics to specifics on adults’ judgments of the prevalence of novel features of a social group. This effect is stronger than the effect of the raw number of generics heard, which is a significant predictor on its own, but not after being entered into the same model as proportion (see Primary results).

Notably, proportion continued to be a stronger predictor than raw number of generics even after subsetting the study to the extreme conditions, where proportion (0% or 100%) was a coarser predictor than number of generics (0, or 4/8/12/16) (see Extreme conditions only).

Nonetheless, the proportionality effect became marginal after controlling for which particular training features were seen (see Training features), and was no longer significant after subsetting the study to the mixed conditions, which are more naturalistic and where participants might be particularly sensitive to alternative utterances the speaker could make (see Mixed conditions only).

Methods

Participants

Data were collected from 524 adults via Prolific on Monday 3/10/2025, leaving n = 32-35 per condition after exclusions (see table below). Participants were required to be in the United States, be fluent in English, and not have participated in prior studies under this protocol. Participants were paid $1.63 for an estimated 5-8 minute task. Participants were asked to participate via desktop, after reported iOS issues on the last task.

num_generics	total_utt	n
0	0	35
0	4	32
0	8	32
0	12	33
0	16	35
4	4	33
4	8	35
4	12	34
4	16	34
8	8	35
8	12	34
8	16	33
12	12	35
12	16	32
16	16	35

Exclusion criteria

We recruited 525 participants, of whom 17 participants (3.2% of all participants) were excluded for meeting at least 1 of the following exclusion criteria:

failing the attention check (i.e., did not select 100% on slider when asked to during induction task) (n = 5 participants)
admitting to use of AI after being explicitly informed use was prohibited (n = 3 participants)
failing the task check (n = 9 participants)

Participants who failed the sound check were included, since a few participants mentioned technical difficulties with Qualtrics automatically progressing past that video.

Demographics

variable	mean	sd	n
age	42.45	13.69	507

The sample skewed young in age.

gender	n	prop
Female	266	52.5%
Male	232	45.8%
Non-binary	6	1.2%
Prefer not to specify	3	0.6%

The sample reflected the diversity of the gender identities in the US.

race	n	prop
White, Caucasian, or European American	359	70.8%
Black or African American	62	12.2%
East Asian	17	3.4%
Hispanic or Latino/a	17	3.4%
South or Southeast Asian	11	2.2%
White, Caucasian, or European American,Hispanic or Latino/a	11	2.2%
Prefer not to specify	6	1.2%
White, Caucasian, or European American,Native American, American Indian, or Alaska Native	4	0.8%
Middle Eastern or North African	3	0.6%
White, Caucasian, or European American,Middle Eastern or North African	3	0.6%
White, Caucasian, or European American,Black or African American	2	0.4%
Black or African American,Prefer not to specify	1	0.2%
Black or African American,South or Southeast Asian	1	0.2%
Hispanic or Latino/a,Black or African American	1	0.2%
Indigenous Black American	1	0.2%
Native American, American Indian, or Alaska Native	1	0.2%
White, Caucasian, or European American,Hispanic or Latino/a,Black or African American,Native American, American Indian, or Alaska Native	1	0.2%
White, Caucasian, or European American,Hispanic or Latino/a,East Asian	1	0.2%
White, Caucasian, or European American,Hispanic or Latino/a,Native American, American Indian, or Alaska Native	1	0.2%
White, Caucasian, or European American,South or Southeast Asian	1	0.2%
jewish	1	0.2%
mixed	1	0.2%
mixed race.	1	0.2%

The sample was also racially diverse, with White Americans slightly overrepresented and Hispanic Americans underrepresented.

education	n	prop
Less than high school	3	0.6%
High school/GED	74	14.6%
Some college	131	25.8%
Bachelor's (B.A., B.S.)	199	39.3%
Master's (M.A., M.S.)	79	15.6%
Doctoral (Ph.D., J.D., M.D.)	16	3.2%
Prefer not to specify	5	1.0%

A majority of the sample (about 58%) had completed a bachelor's degree or higher.

Procedure

This study was administered as a Qualtrics survey, and approved by the NYU IRB (IRB-FY2023-6812).

After providing their consent, participants completed a captcha, pledge not to use AI, and sound check. Participants then completed:

Training phase: participants heard some number of generic statements and specific statements, based on condition. Which features were mentioned was randomized, as was statement order.
Test phase (induction task): participants completed an induction task where they imagined seeing a Zarpie with a novel feature, and estimated the prevalence of that feature among Zarpies using a slider from 0 to 100 (initialized at 0). All participants completed the same 16 trials, with order of trials randomized.

Participants then answered a few task completion questions, reported demographics, and were debriefed.

Data processing

Prevalence judgments were converted to a scale from 0 to 1, with values of 0 and 1 trimmed to 0.01 and 0.99 to support a beta regression, since the beta distribution is defined on the open interval (0, 1) and excludes the endpoints 0 and 1.
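The rescaling step described above can be sketched in base R as follows. This is a minimal illustration, not the processing code actually used: the function name `rescale_prevalence` and its arguments are hypothetical, and it assumes raw slider responses on a 0 to 100 scale.

```r
# Hypothetical helper (not from the original pipeline): rescale 0-100 slider
# responses to proportions, then move exact 0s and 1s inside the open interval.
rescale_prevalence <- function(slider, lower = 0.01, upper = 0.99) {
  p <- slider / 100
  p[p == 0] <- lower  # beta regression cannot handle exact 0
  p[p == 1] <- upper  # ... or exact 1
  p
}

rescaled <- rescale_prevalence(c(0, 37, 50, 100))
rescaled
```

Only responses at the slider endpoints are altered; all interior values pass through unchanged.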

Participant feedback

The most frequent participant complaint concerned audio (n = 4). One participant reported “The audio was horrible. It scared my cat.”; another reported “you might want to equalize the volume between the videos and the questions” (I had noticed equalization issues but was unable to fix them in HTML; I will fix them later in the video itself).

One person had trouble with the attention check (“I couldn’t see where to move the slider for the attention check, so I just chose 50%”).

One person complained that they were unable to copy the consent form (this issue was a result of anti-AI study-wide CSS and will be changed to question-specific CSS in the future).

When asked to guess what the study was about, many participants reported that it was about judging other people by their characteristics.

Primary results

Induction task

Plots

[Figures not reproduced here: summary, density, histogram, and cumulative plots of prevalence judgments.]

Analyses

The following beta regressions predict prevalence with random intercepts per participant and per test feature.

There is a main effect of condition ($\chi^2$(14) = 43.69, p < .001) on the inferred prevalence of novel features.
As lone predictors, the raw number of generics (z = 5.34, p < .001), the raw number of specifics (z = -4.25, p < .001), and the proportion of generics to specifics (z = 5.68, p < .001) each predicted the inferred prevalence of novel features. Models that include proportion are fit to 472 rather than 507 participants, presumably because proportion is undefined in the baseline (0/0) condition.
When entered into the same model, proportion of generics to specifics predicted prevalence judgments (z = 2.36, p = .018), while raw number of generics did not (z = 0.99, p = .32). (This is the reverse of the pattern from the pilot.)
When entered into the same model, raw number of generics predicted prevalence judgments (z = 3.69, p < .001), while raw number of specifics was marginal (z = -1.86, p = .063).
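To make the three predictors concrete, the sketch below rebuilds the 15-condition design from the condition table and derives each predictor. It assumes that `num_specifics` is `total_utt - num_generics` and that `prop_generics` is `num_generics / total_utt`; these definitions are inferred from the condition labels rather than taken from the original processing code.

```r
# Rebuild the design: generics and total utterances each in {0, 4, 8, 12, 16},
# keeping only cells where generics do not exceed total utterances.
design <- expand.grid(num_generics = seq(0, 16, 4), total_utt = seq(0, 16, 4))
design <- design[design$num_generics <= design$total_utt, ]

# Assumed definitions of the derived predictors.
design$num_specifics <- design$total_utt - design$num_generics
design$prop_generics <- design$num_generics / design$total_utt  # 0/0 is NaN

n_conditions <- nrow(design)
n_undefined  <- sum(is.nan(design$prop_generics))
design
```

Under these assumptions the baseline (0/0) cell is the only one with an undefined proportion, which would account for the drop from 507 to 472 participants in models that include `prop_generics`.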

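library(glmmTMB)  # beta mixed models (assumed loaded for all analyses below)
library(car)      # Anova()
library(dplyr)    # %>% pipe
library(emmeans)  # pairwise contrasts

# condition
glmmTMB(prevalence ~ condition + (1|participant) + (1|test_feature), 
        data = data_tidy, 
        family = beta_family(link = "logit")) %>% 
  Anova()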

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: prevalence
##            Chisq Df Pr(>Chisq)    
## condition 43.691 14 0.00006638 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# raw number of generics
glmmTMB(prevalence ~ num_generics + (1|participant) + (1|test_feature),
        data = data_tidy, 
        family = beta_family(link = "logit")) %>% 
  summary()

##  Family: beta  ( logit )
## Formula:          
## prevalence ~ num_generics + (1 | participant) + (1 | test_feature)
## Data: data_tidy
## 
##      AIC      BIC   logLik deviance df.resid 
##  -5584.9  -5549.9   2797.5  -5594.9     8107 
## 
## Random effects:
## 
## Conditional model:
##  Groups       Name        Variance Std.Dev.
##  participant  (Intercept) 0.6759   0.8221  
##  test_feature (Intercept) 0.1845   0.4295  
## Number of obs: 8112, groups:  participant, 507; test_feature, 16
## 
## Dispersion parameter for beta family (): 2.79 
## 
## Conditional model:
##               Estimate Std. Error z value     Pr(>|z|)    
## (Intercept)  -0.655056   0.121219  -5.404 0.0000000652 ***
## num_generics  0.040818   0.007648   5.337 0.0000000945 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
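
Because the coefficients are on the logit scale, it can help to convert them back to predicted prevalence. The sketch below plugs the fixed-effect estimates from the num_generics-only summary into the inverse logit. It sets both random effects to zero, so the values describe a typical participant and test feature rather than a population average.

```r
# Fixed-effect estimates copied from the num_generics-only model summary.
intercept <- -0.655056
slope     <-  0.040818

# Inverse logit (plogis) maps the linear predictor back to the 0-1 scale.
num_generics <- c(0, 4, 8, 12, 16)
predicted <- plogis(intercept + slope * num_generics)
round(predicted, 3)
```

On this reading, predicted prevalence rises from roughly 34% with no generics to roughly 50% with 16 generics.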

# raw number of specifics
glmmTMB(prevalence ~ num_specifics + (1|participant) + (1|test_feature),
        data = data_tidy, 
        family = beta_family(link = "logit")) %>% 
  summary()

##  Family: beta  ( logit )
## Formula:          
## prevalence ~ num_specifics + (1 | participant) + (1 | test_feature)
## Data: data_tidy
## 
##      AIC      BIC   logLik deviance df.resid 
##  -5575.0  -5540.0   2792.5  -5585.0     8107 
## 
## Random effects:
## 
## Conditional model:
##  Groups       Name        Variance Std.Dev.
##  participant  (Intercept) 0.6905   0.8310  
##  test_feature (Intercept) 0.1846   0.4296  
## Number of obs: 8112, groups:  participant, 507; test_feature, 16
## 
## Dispersion parameter for beta family (): 2.79 
## 
## Conditional model:
##                Estimate Std. Error z value  Pr(>|z|)    
## (Intercept)   -0.262500   0.121168  -2.166    0.0303 *  
## num_specifics -0.032664   0.007682  -4.252 0.0000212 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# proportion of generics
glmmTMB(prevalence ~ prop_generics + (1|participant) + (1|test_feature),
        data = data_tidy, 
        family = beta_family(link = "logit")) %>% 
  summary()

##  Family: beta  ( logit )
## Formula:          
## prevalence ~ prop_generics + (1 | participant) + (1 | test_feature)
## Data: data_tidy
## 
##      AIC      BIC   logLik deviance df.resid 
##  -5190.5  -5155.9   2600.3  -5200.5     7547 
## 
## Random effects:
## 
## Conditional model:
##  Groups       Name        Variance Std.Dev.
##  participant  (Intercept) 0.6665   0.8164  
##  test_feature (Intercept) 0.1948   0.4414  
## Number of obs: 7552, groups:  participant, 472; test_feature, 16
## 
## Dispersion parameter for beta family (): 2.78 
## 
## Conditional model:
##               Estimate Std. Error z value     Pr(>|z|)    
## (Intercept)   -0.70981    0.12763  -5.561 0.0000000268 ***
## prop_generics  0.56747    0.09983   5.684 0.0000000131 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# raw number vs proportion of generics
glmmTMB(prevalence ~ num_generics + prop_generics + (1|participant) + (1|test_feature),
        data = data_tidy, 
        family = beta_family(link = "logit")) %>% 
  summary()

##  Family: beta  ( logit )
## Formula:          
## prevalence ~ num_generics + prop_generics + (1 | participant) +  
##     (1 | test_feature)
## Data: data_tidy
## 
##      AIC      BIC   logLik deviance df.resid 
##  -5189.5  -5147.9   2600.7  -5201.5     7546 
## 
## Random effects:
## 
## Conditional model:
##  Groups       Name        Variance Std.Dev.
##  participant  (Intercept) 0.6650   0.8155  
##  test_feature (Intercept) 0.1948   0.4414  
## Number of obs: 7552, groups:  participant, 472; test_feature, 16
## 
## Dispersion parameter for beta family (): 2.78 
## 
## Conditional model:
##               Estimate Std. Error z value     Pr(>|z|)    
## (Intercept)   -0.71688    0.12780  -5.610 0.0000000203 ***
## num_generics   0.01403    0.01423   0.986       0.3240    
## prop_generics  0.42121    0.17870   2.357       0.0184 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# raw number of generics vs raw number of specifics
glmmTMB(prevalence ~ num_generics + num_specifics + (1|participant) + (1|test_feature),
        data = data_tidy, 
        family = beta_family(link = "logit")) %>% 
  summary()

##  Family: beta  ( logit )
## Formula:          
## prevalence ~ num_generics + num_specifics + (1 | participant) +  
##     (1 | test_feature)
## Data: data_tidy
## 
##      AIC      BIC   logLik deviance df.resid 
##  -5586.4  -5544.4   2799.2  -5598.4     8106 
## 
## Random effects:
## 
## Conditional model:
##  Groups       Name        Variance Std.Dev.
##  participant  (Intercept) 0.6709   0.8191  
##  test_feature (Intercept) 0.1845   0.4295  
## Number of obs: 8112, groups:  participant, 507; test_feature, 16
## 
## Dispersion parameter for beta family (): 2.79 
## 
## Conditional model:
##                Estimate Std. Error z value Pr(>|z|)    
## (Intercept)   -0.523927   0.140100  -3.740 0.000184 ***
## num_generics   0.032533   0.008825   3.687 0.000227 ***
## num_specifics -0.016341   0.008778  -1.862 0.062664 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Secondary results

Straight-lining

Induction task plots suggest a lot of anchoring around the 50% marker, and debriefing suggests many participants thought it was a strange task since the test features were odd/bizarre. Are these just from participants who were straightlining through all the test items?

3 out of 507 included participants (0.59%) answered 50% to all test questions.
5 out of 507 included participants (0.99%) answered 48-52% to all test questions, a looser criterion.
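
A straight-lining check of this kind can be sketched as below. The data frame, participant IDs, and ratings are made up for illustration; only the two criteria (exactly 50% on every trial, or 48-52% on every trial) come from the text above.

```r
# Toy long-format data: one row per participant per trial (values are invented).
ratings <- data.frame(
  participant = rep(c("p1", "p2", "p3"), each = 4),
  rating      = c(50, 50, 50, 50,   48, 51, 52, 50,   10, 50, 90, 50)
)

# TRUE if every rating from a participant falls within [lo, hi].
is_straightliner <- function(x, lo, hi) all(x >= lo & x <= hi)

strict <- tapply(ratings$rating, ratings$participant, is_straightliner, lo = 50, hi = 50)
loose  <- tapply(ratings$rating, ratings$participant, is_straightliner, lo = 48, hi = 52)

n_strict <- sum(strict)  # participants answering exactly 50 on every trial
n_loose  <- sum(loose)   # participants answering 48-52 on every trial
```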

Since there were only a few participants who consistently straightlined, these participants were not excluded from analyses.

Pairwise condition comparisons

We can make pairwise comparisons between conditions using two-sample Kolmogorov–Smirnov tests, with Bonferroni correction for number of tests run (105). The Kolmogorov–Smirnov test compares the cumulative distributions of two samples and returns a statistic D that reflects the maximum difference between the two distributions, as well as a p-value for the test.
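
The procedure can be sketched with base R's `ks.test` and `p.adjust`. The ratings below are simulated for three hypothetical conditions, not drawn from the study data, and the sketch treats each rating as an independent draw, which is an assumption.

```r
set.seed(1)
# Simulated prevalence ratings for three hypothetical conditions.
sim <- list(
  "0/16"     = rbeta(200, 2, 4),
  "baseline" = rbeta(200, 2, 3),
  "16/16"    = rbeta(200, 4, 2)
)

# Run a two-sample KS test for every pair of conditions.
pairs <- combn(names(sim), 2, simplify = FALSE)
results <- do.call(rbind, lapply(pairs, function(pr) {
  test <- ks.test(sim[[pr[1]]], sim[[pr[2]]])
  data.frame(cond_a = pr[1], cond_b = pr[2],
             D = unname(test$statistic), p = test$p.value)
}))

# Bonferroni: multiply each p-value by the number of tests, capped at 1.
results$p_bonf <- p.adjust(results$p, method = "bonferroni")
results

n_tests_full_design <- choose(15, 2)  # pairwise tests among 15 conditions
```

With 15 conditions, `choose(15, 2)` gives the 105 tests used for the correction.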

One quick and dirty way to think about these results is to look at how often pairs of conditions with the same (boxed) versus different (not in box) number of generics are significantly different from each other, and to do the same for pairs of conditions with same (boxed) versus different (not in box) proportions of generics.

Note that far more pairwise comparisons involve different numbers or proportions of generics than the same number or proportion, so this is a lopsided comparison.

If the number of generics matters, we would expect the distributions of prevalence ratings to rarely differ between pairs of conditions with the same number of generics, and to differ much more often between pairs with different numbers of generics.

If the proportion of generics matters, we would expect the distributions of prevalence ratings to rarely differ between pairs of conditions with the same proportion of generics, and to differ much more often between pairs with different proportions of generics.

same_num_generics	sig_corr_tests	total_tests	prop_sig_corr
FALSE	46	85	54.1%
TRUE	4	20	20.0%

same_prop_generics	sig_corr_tests	total_tests	prop_sig_corr
FALSE	44	78	56.4%
TRUE	0	13	0.0%
NA	6	14	42.9%

Study 6 conditions (replication)

Three of the conditions (baseline, 0/16, 16/16) in this study (n = 35/condition) replicate the conditions (baseline, specific, generic) of Study 6 (n = 90-99/condition). In Study 6, prevalence was rated highest in the generic condition, intermediate in the baseline condition, and lowest in the specific condition.

## # A tibble: 3 × 2
##   condition count
##   <fct>     <int>
## 1 0/16         35
## 2 baseline     35
## 3 16/16        35

After subsetting to the Study 6 conditions, we replicate the main effect of condition ($\chi^2$(2) = 12.61, p = .0018). Participants in the 16/16 condition gave marginally higher prevalence judgments than those in the baseline condition (z = 2.19, p = .085), and significantly higher judgments than those in the 0/16 condition (z = 3.52, p = .0013).

Unlike in Study 6, the baseline and 0/16 conditions were not statistically different from each other (z = 1.33, p = .56).

# same analysis as study 6
model <- 
  glmmTMB(prevalence ~ condition + (1|participant) + (1|test_feature),
          data = data_study_6, 
          family = beta_family(link = "logit")) 

model %>% 
  Anova()

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: prevalence
##            Chisq Df Pr(>Chisq)   
## condition 12.612  2   0.001825 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

model %>% 
  emmeans("condition") %>% 
  pairs(adjust = "bonferroni") %>% 
  summary()

##  contrast           estimate    SE  df z.ratio p.value
##  (16/16) - baseline    0.435 0.199 Inf   2.191  0.0853
##  (16/16) - (0/16)      0.698 0.199 Inf   3.516  0.0013
##  baseline - (0/16)     0.263 0.199 Inf   1.326  0.5550
## 
## Results are given on the log odds ratio (not the response) scale. 
## P value adjustment: bonferroni method for 3 tests

Extreme conditions only

This analysis subsets to the extreme conditions, where the proportion of generics was 0% or 100% (0/4, 0/8, 0/12, 0/16, 4/4, 8/8, 12/12, 16/16).

## # A tibble: 8 × 2
##   condition count
##   <chr>     <int>
## 1 8/8          35
## 2 4/4          33
## 3 16/16        35
## 4 12/12        35
## 5 0/8          32
## 6 0/4          32
## 7 0/16         35
## 8 0/12         33

When each entered on their own as lone predictors, the raw number of generics (z = 5.33, p < .001), raw number of specifics (z = -5.13, p < .001), and the proportion of generics to specifics (z = 5.71, p < .001) were each significant in predicting prevalence judgments.

When pitted against each other in the same model, the proportion of generics to specifics remains a significant predictor of prevalence judgments (z = 2.21, p = .027), while the raw number of generics does not (z = 1.00, p = .32).

# raw number of generics
glmmTMB(prevalence ~ num_generics + (1|participant) + (1|test_feature),
        data = data_extreme_only, 
        family = beta_family(link = "logit")) %>% 
  summary()

##  Family: beta  ( logit )
## Formula:          
## prevalence ~ num_generics + (1 | participant) + (1 | test_feature)
## Data: data_extreme_only
## 
##      AIC      BIC   logLik deviance df.resid 
##  -2966.6  -2934.8   1488.3  -2976.6     4315 
## 
## Random effects:
## 
## Conditional model:
##  Groups       Name        Variance Std.Dev.
##  participant  (Intercept) 0.6360   0.7975  
##  test_feature (Intercept) 0.1861   0.4314  
## Number of obs: 4320, groups:  participant, 270; test_feature, 16
## 
## Dispersion parameter for beta family (): 2.68 
## 
## Conditional model:
##               Estimate Std. Error z value    Pr(>|z|)    
## (Intercept)  -0.665090   0.127314  -5.224 0.000000175 ***
## num_generics  0.045591   0.008563   5.325 0.000000101 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# raw number of specifics
glmmTMB(prevalence ~ num_specifics + (1|participant) + (1|test_feature),
        data = data_extreme_only, 
        family = beta_family(link = "logit")) %>% 
  summary()

##  Family: beta  ( logit )
## Formula:          
## prevalence ~ num_specifics + (1 | participant) + (1 | test_feature)
## Data: data_extreme_only
## 
##      AIC      BIC   logLik deviance df.resid 
##  -2964.8  -2932.9   1487.4  -2974.8     4315 
## 
## Random effects:
## 
## Conditional model:
##  Groups       Name        Variance Std.Dev.
##  participant  (Intercept) 0.6407   0.8005  
##  test_feature (Intercept) 0.1862   0.4315  
## Number of obs: 4320, groups:  participant, 270; test_feature, 16
## 
## Dispersion parameter for beta family (): 2.68 
## 
## Conditional model:
##                Estimate Std. Error z value    Pr(>|z|)    
## (Intercept)   -0.211682   0.126739  -1.670      0.0949 .  
## num_specifics -0.044006   0.008584  -5.127 0.000000295 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# proportion of generics
glmmTMB(prevalence ~ prop_generics + (1|participant) + (1|test_feature),
        data = data_extreme_only, 
        family = beta_family(link = "logit")) %>% 
  summary()

##  Family: beta  ( logit )
## Formula:          
## prevalence ~ prop_generics + (1 | participant) + (1 | test_feature)
## Data: data_extreme_only
## 
##      AIC      BIC   logLik deviance df.resid 
##  -2970.5  -2938.6   1490.2  -2980.5     4315 
## 
## Random effects:
## 
## Conditional model:
##  Groups       Name        Variance Std.Dev.
##  participant  (Intercept) 0.6260   0.7912  
##  test_feature (Intercept) 0.1861   0.4314  
## Number of obs: 4320, groups:  participant, 270; test_feature, 16
## 
## Dispersion parameter for beta family (): 2.68 
## 
## Conditional model:
##               Estimate Std. Error z value     Pr(>|z|)    
## (Intercept)    -0.7261     0.1301  -5.583 0.0000000237 ***
## prop_generics   0.5792     0.1014   5.712 0.0000000112 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# raw number vs proportion of generics
glmmTMB(prevalence ~ num_generics + prop_generics + (1|participant) + (1|test_feature),
        data = data_extreme_only, 
        family = beta_family(link = "logit")) %>% 
  summary()

##  Family: beta  ( logit )
## Formula:          
## prevalence ~ num_generics + prop_generics + (1 | participant) +  
##     (1 | test_feature)
## Data: data_extreme_only
## 
##      AIC      BIC   logLik deviance df.resid 
##  -2969.5  -2931.3   1490.7  -2981.5     4314 
## 
## Random effects:
## 
## Conditional model:
##  Groups       Name        Variance Std.Dev.
##  participant  (Intercept) 0.6235   0.7896  
##  test_feature (Intercept) 0.1861   0.4314  
## Number of obs: 4320, groups:  participant, 270; test_feature, 16
## 
## Dispersion parameter for beta family (): 2.68 
## 
## Conditional model:
##               Estimate Std. Error z value     Pr(>|z|)    
## (Intercept)   -0.72614    0.12999  -5.586 0.0000000232 ***
## num_generics   0.01585    0.01589   0.997        0.319    
## prop_generics  0.41931    0.18959   2.212        0.027 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Mixed conditions only

Mixed conditions (4/8, 4/12, 4/16, 8/12, 8/16, 12/16) have heightened contrast between generic and specific statements, since participants see both types of statements.

## # A tibble: 6 × 2
##   condition count
##   <chr>     <int>
## 1 8/16         33
## 2 8/12         34
## 3 4/8          35
## 4 4/16         34
## 5 4/12         34
## 6 12/16        32

After subsetting the sample to just the mixed conditions (n = 202 total), none of the raw number of generics, the raw number of specifics, or the proportion of generics to specifics significantly predicted prevalence judgments on its own (ps > .30).

# raw number of generics
glmmTMB(prevalence ~ num_generics + (1|participant) + (1|test_feature),
        data = data_mixed_only, 
        family = beta_family(link = "logit")) %>% 
  summary()

##  Family: beta  ( logit )
## Formula:          
## prevalence ~ num_generics + (1 | participant) + (1 | test_feature)
## Data: data_mixed_only
## 
##      AIC      BIC   logLik deviance df.resid 
##  -2163.1  -2132.7   1086.5  -2173.1     3227 
## 
## Random effects:
## 
## Conditional model:
##  Groups       Name        Variance Std.Dev.
##  participant  (Intercept) 0.7237   0.8507  
##  test_feature (Intercept) 0.2077   0.4557  
## Number of obs: 3232, groups:  participant, 202; test_feature, 16
## 
## Dispersion parameter for beta family (): 2.92 
## 
## Conditional model:
##              Estimate Std. Error z value Pr(>|z|)   
## (Intercept)  -0.55505    0.19065  -2.911   0.0036 **
## num_generics  0.02149    0.02115   1.016   0.3096   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# raw number of specifics
glmmTMB(prevalence ~ num_specifics + (1|participant) + (1|test_feature),
        data = data_mixed_only,
        family = beta_family(link = "logit")) %>% 
  summary()

##  Family: beta  ( logit )
## Formula:          
## prevalence ~ num_specifics + (1 | participant) + (1 | test_feature)
## Data: data_mixed_only
## 
##      AIC      BIC   logLik deviance df.resid 
##  -2162.6  -2132.2   1086.3  -2172.6     3227 
## 
## Random effects:
## 
## Conditional model:
##  Groups       Name        Variance Std.Dev.
##  participant  (Intercept) 0.7257   0.8519  
##  test_feature (Intercept) 0.2077   0.4557  
## Number of obs: 3232, groups:  participant, 202; test_feature, 16
## 
## Dispersion parameter for beta family (): 2.92 
## 
## Conditional model:
##               Estimate Std. Error z value Pr(>|z|)
## (Intercept)   -0.31360    0.19074  -1.644    0.100
## num_specifics -0.01495    0.02093  -0.714    0.475

# proportion of generics
glmmTMB(prevalence ~ prop_generics + (1|participant) + (1|test_feature),
        data = data_mixed_only,
        family = beta_family(link = "logit")) %>% 
  summary()

##  Family: beta  ( logit )
## Formula:          
## prevalence ~ prop_generics + (1 | participant) + (1 | test_feature)
## Data: data_mixed_only
## 
##      AIC      BIC   logLik deviance df.resid 
##  -2163.0  -2132.6   1086.5  -2173.0     3227 
## 
## Random effects:
## 
## Conditional model:
##  Groups       Name        Variance Std.Dev.
##  participant  (Intercept) 0.7238   0.8508  
##  test_feature (Intercept) 0.2077   0.4557  
## Number of obs: 3232, groups:  participant, 202; test_feature, 16
## 
## Dispersion parameter for beta family (): 2.92 
## 
## Conditional model:
##               Estimate Std. Error z value Pr(>|z|)   
## (Intercept)    -0.5938     0.2221  -2.673  0.00752 **
## prop_generics   0.3626     0.3620   1.002  0.31651   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# raw number vs proportion of generics (including their interaction)
glmmTMB(prevalence ~ num_generics * prop_generics + (1|participant) + (1|test_feature),
        data = data_mixed_only,
        family = beta_family(link = "logit")) %>% 
  summary()

##  Family: beta  ( logit )
## Formula:          
## prevalence ~ num_generics * prop_generics + (1 | participant) +  
##     (1 | test_feature)
## Data: data_mixed_only
## 
##      AIC      BIC   logLik deviance df.resid 
##  -2159.1  -2116.6   1086.6  -2173.1     3225 
## 
## Random effects:
## 
## Conditional model:
##  Groups       Name        Variance Std.Dev.
##  participant  (Intercept) 0.7234   0.8505  
##  test_feature (Intercept) 0.2077   0.4557  
## Number of obs: 3232, groups:  participant, 202; test_feature, 16
## 
## Dispersion parameter for beta family (): 2.92 
## 
## Conditional model:
##                             Estimate Std. Error z value Pr(>|z|)
## (Intercept)                -0.549962   0.603784  -0.911    0.362
## num_generics                0.004829   0.129493   0.037    0.970
## prop_generics               0.125824   1.097975   0.115    0.909
## num_generics:prop_generics  0.011343   0.180743   0.063    0.950

# raw number of generics vs raw number of specifics
glmmTMB(prevalence ~ num_generics + num_specifics + (1|participant) + (1|test_feature),
        data = data_mixed_only,
        family = beta_family(link = "logit")) %>%  
  summary()

##  Family: beta  ( logit )
## Formula:          
## prevalence ~ num_generics + num_specifics + (1 | participant) +  
##     (1 | test_feature)
## Data: data_mixed_only
## 
##      AIC      BIC   logLik deviance df.resid 
##  -2161.1  -2124.7   1086.6  -2173.1     3226 
## 
## Random effects:
## 
## Conditional model:
##  Groups       Name        Variance Std.Dev.
##  participant  (Intercept) 0.7234   0.8506  
##  test_feature (Intercept) 0.2077   0.4557  
## Number of obs: 3232, groups:  participant, 202; test_feature, 16
## 
## Dispersion parameter for beta family (): 2.92 
## 
## Conditional model:
##                Estimate Std. Error z value Pr(>|z|)
## (Intercept)   -0.495465   0.304990  -1.624    0.104
## num_generics   0.018517   0.024244   0.764    0.445
## num_specifics -0.005996   0.023961  -0.250    0.802

Training features

Participants saw some subset of 16 training features - anywhere from 0 to 16 of the full list of 16 - with subsets randomly selected.

Theoretically, the particular training features received might have an effect on prevalence judgments. Of note, the conditions with fewer total training utterances contain substantial variability in which training features were seen, while the conditions with more total training utterances have less variability (with all participants in the 16-utterance conditions seeing the same features).
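
One way to represent which training features each participant saw is a set of 0/1 indicator columns, one per feature. The sketch below is illustrative only: the `seen` list and participant IDs are hypothetical, and only three of the 16 features are shown. The `trained_` prefix mirrors the column names that appear in the model output further down.

```r
# Hypothetical input: the training features each participant heard.
seen <- list(
  p1 = c("eat_flowers", "like_sing"),
  p2 = c("like_sing"),
  p3 = character(0)  # e.g. a baseline participant with no training utterances
)
features <- c("eat_flowers", "like_sing", "sleep_trees")

# One row per participant, one 0/1 column per training feature.
indicators <- t(sapply(seen, function(f) as.integer(features %in% f)))
colnames(indicators) <- paste0("trained_", features)
indicators
```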

Random effects to account for:

different test features may have different prevalences -> random intercepts per test feature
different participants may rate prevalence differently -> random intercepts per participant
total number of utterances heard may correlate with between-participant variability in prevalence ratings, since conditions with fewer total utterances had more between-participant variability in which training features were presented -> random slopes of __ per total number of utterances
different training features may increase or decrease the overall prevalence

Are the effects above robust after accounting for additional variance in the training structure?

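In a beta regression with random slopes of number and proportion of generics by total utterances heard, in addition to random intercepts per participant and test feature, the proportion of generics to specifics was marginal in predicting prevalence judgments (z = 1.88, p = .061), while the number of generics was not significant (z = 0.94, p = .35). The total_utt variance components are near zero with correlations of 1.00, and AIC, BIC, and log-likelihood are reported as NA, which suggests a singular or poorly converged fit, so these estimates should be read with caution.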

##  Family: beta  ( logit )
## Formula:          
## prevalence ~ num_generics + prop_generics + (num_generics | total_utt) +  
##     (prop_generics | total_utt) + (1 | participant) + (1 | test_feature)
## Data: data_tidy_dummy
## 
##      AIC      BIC   logLik deviance df.resid 
##       NA       NA       NA       NA     7588 
## 
## Random effects:
## 
## Conditional model:
##  Groups       Name          Variance  Std.Dev.  Corr
##  total_utt    (Intercept)   5.708e-03 7.555e-02
##               num_generics  3.052e-06 1.747e-03 1.00
##  total_utt.1  (Intercept)   7.931e-42 2.816e-21
##               prop_generics 6.806e-36 2.609e-18 1.00
##  participant  (Intercept)   6.545e-01 8.090e-01
##  test_feature (Intercept)   1.987e-01 4.457e-01
## Number of obs: 7600, groups:  total_utt, 4; participant, 472; test_feature, 16
## 
## Dispersion parameter for beta family (): 2.76 
## 
## Conditional model:
##               Estimate Std. Error z value    Pr(>|z|)    
## (Intercept)   -0.71438    0.13415  -5.325 0.000000101 ***
## num_generics   0.01678    0.01786   0.939      0.3476    
## prop_generics  0.39146    0.20872   1.876      0.0607 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

With 16 random intercepts for whether each of the training features was seen or not, the proportion of generics to specifics is marginal in predicting prevalence judgments (z = 1.81, p = .070) and the raw number of generics is not significant (z = 1.19, p = .23).

##  Family: beta  ( logit )
## Formula:          
## prevalence ~ num_generics + prop_generics + (1 + num_generics +  
##     prop_generics || participant) + (1 + num_generics + prop_generics ||  
##     test_feature) + (1 | trained_babies_blankets) + (1 | trained_bounce_ball_head) +  
##     (1 | trained_can_flip_air) + (1 | trained_chase_shadows) +  
##     (1 | trained_climb_fences) + (1 | trained_dont_like_icecream) +  
##     (1 | trained_dont_like_mud) + (1 | trained_draw_stars_knees) +  
##     (1 | trained_eat_flowers) + (1 | trained_flap_arms_happy) +  
##     (1 | trained_freckles_feet) + (1 | trained_hop_puddles) +  
##     (1 | trained_like_sing) + (1 | trained_scared_ladybugs) +  
##     (1 | trained_sleep_trees) + (1 | trained_stripes_hair)
## Data: data_tidy_dummy
## 
##      AIC      BIC   logLik deviance df.resid 
##       NA       NA       NA       NA     7574 
## 
## Random effects:
## 
## Conditional model:
##  Groups                     Name          Variance      Std.Dev.  Corr      
##  participant                (Intercept)   0.53384904752 0.7306497           
##                             num_generics  0.00182334595 0.0427007 0.00      
##                             prop_generics 0.06582384614 0.2565616 0.00 0.00 
##  test_feature               (Intercept)   0.17239813713 0.4152085           
##                             num_generics  0.00020475765 0.0143094 0.00      
##                             prop_generics 0.04975536048 0.2230591 0.00 0.00 
##  trained_babies_blankets    (Intercept)   0.00000001462 0.0001209           
##  trained_bounce_ball_head   (Intercept)   0.00000012714 0.0003566           
##  trained_can_flip_air       (Intercept)   0.00000001751 0.0001323           
##  trained_chase_shadows      (Intercept)   0.00000040417 0.0006357           
##  trained_climb_fences       (Intercept)   0.00000001393 0.0001180           
##  trained_dont_like_icecream (Intercept)   0.00000004709 0.0002170           
##  trained_dont_like_mud      (Intercept)   0.00000005112 0.0002261           
##  trained_draw_stars_knees   (Intercept)   0.00000003362 0.0001834           
##  trained_eat_flowers        (Intercept)   0.00000034332 0.0005859           
##  trained_flap_arms_happy    (Intercept)   0.00000001857 0.0001363           
##  trained_freckles_feet      (Intercept)   0.00000001538 0.0001240           
##  trained_hop_puddles        (Intercept)   0.00000001990 0.0001411           
##  trained_like_sing          (Intercept)   0.00353389795 0.0594466           
##  trained_scared_ladybugs    (Intercept)   0.00000001510 0.0001229           
##  trained_sleep_trees        (Intercept)   0.00000001317 0.0001148           
##  trained_stripes_hair       (Intercept)   0.00000457870 0.0021398           
## Number of obs: 7600, groups:  
## participant, 472; test_feature, 16; trained_babies_blankets, 2; trained_bounce_ball_head, 2; trained_can_flip_air, 2; trained_chase_shadows, 2; trained_climb_fences, 2; trained_dont_like_icecream, 2; trained_dont_like_mud, 2; trained_draw_stars_knees, 2; trained_eat_flowers, 2; trained_flap_arms_happy, 2; trained_freckles_feet, 2; trained_hop_puddles, 2; trained_like_sing, 2; trained_scared_ladybugs, 2; trained_sleep_trees, 2; trained_stripes_hair, 2
## 
## Dispersion parameter for beta family (): 2.78 
## 
## Conditional model:
##               Estimate Std. Error z value    Pr(>|z|)    
## (Intercept)   -0.70525    0.12833  -5.495 0.000000039 ***
## num_generics   0.02100    0.01775   1.183      0.2368    
## prop_generics  0.35023    0.20626   1.698      0.0895 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
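
To read these logit-scale estimates on the prevalence scale, a minimal sketch follows. It plugs the fixed-effect estimates printed above into the inverse logit and ignores all random effects. It assumes prop_generics is coded as num_generics / total_utt on a 0 to 1 scale, which the output does not confirm; predict_prevalence and cells are illustrative names, not part of the analysis script.

```r
# Fixed-effect estimates copied from the model output above (logit scale).
b_intercept <- -0.70525
b_num       <-  0.02100
b_prop      <-  0.35023

# Hypothetical helper: predicted mean prevalence for one design cell,
# ignoring random effects (a typical participant and test feature).
# ASSUMPTION: prop_generics = num_generics / total_utt, coded 0 to 1.
predict_prevalence <- function(num_generics, total_utt) {
  prop_generics <- num_generics / total_utt
  eta <- b_intercept + b_num * num_generics + b_prop * prop_generics
  plogis(eta)  # inverse logit maps the linear predictor to (0, 1)
}

# A few cells from the design: all specifics, all generics, and mixed.
cells <- data.frame(num_generics = c(0, 4, 4, 16),
                    total_utt    = c(4, 4, 16, 16))
cells$predicted <- predict_prevalence(cells$num_generics, cells$total_utt)
print(cells)
```

Because the random effects are dropped, these are rough population-level values for comparing cells, not predictions for any individual participant.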

Did any particular training features predict prevalence judgments? No (ps > .13).

##  Family: beta  ( logit )
## Formula:          
## prevalence ~ trained_babies_blankets + trained_bounce_ball_head +  
##     trained_can_flip_air + trained_chase_shadows + trained_climb_fences +  
##     trained_dont_like_icecream + trained_dont_like_mud + trained_draw_stars_knees +  
##     trained_eat_flowers + trained_flap_arms_happy + trained_freckles_feet +  
##     trained_hop_puddles + trained_like_sing + trained_scared_ladybugs +  
##     trained_sleep_trees + trained_stripes_hair + (1 | participant) +  
##     (1 | test_feature)
## Data: data_tidy_dummy
## 
##      AIC      BIC   logLik deviance df.resid 
##  -5546.6  -5406.5   2793.3  -5586.6     8140 
## 
## Random effects:
## 
## Conditional model:
##  Groups       Name        Variance Std.Dev.
##  participant  (Intercept) 0.6958   0.8341  
##  test_feature (Intercept) 0.1870   0.4325  
## Number of obs: 8160, groups:  participant, 507; test_feature, 16
## 
## Dispersion parameter for beta family (): 2.77 
## 
## Conditional model:
##                                 Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                    -0.550573   0.142345  -3.868  0.00011 ***
## trained_babies_blanketsTRUE     0.034927   0.108064   0.323  0.74654    
## trained_bounce_ball_headTRUE   -0.047476   0.104438  -0.455  0.64941    
## trained_can_flip_airTRUE       -0.004133   0.105859  -0.039  0.96885    
## trained_chase_shadowsTRUE      -0.104644   0.107005  -0.978  0.32810    
## trained_climb_fencesTRUE        0.067247   0.103950   0.647  0.51769    
## trained_dont_like_icecreamTRUE  0.111003   0.106230   1.045  0.29605    
## trained_dont_like_mudTRUE      -0.092568   0.109743  -0.844  0.39895    
## trained_draw_stars_kneesTRUE    0.101930   0.103258   0.987  0.32357    
## trained_eat_flowersTRUE        -0.085195   0.103698  -0.822  0.41132    
## trained_flap_arms_happyTRUE     0.102891   0.103109   0.998  0.31833    
## trained_freckles_feetTRUE      -0.028907   0.102005  -0.283  0.77688    
## trained_hop_puddlesTRUE         0.062985   0.107716   0.585  0.55873    
## trained_like_singTRUE          -0.140180   0.107792  -1.300  0.19344    
## trained_scared_ladybugsTRUE     0.017752   0.106578   0.167  0.86772    
## trained_sleep_treesTRUE         0.003615   0.106654   0.034  0.97296    
## trained_stripes_hairTRUE        0.157599   0.105273   1.497  0.13438    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## # A tibble: 16 × 1
##    test_feature
##    <chr>       
##  1 cave        
##  2 lion        
##  3 potatoes    
##  4 banjo       
##  5 look_left   
##  6 clap        
##  7 sad         
##  8 maple_syrup 
##  9 cats        
## 10 opera       
## 11 dance       
## 12 song        
## 13 window      
## 14 garbage     
## 15 pond        
## 16 yellow

Additional possible analyses:

Within each condition, compute the mean difference in prevalence between participants who saw feature X and participants who did not, then average that difference across conditions.
Feature effects are probably larger in conditions with lower counts.
Could try placing more constraints on which features are seen in conditions with lower counts.
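
The first proposed analysis could be sketched as follows, using base R on toy data. The column names (condition, saw_x, prevalence) and the values are hypothetical and are not taken from the study data. Conditions where every participant saw the feature (or none did) have no comparison group, so they are dropped from the average.

```r
# Toy data: one row per participant (hypothetical names and values).
toy <- data.frame(
  condition  = rep(c("4_of_8", "8_of_16", "16_of_16"), times = c(4, 4, 2)),
  saw_x      = c(TRUE, TRUE, FALSE, FALSE,
                 TRUE, TRUE, FALSE, FALSE,
                 TRUE, TRUE),
  prevalence = c(0.6, 0.8, 0.5, 0.5,
                 0.4, 0.6, 0.3, 0.1,
                 0.7, 0.9)
)

# Within each condition: mean prevalence for participants who saw feature X
# minus mean prevalence for participants who did not.
# A condition with an empty group yields NaN.
diff_by_condition <- sapply(split(toy, toy$condition), function(d) {
  mean(d$prevalence[d$saw_x]) - mean(d$prevalence[!d$saw_x])
})

# Unweighted average across conditions, skipping undefined differences.
mean_diff <- mean(diff_by_condition, na.rm = TRUE)
print(diff_by_condition)
print(mean_diff)
```

An unweighted average treats every condition equally; weighting by condition size would be a reasonable alternative if group sizes differ a lot.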

Test features order effects

All participants saw and rated the prevalence of the same set of 16 test features, in random order. Did the order of test feature/prevalence judgment questions matter for prevalence judgments? No: neither the main effect of order nor any of its interactions was significant (ps > .16).

##  Family: beta  ( logit )
## Formula:          
## prevalence ~ num_generics * prop_generics * test_feature_order +  
##     (1 | participant) + (1 | test_feature)
## Data: data_tidy
## 
##      AIC      BIC   logLik deviance df.resid 
##  -5191.6  -5115.3   2606.8  -5213.6     7541 
## 
## Random effects:
## 
## Conditional model:
##  Groups       Name        Variance Std.Dev.
##  participant  (Intercept) 0.6656   0.8159  
##  test_feature (Intercept) 0.1945   0.4410  
## Number of obs: 7552, groups:  participant, 472; test_feature, 16
## 
## Dispersion parameter for beta family (): 2.79 
## 
## Conditional model:
##                                                Estimate Std. Error z value
## (Intercept)                                   -0.763909   0.138140  -5.530
## num_generics                                   0.030589   0.031044   0.985
## prop_generics                                  0.573595   0.214778   2.671
## test_feature_order                             0.004835   0.004762   1.015
## num_generics:prop_generics                    -0.017406   0.032531  -0.535
## num_generics:test_feature_order               -0.001499   0.001772  -0.846
## prop_generics:test_feature_order              -0.016804   0.012242  -1.373
## num_generics:prop_generics:test_feature_order  0.001496   0.001855   0.807
##                                                  Pr(>|z|)    
## (Intercept)                                   0.000000032 ***
## num_generics                                      0.32446    
## prop_generics                                     0.00757 ** 
## test_feature_order                                0.31003    
## num_generics:prop_generics                        0.59260    
## num_generics:test_feature_order                   0.39770    
## prop_generics:test_feature_order                  0.16986    
## num_generics:prop_generics:test_feature_order     0.41989    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Power analysis

What is the smallest effect size of interest?

Bootstrapped power analysis to detect the following effects:

main effect of condition
main effect of proportionality, in same model as count
main effect of proportionality, in same model as count - in mixed conditions only
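
One way the bootstrapped power analysis could be set up is sketched below for the third effect (proportionality alongside count, mixed conditions only). This is an assumption about the procedure, which the report does not specify: participants are resampled with replacement within each condition at a target sample size, the model is refit, and power is the share of replicates with p < .05. The data are simulated toy values, boot_p is a hypothetical helper, and lm on participant-level means is a fast stand-in for the beta mixed model fit with glmmTMB.

```r
set.seed(8)

# Toy stand-in for participant-level pilot data in the six mixed conditions.
design <- data.frame(num_generics = c(4, 4, 4, 8, 8, 12),
                     total_utt    = c(8, 12, 16, 12, 16, 16))
pilot <- design[rep(seq_len(nrow(design)), each = 30), ]
pilot$prop_generics <- pilot$num_generics / pilot$total_utt
# Simulated mean prevalence per participant (hypothetical effect sizes).
pilot$prevalence <- plogis(-0.7 + 0.02 * pilot$num_generics +
                             0.35 * pilot$prop_generics +
                             rnorm(nrow(pilot), sd = 0.8))

# One bootstrap replicate: resample n participants per condition with
# replacement, refit, and return the p-value for prop_generics.
boot_p <- function(data, n_per_condition) {
  cells <- split(data, list(data$num_generics, data$total_utt), drop = TRUE)
  resampled <- do.call(rbind, lapply(cells, function(d) {
    d[sample(nrow(d), n_per_condition, replace = TRUE), ]
  }))
  # Stand-in model; the real analysis would refit the glmmTMB beta model.
  fit <- lm(prevalence ~ num_generics + prop_generics, data = resampled)
  summary(fit)$coefficients["prop_generics", "Pr(>|t|)"]
}

# Estimated power at a candidate sample size of 60 per condition.
p_values <- replicate(200, boot_p(pilot, n_per_condition = 60))
power <- mean(p_values < 0.05)
print(power)
```

Repeating the last step over a grid of n_per_condition values would give a power curve from which a target sample size can be read off.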

Footnotes

## R version 4.4.2 (2024-10-31)
## Platform: aarch64-apple-darwin20
## Running under: macOS Sequoia 15.3.1
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: America/New_York
## tzcode source: internal
## 
## attached base packages:
## [1] grid      stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] emmeans_1.10.4  car_3.1-3       carData_3.0-5   glmmTMB_1.1.10 
##  [5] lubridate_1.9.3 forcats_1.0.0   stringr_1.5.1   dplyr_1.1.4    
##  [9] purrr_1.0.2     readr_2.1.5     tidyr_1.3.1     tibble_3.2.1   
## [13] ggplot2_3.5.1   tidyverse_2.0.0 gt_0.11.1       scales_1.3.0   
## [17] janitor_2.2.0   here_1.0.1     
## 
## loaded via a namespace (and not attached):
##  [1] Rdpack_2.6.2        gridExtra_2.3       sandwich_3.1-1     
##  [4] rlang_1.1.4         magrittr_2.0.3      multcomp_1.4-26    
##  [7] snakecase_0.11.1    compiler_4.4.2      mgcv_1.9-1         
## [10] systemfonts_1.1.0   vctrs_0.6.5         pkgconfig_2.0.3    
## [13] crayon_1.5.3        fastmap_1.2.0       backports_1.5.0    
## [16] labeling_0.4.3      utf8_1.2.4          rmarkdown_2.29     
## [19] tzdb_0.4.0          nloptr_2.1.1        ragg_1.3.2         
## [22] bit_4.5.0.1         xfun_0.49           cachem_1.1.0       
## [25] jsonlite_1.8.9      parallel_4.4.2      cluster_2.1.6      
## [28] R6_2.5.1            bslib_0.8.0         stringi_1.8.4      
## [31] boot_1.3-31         rpart_4.1.23        jquerylib_0.1.4    
## [34] numDeriv_2016.8-1.1 estimability_1.5.1  Rcpp_1.0.13        
## [37] knitr_1.49          zoo_1.8-12          base64enc_0.1-3    
## [40] Matrix_1.7-1        splines_4.4.2       nnet_7.3-19        
## [43] timechange_0.3.0    tidyselect_1.2.1    rstudioapi_0.17.1  
## [46] abind_1.4-8         yaml_2.3.10         TMB_1.9.17         
## [49] codetools_0.2-20    lattice_0.22-6      withr_3.0.2        
## [52] coda_0.19-4.1       evaluate_1.0.1      foreign_0.8-87     
## [55] survival_3.7-0      xml2_1.3.6          pillar_1.10.0      
## [58] checkmate_2.3.2     reformulas_0.4.0    generics_0.1.3     
## [61] vroom_1.6.5         rprojroot_2.0.4     hms_1.1.3          
## [64] munsell_0.5.1       minqa_1.2.8         glue_1.8.0         
## [67] Hmisc_5.1-3         tools_4.4.2         data.table_1.15.4  
## [70] lme4_1.1-35.5       mvtnorm_1.3-1       rbibutils_2.3      
## [73] colorspace_2.1-1    nlme_3.1-166        htmlTable_2.4.3    
## [76] Formula_1.2-5       cli_3.6.3           textshaping_0.4.0  
## [79] ggthemes_5.1.0      viridisLite_0.4.2   gtable_0.3.5       
## [82] sass_0.4.9          digest_0.6.37       TH.data_1.1-2      
## [85] htmlwidgets_1.6.4   farver_2.1.2        htmltools_0.5.8.1  
## [88] lifecycle_1.0.4     bit64_4.5.2         MASS_7.3-61

Compgenerics study 8 (proportionality) exploratory study

Marianna Zhang

2025-03-10

