Overview

Project goals

The goals of this project are to establish:

  1. whether children and adults generalize to a social group from a sample that is, unbeknownst to them, structurally skewed, resulting in inaccurate beliefs about the group

  2. whether children and adults can adjust their generalization from a sample to a social group to account for the fact that the sample was skewed by a structural process

This study focuses on question (2) in adults.

Previously on..

In Studies 1a and 1b, we found that adults estimated a novel social group to be taller when the exact same sample of group members was generated by a skewed sampling process than when it was generated by a not skewed sampling process.

However, this result could be explained by the boarding scene indirectly revealing information about the population to participants, who could then generalize directly from that information, ignoring the sample and the process by which the sample was generated.

Study goals

The primary goal of this study was to rule out an alternative perceptual explanation for adults’ success in Studies 1a and 1b, where adults could have used population information revealed in the boarding scene.

We eliminated the boarding scene entirely and bolstered the boat training, such that the only information participants had to work with was that the boat makes it difficult for anyone taller than the boat to board (as illustrated by Quaffas), the sample of Zarpies who visited, and the height of the boat the Zarpies came on.

Results

Indeed, adults remained more likely to say that Zarpies on Zarpie island are “taller” in the skewed condition than in the not skewed condition, suggesting that adults are capable of adjusting for the sampling process. However, the frequency of “taller” responses was low overall, and the vast majority of participants still reported that Zarpies on Zarpie island are “the same” in height. This suggests that adults are capable of adjusting, but that most did not engage this process.

Methods

The study was preregistered on OSF.

Participants

Data were collected from 199 adults recruited via Prolific on Wednesday, 4/15/2026, as a standard sample. Participants were required to be in the United States, to be fluent in English, and to have not participated in any previous studies in this project.

Participants were paid $2.00 for an estimated 7-8 minute task. In fact, the study generally took about 8-9 minutes for participants.

The final sample included 199 adults (n = 99-100 in each of the 2 conditions).

boarding     participants
not skewed   100
skewed       99

Exclusion criteria

Unlike previous studies, no participants were excluded for failing questions.

Sound check

Participants overwhelmingly passed the sound check, i.e., selected the “bird” as making the chirping noise.

Participants who made incorrect responses were included.

Boat comprehension check

Participants overwhelmingly passed the comprehension check for how the boat works. That is, after watching the Quaffa boat boarding sequence, participants selected the “short Quaffa” (shorter than the boat presented) as the Quaffa that can make it onto the boat for sure.

Participants who made incorrect responses were included.

Demographics

age
mean sd n
39.71 12.67 199
  • The sample was largely young and middle-aged.
gender n prop
Female 103 51.8%
Male 92 46.2%
Non-binary 3 1.5%
my sex is female. I do not have a gender identity 1 0.5%
  • The sample was diverse in terms of gender identities in the US.
race n prop
White, Caucasian, or European American 122 61.3%
Black or African American 26 13.1%
East Asian 15 7.5%
Hispanic or Latino/a 10 5.0%
South or Southeast Asian 9 4.5%
White, Caucasian, or European American,Black or African American 4 2.0%
White, Caucasian, or European American,Hispanic or Latino/a 4 2.0%
White, Caucasian, or European American,Native American, American Indian, or Alaska Native 3 1.5%
White, Caucasian, or European American,South or Southeast Asian 2 1.0%
Hispanic or Latino/a,Native American, American Indian, or Alaska Native 1 0.5%
Middle Eastern or North African 1 0.5%
Native Hawaiian or other Pacific Islander 1 0.5%
White, Caucasian, or European American,Black or African American,Native American, American Indian, or Alaska Native 1 0.5%
  • The sample was also racially diverse.
education n prop
Less than high school 1 0.5%
High school/GED 21 10.6%
Some college 57 28.6%
Bachelor's (B.A., B.S.) 80 40.2%
Master's (M.A., M.S.) 35 17.6%
Doctoral (Ph.D., J.D., M.D.) 5 2.5%
  • About 60% of the sample (120 of 199) had attained a bachelor's degree or higher.

Procedure

This study was administered as a Qualtrics survey, and approved by the NYU IRB (IRB-FY2024-9169).

After providing their consent, participants completed a captcha and a sound check, and were asked to watch the videos with sound on. Participants then watched the following videos in order:

  1. In the prior setting and familiarization phase, participants saw a photorealistic picture of 5 human adults and then another picture of a different 5 adults appear on screen against a grid. These adults were all 10 gridline units tall.
Prior setting and familiarization.
  2. In the boat training phase, participants were shown a parade of fictional animals attempting to board the boat, to illustrate how the boat works. In the skewed condition, the boat was 6 units tall. In the not skewed condition, the boat was 14 units tall.

    • The boat height was specified to be accidental (“When the boat builders were building the boat, they started building the boat from the bottom, but ran out of the special wood they needed for the boat! So the boat ended up being this tall. It might be hard for anyone who is taller than the boat to get on the boat.”), to avoid any justificatory reasoning about the height of the boat being informative about the height of Zarpies or vice versa.

    • To communicate how the boat functions to exclude those taller than the boat, participants then watched a parade of 20 fictional animals (Quaffas, taken from Foster-Hanson et al., 2019) attempt to board the boat, one at a time, from shortest to tallest.

    • The heights of the animals were scaled to the height of the boat, such that 10 animals were always shorter than the boat (these animals boarded successfully) and 10 animals were always taller than the boat (all but one were unable to board; the third Quaffa successfully boarded by bending its head).

    Quaffas in the skewed condition. Note the Quaffas are short, since the skewed condition involves a short boat.
    • Participants were asked a comprehension check: “Which Quaffa can make it onto the boat for sure?”, and had to choose between a Quaffa shorter than the boat, and a Quaffa taller than the boat. After responding, they received information that the short Quaffa was the correct answer because it can definitely fit onto the boat.
  3. In the boat boarding scene, participants learned that Zarpies live on Zarpie island, and saw an island with many Zarpies overhead. Unlike previous studies, nothing else was shown in this scene, and participants were simply told that the boat works the same way for Zarpies as it does for Quaffas: if there are any Zarpies taller than the boat, they might have a hard time getting on.

  4. In the sample observation phase, all participants saw the Zarpies who successfully boarded the boat get off the boat to visit us. The Zarpies got off one at a time, and each waved/descrunched if relevant. The heights in this observed sample (4, 5, 6, 6, 7, 8) were held constant across conditions.

    • To emphasize the height of the Zarpies relative to the boat, participants watched Zarpies deboard the boat, wave, reboard the boat (with any Zarpies taller than the boat stooping down to board), and deboard again (with any Zarpies taller than the boat straightening up again).
Observed sample in skewed condition. Note the observed sample is the same, but the height of the boat is short in the skewed condition, vs tall in the not skewed condition.
  5. Participants were asked an explicit comparison question: whether Zarpies on Zarpie island are shorter than, about the same as, or taller than the Zarpies who visited (see explicit comparison).

Finally, participants were asked for any problems or confusion they had, what they thought the task was about (see [Participant feedback]), and demographic information.
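
To make the contrast between the two conditions concrete, here is a minimal Python sketch that uses only the heights reported above. It is not part of the study materials, and the names `SAMPLE_HEIGHTS`, `BOAT_HEIGHTS`, and `taller_than_boat` are hypothetical. It lists which members of the observed sample are taller than the boat in each condition, i.e., which Zarpies would have to stoop to board.

```python
# Heights of the observed Zarpie sample, in gridline units (identical in both conditions).
SAMPLE_HEIGHTS = [4, 5, 6, 6, 7, 8]

# Boat height per condition, in gridline units (short boat = skewed, tall boat = not skewed).
BOAT_HEIGHTS = {"skewed": 6, "not skewed": 14}


def taller_than_boat(heights, boat_height):
    """Return the heights that strictly exceed the boat height."""
    return [h for h in heights if h > boat_height]


for condition, boat in BOAT_HEIGHTS.items():
    stoopers = taller_than_boat(SAMPLE_HEIGHTS, boat)
    print(f"{condition}: boat = {boat}, taller than boat = {stoopers}")
```

In the skewed condition, two sampled Zarpies (heights 7 and 8) exceed the boat; in the not skewed condition, none do. The same sample therefore sits at the boat's cutoff in one condition and well below it in the other.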

Primary results

Explicit comparison

Participants were explicitly asked to compare the population to the sample: “Do you think the Zarpies on Zarpie island are shorter, the same, or taller in height than the Zarpies who visited?”

We pre-registered that if adults do adjust, they should be more likely to say “taller” in the skewed condition than in the not skewed condition.

Notably, “the same” was by far the most common response in both conditions.

boarding     shorter   the same   taller
not skewed   8%        80%        12%
skewed       5%        70%        25%
## # weights:  9 (4 variable)
## initial  value 218.623845 
## iter  10 value 137.744908
## iter  10 value 137.744907
## final  value 137.744907 
## converged
## # weights:  6 (2 variable)
## initial  value 218.623845 
## iter  10 value 140.831356
## iter  10 value 140.831356
## iter  10 value 140.831356
## final  value 140.831356 
## converged
## Analysis of Deviance Table (Type II tests)
## 
## Response: dv_comp
##          LR Chisq Df Pr(>Chisq)  
## boarding   6.1729  2    0.04566 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

As predicted, participants’ explicit comparisons differed significantly between conditions (LR Chisq(2) = 6.17, p = 0.046).
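
The likelihood-ratio statistic can be recovered by hand from the model-fitting trace above. The sketch below is written in Python rather than the R used for the analysis. It assumes that the two `final value` lines are the negative log-likelihoods of the model with the `boarding` predictor (137.744907) and of the intercept-only model (140.831356). The variable names are illustrative.

```python
import math

# Final negative log-likelihoods copied from the fitting trace (assumed mapping).
nll_full = 137.744907   # model with boarding (4 free parameters)
nll_null = 140.831356   # intercept-only model (2 free parameters)

# Likelihood-ratio statistic: twice the drop in negative log-likelihood.
lr_stat = 2 * (nll_null - nll_full)
df = 4 - 2

# For a chi-square distribution with exactly 2 degrees of freedom,
# the upper-tail probability has the closed form exp(-x / 2).
p_value = math.exp(-lr_stat / 2)

print(f"LR chi-square({df}) = {lr_stat:.4f}, p = {p_value:.5f}")
```

Under that assumption, the statistic and p-value agree with the Analysis of Deviance table to rounding. The closed form holds only for 2 degrees of freedom; other cases need a general chi-square survival function.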

## 
## Call:
## glm(formula = dv_comp_taller ~ boarding, family = binomial, data = .)
## 
## Coefficients:
##                Estimate Std. Error z value       Pr(>|z|)    
## (Intercept)     -1.9924     0.3077  -6.475 0.000000000095 ***
## boardingskewed   0.9072     0.3850   2.357         0.0184 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 191.15  on 198  degrees of freedom
## Residual deviance: 185.27  on 197  degrees of freedom
## AIC: 189.27
## 
## Number of Fisher Scoring iterations: 4
## 
## Call:
## glm(formula = dv_comp_same ~ boarding, family = binomial, data = .)
## 
## Coefficients:
##                Estimate Std. Error z value     Pr(>|z|)    
## (Intercept)      1.3863     0.2500   5.545 0.0000000294 ***
## boardingskewed  -0.5534     0.3322  -1.666       0.0957 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 224.36  on 198  degrees of freedom
## Residual deviance: 221.54  on 197  degrees of freedom
## AIC: 225.54
## 
## Number of Fisher Scoring iterations: 4
## 
## Call:
## glm(formula = dv_comp_shorter ~ boarding, family = binomial, 
##     data = .)
## 
## Coefficients:
##                Estimate Std. Error z value        Pr(>|z|)    
## (Intercept)     -2.4423     0.3686  -6.626 0.0000000000345 ***
## boardingskewed  -0.4915     0.5886  -0.835           0.404    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 96.069  on 198  degrees of freedom
## Residual deviance: 95.354  on 197  degrees of freedom
## AIC: 99.354
## 
## Number of Fisher Scoring iterations: 5

Specifically, participants were more likely to say that Zarpies on Zarpie island are “taller” in the skewed condition than in the not skewed condition (z = 2.36, p = 0.018).

Participants did not differ significantly across conditions in their likelihood of saying that Zarpies on Zarpie island are “the same” (z = -1.67, p = 0.096) or that they are “shorter” (z = -0.84, p = 0.404).
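
With a single binary predictor, the logistic regression estimates have a closed form, so the “taller” model can be checked by hand. The Python sketch below is not the original R analysis. The cell counts (12 of 100 and 25 of 99) are an assumption: they are inferred from the rounded percentages and the group sizes, not taken from the raw data.

```python
import math

# Assumed cell counts, inferred from the rounded percentages and group sizes.
taller_not_skewed, n_not_skewed = 12, 100
taller_skewed, n_skewed = 25, 99


def log_odds(successes, total):
    """Log odds of a 'taller' response within one condition."""
    return math.log(successes / (total - successes))


intercept = log_odds(taller_not_skewed, n_not_skewed)
slope = log_odds(taller_skewed, n_skewed) - intercept  # log odds ratio

# Standard error of a log odds ratio: sqrt of the summed reciprocal cell counts.
std_error = math.sqrt(
    1 / taller_not_skewed
    + 1 / (n_not_skewed - taller_not_skewed)
    + 1 / taller_skewed
    + 1 / (n_skewed - taller_skewed)
)
z = slope / std_error
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
odds_ratio = math.exp(slope)

print(f"intercept = {intercept:.4f}, slope = {slope:.4f}, SE = {std_error:.4f}")
print(f"z = {z:.3f}, p = {p_value:.4f}, odds ratio = {odds_ratio:.2f}")
```

Under these assumed counts, the estimates agree with the `dv_comp_taller` output to rounding. The slope corresponds to odds of a “taller” response roughly 2.5 times higher in the skewed condition than in the not skewed condition.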

Session info

## R version 4.5.2 (2025-10-31)
## Platform: aarch64-apple-darwin20
## Running under: macOS Tahoe 26.3
## 
## Matrix products: default
## BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: America/New_York
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] effectsize_1.0.1  emmeans_2.0.1     nnet_7.3-20       lmerTest_3.2-0   
##  [5] lme4_1.1-38       Matrix_1.7-4      car_3.1-3         carData_3.0-5    
##  [9] viridis_0.6.5     viridisLite_0.4.2 ggtext_0.1.2      lubridate_1.9.4  
## [13] forcats_1.0.1     stringr_1.6.0     dplyr_1.1.4       purrr_1.2.1      
## [17] readr_2.1.6       tidyr_1.3.2       tibble_3.3.1      ggplot2_4.0.1    
## [21] tidyverse_2.0.0   gt_1.3.0          scales_1.4.0      janitor_2.2.1    
## [25] here_1.0.2       
## 
## loaded via a namespace (and not attached):
##  [1] Rdpack_2.6.5        gridExtra_2.3       sandwich_3.1-1     
##  [4] rlang_1.1.7         magrittr_2.0.4      multcomp_1.4-29    
##  [7] snakecase_0.11.1    otel_0.2.0          compiler_4.5.2     
## [10] systemfonts_1.3.1   vctrs_0.7.1         pkgconfig_2.0.3    
## [13] crayon_1.5.3        fastmap_1.2.0       labeling_0.4.3     
## [16] rmarkdown_2.30      tzdb_0.5.0          nloptr_2.2.1       
## [19] ragg_1.5.0          bit_4.6.0           xfun_0.56          
## [22] cachem_1.1.0        jsonlite_2.0.0      parallel_4.5.2     
## [25] R6_2.6.1            bslib_0.10.0        stringi_1.8.7      
## [28] RColorBrewer_1.1-3  boot_1.3-32         jquerylib_0.1.4    
## [31] numDeriv_2016.8-1.1 estimability_1.5.1  Rcpp_1.1.1         
## [34] knitr_1.51          zoo_1.8-15          parameters_0.28.3  
## [37] splines_4.5.2       timechange_0.3.0    tidyselect_1.2.1   
## [40] rstudioapi_0.18.0   abind_1.4-8         yaml_2.3.12        
## [43] codetools_0.2-20    lattice_0.22-7      withr_3.0.2        
## [46] bayestestR_0.17.0   S7_0.2.1            coda_0.19-4.1      
## [49] evaluate_1.0.5      survival_3.8-6      xml2_1.5.2         
## [52] pillar_1.11.1       reformulas_0.4.3.1  insight_1.4.5      
## [55] generics_0.1.4      vroom_1.6.7         rprojroot_2.1.1    
## [58] hms_1.1.4           minqa_1.2.8         xtable_1.8-4       
## [61] glue_1.8.0          tools_4.5.2         fs_1.6.6           
## [64] mvtnorm_1.3-3       grid_4.5.2          rbibutils_2.4.1    
## [67] datawizard_1.3.0    nlme_3.1-168        Formula_1.2-5      
## [70] cli_3.6.5           textshaping_1.0.4   ggthemes_5.2.0     
## [73] gtable_0.3.6        sass_0.4.10         digest_0.6.39      
## [76] TH.data_1.1-5       farver_2.1.2        htmltools_0.5.9    
## [79] lifecycle_1.0.5     gridtext_0.1.5      bit64_4.6.0-1      
## [82] MASS_7.3-65