Overview

This study investigated whether 4- to 8-year-olds are sensitive to sampling information in making inferences about a social group, i.e., whether they can adjust their inferences after seeing a skewed sample of group members.

We preregistered the prediction that children would be sensitive to sampling, i.e., that they would be more likely to report that the population of a novel group is taller than the observed sample after seeing a sample selected for being short.

In a deviation from our preregistration, we included children who failed one of the warmup questions (“the same” warmup), since it proved more difficult than expected. The results are qualitatively the same if we exclude these participants.

Pending video validation, we confirmed our prediction: participants were more likely to choose the population as taller in the skewed condition than the not skewed condition.

Methods

The study was preregistered on OSF.

Power analysis

See 1a_children_power_analysis.html.

Participants

Data were collected from 156 children recruited via PANDA on November 6-7, 2025. Participants were required to be in the United States.

Participants were paid $10 for an estimated 10-15 minute task.

The final sample included 134 children (n = 63-71 in each of the 2 conditions).

boarding      participants
not skewed    71
skewed        63

Exclusion criteria

A total of 22 participants (14.1% of all participants) were excluded for meeting at least 1 of the following exclusion criteria:

  • failing the sound check (n = 1 participant)

  • failing the “taller” of the two warmup questions (n = 2 participants)

  • In a deviation from our preregistration, which specified exclusion for failing either warmup question, participants were not excluded for failing “the same” warmup question, because this question proved unexpectedly challenging (n = 56 participants failed).

  • failing the memory check (n = 18 participants)

  • failing the comprehension check (n = 4 participants)

  • not passing video validation, e.g., no parent or child in frame for entire duration of video, parental interference (n = TBD participants)
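
The exclusion counts above can be cross-checked with a few lines of arithmetic. This is a minimal sketch that uses only the numbers reported in this section; the variable names are illustrative, and the pending video validation count is left out.

```r
# Counts reported in this section.
n_recruited <- 156
n_excluded  <- 22

# Final sample size and exclusion rate.
n_final      <- n_recruited - n_excluded                  # 134
pct_excluded <- round(100 * n_excluded / n_recruited, 1)  # 14.1

# Per-criterion failure counts (video validation is still pending).
failures <- c(sound_check = 1, taller_warmup = 2,
              memory_check = 18, comprehension_check = 4)

# Participants can meet more than one criterion, so the per-criterion
# counts may sum to more than the number of excluded participants.
n_repeat_failures <- sum(failures) - n_excluded           # 3
```

The per-criterion counts sum to 25, so 3 of the recorded failures belong to participants who had already met another exclusion criterion.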

Warmup questions

We asked participants questions designed to elicit a “taller” and a “the same” response to get participants comfortable with answering either option.

The order of these two questions was counterbalanced. The order of options within each question was fixed, and not counterbalanced.

Taller warmup

Participants mostly answered the taller warmup correctly.

Participants who made incorrect responses were excluded.

The same warmup

Unexpectedly, overall performance on “the same” warmup question was poor: a small minority of children thought the duck was taller, perhaps due to the chicken’s comb. In a deviation from the preregistration, participants who failed this question were therefore included.

Memory check

Participants mostly passed the memory check for the Quaffa boarding sequence, i.e., “no”, not all the Quaffas made it onto the boat.

Participants who made incorrect responses were excluded.

Comprehension check

Participants overwhelmingly passed the comprehension check for the Zarpie boarding sequence. Note the correct answer to this question depends on condition:

  • In the skewed condition, the correct answer is “no”, not all of the Zarpies made it onto the boat.

  • In the not skewed condition, the correct answer is “yes”, all of the Zarpies made it onto the boat.

Note that there are some non-responses (NAs), because a response to this question was not required in Qualtrics. These participants are included below; excluding them does not change the results.

Demographics

age
mean    sd      n
6.56    1.38    134

gender    n     prop
female    69    51.5%
male      65    48.5%

race                           n     prop
Caucasian                      76    56.7%
Asian, Caucasian               18    13.4%
Asian                          17    12.7%
Caucasian, Hispanic            7     5.2%
African American               5     3.7%
Hispanic                       5     3.7%
African American, Caucasian    3     2.2%
Asian, Hispanic                2     1.5%
South American                 1     0.7%

gender    race                           n     prop
female    African American               4     3.0%
female    African American, Caucasian    2     1.5%
female    Asian                          5     3.7%
female    Asian, Caucasian               7     5.2%
female    Asian, Hispanic                2     1.5%
female    Caucasian                      42    31.3%
female    Caucasian, Hispanic            6     4.5%
female    Hispanic                       1     0.7%
male      African American               1     0.7%
male      African American, Caucasian    1     0.7%
male      Asian                          12    9.0%
male      Asian, Caucasian               11    8.2%
male      Caucasian                      34    25.4%
male      Caucasian, Hispanic            1     0.7%
male      Hispanic                       4     3.0%
male      South American                 1     0.7%

education                       n     prop
High school/GED                 4     3.0%
Some college                    16    11.9%
Bachelor's (B.A., B.S.)         44    32.8%
Master's (M.A., M.S.)           49    36.6%
Doctoral (Ph.D., J.D., M.D.)    20    14.9%
NA                              1     0.7%

  • The majority of supervising parents or guardians had attained a college degree.

Procedure

This study was administered as a Qualtrics survey, and approved by the NYU IRB (IRB-FY2024-9169).

After providing consent, participants completed a captcha and sound check, and were asked to watch the videos with sound on. Participants then watched the following videos in order:

  1. In the warmup phase, participants were familiarized with answering questions about height in terms of who is taller or whether they are the same height. Participants saw a duck and a chicken of the same height appear on screen against a grid, and were asked who is taller: the duck, the chicken, or are they the same. A similar question was asked about a giraffe and a bunny, where the giraffe is in fact taller. The order of these two questions was counterbalanced.

  2. In the prior setting and familiarization phase, participants saw a photorealistic picture of 5 human adults and then another picture of a different 5 adults appear on screen against a grid. These adults were all 10 gridline units tall.

Prior setting and familiarization.
  3. In the boat training phase, participants were shown a parade of fictional animals attempting to board the boat, to illustrate how the boat works. In the skewed condition, the boat was 6 units tall. In the not skewed condition, the boat was 10 units tall.

    • The boat height was specified to be accidental (“When the boat builders were building the boat, they started building the boat from the bottom, but ran out of the special wood they needed for the boat! So the boat ended up being this tall. It might be hard for anyone who is taller than the boat to get on the boat.”), to avoid any justificatory reasoning about the height of the boat being informative about the height of Zarpies or vice versa.

    • To communicate how the boat functions to exclude those taller than the boat, participants then watched a parade of 20 fictional animals (Quaffas, taken from Foster-Hanson et al., 2019) attempt to board the boat, one at a time, from shortest to tallest.

    • The heights of the animals were scaled to the height of the boat, such that 10 animals were always shorter than the boat (these animals boarded successfully) and 10 animals were always taller than the boat (all but one were unable to board; the third Quaffa successfully boarded by bending its head).

    Quaffas in the skewed condition. Note the Quaffas are short, since the skewed condition involves a short boat.

    • Participants were asked a memory check: “Did all of the animals board the boat?” (yes/no), and received an affirmation (if they said “no”) or correction (if they said “yes”).

  4. In the boat boarding phase, participants learned that Zarpies live on Zarpie island, and saw an island with many Zarpies overhead. Participants learned that all the grownup Zarpies’ names were put into a hat, and some of their names “were drawn out of a hat to try and visit us”. Participants then saw a parade of Zarpies attempt to board the boat to visit us, one at a time. Participants were told that they were all grown-up Zarpies. The boarding phase was occluded: i.e., the heights of Zarpies were hidden behind a curtain that showed only their feet.

    • In the skewed condition, the boat was 6 units tall. 20 Zarpies attempted to board, 6 of whom successfully made it on (6 out of 20 successful = 30% successful). Of the 6 who made it on, 2 had to stoop to board.

    • In the not skewed condition, the boat was 10 units tall. 6 Zarpies attempted to board, all of whom successfully made it on (6 out of 6 successful = 100% successful). Of the 6 who boarded, none had to stoop to board.

Boarding in not skewed condition.

  5. After the boat boarding phase, participants were asked a comprehension check: “Did all of the Zarpies board the boat?” (yes/no), and received either an affirmation (if they said “no” in the skewed condition, or “yes” in the not skewed condition) or correction (if they said “yes” in the skewed condition, or “no” in the not skewed condition).

  6. In the sample observation phase, all participants saw the Zarpies who successfully boarded the boat get off the boat to visit us. The Zarpies got off one at a time, and each waved (and straightened up, if it had stooped to board). The heights of this observed sample (4, 5, 6, 6, 7, 8) were held constant across conditions.

    • To emphasize the height of the Zarpies relative to the boat, participants watched Zarpies deboard the boat, wave, reboard the boat (with any Zarpies taller than the boat stooping down again to board again), and deboard again (with any Zarpies taller than the boat straightening up again).

Observed sample in skewed condition. Note the observed sample is the same across conditions, but the boat is short in the skewed condition and tall in the not skewed condition.
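
The boarding rates and the number of stooping Zarpies follow directly from the design values above. The sketch below recomputes them from the sample heights, boat heights, and numbers of Zarpies attempting to board; the variable names are illustrative and not taken from the analysis script.

```r
# Design values stated in the procedure.
sample_heights <- c(4, 5, 6, 6, 7, 8)                # observed sample, both conditions
boat_height    <- c(skewed = 6, not_skewed = 10)     # gridline units
n_attempting   <- c(skewed = 20, not_skewed = 6)     # Zarpies who tried to board

# Proportion of attempting Zarpies who boarded (6 boarded in both conditions).
prop_boarded <- length(sample_heights) / n_attempting

# Zarpies in the observed sample who are taller than the boat had to stoop.
n_stooping <- sapply(boat_height, function(h) sum(sample_heights > h))

prop_boarded  # skewed: 0.3, not_skewed: 1
n_stooping    # skewed: 2,   not_skewed: 0
```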

Participants were asked a single dependent variable (DV) question:

  1. Participants were asked an explicit comparison question: who is taller, the Zarpies who visited, the Zarpies on Zarpie island, or are they the same (see “Explicit comparison by condition” below)? The order of the first two options was counterbalanced (“the same” always came last).

Finally, participants’ parents or guardians were asked about any problems or confusion they had and for demographic information.

Planned analyses

Explicit comparison question in skewed condition.

Participants were explicitly asked to compare the population and the sample: “Who is taller? The Zarpies on Zarpie island, the Zarpies who visited, or are they the same?” The order of the first two options was counterbalanced across participants.

Explicit comparison by condition

We preregistered that if children do adjust, they should be more likely to say “Zarpies on Zarpie island” in the skewed condition than in the not skewed condition.

Including those who failed “the same” warmup

## 
##  Fisher's Exact Test for Count Data
## 
## data:  dv_comp_table
## p-value = 0.004556
## alternative hypothesis: two.sided

As predicted, overall, participants provided different responses to the question “Who’s taller?” in the skewed condition versus not skewed condition (Fisher’s test, p = 0.005).

## 
## Call:
## lm(formula = dv_comp_pop ~ boarding, data = data %>% mutate(dv_comp_pop = case_when(dv_comp == 
##     "Zarpies on Zarpie island" ~ 1, TRUE ~ 0)))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.4921 -0.2253 -0.2253  0.5079  0.7746 
## 
## Coefficients:
##                Estimate Std. Error t value  Pr(>|t|)    
## (Intercept)     0.22535    0.05480   4.113 0.0000684 ***
## boardingskewed  0.26671    0.07992   3.337    0.0011 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4617 on 132 degrees of freedom
## Multiple R-squared:  0.07782,    Adjusted R-squared:  0.07083 
## F-statistic: 11.14 on 1 and 132 DF,  p-value: 0.001099

As predicted, participants were more likely to say that Zarpies on Zarpie island (the population) are taller in the skewed condition than in the not skewed condition (t(132) = 3.34, p = 0.001).

Excluding those who failed “the same” warmup

## 
##  Fisher's Exact Test for Count Data
## 
## data:  dv_comp_table
## p-value = 0.007207
## alternative hypothesis: two.sided

As predicted, overall, participants provided different responses to the question “Who’s taller?” in the skewed condition versus not skewed condition (Fisher’s test, p = 0.007).

## 
## Call:
## lm(formula = dv_comp_pop ~ boarding, data = data_exclude_on_both_warmups %>% 
##     mutate(dv_comp_pop = case_when(dv_comp == "Zarpies on Zarpie island" ~ 
##         1, TRUE ~ 0)))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.5476 -0.2273 -0.2273  0.4524  0.7727 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)   
## (Intercept)     0.22727    0.07004   3.245  0.00169 **
## boardingskewed  0.32035    0.10023   3.196  0.00196 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4646 on 84 degrees of freedom
## Multiple R-squared:  0.1084, Adjusted R-squared:  0.09782 
## F-statistic: 10.22 on 1 and 84 DF,  p-value: 0.001963

As predicted, participants were more likely to say that Zarpies on Zarpie island (the population) are taller in the skewed condition than in the not skewed condition (t(84) = 3.20, p = 0.002).
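
Because the outcome is binary and the only predictor is condition, the intercept of each linear model is the proportion choosing “Zarpies on Zarpie island” in the not skewed condition, and the `boardingskewed` coefficient is the difference in proportions between conditions. The sketch below rebuilds both models from cell counts. Note the assumption: the counts (16 of 71 and 31 of 63; 10 of 44 and 23 of 42) are not reported in this document but are inferred from the coefficients and degrees of freedom printed above. The helper `lpm_from_counts` is hypothetical and is not part of the analysis script.

```r
# Fit the linear probability model to data rebuilt from cell counts.
# k = number choosing "Zarpies on Zarpie island", n = group size.
lpm_from_counts <- function(k_not_skewed, n_not_skewed, k_skewed, n_skewed) {
  d <- data.frame(
    boarding = factor(
      rep(c("not skewed", "skewed"), times = c(n_not_skewed, n_skewed)),
      levels = c("not skewed", "skewed")
    ),
    dv_comp_pop = c(
      rep(c(1, 0), times = c(k_not_skewed, n_not_skewed - k_not_skewed)),
      rep(c(1, 0), times = c(k_skewed, n_skewed - k_skewed))
    )
  )
  lm(dv_comp_pop ~ boarding, data = d)
}

# Counts inferred from the reported output (an assumption, see above).
fit_all    <- lpm_from_counts(16, 71, 31, 63)  # including "the same" failures
fit_strict <- lpm_from_counts(10, 44, 23, 42)  # excluding "the same" failures

round(coef(fit_all), 5)     # 0.22535, 0.26671
round(coef(fit_strict), 5)  # 0.22727, 0.32035
```

Under these inferred counts, roughly 23% of children chose the population in the not skewed condition in both analyses, compared with about 49% (all participants) or 55% (stricter exclusion) in the skewed condition.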

Exploratory analyses

Explicit comparison by age and condition

We suspected but did not preregister an interaction between age and condition (see 1a power analysis).

## # weights:  15 (8 variable)
## initial  value 147.214047 
## iter  10 value 131.258930
## final  value 131.151555 
## converged
## Analysis of Deviance Table (Type II tests)
## 
## Response: dv_comp
##                    LR Chisq Df Pr(>Chisq)   
## boarding             9.5389  2   0.008485 **
## age_exact            2.6965  2   0.259697   
## boarding:age_exact   2.1024  2   0.349512   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

There was not a statistically significant interaction between condition and age (exact) in a multinomial model of responses with age, condition, and their interaction as predictors (likelihood-ratio χ²(2) = 2.10, p = 0.350).

Power analysis on age and condition interaction

This power analysis was bootstrapped from the data in this study.

Power on target effects by total sample size
(multinomial logistic regression predicting response from age, condition, and their interaction)

total sample size    power: age x condition interaction
200                  0.453
300                  0.590
400                  0.784
500                  0.866
600                  0.933
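
One way such a bootstrapped power analysis could be set up is sketched below. This is an illustration under stated assumptions and not the script that produced the table: the data frame `pilot` holds randomly generated placeholder values standing in for the study data, the functions `interaction_p` and `bootstrap_power` are hypothetical, and the interaction is tested by comparing nested `nnet::multinom` fits with a likelihood-ratio test rather than with the Type II analysis of deviance table shown earlier.

```r
library(nnet)  # multinom(); a recommended package shipped with R

set.seed(1)

# Placeholder data standing in for the study data (random values, NOT real data).
n_obs <- 134
pilot <- data.frame(
  boarding  = factor(sample(c("not skewed", "skewed"), n_obs, replace = TRUE)),
  age_exact = runif(n_obs, min = 4, max = 9),
  dv_comp   = factor(sample(c("island", "visited", "same"), n_obs, replace = TRUE))
)

# p-value of the likelihood-ratio test for the condition x age interaction.
interaction_p <- function(d) {
  full    <- multinom(dv_comp ~ boarding * age_exact, data = d, trace = FALSE)
  reduced <- multinom(dv_comp ~ boarding + age_exact, data = d, trace = FALSE)
  lr_stat <- reduced$deviance - full$deviance
  lr_df   <- full$edf - reduced$edf
  pchisq(lr_stat, df = lr_df, lower.tail = FALSE)
}

# Share of bootstrap resamples of size n_total with a significant interaction.
bootstrap_power <- function(d, n_total, n_sims = 200, alpha = 0.05) {
  p_values <- replicate(n_sims, {
    resampled <- d[sample(nrow(d), n_total, replace = TRUE), ]
    interaction_p(resampled)
  })
  mean(p_values < alpha)
}

# Estimated power at a total sample size of 200 (few simulations, for speed).
power_200 <- bootstrap_power(pilot, n_total = 200, n_sims = 50)
```

Resampling rows with replacement at each target sample size treats the observed data as the population, so the resulting power estimates inherit any noise in the observed effect sizes.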

Session info

## R version 4.4.2 (2024-10-31)
## Platform: aarch64-apple-darwin20
## Running under: macOS Sequoia 15.7.2
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: America/New_York
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] effectsize_1.0.0 emmeans_1.10.4   nnet_7.3-19      lmerTest_3.1-3  
##  [5] lme4_1.1-35.5    Matrix_1.7-1     car_3.1-3        carData_3.0-5   
##  [9] ggtext_0.1.2     lubridate_1.9.3  forcats_1.0.0    stringr_1.5.1   
## [13] dplyr_1.1.4      purrr_1.0.2      readr_2.1.5      tidyr_1.3.1     
## [17] tibble_3.2.1     ggplot2_3.5.1    tidyverse_2.0.0  gt_0.11.1       
## [21] scales_1.3.0     janitor_2.2.0    here_1.0.1      
## 
## loaded via a namespace (and not attached):
##  [1] tidyselect_1.2.1    farver_2.1.2        fastmap_1.2.0      
##  [4] TH.data_1.1-2       bayestestR_0.17.0   digest_0.6.37      
##  [7] timechange_0.3.0    estimability_1.5.1  lifecycle_1.0.4    
## [10] survival_3.7-0      magrittr_2.0.3      compiler_4.4.2     
## [13] rlang_1.1.4         sass_0.4.9          tools_4.4.2        
## [16] yaml_2.3.10         knitr_1.49          labeling_0.4.3     
## [19] bit_4.5.0.1         xml2_1.3.6          abind_1.4-8        
## [22] multcomp_1.4-26     withr_3.0.2         numDeriv_2016.8-1.1
## [25] grid_4.4.2          datawizard_1.3.0    colorspace_2.1-1   
## [28] MASS_7.3-61         insight_1.4.2       cli_3.6.3          
## [31] mvtnorm_1.3-1       crayon_1.5.3        rmarkdown_2.29     
## [34] ragg_1.3.2          generics_0.1.3      rstudioapi_0.17.1  
## [37] tzdb_0.4.0          parameters_0.28.2   minqa_1.2.8        
## [40] cachem_1.1.0        splines_4.4.2       ggthemes_5.1.0     
## [43] parallel_4.4.2      vctrs_0.6.5         boot_1.3-31        
## [46] sandwich_3.1-1      jsonlite_1.8.9      hms_1.1.3          
## [49] bit64_4.5.2         Formula_1.2-5       systemfonts_1.1.0  
## [52] jquerylib_0.1.4     glue_1.8.0          nloptr_2.1.1       
## [55] codetools_0.2-20    stringi_1.8.4       gtable_0.3.5       
## [58] munsell_0.5.1       pillar_1.10.0       htmltools_0.5.8.1  
## [61] R6_2.5.1            textshaping_0.4.0   rprojroot_2.0.4    
## [64] vroom_1.6.5         evaluate_1.0.1      lattice_0.22-6     
## [67] gridtext_0.1.5      snakecase_0.11.1    bslib_0.8.0        
## [70] Rcpp_1.0.13         coda_0.19-4.1       nlme_3.1-166       
## [73] xfun_0.49           zoo_1.8-12          pkgconfig_2.0.3