1 2
70.09 29.91
Group Coordination Goals
Hivemind
A casual Reddit poll was posted with the goal of coordinating an 80/20 split amongst anonymous internet strangers. The poll asked people to choose between two options (A or B) and said that the goal was for 80% of people to choose option A and 20% to choose option B. The poll was pretty successful: the group achieved a 79%/21% split.
Someone else then ran another set of polls with varying splits: 70/30, 50/50, 80/20, etc. The most successful was the 80/20 poll, though its result was closer to 75/25.
In this project, we are investigating when and how people are able to coordinate group level shared goals with very little (or no) information about others’ actions.
How are people deciding which option to choose?
Do people’s choices vary across situations, or does someone who chooses option A always choose the majority option, and vice versa?
If people’s choices vary across situations, what cues are they using about themselves and their perceived group to calibrate their choices?
How are people conceptualizing the group they are participating with, given that they have little to no information about the other participants?
Relevant References
To open these, open in new tab/window
Woolley et al. (2010) Collective intelligence
As an initial proof of concept, we ran a replication of the Reddit Hivemind poll with students around Grounds. We tabled in front of Garrett Hall from 09/14/23 to 10/09/23. Participation was voluntary. The goal was to collect at least 200 responses.
Participants responded to a simple two question survey:
The goal of this survey is for 80% of respondents to select option A and 20% to select option B. Please make your selection below.
How did you decide to choose [chosen option]? Write down whatever came to mind as you were choosing.
Qualtrics title: Group Coordination Pilot
Results
We collected a total of 224 responses.
The group did not achieve an 80/20 split; it was much closer to 70/30.
Some notable themes that I observed in the free response question:
Evaluate self as part of majority
I felt most people might pick option A so I picked option B
thats what most people should do
i consider myself part of the 80%
“I’m not like everyone else” a la TikTok cultural theme
I’m different.
I’m the main character
No introspective process. Reads instructions as telling them to choose A
You told me to
that is the purpose of your survey
I thought I was supposed to select it
Underestimating how many others chose B
Because I felt like everyone was going to choose A so I choose B so that my choice would help account for the 20%.
I figured everyone would choose option A already so I picked B because there was probably less than 20 percent that selected it so far.
“Reverse psychology” or strategic variance
Infelt like many people would try to be “different” and pick B which would sway the percentages out of the 80/20 range so i picked A to counteract one of those votes
I felt like people were going to be inclined to pick option B as it is stated that the goal is for 20% of people to select B so I feel like more people will end up picking B, but option A is supposed to be the majority which is why I chose A
I figured a lot of people would try to make b have the appropriate amount thinking it would naturally be overlooked and smaller, so I did the opposite to overcorrect
Pilot 1.2
In this second pilot, participants were recruited to respond to the hivemind task in exchange for a free bagel. Participants completed this study in combination with other studies [Natalie ran this, check with her for more details].
Responses were collected on 4/21 and 4/22. Qualtrics title: Group Coordination (Bagel Copy)
Results
We collected a total of 83 responses.
Final Split:
A B
64.63 35.37
estimate | statistic | p | parameter | Method | Alternative | 95% CI |
|---|---|---|---|---|---|---|
0.64 | 53.00 | .001*** | 83.00 | Exact binomial test | two.sided | [0.53, 0.74] |
Do people expect success?
[1] 66.28049
[1] 57.49123
[1] 87.9
Most people do not expect the group to achieve coordination. Most people expect about a 60/40 split. The mean expectation is that 66% will choose option A.
Gender
Term | estimate | std.error | statistic | p |
|---|---|---|---|---|
(Intercept) | 0.00 | 1.41 | 0.00 | 1.00 |
gender_codedM | -0.00 | 2.00 | -0.00 | 1.00 |
Interestingly, this is the one case where men outperformed women. The men were much closer to 80/20 than the women were. Maybe because there were only 21 men v 59 women.
The same question as in Pilot 1 was run on the Participant Pool. Launched on 10/31. Target N = 300. Qualtrics title: Group Coordination 2 - PPool
Participants responded to the following questions (new wording in bold):
The goal of this survey is for 80% of respondents to select option A and 20% to select option B. As you know, we are collecting about 300 responses from students on the participant pool. Please make your selection below.
How did you decide to choose [chosen option]? Write down whatever came to mind as you were choosing.
Do you think that this task will succeed at achieving an 80/20 split? i.e. do you think that roughly 80% of the 300 respondents will choose A and 20% will choose B? [yes/no/not sure]
[Participants who said no] You said “no”, you do not expect the task to succeed. What split do you think this task will yield?
Results
We collected a total of 293 responses. 11 were duplicate responses. Of the duplicates, I removed either the unfinished submission(s) or, if there were multiple finished submissions, I removed the submission with the later timestamp.
Final N = 286
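The duplicate-handling rule described above can be sketched as follows. This is a minimal illustration, not the script used for the study: the field names (`pid`, `finished`, `timestamp`) and the sample rows are made up, and the fallback for a participant whose submissions are all unfinished (keep the earliest) is an assumption, since the notes do not cover that case.

```python
from collections import defaultdict

def deduplicate(responses):
    """Keep one submission per participant.

    Rule from the notes: drop unfinished submissions; if several finished
    submissions remain, drop the later ones (keep the earliest timestamp).
    Assumption: if nothing is finished, the earliest submission is kept.
    """
    by_participant = defaultdict(list)
    for response in responses:
        by_participant[response["pid"]].append(response)

    kept = []
    for submissions in by_participant.values():
        finished = [s for s in submissions if s["finished"]]
        candidates = finished if finished else submissions
        # ISO-formatted timestamps sort correctly as plain strings.
        kept.append(min(candidates, key=lambda s: s["timestamp"]))
    return kept

# Made-up rows for illustration only (hypothetical IDs and times).
responses = [
    {"pid": "p1", "finished": True, "timestamp": "2000-01-01T10:00:00"},
    {"pid": "p1", "finished": False, "timestamp": "2000-01-01T09:00:00"},
    {"pid": "p2", "finished": True, "timestamp": "2000-01-02T12:00:00"},
    {"pid": "p2", "finished": True, "timestamp": "2000-01-02T08:00:00"},
    {"pid": "p3", "finished": True, "timestamp": "2000-01-03T15:00:00"},
]
cleaned = deduplicate(responses)
print(len(cleaned), "responses kept")
```

Keeping the earliest finished submission matches the idea that a participant's first complete answer is the one made without prior exposure to the task.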
Final split:
A B
73.78 26.22
estimate | statistic | p | parameter | Method | Alternative | 95% CI |
|---|---|---|---|---|---|---|
0.26 | 75.00 | < .001*** | 286.00 | Exact binomial test | two.sided | [0.21, 0.32] |
This study’s split was a little closer to 80/20 than the first pilot.
A binomial test indicates that the split is significantly different from 80/20 (p = .01).
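For readers who want to check the comparison against 80/20 by hand, here is a minimal sketch of a two-sided exact binomial test using only the standard library. It is not the original analysis (which appears to have been run in R); the function names are illustrative, and the two-sided rule used here (sum the probabilities of every outcome no more likely than the observed one) is an assumption about how the reported test was computed. The counts come from the table above: 75 of 286 chose B, so 211 chose A.

```python
from math import exp, lgamma, log

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials (requires 0 < p < 1)."""
    log_coef = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coef + k * log(p) + (n - k) * log(1 - p))

def exact_binom_test(successes, n, p_null):
    """Two-sided exact p-value: total probability of all outcomes that are
    no more likely than the observed count under the null proportion."""
    pmf = [binom_pmf(k, n, p_null) for k in range(n + 1)]
    cutoff = pmf[successes] * (1 + 1e-7)  # tolerance for floating-point ties
    return min(1.0, sum(prob for prob in pmf if prob <= cutoff))

# Pilot 2: 211 of 286 respondents chose A; the goal was 80% choosing A.
p_pilot2 = exact_binom_test(211, 286, 0.8)
print(f"Pilot 2 vs. 80/20: p = {p_pilot2:.3f}")
```

The same function can be pointed at the other studies by swapping in their counts, for example 99 of 129 choosing A in the in-lab study.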
Exploratory analyses:
Do people expect that the group task can be successful? We asked participants whether they think that the 300 respondents will be able to achieve the 80/20 split.
yes no not sure
11.89 65.73 22.38
Most people (about 66%) did not think that the group would be able to achieve the 80/20 split. They underestimated their group’s ability to coordinate!
We asked people who said “no” to estimate what split would be achieved instead of the 80/20 goal. Participants reported what percentage of people will choose option A (meant to be 80%).
[1] 56.82639
[1] 90.97727
It seems that the doubters fell into two categories. Some people expected more than 80% of respondents to pick A. But more people expected fewer than 80% of respondents to pick A. Most commonly, people expected between 50 and 65% of respondents to pick A.
Exploring the doubters
Next we tested whether those who did not expect that the group could achieve an 80/20 split were more likely to have selected A or B.
For this test, I removed the roughly 40 people who overestimated the percentage of people that would choose A.
Term | estimate | std.error | statistic | p |
|---|---|---|---|---|
(Intercept) | -1.76 | 0.48 | -3.63 | < .001*** |
estimateSuccessno | 0.97 | 0.52 | 1.88 | .061 |
estimateSuccessnot sure | 0.48 | 0.57 | 0.85 | .396 |
Interestingly, those who didn’t think the group would succeed were marginally more likely to choose B, the minority option (p = .061).
Ingroup identification as a predictor
We collected participants’ pretest responses, on which they responded to an ingroup identification scale. The scale consisted of three items:
How important is UVa to your own personal identity?
How similar do you feel in attitudes and opinions to other UVa students?
How strongly do you identify as a UVa student?
Items were collapsed into one variable: ingroup identification with UVa students. We tested whether ingroup identification predicted participants’ likelihood to choose A (the majority choice) or B (the minority choice).
Term | estimate | std.error | statistic | p |
|---|---|---|---|---|
(Intercept) | -0.70 | 0.55 | -1.27 | .205 |
ingroup | -0.07 | 0.12 | -0.58 | .563 |
Ingroup identification did not have a significant relationship with participants’ likelihood to choose A or B.
Gender
Term | estimate | std.error | statistic | p |
|---|---|---|---|---|
(Intercept) | -1.23 | 0.17 | -7.26 | < .001*** |
gender_codedM | 0.54 | 0.29 | 1.87 | .061 |
Men were marginally more likely to choose B than women (p = .061), though the difference is not significant at the .05 level.
We invited participants from Pilot 2 to take the same study again. Launched on 11/13 on the PPool for N = 286. Qualtrics title: Group Coordination 2.2 - PPool Part 2
Participants were invited with the following message:
Hi [First Name],
I am reaching out to let you know that you are qualified to participate in a new study - D4.1 (0.5 credits) - based on your participation in study D4.
Only those who participated in study D4 are qualified to participate in this second iteration of our study, so we encourage you to participate! You will receive another 0.5 credits upon completion of the study. You can sign up for study D4.1 on the Psychology Department’s Sona site.
Thank you!
In this iteration, participants estimated the results from pilot 2, then were told the results of pilot 2, then answered the same question again, and reported their estimate of this iteration’s split and their confidence in their estimate.
We collected a total of 286 responses in the previous study that you participated in a couple of weeks ago. Before we show you the results of that study, we want to know what you estimate the results were. What A/B split do you think was achieved in the study you participated in a couple of weeks ago?
74% of respondents chose option A and 26% chose option B. Now, we will ask you to participate in the same task again with the same group of 286 respondents.
The goal of this survey - as with the previous survey - is for 80% of respondents to select option A and 20% to select option B.
Once again, we are collecting 286 responses from the same students on the participant pool who participated in the previous iteration of this task.
Please make your selection below.
How did you choose? [free response]
What split will this study achieve?
How confident are you in that estimate?
Why do you think we invited you to a second iteration of this study? [free response]
Results
Final n = 127
A B
62.99 37.01
estimate | statistic | p | parameter | Method | Alternative | 95% CI |
|---|---|---|---|---|---|---|
0.28 | 80.00 | < .001*** | 286.00 | Exact binomial test | two.sided | [0.23, 0.34] |
Gender
Term | estimate | std.error | statistic | p |
|---|---|---|---|---|
(Intercept) | -0.64 | 0.23 | -2.79 | .005** |
gender_codedM | 0.30 | 0.39 | 0.75 | .451 |
Participant pool students completed the Reading the Mind in the Eyes Test (RMET), a measure of social sensitivity, on the psych department pretest. Then, they were invited to participate in our in-lab study as part of a 3-study session. [Ask Natalie what the other two studies were, but sometimes it may have been only one other study]. Data were collected from 2/26/2025 to 04/25/2025. Participants received class credit in exchange for completing all of the studies in their 30 minute session.
Qualtrics title: Group Coordination 3 - PPool with Reading Mind Eye
Reading the Mind in the Eyes Test
About 490 people responded to the RMET on the pretest. In general, people perform pretty well on the test.
Mean score = 25.6
SD = 5.0149427
Pilot 3
A total of 129 participants took the study in the lab. Of those, fewer than 60 had completed the RMET in full on the psychology dept pretest.
There were a number of duplicate attempts on the pretest. In most cases, one of the attempts was incomplete, and I deleted the incomplete responses. In some cases, all of a participant’s attempts were complete and yielded different total RMET scores. This left 2 sets of duplicates:
- PID 57 took the pretest 3 times with total scores 30, 34, 33.
- PID 107 took the pretest 2 times with total scores 22, 23
Overall study results
A B
76.74 23.26
Binomial test of achieved ratio and its distance from 80/20
estimate | statistic | p | parameter | Method | Alternative | 95% CI |
|---|---|---|---|---|---|---|
0.77 | 99.00 | .378 | 129.00 | Exact binomial test | two.sided | [0.68, 0.84] |
Overall, participants achieved closer to a 75/25 split than an 80/20 split. But the binomial test indicated that the split is not significantly different from 80/20 (p = .378).
Gender
Term | estimate | std.error | statistic | p |
|---|---|---|---|---|
(Intercept) | -1.34 | 0.29 | -4.60 | < .001*** |
gender_codedM | 0.33 | 0.42 | 0.79 | .431 |
There is no significant effect of gender on what choice people made. But women, again, were closer to an 80/20 split than men were. Women achieved a 79%/21% split.
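The gender table can be read back into percentages with a short calculation. This sketch assumes the table comes from a logistic regression with "chose B" as the outcome and women as the reference group; the notes do not state this, but the assumption is consistent with the 79%/21% split reported for women. The variable names are illustrative.

```python
from math import exp

def inv_logit(x):
    """Convert a log-odds value into a probability."""
    return 1 / (1 + exp(-x))

# Estimates copied from the gender table for the in-lab study.
intercept = -1.34      # log-odds of choosing B for women (assumed reference group)
gender_coded_m = 0.33  # change in log-odds for men

p_b_women = inv_logit(intercept)
p_b_men = inv_logit(intercept + gender_coded_m)

print(f"Women: {100 * (1 - p_b_women):.0f}/{100 * p_b_women:.0f} (A/B)")
print(f"Men:   {100 * (1 - p_b_men):.0f}/{100 * p_b_men:.0f} (A/B)")
```

Under that assumption, the intercept alone gives the women's share choosing B, and adding the gender coefficient gives the men's share.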
In study 4, we tested whether being in a position of power would shift the way people approach the hivemind task. It is possible that a sense of power is one variable that pushes people to be more likely to choose the minority option.
To test this, we recruited passersby on campus in front of Garrett to complete a study for a free bagel. At the beginning of the study, they were ostensibly assigned to be a bagel chooser or a bagel receiver. That is, we told a random half of participants that they had been selected to choose the bagel flavor for the next 4 participants (power condition) and told the other half that they had been randomly selected to receive the bagel that another participant had chosen for them (control/no power condition). They then responded to the same hivemind task that we employed in previous studies.
Participants were collected from 10/09/2024 - 10/18/2024.
Qualtrics title: Group coordination 4 - Power manip
We collected a total of 215 responses.
condition | X80.20split | n |
|---|---|---|
bagelChooser | 1 | 80 |
bagelChooser | 2 | 30 |
bagelReceiver | 1 | 76 |
bagelReceiver | 2 | 29 |
The splits achieved were closer to 70/30 than 80/20 for both bagel choosers and bagel receivers. There didn’t seem to be a difference between the bagel chooser (power condition) group and the bagel receiver group. Accordingly, we didn’t find evidence that a power manipulation shifts the way people approach the hivemind task.
Prolific study with varying group sizes (ostensibly), the RMET, and an added question measuring strategizing on the centipede game as a potential measure of ToM + backward induction. Launched on 06/20/25 with a target total N of 400. Compensation: 1.73 British pounds.
Qualtrics name: Group Coordination 5 - Prolific with varying group sizes
Method
Participants were randomly assigned to one of four conditions: group sizes of 10, 50, 100 or 500. (Note that participants were told they were in a group of 10 people, for example, but we aimed to collect an n of 100 per condition.)
Participants were told that they were being randomly assigned to a group and given the instructions that their group is tasked to achieve an 80/20 split. They were also told the number of people that would have to choose each option to achieve the split, i.e. people in the 10 group condition were told that 8 people should choose A and 2 people should choose B.
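The head counts shown to participants follow from the 80/20 goal and the stated group size. A minimal sketch of that arithmetic (the function name is illustrative, not from the survey logic):

```python
def target_counts(group_size, percent_a=80):
    """Number of people who should choose A and B to hit the goal split."""
    n_a = group_size * percent_a // 100  # integer arithmetic avoids float rounding
    return n_a, group_size - n_a

# The four ostensible group sizes used as conditions.
targets = {size: target_counts(size) for size in (10, 50, 100, 500)}

for size, (n_a, n_b) in targets.items():
    print(f"group of {size}: {n_a} choose A, {n_b} choose B")
```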
Participants then explained in writing how they made their choice, and reported their estimate of what the group would achieve.
Next, participants completed the 36-item RMET.
Finally, participants were given instructions to imagine that they were playing in a centipede game.

They were offered a chance to take or pass the pot for each of their turns, with the game ending whenever they chose to take the pot or at the 9th turn (end of game).
Participants were thanked for their time and paid on the Prolific platform.
Useful references for the centipede game
To open these links, open in new tab/window
Results
We collected a total of 395 complete responses.
condition | n |
|---|---|
10 | 101 |
50 | 101 |
100 | 100 |
500 | 93 |
gender | n |
|---|---|
man | 162 |
man,transgender | 1 |
non-binary | 1 |
not listed | 1 |
transgender | 5 |
woman | 225 |
Overall split results
A B
82.03 17.97
Binomial test of achieved ratio and its distance from 80/20
estimate | statistic | p | parameter | Method | Alternative | 95% CI |
|---|---|---|---|---|---|---|
0.82 | 324.00 | .345 | 395.00 | Exact binomial test | two.sided | [0.78, 0.86] |
Split results by condition
Gender
Gender did not seem to play a role in this study. Note that this is only analyzing those that self-reported “man” or “woman”, and does not include those that self-reported as transgender or non-binary.
Reading the Mind in the Eyes
How did the participants perform on the RMET?
Overall, scores were roughly normally distributed around a mean of 21.50 (SD = 5.81). To analyze RMET score as a predictor of group level performance on the hivemind task, I split the sample at 1 SD below and above the mean. Low performers scored <16 and high performers scored >27.
The mean RMET score is 30.10 in the high performers group and 12.48 in the low performers group.
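The ±1 SD grouping can be sketched as below. The mean and SD are the values reported for this sample; the function name and the example scores are illustrative, and the labels simply reuse the ones from the tables in this section.

```python
def rmet_group(score, mean, sd):
    """Label a score as low, middle, or high relative to mean ± 1 SD."""
    if score < mean - sd:
        return "low performers"
    if score > mean + sd:
        return "high performers"
    return "middle performers"

# Mean and SD as reported for this study's RMET scores.
MEAN, SD = 21.4962025, 5.8058137

# Whole-number scores on either side of each cutoff.
example_scores = (15, 16, 27, 28)
groups = [rmet_group(score, MEAN, SD) for score in example_scores]

for score, group in zip(example_scores, groups):
    print(score, "->", group)
```

Because RMET totals are whole numbers, the cutoffs of about 15.7 and 27.3 are equivalent to the "<16" and ">27" thresholds stated above.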
Does RMET score predict group level performance on the hivemind task?
As a group, it doesn’t look like low RMET scorers performed any differently from high RMET scorers. What if we break them into conditions?
RMET_performance | condition | n |
|---|---|---|
low performers | 10 | 14 |
low performers | 50 | 17 |
low performers | 100 | 14 |
low performers | 500 | 18 |
middle performers | 10 | 76 |
middle performers | 50 | 68 |
middle performers | 100 | 66 |
middle performers | 500 | 59 |
high performers | 10 | 11 |
high performers | 50 | 16 |
high performers | 100 | 20 |
high performers | 500 | 16 |
Term | estimate | std.error | statistic | p |
|---|---|---|---|---|
(Intercept) | 0.16 | 0.16 | 1.05 | .293 |
RMETscore | 0.00 | 0.01 | 0.23 | .817 |
condition50 | -0.03 | 0.21 | -0.13 | .895 |
condition100 | -0.15 | 0.22 | -0.71 | .479 |
condition500 | -0.03 | 0.21 | -0.14 | .892 |
RMETscore × condition50 | 0.00 | 0.01 | 0.33 | .742 |
RMETscore × condition100 | 0.00 | 0.01 | 0.44 | .657 |
RMETscore × condition500 | -0.00 | 0.01 | -0.15 | .884 |
Visually, it looks like high and low RMET scorers may be performing differently at different group sizes. However, neither RMET score, nor condition, nor their interaction significantly predicts an individual’s choice for A or B.
Predictions of success
Did participants think they would succeed at achieving an 80/20 split?
On average, people thought their group would undershoot the split (i.e. that slightly too many people would choose B). This did not vary by condition.
Centipede game
Participants also imagined playing a centipede game, described above. Did their choices on the centipede game correspond to their choice on the hivemind task?
Term | estimate | std.error | statistic | p |
|---|---|---|---|---|
(Intercept) | 0.15 | 0.03 | 5.18 | < .001*** |
centipede_game_takePoint | 0.01 | 0.01 | 1.24 | .214 |
Most people (n = 200) took the pot at the first turn. However, the point at which players chose to take the pot did not significantly predict their hivemind choice.
Prolific study with varying goal ratios, RMET and modified centipede game.
Qualtrics name: Group Coordination 6 - Prolific with varying goal ratios
Method
Participants were randomly assigned to one of four conditions. Their goal was to achieve a 60/40, 70/30, 80/20, or 90/10 split. Participants were recruited on Prolific and compensated 2.21 British pounds. Collection was launched on 06/23/25 with a target of N = 400; n = 100 per condition.
Procedure was identical to Pilot 5, except for two features: randomly assigned conditions determined their goal split, and all participants were told they were in a group of 100 participants.
Results
We collected a total of 400 complete responses.
condition | n |
|---|---|
60/40 | 96 |
70/30 | 101 |
80/20 | 103 |
90/10 | 100 |
gender | n |
|---|---|
man | 180 |
man,not listed | 1 |
man,transgender | 1 |
non-binary | 3 |
transgender | 2 |
transgender,non-binary | 1 |
transgender,non-binary,gender queer | 1 |
woman | 207 |
woman,gender queer | 1 |
woman,man | 2 |
woman,not listed | 1 |
Split results by condition
Gender
Term | estimate | std.error | statistic | p |
|---|---|---|---|---|
(Intercept) | -0.86 | 0.36 | -2.39 | .017* |
genderwoman | 0.25 | 0.46 | 0.54 | .587 |
condition70/30 | -0.36 | 0.51 | -0.71 | .475 |
condition80/20 | -0.55 | 0.50 | -1.09 | .274 |
condition90/10 | -0.47 | 0.51 | -0.94 | .348 |
genderwoman × condition70/30 | -0.53 | 0.68 | -0.78 | .436 |
genderwoman × condition80/20 | 0.22 | 0.66 | 0.33 | .742 |
genderwoman × condition90/10 | -0.86 | 0.73 | -1.18 | .236 |
Gender did not significantly predict a participant’s likelihood of choosing A or B in any of the four conditions. Note that this is only analyzing those that self-reported “man” or “woman”, and does not include those that self-reported as transgender or non-binary.
Reading the Mind in the Eyes
How did the participants perform on the RMET?
Overall, scores were slightly left skewed around a mean of 24.23 (SD = 5.75). To analyze RMET score as a predictor of group level performance on the hivemind task, I split the sample at 1 SD below and above the mean. Low performers scored <19 and high performers scored >29.
The mean RMET score is 31.57 in the high performers group and 14.89 in the low performers group.
Does RMET score predict group level performance on the hivemind task?
RMET_performance | condition | n |
|---|---|---|
low performers | 60/40 | 9 |
low performers | 70/30 | 23 |
low performers | 80/20 | 24 |
low performers | 90/10 | 14 |
middle performers | 60/40 | 75 |
middle performers | 70/30 | 65 |
middle performers | 80/20 | 56 |
middle performers | 90/10 | 59 |
high performers | 60/40 | 12 |
high performers | 70/30 | 13 |
high performers | 80/20 | 23 |
high performers | 90/10 | 27 |
Predictions of success
Did participants think they would succeed at achieving their assigned split?
Super interesting that people in this study on average generally expected their group to be able to achieve their respective splits - EXCEPT for the 90/10 group. By eyeballing it, I would estimate that groups ‘80/20’ and ‘90/10’ had fairly accurate estimates of their groups’ performance.
Centipede game
Participants also imagined playing a centipede game, described above. Did their choices on the centipede game correspond to their choice on the hivemind task?
Term | estimate | std.error | statistic | p |
|---|---|---|---|---|
(Intercept) | 0.35 | 0.07 | 5.09 | < .001*** |
centipede_game_takePoint | -0.01 | 0.01 | -0.54 | .591 |
condition70/30 | -0.09 | 0.10 | -0.92 | .357 |
condition80/20 | -0.21 | 0.10 | -2.16 | .032* |
condition90/10 | -0.26 | 0.10 | -2.64 | .009** |
centipede_game_takePoint × condition70/30 | -0.00 | 0.02 | -0.26 | .796 |
centipede_game_takePoint × condition80/20 | 0.03 | 0.02 | 1.65 | .099 |
centipede_game_takePoint × condition90/10 | 0.02 | 0.02 | 1.36 | .174 |
Again, most people took the pot on the first turn, and the point at which players chose to take the pot does not predict their behavior on the hivemind task.
pilot 6: bring teams into lab, half of groups take hivemind before getting to know each other, half take after
pilot 7: in class, people are assigned to 60/40, 80/20, 70/30 or 90/10.
break people into small groups of 10, 15, etc
have people do task multiple times: “for this trial you have been randomly assigned to X group with X number of people…” etc
have people take timed test
- it seems that when people think too much they ‘choke’ and it reduces the success of the group
examine group level mind in the eyes scores predicting group level success
extract themes of free response in groups who succeed vs groups who don’t
theory wise: interested in how individuals are thinking about the group that they are a part of and how they become in sync with each other. what is the big model? if ________ then _____ when we manipulate _______
are people picking up on the most salient dimension?
next studies would be: finding analogous situations, test those as well - is it sensitivity to norms?
sorting
people are converging on what the relevant dimensions are
do i need to step up in this situation?
do i deserve the valued resource?
do i fit in?
tell people that they are in a group of women/men or not
make choices (ostensibly) meaningful options, but with same goal. i.e. 20% of you have to choose chocolate, 80% have to choose lice
real world analogies:
self-sorting
- choosing residency specialties
risk taking?
giving ground in crowds
Note that this doesn’t include pilot 1.2, which was collected recently on a whim