Group Coordination Goals

Hivemind

A casual Reddit poll was posted with the goal of coordinating an 80/20 split amongst anonymous internet strangers. The poll asked people to choose between two options (A or B) and said that the goal was for 80% of people to choose option A and 20% to choose option B. The poll was pretty successful: the group achieved a 79%/21% split.

Someone else then ran another set of polls with varying splits: 70/30, 50/50, 80/20, etc. The most successful was the 80/20 poll, though its result was closer to 75/25.

In this project, we are investigating when and how people are able to coordinate on group-level shared goals with very little (or no) information about others’ actions.

  • How are people deciding which option to choose?

  • Do people’s choices vary across situations, or does someone who chooses option A always choose the majority option, and vice versa?

  • If people’s choices vary across situations, what cues are they using about themselves and their perceived group to calibrate their choices?

  • How are people conceptualizing the group they are participating with, given that they have little to no information about the other participants?

Relevant References


Woolley et al. (2010) Collective intelligence

Centola (2022) Network science of collective intelligence

Freeman et al. (2020) Social and general intelligence improves collective action in a common pool resource system

As an initial proof of concept, we ran a replication of the Reddit Hivemind poll with students around Grounds. We tabled in front of Garrett Hall from 09/14/23 to 10/09/23. Participation was voluntary. The goal was to collect at least 200 responses.

Participants responded to a simple two-question survey:

  1. The goal of this survey is for 80% of respondents to select option A and 20% to select option B. Please make your selection below. 

  2. How did you decide to choose [chosen option]? Write down whatever came to mind as you were choosing.

Qualtrics title: Group Coordination Pilot

Results

We collected a total of 224 responses.


    1     2 
70.09 29.91 

The group did not achieve an 80/20 split; it was much closer to 70/30.

Some notable themes that I observed in the free response question:

  • Evaluate self as part of majority

    • I felt most people might pick option A so I picked option B

    • thats what most people should do

    • i consider myself part of the 80%

  • “I’m not like everyone else” a la TikTok cultural theme

    • I’m different.

    • I’m the main character

  • No introspective process. Reads instructions as telling them to choose A

    • You told me to

    • that is the purpose of your survey

    • I thought I was supposed to select it

  • Underestimating how many others chose B

    • Because I felt like everyone was going to choose A so I choose B so that my choice would help account for the 20%.

    • I figured everyone would choose option A already so I picked B because there was probably less than 20 percent that selected it so far.

  • “Reverse psychology” or strategic variance

    • Infelt like many people would try to be “different” and pick B which would sway the percentages out of the 80/20 range so i picked A to counteract one of those votes

    • I felt like people were going to be inclined to pick option B as it is stated that the goal is for 20% of people to select B so I feel like more people will end up picking B, but option A is supposed to be the majority which is why I chose A

    • I figured a lot of people would try to make b have the appropriate amount thinking it would naturally be overlooked and smaller, so I did the opposite to overcorrect

Pilot 1.2

In this second pilot, participants were recruited to respond to the hivemind task in exchange for a free bagel. Participants completed this study in combination with other studies [Natalie ran this, check with her for more details].

Responses were collected on 4/21 and 4/22. Qualtrics title: Group Coordination (Bagel Copy)

Results

We collected a total of 83 responses.

Final Split:


    A     B 
64.63 35.37 

estimate  statistic  p        parameter  Method               Alternative  95% CI
0.64      53.00      .001***  83.00      Exact binomial test  two.sided    [0.53, 0.74]

Do people expect success?


[1] 66.28049
[1] 57.49123
[1] 87.9

Most people do not expect the group to achieve coordination. Most people expect about a 60/40 split. The mean expectation is that 66% will choose option A.

Gender

Term           estimate  std.error  statistic  p
(Intercept)    0.00      1.41       0.00       1.00
gender_codedM  -0.00     2.00       -0.00      1.00

Interestingly, this is the one case where men outperformed women. The men were much closer to 80/20 than the women were. This may be because there were only 21 men vs. 59 women.

For Pilot 2, the same question as Pilot 1 was run on the Participant Pool. It launched on 10/31 with a target N = 300. Qualtrics title: Group Coordination 2 - PPool

Participants responded to the following questions (new wording in bold):

  1. The goal of this survey is for 80% of respondents to select option A and 20% to select option B. As you know, we are collecting about 300 responses from students on the participant pool. Please make your selection below. 

  2. How did you decide to choose [chosen option]? Write down whatever came to mind as you were choosing.

  3. Do you think that this task will succeed at achieving an 80/20 split? i.e. do you think that roughly 80% of the 300 respondents will choose A and 20% will choose B? [yes/no/not sure]

  4. [Participants who said no] You said “no”, you do not expect the task to succeed. What split do you think this task will yield? 

Results

We collected a total of 293 responses. 11 were duplicate responses. Of the duplicates, I removed either the unfinished submission(s) or, if there were multiple finished submissions, I removed the submission with the later timestamp.

Final N = 286
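The duplicate-handling rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the script used for the analysis: the field names (`pid`, `finished`, `timestamp`) and the example rows are hypothetical.

```python
from collections import defaultdict

def drop_duplicates(responses):
    """Keep one response per participant.

    Rule (as described in the text): when a participant has duplicates,
    drop the unfinished submissions; if several finished submissions
    remain, keep the earliest (i.e. remove those with later timestamps).
    """
    by_pid = defaultdict(list)
    for row in responses:
        by_pid[row["pid"]].append(row)

    kept = []
    for pid, rows in by_pid.items():
        if len(rows) > 1:
            finished = [r for r in rows if r["finished"]]
            # Assumption: if no attempt was finished, fall back to all attempts.
            candidates = finished or rows
            rows = [min(candidates, key=lambda r: r["timestamp"])]
        kept.extend(rows)
    return kept

# Hypothetical example rows, not real responses.
example = [
    {"pid": 1, "finished": True, "timestamp": 10},
    {"pid": 2, "finished": False, "timestamp": 11},
    {"pid": 2, "finished": True, "timestamp": 12},
    {"pid": 3, "finished": True, "timestamp": 13},
    {"pid": 3, "finished": True, "timestamp": 14},
]
cleaned = drop_duplicates(example)
```

In this toy example, participant 2 keeps the finished attempt and participant 3 keeps the earlier of two finished attempts.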

Final split:


    A     B 
73.78 26.22 

estimate  statistic  p          parameter  Method               Alternative  95% CI
0.26      75.00      < .001***  286.00     Exact binomial test  two.sided    [0.21, 0.32]

This study’s split was a little closer to 80/20 than the first pilot.

A binomial test indicates that the split is significantly different from 80/20 (p = .01).

Exploratory analyses:

Do people expect that the group task can be successful? We asked participants whether they think that the 300 respondents will be able to achieve the 80/20 split.


     yes       no not sure 
   11.89    65.73    22.38 

Most people (66%) did not think that they would be able to achieve the 80/20 split. They underestimated their group’s ability to coordinate!

We asked people who said “no” to estimate what split would be achieved instead of the 80/20 goal. Participants reported what percentage of people will choose option A (meant to be 80%).

[1] 56.82639
[1] 90.97727

It seems that the doubters fell into two categories. Some people expected more than 80% of respondents to pick A. But more people expected less than 80% of respondents to pick A. Most commonly, people expected between 50 and 65% of respondents to pick A.

Exploring the doubters

Next we tested whether those who did not expect that the group could achieve an 80/20 split were more likely to have selected A or B.

For this test, I removed the roughly 40 people who overestimated the percentage of people that would choose A.

Term                     estimate  std.error  statistic  p
(Intercept)              -1.76     0.48       -3.63      < .001***
estimateSuccessno        0.97      0.52       1.88       .061
estimateSuccessnot sure  0.48      0.57       0.85       .396

Interestingly, those who didn’t think the group would succeed were somewhat more likely to choose B, the minority option, although this effect did not reach conventional significance (p = .061).

Ingroup identification as a predictor

We collected participants’ pretest responses, on which they responded to an ingroup identification scale. The scale consisted of three items:

  1. How important is UVa to your own personal identity?

  2. How similar do you feel in attitudes and opinions to other UVa students?

  3. How strongly do you identify as a UVa student?

Items were collapsed into one variable: ingroup identification with UVa students. We tested whether ingroup identification predicted participants’ likelihood to choose A (the majority choice) or B (the minority choice).

Term         estimate  std.error  statistic  p
(Intercept)  -0.70     0.55       -1.27      .205
ingroup      -0.07     0.12       -0.58      .563

Ingroup identification did not have a significant relationship with participants’ likelihood to choose A or B.

Gender

Term           estimate  std.error  statistic  p
(Intercept)    -1.23     0.17       -7.26      < .001***
gender_codedM  0.54      0.29       1.87       .061

Males were slightly more likely to choose B than females, but the difference was not statistically significant (p = .061).

We invited participants from Pilot 2 to take the same study again. Launched on 11/13 on the PPool for N = 286. Qualtrics title: Group Coordination 2.2 - PPool Part 2

Participants were invited with the following message:

Hi [First Name],

I am reaching out to let you know that you are qualified to participate in a new study - D4.1 (0.5 credits) - based on your participation in study D4.

Only those who participated in study D4 are qualified to participate in this second iteration of our study, so we encourage you to participate! You will receive another 0.5 credits upon completion of the study. You can sign up for study D4.1 on the Psychology Department’s Sona site.

Thank you!

In this iteration, participants estimated the results from pilot 2, then were told the results of pilot 2, then answered the same question again, and reported their estimate of this iteration’s split and their confidence in their estimate.

  1. We collected a total of 286 responses in the previous study that you participated in a couple of weeks ago. Before we show you the results of that study, we want to know what you estimate the results were. What A/B split do you think was achieved in the study you participated in a couple of weeks ago?

  2. 74% of respondents chose option A and 26% chose option B. Now, we will ask you to participate in the same task again with the same group of 286 respondents.

  3. The goal of this survey - as with the previous survey - is for 80% of respondents to select option A and 20% to select option B. 
    Once again, we are collecting 286 responses from the same students on the participant pool who participated in the previous iteration of this task.
    Please make your selection below. 

  4. How did you choose? [free response]

  5. What split will this study achieve?

  6. How confident are you in that estimate?

  7. Why do you think we invited you to a second iteration of this study? [free response]

Results

Final n = 127


    A     B 
62.99 37.01 

estimate  statistic  p          parameter  Method               Alternative  95% CI
0.28      80.00      < .001***  286.00     Exact binomial test  two.sided    [0.23, 0.34]

Gender

Term           estimate  std.error  statistic  p
(Intercept)    -0.64     0.23       -2.79      .005**
gender_codedM  0.30      0.39       0.75       .451

Participant pool students completed the Reading the Mind in the Eyes Test (RMET), a measure of social sensitivity, on the psych department pretest. Then, they were invited to participate in our in-lab study as part of a 3-study session. [Ask Natalie what the other two studies were, but sometimes it may have been only one other study]. Participants were collected from 2/26/2025 - 04/25/2025. Participants received class credit in exchange for completing all of the studies in their 30-minute session.

Qualtrics title: Group Coordination 3 - PPool with Reading Mind Eye

Reading the Mind in the Eyes Test

About 490 people responded to the RMET on the pretest. In general, people perform pretty well on the test.

Mean score = 25.6

SD = 5.01

Pilot 3

We collected a total of 129 participants to take the study in lab. Of those, fewer than 60 participants had completed the RMET in full on the psychology dept pretest.

There were a number of duplicate attempts on the pretest. In most cases, one of the attempts is incomplete. In those cases, I deleted the incomplete responses. In some cases, all of the participant’s attempts are complete and yield different total RMET scores. This results in 2 sets of duplicates.

  • PID 57 took the pretest 3 times with total scores 30, 34, 33.
  • PID 107 took the pretest 2 times with total scores 22, 23

Overall study results


    A     B 
76.74 23.26 

binomial test of achieved ratio and its distance from 80/20

estimate  statistic  p     parameter  Method               Alternative  95% CI
0.77      99.00      .378  129.00     Exact binomial test  two.sided    [0.68, 0.84]

Overall, participants achieved closer to a 75/25 split than an 80/20 split, but the binomial test determined that it is not significantly different from 80/20 (p = .378).
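For readers who want to check the test against the 80% target, a two-sided exact binomial test can be sketched with the Python standard library. This is a minimal sketch, not the analysis script itself: it assumes the test compares the 99 A choices out of 129 against a null proportion of 0.80, and it uses the "sum all outcomes no more likely than the observed one" convention that R's binom.test follows. The function names are illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def exact_binom_two_sided(k, n, p):
    """Two-sided exact binomial p-value.

    Sums the probability of every outcome that is no more likely than
    the observed count (the convention used by R's binom.test).
    """
    observed = binom_pmf(k, n, p)
    tol = 1 + 1e-7  # small tolerance for floating-point ties
    total = sum(
        binom_pmf(i, n, p)
        for i in range(n + 1)
        if binom_pmf(i, n, p) <= observed * tol
    )
    return min(1.0, total)

# Counts from the table above: 99 of 129 participants chose A.
p_value = exact_binom_two_sided(99, 129, 0.80)
```

A p-value well above .05 here is what "not significantly different from 80/20" refers to.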

Gender

Term           estimate  std.error  statistic  p
(Intercept)    -1.34     0.29       -4.60      < .001***
gender_codedM  0.33      0.42       0.79       .431

There is no significant effect of gender on what choice people made. But women, again, were closer to an 80/20 split than men were: they achieved a 79%/21% split.
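The 79%/21% figure can be recovered from the coefficient table, assuming (this is an assumption, not stated in the source) that the model is a logistic regression predicting the choice of B with women as the reference level. A minimal sketch of the conversion from log-odds to probabilities:

```python
from math import exp

def inv_logit(x):
    """Convert log-odds to a probability."""
    return 1 / (1 + exp(-x))

intercept = -1.34      # log-odds of choosing B for women (reference level)
gender_coded_m = 0.33  # additional log-odds of choosing B for men

p_b_women = inv_logit(intercept)
p_b_men = inv_logit(intercept + gender_coded_m)
```

Under that assumption, the predicted share choosing B is about 21% for women and about 27% for men.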

Effects of social sensitivity (Reading the Mind in the Eyes Test)

How did our participants perform on the RMET?

Does gender predict performance on the RMET?

RMET score predicted by gender

Term           estimate  std.error  statistic  p
(Intercept)    27.19     0.87       31.22      < .001***
gender_codedM  -1.84     1.24       -1.48      .145

No, gender does not predict RMET performance.

The sample isn’t big enough to be able to chunk them into +1/-1 SD around the mean on RMET performance. The best I can do is split them down the median and compare the top half to the bottom half of performers. The median score is 27.

With a median split, and those with the median score binned into the lower performers, the “lower performers” group is n = 30 and the “higher performers” group is n = 24.
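The binning rule just described (scores equal to the median go to the lower group) can be sketched as follows. The scores in the example are hypothetical and only illustrate how ties at the median are handled.

```python
from statistics import median

def median_split(scores):
    """Split scores at the median; ties with the median go to the lower group."""
    cut = median(scores)
    lower = [s for s in scores if s <= cut]
    higher = [s for s in scores if s > cut]
    return cut, lower, higher

# Hypothetical RMET scores for illustration only.
example_scores = [20, 24, 27, 27, 27, 30, 33]
cut, lower, higher = median_split(example_scores)
```

Because ties land in the lower group, the two halves can be unequal in size, as with the 30 vs. 24 split above.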

Does performance on the RMET (i.e. social sensitivity) predict group performance on the hivemind task?

median_split       choice  n
Higher Performers  A       20
Higher Performers  B       4
Lower Performers   A       22
Lower Performers   B       8

There appears to be a difference between the groups of lower and higher RMET performers. Higher performers came close to the goal (20 of 24, or 83%, chose A), whereas lower RMET performers (i.e. lower social sensitivity) overshot the ratio, yielding closer to a 75/25 ratio (22 of 30, or 73%, chose A).

In study 4, we tested whether being in a position of power would shift the way people approach the hivemind task. It is possible that a sense of power is one variable that pushes people to be more likely to choose the minority option.

To test this, we recruited passersby on campus in front of Garrett to complete a study for a free bagel. At the beginning of the study, they were ostensibly assigned to be a bagel chooser or a bagel receiver. That is, we told a random half of participants that they had been selected to choose the bagel flavor for the next 4 participants (power condition) and told the other half that they had been randomly selected to receive the bagel that another participant had chosen for them (control/no power condition). They then responded to the same hivemind task that we employed in previous studies.

Participants were collected from 10/09/2024 - 10/18/2024.

Qualtrics title: Group coordination 4 - Power manip

We collected a total of 215 responses.

condition      X80.20split  n
bagelChooser   1            80
bagelChooser   2            30
bagelReceiver  1            76
bagelReceiver  2            29

The splits achieved were closer to 70/30 than 80/20 for both bagel choosers (80 of 110, or 73%, chose the majority option) and bagel receivers (76 of 105, or 72%). There didn’t seem to be a difference between the bagel chooser (power condition) group and the bagel receiver group. Accordingly, we didn’t find evidence that a power manipulation shifts the way people approach the hivemind task.
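One way to put a number on "no difference between conditions" is a pooled two-proportion z-test on the counts in the table above. This is an illustrative check, not the analysis that was run for this study; it assumes that the value 1 in the X80.20split column codes the majority option (A), and the function name is hypothetical.

```python
from math import erfc, sqrt

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns both proportions, z, and a two-sided p."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided normal tail probability
    return p1, p2, z, p_value

# Counts from the table above (assumes 1 = option A):
# 80 of 110 bagel choosers and 76 of 105 bagel receivers chose option 1.
p_chooser, p_receiver, z, p_value = two_proportion_z_test(80, 110, 76, 105)
```

The two proportions differ by well under one percentage point, so the test statistic is close to zero.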

Pilot 5 was a Prolific study with (ostensibly) varying group sizes, the RMET, and an added question measuring strategizing in the centipede game as a potential measure of ToM and backward induction. It launched on 06/20/25 with a target total N of 400. Compensation: 1.73 British pounds.

Qualtrics name: Group Coordination 5 - Prolific with varying group sizes

Method

Participants were randomly assigned to one of four conditions: group sizes of 10, 50, 100 or 500. (Note that participants were told they were in a group of 10 people, for example, but we aimed to collect an n of 100 per condition.)

Participants were told that they were being randomly assigned to a group and given the instructions that their group is tasked to achieve an 80/20 split. They were also told the number of people that would have to choose each option to achieve the split, i.e. people in the 10 group condition were told that 8 people should choose A and 2 people should choose B.
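The per-option head counts shown to participants follow directly from the group size and the 80/20 goal. A small sketch of that arithmetic (the function name is illustrative):

```python
def target_counts(group_size, share_a=0.80):
    """Number of people who should choose A and B to hit the goal split."""
    n_a = round(group_size * share_a)
    return n_a, group_size - n_a

# The four ostensible group sizes used in this study.
targets = {size: target_counts(size) for size in (10, 50, 100, 500)}
```

For the group of 10 this gives 8 for A and 2 for B, matching the example in the text.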

Participants then wrote their reasoning for how they made their choice, and reported their estimate of what split the group would achieve.

Next, participants completed the 36-item RMET.

Finally, participants were given instructions to imagine that they were playing in a centipede game.

They were offered a chance to take or pass the pot for each of their turns, with the game ending whenever they chose to take the pot or at the 9th turn (end of game).

Participants were thanked for their time and paid on the Prolific platform.

Useful references for the centipede game


Brocas, I., & Carrillo, J. D. (2020). Iterative dominance in young children: Experimental evidence in simple two-person games. Journal of Economic Behavior & Organization, 179, 623-637

Gerber, A., & Wichardt, P. C. (2010). Iterated reasoning and welfare-enhancing instruments in the Centipede game. Journal of Economic Behavior & Organization, 74(1-2), 123-136

Izquierdo, S. S., & Izquierdo, L. R. centipede-test-two

Results

We collected a total of 395 complete responses.

condition  n
10         101
50         101
100        100
500        93

gender           n
man              162
man,transgender  1
non-binary       1
not listed       1
transgender      5
woman            225

Overall split results


    A     B 
82.03 17.97 

binomial test of achieved ratio and its distance from 80/20

estimate  statistic  p     parameter  Method               Alternative  95% CI
0.82      324.00     .345  395.00     Exact binomial test  two.sided    [0.78, 0.86]

Split results by condition

Gender

Gender did not seem to play a role in this study. Note that this is only analyzing those that self-reported “man” or “woman”, and does not include those that self-reported as transgender or non-binary.

Reading the Mind in the Eyes

How did the participants perform on the RMET?

Overall, scores were approximately normally distributed around a mean of 21.50 (SD = 5.81). To analyze RMET score as a predictor of group-level performance on the hivemind task, I subset the sample at +1 and -1 SD. Low performers scored <16 and high performers scored >27.

The mean RMET score is 30.10 in the high performers group and 12.48 in the low performers group.
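The +1/-1 SD grouping can be sketched as a simple classification rule. This is a minimal sketch that uses the rounded mean and SD reported above; the function name is illustrative, and the group labels follow the table that appears later in this section.

```python
def rmet_group(score, mean, sd):
    """Classify a score as low, middle, or high relative to mean +/- 1 SD."""
    if score < mean - sd:
        return "low performers"
    if score > mean + sd:
        return "high performers"
    return "middle performers"

MEAN, SD = 21.50, 5.81  # rounded values reported for this sample

# Check the integer scores on either side of each cutoff.
labels = {s: rmet_group(s, MEAN, SD) for s in (15, 16, 27, 28)}
```

With whole-number RMET scores, these cutoffs (about 15.7 and 27.3) are equivalent to the "<16" and ">27" thresholds given in the text.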

Does RMET score predict group level performance on the hivemind task?

As a group, it doesn’t look like low RMET scorers performed any differently from high RMET scorers. What if we break them into conditions?

RMET_performance   condition  n
low performers     10         14
low performers     50         17
low performers     100        14
low performers     500        18
middle performers  10         76
middle performers  50         68
middle performers  100        66
middle performers  500        59
high performers    10         11
high performers    50         16
high performers    100        20
high performers    500        16

Term                      estimate  std.error  statistic  p
(Intercept)               0.16      0.16       1.05       .293
RMETscore                 0.00      0.01       0.23       .817
condition50               -0.03     0.21       -0.13      .895
condition100              -0.15     0.22       -0.71      .479
condition500              -0.03     0.21       -0.14      .892
RMETscore × condition50   0.00      0.01       0.33       .742
RMETscore × condition100  0.00      0.01       0.44       .657
RMETscore × condition500  -0.00     0.01       -0.15      .884

Visually, it looks like high and low RMET scorers may be performing differently at different group sizes. However, neither RMET score, nor condition, nor their interaction significantly predicts an individual’s choice for A or B.

Predictions of success

Did participants think they would succeed at achieving an 80/20 split?

On average, people thought their group would undershoot the split (i.e. that slightly too many people would choose B). This did not vary by condition.

Centipede game

Participants also imagined playing a centipede game, described above. Did their choices on the centipede game correspond to their choice on the hivemind task?

Term                      estimate  std.error  statistic  p
(Intercept)               0.15      0.03       5.18       < .001***
centipede_game_takePoint  0.01      0.01       1.24       .214

Most people (n = 200) took the pot at the first turn. However, the point at which players chose to take the pot did not significantly predict their hivemind choice.

Prolific study with varying goal ratios, RMET and modified centipede game.

Qualtrics name: Group Coordination 6 - Prolific with varying goal ratios

Method

Participants were randomly assigned to one of four conditions. Their goal was to achieve a 60/40, 70/30, 80/20, or 90/10 split. Participants were recruited on Prolific and compensated 2.21 British pounds. Collection was launched on 06/23/25 with a target of N = 400 (n = 100 per condition).

Procedure was identical to Pilot 5, except for two features: randomly assigned conditions determined their goal split, and all participants were told they were in a group of 100 participants.

Results

We collected a total of 400 complete responses.

condition  n
60/40      96
70/30      101
80/20      103
90/10      100

gender                               n
man                                  180
man,not listed                       1
man,transgender                      1
non-binary                           3
transgender                          2
transgender,non-binary               1
transgender,non-binary,gender queer  1
woman                                207
woman,gender queer                   1
woman,man                            2
woman,not listed                     1

Split results by condition

Gender

Term                          estimate  std.error  statistic  p
(Intercept)                   -0.86     0.36       -2.39      .017*
genderwoman                   0.25      0.46       0.54       .587
condition70/30                -0.36     0.51       -0.71      .475
condition80/20                -0.55     0.50       -1.09      .274
condition90/10                -0.47     0.51       -0.94      .348
genderwoman × condition70/30  -0.53     0.68       -0.78      .436
genderwoman × condition80/20  0.22      0.66       0.33       .742
genderwoman × condition90/10  -0.86     0.73       -1.18      .236

Gender did not significantly predict a participant’s likelihood of choosing A or B in any of the four conditions. Note that this is only analyzing those that self-reported “man” or “woman”, and does not include those that self-reported as transgender or non-binary.

Reading the Mind in the Eyes

How did the participants perform on the RMET?

Overall, the distribution was slightly left-skewed around a mean score of 24.23 (SD = 5.75). To analyze RMET score as a predictor of group-level performance on the hivemind task, I subset the sample at +1 and -1 SD. Low performers scored <19 and high performers scored >29.

The mean RMET score is 31.57 in the high performers group and 14.89 in the low performers group.

Does RMET score predict group level performance on the hivemind task?

RMET_performance   condition  n
low performers     60/40      9
low performers     70/30      23
low performers     80/20      24
low performers     90/10      14
middle performers  60/40      75
middle performers  70/30      65
middle performers  80/20      56
middle performers  90/10      59
high performers    60/40      12
high performers    70/30      13
high performers    80/20      23
high performers    90/10      27

Predictions of success

Did participants think they would succeed at achieving their assigned goal split?

Super interesting that people in this study on average generally expected their group to be able to achieve their respective splits - EXCEPT for the 90/10 group. By eyeballing it, I would estimate that groups ‘80/20’ and ‘90/10’ had fairly accurate estimates of their groups’ performance.

Centipede game

Participants also imagined playing a centipede game, described above. Did their choices on the centipede game correspond to their choice on the hivemind task?

Term                                       estimate  std.error  statistic  p
(Intercept)                                0.35      0.07       5.09       < .001***
centipede_game_takePoint                   -0.01     0.01       -0.54      .591
condition70/30                             -0.09     0.10       -0.92      .357
condition80/20                             -0.21     0.10       -2.16      .032*
condition90/10                             -0.26     0.10       -2.64      .009**
centipede_game_takePoint × condition70/30  -0.00     0.02       -0.26      .796
centipede_game_takePoint × condition80/20  0.03      0.02       1.65       .099
centipede_game_takePoint × condition90/10  0.02      0.02       1.36       .174

Again, most people take the pot on the first turn, and the point at which players choose to take the pot does not predict their behavior on the hivemind task.

pilot 6: bring teams into lab, half of groups take hivemind before getting to know each other, half take after

pilot 7: in class, people are assigned to 60/40, 80/20, 70/30 or 90/10.

break people into small groups of 10, 15, etc

  • have people do task multiple times: “for this trial you have been randomly assigned to X group with X number of people…” etc

  • have people take timed test

    • it seems that when people think too much they ‘choke’ and it reduces the success of the group
  • examine group level mind in the eyes scores predicting group level success

  • extract themes of free response in groups who succeed vs groups who don’t

theory wise: interested in how individuals are thinking about the group that they are a part of and how they become in sync with each other. what is the big model? if ________ then _____ when we manipulate _______

are people picking up on the most salient dimension?

next studies would be: finding analogous situations, test those as well - is it sensitivity to norms?

  • sorting

    • people are converging on what the relevant dimensions are

    • do i need to step up in this situation?

    • do i deserve the valued resource?

    • do i fit in?

  • tell people that they are in a group of women/men or not

  • make choices (ostensibly) meaningful options, but with same goal. i.e. 20% of you have to choose chocolate, 80% have to choose lice

real world analogies:

  • self-sorting

    • choosing residency specialties
  • risk taking?

  • giving ground in crowds

Note that this doesn’t include pilot 1.2, which was collected recently on a whim