I have annotated the transcripts of the full games, marking which parts are referential (descriptive of a flower) and which flower (as best I can tell) each one refers to.
See https://docs.google.com/presentation/d/1qcDRZzHbLhE-fp6W9nKcbGouj4w01qCLCy-yt072iiY/edit?usp=sharing for the flower number - image correspondences.
Key points:
Many of the key reduction findings from tangrams generalize to this situation. Specifically, we see utterance reduction over time, within-group convergence for each image, and divergence between images. This situation differs in that the stimuli are different (more natural), the setup is collaborative, and what gets talked about is more free-form. These patterns hold for both the individual and group payoff structures.
One difference we see is that groups don’t diverge from each other. This may depend on stimulus properties (are there universal features of some of the images?) and on group dynamics.
Conclusion: The key reference game findings have some generalizability. Settings like this one may be useful for encouraging discussion of a set of images and setting up partial knowledge situations.
## total utterances
## # A tibble: 1 × 1
## n
## <int>
## 1 3404
## Utterances / game
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 3.00 63.00 77.00 97.26 114.50 264.00
## Utterances / flower
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 32.00 52.75 67.50 70.92 84.50 134.00
Take away – across the condition manipulation, the number of utterances doesn’t decrease, but the length of each one does.
Graph could use finessing if going in paper.
We can probably give this as summary stats and not as graph
Per game
Per person
## stan_lmer
## family: gaussian [identity]
## formula: numword ~ trialNum + (1 | gameId) + (1 | player) + (1 | flower)
## observations: 3404
## ------
## Median MAD_SD
## (Intercept) 5.7 0.2
## trialNum -0.2 0.0
##
## Auxiliary parameter(s):
## Median MAD_SD
## sigma 2.1 0.0
##
## Error terms:
## Groups Name Std.Dev.
## player (Intercept) 0.65
## flower (Intercept) 0.63
## gameId (Intercept) 0.96
## Residual 2.09
## Num. levels: player 104, flower 48, gameId 35
##
## ------
## * For help interpreting the printed output see ?print.stanreg
## * For info on the priors used see ?prior_summary.stanreg
## # A tibble: 2 × 3
## Term Estimate `Credible Interval`
## <chr> <dbl> <chr>
## 1 (Intercept) 5.67 [5.26, 6.09]
## 2 trialNum -0.15 [-0.16, -0.14]
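As a quick reading aid for the fixed effects, here is a minimal sketch that plugs the point estimates from the table above into the linear predictor. It ignores the random effects for game, player, and flower, and the trial values it prints (0, 6, 12) are illustrative picks, not taken from the data; I am also assuming `trialNum` enters the model on its raw scale.

```python
# Point estimates copied from the summary table above.
INTERCEPT = 5.67          # expected words per utterance at trialNum = 0
SLOPE_PER_TRIAL = -0.15   # change in words per additional trial


def predicted_words(trial_num: float) -> float:
    """Population-level expected utterance length (random effects ignored)."""
    return INTERCEPT + SLOPE_PER_TRIAL * trial_num


# Illustrative trial numbers (assumed, not from the data).
for trial in (0, 6, 12):
    print(trial, round(predicted_words(trial), 2))
```

On these estimates an utterance loses roughly one word every six to seven trials.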
Some pre-commentary on analytic choices:
- It’s unclear whether we should look at referential statements individually (i.e. treating anything sent as its own message, after a line break, as a separate utterance) or concatenate everything Laju said about flower yellow3 into one statement. I think the first is generally better, since sometimes a flower is described twice (perhaps in the same way), as in “the best values are 3 in a row and big fluffy” … “okay, I’ll take big fluffy”. But there are also times when someone has a line break before continuing or clarifying a description. For now we go with single utterances.
These have quadratic smooths fit to them to allow for curvature, but there is no principled reason to choose this curve over another.
We compare utterances from rounds that were up to 2 apart, with each pair coded as the later round (so 10-12, 11-12, and 12-12 will all contribute to “12”). A window of 2 is somewhat arbitrary; I’m not sure what the right rolling window is.
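To make the two options concrete, here is a small sketch of the concatenation alternative (the one we are not using for now). The record layout `(player, flower, text)` and the function name are hypothetical, and attributing both quoted lines to Laju and yellow3 is an assumption made only for illustration.

```python
from collections import defaultdict


def concatenate_by_speaker_flower(utterances):
    """Merge all utterances by the same player about the same flower.

    utterances: iterable of (player, flower, text) tuples in chat order
    (hypothetical layout). Returns {(player, flower): joined_text}.
    """
    merged = defaultdict(list)
    for player, flower, text in utterances:
        merged[(player, flower)].append(text)
    return {key: " ".join(parts) for key, parts in merged.items()}


# Option 1 (current choice): each record stays its own utterance.
records = [
    ("Laju", "yellow3", "the best values are 3 in a row and big fluffy"),
    ("Laju", "yellow3", "okay, I'll take big fluffy"),
]

# Option 2: one statement per (player, flower).
print(concatenate_by_speaker_flower(records))
```

Concatenating doubles up "big fluffy" in the merged statement, which is one reason single utterances seem like the safer unit for length and similarity measures.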
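The windowing rule can be sketched as follows. This is only an illustration of the pairing logic, not the code used for the plots: the function name, the `(round, text)` layout, and all example texts other than "big fluffy" are made up, and the input is assumed to be already filtered to one flower.

```python
from itertools import combinations


def windowed_pairs(utterances, max_gap=2):
    """Yield (coded_round, earlier_text, later_text) for pairs within max_gap rounds.

    utterances: list of (round, text) tuples for a single flower (hypothetical layout).
    Each pair is coded as the later of its two rounds.
    """
    ordered = sorted(utterances, key=lambda u: u[0])
    for (r1, t1), (r2, t2) in combinations(ordered, 2):
        if r2 - r1 <= max_gap:
            yield r2, t1, t2


# Made-up example: one utterance each in rounds 9-11, two in round 12.
example = [
    (9, "large fluffy yellow flower"),
    (10, "big fluffy"),
    (11, "fluffy one"),
    (12, "fluffy"),
    (12, "the fluffy"),
]
pairs = list(windowed_pairs(example))
coded_12 = [p for p in pairs if p[0] == 12]
print(len(pairs), len(coded_12))
```

Here the 10-12, 11-12, and 12-12 pairs all land in round 12, while the 9-12 pairs are dropped because they are 3 rounds apart. Changing `max_gap` is the knob to turn when trying other window sizes.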
Take aways:
- descriptions to the same flower within game (either same person or not) become more similar
- descriptions to the same flower across games become (slightly) less similar
- descriptions to different flowers become less similar (regardless of between/within game)
We can treat what someone wrote at the end as the convention and then compare how similar this is to earlier utterances by them and their group mates.
The big question is whether the end utterances need cleaning (probably yes?).
We see that for the same flower, similarity to the end utterance increases over blocks, both within an individual and between an individual and their other group mates (this is all within a game).
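A rough sketch of the "compare to the end utterance" idea is below. The similarity measure here is Jaccard word overlap, which is a stand-in I picked for illustration and not necessarily the measure behind the plots; the function names and the block-to-utterances layout are hypothetical, and the example texts are made up apart from "big fluffy".

```python
def jaccard(a: str, b: str) -> float:
    """Word-set overlap between two utterances (stand-in similarity measure)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def similarity_to_end(utterances_by_block, end_utterance):
    """Mean similarity of each block's utterances to the end-of-game description.

    utterances_by_block: {block: [text, ...]} for one flower in one game
    (hypothetical layout). Empty blocks are skipped.
    """
    return {
        block: sum(jaccard(u, end_utterance) for u in utts) / len(utts)
        for block, utts in sorted(utterances_by_block.items())
        if utts
    }


# Made-up example of a description shrinking toward the final label.
by_block = {
    1: ["the big fluffy yellow one"],
    2: ["big fluffy one"],
    3: ["big fluffy"],
}
result = similarity_to_end(by_block, "big fluffy")
print(result)
```

The same function covers both comparisons: pass in one person's utterances for the within-individual case, or a group mate's utterances against that person's end description for the cross-person case.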
Comparing utterances earlier to post-game descriptions.
General question of how to display these things – what should be color v faceting and what to do for the dots.
We can look at how converged people are (or are not) by comparing what they say at the end.
Each person was asked for names (roughly, “how would you label this to your group”) for the 12 images they had been seeing and then for 4 from a different color palette.
Own = color you saw all game
Other = the color you didn’t see
Mixed = own for one person, other for other person
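The three categories can be written as a small rule over a pair of people labelling the same flower. This is a sketch of my reading of the definitions above; the function name, argument layout, and palette names are hypothetical.

```python
def pair_type(flower_palette: str, palette_seen_a: str, palette_seen_b: str) -> str:
    """Classify a pair of end-of-game labels for one flower.

    flower_palette: palette of the flower being labelled.
    palette_seen_a / palette_seen_b: palette each person saw all game.
    """
    a_own = flower_palette == palette_seen_a
    b_own = flower_palette == palette_seen_b
    if a_own and b_own:
        return "own"      # both saw this palette all game
    if not a_own and not b_own:
        return "other"    # neither saw this palette during the game
    return "mixed"        # own for one person, other for the other


# Hypothetical palette names.
print(pair_type("yellow", "yellow", "yellow"))
print(pair_type("yellow", "purple", "purple"))
print(pair_type("yellow", "yellow", "purple"))
```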
Not sure how to show spread here. Or how to do a viz that allows for comparisons.
Take aways: