TODO: Figure out what to do with the 3.8-year-old, oops

TODO: figure out what’s up with game 80, where the video cuts out?

TODO: may want to verify transcripts!! (found possible errors when fixing the off-by-one labelling of description <-> row)

Exclusions

Accuracy

Description length

Speed

Similarities

Run Analyses

Analysis results

[x] Logistic model of accuracy: correct ~ trialNum + (trialNum|game) + (1|target) (prior normal(0,1) for beta and sd, lkj(1) for correlation)
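
A minimal sketch of how this model might be specified with brms (variable names follow the model output below; the intercept prior is left at the brms default since the plan does not state one, and `adapt_delta = 0.95` is inferred from the sampler warning):

```r
library(brms)
library(dplyr)

acc_priors <- c(
  set_prior("normal(0, 1)", class = "b"),   # fixed-effect slope(s)
  set_prior("normal(0, 1)", class = "sd"),  # random-effect SDs
  set_prior("lkj(1)", class = "cor")        # intercept-slope correlation
)

acc_model <- brm(
  correct ~ trial + (trial | gameId) + (1 | target),
  data = filter(all_data, !is.na(repNum)),
  family = bernoulli(),
  prior = acc_priors,
  chains = 4, iter = 2000, warmup = 1000,
  control = list(adapt_delta = 0.95)
)
summary(acc_model)
```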

## Warning: There were 2 divergent transitions after warmup. Increasing
## adapt_delta above 0.95 may help. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
##  Family: bernoulli 
##   Links: mu = logit 
## Formula: correct ~ trial + (trial | gameId) + (1 | target) 
##    Data: filter(all_data, !is.na(repNum)) (Number of observations: 465) 
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
## 
## Multilevel Hyperparameters:
## ~gameId (Number of levels: 31) 
##                      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## sd(Intercept)            0.55      0.37     0.03     1.42 1.00     1263
## sd(trial)                0.05      0.03     0.00     0.13 1.01      607
## cor(Intercept,trial)    -0.32      0.55    -0.98     0.90 1.00      995
##                      Tail_ESS
## sd(Intercept)            1962
## sd(trial)                1205
## cor(Intercept,trial)     1759
## 
## ~target (Number of levels: 4) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     0.26      0.26     0.01     0.99 1.00     1586     1895
## 
## Regression Coefficients:
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept     1.75      0.44     0.91     2.66 1.00     3067     2388
## trial         0.01      0.03    -0.06     0.07 1.00     3180     2371
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

no credible effect of trial on accuracy; accuracy is generally high (intercept of 1.75 on the log-odds scale, roughly 85% correct)

[x] Linear model of description length: words ~ trialNum + (trialNum|game) + (1|target) (prior intercept of normal(5,10), beta and sd priors of normal(0,5), and correlation of lkj(1))
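
The description-length model follows the same template, swapping in a gaussian outcome and the pre-registered priors; a sketch, using the same assumed variable names as above:

```r
length_priors <- c(
  set_prior("normal(5, 10)", class = "Intercept"),
  set_prior("normal(0, 5)",  class = "b"),
  set_prior("normal(0, 5)",  class = "sd"),
  set_prior("lkj(1)",        class = "cor")
)

length_model <- brm(
  words ~ trial + (trial | gameId) + (1 | target),
  data = filter(all_data, !is.na(repNum)),
  family = gaussian(),
  prior = length_priors
)
```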

## Warning: There were 3 divergent transitions after warmup. Increasing
## adapt_delta above 0.95 may help. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: words ~ trial + (trial | gameId) + (1 | target) 
##    Data: filter(all_data, !is.na(repNum)) (Number of observations: 465) 
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
## 
## Multilevel Hyperparameters:
## ~gameId (Number of levels: 31) 
##                      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## sd(Intercept)            1.75      0.41     0.95     2.58 1.00     1479
## sd(trial)                0.04      0.03     0.00     0.11 1.01      602
## cor(Intercept,trial)     0.19      0.52    -0.87     0.97 1.00     3059
##                      Tail_ESS
## sd(Intercept)            2190
## sd(trial)                1196
## cor(Intercept,trial)     2188
## 
## ~target (Number of levels: 4) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     0.83      0.72     0.13     2.83 1.00     1152     1759
## 
## Regression Coefficients:
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept     3.36      0.72     1.97     4.80 1.00     1891     1901
## trial         0.03      0.03    -0.03     0.09 1.00     6333     3258
## 
## Further Distributional Parameters:
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     2.93      0.10     2.74     3.14 1.00     6112     2877
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

no credible effect of trial on description length; descriptions are generally short throughout (around 3-4 words)

[x] Speed: time_seconds ~ trialNum + (trialNum|game) + (1|target) (prior intercept of normal(60,100), beta and sd priors of normal(0,20), and correlation of lkj(1))
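
Same template again for speed. The mutate() call that builds time_seconds is truncated in the output below, so this sketch simply assumes a time_seconds column (response time in seconds) already exists:

```r
speed_priors <- c(
  set_prior("normal(60, 100)", class = "Intercept"),
  set_prior("normal(0, 20)",   class = "b"),
  set_prior("normal(0, 20)",   class = "sd"),
  set_prior("lkj(1)",          class = "cor")
)

speed_model <- brm(
  time_seconds ~ trial + (trial | gameId) + (1 | target),
  data = filter(all_data, !is.na(repNum)),  # plus the elided time_seconds mutate()
  family = gaussian(),
  prior = speed_priors
)
```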

## Warning: There were 3 divergent transitions after warmup. Increasing
## adapt_delta above 0.95 may help. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: time_seconds ~ trial + (trial | gameId) + (1 | target) 
##    Data: mutate(filter(all_data, !is.na(repNum)), time_seco (Number of observations: 465) 
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
## 
## Multilevel Hyperparameters:
## ~gameId (Number of levels: 31) 
##                      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## sd(Intercept)           11.21      2.24     7.33    16.13 1.00     1477
## sd(trial)                0.54      0.16     0.25     0.88 1.00     1600
## cor(Intercept,trial)    -0.92      0.08    -1.00    -0.72 1.00     1424
##                      Tail_ESS
## sd(Intercept)            1661
## sd(trial)                1928
## cor(Intercept,trial)     2033
## 
## ~target (Number of levels: 4) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     4.16      3.26     1.06    13.20 1.00     1103     1325
## 
## Regression Coefficients:
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept    23.77      3.55    16.92    31.27 1.00     1473     1139
## trial        -0.71      0.15    -1.01    -0.41 1.01     2664     2550
## 
## Further Distributional Parameters:
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma    10.60      0.36     9.92    11.32 1.00     4367     2987
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

children get faster over time (by about 0.7 seconds per trial)

We will interpret the fixed effect of trial number as reflecting changes in referring behavior over time.

S-BERT models

For utterance similarities, we will embed the describer’s descriptions using S-BERT and use cosine similarity as a proxy for utterance similarity.
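
A minimal sketch of the cosine-similarity step in R; the `embeddings` matrix is a placeholder (one row per description), and in the real analysis it would come from an S-BERT model, which is not shown here:

```r
# placeholder: 5 descriptions x 384-dimensional embeddings (S-BERT output would go here)
embeddings <- matrix(rnorm(5 * 384), nrow = 5)

# cosine similarity between two embedding vectors
cosine_sim <- function(a, b) sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))

# full pairwise similarity matrix (rows/columns index descriptions)
norms <- sqrt(rowSums(embeddings^2))
sim_matrix <- (embeddings %*% t(embeddings)) / (norms %o% norms)
```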

We are unsure whether 4 blocks will be sufficient to see potential change over time, so we will:

[x]* compare similarities of utterances between different games vs. within one game vs. within one speaker: sim ~ same_game + same_speaker + (1|target)
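
As a sketch, the pairwise predictors could be coded as below, assuming a long table `sims` with one row per description pair; `sim`, `target1`, `target2`, and `game1` appear in the model output, while `game2`, `speaker1`, and `speaker2` are assumed names for the remaining pair identifiers:

```r
library(dplyr)

sims_same_target <- sims |>
  filter(target1 == target2) |>  # only compare descriptions of the same target
  mutate(
    same_game    = as.numeric(game1 == game2),
    # speaker IDs assumed unique across games, so same speaker implies same game
    same_speaker = as.numeric(speaker1 == speaker2)
  )
```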

##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: sim ~ same_game + same_speaker + (1 | target) 
##    Data: mutate(filter(sims, target1 == target2), target =  (Number of observations: 26001) 
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
## 
## Multilevel Hyperparameters:
## ~target (Number of levels: 4) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     0.08      0.02     0.05     0.13 1.00     1313     1753
## 
## Regression Coefficients:
##              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept        0.34      0.04     0.26     0.42 1.01      990     1513
## same_game        0.25      0.01     0.22     0.27 1.00     2074     2356
## same_speaker     0.08      0.02     0.04     0.12 1.00     2036     2048
## 
## Further Distributional Parameters:
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     0.24      0.00     0.24     0.24 1.00     3793     2797
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

utterances from within the same game are more similar (.34 across games vs. .59 within a game), with a smaller additional effect of same speaker (+.08, so .67)

[x]* compare similarities of utterances to the last utterance within a game: sim_to_last ~ earlier_block + same_speaker + (1|game) + (1|target)

##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: sim ~ earlier + same_speaker + (1 | game) + (1 | target) 
##    Data: mutate(filter(filter(sims, later == 4), game1 == g (Number of observations: 328) 
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
## 
## Multilevel Hyperparameters:
## ~game (Number of levels: 29) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     0.14      0.02     0.10     0.18 1.00     2187     2153
## 
## ~target (Number of levels: 4) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     0.05      0.02     0.01     0.10 1.00     1203      690
## 
## Regression Coefficients:
##              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept        0.48      0.06     0.37     0.60 1.00     2154     2796
## earlier          0.06      0.02     0.02     0.09 1.00     5062     2820
## same_speaker     0.06      0.03    -0.00     0.12 1.00     4948     2976
## 
## Further Distributional Parameters:
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     0.29      0.01     0.27     0.31 1.00     4216     3394
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

utterances do get (slightly) more similar to the last utterance over time, with a probable bump when the earlier utterance comes from the same speaker

[x]* compare similarities of utterances to the next utterance: sim_to_next ~ earlier_block + same_speaker + (1|game) + (1|target)

##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: sim ~ earlier + same_speaker + (1 | game) + (1 | target) 
##    Data: mutate(filter(filter(sims, later == earlier + 1),  (Number of observations: 334) 
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
## 
## Multilevel Hyperparameters:
## ~game (Number of levels: 30) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     0.16      0.02     0.12     0.20 1.00     1841     2405
## 
## ~target (Number of levels: 4) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     0.05      0.02     0.01     0.10 1.00     1729     1373
## 
## Regression Coefficients:
##              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept        0.55      0.06     0.43     0.67 1.00     2026     2689
## earlier          0.03      0.02    -0.01     0.07 1.00     4815     2831
## same_speaker     0.07      0.03     0.01     0.13 1.00     4344     2813
## 
## Further Distributional Parameters:
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     0.29      0.01     0.27     0.31 1.00     3622     2929
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

utterances are again more similar when they come from the same speaker, but similarity to the next utterance is not clearly increasing over time

[x]* compare the similarity level of descriptions from the same block in different games: sim ~ block + (1|target)

##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: sim ~ block + (1 | target) 
##    Data: mutate(filter(filter(sims, later == earlier), game (Number of observations: 6335) 
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
## 
## Multilevel Hyperparameters:
## ~target (Number of levels: 4) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     0.08      0.02     0.05     0.13 1.00     1224     1776
## 
## Regression Coefficients:
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept     0.36      0.04     0.28     0.45 1.01      871     1083
## block        -0.01      0.00    -0.01    -0.00 1.00     3735     2286
## 
## Further Distributional Parameters:
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     0.24      0.00     0.24     0.25 1.00     2339     2122
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

similarity of descriptions across different games is roughly stable over blocks (at most a very slight decline of ~.01 per block), so we are not seeing games diverge from one another

Priors for the above similarity models: normal(.5,.2) for intercept, normal(0,.1) for beta, and normal(0,.05) for sd.
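
As a sketch, these priors could be passed to brms as follows (shown for the first similarity model; `sims_same_target` is the hypothetical pairwise table sketched earlier):

```r
sim_priors <- c(
  set_prior("normal(0.5, 0.2)", class = "Intercept"),
  set_prior("normal(0, 0.1)",   class = "b"),
  set_prior("normal(0, 0.05)",  class = "sd")
)

sim_model <- brm(
  sim ~ same_game + same_speaker + (1 | target),
  data = sims_same_target,
  family = gaussian(),
  prior = sim_priors
)
```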

We will also qualitatively explore children’s turn-taking structure and how they understand each other.

TODO: the qualitative exploration of turn-taking and mutual understanding