Model Accuracy

By round & condition

Accuracy increases over repNum, which is … interesting?!
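A minimal sketch of how this could be computed, assuming a trial-level data frame `model_results` with hypothetical columns `repNum` (round), `condition`, and `correct` (0/1):

```r
library(tidyverse)

# Assumed columns: repNum = round, condition, correct = 0/1.
model_results |>
  group_by(repNum, condition) |>
  summarize(accuracy = mean(correct), .groups = "drop") |>
  ggplot(aes(x = repNum, y = accuracy, color = condition)) +
  geom_point() +
  geom_line() +
  labs(x = "Repetition number", y = "Model accuracy")
```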

By tangram

Above chance on every tangram! No longer hates the ice skater.
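One way to check the above-chance claim per tangram, as a sketch; the 1/12 chance level is an assumption about the context size, not from the original analysis:

```r
library(tidyverse)

# Assumed chance level: 1/12 (adjust to the actual number of
# tangrams in context).
chance <- 1 / 12

model_results |>
  group_by(tangram) |>
  summarize(
    n = n(),
    k = sum(correct),
    accuracy = k / n,
    p_above = binom.test(k, n, p = chance,
                         alternative = "greater")$p.value
  )
```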

Accuracy as a function of number of words
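A sketch of this analysis, assuming a hypothetical `n_words` column giving the description length in words:

```r
library(tidyverse)

# Logistic regression of accuracy on description length
# (n_words is an assumed column name).
glm(correct ~ n_words, family = binomial, data = model_results) |>
  summary()

# Smoothed accuracy curve over description length.
ggplot(model_results, aes(x = n_words, y = correct)) +
  geom_smooth(method = "glm", method.args = list(family = "binomial")) +
  labs(x = "Number of words", y = "P(model correct)")
```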

Confusion Matrix
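A sketch of the confusion matrix, assuming hypothetical columns `tangram` (true target) and `predicted` (the model's choice):

```r
library(tidyverse)

# Assumed columns: tangram = true target, predicted = model's choice.
model_results |>
  count(tangram, predicted) |>
  ggplot(aes(x = predicted, y = tangram, fill = n)) +
  geom_tile() +
  labs(x = "Model prediction", y = "True tangram", fill = "Count")
```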

Comparison with all tg-matcher

Compare the random forest with human results
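The 0.525 below is plausibly a correlation between aggregated accuracies; a sketch under that assumption (the per-tangram grouping is a guess):

```r
library(tidyverse)

# Sketch: per-tangram accuracy for humans and the model, then correlate.
# The grouping variable is an assumption.
item_acc <- for_corr |>
  group_by(tangram) |>
  summarize(human = mean(human_correct),
            model = mean(model_correct),
            .groups = "drop")

paste("Correlation between random forest model and human",
      round(cor(item_acc$human, item_acc$model), 3))
```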

## [1] "Correlation between random forest model and human 0.525"

Individual correlation

Item-level correlation between correct and incorrect responses (model vs. per-human response)

Not sure how to look for correlated error patterns…
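The output below is a plain Pearson test on the paired 0/1 columns; the second chunk is one hedged idea for probing error patterns, not from the original analysis:

```r
# Trial-level Pearson correlation on the paired 0/1 responses
# (these column names appear in the output below).
cor.test(for_corr$human_correct, for_corr$model_correct)

# Hedged idea for correlated errors: is the model more likely to be
# wrong on exactly the trials the human got wrong than overall?
mean(for_corr$model_correct[for_corr$human_correct == 0])
mean(for_corr$model_correct)
```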

## 
##  Pearson's product-moment correlation
## 
## data:  for_corr$human_correct and for_corr$model_correct
## t = 14.998, df = 7288, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.1506713 0.1952096
## sample estimates:
##       cor 
## 0.1730289

Mid-level comparisons

We want to know where the model is more accurate than the tg-matcher humans, and where it's the reverse.

So humans are better at the first round, and the model is sometimes better at the last round, which is interesting.
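A sketch of the by-round comparison, assuming a hypothetical long-format frame `combined` with one row per response and a `source` column distinguishing model and tg-matcher responses:

```r
library(tidyverse)

# Assumed long format: source in c("model", "tg-matcher"),
# repNum marking the round, correct = 0/1.
combined |>
  group_by(source, repNum) |>
  summarize(accuracy = mean(correct), .groups = "drop") |>
  pivot_wider(names_from = source, values_from = accuracy)
```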

Compare with MPT

Lots of ways to carve this up since we have tangram, round, and conditions…

The model is consistently worse than the in-game humans, but pretty correlated with them?
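A sketch of the cell-level correlation, where all frame and column names (`model_results`, `in_game`, etc.) are assumptions:

```r
library(tidyverse)

# Cell-level (tangram x round x condition) accuracies for the model
# and the in-game humans, then correlate across cells.
cells <- full_join(
  model_results |>
    group_by(tangram, repNum, condition) |>
    summarize(model = mean(correct), .groups = "drop"),
  in_game |>
    group_by(tangram, repNum, condition) |>
    summarize(human = mean(correct), .groups = "drop"),
  by = c("tangram", "repNum", "condition")
)

cor(cells$model, cells$human, use = "complete.obs")
```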


Comparison with kilogram naming divergence

Not sure this is at all meaningful, but tg-matcher accuracies and model accuracies show about the same correlation with this codeability measure?
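A sketch of the correlation with the divergence measure, assuming a hypothetical per-tangram table `kilogram_divergence` with columns `tangram` and `divergence`:

```r
library(tidyverse)

# Per-tangram model accuracy, joined to the (assumed) kilogram
# naming-divergence table, then correlated.
acc_by_tangram <- model_results |>
  group_by(tangram) |>
  summarize(model_acc = mean(correct), .groups = "drop")

joined <- left_join(acc_by_tangram, kilogram_divergence, by = "tangram")
cor.test(joined$model_acc, joined$divergence)
```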
