Introduction

  1. A key debate in developmental psychology concerns the role of labels in how children learn and use categories.

  2. Give a very brief rundown of the broader evidence, e.g., Waxman.

  3. But perhaps the most well-articulated debate is Gelman vs. Sloutsky. Two-line summary.

  4. The assumption behind these debates is that children use labels because they are robust category markers, e.g., for statistical or conceptual reasons.

  5. But that’s not quite right: labels are not robust category markers, in that almost all labels are used to refer to multiple different categories. We call this phenomenon lexical flexibility.

  6. Here we investigate what the presence of lexical flexibility (lex flex) means for the role of labels in how children reason about concepts and categories.

Gelman Sloutsky debate

  1. Broader overview and history of the Gelman Sloutsky debate. Perhaps end with the final experiment in Sloutsky Fisher JECP, where they show that even phonological similarity between words can affect children’s induction, and that the size of this effect is not mediated by the similarity of the two concepts under comparison.

  2. What do these theories say about lex flex? Gelman: labels are category markers. To the degree that a label’s meaning can be discerned, there shouldn’t be a problem, as reasoning is done based on kinds, not labels. Sloutsky: labels contribute to similarity. Overlapping labels should contribute to similarity judgments and thereby cause children to “hallucinate” that distinct kinds are the same.

Experiment 1

“This” vs. “chicken” experiment. We compare a shared-label condition (“this chicken”, “this chicken”) with a no-label condition (“this one”, “this one”).

Adults
## [1] "Unambig 12"
## [1] "Ambig 12"

Label * Meaning mixed effects model, followed by t-tests against chance for each condition.
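
As a rough illustration of the kind of model described here, the sketch below fits a Label * Meaning mixed effects logistic regression with lme4 on simulated data. Everything in it is an assumption made for illustration: the data, the effect sizes, the random-effects structure (intercepts for subjects and items), the level ordering, and the between-subjects coding of Meaning. It is not the analysis script behind the output that follows.

```r
# Sketch only: simulated data, hypothetical variable names and effect sizes.
library(lme4)

set.seed(1)
n_subj <- 24
n_item <- 6

# Assumed design: Meaning between subjects, Label within subjects.
d <- expand.grid(Subject = seq_len(n_subj),
                 Item    = seq_len(n_item),
                 Label   = c("Shared Label", "No Label"),
                 stringsAsFactors = FALSE)
d$Meaning <- ifelse(d$Subject <= n_subj / 2,
                    "Unambiguous (Same Kind)",
                    "Flexible (Thematic Relation)")

# Simulate binary choices with some subject and item variability.
subj_eff <- rnorm(n_subj, sd = 0.5)
item_eff <- rnorm(n_item, sd = 0.3)
eta <- 0.5 -
  1.0 * (d$Label == "No Label") -
  1.0 * (d$Meaning == "Flexible (Thematic Relation)") +
  subj_eff[d$Subject] + item_eff[d$Item]
d$Choice <- rbinom(nrow(d), size = 1, prob = plogis(eta))

# Ordered factors get polynomial contrasts by default, which yields
# coefficient names such as Label.L and Meaning.L.
d$Label   <- factor(d$Label,
                    levels = c("Shared Label", "No Label"),
                    ordered = TRUE)
d$Meaning <- factor(d$Meaning,
                    levels = c("Unambiguous (Same Kind)",
                               "Flexible (Thematic Relation)"),
                    ordered = TRUE)
d$Subject <- factor(d$Subject)
d$Item    <- factor(d$Item)

m <- glmer(Choice ~ Label * Meaning + (1 | Subject) + (1 | Item),
           data = d, family = binomial)
coefs <- round(coef(summary(m)), 2)
print(coefs)
```

With two-level ordered factors, the `.L` term is the linear contrast between the two levels, so its sign depends on the level order chosen above.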

## [1] "adult female =  14"
##   age.FUN1 age.FUN2 age.FUN3
## 1    22.25       18       31

                  Estimate Std. Error z value Pr(>|z|)
(Intercept)          -1.21       0.29   -4.22     0.00
Label.L              -1.52       0.40   -3.83     0.00
Meaning.L            -1.31       0.33   -4.02     0.00
Label.L:Meaning.L     0.39       0.52    0.75     0.45
## 
##  One Sample t-test
## 
## data:  subset(Adult.Sum, Meaning == "Unambiguous (Same Kind)" & Label ==     "Shared Label")$Choice
## t = 2.0838, df = 11, p-value = 0.06129
## alternative hypothesis: true mean is not equal to 0.5
## 95 percent confidence interval:
##  0.4900047 0.8655508
## sample estimates:
## mean of x 
## 0.6777778
## 
##  One Sample t-test
## 
## data:  subset(Adult.Sum, Meaning == "Unambiguous (Same Kind)" & Label ==     "No Label")$Choice
## t = -5.6977, df = 11, p-value = 0.0001387
## alternative hypothesis: true mean is not equal to 0.5
## 95 percent confidence interval:
##  0.07640961 0.31247928
## sample estimates:
## mean of x 
## 0.1944444
## 
##  One Sample t-test
## 
## data:  subset(Adult.Sum, Meaning == "Flexible (Thematic Relation)" &     Label == "Shared Label")$Choice
## t = -2.6802, df = 11, p-value = 0.0214
## alternative hypothesis: true mean is not equal to 0.5
## 95 percent confidence interval:
##  0.09529104 0.46026452
## sample estimates:
## mean of x 
## 0.2777778
## 
##  One Sample t-test
## 
## data:  subset(Adult.Sum, Meaning == "Flexible (Thematic Relation)" &     Label == "No Label")$Choice
## t = -14.182, df = 11, p-value = 2.053e-08
## alternative hypothesis: true mean is not equal to 0.5
## 95 percent confidence interval:
##  -0.01341795  0.12452906
## sample estimates:
##  mean of x 
## 0.05555556

Children
## [1] "Unambig 3 yrs 24"
## [1] "Unambig 4 yrs 25"
## [1] "Ambig 3 yrs 24"
## [1] "Ambig 4 yrs 24"
## [1] "3 yrs girls =  22"
## [1] "4 yrs girls =  23"
##   age age.months.FUN1 age.months.FUN2 age.months.FUN3
## 1   3        43.76729        36.32877        47.90137
## 2   4        54.47649        48.26301        59.83562

Label * Meaning mixed effects model, followed by t-tests against chance for each condition.
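
The t-tests against chance can be sketched in base R as follows: average each participant's choices within a condition, then test those per-participant proportions against 0.5. The data, the number of trials, and the coding of `Choice` are simulated and hypothetical; only the `t.test(..., mu = 0.5)` pattern mirrors the output reported in this document.

```r
# Sketch only: simulated trial-level data with hypothetical names.
set.seed(2)
n_subj   <- 12
n_trials <- 6

trials <- expand.grid(Subject = seq_len(n_subj),
                      Trial   = seq_len(n_trials),
                      Label   = c("Shared Label", "No Label"),
                      stringsAsFactors = FALSE)
p <- ifelse(trials$Label == "Shared Label", 0.65, 0.25)
trials$Choice <- rbinom(nrow(trials), size = 1, prob = p)

# One proportion per subject per condition.
subj_means <- aggregate(Choice ~ Subject + Label, data = trials, FUN = mean)

# One-sample t-test against chance (0.5) within each condition.
tests <- lapply(split(subj_means$Choice, subj_means$Label),
                function(x) t.test(x, mu = 0.5))

results <- t(sapply(tests, function(tt) {
  c(mean = unname(tt$estimate),
    t    = unname(tt$statistic),
    df   = unname(tt$parameter),
    p    = tt$p.value)
}))
print(round(results, 3))
```

Because the test is run on per-participant means, the degrees of freedom equal the number of participants in the condition minus one.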

                  Estimate Std. Error z value Pr(>|z|)
(Intercept)          -0.43       0.22   -1.97     0.05
Label.L              -0.69       0.14   -4.94     0.00
Meaning.L            -0.41       0.12   -3.51     0.00
Label.L:Meaning.L     0.06       0.19    0.29     0.78

            Estimate Std. Error z value Pr(>|z|)
(Intercept)    -0.14       0.17   -0.79     0.43
Label.L        -0.73       0.21   -3.43     0.00

            Estimate Std. Error z value Pr(>|z|)
(Intercept)    -0.72       0.29   -2.47     0.01
Label.L        -0.66       0.18   -3.64     0.00
## 
##  One Sample t-test
## 
## data:  subset(Child.Sum, Meaning == "Unambiguous (Same Kind)" & Label ==     "Shared Label")$Choice
## t = 1.995, df = 48, p-value = 0.05174
## alternative hypothesis: true mean is not equal to 0.5
## 95 percent confidence interval:
##  0.4993860 0.6570766
## sample estimates:
## mean of x 
## 0.5782313
## 
##  One Sample t-test
## 
## data:  subset(Child.Sum, Meaning == "Unambiguous (Same Kind)" & Label ==     "No Label")$Choice
## t = -3.6101, df = 48, p-value = 0.0007292
## alternative hypothesis: true mean is not equal to 0.5
## 95 percent confidence interval:
##  0.2913484 0.4406244
## sample estimates:
## mean of x 
## 0.3659864
## 
##  One Sample t-test
## 
## data:  subset(Child.Sum, Meaning == "Flexible (Thematic Relation)" &     Label == "Shared Label")$Choice
## t = -1.5064, df = 47, p-value = 0.1386
## alternative hypothesis: true mean is not equal to 0.5
## 95 percent confidence interval:
##  0.3726864 0.5182858
## sample estimates:
## mean of x 
## 0.4454861
## 
##  One Sample t-test
## 
## data:  subset(Child.Sum, Meaning == "Flexible (Thematic Relation)" &     Label == "No Label")$Choice
## t = -7.2563, df = 47, p-value = 3.344e-09
## alternative hypothesis: true mean is not equal to 0.5
## 95 percent confidence interval:
##  0.2028640 0.3318582
## sample estimates:
## mean of x 
## 0.2673611

Discussion

Potentially evidence for Sloutsky account

But one surprising result: adults go with the label in the lex flex condition. Why?

Pragmatic account? Why would they use distinct labels if they did not want us to use them?

Experiment 2

Always use labels, but vary whether they are shared or synonyms. By Sloutsky Fisher JECP, we should still get an effect.

Here, we compare inferences from, e.g., Chicken Animal to Duck, and to either another Chicken Animal or Chicken Meat. We also vary whether the same label is used twice (both pictures called chicken) or whether a synonym is used (one called chicken, one called drumsticks).

Adults

Label * Meaning (here LabelType * WordType) mixed effects model, followed by t-tests against chance for each condition.

## [1] "Unambig 11"
## [1] "Ambig 13"

                                      Estimate Std. Error z value Pr(>|z|)
(Intercept)                               1.60       0.60    2.68     0.01
WordTypePolysemous                       -3.57       0.79   -4.49     0.00
LabelTypeDifferent                       -0.80       0.53   -1.52     0.13
WordTypePolysemous:LabelTypeDifferent     0.93       0.79    1.17     0.24

                   Estimate Std. Error z value Pr(>|z|)
(Intercept)            1.84       0.77    2.38     0.02
LabelTypeDifferent    -1.01       0.66   -1.53     0.13

                   Estimate Std. Error z value Pr(>|z|)
(Intercept)           -2.19       0.87   -2.53     0.01
LabelTypeDifferent     0.01       0.58    0.01     0.99
## 
##  One Sample t-test
## 
## data:  subset(Adult.Sum, WordType == "Non-Polysemous" & LabelType ==     "Same")$Choice
## t = 2.8333, df = 10, p-value = 0.01775
## alternative hypothesis: true mean is not equal to 0.5
## 95 percent confidence interval:
##  0.5550177 0.9601338
## sample estimates:
## mean of x 
## 0.7575758
## 
##  One Sample t-test
## 
## data:  subset(Adult.Sum, WordType == "Non-Polysemous" & LabelType ==     "Different")$Choice
## t = 2.2361, df = 10, p-value = 0.04933
## alternative hypothesis: true mean is not equal to 0.5
## 95 percent confidence interval:
##  0.5005910 0.8327423
## sample estimates:
## mean of x 
## 0.6666667
## 
##  One Sample t-test
## 
## data:  subset(Adult.Sum, WordType == "Polysemous" & LabelType == "Different")$Choice
## t = -8.6841, df = 12, p-value = 1.608e-06
## alternative hypothesis: true mean is not equal to 0.5
## 95 percent confidence interval:
##  0.06699738 0.24069493
## sample estimates:
## mean of x 
## 0.1538462
## 
##  One Sample t-test
## 
## data:  subset(Adult.Sum, WordType == "Polysemous" & LabelType == "Same")$Choice
## t = -7.2111, df = 12, p-value = 1.07e-05
## alternative hypothesis: true mean is not equal to 0.5
## 95 percent confidence interval:
##  0.06595101 0.26738233
## sample estimates:
## mean of x 
## 0.1666667

Children

## [1] "Unambig 3 yrs 25"
## [1] "Unambig 4 yrs 24"
## [1] "Ambig 3 yrs 24"
## [1] "Ambig 4 yrs 24"
## [1] "3 yrs girls =  27"
## [1] "4 yrs girls =  22"
##   Age Age.Months.FUN1 Age.Months.FUN2 Age.Months.FUN3
## 1   3        42.98541              36        47.86849
## 2   4        51.61667              48        59.55890

Label * Meaning mixed effects model, followed by t-tests against chance for each condition.

                             Estimate Std. Error z value Pr(>|z|)
(Intercept)                     -0.29       0.19   -1.51     0.13
Meaning.L                       -0.51       0.16   -3.23     0.00
LabelTypeDifferent              -0.13       0.09   -1.43     0.15
Meaning.L:LabelTypeDifferent     0.20       0.13    1.59     0.11

                   Estimate Std. Error z value Pr(>|z|)
(Intercept)            0.07       0.14    0.53     0.60
LabelTypeDifferent    -0.27       0.13   -2.01     0.04

                   Estimate Std. Error z value Pr(>|z|)
(Intercept)           -0.66       0.28   -2.38     0.02
LabelTypeDifferent     0.00       0.12   -0.02     0.98
## 
##  One Sample t-test
## 
## data:  subset(Child.Sum, WordType == "Non-Polysemous" & LabelType ==     "Same")$Choice
## t = 1.7633, df = 48, p-value = 0.08422
## alternative hypothesis: true mean is not equal to 0.5
## 95 percent confidence interval:
##  0.4901714 0.6499646
## sample estimates:
## mean of x 
##  0.570068
## 
##  One Sample t-test
## 
## data:  subset(Child.Sum, WordType == "Non-Polysemous" & LabelType ==     "Different")$Choice
## t = -1.1923, df = 48, p-value = 0.239
## alternative hypothesis: true mean is not equal to 0.5
## 95 percent confidence interval:
##  0.3830445 0.5298807
## sample estimates:
## mean of x 
## 0.4564626
## 
##  One Sample t-test
## 
## data:  subset(Child.Sum, WordType == "Polysemous" & LabelType == "Different")$Choice
## t = -3.422, df = 47, p-value = 0.001297
## alternative hypothesis: true mean is not equal to 0.5
## 95 percent confidence interval:
##  0.2915913 0.4459087
## sample estimates:
## mean of x 
##   0.36875
## 
##  One Sample t-test
## 
## data:  subset(Child.Sum, WordType == "Polysemous" & LabelType == "Same")$Choice
## t = -3.7254, df = 47, p-value = 0.0005223
## alternative hypothesis: true mean is not equal to 0.5
## 95 percent confidence interval:
##  0.2914566 0.4377101
## sample estimates:
## mean of x 
## 0.3645833

Comparison of Experiments 1 and 2

Adults

And data analysis

## [1] 48

                                                           Estimate Std. Error z value Pr(>|z|)
(Intercept)                                                   -0.80       0.16   -4.89     0.00
MeaningFlexible (Thematic Relation)                           -1.15       0.17   -6.88     0.00
Label_cOther                                                  -0.63       0.12   -5.27     0.00
ExptExpt2                                                     -0.44       0.16   -2.65     0.01
MeaningFlexible (Thematic Relation):Label_cOther               0.09       0.12    0.80     0.43
MeaningFlexible (Thematic Relation):ExptExpt2                  0.29       0.16    1.76     0.08
Label_cOther:ExptExpt2                                        -0.47       0.12   -3.99     0.00
MeaningFlexible (Thematic Relation):Label_cOther:ExptExpt2    -0.01       0.12   -0.10     0.92

Children

Graph the two experiments side-by-side. Other is Synonym in Expt 2, No Label in Expt 1.
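
One way to put the two experiments on a common Label factor is sketched below: the shared-label condition of each experiment maps to one level, and the contrasting condition (No Label in Expt 1, Synonym in Expt 2) maps to `Other`. The toy data frames, the `Choice` values, and the level name `Shared` are hypothetical; only `Label_c`, `Other`, and the condition names come from the output in this document.

```r
# Sketch only: toy data frames standing in for the two experiments.
expt1 <- data.frame(Expt   = "Expt1",
                    Label  = c("Shared Label", "No Label"),
                    Choice = c(1, 0),
                    stringsAsFactors = FALSE)
expt2 <- data.frame(Expt   = "Expt2",
                    Label  = c("Same", "Different"),
                    Choice = c(1, 1),
                    stringsAsFactors = FALSE)

both <- rbind(expt1, expt2)

# Collapse the experiment-specific label conditions into a common factor.
shared_levels <- c("Shared Label", "Same")
both$Label_c <- factor(ifelse(both$Label %in% shared_levels, "Shared", "Other"),
                       levels = c("Shared", "Other"))
both$Expt <- factor(both$Expt, levels = c("Expt1", "Expt2"))

print(table(both$Expt, both$Label_c))
```

With this coding, a Label_c by Expt interaction asks whether the shared-versus-other difference changes when the other condition is a synonym rather than no label.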

And data analysis

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control
## $checkConv, : Model failed to converge with max|grad| = 0.087261 (tol =
## 0.001, component 1)

                                                           Estimate Std. Error z value Pr(>|z|)
(Intercept)                                                   -0.37       0.20   -1.82     0.07
MeaningFlexible (Thematic Relation)                           -0.33       0.09   -3.74     0.00
Label_cOther                                                  -0.30       0.07   -4.61     0.00
ExptExpt2                                                     -0.07       0.06   -1.13     0.26
MeaningFlexible (Thematic Relation):Label_cOther               0.08       0.07    1.20     0.23
MeaningFlexible (Thematic Relation):ExptExpt2                  0.03       0.06    0.55     0.58
Label_cOther:ExptExpt2                                        -0.18       0.07   -2.77     0.01
MeaningFlexible (Thematic Relation):Label_cOther:ExptExpt2    -0.06       0.07   -0.90     0.37

We can also look at how this varies by age

And analysis (note that the model doesn’t converge with random intercepts for subjects, only with random intercepts for items).
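
When a glmer fit reports convergence problems, two commonly tried options are switching the optimizer and simplifying the random-effects structure. The sketch below shows both on simulated data, using lme4's `glmerControl()`; the data, formula, and settings are assumptions for illustration and not necessarily what was done for the table that follows.

```r
# Sketch only: simulated data, hypothetical effect sizes.
library(lme4)

set.seed(3)
d <- expand.grid(Subject = seq_len(40),
                 Item    = seq_len(6),
                 Label_c = c("Shared", "Other"),
                 stringsAsFactors = FALSE)
d$Expt <- ifelse(d$Subject <= 20, "Expt1", "Expt2")

subj_eff <- rnorm(40, sd = 0.5)
item_eff <- rnorm(6, sd = 0.5)
eta <- -0.4 - 0.6 * (d$Label_c == "Other") +
  subj_eff[d$Subject] + item_eff[d$Item]
d$Choice <- rbinom(nrow(d), size = 1, prob = plogis(eta))

d$Label_c <- factor(d$Label_c, levels = c("Shared", "Other"))
d$Expt    <- factor(d$Expt)
d$Subject <- factor(d$Subject)
d$Item    <- factor(d$Item)

# Option 1: a different optimizer with a larger evaluation budget.
ctrl <- glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e5))
m_full <- glmer(Choice ~ Label_c * Expt + (1 | Subject) + (1 | Item),
                data = d, family = binomial, control = ctrl)

# Option 2: a simpler random-effects structure (item intercepts only).
m_items <- glmer(Choice ~ Label_c * Expt + (1 | Item),
                 data = d, family = binomial, control = ctrl)

# Compare the fixed effects across the two specifications.
fixed <- cbind(full = fixef(m_full), items_only = fixef(m_items))
print(round(fixed, 2))
```

If the fixed effects are similar under both specifications, dropping the subject intercepts is less likely to be driving the conclusions, although the standard errors can still differ.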

                                                                Estimate Std. Error z value Pr(>|z|)
(Intercept)                                                        -0.33       0.18   -1.81     0.07
MeaningFlexible (Thematic Relation)                                -0.30       0.06   -5.21     0.00
Label_cOther                                                       -0.28       0.04   -6.25     0.00
Age4                                                                0.01       0.06    0.13     0.89
ExptExpt2                                                          -0.07       0.06   -1.17     0.24
MeaningFlexible (Thematic Relation):Label_cOther                    0.07       0.04    1.64     0.10
MeaningFlexible (Thematic Relation):Age4                           -0.09       0.06   -1.65     0.10
Label_cOther:Age4                                                   0.00       0.04    0.07     0.94
MeaningFlexible (Thematic Relation):ExptExpt2                       0.03       0.06    0.55     0.58
Label_cOther:ExptExpt2                                             -0.16       0.04   -3.66     0.00
Age4:ExptExpt2                                                     -0.08       0.06   -1.48     0.14
MeaningFlexible (Thematic Relation):Label_cOther:Age4               0.00       0.04   -0.06     0.95
MeaningFlexible (Thematic Relation):Label_cOther:ExptExpt2         -0.05       0.04   -1.22     0.22
MeaningFlexible (Thematic Relation):Age4:ExptExpt2                  0.01       0.06    0.16     0.88
Label_cOther:Age4:ExptExpt2                                        -0.02       0.04   -0.48     0.63
MeaningFlexible (Thematic Relation):Label_cOther:Age4:ExptExpt2     0.05       0.04    1.10     0.27

Analysis of vocab survey

Proportion of labels and synonyms that parents rate their child as knowing, or rate themselves as unsure whether the child knows.

library(tidyr)
## 
## Attaching package: 'tidyr'
## The following object is masked from 'package:Matrix':
## 
##     expand
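library(dplyr)
library(doBy)  # provides summaryBy(), used below
# Read the wide-format survey (one column per word) and reshape it to long format
vocab.w <- read.csv("./Data/VocabSurvey/Vocab_Full.csv", header = TRUE)
vocab <- gather(vocab.w, word, response, rocker:glass.windowpane, factor_key = TRUE)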
## Warning: attributes are not identical across measure variables; they will
## be dropped
# Keep only the target labels and synonyms
vocab <- subset(vocab, word %in% c("rocker","chicken.drumstick","chicken.hen","zebra","horse.foal","plastic","glass.cup","windowpane","cup","duck","horse.rocker","foal","drumsticks","hen","glass.windowpane"))
# Order words so that each label sits next to its synonym
vocab$word <- ordered(vocab$word, levels = c("chicken.hen","hen","chicken.drumstick","drumsticks","duck", 
                                             "glass.windowpane","windowpane","glass.cup","cup","plastic",
                                             "horse.foal","foal","horse.rocker","rocker","zebra"))
# Count a word as known if the parent answered "yes" or "notsure"
vocab$known <- ifelse(vocab$response %in% c("yes", "notsure"), 1, 0)
summaryBy(known ~ word, data = vocab)
##                 word known.mean
## 1        chicken.hen  1.0000000
## 2                hen  0.8493151
## 3  chicken.drumstick  0.8630137
## 4         drumsticks  0.4109589
## 5               duck  0.9315068
## 6   glass.windowpane  0.8630137
## 7         windowpane  0.5890411
## 8          glass.cup  0.9863014
## 9                cup  0.8356164
## 10           plastic  0.8082192
## 11        horse.foal  0.9863014
## 12              foal  0.5068493
## 13      horse.rocker  0.9315068
## 14            rocker  0.8356164
## 15             zebra  0.9863014
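
For reference, the same per-word proportion can be computed in base R with `aggregate()`. The sketch below uses a small made-up data frame in the same long format (one row per parent response); the words are borrowed from the list above, but the responses are invented for illustration.

```r
# Sketch only: invented responses in the same long format as `vocab`.
vocab_toy <- data.frame(
  word     = rep(c("hen", "drumsticks", "foal"), each = 4),
  response = c("yes", "yes", "notsure", "no",
               "no",  "no",  "yes",     "no",
               "yes", "no",  "notsure", "no"),
  stringsAsFactors = FALSE
)

# Same coding as above: "yes" and "notsure" both count as known.
vocab_toy$known <- ifelse(vocab_toy$response %in% c("yes", "notsure"), 1, 0)

# Mean of the 0/1 indicator per word gives the proportion rated as known.
known_by_word <- aggregate(known ~ word, data = vocab_toy, FUN = mean)
print(known_by_word)
```

Note that under this coding "notsure" responses raise the proportion, so words with many uncertain ratings may look better known than a yes-only coding would suggest.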

Discussion

No robust effect of synonyms, and the label effect differed significantly between the two experiments (Label × Experiment interaction).

General discussion

What can lex flex tell us about role of labels in induction?

What can induction tell us about children’s early word meanings?