Load packages and helper functions
- Packages
- Helper functions
Load datasets
Experiment 1 (consistent condition)
Experiment 2 (partial condition)
Experiment 3 (inconsistent condition)
Additional analyses: footnote #9

Load packages and helper functions

Packages

library(akima)
library(compute.es)
library(cowplot)
library(doBy)
library(ez)
library(ggplot2)
library(Hmisc)
library(knitr)
library(languageR)
library(lattice)
library(lme4)
library(multcomp)
library(nlme)
library(pastecs)
library(plotrix)
library(plyr)
library(psych)
library(Rcpp)
library(reshape)
library(reshape2)
library(stringdist)
library(WRS)

theme_set(theme_bw())

Helper functions

summarySE

This function can be found on the website “Cookbook for R”.

http://www.cookbook-r.com/Graphs/Plotting_means_and_error_bars_(ggplot2)/#Helper functions

It summarizes data, giving count, mean, standard deviation, standard error of the mean, and confidence intervals (default 95%).

data: a data frame.

measurevar: the name of a column that contains the variable to be summarized

groupvars: a vector containing names of columns that contain grouping variables

na.rm: a boolean that indicates whether to ignore NA’s

conf.interval: the percent range of the confidence interval (default is 95%)

summarySE <- function(data=NULL, measurevar, groupvars=NULL, na.rm=FALSE,
                      conf.interval=.95, .drop=TRUE) {
    require(plyr)

    # New version of length which can handle NA's: if na.rm==T, don't count them
    length2 <- function (x, na.rm=FALSE) {
        if (na.rm) sum(!is.na(x))
        else       length(x)
    }

    # This does the summary. For each group's data frame, return a vector with
    # N, mean, and sd
    datac <- ddply(data, groupvars, .drop=.drop,
      .fun = function(xx, col) {
        c(N    = length2(xx[[col]], na.rm=na.rm),
          mean = mean   (xx[[col]], na.rm=na.rm),
          sd   = sd     (xx[[col]], na.rm=na.rm)
        )
      },
      measurevar
    )

    # Rename the "mean" column    
    datac <- rename(datac, c("mean" = measurevar))

    datac$se <- datac$sd / sqrt(datac$N)  # Calculate standard error of the mean

    # Confidence interval multiplier for standard error
    # Calculate t-statistic for confidence interval: 
    # e.g., if conf.interval is .95, use .975 (above/below), and use df=N-1
    ciMult <- qt(conf.interval/2 + .5, datac$N-1)
    datac$ci <- datac$se * ciMult

    return(datac)
}
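The arithmetic inside summarySE can be sketched in base R on a made-up data frame. The data frame `toy` and its columns are hypothetical, and the sketch does not call summarySE or plyr; it only reproduces the per-group N, mean, sd, standard error, and t-based confidence interval half-width.

```r
# Hypothetical toy data: one measure, one grouping variable
toy <- data.frame(
  group = rep(c("a", "b"), each = 4),
  score = c(1, 2, 3, 4, 2, 4, 6, 8)
)

# Per-group N, mean, and standard deviation
n <- tapply(toy$score, toy$group, length)
m <- tapply(toy$score, toy$group, mean)
s <- tapply(toy$score, toy$group, sd)

# Standard error of the mean and 95% CI half-width (t quantile, df = N - 1)
se <- s / sqrt(n)
ci <- se * qt(0.95 / 2 + 0.5, df = n - 1)

toy.summary <- data.frame(
  group = names(m),
  N     = as.vector(n),
  score = as.vector(m),
  sd    = as.vector(s),
  se    = as.vector(se),
  ci    = as.vector(ci)
)
print(toy.summary)
```

The column holding the mean is named after the measure, mirroring the rename step in the function above.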

summarySEwithin

This function can be found on the website “Cookbook for R”.

http://www.cookbook-r.com/Graphs/Plotting_means_and_error_bars_(ggplot2)/#Helper functions

It summarizes data, handling within-subjects variables by removing inter-subject variability. It will still work if there are no within-subjects variables. It gives count, un-normed mean, normed mean (with same between-group mean), standard deviation, standard error of the mean, and confidence intervals. If there are within-subjects variables, it calculates adjusted values using the method from Morey (2008).

data: a data frame.

measurevar: the name of a column that contains the variable to be summarized

betweenvars: a vector containing names of columns that are between-subjects variables

withinvars: a vector containing names of columns that are within-subjects variables

idvar: the name of a column that identifies each subject (or matched subjects)

na.rm: a boolean that indicates whether to ignore NA’s

conf.interval: the percent range of the confidence interval (default is 95%)

summarySEwithin <- function(data=NULL, measurevar, betweenvars=NULL, withinvars=NULL,
                            idvar=NULL, na.rm=FALSE, conf.interval=.95, .drop=TRUE) {

  # Ensure that the betweenvars and withinvars are factors
  factorvars <- vapply(data[, c(betweenvars, withinvars), drop=FALSE],
    FUN=is.factor, FUN.VALUE=logical(1))

  if (!all(factorvars)) {
    nonfactorvars <- names(factorvars)[!factorvars]
    message("Automatically converting the following non-factors to factors: ",
            paste(nonfactorvars, collapse = ", "))
    data[nonfactorvars] <- lapply(data[nonfactorvars], factor)
  }

  # Get the means from the un-normed data
  datac <- summarySE(data, measurevar, groupvars=c(betweenvars, withinvars),
                     na.rm=na.rm, conf.interval=conf.interval, .drop=.drop)

  # Drop all the unused columns (these will be calculated with normed data)
  datac$sd <- NULL
  datac$se <- NULL
  datac$ci <- NULL

  # Norm each subject's data
  ndata <- normDataWithin(data, idvar, measurevar, betweenvars, na.rm, .drop=.drop)

  # This is the name of the new column
  measurevar_n <- paste(measurevar, "_norm", sep="")

  # Collapse the normed data - now we can treat between and within vars the same
  ndatac <- summarySE(ndata, measurevar_n, groupvars=c(betweenvars, withinvars),
                      na.rm=na.rm, conf.interval=conf.interval, .drop=.drop)

  # Apply correction from Morey (2008) to the standard error and confidence interval
  #  Get the product of the number of conditions of within-S variables
  nWithinGroups    <- prod(vapply(ndatac[,withinvars, drop=FALSE], FUN=nlevels,
                           FUN.VALUE=numeric(1)))
  correctionFactor <- sqrt( nWithinGroups / (nWithinGroups-1) )

  # Apply the correction factor
  ndatac$sd <- ndatac$sd * correctionFactor
  ndatac$se <- ndatac$se * correctionFactor
  ndatac$ci <- ndatac$ci * correctionFactor

  # Combine the un-normed means with the normed results
  merge(datac, ndatac)
}
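Note that summarySEwithin calls normDataWithin, which is not listed in this section and would need to be loaded from the same Cookbook for R page. The sketch below illustrates the two ideas involved on hypothetical toy data: norming each subject's scores, and the Morey (2008) correction factor. It assumes that norming means subtracting each subject's mean and adding back the grand mean; the names `toy`, `id`, `cond`, and `score` are invented for illustration.

```r
# Hypothetical toy data: 3 subjects x 2 within-subject conditions
toy <- data.frame(
  id    = rep(c("s1", "s2", "s3"), each = 2),
  cond  = rep(c("pre", "post"), times = 3),
  score = c(10, 12, 20, 23, 30, 31)
)

# Norm each subject's scores: remove the subject mean, add back the grand mean
subj.mean      <- ave(toy$score, toy$id)   # per-subject mean, aligned to rows
toy$score_norm <- toy$score - subj.mean + mean(toy$score)

# Morey (2008) correction for k within-subject conditions
k                 <- nlevels(factor(toy$cond))
correction.factor <- sqrt(k / (k - 1))

# Corrected sd of the normed scores in each condition
sd.raw  <- tapply(toy$score, toy$cond, sd)
sd.norm <- tapply(toy$score_norm, toy$cond, sd) * correction.factor
print(rbind(raw = sd.raw, normed = sd.norm))
```

After norming, every subject has the same mean, so the remaining spread within a condition reflects only the within-subject effect.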

myCenter

This function outputs the centered values of a variable, which can be a numeric variable, a factor, or a data frame. It was taken from Florian Jaeger's blog https://hlplab.wordpress.com/2009/04/27/centering-several-variables/.

From his blog:

- If the input is a numeric variable, the output is the centered variable.

- If the input is a factor, the output is a numeric variable with centered factor level values. That is, the factor’s levels are converted into numerical values in their inherent order (if not specified otherwise, R defaults to alphanumerical order). More specifically, this centers any binary factor so that the value below 0 will be the 1st level of the original factor, and the value above 0 will be the 2nd level.

- If the input is a data frame or matrix, the output is a new matrix of the same dimension and with the centered values and column names that correspond to the colnames() of the input preceded by “c” (e.g. “Variable1” will be “cVariable1”).

myCenter = function(x) {
  if (is.numeric(x)) { return(x - mean(x, na.rm=TRUE)) }
  if (is.factor(x)) {
    x = as.numeric(x)
    return(x - mean(x, na.rm=TRUE))
  }
  if (is.data.frame(x) || is.matrix(x)) {
    m = matrix(nrow=nrow(x), ncol=ncol(x))
    colnames(m) = paste("c", colnames(x), sep="")
    for (i in 1:ncol(x)) {
      m[,i] = myCenter(x[,i])
    }
    return(as.data.frame(m))
  }
}
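What the factor branch does to a two-level factor can be seen in a small base-R sketch. The factors below are hypothetical; the arithmetic is the same as in myCenter (convert levels to 1 and 2, then subtract the mean), written out directly so the example runs on its own.

```r
# Hypothetical two-level factor; levels sort alphabetically: "new" < "old"
noun.type <- factor(c("new", "old", "old", "new"))

# Same arithmetic as the factor branch of myCenter
codes    <- as.numeric(noun.type)   # new = 1, old = 2
centered <- codes - mean(codes)
print(centered)                     # balanced data: -0.5 for "new", 0.5 for "old"

# With unbalanced data the two codes are no longer symmetric around zero
unbalanced <- factor(c("new", "old", "old", "old"))
u.codes    <- as.numeric(unbalanced)
u.centered <- u.codes - mean(u.codes)
print(u.centered)                   # -0.75 for "new", 0.25 for "old"
```

In both cases the centered codes sum to zero, so the model intercept is evaluated at the average of the data rather than at a reference level.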

lizCenter

This function provides a wrapper around myCenter, allowing you to center a specific list of variables from a dataframe. The input is a dataframe (x) and a list of the names of the variables which you wish to center (listfname). The output is a copy of the dataframe with a numeric column added for each of the centered variables, each labelled with its previous name with “.ct” appended. For example, if x is a dataframe with columns “a” and “b”, lizCenter(x, list("a", "b")) will return a dataframe with two additional columns, a.ct and b.ct, which are numeric, centered codings of the corresponding variables.

lizCenter= function(x, listfname) 
{
    for (i in 1:length(listfname)) 
    {
        fname = as.character(listfname[i])
        x[paste(fname,".ct", sep="")] = myCenter(x[fname])
    }
        
    return(x)
}

Load datasets

The dataframe all.production.data contains the data from children’s and adults’ production performance in experiments 1, 2, & 3.

all.production.data <- read.csv("production_data.csv", header=TRUE)

The dataframe all.afc.data contains the data from children’s and adults’ 2afc performance in experiments 1, 2, & 3.

all.afc.data <- read.csv("2afc_data.csv", header=TRUE)

Entropy data: The dataframe all.entropy.data contains entropy scores (total entropy, lexical mutual information, & speaker-identity mutual information) for children’s and adults’ performance in experiments 1, 2, & 3.

all.entropy.data = read.csv("talker_entropy_data_both_final.csv")

Experiment 1 (consistent condition)

Production data: baseline

select appropriate dataset

Select the production data from the consistent condition:

exp1.data = subset(all.production.data, consistency == "consistent")

Filter out responses where participants produced an incorrect noun:

exp1.data = subset(exp1.data, noun_correct ==1)

Create a column that codes whether an ‘other’ response was produced:

exp1.data$det_other <- 0
exp1.data$det_other[exp1.data$pt_logical_det_intended == "other"] <- 1
exp1.data$det_other[exp1.data$pt_logical_det_intended == "none"] <- 1

calculate proportion of excluded trials

round(with(exp1.data, tapply(det_other, list(adult_child), mean, na.rm=T)),4)

##  adult  child 
## 0.0000 0.0977
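The coding and the proportion calculation can be tried on a few made-up rows. This sketch uses `%in%` to set the indicator in one step, which is an alternative to the two separate assignments above rather than the code used in the analysis; the data frame `toy` is hypothetical.

```r
# Hypothetical trials: two adults' rows and four children's rows
toy <- data.frame(
  adult_child             = c("adult", "adult", "child", "child", "child", "child"),
  pt_logical_det_intended = c("det1", "det2", "det1", "other", "none", "det2")
)

# 1 if the response was coded "other" or "none", 0 otherwise
toy$det_other <- as.numeric(toy$pt_logical_det_intended %in% c("other", "none"))

# Proportion of such trials per age group
prop.other <- tapply(toy$det_other, toy$adult_child, mean)
print(round(prop.other, 4))
```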

statistical analyses

lme analyses: children

Select appropriate dataset:

exp1.child.data = subset(exp1.data, adult_child == "child")

Ensure day is coded as a factor and center the variables of interest using the lizCenter function:

exp1.child.data$day = factor(exp1.child.data$day)
exp1.child.data.ct = lizCenter(exp1.child.data, list("day","oldnew"))

Run the lme model:

exp1.child.data.excluded.lmer = glmer(det_other~ day.ct * oldnew.ct + (day.ct*oldnew.ct|pt_code), family = binomial, data = exp1.child.data.ct)

kable(summary(exp1.child.data.excluded.lmer)$coefficients, 
      digits = 2)

	Estimate	Std. Error	z value	Pr(>\|z\|)
(Intercept)	-10.74	4.27	-2.51	0.01
day.ct	-5.69	6.73	-0.85	0.40
oldnew.ct	-0.26	1.77	-0.15	0.88
day.ct:oldnew.ct	-2.23	3.83	-0.58	0.56

Production data: choice of particle

select appropriate dataset

Select both children’s and adults’ production data from the consistent condition:

exp1.both.production = subset(all.production.data, consistency== "consistent")

Filter out ‘other responses’:

exp1.both.production = subset(exp1.both.production, pt_logical_det_intended =="det1" | pt_logical_det_intended =="det2")

Filter out responses where participants produced an incorrect noun:

exp1.both.production = subset(exp1.both.production, noun_correct ==1)

calculate means for Figure 3

Aggregate data:

aggregated.production_consistent = aggregate(correct_intended ~ pt_code + adult_child + day + oldnew, exp1.both.production, FUN=mean)

Calculate means: Proportion of correct particle usage by age group, noun type, and day in experiment 1

round(with(aggregated.production_consistent, tapply(correct_intended, list(day, oldnew, adult_child), mean, na.rm=T)),2)

## , , adult
## 
##    new  old
## 1 0.96 0.96
## 4 0.98 0.99
## 
## , , child
## 
##    new  old
## 1 0.84 0.83
## 4 0.85 0.85
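Aggregating by participant before averaging matters because, once trials have been filtered out, participants no longer contribute equal numbers of trials. The hypothetical sketch below contrasts the mean of participant means (the two-step approach used here) with the pooled trial mean; the data are invented.

```r
# Hypothetical data: p1 kept 4 trials, p2 kept only 2
toy <- data.frame(
  pt_code          = c("p1", "p1", "p1", "p1", "p2", "p2"),
  correct_intended = c(1, 1, 1, 1, 0, 1)
)

# Step 1: one mean per participant
by.participant <- aggregate(correct_intended ~ pt_code, toy, FUN = mean)

# Step 2: average the participant means (each participant weighted equally)
mean.of.means <- mean(by.participant$correct_intended)

# For comparison: pooling all trials weights p1 twice as heavily as p2
pooled.mean <- mean(toy$correct_intended)

print(c(mean.of.means = mean.of.means, pooled.mean = pooled.mean))
```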

statistical analyses

lme analyses: children & adults

Ensure day is coded as a factor and center the variables of interest using the lizCenter function:

exp1.both.production$day = factor(exp1.both.production$day)
exp1.both.production.ct = lizCenter(exp1.both.production, list("day","oldnew", "adult_child"))

Run the lme model:

exp1.both.production.lmer = glmer(correct_intended ~ day.ct * oldnew.ct * adult_child.ct + (day.ct*oldnew.ct|pt_code), family = binomial, data = exp1.both.production.ct, control=glmerControl(optimizer = "bobyqa"))

kable(summary(exp1.both.production.lmer)$coefficients, 
      digits = 2)

	Estimate	Std. Error	z value	Pr(>\|z\|)
(Intercept)	4.83	0.39	12.32	0.00
day.ct	0.20	0.49	0.40	0.69
oldnew.ct	-0.34	0.33	-1.01	0.31
adult_child.ct	-3.63	0.71	-5.13	0.00
day.ct:oldnew.ct	0.24	0.66	0.36	0.72
day.ct:adult_child.ct	-0.49	0.76	-0.64	0.52
oldnew.ct:adult_child.ct	-0.03	0.38	-0.08	0.93
day.ct:oldnew.ct:adult_child.ct	-0.64	0.76	-0.85	0.40

2afc data

select appropriate dataset

Select both children’s and adults’ 2afc data from the consistent condition:

exp1.both.2afc = subset(all.afc.data, consistency=="consistent")

calculate means for Figure 4

Aggregate data:

aggregated.2afc_consistent = aggregate(correct ~ pt_code + adult_child + old_new, exp1.both.2afc, FUN=mean)

Calculate means: proportion of correct particle choice by age group and noun type in experiment 1

round(with(aggregated.2afc_consistent, tapply(correct, list(adult_child, old_new), mean, na.rm=T)),2)

##        new  old
## adult 0.98 1.00
## child 0.81 0.81

statistical analyses

lme analyses: children & adults

Center variables of interest using the lizCenter function:

exp1.both.2afc.ct = lizCenter(exp1.both.2afc, list("old_new", "adult_child"))

Run the lme model:

exp1.both.2afc.lmer = glmer(correct~ old_new.ct * adult_child.ct + (old_new.ct |pt_code), family = binomial, data = exp1.both.2afc.ct)

kable(summary(exp1.both.2afc.lmer)$coefficients, 
      digits = 2)

	Estimate	Std. Error	z value	Pr(>\|z\|)
(Intercept)	6.89	1.40	4.91	0.00
old_new.ct	-0.63	1.52	-0.42	0.68
adult_child.ct	-4.55	1.72	-2.65	0.01
old_new.ct:adult_child.ct	-1.08	1.54	-0.70	0.48

Production data: unaware participants

select appropriate dataset

Select children’s production data:

exp1.child.production = subset(exp1.both.production, adult_child == "child")

Filter out ‘aware’ children:

exp1.child.production.unaware = subset(exp1.child.production, aware=="no")

calculate means for Figure B9

Aggregate data:

aggregated.production_consistent_unaware = aggregate(correct_intended ~ pt_code + day + oldnew, exp1.child.production.unaware , FUN=mean)

Calculate means: Proportion of correct particle usage by noun type and day in experiment 1, unaware participants

round(with(aggregated.production_consistent_unaware, tapply(correct_intended, list(oldnew, day), mean, na.rm=T)),2)

##        1    4
## new 0.87 0.82
## old 0.86 0.82

statistical analyses

lme analyses: children

Center variables of interest using the lizCenter function:

exp1.child.production.unaware.ct = lizCenter(exp1.child.production.unaware, list("day","oldnew"))

Run the lme model:

exp1.child.production.unaware.lmer = glmer(correct_intended ~ day.ct * oldnew.ct + (day.ct*oldnew.ct|pt_code), family = binomial, data = exp1.child.production.unaware.ct)

kable(summary(exp1.child.production.unaware.lmer)$coefficients, 
      digits = 2)

	Estimate	Std. Error	z value	Pr(>\|z\|)
(Intercept)	2.64	0.51	5.13	0.00
day.ct	-0.35	0.26	-1.32	0.19
oldnew.ct	-0.23	0.25	-0.94	0.35
day.ct:oldnew.ct	0.28	0.50	0.57	0.57

2afc data: unaware participants

select appropriate dataset

Select children’s 2afc data:

exp1.child.2afc = subset(exp1.both.2afc, adult_child == "child")

Filter out ‘aware’ children:

exp1.child.2afc.unaware = subset(exp1.child.2afc, aware=="no")

calculate means for Figure B10

Aggregate data:

aggregated.2afc_consistent_unaware = aggregate(correct ~ pt_code + adult_child + old_new, exp1.child.2afc.unaware, FUN=mean)

Calculate means: proportion of correct particle choice by noun type in experiment 1, unaware participants

round(with(aggregated.2afc_consistent_unaware, tapply(correct, list(old_new), mean, na.rm=T)),2)

##  new  old 
## 0.78 0.81

statistical analyses

lme analyses: children

Center variables of interest using the lizCenter function:

exp1.child.2afc.unaware.ct = lizCenter(exp1.child.2afc.unaware, list("old_new"))

Run the lme:

exp1.child.2afc.unaware.lmer = glmer(correct~ old_new.ct + (old_new.ct |pt_code), family = binomial, data = exp1.child.2afc.unaware.ct)

kable(summary(exp1.child.2afc.unaware.lmer)$coefficients, 
      digits = 2)

	Estimate	Std. Error	z value	Pr(>\|z\|)
(Intercept)	3.47	1.21	2.88	0.00
old_new.ct	0.10	1.13	0.09	0.93

Overall regularization

calculate means for Figure 5

Entropy of particle usage for day 1 and 4, for productions involving old nouns, novel nouns, or taken across all nouns in the language, in experiments 1, 2, & 3

round(with(all.entropy.data, tapply(TotalEntropy, list(Consistency, AgeGroup, Day, LexicalItems), mean, na.rm=T)),2)

## , , 1, all
## 
##              adult child
## consistent    0.94  0.65
## inconsistent  0.88  0.35
## partial       0.98  0.33
## 
## , , 4, all
## 
##              adult child
## consistent    1.00  0.76
## inconsistent  0.96  0.68
## partial       0.99  0.67
## 
## , , 1, new
## 
##              adult child
## consistent    0.93  0.60
## inconsistent  0.85  0.28
## partial       0.94  0.26
## 
## , , 4, new
## 
##              adult child
## consistent    1.00  0.72
## inconsistent  0.89  0.62
## partial       0.96  0.64
## 
## , , 1, old
## 
##              adult child
## consistent    0.95  0.68
## inconsistent  0.84  0.38
## partial       0.96  0.37
## 
## , , 4, old
## 
##              adult child
## consistent    1.00  0.78
## inconsistent  0.89  0.69
## partial       0.97  0.68

lexical mutual information of particle usage for day 1 and 4, for productions involving old nouns, novel nouns, or taken across all nouns in the language, in experiments 1, 2, & 3

round(with(all.entropy.data, tapply(MutualInformationLexical, list(Consistency, AgeGroup, Day, LexicalItems), mean, na.rm=T)),2)

## , , 1, all
## 
##              adult child
## consistent    0.01  0.03
## inconsistent  0.34  0.09
## partial       0.29  0.05
## 
## , , 4, all
## 
##              adult child
## consistent    0.00  0.02
## inconsistent  0.27  0.11
## partial       0.13  0.06
## 
## , , 1, new
## 
##              adult child
## consistent    0.00  0.01
## inconsistent  0.31  0.04
## partial       0.23  0.03
## 
## , , 4, new
## 
##              adult child
## consistent    0.00  0.02
## inconsistent  0.18  0.08
## partial       0.12  0.05
## 
## , , 1, old
## 
##              adult child
## consistent    0.01  0.02
## inconsistent  0.31  0.09
## partial       0.28  0.05
## 
## , , 4, old
## 
##              adult child
## consistent    0.01  0.01
## inconsistent  0.23  0.10
## partial       0.10  0.06

select appropriate dataset

Select children’s and adults’ entropy data, across nouns, from the consistent condition:

exp1.entropy.allnouns = subset(all.entropy.data, Consistency == "consistent" & LexicalItems =="all")

statistical analyses

whole-language entropy analyses

We need to turn the data to wide-format. We do that using the dcast function.

exp1.entropy.total.allnouns <- dcast(exp1.entropy.allnouns, Participant + AgeGroup ~ Day, value.var="TotalEntropy")
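The long-to-wide step can be illustrated without reshape2 using base R's reshape(). This is a substitute for dcast shown only to make the target layout concrete (one row per participant, one column per day); the data are hypothetical, and the resulting column names differ from those dcast would produce.

```r
# Hypothetical long-format data: one row per participant per day
long <- data.frame(
  Participant  = rep(c("p1", "p2", "p3"), each = 2),
  AgeGroup     = rep(c("adult", "child", "child"), each = 2),
  Day          = rep(c(1, 4), times = 3),
  TotalEntropy = c(0.9, 1.0, 0.5, 0.7, 0.6, 0.8)
)

# Wide format: one row per participant, one entropy column per day
wide <- reshape(long,
                idvar     = c("Participant", "AgeGroup"),
                timevar   = "Day",
                v.names   = "TotalEntropy",
                direction = "wide")
print(wide)
```

In the wide layout, column 2 holds the between-subjects factor and columns 3 and 4 hold the two days, which is the arrangement the subsequent bw2list call indexes.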

Run bootstrapped ANOVA:

A = Age Group

B = Day

AB = Age Group by Day

z=bw2list(exp1.entropy.total.allnouns,2,c(3:4))

## [1] "Levels for between factor:"
## [1] adult child
## Levels: adult child

RobustAnova=bwtrimbt(2,2,z,nboot=1000,tr=.05)

## [1] "Taking bootstrap samples. Please wait."

RobustAnova

## $p.value.A
## [1] 0.014
## 
## $p.value.B
## [1] 0.041
## 
## $p.value.AB
## [1] 0.101

Establish whether the intercept is different from chance (0.987858121):

Aggregate data:

exp1.entropy.total.overall = aggregate(TotalEntropy ~ Participant, exp1.entropy.allnouns, FUN=mean)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp1.entropy.total.overall$TotalEntropy, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##      Mean     Lower     Upper 
## 0.8315874 0.7501969 0.9076309
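The logic of this comparison is that the mean differs from chance if the bootstrapped interval excludes the chance value. A minimal base-R sketch of a percentile bootstrap of the mean follows; it is assumed here to approximate what smean.cl.boot computes, not to reproduce it exactly. The scores in `x` and the seed are hypothetical; the chance value is the one quoted above.

```r
set.seed(1)  # arbitrary seed so the resampling is repeatable

# Hypothetical per-participant entropy scores
x      <- c(0.42, 0.55, 0.61, 0.70, 0.78, 0.83, 0.90, 0.95, 0.99, 1.00)
chance <- 0.987858121  # chance-level entropy quoted in the text

# Resample with replacement B times and record the mean of each resample
B          <- 1000
boot.means <- replicate(B, mean(sample(x, replace = TRUE)))

# Percentile interval: 2.5th and 97.5th percentiles of the resampled means
ci     <- quantile(boot.means, probs = c(0.025, 0.975), names = FALSE)
result <- c(Mean = mean(x), Lower = ci[1], Upper = ci[2])
print(result)

# The mean differs from chance if the interval excludes the chance value
excludes.chance <- chance < result[["Lower"]] || chance > result[["Upper"]]
print(excludes.chance)
```

Applied to the output above, the upper bound (0.9076309) lies below 0.987858121, so the interval excludes chance.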

Establish whether children showed above chance regularization on day 1

select appropriate dataset:

exp1.entropy.total.overall.kids1=subset(exp1.entropy.allnouns, AgeGroup =="child" & LexicalItems =="all" & Day == 1)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp1.entropy.total.overall.kids1$TotalEntropy, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##      Mean     Lower     Upper 
## 0.6547058 0.4912498 0.7998439

Establish whether children showed above chance regularization on day 4

select appropriate dataset:

exp1.entropy.total.overall.kids4=subset(exp1.entropy.allnouns, AgeGroup =="child" & LexicalItems =="all" & Day == 4)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp1.entropy.total.overall.kids4$TotalEntropy, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##      Mean     Lower     Upper 
## 0.7593695 0.6011025 0.8877136

Establish whether adults showed above chance regularization on day 1

select appropriate dataset:

exp1.entropy.total.overall.adults1=subset(exp1.entropy.allnouns, AgeGroup =="adult" & LexicalItems =="all" & Day == 1)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp1.entropy.total.overall.adults1$TotalEntropy, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##      Mean     Lower     Upper 
## 0.9424327 0.8518657 0.9998591

Establish whether adults showed above chance regularization on day 4

select appropriate dataset:

exp1.entropy.total.overall.adults4=subset(exp1.entropy.allnouns, AgeGroup =="adult" & LexicalItems =="all" & Day == 4)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp1.entropy.total.overall.adults4$TotalEntropy, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##      Mean     Lower     Upper 
## 0.9996946 0.9993656 0.9999530

whole-language entropy analyses with the additional within-subject factor noun type

Select appropriate dataset:

exp1.entropy.old.new.nouns = subset(all.entropy.data, Consistency == "consistent" & LexicalItems !="all")

We need to turn the data to wide-format. We do that using the dcast function.

exp1.entropy.total.old.new.nouns <- dcast(exp1.entropy.old.new.nouns, Participant + AgeGroup ~ Day * LexicalItems, value.var="TotalEntropy")

Run bootstrapped ANOVA:

A = Age Group

B = Day

C = Noun Type

AB = Age Group by Day

AC = Age Group by Noun Type

BC = Day by Noun Type

ABC = Age Group by Day by Noun Type

z=bw2list(exp1.entropy.total.old.new.nouns,2,c(3:6))

## [1] "Levels for between factor:"
## [1] adult child
## Levels: adult child

RobustAnova=bwwtrimbt(2,2,2,z,nboot=1000,est=mom,tr=.05)

RobustAnova

## $p.value.A
## [1] 0.009
## 
## $p.value.B
## [1] 0.031
## 
## $p.value.C
## [1] 0.038
## 
## $p.value.AB
## [1] 0.085
## 
## $p.value.AC
## [1] 0.065
## 
## $p.value.BC
## [1] 0.205
## 
## $p.value.ABC
## [1] 0.544

Analyses over number of participants showing evidence of significant regularization:

Comparison of adult vs. child regularizers on day 1:

select appropriate dataset:

regularizers.day1 = subset(exp1.entropy.allnouns, Day  == 1)

calculate number of regularizers and run chi-square test:

x =table(regularizers.day1$PLowerTotalEntropy, regularizers.day1$AgeGroup)
x

##    
##     adult child
##   0    28    15
##   1     2    13

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 11.942, df = 1, p-value = 0.0005488
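The test statistic can be checked from the counts printed above without access to the raw data. The sketch below rebuilds the 2 x 2 table by hand (the dimension name `regularizer` is a label chosen for illustration) and computes the uncorrected Pearson statistic both with chisq.test and from the expected counts.

```r
# Counts copied from the day-1 table above (rows: 0 / 1, columns: adult / child)
counts <- matrix(c(28, 2, 15, 13), nrow = 2,
                 dimnames = list(regularizer = c("0", "1"),
                                 AgeGroup    = c("adult", "child")))

# Pearson chi-square without Yates' continuity correction
res <- chisq.test(counts, correct = FALSE)

# The same statistic by hand: expected = row total * column total / N
expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)
x2       <- sum((counts - expected)^2 / expected)
p        <- pchisq(x2, df = 1, lower.tail = FALSE)

print(c(X.squared = x2, p.value = p))
```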

Comparison of adult vs. child regularizers on day 4:

select appropriate dataset:

regularizers.day4 = subset(exp1.entropy.allnouns, Day  == 4)

calculate number of regularizers and run chi-square test:

regularizers.day4$Consistency = factor(regularizers.day4$Consistency)
x =table(regularizers.day4$PLowerTotalEntropy, regularizers.day4$AgeGroup)
x

##    
##     adult child
##   0    30    19
##   1     0    11

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 13.469, df = 1, p-value = 0.0002425

Lexical conditioning

select appropriate dataset

We need to turn the exp1.entropy.allnouns dataset to wide-format. We do that using the dcast function.

exp1.lexical.MI.allnouns <- dcast(exp1.entropy.allnouns, Participant + AgeGroup ~ Day, value.var="MutualInformationLexical")

statistical analyses

lexical MI analyses across nouns

Run bootstrapped ANOVA:

A = Age Group

B = Day

AB = Age Group by Day

z=bw2list(exp1.lexical.MI.allnouns,2,c(3:4))

## [1] "Levels for between factor:"
## [1] adult child
## Levels: adult child

RobustAnova=bwtrimbt(2,2,z,nboot=1000,tr=.05)

## [1] "Taking bootstrap samples. Please wait."

RobustAnova

## $p.value.A
## [1] 0.006
## 
## $p.value.B
## [1] 0.24
## 
## $p.value.AB
## [1] 0.362

Establish whether the intercept is different from chance (0.079641531):

Aggregate data:

exp1.lexical.MI.overall = aggregate(MutualInformationLexical ~ Participant, exp1.entropy.allnouns, FUN=mean)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp1.lexical.MI.overall$MutualInformationLexical, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##        Mean       Lower       Upper 
## 0.015502573 0.008847536 0.022749106

Establish whether children showed below chance lexical MI

Select appropriate dataset:

exp1.lexical.MI.overall.kids = subset(exp1.entropy.allnouns, AgeGroup =="child")

Aggregate data:

exp1.lexical.MI.overall.kids = aggregate(MutualInformationLexical ~ Participant, exp1.lexical.MI.overall.kids, FUN=mean)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp1.lexical.MI.overall.kids$MutualInformationLexical, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##       Mean      Lower      Upper 
## 0.02606044 0.01637575 0.03882741

Establish whether adults showed below chance lexical MI

Select appropriate dataset:

exp1.lexical.MI.overall.adults = subset(exp1.entropy.allnouns, AgeGroup =="adult")

Aggregate data:

exp1.lexical.MI.overall.adults = aggregate(MutualInformationLexical ~ Participant, exp1.lexical.MI.overall.adults, FUN=mean)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp1.lexical.MI.overall.adults$MutualInformationLexical, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##        Mean       Lower       Upper 
## 0.004944709 0.000798565 0.012334186

lexical MI analyses with the additional within-subject factor noun type

Select appropriate dataset:

exp1.entropy.old.new.nouns = subset(all.entropy.data, Consistency == "consistent" & LexicalItems !="all")

We need to turn the data to wide-format. We do that using the dcast function.

exp1.lexical.MI.old.new.nouns <- dcast(exp1.entropy.old.new.nouns, Participant + AgeGroup ~ Day * LexicalItems, value.var="MutualInformationLexical")

Run bootstrapped ANOVA:

A = Age Group

B = Day

C = Noun Type

AB = Age Group by Day

AC = Age Group by Noun Type

BC = Day by Noun Type

ABC = Age Group by Day by Noun Type

z=bw2list(exp1.lexical.MI.old.new.nouns,2,c(3:6))

## [1] "Levels for between factor:"
## [1] adult child
## Levels: adult child

RobustAnova=bwwtrimbt(2,2,2,z,nboot=1000,est=mom,tr=.05)

RobustAnova

## $p.value.A
## [1] 0.001
## 
## $p.value.B
## [1] 0.682
## 
## $p.value.C
## [1] 0.218
## 
## $p.value.AB
## [1] 0.859
## 
## $p.value.AC
## [1] 0.371
## 
## $p.value.BC
## [1] 0.145
## 
## $p.value.ABC
## [1] 0.215

Analyses over number of participants showing evidence of significant regularization:

There are no participants whose MI differed from that predicted by chance; thus, no analyses were carried out on lexical MI.

Experiment 2 (partial condition)

Production data: baseline

select appropriate dataset

Select production data from the partial condition:

exp2.data = subset(all.production.data, consistency == "partial")

Filter out responses where participants produced an incorrect noun:

exp2.data = subset(exp2.data, noun_correct ==1)

Create a column that codes whether an ‘other’ response was produced:

exp2.data$det_other <- 0
exp2.data$det_other[exp2.data$pt_logical_det_intended == "other"] <- 1
exp2.data$det_other[exp2.data$pt_logical_det_intended == "none"] <- 1

calculate proportion of excluded trials

round(with(exp2.data, tapply(det_other, list(adult_child), mean, na.rm=T)),4)

##  adult  child 
## 0.0000 0.0835

statistical analyses

lme analyses: children

Select appropriate dataset:

exp2.child.data = subset(exp2.data, adult_child == "child")

Ensure day is coded as a factor and center the variables of interest using the lizCenter function:

exp2.child.data$day = factor(exp2.child.data$day)
exp2.child.data.ct = lizCenter(exp2.child.data, list("day","oldnew"))

Run the lme model:

exp2.child.data.excluded.lmer = glmer(det_other~ day.ct * oldnew.ct + (day.ct*oldnew.ct|pt_code), family = binomial, data = exp2.child.data.ct, control=glmerControl(optimizer = "bobyqa"))

kable(summary(exp2.child.data.excluded.lmer)$coefficients, 
      digits = 2)

	Estimate	Std. Error	z value	Pr(>\|z\|)
(Intercept)	-9.18	1.90	-4.83	0.00
day.ct	-2.43	2.78	-0.87	0.38
oldnew.ct	-0.94	1.95	-0.48	0.63
day.ct:oldnew.ct	-5.55	3.89	-1.43	0.15

Production data: choice of particle

select appropriate datasets

Children & adults:

Filter out ‘other responses’:

exp2.both.production = subset(all.production.data, pt_logical_det_intended =="det1" | pt_logical_det_intended =="det2")

Filter out incorrect noun responses:

exp2.both.production = subset(exp2.both.production, noun_correct ==1)

Filter out data from experiments 1 and 3:

exp2.both.production = subset(exp2.both.production, consistency== "partial")

children

exp2.child.production = subset(exp2.both.production, adult_child == "child")

adults

exp2.adult.production <-subset(exp2.both.production, adult_child == "adult")

calculate means for Figure 6

Aggregate data:

aggregated.production_partial = aggregate(correct_intended ~ pt_code + adult_child + day + oldnew, exp2.both.production, FUN=mean)

Calculate means: Proportion of majority particle usage by noun type and day in experiment 2

round(with(aggregated.production_partial, tapply(correct_intended, list(day, oldnew, adult_child), mean, na.rm=T)),2)

## , , adult
## 
##    new  old
## 1 0.53 0.55
## 4 0.64 0.60
## 
## , , child
## 
##    new  old
## 1 0.55 0.53
## 4 0.53 0.54

statistical analyses

lme analyses: children & adults

Center variables of interest using the lizCenter function:

exp2.both.production.ct = lizCenter(exp2.both.production, list("day","oldnew", "adult_child"))

Run the lme:

exp2.both.production.lmer = glmer(correct_intended ~ day.ct * oldnew.ct * adult_child.ct + (day.ct*oldnew.ct |pt_code), family = binomial, control=glmerControl(optimizer = "bobyqa"), data = exp2.both.production.ct)

kable(summary(exp2.both.production.lmer)$coefficients, 
      digits = 2)

	Estimate	Std. Error	z value	Pr(>\|z\|)
(Intercept)	0.28	0.08	3.55	0.00
day.ct	0.08	0.04	1.99	0.05
oldnew.ct	-0.04	0.05	-0.73	0.46
adult_child.ct	-0.22	0.16	-1.38	0.17
day.ct:oldnew.ct	-0.02	0.03	-0.62	0.53
day.ct:adult_child.ct	-0.14	0.08	-1.78	0.08
oldnew.ct:adult_child.ct	0.02	0.10	0.20	0.84
day.ct:oldnew.ct:adult_child.ct	0.15	0.07	2.20	0.03

lme analyses: adults

Center the variables of interest using the lizCenter function:

exp2.adult.production.ct = lizCenter(exp2.adult.production, list("day","oldnew"))

Run the model:

exp2.adult.production.lmer = glmer(correct_intended ~ day.ct * oldnew.ct + (day.ct*oldnew.ct|pt_code), family = binomial, data = exp2.adult.production.ct)

kable(summary(exp2.adult.production.lmer)$coefficients, 
      digits = 2)

	Estimate	Std. Error	z value	Pr(>\|z\|)
(Intercept)	0.39	0.12	3.26	0.00
day.ct	0.15	0.06	2.34	0.02
oldnew.ct	-0.04	0.08	-0.50	0.62
day.ct:oldnew.ct	-0.09	0.05	-1.66	0.10

Compare mean performance on day 1 against chance (50%):

exp2.adult.production.Day1 <-subset(exp2.adult.production, day =="1")

Center the variables of interest using the lizCenter function:

exp2.adult.production.Day1.ct = lizCenter(exp2.adult.production.Day1, list("oldnew"))

Run the model:

exp2.adult.production.Day1.lmer = glmer(correct_intended ~ oldnew.ct + (oldnew.ct|pt_code), family = binomial, data = exp2.adult.production.Day1.ct)

kable(summary(exp2.adult.production.Day1.lmer)$coefficients, 
      digits = 2)

	Estimate	Std. Error	z value	Pr(>\|z\|)
(Intercept)	0.17	0.14	1.22	0.22
oldnew.ct	0.09	0.10	0.97	0.33

Probability matching analyses:

Calculate the log odds of the input proportion in the partial experiment (75%):

log(75/25)

## [1] 1.098612

Calculate the intercept’s z-value for a model comparing adults’ performance on day 1 (54%) against 75%:

(0.17033  - 1.098612)/0.13930

## [1] -6.663905
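The calculation above shifts the null hypothesis from 50% (a log-odds intercept of 0) to 75% (a log-odds intercept of log(75/25)). The sketch below repeats it with named quantities and adds a two-sided p-value from the normal approximation. The p-value step and the variable names are additions for illustration; the estimate and standard error are the values used above.

```r
# Log odds of the 75% input proportion; identical to log(75/25)
target <- qlogis(0.75)

# Day-1 intercept and its standard error, as used in the calculation above
estimate <- 0.17033
se       <- 0.13930

# z-value for H0: intercept equals the log odds of 75%
z <- (estimate - target) / se

# Two-sided p-value under the normal approximation (an added step)
p <- 2 * pnorm(-abs(z))

# The intercept back on the proportion scale
prop <- plogis(estimate)

print(c(target = target, z = z, p = p, prop = prop))
```

Because the predictors are centered, the intercept describes performance averaged over noun type, which is what makes a single comparison against 75% meaningful.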

Compare mean performance on day 4 against chance (50%):

exp2.adult.production.Day4 <-subset(exp2.adult.production, day =="4")

Center the variables of interest using the lizCenter function:

exp2.adult.production.Day4.ct = lizCenter(exp2.adult.production.Day4, list("oldnew"))

Run the model:

exp2.adult.production.Day4.lmer = glmer(correct_intended ~ oldnew.ct + (oldnew.ct|pt_code), family = binomial, data = exp2.adult.production.Day4.ct)

kable(summary(exp2.adult.production.Day4.lmer)$coefficients, 
      digits = 2)

	Estimate	Std. Error	z value	Pr(>|z|)
(Intercept)	0.62	0.17	3.64	0.00
oldnew.ct	-0.18	0.13	-1.37	0.17

Calculate the intercept’s z-value for a model comparing adults’ performance on day 4 (62%) against 75%:

(0.6243   - 1.098612)/0.1716

## [1] -2.764056

lme analyses: children

Center the variables of interest using the lizCenter function:

exp2.child.production.ct = lizCenter(exp2.child.production, list("day","oldnew"))

Run the lme:

exp2.child.production.lmer = glmer(correct_intended ~ day.ct * oldnew.ct + (day.ct*oldnew.ct|pt_code), family = binomial, control=glmerControl(optimizer = "bobyqa"), data = exp2.child.production.ct)

kable(summary(exp2.child.production.lmer)$coefficients, 
      digits = 2)

	Estimate	Std. Error	z value	Pr(>|z|)
(Intercept)	0.16	0.10	1.52	0.13
day.ct	0.00	0.05	0.09	0.93
oldnew.ct	-0.03	0.07	-0.43	0.67
day.ct:oldnew.ct	0.06	0.05	1.16	0.25

Calculate the intercept’s z-value for a model comparing children’s performance (53%) against 75%:

(0.158621 - 1.098612)/0.104222

## [1] -9.019123

2afc data

select appropriate dataset

children & adults

exp2.both.2afc = subset(all.afc.data, consistency=="partial")

children

exp2.child.2afc = subset(exp2.both.2afc, adult_child == "child")

adults

exp2.adult.2afc = subset(exp2.both.2afc, adult_child == "adult")

calculate means for Figure 7

Aggregate data:

aggregated.2afc_partial = aggregate(correct ~ pt_code + adult_child + old_new, exp2.both.2afc, FUN=mean)

Calculate means: proportion of majority particle choice by noun type in experiment 2

round(with(aggregated.2afc_partial, tapply(correct, list(adult_child, old_new), mean, na.rm=T)),2)

##        new  old
## adult 0.60 0.65
## child 0.57 0.55
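The two-step pattern used here (aggregate to one mean per participant, then average those means per cell) can be seen on a small made-up data set. The data frame `toy` and its values are hypothetical; only the column names mirror the ones used above.

```r
# Hypothetical trial-level data: two participants per age group, 0/1 responses
toy <- data.frame(
  pt_code     = rep(c("a1", "a2", "c1", "c2"), each = 4),
  adult_child = rep(c("adult", "child"), each = 8),
  old_new     = rep(c("old", "old", "new", "new"), times = 4),
  correct     = c(1, 1, 1, 0,  1, 0, 0, 0,  1, 0, 1, 1,  0, 0, 1, 0)
)

# Step 1: one mean per participant and noun type
by_pt <- aggregate(correct ~ pt_code + adult_child + old_new, toy, FUN = mean)

# Step 2: average the participant means within each age group x noun type cell
cell_means <- with(by_pt, tapply(correct, list(adult_child, old_new), mean))
round(cell_means, 2)
```

Averaging participant means gives every participant equal weight, regardless of how many trials each one contributed.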

statistical analyses

lme analyses: children & adults

Center the variables of interest using the lizCenter function:

exp2.both.2afc.ct = lizCenter(exp2.both.2afc, list("old_new", "adult_child"))

Run the lme:

exp2.both.2afc.lmer = glmer(correct~ old_new.ct * adult_child.ct + (old_new.ct |pt_code), family = binomial, data = exp2.both.2afc.ct)

kable(summary(exp2.both.2afc.lmer)$coefficients, 
      digits = 2)

	Estimate	Std. Error	z value	Pr(>|z|)
(Intercept)	0.40	0.09	4.53	0.00
old_new.ct	0.07	0.14	0.54	0.59
adult_child.ct	-0.31	0.18	-1.76	0.08
old_new.ct:adult_child.ct	-0.29	0.27	-1.08	0.28

lme analyses: children

Center the variables of interest using the lizCenter function:

exp2.child.2afc.ct = lizCenter(exp2.child.2afc, list("old_new"))

Run the lme:

exp2.child.2afc.lmer = glmer(correct~ old_new.ct + (old_new.ct |pt_code), family = binomial, data = exp2.child.2afc.ct)

kable(summary(exp2.child.2afc.lmer)$coefficients, 
      digits = 2)

	Estimate	Std. Error	z value	Pr(>|z|)
(Intercept)	0.24	0.11	2.19	0.03
old_new.ct	-0.07	0.19	-0.38	0.70

lme analyses: adults

Center the variables of interest using the lizCenter function:

exp2.adult.2afc.ct = lizCenter(exp2.adult.2afc, list("old_new"))

Run the lme:

exp2.adult.2afc.lmer = glmer(correct~ old_new.ct + (old_new.ct |pt_code), family = binomial, data = exp2.adult.2afc.ct)

kable(summary(exp2.adult.2afc.lmer)$coefficients, 
      digits = 2)

	Estimate	Std. Error	z value	Pr(>|z|)
(Intercept)	0.58	0.14	4.03	0.00
old_new.ct	0.23	0.22	1.08	0.28

Production data: unaware participants

select appropriate datasets

children & adults

exp2.both.production.unaware = subset(exp2.both.production, aware=="no")

children

exp2.child.production.unaware = subset(exp2.child.production, aware=="no")

adults

exp2.adult.production.unaware = subset(exp2.adult.production, aware=="no")

calculate means for Figure C12

Aggregate data:

aggregated.production_partial_unaware = aggregate(correct_intended ~ pt_code + adult_child + day + oldnew, exp2.both.production.unaware, FUN=mean)

Calculate means: proportion of majority particle usage by noun type and day in experiment 2, unaware children and adults

round(with(aggregated.production_partial_unaware, tapply(correct_intended, list(day, oldnew, adult_child), mean, na.rm=T)),2)

## , , adult
## 
##    new  old
## 1 0.48 0.49
## 4 0.54 0.47
## 
## , , child
## 
##    new  old
## 1 0.54 0.51
## 4 0.53 0.53

statistical analyses

lme analyses: adults

Center the variables of interest using the lizCenter function:

exp2.adult.production.unaware.ct = lizCenter(exp2.adult.production.unaware , list("day","oldnew"))

Run the lme model:

exp2.adult.production.unaware.lmer = glmer(correct_intended ~ day.ct * oldnew.ct + (day.ct*oldnew.ct|pt_code), family = binomial, data = exp2.adult.production.unaware.ct)

kable(summary(exp2.adult.production.unaware.lmer)$coefficients, 
      digits = 2)

	Estimate	Std. Error	z value	Pr(>|z|)
(Intercept)	-0.03	0.09	-0.30	0.76
day.ct	0.04	0.04	0.81	0.42
oldnew.ct	-0.12	0.10	-1.25	0.21
day.ct:oldnew.ct	-0.11	0.07	-1.62	0.11

lme analyses: children

Center the variables of interest using the lizCenter function:

exp2.child.production.unaware.ct = lizCenter(exp2.child.production.unaware, list("day","oldnew"))

Run the lme model:

exp2.child.production.unaware.lmer = glmer(correct_intended ~ day.ct * oldnew.ct + (day.ct*oldnew.ct|pt_code), family = binomial, data = exp2.child.production.unaware.ct)

kable(summary(exp2.child.production.unaware.lmer)$coefficients, 
      digits = 2)

	Estimate	Std. Error	z value	Pr(>|z|)
(Intercept)	0.10	0.09	1.10	0.27
day.ct	0.05	0.04	1.16	0.24
oldnew.ct	-0.04	0.08	-0.57	0.57
day.ct:oldnew.ct	0.05	0.05	0.92	0.36

2afc data: unaware participants

select appropriate dataset

children & adults

exp2.both.2afc.unaware = subset(exp2.both.2afc, aware=="no")

children

exp2.child.2afc.unaware = subset(exp2.child.2afc, aware=="no")

adults

exp2.adult.2afc.unaware = subset(exp2.adult.2afc, aware=="no")

calculate means for Figure C13

Aggregate data:

aggregated.2afc_partial_unaware = aggregate(correct ~ pt_code + adult_child + old_new, exp2.both.2afc.unaware, FUN=mean)

Calculate means: proportion of majority particle choice by noun type in experiment 2, unaware children and adults

round(with(aggregated.2afc_partial_unaware , tapply(correct, list(adult_child, old_new), mean, na.rm=T)),2)

##        new  old
## adult 0.52 0.56
## child 0.56 0.55

statistical analyses

lme analyses: adults

Center variables of interest using the lizCenter function:

exp2.adult.2afc.unaware.ct = lizCenter(exp2.adult.2afc.unaware, list("old_new"))

Run the lme model:

exp2.adult.2afc.unaware.lmer = glmer(correct~ old_new.ct + (old_new.ct |pt_code), family = binomial, data = exp2.adult.2afc.unaware.ct)

kable(summary(exp2.adult.2afc.unaware.lmer)$coefficients, 
      digits = 2)

	Estimate	Std. Error	z value	Pr(>|z|)
(Intercept)	0.15	0.12	1.27	0.20
old_new.ct	0.14	0.26	0.53	0.59

lme analyses: children

Center the variables of interest using the lizCenter function:

exp2.child.2afc.unaware.ct = lizCenter(exp2.child.2afc.unaware , list("old_new"))

Run the lme model:

exp2.child.2afc.unaware.lmer = glmer(correct~ old_new.ct + (old_new.ct |pt_code), family = binomial, data = exp2.child.2afc.unaware.ct)

kable(summary(exp2.child.2afc.unaware.lmer)$coefficients, 
      digits = 2)

	Estimate	Std. Error	z value	Pr(>|z|)
(Intercept)	0.21	0.10	2.21	0.03
old_new.ct	-0.02	0.19	-0.10	0.92

Overall regularization

select appropriate dataset

children & adults:

exp2.entropy.allnouns = subset(all.entropy.data, Consistency == "partial" & LexicalItems =="all")

statistical analyses

whole-language entropy analyses

We need to convert the data to wide format, using the dcast function.

exp2.entropy.total.allnouns <- dcast(exp2.entropy.allnouns, Participant + AgeGroup ~ Day, value.var="TotalEntropy")
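A toy example may make the reshaping step concrete. The data frame `long` and its values are made up for illustration; the column names follow the ones used above, and the call relies on `dcast` from the reshape2 package loaded at the top of this document.

```r
library(reshape2)

# Hypothetical long-format data: one row per participant and day
long <- data.frame(
  Participant  = rep(c("p1", "p2"), each = 2),
  AgeGroup     = rep(c("adult", "child"), each = 2),
  Day          = rep(c(1, 4), times = 2),
  TotalEntropy = c(0.98, 0.99, 0.30, 0.70)
)

# One row per participant, one column per day
wide <- dcast(long, Participant + AgeGroup ~ Day, value.var = "TotalEntropy")
wide

# The day columns are named "1" and "4", so they need quotes or backticks
wide$dif <- wide$`4` - wide$`1`
wide
```

Because the new column names are not syntactic R names, later code in this document refers to them as `$"4"` and `$"1"`.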

Run bootstrapped ANOVA:

A = Age Group

B = Day

AB = Age Group by Day

z=bw2list(exp2.entropy.total.allnouns,2,c(3:4))

## [1] "Levels for between factor:"
## [1] adult child
## Levels: adult child

RobustAnova=bwtrimbt(2,2,z,nboot=1000,tr=.05)

## [1] "Taking bootstrap samples. Please wait."

RobustAnova

## $p.value.A
## [1] 0
## 
## $p.value.B
## [1] 0.001
## 
## $p.value.AB
## [1] 0.001

Establish whether the intercept is different from chance (0.983212597):

Aggregate data:

exp2.entropy.total.overall = aggregate(TotalEntropy ~ Participant, exp2.entropy.allnouns, FUN=mean)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp2.entropy.total.overall$TotalEntropy, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##      Mean     Lower     Upper 
## 0.7428151 0.6522299 0.8303446
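For readers unfamiliar with bootstrapped intervals, the sketch below shows a percentile bootstrap of the mean in base R. It assumes that `smean.cl.boot` uses the percentile method; the vector `x` is made-up data, and only the last three lines use numbers reported above.

```r
set.seed(1)  # makes the resampling reproducible

# Hypothetical per-participant scores
x <- c(0.31, 0.52, 0.95, 0.88, 0.40, 0.77, 0.99, 0.63, 0.70, 0.85)

# Resample with replacement B times and take the mean of each resample
B <- 1000
boot_means <- replicate(B, mean(sample(x, replace = TRUE)))

# Percentile interval: the 2.5% and 97.5% quantiles of the resampled means
ci <- quantile(boot_means, c(0.025, 0.975))
c(Mean = mean(x), Lower = ci[[1]], Upper = ci[[2]])

# Compare the interval reported above with the chance value given in the text
chance   <- 0.983212597
reported <- c(Lower = 0.6522299, Upper = 0.8303446)
outside  <- chance < reported[["Lower"]] || chance > reported[["Upper"]]
outside
```

An interval that excludes the chance value is the criterion used here for concluding that the mean differs from chance.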

Children day 1 vs. day 4

select appropriate dataset:

exp2.entropy.allnouns.kids = subset(exp2.entropy.allnouns, AgeGroup =="child")
exp2.entropy.allnouns.kids <- dcast(exp2.entropy.allnouns.kids, Participant + AgeGroup ~ Day, value.var="TotalEntropy")
exp2.entropy.allnouns.kids$dif <- (exp2.entropy.allnouns.kids$"4" - exp2.entropy.allnouns.kids$"1")

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp2.entropy.allnouns.kids$dif, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##      Mean     Lower     Upper 
## 0.3378333 0.1804425 0.5062946

Adults day 1 vs. day 4

exp2.entropy.allnouns.adults = subset(exp2.entropy.allnouns, AgeGroup =="adult")
exp2.entropy.allnouns.adults <- dcast(exp2.entropy.allnouns.adults, Participant + AgeGroup ~ Day, value.var="TotalEntropy")
exp2.entropy.allnouns.adults$dif <- (exp2.entropy.allnouns.adults$"4" - exp2.entropy.allnouns.adults$"1")

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp2.entropy.allnouns.adults$dif, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##         Mean        Lower        Upper 
##  0.005427691 -0.003800743  0.014542019

Establish whether children showed above chance regularization on day 1

select appropriate dataset:

exp2.entropy.total.overall.kids1=subset(exp2.entropy.allnouns, AgeGroup =="child" & LexicalItems =="all" & Day == 1)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp2.entropy.total.overall.kids1$TotalEntropy, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##      Mean     Lower     Upper 
## 0.3308676 0.1893508 0.4783758

Establish whether children showed above chance regularization on day 4

select appropriate dataset:

exp2.entropy.total.overall.kids4=subset(exp2.entropy.allnouns, AgeGroup =="child" & LexicalItems =="all" & Day == 4)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp2.entropy.total.overall.kids4$TotalEntropy, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##      Mean     Lower     Upper 
## 0.6687009 0.5340549 0.7992812

Establish whether adults showed above chance regularization (across days)

select appropriate dataset:

exp2.entropy.total.overall.adults_across = subset(exp2.entropy.allnouns, AgeGroup =="adult" & LexicalItems =="all")

Aggregate data:

exp2.entropy.total.overall.adults_across = aggregate(TotalEntropy ~ Participant, exp2.entropy.total.overall.adults_across, FUN=mean)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp2.entropy.total.overall.adults_across$TotalEntropy, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##      Mean     Lower     Upper 
## 0.9858460 0.9795069 0.9914940

whole-language entropy analyses with the additional within-subject factor noun type

select appropriate dataset:

exp2.entropy.old.new.nouns = subset(all.entropy.data, Consistency == "partial" & LexicalItems !="all")

We need to convert the data to wide format, using the dcast function.

exp2.entropy.total.old.new.nouns <- dcast(exp2.entropy.old.new.nouns, Participant + AgeGroup ~ Day * LexicalItems, value.var="TotalEntropy")

Run bootstrapped ANOVA:

A = Age Group

B = Day

C = Noun Type

AB = Age Group by Day

AC = Age Group by Noun Type

BC = Day by Noun Type

ABC = Age Group by Day by Noun Type

z=bw2list(exp2.entropy.total.old.new.nouns,2,c(3:6))

## [1] "Levels for between factor:"
## [1] adult child
## Levels: adult child

RobustAnova=bwwtrimbt(2,2,2,z,nboot=1000,est=mom,tr=.05)

RobustAnova

## $p.value.A
## [1] 0
## 
## $p.value.B
## [1] 0
## 
## $p.value.C
## [1] 0.002
## 
## $p.value.AB
## [1] 0.001
## 
## $p.value.AC
## [1] 0
## 
## $p.value.BC
## [1] 0.188
## 
## $p.value.ABC
## [1] 0.127

children: old vs. new nouns

exp2.total.entropy.kids.nouns =subset(exp2.entropy.old.new.nouns, AgeGroup =="child")

Aggregate data:

exp2.total.entropy.kids.nouns <- dcast(exp2.total.entropy.kids.nouns, Participant ~ LexicalItems, value.var="TotalEntropy", fun.aggregate=mean)

exp2.total.entropy.kids.nouns$dif <- (exp2.total.entropy.kids.nouns$new - exp2.total.entropy.kids.nouns$old)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp2.total.entropy.kids.nouns$dif, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##        Mean       Lower       Upper 
## -0.06316156 -0.11027687 -0.01030189

adults: old vs. new nouns

exp2.total.entropy.adults.nouns =subset(exp2.entropy.old.new.nouns, AgeGroup =="adult")

Aggregate data:

exp2.total.entropy.adults.nouns <- dcast(exp2.total.entropy.adults.nouns, Participant ~ LexicalItems, value.var="TotalEntropy", fun.aggregate=mean)

exp2.total.entropy.adults.nouns$dif <- (exp2.total.entropy.adults.nouns$new - exp2.total.entropy.adults.nouns$old)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp2.total.entropy.adults.nouns$dif, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##         Mean        Lower        Upper 
## -0.008777248 -0.035124267  0.015417382

Analyses over number of participants showing evidence of significant regularization:

Comparison of adult vs. child regularizers on day 1:

select appropriate dataset:

regularizers.day1 = subset(exp2.entropy.allnouns, Day  == 1)

calculate number of regularizers and run chi-square test:

x =table(regularizers.day1$PLowerTotalEntropy, regularizers.day1$AgeGroup)
x

##    
##     adult child
##   0    27     7
##   1     3    23

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 27.149, df = 1, p-value = 1.883e-07
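The test can be reproduced directly from the printed counts, without the underlying data set. The sketch below rebuilds the table by hand; the dimension labels are illustrative, and reading row `1` as the participants counted as regularizers is an assumption based on the surrounding text.

```r
# Counts copied from the table above; matrix() fills column by column
x <- matrix(c(27, 3, 7, 23), nrow = 2,
            dimnames = list(PLowerTotalEntropy = c("0", "1"),
                            AgeGroup = c("adult", "child")))

# Pearson chi-square test without continuity correction, as above
res <- chisq.test(x, correct = FALSE)
res$statistic   # X-squared
res$p.value
res$expected    # expected counts under independence
```

Inspecting `res$expected` is a quick way to check that no cell has a small expected count before relying on the chi-square approximation.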

Comparison of adult vs. child regularizers on day 4:

select appropriate dataset:

regularizers.day4 = subset(exp2.entropy.allnouns, Day  == 4)

calculate number of regularizers and run chi-square test:

x =table(regularizers.day4$PLowerTotalEntropy, regularizers.day4$AgeGroup)
x

##    
##     adult child
##   0    30    12
##   1     0    18

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 25.714, df = 1, p-value = 3.959e-07

Comparison of numbers of child regularizers in experiment 1 vs. experiment 2 on day 1:

select appropriate dataset:

exp1_vs.exp2_child_day1 = subset(all.entropy.data, Consistency != "inconsistent" &AgeGroup == "child" & LexicalItems == "all" & Day == 1)
exp1_vs.exp2_child_day1$Consistency = factor(exp1_vs.exp2_child_day1$Consistency)

calculate number of regularizers and run chi-square test:

x =table(exp1_vs.exp2_child_day1$PLowerTotalEntropy, exp1_vs.exp2_child_day1$Consistency)
x

##    
##     consistent partial
##   0         15       7
##   1         13      23

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 5.6246, df = 1, p-value = 0.01771

Comparison of numbers of child regularizers in experiment 1 vs. experiment 2 on day 4:

select appropriate dataset:

exp1_vs.exp2_child_day4 = subset(all.entropy.data, Consistency != "inconsistent" &AgeGroup == "child" & LexicalItems == "all" & Day == 4)
exp1_vs.exp2_child_day4$Consistency = factor(exp1_vs.exp2_child_day4$Consistency)

calculate number of regularizers and run chi-square test:

x =table(exp1_vs.exp2_child_day4$PLowerTotalEntropy, exp1_vs.exp2_child_day4$Consistency)
x

##    
##     consistent partial
##   0         19      12
##   1         11      18

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 3.2703, df = 1, p-value = 0.07054

Comparison of numbers of adult regularizers in experiment 1 vs. experiment 2 on day 1:

select appropriate dataset:

exp1_vs.exp2_adult_day1 = subset(all.entropy.data, Consistency != "inconsistent" &AgeGroup == "adult" & LexicalItems == "all" & Day == 1)
exp1_vs.exp2_adult_day1$Consistency = factor(exp1_vs.exp2_adult_day1$Consistency)

calculate number of regularizers and run Fisher's exact test:

x =table(exp1_vs.exp2_adult_day1$PLowerTotalEntropy, exp1_vs.exp2_adult_day1$Consistency)
fisher.test(x)

## 
##  Fisher's Exact Test for Count Data
## 
## data:  x
## p-value = 1
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##   0.1633985 19.8400365
## sample estimates:
## odds ratio 
##   1.544236

Comparison of numbers of adult regularizers in experiment 1 vs. experiment 2 on day 4:

select appropriate dataset:

exp1_vs.exp2_adult_day4 = subset(all.entropy.data, Consistency != "inconsistent" &AgeGroup == "adult" & LexicalItems == "all" & Day == 4)

calculate number of regularizers

exp1_vs.exp2_adult_day4$Consistency = factor(exp1_vs.exp2_adult_day4$Consistency)
x =table(exp1_vs.exp2_adult_day4$PLowerTotalEntropy, exp1_vs.exp2_adult_day4$Consistency)
x

##    
##     consistent partial
##   0         30      30

Lexical conditioning

We need to convert the exp2.entropy.allnouns dataset to wide format, using the dcast function.

exp2.lexical.MI.allnouns <- dcast(exp2.entropy.allnouns, Participant + AgeGroup ~ Day, value.var="MutualInformationLexical")

statistical analyses

lexical MI analyses across nouns

Run bootstrapped ANOVA:

A = Age Group

B = Day

AB = Age Group by Day

z=bw2list(exp2.lexical.MI.allnouns,2,c(3:4))

## [1] "Levels for between factor:"
## [1] adult child
## Levels: adult child

RobustAnova=bwtrimbt(2,2,z,nboot=1000,tr=.05)

## [1] "Taking bootstrap samples. Please wait."

RobustAnova

## $p.value.A
## [1] 0
## 
## $p.value.B
## [1] 0.017
## 
## $p.value.AB
## [1] 0.003

Establish whether the intercept is different from chance (0.077145621):

Aggregate data:

exp2.lexical.MI.overall = aggregate(MutualInformationLexical ~ Participant, exp2.entropy.allnouns, FUN=mean)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp2.lexical.MI.overall$MutualInformationLexical, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##      Mean     Lower     Upper 
## 0.1345434 0.1028523 0.1691494

Children day 1 vs. day 4

exp2.lexical.MI.overall.kids = subset(exp2.entropy.allnouns, AgeGroup =="child")
exp2.lexical.MI.overall.kids <- dcast(exp2.lexical.MI.overall.kids, Participant + AgeGroup ~ Day, value.var="MutualInformationLexical")
exp2.lexical.MI.overall.kids$dif <- (exp2.lexical.MI.overall.kids$"4" - exp2.lexical.MI.overall.kids$"1")

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp2.lexical.MI.overall.kids$dif, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##        Mean       Lower       Upper 
##  0.01101793 -0.01384301  0.03484774

Adults day 1 vs. day 4

exp2.lexical.MI.overall.adults = subset(exp2.entropy.allnouns, AgeGroup =="adult")
exp2.lexical.MI.overall.adults <- dcast(exp2.lexical.MI.overall.adults, Participant + AgeGroup ~ Day, value.var="MutualInformationLexical")
exp2.lexical.MI.overall.adults$dif <- (exp2.lexical.MI.overall.adults$"4" - exp2.lexical.MI.overall.adults$"1")

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp2.lexical.MI.overall.adults$dif, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##        Mean       Lower       Upper 
## -0.15357050 -0.24044187 -0.07228839

adults day 1:

exp2.lexical.MI.overall.adults1=subset(exp2.entropy.allnouns, AgeGroup =="adult" & LexicalItems =="all" & Day == 1)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp2.lexical.MI.overall.adults1$MutualInformationLexical, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##      Mean     Lower     Upper 
## 0.2868074 0.2077213 0.3708436

adults day 4:

exp2.lexical.MI.overall.adults4=subset(exp2.entropy.allnouns, AgeGroup =="adult" & LexicalItems =="all" & Day == 4)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp2.lexical.MI.overall.adults4$MutualInformationLexical, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##       Mean      Lower      Upper 
## 0.13323690 0.09347458 0.17492717

Establish whether children showed below chance lexical MI

Select appropriate dataset:

exp2.lexical.MI.overall.kids = subset(exp2.entropy.allnouns, AgeGroup =="child")

Aggregate data:

exp2.lexical.MI.overall.kids = aggregate(MutualInformationLexical ~ Participant, exp2.lexical.MI.overall.kids, FUN=mean)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp2.lexical.MI.overall.kids$MutualInformationLexical, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##       Mean      Lower      Upper 
## 0.05906463 0.04366173 0.07621856

lexical MI analyses with the additional within-subject factor noun type

Select appropriate dataset:

exp2.entropy.old.new.nouns = subset(all.entropy.data, Consistency == "partial" & LexicalItems !="all")

We need to convert the data to wide format, using the dcast function.

exp2.lexical.MI.old.new.nouns <- dcast(exp2.entropy.old.new.nouns, Participant + AgeGroup ~ Day * LexicalItems, value.var="MutualInformationLexical")

Run bootstrapped ANOVA:

A = Age Group

B = Day

C = Noun Type

AB = Age Group by Day

AC = Age Group by Noun Type

BC = Day by Noun Type

ABC = Age Group by Day by Noun Type

z=bw2list(exp2.lexical.MI.old.new.nouns,2,c(3:6))

## [1] "Levels for between factor:"
## [1] adult child
## Levels: adult child

RobustAnova=bwwtrimbt(2,2,2,z,nboot=1000,est=mom,tr=.05)

RobustAnova

## $p.value.A
## [1] 0
## 
## $p.value.B
## [1] 0.011
## 
## $p.value.C
## [1] 0.16
## 
## $p.value.AB
## [1] 0.003
## 
## $p.value.AC
## [1] 0.756
## 
## $p.value.BC
## [1] 0.115
## 
## $p.value.ABC
## [1] 0.267

Analyses over number of participants showing evidence of significant regularization:

Comparison of adult vs. child regularizers on day 1:

select appropriate dataset:

regularizers.day1 = subset(exp2.entropy.allnouns, Day  == 1)

calculate number of regularizers and run chi-square test:

x =table(regularizers.day1$PHigherMutualInformationLexical, regularizers.day1$AgeGroup)
x

##    
##     adult child
##   0    15    30
##   1    15     0

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 20, df = 1, p-value = 7.744e-06

Comparison of adult vs. child regularizers on day 4:

select appropriate dataset:

regularizers.day4 = subset(exp2.entropy.allnouns, Day  == 4)

calculate number of regularizers and run chi-square test:

x =table(regularizers.day4$PHigherMutualInformationLexical, regularizers.day4$AgeGroup)
x

##    
##     adult child
##   0    21    29
##   1     9     1

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 7.68, df = 1, p-value = 0.005584

Comparison of numbers of adult regularizers in experiment 1 vs. experiment 2 on day 1:

select appropriate dataset:

exp1_vs.exp2_adult_day1 = subset(all.entropy.data, Consistency != "inconsistent" &AgeGroup == "adult" & LexicalItems == "all" & Day == 1)
exp1_vs.exp2_adult_day1$Consistency = factor(exp1_vs.exp2_adult_day1$Consistency)

calculate number of regularizers and run chi-square test:

x =table(exp1_vs.exp2_adult_day1$PHigherMutualInformationLexical, exp1_vs.exp2_adult_day1$Consistency)
x

##    
##     consistent partial
##   0         30      15
##   1          0      15

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 20, df = 1, p-value = 7.744e-06

Comparison of numbers of adult regularizers in experiment 1 vs. experiment 2 on day 4:

select appropriate dataset:

exp1_vs.exp2_adult_day4 = subset(all.entropy.data, Consistency != "inconsistent" &AgeGroup == "adult" & LexicalItems == "all" & Day == 4)
exp1_vs.exp2_adult_day4$Consistency = factor(exp1_vs.exp2_adult_day4$Consistency)

calculate number of regularizers and run Fisher's exact test:

x =table(exp1_vs.exp2_adult_day4$PHigherMutualInformationLexical, exp1_vs.exp2_adult_day4$Consistency)
x

##    
##     consistent partial
##   0         30      21
##   1          0       9

fisher.test(x)

## 
##  Fisher's Exact Test for Count Data
## 
## data:  x
## p-value = 0.001936
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##  2.435816      Inf
## sample estimates:
## odds ratio 
##        Inf
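The result can be reproduced from the printed counts. The sketch below also prints the expected counts, since small expected cells are a common reason to prefer Fisher's exact test over the chi-square test; that this motivated the choice here is an assumption, and the dimension labels are illustrative.

```r
# Counts copied from the table above; matrix() fills column by column
x <- matrix(c(30, 0, 21, 9), nrow = 2,
            dimnames = list(PHigherMutualInformationLexical = c("0", "1"),
                            Consistency = c("consistent", "partial")))

# Expected counts under independence; chisq.test() warns about small cells,
# so the warning is suppressed for this inspection step
expected <- suppressWarnings(chisq.test(x, correct = FALSE)$expected)
expected

# Fisher's exact test on the same table
fisher_p <- fisher.test(x)$p.value
fisher_p
```

The exact test does not rely on the large-sample approximation, so it remains valid when a cell count is zero, as it is here.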

Comparison of numbers of child regularizers in experiment 1 vs. experiment 2 on day 1:

select appropriate dataset:

exp1_vs.exp2_child_day1 = subset(all.entropy.data, Consistency != "inconsistent" &AgeGroup == "child" & LexicalItems == "all" & Day == 1)
exp1_vs.exp2_child_day1$Consistency = factor(exp1_vs.exp2_child_day1$Consistency)

calculate number of regularizers

x =table(exp1_vs.exp2_child_day1$PHigherMutualInformationLexical, exp1_vs.exp2_child_day1$Consistency)
x

##    
##     consistent partial
##   0         28      30

Comparison of numbers of child regularizers in experiment 1 vs. experiment 2 on day 4:

select appropriate dataset:

exp1_vs.exp2_child_day4 = subset(all.entropy.data, Consistency != "inconsistent" &AgeGroup == "child" & LexicalItems == "all" & Day == 4)
exp1_vs.exp2_child_day4$Consistency = factor(exp1_vs.exp2_child_day4$Consistency)

calculate number of regularizers and run Fisher's exact test:

x =table(exp1_vs.exp2_child_day4$PHigherMutualInformationLexical, exp1_vs.exp2_child_day4$Consistency)
x

##    
##     consistent partial
##   0         30      29
##   1          0       1

fisher.test(x)

## 
##  Fisher's Exact Test for Count Data
## 
## data:  x
## p-value = 1
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##  0.02564066        Inf
## sample estimates:
## odds ratio 
##        Inf

Experiment 3 (inconsistent condition)

Production data: baseline

In Experiment 3, there was no correct or speaker-appropriate form for production performance, so these analyses were not carried out.

Production data: choice of particle

In Experiment 3, there was no correct or speaker-appropriate form for production performance, so these analyses were not carried out.

2afc data

In Experiment 3, there was no correct or speaker-appropriate form for 2AFC performance, so these analyses were not carried out.

Overall regularization

select appropriate dataset

children & adults

exp3.entropy.allnouns = subset(all.entropy.data, Consistency == "inconsistent" & LexicalItems =="all")

statistical analyses

whole-language entropy analyses

We need to convert the data to wide format, using the dcast function.

exp3.entropy.total.allnouns <- dcast(exp3.entropy.allnouns, Participant + AgeGroup ~ Day, value.var="TotalEntropy")

Run bootstrapped ANOVA:

A = Age Group

B = Day

AB = Age Group by Day

z=bw2list(exp3.entropy.total.allnouns,2,c(3:4))

## [1] "Levels for between factor:"
## [1] adult child
## Levels: adult child

RobustAnova=bwtrimbt(2,2,z,nboot=1000,tr=.05)

## [1] "Taking bootstrap samples. Please wait."

RobustAnova

## $p.value.A
## [1] 0
## 
## $p.value.B
## [1] 0
## 
## $p.value.AB
## [1] 0.018

Establish whether the intercept is different from chance (0.988032873):

Aggregate data:

exp3.entropy.total.overall = aggregate(TotalEntropy ~ Participant, exp3.entropy.allnouns, FUN=mean)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp3.entropy.total.overall$TotalEntropy, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##      Mean     Lower     Upper 
## 0.7149996 0.6278927 0.7967219

Children day 1 vs. day 4

exp3.entropy.allnouns.kids = subset(exp3.entropy.allnouns, AgeGroup =="child")
exp3.entropy.allnouns.kids <- dcast(exp3.entropy.allnouns.kids, Participant + AgeGroup ~ Day, value.var="TotalEntropy")
exp3.entropy.allnouns.kids$dif <- (exp3.entropy.allnouns.kids$"4" - exp3.entropy.allnouns.kids$"1")

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp3.entropy.allnouns.kids$dif, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##      Mean     Lower     Upper 
## 0.3272345 0.1585868 0.4912957

Adults day 1 vs. day 4

exp3.entropy.allnouns.adults = subset(exp3.entropy.allnouns, AgeGroup =="adult")
exp3.entropy.allnouns.adults <- dcast(exp3.entropy.allnouns.adults, Participant + AgeGroup ~ Day, value.var="TotalEntropy")
exp3.entropy.allnouns.adults$dif <- (exp3.entropy.allnouns.adults$"4" - exp3.entropy.allnouns.adults$"1")

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp3.entropy.allnouns.adults$dif, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##        Mean       Lower       Upper 
## 0.079548262 0.005755025 0.178649313

Establish whether children showed above chance regularization on day 1

select appropriate dataset:

exp3.entropy.total.overall.kids1=subset(exp3.entropy.allnouns, AgeGroup =="child" & LexicalItems =="all" & Day == 1)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp3.entropy.total.overall.kids1$TotalEntropy, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##      Mean     Lower     Upper 
## 0.3492711 0.2138436 0.4854579

Establish whether children showed above chance regularization on day 4

select appropriate dataset:

exp3.entropy.total.overall.kids4=subset(exp3.entropy.allnouns, AgeGroup =="child" & LexicalItems =="all" & Day == 4)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp3.entropy.total.overall.kids4$TotalEntropy, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##      Mean     Lower     Upper 
## 0.6765057 0.5434385 0.7963216

Establish whether adults showed above chance regularization on day 1

select appropriate dataset:

exp3.entropy.total.overall.adults1=subset(exp3.entropy.allnouns, AgeGroup =="adult" & LexicalItems =="all" & Day == 1)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp3.entropy.total.overall.adults1$TotalEntropy, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##      Mean     Lower     Upper 
## 0.8773366 0.7490172 0.9739649

Establish whether adults showed above chance regularization on day 4

select appropriate dataset:

exp3.entropy.total.overall.adults4=subset(exp3.entropy.allnouns, AgeGroup =="adult" & LexicalItems =="all" & Day == 4)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp3.entropy.total.overall.adults4$TotalEntropy, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##      Mean     Lower     Upper 
## 0.9568848 0.8967360 0.9892241

whole-language entropy analyses with the additional within-subject factor noun type

exp3.entropy.old.new.nouns = subset(all.entropy.data, Consistency == "inconsistent" & LexicalItems !="all")

We need to convert the data to wide format, using the dcast function.

exp3.entropy.total.old.new.nouns <- dcast(exp3.entropy.old.new.nouns, Participant + AgeGroup ~ Day * LexicalItems, value.var="TotalEntropy")

Run bootstrapped ANOVA:

A = Age Group

B = Day

C = Noun Type

AB = Age Group by Day

AC = Age Group by Noun Type

BC = Day by Noun Type

ABC = Age Group by Day by Noun Type

z=bw2list(exp3.entropy.total.old.new.nouns,2,c(3:6))

## [1] "Levels for between factor:"
## [1] adult child
## Levels: adult child

RobustAnova=bwwtrimbt(2,2,2,z,nboot=1000,est=mom,tr=.05)

RobustAnova

## $p.value.A
## [1] 0
## 
## $p.value.B
## [1] 0
## 
## $p.value.C
## [1] 0.019
## 
## $p.value.AB
## [1] 0.007
## 
## $p.value.AC
## [1] 0.02
## 
## $p.value.BC
## [1] 0.755
## 
## $p.value.ABC
## [1] 0.372

children: old vs. new nouns

exp3.total.entropy.kids.nouns =subset(exp3.entropy.old.new.nouns, AgeGroup =="child")

Aggregate data:

exp3.total.entropy.kids.nouns <- dcast(exp3.total.entropy.kids.nouns, Participant ~ LexicalItems, value.var="TotalEntropy", fun.aggregate=mean)

exp3.total.entropy.kids.nouns$dif <- (exp3.total.entropy.kids.nouns$new - exp3.total.entropy.kids.nouns$old)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp3.total.entropy.kids.nouns$dif, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##        Mean       Lower       Upper 
## -0.08696759 -0.13883859 -0.03965704

adults: old vs. new nouns

exp3.total.entropy.adults.nouns =subset(exp3.entropy.old.new.nouns, AgeGroup =="adult")

Aggregate data:

exp3.total.entropy.adults.nouns <- dcast(exp3.total.entropy.adults.nouns, Participant ~ LexicalItems, value.var="TotalEntropy", fun.aggregate=mean)

exp3.total.entropy.adults.nouns$dif <- (exp3.total.entropy.adults.nouns$new - exp3.total.entropy.adults.nouns$old)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp3.total.entropy.adults.nouns$dif, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##        Mean       Lower       Upper 
##  0.00117915 -0.04742150  0.04787941

Analyses over number of participants showing evidence of significant regularization:

Comparison of adult vs. child regularizers on day 1:

select appropriate dataset:

regularizers.day1 = subset(exp3.entropy.allnouns, Day  == 1)

calculate number of regularizers and run chi-square test:

x =table(regularizers.day1$PLowerTotalEntropy, regularizers.day1$AgeGroup)
x

##    
##     adult child
##   0    22     4
##   1     8    26

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 21.991, df = 1, p-value = 2.739e-06
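The test statistic can be checked by hand from the printed counts. The sketch below re-enters the day 1 table as a matrix (the object names are hypothetical) and computes the uncorrected Pearson statistic directly from observed and expected counts.

```r
# Counts copied from the day 1 table
# (rows: 0 = non-regularizer, 1 = regularizer).
observed <- matrix(c(22, 8, 4, 26), nrow = 2,
                   dimnames = list(regularizer = c("0", "1"),
                                   AgeGroup = c("adult", "child")))

# Expected counts under independence: (row total * column total) / N.
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson statistic without continuity correction, and its p-value on 1 df.
x.squared <- sum((observed - expected)^2 / expected)
p.value <- pchisq(x.squared, df = 1, lower.tail = FALSE)

round(x.squared, 3)   # 21.991, matching chisq.test(x, correct = FALSE)
```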

Comparison of adult vs. child regularizers on day 4:

select appropriate dataset:

regularizers.day4 = subset(exp3.entropy.allnouns, Day  == 4)

calculate number of regularizers and run chi-square test:

x =table(regularizers.day4$PLowerTotalEntropy, regularizers.day4$AgeGroup)
x

##    
##     adult child
##   0    26    10
##   1     4    20

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 17.778, df = 1, p-value = 2.483e-05

Comparison of numbers of child regularizers in experiment 3 vs. experiment 1 on day 1:

select appropriate dataset:

exp3_vs.exp1_child_day1 = subset(all.entropy.data, Consistency =="consistent" | Consistency =="inconsistent")
exp3_vs.exp1_child_day1 = subset(exp3_vs.exp1_child_day1, AgeGroup == "child" & LexicalItems == "all" & Day == 1)
exp3_vs.exp1_child_day1$Consistency = factor(exp3_vs.exp1_child_day1$Consistency)

calculate number of regularizers and run chi-square test:

x =table(exp3_vs.exp1_child_day1$PLowerTotalEntropy, exp3_vs.exp1_child_day1$Consistency)
x

##    
##     consistent inconsistent
##   0         15            4
##   1         13           26

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 10.645, df = 1, p-value = 0.001103

Comparison of numbers of child regularizers in experiment 3 vs. experiment 1 on day 4:

select appropriate dataset:

exp3_vs.exp1_child_day4 = subset(all.entropy.data, Consistency =="consistent" | Consistency =="inconsistent")
exp3_vs.exp1_child_day4 = subset(exp3_vs.exp1_child_day4, AgeGroup == "child" & LexicalItems == "all" & Day == 4)
exp3_vs.exp1_child_day4$Consistency = factor(exp3_vs.exp1_child_day4$Consistency)

calculate number of regularizers and run chi-square test:

x =table(exp3_vs.exp1_child_day4$PLowerTotalEntropy, exp3_vs.exp1_child_day4$Consistency)
x

##    
##     consistent inconsistent
##   0         19           10
##   1         11           20

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 5.406, df = 1, p-value = 0.02007

Comparison of numbers of adult regularizers in experiment 3 vs. experiment 1 on day 1:

select appropriate dataset:

exp3_vs.exp1_adult_day1 = subset(all.entropy.data, Consistency != "partial" &AgeGroup == "adult" & LexicalItems == "all" & Day == 1)
exp3_vs.exp1_adult_day1$Consistency = factor(exp3_vs.exp1_adult_day1$Consistency)

calculate number of regularizers and run chi-square test:

x =table(exp3_vs.exp1_adult_day1$PLowerTotalEntropy, exp3_vs.exp1_adult_day1$Consistency)
x

##    
##     consistent inconsistent
##   0         28           22
##   1          2            8

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 4.32, df = 1, p-value = 0.03767

Comparison of numbers of adult regularizers in experiment 3 vs. experiment 1 on day 4:

select appropriate dataset:

exp3_vs.exp1_adult_day4 = subset(all.entropy.data, Consistency != "partial" &AgeGroup == "adult" & LexicalItems == "all" & Day == 4)
exp3_vs.exp1_adult_day4$Consistency = factor(exp3_vs.exp1_adult_day4$Consistency)

calculate number of regularizers and run fisher test:

x =table(exp3_vs.exp1_adult_day4$PLowerTotalEntropy, exp3_vs.exp1_adult_day4$Consistency)
x

##    
##     consistent inconsistent
##   0         30           26
##   1          0            4

fisher.test(x)

## 
##  Fisher's Exact Test for Count Data
## 
## data:  x
## p-value = 0.1124
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##  0.6891535       Inf
## sample estimates:
## odds ratio 
##        Inf
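The source does not state why Fisher's exact test replaces the chi-square test for some tables; a common rule of thumb, assumed here, is to prefer the exact test when expected cell counts fall below 5. The sketch below re-enters the table above (object names are hypothetical), shows the smallest expected count, and reproduces the two-sided p-value from the hypergeometric distribution.

```r
# Counts copied from the day 4 adult table
# (rows: 0 = non-regularizer, 1 = regularizer).
observed <- matrix(c(30, 0, 26, 4), nrow = 2,
                   dimnames = list(regularizer = c("0", "1"),
                                   Consistency = c("consistent", "inconsistent")))

# Expected counts under independence; the regularizer row falls below 5.
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
min(expected)   # 2

# With all margins fixed, the number of the 4 regularizers who belong to the
# consistent group (30 of 60 participants) is hypergeometric.
n.consistent <- 0:4
probs <- dhyper(n.consistent, m = 30, n = 30, k = 4)

# Two-sided p-value: sum the probabilities of all tables that are no more
# likely than the observed one (0 regularizers in the consistent group).
p.observed <- probs[n.consistent == 0]
p.value <- sum(probs[probs <= p.observed * (1 + 1e-7)])
round(p.value, 4)   # 0.1124, matching fisher.test(x)
```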

Comparison of numbers of child regularizers in experiment 3 vs. experiment 2 on day 1:

select appropriate dataset:

exp3_vs.exp2_child_day1 = subset(all.entropy.data, Consistency =="inconsistent" | Consistency =="partial")
exp3_vs.exp2_child_day1 = subset(exp3_vs.exp2_child_day1, AgeGroup == "child" & LexicalItems == "all" & Day == 1)
exp3_vs.exp2_child_day1$Consistency = factor(exp3_vs.exp2_child_day1$Consistency)

calculate number of regularizers and run chi-square test:

x =table(exp3_vs.exp2_child_day1$PLowerTotalEntropy, exp3_vs.exp2_child_day1$Consistency)
x

##    
##     inconsistent partial
##   0            4       7
##   1           26      23

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 1.0019, df = 1, p-value = 0.3169

Comparison of numbers of child regularizers in experiment 3 vs. experiment 2 on day 4:

select appropriate dataset:

exp3_vs.exp2_child_day4 = subset(all.entropy.data, Consistency =="inconsistent" | Consistency =="partial")
exp3_vs.exp2_child_day4 = subset(exp3_vs.exp2_child_day4, AgeGroup == "child" & LexicalItems == "all" & Day == 4)
exp3_vs.exp2_child_day4$Consistency = factor(exp3_vs.exp2_child_day4$Consistency)

calculate number of regularizers and run chi-square test:

x =table(exp3_vs.exp2_child_day4$PLowerTotalEntropy, exp3_vs.exp2_child_day4$Consistency)
x

##    
##     inconsistent partial
##   0           10      12
##   1           20      18

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 0.28708, df = 1, p-value = 0.5921

Comparison of numbers of adult regularizers in experiment 3 vs. experiment 2 on day 1:

select appropriate dataset:

exp3_vs.exp2_adult_day1 = subset(all.entropy.data, Consistency != "consistent" &AgeGroup == "adult" & LexicalItems == "all" & Day == 1)
exp3_vs.exp2_adult_day1$Consistency = factor(exp3_vs.exp2_adult_day1$Consistency)
x =table(exp3_vs.exp2_adult_day1$PLowerTotalEntropy, exp3_vs.exp2_adult_day1$Consistency)
x

##    
##     inconsistent partial
##   0           22      27
##   1            8       3

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 2.7829, df = 1, p-value = 0.09527

Comparison of numbers of adult regularizers in experiment 3 vs. experiment 2 on day 4:

select appropriate dataset:

exp3_vs.exp2_adult_day4 = subset(all.entropy.data, Consistency != "consistent" &AgeGroup == "adult" & LexicalItems == "all" & Day == 4)
exp3_vs.exp2_adult_day4$Consistency = factor(exp3_vs.exp2_adult_day4$Consistency)

calculate number of regularizers and run fisher test:

x =table(exp3_vs.exp2_adult_day4$PLowerTotalEntropy, exp3_vs.exp2_adult_day4$Consistency)
x

##    
##     inconsistent partial
##   0           26      30
##   1            4       0

fisher.test(x)

## 
##  Fisher's Exact Test for Count Data
## 
## data:  x
## p-value = 0.1124
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##  0.000000 1.451056
## sample estimates:
## odds ratio 
##          0

Lexical conditioning

We need to convert the exp3.entropy.allnouns dataset to wide format, which we do with the dcast function.

exp3.lexical.MI.allnouns <- dcast(exp3.entropy.allnouns, Participant + AgeGroup ~ Day, value.var="MutualInformationLexical")

statistical analyses

lexical MI analyses across nouns

Run bootstrapped ANOVA:

A = Age Group

B = Day

AB = Age Group by Day

z=bw2list(exp3.lexical.MI.allnouns,2,c(3:4))

## [1] "Levels for between factor:"
## [1] adult child
## Levels: adult child

RobustAnova=bwtrimbt(2,2,z,nboot=1000,tr=.05)

## [1] "Taking bootstrap samples. Please wait."

RobustAnova

## $p.value.A
## [1] 0
## 
## $p.value.B
## [1] 0.646
## 
## $p.value.AB
## [1] 0.24

Establish whether the intercept is different from chance (0.074168799):

Aggregate dataset:

exp3.lexical.MI.overall = aggregate(MutualInformationLexical ~ Participant, exp3.entropy.allnouns, FUN=mean)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp3.lexical.MI.overall$MutualInformationLexical, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##      Mean     Lower     Upper 
## 0.2010298 0.1589386 0.2465909

Establish whether children showed above chance lexical MI

select appropriate dataset:

exp3.lexical.MI.overall.kids = subset(exp3.entropy.allnouns, AgeGroup =="child")

Aggregate data:

exp3.lexical.MI.overall.kids = aggregate(MutualInformationLexical ~ Participant, exp3.lexical.MI.overall.kids, FUN=mean)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp3.lexical.MI.overall.kids$MutualInformationLexical, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##       Mean      Lower      Upper 
## 0.09718205 0.06463419 0.13520564

Establish whether adults showed above chance lexical MI

Select appropriate dataset:

exp3.lexical.MI.overall.adults = subset(exp3.entropy.allnouns, AgeGroup =="adult")

Aggregate data:

exp3.lexical.MI.overall.adults = aggregate(MutualInformationLexical ~ Participant, exp3.lexical.MI.overall.adults, FUN=mean)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp3.lexical.MI.overall.adults$MutualInformationLexical, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##      Mean     Lower     Upper 
## 0.3048776 0.2486271 0.3781721
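One way to read the three lexical MI intervals against the stated chance level (0.074168799) is to check whether each lower bound exceeds it. The sketch below copies the printed values; treating "lower bound above chance" as the decision rule is an assumption about how the intervals are meant to be read, and the data frame and its column names are hypothetical.

```r
chance <- 0.074168799   # chance level of lexical MI stated in the text

# Bootstrapped means and 95% limits copied from the three outputs.
lexical.mi.ci <- data.frame(
  group = c("overall", "children", "adults"),
  mean  = c(0.2010298, 0.09718205, 0.3048776),
  lower = c(0.1589386, 0.06463419, 0.2486271),
  upper = c(0.2465909, 0.13520564, 0.3781721)
)

# TRUE when the whole interval lies above the chance level.
lexical.mi.ci$above.chance <- lexical.mi.ci$lower > chance
lexical.mi.ci
```

On these numbers the overall and adult intervals lie entirely above chance, while the children's interval includes it.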

lexical MI analyses with the additional within-subject factor noun type

select appropriate dataset:

exp3.entropy.old.new.nouns = subset(all.entropy.data, Consistency == "inconsistent" & LexicalItems !="all")

We need to convert the data to wide format, which we do with the dcast function.

exp3.lexical.MI.old.new.nouns <- dcast(exp3.entropy.old.new.nouns, Participant + AgeGroup ~ Day * LexicalItems, value.var="MutualInformationLexical")

Run bootstrapped ANOVA:

A = Age Group

B = Day

C = Noun Type

AB = Age Group by Day

AC = Age Group by Noun Type

BC = Day by Noun Type

ABC = Age Group by Day by Noun Type

z=bw2list(exp3.lexical.MI.old.new.nouns,2,c(3:6))

## [1] "Levels for between factor:"
## [1] adult child
## Levels: adult child

RobustAnova=bwwtrimbt(2,2,2,z,nboot=1000,est=mom,tr=.05)

RobustAnova

## $p.value.A
## [1] 0
## 
## $p.value.B
## [1] 0.236
## 
## $p.value.C
## [1] 0.057
## 
## $p.value.AB
## [1] 0.037
## 
## $p.value.AC
## [1] 0.864
## 
## $p.value.BC
## [1] 0.549
## 
## $p.value.ABC
## [1] 0.325

Analyses over number of participants showing evidence of significant regularization:

Comparison of adult vs. child regularizers on day 1:

select appropriate dataset:

regularizers.day1 = subset(exp3.entropy.allnouns, Day  == 1)

calculate number of regularizers and run chi-square test:

x =table(regularizers.day1$PHigherMutualInformationLexical, regularizers.day1$AgeGroup)
x

##    
##     adult child
##   0    10    28
##   1    20     2

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 23.254, df = 1, p-value = 1.42e-06

Comparison of adult vs. child regularizers on day 4:

select appropriate dataset:

regularizers.day4 = subset(exp3.entropy.allnouns, Day  == 4)

calculate number of regularizers and run chi-square test:

x =table(regularizers.day4$PHigherMutualInformationLexical, regularizers.day4$AgeGroup)
x

##    
##     adult child
##   0    17    27
##   1    13     3

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 8.5227, df = 1, p-value = 0.003507

Comparison of numbers of adult regularizers in experiment 3 vs. experiment 1 on day 1:

select appropriate dataset:

exp3_vs.exp1_adult_day1 = subset(all.entropy.data, Consistency != "partial" &AgeGroup == "adult" & LexicalItems == "all" & Day == 1)
exp3_vs.exp1_adult_day1$Consistency = factor(exp3_vs.exp1_adult_day1$Consistency)

calculate number of regularizers and run chi-square test:

x =table(exp3_vs.exp1_adult_day1$PHigherMutualInformationLexical, exp3_vs.exp1_adult_day1$Consistency)
x

##    
##     consistent inconsistent
##   0         30           10
##   1          0           20

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 30, df = 1, p-value = 4.32e-08

Comparison of numbers of adult regularizers in experiment 3 vs. experiment 1 on day 4:

select appropriate dataset:

exp3_vs.exp1_adult_day4 = subset(all.entropy.data, Consistency != "partial" &AgeGroup == "adult" & LexicalItems == "all" & Day == 4)
exp3_vs.exp1_adult_day4$Consistency = factor(exp3_vs.exp1_adult_day4$Consistency)

calculate number of regularizers and run chi-square test:

x =table(exp3_vs.exp1_adult_day4$PHigherMutualInformationLexical, exp3_vs.exp1_adult_day4$Consistency)
x

##    
##     consistent inconsistent
##   0         30           17
##   1          0           13

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 16.596, df = 1, p-value = 4.625e-05

Comparison of numbers of child regularizers in experiment 3 vs. experiment 1 on day 1:

select appropriate dataset:

exp3_vs.exp1_child_day1 = subset(all.entropy.data, Consistency != "partial" &AgeGroup == "child" & LexicalItems == "all" & Day == 1)
exp3_vs.exp1_child_day1$Consistency = factor(exp3_vs.exp1_child_day1$Consistency)

calculate number of regularizers and run fisher test:

x =table(exp3_vs.exp1_child_day1$PHigherMutualInformationLexical, exp3_vs.exp1_child_day1$Consistency)
x

##    
##     consistent inconsistent
##   0         28           28
##   1          0            2

fisher.test(x)

## 
##  Fisher's Exact Test for Count Data
## 
## data:  x
## p-value = 0.4918
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##  0.1760643       Inf
## sample estimates:
## odds ratio 
##        Inf

Comparison of numbers of child regularizers in experiment 3 vs. experiment 1 on day 4:

select appropriate dataset:

exp3_vs.exp1_child_day4 = subset(all.entropy.data, Consistency != "partial" &AgeGroup == "child" & LexicalItems == "all" & Day == 4)
exp3_vs.exp1_child_day4$Consistency = factor(exp3_vs.exp1_child_day4$Consistency)

calculate number of regularizers and run fisher test:

x =table(exp3_vs.exp1_child_day4$PHigherMutualInformationLexical, exp3_vs.exp1_child_day4$Consistency)
x

##    
##     consistent inconsistent
##   0         30           27
##   1          0            3

fisher.test(x)

## 
##  Fisher's Exact Test for Count Data
## 
## data:  x
## p-value = 0.2373
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##  0.4218602       Inf
## sample estimates:
## odds ratio 
##        Inf

Comparison of numbers of adult regularizers in experiment 3 vs. experiment 2 on day 1:

select appropriate dataset:

exp3_vs.exp2_adult_day1 = subset(all.entropy.data, Consistency != "consistent" &AgeGroup == "adult" & LexicalItems == "all" & Day == 1)
exp3_vs.exp2_adult_day1$Consistency = factor(exp3_vs.exp2_adult_day1$Consistency)

calculate number of regularizers and run chi-square test:

x =table(exp3_vs.exp2_adult_day1$PHigherMutualInformationLexical, exp3_vs.exp2_adult_day1$Consistency)
x

##    
##     inconsistent partial
##   0           10      15
##   1           20      15

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 1.7143, df = 1, p-value = 0.1904

Comparison of numbers of adult regularizers in experiment 3 vs. experiment 2 on day 4:

select appropriate dataset:

exp3_vs.exp2_adult_day4 = subset(all.entropy.data, Consistency != "consistent" &AgeGroup == "adult" & LexicalItems == "all" & Day == 4)
exp3_vs.exp2_adult_day4$Consistency = factor(exp3_vs.exp2_adult_day4$Consistency)

calculate number of regularizers and run chi-square test:

x =table(exp3_vs.exp2_adult_day4$PHigherMutualInformationLexical, exp3_vs.exp2_adult_day4$Consistency)
x

##    
##     inconsistent partial
##   0           17      21
##   1           13       9

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 1.1483, df = 1, p-value = 0.2839

Comparison of numbers of child regularizers in experiment 3 vs. experiment 2 on day 1:

select appropriate dataset:

exp3_vs.exp2_child_day1 = subset(all.entropy.data, Consistency != "consistent" &AgeGroup == "child" & LexicalItems == "all" & Day == 1)
exp3_vs.exp2_child_day1$Consistency = factor(exp3_vs.exp2_child_day1$Consistency)

calculate number of regularizers and run fisher test:

x =table(exp3_vs.exp2_child_day1$PHigherMutualInformationLexical, exp3_vs.exp2_child_day1$Consistency)
x

##    
##     inconsistent partial
##   0           28      30
##   1            2       0

fisher.test(x)

## 
##  Fisher's Exact Test for Count Data
## 
## data:  x
## p-value = 0.4915
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##  0.000000 5.295628
## sample estimates:
## odds ratio 
##          0

Comparison of numbers of child regularizers in experiment 3 vs. experiment 2 on day 4:

select appropriate dataset:

exp3_vs.exp2_child_day4 = subset(all.entropy.data, Consistency != "consistent" &AgeGroup == "child" & LexicalItems == "all" & Day == 4)
exp3_vs.exp2_child_day4$Consistency = factor(exp3_vs.exp2_child_day4$Consistency)

calculate number of regularizers and run fisher test:

x =table(exp3_vs.exp2_child_day4$PHigherMutualInformationLexical, exp3_vs.exp2_child_day4$Consistency)
x

##    
##     inconsistent partial
##   0           27      29
##   1            3       1

fisher.test(x)

## 
##  Fisher's Exact Test for Count Data
## 
## data:  x
## p-value = 0.612
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##  0.005723854 4.211021402
## sample estimates:
## odds ratio 
##  0.3160163

Speaker-based conditioning

calculate means for Figure 8

Speaker identity mutual information of particle usage in experiments 1, 2, and 3, indicating the extent to which particle choice is conditioned on the speaker:

round(with(all.entropy.data, tapply(MutualInformationClass, list(Consistency, AgeGroup, Day, LexicalItems), mean, na.rm=T)),2)

## , , 1, all
## 
##              adult child
## consistent    0.90  0.50
## inconsistent  0.05  0.00
## partial       0.09  0.04
## 
## , , 4, all
## 
##              adult child
## consistent    0.95  0.54
## inconsistent  0.03  0.03
## partial       0.15  0.06
## 
## , , 1, new
## 
##              adult child
## consistent    0.90  0.50
## inconsistent  0.07  0.02
## partial       0.09  0.04
## 
## , , 4, new
## 
##              adult child
## consistent    0.95  0.56
## inconsistent  0.06  0.04
## partial       0.18  0.07
## 
## , , 1, old
## 
##              adult child
## consistent    0.90  0.51
## inconsistent  0.05  0.01
## partial       0.10  0.05
## 
## , , 4, old
## 
##              adult child
## consistent    0.96  0.53
## inconsistent  0.04  0.04
## partial       0.16  0.08

statistical analyses

whole-language MI class analyses

We need to convert the data to wide format, which we do with the dcast function.

exp3.entropy.MI.class.allnouns <- dcast(exp3.entropy.allnouns, Participant + AgeGroup ~ Day, value.var="MutualInformationClass")

Run bootstrapped ANOVA:

A = Age Group

B = Day

AB = Age Group by Day

z=bw2list(exp3.entropy.MI.class.allnouns,2,c(3:4))

## [1] "Levels for between factor:"
## [1] adult child
## Levels: adult child

RobustAnova=bwtrimbt(2,2,z,nboot=1000,tr=.05)

## [1] "Taking bootstrap samples. Please wait."

RobustAnova

## $p.value.A
## [1] 0.215
## 
## $p.value.B
## [1] 0.863
## 
## $p.value.AB
## [1] 0.184

Establish whether the intercept is different from chance (0.010195483):

Aggregate data:

exp3.entropy.MI.class.overall = aggregate(MutualInformationClass ~ Participant, exp3.entropy.allnouns, FUN=mean)

Calculate bootstrapped confidence intervals:

smean.cl.boot(exp3.entropy.MI.class.overall$MutualInformationClass, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)

##       Mean      Lower      Upper 
## 0.02749728 0.01014300 0.04823230

whole-language MI class analyses with the additional within-subject factor noun type

We need to convert the data to wide format, which we do with the dcast function.

exp3.entropy.MI.class.old.new.nouns <- dcast(exp3.entropy.old.new.nouns, Participant + AgeGroup ~ Day * LexicalItems, value.var="MutualInformationClass")

Run bootstrapped ANOVA:

A = Age Group

B = Day

C = Noun Type

AB = Age Group by Day

AC = Age Group by Noun Type

BC = Day by Noun Type

ABC = Age Group by Day by Noun Type

z=bw2list(exp3.entropy.MI.class.old.new.nouns,2,c(3:6))

## [1] "Levels for between factor:"
## [1] adult child
## Levels: adult child

RobustAnova=bwwtrimbt(2,2,2,z,nboot=1000,est=mom,tr=.05)

RobustAnova

## $p.value.A
## [1] 0.108
## 
## $p.value.B
## [1] 0.413
## 
## $p.value.C
## [1] 0.846
## 
## $p.value.AB
## [1] 0.392
## 
## $p.value.AC
## [1] 0.391
## 
## $p.value.BC
## [1] 0.19
## 
## $p.value.ABC
## [1] 0.521

Analyses over number of participants showing evidence of significant regularization:

Comparison of adult vs. child regularizers on day 1:

select appropriate dataset:

regularizers.day1 = subset(exp3.entropy.allnouns, Day  == 1)

calculate number of regularizers and run fisher test:

x =table(regularizers.day1$PHigherMutualInformationClass, regularizers.day1$AgeGroup)
x

##    
##     adult child
##   0    25    30
##   1     5     0

fisher.test(x)

## 
##  Fisher's Exact Test for Count Data
## 
## data:  x
## p-value = 0.05219
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##  0.000000 1.017194
## sample estimates:
## odds ratio 
##          0

Comparison of adult vs. child regularizers on day 4:

select appropriate dataset:

regularizers.day4 = subset(exp3.entropy.allnouns, Day  == 4)

calculate number of regularizers and run fisher test:

x =table(regularizers.day4$PHigherMutualInformationClass, regularizers.day4$AgeGroup)
x

##    
##     adult child
##   0    28    29
##   1     2     1

fisher.test(x)

## 
##  Fisher's Exact Test for Count Data
## 
## data:  x
## p-value = 1
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##  0.00792058 9.87756616
## sample estimates:
## odds ratio 
##  0.4884649

Comparison of numbers of child regularizers in experiment 3 vs. experiment 1 on day 1:

select appropriate dataset:

exp3_vs.exp1_child_day1 = subset(all.entropy.data, Consistency =="consistent" | Consistency =="inconsistent")
exp3_vs.exp1_child_day1 = subset(exp3_vs.exp1_child_day1, AgeGroup == "child" & LexicalItems == "all" & Day == 1)
exp3_vs.exp1_child_day1$Consistency = factor(exp3_vs.exp1_child_day1$Consistency)

calculate number of regularizers and run chi-square test:

x =table(exp3_vs.exp1_child_day1$PHigherMutualInformationClass, exp3_vs.exp1_child_day1$Consistency)
x

##    
##     consistent inconsistent
##   0          9           30
##   1         19            0

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 30.275, df = 1, p-value = 3.75e-08

Comparison of numbers of child regularizers in experiment 3 vs. experiment 1 on day 4:

select appropriate dataset:

exp3_vs.exp1_child_day4 = subset(all.entropy.data, Consistency =="consistent" | Consistency =="inconsistent")
exp3_vs.exp1_child_day4 = subset(exp3_vs.exp1_child_day4, AgeGroup == "child" & LexicalItems == "all" & Day == 4)
exp3_vs.exp1_child_day4$Consistency = factor(exp3_vs.exp1_child_day4$Consistency)

calculate number of regularizers and run chi-square test:

x =table(exp3_vs.exp1_child_day4$PHigherMutualInformationClass, exp3_vs.exp1_child_day4$Consistency)
x

##    
##     consistent inconsistent
##   0          9           29
##   1         21            1

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 28.708, df = 1, p-value = 8.415e-08

Comparison of numbers of adult regularizers in experiment 3 vs. experiment 1 on day 1:

select appropriate dataset:

exp3_vs.exp1_adult_day1 = subset(all.entropy.data, Consistency != "partial" &AgeGroup == "adult" & LexicalItems == "all" & Day == 1)
exp3_vs.exp1_adult_day1$Consistency = factor(exp3_vs.exp1_adult_day1$Consistency)

calculate number of regularizers and run chi-square test:

x =table(exp3_vs.exp1_adult_day1$PHigherMutualInformationClass, exp3_vs.exp1_adult_day1$Consistency)
x

##    
##     consistent inconsistent
##   0          2           25
##   1         28            5

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 35.623, df = 1, p-value = 2.395e-09

Comparison of numbers of adult regularizers in experiment 3 vs. experiment 1 on day 4:

select appropriate dataset:

exp3_vs.exp1_adult_day4 = subset(all.entropy.data, Consistency != "partial" &AgeGroup == "adult" & LexicalItems == "all" & Day == 4)
exp3_vs.exp1_adult_day4$Consistency = factor(exp3_vs.exp1_adult_day4$Consistency)

calculate number of regularizers and run fisher test:

x =table(exp3_vs.exp1_adult_day4$PHigherMutualInformationClass, exp3_vs.exp1_adult_day4$Consistency)
x

##    
##     consistent inconsistent
##   0          0           28
##   1         30            2

fisher.test(x)

## 
##  Fisher's Exact Test for Count Data
## 
## data:  x
## p-value = 8.388e-15
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##  0.00000000 0.02061791
## sample estimates:
## odds ratio 
##          0

Comparison of numbers of child regularizers in experiment 3 vs. experiment 2 on day 1:

select appropriate dataset:

exp3_vs.exp2_child_day1 = subset(all.entropy.data, Consistency =="inconsistent" | Consistency =="partial")
exp3_vs.exp2_child_day1 = subset(exp3_vs.exp2_child_day1, AgeGroup == "child" & LexicalItems == "all" & Day == 1)
exp3_vs.exp2_child_day1$Consistency = factor(exp3_vs.exp2_child_day1$Consistency)

calculate number of regularizers and run fisher test:

x =table(exp3_vs.exp2_child_day1$PHigherMutualInformationClass, exp3_vs.exp2_child_day1$Consistency)
x

##    
##     inconsistent partial
##   0           30      28
##   1            0       2

fisher.test(x)

## 
##  Fisher's Exact Test for Count Data
## 
## data:  x
## p-value = 0.4915
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##  0.188835      Inf
## sample estimates:
## odds ratio 
##        Inf

Comparison of numbers of child regularizers in experiment 3 vs. experiment 2 on day 4:

select appropriate dataset:

exp3_vs.exp2_child_day4 = subset(all.entropy.data, Consistency =="inconsistent" | Consistency =="partial")
exp3_vs.exp2_child_day4 = subset(exp3_vs.exp2_child_day4, AgeGroup == "child" & LexicalItems == "all" & Day == 4)
exp3_vs.exp2_child_day4$Consistency = factor(exp3_vs.exp2_child_day4$Consistency)

calculate number of regularizers and run fisher test:

x =table(exp3_vs.exp2_child_day4$PHigherMutualInformationClass, exp3_vs.exp2_child_day4$Consistency)
x

##    
##     inconsistent partial
##   0           29      25
##   1            1       5

fisher.test(x)

## 
##  Fisher's Exact Test for Count Data
## 
## data:  x
## p-value = 0.1945
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##    0.5769977 283.1514053
## sample estimates:
## odds ratio 
##   5.653276

Comparison of numbers of adult regularizers in experiment 3 vs. experiment 2 on day 1:

select appropriate dataset:

exp3_vs.exp2_adult_day1 = subset(all.entropy.data, Consistency != "consistent" &AgeGroup == "adult" & LexicalItems == "all" & Day == 1)
exp3_vs.exp2_adult_day1$Consistency = factor(exp3_vs.exp2_adult_day1$Consistency)

calculate number of regularizers and run chi-square test:

x =table(exp3_vs.exp2_adult_day1$PHigherMutualInformationClass, exp3_vs.exp2_adult_day1$Consistency)
x

##    
##     inconsistent partial
##   0           25      22
##   1            5       8

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 0.8838, df = 1, p-value = 0.3472

Comparison of numbers of adult regularizers in experiment 3 vs. experiment 2 on day 4:

select appropriate dataset:

exp3_vs.exp2_adult_day4 = subset(all.entropy.data, Consistency != "consistent" &AgeGroup == "adult" & LexicalItems == "all" & Day == 4)
exp3_vs.exp2_adult_day4$Consistency = factor(exp3_vs.exp2_adult_day4$Consistency)

calculate number of regularizers and run chi-square test:

x =table(exp3_vs.exp2_adult_day4$PHigherMutualInformationClass, exp3_vs.exp2_adult_day4$Consistency)
x

##    
##     inconsistent partial
##   0           28      17
##   1            2      13

chisq.test(x, correct = FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  x
## X-squared = 10.756, df = 1, p-value = 0.00104

Additional analyses: footnote #9

The analyses reported in footnote #9 address another question of interest: whether participants produced the same pattern of particle usage, at (approximately) the same level, on days 1 and 4. In other words, for a given participant, how self-consistent was their particle production across days?

The general approach we took was to compute a measure of participants’ self-consistency across days and compare it against the level of self-consistency expected under the null hypothesis that participants are not self-consistent across days (i.e. a participant’s particle usage on day 4 resembles their day 1 usage no more than expected by chance).

To this end, we proceeded in three steps.

First, we calculated, for each participant, a difference score based on their particle usage on day 1 and day 4. This difference score can be calculated either on the overall usage of particle 1 on both days (type="overall") or on noun-specific proportions (type="by.noun").
We then compared the mean of the veridical difference scores to a distribution of mean difference scores generated by shuffling day 4 data across participants (e.g. re-attributing the data produced by participant 1 on day 4 to participant 2, and so on). We generated 1000 such random re-assignments to obtain this distribution, which captures the null hypothesis that participants are no more self-consistent across days than we would expect by chance, given the observed distribution of responses on day 1 and day 4.
Finally, we obtained a p-value for the observed level of self-consistency by calculating the proportion of samples from this null distribution whose mean difference score is equal to or lower than the veridical mean difference score. If fewer than 5% of the random samples have an equal or lower mean difference, we can reject the null hypothesis at p < .05 and conclude that participants are more self-consistent than expected by chance.
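The three steps can be sketched on toy data as follows. This is a simplified illustration of the type="overall" case only: it shuffles a vector of per-participant rates with sample rather than re-allocating participant IDs, and the rates themselves are invented, so it should not be read as the analysis code itself.

```r
set.seed(42)

# Invented particle 1 rates for 20 hypothetical participants; day 4 stays
# within 0.02 of day 1, so these participants are highly self-consistent.
day1.rate <- seq(0.05, 0.95, length.out = 20)
day4.rate <- day1.rate + rep(c(0.02, -0.02), times = 10)

# Step 1: veridical mean absolute difference between days.
observed.diff <- mean(abs(day4.rate - day1.rate))

# Step 2: null distribution from 1000 random re-assignments of day 4 data.
null.diffs <- replicate(1000, mean(abs(sample(day4.rate) - day1.rate)))

# Step 3: proportion of null samples at least as self-consistent as observed.
p.value <- mean(null.diffs <= observed.diff)
p.value
```

Lower difference scores mean higher self-consistency, which is why the p-value counts null samples at or below the observed value rather than at or above it.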

Load dataframe (all.production.data contains the data from children and adults’ production performance in experiments 1, 2, & 3) and filter out trials where participants produced something other than particle 1 or particle 2 or an incorrect noun.

talkerid.data <- subset(all.production.data,pt_logical_det_intended=="det1" | pt_logical_det_intended=="det2")
talkerid.data <- subset(talkerid.data,noun_correct == 1)

Score every trial according to whether particle 1 was used (1) or not (0); this score is used for all subsequent calculations of particle 1 rates.

talkerid.data$det1used <- ifelse(talkerid.data$pt_logical_det_intended=="det1",1,0)

daytoday.consistency.score takes two sets of production data (representing productions on day 1 and day 4) and calculates the mean absolute difference in particle usage. If random=FALSE, this is done for the data as provided; if random=TRUE, the day 4 data are first shuffled by re-allocating participant IDs. Note that, for the overall measure, the shuffle could be done more simply with sample, but re-allocating IDs is necessary to preserve the by-participant structure of the randomised data for the by-noun measure.

daytoday.consistency.score <- function(day1.data,day4.data,type,random) {
  # proportion of det1 use per participant (overall) or per noun and participant (by.noun)
  if (type=="overall") {
    day1.det1.freq <- aggregate(det1used ~ pt_code,FUN=mean,data=day1.data)
    day4.det1.freq <- aggregate(det1used ~ pt_code,FUN=mean,data=day4.data)
  } else if (type=="by.noun") {
    day1.det1.freq <- aggregate(det1used ~ logical_noun + pt_code,FUN=mean,data=day1.data)
    day4.det1.freq <- aggregate(det1used ~ logical_noun + pt_code,FUN=mean,data=day4.data)
  }

  if (random) { # here we shuffle by re-allocating participant IDs on day 4 (mapvalues is from plyr)
    pt.levels <- levels(droplevels(day4.det1.freq$pt_code))
    day4.det1.freq$pt_code <- mapvalues(day4.det1.freq$pt_code,
                                        from=pt.levels,
                                        to=sample(pt.levels))
  }

  # pair up day 1 and day 4 proportions for each participant (and noun, if by.noun)
  if (type=="overall") {
    bothdays.data <- merge(day1.det1.freq,day4.det1.freq,by="pt_code",suffixes=c(".day1",".day4"))
  } else if (type=="by.noun") {
    bothdays.data <- merge(day1.det1.freq,day4.det1.freq,by=c("pt_code","logical_noun"),suffixes=c(".day1",".day4"))
  }

  # mean absolute day-to-day difference in det1 use
  mean.diff <- mean(abs(bothdays.data$det1used.day4-bothdays.data$det1used.day1))
  mean.diff
}
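
To see why re-allocating participant IDs preserves the by-participant structure in the by-noun case, consider the following minimal sketch. It uses invented data and a named lookup vector in base R as a stand-in for plyr’s mapvalues; the names day4, ids, new.ids and shuffled are assumptions made for this example.

```r
# Toy by-noun table for day 4; all values are invented for illustration.
day4 <- data.frame(
  pt_code      = factor(rep(c("A", "B", "C"), each = 2)),
  logical_noun = rep(c("noun1", "noun2"), times = 3),
  det1used     = c(1.0, 0.0, 0.5, 0.5, 0.2, 0.8)
)

set.seed(42)
ids <- levels(day4$pt_code)

# Draw a random one-to-one mapping from old participant IDs to new ones.
new.ids <- setNames(sample(ids), ids)

# Relabel the rows: only the ID column changes, the proportions stay put.
shuffled <- day4
shuffled$pt_code <- factor(unname(new.ids[as.character(day4$pt_code)]),
                           levels = ids)
```

Because whole participants are relabelled, each participant’s noun1 and noun2 proportions move together to the same new ID. Shuffling the det1used column directly with sample would break that pairing.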

evaluate.daytoday.consistency calculates a distribution of mean differences using replicate, and obtains a p-value for the veridical mean difference by comparing it to this distribution. If type=“overall”, the day-to-day difference is calculated on overall use of det1; if type=“by.noun”, the difference is calculated on a noun-by-noun basis.

evaluate.daytoday.consistency <- function(data,type,nreps) {
  day1.data <- subset(data,day==1)
  day4.data <- subset(data,day==4)
  veridical.score <- daytoday.consistency.score(day1.data,day4.data,type,FALSE)
  random.sample <- replicate(nreps,daytoday.consistency.score(day1.data,day4.data,type,TRUE))
  p.lower.or.equal <- sum(random.sample<=veridical.score)/nreps
  list(veridical=veridical.score,sample.mean=mean(random.sample),p=p.lower.or.equal)               
}
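
Because the null distribution is drawn at random, the sample means and p-values reported below can vary slightly from run to run. One way to make a run repeatable is to fix the random seed before sampling. The sketch below shows this with a hypothetical stand-in function (toy.score) rather than the original data; the seed value is arbitrary.

```r
# toy.score is a hypothetical stand-in for one shuffled consistency score.
toy.score <- function() mean(abs(sample(1:10) - 1:10))

set.seed(2017)                        # arbitrary seed, chosen for illustration
run1 <- replicate(1000, toy.score())

set.seed(2017)                        # same seed, so the same shuffles are drawn
run2 <- replicate(1000, toy.score())

identical(run1, run2)                 # the two runs match exactly
```

Note also that with nreps=1000 the smallest non-zero value the p estimate can take is 1/1000, so a reported p of 0 means that none of the 1000 shuffles was as self-consistent as the real data.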

Compare the distribution of random difference scores against the veridical mean difference: children, experiment 1, across nouns

evaluate.daytoday.consistency(subset(talkerid.data,oldnew=="old" & adult_child=="child" & consistency=="consistent"),"overall",1000)

## $veridical
## [1] 0.1736366
## 
## $sample.mean
## [1] 0.283253
## 
## $p
## [1] 0

Compare the distribution of random difference scores against the veridical mean difference: children, experiment 2, across nouns

evaluate.daytoday.consistency(subset(talkerid.data,oldnew=="old" & adult_child=="child" & consistency=="partial"),"overall",1000)

## $veridical
## [1] 0.2615628
## 
## $sample.mean
## [1] 0.4114665
## 
## $p
## [1] 0.002

Compare the distribution of random difference scores against the veridical mean difference: children, experiment 3, across nouns

evaluate.daytoday.consistency(subset(talkerid.data,oldnew=="old" & adult_child=="child" & consistency=="inconsistent"),"overall",1000)

## $veridical
## [1] 0.2384741
## 
## $sample.mean
## [1] 0.4107228
## 
## $p
## [1] 0

Compare the distribution of random difference scores against the veridical mean difference: adults, experiment 1, across nouns

evaluate.daytoday.consistency(subset(talkerid.data,oldnew=="old" & adult_child=="adult" & consistency=="consistent"),"overall",1000)

## $veridical
## [1] 0.0378293
## 
## $sample.mean
## [1] 0.03734073
## 
## $p
## [1] 1

Compare the distribution of random difference scores against the veridical mean difference: adults, experiment 2, across nouns

evaluate.daytoday.consistency(subset(talkerid.data,oldnew=="old" & adult_child=="adult" & consistency=="partial"),"overall",1000)

## $veridical
## [1] 0.1101142
## 
## $sample.mean
## [1] 0.1311415
## 
## $p
## [1] 0.06

Compare the distribution of random difference scores against the veridical mean difference: adults, experiment 3, across nouns

evaluate.daytoday.consistency(subset(talkerid.data,oldnew=="old" & adult_child=="adult" & consistency=="inconsistent"),"overall",1000)

## $veridical
## [1] 0.1531048
## 
## $sample.mean
## [1] 0.204304
## 
## $p
## [1] 0.011

Compare the distribution of random difference scores against the veridical mean difference: children, experiment 1, noun-specific proportions

evaluate.daytoday.consistency(subset(talkerid.data,oldnew=="old" & adult_child=="child" & consistency=="consistent"),"by.noun",1000)

## $veridical
## [1] 0.1869154
## 
## $sample.mean
## [1] 0.3026041
## 
## $p
## [1] 0

Compare the distribution of random difference scores against the veridical mean difference: children, experiment 2, noun-specific proportions

evaluate.daytoday.consistency(subset(talkerid.data,oldnew=="old" & adult_child=="child" & consistency=="partial"),"by.noun",1000)

## $veridical
## [1] 0.2757869
## 
## $sample.mean
## [1] 0.4209588
## 
## $p
## [1] 0

Compare the distribution of random difference scores against the veridical mean difference: children, experiment 3, noun-specific proportions

evaluate.daytoday.consistency(subset(talkerid.data,oldnew=="old" & adult_child=="child" & consistency=="inconsistent"),"by.noun",1000)

## $veridical
## [1] 0.2717758
## 
## $sample.mean
## [1] 0.4418939
## 
## $p
## [1] 0

Compare the distribution of random difference scores against the veridical mean difference: adults, experiment 1, noun-specific proportions

evaluate.daytoday.consistency(subset(talkerid.data,oldnew=="old" & adult_child=="adult" & consistency=="consistent"),"by.noun",1000)

## $veridical
## [1] 0.04122024
## 
## $sample.mean
## [1] 0.0463494
## 
## $p
## [1] 0.071

Compare the distribution of random difference scores against the veridical mean difference: adults, experiment 2, noun-specific proportions

evaluate.daytoday.consistency(subset(talkerid.data,oldnew=="old" & adult_child=="adult" & consistency=="partial"),"by.noun",1000)

## $veridical
## [1] 0.2698413
## 
## $sample.mean
## [1] 0.2898599
## 
## $p
## [1] 0.094

Compare the distribution of random difference scores against the veridical mean difference: adults, experiment 3, noun-specific proportions

evaluate.daytoday.consistency(subset(talkerid.data,oldnew=="old" & adult_child=="adult" & consistency=="inconsistent"),"by.noun",1000)

## $veridical
## [1] 0.2973214
## 
## $sample.mean
## [1] 0.3752804
## 
## $p
## [1] 0.001

Samara, Smith, Brown, & Wonnacott

Anna Samara

February, 2017

Load packages and helper functions

Packages

Helper functions

SummarySE

SummarySEwithin

myCenter

lizCenter

Load datasets

Experiment 1 (consistent condition)

Production data: baseline

select appropriate dataset

calculate proportion of excluded trials

statistical analyses

lme analyses: children

Production data: choice of particle

select appropriate dataset

calculate means for Figure 3

statistical analyses

lme analyses: children & adults

2afc data

select appropriate dataset

calculate means for Figure 4

statistical analyses

lme analyses: children & adults

Production data: unaware participants

select appropriate dataset

calculate means for Figure B9

statistical analyses

lme analyses: children

2afc data: unaware participants

select appropriate dataset

calculate means for Figure B10

statistical analyses

lme analyses: children

Overall regularization

calculate means for Figure 5

select appropriate dataset

statistical analyses

whole-language entropy analyses

Run bootstrapped ANOVA:

Establish whether the intercept is different from chance (0.987858121):

Establish whether children showed above chance regularization on day 1

Establish whether children showed above chance regularization on day 4

Establish whether adults showed above chance regularization on day 1

Establish whether adults showed above chance regularization on day 4

whole-language entropy analyses with the additional within-subject factor noun type

Run bootstrapped ANOVA:

Analyses over number of participants showing evidence of significant regularization:

Comparison of adult vs. child regularizers on day 1:

Comparison of adult vs. child regularizers on day 4:

Lexical conditioning

select appropriate dataset

statistical analyses

lexical MI analyses across nouns

Run bootstrapped ANOVA:

Establish whether the intercept is different from chance (0.079641531):

Establish whether children showed below chance lexical MI

Establish whether adults showed below chance lexical MI

lexical MI analyses with the additional within-subject factor noun type

Run bootstrapped ANOVA:

Analyses over number of participants showing evidence of significant regularization:

Experiment 2 (partial condition)

Production data: baseline

select appropriate dataset

calculate proportion of excluded trials

statistical analyses

lme analyses: children

Production data: choice of particle

select appropriate datasets

Children & adults:

children

adults

calculate means for Figure 6

statistical analyses

lme analyses: children & adults

lme analyses: adults

lme analyses: children