Load Packages and Helper Functions
- Packages.
- Helper Functions.
Load Datasets
Training
English Introduction
Lexical Decision
Discrimination

Load Packages and Helper Functions

Packages.

rm(list=ls())
library(ggplot2)

## Warning: package 'ggplot2' was built under R version 3.3.2

library(plotrix)

## Warning: package 'plotrix' was built under R version 3.3.2

#suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(plyr))

## Warning: package 'plyr' was built under R version 3.3.2

library(bootstrap)

## Warning: package 'bootstrap' was built under R version 3.3.2

suppressPackageStartupMessages(library(lme4))

## Warning: package 'lme4' was built under R version 3.3.2

suppressPackageStartupMessages(library(lmerTest))

## Warning: package 'lmerTest' was built under R version 3.3.2

library(pbkrtest)

## Warning: package 'pbkrtest' was built under R version 3.3.2

#detach("package:lmerTest", unload=TRUE)

library(knitr)

## Warning: package 'knitr' was built under R version 3.3.2

theme_set(theme_bw())
opts_chunk$set(fig.width=8, fig.height=5, 
                      echo=TRUE, warning=FALSE, message=FALSE, cache=TRUE)

Helper Functions.

SummarySE

This function can be found on the website “Cookbook for R”.

http://www.cookbook-r.com/Graphs/Plotting_means_and_error_bars_(ggplot2)/#Helper

It summarizes data, giving count, mean, standard deviation, standard error of the mean, and confidence interval (default 95%).

data: a data frame
measurevar: the name of a column that contains the variable to be summarized
groupvars: a vector containing names of columns that contain grouping variables
na.rm: a boolean that indicates whether to ignore NAs
conf.interval: the percent range of the confidence interval (default is 95%)

summarySE <- function(data=NULL, measurevar, groupvars=NULL, na.rm=FALSE,
                      conf.interval=.95, .drop=TRUE) {
    require(plyr)

    # New version of length which can handle NA's: if na.rm==T, don't count them
    length2 <- function (x, na.rm=FALSE) {
        if (na.rm) sum(!is.na(x))
        else       length(x)
    }

    # This does the summary. For each group's data frame, return a vector with
    # N, mean, and sd
    datac <- ddply(data, groupvars, .drop=.drop,
      .fun = function(xx, col) {
        c(N    = length2(xx[[col]], na.rm=na.rm),
          mean = mean   (xx[[col]], na.rm=na.rm),
          sd   = sd     (xx[[col]], na.rm=na.rm)
        )
      },
      measurevar
    )

    # Rename the "mean" column    
    datac <- rename(datac, c("mean" = measurevar))

    datac$se <- datac$sd / sqrt(datac$N)  # Calculate standard error of the mean

    # Confidence interval multiplier for standard error
    # Calculate t-statistic for confidence interval: 
    # e.g., if conf.interval is .95, use .975 (above/below), and use df=N-1
    ciMult <- qt(conf.interval/2 + .5, datac$N-1)
    datac$ci <- datac$se * ciMult

    return(datac)
}
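
As a quick check on what summarySE reports for a single group, the same quantities can be computed directly in base R. This is a minimal sketch using made-up accuracy scores (the vector x is hypothetical, not taken from the datasets below); it mirrors the se and ci lines of the function above.

```r
# Hypothetical accuracy scores for one group (illustration only)
x <- c(0.80, 0.85, 0.90, 0.75, 0.95)

n  <- length(x)
se <- sd(x) / sqrt(n)                     # standard error of the mean

# t multiplier for a two-sided 95% interval with n - 1 degrees of freedom
ci_mult <- qt(0.95 / 2 + 0.5, df = n - 1)
ci      <- se * ci_mult                   # half-width of the confidence interval

c(N = n, mean = mean(x), sd = sd(x), se = se, ci = ci)
```

The ci value is a half-width, so the interval is the mean plus or minus ci; it should agree with the interval returned by t.test(x).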

SummarySEwithin

This function can be found on the website “Cookbook for R”.

http://www.cookbook-r.com/Graphs/Plotting_means_and_error_bars_(ggplot2)/#Helper

From that website:

Summarizes data, handling within-subject variables by removing inter-subject variability. It will still work if there are no within-subject variables. Gives count, un-normed mean, normed mean (with same between-group mean), standard deviation, standard error of the mean, and confidence interval. If there are within-subject variables, it calculates adjusted values using the method from Morey (2008).

data: a data frame
measurevar: the name of a column that contains the variable to be summarized
betweenvars: a vector containing names of columns that are between-subjects variables
withinvars: a vector containing names of columns that are within-subjects variables
idvar: the name of a column that identifies each subject (or matched subjects)
na.rm: a boolean that indicates whether to ignore NA’s
conf.interval: the percent range of the confidence interval (default is 95%)

summarySEwithin <- function(data=NULL, measurevar, betweenvars=NULL, withinvars=NULL,
                            idvar=NULL, na.rm=FALSE, conf.interval=.95, .drop=TRUE) {

  # Ensure that the betweenvars and withinvars are factors
  factorvars <- vapply(data[, c(betweenvars, withinvars), drop=FALSE],
    FUN=is.factor, FUN.VALUE=logical(1))

  if (!all(factorvars)) {
    nonfactorvars <- names(factorvars)[!factorvars]
    message("Automatically converting the following non-factors to factors: ",
            paste(nonfactorvars, collapse = ", "))
    data[nonfactorvars] <- lapply(data[nonfactorvars], factor)
  }

  # Get the means from the un-normed data
  datac <- summarySE(data, measurevar, groupvars=c(betweenvars, withinvars),
                     na.rm=na.rm, conf.interval=conf.interval, .drop=.drop)

  # Drop all the unused columns (these will be calculated with normed data)
  datac$sd <- NULL
  datac$se <- NULL
  datac$ci <- NULL

  # Norm each subject's data
  ndata <- normDataWithin(data, idvar, measurevar, betweenvars, na.rm, .drop=.drop)

  # This is the name of the new column
  measurevar_n <- paste(measurevar, "_norm", sep="")

  # Collapse the normed data - now we can treat between and within vars the same
  ndatac <- summarySE(ndata, measurevar_n, groupvars=c(betweenvars, withinvars),
                      na.rm=na.rm, conf.interval=conf.interval, .drop=.drop)

  # Apply correction from Morey (2008) to the standard error and confidence interval
  #  Get the product of the number of conditions of within-S variables
  nWithinGroups    <- prod(vapply(ndatac[,withinvars, drop=FALSE], FUN=nlevels,
                           FUN.VALUE=numeric(1)))
  correctionFactor <- sqrt( nWithinGroups / (nWithinGroups-1) )

  # Apply the correction factor
  ndatac$sd <- ndatac$sd * correctionFactor
  ndatac$se <- ndatac$se * correctionFactor
  ndatac$ci <- ndatac$ci * correctionFactor

  # Combine the un-normed means with the normed results
  merge(datac, ndatac)
}
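
The size of the Morey (2008) correction depends only on the number of within-subject cells. As a rough illustration (the helper name morey_correction is hypothetical and not part of the Cookbook code), the factor can be evaluated for a two-level design such as pre/post and for a ten-level design such as ten training sessions:

```r
# Hypothetical helper: correction factor sqrt(k / (k - 1)) for k within-subject cells
morey_correction <- function(n_within) {
  sqrt(n_within / (n_within - 1))
}

# Two within-subject levels versus ten
morey_correction(c(2, 10))
```

The factor shrinks towards 1 as the number of within-subject cells grows, so the correction matters most for designs with few levels.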

normDataWithin

This function is used by the summarySEwithin function above. It can be found on the website “Cookbook for R”.

http://www.cookbook-r.com/Graphs/Plotting_means_and_error_bars_(ggplot2)/#Helper

From that website:

Norms the data within specified groups in a data frame; it normalizes each subject (identified by idvar) so that they have the same mean, within each group specified by betweenvars.

data: a data frame
idvar: the name of a column that identifies each subject (or matched subjects)
measurevar: the name of a column that contains the variable to be summarized
betweenvars: a vector containing names of columns that are between-subjects variables
na.rm: a boolean that indicates whether to ignore NAs

normDataWithin <- function(data=NULL, idvar, measurevar, betweenvars=NULL,
                           na.rm=FALSE, .drop=TRUE) {
    #library(plyr)

    # Measure var on left, idvar + between vars on right of formula.
    data.subjMean <- ddply(data, c(idvar, betweenvars), .drop=.drop,
     .fun = function(xx, col, na.rm) {
        c(subjMean = mean(xx[,col], na.rm=na.rm))
      },
      measurevar,
      na.rm
    )

    # Put the subject means with original data
    data <- merge(data, data.subjMean)

    # Get the normalized data in a new column
    measureNormedVar <- paste(measurevar, "_norm", sep="")
    data[,measureNormedVar] <- data[,measurevar] - data[,"subjMean"] +
                               mean(data[,measurevar], na.rm=na.rm)

    # Remove this subject mean column
    data$subjMean <- NULL

    return(data)
}
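
The norming step can be illustrated without plyr. The sketch below is an assumption-laden toy version (made-up data, base R only, no between-subjects grouping): it subtracts each participant's mean and adds back the grand mean, which is the same arithmetic as the accuracy_norm column produced above.

```r
# Hypothetical data: three participants measured pre and post
d <- data.frame(
  participant = rep(c("p1", "p2", "p3"), each = 2),
  session     = rep(c("pre", "post"), times = 3),
  accuracy    = c(0.60, 0.80, 0.70, 0.90, 0.50, 0.70)
)

# Each participant's own mean, repeated on every one of their rows
subj_mean <- ave(d$accuracy, d$participant)

# Remove the participant's mean and add back the grand mean
d$accuracy_norm <- d$accuracy - subj_mean + mean(d$accuracy)
d
```

After norming, every participant has the same mean, so the remaining spread within each session reflects only the within-subject effect.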

myCenter

This function outputs the centered values of a variable, which can be a numeric variable, a factor, or a data frame. It was taken from Florian Jaeger’s blog: https://hlplab.wordpress.com/2009/04/27/centering-several-variables/

From his blog:

If the input is a numeric variable, the output is the centered variable
If the input is a factor, the output is a numeric variable with centered factor level values. That is, the factor’s levels are converted into numerical values in their inherent order (if not specified otherwise, R defaults to alphanumerical order). More specifically, this centers any binary factor so that the value below 0 will be the first level of the original factor, and the value above 0 will be the second level.
If the input is a data frame or matrix, the output is a new matrix of the same dimension and with the centered values and column names that correspond to the colnames() of the input preceded by “c” (e.g. “Variable1” will be “cVariable1”).

myCenter= function(x) {
  if (is.numeric(x)) { return(x - mean(x, na.rm=T)) }
    if (is.factor(x)) {
        x= as.numeric(x)
        return(x - mean(x, na.rm=T))
    }
    if (is.data.frame(x) || is.matrix(x)) {
        m= matrix(nrow=nrow(x), ncol=ncol(x))
        colnames(m)= paste("c", colnames(x), sep="")
    
        for (i in 1:ncol(x)) {
        
            m[,i]= myCenter(x[,i])
        }
        return(as.data.frame(m))
    }
}
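
To make the factor case concrete, the sketch below reproduces the factor branch of myCenter by hand on made-up data. With a balanced two-level factor the centered values are -0.5 and 0.5; with an unbalanced factor they shift so that the mean is still zero. The vectors here are hypothetical.

```r
# Balanced binary factor: levels are ordered alphabetically (highvar = 1, lowvar = 2)
balanced <- factor(c("highvar", "highvar", "lowvar", "lowvar"))
balanced_ct <- as.numeric(balanced) - mean(as.numeric(balanced))
balanced_ct      # -0.5 -0.5  0.5  0.5

# Unbalanced binary factor: values are no longer +/-0.5, but still average to zero
unbalanced <- factor(c("highvar", "lowvar", "lowvar", "lowvar"))
unbalanced_ct <- as.numeric(unbalanced) - mean(as.numeric(unbalanced))
unbalanced_ct    # -0.75  0.25  0.25  0.25
```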

selectCenter

This function provides a wrapper around myCenter allowing you to center a specific list of variables from a dataframe.

x: data frame
listfname: a list of the names of the variables to be centered (e.g. list("variable1", "variable2"))

The output is a copy of the data frame with a column (always a numeric variable) added for each of the centered variables. These columns are labelled with each column’s previous name, but with “.ct” appended (e.g., “variable1” will become “variable1.ct”).

selectCenter= function(x, listfname) 
{
    for (i in 1:length(listfname)) 
    {
        fname = as.character(listfname[i])
        x[paste(fname,".ct", sep="")] = myCenter(x[fname])
    }
        
    return(x)
}

get_coeffs and get_coeffs_lmerTest

The get_coeffs function lets us inspect particular coefficients from a fitted lme4 model (taken from the output of summary()) by putting them in a table.

x: the output returned when running an lmer or glmer (i.e. an object of type lmerMod or glmerMod)
list: a list of names of the coefficients to be extracted (e.g. c(“variable1”,“variable1:variable2”))

The get_coeffs_lmerTest function is identical except that it uses the summary method from lmerTest rather than the one from lme4, with the Kenward-Roger approximation for the denominator degrees of freedom, and thus also includes p values.

get_coeffs <- function(x,list){(kable(as.data.frame(summary(x)$coefficients)[list,],digits=3))}

get_coeffs_lmerTest <- function(x,list){(kable(as.data.frame(summary(x,ddf="kenw")$coefficients)[list,],digits=3))}
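
The row selection that these helpers perform can be tried without fitting a mixed model. The sketch below is only an illustration: it uses an ordinary glm on simulated data as a stand-in for a glmer fit, and the variables x, g, and y are hypothetical. It leaves out the kable() formatting step.

```r
# Simulated data (illustration only)
set.seed(1)
d <- data.frame(x = rnorm(200), g = rnorm(200))
d$y <- rbinom(200, 1, plogis(0.5 * d$x))

# A plain logistic regression standing in for a glmer model
m <- glm(y ~ x * g, data = d, family = binomial)

# Coefficient table as a data frame, then pick rows by coefficient name
coefs    <- as.data.frame(summary(m)$coefficients)
selected <- coefs[c("x", "x:g"), ]
round(selected, 3)
```

Selecting by row name means a misspelled coefficient name produces a row of NAs rather than an error, so it is worth checking names against rownames(coefs) first.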

filter2

This is a function which filters a column of data removing values which are some number of standard deviations above/below the mean for that participant, possibly in some condition/subcondition.

im: the input matrix (a data frame)
svn: a list of the names of the factors to group by (subject name + one or more conditions)
fn: the name of the column containing the data to be filtered
lim: how many standard deviations above/below the mean to filter

The function returns a data frame identical to the input but with additional columns giving the group means, standard deviations, and upper/lower cutoffs, plus the “filtered” data.

filter2 = function(im, svn, fn, lim)
{
  ## work out means for each subject for each word

x = list()
y = ""

for(n in svn) x=append(im[n],x)
for(n in svn) y=paste(y,n,sep="_")

means = aggregate(im[fn], by = x, mean, na.rm=T)
head(means)
nocols = dim(means)[2]
colnames(means)[nocols] = "means"

sds = aggregate(im[fn], by = x, sd, na.rm=T)
head(sds)
nocols = dim(sds)[2]
colnames(sds)[nocols] = "sds"

gs = merge(means,sds)

## a group with just one value has no standard deviation; set it to 0 rather than disregarding all of these
gs$sds[is.na(gs$sds)] = 0 

gs$max = gs$means + lim*gs$sds
gs$min = gs$means- lim*gs$sds

im2 = merge(im,gs, sort=F)


im2[paste(fn,"filt",sep="")] = im2[fn]
cn= dim(im2)[2] ## get colnumber (last one added)

im2[,cn][im2[,fn]> im2$max] = ""

im2[,cn][im2[,fn]< im2$min] = ""

im2[,cn]= as.numeric(im2[,cn])

 
names(im2)[names(im2)=="means"] = paste("mean", y, sep="_") 
names(im2)[names(im2)=="sds"] = paste("sd", y, sep="_") 
names(im2)[names(im2)=="max"] = paste("max", y, sep="_") 
names(im2)[names(im2)=="min"] = paste("min", y, sep="_") 

return(im2)
}
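
The core of the filtering rule can be sketched in a few lines of base R. This is a simplified stand-in for filter2, not a replacement: it groups by a single column, uses made-up reaction times, and writes NA for values more than lim standard deviations from the group mean. The column names participant, rt, and rtfilt are hypothetical.

```r
# Hypothetical reaction times: p1 has one extreme value, p2 has none
d <- data.frame(
  participant = rep(c("p1", "p2"), each = 8),
  rt = c(500, 510, 490, 505, 495, 502, 498, 2000,
         600, 610, 590, 605, 595, 598, 602, 600)
)
lim <- 2  # cutoff in standard deviations

# Per-participant mean and sd, repeated on each row
grp_mean <- ave(d$rt, d$participant, FUN = mean)
grp_sd   <- ave(d$rt, d$participant, FUN = sd)

# Keep values within mean +/- lim * sd; replace the rest with NA
d$rtfilt <- ifelse(abs(d$rt - grp_mean) > lim * grp_sd, NA, d$rt)
d
```

Because the mean and standard deviation are computed with the extreme value included, a single outlier in a small group can only reach a limited z-score, so very strict cutoffs may never trigger for small groups.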

Load Datasets

train = read.csv( "training.csv")
discrim = read.csv( "discrim.csv")
lexd = read.csv( "lexd.csv")
engintro = read.csv( "englishintro.csv")

Training

Data Preparation

Check amount of missing data

There should be 320 trials * 10 sessions = 3200 trials per participant. Thus for adults there should be 41 adults * 3200 trials, making 131,200 trials, and for children there should be 52 children * 3200 trials, making 166,400 trials.

round(1- table(train$agegroup)/c(3200*41, 3200*52),2)

## 
## adult child 
##  0.02  0.06
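
The expected trial counts used as the denominator above follow directly from the design. A small sketch of that arithmetic (the object names are ours):

```r
# 320 trials per session, 10 sessions per participant
trials_per_participant <- 320 * 10

# Expected number of trials per age group if no data were missing
expected <- c(adult = 41, child = 52) * trials_per_participant
expected
```

The vector is ordered adult, child to match the alphabetical order in which table() reports the agegroup levels.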

Select data for analyses

In order to make a fair comparison across conditions, we only look at trials using the two talkers who were also used in the low variability (one talker) condition.

train2 = subset(train, talker == "female1" | talker == "female2")
train2$talker = factor(train2$talker)
train2.child = subset(train2, agegroup =="child")
train2.adult = subset(train2, agegroup =="adult")

Plot means (for Figure 3)

Adults

train2.adult.participantmeans = aggregate(accuracy ~ participant + session + condition, na.rm = T, data = train2.adult, FUN = mean)
train2.adult.groupmeans <- summarySEwithin(train2.adult.participantmeans, measurevar = "accuracy", betweenvars = c("condition"), withinvars = c("session"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

# To correct the means for the imbalance in talkers, get each participant's mean for each talker and average those; error calculations remain as done over the whole participant set

train2.adult.participantmeans.t12 = aggregate(accuracy ~ participant + session + condition + talker, na.rm = T, data = train2.adult, FUN = mean)

train2.adult.groupmeans.talker1 <- summarySEwithin(subset(train2.adult.participantmeans.t12, talker == "female1"), measurevar = "accuracy", betweenvars = c("condition"), withinvars = c("session"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

train2.adult.groupmeans.talker2 <- summarySEwithin(subset(train2.adult.participantmeans.t12, talker == "female2"), measurevar = "accuracy", betweenvars = c("condition"), withinvars = c("session"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

train2.adult.groupmeans$accuracy = (train2.adult.groupmeans.talker1$accuracy + train2.adult.groupmeans.talker2$accuracy)/2
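
The reason for averaging the per-talker means, rather than using the pooled mean, is easiest to see with unequal trial counts. The toy data below are hypothetical and deliberately extreme:

```r
# Hypothetical trials: 2 from female1 (one correct), 6 from female2 (all correct)
d <- data.frame(
  talker   = c(rep("female1", 2), rep("female2", 6)),
  accuracy = c(0, 1, 1, 1, 1, 1, 1, 1)
)

# Pooled mean weights each talker by her number of trials
pooled <- mean(d$accuracy)

# Talker-balanced mean: average within talker first, then across talkers
per_talker <- tapply(d$accuracy, d$talker, mean)
balanced   <- mean(per_talker)

c(pooled = pooled, balanced = balanced)
```

Here the pooled mean is pulled towards the talker with more trials, while the balanced mean gives both talkers equal weight.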

# Now plot using the confidence intervals (the ci column) for the error bars

epsilon = 0.02 #how big the hat (bar at the top/bottom) is
with(subset(train2.adult.groupmeans, condition == "highvar"),
plot(as.numeric(session), accuracy, type = "p", col = "black",pch = 15, xlab = "Session", ylab = "Proportion correct", xlim = c(1,10), ylim = c(0.5, 1)) +
  segments(as.numeric(session), (accuracy - ci) ,as.numeric(session), (accuracy + ci), col="black") +
  segments(as.numeric(session)-epsilon, (accuracy - ci) ,as.numeric(session)+epsilon, (accuracy - ci), col="black") +
  segments(as.numeric(session)-epsilon, (accuracy + ci) ,as.numeric(session)+epsilon, (accuracy + ci), col="black"),
     )

## numeric(0)

# Add low variability to the same plot (using "lines" rather than plot)

with(subset(train2.adult.groupmeans, condition == "lowvar"),
lines(as.numeric(session), accuracy, type = "p", col = "grey45", pch = 15, xlab = "Session", ylab = "Proportion correct", xlim = c(1,10), ylim = c(0.5, 1)) +
  segments(as.numeric(session), (accuracy - ci) ,as.numeric(session), (accuracy + ci), col="grey45") +
  segments(as.numeric(session)-epsilon, (accuracy - ci) ,as.numeric(session)+epsilon, (accuracy - ci), col="grey45") +
  segments(as.numeric(session)-epsilon, (accuracy + ci) ,as.numeric(session)+epsilon, (accuracy + ci), col="grey45"),
     )

## numeric(0)

legend('topleft','groups', c("Low Variability","High Variability"), col=c("grey45","black"), pch = c(15,15))
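
An alternative way to draw the same kind of error bars in base graphics is to call plot() and then arrows() as separate statements, with flat arrow heads acting as the caps. The sketch below is only an illustration of that approach, not the code used for Figure 3; the session means and interval half-widths are made up.

```r
# Hypothetical group means and CI half-widths for ten sessions
sessions <- 1:10
accuracy <- seq(0.70, 0.92, length.out = 10)
ci       <- rep(0.03, 10)

pdf(NULL)  # null device, so the sketch runs without opening a window or file

plot(sessions, accuracy, type = "p", pch = 15, col = "black",
     xlab = "Session", ylab = "Proportion correct",
     xlim = c(1, 10), ylim = c(0.5, 1))

# angle = 90 with code = 3 draws a flat cap at both ends of each bar
arrows(sessions, accuracy - ci, sessions, accuracy + ci,
       angle = 90, code = 3, length = 0.03, col = "black")

invisible(dev.off())
```

In an interactive session or a knitted document the pdf(NULL) and dev.off() lines can be dropped.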

Children

train2.child.participantmeans = aggregate(accuracy ~ participant + session + condition, na.rm = T, data = train2.child, FUN = mean)
train2.child.groupmeans <- summarySEwithin(train2.child.participantmeans, measurevar = "accuracy", betweenvars = c("condition"), withinvars = c("session"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

# To correct the means for the imbalance in talkers, get each participant's mean for each talker and average those; error calculations remain as done over the whole participant set

train2.child.participantmeans.t12 = aggregate(accuracy ~ participant + session + condition + talker, na.rm = T, data = train2.child, FUN = mean)

train2.child.groupmeans.talker1 <- summarySEwithin(subset(train2.child.participantmeans.t12, talker == "female1"), measurevar = "accuracy", betweenvars = c("condition"), withinvars = c("session"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

train2.child.groupmeans.talker2 <- summarySEwithin(subset(train2.child.participantmeans.t12, talker == "female2"), measurevar = "accuracy", betweenvars = c("condition"), withinvars = c("session"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

train2.child.groupmeans$accuracy = (train2.child.groupmeans.talker1$accuracy + train2.child.groupmeans.talker2$accuracy)/2

# Now plot using the confidence intervals (the ci column) for the error bars

epsilon = 0.02 #how big the hat (bar at the top/bottom) is
with(subset(train2.child.groupmeans, condition == "highvar"),
plot(as.numeric(session), accuracy, type = "p", col="black",pch = 15, xlab = "Session", ylab = "Proportion correct", xlim = c(1,10), ylim = c(0.5, 1)) +
  segments(as.numeric(session), (accuracy - ci) ,as.numeric(session), (accuracy + ci), col="black") +
  segments(as.numeric(session)-epsilon, (accuracy - ci) ,as.numeric(session)+epsilon, (accuracy - ci), col="black") +
  segments(as.numeric(session)-epsilon, (accuracy + ci) ,as.numeric(session)+epsilon, (accuracy + ci), col="black"),
     )

## numeric(0)

# Add low variability to the same plot (using "lines" rather than plot)

with(subset(train2.child.groupmeans, condition == "lowvar"),
lines(as.numeric(session), accuracy, type = "p", col="grey45",pch = 15, xlab = "Session", ylab = "Proportion correct", xlim = c(1,10), ylim = c(0.5, 1)) +
  segments(as.numeric(session), (accuracy - ci) ,as.numeric(session), (accuracy + ci), col="grey45") +
  segments(as.numeric(session)-epsilon, (accuracy - ci) ,as.numeric(session)+epsilon, (accuracy - ci), col="grey45") +
  segments(as.numeric(session)-epsilon, (accuracy + ci) ,as.numeric(session)+epsilon, (accuracy + ci), col="grey45"),
     )

## numeric(0)

legend('topleft','groups', c("Low Variability","High Variability"), col=c("grey45","black"), pch = c(15,15))

Statistical Analyses

Our approach was to only inspect models for effects and interactions between the experimental variables where there are clear predictions. For the training data, this is the case for both main effects of condition (high/low variability) and session and the interaction between them. We therefore begin by inspecting these effects in the model. Wherever we do find reliable effects, we then check to see whether they are qualified by an interaction with talker. Where this is the case, we check to see if the effect holds for each of the talkers separately.

Adults

train2.adult = selectCenter(train2.adult, list("condition", "session", "talker"))
lmer.adult <- glmer(accuracy ~ 1 
                    + (condition.ct * session.ct * talker.ct)
                    + (session.ct||participant),
                    data = train2.adult, control = glmerControl(optimizer = "bobyqa"), family = binomial)
lmer.adult.coeff = kable(summary(lmer.adult)$coefficients, digits = 3)
get_coeffs(lmer.adult, c("(Intercept)", "condition.ct", "session.ct", "condition.ct:session.ct"))

	Estimate	Std. Error	z value
(Intercept)	3.062	0.152	20.197
condition.ct	1.626	0.254	6.406
session.ct	0.349	0.031	11.208
condition.ct:session.ct	0.202	0.052	3.857

There are main effects of training session and condition and a reliable interaction between condition and session. This reflects improvement across sessions and overall better performance in the low variability condition which increases across sessions.
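
The estimates above are on the log-odds scale. As a rough guide to their size, the intercept can be mapped back to a proportion with the inverse logit. Because the predictors are centered, this is approximately the predicted accuracy at the average of the predictors for a typical participant; that reading is our gloss rather than something stated in the analysis itself.

```r
# Intercept from the adult training model above, on the log-odds scale
intercept_logodds <- 3.062

# Inverse logit: exp(x) / (1 + exp(x))
plogis(intercept_logodds)
```

The result is a little over 0.95, in line with the high overall accuracy of the adult group.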

Check interactions with talker

Check interactions with talker for each of these effects

get_coeffs(lmer.adult, c("talker.ct", "condition.ct:talker.ct", "session.ct:talker.ct", "condition.ct:session.ct:talker.ct"))

	Estimate	Std. Error	z value	Pr(>\|z\|)
talker.ct	2.107	0.284	7.411	0.000
condition.ct:talker.ct	1.727	0.452	3.817	0.000
session.ct:talker.ct	0.260	0.062	4.182	0.000
condition.ct:session.ct:talker.ct	0.255	0.105	2.442	0.015

There is a main effect of the control variable talker, reflecting greater accuracy with one talker (female1: 0.84; female2: 0.94).

There are also interactions with talker for all of the reliable experimental effects. To break down the interactions, we look at each talker separately.

train2.adult.t1 = selectCenter(subset(train2.adult, talker == "female1"), list("condition", "session", "talker"))
lmer.adult.t1 <- glmer(accuracy ~ 1
                       + (condition.ct * session.ct)
                       + (session.ct|participant),
                       data = train2.adult.t1, control = glmerControl(optimizer = "bobyqa"), family = binomial)
kable(summary(lmer.adult.t1)$coefficients, digits = 3)

	Estimate	Std. Error	z value	Pr(>\|z\|)
(Intercept)	1.940	0.170	11.403	0.000
condition.ct	0.720	0.285	2.526	0.012
session.ct	0.212	0.020	10.412	0.000
condition.ct:session.ct	0.069	0.035	1.978	0.048

train2.adult.t2 = selectCenter(subset(train2.adult, talker == "female2"), list("condition", "session", "talker"))
lmer.adult.t2 <- glmer(accuracy ~ 1 
                       + (condition.ct * session.ct)
                       + (session.ct|participant),
                       data = train2.adult.t2, control = glmerControl(optimizer = "bobyqa"), family = binomial)
kable(summary(lmer.adult.t2)$coefficients, digits = 3)

	Estimate	Std. Error	z value	Pr(>\|z\|)
(Intercept)	4.139	0.265	15.642	0.000
condition.ct	2.473	0.442	5.600	0.000
session.ct	0.485	0.064	7.526	0.000
condition.ct:session.ct	0.334	0.108	3.096	0.002

These analyses show that the reliable effects of condition and session hold for both talkers. The condition by session interaction is also reliable for both talkers, although it is weaker for talker 1 (z = 1.978, p = 0.048) than for talker 2 (z = 3.096, p = 0.002).

Children

Note that the full model did not converge, even with correlations between random slopes removed. We therefore also removed the interactions with talker, keeping talker as a main effect only.

train2.child = selectCenter(train2.child, list("condition", "session", "talker"))
lmer.child <- glmer(accuracy ~ 1 
                    + (condition.ct * session.ct + talker.ct)
                    + (session.ct||participant),
                    data = train2.child, control = glmerControl(optimizer = "bobyqa"), family = binomial)
get_coeffs(lmer.child, c("(Intercept)", "condition.ct", "session.ct", "condition.ct:session.ct"))

	Estimate	Std. Error	z value	Pr(>\|z\|)
(Intercept)	1.064	0.142	7.469	0.000
condition.ct	0.374	0.239	1.564	0.118
session.ct	0.223	0.032	7.055	0.000
condition.ct:session.ct	0.103	0.054	1.924	0.054

There was a reliable main effect of session, reflecting improved performance across sessions, but no reliable main effect of condition. There was also a near reliable interaction between session and condition. Inspecting the figure above, this seems to reflect the fact that the difference between conditions emerges only in the second half of training.

Follow up with a model in which session is replaced by a binary division into “first half” and “second half”.

train2.child$testhalf =  2
train2.child$testhalf[train2.child$session < 6] = 1

train2.child = selectCenter(train2.child, list("condition", "testhalf", "talker"))
lmer.child.2 <- glmer(accuracy ~ 1 
                      + (condition.ct * testhalf.ct * talker.ct)
                      + (testhalf.ct||participant),
                      data = train2.child, control = glmerControl(optimizer = "bobyqa"), family = binomial)
get_coeffs(lmer.child.2, c("(Intercept)", "condition.ct", "testhalf.ct", "condition.ct:testhalf.ct"))

	Estimate	Std. Error	z value	Pr(>\|z\|)
(Intercept)	0.994	0.125	7.920	0.000
condition.ct	0.317	0.224	1.412	0.158
testhalf.ct	1.045	0.139	7.514	0.000
condition.ct:testhalf.ct	0.483	0.215	2.248	0.025

There is a reliable interaction between condition and testhalf. We explore this by running the model again but with a separate slope for condition for each half of testing.

train2.child$testhalf = factor(train2.child$testhalf)

## Run the model again but removing the main effect of condition and the condition by testhalf interaction and replacing with separate effects of condition for each test half.  

lmer.child.2.v2 <- glmer(accuracy ~ 1 
                + condition.ct : testhalf
                + (condition.ct * testhalf.ct * talker.ct)
                - condition.ct
                - condition.ct : testhalf.ct
                + (testhalf.ct||participant),
                data = train2.child, control = glmerControl(optimizer = "bobyqa"), family = binomial)

## Check  that this is the same model as above with the same number of dfs  
anova(lmer.child.2, lmer.child.2.v2)

## Data: train2.child
## Models:
## lmer.child.2: accuracy ~ 1 + (condition.ct * testhalf.ct * talker.ct) + (testhalf.ct || 
## lmer.child.2:     participant)
## lmer.child.2.v2: accuracy ~ 1 + condition.ct:testhalf + (condition.ct * testhalf.ct * 
## lmer.child.2.v2:     talker.ct) - condition.ct - condition.ct:testhalf.ct + (testhalf.ct || 
## lmer.child.2.v2:     participant)
##                 Df    AIC    BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## lmer.child.2    10 106402 106496 -53191   106382                        
## lmer.child.2.v2 10 106402 106496 -53191   106382     0      0          1

get_coeffs(lmer.child.2.v2, c("condition.ct:testhalf1", "condition.ct:testhalf2"))

	Estimate	Std. Error	z value	Pr(>\|z\|)
condition.ct:testhalf1	0.077	0.241	0.320	0.749
condition.ct:testhalf2	0.560	0.229	2.442	0.015

This shows that there is a reliable effect of condition only in the second half of training.
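
The reparameterisation used above (dropping the main effect of condition and its interaction with test half, and adding one condition slope per test half) changes how the coefficients are labelled but not the fit. The sketch below shows the same idea with an ordinary glm on simulated data; the data, the variable names, and the use of glm rather than glmer are all assumptions made to keep the example small.

```r
# Simulated data (illustration only): condition matters only in the second half
set.seed(42)
n <- 400
d <- data.frame(
  cond = rep(c(-0.5, 0.5), each = n / 2),     # centered binary condition
  half = factor(rep(c(1, 2), times = n / 2))  # test half as a factor
)
d$half_ct <- as.numeric(d$half) - 1.5         # centered numeric version
d$y <- rbinom(n, 1, plogis(0.5 + 0.8 * d$cond * (d$half == 2)))

# Standard parameterisation: main effects plus interaction
m_interaction <- glm(y ~ cond * half_ct, data = d, family = binomial)

# Nested parameterisation: one condition slope for each half
m_nested <- glm(y ~ half_ct + cond:half, data = d, family = binomial)

# Same number of parameters and the same deviance
c(deviance(m_interaction), deviance(m_nested))
coef(m_nested)[c("cond:half1", "cond:half2")]
```

The two condition slopes in the nested model differ by the interaction coefficient of the standard model, which is why the comparison via anova() above reports zero difference in degrees of freedom and fit.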

Check interactions with talker

Note that we were not able to include interactions with talker in the original model (with session as a continuous variable), so we cannot check whether the reliable effect of session and the (near) reliable session by condition interaction are qualified by an effect of talker.

However we can look at the model with session replaced by test-half.

get_coeffs(lmer.child.2, c("talker.ct", "testhalf.ct:talker.ct", "condition.ct:testhalf.ct:talker.ct"))

	Estimate	Std. Error	z value	Pr(>\|z\|)
talker.ct	0.185	0.218	0.851	0.395
testhalf.ct:talker.ct	0.189	0.284	0.666	0.506
condition.ct:testhalf.ct:talker.ct	0.732	0.339	2.162	0.031

There was no reliable effect of talker and the effect of test half isn’t qualified by an interaction with talker. However the interaction between condition and testhalf is qualified by an interaction with talker. We broke this down by looking to see whether there was a reliable effect of condition in each test half for each talker.

lmer.child.2.v3 <- glmer(accuracy ~ 1 
                  + condition.ct : testhalf : talker
                  + (condition.ct * testhalf.ct * talker.ct)
                  - condition.ct 
                  - condition.ct : testhalf.ct
                  - condition.ct: talker.ct
                  - condition.ct : testhalf.ct : talker.ct
                  + (testhalf.ct||participant),
                  data = train2.child, control = glmerControl(optimizer = "bobyqa"), family = binomial)

anova(lmer.child.2, lmer.child.2.v3)

## Data: train2.child
## Models:
## lmer.child.2: accuracy ~ 1 + (condition.ct * testhalf.ct * talker.ct) + (testhalf.ct || 
## lmer.child.2:     participant)
## lmer.child.2.v3: accuracy ~ 1 + condition.ct:testhalf:talker + (condition.ct * 
## lmer.child.2.v3:     testhalf.ct * talker.ct) - condition.ct - condition.ct:testhalf.ct - 
## lmer.child.2.v3:     condition.ct:talker.ct - condition.ct:testhalf.ct:talker.ct + 
## lmer.child.2.v3:     (testhalf.ct || participant)
##                 Df    AIC    BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## lmer.child.2    10 106402 106496 -53191   106382                        
## lmer.child.2.v3 10 106402 106496 -53191   106382     0      0          1

get_coeffs(lmer.child.2.v3, c("condition.ct:testhalf1:talkerfemale1", "condition.ct:testhalf2:talkerfemale1", "condition.ct:testhalf1:talkerfemale2", "condition.ct:testhalf2:talkerfemale2"))

	Estimate	Std. Error	z value	Pr(>\|z\|)
condition.ct:testhalf1:talkerfemale1	-0.048	0.307	-0.157	0.875
condition.ct:testhalf2:talkerfemale1	0.090	0.333	0.272	0.786
condition.ct:testhalf1:talkerfemale2	0.219	0.290	0.756	0.450
condition.ct:testhalf2:talkerfemale2	1.090	0.285	3.828	0.000

We only have an effect of condition for the more intelligible talker (female 2) in the second half of training.

Adult/Child Comparison

Since adults are at ceiling in the low-variability condition, we look at the high-variability condition only.

train2.highvar = selectCenter(subset(train2, condition == "highvar"), list("session", "talker", "agegroup"))

lmer.highvar.age <- glmer(accuracy ~ 1 
                    + session.ct * talker.ct * agegroup.ct                 
                    + (session.ct||participant),
                    data = train2.highvar, control = glmerControl(optimizer = "bobyqa"), family = binomial)

get_coeffs(lmer.highvar.age, c("agegroup.ct", "session.ct:agegroup.ct"))

	Estimate	Std. Error	z value	Pr(>\|z\|)
agegroup.ct	-1.008	0.217	-4.644	0.000
session.ct:agegroup.ct	-0.050	0.049	-1.020	0.308

There is a main effect of age group but no interaction between age group and session.

Check interactions with talker

get_coeffs(lmer.highvar.age, c("talker.ct", "talker.ct:agegroup.ct"))

	Estimate	Std. Error	z value	Pr(>\|z\|)
talker.ct	0.173	0.216	0.802	0.423
talker.ct:agegroup.ct	-1.071	0.435	-2.462	0.014

The main effect of age group is qualified by an interaction with talker, so we look to see if the benefit for adults over children holds for both talkers:

lmer.highvar.age.v2 <- glmer(accuracy ~ 1 
                      + agegroup.ct : talker
                      + session.ct * talker.ct * agegroup.ct                 
                      - agegroup.ct
                      - talker.ct : agegroup.ct
                      + (session.ct||participant),
                      data = train2.highvar, control = glmerControl(optimizer = "bobyqa"), family = binomial)

anova(lmer.highvar.age.v2,lmer.highvar.age)

## Data: train2.highvar
## Models:
## lmer.highvar.age.v2: accuracy ~ 1 + agegroup.ct:talker + session.ct * talker.ct * 
## lmer.highvar.age.v2:     agegroup.ct - agegroup.ct - talker.ct:agegroup.ct + (session.ct || 
## lmer.highvar.age.v2:     participant)
## lmer.highvar.age: accuracy ~ 1 + session.ct * talker.ct * agegroup.ct + (session.ct || 
## lmer.highvar.age:     participant)
##                     Df   AIC   BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## lmer.highvar.age.v2 10 39940 40026 -19960    39920                        
## lmer.highvar.age    10 39940 40026 -19960    39920     0      0          1

get_coeffs(lmer.highvar.age.v2, c("agegroup.ct:talkerfemale1", "agegroup.ct:talkerfemale2"))

	Estimate	Std. Error	z value	Pr(>\|z\|)
agegroup.ct:talkerfemale1	-0.465	0.304	-1.532	0.126
agegroup.ct:talkerfemale2	-1.537	0.311	-4.941	0.000

This suggests that the benefit for adults over children is only reliable for the more intelligible talker.

English Introduction

Data Preparation

Select data for analyses

engintro.child = subset(engintro, agegroup == "child")
engintro.adult = subset(engintro, agegroup == "adult")

Get means for Table 4

Adults

engintro.adult.participantmeans = aggregate(accuracy ~ participant + session + condition, na.rm = T, data = engintro.adult, FUN = mean)

engintro.adult.groupmeans <- summarySEwithin(engintro.adult.participantmeans, measurevar="accuracy", betweenvars = c("condition"), withinvars = c("session"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

# To correct the means for the imbalance in talkers, do not use the overall mean per participant; instead get the mean for each participant for each talker and average those. Error calculations remain as done over the whole participant set.

engintro.adult.participantmeans.t12 = aggregate(accuracy ~ participant + session + condition + talker, na.rm = T, data = engintro.adult, FUN = mean)

engintro.adult.groupmeans.talker1 <- summarySEwithin(subset(engintro.adult.participantmeans.t12, talker == "female1"), measurevar = "accuracy", betweenvars = c("condition"), withinvars = c("session"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

engintro.adult.groupmeans.talker2 <- summarySEwithin(subset(engintro.adult.participantmeans.t12, talker == "female2"), measurevar = "accuracy", betweenvars = c("condition"), withinvars = c("session"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

engintro.adult.groupmeans$accuracy = (engintro.adult.groupmeans.talker1$accuracy + engintro.adult.groupmeans.talker2$accuracy)/2

kable(engintro.adult.groupmeans, digits = 2)

condition	session	N	accuracy	accuracy_norm	sd	se	ci
highvar	post	22	0.98	0.99	0.10	0.02	0.04
highvar	pre	22	0.81	0.82	0.10	0.02	0.04
lowvar	post	18	1.00	0.99	0.08	0.02	0.04
lowvar	pre	18	0.82	0.81	0.08	0.02	0.04
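The talker correction can be illustrated with a toy example. This is a sketch with made-up numbers, not the study data: the data frame `toy` and its values are hypothetical. It shows how a pooled mean is pulled towards the talker that contributes more observations, whereas averaging the per-talker means weights both talkers equally.

```r
# Hypothetical toy data: participant means, with talkers unevenly represented
toy <- data.frame(
  participant = c("p1", "p2", "p3", "p4"),
  talker      = c("female1", "female1", "female1", "female2"),
  accuracy    = c(0.6, 0.7, 0.8, 0.9)
)

# Pooled mean: every row counts equally, so female1 dominates
pooled <- mean(toy$accuracy)

# Balanced mean: average within each talker first, then average the talker means
talker.means <- tapply(toy$accuracy, toy$talker, mean)
balanced     <- mean(talker.means)

c(pooled = pooled, balanced = balanced)
```

Here the pooled mean is 0.75 and the balanced mean is 0.80, because the single female2 value carries as much weight as the three female1 values together.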

Children

engintro.child.participantmeans = aggregate(accuracy ~ participant + session + condition, na.rm = T, data = engintro.child, FUN = mean)
engintro.child.groupmeans <- summarySEwithin(engintro.child.participantmeans, measurevar = "accuracy", betweenvars = c("condition"), withinvars = c("session"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

# To correct the means for the imbalance in talkers, do not use the overall mean per participant; instead get the mean for each participant for each talker and average those. Error calculations remain as done over the whole participant set.

engintro.child.participantmeans.t12 = aggregate(accuracy ~ participant + session + condition + talker, na.rm = T, data = engintro.child, FUN = mean)
 
engintro.child.groupmeans.talker1 <- summarySEwithin(subset(engintro.child.participantmeans.t12, talker == "female1"), measurevar = "accuracy", betweenvars = c("condition"), withinvars = c("session"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

engintro.child.groupmeans.talker2 <- summarySEwithin(subset(engintro.child.participantmeans.t12, talker == "female2"), measurevar = "accuracy", betweenvars = c("condition"), withinvars = c("session"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

engintro.child.groupmeans$accuracy = (engintro.child.groupmeans.talker1$accuracy + engintro.child.groupmeans.talker2$accuracy)/2

kable(engintro.child.groupmeans, digits = 2)

condition	session	N	accuracy	accuracy_norm	sd	se	ci
highvar	post	28	0.88	0.88	0.15	0.03	0.06
highvar	pre	28	0.50	0.50	0.15	0.03	0.06
lowvar	post	24	0.92	0.92	0.09	0.02	0.04
lowvar	pre	24	0.47	0.47	0.09	0.02	0.04

Statistical Analyses

Note that we do not run statistical analyses on the adult data because performance is close to ceiling, even at pre-test.

Children

engintro.child = selectCenter(engintro.child, list("session", "condition", "talker"))
lme.engintrochild <- glmer(accuracy ~ 1 
                           + session.ct * condition.ct *  talker.ct
                           + (session.ct|participant), 
                           control=glmerControl(optimizer = "bobyqa"), data = engintro.child, family = binomial)
get_coeffs(lme.engintrochild, c("session.ct", "condition.ct", "session.ct:condition.ct"))

	Estimate	Std. Error	z value	Pr(>|z|)
session.ct	-3.191	0.261	-12.232	0.000
condition.ct	0.331	0.285	1.161	0.246
session.ct:condition.ct	-0.946	0.505	-1.874	0.061

There was a reliable main effect of session, but not of condition. There was a marginal interaction between session and condition.

Check interactions with talker

get_coeffs(lme.engintrochild , c("talker.ct", "session.ct:talker.ct", "session.ct:condition.ct:talker.ct"))

	Estimate	Std. Error	z value	Pr(>|z|)
talker.ct	-0.098	0.281	-0.348	0.728
session.ct:talker.ct	0.138	0.496	0.279	0.780
session.ct:condition.ct:talker.ct	-0.555	1.001	-0.555	0.579

There was no main effect of talker and neither the main effect of condition nor the interaction between condition and session was qualified by an interaction with talker.

Lexical Decision

Data Preparation

Select data for analyses

Remove practice and nonword trials (which aren’t used in any analyses) and then subset into the adult and child data sets for Greek and English primes.

lexd = subset(lexd, trialnumber != "practice" & trialtype != "NONWORD")
lexd.adult.english = subset(lexd, agegroup == "adult" & primelang == "english")
lexd.adult.greek = subset(lexd, agegroup =="adult" & primelang == "greek")
lexd.child.english = subset(lexd, agegroup == "child" & primelang == "english")
lexd.child.greek = subset(lexd, agegroup == "child" & primelang == "greek")

Filtering for RT analyses

For each of these data sets, remove the trials that were not answered correctly, and filter the RT data using the filter2 function. This function removes trials where the RT is more than 2.5 SD above or below the mean value for that participant in that session. Finally, remove any remaining RTs that are below 200 ms.

flist = c("participant", "session") # list of parameters for calculating means
lim = 2.5 # number of standard deviations 

lexd.adult.greek.RT = subset(lexd.adult.greek, accuracy == 1)
lexd.adult.greek.RT = filter2(lexd.adult.greek.RT, flist, "RT", lim)
lexd.adult.greek.RT$RTfilt[which(lexd.adult.greek.RT$RTfilt < 200)] <- NA # index on the filtered (.RT) data frame

lexd.child.greek.RT = subset(lexd.child.greek, accuracy == 1)
lexd.child.greek.RT = filter2(lexd.child.greek.RT, flist, "RT", lim)
lexd.child.greek.RT$RTfilt[which(lexd.child.greek.RT$RTfilt < 200)] <- NA # index on the filtered (.RT) data frame

lexd.adult.english.RT = subset(lexd.adult.english, accuracy == 1)
lexd.adult.english.RT = filter2(lexd.adult.english.RT, flist, "RT", lim)
lexd.adult.english.RT$RTfilt[which(lexd.adult.english.RT$RTfilt < 200)] <- NA # index on the filtered (.RT) data frame

lexd.child.english.RT = subset(lexd.child.english, accuracy == 1)
lexd.child.english.RT = filter2(lexd.child.english.RT, flist, "RT", lim)
lexd.child.english.RT$RTfilt[which(lexd.child.english.RT$RTfilt < 200)] <- NA # index on the filtered (.RT) data frame
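The trimming rule described above can be sketched in base R. This is not the filter2 helper itself: `trim_sketch`, its arguments and the toy data are hypothetical and only illustrate the rule (drop RTs more than `lim` SDs from the mean of their participant-by-session cell, then drop anything below 200 ms).

```r
# Hypothetical stand-in for the trimming rule; not the filter2 helper itself
trim_sketch <- function(d, groups, rtvar = "RT", lim = 2.5, floor_ms = 200) {
  rt  <- d[[rtvar]]
  grp <- interaction(d[groups], drop = TRUE)   # one cell per participant x session
  m   <- ave(rt, grp, FUN = function(v) mean(v, na.rm = TRUE))  # cell mean
  s   <- ave(rt, grp, FUN = function(v) sd(v, na.rm = TRUE))    # cell SD
  keep <- !is.na(rt) & abs(rt - m) <= lim * s & rt >= floor_ms
  d$RTfilt <- ifelse(keep, rt, NA)             # trimmed RTs become NA
  d
}

# Toy data: 18 ordinary RTs, one very slow response and one anticipation
toy <- data.frame(participant = "p1", session = "pre",
                  RT = c(rep(c(950, 1050), 9), 5000, 150))
out <- trim_sketch(toy, c("participant", "session"))
out$RT[is.na(out$RTfilt)]   # the RTs that were trimmed
```

In this toy cell the 5000 ms response falls outside 2.5 SD of the mean, while the 150 ms response survives the SD criterion and is only caught by the 200 ms floor, which is why both steps are needed.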

Calculate % trials removed because targets were incorrectly identified as nonwords:

#adult, greek-primes percent incorrect: 
round(100 * (1 - dim(lexd.adult.greek.RT)[1]/dim(lexd.adult.greek)[1]))

## [1] 5

#child, greek-primes percent incorrect: 
round(100 * (1 - dim(lexd.child.greek.RT)[1]/dim(lexd.child.greek)[1]))

## [1] 10

#adult, english-primes percent incorrect: 
round(100 * (1 - dim(lexd.adult.english.RT)[1]/dim(lexd.adult.english)[1]))

## [1] 6

#child, english-primes percent incorrect: 
round(100 * (1 - dim(lexd.child.english.RT)[1]/dim(lexd.child.english)[1]))

## [1] 16

Calculate % of further trials removed due to filtering:

#adult, greek-primes  
round((sum(is.na(lexd.adult.greek.RT$RTfilt))/sum(!is.na(lexd.adult.greek.RT$RT)))*100)

## [1] 3

#child, greek-primes 
round((sum(is.na(lexd.child.greek.RT$RTfilt))/sum(!is.na(lexd.child.greek.RT$RT)))*100)

## [1] 3

#adult, english-primes 
round((sum(is.na(lexd.adult.english.RT$RTfilt))/sum(!is.na(lexd.adult.english.RT$RT)))*100)

## [1] 4

#child, english-primes 
round((sum(is.na(lexd.child.english.RT$RTfilt))/sum(!is.na(lexd.child.english.RT$RT)))*100)

## [1] 4

Get Means and Plots

Greek Primes

Adult Accuracy

lexd.adult.greek.participantmeans = aggregate(accuracy ~ participant + session + primetarget_relationship, na.rm = T, data = lexd.adult.greek, FUN = mean)

lexd.adult.greek.groupmeans <- summarySEwithin(lexd.adult.greek.participantmeans, measurevar = "accuracy", betweenvars = NULL, withinvars = c("session", "primetarget_relationship"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

kable(lexd.adult.greek.groupmeans, digits = 3)

session	primetarget_relationship	N	accuracy	accuracy_norm	sd	se	ci
post	related	41	0.973	0.973	0.032	0.005	0.010
post	unrelated	41	0.923	0.923	0.048	0.007	0.015
pre	related	41	0.974	0.974	0.046	0.007	0.014
pre	unrelated	41	0.928	0.928	0.047	0.007	0.015

Adults RT

lexd.adult.greek.RT.participantmeans = aggregate(RTfilt ~ participant + session + primetarget_relationship, na.rm = T, data = lexd.adult.greek.RT, FUN = mean)

lexd.adult.greek.RT.groupmeans <- summarySEwithin(lexd.adult.greek.RT.participantmeans, measurevar = "RTfilt", betweenvars = NULL, withinvars = c("session","primetarget_relationship"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

kable(lexd.adult.greek.RT.groupmeans, digits = 3)

session	primetarget_relationship	N	RTfilt	RTfilt_norm	sd	se	ci
post	related	41	1007.995	1007.995	184.290	28.781	58.169
post	unrelated	41	1087.945	1087.945	144.334	22.541	45.557
pre	related	41	1031.923	1031.923	143.959	22.483	45.439
pre	unrelated	41	1142.395	1142.395	157.665	24.623	49.765

Children Accuracy

lexd.child.greek.participantmeans = aggregate(accuracy ~ participant + session + primetarget_relationship, na.rm = T, data = lexd.child.greek, FUN = mean)

lexd.child.greek.groupmeans <- summarySEwithin(lexd.child.greek.participantmeans, measurevar = "accuracy", betweenvars = NULL, withinvars = c("session", "primetarget_relationship"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

kable(lexd.child.greek.groupmeans, digits = 3)

session	primetarget_relationship	N	accuracy	accuracy_norm	sd	se	ci
post	related	52	0.907	0.907	0.095	0.013	0.026
post	unrelated	52	0.851	0.851	0.099	0.014	0.028
pre	related	52	0.946	0.946	0.094	0.013	0.026
pre	unrelated	52	0.881	0.881	0.118	0.016	0.033

Children RT

lexd.child.greek.RT.participantmeans = aggregate(RTfilt ~ participant + session + primetarget_relationship, na.rm = T, data = lexd.child.greek.RT, FUN = mean)

lexd.child.greek.RT.groupmeans <- summarySEwithin(lexd.child.greek.RT.participantmeans, measurevar = "RTfilt", betweenvars = NULL, withinvars = c("session", "primetarget_relationship"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

kable(lexd.child.greek.RT.groupmeans, digits = 3)

session	primetarget_relationship	N	RTfilt	RTfilt_norm	sd	se	ci
post	related	52	1389.938	1389.938	356.514	49.440	99.254
post	unrelated	52	1461.169	1461.169	330.132	45.781	91.909
pre	related	52	1552.385	1552.385	393.381	54.552	109.518
pre	unrelated	52	1612.984	1612.984	269.153	37.325	74.933

English Primes

Adult Accuracy

lexd.adult.english.participantmeans = aggregate(accuracy ~ participant + session + primetarget_relationship + condition, na.rm = T, data = lexd.adult.english, FUN = mean)

lexd.adult.english.groupmeans <- summarySEwithin(lexd.adult.english.participantmeans, measurevar = "accuracy", betweenvars = c("condition"), withinvars = c("session", "primetarget_relationship"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

# To correct means for imbalance in talkers, get the mean values for each participant for each talker (average mean values for each talker); error calculations as done over the whole participant set

lexd.adult.english.participantmeans.t12 = aggregate(accuracy ~ participant + session + condition + talker + primetarget_relationship, na.rm = T, data = lexd.adult.english, FUN = mean)

lexd.adult.english.groupmeans.talker1 <- summarySEwithin(subset(lexd.adult.english.participantmeans.t12, talker == "female1"), measurevar ="accuracy", betweenvars = c("condition"), withinvars = c("session", "primetarget_relationship"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

lexd.adult.english.groupmeans.talker2 <- summarySEwithin(subset(lexd.adult.english.participantmeans.t12, talker == "female2"), measurevar = "accuracy", betweenvars = c("condition"), withinvars = c("session", "primetarget_relationship"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

lexd.adult.english.groupmeans$accuracy = (lexd.adult.english.groupmeans.talker1$accuracy + lexd.adult.english.groupmeans.talker2$accuracy)/2

kable(lexd.adult.english.groupmeans, digits = 3)

condition	session	primetarget_relationship	N	accuracy	accuracy_norm	sd	se	ci
highvar	post	related	22	0.982	0.976	0.036	0.008	0.016
highvar	post	unrelated	22	0.923	0.917	0.051	0.011	0.023
highvar	pre	related	22	0.980	0.974	0.033	0.007	0.015
highvar	pre	unrelated	22	0.902	0.897	0.058	0.012	0.026
lowvar	post	related	19	0.972	0.977	0.036	0.008	0.017
lowvar	post	unrelated	19	0.894	0.898	0.067	0.015	0.032
lowvar	pre	related	19	0.958	0.967	0.037	0.009	0.018
lowvar	pre	unrelated	19	0.914	0.922	0.045	0.010	0.021

Adult RT

lexd.adult.english.RT.participantmeans = aggregate(RTfilt ~ participant + session + primetarget_relationship + condition, na.rm = T, data = lexd.adult.english.RT, FUN = mean)

lexd.adult.english.RT.groupmeans <- summarySEwithin(lexd.adult.english.RT.participantmeans, measurevar = "RTfilt", betweenvars = c("condition"), withinvars = c("session", "primetarget_relationship"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

# To correct means for imbalance in talkers, get the mean values for each participant for each talker (average mean values for each talker); error calculations as done over the whole participant set

lexd.adult.english.RT.participantmeans.t12 = aggregate(RTfilt ~ participant + session + condition + talker + primetarget_relationship, na.rm = T, data = lexd.adult.english.RT, FUN = mean)

lexd.adult.english.RT.groupmeans.talker1 <- summarySEwithin(subset(lexd.adult.english.RT.participantmeans.t12, talker == "female1"), measurevar = "RTfilt", betweenvars = c("condition"), withinvars = c("session","primetarget_relationship"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

lexd.adult.english.RT.groupmeans.talker2 <- summarySEwithin(subset(lexd.adult.english.RT.participantmeans.t12, talker == "female2"), measurevar = "RTfilt", betweenvars = c("condition"), withinvars = c("session","primetarget_relationship"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

lexd.adult.english.RT.groupmeans$RTfilt = (lexd.adult.english.RT.groupmeans.talker1$RTfilt + lexd.adult.english.RT.groupmeans.talker2$RTfilt)/2

kable(lexd.adult.english.RT.groupmeans, digits = 3)

condition	session	primetarget_relationship	N	RTfilt	RTfilt_norm	sd	se	ci
highvar	post	related	22	1151.098	1138.351	199.289	42.489	88.360
highvar	post	unrelated	22	1234.067	1221.320	220.874	47.091	97.930
highvar	pre	related	22	1225.341	1212.594	289.115	61.640	128.186
highvar	pre	unrelated	22	1226.076	1213.328	199.420	42.516	88.418
lowvar	post	related	19	1097.814	1116.801	93.480	21.446	45.056
lowvar	post	unrelated	19	1208.202	1231.300	239.688	54.988	115.526
lowvar	pre	related	19	1138.798	1151.622	129.654	29.745	62.491
lowvar	pre	unrelated	19	1273.559	1285.869	166.363	38.166	80.184

Plot for Figure 4

Note that we have collapsed across condition here.

# First get the means (collapsed across conditions) to plot

lexd.adult.english.RT.participantmeans = aggregate(RTfilt ~ participant + session + primetarget_relationship, na.rm = T, data = lexd.adult.english.RT, FUN = mean)

lexd.adult.english.RT.groupmeans <- summarySEwithin(lexd.adult.english.RT.participantmeans, measurevar = "RTfilt", betweenvars = NULL, withinvars = c("session","primetarget_relationship"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

# To correct means for imbalance in talkers, get the mean values for each participant for each talker (average mean values for each talker); error calculations as done over the whole participant set

lexd.adult.english.RT.participantmeans.t12 = aggregate(RTfilt ~ participant + session + talker + primetarget_relationship, na.rm = T, data = lexd.adult.english.RT, FUN = mean)

lexd.adult.english.RT.groupmeans.talker1 <- summarySEwithin(subset(lexd.adult.english.RT.participantmeans.t12, talker == "female1"), measurevar = "RTfilt", betweenvars = NULL, withinvars = c("session","primetarget_relationship"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

lexd.adult.english.RT.groupmeans.talker2 <- summarySEwithin(subset(lexd.adult.english.RT.participantmeans.t12, talker == "female2"), measurevar = "RTfilt", betweenvars = NULL, withinvars = c("session", "primetarget_relationship"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

lexd.adult.english.RT.groupmeans$RTfilt = (lexd.adult.english.RT.groupmeans.talker1$RTfilt + lexd.adult.english.RT.groupmeans.talker2$RTfilt)/2

# Now plot

lexd.adult.english.RT.groupmeans$session = relevel(lexd.adult.english.RT.groupmeans$session, "pre")

p = ggplot(lexd.adult.english.RT.groupmeans, aes(x = session, y = RTfilt, fill = primetarget_relationship))
p = p + geom_bar(stat = "identity", position = "dodge", colour = "black", size = .3)
p = p + geom_errorbar(aes(ymin = RTfilt - ci, ymax = RTfilt + ci), width = .4, position = position_dodge(.9))
p = p + xlab("") + ylab("RT(ms)")
p = p + scale_x_discrete(labels = c("pre" = "Pre test", "post" = "Post test"), expand = c(0, 0.6))
p = p + scale_fill_manual(values = c("grey", "white"), name = "Prime target relationship", labels = c("related", "unrelated"))
#p = p + scale_y_continuous(limits = c(0,1.1), expand = c(0, 0))
p = p + coord_cartesian(ylim = c(1000, 1900))

p

Children Accuracy

lexd.child.english.participantmeans = aggregate(accuracy ~ participant + session + primetarget_relationship + condition, na.rm = T, data = lexd.child.english, FUN = mean)

lexd.child.english.groupmeans <- summarySEwithin(lexd.child.english.participantmeans, measurevar = "accuracy", betweenvars = c("condition"), withinvars = c("session", "primetarget_relationship"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

# To correct means for imbalance in talkers, get the mean values for each participant for each talker (average mean values for each talker); error calculations as done over the whole participant set

lexd.child.english.participantmeans.t12 = aggregate(accuracy ~ participant + session + condition + talker + primetarget_relationship, na.rm = T, data = lexd.child.english, FUN = mean)

lexd.child.english.groupmeans.talker1 <- summarySEwithin(subset(lexd.child.english.participantmeans.t12, talker == "female1"), measurevar = "accuracy", betweenvars = c("condition"), withinvars = c("session","primetarget_relationship"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

lexd.child.english.groupmeans.talker2 <- summarySEwithin(subset(lexd.child.english.participantmeans.t12, talker == "female2"), measurevar = "accuracy", betweenvars = c("condition"), withinvars = c("session", "primetarget_relationship"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

lexd.child.english.groupmeans$accuracy = (lexd.child.english.groupmeans.talker1$accuracy + lexd.child.english.groupmeans.talker2$accuracy)/2

kable(lexd.child.english.groupmeans, digits = 3)

condition	session	primetarget_relationship	N	accuracy	accuracy_norm	sd	se	ci
highvar	post	related	28	0.891	0.876	0.103	0.019	0.040
highvar	post	unrelated	28	0.834	0.819	0.114	0.021	0.044
highvar	pre	related	28	0.884	0.869	0.086	0.016	0.033
highvar	pre	unrelated	28	0.820	0.805	0.129	0.024	0.050
lowvar	post	related	24	0.867	0.889	0.113	0.023	0.048
lowvar	post	unrelated	24	0.815	0.836	0.094	0.019	0.040
lowvar	pre	related	24	0.838	0.858	0.084	0.017	0.035
lowvar	pre	unrelated	24	0.770	0.787	0.132	0.027	0.056

Children RT

lexd.child.english.RT.participantmeans = aggregate(RTfilt ~ participant + session + primetarget_relationship + condition, na.rm = T, data = lexd.child.english.RT, FUN = mean)

lexd.child.english.RT.groupmeans <- summarySEwithin(lexd.child.english.RT.participantmeans, measurevar = "RTfilt", betweenvars = c("condition"), withinvars = c("session", "primetarget_relationship"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

# To correct means for imbalance in talkers, get the mean values for each participant for each talker (average mean values for each talker); error calculations as done over the whole participant set

lexd.child.english.RT.participantmeans.t12 = aggregate(RTfilt~ participant + session + condition + talker + primetarget_relationship, na.rm = T, data = lexd.child.english.RT, FUN = mean)

lexd.child.english.RT.groupmeans.talker1 <- summarySEwithin(subset(lexd.child.english.RT.participantmeans.t12, talker == "female1"), measurevar = "RTfilt", betweenvars = c("condition"), withinvars = c("session", "primetarget_relationship"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

lexd.child.english.RT.groupmeans.talker2 <- summarySEwithin(subset(lexd.child.english.RT.participantmeans.t12, talker == "female2"), measurevar = "RTfilt", betweenvars = c("condition"), withinvars = c("session", "primetarget_relationship"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

lexd.child.english.RT.groupmeans$RTfilt = (lexd.child.english.RT.groupmeans.talker1$RTfilt + lexd.child.english.RT.groupmeans.talker2$RTfilt)/2

kable(lexd.child.english.RT.groupmeans, digits = 3)

condition	session	primetarget_relationship	N	RTfilt	RTfilt_norm	sd	se	ci
highvar	post	related	28	1486.777	1536.915	378.880	71.602	146.914
highvar	post	unrelated	28	1574.379	1624.517	274.708	51.915	106.521
highvar	pre	related	28	1669.582	1719.720	351.204	66.371	136.183
highvar	pre	unrelated	28	1665.049	1715.187	290.669	54.931	112.710
lowvar	post	related	24	1494.584	1439.786	331.837	67.736	140.122
lowvar	post	unrelated	24	1622.993	1562.857	468.758	95.685	197.939
lowvar	pre	related	24	1811.015	1765.427	444.960	90.827	187.890
lowvar	pre	unrelated	24	1869.315	1828.268	409.474	83.584	172.906

Plot for Figure 4

Note that we have collapsed across condition here.

# First get the means (collapsed across conditions) to plot

lexd.child.english.RT.participantmeans = aggregate(RTfilt ~ participant + session + primetarget_relationship, na.rm = T, data = lexd.child.english.RT, FUN = mean)

lexd.child.english.RT.groupmeans <- summarySEwithin(lexd.child.english.RT.participantmeans, measurevar = "RTfilt", betweenvars = NULL, withinvars = c("session", "primetarget_relationship"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

# To correct means for imbalance in talkers, get the mean values for each participant for each talker (average mean values for each talker); error calculations as done over the whole participant set

lexd.child.english.RT.participantmeans.t12 = aggregate(RTfilt ~ participant + session + talker + primetarget_relationship, na.rm = T, data = lexd.child.english.RT, FUN = mean)

lexd.child.english.RT.groupmeans.talker1 <- summarySEwithin(subset(lexd.child.english.RT.participantmeans.t12, talker == "female1"), measurevar = "RTfilt", betweenvars = NULL, withinvars = c("session", "primetarget_relationship"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

lexd.child.english.RT.groupmeans.talker2 <- summarySEwithin(subset(lexd.child.english.RT.participantmeans.t12, talker == "female2"), measurevar = "RTfilt", betweenvars = NULL, withinvars = c("session", "primetarget_relationship"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

lexd.child.english.RT.groupmeans$RTfilt = (lexd.child.english.RT.groupmeans.talker1$RTfilt + lexd.child.english.RT.groupmeans.talker2$RTfilt)/2

# Now plot

lexd.child.english.RT.groupmeans$session = relevel(lexd.child.english.RT.groupmeans$session, "pre")

p = ggplot(lexd.child.english.RT.groupmeans, aes(x = session, y = RTfilt, fill = primetarget_relationship))
p = p + geom_bar(stat="identity", position="dodge", colour="black", size=.3)
p = p + geom_errorbar(aes(ymin = RTfilt-ci, ymax=RTfilt+ci), width = .4, position = position_dodge(.9))
p = p + xlab("") + ylab("RT(ms)")
p = p + scale_x_discrete(labels = c("pre" = "Pre test", "post" = "Post test"), expand = c(0, 0.6))
p = p + scale_fill_manual(values = c("grey", "white"), name = "Prime target relationship", labels = c("related", "unrelated"))
#p = p + scale_y_continuous(limits=c(0,1.1), expand = c(0, 0))
p = p + coord_cartesian(ylim = c(1000, 1900))

p

Statistical Analyses

Adults, Greek, RT

Note that for these analyses, which have a continuous predicted variable and use lmer rather than glmer, p values are obtained using the package lmerTest. A Kenward-Roger approximation is used (a computational error occurs when using the default Satterthwaite approximation). Note also that “talker” is not included as a variable, since only one Greek talker was used for these stimuli.

lexd.adult.greek.RT$primetarget_relationship = factor(lexd.adult.greek.RT$primetarget_relationship) 

lexd.adult.greek.RT = selectCenter(lexd.adult.greek.RT, list("session", "primetarget_relationship"))

lme.greek.adult<- lmer(RTfilt ~ 1
                      + session.ct * primetarget_relationship.ct
                      + (session.ct * primetarget_relationship.ct|participant), 
                      data = lexd.adult.greek.RT, control = lmerControl(optimizer = "bobyqa"), REML = F )
kable(summary(lme.greek.adult, ddf = "kenw")$coefficients, digits = 3)

	Estimate	Std. Error	df	t value	Pr(>|t|)
(Intercept)	1066.968	23.936	42.066	44.028	0.000
session.ct	37.904	39.445	42.220	0.949	0.348
primetarget_relationship.ct	101.326	13.516	49.702	7.403	0.000
session.ct:primetarget_relationship.ct	17.707	26.409	45.666	0.668	0.507

An effect of prime-target relationship (related = 1014, unrelated = 1114), but no effect of test session and no interaction.

Adult, English, RT

Begin by looking at experimental effects with predictions:

lexd.adult.english.RT$primetarget_relationship = factor(lexd.adult.english.RT$primetarget_relationship)
lexd.adult.english.RT$talker = factor(lexd.adult.english.RT$talker)

lexd.adult.english.RT = selectCenter(lexd.adult.english.RT, list("session", "primetarget_relationship", "talker", "condition"))

lme.english.adult<- lmer(RTfilt ~ 1
                        + session.ct * primetarget_relationship.ct * condition.ct * talker.ct
                        + (session.ct* primetarget_relationship.ct|participant), 
                        data = lexd.adult.english.RT, control = lmerControl(optimizer = "bobyqa"), REML = F )
get_coeffs_lmerTest(lme.english.adult, c("(Intercept)", "session.ct", "primetarget_relationship.ct", "session.ct:primetarget_relationship.ct", "session.ct:primetarget_relationship.ct:condition.ct"))

	Estimate	Std. Error	df	t value	Pr(>|t|)
(Intercept)	1195.162	29.115	46.830	39.026	0.000
session.ct	42.175	48.643	48.747	0.824	0.414
primetarget_relationship.ct	83.619	20.127	57.046	4.036	0.000
session.ct:primetarget_relationship.ct	-37.261	41.760	55.469	-0.854	0.397
session.ct:primetarget_relationship.ct:condition.ct	95.898	83.810	55.473	1.096	0.278

An effect of prime-target relationship (related = 1153, unrelated = 1238), but no effect of test session, no interaction between prime-target relationship and test session, and no three-way interaction between condition, prime-target relationship and test session.

This indicates semantic priming across the languages, but gives no indication that it is influenced by training.

Check interactions with talker

Check whether effect of prime-target relationship is qualified by an interaction with talker

get_coeffs_lmerTest(lme.english.adult, c("talker.ct", "primetarget_relationship.ct:talker.ct"))

	Estimate	Std. Error	df	t value	Pr(>|t|)
talker.ct	-47.841	57.930	49.687	-0.782	0.438
primetarget_relationship.ct:talker.ct	-71.576	40.291	59.469	-1.721	0.090

No overall effect of talker, but talker did interact marginally with prime-target relationship. Looking at the means, this reflects more priming with the less intelligible talker (female 1):

round(with(lexd.adult.english.RT, tapply(RTfilt, list(primetarget_relationship, talker), mean, na.rm = T)))

##           female1 female2
## related      1163    1143
## unrelated    1284    1193
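The size of the priming effect for each talker follows directly from the table above. The snippet below simply re-enters the four rounded cell means by hand (the object names `rt.means` and `priming` are introduced here for illustration) and takes the unrelated minus related difference.

```r
# Rounded cell means copied from the table above (ms)
rt.means <- matrix(c(1163, 1284, 1143, 1193), nrow = 2,
                   dimnames = list(c("related", "unrelated"),
                                   c("female1", "female2")))

# Priming effect = unrelated RT minus related RT, per talker
priming <- rt.means["unrelated", ] - rt.means["related", ]
priming
```

This gives a priming effect of 121 ms for female 1 against 50 ms for female 2, which is the pattern behind the marginal interaction.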

Child, Greek, RT

lexd.child.greek.RT$primetarget_relationship = factor(lexd.child.greek.RT$primetarget_relationship) 

lexd.child.greek.RT = selectCenter(lexd.child.greek.RT, list("session", "primetarget_relationship"))
lme.greek.child<- lmer(RTfilt ~ 1
                      + session.ct * primetarget_relationship.ct
                      + (primetarget_relationship.ct * session.ct|participant), 
                      data = lexd.child.greek.RT, control = lmerControl(optimizer = "bobyqa") , REML = F)
kable(summary(lme.greek.child, ddf = "kenw")$coefficients, digits = 3)

	Estimate	Std. Error	df	t value	Pr(>|t|)
(Intercept)	1510.026	43.241	53.549	34.568	0.000
session.ct	136.083	69.209	54.156	1.946	0.057
primetarget_relationship.ct	72.343	29.209	52.715	2.467	0.017
session.ct:primetarget_relationship.ct	-25.249	67.967	71.995	-0.368	0.714

There was a significant effect of prime-target relationship (related = 1487 unrelated = 1549), and a marginal effect of test session (pre = 1563 post = 1469). The interaction between prime-target relationship and session was non-significant.

Child, English, RT

lexd.child.english.RT$primetarget_relationship = factor(lexd.child.english.RT$primetarget_relationship)
lexd.child.english.RT$talker = factor(lexd.child.english.RT$talker)

lexd.child.english.RT = selectCenter(lexd.child.english.RT, list("session", "primetarget_relationship", "talker", "condition"))

lme.english.child<- lmer(RTfilt ~ 1
                      + session.ct * primetarget_relationship.ct * condition.ct * talker.ct
                      + (session.ct * primetarget_relationship.ct|participant), 
                      data = lexd.child.english.RT, control = lmerControl(optimizer = "bobyqa"), REML = F)
get_coeffs_lmerTest(lme.english.child, c("(Intercept)", "session.ct", "primetarget_relationship.ct", "session.ct:primetarget_relationship.ct", "session.ct:primetarget_relationship.ct:condition.ct"))

	Estimate	Std. Error	df	t value	Pr(>|t|)
(Intercept)	1641.142	48.517	57.344	32.416	0.000
session.ct	181.883	74.463	57.593	2.339	0.023
primetarget_relationship.ct	65.165	33.389	47.873	1.945	0.058
session.ct:primetarget_relationship.ct	-52.665	66.597	47.363	-0.788	0.434
session.ct:primetarget_relationship.ct:condition.ct	-7.548	133.805	47.616	-0.056	0.955

There is a marginal effect of prime-target relationship (related = 1620, unrelated = 1664) and a main effect of test session (pre = 1704, post = 1580).

Check interactions with talker

get_coeffs_lmerTest(lme.english.child, c("(Intercept)", "talker.ct", "session.ct:talker.ct", "primetarget_relationship.ct:talker.ct"))

	Estimate	Std. Error	df	t value	Pr(>|t|)
(Intercept)	1641.142	48.517	57.344	32.416	0.000
talker.ct	-70.538	97.110	57.398	-0.696	0.489
session.ct:talker.ct	-245.946	149.068	57.665	-1.580	0.120
primetarget_relationship.ct:talker.ct	-8.199	66.880	47.941	-0.122	0.903

No effect of talker or interactions involving this variable.

Child, English, Accuracy

lexd.child.english$primetarget_relationship = factor(lexd.child.english$primetarget_relationship)
lexd.child.english$talker = factor(lexd.child.english$talker)

lexd.child.english = selectCenter(lexd.child.english, list("session", "primetarget_relationship", "talker", "condition"))

lme.english.child.acc<- glmer(accuracy ~ 1
                              + session.ct * primetarget_relationship.ct * condition.ct * talker.ct
                              + (session.ct * primetarget_relationship.ct|participant), 
                              data = lexd.child.english, control = glmerControl(optimizer = "bobyqa" ), family = binomial)
get_coeffs(lme.english.child.acc, c("(Intercept)", "session.ct", "primetarget_relationship.ct", "session.ct:primetarget_relationship.ct", "session.ct:primetarget_relationship.ct:condition.ct"))

	Estimate	Std. Error	z value	Pr(>|z|)
(Intercept)	2.029	0.132	15.355	0.000
session.ct	-0.320	0.168	-1.906	0.057
primetarget_relationship.ct	-0.708	0.126	-5.596	0.000
session.ct:primetarget_relationship.ct	0.261	0.232	1.123	0.261
session.ct:primetarget_relationship.ct:condition.ct	0.012	0.399	0.030	0.976

There is a reliable main effect of prime-target relationship (related = 0.87, unrelated = 0.81) and a marginal main effect of test session (pre = 0.83, post = 0.85). No other reliable effects.
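Because this is a binomial model with a logit link, the estimates in the table are on the log-odds scale. The following is a minimal sketch of how they can be read back as a probability and an odds ratio, using only the values reported above. It assumes that `selectCenter` codes the two levels of a factor one unit apart (the helper is not shown in this section), and the back-transformed intercept is only an approximation for a typical participant, not a marginal mean.

```r
# Estimates copied from the coefficient table above (log-odds scale).
b0    <- 2.029    # (Intercept)
b_rel <- -0.708   # primetarget_relationship.ct

# Inverse logit of the intercept: approximate accuracy for a typical
# participant when all centered predictors are at zero.
p_intercept <- plogis(b0)

# Odds ratio for a one-unit change in the centered predictor
# (assumed here to correspond to the related/unrelated contrast).
or_rel <- exp(b_rel)

round(c(p_intercept = p_intercept, or_rel = or_rel), 3)
```

Which level has the higher odds depends on the sign convention of the centered coding, so the direction should be read from the cell means rather than from the sign alone.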

Check interactions with talker

get_coeffs(lme.english.child.acc, c("(Intercept)", "talker.ct", "primetarget_relationship.ct:talker.ct", "session.ct:talker.ct"))

	Estimate	Std. Error	z value	Pr(>|z|)
(Intercept)	2.029	0.132	15.355	0.000
talker.ct	-0.195	0.260	-0.750	0.453
primetarget_relationship.ct:talker.ct	0.130	0.227	0.575	0.566
session.ct:talker.ct	0.625	0.322	1.945	0.052

There is no overall effect of talker, and the effect of prime-target relationship isn’t qualified by an interaction with talker. However, the marginal effect of session is qualified by an interaction with talker. This reflects the fact that children only showed improvements from pre- to post-test with one of the two talkers (female 1, the less intelligible talker).

female 1: pre = 0.82, post = 0.89; female 2: pre = 0.84, post = 0.81

Discrimination

Data Preparation

Select data for analyses

discrim.child = subset(discrim, agegroup == "child")
discrim.adult = subset(discrim, agegroup == "adult")

Get Means and Plots

Adults

Table of Means

discrim.adult.participantmeans = aggregate(accuracy ~ participant + session + condition + word_oldnew + voice_oldnew, na.rm = T, data = discrim.adult, FUN = mean)

discrim.adult.groupmeans <- summarySEwithin(discrim.adult.participantmeans, measurevar = "accuracy", betweenvars = c("condition"), withinvars = c("session", "voice_oldnew", "word_oldnew"), idvar = "participant", na.rm = FALSE, conf.interval  = .95)

# To correct the means for the imbalance in talkers, get each participant's mean for each talker and average the group means across the two talkers; error calculations are done over the whole participant set

discrim.adult.participantmeans.t12 = aggregate(accuracy ~ participant + session + condition + talker + word_oldnew + voice_oldnew, na.rm = T, data = discrim.adult, FUN = mean)

discrim.adult.groupmeans.talker1 <- summarySEwithin(subset(discrim.adult.participantmeans.t12, talker == "female1"), measurevar = "accuracy", betweenvars = c("condition"), withinvars = c("session", "voice_oldnew", "word_oldnew"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

discrim.adult.groupmeans.talker2 <- summarySEwithin(subset(discrim.adult.participantmeans.t12, talker == "female2"), measurevar = "accuracy", betweenvars = c("condition"), withinvars = c("session", "voice_oldnew", "word_oldnew"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

discrim.adult.groupmeans$accuracy = (discrim.adult.groupmeans.talker1$accuracy + discrim.adult.groupmeans.talker2$accuracy)/2

kable(discrim.adult.groupmeans, digits = 3)

condition	session	voice_oldnew	word_oldnew	N	accuracy	accuracy_norm	sd	se	ci
highvar	post	newvoice	newword	22	0.836	0.835	0.193	0.041	0.086
highvar	post	newvoice	oldword	22	0.823	0.821	0.134	0.029	0.059
highvar	post	oldvoice	newword	22	0.800	0.798	0.184	0.039	0.082
highvar	post	oldvoice	oldword	22	0.841	0.839	0.175	0.037	0.078
highvar	pre	newvoice	newword	22	0.709	0.708	0.191	0.041	0.085
highvar	pre	newvoice	oldword	22	0.755	0.753	0.185	0.039	0.082
highvar	pre	oldvoice	newword	22	0.800	0.798	0.186	0.040	0.082
highvar	pre	oldvoice	oldword	22	0.764	0.762	0.186	0.040	0.083
lowvar	post	newvoice	newword	19	0.836	0.833	0.205	0.047	0.099
lowvar	post	newvoice	oldword	19	0.810	0.807	0.186	0.043	0.090
lowvar	post	oldvoice	newword	19	0.791	0.802	0.186	0.043	0.089
lowvar	post	oldvoice	oldword	19	0.776	0.786	0.161	0.037	0.078
lowvar	pre	newvoice	newword	19	0.812	0.812	0.126	0.029	0.061
lowvar	pre	newvoice	oldword	19	0.767	0.765	0.179	0.041	0.086
lowvar	pre	oldvoice	newword	19	0.741	0.749	0.186	0.043	0.090
lowvar	pre	oldvoice	oldword	19	0.752	0.760	0.150	0.034	0.072
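The talker-balancing step used for the table above can be illustrated on a toy example. This is only a sketch with made-up numbers (the `toy` data frame and its values are hypothetical): when more participants contribute data for one talker than the other, the pooled mean is pulled towards that talker, whereas averaging the two talker-specific means weights the talkers equally.

```r
# Hypothetical participant means for one design cell:
# 2 participants heard female1, 6 heard female2.
toy <- data.frame(
  talker   = c(rep("female1", 2), rep("female2", 6)),
  accuracy = c(0.6, 0.6, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9)
)

# Pooled mean: dominated by the talker with more participants.
pooled <- mean(toy$accuracy)

# Balanced mean: average the talker-specific means with equal weight.
per_talker <- tapply(toy$accuracy, toy$talker, mean)
balanced   <- mean(per_talker)

c(pooled = pooled, balanced = balanced)
```

In the analysis itself only the `accuracy` column is replaced by the balanced value; the `sd`, `se` and `ci` columns still come from the summary over all participants.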

Plot for Figure 5

Plot of the difference scores (showing improvement from pre- to post- test)

discrim.adult.participantmeans.2 = reshape(discrim.adult.participantmeans, timevar = "session", idvar = c("participant", "condition", "word_oldnew", "voice_oldnew"), direction = "wide")

discrim.adult.participantmeans.2$accuracy.diff = discrim.adult.participantmeans.2$accuracy.post - discrim.adult.participantmeans.2$accuracy.pre

discrim.adult.groupmeans.diff <- summarySEwithin(discrim.adult.participantmeans.2, measurevar = "accuracy.diff", betweenvars = c("condition"), withinvars = c("voice_oldnew", "word_oldnew"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

# To correct the means for the imbalance in talkers, get each participant's mean for each talker and average the group means across the two talkers; error calculations are done over the whole participant set

discrim.adult.participantmeans.t12.2 = reshape(discrim.adult.participantmeans.t12, timevar = "session", idvar = c("participant", "condition", "word_oldnew", "voice_oldnew", "talker"), direction = "wide")

discrim.adult.participantmeans.t12.2$accuracy.diff = discrim.adult.participantmeans.t12.2$accuracy.post - discrim.adult.participantmeans.t12.2$accuracy.pre

discrim.adult.groupmeans.diff.t1 <- summarySEwithin(subset(discrim.adult.participantmeans.t12.2, talker == "female1"), measurevar = "accuracy.diff", betweenvars = c("condition"), withinvars = c("voice_oldnew", "word_oldnew"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

discrim.adult.groupmeans.diff.t2 <- summarySEwithin(subset(discrim.adult.participantmeans.t12.2, talker == "female2"), measurevar = "accuracy.diff", betweenvars = c("condition"), withinvars = c("voice_oldnew","word_oldnew"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

discrim.adult.groupmeans.diff$accuracy.diff =  (discrim.adult.groupmeans.diff.t1$accuracy.diff + discrim.adult.groupmeans.diff.t2$accuracy.diff)/2 

# Now plot

discrim.adult.groupmeans.diff$word_oldnew = relevel(discrim.adult.groupmeans.diff$word_oldnew, "oldword")
discrim.adult.groupmeans.diff$novelty = paste(as.character(discrim.adult.groupmeans.diff$word_oldnew), as.character(discrim.adult.groupmeans.diff$voice_oldnew))

discrim.adult.groupmeans.diff$novelty = factor(discrim.adult.groupmeans.diff$novelty)
discrim.adult.groupmeans.diff$novelty = factor(discrim.adult.groupmeans.diff$novelty, 
levels = c("oldword oldvoice", "oldword newvoice", "newword oldvoice", "newword newvoice"))

p = ggplot(discrim.adult.groupmeans.diff, aes(x = novelty, y = accuracy.diff, fill = condition))
p = p + geom_bar(stat="identity", position = "dodge", colour = "black", size = .3)
p = p + geom_errorbar(aes(ymin = accuracy.diff-ci, ymax=accuracy.diff+ci), width = .4, position = position_dodge(.9))
p = p + xlab("") + ylab("% increase in correct responses")
p = p + scale_x_discrete(labels=c("oldword oldvoice" = " trained words \n trained voice", 
                                  "oldword newvoice" = " trained words \n untrained voice",
                                  "newword oldvoice" = " untrained words \n trained voice",
                                  "newword newvoice" = " untrained words \n untrained voice"
                                  ), expand=c(0, 0.6))
p = p + scale_fill_manual(values = c("grey", "white"), name = "Condition", labels = c("high-variability", "low-variability"))
p = p + coord_cartesian(ylim = c(-.1, .4))

p
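The difference scores plotted above come from reshaping the per-participant means to wide format and subtracting pre-test from post-test accuracy. A minimal sketch of that step on a hypothetical two-participant data frame (`toy` and its values are invented for illustration):

```r
# Hypothetical long-format means: one row per participant and session.
toy <- data.frame(
  participant = rep(c("p1", "p2"), each = 2),
  session     = rep(c("pre", "post"), times = 2),
  accuracy    = c(0.60, 0.80, 0.70, 0.75)
)

# One row per participant, with columns accuracy.pre and accuracy.post.
toy.wide <- reshape(toy, timevar = "session", idvar = "participant",
                    direction = "wide")

# Improvement from pre- to post-test.
toy.wide$accuracy.diff <- toy.wide$accuracy.post - toy.wide$accuracy.pre
toy.wide
```

In the real data the `idvar` vector also lists condition, word-novelty and voice-novelty, so that one difference score is obtained per participant per design cell.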

Children

Table of Means

discrim.child.participantmeans = aggregate(accuracy ~ participant + session + condition + word_oldnew + voice_oldnew, na.rm = T, data = discrim.child, FUN = mean)
discrim.child.groupmeans <- summarySEwithin(discrim.child.participantmeans, measurevar = "accuracy", betweenvars = c("condition"), withinvars = c("session", "voice_oldnew", "word_oldnew"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

# To correct the means for the imbalance in talkers, get each participant's mean for each talker and average the group means across the two talkers; error calculations are done over the whole participant set

discrim.child.participantmeans.t12 = aggregate(accuracy ~ participant + session + condition + talker + word_oldnew + voice_oldnew, na.rm = T, data = discrim.child, FUN = mean)

discrim.child.groupmeans.talker1 <- summarySEwithin(subset(discrim.child.participantmeans.t12, talker == "female1"), measurevar = "accuracy", betweenvars = c("condition"), withinvars = c("session", "voice_oldnew", "word_oldnew"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

discrim.child.groupmeans.talker2 <- summarySEwithin(subset(discrim.child.participantmeans.t12, talker == "female2"), measurevar = "accuracy", betweenvars = c("condition"), withinvars = c("session", "voice_oldnew", "word_oldnew"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

discrim.child.groupmeans$accuracy = (discrim.child.groupmeans.talker1$accuracy + discrim.child.groupmeans.talker2$accuracy)/2

kable(discrim.child.groupmeans, digits = 3)

condition	session	voice_oldnew	word_oldnew	N	accuracy	accuracy_norm	sd	se	ci
highvar	post	newvoice	newword	28	0.746	0.724	0.163	0.031	0.063
highvar	post	newvoice	oldword	28	0.754	0.731	0.148	0.028	0.057
highvar	post	oldvoice	newword	28	0.714	0.692	0.148	0.028	0.057
highvar	post	oldvoice	oldword	28	0.754	0.731	0.148	0.028	0.057
highvar	pre	newvoice	newword	28	0.664	0.642	0.192	0.036	0.075
highvar	pre	newvoice	oldword	28	0.629	0.606	0.196	0.037	0.076
highvar	pre	oldvoice	newword	28	0.686	0.663	0.154	0.029	0.060
highvar	pre	oldvoice	oldword	28	0.668	0.646	0.131	0.025	0.051
lowvar	post	newvoice	newword	24	0.770	0.801	0.167	0.034	0.070
lowvar	post	newvoice	oldword	24	0.748	0.780	0.218	0.045	0.092
lowvar	post	oldvoice	newword	24	0.720	0.738	0.166	0.034	0.070
lowvar	post	oldvoice	oldword	24	0.744	0.759	0.180	0.037	0.076
lowvar	pre	newvoice	newword	24	0.584	0.613	0.206	0.042	0.087
lowvar	pre	newvoice	oldword	24	0.526	0.555	0.153	0.031	0.064
lowvar	pre	oldvoice	newword	24	0.588	0.609	0.176	0.036	0.074
lowvar	pre	oldvoice	oldword	24	0.567	0.580	0.178	0.036	0.075

Plot for Figure 5

Plot of the difference scores (showing improvement from pre- to post- test)

discrim.child.participantmeans.2 = reshape(discrim.child.participantmeans, timevar = "session", idvar = c("participant", "condition", "word_oldnew", "voice_oldnew"), direction = "wide")

discrim.child.participantmeans.2$accuracy.diff = discrim.child.participantmeans.2$accuracy.post - discrim.child.participantmeans.2$accuracy.pre

discrim.child.groupmeans.diff <- summarySEwithin(discrim.child.participantmeans.2, measurevar = "accuracy.diff", betweenvars = c("condition"), withinvars = c("voice_oldnew", "word_oldnew"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

# To correct the means for the imbalance in talkers, get each participant's mean for each talker and average the group means across the two talkers; error calculations are done over the whole participant set

discrim.child.participantmeans.t12.2 = reshape(discrim.child.participantmeans.t12, timevar = "session", idvar = c("participant", "condition", "word_oldnew", "voice_oldnew","talker"), direction = "wide")

discrim.child.participantmeans.t12.2$accuracy.diff = discrim.child.participantmeans.t12.2$accuracy.post - discrim.child.participantmeans.t12.2$accuracy.pre

discrim.child.groupmeans.diff.t1 <- summarySEwithin(subset(discrim.child.participantmeans.t12.2, talker =="female1"), measurevar = "accuracy.diff", betweenvars = c("condition"), withinvars = c("voice_oldnew", "word_oldnew"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

discrim.child.groupmeans.diff.t2 <- summarySEwithin(subset(discrim.child.participantmeans.t12.2, talker =="female2"), measurevar = "accuracy.diff", betweenvars = c("condition"), withinvars = c("voice_oldnew", "word_oldnew"), idvar = "participant", na.rm = FALSE, conf.interval = .95)

discrim.child.groupmeans.diff$accuracy.diff =  (discrim.child.groupmeans.diff.t1$accuracy.diff + discrim.child.groupmeans.diff.t2$accuracy.diff)/2 

# Now plot

discrim.child.groupmeans.diff$word_oldnew = relevel(discrim.child.groupmeans.diff$word_oldnew, "oldword")
discrim.child.groupmeans.diff$novelty = paste(as.character(discrim.child.groupmeans.diff$word_oldnew), as.character(discrim.child.groupmeans.diff$voice_oldnew))

discrim.child.groupmeans.diff$novelty = factor(discrim.child.groupmeans.diff$novelty)
discrim.child.groupmeans.diff$novelty = factor(discrim.child.groupmeans.diff$novelty, levels = c("oldword oldvoice", "oldword newvoice", "newword oldvoice", "newword newvoice"))

p = ggplot(discrim.child.groupmeans.diff, aes(x = novelty, y = accuracy.diff, fill = condition))
p = p + geom_bar(stat="identity", position = "dodge", colour = "black", size = .3)
p = p + geom_errorbar(aes(ymin = accuracy.diff-ci, ymax = accuracy.diff+ci), width = .4, position = position_dodge(.9))
p = p + xlab("") + ylab("% increase in correct responses")
p = p + scale_x_discrete(labels=c("oldword oldvoice" = " trained words \n trained voice", 
                                  "oldword newvoice" = " trained words \n untrained voice",
                                  "newword oldvoice" = " untrained words \n trained voice",
                                  "newword newvoice" = " untrained words \n untrained voice"
                                  ), expand = c(0, 0.6))
p = p + scale_fill_manual(values = c("grey", "white"), name = "Condition", labels = c("high-variability", "low-variability"))
p = p + coord_cartesian(ylim = c(-.1, .4))

p

Statistical Analyses

Adults

discrim.adult = selectCenter(discrim.adult, list("session", "word_oldnew", "voice_oldnew", "talker", "condition", "agegroup", "mean_replay_in_training"))

adult.lmer <- glmer(accuracy ~ 1 
                    + (condition.ct * session.ct * word_oldnew.ct * voice_oldnew.ct * talker.ct)
                    + mean_replay_in_training.ct 
                    + (word_oldnew.ct * voice_oldnew.ct * session.ct|participant), 
                    data = discrim.adult, 
                    control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e5)), family = binomial)

get_coeffs(adult.lmer, c("(Intercept)", "condition.ct", "session.ct", "word_oldnew.ct", "condition.ct:session.ct", "session.ct:word_oldnew.ct", "session.ct:voice_oldnew.ct", "condition.ct:session.ct:word_oldnew.ct", "condition.ct:session.ct:voice_oldnew.ct", "condition.ct:word_oldnew.ct:voice_oldnew.ct", "session.ct:word_oldnew.ct:voice_oldnew.ct", "condition.ct:session.ct:word_oldnew.ct:voice_oldnew.ct"))

	Estimate	Std. Error	z value	Pr(>|z|)
(Intercept)	1.661	0.099	16.854	0.000
condition.ct	-0.050	0.193	-0.260	0.795
session.ct	-0.520	0.151	-3.440	0.001
word_oldnew.ct	-0.148	0.120	-1.234	0.217
condition.ct:session.ct	0.065	0.291	0.223	0.824
session.ct:word_oldnew.ct	0.071	0.248	0.287	0.774
session.ct:voice_oldnew.ct	0.189	0.249	0.761	0.446
condition.ct:session.ct:word_oldnew.ct	0.229	0.462	0.496	0.620
condition.ct:session.ct:voice_oldnew.ct	-0.910	0.457	-1.993	0.046
condition.ct:word_oldnew.ct:voice_oldnew.ct	-0.005	0.452	-0.012	0.991
session.ct:word_oldnew.ct:voice_oldnew.ct	-0.824	0.495	-1.666	0.096
condition.ct:session.ct:word_oldnew.ct:voice_oldnew.ct	1.938	0.918	2.111	0.035

There is a reliable effect of test-session but not the predicted test-session by condition interaction. There is a test-session by condition by voice-novelty interaction, but this is qualified by a four-way test-session by condition by voice-novelty by word-novelty interaction.

We break down the four-way interaction by re-running the model so that we have a slope for the three-way test-session by condition by voice-novelty interaction for each of trained and untrained words.
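The idea behind this re-parameterization can be sketched on simulated data. The example below is an illustration only: it uses `lm` rather than `glmer`, and the variables `x`, `g` and `g.ct` are hypothetical stand-ins for a centered predictor, a two-level factor, and that factor's centered version. Replacing the `x` main effect and the `x:g.ct` interaction with `x:g` (using the uncentered factor) gives one slope per level of `g` while leaving the fit unchanged.

```r
set.seed(1)
n    <- 200
g    <- factor(rep(c("a", "b"), each = n / 2))   # uncentered factor
g.ct <- ifelse(g == "a", -0.5, 0.5)               # centered numeric coding
x    <- rnorm(n)
y    <- 1 + 0.2 * x + 0.5 * g.ct + 0.8 * x * g.ct + rnorm(n)
d    <- data.frame(y, x, g, g.ct)

# Original parameterization: main effects plus interaction.
m.full  <- lm(y ~ x * g.ct, data = d)

# Re-parameterized: swap x and x:g.ct for a separate slope of x
# at each level of g.
m.split <- lm(y ~ x:g + x * g.ct - x:g.ct - x, data = d)

# Same number of parameters and the same fit.
all.equal(as.numeric(logLik(m.full)), as.numeric(logLik(m.split)))

# Simple slopes of x within each level of g.
coef(m.split)[c("x:ga", "x:gb")]
```

The simple slope for level `a` equals the main effect of `x` minus half the interaction from `m.full`, which is why a comparison of the two parameterizations with `anova()` is expected to show identical log-likelihoods.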

adult.lmer.v2 <- glmer(accuracy ~ 1 
                       + (condition.ct: session.ct:voice_oldnew.ct:word_oldnew)
                       + (condition.ct  * session.ct * word_oldnew.ct * voice_oldnew.ct * talker.ct)
                       - (condition.ct: session.ct: word_oldnew.ct: voice_oldnew.ct)
                       - (condition.ct: session.ct: voice_oldnew.ct)
                       + mean_replay_in_training.ct 
                       + (word_oldnew.ct * voice_oldnew.ct * session.ct|participant),
                       data = discrim.adult, 
                       control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=1e5)), family = binomial)
anova(adult.lmer.v2, adult.lmer)

## Data: discrim.adult
## Models:
## adult.lmer.v2: accuracy ~ 1 + (condition.ct:session.ct:voice_oldnew.ct:word_oldnew) + 
## adult.lmer.v2:     (condition.ct * session.ct * word_oldnew.ct * voice_oldnew.ct * 
## adult.lmer.v2:         talker.ct) - (condition.ct:session.ct:word_oldnew.ct:voice_oldnew.ct) - 
## adult.lmer.v2:     (condition.ct:session.ct:voice_oldnew.ct) + mean_replay_in_training.ct + 
## adult.lmer.v2:     (word_oldnew.ct * voice_oldnew.ct * session.ct | participant)
## adult.lmer: accuracy ~ 1 + (condition.ct * session.ct * word_oldnew.ct * 
## adult.lmer:     voice_oldnew.ct * talker.ct) + mean_replay_in_training.ct + 
## adult.lmer:     (word_oldnew.ct * voice_oldnew.ct * session.ct | participant)
##               Df    AIC    BIC  logLik deviance Chisq Chi Df Pr(>Chisq)
## adult.lmer.v2 69 3127.5 3548.1 -1494.7   2989.5                        
## adult.lmer    69 3127.5 3548.1 -1494.7   2989.5     0      0          1

get_coeffs(adult.lmer.v2, c("condition.ct:session.ct:voice_oldnew.ct:word_oldnewnewword", "condition.ct:session.ct:voice_oldnew.ct:word_oldnewoldword"))

	Estimate	Std. Error	z value	Pr(>|z|)
condition.ct:session.ct:voice_oldnew.ct:word_oldnewnewword	-1.879	0.674	-2.789	0.005
condition.ct:session.ct:voice_oldnew.ct:word_oldnewoldword	0.059	0.620	0.095	0.925

The condition by session by voice-novelty interaction is reliable only for untrained words. We further break down the condition by session by voice-novelty interaction for untrained words.

adult.lmer.v3 <- glmer(accuracy ~ 1 + 
                         + (condition.ct: session.ct: word_oldnew: voice_oldnew)
                       + (condition.ct  * session.ct * word_oldnew.ct * voice_oldnew.ct * talker.ct)
                       - (condition.ct: session.ct: word_oldnew.ct: voice_oldnew.ct)
                       - (condition.ct: session.ct: voice_oldnew.ct)
                       - (condition.ct: session.ct: word_oldnew.ct)
                       - (condition.ct: session.ct)
                       + mean_replay_in_training.ct 
                       + (word_oldnew.ct * voice_oldnew.ct * session.ct|participant),
                       data = discrim.adult, 
                       control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=1e5)), family = binomial)
anova(adult.lmer.v3, adult.lmer)

## Data: discrim.adult
## Models:
## adult.lmer.v3: accuracy ~ 1 + +(condition.ct:session.ct:word_oldnew:voice_oldnew) + 
## adult.lmer.v3:     (condition.ct * session.ct * word_oldnew.ct * voice_oldnew.ct * 
## adult.lmer.v3:         talker.ct) - (condition.ct:session.ct:word_oldnew.ct:voice_oldnew.ct) - 
## adult.lmer.v3:     (condition.ct:session.ct:voice_oldnew.ct) - (condition.ct:session.ct:word_oldnew.ct) - 
## adult.lmer.v3:     (condition.ct:session.ct) + mean_replay_in_training.ct + 
## adult.lmer.v3:     (word_oldnew.ct * voice_oldnew.ct * session.ct | participant)
## adult.lmer: accuracy ~ 1 + (condition.ct * session.ct * word_oldnew.ct * 
## adult.lmer:     voice_oldnew.ct * talker.ct) + mean_replay_in_training.ct + 
## adult.lmer:     (word_oldnew.ct * voice_oldnew.ct * session.ct | participant)
##               Df    AIC    BIC  logLik deviance Chisq Chi Df Pr(>Chisq)
## adult.lmer.v3 69 3127.5 3548.1 -1494.7   2989.5                        
## adult.lmer    69 3127.5 3548.1 -1494.7   2989.5     0      0          1

get_coeffs(adult.lmer.v3, c("condition.ct:session.ct:word_oldnewnewword:voice_oldnewnewvoice", "condition.ct:session.ct:word_oldnewnewword:voice_oldnewoldvoice"))

	Estimate	Std. Error	z value	Pr(>|z|)
condition.ct:session.ct:word_oldnewnewword:voice_oldnewnewvoice	0.890	0.504	1.764	0.078
condition.ct:session.ct:word_oldnewnewword:voice_oldnewoldvoice	-0.989	0.517	-1.913	0.056

There is a marginal condition by session interaction for the untrained voice which went in the predicted direction, but also a marginal interaction in the opposite direction for the trained voice. In other words, the interaction rests both on a trend towards a greater benefit of high-variability input compared with low-variability input for untrained words-untrained voice items (which is predicted since novelty should aid generalization) and a trend towards a greater benefit of low-variability for untrained words-trained voice items (which is not predicted).

Check interactions with talker

get_coeffs(adult.lmer, c("talker.ct", "session.ct:talker.ct", "condition.ct:session.ct:word_oldnew.ct:talker.ct", "condition.ct:session.ct:word_oldnew.ct:voice_oldnew.ct:talker.ct"))

	Estimate	Std. Error	z value	Pr(>|z|)
talker.ct	1.678	0.133	12.661	0.000
session.ct:talker.ct	-0.590	0.234	-2.528	0.011
condition.ct:session.ct:word_oldnew.ct:talker.ct	0.367	0.919	0.399	0.690
condition.ct:session.ct:word_oldnew.ct:voice_oldnew.ct:talker.ct	1.886	1.838	1.026	0.305

There is a reliable main effect of talker, reflecting female 2 being overall more intelligible (female 1: 0.68, female 2: 0.9).

The main effect of test-session is qualified by an interaction with talker. We look to see if the effect of test-session holds for each talker:

adult.lmer.v4 <- glmer(accuracy ~ 1
                      + session.ct : talker
                      + (condition.ct  * session.ct * word_oldnew.ct * voice_oldnew.ct * talker.ct)
                      - session.ct : talker.ct
                      - session.ct
                      + mean_replay_in_training.ct 
                      + (word_oldnew.ct * voice_oldnew.ct * session.ct|participant),
                      data = discrim.adult, 
                      control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=1e5)), family=binomial)
anova(adult.lmer.v4, adult.lmer)

## Data: discrim.adult
## Models:
## adult.lmer.v4: accuracy ~ 1 + session.ct:talker + (condition.ct * session.ct * 
## adult.lmer.v4:     word_oldnew.ct * voice_oldnew.ct * talker.ct) - session.ct:talker.ct - 
## adult.lmer.v4:     session.ct + mean_replay_in_training.ct + (word_oldnew.ct * 
## adult.lmer.v4:     voice_oldnew.ct * session.ct | participant)
## adult.lmer: accuracy ~ 1 + (condition.ct * session.ct * word_oldnew.ct * 
## adult.lmer:     voice_oldnew.ct * talker.ct) + mean_replay_in_training.ct + 
## adult.lmer:     (word_oldnew.ct * voice_oldnew.ct * session.ct | participant)
##               Df    AIC    BIC  logLik deviance Chisq Chi Df Pr(>Chisq)
## adult.lmer.v4 69 3127.5 3548.1 -1494.7   2989.5                        
## adult.lmer    69 3127.5 3548.1 -1494.7   2989.5     0      0          1

get_coeffs(adult.lmer.v4, c("session.ct:talkerfemale1", "session.ct:talkerfemale2"))

	Estimate	Std. Error	z value	Pr(>|z|)
session.ct:talkerfemale1	-0.225	0.149	-1.514	0.13
session.ct:talkerfemale2	-0.815	0.226	-3.613	0.00

The effect of session is only reliable for the more intelligible talker (female 2), although means are in the same direction in each case.

female 1: pre = 0.66, post = 0.7; female 2: pre = 0.87, post = 0.93

Children

Note that a model containing correlations between slopes didn’t converge, so these are removed in the model below (i.e. using the “(x||participant)” syntax).

discrim.child = selectCenter(discrim.child, list("session", "word_oldnew", "voice_oldnew", "talker", "condition", "agegroup", "mean_replay_in_training"))

child.lmer <- glmer(accuracy ~ 1 
                    + (condition.ct  * session.ct * word_oldnew.ct * voice_oldnew.ct * talker.ct)
                    + mean_replay_in_training.ct 
                    + (word_oldnew.ct * voice_oldnew.ct * session.ct||participant),
                    data = discrim.child, 
                    control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun=1e5)), family = binomial)

get_coeffs(child.lmer, c("(Intercept)", "condition.ct", "session.ct", "word_oldnew.ct", "condition.ct:session.ct", "session.ct:word_oldnew.ct",
                         "session.ct:voice_oldnew.ct", "condition.ct:session.ct:word_oldnew.ct", "condition.ct:session.ct:voice_oldnew.ct", "condition.ct:word_oldnew.ct:voice_oldnew.ct", "session.ct:word_oldnew.ct:voice_oldnew.ct", "condition.ct:session.ct:word_oldnew.ct:voice_oldnew.ct"))

	Estimate	Std. Error	z value	Pr(>|z|)
(Intercept)	0.870	0.080	10.941	0.000
condition.ct	-0.249	0.160	-1.552	0.121
session.ct	-0.671	0.099	-6.774	0.000
word_oldnew.ct	-0.027	0.072	-0.378	0.705
condition.ct:session.ct	-0.490	0.198	-2.478	0.013
session.ct:word_oldnew.ct	-0.250	0.145	-1.730	0.084
session.ct:voice_oldnew.ct	0.216	0.154	1.405	0.160
condition.ct:session.ct:word_oldnew.ct	0.073	0.290	0.251	0.802
condition.ct:session.ct:voice_oldnew.ct	-0.020	0.308	-0.064	0.949
condition.ct:word_oldnew.ct:voice_oldnew.ct	0.085	0.290	0.294	0.769
session.ct:word_oldnew.ct:voice_oldnew.ct	-0.134	0.300	-0.445	0.656
condition.ct:session.ct:word_oldnew.ct:voice_oldnew.ct	-0.042	0.602	-0.070	0.944

There was no reliable main effect of word-novelty. There was a reliable effect of test-session, indicating an effect of training (pre-test 0.62, post-test 0.74). There was a marginal interaction between word-novelty and test-session (untrained words 0.63 to 0.74, trained words 0.6 to 0.75). However, there is a reliable interaction between test-session and condition. This is the reverse of the predicted direction: greater improvement in the low-variability condition (0.18) than in the high-variability condition (0.08).

Breaking this down:

discrim.child = selectCenter(discrim.child, list("session", "word_oldnew", "voice_oldnew", "talker", "condition", "agegroup", "mean_replay_in_training"))

child.lmer.v2 <- glmer(accuracy ~ 1
                       + condition.ct : session
                       + (condition.ct * session.ct * word_oldnew.ct * voice_oldnew.ct * talker.ct)
                       - condition.ct : session.ct
                       - condition.ct
                       + mean_replay_in_training.ct 
                       + (word_oldnew.ct * voice_oldnew.ct * session.ct||participant),
                       data = discrim.child, 
                       control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e5)), family=binomial)
anova(child.lmer, child.lmer.v2 )

## Data: discrim.child
## Models:
## child.lmer: accuracy ~ 1 + (condition.ct * session.ct * word_oldnew.ct * 
## child.lmer:     voice_oldnew.ct * talker.ct) + mean_replay_in_training.ct + 
## child.lmer:     (word_oldnew.ct * voice_oldnew.ct * session.ct || participant)
## child.lmer.v2: accuracy ~ 1 + condition.ct:session + (condition.ct * session.ct * 
## child.lmer.v2:     word_oldnew.ct * voice_oldnew.ct * talker.ct) - condition.ct:session.ct - 
## child.lmer.v2:     condition.ct + mean_replay_in_training.ct + (word_oldnew.ct * 
## child.lmer.v2:     voice_oldnew.ct * session.ct || participant)
##               Df    AIC    BIC  logLik deviance Chisq Chi Df Pr(>Chisq)
## child.lmer    41 4920.5 5180.1 -2419.2   4838.5                        
## child.lmer.v2 41 4920.5 5180.1 -2419.2   4838.5     0      0  < 2.2e-16
##                  
## child.lmer       
## child.lmer.v2 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

get_coeffs(child.lmer.v2, c("condition.ct:sessionpost", "condition.ct:sessionpre"))

	Estimate	Std. Error	z value	Pr(>|z|)
condition.ct:sessionpost	-0.003	0.192	-0.018	0.986
condition.ct:sessionpre	-0.494	0.184	-2.683	0.007

There is a reliable difference between the conditions at pre-test (low-variability: 0.56, high-variability: 0.66), but although children in the low-variability condition have caught up by post-test, they have not overtaken the high-variability condition (low-variability: 0.74, high-variability: 0.74).

Check interactions with talker

get_coeffs(child.lmer.v2, c("talker.ct", "condition.ct:session.ct:talker.ct", "session.ct:word_oldnew.ct:talker.ct"))

	Estimate	Std. Error	z value	Pr(>|z|)
talker.ct	0.801	0.074	10.820	0.000
condition.ct:session.ct:talker.ct	-0.358	0.308	-1.160	0.246
session.ct:word_oldnew.ct:talker.ct	-0.167	0.300	-0.556	0.578

There is a reliable main effect of talker, reflecting female 2 being overall more intelligible (female 1: 0.6, female 2: 0.76).

The interactions between word-type and session, and condition and session do not interact with talker. However, since the benefit of low-variability is unexpected, particularly for untrained voices, we consider whether the pattern is ordinal across counterbalances. Looking at the difference scores by talker:

discrim.child.talker = aggregate(accuracy ~  session + condition + talker + voice_oldnew, na.rm = T, data = discrim.child, FUN=mean)

discrim.child.talker.2 = reshape(discrim.child.talker, timevar = "session", idvar = c("condition", "talker","voice_oldnew"), direction = "wide")

discrim.child.talker.2$accuracy.diff = discrim.child.talker.2$accuracy.post - discrim.child.talker.2$accuracy.pre
kable(discrim.child.talker.2, digits = 3)

	condition	talker	voice_oldnew	accuracy.post	accuracy.pre	accuracy.diff
1	highvar	female1	newvoice	0.686	0.568	0.118
3	lowvar	female1	newvoice	0.691	0.509	0.182
5	highvar	female2	newvoice	0.814	0.725	0.089
7	lowvar	female2	newvoice	0.827	0.600	0.227
9	highvar	female1	oldvoice	0.664	0.604	0.061
11	lowvar	female1	oldvoice	0.623	0.473	0.150
13	highvar	female2	oldvoice	0.804	0.750	0.054
15	lowvar	female2	oldvoice	0.841	0.682	0.159

This suggests that for both counterbalances, for both trained and untrained voices, there is more improvement in the low variability condition.

Age Group Comparison

discrim = selectCenter(discrim, list("session", "word_oldnew", "voice_oldnew", "talker", "condition", "agegroup", "mean_replay_in_training"))

age.lmer <- glmer(accuracy ~ 1 
            + (condition.ct * session.ct * word_oldnew.ct * voice_oldnew.ct * agegroup.ct * talker.ct)
            + mean_replay_in_training.ct 
            + (word_oldnew.ct * voice_oldnew.ct * session.ct||participant),
            data = discrim, 
            control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e5)), family=binomial)

get_coeffs(age.lmer, c("(Intercept)","agegroup.ct", "session.ct:agegroup.ct", "condition.ct:session.ct:agegroup.ct",  "session.ct:word_oldnew.ct:agegroup.ct", "session.ct:voice_oldnew.ct:agegroup.ct", "condition.ct:session.ct:word_oldnew.ct:agegroup.ct", "condition.ct:session.ct:voice_oldnew.ct:agegroup.ct", "session.ct:word_oldnew.ct:voice_oldnew.ct:agegroup.ct", "condition.ct:session.ct:word_oldnew.ct:voice_oldnew.ct:agegroup.ct"))

	Estimate	Std. Error	z value	Pr(>|z|)
(Intercept)	1.203	0.060	19.890	0.000
agegroup.ct	-0.807	0.126	-6.427	0.000
session.ct:agegroup.ct	-0.172	0.170	-1.015	0.310
condition.ct:session.ct:agegroup.ct	-0.537	0.340	-1.580	0.114
session.ct:word_oldnew.ct:agegroup.ct	-0.316	0.261	-1.211	0.226
session.ct:voice_oldnew.ct:agegroup.ct	0.034	0.267	0.128	0.898
condition.ct:session.ct:word_oldnew.ct:agegroup.ct	-0.174	0.522	-0.333	0.739
condition.ct:session.ct:voice_oldnew.ct:agegroup.ct	0.876	0.534	1.642	0.101
session.ct:word_oldnew.ct:voice_oldnew.ct:agegroup.ct	0.506	0.521	0.969	0.332
condition.ct:session.ct:word_oldnew.ct:voice_oldnew.ct:agegroup.ct	-1.995	1.038	-1.922	0.055

There is a main effect of age-group (children 0.68, adults 0.79).

There is no reliable interaction between age-group and test-session, and no reliable higher-order interaction with any combination of condition, word-novelty or voice-novelty. There was, however, a near-reliable five-way interaction (reflecting different factors in the models above).

Check interactions with talker

get_coeffs(age.lmer, c("talker.ct", "agegroup.ct:talker.ct", "condition.ct:session.ct:word_oldnew.ct:voice_oldnew.ct:talker.ct"))

	Estimate	Std. Error	z value	Pr(>|z|)
talker.ct	1.171	0.068	17.283	0.000
agegroup.ct:talker.ct	-0.825	0.138	-5.974	0.000
condition.ct:session.ct:word_oldnew.ct:voice_oldnew.ct:talker.ct	1.103	0.999	1.104	0.269

There was an effect of talker. Moreover, the main effect of age group was qualified by an interaction with talker (the five-way interaction was not). Breaking down the talker by age group interaction:

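# Re-parameterize age.lmer: drop the age-group main effect and its interaction
# with centered talker, and instead fit a separate age-group slope for each
# level of the (uncentered) talker factor.
age.lmer.v2 <- glmer(accuracy ~ 1 
                     + agegroup.ct:talker
                     + (condition.ct * session.ct * word_oldnew.ct * voice_oldnew.ct * agegroup.ct * talker.ct)
                     - agegroup.ct:talker.ct
                     - agegroup.ct
                     + mean_replay_in_training.ct 
                     + (word_oldnew.ct * voice_oldnew.ct * session.ct||participant),
                     data = discrim, 
                     control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e5)), family = binomial)
# Check that the two parameterizations give the same fit (same Df and logLik).
anova(age.lmer, age.lmer.v2)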

## Data: discrim
## Models:
## age.lmer: accuracy ~ 1 + (condition.ct * session.ct * word_oldnew.ct * 
## age.lmer:     voice_oldnew.ct * agegroup.ct * talker.ct) + mean_replay_in_training.ct + 
## age.lmer:     (word_oldnew.ct * voice_oldnew.ct * session.ct || participant)
## age.lmer.v2: accuracy ~ 1 + agegroup.ct:talker + (condition.ct * session.ct * 
## age.lmer.v2:     word_oldnew.ct * voice_oldnew.ct * agegroup.ct * talker.ct) - 
## age.lmer.v2:     agegroup.ct:talker.ct - agegroup.ct + mean_replay_in_training.ct + 
## age.lmer.v2:     (word_oldnew.ct * voice_oldnew.ct * session.ct || participant)
##             Df    AIC  BIC  logLik deviance Chisq Chi Df Pr(>Chisq)
## age.lmer    73 7987.2 8492 -3920.6   7841.2                        
## age.lmer.v2 73 7987.2 8492 -3920.6   7841.2     0      0          1

get_coeffs(age.lmer.v2,c("(Intercept)","agegroup.ct:talkerfemale1", "agegroup.ct:talkerfemale2"))

	Estimate	Std. Error	z value	Pr(>|z|)
(Intercept)	1.203	0.060	19.890	0.000
agegroup.ct:talkerfemale1	-0.395	0.131	-3.017	0.003
agegroup.ct:talkerfemale2	-1.219	0.155	-7.877	0.000

The difference between age groups holds for each talker.
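The re-parameterization used for age.lmer.v2 can be illustrated on a small simulated data set. This is only a sketch under stated assumptions: the data frame `toy`, its effect sizes, and the use of `glm` without random effects are invented for illustration and are not the study's data or model. It shows that swapping the age-group main effect and its interaction with centered talker for per-talker age-group slopes leaves the fit unchanged while yielding one age-group coefficient per talker.

```r
set.seed(1)
n <- 400

# Hypothetical toy data: two talkers, two age groups, binary accuracy.
toy <- data.frame(
  talker   = factor(sample(c("female1", "female2"), n, replace = TRUE)),
  agegroup = sample(c(0, 1), n, replace = TRUE)
)

# Centered numeric versions of the predictors (simple mean-centering).
toy$agegroup.ct <- toy$agegroup - mean(toy$agegroup)
toy$talker.ct   <- as.numeric(toy$talker) - mean(as.numeric(toy$talker))

# Simulate accuracy with a different age-group slope for each talker.
eta <- 1.2 + ifelse(toy$talker == "female1", -0.4, -1.2) * toy$agegroup.ct
toy$accuracy <- rbinom(n, 1, plogis(eta))

# Standard parameterization: main effects plus interaction.
m1 <- glm(accuracy ~ agegroup.ct * talker.ct,
          data = toy, family = binomial)

# Simple-effects parameterization: age-group slope within each talker level.
m2 <- glm(accuracy ~ agegroup.ct:talker
          + agegroup.ct * talker.ct
          - agegroup.ct:talker.ct
          - agegroup.ct,
          data = toy, family = binomial)

# Same number of parameters and the same log-likelihood.
c(logLik_m1 = as.numeric(logLik(m1)), logLik_m2 = as.numeric(logLik(m2)))
coef(summary(m2))
```

Because both design matrices span the same column space, the two models are the same fit expressed differently, which is why the model comparison above reports a chi-square of 0 on 0 degrees of freedom.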

Giannakopoulou, Brown, Clayards & Wonnacott

Elizabeth Wonnacott

January, 2017
