We conducted three experiments examining children’s ability to understand sine wave speech; Experiment 1 also assessed children’s ability to interpret Mooney pictures. The three experiments examined baseline interpretation of sine wave speech, pop-out effects, and perceptual learning, respectively.
Children in this study had to match a human speech sentence to one of two robot (i.e., sine wave speech) sentences, or they had to match a clear picture to one of two Mooney pictures. The aim was to get a first assessment of children’s ability to process distorted stimuli using a top-down cue.
Participants were tested at the Edinburgh Zoo, at the Edinburgh DevLab, and in the local community. This was a low-n pilot study; the table below shows participants per age group.
| Age (years) | Mooney Pictures (n) | Sine-Wave Speech (n) |
|---|---|---|
| 2 | 5 | 5 |
| 3 | 5 | 5 |
| 4 | 9 | 9 |
| 5 | 3 | 4 |
| 6 | 3 | 3 |
| 9 | 1 | 1 |
Participants completed 3 practice trials with blurred images or noise-distorted speech, followed by 24 test trials with Mooney images or sine wave speech. On each trial, the child had to select either which of the robot’s pictures matched the teacher’s standard, or which of the robot’s sentences matched the teacher’s standard. To play the sentences, the experimenter tapped the Teacher. Triplets of sentences could be repeated, but individual sentences could not be repeated.
Left: 2AFC Mooney picture choice. Right: 2AFC sine wave speech choice
Comparing age groups across the two studies. Most participants (but not all) did both experiments. Error bars are 95% CIs
All age groups were above chance when interpreting Mooney pictures, although 2-year-olds had relatively low performance. 2- and 3-year-olds were not credibly different from chance when interpreting sine wave speech. Performance on the sine wave speech task improved more gradually with age.
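The Experiment 1 analysis code isn’t echoed in this report. As a minimal sketch of the kind of above-chance check described here, assuming a hypothetical long-format data frame `expt1` with one row per 2AFC trial and columns `age` and `correct` (these names are placeholders, not the actual analysis objects):

```r
# Illustrative sketch only -- not the original Experiment 1 analysis.
# expt1, age, and correct are hypothetical names for a long-format dataset
# with one row per 2AFC trial (chance = 0.5).
library(dplyr)

expt1 %>%
  group_by(age) %>%
  summarise(
    n_trials  = n(),
    n_correct = sum(correct),
    accuracy  = n_correct / n_trials,
    ci_lower  = binom.test(n_correct, n_trials, p = 0.5)$conf.int[1],
    ci_upper  = binom.test(n_correct, n_trials, p = 0.5)$conf.int[2]
  )
```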
Experiment 1 suggested that sine wave speech was harder to interpret than Mooney pictures, although the youngest children showed limited skill on both tasks. We had two concerns about this manipulation, however. First, the sine wave speech task places more of a demand on working memory than the picture task (as you are matching a heard sentence to a previously heard sample, rather than matching a visible picture to a visible sample). Second, 2AFC may be a conceptually difficult task for young children. Anecdotally, my RAs reported that these tasks were difficult for the younger children, especially the sine wave speech task.
Our subsequent two experiments only assessed sine wave speech. In Experiment 2, we moved to a method that we thought might be more naturalistic – picture selection.
Participants in Experiment 2 did a word recognition task. We thought that this method might better allow younger children to demonstrate their ability to process sine wave speech. Moreover, it also allowed us to assess children’s processing of sine wave sentences when heard naively, versus when heard after a clear speech prime. As a little extra bonus, we also assessed adults, and a second group of adults who were beginner learners of Italian (with stimuli translated into Italian). If second-language learners also show difficulty with sine wave speech, this could indicate that the problem for children lies in their degree of linguistic knowledge, rather than in connections between levels of representation.
Participants in this study were 3- and 4-year-olds learning English, adult English speakers learning Italian (with the stimuli translated into Italian), and adult English speakers. They saw three pictures on each trial and heard three sentences (e.g., point to the dog). The third sentence was always sine wave speech. On half of the trials the sine wave speech sentence mentioned the so-far untouched picture (i.e., it was a novel sentence; the different condition), and on half of the trials it mentioned one of the previously mentioned pictures (repeating either the first-heard sentence [far condition] or the second-heard sentence [near condition]). For half of the participants the sine wave speech sentences used the same voice as the clear sentences [match condition], and for the other half they used a different-gender voice [mismatch condition].
Unfortunately this design led to a large response bias in children, to touch the unmentioned picture on the third instruction.
Participants were tested in preschools around Perthshire, in the Edinburgh DevLab, and in the Edinburgh University community. Second-language speakers were recruited from an Introductory Italian class at Edinburgh University. The table below shows participants per age group.
expt2_summary <- expt2 %>%
dplyr::select(resp,subject_id,age_years,Group) %>%
dplyr::group_by(subject_id,age_years,Group) %>%
dplyr::summarise(responded = mean(resp,na.rm = T)) %>%
dplyr::group_by(age_years,Group) %>%
dplyr::select(responded,subject_id,age_years,Group) %>%
dplyr::summarise(n = length(responded))

| age_years | Adults: 1st Lang | Adults: 2nd Lang | Children |
|---|---|---|---|
| 2 | NA | NA | 4 |
| 3 | NA | NA | 37 |
| 4 | NA | NA | 37 |
| 5 | NA | NA | 2 |
| adult | 20 | 15 | NA |
Participants completed 1 practice trial with sine wave speech, followed by 24 test trials. On each trial, participants would see three pictures and hear three instructions; they had to tap the mentioned picture after each instruction. The Experimenter tapped the green button to produce each instruction.
Words and pictures were chosen such that, according to CDI norms, 75% of 2-year-olds know the word.
3-choice word recognition task
First we graph overall performance, then exclude trials where participants did not recognise one of the clear speech items. We collapse children into a single group, because there do not appear to be any interesting developmental differences.
You can see that accuracy is generally quite high for clear speech items, even in young children (suggesting the task is simple). But accuracy declines for distorted speech across all groups, especially for children and 2nd language speakers.
You can easily see the response bias – children are considerably more accurate on “different” trials (i.e., trials where the distorted sentence names the unheard word) than on Near/Far trials. But they are above chance overall, allowing for a large criterion effect, and they clearly interpret the repeated distorted speech as distinct from the unrepeated distorted speech (i.e., the ‘different’ condition).
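To put numbers on the bias, a quick summary like the sketch below complements the graphs; it uses the column names that appear in the plotting code (`resp`, `choose_unmentioned`, `distance`, `clarity`, `Group`) but is illustrative rather than the analysis actually run.

```r
# Sketch: quantify the response bias on children's distorted trials.
# Column names are taken from the plotting code in this report.
library(dplyr)

expt2 %>%
  filter(Group == "Children", clarity == "distorted") %>%
  group_by(distance) %>%
  summarise(
    accuracy             = mean(resp, na.rm = TRUE),
    p_choose_unmentioned = mean(choose_unmentioned, na.rm = TRUE)
  )
```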
ggplot(expt2, aes(x = distance, y = resp, color = clarity))+
facet_wrap(~speaker_match+Group,nrow=2) +
ylim(c(0,1))+
stat_summary_bin(fun.data = mean_cl_boot)+
theme_cowplot()+
ggtitle("Accuracy across groups and conditions")+
xlab("Condition (Unheard word, far prior word, near prior word)")+
ylab("Accuracy")+
labs(caption = "")
# Exclude trials where they got a clear word wrong
wrong_clear <- summaryBy(resp ~ subject_id + trial_id, data = subset(expt2, clarity != "distorted"))
wrong_clear <- subset(wrong_clear, resp.mean <1)
wrong_clear$subj_trial <- paste(wrong_clear$subject_id, wrong_clear$trial_id)
expt2$subj_trial <- paste(expt2$subject_id,expt2$trial_id)
expt2 <- expt2[!expt2$subj_trial %in% wrong_clear$subj_trial,]

To minimize the possibility that our analyses are contaminated by effects of lexical knowledge, we remove trials on which participants made a mistake on clear items (i.e., we only assess children’s behavior on words that we have good reason to believe they know). All analyses are now done on this restricted dataset.
ggplot(expt2, aes(x = distance, y = resp, color = clarity))+
facet_wrap(~speaker_match+Group,nrow=2) +
ylim(c(0,1))+
stat_summary_bin(fun.data = mean_cl_boot)+
theme_cowplot()+
ggtitle("Accuracy across groups and conditions")+
xlab("Condition (Unheard word, far prior word, near prior word)")+
ylab("Accuracy")+
labs(caption = "")

The graph below illustrates the response bias in children: a tendency to choose the picture that had not been mentioned by the clear speech instructions.
ggplot(expt2, aes(x = distance, y = choose_unmentioned, color = clarity))+
facet_wrap(~speaker_match+Group,nrow=2) +
ylim(c(0,1))+
stat_summary_bin(fun.data = mean_cl_boot)+
theme_cowplot()+
ggtitle("Choice of unmentioned picture across groups and conditions")+
xlab("Condition (Unheard word, far prior word, near prior word)")+
ylab("Choose unmentioned picture")

Because 1st language adults were close to ceiling in both conditions, and children were close to ceiling on clear speech, models that compare clear and distorted conditions don’t converge, or produce implausible parameter estimates. So we instead simply analyze accuracy at interpreting distorted speech across groups.
Analyses use Bayesian logistic regression models; in the final column of each table I’ve put *s next to parameters whose credible intervals do not include 0.
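The model-fitting code isn’t echoed in this report. As a rough sketch, a group comparison on distorted trials presumably takes a form like the one below, assuming brms (suggested by the Stan-style summary columns); the random-effects structure and the exact Group level labels are assumptions.

```r
# Illustrative sketch -- not the fitted model reported below.
# Assumes brms; the random-effects terms and the Group labels are guesses.
library(brms)

compare_children_1stLang <- brm(
  resp ~ Group + (1 | subject_id) + (1 | trial_id),
  data   = subset(expt2, clarity == "distorted" & Group != "Adults: 2nd Lang"),
  family = bernoulli()
)
summary(compare_children_1stLang)
```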
z<-1
kable(convert_stan_to_dataframe(compare_children_1stLang),digits=2, caption = "Accuracy understanding distorted speech for children vs. Adults (1st lang)")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 3.05 | 0.31 | 2.46 | 3.67 | 1837.28 | 1 | * |
| GroupChildren | -2.35 | 0.33 | -3.01 | -1.73 | 1828.11 | 1 | * |
kable(convert_stan_to_dataframe(compare_children_2ndLang),digits=2, caption = "Accuracy understanding distorted speech for children vs. 2nd language learners")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 0.81 | 0.24 | 0.35 | 1.28 | 1814.45 | 1 | * |
| GroupChildren | -0.13 | 0.26 | -0.64 | 0.37 | 1726.51 | 1 | - |
kable(convert_stan_to_dataframe(compare_1st_2ndLang),digits=2, caption = "Accuracy understanding distorted speech for Adult 1st language vs. 2nd language learners")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 3.60 | 0.54 | 2.6 | 4.72 | 559.45 | 1 | * |
| GroupAdults:2ndLang | -2.69 | 0.73 | -4.2 | -1.36 | 477.27 | 1 | * |
#kable(convert_stan_to_dataframe(compare_children_1stLang_distorted),digits=2, caption = "Accuracy on distorted speech for Children vs Adults 1st language")

We also compared overall accuracy on distorted speech across groups depending on whether the distorted and undistorted voices matched. We see interactions between children and adults (both adult groups), such that children are actually less affected by speaker match – this might hint that children are really failing to use the top-down cue here. But when analyzed separately, neither 1st nor 2nd language adults show the speaker_match effect. That is to say, there is a difference between groups, but neither group’s effect is credibly different from zero. Harrumph…
z<-1
kable(convert_stan_to_dataframe(compare_children_1st_speaker_match),digits=2, caption = "Matching vs mismatching distorted speech for children vs. 1st language adults")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 4.09 | 0.49 | 3.19 | 5.13 | 1144.63 | 1 | * |
| speaker_matchSpeakerMismatch | -1.89 | 0.61 | -3.10 | -0.68 | 962.87 | 1 | * |
| GroupChildren | -3.56 | 0.51 | -4.62 | -2.65 | 1097.27 | 1 | * |
| speaker_matchSpeakerMismatch:GroupChildren | 2.17 | 0.65 | 0.89 | 3.52 | 863.60 | 1 | * |
kable(convert_stan_to_dataframe(compare_children_2nd_speaker_match),digits=2, caption = "Matching vs mismatching distorted speech for children vs. 2nd language adults")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 0.74 | 0.26 | 0.23 | 1.26 | 853.10 | 1.00 | * |
| speaker_matchSpeakerMismatch | -0.76 | 0.43 | -1.62 | 0.12 | 696.34 | 1.01 | - |
| GroupChildren | -0.22 | 0.30 | -0.80 | 0.36 | 811.49 | 1.00 | - |
| speaker_matchSpeakerMismatch:GroupChildren | 1.03 | 0.47 | 0.09 | 1.96 | 659.93 | 1.01 | * |
kable(convert_stan_to_dataframe(compare_1st_2nd_speaker_match),digits=2, caption = "Matching vs mismatching distorted speech for 1st language adults vs 2nd language")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 4.47 | 0.69 | 3.28 | 5.96 | 1000.97 | 1 | * |
| speaker_matchSpeakerMismatch | -1.90 | 0.86 | -3.62 | -0.23 | 808.60 | 1 | * |
| GroupAdults:2ndLang | -3.65 | 0.83 | -5.37 | -2.19 | 727.82 | 1 | * |
| speaker_matchSpeakerMismatch:GroupAdults:2ndLang | 1.04 | 1.19 | -1.35 | 3.44 | 696.99 | 1 | - |
z<-1
kable(convert_stan_to_dataframe(children_match),digits=2, caption = "Matching vs mismatching distorted speech for children")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 0.52 | 0.12 | 0.28 | 0.77 | 834.13 | 1 | * |
| speaker_matchSpeakerMismatch | 0.27 | 0.18 | -0.08 | 0.64 | 1338.77 | 1 | - |
kable(convert_stan_to_dataframe(adult_1st_match),digits=2, caption = "Matching vs mismatching distorted speech for 1st language adults")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 5.30 | 1.27 | 3.38 | 8.22 | 542.42 | 1 | * |
| speaker_matchSpeakerMismatch | -2.33 | 1.54 | -5.63 | 0.38 | 528.04 | 1 | - |
kable(convert_stan_to_dataframe(adult_2nd_match),digits=2, caption = "Matching vs mismatching distorted speech for 2nd language adults")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 0.78 | 0.38 | 0.05 | 1.57 | 552.12 | 1 | * |
| speaker_matchSpeakerMismatch | -0.80 | 0.65 | -2.10 | 0.41 | 492.97 | 1 | - |
Do different participant groups perform better on near-primed stimuli or far-primed stimuli?
z<- 2
kable(convert_stan_to_dataframe(child_distance),digits=2, caption = "Effect of prime distance for children")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | -1.12 | 0.31 | -1.76 | -0.51 | 379.43 | 1 | * |
| distancenear | 0.47 | 0.21 | 0.06 | 0.91 | 2000.00 | 1 | * |
kable(convert_stan_to_dataframe(first_lang_distance),digits=2, caption = "Effect of prime distance for 1st language adults")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 4.31 | 1.13 | 2.61 | 6.92 | 972.56 | 1 | * |
| distancenear | -0.08 | 0.56 | -1.15 | 1.01 | 4000.00 | 1 | - |
kable(convert_stan_to_dataframe(second_lang_distance),digits=2, caption = "Effect of prime distance for 2nd language adults")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 0.22 | 1.08 | -1.94 | 2.33 | 443.79 | 1.01 | - |
| distancenear | 1.90 | 0.96 | 0.24 | 4.12 | 768.30 | 1.00 | * |
Did participants’ accuracy increase over the study? Yes for 1st language adults, no for children. But it is unclear for 2nd language adults – probably we need a larger sample.
This graph illustrates learning over time. You can see that 1st language adults appear to improve a little, and perhaps 2nd language adults too. But not Children.
ggplot(subset(expt2, clarity == "distorted"), aes(x = trial_number, y = resp, color = Group))+
geom_smooth(method = "glm", method.args = list(family = "binomial")) +
ylim(c(0,1.05))+
ggtitle("Change in accuracy over the study")+
stat_summary_bin(fun.data = mean_cl_boot)+
ylab("Accuracy")+
xlab("Trial number")kable(convert_stan_to_dataframe(learning_overall),digits=2, caption = "Effect of learning over trials across groups")| Estimate | Est.Error | l.95..CI | u.95..CI | Eff.Sample | Rhat | Diff_from_zero | |
|---|---|---|---|---|---|---|---|
| Intercept | 3.22 | 0.33 | 2.60 | 3.89 | 1452.73 | 1 | * |
| scaletrial_number | 0.52 | 0.21 | 0.11 | 0.94 | 1553.70 | 1 | * |
| GroupAdults:2ndLang | -2.40 | 0.44 | -3.32 | -1.58 | 1154.03 | 1 | * |
| GroupChildren | -2.52 | 0.35 | -3.23 | -1.87 | 1372.48 | 1 | * |
| scaletrial_number:GroupAdults:2ndLang | -0.39 | 0.26 | -0.94 | 0.10 | 1681.42 | 1 | - |
| scaletrial_number:GroupChildren | -0.56 | 0.22 | -0.99 | -0.15 | 1710.10 | 1 | * |
kable(convert_stan_to_dataframe(learning_children),digits=2, caption = "Effect of learning over trials for children")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 0.68 | 0.09 | 0.51 | 0.86 | 676.07 | 1 | * |
| scaletrial_number | -0.04 | 0.06 | -0.15 | 0.08 | 2000.00 | 1 | - |
kable(convert_stan_to_dataframe(learning_first_lang),digits=2, caption = "Effect of learning over trials for first lang adults")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 4.38 | 0.89 | 2.91 | 6.41 | 693.77 | 1 | * |
| scaletrial_number | 0.63 | 0.25 | 0.16 | 1.12 | 2000.00 | 1 | * |
kable(convert_stan_to_dataframe(learning_second_lang),digits=2, caption = "Effect of learning over trials for second lang adults")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 0.89 | 0.41 | 0.09 | 1.68 | 653.44 | 1.01 | * |
| scaletrial_number | 0.14 | 0.24 | -0.35 | 0.63 | 1503.60 | 1.00 | - |
z<-1

This graph illustrates the (log) number of times that participants were given a repetition of the clear/distorted speech. Different colored lines indicate whether the stimuli were on familiar trials (distorted speech as a repetition) or not. Children were given more repetitions overall.
repetition <- ggplot(expt2, aes(x = trial_number, y = log(play_counter), color= novelty))+
geom_smooth(method = "loess")+
facet_wrap(~clarity+Group,ncol=3, nrow = 2) +
ylab("log Number of repetitions per trial")+
ggtitle("Repetitions per trial over the study")
repetition

While Experiment 2 provided evidence that younger children can indeed interpret sine wave speech, they were clearly error-prone in doing so. Unfortunately, this experimental design was heavily contaminated by response bias, which makes it hard to draw strong conclusions, particularly about whether children show top-down pop-out effects.
We replicated a finding by Nittrouer and Lowenstein (2013) that second-language learners have difficulty interpreting sine wave speech. This is particularly interesting, because it raises the possibility that children’s difficulties are not driven by immature top-down connections, but by incomplete linguistic knowledge.
Our next experiment had three goals: to remove the response bias, to test for pop-out effects, and to assess perceptual learning.
Participants in Experiment 3 also did a word recognition task, but here we a) removed the response bias, and b) added a perceptual learning component (following Davis et al. 2005). Participants saw two pictures on each trial and heard three sentences (e.g., point to the dog). They heard 24 trials altogether (plus one practice at the start). The first 16 trials were training trials, and the final 8 trials assessed learning and generalization. The order in which participants heard the first 16 sentences determined whether their learning was top-down guided, or not. For the top-down guidance participants, the third sentence was always sine wave speech, and it was a distorted repetition of one of the two previously heard sentences. For the unguided participants, the first sentence was always sine wave speech, and it was a distorted version of a sentence that they would eventually hear. This design thus allowed us to manipulate learning style (top-down or not) and also to test whether children show pop-out effects (if so, they should be more accurate during top-down training trials).
The final 8 trials were always presented in the order Distorted-Clear-Clear, in order to test learning and generalization.
Participants in this study were 2-, 3-, and 4-year-olds learning English.
Participants were tested in preschools around Edinburgh, in the Edinburgh DevLab, and in the Edinburgh University community. The table below shows participants per age group. We aimed to test 8 participants per cell, but kept testing available participants until all cells were filled. This led to the surplus of 4-year-olds, and to our testing some 5-year-olds.
expt3_summary <- children %>%
dplyr::select(resp,subject_id,age,test_lang) %>%
dplyr::group_by(subject_id,age,test_lang) %>%
dplyr::summarise(responded = mean(resp,na.rm = T)) %>%
dplyr::group_by(age,test_lang) %>%
dplyr::select(responded,subject_id,age,test_lang) %>%
dplyr::summarise(n = length(responded))

| age | Clear_After | Clear_Before |
|---|---|---|
| 2 | 9 | 8 |
| 3 | 8 | 11 |
| 4 | 14 | 14 |
| 5 | 2 | 1 |
Participants completed 1 practice trial with sine wave speech, followed by the 16 training trials and 8 test trials. On each trial, participants would see two pictures and hear three instructions; they had to tap the mentioned picture after each instruction. The Experimenter tapped the green button to produce each instruction.
2-choice word recognition task
Half the participants were assigned to a top-down condition: During the 16 training trials they heard two clear speech instructions followed by a distorted speech instruction (Clear-Clear-Distorted order). The other participants were assigned to a bottom-up condition: during their 16 training trials they heard a distorted speech instruction followed by two clear speech instructions (Distorted-Clear-Clear order).
After the 16 training trials, all participants took part in 8 test trials where distorted speech was presented before, and clear speech after (i.e., Distorted-Clear-Clear order).
Structure of Experiment 3
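For reference, the sketch below maps the condition labels used in the data (the test_lang variable, with levels Clear_Before and Clear_After) onto the instruction orders described above; it is purely illustrative and not the materials-generation code.

```r
# Sketch linking the test_lang condition labels to within-trial instruction order.
training_order <- list(
  Clear_Before = c("clear", "clear", "distorted"),  # top-down: clear speech heard first
  Clear_After  = c("distorted", "clear", "clear")   # bottom-up: distorted speech heard naively
)
test_order <- c("distorted", "clear", "clear")       # identical for all participants at test
```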
We do our analyses only on trials in which participants got the “clear” presentation of the words correct (because we don’t want to look at vocab knowledge, at least not for now). But I’ll first show the proportion of trials on which children were correct for the clear presentation. As you’ll see accuracy is high. Error bars are 95% CIs. We don’t graph 5-year-olds as a separate group, as there are only 3 of them, but they are included in analyses.
ggplot(subset(children, clarity !="distorted" & age != "5"), aes(x = block, y = acc, color = test_lang))+facet_wrap(~age,nrow=1) + ylim(c(0,1))+stat_summary_bin(fun.data = mean_cl_boot)+theme_cowplot()+ggtitle("Accuracy on clear speech")From now on, we remove all trials on which participants got the clear word incorrect. The first graph shows averaged data by block, and the second shows trial-by-trial data over the course of the study (test trials start on block 16).
# Exclude trials where they got a clear word wrong
wrong_clear <- summaryBy(acc ~ subject_id + trial_id, data = subset(children, clarity != "distorted"))
wrong_clear <- subset(wrong_clear, acc.mean <1)
wrong_clear$subj_trial <- paste(wrong_clear$subject_id, wrong_clear$trial_id)
children$subj_trial <- paste(children$subject_id,children$trial_id)
children <- children[!children$subj_trial %in% wrong_clear$subj_trial,]
age <- ggplot(subset(children, clarity=="distorted" & age != "5"), aes(x = block, y = acc, color = test_lang))+
facet_wrap(~age,nrow=1) + ylim(c(0,1))+
stat_summary_bin(fun.data = mean_cl_boot)+theme_cowplot()+ggtitle("Accuracy across ages, blocks and training types")
speaker_match <- ggplot(subset(children, clarity=="distorted" & age != "5"), aes(x = block, y = acc, color = test_lang))+
facet_wrap(~speaker_match+age,nrow=2) + ylim(c(0,1))+
stat_summary_bin(fun.data = mean_cl_boot)+theme_cowplot()+ggtitle("Effect of speaker match")
trial <- ggplot(subset(children, clarity == "distorted" & age != "5"), aes(x = trial_number, y = acc, color = test_lang))+
geom_smooth(method = "glm", method.args = list(family = "binomial"))+
facet_wrap(~clarity+age,nrow=1) +
ylim(c(0,1.05))+
ggtitle("Effect of learning")+
ylab("Accuracy")
distance <- ggplot(subset(children, clarity=="distorted" & age != "5" & test_lang == "Clear_Before" & block == "Training"), aes(x = block, y = acc, color = distance))+
facet_wrap(~age,nrow=1) + ylim(c(0,1))+
stat_summary_bin(fun.data = mean_cl_boot)+theme_cowplot()+ggtitle("Effect of distance on Clear Before participants")
repetition <- ggplot(subset(children, age != "5"), aes(x = trial_number, y = log(play_counter), color = test_lang))+
geom_smooth(method = "loess")+
facet_wrap(~clarity+age,nrow=2) +
ylab("log Number of repetitions per trial")+
ggtitle("Repetitions per trial over the study")
#plot_grid(age, speaker_match,trial,distance)

We’ll start with a full analysis of distorted trials, comparing Block (training/test), Training regime (clear before/after), age (scaled in weeks), and speaker match (match/mismatch). Unsurprisingly, nothing is significant, except that kids get better with age.
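As with Experiment 2, the model code isn’t echoed here. A sketch of what the full model presumably looks like is below, assuming brms; the random-effects term is an assumption, and block is treated as an ordered factor, which is what the block.L contrast in the table suggests.

```r
# Illustrative sketch of the full Experiment 3 model -- not the fitted object.
# Assumes brms; the random-effects structure is a guess.
library(brms)

major_analysis.expt3 <- brm(
  acc ~ speaker_match * test_lang * scale(age_weeks) * block + (1 | subject_id),
  data   = subset(children, clarity == "distorted"),
  family = bernoulli()
)
```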
z <- 1
kable(convert_stan_to_dataframe(major_analysis.expt3),digits=2, caption = "Full Analysis")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 0.79 | 0.15 | 0.52 | 1.09 | 756.65 | 1.00 | * |
| speaker_matchSpeakerMismatch | 0.07 | 0.13 | -0.20 | 0.32 | 989.62 | 1.00 | - |
| test_langClear_Before | 0.23 | 0.13 | -0.04 | 0.48 | 1178.35 | 1.00 | - |
| scaleage_weeks | 0.29 | 0.14 | 0.01 | 0.56 | 1251.62 | 1.00 | * |
| block.L | 0.12 | 0.10 | -0.07 | 0.33 | 2000.00 | 1.00 | - |
| speaker_matchSpeakerMismatch:test_langClear_Before | -0.07 | 0.13 | -0.32 | 0.18 | 985.51 | 1.00 | - |
| speaker_matchSpeakerMismatch:scaleage_weeks | 0.10 | 0.14 | -0.18 | 0.37 | 1150.47 | 1.00 | - |
| test_langClear_Before:scaleage_weeks | 0.08 | 0.14 | -0.20 | 0.34 | 933.29 | 1.00 | - |
| speaker_matchSpeakerMismatch:block.L | 0.06 | 0.09 | -0.12 | 0.24 | 2000.00 | 1.00 | - |
| test_langClear_Before:block.L | 0.06 | 0.10 | -0.14 | 0.25 | 2000.00 | 1.00 | - |
| scaleage_weeks:block.L | -0.02 | 0.10 | -0.22 | 0.18 | 2000.00 | 1.00 | - |
| speaker_matchSpeakerMismatch:test_langClear_Before:scaleage_weeks | 0.14 | 0.14 | -0.12 | 0.42 | 1029.46 | 1.01 | - |
| speaker_matchSpeakerMismatch:test_langClear_Before:block.L | 0.08 | 0.10 | -0.11 | 0.28 | 2000.00 | 1.00 | - |
| speaker_matchSpeakerMismatch:scaleage_weeks:block.L | -0.13 | 0.11 | -0.34 | 0.08 | 2000.00 | 1.00 | - |
| test_langClear_Before:scaleage_weeks:block.L | 0.09 | 0.10 | -0.11 | 0.29 | 2000.00 | 1.00 | - |
| speaker_matchSpeakerMismatch:test_langClear_Before:scaleage_weeks:block.L | -0.17 | 0.11 | -0.38 | 0.04 | 2000.00 | 1.00 | - |
age

There was no large pop-out effect. Children improve with age, but not by much.
z <- 1
kable(convert_stan_to_dataframe(major_analysis.expt3.train), digits=3, caption = "Are children more accurate when the clear sentence comes before? Not a lot more accurate (credible interval includes 0)")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 0.592 | 0.134 | 0.347 | 0.865 | 785.697 | 1.005 | * |
| test_langClear_Before | 0.154 | 0.129 | -0.092 | 0.412 | 670.967 | 0.999 | - |
| scaleage_weeks | 0.292 | 0.127 | 0.040 | 0.542 | 725.247 | 1.002 | * |
| test_langClear_Before:scaleage_weeks | 0.027 | 0.131 | -0.224 | 0.288 | 683.396 | 1.002 | - |
Did training that allowed pop-out subsequently allow children to do better at test? The effect is in the right direction, but is not significant.
z <- 1
kable(convert_stan_to_dataframe(major_analysis.expt3.test), digits=3, caption = "Are children more accurate at Test when they have been trained on the clear sentence coming before? Maybe a little...")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 0.878 | 0.194 | 0.505 | 1.281 | 1805.327 | 1.001 | * |
| test_langClear_Before | 0.256 | 0.178 | -0.083 | 0.618 | 2000.000 | 1.001 | - |
| scaleage_weeks | 0.278 | 0.215 | -0.150 | 0.714 | 2000.000 | 1.002 | - |
| test_langClear_Before:scaleage_weeks | 0.127 | 0.199 | -0.263 | 0.530 | 2000.000 | 1.001 | - |
We look at accuracy on clear speech (here we return to the dataset in which trials have not been excluded based on whether participants made a mistake identifying clear speech items).
kable(convert_stan_to_dataframe(expt3.clear), digits=3, caption = "Are children more accurate on clear speech as they age?")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 4.119 | 0.257 | 3.660 | 4.646 | 685.164 | 1.003 | * |
| scaleage_weeks | 0.505 | 0.210 | 0.095 | 0.916 | 941.491 | 1.006 | * |
kable(convert_stan_to_dataframe(expt3.distorted), digits=3, caption = "Are children more accurate on distorted speech as they age?")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 0.620 | 0.117 | 0.396 | 0.855 | 1092.121 | 1.002 | * |
| scaleage_weeks | 0.279 | 0.110 | 0.056 | 0.503 | 1109.589 | 1.000 | * |
kable(convert_stan_to_dataframe(expt3.distorted.excluded), digits=3, caption = "Are children more accurate on distorted speech as they age? (excluding incorrect questions)")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 0.625 | 0.114 | 0.398 | 0.858 | 1337.713 | 1.002 | * |
| scaleage_weeks | 0.256 | 0.109 | 0.046 | 0.478 | 1141.699 | 1.004 | * |
We look at children’s accuracy on distorted speech in each age group. 3- and 4-year-olds are above chance, but the CI for 2-year-olds includes 0, even in the top-down clear-speech-first condition.
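These are intercept-only models on the distorted trials for each age group; with two pictures per trial, chance is 50%, i.e. an intercept of 0 in log-odds, so “above chance” means the intercept’s credible interval excludes 0. A sketch (assuming brms; the random-effects term is an assumption):

```r
# Sketch of a per-age intercept-only model (illustrative; assumes brms).
library(brms)

age.2 <- brm(
  acc ~ 1 + (1 | subject_id),
  data   = subset(children, clarity == "distorted" & age == "2"),
  family = bernoulli()
)
# plogis() maps the intercept back to a proportion, e.g. plogis(0.25) is about 0.56.
```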
kable(convert_stan_to_dataframe(age.2), digits=3, caption = "Overall accuracy at Age 2")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 0.246 | 0.211 | -0.171 | 0.672 | 949.343 | 1.002 | - |
kable(convert_stan_to_dataframe(age.3), digits=3, caption = "Overall accuracy at Age 3")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 0.667 | 0.238 | 0.198 | 1.148 | 673.22 | 1.004 | * |
kable(convert_stan_to_dataframe(age.4), digits=3, caption = "Overall accuracy at Age 4")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 0.78 | 0.198 | 0.393 | 1.168 | 1162.333 | 1.002 | * |
kable(convert_stan_to_dataframe(age.2.TD), digits=3, caption = "Overall accuracy at Age 2 in TopDown condition")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 0.379 | 0.489 | -0.54 | 1.408 | 690.26 | 1.002 | - |
We look at how children’s accuracy improves over the experiment. There is not a significant improvement in accuracy, but gazing at the graphs, I would suggest that there is in fact a small learning effect which we aren’t able to measure with this design.
trial

kable(convert_stan_to_dataframe(expt3.learning ), digits=3, caption = "Learning over trials by age")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 0.638 | 0.124 | 0.396 | 0.887 | 1174.984 | 1.000 | * |
| scaletrial_number | 0.104 | 0.068 | -0.033 | 0.237 | 2000.000 | 1.000 | - |
| scaleage_weeks | 0.264 | 0.118 | 0.049 | 0.497 | 1288.243 | 1.000 | * |
| scaletrial_number:scaleage_weeks | -0.068 | 0.069 | -0.199 | 0.066 | 2000.000 | 0.999 | - |
z<-0

How often do children request a repetition during training trials (trials 2-17), across the training conditions?
Children in the Clear Before condition are less likely to request a repetition overall, and there is a marginal effect such that the rate at which they request repetitions declines faster over the training phase.
repetition

z<-0
kable(convert_stan_to_dataframe(expt3.rep_train ), digits=3, caption = "Repetitions during training phase by trial number and condition")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 0.753 | 0.029 | 0.697 | 0.809 | 982.474 | 1.001 | * |
| scaletrial_number | -0.101 | 0.013 | -0.126 | -0.074 | 2000.000 | 1.001 | * |
| test_langClear_Before | -0.101 | 0.026 | -0.156 | -0.051 | 1148.129 | 1.002 | * |
| scaletrial_number:test_langClear_Before | -0.025 | 0.014 | -0.053 | 0.002 | 2000.000 | 0.998 | - |
kable(convert_stan_to_dataframe(expt3.rep_train.before ), digits=3, caption = "Repetitions during training phase by trial number -- Clear Before")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 0.647 | 0.045 | 0.554 | 0.736 | 469.895 | 1.005 | * |
| scaletrial_number | -0.127 | 0.022 | -0.169 | -0.085 | 2000.000 | 0.999 | * |
kable(convert_stan_to_dataframe(expt3.rep_train.after ), digits=3, caption = "Repetitions during training phase by trial number -- Clear After")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 0.849 | 0.037 | 0.776 | 0.921 | 640.406 | 1.005 | * |
| scaletrial_number | -0.072 | 0.019 | -0.110 | -0.035 | 2000.000 | 0.999 | * |
Does training regime affect how often children request repetitions during the test phase (trials 18-25)? Recall that test phase is matched across participants.
Children who got Clear-Before training no longer request fewer repetitions when we move to the Test phase.
kable(convert_stan_to_dataframe(expt3.rep_test ), digits=3, caption = "Repetitions during test phase by trial number and condition")

| Parameter | Estimate | Est.Error | l-95% CI | u-95% CI | Eff.Sample | Rhat | Diff_from_zero |
|---|---|---|---|---|---|---|---|
| Intercept | 0.764 | 0.025 | 0.716 | 0.816 | 1593.011 | 1.000 | * |
| scaletrial_number | -0.004 | 0.018 | -0.039 | 0.033 | 2000.000 | 1.000 | - |
| test_langClear_Before | 0.008 | 0.024 | -0.039 | 0.056 | 1489.855 | 1.000 | - |
| scaletrial_number:test_langClear_Before | -0.003 | 0.019 | -0.041 | 0.034 | 2000.000 | 0.999 | - |
We don’t see much in the way of an effect of speaker match (same voice for clear/distorted stimuli).
speaker_match

We don’t see much in the way of an effect of prime distance (whether the to-be-repeated clear sentence in the Clear_Before condition comes immediately prior to the distorted sentence, or one sentence prior).
distance

While 2-year-olds had difficulty interpreting sine wave speech, three- and four-year-olds showed above-chance performance on our task. Nevertheless, their accuracy was still low. Looking at the graphs, children’s accuracy improved a little with practice, but there was no step-change: even 4-year-olds had difficulty with this task. In fact, their performance was numerically lower than in Experiment 1’s 2AFC task. Perhaps this was due to the semantically bleached stimuli we used in Expts 2 & 3 (where the carrier phrase Can you touch… did not provide information about the continuation).
Interestingly, we found little evidence that top-down information plays an important role in children’s accuracy at interpreting the instructions. First, we found little evidence for pop-out effects: during training, accuracy was not much higher in the Clear Before condition than in the Clear After condition. In addition, children who were given a top-down learning protocol were not better at test. Note, though, that both results were marginal: it seems more likely that there was a small effect we couldn’t measure than that there was no effect at all.
However, top-down information did appear to affect the number of repetitions that participants asked for (at least for ages 3 and 4). Under “pop-out” conditions, children were less likely to request repetitions. Unfortunately, we can’t tell whether this reflects improved understanding or improved confidence in the Clear_Before group. Because the Clear Before group shows no overall accuracy advantage, and because more repetitions are actually associated with worse performance (n.b. that analysis is not shown here), we suspect that top-down information is actually improving confidence rather than accuracy.