What do the overall adult word count (AWC) per minute look like for different children? Is there greater consistency within- than between-children? d’Apice, Latham, & von Stumm (2019) found that daily home language input varied as much within- as between families (intraclass correlation = .47), but this was computed on the daily totals per child.
We look at 32985 5-minute segments from 261 children’s day-long recordings, although for most analyses exclude the 1880 nap segments (78.1% of which have 0 AWC). Of the non-nap segments, only 6.9% of the non-nap segments have 0 AWC.
## `summarise()` has grouped output by 'Dataset'. You can override using the `.groups` argument.
Dataset | Nchildren | totalSegments | mean_segments_per_child | meanAWC_per_min | meanCVC_per_min | meanCTC_per_min | meanPropNonZeroAWC |
---|---|---|---|---|---|---|---|
CONTX | 90 | 12728 | 141.42 | 22.19 | 4.38 | 1.10 | 0.92 |
Joel EN | 56 | 6365 | 113.66 | 28.27 | 3.48 | 1.06 | 0.96 |
Joel SP | 52 | 5371 | 103.29 | 22.45 | 3.07 | 0.84 | 0.92 |
SOT Outreach | 29 | 2889 | 99.62 | 24.09 | 3.28 | 0.83 | 0.93 |
SOT Stanford | 27 | 3064 | 113.48 | 25.45 | 3.66 | 1.11 | 0.93 |
WF | 29 | 2568 | 88.55 | 18.14 | 2.57 | 0.62 | 0.91 |
Scatterplot matrix of children’s mean values of hourly AWC, CVC, CTC, and consistency (Lucy King’s definition: proportion of segments with AWC > 0).
## `summarise()` has grouped output by 'id'. You can override using the `.groups` argument.
## AWC Distribution
What does the distribution of AWC per minute (per segment, colored per child) look like in the non-nap segments? This Poisson distribution is shown below, with dotted, colored lines showing the mean per child, and the dashed black line showing the overall mean. Overall, there is a mean of 23.84 words/minute (1430 words/hour)
d_ch <- d %>%
mutate(id = as.factor(id)) %>%
group_by(id) %>% summarise(AWC = sum(AWC),
dur = sum(dur_min),
AWC_per_min = AWC / dur,
AWC_per_hr = AWC / (dur / 60))
# Two children with >50 words per minute (>3000/hr)
# subset(d_ch, AWC_per_min > 50)
d %>%
mutate(id = as.factor(id)) %>%
ggplot( aes(x=AWC_per_min, fill=id)) +
geom_histogram(alpha=0.6, position = 'stack') +
theme_bw() + theme(legend.position="none") +
geom_vline(data=d_ch, aes(xintercept=AWC_per_min, color=id), alpha=0.8, linetype='dotted') +
geom_vline(aes(xintercept=mean(d_ch$AWC_per_min)), linetype="dashed") # 34.4*60*12 = 24.8k
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
How reliable is the per-child estimate of AWC? First we simply look at children’s mean AWC per minute (with bootstrapped 95% CIs) as a function of the number of 5-minute LENA segments. No significant correlation between a child’s hourly AWC and the total duration of that child’s recordings (r(259) = .10, p = .091).
The higher a child’s mean AWC, the larger the confidence interval (r(259) = .66, p < .001), which makes sense given that AWC is Poisson-distributed (mean = variance). The more segments you collect per child, the smaller the confidence interval (r(259) = -.38, p < .001)
## # A tibble: 1 x 5
## n empirical_stat ci_lower mean ci_upper
## <int> <dbl> <dbl> <dbl> <dbl>
## 1 258 1416. 1342. 1415. 1491.
Repeatedly resample 50% of the segments per child and see what the reliability (ICC2) of hourly AWC, CVC, and CTC for each split half looks like.
split_half <- function(dat, samples=100, plot=F) {
dat$row = 1:nrow(dat)
#rs = rep(NA, samples)
rdf <- tibble()
for(samp in 1:samples) {
s1 <- dat %>% group_by(id) %>% slice_sample(prop=.5) %>% ungroup() # or n=6 (half an hour..)
s2 <- subset(dat, !is.element(row, s1$row)) # select all segments not in s1
s1m <- s1 %>% group_by(id) %>%
summarise(totAWC=sum(AWC),
totCVC=sum(CVC),
totCTC=sum(CTC),
dur_hr=sum(dur_min)/60, n=n(), .groups='keep') %>%
mutate(hourlyAWC = totAWC / dur_hr,
hourlyCVC = totCVC / dur_hr,
hourlyCTC = totCTC / dur_hr) %>% arrange(id)
s2m <- s2 %>% group_by(id) %>%
summarise(totAWC=sum(AWC),
totCVC=sum(CVC),
totCTC=sum(CTC),
dur_hr=sum(dur_min)/60, n=n(), .groups='keep') %>%
mutate(hourlyAWC = totAWC / dur_hr,
hourlyCVC = totCVC / dur_hr,
hourlyCTC = totCTC / dur_hr) %>% arrange(id)
#rs[samp] = cor(s1m$hourlyAWC, s2m$hourlyAWC)
#rs[samp] = ICC(cbind(s1m$hourlyAWC, s2m$hourlyAWC), lmer=F)$results$ICC[2]
rdf <- rdf %>%
bind_rows(
tibble(sample=rep(samp, 3),
var = c("AWC","CVC","CTC"),
ICC = c(ICC(cbind(s1m$hourlyAWC, s2m$hourlyAWC), lmer=F)$results$ICC[2],
ICC(cbind(s1m$hourlyCVC, s2m$hourlyCVC), lmer=F)$results$ICC[2],
ICC(cbind(s1m$hourlyCTC, s2m$hourlyCTC), lmer=F)$results$ICC[2])))
if(plot) plot(s1m$hourlyAWC, s2m$hourlyAWC,
xlab="Hourly AWC from Half 1",
ylab="Hourly AWC from Half 2")
}
return(rdf)
}
nonap_rel = split_half(d)
# split-half correlation results:
#mean(nonap_rel) # .92
#wnaps_rel = split_half(d)
#mean(wnaps_rel) # .916
split_half(d, samples=1, plot=T)
## # A tibble: 3 x 3
## sample var ICC
## <int> <chr> <dbl>
## 1 1 AWC 0.907
## 2 1 CVC 0.949
## 3 1 CTC 0.932
We show an example split-half plot above for hourly AWC. The split-half reliability results of 100 resamples for hourly AWC, CTC, and CVC are shown below, and are generally quite high.
nonap_rel %>% group_by(var) %>%
summarise(`Mean ICC` = mean(ICC), sd = sd(ICC)) %>%
kable(digits=2)
var | Mean ICC | sd |
---|---|---|
AWC | 0.92 | 0.01 |
CTC | 0.93 | 0.01 |
CVC | 0.94 | 0.01 |
Next we examine the reliability of using just a subset of each child’s segments (N=1..30) to estimate hourly AWC compared to using all of each child’s segments to estimate hourly AWC. Note that the minimum number of segments from a child is 37, but the median number of segments per child is 120 (sd = 50.8976895).
rel_n_samples <- function(dat, Ns=seq(1,30), method="", samples=100) {
dat$row = 1:nrow(dat)
s2m <- dat %>% group_by(Dataset,id) %>%
summarise(totAWC=sum(AWC),
totCVC=sum(CVC),
totCTC=sum(CTC),
dur_hr=sum(dur_min)/60, n=n(), .groups='keep') %>%
mutate(hourlyAWC = totAWC / (dur_hr+.00001),
hourlyCVC = totCVC / (dur_hr+.00001),
hourlyCTC = totCTC / (dur_hr++.00001)) %>%
arrange(Dataset,id) #%>%
#filter(n >= 100) # 181 kids with at least 100 segments
#dat <- subset(dat, is.element(id, s2m$id))
# can't just use ID because some SOT Outreach IDs are same as CONTX ideas -- should uniqueify them!
rdf = tibble()
for(i in Ns) {
rs <- tibble()
for(samp in 1:samples) {
if(method=="firstN") {
s1 <- dat %>% group_by(Dataset,id) %>%
slice_head(n=i) %>% ungroup() %>% arrange(Dataset,id)
} else if(method=="lastN") {
s1 <- dat %>% group_by(Dataset,id) %>%
slice_tail(n=i) %>% ungroup() %>% arrange(Dataset,id)
} else if(method=="randomN") { # random segments
s1 <- dat %>% group_by(Dataset,id) %>%
slice_sample(n=i) %>% ungroup() %>% arrange(Dataset,id)
} else if(method=="highestN") { # highest AWC segments
s1 <- dat %>% group_by(Dataset,id) %>%
slice_max(n=i, order_by=AWC) %>% ungroup() %>% arrange(Dataset,id)
}
# contiguousN - select startindex less than final index - N;
s1m <- s1 %>% group_by(Dataset,id) %>%
summarise(totAWC=sum(AWC),
totCVC=sum(CVC),
totCTC=sum(CTC),
dur_hr=sum(dur_min)/60, n=n(), .groups='keep') %>%
mutate(hourlyAWC = totAWC / (dur_hr+.00001),
hourlyCVC = totCVC / (dur_hr+.00001),
hourlyCTC = totCTC / (dur_hr+.00001)) %>%
arrange(Dataset,id)
rs <- rs %>%
bind_rows(
tibble(sample=rep(samp, 3),
var = c("AWC","CVC","CTC"),
ICC = c(ICC(cbind(s1m$hourlyAWC, s2m$hourlyAWC), lmer=F)$results$ICC[2],
ICC(cbind(s1m$hourlyCVC, s2m$hourlyCVC), lmer=F)$results$ICC[2],
ICC(cbind(s1m$hourlyCTC, s2m$hourlyCTC), lmer=F)$results$ICC[2])))
}
rdf <- rdf %>% bind_rows(rs %>% group_by(var) %>%
summarise(sd = sd(ICC), ICC = mean(ICC)) %>%
mutate(N = i))
}
return(rdf)
}
randomN = rel_n_samples(d, method="randomN")
firstN = rel_n_samples(d, method="firstN")
lastN = rel_n_samples(d, method="lastN")
highestN = rel_n_samples(d, method="highestN")
save(randomN, firstN, lastN, highestN, file="data/reliability_Nsegments.Rdata")
Despite immense variability per 5-minute segment, we see surprisingly high reliability (over 0.75 with 8 segments–40 minutes). CVC is more reliable than AWC or CTC.
Here we’re taking 1 day of recording and randomly sampling 50% of the 5-min segments from that day, which might get a fairly uniform sample of the day’s activities, compared to von Stumm’s use of 3 (perhaps very different?) days of recording (5.81-18.08 hr, M=15.06 hr), from which they looked at the day-to-day ICC: “Across families’ three days of recordings, adult word count estimates correlated .42, .46 and .56 within families. The ICC for adult word counts within families across days was .47”. Maybe we would see much lower correlations looking at first half vs. second half of day? I do suspect that it is analytically true for Poisson distributed variables that the larger the sampling period (e.g., 1 hour instead of 5 minutes), the greater the variability (and thus the lower the reliability). So von Stumm’s use of 1 day would be expected to have lower reliability.
There might have also been some difference in the type of days parents recorded on between our dataset and theirs, based on the instructions parents were given: d’Apice simply asked parents to choose days when their child was not in daycare, whereas Virginia and Janet instructed parents to choose “typical days” (days without special events like birthday parties).
Using the highest-AWC segments from each child will result in overestimates of their mean AWC, so instead of ICC we should perhaps look at correlations between the subsets and the full sample.
Overall, children’s hourly AWC (within 1 day) is quite reliable. Finally, we will test reliability using only the first N segments per child (at least the CONTX segments–50% of the dataset–are sorted by recording time).
Below we examine how reliably a child’s hourly AWC, CVC, and CTC can be estimated using the child’s first or last contiguous N segments (for N=1..30). This scheme – simply using the first N segments of a child’s recording does not result in reliable estimates, even for N=30 segments (2.5 hrs).
We would like to test the reliability of Lucy King’s definition of consistency: proportion of segments with AWC > 0.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Based on 100 resamples, split-half reliability of consistency (proportion of segments with 0 AWC) is reasonably high (\(r=0.814\), sd=0.019).
Is it the case that using a longer time period (e.g., an 8-hour day) as opposed to an equivalent number of shorter intervals (e.g., 8 1-hr periods) will result in less reliable estimates of a Poisson-distributed variable’s mean?
# von Stumm used day-long recordings from 107 children
set.seed(123)
Nsubj = 107
true_rate_hr = sort(rnorm(Nsubj, mean=1200, sd=500))
#lambda_hr = rpois(107, mean=1200, sd=400) # hourly input rate
day1 <- rpois(Nsubj, 8*true_rate_hr)
day2 <- rpois(Nsubj, 8*true_rate_hr)
cor(day1, day2) # .999
## [1] 0.999366
hours <- matrix(rpois(8 * Nsubj, true_rate_hr), byrow = F, nrow=Nsubj)
hourly_mean = rowSums(hours) / ncol(hours)
cor(true_rate_hr, hourly_mean)
## [1] 0.9996545
cor(true_rate_hr, day1)
## [1] 0.9996461
# error:
day_error = sum((day1/8 - true_rate_hr)^2)
day2_error = sum((day2/8 - true_rate_hr)^2)
hourly_error = sum((hourly_mean - true_rate_hr)^2)