This is an exploration of the Dutch data from Year 1 (2018), focusing on the effects of item repetition on sentence reading time. Feedback would be greatly appreciated. Unless some gross stupidity is identified, I will submit an abstract based on Part B below for presentation at Psychonomics (fall 2019).

Just as a general reminder, this is an “orthographic learning” study, in which children in Grades 2 and 5 (\(N=81\) and 59, respectively) read aloud sentences containing target items. Target items can be words or pseudowords, short or long (i.e., 1 vs. 2 syllables), and they are repeated either two or six times over the course of the experiment. There are 83 trials in total: 3 practice, 16 fillers (with comprehension/attention questions), and 64 experimental. The 64 experimental trials comprise 2 lengths \(\times\) 2 lexicalities \(\times\) 2 items for each repetition condition, i.e., \(2 \times 2 \times 2 \times (2+6) = 64\). Assignment of items to repetition condition (2 vs. 6) and to sentences is randomized individually per participant, resulting in random crossing combined with counterbalancing of conditions; this is modeled with fixed effects plus crossed random effects of participants, target items, and sentences.
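For a concrete check of the trial arithmetic, here is a minimal sketch of the per-participant design grid (the variable names are hypothetical):

# Hypothetical per-participant design grid: 2 lengths x 2 lexicalities x 2 items,
# fully crossed with the two repetition conditions
design <- expand.grid(Nsyl=c(1,2), lex=c("w","p"), item=c(1,2), targreps=c(2,6))
nrow(design)          # 16 item slots per participant
sum(design$targreps)  # 8 items x 2 reps + 8 items x 6 reps = 64 experimental trials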

Importantly, each participant reads each sentence only once, so there are no sentence repetition effects. Sentences all share a similar structure, in that the target item is preceded by an adjective and followed by a verb, so some form of structural priming might be possible during the experiment. Together with other long-term effects, such as task learning and fatigue, this is accounted for by modeling trial number as a continuous predictor.

In the following we focus on the items that are repeated six times (i.e., twice-appearing items are not included in the analyses). We employ generalized additive mixed models (GAMMs), a class of nonlinear regression models that allow curvilinear relationships among variables. We use the R package mgcv, which can also model random effects (including crossed random effects and random slopes). In all models, nonlinear relationships are modeled with penalized cubic splines with shrinkage, so that coefficients can shrink to zero, effectively smoothing a term out of the model under sufficiently strong penalization. In other words, we are trying to force only substantial effects to remain in the model.
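As a toy illustration of the shrinkage behavior (using simulated data, not the study data): with the "cs" basis, a smooth of pure noise is penalized down toward zero effective degrees of freedom:

library(mgcv)
set.seed(1)
toy <- data.frame(x=runif(200), y=rnorm(200))  # y is unrelated to x
mtoy <- gam(y ~ s(x, bs="cs", k=10), data=toy, method="REML")
summary(mtoy)$edf  # typically near 0: the smooth is shrunk (almost) out of the model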

The basic premise is as follows: Reading time is the main behavioral dependent variable in sentence reading (outside of eye-tracking experiments, that is), serving as an index of oral reading fluency (assuming no reading errors). Since our items are randomly crossed with sentences, we expect reading times to depend on both the sentence and the item. However, items are repeated, and we see (as expected) that item repetition leads to reduced item viewing time. It is also conceivable that item repetition might affect the viewing time of the preceding (\(N-1\)) and following (\(N+1\)) item, via preview or spillover, respectively. Fixed effects such as lexicality and length can take up some item variance. Beyond that, however, there is no other obvious source of systematic effects on sentence reading time. In other words, once we account for the effect of repetition on the target item, there should be no further effect of repetition on the sentence. Spoiler: this turns out not to be the case.
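Schematically, total sentence reading time decomposes as \(RT_{sentence} \approx t_{N-1} + t_{N} + t_{N+1} + t_{rest}\); on this premise, repetition should operate only through the first three terms, leaving \(t_{rest}\) untouched. (This is only a rough sketch, of course, since the times are later log-transformed and modeled nonlinearly.)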

library(mgcv)     # bam() for large GAMMs
library(parallel) # cluster for parallel bam() fitting
library(itsadug)  # GAMM plotting utilities (plot_smooth)
nc <- 6               # number of worker processes
cl <- makeCluster(nc) # passed to bam(cluster=cl) below
par(mfrow=c(2,2))

Part A: Model target viewing times

It turns out that the effect of repetition on target viewing time is complex, depending on interacting properties of the participant (grade and reading skill) and the item (lexicality and length). We will not disentangle these here, but just display the model. There are random intercepts of participant, sentence, and target item, as well as random slopes of repetition per participant, conditioned on lexicality, i.e., two participant-specific effects of repetition, one for words and one for pseudowords (in mgcv notation, s(repNC, lex, sID, bs="re"): one random slope of repNC per combination of lexicality and participant).

As a general approach, we only use trials with target viewing times between 80 ms and 10 s and no reading errors on the target item. Independent variables have been centered (indicated by a final “C”) to permit interpretation of interactions. Time measures (such as target viewing times) are logarithmically transformed (indicated by a final “l”) to approximate a normal distribution (and lead to approximately normally distributed residuals). This complicates the interpretation of some parameters; we will deal with that below using simplified models with untransformed variables for illustration.
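For concreteness, here is a hypothetical sketch of these naming conventions; the actual preprocessing code is not shown in this document, so the exact definitions are assumptions:

# Hypothetical sketch of the variable naming conventions (not the actual
# preprocessing): "S" = seconds, final "l" = log-transformed, final "C" = centered
mergedat$DwelTS  <- mergedat$IA_DWELL_TIME/1000          # target viewing time in seconds
mergedat$DwelTSl <- log(mergedat$DwelTS)                 # log time ("l")
mergedat$repNC   <- mergedat$repN - mean(mergedat$repN)  # centered repetition number ("C")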

bm1 <- bam( DwelTSl ~ 1 + gradeC + lexC + NsylC + gradeC:NsylC +
               te(WR_std,repN,by=gradelexsyl,bs=c("cs","cs"),k=c(16,6)) + 
               s(sID,bs="re") + s(sentID,bs="re") + s(targID,bs="re") + 
               s(repNC,lex,sID,bs="re",m=1),
             cluster=cl,
             data=mergedat,
             subset= IA_DWELL_TIME>=80 & IA_DWELL_TIME<=10000 & Err.Targ==0 & targreps == 6 
)
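(The by-variable gradelexsyl is the combined grade \(\times\) lexicality \(\times\) length factor; a hypothetical construction, assuming columns grade, lex, and Nsyl, would be the following.)

# Hypothetical construction of the combined by-factor; levels such as "G2p1"
# match those in the model summary below
mergedat$gradelexsyl <- interaction(mergedat$grade, mergedat$lex, mergedat$Nsyl,
                                    sep="", drop=TRUE)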

This model is not bad at all; all effects are significant, half of the variance is accounted for, and the residual distribution seems rather tame:

summary(bm1)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## DwelTSl ~ 1 + gradeC + lexC + NsylC + gradeC:NsylC + te(WR_std, 
##     repN, by = gradelexsyl, bs = c("cs", "cs"), k = c(16, 6)) + 
##     s(sID, bs = "re") + s(sentID, bs = "re") + s(targID, bs = "re") + 
##     s(repNC, lex, sID, bs = "re", m = 1)
## 
## Parametric coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -0.35653    0.03669  -9.717  < 2e-16 ***
## gradeC       -0.73290    0.04500 -16.286  < 2e-16 ***
## lexC          0.26893    0.05855   4.593 4.48e-06 ***
## NsylC         0.26189    0.05857   4.472 7.94e-06 ***
## gradeC:NsylC -0.20884    0.03377  -6.184 6.78e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                                     edf Ref.df      F  p-value    
## te(WR_std,repN):gradelexsylG2p1  17.199     75  4.275  < 2e-16 ***
## te(WR_std,repN):gradelexsylG2p2   9.269     69  4.128  < 2e-16 ***
## te(WR_std,repN):gradelexsylG2w1  11.639     74  4.045  < 2e-16 ***
## te(WR_std,repN):gradelexsylG2w2   9.614     74  4.240  < 2e-16 ***
## te(WR_std,repN):gradelexsylG5p1   7.624     91  1.101 2.21e-12 ***
## te(WR_std,repN):gradelexsylG5p2  11.490     86  2.726  < 2e-16 ***
## te(WR_std,repN):gradelexsylG5w1   4.452     91  0.509 1.10e-06 ***
## te(WR_std,repN):gradelexsylG5w2   7.556     92  1.068 3.19e-11 ***
## s(sID)                          112.213    134  6.503 8.21e-11 ***
## s(sentID)                        28.323     63  0.905 0.000109 ***
## s(targID)                        25.842     29 29.140 5.70e-08 ***
## s(repNC,lex,sID)                 73.957    272  0.389 2.21e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.493   Deviance explained = 52.5%
## fREML = 4613.7  Scale est. = 0.31131   n = 5123
gam.check(bm1)

## 
## Method: fREML   Optimizer: perf newton
## full convergence after 11 iterations.
## Gradient range [-9.616087e-09,8.505867e-09]
## (score 4613.651 & scale 0.311308).
## Hessian positive definite, eigenvalue range [0.372778,2560.972].
## Model rank =  1269 / 1269 
## 
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
## 
##                                     k'    edf k-index p-value
## te(WR_std,repN):gradelexsylG2p1  95.00  17.20    1.01    0.69
## te(WR_std,repN):gradelexsylG2p2  95.00   9.27    1.01    0.68
## te(WR_std,repN):gradelexsylG2w1  95.00  11.64    1.01    0.69
## te(WR_std,repN):gradelexsylG2w2  95.00   9.61    1.01    0.69
## te(WR_std,repN):gradelexsylG5p1  95.00   7.62    1.01    0.64
## te(WR_std,repN):gradelexsylG5p2  95.00  11.49    1.01    0.64
## te(WR_std,repN):gradelexsylG5w1  95.00   4.45    1.01    0.68
## te(WR_std,repN):gradelexsylG5w2  95.00   7.56    1.01    0.71
## s(sID)                          136.00 112.21      NA      NA
## s(sentID)                        64.00  28.32      NA      NA
## s(targID)                        32.00  25.84      NA      NA
## s(repNC,lex,sID)                272.00  73.96      NA      NA

Here’s what the effects look like – as noted above, they are complex, and I am not going to try to interpret them here, because that’s not the point. Rather, the point is that a lot has already gone into the target viewing time. So, how does that account for sentence reading time?

ylm <- c(-1.2,+1.2)
par(mfcol=c(4,2))
par(oma=c(1,1,0,0))
par(mar=c(1.75,1.75,1.25,0.01))
par(mgp=c(1.5,0.5,0))
par(tcl=-0.25)
wrng <- c(7,10,13,16)
for (lx in c("w","p")) {
  for (gd in c("G2","G5")) {
    for (ns in c(1,2)) {
      for (wr in wrng) {
        plot_smooth(bm1,view="repN",
                    cond=list(WR_std=wr,gradeC=0.5-(gd=="G2"),lexC=0.5-(lx=="w"),NsylC=0.5-(ns==1),
                              gradelexsyl=paste(gd,lx,ns,sep="")),
                    rm.ranef=T,se=0,print.summary=F,ylim=ylm,#transform=exp, #need to adjust ylim for transformation
                    xlab=ifelse(gd=="G5"&ns==2,"repN",""),
                    ylab=ifelse(lx=="w","Effect on log dwell time on target",""),
                    lwd=2,col=ifelse(wr<=10,"blue","darkgreen"),lty=1+(wr==min(wrng))+(wr==max(wrng)),
                    xpd=NA,hide.label=T,
                    add= wr!=min(wrng))
      }
      title(paste("Grade ",gd,", ",ns,"-syllable ",ifelse(lx=="w","words","pseudowords"),sep=""),line=-1)
      grid(nx=NA,ny=8)
    }
  }
}
legend(-0.8,4.05,paste("Word reading standard score =",wrng),col=rep(c("blue","darkgreen"),each=2),lty=c(2,1,1,2),lwd=2,xpd=NA,cex=1.1,bg="white")

par(mfrow=c(2,2))

Part B: Model sentence reading times

This is the part that I consider presenting at Psychonomics.

Jumping straight to the final model, we control for target viewing time, pre-target item (\(N-1\)) viewing time, and post-target item (\(N+1\)) viewing time, as well as for trial number (the aforementioned long-term effects). All of these are accompanied by random slopes per participant. Grade and lexicality have significant fixed effects, but length does not. All fixed-effects interactions are nonsignificant, so none are included in the model. So, the question is: are there significant effects of repetition when all of these are included in the model?

bm9el <- bam( ReadTSl ~ 1 + gradeC + lexC + NsylC +
               s(DwelTSl,bs="cs",k=12) + 
               s(DwelNp1Sl,bs="cs",k=12) +
               s(DwelNm1Sl,bs="cs",k=12) +
               s(repN,by=grade,bs="cs",k=6) +  
               s(TRIAL,by=grade,bs="cs",k=6) +
               s(sID,bs="re") + s(sentID,bs="re") + s(targID,bs="re") + 
               s(repNC,sID,bs="re",m=1) +
               s(TRIAL,sID,bs="re",m=1) +
               s(DwelTSl,sID,bs="re",m=1) +
               s(DwelNp1Sl,sID,bs="re",m=1) +
               s(DwelNm1Sl,sID,bs="re",m=1),
             cluster=cl,
             data=mergedat,na.action=na.exclude,
             subset= IA_DWELL_TIME>=80 & IA_DWELL_TIME<=10000 & Err.Targ==0 & targreps == 6 
)
# remove non-fitting cases (outlying residuals) and re-fit; note that the
# N-1 viewing-time terms are not retained in the refit
#head(sort(resid(bm9el))) # show the largest negative residuals
#tail(sort(resid(bm9el))) # show the largest positive residuals
#nrow(mergedat[mergedat$IA_DWELL_TIME>=80 & mergedat$IA_DWELL_TIME<=10000 & mergedat$Err.Targ==0 & mergedat$targreps == 6,]) == length(resid(bm9el)) 
mergedatr <- mergedat[mergedat$IA_DWELL_TIME>=80 & mergedat$IA_DWELL_TIME<=10000 & mergedat$Err.Targ==0 & mergedat$targreps == 6,]
# na.action=na.exclude above means resid() can contain NAs; exclude those rows too
mergedatr <- mergedatr[!is.na(resid(bm9el)) & abs(resid(bm9el))<0.677,]
bm9elr <- bam( ReadTSl ~ 1 + gradeC + lexC + NsylC +
                s(DwelTSl,bs="cs",k=12) + 
                s(DwelNp1Sl,bs="cs",k=12) +
                s(repN,by=grade,bs="cs",k=6) +  
                s(TRIAL,by=grade,bs="cs",k=6) +
                s(sID,bs="re") + s(sentID,bs="re") + s(targID,bs="re") + 
                s(repNC,sID,bs="re",m=1) +
                s(TRIAL,sID,bs="re",m=1) +
                s(DwelTSl,sID,bs="re",m=1) +
                s(DwelNp1Sl,sID,bs="re",m=1),
              cluster=cl,
              data=mergedatr 
)

The answer is clearly yes: repetition has an additional effect on sentence reading time beyond target viewing time and all the rest:

summary(bm9elr)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## ReadTSl ~ 1 + gradeC + lexC + NsylC + s(DwelTSl, bs = "cs", k = 12) + 
##     s(DwelNp1Sl, bs = "cs", k = 12) + s(repN, by = grade, bs = "cs", 
##     k = 6) + s(TRIAL, by = grade, bs = "cs", k = 6) + s(sID, 
##     bs = "re") + s(sentID, bs = "re") + s(targID, bs = "re") + 
##     s(repNC, sID, bs = "re", m = 1) + s(TRIAL, sID, bs = "re", 
##     m = 1) + s(DwelTSl, sID, bs = "re", m = 1) + s(DwelNp1Sl, 
##     sID, bs = "re", m = 1)
## 
## Parametric coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.531396   0.025670 -20.701  < 2e-16 ***
## gradeC      -0.384522   0.037913 -10.142  < 2e-16 ***
## lexC         0.017176   0.006013   2.856  0.00431 ** 
## NsylC        0.006350   0.006026   1.054  0.29208    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                        edf Ref.df        F  p-value    
## s(DwelTSl)       5.963e+00     11 2157.663  < 2e-16 ***
## s(DwelNp1Sl)     5.975e+00     11 1733.336  < 2e-16 ***
## s(repN):gradeG2  2.322e+00      5    6.173 7.31e-05 ***
## s(repN):gradeG5  2.139e+00      5   18.535 2.82e-06 ***
## s(TRIAL):gradeG2 9.469e-01      5    0.546 0.233333    
## s(TRIAL):gradeG5 7.170e-05      5    0.000 0.480724    
## s(sID)           1.297e+02    138  135.556  < 2e-16 ***
## s(sentID)        6.166e+01     63   90.038  < 2e-16 ***
## s(targID)        5.603e+00     29    0.365 0.168576    
## s(repNC,sID)     5.745e-04    140    0.000 0.507178    
## s(TRIAL,sID)     6.020e+01    140   56.210 0.000423 ***
## s(DwelTSl,sID)   6.109e+01    140   12.980 0.083389 .  
## s(DwelNp1Sl,sID) 4.303e+01    140    6.229 0.404314    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.869   Deviance explained = 87.9%
## fREML = -1152.4  Scale est. = 0.028392  n = 4710
gam.check(bm9elr)

## 
## Method: fREML   Optimizer: perf newton
## full convergence after 15 iterations.
## Gradient range [-1.26357e-05,1.403139e-05]
## (score -1152.357 & scale 0.02839183).
## Hessian positive definite, eigenvalue range [2.577504e-06,2356.234].
## Model rank =  842 / 842 
## 
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
## 
##                        k'      edf k-index p-value
## s(DwelTSl)       1.10e+01 5.96e+00    1.00    0.40
## s(DwelNp1Sl)     1.10e+01 5.97e+00    1.00    0.66
## s(repN):gradeG2  5.00e+00 2.32e+00    1.02    0.88
## s(repN):gradeG5  5.00e+00 2.14e+00    1.02    0.86
## s(TRIAL):gradeG2 5.00e+00 9.47e-01    1.00    0.49
## s(TRIAL):gradeG5 5.00e+00 7.17e-05    1.00    0.56
## s(sID)           1.40e+02 1.30e+02      NA      NA
## s(sentID)        6.40e+01 6.17e+01      NA      NA
## s(targID)        3.20e+01 5.60e+00      NA      NA
## s(repNC,sID)     1.40e+02 5.75e-04      NA      NA
## s(TRIAL,sID)     1.40e+02 6.02e+01      NA      NA
## s(DwelTSl,sID)   1.40e+02 6.11e+01      NA      NA
## s(DwelNp1Sl,sID) 1.40e+02 4.30e+01      NA      NA

And this is what the effects look like, on the log scale of the model:

par(mfrow=c(3,2))

plot.gam(bm9elr,select=1,main="Sentence reading time by target viewing time"                       ,rug=T,xlim=c(-2.5,2.5),ylim=c(-0.2,1),xlab="Target item log viewing time",ylab="Sentence log reading time",col="yellow",shade.col="#505000",shade=T)
grid(col="gray40")
plot.gam(bm9elr,select=2,main="Sentence reading time by N+1 viewing time"                          ,rug=T,xlim=c(-2.5,2.5),ylim=c(-0.2,1),xlab="Target+1 item log viewing time",ylab="Sentence log reading time",col="yellow",shade.col="#505000",shade=T)
grid(col="gray40")

plot.gam(bm9elr,select=5,main="Effect of trial in Grade 2"                ,rug=T,xlim=c(4,83),ylim=c(-0.1,0.1),xlab="Trial",ylab="Sentence log reading time",col="yellow",shade.col="#505000",shade=T)
grid(col="gray40")
plot.gam(bm9elr,select=6,main="Effect of trial in Grade 5"                ,rug=T,xlim=c(4,83),ylim=c(-0.1,0.1),xlab="Trial",ylab="Sentence log reading time",col="yellow",shade.col="#505000",shade=T)
grid(col="gray40")

plot.gam(bm9elr,select=3,main="Additional effect of repetition in Grade 2",rug=T,xlim=c(1,6) ,ylim=c(-0.1,0.1),xlab="Number of repetitions",ylab="Sentence log reading time",col="yellow",shade.col="#505000",shade=T)
grid(col="gray40")
plot.gam(bm9elr,select=4,main="Additional effect of repetition in Grade 5",rug=T,xlim=c(1,6) ,ylim=c(-0.1,0.1),xlab="Number of repetitions",ylab="Sentence log reading time",col="yellow",shade.col="#505000",shade=T)
grid(col="gray40")

These are difficult to think about in terms of actual time, because all times have been log-transformed and are therefore no longer additive with respect to the baseline. We therefore note that all terms are significant, and fit a model with untransformed times so that we can verify the scale of the effects. (Recall that GAMMs model curvilinear relationships, so the rescaling may not be as bad as it sounds.)
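For intuition about the log scale: an additive effect on log time is multiplicative on the time scale, since \(\exp(\log t + \beta) = t\,e^{\beta}\). For example:

# an effect of -0.05 on log reading time is a ~5% multiplicative reduction
# in time, not a fixed number of seconds
exp(-0.05)  # ~0.951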

bm9e <- bam( ReadTsec ~ 1 + gradeC + lexC + NsylC +
               s(DwelTS,bs="cs",k=12) + 
               s(DwelNp1S,bs="cs",k=12) +
               s(DwelNm1S,bs="cs",k=12) +
               s(repN,by=grade,bs="cs",k=6) +  
               s(TRIAL,by=grade,bs="cs",k=6) +
               s(sID,bs="re") + s(sentID,bs="re") + s(targID,bs="re") + 
               s(repNC,sID,bs="re",m=1) +
               s(TRIAL,sID,bs="re",m=1) +
               s(DwelTS,sID,bs="re",m=1) +
               s(DwelNp1S,sID,bs="re",m=1) +
               s(DwelNm1S,sID,bs="re",m=1),
             cluster=cl,
             data=mergedat,
             subset= IA_DWELL_TIME>=80 & IA_DWELL_TIME<=10000 & Err.Targ==0 & targreps == 6 
)

The model summary is similar, but obviously the dispersion of the residuals is far from ideal, and not all terms are statistically significant here, including the critical repetition terms (although the AIC is lower when they are included in the model).
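To make the AIC claim concrete, the comparison would look something like the following; the reduced model bm9e0 is hypothetical (it is not fitted in this document):

# Hypothetical AIC comparison against a refit without the repetition smooths:
# bm9e0 <- update(bm9e, . ~ . - s(repN, by=grade, bs="cs", k=6))
# AIC(bm9e0, bm9e)  # lower AIC with the repetition terms included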

summary(bm9e)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## ReadTsec ~ 1 + gradeC + lexC + NsylC + s(DwelTS, bs = "cs", k = 12) + 
##     s(DwelNp1S, bs = "cs", k = 12) + s(DwelNm1S, bs = "cs", k = 12) + 
##     s(repN, by = grade, bs = "cs", k = 6) + s(TRIAL, by = grade, 
##     bs = "cs", k = 6) + s(sID, bs = "re") + s(sentID, bs = "re") + 
##     s(targID, bs = "re") + s(repNC, sID, bs = "re", m = 1) + 
##     s(TRIAL, sID, bs = "re", m = 1) + s(DwelTS, sID, bs = "re", 
##     m = 1) + s(DwelNp1S, sID, bs = "re", m = 1) + s(DwelNm1S, 
##     sID, bs = "re", m = 1)
## 
## Parametric coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  6.53941    0.17883  36.568  < 2e-16 ***
## gradeC      -2.15847    0.28792  -7.497 7.74e-14 ***
## lexC         0.13398    0.03905   3.431 0.000606 ***
## NsylC        0.11427    0.03907   2.925 0.003466 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                        edf Ref.df        F  p-value    
## s(DwelTS)        5.530e+00     11 1582.226  < 2e-16 ***
## s(DwelNp1S)      7.425e+00     11 2836.867  < 2e-16 ***
## s(DwelNm1S)      7.721e+00     11 6873.938  < 2e-16 ***
## s(repN):gradeG2  2.058e+00      5    2.466  0.07347 .  
## s(repN):gradeG5  9.540e-01      5    3.412  0.05192 .  
## s(TRIAL):gradeG2 1.588e+00      5    1.915  0.12520    
## s(TRIAL):gradeG5 2.241e-04      5    0.000  0.35985    
## s(sID)           1.262e+02    138  242.977  < 2e-16 ***
## s(sentID)        6.131e+01     63   70.756  < 2e-16 ***
## s(targID)        3.701e+00     29    0.195  0.22985    
## s(repNC,sID)     3.767e+01    140    0.746  0.12346    
## s(TRIAL,sID)     7.385e+01    140  123.853  0.00434 ** 
## s(DwelTS,sID)    2.641e+01    140   28.958 1.40e-08 ***
## s(DwelNp1S,sID)  4.476e+01    140   46.410 1.05e-11 ***
## s(DwelNm1S,sID)  5.162e+01    140   75.452  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.887   Deviance explained = 89.7%
## fREML =   9096  Scale est. = 1.4627    n = 5255
gam.check(bm9e)

## 
## Method: fREML   Optimizer: perf newton
## full convergence after 13 iterations.
## Gradient range [-1.188676e-05,1.769143e-05]
## (score 9096.04 & scale 1.462698).
## Hessian positive definite, eigenvalue range [1.189897e-05,2628.596].
## Model rank =  993 / 993 
## 
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
## 
##                        k'      edf k-index p-value   
## s(DwelTS)        1.10e+01 5.53e+00    0.96   0.005 **
## s(DwelNp1S)      1.10e+01 7.42e+00    1.03   0.965   
## s(DwelNm1S)      1.10e+01 7.72e+00    1.01   0.845   
## s(repN):gradeG2  5.00e+00 2.06e+00    0.99   0.200   
## s(repN):gradeG5  5.00e+00 9.54e-01    0.99   0.260   
## s(TRIAL):gradeG2 5.00e+00 1.59e+00    0.98   0.085 . 
## s(TRIAL):gradeG5 5.00e+00 2.24e-04    0.98   0.090 . 
## s(sID)           1.40e+02 1.26e+02      NA      NA   
## s(sentID)        6.40e+01 6.13e+01      NA      NA   
## s(targID)        3.20e+01 3.70e+00      NA      NA   
## s(repNC,sID)     1.40e+02 3.77e+01      NA      NA   
## s(TRIAL,sID)     1.40e+02 7.39e+01      NA      NA   
## s(DwelTS,sID)    1.40e+02 2.64e+01      NA      NA   
## s(DwelNp1S,sID)  1.40e+02 4.48e+01      NA      NA   
## s(DwelNm1S,sID)  1.40e+02 5.16e+01      NA      NA   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Anyway, here are the corresponding graphs, with all scales in seconds, so everything is clearly interpretable (and additive in time).

par(mfrow=c(3,2))

plot.gam(bm9e,select=1,main="Sentence reading time by target viewing time",rug=T,xlim=c(0,6),ylim=c(-1,6),xlab="Target item viewing time (sec)",ylab="Sentence total reading time (sec)",col="yellow",shade.col="#505000",shade=T)
grid(col="gray40")
plot.gam(bm9e,select=2,main="Sentence reading time by N+1 viewing time",rug=T,xlim=c(0,6),ylim=c(-1,6),xlab="Target+1 item viewing time (sec)",ylab="Sentence total reading time (sec)",col="yellow",shade.col="#505000",shade=T)
grid(col="gray40")

plot.gam(bm9e,select=6,main="Effect of trial in Grade 2",rug=F,xlim=c(4,83),ylim=c(-0.5,0.5),xlab="Trial",ylab="Sentence total reading time (sec)",col="yellow",shade.col="#505000",shade=T)
grid(col="gray40")
plot.gam(bm9e,select=7,main="Effect of trial in Grade 5",rug=F,xlim=c(4,83),ylim=c(-0.5,0.5),xlab="Trial",ylab="Sentence total reading time (sec)",col="yellow",shade.col="#505000",shade=T)
grid(col="gray40")

plot.gam(bm9e,select=4,main="Additional effect of repetition in Grade 2",rug=F,xlim=c(1,6),ylim=c(-0.5,0.5),xlab="Number of repetitions",ylab="Sentence total reading time (sec)",col="yellow",shade.col="#505000",shade=T)
grid(col="gray40")
plot.gam(bm9e,select=5,main="Additional effect of repetition in Grade 5",rug=F,xlim=c(1,6),ylim=c(-0.5,0.5),xlab="Number of repetitions",ylab="Sentence total reading time (sec)",col="yellow",shade.col="#505000",shade=T)
grid(col="gray40")

So it looks like there is a non-negligible additional effect of repetition on sentence reading time that cannot be attributed to viewing time on the target and adjacent items.

Is there a trivial explanation for this that I have missed?


Part C: Effects of repetition on target-adjacent items

Even though the aforementioned analysis includes viewing times on adjacent items in the model, so that the repetition effect on sentence reading time is “real”, there is still the question of whether the adjacent items themselves are affected by repetition. If so, this might better support an interpretation along the lines of freeing up resources (or would it?). Anyway, here are the (simplified) effects of repetition on item viewing time for the target and for the items before (\(N-1\)) and after (\(N+1\)) it. For comparison with the above analyses of sentence reading time, we only condition on grade. Times are logarithmically transformed, for best model fit. Note that in every case trials are conditioned on no reading errors on the target item, regardless of whether target or adjacent-item viewing times are being analyzed.

Target item

bm2N <- bam( DwelTSl ~ 1 + gradeC + lexC + NsylC + gradeC:NsylC +
               s(repN,by=grade,bs="cs",k=6) + 
               s(sID,bs="re") + s(sentID,bs="re") + s(targID,bs="re") + 
               s(repNC,sID,bs="re",m=1),
             cluster=cl,
             data=mergedat,
             subset= IA_DWELL_TIME>=80 & IA_DWELL_TIME<=10000 & Err.Targ==0 & targreps == 6 
)
summary(bm2N)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## DwelTSl ~ 1 + gradeC + lexC + NsylC + gradeC:NsylC + s(repN, 
##     by = grade, bs = "cs", k = 6) + s(sID, bs = "re") + s(sentID, 
##     bs = "re") + s(targID, bs = "re") + s(repNC, sID, bs = "re", 
##     m = 1)
## 
## Parametric coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -0.35916    0.04185  -8.582  < 2e-16 ***
## gradeC       -0.62699    0.06282  -9.981  < 2e-16 ***
## lexC          0.28310    0.05573   5.080 3.92e-07 ***
## NsylC         0.27673    0.05574   4.964 7.12e-07 ***
## gradeC:NsylC -0.17729    0.03174  -5.585 2.46e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                     edf Ref.df      F  p-value    
## s(repN):gradeG2   3.729      5 59.463  < 2e-16 ***
## s(repN):gradeG5   3.552      5 59.812  < 2e-16 ***
## s(sID)          127.425    138 17.719 7.66e-11 ***
## s(sentID)        27.319     63  0.877 0.000235 ***
## s(targID)        25.083     29 33.964 0.000125 ***
## s(repNC,sID)     54.717    140  0.690 2.47e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.475   Deviance explained =   50%
## fREML = 4776.6  Scale est. = 0.32144   n = 5256
gam.check(bm2N)

## 
## Method: fREML   Optimizer: perf newton
## full convergence after 9 iterations.
## Gradient range [-2.524377e-09,1.004082e-09]
## (score 4776.609 & scale 0.3214371).
## Hessian positive definite, eigenvalue range [1.212104,2627.499].
## Model rank =  391 / 391 
## 
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
## 
##                     k'    edf k-index p-value
## s(repN):gradeG2   5.00   3.73       1    0.48
## s(repN):gradeG5   5.00   3.55       1    0.46
## s(sID)          140.00 127.43      NA      NA
## s(sentID)        64.00  27.32      NA      NA
## s(targID)        32.00  25.08      NA      NA
## s(repNC,sID)    140.00  54.72      NA      NA

Preceding item (\(N-1\))

bm2Nm1 <- bam( DwelNm1Sl ~ 1 + gradeC + lexC + NsylC + gradeC:NsylC +
                s(repN,by=grade,bs="cs",k=6) + 
                s(sID,bs="re") + s(sentID,bs="re") + s(targID,bs="re") + 
                s(repNC,sID,bs="re",m=1),
              cluster=cl,
              data=mergedat,
              subset= Nm1_DWELL_TIME>=80 & Nm1_DWELL_TIME<=10000 & Err.Targ==0 & targreps == 6 
)
summary(bm2Nm1)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## DwelNm1Sl ~ 1 + gradeC + lexC + NsylC + gradeC:NsylC + s(repN, 
##     by = grade, bs = "cs", k = 6) + s(sID, bs = "re") + s(sentID, 
##     bs = "re") + s(targID, bs = "re") + s(repNC, sID, bs = "re", 
##     m = 1)
## 
## Parametric coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -0.463858   0.052361  -8.859   <2e-16 ***
## gradeC       -0.538943   0.054638  -9.864   <2e-16 ***
## lexC          0.009889   0.022874   0.432   0.6655    
## NsylC        -0.039456   0.022878  -1.725   0.0847 .  
## gradeC:NsylC -0.009289   0.031639  -0.294   0.7691    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                      edf Ref.df      F p-value    
## s(repN):gradeG2   1.4672      5  2.173 0.00856 ** 
## s(repN):gradeG5   0.4426      5  0.151 0.23742    
## s(sID)          125.4973    138 11.358 < 2e-16 ***
## s(sentID)        60.9865     63 35.441 < 2e-16 ***
## s(targID)        13.0827     29  1.322 0.01276 *  
## s(repNC,sID)     27.2497    140  0.265 0.02564 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.449   Deviance explained = 47.2%
## fREML = 5073.4  Scale est. = 0.33352   n = 5466
gam.check(bm2Nm1)

## 
## Method: fREML   Optimizer: perf newton
## full convergence after 11 iterations.
## Gradient range [-1.170973e-07,1.736316e-07]
## (score 5073.361 & scale 0.3335193).
## Hessian positive definite, eigenvalue range [0.05948174,2732.399].
## Model rank =  391 / 391 
## 
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
## 
##                      k'     edf k-index p-value  
## s(repN):gradeG2   5.000   1.467    0.98   0.090 .
## s(repN):gradeG5   5.000   0.443    0.98   0.085 .
## s(sID)          140.000 125.497      NA      NA  
## s(sentID)        64.000  60.987      NA      NA  
## s(targID)        32.000  13.083      NA      NA  
## s(repNC,sID)    140.000  27.250      NA      NA  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Following item (\(N+1\))

bm2Np1 <- bam( DwelNp1Sl ~ 1 + gradeC + lexC + NsylC + gradeC:NsylC +
                s(repN,by=grade,bs="cs",k=6) + 
                s(sID,bs="re") + s(sentID,bs="re") + s(targID,bs="re") + 
                s(repNC,sID,bs="re",m=1),
              cluster=cl,
              data=mergedat,
              subset= Np1_DWELL_TIME>=80 & Np1_DWELL_TIME<=10000 & Err.Targ==0 & targreps == 6 
)
summary(bm2Np1)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## DwelNp1Sl ~ 1 + gradeC + lexC + NsylC + gradeC:NsylC + s(repN, 
##     by = grade, bs = "cs", k = 6) + s(sID, bs = "re") + s(sentID, 
##     bs = "re") + s(targID, bs = "re") + s(repNC, sID, bs = "re", 
##     m = 1)
## 
## Parametric coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -0.47857    0.04236 -11.297   <2e-16 ***
## gradeC       -0.50809    0.05240  -9.697   <2e-16 ***
## lexC          0.06827    0.03331   2.049   0.0405 *  
## NsylC        -0.06235    0.03333  -1.871   0.0614 .  
## gradeC:NsylC -0.02106    0.03013  -0.699   0.4846    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                     edf Ref.df      F p-value    
## s(repN):gradeG2   1.377      5  1.068 0.08258 .  
## s(repN):gradeG5   1.579      5  2.232 0.01444 *  
## s(sID)          125.209    138 11.185 < 2e-16 ***
## s(sentID)        59.026     63 20.668 < 2e-16 ***
## s(targID)        20.706     29  6.614 0.00244 ** 
## s(repNC,sID)     27.341    140  0.255 0.03330 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.406   Deviance explained = 43.3%
## fREML = 4563.2  Scale est. = 0.29166   n = 5295
gam.check(bm2Np1)

## 
## Method: fREML   Optimizer: perf newton
## full convergence after 11 iterations.
## Gradient range [-1.331947e-05,1.956329e-05]
## (score 4563.166 & scale 0.2916576).
## Hessian positive definite, eigenvalue range [0.2195379,2646.957].
## Model rank =  391 / 391 
## 
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
## 
##                     k'    edf k-index p-value
## s(repN):gradeG2   5.00   1.38       1    0.56
## s(repN):gradeG5   5.00   1.58       1    0.54
## s(sID)          140.00 125.21      NA      NA
## s(sentID)        64.00  59.03      NA      NA
## s(targID)        32.00  20.71      NA      NA
## s(repNC,sID)    140.00  27.34      NA      NA

In every case there seems to be a diminishing effect with increasing repetition, most of it complete by repetition 3–4. The effects on the item preceding the target in Grade 2 and on the item following the target in Grade 5 are significant. The general shape and magnitude of these effects resemble the effect on sentence reading time in Grade 2 (direct comparisons are not possible due to the log scale, as differences in logs do not translate linearly into differences in times). Here is the graphical illustration of these effects (note the different vertical scale for the top row!):

par(mfrow=c(3,2))

plot.gam(bm2N,  select=1,main="Target viewing time - Grade 2",  rug=T,xlim=c(1,6),ylim=c(-0.5,0.5),xlab="Number of repetitions",ylab="Target item log viewing time",col="yellow",shade.col="#505000",shade=T)
grid(col="gray40")
plot.gam(bm2N,  select=2,main="Target viewing time - Grade 5",  rug=T,xlim=c(1,6),ylim=c(-0.5,0.5),xlab="Number of repetitions",ylab="Target item log viewing time",col="yellow",shade.col="#505000",shade=T)
grid(col="gray40")

plot.gam(bm2Nm1,select=1,main="Target-1 viewing time -- Grade 2",rug=T,xlim=c(1,6),ylim=c(-0.2,0.2),xlab="Number of repetitions",ylab="N-1 item log viewing time",col="yellow",shade.col="#505000",shade=T)
grid(col="gray40")
plot.gam(bm2Nm1,select=2,main="Target-1 viewing time -- Grade 5",rug=T,xlim=c(1,6),ylim=c(-0.2,0.2),xlab="Number of repetitions",ylab="N-1 item log viewing time",col="yellow",shade.col="#505000",shade=T)
grid(col="gray40")

plot.gam(bm2Np1,select=1,main="Target+1 viewing time -- Grade 2",rug=T,xlim=c(1,6),ylim=c(-0.2,0.2),xlab="Number of repetitions",ylab="N+1 item log viewing time",col="yellow",shade.col="#505000",shade=T)
grid(col="gray40")
plot.gam(bm2Np1,select=2,main="Target+1 viewing time -- Grade 5",rug=T,xlim=c(1,6),ylim=c(-0.2,0.2),xlab="Number of repetitions",ylab="N+1 item log viewing time",col="yellow",shade.col="#505000",shade=T)
grid(col="gray40")


sessionInfo()
## R version 3.4.4 (2018-03-15)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 7 x64 (build 7601) Service Pack 1
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_United States.1252 
## [2] LC_CTYPE=English_United States.1252   
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.1252    
## 
## attached base packages:
## [1] parallel  stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
## [1] itsadug_2.3       plotfunctions_1.3 mgcv_1.8-27       nlme_3.1-137     
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.0       codetools_0.2-16 lattice_0.20-38  digest_0.6.18   
##  [5] grid_3.4.4       magrittr_1.5     evaluate_0.13    stringi_1.3.1   
##  [9] Matrix_1.2-15    rmarkdown_1.11   splines_3.4.4    tools_3.4.4     
## [13] stringr_1.4.0    xfun_0.5         yaml_2.2.0       compiler_3.4.4  
## [17] htmltools_0.3.6  knitr_1.21