Hebrew-speaking children with autism were asked to read aloud sentences containing heterophonic homographs - ambiguous words whose meaning and pronunciation depends on the context in which they are read. We used linear mixed random effects analyses to model reading accuracy as a function of children’s age, autism severity, and language and reading abilities.
Brock, J., Sukenik, N., & Friedmann, N. (in press). Individual differences in autistic children’s homograph reading: Evidence from Hebrew. Autism and Developmental Language Impairments.
Load libraries
library(lme4)
library(influence.ME)
library(plyr)
library(ggplot2)
library(knitr)
library(grid)
library(ReporteRs)
library(dplyr)
library(corrplot)
sessionInfo()
## R version 3.3.2 (2016-10-31)
## Platform: x86_64-apple-darwin13.4.0 (64-bit)
## Running under: macOS Sierra 10.12.2
##
## locale:
## [1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8
##
## attached base packages:
## [1] grid stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] corrplot_0.77 dplyr_0.5.0 ReporteRs_0.8.8
## [4] ReporteRsjars_0.0.2 knitr_1.15.1 ggplot2_2.2.1
## [7] plyr_1.8.4 influence.ME_0.9-8 lme4_1.1-12
## [10] Matrix_1.2-7.1
##
## loaded via a namespace (and not attached):
## [1] Rcpp_0.12.8 nloptr_1.0.4 R.methodsS3_1.7.1
## [4] R.utils_2.5.0 tools_3.3.2 digest_0.6.11
## [7] evaluate_0.10 tibble_1.2 gtable_0.2.0
## [10] nlme_3.1-128 lattice_0.20-34 png_0.1-7
## [13] DBI_0.6-1 shiny_1.0.0 yaml_2.1.14
## [16] rJava_0.9-8 stringr_1.1.0 xml2_1.1.1
## [19] gdtools_0.1.3 rprojroot_1.1 R6_2.2.0
## [22] rvg_0.1.2 rmarkdown_1.3 minqa_1.2.4
## [25] magrittr_1.5 backports_1.0.4 scales_0.4.1
## [28] htmltools_0.3.5 MASS_7.3-45 splines_3.3.2
## [31] assertthat_0.1 xtable_1.8-2 mime_0.5
## [34] colorspace_1.3-2 httpuv_1.3.3 stringi_1.1.2
## [37] lazyeval_0.2.0 munsell_0.4.3 R.oo_1.21.0
Import data
HomographData <- read.csv("data/HebrewHomographsData.csv")
HomographData$SentenceID <- factor(paste0("Sentence", HomographData$SentenceID))
HomographData = HomographData %>% rename(WordReading = TILTANWords, WordAssoc = WordWordAssociation, PictureNaming = ShemeshPictureNaming)
str(HomographData)
## 'data.frame': 468 obs. of 19 variables:
## $ ChildID : Factor w/ 18 levels "Child1","Child10",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ SentenceID : Factor w/ 26 levels "Sentence1","Sentence10",..: 1 12 20 21 22 23 24 25 26 2 ...
## $ OrthographicForm : Factor w/ 13 levels "?ONA","HBIRA",..: 7 10 3 6 4 2 8 5 1 12 ...
## $ Meaning : Factor w/ 26 levels "barber","beard",..: 18 5 1 11 22 3 15 14 20 2 ...
## $ CorrectPronunciation: Factor w/ 26 levels "habiRA","haBIra",..: 8 20 14 6 5 2 12 21 11 23 ...
## $ Bias : num -0.0152 0.3182 -0.2273 -0.4394 0.3182 ...
## $ Dominance : Factor w/ 2 levels "Dominant","Subordinate": 2 1 2 2 1 1 2 1 2 2 ...
## $ InitialCorrect : int 1 0 1 1 1 1 1 1 1 1 ...
## $ FinalCorrect : int 1 1 1 1 1 1 1 1 1 1 ...
## $ ErrorTypeIni : Factor w/ 4 levels "Correct","HomographError",..: 1 2 1 1 1 1 1 1 1 1 ...
## $ ErrorTypeFinal : Factor w/ 4 levels "Correct","HomographError",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ HesitationBefore : int 0 1 0 1 0 1 0 0 1 0 ...
## $ HesitationAfter : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Age : num 11.3 11.3 11.3 11.3 11.3 11.3 11.3 11.3 11.3 11.3 ...
## $ Gender : Factor w/ 2 levels "f","m": 2 2 2 2 2 2 2 2 2 2 ...
## $ CARS : num 39 39 39 39 39 39 39 39 39 39 ...
## $ WordReading : num 0.971 0.971 0.971 0.971 0.971 ...
## $ PictureNaming : num 0.96 0.96 0.96 0.96 0.96 0.96 0.96 0.96 0.96 0.96 ...
## $ WordAssoc : num 1 1 1 1 1 1 1 1 1 1 ...
Our analyses focused on correct responses. Here we provide a breakdown of error types for each participant.
HomographData$HomographError <- ifelse(HomographData$ErrorTypeIni=="HomographError",1,0)
HomographData$SurfaceError <- ifelse(HomographData$ErrorTypeIni=="SurfaceError",1,0)
HomographData$OtherError <- ifelse(HomographData$ErrorTypeIni=="OtherError",1,0)
ErrorData <- ddply(HomographData, .(ChildID), summarise, CorrectResponse=sum(InitialCorrect), HomographError=sum(HomographError), SurfaceError=sum(SurfaceError), OtherError=sum(OtherError))
kable(ErrorData, title="Error analysis", digits=2)
ChildID | CorrectResponse | HomographError | SurfaceError | OtherError |
---|---|---|---|---|
Child1 | 25 | 1 | 0 | 0 |
Child10 | 16 | 4 | 5 | 1 |
Child11 | 22 | 2 | 1 | 1 |
Child12 | 25 | 1 | 0 | 0 |
Child13 | 25 | 0 | 0 | 1 |
Child14 | 24 | 1 | 0 | 1 |
Child15 | 24 | 2 | 0 | 0 |
Child16 | 19 | 5 | 0 | 2 |
Child17 | 18 | 5 | 1 | 2 |
Child18 | 24 | 2 | 0 | 0 |
Child2 | 25 | 1 | 0 | 0 |
Child3 | 15 | 8 | 1 | 2 |
Child4 | 25 | 0 | 1 | 0 |
Child5 | 22 | 2 | 1 | 1 |
Child6 | 20 | 5 | 0 | 1 |
Child7 | 23 | 2 | 1 | 0 |
Child8 | 22 | 4 | 0 | 0 |
Child9 | 25 | 1 | 0 | 0 |
We analysed homograph reading accuracy using logistic regression with mixed random effects. Rather than looking at the proportion of correct trials for each participant, this approach attempts to model the likelihood of a correct response on each trial by each participant.
We began with a series of models that each included the child’s age and one other predictor as fixed factors and the identity of the participant and the item as random effects.
Model1a <- glmer(InitialCorrect ~ CARS + Age + (1|SentenceID) + (1|ChildID), HomographData, family="binomial")
summary(Model1a)
## Generalized linear mixed model fit by maximum likelihood (Laplace
## Approximation) [glmerMod]
## Family: binomial ( logit )
## Formula: InitialCorrect ~ CARS + Age + (1 | SentenceID) + (1 | ChildID)
## Data: HomographData
##
## AIC BIC logLik deviance df.resid
## 360.7 381.4 -175.3 350.7 463
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.0517 0.1847 0.2649 0.3843 1.4404
##
## Random effects:
## Groups Name Variance Std.Dev.
## SentenceID (Intercept) 0.8403 0.9167
## ChildID (Intercept) 0.3319 0.5761
## Number of obs: 468, groups: SentenceID, 26; ChildID, 18
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 2.68779 1.95361 1.376 0.1689
## CARS -0.07078 0.03913 -1.809 0.0705 .
## Age 0.14113 0.07585 1.861 0.0628 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) CARS
## CARS -0.880
## Age -0.767 0.403
Model1b <- glmer(InitialCorrect ~ WordReading + Age + (1|SentenceID) + (1|ChildID), HomographData, family="binomial")
summary(Model1b)
## Generalized linear mixed model fit by maximum likelihood (Laplace
## Approximation) [glmerMod]
## Family: binomial ( logit )
## Formula: InitialCorrect ~ WordReading + Age + (1 | SentenceID) + (1 |
## ChildID)
## Data: HomographData
##
## AIC BIC logLik deviance df.resid
## 361.7 382.4 -175.8 351.7 463
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.2911 0.1852 0.2682 0.4045 1.3605
##
## Random effects:
## Groups Name Variance Std.Dev.
## SentenceID (Intercept) 0.8451 0.9193
## ChildID (Intercept) 0.4302 0.6559
## Number of obs: 468, groups: SentenceID, 26; ChildID, 18
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.36631 1.20270 -1.136 0.256
## WordReading 3.03424 2.18760 1.387 0.165
## Age 0.08749 0.10925 0.801 0.423
##
## Correlation of Fixed Effects:
## (Intr) WrdRdn
## WordReading -0.563
## Age -0.119 -0.735
Model1c <- glmer(InitialCorrect ~ WordAssoc + Age + (1|SentenceID) + (1|ChildID), HomographData, family="binomial")
summary(Model1c)
## Generalized linear mixed model fit by maximum likelihood (Laplace
## Approximation) [glmerMod]
## Family: binomial ( logit )
## Formula:
## InitialCorrect ~ WordAssoc + Age + (1 | SentenceID) + (1 | ChildID)
## Data: HomographData
##
## AIC BIC logLik deviance df.resid
## 360.7 381.5 -175.4 350.7 463
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.5982 0.1843 0.2647 0.3936 1.3703
##
## Random effects:
## Groups Name Variance Std.Dev.
## SentenceID (Intercept) 0.8482 0.9210
## ChildID (Intercept) 0.3667 0.6056
## Number of obs: 468, groups: SentenceID, 26; ChildID, 18
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.01340 1.76045 -1.712 0.0869 .
## WordAssoc 3.82250 2.20565 1.733 0.0831 .
## Age 0.14169 0.07801 1.816 0.0693 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) WrdAss
## WordAssoc -0.842
## Age -0.110 -0.420
Model1d <- glmer(InitialCorrect ~ PictureNaming + Age + (1|SentenceID) + (1|ChildID), HomographData, family="binomial")
summary(Model1d)
## Generalized linear mixed model fit by maximum likelihood (Laplace
## Approximation) [glmerMod]
## Family: binomial ( logit )
## Formula: InitialCorrect ~ PictureNaming + Age + (1 | SentenceID) + (1 |
## ChildID)
## Data: HomographData
##
## AIC BIC logLik deviance df.resid
## 356.8 377.5 -173.4 346.8 463
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.0060 0.1828 0.2578 0.3853 1.7775
##
## Random effects:
## Groups Name Variance Std.Dev.
## SentenceID (Intercept) 0.8548 0.9246
## ChildID (Intercept) 0.2215 0.4706
## Number of obs: 468, groups: SentenceID, 26; ChildID, 18
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -8.556363 3.053977 -2.802 0.00508 **
## PictureNaming 11.963905 4.372151 2.736 0.00621 **
## Age -0.006888 0.097760 -0.070 0.94383
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) PctrNm
## PictureNmng -0.961
## Age 0.566 -0.764
To evaluate the contribution of each predictor, X, we asked whether removing it significantly reduced model fit. That is, whether X accounted for significant variation beyond Age. This was only true of Picture Naming.
We then removed the random effect of participant and determined whether this significantly reduced model fit, indicating that there was reliable variation in performance that was unaccounted for by Age and X. All predictors except Picture Naming left reliable individual variation unaccounted for.
ANOVAs.DF <- data.frame(Predictor=c("CARS", "WordReading", "WordAssoc", "PictureNaming"),
Chisq_Fixed=numeric(4),
p_Fixed=numeric(4),
Chisq_Random=numeric(4),
p_Random=numeric(4),
stringsAsFactors=FALSE)
for (i in 1:4) {
Predictor <- ANOVAs.DF$Predictor[i]
HomographData$X <- HomographData[[Predictor]]
Model1 <- glmer(InitialCorrect ~ X + Age + (1|SentenceID) + (1|ChildID), HomographData, family="binomial")
Model2 <- glmer(InitialCorrect ~ Age + (1|SentenceID) + (1|ChildID), HomographData, family="binomial")
Model3 <- glmer(InitialCorrect ~ X + Age + (1|SentenceID), HomographData, family="binomial")
Fixed.ANOVA <- anova(Model1, Model2)
ANOVAs.DF$Chisq_Fixed[i] <- Fixed.ANOVA$Chisq[2] # second row of ANOVA output
ANOVAs.DF$p_Fixed[i] <- Fixed.ANOVA$'Pr(>Chisq)'[2]
Random.ANOVA <- anova(Model1, Model3)
ANOVAs.DF$Chisq_Random[i] <- Random.ANOVA$Chisq[2]
ANOVAs.DF$p_Random[i] <- Random.ANOVA$'Pr(>Chisq)'[2]
}
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control
## $checkConv, : Model failed to converge with max|grad| = 0.0384645 (tol =
## 0.001, component 1)
kable(ANOVAs.DF, digits=3)
Predictor | Chisq_Fixed | p_Fixed | Chisq_Random | p_Random |
---|---|---|---|---|
CARS | 2.966 | 0.085 | 4.330 | 0.037 |
WordReading | 1.953 | 0.162 | 7.471 | 0.006 |
WordAssoc | 2.876 | 0.090 | 5.298 | 0.021 |
PictureNaming | 6.860 | 0.009 | 2.170 | 0.141 |
We also found that Picture Naming accounted for variation beyond each of the other predictors. Conversely, none of the other predictors account for significant variation beyond Picture Naming.
ANOVAs2.DF <- data.frame(Predictor=c("Age", "CARS", "WordReading", "WordAssoc"),
Chisq_4v5=numeric(4),
p_4v5=numeric(4),
Chisq_4v6=numeric(4),
p_4v6=numeric(4),
stringsAsFactors=FALSE)
for (i in 1:4) {
Predictor <- ANOVAs2.DF$Predictor[i]
HomographData$X <- HomographData[[Predictor]]
Model4 <- glmer(InitialCorrect ~ X + PictureNaming + (1|SentenceID) + (1|ChildID), HomographData, family="binomial")
Model5 <- glmer(InitialCorrect ~ X + (1|SentenceID) + (1|ChildID), HomographData, family="binomial")
Model6 <- glmer(InitialCorrect ~ PictureNaming + (1|SentenceID) + (1|ChildID), HomographData, family="binomial")
ANOVA_4v5 <- anova(Model4, Model5)
ANOVAs2.DF$Chisq_4v5[i] <- ANOVA_4v5$Chisq[2] # second row of ANOVA output
ANOVAs2.DF$p_4v5[i] <- ANOVA_4v5$'Pr(>Chisq)'[2]
ANOVA_4v6 <- anova(Model4, Model6)
ANOVAs2.DF$Chisq_4v6[i] <- ANOVA_4v6$Chisq[2] # second row of ANOVA output
ANOVAs2.DF$p_4v6[i] <- ANOVA_4v6$'Pr(>Chisq)'[2]
}
kable(ANOVAs2.DF, digits=3)
Predictor | Chisq_4v5 | p_4v5 | Chisq_4v6 | p_4v6 |
---|---|---|---|---|
Age | 6.860 | 0.009 | 0.005 | 0.944 |
CARS | 9.120 | 0.003 | 2.119 | 0.145 |
WordReading | 6.977 | 0.008 | 1.466 | 0.226 |
WordAssoc | 10.105 | 0.001 | 3.155 | 0.076 |
Export ANOVA tables as Word document.
doc <- docx()
is.num <- sapply(ANOVAs.DF, is.numeric)
ANOVAs.DF[is.num] <- lapply(ANOVAs.DF[is.num], round, 3)
doc <- addTitle(doc, "Table 2", level=2)
doc <- addFlexTable( doc, vanilla.table(ANOVAs.DF))
is.num <- sapply(ANOVAs2.DF, is.numeric)
ANOVAs2.DF[is.num] <- lapply(ANOVAs2.DF[is.num], round, 3)
doc <- addTitle(doc, "Table 3", level=2)
doc <- addFlexTable( doc, vanilla.table(ANOVAs2.DF))
writeDoc(doc, file = "output/doc/ANOVA_Table.docx")
First, we define a function, drawFitPlot, which plots homograph reading accuracy against a single predictor. Each plot contains three elements: the proportion correct for each participant, the function derived from the logistic regression analysis, and the predicted values for each participant (with prediction intervals).
drawFitPlot <- function(Predictor, Data = HomographData, XLabel = "Default", XJitter = 0) {
# Add predictor to the dataframe as a new column, X (this will be on the X-axis in the figure)
Data$X <- Data[,Predictor]
# Using ddply, create a new dataframe, PlotData, with the predictor, X, and the observed proportion correct for each participant
PlotData <- ddply(Data, .(ChildID, X), summarize, Correct=mean(InitialCorrect))
# Fit the model using lme4 with X as the fixed factor. We centre Age so that it's effectively controlled for in the fitting of X.
Data$AgeC <- Data$Age - mean(Data$Age)
fit <- glmer(InitialCorrect ~ X + AgeC + (1|SentenceID) + (1|ChildID), Data, family="binomial")
# Create a second dataframe (df) with the intercept and prediction interval for the random effect of ChildID. This is based on the code for creating caterpillar plots in ggplot2:
# http://stackoverflow.com/questions/13847936/in-r-plotting-random-effects-from-lmer-lme4-package-using-qqmath-or-dotplot
randoms <- ranef(fit, condVar = TRUE)
qq <- attr(ranef(fit, condVar = TRUE)[[2]], "postVar") # ChildID is the second random effect in the model so use "2"
rand.interc<-randoms$ChildID
df<-data.frame(RandIntercepts=randoms$ChildID[,1],
sd.interc=2*sqrt(qq[,,1:length(qq)]),
lev.names=rownames(rand.interc))
# Add this to the PlotData dataframe.
PlotData <- cbind(PlotData, df)
# Extract the intercept (beta1) and fixed effect of X (beta2) from the model
beta1 <- summary(fit)$coefficients[[1]]
beta2 <- summary(fit)$coefficients[[2]]
# Next, calculate the predicted value for each child based on the model. This involves the usual process of multiplying the predictor (X) by the slope (beta2) and adding the intercept (beta1). However, we also add the random intercepts for each participant. Then, because this is logistic data, we transform to make it a probability.
PlotData$Pred <- 1 / (1+exp(-1*(beta1 + beta2*PlotData$X + PlotData$RandIntercepts)))
# We can do the same thing to create prediction intervals for each participant
PlotData$YMin <- 1 / (1+exp(-1*(beta1 + beta2*PlotData$X + PlotData$RandIntercepts - PlotData$sd.interc)))
PlotData$YMax <- 1 / (1+exp(-1*(beta1 + beta2*PlotData$X + PlotData$RandIntercepts + PlotData$sd.interc)))
# Add jitter to x-vale for plotting (default is 0 jitter). Note that this is done after calculating the predicted values, which are based on the true X-values.
PlotData$Jitter <- runif(nrow(PlotData), min=-1*XJitter, max=XJitter)
PlotData$X <- PlotData$X + PlotData$Jitter
# Before plotting the figure, determine the label for the x-axis. XLabel is optional when calling this function. If it's missing, use Predictor1.
if (XLabel == "Default") {XLabel <- Predictor1}
ggplot() +
# Plot model fit
stat_function(data=PlotData, fun=function(x) 1/(1+exp(-1*(beta1+x*beta2))), geom="line", color="orange") +
# Plot predicted values based on the model
geom_point(data=PlotData, aes(x=X, y=Pred), colour="orange", size=3) +
# Plot error bars
geom_errorbar(data=PlotData, aes(x=X, ymin=YMin, ymax=YMax), width=0,color="orange") +
# Overlay the observed data
geom_point(data=PlotData, aes(x=X, y=Correct), colour="black", size=3) +
# Add theme, axis scales, labels, remove legend etc
xlab(XLabel) + ylab("Proportion Correct") +
scale_y_continuous(breaks = seq(0, 1, 0.1)) +
coord_cartesian(ylim = c(0.4, 1)) +
theme_bw() +
theme(axis.title=element_text(size=rel(1.3)),
panel.grid.minor=element_blank(),
panel.grid.major=element_blank(),
legend.position = "none")
}
We apply the drawFitPlot function to each predictor, create a multi-panel figure, and export this as a PNG file.
CARS_Plot <- drawFitPlot ("CARS", XLabel = "CARS Total Score", XJitter=1)
WordReading_Plot <- drawFitPlot ("WordReading", XLabel = "Word Reading", XJitter=0.01) +
theme(axis.title.y=element_blank(), axis.text.y=element_blank())
WordAssoc_Plot <- drawFitPlot ("WordAssoc", XLabel = "Word Association", XJitter=0.01) +
theme(axis.title.y=element_blank(), axis.text.y=element_blank())
PictureNaming_Plot <- drawFitPlot ("PictureNaming", XLabel = "Picture Naming", XJitter=0.01) +
theme(axis.title.y=element_blank(), axis.text.y=element_blank())
plots <- list(CARS_Plot, WordReading_Plot, WordAssoc_Plot, PictureNaming_Plot)
grobs <- lapply(plots, ggplotGrob)
g <- do.call(gridExtra::cbind.gtable, grobs)
png(filename="output/figures/Figure1.png", width=1000, height=400)
grid.draw(g)
dev.off()
## quartz_off_screen
## 2
knitr::include_graphics("output/figures/Figure1.png")
Given the small sample size, it is important to ensure that the strong association between Homograph Reading and Picture Naming is not driven by a small number of data points with excessive leverage. We identified and excluded participants with a Cook’s distance 3 times greater than the mean for the sample.
PictureNaming.lmer <- glmer(InitialCorrect ~ PictureNaming + Age + (1|SentenceID) + (1|ChildID), HomographData, family="binomial")
CooksDistance <- as.data.frame(cooks.distance(influence(PictureNaming.lmer, group="ChildID")))
CooksDistance <- cbind(ChildID = rownames(CooksDistance), CooksDistance)
CooksDistance$Outlier <- ifelse(CooksDistance$V1 > 3 * mean(CooksDistance$V1),1,0)
HomographData$Outlier <- CooksDistance$Outlier[match(HomographData$ChildID, CooksDistance$ChildID)]
Outliers <- HomographData[(HomographData$Outlier==1),]
NotOutliers <- HomographData[(HomographData$Outlier==0),]
We then re-ran the analyses, excluding the “outliers”. Picture naming remained the strongest predictor of homograph reading.
for (i in 1:4) {
Predictor <- ANOVAs.DF$Predictor[i]
NotOutliers$X <- NotOutliers[[Predictor]]
Model1 <- glmer(InitialCorrect ~ X + Age + (1|SentenceID) + (1|ChildID), NotOutliers, family="binomial")
Model2 <- glmer(InitialCorrect ~ Age + (1|SentenceID) + (1|ChildID), NotOutliers, family="binomial")
Model3 <- glmer(InitialCorrect ~ X + Age + (1|SentenceID), NotOutliers, family="binomial")
Fixed.ANOVA <- anova(Model1, Model2)
ANOVAs.DF$Chisq_Fixed[i] <- Fixed.ANOVA$Chisq[2] # second row of ANOVA output
ANOVAs.DF$p_Fixed[i] <- Fixed.ANOVA$'Pr(>Chisq)'[2]
Random.ANOVA <- anova(Model1, Model3)
ANOVAs.DF$Chisq_Random[i] <- Random.ANOVA$Chisq[2]
ANOVAs.DF$p_Random[i] <- Random.ANOVA$'Pr(>Chisq)'[2]
}
kable(ANOVAs.DF, digits=3)
Predictor | Chisq_Fixed | p_Fixed | Chisq_Random | p_Random |
---|---|---|---|---|
CARS | 0.448 | 0.503 | 6.120 | 0.013 |
WordReading | 1.509 | 0.219 | 6.436 | 0.011 |
WordAssoc | 3.674 | 0.055 | 3.449 | 0.063 |
PictureNaming | 11.539 | 0.001 | 0.032 | 0.857 |
We also replotted the model fit for Picture naming. Excluded participants (excess leverage) are shown as unfilled black circles. Comparison with the original plot (first panel) shows that these “outliers” had, if anything, been reducing the effect of picture naming rather than driving it.
PictureNaming_Plot <- drawFitPlot ("PictureNaming", XLabel = "Picture Naming", XJitter=0.01) +
scale_x_continuous(breaks = seq(0.7, 1, 0.1), limits = c(0.7, 1.02))
OutlierPlot <- drawFitPlot ("PictureNaming", Data = NotOutliers, XLabel = "Picture Naming", XJitter=0.01) +
geom_point(data=ddply(Outliers, .(ChildID, PictureNaming), summarize, Correct=mean(InitialCorrect)),
aes(x=PictureNaming, y=Correct), colour="black", shape=1, size=3) +
scale_x_continuous(breaks = seq(0.7, 1, 0.1), limits = c(0.7, 1.02)) +
theme(axis.title.y=element_blank(), axis.text.y=element_blank())
plots <- list(PictureNaming_Plot, OutlierPlot)
grobs <- lapply(plots, ggplotGrob)
g <- do.call(gridExtra::cbind.gtable, grobs)
png(filename="output/figures/OutlierPlot.png", width=450, height=400)
grid.draw(g)
dev.off()
## quartz_off_screen
## 2
knitr::include_graphics("output/figures/OutlierPlot.png")
Participants were recruited from two school grades, resulting in a bimodal distribution of age. We therefore repeated the analyses, treating age as a binary (older vs younger) variable. Results were comparable to the analyses reported in the paper. Picture Naming was again the strongest predictor of performance and left no variation unaccounted for. The other three predictors were either significant or close to significant, but all left considerable variation unaccounted for.
HomographData$AgeGroup <- ifelse(HomographData$Age > 13, 0.5, -0.5)
for (i in 1:4) {
Predictor <- ANOVAs.DF$Predictor[i]
HomographData$X <- HomographData[[Predictor]]
Model1 <- glmer(InitialCorrect ~ X + AgeGroup + (1|SentenceID) + (1|ChildID), HomographData, family="binomial")
Model2 <- glmer(InitialCorrect ~ AgeGroup + (1|SentenceID) + (1|ChildID), HomographData, family="binomial")
Model3 <- glmer(InitialCorrect ~ X + AgeGroup + (1|SentenceID), HomographData, family="binomial")
Fixed.ANOVA <- anova(Model1, Model2)
ANOVAs.DF$Chisq_Fixed[i] <- Fixed.ANOVA$Chisq[2] # second row of ANOVA output
ANOVAs.DF$p_Fixed[i] <- Fixed.ANOVA$'Pr(>Chisq)'[2]
Random.ANOVA <- anova(Model1, Model3)
ANOVAs.DF$Chisq_Random[i] <- Random.ANOVA$Chisq[2]
ANOVAs.DF$p_Random[i] <- Random.ANOVA$'Pr(>Chisq)'[2]
}
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control
## $checkConv, : Model failed to converge with max|grad| = 0.0384647 (tol =
## 0.001, component 1)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control
## $checkConv, : Model failed to converge with max|grad| = 0.0386013 (tol =
## 0.001, component 1)
kable(ANOVAs.DF, digits=3)
Predictor | Chisq_Fixed | p_Fixed | Chisq_Random | p_Random |
---|---|---|---|---|
CARS | 3.789 | 0.052 | 6.840 | 0.009 |
WordReading | 4.271 | 0.039 | 9.761 | 0.002 |
WordAssoc | 3.942 | 0.047 | 7.796 | 0.005 |
PictureNaming | 10.180 | 0.001 | 2.269 | 0.132 |
Pairwise correlations show again that homograph reading was most strongly associated with Picture Naming.
SubjectData <- ddply(HomographData, .(ChildID, Age, CARS, WordReading, WordAssoc, PictureNaming), summarise, CorrectResponse=sum(InitialCorrect))
CorrDF <- select(SubjectData, -ChildID)
CorrDF = CorrDF %>% rename(Hom = CorrectResponse, WR = WordReading, WA = WordAssoc, PN = PictureNaming)
corrplot.mixed(cor(CorrDF))
We also performed a conventional linear regression on the logit-transformed proportion of correct responses on the homograph reading task. We considered whether the predictor, X, accounted for significant variation beyond that accounted for by age. Consistent with the logistic regression analyses, this was only true of Picture Naming.
SubjectData$ProportionCorrect <- SubjectData$CorrectResponse / 26
SubjectData$LogitCorrect <- log(SubjectData$ProportionCorrect / (1-SubjectData$ProportionCorrect))
Regression.DF <- data.frame(Predictor=c("CARS", "WordReading", "WordAssoc", "PictureNaming"),
F=numeric(4),
p=numeric(4),
stringsAsFactors=FALSE)
for (i in 1:4) {
Predictor <- Regression.DF$Predictor[i]
SubjectData$X <- SubjectData[[Predictor]]
ModelA.lm <- lm(LogitCorrect ~ X + Age, data=SubjectData)
ModelB.lm <- lm(LogitCorrect ~ Age, data=SubjectData)
Fixed.ANOVA <- anova(ModelA.lm, ModelB.lm)
Regression.DF$F[i] <- Fixed.ANOVA$F[2] # second row of ANOVA output
Regression.DF$p[i] <- Fixed.ANOVA$'Pr(>F)'[2]
}
kable(Regression.DF, digits=3)
Predictor | F | p |
---|---|---|
CARS | 2.160 | 0.162 |
WordReading | 2.752 | 0.118 |
WordAssoc | 3.156 | 0.096 |
PictureNaming | 8.581 | 0.010 |