We are making a random-intercept model with the same data set from the previous homework.
To help improve the quality of the model, we are first going to test our parameters for collinearity. It’s possible that having several parameters that were correlated with each other (e.g., multiple measures of math fluency) split up the explained variance too much, resulting in false negatives in the t-tests in the last homework.
library(car)
## Loading required package: carData
library(knitr)
load("D:/Behavioral_Data/N400.Rdata")
N400$RT<-N400$RT/1000 #convert RTs from ms to seconds
linfit<-lm(RT~Syllables+Age+LQ+PicVoc_SS+OralComp_SS+NumRev_SS+IncWord_SS+Add_SS+Sub_SS+Mult_SS+female+digit_first+rightfinger+N400ListA+correct_trial+accurate_resp,data=N400)
inflate1<-vif(linfit)
kable(inflate1)
| Parameter | VIF |
|---|---|
| Syllables | 1.000158 |
| Age | 4.147259 |
| LQ | 11.388999 |
| PicVoc_SS | 19.196909 |
| OralComp_SS | 4.308954 |
| NumRev_SS | 10.644917 |
| IncWord_SS | 2.440158 |
| Add_SS | 17.159958 |
| Sub_SS | 10.077422 |
| Mult_SS | 44.937911 |
| female | 17.741999 |
| digit_first | 4.879995 |
| rightfinger | 6.999612 |
| N400ListA | 5.117086 |
| correct_trial | 1.004277 |
| accurate_resp | 1.008896 |
I’m looking for any parameter with a variance inflation factor (VIF) greater than 10. According to a common rule of thumb, an inflation factor greater than 10 indicates that the parameter is too collinear with the other parameters.
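As a reminder of what the inflation factor measures, here is a minimal sketch in base R. It uses R's built-in `mtcars` data and an arbitrary set of four predictors purely for illustration (neither is part of this homework). For predictor j, the inflation factor is 1 / (1 - R²_j), where R²_j comes from regressing that predictor on all of the others.

```r
# Illustration on R's built-in mtcars data (NOT the N400 data).
predictors <- c("disp", "hp", "wt", "drat")   # arbitrary example predictors
X <- mtcars[, predictors]

# VIF_j = 1 / (1 - R^2_j), with R^2_j from regressing predictor j on the rest.
vif_manual <- sapply(predictors, function(p) {
  others <- setdiff(predictors, p)
  r2 <- summary(lm(reformulate(others, response = p), data = X))$r.squared
  1 / (1 - r2)
})

# Equivalent shortcut for numeric predictors:
# the diagonal of the inverse of the predictor correlation matrix.
vif_inverse <- diag(solve(cor(X)))

print(round(vif_manual, 2))
```

For a model with only numeric, single-column predictors such as this one, both calculations should agree with what `vif()` from the car package reports.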
The following factors remain in the second linear model used to test inflation. Sub_SS remains even though it exceeded the inflation factor threshold of 10, because the other two math-proficiency parameters (Add_SS and Mult_SS) had much larger inflation factors than Sub_SS, and keeping Sub_SS on its own might account for variance independently.
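The selection rule described above can be written out as a short sketch. The values are copied from the first inflation table; the vector name `vif1` and the variable `threshold` are only illustrative names, not objects from the original analysis.

```r
# Inflation factors copied from the first table (full model).
vif1 <- c(Syllables = 1.000158, Age = 4.147259, LQ = 11.388999,
          PicVoc_SS = 19.196909, OralComp_SS = 4.308954, NumRev_SS = 10.644917,
          IncWord_SS = 2.440158, Add_SS = 17.159958, Sub_SS = 10.077422,
          Mult_SS = 44.937911, female = 17.741999, digit_first = 4.879995,
          rightfinger = 6.999612, N400ListA = 5.117086,
          correct_trial = 1.004277, accurate_resp = 1.008896)

threshold <- 10   # rule-of-thumb cutoff

# Parameters that exceed the cutoff.
flagged <- names(vif1)[vif1 > threshold]

# Keep everything at or below the cutoff, plus Sub_SS, which is retained
# by choice as the single math-proficiency measure.
keep <- names(vif1)[vif1 <= threshold | names(vif1) == "Sub_SS"]

print(flagged)
print(keep)
```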
linfit2<-lm(RT~Syllables+Age+OralComp_SS+IncWord_SS+Sub_SS+digit_first+rightfinger+N400ListA+correct_trial+accurate_resp,data=N400)
inflate2<-vif(linfit2)
kable(inflate2)
| Parameter | VIF |
|---|---|
| Syllables | 1.000158 |
| Age | 1.332450 |
| OralComp_SS | 1.299128 |
| IncWord_SS | 1.789697 |
| Sub_SS | 1.858291 |
| digit_first | 1.331525 |
| rightfinger | 1.650004 |
| N400ListA | 1.806680 |
| correct_trial | 1.004273 |
| accurate_resp | 1.007859 |
Indeed, when the model is run again, the inflation factors drop dramatically (including that of Sub_SS). The random-intercept and random-slope models will use only these parameters.
Below are the steps to estimate the random-intercept model. With this model, we can ask the question:
“Do people answer more slowly (higher RT) when the trial condition is incorrect, regardless of other trial or demographic factors?”
library(lme4)
## Loading required package: Matrix
fit1<-lmer(RT~Syllables+Age+OralComp_SS+IncWord_SS+Sub_SS+digit_first+rightfinger+N400ListA+correct_trial+accurate_resp+(1|Subject),data=N400, subset=complete.cases(N400))
summary(fit1)
## Linear mixed model fit by REML ['lmerMod']
## Formula:
## RT ~ Syllables + Age + OralComp_SS + IncWord_SS + Sub_SS + digit_first +
## rightfinger + N400ListA + correct_trial + accurate_resp +
## (1 | Subject)
## Data: N400
## Subset: complete.cases(N400)
##
## REML criterion at convergence: -81
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.2301 -0.4437 -0.1233 0.2569 11.0519
##
## Random effects:
## Groups Name Variance Std.Dev.
## Subject (Intercept) 0.008345 0.09135
## Residual 0.050609 0.22496
## Number of obs: 1280, groups: Subject, 16
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 8.708e-01 4.593e-01 1.896
## Syllables -1.237e-02 9.691e-03 -1.276
## Age 4.958e-03 1.523e-02 0.325
## OralComp_SS -2.420e-03 3.037e-03 -0.797
## IncWord_SS 6.249e-05 2.769e-03 0.023
## Sub_SS 1.491e-03 2.568e-03 0.581
## digit_first -2.344e-01 5.897e-02 -3.975
## rightfinger -1.010e-01 6.281e-02 -1.609
## N400ListA 3.725e-02 6.413e-02 0.581
## correct_trial -1.719e-02 1.260e-02 -1.364
## accurate_resp -4.207e-02 3.507e-02 -1.199
##
## Correlation of Fixed Effects:
## (Intr) Syllbl Age OrC_SS InW_SS Sub_SS dgt_fr rghtfn N400LA
## Syllables -0.029
## Age -0.603 0.000
## OralComp_SS -0.557 0.000 0.224
## IncWord_SS -0.216 0.000 -0.068 -0.386
## Sub_SS -0.063 0.000 -0.397 -0.060 -0.222
## digit_first 0.027 0.000 0.163 -0.137 0.114 -0.378
## rightfinger -0.234 0.000 0.257 -0.182 0.473 -0.384 0.333
## N400ListA -0.076 0.000 -0.137 0.191 -0.439 0.541 -0.362 -0.269
## correct_trl -0.018 -0.003 0.000 0.000 -0.001 0.000 0.000 -0.001 0.001
## accurat_rsp -0.072 -0.012 -0.001 0.006 -0.013 0.005 -0.001 -0.010 0.011
## crrct_
## Syllables
## Age
## OralComp_SS
## IncWord_SS
## Sub_SS
## digit_first
## rightfinger
## N400ListA
## correct_trl
## accurat_rsp 0.065
In the model describing participants’ reaction times when judging a multiplication problem’s correctness, the Subject variance was 0.008 s², which is very small. The intraclass correlation coefficient (ICC) for Subject is 0.141, which means that most of the variance (85.9%) is due to factors other than differences between subjects.
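The ICC is the share of the total random variance that sits at the Subject level. A minimal sketch of that arithmetic, using the two variance components as printed (and rounded) in the summary above, so the last digit can differ slightly from a value computed on the unrounded fit:

```r
# Variance components copied from the random-intercept model summary.
var_subject  <- 0.008345   # Subject (Intercept) variance
var_residual <- 0.050609   # Residual variance

# ICC = between-subject variance / (between-subject + residual variance)
icc <- var_subject / (var_subject + var_residual)

print(round(icc, 2))       # proportion of variance between subjects
print(round(1 - icc, 2))   # proportion left within subjects
```

With the fitted object at hand, the same components could be pulled from `as.data.frame(VarCorr(fit1))` instead of being typed in by hand.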
The fixed effects have the following estimates and t-values:
| Parameter | Estimate | t-value |
|---|---|---|
| Intercept | 8.708e-01 | 1.896 |
| Syllables | -1.237e-02 | -1.276 |
| Age | 4.958e-03 | 0.325 |
| OralComp_SS | -2.420e-03 | -0.797 |
| IncWord_SS | 6.249e-05 | 0.023 |
| Sub_SS | 1.491e-03 | 0.581 |
| digit_first | -2.344e-01 | -3.975 |
| rightfinger | -1.010e-01 | -1.609 |
| N400ListA | 3.725e-02 | 0.581 |
| correct_trial | -1.719e-02 | -1.364 |
| accurate_resp | -4.207e-02 | -1.199 |
For a two-tailed Student’s t-test with 15 degrees of freedom, the critical t-value is ±2.13 for an alpha of 0.05. The only parameter that exceeded this threshold was digit_first, with a t-value of -3.975, suggesting that participants who do the math verification task first are significantly faster on this picture verification task, presumably because of practice.
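The threshold can be checked with `qt()`. This is a small sketch that takes the choice of 15 degrees of freedom (16 subjects minus 1) as given; `lmer` does not report degrees of freedom itself, so that choice is an assumption of this write-up. The t-values are copied from the table above.

```r
n_subjects <- 16
df    <- n_subjects - 1   # assumed degrees of freedom
alpha <- 0.05

# Two-tailed critical value.
t_crit <- qt(1 - alpha / 2, df)

# t-values copied from the random-intercept model.
t_values <- c(Syllables = -1.276, Age = 0.325, OralComp_SS = -0.797,
              IncWord_SS = 0.023, Sub_SS = 0.581, digit_first = -3.975,
              rightfinger = -1.609, N400ListA = 0.581,
              correct_trial = -1.364, accurate_resp = -1.199)

# Parameters whose |t| exceeds the critical value.
significant <- names(t_values)[abs(t_values) > t_crit]

print(round(t_crit, 2))
print(significant)
```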
Next, I will model the data using a mixed model with random intercepts and a random slope for the factor of digit_first, addressing the question:
Does the order of experiments affect reaction times equally between participants?
fit2<-lmer(RT~Syllables+Age+OralComp_SS+IncWord_SS+Sub_SS+digit_first+rightfinger+N400ListA+correct_trial+accurate_resp+(1+digit_first|Subject), data = N400)
summary(fit2)
## Linear mixed model fit by REML ['lmerMod']
## Formula:
## RT ~ Syllables + Age + OralComp_SS + IncWord_SS + Sub_SS + digit_first +
## rightfinger + N400ListA + correct_trial + accurate_resp +
## (1 + digit_first | Subject)
## Data: N400
##
## REML criterion at convergence: -84.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.2400 -0.4314 -0.1296 0.2525 11.1051
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## Subject (Intercept) 0.02063 0.1436
## digit_first 0.01898 0.1378 -0.95
## Residual 0.05061 0.2250
## Number of obs: 1280, groups: Subject, 16
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 1.0087971 0.3829967 2.634
## Syllables -0.0123659 0.0096907 -1.276
## Age 0.0098558 0.0112179 0.879
## OralComp_SS -0.0026388 0.0019437 -1.358
## IncWord_SS 0.0008689 0.0015478 0.561
## Sub_SS -0.0018436 0.0016303 -1.131
## digit_first -0.2010228 0.0693259 -2.900
## rightfinger -0.0603858 0.0391323 -1.543
## N400ListA -0.0114651 0.0411727 -0.278
## correct_trial -0.0171973 0.0126028 -1.365
## accurate_resp -0.0422475 0.0350737 -1.205
##
## Correlation of Fixed Effects:
## (Intr) Syllbl Age OrC_SS InW_SS Sub_SS dgt_fr rghtfn N400LA
## Syllables -0.035
## Age -0.770 0.000
## OralComp_SS -0.689 0.000 0.503
## IncWord_SS -0.054 0.000 -0.111 -0.396
## Sub_SS -0.204 0.000 -0.187 0.057 -0.270
## digit_first -0.079 0.000 0.024 -0.085 0.066 -0.204
## rightfinger -0.283 0.000 0.293 -0.080 0.471 -0.312 0.137
## N400ListA -0.309 0.000 0.075 0.251 -0.330 0.571 -0.188 -0.034
## correct_trl -0.022 -0.003 0.000 0.001 -0.001 0.001 0.000 -0.001 0.002
## accurat_rsp -0.089 -0.012 -0.002 0.009 -0.021 0.012 -0.002 -0.011 0.027
## crrct_
## Syllables
## Age
## OralComp_SS
## IncWord_SS
## Sub_SS
## digit_first
## rightfinger
## N400ListA
## correct_trl
## accurat_rsp 0.065
With this model, the first result that stood out is the very strong negative correlation (-0.95) between the by-subject random intercepts and the random slopes for digit_first.
Below, we can compare the Subject-level variance between the models with and without the random slope.
| Random effect | Statistic | RI Only | RI + RS |
|---|---|---|---|
| Subject (intercept) | Variance | 0.008345 | 0.02063 |
| Subject (intercept) | ICC | 0.141 | 0.289 |
| digit_first (slope) | Variance | - | 0.01898 |
| digit_first (slope) | ICC | - | - |
Subject accounted for more variance in the random intercept (RI) + random slope (RS) model than in the RI-only model, and the ICC was correspondingly higher.
The following are the statistics for the fixed effects from both model types:
| Parameter | Estimate (RI Only) | t-value (RI Only) | Estimate (RI + RS) | t-value (RI + RS) |
|---|---|---|---|---|
| (Intercept) | 8.708e-01 | 1.896 | 1.009 | 2.634 |
| Syllables | -1.237e-02 | -1.276 | -0.012 | -1.276 |
| Age | 4.958e-03 | 0.325 | 0.010 | 0.879 |
| OralComp_SS | -2.420e-03 | -0.797 | -0.003 | -1.358 |
| IncWord_SS | 6.249e-05 | 0.023 | 0.0009 | 0.561 |
| Sub_SS | 1.491e-03 | 0.581 | -0.002 | -1.131 |
| digit_first | -2.344e-01 | -3.975 | -0.201 | -2.900 |
| rightfinger | -1.010e-01 | -1.609 | -0.060 | -1.543 |
| N400ListA | 3.725e-02 | 0.581 | -0.011 | -0.278 |
| correct_trial | -1.719e-02 | -1.364 | -0.017 | -1.365 |
| accurate_resp | -4.207e-02 | -1.199 | -0.042 | -1.205 |
In this new model, the fixed effects do not appear to explain more of the variance in participants’ reaction times, even though each subject’s experiment order was modelled with a random slope. The factor digit_first lost some explanatory power, as seen by its smaller t-value in the RI + RS model (-2.900 vs. -3.975 in the RI model). The ICC also increased in the RI + RS model (0.141 -> 0.289). I would conclude that the first RI model was a better fit to the data than the RI + RS model.
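One rough way to check this conclusion is a likelihood-ratio comparison built from the two REML criteria printed in the summaries above. This is only a sketch under stated assumptions. Comparing REML criteria is reasonable here because both models have identical fixed effects and the same 1280 observations. The chi-square reference distribution with 2 degrees of freedom is an assumption, and it tends to be conservative because a variance is being tested at the edge of its allowed range.

```r
# REML criteria copied (as printed, i.e. rounded) from the two summaries.
reml_ri   <- -81.0   # random-intercept model
reml_rirs <- -84.6   # random-intercept + random-slope model

# The criterion is a deviance, so the drop is the likelihood-ratio statistic.
lr_stat <- reml_ri - reml_rirs

# The RI + RS model adds two parameters: the slope variance and the
# intercept-slope correlation.
extra_params <- 2

p_value <- pchisq(lr_stat, df = extra_params, lower.tail = FALSE)

print(lr_stat)
print(round(p_value, 3))
```

A p-value above 0.05 here would mean the random slope does not clearly improve the fit. With the fitted objects available, `anova(fit1, fit2, refit = FALSE)` should give the same comparison without copying numbers by hand.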
I leave you with a plot of my random-intercept and random-slope model for adult response times to a picture-verification task.
rancoefs<-ranef(fit2) #per-subject deviations from the fixed intercept and digit_first slope
plot(NULL, ylim=c(0,2),xlim=c(0,1),ylab="Reaction Time (s)", xlab="Task Order") #RTs were converted to seconds above
title(main="Regression Lines for each participant from Random Slope and Intercept Model")
cols=sample(rainbow(n=16),size=dim(rancoefs$Subject)[1],replace = T)
for (i in 1:dim(rancoefs$Subject)[1]){
#subject-specific line: fixed effect plus that subject's random deviation
abline(a=fixef(fit2)[1]+rancoefs$Subject[[1]][i],b=fixef(fit2)[7]+rancoefs$Subject[[2]][i],col=cols[i],lwd=.5)
}
#average line: fixed intercept and fixed digit_first slope (the 7th fixed effect)
abline(a=fixef(fit2)[1],b=fixef(fit2)[7],col=1,lwd=3)
legend("topright",col=1,lwd=3,legend="Average Effect of Task Order")