These data are from a masked-priming lexical decision task in French. Thirty-five native French speakers completed the task in Grenoble, France, and 28 non-native French speakers completed it in Lawrence, Kansas. Among the non-native speakers, only native English speakers who grew up in an English-only environment until age 10 were retained in the analyses.
Participants sat at a computer and were presented with target words. They decided as quickly as they could if the target was a real word in French or not.
The trial structure was as follows:
The experiment was run in E-Prime 2.0. The native French data were compiled into one large Excel sheet with no changes to formatting from the E-data format. The non-native French data are in individual files (named p1XX.csv) that are unchanged in format from the E-data output files. The L2 files are therefore compiled below into one data frame (“L2”), whereas the native French data do not need to be compiled.
Set working directory
setwd("C:/Users/Katie/Desktop/Research/Dissertation/STUDIES/Studies 1 and 2 Behavioral/RESULTS/All tasks raw excel files/Comparing L1 and L2")
file_list <- list.files(pattern="p1")
for (file in file_list){
# if the merged dataset doesn't exist, create it from the first file
if (!exists("dataset")){
dataset <- read.csv(file, header=TRUE)
} else {
# otherwise append the current file; using else keeps the first file from being read and appended twice
temp_dataset <-read.csv(file, header=TRUE)
dataset<-rbind(dataset, temp_dataset)
rm(temp_dataset)
}
}
L2<-dataset
dataset<-NULL
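An alternative to growing a data frame inside a loop is to read each file once with lapply() and stack the results with do.call(rbind, ...). The sketch below is illustrative only: the temporary folder, the three toy files, and the column names are assumptions made up for the demonstration, not the actual E-data files.

```r
# Illustrative only: write three small CSVs to a temporary folder, then merge them.
tmp_dir <- file.path(tempdir(), "l2_demo")
dir.create(tmp_dir, showWarnings = FALSE)
for (id in c(101, 102, 103)) {
  demo <- data.frame(Subject = id, Trial = 1:4, RT = c(650, 700, 720, 810))
  write.csv(demo, file.path(tmp_dir, paste0("p", id, ".csv")), row.names = FALSE)
}

# Read every matching file exactly once and stack the results.
demo_files <- list.files(tmp_dir, pattern = "^p1.*\\.csv$", full.names = TRUE)
demo_list <- lapply(demo_files, read.csv, header = TRUE)
merged <- do.call(rbind, demo_list)

nrow(merged)            # one row per trial per file
table(merged$Subject)   # each subject should appear the same number of times
```

Checking table(merged$Subject) after merging is a quick way to confirm that no file was read twice.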
The E-Prime output files include rows that correspond to practice items and breaks in the experiment. The code below removes these rows. R also reads in some factors as characters or numbers, which would be problematic for analysis, so the code below also converts these variables to factors.
The E-data files also contain many unnecessary columns. These are removed to leave a clean dataframe to work with.
Some column names are not very transparent, so dplyr’s ‘rename’ function is used to give them more transparent names.
A variable “Group” is added to mark all rows in this file as coming from the L2 group.
Finally, some L2 subject numbers are the same as L1 subject numbers (e.g., 101). Subjects are therefore renamed with their language group as a prefix, concatenated with the subject number they were assigned when they participated.
L2<-L2[L2$Running !="PracList",]
L2$Running<-factor(L2$Running)
L2<-L2[L2$Procedure != "BreakProc",]
L2$Procedure<-factor(L2$Procedure)
L2$Condition<-factor(L2$Condition)
L2$Related<-factor(L2$Related)
L2$Slide1.ACC<-factor(L2$Slide1.ACC)
L2$PrimeCondition<-factor(L2$PrimeCondition)
L2<-L2[,c(1,2,16,18,20,30,31,33,35,42,44)]
library(dplyr)
L2<-rename(L2,List=ExperimentName)
L2<-rename(L2,TrialOrder=Block)
L2<-rename(L2,Accuracy=Slide1.ACC)
L2<-rename(L2,RT=Slide1.RT)
library(stats)
library(lme4)
library(lmerTest)
L2$Accuracy<-as.numeric(as.character(L2$Accuracy))
L2$RT<-as.numeric(as.character(L2$RT))
L2$Group<-"L2"
L2$Subject<-paste(L2$Group,L2$Subject)
L2<-L2[L2$Subject!="L2 112" &L2$Subject!="L2 106" &L2$Subject!="L2 101" &L2$Subject!="L2 124",]
L1<-read.csv("L1 french raw data- all merged.csv",header=TRUE)
The L1 file is prepped in a similar way to the L2 file above. In the L1 version, E-Prime did not code whether the prime was related to the target, so a block of code builds a two-column table for the ‘lookup’ function (qdap package), which adds this information to the dataframe.
The columns are not in exactly the same order as in the L2 file, so code is added to put them in the same order, which is necessary for merging the 2 files.
Finally, a variable “Group” is added to mark all rows in this file as coming from the L1 group.
L1<-L1[L1$Running !="PracList",]
L1<-L1[L1$Procedure != "BreakProc",]
L1<-L1[,c(1,2,17,19,21,31,32,35,42,44)]
library(dplyr)
L1<-rename(L1,List=ExperimentName)
L1<-rename(L1,TrialOrder=Block)
L1<-rename(L1,Accuracy=Slide1.ACC)
L1<-rename(L1,RT=Slide1.RT)
PrimeCondition<-c("Morph","Orth","Sem","ID","Unr","unr","orth","sem","morph","id")
Related<-c("Related","Related","Related","Related","Unrelated","Unrelated","Related","Related","Related","Related")
RelatedTable<-cbind(PrimeCondition,Related)
library(qdap)
L1$Related<-lookup(L1$PrimeCondition,RelatedTable)
L1<-L1[,c("List","Subject","TrialOrder","Condition","Item","Prime","PrimeCondition","Related","Accuracy","RT","Target")]
L1$Group<-"L1"
L1$Subject<-paste(L1$Group,L1$Subject)
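Because every prime condition except the unrelated one maps to “Related”, the same recoding can be sketched in base R without a lookup table. This is only an illustration of the idea, not the method used above; the vector prime_condition is a made-up example, and the approach assumes that "unr" (in any capitalisation) is the only unrelated code.

```r
# Hypothetical prime-condition codes with the mixed capitalisation seen in the L1 file.
prime_condition <- c("Morph", "unr", "sem", "Unr", "ID", "orth")

# Everything except the unrelated code counts as a related prime.
related <- ifelse(tolower(prime_condition) == "unr", "Unrelated", "Related")
related <- factor(related, levels = c("Unrelated", "Related"))  # Unrelated as baseline

data.frame(prime_condition, related)
```

One trade-off: an unexpected code would be silently labelled “Related” here, whereas an explicit lookup table makes unknown codes easier to spot.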
Merge L1 and L2 files together to have a large dataframe called “data”
data<-rbind(L1,L2)
data$Group<-as.factor(data$Group)
The code below summarizes the accuracy in lexical decision for each target condition, for each group.
L1_acc<-aggregate(Accuracy~Condition,data=L1,FUN=mean)
L2_acc<-aggregate(Accuracy~Condition,data=L2,FUN=mean)
Acc_all<-cbind(L1_acc[,c(1,2)],L2_acc[,2])
names(Acc_all)<-c("Condition","L1","L2")
Acc_all
## Condition L1 L2
## 1 ID 0.9119048 0.8300000
## 2 Morph 0.9325397 0.8466667
## 3 Nonce 0.9093254 0.7275000
## 4 Orth 0.8698413 0.8000000
## 5 Sem 0.9325397 0.8188889
Comparing overall accuracy, it is clear that native speakers were more accurate, and the condition in which they were least accurate was the Orth condition. The non-native group was least accurate in the Nonce condition.
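A condition-by-group accuracy table like the one above can also be produced in a single call on a merged dataframe. The sketch below uses simulated data (the toy dataframe and its accuracy rate are assumptions for illustration), so the numbers it prints are not the study's results.

```r
set.seed(1)
toy <- data.frame(
  Group = rep(c("L1", "L2"), each = 200),
  Condition = rep(c("ID", "Morph", "Nonce", "Orth"), times = 100),
  Accuracy = rbinom(400, 1, 0.85)   # simulated 0/1 responses
)

# Mean accuracy per Condition (rows) and Group (columns).
acc_table <- with(toy, tapply(Accuracy, list(Condition, Group), mean))
round(acc_table, 3)
```

Working from one merged dataframe avoids relying on the two aggregate() results having their conditions in the same row order before cbind().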
Accuracy of the lexical decision is analyzed below with logistic regression. The fixed effects are Condition (ID, Morph, Orth, Sem, Nonce), Related (related, unrelated), Group (L1, L2), and TrialOrder (i.e., presentation order). Note that the models below are fit with glm(), so they do not include random effects for Subject or Item.
ID is the baseline for Condition, Unrelated is the baseline for Related, and L1 is the baseline for Group.
data$Related<-as.factor(data$Related)
data$Related<-relevel(data$Related,ref="Unrelated")
acc1<-glm(Accuracy~ Condition*Related*Group+TrialOrder,data=data,family=binomial)
acc2<-glm(Accuracy~ Condition*Related*Group-Condition:Related:Group+TrialOrder,data=data,family=binomial)
acc3<-glm(Accuracy~ Condition*Related*Group-Condition:Related:Group-Condition:Related+TrialOrder,data=data,family=binomial)
summary(acc3)
##
## Call:
## glm(formula = Accuracy ~ Condition * Related * Group - Condition:Related:Group -
## Condition:Related + TrialOrder, family = binomial, data = data)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.3638 0.3820 0.4401 0.6130 0.8286
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 2.2294478 0.1126083 19.798 < 2e-16 ***
## ConditionMorph 0.2858601 0.1500058 1.906 0.056694 .
## ConditionNonce -0.0296351 0.1115001 -0.266 0.790404
## ConditionOrth -0.4428572 0.1299976 -3.407 0.000658 ***
## ConditionSem 0.2858075 0.1500063 1.905 0.056741 .
## RelatedRelated 0.0325524 0.0721475 0.451 0.651851
## GroupL2 -0.7413964 0.1410319 -5.257 1.46e-07 ***
## TrialOrder 0.0006139 0.0002627 2.337 0.019437 *
## ConditionMorph:GroupL2 -0.1634934 0.1973333 -0.829 0.407380
## ConditionNonce:GroupL2 -0.5726854 0.1476757 -3.878 0.000105 ***
## ConditionOrth:GroupL2 0.2397435 0.1780996 1.346 0.178264
## ConditionSem:GroupL2 -0.3670763 0.1946087 -1.886 0.059264 .
## RelatedRelated:GroupL2 -0.0253732 0.0931127 -0.273 0.785237
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 14346 on 17279 degrees of freedom
## Residual deviance: 13592 on 17267 degrees of freedom
## AIC: 13618
##
## Number of Fisher Scoring iterations: 5
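The results of the logistic regression (acc3) show that native speakers were less accurate in the Orth condition than in the ID baseline, that the L2 group was less accurate than the L1 group in the ID baseline condition, and that this group difference was larger for Nonce items (Condition x Group interaction). Accuracy also increased slightly over the course of the experiment (TrialOrder). Prime relatedness did not reliably affect accuracy in either group.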
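The step from acc1 to acc3 drops interaction terms one at a time. One common way to justify such a step is a likelihood-ratio test between nested models; the sketch below shows the idea on simulated data. The toy dataframe, effect sizes, and model names are assumptions for illustration only, and the output does not correspond to the study.

```r
set.seed(42)
n <- 2000
toy <- data.frame(
  Group = factor(sample(c("L1", "L2"), n, replace = TRUE)),
  Related = factor(sample(c("Unrelated", "Related"), n, replace = TRUE),
                   levels = c("Unrelated", "Related"))
)
# Simulated accuracy depends on Group only; there is no true interaction.
toy$Accuracy <- rbinom(n, 1, plogis(2 - 0.7 * (toy$Group == "L2")))

full <- glm(Accuracy ~ Related * Group, data = toy, family = binomial)
reduced <- glm(Accuracy ~ Related + Group, data = toy, family = binomial)

# Likelihood-ratio test: a non-significant result favours the simpler model.
lrt <- anova(reduced, full, test = "Chisq")
lrt
```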
data<-data[data$Condition!="Nonce",]
Count how many data points are in the dataframe before cleaning
L1<-data[data$Group=="L1",]
L2<-data[data$Group=="L2",]
L1_before<-length(L1$Item)
L2_before<-length(L2$Item)
L1_before
## [1] 5040
L2_before
## [1] 3600
The dataframe “props” has lexical properties for all the primes and targets used in the study. It has length (in letters), syllable count, and frequency per million words (Lexique.org).
The code below loads this dataframe and creates two-column tables so that the lookup function can transfer the lexical properties from the “props” dataframe to the “data” dataframe.
props<-read.csv("major paper stim properties.csv",header=TRUE)
head(props)
## Item Condition Word Letters Frequency Syllables tar.rel.unrel
## 1 D113 ID laçons 6 0 2 unrel
## 2 D114 ID parons 6 0 2 unrel
## 3 D115 ID gazons 6 0 2 unrel
## 4 D116 ID misons 6 0 2 unrel
## 5 D120 ID privons 7 0 2 unrel
## 6 D121 ID prônons 7 0 2 unrel
## rel.overlap word.nonce repeat.
## 1 NA word 0
## 2 NA word 0
## 3 NA word 0
## 4 NA word 0
## 5 NA word 0
## 6 NA word 0
targets<-props[props$tar.rel.unrel=="target" & props$word.nonce=="word",]
target_length<-targets[,c(3,4)]
target_freq<-targets[,c(3,5)]
target_syllables<-targets[,c(3,6)]
data$targetfreq<-lookup(data$Target,target_freq)
data$targetlength<-lookup(data$Target,target_length)
data$targetsyllables<-lookup(data$Target,target_syllables)
head(data)
## List Subject TrialOrder Condition Item Prime PrimeCondition
## 11 List1 L1 101 11 Morph D014 gagnons morph
## 15 List1 L1 101 15 Sem D085 narrons sem
## 16 List1 L1 101 16 Sem D106 discutons unr
## 20 List1 L1 101 20 Orth D068 amassons unr
## 23 List1 L1 101 23 Sem D082 montons sem
## 25 List1 L1 101 25 Morph D013 fondons morph
## Related Accuracy RT Target Group targetfreq targetlength
## 11 Related 1 757 GAGNE L1 19.66 5
## 15 Related 1 798 RACONTE L1 54.12 7
## 16 Unrelated 1 708 AIDE L1 18.38 4
## 20 Unrelated 1 729 CHARGE L1 7.97 6
## 23 Related 1 800 GRIMPE L1 8.18 6
## 25 Related 1 710 FONDE L1 2.03 5
## targetsyllables
## 11 1
## 15 2
## 16 1
## 20 1
## 23 1
## 25 1
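The same transfer of lexical properties can be sketched in base R with match(), which avoids building a separate two-column table for each property. The dataframes props_demo and trials_demo are hypothetical; the three property rows reuse values shown in the head(data) output above, and "MISSING" is a made-up target included to show what happens when a word has no entry.

```r
# Hypothetical stimulus table (values reused from the head(data) output).
props_demo <- data.frame(
  Word = c("GAGNE", "AIDE", "CHARGE"),
  Letters = c(5, 4, 6),
  Frequency = c(19.66, 18.38, 7.97)
)
trials_demo <- data.frame(Target = c("AIDE", "GAGNE", "AIDE", "MISSING"))

# match() gives, for each trial, the row of props_demo holding that target (NA if absent).
idx <- match(trials_demo$Target, props_demo$Word)
trials_demo$targetlength <- props_demo$Letters[idx]
trials_demo$targetfreq <- props_demo$Frequency[idx]
trials_demo
```

Targets without an entry come back as NA, so counting NAs after the lookup is a simple check that every target was found.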
* Remove RTs above 3000 ms or below 300 ms
* Calculate z-scores (per participant)
* Remove RTs with z-scores above 2.5
* Keep only accurate items
* Re-level the Related column so that Unrelated is treated as the baseline
library(stats)
data<-data[data$RT<=3000 & data$RT>=300,]
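# z-score RTs within each participant; na.rm is dropped because ave() treats extra arguments as grouping variables, and scale() already ignores NAs
data$zRT<-ave(data$RT,data$Subject,FUN=function(x) as.numeric(scale(x)))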
data<-data[data$zRT<=2.5,]
dim(data)
## [1] 8308 16
Count the data points in the dataframe after cleaning and compute the proportion of data lost in each group
L1<-data[data$Group=="L1",]
L2<-data[data$Group=="L2",]
L1_after<-length(L1$Item)
L2_after<-length(L2$Item)
L1_after
## [1] 4869
L2_after
## [1] 3439
L1_lost<-(L1_before-L1_after)/L1_before
L1_lost
## [1] 0.03392857
L2_lost<-(L2_before-L2_after)/L2_before
L2_lost
## [1] 0.04472222
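The trimming steps and the loss calculation can be sketched end to end on simulated RTs. The toy dataframe and its log-normal parameters are assumptions chosen only to give plausible-looking values; the cutoffs are the ones used in the chunk above.

```r
set.seed(7)
toy <- data.frame(
  Subject = rep(c("S1", "S2", "S3"), each = 100),
  RT = round(rlnorm(300, meanlog = 6.6, sdlog = 0.35))  # simulated RTs in ms
)
n_before <- nrow(toy)

# 1. Absolute cutoffs.
toy <- toy[toy$RT >= 300 & toy$RT <= 3000, ]

# 2. z-scores computed within each participant.
toy$zRT <- ave(toy$RT, toy$Subject, FUN = function(x) (x - mean(x)) / sd(x))

# 3. Drop slow outliers more than 2.5 SD above the participant's mean.
toy <- toy[toy$zRT <= 2.5, ]

prop_lost <- (n_before - nrow(toy)) / n_before
prop_lost
```

As in the analysis above, only the upper tail is trimmed by z-score; fast responses are handled by the absolute 300 ms floor.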
Remove trials with incorrect responses
data<-data[data$Accuracy==1,]
data$Related<-as.factor(data$Related)
data$Related<-relevel(data$Related,ref="Unrelated")
Calculate the proportion of the remaining data lost due to inaccuracy
L1<-data[data$Group=="L1",]
L2<-data[data$Group=="L2",]
L1_acc<-length(L1$Item)
L2_acc<-length(L2$Item)
L1_acc
## [1] 4469
L2_acc
## [1] 2861
L1_final<-(L1_after-L1_acc)/L1_after
L1_final
## [1] 0.08215239
L2_final<-(L2_after-L2_acc)/L2_after
L2_final
## [1] 0.1680721
Log-transform RTs and target frequencies to reduce skew and bring their distributions closer to normal
data$logRT<-log(data$RT)
data$logtargetfreq<-log(data$targetfreq)
Baselines in the model: ID for Condition, Unrelated for Related, and L1 for Group.
Models are compared with ANOVAs to test if adding/removing terms improves the model. When there is no significant difference between 2 models, the model with fewer terms is chosen.
library(LMERConvenienceFunctions)
test1<-lmer(logRT~Condition*Related*Group+TrialOrder+(1|Item)+(1|Subject),data)
test2<-lmer(logRT~Condition*Related*Group+logtargetfreq+TrialOrder+(1|Item)+(1|Subject),data)
test3<-lmer(logRT~Condition*Related*Group-Group:Condition+logtargetfreq+TrialOrder+(1|Item)+(1|Subject),data)
test4<-lmer(logRT~Condition*Related*Group-Group:Related+logtargetfreq+TrialOrder+(1|Item)+(1|Subject),data)
test5<-lmer(logRT~Condition*Related+Group+logtargetfreq+TrialOrder+(1|Item)+(1|Subject),data)
#test5 is best, simpler than others, and others aren't significantly better
test6<-lmer(logRT~Condition*Related+Group+logtargetfreq+TrialOrder+(1+Group|Item)+(1|Subject),data)
test7<-lmer(logRT~Condition*Related+Group+logtargetfreq+TrialOrder+(1+Group|Item)+(1+logtargetfreq|Subject),data)
# test 7 is best (i.e., better to include random slopes for group and targetfreq)
summary(test7)
## Linear mixed model fit by REML t-tests use Satterthwaite approximations
## to degrees of freedom [lmerMod]
## Formula:
## logRT ~ Condition * Related + Group + logtargetfreq + TrialOrder +
## (1 + Group | Item) + (1 + logtargetfreq | Subject)
## Data: data
##
## REML criterion at convergence: -1889.8
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.9717 -0.6448 -0.1305 0.5099 4.3087
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## Item (Intercept) 0.0022160 0.04707
## GroupL2 0.0040613 0.06373 0.07
## Subject (Intercept) 0.0286915 0.16939
## logtargetfreq 0.0002223 0.01491 -0.59
## Residual 0.0411669 0.20290
## Number of obs: 7330, groups: Item, 144; Subject, 60
##
## Fixed effects:
## Estimate Std. Error df t value
## (Intercept) 6.691e+00 3.075e-02 9.400e+01 217.590
## ConditionMorph -1.017e-02 1.529e-02 2.020e+02 -0.665
## ConditionOrth -2.446e-02 1.554e-02 2.080e+02 -1.574
## ConditionSem -7.711e-04 1.541e-02 2.030e+02 -0.050
## RelatedRelated -6.327e-02 9.513e-03 7.055e+03 -6.650
## GroupL2 6.921e-02 3.875e-02 6.300e+01 1.786
## logtargetfreq -2.791e-02 3.781e-03 1.360e+02 -7.381
## TrialOrder -1.726e-04 2.858e-05 7.103e+03 -6.038
## ConditionMorph:RelatedRelated 2.011e-02 1.338e-02 7.052e+03 1.503
## ConditionOrth:RelatedRelated 4.673e-02 1.365e-02 7.065e+03 3.422
## ConditionSem:RelatedRelated 6.775e-02 1.346e-02 7.076e+03 5.033
## Pr(>|t|)
## (Intercept) < 2e-16 ***
## ConditionMorph 0.506509
## ConditionOrth 0.117055
## ConditionSem 0.960129
## RelatedRelated 3.14e-11 ***
## GroupL2 0.078895 .
## logtargetfreq 1.40e-11 ***
## TrialOrder 1.64e-09 ***
## ConditionMorph:RelatedRelated 0.132757
## ConditionOrth:RelatedRelated 0.000624 ***
## ConditionSem:RelatedRelated 4.96e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) CndtnM CndtnO CndtnS RltdRl GropL2 lgtrgt TrlOrd CnM:RR
## ConditnMrph -0.245
## ConditnOrth -0.271 0.495
## ConditionSm -0.222 0.502 0.482
## RelatedRltd -0.159 0.317 0.312 0.314
## GroupL2 -0.532 0.000 0.002 0.000 0.000
## logtargtfrq -0.462 -0.010 0.085 -0.085 0.005 0.065
## TrialOrder -0.141 -0.016 -0.013 -0.011 0.002 -0.002 0.003
## CndtnMrp:RR 0.111 -0.443 -0.222 -0.224 -0.712 0.000 -0.002 0.010
## CndtnOrt:RR 0.112 -0.221 -0.445 -0.218 -0.697 0.000 -0.008 0.002 0.496
## CndtnSm:RlR 0.112 -0.224 -0.221 -0.443 -0.708 0.002 -0.005 -0.002 0.504
## CnO:RR
## ConditnMrph
## ConditnOrth
## ConditionSm
## RelatedRltd
## GroupL2
## logtargtfrq
## TrialOrder
## CndtnMrp:RR
## CndtnOrt:RR
## CndtnSm:RlR 0.493
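There are interactions of Condition x Related for Orth and Sem, meaning the effect of relatedness found in the ID condition is significantly smaller in the Orth and Sem conditions. The interaction for Morph is not significant, so priming in the Morph condition does not differ reliably from priming in the ID condition.
The best model did not include any interactions with Group, meaning the priming patterns for the 2 groups do not differ. The main effect of Group was also not significant (p = .079), indicating that the L2 group overall did not differ reliably from the L1 group in log-transformed RTs, although L2 RTs were numerically slower.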
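Because the model is fit on log RT, the interaction terms are easier to read once they are combined with the baseline relatedness effect and converted to percent change. The sketch below copies the relevant estimates from summary(test7); it ignores the standard errors, so it is a reading aid for the point estimates rather than a test of the simple effects.

```r
# Fixed-effect estimates copied from summary(test7) (log-RT scale).
b_related <- -0.06327          # RelatedRelated: priming effect in the ID baseline
b_inter <- c(Morph = 0.02011,  # Condition:Related interaction terms
             Orth = 0.04673,
             Sem = 0.06775)

# Relatedness effect within each condition = baseline effect + interaction.
simple_effects <- c(ID = b_related, b_related + b_inter)

# On the log scale a coefficient b corresponds to a (exp(b) - 1) * 100 percent change in RT.
pct_change <- (exp(simple_effects) - 1) * 100
round(pct_change, 1)
```

Read this way, related primes speed responses by about 6% in the ID condition, and the estimated effect shrinks toward zero across the Morph, Orth, and Sem conditions.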
The above analyses indicate that there is no need to separate L1 and L2 French speakers for analyses (no interactions of group). However, I am interested in testing if French proficiency modulates the processing of morphologically complex words in non-native French speakers. Only the L2 group completed proficiency measures, so separate analyses are done on just the L2 group to investigate how proficiency may modulate morphological processing. Those analyses are documented in a separate markdown file posted to RPubs.
There are 35 native speakers and 25 usable L2 speakers in the above analyses. Typically these analyses are robust to unequal groups, but to ensure that the L1 group is not unfairly driving the results of the ‘big’ model with both language groups, a subset of 25 native French speakers is created and another ‘big’ model is used to test the effect of group.
Subsetting L1 group
L1_subjects<-unique(L1$Subject)
L1_subset<-L1_subjects[c(6:30)]
L1_subset
## [1] "L1 106" "L1 107" "L1 108" "L1 109" "L1 301" "L1 302" "L1 303"
## [8] "L1 304" "L1 305" "L1 306" "L1 201" "L1 202" "L1 203" "L1 204"
## [15] "L1 205" "L1 206" "L1 207" "L1 208" "L1 209" "L1 401" "L1 402"
## [22] "L1 403" "L1 404" "L1 405" "L1 406"
L1_25<-L1[L1$Subject %in% L1_subset,]
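The subset above takes subjects by position (rows 6 to 30 of the subject list). A random draw with a fixed seed is an alternative way to pick 25 subjects; the sketch below shows only that idea. The subject labels and the seed are hypothetical, and this is not the selection used in the analysis.

```r
set.seed(2024)  # arbitrary seed, chosen only to make the draw reproducible
ids <- paste("L1", 101:135)           # 35 hypothetical subject labels
subset_ids <- sample(ids, size = 25)  # 25 distinct subjects, drawn without replacement
length(unique(subset_ids))
```

Repeating the draw with several seeds would show whether the group result depends on which native speakers happen to be included.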
Merging the new L1 group (n=25) with the full L2 group (n=25)
data2<-rbind(L1_25,L2)
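Accuracy Analyses
Note that L1 and L2 were overwritten above with trials that had already passed RT cleaning and the correct-response filter, so every trial in data2 has Accuracy = 1. The accuracy table and logistic regression below are therefore uninformative (hence the non-convergence warning); an accuracy comparison for the subset would need to start from the data as they were before incorrect trials were removed.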
L1_acc<-aggregate(Accuracy~Condition,data=L1_25,FUN=mean)
L2_acc<-aggregate(Accuracy~Condition,data=L2,FUN=mean)
Acc_all<-cbind(L1_acc[,c(1,2)],L2_acc[,2])
names(Acc_all)<-c("Condition","L1","L2")
Acc_all
## Condition L1 L2
## 1 ID 1 1
## 2 Morph 1 1
## 3 Orth 1 1
## 4 Sem 1 1
data2$Related<-as.factor(data2$Related)
data2$Related<-relevel(data2$Related,ref="Unrelated")
acc1<-glm(Accuracy~ Condition*Related*Group+TrialOrder,data=data2,family=binomial)
## Warning: glm.fit: algorithm did not converge
summary(acc1)
##
## Call:
## glm(formula = Accuracy ~ Condition * Related * Group + TrialOrder,
## family = binomial, data = data2)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## 2.409e-06 2.409e-06 2.409e-06 2.409e-06 2.409e-06
##
## Coefficients:
## Estimate Std. Error z value
## (Intercept) 2.657e+01 1.954e+04 0.001
## ConditionMorph -1.000e-07 2.512e+04 0.000
## ConditionOrth 5.969e-08 2.559e+04 0.000
## ConditionSem 5.961e-08 2.526e+04 0.000
## RelatedRelated -8.287e-08 2.508e+04 0.000
## GroupL2 -1.670e-07 2.597e+04 0.000
## TrialOrder 1.900e-08 5.442e+01 0.000
## ConditionMorph:RelatedRelated 1.118e-07 3.542e+04 0.000
## ConditionOrth:RelatedRelated -1.993e-07 3.609e+04 0.000
## ConditionSem:RelatedRelated -1.867e-08 3.547e+04 0.000
## ConditionMorph:GroupL2 2.036e-07 3.658e+04 0.000
## ConditionOrth:GroupL2 7.596e-08 3.727e+04 0.000
## ConditionSem:GroupL2 -7.919e-08 3.673e+04 0.000
## RelatedRelated:GroupL2 2.667e-07 3.647e+04 0.000
## ConditionMorph:RelatedRelated:GroupL2 -2.121e-07 5.143e+04 0.000
## ConditionOrth:RelatedRelated:GroupL2 -8.895e-08 5.238e+04 0.000
## ConditionSem:RelatedRelated:GroupL2 -1.677e-07 5.167e+04 0.000
## Pr(>|z|)
## (Intercept) 0.999
## ConditionMorph 1.000
## ConditionOrth 1.000
## ConditionSem 1.000
## RelatedRelated 1.000
## GroupL2 1.000
## TrialOrder 1.000
## ConditionMorph:RelatedRelated 1.000
## ConditionOrth:RelatedRelated 1.000
## ConditionSem:RelatedRelated 1.000
## ConditionMorph:GroupL2 1.000
## ConditionOrth:GroupL2 1.000
## ConditionSem:GroupL2 1.000
## RelatedRelated:GroupL2 1.000
## ConditionMorph:RelatedRelated:GroupL2 1.000
## ConditionOrth:RelatedRelated:GroupL2 1.000
## ConditionSem:RelatedRelated:GroupL2 1.000
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 0.0000e+00 on 6038 degrees of freedom
## Residual deviance: 3.5036e-08 on 6022 degrees of freedom
## AIC: 34
##
## Number of Fisher Scoring iterations: 25
data2<-data2[data2$Condition!="Nonce",]
The dataframe “props” has lexical properties for all the primes and targets used in the study. It has length (in letters), syllable count, and frequency per million words (Lexique.org).
The code below loads this dataframe and creates two-column tables so that the lookup function can transfer the lexical properties from the “props” dataframe to the “data2” dataframe.
props<-read.csv("major paper stim properties.csv",header=TRUE)
head(props)
## Item Condition Word Letters Frequency Syllables tar.rel.unrel
## 1 D113 ID laçons 6 0 2 unrel
## 2 D114 ID parons 6 0 2 unrel
## 3 D115 ID gazons 6 0 2 unrel
## 4 D116 ID misons 6 0 2 unrel
## 5 D120 ID privons 7 0 2 unrel
## 6 D121 ID prônons 7 0 2 unrel
## rel.overlap word.nonce repeat.
## 1 NA word 0
## 2 NA word 0
## 3 NA word 0
## 4 NA word 0
## 5 NA word 0
## 6 NA word 0
targets<-props[props$tar.rel.unrel=="target" & props$word.nonce=="word",]
target_length<-targets[,c(3,4)]
target_freq<-targets[,c(3,5)]
target_syllables<-targets[,c(3,6)]
data2$targetfreq<-lookup(data2$Target,target_freq)
data2$targetlength<-lookup(data2$Target,target_length)
data2$targetsyllables<-lookup(data2$Target,target_syllables)
head(data2)
## List Subject TrialOrder Condition Item Prime PrimeCondition
## 1516 List1 L1 106 11 Orth D064 cinglons unr
## 1517 List1 L1 106 12 Sem D086 dénions sem
## 1518 List1 L1 106 13 Morph D026 écrasons unr
## 1519 List1 L1 106 14 Sem D108 tolérons unr
## 1520 List1 L1 106 15 ID D138 concluons unr
## 1521 List1 L1 106 16 Morph D018 signons morph
## Related Accuracy RT Target Group targetfreq targetlength
## 1516 Unrelated 1 1100 ARRIVE L1 164.12 6
## 1517 Related 1 837 REFUSE L1 24.19 6
## 1518 Unrelated 1 745 MONTRE L1 38.99 6
## 1519 Unrelated 1 576 VEXE L1 1.76 4
## 1520 Unrelated 1 655 PRÉPARE L1 19.59 7
## 1521 Related 1 602 SIGNE L1 7.57 5
## targetsyllables zRT
## 1516 2 2.3416291
## 1517 2 0.8355292
## 1518 1 0.3086805
## 1519 1 -0.6591176
## 1520 2 -0.2067149
## 1521 1 -0.5102256
* Remove 3000+ RTs
* Calculate zscores (per participant)
* Remove 2.5+ zscores
* Keep only accurate items
* Re-level the Related column so that Unrelated is treated as the baseline
library(stats)
data2<-data2[data2$RT<=3000,]
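# z-score RTs within each participant; na.rm is dropped because ave() treats extra arguments as grouping variables, and scale() already ignores NAs
data2$zRT<-ave(data2$RT,data2$Subject,FUN=function(x) as.numeric(scale(x)))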
data2<-data2[data2$zRT<=2.5,]
dim(data2)
## [1] 5853 16
data2<-data2[data2$Accuracy==1,]
data2$Related<-as.factor(data2$Related)
data2$Related<-relevel(data2$Related,ref="Unrelated")
Log-transform RTs and target frequencies to reduce skew and bring their distributions closer to normal
data2$logRT<-log(data2$RT)
data2$logtargetfreq<-log(data2$targetfreq)
test11<-lmer(logRT~Condition*Related*Group+TrialOrder+(1|Item)+(1|Subject),data2)
test12<-lmer(logRT~Condition*Related*Group+logtargetfreq+TrialOrder+(1|Item)+(1|Subject),data2)
test13<-lmer(logRT~Condition*Related*Group-Group:Condition+logtargetfreq+TrialOrder+(1|Item)+(1|Subject),data2)
test14<-lmer(logRT~Condition*Related*Group-Group:Related+logtargetfreq+TrialOrder+(1|Item)+(1|Subject),data2)
test15<-lmer(logRT~Condition*Related+Group+logtargetfreq+TrialOrder+(1|Item)+(1|Subject),data2)
#test15 is best, simpler than others, and others aren't significantly better
test16<-lmer(logRT~Condition*Related+Group+logtargetfreq+TrialOrder+(1+Group|Item)+(1|Subject),data2)
test17<-lmer(logRT~Condition*Related+Group+logtargetfreq+TrialOrder+(1+Group|Item)+(1+logtargetfreq|Subject),data2)
# test 17 is best (i.e., better to include random slopes for group and targetfreq)
summary(test17)
## Linear mixed model fit by REML t-tests use Satterthwaite approximations
## to degrees of freedom [lmerMod]
## Formula:
## logRT ~ Condition * Related + Group + logtargetfreq + TrialOrder +
## (1 + Group | Item) + (1 + logtargetfreq | Subject)
## Data: data2
##
## REML criterion at convergence: -2451
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.0433 -0.6644 -0.1149 0.5637 3.9005
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## Item (Intercept) 0.0018154 0.04261
## GroupL2 0.0029748 0.05454 0.05
## Subject (Intercept) 0.0302985 0.17406
## logtargetfreq 0.0001903 0.01379 -0.71
## Residual 0.0347800 0.18649
## Number of obs: 5853, groups: Item, 144; Subject, 50
##
## Fixed effects:
## Estimate Std. Error df t value
## (Intercept) 6.699e+00 3.492e-02 7.600e+01 191.854
## ConditionMorph 8.920e-04 1.482e-02 2.140e+02 0.060
## ConditionOrth -2.746e-02 1.510e-02 2.210e+02 -1.818
## ConditionSem -7.419e-04 1.496e-02 2.160e+02 -0.050
## RelatedRelated -6.356e-02 9.777e-03 5.622e+03 -6.501
## GroupL2 4.020e-02 4.017e-02 5.400e+01 1.001
## logtargetfreq -2.613e-02 3.692e-03 1.160e+02 -7.077
## TrialOrder -2.175e-04 2.951e-05 5.658e+03 -7.369
## ConditionMorph:RelatedRelated 1.803e-02 1.376e-02 5.621e+03 1.311
## ConditionOrth:RelatedRelated 4.595e-02 1.406e-02 5.628e+03 3.268
## ConditionSem:RelatedRelated 6.842e-02 1.387e-02 5.641e+03 4.932
## Pr(>|t|)
## (Intercept) < 2e-16 ***
## ConditionMorph 0.95206
## ConditionOrth 0.07041 .
## ConditionSem 0.96048
## RelatedRelated 8.66e-11 ***
## GroupL2 0.32136
## logtargetfreq 1.21e-10 ***
## TrialOrder 1.96e-13 ***
## ConditionMorph:RelatedRelated 0.18998
## ConditionOrth:RelatedRelated 0.00109 **
## ConditionSem:RelatedRelated 8.36e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) CndtnM CndtnO CndtnS RltdRl GropL2 lgtrgt TrlOrd CnM:RR
## ConditnMrph -0.209
## ConditnOrth -0.230 0.493
## ConditionSm -0.189 0.501 0.481
## RelatedRltd -0.143 0.335 0.329 0.331
## GroupL2 -0.589 0.000 0.004 -0.001 0.000
## logtargtfrq -0.496 -0.010 0.081 -0.084 0.005 0.088
## TrialOrder -0.130 -0.016 -0.017 -0.013 -0.006 -0.001 0.007
## CndtnMrp:RR 0.100 -0.468 -0.234 -0.236 -0.711 -0.001 -0.002 0.012
## CndtnOrt:RR 0.099 -0.233 -0.472 -0.230 -0.696 0.000 -0.008 0.015 0.495
## CndtnSm:RlR 0.101 -0.236 -0.232 -0.469 -0.706 0.002 -0.005 0.000 0.502
## CnO:RR
## ConditnMrph
## ConditnOrth
## ConditionSm
## RelatedRltd
## GroupL2
## logtargtfrq
## TrialOrder
## CndtnMrp:RR
## CndtnOrt:RR
## CndtnSm:RlR 0.491
morph2<-data2[data2$Condition=="Morph",]
orth2<-data2[data2$Condition=="Orth",]
sem2<-data2[data2$Condition=="Sem",]
id2<-data2[data2$Condition=="ID",]
id11<-lmer(logRT~Related*Group+logtargetfreq+TrialOrder+(1+Group|Item)+(1+logtargetfreq|Subject),id2)
id12<-lmer(logRT~Related+Group+logtargetfreq+TrialOrder+(1+Group|Item)+(1+logtargetfreq|Subject),id2)
summary(id12)
## Linear mixed model fit by REML t-tests use Satterthwaite approximations
## to degrees of freedom [lmerMod]
## Formula:
## logRT ~ Related + Group + logtargetfreq + TrialOrder + (1 + Group |
## Item) + (1 + logtargetfreq | Subject)
## Data: id2
##
## REML criterion at convergence: -475.1
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.7261 -0.6345 -0.1041 0.5587 3.5750
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## Subject (Intercept) 3.545e-02 0.188294
## logtargetfreq 9.541e-05 0.009768 -1.00
## Item (Intercept) 3.392e-03 0.058244
## GroupL2 2.975e-03 0.054543 -0.58
## Residual 3.563e-02 0.188759
## Number of obs: 1483, groups: Subject, 50; Item, 36
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 6.701e+00 4.142e-02 8.340e+01 161.790 < 2e-16 ***
## RelatedRelated -6.503e-02 9.933e-03 1.398e+03 -6.547 8.23e-11 ***
## GroupL2 5.269e-02 4.680e-02 5.360e+01 1.126 0.265293
## logtargetfreq -2.550e-02 6.871e-03 3.820e+01 -3.711 0.000656 ***
## TrialOrder -2.704e-04 5.949e-05 1.403e+03 -4.545 5.96e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) RltdRl GropL2 lgtrgt
## RelatedRltd -0.126
## GroupL2 -0.561 -0.001
## logtargtfrq -0.506 0.015 -0.019
## TrialOrder -0.231 -0.008 -0.005 0.034
morph11<-lmer(logRT~Related*Group+logtargetfreq+TrialOrder+(1+Group|Item)+(1+logtargetfreq|Subject),morph2)
morph12<-lmer(logRT~Related+Group+logtargetfreq+TrialOrder+(1+Group|Item)+(1+logtargetfreq|Subject),morph2)
summary(morph12)
## Linear mixed model fit by REML t-tests use Satterthwaite approximations
## to degrees of freedom [lmerMod]
## Formula:
## logRT ~ Related + Group + logtargetfreq + TrialOrder + (1 + Group |
## Item) + (1 + logtargetfreq | Subject)
## Data: morph2
##
## REML criterion at convergence: -485.9
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.8450 -0.6866 -0.1260 0.5652 3.6840
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## Subject (Intercept) 2.795e-02 0.167190
## logtargetfreq 3.115e-05 0.005581 -1.00
## Item (Intercept) 2.043e-03 0.045199
## GroupL2 1.417e-03 0.037648 0.30
## Residual 3.602e-02 0.189784
## Number of obs: 1513, groups: Subject, 50; Item, 36
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 6.731e+00 4.034e-02 7.700e+01 166.857 < 2e-16 ***
## RelatedRelated -4.523e-02 9.838e-03 1.426e+03 -4.597 4.66e-06 ***
## GroupL2 3.997e-02 4.433e-02 5.010e+01 0.902 0.372
## logtargetfreq -3.528e-02 7.689e-03 3.400e+01 -4.588 5.85e-05 ***
## TrialOrder -2.826e-04 5.952e-05 1.439e+03 -4.748 2.26e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) RltdRl GropL2 lgtrgt
## RelatedRltd -0.130
## GroupL2 -0.554 -0.003
## logtargtfrq -0.550 0.008 0.053
## TrialOrder -0.239 0.020 0.005 0.006
orth11<-lmer(logRT~Related*Group+logtargetfreq+TrialOrder+(1+Group|Item)+(1|Subject),orth2)
orth12<-lmer(logRT~Related+Group+logtargetfreq+TrialOrder+(1+Group|Item)+(1|Subject),orth2)
summary(orth12)
## Linear mixed model fit by REML t-tests use Satterthwaite approximations
## to degrees of freedom [lmerMod]
## Formula:
## logRT ~ Related + Group + logtargetfreq + TrialOrder + (1 + Group |
## Item) + (1 | Subject)
## Data: orth2
##
## REML criterion at convergence: -388.9
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.9668 -0.6516 -0.1367 0.5786 3.6735
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## Subject (Intercept) 0.020108 0.14180
## Item (Intercept) 0.001166 0.03415
## GroupL2 0.004377 0.06616 -0.14
## Residual 0.037362 0.19329
## Number of obs: 1388, groups: Subject, 50; Item, 36
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 6.648e+00 3.299e-02 7.380e+01 201.501 < 2e-16 ***
## RelatedRelated -1.670e-02 1.049e-02 1.297e+03 -1.592 0.11162
## GroupL2 4.840e-02 4.298e-02 5.350e+01 1.126 0.26513
## logtargetfreq -2.057e-02 4.051e-03 3.540e+01 -5.079 1.23e-05 ***
## TrialOrder -1.865e-04 6.419e-05 1.297e+03 -2.906 0.00373 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) RltdRl GropL2 lgtrgt
## RelatedRltd -0.168
## GroupL2 -0.613 -0.001
## logtargtfrq -0.241 -0.014 0.015
## TrialOrder -0.311 0.034 -0.004 -0.012
sem11<-lmer(logRT~Related*Group+logtargetfreq+TrialOrder+(1+Group|Item)+(1+logtargetfreq|Subject),sem2)
sem12<-lmer(logRT~Related+Group+logtargetfreq+TrialOrder+(1+Group|Item)+(1+logtargetfreq|Subject),sem2)
summary(sem12)
## Linear mixed model fit by REML t-tests use Satterthwaite approximations
## to degrees of freedom [lmerMod]
## Formula:
## logRT ~ Related + Group + logtargetfreq + TrialOrder + (1 + Group |
## Item) + (1 + logtargetfreq | Subject)
## Data: sem2
##
## REML criterion at convergence: -662.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.2596 -0.6372 -0.0888 0.5375 3.9904
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## Subject (Intercept) 0.0342910 0.18518
## logtargetfreq 0.0008081 0.02843 -0.67
## Item (Intercept) 0.0007527 0.02744
## GroupL2 0.0034772 0.05897 1.00
## Residual 0.0307702 0.17541
## Number of obs: 1469, groups: Subject, 50; Item, 36
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 6.692e+00 4.250e-02 6.840e+01 157.462 < 2e-16 ***
## RelatedRelated 4.444e-03 9.309e-03 1.382e+03 0.477 0.633191
## GroupL2 6.266e-02 4.219e-02 5.390e+01 1.485 0.143263
## logtargetfreq -3.039e-02 8.459e-03 4.820e+01 -3.593 0.000767 ***
## TrialOrder -1.566e-04 5.605e-05 1.377e+03 -2.795 0.005269 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) RltdRl GropL2 lgtrgt
## RelatedRltd -0.102
## GroupL2 -0.495 0.004
## logtargtfrq -0.690 -0.012 0.096
## TrialOrder -0.212 -0.014 -0.009 0.011