The purpose of this document is to work through the code from the paper below, tidyverse-ising it where possible and adding notes to help me understand what the code is doing to illustrating how linear mixed models work with this dataset.
In the real paper, they report …
Accuracy ~ Cuetype + (1 + Condition | Subject) + (1 | Word)
… which is a model including random intercepts for Subject and slopes for condition across subject (1 + Condition | Subject) AND random intercepts across words/items (1 | Word)
Then they report….
Accuracy ~ PhonologicalCue + zLength + zFrequency + zImageability + zNameAgreement + Phonological Cue: zVisualComplexity + (1 + Condition | Subject) + (1 | Word)
… which is cuetype reduced to 2 levels (phono vs not), plus all kinds of word parameters (length, freq, imageability, nameagreement) plus the INTERACTION between phono and visualComplexity, AND random intercepts of subject and slope of condition across subjects, and random intercepts for word/items.
TAKE home: in the real world this is really complex!
Picture naming experiment with 10 individuals with aphasia
Four conditions / two sessions - Semantic: related (associated) and unrelated (not associated) word presented with picture - Phonological: phonological cue (word onset) and tone presented with picture Counterbalanced so that in each session, half items cued and half uncued - 2 phonological sessions, 2 semantic sessions
Pictures taken from the Philadelphia Naming Test, 175 pictures. Vary along different lexical and picture variables.
For the purpose of this example, looking at the fixed effects of CueType, Length and Frequency.
Dependent variable: Accuracy of naming (coded as 0/1) Generalised LMM used with logistic link function / logistic regression
For each model, ONLY the random effect under consideration is fit (see manuscript). This allowed the figures to be generated more easily, and each one to be considered on its own. In reality, a model was fit with multiple random effects (e.g. for the effects of frequency and length and Cue type)
library(tidyverse)
library(broom.mixed)
library(here)
library(lmerTest)
library(lme4)
library(effects)
library(multcomp)
library(merTools)
here::here()
## [1] "/Users/jenny/Desktop/bestPractice_LMM"
NamingData <- read_csv(here("data", "naming_data.csv"))
Create tables to check responses across conditions/sessions
table(NamingData$Acc,NamingData$Subject,NamingData$Condition)
## , , = Phonological
##
##
## AM AW CB DH EW FF JK JV MH WR
## 0 299 37 46 169 39 177 163 42 151 83
## 1 51 310 302 162 237 162 132 293 135 199
##
## , , = Semantic
##
##
## AM AW CB DH EW FF JK JV MH WR
## 0 302 65 38 195 49 213 156 70 206 83
## 1 48 280 309 145 206 126 140 256 74 205
table(NamingData$Acc,NamingData$Subject,NamingData$Session)
## , , = 1
##
##
## AM AW CB DH EW FF JK JV MH WR
## 0 150 31 25 87 19 87 78 19 92 41
## 1 25 142 149 77 114 82 68 147 33 99
##
## , , = 2
##
##
## AM AW CB DH EW FF JK JV MH WR
## 0 149 34 21 82 20 90 85 23 114 42
## 1 26 138 153 85 123 80 64 146 41 100
##
## , , = 3
##
##
## AM AW CB DH EW FF JK JV MH WR
## 0 150 18 17 101 23 103 77 41 73 42
## 1 25 155 157 68 104 63 71 123 73 103
##
## , , = 4
##
##
## AM AW CB DH EW FF JK JV MH WR
## 0 152 19 21 94 26 110 79 29 78 41
## 1 23 155 152 77 102 63 69 133 62 102
Length and Frequence are continuous IVs, check coding of predictors
Make new varirables zLength zFreq using scale function.
NamingData$zLengthPh <- scale(NamingData$Length.Phon, scale = TRUE, center = TRUE)
NamingData$zFreq <- scale(NamingData$FreqLogLemma, scale = TRUE, center = TRUE)
summary(NamingData) # check new z scores mean 0
## Word Subject Trial ItemNo
## Length:7000 Length:7000 Min. : 1 Min. : 1
## Class :character Class :character 1st Qu.: 44 1st Qu.: 44
## Mode :character Mode :character Median : 88 Median : 88
## Mean : 88 Mean : 88
## 3rd Qu.:132 3rd Qu.:132
## Max. :175 Max. :175
##
## Session Condition Acc Cue
## Min. :1.00 Length:7000 Min. :0.0000 Length:7000
## 1st Qu.:1.75 Class :character 1st Qu.:0.0000 Class :character
## Median :2.50 Mode :character Median :1.0000 Mode :character
## Mean :2.50 Mean :0.5935
## 3rd Qu.:3.25 3rd Qu.:1.0000
## Max. :4.00 Max. :1.0000
## NA's :645
## ACC CueCondition Length.Phon FreqLogLemma
## Mode :logical Length:7000 Min. : 1.000 Min. :0.0000
## FALSE:2583 Class :character 1st Qu.: 3.000 1st Qu.:0.9031
## TRUE :3772 Mode :character Median : 4.000 Median :1.2788
## NA's :645 Mean : 4.497 Mean :1.3376
## 3rd Qu.: 5.000 3rd Qu.:1.8325
## Max. :11.000 Max. :3.2119
##
## zLengthPh.V1 zFreq.V1
## Min. :-1.980182 Min. :-2.1157856
## 1st Qu.:-0.847725 1st Qu.:-0.6872390
## Median :-0.281496 Median :-0.0929472
## Mean : 0.000000 Mean : 0.0000000
## 3rd Qu.: 0.284732 3rd Qu.: 0.7829096
## Max. : 3.682103 Max. : 2.9648797
##
For this logistic model, coding of response variable needs to be factor with 0=error and 1=correct, currently integer.
glimpse(NamingData)
## Rows: 7,000
## Columns: 14
## $ Word <chr> "ambulance", "ambulance", "ambulance", "ambulance",…
## $ Subject <chr> "FF", "WR", "AW", "FF", "MH", "AW", "AW", "CB", "JV…
## $ Trial <dbl> 156, 65, 156, 40, 40, 65, 132, 65, 132, 40, 65, 65,…
## $ ItemNo <dbl> 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54,…
## $ Session <dbl> 1, 4, 3, 3, 1, 1, 4, 3, 2, 3, 4, 3, 2, 3, 2, 3, 1, …
## $ Condition <chr> "Phonological", "Semantic", "Phonological", "Semant…
## $ Acc <dbl> 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, …
## $ Cue <chr> "Cued", "Cued", "Cued", "NotCued", "NotCued", "Cued…
## $ ACC <lgl> FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE, …
## $ CueCondition <chr> "PhCue", "SemCue", "PhCue", "UnAssoc", "UnAssoc", "…
## $ Length.Phon <dbl> 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, …
## $ FreqLogLemma <dbl> 0.9542, 0.9542, 0.9542, 0.9542, 0.9542, 0.9542, 0.9…
## $ zLengthPh <dbl[,1]> <matrix[25 x 1]>
## $ zFreq <dbl[,1]> <matrix[25 x 1]>
In the original data, Acc is double, and ACC is logical. Need to have variable for accuracy that is 0 and 1 as a factor (to include in model) and one that is 0 and 1 as an integer (to do calc on). Leave Acc as integer and convert ACC (which is currently logical) to factor).
NamingData <- NamingData %>%
mutate(ACC = case_when(ACC == FALSE ~ 0,
ACC == TRUE ~ 1))
NamingData$ACC <- as.factor(NamingData$ACC)
NamingData$CueCondition <- as.factor(NamingData$CueCondition)
Make Subject factor, then check level names. Recode subject initials to participant = A-J.
NamingData$Subject <- as.factor(NamingData$Subject)
# first check the level names for the original Subject ID
levels(NamingData$Subject)
## [1] "AM" "AW" "CB" "DH" "EW" "FF" "JK" "JV" "MH" "WR"
# then recode the participant ID (levels), creating a new ID coding variable, Participant, in the same dataset
NamingData$Participant <- fct_recode(NamingData$Subject,
A = "CB", B = "AW", C = "EW",
D = "JV",E = "WR", F = "JK",
G = "FF", H = "DH", I = "MH",
J = "AM")
levels(NamingData$Participant)
## [1] "J" "B" "A" "H" "C" "G" "F" "D" "I" "E"
First step random effect to add is intercepts for participants. In this random intercepts only model, we are predicting accuracy (ACC) from fixed effects of CueCondition, length and frequency, plus the random effect due to between-subject variation in intercept.
This allows the intercept to vary for individual participants; in effect this accounts for overall between subject varibility in performance generally. Becuase this study is of people with aphasia, allowing for random intercepts means that the model accounts for differences between participants in the severity of their aphasia.
Here using generalised linear model (GLM) because ACC is binary (correct/incorrect). This is equivalent to logistic regression.
Run model predicting ACC from CueCondition, word length and freq, with random effects of Participant. The notations is (1|Participant) which means we are allowing the intercept (1) to vary by (|) participant.
Model.lmer1 <- glmer(ACC ~ CueCondition + zLengthPh + zFreq + (1|Participant),
data = NamingData, family = "binomial",
control = glmerControl(optimizer="bobyqa"))
summary(Model.lmer1) # standard model summary
## Generalized linear mixed model fit by maximum likelihood (Laplace
## Approximation) [glmerMod]
## Family: binomial ( logit )
## Formula: ACC ~ CueCondition + zLengthPh + zFreq + (1 | Participant)
## Data: NamingData
## Control: glmerControl(optimizer = "bobyqa")
##
## AIC BIC logLik deviance df.resid
## 6600.7 6648.0 -3293.4 6586.7 6348
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -5.0036 -0.6876 0.3306 0.5796 4.0567
##
## Random effects:
## Groups Name Variance Std.Dev.
## Participant (Intercept) 1.58 1.257
## Number of obs: 6355, groups: Participant, 10
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.95584 0.40249 2.375 0.0176 *
## CueConditionSemCue -0.62757 0.08667 -7.241 4.45e-13 ***
## CueConditionTone -0.59457 0.08720 -6.818 9.22e-12 ***
## CueConditionUnAssoc -0.62037 0.08768 -7.075 1.49e-12 ***
## zLengthPh -0.34702 0.03439 -10.092 < 2e-16 ***
## zFreq 0.21299 0.03432 6.205 5.46e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) CCndSC CCndtT CCndUA zLngtP
## CueCndtnSmC -0.114
## CueCondtnTn -0.113 0.525
## CCndtnUnAss -0.113 0.523 0.520
## zLengthPh -0.007 0.031 0.029 0.028
## zFreq 0.005 -0.019 -0.016 -0.021 0.407
tidymodel1 <- tidy(Model.lmer1) # model summary as tidy df
The purpose of Figure 1A in the paper is to plot the mean performance of each participant (averaged across conditions) and the grand mean (averaged across all conditions and participants). This shows that there are BIG differences in the performance across individuals in the study (probably due to differences in the severity of their aphasia. The data in Figure 1A are pulled using the fitted.values() function.
The individual differences in performance across participants is what the random effect of participant is trying to model. The data in Figure 1B comes from pulling ranef() from the model and represents the same differences seen in Figure 1A, but as deviations from the grand mean.
Using fitted.values() from stats package
plotDat1 = df of fitted values with original data for plotting
plotDat1 <- na.omit(NamingData) #remove NAs from original data so aligns with model values
plotDat1 <- plotDat1 %>%
mutate(fitted_values = fitted.values(Model.lmer1))
names(plotDat1)
## [1] "Word" "Subject" "Trial" "ItemNo"
## [5] "Session" "Condition" "Acc" "Cue"
## [9] "ACC" "CueCondition" "Length.Phon" "FreqLogLemma"
## [13] "zLengthPh" "zFreq" "Participant" "fitted_values"
Create new plotDat1_mean df to plot summary stats, meanAcc for each participant, plus the grand mean across participants.
As I look at this code again, noticing that this plot is calcuating means for each participantf rom their actual Acc values, not from fitted.values, which makes me wonder why the code above was pulling fitted.values go back and check original code
glimpse(plotDat1)
## Rows: 6,355
## Columns: 16
## $ Word <chr> "ambulance", "ambulance", "ambulance", "ambulance"…
## $ Subject <fct> FF, WR, AW, FF, MH, AW, AW, CB, JV, JK, FF, EW, CB…
## $ Trial <dbl> 156, 65, 156, 40, 40, 65, 132, 65, 132, 40, 65, 65…
## $ ItemNo <dbl> 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54…
## $ Session <dbl> 1, 4, 3, 3, 1, 1, 4, 3, 2, 3, 4, 3, 2, 3, 2, 3, 1,…
## $ Condition <chr> "Phonological", "Semantic", "Phonological", "Seman…
## $ Acc <dbl> 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1,…
## $ Cue <chr> "Cued", "Cued", "Cued", "NotCued", "NotCued", "Cue…
## $ ACC <fct> 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1,…
## $ CueCondition <fct> PhCue, SemCue, PhCue, UnAssoc, UnAssoc, SemCue, To…
## $ Length.Phon <dbl> 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,…
## $ FreqLogLemma <dbl> 0.9542, 0.9542, 0.9542, 0.9542, 0.9542, 0.9542, 0.…
## $ zLengthPh <dbl[,1]> <matrix[25 x 1]>
## $ zFreq <dbl[,1]> <matrix[25 x 1]>
## $ Participant <fct> G, E, B, G, I, B, B, A, D, F, G, C, A, J, H, D, E,…
## $ fitted_values <dbl> 0.29341517, 0.42874248, 0.78257076, 0.18254126, 0.…
# get meanacc by participant
plotDat1_mean <- plotDat1 %>%
group_by(Participant) %>%
summarise(MeanAcc = mean(Acc))
## `summarise()` ungrouping output (override with `.groups` argument)
glimpse(plotDat1_mean) # check structure, note 'Participant' is factor but not ordered
## Rows: 10
## Columns: 2
## $ Participant <fct> J, B, A, H, C, G, F, D, I, E
## $ MeanAcc <dbl> 0.1414286, 0.8526012, 0.8791367, 0.4575261, 0.834275…
plotDat1_mean$Participant <- as.character(plotDat1_mean$Participant) # make participant character so can add Mean label below
# in the 11th row, put "Mean" in the 1st col
# in the 11th row, in the 2nd col, calculate the mean Acc across all participants (grand mean)
plotDat1_mean[11,1] <- "Mean"
plotDat1_mean[11,2] <- mean(plotDat1$Acc)
plotDat1_mean$MeanAcc <- round(plotDat1_mean$MeanAcc, digits = 2) # round numbers for plotting
glimpse(plotDat1_mean)
## Rows: 11
## Columns: 2
## $ Participant <chr> "J", "B", "A", "H", "C", "G", "F", "D", "I", "E", "M…
## $ MeanAcc <dbl> 0.14, 0.85, 0.88, 0.46, 0.83, 0.42, 0.46, 0.83, 0.37…
Plot accuracy by subject and with grand mean, this code reproduces Figure 1a from the paper, mean acc for each participant with the grand mean.
plotDat1_mean %>% ggplot(aes(x=as.factor(Participant), y=MeanAcc)) +
geom_col() +
xlab("Participant and Grand Mean Accuracy") +
ylab("Mean Accuracy") +
ggtitle("1a) Raw participant accuracy and group/grand mean") +
expand_limits(y = c(0,1)) +
theme_bw() +
theme(text=element_text(size=14))
In the original code, there was a quick and dirty lattice plot here that visualises the random effects of participant, that is the extent to which average performance of each participant differs from the grand mean. Below is how to reproduce that in ggplot.
Use ranef() to pull the random effects from the model object. This function returns a list of 1 thing, so its easy to turn it into a dataframe.
randomeffects1 <- ranef(Model.lmer1) # ranef() gets a list of 1 with by subject intercepts
df_randomeffects1 <- randomeffects1[[1]] # the list only contains 1 thing so it is easier to work with as a df
glimpse(df_randomeffects1)
## Rows: 10
## Columns: 1
## $ `(Intercept)` <dbl> -2.3830710, 1.3388007, 1.5818498, -0.6859396, 1.17…
# the pp-nos are in the row names, so need to make them a column
df_randomeffects1 <- df_randomeffects1 %>%
rownames_to_column(var = "subject")
df_randomeffects1 <- df_randomeffects1 %>%
dplyr::rename(intercept = "(Intercept)")
names(df_randomeffects1)
## [1] "subject" "intercept"
Plot random effects for subjects, this plot reproduces Figure 1B in the paper.
df_randomeffects1 %>%
ggplot(aes(x=as.factor(subject), y=intercept)) + geom_point(stat="identity") +
ylab("Deviation from Grand Mean Intercept (0)") +
xlab("Participant") +
geom_hline(yintercept=0) +
ggtitle("1b) Participant accuracy as deviations from the grand mean") + theme_bw() +
theme(text=element_text(size=14))
While the random effect of participant (intercept) accounts for individual differences in performance that the participant came to the experiment with (in this case probably differences in the severity of their aphasia), it is also possible that there are participant differences in the extent to which each participant is affected by the manipulation. To account for the possibility that the magnitude of the effect of CueCondition is difference across participants, here we add a random effect of participant on SLOPE cue condition effects.
Note that here we specify the random effect of subjects on the slope of the CueCondition effect by specifying +
(0 + CueCondition|Subject)
NOTE: here we are asking the glmer() function to ignore potential deviations between subjects in the intercepts, i.e. we are excluding the random effect of subjects on intercepts.
The (0 + CueCondidion|Subject) term models how the effect of Cue Condition might have a different slope for different people i.e. impact different people to a greater or lesser extent.
Model.lmer2 <- glmer(ACC ~ CueCondition + zLengthPh + zFreq +
(0 + CueCondition|Subject),
data = NamingData, family = "binomial",
control = glmerControl(optimizer="bobyqa"))
## boundary (singular) fit: see ?isSingular
summary(Model.lmer2)
## Generalized linear mixed model fit by maximum likelihood (Laplace
## Approximation) [glmerMod]
## Family: binomial ( logit )
## Formula: ACC ~ CueCondition + zLengthPh + zFreq + (0 + CueCondition |
## Subject)
## Data: NamingData
## Control: glmerControl(optimizer = "bobyqa")
##
## AIC BIC logLik deviance df.resid
## 6601.2 6709.3 -3284.6 6569.2 6339
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -5.3847 -0.6704 0.3238 0.5808 3.9320
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## Subject CueConditionPhCue 1.608 1.268
## CueConditionSemCue 1.585 1.259 0.94
## CueConditionTone 1.715 1.310 0.99 0.97
## CueConditionUnAssoc 1.572 1.254 0.97 0.99 0.99
## Number of obs: 6355, groups: Subject, 10
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.95129 0.40647 2.340 0.019264 *
## CueConditionSemCue -0.62146 0.16964 -3.663 0.000249 ***
## CueConditionTone -0.58364 0.10478 -5.570 2.55e-08 ***
## CueConditionUnAssoc -0.61683 0.13411 -4.599 4.24e-06 ***
## zLengthPh -0.34961 0.03448 -10.141 < 2e-16 ***
## zFreq 0.21370 0.03442 6.209 5.34e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) CCndSC CCndtT CCndUA zLngtP
## CueCndtnSmC -0.231
## CueCondtnTn -0.011 0.647
## CCndtnUnAss -0.204 0.821 0.665
## zLengthPh -0.008 0.018 0.026 0.020
## zFreq 0.004 -0.008 -0.012 -0.012 0.407
## convergence code: 0
## boundary (singular) fit: see ?isSingular
tidymodel2 <- tidy(Model.lmer2)
Figure 2A plots main effect of CueCondition averaged across participants. Before when we were plotting random effect of participant, we used ranef() to pull the random effect of participant. Now we want the get the fixed effect of CueCondition, so use the allEffects() function. Sadly this function creates a super complex list, that you then need to index pieces out of.
The allEffects() function returns a “large efflist”. This code gets the fixed effects that are associated with CueCondition and turns them into a dataframe, adds lower and upper CIs and standard error, then adds labels to the values in the df to make the plot make sense.
library(plyr)
## -------------------------------------------------------------------------
## You have loaded plyr after dplyr - this is likely to cause problems.
## If you need functions from both plyr and dplyr, please load plyr first, then dplyr:
## library(plyr); library(dplyr)
## -------------------------------------------------------------------------
##
## Attaching package: 'plyr'
## The following object is masked from 'package:here':
##
## here
## The following objects are masked from 'package:dplyr':
##
## arrange, count, desc, failwith, id, mutate, rename, summarise,
## summarize
## The following object is masked from 'package:purrr':
##
## compact
effects <- allEffects(Model.lmer2) # pull allEffects into "large efflist"
fixed.effects <- effects$CueCondition$fit # pull effects of cue condition
fixed.effects<- as.data.frame(fixed.effects) # make a dataframe
fixed.effects[,2]<- effects$CueCondition$lower # add col 2 with lower CI
fixed.effects[,3]<- effects$CueCondition$upper # add col 3 with upper CI
fixed.effects[,4]<- effects$CueCondition$se # add col 4 with se
fixed.effects[,5] <-c("PhCue","SemCue","Tone","UnAssoc") # add col 5 with condition type
fixed.effects[,6]<-c("Shared onset","Associated word","Tone","Non-associated word") # add col 6 with condition labels
fixed.effects <- fixed.effects %>%
dplyr::rename(fit = V1, lower = V2, upper = V3, se = V4, cue = V5, label = V6) #rename V1-6
fixed.effects$label <- fct_relevel(fixed.effects$label, c("Shared onset", "Associated word", "Tone", "Non-associated word")) #fix levels of label
levels(fixed.effects$label)
## [1] "Shared onset" "Associated word" "Tone"
## [4] "Non-associated word"
Mean accuracy across four cue conditions averaged across participants, use fixed.effects df to plot fitted values w 95% confidence intervals
fixed.effects %>%
ggplot(aes(x = label, y = fit)) +
geom_col() +
geom_errorbar(aes(ymin=lower,ymax=upper),
size = .3,
width=.2,
position=position_dodge(.9)) +
ylim(-0.5, 2) +
theme_bw() +
theme(text=element_text(size=14)) +
xlab("Cue Type") +
ylab("Accuracy (model fit)") +
labs(title = "2a) Average accuracy across four Cue Type conditions")
Figure 2B illustrates how the effect of CueCondition differs for each participant but plotting fitted values from the model using a boxplot and facet_grid(). This plot illustrates the individual subject level effect of cue condition
The plotDat2 df pulls fitted.values (from the new model lmer2 which includes random slopes) and adds them to the original NamingData.
The original code did this with lattice- can I get the same plot with ggplot?
plotDat2 <- na.omit(NamingData) #remove NAs from original data so aligns with model values
plotDat2 <- plotDat2 %>%
mutate(fitted_values = fitted.values(Model.lmer2))
glimpse(plotDat2)
## Rows: 6,355
## Columns: 16
## $ Word <chr> "ambulance", "ambulance", "ambulance", "ambulance"…
## $ Subject <fct> FF, WR, AW, FF, MH, AW, AW, CB, JV, JK, FF, EW, CB…
## $ Trial <dbl> 156, 65, 156, 40, 40, 65, 132, 65, 132, 40, 65, 65…
## $ ItemNo <dbl> 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54…
## $ Session <dbl> 1, 4, 3, 3, 1, 1, 4, 3, 2, 3, 4, 3, 2, 3, 2, 3, 1,…
## $ Condition <chr> "Phonological", "Semantic", "Phonological", "Seman…
## $ Acc <dbl> 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1,…
## $ Cue <chr> "Cued", "Cued", "Cued", "NotCued", "NotCued", "Cue…
## $ ACC <fct> 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1,…
## $ CueCondition <fct> PhCue, SemCue, PhCue, UnAssoc, UnAssoc, SemCue, To…
## $ Length.Phon <dbl> 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,…
## $ FreqLogLemma <dbl> 0.9542, 0.9542, 0.9542, 0.9542, 0.9542, 0.9542, 0.…
## $ zLengthPh <dbl[,1]> <matrix[25 x 1]>
## $ zFreq <dbl[,1]> <matrix[25 x 1]>
## $ Participant <fct> G, E, B, G, I, B, B, A, D, F, G, C, A, J, H, D, E,…
## $ fitted_values <dbl> 0.30829898, 0.46656863, 0.80474230, 0.17829396, 0.…
plotDat2$Participant <- fct_relevel(plotDat2$Participant, c("A" ,"B", "C", "D", "E", "F", "G", "H", "I", "J"))
levels(plotDat2$Participant)
## [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J"
plotDat2 %>%
ggplot(aes(x = CueCondition, y = fitted_values)) +
geom_boxplot() +
facet_wrap(~ Participant) +
ylab("Accuracy (model fit)") +
xlab("Cue Type") +
labs(title = "2b) Effect of Cue Type by Participant")
# WOOT
Figure 2c is a little like Figure 1b, except rather than just the random effect of participant, this is the random slope effect of cue type. For each participant, it is plotting the effect of CueCondition as a deviation from the condition mean, showing that while across the board the Cues with shared onset result in higher perforamnce than the other cues (Figure2a) this effet is not the same across all participants.
#Extract random effect variances
randomeffects2 <- ranef(Model.lmer2) #list of 1, 40 obs of 5 variables
str(randomeffects2) #look at what has been extracted
## List of 1
## $ Subject:'data.frame': 10 obs. of 4 variables:
## ..$ CueConditionPhCue : num [1:10] -2.459 1.486 1.281 -0.675 1.266 ...
## ..$ CueConditionSemCue : num [1:10] -2.235 1.195 1.789 -0.681 1.074 ...
## ..$ CueConditionTone : num [1:10] -2.501 1.454 1.535 -0.712 1.259 ...
## ..$ CueConditionUnAssoc: num [1:10] -2.323 1.296 1.642 -0.685 1.143 ...
## ..- attr(*, "postVar")= num [1:4, 1:4, 1:10] 0.02446 -0.00254 0.01569 0.00596 -0.00254 ...
## - attr(*, "class")= chr "ranef.mer"
df_randomeffects2 <- as.data.frame(randomeffects2)
df_randomeffects2 <- df_randomeffects2 %>%
dplyr::rename(participant = grp) %>%
dplyr::rename(deviation = condval) %>%
mutate(participant = recode(participant,
"CB" = "A", "AW" = "B","EW" = "C",
"JV" = "D","WR" = "E", "FF" = "G",
"JK" = "F", "DH" = "H","MH" = "I",
"AM" = "J")) %>%
mutate(cuetype = case_when(term == "CueConditionPhCue" ~ "Shared Onset",
term == "CueConditionSemCue" ~ "Associated Word",
term == "CueConditionTone" ~ "Tone",
term == "CueConditionUnAssoc" ~ "Non associated word"))
glimpse(df_randomeffects2)
## Rows: 40
## Columns: 6
## $ grpvar <chr> "Subject", "Subject", "Subject", "Subject", "Subject…
## $ term <fct> CueConditionPhCue, CueConditionPhCue, CueConditionPh…
## $ participant <fct> J, B, A, H, C, G, F, D, I, E, J, B, A, H, C, G, F, D…
## $ deviation <dbl> -2.4590051, 1.4858826, 1.2811261, -0.6754017, 1.2657…
## $ condsd <dbl> 0.15638615, 0.18284801, 0.17790477, 0.12606286, 0.18…
## $ cuetype <chr> "Shared Onset", "Shared Onset", "Shared Onset", "Sha…
df_randomeffects2$participant <- fct_relevel(df_randomeffects2$participant, c("A" ,"B", "C", "D", "E", "F", "G", "H", "I", "J"))
df_randomeffects2$cuetype <- fct_relevel(df_randomeffects2$cuetype, c("Shared Onset" ,"Associated Word", "Tone", "Non associated word"))
levels(df_randomeffects2$participant)
## [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J"
levels(df_randomeffects2$cuetype)
## [1] "Shared Onset" "Associated Word" "Tone"
## [4] "Non associated word"
#Quick plot of random effects (will order participants by deviation, and panel by Cue Type)
lattice::dotplot(ranef(Model.lmer2))
## $Subject
df_randomeffects2 %>%
ggplot(aes(x = participant, y = deviation)) +
geom_point() +
ylab("Deviation from Condition Mean (0)") + xlab("Participant") +
geom_hline(yintercept=0) +
facet_grid(. ~ cuetype) +
ggtitle("2c) Effect of Cue Type by Participant, as deviations from Condition Mean") +
theme_bw() + theme(text=element_text(size=14))
I think this alternative to Figure 2c by participant, which swaps x axis and facet to plot the effect of cueCond for each participant, makes it easier to see the random slopes of cue condition. In the average plot in Figure 2A, shared onset has a clear advantage over other conditions, but in individual participant plots that is not always the case. The pattern seen in averages (that Shared onset performance is better than the other 3 conditions) only holds in half the participants.
df_randomeffects2 %>%
ggplot(aes(x = cuetype, y = deviation)) +
geom_point() +
ylab("Deviation from Condition Mean (0)") + xlab("Cue Condition") +
scale_x_discrete(labels=c("Shared Onset" = "S", "Associated Word" = "A",
"Tone" = "T", "Non associated word" = "N")) +
geom_hline(yintercept=0) +
facet_grid(. ~ participant) +
ggtitle("2c alt) Effect of Cue Type by Participant, as deviations from Condition Mean") +
theme_bw() +
theme(text=element_text(size=14))
Filtering tidymodel2 output to only display sd for random slopes CueCondition aka varince in slopes associated with each cue type. Important things to note, “on average within each condition, participants vary around the mean by ~ 1.3 units”.
tidymodel2 %>%
filter(effect == "ran_pars") %>%
filter(term %in% c("sd__CueConditionPhCue","sd__CueConditionSemCue", "sd__CueConditionTone", "sd__CueConditionUnAssoc")) %>%
kableExtra::kable()
| effect | group | term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|---|---|
| ran_pars | Subject | sd__CueConditionPhCue | 1.267931 | NA | NA | NA |
| ran_pars | Subject | sd__CueConditionSemCue | 1.258864 | NA | NA | NA |
| ran_pars | Subject | sd__CueConditionTone | 1.309690 | NA | NA | NA |
| ran_pars | Subject | sd__CueConditionUnAssoc | 1.253928 | NA | NA | NA |
This model includes BOTH intercepts for subject (accounting for the overall differences in performance between subjects - as in model 1) AND random slopes for Length, that is the possiblity that the effect of word length on participants naming differs across participants. This model includes fixed effects of CueCondition, Length, and Frequency, as well as random intercepts for Subject and random slopes for the effect of Length across subjects.
Model.lmer3 = glmer(ACC ~ CueCondition + zLengthPh + zFreq + (1+zLengthPh|Subject),
data = NamingData, family = "binomial",
control = glmerControl(optimizer="bobyqa"))
summary(Model.lmer3)
## Generalized linear mixed model fit by maximum likelihood (Laplace
## Approximation) [glmerMod]
## Family: binomial ( logit )
## Formula:
## ACC ~ CueCondition + zLengthPh + zFreq + (1 + zLengthPh | Subject)
## Data: NamingData
## Control: glmerControl(optimizer = "bobyqa")
##
## AIC BIC logLik deviance df.resid
## 6595.9 6656.7 -3288.9 6577.9 6346
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.7441 -0.6793 0.3361 0.5689 4.2515
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## Subject (Intercept) 1.58402 1.2586
## zLengthPh 0.01816 0.1348 0.34
## Number of obs: 6355, groups: Subject, 10
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.94763 0.40307 2.351 0.0187 *
## CueConditionSemCue -0.62891 0.08686 -7.241 4.46e-13 ***
## CueConditionTone -0.59569 0.08742 -6.814 9.46e-12 ***
## CueConditionUnAssoc -0.62221 0.08789 -7.080 1.44e-12 ***
## zLengthPh -0.34817 0.05531 -6.295 3.07e-10 ***
## zFreq 0.21124 0.03432 6.154 7.55e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) CCndSC CCndtT CCndUA zLngtP
## CueCndtnSmC -0.114
## CueCondtnTn -0.114 0.526
## CCndtnUnAss -0.113 0.524 0.521
## zLengthPh 0.256 0.019 0.018 0.017
## zFreq 0.005 -0.018 -0.016 -0.020 0.254
tidymodel3 <- tidy(Model.lmer3)
Lattice gets quick plot of Random effects (both intercept = overall performance and Length). Eyeballing it looks like intercepts much greater difference across subjects that the effect of length (all deviations hover around 0)
lattice::dotplot(ranef(Model.lmer3))
## $Subject
## REPRODUCE FIGURE 3
Figure 3 has 3 parts. - 3a illustrates the effet of length on naming performance, averaged across the group (pp are less accurate at naming longer words) - 3b illustrates how that main effect of length, differs in slope for each participant (there are some pp whose slops are alot worse than the others, but across the group, not much variability in the effect of length). - 3c illustrates the random effects, first intercept, which is the differences across the group in overall perforamnce (ie the effect of aphasia severity as in Figure 1B), then slope, which is the difference in the effect of length across participants (ie does length affect some participants more than others- not really all hovering around the mean)
plotDat3 <- na.omit(NamingData) #remove NAs from original data so aligns with model values
plotDat3 <- plotDat3 %>%
mutate(fitted_values = fitted.values(Model.lmer3))
names(plotDat3)
## [1] "Word" "Subject" "Trial" "ItemNo"
## [5] "Session" "Condition" "Acc" "Cue"
## [9] "ACC" "CueCondition" "Length.Phon" "FreqLogLemma"
## [13] "zLengthPh" "zFreq" "Participant" "fitted_values"
plotDat3$Participant <-as.character(plotDat3$Participant) #convert participant labels to text
# also need version of participant that is factor
plotDat3 %>%
mutate(PartCodeFact = fct_relevel(plotDat3$Participant, c("A" ,"B", "C", "D", "E", "F", "G", "H", "I", "J")))
## # A tibble: 6,355 x 17
## Word Subject Trial ItemNo Session Condition Acc Cue ACC
## <chr> <fct> <dbl> <dbl> <dbl> <chr> <dbl> <chr> <fct>
## 1 ambu… FF 156 54 1 Phonolog… 0 Cued 0
## 2 ambu… WR 65 54 4 Semantic 1 Cued 1
## 3 ambu… AW 156 54 3 Phonolog… 1 Cued 1
## 4 ambu… FF 40 54 3 Semantic 0 NotC… 0
## 5 ambu… MH 40 54 1 Semantic 0 NotC… 0
## 6 ambu… AW 65 54 1 Semantic 1 Cued 1
## 7 ambu… AW 132 54 4 Phonolog… 1 NotC… 1
## 8 ambu… CB 65 54 3 Semantic 1 Cued 1
## 9 ambu… JV 132 54 2 Phonolog… 1 NotC… 1
## 10 ambu… JK 40 54 3 Semantic 0 NotC… 0
## # … with 6,345 more rows, and 8 more variables: CueCondition <fct>,
## # Length.Phon <dbl>, FreqLogLemma <dbl>, zLengthPh[,1] <dbl>,
## # zFreq[,1] <dbl>, Participant <chr>, fitted_values <dbl>,
## # PartCodeFact <fct>
Plot average slope for effect of length
plotDat3 %>%
ggplot(aes(x=zLengthPh, y=fitted_values)) +
geom_smooth(method = "glm", colour = "black") +
xlab("Length in Phonemes (z score)") +
ylab("Accuracy (model fit)") +
ggtitle("3a) Average (group) effect of Length in Phonemes") +
ylim(0,1) +
theme_bw() +
theme(text=element_text(size=14))
## `geom_smooth()` using formula 'y ~ x'
Plot individual slopes for effect of length, this linetype = Participant trick is something I haven’t seen before (clever)
plotDat3 %>%
ggplot(aes(x=zLengthPh, y=fitted_values, linetype=Participant)) +
geom_smooth(se = FALSE, method = "glm", colour = "black") +
xlab("Length in Phonemes (z score)") +
ylab("Accuracy (model fit)") +
ggtitle("3b) Effect of Length in Phonemes by Participant") +
ylim(0,1) +
theme(text=element_text(size=14)) +
scale_colour_grey() + theme_bw()
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 23 rows containing missing values (geom_smooth).
As for Figure 2c, pull random effects slope for Length.
#Extract random effect variances
randomeffects3 <- ranef(Model.lmer3) #list of 1, 10 rows 2 col
str(randomeffects3) #look at what has been extracted
## List of 1
## $ Subject:'data.frame': 10 obs. of 2 variables:
## ..$ (Intercept): num [1:10] -2.396 1.326 1.58 -0.672 1.165 ...
## ..$ zLengthPh : num [1:10] -0.0779 0.0962 0.0326 0.0503 0.1077 ...
## ..- attr(*, "postVar")= num [1:2, 1:2, 1:10] 0.01297 0.00295 0.00295 0.00882 0.01187 ...
## - attr(*, "class")= chr "ranef.mer"
df_randomeffects3 <- randomeffects3[[1]]
# the pp-nos are in the row names, so need to make them a column , rename intercept, record Subject into A:J
df_randomeffects3 <- df_randomeffects3 %>%
rownames_to_column(var = "subject") %>%
dplyr::rename(intercept = "(Intercept)") %>%
mutate(participant= recode(subject,
"CB" = "A", "AW" = "B","EW" = "C",
"JV" = "D","WR" = "E", "FF" = "G",
"JK" = "F", "DH" = "H","MH" = "I",
"AM" = "J"))
# quick plot of random effects (will order participants by deviation, and using original participants codes from model)
lattice::dotplot(ranef(Model.lmer3))
## $Subject
NOW.. Fgure 3c this is a little different to the other random effects plots (1b,2c), because we have more than 1 random effect that we want to plot. In Figure 1b we were just plotting the random intercept for participants and in Figure 2c, we were plotting the random slope for cueCondition alone (WITHOUT the random intercept for participants). This is the first time we are wanting to plot a random intercept and a random slope at the same time.
So, we need to make the data long so we can facet by variable (intercept vs slope) and plot the deviation (intercept and slope for length) comparatively.
df_randomeffects3_long <- df_randomeffects3 %>%
pivot_longer(names_to = "variable", values_to = "deviation", 2:3)
df_randomeffects3_long %>%
ggplot(aes(x=as.factor(participant), y=deviation)) +
geom_point(stat="identity") +
facet_grid(~ variable) +
ylab("Deviation from Mean (0)") + xlab("Participant") +
ggtitle("3c) Participant intercepts and slopes for Length") +
geom_hline(yintercept=0) +
theme_bw() + theme(text=element_text(size=14))
Filtering tidymodel3 output to include only random effects. This model includes both random intercepts for participants (estiamte = 1.259), and random slopes for length. Variance for random effect of length = 0.134, telling us that participants vary much much more in their overall perforamnce, than in the extent to which the length of words makes a difference to their performance.
tidymodel3 %>%
filter(effect == "ran_pars") %>%
kableExtra::kable()
| effect | group | term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|---|---|
| ran_pars | Subject | sd__(Intercept) | 1.2585801 | NA | NA | NA |
| ran_pars | Subject | cor__(Intercept).zLengthPh | 0.3403571 | NA | NA | NA |
| ran_pars | Subject | sd__zLengthPh | 0.1347630 | NA | NA | NA |
Model 4 is the same as model 3, except instead of including random intercepts for participant AND slope for Length, we are including intercepts for participant AND slope for Frequency.
NOTE- probably in the real analysis they included both.
Model.lmer4 = glmer(ACC ~ CueCondition + zLengthPh + zFreq + (1+zFreq|Subject),
data = NamingData, family = "binomial",
control = glmerControl(optimizer="bobyqa"))
summary(Model.lmer4)
## Generalized linear mixed model fit by maximum likelihood (Laplace
## Approximation) [glmerMod]
## Family: binomial ( logit )
## Formula: ACC ~ CueCondition + zLengthPh + zFreq + (1 + zFreq | Subject)
## Data: NamingData
## Control: glmerControl(optimizer = "bobyqa")
##
## AIC BIC logLik deviance df.resid
## 6577.3 6638.1 -3279.6 6559.3 6346
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.9959 -0.6727 0.3477 0.5622 5.7565
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## Subject (Intercept) 1.60698 1.2677
## zFreq 0.03981 0.1995 -0.84
## Number of obs: 6355, groups: Subject, 10
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.93782 0.40592 2.310 0.0209 *
## CueConditionSemCue -0.63507 0.08714 -7.288 3.14e-13 ***
## CueConditionTone -0.59970 0.08766 -6.841 7.88e-12 ***
## CueConditionUnAssoc -0.62466 0.08813 -7.088 1.36e-12 ***
## zLengthPh -0.36034 0.03513 -10.256 < 2e-16 ***
## zFreq 0.17644 0.07238 2.438 0.0148 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) CCndSC CCndtT CCndUA zLngtP
## CueCndtnSmC -0.114
## CueCondtnTn -0.113 0.528
## CCndtnUnAss -0.113 0.525 0.522
## zLengthPh -0.006 0.031 0.029 0.029
## zFreq -0.726 -0.008 -0.007 -0.009 0.205
tidymodel4 <- tidy(Model.lmer4)
# quick plot of random effects from model
lattice::dotplot(ranef(Model.lmer4))
## $Subject
plotDat4 <- na.omit(NamingData) #remove NAs from original data so aligns with model values
plotDat4 <- plotDat4 %>%
mutate(fitted_values = fitted.values(Model.lmer3))
# participant labels are factor, need version that is text
plotDat4 <- plotDat4 %>%
mutate(PartCodeFact = fct_relevel(plotDat4$Participant, c("A" ,"B", "C", "D", "E", "F", "G", "H", "I", "J"))) # partcodefact is factos
plotDat4$Participant <-as.character(plotDat4$Participant) #convert participant labels to text
glimpse(plotDat4)
## Rows: 6,355
## Columns: 17
## $ Word <chr> "ambulance", "ambulance", "ambulance", "ambulance"…
## $ Subject <fct> FF, WR, AW, FF, MH, AW, AW, CB, JV, JK, FF, EW, CB…
## $ Trial <dbl> 156, 65, 156, 40, 40, 65, 132, 65, 132, 40, 65, 65…
## $ ItemNo <dbl> 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54…
## $ Session <dbl> 1, 4, 3, 3, 1, 1, 4, 3, 2, 3, 4, 3, 2, 3, 2, 3, 1,…
## $ Condition <chr> "Phonological", "Semantic", "Phonological", "Seman…
## $ Acc <dbl> 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1,…
## $ Cue <chr> "Cued", "Cued", "Cued", "NotCued", "NotCued", "Cue…
## $ ACC <fct> 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1,…
## $ CueCondition <fct> PhCue, SemCue, PhCue, UnAssoc, UnAssoc, SemCue, To…
## $ Length.Phon <dbl> 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,…
## $ FreqLogLemma <dbl> 0.9542, 0.9542, 0.9542, 0.9542, 0.9542, 0.9542, 0.…
## $ zLengthPh <dbl[,1]> <matrix[25 x 1]>
## $ zFreq <dbl[,1]> <matrix[25 x 1]>
## $ Participant <chr> "G", "E", "B", "G", "I", "B", "B", "A", "D", "F", …
## $ fitted_values <dbl> 0.16892833, 0.44141248, 0.81807048, 0.09837124, 0.…
## $ PartCodeFact <fct> G, E, B, G, I, B, B, A, D, F, G, C, A, J, H, D, E,…
Plot average slope for effect of freq, across participants and conditions, overall the less frequent a word is the harder it is to name.
plotDat4 %>%
ggplot(aes(x=zFreq, y=fitted_values)) +
geom_smooth(method = "glm", colour = "black") +
xlab("Frequency (z score)") +
ylab("Accuracy (model fit)") +
ggtitle("4a) Average (group) effect of Freq") +
ylim(0,1) +
theme_bw() +
theme(text=element_text(size=14))
## `geom_smooth()` using formula 'y ~ x'
Plot individual slopes for effect of length, showing that while overall infreqent words are harder to name, the extent to which frequency impacts naming perofmrance differs across participants. The slope of the effect differs across participants.
plotDat4 %>%
ggplot(aes(x=zFreq, y=fitted_values, linetype=Participant)) +
geom_smooth(se = FALSE, method = "glm", colour = "black") +
xlab("Frequency(z score)") +
ylab("Accuracy (model fit)") +
ggtitle("4b) Effect of Frequnecy by Participant") +
ylim(0,1) +
theme(text=element_text(size=14)) +
scale_colour_grey() + theme_bw()
## `geom_smooth()` using formula 'y ~ x'
# quick plot of random effects (will order participants by deviation, and using original participants codes from model)
lattice::dotplot(ranef(Model.lmer4))
## $Subject
#### plot 4c set up (df_randomeffects3)
Plot both random intercepts and slopes for Length in phonemes
#Extract random effect variances
randomeffects4 <- ranef(Model.lmer4) #list of 1, 10 rows 2 col
str(randomeffects4) #look at what has been extracted
## List of 1
## $ Subject:'data.frame': 10 obs. of 2 variables:
## ..$ (Intercept): num [1:10] -2.456 1.332 1.569 -0.658 1.189 ...
## ..$ zFreq : num [1:10] 0.315 -0.143 -0.246 0.014 -0.257 ...
## ..- attr(*, "postVar")= num [1:2, 1:2, 1:10] 0.01403 -0.00352 -0.00352 0.00646 0.01145 ...
## - attr(*, "class")= chr "ranef.mer"
df_randomeffects4 <- randomeffects4[[1]]
# the pp-nos are in the row names, so need to make them a column, rename intercept, recode participant values
df_randomeffects4 <- df_randomeffects4 %>%
rownames_to_column(var = "subject") %>%
dplyr::rename(intercept = "(Intercept)") %>%
mutate(participant= recode(subject,
"CB" = "A", "AW" = "B","EW" = "C",
"JV" = "D","WR" = "E", "FF" = "G",
"JK" = "F", "DH" = "H","MH" = "I",
"AM" = "J"))
# As for plot 3, data needs to be long in order to plot BOTH intercepts and slopes.
df_randomeffects4_long <- df_randomeffects4 %>%
pivot_longer(names_to = "variable", values_to = "deviation", 2:3)
Plot both random intercepts (overall differences in performance as a function of participant) and slopes for frequency (the extext to which the effect of frequency differs across participants).
Slope for frequency effect a bit more systematic here. Looks like the worse the aphasia, the bigger the effect of freq?
df_randomeffects4_long %>%
ggplot(aes(x=as.factor(participant), y=deviation)) +
geom_point(stat="identity") +
facet_grid(~ variable) +
ylab("Deviation from Mean (0)") + xlab("Participant") +
ggtitle("4c) Participant intercepts and slopes for Freq") +
geom_hline(yintercept=0) +
theme_bw() + theme(text=element_text(size=14))
Filtering tidymodel4 output to include only random effects. This model includes both random intercepts for participants (estiamte = 1.267), and random slopes for frequency. Variance for random effect of length = 0.200, also correlation between intercept and freq effect is high 0.84 telling us that participants higher intercepts (better accuracy overall) are less impated by frequency.
tidymodel4 %>%
filter(effect == "ran_pars") %>%
kableExtra::kable()
| effect | group | term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|---|---|
| ran_pars | Subject | sd__(Intercept) | 1.2676658 | NA | NA | NA |
| ran_pars | Subject | cor__(Intercept).zFreq | -0.8432434 | NA | NA | NA |
| ran_pars | Subject | sd__zFreq | 0.1995297 | NA | NA | NA |