Acknowledgements: This tutorial is based on the tutorial by Gabriela K Hajduk
Terms:
Fixed effect: main predictor (e.g., body length, test)
Random effect: source of systematic noise (e.g., mountain range, individual dragon) that is not important for our hypothesis
Levels: different values of a predictor (e.g., trial 1 vs 2)
Interaction: combinations of 2 predictors making different predictions for the dependent variable
Hypothesis 1: higher body length –> higher test score
lm.bl <- lm(score ~ bodyLength, data = dragons)
summary(lm.bl)
##
## Call:
## lm(formula = score ~ bodyLength, data = dragons)
##
## Residuals:
## Min 1Q Median 3Q Max
## -101.151 -30.292 9.781 15.755 139.128
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 20.92057 14.65522 1.428 0.154
## bodyLength 0.34729 0.07256 4.786 1.88e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 44.59 on 1438 degrees of freedom
## Multiple R-squared: 0.01568, Adjusted R-squared: 0.01499
## F-statistic: 22.91 on 1 and 1438 DF, p-value: 1.876e-06
How do we interpret the data?
ggplot(data = dragons,
mapping = aes(x = bodyLength, y = score)) +
geom_smooth(method='lm')
## `geom_smooth()` using formula = 'y ~ x'
ggplot(data = dragons,
mapping = aes(x = bodyLength, y = score)) +
geom_smooth(method='lm') +
geom_abline(intercept = 20.92057, slope = 0.34729,
color = "red")
## `geom_smooth()` using formula = 'y ~ x'
Reporting results:
Body length is a significant predictor of test scores, with dragons that have longer bodies scoring higher (\(\beta\) = 0.35, 95% \(CI\) [0.20, 0.49], \(p < .001\)).
Some useful functions:
coef(lm.bl)
: get only beta estimates
coef(summary(lm.bl))
: get beta estimates, p-value,
etc.
report(lm.bl)
: get a (lengthy) APA report.
confint
: get CI (default is 95%). Or you can report SE.
Use confint.merMod
for lmer/glmer.
force_decimals
and force_p
: additional
aesthetics stuff, automatically format values to APA.
When we facet out mountainRange and test, we saw that the effect is different between tests 1 and 2. We also saw that the effect is different between mountain ranges.
ggplot(data = dragons,
mapping = aes(x = bodyLength, y = score, colour = mountainRange)) +
geom_point(size = 2, alpha = .3) +
geom_smooth(method='lm') +
facet_grid(test~mountainRange) +
theme_classic() +
theme(legend.position = "none")
## `geom_smooth()` using formula = 'y ~ x'
Relationship between body length and score might differ based on which mountain range a dragon comes from. And the overall effect we found of body length on score might be driven by some mountain ranges where the effect is very strong.
Relationship between body length and score might also differ based on the individual dragon (i.e., some dragons are better at all tests.)
Syntax: (1|random effect)
lmer.base <- lmer(score ~ bodyLength + (1|mountainRange) + (1|pid), data = dragons)
## boundary (singular) fit: see help('isSingular')
summary(lmer.base)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: score ~ bodyLength + (1 | mountainRange) + (1 | pid)
## Data: dragons
##
## REML criterion at convergence: 15021.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.2751 -0.6674 0.2031 0.3593 3.1240
##
## Random effects:
## Groups Name Variance Std.Dev.
## pid (Intercept) 0.000 0.000
## mountainRange (Intercept) 8.696 2.949
## Residual 1981.747 44.517
## Number of obs: 1440, groups: pid, 480; mountainRange, 8
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 18.47702 17.72650 18.47986 1.042 0.310700
## bodyLength 0.35942 0.08771 18.68057 4.098 0.000631 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## bodyLength -0.996
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
Anova(lmer.base, type = 3)
## Analysis of Deviance Table (Type III Wald chisquare tests)
##
## Response: score
## Chisq Df Pr(>Chisq)
## (Intercept) 1.0865 1 0.2973
## bodyLength 16.7937 1 4.167e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
coef(summary(lmer.base))
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 18.4770218 17.72650380 18.47986 1.042339 0.3106996953
## bodyLength 0.3594248 0.08770712 18.68057 4.098011 0.0006313981
Yes – after adding the random effect of mountain range, there is still an effect of body length on score.
Note: Singular fit
Not a big deal for this demonstration, but if you see this in your actual analysis, it’s a sign that your predictor structure is too complex and is overfitting the data. Consider dropping random effects.
Note: Common random effects:
Individual participant: if a participant sees multiple trials, there will be dependencies between their responses – they might intrinsically be better or worse at the task.
Individual trial: a trial might intrinsically be easier or more difficult for all participants.
I always include both of these random effects in my analysis. In this demonstration, we will be including random participant effect, but we don’t have multiple trials so we won’t be including that.
What is Anova
doing?
It’s running a Wald chi-squared test – p-values will be similar to a likelihood ratio test comparing against a uniformly informative model (no fixed effects, no random effects!) But these are different tests.
lm.uni <- lm(score ~ 1 , data = dragons)
#summary(lm.uni)
#Anova(lm.uni, type = 3)
anova(lmer.base, lm.uni, type = 3, refit = FALSE)
## Warning in anova.merMod(lmer.base, lm.uni, type = 3, refit = FALSE): some
## models fit with REML = TRUE, some not
## Data: dragons
## Models:
## lm.uni: score ~ 1
## lmer.base: score ~ bodyLength + (1 | mountainRange) + (1 | pid)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## lm.uni 2 15048 15059 -7522.2 15044
## lmer.base 5 15032 15058 -7510.8 15022 22.842 3 4.357e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Let’s say we have a second hypothesis:
Hypothesis 2: higher body length –> higher test score, scores for test 2 > test 1
Checking the effect of test:
lmer.test <- lmer(score ~ test + (1|mountainRange) + (1|pid), data = dragons )
## boundary (singular) fit: see help('isSingular')
summary(lmer.test)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: score ~ test + (1 | mountainRange) + (1 | pid)
## Data: dragons
##
## REML criterion at convergence: 14978.1
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.0958 -0.5653 0.0019 0.4473 3.4375
##
## Random effects:
## Groups Name Variance Std.Dev.
## pid (Intercept) 0.00 0.000
## mountainRange (Intercept) 29.29 5.412
## Residual 1931.43 43.948
## Number of obs: 1440, groups: pid, 480; mountainRange, 8
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 101.885 2.772 16.498 36.754 < 2e-16 ***
## testtest_2 -14.290 2.837 1430.000 -5.037 5.32e-07 ***
## testtest_3 -18.859 2.837 1430.000 -6.648 4.22e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) tstt_2
## testtest_2 -0.512
## testtest_3 -0.512 0.500
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
Anova(lmer.test, type = 3)
## Analysis of Deviance Table (Type III Wald chisquare tests)
##
## Response: score
## Chisq Df Pr(>Chisq)
## (Intercept) 1350.838 1 < 2.2e-16 ***
## test 48.108 2 3.577e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Add both fixed effects:
lmer.bl_test <- lmer(score ~ bodyLength + test + (1|mountainRange) + (1|pid), data = dragons)
## boundary (singular) fit: see help('isSingular')
summary(lmer.bl_test)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: score ~ bodyLength + test + (1 | mountainRange) + (1 | pid)
## Data: dragons
##
## REML criterion at convergence: 14966.4
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.1336 -0.5674 0.0128 0.4442 3.3524
##
## Random effects:
## Groups Name Variance Std.Dev.
## pid (Intercept) 5.513e-11 7.425e-06
## mountainRange (Intercept) 9.031e+00 3.005e+00
## Residual 1.920e+03 4.381e+01
## Number of obs: 1440, groups: pid, 480; mountainRange, 8
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 29.38397 17.68361 19.29598 1.662 0.112744
## bodyLength 0.36013 0.08712 19.18554 4.134 0.000554 ***
## testtest_2 -14.29003 2.82808 1429.68868 -5.053 4.91e-07 ***
## testtest_3 -18.85880 2.82808 1429.68868 -6.668 3.68e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) bdyLng tstt_2
## bodyLength -0.992
## testtest_2 -0.080 0.000
## testtest_3 -0.080 0.000 0.500
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
lmer.anova.bl_test <- Anova(lmer.bl_test, type = 3)
lmer.anova.bl_test
## Analysis of Deviance Table (Type III Wald chisquare tests)
##
## Response: score
## Chisq Df Pr(>Chisq)
## (Intercept) 2.7611 1 0.09658 .
## bodyLength 17.0893 1 3.566e-05 ***
## test 48.4063 2 3.081e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
lmer.compare.bl_test <- anova(lmer.bl_test, lmer.base, type = 3, refit = FALSE)
lmer.compare.bl_test
## Data: dragons
## Models:
## lmer.base: score ~ bodyLength + (1 | mountainRange) + (1 | pid)
## lmer.bl_test: score ~ bodyLength + test + (1 | mountainRange) + (1 | pid)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## lmer.base 5 15032 15058 -7510.8 15022
## lmer.bl_test 7 14980 15017 -7483.2 14966 55.217 2 1.023e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
There is an effect of body length and test (after controlling for random effects). And adding test as a fixed effect results in a model that better explains the variance in the data.
How do we interpret the ‘test’ effect? Use emmeans
.
lmer.contr.bl_test <- lmer.bl_test %>%
emmeans(specs = pairwise ~ test,
type = "response") %>%
pluck("contrasts") %>%
summary()
lmer.contr.bl_test
## contrast estimate SE df t.ratio p.value
## test_1 - test_2 14.29 2.83 958 5.053 <.0001
## test_1 - test_3 18.86 2.83 958 6.668 <.0001
## test_2 - test_3 4.57 2.83 958 1.616 0.2395
##
## Degrees-of-freedom method: kenward-roger
## P value adjustment: tukey method for comparing a family of 3 estimates
Scores in test 1 is significantly higher than scores in test 2, and also significantly higher than test 3. But there is no significant differences between test 1 and 3.
Reporting results:
A model that includes both body length and test as predictors showed that they both are significant predictors of test scores, with dragons that have longer bodies scoring higher (\(\beta\) = 0.36, 95% \(CI\) [0.19, 0.53], \(p < .001\)). There is an effect of test (\(\chi^2\)(2) = 48.41, \(p < .001\)).
Post-hoc pairwise comparisons between the three test types with Tukey corrections showed that scores in Test 1 is significantly higher than Test 2 (\(t\) = 5.05, \(p < .001\)) and Test 3 (\(t\) = 6.67, \(p < .001\)). Scores in Test 2 was not significantly different from Test 3 (\(t\) = 1.62, \(p = .240\)).
A likelihood ratio test shows that the model with both body length and test as predictors explained significantly more variation in the observed data compared to the base model with only body length as predictor (\(\chi^2\)(2) = 55.22, \(p < .001\)).
Note: ANOVA:
Anova()
or car::Anova()
: compare the full
model with a uniformly informative model (all predictors removed) –>
this allows us to get the p-value to see whether a predictor (e.g., body
length) is significant.
anova()
: compare 2 models –> whether 1 model explains
significantly more variance in the data compared to the other.
See also: https://www.bookdown.org/rwnahhas/RMPH/mlr-distinctions.html
So now we know that both body length and test is a significant predictor of the data. Is there an interaction effect – does body length predict scores in a similar manner across both tests?
Syntax: fixed_eff1 * fixed_eff2
includes both interaction and main effects.
fixed_eff1 : fixed_eff2
to suppress main effects and only
test interaction.
lmer.bl_test_int <- lmer(score ~ bodyLength * test + (1|mountainRange) + (1|pid),
data = dragons)
## boundary (singular) fit: see help('isSingular')
summary(lmer.bl_test_int)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: score ~ bodyLength * test + (1 | mountainRange) + (1 | pid)
## Data: dragons
##
## REML criterion at convergence: 14960.3
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.2524 -0.5721 0.0021 0.4294 3.2704
##
## Random effects:
## Groups Name Variance Std.Dev.
## pid (Intercept) 0.000 0.000
## mountainRange (Intercept) 9.086 3.014
## Residual 1909.311 43.696
## Number of obs: 1440, groups: pid, 480; mountainRange, 8
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) -1.116e+00 2.687e+01 9.985e+01 -0.042 0.966935
## bodyLength 5.116e-01 1.330e-01 1.011e+02 3.848 0.000209 ***
## testtest_2 7.906e+01 3.517e+01 1.428e+03 2.248 0.024747 *
## testtest_3 -2.078e+01 3.517e+01 1.428e+03 -0.591 0.554757
## bodyLength:testtest_2 -4.637e-01 1.742e-01 1.428e+03 -2.663 0.007842 **
## bodyLength:testtest_3 9.545e-03 1.742e-01 1.428e+03 0.055 0.956300
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) bdyLng tstt_2 tstt_3 bdL:_2
## bodyLength -0.996
## testtest_2 -0.655 0.653
## testtest_3 -0.655 0.653 0.500
## bdyLngth:_2 0.653 -0.655 -0.997 -0.498
## bdyLngth:_3 0.653 -0.655 -0.498 -0.997 0.500
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
Anova(lmer.bl_test_int, type = 3)
## Analysis of Deviance Table (Type III Wald chisquare tests)
##
## Response: score
## Chisq Df Pr(>Chisq)
## (Intercept) 0.0017 1 0.9668519
## bodyLength 14.8043 1 0.0001193 ***
## test 8.9721 2 0.0112648 *
## bodyLength:test 9.6508 2 0.0080235 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(lmer.bl_test_int, lmer.bl_test, type = 3, refit = FALSE)
## Data: dragons
## Models:
## lmer.bl_test: score ~ bodyLength + test + (1 | mountainRange) + (1 | pid)
## lmer.bl_test_int: score ~ bodyLength * test + (1 | mountainRange) + (1 | pid)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## lmer.bl_test 7 14980 15017 -7483.2 14966
## lmer.bl_test_int 9 14978 15026 -7480.2 14960 6.0273 2 0.04911 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
There is an interaction effect!
Interpreting estimates of interactions requires some math, which we won’t go into. But I recommend this post if you’re interested!
Quick plot (note that this doesn’t take into account variances explained by the random effects) that demonstrates this:
Red line: test_1; Blue line: test_2; Green line: test_3
ggplot() +
geom_smooth(data = dragons %>%
filter(test == "test_1"),
mapping = aes(x = bodyLength, y = score),
method='lm',
color = "red") +
geom_smooth(data = dragons %>%
filter(test == "test_2"),
mapping = aes(x = bodyLength, y = score),
method='lm',
color = "blue") +
geom_smooth(data = dragons %>%
filter(test == "test_3"),
mapping = aes(x = bodyLength, y = score),
method='lm',
color = "green") +
theme_classic()
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
Let’s look at a subset of the data with only scores for test 1 and 3.
lmer.bl_test_13 <- lmer(score ~ bodyLength + test + (1|mountainRange) + (1|pid),
data = dragons %>%
filter(test != "test_2"))
## boundary (singular) fit: see help('isSingular')
summary(lmer.bl_test_13)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: score ~ bodyLength + test + (1 | mountainRange) + (1 | pid)
## Data: dragons %>% filter(test != "test_2")
##
## REML criterion at convergence: 9809.4
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.4652 -0.1759 -0.0033 0.0531 3.5476
##
## Random effects:
## Groups Name Variance Std.Dev.
## pid (Intercept) 3.800e-10 1.949e-05
## mountainRange (Intercept) 5.634e+00 2.374e+00
## Residual 1.610e+03 4.013e+01
## Number of obs: 960, groups: pid, 480; mountainRange, 8
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) -1.41476 18.25883 15.33431 -0.077 0.939
## bodyLength 0.51312 0.09014 15.30095 5.692 3.96e-05 ***
## testtest_3 -18.85880 2.59020 950.66254 -7.281 6.95e-13 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) bdyLng
## bodyLength -0.994
## testtest_3 -0.071 0.000
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
Anova(lmer.bl_test_13, type = 3)
## Analysis of Deviance Table (Type III Wald chisquare tests)
##
## Response: score
## Chisq Df Pr(>Chisq)
## (Intercept) 0.006 1 0.9382
## bodyLength 32.402 1 1.254e-08 ***
## test 53.010 1 3.318e-13 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
lmer.bl_test_int_13 <- lmer(score ~ bodyLength * test + (1|mountainRange) + (1|pid),
data = dragons %>%
filter(test != "test_2"))
## boundary (singular) fit: see help('isSingular')
summary(lmer.bl_test_int_13)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: score ~ bodyLength * test + (1 | mountainRange) + (1 | pid)
## Data: dragons %>% filter(test != "test_2")
##
## REML criterion at convergence: 9811.3
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.4675 -0.1759 -0.0037 0.0535 3.5428
##
## Random effects:
## Groups Name Variance Std.Dev.
## pid (Intercept) 0.00 0.000
## mountainRange (Intercept) 5.62 2.371
## Residual 1611.88 40.148
## Number of obs: 960, groups: pid, 480; mountainRange, 8
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) -0.447916 24.350698 47.544616 -0.018 0.985401
## bodyLength 0.508317 0.120542 47.939601 4.217 0.000109 ***
## testtest_3 -20.780373 32.318533 949.663116 -0.643 0.520388
## bodyLength:testtest_3 0.009545 0.160019 949.663116 0.060 0.952447
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) bdyLng tstt_3
## bodyLength -0.997
## testtest_3 -0.664 0.662
## bdyLngth:_3 0.661 -0.664 -0.997
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
Anova(lmer.bl_test_int_13, type = 3)
## Analysis of Deviance Table (Type III Wald chisquare tests)
##
## Response: score
## Chisq Df Pr(>Chisq)
## (Intercept) 0.0003 1 0.9853
## bodyLength 17.7824 1 2.477e-05 ***
## test 0.4134 1 0.5202
## bodyLength:test 0.0036 1 0.9524
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
There is an effect of test, but not when an interaction with body length is added. However, a likelihood ratio test shows that the model without the interaction better explains the data.
anova(lmer.bl_test_int_13, lmer.bl_test_13, type = 3, refit = FALSE)
## Data: dragons %>% filter(test != "test_2")
## Models:
## lmer.bl_test_13: score ~ bodyLength + test + (1 | mountainRange) + (1 | pid)
## lmer.bl_test_int_13: score ~ bodyLength * test + (1 | mountainRange) + (1 | pid)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## lmer.bl_test_13 6 9821.4 9850.6 -4904.7 9809.4
## lmer.bl_test_int_13 7 9825.3 9859.3 -4905.6 9811.3 0 1 1
If DVs are measures where the residual error does not follow a normal
distribution –> use glmer
instead. This allows you to
specify a link function that “link” the response to a linear
predictor.
A common use is with a binary outcome (e.g., success / failure, 2AFC,
etc.), coded as 0/1. This is essentially a logistic regression. In our
example dataset, this is the case with test_4
.
#removed pid random effect because each dragon is only measured once for test_4
glmer.test4 <- glmer(score ~ bodyLength + (1|mountainRange),
data = dragons_test4,
family = "binomial")
## boundary (singular) fit: see help('isSingular')
summary(glmer.test4)
## Generalized linear mixed model fit by maximum likelihood (Laplace
## Approximation) [glmerMod]
## Family: binomial ( logit )
## Formula: score ~ bodyLength + (1 | mountainRange)
## Data: dragons_test4
##
## AIC BIC logLik deviance df.resid
## 654.4 666.9 -324.2 648.4 477
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.4269 -1.1456 0.7600 0.8543 0.9935
##
## Random effects:
## Groups Name Variance Std.Dev.
## mountainRange (Intercept) 0 0
## Number of obs: 480, groups: mountainRange, 8
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 2.331149 1.168484 1.995 0.0460 *
## bodyLength -0.009854 0.005773 -1.707 0.0878 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## bodyLength -0.997
## optimizer (Nelder_Mead) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
Anova(glmer.test4, type = 3)
## Analysis of Deviance Table (Type III Wald chisquare tests)
##
## Response: score
## Chisq Df Pr(>Chisq)
## (Intercept) 3.9801 1 0.04604 *
## bodyLength 2.9142 1 0.08780 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Think of random effects as ‘grouping’ variables that determine dependencies of data. They can affect either the intercept or the slope of the best-fit line.
Consider the scenario where we are predicting score based on body length, with mountain range as the grouping variable.
Intercept: the projected score when body length = 0, for each mountain range. You can also think of this as the ‘baseline’ performance, even though this is not very accurate.
Slope: the change in score with a fixed change in body length, for each mountain range.
If we fit a best-fit line by each mountain range, the intercept and/or slope of each line might be different. We might want to account for this in our random effect structure.
Syntax:
Intercept random effect: (1|rand_eff)
Slope random effect – need to include an intercept random effect:
(1 + fixed_eff_whose_slope_vary | rand_eff)
. Essentially,
we are allowing the fixed effect’s slope to vary based on the grouping
variable.
# we let all main effects and the interaction effect vary in both slope and intercept based on mountainRange.
lmer.bl_test_int_slope <- lmer(score ~ bodyLength * test + (1 + (bodyLength * test)|mountainRange) + (1|pid), data = dragons)
## boundary (singular) fit: see help('isSingular')
## Warning: Model failed to converge with 4 negative eigenvalues: -7.1e-01
## -1.3e+00 -1.4e+00 -3.2e+05
summary(lmer.bl_test_int_slope)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula:
## score ~ bodyLength * test + (1 + (bodyLength * test) | mountainRange) +
## (1 | pid)
## Data: dragons
##
## REML criterion at convergence: 14951.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.52925 -0.55372 0.00112 0.45972 3.14159
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## pid (Intercept) 1.628e-03 0.04034
## mountainRange (Intercept) 2.095e+03 45.76748
## bodyLength 4.880e-02 0.22090 -1.00
## testtest_2 2.062e+03 45.41169 0.01 -0.01
## testtest_3 7.360e+03 85.79297 -0.07 0.07 -0.03
## bodyLength:testtest_2 5.899e-02 0.24287 -0.02 0.02 -1.00 -0.01
## bodyLength:testtest_3 1.901e-01 0.43601 0.04 -0.04 0.00 -1.00
## Residual 1.884e+03 43.40870
##
##
##
##
##
##
##
## 0.04
##
## Number of obs: 1440, groups: pid, 480; mountainRange, 8
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) -8.9973 31.8843 4.4620 -0.282 0.7904
## bodyLength 0.5457 0.1559 4.8064 3.501 0.0184 *
## testtest_2 96.8812 40.2111 6.7594 2.409 0.0481 *
## testtest_3 -27.8376 49.9662 13.3054 -0.557 0.5867
## bodyLength:testtest_2 -0.5536 0.2022 7.8922 -2.738 0.0259 *
## bodyLength:testtest_3 0.0238 0.2489 14.3827 0.096 0.9251
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) bdyLng tstt_2 tstt_3 bdL:_2
## bodyLength -0.998
## testtest_2 -0.474 0.479
## testtest_3 -0.449 0.451 0.292
## bdyLngth:_2 0.462 -0.471 -0.997 -0.300
## bdyLngth:_3 0.436 -0.440 -0.300 -0.998 0.312
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
Anova(lmer.bl_test_int_slope, type = 3)
## Analysis of Deviance Table (Type III Wald chisquare tests)
##
## Response: score
## Chisq Df Pr(>Chisq)
## (Intercept) 0.0796 1 0.7778006
## bodyLength 12.2559 1 0.0004638 ***
## test 7.5385 2 0.0230688 *
## bodyLength:test 8.4944 2 0.0143039 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(lmer.bl_test_int_slope, lmer.bl_test_int, type = 3, refit = FALSE)
## Data: dragons
## Models:
## lmer.bl_test_int: score ~ bodyLength * test + (1 | mountainRange) + (1 | pid)
## lmer.bl_test_int_slope: score ~ bodyLength * test + (1 + (bodyLength * test) | mountainRange) + (1 | pid)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## lmer.bl_test_int 9 14978 15026 -7480.2 14960
## lmer.bl_test_int_slope 29 15010 15162 -7475.8 14952 8.7126 20 0.986
Yes, there is still an effect of body length, test and interaction effect when allowing for a random slope. But it does not explain a significantly larger variance in the data.
There are 3 sites (a, b, c) within each mountain range. Maybe the dragons in each have a different intercept (again, intercept = roughly baseline performance).
Wrong syntax: (1|mountainRange) + (1|site)
–> crossed
structure, assume that all sites ‘A’ across mountain ranges have the
same intercept. This model thinks that there are only 3 random site
intercepts (in fact, there are 3 sites * 8 mountain ranges = 24 site
intercepts).
Correct syntax: (1|mountainRange/site)
or
(1|mountainRange) + (1|mountainRange:site)
Note:
Make your life easier by just naming the sites differently from the beginning (e.g. mountainRange_A)
lmer.bl_test_int_site <- lmer(score ~ bodyLength * test + (1|mountainRange/site) + (1|pid), data = dragons)
## boundary (singular) fit: see help('isSingular')
summary(lmer.bl_test_int_site)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: score ~ bodyLength * test + (1 | mountainRange/site) + (1 | pid)
## Data: dragons
##
## REML criterion at convergence: 14957.4
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.2972 -0.5778 -0.0013 0.4557 3.1654
##
## Random effects:
## Groups Name Variance Std.Dev.
## pid (Intercept) 2.663e-07 5.161e-04
## site:mountainRange (Intercept) 2.374e+01 4.872e+00
## mountainRange (Intercept) 1.212e+00 1.101e+00
## Residual 1.894e+03 4.352e+01
## Number of obs: 1440, groups:
## pid, 480; site:mountainRange, 24; mountainRange, 8
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 3.500e+00 2.757e+01 6.583e+01 0.127 0.899366
## bodyLength 4.887e-01 1.365e-01 6.629e+01 3.580 0.000648 ***
## testtest_2 7.906e+01 3.503e+01 1.412e+03 2.257 0.024162 *
## testtest_3 -2.078e+01 3.503e+01 1.412e+03 -0.593 0.553127
## bodyLength:testtest_2 -4.637e-01 1.734e-01 1.412e+03 -2.674 0.007592 **
## bodyLength:testtest_3 9.545e-03 1.734e-01 1.412e+03 0.055 0.956120
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) bdyLng tstt_2 tstt_3 bdL:_2
## bodyLength -0.997
## testtest_2 -0.635 0.633
## testtest_3 -0.635 0.633 0.500
## bdyLngth:_2 0.633 -0.635 -0.997 -0.498
## bdyLngth:_3 0.633 -0.635 -0.498 -0.997 0.500
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
Anova(lmer.bl_test_int_site, type = 3)
## Analysis of Deviance Table (Type III Wald chisquare tests)
##
## Response: score
## Chisq Df Pr(>Chisq)
## (Intercept) 0.0161 1 0.8989795
## bodyLength 12.8178 1 0.0003433 ***
## test 9.0463 2 0.0108546 *
## bodyLength:test 9.7305 2 0.0077097 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(lmer.bl_test_int_site, lmer.bl_test_int, type = 3, refit = FALSE)
## Data: dragons
## Models:
## lmer.bl_test_int: score ~ bodyLength * test + (1 | mountainRange) + (1 | pid)
## lmer.bl_test_int_site: score ~ bodyLength * test + (1 | mountainRange/site) + (1 | pid)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## lmer.bl_test_int 9 14978 15026 -7480.2 14960
## lmer.bl_test_int_site 10 14977 15030 -7478.7 14957 2.9318 1 0.08685 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Not an improvement over the interaction model without the nested random effect!