This tutorial is based on the tutorial by Gabriela K Hajduk
The code and data can be downloaded here.
A published version of the notebook is here.
Load libraries that we might use. Uncomment the line to install packages if necessary.
knitr::opts_chunk$set(echo = TRUE)
#install.packages(c('tidyverse', 'purrr', 'ggplot2', 'lme4', 'car'))
library('tidyverse')
library("purrr")
library('ggplot2')
library('lme4')
library('car')
There are different functions you can use for different data types. .csv is the most common file type to store your data, but we’re using .RData for this tutorial.
Long format shows multiple measurements for the same subject on different lines.
Wide format shows multiple measurements for the same subject on the same line.
It’s relatively easy to switch between the 2 formats. But in my experience, long format is more useful, for e.g., mixed-effects models (the standard analysis tool) requires long format. Therefore, I record my data in the long format.
dragons_long_to_wide <- dragons_long %>%
pivot_wider(
names_from = test, values_from = score
)
head(dragons_long_to_wide)
## # A tibble: 6 × 6
## pid mountainRange site bodyLength test_1 test_2
## <chr> <fct> <fct> <dbl> <dbl> <dbl>
## 1 2c98c3d0-6f27-4e71-b717-b67b6c0c… Bavarian a 166. 83.9 88.3
## 2 66bff560-fdc3-48cc-a9a5-99452b00… Bavarian a 168. 85.8 59.5
## 3 75be5f72-097e-42ee-863a-ab968991… Bavarian a 166. 84.2 44.0
## 4 64ccd831-70c6-43bc-9d00-30be2377… Bavarian a 168. 84.9 58.9
## 5 56fd0730-5cde-4717-b045-13b9f6f9… Bavarian a 170. 86.4 19.5
## 6 c8e97ce4-86fd-470d-a27d-e61c70b5… Bavarian a 169. 85.1 90.7
dragons_wide_to_long <- dragons_wide %>%
pivot_longer(c(test_1, test_2), names_to = "test", values_to = "score")
head(dragons_wide_to_long)
## # A tibble: 6 × 6
## pid mountainRange site bodyLength test score
## <chr> <fct> <fct> <dbl> <chr> <dbl>
## 1 2c98c3d0-6f27-4e71-b717-b67b6c0c29… Bavarian a 166. test… 83.9
## 2 2c98c3d0-6f27-4e71-b717-b67b6c0c29… Bavarian a 166. test… 88.3
## 3 66bff560-fdc3-48cc-a9a5-99452b000d… Bavarian a 168. test… 85.8
## 4 66bff560-fdc3-48cc-a9a5-99452b000d… Bavarian a 168. test… 59.5
## 5 75be5f72-097e-42ee-863a-ab968991d7… Bavarian a 166. test… 84.2
## 6 75be5f72-097e-42ee-863a-ab968991d7… Bavarian a 166. test… 44.0
Useful to save an original version before you manipulate data, especially if you haven’t set up and test your analysis pipeline beforehand.
dragons <- dragons_long
Example of base R:
dragons_baseR <- dragons
#`$` selects a column / makes the new column
dragons_baseR$new_score <- dragons$score + 50
head(dragons_baseR)
## # A tibble: 6 × 7
## pid mountainRange site bodyLength test score new_score
## <chr> <fct> <fct> <dbl> <chr> <dbl> <dbl>
## 1 2c98c3d0-6f27-4e71-b717-… Bavarian a 166. test… 83.9 134.
## 2 2c98c3d0-6f27-4e71-b717-… Bavarian a 166. test… 88.3 138.
## 3 66bff560-fdc3-48cc-a9a5-… Bavarian a 168. test… 85.8 136.
## 4 66bff560-fdc3-48cc-a9a5-… Bavarian a 168. test… 59.5 109.
## 5 75be5f72-097e-42ee-863a-… Bavarian a 166. test… 84.2 134.
## 6 75be5f72-097e-42ee-863a-… Bavarian a 166. test… 44.0 94.0
Examples of dplyr
:
Useful functions:
filter()
mutate()
select()
summarise()
+ group_by()
Logic:
&
for and
|
for or
Pipe:
%>%
How you link functions / actions together.
output <- input %>% action %>% action %>% ...
dplyr <- dragons %>%
filter(site == 'a' & mountainRange == 'Julian') %>% # this line filters out dragons from the Julian mountain range that are in site a
mutate(mountainRange = ifelse(as.character(mountainRange) == 'Julian', 'Julien', mountainRange)) # this line changes all the mountain range recorded as 'Julian' to 'Julien'
# toasted bread <- bread %>% cut %>% toast
head(dplyr)
## # A tibble: 6 × 6
## pid mountainRange site bodyLength test score
## <chr> <chr> <fct> <dbl> <chr> <dbl>
## 1 49aeee5f-bb4a-4759-b01d-53000a19a… Julien a 211. test… 107.
## 2 49aeee5f-bb4a-4759-b01d-53000a19a… Julien a 211. test… 50.9
## 3 df6a731d-152f-4bae-823a-96e3ffd46… Julien a 216. test… 110.
## 4 df6a731d-152f-4bae-823a-96e3ffd46… Julien a 216. test… 68.8
## 5 fe788f56-2260-4317-a2dc-2bfb96e3e… Julien a 203. test… 102.
## 6 fe788f56-2260-4317-a2dc-2bfb96e3e… Julien a 203. test… 8.68
Calculate the mean of the 2 scores for each dragon.
Without grouping, the mean score is calculated across all observations –> not what we want.
dragons_mean <- dragons %>%
mutate(mean = mean(score))
head(dragons_mean)
## # A tibble: 6 × 7
## pid mountainRange site bodyLength test score mean
## <chr> <fct> <fct> <dbl> <chr> <dbl> <dbl>
## 1 2c98c3d0-6f27-4e71-b717-b67b… Bavarian a 166. test… 83.9 75.9
## 2 2c98c3d0-6f27-4e71-b717-b67b… Bavarian a 166. test… 88.3 75.9
## 3 66bff560-fdc3-48cc-a9a5-9945… Bavarian a 168. test… 85.8 75.9
## 4 66bff560-fdc3-48cc-a9a5-9945… Bavarian a 168. test… 59.5 75.9
## 5 75be5f72-097e-42ee-863a-ab96… Bavarian a 166. test… 84.2 75.9
## 6 75be5f72-097e-42ee-863a-ab96… Bavarian a 166. test… 44.0 75.9
With grouping, we get the mean of the 2 tests for each dragon
dragons <- dragons %>%
group_by(pid) %>%
mutate(mean = mean(score))
head(dragons)
## # A tibble: 6 × 7
## # Groups: pid [3]
## pid mountainRange site bodyLength test score mean
## <chr> <fct> <fct> <dbl> <chr> <dbl> <dbl>
## 1 2c98c3d0-6f27-4e71-b717-b67b… Bavarian a 166. test… 83.9 86.1
## 2 2c98c3d0-6f27-4e71-b717-b67b… Bavarian a 166. test… 88.3 86.1
## 3 66bff560-fdc3-48cc-a9a5-9945… Bavarian a 168. test… 85.8 72.7
## 4 66bff560-fdc3-48cc-a9a5-9945… Bavarian a 168. test… 59.5 72.7
## 5 75be5f72-097e-42ee-863a-ab96… Bavarian a 166. test… 84.2 64.1
## 6 75be5f72-097e-42ee-863a-ab96… Bavarian a 166. test… 44.0 64.1
Grouping with summary: get the mean scores for each test, by each mountain range
dragons_mean_mountainRange <- dragons %>%
group_by(test, mountainRange) %>%
summarise(mean = mean(score))
## `summarise()` has grouped output by 'test'. You can override using the
## `.groups` argument.
dragons_mean_mountainRange
## # A tibble: 16 × 3
## # Groups: test [2]
## test mountainRange mean
## <chr> <fct> <dbl>
## 1 test_1 Bavarian 91.3
## 2 test_1 Central 106.
## 3 test_1 Emmental 107.
## 4 test_1 Julian 111.
## 5 test_1 Ligurian 107.
## 6 test_1 Maritime 101.
## 7 test_1 Sarntal 102.
## 8 test_1 Southern 91.1
## 9 test_2 Bavarian 52.2
## 10 test_2 Central 47.6
## 11 test_2 Emmental 42.4
## 12 test_2 Julian 52.9
## 13 test_2 Ligurian 46.7
## 14 test_2 Maritime 54.5
## 15 test_2 Sarntal 46.4
## 16 test_2 Southern 57.3
Standardize (z-score) scores: Note the combination of base R and dplyr.
dragons <- dragons %>%
mutate(score_z = (score - mean(dragons$score)) / sd(dragons$score))
head(dragons)
## # A tibble: 6 × 8
## # Groups: pid [3]
## pid mountainRange site bodyLength test score mean score_z
## <chr> <fct> <fct> <dbl> <chr> <dbl> <dbl> <dbl>
## 1 2c98c3d0-6f27-4e71-b… Bavarian a 166. test… 83.9 86.1 0.237
## 2 2c98c3d0-6f27-4e71-b… Bavarian a 166. test… 88.3 86.1 0.368
## 3 66bff560-fdc3-48cc-a… Bavarian a 168. test… 85.8 72.7 0.294
## 4 66bff560-fdc3-48cc-a… Bavarian a 168. test… 59.5 72.7 -0.491
## 5 75be5f72-097e-42ee-8… Bavarian a 166. test… 84.2 64.1 0.245
## 6 75be5f72-097e-42ee-8… Bavarian a 166. test… 44.0 64.1 -0.953
Bar plot:
ggplot(data = dragons,
mapping = aes(x = mountainRange, y = score)) +
stat_summary(fun.data = "mean_cl_boot",
geom = "bar") +
stat_summary(fun.data = "mean_cl_boot",
geom = "pointrange") +
theme_classic()
Violin + jitter plot:
ggplot(data = dragons,
mapping = aes(x = mountainRange, y = score)) +
geom_violin(aes(fill = mountainRange)) +
geom_jitter(height = 0, # jitter horizontally, not vertically
alpha = 0.5, # transparency
aes(color = test)) +
theme_classic() +
guides(fill = "none")
ggplot(data = dragons %>% filter(test == "test_1"),
mapping = aes(x = mountainRange, y = score)) +
geom_violin(aes(fill = mountainRange)) +
geom_jitter(height = 0,
alpha = 0.5) +
theme_classic() +
theme(legend.position = "none")
Scatter plot + best fit line:
ggplot(data = dragons,
mapping = aes(x = bodyLength, y = score)) +
geom_smooth(method = 'lm') +
geom_point(aes(color = mountainRange),
size = 2) +
theme_classic()
## `geom_smooth()` using formula = 'y ~ x'
Facets into grouping variables:
ggplot(data = dragons,
mapping = aes(x = bodyLength, y = score, colour = mountainRange)) +
geom_point(size = 2, alpha = .3) +
geom_smooth(method='lm') +
facet_grid(~mountainRange) +
theme_classic() +
theme(legend.position = "none")
## `geom_smooth()` using formula = 'y ~ x'
ggplot(data = dragons,
mapping = aes(x = bodyLength, y = score, colour = test)) +
geom_point(size = 2, alpha = .3) +
geom_smooth(method='lm') +
facet_grid(~test) +
theme_classic() +
theme(legend.position = "none")
## `geom_smooth()` using formula = 'y ~ x'
#Another example of facet_grid
ggplot(data = dragons,
mapping = aes(x = bodyLength, y = score, colour = mountainRange)) +
geom_point(size = 2, alpha = .3) +
geom_smooth(method='lm') +
facet_grid(test~mountainRange) +
theme_classic() +
theme(legend.position = "none")
## `geom_smooth()` using formula = 'y ~ x'
Terms:
Fixed effect: main predictor (e.g., body length, test)
Random effect: source of systematic noise (e.g., mountain range, individual dragon)
Levels: different values of a predictor (e.g., trial 1 vs 2)
Interaction: combinations of 2 predictors making different predictions for the dependent variable
Hypothesis 1: higher body length –> higher test score
lm.bl <- lm(score ~ bodyLength, data = dragons)
summary(lm.bl)
##
## Call:
## lm(formula = score ~ bodyLength, data = dragons)
##
## Residuals:
## Min 1Q Median 3Q Max
## -79.55 -24.95 17.10 26.48 37.32
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 37.55276 13.44816 2.792 0.00534 **
## bodyLength 0.19069 0.06659 2.864 0.00428 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 33.41 on 958 degrees of freedom
## Multiple R-squared: 0.008488, Adjusted R-squared: 0.007454
## F-statistic: 8.202 on 1 and 958 DF, p-value: 0.004277
There is an effect of body length.
But when we facet out mountainRange and test, we saw that the effect is different between tests 1 and 2. We also saw that for some mountain ranges, this isn’t the case.
Relationship between body length and score might differ based on which mountain range a dragon comes from. And the overall effect we found of body length on score might be driven by some mountain ranges where the effect is very strong.
Relationship between body length and score might also differ based on the individual dragon (i.e., some dragons are better at all tests.)
Syntax: (1|random effect)
lmer.base <- lmer(score ~ bodyLength + (1|mountainRange) + (1|pid), data = dragons)
## boundary (singular) fit: see help('isSingular')
summary(lmer.base)
## Linear mixed model fit by REML ['lmerMod']
## Formula: score ~ bodyLength + (1 | mountainRange) + (1 | pid)
## Data: dragons
##
## REML criterion at convergence: 9461.1
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.3808 -0.7466 0.5118 0.7925 1.1170
##
## Random effects:
## Groups Name Variance Std.Dev.
## pid (Intercept) 0 0.00
## mountainRange (Intercept) 0 0.00
## Residual 1116 33.41
## Number of obs: 960, groups: pid, 480; mountainRange, 8
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 37.55276 13.44816 2.792
## bodyLength 0.19069 0.06659 2.864
##
## Correlation of Fixed Effects:
## (Intr)
## bodyLength -0.997
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
Anova(lmer.base, type = 3)
## Analysis of Deviance Table (Type III Wald chisquare tests)
##
## Response: score
## Chisq Df Pr(>Chisq)
## (Intercept) 7.7975 1 0.005232 **
## bodyLength 8.2016 1 0.004185 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Yes – after adding the random effect of mountain range, there is still an effect of body length on score.
Note: Singular fit
Not a big deal for this demonstration, but if you see this in your actual analysis, it’s a sign that your predictor structure is too complex and is overfitting the data. Consider dropping random effects.
Note: Common random effects:
Individual participant: if a participant sees multiple trials, there will be dependencies between their responses – they might intrinsically be better or worse at the task.
Individual trial: a trial might intrinsically be easier or more difficult for all participants.
I always include both of these random effects in my analysis. In this demonstration, we will be including random participant effect, but we don’t have multiple trials so we won’t be including that.
Let’s say we have a second hypothesis:
Hypothesis 2: higher body length –> higher test score, scores for test 2 > test 1
Checking the effect of test:
lmer.test <- lmer(score ~ test + (1|mountainRange) + (1|pid), data = dragons)
## boundary (singular) fit: see help('isSingular')
summary(lmer.test)
## Linear mixed model fit by REML ['lmerMod']
## Formula: score ~ test + (1 | mountainRange) + (1 | pid)
## Data: dragons
##
## REML criterion at convergence: 8582.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.47364 -0.50218 0.02854 0.44364 2.36916
##
## Random effects:
## Groups Name Variance Std.Dev.
## pid (Intercept) 0.000 0.000
## mountainRange (Intercept) 5.837 2.416
## Residual 446.407 21.128
## Number of obs: 960, groups: pid, 480; mountainRange, 8
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 101.885 1.288 79.09
## testtest_2 -51.885 1.364 -38.04
##
## Correlation of Fixed Effects:
## (Intr)
## testtest_2 -0.529
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
Anova(lmer.test)
## Analysis of Deviance Table (Type II Wald chisquare tests)
##
## Response: score
## Chisq Df Pr(>Chisq)
## test 1447.3 1 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Add both fixed effects:
lmer.bl_test <- lmer(score ~ bodyLength + test + (1|mountainRange) + (1|pid), data = dragons)
## boundary (singular) fit: see help('isSingular')
summary(lmer.bl_test)
## Linear mixed model fit by REML ['lmerMod']
## Formula: score ~ bodyLength + test + (1 | mountainRange) + (1 | pid)
## Data: dragons
##
## REML criterion at convergence: 8570.5
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.59747 -0.40623 0.03411 0.33284 2.55997
##
## Random effects:
## Groups Name Variance Std.Dev.
## pid (Intercept) 1.183e-13 3.439e-07
## mountainRange (Intercept) 1.534e+00 1.239e+00
## Residual 4.412e+02 2.101e+01
## Number of obs: 960, groups: pid, 480; mountainRange, 8
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 61.60849 9.55251 6.449
## bodyLength 0.20006 0.04716 4.242
## testtest_2 -51.88521 1.35590 -38.266
##
## Correlation of Fixed Effects:
## (Intr) bdyLng
## bodyLength -0.994
## testtest_2 -0.071 0.000
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
Anova(lmer.bl_test, type = 3)
## Analysis of Deviance Table (Type III Wald chisquare tests)
##
## Response: score
## Chisq Df Pr(>Chisq)
## (Intercept) 41.596 1 1.123e-10 ***
## bodyLength 17.996 1 2.213e-05 ***
## test 1464.314 1 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(lmer.bl_test, lmer.base, type = 3, refit = FALSE)
## Data: dragons
## Models:
## lmer.base: score ~ bodyLength + (1 | mountainRange) + (1 | pid)
## lmer.bl_test: score ~ bodyLength + test + (1 | mountainRange) + (1 | pid)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## lmer.base 5 9471.1 9495.4 -4730.5 9461.1
## lmer.bl_test 6 8582.5 8611.8 -4285.3 8570.5 890.55 1 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
There is an effect of body length and test (after controlling for random effects). And adding test as a fixed effect results in a model that better explains the variance in the data.
Note: ANOVA:
Anova()
or car::Anova()
: compare the full
model with a reduced model where a predictor is removed –> this
allows us to get the p-value to see whether a predictor (e.g., body
length) is significant.
anova()
: compare 2 models –> whether 1 model explains
significantly more variance in the data compared to the other.
See also: https://www.bookdown.org/rwnahhas/RMPH/mlr-distinctions.html
So now we know that both body length and test is a significant predictor of the data. Do they predict the data in the same way?
Syntax: fixed_eff1 * fixed_eff2
lmer.bl_test_int <- lmer(score ~ bodyLength * test + (1|mountainRange) + (1|pid), data = dragons)
## boundary (singular) fit: see help('isSingular')
summary(lmer.bl_test_int)
## Linear mixed model fit by REML ['lmerMod']
## Formula: score ~ bodyLength * test + (1 | mountainRange) + (1 | pid)
## Data: dragons
##
## REML criterion at convergence: 8518
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.56955 -0.08974 -0.00160 0.09905 2.53328
##
## Random effects:
## Groups Name Variance Std.Dev.
## pid (Intercept) 1.684e-11 4.103e-06
## mountainRange (Intercept) 1.767e+00 1.329e+00
## Residual 4.165e+02 2.041e+01
## Number of obs: 960, groups: pid, 480; mountainRange, 8
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) -0.75632 12.51795 -0.060
## bodyLength 0.50985 0.06196 8.228
## testtest_2 72.11810 16.42886 4.390
## bodyLength:testtest_2 -0.61596 0.08134 -7.572
##
## Correlation of Fixed Effects:
## (Intr) bdyLng tstt_2
## bodyLength -0.997
## testtest_2 -0.656 0.654
## bdyLngth:_2 0.654 -0.656 -0.997
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
Anova(lmer.bl_test_int, type = 3)
## Analysis of Deviance Table (Type III Wald chisquare tests)
##
## Response: score
## Chisq Df Pr(>Chisq)
## (Intercept) 0.0037 1 0.9518
## bodyLength 67.7023 1 < 2.2e-16 ***
## test 19.2696 1 1.135e-05 ***
## bodyLength:test 57.3394 1 3.667e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(lmer.bl_test_int, lmer.bl_test, type = 3, refit = FALSE)
## Data: dragons
## Models:
## lmer.bl_test: score ~ bodyLength + test + (1 | mountainRange) + (1 | pid)
## lmer.bl_test_int: score ~ bodyLength * test + (1 | mountainRange) + (1 | pid)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## lmer.bl_test 6 8582.5 8611.8 -4285.3 8570.5
## lmer.bl_test_int 7 8532.0 8566.1 -4259.0 8518.0 52.552 1 4.19e-13 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Yes!
Think of random effects as ‘grouping’ variables that determine dependencies of data. They can affect either the intercept or the slope of the best-fit line.
Consider the scenario where we are predicting score based on body length, with mountain range as the grouping variable.
Intercept: the projected score when body length = 0, for each mountain range. You can also think of this as the ‘baseline’ performance, even though this is not very accurate.
Slope: the change in score with a fixed change in body length, for each mountain range.
If we fit a best-fit line by each mountain range, the intercept and/or slope of each line might be different. We might want to account for this in our random effect structure.
Syntax:
Intercept random effect: (1|rand_eff)
Slope random effect – need to include an intercept random effect:
(1 + slope_rand_eff | intercept_rand_eff)
. This is usually
the same as the fixed effect (essentially, we are allowing the fixed
effect’s slope to vary based on the grouping variable.)
lmer.bl_test_int_slope <- lmer(score ~ bodyLength * test + (1 + (bodyLength * test)|mountainRange) + (1|pid), data = dragons)
## boundary (singular) fit: see help('isSingular')
summary(lmer.bl_test_int_slope)
## Linear mixed model fit by REML ['lmerMod']
## Formula:
## score ~ bodyLength * test + (1 + (bodyLength * test) | mountainRange) +
## (1 | pid)
## Data: dragons
##
## REML criterion at convergence: 8509.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.70151 -0.10835 -0.00134 0.11802 2.69517
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## pid (Intercept) 1.484e-04 0.01218
## mountainRange (Intercept) 2.719e+02 16.48835
## bodyLength 6.565e-03 0.08103 -1.00
## testtest_2 2.180e+02 14.76378 0.86 -0.86
## bodyLength:testtest_2 5.070e-03 0.07120 -0.83 0.83 -0.98
## Residual 4.104e+02 20.25823
## Number of obs: 960, groups: pid, 480; mountainRange, 8
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) -2.18460 13.53461 -0.161
## bodyLength 0.51456 0.06668 7.717
## testtest_2 60.21359 19.87466 3.030
## bodyLength:testtest_2 -0.55872 0.09827 -5.686
##
## Correlation of Fixed Effects:
## (Intr) bdyLng tstt_2
## bodyLength -0.997
## testtest_2 -0.381 0.385
## bdyLngth:_2 0.387 -0.394 -0.996
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
Anova(lmer.bl_test_int_slope, type = 3)
## Analysis of Deviance Table (Type III Wald chisquare tests)
##
## Response: score
## Chisq Df Pr(>Chisq)
## (Intercept) 0.0261 1 0.871772
## bodyLength 59.5446 1 1.196e-14 ***
## test 9.1789 1 0.002448 **
## bodyLength:test 32.3285 1 1.302e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(lmer.bl_test_int_slope, lmer.bl_test_int, type = 3, refit = FALSE)
## Data: dragons
## Models:
## lmer.bl_test_int: score ~ bodyLength * test + (1 | mountainRange) + (1 | pid)
## lmer.bl_test_int_slope: score ~ bodyLength * test + (1 + (bodyLength * test) | mountainRange) + (1 | pid)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## lmer.bl_test_int 7 8532.0 8566.1 -4259.0 8518.0
## lmer.bl_test_int_slope 16 8541.6 8619.5 -4254.8 8509.6 8.4153 9 0.4929
Yes, there is still an effect of body length, test and interaction effect when allowing for a random slope. But it does not explain a significantly larger variance in the data.
There are 3 sites within each mountain range. Maybe the dragons in each have a different intercept (remember, intercept = roughly baseline performance).
Wrong syntax: (1|mountainRange) + (1|site)
–> crossed
structure, assume that all sites ‘A’ across mountain ranges have the
same intercept. This model thinks that there are only 3 random site
intercepts (in fact, there are 3 sites * 8 mountain ranges = 24 site
intercepts).
Correct syntax: (1|mountainRange/site)
or
(1|mountainRange) + (1|mountainRange:site)
Note:
Make your life easier by just naming the sites differently from the beginning (e.g. mountainRange_A)
lmer.bl_test_int_site <- lmer(score ~ bodyLength * test + (1|mountainRange/site) + (1|pid), data = dragons)
## boundary (singular) fit: see help('isSingular')
summary(lmer.bl_test_int_site)
## Linear mixed model fit by REML ['lmerMod']
## Formula: score ~ bodyLength * test + (1 | mountainRange/site) + (1 | pid)
## Data: dragons
##
## REML criterion at convergence: 8518
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.56955 -0.08974 -0.00160 0.09905 2.53328
##
## Random effects:
## Groups Name Variance Std.Dev.
## pid (Intercept) 9.073e-08 3.012e-04
## site:mountainRange (Intercept) 0.000e+00 0.000e+00
## mountainRange (Intercept) 1.767e+00 1.329e+00
## Residual 4.165e+02 2.041e+01
## Number of obs: 960, groups: pid, 480; site:mountainRange, 24; mountainRange, 8
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) -0.75619 12.51790 -0.060
## bodyLength 0.50985 0.06196 8.228
## testtest_2 72.11810 16.42886 4.390
## bodyLength:testtest_2 -0.61596 0.08134 -7.572
##
## Correlation of Fixed Effects:
## (Intr) bdyLng tstt_2
## bodyLength -0.997
## testtest_2 -0.656 0.654
## bdyLngth:_2 0.654 -0.656 -0.997
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
Anova(lmer.bl_test_int_site, type = 3)
## Analysis of Deviance Table (Type III Wald chisquare tests)
##
## Response: score
## Chisq Df Pr(>Chisq)
## (Intercept) 0.0036 1 0.9518
## bodyLength 67.7027 1 < 2.2e-16 ***
## test 19.2696 1 1.135e-05 ***
## bodyLength:test 57.3394 1 3.667e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(lmer.bl_test_int_site, lmer.bl_test_int, type = 3, refit = FALSE)
## Data: dragons
## Models:
## lmer.bl_test_int: score ~ bodyLength * test + (1 | mountainRange) + (1 | pid)
## lmer.bl_test_int_site: score ~ bodyLength * test + (1 | mountainRange/site) + (1 | pid)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## lmer.bl_test_int 7 8532 8566.1 -4259 8518
## lmer.bl_test_int_site 8 8534 8572.9 -4259 8518 0 1 1
Significant, but not an improvement over the interaction model!