As previously described, the model was built in stages, focusing mainly on affective valence as the independent variable and spatial working memory (WM) as the dependent variable. For every measurement of the dependent variable, four different indices of the independent variable were considered:
For modelling purposes, we median-imputed the level 2 scores of one participant whose laboratory assessment data were missing (Fluid intelligence, Busyness, Anxiety, and Neuroticism). The data were then grand mean centred and scaled. Lastly, we removed the rows corresponding to tests that were not completed. Because participants who completed more tests contributed more rows, the grand means of the level 2 variables subsequently deviated slightly from 0; this is also why the within- and between-subject variances do not sum exactly to 1, even though the data were scaled and centred.
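As a rough illustration of this preprocessing, a minimal R sketch is given below; the data frame `d` and the column names are placeholders rather than the actual analysis script, and each row is assumed to correspond to one assessment.

```r
# Assumed data frame 'd': one row per assessment, with the level 2
# variables repeated across each participant's rows.
l2_vars <- c("fluid_intelligence", "busyness", "anxiety", "neuroticism")

# Median-impute the missing laboratory scores (one participant)
for (v in l2_vars) {
  d[[v]][is.na(d[[v]])] <- median(d[[v]], na.rm = TRUE)
}

# Grand mean centre and scale the predictors and the outcome
num_vars <- c(l2_vars, "valence", "wm_accuracy")
d[num_vars] <- lapply(d[num_vars], function(x) as.numeric(scale(x)))

# Drop rows from tests that were not completed; because participants
# completed different numbers of tests, the grand means of the level 2
# variables now deviate slightly from 0
d <- d[!is.na(d$wm_accuracy), ]
```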
Model estimates were calculated with maximum likelihood using R’s lme4 package (Bates et al., 2015). The sjPlot package (Lüdecke, 2017) was used to calculate and visualise confidence intervals and p-values for the fixed effects, based on the Wald method (see Bolker et al., 2009, for an explanation). The models met the statistical assumptions of mixed-effects regression, such as normality of residuals and homoscedasticity of conditional residuals (Hox et al., 2010).
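For reference, a minimal sketch of this estimation and reporting setup is given below; the model formula is only an example and the object names are placeholders, not the exact script used.

```r
library(lme4)    # mixed-effects models (Bates et al., 2015)
library(sjPlot)  # tables and plots of model estimates (Lüdecke, 2017)

# REML = FALSE requests maximum likelihood estimation, which allows
# models differing in fixed effects to be compared via log-likelihood
m <- lmer(wm_accuracy ~ centred_affect + (1 | Subject),
          data = d, REML = FALSE)

# Wald-based confidence intervals and p-values for the fixed effects,
# together with variance components, ICC, AIC, and log-likelihood
tab_model(m, show.aic = TRUE, show.loglik = TRUE)
plot_model(m, show.values = TRUE)  # coefficient plot with 95% CIs
```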
Before modelling, it is important to first check whether there are any problems with the measured variables. For example, as explained in the Methods, our sample consisted mostly of students, which might have restricted the range of scores on various measures, such as intelligence. The collected data suggest that no such problem occurred in our study, as illustrated in more detail below.
The present study focused mainly on variables measured at the within-person level. All hypothesis tests were based on:
Practice effects were also measured at that level, to account for the steady increase in scores over time.
The raw affective valence scores appear to reflect a wide range of affective experiences. As seen in Figure 3, almost every participant experienced changing affect over the duration of the study.
Figure 3. Bar plots of valence scores. Each plot represents a different participant. Each bar represents a different valence score with negative scores on the left and positive scores on the right. Height represents the number of times a person selected that score.
These valence scores were then split into within- and between-subject effects, called Centred affect and Mean affect, respectively.
Mean affect was the mean of each participant’s raw SAM valence scores. Centred affect was each raw SAM valence score minus the Mean affect of the participant who provided it.
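A minimal sketch of this within/between decomposition, assuming a data frame `d` with columns `Subject` and `valence` (names illustrative):

```r
library(dplyr)

d <- d %>%
  group_by(Subject) %>%
  mutate(
    mean_affect    = mean(valence),         # between-subject component
    centred_affect = valence - mean_affect  # within-subject component
  ) %>%
  ungroup()
```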
Spatial WM scores also varied considerably both between and within participants over the course of the study, as seen in Figure 4. Based on these results, a new variable was computed – Systematic daily variability – using the formula provided by Brose et al. (2012). In short, Systematic daily variability is the proportion of variance in WM scores that varies systematically from day to day, as opposed to varying from trial to trial.
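The exact computation follows Brose et al. (2012); purely as a rough sketch of the idea described above, the quantity can be thought of as a within-person variance ratio,

\[
\text{Systematic daily variability} = \frac{\sigma^2_{\text{day}}}{\sigma^2_{\text{day}} + \sigma^2_{\text{trial}}},
\]

where \(\sigma^2_{\text{day}}\) denotes the variance of a participant’s day-level WM scores and \(\sigma^2_{\text{trial}}\) the remaining trial-to-trial variance within days.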
Figure 4. Histograms of spatial WM accuracy. Each plot represents a different participant. Each bar represents a range of WM scores, with higher scores on the right. Bar height represents the proportion of a participant’s scores falling within that range.
Tests completed before is the variable we used to control for practice effects. Based on previous research, practice effects on spatial WM should be quadratic in the early stages of testing. However, this was not the case for most participants in our study (see Figure 5), which might be due to the study’s shorter duration, smaller sample, and/or high rates of missed assessments. As such, Tests completed before was included as a linear covariate.
We also did not include any additional quadratic or cubic terms, as these could make the model overfit. The linear effect was sufficient to account for practice effects in our study. Adding quadratic and/or cubic practice-effect predictors would likely force the model to mistake random noise for signal, which would reduce its generalisability to new samples (see Yarkoni & Westfall, 2017, for more information on the issue of overfitting).
Figure 5. Line plots showing how WM performance (y-axis) changed with each consecutive assessment (x-axis) for every participant in the study. The black lines were fitted using the least squares method for each participant.
As seen in Table 1, the level 2 variables had enough variance to justify their usage, and there were no obvious outliers.
Table 1. Descriptive statistics of variables measured at the individual level.
| Variable | Mean | SD | Median | Min | Max | Range |
|---|---|---|---|---|---|---|
| Systematic Daily Variability | 0.66 | 0.15 | 0.68 | 0.28 | 0.97 | [0,1] |
| Anxiety | 2.20 | 0.48 | 2.10 | 1.40 | 3.25 | [1,4] |
| Busyness | 3.38 | 0.65 | 3.50 | 2.00 | 4.50 | [1,5] |
| Neuroticism | 4.95 | 2.24 | 5.00 | 0.00 | 9.00 | [0,9] |
| Fluid Intelligence | 78.68 | 14.10 | 79.00 | 53.00 | 110.00 | [0,135] |
| Mean Valence | 5.53 | 0.98 | 5.40 | 3.33 | 8.06 | [1,9] |
One of the assumptions of multiple regression is that the independent variables are not perfectly collinear (Field, Miles, & Field, 2012). In our study, the only two variables that potentially violated this assumption were Anxiety and Neuroticism (r = .72). However, Neuroticism was only added for its interaction with practice effects, so its high correlation with Anxiety is not problematic.
The other variables were reasonably independent from one another, all \(|r| \leq .35\) (see Figure 6), so perfect collinearity was not a problem (Field et al., 2012).
Figure 6. Scatterplots and Pearson’s correlations of all level 2 variables. The middle diagonal shows a density plot for each of the individual variables.
As previously described, much of the psychology literature treats cognitive performance as varying between people but neglects its variation within people. However, this approach might be simplistic given the research carried out on within-subject variation in cognitive performance. In our study, the random intercept term explained only 0.52 of the 0.98 total variance in spatial WM performance; the remaining 0.46 was explained by within-subject variation (see Table 2). In other words, only 53% of the total WM variance was explained by between-person (individual) factors.
Table 2. Results of the random intercept (null) model.
| Spatial WM Accuracy | |||
|---|---|---|---|
| Predictors | Estimates | CI | p |
| (Intercept) | -0.05 | -0.24 – 0.14 | 0.631 |
| Random Effects | |||
| σ2 | 0.46 | ||
| τ00 Subject | 0.52 | ||
| ICC | 0.53 | ||
| N Subject | 56 | ||
| Observations | 1930 | ||
| Marginal R2 / Conditional R2 | 0.000 / 0.530 | ||
| AIC | 4189.730 | ||
| log-Likelihood | -2091.865 | ||
This null model served as the baseline against which the subsequent models were compared.
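Continuing the earlier sketch, the null model and its intraclass correlation can be written as follows; the manual ICC calculation simply reproduces the ratio reported in Table 2.

```r
# Step 1: random intercept only ('null' model)
m0 <- lmer(wm_accuracy ~ 1 + (1 | Subject), data = d, REML = FALSE)

# ICC = between-subject variance / total variance
vc <- as.data.frame(VarCorr(m0))
vc$vcov[1] / sum(vc$vcov)   # ~ 0.52 / (0.52 + 0.46) = .53
```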
The purpose of the study was to examine the relationship between affect and WM while also accounting for random individual differences in cognitive performance. Hence, the previous model was extended to include two level 1 fixed effects: Centred affect and Tests completed before. Both effects were statistically significant (see Table 3) and were therefore retained for later modelling. The positive Centred affect estimate suggests that the more negative participants felt, the less accurate they were on the WM task.
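Continuing the sketch, Step 2 adds the two level 1 fixed effects; `tab_model()` can then display both models side by side, as in the table below (variable names remain illustrative).

```r
# Step 2: add the level 1 fixed effects to the null model
m1 <- lmer(wm_accuracy ~ centred_affect + tests_completed_before +
             (1 | Subject), data = d, REML = FALSE)

tab_model(m0, m1)  # null model and level 1 model side by side
```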
Table 3. A comparison of the random intercept models: the null model on the left and the model with level 1 fixed effects on the right. * p < .05, ** p < .01, *** p < .001
| Spatial WM Accuracy | ||||||
|---|---|---|---|---|---|---|
| Predictors | Estimates | CI | p | Estimates | CI | p |
| Centred affect | | | | 0.03 * | 0.00 – 0.06 | 0.036 |
| Tests completed before | | | | 0.20 *** | 0.17 – 0.24 | <0.001 |
| Random Effects | ||||||
| σ2 | 0.46 | 0.42 | ||||
| τ00 | 0.52 Subject | 0.50 Subject | ||||
| ICC | 0.53 | 0.54 | ||||
| N | 56 Subject | 56 Subject | ||||
| Observations | 1930 | 1930 | ||||
| Marginal R2 / Conditional R2 | 0.000 / 0.530 | 0.044 / 0.562 | ||||
| AIC | 4189.730 | 4033.345 | ||||
| log-Likelihood | -2091.865 | -2011.673 | ||||
This step is analogous to the previous one, but this time the level 2 predictors were added, i.e. the variables that differed between but not within people.
Our main variable of interest at this stage was Mean affect, which was not statistically significant (p = .13). Compared with Centred affect, Mean affect had a stronger association with spatial WM – \(\beta = -.14\) as opposed to \(\beta = .03\) – but this was not enough to reach statistical significance, owing to the lower statistical power of level 2 predictors (see Figure 7). The direction of the Mean affect association was also opposite to that of Centred affect (i.e. the more positive a person felt on average, the worse that person scored on the WM tests). These results are difficult to interpret because the slope estimates for level 2 predictors are much less certain than those for level 1 predictors (see the confidence intervals in Figure 7). In our model, this happens because there are only 56 observations for the level 2 variables (one per participant) but 1930 observations for the level 1 variables, so the latter have a much smaller standard error of the slope and are therefore more likely to reach statistical significance.
As such, Mean affect was retained for later modelling stages so that readers can compare its effect with those of the other hypothesis-related terms from later stages. For now, it is important to remember that this term was not statistically significant.
Figure 7. Random intercept model with level 1 and level 2 predictors, before the exclusion of insignificant level 2 predictors. The lines represent 95% confidence intervals. * p < .05, ** p < .01, *** p < .001
As for the covariates, only the Fluid intelligence term had a significant association with spatial WM (p < .001). All other level 2 covariates were dropped before we proceeded with later modelling. The final model at this stage is displayed in Table 4.
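A sketch of the model retained at this stage, continuing the notation above (variable names illustrative):

```r
# Steps 3-4: add level 2 predictors, then keep the significant covariate
# (Fluid intelligence) alongside the hypothesis-related Mean affect term
m2 <- lmer(wm_accuracy ~ centred_affect + tests_completed_before +
             fluid_intelligence + mean_affect +
             (1 | Subject), data = d, REML = FALSE)
```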
Table 4. On the left, the random intercept model with level 1 predictors. On the right, the same model with added level 2 predictors, after dropping insignificant level 2 covariates.
| Spatial WM Accuracy ||||||
|---|---|---|---|---|---|---|
| Predictors | Estimates | CI | p | Estimates | CI | p |
| Centred affect | 0.03 * | 0.00 – 0.06 | 0.036 | 0.03 * | 0.00 – 0.06 | 0.036 |
| Tests completed before | 0.20 *** | 0.17 – 0.24 | <0.001 | 0.20 *** | 0.17 – 0.23 | <0.001 |
| Fluid intelligence | | | | 0.30 *** | 0.13 – 0.48 | 0.001 |
| Mean affect | | | | -0.13 | -0.31 – 0.04 | 0.138 |
| Random Effects | ||||||
| σ2 | 0.42 | 0.42 | ||||
| τ00 | 0.50 Subject | 0.41 Subject | ||||
| ICC | 0.54 | 0.49 | ||||
| N | 56 Subject | 56 Subject | ||||
| Observations | 1930 | 1930 | ||||
| Marginal R2 / Conditional R2 | 0.044 / 0.562 | 0.141 / 0.563 | ||||
| AIC | 4033.345 | 4026.315 | ||||
| log-Likelihood | -2011.673 | -2006.157 | ||||
So far, we have allowed for random variation in WM scores between people but assumed the same effects of the predictor variables for each person. This means that, for every level 1 predictor, its effect – represented by a beta slope – is assumed to be equal across participants. For example, the current model assumes that all participants improve equally with every new WM test (i.e. equal practice effects), even though the earlier data exploration already suggests that some people improved faster than others (see Figure 5). Multilevel models can incorporate such random effects in order to better estimate the fixed effects of the level 1 variables.
Following Hox et al. (2010), we expanded our model step by step by separately adding a random slope for each of the two level 1 predictors. First, we tested whether adding a random slope for Centred affect significantly increased the log-likelihood of the model. This was not the case, \(\chi^2\)(1) = 3.37, p = .07.
Then, we tested whether adding a random slope for Tests completed before significantly increased the log-likelihood of the model. Here, the random slope meaningfully improved the model’s predictions, \(\chi^2\)(1) = 46.13, p < .001. The covariance between the random slopes and intercepts was negligible (r = -.05), suggesting that estimating this covariance might not significantly improve the model’s predictions.
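A sketch of the random-slope comparisons described in this and the following paragraph; whether the slope variances were first added with or without the intercept–slope covariance is an assumption of this sketch, and the likelihood ratio tests are obtained with `anova()`.

```r
# Random slope for Centred affect (no intercept-slope covariance, '||')
m3a <- lmer(wm_accuracy ~ centred_affect + tests_completed_before +
              fluid_intelligence + mean_affect +
              (centred_affect || Subject), data = d, REML = FALSE)
anova(m2, m3a)   # does the added slope variance improve the log-likelihood?

# Random slope for Tests completed before
m3b <- lmer(wm_accuracy ~ centred_affect + tests_completed_before +
              fluid_intelligence + mean_affect +
              (tests_completed_before || Subject), data = d, REML = FALSE)
anova(m2, m3b)   # retained: the slope variance clearly improves fit

# Allowing the intercept and slope to correlate, compared against m3b
m3c <- lmer(wm_accuracy ~ centred_affect + tests_completed_before +
              fluid_intelligence + mean_affect +
              (tests_completed_before | Subject), data = d, REML = FALSE)
anova(m3b, m3c)  # tests whether estimating the covariance improves fit
```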
A simpler model with uncorrelated slopes was statistically indistinguishable from the model with correlated slopes, \(\chi^2\)(1) = .06, p = .80, so we proceeded to the later stages of the modelling process with the simpler model (see Balasubramanian, 1997, for an application of Occam’s razor in classical statistics). Table 5 shows a comparison between the final model from Step 4 of the analysis and the random slopes model from Step 5 (with uncorrelated random slopes for practice effects).
Table 5. Random intercept models with level 1 and level 2 predictors. The model on the left does not include random slopes; the model on the right includes uncorrelated random slopes for the Tests completed before variable.
| Spatial WM Accuracy | ||||||
|---|---|---|---|---|---|---|
| Predictors | Estimates | CI | p | Estimates | CI | p |
| Centred affect | 0.03 * | 0.00 – 0.06 | 0.036 | 0.02 | -0.01 – 0.05 | 0.118 |
| Tests completed before | 0.20 *** | 0.17 – 0.23 | <0.001 | 0.22 *** | 0.17 – 0.27 | <0.001 |
| Fluid intelligence | 0.30 *** | 0.13 – 0.48 | 0.001 | 0.30 ** | 0.12 – 0.48 | 0.001 |
| Mean affect | -0.13 | -0.31 – 0.04 | 0.138 | -0.13 | -0.31 – 0.05 | 0.151 |
| Random Effects | ||||||
| σ2 | 0.42 | 0.40 | ||||
| τ00 | 0.41 Subject | 0.42 Subject | ||||
| 0.02 Subject.1 | ||||||
| ICC | 0.49 | 0.51 | ||||
| N | 56 Subject | 56 Subject | ||||
| Observations | 1930 | 1930 | ||||
| Marginal R2 / Conditional R2 | 0.141 / 0.563 | 0.145 / 0.582 | ||||
| AIC | 4026.315 | 3982.183 | ||||
| log-Likelihood | -2006.157 | -1983.091 | ||||
After adding the random slopes, the coefficient estimates changed for several variables (see Table 5). Most notably, once individual differences in practice effects were modelled, the Centred affect term no longer reached the significance threshold of \(\alpha\) = .05.
The final step was to include interactions between the level 2 and level 1 variables. These included:
- Centred affect × Systematic daily variability (hypothesis-driven)
- Mean affect × Systematic daily variability (hypothesis-driven)
- Neuroticism × Tests completed before (covariate)
We tested two hypothesis-driven interaction terms at this stage, so their significance thresholds needed to be adjusted for repeated testing; after applying the Bonferroni correction, the required \(\alpha\) = .025. As seen in Figure 8, none of the added terms were significant, so we dropped the covariate interaction and retained the hypothesis-driven terms to allow for easy interpretation of the final results (see Table 6).
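A sketch of this last step, continuing the notation above; the dropped covariate interaction is assumed here to be Neuroticism × Tests completed before, in line with the earlier rationale for including Neuroticism.

```r
# Step 6: interaction terms added to the random-slope model
m4_full <- lmer(wm_accuracy ~ tests_completed_before + centred_affect +
                  fluid_intelligence + mean_affect +
                  centred_affect:systematic_daily_variability +
                  mean_affect:systematic_daily_variability +
                  neuroticism:tests_completed_before +  # covariate interaction (assumed)
                  (tests_completed_before || Subject), data = d, REML = FALSE)

# Final model after dropping the insignificant covariate interaction
m4 <- lmer(wm_accuracy ~ tests_completed_before + centred_affect +
             fluid_intelligence + mean_affect +
             centred_affect:systematic_daily_variability +
             mean_affect:systematic_daily_variability +
             (tests_completed_before || Subject), data = d, REML = FALSE)
tab_model(m4)
```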
Figure 8. Random intercept model with level 1 and level 2 predictors, random slopes for the Tests completed before variable, and all interaction terms. The lines represent 95% confidence intervals. * p < .05, ** p < .01, *** p < .001
Table 6. Final model summary.
| Spatial WM Accuracy |||
|---|---|---|---|
| Predictors | Estimates | CI | p |
| Tests completed before | 0.22 *** | 0.16 – 0.27 | <0.001 |
| Centred affect | 0.03 | -0.00 – 0.06 | 0.078 |
| Fluid intelligence | 0.28 ** | 0.10 – 0.46 | 0.002 |
| Mean affect | -0.17 | -0.35 – 0.01 | 0.070 |
| Centred affect * Systematic daily variability | 0.02 | -0.01 – 0.06 | 0.124 |
| Mean affect * Systematic daily variability | 0.17 | -0.07 – 0.40 | 0.165 |
| Random Effects | |||
| σ2 | 0.40 | ||
| τ00 Subject | 0.41 | ||
| τ00 Subject.1 | 0.02 | ||
| ICC | 0.50 | ||
| N Subject | 56 | ||
| Observations | 1930 | ||
| Marginal R2 / Conditional R2 | 0.160 / 0.582 | ||
| AIC | 3981.917 | ||
| log-Likelihood | -1980.959 | ||