lmm check

library(dplyr)
library(ggplot2)
library(lme4)

data <- read.csv("all_data.csv", header = TRUE)

My data is from a task-switching experiment in which participants respond to two different types of task sequence (ABA vs. CBA), and we manipulate whether there is a response repetition or not (repetition vs. switch). The dependent variable is the outcome of a model fit routine, and is therefore a single model parameter per subject per condition. The data has the following structure:

data[1:30, ]

##    subject sequence response_repetition  value study
## 1     1_wm      ABA              switch 1.4649    wm
## 2     1_wm      ABA          repetition 1.6514    wm
## 3     1_wm      CBA              switch 1.6809    wm
## 4     1_wm      CBA          repetition 1.5641    wm
## 5    10_wm      ABA              switch 2.2977    wm
## 6    10_wm      ABA          repetition 2.1395    wm
## 7    10_wm      CBA              switch 2.5105    wm
## 8    10_wm      CBA          repetition 2.4147    wm
## 9    11_wm      ABA              switch 1.6748    wm
## 10   11_wm      ABA          repetition 1.4292    wm
## 11   11_wm      CBA              switch 1.4295    wm
## 12   11_wm      CBA          repetition 1.7851    wm
## 13   12_wm      ABA              switch 1.8995    wm
## 14   12_wm      ABA          repetition 2.0838    wm
## 15   12_wm      CBA              switch 1.7956    wm
## 16   12_wm      CBA          repetition 1.9365    wm
## 17   13_wm      ABA              switch 1.2129    wm
## 18   13_wm      ABA          repetition 1.1364    wm
## 19   13_wm      CBA              switch 1.3179    wm
## 20   13_wm      CBA          repetition 1.2999    wm
## 21   14_wm      ABA              switch 1.2494    wm
## 22   14_wm      ABA          repetition 1.2080    wm
## 23   14_wm      CBA              switch 1.3648    wm
## 24   14_wm      CBA          repetition 1.3058    wm
## 25   15_wm      ABA              switch 1.3755    wm
## 26   15_wm      ABA          repetition 1.5148    wm
## 27   15_wm      CBA              switch 1.5729    wm
## 28   15_wm      CBA          repetition 1.4449    wm
## 29   16_wm      ABA              switch 1.7125    wm
## 30   16_wm      ABA          repetition 1.9384    wm

str(data)

## 'data.frame':    756 obs. of  5 variables:
##  $ subject            : Factor w/ 189 levels "1","1_aging",..: 4 4 4 4 8 8 8 8 12 12 ...
##  $ sequence           : Factor w/ 2 levels "ABA","CBA": 1 1 2 2 1 1 2 2 1 1 ...
##  $ response_repetition: Factor w/ 2 levels "repetition","switch": 2 1 2 1 2 1 2 1 2 1 ...
##  $ value              : num  1.46 1.65 1.68 1.56 2.3 ...
##  $ study              : Factor w/ 4 levels "aging","mayr",..: 4 4 4 4 4 4 4 4 4 4 ...

subject: The subject number within the particular study, given as a number followed by “_x”, where “x” is the value of the study variable (described below). Note that different subjects were in each study.
sequence: 2 levels describing a repeated-measures manipulation of task sequence (ABA vs. CBA).
response_repetition: 2 levels describing a repeated-measures manipulation of whether there is a response repetition (“repetition”) or not (“switch”).
value: The dependent variable of the study. This is a single parameter value from a model fit routine.
study: 4 levels. We replicated this experiment across 4 different studies (“wm”, “aging”, “mayr”, and “new”). Note that different subjects were in each study.
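
Because the subject IDs encode the study, it may be worth checking that the two columns agree (the str() output above lists a level "1" with no suffix, for example). The following is only a sketch on a made-up toy data frame; `toy`, `suffix`, `suffix_ok`, and `studies_per_subject` are hypothetical names, and the rows are not taken from all_data.csv.

```r
# Toy rows mimicking the subject and study columns (values are made up).
toy <- data.frame(
  subject = c("1_wm", "1_wm", "1_aging", "2_mayr"),
  study   = c("wm",   "wm",   "aging",   "mayr"),
  stringsAsFactors = FALSE
)

# Text after the first underscore in the subject ID.
# An ID without an underscore is returned unchanged, so it will not match.
suffix <- sub("^[^_]*_", "", toy$subject)

# TRUE where the suffix agrees with the study column.
suffix_ok <- suffix == toy$study

# Number of distinct studies each subject ID appears in (should be 1).
studies_per_subject <- tapply(toy$study, toy$subject,
                              function(s) length(unique(s)))

print(toy[!suffix_ok, ])  # rows needing a closer look, if any
```

On the real data, the same lines would run with `data` in place of `toy`, after converting the factor columns with as.character().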

Although participants completed many trials per condition, the model fit routine returns a single parameter value per subject per condition. I therefore cannot fit the LMM to individual-trial data and have to treat the data as aggregated.
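
The str() output above is consistent with this: 756 rows is 189 subjects times 4 cells. A quick way to confirm that every subject contributes exactly one value per cell might look like the sketch below. It uses a made-up toy data frame, and `toy`, `cell_counts`, and `all_ones` are hypothetical names.

```r
# Toy data: 3 hypothetical subjects crossed with the 2 x 2 design.
toy <- expand.grid(
  subject = c("1_wm", "2_wm", "1_aging"),
  sequence = c("ABA", "CBA"),
  response_repetition = c("repetition", "switch"),
  stringsAsFactors = FALSE
)
toy$value <- seq(1.0, by = 0.1, length.out = nrow(toy))  # made-up values

# Count rows per subject x sequence x response_repetition cell.
cell_counts <- table(toy$subject, toy$sequence, toy$response_repetition)

# TRUE only if every cell holds exactly one observation.
all_ones <- all(cell_counts == 1)
print(all_ones)
```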

Here’s a plot of the data:

I am interested in the 2-way interaction of sequence and response_repetition.
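
With one value per subject per cell, this interaction can be read as a difference of differences within each subject: (ABA switch minus ABA repetition) minus (CBA switch minus CBA repetition). The sketch below computes that contrast on made-up numbers for two hypothetical subjects. It is meant only to make the quantity of interest concrete; it is not the LMM itself, and `toy`, `cells`, and `contrast` are hypothetical names.

```r
# Made-up values for two hypothetical subjects, one row per condition.
toy <- data.frame(
  subject = rep(c("1_wm", "2_wm"), each = 4),
  sequence = rep(c("ABA", "ABA", "CBA", "CBA"), times = 2),
  response_repetition = rep(c("switch", "repetition"), times = 4),
  value = c(1.4, 1.6, 1.7, 1.5,
            2.3, 2.1, 2.5, 2.4)
)

# 3-d array indexed by subject, sequence, and response_repetition.
# Each cell holds a single value, so mean() just returns that value.
cells <- tapply(toy$value,
                list(toy$subject, toy$sequence, toy$response_repetition),
                mean)

# Per-subject interaction contrast (difference of differences).
contrast <- (cells[, "ABA", "switch"] - cells[, "ABA", "repetition"]) -
            (cells[, "CBA", "switch"] - cells[, "CBA", "repetition"])
print(contrast)
```

A contrast that is reliably different from zero across subjects is what a sequence by response_repetition interaction would reflect.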