This document is a companion for my honours dissertation. It contains information that did not make it into the dissertation itself, but it is relevant enough to have its own place somewhere. It also provides some more resources for whoever will read the dissertation. Furthermore, this document showcases a little bit more of the work I have done throughout this year (better the second half of the year). The complete code used for the analyses can be found on Github

Permutation Analyses: Some notes

When I first started with permutation analyses, I wrote my own permutation functions to understand the procedures involved. I believe that visual tools play a vital role in learning, so I created two visualizations here. They both show a simple permutation test for the equality of means between two groups.

If we are interested in comparing whether two groups have different means, we can state the following hypotheses:

\[ H_{0}: X_{1} = X_{2} \rightarrow X_{1} - X_{2} = 0\] \[ H_{1}: X_{1} \neq X_{2} \rightarrow X_{1} - X_{2} = n \] Under \(H_{0}\) we can assume that the two groups are identical and that the labels are meaningless. Consequently, if we switch the values across groups, nothing should change, and the difference in means should be zero or near zero. This is the basic process used for permutation analyses:

  1. Compute the observed difference in groups means (\(n\))
  2. Shuffle the values across groups
  3. Compute the permuted group means
  4. Compute the difference between the permuted group means
  5. Repeat this process thousands of times

At the end of the permutations, we have a series of permuted differences in group means. This set constitutes the permutation distribution, which is a constructed sampling distribution. To verify whether the observed difference is statistically significant, we can compute the proportion of permuted differences that are \(\le -n\) or \(\ge n\), for a two-tailed test. This proportion is the p-value. It is interpreted exactly like a classic p-value by comparing it against a predefined \(\alpha\).

Below, an animation shows both the permutation at work and the permutation distribution being constructed. In the left plot, you can see how values are randomly switched between groups. The colour of the dots indicates their original group, and it helps follow the process. Notably, the two group sizes are different, and the permutation takes care of this by reassigning the values according to the original sample sizes. The histogram shows the permutation distribution and the two-tailed p-value (in green). This highlights how the p-value is defined as a proportion. Specifically, here it represents the proportion of values below or above \(\le -n\) or \(\ge n\), respectively. With \(n\) reflecting the original difference in means between groups A and B.

knitr::include_graphics(here::here("Images", "switch.gif"))
knitr::include_graphics(here::here("Images", "histGrow.gif"))

The second tool is a Shiny app, which allows the user to change the sample size, mean and SD of the distributions and highlights the corresponding p-value. You can find the app here.

The code for the animated plots is:

library(tidyverse)
library(gganimate)
library(gifski)
library(ggthemr)
library(magick)

## Set up theme -----
palette <- define_palette(
  swatch = c("#111111", "#E84646", "#233B43", "#C29365", "#168E7F", "#65ADC2", "#109B37", "#DB784D"),
  gradient = c(lower = "#E84646", upper = "#109B37")
)
ggthemr(palette, layout = "scientific")


# Create permutation function ---------------------------------------------

computePerm <- function(data, np, size1, size2) {
  result <- data

  for (n in 1:np) {

    # colName <- paste('rep', n, sep = '_')
    newperm <- sample(data$group)
    result[, ncol(result) + 1] <- newperm
    colnames(result)[ncol(result)] <- as.character(n)
  }

  return(result)
}


# Create Data -------------------------------------------------------------

# Create groups
x <- rnorm(n = 30, 7.5)
y <- rnorm(n = 20, 7)

# Groups are defined as integer so we can exploit the transition_time function in gganimate later
df <- data.frame(
  values = c(x, y),
  group = c(
    rep(5, length(x)),
    rep(8, length(y))
  )
)

# Do permutations
permShort <- computePerm(df, 20, length(x), length(y)) %>%
  pivot_longer(
    cols = !c(values, group),
    names_to = "permutation",
    values_to = "new_values"
  ) %>%
  mutate(permutation = as.integer(permutation))

permLong <- computePerm(df, 5000, length(x), length(y)) %>%
  pivot_longer(
    cols = !c(values, group),
    names_to = "permutation",
    values_to = "new_values"
  ) %>%
  mutate(permutation = as.integer(permutation))

permAvg <- permLong %>%
  group_by(permutation, new_values) %>%
  summarise(avg = mean(values)) %>%
  ungroup() %>%
  pivot_wider(
    names_from = new_values,
    values_from = avg
  ) %>%
  rowwise() %>%
  mutate(difference = `5` - `8`)


df_ani <- permAvg %>%
  split(.$permutation) %>%
  accumulate(~ bind_rows(.x, .y)) %>%
  bind_rows(.id = "frame") %>%
  mutate(frame = as.integer(frame))
# Plot --------------------------------------------------------------------
# For the plot we treat the groups as continuous values.
# However, as we provided only two numbers to labels the groups (5 and 8), we can use the original labels to colour the dots
# without these changing during the animation. Then we use the new numeric values to animate according to the permutation
OriginalMean <- permShort %>%
  group_by(group) %>%
  summarise(avg = mean(values)) %>%
  ungroup() %>%
  summarise(avg = abs(diff(avg)))

# Dots plot
permShort %>%
  ggplot(aes(
    x = new_values,
    y = values,
    colour = factor(group)
  )) +
  geom_point(
    size = 6,
    alpha = .8
  ) +
  geom_vline(
    xintercept = 6.5,
    colour = "black",
    size = 2
  ) +
  scale_x_continuous(breaks = c(5.5, 7.5), labels = c("A", "B")) +
  scale_colour_ggthemr_d(labels = c("A", "B")) +
  labs(
    x = "groups",
    y = "values",
    colour = "Original Group"
  ) +
  transition_states(permutation,
    transition_length = 1,
    state_length      = 0.1
  ) +
  ease_aes(default = "linear") +
  ggtitle("Permutation in Action")


# Histogram Plot
df_ani %>%
  ggplot(aes(x = difference)) +
  geom_histogram(fill = palette[["swatch"]][4]) +
  geom_histogram(
    data = df_ani %>% filter(difference >= OriginalMean$avg | difference <= -OriginalMean$avg),
    colour = palette[["swatch"]][5],
    fill = palette[["swatch"]][5]
  ) +
  labs(x = "difference in means") +
  transition_manual(frame) +
  ease_aes("linear") +
  enter_fade() +
  exit_fade() +
  ggtitle("Permutation Distribution",
    subtitle = "Permutation: {frame}"
  )

EEG ringing effect: the case of the mysterious oscillations

As reported in the dissertation, while I was exploring the EEG data, I observed a weird effect in the time-frequency domain. As you can see in the picture below, there is a clear ringing artifact, with the frequency amplitude oscillating steadily. There was obviously something going on, but it was not clear what. The people who originally collected and analyzed the data have not reported this effect and were unaware of it.

Ringing effect in one raw data file from the experiment

Ringing effect in one raw data file from the experiment

After investigating the matter, I noticed that the raw data started with approximately one second of very negative values. I think the cause of this is how the experiment was coded; it appears that a bunch of triggers was sent right at the beginning, which could be the cause.

EEG raw data with negative values at the onset

EEG raw data with negative values at the onset

To test this hypothesis, I simulated the data starting from the sample dataset included with EEGLAB. The plan was to inject the first two seconds of data from one of our participants and observe whether the ringing effects appeared. If it did, the solution to this problem was simple: remove those two seconds from each participant.

The first step was to properly set up the clean dataset and verify that no strange artifacts were visible. The dataset contained 68 channels, but our data was recorded with a 128 electrodes headset. Thus, the 68 channels were duplicated to create a 128-mirrored version of the dataset. Then we verified that the channels spectra were fine. Once this was done, the incriminated portion was transplanted at the beginning of the dataset. As hypothesized, the ringing artifact appeared. The figures below show the before and after this simulation. On the left, the original spectrum of the clean dataset. On the right, the spectrum of the same dataset after the portion of data was attached at the onset.

knitr::include_graphics(here::here("Images", "clean_dataset.png"))
knitr::include_graphics(here::here("Images", "simulated_ringing.png"))
Left: Original clean data. Right:Same dataset with bad onset insertedLeft: Original clean data. Right:Same dataset with bad onset inserted

Left: Original clean data. Right:Same dataset with bad onset inserted

It is important to highlight that resolving this problem was essential, as this bad portion of data would have interacted with the sliding window of the high/low pass filter, and it was not clear what effects would have caused to the final data.

Behavioural Analyses

This section contains some extra material regarding the analyses of behavioural data (reaction times and accuracy scores)

Reaction Times Assumptions

I based my honours project’s analyses on permutation techniques for two main reasons. Firstly, the data was unbalanced, and the sample sizes were small. The original analyses were based on ANOVAs conducted through SPSS. This could have been slightly problematic as SPSS uses Type III ANOVA by default, which might not be the best solution for an unbalanced design. Secondly, I wanted to use this year to learn new tools and increase my skill set. It has been a great challenge and a huge learning curve, but it was worth it and intriguing.

Although the permutation analyses were nonparametric, I began by checking the distribution of the behavioral data. In this way, I could get a sense of it before jumping into the proper analyses. The following code was used to load and prepare the data. The dataset required a bit of wrangling to have it set up properly.

# Define Working Memory Groups according to previous analyses
highWMC <- c(2, 3, 4, 9, 21, 27, 30, 31, 34)
lowWMC <- c(6, 7, 8, 12, 14, 17, 20, 22, 23, 32, 37, 39, 41)

## Load and set up -------------------------------------------

# Original dataset
originalData <- read.csv(here::here("DataSets", "organized_data_ALL.csv"))

# Clean up
cleanData <- originalData %>%
  select(
    "Subject",
    "Age",
    "Sex",
    "Handedness",
    "Procedure_Block_",
    "Trial",
    "Response_ACC",
    "Response_RT",
    "StimTrig",
    "StimTrig2",
    "newTargPos":"newDistance"
  ) %>%
  # Standardize row names
  janitor::clean_names() %>%
  # Add WMC information for each participant
  dplyr::mutate(
    wmc_group = case_when(
      (subject %in% highWMC) ~ "high",
      (subject %in% lowWMC) ~ "low"
    ),
    new_distr_type = replace(new_distr_type, new_distr_type == "0", "O")
  ) %>% # Some values were store with this typo (O vs 0)
  # Select only test blocks
  dplyr::filter(procedure_block == "BlockLAI") %>%
  # Transform response times in numeric values
  dplyr::mutate(response_rt = as.integer(response_rt))

## Extract Reaction times -------
rtDataset <- cleanData %>%
  # Group observations by subject
  dplyr::group_by(subject, new_distr_type, new_distr_col) %>%
  # Remove RT = 0, which probably are people that did not responded
  dplyr::filter(response_rt != 0) %>%
  # Compute mean for each subject
  dplyr::mutate(rt_mean = mean(response_rt)) %>%
  # Retain only useful columns
  dplyr::select(subject, wmc_group, new_distr_type, new_distr_col, rt_mean) %>%
  # Remove duplicated rows. Dataset is in long-format,
  # thus each subject has the same exact mean information in each row in the new dataset
  distinct(subject, .keep_all = TRUE) %>%
  dplyr::ungroup()

There were three input variables that we were interested in:

  • Working Memory Capacity (WMC)
  • Distractor Type
  • Distractor Colour

Here I report the preliminary analyses for reaction times (RTs). The same steps were performed on accuracy scores, and the code is freely available on Github. The first step was to obtain the descriptive statistics for these variables.

# Compute descriptive statistics for reaction times
## Compute descriptives for RT based on WMC
wmcDescriptives <- rtDataset %>%
  dplyr::group_by(wmc_group, new_distr_type, new_distr_col) %>%
  dplyr::summarise(
    n = n(),
    rt.mean = mean(rt_mean),
    median = median(rt_mean),
    sd = sd(rt_mean, na.rm = TRUE),
    se = sd / sqrt(n),
    min = min(rt_mean),
    max = max(rt_mean),
    IQR = IQR(rt_mean),
    skew = skew(rt_mean),
    kurt = kurtosi(rt_mean),
    shapiro.stat = shapiro.test(rt_mean)$statistic,
    shapiro.pvalue = shapiro.test(rt_mean)$p.value
  )
Descriptive statistics for RT for each input variable
wmc_group new_distr_type new_distr_col n rt.mean median sd se min max IQR skew kurt shapiro.stat shapiro.pvalue
high L Gr 9 514.6740 519.3376 25.93755 8.645850 461.7083 547.0495 11.04688 -0.7678409 -0.5752857 0.9033257 0.2719111
high L Or 9 446.1542 436.7636 45.73257 15.244191 352.6722 514.6915 40.57107 -0.4982301 -0.4249008 0.9333132 0.5135182
high O Gr 9 490.3568 493.4254 17.65994 5.886647 450.6028 511.0295 13.82530 -1.0233715 0.1486496 0.8843212 0.1742641
high O Or 9 420.3744 415.3387 41.64673 13.882245 353.5063 497.6068 49.89903 0.2859712 -0.8492266 0.9643191 0.8422620
low L Gr 13 620.4996 570.2393 132.16078 36.654805 461.3671 803.6742 245.19079 0.2070070 -1.8170464 0.8704524 0.0530390
low L Or 13 564.4444 512.3112 189.46817 52.549016 321.0054 895.2227 312.45687 0.3725534 -1.4925448 0.9211625 0.2600702
low O Gr 13 538.1017 535.9964 60.63407 16.816865 450.3515 632.2132 79.05114 0.1165274 -1.4191260 0.9503752 0.6039868
low O Or 13 443.6131 464.8387 57.26000 15.881067 332.1597 515.9219 78.14197 -0.5116069 -1.1591721 0.9325407 0.3677592

The table is quite dense, but only a few columns are really important. Firstly, the column n shows that the high and low WMC groups included 9 and 13 participants each, respectively. Secondly, while the skewness values (column skew) are mostly quite low, the kurtosis (column kurt) tends to be higher for the low WMC group. Nonetheless, the Shapiro-Wilk test is significant only for the low WMC group, with green L distractors. As the tests for normality could be misleading, visual inspection of raw data and QQ-plots was used.

## Distributions ----

# High WMC
wmcHighRT <- rtDataset %>%
  dplyr::filter(wmc_group == "high") %>%
  ggplot(aes(
    x = new_distr_col,
    y = rt_mean,
    fill = new_distr_col
  )) +
  # Add distribution
  ggdist::stat_halfeye(
    adjust = .5,
    width = .6,
    justification = -.2,
    .width = 0,
    point_colour = NA,
    alpha = .8
  ) +
  # Add boxplot
  geom_boxplot(
    width = .12,
    outlier.colour = NA,
    show.legend = FALSE
  ) +
  # Add point
  gghalves::geom_half_point(
    mapping = aes(colour = new_distr_col),
    side = "l",
    range_scale = .4,
    alpha = 0.6,
    size = 2,
    show.legend = FALSE
  ) +
  # Set colours to green and orange to match the distractor colours in the experiment
  scale_colour_manual(values = palette[["swatch"]][c(7, 8)]) +
  scale_fill_manual(
    values = palette[["swatch"]][c(7, 8)],
    name = "Distractor Colour"
  ) +
  facet_grid(~new_distr_type) +
  # Change labels
  labs(
    x = "",
    y = "Reaction Times (ms)"
  )

# Low WMC
wmclowRT <- rtDataset %>%
  dplyr::filter(wmc_group == "low") %>%
  ggplot(aes(
    x = new_distr_col,
    y = rt_mean,
    fill = new_distr_col
  )) +
  # Add distribution
  ggdist::stat_halfeye(
    adjust = .5,
    width = .6,
    justification = -.2,
    .width = 0,
    point_colour = NA,
    alpha = .8
  ) +
  # Add boxplot
  geom_boxplot(
    width = .12,
    outlier.colour = NA,
    show.legend = FALSE
  ) +
  # Add point
  gghalves::geom_half_point(
    mapping = aes(colour = new_distr_col),
    side = "l",
    range_scale = .4,
    alpha = 0.6,
    size = 2,
    show.legend = FALSE
  ) +
  # Set colours to green and orange to match the distractor colours in the experiment
  scale_colour_manual(values = palette[["swatch"]][c(7, 8)]) +
  scale_fill_manual(
    values = palette[["swatch"]][c(7, 8)],
    name = "Distractor Colour"
  ) +
  facet_grid(~new_distr_type) +
  # Change labels
  labs(
    x = "",
    y = "Reaction Times (ms)"
  )

cowplot::plot_grid(wmcHighRT, wmclowRT,
  nrow = 2,
  labels = c("High WMC", "Low WMC")
)

Only the distribution for low WMC with distractor circle (O) appears to be normally distributed. This is visible from both the distribution and the QQ plots. The low WMC-distractor L distributions seem to be a bit platykurtic. The high WMC distributions seem to diverge slightly from normality. However, the small samples make the interpretation more difficult.

Finally, the Levene’s test was used to test the homogeneity of variance assumption.

car::leveneTest(rt_mean ~ wmc_group * new_distr_type * new_distr_col,
  data = rtDataset
)
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value    Pr(>F)    
## group  7  9.0049 3.802e-08 ***
##       80                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The test is significant, indicating variability in variance across groups.

Taken together, these results indicate a possible violation of the assumptions of normality, with the small sample sizes making it more difficult to really assess this.

Central Task Accuracy: a strange case

The experiment included two tasks: the primary task (LAI-Visual search task) and the central task. The latter was used as a control for the EEG analyses, thus the behavioural results were not of primary interest. However, as reported in the dissertation, the accuracy scores produced on this task were unexpected. All the groups appeared to perform at chance level (accuracy around 50%), while the original analyses reported scores all in the 90% range. The data analyzed was the one provided; therefore, there is no clear explanation for this difference. As I am not the owner of the data, nor permission for sharing was previously requested to the ethics committee, I am not in the position to upload the data set. However, I would like to be transparent and I am more than happy to share my code so that anyone interested can verify if there are mistakes on my part.

Accuracy scores were recorded as zeros and ones. There are no recordings on responses were coded, therefore we worked under the assumption that correct responses were coded as ones and incorrect responses as zeros. Accuracy was defined as the ratio between the number of correct responses over the total number of responses.

## Load data ----
accRaw <- read.csv(here::here("DataSets", "Accuracy.csv"))

## Extract accuracy values for each participant ----
accDataset <- accRaw %>%
  dplyr::group_by(subject, new_distr_type, new_distr_col) %>%
  # Remove RT = 0, which probably are people that did not responded
  dplyr::filter(response_rt != 0) %>%
  # Compute accuracy for each subject as number correct trials over total trials
  dplyr::mutate(accuracy = sum(response_acc) / n()) %>%
  # Retain only useful columns
  dplyr::select(subject, wmc_group, new_distr_type, new_distr_col, accuracy) %>%
  # Remove duplicated rows
  # dataset is in long-format, thus each subject has the same exact mean information in
  # each row in the new dataset
  distinct(subject, .keep_all = TRUE)

Descriptives statistics are:

## Compute descriptives for RT based on WMC
accDescriptives <- accDataset %>%
  dplyr::group_by(wmc_group, new_distr_type, new_distr_col) %>%
  dplyr::summarise(
    n = n(),
    mean = mean(accuracy),
    median = median(accuracy),
    sd = sd(accuracy, na.rm = TRUE), # Standard deviation
    se = sd / sqrt(n), # Standard error
    min = min(accuracy),
    max = max(accuracy),
    IQR = IQR(accuracy),
    skew = skew(accuracy),
    kurt = kurtosi(accuracy),
    shapiro.stat = shapiro.test(accuracy)$statistic,
    shapiro.pvalue = shapiro.test(accuracy)$p.value
  )

knitr::kable(x = accDescriptives, caption = "Desacriptive statistics for accuracy scores in central task")
Desacriptive statistics for accuracy scores in central task
wmc_group new_distr_type new_distr_col n mean median sd se min max IQR skew kurt shapiro.stat shapiro.pvalue
high L Gr 9 0.4972224 0.4827586 0.0422291 0.0140764 0.4563107 0.5840708 0.0490372 0.8503864 -0.6768801 0.8772516 0.1468796
high L Or 9 0.4835332 0.4888889 0.0781741 0.0260580 0.3606557 0.6153846 0.0567585 0.0956471 -1.1268252 0.9815828 0.9719197
high O Gr 9 0.4982022 0.4952381 0.0348429 0.0116143 0.4512195 0.5520833 0.0404791 0.1948920 -1.4659492 0.9576434 0.7733598
high O Or 9 0.4966807 0.5081967 0.0673325 0.0224442 0.3750000 0.5857143 0.0790021 -0.4027506 -1.1718719 0.9600357 0.7986939
low L Gr 13 0.4954711 0.4964029 0.0776576 0.0215383 0.3333333 0.6136364 0.1065760 -0.3068257 -0.7798074 0.9716334 0.9129737
low L Or 13 0.5088658 0.5128205 0.0589424 0.0163477 0.4224138 0.6000000 0.0787144 0.1328090 -1.3012127 0.9416463 0.4785710
low O Gr 13 0.5141577 0.5057471 0.0402808 0.0111719 0.4528302 0.5833333 0.0589386 0.2105255 -1.2577624 0.9481763 0.5709303
low O Or 13 0.5071119 0.5000000 0.1318849 0.0365783 0.1818182 0.7460317 0.0401116 -0.5130894 0.9779865 0.8524550 0.0306788

Here there is a visual representation of these results:

Although all the groups show similar means, a permutation ANOVA was carried out for completeness. To do so the package ez was used, as it allows to conduct permutation analyses with mixed designs.

## Define parameters ----
set.seed(2021)
doParallel::registerDoParallel(cores = parallel::detectCores()) # Parallel computing for speed
np <- 10000 # Number of permutations

ez::ezPerm(
  data = accDataset,
  dv = accuracy,
  wid = subject,
  within = c(new_distr_type, new_distr_col),
  between = wmc_group,
  perms = np,
  parallel = TRUE
)
##                                   Effect      p p<.05
## 1                              wmc_group 0.4577      
## 2                         new_distr_type 0.6641      
## 3                          new_distr_col 0.9698      
## 4               wmc_group:new_distr_type 0.9387      
## 5                wmc_group:new_distr_col 0.7412      
## 6           new_distr_type:new_distr_col 0.8209      
## 7 wmc_group:new_distr_type:new_distr_col 0.6030

No significant results as expected.

Bayesian Mixed-Effects Models

NOTE: this sections contains quite a few plots. Feel free to skip them if they are too much

This section contains the code use to create the two models used for the dissertation analyses and it presents the steps used to assess their correct convergence. The procedure and code were adapted Rens van de Schoot. As the models require some time to run, here we present the model used to investigate the N2pc amplitude, while the code for the Ptc (second model) can be found on Github.

Model creation

The N2pc component was defined as the mean amplitude within the time window 200ms-260ms. Here is a glimpse of the dataset and of how the predictors were coded.

## Rows: 8640 Columns: 6
## -- Column specification --------------------------------------------------------
## Delimiter: ","
## chr (4): wmc, id, type, colour
## dbl (2): trial, n2pc_signal
## 
## i Use `spec()` to retrieve the full column specification for this data.
## i Specify the column types or set `show_col_types = FALSE` to quiet this message.
## Rows: 8,640
## Columns: 6
## $ wmc         <fct> high, high, high, high, high, high, high, high, high, high~
## $ id          <chr> "P02", "P02", "P02", "P02", "P02", "P02", "P02", "P02", "P~
## $ trial       <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,~
## $ type        <fct> L, L, L, L, L, L, L, L, L, L, L, L, L, L, L, L, L, L, L, L~
## $ colour      <fct> Gr, Gr, Or, Gr, Or, Or, Gr, Or, Or, Or, Or, Gr, Or, Or, Or~
## $ n2pc_signal <dbl> -4.7585762, 0.2559777, 5.1265490, 5.0361513, -9.6171031, 4~
##      wmc    type colour
## [1,] "low"  "O"  "Gr"  
## [2,] "high" "L"  "Or"

The model was created to consider Working Memory Capacity (WMC), distractor type, and color as fixed effects. Participants were introduced as random effects.

# Define the parameters
iterations <- 10000 # Number of iterations per chain
warmup <- 2500 # Number of burning iteration (length of warm up period that will be discarded)
ncores <- parallel::detectCores() # Number of RAM cores
chains <- 4 # Number of Markov Chains to create to estimate the prior distribution
seed <- 2021
inits <- "random" # Randomly generates initial values
tapap <- .80 # Target-Average-Proposal-Acceptance-Probability (increase to decrease the step size during estimation)

# Create the model
model1 <- brms::brm(
  formula = n2pc_signal ~ 1 + wmc * type * colour + (1 | id),
  data = bayesData,
  warmup = warmup,
  iter = iterations,
  chains = chains,
  inits = inits,
  cores = ncores,
  control = list(adapt_delta = tapap),
  seed = seed
)
## Compiling Stan program...
## Start sampling

Once the model was created we assessed how well it converged and if the results were reliable. This procedure involves a series of steps that

Traces

The first step involves controlling the trace plots to verify that the values of the MCMC (Monte Carlo Markov Chains) chains converged within the warm-up period, i.e., the values that will be discarded from the model. If the chains converged correctly, the plots should look like ‘nice fat caterpillar’, without values diverging too much from the others.

# Transform model data into a long tibble
model1.df <- ggmcmc::ggs(model1)
## Registered S3 method overwritten by 'GGally':
##   method from   
##   +.gg   ggplot2
## Warning in custom.sort(D$Parameter): NAs introduced by coercion

## Warning in custom.sort(D$Parameter): NAs introduced by coercion
# Select predictors of interest
paramToPlot <- c(
  "sigma",
  "b_Intercept",
  "b_wmc",
  "b_typeL",
  "b_colourOr",
  "b_typeL:colourOr",
  "b_wmchigh:typeL",
  "b_wmchigh:colourOr",
  "b_wmchigh:typeL:colourOr"
)

# Plot
trace1 <- model1.df %>%
  dplyr::filter(Parameter %in% paramToPlot[1:5]) %>%
  ggplot(aes(
    x = Iteration,
    y = value,
    col = as.factor(Chain)
  )) +
  geom_line(alpha = .5) +
  geom_vline(xintercept = warmup) +
  scale_colour_ggthemr_d() +
  facet_grid(Parameter ~ .,
    scale  = "free_y",
    switch = "y"
  ) +
  labs(
    title = "Main Effects",
    colour = "Chains"
  ) +
  theme(strip.text.y.left = element_text(angle = 0))

trace2 <- model1.df %>%
  dplyr::filter(Parameter %in% paramToPlot[6:9]) %>%
  ggplot(aes(
    x = Iteration,
    y = value,
    col = as.factor(Chain)
  )) +
  geom_line(alpha = .5) +
  geom_vline(xintercept = warmup) +
  scale_colour_ggthemr_d() +
  facet_grid(Parameter ~ .,
    scale  = "free_y",
    switch = "y"
  ) +
  labs(
    title = "Interaction",
    colour = "Chains"
  ) +
  theme(strip.text.y.left = element_text(angle = 0))

trace1

trace2

The trace plots look as they should. The traces converge and overlap very early on. Notably, the brm function would have thrown a message if this was not the case. Thus, this is another layer of reassurance that there are no convergence problems.

Gelman-Rubin Diagnostics

The second step in this process is to verify the Gelman-Rubin Diagnostics (\(\hat{R}\)). This is another measure of convergences that considers the variance within chains and the variance between chains for each parameter. Convergence can be assumed when \(\hat{R}\) equals one, or it is very close to one.

model1.posterior <- as.mcmc(model1) # Extract info for diagnostic
## Warning: as.mcmc.brmsfit is deprecated and will eventually be removed.
# Gelman Diagnostic (Rhat)
coda::gelman.diag(model1.posterior[, 1:10])
## Potential scale reduction factors:
## 
##                          Point est. Upper C.I.
## b_Intercept                       1          1
## b_wmchigh                         1          1
## b_typeL                           1          1
## b_colourOr                        1          1
## b_wmchigh:typeL                   1          1
## b_wmchigh:colourOr                1          1
## b_typeL:colourOr                  1          1
## b_wmchigh:typeL:colourOr          1          1
## sd_id__Intercept                  1          1
## sigma                             1          1
## 
## Multivariate psrf
## 
## 1
coda::gelman.plot(model1.posterior[, 1:10])

All \(\hat{R}\) equals one, as can be seen from the table. Furthermore, the plots show how the \(\hat{R}\) for each predictor converges at 1 quite early.

There is a second diagnostic that can be checked: the Geweke diagnostic. The value of this diagnostic should not be over 1.96 for each predictor.

geweke.plot(model1.posterior[, 1:10])

Posterior distributions

# Posterior Distributions
mcmc_plot(model1, type = "hist")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

All the posterior distributions look smooth and single peaked, and it does not look like there is anything to worry about.

Autocorrelation

This is the final step, and it involves the consideration of the autocorrelation values for each regressor. For the model to converge correctly, it is ideal that the autocorrelation values decrease with each repetition of the MCMC because that would indicate that the samples are independent. If the autocorrelation values were high, it would be necessary to increase the number of permutations and the step size (Target-Average-Proposal-Acceptance-Probability) or reparametrize the data.

coda::autocorr.plot(model1.posterior[, 1:10])

The plots show that the autocorrelation values oscillates around zero after a few iterations, which provided further evidence for the correct convergence of the model.