For this exercise, please try to reproduce the results from Experiment 1 of the associated paper (Farooqui & Manly, 2015). The PDF of the paper is included in the same folder as this Rmd file.

Methods summary:

Participants (N = 21) completed a series of trials that required them either to switch between two tasks or to repeat the same task. One task was to choose the larger of two values when they were surrounded by a green box. The other task was to choose the value in the larger font when they were surrounded by a blue box. Subliminal cues followed by a mask were presented before each trial. Cues included “O” (non-predictive cue), “M” (switch predictive cue), and “T” (repeat predictive cue). Reaction times and performance accuracy were measured.


Target outcomes:

Below is the specific result you will attempt to reproduce (quoted directly from the results section of Experiment 1):

Performance on switch trials, relative to repeat trials, incurred a switch cost that was evident in longer RTs (836 vs. 689 ms) and lower accuracy rates (79% vs. 92%). If participants were able to learn the predictive value of the cue that preceded only switch trials and could instantiate relevant anticipatory control in response to it, the performance on switch trials preceded by this cue would be better than on switch trials preceded by the nonpredictive cue. This was indeed the case (mean RT-predictive cue: 819 ms; nonpredictive cue: 871 ms; mean difference = 52 ms, 95% confidence interval, or CI = [19.5, 84.4]), two-tailed paired t(20) = 3.34, p < .01. However, error rates did not differ across these two groups of switch trials (predictive cue: 78.9%; nonpredictive cue: 78.8%), p = .8.
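
As a quick internal-consistency check on the target values, the reported confidence interval and degrees of freedom imply a standard error and a t statistic that can be compared with the reported t(20) = 3.34. The sketch below uses only the numbers quoted above; the variable names are illustrative and not part of the original analysis.

```r
# Reported values from the quoted result (Experiment 1, switch trials)
ci_lower <- 19.5
ci_upper <- 84.4
df       <- 20          # paired t-test with N = 21 participants

# A t-based CI is symmetric around the mean difference
mean_diff  <- (ci_lower + ci_upper) / 2
half_width <- (ci_upper - ci_lower) / 2

# half_width = t_crit * SE, so the standard error can be recovered
t_crit <- qt(0.975, df)
se     <- half_width / t_crit

# Implied test statistic and two-tailed p value
t_implied <- mean_diff / se
p_implied <- 2 * pt(-abs(t_implied), df)

round(c(mean_diff = mean_diff, se = se, t = t_implied, p = p_implied), 3)
```

The implied t statistic comes out at approximately 3.34 with p below .01, so the reported mean difference, interval, and test statistic are mutually consistent.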


Step 1: Load packages

library(tidyverse) # for data munging
library(knitr) # for kable table formatting
library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files
library(readxl) # import excel files

# #optional packages:
# library(broom)

Step 2: Load data

# This reads in all the participants' data (each is in a separate xls file) and combines them into one data frame
# Each xls has 250 rows of trial data; the rest is the authors' calculations in Excel, which we don't want in the data
files <- dir('data/Experiment 1')

data <- data.frame()
id <- 1
for (file in files){
  if(file != 'Codebook.xls'){
    temp_data <- read_xls(file.path('data/Experiment 1', file))
    temp_data$id <- id
    id <- id + 1
    temp_data <- temp_data[1:250, ]
    data <- rbind(data, temp_data)
  }
}

Step 3: Tidy data

Each row is an observation. The data is already in tidy format.

Step 4: Run analysis

Pre-processing

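data_clean <- data %>%
  # keep trials with RTs between 200 and 2000 ms
  filter(RT >= 200, RT <= 2000) %>%
  # keep the two trial types coded 1 and 2
  filter(TrialType %in% c(1, 2)) %>%
  mutate(
    trial_type = ifelse(TrialType == 1, "repeat", "switch"),
    accuracy   = ifelse(RespCorr == 1, 1, 0),  # 1 = correct, 0 = incorrect
    rt         = RT
  )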
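
Because the paired comparisons further down need every participant to contribute trials to both conditions, it can help to check how many participants remain usable after filtering. The following is a minimal sketch on made-up data; the toy data frame and the helper name `count_complete_ids` are hypothetical and not part of the original analysis.

```r
# Hypothetical helper: number of participants who still have at least one
# row in every level of `condition`.
count_complete_ids <- function(df, condition) {
  pairs    <- unique(df[, c("id", condition)])   # one row per id x condition
  n_levels <- length(unique(df[[condition]]))    # number of conditions present
  sum(table(pairs$id) == n_levels)               # ids that appear in all of them
}

# Toy data (illustration only): participant 3 has no "switch" trials left
toy <- data.frame(
  id         = c(1, 1, 2, 2, 3),
  trial_type = c("repeat", "switch", "repeat", "switch", "repeat")
)

count_complete_ids(toy, "trial_type")
```

Applied to the filtered switch trials with the cue type as the condition, a check like this would show how many of the 21 participants can enter a paired test.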

Descriptive statistics

Performance on switch trials, relative to repeat trials, incurred a switch cost that was evident in longer RTs (836 vs. 689 ms)

# reproduce the above results here

rt_switch_repeat <- data_clean %>%
  group_by(trial_type) %>%
  summarise(
    mean_rt = mean(rt),
    sd_rt = sd(rt),
    n = n()
  )

knitr::kable(rt_switch_repeat, caption = "RT for Switch vs Repeat Trials (Trimmed to 200-2000 ms)")

RT for Switch vs Repeat Trials (Trimmed to 200-2000 ms)

| trial_type | mean_rt | sd_rt | n |
|:---|---:|---:|---:|
| repeat | 717.5715 | 282.8287 | 3959 |
| switch | 856.1699 | 290.4947 | 1194 |

Performance on switch trials, relative to repeat trials, incurred a switch cost that was evident in […] lower accuracy rates (79% vs. 92%)

# reproduce the above results here

subject_accuracy <- data_clean %>%
  group_by(id, trial_type) %>%
  summarise(
    accuracy_subject = mean(accuracy),
    .groups = "drop"
  )

accuracy_summary <- subject_accuracy %>%
  group_by(trial_type) %>%
  summarise(
    mean_accuracy = mean(accuracy_subject),
    sd_accuracy = sd(accuracy_subject),
    n = n()
  )

knitr::kable(
  accuracy_summary,
  caption = "Subject-level Accuracy for Repeat vs Switch Trials"
)

Subject-level Accuracy for Repeat vs Switch Trials

| trial_type | mean_accuracy | sd_accuracy | n |
|:---|---:|---:|---:|
| repeat | 0.9142974 | 0.0618798 | 21 |
| switch | 0.7869736 | 0.1269567 | 21 |

Now you will analyze Predictive Switch Cues vs Non-predictive Switch Cues. Let’s start with reaction time.

This was indeed the case (mean RT-predictive cue: 819 ms; nonpredictive cue: 871 ms; … )

# reproduce the above results here

data_switch <- data_clean %>%
  filter(trial_type == "switch") %>%
  mutate(
    # cue coding follows the Methods summary: "M" = switch-predictive, "O" = nonpredictive
    cue_type = case_when(
      Prime == "M" ~ "predictive",
      Prime == "O" ~ "nonpredictive",
      TRUE ~ NA_character_
    )
  ) %>%
  filter(!is.na(cue_type))

rt_by_subject <- data_switch %>%
  group_by(id, cue_type) %>%
  summarise(
    mean_rt_subject = mean(rt),
    .groups = "drop"
  )

rt_summary <- rt_by_subject %>%
  group_by(cue_type) %>%
  summarise(
    mean_of_subject_means = mean(mean_rt_subject),
    sd_of_subject_means   = sd(mean_rt_subject),
    n = n()
  )

knitr::kable(rt_summary, caption = "Subject-level RT for Predictive vs Nonpredictive Switch Cues")

Subject-level RT for Predictive vs Nonpredictive Switch Cues

| cue_type | mean_of_subject_means | sd_of_subject_means | n |
|:---|---:|---:|---:|
| nonpredictive | 901.9266 | 151.7379 | 14 |
| predictive | 877.3392 | 145.6130 | 14 |

Next you will try to reproduce error rates for Switch Predictive Cues vs Switch Non-predictive Cues.

However, error rates did not differ across these two groups of switch trials (predictive cue: 78.9%; nonpredictive cue: 78.8%)

# reproduce the above results here

data_switch_acc <- data_clean %>%
  filter(trial_type == "switch") %>%
  mutate(
    # cue coding follows the Methods summary: "M" = switch-predictive, "O" = nonpredictive
    cue_type = case_when(
      Prime == "M" ~ "predictive",
      Prime == "O" ~ "nonpredictive",
      TRUE ~ NA_character_
    )
  ) %>%
  filter(!is.na(cue_type))

acc_by_subject <- data_switch_acc %>%
  group_by(id, cue_type) %>%
  summarise(
    accuracy_subject = mean(accuracy),   # accuracy = 1/0 already defined
    .groups = "drop"
  )

acc_summary <- acc_by_subject %>%
  group_by(cue_type) %>%
  summarise(
    mean_of_subject_accuracy = mean(accuracy_subject),
    sd_of_subject_accuracy   = sd(accuracy_subject),
    n = n()
  )

knitr::kable(acc_summary, caption = "Subject-level Accuracy for Predictive vs Nonpredictive Switch Cues")

Subject-level Accuracy for Predictive vs Nonpredictive Switch Cues

| cue_type | mean_of_subject_accuracy | sd_of_subject_accuracy | n |
|:---|---:|---:|---:|
| nonpredictive | 0.7999137 | 0.1336969 | 14 |
| predictive | 0.8298556 | 0.1067304 | 14 |

Inferential statistics

The first claim is that in switch trials, predictive cues lead to significantly faster reaction times than nonpredictive cues.

… the performance on switch trials preceded by this cue would be better than on switch trials preceded by the nonpredictive cue. This was indeed the case (mean RT-predictive cue: 819 ms; nonpredictive cue: 871 ms; mean difference = 52 ms, 95% confidence interval, or CI = [19.5, 84.4]), two-tailed paired t(20) = 3.34, p < .01.

# reproduce the above results here

# subject-level mean RT per cue type (switch trials only)
rt_by_subject <- data_switch_acc %>%
  group_by(id, cue_type) %>%
  summarise(
    mean_rt_subject = mean(rt),
    .groups = "drop"
  )

# one row per participant, one column per cue type
rt_wide <- rt_by_subject %>%
  select(id, cue_type, mean_rt_subject) %>%
  pivot_wider(
    names_from  = cue_type,
    values_from = mean_rt_subject
  )


# Paired t-test on subject-level mean RTs (predictive minus nonpredictive)
t_test_rt <- t.test(
  rt_wide$predictive,
  rt_wide$nonpredictive,
  paired = TRUE
)

t_test_rt
## 
##  Paired t-test
## 
## data:  rt_wide$predictive and rt_wide$nonpredictive
## t = -1.1709, df = 13, p-value = 0.2626
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -69.95104  20.77619
## sample estimates:
## mean difference 
##       -24.58743

Next, test the second claim.

However, error rates did not differ across these two groups of switch trials (predictive cue: 78.9%; nonpredictive cue: 78.8%), p = .8.

# reproduce the above results here

# subject-level accuracy per cue type (switch trials only)
acc_by_subject <- data_switch_acc %>%
  group_by(id, cue_type) %>%
  summarise(
    accuracy_subject = mean(accuracy),
    .groups = "drop"
  )

# one row per participant, one column per cue type
acc_wide <- acc_by_subject %>%
  select(id, cue_type, accuracy_subject) %>%
  pivot_wider(
    names_from  = cue_type,
    values_from = accuracy_subject
  )

# Paired t-test on subject-level accuracy (predictive minus nonpredictive)
t_test_acc <- t.test(
  acc_wide$predictive,
  acc_wide$nonpredictive,
  paired = TRUE
)

t_test_acc
## 
##  Paired t-test
## 
## data:  acc_wide$predictive and acc_wide$nonpredictive
## t = 1.7826, df = 13, p-value = 0.09801
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -0.006345812  0.066229671
## sample estimates:
## mean difference 
##      0.02994193
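
For reference, a paired t-test is a one-sample t-test on the within-participant differences, which is how the mean difference, confidence interval, and t statistic in the output are related. A minimal sketch on made-up numbers (the two vectors are illustrative only, not the experiment's data):

```r
# Made-up subject-level means for two conditions (illustration only)
predictive    <- c(810, 790, 905, 760, 880, 845)
nonpredictive <- c(850, 800, 930, 790, 870, 900)

d  <- predictive - nonpredictive      # within-participant differences
n  <- length(d)
se <- sd(d) / sqrt(n)                 # standard error of the mean difference

# t statistic and 95% CI computed by hand
t_manual  <- mean(d) / se
ci_manual <- mean(d) + c(-1, 1) * qt(0.975, df = n - 1) * se

# t.test() with paired = TRUE should give the same values
res <- t.test(predictive, nonpredictive, paired = TRUE)
all.equal(unname(res$statistic), t_manual)
all.equal(as.numeric(res$conf.int), ci_manual)
```

The sign of the difference follows the argument order (first vector minus second), so swapping the arguments flips the sign of t, the mean difference, and the interval bounds while leaving the p value unchanged.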

Step 5: Reflection

Were you able to reproduce the results you attempted to reproduce? If not, what part(s) were you unable to reproduce?

Unfortunately, I was not able to fully reproduce the results. I could not reproduce the RT values (836 vs. 689 ms), the predictive vs. nonpredictive cue RTs (819 vs. 871 ms), the accuracy rates (79% vs. 92%), or the reported t-tests and confidence intervals. The results from my analysis did not match those reported in the paper, even though I tried it in 20 different ways. I gave up after the 3-hour time limit.

How difficult was it to reproduce your results?

It was unexpectedly difficult. Even after implementing correct preprocessing steps and running subject-level summaries, the results differed quite a lot from what’s reported in the paper.

What aspects made it difficult? What aspects made it easy?

Difficult: The dataset contained no subliminal trials, so the key manipulation was missing. Filtering also removed many participants, making estimates unstable. Easy: Data wrangling (filtering, grouping, summarizing) once the structure was clear.