For this exercise, please try to reproduce the results from Experiment 1 of the associated paper (Farooqui & Manly, 2015). The PDF of the paper is included in the same folder as this Rmd file.

Methods summary:

Participants (N=21) completed a series of trials in which they either switched between two tasks or repeated the same task. In one task, signaled by a green box, they chose the larger of two values. In the other task, signaled by a blue box, they chose the value printed in the larger font. A subliminal cue followed by a mask was presented before each trial. The cues were “O” (nonpredictive cue), “M” (switch-predictive cue), and “T” (repeat-predictive cue). Reaction times and accuracy were measured.

Target outcomes:

Below is the specific result you will attempt to reproduce (quoted directly from the results section of Experiment 1):

Performance on switch trials, relative to repeat trials, incurred a switch cost that was evident in longer RTs (836 vs. 689 ms) and lower accuracy rates (79% vs. 92%). If participants were able to learn the predictive value of the cue that preceded only switch trials and could instantiate relevant anticipatory control in response to it, the performance on switch trials preceded by this cue would be better than on switch trials preceded by the nonpredictive cue. This was indeed the case (mean RT-predictive cue: 819 ms; nonpredictive cue: 871 ms; mean difference = 52 ms, 95% confidence interval, or CI = [19.5, 84.4]), two-tailed paired t(20) = 3.34, p < .01. However, error rates did not differ across these two groups of switch trials (predictive cue: 78.9%; nonpredictive cue: 78.8%), p = .8.

Step 1: Load packages

library(tidyverse) # for data munging
library(knitr) # for kable table formatting
library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files
library(readxl) # import excel files

# #optional packages:
# library(broom)

Step 2: Load data

# Read every participant's data (each in a separate xls file) and combine them into one data frame
# Only the first 250 rows of each xls are trial data; the rest are the authors' Excel calculations, which we don't want
files <- dir('data/Experiment 1')

data <- data.frame()
id <- 1
for (file in files){
  if(file != 'Codebook.xls'){
    temp_data <- read_xls(file.path('data/Experiment 1', file))
    temp_data$id <- id
    id <- id + 1
    temp_data <- temp_data[1:250, ]
    data <- rbind(data, temp_data)
  }
}
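
A more compact way to combine the files is to prepare each one in a helper and bind the list in one step. The sketch below is only an illustration: it uses two small in-memory tibbles with hypothetical values in place of the real xls files, and the helper name `prepare_file` is made up. It also assumes that `Prime` may arrive as text in some files and as numbers in others, so it converts the column to character before binding (`bind_rows()` refuses to combine a character column with a numeric one).

```r
library(dplyr)

# Toy stand-ins for two participant files (hypothetical values).
# One stores Prime as letters, the other as numbers.
toy_files <- list(
  p01 = tibble(RT = c(600, 650, 700), Prime = c("M", "O", "T")),
  p02 = tibble(RT = c(800, 820, 790), Prime = c(2, 4, 8))
)

# Hypothetical helper: keep the trial rows and harmonise column types
prepare_file <- function(df) {
  df %>%
    slice_head(n = 250) %>%               # at most the first 250 rows
    mutate(Prime = as.character(Prime))   # same type in every file
}

combined <- lapply(toy_files, prepare_file) %>%
  bind_rows(.id = "file")                 # "file" records the list element name

# Which Prime labels occur in which file?
count(combined, file, Prime)
```

With the real data, the list would come from applying `read_xls()` to each file path instead of `toy_files`.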

Step 3: Tidy data

Each row is an observation. The data is already in tidy format.

Step 4: Run analysis

Pre-processing

data_clean <- data

Descriptive statistics

Performance on switch trials, relative to repeat trials, incurred a switch cost that was evident in longer RTs (836 vs. 689 ms)

participant_means <- data_clean %>%
  group_by(id, TaskType) %>%
  summarise(
    mean_rt = mean(RT, na.rm = TRUE),
    .groups = "drop"
  )

p_summary <- participant_means %>%
  group_by(TaskType) %>%
  summarise(
    mean_rt = mean(mean_rt),   # paper reports 836 ms (switch) vs. 689 ms (repeat)
    n       = n(),             # should be 21
    .groups = "drop"
  )

p_summary

## # A tibble: 2 × 3
##   TaskType mean_rt     n
##      <dbl>   <dbl> <int>
## 1        1    856.    21
## 2        2    705.    21
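
The chunk above groups by TaskType, whereas the accuracy chunk that follows derives switch and repeat from TrialType. If TrialType is the column that codes switch versus repeat (an assumption taken from that later chunk, not confirmed against the codebook), the RT comparison could be set up as in this sketch. The data are toy values, so the numbers say nothing about the real result.

```r
library(dplyr)

# Toy trial-level data (hypothetical values) for two participants.
# Assumption: TrialType 1 = repeat, 2 = switch.
toy <- tibble(
  id        = c(1, 1, 1, 1, 2, 2, 2, 2),
  TrialType = c(1, 1, 2, 2, 1, 1, 2, 2),
  RT        = c(600, 700, 800, 900, 500, 700, 900, 1100)
)

rt_by_trial_type <- toy %>%
  mutate(trial_type = if_else(TrialType == 1, "repeat", "switch")) %>%
  group_by(id, trial_type) %>%
  summarise(mean_rt = mean(RT, na.rm = TRUE), .groups = "drop") %>%   # stage 1: per participant
  group_by(trial_type) %>%
  summarise(grand_mean_rt = mean(mean_rt), n = n(), .groups = "drop") # stage 2: across participants

rt_by_trial_type
```

Averaging within participants first gives every participant equal weight, regardless of how many valid trials each one contributed.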

Performance on switch trials, relative to repeat trials, incurred a switch cost that was evident in […] lower accuracy rates (79% vs. 92%)

# reproduce the above results here
acc_participants <- data %>%
  filter(!is.na(RespCorr)) %>%
  mutate(
    trial_type = ifelse(TrialType == 1, "repeat", "switch"),
    accuracy   = RespCorr == TRUE     # logical; mean() counts TRUE as 1 and FALSE as 0
  ) %>%
  group_by(id, trial_type) %>%
  summarise(
    mean_acc = mean(accuracy),  # accuracy per participant per condition
    .groups = "drop"
  )

acc_summary <- acc_participants %>%
  group_by(trial_type) %>%
  summarise(
    mean_accuracy = mean(mean_acc),     # ≈ .92 for repeat, .79 for switch
    n             = n()       ,          # should be 21
    .groups = "drop"
  )

acc_summary

## # A tibble: 2 × 3
##   trial_type mean_accuracy     n
##   <chr>              <dbl> <int>
## 1 repeat             0.914    21
## 2 switch             0.791    21

Now you will analyze Predictive Switch Cues vs Non-predictive Switch Cues. Let’s start with reaction time.

This was indeed the case (mean RT-predictive cue: 819 ms; nonpredictive cue: 871 ms; … )

# reproduce the above results here

data_clean <- data_clean %>%
  mutate(
    cue_type = case_when(
      Prime == "M" ~ "predictive",
      Prime == "T" ~ "predictive",
      Prime == "O" ~ "nonpredictive",
      TRUE ~ NA_character_
    )
  )

cues_participants <- data_clean %>%
  filter(!is.na(Prime)) %>%
  group_by(id, cue_type) %>%
  summarise(
    mean_cues = mean(RT),  # RT per participant per condition
    .groups = "drop"
  )

cues_participants

## # A tibble: 36 × 3
##       id cue_type      mean_cues
##    <dbl> <chr>             <dbl>
##  1     1 nonpredictive      605.
##  2     1 predictive         620.
##  3     2 nonpredictive      701.
##  4     2 predictive         722.
##  5     3 nonpredictive     1003.
##  6     3 predictive         882.
##  7     4 nonpredictive      747.
##  8     4 predictive         742.
##  9     5 <NA>               886.
## 10     6 nonpredictive      808.
## # ℹ 26 more rows

c_summary <- cues_participants %>%
  group_by(cue_type) %>%
  summarise(
    mean_prime = mean(mean_cues), 
    n             = n()            ,     # should be 21
    .groups = "drop"
  )

c_summary

## # A tibble: 3 × 3
##   cue_type      mean_prime     n
##   <chr>              <dbl> <int>
## 1 nonpredictive       812.    14
## 2 predictive          794.    14
## 3 <NA>                759.     8
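
The summary above pools "M" and "T" as predictive and includes both switch and repeat trials, while the paper's comparison is restricted to switch trials preceded by "M" versus "O". A minimal sketch of that restriction follows. It uses toy values and assumes that TrialType 2 marks switch trials and that Prime holds letter codes, neither of which has been verified against the real files.

```r
library(dplyr)

# Toy trial-level data (hypothetical values) for two participants.
toy <- tibble(
  id        = rep(c(1, 2), each = 6),
  TrialType = rep(c(2, 2, 2, 2, 1, 1), times = 2),   # assumed: 1 = repeat, 2 = switch
  Prime     = rep(c("M", "M", "O", "O", "T", "O"), times = 2),
  RT        = c(800, 820, 860, 880, 650, 700,
                900, 940, 980, 1000, 700, 720)
)

switch_cue_rt <- toy %>%
  filter(TrialType == 2, Prime %in% c("M", "O")) %>%   # switch trials, M or O cue only
  mutate(cue_type = if_else(Prime == "M", "predictive", "nonpredictive")) %>%
  group_by(id, cue_type) %>%
  summarise(mean_rt = mean(RT, na.rm = TRUE), .groups = "drop")

switch_cue_rt %>%
  group_by(cue_type) %>%
  summarise(grand_mean_rt = mean(mean_rt), n = n(), .groups = "drop")
```

Filtering before grouping keeps repeat trials and "T" cues out of both cells, so no NA cue category appears in the summary.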

Next you will try to reproduce error rates for Switch Predictive Cues vs Switch Non-predictive Cues.

However, error rates did not differ across these two groups of switch trials (predictive cue: 78.9%; nonpredictive cue: 78.8%)

# reproduce the above results here
prime_accuracy <- data_clean %>%
  filter(!is.na(RespCorr)) %>%          # keep only trials with a correctness value
  mutate(
    accuracy = RespCorr == TRUE         # TRUE/FALSE → 1/0
  ) %>%
  group_by(id, cue_type) %>%               # accuracy per participant per cue type
  summarise(
    mean_acc = mean(accuracy),
    error_rate = 1 - accuracy,
    .groups = "drop"
  )

p_err <- prime_accuracy %>%
  group_by(cue_type) %>%
  summarise(
    error_rate = mean(error_rate),    
    n             = n(),                 # should be 21
    .groups = "drop"
  )

p_err

## # A tibble: 3 × 3
##   cue_type      error_rate     n
##   <chr>              <dbl> <int>
## 1 nonpredictive     0.113   1814
## 2 predictive        0.0956  1684
## 3 <NA>              0.136   1752
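
The n values in the thousands arise because `1 - accuracy` inside `summarise()` refers to the trial-level column, so it returns one value per trial rather than one per group. A sketch of a per-participant version, using toy values, computes the error rate from the summary value instead:

```r
library(dplyr)

# Toy trial-level data (hypothetical values) for two participants.
toy <- tibble(
  id       = c(1, 1, 1, 1, 2, 2, 2, 2),
  cue_type = rep(c("predictive", "predictive",
                   "nonpredictive", "nonpredictive"), times = 2),
  RespCorr = c(TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE, FALSE)
)

per_participant <- toy %>%
  group_by(id, cue_type) %>%
  summarise(
    mean_acc   = mean(RespCorr),   # proportion correct in this cell
    error_rate = 1 - mean_acc,     # built from the summary, so one value per group
    .groups = "drop"
  )

per_cue <- per_participant %>%
  group_by(cue_type) %>%
  summarise(error_rate = mean(error_rate), n = n(), .groups = "drop")

per_cue
```

Here n counts participants per cue type, which is the quantity the "should be 21" comment refers to.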

Inferential statistics

The first claim is that, on switch trials, predictive cues lead to significantly faster reaction times than nonpredictive cues.

… the performance on switch trials preceded by this cue would be better than on switch trials preceded by the nonpredictive cue. This was indeed the case (mean RT-predictive cue: 819 ms; nonpredictive cue: 871 ms; mean difference = 52 ms, 95% confidence interval, or CI = [19.5, 84.4]), two-tailed paired t(20) = 3.34, p < .01.
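
Before working with the data, it can help to check that the reported statistics are consistent with one another. For a paired t-test, the confidence interval is the mean difference plus or minus the critical t value times the standard error, so the standard error and the t statistic can be recovered from the interval alone. The sketch below uses only the numbers quoted above and assumes a standard two-sided 95% interval.

```r
# Values quoted from the paper's Experiment 1 results
ci_lower <- 19.5
ci_upper <- 84.4
df       <- 20                              # 21 participants, paired design

mean_diff  <- (ci_lower + ci_upper) / 2     # the CI is symmetric around the mean difference
half_width <- (ci_upper - ci_lower) / 2
se_diff    <- half_width / qt(0.975, df)    # half-width = critical t * SE
t_stat     <- mean_diff / se_diff
p_value    <- 2 * pt(-abs(t_stat), df)      # two-tailed p value

round(c(mean_diff = mean_diff, se = se_diff, t = t_stat, p = p_value), 3)
```

The recovered t is about 3.34 with p below .01, which matches the reported test, so a successful reproduction should land near a 52 ms difference with 20 degrees of freedom.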

# reproduce the above results here
data_clean <- data_clean %>%
  mutate(
    cue_type = case_when(
      Prime == "M" ~ "predictive",
      Prime == "O" ~ "nonpredictive",
      TRUE ~ NA_character_
    )
  )

switch_means <- data_clean %>%
  group_by(id, cue_type) %>%
  summarise(
    mean_rt = mean(RT, na.rm = TRUE),
    .groups = "drop"
  )

switch_wide <- switch_means %>%
  tidyr::pivot_wider(
    names_from = cue_type,
    values_from = mean_rt
  )

t_test_result <- t.test(
  switch_wide$predictive,
  switch_wide$nonpredictive,
  paired = TRUE
)

t_test_result

## 
##  Paired t-test
## 
## data:  switch_wide$predictive and switch_wide$nonpredictive
## t = 4.0129, df = 13, p-value = 0.001476
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##   56.71892 189.00624
## sample estimates:
## mean difference 
##        122.8626
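
A paired t-test only uses participants who have a value in both conditions, so df = 13 means that 14 participants contributed a complete pair rather than 21. The sketch below, on toy participant means, shows one way to count complete pairs before testing. It passes the nonpredictive column first so that the sign of the difference follows the paper (nonpredictive minus predictive), whereas the chunk above passes the predictive column first.

```r
library(dplyr)
library(tidyr)

# Toy participant-level means (hypothetical values).
# Participant 4 has no nonpredictive value.
toy_means <- tibble(
  id       = c(1, 1, 2, 2, 3, 3, 4),
  cue_type = c("predictive", "nonpredictive",
               "predictive", "nonpredictive",
               "predictive", "nonpredictive",
               "predictive"),
  mean_rt  = c(810, 870, 920, 990, 700, 730, 760)
)

wide <- toy_means %>%
  pivot_wider(names_from = cue_type, values_from = mean_rt)

complete_pairs <- wide %>%
  filter(!is.na(predictive), !is.na(nonpredictive))

n_pairs <- nrow(complete_pairs)   # the test's df will be n_pairs - 1

res <- t.test(complete_pairs$nonpredictive,
              complete_pairs$predictive,
              paired = TRUE)

res$parameter   # degrees of freedom
res$estimate    # mean of (nonpredictive - predictive)
```

If `n_pairs` is below 21 on the real data, the missing participants are the ones whose cue labels did not match the recoding.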

Next, test the second claim.

However, error rates did not differ across these two groups of switch trials (predictive cue: 78.9%; nonpredictive cue: 78.8%), p = .8.

# reproduce the above results here
data_clean <- data_clean %>%
  mutate(
    trial_type = ifelse(TrialType == 1, "repeat", "switch"),
    accuracy   = RespCorr == TRUE,    # TRUE/FALSE → 1/0
    cue_type = case_when(
      Prime == "M" ~ "predictive",
      Prime == "O" ~ "nonpredictive",
      TRUE ~ NA_character_
    )
  )

# compute accuracy for switch trials only (the RT test above did not apply this filter)
switch_acc <- data_clean %>%
  filter(trial_type == "switch") %>%
  group_by(id, cue_type) %>%
  summarise(
    mean_acc = mean(accuracy, na.rm = TRUE),   # proportion correct
    .groups = "drop"
  )

# wide format for paired t-test
switch_acc_wide <- switch_acc %>%
  tidyr::pivot_wider(
    names_from = cue_type,
    values_from = mean_acc
  )

# paired t-test: predictive vs nonpredictive accuracy
t_test_acc <- t.test(
  switch_acc_wide$predictive,
  switch_acc_wide$nonpredictive,
  paired = TRUE
)

t_test_acc

## 
##  Paired t-test
## 
## data:  switch_acc_wide$predictive and switch_acc_wide$nonpredictive
## t = 1.4425, df = 13, p-value = 0.1728
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -0.01282885  0.06438656
## sample estimates:
## mean difference 
##      0.02577885

Step 5: Reflection

Were you able to reproduce the results you attempted to reproduce? If not, what part(s) were you unable to reproduce?

I could not reproduce the RT switch cost (836 vs. 689 ms; I obtained 856 vs. 705 ms), but I could reproduce the accuracy switch cost (79% vs. 92%; I obtained 79.1% vs. 91.4%). I could not reproduce the results split by prime because of the issues described in the third part of this reflection.

How difficult was it to reproduce your results?

It was quite difficult, and I hit the three-hour mark. I couldn’t tell whether I was doing something wrong in the first reproduction question under descriptive statistics, but after many attempts I can only conclude that the research itself is wrong. Also, see my answer to the third part of this reflection: I do not understand why the dataset labels primes as 2, 4, and 8 after first labeling them as M, O, and T, and this made the results hard to reproduce.

What aspects made it difficult? What aspects made it easy?

In the later data files, the primes seemed to switch from being labeled by letter to being labeled by number, which threw the combined dataset off. From visual inspection of the data files, I could not tell whether a number-to-letter mapping existed.
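
One way to look for a mapping more systematically is to tabulate, per participant, how often each Prime label occurs on switch and on repeat trials. According to the methods summary, "M" precedes only switch trials and "T" predicts repeats, so a numeric label that never appears on repeat trials would behave like "M". The sketch below runs on toy values. The pattern it contains (for example, "2" occurring only on switch trials) is invented for illustration and is not a claim about the real files, and it assumes TrialType 1 = repeat and 2 = switch.

```r
library(dplyr)
library(tidyr)

# Toy trial-level data (hypothetical values).
# Participant 1 uses letter labels, participant 2 uses number labels.
toy <- tibble(
  id        = c(1, 1, 1, 1, 2, 2, 2, 2),
  Prime     = c("M", "O", "O", "T", "2", "4", "4", "8"),
  TrialType = c(2, 2, 1, 1, 2, 2, 1, 1)   # assumed: 1 = repeat, 2 = switch
)

label_table <- toy %>%
  mutate(trial_type = if_else(TrialType == 1, "repeat", "switch")) %>%
  count(id, Prime, trial_type) %>%
  pivot_wider(names_from = trial_type, values_from = n, values_fill = 0)

label_table
```

A table like this would only suggest a mapping. Confirming it would still require the codebook or the original authors.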

Reproducibility Report: Group B Choice 1