Reproducibility Report: Group B Choice 1

For this exercise, please try to reproduce the results from Experiment 1 of the associated paper (Farooqui & Manly, 2015). The PDF of the paper is included in the same folder as this Rmd file.

Methods summary:

Participants (N=21) completed a series of trials on which they either switched between two tasks or repeated the previous task. In one task, signaled by a green box, they chose the larger of two values. In the other, signaled by a blue box, they chose the value printed in the larger font. A subliminal cue followed by a mask was presented before each trial. Cues were “O” (non-predictive cue), “M” (switch-predictive cue), and “T” (repeat-predictive cue). Reaction times and accuracy were measured.

Target outcomes:

Below is the specific result you will attempt to reproduce (quoted directly from the results section of Experiment 1):

Performance on switch trials, relative to repeat trials, incurred a switch cost that was evident in longer RTs (836 vs. 689 ms) and lower accuracy rates (79% vs. 92%). If participants were able to learn the predictive value of the cue that preceded only switch trials and could instantiate relevant anticipatory control in response to it, the performance on switch trials preceded by this cue would be better than on switch trials preceded by the nonpredictive cue. This was indeed the case (mean RT-predictive cue: 819 ms; nonpredictive cue: 871 ms; mean difference = 52 ms, 95% confidence interval, or CI = [19.5, 84.4]), two-tailed paired t(20) = 3.34, p < .01. However, error rates did not differ across these two groups of switch trials (predictive cue: 78.9%; nonpredictive cue: 78.8%), p = .8.

Step 1: Load packages

library(tidyverse) # for data munging
library(knitr) # for kable table formatting
library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files
library(readxl) # import excel files

# #optional packages:
# library(broom)

Step 2: Load data

# Read every participant's data (each stored in a separate xls file) and combine them into one data frame
# Only the first 250 rows of each xls are trial data; the remaining rows hold the authors' Excel calculations, which we don't want
files <- dir('data/Experiment 1')

data <- data.frame()
id <- 1
for (file in files){
  if(file != 'Codebook.xls'){
    temp_data <- read_xls(file.path('data/Experiment 1', file))
    temp_data$id <- id
    id <- id + 1
    temp_data <- temp_data[1:250, ]
    data <- rbind(data, temp_data)
  }
}

Step 3: Tidy data

Each row is an observation. The data is already in tidy format.

Step 4: Run analysis

Pre-processing

#Keep only the trial-level columns, plus the participant id added during loading (needed for any per-participant analysis)
d <- data |> 
  select(id, Block_Number, Event_Number, Prime, PrimeVisible, TaskType, TrialType, CorrResp, RT, RespCorr, lnum, rnum, lFont)

Descriptive statistics

Performance on switch trials, relative to repeat trials, incurred a switch cost that was evident in longer RTs (836 vs. 689 ms)

#There are 3 different trial types, but TrialType = 0 was probably a catch trial. 
d_rt <- d |> 
  group_by(TrialType) |> 
  summarize(
    meanRT = mean(RT), 
    numTrials = n()
  )
print(d_rt)

## # A tibble: 3 × 3
##   TrialType meanRT numTrials
##       <dbl>  <dbl>     <int>
## 1         0  2021.        42
## 2         1   731.      3987
## 3         2   896.      1221

Performance on switch trials, relative to repeat trials, incurred a switch cost that was evident in […] lower accuracy rates (79% vs. 92%)

#There are 3 different trial types, but TrialType = 0 was probably a catch trial. 
d_accuracy <- d |> 
  group_by(TrialType) |> 
  summarize(
    meanAccuracy = mean(RespCorr), 
    numTrials = n()
  )
print(d_accuracy)

## # A tibble: 3 × 3
##   TrialType meanAccuracy numTrials
##       <dbl>        <dbl>     <int>
## 1         0        0.833        42
## 2         1        0.914      3987
## 3         2        0.792      1221

Now you will analyze Predictive Switch Cues vs Non-predictive Switch Cues. Let’s start with reaction time.

This was indeed the case (mean RT-predictive cue: 819 ms; nonpredictive cue: 871 ms; … )

#Switch-predictive cue: M, repeat-predictive: T, non-predictive cue: O. Not sure what the number primes correspond to. 

#Get trials with predictive cues (switch-predictive or repeat-predictive)
predictive_cues <- d |> 
  filter(Prime == 'M' | Prime == 'T') 

#Get trials with non-predictive cues 
nonpredictive_cues <- d |> 
  filter(Prime == 'O')

cat('The mean RT of predictive cue trials =', mean(predictive_cues$RT))

## The mean RT of predictive cue trials = 794.3393

cat('\nThe mean RT of non-predictive cue trials =', mean(nonpredictive_cues$RT))

## 
## The mean RT of non-predictive cue trials = 811.8054

Next you will try to reproduce error rates for Switch Predictive Cues vs Switch Non-predictive Cues.

However, error rates did not differ across these two groups of switch trials (predictive cue: 78.9%; nonpredictive cue: 78.8%)

#Switch-predictive cue: M, repeat-predictive: T, non-predictive cue: O. Not sure what the number primes correspond to. 
cat('The mean accuracy of predictive cue trials =', mean(predictive_cues$RespCorr))

## The mean accuracy of predictive cue trials = 0.9043943

cat('\nThe mean accuracy of non-predictive cue trials =', mean(nonpredictive_cues$RespCorr))

## 
## The mean accuracy of non-predictive cue trials = 0.8869901
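The quoted result compares switch trials preceded by the switch-predictive cue (M) with switch trials preceded by the non-predictive cue (O). The filters above instead pool every trial type and combine M with T. The sketch below restricts the comparison to switch trials. It assumes that `TrialType == 2` codes switch trials, which the accuracy table suggests (0.792 vs. the reported 79%) but which no codebook entry confirms here. The `toy` data frame is made up for illustration.

```r
# Hypothetical trial-level data using the same column names as the report
toy <- data.frame(
  TrialType = c(1, 1, 2, 2, 2, 2, 0),
  Prime = c('T', 'O', 'M', 'M', 'O', 'O', 'O'),
  RT = c(700, 680, 800, 840, 860, 880, 2000),
  RespCorr = c(TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE)
)

# Keep only switch trials (assumed TrialType == 2) cued by M or O
switch_trials <- subset(toy, TrialType == 2 & Prime %in% c('M', 'O'))

# Mean RT and mean accuracy for each cue
by_cue <- aggregate(cbind(RT, RespCorr) ~ Prime, data = switch_trials, FUN = mean)
print(by_cue)
```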

Inferential statistics

The first claim is that, on switch trials, the switch-predictive cue leads to significantly faster reaction times than the non-predictive cue.

… the performance on switch trials preceded by this cue would be better than on switch trials preceded by the nonpredictive cue. This was indeed the case (mean RT-predictive cue: 819 ms; nonpredictive cue: 871 ms; mean difference = 52 ms, 95% confidence interval, or CI = [19.5, 84.4]), two-tailed paired t(20) = 3.34, p < .01.
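Before attempting the reproduction, it can help to check that the reported numbers are internally consistent. For a paired t-test, the 95% CI is the mean difference plus or minus the critical t value times the standard error, so the standard error and t statistic can be recovered from the reported interval. The sketch below uses only the values quoted above; the variable names are my own.

```r
# Values reported in the quoted passage
n_participants <- 21
ci_lower <- 19.5
ci_upper <- 84.4

df <- n_participants - 1                        # paired test: n - 1 degrees of freedom
t_crit <- qt(0.975, df)                         # two-tailed 95% critical value
mean_diff <- (ci_lower + ci_upper) / 2          # CI midpoint
se_diff <- (ci_upper - ci_lower) / 2 / t_crit   # CI half-width divided by critical t
t_implied <- mean_diff / se_diff
p_implied <- 2 * pt(-abs(t_implied), df)

cat('Implied mean difference =', mean_diff, '\n')
cat('Implied t(', df, ') =', t_implied, ' p =', p_implied, '\n')
```

The implied t statistic rounds to the reported 3.34 and the implied p value is below .01, so the reported mean difference, CI, and test statistic agree with one another.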

meanRT_predictive <- mean(predictive_cues$RT)
meanRT_nonpredictive <- mean(nonpredictive_cues$RT)
cat('The mean RT of predictive cue trials =', meanRT_predictive)

## The mean RT of predictive cue trials = 794.3393

cat('\nThe mean RT of non-predictive cue trials =', meanRT_nonpredictive)

## 
## The mean RT of non-predictive cue trials = 811.8054

#Get the difference between the mean RTs
cat('\nThe mean difference in RT =', abs(meanRT_predictive-meanRT_nonpredictive))

## 
## The mean difference in RT = 17.46604

#Run a two-tailed t-test. Note: without paired = TRUE, t.test() runs a Welch two-sample test on trial-level RTs, not the paired test on participant means reported in the paper
two_tailed_test_RT <- t.test(predictive_cues$RT, nonpredictive_cues$RT)
cat('\nWelch two-sample t =', two_tailed_test_RT$statistic, 'p =', two_tailed_test_RT$p.value)

## 
## Welch two-sample t = -1.369064 p = 0.1710677

cat('\nConfidence interval CI =', two_tailed_test_RT$conf.int)

## 
## Confidence interval CI = -42.47927 7.5472
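The paper reports a paired test with 20 degrees of freedom, which implies one value per condition for each of the 21 participants rather than one value per trial. A minimal sketch of that structure follows, using simulated data only (the `toy` data frame, its RT values, and the seed are all hypothetical, so the printed statistics say nothing about the real experiment). It assumes a participant identifier like the `id` column created in Step 2.

```r
# Simulated switch trials for 21 hypothetical participants, 20 trials per cue
set.seed(1)
toy <- expand.grid(id = 1:21, trial = 1:20, Prime = c('M', 'O'),
                   stringsAsFactors = FALSE)
toy$RT <- rnorm(nrow(toy), mean = ifelse(toy$Prime == 'M', 820, 870), sd = 100)

# One mean RT per participant and cue: rows are participants, columns are cues
participant_means <- tapply(toy$RT, list(toy$id, toy$Prime), mean)

# Paired t-test across participants (non-predictive minus switch-predictive)
paired_test <- t.test(participant_means[, 'O'], participant_means[, 'M'],
                      paired = TRUE)

cat('Paired t(', paired_test$parameter, ') =', paired_test$statistic,
    ' p =', paired_test$p.value, '\n')
cat('95% CI =', paired_test$conf.int, '\n')
```

Applied to the real data, the same steps would first need the switch-trial filter and a data frame that still carries the participant identifier.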

Next, test the second claim.

However, error rates did not differ across these two groups of switch trials (predictive cue: 78.9%; nonpredictive cue: 78.8%), p = .8.

meanAccuracy_predictive <- mean(predictive_cues$RespCorr)
meanAccuracy_nonpredictive <- mean(nonpredictive_cues$RespCorr)
cat('The mean accuracy of predictive cue trials =', meanAccuracy_predictive)

## The mean accuracy of predictive cue trials = 0.9043943

cat('\nThe mean accuracy of non-predictive cue trials =', meanAccuracy_nonpredictive)

## 
## The mean accuracy of non-predictive cue trials = 0.8869901

#Run a two-tailed t-test. Note: without paired = TRUE, t.test() runs a Welch two-sample test on trial-level accuracy, not a paired test on participant means
two_tailed_test_accuracy <- t.test(predictive_cues$RespCorr, nonpredictive_cues$RespCorr)
cat('\nWelch two-sample t =', two_tailed_test_accuracy$statistic, 'p =', two_tailed_test_accuracy$p.value)

## 
## Welch two-sample t = 1.685174 p = 0.09204434

Step 5: Reflection

Were you able to reproduce the results you attempted to reproduce? If not, what part(s) were you unable to reproduce?

The only result I was able to reproduce was the accuracy rates for switch trials relative to repeat trials (79% vs. 92%, after rounding). None of the other results matched the original authors’ values, despite more than 3 hours of work on the assignment. Some of the descriptive statistics were close, and I tried many different approaches to the data to get closer to the original answers, but I never obtained the same numbers.

How difficult was it to reproduce your results?

It was fairly difficult, given that there was no data dictionary to let me know what the different values meant for each column. By playing around with grouping and guessing, I made the following assumptions about the columns: Prime corresponded to the cue, with “O” (non-predictive cue), “M” (switch-predictive cue), and “T” (repeat-predictive cue). There were also rows with Prime values of 2, 4, and 8, but it wasn’t clear what those primes were used for. PrimeVisible == 1 corresponded to the prime being on the screen, but the value was 1 for all rows. TaskType was either 1 or 2, corresponding to the two different tasks (blue versus green rectangle conditions), and TrialType was 0, 1, or 2, corresponding to the type of cue, making it a non-predictive, switch-predictive or repeat-predictive trial. I wasn’t sure whether the correct response marking was in the CorrResp column or RespCorr column, but since RespCorr had TRUE or FALSE values, I figured that column had to be the one measuring accuracy.

What aspects made it difficult? What aspects made it easy?

The lack of a data dictionary made it hard to know whether I was actually grouping the right variables together to calculate the descriptive statistics. At times, it seemed like I got closer answers when I grouped by TaskType rather than by TrialType, but it didn’t make sense to analyze the data based on TaskType given that I was assuming that “task” referred to whether the task was the one with the blue rectangle or the one with the green rectangle as opposed to the cue manipulation. Additionally, the TrialType values and the Prime values lined up such that TrialType was likely denoting whether it was a repeat-predictive, switch-predictive, or non-predictive trial.