For this exercise, please try to reproduce the results from Experiment 1 of the associated paper (Farooqui & Manly, 2015). The PDF of the paper is included in the same folder as this Rmd file.
Participants (N=21) completed a series of trials that required them either to switch from one task to the other or to stay with the same task. One task was to choose the larger of two values when they were surrounded by a green box. The other task was to choose the value with the larger font when they were surrounded by a blue box. Subliminal cues followed by a mask were presented before each trial. Cues included “O” (non-predictive cue), “M” (switch predictive cue), and “T” (repeat predictive cue). Reaction times and performance accuracy were measured.
Below is the specific result you will attempt to reproduce (quoted directly from the results section of Experiment 1):
Performance on switch trials, relative to repeat trials, incurred a switch cost that was evident in longer RTs (836 vs. 689 ms) and lower accuracy rates (79% vs. 92%). If participants were able to learn the predictive value of the cue that preceded only switch trials and could instantiate relevant anticipatory control in response to it, the performance on switch trials preceded by this cue would be better than on switch trials preceded by the nonpredictive cue. This was indeed the case (mean RT-predictive cue: 819 ms; nonpredictive cue: 871 ms; mean difference = 52 ms, 95% confidence interval, or CI = [19.5, 84.4]), two-tailed paired t(20) = 3.34, p < .01. However, error rates did not differ across these two groups of switch trials (predictive cue: 78.9%; nonpredictive cue: 78.8%), p = .8.
library(tidyverse) # for data munging
library(knitr) # for kable table formatting
library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files
library(readxl) # import excel files
# #optional packages:
# library(broom)
# This reads in all the participants' data (each participant is in a separate xls file) and combines them into one data frame
# Each xls has 250 rows of trial data; the rest is the authors' Excel calculations, which we don't want in the data
files <- dir('data/Experiment 1')
data <- data.frame()
id <- 1
for (file in files){
  if(file != 'Codebook.xls'){
    temp_data <- read_xls(file.path('data/Experiment 1', file))
    temp_data$id <- id
    id <- id + 1
    temp_data <- temp_data[1:250, ]
    data <- rbind(data, temp_data)
  }
}
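The loop above grows `data` with `rbind()` on every pass and relies on `Codebook.xls` being the only non-participant file in the folder. The same pattern can be sketched without growing a data frame, as below. `read_participants()` and its `reader`, `skip`, and `n_rows` arguments are hypothetical names introduced here for illustration, and the demonstration writes made-up CSV files to a temporary folder so that it runs without the original data. Unlike `temp_data[1:250, ]`, the sketch does not pad short files with `NA` rows.

```r
# Hypothetical helper (not part of the original analysis): read every file in
# `dir` except `skip`, keep the first `n_rows` rows of each, tag each file
# with a running participant id, and stack the pieces once at the end.
read_participants <- function(dir, reader, skip = "Codebook.xls", n_rows = 250) {
  files <- setdiff(list.files(dir), skip)
  pieces <- lapply(seq_along(files), function(i) {
    d <- as.data.frame(reader(file.path(dir, files[i])))
    d <- d[seq_len(min(n_rows, nrow(d))), , drop = FALSE]
    d$id <- i
    d
  })
  do.call(rbind, pieces)
}

# Demonstration on made-up CSV files (three fake participants, five rows each)
demo_dir <- file.path(tempdir(), "demo_participants")
dir.create(demo_dir, showWarnings = FALSE)
for (p in 1:3) {
  fake <- data.frame(RT = 500 + 10 * p + 1:5,
                     TrialType = rep(1:2, length.out = 5))
  write.csv(fake, file.path(demo_dir, sprintf("p%02d.csv", p)), row.names = FALSE)
}

# Keep only the first four rows of each fake file
demo <- read_participants(demo_dir, reader = read.csv,
                          skip = "Codebook.csv", n_rows = 4)
```

For the actual data, the call would presumably be `read_participants('data/Experiment 1', reader = readxl::read_xls)`, assuming the folder layout used above.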
Each row is an observation. The data is already in tidy format.
Processed_data <- data %>% select(c('Prime', 'TaskType', 'TrialType','RT','RespCorr', 'stay_2...15', 'stay_4...16', 'swt_2...17', 'swt_8...18', 'swt_2...19', 'swt_8...20'))
Processed_data$RespCorr[Processed_data$RespCorr == 'TRUE'] <- 1
Processed_data$RespCorr[Processed_data$RespCorr == 'FALSE'] <- 0
Processed_data <- Processed_data %>% mutate(Predictive_cue = ifelse(Prime == 'O', 0,1))
Performance on switch trials, relative to repeat trials, incurred a switch cost that was evident in longer RTs (836 vs. 689 ms)
# reproduce the above results here
Avg <- Processed_data %>%
  group_by(TrialType) %>%
  summarise(mean(RT))
Avg
## # A tibble: 3 × 2
## TrialType `mean(RT)`
## <dbl> <dbl>
## 1 0 2021.
## 2 1 731.
## 3 2 896.
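The summary above pools every trial from every participant and includes the unexplained `TrialType` code 0. One possible alternative, sketched here on made-up data, is to average within each participant first and then across participants, after dropping code 0. Whether the paper computed its means this way (or excluded error trials) is an assumption, not something the data files confirm. The sketch also assumes an `id` column is available, which the `select()` call above does not keep.

```r
library(dplyr)

# Made-up trials for 3 participants; column names mirror Processed_data,
# plus the `id` column that per-participant averaging needs.
set.seed(1)
toy <- data.frame(
  id        = rep(1:3, each = 40),
  TrialType = rep(c(0, rep(1:2, length.out = 39)), times = 3),
  RT        = round(rnorm(120, mean = 800, sd = 100))
)

# Step 1: one mean per participant and trial type, without code 0
per_participant <- toy %>%
  filter(TrialType %in% c(1, 2)) %>%
  group_by(id, TrialType) %>%
  summarise(mean_RT = mean(RT), .groups = "drop")

# Step 2: average the participant means
grand_means <- per_participant %>%
  group_by(TrialType) %>%
  summarise(mean_RT = mean(mean_RT), n_participants = n())

grand_means
```

With unequal trial counts per participant, the mean of participant means generally differs from the pooled trial mean, which is one candidate explanation for a mismatch with the published values.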
Performance on switch trials, relative to repeat trials, incurred a switch cost that was evident in […] lower accuracy rates (79% vs. 92%)
# reproduce the above results here
AvgAcc <- Processed_data %>%
  group_by(TrialType) %>%
  summarise( sum(RespCorr== 1)/n() * 100)
AvgAcc
## # A tibble: 3 × 2
## TrialType `sum(RespCorr == 1)/n() * 100`
## <dbl> <dbl>
## 1 0 83.3
## 2 1 91.4
## 3 2 79.2
Now you will analyze Predictive Switch Cues vs Non-predictive Switch Cues. Let’s start with reaction time.
This was indeed the case (mean RT-predictive cue: 819 ms; nonpredictive cue: 871 ms; … )
# reproduce the above results here
Avgpred <- Processed_data %>%
  group_by(Predictive_cue, TrialType == 2) %>%
  summarise(mean(RT))
Avgpred
## # A tibble: 4 × 3
## # Groups: Predictive_cue [2]
## Predictive_cue `TrialType == 2` `mean(RT)`
## <dbl> <lgl> <dbl>
## 1 0 FALSE 773.
## 2 0 TRUE 933.
## 3 1 FALSE 730.
## 4 1 TRUE 874.
#mean(Processed_data$swt_2...17, na.rm = TRUE)
#mean(Processed_data$swt_8...18, na.rm = TRUE)
Next you will try to reproduce error rates for Switch Predictive Cues vs Switch Non-predictive Cues.
However, error rates did not differ across these two groups of switch trials (predictive cue: 78.9%; nonpredictive cue: 78.8%)
# reproduce the above results here
AvgAccPred <- Processed_data %>%
  group_by(Predictive_cue, TrialType == 2) %>%
  summarise( sum(RespCorr== 1)/n() * 100)
AvgAccPred
## # A tibble: 4 × 3
## # Groups: Predictive_cue [2]
## Predictive_cue `TrialType == 2` `sum(RespCorr == 1)/n() * 100`
## <dbl> <lgl> <dbl>
## 1 0 FALSE 91.4
## 2 0 TRUE 80.2
## 3 1 FALSE 91.2
## 4 1 TRUE 78.6
The first claim is that in switch trials, predictive cues lead to statistically significant faster reaction times than nonpredictive cues.
… the performance on switch trials preceded by this cue would be better than on switch trials preceded by the nonpredictive cue. This was indeed the case (mean RT-predictive cue: 819 ms; nonpredictive cue: 871 ms; mean difference = 52 ms, 95% confidence interval, or CI = [19.5, 84.4]), two-tailed paired t(20) = 3.34, p < .01.
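The statistics in the quoted sentence can be checked against each other before touching any data. For a paired t-test, the 95% CI is the mean difference plus or minus the critical t value times the standard error, so the CI half-width gives the standard error, and the mean difference divided by that standard error gives t. The sketch below uses only the numbers quoted above; the variable names are introduced here for illustration.

```r
# Values quoted from the paper
ci_lower <- 19.5
ci_upper <- 84.4
df       <- 20

mean_diff  <- (ci_lower + ci_upper) / 2   # centre of the CI
half_width <- (ci_upper - ci_lower) / 2   # t_crit * standard error
t_crit     <- qt(0.975, df)               # two-tailed 95% critical value
se_diff    <- half_width / t_crit         # implied standard error of the difference
t_implied  <- mean_diff / se_diff         # implied t statistic

round(c(mean_diff = mean_diff, se_diff = se_diff, t_implied = t_implied), 2)
```

The implied t statistic comes out at about 3.34 and the CI is centred on roughly 52 ms, so the reported values are internally consistent with a paired test on 21 participant-level differences.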
# reproduce the above results here
# keep switch trials only
copy <- Processed_data %>% filter(TrialType == 2)
#length(copy$RT[copy$Predictive_cue == 1])
#length(copy$RT[copy$Predictive_cue==0])
t.test(copy$RT[copy$Predictive_cue == 1], copy$RT[copy$Predictive_cue==0] , paired = FALSE, alternative = "two.sided")
##
## Welch Two Sample t-test
##
## data: copy$RT[copy$Predictive_cue == 1] and copy$RT[copy$Predictive_cue == 0]
## t = -2.463, df = 911.93, p-value = 0.01396
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -105.28740 -11.90536
## sample estimates:
## mean of x mean of y
## 874.4925 933.0889
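The paper reports a paired t(20), which implies 21 paired observations (one per participant), whereas the call above is a Welch test over individual trials (df of about 912). A paired version could be sketched as follows. It runs on made-up data, and it assumes an `id` column is available, which `Processed_data` as built above does not contain. The effect size in the toy data is arbitrary and is not meant to reproduce the paper's values.

```r
library(dplyr)
library(tidyr)

# Made-up switch trials: 21 participants, 20 trials per cue type
set.seed(2)
n_id <- 21
toy_switch <- expand.grid(id = seq_len(n_id),
                          Predictive_cue = c(0, 1),
                          trial = 1:20)
toy_switch$RT <- rnorm(nrow(toy_switch),
                       mean = 900 - 50 * toy_switch$Predictive_cue,
                       sd = 120)

# One mean RT per participant and cue type, spread into two columns
cue_means <- toy_switch %>%
  group_by(id, Predictive_cue) %>%
  summarise(mean_RT = mean(RT), .groups = "drop") %>%
  pivot_wider(names_from = Predictive_cue,
              values_from = mean_RT,
              names_prefix = "cue_")

# Nonpredictive minus predictive, so a benefit of the cue is a positive difference
paired_result <- t.test(cue_means$cue_0, cue_means$cue_1, paired = TRUE)
paired_result$parameter   # degrees of freedom: number of participants minus 1
```

The same structure would apply to the accuracy comparison by replacing `RT` with `RespCorr`, since the mean of a 0/1 column is the proportion correct for each participant.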
Next, test the second claim.
However, error rates did not differ across these two groups of switch trials (predictive cue: 78.9%; nonpredictive cue: 78.8%), p = .8.
# reproduce the above results here
t.test(copy$RespCorr[copy$Predictive_cue == 1], copy$RespCorr[copy$Predictive_cue==0] , paired = FALSE, alternative = "two.sided")
##
## Welch Two Sample t-test
##
## data: copy$RespCorr[copy$Predictive_cue == 1] and copy$RespCorr[copy$Predictive_cue == 0]
## t = -0.67038, df = 932.2, p-value = 0.5028
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.06323688 0.03103457
## sample estimates:
## mean of x mean of y
## 0.7861716 0.8022727
Were you able to reproduce the results you attempted to reproduce? If not, what part(s) were you unable to reproduce?
Only some of them:
– Performance on switch trials, relative to repeat trials, incurred a switch cost that was evident in longer RTs (836 vs. 689 ms): this part did not reproduce (895 vs. 730 ms).
– Performance on switch trials, relative to repeat trials, incurred a switch cost that was evident in […] lower accuracy rates (79% vs. 92%): this part reproduced (79.1% vs. 91.37%).
– Predictive vs. nonpredictive switch cues, reaction time (mean RT-predictive cue: 819 ms; nonpredictive cue: 871 ms; … ): this part did not reproduce (874 vs. 933 ms).
– Predictive vs. nonpredictive switch cues, error rates (predictive cue: 78.9%; nonpredictive cue: 78.8%): this did not reproduce. The switch cues turned out to be predictive only, so the data do not show a case in which the switch cue was nonpredictive. The switch predictive cues gave an 83% accuracy rate.
– The first inferential test did not fully reproduce: it was significant at the 95% level (p-value = 0.013), but with different means (874 vs. 933 ms).
– The second inferential test did not reproduce, with p-value = 0.5 and mean values of 0.7861716 vs. 0.8022727.
How difficult was it to reproduce your results?
On a scale from 1 to 10, it was a 5.
What aspects made it difficult? What aspects made it easy?
First, the coding and labelling of the columns is unclear. I had a hard time understanding which columns refer to what and deciding which ones to use. For example, there is a column called TrialType, and then there are stay_2, stay_4, swt_2, and swt_8, which also indicate trial types but divided by the type of the preceding cue; that is in addition to the Prime column. So it was not clear exactly which of these columns I should use to identify, for instance, switch trials after predictive cues. Second, the values in the columns were not very consistent: there is a 0 in the TrialType column that apparently does not refer to anything, because there are only two trial types. Figuring out the data and what it means was the hardest part.