For this exercise, please try to reproduce the results from Experiment 1 of the associated paper (Farooqui & Manly, 2015). The PDF of the paper is included in the same folder as this Rmd file.

Methods summary:

Participants (N=21) completed a series of trials that required them either to switch from one task to the other or to stay on the same task. One task was to choose the larger of two values if they were surrounded by a green box. The other task was to choose the value with the larger font if they were surrounded by a blue box. Subliminal cues followed by a mask were presented before each trial. Cues included “O” (non-predictive cue), “M” (switch predictive cue), and “T” (repeat predictive cue). Reaction times and performance accuracy were measured.

Target outcomes:

Below is the specific result you will attempt to reproduce (quoted directly from the results section of Experiment 1):

> Performance on switch trials, relative to repeat trials, incurred a switch cost that was evident in longer RTs (836 vs. 689 ms) and lower accuracy rates (79% vs. 92%). If participants were able to learn the predictive value of the cue that preceded only switch trials and could instantiate relevant anticipatory control in response to it, the performance on switch trials preceded by this cue would be better than on switch trials preceded by the nonpredictive cue. This was indeed the case (mean RT-predictive cue: 819 ms; nonpredictive cue: 871 ms; mean difference = 52 ms, 95% confidence interval, or CI = [19.5, 84.4]), two-tailed paired t(20) = 3.34, p < .01. However, error rates did not differ across these two groups of switch trials (predictive cue: 78.9%; nonpredictive cue: 78.8%), p = .8.

Step 1: Load packages

library(tidyverse) # for data munging
library(knitr) # for kable table formatting
library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files
library(readxl) # import excel files
# optional packages:
library(broom) # tidy summaries of model and test output

Step 2: Load data

# This reads in all the participants' data (each participant is in a separate xls file) and combines them into one dataframe
# Each xls has 250 rows of trial data; the rest is the authors' calculations in Excel, which we don't want in the data


data_dir <- '~/R Mac working folder/class/problem_sets/ps3/Group B/Choice 1/data/Experiment 1'
files <- dir(data_dir)
data <- data.frame()
id <- 1
for (file in files){
  if(file != 'Codebook.xls'){ # skip the codebook, which is not participant data
    temp_data <- read_xls(file.path(data_dir, file))
    temp_data$id <- id # participant id, assigned in file order
    id <- id + 1
    temp_data <- temp_data[1:250, ] # keep only the 250 trial rows
    data <- rbind(data, temp_data)
  }
}

# I had to take time to decipher and figure out how to load and break down the data; a lesson in making this step easy, both for authors and for myself
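The loading loop above could also be wrapped in a small function so the path and the 250-row cut-off are stated once. This is only a sketch: `load_participants` and its arguments are hypothetical names, and the `reader` argument is an assumption added so the function can be tried on files other than xls.

```r
# Hypothetical helper: read every participant file in `data_dir`, keep the
# first `n_rows` rows of each file, and tag each file with a sequential id.
load_participants <- function(data_dir,
                              reader = readxl::read_xls,
                              n_rows = 250,
                              skip = "Codebook.xls") {
  # list.files() returns names in sorted order, so ids follow file order
  files <- setdiff(list.files(data_dir), skip)
  per_file <- lapply(seq_along(files), function(i) {
    d <- as.data.frame(reader(file.path(data_dir, files[i])))
    # cap at the rows actually present, so short files do not gain NA rows
    d <- d[seq_len(min(n_rows, nrow(d))), , drop = FALSE]
    d$id <- i
    d
  })
  # bind once at the end instead of growing a data frame inside a loop
  do.call(rbind, per_file)
}
```

With the folder used above, the call would be `data <- load_participants(data_dir)`, assuming `data_dir` holds the Experiment 1 path and every file has the same columns.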

Step 3: Tidy data

view(data)

glimpse(data)

## Rows: 5,250
## Columns: 23
## $ Block_Number <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,…
## $ Event_Number <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 1…
## $ Prime        <chr> "O", "T", "M", "O", "T", "T", "M", "T", "O", "T", "O", "…
## $ PrimeVisible <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ TaskType     <dbl> 2, 2, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1,…
## $ TrialType    <dbl> 0, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1,…
## $ CorrResp     <dbl> 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1,…
## $ RT           <dbl> 897.875, 608.500, 1097.000, 841.125, 945.000, 1020.500, …
## $ RespCorr     <lgl> TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, T…
## $ lnum         <dbl> 6, 85, 96, 42, 16, 22, 80, 11, 84, 17, 93, 79, 9, 78, 24…
## $ rnum         <dbl> 12, 61, 80, 67, 8, 11, 96, 22, 60, 32, 73, 53, 18, 95, 4…
## $ lFont        <dbl> 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0,…
## $ swt          <dbl> NA, NA, 1097.000, NA, NA, NA, 1200.875, NA, NA, NA, NA, …
## $ stay         <dbl> NA, 608.500, NA, 841.125, 945.000, 1020.500, NA, 737.000…
## $ stay_2...15  <dbl> NA, NA, NA, 841.125, NA, NA, NA, NA, 593.000, NA, 569.00…
## $ stay_4...16  <dbl> NA, 608.500, NA, NA, 945.000, 1020.500, NA, 737.000, NA,…
## $ swt_2...17   <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ swt_8...18   <dbl> NA, NA, 1097.000, NA, NA, NA, 1200.875, NA, NA, NA, NA, …
## $ swt_2...19   <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ swt_8...20   <dbl> NA, NA, 1, NA, NA, NA, 1, NA, NA, NA, NA, NA, NA, 1, NA,…
## $ stay_2...21  <dbl> NA, NA, NA, 1, NA, NA, NA, NA, 1, NA, 1, NA, 1, NA, 1, 1…
## $ stay_4...22  <dbl> NA, 1, NA, NA, 0, 1, NA, 1, NA, 1, NA, 1, NA, NA, NA, NA…
## $ id           <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…

unique(data$id) # 21 unique ids, so 21 respondents; 5,250 observations in total on 23 variables

##  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21

unique(data$Prime) # check the unique values; Prime is a chr and contains 2, 4, 8 in addition to the letter cues (a good reminder to check all the data)

## [1] "O" "T" "M" "2" "4" "8"
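To see where the numeric codes come from, one option is to cross-tabulate participant against Prime. The sketch below uses a made-up toy data frame (`toy` is not the real data) purely to show the shape of the check.

```r
# Toy stand-in for the combined data (values are made up for illustration)
toy <- data.frame(
  id    = c(1, 1, 1, 2, 2, 2),
  Prime = c("O", "T", "M", "2", "4", "8")
)

# Rows are participants, columns are Prime codes, cells are trial counts
prime_by_id <- table(toy$id, toy$Prime)
print(prime_by_id)

# Participants with at least one numeric Prime code
numeric_ids <- unique(toy$id[toy$Prime %in% c("2", "4", "8")])
print(numeric_ids)
```

On the real data the same two calls would use `data$id` and `data$Prime`; whether the numeric codes are confined to particular participants is something the table would show, not something assumed here.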

unique(data$TaskType) #aligned to coding

## [1] 2 1

unique(data$TrialType) # shows 0, 1, 2 for trial types; what are the 0s? NAs? maybe test runs? 0 = neither?

## [1] 0 1 2

data %>% filter(TrialType == 0)

## # A tibble: 42 x 23
##    Block_Number Event_Number Prime PrimeVisible TaskType TrialType CorrResp
##           <dbl>        <dbl> <chr>        <dbl>    <dbl>     <dbl>    <dbl>
##  1            3            1 O                1        2         0        1
##  2            4            1 O                1        1         0        0
##  3            3            1 O                1        2         0        0
##  4            4            1 O                1        2         0        0
##  5            3            1 O                1        1         0        0
##  6            4            1 O                1        1         0        0
##  7            3            1 O                1        1         0        1
##  8            4            1 O                1        2         0        1
##  9            3            1 2                1        2         0        1
## 10            4            1 2                1        1         0        0
## # … with 32 more rows, and 16 more variables: RT <dbl>, RespCorr <lgl>,
## #   lnum <dbl>, rnum <dbl>, lFont <dbl>, swt <dbl>, stay <dbl>,
## #   stay_2...15 <dbl>, stay_4...16 <dbl>, swt_2...17 <dbl>, swt_8...18 <dbl>,
## #   swt_2...19 <dbl>, swt_8...20 <dbl>, stay_2...21 <dbl>, stay_4...22 <dbl>,
## #   id <dbl>
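The ten rows printed above all have Event_Number 1, and 42 rows is two per participant, which suggests (but does not prove) that TrialType 0 marks the first trial of each block, where there is no previous task to switch from or repeat. A minimal sketch of how that could be checked, using made-up toy data rather than the real file:

```r
# Toy stand-in: two blocks of three trials each (values are made up)
toy <- data.frame(
  Block_Number = c(3, 3, 3, 4, 4, 4),
  Event_Number = c(1, 2, 3, 1, 2, 3),
  TrialType    = c(0, 1, 2, 0, 2, 1)
)

# Are all TrialType 0 rows the first event of their block?
first_trials <- toy[toy$TrialType == 0, ]
all_first <- all(first_trials$Event_Number == 1)
print(all_first)

# Keep only the rows that are classed as repeat or switch trials
analysable <- toy[toy$TrialType != 0, ]
print(nrow(analysable))
```

If the check holds on the real data, dropping these rows before summarising would be an alternative to excluding the "Neither" row from each table afterwards.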

Each row is an observation. The data is already in tidy format.

Step 4: Run analysis

Pre-processing

# the paper reports taking the more stable "median" values, yet it also reports means?

I knew there was an issue with the additional, misaligned Prime codes but was unable to determine their coding on my own, and would otherwise have continued without them. A reminder to be as resourceful as planned, or else to make do with what is available and report that.

data$originalprime2 <- data$Prime # neat shortcut: assigning to data$newname creates a new variable within the dataset
data$originalPrime <- data$Prime
data$Prime <- recode(data$Prime, '2' = "O", '4' = "T", '8' = "M")

#recode variables to make referencing easier
data$originalTrialType <- data$TrialType
data$Prime <- recode(data$Prime, 'O' = "Nonpredictive Cue", 'M' = "Switch Predictive Cue", 'T' = "Repeat Predictive Cue")
data$TrialType <- recode(data$TrialType, '0' = "Neither", '1' = "Repeat Trials", '2' = "Switch Trials")
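The two recoding steps for Prime can also be expressed as a single lookup vector, which makes it easy to confirm that no code was left unmapped. This is a base R sketch on a made-up vector; `prime_raw`, `prime_labels`, and `prime_clean` are illustrative names, and the mapping (2 = O, 4 = T, 8 = M) is the one used in the recode above.

```r
# Toy Prime column mixing letter and numeric codes (made up for illustration)
prime_raw <- c("O", "2", "T", "4", "M", "8")

# Lookup table: names are the raw codes, values are the labels
prime_labels <- c(
  "O" = "Nonpredictive Cue",     "2" = "Nonpredictive Cue",
  "T" = "Repeat Predictive Cue", "4" = "Repeat Predictive Cue",
  "M" = "Switch Predictive Cue", "8" = "Switch Predictive Cue"
)
prime_clean <- unname(prime_labels[prime_raw])

# Any code missing from the lookup would come back as NA
stopifnot(!anyNA(prime_clean))

# Cross-check raw codes against the labels they received
print(table(prime_raw, prime_clean))
```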

Descriptive statistics

Performance on switch trials, relative to repeat trials, incurred a switch cost that was evident in longer RTs (836 vs. 689 ms). Note: medians are used unless stated otherwise.

# reproduce the above results here
str(data)

## tibble [5,250 × 26] (S3: tbl_df/tbl/data.frame)
##  $ Block_Number     : num [1:5250] 3 3 3 3 3 3 3 3 3 3 ...
##  $ Event_Number     : num [1:5250] 1 2 3 4 5 6 7 8 9 10 ...
##  $ Prime            : chr [1:5250] "Nonpredictive Cue" "Repeat Predictive Cue" "Switch Predictive Cue" "Nonpredictive Cue" ...
##  $ PrimeVisible     : num [1:5250] 1 1 1 1 1 1 1 1 1 1 ...
##  $ TaskType         : num [1:5250] 2 2 1 1 1 1 2 2 2 2 ...
##  $ TrialType        : chr [1:5250] "Neither" "Repeat Trials" "Switch Trials" "Repeat Trials" ...
##  $ CorrResp         : num [1:5250] 1 1 1 0 1 1 1 1 0 0 ...
##  $ RT               : num [1:5250] 898 608 1097 841 945 ...
##  $ RespCorr         : logi [1:5250] TRUE TRUE TRUE TRUE FALSE TRUE ...
##  $ lnum             : num [1:5250] 6 85 96 42 16 22 80 11 84 17 ...
##  $ rnum             : num [1:5250] 12 61 80 67 8 11 96 22 60 32 ...
##  $ lFont            : num [1:5250] 1 1 0 0 0 0 1 1 0 0 ...
##  $ swt              : num [1:5250] NA NA 1097 NA NA ...
##  $ stay             : num [1:5250] NA 608 NA 841 945 ...
##  $ stay_2...15      : num [1:5250] NA NA NA 841 NA ...
##  $ stay_4...16      : num [1:5250] NA 608 NA NA 945 ...
##  $ swt_2...17       : num [1:5250] NA NA NA NA NA NA NA NA NA NA ...
##  $ swt_8...18       : num [1:5250] NA NA 1097 NA NA ...
##  $ swt_2...19       : num [1:5250] NA NA NA NA NA NA NA NA NA NA ...
##  $ swt_8...20       : num [1:5250] NA NA 1 NA NA NA 1 NA NA NA ...
##  $ stay_2...21      : num [1:5250] NA NA NA 1 NA NA NA NA 1 NA ...
##  $ stay_4...22      : num [1:5250] NA 1 NA NA 0 1 NA 1 NA 1 ...
##  $ id               : num [1:5250] 1 1 1 1 1 1 1 1 1 1 ...
##  $ originalprime2   : chr [1:5250] "O" "T" "M" "O" ...
##  $ originalPrime    : chr [1:5250] "O" "T" "M" "O" ...
##  $ originalTrialType: num [1:5250] 0 1 2 1 1 1 2 1 1 1 ...

med_RT1 <- data %>%
  group_by(TrialType) %>%
  summarise(median_RT = median(RT),
            mean_RT = mean(RT))


kable(med_RT1)

TrialType	median_RT	mean_RT
Neither	1027.7852	2021.0156
Repeat Trials	665.0625	730.9531
Switch Trials	812.9375	895.6084

# doesn't match the exact numbers, but the relationship (switch slower than repeat) does hold

med_RT <- data %>% 
        group_by(TrialType) %>% 
        summarise(median_RT = median(RT),
                  mean_RT=mean(RT))

kable(med_RT[-1, ])

TrialType	median_RT	mean_RT
Repeat Trials	665.0625	730.9531
Switch Trials	812.9375	895.6084
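One possible reason the pooled values differ from the paper's 836 vs. 689 ms is the level of aggregation: the tables above pool all trials, whereas the paper may have summarised each participant first. That is an assumption, not something confirmed here. A sketch of the two-step version on made-up toy data:

```r
# Toy trial-level data for two participants (RTs are made up)
toy <- data.frame(
  id        = rep(1:2, each = 4),
  TrialType = rep(c("Repeat Trials", "Switch Trials"), times = 4),
  RT        = c(600, 800, 700, 900, 500, 700, 650, 850)
)

# Step 1: one median RT per participant per trial type
per_id <- aggregate(RT ~ id + TrialType, data = toy, FUN = median)

# Step 2: average those medians across participants
grand <- aggregate(RT ~ TrialType, data = per_id, FUN = mean)
print(grand)
```

On the real data the same two calls would run on `data` after dropping the "Neither" rows. Whether error trials or outliers should also be excluded first is left open here.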

Performance on switch trials, relative to repeat trials, incurred a switch cost that was evident in […] lower accuracy rates (79% vs. 92%)

# reproduce the above results here; these almost match after rounding

mean_cor <- data %>%
  group_by(TrialType) %>%
  summarise(accuracy = mean(RespCorr))

mean_RespCorr <- data %>% 
        group_by(TrialType) %>%
        summarise(accuracy = mean(RespCorr))

kable(mean_RespCorr[-1, ])

TrialType	accuracy
Repeat Trials	0.9137196
Switch Trials	0.7919738

Now you will analyze Predictive Switch Cues vs Non-predictive Switch Cues. Let’s start with reaction time.

This was indeed the case (mean RT-predictive cue: 819 ms; nonpredictive cue: 871 ms; … )

# reproduce the above results here, does not match mean RTs

mean_RT_Prime <- data %>%
  group_by(Prime) %>%
  summarise(median_RT = median(RT),
            mean_RT = mean(RT))

kable(mean_RT_Prime)

Prime	median_RT	mean_RT
Nonpredictive Cue	717.0938	798.1232
Repeat Predictive Cue	657.0781	723.7750
Switch Predictive Cue	801.1250	883.2764

mean_Prime_RT_Ind <- data %>% 
  filter(TrialType == "Switch Trials") %>% 
  group_by(id, Prime) %>% 
  summarise(meanRT = mean(RT),
            medianRT = median(RT)) #Individual Means
mean_Prime_RT <- mean_Prime_RT_Ind %>% group_by(Prime) %>% 
  summarise(grandmeanRT = mean(meanRT),
            grandMedianRT = median(medianRT)) #Grand Means

kable(mean_Prime_RT)

Prime	grandmeanRT	grandMedianRT
Nonpredictive Cue	907.7555	783.1562
Switch Predictive Cue	883.3979	799.5000

Next you will try to reproduce error rates for Switch Predictive Cues vs Switch Non-predictive Cues.

However, error rates did not differ across these two groups of switch trials (predictive cue: 78.9%; nonpredictive cue: 78.8%)

# reproduce the above results here - made it here in 3 hr

mean_Prime_RespCorr_Ind <- data %>%
  filter(TrialType == "Switch Trials") %>%
  group_by(id, Prime) %>%
  summarise(meanCorr = mean(RespCorr)) #Individual Means
mean_Prime_RespCorr <- mean_Prime_RespCorr_Ind %>%
  group_by(Prime) %>%
  summarise(grandmeanCorr = mean(meanCorr)) #Grand Means

kable(mean_Prime_RespCorr)

Prime	grandmeanCorr
Nonpredictive Cue	0.7802613
Switch Predictive Cue	0.7994066

Inferential statistics

The first claim is that in switch trials, predictive cues lead to statistically significantly faster reaction times than nonpredictive cues.

… the performance on switch trials preceded by this cue would be better than on switch trials preceded by the nonpredictive cue. This was indeed the case (mean RT-predictive cue: 819 ms; nonpredictive cue: 871 ms; mean difference = 52 ms, 95% confidence interval, or CI = [19.5, 84.4]), two-tailed paired t(20) = 3.34, p < .01.

# reproduce the above results here
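This step was not completed. As a sketch of what the paired comparison could look like, the block below runs a two-tailed paired t-test on made-up per-participant mean RTs; `toy_ind` is a stand-in with the same columns as `mean_Prime_RT_Ind`, and none of the numbers come from the real data.

```r
# Toy per-participant means (made-up numbers): one row per id and cue
toy_ind <- data.frame(
  id     = rep(1:5, times = 2),
  Prime  = rep(c("Nonpredictive Cue", "Switch Predictive Cue"), each = 5),
  meanRT = c(900, 850, 870, 910, 880, 860, 820, 845, 870, 850)
)

# Sort so both vectors list participants in the same order
toy_ind <- toy_ind[order(toy_ind$Prime, toy_ind$id), ]
nonpred <- toy_ind$meanRT[toy_ind$Prime == "Nonpredictive Cue"]
pred    <- toy_ind$meanRT[toy_ind$Prime == "Switch Predictive Cue"]

# Two-tailed paired t-test; the estimate is mean(nonpred - pred)
rt_test <- t.test(nonpred, pred, paired = TRUE)
print(rt_test)
```

Applied to this document, the same call would take the per-participant values in `mean_Prime_RT_Ind`, after checking that every id has both cue types. With 21 participants the degrees of freedom would be 20, as in the paper.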

Next, test the second claim.

However, error rates did not differ across these two groups of switch trials (predictive cue: 78.9%; nonpredictive cue: 78.8%), p = .8.

# reproduce the above results here
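This step was not completed either. One way it could be approached is to build a participant-by-cue accuracy matrix from trial-level responses and compare its two columns with a paired t-test. The sketch uses made-up toy data; `toy` and `acc` are illustrative names, not objects from the real analysis.

```r
# Toy switch-trial responses for four participants (values are made up)
toy <- data.frame(
  id       = rep(1:4, each = 4),
  Prime    = rep(c("Nonpredictive Cue", "Switch Predictive Cue"),
                 each = 2, times = 4),
  RespCorr = c(TRUE, FALSE, TRUE, TRUE,
               TRUE, TRUE, TRUE, FALSE,
               FALSE, TRUE, TRUE, TRUE,
               TRUE, TRUE, FALSE, FALSE)
)

# Accuracy per participant (rows) and cue (columns)
acc <- tapply(toy$RespCorr, list(toy$id, toy$Prime), mean)

# A paired test needs both cues for every participant
stopifnot(!anyNA(acc))

acc_test <- t.test(acc[, "Switch Predictive Cue"],
                   acc[, "Nonpredictive Cue"],
                   paired = TRUE)
print(acc_test)
```

On the real data the input would be the switch trials of `data`, with `RespCorr` as the accuracy variable used in the descriptives above.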

Step 5: Reflection

Were you able to reproduce the results you attempted to reproduce? If not, what part(s) were you unable to reproduce?

How difficult was it to reproduce your results?

I was unable to reproduce all of the results in the time available, but I was able to complete some phases. I still have difficulty knowing which verbs to combine to look at specific subsets and how to call summary information correctly; however, this exercise was helpful.

What aspects made it difficult? What aspects made it easy?

I struggled with the loading and preprocessing steps, which required a detailed look at the paper and some thought about how to get the results. I was able to do the descriptives, but they would have been wrong had I not handled some of the coding errors within the data set. I was unsure the whole time whether I was proceeding correctly, and I went back to some of the solutions to learn.


ps3b.251 Reproducibility Group B and choice 1

Kris Evans

11/7/2020