articleID <- "3-10-2014_PS" # insert the article ID code here e.g., "10-3-2015_PS"
reportType <- 'final'
pilotNames <- "Gobi, Marc" # insert the pilot's name here e.g., "Tom Hardwicke". If there are multiple pilots enter both names in a character string e.g., "Tom Hardwicke, Bob Dylan"
copilotNames <- "Kyle MacDonald" # insert the co-pilot's name here e.g., "Michael Frank". If there are multiple co-pilots enter both names in a character string e.g., "Tom Hardwicke, Bob Dylan"
pilotTTC <- 720 # insert the pilot's estimated time to complete (in minutes, fine to approximate) e.g., 120
copilotTTC <- 100 # insert the co- pilot's estimated time to complete (in minutes, fine to approximate) e.g., 120
pilotStartDate <- as.Date("10/26/17", format = "%m/%d/%y") # insert the pilot's start date in US format e.g., as.Date("01/25/18", format = "%m/%d/%y")
copilotStartDate <- as.Date("6/13/18", format = "%m/%d/%y") # insert the co-pilot's start date in US format e.g., as.Date("01/25/18", format = "%m/%d/%y")
completionDate <- as.Date("04/20/19", format = "%m/%d/%y") # copilot insert the date of final report completion (after any necessary rounds of author assistance) in US format e.g., as.Date("01/25/18", format = "%m/%d/%y")

Methods summary:

To test how exploration is affected by reward frequency and control, the researchers recruited 120 college students. The students were either in the with-control group, in which case outcome depended on their actions, or the yoked group, in which case outcome depended on the action of a with-control participant that they were paired with.

Participants were shown a 12x10 grid of keys and asked to press keys for 100 trials (divided into 4 blocks, 25 trials each). Keys they already pressed (old keys) turned gray. Partipants paid 1 point to explore/press new keys (exploration cost) and 0 points to press old keys. Upon pressing a new key, participants in the with-control condition received 11 points with probability p and 0 points with probability 1-p when pressing new keys. Note, they still had to pay their “new key press” cost of 1 point.

Participants in the yoked condition received whatever the with-control participant they were yoked to received on that trial during the first 50 trials but actually had control in the second 50 trials (i.e. they were treated like those in the with-control group). The study was 2 conditions (with-control / yoked) x 3 reward frequencies (p=0.1 extremely low, p=0.2 moderate, p=1 extremely high) and so the 120 participants were divided into 6 groups of 20 each. In such a setup, yoked participants would receive 11 points if they did not explore and 10 points if they did explore, on a trial when the with-control participant they were yoked to received a reward for exploring.

A direct measure of perceived controllability was found by surveying participants of how much they felt in control of their outcomes in the first and second half of the trials. An indirect measure of perceived controllability was found by having all participants predict (over the course of 16 trials picked randomly (but ensuring first/second half samples were evenly picked) from the 100 trials completed by another player) the probability of four possible outcomes (explore+reward, no-explore+reward, explore+no-reward, no-explore+no-reward) for trial t+1. Low variance in predicted probabilities indicated that the participants thought that the other player does not have control over their outcomes, so they predicted the same probability every time, whereas high variance in predicted probabilities indicated that they do have control.


Target outcomes:

The top row of Figure 2 presents exploration rates (the percentage of trials in which participants tried new keys) across 4 blocks of 25 trials each. To further examine exploration rates, we conducted a 4 (block: 1, 2, 3, 4) × 3 (reward frequency: extremely low, moderate, extremely high) × 2 (control group: with-control, yoked) repeated measures analysis of variance (ANOVA), which revealed a significant three-way interaction, F(6, 342) = 3.35, p < .01, ηp 2 = .05. A post hoc Tukey’s test of exploration rates in the two control groups revealed that when the frequency of rewards from exploration was extremely low (p = .1), exploration rates decreased from about 70% in the first block to approximately 40% in the last block for both the with-control (p < .01) and the yoked (p < .01) groups. However, when reward frequency was moderate (p = .2), with-control participants continued to explore in about 70% of the trials, whereas yoked participants reduced their exploration rates to approximately 40%. This difference between the groups was evident in the second block (p < .01) and remained after yoked participants regained control (p < .01 and p = .09 for the third and fourth blocks, respectively). Finally, when reward frequency was extremely high (p = 1), yoked participants explored less than with-control participants in the second block (p < .01); however, this gap disappeared immediately after yoked participants regained control (ps > .9 for the third and fourth blocks). In summary, the classic learned-helplessness pattern was observed only when the reward frequency was moderate.

Here is the top row of Figure 2. Note that error bars represent represent ±1 SEM. And that points show the mean percentage of trials on which participants tried new keys.


Step 1: Load packages

library(tidyverse) # for data munging
library(knitr) # for kable table formating
library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files
library(readxl) # import excel files
library(ReproReports) # custom report functions
library(car)
# Prepare report object. This will be updated automatically by the reproCheck function each time values are compared.
reportObject <- data.frame(dummyRow = TRUE, reportedValue = NA, 
                           obtainedValue = NA, valueType = NA, 
                           percentageError = NA, comparisonOutcome = NA, 
                           eyeballCheck = NA)

Step 2: Load data

expldt = read_excel('data/LH_model_predictions_and_experimental_results.xlsx', 
                    col_names = TRUE) %>% 
  rename("p_reward" = "P(reward)", 
         "yoked" = "Yoked (0=with control)", 
         "1"="Exploration Rates_Block1", 
         "2"="Exploration Rates_Block2", 
         "3"="Exploration Rates_Block3", 
         "4"="Exploration Rates_Block4")

Step 3: Tidy data

expldt_tidy = expldt %>% gather(block, block_exploration, "1":"4") 
expldt_tidy = na.omit(expldt_tidy)

Change 1/0s to meaningful condition names.

expldt_tidy <- expldt_tidy %>%
  mutate(yoked_condition = ifelse(yoked == 0, "with_control", "yoked"))

Make sure all variables are coded as factors.

expldt_tidy <- expldt_tidy %>% 
  mutate(block_factor = factor(block),
         block = as.numeric(block),
         p_reward_factor = factor(p_reward),
         yoked_condition = factor(yoked_condition)) 

Check how many participants we have in each condition. From the Participants section,

One hundred twenty Technion students (52 female, 68 male; average age = 24 years).

From the results section,

This resulted in a 3 (reward frequency) × 2 (control group) between-participants design that yielded six groups of 20 participants each

From footnote 3,

The number of participants in each group was determined a priori following previous studies that used a similar paradigm. No observations were excluded.

expldt_tidy %>% 
  distinct(Subject, yoked_condition, p_reward) %>% 
  count(yoked_condition, p_reward) %>% 
  kable()
yoked_condition p_reward n
with_control 0.1 20
with_control 0.2 20
with_control 1.0 20
yoked 0.1 20
yoked 0.2 20
yoked 1.0 20

We can reproduce the number of participants in each condition and reward frequency condition.

Step 4: Run analysis

Descriptive statistics

ms <- expldt_tidy %>% 
  group_by(yoked_condition, p_reward, block) %>% 
  summarise(m = mean(block_exploration),
            stdev = sd(block_exploration),
            n = n(), 
            sem = stdev / sqrt(n)) %>% 
  mutate_if(is.numeric, round, digits = 3) 

ms %>% kable()
yoked_condition p_reward block m stdev n sem
with_control 0.1 1 0.684 0.228 20 0.051
with_control 0.1 2 0.648 0.302 20 0.068
with_control 0.1 3 0.564 0.315 20 0.070
with_control 0.1 4 0.410 0.248 20 0.056
with_control 0.2 1 0.696 0.202 20 0.045
with_control 0.2 2 0.737 0.280 20 0.063
with_control 0.2 3 0.727 0.320 20 0.072
with_control 0.2 4 0.686 0.387 20 0.086
with_control 1.0 1 0.947 0.073 20 0.016
with_control 1.0 2 0.958 0.063 20 0.014
with_control 1.0 3 0.976 0.033 20 0.007
with_control 1.0 4 0.972 0.057 20 0.013
yoked 0.1 1 0.670 0.268 20 0.060
yoked 0.1 2 0.552 0.346 20 0.077
yoked 0.1 3 0.476 0.394 20 0.088
yoked 0.1 4 0.422 0.340 20 0.076
yoked 0.2 1 0.634 0.294 20 0.066
yoked 0.2 2 0.360 0.339 20 0.076
yoked 0.2 3 0.354 0.338 20 0.076
yoked 0.2 4 0.416 0.310 20 0.069
yoked 1.0 1 0.792 0.246 20 0.055
yoked 1.0 2 0.598 0.339 20 0.076
yoked 1.0 3 0.856 0.112 20 0.025
yoked 1.0 4 0.948 0.058 20 0.013

Try to reproduce their key plot.

ms %>% 
  ggplot(aes(x = block, y = m, color = yoked_condition)) +
  geom_line(aes(group = yoked_condition), size = 1) +
  geom_pointrange(aes(ymin = m - sem, ymax = m + sem)) +
  facet_wrap(~p_reward) +
  lims(y=c(0,1)) +
  theme_minimal() +
  theme(legend.position = 'top',
        panel.border = element_rect(colour = "grey", fill=NA, size=1))

Our plot looks pretty spot on when compared to the top row of Figure 2:

reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[1,]$m, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[1,]$sem, valueType = 'se', eyeballCheck = TRUE)
## [1] "MATCH for se. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[2,]$m, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[2,]$sem, valueType = 'se', eyeballCheck = TRUE)
## [1] "MATCH for se. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[3,]$m, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[3,]$sem, valueType = 'se', eyeballCheck = TRUE)
## [1] "MATCH for se. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure',
                           obtainedValue = ms[4,]$m, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[4,]$sem, valueType = 'se', eyeballCheck = TRUE)
## [1] "MATCH for se. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[5,]$m, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[5,]$sem, valueType = 'se', eyeballCheck = TRUE)
## [1] "MATCH for se. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[6,]$m, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[6,]$sem, valueType = 'se', eyeballCheck = TRUE)
## [1] "MATCH for se. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[7,]$m, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[7,]$sem, valueType = 'se', eyeballCheck = TRUE)
## [1] "MATCH for se. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[8,]$m, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[8,]$sem, valueType = 'se', eyeballCheck = TRUE)
## [1] "MATCH for se. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[9,]$m, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[9,]$sem, valueType = 'se', eyeballCheck = TRUE)
## [1] "MATCH for se. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[10,]$m, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[10,]$sem, valueType = 'se', eyeballCheck = TRUE)
## [1] "MATCH for se. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[11,]$m, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[11,]$sem, valueType = 'se', eyeballCheck = TRUE)
## [1] "MATCH for se. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[12,]$m, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[12,]$sem, valueType = 'se', eyeballCheck = TRUE)
## [1] "MATCH for se. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[13,]$m, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[13,]$sem, valueType = 'se', eyeballCheck = TRUE)
## [1] "MATCH for se. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[14,]$m, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[14,]$sem, valueType = 'se', eyeballCheck = TRUE)
## [1] "MATCH for se. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[15,]$m, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[15,]$sem, valueType = 'se', eyeballCheck = TRUE)
## [1] "MATCH for se. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[1,]$m, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[1,]$sem, valueType = 'se', eyeballCheck = TRUE)
## [1] "MATCH for se. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[16,]$m, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[16,]$sem, valueType = 'se', eyeballCheck = TRUE)
## [1] "MATCH for se. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[17,]$m, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[17,]$sem, valueType = 'se', eyeballCheck = TRUE)
## [1] "MATCH for se. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[18,]$m, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[18,]$sem, valueType = 'se', eyeballCheck = TRUE)
## [1] "MATCH for se. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[19,]$m, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[19,]$sem, valueType = 'se', eyeballCheck = TRUE)
## [1] "MATCH for se. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure',
                           obtainedValue = ms[20,]$m, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[20,]$sem, valueType = 'se', eyeballCheck = TRUE)
## [1] "MATCH for se. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[21,]$m, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[21,]$sem, valueType = 'se', eyeballCheck = TRUE)
## [1] "MATCH for se. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[22,]$m, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[22,]$sem, valueType = 'se', eyeballCheck = TRUE)
## [1] "MATCH for se. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure',
                           obtainedValue = ms[23,]$m, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[23,]$sem, valueType = 'se', eyeballCheck = TRUE)
## [1] "MATCH for se. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[24,]$m, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = 'figure', 
                           obtainedValue = ms[24,]$sem, valueType = 'se', eyeballCheck = TRUE)
## [1] "MATCH for se. Eyeball comparison only."

Inferential statistics

To further examine exploration rates, we conducted a 4 (block: 1, 2, 3, 4) × 3 (reward frequency: extremely low, moderate, extremely high) × 2 (control group: with-control, yoked) repeated measures analysis of variance (ANOVA), which revealed a significant three-way interaction, F(6, 342) = 3.35, p < .01, ηp2 = .05.

m1 <- ez::ezANOVA(data = expldt_tidy, 
                  dv = block_exploration, 
                  wid = Subject, 
                  within = .(block_factor),
                  between = .(yoked_condition, p_reward_factor), 
                  detailed = TRUE,
                  type = 3)

m_output <- m1$ANOVA %>% mutate_if(is.numeric, round, digits = 4)

m1_dfn <- m_output$DFn[8]
m1_dfd <-  m_output$DFd[8]
m1_f <-  m_output$F[8]
m1_p <-  m_output$p[8]
m1_pes <-  m_output$SSn[8] / m_output$SSd[8]

Add ANOVA values to report object:

reportObject <- reproCheck(reportedValue = "6", obtainedValue = m1_dfn, valueType = 'df')
## [1] "MATCH for df. The reported value (6) and the obtained value (6) differed by 0%. Note that the obtained value was rounded to 0 decimal places to match the reported value."
reportObject <- reproCheck(reportedValue = "342", obtainedValue = m1_dfd, valueType = 'df')
## [1] "MATCH for df. The reported value (342) and the obtained value (342) differed by 0%. Note that the obtained value was rounded to 0 decimal places to match the reported value."
reportObject <- reproCheck(reportedValue = "3.35", obtainedValue = m1_f, valueType = 'F')
## [1] "MINOR_ERROR for F. The reported value (3.35) and the obtained value (3.15) differed by 5.97%. Note that the obtained value was rounded to 2 decimal places to match the reported value."
reportObject <- reproCheck(reportedValue = "p < .01", obtainedValue = m1_p, valueType = 'p', eyeballCheck = TRUE)
## [1] "MATCH for p. Eyeball comparison only."
reportObject <- reproCheck(reportedValue = ".05", obtainedValue = m1_pes, valueType = 'pes')
## [1] "MAJOR_ERROR for pes. The reported value (0.05) and the obtained value (0.06) differed by 20%. Note that the obtained value was rounded to 2 decimal places to match the reported value."

Next, we try to reproduce the post hoc Tukey tests. From the paper,

A post hoc Tukey’s test of exploration rates in the two control groups revealed that when the frequency of rewards from exploration was extremely low (p = .1), exploration rates decreased from about 70% in the first block to approximately 40% in the last block for both the with-control (p < .01) and the yoked (p < .01) groups.

post_hoc_ms <- expldt_tidy %>% 
  filter(p_reward == "0.1") %>% 
  group_by(block_factor, yoked_condition) %>% 
  summarise(m = mean(block_exploration)) %>% 
  mutate_if(is.numeric, funs(round(. * 100, 0)))

post_hoc_ms %>% kable()
block_factor yoked_condition m
1 with_control 68
1 yoked 67
2 with_control 65
2 yoked 55
3 with_control 56
3 yoked 48
4 with_control 41
4 yoked 42

We are able to reproduce values that look approximately similar to the means reported for the post-hoc Tukey tests.

wc_m_1 <- post_hoc_ms %>% filter(block_factor == 1, yoked_condition == "with_control") %>% pull(m)

reportObject <- reproCheck(reportedValue = "about 70", obtainedValue = wc_m_1, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
y_m_1 <- post_hoc_ms %>% filter(block_factor == 1, yoked_condition == "yoked") %>% pull(m)

reportObject <- reproCheck(reportedValue = "about 70", obtainedValue = y_m_1, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
wc_m_4 <- post_hoc_ms %>% filter(block_factor == 4, yoked_condition == "with_control") %>% pull(m)

reportObject <- reproCheck(reportedValue = "approximately 40", obtainedValue = wc_m_4, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
y_m_4 <- post_hoc_ms %>% filter(block_factor == 4, yoked_condition == "yoked") %>% pull(m)

reportObject <- reproCheck(reportedValue = "approximately 40", obtainedValue = y_m_4, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."

Next, we try to reproduce the reported p-values.

d_low_control <- expldt_tidy %>% 
  filter(yoked_condition == "with_control", 
         p_reward == "0.1", block %in% c(1, 4))

an_low_control <- aov(block_exploration ~ block_factor, data = d_low_control)
low_control_tuk <- TukeyHSD(an_low_control) 

reportObject <- reproCheck(reportedValue = "p < .01", obtainedValue = low_control_tuk$block_factor[4], valueType = 'p', eyeballCheck = TRUE)
## [1] "MATCH for p. Eyeball comparison only."

Next we do the same thing but with the yoked condition

d_low_yoked <- expldt_tidy %>% 
  filter(yoked_condition == "yoked",
         p_reward == "0.1", block %in% c(1, 4))

an_low_yoked <- aov(block_exploration ~ block_factor, data = d_low_yoked)
low_yoked_tuk <- TukeyHSD(an_low_yoked, "block_factor") 

reportObject <- reproCheck(reportedValue = "p < .01", obtainedValue = low_yoked_tuk$block_factor[4], 
                           valueType = 'p', eyeballCheck = FALSE)
## [1] "EYEBALL CHECK ERROR for p. Eyeball comparison only."

We are unable to reproduce the repored p value (\(p < .01\)) for the yoked condition. We get 0.015.

Next, we do the same analysis but focusing on differences between yoked-control in the moderate reward frequency and block. From the paper,

However, when reward frequency was moderate (p = .2), with-control participants continued to explore in about 70% of the trials, whereas yoked participants reduced their exploration rates to approximately 40%. This difference between the groups was evident in the second block (p < .01) and remained after yoked participants regained control (p < .01 and p = .09 for the third and fourth blocks, respectively).

d_moderate <- expldt_tidy %>% filter(p_reward == "0.2")

post_hoc_ms_mod <- d_moderate %>% 
  group_by(block_factor, yoked_condition) %>% 
  summarise(m = mean(block_exploration)) %>% 
  mutate_if(is.numeric, funs(round(. * 100, 0)))

post_hoc_ms_mod %>% kable()
block_factor yoked_condition m
1 with_control 70
1 yoked 63
2 with_control 74
2 yoked 36
3 with_control 73
3 yoked 35
4 with_control 69
4 yoked 42

We are able to reproduce values that look approximately similar to the means reported for the post-hoc Tukey tests.

wc_m_1 <- post_hoc_ms_mod %>% filter(block_factor == 1, yoked_condition == "with_control") %>% pull(m)

reportObject <- reproCheck(reportedValue = "about 70", obtainedValue = wc_m_1, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
y_m_1 <- post_hoc_ms_mod %>% filter(block_factor == 1, yoked_condition == "yoked") %>% pull(m)

reportObject <- reproCheck(reportedValue = "about 70", obtainedValue = y_m_1, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
wc_m_4 <- post_hoc_ms_mod %>% filter(block_factor == 4, yoked_condition == "with_control") %>% pull(m)

reportObject <- reproCheck(reportedValue = "about 70", obtainedValue = wc_m_4, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."
y_m_4 <- post_hoc_ms_mod %>% filter(block_factor == 4, yoked_condition == "yoked") %>% pull(m)

reportObject <- reproCheck(reportedValue = "approximately 40", obtainedValue = y_m_4, valueType = 'mean', eyeballCheck = TRUE)
## [1] "MATCH for mean. Eyeball comparison only."

Next, run the Tukey tests.

an_mod_block2 <- d_moderate %>% 
  filter(block == 2) %>% 
  aov(block_exploration ~ yoked_condition, data = .)

mod_block_tuk <- TukeyHSD(an_mod_block2) 

reportObject <- reproCheck(reportedValue = "p < .01", obtainedValue = mod_block_tuk$yoked_condition[4], 
                           valueType = 'p', eyeballCheck =  TRUE)
## [1] "MATCH for p. Eyeball comparison only."

Next, we do the same analysis but focusing on Block 3.

an_mod_block3 <- d_moderate %>% 
  filter(block == 3) %>% 
  aov(block_exploration ~ yoked_condition, data = .)

mod_block3_tuk <- TukeyHSD(an_mod_block3) 

reportObject <- reproCheck(reportedValue = "<.01", obtainedValue = mod_block3_tuk$yoked_condition[4], 
                           valueType = 'p', eyeballCheck = TRUE)
## [1] "MATCH for p. Eyeball comparison only."

Same analysis, but with block 4.

an_mod_block4 <- d_moderate %>% 
  filter(block == 4) %>% 
  aov(block_exploration ~ yoked_condition, data = .)

mod_block4_tuk <- TukeyHSD(an_mod_block4)

reportObject <- reproCheck(reportedValue = ".09", obtainedValue = mod_block4_tuk$yoked_condition[4], 
                           valueType = 'p')
## [1] "DECISION_ERROR for p. The reported value (0.09) and the obtained value (0.02) differed by 77.78%. Note that the obtained value was rounded to 2 decimal places to match the reported value."

We got a lower p-value (p = 0.02) than the one reported in the paper (\(p = .09\)).

The last Tukey tests focused on condition differences when reward frequency was high. From the paper,

Finally, when reward frequency was extremely high (p = 1), yoked participants explored less than with-control participants in the second block (p < .01); however, this gap disappeared immediately after yoked participants regained control (ps > .9 for the third and fourth blocks). In summary, the classic learned-helplessness pattern was observed only when the reward frequency was moderate.

d_high <- expldt_tidy %>% filter(p_reward == 1)

post_hoc_ms_high <- d_high %>% 
  group_by(block_factor, yoked_condition) %>% 
  summarise(m = mean(block_exploration)) %>% 
  mutate_if(is.numeric, funs(round(. * 100, 0)))

post_hoc_ms_high %>% kable()
block_factor yoked_condition m
1 with_control 95
1 yoked 79
2 with_control 96
2 yoked 60
3 with_control 98
3 yoked 86
4 with_control 97
4 yoked 95

Next, we try to reproduce the p-values.

an_high_block2 <- d_high %>% 
  filter(block == 2) %>% 
  aov(block_exploration ~ yoked_condition, data = .)

high_block2_tuk <- TukeyHSD(an_high_block2)

reportObject <- reproCheck(reportedValue = "p < .01", 
                           obtainedValue = high_block2_tuk$yoked_condition[4], 
                           valueType = 'p', eyeballCheck = TRUE)
## [1] "MATCH for p. Eyeball comparison only."

Same condition comparison, but with block 3

an_high_block3 <- d_high %>% 
  filter(block == 3) %>% 
  aov(block_exploration ~ yoked_condition, data = .)

high_block3_tuk <- TukeyHSD(an_high_block3)

reportObject <- reproCheck(reportedValue = "p > .9", 
                           obtainedValue = high_block3_tuk$yoked_condition[4], 
                           valueType = 'p', eyeballCheck = FALSE)
## [1] "EYEBALL CHECK ERROR for p. Eyeball comparison only."

We are unable to reproduce the p-value for the Block 3 comparison. We get \(p < .05\). Note that this is a decision error in that they interpreted this test as a non-significant p-value.

an_high_block4 <- d_high %>% 
  filter(block == 4) %>% 
  aov(block_exploration ~ yoked_condition, data = .)

high_block4_tuk <- TukeyHSD(an_high_block4)

reportObject <- reproCheck(reportedValue = "p > .9", 
                           obtainedValue = high_block4_tuk$yoked_condition[4], 
                           valueType = 'p', eyeballCheck = FALSE)
## [1] "EYEBALL CHECK ERROR for p. Eyeball comparison only."

We are unable to reproduce the p-value for the Block 4 comparison. We get \(p =\) 0.19. Note that this is NOT a decision error because the p-value is still above the p < .05 threshold.

Step 5: Conclusion

We were able to reproduce the descriptive results presented in Figure 2 but encountered problems reproducing the inferential statistics. After receiving a more specific analysis specification from the author, we were mostly able to reproduce the key 3-way interaction from the repeated measures ANOVA. Specifically, we able to reproduce the degrees of freedom and the p-value. However, we still found a minor error numerical error in the F-statistic and a major numerical error with the \(\eta^2\) (although it was a 20% difference with small numbers).

We were only able to reproduce a subset of the follow-up Tukey tests. It would be useful to get the following clarifications from the authors:

  • calculation that they used for the \(\eta^2\) value in the repeated measures ANOVA
  • precisely what subsets of the data were used for each p-value reported in the follow-up Tukey tests

We contacted the authors for clarification of these final issues but have not had a response after > 7 months. Therefore we are coding two insufficient information errors.

Author_Assistance = TRUE # was author assistance provided? (if so, enter TRUE)

Insufficient_Information_Errors <- 2 # how many discrete insufficient information issues did you encounter?

# Assess the causal locus (discrete reproducibility issues) of any reproducibility errors. Note that there doesn't necessarily have to be a one-to-one correspondance between discrete reproducibility issues and reproducibility errors. For example, it could be that the original article neglects to mention that a Greenhouse-Geisser correct was applied to ANOVA outcomes. This might result in multiple reproducibility errors, but there is a single causal locus (discrete reproducibility issue).

locus_typo <- 0 # how many discrete issues did you encounter that related to typographical errors?
locus_specification <- 1 # how many discrete issues did you encounter that related to incomplete, incorrect, or unclear specification of the original analyses?
locus_analysis <- 0 # how many discrete issues did you encounter that related to errors in the authors' original analyses?
locus_data <- 0 # how many discrete issues did you encounter that related to errors in the data files shared by the authors?
locus_unidentified <- 2 # how many discrete issues were there for which you could not identify the cause

# How many of the above issues were resolved through author assistance?
locus_typo_resolved <- 0 # how many discrete issues did you encounter that related to typographical errors?
locus_specification_resolved <- 0 # how many discrete issues did you encounter that related to incomplete, incorrect, or unclear specification of the original analyses?
locus_analysis_resolved <- 0 # how many discrete issues did you encounter that related to errors in the authors' original analyses?
locus_data_resolved <- 0 # how many discrete issues did you encounter that related to errors in the data files shared by the authors?
locus_unidentified_resolved <- 0 # how many discrete issues were there for which you could not identify the cause

Affects_Conclusion <- TRUE # Do any reproducibility issues encounter appear to affect the conclusions made in the original article? This is a subjective judgement, but you should taking into account multiple factors, such as the presence/absence of decision errors, the number of target outcomes that could not be reproduced, the type of outcomes that could or could not be reproduced, the difference in magnitude of effect sizes, and the predictions of the specific hypothesis under scrutiny.
reportObject <- reportObject %>%
  filter(dummyRow == FALSE) %>% # remove the dummy row
  select(-dummyRow) %>% # remove dummy row designation
  mutate(articleID = articleID) %>% # add the articleID 
  select(articleID, everything()) # make articleID first column

# decide on final outcome
if(any(!(reportObject$comparisonOutcome %in% c("MATCH", "MINOR_ERROR"))) | Insufficient_Information_Errors > 0){
  finalOutcome <- "Failure without author assistance"
  if(Author_Assistance == T){
    finalOutcome <- "Failure despite author assistance"
  }
}else{
  finalOutcome <- "Success without author assistance"
  if(Author_Assistance == T){
    finalOutcome <- "Success with author assistance"
  }
}

# collate report extra details
reportExtras <- data.frame(articleID, pilotNames, copilotNames, pilotTTC, copilotTTC, pilotStartDate, copilotStartDate, completionDate, Author_Assistance, finalOutcome, Insufficient_Information_Errors, locus_typo, locus_specification, locus_analysis, locus_data, locus_unidentified, locus_typo_resolved, locus_specification_resolved, locus_analysis_resolved, locus_data_resolved, locus_unidentified_resolved)

# save report objects
if(reportType == "pilot"){
  write_csv(reportObject, "pilotReportDetailed.csv")
  write_csv(reportExtras, "pilotReportExtras.csv")
}

if(reportType == "final"){
  write_csv(reportObject, "finalReportDetailed.csv")
  write_csv(reportExtras, "finalReportExtras.csv")
}

Session information

devtools::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value                       
##  version  R version 4.0.0 (2020-04-24)
##  os       macOS Catalina 10.15.4      
##  system   x86_64, darwin17.0          
##  ui       X11                         
##  language (EN)                        
##  collate  en_US.UTF-8                 
##  ctype    en_US.UTF-8                 
##  tz       Europe/London               
##  date     2020-05-13                  
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package      * version  date       lib
##  abind          1.4-5    2016-07-21 [1]
##  assertthat     0.2.1    2019-03-21 [1]
##  backports      1.1.6    2020-04-05 [1]
##  boot           1.3-24   2019-12-20 [1]
##  broom          0.5.6    2020-04-20 [1]
##  callr          3.4.3    2020-03-28 [1]
##  car          * 3.0-7    2020-03-11 [1]
##  carData      * 3.0-3    2019-11-16 [1]
##  cellranger     1.1.0    2016-07-27 [1]
##  cli            2.0.2    2020-02-28 [1]
##  colorspace     1.4-1    2019-03-18 [1]
##  crayon         1.3.4    2017-09-16 [1]
##  curl           4.3      2019-12-02 [1]
##  data.table     1.12.8   2019-12-09 [1]
##  DBI            1.1.0    2019-12-15 [1]
##  dbplyr         1.4.3    2020-04-19 [1]
##  desc           1.2.0    2018-05-01 [1]
##  devtools       2.3.0    2020-04-10 [1]
##  digest         0.6.25   2020-02-23 [1]
##  dplyr        * 0.8.5    2020-03-07 [1]
##  ellipsis       0.3.0    2019-09-20 [1]
##  evaluate       0.14     2019-05-28 [1]
##  ez             4.4-0    2016-11-02 [1]
##  fansi          0.4.1    2020-01-08 [1]
##  farver         2.0.3    2020-01-16 [1]
##  forcats      * 0.5.0    2020-03-01 [1]
##  foreign        0.8-78   2020-04-13 [1]
##  fs             1.4.1    2020-04-04 [1]
##  generics       0.0.2    2018-11-29 [1]
##  ggplot2      * 3.3.0    2020-03-05 [1]
##  glue           1.4.0    2020-04-03 [1]
##  gtable         0.3.0    2019-03-25 [1]
##  haven        * 2.2.0    2019-11-08 [1]
##  highr          0.8      2019-03-20 [1]
##  hms            0.5.3    2020-01-08 [1]
##  htmltools      0.4.0    2019-10-04 [1]
##  httr           1.4.1    2019-08-05 [1]
##  jsonlite       1.6.1    2020-02-02 [1]
##  knitr        * 1.28     2020-02-06 [1]
##  labeling       0.3      2014-08-23 [1]
##  lattice        0.20-41  2020-04-02 [1]
##  lifecycle      0.2.0    2020-03-06 [1]
##  lme4           1.1-23   2020-04-07 [1]
##  lubridate      1.7.8    2020-04-06 [1]
##  magrittr       1.5      2014-11-22 [1]
##  MASS           7.3-51.5 2019-12-20 [1]
##  Matrix         1.2-18   2019-11-27 [1]
##  memoise        1.1.0    2017-04-21 [1]
##  mgcv           1.8-31   2019-11-09 [1]
##  minqa          1.2.4    2014-10-09 [1]
##  modelr         0.1.7    2020-04-30 [1]
##  munsell        0.5.0    2018-06-12 [1]
##  nlme           3.1-147  2020-04-13 [1]
##  nloptr         1.2.2.1  2020-03-11 [1]
##  openxlsx       4.1.4    2019-12-06 [1]
##  pillar         1.4.4    2020-05-05 [1]
##  pkgbuild       1.0.7    2020-04-25 [1]
##  pkgconfig      2.0.3    2019-09-22 [1]
##  pkgload        1.0.2    2018-10-29 [1]
##  plyr           1.8.6    2020-03-03 [1]
##  prettyunits    1.1.1    2020-01-24 [1]
##  processx       3.4.2    2020-02-09 [1]
##  ps             1.3.2    2020-02-13 [1]
##  purrr        * 0.3.4    2020-04-17 [1]
##  R6             2.4.1    2019-11-12 [1]
##  Rcpp           1.0.4.6  2020-04-09 [1]
##  readr        * 1.3.1    2018-12-21 [1]
##  readxl       * 1.3.1    2019-03-13 [1]
##  remotes        2.1.1    2020-02-15 [1]
##  reprex         0.3.0    2019-05-16 [1]
##  ReproReports * 0.1      2020-05-06 [1]
##  reshape2       1.4.4    2020-04-09 [1]
##  rio            0.5.16   2018-11-26 [1]
##  rlang          0.4.6    2020-05-02 [1]
##  rmarkdown      2.1      2020-01-20 [1]
##  rprojroot      1.3-2    2018-01-03 [1]
##  rstudioapi     0.11     2020-02-07 [1]
##  rvest          0.3.5    2019-11-08 [1]
##  scales         1.1.0    2019-11-18 [1]
##  sessioninfo    1.1.1    2018-11-05 [1]
##  statmod        1.4.34   2020-02-17 [1]
##  stringi        1.4.6    2020-02-17 [1]
##  stringr      * 1.4.0    2019-02-10 [1]
##  testthat       2.3.2    2020-03-02 [1]
##  tibble       * 3.0.1    2020-04-20 [1]
##  tidyr        * 1.0.2    2020-01-24 [1]
##  tidyselect     1.0.0    2020-01-27 [1]
##  tidyverse    * 1.3.0    2019-11-21 [1]
##  usethis        1.6.1    2020-04-29 [1]
##  vctrs          0.2.4    2020-03-10 [1]
##  withr          2.2.0    2020-04-20 [1]
##  xfun           0.13     2020-04-13 [1]
##  xml2           1.3.2    2020-04-23 [1]
##  yaml           2.2.1    2020-02-01 [1]
##  zip            2.0.4    2019-09-01 [1]
##  source                                     
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  Github (METRICS-CARPS/CARPSreports@3277f85)
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
##  CRAN (R 4.0.0)                             
## 
## [1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library