articleID <- "EXT_13-7-2015" # insert the article ID code here e.g., "10-3-2015_PS"
reportType <- 'pilot' # specify whether this is the 'pilot' report or 'final' report
pilotNames <- 'tyler bonnen' # insert the pilot's name here e.g., "Tom Hardwicke".
copilotNames <- 'Pardis Miri' # insert the co-pilot's name here e.g., "Michael Frank".
pilotTTC <- '??' # insert the pilot's estimated time to complete (in minutes, fine to approximate) e.g., 120
copilotTTC <- '??' # insert the co-pilot's estimated time to complete (in minutes, fine to approximate) e.g., 120
pilotStartDate <- '11/04/2018' # insert the pilot's start date in US format e.g., as.Date("01/25/18", format = "%m/%d/%y")
copilotStartDate <- '11/04/2018' # insert the co-pilot's start date in US format e.g., as.Date("01/25/18", format = "%m/%d/%y")
completionDate <- '11/04/2018' # copilot insert the date of final report completion (after any necessary rounds of author assistance)
Article ID: EXT_13-7-2015
For this article you should focus on the findings reported in the results section for Experiment 1.
Specifically, you should attempt to reproduce all descriptive and inferential analyses reported in the text below and associated tables/figures:
Heading errors were defined by the difference between the observer’s heading estimate and the actual heading position and are shown in Figure 2. Differences in heading errors between walker type and limb-motion conditions were tested using a two-way repeated measures analysis of variance (ANOVA). When walkers performed limb movements in place, heading errors were larger than when figures maintained a single posture, F(1, 13) = 21.27, p < .001, ηp2 = .62. The finding that limb motion negatively affects heading estimation suggests that biological motion significantly perturbs the optic flow field. This is consistent with the proposal that limb motion introduces noise into the optic flow field, disturbing the global organization of the flow pattern and negatively impacting heading detection. In addition, the ANOVA also showed a main effect of walker type, F(2, 26) = 3.80, p = .036, ηp2 = .23, with false-discovery-rate-adjusted post hoc tests revealing that heading estimates were worse for normal walkers than for inverted walkers (p = .048). This result implies that the presence of biological motion interferes with the computation of the optic flow pattern. This interference could occur for a number of reasons. For example, the assignment of attention to biological motion reduces performance on a concurrent task (Thompson & Parasuraman, 2012; Thornton, Rensink, & Shiffrar, 2002), thus some division of attention could have produced this result. Alternatively, in the current experiment, a walker’s limb motion suggests a walker that translated, whereas in actual fact the point-light walker’s physical position in the scene was fixed in place. This conflicting information might also explain why walkers produced larger heading errors than inverted walkers, whose articulation pattern is not suggestive of biological ambulation (Troje & Westhoff, 2006).
The interaction between walker type and limb motion approached but did not meet the threshold for statistical significance, F(2, 26) = 2.95, p = .070.
# load packages
library(tidyverse) # for data munging
library(knitr) # for kable table formatting
library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files
library(readxl) # import excel files
library(CARPSreports) # custom report functions
library(ggplot2) # for plotting (also attached by tidyverse)
library(foreign) # for read.spss
# Prepare report object. This will be updated automatically by the reproCheck function each time values are compared
reportObject <- data.frame(dummyRow = TRUE, reportedValue = NA, obtainedValue = NA, valueType = NA, percentageError = NA, comparisonOutcome = NA, eyeballCheck = NA)
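# path to the data file, relative to the report's working directory
tmp_data = read.spss('./data/Experiment1_GroupData.sav', to.data.frame = TRUE)
# add a subject identifier, one per row of the wide-format file
tmp_data$subject = as.factor(1:14)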
data = tmp_data %>%
# remove the mean values
select(-contains('Mean')) %>%
# transform wide to long data, keeping subject as the identifier column
gather('original_label', 'errors', -'subject') %>%
# recode movement to be a binary variable, instead of having 'static' or 'articulating' embedded in a string
# (locate the underscore separately in each label, rather than reusing the position found in the first label)
mutate(movement = substr(original_label, str_locate(original_label, '_')[, 'start'] + 1, 100) == 'articulating') %>%
# recode the stimuli categories to clearly reflect conditions
mutate(walker_type = case_when(substr(original_label, 1, 1)=='n' ~ 'normal',
substr(original_label, 1, 1)=='i' ~ 'inverted',
substr(original_label, 1, 1)=='s' ~ 'scrambled')) %>%
# remove temporary data types
select(-original_label) %>%
# convert to factors
mutate(movement=as.factor(movement)) %>%
mutate(walker_type=as.factor(walker_type))
head(data)
## subject errors movement walker_type
## 1 1 2.4543 TRUE normal
## 2 2 3.2338 TRUE normal
## 3 3 2.6781 TRUE normal
## 4 4 2.3936 TRUE normal
## 5 5 4.5392 TRUE normal
## 6 6 3.5245 TRUE normal
summary_data = group_by(data, walker_type, movement) %>%
summarise(n_errors = n(),
mean_errors = mean(errors, na.rm = TRUE),
sem_errors = sd(errors, na.rm = TRUE)/sqrt(n_errors))
# make sure it's in the right order
summary_data$walker_type <- factor(summary_data$walker_type, levels = c("normal", "inverted", "scrambled"), ordered = TRUE)
summary_data
## # A tibble: 6 x 5
## # Groups: walker_type [?]
## walker_type movement n_errors mean_errors sem_errors
## <ord> <fct> <int> <dbl> <dbl>
## 1 inverted FALSE 14 1.93 0.484
## 2 inverted TRUE 14 2.84 0.355
## 3 normal FALSE 14 2.12 0.561
## 4 normal TRUE 14 3.43 0.419
## 5 scrambled FALSE 14 2.45 0.493
## 6 scrambled TRUE 14 2.98 0.513
These seem to be the main findings for experiment 1 in the paper:
- When walkers performed limb movements in place, heading errors were larger than when figures maintained a single posture, F(1, 13) = 21.27, p < .001, ηp2 = .62
- In addition, the ANOVA also showed a main effect of walker type, F(2, 26) = 3.80, p = .036, ηp2 = .23
- The interaction between walker type and limb motion approached but did not meet the threshold for statistical significance, F(2, 26) = 2.95, p = .070.
# build model
main_model = aov(errors ~ movement * walker_type, data=data)
# display outputs
summary(main_model)
## Df Sum Sq Mean Sq F value Pr(>F)
## movement 1 17.52 17.516 5.536 0.0212 *
## walker_type 2 2.50 1.248 0.394 0.6755
## movement:walker_type 2 2.13 1.063 0.336 0.7157
## Residuals 78 246.79 3.164
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The results from this model fail to replicate the authors' original claims.
- Movement still appears to predict the errors that subjects made, F(1, 78) = 5.54, p = .021, but not to the degree of the original F(1, 13) = 21.27, p < .001
- There is no main effect of walker type, F(2, 78) = 0.39, p = .676
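The target passage also reports false-discovery-rate-adjusted post hoc comparisons between walker types (p = .048 for normal versus inverted), which the model above does not address. The authors do not state which test they used, so the following is only a minimal sketch, assuming paired t-tests on per-subject means followed by `p.adjust(method = "fdr")`. The data frame `subject_means` is synthetic and hypothetical; it stands in for per-subject means averaged over movement and says nothing about the real data.

```r
# Sketch: paired post hoc comparisons between walker types with FDR adjustment.
# ASSUMPTION: `subject_means` is synthetic; it only mimics the layout of
# 14 subjects with one mean heading error per walker type.
set.seed(3)
subject_means <- data.frame(
  subject   = factor(1:14),
  normal    = rnorm(14, mean = 2.8, sd = 1),
  inverted  = rnorm(14, mean = 2.4, sd = 1),
  scrambled = rnorm(14, mean = 2.7, sd = 1)
)

# all three pairwise combinations of walker type
pairs <- combn(c("normal", "inverted", "scrambled"), 2, simplify = FALSE)

# uncorrected p values from paired t-tests
raw_p <- sapply(pairs, function(p) {
  t.test(subject_means[[p[1]]], subject_means[[p[2]]], paired = TRUE)$p.value
})
names(raw_p) <- sapply(pairs, paste, collapse = " vs ")

# Benjamini-Hochberg (false discovery rate) adjustment
adjusted_p <- p.adjust(raw_p, method = "fdr")
print(round(cbind(raw_p, adjusted_p), 3))
```

With the real data, the per-subject means would first have to be computed from the long-format `data` before running the comparisons.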
We can validate the ηp2 values the authors provide, except for the interaction, for which no ηp2 is reported:
# given the values above:
compute_eta_squared <- function(a, b, c, test=0) {
# From F(a,b) = c in an anova
# we can compute the partial eta squared by ηp2 = (a*c)/(a*c+b)
partial_eta_squared = (a*c)/(a*c+b)
if (test | is.na(test)){
match = round(partial_eta_squared,2) == test
paste('Calculated ηp2 =', round(partial_eta_squared,2), 'EQUAL:',match )
} else {
paste0('Using correct degrees of freedom, ηp2 = ', round(partial_eta_squared,4) )
}
}
compute_eta_squared(1, 13, 21.27, .62)
## [1] "Calculated ηp2 = 0.62 EQUAL: TRUE"
compute_eta_squared(2, 26, 03.80, .23)
## [1] "Calculated ηp2 = 0.23 EQUAL: TRUE"
compute_eta_squared(2, 26, 02.95, NaN)
## [1] "Calculated ηp2 = 0.18 EQUAL: NA"
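The formula inside `compute_eta_squared` follows from the definitions of F and partial eta squared. Writing `a` as df1, `b` as df2 and `c` as F (the SS symbols below are notation introduced here, not taken from the paper):

```latex
F = \frac{SS_{\text{effect}} / df_1}{SS_{\text{error}} / df_2}
\quad\Longrightarrow\quad
\frac{SS_{\text{effect}}}{SS_{\text{error}}} = \frac{df_1 \, F}{df_2}

\eta_p^2 = \frac{SS_{\text{effect}}}{SS_{\text{effect}} + SS_{\text{error}}}
         = \frac{df_1 \, F}{df_1 \, F + df_2}

% worked check against the reported values
\frac{1 \times 21.27}{1 \times 21.27 + 13} \approx 0.62,
\qquad
\frac{2 \times 3.80}{2 \times 3.80 + 26} \approx 0.23
```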
Given the sample size (n = 14) and 3x2 design, the reported degrees of freedom in the original paper do not match those of the model above (78 residual degrees of freedom) and seem to be incorrect. That is, reporting a p value using F(1, 13) would suggest that the authors ran the analysis within just a single movement x walker_type cell. This is, effectively, a t-test and not the ANOVA the authors report having used to model their data. In any case, we can recompute ηp2 given the (more likely correct) degrees of freedom:
# use the F value for movement from the model summary above (5.536), not its p value
compute_eta_squared(1, 78, 5.536)
## [1] "Using correct degrees of freedom, ηp2 = 0.0663"
There are several concerns I would have before concluding that I haven’t replicated their results. First, I can’t tell whether what they’re reporting here is drawn from a single model or not. That is, in part, because the reported degrees of freedom seem to be incorrect. Second, it appears that the authors have incorporated an additional analysis step, which first became apparent when I visualized some of the experiment’s summary statistics:
# plot and see if we can recover Figure 2
ggplot(data=summary_data, aes(x=walker_type, y=mean_errors, color=movement) ) +
geom_point(size=5) +
ggtitle("Mostly recovers Figure 2, but I appear to have greater variance.") +
# note: the bars span mean ± sem/2, i.e. one SEM in total height
geom_errorbar(stat='identity', aes(ymin=mean_errors-(sem_errors/2), ymax=mean_errors+(sem_errors/2)), width=.1) +
theme_classic() + expand_limits(y=1.5)
Visual inspection of the original authors’ Figure 2, alongside my own, suggests that the s.e.m. I generated differs from theirs. Below the figure the authors note that “Error bars represent standard errors adjusted for the within-subjects design”, but this within-subjects adjustment is not mentioned further in the methods or supplemental material. This hidden step may explain the difference between my results and theirs.
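The authors do not say which within-subjects adjustment they applied. One common choice is to remove each subject's overall mean before computing the standard error (Cousineau's normalization, with Morey's correction factor). The following is a minimal sketch under that assumption; `toy` is synthetic data with the same layout as `data`, and the names `normed`, `morey` and `comparison` are hypothetical.

```r
# Sketch of within-subject standard errors (Cousineau normalization with
# Morey's correction). ASSUMPTION: this may not be the adjustment the authors
# used, and `toy` is synthetic data standing in for the long-format `data`.
set.seed(1)
toy <- expand.grid(subject = factor(1:14),
                   movement = factor(c(FALSE, TRUE)),
                   walker_type = factor(c("normal", "inverted", "scrambled")))
# each subject gets their own baseline, which inflates between-subject variance
subject_offset <- rnorm(14, mean = 0, sd = 2)
toy$errors <- 2.5 + subject_offset[as.integer(toy$subject)] +
  0.8 * (toy$movement == "TRUE") + rnorm(nrow(toy), sd = 0.5)

n_cond <- nlevels(toy$movement) * nlevels(toy$walker_type)  # 6 within-subject cells
morey  <- sqrt(n_cond / (n_cond - 1))                       # Morey's correction factor

# remove each subject's mean and add back the grand mean
toy$normed <- toy$errors - ave(toy$errors, toy$subject) + mean(toy$errors)

sem <- function(x) sd(x) / sqrt(length(x))
raw_sem <- aggregate(errors ~ walker_type + movement, data = toy, FUN = sem)
adj_sem <- aggregate(normed ~ walker_type + movement, data = toy, FUN = sem)
adj_sem$normed <- adj_sem$normed * morey

# side-by-side comparison of unadjusted and within-subject standard errors
comparison <- merge(raw_sem, adj_sem, by = c("walker_type", "movement"))
names(comparison)[3:4] <- c("sem_between", "sem_within")
print(comparison)
```

When subjects differ mainly in their overall level of error, the adjusted standard errors come out smaller than the unadjusted ones, which would be consistent with narrower error bars in the published figure.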
And third, due to my unfamiliarity with ANOVAs, it’s entirely possible I made a mistake in my own analysis. For example, there may be a standard practice, such as standardizing variance within subjects in repeated measures ANOVAs, which I failed to implement. For these reasons, I would not feel confident declaring that I have failed to replicate the authors’ original findings until communicating with them.
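One standard practice worth checking is to model subject as an error stratum, which is how base R's `aov` expresses a repeated measures design. In a fully within-subject 2 x 3 design with 14 subjects, the error degrees of freedom are (2 − 1)(14 − 1) = 13 for movement and (3 − 1)(14 − 1) = 26 for walker type and for the interaction, which matches the reported F(1, 13) and F(2, 26). The sketch below demonstrates this on synthetic data only; `toy` and `rm_model` are hypothetical names, and whether the authors fit exactly this model (or applied a sphericity correction) is not stated in the source.

```r
# Sketch of a two-way repeated-measures ANOVA with subject as the error stratum.
# ASSUMPTION: `toy` is synthetic data with the same layout as the long-format
# `data` (14 subjects x 2 movement levels x 3 walker types), not the real data.
set.seed(2)
toy <- expand.grid(subject = factor(1:14),
                   movement = factor(c(FALSE, TRUE)),
                   walker_type = factor(c("normal", "inverted", "scrambled")))
toy$errors <- 2.5 + rnorm(14, sd = 2)[as.integer(toy$subject)] +
  0.8 * (toy$movement == "TRUE") + rnorm(nrow(toy), sd = 0.5)

# Error(subject / (movement * walker_type)) tests each effect against its
# own subject-by-effect interaction instead of a pooled residual
rm_model <- aov(errors ~ movement * walker_type +
                  Error(subject / (movement * walker_type)),
                data = toy)
rm_summary <- summary(rm_model)
print(rm_summary)

# residual degrees of freedom in each within-subject stratum
within_strata <- grepl("subject:", names(rm_summary), fixed = TRUE)
resid_df <- sapply(rm_summary[within_strata], function(stratum) {
  tab <- stratum[[1]]
  tab[nrow(tab), "Df"]
})
print(resid_df)
```

Applying the same `Error()` term to the real `data` would show whether the reported F values can be recovered.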
Author_Assistance = FALSE # was author assistance provided? (if so, enter TRUE)
Insufficient_Information_Errors <- 1 # how many discrete insufficient information issues did you encounter?
# Assess the causal locus (discrete reproducibility issues) of any reproducibility errors. Note that there doesn't necessarily have to be a one-to-one correspondence between discrete reproducibility issues and reproducibility errors. For example, it could be that the original article neglects to mention that a Greenhouse-Geisser correction was applied to ANOVA outcomes. This might result in multiple reproducibility errors, but there is a single causal locus (discrete reproducibility issue).
locus_typo <- 1 # how many discrete issues did you encounter that related to typographical errors?
locus_specification <- 1 # how many discrete issues did you encounter that related to incomplete, incorrect, or unclear specification of the original analyses?
locus_analysis <- NA # how many discrete issues did you encounter that related to errors in the authors' original analyses?
locus_data <- 0 # how many discrete issues did you encounter that related to errors in the data files shared by the authors?
locus_unidentified <- 1 # how many discrete issues were there for which you could not identify the cause
# How many of the above issues were resolved through author assistance?
locus_typo_resolved <- NA # how many discrete issues did you encounter that related to typographical errors?
locus_specification_resolved <- NA # how many discrete issues did you encounter that related to incomplete, incorrect, or unclear specification of the original analyses?
locus_analysis_resolved <- NA # how many discrete issues did you encounter that related to errors in the authors' original analyses?
locus_data_resolved <- NA # how many discrete issues did you encounter that related to errors in the data files shared by the authors?
locus_unidentified_resolved <- NA # how many discrete issues were there for which you could not identify the cause
Affects_Conclusion <- TRUE # Do any reproducibility issues encountered appear to affect the conclusions made in the original article? TRUE, FALSE, or NA. This is a subjective judgement, but you should take into account multiple factors, such as the presence/absence of decision errors, the number of target outcomes that could not be reproduced, the type of outcomes that could or could not be reproduced, the difference in magnitude of effect sizes, and the predictions of the specific hypothesis under scrutiny.
# decide on final outcome
if(any(reportObject$comparisonOutcome %in% c("MAJOR_ERROR", "DECISION_ERROR")) | Insufficient_Information_Errors > 0){
finalOutcome <- "Failure without author assistance"
if(Author_Assistance == T){
finalOutcome <- "Failure despite author assistance"
}
}else{
finalOutcome <- "Success without author assistance"
if(Author_Assistance == T){
finalOutcome <- "Success with author assistance"
}
}
# collate report extra details
reportExtras <- data.frame(articleID, pilotNames, copilotNames, pilotTTC, copilotTTC, pilotStartDate, copilotStartDate, completionDate, Author_Assistance, finalOutcome, Insufficient_Information_Errors, locus_typo, locus_specification, locus_analysis, locus_data, locus_unidentified, locus_typo_resolved, locus_specification_resolved, locus_analysis_resolved, locus_data_resolved, locus_unidentified_resolved)
# save report objects
if(reportType == "pilot"){
write_csv(reportObject, "pilotReportDetailed.csv")
write_csv(reportExtras, "pilotReportExtras.csv")
}
if(reportType == "final"){
write_csv(reportObject, "finalReportDetailed.csv")
write_csv(reportExtras, "finalReportExtras.csv")
}
[This function will output information about the package versions used in this report:]
devtools::session_info()
## ─ Session info ──────────────────────────────────────────────────────────
## setting value
## version R version 3.5.1 (2018-07-02)
## os macOS High Sierra 10.13.6
## system x86_64, darwin15.6.0
## ui X11
## language (EN)
## collate en_US.UTF-8
## ctype en_US.UTF-8
## tz America/Los_Angeles
## date 2018-11-05
##
## ─ Packages ──────────────────────────────────────────────────────────────
## package * version date lib
## assertthat 0.2.0 2017-04-11 [1]
## backports 1.1.2 2017-12-13 [1]
## base64enc 0.1-3 2015-07-28 [1]
## bindr 0.1.1 2018-03-13 [1]
## bindrcpp * 0.2.2 2018-03-29 [1]
## broom 0.5.0 2018-07-17 [1]
## callr 3.0.0 2018-08-24 [1]
## CARPSreports * 0.1 2018-11-05 [1]
## cellranger 1.1.0 2016-07-27 [1]
## cli 1.0.1 2018-09-25 [1]
## colorspace 1.3-2 2016-12-14 [1]
## crayon 1.3.4 2017-09-16 [1]
## desc 1.2.0 2018-05-01 [1]
## devtools 2.0.1 2018-10-26 [1]
## digest 0.6.18 2018-10-10 [1]
## dplyr * 0.7.7 2018-10-16 [1]
## evaluate 0.12 2018-10-09 [1]
## fansi 0.4.0 2018-10-05 [1]
## forcats * 0.3.0 2018-02-19 [1]
## foreign * 0.8-70 2017-11-28 [1]
## fs 1.2.6 2018-08-23 [1]
## ggplot2 * 3.1.0 2018-10-25 [1]
## glue 1.3.0 2018-07-17 [1]
## gtable 0.2.0 2016-02-26 [1]
## haven * 1.1.2 2018-06-27 [1]
## hms 0.4.2 2018-03-10 [1]
## htmltools 0.3.6 2017-04-28 [1]
## httr 1.3.1 2017-08-20 [1]
## jsonlite 1.5 2017-06-01 [1]
## knitr * 1.20 2018-02-20 [1]
## labeling 0.3 2014-08-23 [1]
## lattice 0.20-35 2017-03-25 [1]
## lazyeval 0.2.1 2017-10-29 [1]
## lubridate 1.7.4 2018-04-11 [1]
## magrittr 1.5 2014-11-22 [1]
## memoise 1.1.0 2017-04-21 [1]
## modelr 0.1.2 2018-05-11 [1]
## munsell 0.5.0 2018-06-12 [1]
## nlme 3.1-137 2018-04-07 [1]
## pillar 1.3.0 2018-07-14 [1]
## pkgbuild 1.0.2 2018-10-16 [1]
## pkgconfig 2.0.2 2018-08-16 [1]
## pkgload 1.0.2 2018-10-29 [1]
## plyr 1.8.4 2016-06-08 [1]
## prettyunits 1.0.2 2015-07-13 [1]
## processx 3.2.0 2018-08-16 [1]
## ps 1.2.0 2018-10-16 [1]
## purrr * 0.2.5 2018-05-29 [1]
## R6 2.3.0 2018-10-04 [1]
## Rcpp 0.12.19 2018-10-01 [1]
## readr * 1.1.1 2017-05-16 [1]
## readxl * 1.1.0 2018-04-20 [1]
## remotes 2.0.2 2018-10-30 [1]
## rlang 0.3.0.1 2018-10-25 [1]
## rmarkdown 1.10 2018-06-11 [1]
## rprojroot 1.3-2 2018-01-03 [1]
## rstudioapi 0.8 2018-10-02 [1]
## rvest 0.3.2 2016-06-17 [1]
## scales 1.0.0 2018-08-09 [1]
## sessioninfo 1.1.0 2018-09-25 [1]
## stringi 1.2.4 2018-07-20 [1]
## stringr * 1.3.1 2018-05-10 [1]
## tibble * 1.4.2 2018-01-22 [1]
## tidyr * 0.8.2 2018-10-28 [1]
## tidyselect 0.2.5 2018-10-11 [1]
## tidyverse * 1.2.1 2017-11-14 [1]
## usethis 1.4.0 2018-08-14 [1]
## utf8 1.1.4 2018-05-24 [1]
## withr 2.1.2 2018-03-15 [1]
## xml2 1.2.0 2018-01-24 [1]
## yaml 2.2.0 2018-07-25 [1]
## source
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## Github (METRICS-CARPS/CARPSreports@89db4a9)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.1)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.1)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.1)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.1)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
## CRAN (R 3.5.0)
##
## [1] /Library/Frameworks/R.framework/Versions/3.5/Resources/library