Intro

This is problem set #2, in which we hope you will practice using the visualization package ggplot2 and hone your knowledge of the packages tidyr and dplyr.

Sklar et al. (2012) claim evidence for unconscious arithmetic processing. We’re going to reanalyze their Experiment 6, which is the primary piece of evidence for that claim. The data were generously contributed by Asael Sklar.

First let’s set up a few preliminaries.

library(ggplot2)
library(tidyr)
## Warning: package 'tidyr' was built under R version 3.3.2
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(lme4)
## Loading required package: Matrix
## 
## Attaching package: 'Matrix'
## The following object is masked from 'package:tidyr':
## 
##     expand
sem <- function(x) {sd(x, na.rm=TRUE) / sqrt(sum(!is.na(x)))}  # n counts non-missing values only
ci95 <- function(x) {sem(x) * 1.96}
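
When a standard error is computed on data with missing values, the denominator should count only the observed values. A small illustration, using a hypothetical helper named `sem_narm` (the name and the toy vector are assumptions, not part of the problem set), compared with a version that divides by `length(x)`:

```r
# Hypothetical NA-aware standard error; the name sem_narm is an assumption.
sem_narm <- function(x) {
  n <- sum(!is.na(x))              # count only observed values
  sd(x, na.rm = TRUE) / sqrt(n)
}

x <- c(1, 2, 3, 4, NA, NA)         # made-up data with two missing values
naive <- sd(x, na.rm = TRUE) / sqrt(length(x))  # divides by sqrt(6)
aware <- sem_narm(x)                            # divides by sqrt(4)
```

Dividing by `sqrt(length(x))` makes the standard error look smaller than it should whenever `NA`s are present.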

Data Prep

First read in two data files and subject info. A and B refer to different trial order counterbalances.

subinfo <- read.csv("http://langcog.stanford.edu/sklar_expt6_subinfo_corrected.csv")
d.a <- read.csv("http://langcog.stanford.edu/sklar_expt6a_corrected.csv")
d.b <- read.csv("http://langcog.stanford.edu/sklar_expt6b_corrected.csv")

Gather these datasets into long form and get rid of the Xs in the headers.

d.along <- d.a %>% gather(variable, value, starts_with("X")) 
d.along$variable <- gsub("X", "", d.along$variable)

d.blong <- d.b %>% gather(variable, value, starts_with("X")) 
d.blong$variable <- gsub("X", "", d.blong$variable)
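
In tidyr 1.0.0 and later, `pivot_longer` can do the reshaping and the prefix removal in one call. A minimal sketch on made-up data (the `wide` data frame, its `prime` column, and its values are illustrative assumptions, not the real file layout):

```r
library(tidyr)

# Toy wide data: one column per subject, prefixed with "X" as read.csv does
# for column names that start with a digit. All values are invented.
wide <- data.frame(prime = c("1+2", "3-1"), X1 = c(510, 620), X2 = c(480, NA))

long <- pivot_longer(
  wide,
  cols = starts_with("X"),
  names_to = "variable",
  values_to = "value",
  names_prefix = "X"        # strips the leading "X" during the reshape
)
long$subid <- as.numeric(long$variable)   # subject id as a number
```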

Bind these together. Check out bind_rows.

d.bindrow <- bind_rows(d.along, d.blong)
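
If you want to keep track of which counterbalance each row came from, `bind_rows` accepts an `.id` argument. A small sketch with made-up data frames; the column name `order` and the labels `A` and `B` are assumptions chosen for illustration:

```r
library(dplyr)

# Toy stand-ins for the two long-form data frames.
a <- data.frame(variable = c("1", "2"), value = c(510, 620))
b <- data.frame(variable = c("3", "4"), value = c(480, 555))

# Naming the inputs and passing .id adds a column recording each row's source.
both <- bind_rows(list(A = a, B = b), .id = "order")
```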

Merge these with subject info. You will need to look into merge and its relatives, left_ and right_join. Call this dataframe d, by convention.

d.bindrow$subid <- as.numeric(d.bindrow$variable)
d <- left_join(subinfo, d.bindrow)
## Joining, by = "subid"
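
To see what `left_join` does with unmatched rows, here is a minimal sketch on toy data (both data frames are invented for illustration). Naming the key with `by` also avoids the "Joining, by" message:

```r
library(dplyr)

subinfo_toy <- data.frame(subid = c(1, 2, 3), objective.test = c(0.5, 0.7, 0.4))
trials_toy  <- data.frame(subid = c(1, 1, 2), value = c(510, 530, 620))

# Every row of the left table is kept; matching trials are attached to it.
joined <- left_join(subinfo_toy, trials_toy, by = "subid")

# Subject 3 has no trials, so it appears once with value = NA.
```

A `right_join` with the same arguments would instead keep every row of `trials_toy`.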

Clean up the factor structure.

d$presentation.time <- factor(d$presentation.time)
d$operand <- factor(d$operand, labels = c("addition","subtraction"))  # assumes two codes that sort as addition, then subtraction

Data Analysis Preliminaries

Examine the basic properties of the dataset. First, plot a histogram of the RTs.

ggplot(d,aes(x=value)) +
  geom_histogram(binwidth = 30)
## Warning: Removed 237 rows containing non-finite values (stat_bin).

Challenge question: what is the sample rate of the input device they are using to gather RTs?
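
One way to approach this is to look at the spacing between distinct RT values: if the device records time in fixed steps, the most common gap between neighbouring values estimates that step. A sketch on simulated data; the 25 ms step is an arbitrary choice for the simulation, not a claim about the real data:

```r
# Simulated RTs (ms) from a hypothetical device that reports in fixed steps.
set.seed(1)
tick <- 25                                   # arbitrary step for the toy data
rt <- tick * sample(10:40, size = 500, replace = TRUE)

gaps <- diff(sort(unique(rt)))               # spacing between distinct values
step <- as.numeric(names(which.max(table(gaps))))  # most common spacing
```

On the real data you would apply the same idea to `d$value` after dropping `NA`s, and a histogram with a narrow `binwidth` should show the same regularity visually.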

Sklar et al. did two manipulation checks: a subjective one, asking participants whether they saw the primes, and an objective one, asking them to report the parity of the primes (even or odd) to find out whether they could actually read the primes when they tried. Examine both the subjective and objective manipulation checks. What do you see? Are they related to one another?

Plotting the two tests against each other shows whether they line up. Participants who reported seeing the primes (subjective test = 1) tend to score high on the objective test, and participants who reported not seeing them (subjective test = 0) tend to score low; the correlation is 0.579.

cor(d$subjective.test, d$objective.test)
## [1] 0.5786542
ggplot(d,aes(x=subjective.test,y=objective.test)) +
  geom_point()
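
Because `d` has one row per trial, each participant's pair of test scores is repeated many times. A hedged alternative is to reduce to one row per participant before correlating. The sketch below uses invented data (`d_toy` and its values are assumptions) to show the idea:

```r
library(dplyr)

# Toy trial-level data: each subject's test scores repeat on every trial.
d_toy <- data.frame(
  subid           = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
  subjective.test = c(0, 0, 0, 1, 1, 0, 0, 0, 0),
  objective.test  = c(0.5, 0.5, 0.5, 0.9, 0.9, 0.6, 0.6, 0.6, 0.6),
  value           = c(510, 530, 495, 620, 640, 480, 470, 500, 515)
)

# Keep one row per participant, then correlate at the participant level.
per_subject <- distinct(d_toy, subid, subjective.test, objective.test)
r_subject <- cor(per_subject$subjective.test, per_subject$objective.test)
```

This weights every participant equally, regardless of how many trials they contributed.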

OK, let’s turn back to the measure and implement Sklar et al.’s exclusion criterion. A participant is kept only if they said they couldn’t see the primes (subjective test) and were not significantly above chance on the objective test (< .6 correct). Call your new data frame ds.

ds <- d %>% filter(subjective.test==0 & objective.test<0.6)  # vectorised &, evaluated for every row
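
`filter()` needs a logical value for every row, which is what the vectorised `&` produces. The scalar `&&` is meant for `if` conditions and raises an error on longer vectors in R 4.3 and later. A minimal sketch on made-up data (the `checks` data frame is an assumption):

```r
library(dplyr)

checks <- data.frame(
  subid           = 1:4,
  subjective.test = c(0, 0, 1, 0),
  objective.test  = c(0.50, 0.70, 0.40, 0.55)
)

# Vectorised `&` tests every row; comma-separated conditions are equivalent.
kept_amp   <- filter(checks, subjective.test == 0 & objective.test < 0.6)
kept_comma <- filter(checks, subjective.test == 0, objective.test < 0.6)
```

Both calls keep subjects 1 and 4, the only rows that satisfy both conditions.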

Sklar et al.’s analysis

Sklar et al. show a plot of a “facilitation effect”: how much faster you are for prime-congruent naming than for prime-incongruent naming. They then plot this difference score for the subtraction condition and for the two prime times they tested. Try to reproduce this analysis.

HINT: first take averages within subjects, then compute your error bars across participants, using the sem function (defined above).

ds.forplot <- ds %>% group_by(subid,presentation.time,operand,congruent) %>%
  summarize(avg = mean(value, na.rm=TRUE)) %>% #averages within subjects
  spread(congruent, avg) %>% #undo the gather
  mutate(diff = no-yes) %>% #add new variable
  group_by(presentation.time, operand) %>%
  summarize(facilitation_effect = mean(diff, na.rm=TRUE), sem = sem(diff))

Now plot this summary, giving more or less the bar plot that Sklar et al. gave (though I would keep operation as a variable here). Make sure you get some error bars on there (e.g. geom_errorbar or geom_linerange).

ggplot(ds.forplot, aes(x=presentation.time, y=facilitation_effect)) + 
         geom_bar(stat="identity") + 
         geom_errorbar(aes(ymin=facilitation_effect-sem, ymax=facilitation_effect+sem)) +
         facet_wrap(~operand)
## Warning: Stacking not well defined when ymin != 0

What do you see here? How close is it to what Sklar et al. report? Do the error bars match? How do you interpret these data?

This is approximately what Sklar et al. reported, namely that there is an effect in the subtraction condition (right) but not in the addition condition (left). But the error bars here are much larger (about twice as large) than in Sklar et al.!

Challenge problem: verify Sklar et al.’s claim about the relationship between RT and the objective manipulation check.

Your own analysis

Show us what you would do with these data, operating from first principles. What’s the fairest plot showing a test of Sklar et al.’s original hypothesis?

# As discussed in class, there is no good theoretical reason why this effect should apply to subtraction and not addition. Therefore, I would compare subtraction AND addition (which presumably was not statistically significant, or they would've reported that) instead of just subtraction.

Challenge problem: Do you find any statistical support for Sklar et al.’s findings?
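
One simple way to start is a one-sample t-test on per-subject difference scores (incongruent minus congruent RT) against zero. The sketch below uses invented numbers, so it says nothing about the real result, and it is only one of several reasonable analyses rather than the one Sklar et al. ran. A mixed-effects model with lme4 would be an alternative:

```r
# Made-up per-subject facilitation scores (incongruent minus congruent RT, ms).
diff_toy <- c(12, -5, 20, 8, -3, 15, 6, 10)

# One-sample t-test against zero: is the mean facilitation reliably non-zero?
tt <- t.test(diff_toy, mu = 0)

tt$estimate   # sample mean of the difference scores
tt$p.value    # two-sided p-value
```

With the real data you would compute one difference score per participant within each operand and presentation time, as in the summary above, before testing.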