Intro

This is problem set #2, in which we hope you will practice the visualization package ggplot2, as well as hone your knowledge of the packages tidyr and dplyr.

Sklar et al. (2012) claims evidence for unconscious arithmetic processing. We’re going to do a reanalysis of their Experiment 6, which is the primary piece of evidence for that claim. The data are generously contributed by Asael Sklar.

First let’s set up a few preliminaries.

library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.3.2
library(tidyr)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(lme4)
## Loading required package: Matrix
## 
## Attaching package: 'Matrix'
## The following object is masked from 'package:tidyr':
## 
##     expand
library(forcats)
## Warning: package 'forcats' was built under R version 3.3.2
sem <- function(x) {sd(x, na.rm=TRUE) / sqrt(length(x))}
ci95 <- function(x) {sem(x) * 1.96}

Data Prep

First read in two data files and subject info. A and B refer to different trial order counterbalances.

subinfo <- read.csv("http://langcog.stanford.edu/sklar_expt6_subinfo_corrected.csv")
d.a <- read.csv("http://langcog.stanford.edu/sklar_expt6a_corrected.csv")
d.b <- read.csv("http://langcog.stanford.edu/sklar_expt6b_corrected.csv")

Gather these datasets into long form and get rid of the Xs in the headers.

#make d.a into long form
d.a.tidy <- d.a %>%
  gather(subid,RT,
         X1,X2,X3,X4,X5,X6,X7,X8,X9,X10,X11,X12,X13,X14,X15,X16,X17,X18,X19,X20,X21) %>%
  mutate(subid = gsub("X","",subid),
         subid = as.factor(subid)) #remove X's from headers

#make d.b into long form
d.b.tidy <- d.b %>%
  gather(subid,RT,
         X22, X23,X24,X25,X26,X27,X28,X29,X30,X31,X32,X33,X34,X35,X36,X37,X38,X39,X40,X41,X42) %>%
  mutate(subid = gsub("X","",subid),
         subid = as.factor(subid)) #remove X's

Bind these together. Check out bind_rows.

d.ab.tidy = bind_rows(d.a.tidy,d.b.tidy)
## Warning in bind_rows_(x, .id): Unequal factor levels: coercing to character
#throws error: "unequal factor levels: coercing to character". I googled a bit and I think the issue is that variables in d.a.tidy and d.b.tidy are of different types, but I see no differences in structure between them? 

Merge these with subject info. You will need to look into merge and its relatives, left_ and right_join. Call this dataframe d, by convention.

subinfo$subid = factor(subinfo$subid)
d.ab.tidy$subid = factor(d.ab.tidy$subid)

d = left_join(subinfo,d.ab.tidy)
## Joining, by = "subid"
## Warning in left_join_impl(x, y, by$x, by$y, suffix$x, suffix$y): joining
## factors with different levels, coercing to character vector

Clean up the factor structure.

d$presentation.time <- factor(d$presentation.time)
levels(d$operand) <- c("addition","subtraction")

Data Analysis Preliminaries

Examine the basic properties of the dataset. First, take a histogram.

ggplot(d, aes(RT)) +
  geom_histogram(binwidth = 100) +
  ylab("Frequency")
## Warning: Removed 237 rows containing non-finite values (stat_bin).

#histogram of RTs faceted by condition (congruent or incongruent)
ggplot(d, aes(RT)) +
  geom_histogram(binwidth = 100) +
  ylab("Frequency") +
  facet_grid(~ congruent)
## Warning: Removed 237 rows containing non-finite values (stat_bin).

Challenge question: what is the sample rate of the input device they are using to gather RTs?

Sklar et al. did two manipulation checks. Subjective - asking participants whether they saw the primes - and objective - asking them to report the parity of the primes (even or odd) to find out if they could actually read the primes when they tried. Examine both the unconscious and conscious manipulation checks. What do you see? Are they related to one another?

Subjective measure.

#subjective measure -- asking whether participants saw primes
subinfo$subjective.test = factor(subinfo$subjective.test, levels = c(0,1), labels = c("no","yes"))

table(subinfo$subjective.test)
## 
##  no yes 
##  21  21

It appears that only half of the participants (21 of 42) reported that they saw the prime.

Objective measure.

#my sense is that the values in the objective.test variable referes to % correct for the objective measure? 
ggplot(subinfo,aes(subid,objective.test, col = subjective.test)) +
  geom_point() + 
  ylab("")

  ylim(0,1)
## <ScaleContinuousPosition>
##  Range:  
##  Limits:    0 --    1

From the graph, we can see that those who reported “yes” on the subjective test tend to have higher scores reporting the parity of the primes in the objective test, whereas those who reported “no” on the subjective test have objective test scores that hover around chance (50%).

OK, let’s turn back to the measure and implement Sklar et al.’s exclusion criterion. You need to have said you couldn’t see (subjective test) and also be not significantly above chance on the objective test (< .6 correct). Call your new data frame ds.

ds = filter(d, subjective.test == 0, objective.test < .6)

Sklar et al.’s analysis

Sklar et al. show a plot of a “facilitation effect” - the amount faster you are for prime-congruent naming compared with prime-incongruent naming. They then show plot this difference score for the subtraction condition and for the two prime times they tested. Try to reproduce this analysis.

HINT: first take averages within subjects, then compute your error bars across participants, using the sem function (defined above).

ds_avg = ds %>%
  na.omit() %>%
  group_by(subid, congruent, operand, presentation.time) %>%
  summarise(RT = mean(RT)) %>%
  group_by(subid, operand, presentation.time) %>%
  summarise(diff = RT[congruent == "no"] - RT[congruent == "yes"]) %>%
  group_by(operand, presentation.time) %>%
  summarise(mean_diff = mean(diff),
            sem_diff = sem(diff))

Now plot this summary, giving more or less the bar plot that Sklar et al. gave (though I would keep operation as a variable here. Make sure you get some error bars on there (e.g. geom_errorbar or geom_linerange).

ggplot(ds_avg, aes(presentation.time,mean_diff,fill=presentation.time)) +
  geom_bar(stat = "identity") +
  facet_grid(~ operand) + 
  geom_errorbar(aes(ymax = mean_diff + sem_diff, ymin = mean_diff - sem_diff, width = .1)) +
  ggthemes::theme_few() +
  ylim(-35,35) +
  ylab("Facilitation (s)")

What do you see here? How close is it to what Sklar et al. report? Do the error bars match? How do you interpret these data?

Here, we can see that participants were indeed faster with prime-congruent naming than prime-incongruent naming; however, this effect appears to be specific to the subtraction condition. In the addition condition, participants actually appear to be slower with prime-congruent naming than prime-incongruent naming when the stimuli were presented for 1700ms. The error bars on the graph I generated appear to be quite larger than those in figure 2 of Sklar et al. (2012).

So, priming seems to particularly quicken RTs during subtraction trials but not addition trials.

Challenge problem: verify Sklar et al.’s claim about the relationship between RT and the objective manipulation check.

ds_2 = filter(d, subjective.test == 0) #just include those who passed subjective manipulation check

ds_avg_2 = ds_2 %>%
  na.omit() %>%
  mutate(objective = if_else(objective.test < .6, "pass","fail")) %>%
  group_by(subid, congruent, operand, presentation.time, objective) %>%
  summarise(RT = mean(RT)) %>%
  group_by(subid, operand, presentation.time, objective) %>%
  summarise(diff = RT[congruent == "no"] - RT[congruent == "yes"]) %>%
  group_by(operand, presentation.time, objective) %>%
  summarise(mean_diff = mean(diff),
            sem_diff = sem(diff))

ggplot(ds_avg_2, aes(presentation.time,mean_diff,fill=presentation.time)) +
  geom_bar(stat = "identity") +
  facet_grid(objective ~ operand) + 
  geom_errorbar(aes(ymax = mean_diff + sem_diff, ymin = mean_diff - sem_diff, width = .1)) +
  ggthemes::theme_few() +
  ylim(-35,35)

Here, I’ve plotted a difference in RT between the congruent-priming and incongruent-priming conditions, faceted by operand and whether participants who “failed” the mainpulation check (received a score higher than .6) or “passed” (received a score lower than .6 and were included in the final sample). Indeed, you can see that there doesn’t appear to be a difference between the priming conditions for those who failed the objective test.

Your own analysis

Show us what you would do with these data, operating from first principles. What’s the fairest plot showing a test of Sklar et al.’s original hypothesis?

From their paper, it seemed that Sklar et al.‘s original hypothesis was that primes would influence subjects’ judgment time when the result of the primed equation was the target stimulus (prime-congruent condition). They stated no prediction about the effect of the operand on subjects’ RT. So, a more fair plot would be showing RT collapsed across operand type and asking whether there are differences in RT between incongruent and congruent conditions. Also, I think it’s helpful to show the mean RT for each condition instead of showing differences in RT between conditions, so readers actually have a sense of how long this task was taking participants.

ds_fair= ds %>%
  na.omit() %>%
  group_by(subid, congruent, presentation.time) %>%
  summarise(RT = mean(RT)) %>%
  group_by(congruent, presentation.time) %>%
  summarise(mean_RT = mean(RT),
            sem_RT = sem(RT))

dodge <- position_dodge(width=0.9)

ggplot(ds_fair, aes(presentation.time,mean_RT,fill=congruent)) +
  geom_bar(stat = "identity", position = dodge) +
  geom_errorbar(aes(ymax = mean_RT + sem_RT, ymin = mean_RT - sem_RT), position = dodge , stat = "identity", width = .1) +
  ylab("Mean Reaction Time") +
  xlab("Presentation Time") + 
  ggthemes::theme_few()

Challenge problem: Do you find any statistical support for Sklar et al.’s findings?

#Does the congruency of the prime/target predict reaction time? 
summary(lmer(RT ~ congruent + (1|subid), ds))
## Linear mixed model fit by REML ['lmerMod']
## Formula: RT ~ congruent + (1 | subid)
##    Data: ds
## 
## REML criterion at convergence: 30864.3
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.7690 -0.6094 -0.1019  0.5720  6.7789 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  subid    (Intercept)  9524     97.59  
##  Residual             11271    106.17  
## Number of obs: 2531, groups:  subid, 17
## 
## Fixed effects:
##              Estimate Std. Error t value
## (Intercept)   680.692     23.856  28.534
## congruentyes   -4.992      4.221  -1.183
## 
## Correlation of Fixed Effects:
##             (Intr)
## congruentys -0.088
#Does the operand predict reaction time? 
summary(lmer(RT ~ operand + (1|subid), ds))
## Linear mixed model fit by REML ['lmerMod']
## Formula: RT ~ operand + (1 | subid)
##    Data: ds
## 
## REML criterion at convergence: 30849.1
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.8373 -0.6402 -0.0776  0.5648  6.6984 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  subid    (Intercept)  9521     97.58  
##  Residual             11203    105.85  
## Number of obs: 2531, groups:  subid, 17
## 
## Fixed effects:
##                    Estimate Std. Error t value
## (Intercept)         686.447     23.845  28.788
## operandsubtraction  -17.193      4.212  -4.082
## 
## Correlation of Fixed Effects:
##             (Intr)
## oprndsbtrct -0.085
#Is there an interaction between congruency and the operand?
summary(lmer(RT ~ congruent * operand + (1|subid),ds))
## Linear mixed model fit by REML ['lmerMod']
## Formula: RT ~ congruent * operand + (1 | subid)
##    Data: ds
## 
## REML criterion at convergence: 30831.5
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.8195 -0.6406 -0.0588  0.5783  6.6858 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  subid    (Intercept)  9521     97.58  
##  Residual             11182    105.74  
## Number of obs: 2531, groups:  subid, 17
## 
## Fixed effects:
##                                 Estimate Std. Error t value
## (Intercept)                      684.253     24.022  28.484
## congruentyes                       4.391      5.829   0.753
## operandsubtraction                -7.418      5.944  -1.248
## congruentyes:operandsubtraction  -19.604      8.415  -2.330
## 
## Correlation of Fixed Effects:
##             (Intr) cngrnt oprnds
## congruentys -0.121              
## oprndsbtrct -0.119  0.490       
## cngrntys:pr  0.084 -0.693 -0.706
#Here is what I think they did.. a within-subjects anova?
ds_subt = subset(ds,ds$operand == "subtraction")
summary(aov(RT ~ congruent * presentation.time + Error(subid/prime), ds_subt)) #but it still doesn't come out? I'm not sure how they recovered such a high F value. 
## Warning in aov(RT ~ congruent * presentation.time + Error(subid/prime), :
## Error() model is singular
## 
## Error: subid
##                             Df   Sum Sq Mean Sq F value Pr(>F)
## congruent                    1   213525  213525   0.230  0.640
## presentation.time            1  1434073 1434073   1.543  0.236
## congruent:presentation.time  1    29837   29837   0.032  0.861
## Residuals                   13 12078729  929133               
## 
## Error: subid:prime
##                               Df   Sum Sq Mean Sq F value  Pr(>F)   
## congruent                      1    68326   68326   7.217 0.00732 **
## congruent:presentation.time    1     8852    8852   0.935 0.33376   
## Residuals                   1195 11312795    9467                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

First, I conducted a linear mixed-effects model with the condition (congruent or incongruent) as the predictor for RT and find no effect of condition on RT. Next, I asked whether the operand type predicted RT and indeed found that RT was significantly lower for subtraction than addition (est. = -17.193, t = -4.082). Finally, I found that there is a significant interaction between congruency and operand type on RT (est. = =19.604, t = -2.33).

I wasn’t totally sure which statistical tests the authors ran but I tried running an anova but my F value didn’t match theirs? In short, no.