A recent paper published in Psychological Science (Thorstenson, Pazda, & Elliot, 2015), which has already received a great deal of media attention, claims to have found evidence that sadness impairs color perception. In their experiments, subjects viewed either an “Amusing” or a “Sad” video (Experiment 1) or a “Neutral” or a “Sad” video (Experiment 2), and then indicated whether a number of color patches were red, green, blue, or yellow. The validity of the Emotion manipulation was assessed using a post-test questionnaire, which probed how sad the participants felt after watching their respective video.
For the purpose of my evaluation, I’m going to focus on Experiment 2, because I prefer the “Neutral” baseline condition. One oddity is that the grand means of the two experiments are wildly different (\(M_1\) = .90, \(M_2\) = .60), but we’ll assume that’s just noise.
For this experiment (N = 130), the authors report two t-tests: one testing the effect of Emotion condition on color patch accuracy for colors on the Red/Green axis, and one testing the effect of Emotion on accuracy for the Blue/Yellow axis. The difference for Red/Green is non-significant, and the difference for Blue/Yellow is just barely significant (p = .043).
Let’s read in their Experiment 2 data (graciously made available online at https://osf.io/sb58f; I’ve converted the xlsx file to CSV and removed the key), recode it for readability, and take a look.
blue2 <- read.csv("~/Dropbox/Data_blue/Study2Data.csv")
blue2$Emotion <- factor(with(blue2, ifelse(Emotion_Condition == 1, "Sad", "Neutral")))
# Make a "long" version of the data to make plotting (and analysis) easier
library(tidyr)
source("~/Dropbox/functions.R")
blue2L <- gather(blue2, Color, Mean, RG_ACC:BY_ACC)
# And recode the Color variable since it is a bit messy
blue2L$Color <- with(blue2L, ifelse(Color == "RG_ACC", "Red/Green", "Blue/Yellow"))
head(blue2L)
## Subject Emotion_Condition SAD_ESRI Emotion Color Mean
## 1 1 1 5 Sad Red/Green 0.59
## 2 2 0 NA Neutral Red/Green 0.88
## 3 3 1 2 Sad Red/Green 0.46
## 4 4 0 1 Neutral Red/Green 0.67
## 5 5 1 4 Sad Red/Green 0.79
## 6 6 0 0 Neutral Red/Green 0.71
Of immediate concern is that there is missing self-report data (see Subject 2). How many cases are there?
blue2[is.na(blue2$SAD_ESRI),]
## Subject Emotion_Condition RG_ACC BY_ACC SAD_ESRI Emotion
## 2 2 0 0.88 0.96 NA Neutral
## 67 75 0 0.75 0.67 NA Neutral
## 102 117 0 0.38 0.50 NA Neutral
## 123 143 0 0.58 0.63 NA Neutral
It looks like we are missing four data points, all in the Neutral condition. Interesting, and actually quite concerning. In the “manipulation check” comparison for Experiment 2 the authors report N = 65 for both groups and df = 128 for their t-test, but only 61 participants in the Neutral group supplied a sadness rating, so that test should have had df = 124. From now on, I’m going to exclude these four participants, because they do not supply the full complement of data.
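The exclusion step itself is not shown above. Here is a minimal sketch of how it might look, using a small made-up data frame in place of the real file; the object name `toy` and its values are purely illustrative.

```r
# Toy stand-in for the Experiment 2 data (values are made up for illustration)
toy <- data.frame(
  Subject  = 1:6,
  Emotion  = factor(c("Sad", "Neutral", "Sad", "Neutral", "Sad", "Neutral")),
  BY_ACC   = c(0.54, 0.96, 0.50, 0.63, 0.58, 0.67),
  SAD_ESRI = c(5, NA, 2, 1, 4, 0)
)

# Keep only participants who supplied a self-report sadness rating
toy.n <- toy[!is.na(toy$SAD_ESRI), ]

# Group sizes after exclusion
table(toy.n$Emotion)

# Degrees of freedom for a pooled-variance two-sample t-test: n1 + n2 - 2
nrow(toy.n) - 2
```

On the real data the equivalent would presumably be `blue2.n <- blue2[!is.na(blue2$SAD_ESRI), ]`, which leaves 65 Sad and 61 Neutral participants and hence df = 124.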
Here are the condition means, excluding these four subjects.
Color | Emotion | Mean | se |
---|---|---|---|
Blue/Yellow | Neutral | 0.608 | 0.016 |
Blue/Yellow | Sad | 0.569 | 0.014 |
Red/Green | Neutral | 0.619 | 0.024 |
Red/Green | Sad | 0.594 | 0.018 |
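The code behind this table is not shown. A table like it could be produced roughly as follows, sketched here in base R on made-up long-format data. The `se` helper is hypothetical (the original one presumably lives in `functions.R`), and `toy_long` only mirrors the column names of `blue2L`.

```r
# Standard error of the mean (hypothetical helper; the original is not shown)
se <- function(x) sd(x) / sqrt(length(x))

# Toy long-format data (made-up accuracies), mirroring the columns of blue2L
toy_long <- data.frame(
  Emotion = rep(c("Neutral", "Sad"), each = 3, times = 2),
  Color   = rep(c("Blue/Yellow", "Red/Green"), each = 6),
  Mean    = c(0.60, 0.65, 0.55, 0.50, 0.60, 0.55,
              0.70, 0.60, 0.65, 0.55, 0.65, 0.60)
)

# One row per Color x Emotion cell, with its mean and standard error
cell_mean <- aggregate(Mean ~ Color + Emotion, data = toy_long, FUN = mean)
cell_se   <- aggregate(Mean ~ Color + Emotion, data = toy_long, FUN = se)
names(cell_se)[names(cell_se) == "Mean"] <- "se"
summary_tab <- merge(cell_mean, cell_se, by = c("Color", "Emotion"))
summary_tab
```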
And here’s a plot (plotting code omitted for brevity).
Let’s now replicate the two t-tests that are reported in the paper, excluding the subjects with missing self-report data. First we compare the effect of Emotion on accuracy for the Blue/Yellow axis.
t.test(blue2.n[blue2.n$Emotion=="Sad",]$BY_ACC, blue2.n[blue2.n$Emotion=="Neutral",]$BY_ACC, var.equal=TRUE)
##
## Two Sample t-test
##
## data: blue2.n[blue2.n$Emotion == "Sad", ]$BY_ACC and blue2.n[blue2.n$Emotion == "Neutral", ]$BY_ACC
## t = -1.8179, df = 124, p-value = 0.07149
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.081027212 0.003443353
## sample estimates:
## mean of x mean of y
## 0.5690769 0.6078689
It looks like the effect is no longer significant. Well, that’s a pickle. Now we test the effect of Emotion on accuracy for the Red/Green axis, which is still nonsignificant.
t.test(blue2.n[blue2.n$Emotion=="Sad",]$RG_ACC, blue2.n[blue2.n$Emotion=="Neutral",]$RG_ACC, var.equal=TRUE)
##
## Two Sample t-test
##
## data: blue2.n[blue2.n$Emotion == "Sad", ]$RG_ACC and blue2.n[blue2.n$Emotion == "Neutral", ]$RG_ACC
## t = -0.805, df = 124, p-value = 0.4224
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.08322732 0.03510122
## sample estimates:
## mean of x mean of y
## 0.5944615 0.6185246
In this situation, we want to know which hypothesis the data support more strongly: the null hypothesis of no effect, or the alternative hypothesis that sadness makes you worse at perceiving blue/yellow. We can assess this by computing a Bayes factor, which is the ratio of the marginal likelihood of the data under the alternative model to that under the null model, using the BayesFactor package (Morey et al., 2015).
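Written out, with \(D\) for the data and \(\delta\) for the standardized effect size (notation assumed here for illustration, not taken from the paper), the Bayes factor and its role in updating the odds between the two hypotheses are:

\[
BF_{10} = \frac{p(D \mid H_1)}{p(D \mid H_0)}, \qquad
p(D \mid H_1) = \int p(D \mid \delta)\, p(\delta)\, d\delta, \qquad
\frac{p(H_1 \mid D)}{p(H_0 \mid D)} = BF_{10} \times \frac{p(H_1)}{p(H_0)}
\]

The middle expression is why the prior on effect size matters: the alternative is scored by how well it predicts the data averaged over every effect size the prior considers plausible.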
Here, we test the null hypothesis (that the difference in means is precisely 0) against an explicit alternative hypothesis. The BayesFactor package assumes a Cauchy prior on effect size (with default scale parameter r = .707), pictured below against the standard normal distribution.
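To get a feel for how this prior differs from a standard normal, here is a small sketch in base R comparing how much probability each places on large effects. The cutoff of 2 is an arbitrary choice for illustration.

```r
r <- sqrt(2) / 2  # default scale, approximately 0.707

# Half of the Cauchy prior mass lies between -r and +r
mass_within_r <- pcauchy(r, scale = r) - pcauchy(-r, scale = r)

# Prior probability that the absolute effect size exceeds 2
tail_cauchy <- 2 * pcauchy(2, scale = r, lower.tail = FALSE)
tail_normal <- 2 * pnorm(2, lower.tail = FALSE)

c(within_r = mass_within_r, cauchy_tail = tail_cauchy, normal_tail = tail_normal)
```

The Cauchy's heavier tails mean it takes very large effects more seriously than the normal does, while still concentrating half its mass on effects smaller than r in absolute value.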
A Bayes factor of 1 indicates equivocal support for the two models, while values above 1 indicate support for the alternative model and values below 1 indicate support for the null model. Let’s run our Bayesian t-test for the Blue/Yellow color condition in Experiment 2.
blue.bft <- ttestBF(formula = BY_ACC ~ Emotion, data = blue2.n)
blue.bft
## Bayes factor analysis
## --------------
## [1] Alt., r=0.707 : 0.84304 ±0%
##
## Against denominator:
## Null, mu1-mu2 = 0
## ---
## Bayes factor type: BFindepSample, JZS
The Bayes factor here is .84, indicating slightly more support for the null hypothesis than for the alternative. We can express the evidence in favor of the null model directly by taking the reciprocal of this Bayes factor, which is 1.186. Either way the value is very close to 1, so there doesn’t appear to be much support one way or the other.
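The conversion between the two directions of evidence is just a reciprocal, shown here on the value printed above:

```r
bf10 <- 0.84304    # evidence for the alternative over the null, from the output above
bf01 <- 1 / bf10   # evidence for the null over the alternative
round(bf01, 3)
```

With the BayesFactor package loaded, `1 / blue.bft` should give the same inversion directly on the fitted object.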
We can test the robustness of our critical Bayesian t-test across a variety of Cauchy scale parameters. Below is a plot which illustrates how adjusting the r scale parameter influences the distribution of expected effect sizes.
Here is the Bayes factor for this particular comparison (Sad vs. Neutral for Blue/Yellow accuracy) across three values for the scale parameter. As can be seen in the figure, as the prior is adjusted (here the scale parameters are .707, 1.0, and 1.41) so that probability mass is spread more evenly across larger effects, the evidence in favor of the null hypothesis increases. Plot code from E-J Wagenmakers.
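For readers without the plot, the same sensitivity check can be sketched in base R from the summary statistics alone. This assumes the integral form of the JZS Bayes factor given by Rouder et al. (2009). The helper `jzs_bf10` is a hypothetical name of mine, not a function from the BayesFactor package, and the inputs are the t statistic and group sizes reported above.

```r
# JZS Bayes factor (alternative over null) for a two-sample t-test,
# computed by numerical integration over the mixing parameter g.
# Assumes effect size | g ~ Normal(0, g * rscale^2), g ~ inverse-chi-square(1),
# which is equivalent to a Cauchy(0, rscale) prior on effect size.
jzs_bf10 <- function(t, n1, n2, rscale = sqrt(2) / 2) {
  nu    <- n1 + n2 - 2            # degrees of freedom
  n_eff <- n1 * n2 / (n1 + n2)    # effective sample size
  integrand <- function(g) {
    a <- 1 + n_eff * g * rscale^2
    log_prior <- -0.5 * log(2 * pi) - 1.5 * log(g) - 1 / (2 * g)
    exp(-0.5 * log(a) - (nu + 1) / 2 * log1p(t^2 / (a * nu)) + log_prior)
  }
  marg_alt  <- integrate(integrand, lower = 0, upper = Inf)$value
  marg_null <- (1 + t^2 / nu)^(-(nu + 1) / 2)
  marg_alt / marg_null
}

# Blue/Yellow comparison in Experiment 2: t = -1.8179, 65 Sad vs. 61 Neutral
rscales <- c(0.707, 1, 1.41)
bfs <- sapply(rscales, function(r) jzs_bf10(t = -1.8179, n1 = 65, n2 = 61, rscale = r))
data.frame(rscale = rscales, BF10 = bfs, BF01 = 1 / bfs)
```

If the sketch is right, the first row should land close to the .84 reported by `ttestBF`, and `BF10` should shrink as the scale grows, which is the pattern described above.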
We can also plot the posterior distribution for the effect of Emotion (using r scale = .707, the default). Plot code from E-J Wagenmakers.
Out of curiosity, I now want to know whether there is any evidence in support of the critical comparison (effect of Emotion on Blue/Yellow accuracy) in the first experiment, which used a different comparison video (“Amusing” rather than “Neutral”) and reported a larger effect. We will read in their data, present the condition means, and then run the Bayesian t-test, as before.
Color | Emotion | Mean | se |
---|---|---|---|
Blue/Yellow | Amusing | 0.90 | 0.01 |
Blue/Yellow | Sad | 0.86 | 0.01 |
Red/Green | Amusing | 0.91 | 0.01 |
Red/Green | Sad | 0.91 | 0.01 |
Here we run the Bayesian t-test on the critical comparison from the Experiment 1 data. There is some very weak support for the alternative hypothesis relative to the null, but certainly nothing to write home about.
blue.bft2 <- ttestBF(formula = BY_ACC ~ Emotion, data = blue2)
blue.bft2
## Bayes factor analysis
## --------------
## [1] Alt., r=0.707 : 1.231464 ±0%
##
## Against denominator:
## Null, mu1-mu2 = 0
## ---
## Bayes factor type: BFindepSample, JZS
For comparison, let’s compute the Bayes factor for the manipulation check (that subjects in the “Sad” condition report being sadder than those in the “Amusing” condition). Here, the mean difference is quite large (\(M_{Sad}\) = 5.32, \(M_{Amusing}\) = .75), and the Bayes factor reflects this: it is on the order of \(10^{23}\).
blue.bft.m <- ttestBF(formula = SAD_ESRI ~ Emotion, data = blue)
blue.bft.m
## Bayes factor analysis
## --------------
## [1] Alt., r=0.707 : 4.828414e+23 ±0%
##
## Against denominator:
## Null, mu1-mu2 = 0
## ---
## Bayes factor type: BFindepSample, JZS
It appears that across the two experiments reported by Thorstenson et al., there is very little, if any, positive evidence in favor of the hypothesis that sadness impairs color perception. Indeed, the evidence appears to be largely inconclusive, and hardly sufficient to support the claims made by the authors.