For this exercise, please try to reproduce the results from Study 1 of the associated paper (Joel, Teper, & MacDonald, 2014). The PDF of the paper is included in the same folder as this Rmd file.
In Study 1, 150 introductory psychology students were randomly assigned to a “real” or a “hypothetical” condition. In the real condition, participants believed that they would have a real opportunity to connect with potential romantic partners. In the hypothetical condition, participants simply imagined that they were on a date. All participants were required to select their favorite profile and answer whether they were willing to exchange contact information.
Below is the specific result you will attempt to reproduce (quoted directly from the results section of Study 1):
We next tested our primary hypothesis that participants would be more reluctant to reject the unattractive date when they believed the situation to be real rather than hypothetical. Only 10 of the 61 participants in the hypothetical condition chose to exchange contact information with the unattractive potential date (16%). In contrast, 26 of the 71 participants in the real condition chose to exchange contact information (37%). A chi-square test of independence indicated that participants were significantly less likely to reject the unattractive potential date in the real condition compared with the hypothetical condition, X^2(1, N = 132) = 6.77, p = .009.
library(tidyverse) # for data munging
library(knitr) # for kable table formatting
library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files
library(readxl) # import excel files
# optional packages:
# library(broom)
# library(labelled) # converts SPSS's labelled to R's factor
# Just Study 1
d <- read_sav('data/Empathy Gap Study 1 data.sav')
# Relabel condition and the DV, then turn both into factors.
# The data file has no codebook, so the value labels below are my best guess.
d <- d %>%
  select(condition, exchangeinfo) %>%
  mutate(condition = case_when(condition == 0 ~ "hypothetical",
                               condition == 1 ~ "real")) %>%
  mutate(exchangeinfo = case_when(exchangeinfo == 1 ~ "exchange",
                                  exchangeinfo == 2 ~ "dontexchange")) %>%
  mutate(condition = as.factor(condition)) %>%
  mutate(exchangeinfo = as.factor(exchangeinfo))
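Because the coding scheme was guessed, it may be worth checking that the recode left no unexpected missing values: `case_when()` returns `NA` for any value that matches none of its conditions. The sketch below uses a small made-up tibble (`toy`, with an invented code `3`), not the real data file, purely to illustrate the check.

```r
library(dplyr)

# Hypothetical toy data standing in for the raw SPSS codes (NOT the real file).
# The value 3 and the NA are invented to show what an unmatched code does.
toy <- tibble(condition = c(0, 1, 1, NA),
              exchangeinfo = c(1, 2, 3, 1))

recoded <- toy %>%
  mutate(condition = case_when(condition == 0 ~ "hypothetical",
                               condition == 1 ~ "real"),
         exchangeinfo = case_when(exchangeinfo == 1 ~ "exchange",
                                  exchangeinfo == 2 ~ "dontexchange"))

# Unmatched or missing codes become NA, so count the NAs in each column
na_counts <- colSums(is.na(recoded))
print(na_counts)
```

Running the same `colSums(is.na(...))` check on the recoded `d` should return zeros if the guessed coding covers every value in the file.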
Only 10 of the 61 participants in the hypothetical condition chose to exchange contact information with the unattractive potential date (16%). In contrast, 26 of the 71 participants in the real condition chose to exchange contact information (37%).
# reproduce the above results here
d %>% # percentage table (percent is reported as a proportion)
  group_by(condition) %>%
  summarize(wantedtoexchange = length(exchangeinfo[exchangeinfo == "exchange"]),
            total = length(exchangeinfo),
            percent = wantedtoexchange / total)
## # A tibble: 2 × 4
##   condition    wantedtoexchange total percent
##   <fct>                   <int> <int>   <dbl>
## 1 hypothetical               10    61   0.164
## 2 real                       26    71   0.366
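As a cross-check that does not depend on the data file, the same proportions can be recovered in base R from the reported counts alone. This is only a sketch: the matrix below is typed in by hand from the table above (with 51 = 61 - 10 and 45 = 71 - 26), and the names `counts` and `row_props` are illustrative.

```r
# Counts typed in from the table above: rows = condition, columns = choice
counts <- matrix(c(10, 51,
                   26, 45),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(condition = c("hypothetical", "real"),
                                 choice = c("exchange", "dontexchange")))

# Proportions within each condition (each row sums to 1)
row_props <- prop.table(counts, margin = 1)
print(round(row_props, 3))
```

The "exchange" column of `row_props` corresponds to the 16% and 37% quoted from the paper.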
A chi-square test of independence indicated that participants were significantly less likely to reject the unattractive potential date in the real condition compared with the hypothetical condition, X^2(1, N = 132) = 6.77, p = .009.
# reproduce the above results here
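# Collapse to a 2 x 2 table of counts for the chi-squared test.
# Note: this overwrites d with the summary table.
d <- d %>%
  group_by(condition) %>%
  summarize(wantedtoexchange = length(exchangeinfo[exchangeinfo == "exchange"]),
            didnotwanttoexchange = length(exchangeinfo[exchangeinfo == "dontexchange"])) %>%
  select(wantedtoexchange, didnotwanttoexchange)
# correct = FALSE turns off the continuity correction applied by default to 2 x 2 tables
chisq.test(d, correct = FALSE)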
##
## Pearson's Chi-squared test
##
## data: d
## X-squared = 6.7674, df = 1, p-value = 0.009284
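To see where the statistic comes from, and why `correct = FALSE` matters, the Pearson chi-squared value can be computed by hand from the four cell counts and compared with `chisq.test()` with and without the continuity correction. This is a minimal sketch, assuming the counts reported above; the object names are illustrative.

```r
# Counts typed in from the results above: rows = condition, columns = choice
counts <- matrix(c(10, 51,
                   26, 45),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(condition = c("hypothetical", "real"),
                                 choice = c("exchange", "dontexchange")))

# Expected counts under independence: (row total * column total) / N
expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)

# Pearson statistic computed by hand
x2_manual <- sum((counts - expected)^2 / expected)

# Same test via chisq.test(), without and with Yates' continuity correction
uncorrected <- chisq.test(counts, correct = FALSE)
corrected <- chisq.test(counts)  # default correct = TRUE for 2 x 2 tables

print(c(manual = x2_manual,
        uncorrected = unname(uncorrected$statistic),
        corrected = unname(corrected$statistic)))
```

The hand-computed value and the uncorrected test should agree with the statistic shown above, while the default (corrected) call gives a smaller statistic, so the paper's value appears to be the uncorrected one.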
Were you able to reproduce the results you attempted to reproduce? If not, what part(s) were you unable to reproduce?
I was able to reproduce the full results.
How difficult was it to reproduce your results?
It was not difficult! I was briefly confused because R's chisq.test() applies Yates' continuity correction to 2 × 2 tables by default, but once I turned that off with correct = FALSE the result matched the paper.
What aspects made it difficult? What aspects made it easy?
I appreciated that the authors marked which columns in the data file were relevant to the analyses and needed to reproduce the results. I would have appreciated this more if those columns had been at the front: when I first opened the file I was confused by all of the column names, and it was only when I scrolled to the end that I saw the relevant ones. It would also have been better if they had labeled their coding scheme in the file (rather than using bare 1s and 2s, or 0s and 1s) or provided a codebook explaining the values. As it was, I had to guess which values referred to which condition or choice. But otherwise it went smoothly!