For this exercise, please try to reproduce the results from Study 1 of the associated paper (Joel, Teper, & MacDonald, 2014). The PDF of the paper is included in the same folder as this Rmd file.

Methods summary:

In Study 1, 150 introductory psychology students were randomly assigned to a “real” or a “hypothetical” condition. In the real condition, participants believed that they would have a real opportunity to connect with potential romantic partners. In the hypothetical condition, participants simply imagined that they were on a date. All participants were required to select their favorite profile and answer whether they were willing to exchange contact information.


Target outcomes:

Below is the specific result you will attempt to reproduce (quoted directly from the results section of Study 1):

We next tested our primary hypothesis that participants would be more reluctant to reject the unattractive date when they believed the situation to be real rather than hypothetical. Only 10 of the 61 participants in the hypothetical condition chose to exchange contact information with the unattractive potential date (16%). In contrast, 26 of the 71 participants in the real condition chose to exchange contact information (37%). A chi-square test of independence indicated that participants were significantly less likely to reject the unattractive potential date in the real condition compared with the hypothetical condition, X^2(1, N = 132) = 6.77, p = .009.


Step 1: Load packages

library(tidyverse) # for data munging
library(knitr) # for kable table formatting
library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files
library(readxl) # import excel files

# #optional packages:
# library(broom)
# library(labelled) # converts SPSS's labelled to R's factor

Step 2: Load data

# Just Study 1
d <- read_sav('data/Empathy Gap Study 1 data.sav')

Step 3: Tidy data

d_tidy <- d %>% 
  select(exchangeinfo,
         condition) %>% 
  # Recode condition so the values are self-explanatory (0 = hypothetical, 1 = real)
  mutate(condition = ifelse(test = condition == 0,
                            yes = 'hypothetical',
                            no = 'real')) %>% 
  # Recode exchangeinfo to 1 = exchanged info and 0 = did not (raw code 2 = did not exchange),
  # so that summing the column gives the number of participants who exchanged info
  mutate(exchangeinfo = ifelse(test = exchangeinfo == 2,
                               yes = 0,
                               no = 1))
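One caveat of recoding with ifelse() is that every value other than the tested one falls into the "no" branch, so an unexpected code would be silently relabelled. The following is a minimal sketch of an alternative using base R's factor() with explicit levels. The vector `raw_condition` and the stray value 9 are invented for illustration and are not taken from the study data.

```r
# Hypothetical raw codes standing in for the SPSS column; 9 is an
# invented stray value used only to show how each approach treats it.
raw_condition <- c(0, 1, 1, 0, 9)

# ifelse() sends every non-zero value, including the stray 9, to "real"
with_ifelse <- ifelse(raw_condition == 0, "hypothetical", "real")

# factor() with explicit levels turns any unlisted code into NA instead
with_factor <- factor(raw_condition,
                      levels = c(0, 1),
                      labels = c("hypothetical", "real"))

with_ifelse
with_factor
```

In this dataset there were no missing or unexpected values, so both approaches should give the same labels; the factor() version only matters as a safeguard.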

Step 4: Run analysis

Descriptive statistics

Only 10 of the 61 participants in the hypothetical condition chose to exchange contact information with the unattractive potential date (16%). In contrast, 26 of the 71 participants in the real condition chose to exchange contact information (37%).

# reproduce the above results here

# confirm that the number of participants in each condition is correct
d_tidy %>% 
  filter(condition == 'hypothetical') %>% 
  nrow()
## [1] 61
d_tidy %>% 
  filter(condition == 'real') %>% 
  nrow()
## [1] 71
descriptive_stats <- d_tidy %>%
  group_by(condition) %>% 
  summarize(total_exchange = sum(exchangeinfo))

descriptive_stats
## # A tibble: 2 × 2
##   condition    total_exchange
##   <chr>                 <dbl>
## 1 hypothetical             10
## 2 real                     26
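The paper also reports percentages (16% and 37%), which the counts above do not show directly. Here is a minimal sketch of how they could be computed. To stay self-contained it rebuilds the two columns from the counts reported in the paper rather than reading the data file, and it uses base R; `d_example` and `summary_table` are names introduced here for illustration.

```r
# Rebuild the two columns from the counts reported in the paper
# (10 of 61 hypothetical, 26 of 71 real); this stands in for d_tidy.
d_example <- data.frame(
  condition = rep(c("hypothetical", "real"), times = c(61, 71)),
  exchangeinfo = c(rep(c(1, 0), times = c(10, 51)),
                   rep(c(1, 0), times = c(26, 45)))
)

# Number exchanging info and group size per condition
n_exchange <- tapply(d_example$exchangeinfo, d_example$condition, sum)
n_total <- tapply(d_example$exchangeinfo, d_example$condition, length)

# Combine into one table with the percentage rounded to a whole number
summary_table <- data.frame(
  condition = names(n_total),
  n_exchange = as.vector(n_exchange),
  n_total = as.vector(n_total),
  pct_exchange = round(100 * as.vector(n_exchange) / as.vector(n_total))
)

summary_table
```

With the real data, the same calculation could be added to the summarize() call above, for example as the mean of exchangeinfo multiplied by 100.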

Inferential statistics

A chi-square test of independence indicated that participants were significantly less likely to reject the unattractive potential date in the real condition compared with the hypothetical condition, X^2(1, N = 132) = 6.77, p = .009.

Hint: if you are using the function chisq.test(), make sure to set the continuity correction to false (“correct = FALSE”) since sample size is greater than 20.
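To see what the hint changes, the sketch below runs chisq.test() both ways on a 2 x 2 table built from the counts reported in the paper. By default, chisq.test() applies Yates' continuity correction to 2 x 2 tables, which shrinks the test statistic. The matrix `counts` is constructed here for illustration and stands in for the contingency table made from the data.

```r
# 2 x 2 table from the reported counts:
# rows = condition, columns = exchangeinfo (0 = no, 1 = yes)
counts <- matrix(c(51, 10,
                   45, 26),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(condition = c("hypothetical", "real"),
                                 exchangeinfo = c("0", "1")))

# Default behaviour: Yates' continuity correction is applied
corrected <- chisq.test(counts)

# As in the hint: plain Pearson chi-square without the correction
uncorrected <- chisq.test(counts, correct = FALSE)

corrected$statistic
uncorrected$statistic
```

Only the uncorrected statistic should line up with the value reported in the paper.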

# reproduce the above results here

cont_table <- table(d_tidy$condition, d_tidy$exchangeinfo) # create a contingency table of condition by exchangeinfo

chisq.test(cont_table,
           correct = FALSE)
## 
##  Pearson's Chi-squared test
## 
## data:  cont_table
## X-squared = 6.7674, df = 1, p-value = 0.009284
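As a cross-check on what chisq.test() is doing, the Pearson statistic can also be computed by hand from the observed and expected counts. The following is a sketch under the assumption that the table matches the counts reported in the paper; `observed`, `expected`, `x_squared`, and `p_value` are names introduced here.

```r
# Observed counts from the paper:
# rows = condition, columns = exchangeinfo (0 = no, 1 = yes)
observed <- matrix(c(51, 10,
                     45, 26),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(c("hypothetical", "real"), c("0", "1")))

# Expected counts under independence: row total * column total / N
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson chi-square: sum of (observed - expected)^2 / expected
x_squared <- sum((observed - expected)^2 / expected)

# Degrees of freedom for an r x c table: (r - 1) * (c - 1)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)

# Upper-tail probability of the chi-square distribution
p_value <- pchisq(x_squared, df = df, lower.tail = FALSE)

round(x_squared, 2) # 6.77
round(p_value, 3)   # 0.009
```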

Step 5: Reflection

Were you able to reproduce the results you attempted to reproduce? If not, what part(s) were you unable to reproduce?

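Yes, I reproduced the results: the counts in each condition (10 of 61 and 26 of 71) and the chi-square test (X-squared = 6.77, p = .009) match the values reported in the paper.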

How difficult was it to reproduce your results?

It was quite difficult to reproduce the results.

What aspects made it difficult? What aspects made it easy?

It was difficult to reproduce the results because it was not clear what the values in the two required columns meant. For condition, I had to work out that 0 meant hypothetical and 1 meant real by comparing the number of participants in each condition with the numbers reported in the paper. For exchangeinfo, I had to work out that 2 meant not exchanging info and 1 meant exchanging info. At first I assumed the opposite, but then realized that my descriptive statistics were the inverse of what the authors had found. Lastly, it took me a while to realize that I needed to create a contingency table to run the chi-square test.

Some aspects that made it easy were that there were only 2 columns needed for the analysis and they were clearly named. There were also no missing values.