For this exercise, please try to reproduce the results from Study 1 of the associated paper (Joel, Teper, & MacDonald, 2014). The PDF of the paper is included in the same folder as this Rmd file.

Methods summary:

In study 1, 150 introductory psychology students were randomly assigned to a “real” or a “hypothetical” condition. In the real condition, participants believed that they would have a real opportunity to connect with potential romantic partners. In the hypothetical condition, participants simply imagined that they are on a date. All participants were required to select their favorite profile and answer whether they were willing to exchange contact information.


Target outcomes:

Below is the specific result you will attempt to reproduce (quoted directly from the results section of Study 1):

We next tested our primary hypothesis that participants would be more reluctant to reject the unattractive date when they believed the situation to be real rather than hypothetical. Only 10 of the 61 participants in the hypothetical condition chose to exchange contact information with the unattractive potential date (16%). In contrast, 26 of the 71 participants in the real condition chose to exchange contact information (37%). A chi-square test of independence indicated that participants were significantly less likely to reject the unattractive potential date in the real condition compared with the hypothetical condition, X^2(1, N = 132) = 6.77, p = .009.


Step 1: Load packages

library(tidyverse) # for data munging
library(knitr) # for kable table formating
library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files
library(readxl) # import excel files

# #optional packages:
# library(broom)
# library(labelled)# converts SPSS's labelled to R's factor 

Step 2: Load data

# Just Study 1
d <- read_sav('data/Empathy Gap Study 1 data.sav')

Step 3: Tidy data

ana_d <- select(d, c("ID", "condition", "exchangeinfo"))
ana_d
## # A tibble: 132 × 3
##       ID condition        exchangeinfo
##    <dbl> <dbl+lbl>        <dbl+lbl>   
##  1    53 1 [real]         1 [yes]     
##  2    93 1 [real]         2 [no]      
##  3    83 1 [real]         2 [no]      
##  4    27 0 [hypothetical] 2 [no]      
##  5     6 0 [hypothetical] 1 [yes]     
##  6   116 0 [hypothetical] 1 [yes]     
##  7    24 0 [hypothetical] 2 [no]      
##  8   127 0 [hypothetical] 2 [no]      
##  9    32 1 [real]         1 [yes]     
## 10    73 1 [real]         2 [no]      
## # ℹ 122 more rows

Step 4: Run analysis

Descriptive statistics

Only 10 of the 61 participants in the hypothetical condition chose to exchange contact information with the unattractive potential date (16%). In contrast, 26 of the 71 participants in the real condition chose to exchange contact information (37%).

# reproduce the above results here
# Number of participants with hypothetical condition
hyp <- sum(ana_d$condition == 0)
sprintf("Total participants in hypothetical condition = %d", hyp)
## [1] "Total participants in hypothetical condition = 61"
# Number of participants in hypothetical condition that agreed to exchange info
hyp_y <- sum(ana_d$condition == 0 & ana_d$exchangeinfo == 1)
sprintf("Participants in hypothetical condition that agreed to exchange info = %d", hyp_y)
## [1] "Participants in hypothetical condition that agreed to exchange info = 10"
sprintf("Percentage of hypothetical participants that chose to exchange numbers = %f", (hyp_y/hyp)*100)
## [1] "Percentage of hypothetical participants that chose to exchange numbers = 16.393443"
sprintf("--------------------------------------------")
## [1] "--------------------------------------------"
# Number of participants with real condition
real <- sum(ana_d$condition == 1)
sprintf("Total participants in real condition = %d", real)
## [1] "Total participants in real condition = 71"
# Number of participants in real condition that agreed to exchange info
real_y <- sum(ana_d$condition == 1 & ana_d$exchangeinfo == 1)
sprintf("Participants in real condition that agreed to exchange info = %d", real_y)
## [1] "Participants in real condition that agreed to exchange info = 26"
sprintf("Percentage of real participants that chose to exchange numbers = %f", (real_y/real)*100)
## [1] "Percentage of real participants that chose to exchange numbers = 36.619718"
sprintf("--------------------------------------------")
## [1] "--------------------------------------------"

Inferential statistics

A chi-square test of independence indicated that participants were significantly less likely to reject the unattractive potential date in the real condition compared with the hypothetical condition, X^2(1, N = 132) = 6.77, p = .009.

Hint: if you are using the function chisq.test(), make sure to set the continuity correction to false (“correct = FALSE”) since sample size is greater than 20.

# reproduce the above results here
real_rej = sum(ana_d$condition == 1 & ana_d$exchangeinfo == 2)
hyp_rej = sum(ana_d$condition == 0 & ana_d$exchangeinfo == 2)
chi_data <- matrix(c(hyp_y, hyp_rej, real_y, real_rej), nrow = 2, byrow = TRUE)
test <- chisq.test(chi_data, correct = FALSE)
sprintf("X-squared: %f",test$statistic)
## [1] "X-squared: 6.767375"
sprintf("p-value: %f",test$p.value)
## [1] "p-value: 0.009284"

Step 5: Reflection

Were you able to reproduce the results you attempted to reproduce? If not, what part(s) were you unable to reproduce?

Yes. I was able reproduce all the results of this study.

How difficult was it to reproduce your results?

It was not too hard after learning more about tidying data and X^2 testing.

What aspects made it difficult? What aspects made it easy?

Difficult: Nothing, apart from not having a thorough knowledge of statistical testing. Once I learnt about the test it was easy to implement. There could have been a word document that described the different headers in the data columns in the dataset. Easy: Tidying data + simple and clear explanation of the statistical measure and process.