For this exercise, please try to reproduce the results from Study 1 of the associated paper (Joel, Teper, & MacDonald, 2014). The PDF of the paper is included in the same folder as this Rmd file.

Methods summary:

In Study 1, 150 introductory psychology students were randomly assigned to a “real” or a “hypothetical” condition. In the real condition, participants believed that they would have a real opportunity to connect with potential romantic partners. In the hypothetical condition, participants simply imagined that they were on a date. All participants were required to select their favorite profile and answer whether they were willing to exchange contact information.


Target outcomes:

Below is the specific result you will attempt to reproduce (quoted directly from the results section of Study 1):

We next tested our primary hypothesis that participants would be more reluctant to reject the unattractive date when they believed the situation to be real rather than hypothetical. Only 10 of the 61 participants in the hypothetical condition chose to exchange contact information with the unattractive potential date (16%). In contrast, 26 of the 71 participants in the real condition chose to exchange contact information (37%). A chi-square test of independence indicated that participants were significantly less likely to reject the unattractive potential date in the real condition compared with the hypothetical condition, X^2(1, N = 132) = 6.77, p = .009.


Step 1: Load packages

library(tidyverse) # for data munging
library(knitr) # for kable table formatting
library(haven) # import and export 'SPSS', 'Stata' and 'SAS' files
library(readxl) # import Excel files

# optional packages:
# library(broom)
# library(labelled) # converts SPSS's labelled to R's factor

Step 2: Load data

# Just Study 1
d <- read_sav('data/Empathy Gap Study 1 data.sav')

Step 3: Tidy data

# Check the raw variable distributions 
table(d$condition, useNA = "ifany")
## 
##  0  1 
## 61 71
table(d$exchangeinfo, useNA = "ifany")
## 
##  1  2 
## 36 96
# Rename numerical variables to descriptive variables 
# Based on inspection, condition "0" = hypothetical, "1" = real 
# and exchangeinfo "1" = yes, "2" = no

tidy_d <- d %>%
  mutate(
    condition = factor(condition,
                       levels = c(0, 1),
                       labels = c("hypothetical", "real")),
    exchangeinfo = factor(exchangeinfo,
                          levels = c(2, 1),
                          labels = c("no", "yes"))
  )


# Check table
table(tidy_d$condition, tidy_d$exchangeinfo)
##               
##                no yes
##   hypothetical 51  10
##   real         45  26
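
The mapping noted in the comments above came from inspecting the data. When a .sav file carries SPSS value labels, haven keeps them in a `labels` attribute, which offers one way to confirm the coding rather than infer it. The sketch below uses a small made-up vector, `condition_demo`, as a stand-in for `d$condition`; whether the actual data file carries such labels is an assumption that would need checking.

```r
library(haven)

# Hypothetical stand-in for d$condition; values and labels are made up for illustration
condition_demo <- labelled(
  c(0, 1, 1, 0),
  labels = c(hypothetical = 0, real = 1)
)

# The value labels are stored as a named vector in the "labels" attribute
attr(condition_demo, "labels")

# as_factor() converts a labelled vector to a factor using those labels
condition_fct <- as_factor(condition_demo)
levels(condition_fct)
```

If the real variables turn out to have no value labels, the codebook or the paper's counts (as used above) remain the fallback for deciding which number means what.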

Step 4: Run analysis

Descriptive statistics

Only 10 of the 61 participants in the hypothetical condition chose to exchange contact information with the unattractive potential date (16%). In contrast, 26 of the 71 participants in the real condition chose to exchange contact information (37%).

# reproduce the above results here

# Summarize counts and percentages by condition
summary_table <- tidy_d %>%
  group_by(condition) %>%
  summarise(
    yes = sum(exchangeinfo == "yes"),
    total = n(),
    percent_yes = round(100 * yes / total, 1)
  )

# Check summary table
print(summary_table)
## # A tibble: 2 × 4
##   condition      yes total percent_yes
##   <fct>        <int> <int>       <dbl>
## 1 hypothetical    10    61        16.4
## 2 real            26    71        36.6
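
As a cross-check on the dplyr summary, the same row percentages can be obtained in base R with prop.table(). This sketch rebuilds the 2x2 table from the counts printed in the Step 3 cross-tabulation rather than reading the data file, so `counts` and `row_pct` are illustrative names, not objects from the analysis above.

```r
# Counts copied from the Step 3 cross-tabulation (rows = condition, columns = response)
counts <- matrix(
  c(51, 10,
    45, 26),
  nrow = 2, byrow = TRUE,
  dimnames = list(condition = c("hypothetical", "real"),
                  exchangeinfo = c("no", "yes"))
)

# margin = 1 divides each cell by its row total, giving within-condition proportions
row_pct <- round(100 * prop.table(counts, margin = 1), 1)

# Percentage of participants in each condition who chose to exchange information
row_pct[, "yes"]
```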

Inferential statistics

A chi-square test of independence indicated that participants were significantly less likely to reject the unattractive potential date in the real condition compared with the hypothetical condition, X^2(1, N = 132) = 6.77, p = .009.

Hint: if you are using the function chisq.test(), make sure to set the continuity correction to false (“correct = FALSE”) since sample size is greater than 20.

# reproduce the above results here
# Build a 2x2 table and perform a chi-square test
xtab <- table(tidy_d$condition, tidy_d$exchangeinfo)
print(xtab)
##               
##                no yes
##   hypothetical 51  10
##   real         45  26
chi_res <- chisq.test(xtab, correct = FALSE)
print(chi_res)
## 
##  Pearson's Chi-squared test
## 
## data:  xtab
## X-squared = 6.7674, df = 1, p-value = 0.009284
# Optional: display the result in a simple text summary
chi_val <- unname(chi_res$statistic)  # Pearson chi-square statistic
p_val <- chi_res$p.value
df_val <- unname(chi_res$parameter)   # degrees of freedom
n_total <- sum(xtab)                  # total sample size

# sprintf() controls rounding and avoids the stray spaces cat() puts between arguments
cat("Chi-square test of independence:\n",
    sprintf("X^2(%d, N = %d) = %.2f, p = %.3f\n", df_val, n_total, chi_val, p_val),
    sep = "")
## Chi-square test of independence:
## X^2(1, N = 132) = 6.77, p = 0.009
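
To make the role of the continuity correction concrete, the Pearson statistic can also be computed by hand from the observed and expected counts. The sketch below uses the counts printed in the cross-tabulation above instead of the data file; `observed`, `expected`, `chi_manual`, `chi_plain`, and `chi_yates` are illustrative names. With `correct = TRUE`, chisq.test() applies Yates' correction and returns a smaller statistic, which would not match the value reported in the paper.

```r
# Observed counts (rows = condition, columns = response)
observed <- matrix(c(51, 10, 45, 26), nrow = 2, byrow = TRUE,
                   dimnames = list(c("hypothetical", "real"), c("no", "yes")))

# Expected count under independence: (row total * column total) / N
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson chi-square: sum of (O - E)^2 / E over all cells
chi_manual <- sum((observed - expected)^2 / expected)
round(chi_manual, 4)

# Same statistic from chisq.test() without the continuity correction
chi_plain <- chisq.test(observed, correct = FALSE)

# With Yates' continuity correction (the default for 2x2 tables)
chi_yates <- chisq.test(observed, correct = TRUE)

# Compare the two statistics side by side
c(uncorrected = unname(chi_plain$statistic), yates = unname(chi_yates$statistic))
```

The hand-computed value should agree with the uncorrected test, which is the version that reproduces the reported 6.77.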

Step 5: Reflection

Were you able to reproduce the results you attempted to reproduce? If not, what part(s) were you unable to reproduce?

Yes, I was able to reproduce both the descriptive and inferential results reported in the original paper. The descriptive counts (10 of 61 participants in the hypothetical condition and 26 of 71 in the real condition) matched the paper exactly, and the computed percentages (16.4% and 36.6%) round to the reported 16% and 37%. The chi-square test also matched the reported values after rounding: X^2(1, N = 132) = 6.77, p = .009.

How difficult was it to reproduce your results?

The analysis was relatively straightforward. Once I identified how the numeric codes corresponded to the two conditions and the two responses in the SPSS dataset, the rest of the analysis required only basic data recoding and a chi-square test.

What aspects made it difficult? What aspects made it easy?

The process was easy because the dataset was complete and the labels were straightforward. The only minor challenge was determining which numeric values represented each condition and response. After clarifying that part, the results reproduced exactly.