For this exercise, please try to reproduce the results from Study 1 of the associated paper (Joel, Teper, & MacDonald, 2014). The PDF of the paper is included in the same folder as this Rmd file.

Methods summary:

In Study 1, 150 introductory psychology students were randomly assigned to a “real” or a “hypothetical” condition. In the real condition, participants believed that they would have a real opportunity to connect with potential romantic partners. In the hypothetical condition, participants simply imagined that they were on a date. All participants were required to select their favorite profile and answer whether they were willing to exchange contact information.


Target outcomes:

Below is the specific result you will attempt to reproduce (quoted directly from the results section of Study 1):

We next tested our primary hypothesis that participants would be more reluctant to reject the unattractive date when they believed the situation to be real rather than hypothetical. Only 10 of the 61 participants in the hypothetical condition chose to exchange contact information with the unattractive potential date (16%). In contrast, 26 of the 71 participants in the real condition chose to exchange contact information (37%). A chi-square test of independence indicated that participants were significantly less likely to reject the unattractive potential date in the real condition compared with the hypothetical condition, X^2(1, N = 132) = 6.77, p = .009.


Step 1: Load packages

library(tidyverse) # for data munging
library(knitr) # for kable table formatting
library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files
library(readxl) # import excel files

# optional packages:
# library(broom)
# library(labelled) # converts SPSS's labelled to R's factor

Step 2: Load data

# Just Study 1
d <- read_sav('data/Empathy Gap Study 1 data 2.sav')

Step 3: Tidy data

selected_d <- d %>%
  select(ID, condition, exchangeinfo)
head(selected_d, 132)
## # A tibble: 132 × 3
##       ID condition        exchangeinfo
##    <dbl> <dbl+lbl>        <dbl+lbl>   
##  1    53 1 [real]         1 [yes]     
##  2    93 1 [real]         2 [no]      
##  3    83 1 [real]         2 [no]      
##  4    27 0 [hypothetical] 2 [no]      
##  5     6 0 [hypothetical] 1 [yes]     
##  6   116 0 [hypothetical] 1 [yes]     
##  7    24 0 [hypothetical] 2 [no]      
##  8   127 0 [hypothetical] 2 [no]      
##  9    32 1 [real]         1 [yes]     
## 10    73 1 [real]         2 [no]      
## # ℹ 122 more rows
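
The two selected columns are still stored as labelled numeric codes (`<dbl+lbl>`). As a minimal sketch of what turning those codes into readable factors looks like, the block below rebuilds only the first five rows shown above as a plain data frame; `first_rows` is a hypothetical name used for illustration, and the labels are the ones printed in the tibble. With the full labelled data, `haven::as_factor()` should give an equivalent result, though that is not what this write-up uses.

```r
# First five rows as displayed in the tibble above, with plain numeric codes.
# `first_rows` is an illustrative stand-in, not the real dataset.
first_rows <- data.frame(
  ID           = c(53, 93, 83, 27, 6),
  condition    = c(1, 1, 1, 0, 0),
  exchangeinfo = c(1, 2, 2, 2, 1)
)

# Map the codes to their value labels:
# condition 0 = hypothetical, 1 = real; exchangeinfo 1 = yes, 2 = no.
first_rows$condition <- factor(first_rows$condition,
                               levels = c(0, 1),
                               labels = c("hypothetical", "real"))
first_rows$exchangeinfo <- factor(first_rows$exchangeinfo,
                                  levels = c(1, 2),
                                  labels = c("yes", "no"))

# Both columns are now factors with readable levels.
str(first_rows)
```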

Step 4: Run analysis

Descriptive statistics

Only 10 of the 61 participants in the hypothetical condition chose to exchange contact information with the unattractive potential date (16%). In contrast, 26 of the 71 participants in the real condition chose to exchange contact information (37%).

# Calculate the descriptive statistics

# exchangeinfo: 1 = yes, 2 = no
descriptive_stats <- selected_d %>%
  mutate(exchangeinfo = if_else(exchangeinfo == 1, "yes", "no")) %>%
  group_by(condition) %>%
  summarise(
    # number of "yes" responses in each group, ignoring NA values (na.rm = TRUE)
    count_exchange = sum(exchangeinfo == "yes", na.rm = TRUE),
    # total number of observations in each group
    total = n(),
    # percentage of each group that chose to exchange contact information
    percentage = (count_exchange / total) * 100
  )
# condition: 0 = hypothetical, 1 = real
descriptive_stats$condition <- factor(descriptive_stats$condition, levels = c(0, 1), labels = c("Hypothetical", "Real"))

knitr::kable(descriptive_stats)
|condition    | count_exchange| total| percentage|
|:------------|--------------:|-----:|----------:|
|Hypothetical |             10|    61|   16.39344|
|Real         |             26|    71|   36.61972|
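
As a quick cross-check that does not need the data file, the same percentages can be recomputed from the counts reported in the paper. This is only a sketch: the matrix is typed in by hand, the "No" counts are derived from the group totals (61 - 10 and 71 - 26), and `counts` and `row_pct` are illustrative names.

```r
# Counts reported for Study 1: 10 of 61 (hypothetical) and 26 of 71 (real)
# chose to exchange contact information; the rest did not.
counts <- matrix(
  c(10, 51,
    26, 45),
  nrow = 2, byrow = TRUE,
  dimnames = list(condition    = c("Hypothetical", "Real"),
                  exchangeinfo = c("Yes", "No"))
)

# Row percentages: share of each condition answering yes or no.
row_pct <- prop.table(counts, margin = 1) * 100
round(row_pct, 1)
```

The "Yes" column should round to the 16% and 37% quoted from the paper.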

Inferential statistics

A chi-square test of independence indicated that participants were significantly less likely to reject the unattractive potential date in the real condition compared with the hypothetical condition, X^2(1, N = 132) = 6.77, p = .009.

Hint: if you are using the function chisq.test(), make sure to set the continuity correction to false (correct = FALSE). By default R applies Yates' continuity correction to 2 × 2 tables, which is not needed here since the sample size is greater than 20.

# Build the cross-tabulation first; chisq.test() takes it as input
selected_d$condition <- factor(selected_d$condition, levels = c(0, 1), labels = c("Hypothetical", "Real"))
selected_d$exchangeinfo <- factor(selected_d$exchangeinfo, levels = c(1, 2), labels = c("Yes", "No"))
cross_tab <- table(selected_d$condition, selected_d$exchangeinfo)
print(cross_tab)
##               
##                Yes No
##   Hypothetical  10 51
##   Real          26 45
# Run the chi-square test without the continuity correction
chi_result <- chisq.test(cross_tab, correct = FALSE) 
print(chi_result)
## 
##  Pearson's Chi-squared test
## 
## data:  cross_tab
## X-squared = 6.7674, df = 1, p-value = 0.009284
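
To make the statistic less of a black box, it can also be computed by hand from the 2 × 2 counts: each expected count is (row total × column total) / N, and the statistic is the sum of (observed - expected)^2 / expected over the four cells. The block below is a sketch under that textbook formula, with the counts typed in from the cross-tabulation; `observed`, `expected`, `chi_sq`, `p_value`, and `corrected` are illustrative names. It also runs the continuity-corrected test for comparison.

```r
# Observed counts from the cross-tabulation of condition by exchangeinfo.
observed <- matrix(
  c(10, 51,
    26, 45),
  nrow = 2, byrow = TRUE,
  dimnames = list(condition    = c("Hypothetical", "Real"),
                  exchangeinfo = c("Yes", "No"))
)

# Expected counts under independence: (row total * column total) / N.
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson chi-square statistic and its degrees of freedom.
chi_sq <- sum((observed - expected)^2 / expected)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)

# Upper-tail p-value from the chi-square distribution.
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

c(chi_sq = chi_sq, df = df, p_value = p_value)

# For comparison: the same test with Yates' continuity correction,
# which shrinks the statistic and therefore raises the p-value.
corrected <- chisq.test(observed, correct = TRUE)
corrected
```

The hand-computed values should agree with the `chisq.test(cross_tab, correct = FALSE)` output, while the corrected test gives a smaller statistic, which is why the hint matters for matching the reported result.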

Step 5: Reflection

Were you able to reproduce the results you attempted to reproduce? If not, what part(s) were you unable to reproduce?

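Yes, I was able to reproduce the result I attempted to reproduce. The counts (10 of 61 and 26 of 71), the percentages (16% and 37%), and the chi-square test (X-squared = 6.7674, df = 1, p = 0.009284) all match the values reported in the paper after rounding.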

How difficult was it to reproduce your results?

It was not that difficult, but because this was my first time using R for descriptive statistics and a chi-square test, it took me some time to find the code for the analysis.

What aspects made it difficult? What aspects made it easy?

Problem set 2 was instrumental in deepening my understanding of the ‘select’ and ‘mutate’ functions in the tidyverse package, and this exercise was an excellent opportunity to apply them to a new dataset. I also searched online and asked ChatGPT how to write the code for the chi-square test, which was quite helpful.