For this exercise, please try to reproduce the results from Experiment 2 of the associated paper (de la Fuente, Santiago, Roman, Dumitrache, & Casasanto, 2014). The PDF of the paper is included in the same folder as this Rmd file.

Methods summary:

Researchers tested the question of whether temporal focus differs between Moroccan and Spanish cultures, hypothesizing that Moroccans are more past-focused, whereas Spaniards are more future-focused. Two groups of participants (\(N = 40\) Moroccan and \(N=40\) Spanish) completed a temporal-focus questionnaire that contained questions about past-focused (“PAST”) and future-focused (“FUTURE”) topics. In response to each question, participants provided a rating on a 5-point Likert scale on which lower scores indicated less agreement and higher scores indicated greater agreement. The authors then performed a mixed-design ANOVA with agreement score as the dependent variable, group (Moroccan or Spanish, between-subjects) as the fixed-effects factor, and temporal focus (past or future, within-subjects) as the random effects factor. In addition, the authors performed unpaired two-sample t-tests to determine whether there was a significant difference between the two groups in agreement scores for PAST questions, and whether there was a significant difference in scores for FUTURE questions.


Target outcomes:

Below is the specific result you will attempt to reproduce (quoted directly from the results section of Experiment 2):

According to a mixed analysis of variance (ANOVA) with group (Spanish vs. Moroccan) as a between-subjects factor and temporal focus (past vs. future) as a within-subjects factor, temporal focus differed significantly between Spaniards and Moroccans, as indicated by a significant interaction of temporal focus and group, F(1, 78) = 19.12, p = .001, ηp2 = .20 (Fig. 2). Moroccans showed greater agreement with past-focused statements than Spaniards did, t(78) = 4.04, p = .001, and Spaniards showed greater agreement with future-focused statements than Moroccans did, t(78) = −3.32, p = .001. (de la Fuente et al., 2014, p. 1685).


Step 1: Load packages

library(tidyverse) # for data munging
library(knitr) # for kable table formatting
library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files
library(readxl) # import excel files

# #optional packages/functions:
# library(afex) # anova functions
# library(ez) # anova functions 2
# library(scales) # for plotting
# std.err <- function(x) sd(x)/sqrt(length(x)) # standard error

Step 2: Load data

# Just Experiment 2
data_path <- 'data/DeLaFuenteEtAl_2014_RawData.xls'
d <- read_excel(data_path, sheet=3)

Step 3: Tidy data

# Keep only the columns needed and give the rating column a short name
d <- d %>%
  select(group, subscale, participant,
         Agreement = `Agreement (0=complete disagreement; 5=complete agreement)`)
# Relabel the groups to match the names used in the paper
d <- d %>%
  mutate(group = case_when(
    group == "young Spaniard" ~ "Spaniards",
    group == "Moroccan" ~ "Moroccans",
    TRUE ~ group
  ))
d <- d %>%
  mutate(
    participant = paste0(
      participant,  # original number
      ifelse(group == "Moroccans", "M",
             ifelse(group == "Spaniards", "S", ""))
    )
  )
d
## # A tibble: 1,680 × 4
##    group     subscale participant Agreement
##    <chr>     <chr>    <chr>           <dbl>
##  1 Moroccans PAST     1M                  4
##  2 Moroccans PAST     1M                  4
##  3 Moroccans PAST     1M                  5
##  4 Moroccans PAST     1M                  2
##  5 Moroccans PAST     1M                  4
##  6 Moroccans PAST     1M                  3
##  7 Moroccans PAST     1M                  4
##  8 Moroccans PAST     1M                  2
##  9 Moroccans PAST     1M                  2
## 10 Moroccans PAST     1M                  3
## # ℹ 1,670 more rows
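
Before running the analysis it may help to check that every participant contributes ratings to both subscales, since empty cells change the degrees of freedom of the later tests. The following is a minimal sketch in base R on a made-up data frame (`toy` and its values are hypothetical, not the study data); it only assumes the same long format as `d`.

```r
# Toy long-format data (hypothetical values, not the study data).
# Participant "2M" has no PAST rows at all.
toy <- data.frame(
  participant = c("1M", "1M", "1M", "1M", "2M", "2M", "1S", "1S", "1S", "1S"),
  subscale    = c("PAST", "PAST", "FUTURE", "FUTURE",
                  "FUTURE", "FUTURE",
                  "PAST", "PAST", "FUTURE", "FUTURE"),
  Agreement   = c(4, 3, 2, 3, 4, 5, 2, 3, 4, 4)
)

# Number of rows per participant x subscale; a zero marks an empty cell.
counts <- table(toy$participant, toy$subscale)

# Participants with at least one empty subscale cell.
incomplete <- rownames(counts)[rowSums(counts == 0) > 0]
incomplete
```

Applied to the real data, the same idea would be `table(d$participant, d$subscale)`; any participant flagged this way drops out of analyses that need both subscales.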

Step 4: Run analysis

Pre-processing

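# Mean and standard error of the raw item-level ratings in each group x subscale cell
summary_d <- d %>%
  group_by(group, subscale) %>%
  summarise(
    mean_ag = mean(Agreement, na.rm = TRUE),
    # SE = SD / sqrt(number of non-missing ratings)
    se_ag = sd(Agreement, na.rm = TRUE) / sqrt(sum(!is.na(Agreement)))
  )
# Set factor level order for plotting
summary_d <- summary_d %>%
  mutate(
    group = factor(group, levels = c("Spaniards", "Moroccans")),
    subscale = factor(subscale, levels = c("PAST", "FUTURE"))
  )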
summary_d
## # A tibble: 4 × 4
## # Groups:   group [2]
##   group     subscale mean_ag  se_ag
##   <fct>     <fct>      <dbl>  <dbl>
## 1 Moroccans FUTURE      3.12 0.0698
## 2 Moroccans PAST        3.29 0.0698
## 3 Spaniards FUTURE      3.49 0.0600
## 4 Spaniards PAST        2.68 0.0578
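
The standard errors above are computed over all item-level ratings. If the error bars are instead meant to reflect variability between participants, one option is to average within each participant first and then take the standard error of those participant means. The sketch below shows that two-step aggregation in base R on hypothetical data for a single group (`toy`, `pm`, and `se` are illustrative names); whether the published figure was built this way is an assumption, not something the paper excerpt confirms.

```r
# Hypothetical ratings: 3 participants, 2 items per subscale (not the study data).
toy <- data.frame(
  participant = rep(c("1M", "2M", "3M"), each = 4),
  subscale    = rep(c("PAST", "PAST", "FUTURE", "FUTURE"), times = 3),
  Agreement   = c(4, 5, 2, 3,  3, 3, 4, 4,  5, 4, 3, 2)
)

# Step 1: one mean per participant and subscale.
pm <- aggregate(Agreement ~ participant + subscale, data = toy, FUN = mean)

# Step 2: mean and standard error across participants.
se <- function(x) sd(x) / sqrt(length(x))
means <- tapply(pm$Agreement, pm$subscale, mean)
ses   <- tapply(pm$Agreement, pm$subscale, se)

data.frame(subscale = names(means),
           mean_ag  = as.numeric(means),
           se_ag    = as.numeric(ses))
```

With the real data, the grouping in both steps would also include `group`.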

Descriptive statistics

Try to recreate Figure 2 (fig2.png, also included in the same folder as this Rmd file):

ggplot(summary_d, aes(x = group, y = mean_ag, fill = subscale)) +
  geom_bar(stat = "identity", position = position_dodge(width = 1)) +
  geom_errorbar(aes(ymin = mean_ag - se_ag, ymax = mean_ag + se_ag),
                width = 0.2,
                position = position_dodge(width = 1)) +
  labs(
    x = "Group",
    y = "Rating",
    fill = "Subscale"
  ) + coord_cartesian(ylim = c(2, 4))

Inferential statistics

According to a mixed analysis of variance (ANOVA) with group (Spanish vs. Moroccan) as a between-subjects factor and temporal focus (past vs. future) as a within-subjects factor, temporal focus differed significantly between Spaniards and Moroccans, as indicated by a significant interaction of temporal focus and group, F(1, 78) = 19.12, p = .001, ηp2 = .20 (Fig. 2).

# reproduce the above results here
anova_model <- aov(
  Agreement ~ group * subscale + Error(participant/subscale),
  data = d
)
summary(anova_model)[["Error: participant:subscale"]]
##                Df Sum Sq Mean Sq F value   Pr(>F)    
## subscale        1   42.3   42.31    8.05  0.00583 ** 
## group:subscale  1  103.9  103.92   19.77 2.93e-05 ***
## Residuals      76  399.4    5.26                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# partial eta squared = SS_effect / (SS_effect + SS_residual)
eta2 <- 103.92 / (103.92 + 399.4)
print(paste0("partial eta squared = ", eta2, ", F = 19.77, p = 2.93e-05"))
## [1] "partial eta squared = 0.20646904553763, F = 19.77, p = 2.93e-05"
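
The model above is fitted to the item-level ratings. A common alternative for a mixed ANOVA is to reduce the data to one mean per participant and subscale before fitting. The sketch below illustrates that variant on simulated participant means (all values are hypothetical, and `toy`, `within`, `eta_ss`, and `eta_f` are illustrative names). It also shows that partial eta squared can be recovered either from the sums of squares or from the F value and its degrees of freedom. This is one possible formulation under those assumptions, not necessarily the authors' procedure.

```r
set.seed(1)  # reproducible toy data

# Hypothetical participant-level means: 4 participants per group, 2 subscales each.
toy <- expand.grid(
  subscale = c("PAST", "FUTURE"),
  id       = 1:4,
  group    = c("Moroccans", "Spaniards")
)
# Unique participant labels such as "1M" and "1S".
toy$participant <- factor(paste0(toy$id, substr(as.character(toy$group), 1, 1)))
toy$mA <- round(rnorm(nrow(toy), mean = 3, sd = 0.5), 2)

# group is between-subjects; subscale varies within participant.
fit <- aov(mA ~ group * subscale + Error(participant / subscale), data = toy)

# Within-participant stratum: subscale, group:subscale, residuals.
within <- summary(fit)[["Error: participant:subscale"]][[1]]
rownames(within) <- trimws(rownames(within))  # row names are space-padded

ss_int <- within["group:subscale", "Sum Sq"]
ss_res <- within["Residuals", "Sum Sq"]
f_int  <- within["group:subscale", "F value"]
df_int <- within["group:subscale", "Df"]
df_res <- within["Residuals", "Df"]

# Two equivalent routes to partial eta squared.
eta_ss <- ss_int / (ss_int + ss_res)
eta_f  <- (f_int * df_int) / (f_int * df_int + df_res)
c(eta_ss = eta_ss, eta_f = eta_f)
```

Applying the F-based identity to the values in this report gives 19.77 / (19.77 + 76) ≈ .21, and to the values reported in the paper 19.12 / (19.12 + 78) ≈ .20.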

Moroccans showed greater agreement with past-focused statements than Spaniards did, t(78) = 4.04, p = .001,

# reproduce the above results here
# One mean PAST rating per participant
past_data <- d %>%
  filter(subscale == "PAST") %>%
  group_by(participant, group) %>%
  summarise(mA = mean(Agreement, na.rm = TRUE))
past_data
## # A tibble: 78 × 3
## # Groups:   participant [78]
##    participant group        mA
##    <chr>       <chr>     <dbl>
##  1 10M         Moroccans  3.27
##  2 10S         Spaniards  3.18
##  3 11M         Moroccans  2.36
##  4 11S         Spaniards  1.91
##  5 12M         Moroccans  3.55
##  6 12S         Spaniards  2.64
##  7 13M         Moroccans  1.82
##  8 13S         Spaniards  2   
##  9 14M         Moroccans  2.45
## 10 14S         Spaniards  2.91
## # ℹ 68 more rows
t.test(
  mA ~ group,
  data = past_data,
  var.equal = TRUE  # pooled-variance Student's t-test; FALSE gives Welch's t-test
)
## 
##  Two Sample t-test
## 
## data:  mA by group
## t = 3.8562, df = 76, p-value = 0.0002394
## alternative hypothesis: true difference in means between group Moroccans and group Spaniards is not equal to 0
## 95 percent confidence interval:
##  0.2851528 0.8943343
## sample estimates:
## mean in group Moroccans mean in group Spaniards 
##                3.280886                2.691142

and Spaniards showed greater agreement with future-focused statements than Moroccans did, t(78) = −3.32, p = .001. (de la Fuente et al., 2014, p. 1685)

# reproduce the above results here
# One mean FUTURE rating per participant
future_data <- d %>%
  filter(subscale == "FUTURE") %>%
  group_by(participant, group) %>%
  summarise(mA = mean(Agreement, na.rm = TRUE))
future_data
## # A tibble: 80 × 3
## # Groups:   participant [80]
##    participant group        mA
##    <chr>       <chr>     <dbl>
##  1 10M         Moroccans   2.7
##  2 10S         Spaniards   3.5
##  3 11M         Moroccans   4.1
##  4 11S         Spaniards   3.4
##  5 12M         Moroccans   3.2
##  6 12S         Spaniards   4  
##  7 13M         Moroccans   4.3
##  8 13S         Spaniards   3.7
##  9 14M         Moroccans   3.8
## 10 14S         Spaniards   3.5
## # ℹ 70 more rows
t.test(
  mA ~ group,
  data = future_data,
  var.equal = TRUE  # pooled-variance Student's t-test; FALSE gives Welch's t-test
)
## 
##  Two Sample t-test
## 
## data:  mA by group
## t = -3.2098, df = 78, p-value = 0.001929
## alternative hypothesis: true difference in means between group Moroccans and group Spaniards is not equal to 0
## 95 percent confidence interval:
##  -0.5758588 -0.1349746
## sample estimates:
## mean in group Moroccans mean in group Spaniards 
##                3.138333                3.493750
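
To make explicit what `var.equal = TRUE` computes, the sketch below reproduces the pooled-variance t statistic by hand and compares it with `t.test()`. The two vectors are hypothetical participant means, not the study data, and the variable names are illustrative.

```r
# Hypothetical participant means for two independent groups.
a <- c(3.1, 2.8, 3.6, 3.3, 2.9)
b <- c(3.5, 3.9, 3.4, 4.0, 3.7)

n_a <- length(a)
n_b <- length(b)

# Pooled variance: weighted average of the two sample variances.
sp2 <- ((n_a - 1) * var(a) + (n_b - 1) * var(b)) / (n_a + n_b - 2)

# Student's t statistic and its degrees of freedom.
t_manual  <- (mean(a) - mean(b)) / sqrt(sp2 * (1 / n_a + 1 / n_b))
df_manual <- n_a + n_b - 2

# Built-in equivalent.
t_builtin <- t.test(a, b, var.equal = TRUE)

c(manual = t_manual, builtin = unname(t_builtin$statistic), df = df_manual)
```

Because the degrees of freedom equal the total number of participants minus 2, a result with df = 76 implies that 78 participants entered the test, which matches the 78 rows of `past_data`.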

Step 5: Reflection

Were you able to reproduce the results you attempted to reproduce? If not, what part(s) were you unable to reproduce?

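Mostly. The direction and significance of every effect reproduced, but the test statistics differed slightly from the reported values: the interaction was F(1, 76) = 19.77 (reported F(1, 78) = 19.12), the PAST comparison was t(76) = 3.86 (reported t(78) = 4.04), and the FUTURE comparison was t(78) = −3.21 (reported t(78) = −3.32).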

How difficult was it to reproduce your results?

The statistical tests were difficult for me because I have not used them properly before. It was also confusing that the data were incomplete: participants with IDs 24 and 25 did not seem to have data for all 21 scenarios.

What aspects made it difficult? What aspects made it easy?

"young Spaniard" was a weird group label, which confused me because the paper labels the group simply as Spaniards.

AI use: I asked ChatGPT to show me how to encode the between-subjects and within-subjects factors in the ANOVA, and asked it to explain why this was the right formulation.